9 research outputs found
Reply to the commentary "Be careful when assuming the obvious", by P. Alday
Here we respond to some comments by Alday concerning headedness in linguistic
theory and the validity of the assumptions of a mathematical model for word
order. For brevity, we focus only on two assumptions: the unit of measurement
of dependency length and the monotonicity of the cost of a dependency as a
function of its length. We also revise the implicit psychological bias in
Alday's comments. Notwithstanding, Alday is indicating the path for linguistic
research with his unusual concerns about parsimony from multiple dimensions. Comment: Minor corrections (language improved)
A commentary on "The now-or-never bottleneck: a fundamental constraint on language", by Christiansen and Chater (2016)
In a recent article, Christiansen and Chater (2016) present a fundamental
constraint on language, i.e. a now-or-never bottleneck that arises from our
fleeting memory, and explore its implications, e.g., chunk-and-pass processing,
outlining a framework that promises to unify different areas of research. Here
we explore additional support for this constraint and suggest further
connections from quantitative linguistics and information theory.
Liberating language research from dogmas of the 20th century
A commentary on the article "Large-scale evidence of dependency length
minimization in 37 languages" by Futrell, Mahowald & Gibson (PNAS 2015 112 (33)
10336-10341). Comment: Minor correction
The meaning-frequency law in Zipfian optimization models of communication
According to Zipf's meaning-frequency law, words that are more frequent tend
to have more meanings. Here it is shown that a linear dependency between the
frequency of a form and its number of meanings is found in a family of models
of Zipf's law for word frequencies. This is evidence for a weak version of the
meaning-frequency law. Interestingly, that weak law (a) is not an inevitable
property of the assumptions of the family and (b) is found at least in the
narrow regime where those models exhibit Zipf's law for word frequencies.
Reply to the commentary "Be careful when assuming the obvious", by P. Alday
Here we respond to some comments by Alday concerning headedness in linguistic theory and the validity of the assumptions of a mathematical model for word order. For brevity, we focus only on two assumptions: the unit of measurement of dependency length and the monotonicity of the cost of a dependency as a function of its length. We also revise the implicit psychological bias in Alday’s comments. Notwithstanding, Alday is indicating the path for linguistic research with his unusual concerns about parsimony from multiple dimensions. Peer Reviewed
Anti dependency distance minimization in short sequences: A graph theoretic approach
Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten in short sequences by the principle of surprisal minimization (predictability maximization). Here we introduce a simple binomial test to verify such a hypothesis. In short sentences, we find anti-DDm for some languages from different families. Our analysis of the syntactic dependency structures suggests that anti-DDm is produced by star trees. Peer Reviewed. Postprint (author's final draft)
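The abstract does not spell out how the binomial test is set up. As a minimal sketch only, one plausible reading is assumed here: count the short sentences whose total dependency distance exceeds a random baseline, and ask how surprising that count is if each sentence exceeded it with probability 0.5. The function name, the null probability of 0.5 and the counts are all hypothetical, not taken from the paper.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p).

    Assumption: p = 0.5 is an illustrative null value, not the paper's.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts: 70 of 100 short sentences are longer than the baseline.
p_value = binomial_upper_tail(70, 100)
print(f"one-sided p-value: {p_value:.2e}")
```

A small p-value in this setup would point towards distances being longer than the baseline more often than the assumed null allows, which is the direction the abstract calls anti-DDm.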
The placement of the head that maximizes predictability. An information theoretic approach
The minimization of the length of syntactic dependencies is a
well-established principle of word order and the basis of a mathematical theory
of word order. Here we complete that theory from the perspective of information
theory, adding a competing word order principle: the maximization of
predictability of a target element. These two principles are in conflict: to
maximize the predictability of the head, the head should appear last, which
maximizes the costs with respect to dependency length minimization. The
implications of such a broad theoretical framework to understand the
optimality, diversity and evolution of the six possible orderings of subject,
object and verb are reviewed. Comment: in press in Glottometric
The placement of the head that maximizes predictability: An information theoretic approach
The minimization of the length of syntactic dependencies is a well-established principle of word order and the basis of a mathematical theory of word order. Here we complete that theory from the perspective of information theory, adding a competing word order principle: the maximization of predictability of a target element. These two principles are in conflict: to maximize the predictability of the head, the head should appear last, which maximizes the costs with respect to dependency length minimization. The implications of such a broad theoretical framework to understand the optimality, diversity and evolution of the six possible orderings of subject, object and verb are reviewed. Peer Reviewed. Postprint (published version)
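The conflict described in this abstract can be illustrated with a small sketch. It assumes a single head governing every other word in the sequence (a star tree) and measures dependency length in words, as the difference between positions. Both choices, and the function name, are assumptions made for illustration rather than details given in the abstract.

```python
def star_dependency_cost(n_words: int, head_position: int) -> int:
    """Sum of dependency lengths when the word at head_position (1-based)
    is the head of every other word in a sequence of n_words words."""
    if not 1 <= head_position <= n_words:
        raise ValueError("head_position must lie between 1 and n_words")
    return sum(abs(i - head_position) for i in range(1, n_words + 1))


n = 5
costs = {p: star_dependency_cost(n, p) for p in range(1, n + 1)}
print(costs)  # {1: 10, 2: 7, 3: 6, 4: 7, 5: 10}
```

Under these assumptions the cost is lowest with the head in the centre and highest with the head at either end, so a head placed last sits at a maximum of total dependency length.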
The optimality of syntactic dependency distances
It is often stated that human languages, like other biological systems, are
shaped by cost-cutting pressures, but to what extent? Attempts to quantify the
degree of optimality of languages by means of an optimality score have been
scarce and focused mostly on English. Here we recast the problem of the
optimality of the word order of a sentence as an optimization problem on a
spatial network where the vertices are words, arcs indicate syntactic
dependencies and the space is defined by the linear order of the words in the
sentence. We introduce a new score to quantify the cognitive pressure to reduce
the distance between linked words in a sentence. The analysis of sentences from
93 languages representing 19 linguistic families reveals that half of the languages
are optimized to 70% or more. The score indicates that distances are not
significantly reduced in a few languages and confirms two theoretical
predictions, i.e. that longer sentences are more optimized and that distances
are more likely to be longer than expected by chance in short sentences. We
present a new hierarchical ranking of languages by their degree of
optimization. The statistical advantages of the new score call for a
reevaluation of the evolution of dependency distance over time in languages as
well as the relationship between dependency distance and linguistic competence.
Finally, the principles behind the design of the score can be extended to
develop more powerful normalizations of topological distances or physical
distances in more dimensions.
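As a minimal sketch of the quantities this abstract refers to, the code below computes the sum of dependency distances of a sentence from its linear order and compares it with the expected sum under a uniformly random ordering of the same tree, which is (n^2 - 1)/3 for a tree with n words. It does not reproduce the score introduced in the paper, which the abstract does not define. The head encoding, the function names and the example sentence are hypothetical.

```python
from fractions import Fraction


def sum_dependency_distances(heads: list[int]) -> int:
    """heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no incoming dependency."""
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)


def expected_random_sum(n_words: int) -> Fraction:
    """Expected sum of dependency distances of a tree with n_words words
    when the words are placed in a uniformly random order."""
    return Fraction(n_words * n_words - 1, 3)


# Hypothetical analysis of "She ate the red apple":
# She -> ate, ate = root, the -> apple, red -> apple, apple -> ate
heads = [2, 0, 5, 5, 2]
observed = sum_dependency_distances(heads)
baseline = expected_random_sum(len(heads))
print(observed, baseline)  # 7 8
```

Here the observed sum falls below the random baseline, which is the direction expected when distances between linked words are being reduced.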