The optimality of syntactic dependency distances
It is often stated that human languages, like other biological systems, are
shaped by cost-cutting pressures, but to what extent? Attempts to quantify the
degree of optimality of languages by means of an optimality score have been
scarce and focused mostly on English. Here we recast the problem of the
optimality of the word order of a sentence as an optimization problem on a
spatial network where the vertices are words, arcs indicate syntactic
dependencies and the space is defined by the linear order of the words in the
sentence. We introduce a new score to quantify the cognitive pressure to reduce
the distance between linked words in a sentence. The analysis of sentences from
93 languages representing 19 linguistic families reveals that half of the
languages are optimized to 70% or more. The score indicates that distances are not
significantly reduced in a few languages and confirms two theoretical
predictions, i.e. that longer sentences are more optimized and that distances
are more likely to be longer than expected by chance in short sentences. We
present a new hierarchical ranking of languages by their degree of
optimization. The statistical advantages of the new score call for a
reevaluation of the evolution of dependency distance over time in languages as
well as the relationship between dependency distance and linguistic competence.
Finally, the principles behind the design of the score can be extended to
develop more powerful normalizations of topological distances or physical
distances in more dimensions.
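
The quantity being optimized in the abstract above can be illustrated with a minimal Python sketch. It is not the score introduced in the paper, whose definition is not given in this listing. It only computes the sum of dependency distances of one sentence and compares it with the expected sum under a uniformly random word order, which for a tree over n words is (n^2 - 1)/3 because two distinct random positions are on average (n + 1)/3 apart. The head-array encoding, the function names, and the example sentence with its parse are assumptions made for illustration.

```python
from itertools import permutations


def dependency_distance_sum(heads):
    """Sum of linear distances between each word and its syntactic head.

    heads[i] is the 1-based position of the head of word i + 1,
    or 0 if that word is the root (illustrative encoding).
    """
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)


def random_baseline(n):
    """Expected sum of distances for a tree over n words in random order."""
    return (n * n - 1) / 3


def brute_force_baseline(heads):
    """Average the distance sum over every ordering of the words.

    Only practical for short sentences; used here to check the formula.
    """
    n = len(heads)
    totals = []
    for perm in permutations(range(1, n + 1)):
        # perm[i] is the new position of word i + 1; the tree is unchanged.
        totals.append(
            sum(abs(perm[i] - perm[h - 1]) for i, h in enumerate(heads) if h != 0)
        )
    return sum(totals) / len(totals)


# Hypothetical parse of "She ate the red apple":
# She -> ate, ate = root, the -> apple, red -> apple, apple -> ate
heads = [2, 0, 5, 5, 2]
observed = dependency_distance_sum(heads)
baseline = random_baseline(len(heads))
print(observed, baseline, brute_force_baseline(heads))
```

In this toy sentence the observed sum is below the random baseline, which is the direction a pressure to shorten dependencies would predict. A full optimality score would also need the minimum achievable sum for the tree, which this sketch does not compute.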
Optimality of syntactic dependency distances
It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions: that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a hierarchical ranking of languages by their degree of optimization. The score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics, and cognitive science). Finally, the principles behind the design of the score have implications for network science. Peer Reviewed. Postprint (published version).
Minimizing dependencies across languages and speakers. Evidence from Basque, Polish and Spanish and native and non-native bilinguals.
223 p. In recent years, evidence for a general preference towards grammars reducing the linear distance between elements in a dependency has been accumulating (e.g., Futrell, Mahowald, and Gibson, 2015b; Gildea and Temperley, 2010). This cognitive bias towards dependency length minimization has been argued to result from communicative and cognitive pressures at play during language production. Although corpus evidence supporting this claim is quite broad insofar as grammaticalized structures are concerned (e.g., Futrell et al., 2015b; Liu, 2008; Temperley, 2007, among others), its validity rests on shakier foundations regarding production preferences (Stallings, MacDonald, and O'Seaghdha, 1998; Wasow, 1997; Yamashita and Chang, 2001, among others). This dissertation intends to address this gap. It examines whether dependency length minimization is an active mechanism shaping language production preferences, and explores the specific nature of this principle and its interplay with linguistic specifications and architectural properties of the human memory system. In a series of 5 cued-recall production experiments and 2 complex memory span tasks, I investigate the effect of dependency length in modulating production preferences across languages with differing grammatical properties (e.g., head position and case marking) and across speakers (e.g., natives and non-natives, and speakers with varying working memory capacity). I begin by showing that the preference for short dependencies is better accounted for by a general cognitive preference for minimizing the distance across dependents than by conceptual availability. I then show how languages as diverse as Basque, Spanish and Polish tend to choose the communicatively more efficient structures when there is more than one available alternative to express the same meaning. Crucially, I confirm that there is consistent variation regarding this tendency both across languages and across speakers.
I argue that language-specific (e.g., pluripersonal agreement) and general cognitive mechanisms (e.g., word-order-based expectations) interact with the preference towards dependency length minimization. Also, I show that the degree of communicative efficiency achieved by highly proficient and early non-native bilingual speakers is lower than that reached by their native peers. Finally, I find that the bias towards shifted orders that yield shorter dependencies correlates positively with working memory. Based on these findings, I conclude that there is strong evidence supporting the claim that dependency length minimization is a pervasive force in human language production, resulting from a general cognitive constraint towards efficient communication, and also that its strength varies depending on grammatical and individual specifications compatible with information-theoretic considerations.