5 research outputs found

The optimality of syntactic dependency distances

Author: Alemany-Puig Lluís
Esteban Juan Luis
Ferrer-i-Cancho Ramon
Gómez-Rodríguez Carlos
Publication date: 30/07/2020

It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a new score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions, i.e., that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a new hierarchical ranking of languages by their degree of optimization. The statistical advantages of the new score call for a reevaluation of the evolution of dependency distance over time in languages as well as the relationship between dependency distance and linguistic competence. Finally, the principles behind the design of the score can be extended to develop more powerful normalizations of topological distances or physical distances in more dimensions.

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

UPCommons. Portal del coneixement obert de la UPC
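
The abstract above treats a sentence as a spatial network: words are vertices, syntactic dependencies are arcs, and the distance of an arc is the gap between the positions of its two words. The following Python sketch illustrates that setup under stated assumptions. The abstract does not give the formula of the score, so the normalization used here, (D_rla - D) / (D_rla - D_min), is an assumption, and all function names are hypothetical. D_rla = (n^2 - 1) / 3 is the expected sum of distances of a tree with n words under a uniformly random word order. The minimum D_min is found by brute force over all orderings, which is only feasible for very short sentences and merely stands in for a proper minimum linear arrangement algorithm.

```python
from itertools import permutations


def sum_dependency_distances(edges, position):
    """Sum of |pos(head) - pos(dependent)| over all arcs.

    edges: list of (head, dependent) pairs using word indices 0..n-1.
    position: sequence where position[w] is the linear position of word w.
    """
    return sum(abs(position[h] - position[d]) for h, d in edges)


def expected_random_sum(n):
    """Expected sum of distances for a tree of n words (n - 1 arcs)
    when the words are placed in a uniformly random order."""
    return (n * n - 1) / 3


def minimum_sum_bruteforce(edges, n):
    """Smallest achievable sum of distances, by trying every ordering.

    Illustrative only: the cost grows as n!, so keep n small.
    """
    return min(
        sum_dependency_distances(edges, perm)
        for perm in permutations(range(n))
    )


def optimality_score(edges, position):
    """Assumed normalization: 1 at the minimum, 0 at the random baseline,
    negative when distances are longer than expected by chance."""
    n = len(position)
    if n < 3:
        # For n < 3 the random baseline equals the minimum.
        raise ValueError("score is undefined for fewer than 3 words")
    d = sum_dependency_distances(edges, position)
    d_rla = expected_random_sum(n)
    d_min = minimum_sum_bruteforce(edges, n)
    return (d_rla - d) / (d_rla - d_min)


# Hypothetical 4-word sentence whose tree is a star: word 0 heads words 1-3.
star = [(0, 1), (0, 2), (0, 3)]

# Head placed first: distances 1 + 2 + 3 = 6, above the random baseline of 5.
print(optimality_score(star, [0, 1, 2, 3]))  # -1.0

# Head placed second: distances 1 + 1 + 2 = 4, which is the minimum.
print(optimality_score(star, [1, 0, 2, 3]))  # 1.0
```

In this toy case, placing the head at the edge of the sentence yields a negative value, which corresponds to the abstract's observation that distances in short sentences can be longer than expected by chance.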

Optimality of syntactic dependency distances

Author: Alemany Puig Lluís
Esteban Ángeles Juan Luis
Ferrer Cancho Ramon
Gómez Rodríguez Carlos
Publication venue: American Physical Society (APS)
Publication date: 01/01/2022

It is often stated that human languages, like other biological systems, are shaped by cost-cutting pressures, but to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies, and the space is defined by the linear order of the words in the sentence. We introduce a score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions: that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a hierarchical ranking of languages by their degree of optimization. The score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics, and cognitive science). Finally, the principles behind the design of the score have implications for network science.

Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Minimizing dependencies across languages and speakers. Evidence from Basque, Polish and Spanish and native and non-native bilinguals.

Author: Ros García Idoia
Publication date: 17/09/2018

223 p.

In recent years, evidence has been accumulating for a general preference towards grammars that reduce the linear distance between elements in a dependency (e.g., Futrell, Mahowald, and Gibson, 2015b; Gildea and Temperley, 2010). This cognitive bias towards dependency length minimization has been argued to result from communicative and cognitive pressures at play during language production. Although corpus evidence supporting this claim is quite broad insofar as grammaticalized structures are concerned (e.g., Futrell et al., 2015b; Liu, 2008; Temperley, 2007, among others), its validity rests on shakier foundations regarding production preferences (Stallings, MacDonald, and O'Seaghdha, 1998; Wasow, 1997; Yamashita and Chang, 2001, among others). This dissertation intends to address this gap. It examines whether dependency length minimization is an active mechanism shaping language production preferences, and explores the specific nature of this principle and its interplay with linguistic specifications and architectural properties of the human memory system. In a series of 5 cued-recall production experiments and 2 complex memory span tasks, I investigate the effect of dependency length in modulating production preferences across languages with differing grammatical properties (e.g., head position and case marking) and across speakers (e.g., natives and non-natives, and speakers with variable working memory capacity). I begin by showing that the preference for short dependencies is better accounted for by a general cognitive preference for minimizing the distance across dependents than by conceptual availability. I then show how languages as diverse as Basque, Spanish and Polish tend to choose the communicatively more efficient structures when there is more than one available alternative to express the same meaning. Crucially, I confirm that there is consistent variation regarding this tendency both across languages and across speakers.
I argue that language-specific mechanisms (e.g., pluripersonal agreement) and general cognitive mechanisms (e.g., word-order-based expectations) interact with the preference towards dependency length minimization. Also, I show that the degree of communicative efficiency achieved by highly proficient and early non-native bilingual speakers is lower than that reached by their native peers. Finally, I find that the bias towards shifted orders that yield shorter dependencies correlates positively with working memory. Based on these findings, I conclude that there is strong evidence supporting the claim that dependency length minimization is a pervasive force in human language production, resulting from a general cognitive constraint towards efficient communication, and also that its strength varies depending on grammatical and individual specifications compatible with information-theoretic considerations.

Archivo Digital para la Docencia y la Investigación

Minimizing dependencies across languages and speakers. Evidence from Basque, Polish and Spanish and native and non-native bilinguals.

Author: Ros García Idoia
Publication date: 01/01/2018

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital para la Docencia y la Investigación