Search CORE

23 research outputs found

Hubiness, length, crossings and their relationships in dependency trees

Author: Ferrer Cancho Ramon
Publication venue: RAM-Verlag
Publication date: 01/01/2013
Field of study

Here tree dependency structures are studied from three different perspectives: their degree variance (hubiness), the mean dependency length and the number of dependency crossings. Bounds that reveal pairwise dependencies among these three metrics are derived. Hubiness (the variance of degrees) plays a central role: the mean dependency length is bounded below by hubiness while the number of crossings is bounded above by hubiness. Our findings suggest that the online memory cost of a sentence might be determined not just by the ordering of words but also by the hubiness of the underlying structure. The 2nd moment of degree plays a crucial role that is reminiscent of its role in large complex networks.Peer ReviewedPostprint (published version

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Are crossing dependencies really scarce?

Author: Esteban Ángeles Juan Luis
Ferrer Cancho Ramon
Gómez Rodríguez Carlos
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018
Field of study

The syntactic structure of a sentence can be modelled as a tree, where vertices correspond to words and edges indicate syntactic dependencies. It has been claimed recurrently that the number of edge crossings in real sentences is small. However, a baseline or null hypothesis has been lacking. Here we quantify the amount of crossings of real sentences and compare it to the predictions of a series of baselines. We conclude that crossings are really scarce in real sentences. Their scarcity is unexpected by the hubiness of the trees. Indeed, real sentences are close to linear trees, where the potential number of crossings is maximized.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Crossings as a side effect of dependency lengths

Author: Bick
Christensen
Conover
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Ferrer-i-Cancho
Futrell
Gibson
Gildea
Gildea
Gómez-Rodríguez
Hays
Hochberg
Hudson
Iwatate
Jiang
Kawata
Kelih
Liu
Lu
Newman
Poirier
Popper
Prokhorov
Ramasamy
Tanaka
Temperley
Publication venue: 'Wiley'
Publication date: 01/01/2016
Field of study

The syntactic structure of sentences exhibits a striking regularity: dependencies tend to not cross when drawn above the sentence. We investigate two competing explanations. The traditional hypothesis is that this trend arises from an independent principle of syntax that reduces crossings practically to zero. An alternative to this view is the hypothesis that crossings are a side effect of dependency lengths, i.e. sentences with shorter dependency lengths should tend to have fewer crossings. We are able to reject the traditional view in the majority of languages considered. The alternative hypothesis can lead to a more parsimonious theory of language.Comment: the discussion section has been expanded significantly; in press in Complexity (Wiley

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Non-crossing dependencies: Least effort, not grammar

Author: A. Baronchelli
A. Broder
A.E. Goldberg
B. Bollobás
C.D. Manning
Carlos Gómez-Rodríguez
D. Aldous
D. Hays
E. Bach
E. Gibson
E. Gibson
E. Gibson
F. Moscoso del Prado
F.J. Newmeyer
F.J. Newmeyer
F.R.K. Chung
G. Morrill
G. Morrill
G.A. Miller
G.K. Zipf
H. Liu
H. Liu
H. Liu
I. Mel’čuk
J. Uriagereka
J.A. Hawkins
J.A. Hawkins
J.R. Hurford
K.P. Burnham
M. Bunge
M. Gell-Mann
M. Molloy
M. Molloy
M. Noy
M.D. Hauser
M.H. Christiansen
M.H. Vries de
N. Chomsky
N. Evans
P.J. Hopper
P.W. Culicover
R. Ferrer i Cancho
R. Ferrer i Cancho
R. Ferrer-i-Cancho
R. Ferrer-i-Cancho
R. Ferrer-i-Cancho
R. Ferrer-i-Cancho
R. Hudson
R. Hudson
R. Jackendoff
R. Köhler
R. Levy
R. McDonald
R. Suzuki
R.A. Hochberg
R.V. Solé
Ramon Ferrer-i-Cancho
Ramon Ferrer-i-Cancho
S.L. Frank
S.T. Piantadosi
S.T. Piantadosi
W.Y.C. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

The use of null hypotheses (in a statistical sense) is common in hard sciences but not in theoretical linguistics. Here the null hypothesis that the low frequency of syntactic dependency crossings is expected by an arbitrary ordering of words is rejected. It is shown that this would require star dependency structures, which are both unrealistic and too restrictive. The hypothesis of the limited resources of the human brain is revisited. Stronger null hypotheses taking into account actual dependency lengths for the likelihood of crossings are presented. Those hypotheses suggests that crossings are likely to reduce when dependencies are shortened. A hypothesis based on pressure to reduce dependency lengths is more parsimonious than a principle of minimization of crossings or a grammatical ban that is totally dissociated from the general and non-linguistic principle of economy.Postprint (author's final draft

CiteSeerX

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The risks of mixing dependency lengths from sequences of different length

Author: Ferrer-i-Cancho Ramon
Liu Haitao
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2014
Field of study

Mixing dependency lengths from sequences of different length is a common practice in language research. However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length and the distribution of dependency lengths depends on sentence length for real sentences and also under the null hypothesis that dependencies connect vertices located in random positions of the sequence. This suggests that certain results, such as the distribution of syntactic dependency lengths mixing dependencies from sentences of varying length, could be a mere consequence of that mixing. Furthermore, differences in the global averages of dependency length (mixing lengths from sentences of varying length) for two different languages do not simply imply a priori that one language optimizes dependency lengths better than the other because those differences could be due to differences in the distribution of sentence lengths and other factors.Comment: Laguage and referencing has been improved; Eqs. 7, 11, B7 and B8 have been correcte

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The sum of edge lengths in random linear arrangements

Author: Ferrer-i-Cancho Ramon
Publication venue: 'IOP Publishing'
Publication date: 01/01/2019
Field of study

Spatial networks are networks where nodes are located in a space equipped with a metric. Typically, the space is two-dimensional and until recently and traditionally, the metric that was usually considered was the Euclidean distance. In spatial networks, the cost of a link depends on the edge length, i.e. the distance between the nodes that define the edge. Hypothesizing that there is pressure to reduce the length of the edges of a network requires a null model, e.g., a random layout of the vertices of the network. Here we investigate the properties of the distribution of the sum of edge lengths in random linear arrangement of vertices, that has many applications in different fields. A random linear arrangement consists of an ordering of the elements of the nodes of a network being all possible orderings equally likely. The distance between two vertices is one plus the number of intermediate vertices in the ordering. Compact formulae for the 1st and 2nd moments about zero as well as the variance of the sum of edge lengths are obtained for arbitrary graphs and trees. We also analyze the evolution of that variance in Erdos-Renyi graphs and its scaling in uniformly random trees. Various developments and applications for future research are suggested

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Anti dependency distance minimization in short sequences: A graph theoretic approach

Author: Ferrer Cancho Ramon
Gómez Rodríguez Carlos
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2021
Field of study

Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten in short sequences by the principle of surprisal minimization (predictability maximization). Here we introduce a simple binomial test to verify such a hypothesis. In short sentences, we find anti-DDm for some languages from different families. Our analysis of the syntactic dependency structures suggests that anti-DDm is produced by star trees.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC