Random crossings in dependency trees
It has been hypothesized that the rather small number of crossings in real
syntactic dependency trees is a side-effect of pressure for dependency length
minimization. Here we answer a related important research question: what would
be the expected number of crossings if the natural order of a sentence was lost
and replaced by a random ordering? We show that this number depends only on the
number of vertices of the dependency tree (the sentence length) and the second
moment about zero of vertex degrees. The expected number of crossings is
minimum for a star tree (crossings are impossible) and maximum for a linear
tree (the number of crossings is of the order of the square of the sequence
length).
Comment: changes of format and language; some corrections in Appendix A; in
press in Glottometrics.
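
The dependence on sentence length and on the second moment of degrees can be illustrated with a short sketch. It is not taken from the abstract: it assumes the closed form E[C] = n(n - 1 - <k^2>)/6, which follows from counting the pairs of edges that share no vertex and noting that each such pair crosses with probability 1/3 in a uniformly random ordering. The function name and the example trees are hypothetical.

```python
from collections import Counter

def expected_crossings(edges):
    """Expected number of crossings of a tree in a uniformly random ordering.

    edges: list of (u, v) pairs forming a tree on n = len(edges) + 1 vertices.
    Assumes E[C] = n * (n - 1 - <k^2>) / 6, where <k^2> is the second
    moment about zero of the vertex degrees.
    """
    n = len(edges) + 1
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    second_moment = sum(k * k for k in degree.values()) / n
    return n * (n - 1 - second_moment) / 6

# Hypothetical trees on n = 6 vertices.
star = [(0, i) for i in range(1, 6)]    # one hub, five leaves
path = [(i, i + 1) for i in range(5)]   # linear tree

print(expected_crossings(star))  # 0.0: all edges share the hub
print(expected_crossings(path))  # 2.0, i.e. (n - 2)(n - 3) / 6
```

For the linear tree the value (n - 2)(n - 3)/6 grows as the square of the sequence length, in line with the abstract.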
Beyond description. Comment on "Approaching human language with complex networks" by Cong & Liu
Comment on "Approaching human language with complex networks" by Cong & Liu
Crossings as a side effect of dependency lengths
The syntactic structure of sentences exhibits a striking regularity:
dependencies tend to not cross when drawn above the sentence. We investigate
two competing explanations. The traditional hypothesis is that this trend
arises from an independent principle of syntax that reduces crossings
practically to zero. An alternative to this view is the hypothesis that
crossings are a side effect of dependency lengths, i.e. sentences with shorter
dependency lengths should tend to have fewer crossings. We are able to reject
the traditional view in the majority of languages considered. The alternative
hypothesis can lead to a more parsimonious theory of language.
Comment: the discussion section has been expanded significantly; in press in
Complexity (Wiley).
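
The two quantities compared in this abstract, the number of crossings and the dependency lengths, can both be computed directly from a sentence's dependencies. The following is a minimal sketch, assuming each dependency is given as a pair of 1-based word positions; the function names and the two four-word examples are hypothetical.

```python
from itertools import combinations

def count_crossings(edges):
    """Number of pairs of dependencies that cross when drawn above the sentence.

    edges: list of (i, j) pairs of word positions.
    Two edges cross when exactly one endpoint of one lies strictly
    between the endpoints of the other.
    """
    count = 0
    for (a, b), (c, d) in combinations(edges, 2):
        a, b = sorted((a, b))
        c, d = sorted((c, d))
        if a < c < b < d or c < a < d < b:
            count += 1
    return count

def total_dependency_length(edges):
    """Sum of the distances (in words) spanned by the dependencies."""
    return sum(abs(i - j) for i, j in edges)

# Hypothetical four-word sentences.
chain = [(1, 2), (2, 3), (3, 4)]       # no crossings, shortest possible lengths
tangled = [(1, 3), (2, 4), (3, 4)]     # (1, 3) and (2, 4) cross

print(count_crossings(chain), total_dependency_length(chain))      # 0 3
print(count_crossings(tangled), total_dependency_length(tangled))  # 1 5
```

In this toy pair the ordering with longer dependencies is also the one with a crossing, which is the direction of association the alternative hypothesis predicts.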
The scaling of the minimum sum of edge lengths in uniformly random trees
The minimum linear arrangement problem on a network consists of finding the minimum sum of edge lengths that can be achieved when the vertices are arranged linearly. Although there are algorithms to solve this problem on trees in polynomial time, they have remained theoretical and have not been implemented in practical contexts to our knowledge. Here we use one of those algorithms to investigate the growth of this sum as a function of the size of the tree in uniformly random trees. We show that this sum is bounded above by its value in a star tree. We also show that the mean edge length grows logarithmically in optimal linear arrangements, in stark contrast to the linear growth that is expected on optimal arrangements of star trees or on random linear arrangements.
Ministerio de Economía, Industria y Competitividad; TIN2013-48031-C4-1-P
Xunta de Galicia; R2014/034
Agència de Gestió d'Ajuts Universitaris i de Recerca; 2014SGR 890
Ministerio de Economía, Industria y Competitividad; TIN2014-57226-P
Ministerio de Economía, Industria y Competitividad; FFI2014-51978-C2-2-
Reappraising the distribution of the number of edge crossings of graphs on a sphere
Many real transportation and mobility networks have their vertices placed on
the surface of the Earth. In such embeddings, the edges laid on that surface
may cross. In his pioneering research, Moon analyzed the distribution of the
number of crossings on complete graphs and complete bipartite graphs whose
vertices are located uniformly at random on the surface of a sphere assuming
that vertex placements are independent of each other. Here we revise his
derivation of the variance of that number in the light of recent theoretical developments on
the variance of crossings and computer simulations. We show that Moon's
formulae are inaccurate in predicting the true variance and provide exact
formulae.
Comment: Corrected mistakes in equation 31. Added new figure (7). Added
acknowledgements to J. W. Moon. Other minor changes. Updated figures. Minor
changes in the last update.
Anti dependency distance minimization in short sequences: A graph theoretic approach
Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten in short sequences by the principle of surprisal minimization (predictability maximization). Here we introduce a simple binomial test to verify such a hypothesis. In short sentences, we find anti-DDm for some languages from different families. Our analysis of the syntactic dependency structures suggests that anti-DDm is produced by star trees.
Peer Reviewed
Postprint (author's final draft)
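
The abstract does not spell out the test, so the following is only one plausible reading, offered as a sketch under stated assumptions. It assumes that each sentence of length n is compared with the expected sum of dependency lengths in a random ordering, (n^2 - 1)/3, and that under the null hypothesis a sentence is equally likely to fall above or below that value (p = 1/2). The function names and the observed sums are hypothetical, not data from the paper.

```python
from math import comb

def expected_random_length_sum(n):
    """Expected sum of edge lengths of an n-vertex tree in a random ordering."""
    return (n * n - 1) / 3

def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical sums of dependency lengths for ten 4-word sentences.
n = 4
observed = [6, 6, 7, 4, 6, 6, 7, 6, 3, 6]
baseline = expected_random_length_sum(n)         # 5.0
above = sum(1 for d in observed if d > baseline)  # sentences longer than chance

# One-sided test: are sums above the random baseline more common than chance?
p_value = binomial_upper_tail(above, len(observed))
print(above, p_value)  # 8 0.0546875
```

A small p-value under these assumptions would point to dependency lengths that are longer than expected by chance, that is, anti-DDm; ties with the baseline would need separate handling.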
Are crossing dependencies really scarce?
The syntactic structure of a sentence can be modelled as a tree, where vertices correspond to words and edges indicate syntactic dependencies. It has been claimed recurrently that the number of edge crossings in real sentences is small. However, a baseline or null hypothesis has been lacking. Here we quantify the amount of crossings of real sentences and compare it to the predictions of a series of baselines. We conclude that crossings are really scarce in real sentences. Their scarcity is unexpected by the hubiness of the trees. Indeed, real sentences are close to linear trees, where the potential number of crossings is maximized.
Peer Reviewed
Postprint (author's final draft)
Edge crossings in random linear arrangements
In spatial networks, vertices are arranged in some space and edges may cross.
When vertices are arranged in a 1-dimensional lattice, edges may cross when drawn
above the vertex sequence, as happens in linguistic and biological networks.
Here we investigate the general problem of the distribution of edge
crossings in random arrangements of the vertices. We generalize the existing
formula for the expectation of this number in random linear arrangements of
trees to any network and derive an expression for the variance of the number of
crossings in an arbitrary layout relying on a novel characterization of the
algebraic structure of that variance in an arbitrary space. We provide compact
formulae for the expectation and the variance in complete graphs, complete
bipartite graphs, cycle graphs, one-regular graphs and various kinds of trees
(star trees, quasi-star trees and linear trees). In these networks, the scaling
of expectation and variance as a function of network size is asymptotically
power-law-like in random linear arrangements. Our work paves the way for
further research and applications in one dimension, and for investigating the
distribution of the number of crossings in lattices of higher dimension or
other embeddings.
Comment: Generalised our theory from one-dimensional layouts to practically
any type of layout. This helps study the variance of the number of crossings
in graphs when their vertices are arranged on the surface of a sphere, or on
the plane. Moreover, we also give closed formulae for this variance on
particular types of graphs in both linear arrangements and general layouts.
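
For small graphs, the expectation and variance of the number of crossings in random linear arrangements can be obtained exactly by enumerating every ordering. The sketch below does that rather than using the paper's closed formulae, which the abstract does not state. The only fact it relies on is that two edges sharing no vertex cross in one third of all orderings, so the mean equals the number of such pairs divided by 3. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations

def crossings(edges, position):
    """Number of crossing edge pairs for a vertex -> position mapping."""
    count = 0
    for (u, v), (s, t) in combinations(edges, 2):
        a, b = sorted((position[u], position[v]))
        c, d = sorted((position[s], position[t]))
        if a < c < b < d or c < a < d < b:
            count += 1
    return count

def crossing_moments(edges):
    """Exact mean and variance of crossings over all linear arrangements."""
    vertices = sorted({v for e in edges for v in e})
    counts = [
        crossings(edges, {v: i for i, v in enumerate(order)})
        for order in permutations(vertices)
    ]
    mean = Fraction(sum(counts), len(counts))
    variance = Fraction(sum(c * c for c in counts), len(counts)) - mean**2
    return mean, variance

def independent_pairs(edges):
    """Number of edge pairs that share no vertex (the only pairs that can cross)."""
    return sum(1 for e, f in combinations(edges, 2) if not set(e) & set(f))

complete4 = list(combinations(range(4), 2))      # complete graph on 4 vertices
cycle5 = [(i, (i + 1) % 5) for i in range(5)]    # cycle graph on 5 vertices

print(crossing_moments(complete4))                  # mean 1, variance 0
print(crossing_moments(cycle5)[0])                  # 5/3
print(Fraction(independent_pairs(cycle5), 3))       # 5/3
```

The complete graph on four vertices has exactly one crossing in every ordering, hence zero variance, which makes it a convenient sanity check.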