14 research outputs found
Are crossing dependencies really scarce?
The syntactic structure of a sentence can be modelled as a tree, where vertices correspond to words and edges indicate syntactic dependencies. It has been claimed repeatedly that the number of edge crossings in real sentences is small. However, a baseline or null hypothesis has been lacking. Here we quantify the amount of crossings of real sentences and compare it to the predictions of a series of baselines. We conclude that crossings are really scarce in real sentences. Their scarcity is unexpected by the hubiness of the trees. Indeed, real sentences are close to linear trees, where the potential number of crossings is maximized.
Peer reviewed. Postprint (author's final draft).
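To make the notion of an edge crossing concrete, here is a minimal Python sketch; it is not the authors' code. It assumes a hypothetical encoding in which `heads[i]` is the 1-based position of the head of word `i+1` and 0 marks the root, and it counts the pairs of dependencies whose arcs cross when drawn above the sentence. The two example structures are invented for illustration.

```python
from itertools import combinations


def count_crossings(heads):
    """Count crossing pairs of dependency arcs in one sentence.

    heads[i] is the 1-based position of the head of word i+1; 0 marks the
    root (an assumed encoding, chosen only for illustration).
    """
    # Store each dependency as a (left, right) pair of word positions.
    edges = [(min(i, h), max(i, h)) for i, h in enumerate(heads, start=1) if h != 0]
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        # Two arcs cross when exactly one endpoint of one arc lies strictly
        # between the endpoints of the other.
        if a < c < b < d or c < a < d < b:
            crossings += 1
    return crossings


# Hypothetical 4-word structures.
projective = [2, 0, 2, 3]      # arcs (1,2), (2,3), (3,4)
non_projective = [3, 4, 0, 3]  # arcs (1,3), (2,4), (3,4)

print(count_crossings(projective), count_crossings(non_projective))
```

Arcs that share a word can never satisfy either inequality chain, so they are never counted as crossing.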
The sum of edge lengths in random linear arrangements
Spatial networks are networks whose nodes are located in a space equipped
with a metric. Typically, the space is two-dimensional and, until recently,
the metric usually considered was the Euclidean
distance. In spatial networks, the cost of a link depends on the edge length,
i.e. the distance between the nodes that define the edge. Hypothesizing that
there is pressure to reduce the length of the edges of a network requires a
null model, e.g., a random layout of the vertices of the network. Here we
investigate the properties of the distribution of the sum of edge lengths in
random linear arrangements of vertices, a setting that has many applications in
different fields. A random linear arrangement is an ordering of
the nodes of a network in which all possible orderings are equally likely. The
distance between two vertices is one plus the number of intermediate vertices
in the ordering. Compact formulae for the first and second moments about zero, as
well as the variance of the sum of edge lengths, are obtained for arbitrary
graphs and trees. We also analyze the evolution of that variance in Erdős-Rényi
graphs and its scaling in uniformly random trees. Various developments and
applications for future research are suggested.
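The null model described above can be checked by brute force on small graphs. The following Python sketch is an illustration and not the paper's derivation. It enumerates every ordering of the vertices, averages the sum of edge lengths, and compares the result with m(n+1)/3. That value follows from the fact that two distinct positions among n are, on average, (n+1)/3 apart. The function names and the example trees are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import permutations


def sum_edge_lengths(edges, position):
    """Sum of edge lengths; the length of an edge is the position difference."""
    return sum(abs(position[u] - position[v]) for u, v in edges)


def exact_mean_sum(n, edges):
    """Average the sum of edge lengths over all n! equally likely orderings."""
    total = 0
    count = 0
    for order in permutations(range(n)):
        # position[v] is the 1-based place of vertex v in this ordering.
        position = {v: p for p, v in enumerate(order, start=1)}
        total += sum_edge_lengths(edges, position)
        count += 1
    return Fraction(total, count)


n = 5
star = [(0, i) for i in range(1, n)]       # one hub joined to n-1 leaves
path = [(i, i + 1) for i in range(n - 1)]  # linear tree

# Each edge contributes (n+1)/3 on average, so m edges give m(n+1)/3.
expected = Fraction((n - 1) * (n + 1), 3)

print(exact_mean_sum(n, star), exact_mean_sum(n, path), expected)
```

The expectation depends only on the number of edges, which is why the star and the path agree; differences between trees show up only in higher moments such as the variance.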
UPC publications from 2018 with the greatest presence on Twitter: data collected by Altmetric
This document shows data on the presence on Twitter of UPC publications from 2018, collected by the Altmetric application. The references shown were extracted from the FUTUR portal and correspond to 34 publications with 20 or more tweets at the time the document was prepared.
Postprint (published version).
A Cross-Linguistic Pressure for Uniform Information Density in Word Order
While natural languages differ widely in both canonical word order and word
order flexibility, their word orders still follow shared cross-linguistic
statistical patterns, often attributed to functional pressures. In the effort
to identify these pressures, prior work has compared real and counterfactual
word orders. Yet one functional pressure has been overlooked in such
investigations: the uniform information density (UID) hypothesis, which holds
that information should be spread evenly throughout an utterance. Here, we ask
whether a pressure for UID may have influenced word order patterns
cross-linguistically. To this end, we use computational models to test whether
real orders lead to greater information uniformity than counterfactual orders.
In our empirical study of 10 typologically diverse languages, we find that: (i)
among SVO languages, real word orders consistently have greater uniformity than
reverse word orders, and (ii) only linguistically implausible counterfactual
orders consistently exceed the uniformity of real orders. These findings are
compatible with a pressure for information uniformity in the development and
usage of natural languages.
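One simple way to quantify information uniformity is the variance of per-word surprisal: the lower the variance, the more evenly information is spread. The Python sketch below illustrates that idea only. Whether it matches the measures used in the study is an assumption, and the probabilities are invented for illustration rather than taken from a language model.

```python
import math


def surprisals(probabilities):
    """Surprisal in bits of each word, given its in-context probability."""
    return [-math.log2(p) for p in probabilities]


def surprisal_variance(probabilities):
    """Population variance of surprisal; 0 means perfectly uniform information."""
    values = surprisals(probabilities)
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


# Two hypothetical 4-word utterances carrying the same total information (8 bits).
even_order = [0.25, 0.25, 0.25, 0.25]   # 2 bits per word
uneven_order = [0.5, 0.5, 0.5, 1 / 32]  # 1, 1, 1 and 5 bits

print(surprisal_variance(even_order), surprisal_variance(uneven_order))
```

In a comparison of real and counterfactual word orders, the probabilities would come from a model trained separately on each ordering, and the ordering with the lower variance would count as more uniform under this measure.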
Reappraising the distribution of the number of edge crossings of graphs on a sphere
Many real transportation and mobility networks have their vertices placed on
the surface of the Earth. In such embeddings, the edges laid on that surface
may cross. In his pioneering research, Moon analyzed the distribution of the
number of crossings on complete graphs and complete bipartite graphs whose
vertices are located uniformly at random on the surface of a sphere, assuming
that vertex placements are independent of each other. Here we revise his
derivation of the variance of that number in the light of recent theoretical
developments on the variance of crossings and computer simulations. We show that Moon's
formulae are inaccurate in predicting the true variance and provide exact
formulae.
Comment: Corrected mistakes in equation 31. Added new figure (7). Added acknowledgements to J. W. Moon. Other minor changes. Updated figures. Minor changes in the last update.
Memory limitations are hidden in grammar
The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent of non-linguistic cognitive constraints.
Anti dependency distance minimization in short sequences: A graph theoretic approach
Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten in short sequences by the principle of surprisal minimization (predictability maximization). Here we introduce a simple binomial test to verify such a hypothesis. In short sentences, we find anti-DDm for some languages from different families. Our analysis of the syntactic dependency structures suggests that anti-DDm is produced by star trees.
Peer reviewed. Postprint (author's final draft).
Bounds of the sum of edge lengths in linear arrangements of trees
A fundamental problem in network science is the normalization of the
topological or physical distance between vertices, which requires understanding
the range of variation of the unnormalized distances. Here we investigate the
limits of the variation of the physical distance in linear arrangements of the
vertices of trees. In particular, we investigate various problems on the sum of
edge lengths in trees of a fixed size: the minimum and the maximum value of the
sum for specific trees, the minimum and the maximum in classes of trees (bistar
trees and caterpillar trees) and finally the minimum and the maximum for any
tree. We establish some foundations for research on optimality scores for
spatial networks in one dimension.
Comment: Title changed at proof stage.
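For very small trees, the minimum and maximum of the sum of edge lengths can be found by exhaustive search, which gives a quick sanity check on closed-form bounds. The Python sketch below is such a check under stated assumptions; it is not the method of the paper, and its names are hypothetical. For a star tree with n vertices it should recover floor(n^2/4) (hub in the middle) and n(n-1)/2 (hub at one end), and for a linear tree the minimum n-1, all of which are easy to verify by hand.

```python
from itertools import permutations


def sum_edge_lengths(edges, order):
    """Sum of |position difference| over all edges for one vertex ordering."""
    position = {v: p for p, v in enumerate(order)}
    return sum(abs(position[u] - position[v]) for u, v in edges)


def min_max_sum(n, edges):
    """Brute-force minimum and maximum over all n! linear arrangements."""
    values = [sum_edge_lengths(edges, order) for order in permutations(range(n))]
    return min(values), max(values)


n = 6  # exhaustive search is only feasible for small n
star = [(0, i) for i in range(1, n)]       # hub 0 joined to every other vertex
path = [(i, i + 1) for i in range(n - 1)]  # linear tree

star_min, star_max = min_max_sum(n, star)
path_min, path_max = min_max_sum(n, path)

print("star:", star_min, star_max)
print("path:", path_min, path_max)
```

The search visits n! orderings, so it is useful only for checking formulae on small cases, not as a practical algorithm.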
Edge crossings in random linear arrangements
In spatial networks, vertices are arranged in some space and edges may cross.
When vertices are arranged in a one-dimensional lattice, edges may cross when drawn
above the vertex sequence, as happens in linguistic and biological networks.
Here we investigate the general problem of the distribution of edge
crossings in random arrangements of the vertices. We generalize the existing
formula for the expectation of this number in random linear arrangements of
trees to any network and derive an expression for the variance of the number of
crossings in an arbitrary layout relying on a novel characterization of the
algebraic structure of that variance in an arbitrary space. We provide compact
formulae for the expectation and the variance in complete graphs, complete
bipartite graphs, cycle graphs, one-regular graphs and various kinds of trees
(star trees, quasi-star trees and linear trees). In these networks, the scaling
of expectation and variance as a function of network size is asymptotically
power-law-like in random linear arrangements. Our work paves the way for
further research and applications in one dimension and for investigating the
distribution of the number of crossings in lattices of higher dimension or
other embeddings.
Comment: Generalised our theory from one-dimensional layouts to practically any type of layout. This helps study the variance of the number of crossings in graphs when their vertices are arranged on the surface of a sphere, or on the plane. Moreover, we also give closed formulae for this variance on particular types of graphs in both linear arrangements and general layouts.
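The expectation of the number of crossings in a random linear arrangement can be illustrated on a small graph. In a uniformly random ordering, two edges that share no vertex cross with probability 1/3, because only one of the three ways of pairing four positions produces a crossing. The expected number of crossings is therefore q/3, where q is the number of pairs of edges without a common vertex. The Python sketch below checks this by exhaustive enumeration; it assumes a simple graph (no loops or repeated edges), and all names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def count_crossings(edges, position):
    """Number of edge pairs whose arcs cross in the given arrangement."""
    crossings = 0
    for (u, v), (s, t) in combinations(edges, 2):
        a, b = sorted((position[u], position[v]))
        c, d = sorted((position[s], position[t]))
        if a < c < b < d or c < a < d < b:
            crossings += 1
    return crossings


def exact_mean_crossings(n, edges):
    """Average number of crossings over all n! equally likely orderings."""
    total = 0
    count = 0
    for order in permutations(range(n)):
        position = {v: p for p, v in enumerate(order)}
        total += count_crossings(edges, position)
        count += 1
    return Fraction(total, count)


def independent_edge_pairs(edges):
    """q: all edge pairs minus the pairs that share a vertex (simple graphs)."""
    degree = Counter(v for edge in edges for v in edge)
    return comb(len(edges), 2) - sum(comb(k, 2) for k in degree.values())


n = 6
cycle = [(i, (i + 1) % n) for i in range(n)]  # cycle graph
path = [(i, i + 1) for i in range(n - 1)]     # linear tree

for name, graph in (("cycle", cycle), ("path", path)):
    q = independent_edge_pairs(graph)
    print(name, exact_mean_crossings(n, graph), Fraction(q, 3))
```

The variance is harder to obtain because it depends on how pairs of edge pairs overlap, which is where a characterization of its algebraic structure becomes necessary.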