Ultrametric Distance in Syntax
Phrase structure trees have a hierarchical structure. In many subjects, most notably in taxonomy, such tree structures have been studied using ultrametrics. Here syntactical hierarchical phrase trees are subject to a similar analysis, which is much simpler as the branching structure is more readily discernible and switched. The occurrence of hierarchical structure elsewhere in linguistics is mentioned. The phrase tree can be represented by a matrix and the elements of the matrix can be represented by triangles. The height at which branching occurs is not prescribed in previous syntactic models, but it is prescribed by the ultrametric matrix. In other words, the ultrametric approach gives a complete description of phrase trees, unlike previous approaches. The ambiguity of which branching height to choose is resolved by postulating that branching occurs at the lowest height available. An ultrametric produces a measure of the complexity of sentences: presumably the complexity of sentences increases as a language is acquired, so this can be tested. All ultrametric triangles are equilateral or isosceles; here it is shown that X-bar structure implies that there are no equilateral triangles. Restricting attention to simple syntax, a minimum ultrametric distance between lexical categories is calculated. This ultrametric distance is shown to be different from the matrix obtained from features. It is shown that the definition of c-command can be replaced by an equivalent ultrametric definition. The new definition invokes a minimum distance between nodes, and this is more aesthetically satisfying than previous varieties of definitions. From the new definition of c-command follows a new definition of government.
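A minimal Python sketch of the idea in this abstract, under stated assumptions. A phrase tree is written as nested tuples with uniquely named leaves. Each node sits at the lowest height available (one more than its tallest child), and the distance between two leaves is taken to be the height of their lowest common ancestor. The tree encoding, the function names and the example sentence are illustrative assumptions, not taken from the paper. The sketch checks that every triangle of leaves is isosceles or equilateral, and that a binary-branching tree yields no equilateral triangles while a flat ternary node does.

```python
from itertools import combinations


def leaf_distances(tree):
    """Return (height, leaves, dist) for a nested-tuple tree.

    Leaves are strings at height 0. An internal node is placed at the
    lowest height available: 1 + the height of its tallest child.
    dist maps frozenset({a, b}) to the height of the lowest common
    ancestor of leaves a and b.
    """
    if isinstance(tree, str):
        return 0, [tree], {}
    parts = [leaf_distances(child) for child in tree]
    height = 1 + max(h for h, _, _ in parts)
    dist = {}
    for _, _, sub in parts:
        dist.update(sub)
    # Leaves from different children first meet at this node.
    for (_, left, _), (_, right, _) in combinations(parts, 2):
        for a in left:
            for b in right:
                dist[frozenset((a, b))] = height
    leaves = [leaf for _, ls, _ in parts for leaf in ls]
    return height, leaves, dist


def triangle_sides(dist, x, y, z):
    """Sorted side lengths of the triangle on leaves x, y, z."""
    return sorted(dist[frozenset(p)] for p in ((x, y), (y, z), (x, z)))


def is_ultrametric(leaves, dist):
    """True if the two longest sides of every triangle are equal."""
    return all(
        s[1] == s[2]
        for s in (triangle_sides(dist, *t) for t in combinations(leaves, 3))
    )


def has_equilateral(leaves, dist):
    """True if some triangle has three equal sides."""
    return any(
        s[0] == s[2]
        for s in (triangle_sides(dist, *t) for t in combinations(leaves, 3))
    )


# Hypothetical binary-branching analysis of "the cat sat on the mat".
sentence = (("the_1", "cat"), ("sat", ("on", ("the_2", "mat"))))
height, leaves, dist = leaf_distances(sentence)

# A flat ternary node, for contrast.
flat_height, flat_leaves, flat_dist = leaf_distances(("a", "b", "c"))
```

In this sketch the root height doubles as a crude complexity measure for the sentence. The binary tree has only strictly isosceles triangles because any three leaves split two-against-one below their common ancestor, whereas the flat ternary node puts all three leaves at the same distance.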
A CLUE for CLUster Ensembles
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. The R package clue provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data structures for representing partitions and hierarchies, and facilities for computing on these, including methods for measuring proximity and obtaining consensus and "secondary" clusterings.
STOCHASTIC OPTIMIZATION FOR TROPICAL PRINCIPAL COMPONENT ANALYSIS OVER TREE SPACES
A known challenge in the rapidly growing area of phylogenomics is the lack of tools to analyze the large volume of genome data. Genomic data includes information on the evolution, structure and mapping of genomes. Phylogenetic trees are branching diagrams that show the evolutionary history of species and their genes. Gene trees show the evolutionary history of a particular gene. To analyze evolutionary history from genomic data, we reduce the dimensionality of gene trees, overcoming high-dimensional analytical challenges. Through the vectorization of pairwise distances between each combination of two leaves within a phylogenetic tree, we utilize a tropical principal component analysis: a principal component analysis (PCA) in terms of a tropical metric. We project gene trees onto a two-dimensional space using a tropical PCA, that is, a tropical convex hull that minimizes the sum of residuals between each gene tree in the dataset and its projection onto the tropical convex hull over the tree space, which is the set of all possible gene trees. Since computing a tropical PCA for the given dataset is computationally intensive, we implement a Markov Chain Monte Carlo Metropolis-Hastings algorithm to estimate the tropical PCA efficiently. Utilizing simulated and real-world data, we implement our tropical PCA algorithm and visualize the results in two-dimensional plots, the results of which look promising and demonstrate our algorithm's strengths.
http://archive.org/details/stochasticoptimi1094562731 Major, United States Army. Approved for public release; distribution is unlimited.
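To make the "tropical metric" on vectorized trees concrete, here is a small Python sketch. It assumes the metric meant is the usual tropical (generalized Hilbert projective) distance, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), applied to vectors of pairwise leaf distances; the abstract does not spell out the formula. The function name and the two three-leaf example vectors are hypothetical.

```python
def tropical_distance(u, v):
    """Tropical distance between two equal-length vectors.

    Computed as max_i(u_i - v_i) - min_i(u_i - v_i). Adding the same
    constant to every coordinate of either vector leaves it unchanged.
    """
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


# Two hypothetical three-leaf trees, vectorized as
# (d(1,2), d(1,3), d(2,3)) pairwise leaf distances.
tree_a = [2.0, 4.0, 4.0]  # leaves 1 and 2 are closest
tree_b = [4.0, 4.0, 2.0]  # leaves 2 and 3 are closest

d_ab = tropical_distance(tree_a, tree_b)

# Shifting every coordinate by a constant gives distance zero.
d_shift = tropical_distance(tree_a, [x + 5.0 for x in tree_a])
```

The residual that a tropical PCA minimizes would be a sum of such distances between each tree vector and its projection; the projection step and the Metropolis-Hastings search are not sketched here.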
Ultrametric Distance in Syntax
Phrase structure trees have a hierarchical structure.
In many subjects, most notably in taxonomy, such tree structures
have been studied using ultrametrics.
Here syntactical hierarchical phrase trees are subject
to a similar analysis, which is much simpler as the branching structure is
more readily discernible and switched. The occurrence of hierarchical
structure elsewhere in linguistics is mentioned.
The phrase tree can be represented
by a matrix and the elements of the matrix can be represented by triangles.
The height at which branching occurs is not prescribed in previous syntactic
models, but it is by using the ultrametric matrix.
In other words the ultrametric approach gives a complete description of
phrase trees, unlike previous approaches.
The ambiguity of which branching height to choose
is resolved by postulating that branching occurs
at the lowest height available.
An ultrametric produces a measure of the
complexity of sentences: presumably the complexity of sentences increases
as a language is acquired so that this can be tested.
All ultrametric triangles are equilateral or isosceles.
Here it is shown that X-bar
structure implies that there are no equilateral triangles.
Restricting attention to simple syntax, a minimum ultrametric distance between lexical
categories is calculated.
A matrix constructed from this ultrametric distance
is shown to be different from the matrix obtained from features.
It is shown that the definition of c-command
can be replaced by an equivalent ultrametric definition.
The new definition invokes a minimum distance between nodes and
this is more aesthetically satisfying than previous varieties of definitions.
From the new definition of c-command follows a new definition of
the central notion in syntax, namely government.
Reliable Spanners for Metric Spaces
A spanner is reliable if it can withstand large, catastrophic failures in the
network. More precisely, the failure of some nodes can cause only a small
amount of damage to the remaining graph in terms of dilation; that is, the
spanner property is maintained for almost all nodes in the residual graph.
Constructions of reliable spanners of near-linear size are known in
low-dimensional Euclidean settings. Here, we present new constructions of
reliable spanners for planar graphs, trees and (general) metric spaces
- …