Combinatorics and geometry of finite and infinite squaregraphs
Squaregraphs were originally defined as finite plane graphs in which all
inner faces are quadrilaterals (i.e., 4-cycles) and all inner vertices (i.e.,
the vertices not incident with the outer face) have degrees larger than three.
The planar dual of a finite squaregraph is determined by a triangle-free chord
diagram of the unit disk, which could alternatively be viewed as a
triangle-free line arrangement in the hyperbolic plane. This representation
carries over to infinite plane graphs with finite vertex degrees in which the
balls are finite squaregraphs. Algebraically, finite squaregraphs are median
graphs for which the duals are finite circular split systems. Hence
squaregraphs are at the crosspoint of two dualities, an algebraic and a
geometric one, and thus lend themselves to several combinatorial
interpretations and structural characterizations. With these and the
5-colorability theorem for circle graphs at hand, we prove that every
squaregraph can be isometrically embedded into the Cartesian product of five
trees. This embedding result can also be extended to the infinite case without
reference to an embedding in the plane and without any cardinality restriction
when formulated for median graphs free of cubes and further finite
obstructions. Further, we exhibit a class of squaregraphs that can be embedded
into the product of three trees and we characterize those squaregraphs that are
embeddable into the product of just two trees. Finally, finite squaregraphs
enjoy a number of algorithmic features that do not extend to arbitrary median
graphs. For instance, we show that median-generating sets of finite
squaregraphs can be computed in polynomial time, whereas, not unexpectedly, the
corresponding problem for median graphs turns out to be NP-hard.
Comment: 46 pages, 14 figures
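The median property that the abstract relies on can be illustrated with a brute-force check. The following is only a minimal sketch, not code from the paper: a connected graph is a median graph when every triple of vertices has exactly one vertex lying on shortest paths between each pair. The helper names (`bfs_distances`, `medians`, `is_median_graph`, `grid_graph`) and the adjacency-dictionary representation are assumptions made for this illustration. The 3x3 grid used below is a small squaregraph (all inner faces are 4-cycles and its single inner vertex has degree four).

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Return shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def medians(adj, dist, u, v, w):
    """Vertices that lie on a shortest path between each pair of u, v, w."""
    return [
        x for x in adj
        if dist[u][x] + dist[x][v] == dist[u][v]
        and dist[v][x] + dist[x][w] == dist[v][w]
        and dist[u][x] + dist[x][w] == dist[u][w]
    ]


def is_median_graph(adj):
    """Brute-force test: connected, and every triple has exactly one median."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    if any(len(d) != len(adj) for d in dist.values()):
        return False  # a disconnected graph is not a median graph
    return all(len(medians(adj, dist, u, v, w)) == 1
               for u, v, w in combinations(adj, 3))


def grid_graph(rows, cols):
    """Hypothetical helper: adjacency dictionary of a rows x cols grid."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adj:
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


grid = grid_graph(3, 3)
grid_dist = {v: bfs_distances(grid, v) for v in grid}
# The 6-cycle is bipartite but not median: alternate vertices have no median.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

The check runs in time cubic in the number of vertices times the cost of scanning all candidates, so it is meant for small examples only. It does not touch the embedding into products of trees or the median-generating-set problem discussed in the abstract.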
Marc Barbut au pays des médianes
Mathematics/Lattice theory. AMS classification: 06 - Order, lattices, ordered algebraic structures / 06B - Lattices; 05 - Combinatorics / 05C - Graph theory; 91 - Game theory, economics, social and behavioral sciences / 91B - Mathematical economics / 91B14 - Social choice.
Documents de travail du Centre d'Economie de la Sorbonne 2013.39, ISSN: 1955-611X. URL of the working papers: http://centredeconomiesorbonne.univ-paris1.fr/bandeau-haut/document-de-travail/
The notion of median, which first appeared in statistics (notably in metric form), was later introduced in algebra and combinatorics. Marc Barbut was the first to develop the link between these two notions of median. I present his pioneering work linking the metric medians and the algebraic medians of a distributive lattice and using these links within the framework of the "median procedure" in data analysis. I also give a short survey of the development of the more general theory of "median spaces" and mention some problems raised by the median procedure.
Generalized Buneman pruning for inferring the most parsimonious multi-state phylogeny
Accurate reconstruction of phylogenies remains a key challenge in
evolutionary biology. Most biologically plausible formulations of the problem
are formally NP-hard, with no known efficient solution. Standard practice
relies on fast heuristic methods that are empirically known to work very
well in general, but can yield results arbitrarily far from optimal. Practical
exact methods, which yield exponential worst-case running times but generally
much better times in practice, provide an important alternative. We report
progress in this direction by introducing a provably optimal method for the
weighted multi-state maximum parsimony phylogeny problem. The method is based
on generalizing the notion of the Buneman graph, a construction key to
efficient exact methods for binary sequences, so as to apply to sequences with
arbitrary finite numbers of states with arbitrary state transition weights. We
implement an integer linear programming (ILP) method for the multi-state
problem using this generalized Buneman graph and demonstrate that the resulting
method is able to solve data sets that are intractable by prior exact methods
in run times comparable with popular heuristics. Our work provides the first
method for provably optimal maximum parsimony phylogeny inference that is
practical for multi-state data sets of more than a few characters.
Comment: 15 pages
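To make the objective of weighted multi-state parsimony concrete, the sketch below scores a single character on a fixed rooted tree with the standard Sankoff dynamic program. This is not the paper's method: it does not build a generalized Buneman graph, does not use an ILP, and does not search over tree topologies. It only evaluates the weighted cost that such a search would minimize. The function name `sankoff_cost`, the tree encoding, and the two weight functions are assumptions made for this illustration.

```python
def sankoff_cost(tree, root, leaf_states, states, weight):
    """Minimum total weighted change cost for one character on a fixed tree.

    tree: dict mapping each internal node to a list of its children.
    leaf_states: dict mapping each leaf to its observed state.
    states: iterable of all allowed states.
    weight(a, b): cost of a change from state a to state b.
    """
    inf = float("inf")
    states = list(states)

    def cost_vector(node):
        # For a leaf, only the observed state is feasible.
        if node in leaf_states:
            return {s: (0 if s == leaf_states[node] else inf) for s in states}
        child_vectors = [cost_vector(child) for child in tree[node]]
        # For an internal node, each child independently picks its best state.
        return {
            s: sum(min(weight(s, t) + vec[t] for t in states)
                   for vec in child_vectors)
            for s in states
        }

    return min(cost_vector(root).values())


def unit_weight(a, b):
    """Every change costs 1 (unweighted parsimony)."""
    return 0 if a == b else 1


def transition_transversion_weight(a, b):
    """Illustrative weights: transitions cost 1, transversions cost 2."""
    if a == b:
        return 0
    return 1 if {a, b} in ({"a", "g"}, {"c", "t"}) else 2


# Hypothetical four-leaf tree ((A, B), (C, D)) with one nucleotide character.
example_tree = {"root": ["x", "y"], "x": ["A", "B"], "y": ["C", "D"]}
example_leaves = {"A": "a", "B": "c", "C": "a", "D": "g"}

unit_score = sankoff_cost(example_tree, "root", example_leaves, "acgt", unit_weight)
weighted_score = sankoff_cost(example_tree, "root", example_leaves, "acgt",
                              transition_transversion_weight)
```

On this example the unweighted score is 2 and the weighted score is 3, because the change from a to c is a transversion under the illustrative weights.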
Direct maximum parsimony phylogeny reconstruction from genotype data
Background: Maximum parsimony phylogenetic tree reconstruction from genetic variation data is a fundamental problem in computational genetics with many practical applications in population genetics, whole genome analysis, and the search for genetic predictors of disease. Efficient methods are available for reconstruction of maximum parsimony trees from haplotype data, but such data are difficult to determine directly for autosomal DNA. Data are more commonly available in the form of genotypes, which consist of conflated combinations of pairs of haplotypes from homologous chromosomes. Currently, there are no general algorithms for the direct reconstruction of maximum parsimony phylogenies from genotype data. Phylogenetic applications for autosomal data must therefore rely on other methods for first computationally inferring haplotypes from genotypes.
Results: In this work, we develop the first practical method for computing maximum parsimony phylogenies directly from genotype data. We show that the standard practice of first inferring haplotypes from genotypes and then reconstructing a phylogeny on the haplotypes often substantially overestimates phylogeny size. As an immediate application, our method can be used to determine the minimum number of mutations required to explain a given set of observed genotypes.
Conclusion: Phylogeny reconstruction directly from unphased data is computationally feasible for moderate-sized problem instances and can lead to substantially more accurate tree size inferences than the standard practice of treating phasing and phylogeny construction as two separate analysis stages. The difference between the approaches is particularly important for downstream applications that require a lower bound on the number of mutations that the genetic region has undergone.
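The phasing ambiguity mentioned in the background can be made concrete with a short sketch. It is not the method of the paper; it only enumerates the unordered haplotype pairs that are consistent with one genotype. The encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site) and the function name `haplotype_pairs` are assumptions chosen for this illustration.

```python
from itertools import product


def haplotype_pairs(genotype):
    """List the unordered haplotype pairs consistent with a genotype.

    Assumed encoding per site: 0 = homozygous for allele 0,
    1 = homozygous for allele 1, 2 = heterozygous.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    # Pin the first heterozygous site so each unordered pair is listed once.
    free_sites = het_sites[1:]
    pairs = []
    for choice in product((0, 1), repeat=len(free_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        if het_sites:
            h1[het_sites[0]], h2[het_sites[0]] = 0, 1
        for site, allele in zip(free_sites, choice):
            h1[site], h2[site] = allele, 1 - allele
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


example_pairs = haplotype_pairs((0, 2, 1, 2))
```

A genotype with k heterozygous sites (k at least 1) has 2^(k-1) consistent pairs, so the number of candidate phasings grows exponentially with k. This is one way to see why fixing a phasing before building the tree can discard the haplotypes that would give the smallest phylogeny.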