Search CORE

2,584 research outputs found

Maximum agreement and compatible supertrees

Author: Berry Vincent
Nicolas François
Publication venue: Elsevier B.V.
Publication date: 01/01/2007
Field of study

AbstractGiven a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT.A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear to the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size.It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves.SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is the complements of SMAST and SMCT, cannot be approximated in polynomial time within any constant factor, unless P=NP. These results also hold when the input trees have a bounded number of leaves.The presented results apply to both collections of rooted and unrooted trees

Elsevier - Publisher Connector

HAL Descartes

Hal-Diderot

Recommended from our members

COMPACT REPRESENTATIONS OF UNCERTAINTY IN CLUSTERING

Author: Greenberg Craig Stuart
Publication venue: ScholarWorks@UMass Amherst
Publication date: 06/04/2021
Field of study

Flat clustering and hierarchical clustering are two fundamental tasks, often used to discover meaningful structures in data, such as subtypes of cancer, phylogenetic relationships, taxonomies of concepts, and cascades of particle decays in particle physics. When multiple clusterings of the data are possible, it is useful to represent uncertainty in clustering through various probabilistic quantities, such as the distribution over partitions or tree structures, and the marginal probabilities of subpartitions or subtrees. Many compact representations exist for structured prediction problems, enabling the efficient computation of probability distributions, e.g., a trellis structure and corresponding Forward-Backward algorithm for Markov models that model sequences. However, no such representation has been proposed for either flat or hierarchical clustering models. In this thesis, we present our work developing data structures and algorithms for computing probability distributions over flat and hierarchical clusterings, as well as for finding maximum a posteriori (MAP) flat and hierarchical clusterings, and various marginal probabilities, as given by a wide range of energy-based clustering models. First, we describe a trellis structure that compactly represents distributions over flat or hierarchical clusterings. We also describe related data structures that represent approximate distributions. We then present algorithms that, using these structures, allow us to compute the partition function, MAP clustering, and the marginal proba- bilities of a cluster (and sub-hierarchy, in the case of hierarchical clustering) exactly. We also show how these and related algorithms can be used to approximate these values, and analyze the time and space complexity of our proposed methods. We demonstrate the utility of our approaches using various synthetic data of interest as well as in two real world applications, namely particle physics at the Large Hadron Collider at CERN and in cancer genomics. We conclude with a brief discussion of future work

ScholarWorks@UMass Amherst

Ferromagnetic Potts Model: Refined #BIS-hardness and Related Results

Author: Galanis Andreas
Stefankovic Daniel
Vigoda Eric
Yang Linji
Publication venue
Publication date: 01/01/2016
Field of study

Recent results establish for 2-spin antiferromagnetic systems that the computational complexity of approximating the partition function on graphs of maximum degree D undergoes a phase transition that coincides with the uniqueness phase transition on the infinite D-regular tree. For the ferromagnetic Potts model we investigate whether analogous hardness results hold. Goldberg and Jerrum showed that approximating the partition function of the ferromagnetic Potts model is at least as hard as approximating the number of independent sets in bipartite graphs (#BIS-hardness). We improve this hardness result by establishing it for bipartite graphs of maximum degree D. We first present a detailed picture for the phase diagram for the infinite D-regular tree, giving a refined picture of its first-order phase transition and establishing the critical temperature for the coexistence of the disordered and ordered phases. We then prove for all temperatures below this critical temperature that it is #BIS-hard to approximate the partition function on bipartite graphs of maximum degree D. As a corollary, it is #BIS-hard to approximate the number of k-colorings on bipartite graphs of maximum degree D when k <= D/(2 ln D). The #BIS-hardness result for the ferromagnetic Potts model uses random bipartite regular graphs as a gadget in the reduction. The analysis of these random graphs relies on recent connections between the maxima of the expectation of their partition function, attractive fixpoints of the associated tree recursions, and induced matrix norms. We extend these connections to random regular graphs for all ferromagnetic models and establish the Bethe prediction for every ferromagnetic spin system on random regular graphs. We also prove for the ferromagnetic Potts model that the Swendsen-Wang algorithm is torpidly mixing on random D-regular graphs at the critical temperature for large q.Comment: To appear in SIAM J. Computin

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive