189 research outputs found

Optimizing Phylogenetic Supertrees Using Answer Set Programming

Author: Janhunen Tomi
Koponen Laura
Oikarinen Emilia
Säilä Laura
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2015

The supertree construction problem is about combining several phylogenetic trees with possibly conflicting information into a single tree that has all the leaves of the source trees as its leaves and whose relationships between the leaves are as consistent with the source trees as possible. This leads to a computationally challenging optimization problem, for which heuristic methods such as matrix representation with parsimony (MRP) are typically used. In this paper we consider the use of answer set programming to solve the supertree construction problem in terms of two alternative encodings. The first is based on an existing encoding of trees using substructures known as quartets, while the other, novel encoding captures the relationships present in trees through direct projections. We use these encodings to compute a genus-level supertree for the family of cats (Felidae). Furthermore, we compare our results to recent supertrees obtained by the MRP method.

Comment: To appear in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 201
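The quartet substructures mentioned in this abstract can be illustrated with a small sketch. It is not the paper's answer set programming encoding. It only assumes that a tree is written as nested Python tuples with string leaves (a hypothetical representation chosen here) and lists the resolved four-leaf subtrees the tree displays, each written as a pair of leaf pairs such as ab|cd.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree):
    """Return the leaf sets found below every internal node of the tree."""
    if isinstance(tree, str):
        return []
    found = [leaves(tree)]
    for child in tree:
        found.extend(clusters(child))
    return found

def quartets(tree):
    """Return the resolved quartets displayed by the tree.

    A quartet {a,b}|{c,d} is displayed when some internal edge separates
    a and b from c and d, i.e. some cluster contains exactly two of the four.
    """
    all_leaves = leaves(tree)
    result = set()
    for four in combinations(sorted(all_leaves), 4):
        chosen = frozenset(four)
        for cluster in clusters(tree):
            inside = chosen & cluster
            if len(inside) == 2:
                result.add(frozenset([inside, chosen - inside]))
    return result

# Hypothetical five-leaf example tree.
example = ((("a", "b"), "c"), ("d", "e"))
for quartet in sorted(quartets(example), key=lambda q: sorted(map(sorted, q))):
    print(" | ".join("".join(sorted(side)) for side in sorted(quartet, key=sorted)))
```

A supertree method built on quartets would then look for a tree on the union of all leaves that agrees with as many source-tree quartets as possible; that optimization step is not shown here.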

arXiv.org e-Print Archive

Crossref

Aaltodoc Publication Archive

Approximate Capacities of Two-Dimensional Codes by Spatial Mixing

Author: Wang Yi-Kai
Yin Yitong
Zhong Sheng
Publication date: 25/02/2014

We apply several state-of-the-art techniques from recent advances in counting algorithms and statistical physics to study the spatial mixing property of two-dimensional codes arising from local hard (independent set) constraints, including hard-square, hard-hexagon, read/write isolated memory (RWIM), and non-attacking kings (NAK). For these constraints, strong spatial mixing would imply the existence of a polynomial-time approximation scheme (PTAS) for computing the capacity. Strong spatial mixing and a PTAS were previously known for the hard-square constraint. We show strong spatial mixing for the hard-hexagon and RWIM constraints by establishing strong spatial mixing along self-avoiding walks, and consequently we give a PTAS for computing the capacities of these codes. We also show that for the NAK constraint, strong spatial mixing does not hold along self-avoiding walks.
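To make the notion of capacity concrete, here is a minimal sketch for the hard-square constraint only. It is a plain row-by-row (transfer matrix) count on a finite grid, not the spatial-mixing-based PTAS of the paper, and the function names are chosen here for illustration. It assumes the usual definition of capacity as the limit of log2(N(w, h)) / (w * h), where N(w, h) is the number of 0/1 grids of size w by h with no two horizontally or vertically adjacent 1s.

```python
from math import log2

def valid_rows(width):
    """Bitmasks of a single row with no two adjacent 1s."""
    return [mask for mask in range(1 << width) if mask & (mask >> 1) == 0]

def count_hard_square(width, height):
    """Count width x height 0/1 grids with no two adjacent 1s.

    counts[r] holds the number of valid partial grids whose last row is r.
    Two consecutive rows are compatible when they share no 1 in any column.
    """
    rows = valid_rows(width)
    counts = {r: 1 for r in rows}
    for _ in range(height - 1):
        counts = {
            r: sum(c for prev, c in counts.items() if prev & r == 0)
            for r in rows
        }
    return sum(counts.values())

def capacity_estimate(width, height):
    """Finite-size estimate of the capacity in bits per cell."""
    return log2(count_hard_square(width, height)) / (width * height)

print(count_hard_square(3, 3))
print(capacity_estimate(8, 8))
```

The running time grows exponentially in the width, which is one reason approximation schemes with provable guarantees are of interest for these constraints.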

arXiv.org e-Print Archive

Crossref

Probabilistic Constraint Logic Programming

Author: Riezler Stefan
Publication date: 11/11/1997

This paper addresses two central problems for probabilistic processing models: parameter estimation from incomplete data and efficient retrieval of most probable analyses. These questions have been answered satisfactorily only for probabilistic regular and context-free models. We address these problems for a more expressive probabilistic constraint logic programming model. We present a log-linear probability model for probabilistic constraint logic programming. On top of this model we define an algorithm to estimate the parameters and to select the properties of log-linear models from incomplete data. This algorithm is an extension of the improved iterative scaling algorithm of Della-Pietra, Della-Pietra, and Lafferty (1995). Our algorithm applies to log-linear models in general and is accompanied by suitable approximation methods when applied to large data spaces. Furthermore, we present an approach for searching for most probable analyses of the probabilistic constraint logic programming model. This method can be applied to the ambiguity resolution problem in natural language processing applications.

Comment: 35 pages, uses sfbart.cl
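As a rough illustration of what a log-linear probability model over analyses looks like, the sketch below assigns each analysis x a probability proportional to exp(sum_i w_i * f_i(x)), where the f_i are property functions and the w_i their weights. It does not implement the paper's estimation algorithm or its search method. The analyses, the single property function and the weights are all hypothetical, and the set of analyses is assumed small enough to enumerate.

```python
from math import exp

def score(analysis, properties, weights):
    """Weighted sum of property values for one analysis."""
    return sum(w * f(analysis) for f, w in zip(properties, weights))

def log_linear_distribution(analyses, properties, weights):
    """Normalize exp(score) over a finite set of analyses."""
    unnormalized = {x: exp(score(x, properties, weights)) for x in analyses}
    z = sum(unnormalized.values())  # normalizing constant
    return {x: value / z for x, value in unnormalized.items()}

def most_probable(analyses, properties, weights):
    """The argmax does not need the normalizing constant."""
    return max(analyses, key=lambda x: score(x, properties, weights))

# Hypothetical ambiguity: two attachment analyses of one sentence.
analyses = ["np-attach", "vp-attach"]
properties = [lambda x: 1.0 if x == "vp-attach" else 0.0]
weights = [1.0]

print(log_linear_distribution(analyses, properties, weights))
print(most_probable(analyses, properties, weights))
```

With real grammars the set of analyses cannot be enumerated this way, which is where approximation methods and a dedicated search procedure become necessary.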

arXiv.org e-Print Archive

CiteSeerX

Reducing the storage requirements of dataflow constraints using model dependencies

Author: Halterman Richard L.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/1999

Dataflow constraints allow programmers to specify relationships among application objects in a natural, declarative manner. Most constraint solvers represent these dataflow relationships as directed edges in a dependency graph. Unfortunately, dependency graphs require a great deal of storage. Consequently, an application with a large number of constraints can get pushed into virtual memory, and performance degrades in interactive applications. The solution presented here is based on the observation that objects derived from the same prototype use the same constraints and thus have the same dependency graphs. The common dependency patterns are represented in a model dependency graph that is stored in a prototype. Instance objects may derive explicit dependencies from this graph when the dependencies are needed. Model dependencies provide a useful new mechanism for improving the storage efficiency of dataflow constraint systems, especially when a large number of constrained objects must be managed.
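The idea of storing dependency patterns once per prototype can be sketched as follows. The abstract does not describe the actual data structures, so the class names, the slot names and the dictionary layout are assumptions made for illustration only.

```python
class Prototype:
    """Holds the model dependency graph shared by all of its instances."""

    def __init__(self, model_dependencies):
        # Maps a slot name to the slots whose constraints read it.
        self.model_dependencies = model_dependencies


class Instance:
    """Stores no dependency edges of its own, only a prototype reference."""

    def __init__(self, name, prototype):
        self.name = name
        self.prototype = prototype

    def dependents(self, slot):
        """Derive explicit (object, slot) dependencies only when asked."""
        targets = self.prototype.model_dependencies.get(slot, [])
        return [(self.name, target) for target in targets]


# Hypothetical prototype: 'right' is computed from 'left' and 'width'.
box = Prototype({"left": ["right"], "width": ["right"]})
boxes = [Instance(f"box{i}", box) for i in range(1000)]

print(boxes[7].dependents("left"))
```

In this sketch a thousand instances share one small dictionary instead of each holding its own copy of the edges, which is the storage saving the abstract refers to.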

University of Tennessee, Knoxville: Trace

Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems

Author: Dong Jianrong
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2012

This thesis studies two different approaches to extracting information from collections of phylogenetic trees: supertrees and reduced consensus. Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former. We study a variant of one of Cotton and Wilkinson's methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page. The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies. Generalizations of the strict and loose consensus methods to the supertree setting, recently introduced by McMorris and Wilkinson, are studied.
The supertrees these methods produce are conservative in the sense that they only preserve information (in the form of splits) that is supported by at least one of the input trees and that is not contradicted by any of the input trees. Alternative, equivalent formulations of these supertrees are developed. These are used to prove the NP-completeness of the underlying optimization problems and to give exact integer linear programming solutions. For larger data sets, a divide and conquer approach is adopted, based on the structural properties of these supertrees. Experiments show that it is feasible to solve problems with several hundred taxa and several hundred trees in a reasonable amount of time. A rogue taxon in a collection of phylogenetic trees is one whose position varies drastically from tree to tree. The presence of such taxa can greatly reduce the resolution of the consensus tree (e.g., the majority-rule or strict consensus) for a collection. The reduced consensus approach aims to identify rogue taxa and to produce more informative consensus trees. Given a collection of phylogenetic trees over the same leaf set, the goal is to find a set of taxa whose removal maximizes the number of internal edges in the consensus tree of the collection. This problem is proven to be NP-hard for strict and majority-rule consensus. We describe exact integer linear programming formulations for computing reduced strict, majority and loose consensus trees. In experimental tests, our exact solutions show significant improvement over heuristic methods on several problem instances.
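The reduced consensus objective described above can be illustrated with a brute-force sketch. It is not the thesis's integer linear programming formulation. It assumes rooted trees written as nested tuples with string leaves (a simplification chosen here), uses the strict consensus only, and counts internal edges as the nontrivial clusters shared by all trees after the removed taxa are pruned.

```python
from itertools import combinations

def leaf_set(tree):
    """All leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def restricted_clusters(tree, keep):
    """Nontrivial clusters of a rooted tree after pruning it to `keep`."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node]) & keep
        below = frozenset().union(*(visit(child) for child in node))
        if 1 < len(below) < len(keep):
            found.add(below)
        return below

    visit(tree)
    return found

def strict_consensus_size(trees, removed=frozenset()):
    """Number of nontrivial clusters present in every pruned tree."""
    keep = leaf_set(trees[0]) - removed
    shared = set.intersection(*(restricted_clusters(t, keep) for t in trees))
    return len(shared)

def best_removal(trees, max_removed):
    """Try every taxon set up to the given size (exponential in general)."""
    taxa = sorted(leaf_set(trees[0]))
    best = (strict_consensus_size(trees), frozenset())
    for k in range(1, max_removed + 1):
        for removed in combinations(taxa, k):
            size = strict_consensus_size(trees, frozenset(removed))
            if size > best[0]:
                best = (size, frozenset(removed))
    return best

# Hypothetical input: taxon "x" jumps between the two trees.
t1 = (((("a", "b"), "c"), "d"), "x")
t2 = (((("a", "x"), "b"), "c"), "d")
print(best_removal([t1, t2], max_removed=1))
```

In this example the strict consensus of the two full trees has no internal edge, while pruning "x" recovers two shared clusters. Enumerating all removal sets is only feasible for tiny inputs, which is consistent with the NP-hardness result stated in the abstract.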

Digital Repository @ Iowa State University (ISU)

Constructing majority-rule supertrees

Author: A Purvis
AD Gordon
BR Baum
C Semple
CG Sibley
D Bryant
D Gusfield
D Pisani
David Fernández-Baca
DF Robinson
DG Brown
E Danna
EN Adams
F Delsuc
FR McMorris
G Sierksma
GB Nunn
J Dong
JA Cotton
Jianrong Dong
JP Barthélemy
M Kennedy
M Wilkinson
MA Ragan
MA Steel
MdL Brooke
N Amenta
ND Pattengale
ORP Bininda-Emonds
P Goloboff
PA Goloboff
S Sridhar
T Margush
V Ranwez
W Day
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract

Background: Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former.

Results: We study a variant of one of Cotton and Wilkinson's methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page.

Conclusions: The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies.
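For orientation, the classical majority-rule consensus that the supertree method extends can be sketched in a few lines. This is not the majority-rule (+) supertree construction or the paper's integer linear program. It assumes rooted input trees on the same leaf set, written as nested tuples with string leaves, and keeps every cluster that occurs in more than half of the trees.

```python
from collections import Counter

def clusters(tree):
    """Leaf sets below every internal node of a nested-tuple tree."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        found.add(below)
        return below

    visit(tree)
    return found

def majority_rule_clusters(trees):
    """Clusters that appear in strictly more than half of the input trees."""
    counts = Counter(c for tree in trees for c in clusters(tree))
    return {c for c, n in counts.items() if n > len(trees) / 2}

# Hypothetical input: two trees agree, the third conflicts.
trees = [
    (("a", "b"), ("c", "d")),
    (("a", "b"), ("c", "d")),
    (("a", "c"), ("b", "d")),
]
for cluster in sorted(majority_rule_clusters(trees), key=lambda c: (len(c), sorted(c))):
    print(sorted(cluster))
```

Clusters that each occur in more than half of the trees are always pairwise compatible, so they define a tree. When the input trees only partially overlap in their leaves, as in the supertree setting, this simple counting no longer applies directly.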

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central