
45 research outputs found

The Application of Chordal Graphs to Inferring Phylogenetic Trees of Languages

Author: Enright Jessica
Kondrak Grzegorz
Publication venue
Publication date: 01/01/2011
Field of study

Phylogenetic methods are used to build evolutionary trees of languages given character data that may include lexical, phonological, and morphological information. Such data rarely admits a perfect phylogeny. We explore the use of the more permissive conservative Dollo phylogeny as an alternative or complementary approach. We propose a heuristic search algorithm based on the notion of chordal graphs. We test this approach by generating phylogenetic trees from three datasets and comparing them to those produced by other researchers.

Enlighten

Characterization of Super Strongly Perfect Graphs in Chordal and Strongly Chordal Graphs

Author: Amutha A
Jeya Jothi R Mary
Publication venue: 'Christ University Bangalore'
Publication date: 25/09/2012
Field of study

A graph G is a Super Strongly Perfect graph if every induced subgraph H of G possesses a minimal dominating set that meets all the maximal complete subgraphs of H. In this paper, we investigate the characterization of Super Strongly Perfect graphs using odd cycles. We give the characterization of Super Strongly Perfect graphs in chordal and strongly chordal graphs. We present results on chordal graphs in terms of the domination number γ and the co-domination number. We give the relationship between the diameter, domination and co-domination numbers of chordal graphs. We also analyse the structure of Super Strongly Perfect graphs in chordal and strongly chordal graphs.

Christ University Bengaluru: Open Journal Systems

The hardness of perfect phylogeny, feasible register assignment and other problems on thin colored graphs

Author: Bodlaender Hans L.
Fellows Michael R.
Hallett Michael T.
Wareham H.Todd
Warnow Tandy J.
Publication venue: Elsevier Science B.V.
Publication date: 06/08/2000
Field of study

In this paper, we consider the complexity of a number of combinatorial problems; namely, Intervalizing Colored Graphs (DNA physical mapping), Triangulating Colored Graphs (perfect phylogeny), (Directed) (Modified) Colored Cutwidth, Feasible Register Assignment and Module Allocation for graphs of bounded pathwidth. Each of these problems has as a characteristic a uniform upper bound on the tree or path width of the graphs in “yes”-instances. For all of these problems with the exceptions of Feasible Register Assignment and Module Allocation, a vertex or edge coloring is given as part of the input. Our main results are that the parameterized variant of each of the considered problems is hard for the complexity classes W[t] for all t∈N. We also show that Intervalizing Colored Graphs, Triangulating Colored Graphs, and Colored Cutwidth are NP-Complete.

Elsevier - Publisher Connector

Fast and accurate supertrees: towards large scale phylogenies

Author: Fleischauer Markus
Publication venue
Publication date: 01/01/2018
Field of study

Phylogenetics is the study of evolutionary relationships between biological entities; phylogenetic trees (phylogenies) are a visualization of these evolutionary relationships. Accurate approaches to reconstruct phylogenies from sequence data usually result in NP-hard optimization problems, hence local search heuristics have to be applied in practice. These methods are highly accurate and fast enough as long as the input data is not too large. Divide-and-conquer techniques are a promising approach to boost scalability and accuracy of those local search heuristics on very large datasets. A divide-and-conquer method breaks down a large phylogenetic problem into smaller sub-problems that are computationally easier to solve. The sub-problems (overlapping trees) are then combined using a supertree method. Supertree methods merge a set of overlapping phylogenetic trees into a supertree containing all taxa of the input trees. The challenge in supertree reconstruction is the way of dealing with conflicting information in the input trees. Many different algorithms for different objective functions have been suggested to resolve these conflicts. In particular, there are methods that encode the source trees in a matrix and construct the supertree by applying a local search heuristic to optimize the respective objective function. The most widely used supertree methods use such local search heuristics. However, to really improve the scalability of accurate tree reconstruction by divide-and-conquer approaches, accurate polynomial time methods are needed for the supertree reconstruction step. In this work, we present approaches for accurate polynomial time supertree reconstruction, in particular Bad Clade Deletion (BCD), a novel heuristic supertree algorithm with polynomial running time. BCD uses minimum cuts to greedily delete a locally minimal number of columns from a matrix representation to make it compatible.
Unlike local search heuristics, it is guaranteed to return the directed perfect phylogeny for the input matrix, corresponding to the parent tree of the input trees, if one exists. BCD can take support values of the source trees into account without an increase in complexity. We show how reliable clades can be used to restrict the search space for BCD and how those clades can be collected from the input data using the Greedy Strict Consensus Merger. Finally, we introduce a beam search extension for the BCD algorithm that keeps alive a constant number of partial solutions in each top-down iteration phase. The guaranteed worst-case running time of BCD with beam search extension is still polynomial. We present an exact and a randomized subroutine to generate suboptimal partial solutions. In our thorough evaluation on several simulated and biological datasets against a representative set of supertree methods, we found that BCD is more accurate than the most accurate supertree methods when using support values and search space restriction on simulated data. At the same time, BCD is faster than any other evaluated method. The beam search approach improved the accuracy of BCD on all evaluated datasets at the cost of speed. We found that BCD supertrees can boost maximum likelihood tree reconstruction when used as starting tree. Further, BCD could handle large scale datasets where local search heuristics did not converge in reasonable time. Due to its combination of speed, accuracy, and the ability to reconstruct the parent tree if one exists, BCD is a promising approach to enable outstanding scalability of divide-and-conquer approaches.
Phylogenetics studies the evolutionary relationships between biological entities. Phylogenetic trees are a visualization of these relationships. Accurate approaches to reconstructing phylogenies from sequence data usually lead to NP-hard optimization problems, so local search heuristics have to be applied in practice. These methods yield accurate trees and are fast enough as long as the input data does not become too large. Divide-and-conquer methods are a promising approach to improving the scalability and accuracy of these local search heuristics on very large datasets. In the divide-and-conquer approach, a large phylogenetic problem is decomposed into smaller subproblems that are easier and faster to solve. The subproblems, in this case overlapping subtrees, must then be combined into a single overall tree. Supertree methods merge such overlapping phylogenetic trees into a supertree that contains all taxa of the input trees. The challenge in supertree reconstruction is dealing with conflicting input trees. Many different algorithms with different objective functions have been developed to resolve such conflicts as sensibly as possible. Methods based on encoding the input trees as a matrix representation are the most widespread. The objective functions used to resolve the conflicts usually lead to NP-hard optimization problems, so local search heuristics are used in practice here as well. Because these approaches do not scale substantially better with the size of the input data than direct reconstruction from sequence data, accurate polynomial-time methods are needed for supertree reconstruction in divide-and-conquer approaches. This work deals with the accurate reconstruction of supertrees in polynomial time. We present Bad Clade Deletion (BCD), a new polynomial-time heuristic for supertree reconstruction.
BCD uses minimum cuts in graphs to delete a minimal number of columns from the matrix representation so that it becomes conflict-free. In contrast to local search heuristics, BCD guarantees the reconstruction of a perfect phylogeny if one exists for the input matrix. BCD makes it possible to take quality criteria of the input trees into account without increasing the complexity. Furthermore, we show how reliable clades can be used to restrict the search space for BCD and how they can be obtained from the input data with the Greedy Strict Consensus Merger. Finally, we present a beam search for BCD. It allows a certain number of suboptimal partial solutions (instead of only the optimal one) to be considered in order to improve the overall result. The worst-case running time of the beam search is still polynomial. For computing suboptimal partial solutions, we present an exact and a randomized algorithm. In an extensive evaluation on several simulated and biological datasets, we compare BCD with a representative selection of supertree methods. We found that, when quality criteria and search space restriction are used, BCD is more accurate on simulated data than the most accurate evaluated supertree methods. At the same time, BCD is considerably faster than all evaluated methods. Beam search improves the quality of BCD trees on all datasets, but at the cost of running time. We further found that a BCD supertree used as a starting tree can improve the quality of a maximum likelihood tree reconstruction. In addition, BCD can process datasets so large that local search heuristics no longer converge on them in reasonable time.
Owing to its combination of speed, accuracy, and the ability to reconstruct the parent tree if one exists, BCD is a promising approach to decisively improving the scalability of divide-and-conquer methods.
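
The abstract above refers to a matrix representation being "compatible", meaning that it admits a directed perfect phylogeny. As a minimal sketch of that notion only (not of BCD itself), the following Python check uses the standard pairwise criterion for complete binary matrices with an all-zero root: every two columns must have taxon sets that are either disjoint or nested. The function name and the example matrices are illustrative assumptions, not taken from the thesis.

```python
from itertools import combinations


def admits_directed_perfect_phylogeny(matrix):
    """Return True if a complete 0/1 matrix (rows = taxa, columns = characters)
    admits a directed perfect phylogeny rooted at the all-zero state.

    Hypothetical helper for illustration; assumes no missing entries.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    # For each character (column), collect the set of taxa that carry it.
    carriers = [
        {i for i, row in enumerate(matrix) if row[j] == 1}
        for j in range(n_cols)
    ]
    # Two characters conflict if their taxon sets overlap without nesting.
    for a, b in combinations(carriers, 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True


# Columns {t0, t1}, {t0}, {t2}: nested or disjoint, hence compatible.
compatible = [[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1]]

# Columns {t0, t1} and {t0, t2}: overlapping but not nested, hence a conflict.
conflicting = [[1, 1],
               [1, 0],
               [0, 1]]
```

In this picture, a column-deletion method removes columns until the check passes; how BCD chooses which columns to delete (via minimum cuts) is not modelled here.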

Digitale Bibliothek Thüringen

Maximal Chordal Subgraphs

Author: Gishboliner Lior
Sudakov Benny
Publication venue
Publication date: 10/03/2023
Field of study

A chordal graph is a graph with no induced cycles of length at least 4. Let f(n,m) be the maximal integer such that every graph with n vertices and m edges has a chordal subgraph with at least f(n,m) edges. In 1985 Erdős and Laskar posed the problem of estimating f(n,m). In the late '80s, Erdős, Gyárfás, Ordman and Zalcstein determined the value of f(n,n^2/4+1) and made a conjecture on the value of f(n,n^2/3+1). In this paper we prove this conjecture and answer the question of Erdős and Laskar, determining f(n,m) asymptotically for all m and exactly for m ≤ n^2/3+1.
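
To make the definition of a chordal graph concrete, here is a minimal Python sketch of a chordality test. It relies on the classical fact that a graph is chordal exactly when its vertices can be removed one at a time, each being simplicial (its neighbours form a clique) at the moment of removal. The adjacency-dictionary representation and the helper names are assumptions made for illustration; this is a simple polynomial-time check, not the linear-time algorithm and not part of the paper.

```python
from itertools import combinations


def is_chordal(adj):
    """adj: dict mapping each vertex to the set of its neighbours
    (undirected, no self-loops). Returns True if the graph is chordal."""
    # Work on a copy so the caller's graph is not modified.
    g = {v: set(ns) for v, ns in adj.items()}
    while g:
        # Look for a vertex whose neighbours are pairwise adjacent.
        simplicial = next(
            (v for v, ns in g.items()
             if all(b in g[a] for a, b in combinations(ns, 2))),
            None,
        )
        if simplicial is None:
            # No simplicial vertex left: an induced cycle of length >= 4 exists.
            return False
        for u in g[simplicial]:
            g[u].discard(simplicial)
        del g[simplicial]
    return True


def cycle(n):
    """Cycle on vertices 0..n-1 (intended for n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


four_cycle = cycle(4)

# The same 4-cycle with the chord 0-2 added.
with_chord = {v: set(ns) for v, ns in four_cycle.items()}
with_chord[0].add(2)
with_chord[2].add(0)
```

Here `is_chordal(four_cycle)` is false because the 4-cycle is itself an induced cycle of length 4, while adding the chord 0-2 splits it into two triangles and makes the graph chordal.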

arXiv.org e-Print Archive

Repository for Publications and Research Data

Recommended from our members

Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.

Author: Chan Michelle M
Hussmann Jeffrey A
Jones Matthew G
Khodaverdian Alex
Quinn Jeffrey J
Wang Robert
Weissman Jonathan S
Xu Chenling
Yosef Nir
Publication venue: eScholarship, University of California
Publication date: 01/04/2020
Field of study

The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia.

eScholarship - University of California

Predicting Horizontal Gene Transfers with Perfect Transfer Networks

Author: Lafond Manuel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Publication date: 01/01/2022
Field of study

Horizontal gene transfer inference approaches are usually based on gene sequences: parametric methods search for patterns that deviate from a particular genomic signature, while phylogenetic methods use sequences to reconstruct the gene and species trees. However, it is well-known that sequences have difficulty identifying ancient transfers since mutations have enough time to erase all evidence of such events. In this work, we ask whether character-based methods can predict gene transfers. Their advantage over sequences is that homologous genes can have low DNA similarity, but still have retained enough important common motifs that allow them to have common character traits, for instance the same functional or expression profile. A phylogeny that has two separate clades that acquired the same character independently might indicate the presence of a transfer even in the absence of sequence similarity. We introduce perfect transfer networks, which are phylogenetic networks that can explain the character diversity of a set of taxa. This problem has been studied extensively in the form of ancestral recombination networks, but these only model hybridization events and do not differentiate between direct parents and lateral donors. We focus on tree-based networks, in which edges representing vertical descent are clearly distinguished from those that represent horizontal transmission. Our model is a direct generalization of perfect phylogeny models to such networks. Our goal is to initiate a study on the structural and algorithmic properties of perfect transfer networks. We then show that in polynomial time, one can decide whether a given network is a valid explanation for a set of taxa, and show how, for a given tree, one can add transfer edges to it so that it explains a set of taxa.

Dagstuhl Research Online Publication Server

Evolution of protein domain architectures

Author: A Heger
A Marchler-Bauer
A Nagy
A Nasir
A Rijk van
A Rzhetsky
A-L Barabási
AD Moore
AH Brivanlou
AR Kersting
B Lee
B Snel
C Bru
C Chothia
C Feschotte
C Haider
C Vogel
C-H Hsu
CM Zmasek
D Ekman
D Wilson
DP Syamaladevi
E Bornberg-Bauer
E Dohmen
E Gogvadze
E Nimwegen van
EE Schmidt
EM Marcotte
EV Koonin
G Apic
GP Karev
H Tordai
I Cohen-Gihon
I Letunic
I Yanai
J Gough
J Qian
J Weiner
J Weiner III
J Wiedenhoeft
J-M Chandonia
JAG Ranea
JH Fong
JM Eirin-Lopez
JP Demuth
JS Farris
K Forslund
L Grassi
L Leclère
L Li
L Patthy
LY Geer
M Bashton
M Buljan
M d C Orozco-Mosqueda
M Itoh
M Liu
M Sharma
M Stolzer
M Toll-Riera
MA Huynen
MK Basu
N Terrapon
N Vera-Parra
NC Brissett
NL Dawson
NM Luscombe
R Cordaux
RD Finn
RF Doolittle
S Wuchty
S Yang
SD Lam
SK Kummerfeld
T Bitard-Feildel
T Doğan
T Koestler
T Przytycka
TE Lewis
UniProt Consortium
V Hollich
VA Kuznetsov
W-D Heyer
X Xie
X-C Zhang
Y-C Wu
ÅK Björklund
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019
Field of study

This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this will directly impact which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end with a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.

Crossref

MDC Repository