Resolving Prime Modules: The Structure of Pseudo-cographs and Galled-Tree Explainable Graphs
The modular decomposition of a graph G is a natural construction to capture
key features of G in terms of a labeled tree (T,t) whose vertices are
labeled as "series" (1), "parallel" (0) or "prime". However, full
information about G is provided by its modular decomposition tree (T,t) only
if G is a cograph, i.e., G does not contain prime modules. In this case, (T,t)
explains G, i.e., {x,y} is an edge of G if and only if the lowest common
ancestor of x and y in T has label "1". Pseudo-cographs,
or, more generally, GaTEx graphs G are graphs that can be explained by labeled
galled-trees, i.e., labeled networks (N,t) that are obtained from the modular
decomposition tree (T,t) of G by replacing the prime vertices in T by
simple labeled cycles. GaTEx graphs can be recognized and labeled galled-trees
that explain these graphs can be constructed in linear time.
In this contribution, we provide a novel characterization of GaTEx graphs in
terms of a set of 25 forbidden induced subgraphs.
This characterization, in turn, allows us to show that GaTEx graphs are closely
related to many other well-known graph classes such as P_4-sparse and
P_4-reducible graphs, weakly-chordal graphs, perfect graphs with perfect
order, comparability and permutation graphs, murky graphs as well as interval
graphs, Meyniel graphs or very strongly-perfect and brittle graphs. Moreover,
we show that every GaTEx graph has twin-width at most 1.
Comment: 18 pages, 3 figures
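The statement that a labeled tree explains a cograph can be illustrated with a minimal sketch. The nested-tuple tree encoding and the function names below are assumptions made for this example and are not taken from the paper; the sketch covers only the cograph case (labels 1 and 0), not galled-trees or prime vertices.

```python
from itertools import combinations

# Hypothetical encoding: a leaf is a vertex name (str); an inner node is a
# tuple (label, children) with label 1 ("series") or 0 ("parallel").
def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(leaves(child) for child in children))

def explained_edges(node):
    """Edges {x, y} whose lowest common ancestor in the tree has label 1."""
    if isinstance(node, str):
        return set()
    label, children = node
    edges = set()
    for child in children:
        edges |= explained_edges(child)
    if label == 1:
        # Leaves below two different children have this node as their lca.
        for a, b in combinations(children, 2):
            edges |= {frozenset((x, y)) for x in leaves(a) for y in leaves(b)}
    return edges

# Labeled tree for the path a - b - c: series(b, parallel(a, c)).
tree = (1, ["b", (0, ["a", "c"])])
edges = explained_edges(tree)
```

Here `edges` contains exactly the pairs {a, b} and {b, c}, since only those pairs have the series-labeled root as their lowest common ancestor.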
Reconstructing Gene Trees From Fitch's Xenology Relation
Two genes are xenologs in the sense of Fitch if they are separated by at
least one horizontal gene transfer event. Horizontal gene transfer is asymmetric
in the sense that the transferred copy is distinguished from the one that
remains within the ancestral lineage. Hence xenology is more precisely thought
of as a non-symmetric relation: y is xenologous to x if y has been
horizontally transferred at least once since it diverged from the least common
ancestor of x and y. We show that xenology relations are characterized by a
small set of forbidden induced subgraphs on three vertices. Furthermore, each
xenology relation can be derived from a unique least-resolved edge-labeled
phylogenetic tree. We provide a linear-time algorithm for the recognition of
xenology relations and for the construction of its least-resolved edge-labeled
phylogenetic tree. The fact that being a xenology relation is a heritable graph
property has far-reaching consequences for approximation problems
associated with xenology relations.
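As a hedged illustration of the non-symmetric relation described in this abstract, the sketch below derives the relation from a rooted tree whose edges are marked as transfer or non-transfer edges. The dictionary encoding, the function names and the convention that a pair (x, y) records "y was transferred after diverging from the least common ancestor of x and y" are assumptions for this example. It is a quadratic-time illustration, not the linear-time recognition algorithm mentioned in the abstract.

```python
def path_to_root(parent, v):
    """Return the vertices v, parent(v), ..., root in that order."""
    path = [v]
    while v in parent:
        v = parent[v][0]
        path.append(v)
    return path

def fitch_relation(parent, leaves):
    """Pairs (x, y) with a transfer edge on the path from lca(x, y) to y."""
    relation = set()
    for x in leaves:
        ancestors_of_x = set(path_to_root(parent, x))
        for y in leaves:
            if x == y:
                continue
            v, transferred = y, False
            # Climb from y until the first vertex that is also an ancestor
            # of x; that vertex is lca(x, y).
            while v not in ancestors_of_x:
                transferred = transferred or parent[v][1]
                v = parent[v][0]
            if transferred:
                relation.add((x, y))
    return relation

# Hypothetical tree: root r has children a and u; u has children b and c.
# Each entry maps child -> (parent, is_transfer_edge); only (u, c) is a
# transfer edge.
parent = {"a": ("r", False), "u": ("r", False),
          "b": ("u", False), "c": ("u", True)}
relation = fitch_relation(parent, ["a", "b", "c"])
```

In this toy tree the relation consists of (a, c) and (b, c) but neither (c, a) nor (c, b), which shows the asymmetry.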
Induced minors and well-quasi-ordering
A graph H is an induced minor of a graph G if it can be obtained from an
induced subgraph of G by contracting edges. Otherwise, G is said to be
H-induced minor-free. Robin Thomas showed that K_4-induced minor-free
graphs are well-quasi-ordered by induced minors [Graphs without K_4 and
well-quasi-ordering, Journal of Combinatorial Theory, Series B, 38(3):240--247,
1985].
We provide a dichotomy theorem for H-induced minor-free graphs and show
that the class of H-induced minor-free graphs is well-quasi-ordered by the
induced minor relation if and only if H is an induced minor of the gem (the
path on 4 vertices plus a dominating vertex) or of the graph obtained by adding
a vertex of degree 2 to the complete graph on 4 vertices. To this end we proved
two decomposition theorems which are of independent interest.
Similar dichotomy results were previously given for subgraphs by Guoli Ding
in [Subgraphs and well-quasi-ordering, Journal of Graph Theory, 16(5):489--502,
1992] and for induced subgraphs by Peter Damaschke in [Induced subgraphs and
well-quasi-ordering, Journal of Graph Theory, 14(4):427--435, 1990].
Hadwiger number of graphs with small chordality
The Hadwiger number of a graph G is the largest integer h such that G has the
complete graph K_h as a minor. We show that the problem of determining the
Hadwiger number of a graph is NP-hard on co-bipartite graphs, but can be solved
in polynomial time on cographs and on bipartite permutation graphs. We also
consider a natural generalization of this problem that asks for the largest
integer h such that G has a minor with h vertices and diameter at most s. We
show that this problem can be solved in polynomial time on AT-free graphs when
s>=2, but is NP-hard on chordal graphs for every fixed s>=2.
The Orthology Road: Theory and Methods in Orthology Analysis
The evolution of biological species depends on changes in genes. Among these changes are the gradual accumulation of DNA mutations, insertions and deletions, duplication of genes, movements of genes within and between chromosomes, gene losses and gene transfer. As two populations of the same species evolve independently, they will eventually become reproductively isolated and become two distinct species. The evolutionary history of a set of related species through the repeated occurrence of this speciation process can be represented as a tree-like structure, called a phylogenetic tree or a species tree. Since duplicated genes in a single species also independently accumulate point mutations, insertions and deletions, they drift apart in composition in the same way as genes in two related species. The divergence of all the genes descended from a single gene in an ancestral species can also be represented as a tree, a gene tree that takes into account both speciation and duplication events.
In order to reconstruct the evolutionary history from the study of extant species, we use sets of similar genes, with relatively high degree of DNA similarity and usually with some functional resemblance, that appear to have been derived from a common ancestor. The degree of similarity among different instances of the "same gene" in different species can be used to explore their evolutionary history via the reconstruction of gene family histories, namely gene trees.
Orthology refers specifically to the relationship between two genes that arose by a speciation event, recent or remote, rather than duplication. Comparing orthologous genes is essential to the correct reconstruction of species trees, so that detecting and identifying orthologous genes is an important problem, and a longstanding challenge, in comparative and evolutionary genomics as well as phylogenetics.
A variety of orthology detection methods have been devised in recent years. Although many of these methods are dependent on generating gene and/or species trees, it has been shown that orthology can be estimated at acceptable levels of accuracy without having to infer gene trees and/or reconciling gene trees with species trees. Therefore, there is good reason to look at the connection of trees and orthology from a different angle: How much information about the gene tree, the species tree, and their reconciliation is already contained in the orthology relation among genes? Intriguingly, a solution to the first part of this question has already been given by Boecker and Dress [Boecker and Dress, 1998] in a different context. In particular, they completely characterized certain maps which they called symbolic ultrametrics. Semple and Steel [Semple and Steel, 2003] then presented an algorithm that can be used to reconstruct a phylogenetic tree from any given symbolic ultrametric. In this thesis we investigate a new characterization of orthology relations, based on symbolic ultrametrics for recovering the gene tree.
According to Fitch's definition [Fitch, 2000], two genes are (co-)orthologous if their last common ancestor in the gene tree represents a speciation event. On the other hand, when their last common ancestor is a duplication event, the genes are paralogs. The orthology relation on a set of genes is therefore determined by the gene tree and an "event labeling" that identifies each interior vertex of that tree as either a duplication or a speciation event. In the context of analyzing orthology data, the problem of reconciling event-labeled gene trees with a species tree appears as a variant of the reconciliation problem where gene trees have no labels in their internal vertices. When reconciling a gene tree with a species tree, it can be assumed that the species tree is correct or, in the case of an unknown species tree, it can be inferred. Therefore it is crucial to know for a given gene tree whether there even exists a species tree. In this thesis we characterize event-labelled gene trees for which a species tree exists and species trees to which event-labelled gene trees can be mapped.
Reconciliation methods are not always the best options for detecting orthology. A fundamental problem is that, aside from multicellular eukaryotes, evolution does not seem to have conformed to the descent-with-modification model that gives rise to tree-like phylogenies. Examples include many cases of prokaryotes and viruses whose evolution involved horizontal gene transfer. To treat the problem of distinguishing orthology and paralogy within a more general framework, graph-based methods have been proposed to detect and differentiate among evolutionary relationships of genes in those organisms. In this work we introduce a measure of orthology that can be used to test graph-based methods and reconciliation methods that detect orthology.
Using these results, we devise a new algorithm, BOTTOM-UP, that determines whether a map from the set of vertices of a tree to a set of events is a symbolic ultrametric. Additionally, we present a simulation environment designed to generate large gene families with complex duplication histories, on which reconstruction algorithms can be tested and software tools benchmarked.
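To make the notion of a symbolic ultrametric more concrete, the following brute-force sketch tests a symmetric map on pairs of genes. It is not the BOTTOM-UP algorithm, whose details are not given in this abstract. It assumes the usual three-point condition (no triple of genes carries three distinct symbols) and four-point condition (no four genes x, y, u, v with d(x,y) = d(y,u) = d(u,v) different from d(y,v) = d(x,v) = d(x,u)); the function names and the dictionary encoding are hypothetical.

```python
from itertools import combinations, permutations

def make_delta(table):
    """Store a symmetric map as frozenset({x, y}) -> event symbol."""
    return {frozenset(pair): symbol for pair, symbol in table.items()}

def is_symbolic_ultrametric(items, delta):
    """Brute-force check of the assumed three- and four-point conditions."""
    def d(x, y):
        return delta[frozenset((x, y))]

    # Three-point condition: no triple shows three distinct symbols.
    for x, y, z in combinations(items, 3):
        if len({d(x, y), d(x, z), d(y, z)}) > 2:
            return False
    # Four-point condition: no x, y, u, v with
    # d(x,y) = d(y,u) = d(u,v) != d(y,v) = d(x,v) = d(x,u).
    for quad in combinations(items, 4):
        for x, y, u, v in permutations(quad):
            if (d(x, y) == d(y, u) == d(u, v)
                    and d(y, v) == d(x, v) == d(x, u)
                    and d(x, y) != d(y, v)):
                return False
    return True

# "S" = speciation, "D" = duplication. The first map comes from a tree with
# a speciation root above two duplication nodes {a, b} and {c, d}.
tree_like = make_delta({("a", "b"): "D", ("c", "d"): "D", ("a", "c"): "S",
                        ("a", "d"): "S", ("b", "c"): "S", ("b", "d"): "S"})
# In the second map the "S" pairs form the path a - b - c - d.
p4_like = make_delta({("a", "b"): "S", ("b", "c"): "S", ("c", "d"): "S",
                      ("a", "c"): "D", ("b", "d"): "D", ("a", "d"): "D"})
genes = ["a", "b", "c", "d"]
```

Under these assumptions, `is_symbolic_ultrametric(genes, tree_like)` holds while `is_symbolic_ultrametric(genes, p4_like)` fails, because the speciation pairs of the second map form an induced path on four vertices.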
Capturing Polynomial Time using Modular Decomposition
The question of whether there is a logic that captures polynomial time is one
of the main open problems in descriptive complexity theory and database theory.
In 2010 Grohe showed that fixed point logic with counting captures polynomial
time on all classes of graphs with excluded minors. We now consider classes of
graphs with excluded induced subgraphs. For such graph classes, an effective
graph decomposition, called modular decomposition, was introduced by Gallai in
1976. The graphs that are non-decomposable with respect to modular
decomposition are called prime. We present a tool, the Modular Decomposition
Theorem, that reduces (definable) canonization of a graph class C to
(definable) canonization of the class of prime graphs of C that are colored
with binary relations on a linearly ordered set. By an application of the
Modular Decomposition Theorem, we show that fixed point logic with counting
captures polynomial time on the class of permutation graphs. Within the proof
of the Modular Decomposition Theorem, we show that the modular decomposition of
a graph is definable in symmetric transitive closure logic with counting. We
obtain that the modular decomposition tree is computable in logarithmic space.
It follows that cograph recognition and cograph canonization are computable in
logarithmic space.
Comment: 38 pages, 10 Figures. A preliminary version of this article appeared
in the Proceedings of the 32nd Annual ACM/IEEE Symposium on Logic in Computer
Science (LICS '17).
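The notion of a prime graph used in this abstract can be illustrated with a small brute-force sketch. It assumes the standard definition of a module (a vertex set M such that every vertex outside M is adjacent to all of M or to none of it) and treats a graph as prime when its only modules are the trivial ones. The adjacency-dictionary encoding and function names are assumptions for this example; the search is exponential and unrelated to the logarithmic-space result stated above.

```python
from itertools import combinations

def is_module(adj, subset):
    """True if every outside vertex sees all of subset or none of it."""
    for v in adj.keys() - subset:
        hits = len(adj[v] & subset)
        if hits not in (0, len(subset)):
            return False
    return True

def is_prime(adj):
    """Brute force: prime means no module of size 2 .. n-1 exists."""
    vertices = sorted(adj)
    for size in range(2, len(vertices)):
        for subset in combinations(vertices, size):
            if is_module(adj, set(subset)):
                return False
    return True

# Path on four vertices a - b - c - d, and path on three vertices a - b - c.
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

The path on four vertices has no nontrivial module, so `is_prime(p4)` holds, whereas in the path on three vertices the set {a, c} is a module, so `is_prime(path3)` fails.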
Gene Family Histories: Theory and Algorithms
Detailed gene family histories and reconciliations with species trees are a prerequisite for studying associations between genetic and phenotypic innovations. Even though the true evolutionary scenarios are usually unknown, they impose certain constraints on the mathematical structure of data obtained from simple yes/no questions in pairwise comparisons of gene sequences. Recent advances in this field have led to the development of methods for reconstructing (aspects of) the scenarios on the basis of such relation data, which can most naturally be represented by graphs on the set of considered genes.
We provide here novel characterizations of best match graphs (BMGs) which capture the notion of (reciprocal) best hits based on sequence similarities. BMGs provide the basis for the detection of orthologous genes (genes that diverged after a speciation event). There are two main sources of error in pipelines for orthology inference based on BMGs. Firstly, measurement errors in the estimation of best matches from sequence similarity in general lead to violations of the characteristic properties of BMGs. The second issue concerns the reconstruction of the orthology relation from a BMG. We show how to correct estimated BMGs to mathematically valid ones and how much information about orthologs is contained in BMGs.
We then discuss implicit methods for horizontal gene transfer (HGT) inference that focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of an undirected graph, the later-divergence-time (LDT) graph. We explore the mathematical structure of LDT graphs and show how much information about all HGT events is contained in such LDT graphs.