41 research outputs found

    Resolving Prime Modules: The Structure of Pseudo-cographs and Galled-Tree Explainable Graphs

    The modular decomposition of a graph $G$ is a natural construction to capture key features of $G$ in terms of a labeled tree $(T,t)$ whose vertices are labeled as "series" ($1$), "parallel" ($0$) or "prime". However, full information about $G$ is provided by its modular decomposition tree $(T,t)$ only if $G$ is a cograph, i.e., $G$ does not contain prime modules. In this case, $(T,t)$ explains $G$, i.e., $\{x,y\}\in E(G)$ if and only if the lowest common ancestor $\mathrm{lca}_T(x,y)$ of $x$ and $y$ has label "$1$". Pseudo-cographs or, more generally, GaTEx graphs $G$ are graphs that can be explained by labeled galled-trees, i.e., labeled networks $(N,t)$ that are obtained from the modular decomposition tree $(T,t)$ of $G$ by replacing the prime vertices in $T$ by simple labeled cycles. GaTEx graphs can be recognized, and labeled galled-trees that explain these graphs can be constructed, in linear time. In this contribution, we provide a novel characterization of GaTEx graphs in terms of a set $\mathfrak{F}_{\mathrm{GT}}$ of 25 forbidden induced subgraphs. This characterization, in turn, allows us to show that GaTEx graphs are closely related to many other well-known graph classes such as $P_4$-sparse and $P_4$-reducible graphs, weakly-chordal graphs, perfect graphs with perfect order, comparability and permutation graphs, murky graphs as well as interval graphs, Meyniel graphs or very strongly-perfect and brittle graphs. Moreover, we show that every GaTEx graph has twin-width at most 1. Comment: 18 pages, 3 figures
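The "explains" condition in this abstract can be illustrated with a minimal sketch. The nested-tuple tree encoding and the names `leaves` and `explained_edges` are assumptions made for this illustration and do not come from the paper; the sketch covers only the cograph case (labels 1 and 0, no prime vertices).

```python
from itertools import combinations

# Hypothetical encoding (not from the paper): a leaf is a string naming a
# vertex of G; an inner node is a tuple (label, children), where label is
# 1 ("series") or 0 ("parallel").

def leaves(node):
    """Return the list of leaves below a node."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [x for child in children for x in leaves(child)]

def explained_edges(node):
    """Return the edge set of the graph explained by the labeled tree."""
    if isinstance(node, str):
        return set()
    label, children = node
    edges = set()
    for child in children:
        edges |= explained_edges(child)
    if label == 1:
        # Leaves taken from two different children have this node as their
        # lowest common ancestor, so label 1 makes them adjacent.
        for a, b in combinations(children, 2):
            for x in leaves(a):
                for y in leaves(b):
                    edges.add(frozenset((x, y)))
    return edges

# A series root joining vertex "a" with the parallel pair "b", "c".
tree = (1, ["a", (0, ["b", "c"])])
edges = explained_edges(tree)
```

In this small tree, "a" is adjacent to both "b" and "c" because their lowest common ancestor is the series root, while "b" and "c" are non-adjacent because their lowest common ancestor has label 0.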

    Reconstructing Gene Trees From Fitch's Xenology Relation

    Two genes are xenologs in the sense of Fitch if they are separated by at least one horizontal gene transfer event. Horizontal gene transfer is asymmetric in the sense that the transferred copy is distinguished from the one that remains within the ancestral lineage. Hence xenology is more precisely thought of as a non-symmetric relation: $y$ is xenologous to $x$ if $y$ has been horizontally transferred at least once since it diverged from the least common ancestor of $x$ and $y$. We show that xenology relations are characterized by a small set of forbidden induced subgraphs on three vertices. Furthermore, each xenology relation can be derived from a unique least-resolved edge-labeled phylogenetic tree. We provide a linear-time algorithm for the recognition of xenology relations and for the construction of their least-resolved edge-labeled phylogenetic trees. Finally, the fact that being a xenology relation is a heritable graph property has far-reaching consequences for approximation problems associated with xenology relations.
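The non-symmetric definition above can be read off directly from an edge-labeled tree. The following is a rough sketch under stated assumptions: the parent-pointer encoding, the toy tree, and the names `ancestors` and `is_xenologous` are hypothetical, and the quadratic pairwise loop is a naive illustration, not the linear-time algorithm mentioned in the abstract.

```python
# Hypothetical encoding (not from the paper): parent[v] = (parent of v,
# True if the edge entering v is a horizontal transfer edge).
parent = {
    "u": ("root", False),
    "a": ("u", False),
    "b": ("u", True),      # b was horizontally transferred
    "c": ("root", False),
}
leaf_set = ["a", "b", "c"]

def ancestors(v, parent):
    """Return v followed by all of its ancestors up to the root."""
    path = [v]
    while path[-1] in parent:
        path.append(parent[path[-1]][0])
    return path

def is_xenologous(y, x, parent):
    """True if the path from lca(x, y) down to y contains a transfer edge."""
    anc_x = set(ancestors(x, parent))
    v = y
    while v not in anc_x:          # climb from y until lca(x, y) is reached
        p, transfer = parent[v]
        if transfer:
            return True
        v = p
    return False

# Ordered pairs (x, y) such that y is xenologous to x.
xenology = {
    (x, y)
    for x in leaf_set
    for y in leaf_set
    if x != y and is_xenologous(y, x, parent)
}
```

Here "b" is xenologous to both "a" and "c", but neither "a" nor "c" is xenologous to "b", which shows why the relation is not symmetric.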

    Induced minors and well-quasi-ordering

    A graph $H$ is an induced minor of a graph $G$ if it can be obtained from an induced subgraph of $G$ by contracting edges. Otherwise, $G$ is said to be $H$-induced minor-free. Robin Thomas showed that $K_4$-induced minor-free graphs are well-quasi-ordered by induced minors [Graphs without $K_4$ and well-quasi-ordering, Journal of Combinatorial Theory, Series B, 38(3):240 -- 247, 1985]. We provide a dichotomy theorem for $H$-induced minor-free graphs and show that the class of $H$-induced minor-free graphs is well-quasi-ordered by the induced minor relation if and only if $H$ is an induced minor of the gem (the path on 4 vertices plus a dominating vertex) or of the graph obtained by adding a vertex of degree 2 to the complete graph on 4 vertices. To this end we proved two decomposition theorems which are of independent interest. Similar dichotomy results were previously given for subgraphs by Guoli Ding in [Subgraphs and well-quasi-ordering, Journal of Graph Theory, 16(5):489--502, 1992] and for induced subgraphs by Peter Damaschke in [Induced subgraphs and well-quasi-ordering, Journal of Graph Theory, 14(4):427--435, 1990].

    Hadwiger number of graphs with small chordality

    The Hadwiger number of a graph G is the largest integer h such that G has the complete graph K_h as a minor. We show that the problem of determining the Hadwiger number of a graph is NP-hard on co-bipartite graphs, but can be solved in polynomial time on cographs and on bipartite permutation graphs. We also consider a natural generalization of this problem that asks for the largest integer h such that G has a minor with h vertices and diameter at most s. We show that this problem can be solved in polynomial time on AT-free graphs when s>=2, but is NP-hard on chordal graphs for every fixed s>=2.

    The Orthology Road: Theory and Methods in Orthology Analysis

    The evolution of biological species depends on changes in genes. Among these changes are the gradual accumulation of DNA mutations, insertions and deletions, duplication of genes, movements of genes within and between chromosomes, gene losses and gene transfer. As two populations of the same species evolve independently, they will eventually become reproductively isolated and become two distinct species. The evolutionary history of a set of related species through the repeated occurrence of this speciation process can be represented as a tree-like structure, called a phylogenetic tree or a species tree. Since duplicated genes in a single species also independently accumulate point mutations, insertions and deletions, they drift apart in composition in the same way as genes in two related species. The divergence of all the genes descended from a single gene in an ancestral species can also be represented as a tree, a gene tree that takes into account both speciation and duplication events. In order to reconstruct the evolutionary history from the study of extant species, we use sets of similar genes, with relatively high degree of DNA similarity and usually with some functional resemblance, that appear to have been derived from a common ancestor. The degree of similarity among different instances of the “same gene” in different species can be used to explore their evolutionary history via the reconstruction of gene family histories, namely gene trees. Orthology refers specifically to the relationship between two genes that arose by a speciation event, recent or remote, rather than duplication. Comparing orthologous genes is essential to the correct reconstruction of species trees, so that detecting and identifying orthologous genes is an important problem, and a longstanding challenge, in comparative and evolutionary genomics as well as phylogenetics. A variety of orthology detection methods have been devised in recent years. 
Although many of these methods are dependent on generating gene and/or species trees, it has been shown that orthology can be estimated at acceptable levels of accuracy without having to infer gene trees and/or reconciling gene trees with species trees. Therefore, there is good reason to look at the connection of trees and orthology from a different angle: How much information about the gene tree, the species tree, and their reconciliation is already contained in the orthology relation among genes? Intriguingly, a solution to the first part of this question has already been given by Boecker and Dress [Boecker and Dress, 1998] in a different context. In particular, they completely characterized certain maps which they called symbolic ultrametrics. Semple and Steel [Semple and Steel, 2003] then presented an algorithm that can be used to reconstruct a phylogenetic tree from any given symbolic ultrametric. In this thesis we investigate a new characterization of orthology relations, based on symbolic ultrametrics, for recovering the gene tree. According to Fitch’s definition [Fitch, 2000], two genes are (co-)orthologous if their last common ancestor in the gene tree represents a speciation event. On the other hand, when their last common ancestor is a duplication event, the genes are paralogs. The orthology relation on a set of genes is therefore determined by the gene tree and an “event labeling” that identifies each interior vertex of that tree as either a duplication or a speciation event. In the context of analyzing orthology data, the problem of reconciling event-labeled gene trees with a species tree appears as a variant of the reconciliation problem where gene trees have no labels in their internal vertices. When reconciling a gene tree with a species tree, it can be assumed that the species tree is correct or, in the case of an unknown species tree, it can be inferred. Therefore it is crucial to know for a given gene tree whether there even exists a species tree. 
In this thesis we characterize event-labelled gene trees for which a species tree exists and species trees to which event-labelled gene trees can be mapped. Reconciliation methods are not always the best options for detecting orthology. A fundamental problem is that, aside from multicellular eukaryotes, evolution does not seem to have conformed to the descent-with-modification model that gives rise to tree-like phylogenies. Examples include many cases of prokaryotes and viruses whose evolution involved horizontal gene transfer. To treat the problem of distinguishing orthology and paralogy within a more general framework, graph-based methods have been proposed to detect and differentiate among evolutionary relationships of genes in those organisms. In this work we introduce a measure of orthology that can be used to test graph-based methods and reconciliation methods that detect orthology. Using these results, we devise a new algorithm, BOTTOM-UP, that determines whether a map from the set of vertices of a tree to a set of events is a symbolic ultrametric. Additionally, we present a simulation environment designed to generate large gene families with complex duplication histories, on which reconstruction algorithms can be tested and software tools can be benchmarked.
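Fitch's definition quoted in this abstract (orthologs if the last common ancestor is a speciation, paralogs if it is a duplication) can be sketched as follows. The toy gene tree, the parent-pointer encoding, and the names `path_to_root` and `last_common_ancestor` are assumptions for illustration only; this is not the thesis's BOTTOM-UP algorithm.

```python
from itertools import combinations

# Hypothetical gene tree (not from the thesis): parent pointers plus an
# event label for every interior vertex.
parent = {"g1": "s", "g2": "d", "g3": "d", "d": "s"}
event = {"s": "speciation", "d": "duplication"}
genes = ["g1", "g2", "g3"]

def path_to_root(v):
    """Return v followed by all of its ancestors up to the root."""
    path = [v]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def last_common_ancestor(x, y):
    """First vertex on y's path to the root that is also an ancestor of x."""
    anc_x = set(path_to_root(x))
    return next(v for v in path_to_root(y) if v in anc_x)

# Classify every unordered pair of genes by the event at its last common ancestor.
orthologs = set()
paralogs = set()
for x, y in combinations(genes, 2):
    pair = frozenset((x, y))
    if event[last_common_ancestor(x, y)] == "speciation":
        orthologs.add(pair)
    else:
        paralogs.add(pair)
```

In this toy tree, g2 and g3 descend from a duplication and are paralogs, while g1 is orthologous to both because the root is a speciation event.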

    Capturing Polynomial Time using Modular Decomposition

    The question of whether there is a logic that captures polynomial time is one of the main open problems in descriptive complexity theory and database theory. In 2010 Grohe showed that fixed point logic with counting captures polynomial time on all classes of graphs with excluded minors. We now consider classes of graphs with excluded induced subgraphs. For such graph classes, an effective graph decomposition, called modular decomposition, was introduced by Gallai in 1976. The graphs that are non-decomposable with respect to modular decomposition are called prime. We present a tool, the Modular Decomposition Theorem, that reduces (definable) canonization of a graph class C to (definable) canonization of the class of prime graphs of C that are colored with binary relations on a linearly ordered set. By an application of the Modular Decomposition Theorem, we show that fixed point logic with counting captures polynomial time on the class of permutation graphs. Within the proof of the Modular Decomposition Theorem, we show that the modular decomposition of a graph is definable in symmetric transitive closure logic with counting. We obtain that the modular decomposition tree is computable in logarithmic space. It follows that cograph recognition and cograph canonization is computable in logarithmic space. Comment: 38 pages, 10 Figures. A preliminary version of this article appeared in the Proceedings of the 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS '17)

    Gene Family Histories: Theory and Algorithms

    Detailed gene family histories and reconciliations with species trees are a prerequisite for studying associations between genetic and phenotypic innovations. Even though the true evolutionary scenarios are usually unknown, they impose certain constraints on the mathematical structure of data obtained from simple yes/no questions in pairwise comparisons of gene sequences. Recent advances in this field have led to the development of methods for reconstructing (aspects of) the scenarios on the basis of such relation data, which can most naturally be represented by graphs on the set of considered genes. We provide here novel characterizations of best match graphs (BMGs) which capture the notion of (reciprocal) best hits based on sequence similarities. BMGs provide the basis for the detection of orthologous genes (genes that diverged after a speciation event). There are two main sources of error in pipelines for orthology inference based on BMGs. Firstly, measurement errors in the estimation of best matches from sequence similarity in general lead to violations of the characteristic properties of BMGs. The second issue concerns the reconstruction of the orthology relation from a BMG. We show how to correct estimated BMGs to mathematically valid ones and how much information about orthologs is contained in BMGs. We then discuss implicit methods for horizontal gene transfer (HGT) inference that focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of an undirected graph, the later-divergence-time (LDT) graph. We explore the mathematical structure of LDT graphs and show how much information about all HGT events is contained in such LDT graphs.