10 research outputs found

    Pairwise Compatibility Graphs (Invited Talk)

    Pairwise Compatibility Graphs (PCGs) are graphs introduced in relation to the biological problem of reconstructing phylogenetic trees. Without claiming to be exhaustive, in this note we take a quick look at what is known in the literature about these graphs. The evolutionary history of a set of organisms is usually represented by a tree-like structure called a phylogenetic tree, where the leaves are the known species and the internal nodes are the possible ancestors that might have led, through evolution, to this set of species. Edges are evolutionary relationships between species, while the edge weights represent evolutionary distances among species (evolutionary times). The phylogenetic tree reconstruction problem consists in finding a fully labeled phylogenetic tree that 'best' explains the evolution of the given species, where 'best' means that it optimizes a specific target function. The tree reconstruction problem has been proved NP-hard under many optimality criteria, so the performance of heuristics for this problem is usually evaluated experimentally by comparing the output trees with the partial trees that biologists unanimously recognize as certain. But real data consist of a huge number of species, and it is infeasible to compare trees with so many leaves, so it is common to use sampling techniques. The idea is to find efficient ways to sample subsets of species from a large set in order to test the heuristics on the smaller sub-trees induced by the sample. The constraints on the sample attempt to ensure that the behavior of the heuristics is not biased by being applied to the sample instead of to the whole tree. Since very close or very distant taxa can create problems for phylogenetic reconstruction heuristics [9], the following definition of Pairwise Compatibility Graphs [12] appears natural

    All graphs with at most seven vertices are Pairwise Compatibility Graphs

    A graph $G$ is called a pairwise compatibility graph (PCG) if there exists an edge-weighted tree $T$ and two non-negative real numbers $d_{min}$ and $d_{max}$ such that each leaf $l_u$ of $T$ corresponds to a vertex $u \in V$ and there is an edge $(u,v) \in E$ if and only if $d_{min} \leq d_{T,w}(l_u, l_v) \leq d_{max}$, where $d_{T,w}(l_u, l_v)$ is the sum of the weights of the edges on the unique path from $l_u$ to $l_v$ in $T$. In this note, we show that all the graphs with at most seven vertices are PCGs. In particular, all these graphs except for the wheel on 7 vertices $W_7$ are PCGs of a tree with a particular structure: a centipede. Comment: 8 pages, 2 figures
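    The definition above can be made concrete with a minimal Python sketch. It is not taken from the paper: the function names (`build_tree`, `leaf_distances`, `pcg_edges`) and the small example tree are assumptions chosen for illustration. Given an edge-weighted tree and the two thresholds, the sketch computes all leaf-to-leaf path weights and keeps the pairs whose distance lies in the interval $[d_{min}, d_{max}]$.

```python
from itertools import combinations


def build_tree(weighted_edges):
    """Build an adjacency map {node: [(neighbor, weight), ...]} from (u, v, w) triples."""
    tree = {}
    for u, v, w in weighted_edges:
        tree.setdefault(u, []).append((v, w))
        tree.setdefault(v, []).append((u, w))
    return tree


def leaf_distances(tree, leaves):
    """Return {(u, v): weight of the unique tree path from leaf u to leaf v}."""
    leaf_set = set(leaves)
    dist = {}
    for src in leaves:
        # Depth-first traversal from src; paths in a tree are unique,
        # so the accumulated weight on first arrival is the path weight.
        stack = [(src, None, 0)]
        while stack:
            node, parent, d = stack.pop()
            if node in leaf_set and node != src:
                dist[(src, node)] = d
            for nbr, w in tree[node]:
                if nbr != parent:
                    stack.append((nbr, node, d + w))
    return dist


def pcg_edges(tree, leaves, d_min, d_max):
    """Edges {u, v} of the graph with d_min <= d_T(u, v) <= d_max."""
    dist = leaf_distances(tree, leaves)
    return {
        frozenset((u, v))
        for u, v in combinations(leaves, 2)
        if d_min <= dist[(u, v)] <= d_max
    }


# Hypothetical example: leaves a, b hang off internal node x,
# leaves c, d hang off internal node y, and x-y has weight 2.
example_tree = build_tree([
    ("a", "x", 1), ("b", "x", 1),
    ("c", "y", 1), ("d", "y", 1),
    ("x", "y", 2),
])
example_leaves = ["a", "b", "c", "d"]
print(sorted(sorted(e) for e in pcg_edges(example_tree, example_leaves, 3, 4)))
```

    In this example the leaf pairs on the same side are at distance 2 and the cross pairs at distance 4, so the interval [3, 4] selects exactly the four cross pairs, which form a 4-cycle.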

    On relaxing the constraints in pairwise compatibility graphs

    A graph $G$ is called a pairwise compatibility graph (PCG) if there exists an edge-weighted tree $T$ and two non-negative real numbers $d_{min}$ and $d_{max}$ such that each leaf $l_u$ of $T$ corresponds to a vertex $u \in V$ and there is an edge $(u,v) \in E$ if and only if $d_{min} \leq d_T(l_u, l_v) \leq d_{max}$, where $d_T(l_u, l_v)$ is the sum of the weights of the edges on the unique path from $l_u$ to $l_v$ in $T$. In this paper we analyze the class PCG in relation to two particular subclasses resulting from the cases where $d_{min} = 0$ (LPG) and $d_{max} = +\infty$ (mLPG). In particular, we show that the union of LPG and mLPG does not coincide with the whole class PCG, that their intersection is not empty, and that neither of the classes LPG and mLPG is contained in the other. Finally, as the graphs we deal with belong to the more general class of split matrogenic graphs, we focus on this class of graphs, for which we try to establish membership in the PCG class. Comment: 12 pages, 7 figures
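    The two relaxations can be read as one-sided versions of the same interval test. The sketch below is illustrative only: the function names and the distance table are assumptions, and the leaf distances are supplied directly rather than derived from a tree. LPG drops the lower bound ($d_{min} = 0$) and mLPG drops the upper bound ($d_{max} = +\infty$).

```python
import math


def interval_edges(leaf_dist, d_min, d_max):
    """Pairs whose leaf-to-leaf distance lies in [d_min, d_max]."""
    return {pair for pair, d in leaf_dist.items() if d_min <= d <= d_max}


def lpg_edges(leaf_dist, d_max):
    """LPG case: d_min = 0, so only the upper threshold matters."""
    return interval_edges(leaf_dist, 0, d_max)


def mlpg_edges(leaf_dist, d_min):
    """mLPG case: d_max = +infinity, so only the lower threshold matters."""
    return interval_edges(leaf_dist, d_min, math.inf)


# Hypothetical leaf distances, assumed to come from some edge-weighted tree.
leaf_dist = {
    frozenset(("a", "b")): 2, frozenset(("c", "d")): 2,
    frozenset(("a", "c")): 4, frozenset(("a", "d")): 4,
    frozenset(("b", "c")): 4, frozenset(("b", "d")): 4,
}
print(len(lpg_edges(leaf_dist, 2)), len(mlpg_edges(leaf_dist, 3)))
```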

    Graphs that are not pairwise compatible: A new proof technique (extended abstract)

    A graph G = (V, E) is a pairwise compatibility graph (PCG) if there exist an edge-weighted tree T and two non-negative real numbers d_min and d_max, d_min ≤ d_max, such that each node u ∈ V is uniquely associated to a leaf of T and there is an edge (u, v) ∈ E if and only if d_min ≤ d_T(u, v) ≤ d_max, where d_T(u, v) is the sum of the weights of the edges on the unique path P_T(u, v) from u to v in T. Understanding which graph classes lie inside and which ones outside the PCG class is an important issue. Despite numerous efforts, a complete characterization of the PCG class is not yet known. In this paper we propose a new proof technique that allows us to show that some interesting classes of graphs have empty intersection with PCG. We demonstrate our technique by showing many graph classes that do not lie in PCG. As a side effect, we exhibit a planar graph with 8 nodes that is not a PCG (i.e. C_8^2), thus improving the previously known result concerning the smallest planar graph known not to be PCG.

    Ancestral sequence alignment under optimal conditions

    BACKGROUND: Multiple genome alignment is an important problem in bioinformatics. An important subproblem used by many multiple alignment approaches is that of aligning two multiple alignments. Many popular alignment algorithms for DNA use the sum-of-pairs heuristic, where the score of a multiple alignment is the sum of its induced pairwise alignment scores. However, the biological meaning of the sum-of-pairs heuristic is not obvious. Additionally, many algorithms based on the sum-of-pairs heuristic are complicated and slow compared to pairwise alignment algorithms. An alternative approach to aligning alignments is to first infer ancestral sequences for each alignment, and then align the two ancestral sequences. In addition to being fast, this method has a clear biological basis that takes into account the evolution implied by an underlying phylogenetic tree. In this study we explore the accuracy of aligning alignments by ancestral sequence alignment. We examine the use of both maximum likelihood and parsimony to infer ancestral sequences. Additionally, we investigate the effect on accuracy of allowing ambiguity in our ancestral sequences. RESULTS: We use synthetic sequence data that we generate by simulating evolution on a phylogenetic tree. We use two different types of phylogenetic trees: trees with a period of rapid growth followed by a period of slow growth, and trees with a period of slow growth followed by a period of rapid growth. We examine the alignment accuracy of four ancestral sequence reconstruction and alignment methods: parsimony, maximum likelihood, ambiguous parsimony, and ambiguous maximum likelihood. Additionally, we compare against the alignment accuracy of two sum-of-pairs algorithms: ClustalW and the heuristic of Ma, Zhang, and Wang. CONCLUSION: We find that allowing ambiguity in ancestral sequences does not lead to better multiple alignments. Regardless of whether we use parsimony or maximum likelihood, the success of aligning ancestral sequences containing ambiguity is very sensitive to the choice of gap open cost. Surprisingly, we find that using maximum likelihood to infer ancestral sequences results in less accurate alignments than using parsimony. Finally, we find that the sum-of-pairs methods produce better alignments than all of the ancestral alignment methods.
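    The sum-of-pairs score mentioned in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scoring used in the study: the match, mismatch, and gap values are hypothetical, gaps are charged a flat per-column cost rather than a gap open cost, and columns where both rows have a gap are skipped.

```python
from itertools import combinations


def pair_score(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score the pairwise alignment induced by two rows of a multiple alignment."""
    score = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both rows does not affect this pair
        if x == "-" or y == "-":
            score += gap  # assumed flat per-column gap cost
        elif x == y:
            score += match
        else:
            score += mismatch
    return score


def sum_of_pairs(alignment, **scoring):
    """Sum the induced pairwise scores over all pairs of rows."""
    return sum(pair_score(a, b, **scoring) for a, b in combinations(alignment, 2))


# Hypothetical three-row alignment of equal-length rows.
print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

    The number of row pairs grows quadratically with the number of sequences, which is one reason sum-of-pairs scoring is more expensive than scoring a single pairwise alignment.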

    Efficient Generation of Uniform Samples from Phylogenetic Trees

    In this paper, we introduce new algorithms for selecting taxon (leaf) samples from large phylogenetic trees, uniformly at random, under certain biologically relevant constraints on the taxa. All the algorithms run in polynomial time and have been implemented. The algorithms have direct applications..