37 research outputs found

    Simultaneous Orthogonal Planarity

    We introduce and study the \textit{OrthoSEFE}-$k$ problem: Given $k$ planar graphs each with maximum degree 4 and the same vertex set, do they admit an OrthoSEFE, that is, is there an assignment of the vertices to grid points and of the edges to paths on the grid such that the same edges in distinct graphs are assigned the same path and such that the assignment induces a planar orthogonal drawing of each of the $k$ graphs? We show that the problem is NP-complete for $k \geq 3$ even if the shared graph is a Hamiltonian cycle and has sunflower intersection, and for $k \geq 2$ even if the shared graph consists of a cycle and of isolated vertices. In contrast, the problem is polynomial-time solvable for $k=2$ when the union graph has maximum degree five and the shared graph is biconnected. Further, when the shared graph is biconnected and has sunflower intersection, we show that every positive instance has an OrthoSEFE with at most three bends per edge. Comment: Appears in the Proceedings of the 24th International Symposium on Graph Drawing and Network Visualization (GD 2016)

    Gene Order Phylogeny of the Genus Prochlorococcus

    Using gene order as a phylogenetic character has the potential to resolve previously unresolved species relationships. This character was used to resolve the evolutionary history within the genus Prochlorococcus, a group of marine cyanobacteria. Orthologous gene sets and their genomic positions were identified from 12 species of Prochlorococcus and 1 outgroup species of Synechococcus. From this data, inversion and breakpoint distance-based phylogenetic trees were computed by GRAPPA and FastME. Statistical support of the resulting topology was obtained by application of a 50% jackknife resampling technique. The result was consistent and congruent with nucleotide sequence-based and gene-content based trees. Also, a previously unresolved clade was resolved, that of MIT9211 and SS120. This is the first study to use gene order data to resolve a bacterial phylogeny at the genus level. It suggests that the technique is useful in resolving the Tree of Life.
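
The breakpoint distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each genome is a single linear chromosome written as a signed permutation of the same genes 1..n; the function name `breakpoint_distance` and the framing genes 0 and n+1 are choices made for this illustration and are not taken from GRAPPA or FastME.

```python
def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are not preserved in g2.

    Genomes are signed permutations of 1..n on one linear chromosome.
    An adjacency (a, b) is preserved if g2 contains a followed by b,
    or -b followed by -a (the same pair read on the opposite strand).
    """
    def framed(g):
        # Hypothetical framing genes 0 and n+1 mark the chromosome ends.
        return [0] + list(g) + [len(g) + 1]

    adjacencies_g2 = set()
    f2 = framed(g2)
    for a, b in zip(f2, f2[1:]):
        adjacencies_g2.add((a, b))
        adjacencies_g2.add((-b, -a))

    f1 = framed(g1)
    return sum((a, b) not in adjacencies_g2 for a, b in zip(f1, f1[1:]))


identity = [1, 2, 3, 4]
inverted = [1, -3, -2, 4]  # genes 2..3 inverted as a block
print(breakpoint_distance(identity, inverted))
```

A single inversion creates at most two breakpoints, which is why distance matrices built this way can feed a distance-based tree method.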

    Rec-DCM-Eigen: Reconstructing a Less Parsimonious but More Accurate Tree in Shorter Time

    Maximum parsimony (MP) methods aim to reconstruct the phylogeny of extant species by finding the most parsimonious evolutionary scenario using the species' genome data. MP methods are considered to be accurate, but they are also computationally expensive, especially for a large number of species. Several disk-covering methods (DCMs), which decompose the input species into multiple overlapping subgroups (or disks), have been proposed to solve the problem in a divide-and-conquer way.

    Running Experiments with Confidence and Sanity

    Analyzing data from large experimental suites is a daily task for anyone doing experimental algorithmics. In this paper we report on several approaches we tried for this seemingly mundane task in a similarity search setting, reflecting on the challenges it poses. We conclude by proposing a workflow, which can be implemented using several tools, that allows one to analyze experimental data with confidence. The extended version of this paper and the support code are provided at https://github.com/Cecca/running-experiments

    Locating a Tree in a Phylogenetic Network in Quadratic Time

    A fundamental problem in the study of phylogenetic networks is to determine whether or not a given phylogenetic network contains a given phylogenetic tree. We develop a quadratic-time algorithm for this problem for binary nearly-stable phylogenetic networks. We also show that the number of reticulations in a reticulation-visible or nearly-stable phylogenetic network is bounded from above by a function linear in the number of taxa.

    The approximability of the String Barcoding problem

    The String Barcoding (SBC) problem, introduced by Rash and Gusfield (RECOMB, 2002), consists in finding a minimum set of substrings that can be used to distinguish between all members of a set of given strings. In a computational biology context, the given strings represent a set of known viruses, while the substrings can be used as probes for a hybridization experiment via microarray. Eventually, one aims at the classification of new strings (unknown viruses) through the result of the hybridization experiment. In this paper we show that SBC is as hard to approximate as Set Cover. Furthermore, we show that the constrained version of SBC (with probes of bounded length) is also hard to approximate. These negative results are tight.
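
To make the connection between String Barcoding and Set Cover concrete, here is a small sketch of the standard greedy set-cover heuristic applied to SBC: every pair of input strings is an element to cover, and a substring covers a pair when it occurs in exactly one of the two strings. This is an illustration only and is not the construction used in the paper; the function name `greedy_barcode` and the optional `max_len` bound are assumptions introduced here.

```python
from itertools import combinations


def greedy_barcode(strings, max_len=None):
    """Greedily pick substrings (probes) that distinguish all input strings."""
    pairs = set(combinations(range(len(strings)), 2))

    # Candidate probes: all substrings of the inputs, optionally length-bounded.
    candidates = set()
    for s in strings:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                if max_len is None or j - i <= max_len:
                    candidates.add(s[i:j])

    # A probe covers the pair (i, j) if it occurs in exactly one of the two.
    covers = {}
    for probe in candidates:
        present = [probe in s for s in strings]
        covers[probe] = {(i, j) for i, j in pairs if present[i] != present[j]}

    chosen, uncovered = [], set(pairs)
    while uncovered:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda p: len(covers[p] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            raise ValueError("some strings cannot be distinguished")
        chosen.append(best)
        uncovered -= gain
    return chosen


viruses = ["AC", "CG", "GA"]
probes = greedy_barcode(viruses)
barcodes = [tuple(p in s for p in probes) for s in viruses]
print(probes, barcodes)
```

Each input string ends up with a distinct presence/absence vector (its barcode). The greedy choice gives the usual logarithmic approximation guarantee of Set Cover, which the hardness result in the abstract says cannot be substantially improved.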

    Minimizing recombinations in consensus networks for phylogeographic studies

    Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: Given two networks $G_1$ and $G_2$, with both mutation and recombination events, defined on overlapping sets of extant units, the objective is to compute a consensus network $G_3$ with minimum number of additional recombinations. We describe a polynomial time algorithm with a guarantee that the number of computed new recombination events is within $\epsilon = sz(G_1, G_2)$ (function $sz$ is a well-behaved function of the sizes and topologies of $G_1$ and $G_2$) of the optimal number of recombinations. To date, this is the best known result for a network consensus problem. Results: Although the network consensus problem can be applied to a variety of domains, here we focus on structure of human populations. With our preliminary analysis on a segment of the human Chromosome X data we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations. Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divide based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model. Since the model in principle gives us more information embedded in the networks, in our future work, we plan to investigate more non-traditional questions via these structures computed by our methodology.

    A Note on Encodings of Phylogenetic Networks of Bounded Level

    Driven by the need for better models that allow one to shed light on the question of how life's diversity has evolved, phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? In this note, we present a complete answer to this question for the special case of a level-1 (phylogenetic) network by characterizing those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. Given that this type of network forms the first layer of the rich hierarchy of level-k networks, k a non-negative integer, it is natural to wonder whether our arguments could be extended to members of that hierarchy for higher values of k. By giving examples, we show that this is not the case.
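
As a small illustration of what an induced collection of clusters looks like, the sketch below lists the clusters of a single rooted tree, that is, the set of leaves below each node. It covers only the tree case, not networks, and the nested-tuple representation and the name `clusters` are hypothetical choices made for this example.

```python
def clusters(tree):
    """Return the clusters of a rooted tree given as nested tuples.

    A leaf is any non-tuple label; an internal node is a tuple of children.
    The cluster of a node is the set of leaf labels in its subtree.
    """
    found = set()

    def visit(node):
        if isinstance(node, tuple):
            leaves = frozenset().union(*(visit(child) for child in node))
        else:
            leaves = frozenset([node])
        found.add(leaves)
        return leaves

    visit(tree)
    return found


tree = (("a", "b"), "c")
for cluster in sorted(clusters(tree), key=lambda c: (len(c), sorted(c))):
    print(sorted(cluster))
```

A rooted tree is determined by its clusters; the encoding question in the abstract asks when the analogous statement holds for a network.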

    Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    BACKGROUND: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionarily conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. RESULTS: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; ii) flexibility to adjust the parameters and re-compute the results on-the-fly; iii) ability to work with user-provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at . CONCLUSION: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances. Cinteny can also be used to interactively browse syntenic blocks conserved in multiple genomes, to facilitate genome annotation and validation of assemblies for newly sequenced genomes, and to construct and assess phylogenomic trees.

    Refining transcriptional regulatory networks using network evolutionary models and gene histories

    Background: Computational inference of transcriptional regulatory networks remains a challenging problem, in part due to the lack of strong network models. In this paper we present evolutionary approaches to improve the inference of regulatory networks for a family of organisms by developing an evolutionary model for these networks and taking advantage of established phylogenetic relationships among these organisms. In previous work, we used a simple evolutionary model and provided extensive simulation results showing that phylogenetic information, combined with such a model, could be used to gain significant improvements on the performance of current inference algorithms. Results: In this paper, we extend the evolutionary model so as to take into account gene duplications and losses, which are viewed as major drivers in the evolution of regulatory networks. We show how to adapt our evolutionary approach to this new model and provide detailed simulation results, which show significant improvement on the reference network inference algorithms. Different evolutionary histories for gene duplications and losses are studied, showing that our adapted approach is feasible under a broad range of conditions. We also provide results on biological data (cis-regulatory modules for 12 species of Drosophila), confirming our simulation results.