19 research outputs found

    From trees to networks and back

    The evolutionary history of a set of species is commonly represented by a phylogenetic tree. Often, however, the data contain conflicting signals, which can be better represented by a more general structure, namely a phylogenetic network. Such networks allow the display of several alternative evolutionary scenarios simultaneously but this can come at the price of complex visual representations. Using so-called circular split networks reduces this complexity, because this type of network can always be visualized in the plane without any crossing edges. These circular split networks form the core of this thesis. We construct them, use them as a search space for minimum evolution trees and explore their properties. More specifically, we present a new method, called SuperQ, to construct a circular split network summarising a collection of phylogenetic trees that have overlapping leaf sets. Then, we explore the set of phylogenetic trees associated with a fixed circular split network, in particular using it as a search space for optimal trees. This set represents just a tiny fraction of the space of all phylogenetic trees, but we still find trees within it that compare quite favourably with those obtained by a leading heuristic, which uses tree edit operations for searching the whole tree space. In the last part, we advance our understanding of the set of phylogenetic trees associated with a circular split network. Specifically, we investigate the size of the so-called circular tree neighbourhood for the three tree edit operations, tree bisection and reconnection (tbr), subtree prune and regraft (spr) and nearest neighbour interchange (nni).

    The minimum evolution problem is hard: A link between tree inference and graph clustering problems

    Motivation: Distance methods are well suited for constructing massive phylogenetic trees. However, the computational complexity for Rzhetsky and Nei’s minimum evolution (ME) approach, one of the earliest methods for constructing a phylogenetic tree from a distance matrix, remains open. Results: We show that Rzhetsky and Nei’s ME problem is NP-complete, and so probably computationally intractable. We do this by linking the ME problem to a graph clustering problem called the quasi-clique decomposition problem, which has recently also been shown to be NP-complete. We also discuss how this link could potentially open up some useful new connections between phylogenetics and graph clustering

    SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution

    Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here we present SPECTRE, a readily available, open-source library of data structures written in Java, that comes complete with new implementations of several pre-published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing (HPC) environments

    AlbaTraDIS: Comparative analysis of large datasets from parallel transposon mutagenesis experiments

    Bacteria need to survive in a wide range of environments. Currently, there is an incomplete understanding of the genetic basis for mechanisms underpinning survival in stressful conditions, such as the presence of anti-microbials. Transposon directed insertion-site sequencing (TraDIS) is a powerful tool to identify genes and networks which are involved in survival and fitness under a given condition by simultaneously assaying the fitness of millions of mutants, thereby relating genotype to phenotype and contributing to an understanding of bacterial cell biology. A recent refinement of this approach allows the roles of essential genes in conditional stress survival to be inferred by altering their expression. These advancements, combined with the rapidly falling costs of sequencing, now allow comparisons between multiple experiments to identify commonalities in stress responses to different conditions. This capacity, however, poses a new challenge for analysis of multiple data sets in conjunction. To address this analysis need, we have developed 'AlbaTraDIS', a software application for rapid large-scale comparative analysis of TraDIS experiments that predicts the impact of transposon insertions on nearby genes. AlbaTraDIS can identify genes which are up- or down-regulated, or inactivated, between multiple conditions, producing a filtered list of genes for further experimental validation as well as several accompanying data visualisations. We demonstrate the utility of our new approach by applying it to identify genes used by Escherichia coli to survive in a wide range of different concentrations of the biocide Triclosan. AlbaTraDIS identified all well characterised Triclosan resistance genes, including the primary target, fabI. A number of new loci were also implicated in Triclosan resistance and the predicted phenotypes for a selection of these were validated experimentally with results being consistent with predictions. AlbaTraDIS provides a simple and rapid method to analyse multiple transposon mutagenesis data sets allowing this technology to be used at large scale. To our knowledge this is the only tool currently available that can perform these tasks. AlbaTraDIS is written in Python 3 and is available under the open source licence GNU GPL 3 from https://github.com/quadram-institute-bioscience/albatradis
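
    To illustrate the general idea of comparing insertion data between conditions, here is a minimal sketch of a log2 fold change computed from per-gene read counts. It is not AlbaTraDIS's own code or statistical model: the function name `log2_fold_change`, the pseudocount and the example counts are all assumptions chosen for illustration.

```python
import math


def log2_fold_change(control_reads, condition_reads, pseudocount=1.0):
    """Hypothetical helper: log2 ratio of condition to control read counts.

    A pseudocount (an assumption of this sketch) is added to both counts so
    that genes with zero reads in one condition still give a finite value.
    """
    return math.log2((condition_reads + pseudocount) / (control_reads + pseudocount))


# Toy numbers, not real data: more reads under the condition gives a
# positive value, fewer reads gives a negative value.
enriched = log2_fold_change(control_reads=99, condition_reads=399)
depleted = log2_fold_change(control_reads=7, condition_reads=0)
```

    In this toy setting a positive value means insertions in the gene are more frequent under the test condition than in the control, and a negative value means they are depleted. A real analysis would also need replicates and a significance test.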

    Chemical biology-whole genome engineering datasets predict new antibacterial combinations

    Trimethoprim and sulfamethoxazole are used commonly together as cotrimoxazole for the treatment of urinary tract and other infections. The evolution of resistance to these and other antibacterials threatens therapeutic options for clinicians. We generated and analysed a chemical-biology-whole-genome data set to predict new targets for antibacterial combinations with trimethoprim and sulfamethoxazole. For this we used a large transposon mutant library in Escherichia coli BW25113 where an outward-transcribing inducible promoter was engineered into one end of the transposon. This approach allows regulated expression of adjacent genes in addition to gene inactivation at transposon insertion sites, a methodology that has been called TraDIS-Xpress. These chemical genomic data sets identified mechanisms for both reduced and increased susceptibility to trimethoprim and sulfamethoxazole. The data identified that over-expression of FolA reduced trimethoprim susceptibility, a known mechanism for reduced susceptibility. In addition, transposon insertions into the genes tdk, deoR, ybbC, hha, ldcA, wbbK and waaS increased susceptibility to trimethoprim and likewise for rsmH, fadR, ddlB, nlpI and prc with sulfamethoxazole, while insertions in ispD, uspC, minC, minD, yebK, truD and umpG increased susceptibility to both these antibiotics. Two of these genes’ products, Tdk and IspD, are inhibited by AZT and fosmidomycin respectively, antibiotics that are known to synergise with trimethoprim. Thus, the data identified two known targets and several new target candidates for the development of co-drugs that synergise with trimethoprim, sulfamethoxazole or cotrimoxazole. We demonstrate that the TraDIS-Xpress technology can be used to generate information-rich chemical-genomic data sets that can be used for antibacterial development

    TraDIS-Xpress: a high-resolution whole-genome assay identifies novel mechanisms of triclosan action and resistance

    Understanding the genetic basis for a phenotype is a central goal in biological research. Much has been learnt about bacterial genomes by creating large mutant libraries and looking for conditionally important genes. However, current genome-wide methods are largely unable to assay essential genes which are not amenable to disruption. To overcome this limitation, we developed a new version of “TraDIS” (transposon directed insertion-site sequencing) that we term “TraDIS-Xpress” that combines an inducible promoter into the transposon cassette. This allows controlled overexpression and repression of all genes owing to saturation of inserts adjacent to all open reading frames as well as conventional inactivation. We applied TraDIS-Xpress to identify responses to the biocide triclosan across a range of concentrations. Triclosan is endemic in modern life, but there is uncertainty about its mode of action with a concentration-dependent switch from bacteriostatic to bactericidal action unexplained. Our results show a concentration-dependent response to triclosan with different genes important in survival between static and cidal exposures. These genes include those previously reported to have a role in triclosan resistance as well as a new set of genes, including essential genes. Novel genes identified as being sensitive to triclosan exposure include those involved in barrier function, small molecule uptake, and integrity of transcription and translation. We anticipate the approach we show here, by allowing comparisons across multiple experimental conditions of TraDIS data, and including essential genes, will be a starting point for future work examining how different drug conditions impact bacterial survival mechanisms

    A genome-wide analysis of Escherichia coli responses to fosfomycin using TraDIS-Xpress reveals novel roles for phosphonate degradation and phosphate transport systems

    BACKGROUND: Fosfomycin is an antibiotic that has seen a revival in use due to its unique mechanism of action and efficacy against isolates resistant to many other antibiotics. In Escherichia coli, fosfomycin often selects for loss-of-function mutations within the genes encoding the sugar importers, GlpT and UhpT. There has, however, not been a genome-wide analysis of the basis for fosfomycin susceptibility reported to date. METHODS: Here we used TraDIS-Xpress, a high-density transposon mutagenesis approach, to assay the role of all genes in E. coli involved in fosfomycin susceptibility. RESULTS: The data confirmed known fosfomycin susceptibility mechanisms and identified new ones. The assay was able to identify domains within proteins of importance and revealed essential genes with roles in fosfomycin susceptibility based on expression changes. Novel mechanisms of fosfomycin susceptibility that were identified included those involved in glucose metabolism and phosphonate catabolism (phnC-M), and the phosphate importer, PstSACB. The impact of these genes on fosfomycin susceptibility was validated by measuring the susceptibility of defined inactivation mutants. CONCLUSIONS: This work reveals a wider set of genes that contribute to fosfomycin susceptibility, including core sugar metabolism genes and two systems involved in phosphate uptake and metabolism previously unrecognized as having a role in fosfomycin susceptibility

    Neighborhoods of trees in circular orderings

    In phylogenetics, a common strategy used to construct an evolutionary tree for a set of species X is to search in the space of all such trees for one that optimizes some given score function (such as the minimum evolution, parsimony or likelihood score). As this can be computationally intensive, it was recently proposed to restrict such searches to the set of all those trees that are compatible with some circular ordering of the set X. To inform the design of efficient algorithms to perform such searches, it is therefore of interest to find bounds for the number of trees compatible with a fixed ordering in the neighborhood of a tree that is determined by certain tree operations commonly used to search for trees: the nearest neighbor interchange (nni), the subtree prune and regraft (spr) and the tree bisection and reconnection (tbr) operations. We show that the size of such a neighborhood of a binary tree associated with the nni operation is independent of the tree’s topology, but that this is not the case for the spr and tbr operations. We also give tight upper and lower bounds for the size of the neighborhood of a binary tree for the spr and tbr operations and characterize those trees for which these bounds are attained
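
    The compatibility condition mentioned above can be made concrete. As a minimal sketch (not taken from the paper), the following assumes the standard definition that a split of X is circular with respect to an ordering when one side occupies a contiguous arc of that ordering; a tree is then compatible with the ordering when every one of its splits passes this test. The function name `is_circular_split` and the toy taxa are hypothetical.

```python
def is_circular_split(side, ordering):
    """Return True if `side` forms one contiguous arc of the circular `ordering`.

    `side` is one part of a split of the taxa; the other part is everything
    else in `ordering`. Assumes every element of `side` occurs in `ordering`.
    """
    side = set(side)
    n = len(ordering)
    if not side or len(side) >= n:
        return False  # not a proper split of the taxon set

    # Walk once around the cycle and count how often membership in `side`
    # changes between neighbouring taxa. One contiguous arc gives exactly
    # two changes (entering the arc and leaving it).
    changes = sum(
        1
        for i in range(n)
        if (ordering[i] in side) != (ordering[(i + 1) % n] in side)
    )
    return changes == 2


ordering = ["a", "b", "c", "d", "e"]
adjacent = is_circular_split({"a", "b"}, ordering)    # neighbours in the ordering
wrapping = is_circular_split({"e", "a"}, ordering)    # arc wraps around the end
scattered = is_circular_split({"a", "c"}, ordering)   # two separate arcs
```

    Counting membership changes around the cycle avoids special handling for arcs that wrap past the end of the list.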

    Long-read sequencing for identification of insertion sites in large transposon mutant libraries

    Transposon insertion site sequencing (TIS) is a powerful method for associating genotype to phenotype. However, all TIS methods described to date use short nucleotide sequence reads which cannot uniquely determine the locations of transposon insertions within repeating genomic sequences where the repeat units are longer than the sequence read length. To overcome this limitation, we have developed a TIS method using Oxford Nanopore sequencing technology that generates and uses long nucleotide sequence reads; we have called this method LoRTIS (Long-Read Transposon Insertion-site Sequencing). LoRTIS enabled the unique localisation of transposon insertion sites within long repetitive genetic elements of E. coli, such as the transposase genes of insertion sequences and copies of the ~ 5 kb ribosomal RNA operon. We demonstrate that LoRTIS is reproducible, gives comparable results to short-read TIS methods for essential genes, and better resolution around repeat elements. The Oxford Nanopore sequencing device that we used is cost-effective, small and easily portable. Thus, LoRTIS is an efficient means of uniquely identifying transposon insertion sites within long repetitive genetic elements and can be easily transported to, and used in, laboratories that lack access to expensive DNA sequencing facilities