9,839 research outputs found

    Evolutionary patchwork of an insecticidal toxin shared between plant-associated pseudomonads and the insect pathogens Photorhabdus and Xenorhabdus

    Get PDF
    Background: Root-colonizing fluorescent pseudomonads are known for their excellent abilities to protect plants against soil-borne fungal pathogens. Some of these bacteria produce an insecticidal toxin (Fit) suggesting that they may exploit insect hosts as a secondary niche. However, the ecological relevance of insect toxicity and the mechanisms driving the evolution of toxin production remain puzzling. Results: Screening a large collection of plant-associated pseudomonads for insecticidal activity and presence of the Fit toxin revealed that Fit is highly indicative of insecticidal activity and predicts that Pseudomonas protegens and P. chlororaphis are exclusive Fit producers. A comparative evolutionary analysis of Fit toxin-producing Pseudomonas including the insect-pathogenic bacteria Photorhabdus and Xenorhadus, which produce the Fit related Mcf toxin, showed that fit genes are part of a dynamic genomic region with substantial presence/absence polymorphism and local variation in GC base composition. The patchy distribution and phylogenetic incongruence of fit genes indicate that the Fit cluster evolved via horizontal transfer, followed by functional integration of vertically transmitted genes, generating a unique Pseudomonas-specific insect toxin cluster. Conclusions: Our findings suggest that multiple independent evolutionary events led to formation of at least three versions of the Mcf/Fit toxin highlighting the dynamic nature of insect toxin evolution

    Insights into bacterial genome composition through variable target GC content profiling

    Full text link
    This study presents a new computational method for guanine (G) and cytosine (C), or GC, content profiling based on the idea of multiple resolution sampling (MRS). The benefit of our new approach over existing techniques follows from its ability to locate significant regions without prior knowledge of the sequence, nor the features being sought. The use of MRS has provided novel insights into bacterial genome composition. Key findings include those that are related to the core composition of bacterial genomes, to the identification of large genomic islands (in Enterobacterial genomes), and to the identification of surface protein determinants in human pathogenic organisms (e.g., Staphylococcus genomes). We observed that bacterial surface binding proteins maintain abnormal GC content, potentially pointing to a viral origin. This study has demonstrated that GC content holds a high informational worth and hints at many underlying evolutionary processes. For online Supplementary Material, see www.liebertonline.com

    GISMO—gene identification using a support vector machine for ORF classification

    Get PDF
    We present the novel prokaryotic gene finder GISMO, which combines searches for protein family domains with composition-based classification based on a support vector machine. GISMO is highly accurate; exhibiting high sensitivity and specificity in gene identification. We found that it performs well for complete prokaryotic chromosomes, irrespective of their GC content, and also for plasmids as short as 10 kb, short genes and for genes with atypical sequence composition. Using GISMO, we found several thousand new predictions for the published genomes that are supported by extrinsic evidence, which strongly suggest that these are very likely biologically active genes. The source code for GISMO is freely available under the GPL license

    Detection of recombination in DNA multiple alignments with hidden markov models

    Get PDF
    CConventional phylogenetic tree estimation methods assume that all sites in a DNA multiple alignment have the same evolutionary history. This assumption is violated in data sets from certain bacteria and viruses due to recombination, a process that leads to the creation of mosaic sequences from different strains and, if undetected, causes systematic errors in phylogenetic tree estimation. In the current work, a hidden Markov model (HMM) is employed to detect recombination events in multiple alignments of DNA sequences. The emission probabilities in a given state are determined by the branching order (topology) and the branch lengths of the respective phylogenetic tree, while the transition probabilities depend on the global recombination probability. The present study improves on an earlier heuristic parameter optimization scheme and shows how the branch lengths and the recombination probability can be optimized in a maximum likelihood sense by applying the expectation maximization (EM) algorithm. The novel algorithm is tested on a synthetic benchmark problem and is found to clearly outperform the earlier heuristic approach. The paper concludes with an application of this scheme to a DNA sequence alignment of the argF gene from four Neisseria strains, where a likely recombination event is clearly detected

    STATISTICS IN THE BILLERA-HOLMES-VOGTMANN TREESPACE

    Get PDF
    This dissertation is an effort to adapt two classical non-parametric statistical techniques, kernel density estimation (KDE) and principal components analysis (PCA), to the Billera-Holmes-Vogtmann (BHV) metric space for phylogenetic trees. This adaption gives a more general framework for developing and testing various hypotheses about apparent differences or similarities between sets of phylogenetic trees than currently exists. For example, while the majority of gene histories found in a clade of organisms are expected to be generated by a common evolutionary process, numerous other coexisting processes (e.g. horizontal gene transfers, gene duplication and subsequent neofunctionalization) will cause some genes to exhibit a history quite distinct from the histories of the majority of genes. Such “outlying” gene trees are considered to be biologically interesting and identifying these genes has become an important problem in phylogenetics. The R sofware package kdetrees, developed in Chapter 2, contains an implementation of the kernel density estimation method. The primary theoretical difficulty involved in this adaptation concerns the normalizion of the kernel functions in the BHV metric space. This problem is addressed in Chapter 3. In both chapters, the software package is applied to both simulated and empirical datasets to demonstrate the properties of the method. A few first theoretical steps in adaption of principal components analysis to the BHV space are presented in Chapter 4. It becomes necessary to generalize the notion of a set of perpendicular vectors in Euclidean space to the BHV metric space, but there some ambiguity about how to best proceed. We show that convex hulls are one reasonable approach to the problem. The Nye-PCA- algorithm provides a method of projecting onto arbitrary convex hulls in BHV space, providing the core of a modified PCA-type method

    Lateral gene transfer of streptococcal ICE element RD2 (region of difference 2) encoding secreted proteins

    Get PDF
    Background: The genome of serotype M28 group A Streptococcus (GAS) strain MGAS6180 contains a novel genetic element named Region of Difference 2 (RD2) that encodes seven putative secreted extracellular proteins. RD2 is present in all serotype M28 strains and strains of several other GAS serotypes associated with female urogenital infections. We show here that the GAS RD2 element is present in strain MGAS6180 both as an integrative chromosomal form and a circular extrachromosomal element. RD2-like regions were identified in publicly available genome sequences of strains representing three of the five major group B streptococcal serotypes causing human disease. Ten RD2-encoded proteins have significant similarity to proteins involved in conjugative transfer of Streptococcus thermophilus integrative chromosomal elements (ICEs). Results: We transferred RD2 from GAS strain MGAS6180 (serotype M28) to serotype M1 and M4 GAS strains by filter mating. The copy number of the RD2 element was rapidly and significantly increased following treatment of strain MGAS6180 with mitomycin C, a DNA damaging agent. Using a PCR-based method, we also identified RD2-like regions in multiple group C and G strains of Streptococcus dysgalactiae subsp.equisimilis cultured from invasive human infections. Conclusions: Taken together, the data indicate that the RD2 element has disseminated by lateral gene transfer to genetically diverse strains of human-pathogenic streptococci

    Bacterial genomic G + C composition-eliciting environmental adaptation

    Get PDF
    Bacterial genomes reflect their adaptation strategies through nucleotide usage trends found in their chromosome composition. Bacteria, unlike eukaryotes contain a wide range of genomic G + C. This wide variability may be viewed as a response to environmental adaptation. Two overarching trends are observed across bacterial genomes, the first, correlates genomic G + C to environmental niches and lifestyle, while the other utilizees intra-genomic G + C incongruence to delineate horizontally transferred material. In this review, we focus on the influence of several properties including biochemical, genetic flows, selection biases, and the biochemical-energetic properties shaping genome composition. Outcomes indicate a trend toward high G + C and larger genomes in free-living organisms, as a result of more complex and varied environments (higher chance for horizontal gene transfer). Conversely, nutrient limiting and nutrient poor environments dictate smaller genomes of low GC in attempts to conserve replication expense. Varied processes including translesion repair mechanisms, phage insertion and cytosine degradation has been shown to introduce higher AT in genomic sequences. We conclude the review with an analysis of current bioinformatics tools seeking to elicit compositional variances and highlight the practical implications when using such techniques

    UNDERSTANDING THE EVOLUTION OF PATHOGENICITY WITHIN GEOSMITHIA

    Get PDF
    Geosmithia morbida is a filamentous ascomycete that causes thousand cankers disease in the eastern black walnut tree. This pathogen is commonly found in the western US; however, recently the disease was also detected in several eastern states where the black walnut lumber industry is concentrated. G. morbida is one of two known phytopathogens within the genus Geosmithia, and it is vectored into the host tree via the walnut twig beetle. We present the first de novo draft genome of G. morbida (Chapter 2). It is 26.5 Mbp in length and contains less than 1% repetitive elements. The genome possesses an estimated 6,273 genes, 277 of which are predicted to encode proteins with unknown functions. Approximately 31.5% of the proteins in G. morbida are homologous to proteins involved in pathogenicity, and 5.6% of the proteins contain signal peptides that indicate these proteins are secreted. Additionally, the genomes of Geosmithia flava and Geosmithia putterillii were assembled and compared with G. morbida (Chapter 3). The G. flava assembly composed of 1,819 scaffolds totaling in 29.47 Mbp in length, and G. putterillii genome contained 320 scaffolds consisting of 29.99 Mbp. Our results showed that all three Geosmithia species possess similar number of carbohydrate binding enzymes and proteases. We also constructed a Bayesian phylogeny that illustrates the evolutionary relationships between Geosmithia and other fungal species. Our phylogeny is consistent with topologies from previous studies. Lastly, we identified genes under positive selection in G. morbida that could potentially contribute to pathogenicity. Our results showed 38 genes under selection in G. morbida; none of which were under selection in G. clavigera. These findings indicate that species-specific mechanisms might be the driving force behind the evolution of pathogenicity in both of these beetle-vectored fungal pathogens

    Recovering complete and draft population genomes from metagenome datasets.

    Get PDF
    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improves the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution
    corecore