4,565 research outputs found

    Evaluation of genomic island predictors using a comparative genomics approach

    Background: Genomic islands (GIs) are clusters of genes in prokaryotic genomes of probable horizontal origin. GIs are disproportionately associated with microbial adaptations of medical or environmental interest. Recently, multiple programs for automated detection of GIs have been developed that utilize sequence composition characteristics, such as G+C ratio and dinucleotide bias. To robustly evaluate the accuracy of such methods, we propose that a dataset of GIs be constructed using criteria that are independent of sequence composition-based analysis approaches. Results: We developed a comparative genomics approach (IslandPick) that identifies both very probable islands and non-island regions. The approach involves 1) flexible, automated selection of comparative genomes for each query genome, using a distance function that picks appropriate genomes for identification of GIs, 2) identification of regions unique to the query genome, compared with the chosen genomes (positive dataset) and 3) identification of regions conserved across all genomes (negative dataset). Using our constructed datasets, we investigated the accuracy of several sequence composition-based GI prediction tools. Conclusion: Our results indicate that AlienHunter has the highest recall, but the lowest measured precision, while SIGI-HMM is the most precise method. SIGI-HMM and IslandPath/DIMOB have comparable overall highest accuracy. Our comparative genomics approach, IslandPick, was the most accurate, compared with a curated list of GIs, indicating that we have constructed suitable datasets. This represents the first evaluation, using diverse and independent datasets that were not artificially constructed, of the accuracy of several sequence composition-based GI predictors. The caveats associated with this analysis and proposals for optimal island prediction are discussed.
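    The abstract above scores predictors by precision and recall against a positive (island) and a negative (conserved) dataset. A minimal sketch of how such a score could be computed is shown below. It assumes nucleotide-level counting over half-open (start, end) intervals that do not overlap within a list; the function names and the counting scheme are illustrative assumptions, not the authors' implementation.

```python
def overlap(a, b):
    """Number of positions shared by two lists of half-open (start, end) intervals.

    Assumes intervals within each list do not overlap one another.
    """
    return sum(
        max(0, min(e1, e2) - max(s1, s2))
        for s1, e1 in a
        for s2, e2 in b
    )


def total_length(intervals):
    """Total number of positions covered by a list of intervals."""
    return sum(end - start for start, end in intervals)


def precision_recall(predicted, positives, negatives):
    """Nucleotide-level precision and recall of predicted islands.

    predicted: intervals reported by a GI predictor (hypothetical input)
    positives: intervals in the positive (island) dataset
    negatives: intervals in the negative (conserved) dataset
    Predictions outside both datasets are ignored.
    """
    tp = overlap(predicted, positives)   # predicted and truly island
    fp = overlap(predicted, negatives)   # predicted but conserved
    fn = total_length(positives) - tp    # island positions that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy coordinates, invented purely for illustration.
p, r = precision_recall(
    predicted=[(0, 100), (200, 300)],
    positives=[(50, 150)],
    negatives=[(250, 400)],
)
```

    In this toy case 50 predicted positions fall in the positive set and 50 in the negative set, so both precision and recall come out at 0.5.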

    A genomic view of food-related and probiotic Enterococcus strains

    The study of enterococcal genomes has grown considerably in recent years. While special attention is paid to comparative genomic analysis among clinically relevant isolates, in this study we performed an exhaustive comparative analysis of enterococcal genomes of food origin and/or with potential to be used as probiotics. Beyond common genetic features, we especially aimed to identify those that are specific to enterococcal strains isolated from a certain food-related source as well as features present in a species-specific manner. Thus, the genome sequences of 25 Enterococcus strains, from 7 different species, were examined and compared. Their phylogenetic relationship was reconstructed based on orthologous proteins and whole genomes. Likewise, markers associated with successful colonization (bacteriocin genes and genomic islands) and genome plasticity (phages and clustered regularly interspaced short palindromic repeats) were investigated for lifestyle-specific genetic features. At the same time, a search for antibiotic resistance genes was carried out, since they are of great concern in the food industry. Finally, it was possible to locate 1617 FIGfam families as a core proteome universally present among the genera and to determine that most of the accessory genes code for hypothetical proteins, providing reasonable hints to support their functional characterization.
    Affiliation: Bonacina, Julieta. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Suárez, Nadia Elina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Hormigo, Daniel Ricardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Fadda, Silvina G. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentina
    Affiliation: Lechner, Marcus. University Marburg; Germany
    Affiliation: Saavedra, Maria Lucila. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucuman. Centro de Referencia Para Lactobacilos; Argentin

    Joint genomic and proteomic analysis identifies meta-trait characteristics of virulent and non-virulent Staphylococcus aureus strains

    Staphylococcus aureus is an opportunistic pathogen of humans and warm-blooded animals and presents a growing threat in terms of multi-drug resistance. Despite numerous studies, the basis of staphylococcal virulence and switching between commensal and pathogenic phenotypes is not fully understood. Using genomics, we show here that S. aureus strains exhibiting virulent (VIR) and non-virulent (NVIR) phenotypes in a chicken embryo infection model genetically fall into two separate groups, with the VIR group being much more cohesive than the NVIR group. Significantly, the genes encoding known staphylococcal virulence factors, such as clumping factors, are either found in different allelic variants in the genomes of NVIR strains (compared to VIR strains) or are inactive pseudogenes. Moreover, the pyruvate carboxylase and gamma-aminobutyrate permease genes, which were previously linked with virulence, are pseudogenized in NVIR strain ch22. Further, we use comprehensive proteomics tools to characterize strains that show opposing phenotypes in a chicken embryo virulence model. VIR strain CH21 had an elevated level of diapolycopene oxygenase involved in staphyloxanthin production (protection against free radicals) and expressed a higher level of immunoglobulin-binding protein Sbi on its surface compared to NVIR strain ch22. Furthermore, joint genomic and proteomic approaches linked the elevated production of superoxide dismutase and DNA-binding protein by NVIR strain ch22 with gene duplications

    VCGIDB: A database and web resource for the genomic islands from Vibrio cholerae

    Vibrio cholerae is the causative agent of cholera, which is a severe, life-threatening diarrheal disease. The current seventh pandemic has not been eradicated and the outbreak is still ongoing around the world. The evolution of the pandemic-causing strain has been greatly influenced by lateral gene transfer, and the mechanisms by which V. cholerae acquires pathogenicity mainly involve genomic islands (GIs). Thus, detecting GIs and collecting comprehensive information about them is necessary to understand the continuing resurgence of V. cholerae and its newly emerging pathogenic strains. In this study, 798 V. cholerae strains were tested using the GI-Scanner algorithm, which was developed to detect candidate GIs and identify them in a comparative genomics approach. The algorithm predicted 435 highly probable genomic islands, and we built a database, called Vibrio cholerae Genomic Island Database (VCGIDB). This database shows advanced results that were acquired from a large genome set using phylogeny-based predictions. Moreover, VCGIDB is a highly expandable database that does not require intensive computation, which enables us to update it with a greater number of genomes using a novel genomic island prediction method. The VCGIDB website allows the user to browse the data and presents the results in a visual manner.

    Integrated Machine Learning and Bioinformatics Approaches for Prediction of Cancer-Driving Gene Mutations

    Cancer arises from the accumulation of somatic mutations and genetic alterations in cell division checkpoints and apoptosis; this often leads to abnormal tumor proliferation. Proper classification of cancer-linked driver mutations will considerably help our understanding of the molecular dynamics of cancer. In this study, we compared several cancer-specific predictive models for prediction of driver mutations in cancer-linked genes that were validated on canonical data sets of functionally validated mutations and applied to raw cancer genomics data. By analyzing pathogenicity prediction and conservation scores, we have shown that evolutionary conservation scores play a pivotal role in the classification of cancer drivers and were the most informative features in the driver mutation classification. Through extensive comparative analysis with structure-functional experiments and multicenter mutational calling data from PanCancer Atlas studies, we have demonstrated the robustness of our models and addressed the validity of computational predictions. We evaluated the performance of our models using standard diagnostic metrics such as sensitivity, specificity, area under the curve and F-measure. To address the interpretability of cancer-specific classification models and obtain novel insights about molecular signatures of driver mutations, we have complemented machine learning predictions with structure-functional analysis of cancer driver mutations in several key tumor suppressor genes and oncogenes. Through the experiments carried out in this study, we found that evolutionary-based features have the strongest signal in the machine learning classification of driver mutations and provide orthogonal information to the ensemble-based scores that are prominent in the ranking of feature importance

    A quantitative account of genomic island acquisitions in prokaryotes

    Background: Microbial genomes do not merely evolve through the slow accumulation of mutations, but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovation leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all the DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments from databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which facilitates compositional comparisons as an alignment-free alternative for phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species. Results: When multiple Genomic Islands are present within a single genome, we find that up to 28% of these are compositionally very similar to each other, indicative of frequent recurring acquisitions from the same donor to the same acceptor. Conclusions: This represents the first quantitative assessment of common directional transfer events in prokaryotic evolutionary history. We suggest that many of the resident Genomic Islands per prokaryotic genome originated from the same source, which may have implications with respect to their regulatory interactions, and for the elucidation of the common origins of these acquired gene clusters.
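    The abstract above compares islands by their relative dinucleotide frequencies. A minimal sketch of one common formulation follows: the relative abundance of a dinucleotide xy is taken as f(xy) / (f(x) f(y)), and two sequences are compared by the mean absolute difference over all 16 dinucleotides. This is an illustrative assumption rather than the authors' exact method. The function names are hypothetical, and for simplicity the sketch counts a single strand only, whereas published signatures often also include the reverse complement.

```python
from collections import Counter
from itertools import product


def dinucleotide_signature(seq):
    """Relative dinucleotide abundances f(xy) / (f(x) * f(y)) for one strand."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    mono = Counter(seq)                                # single-base counts
    di = Counter(seq[i:i + 2] for i in range(n - 1))   # overlapping pairs
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n, mono[y] / n
        fxy = di[x + y] / (n - 1)
        # Define the ratio as 0 when either base is absent from the sequence.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16


# Toy sequences, invented purely for illustration.
island_1 = "ATGCGCGATATCGCGCTAGCGCGATCG"
island_2 = "ATATATTTAAATATTAAATTTATATAA"
d_same = signature_distance(island_1, island_1)
d_diff = signature_distance(island_1, island_2)
```

    A small distance between two islands would be read as compositional similarity; what threshold counts as "very similar" is left open here because the abstract does not state one.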

    Identification of Novel Genomic Islands in Liverpool Epidemic Strain of Pseudomonas aeruginosa Using Segmentation and Clustering

    This article utilizes a recursive segmentation and clustering procedure presented as a genome-mining tool, GEMINI, to decipher genomic islands and understand their contributions to the evolution of virulence and antibiotic resistance in Pseudomonas aeruginosa