
    Subspecies typing of Streptococcus agalactiae based on ribosomal subunit protein mass variation by MALDI-TOF MS

    Background: A ribosomal subunit protein (rsp)-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) method was developed for fast subspecies-level typing of Streptococcus agalactiae (Group B Streptococcus, GBS), a major cause of neonatal sepsis and meningitis. Methods: A total of 796 GBS whole genome sequences, covering the genetic diversity of the global GBS population, were used to in silico predict molecular mass variability of 28 rsp and to identify unique rsp mass combinations, termed “rsp-profiles”. The in silico established GBS typing scheme was validated by MALDI-TOF MS analysis of GBS isolates at two independent research sites in Europe and South East Asia. Results: We identified in silico 62 rsp-profiles, with the majority (>80%) of the 796 GBS isolates displaying one of the six rsp-profiles 1-6. These dominant rsp-profiles classify GBS strains in high concordance with the core-genome based phylogenetic clustering. Validation of our approach by in-house MALDI-TOF MS analysis of 248 GBS isolates and external analysis of 8 GBS isolates showed that across different laboratories and MALDI-TOF MS platforms, the 28 rsp were detected reliably in the mass spectra, allowing assignment of clinical isolates to rsp-profiles at high sensitivity (99%) and specificity (97%). Our approach distinguishes the major phylogenetic GBS genotypes, identifies hyper-virulent strains, predicts the probable capsular serotype and surface protein variants and distinguishes between GBS genotypes of human and animal origin. Conclusion: We combine the information depth of whole genome sequences with the highly cost efficient, rapid and robust MALDI-TOF MS approach facilitating high-throughput, inter-laboratory, large-scale GBS epidemiological and clinical studies based on pre-defined rsp-profiles
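As a rough illustration of how an isolate could be assigned to a pre-defined rsp-profile, the sketch below matches observed peak masses against predicted masses within a tolerance. The profile names, protein names, masses, the 500 ppm tolerance, and both function names are assumptions made for this example; none are taken from the study.

```python
from typing import Dict, List, Optional

# Hypothetical reference table: rsp-profile -> {protein name: predicted mass in Da}.
# All masses are invented placeholders, not values from the study.
REFERENCE_PROFILES: Dict[str, Dict[str, float]] = {
    "profile_A": {"L30": 6300.0, "S20": 8900.0, "L29": 7800.0},
    "profile_B": {"L30": 6314.0, "S20": 8900.0, "L29": 7800.0},
}


def has_peak(peaks: List[float], mass: float, tol_ppm: float) -> bool:
    """Return True if any observed peak lies within tol_ppm of the predicted mass."""
    window = mass * tol_ppm / 1e6
    return any(abs(p - mass) <= window for p in peaks)


def assign_profile(
    peaks: List[float],
    profiles: Dict[str, Dict[str, float]] = REFERENCE_PROFILES,
    tol_ppm: float = 500.0,  # assumed tolerance, chosen only for illustration
) -> Optional[str]:
    """Return the single profile whose predicted masses are all observed, else None."""
    hits = [
        name
        for name, masses in profiles.items()
        if all(has_peak(peaks, m, tol_ppm) for m in masses.values())
    ]
    # Ambiguous or missing matches are reported as None rather than guessed.
    return hits[0] if len(hits) == 1 else None
```

Returning `None` for ambiguous spectra keeps the sketch conservative: a spectrum that fits two profiles is left unassigned instead of being forced into one.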

    Fast and scalable inference of multi-sample cancer lineages.

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee

    Bayesian modeling of recombination events in bacterial populations

    Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have limited applicability in practice as the number of strains in a data set increases. Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites. Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at http://web.abo.fi/fak/mnf//mate/jc/software/brat.html

    Genetic affinities within a large global collection of pathogenic Leptospira: implications for strain identification and molecular epidemiology

    Leptospirosis is an important zoonosis with widespread human health implications. The lack of accurate methods for distinguishing individual Leptospira strains in outbreak investigations poses considerable problems for disease control. We applied fluorescent amplified fragment length polymorphism (FAFLP) analysis to Leptospira and investigated its utility in establishing genetic relationships among 271 isolates in the context of species-level assignments of our global collection of isolates and strains obtained from a diverse array of hosts. In addition, this method was compared to an in-house multilocus sequence typing (MLST) method based on polymorphisms in three housekeeping genes, the rrs locus and two envelope proteins. Phylogenetic relationships were deduced based on bifurcating neighbor-joining trees as well as median-joining network analyses integrating both the FAFLP data and MLST-based haplotypes. The phylogenetic relationships were also reproduced through Bayesian analysis of the multilocus sequence polymorphisms. We found FAFLP to be an important method for outbreak investigation and for clustering of isolates based on their geographical descent rather than by genome species types. The FAFLP method did not, however, provide enough taxonomic resolution to replace the highly tedious serotyping procedures currently in use. MLST, on the other hand, was found to be highly robust and efficient in identifying ancestral relationships and in segregating outbreak-associated and other strains according to their genome species status and, therefore, could unambiguously be applied for investigating phylogenetics of Leptospira in the context of taxonomy as well as gene flow. For instance, MLST was more efficient than FAFLP in clustering strains from the Andaman Islands of India with their counterparts from mainland India and Sri Lanka, implying that such strains share genetic relationships and that leptospiral strains might be frequently circulating between the islands and the mainland

    Predicting protein function with hierarchical phylogenetic profiles: The Gene3D phylo-tuner method applied to eukaryotic Genomes

    "Phylogenetic profiling" is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence-absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence-absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence-absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity, from 30% to 100%, and phylogenetic profiles are built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will "auto-tune" with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence-absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes
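The comparison of copy-number profiles can be sketched as follows. The abstract does not say how the Euclidean distance is normalised, so this example assumes each profile is scaled to unit length before the distance is taken; that choice, the function names, and the example counts are illustrative assumptions, not the published method.

```python
import math
from typing import List, Sequence


def unit_normalise(profile: Sequence[float]) -> List[float]:
    """Scale a copy-number profile to unit Euclidean length (assumed normalisation)."""
    norm = math.sqrt(sum(x * x for x in profile))
    if norm == 0:
        raise ValueError("profile has no copies in any genome")
    return [x / norm for x in profile]


def profile_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two unit-normalised profiles.

    Each position holds the copy number of a domain family in one genome,
    so both profiles must list the same genomes in the same order.
    """
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    ua, ub = unit_normalise(a), unit_normalise(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ua, ub)))
```

Under this normalisation, two families whose copy numbers rise and fall proportionally across genomes have a distance near zero even when their absolute copy numbers differ, which is the kind of correlated change the abstract describes.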

    Species status of Neisseria gonorrhoeae: Evolutionary and epidemiological inferences from multilocus sequence typing

    This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited - Copyright © 2007 Bennett et al; licensee BioMed Central Ltd. Background: Various typing methods have been developed for Neisseria gonorrhoeae, but none provide the combination of discrimination, reproducibility, portability, and genetic inference that allows the analysis of all aspects of the epidemiology of this pathogen from a single data set. Multilocus sequence typing (MLST) has been used successfully to characterize the related organisms Neisseria meningitidis and Neisseria lactamica. Here, the same seven-locus Neisseria scheme was used to characterize a diverse collection of N. gonorrhoeae isolates to investigate whether this method would allow differentiation among isolates, and to distinguish these three species. Results: A total of 149 gonococcal isolates were typed and submitted to the Neisseria MLST database. Although relatively few (27) polymorphisms were detected among the seven MLST loci, a total of 66 unique allele combinations (sequence types, STs) were observed, a number comparable to that seen among isolate collections of the more diverse meningococcus. Patterns of genetic variation were consistent with high levels of recombination generating this diversity. There was no evidence for geographical structuring among the isolates examined, with isolates collected in Liverpool, UK, showing levels of diversity similar to a global collection of isolates. There was, however, evidence that populations of N. meningitidis, N. gonorrhoeae and N. lactamica were distinct, with little support for frequent genetic recombination among these species, with the sequences from the gdh locus alone grouping the species into distinct clusters. Conclusion: The seven-locus Neisseria MLST scheme was readily adapted to N. gonorrhoeae isolates, providing a highly discriminatory typing method. In addition, these data permitted phylogenetic and population genetic inferences to be made, including direct comparisons with N. meningitidis and N. lactamica. Examination of these data demonstrated that alleles were rarely shared among the three species. Analysis of variation at a single locus, gdh, provided a rapid means of identifying misclassified isolates and determining whether mixed cultures were present. This study is funded by the Wellcome Trust and European Union
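To make the notion of a sequence type concrete, the sketch below groups isolates by their combination of allele numbers at the seven Neisseria MLST loci. The function name, the isolate names, and the allele numbers in the usage example are hypothetical, and the ST labels it produces are local counters for illustration, not identifiers from the Neisseria MLST database.

```python
from typing import Dict, Tuple

# The seven loci of the Neisseria MLST scheme.
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")


def assign_sequence_types(isolates: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Give every distinct seven-locus allele combination its own local ST label.

    `isolates` maps an isolate name to {locus: allele number}. Labels are
    assigned in order of first appearance, starting at 1.
    """
    st_of_profile: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for name, alleles in isolates.items():
        profile = tuple(alleles[locus] for locus in LOCI)
        # Reuse the label if this combination was seen before, else mint a new one.
        result[name] = st_of_profile.setdefault(profile, len(st_of_profile) + 1)
    return result
```

For example, with invented allele numbers:

```python
base = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7)))
variant = {**base, "gdh": 9}  # differs from base at one locus only
sts = assign_sequence_types({"iso1": base, "iso2": dict(base), "iso3": variant})
# iso1 and iso2 share a label; iso3 gets a new one
```

A single differing allele is enough to create a new sequence type, which helps explain how a small number of polymorphisms can still yield many unique allele combinations.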