178 research outputs found

Efficient Inference of Recent and Ancestral Recombination within Bacterial Populations

Author: Andam Cheryl P.
Corander Jukka
Croucher Nicholas J.
Hanage William P.
Marttinen Pekka
Mostowy Rafal
Publication date: 01/02/2017

Prokaryotic evolution is affected by horizontal transfer of genetic material through recombination. Inference of an evolutionary tree of bacteria thus relies on accurate identification of the population genetic structure and recombination-derived mosaicism. Rapidly growing databases represent a challenge for computational methods to detect recombinations in bacterial genomes. We introduce a novel algorithm called fastGEAR which identifies lineages in diverse microbial alignments, and recombinations between them and from external origins. The algorithm detects both recent recombinations (affecting a few isolates) and ancestral recombinations between detected lineages (affecting entire lineages), thus providing insight into recombinations affecting deep branches of the phylogenetic tree. In simulations, fastGEAR had comparable power to detect recent recombinations and outstanding power to detect the ancestral ones, compared with state-of-the-art methods, often with a fraction of the computational cost. We demonstrate the utility of the method by analyzing a collection of 616 whole genomes of the recombinogenic pathogen Streptococcus pneumoniae, for which the method provided a high-resolution view of recombination across the genome. We examined in detail the penicillin-binding genes across the Streptococcus genus, demonstrating previously undetected genetic exchanges between different species at these three loci. Hence, fastGEAR can be readily applied to investigate mosaicism in bacterial genes across multiple species. Finally, fastGEAR correctly identified many known recombination hotspots and pointed to potential new ones. Matlab code and Linux/Windows executables are available at https://users.ics.aalto.fi/~pemartti/fastGEAR/ (last accessed February 6, 2017).

Crossref

Harvard University - DASH

Aaltodoc Publication Archive

Spiral - Imperial College Digital Repository

Helsingin yliopiston digitaalinen arkisto

Bayesian modeling of recombination events in bacterial populations

Author: Adam Baldwin
Chris Dowson
Eshwar Mahenthiralingam
Jukka Corander
Pekka Marttinen
William P Hanage
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have limited applicability in practice as the number of strains in a data set increases. Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites. Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html

Crossref

Online Research @ Cardiff

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Recombination produces coherent bacterial species clusters in both core and accessory genomes

Author: Corander Jukka
Croucher Nicholas J.
Gutmann Michael U.
Hanage William P.
Marttinen Pekka
Publication venue: 'Microbiology Society'
Publication date: 06/04/2017

Background: Population samples show bacterial genomes can be divided into a core of ubiquitous genes and accessory genes that are present in a fraction of isolates. The ecological significance of this variation in gene content remains unclear. However, microbiologists agree that a bacterial species should be ‘genomically coherent’, even though there is no consensus on how this should be determined. Results: We use a parsimonious model combining diversification in both the core and accessory genome, including mutation, homologous recombination (HR) and horizontal gene transfer (HGT) introducing new loci, to produce a population of interacting clusters of strains with varying genome content. New loci introduced by HGT may then be transferred on by HR. The model fits well to a systematic population sample of 616 pneumococcal genomes, capturing the major features of the population structure with parameter values that agree well with empirical estimates. Conclusions: The model does not include explicit selection on individual genes, suggesting that crude comparisons of gene content may be a poor predictor of ecological function. We identify a clearly divergent subpopulation of pneumococci that are inconsistent with the model and may be considered genomically incoherent with the rest of the population. These strains have a distinct disease tropism and may be rationally defined as a separate species. We also find deviations from the model that may be explained by recent population bottlenecks or spatial structure

Harvard University - DASH

Phylogeographic variation in recombination rates within a global clone of methicillin-resistant Staphylococcus aureus

Author: Aldeljawi Mona
Bentley Stephen D
Boye Kit
Castillo-Ramírez Santiago
Corander Jukka
Feil Edward J
Gulay Zeynep
Hanage William P
Holden Matthew T
Marttinen Pekka
Parkhill Julian
Westh Henrik
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/03/2014

Background: Next-generation sequencing (NGS) is a powerful tool for understanding both patterns of descent over time and space (phylogeography) and the molecular processes underpinning genome divergence in pathogenic bacteria. Here, we describe a synthesis between these perspectives by employing a recently developed Bayesian approach, BRATNextGen, for detecting recombination on an expanded NGS dataset of the globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clone ST239. Results: The data confirm strong geographical clustering at continental, national and city scales and demonstrate that the rate of recombination varies significantly between phylogeographic sub-groups representing independent introductions from Europe. These differences are most striking when mobile non-core genes are included, but remain apparent even when only considering the stable core genome. The monophyletic ST239 sub-group corresponding to isolates from South America shows heightened recombination, the sub-group predominantly from Asia shows an intermediate level, and a very low level of recombination is noted in a third sub-group representing a large collection from Turkey. Conclusions: We show that the rapid global dissemination of a single pathogenic bacterial clone results in local variation in measured recombination rates. Possible explanatory variables include the size and time since emergence of each defined sub-population (as determined by the sampling frame), variation in transmission dynamics due to host movement, and changes in the bacterial genome affecting the propensity for recombination

Harvard University - DASH

The impact of host metapopulation structure on the population genetics of colonizing bacteria

Author: Baquero Fernando
Coque Teresa M
Corander Jukka
Feil Edward J
Gutmann Michael
Hanage William P
Marttinen Pekka
Méric Guillaume
Numminen Elina
Sheppard Samuel K
Shubin Mikhail
van Schaik Willem
Willems Rob J L
Publication venue: 'Elsevier BV'
Publication date: 25/09/2015

Many key bacterial pathogens are frequently carried asymptomatically, and the emergence and spread of these opportunistic pathogens can be driven, or mitigated, via demographic changes within the host population. These inter-host transmission dynamics combine with basic evolutionary parameters such as rates of mutation and recombination, population size and selection, to shape the genetic diversity within bacterial populations. Whilst many studies have focused on how molecular processes underpin bacterial population structure, the impact of host migration and the connectivity of the local populations has received far less attention. A stochastic neutral model incorporating heightened local transmission has been previously shown to fit closely with genetic data for several bacterial species. However, this model did not incorporate transmission limiting population stratification, nor the possibility of migration of strains between subpopulations, which we address here by presenting an extended model. We study the consequences of migration in terms of shared genetic variation and show by simulation that the previously used summary statistic, the allelic mismatch distribution, can be insensitive to even large changes in microepidemic and migration rates. Using likelihood-free inference with genotype network topological summaries we fit a simpler model to commensal and hospital samples from the common nosocomial pathogens Staphylococcus aureus, Staphylococcus epidermidis, Enterococcus faecalis and Enterococcus faecium. Only the hospital data for E. faecium display clearly marked deviations from the model predictions which may be attributable to its adaptation to the hospital environment
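
The allelic mismatch distribution mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each isolate is represented by a multilocus allelic profile (a tuple of allele numbers) and tallies, over all isolate pairs, how many loci differ. The function name and the example profiles are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def allelic_mismatch_distribution(profiles):
    """Return the fraction of isolate pairs that differ at k loci, for k = 0..n_loci."""
    if len(profiles) < 2:
        raise ValueError("need at least two profiles")
    n_loci = len(profiles[0])
    if any(len(p) != n_loci for p in profiles):
        raise ValueError("all profiles must have the same number of loci")
    # Count, for every unordered pair of isolates, the number of mismatching loci.
    counts = Counter(
        sum(a != b for a, b in zip(p, q))
        for p, q in combinations(profiles, 2)
    )
    n_pairs = len(profiles) * (len(profiles) - 1) // 2
    return [counts.get(k, 0) / n_pairs for k in range(n_loci + 1)]


# Hypothetical seven-locus profiles, invented purely for illustration.
profiles = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 1, 1, 2),
    (1, 1, 1, 1, 1, 1, 1),
    (3, 4, 5, 6, 7, 8, 9),
]
distribution = allelic_mismatch_distribution(profiles)
```

Because the statistic records only pairwise mismatch counts, quite different population histories can yield similar distributions, which is consistent with the insensitivity the abstract reports.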

Crossref

University of Birmingham Research Portal

Cronfa at Swansea University

Utrecht University Repository

Plasmids Shaped the Recent Emergence of the Major Nosocomial Pathogen Enterococcus faecium

Author: Arredondo-Alonso S.
Braat J. C.
Corander J.
Kaski S.
Marttinen P.
McNally A.
Pensar J.
Pesonen M.
Puranen S.
Rogers M. R. C.
Schurch A. C.
Top J.
van Schaik W.
Willems R. J. L.
Publication date: 01/02/2020

Enterococcus faecium is a gut commensal of humans and animals but is also listed on the WHO global priority list of multidrug-resistant pathogens. Many of its antibiotic resistance traits reside on plasmids and have the potential to be disseminated by horizontal gene transfer. Here, we present the first comprehensive population-wide analysis of the pan-plasmidome of a clinically important bacterium, by whole-genome sequence analysis of 1,644 isolates from hospital, commensal, and animal sources of E. faecium. Long-read sequencing on a selection of isolates resulted in the completion of 305 plasmids that exhibited high levels of sequence modularity. We further investigated the entirety of all plasmids of each isolate (plasmidome) using a combination of short-read sequencing and machine-learning classifiers. Clustering of the plasmid sequences unraveled different E. faecium populations with a clear association with hospitalized patient isolates, suggesting different optimal configurations of plasmids in the hospital environment. The characterization of these populations allowed us to identify common mechanisms of plasmid stabilization such as toxin-antitoxin systems and genes exclusively present in particular plasmidome populations exemplified by copper resistance, phosphotransferase systems, or bacteriocin genes potentially involved in niche adaptation. Based on the distribution of k-mer distances between isolates, we concluded that plasmidomes rather than chromosomes are most informative for source specificity of E. faecium. IMPORTANCE: Enterococcus faecium is one of the most frequent nosocomial pathogens of hospital-acquired infections. E. faecium has gained resistance against most commonly available antibiotics, most notably, against ampicillin, gentamicin, and vancomycin, which renders infections difficult to treat. Many antibiotic resistance traits, in particular, vancomycin resistance, can be encoded in autonomous and extrachromosomal elements called plasmids. These sequences can be disseminated to other isolates by horizontal gene transfer and confer novel mechanisms to source specificity. In our study, we elucidated the total plasmid content, referred to as the plasmidome, of 1,644 E. faecium isolates by using short- and long-read whole-genome technologies with the combination of a machine-learning classifier. This was fundamental to investigate the full collection of plasmid sequences present in our collection (pan-plasmidome) and to observe the potential transfer of plasmid sequences between E. faecium hosts. We observed that E. faecium isolates from hospitalized patients carried a larger number of plasmid sequences compared to that from other sources, and they elucidated different configurations of plasmidome populations in the hospital environment. We assessed the contribution of different genomic components and observed that plasmid sequences have the highest contribution to source specificity. Our study suggests that E. faecium plasmids are regulated by complex ecological constraints rather than physical interaction between hosts.
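
The abstract does not say how its k-mer distances were computed. As a rough illustration of the general idea only, the Python sketch below uses the Jaccard distance between the k-mer sets of two sequences. The metric, the value of k, the function names and the toy sequences are all assumptions made for this example.

```python
def kmer_set(sequence, k):
    """Return the set of all overlapping substrings of length k."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """Return 1 - |A & B| / |A | B| for the k-mer sets A and B of two sequences."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    if not a and not b:
        # Two sequences shorter than k share no information; treat them as identical.
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


# Toy sequences invented for illustration; real inputs would be assembled contigs.
distance = jaccard_distance("ACGTACGT", "ACGTTTTT", k=4)
same = jaccard_distance("ACGTACGT", "ACGTACGT", k=4)
```

Exact k-mer sets become expensive for whole genomes, so sketching schemes that subsample k-mers are commonly used at that scale. This version trades speed for transparency.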

University of Birmingham Research Portal

Directory of Open Access Journals

The University of Manchester - Institutional Repository

Helsingin yliopiston digitaalinen arkisto

Utrecht University Repository

Detection of recombination events in bacterial genomes from large population samples

Author: Bentley Stephen D.
Connor Thomas R.
Corander Jukka
Croucher Nicholas J.
Hanage William P.
Harris Simon R.
Marttinen Pekka
Publication date: 01/10/2011

Crossref

Online Research @ Cardiff

Harvard University - DASH

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Population gene introgression and high genome plasticity for the zoonotic pathogen Streptococcus agalactiae

Author: Brett M Probert
Chiara Crestani
Christopher D Town
Garrett H Springer
Hayley B Hassler
Irina M Velsko
Md Tauqeer Alam
Michael J Stanhope
Paulina D Pavinski Bitar
Ruth N Zadoks
Shannon D Manning
Vincent P Richards
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/11/2019

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range, making it an excellent model system to study these processes. Here we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated twelve major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan-genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of eleven populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spilled over from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another; biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

Crossref

Enlighten

MPG.PuRe

International genomic definition of pneumococcal lineages, to contextualise disease, antibiotic resistance and vaccine impact

Author: Antonio M
Benisty R
Bentley LJ
Bentley SD
Breiman RF
Corander J
Cornick JE
Croucher NJ
Dagan R
du Plessis M
Everett DB
Gladstone RA
Hawkins PA
Ho PL
Klugman KP
Kwambana-Adams B
Lees JA
Lo SW
Madhi SA
Marttinen P
McGee L
Nzenze SA
Ochoa TJ
Page AJ
van Tonder AJ
von Gottberg A
Publication venue: ELSEVIER SCIENCE BV
Publication date: 01/05/2019

Background: Pneumococcal conjugate vaccines have reduced the incidence of invasive pneumococcal disease, caused by vaccine serotypes, but non-vaccine-serotypes remain a concern. We used whole genome sequencing to study pneumococcal serotype, antibiotic resistance and invasiveness, in the context of genetic background. Methods: Our dataset of 13,454 genomes, combined with four published genomic datasets, represented Africa (40%), Asia (25%), Europe (19%), North America (12%), and South America (5%). These 20,027 pneumococcal genomes were clustered into lineages using PopPUNK, and named Global Pneumococcal Sequence Clusters (GPSCs). From our dataset, we additionally derived serotype and sequence type, and predicted antibiotic sensitivity. We then measured invasiveness using odds ratios relating prevalence in invasive pneumococcal disease to carriage. Findings: The combined collections (n = 20,027) were clustered into 621 GPSCs. Thirty-five GPSCs observed in our dataset were represented by >100 isolates, and subsequently classed as dominant-GPSCs. In 22/35 (63%) of dominant-GPSCs both non-vaccine serotypes and vaccine serotypes were observed in the years up until, and including, the first year of pneumococcal conjugate vaccine introduction. Penicillin and multidrug resistance were higher (p < .05) in a subset of dominant-GPSCs (14/35 and 9/35, respectively), and resistance to an increasing number of antibiotic classes was associated with increased recombination (R² = 0.27, p < .0001). In 28/35 dominant-GPSCs, the country of isolation was a significant predictor (p < .05) of its antibiogram (mean misclassification error 0.28, SD ± 0.13). We detected increased invasiveness of six genetic backgrounds, when compared to other genetic backgrounds expressing the same serotype. Up to 1.6-fold changes in invasiveness odds ratio were observed. Interpretation: We define GPSCs that can be assigned to any pneumococcal genomic dataset, to aid international comparisons. Existing non-vaccine-serotypes in most GPSCs preclude the removal of these lineages by pneumococcal conjugate vaccines, leaving potential for serotype replacement. A subset of GPSCs have increased resistance, and/or serotype-independent invasiveness.
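
The abstract measures invasiveness with odds ratios that relate prevalence in invasive disease to prevalence in carriage. The Python sketch below shows the standard odds ratio for a 2x2 table, which may differ in detail from the study's procedure (for example in how zero counts or confidence intervals are handled). The function name and the counts are hypothetical.

```python
def invasiveness_odds_ratio(disease_target, carriage_target,
                            disease_other, carriage_other):
    """Odds ratio of a 2x2 table: (disease/carriage odds of the target group)
    divided by (disease/carriage odds of all other isolates)."""
    if carriage_target == 0 or disease_other == 0:
        raise ValueError("odds ratio undefined: a denominator count is zero")
    return (disease_target * carriage_other) / (carriage_target * disease_other)


# Hypothetical counts, invented for illustration only:
# the lineage of interest has 30 disease and 10 carriage isolates,
# all other lineages together have 60 disease and 100 carriage isolates.
odds_ratio = invasiveness_odds_ratio(30, 10, 60, 100)
```

A value above 1 means the target group appears in disease more often than its carriage prevalence alone would predict. A value below 1 means the opposite.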

UCL Discovery

Ensemble approach to predict specificity determinants: benchmarking and validation

Author: Anna R Panchenko
Saikat Chakrabarti
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: It is extremely important and challenging to identify the sites that are responsible for functional specification or diversification in protein families. In this study, a rigorous comparative benchmarking protocol was employed to provide a reliable evaluation of methods which predict the specificity determining sites. Subsequently, the three best performing methods were applied to identify new potential specificity determining sites through an ensemble approach and common agreement of their prediction results. Results: It was shown that the analysis of structural characteristics of predicted specificity determining sites might provide the means to validate their prediction accuracy. For example, we found that for smaller distances it holds true that the more reliable the prediction method is, the closer predicted specificity determining sites are to each other and to the ligand. Conclusion: We observed certain similarities of structural features between predicted and actual subsites which might point to their functional relevance. We speculate that the majority of the identified potential specificity determining sites might be indirectly involved in specific interactions and could be ideal targets for mutagenesis experiments.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central