Bayesian phylogenetic analysis of linguistic data using BEAST
Bayesian phylogenetic methods provide a set of tools to efficiently evaluate large linguistic datasets by reconstructing phylogenies (family trees) that represent the history of language families. These methods provide a powerful way to test hypotheses about prehistory, regarding the subgrouping, origins, expansion, and timing of the languages and their speakers. Through phylogenetics, we gain insights into the process of language evolution in general and into how fast individual features change in particular. This article introduces Bayesian phylogenetics as applied to languages. We describe substitution models for cognate evolution, molecular clock models for the evolutionary rate along the branches of a tree, and tree generating processes suitable for linguistic data. We explain how to find the best-suited model using path sampling or nested sampling. The theoretical background of these models is supplemented by a practical tutorial describing how to set up a Bayesian phylogenetic analysis using the software tool BEAST2.
Sections: 1. Introduction 2. Bayesian phylogenetics 3. Models of evolution 4. Rate variation and calibration 5. Tree priors 6. Choosing the best analysis 7. Exploring the space of trees using BEAST2 8. Hypothesis testing with trees 9. Conclusion
Rapid incidence estimation from SARS-CoV-2 genomes reveals decreased case detection in Europe during summer 2020
By October 2021, 230 million SARS-CoV-2 diagnoses had been reported. Yet, a considerable proportion of cases remains undetected. Here, we propose GInPipe, a method that rapidly reconstructs SARS-CoV-2 incidence profiles solely from publicly available, time-stamped viral genomes. We validate GInPipe against simulated outbreaks and elaborate phylodynamic analyses. Using available sequence data, we reconstruct incidence histories for Denmark, Scotland, Switzerland, and Victoria (Australia) and demonstrate how to use the method to investigate the effects of changing testing policies on case ascertainment. Specifically, we find that under-reporting was highest during summer 2020 in Europe, coinciding with more liberal testing policies at times of low testing capacities. Due to the increased use of real-time sequencing, it is envisaged that GInPipe can complement established surveillance tools to monitor the SARS-CoV-2 pandemic. In post-pandemic times, when diagnostic efforts are decreasing, GInPipe may facilitate the detection of hidden infection dynamics.
Sections: Results (Method validation: in silico experiment; Method validation: phylodynamics; Reconstructed incidence histories; Relative case detection rate); Discussion; Method
Molecular epidemiology of SARS-CoV-2: a regional to global perspective
Background: After a year of the global SARS-CoV-2 pandemic, a highly dynamic genetic diversity is surfacing. Among nearly 1000 reported virus lineages, dominant lineages such as B.1.1.7 or B.1.351 attract media attention with questions regarding vaccine efficiency and transmission potential. In response to the pandemic, the Jena University Hospital began sequencing SARS-CoV-2 samples in Thuringia in early 2020.
Methods: Viral RNA was sequenced in tiled amplicons using Nanopore sequencing. Subsequently, bioinformatic workflows were used to process the generated data. As a genomic background, 9,642 representative SARS-CoV-2 genomes (1,917 of German origin) were extracted from more than 300,000 genomes.
Results: In a comprehensive bioinformatics analysis, we have set Thuringian isolates in the German, European and global context. In Thuringia, a largely rural German region without an international airport and a population density below the German average, we discovered many of the common “EU lineages”. German samples are scattered across eight major clades, and Thuringian samples occupy four of them.
Conclusion: The rapid emergence and spread of novel variants are of great concern as these lineages could transmit more efficiently, evade current vaccine efforts or undermine diagnostic test accuracy. To anticipate and mitigate these threats, continuous molecular surveillance is essential.
Key messages:
- Bioinformatics analysis of 1,917, 4,251, and 3,474 SARS-CoV-2 genomes from Germany, the EU (except Germany), and non-EU, respectively, subsampled from more than 300,000 public genomes and placed in the context of Thuringian sequences
- Constant antigenic drift for SARS-CoV-2; no clear pattern or clustering is visible in Thuringia based on the current number of samples
- Currently over 100 described lineages are identified in Germany and only a subset (9) are detected in Thuringia so far, most likely due to genetic undersampling
- From a national perspective, it is likely that high-frequency lineages, which are currently spreading throughout Europe, will eventually also reach Thuringia
- Systematic and dense molecular surveillance via whole-genome sequencing is needed to detect concerning new lineages early, limit spread and adjust vaccines if necessary
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: The work is funded by the German Ministry of Education and Research (BMBF), grant number 01KX2021, and the Thuringian Region Government, grant number TZUZI82094.
Data availability: All data is available on GISAID.
Sections: Introduction; Methods (Nanopore sequencing and genome reconstruction; Time tree creation); Results (Most highly prevalent SARS-CoV-2 lineages in Germany detected in Thuringia; Genetic divergence and current lineage distribution); Discussion
Optimal interpolation of satellite and ground data for irradiance nowcasting at city scales
We use a Bayesian method, optimal interpolation, to improve satellite-derived irradiance estimates at city scales using ground sensor data. Optimal interpolation requires error covariances in the satellite estimates and ground data, which define how information from the sensor locations is distributed across a large area. We describe three methods to choose such covariances, including a covariance parameterization that depends on the relative cloudiness between locations. Results are computed with ground data from 22 sensors over a 75×80 km area centered on Tucson, AZ, using two satellite-derived irradiance models. The improvements in standard error metrics for both satellite models indicate that our approach is applicable to additional satellite-derived irradiance models. We also show that optimal interpolation can nearly eliminate mean bias error and improve the root mean squared error by 50%.
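As a rough illustration of the update that optimal interpolation performs, the sketch below applies the textbook analysis equation, analysis = background + K (obs - H background) with gain K = P Hᵀ (H P Hᵀ + R)⁻¹, to a toy one-dimensional grid. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name, the three-point grid, the exponential distance-based covariance (the paper also considers a cloudiness-dependent parameterization, which is not reproduced), and all numerical values.

```python
import numpy as np

def optimal_interpolation(background, obs, H, P, R):
    """Textbook optimal interpolation update (hypothetical helper).

    background: satellite-derived estimate on the grid, shape (n,)
    obs:        ground sensor readings, shape (m,)
    H:          observation operator mapping grid to sensors, shape (m, n)
    P:          background (satellite) error covariance, shape (n, n)
    R:          observation (sensor) error covariance, shape (m, m)
    """
    innovation = obs - H @ background      # sensor minus satellite at sensor sites
    S = H @ P @ H.T + R                    # innovation covariance
    # Gain K = P H^T S^{-1}; the transpose trick relies on P and S being symmetric.
    K = np.linalg.solve(S, H @ P).T
    analysis = background + K @ innovation
    P_analysis = (np.eye(len(background)) - K @ H) @ P
    return analysis, P_analysis

# Toy setup (all values assumed): three grid points, one sensor at the first.
coords = np.array([0.0, 10.0, 40.0])               # km along a line
background = np.array([500.0, 520.0, 480.0])       # W/m^2
dist = np.abs(coords[:, None] - coords[None, :])
P = 50.0**2 * np.exp(-dist / 20.0)                 # assumed 20 km correlation length
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[10.0**2]])
obs = np.array([450.0])

analysis, P_analysis = optimal_interpolation(background, obs, H, P, R)
print(analysis)
```

In this toy case the correction is largest at the sensor location and shrinks with distance, which is the sense in which the covariances control how sensor information spreads across the area.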
Genomic Surveillance of Vancomycin-Resistant Enterococcus faecium Reveals Spread of a Linear Plasmid Conferring a Nutrient Utilization Advantage
Healthcare-associated outbreaks of vancomycin-resistant Enterococcus faecium (VREfm) are a worldwide problem with increasing prevalence. The genomic plasticity of this hospital-adapted pathogen contributes to its efficient spread despite infection control measures. Here, we aimed to identify the genomic and phenotypic determinants of health care-associated transmission of VREfm. We assessed the VREfm transmission networks at the tertiary-care University Hospital of Zurich (USZ) between October 2014 and February 2018 and investigated microevolutionary dynamics of this pathogen. We performed whole-genome sequencing for the 69 VREfm isolates collected during this time frame and assessed the population structure and variability of the vancomycin resistance transposon. Phylogenomic analysis allowed us to reconstruct transmission networks and to unveil external or wider transmission networks undetectable by routine surveillance. Notably, it unveiled a persistent clone, sampled 31 times over a 29-month period. Exploring the evolutionary dynamics of this clone and characterizing the phenotypic consequences revealed the spread of a variant with decreased daptomycin susceptibility and the acquired ability to utilize N-acetyl-galactosamine (GalNAc), one of the primary constituents of the human gut mucins. This nutrient utilization advantage was conferred by a novel plasmid, termed pELF_USZ, which exhibited a linear topology. This plasmid, which was harbored by two distinct clones, was transferable by conjugation. Overall, this work highlights the potential of combining epidemiological, functional genomic, and evolutionary perspectives to unveil adaptation strategies of VREfm. IMPORTANCE Sequencing microbial pathogens causing outbreaks has become a common practice to characterize transmission networks. In addition to the signal provided by vertical evolution, bacterial genomes harbor mobile genetic elements shared horizontally between clones. 
While macroevolutionary studies have revealed an important role of plasmids and genes encoding carbohydrate utilization systems in the adaptation of Enterococcus faecium to the hospital environment, mechanisms of dissemination and the specific function of many of these genetic determinants remain to be elucidated. Here, we characterize a plasmid providing a nutrient utilization advantage and show evidence for its clonal and horizontal spread at a local scale. Further studies integrating epidemiological, functional genomics, and evolutionary perspectives will be critical to identify changes shaping the success of this pathogen.
Keywords: Enterococcus faecium; N-acetyl-galactosamine; horizontal gene transfer; linear plasmid; transmission network
Influenza A Virus Migration and Persistence in North American Wild Birds
Wild birds have been implicated in the emergence of human and livestock influenza. The successful prediction of viral spread and disease emergence, as well as formulation of preparedness plans have been hampered by a critical lack of knowledge of viral movements between different host populations. The patterns of viral spread and subsequent risk posed by wild bird viruses therefore remain unpredictable. Here we analyze genomic data, including 287 newly sequenced avian influenza A virus (AIV) samples isolated over a 34-year period of continuous systematic surveillance of North American migratory birds. We use a Bayesian statistical framework to test hypotheses of viral migration, population structure and patterns of genetic reassortment. Our results reveal that despite the high prevalence of Charadriiformes infected in Delaware Bay this host population does not appear to significantly contribute to the North American AIV diversity sampled in Anseriformes. In contrast, influenza viruses sampled from Anseriformes in Alberta are representative of the AIV diversity circulating in North American Anseriformes. While AIV may be restricted to specific migratory flyways over short time frames, our large-scale analysis showed that the long-term persistence of AIV was independent of bird flyways with migration between populations throughout North America. Analysis of long-term surveillance data provides vital insights to develop appropriately informed predictive models critical for pandemic preparedness and livestock protection. © 2013 Bahl et al
The relationship between transmission time and clustering methods in Mycobacterium tuberculosis epidemiology
Background: Tracking recent transmission is a vital part of controlling widespread pathogens such as Mycobacterium tuberculosis. Multiple methods with specific performance characteristics exist for detecting recent transmission chains, usually by clustering strains based on genotype similarities. With such a large variety of methods available, informed selection of an appropriate approach for determining transmissions within a given setting/time period is difficult.
Methods: This study combines whole genome sequence (WGS) data derived from 324 isolates collected 2005–2010 in Kinshasa, Democratic Republic of Congo (DRC), a high endemic setting, with phylodynamics to unveil the timing of transmission events posited by a variety of standard genotyping methods. Clustering data based on Spoligotyping, 24-loci MIRU-VNTR typing, WGS based SNP (Single Nucleotide Polymorphism) and core genome multi locus sequence typing (cgMLST) typing were evaluated.
Findings: Our results suggest that clusters based on Spoligotyping could encompass transmission events that occurred almost 200 years prior to sampling while 24-loci-MIRU-VNTR often represented three decades of transmission. Instead, WGS based genotyping applying low SNP or cgMLST allele thresholds allows for determination of recent transmission events, e.g. in timespans of up to 10 years for a 5 SNP/allele cut-off.
Interpretation: With the rapid uptake of WGS methods in surveillance and outbreak tracking, the findings obtained in this study can guide the selection of appropriate clustering methods for uncovering relevant transmission chains within a given time-period. For high resolution cluster analyses, WGS-SNP and cgMLST based analyses have similar clustering/timing characteristics even for data obtained from a high incidence setting.
Funding: ERC grant [INTERRUPTB; no. 311725] to BdJ, FG and CJM; an ERC grant to TS [PhyPD; no. 335529]; an FWO PhD fellowship to PM [grant number 1141217N]; the Leibniz Science Campus EvolLUNG for MM and SN; the German Centre for Infection Research (DZIF) for TAK, MM, CU, PB and SN; a SNF SystemsX grant (TBX) to JP and TS and a Marie Heim-Vögtlin fellowship granted to DK by the Swiss National Science Foundation. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government – department EWI.
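To make the idea of a SNP cut-off concrete, the following is a minimal sketch of threshold-based clustering on aligned sequences. It assumes single-linkage grouping (two isolates share a cluster if a chain of pairwise distances at or below the threshold connects them), which is a common convention but is not confirmed by the abstract as the study's procedure. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations

def snp_distance(a, b):
    """Count differing positions between two aligned sequences,
    skipping positions where either has an ambiguous base (N) or a gap (-)."""
    return sum(1 for x, y in zip(a, b)
               if x != y and x not in "N-" and y not in "N-")

def cluster_by_threshold(seqs, threshold=5):
    """Single-linkage clustering of isolates by pairwise SNP distance.

    seqs: dict mapping isolate name -> aligned sequence (equal lengths assumed)
    Returns a list of clusters, each a sorted list of isolate names.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])

# Toy alignment (made up for illustration, not real isolate data).
toy = {
    "A": "ACGTACGTACGT",
    "B": "ACGTACGTACGA",
    "C": "ACGTACGTTTGA",
    "D": "TTTTTTTTTTTT",
}
print(cluster_by_threshold(toy, threshold=2))
```

Under single linkage, A and C end up together at a threshold of 2 even though they differ at three sites, because B links them. Stricter thresholds split such chains, which is one reason the choice of cut-off affects the time span a cluster represents.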
Analysis of 3800-year-old Yersinia pestis genomes suggests Bronze Age origin for bubonic plague
© 2018 The Author(s). The origin of Yersinia pestis and the early stages of its evolution are fundamental subjects of investigation given its high virulence and the mortality that resulted from past pandemics. Although the earliest evidence of Y. pestis infections in humans has been identified in Late Neolithic/Bronze Age Eurasia (LNBA, 5000-3500 y BP), these strains lack key genetic components required for flea adaptation, thus making their mode of transmission and disease presentation in humans unclear. Here, we reconstruct ancient Y. pestis genomes from individuals associated with the Late Bronze Age period (~3800 BP) in the Samara region of modern-day Russia. We show clear distinctions between our new strains and the LNBA lineage, and suggest that the full ability for flea-mediated transmission causing bubonic plague evolved more than 1000 years earlier than previously suggested. Finally, we propose that several Y. pestis lineages were established during the Bronze Age, some of which persist to the present day.
- …