Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool.
Funder: Oxford Martin School, University of Oxford
The response of the global virus genomics community to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has been unprecedented, with significant advances made towards the 'real-time' generation and sharing of SARS-CoV-2 genomic data. The rapid growth in virus genome data production has necessitated the development of new analytical methods that can deal with orders of magnitude more genomes than were previously available. Here, we present and describe Phylogenetic Assignment of Named Global Outbreak Lineages (pangolin), a computational tool that has been developed to assign the most likely lineage to a given SARS-CoV-2 genome sequence according to the Pango dynamic lineage nomenclature scheme. To date, nearly two million virus genomes have been submitted to the web-application implementation of pangolin, which has facilitated SARS-CoV-2 genomic epidemiology and provided researchers with access to actionable information about the pandemic's transmission lineages.
A way to synchronize models with seismic faults for earthquake forecasting: Insights from a simple stochastic model
Numerical models are starting to be used for determining the future behaviour
of seismic faults and fault networks. Their final goal would be to forecast
future large earthquakes. In order to use them for this task, it is necessary
to synchronize each model with the current status of the actual fault or fault
network it simulates (just as, for example, meteorologists synchronize their
models with the atmosphere by incorporating current atmospheric data in them).
However, lithospheric dynamics is largely unobservable: important parameters
cannot (or can rarely) be measured in Nature. Earthquakes, though, provide
indirect but measurable clues of the stress and strain status in the
lithosphere, which should be helpful for the synchronization of the models. The
rupture area is one of the measurable parameters of earthquakes. Here we
explore how it can be used to at least synchronize fault models between
themselves and forecast synthetic earthquakes. Our purpose here is to forecast
synthetic earthquakes in a simple but stochastic (random) fault model. By
imposing the rupture area of the synthetic earthquakes of this model on other
models, the latter become partially synchronized with the first one. We use
these partially synchronized models to successfully forecast most of the
largest earthquakes generated by the first model. This forecasting strategy
outperforms others that only take into account the earthquake series. Our
results suggest that probably a good way to synchronize more detailed models
with real faults is to force them to reproduce the sequence of previous
earthquake ruptures on the faults. This hypothesis could be tested in the
future with more detailed models and actual seismic data.
Comment: Revised version. Recommended for publication in Tectonophysics.
Serial ruptures of the San Andreas fault, Carrizo Plain, California, revealed by three-dimensional excavations
It is poorly known whether fault slip repeats regularly through many earthquake cycles. Well‐documented measurements of successive slips rarely span more than three earthquake cycles. In this paper, we present evidence of six sequential offsets across the San Andreas fault at a site in the Carrizo Plain, using stream channels as piercing lines. We opened a latticework of trenches across the offset channels on both sides of the fault to expose their subsurface stratigraphy. We can correlate the channels across the fault on the basis of their elevations, shapes, stratigraphy, and ages. The three‐dimensional excavations allow us to locate accurately the offset channel pairs and to determine the amounts of motion for each pair. We find that the dextral slips associated with the six events in the last millennium are, from oldest to youngest, ≥ 5.4 ± 0.6, 8.0 ± 0.5, 1.4 ± 0.5, 5.2 ± 0.6, 7.6 ± 0.4 and 7.9 ± 0.1 m. In this series, three and possibly four of the six offset values are between 7 and 8 m. The common occurrence of 7–8 m offsets suggests remarkably regular, but not strictly uniform, slip behavior. Age constraints for these events at our site, combined with previous paleoseismic investigations within a few kilometers, allow construction of an offset history and a preliminary evaluation of slip‐ and time‐predictable models. The average slip rate over the span of the past five events (between A.D. 1210 and A.D. 1857) has been 34 mm/yr, not resolvably different from the previously determined late Holocene slip rate and the modern geodetic strain accumulation rate. We find that the slip‐predictable model is a better fit than the time‐predictable model. In general, earthquake slip is positively correlated with the time interval preceding the event. Smaller offsets coincide with shorter prior intervals and larger offsets with longer prior intervals.
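The 34 mm/yr figure can be roughly reproduced from the numbers quoted in the abstract. The sketch below is one possible reading, not the authors' stated calculation. It assumes that the 8.0 m event corresponds to A.D. 1210 and the 7.9 m event to A.D. 1857, and that the rate is the slip released after the first of the five events divided by the elapsed time. The function name and this event-to-date mapping are illustrative assumptions, and the measurement uncertainties are ignored.

```python
def average_slip_rate_mm_per_yr(slips_m, start_year, end_year):
    """Average slip rate over a closed interval bounded by two events.

    Assumption (not from the abstract): the first slip in `slips_m`
    occurred at `start_year` and the last at `end_year`, so only the
    slips after the first event accumulated within the interval.
    """
    elapsed_years = end_year - start_year
    slip_within_interval_m = sum(slips_m[1:])
    return slip_within_interval_m / elapsed_years * 1000.0  # m/yr to mm/yr


# Central offset values of the five most recent events quoted above, in metres.
last_five_slips_m = [8.0, 1.4, 5.2, 7.6, 7.9]

rate = average_slip_rate_mm_per_yr(last_five_slips_m, 1210, 1857)
print(f"{rate:.1f} mm/yr")
```

Under these assumptions, 22.1 m of slip over 647 years gives a rate close to the 34 mm/yr reported in the abstract.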
Visualizing variation within global pneumococcal sequence clusters (GPSCs) and country population snapshots to contextualize pneumococcal isolates
Knowledge of pneumococcal lineages, their geographic distribution and antibiotic resistance patterns, can give insights into global pneumococcal disease. We provide interactive bioinformatic outputs to explore such topics, aiming to increase dissemination of genomic insights to the wider community, without the need for specialist training. We prepared 12 country-specific phylogenetic snapshots, and international phylogenetic snapshots of 73 common Global Pneumococcal Sequence Clusters (GPSCs) previously defined using PopPUNK, and present them in Microreact. Gene presence and absence defined using Roary, and recombination profiles derived from Gubbins are presented in Phandango for each GPSC. Temporal phylogenetic signal was assessed for each GPSC using BactDating. We provide examples of how such resources can be used. In our example use of a country-specific phylogenetic snapshot we determined that serotype 14 was observed in nine unrelated genetic backgrounds in South Africa. The international phylogenetic snapshot of GPSC9, in which most serotype 14 isolates from South Africa were observed, highlights that there were three independent sub-clusters represented by South African serotype 14 isolates. We estimated from the GPSC9-dated tree that the sub-clusters were each established in South Africa during the 1980s. We show how recombination plots allowed the identification of a 20 kb recombination spanning the capsular polysaccharide locus within GPSC97. This was consistent with a switch from serotype 6A to 19A estimated to have occurred in the 1990s from the GPSC97-dated tree. Plots of gene presence/absence of resistance genes (tet, erm, cat) across the GPSC23 phylogeny were consistent with acquisition of a composite transposon. We estimated from the GPSC23-dated tree that the acquisition occurred between 1953 and 1975. Finally, we demonstrate the assignment of GPSC31 to 17 externally generated pneumococcal serotype 1 assemblies from Utah via Pathogenwatch. 
Most of the Utah isolates clustered within GPSC31 in a USA-specific clade with the most recent common ancestor estimated between 1958 and 1981. The resources we have provided can be used to explore data, test hypotheses and generate new hypotheses. The accessible assignment of GPSCs allows others to contextualize their own collections beyond the data presented here.
Affiliation: Gladstone, Rebecca A. Wellcome Sanger Institute; United Kingdom
Affiliation: Lo, Stephanie W. Wellcome Sanger Institute; United Kingdom
Affiliation: Goater, Richard. Wellcome Sanger Institute; United Kingdom. University of Oxford; United Kingdom
Affiliation: Yeats, Corin. Wellcome Sanger Institute; United Kingdom. University of Oxford; United Kingdom
Affiliation: Taylor, Ben. Wellcome Sanger Institute; United Kingdom. University of Oxford; United Kingdom
Affiliation: Hadfield, James. Fred Hutchinson Cancer Research Center; United States
Affiliation: Lees, John A. Imperial College London; United Kingdom
Affiliation: Croucher, Nicholas J. Imperial College London; United Kingdom
Affiliation: van Tonder, Andries. Wellcome Sanger Institute; United Kingdom. University of Cambridge; United States
Affiliation: Bentley, Leon J. Wellcome Sanger Institute; United Kingdom
Affiliation: Quah, Fu Xiang. Wellcome Sanger Institute; United Kingdom
Affiliation: Blaschke, Anne J. University of Utah; United States
Affiliation: Pershing, Nicole L. University of Utah; United States
Affiliation: Byington, Carrie L. University of California; United States
Affiliation: Balaji, Veeraraghavan. Christian Medical College; India
Affiliation: Hryniewicz, Waleria. National Medicines Institute; Poland
Affiliation: Sigauque, Betuel. Instituto Nacional de Saude Maputo; Mozambique
Affiliation: Ravikumar, K. L. Kempegowda Institute Of Medical Sciences; India
Affiliation: Grassi Almeida, Samanta Cristine. Adolfo Lutz Institute; Brazil
Affiliation: Ochoa, Theresa J. Universidad Peruana Cayetano Heredia; Peru
Affiliation: Ho, Pak Leung. The University Of Hong Kong; Hong Kong
Affiliation: du Plessis, Mignon. National Institute for Communicable Diseases; South Africa
Affiliation: Ndlangisa, Kedibone M. National Institute for Communicable Diseases; South Africa
Affiliation: Cornick, Jennifer. Malawi Liverpool Wellcome Trust Clinical Research Programme; Malawi
Affiliation: Kwambana Adams, Brenda. University College London; United Kingdom. Medical Research Council Unit The Gambia at The London School of Hygiene & Tropical Medicine; Gambia
Affiliation: Benisty, Rachel. Ben Gurion University of the Negev; Israel
Affiliation: Nzenze, Susan A. University of the Witwatersrand; South Africa
Affiliation: Madhi, Shabir A. University of the Witwatersrand; South Africa
Affiliation: Hawkins, Paulina A. Emory University; United States
Affiliation: Faccone, Diego Francisco. Dirección Nacional de Institutos de Investigación. Administración Nacional de Laboratorios e Institutos de Salud. Instituto Nacional de Enfermedades Infecciosas. Área de Antimicrobianos; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
SARS-CoV-2 Omicron is an immune escape variant with an altered cell entry pathway
Vaccines based on the spike protein of SARS-CoV-2 are a cornerstone of the public health response to COVID-19. The emergence of hypermutated, increasingly transmissible variants of concern (VOCs) threatens this strategy. Omicron (B.1.1.529), the fifth VOC to be described, harbours multiple amino acid mutations in spike, half of which lie within the receptor-binding domain. Here we demonstrate substantial evasion of neutralization by Omicron BA.1 and BA.2 variants in vitro using sera from individuals vaccinated with ChAdOx1, BNT162b2 and mRNA-1273. These data were mirrored by a substantial reduction in real-world vaccine effectiveness that was partially restored by booster vaccination. The Omicron variants BA.1 and BA.2 did not induce cell syncytia in vitro and favoured a TMPRSS2-independent endosomal entry pathway, these phenotypes mapping to distinct regions of the spike protein. Impaired cell fusion was determined by the receptor-binding domain, while endosomal entry mapped to the S2 domain. Such marked changes in antigenicity and replicative biology may underlie the rapid global spread and altered pathogenicity of the Omicron variant.
Globetrotting strangles: the unbridled national and international transmission of Streptococcus equi between horses.
The equine disease strangles, which is characterized by the formation of abscesses in the lymph nodes of the head and neck, is one of the most frequently diagnosed infectious diseases of horses around the world. The causal agent, Streptococcus equi subspecies equi, establishes a persistent infection in approximately 10% of animals that recover from the acute disease. Such 'carrier' animals appear healthy and are rarely identified during routine veterinary examinations pre-purchase or transit, but can transmit S. equi to naïve animals initiating new episodes of disease. Here, we report the analysis and visualization of phylogenomic and epidemiological data for 670 isolates of S. equi recovered from 19 different countries using a new core-genome multilocus sequence typing (cgMLST) web bioresource. Genetic relationships among all 670 S. equi isolates were determined at high resolution, revealing national and international transmission events that drive this endemic disease in horse populations throughout the world. Our data argue for the recognition of the international importance of strangles by the Office International des Épizooties to highlight the health, welfare and economic cost of this disease. The Pathogenwatch cgMLST web bioresource described herein is available for tailored genomic analysis of populations of S. equi and its close relative S. equi subspecies zooepidemicus that are recovered from horses and other animals, including humans, throughout the world. This article contains data hosted by Microreact.
Finding the “Dark Matter” in Human and Yeast Protein Network Prediction and Modelling
Accurate modelling of biological systems requires a deeper and more complete knowledge about the molecular components and their functional associations than we currently have. Traditionally, new knowledge on protein associations generated by experiments has played a central role in systems modelling, in contrast to generally less trusted bio-computational predictions. However, we will not achieve realistic modelling of complex molecular systems if the current experimental designs lead to biased screenings of real protein networks and leave large, functionally important areas poorly characterised. To assess the likelihood of this, we have built comprehensive network models of the yeast and human proteomes by using a meta-statistical integration of diverse computationally predicted protein association datasets. We have compared these predicted networks against combined experimental datasets from seven biological resources at different levels of statistical significance. These eukaryotic predicted networks resemble all the topological and noise features of the experimentally inferred networks in both species, and we also show that this observation is not due to random behaviour. In addition, the topology of the predicted networks contains information on true protein associations, beyond the constitutive first order binary predictions. We also observe that most of the reliable predicted protein associations are experimentally uncharacterised in our models, constituting the hidden or “dark matter” of networks by analogy to astronomical systems. Some of this dark matter shows enrichment of particular functions and contains key functional elements of protein networks, such as hubs associated with important functional areas like the regulation of Ras protein signal transduction in human cells. Thus, characterising this large and functionally important dark matter, elusive to established experimental designs, may be crucial for modelling biological systems. 
In any case, these predictions provide a valuable guide to these experimentally elusive regions.
Rapid Genomic Characterization and Global Surveillance of Klebsiella Using Pathogenwatch.
BACKGROUND: Klebsiella species, including the notable pathogen K. pneumoniae, are increasingly associated with antimicrobial resistance (AMR). Genome-based surveillance can inform interventions aimed at controlling AMR. However, its widespread implementation requires tools to streamline bioinformatic analyses and public health reporting. METHODS: We developed the web application Pathogenwatch, which implements analytics tailored to Klebsiella species for integration and visualization of genomic and epidemiological data. We populated Pathogenwatch with 16 537 public Klebsiella genomes to enable contextualization of user genomes. We demonstrated its features with 1636 genomes from 4 low- and middle-income countries (LMICs) participating in the NIHR Global Health Research Unit (GHRU) on AMR. RESULTS: Using Pathogenwatch, we found that GHRU genomes were dominated by a small number of epidemic drug-resistant clones of K. pneumoniae. However, differences in their distribution were observed (e.g., ST258/512 dominated in Colombia, ST231 in India, ST307 in Nigeria, ST147 in the Philippines). Phylogenetic analyses including public genomes for contextualization enabled retrospective monitoring of their spread. In particular, we identified hospital outbreaks, detected introductions from abroad, and uncovered clonal expansions associated with resistance and virulence genes. Assessment of loci encoding O-antigens and capsule in K. pneumoniae, which represent possible vaccine candidates, showed that 3 O-types (O1-O3) represented 88.9% of all genomes, whereas capsule types were much more diverse. CONCLUSIONS: Pathogenwatch provides a free, accessible platform for real-time analysis of Klebsiella genomes to aid surveillance at local, national, and global levels. We have improved representation of genomes from GHRU participant countries, further facilitating ongoing surveillance.
From protein sequences to 3D-structures and beyond: the example of the UniProt Knowledgebase
With the dramatic increase in the volume of experimental results in every domain of life sciences, assembling pertinent data and combining information from different fields has become a challenge. Information is dispersed over numerous specialized databases and is presented in many different formats. Rapid access to experiment-based information about well-characterized proteins helps predict the function of uncharacterized proteins identified by large-scale sequencing. In this context, universal knowledgebases play essential roles in providing access to data from complementary types of experiments and serving as hubs with cross-references to many specialized databases. This review outlines how the value of experimental data is optimized by combining high-quality protein sequences with complementary experimental results, including information derived from protein 3D-structures, using as an example the UniProt knowledgebase (UniProtKB) and the tools and links provided on its website (http://www.uniprot.org/). It also discusses the precautions that are necessary for successful predictions and extrapolations.