131 research outputs found
Detecting Remote Evolutionary Relationships among Proteins by Large-Scale Semantic Embedding
Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods (i.e., measures of similarity between query and target sequences) provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional “semantic space.” Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space
The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods
The Protein Structure Initiative’s Structural Biology Knowledgebase (SBKB, URL: http://sbkb.org) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples of how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI’s high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology
Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020
We show the distribution of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) genetic clades over time and between countries and outline potential genomic surveillance objectives. We applied three genomic nomenclature systems to all sequence data from the World Health Organization European Region available until 10 July 2020. We highlight the importance of real-time sequencing and data dissemination in a pandemic situation, compare the nomenclatures and lay a foundation for future European genomic surveillance of SARS-CoV-2
In re: ‘Experimental Music’
John Cage is universally associated with the phrase experimental music. But what did that phrase mean, for Cage and for Cage’s predecessors? I begin with Cage and Lejaren Hiller, both writing important texts on ‘experimental music’ in 1959. From there, I trace the phrase backwards, eventually reaching Emile Zola, Gertrude Stein, and William James. A final section traces the phrase forward to Cage and Hiller’s collaboration on HPSCHD (1969)
Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity
Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis for positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset, well represented by both spike 614 variants, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant
SARS-CoV-2 Omicron is an immune escape variant with an altered cell entry pathway
Vaccines based on the spike protein of SARS-CoV-2 are a cornerstone of the public health response to COVID-19. The emergence of hypermutated, increasingly transmissible variants of concern (VOCs) threatens this strategy. Omicron (B.1.1.529), the fifth VOC to be described, harbours multiple amino acid mutations in spike, half of which lie within the receptor-binding domain. Here we demonstrate substantial evasion of neutralization by Omicron BA.1 and BA.2 variants in vitro using sera from individuals vaccinated with ChAdOx1, BNT162b2 and mRNA-1273. These data were mirrored by a substantial reduction in real-world vaccine effectiveness that was partially restored by booster vaccination. The Omicron variants BA.1 and BA.2 did not induce cell syncytia in vitro and favoured a TMPRSS2-independent endosomal entry pathway, these phenotypes mapping to distinct regions of the spike protein. Impaired cell fusion was determined by the receptor-binding domain, while endosomal entry mapped to the S2 domain. Such marked changes in antigenicity and replicative biology may underlie the rapid global spread and altered pathogenicity of the Omicron variant
Investigation of hospital discharge cases and SARS-CoV-2 introduction into Lothian care homes
Background
The first epidemic wave of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in Scotland resulted in high case numbers and mortality in care homes. In Lothian, over one-third of care homes reported an outbreak, while there was limited testing of hospital patients discharged to care homes.
Aim
To investigate patients discharged from hospitals as a source of SARS-CoV-2 introduction into care homes during the first epidemic wave.
Methods
A clinical review was performed for all patient discharges from hospitals to care homes from 1st March 2020 to 31st May 2020. Episodes were ruled out based on coronavirus disease 2019 (COVID-19) test history, clinical assessment at discharge, whole-genome sequencing (WGS) data and an infectious period of 14 days. Clinical samples were processed for WGS, and consensus genomes generated were used for analysis using Cluster Investigation and Virus Epidemiological Tool software. Patient timelines were obtained using electronic hospital records.
Findings
In total, 787 patients discharged from hospitals to care homes were identified. Of these, 776 (99%) were ruled out for subsequent introduction of SARS-CoV-2 into care homes. However, for 10 episodes, the results were inconclusive as there was low genomic diversity in consensus genomes or no sequencing data were available. Only one discharge episode had a genomic, time and location link to positive cases during hospital admission, leading to 10 positive cases in their care home.
Conclusion
The majority of patients discharged from hospitals were ruled out for introduction of SARS-CoV-2 into care homes, highlighting the importance of screening all new admissions when faced with a novel emerging virus and no available vaccine
- …