9 research outputs found

    SARS-CoV-2 infection dynamics revealed by wastewater sequencing analysis and deconvolution

    The use of RNA sequencing from wastewater samples is a valuable way to estimate infection dynamics and circulating lineages of SARS-CoV-2. This approach is independent of testing individuals and can therefore become the key tool to monitor this and potentially other viruses. However, it is equally important to develop easily accessible and scalable tools which can highlight critical changes in infection rates and dynamics over time across different locations given sequencing data from wastewater. Here, we provide an analysis of lineage dynamics in Berlin and New York City using wastewater sequencing and present PiGx SARS-CoV-2, a highly reproducible computational analysis pipeline with comprehensive reports. This end-to-end pipeline includes all steps from raw data to shareable reports, additional taxonomic analysis, deconvolution and geospatial time series analyses. Using simulated datasets (in silico generated and spiked-in samples), we could demonstrate the accuracy of our pipeline in calculating proportions of Variants of Concern (VOC) from environmental as well as pre-mixed (spiked-in) samples. By applying our pipeline to a dataset of wastewater samples from Berlin between February 2021 and January 2022, we could reconstruct the emergence of B.1.1.7 (alpha) in February/March 2021 and the replacement dynamics from B.1.617.2 (delta) to BA.1 and BA.2 (omicron) during the winter of 2021/2022. Using data from very-short-reads generated in an industrial scale setting, we could see even higher accuracy in our deconvolution. Lastly, using a targeted sequencing dataset from New York City (receptor-binding-domain (RBD) only), we could reproduce the results recovering the proportions of the so-called cryptic lineages shown in the original study. Overall, our study provides an in-depth analysis reconstructing virus lineage dynamics from wastewater. By applying our tool to a wide range of different datasets (from different types of wastewater sample locations and sequenced with different methods), we show that PiGx SARS-CoV-2 can be used to identify new mutations and detect any emerging new lineages in a highly automated and scalable way. Our approach can support efforts to establish continuous monitoring and early-warning projects for detecting SARS-CoV-2 or any other pathogen.

    A harmonized meta-knowledgebase of clinical interpretations of somatic genomic variants in cancer

    Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. We demonstrated large gains in overlap between resources across variants, diseases and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 57% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide a freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.

    We acknowledge the contributions from members of GA4GH and specifically the Genotype to Phenotype Task Team for their numerous contributions leading to this study. We thank the VICC knowledgebase partners for their input in construction of the meta-knowledgebase and drafting of the paper, M. McCoy for his assistance in proofreading the manuscript and J. McMichael for his work in restyling Fig. 1. A.H.W. was supported by NIH National Cancer Institute (NCI) award F32CA206247 and National Human Genome Research Institute (NHGRI) award K99HG010157. B.W. was supported by NIH NHGRI award U54HG007990, NIH NCI R01CA180778 and Intel SRA-16-037. D.T.R. is a participant in the Berlin Institute of Health–Charité Clinical Scientist Program funded by the Charité – Universitätsmedizin Berlin and the Berlin Institute of Health, and was supported by grant nos. 031L0030E and 031L0023 awarded by the German Federal Ministry of Education and Research. D.I.R. and S.M. are supported by ClinGen, through the NHGRI awards U41HG006834, U41HG009649, U41HG009650 and U01HG007437. T.A. was supported by an award from Academy of Finland (grant no. 330857), Cancer Society of Finland. M.H. was supported by the Monarch Initiative NIH Office of Director award R24OD011883. J. Gao, D.C. and N.S. were supported by NIH NCI award P30CA008748. N.L.B. acknowledges funding from the European Research Council (consolidator grant 682398). M.L. was supported through the Medical Research Council–Cancer Research UK Stratification in Colorectal Cancer Program grant and Health Data Research UK Substantive Site grant. M.G. was supported by NIH NHGRI award R00HG007940 and a V Scholar Award from the V Foundation for Cancer Research. O.L.G., M.G. and the CIViC knowledgebase were supported by the NIH NCI awards U01CA209936 and U24CA237719 and a Cancer Moonshot funding opportunity, specifically an Activities to Promote Technology Research Collaborations for Cancer Research (Administrative Support) award.

    The GA4GH Phenopacket schema defines a computable representation of clinical data.

    In the clinical domain, substantial work has been dedicated to the development of computational phenotypes [1]. Traditionally, these approaches have largely relied on rule-based methods and large sources of clinical data to identify cohorts of patients with or without a specific disease [2–5]. However, they were not developed to enable deep phenotyping of abnormalities, to facilitate computational analysis of interpatient phenotypic similarity, or to support computational decision support. To address this, the Global Alliance for Genomics and Health [6] (GA4GH) has developed the Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis of and research on all types of disease, including Mendelian and complex genetic diseases, cancer, and infectious diseases. A Phenopacket characterizes an individual person or biosample, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments (Fig 1). The Phenopacket software is available at https://github.com/phenopackets/

    Pharmacogenetic Allele Nomenclature

    This article provides nomenclature recommendations developed by an international workgroup to increase transparency and standardization of pharmacogenetic (PGx) result reporting. Presently, sequence variants identified by PGx tests are described using different nomenclature systems. In addition, PGx analysis may detect different sets of variants for each gene, which can affect interpretation of results. This practice has caused confusion and may thereby impede the adoption of clinical PGx testing. Standardization is critical to move PGx forward.

    Responding to ambiguity: HIV communication campaigns for gay and bisexual African Americans


    GA4GH: International policies and standards for data sharing across genomic research and healthcare.

    The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits