13 research outputs found

    A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries

    Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture approach that provides a dramatic increase in the scale and throughput of sequence-ready library production. Significant process improvements and a series of in-process quality control checkpoints are also added. These process improvements can also be used in a manual version of the protocol.

    Mutations causing medullary cystic kidney disease type 1 lie in a large VNTR in MUC1 missed by massively parallel sequencing

    Although genetic lesions responsible for some mendelian disorders can be rapidly discovered through massively parallel sequencing of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago to a 2-Mb region on chromosome 1. Ultimately, only by cloning, capillary sequencing and de novo assembly did we find that each of six families with MCKD1 harbors an equivalent but apparently independently arising mutation in sequence markedly under-represented in massively parallel sequencing data: the insertion of a single cytosine in one copy (but a different copy in each family) of the repeat unit comprising the extremely long (~1.5–5 kb), GC-rich (>80%) coding variable-number tandem repeat (VNTR) sequence in the MUC1 gene encoding mucin 1. These results provide a cautionary tale about the challenges in identifying the genes responsible for mendelian, let alone more complex, disorders through massively parallel sequencing.
    Funding: National Institutes of Health (U.S.) (Intramural Research Program); National Human Genome Research Institute (U.S.); Charles University (program UNCE 204011); Charles University (program PRVOUK-P24/LF1/3); Czech Republic. Ministry of Education, Youth, and Sports (grant NT13116-4/2012); Czech Republic. Ministry of Health (grant NT13116-4/2012); Czech Republic. Ministry of Health (grant LH12015); National Institutes of Health (U.S.) (Harvard Digestive Diseases Center, grant DK34854

    Using viral load and epidemic dynamics to optimize pooled testing in resource-constrained settings

    Virological testing is central to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) containment, but many settings face severe limitations on testing. Group testing offers a way to increase throughput by testing pools of combined samples; however, most proposed designs have not yet addressed key concerns over sensitivity loss and implementation feasibility. Here, we combined a mathematical model of epidemic spread and empirically derived viral kinetics for SARS-CoV-2 infections to identify pooling designs that are robust to changes in prevalence and to ratify sensitivity losses against the time course of individual infections. We show that prevalence can be accurately estimated across a broad range, from 0.02 to 20%, using only a few dozen pooled tests and using up to 400 times fewer tests than would be needed for individual identification. We then exhaustively evaluated the ability of different pooling designs to maximize the number of detected infections under various resource constraints, finding that simple pooling designs can identify up to 20 times as many true positives as individual testing with a given budget. Crucially, we confirmed that our theoretical results can be translated into practice using pooled human nasopharyngeal specimens by accurately estimating a 1% prevalence among 2304 samples using only 48 tests and through pooled sample identification in a panel of 960 samples. Our results show that accounting for variation in sampled viral loads provides a nuanced picture of how pooling affects sensitivity to detect infections. Using simple, practical group testing designs can vastly increase surveillance capabilities in resource-limited settings.
    Funding: National Institute of General Medical Sciences (Grant U54GM088558
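    As a rough illustration of how a few dozen pooled tests can yield a prevalence estimate, the sketch below uses the classical binary group-testing estimator. It is not the method of the paper, which models viral loads and epidemic dynamics; it assumes a perfect assay, independent infections, and equal-sized non-overlapping pools. The function names and the pool size of 48 (2304 samples split evenly over 48 tests) are assumptions made for this example.

```python
def expected_positive_pools(prevalence: float, n_pools: int, pool_size: int) -> float:
    """Expected number of positive pools under the simplifying assumptions:
    independent infections, a perfect assay, no dilution effect."""
    # A pool is negative only if all of its samples are negative.
    return n_pools * (1.0 - (1.0 - prevalence) ** pool_size)


def estimate_prevalence(positive_pools: int, n_pools: int, pool_size: int) -> float:
    """Maximum-likelihood prevalence from binary pool results (same assumptions)."""
    if n_pools <= 0 or pool_size <= 0:
        raise ValueError("n_pools and pool_size must be positive")
    if not 0 <= positive_pools <= n_pools:
        raise ValueError("positive_pools must lie between 0 and n_pools")
    fraction_negative = 1.0 - positive_pools / n_pools
    # Invert P(pool negative) = (1 - p) ** pool_size to solve for p.
    return 1.0 - fraction_negative ** (1.0 / pool_size)


if __name__ == "__main__":
    # Hypothetical layout: 2304 samples in 48 pools of 48 samples each.
    print(expected_positive_pools(0.01, n_pools=48, pool_size=48))
    print(estimate_prevalence(18, n_pools=48, pool_size=48))
```

    Under these assumptions the estimate depends only on the fraction of negative pools, so it degrades when nearly every pool is positive; this is one reason pool size has to be matched to the expected prevalence range.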

    EGFR Variant Heterogeneity in Glioblastoma Resolved through Single-Nucleus Sequencing

    Glioblastomas (GBM) with EGFR amplification represent approximately 50% of newly diagnosed cases, and recent studies have revealed frequent coexistence of multiple EGFR aberrations within the same tumor, which has implications for mutation cooperation and treatment resistance. However, bulk tumor sequencing studies cannot resolve the patterns of how the multiple EGFR aberrations coexist with other mutations within single tumor cells. Here, we applied a population-based single-cell whole-genome sequencing methodology to characterize genomic heterogeneity in EGFR-amplified glioblastomas. Our analysis effectively identified clonal events, including a novel translocation of a super enhancer to the TERT promoter, as well as subclonal LOH and multiple EGFR mutational variants within tumors. Correlating the EGFR mutations onto the cellular hierarchy revealed that EGFR truncation variants (EGFRvII and EGFR carboxyl-terminal deletions) identified in the bulk tumor segregate into nonoverlapping subclonal populations. In vitro and in vivo functional studies show that EGFRvII is oncogenic and sensitive to EGFR inhibitors currently in clinical trials. Thus, the association between diverse activating mutations in EGFR and other subclonal mutations within a single tumor supports an intrinsic mechanism for proliferative and clonal diversification with broad implications in resistance to treatment.
    Significance: We developed a novel single-cell sequencing methodology capable of identifying unique, nonoverlapping subclonal alterations from archived frozen clinical specimens. Using GBM as an example, we validated our method to successfully define tumor cell subpopulations containing distinct genetic and treatment resistance profiles and potentially mutually cooperative combinations of alterations in EGFR and other genes.
    Funding: Dana-Farber/Harvard Cancer Center (MIT Bridge Project Fund); National Brain Tumor Societ

    Noninvasive Immunohistochemical Diagnosis and Novel MUC1 Mutations Causing Autosomal Dominant Tubulointerstitial Kidney Disease

    Background: Autosomal dominant tubulointerstitial kidney disease caused by mucin-1 gene (MUC1) mutations (ADTKD-MUC1) is characterized by progressive kidney failure. Genetic evaluation for ADTKD-MUC1 specifically tests for a cytosine duplication that creates a unique frameshift protein (MUC1fs). Our goal was to develop immunohistochemical methods to detect the MUC1fs created by the cytosine duplication and, possibly, by other similar frameshift mutations and to identify novel MUC1 mutations in individuals with positive immunohistochemical staining for the MUC1fs protein. Methods: We performed MUC1fs immunostaining on urinary cell smears and various tissues from ADTKD-MUC1-positive and -negative controls as well as in individuals from 37 ADTKD families that were negative for mutations in known ADTKD genes. We used novel analytic methods to identify MUC1 frameshift mutations. Results: After technique refinement, the sensitivity and specificity for MUC1fs immunostaining of urinary cell smears were 94.2% and 88.6%, respectively. Further genetic testing on 17 families with positive MUC1fs immunostaining revealed six families with five novel MUC1 frameshift mutations that all predict production of the identical MUC1fs protein. Conclusions: We developed a noninvasive immunohistochemical method to detect MUC1fs that, after further validation, may be useful in the future for diagnostic testing. Production of the MUC1fs protein may be central to the pathogenesis of ADTKD-MUC1.

    Sensitive Detection of Minimal Residual Disease in Patients Treated for Early-Stage Breast Cancer

    © 2020 American Association for Cancer Research. Purpose: Existing cell-free DNA (cfDNA) methods lack the sensitivity needed for detecting minimal residual disease (MRD) following therapy. We developed a test for tracking hundreds of patient-specific mutations to detect MRD with a 1,000-fold lower error rate than conventional sequencing. Experimental Design: We compared the sensitivity of our approach to digital droplet PCR (ddPCR) in a dilution series, then retrospectively identified two cohorts of patients who had undergone prospective plasma sampling and clinical data collection: 16 patients with ER+/HER2- metastatic breast cancer (MBC) sampled within 6 months following metastatic diagnosis and 142 patients with stage 0 to III breast cancer who received curative-intent treatment with most sampled at surgery and 1 year postoperative. We performed whole-exome sequencing of tumors and designed individualized MRD tests, which we applied to serial cfDNA samples. Results: Our approach was 100-fold more sensitive than ddPCR when tracking 488 mutations, but most patients had fewer identifiable tumor mutations to track in cfDNA (median = 57; range = 2–346). Clinical sensitivity was 81% (n = 13/16) in newly diagnosed MBC, 23% (n = 7/30) at the postoperative time point and 19% (n = 6/32) at 1 year in early-stage disease, and highest in patients with the most tumor mutations available to track. MRD detection at 1 year was strongly associated with distant recurrence [HR = 20.8; 95% confidence interval, 7.3–58.9]. Median lead time from first positive sample to recurrence was 18.9 months (range = 3.4–39.2 months). Conclusions: Tracking large numbers of individualized tumor mutations in cfDNA can improve MRD detection, but its sensitivity is driven by the number of tumor mutations available to track.
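    One way to see why sensitivity would scale with the number of tracked mutations is a simple sampling model, sketched below. This is an illustrative assumption, not the assay described in the abstract: it treats every cfDNA molecule covering a tracked site as an independent draw that is tumor-derived with probability equal to the tumor fraction, and it counts a single mutant molecule as a detection (a real test would need an error model and a stricter calling threshold). The function name, the tumor fraction of 1e-5, and the 3000 genome equivalents per sample are hypothetical values chosen for the example.

```python
import math


def detection_probability(tumor_fraction: float, n_mutations: int,
                          genome_equivalents: int) -> float:
    """Chance of sampling at least one mutant molecule across all tracked sites,
    under the simplified independent-sampling model described above."""
    if not 0.0 <= tumor_fraction < 1.0:
        raise ValueError("tumor_fraction must be in [0, 1)")
    if n_mutations < 0 or genome_equivalents < 0:
        raise ValueError("counts must be non-negative")
    molecules = n_mutations * genome_equivalents
    # 1 - (1 - f) ** molecules, written with log1p/expm1 for numerical stability.
    return -math.expm1(molecules * math.log1p(-tumor_fraction))


if __name__ == "__main__":
    # Mutation counts taken from the reported range and median (2, 57, 346);
    # tumor fraction and genome equivalents are hypothetical.
    for n in (2, 57, 346):
        print(n, detection_probability(1e-5, n, genome_equivalents=3000))
```

    In this model the number of informative molecules grows linearly with the number of tracked mutations, so a patient with only a handful of trackable mutations has a much lower chance of detection at the same tumor fraction.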

    Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels

    New strategies for prevention and treatment of type 2 diabetes (T2D) require improved insight into disease etiology. We analyzed 386,731 common single-nucleotide polymorphisms (SNPs) in 1464 patients with T2D and 1467 matched controls, each characterized for measures of glucose metabolism, lipids, obesity, and blood pressure. With collaborators (FUSION and WTCCC/UKT2D), we identified and confirmed three loci associated with T2D - in a noncoding region near CDKN2A and CDKN2B, in an intron of IGF2BP2, and in an intron of CDKAL1 - and replicated associations near HHEX and in SLC30A8 found by a recent whole-genome association study. We identified and confirmed association of a SNP in an intron of glucokinase regulatory protein (GCKR) with serum triglycerides. The discovery of associated variants in unsuspected genes and outside coding regions illustrates the ability of genome-wide association studies to provide potentially important clues to the pathogenesis of common diseases.