205 research outputs found

    Quantifying single nucleotide variant detection sensitivity in exome sequencing

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. 
Such alleles are likely to be of functional importance in population-based studies of rare diseases, in somatic mutations in cancer, and in explaining the “missing heritability” of quantitative traits.
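The "naive expectation of sampling from a binomial" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' calibrated model. It assumes that each read at a heterozygous site independently carries the non-reference allele with probability 0.5, and that a variant counts as detected once at least `min_alt_reads` reads support it. The function names, the two-read threshold, and the use of a simple mean to merge per-site values over a region are all assumptions made for illustration.

```python
from math import comb


def naive_het_detection(depth: int, min_alt_reads: int = 2, p_alt: float = 0.5) -> float:
    """Naive binomial probability of seeing >= min_alt_reads non-reference reads.

    Assumption: reads sample the two alleles independently with probability
    p_alt for the non-reference allele; no sequencing or mapping error.
    """
    if depth < min_alt_reads:
        return 0.0
    # Probability of observing fewer than min_alt_reads non-reference reads.
    p_miss = sum(
        comb(depth, k) * p_alt**k * (1 - p_alt) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_miss


def region_sensitivity(depths: list[int]) -> float:
    """Merge per-site values into one summary figure (assumed: simple mean)."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(naive_het_detection(d) for d in depths) / len(depths)


if __name__ == "__main__":
    for d in (3, 8, 13, 20):
        print(d, round(naive_het_detection(d), 4))
```

Under these assumptions the naive model gives a detection probability above 0.99 at a depth of 13 reads, whereas the abstract reports 95% empirical sensitivity at that depth. This is consistent with its statement that measured sensitivity is substantially worse than the binomial expectation.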

    Good Enough Practices in Scientific Computing: A Learning Module for Researchers

    We present a half-day learning module targeted at a broad audience of researchers who want to learn how to be more efficient and effective in their data analysis and computing, whatever their career stage. The module teaches “good enough practices” that are near universally useful for researchers who use computers in their work. These practices encompass data management, software and programming, collaborating with colleagues, organizing projects, keeping track of changes, and writing manuscripts. Good enough practices rely on a shared set of principles that span these areas: planning, modular organization, names, and documentation. The lesson is in The Carpentries format and the materials are open-source and hosted on GitHub by The Carpentries Lab. The lesson is visible at https://carpentries-lab.github.io/good-enough-practices/

    Variant detection sensitivity and biases in whole genome and exome sequencing

    BACKGROUND: Less than two percent of the human genome is protein coding, yet that small fraction harbours the majority of known disease causing mutations. Despite rapidly falling whole genome sequencing (WGS) costs, much research and increasingly the clinical use of sequence data is likely to remain focused on the protein coding exome. We set out to quantify and understand how WGS compares with the targeted capture and sequencing of the exome (exome-seq), for the specific purpose of identifying single nucleotide polymorphisms (SNPs) in exome targeted regions. RESULTS: We have compared polymorphism detection sensitivity and systematic biases using a set of tissue samples that have been subject to both deep exome and whole genome sequencing. The scoring of detection sensitivity was based on sequence down sampling and reference to a set of gold-standard SNP calls for each sample. Despite evidence of incremental improvements in exome capture technology over time, whole genome sequencing has greater uniformity of sequence read coverage and reduced biases in the detection of non-reference alleles than exome-seq. Exome-seq achieves 95% SNP detection sensitivity at a mean on-target depth of 40 reads, whereas WGS only requires a mean of 14 reads. Known disease causing mutations are not biased towards easy or hard to sequence areas of the genome for either exome-seq or WGS. CONCLUSIONS: From an economic perspective, WGS is at parity with exome-seq for variant detection in the targeted coding regions. WGS offers benefits in uniformity of read coverage and more balanced allele ratio calls, both of which can in most cases be offset by deeper exome-seq, with the caveat that some exome-seq targets will never achieve sufficient mapped read depth for variant detection due to technical difficulties or probe failures. 
As WGS provides intrinsically richer data that can give insight into polymorphisms outside coding regions and reveal genomic rearrangements, it is likely to progressively replace exome-seq for many applications. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-247) contains supplementary material, which is available to authorized users.

    Regionally enriched rare deleterious exonic variants in the UK and Ireland

    It is unclear how patterns of regional genetic differentiation in the UK and Ireland might impact the protein-coding fraction of the genome. We exploit UK Biobank (UKB) and Viking Genes whole exome sequencing data to study regional genetic differentiation across the UK and Ireland in protein coding genes, encompassing 44,696 unrelated individuals from 20 regions of origin. We demonstrate substantial exonic differentiation among Shetlanders, Orcadians, individuals with full or partial Ashkenazi Jewish ancestry and in several mainland regions (particularly north and south Wales, southeast Scotland and Ireland). With stringent filtering criteria, we find 67 regionally enriched (≥5-fold) variants likely to have adverse biomedical consequences in homozygous individuals. Here, we show that regional genetic variation across the UK and Ireland should be considered in the design of genetic studies and may inform effective genetic screening and counselling

    Rapid evolution of colistin resistance in a bioreactor model of infection of Klebsiella pneumoniae

    Colistin remains an important antibiotic for the therapeutic management of drug-resistant Klebsiella pneumoniae. Despite the numerous reports of colistin resistance in clinical strains, it remains unclear exactly when and how the different mutational events that reduce colistin susceptibility arise. Using a bioreactor model of infection, we modelled the emergence of colistin resistance in a susceptible isolate of K. pneumoniae. Genotypic, phenotypic and mathematical analyses of the antibiotic-challenged and un-challenged populations indicate that, after an initial decline, the population recovers within 24 h owing to a small number of “founder cells” that carry single point mutations, mainly in the regulatory genes encoding crrB and pmrB, which when mutated result in an up to 100-fold reduction in colistin susceptibility. Our work underlines the rapid development of colistin resistance during treatment or exposure of susceptible K. pneumoniae infections, with implications for the use of cationic antimicrobial peptides as a monotherapy. T.S. acknowledges financial support from the Medical Research Council (MR/P007597/1), and B.W. acknowledges the support of the project financed under Dioscuri, a programme initiated by the Max Planck Society, jointly managed with the National Science Centre in Poland, and mutually funded by Polish Ministry of Science and Higher Education and German Federal Ministry of Education and Research (grant no. UMO-2019/02/H/NZ6/00003). SM acknowledges salary support from the Biotechnology and Biological Sciences Council (BBS/E/D/20002173).

    Gorham-Stout case report: a multi-omic analysis reveals recurrent fusions as new potential drivers of the disease

    BACKGROUND: Gorham-Stout disease is a rare condition characterized by vascular proliferation and the massive destruction of bone tissue. With less than 400 cases in the literature of Gorham-Stout syndrome, we performed a unique study combining whole-genome sequencing and RNA-Seq to probe the genomic features and differentially expressed pathways of a presented case, revealing new possible drivers and biomarkers of the disease. CASE PRESENTATION: We present a case report of a white 45-year-old female patient with marked bone loss of the left humerus associated with vascular proliferation, diagnosed with Gorham-Stout disease. The analysis of whole-genome sequencing showed a dominance of large structural DNA rearrangements. Particularly, rearrangements in chromosomes seven, twelve, and twenty could contribute to the development of the disease, especially a gene fusion involving ATG101 that could affect macroautophagy. The study of RNA-sequencing data from the patient uncovered the PI3K/AKT/mTOR pathway as the most affected signaling cascade in the Gorham-Stout lesional tissue. Furthermore, M2 macrophage infiltration was detected using immunohistochemical staining and confirmed by deconvolution of the RNA-seq expression data. CONCLUSIONS: The way that DNA and RNA aberrations lead to Gorham-Stout disease is poorly understood due to the limited number of studies focusing on this rare disease. Our study provides the first glimpse into this facet of the disease, exposing new possible therapeutic targets and facilitating the clinicopathological diagnosis of Gorham-Stout disease. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12920-022-01277-x

    Control of endothelial cell function and arteriogenesis by MEG3:EZH2 epigenetic regulation of integrin expression

    Epigenetic processes involving long non-coding RNAs regulate endothelial gene expression. However, the underlying regulatory mechanisms causing endothelial dysfunction remain to be elucidated. Enhancer of zeste homolog 2 (EZH2) is an important rheostat of histone H3K27 trimethylation (H3K27me3) that represses endothelial targets, but EZH2 RNA binding capacity and EZH2:RNA functional interactions have not been explored in post-ischemic angiogenesis. We used formaldehyde/UV-assisted crosslinking ligation and sequencing of hybrids and identified a new role for maternally expressed gene 3 (MEG3). MEG3 formed the predominant RNA:RNA hybrid structures in endothelial cells. Moreover, MEG3:EZH2 assists recruitment onto chromatin. By EZH2-chromatin immunoprecipitation, following MEG3 depletion, we demonstrated that MEG3 controls recruitment of EZH2/H3K27me3 onto the integrin subunit alpha4 (ITGA4) promoter. Both MEG3 knockdown and EZH2 inhibition (A-395) promoted ITGA4 expression and improved endothelial cell migration and adhesion to fibronectin in vitro. The A-395 inhibitor re-directed MEG3-assisted chromatin remodeling, offering a direct therapeutic benefit by increasing endothelial function and resilience. This approach subsequently increased the expression of ITGA4 in arterioles following ischemic injury in mice, thus promoting arteriogenesis. Our findings show a context-specific role for MEG3 in guiding EZH2 to repress ITGA4. Novel therapeutic strategies could antagonize MEG3:EZH2 interaction for pre-clinical studies.

    Integrated molecular characterisation of endometrioid ovarian carcinoma identifies opportunities for stratification

    Endometrioid ovarian carcinoma (EnOC) is an under-investigated ovarian cancer type. Recent studies have described disease subtypes defined by genomics and hormone receptor expression patterns; here, we determine the relationship between these subtyping layers to define the molecular landscape of EnOC with high granularity and identify therapeutic vulnerabilities in high-risk cases. Whole exome sequencing data were integrated with progesterone and oestrogen receptor (PR and ER) expression-defined subtypes in 90 EnOC cases following robust pathological assessment, revealing dominant clinical and molecular features in the resulting integrated subtypes. We demonstrate significant correlation between subtyping approaches: PR-high (PR + /ER + , PR + /ER−) cases were predominantly CTNNB1-mutant (73.2% vs 18.4%, P < 0.001), while PR-low (PR−/ER + , PR−/ER−) cases displayed higher TP53 mutation frequency (38.8% vs 7.3%, P = 0.001), greater genomic complexity (P = 0.007) and more frequent copy number alterations (P = 0.001). PR-high EnOC patients experience favourable disease-specific survival independent of clinicopathological and genomic features (HR = 0.16, 95% CI 0.04–0.71). TP53 mutation further delineates the outcome of patients with PR-low tumours (HR = 2.56, 95% CI 1.14–5.75). A simple, routinely applicable, classification algorithm utilising immunohistochemistry for PR and p53 recapitulated these subtypes and their survival profiles. The genomic profile of high-risk EnOC subtypes suggests that inhibitors of the MAPK and PI3K-AKT pathways, alongside PARP inhibitors, represent promising candidate agents for improving patient survival. Patients with PR-low TP53-mutant EnOC have the greatest unmet clinical need, while PR-high tumours—which are typically CTNNB1-mutant and TP53 wild-type—experience excellent survival and may represent candidates for trials investigating de-escalation of adjuvant chemotherapy to agents such as endocrine therapy