81 research outputs found

    A variational piecewise smooth model for identification of chromosomal imbalances in cancer

    Monitoring of changes at the DNA level enables the characterization of the underlying structure of genetic diseases. In particular, copy number alterations (CNAs) are increasingly being recognized as an important component of genetic variation in cancer: oncogenes may be enhanced by DNA amplification and tumor suppressor genes may be inactivated by physical deletion. Encouraged by the advent of array comparative genomic hybridization (aCGH) technology, several biological studies have been designed to look for chromosomal aberrations involved in cancer. Hence, the development of algorithms aimed at the identification of CNAs is a current challenge in bioinformatics. Despite the number of proposed approaches, identification of CNAs remains an open problem. Here we propose a new approach for detection of CNAs that extends a previously published algorithm in which a popular variational model for image segmentation was used. The proposed algorithm, called Vega Multi-Channel (VegaMC), starts from the assumption that copy number profiles are piecewise constant and finds the optimal segmentation by minimizing an energy functional that represents a compromise between accuracy and parsimony of the boundaries. We applied VegaMC to a published gastrointestinal stromal tumor aCGH dataset, showing the ability of the proposed approach to identify well-known cytogenetic mutations and, potentially, to discover new ones.
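The trade-off between accuracy and parsimony of the boundaries can be illustrated on a single one-dimensional profile. The sketch below is not the VegaMC algorithm: it is a simple exact dynamic program, chosen here for illustration, that minimizes the sum of squared errors plus a penalty `lam` per breakpoint. The function name, the parameter `lam`, and the squared-error data term are all assumptions made for this example.

```python
from itertools import accumulate


def segment_piecewise_constant(y, lam):
    """Fit a piecewise constant profile to y (illustrative sketch only).

    Minimizes: sum of squared errors + lam * (number of breakpoints).
    Returns the fitted values and the start indices of all segments
    after the first one (the breakpoints).
    """
    n = len(y)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = [0.0] + list(accumulate(float(v) for v in y))
    s2 = [0.0] + list(accumulate(float(v) ** 2 for v in y))

    def sse(i, j):
        # Squared error of fitting the mean to y[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] is the minimal energy for the prefix y[:j].
    best = [float("inf")] * (n + 1)
    best[0] = -lam  # the first segment does not count as a breakpoint
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + lam + sse(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i

    # Backtrack to recover the segments as half-open ranges (i, j).
    segments = []
    j = n
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    segments.reverse()

    fit = []
    for i, j in segments:
        fit.extend([(s1[j] - s1[i]) / (j - i)] * (j - i))
    breakpoints = [i for i, _ in segments[1:]]
    return fit, breakpoints
```

For the noiseless profile `[0]*5 + [1]*5 + [0]*5`, a small penalty such as `lam=0.5` yields breakpoints `[5, 10]`, while a large penalty such as `lam=10` collapses the profile into one segment. The double loop is quadratic in the profile length, so this form is only suitable for short profiles.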

    GARFIELD classifies disease-relevant genomic features through integration of functional annotations with association signals.

    Loci discovered by genome-wide association studies (GWAS) predominantly map outside protein-coding genes. The interpretation of the functional consequences of non-coding variants can be greatly enhanced by catalogs of regulatory genomic regions in cell lines and primary tissues. However, robust and readily applicable methods are still lacking by which to systematically evaluate the contribution of these regions to genetic variation implicated in diseases or quantitative traits. Here we propose a novel approach that combines GWAS findings with regulatory or functional annotations to classify features relevant to a phenotype of interest. Within our framework, we account for major sources of confounding that current methods do not address. We further assess enrichment of GWAS signals for 19 traits within Encyclopedia of DNA Elements- and Roadmap-derived regulatory regions. We characterize unique enrichment patterns for traits and annotations, driving novel biological insights. The method is implemented in standalone software and an R package to facilitate its application by the research community.

    Visualization of Genomic Changes by Segmented Smoothing Using an L0 Penalty

    Copy number variations (CNVs) and allelic imbalance in tumor tissue can show strong segmentation. Their graphical presentation can be enhanced by appropriate smoothing. Existing signal and scatterplot smoothers do not respect segmentation well. We present novel algorithms that use a penalty on the L0 norm of differences of neighboring values. Visualization is our main goal, but we compare classification performance to that of VEGA.
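One common way to approximate an L0 penalty on neighboring differences is iteratively reweighted penalized least squares, sketched below with NumPy. The abstract does not describe the authors' implementation, so this is an assumption-laden illustration only: the function name `l0_smooth`, the parameters `lam`, `beta`, and `n_iter`, and the weight formula `1 / (d**2 + beta**2)` are choices made for this example.

```python
import numpy as np


def l0_smooth(y, lam=1.0, beta=1e-4, n_iter=50):
    """Segmented smoothing with an approximate L0 difference penalty.

    Illustrative sketch: each iteration solves
        (I + lam * D' W D) z = y,
    where D takes first differences and W holds weights
    1 / (d**2 + beta**2) computed from the current differences d.
    Small differences get huge weights (pushed to zero), while
    large jumps get small weights (left almost untouched).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # (n-1) x n matrix such that D @ z gives neighboring differences.
    D = np.diff(np.eye(n), axis=0)
    z = y.copy()
    for _ in range(n_iter):
        w = 1.0 / ((D @ z) ** 2 + beta ** 2)
        A = np.eye(n) + lam * D.T @ (w[:, None] * D)
        z = np.linalg.solve(A, y)
    return z
```

The dense matrices keep the sketch short; for long genomic profiles a banded or sparse solver would be the natural replacement, since the system is tridiagonal.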

    Fast MCMC sampling for hidden markov models to determine copy number variations

    Background: Hidden Markov Models (HMMs) are often used for analyzing Comparative Genomic Hybridization (CGH) data to identify chromosomal aberrations or copy number variations by segmenting observation sequences. For efficiency reasons, the parameters of an HMM are often estimated with maximum likelihood and a segmentation is obtained with the Viterbi algorithm. This introduces considerable uncertainty in the segmentation, which can be avoided with Bayesian approaches that integrate out parameters using Markov Chain Monte Carlo (MCMC) sampling. While the advantages of Bayesian approaches have been clearly demonstrated, likelihood-based approaches are still preferred in practice for their lower running times; datasets coming from high-density arrays and next-generation sequencing amplify these problems. Results: We propose an approximate sampling technique, inspired by compression of discrete sequences in HMM computations and by kd-trees to leverage spatial relations between data points in typical data sets, to speed up the MCMC sampling. Conclusions: We test our approximate sampling method on simulated and biological ArrayCGH datasets and on high-density SNP arrays, and demonstrate speed-ups of 10 to 60 and of 90, respectively, while achieving results competitive with state-of-the-art Bayesian approaches. Availability: An implementation of our method will be made available as part of the open source GHMM library from http://ghmm.org.
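For orientation, the likelihood-based baseline mentioned in the background (a fixed-parameter HMM segmented with the Viterbi algorithm) can be sketched as follows. This is not the approximate MCMC sampler proposed in the paper and it does not use the GHMM library; the function name, the Gaussian emissions with a shared `sigma`, and all parameter values in the usage note are assumptions for illustration.

```python
import math


def viterbi_gaussian(obs, means, sigma, trans, start):
    """Most likely state path of an HMM with Gaussian emissions.

    obs   : sequence of observed values (e.g. log ratios)
    means : emission mean for each hidden state
    sigma : shared emission standard deviation
    trans : trans[r][s] = probability of moving from state r to s
    start : initial state probabilities
    """
    k = len(means)

    def log_emit(x, s):
        # Gaussian log density up to a constant shared by all states.
        return -0.5 * ((x - means[s]) / sigma) ** 2

    score = [math.log(start[s]) + log_emit(obs[0], s) for s in range(k)]
    back = []  # back[t - 1][s] = best predecessor of state s at time t
    for x in obs[1:]:
        pointers, new_score = [], []
        for s in range(k):
            best_prev = max(
                range(k), key=lambda r: score[r] + math.log(trans[r][s])
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + math.log(trans[best_prev][s]) + log_emit(x, s)
            )
        back.append(pointers)
        score = new_score

    # Backtrack from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With three hypothetical states for loss, neutral, and gain (`means=[-1, 0, 1]`, `sigma=0.3`, a transition matrix with 0.9 on the diagonal and 0.05 elsewhere, and a uniform start), the observations `[0, 0.1, -0.1, 1.0, 0.9, 1.1, -1.0, -0.9]` are segmented as `[1, 1, 1, 2, 2, 2, 0, 0]`. The result is a single point estimate, which is the limitation the Bayesian approaches in the abstract aim to overcome.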

    A somatic-mutational process recurrently duplicates germline susceptibility loci and tissue-specific super-enhancers in breast cancers

    Somatic rearrangements contribute to the mutagenized landscape of cancer genomes. Here, we systematically interrogated rearrangements in 560 breast cancers by using a piecewise constant fitting approach. We identified 33 hotspots of large (>100 kb) tandem duplications, a mutational signature associated with homologous-recombination-repair deficiency. Notably, these tandem-duplication hotspots were enriched in breast cancer germline susceptibility loci (odds ratio (OR) = 4.28) and breast-specific 'super-enhancer' regulatory elements (OR = 3.54). These hotspots may be sites of selective susceptibility to double-strand-break damage due to high transcriptional activity or, through incrementally increasing copy number, may be sites of secondary selective pressure. The transcriptomic consequences ranged from strong individual oncogene effects to weak but quantifiable multigene expression effects. We thus present a somatic-rearrangement mutational process affecting coding sequences and noncoding regulatory elements and contributing a continuum of driver consequences, from modest to strong effects, thereby supporting a polygenic model of cancer development. DG is supported by the EU-FP7-SUPPRESSTEM project. SN-Z is funded by a Wellcome Trust Intermediate Fellowship (WT100183MA) and is a Wellcome Beit Fellow.

    The Genomic and Immune Landscapes of Lethal Metastatic Breast Cancer.

    The detailed molecular characterization of lethal cancers is a prerequisite to understanding resistance to therapy and escape from cancer immunoediting. We performed extensive multi-platform profiling of multi-regional metastases in autopsies from 10 patients with therapy-resistant breast cancer. The integrated genomic and immune landscapes show that metastases propagate and evolve as communities of clones, reveal their predicted neo-antigen landscapes, and show that they can accumulate HLA loss of heterozygosity (LOH). The data further identify variable tumor microenvironments and reveal, through analyses of T cell receptor repertoires, that adaptive immune responses appear to co-evolve with the metastatic genomes. These findings reveal in fine detail the landscapes of lethal metastatic breast cancer. CRUK

    HRDetect is a predictor of BRCA1 and BRCA2 deficiency based on mutational signatures.

    Approximately 1-5% of breast cancers are attributed to inherited mutations in BRCA1 or BRCA2 and are selectively sensitive to poly(ADP-ribose) polymerase (PARP) inhibitors. In other cancer types, germline and/or somatic mutations in BRCA1 and/or BRCA2 (BRCA1/BRCA2) also confer selective sensitivity to PARP inhibitors. Thus, assays to detect BRCA1/BRCA2-deficient tumors have been sought. Recently, somatic substitution, insertion/deletion and rearrangement patterns, or 'mutational signatures', were associated with BRCA1/BRCA2 dysfunction. Herein we used a lasso logistic regression model to identify six distinguishing mutational signatures predictive of BRCA1/BRCA2 deficiency. A weighted model called HRDetect was developed to accurately detect BRCA1/BRCA2-deficient samples. HRDetect identifies BRCA1/BRCA2-deficient tumors with 98.7% sensitivity (area under the curve (AUC) = 0.98). Application of this model in a cohort of 560 individuals with breast cancer, of whom 22 were known to carry a germline BRCA1 or BRCA2 mutation, allowed us to identify an additional 22 tumors with somatic loss of BRCA1 or BRCA2 and 47 tumors with functional BRCA1/BRCA2 deficiency where no mutation was detected. We validated HRDetect on independent cohorts of breast, ovarian and pancreatic cancers and demonstrated its efficacy in alternative sequencing strategies. Integrating all of the classes of mutational signatures thus reveals a larger proportion of individuals with breast cancer harboring BRCA1/BRCA2 deficiency (up to 22%) than hitherto appreciated (∼1-5%) who could have selective therapeutic sensitivity to PARP inhibition.

    Validating the concept of mutational signatures with isogenic cell models.

    The diversity of somatic mutations in human cancers can be decomposed into individual mutational signatures, patterns of mutagenesis that arise because of DNA damage and DNA repair processes that have occurred in cells as they evolved towards malignancy. Correlations between mutational signatures and environmental exposures, enzymatic activities and genetic defects have been described, but human cancers are not ideal experimental systems: the exposures to different mutational processes in a patient's lifetime are uncontrolled, and any relationships observed can only be described as associations. Here, we demonstrate the proof of principle that it is possible to recreate cancer mutational signatures in vitro using CRISPR-Cas9-based gene-editing experiments in an isogenic human-cell system. We provide experimental and algorithmic methods to discover mutational signatures generated under highly experimentally controlled conditions. Our in vitro findings strikingly recapitulate in vivo observations of cancer data, fundamentally validating the concept of (particularly) endogenously arising mutational signatures.

    Discretization Provides a Conceptually Simple Tool to Build Expression Networks

    Biomarker identification using network methods depends on finding regular co-expression patterns; the overall connectivity is of greater importance than any single relationship. A second requirement is a simple algorithm for ranking patients by how relevant a gene set is to them. For both requirements, discretized data help: first to identify gene cliques, and then to stratify patients.