
    The copy-number tree mixture deconvolution problem and applications to multi-sample bulk sequencing tumor data

    Cancer is an evolutionary process driven by somatic mutation. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the mutational complexity of cancer and the fact that nearly all cancer sequencing is of bulk tissue, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest CNAs that explain the copy number data from multiple samples of a tumor. CNTMD generalizes two approaches that have been researched intensively in recent years: deconvolution/factorization algorithms that aim to infer the number and proportions of clones in a mixed tumor sample; and phylogenetic models of copy number evolution that model the dependencies between copy number events that affect the same genomic loci. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that perform either deconvolution or phylogenetic tree construction under the assumption of a single tumor clone per sample. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher-resolution view of copy number evolution of this cancer than published analyses.
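
The "superposition" that bulk sequencing measures can be illustrated with a minimal Python sketch. It assumes a simple linear mixture model in which the bulk copy number of each genomic segment is the proportion-weighted average of the clones' integer copy numbers. The function name `mix_copy_numbers` and all data are hypothetical, and the sketch shows only the forward mixing step, not the deconvolution algorithm described in the abstract.

```python
def mix_copy_numbers(proportions, clone_profiles):
    """Return the bulk (fractional) copy number of each segment in one sample.

    proportions: clone proportions in the sample (non-negative, summing to 1).
    clone_profiles: one integer copy-number profile (list per segment) per clone.
    Assumption: bulk signal is a linear mixture of the clone profiles.
    """
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("clone proportions must sum to 1")
    n_segments = len(clone_profiles[0])
    return [
        sum(u * profile[s] for u, profile in zip(proportions, clone_profiles))
        for s in range(n_segments)
    ]


# Hypothetical data: a normal diploid clone and two tumor clones, three segments.
clones = [
    [2, 2, 2],  # normal clone
    [3, 2, 1],  # tumor clone A
    [4, 2, 1],  # tumor clone B
]

# Two samples of the same tumor with different clone proportions.
sample_1 = mix_copy_numbers([0.5, 0.5, 0.0], clones)
sample_2 = mix_copy_numbers([0.2, 0.3, 0.5], clones)
```

Deconvolution runs in the opposite direction: given only the fractional values in `sample_1` and `sample_2`, it has to recover both the integer clone profiles and the proportions, which is why several samples of the same tumor help constrain the answer.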

    Statistical Methods For Genomic And Transcriptomic Sequencing

    Part 1: High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but CNV profiling from whole-exome sequencing (WES) is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for WES data. CODEX includes a Poisson latent factor model, which includes terms that specifically remove biases due to GC content, exon capture and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based segmentation procedure that explicitly models the count-based WES data. CODEX is compared to existing methods on germline CNV detection in HapMap samples using microarray-based gold standard and is further evaluated on 222 neuroblastoma samples with matched normal, with focus on somatic CNVs within the ATRX gene. Part 2: Cancer is a disease driven by evolutionary selection on somatic genetic and epigenetic alterations. We propose Canopy, a method for inferring the evolutionary phylogeny of a tumor using both somatic copy number alterations and single nucleotide alterations from one or more samples derived from a single patient. Canopy is applied to bulk sequencing datasets of both longitudinal and spatial experimental designs and to a transplantable metastasis model derived from human cancer cell line MDA-MB-231. Canopy successfully identifies cell populations and infers phylogenies that are in concordance with existing knowledge and ground truth. Through simulations, we explore the effects of key parameters on deconvolution accuracy, and compare against existing methods. Part 3: Allele-specific expression is traditionally studied by bulk RNA sequencing, which measures average expression across cells. 
Single-cell RNA sequencing (scRNA-seq) allows the comparison of expression distribution between the two alleles of a diploid organism and thus the characterization of allele-specific bursting. We propose SCALE to analyze genome-wide allele-specific bursting, with adjustment of technical variability. SCALE detects genes exhibiting allelic differences in bursting parameters, and genes whose alleles burst non-independently. We apply SCALE to mouse blastocyst and human fibroblast cells and find that, globally, cis control in gene expression overwhelmingly manifests as differences in burst frequency.

    Parsimonious Clone Tree Integration in cancer

    BACKGROUND: Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones only based on either SNVs or CNAs, preventing a comprehensive characterization of a tumor's clonal composition. RESULTS: To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as an integration problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce PACTION (PArsimonious Clone Tree integratION), an algorithm that solves the problem using a mixed integer linear programming formulation. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our integration approach provides a higher resolution view of tumor evolution than previous studies. CONCLUSION: PACTION is an accurate and fast method that reconstructs clonal architecture of cancer tumors by integrating SNV and CNA clones inferred using existing methods.
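
The integration idea can be sketched in Python under one stated assumption: each integrated clone pairs one SNV clone with one CNA clone, so summing integrated proportions over either index should approximately reproduce the input SNV and CNA proportions. The helper names `marginals` and `total_deviation` and all data are hypothetical. This is only a consistency check on a candidate solution, not the mixed integer linear program that PACTION solves.

```python
def marginals(joint):
    """Sum joint clone proportions over each index.

    joint maps (snv_clone, cna_clone) -> proportion of that integrated clone.
    Returns (snv_marginals, cna_marginals) as dictionaries.
    """
    snv, cna = {}, {}
    for (i, j), p in joint.items():
        snv[i] = snv.get(i, 0.0) + p
        cna[j] = cna.get(j, 0.0) + p
    return snv, cna


def total_deviation(joint, snv_props, cna_props):
    """Total absolute gap between implied marginals and input proportions."""
    snv, cna = marginals(joint)
    dev = 0.0
    for i in set(snv) | set(snv_props):
        dev += abs(snv.get(i, 0.0) - snv_props.get(i, 0.0))
    for j in set(cna) | set(cna_props):
        dev += abs(cna.get(j, 0.0) - cna_props.get(j, 0.0))
    return dev


# Hypothetical inputs for one sample: proportions from separate SNV and CNA analyses.
snv_props = {"S0": 0.4, "S1": 0.6}
cna_props = {"C0": 0.7, "C1": 0.3}

# A candidate set of integrated clones that is consistent with both inputs.
candidate = {("S0", "C0"): 0.4, ("S1", "C0"): 0.3, ("S1", "C1"): 0.3}
deviation = total_deviation(candidate, snv_props, cna_props)
```

A deviation near zero means the candidate reproduces both sets of input proportions. Because the inputs are uncertain, a nonzero deviation can be tolerated, and an optimizer would search over candidates (and their tree structure) rather than check a single one.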

    Parsimonious Clone Tree Reconciliation in Cancer

    Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones only based on either SNVs or CNAs, preventing a comprehensive characterization of a tumor's clonal composition. To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as a reconciliation problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce a mixed integer linear programming formulation to solve it exactly. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our reconciliation approach provides a higher resolution view of tumor evolution than previous studies.

    Mapping the breast cancer metastatic cascade onto ctDNA using genetic and epigenetic clonal tracking.

    Circulating tumour DNA (ctDNA) allows tracking of the evolution of human cancers at high resolution, overcoming many limitations of tissue biopsies. However, exploiting ctDNA to determine how a patient's cancer is evolving in order to aid clinical decisions remains difficult. This is because ctDNA is a mix of fragmented alleles, and the contribution of different cancer deposits to ctDNA is largely unknown. Profiling ctDNA almost invariably requires prior knowledge of what genomic alterations to track. Here, we leverage a rapid autopsy programme to demonstrate that unbiased genomic characterisation of several metastatic sites and concomitant ctDNA profiling at whole-genome resolution reveals the extent to which ctDNA is representative of widespread disease. We also present a methylation profiling method that allows tracking evolutionary changes in ctDNA at single-molecule resolution without prior knowledge. These results have critical implications for the use of liquid biopsies to monitor cancer evolution in humans and guide treatment.

    Assess the effect of angiogenesis inhibition in intra-tumor heterogeneity

    The genetic diversity of the populations that arise within a tumor following cancer progression is known as intra-tumor heterogeneity. Angiogenesis is the formation of new blood vessels from the existing vascular network. It is involved in different physiological processes, including wound healing. Tumors induce the excessive production of pro-angiogenic factors that promote the proliferation and variety of cancer cells. The inhibitors of tumor angiogenesis were initially designed to destroy the tumor blood vessels, causing the death of cancer cells. However, they have been associated with selecting therapy-resistant cells, thus leading to treatment failure. This thesis aims to study the effect of angiogenesis inhibition on intra-tumor heterogeneity based on an experiment in which a breast cancer cell line, designated as MDA-MB-231, was injected into four mice. Two of them were administered sunitinib, an anti-angiogenic drug. In particular, this thesis investigates whether the two groups of mice, control and treatment, are distinct. Two variables were analyzed to distinguish between the mice: the intra-tumor heterogeneity and the mutational profiles of the genes. Three different heterogeneity estimation methods were chosen: the tumor heterogeneity index, PyClone-VI, and Canopy. These worked with the sequencing data of the mouse tumor biopsies, specifically with the somatic mutations and copy number alterations. Dimensionality reduction techniques were applied to extract information from several genes. These relied not only on the mouse samples but also on the tumor data of patients stored in The Cancer Genome Atlas, which allowed access to more examples. None of the methods could identify a clear difference between the two groups of mice. Their intra-tumor heterogeneity values were similar, and the mutational profiles of their genes appeared to follow the same pattern. Considering these results, we can assume that destroying the tumor blood vessels of the mice from the treatment group did not drive the diversification of cancer cells. Nonetheless, further research should be conducted to confirm this conclusion, for example by testing the latest heterogeneity estimation methods and exploring the capabilities of neural networks.