SciClone: Inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution
The sensitivity of massively-parallel sequencing has confirmed that most cancers are oligoclonal, with subpopulations of neoplastic cells harboring distinct mutations. A fine resolution view of this clonal architecture provides insight into tumor heterogeneity, evolution, and treatment response, all of which may have clinical implications. Single tumor analysis already contributes to understanding these phenomena. However, cryptic subclones are frequently revealed by additional patient samples (e.g., collected at relapse or following treatment), indicating that accurately characterizing a tumor requires analyzing multiple samples from the same patient. To address this need, we present SciClone, a computational method that identifies the number and genetic composition of subclones by analyzing the variant allele frequencies of somatic mutations. We use it to detect subclones in acute myeloid leukemia and breast cancer samples that, though present at disease onset, are not evident from a single primary tumor sample. By doing so, we can track tumor evolution and identify the spatial origins of cells resisting therapy.
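The idea of reading subclones off variant allele frequencies (VAFs) can be illustrated with a minimal Python sketch. It is not SciClone's own clustering procedure: it simply groups sorted VAFs wherever neighbouring values lie close together, and converts a VAF to an approximate fraction of cells under the assumption of a heterozygous mutation in a copy-neutral diploid region. The function names, the `gap` threshold, and the example values are all hypothetical.

```python
def cluster_vafs(vafs, gap=0.08):
    """Group VAFs into clusters, starting a new cluster whenever two
    consecutive sorted values differ by more than `gap` (hypothetical cutoff)."""
    clusters = []
    for v in sorted(vafs):
        if clusters and v - clusters[-1][-1] <= gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def cell_fraction(vaf, purity=1.0):
    """Approximate fraction of tumor cells carrying a mutation.

    Assumes a heterozygous mutation in a copy-neutral diploid region,
    so one of two alleles is mutated in every carrying cell."""
    return min(1.0, 2.0 * vaf / purity)


# Made-up VAFs for eight somatic mutations in one sample.
vafs = [0.48, 0.51, 0.46, 0.22, 0.25, 0.24, 0.07, 0.05]
clusters = cluster_vafs(vafs)

for members in clusters:
    mean_vaf = sum(members) / len(members)
    print(len(members), "mutations, approx. cell fraction",
          round(cell_fraction(mean_vaf), 2))
```

In this toy input the cluster near a VAF of 0.5 would correspond to mutations present in essentially all tumor cells, and the lower clusters to candidate subclones. A method intended for real data would also need to model read-count noise and copy number.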
Evaluation of simulation methods for tumor subclonal reconstruction
Most neoplastic tumors originate from a single cell, and their evolution can
be genetically traced through lineages characterized by common alterations such
as small somatic mutations (SSMs), copy number alterations (CNAs), structural
variants (SVs), and aneuploidies. Due to the complexity of these alterations in
most tumors and the errors introduced by sequencing protocols and calling
algorithms, tumor subclonal reconstruction algorithms are necessary to
recapitulate the DNA sequence composition and tumor evolution in silico. With a
growing number of these algorithms available, there is a pressing need for
consistent and comprehensive benchmarking, which relies on realistic tumor
sequencing generated by simulation tools. Here, we examine the current
simulation methods, identifying their strengths and weaknesses, and provide
recommendations for their improvement. Our review also explores potential new
directions for research in this area. This work aims to serve as a resource for
understanding and enhancing tumor genomic simulations, contributing to the
advancement of the field.
ISOpureR: an R implementation of a computational purification algorithm of mixed tumour profiles
Background
Tumour samples containing distinct sub-populations of cancer and normal cells present challenges in the development of reproducible biomarkers, as these biomarkers are based on bulk signals from mixed tumour profiles. ISOpure is the only mRNA computational purification method to date that does not require a paired tumour-normal sample, provides a personalized cancer profile for each patient, and has been tested on clinical data. Replacing mixed tumour profiles with ISOpure-preprocessed cancer profiles led to better prognostic gene signatures for lung and prostate cancer.
Results
To simplify the integration of ISOpure into standard R-based bioinformatics analysis pipelines, the algorithm has been implemented as an R package. The ISOpureR package performs analogously to the original code in estimating the fraction of cancer cells and the patient cancer mRNA abundance profile from tumour samples in four cancer datasets.
Conclusions
The ISOpureR package estimates the fraction of cancer cells and personalized patient cancer mRNA abundance profile from a mixed tumour profile. This open-source R implementation enables integration into existing computational pipelines, as well as easy testing, modification and extension of the model.
Funding: Prostate Cancer Canada; Movember Foundation (Grant RS2014-01
Inferring clonal evolution of tumors from single nucleotide somatic mutations
High-throughput sequencing allows the detection and quantification of
frequencies of somatic single nucleotide variants (SNV) in heterogeneous tumor
cell populations. In some cases, the evolutionary history and population
frequency of the subclonal lineages of tumor cells present in the sample can be
reconstructed from these SNV frequency measurements. However, automated methods
to do this reconstruction are not available and the conditions under which
reconstruction is possible have not been described.
We describe the conditions under which the evolutionary history can be
uniquely reconstructed from SNV frequencies from single or multiple samples
from the tumor population and we introduce a new statistical model, PhyloSub,
that infers the phylogeny and genotype of the major subclonal lineages
represented in the population of cancer cells. It uses a Bayesian nonparametric
prior over trees that groups SNVs into major subclonal lineages and
automatically estimates the number of lineages and their ancestry. We sample
from the joint posterior distribution over trees to identify evolutionary
histories and cell population frequencies that have the highest probability of
generating the observed SNV frequency data. When multiple phylogenies are
consistent with a given set of SNV frequencies, PhyloSub represents the
uncertainty in the tumor phylogeny using a partial order plot. Experiments on a
simulated dataset and two real datasets comprising tumor samples from acute
myeloid leukemia and chronic lymphocytic leukemia patients demonstrate that
PhyloSub can infer both linear (or chain) and branching lineages and its
inferences are in good agreement with ground truth, where it is available.
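One constraint commonly used when deciding whether a lineage tree is compatible with SNV frequencies is that the frequencies of a node's children cannot sum to more than the node's own frequency. The sketch below checks only that constraint for a candidate tree; it is not the PhyloSub model, which places a Bayesian nonparametric prior over trees and samples from the posterior. The function name, the tree encoding (child mapped to parent), and the frequencies are hypothetical.

```python
def violates_sum_rule(parent, freq, tol=1e-9):
    """Return the nodes whose children's frequencies sum above their own.

    parent: dict mapping each node to its parent (None for the root).
    freq:   dict mapping each node to the fraction of cells carrying
            the SNVs that first appear at that node.
    """
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)
    return sorted(
        node
        for node, kids in children.items()
        if sum(freq[k] for k in kids) > freq[node] + tol
    )


# Made-up cell fractions for three SNV clusters.
freq = {"A": 1.0, "B": 0.6, "C": 0.5}

chain = {"A": None, "B": "A", "C": "B"}    # A -> B -> C
branch = {"A": None, "B": "A", "C": "A"}   # B and C both descend from A

print("chain violations:", violates_sum_rule(chain, freq))
print("branch violations:", violates_sum_rule(branch, freq))
```

With these example frequencies B and C together exceed A, so the branching tree is ruled out and only the chain remains. With smaller frequencies both trees could pass the check, which is the kind of ambiguity the abstract says is represented with a partial order plot.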
IST Austria Technical Report
A comprehensive understanding of the clonal evolution of cancer is critical for understanding neoplasia. Genome-wide sequencing data enables evolutionary studies at unprecedented depth. However, classical phylogenetic methods often struggle with noisy sequencing data of impure DNA samples and fail to detect subclones that have different evolutionary trajectories. We have developed a tool, called Treeomics, that allows us to reconstruct the phylogeny of a cancer with commonly available sequencing technologies. Using Bayesian inference and Integer Linear Programming, robust phylogenies consistent with the biological processes underlying cancer evolution were obtained for pancreatic, ovarian, and prostate cancers. Furthermore, Treeomics correctly identified sequencing artifacts such as those resulting from low statistical power; nearly 7% of variants were misclassified by conventional statistical methods. These artifacts can skew phylogenies by creating illusory tumor heterogeneity among distinct samples. Importantly, we show that the evolutionary trees generated with Treeomics are mathematically optimal.
Fast and scalable inference of multi-sample cancer lineages
Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee
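As a rough illustration of how multi-sample variant allele frequencies can act as lineage markers, the Python sketch below groups SNVs by the set of samples in which they are detected. It is a simplified stand-in and not the LICHeE implementation: the function name, the `present_at` detection threshold, and the VAF table are hypothetical.

```python
from collections import defaultdict


def group_by_presence(vaf_table, present_at=0.05):
    """Group SNVs by their presence/absence pattern across samples.

    vaf_table:  dict mapping SNV id -> list of VAFs, one per sample.
    present_at: hypothetical VAF cutoff for calling an SNV present.
    Returns a dict mapping a pattern such as "101" to the SNV ids with it.
    """
    groups = defaultdict(list)
    for snv, vafs in vaf_table.items():
        pattern = "".join("1" if v >= present_at else "0" for v in vafs)
        groups[pattern].append(snv)
    return dict(groups)


# Made-up VAFs for four SNVs measured in three samples of one patient.
table = {
    "snv1": [0.45, 0.40, 0.42],
    "snv2": [0.20, 0.00, 0.18],
    "snv3": [0.22, 0.01, 0.15],
    "snv4": [0.00, 0.30, 0.00],
}
groups = group_by_presence(table)

for pattern, snvs in sorted(groups.items(), reverse=True):
    print(pattern, snvs)
```

SNVs found in every sample (pattern "111" here) are candidates for the trunk of the lineage tree, while SNVs restricted to a subset of samples suggest later branches. Resolving the actual tree would additionally require ordering the groups by their frequencies.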