53 research outputs found
RNA-SeQC: RNA-seq metrics for quality control and process optimization
Summary: RNA-seq, the application of next-generation sequencing to RNA, provides transcriptome-wide characterization of cellular activity. Assessment of sequencing performance and library quality is critical to the interpretation of RNA-seq data, yet few tools exist to address this issue. We introduce RNA-SeQC, a program which provides key measures of data quality. These metrics include yield, alignment and duplication rates; GC bias, rRNA content, regions of alignment (exon, intron and intragenic), continuity of coverage, 3′/5′ bias and count of detectable transcripts, among others. The software provides multi-sample evaluation of library construction protocols, input materials and other experimental parameters. The modularity of the software enables pipeline integration and the routine monitoring of key measures of data quality such as the number of alignable reads, duplication rates and rRNA contamination. RNA-SeQC allows investigators to make informed decisions about sample inclusion in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.
Integrative Genomics Viewer
Author Manuscript 2012 May 07. To the Editor:
Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools. National Institute of General Medical Sciences (U.S.) (R01GM074024); National Cancer Institute (U.S.) (R21CA135827); National Human Genome Research Institute (U.S.) (U54HG003067
High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines.
Hundreds of genetically characterized cell lines are available for the discovery of genotype-specific cancer vulnerabilities. However, screening large numbers of compounds against large numbers of cell lines is currently impractical, and such experiments are often difficult to control. Here we report a method called PRISM that allows pooled screening of mixtures of cancer cell lines by labeling each cell line with 24-nucleotide barcodes. PRISM revealed the expected patterns of cell killing seen in conventional (unpooled) assays. In a screen of 102 cell lines across 8,400 compounds, PRISM led to the identification of BRD-7880 as a potent and highly specific inhibitor of aurora kinases B and C. Cell line pools also efficiently formed tumors as xenografts, and PRISM recapitulated the expected pattern of erlotinib sensitivity in vivo.
The Metabochip, a Custom Genotyping Array for Genetic Studies of Metabolic, Cardiovascular, and Anthropometric Traits
PMCID: PMC3410907. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Absolute quantification of somatic DNA alterations in human cancer
We developed a computational method (ABSOLUTE) that infers tumor purity and malignant cell ploidy directly from analysis of somatic DNA alterations. ABSOLUTE can detect subclonal heterogeneity and somatic homozygosity, and can calculate statistical sensitivity to detect specific aberrations. We used ABSOLUTE to analyze ovarian cancer data and identified pervasive subclonal somatic point mutations. In contrast, mutations occurring in key tumor suppressor genes, TP53 and NF1, were predominantly clonal and homozygous, as were mutations in a candidate tumor suppressor gene, CDK12. Analysis of absolute allelic copy-number profiles from 3,155 cancer specimens revealed that genome-doubling events are common in human cancer, and likely occur in already aneuploid cells. By correlating genome-doubling status with mutation data, we found that homozygous mutations in NF1 occurred predominantly in non-doubled samples. This finding suggests that genome doubling influences the pathways of tumor progression, with recessive inactivation being less common after genome doubling.
Mutational heterogeneity in cancer and the search for new cancer genes
Major international projects are now underway aimed at creating a comprehensive catalog of all genes responsible for the initiation and progression of cancer. These studies involve sequencing of matched tumor–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here, we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false positive findings that overshadow true driver events. Here, we show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumor-normal pairs and discover extraordinary variation in (i) mutation frequency and spectrum within cancer types, which shed light on mutational processes and disease etiology, and (ii) mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and allow true cancer genes to rise to attention.
The landscape of somatic copy-number alteration across human cancers
Available in PMC 2010 August 18. A powerful way to discover key genes with causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here we present high-resolution analyses of somatic copy-number alterations (SCNAs) from 3,131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across several cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in several cancer types. National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, P50CA90578); National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, R01CA109038); National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, R01CA109467); National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, P01CA085859); National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, P01CA 098101); National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs, K08CA122833
Characterizing the cancer genome in lung adenocarcinoma
Somatic alterations in cellular DNA underlie almost all human cancers(1). The prospect of targeted therapies(2) and the development of high-resolution, genome-wide approaches(3-8) are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in approximately 12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62944/1/nature06358.pd
The landscape of somatic copy-number alteration across human cancers
A powerful way to discover key genes playing causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here, we report high-resolution analyses of somatic copy-number alterations (SCNAs) from 3131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across multiple cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells harboring amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend upon expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in multiple cancer types.
Somatic mutations affect key pathways in lung adenocarcinoma
Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well-classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples. Our analysis identified 26 genes that are mutated at significantly high frequencies and thus are probably involved in carcinogenesis. The frequently mutated genes include tyrosine kinases, among them the EGFR homologue ERBB4; multiple ephrin receptor genes, notably EPHA3; vascular endothelial growth factor receptor KDR; and NTRK genes. These data provide evidence of somatic mutations in primary lung adenocarcinoma for several tumour suppressor genes involved in other cancers - including NF1, APC, RB1 and ATM - and for sequence changes in PTPRD as well as the frequently deleted gene LRP1B. The observed mutational profiles correlate with clinical features, smoking status and DNA repair defects. These results are reinforced by data integration including single nucleotide polymorphism array and gene expression array. Our findings shed further light on several important signalling pathways involved in lung adenocarcinoma, and suggest new molecular targets for treatment. National Human Genome Research Institute. We thank A. Lash, M.F. Zakowski, M.G. Kris and V. Rusch for intellectual contributions, and many members of the Baylor Human Genome Sequencing Center, the Broad Institute of Harvard and MIT, and the Genome Center at Washington University for support. This work was funded by grants from the National Human Genome Research Institute to E.S.L., R.A.G. and R.K.W. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62885/1/nature07423.pd