53 research outputs found

    RNA-SeQC: RNA-seq metrics for quality control and process optimization

    Summary: RNA-seq, the application of next-generation sequencing to RNA, provides transcriptome-wide characterization of cellular activity. Assessment of sequencing performance and library quality is critical to the interpretation of RNA-seq data, yet few tools exist to address this issue. We introduce RNA-SeQC, a program which provides key measures of data quality. These metrics include yield, alignment and duplication rates; GC bias, rRNA content, regions of alignment (exon, intron and intragenic), continuity of coverage, 3′/5′ bias and count of detectable transcripts, among others. The software provides multi-sample evaluation of library construction protocols, input materials and other experimental parameters. The modularity of the software enables pipeline integration and the routine monitoring of key measures of data quality such as the number of alignable reads, duplication rates and rRNA contamination. RNA-SeQC allows investigators to make informed decisions about sample inclusion in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.

    Integrative Genomics Viewer

    Author Manuscript 2012 May 07. To the Editor: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools. Funding: National Institute of General Medical Sciences (U.S.) (R01GM074024); National Cancer Institute (U.S.) (R21CA135827); National Human Genome Research Institute (U.S.) (U54HG003067).

    High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines.

    Hundreds of genetically characterized cell lines are available for the discovery of genotype-specific cancer vulnerabilities. However, screening large numbers of compounds against large numbers of cell lines is currently impractical, and such experiments are often difficult to control. Here we report a method called PRISM that allows pooled screening of mixtures of cancer cell lines by labeling each cell line with 24-nucleotide barcodes. PRISM revealed the expected patterns of cell killing seen in conventional (unpooled) assays. In a screen of 102 cell lines across 8,400 compounds, PRISM led to the identification of BRD-7880 as a potent and highly specific inhibitor of aurora kinases B and C. Cell line pools also efficiently formed tumors as xenografts, and PRISM recapitulated the expected pattern of erlotinib sensitivity in vivo.

    The Metabochip, a Custom Genotyping Array for Genetic Studies of Metabolic, Cardiovascular, and Anthropometric Traits

    PMCID: PMC3410907. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

    The landscape of somatic copy-number alteration across human cancers

    Available in PMC 2010 August 18. A powerful way to discover key genes with causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here we present high-resolution analyses of somatic copy-number alterations (SCNAs) from 3,131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across several cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells containing amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend on the expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in several cancer types. Funding: National Institutes of Health (U.S.) (Dana-Farber/Harvard Cancer Center and Pacific Northwest Prostate Cancer SPOREs; grants P50CA90578, R01CA109038, R01CA109467, P01CA085859, P01CA098101, K08CA122833).

    Characterizing the cancer genome in lung adenocarcinoma

    Somatic alterations in cellular DNA underlie almost all human cancers(1). The prospect of targeted therapies(2) and the development of high-resolution, genome-wide approaches(3-8) are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in ~12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62944/1/nature06358.pd

    The landscape of somatic copy-number alteration across human cancers

    A powerful way to discover key genes playing causal roles in oncogenesis is to identify genomic regions that undergo frequent alteration in human cancers. Here, we report high-resolution analyses of somatic copy-number alterations (SCNAs) from 3131 cancer specimens, belonging largely to 26 histological types. We identify 158 regions of focal SCNA that are altered at significant frequency across multiple cancer types, of which 122 cannot be explained by the presence of a known cancer target gene located within these regions. Several gene families are enriched among these regions of focal SCNA, including the BCL2 family of apoptosis regulators and the NF-κB pathway. We show that cancer cells harboring amplifications surrounding the MCL1 and BCL2L1 anti-apoptotic genes depend upon expression of these genes for survival. Finally, we demonstrate that a large majority of SCNAs identified in individual cancer types are present in multiple cancer types.

    Somatic mutations affect key pathways in lung adenocarcinoma

    Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well-classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples. Our analysis identified 26 genes that are mutated at significantly high frequencies and thus are probably involved in carcinogenesis. The frequently mutated genes include tyrosine kinases, among them the EGFR homologue ERBB4; multiple ephrin receptor genes, notably EPHA3; vascular endothelial growth factor receptor KDR; and NTRK genes. These data provide evidence of somatic mutations in primary lung adenocarcinoma for several tumour suppressor genes involved in other cancers - including NF1, APC, RB1 and ATM - and for sequence changes in PTPRD as well as the frequently deleted gene LRP1B. The observed mutational profiles correlate with clinical features, smoking status and DNA repair defects. These results are reinforced by data integration including single nucleotide polymorphism array and gene expression array. Our findings shed further light on several important signalling pathways involved in lung adenocarcinoma, and suggest new molecular targets for treatment. National Human Genome Research Institute. We thank A. Lash, M.F. Zakowski, M.G. Kris and V. Rusch for intellectual contributions, and many members of the Baylor Human Genome Sequencing Center, the Broad Institute of Harvard and MIT, and the Genome Center at Washington University for support. This work was funded by grants from the National Human Genome Research Institute to E.S.L., R.A.G. and R.K.W. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62885/1/nature07423.pd