    An integrated analysis of molecular aberrations in NCI-60 cell lines

    BACKGROUND: Cancer is a complex disease where various types of molecular aberrations drive the development and progression of malignancies. Large-scale screenings of multiple types of molecular aberrations (e.g., mutations, copy number variations, DNA methylations, gene expressions) have become increasingly important in the prognosis and study of cancer. Consequently, a computational model integrating multiple types of information is essential for the analysis of the comprehensive data. RESULTS: We propose an integrated modeling framework to identify the statistical and putative causal relations of various molecular aberrations and gene expressions in cancer. To reduce spurious associations among the massive number of probed features, we sequentially applied three layers of logistic regression models with increasing complexity and uncertainty regarding the possible mechanisms connecting molecular aberrations and gene expressions. Layer 1 models associate gene expressions with the molecular aberrations on the same loci. Layer 2 models associate expressions with aberrations on different loci that have known mechanistic links. Layer 3 models associate expressions with nonlocal aberrations that have unknown mechanistic links. We applied the layered models to the integrated datasets of NCI-60 cancer cell lines and validated the results with large-scale statistical analysis. Furthermore, we discovered or reaffirmed the following prominent links: (1) Protein expressions are generally consistent with mRNA expressions. (2) Several gene expressions are modulated by composite local aberrations. For instance, CDKN2A expressions are repressed by either frame-shift mutations or DNA methylations. (3) Amplification of chromosome 6q in leukemia elevates the expression of MYB, and the downstream targets of MYB on other chromosomes are up-regulated accordingly. (4) Amplification of chromosome 3p and hypo-methylation of PAX3 together elevate MITF expression in melanoma, which up-regulates the downstream targets of MITF. (5) Mutations of TP53 are negatively associated with its direct target genes. CONCLUSIONS: The analysis results on NCI-60 data justify the utility of the layered models for the incoming flow of cancer genomic data. Experimental validations on selected prominent links and application of the layered modeling framework to other integrated datasets will be carried out subsequently.
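
    The three layer definitions in this abstract can be read as a simple rule for sorting candidate aberration/expression pairs. The sketch below only illustrates that sorting step, not the logistic regression models themselves; the function name, the gene placeholders, and the `known_links` set are hypothetical and do not come from the paper.

```python
def assign_layer(aberration_gene, expression_gene, known_links):
    """Return 1, 2, or 3 following the layer definitions in the abstract.

    known_links is an assumed set of (aberration_gene, expression_gene)
    pairs that have a documented mechanistic connection.
    """
    if aberration_gene == expression_gene:
        return 1  # aberration and expression on the same locus
    if (aberration_gene, expression_gene) in known_links:
        return 2  # different loci, known mechanistic link
    return 3      # nonlocal, no known mechanistic link

# Hypothetical placeholders, not real annotations
known_links = {("GENE_A", "GENE_B")}
pairs = [("GENE_A", "GENE_A"), ("GENE_A", "GENE_B"), ("GENE_C", "GENE_B")]
layers = [assign_layer(a, e, known_links) for a, e in pairs]
```

    Under this reading, each pair is tested only in the lowest-numbered layer it qualifies for, which is one plausible way to apply the models sequentially.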

    An algorithm for classifying tumors based on genomic aberrations and selecting representative tumor models

    BACKGROUND: Cancer is a heterogeneous disease caused by genomic aberrations and characterized by significant variability in clinical outcomes and response to therapies. Several subtypes of common cancers have been identified based on alterations of individual cancer genes, such as HER2, EGFR, and others. However, cancer is a complex disease driven by the interaction of multiple genes, so the copy number status of individual genes is not sufficient to define cancer subtypes and predict responses to treatments. A classification based on genome-wide copy number patterns would be better suited for this purpose. METHODS: To develop a more comprehensive cancer taxonomy based on genome-wide patterns of copy number abnormalities, we designed an unsupervised classification algorithm that identifies genomic subgroups of tumors. This algorithm is based on a modified genomic Non-negative Matrix Factorization (gNMF) algorithm and includes several additional components, namely a pilot hierarchical clustering procedure to determine the number of clusters, a multiple random initiation scheme, a new stop criterion for the core gNMF, as well as a 10-fold cross-validation stability test for quality assessment. RESULTS: We applied our algorithm to identify genomic subgroups of three major cancer types: non-small cell lung carcinoma (NSCLC), colorectal cancer (CRC), and malignant melanoma. High-density SNP array datasets for patient tumors and established cell lines were used to define genomic subclasses of the diseases and identify cell lines representative of each genomic subtype. The algorithm was compared with several traditional clustering methods and showed improved performance. To validate our genomic taxonomy of NSCLC, we correlated the genomic classification with disease outcomes. Overall survival time and time to recurrence were shown to differ significantly between the genomic subtypes. CONCLUSIONS: We developed an algorithm for cancer classification based on genome-wide patterns of copy number aberrations and demonstrated its superiority to existing clustering methods. The algorithm was applied to define genomic subgroups of three cancer types and identify cell lines representative of these subgroups. Our data enabled the assembly of representative cell line panels for testing drug candidates.
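
    The "multiple random initiation scheme" mentioned in this abstract can be illustrated with a generic NMF. The sketch below uses the standard Lee-Seung multiplicative updates, not the authors' modified gNMF. The relative-improvement stop rule, the function names, and the toy matrix are all assumptions, and the pilot hierarchical clustering and cross-validation steps are not reproduced.

```python
import numpy as np

def nmf(V, k, rng, max_iter=2000, tol=1e-9):
    """Plain NMF with Lee-Seung multiplicative updates (Frobenius loss)."""
    n, m = V.shape
    W = rng.random((n, k)) + 1e-3   # strictly positive random start
    H = rng.random((k, m)) + 1e-3
    eps = 1e-12                     # guards against division by zero
    prev = np.inf
    for _ in range(max_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        err = np.linalg.norm(V - W @ H)
        # Assumed stop rule: relative improvement falls below tol.
        if np.isfinite(prev) and prev - err <= tol * prev:
            break
        prev = err
    return W, H, err

def best_of_restarts(V, k, n_starts=10, seed=0):
    """Run NMF from several random starts; keep the lowest-error fit."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        W, H, err = nmf(V, k, rng)
        if best is None or err < best[2]:
            best = (W, H, err)
    return best

# Toy matrix (made up): 6 genomic segments x 8 samples, two sample groups
V = np.full((6, 8), 0.1)
V[:3, :4] = 2.0
V[3:, 4:] = 2.0

W, H, err = best_of_restarts(V, k=2)
# Rescale H by the column norms of W so components are comparable,
# then assign each sample to its dominant component.
scale = np.linalg.norm(W, axis=0)
labels = (H * scale[:, None]).argmax(axis=0)
```

    Keeping the lowest-error run is one common way to use restarts, because multiplicative updates only reach a local optimum that depends on the starting point.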

    Multilevel omic data integration in cancer cell lines: advanced annotation and emergent properties

    BACKGROUND: High-throughput (omic) data have become more widespread in both quantity and frequency of use, thanks to technological advances, lower costs and higher precision. Consequently, computational scientists are confronted by two parallel challenges: on one side, the design of efficient methods to interpret each of these data in their own right (gene expression signatures, protein markers, etc.) and, on the other side, realization of a novel, pressing request from the biological field to design methodologies that allow for these data to be interpreted as a whole, i.e. not only as the union of relevant molecules in each of these layers, but as a complex molecular signature containing proteins, mRNAs and miRNAs, all of which must be directly associated in the results of analyses that are able to capture inter-layer connections and complexity. RESULTS: We address the latter of these two challenges by testing an integrated approach on a known cancer benchmark: the NCI-60 cell panel. Here, high-throughput screens for mRNA, miRNA and proteins are jointly analyzed using factor analysis, combined with linear discriminant analysis, to identify the molecular characteristics of cancer. Comparisons with separate (non-joint) analyses show that the proposed integrated approach can uncover deeper and more precise biological information. In particular, the integrated approach gives a more complete picture of the set of miRNAs identified and the Wnt pathway, which represents an important surrogate marker of melanoma progression. We further test the approach on a more challenging patient-dataset, for which we are able to identify clinically relevant markers. CONCLUSIONS: The integration of multiple layers of omics can bring more information than analysis of single layers alone. Using and expanding the proposed integrated framework to integrate omic data from other molecular levels will allow researchers to uncover further systemic information. The application of this approach to a clinically challenging dataset shows its promising potential.

    SOX2 Is an Oncogene Activated by Recurrent 3q26.3 Amplifications in Human Lung Squamous Cell Carcinomas

    Squamous cell carcinoma (SCC) of the lung is a frequent and aggressive cancer type. Gene amplifications, a known activating mechanism of oncogenes, target the 3q26-qter region as one of the most frequently gained/amplified genomic sites in SCC of various types. Here, we used array comparative genomic hybridization to delineate the consensus region of 3q26.3 amplifications in lung SCC. Recurrent amplifications occur in 20% of lung SCC (136 tumors in total) and map to a core region of 2 Mb (Megabases) that encompasses SOX2, a transcription factor gene. Intense SOX2 immunostaining is frequent in nuclei of lung SCC, indicating potential active transcriptional regulation by SOX2. Analyses of the transcriptome of lung SCC, SOX2-overexpressing lung epithelial cells and embryonic stem cells (ESCs) reveal that SOX2 contributes to activate ESC-like phenotypes and provide clues pertaining to the deregulated genes involved in the malignant phenotype. In cell culture experiments, overexpression of SOX2 stimulates cellular migration and anchorage-independent growth while SOX2 knockdown impairs cell growth. Finally, SOX2 over-expression in non-tumorigenic human lung bronchial epithelial cells is tumorigenic in immunocompromised mice. These results indicate that the SOX2 transcription factor, a major regulator of stem cell function, is also an oncogene and a driver gene for the recurrent 3q26.33 amplifications in lung SCC.

    Systems analysis of the NCI-60 cancer cell lines by alignment of protein pathway activation modules with "-OMIC" data fields and therapeutic response signatures

    The NCI-60 cell line set is likely the most molecularly profiled set of human tumor cell lines in the world. However, a critical missing component of previous analyses has been the inability to place the massive amounts of "-omic" data in the context of functional protein signaling networks, which often contain many of the drug targets for new targeted therapeutics. We used reverse-phase protein array (RPPA) analysis to measure the activation/phosphorylation state of 135 proteins, with a total analysis of nearly 200 key protein isoforms involved in cell proliferation, survival, migration, adhesion, etc., in all 60 cell lines. We aggregated the signaling data into biochemical modules of interconnected kinase substrates for 6 key cancer signaling pathways: AKT, mTOR, EGF receptor (EGFR), insulin-like growth factor-1 receptor (IGF-1R), integrin, and apoptosis signaling. The net activation state of these protein network modules was correlated to available individual protein, phosphoprotein, mutational, metabolomic, miRNA, transcriptional, and drug sensitivity data. Pathway activation mapping identified reproducible and distinct signaling cohorts that transcended organ-type distinctions. Direct correlations with the protein network modules involved largely protein phosphorylation data but we also identified direct correlations of signaling networks with metabolites, miRNA, and DNA data. The integration of protein activation measurements into biochemically interconnected modules provided a novel means to align the functional protein architecture with multiple "-omic" data sets and therapeutic response correlations. This approach may provide a deeper understanding of how cellular biochemistry defines therapeutic response. Such "-omic" portraits could inform rational anticancer agent screenings and drive personalized therapeutic approaches. © 2013 American Association for Cancer Research
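
    The step of aggregating protein signals into a pathway module and correlating its net activation with drug sensitivity can be sketched as follows. The abstract does not say how the net activation state was computed, so averaging z-scored signals is an assumption here. The analyte names, the module membership, and all numbers are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one analyte across cell lines."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def module_score(signals, members):
    """Assumed aggregation: mean z-score of member analytes per cell line."""
    z = [zscores(signals[name]) for name in members]
    return [mean(column) for column in zip(*z)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical phosphoprotein signals for 5 cell lines (made-up numbers)
signals = {
    "AKT_pS473": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GSK3_pS9": [2.0, 2.5, 3.5, 4.5, 6.0],
    "EGFR_pY1068": [5.0, 1.0, 4.0, 2.0, 3.0],
}
akt_module = ["AKT_pS473", "GSK3_pS9"]      # hypothetical module membership
drug_response = [0.9, 0.8, 0.6, 0.5, 0.2]   # hypothetical sensitivity values

score = module_score(signals, akt_module)
r = pearson(score, drug_response)
```

    Standardizing each analyte before averaging keeps a single high-intensity measurement from dominating the module score. Other aggregation rules are equally plausible.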

    Comparison of TCGA and GENIE genomic datasets for the detection of clinically actionable alterations in breast cancer.

    Whole exome sequencing (WES), targeted gene panel sequencing and single nucleotide polymorphism (SNP) arrays are increasingly used for the identification of actionable alterations that are critical to cancer care. Here, we compared The Cancer Genome Atlas (TCGA) and the Genomics Evidence Neoplasia Information Exchange (GENIE) breast cancer genomic datasets (array and next generation sequencing (NGS) data) in detecting genomic alterations in clinically relevant genes. We performed an in silico analysis to determine the concordance in the frequencies of actionable mutations and copy number alterations/aberrations (CNAs) in the two most common breast cancer histologies, invasive lobular and invasive ductal carcinoma. We found that targeted sequencing identified a larger number of mutational hotspots and clinically significant amplifications that would have been missed by WES and SNP arrays in many actionable genes such as PIK3CA, EGFR, AKT3, FGFR1, ERBB2, ERBB3 and ESR1. The striking differences between the number of mutational hotspots and CNAs generated from these platforms highlight a number of factors that should be considered in the interpretation of array and NGS-based genomic data for precision medicine. Targeted panel sequencing was preferable to WES to define the full spectrum of somatic mutations present in a tumor.
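
    Comparing alteration frequencies between two cohorts, as this abstract describes, reduces to a per-gene tally. The sketch below is a minimal illustration under assumed inputs: the sample identifiers and alteration calls are made up, and the function names are hypothetical rather than part of any TCGA or GENIE tooling.

```python
from collections import Counter

def alteration_frequency(samples):
    """Fraction of samples with at least one alteration in each gene."""
    counts = Counter(g for genes in samples.values() for g in set(genes))
    n = len(samples)
    return {gene: c / n for gene, c in counts.items()}

def compare(freq_a, freq_b):
    """Pair up per-gene frequencies; genes absent from a cohort get 0.0."""
    genes = sorted(set(freq_a) | set(freq_b))
    return {g: (freq_a.get(g, 0.0), freq_b.get(g, 0.0)) for g in genes}

# Made-up calls: sample id -> altered genes
cohort_a = {"A1": ["PIK3CA"], "A2": ["PIK3CA", "ERBB2"], "A3": [], "A4": ["ESR1"]}
cohort_b = {"B1": ["PIK3CA", "ESR1"], "B2": ["ESR1"], "B3": ["ERBB2"], "B4": ["PIK3CA"]}

table = compare(alteration_frequency(cohort_a), alteration_frequency(cohort_b))
```

    Each entry of `table` holds the pair (frequency in cohort A, frequency in cohort B), so genes whose two values diverge are the ones a platform comparison would flag.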