Atlas of Signaling for Interpretation of Microarray Experiments
Microarray-based expression profiling of living systems is a quick and inexpensive method to obtain insights into the nature of various diseases and phenotypes. A typical microarray profile can yield hundreds or even thousands of differentially expressed genes, and finding biologically plausible themes or regulatory mechanisms underlying these changes is a non-trivial and daunting task. We describe a novel approach for systems-level interpretation of microarray expression data using a manually constructed “overview” pathway depicting the main cellular signaling channels (Atlas of Signaling). Currently, the developed pathway focuses on signal transduction from surface receptors to transcription factors and further transcriptional regulation of cellular “workhorse” proteins. We show how the constructed Atlas of Signaling, in combination with an enrichment analysis algorithm, allows quick identification and visualization of the main signaling cascades and cellular processes affected in a gene expression profiling experiment. We validate our approach using several publicly available gene expression datasets.
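The abstract does not say which enrichment algorithm is paired with the Atlas of Signaling. As one common possibility, the sketch below assumes a one-sided hypergeometric over-representation test. The function names, pathway names and gene identifiers are hypothetical and the data are made up for illustration only.

```python
from math import comb

def hypergeom_enrichment_p(universe_size, pathway_size, de_size, overlap):
    """Return P(X >= overlap) when de_size genes are drawn without replacement
    from a universe that contains pathway_size pathway genes."""
    upper = min(pathway_size, de_size)
    total = comb(universe_size, de_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, de_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def rank_pathways(pathways, de_genes, universe):
    """Score every pathway against the differentially expressed (DE) genes
    and return (name, overlap, p_value) tuples, smallest p-value first."""
    universe = set(universe)
    de = set(de_genes) & universe
    results = []
    for name, members in pathways.items():
        members = set(members) & universe
        overlap = len(members & de)
        p = hypergeom_enrichment_p(len(universe), len(members), len(de), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])

# Toy, made-up data: 10 genes, two pathways, three DE genes.
universe = [f"g{i}" for i in range(10)]
pathways = {"pathway_A": ["g0", "g1", "g2"], "pathway_B": ["g7", "g8", "g9"]}
ranked = rank_pathways(pathways, ["g0", "g1", "g2"], universe)
for name, overlap, p in ranked:
    print(f"{name}: overlap={overlap}, p={p:.4f}")
```

In practice the p-values would also need a multiple-testing correction across pathways, which this sketch omits.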
ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining
Background
New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed.
Results
We developed ProteoLens as a Java-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Language (DDL) and Data Manipulation Language (DML) may be specified. The robust query languages embedded directly within the visualization software help users bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network data in the standard Graph Modeling Language (GML) format, which enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens decouples complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms and descriptions; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to the associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: a human disease association network, a drug-target interaction network and a protein-peptide mapping network.
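The two-phase idea described above can be illustrated with a minimal sketch. This is not ProteoLens code: it assumes an in-memory SQLite database as a stand-in for Oracle or PostgreSQL, and the table name, column names, node names and helper functions are all hypothetical.

```python
import sqlite3

def load_association_rule(conn, query):
    """Phase 1: run a SQL query whose first column is a node ID and whose
    second column is an attribute value; return the mapping as a dict."""
    return {row[0]: row[1] for row in conn.execute(query)}

def gml_value(value):
    """Format a Python value as a GML number or a double-quoted string."""
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value).replace('"', "'") + '"'

def build_annotated_gml(edges, rule, attr_name):
    """Phase 2: build the network from (source, target) pairs and annotate
    each node that the association rule covers."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {name: i for i, name in enumerate(nodes)}
    lines = ["graph [", "  directed 0"]
    for name in nodes:
        attrs = f"id {index[name]} label {gml_value(name)}"
        if name in rule:
            attrs += f" {attr_name} {gml_value(rule[name])}"
        lines.append(f"  node [ {attrs} ]")
    for source, target in edges:
        lines.append(f"  edge [ source {index[source]} target {index[target]} ]")
    lines.append("]")
    return "\n".join(lines)

# Made-up example data: DDL creates the table, DML fills and queries it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expression (gene TEXT PRIMARY KEY, log2fc REAL)")
conn.executemany(
    "INSERT INTO expression VALUES (?, ?)", [("geneA", -1.5), ("geneB", 2.0)]
)
rule = load_association_rule(conn, "SELECT gene, log2fc FROM expression")
gml = build_annotated_gml([("geneA", "geneB"), ("geneB", "geneC")], rule, "log2fc")
print(gml)
```

Keeping the rule (a query result) separate from the network construction means the same edge list can be re-annotated with a different attribute simply by supplying a different query.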
Conclusion
The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.
Medulloblastoma Exome Sequencing Uncovers Subtype-Specific Somatic Mutations
Medulloblastomas are the most common malignant brain tumors in children [1]. Identifying and understanding the genetic events that drive these tumors is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma based on transcriptional and copy number profiles [2–5]. Here, we utilized whole exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas exhibit low mutation rates consistent with other pediatric tumors, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR, and LDB1, novel findings in medulloblastoma. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant but not wild-type beta-catenin. Together, our study reveals the alteration of Wnt, Hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic beta-catenin signaling in medulloblastoma.
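For readers unfamiliar with the unit, a rate in mutations per megabase is usually obtained by dividing a tumor's mutation count by the number of sequenced bases, expressed in millions. The sketch below assumes that definition; the abstract does not state how the denominator was computed, and the cohort values are invented and are not data from the study.

```python
from statistics import median

def mutations_per_megabase(non_silent_count, covered_bases):
    """Mutation rate = mutation count / (covered bases expressed in megabases)."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return non_silent_count / (covered_bases / 1_000_000)

# Hypothetical cohort: (non-silent mutation count, covered coding bases) per tumor.
cohort = [(6, 30_000_000), (15, 30_000_000), (9, 30_000_000)]
rates = [mutations_per_megabase(n, bases) for n, bases in cohort]
cohort_median = median(rates)
print(f"median rate: {cohort_median:.2f} non-silent mutations per Mb")
```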
Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion
Prior studies have identified recurrent oncogenic mutations in colorectal adenocarcinoma [1] and have surveyed exons of protein-coding genes for mutations in 11 affected individuals [2,3]. Here we report whole-genome sequencing from nine individuals with colorectal cancer, including primary colorectal tumors and matched adjacent non-tumor tissues, at an average of 30.7× and 31.9× coverage, respectively. We identify an average of 75 somatic rearrangements per tumor, including complex networks of translocations between pairs of chromosomes. Eleven rearrangements encode predicted in-frame fusion proteins, including a fusion of VTI1A and TCF7L2 found in 3 out of 97 colorectal cancers. Although TCF7L2 encodes TCF4, which cooperates with β-catenin [4] in colorectal carcinogenesis [5,6], the fusion lacks the TCF4 β-catenin–binding domain. Using RNA interference-mediated knockdown, we found a colorectal carcinoma cell line harboring the fusion gene to be dependent on VTI1A-TCF7L2 for anchorage-independent growth. This study shows previously unidentified levels of genomic rearrangements in colorectal carcinoma that can lead to essential gene fusions and other oncogenic events.
Integrative and Comparative Genomic Analysis of Lung Squamous Cell Carcinomas in East Asian Patients
PURPOSE: Lung squamous cell carcinoma (SCC) is the second most prevalent type of lung cancer. Currently, no targeted therapeutics are approved for treatment of this cancer, largely because of a lack of systematic understanding of the molecular pathogenesis of the disease. To identify therapeutic targets and perform comparative analyses of lung SCC, we probed somatic genome alterations of lung SCC by using samples from Korean patients. PATIENTS AND METHODS: We performed whole-exome sequencing of DNA from 104 lung SCC samples from Korean patients and matched normal DNA. In addition, copy-number analysis and transcriptome analysis were conducted for a subset of these samples. Clinical association with cancer-specific somatic alterations was investigated. RESULTS: This cancer cohort is characterized by a high mutational burden, with an average of 261 somatic exonic mutations per tumor and a mutational spectrum showing a signature of exposure to cigarette smoke. Seven genes demonstrated statistical enrichment for mutation: TP53, RB1, PTEN, NFE2L2, KEAP1, MLL2, and PIK3CA. Comparative analysis between Korean and North American lung SCC samples demonstrated a similar spectrum of alterations in these two populations, in contrast to the differences seen in lung adenocarcinoma. We also uncovered recurrent occurrence of the therapeutically actionable FGFR3-TACC3 fusion in lung SCC. CONCLUSION: These findings provide new steps toward the identification of genomic target candidates for precision medicine in lung SCC, a disease with significant unmet medical needs.