772 research outputs found

    A sparse regulatory network of copy-number driven expression reveals putative breast cancer oncogenes

    The influence of DNA cis-regulatory elements on a gene's expression has been intensively studied. However, little is known about expression driven by trans-acting DNA hotspots. DNA hotspots harboring copy number aberrations are recognized to be important in cancer as they influence multiple genes on a global scale. The challenge in detecting trans-effects is mainly due to the computational difficulty of detecting weak and sparse trans-acting signals amidst co-occurring passenger events. We propose an integrative approach to learn a sparse interaction network of DNA copy-number regions with their downstream targets in a breast cancer dataset. Information from this network helps distinguish copy-number driven from copy-number independent expression changes on a global scale. Our result further delineates cis- and trans-effects in a breast cancer dataset, for which important oncogenes such as ESR1 and ERBB2 appear to be highly copy-number dependent. Further, our model is shown to be efficient and, in terms of goodness of fit, no worse than other state-of-the-art predictors and network reconstruction models using both simulated and real data. Comment: Accepted at IEEE International Conference on Bioinformatics & Biomedicine (BIBM 2010)

    Computational Cancer Research: Network-based analysis of cancer data disentangles clinically relevant alterations from molecular measurements

    Cancer is a very complex genetic disease driven by combinations of mutated genes. This complexity strongly complicates the identification of driver genes and makes it very challenging to reveal how they influence cancerogenesis, prognosis or therapy response. Thousands of molecular profiles of the major human types of cancer have been measured over the last years. Apart from well-studied frequently mutated genes, still only little is known about the role of rarely mutated genes in cancer or the interplay of mutated genes in individual cancers. Gene expression and mutation profiles can be measured routinely, but computational methods for the identification of driver candidates along with the prediction of their potential impacts on downstream targets and clinically relevant characteristics only rarely exist. Instead of only focusing on frequently mutated genes, each cancer patient would be better analyzed by using the full information in their cancer-specific molecular profiles to improve the understanding of cancerogenesis and to more precisely predict prognosis and therapy response of individual patients. This requires novel computational methods for the integrative analysis of molecular cancer data. A promising way to realize this is to consider cancer as a disease of cellular networks. Therefore, I have developed a novel network-based approach for the integrative analysis of molecular cancer data over the last years. This approach directly learns gene regulatory networks from gene expression and copy number data and further makes it possible to quantify impacts of altered genes on clinically relevant downstream targets using network propagation. This habilitation thesis summarizes the results of seven of my publications. All publications have a focus on the integrative analysis of molecular cancer data with an overarching connection to the newly developed network-based approach.
In the first three publications, networks were learned to identify major regulators that distinguish characteristic gene expression signatures with applications to astrocytomas, oligodendrogliomas, and acute myeloid leukemia. Next, the central publication of this habilitation thesis, which combines network inference with network propagation, is introduced. The great value of this approach is demonstrated by quantifying potential direct and indirect impacts of rare and frequent gene copy number alterations on patient survival. Further, the publication of the corresponding user-friendly R package regNet is introduced. Finally, two additional publications that also strongly highlight the value of the developed network-based approach are presented with the aims to predict cancer gene candidates within the region of the 1p/19q co-deletion of oligodendrogliomas and to determine driver candidates associated with radioresistance and relapse of prostate cancer. All seven publications are embedded into a brief introduction that motivates the scientific background and the major objectives of this thesis. The background moves briefly from the hallmarks of cancer through the complexity of cancer genomes to the importance of networks in cancer. This includes a short introduction of the mathematical concepts that underlie the developed network inference and network propagation algorithms. Further, I briefly motivate and summarize my studies before the original publications are presented. The habilitation thesis is completed with a general discussion of the major results with a specific focus on the utilized network-based data analysis strategies. Major biologically and clinically relevant findings of each publication are also briefly summarized.

    Network modeling of the transcriptional effects of copy number aberrations in glioblastoma

    DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.

    INTEGRATIVE ANALYSIS OF OMICS DATA IN ADULT GLIOMA AND OTHER TCGA CANCERS TO GUIDE PRECISION MEDICINE

    Transcriptomic profiling and gene expression signatures have been widely applied as effective approaches for enhancing the molecular classification, diagnosis, prognosis or prediction of therapeutic response towards personalized therapy for cancer patients. Thanks to modern genome-wide profiling technology, scientists are able to build engines that leverage massive genomic variation and integrate it with clinical data to identify “at risk” individuals for the sake of prevention, diagnosis and therapeutic interventions. In my graduate work for my Ph.D. thesis, I have investigated genomic sequencing data mining to comprehensively characterise molecular classifications and aberrant genomic events associated with clinical prognosis and treatment response, applying high-dimensional omics data to promote the understanding of gene signatures and somatic molecular alterations contributing to cancer progression and clinical outcomes. Following this motivation, my dissertation has been focused on the following three topics in translational genomics. 1) Characterization of transcriptomic plasticity and its association with the tumor microenvironment in glioblastoma (GBM). I have integrated transcriptomic, genomic, protein and clinical data to increase the accuracy of GBM classification, and identified the association between the GBM mesenchymal subtype and reduced tumor purity, accompanied by increased presence of tumor-associated microglia.
I then identified the intrinsic tumor bulk, and not the corresponding neurosphere cells, as the sole source of microglia, through both transcriptional and protein level analysis using a panel of sphere-forming glioma cultures and their parent GBM samples. Furthermore, I have demonstrated my hypothesis through longitudinal analysis of paired primary and recurrent GBM samples that the phenotypic alterations of GBM subtypes are not due to an intrinsic proneural-to-mesenchymal transition in tumor cells; rather, they are intertwined with an increased level of microglia upon disease recurrence. Collectively, I have elucidated the critical role of the tumor microenvironment (microglia and macrophages from the central nervous system) in contributing to intra-tumor heterogeneity and to the accurate classification of GBM patients based on transcriptomic profiling, which will not only significantly impact the clinical perspective but also pave the way for preclinical cancer research. 2) Identification of prognostic gene signatures that stratify adult diffuse glioma patients harboring 1p/19q co-deletions. I have compared multiple statistical methods and derived a gene signature significantly associated with survival by applying a machine learning algorithm. Then I have identified inflammatory response and acetylation activity that are associated with malignant progression of 1p/19q co-deleted glioma. In addition, I showed this signature translates to other types of adult diffuse glioma, suggesting its universality in the pathobiology of other glioma subsets. My efforts on integrative data analysis of this highly curated data set using optimized statistical models will reflect the pending update to the WHO classification system of tumors in the central nervous system (CNS). 3) Comprehensive characterization of somatic fusion transcripts in Pan-Cancers. I have identified a panel of novel fusion transcripts across all of TCGA cancer types through transcriptomic profiling.
Then I have predicted fusion proteins with kinase activity and hub function in pathway networks based on the annotation of genetically mobile domains and functional domain architectures. I have evaluated a panel of in-frame gene fusions as potential driver mutations based on the network fusion centrality hypothesis. I have also characterised the emerging complexity of genetic architecture in fusion transcripts by integrating genomic structure and somatic variants and delineating the distinct genomic patterns of fusion events across different cancer types. Overall, my exploration of the pathogenetic impact and clinical relevance of candidate gene fusions has provided fundamental insights into the management of a subset of cancer patients by predicting the oncogenic signalling and specific drug targets encoded by these fusion genes. Taken together, the translational genomic research I have conducted during my Ph.D. study will shed new light on precision medicine and contribute to the cancer research community. The novel classification concept, gene signature and fusion transcripts I have identified will address several hotly debated issues in translational genomics, such as complex interactions between tumor bulks and their adjacent microenvironments, prognostic markers for clinical diagnostics and personalized therapy, and distinct patterns of genomic structure alterations and oncogenic events in different cancer types, thereby facilitating our understanding of genomic alterations and moving us towards the development of precision medicine.

    Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease.
We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/135396/1/gepi22018.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/135396/2/gepi22018_am.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/135396/3/gepi22018-sup-0001-SuppMat.pd

    Bridging the Gap between Genotype and Phenotype via Network Approaches

    In the last few years we have witnessed tremendous progress in detecting associations between genetic variations and complex traits. While genome-wide association studies have been able to discover genomic regions that may influence many common human diseases, these discoveries created an urgent need for methods that extend the knowledge of genotype-phenotype relationships to the level of the molecular mechanisms behind them. To address this emerging need, computational approaches increasingly utilize a pathway-centric perspective. These new methods often utilize known or predicted interactions between genes and/or gene products. In this review, we survey recently developed network-based methods that attempt to bridge the genotype-phenotype gap. We note that although these methods help narrow the gap between genotype and phenotype, these approaches alone cannot provide the precise details of underlying mechanisms, and current research is still far from closing the gap.

    Data Integration And Targeted Anticancer Drug Synergies Prediction

    In the past decades, targeted cancer therapies have made considerable achievements in inhibiting cancer progression by modulating specific molecular targets. However, targeted cancer therapies have reached a plateau of efficacy as the primary therapy since tumor cells can achieve adaptability through functional redundancies and activation of compensatory signaling pathways. Therapies using drug combinations have been developed to overcome this bottleneck. Accurate predictions of synergistic effects can help prioritize biological experiments to identify effective combination therapies. Data integration can give us a deeper insight into the mechanisms of cancer and drug synergies and help to address the challenge of predicting drug combinations. In this thesis, we illustrate that integrative analysis of multiple types of omics data and pharmacological data can more effectively identify drug synergies, and hence improve the prediction accuracy. As part of the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge, we showed that multiple data integration methods could identify multiple oncogenes and tumor suppressor genes as signature genes. We showed that several models built through data integration outperformed benchmark models without data integration methods.