1,195 research outputs found

    Dissection of a complex transcriptional response using genome-wide transcriptional modelling

    Modern genomics technologies generate huge data sets, creating a demand for systems-level, experimentally verified analysis techniques. We examined the transcriptional response to DNA damage in a human T cell line (MOLT4) using microarrays. By measuring both mRNA accumulation and degradation over a short time course, we were able to construct a mechanistic model of the transcriptional response. The model predicted three dominant transcriptional activity profiles: an early response controlled by NFκB and c-Jun, a delayed response controlled by p53, and a late response related to cell cycle re-entry. The method also identified, with defined confidence limits, the transcriptional targets associated with each activity. Experimental inhibition of NFκB, c-Jun and p53 confirmed that target predictions were accurate. Model predictions directly explained 70% of the 200 most significantly upregulated genes in the DNA-damage response. Genome-wide transcriptional modelling (GWTM) requires no prior knowledge of either transcription factors or their targets. GWTM is an economical and effective method for identifying the main transcriptional activators in a complex response and confidently predicting their targets.

    Vasohibin-1 is identified as a master-regulator of endothelial cell apoptosis using gene network analysis.

    BACKGROUND: Apoptosis is a critical process in endothelial cell (EC) biology and pathology, which has been extensively studied at the protein level. Numerous gene expression studies of EC apoptosis have also been performed; however, few attempts have been made to use gene expression data to identify the molecular relationships and master regulators that underlie EC apoptosis. Therefore, we sought to understand these relationships by generating a Bayesian gene regulatory network (GRN) model. RESULTS: ECs were induced to undergo apoptosis using serum withdrawal and followed over a time course in triplicate, using microarrays. When generating the GRN, this EC time course data was supplemented by a library of microarray data from ECs treated with siRNAs targeting over 350 signalling molecules. The GRN model proposed Vasohibin-1 (VASH1) as one of the candidate master regulators of EC apoptosis, with numerous downstream mRNAs. To evaluate the role played by VASH1 in ECs, we used siRNA to reduce the expression of VASH1. Of 10 mRNAs downstream of VASH1 in the GRN that were examined, 7 were significantly up- or down-regulated in the direction predicted by the GRN. Further supporting an important biological role of VASH1 in ECs, targeted reduction of VASH1 mRNA abundance conferred resistance to serum withdrawal-induced EC death. CONCLUSION: We have utilised Bayesian GRN modelling to identify a novel candidate master regulator of EC apoptosis. This study demonstrates how GRN technology can complement traditional methods to hypothesise the regulatory relationships that underlie important biological processes.

    Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining

    Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters

    Transcription of long noncoding RNAs (lncRNAs) within gene regulatory elements can modulate gene activity in response to external stimuli, but the scope and functions of such activity are not known. Here we use an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations. We identify 216 transcribed regions that encode putative lncRNAs, many with RT-PCR–validated periodic expression during the cell cycle, show altered expression in human cancers and are regulated in expression by specific oncogenic stimuli, stem cell differentiation or DNA damage. DNA damage induces five lncRNAs from the CDKN1A promoter, and one such lncRNA, named PANDA, is induced in a p53-dependent manner. PANDA interacts with the transcription factor NF-YA to limit expression of pro-apoptotic genes; PANDA depletion markedly sensitized human fibroblasts to apoptosis by doxorubicin. These findings suggest potentially widespread roles for promoter lncRNAs in cell-growth control. Funding: National Institutes of Health (U.S.); National Institute of Arthritis and Musculoskeletal and Skin Diseases (U.S.) (NIAMS K08-AR054615); National Cancer Institute (U.S.) (NIH/NCI R01-CA118750); National Cancer Institute (U.S.) (NIH/NCI R01-CA130795); Juvenile Diabetes Research Foundation International; American Cancer Society; Howard Hughes Medical Institute (Early career scientist); Stanford University (Graduate Fellowship); National Science Foundation (U.S.) (Graduate Research Fellowship); United States. Dept. of Defense (National Defense Science and Engineering Graduate Fellowship)

    Machine learning methods for genomic high-content screen data analysis applied to deduce organization of endocytic network

    High-content screens are widely used to gain insight into the mechanistic organization of biological systems. Chemical and/or genomic interferences are used to modulate molecular machinery, then light microscopy and quantitative image analysis yield a large number of parameters describing the phenotype. However, extracting functional information from such high-content datasets (e.g. links between cellular processes or functions of unknown genes) remains challenging. This work is devoted to the analysis of a multi-parametric image-based genomic screen of endocytosis, the process whereby cells take up cargoes (signals and nutrients) and distribute them into different subcellular compartments. The complexity of the quantitative endocytic data was approached using different machine learning techniques, namely clustering methods, Bayesian networks, principal and independent component analysis, and artificial neural networks. The main goal of such an analysis is to predict possible modes of action of screened genes and also to find candidate genes that may be involved in a process of interest. The degrees of freedom of the multidimensional phenotypic space were identified using the data distributions, and then the high-content data were deconvolved into separate signals from different cellular modules. Some of those basic signals (phenotypic traits) were straightforward to interpret in terms of known molecular processes; the other components gave insight into interesting directions for further research. The phenotypic profiles of perturbations of individual genes are sparse in the coordinates of the basic signals and therefore intrinsically suggest their functional roles in cellular processes. Because endocytosis is a fundamental process that is specifically modulated by a variety of different pathways in the cell, endocytic phenotyping can be used for analysis of non-endocytic modules in the cell.
    The proposed approach can also be generalized to the analysis of other high-content screens.
    Contents:
    Objectives
    Chapter 1 Introduction: 1.1 High-content biological data; 1.1.1 Different perturbation types for HCS; 1.1.2 Types of observations in HTS; 1.1.3 Goals and outcomes of MP HTS; 1.1.4 An overview of the classical methods of analysis of biological HT- and HCS data; 1.2 Machine learning for systems biology; 1.2.1 Feature selection; 1.2.2 Unsupervised learning; 1.2.3 Supervised learning; 1.2.4 Artificial neural networks; 1.3 Endocytosis as a system process; 1.3.1 Endocytic compartments and main players; 1.3.2 Relation to other cellular processes
    Chapter 2 Experimental and analytical techniques: 2.1 Experimental methods; 2.1.1 RNA interference; 2.1.2 Quantitative multiparametric image analysis; 2.2 Detailed description of the endocytic HCS dataset; 2.2.1 Basic properties of the endocytic dataset; 2.2.2 Control subset of genes; 2.3 Machine learning methods; 2.3.1 Latent variables models; 2.3.2 Clustering; 2.3.3 Bayesian networks; 2.3.4 Neural networks
    Chapter 3 Results: 3.1 Selection of labeled data for training and validation based on KEGG information about genes pathways; 3.2 Clustering of genes; 3.2.1 Comparison of clustering techniques on control dataset; 3.2.2 Clustering results; 3.3 Independent components as basic phenotypes; 3.3.1 Algorithm for identification of the best number of independent components; 3.3.2 Application of ICA on the full dataset and on separate assays of the screen; 3.3.3 Gene annotation based on revealed phenotypes; 3.3.4 Searching for genes with target function; 3.4 Bayesian network on endocytic parameters; 3.4.1 Prediction of pathway based on parameters values using Naïve Bayesian Classifier; 3.4.2 General Bayesian Networks; 3.5 Neural networks; 3.5.1 Autoencoders as nonlinear ICA; 3.5.2 siRNA sequence motives discovery with deep NN; 3.6 Biological results; 3.6.1 Rab11 ZNF-specific phenotype found by ICA; 3.6.2 Structure of BN revealed dependency between endocytosis and cell adhesion
    Chapter 4 Discussion: 4.1 Machine learning approaches for discovery of phenotypic patterns; 4.1.1 Functional annotation of unknown genes based on phenotypic profiles; 4.1.2 Candidate genes search; 4.2 Adaptation to other HCS data and generalization
    Chapter 5 Outlook and future perspectives: 5.1 Handling sequence-dependent off-target effects with neural networks; 5.2 Transition between machine learning and systems biology models
    Acknowledgements
    References
    Appendix: A.1 Full list of cellular and endocytic parameters; A.2 Description of independent components of the full dataset; A.3 Description of independent components extracted from separate assays of the HC

    Gene expression analysis in breast cancer

    Breast cancer is the most common type of cancer among females, both in incidence and death. As meaningful biological understanding of the disease is confounded by the existence of various molecular groups and sub-groups, the challenge for targeted drug development may lie in understanding the molecular mechanisms of various sub-groups in breast cancer. An in-house breast cancer gene expression dataset comprising 17 normal and 104 tumour samples was analysed to identify important genes and pathways relevant to various clinical parameters. Our results identified groups of patients with similar expression profiles, the possible biology driving them and the clinical implications. Comparing Normal and Cancer specimens' gene expression profiles, TP53, along with cell cycle genes, were up-regulated in cancer samples. Embryonic stem cell pathway genes were up-regulated, while fatty acid biosynthesis pathways were down-regulated in tumors vs normal. The cancer specimens largely clustered with respect to ER status. Meta-analysis was performed on in-house datasets along with five public datasets to identify ER pathway genes. The analysis identified novel genes which had not been previously associated with ER-related pathways in cancer. Nuclear receptor pathways were up-regulated in ER-positive tumors/cell lines. Mining for ESR1-correlated genes across 5897 specimens identified FOXA1, SPDEF, C1ORF34 and GATA3 expression to be highly correlated. Three sub-clusters were identified among the ER-negative cluster. One represented ERBB2 over-expressing cluster. Additionally two unique groups of patients, with significant differences in survival, previously un-identified by other studies, were identified among the ER-negative cluster; a good prognosis cluster with high expression of Immune response genes; and a bad prognosis cluster with high expression of Ropporin, over-expression of which was also linked to high incidence of relapse in our study.
siRNA knockdown of Ropporin (ROPN1 and ROPN1B) in the M14 melanoma cell line impaired cancer cell motility and invasion. Knockdown of ROPN1B in MDA-MB-435s reduced motility. In the first study of its kind, our results validated the role of Ropporin in cancer cell motility and invasion. A list of 162 relapse-associated, prognostically important genes was used to develop a back-propagation neural network model to predict clinical outcomes. The model predicted relapse with 97.8% accuracy and outperformed existing models, indicating a strong possibility of its use as a diagnostic model.

    Integrative characterisation and prediction of the radiation response in radiation oncology

    A Novel Network Profiling Analysis Reveals System Changes in Epithelial-Mesenchymal Transition

    Patient-specific analysis of molecular networks is a promising strategy for making individual risk predictions and treatment decisions in cancer therapy. Although systems biology allows the gene network of a cell to be reconstructed from clinical gene expression data, traditional methods, such as Bayesian networks, only provide an averaged network for all samples. Therefore, these methods cannot reveal patient-specific differences in molecular networks during cancer progression. In this study, we developed a novel statistical method called NetworkProfiler, which infers patient-specific gene regulatory networks for a specific clinical characteristic, such as cancer progression, from gene expression data of cancer patients. We applied NetworkProfiler to microarray gene expression data from 762 cancer cell lines and extracted the system changes that were related to the epithelial-mesenchymal transition (EMT). Out of 1732 possible regulators of E-cadherin, a cell adhesion molecule that modulates the EMT, NetworkProfiler identified 25 candidate regulators, of which about half have been experimentally verified in the literature. In addition, we used NetworkProfiler to predict EMT-dependent master regulators that enhanced cell adhesion, migration, invasion, and metastasis. In order to further evaluate the performance of NetworkProfiler, we selected Krueppel-like factor 5 (KLF5) from a list of the remaining candidate regulators of E-cadherin and conducted in vitro validation experiments. As a result, we found that knockdown of KLF5 by siRNA significantly decreased E-cadherin expression and induced morphological changes characteristic of EMT. In addition, in vitro experiments of a novel candidate EMT-related microRNA, miR-100, confirmed the involvement of miR-100 in several EMT-related aspects, which was consistent with the predictions obtained by NetworkProfiler.