1,010 research outputs found

    In silico prediction of lncRNA function using tissue specific and evolutionary conserved expression

    BACKGROUND: In recent years long non coding RNAs (lncRNAs) have been the subject of increasing interest. Thanks to many recent functional studies, the existence of a large class of lncRNAs with potential regulatory functions is now widely accepted. Although an increasing number of lncRNAs is being characterized and shown to be involved in many biological processes, the functions of the vast majority of lncRNA genes are still unknown. Therefore, computational methods able to take advantage of the increasing amount of publicly available data to predict lncRNA functions could be very useful. RESULTS: Since coding genes are much better annotated than lncRNAs, we attempted to project known functional information regarding proteins onto non coding genes using the guilt by association principle: if a gene shows an expression profile that correlates with those of a set of coding genes involved in a given function, that gene is probably involved in the same function. We computed gene coexpression for 30 human tissues and 9 vertebrates and mined the resulting networks with a methodology inspired by the rank product algorithm used to identify differentially expressed genes. Using different types of reference data we can predict putative new annotations for thousands of lncRNAs and proteins, ranging from cellular localization to relevance for disease and cancer. CONCLUSIONS: New functions of coding genes and lncRNAs can be profitably predicted using tissue specific coexpression, as well as expression of orthologous genes in different species. The data are available for download and through a user-friendly web interface at www.funcpred.com. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1535-x) contains supplementary material, which is available to authorized users.
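    The guilt by association idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy expression values, and the choice of mean Pearson correlation followed by a geometric mean of ranks across tissues are all assumptions made for illustration, loosely modeled on the rank product idea the abstract mentions.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def association_scores(expr, gene_set):
    """Mean correlation of every gene outside gene_set with the members of gene_set."""
    return {
        gene: mean(pearson(profile, expr[member]) for member in gene_set)
        for gene, profile in expr.items()
        if gene not in gene_set
    }


def rank_product(per_tissue_scores):
    """Geometric mean of per-tissue ranks (rank 1 = strongest association)."""
    genes = list(per_tissue_scores[0])
    log_sum = {g: 0.0 for g in genes}
    for scores in per_tissue_scores:
        ordered = sorted(genes, key=lambda g: scores[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)
    k = len(per_tissue_scores)
    return {g: math.exp(log_sum[g] / k) for g in genes}


# Hypothetical toy data: two tissues, four samples each.
# codingA and codingB stand in for a set of coding genes annotated to one function.
tissue1 = {"codingA": [1, 2, 3, 4], "codingB": [2, 4, 6, 8],
           "lnc1": [1, 3, 5, 7], "lnc2": [4, 3, 2, 1]}
tissue2 = {"codingA": [5, 1, 3, 2], "codingB": [10, 2, 6, 4],
           "lnc1": [6, 2, 4, 3], "lnc2": [1, 5, 3, 4]}
gene_set = {"codingA", "codingB"}

rp = rank_product([association_scores(t, gene_set) for t in (tissue1, tissue2)])
best = min(rp, key=rp.get)  # lowest rank product = most consistent association
print(best, rp)
```

    In this toy setup the candidate with the lowest rank product is the one that correlates with the annotated gene set most consistently across tissues. A real analysis would also need a significance estimate, for example by permutation, which is omitted here.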

    Non-coding genome contributions to the development and evolution of mammalian organs

    Protein-coding sequences only cover 1-2% of a typical mammalian genome. The remaining non-coding space hides thousands of genomic elements, some of which act via their DNA sequence while others are transcribed into non-coding RNAs. Many well-characterized non-coding elements are involved in the regulation of other genes, a process essential for the emergence of different cell types and organs during development. Changes in the expression of conserved genes during development are in turn thought to facilitate evolutionary innovation in form and function. Thus, non-coding genomic elements are hypothesized to play important roles in developmental and evolutionary processes. However, challenges related to the identification and characterization of these elements, in particular in non-model organisms, have limited the study of their overall contributions to mammalian organ development and evolution. During my dissertation work, I addressed this gap by studying two major classes of non-coding elements, long non-coding RNAs (lncRNAs) and cis-regulatory elements (CREs). In the first part of my thesis, I analyzed the expression profiles of lncRNAs during the development of seven major organs in six mammals and a bird. I showed that, unlike protein-coding genes, only a small fraction of lncRNAs is expressed in reproducibly dynamic patterns during organ development. These lncRNAs are enriched for a series of features associated with functional relevance, including increased evolutionary conservation and regulatory complexity, highlighting them as candidates for further molecular characterization. I then associated these lncRNAs with specific genes and functions based on their spatiotemporal expression profiles. My analyses also revealed differences in lncRNA contributions across organs and developmental stages, identifying a developmental transition from broadly expressed and conserved lncRNAs towards an increasing number of lineage- and organ-specific lncRNAs.
Following up on these global analyses, I then focused on a newly-identified lncRNA in the marsupial opossum, Female Specific on chromosome X (FSX). The broad and likely autonomous female-specific expression of FSX suggests a role in marsupial X-chromosome inactivation (XCI). I showed that FSX shares many expression and sequence features with another lncRNA, RSX — a known regulator of XCI in marsupials. Comparisons to other marsupials revealed that both RSX and FSX emerged in the common marsupial ancestor and have since been preserved in marsupial genomes, while their broad and female-specific expression has been retained for at least 76 million years of evolution. Taken together, my analyses highlighted FSX as a novel candidate for regulating marsupial XCI. In the third part of this work, I shifted my focus to CREs and their cell type-specific activities in the developing mouse cerebellum. After annotating cerebellar cell types and states based on single-cell chromatin accessibility data, I identified putative CREs and characterized their spatiotemporal activity across cell types and developmental stages. Focusing on progenitor cells, I described temporal changes in CRE activity that are shared between early germinal zones, supporting a model of cell fate induction through common developmental cues. By examining chromatin accessibility dynamics during neuronal differentiation, I revealed a gradual divergence in the regulatory programs of major cerebellar neuron types. In the final part, I explored the evolutionary histories of CREs and their potential contributions to gene expression changes between species. By comparing mouse CREs to vertebrate genomes and chromatin accessibility profiles from the marsupial opossum, I identified a temporal decrease in CRE conservation, which is shared across cerebellar cell types. However, I also found differences in constraint between cell types, with microglia having the fastest evolving CREs in the mouse cerebellum. 
Finally, I used deep learning models to study the regulatory grammar of cerebellar cell types in human and mouse, showing that the sequence rules determining CRE activity are conserved across mammals. I then used these models to retrace the evolutionary changes leading to divergent CRE activity between species. Collectively, my PhD work provides insights into the evolutionary dynamics of non-coding genes and regulatory elements, the processes associated with their conservation, and their contributions to the development and evolution of mammalian cell types and organs.

    Exploring lncRNAs in cancer : tools for discovery and characterization of cancer associated lncRNAs


    In silico prediction of housekeeping long intergenic non-coding RNAs reveals HKlincR1 as an essential player in lung cancer cell survival

    Prioritising long intergenic noncoding RNAs (lincRNAs) for functional characterisation is a significant challenge. Here we applied computational approaches to discover lincRNAs expected to play a critical housekeeping (HK) role within the cell. Using the Illumina Human BodyMap RNA sequencing dataset as a starting point, we first identified lincRNAs ubiquitously expressed across a panel of human tissues. This list was then further refined by reference to conservation score, secondary structure and promoter DNA methylation status. Finally, we used tumour expression and copy number data to identify lincRNAs rarely downregulated or deleted in multiple tumour types. The resulting list of candidate essential lincRNAs was then subjected to co-expression analyses using independent data from ENCODE and The Cancer Genome Atlas (TCGA). This identified a substantial subset with a predicted role in DNA replication and cell cycle regulation. One of these, HKlincR1, was selected for further characterisation. Depletion of HKlincR1 affected cell growth in multiple lung cancer cell lines, and led to disruption of genes involved in cell growth and viability. In addition, HKlincR1 expression was correlated with overall survival in lung adenocarcinoma patients. Our in silico studies therefore reveal a set of housekeeping noncoding RNAs of interest both in terms of their role in normal homeostasis, and their relevance in tumour growth and maintenance.

    Global identification and analysis of long non-coding RNAs in diploid strawberry Fragaria vesca during flower and fruit development

    Length distribution of small RNAs derived from lncRNAs. Figure S2. Characterization of strawberry lncRNAs. Figure S3. Heatmaps showing tissue-specific expression patterns of lncRNAs. Figure S4. Expression correlations between lncRNAs and the adjacent PC genes. Figure S5. Negative correlations of lncRNA expression with PC genes across the genome. (PDF 12315 kb)

    T-ALL and thymocytes : a message of noncoding RNAs

    In the last decade, the role for noncoding RNAs in disease was clearly established, starting with microRNAs and later expanded towards long noncoding RNAs. This was also the case for T cell acute lymphoblastic leukemia, which is a malignant blood disorder arising from oncogenic events during normal T cell development in the thymus. By studying the transcriptomic profile of protein-coding genes, several oncogenic events leading to T cell acute lymphoblastic leukemia (T-ALL) could be identified. In recent years, it became apparent that several of these oncogenes function via microRNAs and long noncoding RNAs. In this review, we give a detailed overview of the studies that describe the noncoding RNAome in T-ALL oncogenesis and normal T cell development.

    Understanding the Implications of Anandamide, an Endocannabinoid in an Early Land Plant, Physcomitrella patens

    Endocannabinoid signaling is well studied in mammals and known to be involved in numerous pathological and physiological processes. Fatty acid amide hydrolase (FAAH) terminates endocannabinoid signaling in mammals. In Physcomitrella patens, we identified nine orthologs of FAAH (PpFAAH1 to PpFAAH9) with the characteristic catalytic triad and amidase signature sequence. Kinetics of PpFAAH1 showed specificity towards anandamide (AEA) at 37°C and pH 8.0. Further biophysical and bioinformatic analyses revealed that, structurally, PpFAAH1 to PpFAAH4 were closely associated to the plant FAAH whereas PpFAAH6 to PpFAAH9 were more closely associated to the animal FAAH. A substrate entry gate or ‘dynamic paddle’ in FAAH is fully formed in vertebrates but absent or not fully developed in non-vertebrates and plants. In planta analysis revealed that PpFAAH responded differently with saturated and unsaturated N-acylethanolamines (NAEs). In vivo amidohydrolase activity showed specificity associated with developmental stages. Additionally, overexpression of PpFAAH1 indicated the need for NAEs in developmental transition. To understand and identify key molecules related to endocannabinoid signaling in P. patens, we used high-throughput RNA sequencing. We analyzed temporal expression of mRNA and long non-coding RNA (lncRNA) in response not only to exogenous anandamide but also its precursor arachidonic acid and abscisic acid (ABA, a stress hormone). From the 40 RNA-seq libraries generated, we identified 4244 novel lncRNAs. The highest number of differentially expressed genes (DEGs) for both mRNA and lncRNA were detected on short-term exposure (1 h) to AEA. Furthermore, gene ontology enrichment analysis showed that 17 genes related to activation of the G protein-coupled receptor signaling pathway were highly expressed along with a number of genes associated with organelle relocation and localization. 
We identified key signaling components of AEA that showed significant differences when compared with ABA. This study provides a fundamental understanding of novel endocannabinoid signaling in early land plants and a future direction to elucidate its functional role.

    Functional analysis and transcriptional output of the Göttingen minipig genome

    In the past decade the Göttingen minipig has gained increasing recognition as an animal model in pharmaceutical and safety research because it recapitulates many aspects of human physiology and metabolism. Genome-based comparison of drug targets together with quantitative tissue expression analysis allows rational prediction of pharmacology and cross-reactivity of human drugs in animal models, thereby improving drug attrition, which is an important challenge in the process of drug development. Here we present a new chromosome level based version of the Göttingen minipig genome together with a comparative transcriptional analysis of tissues with pharmaceutical relevance as a basis for translational research. We relied on mapping and assembly of WGS (whole-genome-shotgun sequencing) derived reads to the reference genome of the Duroc pig and predict 19,228 human orthologous protein-coding genes. Genome-based prediction of the sequence of human drug targets enables the prediction of drug cross-reactivity based on conservation of binding sites. We further support the finding that the genome of Sus scrofa contains about ten times fewer pseudogenized genes compared to other vertebrates. Among the functional human orthologs of these minipig pseudogenes we found HEPN1, a putative tumor suppressor gene. The genomes of Sus scrofa, the Tibetan boar, the African Bushpig, and the Warthog show sequence conservation of all inactivating HEPN1 mutations, suggesting disruption before the evolutionary split of these pig species. We identify 133 Sus scrofa specific, conserved long non-coding RNAs (lncRNAs) in the minipig genome and show that these transcripts are highly conserved in the African pigs and the Tibetan boar, suggesting functional significance. Using a new minipig specific microarray we show high conservation of gene expression signatures in 13 tissues with biomedical relevance between humans and adult minipigs.
We underline this relationship for minipig and human liver, where we could demonstrate similar expression levels for most phase I drug-metabolizing enzymes. Higher expression levels and metabolic activities were found for FMO1, AKR/CRs and for phase II drug metabolizing enzymes in minipig as compared to human. The variability of gene expression in equivalent human and minipig tissues is considerably higher in minipig organs, which is important for study design in case a human target belongs to this variable category in the minipig. The first analysis of gene expression in multiple tissues during development from young to adult shows that the majority of transcriptional programs are concluded four weeks after birth. This finding is in line with the advanced state of human postnatal organ development at comparative age categories and further supports the minipig as a model for pediatric drug safety studies. Genome based assessment of sequence conservation combined with gene expression data in several tissues improves the translational value of the minipig for human drug development. The genome and gene expression data presented here are important resources for researchers using the minipig as a model for biomedical research or commercial breeding. The potential impact of our data for comparative genomics, translational research, and experimental medicine is discussed.