31 research outputs found

    A Genome-Wide Screen for Genetic Variants That Modify the Recruitment of REST to Its Target Genes

    Increasing numbers of human diseases are being linked to genetic variants, but our understanding of the mechanistic links leading from DNA sequence to disease phenotype is limited. The majority of disease-causing nucleotide variants fall within the non-protein-coding portion of the genome, making it likely that they act by altering gene regulatory sequences. We hypothesised that SNPs within the binding sites of the transcriptional repressor REST alter the degree of repression of target genes. Given that changes in the effective concentration of REST contribute to several pathologies (various cancers, Huntington's disease, cardiac hypertrophy, vascular smooth muscle proliferation), these SNPs should alter disease susceptibility in carriers. We devised a strategy to identify SNPs that affect the recruitment of REST to target genes through the alteration of its DNA recognition element, the RE1. A multi-step screen combining genetic, genomic, and experimental filters yielded 56 polymorphic RE1 sequences with robust and statistically significant differences of affinity between alleles. These SNPs have a considerable effect on the functional recruitment of REST to DNA in a range of in vitro, reporter gene, and in vivo analyses. Furthermore, we observe allele-specific biases in deeply sequenced chromatin immunoprecipitation data, consistent with predicted differences in RE1 affinity. Amongst the targets of polymorphic RE1 elements are important disease genes including NPPA, PTPRT, and CDH4. Thus, considerable genetic variation exists in the DNA motifs that connect gene regulatory networks. Recently available ChIP-seq data allow the annotation of human genetic polymorphisms with regulatory information to generate prior hypotheses about their disease-causing mechanism.
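
The allele-specific bias in ChIP-seq reads mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest possible test, an exact two-sided binomial test of the reference and alternate read counts at a heterozygous SNP against an expected 50:50 split. The abstract does not say which test the authors used, and the function name and example counts are hypothetical. Real analyses usually also have to correct for mapping bias toward the reference allele, which this sketch ignores.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for an allelic imbalance in read counts.

    ref_reads / alt_reads are ChIP-seq reads carrying each allele at one
    heterozygous SNP (hypothetical inputs); p is the expected reference
    fraction under the null hypothesis of no allelic bias.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of bias
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are at least as unlikely as the observed one.
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Hypothetical counts: 8 reads with one allele, 2 with the other.
    print(binomial_two_sided_p(8, 2))
```

A small p-value at a SNP inside an RE1 would be consistent with one allele recruiting REST more strongly, but on its own it does not separate a true affinity difference from technical bias.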

    Genetic effects on gene expression across human tissues

    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

    Understanding the transcriptional landscape of non-coding genome in mammals

    Widespread transcription in mammals has led to the unexpected discovery of non-coding elements such as long noncoding RNAs (lncRNAs) and repetitive elements. First, lncRNAs had previously been identified in only a limited number of mouse tissues and cell lines, and their discovery was still pending in many other mouse tissues. To address this, we applied a computational pipeline that discovered 2,803 high-confidence novel lncRNAs by mapping and de novo assembling billions of RNA-Seq reads from eight tissues and a primary cell line in mouse. Further, we integrated this catalog of lncRNAs with chromatin state maps and found many regulatory lncRNAs (promoter-associated and enhancer-associated lncRNAs). Second, more than half of the human genome consists of repetitive elements. However, it is not clear how they are expressed across mammalian tissues. To address this, as part of the Genotype-Tissue Expression (GTEx) project, we profiled repetitive elements using 8,551 poly-A RNA-Seq datasets from 53 tissues across 550 individuals and found various repeat families transcribed across multiple human tissues in a tissue-specific manner. In summary, to understand the transcriptional landscape of the non-coding genome, we analyzed RNA-Seq datasets across many mammalian tissues and show that non-coding elements such as lncRNAs and repetitive elements are not only transcribed but also tissue-specific. Together, this thesis work defines a unique collection of non-coding elements that are transcribed and tissue-specific in mammalian tissues.
    A large part of the mammalian genome is expressed as RNA, and it is now known that a large proportion of these transcripts are non-coding RNAs, called lncRNAs, and contain repetitive elements. In mouse, these have recently been identified in a limited number of tissues and cell lines. This thesis presents an exhaustive study of mouse lncRNAs in eight tissues and one cell line. In this work, 2,803 novel lncRNAs were discovered and assigned a regulatory function (promoter-associated or enhancer-associated) in the genome using chromatin state data. Likewise, more than half of the human genome contains repetitive elements. Unfortunately, the expression pattern of these repetitive elements in mammalian tissues is not known. As members of the GTEx (Genotype-Tissue Expression) project, we analyzed the expression of these repetitive elements in 8,551 poly-A RNA-Seq samples from 53 tissues of 550 individuals. We found that many families of repetitive elements are expressed in specific tissues across several individuals and represent a distinctive feature of the identity of each human tissue.

    The Long Noncoding RNA RMST Interacts with SOX2 to Regulate Neurogenesis

    Long noncoding RNAs (lncRNAs) are abundant in the mammalian transcriptome, and many are specifically expressed in the brain. We have identified a group of lncRNAs, including rhabdomyosarcoma 2-associated transcript (RMST), which are indispensable for neurogenesis. Here, we provide mechanistic insight into the role of human RMST in modulating neurogenesis. RMST expression is specific to the brain, regulated by the transcriptional repressor REST, and increases during neuronal differentiation, indicating a role in neurogenesis. RMST physically interacts with SOX2, a transcription factor known to regulate neural fate. RMST and SOX2 coregulate a large pool of downstream genes implicated in neurogenesis. Through RNA interference and genome-wide SOX2 binding studies, we found that RMST is required for the binding of SOX2 to promoter regions of neurogenic transcription factors. These results establish the role of RMST as a transcriptional coregulator of SOX2 and a key player in the regulation of neural stem cell fate. ASTAR (Agency for Sci., Tech. and Research, S’pore)

    Genome-wide computational identification and manual annotation of human long noncoding RNA genes

    Experimental evidence suggests that half or more of the mammalian transcriptome consists of noncoding RNA. Noncoding RNAs are divided into short noncoding RNAs (including microRNAs) and long noncoding RNAs (lncRNAs). We defined complementary DNAs (cDNAs) lacking any positive-strand open reading frames (ORFs) longer than 30 amino acids, as well as cDNAs lacking any evidence of interspecies conservation of their longer-than-30-amino acid ORFs, as noncoding. We have identified 5446 lncRNA genes in the human genome from ∼24,000 full-length cDNAs, using our new ORF-prediction pipeline. We combined them nonredundantly with lncRNAs from four published sources to derive 6736 lncRNA genes. In an effort to distinguish standalone and antisense lncRNA genes from database artifacts, we stratified our catalog of lncRNAs according to the distance between each lncRNA gene candidate and its nearest known protein-coding gene. We concurrently examined the protein-coding capacity of known genes overlapping with lncRNAs. Remarkably, 62% of known genes with “hypothetical protein” names actually lacked protein-coding capacity. This study has greatly expanded the known human lncRNA catalog, increased its accuracy through manual annotation of cDNA-to-genome alignments, and revealed that a large set of hypothetical-protein genes in GenBank lacks protein-coding capacity. In addition, we have developed, independently of existing NCBI tools, command-line programs with high-throughput ORF-finding and BLASTP-parsing functionality, suitable for future automated assessments of protein-coding capacity of novel transcripts.
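
The first noncoding criterion in this abstract (no positive-strand ORF longer than 30 amino acids) can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes an ORF runs from the first ATG in a frame to the next in-frame stop codon, ignores ORFs that run off the end of the cDNA, and does not cover the conservation criterion at all. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf_aa(cdna: str) -> int:
    """Length in amino acids of the longest ATG-to-stop ORF on the given strand.

    Only the three forward reading frames are scanned, matching the
    'positive-strand' wording; the stop codon itself is not counted.
    """
    seq = cdna.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None  # look for the next ORF in this frame
    return best


def lacks_long_orf(cdna: str, max_aa: int = 30) -> bool:
    """True if no forward-strand ORF is longer than max_aa amino acids."""
    return longest_forward_orf_aa(cdna) <= max_aa


if __name__ == "__main__":
    # Hypothetical cDNA: ATG followed by 30 alanine codons and a stop (31 aa).
    example = "ATG" + "GCT" * 30 + "TAA"
    print(longest_forward_orf_aa(example), lacks_long_orf(example))
```

Counting from the first ATG gives the longest possible ORF for each stop codon, which is the conservative choice when the goal is to rule out coding potential.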

    Chromatin and RNA maps reveal regulatory long noncoding RNAs in mouse

    Discovering and classifying long noncoding RNAs (lncRNAs) across all mammalian tissues and cell lines remains a major challenge. Previously, mouse lncRNAs were identified using transcriptome sequencing (RNA-seq) data from a limited number of tissues or cell lines. Additionally, associating a few hundred lncRNA promoters with chromatin states in a single mouse cell line has identified two classes of chromatin-associated lncRNA. However, the discovery and classification of lncRNAs is still pending in many other tissues in mouse. To address this, we built a comprehensive catalog of lncRNAs by combining known lncRNAs with high-confidence novel lncRNAs identified by mapping and de novo assembling billions of RNA-seq reads from eight tissues and a primary cell line in mouse. Next, we integrated this catalog of lncRNAs with multiple genome-wide chromatin state maps and found two different classes of chromatin state-associated lncRNAs, including promoter-associated (plncRNAs) and enhancer-associated (elncRNAs) lncRNAs, across various tissues. Experimental knockdown of an elncRNA resulted in the downregulation of the neighboring protein-coding Kdm8 gene, encoding a histone demethylase. Our findings provide 2,803 novel lncRNAs and a comprehensive catalog of chromatin-associated lncRNAs across different tissues in mouse. This work was supported by the Spanish MINECO BFU2010-19310, BFU2013-47736-P, Spanish MINECO SAF2013-48926-P, Centro de Excelencia Severo Ochoa 2013-2017 SEV-2012-020
