Identification of novel post-transcriptional features in olfactory receptor family mRNAs.
Olfactory receptor (Olfr) genes comprise the largest gene family in mice. Despite their importance in olfaction, how most Olfr mRNAs are regulated remains unexplored. Using RNA-seq analysis coupled with analysis of pre-existing databases, we found that Olfr mRNAs have several atypical features suggesting that post-transcriptional regulation impacts their expression. First, Olfr mRNAs, as a group, have dramatically higher average AU-content and lower predicted secondary structure than do control mRNAs. Second, Olfr mRNAs have a higher density of AU-rich elements (AREs) in their 3' UTR and upstream open reading frames (uORFs) in their 5' UTR than do control mRNAs. Third, Olfr mRNAs have shorter 3' UTRs with fewer predicted miRNA-binding sites. All of these novel properties correlated with higher Olfr expression. We also identified striking differences in the post-transcriptional features of the mRNAs from the two major classes of Olfr genes, a finding consistent with their independent evolutionary origin. Together, our results suggest that the Olfr gene family has encountered unusual selective forces in neural cells that have driven them to acquire unique post-transcriptional regulatory features. In support of this possibility, we found that while Olfr mRNAs are degraded by a deadenylation-dependent mechanism, they are largely protected from this decay in neural lineage cells
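The sequence features named in this abstract (AU-content, ARE density) can be illustrated with a minimal sketch. It assumes that an ARE is scored by counting the core AUUUA pentamer, which is a common simplification and not necessarily the definition used by the authors; the function names are hypothetical.

```python
def au_content(seq: str) -> float:
    """Fraction of A/U bases in a sequence (T is treated as U)."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        return 0.0
    return sum(base in "AU" for base in seq) / len(seq)


def count_are_pentamers(utr: str) -> int:
    """Count AUUUA pentamers, including overlapping ones.

    Assumption: the AUUUA core motif stands in for an ARE here;
    the abstract does not say how AREs were scored.
    """
    utr = utr.upper().replace("T", "U")
    return sum(utr.startswith("AUUUA", i) for i in range(len(utr) - 4))


def are_density_per_kb(utr: str) -> float:
    """ARE pentamers per 1,000 nt of UTR sequence."""
    if not utr:
        return 0.0
    return 1000.0 * count_are_pentamers(utr) / len(utr)
```

Comparing these values between a set of Olfr transcripts and a set of control transcripts would reproduce the kind of group-level contrast the abstract describes, although the thresholds and control set are not specified here.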
Systematic discovery of structural elements governing stability of mammalian messenger RNAs.
Decoding post-transcriptional regulatory programs in RNA is a critical step towards the larger goal of developing predictive dynamical models of cellular behaviour. Despite recent efforts, the vast landscape of RNA regulatory elements remains largely uncharacterized. A long-standing obstacle is the contribution of local RNA secondary structure to the definition of interaction partners in a variety of regulatory contexts, including--but not limited to--transcript stability, alternative splicing and localization. There are many documented instances where the presence of a structural regulatory element dictates alternative splicing patterns (for example, human cardiac troponin T) or affects other aspects of RNA biology. Thus, a full characterization of post-transcriptional regulatory programs requires capturing information provided by both local secondary structures and the underlying sequence. Here we present a computational framework based on context-free grammars and mutual information that systematically explores the immense space of small structural elements and reveals motifs that are significantly informative of genome-wide measurements of RNA behaviour. By applying this framework to genome-wide human mRNA stability data, we reveal eight highly significant elements with substantial structural information, for the strongest of which we show a major role in global mRNA regulation. Through biochemistry, mass spectrometry and in vivo binding studies, we identified human HNRPA2B1 (heterogeneous nuclear ribonucleoprotein A2/B1, also known as HNRNPA2B1) as the key regulator that binds this element and stabilizes a large number of its target genes. We created a global post-transcriptional regulatory map based on the identity of the discovered linear and structural cis-regulatory elements, their regulatory interactions and their target pathways. This approach could also be used to reveal the structural elements that modulate other aspects of RNA behaviour
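As a rough illustration of the mutual-information criterion mentioned in this abstract, the sketch below scores how informative a motif's presence is about a discretized measurement such as a stability bin. The discretization and the function name are assumptions made for the example; the context-free-grammar search over structural motifs is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences.

    xs: e.g. 1/0 flags for motif presence in each transcript.
    ys: e.g. a stability bin label for each transcript (hypothetical binning).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )
```

A motif whose presence perfectly tracks a two-way stability split scores 1 bit, while a motif distributed independently of stability scores 0.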
Predicting Genetic Regulatory Response Using Classification
We present a novel classification-based method for learning to predict gene
regulatory response. Our approach is motivated by the hypothesis that in simple
organisms such as Saccharomyces cerevisiae, we can learn a decision rule for
predicting whether a gene is up- or down-regulated in a particular experiment
based on (1) the presence of binding site subsequences ("motifs") in the
gene's regulatory region and (2) the expression levels of regulators such as
transcription factors in the experiment ("parents"). Thus our learning task
integrates two qualitatively different data sources: genome-wide cDNA
microarray data across multiple perturbation and mutant experiments along with
motif profile data from regulatory sequences. We convert the regression task of
predicting real-valued gene expression measurement to a classification task of
predicting +1 and -1 labels, corresponding to up- and down-regulation beyond
the levels of biological and measurement noise in microarray measurements. The
learning algorithm employed is boosting with a margin-based generalization of
decision trees, alternating decision trees. This large-margin classifier is
sufficiently flexible to allow complex logical functions, yet sufficiently
simple to give insight into the combinatorial mechanisms of gene regulation. We
observe encouraging prediction accuracy on experiments based on the Gasch S.
cerevisiae dataset, and we show that we can accurately predict up- and
down-regulation on held-out experiments. Our method thus provides predictive
hypotheses, suggests biological experiments, and provides interpretable insight
into the structure of genetic regulatory networks.
Comment: 8 pages, 4 figures, presented at the Twelfth International Conference on
Intelligent Systems for Molecular Biology (ISMB 2004), supplemental website:
http://www.cs.columbia.edu/compbio/geneclas
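The conversion from a regression target to +1/-1 labels described in this abstract can be sketched as follows. The noise threshold value and the choice to drop measurements that fall within the noise band are assumptions made for illustration; the abstract does not give the cutoff, and the boosting step with alternating decision trees is not shown.

```python
def discretize_expression(log_ratios, noise_threshold=1.0):
    """Map real-valued log expression ratios to classification labels.

    Returns +1 for up-regulation, -1 for down-regulation, and None for
    values within the noise band (assumed here to be excluded from training).
    noise_threshold is a hypothetical cutoff, not a value from the paper.
    """
    labels = []
    for value in log_ratios:
        if value > noise_threshold:
            labels.append(1)
        elif value < -noise_threshold:
            labels.append(-1)
        else:
            labels.append(None)
    return labels
```

For example, `discretize_expression([2.0, -1.5, 0.3])` labels the first gene as up-regulated, the second as down-regulated, and leaves the third unlabeled.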
Transposon variants and their effects on gene expression in Arabidopsis
Transposable elements (TEs) make up the majority of many plant genomes. Their transcription and transposition are controlled through siRNAs and epigenetic marks including DNA methylation. To dissect the interplay of siRNA–mediated regulation and TE evolution, and to examine how TE differences affect nearby gene expression, we investigated genome-wide differences in TEs, siRNAs, and gene expression among three Arabidopsis thaliana accessions. Both TE sequence polymorphisms and presence of linked TEs are positively correlated with intraspecific variation in gene expression. The expression of genes within 2 kb of conserved TEs is more stable than that of genes next to variant TEs harboring sequence polymorphisms. Polymorphism levels of TEs and closely linked adjacent genes are positively correlated as well. We also investigated the distribution of 24-nt-long siRNAs, which mediate TE repression. TEs targeted by uniquely mapping siRNAs are on average farther from coding genes, apparently because they more strongly suppress expression of adjacent genes. Furthermore, siRNAs, and especially uniquely mapping siRNAs, are enriched in TE regions missing in other accessions. Thus, targeting by uniquely mapping siRNAs appears to promote sequence deletions in TEs. Overall, our work indicates that siRNA–targeting of TEs may influence removal of sequences from the genome and hence evolution of gene expression in plants
Creating New β-Globin-Expressing Lentiviral Vectors by High-Resolution Mapping of Locus Control Region Enhancer Sequences.
Hematopoietic stem cell gene therapy is a promising approach for treating disorders of the hematopoietic system. Identifying combinations of cis-regulatory elements that do not impede packaging or transduction efficiency when included in lentiviral vectors has proven challenging. In this study, we deploy LV-MPRA (lentiviral vector-based, massively parallel reporter assay), an approach that simultaneously analyzes thousands of synthetic DNA fragments in parallel to identify sequence-intrinsic and lineage-specific enhancer function at near-base-pair resolution. We demonstrate the power of LV-MPRA in elucidating the boundaries of previously unknown intrinsic enhancer sequences of the human β-globin locus control region. Our approach facilitated the rapid assembly of novel therapeutic βAS3-globin lentiviral vectors harboring strong lineage-specific recombinant control elements capable of correcting a mouse model of sickle cell disease. LV-MPRA can be used to map any genomic locus for enhancer activity and facilitates the rapid development of therapeutic vectors for treating disorders of the hematopoietic system or other specific tissues and cell types
Statistical analysis of simple repeats in the human genome
The human genome contains repetitive DNA at different levels of sequence
length, number and dispersion. Highly repetitive DNA is particularly rich in
homo- and di-nucleotide repeats, while middle repetitive DNA is rich in
families of interspersed, mobile elements hundreds of base pairs (bp) long,
among them the Alu families. A link between homo- and di-polymeric tracts and
mobile elements has been recently highlighted. In particular, the mobility of
Alu repeats, which form 10% of the human genome, has been correlated with the
length of poly(A) tracts located at one end of the Alu. These tracts have a
rigid and non-bendable structure and have an inhibitory effect on nucleosomes,
which normally compact the DNA. We performed a statistical analysis of the
genome-wide distribution of lengths and inter-tract separations of poly(X) and
poly(XY) tracts in the human genome. Our study shows that in humans the length
distributions of these sequences reflect the dynamics of their expansion and
DNA replication. By means of general tools from linguistics, we show that the
latter play the role of highly-significant content-bearing terms in the DNA
text. Furthermore, we find that such tracts are positioned in a non-random
fashion, with an apparent periodicity of 150 bases. This allows us to extend
the link between repetitive, highly mobile elements such as Alus and
low-complexity words in human DNA. More precisely, we show that Alus are
sources of poly(X) tracts, which in turn affect in a subtle way the combination
and diversification of gene expression and the fixation of multigene families
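The two quantities analyzed in this abstract, poly(X) tract lengths and inter-tract separations, can be extracted with a short sketch. The minimum tract length and the definition of separation (bases between the end of one tract and the start of the next) are assumptions made for the example, and poly(XY) tracts are not handled.

```python
import re


def homopolymer_tracts(seq, min_len=3):
    """Return (start, base, length) for each run of one repeated base.

    min_len is a hypothetical cutoff (must be >= 1); the abstract does
    not state the minimum tract length used in the study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


def inter_tract_separations(tracts):
    """Number of bases between the end of each tract and the start of the next."""
    return [
        nxt[0] - (cur[0] + cur[2])
        for cur, nxt in zip(tracts, tracts[1:])
    ]
```

Histograms of the tract lengths and of the separations, computed over a whole chromosome, are the kind of distributions the study examines; the reported periodicity would show up as regularly spaced peaks in the separation histogram.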
TF2Network : predicting transcription factor regulators and gene regulatory networks in Arabidopsis using publicly available binding site information
A gene regulatory network (GRN) is a collection of regulatory interactions between transcription factors (TFs) and their target genes. GRNs control different biological processes and have been instrumental in understanding the organization and complexity of gene regulation. Although various experimental methods have been used to map GRNs in Arabidopsis thaliana, their limited throughput combined with the large number of TFs means that for many genes our knowledge of the regulating TFs is incomplete. We introduce TF2Network, a tool that exploits the vast amount of TF binding site information and enables the delineation of GRNs by detecting potential regulators for a set of co-expressed or functionally related genes. Validation using two experimental benchmarks reveals that TF2Network predicts the correct regulator in 75-92% of the test sets. Furthermore, our tool is robust to noise in the input gene sets, has a low false discovery rate, and recovers correct regulators better than other plant tools. TF2Network is accessible through a web interface where GRNs are interactively visualized and annotated with various types of experimental functional information. TF2Network was used to perform systematic functional and regulatory gene annotations, identifying new TFs involved in circadian rhythm and stress response
Transcriptionally active HERV-H retrotransposons demarcate topologically associating domains in human pluripotent stem cells.
Chromatin architecture has been implicated in cell type-specific gene regulatory programs, yet how chromatin remodels during development remains to be fully elucidated. Here, by interrogating chromatin reorganization during human pluripotent stem cell (hPSC) differentiation, we discover a role for the primate-specific endogenous retrotransposon human endogenous retrovirus subfamily H (HERV-H) in creating topologically associating domains (TADs) in hPSCs. Deleting these HERV-H elements eliminates their corresponding TAD boundaries and reduces the transcription of upstream genes, while de novo insertion of HERV-H elements can introduce new TAD boundaries. The ability of HERV-H to create TAD boundaries depends on high transcription, as transcriptional repression of HERV-H elements prevents the formation of boundaries. This ability is not limited to hPSCs, as these actively transcribed HERV-H elements and their corresponding TAD boundaries also appear in pluripotent stem cells from other hominids but not in more distantly related species lacking HERV-H elements. Overall, our results provide direct evidence for retrotransposons in actively shaping cell type- and species-specific chromatin architecture
Cell-type specific analysis of translating RNAs in developing flowers reveals new levels of control
Determining both the expression levels of mRNA and the regulation of its translation is important in understanding specialized cell functions. In this study, we describe both the expression profiles of cells within spatiotemporal domains of the Arabidopsis thaliana flower and the post-transcriptional regulation of these mRNAs, at nucleotide resolution. We express a tagged ribosomal protein under the promoters of three master regulators of flower development. By precipitating tagged polysomes, we isolated cell type specific mRNAs that are probably translating, and quantified those mRNAs through deep sequencing. Cell type comparisons identified known cell-specific transcripts and uncovered many new ones, from which we inferred cell type-specific hormone responses, promoter motifs and coexpressed cognate binding factor candidates, and splicing isoforms. By comparing translating mRNAs with steady-state overall transcripts, we found evidence for widespread post-transcriptional regulation at both the intron splicing and translational stages. Sequence analyses identified structural features associated with each step. Finally, we identified a new class of noncoding RNAs associated with polysomes. Findings from our profiling lead to new hypotheses in the understanding of flower development