213 research outputs found

    Systematic Identification of Independent Functional Non-coding RNA Genes in Oxytricha trifallax

    Functional noncoding RNAs participate in a variety of biological processes: for example, modulating translation, catalyzing biochemical reactions, and sensing environments. Independent of conventional approaches such as transcriptomics and computational comparative analysis, we took advantage of the unusual genomic organization of the ciliated unicellular protozoan Oxytricha trifallax to screen for eukaryotic independent functional noncoding RNA genes. The Oxytricha macronuclear genome consists of thousands of gene-sized nanochromosomes, each of which usually contains only a single gene. Using a draft Oxytricha trifallax genome assembly and a custom-written noncoding nanochromosome classifier, we identified a subset of nanochromosomes that lack any detectable protein coding gene, thereby strongly enriching for nanochromosomes that carry noncoding RNA genes. Surprisingly, we found only a small proportion of noncoding nanochromosomes, suggesting that Oxytricha has few independent functional noncoding RNA genes besides homologs of already known noncoding RNAs. Other than new members of known noncoding RNA classes including C/D and H/ACA snoRNAs, our screen identified a single novel family of small RNA genes, named the Arisong RNAs, which share some of the features of small nuclear RNAs. The small number of novel independent functional noncoding RNA genes identified in this screen contrasts with numerous recent reports of a large number of noncoding RNAs in a variety of eukaryotes. We think the difficulty of distinguishing functional noncoding RNA genes from other sources of putative noncoding RNAs has been underestimated.

    Identifying ceRNA Networks Associated With the Susceptibility and Persistence of Atrial Fibrillation Through Weighted Gene Co-Expression Network Analysis

    Background: Atrial fibrillation (AF) is the most common arrhythmia. We aimed to construct competing endogenous RNA (ceRNA) networks associated with the susceptibility and persistence of AF by applying weighted gene co-expression network analysis (WGCNA) and to prioritize key genes using the random walk with restart on multiplex networks (RWR-M) algorithm. Methods: RNA sequencing results from 235 left atrial appendage samples were downloaded from the GEO database. The top 5,000 lncRNAs/mRNAs with the highest variance were used to construct a gene co-expression network using the WGCNA method. AF susceptibility- or persistence-associated modules were identified by correlating the module eigengene with the atrial rhythm phenotype. In a module-specific manner, ceRNA pairs of lncRNA–mRNA were predicted. The RWR-M algorithm was applied to calculate the proximity between lncRNAs and known AF protein-coding genes. Random forest classifiers, based on the expression value of key lncRNA-associated ceRNA pairs, were constructed and validated against an independent data set. Results: From the 21 identified modules, magenta and tan modules were associated with AF susceptibility, whereas turquoise and yellow modules were associated with AF persistence. ceRNA networks in magenta and tan modules were primarily involved in the inflammatory process, whereas ceRNA networks in turquoise and yellow modules were primarily associated with electrical remodeling. A total of 106 previously identified AF-associated protein-coding genes were found in the ceRNA networks, including 16 that were previously implicated in the genome-wide association study. Myocardial infarction–associated transcript (MIAT) and LINC00964 were prioritized as key lncRNAs through RWR-M. The classifiers based on their associated ceRNA pairs were able to distinguish AF from sinus rhythm with respective AUC values of 0.810 and 0.940 in the training set and 0.870 and 0.922 in the independent test set. The AF-related single-nucleotide polymorphism rs35006907 was found in the intronic region of LINC00964 and negatively regulated LINC00964 expression. Conclusion: Our study constructed AF susceptibility- and persistence-associated ceRNA networks, linked genetics with epigenetics, identified MIAT and LINC00964 as key lncRNAs, and constructed random forest classifiers based on their associated ceRNA pairs. These results will help us to better understand the mechanisms underlying AF from the ceRNA perspective and provide candidate therapeutic and diagnostic tools.
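
The abstract above ranks lncRNAs by their network proximity to known AF genes using random walk with restart. As a rough illustration of the general idea only, the sketch below runs a plain single-layer random walk with restart on a toy graph. It is not the multiplex RWR-M variant used in the study, and the node names, edge weights, and restart probability of 0.7 are all hypothetical choices made for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-12, max_iter=10_000):
    """Single-layer random walk with restart (illustrative sketch).

    adjacency: dict mapping node -> {neighbor: weight}; assumed symmetric.
    seeds:     set of nodes the walker restarts from (e.g. known disease genes).
    Returns a dict of steady-state visiting probabilities (proximity scores).
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            total = sum(adjacency[u].values())
            if total == 0:
                # Isolated node: keep its mass in place so probabilities sum to 1.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            # Otherwise move to a neighbor in proportion to edge weight.
            for v, w in adjacency[u].items():
                nxt[v] += (1.0 - restart) * p[u] * w / total
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: a chain lncA - geneX - geneY - geneZ.
toy_graph = {
    "lncA": {"geneX": 1.0},
    "geneX": {"lncA": 1.0, "geneY": 1.0},
    "geneY": {"geneX": 1.0, "geneZ": 1.0},
    "geneZ": {"geneY": 1.0},
}
scores = random_walk_with_restart(toy_graph, seeds={"geneX"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Nodes that end up with higher scores are closer to the seed set in the network sense, which is the intuition behind using such scores to prioritize candidates. A multiplex version would additionally let the walker jump between network layers.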

    Computational Tools for Classifying and Visualizing RNA Structure Change in High-Throughput Experimental Data

    Mutations (or Single Nucleotide Variants) in folded RiboNucleic Acid (RNA) structures that cause local or global conformational change are riboSNitches. Predicting riboSNitches is challenging, as it requires making two, albeit related, structure predictions. The data most often used to experimentally validate riboSNitch predictions is Selective 2’ Hydroxyl Acylation by Primer Extension, or SHAPE. Experimentally establishing a riboSNitch requires the quantitative comparison of two SHAPE traces: wild-type (WT) and mutant. Historically, SHAPE data was collected on electropherograms and change in structure was evaluated by “gel gazing.” SHAPE data is now routinely collected with next generation sequencing and/or capillary sequencers. We aim to establish a classifier capable of simulating human “gazing” by identifying features of the SHAPE profile that human experts agree “looks” like a riboSNitch. Additionally, when an RNA molecule folds, it does not always adopt a single, well-defined conformation. The folding energy landscape of the RNA is highly dependent on sequence and the molecular environment. Endogenous molecules, especially in the cellular context, will in some cases completely alter the energy landscape and therefore the ensemble of likely low-energy conformations. The effects of these energy landscape changes on the conformational ensemble are particularly challenging to visualize for larger RNAs including most messenger RNAs (mRNAs). We propose here a robust approach for visualizing the conformational ensemble of RNAs particularly well suited for in vitro vs. in vivo comparisons. Doctor of Philosoph

    Killing The Messenger: Exploring Novel Triggers For Messenger Rna Decay In Eukaryotes

    The lifecycle of messenger RNAs is regulated by multiple layers beyond their primary sequence. In addition to carrying the information for protein synthesis, mRNAs are decorated with RNA binding proteins, marked with covalent chemical modifications, and fold into intricate secondary structures. However, the full set of information encoded by these “epitranscriptomic” layers is only partially understood, and is often only characterized for select transcripts. Thus, it is crucial to develop and apply transcriptome-wide analytical tools to probe the location and functional relevance of epitranscriptome features. In this dissertation, I focus on applying such methods toward better understanding determinants of mRNA stability, by using 1) High Throughput Annotation of Modified Nucleotides, 2) nuclease-mediated probing of RNA secondary structure, and 3) detection of partial mRNA degradation from RNA sequencing. I observe that chemical modifications tend to mark uncapped and small RNA fragments derived from mRNAs in plants and humans, suggesting a link between modifications and mRNA stability. I then show this link is direct by showing differential stability at Arabidopsis transcripts that change modification status during long-term salt stress. By probing secondary structure, I show a link between structure, smRNA production, and co-translational RNA decay. Finally, I develop a new in silico method to detect partial RNA degradation in mouse oocytes, and identify sequence elements that appear to block complete exonucleolytic transcript cleavage during meiosis. I then identify putative RNA binding proteins that might mediate this partial decay. In summary, I apply transcriptome-wide sequencing-based methods to survey the effects of covalent modifications, secondary structure, and RNA binding proteins on mRNA stability.

    Detecting and comparing non-coding RNAs in the high-throughput era.

    In recent years there has been a growing interest in the field of non-coding RNA. This surge is a direct consequence of the discovery of a huge number of new non-coding genes and of the finding that many of these transcripts are involved in key cellular functions. In this context, accurately detecting and comparing RNA sequences has become important. Aligning nucleotide sequences is a key requisite when searching for homologous genes. Accurate alignments reveal evolutionary relationships, conserved regions and more generally any biologically relevant pattern. Comparing RNA molecules is, however, a challenging task. The nucleotide alphabet is simpler and therefore less informative than that of amino acids. Moreover, for many non-coding RNAs, evolution is likely to be mostly constrained at the structural level and not at the sequence level. This results in very poor sequence conservation, impeding comparison of these molecules. These difficulties define a context where new methods are urgently needed in order to exploit experimental results to their full potential. This review focuses on the comparative genomics of non-coding RNAs in the context of new sequencing technologies, dealing especially with two extremely important and timely research aspects: the development of new methods to align RNAs and the analysis of high-throughput data.

    Genomic data mining for the computational prediction of small non-coding RNA genes

    The objective of this research is to develop a novel computational prediction algorithm for non-coding RNA (ncRNA) genes using features computable for any genomic sequence without the need for comparative analysis. Existing comparative-based methods require the knowledge of closely related organisms in order to search for sequence and structural similarities. This approach imposes constraints on the type of ncRNAs, the organism, and the regions where the ncRNAs can be found. We have developed a novel approach for ncRNA gene prediction without the limitations of current comparative-based methods. Our work has established an ncRNA database required for subsequent feature and genomic analysis. Furthermore, we have identified significant features from folding-, structural-, and ensemble-based statistics for use in ncRNA prediction. We have also examined higher-order gene structures, namely operons, to discover potential insights into how ncRNAs are transcribed. Being able to automatically identify ncRNAs on a genome-wide scale is immensely powerful for incorporating it into a pipeline for large-scale genome annotation. This work will contribute to a more comprehensive annotation of ncRNA genes in microbial genomes to meet the demands of functional and regulatory genomic studies. Ph.D. Committee Chair: Dr. G. Tong Zhou; Committee Member: Dr. Arthur Koblasz; Committee Member: Dr. Eberhard Voit; Committee Member: Dr. Xiaoli Ma; Committee Member: Dr. Ying X