14 research outputs found

    fRNAdb: a platform for mining/annotating functional RNA candidates from non-coding RNA sequences

    There is an abundance of transcripts that code for no particular protein and that remain functionally uncharacterized. Some of these transcripts may have novel functions, while others might be junk transcripts. Unfortunately, the experimental validation of such transcripts to find functional non-coding RNA candidates is very costly. Therefore, our primary interest is to computationally mine candidate functional transcripts from a pool of uncharacterized transcripts. We introduce fRNAdb: a novel database service that hosts a large collection of non-coding transcripts, including annotated/non-annotated sequences from the H-inv database, NONCODE and RNAdb. A set of computational analyses has been performed on the included sequences. These analyses include RNA secondary structure motif discovery, EST support evaluation, cis-regulatory element search, protein homology search, etc. fRNAdb provides an efficient interface to help users filter out particular transcripts under their own criteria to sort out functional RNA candidates. fRNAdb is available a

    Discovery of short pseudogenes derived from messenger RNAs

    More than 40% of the human genome is generated by retrotransposition, a series of in vivo processes involving reverse transcription of RNA molecules and integration of the transcripts into the genomic sequence. The mechanism of retrotransposition, however, is not fully understood, and additional genomic elements generated by retrotransposition may remain to be discovered. Here, we report that the human genome contains many previously unidentified short pseudogenes generated by retrotransposition of mRNAs. Genomic elements generated by non-long terminal repeat retrotransposition have specific sequence signatures: a poly-A tract that is immediately downstream and a pair of duplicated sequences, called target site duplications (TSDs), at either end. Using a new computer program, TSDscan, that can accurately detect pseudogenes based on the presence of the poly-A tract and TSDs, we found 654 short (≤300 bp), previously unknown pseudogenes derived from mRNAs. Comprehensive analyses of the pseudogenes that we identified and their parent mRNAs revealed that the pseudogene length depends on the parent mRNA length: long mRNAs generate more short pseudogenes than do short mRNAs. To explain this phenomenon, we hypothesize that most long mRNAs are truncated before they are reverse transcribed. Truncated mRNAs would be rapidly degraded during reverse transcription, resulting in the generation of short pseudogenes
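
The sequence signature described in this abstract (a poly-A tract immediately downstream of the element, with a TSD at either end) can be illustrated with a toy scan. This is only a minimal sketch and is not TSDscan: the function name `find_candidates`, its parameters, and the exact-match TSD rule are assumptions made for illustration, and a real detector would presumably tolerate mismatches and score candidates.

```python
import re


def find_candidates(seq, min_poly_a=8, tsd_len=6, max_element=300):
    """Toy scan for the layout [TSD][element][poly-A][TSD].

    Returns a list of (element_start, poly_a_end, tsd) tuples.
    All thresholds are illustrative assumptions, not published values,
    except that max_element mirrors the 300 bp length cutoff in the abstract.
    """
    seq = seq.upper()
    hits = []
    # Every run of at least min_poly_a adenines is a candidate poly-A tract.
    for match in re.finditer(r"A{%d,}" % min_poly_a, seq):
        # The bases right after the poly-A tract are the candidate 3' TSD.
        tsd = seq[match.end():match.end() + tsd_len]
        if len(tsd) < tsd_len:
            continue
        # Look for an identical copy upstream, within the allowed element length.
        window_start = max(0, match.start() - max_element - tsd_len)
        upstream = seq[window_start:match.start()]
        pos = upstream.rfind(tsd)  # nearest upstream copy
        if pos != -1:
            element_start = window_start + pos + tsd_len
            hits.append((element_start, match.end(), tsd))
    return hits
```

Taking the nearest upstream copy of the TSD keeps the reported element as short as possible, which is a simplification chosen for this sketch.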

    Prediction of conserved precursors of miRNAs and their mature forms by integrating position-specific structural features.

    MicroRNA (miRNA) precursor hairpins have a unique secondary structure, nucleotide length, and nucleotide content that are in most cases evolutionarily conserved. The aim of this study was to utilize position-specific features of miRNA hairpins to improve their identification. To this end, we defined the evolutionarily and structurally conserved features in each position of miRNA hairpins with heuristically derived values, which were successfully integrated using a probabilistic framework. Our method, miRRim2, can not only accurately detect miRNA hairpins, but also infer the location of a mature miRNA sequence. To evaluate the accuracy of miRRim2, we designed a cross validation test in which the whole human genome was used for evaluation. miRRim2 could detect miRNA hairpins more accurately than the other computational predictions that had been performed on the human genome, and detect the position of the 5'-end of mature miRNAs with sensitivity and positive predictive value (PPV) above 0.4. To further evaluate miRRim2 on independent data, we applied it to the Ciona intestinalis genome. Our method detected 47 known miRNA hairpins among the top 115 candidates, and pinpointed the 5'-end of mature miRNAs with sensitivity and PPV of about 0.4. When our results were compared with deep-sequencing reads of small RNA libraries from Ciona intestinalis cells, we found several candidates in which the predicted mature miRNAs were in good accordance with deep-sequencing results.
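
For readers unfamiliar with the two accuracy measures quoted in this abstract, the sketch below computes them using their standard definitions: sensitivity is TP / (TP + FN) and PPV is TP / (TP + FP). The function name and the position sets in the usage note are hypothetical and are not taken from the paper.

```python
def sensitivity_and_ppv(predicted, known):
    """Compare predicted 5'-end positions with known ones.

    sensitivity = TP / (TP + FN): fraction of known positions recovered.
    PPV         = TP / (TP + FP): fraction of predictions that are correct.
    Both inputs are iterables of hashable position identifiers.
    """
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    sensitivity = true_positives / len(known) if known else 0.0
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv
```

As a made-up example, four predictions of which two fall among five known positions give a sensitivity of 2/5 and a PPV of 2/4.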

    Architecture of the Model.

    Each sub-model is represented by an oval. The circled “s” and “e” represent a start and end state, respectively. Dotted rectangles indicate sub-models corresponding to an miRNA duplex.

    Comparison of 5′-ends of mature miRNAs predicted by miRRim2 and those identified by deep-sequencing.

    The probability of a predicted 5′-end (P^5end_i) is indicated by colour: black, blue, orange, and red mean 0 ≤ P^5end < 0.05, 0.05 ≤ P^5end < 0.1, 0.1 ≤ P^5end < 0.4, and 0.4 ≤ P^5end, respectively. Arrows indicate the 5′-ends identified by deep-sequencing experiments by Hendrix et al. The number associated with an arrow indicates the number of reads.

    Promising candidates.

    a) Genomic coordinate of ci2 genome (Mar. 2005 Assembly).

    Accuracy for detecting the 5′-end of mature RNAs.

    (a) Sensitivity-PPV plot for mature miRNA prediction. (b) The change in accuracy when one type of feature is excluded. BPP: base-pair potential, BPD: base-pair distance.

    PhastCons scores, PhyloP scores, and base-pair potential averaged in each position.

    Position 0 indicates the 5′ ends of miRNA duplexes in the upper strand of miRNA hairpins. Dotted rectangles indicate the approximate location of the miRNA duplex.