11 research outputs found

    In-silico prediction of blood-secretory human proteins using a ranking algorithm

    Background: Computational identification of blood-secretory proteins, especially proteins with differentially expressed genes in diseased tissues, can provide highly useful information in linking transcriptomic data to proteomic studies for targeted disease biomarker discovery in serum. Results: A new algorithm for prediction of blood-secretory proteins is presented using an information-retrieval technique called manifold ranking. On a dataset containing 305 known blood-secretory human proteins and a large number of other proteins that are either not blood-secretory or unknown, the new method performs better than the previously published method, measured in terms of the area under the recall-precision curve (AUC). A key advantage of the presented method is that it does not explicitly require a negative training set, which is often noisy or difficult to derive for most biological problems, making our method more applicable than classification-based data mining methods in general biological studies. Conclusion: We believe that our program will prove to be very useful to biomedical researchers who are interested in finding serum markers, especially when they have candidate proteins derived through transcriptomic or proteomic analyses of diseased tissues. A computer program is developed for prediction of blood-secretory proteins based on manifold ranking, which is accessible at our website http://csbl.bmb.uga.edu/publications/materials/qiliu/blood_secretory_protein.html
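
    The abstract above names manifold ranking but does not describe it. The following is a minimal sketch of the generic manifold-ranking iteration from the information-retrieval literature, not the authors' program. The feature vectors, the Gaussian affinity, and the values of `alpha`, `sigma`, and `iters` are all assumptions made for illustration. Scores spread from the known positives across a similarity graph, so no negative training set is needed.

```python
import math


def manifold_rank(points, query_idx, alpha=0.9, sigma=1.0, iters=200):
    """Score every point by its relevance to the known positives (query_idx).

    points    : list of equal-length numeric feature tuples (hypothetical features)
    query_idx : set of indices of known positive examples
    """
    n = len(points)
    # Gaussian affinity matrix W with a zero diagonal.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    # Symmetric normalisation: S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # y marks the known positives; everything else starts at zero.
    y = [1.0 if i in query_idx else 0.0 for i in range(n)]
    f = y[:]
    # Iterate f <- alpha * S f + (1 - alpha) * y until (approximately) converged.
    for _ in range(iters):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Toy data (made up): four points near the origin, three far away.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.2, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_scores = manifold_rank(toy_points, {0})
ranking = sorted(range(len(toy_points)), key=lambda i: -toy_scores[i])
print(ranking)
```

    In this toy run the only labelled example is point 0, and the unlabelled points in its cluster receive higher scores than the distant cluster. Candidates are ranked rather than classified.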

    An Algorithm for Identifying Novel Targets of Transcription Factor Families: Application to Hypoxia-inducible Factor 1 Targets

    Efficient and effective analysis of the growing genomic databases requires adequate computational tools. We introduce a fast method based on the suffix tree data structure for predicting novel targets of hypoxia-inducible factor 1 (HIF-1) from large genome databases. The suffix tree has two powerful applications here: extracting unknown patterns from multiple strings/sequences in linear time, and searching multiple strings/sequences using multiple patterns in linear time. Using 15 known HIF-1 target gene sequences as a training set, we used suffix trees to extract 105 common patterns that occur in all 15 training genes. Using these 105 common patterns along with known subsequences surrounding HIF-1 binding sites from the literature, the algorithm searches a genome database that contains 2,078,786 DNA sequences. It reported 258 potentially novel HIF-1 targets, including 25 known HIF-1 targets. Based on microarray studies from the literature, 17 of these 258 putative genes were confirmed to be upregulated by HIF-1 or hypoxia. We further studied one of the potential targets, COX-2, in the laboratory and showed that it is a biologically relevant HIF-1 target. These results demonstrate that our methodology is an effective computational approach for identifying novel HIF-1 targets.
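
    The two steps described above (extract patterns shared by all training sequences, then search candidate sequences with those patterns) can be illustrated with a much simpler stand-in. The sketch below uses fixed-length k-mer set intersection rather than a suffix tree, so it does not have the linear-time behaviour the abstract reports. The function names, the toy sequences, and the `k` and `min_hits` parameters are hypothetical.

```python
def common_kmers(sequences, k):
    """Return the set of length-k substrings that occur in every sequence."""
    kmer_sets = [
        {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for seq in sequences
    ]
    # Intersection keeps only patterns shared by all training sequences.
    return set.intersection(*kmer_sets)


def scan(candidates, patterns, min_hits=1):
    """Return {name: hit count} for candidates containing >= min_hits patterns."""
    hits = {}
    for name, seq in candidates.items():
        count = sum(1 for p in patterns if p in seq)
        if count >= min_hits:
            hits[name] = count
    return hits


# Made-up training sequences, not real HIF-1 target genes.
training = ["TTACGTGCAA", "GGACGTGCTT", "CACGTGCCGA"]
patterns = common_kmers(training, 5)

# Made-up candidate sequences standing in for a genome database.
database = {"geneA": "AAACGTGCAAA", "geneB": "TTTTTTTTTT"}
print(scan(database, patterns))
```

    A suffix tree generalises this idea to patterns of every length at once, which is what makes it attractive for a database of millions of sequences.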

    siRNAs from miRNA sites mediate DNA methylation of target genes

    Arabidopsis microRNA (miRNA) genes (MIR) give rise to 20- to 22-nt miRNAs that are generated predominantly by the type III endoribonuclease Dicer-like 1 (DCL1) but do not require any RNA-dependent RNA Polymerases (RDRs) or RNA Polymerase IV (Pol IV). Here, we identify a novel class of non-conserved MIR genes that give rise to two small RNA species, a 20- to 22-nt species and a 23- to 27-nt species, at the same site. Genetic analysis using small RNA pathway mutants reveals that the 20- to 22-nt small RNAs are typical miRNAs generated by DCL1 and are associated with Argonaute 1 (AGO1). In contrast, the accumulation of the 23- to 27-nt small RNAs from the miRNA-generating sites is dependent on DCL3, RDR2 and Pol IV, components of the typical heterochromatic small interfering RNA (hc-siRNA) pathway. We further demonstrate that these MIR-derived siRNAs associate with AGO4 and direct DNA methylation at some of their target loci in trans. In addition, we find that at the miRNA-generating sites, some conserved canonical MIR genes also produce siRNAs, which also induce DNA methylation at some of their target sites. Our systematic examination of published small RNA deep sequencing datasets of rice and moss suggests that dual-functional MIRs of this type exist broadly in plants.

    MiRPara: a SVM-based software tool for prediction of most probable microRNA coding regions in genome scale sequences

    Background: MicroRNAs are a family of ~22 nt small RNAs that can regulate gene expression at the post-transcriptional level. Identification of these molecules and their targets can aid understanding of regulatory processes. Recently, high-throughput sequencing (HTS) has become a common identification method, but the technique has two major limitations. First, the method has low efficiency, with typically fewer than 1 in 10,000 sequences representing miRNA reads; second, the method preferentially targets highly expressed miRNAs. If sequences are available, computational methods can provide a screening step to investigate the value of an HTS study and aid interpretation of results. However, current methods can only predict miRNAs for short fragments and have usually been trained against small datasets that do not always reflect the diversity of these molecules. Results: We have developed a software tool, miRPara, that predicts the most probable mature miRNA coding regions from genome-scale sequences in a species-specific manner. We classified sequences from miRBase into animal, plant and overall categories and used a support vector machine to train three models based on an initial set of 77 parameters related to the physical properties of the pre-miRNA and its miRNAs. By applying parameter filtering, we found that a subset of ~25 parameters produced higher prediction ability than the full set. Our software achieves an accuracy of up to 80% against experimentally verified mature miRNAs, making it one of the most accurate methods available. Conclusions: miRPara is an effective tool for locating miRNA coding regions in genome sequences and can be used as a screening step prior to HTS experiments. It is available at http://www.whiov.ac.cn/bioinformatics/mirpara

    Direct sequencing and expression analysis of a large number of miRNAs in Aedes aegypti and a multi-species survey of novel mosquito miRNAs

    Background: MicroRNAs (miRNAs) are a novel class of gene regulators whose biogenesis involves hairpin structures called precursor miRNAs, or pre-miRNAs. A pre-miRNA is processed to make a miRNA:miRNA* duplex, which is then separated to generate a mature miRNA and a miRNA*. The mature miRNAs play key regulatory roles during embryonic development as well as other cellular processes. They are also implicated in control of viral infection as well as innate immunity. Direct experimental evidence for mosquito miRNAs has been recently reported in anopheline mosquitoes based on small-scale cloning efforts. Results: We obtained approximately 130,000 small RNA sequences from the yellow fever mosquito, Aedes aegypti, by 454 sequencing of samples that were isolated from mixed-age embryos and midguts from sugar-fed and blood-fed females, respectively. We also performed bioinformatics analysis on the Ae. aegypti genome assembly to identify evidence for additional miRNAs. The combination of these approaches uncovered 98 different pre-miRNAs in Ae. aegypti which could produce 86 distinct miRNAs. Thirteen miRNAs, including eight novel miRNAs identified in this study, are currently only found in mosquitoes. We also identified five potential revisions to previously annotated miRNAs at the miRNA termini, two cases of highly abundant miRNA* sequences, 14 miRNA clusters, and 17 cases where more than one pre-miRNA hairpin produces the same or highly similar mature miRNAs. A number of miRNAs showed higher levels in midguts from blood-fed females than in those from sugar-fed females, which was confirmed by northern blots on two of these miRNAs. Northern blots also revealed several miRNAs that showed stage-specific expression. Detailed expression analysis of eight of the 13 mosquito-specific miRNAs in four divergent mosquito genera identified cases of clearly conserved expression patterns and obvious differences. Four of the 13 miRNAs are specific to certain lineage(s) within mosquitoes. Conclusion: This study provides the first systematic analysis of miRNAs in Ae. aegypti and offers a substantially expanded list of miRNAs for all mosquitoes. New insights were gained on the evolution of conserved and lineage-specific miRNAs in mosquitoes. The expression profiles of a few miRNAs suggest stage-specific functions and functions related to embryonic development or blood feeding. A better understanding of the functions of these miRNAs will offer new insights in mosquito biology and may lead to novel approaches to combat mosquito-borne infectious diseases.

    A framework for improving microRNA prediction in non-human genomes

    The prediction of novel pre-microRNA (miRNA) from genomic sequence has received considerable attention recently. However, the majority of studies have focused on the human genome. Previous studies have demonstrated that sensitivity (correctly detecting true miRNA) is sustained when human-trained methods are applied to other species; however, they have failed to report the dramatic drop in specificity (the ability to correctly reject non-miRNA sequences) in

    A Review on Recent Computational Methods for Predicting Noncoding RNAs

    Noncoding RNAs (ncRNAs) play important roles in various cellular activities and diseases. In this paper, we present a comprehensive review of computational methods for ncRNA prediction, which are generally grouped into four categories: (1) homology-based methods, that is, comparative methods involving evolutionarily conserved RNA sequences and structures, (2) de novo methods using RNA sequence and structure features, (3) transcriptional sequencing and assembling based methods, that is, methods designed for single and paired-end reads generated from next-generation RNA sequencing, and (4) RNA family specific methods, for example, methods specific for microRNAs and long noncoding RNAs. We then summarize the advantages and limitations of these methods and point out a few possible future directions for ncRNA prediction. In conclusion, many computational methods have been demonstrated to be effective in predicting ncRNAs for further experimental validation. They are critical in reducing the huge number of potential ncRNAs and pointing the community to high-confidence candidates. In the future, highly efficient mapping technology, more intrinsic sequence features (e.g., motif and k-mer frequencies), and structure features (e.g., minimum free energy, conserved stem-loop, or graph structures) are suggested to be combined with the next- and third-generation sequencing platforms to improve ncRNA prediction.
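
    Among the sequence features mentioned above, k-mer frequencies are the simplest to compute. The following is a minimal sketch, assuming overlapping k-mers and an RNA alphabet. The function name and the T-to-U conversion are illustrative choices, not taken from any tool covered by the review.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Return {k-mer: relative frequency} over all overlapping k-mers in seq."""
    # Normalise case and treat DNA input as RNA (assumption for this sketch).
    seq = seq.upper().replace("T", "U")
    total = len(seq) - k + 1
    if total <= 0:
        # Sequence is shorter than k, so there are no k-mers to count.
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: count / total for kmer, count in counts.items()}


# Toy sequence (made up) with dinucleotide frequencies.
print(kmer_frequencies("AUAUA", 2))
```

    A frequency vector of this kind has a fixed length for a given k (at most 4^k entries), which is what allows sequences of different lengths to be compared by a de novo predictor.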

    Characterisation of microRNAs in Human Stem Cells

    In collaboration with David Baulcombe and Attila Molnar we have generated microRNA libraries for human embryonic stem cells (hESCs) before and after differentiation along the neuronal lineage and also from human mesenchymal stem cells (hMSCs). Both cell types are of medical importance, and understanding how their proliferation and differentiation are regulated by microRNAs is also of scientific interest. The hMSC library was sequenced by 454 technology and the two subsequent hESC libraries by Solexa sequencing. Approximately a quarter of all currently known microRNAs were identified between the libraries, in addition to 3 novel microRNAs and 25 annotated piRNAs. For the hESC libraries, we verified the presence of embryonic specific microRNAs (miR-302 family) and neuronal specific microRNAs (miR-9/miR-9*), and demonstrated that expression of these miRNAs is regulated at the transcriptional level. Additionally, promoter assessments of miR-9 transcription revealed that multiple upstream regions may be important in neuronal specific expression. Almost half of all known human microRNAs are located within the introns of host genes. We used microarrays to analyse host gene expression and found that there was little correlation with microRNA expression, indicating that many microRNAs are not regulated at the transcriptional level by their host promoter. Furthermore, the expression of microRNAs from the same cluster, and also from the same hairpin precursor, did not always correlate when compared between the stem cell libraries. Taken together, these data indicate that microRNAs are regulated at a variety of levels both pre- and post-transcriptionally. Many microRNA isomers were also detected that differed in expression between human cell types, and upon differentiation of the hMSCs through the osteoblastic lineage. Interestingly, microRNAs and some of their isomers showed different affinities for Argonaute proteins in pulldown assays. We also profiled mRNAs that were immunoprecipitated with Argonaute in order to identify miRNA target