22 research outputs found

    In silico method for systematic analysis of feature importance in microRNA-mRNA interactions

    Background MicroRNA (miRNA), which is short non-coding RNA, plays a pivotal role in the regulation of many biological processes and affects the stability and/or translation of mRNA. Recently, machine learning algorithms were developed to predict potential miRNA targets. Most of these methods are robust but are not sensitive to redundant or irrelevant features. Despite their good performance, the relative importance of each feature is still unclear. With increasing experimental data becoming available, research interest has shifted from higher prediction performance to uncovering the mechanism of microRNA-mRNA interactions. Results Systematic analysis of sequence, structural and positional features was carried out for two different data sets. The dominant functional features were distinguished from uninformative features in single and hybrid feature sets. Models were developed using only statistically significant sequence, structural and positional features, resulting in area under the receiver operating curves (AUC) values of 0.919, 0.927 and 0.969 for one data set and of 0.926, 0.874 and 0.954 for another data set, respectively. Hybrid models were developed by combining various features and achieved AUC of 0.978 and 0.970 for two different data sets. Functional miRNA information is well reflected in these features, which are expected to be valuable in understanding the mechanism of microRNA-mRNA interactions and in designing experiments. Conclusions Differing from previous approaches, this study focused on systematic analysis of all types of features. Statistically significant features were identified and used to construct models that yield similar accuracy to previous studies in a shorter computation time.

    PRED_PPI: a server for predicting protein-protein interactions based on sequence data with probability assignment

    Background Protein-protein interactions (PPIs) are crucial for almost all cellular processes, including metabolic cycles, DNA transcription and replication, and signaling cascades. Given the importance of PPIs, several methods have been developed to detect them. Since the experimental methods are time-consuming and expensive, developing computational methods for effectively identifying PPIs is of great practical significance. Findings Most previous methods were developed for predicting PPIs in only one species, and do not account for probability estimations. In this work, a relatively comprehensive prediction system was developed, based on a support vector machine (SVM), for predicting PPIs in five organisms, specifically humans, yeast, Drosophila, Escherichia coli, and Caenorhabditis elegans. This PPI predictor includes the probability of its prediction in the output, so it can be used to assess the confidence of each SVM prediction by the probability assignment. Using a probability of 0.5 as the threshold for assigning class labels, the method had an average accuracy for detecting protein interactions of 90.67% for humans, 88.99% for yeast, 90.09% for Drosophila, 92.73% for E. coli, and 97.51% for C. elegans. Moreover, among the correctly predicted pairs, more than 80% were predicted with a high probability of ≥0.8, indicating that this tool could predict novel PPIs with high confidence. Conclusions Based on this work, a web-based system, Pred_PPI, was constructed for predicting PPIs from the five organisms. Users can predict novel PPIs and obtain a probability value about the prediction using this tool. Pred_PPI is freely available at http://cic.scu.edu.cn/bioinformatics/predict_ppi/default.html.

    Genomic analysis of the domestication and post-Spanish conquest evolution of the llama and alpaca

    Background Despite their regional economic importance and being increasingly reared globally, the origins and evolution of the llama and alpaca remain poorly understood. Here we report reference genomes for the llama, and for the guanaco and vicuña (their putative wild progenitors), compare these with the published alpaca genome, and resequence seven individuals of all four species to better understand domestication and introgression between the llama and alpaca. Results Phylogenomic analysis confirms that the llama was domesticated from the guanaco and the alpaca from the vicuña. Introgression was much higher in the alpaca genome (36%) than the llama (5%) and could be dated close to the time of the Spanish conquest, approximately 500 years ago. Introgression patterns are at their most variable on the X-chromosome of the alpaca, featuring 53 genes known to have deleterious X-linked phenotypes in humans. Strong genome-wide introgression signatures include olfactory receptor complexes into both species, hypertension resistance into alpaca, and fleece/fiber traits into llama. Genomic signatures of domestication in the llama include male reproductive traits, while those in the alpaca include fleece characteristics, olfaction-related traits, and hypoxia adaptation traits. Expression analysis of the introgressed region that is syntenic to human HSA4q21, a gene cluster previously associated with hypertension in humans under hypoxic conditions, shows a previously undocumented role for PRDM8 downregulation as a potential transcriptional regulation mechanism, analogous to that previously reported at high altitude for hypoxia-inducible factor 1α. Conclusions The unprecedented introgression signatures within both domestic camelid genomes may reflect post-conquest changes in agriculture and the breakdown of traditional management practices.

    Molecular Footprints of Aquatic Adaptation Including Bone Mass Changes in Cetaceans

    Cetaceans (whales, dolphins, and porpoises) are a group of specialized mammals that evolved from terrestrial ancestors and are fully adapted to aquatic habitats. Taking advantage of the recently sequenced finless porpoise genome, we conducted comparative analyses of the genomes of seven cetaceans and related terrestrial species to provide insight into the molecular bases of adaptation of these aquatic mammals. Changes in gene sequences were identified in main lineages of cetaceans, offering an evolutionary picture of cetacean genomes that reveals new pathways that could be associated with adaptation to an aquatic lifestyle. We profiled bone microanatomical structures across 28 mammals, including representatives of cetaceans, pinnipeds, and sirenians. Subsequent phylogenetic comparative analyses revealed genes (including leptin, insulin-like growth factor 1, and collagen type I alpha 2 chain) with the root-to-tip substitution rate significantly correlated with bone compactness, implying that these genes could be involved in bone mass control. Overall, this study described adjustments of the genomes of cetaceans according to lifestyle, phylogeny, and bone mass.

    Population genomics of finless porpoises reveal an incipient cetacean species adapted to freshwater

    Cetaceans (whales, dolphins, and porpoises) are a group of mammals adapted to various aquatic habitats, from oceans to freshwater rivers. We report the sequencing, de novo assembly and analysis of a finless porpoise genome, and the re-sequencing of an additional 48 finless porpoise individuals. We use these data to reconstruct the demographic history of finless porpoises from their origin to their occupation of the Yangtze River. Analyses of selection between marine and freshwater porpoises identify genes associated with renal water homeostasis and the urea cycle, such as urea transporter 2 and angiotensin I-converting enzyme 2, which are likely adaptations associated with the difference in osmotic stress between ocean and rivers. Our results strongly suggest that the critically endangered Yangtze finless porpoises are reproductively isolated from other porpoise populations and harbor unique genetic adaptations, supporting the view that they should be considered a unique incipient species.