3,237 research outputs found

    Integration of molecular network data reconstructs Gene Ontology.

    Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker's yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GIs profiling.
    Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
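The tri-factorization idea in the abstract above can be illustrated with a minimal sketch. It is a plain, unpenalized non-negative matrix tri-factorization fitted with standard multiplicative updates, not the authors' PNMTF: the penalty terms that encode network constraints are omitted, and every name here (`nmtf`, `sq_error`, the toy matrix) is hypothetical. It assumes a gene-by-GO-term relation matrix `x` approximated as `g1 * s * g2^T`, where the rows of `g1` and `g2` act as soft cluster memberships for genes and GO terms.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def update(factor, num, den, eps=1e-9):
    """Element-wise multiplicative update: factor * num / (den + eps)."""
    return [[f * n / (d + eps) for f, n, d in zip(fr, nr, dr)]
            for fr, nr, dr in zip(factor, num, den)]


def sq_error(x, g1, s, g2):
    """Squared Frobenius error between x and g1 * s * g2^T."""
    approx = matmul(matmul(g1, s), transpose(g2))
    return sum((p - q) ** 2 for xr, ar in zip(x, approx) for p, q in zip(xr, ar))


def nmtf(x, k1, k2, iters=200, seed=0):
    """Factorize non-negative x (n x m) into g1 (n x k1), s (k1 x k2), g2 (m x k2)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])

    def rand(rows, cols):
        return [[rng.uniform(0.1, 1.0) for _ in range(cols)] for _ in range(rows)]

    g1, s, g2 = rand(n, k1), rand(k1, k2), rand(m, k2)
    xt = transpose(x)
    for _ in range(iters):
        # Update g1 with s and g2 fixed.
        b = matmul(s, transpose(g2))
        g1 = update(g1, matmul(x, transpose(b)),
                    matmul(g1, matmul(b, transpose(b))))
        # Update g2 with g1 and s fixed.
        a = matmul(g1, s)
        g2 = update(g2, matmul(xt, a), matmul(g2, matmul(transpose(a), a)))
        # Update s with g1 and g2 fixed.
        g1t = transpose(g1)
        num = matmul(matmul(g1t, x), g2)
        den = matmul(matmul(matmul(g1t, g1), s), matmul(transpose(g2), g2))
        s = update(s, num, den)
    return g1, s, g2


# Toy relation matrix: genes 0-1 share terms 0-1, genes 2-3 share terms 2-3.
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
g1, s, g2 = nmtf(toy, k1=2, k2=2)
print("reconstruction error:", sq_error(toy, g1, s, g2))
```

Taking the largest entry in each row of `g1` or `g2` gives a hard cluster assignment for a gene or a GO term. A penalized variant would add constraint terms to the numerators and denominators of the updates.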

    A message passing framework with multiple data integration for miRNA-disease association prediction

    Micro RNA or miRNA is a highly conserved class of non-coding RNA that plays an important role in many diseases. Identifying miRNA-disease associations can pave the way for better clinical diagnosis and finding potential drug targets. We propose a biologically-motivated data-driven approach for the miRNA-disease association prediction, which overcomes the data scarcity problem by exploiting information from multiple data sources. The key idea is to enrich the existing miRNA/disease-protein-coding gene (PCG) associations via a message passing framework, followed by the use of disease ontology information for further feature filtering. The enriched and filtered PCG associations are then used to construct the inter-connected miRNA-PCG-disease network to train a structural deep network embedding (SDNE) model. Finally, the pre-trained embeddings and the biologically relevant features from the miRNA family and disease semantic similarity are concatenated to form the pair input representations to a Random Forest classifier whose task is to predict the miRNA-disease association probabilities. We present large-scale comparative experiments, ablation, and case studies to showcase our approach's superiority. Besides, we make the model prediction results for 1618 miRNAs and 3679 diseases, along with all related information, publicly available at http://software.mpm.leibniz-ai-lab.de/ to foster assessments and future adoption.

    Prioritization of disease microRNAs through a human phenome-microRNAome network

    Background: The identification of disease-related microRNAs is vital for understanding the pathogenesis of diseases at the molecular level, and is critical for designing specific molecular tools for diagnosis, treatment and prevention. Experimental identification of disease-related microRNAs poses considerable difficulties. Computational analysis of microRNA-disease associations is an important complementary means for prioritizing microRNAs for further experimental examination. Results: Herein, we devised a computational model to infer potential microRNA-disease associations by prioritizing the entire human microRNAome for diseases of interest. We tested the model on 270 known experimentally verified microRNA-disease associations and achieved an area under the ROC curve of 75.80%. Moreover, we demonstrated that the model is applicable to diseases with which no known microRNAs are associated. The microRNAome-wide prioritization of microRNAs for 1,599 disease phenotypes is publicly released to facilitate future identification of disease-related microRNAs. Conclusions: We presented a network-based approach that can infer potential microRNA-disease associations and drive testable hypotheses for the experimental efforts to identify the roles of microRNAs in human diseases.
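The evaluation metric quoted in this abstract, the area under the ROC curve, can be computed directly from a ranking. The sketch below is illustrative only: it assumes each candidate microRNA carries a prioritization score and a binary label (1 for a known association, 0 otherwise), and the function name `roc_auc` and the toy data are hypothetical, not taken from the paper.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prioritization scores for four candidate microRNAs.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))  # 0.75
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to a ranking that places every known association above every unknown one.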

    mintRULS: Prediction of miRNA-mRNA Target Site Interactions Using Regularized Least Square Method

    Identification of miRNA-mRNA interactions is critical to understanding the new paradigms in gene regulation. Existing methods show suboptimal performance owing to inappropriate feature selection and limited integration of intuitive biological features of both miRNAs and mRNAs. The present regularized least square-based method, mintRULS, employs features of miRNAs and their target sites using pairwise similarity metrics based on free energy, sequence and repeat identities, and target site accessibility to predict miRNA-target site interactions. We hypothesized that miRNAs sharing similar structural and functional features are more likely to target the same mRNA, and conversely, mRNAs with similar features can be targeted by the same miRNA. Our prediction model achieved AUCs of 0.93 and 0.92 in LOOCV and LmiTOCV settings, respectively. In comparison, other popular tools such as miRDB, TargetScan, MBSTAR, RPmirDIP, and STarMir achieved AUCs of 0.73, 0.77, 0.55, 0.84, and 0.67, respectively, in the LOOCV setting. Similarly, mintRULS outperformed other methods on metrics such as accuracy, sensitivity, specificity, and MCC. Our method also demonstrated high accuracy when validated against experimentally derived data from condition- and cell-specific studies and expression studies of miRNAs and target genes, both in human and mouse.
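A generic regularized least squares fit over a similarity matrix gives a feel for this family of methods. The sketch below is not the mintRULS implementation: it assumes a single symmetric similarity (kernel) matrix `kernel` over training items, known interaction labels `y`, and a regularization weight `lam`, and it solves `(kernel + lam * I) * alpha = y`. The function names and toy numbers are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rls_fit(kernel, y, lam):
    """Return dual coefficients alpha solving (kernel + lam * I) * alpha = y."""
    n = len(kernel)
    reg = [[kernel[i][j] + (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    return solve(reg, y)


def rls_predict(kernel_rows, alpha):
    """Score items given their similarities to the training items."""
    return [sum(k * a for k, a in zip(row, alpha)) for row in kernel_rows]


# Two training items with similarity 0.5; only the first is a known interaction.
kernel = [[1.0, 0.5],
          [0.5, 1.0]]
alpha = rls_fit(kernel, [1.0, 0.0], lam=0.5)
print(alpha)                       # approximately [0.75, -0.25]
print(rls_predict(kernel, alpha))  # approximately [0.625, 0.125]
```

The second item receives a nonzero score purely through its similarity to the first, which mirrors the hypothesis in the abstract that similar miRNAs tend to share targets.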

    Allele-specific miRNA-binding analysis identifies candidate target genes for breast cancer risk

    Most breast cancer (BC) risk-associated single-nucleotide polymorphisms (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may be affecting microRNA (miRNA) genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms (TargetScan and miRanda) to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5×10⁻⁸) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. Also, 11.5% (6 out of 52) raSNPs located in 3'-untranslated regions of putative miRNA target genes were predicted to alter miRNA::mRNA (messenger RNA) pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction, and data supporting cis-regulation of expression, improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases. Funding Agency: Portuguese Foundation for Science and Technology CRESC ALGARVE 2020 European Union (EU) 303745 Maratona da Saude Award DL 57/2016/CP1361/CT0042 SFRH/BPD/99502/2014 CBMR-UID/BIM/04773/2013 POCI-01-0145-FEDER-022184

    Statistical Modeling of MicroRNA Expression with Human Cancers

    MicroRNAs (miRNAs) are small non-coding RNAs (containing about 22 nucleotides) that regulate gene expression. MiRNAs are involved in many different biological processes such as cell proliferation, differentiation, apoptosis, fat metabolism, and human cancer genes; while miRNAs may function as candidates for diagnostic and prognostic biomarkers and predictors of drug response. This paper emphasizes the statistical methods in the analysis of the associations of miRNA gene expression with human cancers and related clinical phenotypes: 1) simple statistical methods include chi-square test, correlation analysis, t-test and one-way ANOVA; 2) regression models include linear and logistic regression; 3) survival analysis approaches such as non-parametric Kaplan-Meier method and log-rank test as well as semi-parametric Cox proportional hazards models have been used for time to event data; 4) multivariate method such as cluster analysis has been used for clustering samples and principal component analysis (PCA) has been used for data mining; 5) Bayesian statistical methods have recently made great inroads into many areas of science, including the assessment of association between miRNA expression and human cancers; and 6) multiple testing