
    Random walks on mutual microRNA-target gene interaction network improve the prediction of disease-associated microRNAs

    Background: MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identifying disease-related miRNAs in the laboratory is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that considers not only miRNAs but also their corresponding target genes in the network model. Results: Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses an RWR but on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. Conclusions: In summary, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks.
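
The ranking step described in this abstract can be illustrated with a generic random walk with restart on a small undirected miRNA-target network. This is a minimal sketch and not the RWRMTN implementation: the five-node toy network, the node names, the uniform seed weighting, the restart probability of 0.5 and the function name `random_walk_with_restart` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adj     : symmetric adjacency matrix of an undirected network
    seeds   : indices of seed nodes (assumed here to be weighted uniformly)
    restart : probability of jumping back to a seed at each step (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0        # isolated nodes: avoid division by zero
    transition = adj / col_sums          # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)   # walker starts uniformly on the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:   # L1 change small enough: converged
            return p_next
        p = p_next
    return p

# Hypothetical toy network: three miRNAs and two target genes.
names = ["miR-a", "miR-b", "miR-c", "gene-x", "gene-y"]
edges = [("miR-a", "gene-x"), ("miR-b", "gene-x"),
         ("miR-b", "gene-y"), ("miR-c", "gene-y")]

index = {name: i for i, name in enumerate(names)}
adj = np.zeros((len(names), len(names)))
for u, v in edges:
    adj[index[u], index[v]] = adj[index[v], index[u]] = 1.0  # undirected edge

# Seeds: a known disease miRNA and one of its target genes.
seeds = [index["miR-a"], index["gene-x"]]
scores = random_walk_with_restart(adj, seeds)

# Rank the non-seed miRNAs by their steady-state probability.
candidates = ["miR-b", "miR-c"]
ranking = sorted(candidates, key=lambda n: scores[index[n]], reverse=True)
print(ranking)
```

In this toy network, miR-b shares a target gene with the seed miRNA and so ranks above miR-c, which is connected to the seeds only through a longer path. Real networks would be far larger and sparse, where a sparse matrix representation would be the more natural choice.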

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes

    Complexes of physically interacting proteins constitute fundamental functional units responsible for driving biological processes within cells. A faithful reconstruction of the entire set of complexes is therefore essential to understand the functional organization of cells. In this review, we discuss the key contributions of computational methods developed to date (approximately between 2003 and 2015) for identifying complexes from the network of interacting proteins (PPI network). We evaluate in depth the performance of these methods on PPI datasets from yeast, and highlight challenges faced by these methods, in particular the detection of sparse and small complexes or sub-complexes and the discerning of overlapping complexes. We describe methods for integrating diverse information, including expression profiles and 3D structures of proteins, with PPI networks to understand the dynamics of complex formation, for instance, the time-based assembly of complex subunits and the formation of fuzzy complexes from intrinsically disordered proteins. Finally, we discuss methods for identifying dysfunctional complexes in human diseases, an application that is proving invaluable for understanding disease mechanisms and discovering novel therapeutic targets. We hope this review aptly commemorates a decade of research on computational prediction of complexes and constitutes a valuable reference for further advancements in this exciting area. Comment: 1 Tabl

    Identification of transcriptional regulatory networks specific to pilocytic astrocytoma.

    Background: Pilocytic Astrocytomas (PAs) are common low-grade central nervous system malignancies for which few recurrent and specific genetic alterations have been identified. In an effort to better understand the molecular biology underlying the pathogenesis of these pediatric brain tumors, we performed higher-order transcriptional network analysis of a large gene expression dataset to identify gene regulatory pathways that are specific to this tumor type, relative to other, more aggressive glial or histologically distinct brain tumours. Methods: RNA derived from frozen human PA tumours was subjected to microarray-based gene expression profiling, using Affymetrix U133Plus2 GeneChip microarrays. This data set was compared to similar data sets previously generated from non-malignant human brain tissue and other brain tumour types, after appropriate normalization. Results: In this study, we examined gene expression in 66 PA tumors compared to 15 non-malignant cortical brain tissues, and identified 792 genes that demonstrated consistent differential expression between independent sets of PA and non-malignant specimens. From this entire 792-gene set, we used the previously described PAP tool to assemble a core transcriptional regulatory network composed of 6 transcription factor genes (TFs) and 24 target genes, for a total of 55 interactions. A similar analysis of oligodendroglioma and glioblastoma multiforme (GBM) gene expression data sets identified distinct, but overlapping, networks. Most importantly, comparison of each of the brain tumor type-specific networks revealed a network unique to PA that included repressed expression of ONECUT2, a gene frequently methylated in other tumor types, and 13 other uniquely predicted TF-gene interactions. Conclusions: These results suggest specific transcriptional pathways that may operate to create the unique molecular phenotype of PA and thus opportunities for corresponding targeted therapeutic intervention. Moreover, this study also demonstrates how integration of gene expression data with TF-gene and TF-TF interaction data is a powerful approach to generating testable hypotheses to better understand cell-type specific genetic programs relevant to cancer.