
    Integration of molecular network data reconstructs Gene Ontology.

    Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of baker’s yeast protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GI profiling. Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
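The abstract above names penalized non-negative matrix tri-factorization but does not spell out its update rules. As a rough illustration only, the following sketch shows plain (unpenalized) non-negative matrix tri-factorization, R ≈ G1 · S · G2ᵀ, fitted with standard multiplicative updates. The network-derived penalty terms of PNMTF are omitted, and the toy gene-by-GO-term matrix, the function names and the cluster counts are all assumptions made for this example, not the authors' code or data.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(r, g1, s, g2):
    """Squared Frobenius norm of R - G1 S G2^T."""
    approx = matmul(matmul(g1, s), transpose(g2))
    return sum((x - y) ** 2 for rr, ar in zip(r, approx) for x, y in zip(rr, ar))

def mult_update(f, num, den, eps=1e-9):
    """Element-wise multiplicative update; keeps every entry non-negative."""
    return [[v * n / (d + eps) for v, n, d in zip(fr, nr, dr)]
            for fr, nr, dr in zip(f, num, den)]

def nmtf(r, k1, k2, iters=200, seed=0):
    """Factorize r (n x m) into g1 (n x k1), s (k1 x k2), g2 (m x k2)."""
    rng = random.Random(seed)
    n, m = len(r), len(r[0])
    g1 = [[rng.random() + 0.1 for _ in range(k1)] for _ in range(n)]
    s = [[rng.random() + 0.1 for _ in range(k2)] for _ in range(k1)]
    g2 = [[rng.random() + 0.1 for _ in range(k2)] for _ in range(m)]
    for _ in range(iters):
        # G1 <- G1 * (R G2 S^T) / (G1 S G2^T G2 S^T)
        g2st = matmul(g2, transpose(s))
        g1 = mult_update(g1, matmul(r, g2st),
                         matmul(g1, matmul(transpose(g2st), g2st)))
        # G2 <- G2 * (R^T G1 S) / (G2 S^T G1^T G1 S)
        g1s = matmul(g1, s)
        g2 = mult_update(g2, matmul(transpose(r), g1s),
                         matmul(g2, matmul(transpose(g1s), g1s)))
        # S <- S * (G1^T R G2) / (G1^T G1 S G2^T G2)
        g1t = transpose(g1)
        s = mult_update(s, matmul(matmul(g1t, r), g2),
                        matmul(matmul(matmul(g1t, g1), s), matmul(transpose(g2), g2)))
    return g1, s, g2

# Hypothetical toy data: 6 genes (rows) x 4 GO terms (columns), two blocks.
annotations = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

g1, s, g2 = nmtf(annotations, k1=2, k2=2)
# Each gene is assigned to the cluster with the largest entry in its row of g1.
gene_clusters = [row.index(max(row)) for row in g1]
print("reconstruction error:", frob_error(annotations, g1, s, g2))
print("gene clusters:", gene_clusters)
```

In a factorization of this kind, the rows of G1 and G2 act as soft cluster memberships for genes and GO terms respectively, and large entries of the reconstructed matrix that are zero in R are the candidates for new annotations.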

    Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.

    Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs rely heavily on the currently available herb-compound-target relationships. In this work, we present the Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction that relies mainly on symptom-related associations. HTINet focuses on capturing low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporates the topological properties of nodes across a multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet improves on the performance of a well-established random-walk-based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions against independent literature. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions.

    Fuse: Multiple Network Alignment via Data Fusion


    A semi-supervised learning approach to predict synthetic genetic interactions by combining functional and topological properties of functional gene network

    Background: Genetic interaction profiles are highly informative and helpful for understanding the functional linkages between genes, and therefore have been extensively exploited for annotating gene functions and dissecting specific pathway structures. However, our understanding of the relationship between double concurrent perturbation and various higher-level phenotypic changes, e.g. those in cells, tissues or organs, remains rather limited. Modifier screens, such as synthetic genetic arrays (SGA), can help us to understand the phenotype caused by combined gene mutations. Unfortunately, exhaustive tests on all possible combined mutations in any genome are vulnerable to combinatorial explosion and are infeasible either technically or financially. Therefore, an accurate computational approach to predict genetic interaction is highly desirable, and such methods have the potential to alleviate the bottleneck in experiment design. Results: In this work, we introduce a computational systems biology approach for the accurate prediction of pairwise synthetic genetic interactions (SGI). First, a high-coverage and high-precision functional gene network (FGN) is constructed by integrating protein-protein interaction (PPI), protein complex and gene expression data; then, a graph-based semi-supervised learning (SSL) classifier is used to identify SGI, where the topological properties of protein pairs in the weighted FGN are used as input features of the classifier. We compare the proposed SSL method with a state-of-the-art supervised classifier, the support vector machine (SVM), on a benchmark dataset in S. cerevisiae to validate our method's ability to distinguish synthetic genetic interactions from non-interacting gene pairs. Experimental results show that the proposed method can accurately predict genetic interactions in S. cerevisiae (with a sensitivity of 92% and specificity of 91%). Notably, the SSL method is more efficient than SVM, especially for very small training sets and large test sets. Conclusions: We developed a graph-based SSL classifier for predicting SGI. The classifier employs topological properties of the weighted FGN as input features and simultaneously employs information induced from labelled and unlabelled data. Our analysis indicates that the topological properties of the weighted FGN can be employed to accurately predict SGI. Also, the graph-based SSL method outperforms the traditional standard supervised approach, especially when used with small training sets. The proposed method can alleviate the experimental burden of exhaustive testing and provide a useful guide for biologists in narrowing down the candidate gene pairs with SGI. The data and source code implementing the method are available from the website: http://home.ustc.edu.cn/~yzh33108/GeneticInterPred.htm
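The abstract above does not say which graph-based semi-supervised algorithm the authors use. To give a feel for the general idea, the sketch below implements a simple iterative label propagation on a weighted graph: labelled nodes are clamped to their known value and every unlabelled node repeatedly takes the weighted average of its neighbours' scores. This is a generic technique swapped in for illustration, not the paper's classifier, and the node names, weights and labels are hypothetical.

```python
def propagate_labels(weights, labels, iters=100):
    """Iterative label propagation on a weighted, undirected graph.

    weights: dict mapping node -> {neighbour: edge weight}
    labels:  dict mapping labelled node -> 1.0 (positive) or 0.0 (negative)
    Returns a dict mapping every node to a score in [0, 1].
    """
    # Unlabelled nodes start at 0.5, i.e. no prior preference.
    scores = {node: labels.get(node, 0.5) for node in weights}
    for _ in range(iters):
        updated = {}
        for node, nbrs in weights.items():
            if node in labels:
                # Clamp labelled nodes to their known value.
                updated[node] = labels[node]
                continue
            total = sum(nbrs.values())
            if total == 0:
                # Isolated node: nothing to average, keep the current score.
                updated[node] = scores[node]
            else:
                updated[node] = sum(w * scores[v] for v, w in nbrs.items()) / total
        scores = updated
    return scores

# Hypothetical chain of four gene pairs: pair_a - pair_b - pair_c - pair_d.
# pair_a is a known interaction (1.0); pair_d is a known non-interaction (0.0).
graph = {
    "pair_a": {"pair_b": 1.0},
    "pair_b": {"pair_a": 1.0, "pair_c": 1.0},
    "pair_c": {"pair_b": 1.0, "pair_d": 1.0},
    "pair_d": {"pair_c": 1.0},
}
known = {"pair_a": 1.0, "pair_d": 0.0}

scores = propagate_labels(graph, known)
for node in sorted(scores):
    print(node, round(scores[node], 3))
```

Because the unlabelled nodes contribute through the graph structure, a scheme like this can use far more data than the handful of labelled examples, which is the usual argument for semi-supervised methods when training sets are small.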

    Previsão e análise da estrutura e dinâmica de redes biológicas (Prediction and analysis of the structure and dynamics of biological networks)

    Increasing knowledge about the biological processes that govern the dynamics of living organisms has fostered a better understanding of the origin of many diseases as well as the identification of potential therapeutic targets. Biological systems can be modeled as biological networks, which allows methods from graph theory to be applied and explored in their investigation and characterization. The main motivation of this work was the inference of patterns and rules that underlie the organization of biological networks. Through the integration of different types of data, such as gene expression, protein interactions and other biomedical concepts, computational methods were developed that can be used to predict and study diseases. The first contribution was a method for characterizing a subsystem of the human protein interactome through the topological properties of the networks that model it. As a second contribution, an unsupervised method using biological criteria and network topology was applied to co-expression networks to improve the understanding of the genetic mechanisms and risk factors of a disease. As a third contribution, a methodology was developed to remove noise (denoise) in protein networks using the network topology, in order to obtain more accurate models. As a fourth contribution, a supervised methodology was proposed to model the dynamics of the protein interactome, using exclusively the topology of the protein interaction networks that are part of the dynamic model of the system. The proposed methodologies contribute to the creation of more precise static and dynamic biological models through the identification and use of topological patterns of protein interaction networks, which can be used to predict and study diseases. Doctoral Programme in Informatics Engineering

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.