96 research outputs found

    Genes2Networks: Connecting Lists of Proteins by Using Background Literature-based Mammalian Networks

    In recent years, in-silico literature-based mammalian protein-protein interaction network datasets have been developed. These datasets contain binary interactions extracted manually from legacy experimental biomedical research literature. Lists of genes or proteins identified as significantly changing in multivariate experiments can be placed in the context of pathways and protein complexes by combining them with this background knowledge about binary interactions.
Genes2Networks is a software system that integrates the content of ten mammalian literature-based interaction network datasets. Filtering to prune low-confidence interactions was implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from “seed” lists of human Entrez gene names. The output includes a dynamic, linkable, three-color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is available at http://actin.pharm.mssm.edu/genes2networks.
Genes2Networks is a powerful web-based software tool that helps experimental biologists interpret high-throughput results from genomics and proteomics studies, where the output is a list of significantly changing genes or proteins. The system can be used to find relationships between nodes from the seed list and to predict novel nodes that play a key role in a common function.

    Biomolecular network querying: a promising approach in systems biology

    The rapid accumulation of various network-related data from multiple species and conditions (e.g. disease versus normal) provides unprecedented opportunities to study the function and evolution of biological systems. Comparison of biomolecular networks between species or conditions is a promising approach to understanding the essential mechanisms used by living organisms. Computationally, the basic goal of this network comparison or 'querying' is to uncover identical or similar subnetworks by mapping the queried network (e.g. a pathway or functional module) to another network or network database. Such comparative analysis may reveal biologically or clinically important pathways or regulatory networks. In particular, we argue that user-friendly tools for network querying will greatly enhance our ability to study the fundamental properties of biomolecular networks at a system-wide level.

    Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

    BACKGROUND: In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. RESULTS: Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. CONCLUSION: Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.

    Large-scale event extraction from literature with multi-level gene normalization

    Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/).
Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.

    Customizable views on semantically integrated networks for systems biology

    Motivation: The rise of high-throughput technologies in the post-genomic era has led to the production of large amounts of biological data. Many of these datasets are freely available on the Internet. Making optimal use of these data is a significant challenge for bioinformaticians. Various strategies for integrating data have been proposed to address this challenge. One of the most promising approaches is the development of semantically rich integrated datasets. Although well suited to computational manipulation, such integrated datasets are typically too large and complex for easy visualization and interactive exploration.

    Enhancing the accuracy of HMM-based conserved pathway prediction using global correspondence scores

    BACKGROUND: Comparative network analysis aims to identify common subnetworks in biological networks. It can facilitate the prediction of conserved functional modules across different species and provide deep insights into their underlying regulatory mechanisms. Recently, it has been shown that hidden Markov models (HMMs) can provide a flexible and computationally efficient framework for modeling and comparing biological networks. RESULTS: In this work, we show that using global correspondence scores between molecules can improve the accuracy of the HMM-based network alignment results. The global correspondence scores are computed by performing a semi-Markov random walk on the networks to be compared. The resulting score naturally integrates the sequence similarity between molecules and the topological similarity between their molecular interactions, thereby providing a more effective measure for estimating the functional similarity between molecules. By incorporating the global correspondence scores, instead of relying on sequence similarity or functional annotation scores used by previous approaches, our HMM-based network alignment method can identify conserved subnetworks that are functionally more coherent. CONCLUSIONS: Performance analysis based on synthetic and microbial networks demonstrates that the proposed network alignment strategy significantly improves the robustness and specificity of the predicted alignment results, in terms of conserved functional similarity measured based on KEGG ortholog (KO) groups. These results clearly show that the HMM-based network alignment framework using global correspondence scores can effectively find conserved biological pathways and has the potential to be used for automatic functional annotation of biomolecules.

    On Ranked Approximate Matching Of Large Attributed Graphs

    Many emerging database applications entail sophisticated graph-based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact sub-graph isomorphism techniques has limited their efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. Contemporary algorithms for approximate graph matching incur substantial cost to generate candidates, and then test and rank them for a possible match. Leading algorithms balance processing time and overall resource consumption cost by leveraging sophisticated data structures and graph properties to improve overall performance. In this dissertation, we propose two novel techniques for approximate graph matching: TraM (Top-k Graph Matching) and AtoM (Approximate Network Matching). While TraM off-loads a significant amount of its processing onto the database, making the approach viable for large graphs, AtoM provides improved turnaround time by means of graph summarization prior to matching. The summarization process is aided by domain-sensitive similarity matrices, which in turn help improve the matching performance. The vector space embedding of the graphs and efficient filtration of the search space enable computation of approximate graph similarity at a throw-away cost. We combine domain similarity and topological similarity to obtain overall graph similarity and compare them with neighborhood-biased segments of the data-graph for proper matches.
We show that our approach can naturally support the emerging trend in graph pattern queries and discuss its suitability for large networks, as it can be seamlessly transformed to adhere to the MapReduce framework. We have conducted thorough experiments on several synthetic and real data sets, and have demonstrated the effectiveness and efficiency of the proposed method.