
    The Biomolecular Interaction Network Database in PSI-MI 2.5

    The Biomolecular Interaction Network Database (BIND) is a major source of curated biomolecular interactions, but it has been unmaintained for the last few years. This trend will eventually result in the loss of a significant amount of unique biomolecular interaction information, mostly as database identifiers become out of date. To help reverse this trend, we converted BIND to a standard format, Proteomics Standards Initiative-Molecular Interaction (PSI-MI) 2.5, starting from the last curated data release (from 2005), which was available in a custom XML format, and made the core components (interactions and complexes) plus additional valuable curated information available for download (http://download.baderlab.org/BINDTranslation/). Major work during the conversion was required to update out-of-date molecule identifiers, resulting in a more comprehensive conversion of BIND, by measures including the number of species and interactor types covered, than what is currently accessible elsewhere. This work also highlights issues of data modeling, controlled vocabulary adoption and data cleaning that can serve as a general case study on the future compatibility of interaction databases.

    The Biomolecular Interaction Network Database and related tools 2005 update

    The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services include a new BIND Query and Submission interface, a Simple Object Access Protocol (SOAP) service and the Small Molecule Interaction Database (http://smid.blueprint.org), which allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues.

    Genes2Networks: Connecting Lists of Proteins by Using Background Literature-based Mammalian Networks

    In recent years, in-silico literature-based mammalian protein-protein interaction network datasets have been developed. These datasets contain binary interactions extracted manually from legacy experimental biomedical research literature. Lists of genes or proteins identified as significantly changing in multivariate experiments can be combined with this background knowledge about binary interactions to place the genes or proteins in the context of pathways and protein complexes.
Genes2Networks is a software system that integrates the content of ten mammalian literature-based interaction network datasets. Filtering to prune low-confidence interactions was implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from “seed” lists of human Entrez gene names. The output includes a dynamic, linkable, three-color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is available at http://actin.pharm.mssm.edu/genes2networks.
Genes2Networks is a web-based software tool that can help experimental biologists interpret high-throughput results from genomics and proteomics studies where the output of the experiments is a list of significantly changing genes or proteins. The system can be used to find relationships between nodes from the seed list and to predict novel nodes that play a key role in a common function.
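
The idea of connecting a seed list through intermediate nodes of a background network can be illustrated with a minimal sketch. It is not the Genes2Networks implementation: the function names (`build_graph`, `shortest_path`, `connect_seeds`), the `max_len` cutoff and the toy edge list are all assumptions made for illustration, and the statistical scoring of intermediates described in the abstract is left out.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Build an undirected adjacency map from binary interactions."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal, max_len):
    """Breadth-first search; returns a path of at most max_len edges, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        if dist == max_len:
            continue  # do not expand beyond the allowed path length
        for nxt in sorted(graph[node]):
            if nxt not in parents:
                parents[nxt] = node
                queue.append((nxt, dist + 1))
    return None

def connect_seeds(edges, seeds, max_len=2):
    """Return (intermediate nodes, subnetwork edges) linking pairs of seeds."""
    graph = build_graph(edges)
    seed_set = set(seeds)
    intermediates, sub_edges = set(), set()
    for a, b in combinations(sorted(seed_set), 2):
        path = shortest_path(graph, a, b, max_len)
        if path is None:
            continue
        intermediates.update(n for n in path if n not in seed_set)
        sub_edges.update(tuple(sorted(e)) for e in zip(path, path[1:]))
    return intermediates, sub_edges

# Hypothetical background network and seed list
edges = [("A", "X"), ("X", "B"), ("B", "C"), ("C", "Y"), ("Y", "Z"), ("Z", "D")]
intermediates, sub_edges = connect_seeds(edges, ["A", "B", "D"], max_len=2)
```

In this toy network only seeds A and B lie within two steps of each other, so X is the single intermediate reported; D stays unconnected because every path to it exceeds the cutoff.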

    Automatic extraction of biomolecular interactions: an empirical approach

    Background: We describe a method for extracting data about how biomolecule pairs interact from texts. This method relies on empirically determined characteristics of sentences. The characteristics are efficient to compute, making this approach to extraction of biomolecular interactions scalable. The results of such interaction mining can support interaction network annotation, question answering, database construction, and other applications. Results: We constructed a software system to search MEDLINE for sentences likely to describe interactions between given biomolecules. The system extracts a list of the interaction-indicating terms appearing in those sentences, then ranks those terms based on their likelihood of correctly characterizing how the biomolecules interact. The ranking process uses a tf-idf (term frequency-inverse document frequency) based technique using empirically derived knowledge about sentences, and was applied to the MEDLINE literature collection. Software was developed as part of the MetNet toolkit (http://www.metnetdb.org). Conclusions: Specific, efficiently computable characteristics of sentences about biomolecular interactions were analyzed to better understand how to use these characteristics to extract how biomolecules interact. The text empirics method that was investigated, though arising from a classical tradition, has yet to be fully explored for the task of extracting biomolecular interactions from the literature. The conclusions we reach about the sentence characteristics investigated in this work, as well as the technique itself, could be used by other systems to provide evidence about putative interactions, thus supporting efforts to maximize the ability of hybrid systems to support such tasks as annotating and constructing interaction networks.
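
As a rough illustration of tf-idf based ranking of interaction-indicating terms, the sketch below scores each candidate term by its frequency in sentences mentioning a given biomolecule pair, weighted by how rare it is across a background set of sentences. This is a generic, smoothed tf-idf variant and not the method of the paper: the function name `rank_terms`, the whitespace tokenizer, the smoothing and the example sentences are all assumptions.

```python
import math
from collections import Counter

def rank_terms(pair_sentences, background_sentences, interaction_terms):
    """Rank candidate interaction terms for one biomolecule pair by tf-idf."""
    vocab = set(interaction_terms)
    # tf: occurrences of each term in sentences that mention the pair
    tf = Counter(
        w for s in pair_sentences for w in s.lower().split() if w in vocab
    )
    # df: number of background sentences that contain each term
    n = len(background_sentences)
    df = Counter()
    for s in background_sentences:
        df.update(vocab & set(s.lower().split()))
    # Smoothed idf keeps the logarithm defined when df is zero
    scores = {t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf}
    # Highest score first; ties broken alphabetically
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sentences; a real system would draw these from MEDLINE
pair_sentences = [
    "RAF1 phosphorylates MEK1",
    "RAF1 binds MEK1 and phosphorylates it",
]
background = pair_sentences + ["A binds B", "C binds D", "E inhibits F"]
ranked = rank_terms(pair_sentences, background,
                    ["binds", "phosphorylates", "inhibits"])
```

Here "phosphorylates" outranks "binds" because it is both more frequent for the pair and less common in the background; a production tokenizer would also need to handle punctuation and inflected forms.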

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. The results show that the present approaches outperform modern machine learning based methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.