57,748 research outputs found

STRING and STITCH: known and predicted interactions between proteins and chemicals

Author: Christian von Mering
Lars J. Jensen
Manuel Stark
Michael Kuhn
Peer Bork
Samuel Chaffron
Publication date: 06/09/2008

Information on protein-protein and protein-chemical interactions is essential for understanding cellular functions. The STRING and STITCH web resources integrate interaction evidence derived from pathways, automatic literature mining, primary experimental data, and genomic context. The resulting interaction networks cover 1.5 million proteins from 373 organisms and 68,000 chemicals

Nature Precedings

Event based text mining for integrated network construction

Author: Saeys Yvan
Van de Peer Yves
Van Landeghem Sofie
Publication venue: Microtome Publishing
Publication date: 01/01/2010

The scientific literature is a rich and challenging data source for research in systems biology, providing numerous interactions between biological entities. Text mining techniques have been increasingly useful to extract such information from the literature in an automatic way, but up to now the main focus of text mining in the systems biology field has been restricted mostly to the discovery of protein-protein interactions. Here, we take this approach one step further, and use machine learning techniques combined with text mining to extract a much wider variety of interactions between biological entities. Each particular interaction type gives rise to a separate network, represented as a graph, all of which can be subsequently combined to yield a so-called integrated network representation. This provides a much broader view on the biological system as a whole, which can then be used in further investigations to analyse specific properties of the networ
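
The idea of one graph per interaction type, later merged into an integrated network, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `integrate_networks`, the interaction-type labels, and the gene names are hypothetical, and every edge is treated as undirected for simplicity.

```python
from collections import defaultdict


def integrate_networks(networks):
    """Merge per-type edge lists into one integrated network.

    networks: dict mapping an interaction type to an iterable of (a, b) pairs.
    Returns a dict mapping each undirected pair to the set of interaction
    types that support it.
    """
    integrated = defaultdict(set)
    for interaction_type, edges in networks.items():
        for a, b in edges:
            # Simplifying assumption: direction is ignored, so (a, b) == (b, a).
            pair = tuple(sorted((a, b)))
            integrated[pair].add(interaction_type)
    return dict(integrated)


# Hypothetical per-type networks, as a text mining step might produce them.
networks = {
    "binding": [("geneA", "geneB"), ("geneB", "geneC")],
    "regulation": [("geneB", "geneA")],
}
integrated = integrate_networks(networks)
```

Keeping the set of supporting interaction types on each edge preserves the per-type networks inside the integrated one, so either view can be recovered later.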

Ghent University Academic Bibliography

Evidence mining and novelty assessment of protein–protein interactions with the ConsensusPathDB plugin for Cytoscape

Author: Aebersold
Atanas Kamburov
Chatr-aryamontri
Ewing
Fields
Flicek
Huntley
Ideker
Kamburov
Keiichiro Ono
Konstantin Pentchev
Ralf Herwig
Rual
Shannon
Stelzl
The UniProt Consortium
Trey Ideker
Publication venue: Oxford University Press
Publication date: 01/01/2010

Summary: Protein–protein interaction detection methods are applied on a daily basis by molecular biologists worldwide. After generating a set of potential interactions, biologists face the problem of highlighting the ones that are novel and collecting evidence with respect to literature and annotation. This task can be as tedious as searching for every predicted interaction in several interaction data repositories, or manually screening the scientific literature. To facilitate the task of evidence mining and novelty assessment of protein–protein interactions, we have developed a Cytoscape plugin that automatically mines publication references, database references, interaction detection method descriptions and pathway annotation for a user-supplied network of interactions. The basis for the annotation is ConsensusPathDB—a meta-database that integrates numerous protein–protein, signaling, metabolic and gene regulatory interaction repositories for currently three species: Homo sapiens, Saccharomyces cerevisiae and Mus musculus
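
The novelty assessment described in this summary amounts to looking up each user-supplied interaction in a reference collection and reporting either the evidence found or the absence of any. A minimal sketch, assuming a hypothetical in-memory lookup table in place of ConsensusPathDB; `assess_novelty`, the protein identifiers, and the reference label are illustrative names, not part of the plugin:

```python
def assess_novelty(predicted, known_evidence):
    """Split predicted interactions into annotated (known) and novel ones.

    predicted: iterable of (a, b) protein pairs.
    known_evidence: dict mapping frozenset({a, b}) to a list of evidence items.
    """
    annotated, novel = {}, []
    for a, b in predicted:
        key = frozenset((a, b))  # order-independent lookup key
        if key in known_evidence:
            annotated[(a, b)] = known_evidence[key]
        else:
            novel.append((a, b))
    return annotated, novel


# Hypothetical reference data and user-supplied network.
known_evidence = {
    frozenset(("P1", "P2")): ["hypothetical-reference-1"],
}
predicted = [("P2", "P1"), ("P1", "P3")]
annotated, novel = assess_novelty(predicted, known_evidence)
```

Here the pair ("P2", "P1") is matched regardless of the order in which the two proteins are given, while ("P1", "P3") has no evidence and is flagged as novel.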

Crossref

PubMed Central

MPG.PuRe

The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis

Author: De Bodt Stefanie
Drebert Zuzanna
Inzé Dirk
Van de Peer Yves
Van Landeghem Sofie
Publication venue: American Society of Plant Biologists (ASPB)
Publication date: 01/01/2013

Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies

Ghent University Academic Bibliography

PubMed Central

Identification of novel molecular signatures of IgA nephropathy through an integrative -omics analysis

Author: Cisek Katryna
Delles Christian
Filip Szymon
Gakiopoulou Chara
Jankowski Joachim
Krochmal Magdalena
Markoska Katerina
Mischak Harald
Orange Clare
Spasovski Goce
Vlahou Antonia
Zoidakis Jerome
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

IgA nephropathy (IgAN) is the most prevalent among primary glomerular diseases worldwide. Although our understanding of IgAN has advanced significantly, its underlying biology and potential drug targets are still unexplored. We investigated a combinatorial approach for the analysis of IgAN-relevant -omics data, aiming at identification of novel molecular signatures of the disease. Nine published urinary proteomics datasets were collected and the reported differentially expressed proteins in IgAN vs. healthy controls were integrated into known biological pathways. Proteins participating in these pathways were subjected to multi-step assessment, including investigation of IgAN transcriptomics datasets (Nephroseq database), their reported protein-protein interactions (STRING database), kidney tissue expression (Human Protein Atlas) and literature mining. Through this process, from an initial dataset of 232 proteins significantly associated with IgAN, 20 pathways were predicted, yielding 657 proteins for further analysis. Step-wise evaluation highlighted 20 proteins of possibly high relevance to IgAN and/or kidney disease. Experimental validation of 3 predicted relevant proteins, adenylyl cyclase-associated protein 1 (CAP1), SHC-transforming protein 1 (SHC1) and prolylcarboxypeptidase (PRCP) was performed by immunostaining of human kidney sections. Collectively, this study presents an integrative procedure for -omics data exploitation, giving rise to biologically relevant results

Maastricht University Research Portal

Publikationsserver der RWTH Aachen University

Enlighten

Negation of protein-protein interactions: analysis and extraction

Author: Alfarano
Apweiler
Chapman
Chapman
Debusmann
Friedman
Givon
Järvinen
Kim
Knight
Leroy
Leroy
Massimo Poesio
Mutalik
Olivia Sanchez-Graillet
Settles
Temkin
Tottie
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2007

Sanchez Graillet O, Poesio M. Negation of protein-protein interactions: analysis and extraction. Bioinformatics. 2007;23(13):i424-i432. Motivation: Negative information about protein–protein interactions, from uncertainty about the occurrence of an interaction to knowledge that it did not occur, is often of great use to biologists and could lead to important discoveries. Yet, to our knowledge, no approaches focusing on extracting such information have been proposed in the text mining literature. Results: In this work, we present an analysis of the types of negative information that are reported, and a heuristic-based system using a full dependency parser to extract such information. We performed a preliminary evaluation study that shows encouraging results for our system. Finally, we have obtained an initial corpus of negative protein–protein interactions as a basis for the construction of larger ones. Availability: The corpus is available by request from the authors

Crossref

Publications at Bielefeld University

Ranking Interactions for a Curation Task

Author: Clematide S
Rinaldi Fabio
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/12/2011

Among the key pieces of information that biomedical text mining systems are expected to extract from the literature are interactions among different types of biomedical entities (proteins, genes, diseases, drugs, etc.). Different types of entities might be considered; for example, protein-protein interactions have been extensively studied as part of the BioCreative competitive evaluations. However, more complex interactions such as those among genes, drugs, and diseases are increasingly of interest. Different databases have been used as reference for the evaluation of extraction and ranking techniques. The aim of this paper is to describe a machine-learning based reranking approach for candidate interactions extracted from the literature. The results are evaluated using data derived from the PharmGKB database. The importance of a good ranking is particularly evident when the results are applied to support human curators

ZORA

The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets.

Author: Bork Peer
Doncheva Nadezhda T
Fang Tao
Gable Annika L
Jensen Lars J
Kirsch Rebecca
Legeay Marc
Lyon David
Nastou Katerina C
Pyysalo Sampo
Szklarczyk Damian
von Mering Christian
Publication venue: Oxford University Press (OUP)
Publication date: 08/01/2021

Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein-protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online, at https://string-db.org/
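
The abstract says that evidence from several sources is collected and scored, but it does not say how the per-source scores are merged. One generic way to combine independent probabilistic evidence is a noisy-OR, sketched below. This is an assumption made for illustration, not a description of STRING's actual scoring; the function name `combine_scores`, the channel names, and the score values are hypothetical.

```python
def combine_scores(channel_scores):
    """Combine per-channel confidence scores with a noisy-OR.

    channel_scores: dict mapping an evidence channel to a score in [0, 1].
    Assumes the channels are independent; the result is the probability
    that at least one channel's evidence is correct.
    """
    remaining = 1.0  # probability that every channel is wrong
    for channel, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {channel!r} must lie in [0, 1]")
        remaining *= 1.0 - score
    return 1.0 - remaining


# Hypothetical scores for one protein pair.
scores = {"textmining": 0.5, "experiments": 0.5, "coexpression": 0.0}
combined = combine_scores(scores)
```

Under this scheme, two channels at 0.5 each give a combined score of 0.75, and a channel with no evidence (score 0.0) leaves the result unchanged.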

ZORA

Evaluation of linguistic features useful in extraction of interactions from PubMed; Application to annotating known, high-throughput and predicted interactions in I2D

Author: Bader
Barrios-Rodiles
BioCreAtIve
BioCreAtIvE
Brown
Brown
Brown
Bunescu
Bunescu
Collins
David Otasek
Donaldson
Erkan
Fundel
Fundel
Gavin
Giot
Haddow
Hakenberg
Hao
Ho
Hoffmann
Huang
Igor Jurisica
Ingham
Ito
Jang
Joachims
Jones
Kerrien
Krallinger
Krallinger
Leitner
Li
Lin
LLL
Mering
Mewes
Mitsumori
Nielsen
Niu
Otasek
Peri
Plake
Ponzielli
Ramani
Romano
Rual
Stelzl
Temkin
Tsuruoka
Xenarios
Yakushiji
Yun Niu
Zanzoni
Zhou
Zhou
Publication venue: Oxford University Press

Motivation: Identification and characterization of protein–protein interactions (PPIs) is one of the key aims in biological research. While previous research in text mining has made substantial progress in automatic PPI detection from literature, the need to improve the precision and recall of the process remains. More accurate PPI detection will also improve the ability to extract experimental data related to PPIs and provide multiple evidence for each interaction

Crossref

PubMed Central

Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome

Author: Bunescu Razvan C
Marcotte Edward M
Mooney Raymond J
Ramani Arun K
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Extensive protein interaction maps are being constructed for yeast, worm, and fly to ask how the proteins organize into pathways and systems, but no such genome-wide interaction map yet exists for the set of human proteins. To prepare for studies in humans, we wished to establish tests for the accuracy of future interaction assays and to consolidate the known interactions among human proteins. RESULTS: We established two tests of the accuracy of human protein interaction datasets and measured the relative accuracy of the available data. We then developed and applied natural language processing and literature-mining algorithms to recover from Medline abstracts 6,580 interactions among 3,737 human proteins. A three-part algorithm was used: first, human protein names were identified in Medline abstracts using a discriminator based on conditional random fields, then interactions were identified by the co-occurrence of protein names across the set of Medline abstracts, filtering the interactions with a Bayesian classifier to enrich for legitimate physical interactions. These mined interactions were combined with existing interaction data to obtain a network of 31,609 interactions among 7,748 human proteins, accurate to the same degree as the existing datasets. CONCLUSION: These interactions and the accuracy benchmarks will aid interpretation of current functional genomics data and provide a basis for determining the quality of future large-scale human protein interaction assays. Projecting from the approximately 15 interactions per protein in the best-sampled interaction set to the estimated 25,000 human genes implies more than 375,000 interactions in the complete human protein interaction network. This set therefore represents no more than 10% of the complete network
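
The projection in the conclusion can be checked with a few lines of arithmetic. The figures below are taken from the abstract; the variable names are illustrative, and the calculation simply reproduces the abstract's own estimate (interactions per protein multiplied by gene count) without correcting for any double counting.

```python
# Figures quoted in the abstract.
interactions_per_protein = 15      # best-sampled interaction set
estimated_human_genes = 25_000
consolidated_interactions = 31_609  # size of the combined network

# Projected size of the complete human interaction network.
projected_total = interactions_per_protein * estimated_human_genes

# Share of the projected network covered by the consolidated set.
coverage = consolidated_interactions / projected_total

print(projected_total)       # 375000
print(f"{coverage:.1%}")     # 8.4%
```

The consolidated set therefore covers roughly 8.4% of the projected network, consistent with the abstract's statement that it represents no more than 10%.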

Springer - Publisher Connector

PubMed Central