
28 research outputs found

The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases

Author: Apweiler Rolf
Côté Richard G
Hermjakob Henning
Jones Philip
Kerrien Samuel
Leinonen Rasko
Lin Quan
Martens Lennart
Reisinger Florian
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. Results: We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. Conclusion: We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface.
The PICR interface, documentation and code examples are available at http://www.ebi.ac.uk/Tools/picr.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Ghent University Academic Bibliography

PubMed Central

MINT and IntAct contribute to the Second BioCreative challenge: serving the text-mining community with high quality molecular interaction data

Author: Apweiler Rolf
Aranda Bruno
Castagnoli Luisa
Ceol Arnaud
Cesareni Gianni
Chatr-aryamontri Andrew
Costa Stefano
Derow Cathy
Hermjakob Henning
Huntley Rachael
Kerrien Samuel
Khadake Jyoti
Leroy Catherine
Licata Luana
Orchard Sandra
Thorneycroft Dave
Publication venue: BioMed Central
Publication date: 01/01/2008

In the absence of consolidated pipelines to archive biological data electronically, information dispersed in the literature must be captured by manual annotation. Unfortunately, manual annotation is time-consuming and the coverage of published interaction data is therefore far from complete. The use of text-mining tools to identify relevant publications and to assist in the initial information extraction could help to improve the efficiency of the curation process and, as a consequence, the database coverage of data available in the literature. The 2006 BioCreative competition was aimed at evaluating text-mining procedures in comparison with manual annotation of protein-protein interactions.

Crossref

Springer - Publisher Connector

PubMed Central

UCL Discovery

ART

The IntAct molecular interaction database in 2012

Author: Aranda Bruno
Breuza Lionel
Bridge Alan
Broackes-Carter Fiona
Chen Carol
Duesbury Margaret
Dumousseau Marine
Feuermann Marc
Hermjakob Henning
Hinz Ursula
Jandrasits Christine
Jimenez Rafael C.
Kerrien Samuel
Khadake Jyoti
Mahadevan Usha
Masson Patrick
Orchard Sandra
Pedruzzi Ivo
Pfeiffenberger Eric
Porras Pablo
Raghunath Arathi
Roechert Bernd
Publication date: 02/08/2017

IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As from September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intac

RERO DOC Digital Library

Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions

BACKGROUND: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to this very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. RESULTS: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. CONCLUSION: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.

University of Toronto Research Repository

Crossref

Harvard University - DASH

Springer - Publisher Connector

Serveur académique lausannois

Directory of Open Access Journals

Global modeling of transcriptional responses in interaction networks

Author: Aittokallio
Ashburner
Bush
Chang
Dai
Draghici
Dudley
Gelman
Goeman
Granovskaia
Greco
Hanisch
Hartwell
Honkela
Hu
Ideker
Irizarry
Juha E. A. Knuuttila
Kanehisa
Kerrien
Kilpinen
Kong
Kurihara
Lage
Lamb
Law
Lee
Leo Lahti
Liang
Loots
Lucas
Lukk
Madeira
Montaner
Nacu
Nam
Nuyten
Nymark
Rachlin
Reiss
Roth
Samuel Kaski
Sanguinetti
Schaefer
Scherf
Schmid
Shiga
Su
Tanay
Tarca
Ulitsky
Wilkinson
Wu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 02/02/2012

Motivation: Cell-biological processes are regulated through a complex network of interactions between genes and their products. The processes, their activating conditions, and the associated transcriptional responses are often unknown. Organism-wide modeling of network activation can reveal unique and shared mechanisms between physiological conditions, and potentially as yet unknown processes. We introduce a novel approach for organism-wide discovery and analysis of transcriptional responses in interaction networks. The method searches for local, connected regions in a network that exhibit coordinated transcriptional response in a subset of conditions. Known interactions between genes are used to limit the search space and to guide the analysis. Validation on a human pathway network reveals physiologically coherent responses, functional relatedness between physiological conditions, and coordinated, context-specific regulation of the genes. Availability: Implementation is freely available in R and Matlab at http://netpro.r-forge.r-project.org. Comment: 19 pages, 13 figures

arXiv.org e-Print Archive

Crossref

The PSI semantic validator: a framework to check MIAPE compliance of proteomics data

Author: Aranda Bruno
Hermjakob Henning
Jones Andrew R
Kerrien Samuel
Martens Lennart
Montecchi-Palazzi Luisa
Reisinger Florian
Publication venue: 'Wiley'
Publication date: 01/01/2009

The Human Proteome Organization's Proteomics Standards Initiative (PSI) promotes the development of exchange standards to improve data integration and interoperability. PSI specifies the suitable level of detail required when reporting a proteomics experiment (via the Minimum Information About a Proteomics Experiment), and provides extensible markup language (XML) exchange formats and dedicated controlled vocabularies (CVs) that must be combined to generate a standard-compliant document. The framework presented here tackles the issue of checking that experimental data reported using a specific format, CVs and public bio-ontologies (e.g. Gene Ontology, NCBI taxonomy) are compliant with the Minimum Information About a Proteomics Experiment recommendations. The semantic validator not only checks the XML syntax but also enforces rules regarding the use of an ontology class or CV terms by checking that the terms exist in the resource and that they are used in the correct location of a document. Moreover, this framework is extremely fast, even on sizable data files, and flexible, as it can be adapted to any standard by customizing the parameters it requires: an XML Schema Definition, one or more CVs or ontologies, and a mapping file describing in a formal way how the semantic resources and the format are interrelated. As such, the validator provides a general solution to the common problem in data exchange: how to validate the correct usage of a data standard beyond simple XML Schema Definition validation. The framework source code and its various applications can be found at http://psidev.info/validator

Ghent University Academic Bibliography

Protein interaction data curation: the International Molecular Exchange (IMEx) consortium.

Author: A Ceol
A Chatr-aryamontri
Alan Bridge
Andrew Chatr-aryamontri
Arathi Raghunath
B Aranda
Bernd Roechert
BJ Breitkreutz
Bruno Aranda
C Alfarano
C Prieto
Carol Chen
D Szklarczyk
David J Lynn
DJ Lynn
E Chautard
Emilie Chautard
EW Sayers
F Leitner
Fiona S L Brinkman
GD Bader
Gianni Cesareni
H Hermjakob
Henning Hermjakob
I Xenarios
Igor Jurisica
Ioannis Xenarios
J Goll
JC Rain
JF Rual
Jignesh Bhate
Johannes Goll
Jyoti Khadake
KR Brown
L Giot
L Montecchi-Palazzi
L Salwinski
Leonardo Briganti
Linda I Hannick
Livia Perfetto
Lukasz Salwinski
Marine Dumousseau
Mike Tyers
Peter Uetz
Robert E W Hancock
S Kerrien
S Orchard
Samuel Kerrien
Sandra Orchard
Sara Abbani
Shelby Bidwell
Sylvie Ricard-Blum
TS Keshava Prasad
U Guldener
U Stelzl
Usha Mahadevan
VM Perreau
Volker Stümpflen
YC Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.

Crossref

Serveur académique lausannois

PubMed Central

PuSH

ART

IntAct: an open source molecular interaction database

Author: Apweiler Rolf
Armstrong John
Bairoch Amos
Cesareni Gianni
Hermjakob Henning
Kerrien Samuel
Lewington Chris
Margalit Hanah
Montecchi-Palazzi Luisa
Mudali Sugath
Orchard Sandra
Roechert Bernd
Roepstorff Peter
Sherman David
Valencia Alfonso
Vingron Martin
Publication date: 01/01/2004

IntAct provides an open source database and toolkit for the storage, presentation and analysis of protein interactions. The web interface provides both textual and graphical representations of protein interactions, and allows exploring interaction networks in the context of the GO annotations of the interacting proteins. A web service allows direct computational access to retrieve interaction networks in XML format. IntAct currently contains 2200 binary and complex interactions imported from the literature and curated in collaboration with the Swiss-Prot team, making intensive use of controlled vocabularies to ensure data consistency. All IntAct software, data and controlled vocabularies are available at http://www.ebi.ac.uk/intact

PubMed Central

ART

Archive ouverte UNIGE