29 research outputs found

    Biblio-MetReS for user-friendly mining of genes and biological processes in scientific documents

    One way to initiate the reconstruction of molecular circuits is by using automated text-mining techniques. Developing more efficient methods for such reconstruction is a topic of active research, and those methods are typically included by bioinformaticians in pipelines used to mine and curate large literature datasets. Nevertheless, experimental biologists have few user-friendly tools available that use text mining for network reconstruction and require no programming skills. One of these tools is Biblio-MetReS. Originally, this tool permitted an on-the-fly analysis of documents contained in a number of web-based literature databases to identify co-occurrence of proteins/genes. This approach ensured results that were always up to date with the latest live version of the databases. However, this 'up-to-dateness' came at the cost of long execution times. Here we report an evolution of the application Biblio-MetReS that permits constructing co-occurrence networks for genes, GO processes, pathways, or any combination of the three types of entities, and graphically representing those entities. We show that the performance of Biblio-MetReS in identifying gene co-occurrence is at least as good as that of other comparable applications (STRING and iHOP). In addition, we show that the identification of GO processes is on par with that reported in the latest BioCreAtIvE challenge. Finally, we report the implementation of a new strategy that combines on-the-fly analysis of new documents with preprocessed information from documents that were encountered in previous analyses. This combination simultaneously decreases program run time and maintains the 'up-to-dateness' of the results.
    RA was partially supported by the Ministerio de Ciencia e Innovación (MICINN, Spain) through grant BFU2010-17704. FS was partially funded by the MICINN through grant TIN2011-28689-C02-02. The authors are members of the research groups 2009SGR809 and 2009SGR145, funded by the "Generalitat de Catalunya". AU is funded by a Generalitat de Catalunya (AGAUR) PhD fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
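The combined strategy described in the abstract above (analyse new documents on the fly, reuse stored results for documents seen before) can be illustrated with a minimal sketch. This is not the Biblio-MetReS implementation: the function names, the sentence-level co-occurrence rule, the in-memory `cache` dictionary, and the whole-word matching against a `lexicon` of entity names are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def entities_in(sentence, lexicon):
    """Return the lexicon entities mentioned in one sentence (case-insensitive, whole words)."""
    words = set(re.findall(r"[A-Za-z0-9\-]+", sentence.lower()))
    return {entity for entity in lexicon if entity.lower() in words}


def document_pairs(text, lexicon):
    """Count entity pairs that co-occur within the same sentence of one document."""
    pairs = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        found = sorted(entities_in(sentence, lexicon))
        pairs.update(combinations(found, 2))
    return pairs


def cooccurrence_network(documents, lexicon, cache):
    """Build edge weights for a co-occurrence network.

    documents: dict mapping a document id to its text.
    cache: dict mapping a document id to pair counts from earlier runs
           (hypothetical stand-in for preprocessed information).
    """
    network = Counter()
    for doc_id, text in documents.items():
        if doc_id not in cache:  # only documents not seen before are analysed
            cache[doc_id] = document_pairs(text, lexicon)
        network.update(cache[doc_id])
    return network
```

In this sketch a second call with the same `cache` skips every document already analysed, which is where the run-time saving would come from, while any document id not yet in `cache` is still processed on the fly.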

    Biblio-MetReS: A bibliometric network reconstruction application and server

    Background: Reconstruction of gene and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user-friendly tool is available that simultaneously analyzes the latest set of scientific documents available online and reconstructs the set of genes referenced in those documents.
    Results: This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other widely used user-friendly applications (iHOP, STRING). Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user-friendly tools. Furthermore, analysis of full-text documents provides more complete reconstructions than those that result from using only the abstract of the document.
    Conclusions: Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high, and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio-MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy-to-use environment for researchers to reconstruct their networks of interest from an always up-to-date set of scientific documents.

    Human protein reference database—2006 update

    Human Protein Reference Database (HPRD) was developed to serve as a comprehensive collection of protein features, post-translational modifications (PTMs) and protein–protein interactions. Since the original report, this database has increased to >20 000 protein entries and has become the largest database for literature-derived protein–protein interactions (>30 000) and PTMs (>8000) for human proteins. We have also introduced several new features in HPRD, including: (i) protein isoforms, (ii) enhanced search options, (iii) linking of pathway annotations and (iv) integration of a novel browser, GenProt Viewer, developed by us, that allows integration of genomic and proteomic information. With the continued support and active participation of the biomedical community, we expect HPRD to become a unique source of curated information for the human proteome and to spur biomedical discoveries based on integration of genomic, transcriptomic and proteomic data.

    Two Component Systems: Physiological Effect of a Third Component

    Signal transduction systems mediate the response and adaptation of organisms to environmental changes. In prokaryotes, this signal transduction is often carried out by Two Component Systems (TCS). These TCS are phosphotransfer protein cascades, and in their prototypical form they are composed of a sensor kinase (SK) that senses the environmental signals and a response regulator (RR) that regulates the cellular response. This basic motif can be modified by the addition of a third protein that interacts with either the SK or the RR in a way that could change the dynamic response of the TCS module. In this work we aim to understand the effect of such an additional protein (which we call the "third component") on the functional properties of a prototypical TCS. To do so, we build mathematical models of TCS with alternative designs for their interaction with that third component. These mathematical models are analyzed to identify the differences in dynamic behavior inherent to each design, with respect to functionally relevant properties such as sensitivity to changes in either the parameter values or the molecular concentrations, temporal responsiveness, possibility of multiple steady states, or stochastic fluctuations in the system. The differences are then correlated with the physiological requirements that impinge on the functioning of the TCS. This analysis sheds light on both the dynamic behavior of synthetically designed TCS and the conditions under which natural selection might favor each of the designs. We find that a third component that modulates SK activity increases the parameter space where a bistable response of the TCS module to signals is possible if the SK is monofunctional, but decreases it when the SK is bifunctional. The presence of a third component that modulates RR activity decreases the parameter space where a bistable response of the TCS module to signals is possible.
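The idea of measuring "the parameter space where a bistable response is possible" can be sketched by random sampling. The example below does not use any of the TCS models from the paper; it assumes a hypothetical one-variable positive-feedback toy model, dx/dt = k0 + k1·x²/(K² + x²) − k2·x, and log-uniform sampling of its four parameters. The names `is_bistable` and `bistable_fraction` and the sampling ranges are illustrative assumptions.

```python
import math
import random


def is_bistable(k0, k1, k2, K):
    """True if the toy model has three distinct positive steady states.

    Steady states solve a*x^3 + b*x^2 + c*x + d = 0 with the coefficients
    below. For positive parameters the signs alternate (-, +, -, +), so the
    cubic has no negative or zero roots, and a positive discriminant
    (three distinct real roots) means two stable states and one unstable.
    """
    a, b, c, d = -k2, k0 + k1, -k2 * K**2, k0 * K**2
    disc = (18 * a * b * c * d - 4 * b**3 * d + b**2 * c**2
            - 4 * a * c**3 - 27 * a**2 * d**2)
    return disc > 0


def bistable_fraction(n_samples=10000, low=1e-3, high=1e3, seed=0):
    """Estimate the percentage of sampled parameter sets that are bistable."""
    rng = random.Random(seed)

    def draw():
        # Log-uniform draw between low and high (an assumed sampling scheme).
        return 10 ** rng.uniform(math.log10(low), math.log10(high))

    hits = sum(is_bistable(draw(), draw(), draw(), draw())
               for _ in range(n_samples))
    return 100.0 * hits / n_samples
```

Comparing such percentages between alternative model designs, each with its own steady-state condition, is the kind of comparison the abstract summarises; the numbers produced by this toy model carry no biological meaning.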

    Development and application of computational methodologies for Integrated Molecular Systems Biology

    The aim of the work presented in this thesis was the development and application of computational methodologies that integrate sequence, functional, and genomic information to provide tools for the reconstruction, annotation and organization of complete proteomes, in such a way that the results can be compared between any number of organisms with fully sequenced genomes. Methodologically, I focused on identifying molecular organization within the complete proteome of a reference organism, linking each protein in that proteome to the proteins of other organisms, so that the two proteomes can be compared at the spatial, structural, functional, cellular, tissue, developmental or physiological level. The methodology was applied to the problem of identifying appropriate model organisms for studying different biological phenomena. This was done by comparing the sets of proteins involved in different biological phenomena in Saccharomyces cerevisiae and Homo sapiens with the corresponding sets from other organisms with fully sequenced genomes. The thesis concludes by presenting a web server, Homol-MetReS, on which the methodology is implemented. Homol-MetReS provides the scientific community with an open-source environment in which multiple levels of proteome comparison and analysis can be performed.

    Database Constraints Applied to Metabolic Pathway Reconstruction Tools

    Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, that access the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we adjusted and tuned the configurable parameters of the database server to obtain the best performance of the data link to and from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented in HBase, a MapReduce-based database. The results indicated that the standard configuration of MySQL gives acceptable performance for small or medium-sized databases. Nevertheless, tuning database parameters can greatly improve performance and lead to very competitive runtimes.

    Percentage of parameter space where bistable responses are possible (a).

    (a) Some bidimensional sections of the multidimensional parameter space of bistability are shown in Figure S2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0031095#pone.0031095.s002). The results show that in TCS with a bifunctional SK, both a TC_SK and a TC_RR cause a decrease in the size of the parametric region of bistability, with one exception: Model C has a larger parametric region of bistability when the signaling target is SK autophosphorylation (k_1). However, in systems with a monofunctional SK, a TC_SK causes an increase and a TC_RR causes a decrease in the size of the parametric region of bistability if the environment modulates the SK dephosphorylation (k_2). A|B stands for Model A controlled for Model B. A|C stands for Model A controlled for Model C.