23 research outputs found

    BNDB – The Biochemical Network Database

    Background: Technological advances in high-throughput techniques and efficient data acquisition methods have resulted in a massive amount of life science data. The data is stored in numerous databases that have been established over the last decades and are essential resources for scientists nowadays. However, the diversity of the databases and the underlying data models make it difficult to combine this information for solving complex problems in systems biology. Currently, researchers typically have to browse several, often highly focused, databases to obtain the required information. Hence, there is a pressing need for more efficient systems for integrating, analyzing, and interpreting these data. The standardization and virtual consolidation of the databases is a major challenge resulting in a unified access to a variety of data sources.
    Description: We present the Biochemical Network Database (BNDB), a powerful relational database platform, allowing a complete semantic integration of an extensive collection of external databases. BNDB is built upon a comprehensive and extensible object model called BioCore, which is powerful enough to model most known biochemical processes and at the same time easily extensible to be adapted to new biological concepts. Besides a web interface for the search and curation of the data, a Java-based viewer (BiNA) provides a powerful platform-independent visualization and navigation of the data. BiNA uses sophisticated graph layout algorithms for an interactive visualization and navigation of BNDB.
    Conclusion: BNDB allows a simple, unified access to a variety of external data sources. Its tight integration with the biochemical network library BN++ offers the possibility for import, integration, analysis, and visualization of the data. BNDB is freely accessible at http://www.bndb.org.

    GeneTrail—advanced gene set enrichment analysis

    We present a comprehensive and efficient gene set analysis tool, called ‘GeneTrail’ that offers a rich functionality and is easy to use. Our web-based application facilitates the statistical evaluation of high-throughput genomic or proteomic data sets with respect to enrichment of functional categories. GeneTrail covers a wide variety of biological categories and pathways, among others KEGG, TRANSPATH, TRANSFAC, and GO. Our web server provides two common statistical approaches, ‘Over-Representation Analysis’ (ORA) comparing a reference set of genes to a test set, and ‘Gene Set Enrichment Analysis’ (GSEA) scoring sorted lists of genes. Besides other newly developed features, GeneTrail's statistics module includes a novel dynamic-programming algorithm that improves the P-value computation of GSEA methods considerably. GeneTrail is freely accessible at http://genetrail.bioinf.uni-sb.d
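
The over-representation idea mentioned in this abstract can be illustrated with a small sketch. It assumes the test set is a subset of the reference set and uses an upper-tail hypergeometric test; the abstract does not state which test GeneTrail applies, so the function name `ora_p_value`, its parameters, and the numbers in the usage lines are hypothetical and only illustrate the general technique.

```python
from math import comb


def ora_p_value(n_reference, n_category, n_test, n_hits):
    """Upper-tail hypergeometric probability P(X >= n_hits).

    n_reference: number of genes in the reference set
    n_category:  reference genes annotated with the category
    n_test:      number of genes in the test set (assumed subset of the reference)
    n_hits:      test-set genes annotated with the category
    """
    max_hits = min(n_category, n_test)
    total = comb(n_reference, n_test)  # all ways to draw the test set
    tail = sum(
        comb(n_category, k) * comb(n_reference - n_category, n_test - k)
        for k in range(n_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 reference genes, 100 in the category,
# 200 genes in the test set, 5 of them in the category.
p = ora_p_value(n_reference=20000, n_category=100, n_test=200, n_hits=5)
print(f"ORA p-value: {p:.4g}")
```

A small p-value suggests the category occurs in the test set more often than expected by chance; in practice a multiple-testing correction would be applied across all categories tested.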

    A novel algorithm for detecting differentially regulated paths based on gene set enrichment analysis

    Motivation: Deregulated signaling cascades are known to play a crucial role in many pathogenic processes, among them tumor initiation and progression. In the recent past, modern experimental techniques that measure the amount of mRNA transcripts of almost all known human genes in a tissue, or even in a single cell, have opened new avenues for studying the activity of signaling cascades and for understanding the information flow in the networks

    Challenges in integrating Escherichia coli molecular biology data

    One key challenge in Systems Biology is to provide mechanisms to collect and integrate the data needed to meet multiple analysis requirements. Typically, biological contents are scattered over multiple data sources and there is no easy way of comparing heterogeneous data contents. This work discusses ongoing standardisation and interoperability efforts and exposes integration challenges for the model organism Escherichia coli K-12. The goal is to analyse the major obstacles faced by integration processes, suggest ways to systematically identify them, and, whenever possible, propose solutions or means to assist manual curation. Integration of gene, protein and compound data was evaluated by performing comparisons over EcoCyc, KEGG, BRENDA, ChEBI, Entrez Gene and UniProt contents. Cross-links, a number of standard nomenclatures and name information supported the comparisons. Except in the gene integration scenario, no single element of integration performed well enough to support the process by itself. Indeed, the integration of both enzyme and compound records requires considerable curation. The results show that, even for a well-studied model organism, source contents are still far from being as standardized as would be desired, and metadata varies considerably from source to source. Before designing any data integration pipeline, researchers should decide on the sources that best fit the purpose of analysis and be aware of existing conflicts and inconsistencies so that they can intervene in their resolution. Moreover, they should be aware of the limits of automatic integration so that they can define the extent of necessary manual curation for each application. Portuguese FCT funded MIT-Portugal Program in Bioengineering (MIT-Pt/BS-BB/0082/2008); PhD grant from FCT (ref. SFRH/BD/22863/2005) to S.

    Biana: a software framework for compiling biological interactions and analyzing networks

    Background: The analysis and usage of biological data is hindered by the spread of information across multiple repositories and the difficulties posed by different nomenclature systems and storage formats. In particular, there is an important need for data unification in the study and use of protein-protein interactions. Without good integration strategies, it is difficult to analyze the whole set of available data and its properties.
    Results: We introduce BIANA (Biologic Interactions and Network Analysis), a tool for biological information integration and network management. BIANA is a Python framework designed to achieve two major goals: i) the integration of multiple sources of biological information, including biological entities and their relationships, and ii) the management of biological information as a network where entities are nodes and relationships are edges. Moreover, BIANA uses properties of proteins and genes to infer latent biomolecular relationships by transferring edges to entities sharing similar properties. BIANA is also provided as a plugin for Cytoscape, which allows users to visualize and interactively manage the data. A web interface to BIANA providing basic functionalities is also available. The software can be downloaded under GNU GPL license from http://sbi.imim.es/web/BIANA.php.
    Conclusions: BIANA's approach to data unification solves many of the nomenclature issues common to systems dealing with biological data. BIANA can easily be extended to handle new specific data repositories and new specific data types. The unification protocol allows BIANA to be a flexible tool suitable for different user requirements: non-expert users can use a suggested unification protocol while expert users can define their own specific unification rules.

    An integer linear programming approach for finding deregulated subgraphs in regulatory networks

    Deregulation of cell signaling pathways plays a crucial role in the development of tumors. The identification of such pathways requires effective analysis tools that facilitate the interpretation of expression differences. Here, we present a novel and highly efficient method for identifying deregulated subnetworks in a regulatory network. Given a score for each node that measures the degree of deregulation of the corresponding gene or protein, the algorithm computes the heaviest connected subnetwork of a specified size reachable from a designated root node. This root node can be interpreted as a molecular key player responsible for the observed deregulation. To demonstrate the potential of our approach, we analyzed three gene expression data sets. In one scenario, we compared expression profiles of non-malignant primary mammary epithelial cells derived from BRCA1 mutation carriers and of epithelial cells without BRCA1 mutation. Our results suggest that oxidative stress plays an important role in epithelial cells of BRCA1 mutation carriers and that the activation of stress proteins may result in avoidance of apoptosis leading to an increased overall survival of cells with genetic alterations. In summary, our approach opens new avenues for the elucidation of pathogenic mechanisms and for the detection of molecular key players
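
The optimization problem described in this abstract (the heaviest subnetwork of a given size in which every node is reachable from a designated root) can be made concrete with a brute-force sketch. This is not the integer linear programming method the paper presents: it simply enumerates all candidate node sets, which is only feasible for tiny networks. The directed adjacency-dict representation, the function names, and the toy scores are all assumptions made for illustration.

```python
from itertools import combinations


def reachable_within(root, nodes, edges):
    """Nodes of `nodes` reachable from `root` using only edges that stay inside `nodes`."""
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in edges.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def heaviest_rooted_subgraph(scores, edges, root, k):
    """Exhaustively search for the best-scoring k-node set rooted at `root`."""
    others = [n for n in scores if n != root]
    best, best_score = None, float("-inf")
    for combo in combinations(others, k - 1):
        nodes = {root, *combo}
        # Keep only node sets in which every node is reachable from the root.
        if reachable_within(root, nodes, edges) != nodes:
            continue
        total = sum(scores[n] for n in nodes)
        if total > best_score:
            best, best_score = nodes, total
    return best, best_score


# Toy network: node scores stand in for per-gene deregulation scores.
scores = {"A": 1.0, "B": 0.5, "C": 2.0, "D": 3.0}
edges = {"A": ["B", "C"], "B": ["D"]}
print(heaviest_rooted_subgraph(scores, edges, root="A", k=3))
```

In the toy network, the high-scoring node D can only be included together with B, because B is its only connection to the root. The enumeration grows combinatorially with network size, which is why an exact formulation such as the ILP named in the title is needed for realistic regulatory networks.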

    Bioinformatics approaches for cancer research

    Cancer is the consequence of genetic alterations that influence the behavior of affected cells. While the phenotypic effects of cancer, such as unlimited proliferation, are common hallmarks of this complex class of diseases, the connections between the genetic alterations and these effects are not always evident. The growth of information generated by experimental high-throughput techniques makes it possible to combine heterogeneous data from different sources to gain new insights into these complex molecular processes. The demand on computational biology to develop tools and methods to facilitate the evaluation of such data has increased accordingly. To this end, we developed new approaches and bioinformatics tools for the analysis of high-throughput data. Additionally, we integrated these new approaches into our comprehensive C++ framework GeneTrail. GeneTrail is a powerful package that combines information retrieval, statistical evaluation of gene sets, result presentation, and data exchange. To make GeneTrail's capabilities available to the research community, we implemented a graphical user interface in PHP and set up a web server that is accessible worldwide. In this thesis, we discuss newly integrated algorithms and extensions of GeneTrail, as well as some comprehensive studies that have been performed with GeneTrail in the context of cancer research. We applied GeneTrail to analyze properties of tumor-associated antigens to elucidate the mechanisms of antigen candidate selection. Furthermore, we performed an extensive analysis of miRNAs and their putative target pathways and networks in cancer. In the field of differential network analysis, we employed a combination of expression values and topological data to identify patterns of deregulated subnetworks and putative key players for the deregulation. Signatures of deregulated subnetworks may help to predict the sensitivity of tumor subtypes to therapeutic agents and, hence, may be used in the future to guide the selection of optimal agents. Furthermore, the identified putative key players may represent oncogenes, tumor suppressor genes, or other genes that contribute to crucial changes of regulatory and signaling processes in cancer cells and may serve as potential targets for an individualized tumor therapy. With these applications, we demonstrate the usefulness of our GeneTrail package and hope that our work will contribute to a better understanding of cancer.