
    Representing and analysing molecular and cellular function in the computer

    Determining the biological function of a myriad of genes, and understanding how they interact to yield a living cell, is the major challenge of the post genome-sequencing era. The complexity of biological systems is such that this cannot be envisaged without the help of powerful computer systems capable of representing and analysing the intricate networks of physical and functional interactions between the different cellular components. In this review we try to provide the reader with an appreciation of where we stand in this regard. We discuss some of the inherent problems in describing the different facets of biological function, give an overview of how information on function is currently represented in the major biological databases, and describe different systems for organising and categorising the functions of gene products. In a second part, we present a new general data model, currently under development, which describes information on molecular function and cellular processes in a rigorous manner. The model is capable of representing a large variety of biochemical processes, including metabolic pathways, regulation of gene expression and signal transduction. It also incorporates taxonomies for categorising molecular entities, interactions and processes, and it offers means of viewing the information at different levels of resolution, and dealing with incomplete knowledge. The data model has been implemented in the database on protein function and cellular processes 'aMAZE' (http://www.ebi.ac.uk/research/pfbp/), which presently covers metabolic pathways and their regulation. Several tools for querying, displaying, and performing analyses on such pathways are briefly described in order to illustrate the practical applications enabled by the model

    Defining a robust biological prior from Pathway Analysis to drive Network Inference

    Inferring genetic networks from gene expression data is one of the most challenging tasks of the post-genomic era, partly due to the vast space of possible networks and the relatively small amount of data available. In this field, the Gaussian Graphical Model (GGM) provides a convenient framework for the discovery of biological networks. In this paper, we propose an original approach for inferring gene regulation networks using a robust biological prior on their structure in order to limit the set of candidate networks. Pathways, which represent biological knowledge of the regulatory networks, are used as informative prior knowledge to drive network inference. This approach is based on the selection of a relevant set of genes, called the "molecular signature", associated with a condition of interest (for instance, the genes involved in disease development). In this context, differential expression analysis is a well-established strategy. However, the resulting signatures are often inconsistent and show little overlap between studies. We therefore dedicate the first part of our work to improving the standard process of biomarker identification to guarantee the robustness and reproducibility of the molecular signature. Our approach makes it possible to compare the networks inferred under two conditions of interest (for instance, case and control networks) and aids the biological interpretation of the results. It thus allows the identification of differential regulations that occur in these conditions. We illustrate the proposed approach by applying our method to a study of breast cancer's response to treatment

    Reactome - a knowledgebase of human biological pathways

    Pathway curation is a powerful tool for systematically associating gene products with functions. Reactome (www.reactome.org) is a manually curated human pathway knowledgebase describing a wide range of biological processes in a computationally accessible manner. The core unit of the Reactome data model is the Reaction, whose instances form a network of biological interactions through entities that are consumed, produced, or act as catalysts. Entities are distinguished by their molecular identities and cellular locations. Set objects allow grouping of related entities. Curation is based on communication between expert authors and staff curators, facilitated by freely available data entry tools. Manually curated data are subjected to quality control and peer review by a second expert. Reactome data are released quarterly. At release time, electronic orthology inference performed on human data produces reaction predictions in 22 species ranging from mouse to bacteria. Cross-references to a large number of publicly available databases are attached, providing multiple entry points into the database. The Reactome Mart allows query submission and data retrieval from Reactome and across other databases. The SkyPainter tool provides visualization and statistical analysis of user supplied data, e.g. from microarray experiments. Reactome data are freely available in a number of data formats (e.g. BioPax, SBML)
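
    The reaction-centred data model described in this abstract can be illustrated with a small sketch. This is not Reactome's actual schema or API: the class names, field names, and the `reaction_links` helper are hypothetical, and the two glycolysis steps are ordinary textbook examples rather than records taken from the knowledgebase. The sketch only shows how reactions could form a network through the entities they consume, produce, or use as catalysts, with entities distinguished by molecular identity and cellular location.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Entity:
    """A molecule in a specific cellular location (hypothetical class)."""
    name: str
    compartment: str


@dataclass(frozen=True)
class Reaction:
    """A reaction with consumed, produced, and catalysing entities (hypothetical class)."""
    name: str
    inputs: FrozenSet[Entity]
    outputs: FrozenSet[Entity]
    catalysts: FrozenSet[Entity] = frozenset()


def reaction_links(reactions: List[Reaction]) -> List[Tuple[str, str]]:
    """Return (upstream, downstream) pairs where an output of one reaction
    is an input or catalyst of another."""
    links = []
    for up in reactions:
        for down in reactions:
            if up is not down and up.outputs & (down.inputs | down.catalysts):
                links.append((up.name, down.name))
    return links


# Illustrative entities; names and compartments are assumptions for the example.
glucose = Entity("glucose", "cytosol")
g6p = Entity("glucose 6-phosphate", "cytosol")
f6p = Entity("fructose 6-phosphate", "cytosol")
hexokinase = Entity("hexokinase", "cytosol")
isomerase = Entity("glucose-6-phosphate isomerase", "cytosol")

phosphorylation = Reaction(
    "glucose phosphorylation",
    inputs=frozenset({glucose}),
    outputs=frozenset({g6p}),
    catalysts=frozenset({hexokinase}),
)
isomerisation = Reaction(
    "G6P isomerisation",
    inputs=frozenset({g6p}),
    outputs=frozenset({f6p}),
    catalysts=frozenset({isomerase}),
)

links = reaction_links([phosphorylation, isomerisation])
```

    Because `Entity` compares on both name and compartment, glucose in the cytosol and glucose in another compartment would be treated as different entities, which mirrors the abstract's statement that entities are distinguished by identity and location.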

    HAGR: the Human Ageing Genomic Resources

    The Human Ageing Genomic Resources (HAGR) is a collection of online resources for studying the biology of human ageing. HAGR features two main databases: GenAge and AnAge. GenAge is a curated database of genes related to human ageing. Entries were primarily selected based on genetic perturbations in animal models and human diseases as well as an extensive literature review. Each entry includes a variety of automated and manually curated information, including, where available, protein–protein interactions, the relevant literature, and a description of the gene and how it relates to human ageing. The goal of GenAge is to provide the most complete and comprehensive database of genes related to human ageing on the Internet as well as render an overview of the genetics of human ageing. AnAge is an integrative database describing the ageing process in several organisms and featuring, if available, maximum life span, taxonomy, developmental schedules and metabolic rate, making AnAge a unique resource for the comparative biology of ageing. Associated with the databases are data-mining tools and software designed to investigate the role of genes and proteins in the human ageing process as well as analyse ageing across different taxa. HAGR is freely available to the academic community at http://genomics.senescence.info

    How to understand the cell by breaking it: network analysis of gene perturbation screens

    Modern high-throughput gene perturbation screens are key technologies at the forefront of genetic research. Combined with rich phenotypic descriptors they enable researchers to observe detailed cellular reactions to experimental perturbations on a genome-wide scale. This review surveys the current state-of-the-art in analyzing perturbation screens from a network point of view. We describe approaches to make the step from the parts list to the wiring diagram by using phenotypes for network inference and integrating them with complementary data sources. The first part of the review describes methods to analyze one- or low-dimensional phenotypes like viability or reporter activity; the second part concentrates on high-dimensional phenotypes showing global changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio

    Current advances in systems and integrative biology

    Systems biology has gained a tremendous amount of interest in the last few years. This is partly due to the realization that traditional approaches focusing only on a few molecules at a time cannot describe the impact of aberrant or modulated molecular environments across a whole system. Furthermore, a hypothesis-driven study aims to prove or disprove its postulations, whereas a hypothesis-free systems approach can yield an unbiased and novel testable hypothesis as an end-result. This latter approach foregoes assumptions which predict how a biological system should react to an altered microenvironment within a cellular context, across a tissue or impacting on distant organs. Additionally, re-use of existing data by systematic data mining and re-stratification, one of the cornerstones of integrative systems biology, is also gaining attention. While tremendous efforts using a systems methodology have already yielded excellent results, it is apparent that a lack of suitable analytic tools and purpose-built databases poses a major bottleneck in applying a systematic workflow. This review addresses the current approaches used in systems analysis and obstacles often encountered in large-scale data analysis and integration which tend to go unnoticed, but have a direct impact on the final outcome of a systems approach. Its wide applicability, ranging from basic research, disease descriptors, pharmacological studies, to personalized medicine, makes this emerging approach well suited to address biological and medical questions where conventional methods are not ideal

    Uniformly curated signaling pathways reveal tissue-specific cross-talks and support drug target discovery

    Motivation: Signaling pathways control a large variety of cellular processes. However, currently, even within the same database signaling pathways are often curated at different levels of detail. This makes comparative and cross-talk analyses difficult. Results: We present SignaLink, a database containing 8 major signaling pathways from Caenorhabditis elegans, Drosophila melanogaster, and humans. Based on 170 review and approx. 800 research articles, we have compiled pathways with semi-automatic searches and uniform, well-documented curation rules. We found that in humans any two of the 8 pathways can cross-talk. We quantified the possible tissue- and cancer-specific activity of cross-talks and found pathway-specific expression profiles. In addition, we identified 327 proteins relevant for drug target discovery. Conclusions: We provide a novel resource for comparative and cross-talk analyses of signaling pathways. The identified multi-pathway and tissue-specific cross-talks contribute to the understanding of the signaling complexity in health and disease and underscore its importance in network-based drug target selection. Availability: http://SignaLink.org Comment: 9 pages, 4 figures, 2 tables and a supplementary info with 5 Figures and 13 Table

    Unbiased protein association study on the public human proteome reveals biological connections between co-occurring protein pairs

    Mass-spectrometry-based, high-throughput proteomics experiments produce large amounts of data. While typically acquired to answer specific biological questions, these data can also be reused in orthogonal ways to reveal new biological knowledge. We here present a novel method for such orthogonal data reuse of public proteomics data. Our method elucidates biological relationships between proteins based on the co-occurrence of these proteins across human experiments in the PRIDE database. The majority of the significantly co-occurring protein pairs that were detected by our method have been successfully mapped to existing biological knowledge. The validity of our novel method is substantiated by the extremely few pairs that can be mapped to existing knowledge based on random associations between the same set of proteins. Moreover, using literature searches and the STRING database, we were able to derive meaningful biological associations for unannotated protein pairs that were detected using our method, further illustrating that as-yet unknown associations present highly interesting targets for follow-up analysis
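
    The idea of scoring protein pairs by their co-occurrence across experiments can be sketched as follows. The abstract does not say which statistic the authors used, so the one-sided hypergeometric test here is an assumption chosen for illustration, and the function names, protein labels, and toy experiments are all hypothetical rather than drawn from PRIDE.

```python
from itertools import combinations
from math import comb
from typing import Dict, List, Set, Tuple


def cooccurrence_pvalue(n_total: int, n_a: int, n_b: int, n_both: int) -> float:
    """One-sided hypergeometric p-value: probability of seeing at least
    n_both shared experiments if proteins A and B occurred independently."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / comb(n_total, n_b)


def pair_pvalues(experiments: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Compute a co-occurrence p-value for every pair of observed proteins."""
    n_total = len(experiments)
    proteins = sorted(set().union(*experiments))
    # Number of experiments in which each protein was identified.
    counts = {p: sum(p in e for e in experiments) for p in proteins}
    result = {}
    for a, b in combinations(proteins, 2):
        n_both = sum(a in e and b in e for e in experiments)
        result[(a, b)] = cooccurrence_pvalue(n_total, counts[a], counts[b], n_both)
    return result


# Toy data (assumed): P1 and P2 always appear together, P3 never joins them.
experiments = [
    {"P1", "P2"}, {"P1", "P2"}, {"P1", "P2"},
    {"P3"}, {"P3"}, {"P3"},
]
pvals = pair_pvalues(experiments)
```

    In this toy data the pair (P1, P2) receives a small p-value and the pairs involving P3 receive a p-value of 1. A real analysis over many thousands of pairs would also need a multiple-testing correction, which the sketch omits.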