121 research outputs found

    Waveomics: bringing experimental data to online collaboration

    Systems biology offers an interdisciplinary approach to scientific research that typically involves the collaboration of teams of experimentalists and mathematical modellers. While the importance of data standards has been recognised in facilitating exchange of data between the parties, challenges still remain regarding the practicalities of disseminating experimental data.

The introduction of novel web-based tools aimed at promoting collaborative work has provided a platform upon which scientific applications can be built. The recently released Google Wave protocol provides a facility for real-time collaboration between teams of researchers.

This work introduces a customized Robot that automatically scans text in Google Waves for experimental data identifiers, extracts the corresponding experimental data from the remote resources associated with those identifiers, and appends charts of the data to the Wave.

    The SBML Level 3 Annotation package: an initial proposal

    The SBML Level 3 Annotation package proposal intends to extend the current Level 3 Core annotations by increasing the range of RDF features supported.

Such an extension will support RDF reification, annotation of SBML attributes, specification of relationships between annotations, negation of annotations, cross-references, and cross-element annotations.

    Reconstruction of an in silico metabolic model of _Arabidopsis thaliana_ through database integration

    The number of genome-scale metabolic models has been rising quickly in recent years, and the scope of their utilization encompasses a broad range of applications from metabolic engineering to biological discovery. However, the reconstruction of such models remains an arduous process requiring a high level of human intervention. Their utilization is further hampered by the absence of standardized data and annotation formats and the lack of recognized quality and validation standards.

Plants provide a particularly rich range of perspectives for applications of metabolic modeling. We here report the first effort towards the reconstruction of a genome-scale model of the metabolic network of the plant _Arabidopsis thaliana_, including over 2300 reactions and compounds. Our reconstruction was performed using a semi-automatic methodology based on the integration of two public genome-wide databases, significantly accelerating the process. Database entries were compared and integrated with each other, allowing us to resolve discrepancies and enhance the quality of the reconstruction. This process led to the construction of three models based on different quality and validation standards, providing users with the possibility to choose the standard that is most appropriate for a given application. First, a _core metabolic model_ containing only consistent data provides a high quality model that was shown to be stoichiometrically consistent. Second, an _intermediate metabolic model_ attempts to fill gaps and provides better continuity. Third, a _complete metabolic model_ contains the full set of known metabolic reactions and compounds in _Arabidopsis thaliana_.

We provide an annotated SBML file of our core model to enable the maximum level of compatibility with existing tools and databases. Finally, we discuss a series of principles to raise awareness of the need to develop coordinated efforts and common standards for the reconstruction of genome-scale metabolic models, with the aim of enabling their widespread diffusion, frequent update, maximum compatibility and convenience of use by the wider research community and industry.

    Towards a genome-scale kinetic model of cellular metabolism

    BACKGROUND: Advances in bioinformatic techniques and analyses have led to the availability of genome-scale metabolic reconstructions. The size and complexity of such networks often means that their potential behaviour can only be analysed with constraint-based methods. Whilst requiring minimal experimental data, such methods are unable to give insight into cellular substrate concentrations. Instead, the long-term goal of systems biology is to use kinetic modelling to characterize fully the mechanics of each enzymatic reaction, and to combine such knowledge to predict system behaviour. RESULTS: We describe a method for building a parameterized genome-scale kinetic model of a metabolic network. Simplified linlog kinetics are used and the parameters are extracted from a kinetic model repository. We demonstrate our methodology by applying it to yeast metabolism. The resultant model has 956 metabolic reactions involving 820 metabolites, and, whilst approximative, has a considerably broader remit than any existing models of its type. Control analysis is used to identify key steps within the system. CONCLUSIONS: Our modelling framework may be considered a stepping-stone toward the long-term goal of a fully-parameterized model of yeast metabolism. The model is available in SBML format from the BioModels database (BioModels ID: MODEL1001200000) and at http://www.mcisb.org/resources/genomescale/.

    DeepGraphMol, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach

    We address the problem of generating novel molecules with desired interaction properties as a multi-objective optimization problem. Interaction binding models are learned from binding data using graph convolution networks (GCNs). Since the experimentally obtained property scores are recognised as having potentially gross errors, we adopted a robust loss for the model. Combinations of these terms, including drug likeness and synthetic accessibility, are then optimized using reinforcement learning based on a graph convolution policy approach. Some of the molecules generated, while legitimate chemically, can have excellent drug-likeness scores but appear unusual. We provide an example based on the binding potency of small molecules to dopamine transporters. We extend our method successfully to use a multi-objective reward function, in this case for generating novel molecules that bind with dopamine transporters but not with those for norepinephrine. Our method should be generally applicable to the generation in silico of molecules with desirable properties.

    Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    BACKGROUND: The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. RESULTS: This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. 
CONCLUSION: MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository interfaces in bioinformatics. The Pierre MDA is capable of supporting common database access requirements with a variety of auto-generated interfaces and across a variety of repositories. With Pierre, four kinds of interfaces are generated: web, stand-alone application, text-menu, and command line. The kinds of repositories with which Pierre interfaces have been used are relational, XML and object databases.

    An informatic pipeline for the data capture and submission of quantitative proteomic data using iTRAQ(TM)

    BACKGROUND: Proteomics continues to play a critical role in post-genomic science as continued advances in mass spectrometry and analytical chemistry support the separation and identification of increasing numbers of peptides and proteins from their characteristic mass spectra. In order to facilitate the sharing of this data, various standard formats have been, and continue to be, developed. However, these are still not fully mature and are not yet able to cope with the increasing number of quantitative proteomic technologies that are being developed. RESULTS: We propose an extension to the PRIDE and mzData XML schema to accommodate the concept of multiple samples per experiment, and in addition, capture the intensities of the iTRAQ(TM) reporter ions in the entry. A simple Java-client has been developed to capture and convert the raw data from common spectral file formats, which also uses a third-party open source tool for the generation of iTRAQ(TM) reported intensities from Mascot output, into a valid PRIDE XML entry. CONCLUSION: We describe an extension to the PRIDE and mzData schemas to enable the capture of quantitative data. Currently this is limited to iTRAQ(TM) data but is readily extensible for other quantitative proteomic technologies. Furthermore, a software tool has been developed which enables conversion from various mass spectrum file formats and corresponding Mascot peptide identifications to PRIDE formatted XML. The tool represents a simple approach to preparing quantitative and qualitative data for submission to repositories such as PRIDE, which is necessary to facilitate data deposition and sharing in public domain databases. The software is freely available from

    ChEBI in 2016: Improved services and an expanding collection of metabolites

    ChEBI is a database and ontology containing information about chemical entities of biological interest. It currently includes over 46 000 entries, each of which is classified within the ontology and assigned multiple annotations including (where relevant) a chemical structure, database cross-references, synonyms and literature citations. All content is freely available and can be accessed online a