
    A computational framework for complex disease stratification from multiple large-scale datasets.

    BACKGROUND: Multilevel data integration is becoming a major area of research in systems biology. Within this area, multi-'omics datasets on complex diseases are becoming more readily available and there is a need to set standards and good practices for integrated analysis of biological, clinical and environmental data. We present a framework to plan and generate single and multi-'omics signatures of disease states. METHODS: The framework is divided into four major steps: dataset subsetting, feature filtering, 'omics-based clustering and biomarker identification. RESULTS: We illustrate the usefulness of this framework by identifying potential patient clusters based on integrated multi-'omics signatures in a publicly available ovarian cystadenocarcinoma dataset. The analysis generated a higher number of stable and clinically relevant clusters than previously reported, and enabled the generation of predictive models of patient outcomes. CONCLUSIONS: This framework will help health researchers plan and perform multi-'omics big data analyses to generate hypotheses and make sense of their rich, diverse and ever-growing datasets, to enable implementation of translational P4 medicine.

    Biomolecular annotation integration and querying to help unveiling new biomedical knowledge

    Targeting biological questions requires comprehensive evaluation of multiple types of annotations describing current biological knowledge; these are increasingly available, but their fast evolution, heterogeneity and dispersion across many different sources hamper their effective use. Leveraging an innovative, flexible data schema and automatic software procedures that support the integration of data sources evolving in number, data content and structure, while assuring quality and provenance tracking of the integrated data, we created a multi-organism Genomic and Proteomic Knowledge Base (GPKB) and kept it up to date with little effort. From several well-known databases it imports and integrates a large volume of gene and protein data, external references and annotations, expressed through multiple biomedical terminologies. To make these integrated data easy to query, we developed intuitive web interfaces and services for programmatic access to the GPKB; they are publicly available at http://www.bioinformatics.deib.polimi.it/GPKB/ and http://www.bioinformatics.deib.polimi.it/GPKB-REST/, respectively. The GPKB is a valuable resource used in several projects by many users; the developed interfaces enhance its relevance to the community by allowing the seamless composition of queries, even complex ones, on all data integrated in the GPKB, which can help unveil new biomedical knowledge.
