5 research outputs found

    Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.

    Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

    Integration of prostate cancer clinical data using an ontology

    It is increasingly important for investigators to efficiently and effectively access, interpret, and analyze data from diverse biological, literature, and annotation sources in a unified way. The heterogeneity of biomedical data and the lack of metadata are the primary sources of difficulty in integration, presenting major challenges to effective search and retrieval of information. As a proof of concept, the Prostate Cancer Ontology (PCO) was created for the development of the Prostate Cancer Information System (PCIS). PCIS is applied to demonstrate how the ontology is used to solve the semantic heterogeneity problem arising from the integration of two prostate cancer related database systems at the Fox Chase Cancer Center. As a result of the integration process, the semantic query language SPARQL is applied to perform integrated queries across the two database systems based on PCO.

    Dynamic integration of biological data sources using the data concierge

    Integration of biological data: systems, infrastructures and programmable tools

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 19-05-200

    Architecture of a mediator for a bioinformatics database federation

    Developments in our ability to integrate and analyze data held in existing heterogeneous data resources can lead to an increase in our understanding of biological function at all levels. However, supporting ad hoc queries across multiple data resources and correlating data retrieved from these is still difficult. To address this, we are building a mediator based on the functional data model database, P/FDM, which integrates access to heterogeneous distributed biological databases. Our architecture makes use of the existing search capabilities and indexes of the underlying databases, without infringing on their autonomy. Central to our design philosophy is the use of schemas. We have adopted a federated architecture with a five-level schema, arising from the use of the ANSI-SPARC three-level schema to describe both the existing autonomous data resources and the mediator itself. We describe the use of mapping functions and list comprehensions in query splitting, producing execution plans, code generation, and result fusion. We give an example of cross-database querying involving data held locally in P/FDM systems and external data in SRS.