
    Data access and integration in the ISPIDER proteomics grid

    Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. Proteomics resources are one scenario with such potential: they are being developed rapidly as affordable, reliable methods for studying the proteome emerge. The protein identifications arising from these methods derive from multiple repositories, which need to be integrated to enable uniform access to them. A number of technologies exist that enable these resources to be accessed in a Grid environment, but because the resources were developed independently, significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture that supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture to the integration of several autonomous proteomics data resources.

    Bioinformatics service reconciliation by heterogeneous schema transformation

    This paper focuses on the problem of reconciling bioinformatics services in a generic and scalable manner, so as to enhance interoperability in a rapidly evolving field. Using XML as a common representation format, but also supporting existing flat-file representation formats, we propose an approach for the scalable, semi-automatic reconciliation of services, possibly invoked from within a scientific workflow tool. Service reconciliation may use the AutoMed heterogeneous data integration system as an intermediary service, or may use AutoMed to produce services that mediate between services. We discuss the application of our approach to the reconciliation of services in an example bioinformatics workflow. The main contribution of this research is an architecture for the scalable reconciliation of bioinformatics services.
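To make the idea of reconciling a flat-file service with an XML-consuming service concrete, the following is a minimal sketch only. It does not use AutoMed or any workflow tool; the field codes, the element names, the `FIELD_MAP` table and the `flat_to_xml` helper are all hypothetical and stand in for a transformation that the paper's approach would derive semi-automatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the producer's flat-file field codes to the
# element names the consumer service is assumed to expect.
FIELD_MAP = {"ID": "identifier", "DE": "description", "OS": "organism"}


def flat_to_xml(record: str) -> ET.Element:
    """Translate one flat-file record into the consumer's XML shape."""
    root = ET.Element("entry")
    for line in record.strip().splitlines():
        # Each line is "<code><spaces><value>"; split on the first space.
        code, _, value = line.partition(" ")
        target = FIELD_MAP.get(code)
        if target is None:
            continue  # no counterpart in the consumer's schema, so skip it
        ET.SubElement(root, target).text = value.strip()
    return root


# Made-up record, for illustration only.
record = """
ID   X00001
DE   Example protein
OS   Homo sapiens
XX   unmapped field
"""

xml_text = ET.tostring(flat_to_xml(record), encoding="unicode")
print(xml_text)
```

A hand-written mapping table like this is exactly what does not scale as formats evolve, which is the motivation the abstract gives for generating such mediating transformations rather than coding them per service pair.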

    Federation views as a basis for querying and updating database federations

    This paper addresses the problem of how to query and update database federations. A database federation tightly couples a collection of heterogeneous component databases into a global integrated system. We tackle the problem of querying and updating a database federation by describing a logical architecture and a general semantic framework for the precise specification of such federations, with the aim of providing a basis for implementing a federation by means of relational database views. Our approach to database federations is based on the UML/OCL data model, and aims at integrating the underlying database schemas of the component legacy systems into a separate, newly defined integrated database schema. One of the central notions in database modelling and in constraint specification is the database view, which closely corresponds to the notion of a derived class in UML. We employ OCL (version 2.0) and the notion of a derived class as a means to treat (inter-)database constraints and database views in a federated context. Our approach to coupling component databases into a global, integrated system is based on mediation. The first objective of our paper is to demonstrate that our particular mediating system integrates component schemas without loss of constraint information. The second objective is to show that the concept of a relational database view provides a sound basis for the actual implementation of database federations, for both querying and updating purposes.
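The role of a relational view as the integrated schema can be illustrated with a small sketch. This is not the paper's UML/OCL framework: it assumes SQLite, simulates the two component databases as two tables in one in-memory database, and uses made-up table, column and trigger names. The view answers queries over both components, and `INSTEAD OF` triggers route inserts back to the owning component, where its own constraints still apply.

```python
import sqlite3


def build_federation() -> sqlite3.Connection:
    """Create two stand-in component schemas and one integrated view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Two tables stand in for two legacy component databases.
        CREATE TABLE hr_staff (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
        CREATE TABLE payroll_emp (emp_no INTEGER PRIMARY KEY, name TEXT NOT NULL);

        INSERT INTO hr_staff VALUES (1, 'Ada');
        INSERT INTO payroll_emp VALUES (7, 'Grace');

        -- Integrated schema: one view derived from both components.
        CREATE VIEW person AS
            SELECT 'hr' AS source, id AS person_id, full_name AS name
              FROM hr_staff
            UNION ALL
            SELECT 'payroll' AS source, emp_no AS person_id, name
              FROM payroll_emp;

        -- Inserts through the view are routed to the owning component.
        CREATE TRIGGER person_insert_hr INSTEAD OF INSERT ON person
        WHEN NEW.source = 'hr'
        BEGIN
            INSERT INTO hr_staff (id, full_name)
            VALUES (NEW.person_id, NEW.name);
        END;

        CREATE TRIGGER person_insert_payroll INSTEAD OF INSERT ON person
        WHEN NEW.source = 'payroll'
        BEGIN
            INSERT INTO payroll_emp (emp_no, name)
            VALUES (NEW.person_id, NEW.name);
        END;
        """
    )
    return conn


conn = build_federation()

# Update through the federation view; the trigger writes to hr_staff.
conn.execute(
    "INSERT INTO person (source, person_id, name) VALUES ('hr', 2, 'Edsger')"
)

# Query through the federation view.
rows = conn.execute(
    "SELECT source, person_id, name FROM person ORDER BY source, person_id"
).fetchall()
print(rows)

# The component's NOT NULL constraint is still enforced via the view.
try:
    conn.execute(
        "INSERT INTO person (source, person_id, name) VALUES ('hr', 3, NULL)"
    )
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```

Keeping the constraints on the component tables and routing writes through triggers is one simple way to see why a view-based federation need not lose constraint information, although the paper's claim concerns its own mediating system, not this sketch.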