7 research outputs found

    Grid-based semantic integration of heterogeneous data resources : implementation on a HealthGrid

    The semantic integration of geographically distributed and heterogeneous data resources remains a key challenge in Grid infrastructures. Today's mainstream Grid technologies hold the promise to meet this challenge in a systematic manner, making data applications more scalable and manageable. The thesis conducts a thorough investigation of the problem, the state of the art, and the related technologies, and proposes an Architecture for Semantic Integration of Data Sources (ASIDS) addressing the semantic heterogeneity issue. It defines a simple mechanism for the interoperability of heterogeneous data sources in order to extract or discover information regardless of their different semantics. The constituent technologies of this architecture include Globus Toolkit (GT4) and OGSA-DAI (Open Grid Services Architecture Data Access and Integration) alongside other web services technologies such as XML (Extensible Markup Language). To show this, the ASIDS architecture was implemented and tested in a realistic setting by building an exemplar application prototype on a HealthGrid (pilot implementation). The study followed an empirical research methodology and was informed by extensive literature surveys and a critical analysis of the relevant technologies and their synergies. The two literature reviews, together with the analysis of the technology background, have provided a good overview of the current Grid and HealthGrid landscape, produced some valuable taxonomies, explored new paths by integrating technologies, and more importantly illuminated the problem and guided the research process towards a promising solution. Yet the primary contribution of this research is an approach that uses contemporary Grid technologies for integrating heterogeneous data resources that have semantically different data fields (attributes). It has been practically demonstrated (using a prototype HealthGrid) that discovery in semantically integrated distributed data sources can be feasible by using mainstream Grid technologies, which have been shown to have some significant advantages over non-Grid based approaches.
    EThOS - Electronic Theses Online Service (GB, United Kingdom)

    Workflow Systems for Science: Concepts and Tools

    Bioinformatics Data Access Service in the ProGenGrid System

    Current bioinformatics workflows require the collection of results coming from different tools on several Web sites. High-throughput services integrated through Web Services allow researchers to access a virtual organization by providing large computational and storage resources. There are considerable costs associated with running a high-throughput application, including hardware, storage, maintenance, and bandwidth. Moreover, such tools often use biological data banks that are heterogeneous in format and semantics, so the task of enabling their composition and cooperation is even more difficult. Researchers are now taking advantage of economies of scale by building large shared systems for bioinformatics processing. Integrating Computational Grids and Web Services technologies can be a key solution to simplify interaction between bioinformatics tools and biological databases. This paper presents a data access service for retrieving and transferring input data coming from heterogeneous data banks to high-throughput applications, wrapped as Web Services.

    A Services Oriented System for Bioinformatics Applications on the Grid

    This paper describes the evolution of the main services of the ProGenGrid (Proteomics & Genomics Grid) system, a distributed and ubiquitous grid environment (“virtual laboratory”), based on Workflow and supporting the design, execution and monitoring of “in silico” experiments in bioinformatics. ProGenGrid is a Grid-based Problem Solving Environment that allows the composition of data sources and bioinformatics programs wrapped as Web Services (WS). The use of WS provides ease of use and fosters re-use. The resulting workflow of WS is then scheduled on the Grid, leveraging Grid-middleware services. In particular, ProGenGrid offers a modular bag of services and currently is focused on the biological simulation of two important bioinformatics problems: prediction of the secondary structure of proteins, and sequence alignment of proteins. Both services are based on an enhanced data access service.

    A Semantic Grid-based Data Access and Integration Service for Bioinformatics

    Given the heterogeneous nature of biological data and their intensive use in many tools, in this paper we propose a semantic data access and integration (DAI) service, based on the grid paradigm, for the bioinformatics domain. This service uses ontologies for correlating different data sets. The DAI proposed in this work is a fundamental component of the ProGenGrid system, a grid-enabled platform, which aims at the design and implementation of a virtual laboratory where e-scientists could simulate complex "in silico" experiments, composing some popular analysis and visualization tools (e.g. Blast and Rasmol), available as Web services, into a workflow. The main goal of the DAI is to provide bioinformatics tools with advanced functionalities and data integration services for heterogeneous biological data banks, such as PDB and Swiss-Prot. A case study of our specialized data access service for locating similar protein sequences is presented.

    A Semantic Grid-based Data Access and Integration Service for Bioinformatics

    Given the heterogeneous nature of biological data and their intensive use in many tools, in this paper we propose a semantic data access and integration (DAI) service, based on the Grid paradigm, for the bioinformatics domain. This service uses ontologies for correlating different data sets. The DAI proposed in this work is a fundamental component of the ProGenGrid system, a grid-enabled platform, which aims at the design and implementation of a virtual laboratory where e-scientists could simulate complex “in silico” experiments, composing some popular analysis and visualization tools (e.g. Blast and Rasmol), available as Web Services, into a workflow. The main goal of the DAI is to provide bioinformatics tools with advanced functionalities and data integration services for heterogeneous biological data banks, such as PDB and Swiss-Prot. A case study of our specialized data access service for locating similar protein sequences is presented.