5 research outputs found

    Grid-based semantic integration of heterogeneous data resources : implementation on a HealthGrid

    Get PDF
    The semantic integration of geographically distributed and heterogeneous data resources still remains a key challenge in Grid infrastructures. Today's mainstream Grid technologies hold the promise to meet this challenge in a systematic manner, making data applications more scalable and manageable. The thesis conducts a thorough investigation of the problem, the state of the art, and the related technologies, and proposes an Architecture for Semantic Integration of Data Sources (ASIDS) addressing the semantic heterogeneity issue. It defines a simple mechanism for the interoperability of heterogeneous data sources in order to extract or discover information regardless of their different semantics. The constituent technologies of this architecture include Globus Toolkit (GT4) and OGSA-DAI (Open Grid Service Architecture Data Integration and Access) alongside other web services technologies such as XML (Extensive Markup Language). To show this, the ASIDS architecture was implemented and tested in a realistic setting by building an exemplar application prototype on a HealthGrid (pilot implementation). The study followed an empirical research methodology and was informed by extensive literature surveys and a critical analysis of the relevant technologies and their synergies. The two literature reviews, together with the analysis of the technology background, have provided a good overview of the current Grid and HealthGrid landscape, produced some valuable taxonomies, explored new paths by integrating technologies, and more importantly illuminated the problem and guided the research process towards a promising solution. Yet the primary contribution of this research is an approach that uses contemporary Grid technologies for integrating heterogeneous data resources that have semantically different. data fields (attributes). It has been practically demonstrated (using a prototype HealthGrid) that discovery in semantically integrated distributed data sources can be feasible by using mainstream Grid technologies, which have been shown to have some Significant advantages over non-Grid based approaches.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Metadata for managing grid resources in data mining applications

    No full text
    categorization, semantic grid The Grid is an infrastructure for resource sharing and coordinated use of those resources in dynamic heterogeneous distributed environments. The effective use of a Grid requires the definition of metadata for managing the heterogeneity of involved resources that include computers, data, network facilities, and software tools provided by different organizations. Metadata management becomes a key issue when complex applications, such as dataintensive simulations and data mining applications, are executed on a Grid. This paper discusses metadata models for heterogeneous resource management in Grid-based data mining applications. In particular, it discusses how resources are represented and managed in the KNOWLEDGE GRID, a framework for Grid-enabled distributed data mining. The paper illustrates how XML-based metadata is used to describe data mining tools, data sources, mining models, and execution plans, and how metadata is used for the design and execution of distributed knowledge discovery applications on Grids. 1

    Metadata for Managing Grid Resources in Data Mining Applications

    No full text
    corecore