484 research outputs found
A Web-based mapping technique for establishing metadata interoperability
The integration of metadata from distinct, heterogeneous data sources requires metadata interoperability, which is a qualitative property of metadata information objects that is not given by default. The technique of metadata mapping allows domain experts to establish metadata interoperability in a certain integration scenario. Mapping solutions, as a technical manifestation of this technique, are already available for the intensively studied domain of database system interoperability, but they rarely exist for the Web.
If we consider the steadily increasing amount of structured metadata and corresponding metadata schemes on the Web, we can observe a clear need for a mapping solution that can operate in a Web-based environment.
To achieve that, we first need to build its technical core, which is a mapping model that provides the language primitives to define mapping relationships. Existing Semantic Web languages such as RDFS and OWL define some basic mapping elements (e.g., owl:equivalentProperty, owl:sameAs), but do not address the full
spectrum of semantic and structural heterogeneities that can occur among distinct, incompatible metadata information objects. Furthermore, it is still unclear how to process defined mapping relationships during run-time in order to deliver metadata to the client in a uniform way.
As the main contribution of this thesis, we present an abstract mapping model, which reflects the mapping problem on a generic level and provides the means for reconciling incompatible metadata. Instance
transformation functions and URIs take a central role in that model. The former cover a broad spectrum of possible structural and semantic heterogeneities, while the latter bind the complete mapping model to the architecture of the World Wide Web. On the concrete, language-specific level we present a binding of the abstract mapping model for the RDF Vocabulary Description Language (RDFS), which allows us to create mapping specifications among incompatible metadata schemes expressed in RDFS.
The mapping model is embedded in a cyclic process that categorises the requirements a mapping solution should fulfil into four subsequent phases: mapping discovery, mapping representation, mapping execution, and mapping maintenance. In this thesis, we mainly focus on mapping representation and on the transformation of mapping specifications into executable SPARQL queries. For mapping discovery support, the model provides an interface for plugging in schema and ontology matching algorithms. For mapping maintenance we introduce the concept of a simple, but effective mapping registry.
Based on the mapping model, we propose a Web-based mediator-wrapper architecture that allows domain experts to set up mediation endpoints that provide a uniform SPARQL query interface to a set of distributed metadata sources. The involved data sources are encapsulated by wrapper components that expose the contained
metadata and the schema definitions on the Web and provide a SPARQL query interface to these metadata. In this thesis, we present the OAI2LOD Server, a wrapper component for integrating metadata that are accessible via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
In a case study, we demonstrate how mappings can be created in a Web environment and how our mediator-wrapper architecture can easily be configured to integrate metadata from various heterogeneous data sources, without the need to install any mapping or metadata integration solution in a local system environment.
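As a rough illustration of the representation-to-execution step described above, the following minimal Python sketch (assuming the rdflib library; the namespaces, the sample record and the single property mapping are hypothetical) renders one mapping relationship as a SPARQL CONSTRUCT query and applies it to a source graph. The thesis's abstract mapping model and its RDFS binding cover far more cases than this single-property example.

    # Minimal sketch, assuming rdflib. The namespaces, record and mapping
    # below are hypothetical, for illustration only.
    from rdflib import Graph, Literal, Namespace

    SRC = Namespace("http://example.org/source#")   # hypothetical source schema
    DC = Namespace("http://purl.org/dc/elements/1.1/")

    source = Graph()
    source.add((SRC.rec1, SRC.authorName, Literal("Doe, Jane")))

    # One mapping relationship (src:authorName -> dc:creator) rendered as an
    # executable SPARQL CONSTRUCT query; an instance transformation function
    # would additionally rewrite ?v (e.g. reorder "Last, First").
    mapping_query = """
    PREFIX src: <http://example.org/source#>
    PREFIX dc:  <http://purl.org/dc/elements/1.1/>
    CONSTRUCT { ?s dc:creator ?v }
    WHERE     { ?s src:authorName ?v }
    """

    target = Graph()
    for triple in source.query(mapping_query):
        target.add(triple)
    print(target.serialize(format="turtle"))

A CONSTRUCT query is a natural execution target for such mappings because its template directly materialises triples in the target schema.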
3rd EGEE User Forum
We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of its contents and a summary of the main conclusions coming from the Forum on the chapter topic. The first chapter gathers all the plenary session keynote addresses, followed by a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.
Data Analysis in Automotive Industrial Cells
The manufacturing industry has always been one of the leading energy consumers, so companies in this area continually seek the best tools that technological progress provides to analyse and lower production costs. Many published studies, however, accept disruptive measures such as halting factory production to perform experiments or making deep architectural changes to the transport system.
The proposed solution offers two different sets of tools: a device adapter, which gathers and stores data from the devices of industrial robotic cells and is the main prerequisite for a data analysis application, and a data analysis system, which analyses the stored data without changing the existing production model. The analysis targets the energy usage of a cell and its robot, and the duration of the executed processes.
This solution was tested in two different robotic cells that execute the same process. Multiple executions with different robot velocities were performed to gather the required data for analysis. The conclusion was that, for both cells, the energy usage per executed product was lower when the robot speed was higher, and that one of the cells is more efficient than the other when executing at high speed but less efficient at lower velocities.
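The per-product comparison described above reduces to dividing the measured energy by the number of products for each cell and speed combination. A minimal Python sketch of that aggregation follows; all numbers and field names are invented for illustration.

    # Minimal sketch of the per-product energy comparison; the runs,
    # figures and units below are hypothetical.
    from collections import defaultdict

    # One row per run: (cell, robot speed, energy used in kWh, products made)
    runs = [
        ("cell_A", "high", 12.0, 40),
        ("cell_A", "low", 10.5, 25),
        ("cell_B", "high", 11.0, 40),
        ("cell_B", "low", 10.0, 28),
    ]

    totals = defaultdict(lambda: [0.0, 0])
    for cell, speed, energy, products in runs:
        totals[(cell, speed)][0] += energy
        totals[(cell, speed)][1] += products

    for (cell, speed), (energy, products) in sorted(totals.items()):
        print(f"{cell} at {speed} speed: {energy / products:.3f} kWh/product")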
A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases
Data stored in individual autonomous databases often needs to be combined and interrelated. For example, in the Inland Water (IW) environment monitoring domain, the spatial and temporal variation of measurements of different water quality indicators stored in different databases is of interest. Data from multiple data sources is more complex to combine when there is a lack of metadata in a computable form and when the syntax and semantics of the stored data models are heterogeneous. The main types of information retrieval (IR) requirements are query transparency and data harmonisation for data interoperability, and support for multiple user views. A combined Semantic Web-based and agent-based distributed system framework has been developed to support these IR requirements. It has been implemented using the Jena ontology and JADE agent toolkits. The semantic part supports the interoperability of autonomous data sources by merging their intensional data, using a Global-As-View (GAV) approach, into a global semantic model, represented in DAML+OIL and in OWL. This is used to mediate between different local database views. The agent part provides the semantic services to import, align and parse semantic metadata instances, to support data mediation and to reason about data mappings during alignment. The framework has been applied to support information retrieval, interoperability and multi-lateral viewpoints for four European environmental agency databases.
An extended GAV approach has been developed and applied to handle queries that can be reformulated over multiple user views of the stored data. This allows users to retrieve data in a conceptualisation that is better suited to them, rather than having to understand the entire detailed global view conceptualisation. User viewpoints are derived from the global ontology or from existing viewpoints of it. This has the advantage of reducing the number of potential conceptualisations and their associated mappings to a more computationally manageable level. Whereas an ad hoc framework based upon a conventional distributed programming language and a rule framework could be used to support user views and adaptation to them, a more formal framework has the benefit that it can support reasoning about consistency, equivalence, containment and conflict resolution when traversing data models. A preliminary formulation of the formal model has been undertaken; it is based upon extending a Datalog-type algebra with hierarchical, attribute and instance value operators. These operators can be applied to support compositional mapping and consistency checking of data views. The multiple viewpoint system was implemented as a Java-based application consisting of two sub-systems, one for viewpoint adaptation and management, the other for query processing and query result adjustment.
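The GAV unfolding at the heart of the mediation step can be illustrated with a toy example: each global predicate is defined as a union of queries over the sources, and a query against the global schema is answered by substituting those definitions. The Python sketch below is a deliberately simplified, hypothetical relational rendering; the thesis itself mediates ontology views with Jena and JADE rather than SQL strings.

    # Toy Global-As-View (GAV) unfolding. All schema and database names
    # are hypothetical, for illustration only.
    GAV_VIEWS = {
        "Measurement": [
            "SELECT site, indicator, value FROM db1.samples",
            "SELECT station AS site, param AS indicator, val AS value "
            "FROM db2.readings",
        ],
    }

    def unfold(global_predicate: str) -> list[str]:
        """Rewrite one global predicate into the union of its source queries."""
        return GAV_VIEWS.get(global_predicate, [])

    # A query against the global view 'Measurement' becomes a union of
    # per-source queries, executed at the wrapped databases.
    for source_query in unfold("Measurement"):
        print(source_query)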
A distributed analysis and monitoring framework for the Compact Muon Solenoid experiment and a pedestrian simulation
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The design of a parallel and distributed computing system is a very complicated task. It requires a detailed understanding of the design issues and of the theoretical and practical aspects of their solutions. Firstly, this thesis discusses in detail the major concepts and components required to make parallel and distributed computing a reality. A multithreaded and distributed framework capable of analysing the simulation data produced by a pedestrian simulation software package was developed. Secondly, this thesis discusses the origins and fundamentals of Grid computing and the motivations for its use in High Energy Physics. Access to the data produced by the Large Hadron Collider (LHC) has to be provided for more than five thousand scientists all over the world. Users who run analysis jobs on the Grid do not necessarily have expertise in Grid computing. Simple, user-friendly and reliable monitoring of analysis jobs is one of the key components of distributed analysis operations; reliable monitoring is one of the crucial components of the Worldwide LHC Computing Grid for providing the functionality and performance that is required by the LHC experiments. The CMS Dashboard Task Monitoring and the CMS Dashboard Job Summary monitoring applications were developed to serve the needs of the CMS community.
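A minimal sketch of the multithreaded analysis pattern the first part of the thesis describes: simulation output is split into chunks that worker threads analyse in parallel. The file name, record prefix and per-chunk statistic below are invented for illustration.

    # Hypothetical multithreaded analysis of pedestrian simulation output.
    from concurrent.futures import ThreadPoolExecutor

    def analyse_chunk(lines: list[str]) -> int:
        # Stand-in analysis: count pedestrian records in this chunk.
        return sum(1 for line in lines if line.startswith("PED"))

    def parallel_analysis(path: str, workers: int = 4) -> int:
        with open(path) as f:
            lines = f.readlines()
        size = max(1, len(lines) // workers)
        chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(analyse_chunk, chunks))

    # print(parallel_analysis("pedestrian_simulation.log"))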
LHCb distributed data analysis on the computing grid
LHCb is one of the four Large Hadron Collider (LHC) experiments based at CERN, the European Organisation for Nuclear Research. The LHC experiments will start taking an unprecedented amount of data when they come online in 2007. Since no single institute has the compute resources to handle this data, resources must be pooled to form the Grid. Where the Internet has made it possible to share information stored on computers across the world, Grid computing aims to provide access to computing power and storage capacity on geographically distributed systems. LHCb software applications must work seamlessly on the Grid, allowing users to efficiently access distributed compute resources. It is essential to the success of the LHCb experiment that physicists can access data from the detector, stored in many heterogeneous systems, to perform distributed data analysis. This thesis describes the work performed to enable distributed data analysis for the LHCb experiment on the LHC Computing Grid.
A framework for bioprofile analysis over grid
An important trend in modern medicine is towards individualisation of healthcare to tailor
care to the needs of the individual. This makes it possible, for example, to personalise
diagnosis and treatment to improve outcome. However, the benefits of this can only be fully
realised if healthcare and ICT resources are exploited (e.g. to provide access to relevant data,
analysis algorithms, knowledge and expertise). Potentially, grid can play an important role
in this by allowing sharing of resources and expertise to improve the quality of care. The
integration of grid and the new concept of bioprofile represents a new topic in the healthgrid
for individualisation of healthcare.
A bioprofile represents a personal dynamic "fingerprint" that fuses together a person's
current and past bio-history, biopatterns and prognosis. It combines not just data, but also
analysis and predictions of future or likely susceptibility to disease, such as brain diseases
and cancer. The creation and use of bioprofiles require the support of a number of healthcare and ICT technologies and techniques, such as medical imaging, electrophysiology and related facilities, analysis tools, data storage and computation clusters. The need to share clinical data, storage and computation resources between different bioprofile centres creates not only local problems but also global problems.
Existing ICT technologies are inappropriate for bioprofiling because of the difficulties in the use and management of heterogeneous IT resources at different bioprofile centres. Grid, as an emerging resource-sharing concept, fulfils the needs of bioprofiling in several aspects, including discovery, access, monitoring and allocation of distributed bioprofile databases, computation resources, bioprofile knowledge bases, etc. However, the challenge remains of how to integrate grid and bioprofile technologies to offer an advanced distributed bioprofile environment supporting individualised healthcare.
The aim of this project is to develop a framework for one of the key meta-level bioprofile
applications: bioprofile analysis over grid to support individualised healthcare. Bioprofile
analysis is a critical part of bioprofiling (i.e. the creation, use and update of bioprofiles).
Analysis makes it possible, for example, to extract markers from data for diagnosis and to
assess an individual's health status. The framework provides a basis for a "grid-based" solution
to the challenge of "distributed bioprofile analysis" in bioprofiling. The main contributions
of the thesis are fourfold:
A. An architecture for bioprofile analysis over grid. The design of a suitable architecture
is fundamental to the development of any ICT system. The architecture creates a
means for the categorisation, determination and organisation of core grid components to
support the development and use of grid for bioprofile analysis;
B. A service model for bioprofile analysis over grid. The service model proposes a
service design principle, a service architecture for bioprofile analysis over grid, and
a distributed EEG analysis service model. The service design principle addresses
the main service design considerations behind the service model, in the aspects of
usability, flexibility, extensibility, reusability, etc. The service architecture identifies
the main categories of services and outlines an approach in organising services to
realise certain functionalities required by distributed bioprofile analysis applications.
The EEG analysis service model demonstrates the utilisation and development of
services to enable bioprofile analysis over grid;
C. Two grid test-beds and a practical implementation of EEG analysis over grid. The two
grid test-beds, the BIOPATTERN grid and PlymGRID, are built on existing grid
middleware tools. They provide essential experimental platforms for research in
bioprofiling over grid. The work here demonstrates how resources, grid middleware
and services can be utilised, organised and implemented to support distributed EEG
analysis for the early detection of dementia (see the sketch after this list). The distributed
Electroencephalography (EEG) analysis environment can be used to support a variety of
research activities in EEG analysis;
D. A scheme for organising multiple (heterogeneous) descriptions of individual grid
entities for knowledge representation of grid. The scheme solves the compatibility
and adaptability problems in managing heterogeneous descriptions (i.e. descriptions
using different languages and schemas/ontologies) for the collaborative representation of
a grid environment at different scales. It underpins the concept of bioprofile analysis
over grid in the aspect of knowledge-based global coordination between the components
of bioprofile analysis over grid.
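As a concrete, if hypothetical, flavour of one building block in such an EEG analysis service (the sketch referenced in item C above), the following Python snippet computes relative band power, a feature family commonly used in EEG studies of dementia, over a synthetic signal. The thesis's actual analysis algorithms are not reproduced here.

    # Hypothetical building block of an EEG analysis service: relative band
    # power over a synthetic one-channel signal (numpy only).
    import numpy as np

    def band_power(signal: np.ndarray, fs: float, lo: float, hi: float) -> float:
        """Fraction of periodogram power in the [lo, hi) Hz band."""
        freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
        psd = np.abs(np.fft.rfft(signal)) ** 2
        return float(psd[(freqs >= lo) & (freqs < hi)].sum() / psd.sum())

    rng = np.random.default_rng(0)
    eeg = rng.standard_normal(2048)  # synthetic EEG channel, 8 s at 256 Hz
    print("relative alpha power (8-13 Hz):", band_power(eeg, 256.0, 8.0, 13.0))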
- …