XML-based approaches for the integration of heterogeneous bio-molecular data

Author: Berlanga-Llavori Rafael
Jiménez-Ruiz Ernesto
Manset David
Mesiti Marco
Perlasca Paolo
Sanz Ismael
Valentini Giorgio
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. Results: In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion: XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, which makes the effective integration of bioinformatics data schemes difficult. The adoption of a few semantic-rich standard formats is urgent to achieve a seamless integration of the current biological resources.

CiteSeerX

City Research Online

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

Repositori Institucional de la Universitat Jaume I

Oxford University Research Archive

Data integration for XML based on semantic knowledge

Author: Ahmad Kamsuriah
Ibrahim Hamidah
Mamat Ali
Mohd Noah Shahrul Azman
Publication date: 14/02/2004

Reconciling knowledge from multiple heterogeneous data sources has been a major focus of database research for more than a decade. As a standard for exchanging business data on the WWW, XML should be able to express data and the semantics among them. Most application data are stored in relational databases, owing to their popularity and the rich development experience with them. Therefore, how to provide a proper mapping from the relational model to the XML model has become a major research problem in current information exchange, sharing and integration. The models need to be integrated while maintaining the semantic knowledge among the data. The aim of this paper is to provide an overview of XML-based data integration built on semantic knowledge. At the end of the paper, we review some methodologies from the existing literature.
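
As a rough illustration of the relational-to-XML mapping problem this abstract mentions, the following minimal Python sketch maps rows of one table to an XML tree. It is not the method of the paper: the table name, column names, sample rows and the `rows_to_xml` helper are all hypothetical, and the convention chosen here (key column as an attribute, NULL as an absent element) is only one of several possible mappings.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(table_name, columns, rows, key_column):
    """Map relational rows to XML: one element per row, the key column
    as an attribute, and every other non-NULL column as a child element."""
    root = ET.Element(table_name + "s")
    for row in rows:
        record = dict(zip(columns, row))
        elem = ET.SubElement(root, table_name, {key_column: str(record[key_column])})
        for col in columns:
            if col == key_column or record[col] is None:
                continue  # assumption: a NULL value maps to an absent element
            ET.SubElement(elem, col).text = str(record[col])
    return root

# Hypothetical table contents, used only for illustration.
columns = ["id", "name", "dept"]
rows = [(1, "Ada", "R&D"), (2, "Lin", None)]
xml_text = ET.tostring(rows_to_xml("employee", columns, rows, "id"), encoding="unicode")
print(xml_text)
```

A mapping like this preserves structure but not semantics such as foreign keys or constraints, which is the gap that semantics-preserving approaches aim to close.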

UUM Repository

A semantic framework for web-based accommodation information integration

Author: Yang K
Publication date: 01/01/2012

University of Technology, Sydney. Faculty of Engineering and Information Technology. With the tremendous growth of the Web, a broad spectrum of accommodation information is to be found on the Internet. In order to adequately support information users in collecting and sharing information online, it is important to create an effective information integration solution, and to provide integrated access to the vast numbers of online information sources. In addition to the problem of distributed information sources, information users also need to cope with the heterogeneous nature of the online information sources, where individual information sources are stored and presented following their own structures and formats. In this thesis, we explore some of the challenges in the field of information integration, and propose solutions to some of them. We focus on the use of ontologies for integrating heterogeneous, structured and semi-structured information sources, where instance-level data are stored and presented according to meta-data-level schemas. In particular, we look at XML-based data that is stored according to XML schemas. As a first step towards a large-scale information integration solution, we propose a semantic integration framework. The proposed framework addresses the problem of information integration on three levels: the data level, the process level and the architecture level. On the data level, we use an ontology as a mediator to enable semantic interoperability among heterogeneous data sources. On the process level, we alter the process of information integration and propose a three-step integration process named the publish-combine-use mechanism. The primary goal is to distribute the effort of collecting and integrating information sources among various types of end users. In the proposed approach, information providers have more control over their own data sources, as data sources are able to join and leave the information sharing network according to their own preferences. On the architecture level, we combine the flexibility offered by the emerging distributed P2P approach with the query processing capability provided by the centralized approach. The joint architecture is similar to the structure of the online accommodation industry. This thesis also demonstrates the practical applicability of the proposed semantic integration framework by implementing a prototype system. The prototype system, named the "accommodation hub", is specifically developed for integrating online accommodation information in the large, distributed, heterogeneous online environment. The proposed semantic integration solution and the implemented prototype system are evaluated to provide a measure of the system performance and usage. Results show that the proposed solution delivers better performance with respect to some of the evaluation criteria than some related approaches in information integration.

OPUS - University of Technology Sydney

Consistency and modularity in mediated service-based data integration solutions

Author: Pahl Claus
Zhu Yaoling
Publication venue: IGI Global
Publication date: 31/01/2009

Irish Universities

DCU Online Research Access Service

Grid-based semantic integration of heterogeneous data resources: Implementation on a HealthGrid

Author: Naseer Aisha
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/2007

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. The semantic integration of geographically distributed and heterogeneous data resources still remains a key challenge in Grid infrastructures. Today's mainstream Grid technologies hold the promise to meet this challenge in a systematic manner, making data applications more scalable and manageable. The thesis conducts a thorough investigation of the problem, the state of the art, and the related technologies, and proposes an Architecture for Semantic Integration of Data Sources (ASIDS) addressing the semantic heterogeneity issue. It defines a simple mechanism for the interoperability of heterogeneous data sources in order to extract or discover information regardless of their different semantics. The constituent technologies of this architecture include Globus Toolkit (GT4) and OGSA-DAI (Open Grid Services Architecture Data Access and Integration) alongside other web services technologies such as XML (Extensible Markup Language). To show this, the ASIDS architecture was implemented and tested in a realistic setting by building an exemplar application prototype on a HealthGrid (pilot implementation). The study followed an empirical research methodology and was informed by extensive literature surveys and a critical analysis of the relevant technologies and their synergies. The two literature reviews, together with the analysis of the technology background, have provided a good overview of the current Grid and HealthGrid landscape, produced some valuable taxonomies, explored new paths by integrating technologies, and, more importantly, illuminated the problem and guided the research process towards a promising solution. Yet the primary contribution of this research is an approach that uses contemporary Grid technologies for integrating heterogeneous data resources that have semantically different data fields (attributes). It has been practically demonstrated (using a prototype HealthGrid) that discovery in semantically integrated distributed data sources can be feasible by using mainstream Grid technologies, which have been shown to have some significant advantages over non-Grid based approaches.

Brunel University Research Archive

Automated syntactic mediation for Web service integration

Author: Moreau Luc
Payne Terry R
Szomszor Martin
Publication date: 01/01/2006

As the Web Services and Grid community adopt Semantic Web technology, we observe a shift towards higher-level workflow composition and service discovery practices. While this provides excellent functionality to non-expert users, more sophisticated middleware is required to hide the details of service invocation and service integration. An investigation of a common Bioinformatics use case reveals that the execution of high-level workflow designs requires additional processing to harmonise syntactically incompatible service interfaces. In this paper, we present an architecture to support the automatic reconciliation of data formats in such Web Service workflows. The mediation of data is driven by ontologies that encapsulate the information contained in heterogeneous data structures, supplying a common, conceptual data representation. Data conversion is carried out by a Configurable Mediator component, consuming mappings between XML schemas and OWL ontologies. We describe our system and give examples of our mapping language against the background of a Bioinformatics use case.
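
The idea of mediating through a common conceptual representation can be sketched in a few lines of Python. This is only a simplified illustration, not the Configurable Mediator or the mapping language described in the paper: plain dictionaries stand in for XML documents and for the ontology, and every field name, mapping table and function below is hypothetical.

```python
# Hypothetical mappings between service-specific field names and shared
# concept properties; a real system would map XML schemas to OWL ontologies.
SOURCE_TO_CONCEPT = {"acc": "accession", "seq": "sequence"}
TARGET_FROM_CONCEPT = {"accession": "AccessionId", "sequence": "SequenceData"}

def lift(record, mapping):
    """Translate a source-format record into the shared conceptual form."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}

def lower(concept, mapping):
    """Translate the shared conceptual form into the target format."""
    return {mapping[key]: value for key, value in concept.items() if key in mapping}

def mediate(record):
    """Convert source format to target format via the conceptual form."""
    return lower(lift(record, SOURCE_TO_CONCEPT), TARGET_FROM_CONCEPT)

# Placeholder input values; unmapped fields are dropped by this sketch.
result = mediate({"acc": "X00001", "seq": "ACGT", "note": "unmapped"})
print(result)
```

Routing every conversion through the shared form means each format needs only one mapping to the concepts, rather than one converter per pair of formats.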

Southampton (e-Prints Soton)

XML for Domain Viewpoints

Author: McClatchey R.
Stok P. v/d
van Lingen F.
Willers I.
Publication date: 30/07/2001

Within research institutions like CERN (European Organization for Nuclear Research) there are often disparate databases (different in format, type and structure) that users need to access in a domain-specific manner. Users may want to access a simple unit of information without having to understand the details of the underlying schema, or they may want to access the same information from several different sources. It is neither desirable nor feasible to require users to have knowledge of these schemas. Instead, it would be advantageous if a user could query these sources using his or her own domain models and abstractions of the data. This paper describes the basis of an XML (eXtensible Markup Language) framework that provides this functionality and is currently being developed at CERN. The goal of the first prototype was to explore the possibilities of XML for data integration and model management. It shows how XML can be used to integrate data sources. The framework is applicable not only to CERN data sources but to other environments too.
Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference on Systemics & Informatics, Florid

arXiv.org e-Print Archive

CERN Document Server

Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.

Author: Bichutskiy Vadim Y
Brachmann Rainer K
Colman Richard
Lathrop Richard H
Publication venue: eScholarship, University of California
Publication date: 01/01/2006

Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

Directory of Open Access Journals

eScholarship - University of California