150 research outputs found

XML for Domain Viewpoints

Author: McClatchey R.
Stok P. v/d
van Lingen F.
Willers I.
Publication date: 30/07/2001

Within research institutions like CERN (European Organization for Nuclear Research) there are often disparate databases (different in format, type and structure) that users need to access in a domain-specific manner. Users may want to access a simple unit of information without having to understand the details of the underlying schema, or they may want to access the same information from several different sources. It is neither desirable nor feasible to require users to have knowledge of these schemas. Instead, it would be advantageous if a user could query these sources using his or her own domain models and abstractions of the data. This paper describes the basis of an XML (eXtensible Markup Language) framework that provides this functionality and is currently being developed at CERN. The goal of the first prototype was to explore the possibilities of XML for data integration and model management. It shows how XML can be used to integrate data sources. The framework is applicable not only to CERN data sources but to other environments too. Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference on Systemics & Informatics, Florid

arXiv.org e-Print Archive

IMPrECISE: Good-is-good-enough data integration

Author: Keijzer Ander de
Keulen Maurice van
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2008

IMPrECISE is an XQuery module that adds probabilistic XML functionality to an existing XML DBMS, in our case MonetDB/XQuery. We demonstrate the probabilistic XML and data integration functionality of IMPrECISE. The prototype is configurable with domain knowledge such that the amount of uncertainty arising during data integration is reduced to an acceptable level, thus obtaining a "good is good enough" data integration with minimal human effort.

CiteSeerX

University of Twente Research Information

Integration of weakly heterogeneous semistructured data

Author: Feuerlicht G
Pokorný J
Richta K
Ruttananontsatean N
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2009

While most business applications typically operate on structured data that can be effectively managed using relational databases, some applications use more complex semistructured data that lacks a stable schema. XML techniques are available for the management of semistructured data, but such techniques tend to be ineffective when applied to large amounts of heterogeneous data, in particular in applications with complex query requirements. We describe an approach that relies on the mapping of multiple semistructured data sets to object-relational structures and uses an object-relational database to support complex query requirements. As an example we use weakly heterogeneous oceanographic data. © 2009 Springer Science+Business Media, LLC

OPUS - University of Technology Sydney

On distributed data processing in data grid architecture for a virtual repository

Author: Adamus Radosław
Kaczmarek Krzysztof
Kowalski Tomasz Marek
Kuliberda Kamil
Subieta Kazimierz
Wiślicki Jacek
Publication venue: Lodz University of Technology. Press
Publication date: 01/01/2010

The article describes the problem of integrating distributed, heterogeneous and fragmented collections of data by applying the virtual repository and data grid concepts. The technology involves wrappers enveloping external resources, a virtual network (based on peer-to-peer technology) responsible for integrating data into one global schema, and a distributed index for speeding up data retrieval. The authors present a method for obtaining data from heterogeneously structured external databases and then a procedure for integrating the data into one commonly available global schema. The core of the described solution is based on the Stack-Based Query Language (SBQL) and virtual updatable SBQL views. The system's transport and indexing layer is based on the P2P architecture.

Lodz University of Technology Repository

Semantic Modelling of e-Solutions Using a View Formalism with Conceptual and Logical Extensions

Author: Chang Elizabeth
Dillon Tharam
Feng L.
Rajugan R.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

In industrial informatics, there exists a requirement to model and design views at a higher level of abstraction. Since the classical view definitions are only available at the query or instance level, modelling and maintaining such views for complex enterprise information systems (EIS) is a challenging task. Further, the introduction of semi-structured data (namely XML) and its rapid adoption by commercial and industrial systems has increased the complexity of view design and specification. To address this issue, in this paper we present: (a) a layered view model for XML, (b) a design methodology for such views and (c) some real-world industrial applications of the view model. The XML view formalism is defined at the conceptual level and the design methodology is based on XML semantic (XSemantic) nets, a high-level object-oriented (OO) modelling language for XML domains.

OPUS - University of Technology Sydney

espace@Curtin

Anatomy of a Native XML Base Management System

Author: Fiebig Thorsten
Helmer Sven
Kanne Carl-Christian
Mildenberger Julia
Moerkotte Guido
Schiele Robert
Westmann Till
Publication date: 01/01/2002

Several alternatives for managing large XML document collections exist, ranging from file systems through relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management (comprising recovery and multi-user synchronisation) and query processing for XML.

CiteSeerX

MAnnheim DOCument Server

On the performance impact of using JSON, beyond impedance mismatch

Author: Abelló Gamazo Alberto
Hewasinghage Moditha Lakshan Dharmasir
Nadal Francesch Sergi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

NoSQL database management systems adopt semi-structured data models, such as JSON, to easily accommodate schema evolution and overcome the overhead generated from transforming internal structures to tabular data (i.e., impedance mismatch). There exist multiple equivalent ways to physically represent semi-structured data, but there is a lack of evidence about their potential impact on space and query performance. In this paper, we embark on the task of quantifying that impact, specifically for document stores. We empirically compare multiple ways of representing semi-structured data, which allows us to derive a set of guidelines for efficient physical database design considering both JSON and relational options in the same palette. Partly funded by the European Commission through the programme “EM IT4BI-DC”.
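
As a rough illustration of what "multiple equivalent ways to physically represent semi-structured data" can mean, the sketch below holds the same information once as a nested document and once as flat, relational-style rows. It is not taken from the paper: the `nested` order record, the field names, and the `flatten`/`unflatten` helpers are all hypothetical, and the byte length of the JSON serialization is only a crude stand-in for the space measurements the abstract refers to.

```python
import json

# Hypothetical order record, nested the way a document store might hold it.
nested = {
    "order_id": 1,
    "customer": {"name": "Ada", "city": "Lleida"},
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}


def flatten(doc):
    """Return one relational-style row per item, repeating the parent fields."""
    return [
        {
            "order_id": doc["order_id"],
            "customer_name": doc["customer"]["name"],
            "customer_city": doc["customer"]["city"],
            "sku": item["sku"],
            "qty": item["qty"],
        }
        for item in doc["items"]
    ]


def unflatten(rows):
    """Rebuild the nested document from the flat rows (inverse of flatten)."""
    first = rows[0]
    return {
        "order_id": first["order_id"],
        "customer": {
            "name": first["customer_name"],
            "city": first["customer_city"],
        },
        "items": [{"sku": r["sku"], "qty": r["qty"]} for r in rows],
    }


rows = flatten(nested)

# Serialized size as a crude proxy for storage cost (an assumption of this sketch).
nested_bytes = len(json.dumps(nested).encode("utf-8"))
flat_bytes = len(json.dumps(rows).encode("utf-8"))
print(nested_bytes, flat_bytes)
```

Both forms carry the same information, since `unflatten(flatten(nested))` returns the original document, but the flat form repeats the customer fields on every row. Which form is cheaper to store and query in a real system depends on the engine and workload, which is the question the paper sets out to measure.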

A Framework for Management of Semistructured Probabilistic Data

Author: A. Dekhtyar
Alex Dekhtyar
D. Barbará
D. Dey
E. Kornatzky
E. Zimányi
F. Tian
J. Halpern
Judy Goldsmith
L.M. de Campos
M. Pittarelli
R. Ng
V.S. Lakshmanan
W. Zhao
Wenzhong Zhao
Publication venue: Springer Science and Business Media LLC