2,245 research outputs found
Managing and Querying Multi-Version XML Data with Update Logging
With the increasing popularity of storing content on the WWW and intranet in XML form, there arises the need for the control and management of this data. As this data is constantly evolving, users want to be able to query previous versions, query changes in documents, as well as to retrieve a particular document version efficiently. This paper proposes a version management system for XML data that can manage and query changes in an effective and meaningful manner
A Comparative Study: Change Detection and Querying Dynamic XML Documents
The efficient management of the dynamic XML documents is a complex area of research. The changes and size of the XML documents throughout its lifetime are limitless. Change detection is an important part of version management to identify difference between successive versions of a document. Document content is continuously evolving. Users wanted to be able to query previous versions, query changes in documents, as well as to retrieve a particular document version efficiently. In this paper we provide comprehensive comparative analysis of various control schemes for change detection and querying dynamic XML documents
Anatomy of a Native XML Base Management System
Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management comprising recovery and multi-user synchronisation as well as query processing for XML
Designing Traceability into Big Data Systems
Providing an appropriate level of accessibility and traceability to data or
process elements (so-called Items) in large volumes of data, often
Cloud-resident, is an essential requirement in the Big Data era.
Enterprise-wide data systems need to be designed from the outset to support
usage of such Items across the spectrum of business use rather than from any
specific application view. The design philosophy advocated in this paper is to
drive the design process using a so-called description-driven approach which
enriches models with meta-data and description and focuses the design process
on Item re-use, thereby promoting traceability. Details are given of the
description-driven design of big data systems at CERN, in health informatics
and in business process management. Evidence is presented that the approach
leads to design simplicity and consequent ease of management thanks to loose
typing and the adoption of a unified approach to Item management and usage.Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International
Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore
July 2015. arXiv admin note: text overlap with arXiv:1402.5764,
arXiv:1402.575
AsterixDB: A Scalable, Open Source BDMS
AsterixDB is a new, full-function BDMS (Big Data Management System) with a
feature set that distinguishes it from other platforms in today's open source
Big Data ecosystem. Its features make it well-suited to applications like web
data warehousing, social data storage and analysis, and other use cases related
to Big Data. AsterixDB has a flexible NoSQL style data model; a query language
that supports a wide range of queries; a scalable runtime; partitioned,
LSM-based data storage and indexing (including B+-tree, R-tree, and text
indexes); support for external as well as natively stored data; a rich set of
built-in types; support for fuzzy, spatial, and temporal types and queries; a
built-in notion of data feeds for ingestion of data; and transaction support
akin to that of a NoSQL store.
Development of AsterixDB began in 2009 and led to a mid-2013 initial open
source release. This paper is the first complete description of the resulting
open source AsterixDB system. Covered herein are the system's data model, its
query language, and its software architecture. Also included are a summary of
the current status of the project and a first glimpse into how AsterixDB
performs when compared to alternative technologies, including a parallel
relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data
analytics platform, for things that both technologies can do. Also included is
a brief description of some initial trials that the system has undergone and
the lessons learned (and plans laid) based on those early "customer"
engagements
A Generic Approach and Framework for Managing Complex Information
Several application domains, such as healthcare, incorporate domain knowledge into their day-to-day activities to standardise and enhance their performance. Such incorporation produces complex information, which contains two main clusters (active and passive) of information that have internal connections between them. The active cluster determines the recommended procedure that should be taken as a reaction to specific situations. The passive cluster determines the information that describes these situations and other descriptive information plus the execution history of the complex information. In the healthcare domain, a medical patient plan is an example for complex information produced during the disease management activity from specific clinical guidelines. This thesis investigates the complex information management at an application domain level in order to support the day-to-day organization activities. In this thesis, a unified generic approach and framework, called SIM (Specification, Instantiation and Maintenance), have been developed for computerising the complex information management. The SIM approach aims at providing a conceptual model for the complex information at different abstraction levels (generic and entity-specific). In the SIM approach, the complex information at the generic level is referred to as a skeletal plan from which several entity-specific plans are generated. The SIM framework provides comprehensive management aspects for managing the complex information. In the SIM framework, the complex information goes through three phases, specifying the skeletal plans, instantiating entity-specific plans, and then maintaining these entity-specific plans during their lifespan. In this thesis, a language, called AIM (Advanced Information Management), has been developed to support the main functionalities of the SIM approach and framework. AIM consists of three components: AIMSL, AIM ESPDoc model, and AIMQL. The AIMSL is the AIM specification component that supports the formalisation process of the complex information at a generic level (skeletal plans). The AIM ESPDoc model is a computer-interpretable model for the entity-specific plan. AIMQL is the AIM query component that provides support for manipulating and querying the complex information, and provides special manipulation operations and query capabilities, such as replay query support. The applicability of the SIM approach and framework is demonstrated through developing a proof-of-concept system, called AIMS, using the available technologies, such as XML and DBMS. The thesis evaluates the the AIMS system using a clinical case study, which has applied to a medical test request application
State-of-the-art on evolution and reactivity
This report starts by, in Chapter 1, outlining aspects of querying and updating resources on
the Web and on the Semantic Web, including the development of query and update languages
to be carried out within the Rewerse project.
From this outline, it becomes clear that several existing research areas and topics are of
interest for this work in Rewerse. In the remainder of this report we further present state of
the art surveys in a selection of such areas and topics. More precisely: in Chapter 2 we give
an overview of logics for reasoning about state change and updates; Chapter 3 is devoted to briefly describing existing update languages for the Web, and also for updating logic programs;
in Chapter 4 event-condition-action rules, both in the context of active database systems and
in the context of semistructured data, are surveyed; in Chapter 5 we give an overview of some relevant rule-based agents frameworks
- …