2,245 research outputs found

    Managing and Querying Multi-Version XML Data with Update Logging

    With the increasing popularity of storing content on the WWW and intranet in XML form, there arises the need for the control and management of this data. As this data is constantly evolving, users want to be able to query previous versions, query changes in documents, as well as to retrieve a particular document version efficiently. This paper proposes a version management system for XML data that can manage and query changes in an effective and meaningful manner

    A Comparative Study: Change Detection and Querying Dynamic XML Documents

    The efficient management of dynamic XML documents is a complex area of research. The changes to, and the size of, an XML document over its lifetime are unbounded. Change detection is an important part of version management: it identifies the differences between successive versions of a document. Document content is continuously evolving. Users want to be able to query previous versions, query changes in documents, and retrieve a particular document version efficiently. In this paper we provide a comprehensive comparative analysis of various control schemes for change detection and querying dynamic XML documents

    Anatomy of a Native XML Base Management System

    Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management comprising recovery and multi-user synchronisation as well as query processing for XML

    Designing Traceability into Big Data Systems

    Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach which enriches models with meta-data and description and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management thanks to loose typing and the adoption of a unified approach to Item management and usage. Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575

    AsterixDB: A Scalable, Open Source BDMS

    AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for things that both technologies can do. Also included is a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements

    A Generic Approach and Framework for Managing Complex Information

    Several application domains, such as healthcare, incorporate domain knowledge into their day-to-day activities to standardise and enhance their performance. Such incorporation produces complex information, which contains two main clusters (active and passive) of information that have internal connections between them. The active cluster determines the recommended procedure that should be taken as a reaction to specific situations. The passive cluster determines the information that describes these situations and other descriptive information plus the execution history of the complex information. In the healthcare domain, a medical patient plan is an example for complex information produced during the disease management activity from specific clinical guidelines. This thesis investigates the complex information management at an application domain level in order to support the day-to-day organization activities. In this thesis, a unified generic approach and framework, called SIM (Specification, Instantiation and Maintenance), have been developed for computerising the complex information management. The SIM approach aims at providing a conceptual model for the complex information at different abstraction levels (generic and entity-specific). In the SIM approach, the complex information at the generic level is referred to as a skeletal plan from which several entity-specific plans are generated. The SIM framework provides comprehensive management aspects for managing the complex information. In the SIM framework, the complex information goes through three phases, specifying the skeletal plans, instantiating entity-specific plans, and then maintaining these entity-specific plans during their lifespan. In this thesis, a language, called AIM (Advanced Information Management), has been developed to support the main functionalities of the SIM approach and framework. AIM consists of three components: AIMSL, AIM ESPDoc model, and AIMQL. 
    The AIMSL is the AIM specification component that supports the formalisation process of the complex information at a generic level (skeletal plans). The AIM ESPDoc model is a computer-interpretable model for the entity-specific plan. AIMQL is the AIM query component that provides support for manipulating and querying the complex information, and provides special manipulation operations and query capabilities, such as replay query support. The applicability of the SIM approach and framework is demonstrated through developing a proof-of-concept system, called AIMS, using the available technologies, such as XML and DBMS. The thesis evaluates the AIMS system using a clinical case study, which has been applied to a medical test request application

    State-of-the-art on evolution and reactivity

    This report starts by, in Chapter 1, outlining aspects of querying and updating resources on the Web and on the Semantic Web, including the development of query and update languages to be carried out within the Rewerse project. From this outline, it becomes clear that several existing research areas and topics are of interest for this work in Rewerse. In the remainder of this report we further present state of the art surveys in a selection of such areas and topics. More precisely: in Chapter 2 we give an overview of logics for reasoning about state change and updates; Chapter 3 is devoted to briefly describing existing update languages for the Web, and also for updating logic programs; in Chapter 4 event-condition-action rules, both in the context of active database systems and in the context of semistructured data, are surveyed; in Chapter 5 we give an overview of some relevant rule-based agents frameworks