5 research outputs found
Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies
Business Intelligence plays an important role in decision making. Based on
data warehouses and Online Analytical Processing, a business intelligence tool
can be used to analyze complex data. Still, summarizability issues in data
warehouses cause ineffective analyses that may become critical problems to
businesses. To settle this issue, many researchers have studied and proposed
various solutions, both in relational and XML data warehouses. However, they
find difficulty in evaluating the performance of their proposals since the
available benchmarks lack complex hierarchies. In order to contribute to
summarizability analysis, this paper proposes an extension to the XML warehouse
benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate
XML data warehouses with scalable complex hierarchies as well as
summarizability processing. We experimentally demonstrated that complex
hierarchies can definitely be included into a benchmark dataset, and that our
benchmark is able to compare two alternative approaches dealing with
summarizability issues.
Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP
2012), Maui, United States (2012)
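To make the notion of a summarizability issue concrete, here is a minimal sketch, assuming a single additive measure and a hierarchy given as a child-to-parents mapping. All names and data are hypothetical and are not taken from XWeB or from the paper's approaches. The sketch checks only one necessary condition: rolling a measure up one level must preserve the grand total. A non-strict hierarchy (a member with several parents) double-counts values, while a non-covering hierarchy (a member with no parent) loses them.

```python
from collections import defaultdict


def roll_up(facts, parents_of):
    """Aggregate each member's value into every parent it maps to."""
    totals = defaultdict(int)
    for member, value in facts.items():
        for parent in parents_of.get(member, []):
            totals[parent] += value
    return dict(totals)


def preserves_total(facts, parents_of):
    """True if the rolled-up level sums to the same grand total as the facts."""
    return sum(roll_up(facts, parents_of).values()) == sum(facts.values())


# Hypothetical sales per product (illustrative data only).
facts = {"p1": 10, "p2": 20, "p3": 5}

strict = {"p1": ["A"], "p2": ["A"], "p3": ["B"]}           # one parent each
non_strict = {"p1": ["A"], "p2": ["A", "B"], "p3": ["B"]}  # p2 has two parents
non_covering = {"p1": ["A"], "p2": ["A"], "p3": []}        # p3 has no parent

print(preserves_total(facts, strict))        # True
print(preserves_total(facts, non_strict))    # False: p2 is counted twice
print(preserves_total(facts, non_covering))  # False: p3 is dropped
```

Passing this check does not by itself guarantee summarizability; it only catches the double-counting and missing-value cases shown here.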
JUpdate: A JSON Update Language
Although JSON documents are being used in several emerging applications (e.g., Big Data applications, IoT, mobile computing, smart cities, and online social networks), there is no consensual or standard language for updating JSON documents (i.e., creating, deleting or changing such documents, where changing means inserting, deleting, replacing, copying, moving, etc., portions of data in such documents). To fill this gap, we propose in this paper an SQL-like language, named JUpdate, for updating JSON documents. JUpdate is based on a set of six primitive update operations, which is proven complete and minimal, and it provides a set of fourteen user-friendly high-level operations with a well-founded semantics defined on the basis of the primitive update operations
Mediation on XQuery Views
The major goal of information integration is to provide efficient and easy-to-use access to multiple heterogeneous data sources with a single query. At the same time, one of the current trends is to use standard technologies for implementing solutions to complex software problems. In this dissertation, I used XML and XQuery as the standard technologies and have developed an extended projection algorithm to provide a solution to the information integration problem. In order to demonstrate my solution, I implemented a prototype mediation system called Omphalos based on XML related technologies. The dissertation describes the architecture of the system, its metadata, and the process it uses to answer queries. The system uses XQuery expressions (termed metaqueries) to capture complex mappings between global schemas and data source schemas. The system then applies these metaqueries in order to rewrite a user query on a virtual global database (representing the integrated view of the heterogeneous data sources) to a query (termed an outsourced query) on the real data sources. An extended XML document projection algorithm was developed to increase the efficiency of selecting the relevant subset of data from an individual data source to answer the user query. The system applies the projection algorithm to decompose an outsourced query into atomic queries which are each executed on a single data source. I also developed an algorithm to generate integrating queries, which the system uses to compose the answers from the atomic queries into a single answer to the original user query. I present a proof of both the extended XML document projection algorithm and the query integration algorithm. An analysis of the efficiency of the new extended algorithm is also presented. Finally I describe a collaborative schema-matching tool that was implemented to facilitate maintaining metadata
Compressing Labels of Dynamic XML Data using Base-9 Scheme and Fibonacci Encoding
The flexibility and self-describing nature of XML have made it the most common mark-up language used for data representation over the Web. XML data is naturally modelled as a tree, where the structural tree information can be encoded into labels via an XML labelling scheme in order to permit answers to queries without the need to access the original XML files. As the transmission of XML data over the Internet has become widespread, it has also become necessary to have an XML labelling scheme that supports dynamic XML data. For a large-scale and frequently updated XML document, existing dynamic XML labelling schemes still suffer from high growth rates in terms of their label size, which can result in overflow problems and/or ambiguous data/query retrievals.
This thesis considers the compression of XML labels. A novel XML labelling scheme, named “Base-9”, has been developed to generate labels that are as compact as possible and yet provide efficient support for queries to both static and dynamic XML data. A Fibonacci prefix-encoding method has been used for the first time to store Base-9’s XML labels in a compressed format, with the intention of minimising the storage space without degrading XML querying performance. The thesis also investigates the compression of XML labels using various existing prefix-encoding methods. This investigation has resulted in the proposal of a novel prefix-encoding method named “Elias-Fibonacci of order 3”, which has achieved the fastest encoding time of all prefix-encoding methods studied in this thesis, whereas Fibonacci encoding was found to require the minimum storage.
Unlike current XML labelling schemes, the new Base-9 labelling scheme ensures the generation of short labels even after large, frequent, skewed insertions. The experimental results reported herein support the advantages of the short labels produced by combining the Base-9 scheme with Fibonacci encoding for storing, updating, retrieving and querying XML data.
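As an illustration of the Fibonacci prefix encoding mentioned in this abstract, the following is a minimal sketch of classic Fibonacci coding for positive integers. It is not the thesis's implementation: the function names fib_encode and fib_decode_stream are hypothetical, and the way Base-9 labels are mapped to integers is not described in the abstract and is not modelled here. Every codeword ends in "11", a pattern that cannot occur earlier in a codeword, so codewords can be concatenated and still be split unambiguously.

```python
def fib_encode(n: int) -> str:
    """Return the Fibonacci codeword of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers")
    # Fibonacci numbers 1, 2, 3, 5, ... up to the largest one not exceeding n.
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()
    # Greedy Zeckendorf decomposition, largest Fibonacci number first.
    bits = []
    remaining = n
    for f in reversed(fibs):
        if f <= remaining:
            bits.append("1")
            remaining -= f
        else:
            bits.append("0")
    # Emit the smallest Fibonacci position first, then the terminating "1".
    return "".join(reversed(bits)) + "1"


def fib_decode_stream(bits: str) -> list[int]:
    """Decode a concatenation of Fibonacci codewords back into integers."""
    values = []
    a, b = 1, 2          # Fibonacci weights of the current and next position
    total = 0
    prev = "0"
    for bit in bits:
        if bit == "1" and prev == "1":
            # Two consecutive ones mark the end of a codeword.
            values.append(total)
            a, b, total, prev = 1, 2, 0, "0"
            continue
        if bit == "1":
            total += a
        a, b = b, a + b
        prev = bit
    return values


stream = "".join(fib_encode(n) for n in [2, 1, 4])
print(stream)                     # 011111011
print(fib_decode_stream(stream))  # [2, 1, 4]
```

Small integers receive short codewords (1 is "11", 2 is "011"), which is the general reason a prefix code of this kind can save space when most stored values are small.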
An Integrated Environment For Automated Benchmarking And Validation Of XML-Based Applications
Testing is the dominant software verification technique used in industry; it is a critical and
highly expensive process during software development. As software complexity
increases, the costs of testing are increasing rapidly. Faced with this problem, many
researchers are working on automated testing, attempting to find methods that execute the
processes of testing automatically and cut down the cost of testing.
Today, software systems are becoming more complicated. Some of them are composed of
several different components. Some projects even require different systems to work together
and support each other. XML has been developed to facilitate data exchange
and enhance interoperability among software systems. Along with the development of
XML technologies, XML-based systems are used widely in many domains. In this thesis
we present a methodology for testing XML-based applications automatically.
In this thesis we present a methodology called XPT (XML-based Partition Testing),
which derives XML instances from an XML Schema automatically and systematically.
The XPT methodology is inspired by the Category-partition method, which is a
well-known approach to black-box test generation. We follow a similar idea of applying
partitioning to an XML Schema in order to generate a suite of conforming instances; in
addition, since the number of generated instances soon becomes unmanageable, we also
introduce a set of heuristics for reducing the suite while optimizing the XML Schema
coverage. The aim of our research is not only to invent a technical method, but also to attempt
to apply the XPT methodology in real applications. We have created a proof-of-concept
tool, TAXI, which is the implementation of XPT. This tool has a graphical user interface
that can guide and help testers to use it easily. TAXI can also be customized for specific
applications to build the test environment and automate the whole testing process.
The details of the TAXI design and the case studies using TAXI in different domains are
presented in this thesis. The case studies cover three test purposes. The first is
functional correctness: we apply the methodology to XSLT testing,
using TAXI to build an automatic environment for testing XSLT transformations;
the second is robustness testing: we carried out an XML database mapping test, which tests a
data transformation tool that maps and populates data from XML documents into an XML
database; and the third is performance testing: we present an XML benchmark that
uses TAXI to benchmark XML-based applications.
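The partitioning idea described in this abstract can be illustrated with a toy sketch. It assumes a drastically simplified, hypothetical schema model (a root element with a list of child slots, each slot listing its alternatives) and is not the XPT algorithm or the TAXI tool, whose details the abstract does not give. A choice between elements contributes one alternative per element, an optional element contributes "present" and "absent", and the suite is the Cartesian product of the slots, which also shows why the number of instances grows quickly and needs reducing.

```python
from itertools import product
import xml.etree.ElementTree as ET

# Hypothetical toy schema: each slot lists the alternatives for one child
# position; None stands for "element absent" (an optional element).
SCHEMA = {
    "root": "order",
    "slots": [
        ["card", "cash"],  # a choice between two payment elements
        ["note", None],    # an optional note element
    ],
}


def generate_instances(schema):
    """Build one XML instance per combination of slot alternatives."""
    instances = []
    for combo in product(*schema["slots"]):
        root = ET.Element(schema["root"])
        for name in combo:
            if name is not None:
                ET.SubElement(root, name)
        instances.append(ET.tostring(root, encoding="unicode"))
    return instances


for xml_text in generate_instances(SCHEMA):
    print(xml_text)
```

With two slots of two alternatives each, the sketch yields four instances; every additional slot multiplies the size of the suite by its number of alternatives.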