
Fast and Tiny Structural Self-Indexes for XML

Author: Maneth Sebastian
Sebastian Tom
Publication date: 27/12/2010

XML document markup is highly repetitive and therefore compresses well with dictionary-based methods such as DAGs or grammars. In the context of selectivity estimation, grammar-compressed trees have previously been used as synopses for structural XPath queries. Here a fully-fledged index over such grammars is presented. The index allows arbitrary tree algorithms to be executed with a slow-down that is comparable to the space improvement. More interestingly, certain algorithms execute much faster over the index (because no decompression occurs). For example, for structural XPath count queries, evaluating over the index is faster than previous XPath implementations, often by two orders of magnitude. The index also allows XML results (including texts) to be serialized faster than previous systems, by a factor of about 2-3. This is due to efficient copy handling of grammar repetitions, and because materialization is avoided entirely. In order to compare with twig join implementations, we implemented a materializer which writes out pre-order numbers of result nodes, and show its competitiveness.

Comment: 13 pages
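As a rough illustration of why count queries can run without decompression, the sketch below shares identical subtrees of a small tree in a DAG and then counts labelled nodes in one pass over the DAG. It is not the index described in the abstract: the tuple-based tree encoding, the `build_dag` and `count_label` helpers, and the sample document are all assumptions made for this example, and it covers only DAG sharing, not grammar compression.

```python
def build_dag(tree, ids, nodes):
    """Hash-cons `tree`, given as (label, [children]), into `nodes`.

    Identical subtrees map to the same node id, so each distinct
    subtree is stored once. Returns the id of the root of `tree`.
    """
    label, children = tree
    key = (label, tuple(build_dag(child, ids, nodes) for child in children))
    if key not in ids:
        ids[key] = len(nodes)
        nodes.append(key)
    return ids[key]


def count_label(nodes, root, label):
    """Count nodes labelled `label` in the tree that `root` unfolds to.

    Children always receive smaller ids than their parents, so one
    left-to-right pass visits every DAG node after its children and
    never unfolds the shared subtrees.
    """
    counts = [0] * len(nodes)
    for i, (node_label, children) in enumerate(nodes):
        counts[i] = int(node_label == label) + sum(counts[c] for c in children)
    return counts[root]


# Hypothetical sample document: a catalog with three identical items.
item = ("item", [("name", []), ("price", [])])
doc = ("catalog", [item, item, item])

ids, nodes = {}, []
root = build_dag(doc, ids, nodes)

print(len(nodes))                        # distinct subtrees stored in the DAG
print(count_label(nodes, root, "name"))  # occurrences in the unfolded tree
```

The unfolded tree here has ten nodes while the DAG stores four, and the count is computed from the four stored nodes only.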

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

08261 Abstracts Collection -- Structure-Based Compression of Complex Massive Data

Author: Lohrey Markus
Maneth Sebastian
Rytter Wojciech
Publication venue: Dagstuhl Seminar Proceedings. 08261 - Structure-Based Compression of Complex Massive Data
Publication date: 01/01/2008

From June 22, 2008 to June 27, 2008 the Dagstuhl Seminar 08261 "Structure-Based Compression of Complex Massive Data" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Shingle 2.0: generalising self-consistent and automated domain discretisation for multi-scale geophysical models

Author: Candy Adam S.
Pietrzak Julie D.
Publication date: 24/03/2017

The approaches taken to describe and develop spatial discretisations of the domains required for geophysical simulation models are commonly ad hoc, model- or application-specific, and under-documented. This is particularly acute for simulation models that are flexible in their use of multi-scale, anisotropic, fully unstructured meshes, where a relatively large number of heterogeneous parameters are required to constrain their full description. As a consequence, it can be difficult to reproduce simulations, to ensure provenance in model data handling and initialisation, and to conduct model intercomparisons rigorously. This paper takes a novel approach to spatial discretisation, considering it much like a numerical simulation model problem of its own. It introduces a generalised, extensible, self-documenting approach to describe carefully, and necessarily in full, the constraints over the heterogeneous parameter space that determine how a domain is spatially discretised. This additionally provides a method to accurately record these constraints, using high-level natural-language-based abstractions, that enables full accounts of provenance, sharing and distribution. Together with this description, a generalised consistent approach to unstructured mesh generation for geophysical models is developed that is automated, robust and repeatable, quick to draft, rigorously verified and consistent with the source data throughout. This interprets the description above to execute a self-consistent spatial discretisation process, which is automatically validated against expected discrete characteristics and metrics.

Comment: 18 pages, 10 figures, 1 table. Submitted for publication and under review

arXiv.org e-Print Archive

Directory of Open Access Journals

Service broker based on cloud service description language

Author: Barrie Peter
Razaq Abdul
Tianfield Huaglory
Yue Hong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/04/2017

ResearchOnline@GCU

Compression of Probabilistic XML documents

Author: Veldman Irma
Publication venue: University of Twente, Centre for Telematics and Information Technology
Publication date: 01/01/2009

Probabilistic XML (PXML) files resulting from data integration can become extremely large, which is undesirable. For XML there are several techniques available to compress the document, and since probabilistic XML is in fact (a special form of) XML, it might benefit from these methods even more. In this research we survey the compression mechanisms that are available for XML and implement one of them, customizing it with respect to the properties of probabilistic XML. Experiments show that there is no significant improvement for combinations of traditional mechanisms with techniques that are specially designed for probabilistic XML.

University of Twente Research Information