282 research outputs found

    SMOQE: A System for Providing Secure Access to XML

    Get PDF
    XML views have been widely used to enforce access control, support data integration, and speed up query answering. In many applications, e.g., XML security enforcement, it is prohibitively expensive to materialize and maintain a large number of views. Therefore, views are necessarily virtual. An immediate question then is how to answer queries on XML virtual views. A common approach is to rewrite a query on the view to an equivalent one on the underlying document, and evaluate the rewritten query. This is the approach used in the Secure MOdular Query Engine (SMOQE). The demo presents SMOQE, the first system to provide efficient support for answering queries over virtual and possibly recursively defined XML views. We demonstrate a set of novel techniques for the specification of views, the rewriting, evaluation and optimization of XML queries. Moreover, we provide insights into the internals of the engine by a set of visual tools. 1

    06472 Abstracts Collection - XQuery Implementation Paradigms

    Get PDF
    From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 ``XQuery Implementation Paradigms'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    XQuery Streaming by Forest Transducers

    Full text link
    Streaming of XML transformations is a challenging task and only very few systems support streaming. Research approaches generally define custom fragments of XQuery and XPath that are amenable to streaming, and then design custom algorithms for each fragment. These languages have several shortcomings. Here we take a more principles approach to the problem of streaming XQuery-based transformations. We start with an elegant transducer model for which many static analysis problems are well-understood: the Macro Forest Transducer (MFT). We show that a large fragment of XQuery can be translated into MFTs --- indeed, a fragment of XQuery, that can express important features that are missing from other XQuery stream engines, such as GCX: our fragment of XQuery supports XPath predicates and let-statements. We then rely on a streaming execution engine for MFTs, one which uses a well-founded set of optimizations from functional programming, such as strictness analysis and deforestation. Our prototype achieves time and memory efficiency comparable to the fastest known engine for XQuery streaming, GCX. This is surprising because our engine relies on the OCaml built in garbage collector and does not use any specialized buffer management, while GCX's efficiency is due to clever and explicit buffer management.Comment: Full version of the paper in the Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE 2014

    Overview of query optimization in XML database systems

    Get PDF

    05061 Abstracts Collection -- Foundations of Semistructured Data

    Get PDF
    From 06.02.05 to 11.02.05, the Dagstuhl Seminar 05061 ``Foundations of Semistructured Data\u27\u27 was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    Semantics and efficient evaluation of partial tree-pattern queries on XML

    Get PDF
    Current applications export and exchange XML data on the web. Usually, XML data are queried using keyword queries or using the standard structured query language XQuery the core of which consists of the navigational query language XPath. In this context, one major challenge is the querying of the data when the structure of the data sources is complex or not fully known to the user. Another challenge is the integration of multiple data sources that export data with structural differences and irregularities. In this dissertation, a query language for XML called Partial Tree-Pattern Query (PTPQ) language is considered. PTPQs generalize and strictly contain Tree-Pattern Queries (TPQs) and can express a broad structural fragment of XPath. Because of their expressive power and flexibility, they are useful for querying XML documents the structure of which is complex or not fully known to the user, and for integrating XML data sources with different structures. The dissertation focuses on three issues. The first one is the design of efficient non-main-memory evaluation methods for PTPQs. The second one is the assignment of semantics to PTPQs so that they return meaningful answers. The third one is the development of techniques for answering TPQs using materialized views. Non-main-memory XML query evaluation can be done in two modes (which also define two evaluation models). In the first mode, data is preprocessed and indexes, called inverted lists, are built for it. In the second mode, data are unindexed and arrives continuously in the form of a stream. Existing algorithms cannot be used directly or indirectly to efficiently compute PTPQs in either mode. Initially, the problem of efficiently evaluating partial path queries in the inverted lists model has been addressed. Partial path queries form a subclass of PTPQs which is not contained in the class of TPQs. Three novel algorithms for evaluating partial path queries including a holistic one have been designed. The analytical and experimental results show that the holistic algorithm outperforms the other two. These results have been extended into holistic and non-holistic approaches for PTPQs in the inverted lists model. The experiments show again the superiority of the holistic approach. The dissertation has also addressed the problem of evaluating PTPQs in the streaming model, and two original efficient streaming algorithms for PTPQs have been designed. Compared to the only known streaming algorithm that supports an extension of TPQs, the experimental results show that the proposed algorithms perform better by orders of magnitude while consuming a much smaller fraction of memory space. An original approach for assigning semantics to PTPQs has also been devised. The novel semantics seamlessly applies to keyword queries and to queries with structural restrictions. In contrast to previous approaches that operate locally on data, the proposed approach operates globally on structural summaries of data to extract tree patterns. Compared to previous approaches, an experimental evaluation shows that our approach has a perfect recall both for XML documents with complete and with incomplete data. It also shows better precision compared to approaches with similar recall. Finally, the dissertation has addressed the problem of answering XML queries using exclusively materialized views. An original approach for materializing views in the context of the inverted lists model has been suggested. Necessary and sufficient conditions have been provided for tree-pattern query answerability in terms of view-to-query homomorphisms. A time and space efficient algorithm was designed for deciding query answerability and a technique for computing queries over view materializations using stack- based holistic algorithms was developed. Further, optimizations were developed which (a) minimize the storage space and avoid redundancy by materializing views as bitmaps, and (b) optimize the evaluation of the queries over the views by applying bitwise operations on view materializations. The experimental results show that the proposed approach obtains largely higher hit rates than previous approaches, speeds up significantly the evaluation of queries without using views, and scales very smoothly in terms of storage space and computational overhead
    corecore