7 research outputs found

    Efficient data representation for XML in peer-based systems

    Purpose - New directions in the provision of end-user computing experiences mean that the best way to share data between small mobile computing devices needs to be determined. Partitioning large structures so that they can be shared efficiently provides a basis for data-intensive applications on such platforms. The partitioned structure can be compressed using dictionary-based approaches and then directly queried without first decompressing the whole structure. Design/methodology/approach - The paper describes an architecture for partitioning XML into structural and dictionary elements and the subsequent manipulation of the dictionary elements to make the best use of available space. Findings - The results indicate that considerable savings are available by removing duplicate dictionaries. The paper also identifies the most effective strategy for defining dictionary scope. Research limitations/implications - This evaluation is based on a range of benchmark XML structures and the approach to minimising dictionary size shows benefit in the majority of these. Where structures are small and regular, the benefits of efficient dictionary representation are lost. The authors' future research now focuses on heuristics for further partitioning of structural elements. Practical implications - Mobile applications that need access to large data collections will benefit from the findings of this research. Traditional client/server architectures are not suited to dealing with high volume demands from a multitude of small mobile devices. Peer data sharing provides a more scalable solution and the experiments that the paper describes demonstrate the most effective way of sharing data in this context. Social implications - Many services are available via smartphone devices but users are wary of exploiting the full potential because of the need to conserve battery power. 
The approach mitigates this challenge and consequently expands the potential for users to benefit from mobile information systems. This will have impact in areas such as advertising, entertainment and education but will depend on the acceptability of file sharing being extended from the desktop to the mobile environment. Originality/value - The original work characterises the most effective way of sharing large data sets between small mobile devices. This will save battery power on devices such as smartphones, thus providing benefits to users of such devices.
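
The split into structural and dictionary elements described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's architecture: the function names `partition` and `texts_of`, the tuple layout of the structure, and the sample document are all assumptions made for illustration, and attributes and tail text are ignored. Every tag name and text value is stored once in a dictionary, and the structure keeps only integer codes, so a query can walk the coded structure and look up just the strings it needs.

```python
import xml.etree.ElementTree as ET


def partition(xml_text):
    """Split an XML document into a coded structure and a string dictionary.

    Hypothetical layout: each element becomes
    (tag_code, text_code_or_None, [child structures]).
    """
    dictionary = []   # distinct strings, in first-seen order
    index = {}        # string -> position in dictionary

    def code(s):
        # Reuse the existing code for a string seen before.
        if s not in index:
            index[s] = len(dictionary)
            dictionary.append(s)
        return index[s]

    def walk(elem):
        text = (elem.text or "").strip()
        return (code(elem.tag),
                code(text) if text else None,
                [walk(child) for child in elem])

    return walk(ET.fromstring(xml_text)), dictionary


def texts_of(structure, dictionary, tag):
    """Yield the text of every element named `tag`, reading the coded form."""
    tag_code, text_code, children = structure
    if dictionary[tag_code] == tag and text_code is not None:
        yield dictionary[text_code]
    for child in children:
        yield from texts_of(child, dictionary, tag)


doc = ("<lib><book><lang>en</lang></book>"
       "<book><lang>en</lang></book>"
       "<book><lang>de</lang></book></lib>")
structure, dictionary = partition(doc)
langs = list(texts_of(structure, dictionary, "lang"))
```

In this toy document the strings `book`, `lang` and `en` each occur several times but are stored once. How far a dictionary should extend (per partition, per document, or shared between peers) is the scope question the abstract refers to, and this sketch does not address it.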

    Forschungsbericht Universität Mannheim 2006 / 2007

    The report provides, first, summary accounts of the university's research priorities and research profiles and of how its research has developed. Second, it gives an overview of the publications and research projects of the chairs, professorships and central research institutions. These are supplemented by details of research events organised, participation in research committees, an overview of third-party funding raised for research purposes, doctorates and habilitations, prizes and honours, and sponsors of the Universität Mannheim. Together they show the breadth and diversity of the university's research activities and their success at national and international level.

    An Algebraic Approach to XQuery Optimization

    As more data is stored in XML and more applications need to process this data, XML query optimization becomes performance critical. While optimization techniques for relational databases have been developed over the last thirty years, the optimization of XML queries poses new challenges. Query optimizers for XQuery, the standard query language for XML data, need to consider both document order and sequence order. Nevertheless, algebraic optimization proved powerful in query optimizers in relational and object oriented databases. Thus, this dissertation presents an algebraic approach to XQuery optimization. In this thesis, an algebra over sequences is presented that allows for a simple translation of XQuery into this algebra. The formal definitions of the operators in this algebra allow us to reason formally about algebraic optimizations. This thesis leverages the power of this formalism when unnesting nested XQuery expressions. In almost all cases unnesting nested queries in XQuery reduces query execution times from hours to seconds or milliseconds. Moreover, this dissertation presents three basic algebraic patterns of nested queries. For every basic pattern a decision tree is developed to select the most effective unnesting equivalence for a given query. Query unnesting extends the search space that can be considered during cost-based optimization of XQuery. As a result, substantially more efficient query execution plans may be detected. This thesis presents two more important cases where the number of plan alternatives leads to substantially shorter query execution times: join ordering and reordering location steps in path expressions. Our algebraic framework detects cases where document order or sequence order is destroyed. However, state-of-the-art techniques for order optimization in cost-based query optimizers have efficient mechanisms to repair order in these cases. 
The results obtained for query unnesting and cost-based optimization of XQuery underline the need for an algebraic approach to XQuery optimization for efficient XML query processing. Moreover, they are applicable to optimization in relational databases where order semantics are considered.
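
The effect of unnesting described in this abstract can be sketched in plain Python. This is not the dissertation's algebra or any of its equivalences: the data, the function names `nested` and `unnested`, and the choice of a grouping rewrite are assumptions made for illustration. The nested form re-scans the inner sequence for every outer item. The unnested form groups the inner sequence once and then probes the groups, while still returning results in the order of the outer sequence.

```python
from collections import defaultdict

# Hypothetical input sequences; the order of `authors` must be preserved.
authors = ["Ann", "Bob", "Cy"]
books = [("Ann", "A1"), ("Bob", "B1"), ("Ann", "A2")]  # (author, title)


def nested(authors, books):
    # The inner scan runs once per outer item: O(len(authors) * len(books)).
    return [(a, [t for (b, t) in books if b == a]) for a in authors]


def unnested(authors, books):
    # Group the inner sequence once, then probe: O(len(authors) + len(books)).
    groups = defaultdict(list)
    for b, t in books:
        groups[b].append(t)
    # Authors without books still appear, with an empty group.
    return [(a, groups.get(a, [])) for a in authors]


result = unnested(authors, books)
```

Both functions return the same sequence. Note that `Cy`, who has no books, must still appear with an empty list: a rewrite that simply joined the two sequences would drop that item, which is one reason unnesting equivalences have to be chosen with care.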

    Rewriting Declarative Query Languages

    Queries against databases are formulated in declarative languages. Examples are the relational query language SQL and XPath or XQuery for querying data stored in XML. Using a declarative query language, the querist does not need to know about or decide on anything about the actual strategy a system uses to answer the query. Instead, the system can freely choose among the algorithms it employs to answer a query. Predominantly, query processing in the relational context is accomplished using a relational algebra. To this end, the query is translated into a logical algebra. The algebra consists of logical operators which facilitate the application of various optimization techniques. For example, logical algebra expressions can be rewritten in order to yield more efficient expressions. In order to query XML data, XPath and XQuery have been developed. Both are declarative query languages and, hence, can benefit from powerful optimizations. For instance, they could be evaluated using an algebraic framework. However, in general, the existing approaches are not directly utilizable for XML query processing. This thesis has two goals. The first goal is to overcome the above-mentioned misfits of XML query processing, making it ready for industrial-strength settings. Specifically, we develop an algebraic framework that is designed for the efficient evaluation of XPath and XQuery. To this end, we define an order-aware logical algebra and a translation of XPath into this algebra. Furthermore, based on the resulting algebraic expressions, we present rewrites in order to speed up the execution of such queries. The second goal is to investigate rewriting techniques in the relational context. To this end, we present rewrites based on algebraic equivalences that unnest nested SQL queries with disjunctions. Specifically, we present equivalences for unnesting algebraic expressions with bypass operators to handle disjunctive linking and correlation. 
Our approach can be applied to quantified table subqueries as well as scalar subqueries. For all our results, we present experiments that demonstrate the effectiveness of the developed approaches.

    Skalierbare Ausführung von Prozessanwendungen in dienstorientierten Umgebungen

    The structuring and use of enterprise-internal IT infrastructures on the basis of service-oriented architectures (SOA) and established XML technologies has grown steadily in recent years. Whereas early SOA implementations focused on the flexible execution of classical, business-relevant business processes, timely data analyses and the monitoring of business-relevant events now form further important application classes, both for identifying short-term problems in business operations and for recognising medium- and long-term changes in the market and flexibly adapting the company's business processes to them. Because the three application classes historically developed independently of one another, their application processes are currently modelled and executed in separate systems. This results in a number of disadvantages, which this thesis identifies and discusses in detail. Against this background, the thesis derives a consolidated execution platform that makes it possible to model processes of all three application classes together and to execute them efficiently in an SOA-based infrastructure. The thesis addresses the problems of such a consolidated execution platform on three levels: service communication, process execution, and the optimal distribution of SOA components within an infrastructure.

    Child Prime Label Approaches to Evaluate XML Structured Queries

    The adoption of the eXtensible Markup Language (XML) as the standard format to store and exchange semi-structured data has been gaining momentum. The growing number of XML documents leads to the need for appropriate XML querying algorithms which are able to retrieve XML data efficiently. Due to the importance of twig pattern matching in XML retrieval systems, finding all matching occurrences of a tree pattern query in an XML document is often considered as a specific task for XML databases as well as a core operation in XML query processing. This thesis presents a design and implementation of a new indexing technique, called the Child Prime Label (CPL) which exploits the property of prime numbers to identify Parent-Child (P-C) edges in twig pattern queries (TPQs) during query evaluation. The CPL approach can be incorporated efficiently within the existing labelling schemes. The major contributions of this thesis can be seen as a set of novel twig matching algorithms which apply the CPL approach and focus on reducing the overhead of storing useless elements and performing unnecessary computations during the output enumeration. The research presented here is the first to provide an efficient and general solution for TPQs containing ordering constraints and positional predicates specified by the XML query languages. To evaluate the CPL approaches, the holistic model was implemented as an experimental prototype in which the approaches proposed are compared against state-of-the-art holistic twig algorithms. Extensive performance studies on various real-world and artificial datasets were conducted to demonstrate the significant improvement of the CPL approaches over the previous indexing and querying methods. The experimental results demonstrate the validity and improvements of the new algorithms over other related methods on various common subclasses of TPQs. 
Moreover, the scalability tests reveal that the new algorithms are more suitable for processing large XML datasets.
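
The abstract does not spell out how the Child Prime Label is built, so the following Python sketch shows only one plausible reading of "exploiting the property of prime numbers" and should not be taken as the thesis's scheme. The assumptions are: every node receives a distinct prime, every node also stores the product of its children's primes, and a Parent-Child edge is tested by divisibility. The names `Node`, `label` and `is_parent_child` are hypothetical.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division against earlier primes."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


class Node:
    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)
        self.prime = None          # this node's own prime (assumed scheme)
        self.child_product = 1     # product of the children's primes


def label(node, gen):
    """Assign primes in document order and accumulate child products."""
    node.prime = next(gen)
    node.child_product = 1
    for child in node.children:
        label(child, gen)
        node.child_product *= child.prime


def is_parent_child(parent, child):
    # Because all primes are distinct, divisibility holds exactly when
    # `child` is one of the direct children of `parent`.
    return parent.child_product % child.prime == 0


# Tree: a(b(c), d)
c = Node("c")
b = Node("b", [c])
d = Node("d")
a = Node("a", [b, d])
label(a, primes())
```

With this labelling, `is_parent_child(a, b)` holds while `is_parent_child(a, c)` does not, because `c` is a grandchild of `a`. The products grow quickly for nodes with many children, which is a known cost of prime-based labels in general; how the thesis handles that is not stated in the abstract.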

    Forschungsbericht Universität Mannheim, 2004 / 2005

    In this research report for 2004/2005, the Universität Mannheim gives an account of its achievements in research. For the first time, the documentation follows a new structure that goes back to a resolution of the university's Research Council. As usual, it provides an overview of the publications and research projects of the chairs, professorships and central research institutions. These are supplemented by details of research events organised, participation in research committees, an overview of third-party funding raised for research purposes, doctorates and habilitations, prizes and honours, and sponsors of the Universität Mannheim. The data are rounded off by summary accounts of the research priorities and research profile of the faculties.