
    Four Lessons in Versatility or How Query Languages Adapt to the Web

    Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, and still others JSON or OWL for their data, even in similar domains. This fracturing stifles innovation, as application builders have to cope not only with one Web stack (e.g., XML technology) but with several, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: In a single query language, XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply to querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give (4) a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards more convenient, yet highly efficient, data access in a “Web of Data”.

    Interoperability of XML and relational data-optimization algorithm

    "Within the past six years, Extensible Markup Language (XML) has spread rapidly and has gained popularity in the database community with its primary focus in the design of query languages and storage methods to select data from vast amounts of XML data efficiently. In this respect, I discuss some of the research that has been done by presenting three papers that describe different approaches to querying XML documents. This thesis concentrates on the method used by Sadri and Lakshmanan in [1]: viewing an XML document as a relational database upon which the user can write simple SQL queries that can be translated into equivalent XQuery queries. Taking the output of the translation algorithm presented, I further develop an optimization algorithm meant to decrease the running time of the translated queries. I mainly focus on two aspects: the need of the distinct-values() function and the minimization of the number of variables. "--Abstract from author supplied metadata

    Using a Semi-Realistic Database to Support a Database Course

    A common problem for university relational database courses is to construct effective databases for instruction and assignments. Highly simplified ‘toy’ databases are easily available for teaching, learning, and practicing. However, they do not reflect the complexity and practical considerations that students encounter in real-world projects after their graduation. On the other hand, production databases may contain too many domain nuances and too much complexity to be effectively used as a learning tool. Sakila is a semi-realistic, high-quality, open-source, and highly available database provided by MySQL. This paper describes the use of Sakila as a unified platform to support instruction and multiple assignments of a graduate database course for five semesters. Based on seven surveys with 186 responses, the paper discusses our experience using Sakila. We find this approach promising, and students in general find it more useful and interesting than the highly simplified databases developed by the instructor or obtained from textbooks. We constructed a collection of 124 problems with suggested solutions on the topics of database modeling and normalization, SQL query, view, stored function, stored procedure, trigger, database Web-driven application development with PHP/MySQL, Relational Algebra using an interpreter, Relational Calculus, XML generation, XPath, and XQuery. This collection is available to Information Systems (IS) educators for adoption or adaptation as assignments, examples, and examination questions to support different database courses.

    Strategies for Encoding XML Documents in Relational Databases: Comparisons and Contrasts.

    The rise of XML as a de facto standard for document and data exchange has created a need to store and query XML documents in relational databases, today's de facto standard for data storage. Two common strategies for storing XML documents in relational databases, a process known as document shredding, are Interval encoding and ORDPATH Encoding. Interval encoding, which uses a fixed mapping for shredding XML documents, tends to favor selection queries, at a potential cost of O(N) for supporting insertion queries. ORDPATH Encoding, which uses a looser mapping for shredding XML, supports fixed-cost insertions, at a potential cost of longer-running selection queries. Experiments conducted for this research suggest that the breakeven point between the two algorithms occurs when users offer an average 1 insertion to every 5.6 queries, relative to documents of between 1.5 MB and 4 MB in size. However, heterogeneous tests of varying mixes of selects and inserts indicate that Interval always outperforms ORDPATH for mixes ranging from 76% selects to 88% selects. Queries for this experiment and sample documents were drawn from the XMark benchmark suite.
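
To make the interval-encoding idea in the abstract above concrete, here is a minimal sketch in Python. It is not the paper's implementation: the row layout (`tag`, `start`, `end`, `parent`), the helper names `interval_encode` and `descendants`, and the tiny sample document are all assumptions chosen for illustration. Each element receives a `(start, end)` pair from a depth-first walk, so a descendant test reduces to interval containment.

```python
import xml.etree.ElementTree as ET


def interval_encode(root):
    """Shred an element tree into rows, one per element, with (start, end) labels.

    Hypothetical row layout for illustration: tag, start, end, parent (the
    parent's start label, or None for the root).
    """
    rows = []
    counter = 0

    def visit(node, parent_start):
        nonlocal counter
        counter += 1                      # label assigned on entering the element
        row = {"tag": node.tag, "start": counter, "end": None, "parent": parent_start}
        rows.append(row)
        for child in node:
            visit(child, row["start"])
        counter += 1                      # label assigned on leaving the element
        row["end"] = counter

    visit(root, None)
    return rows


def descendants(rows, ancestor):
    """Return the rows whose interval lies strictly inside the ancestor's interval."""
    return [
        r for r in rows
        if ancestor["start"] < r["start"] and r["end"] < ancestor["end"]
    ]


# Small made-up document, loosely in the spirit of an auction-site schema.
doc = ET.fromstring("<site><people><person/><person/></people><regions/></site>")
rows = interval_encode(doc)
people = rows[1]
print([r["tag"] for r in descendants(rows, people)])
```

Under this labeling, a selection such as "all descendants of `people`" is a simple range comparison, which fits the abstract's remark that interval encoding favors selection queries. Inserting a new element, by contrast, can force every label after the insertion point to be renumbered, which is one way to read the O(N) insertion cost mentioned above.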

    Rewriting Declarative Query Languages

    Queries against databases are formulated in declarative languages. Examples are the relational query language SQL and XPath or XQuery for querying data stored in XML. Using a declarative query language, the querist does not need to know about or decide on anything about the actual strategy a system uses to answer the query. Instead, the system can freely choose among the algorithms it employs to answer a query. Predominantly, query processing in the relational context is accomplished using a relational algebra. To this end, the query is translated into a logical algebra. The algebra consists of logical operators which facilitate the application of various optimization techniques. For example, logical algebra expressions can be rewritten in order to yield more efficient expressions. In order to query XML data, XPath and XQuery have been developed. Both are declarative query languages and, hence, can benefit from powerful optimizations. For instance, they could be evaluated using an algebraic framework. However, in general, the existing approaches are not directly utilizable for XML query processing. This thesis has two goals. The first goal is to overcome the above-mentioned misfits of XML query processing, making it ready for industrial-strength settings. Specifically, we develop an algebraic framework that is designed for the efficient evaluation of XPath and XQuery. To this end, we define an order-aware logical algebra and a translation of XPath into this algebra. Furthermore, based on the resulting algebraic expressions, we present rewrites in order to speed up the execution of such queries. The second goal is to investigate rewriting techniques in the relational context. To this end, we present rewrites based on algebraic equivalences that unnest nested SQL queries with disjunctions. Specifically, we present equivalences for unnesting algebraic expressions with bypass operators to handle disjunctive linking and correlation. Our approach can be applied to quantified table subqueries as well as scalar subqueries. For all our results, we present experiments that demonstrate the effectiveness of the developed approaches.

    Dynamic recomposition of documents from distributed data sources

    Dynamic recomposition of documents refers to the process of on-the-fly creation of documents. A document can be generated from several documents that are stored at distributed data sites. The sources can be queried and results obtained in the form of XML. These XML documents can be combined after a series of transformation operations to obtain the target document. The resultant document can be stored statically or in the form of a command, which can be invoked later to recompose this document dynamically. Also, when a change is made to a document, only the change needs to be stored, instead of the modified document in its entirety. The purpose of this research was to provide a way to recompose dynamic documents. A solution is proposed at the level of algebra for update and recomposition of documents stored at distributed data sources. The issue of representing a document by a command, i.e., a composition operator and/or an editing command along with one or more path expressions, has also been researched. The construction of a dynamic document has three phases. The first is information retrieval. Phase two deals with building the real document: this includes filtering the retrieved data by selecting a relevant subset of a document, then applying update operations, and finally ordering and assembling the document. The final phase consists of displaying, storing, or exchanging the document over the web through a convenient means.

    Integration and coordination in a cognitive vision system

    In this paper, we present a case study that exemplifies general ideas of system integration and coordination. The application field of assistant technology provides an ideal test bed for complex computer vision systems including real-time components, human-computer interaction, dynamic 3-d environments, and information retrieval aspects. In our scenario the user is wearing an augmented reality device that supports her/him in everyday tasks by presenting information that is triggered by perceptual and contextual cues. The system integrates a wide variety of visual functions like localization, object tracking and recognition, action recognition, interactive object learning, etc. We show how different kinds of system behavior are realized using the Active Memory Infrastructure that provides the technical basis for distributed computation and a data- and event-driven integration approach.