1,422 research outputs found

    Efficient & Effective Selective Query Rewriting with Efficiency Predictions

    Get PDF
    To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which has impacts upon the user satisfaction, if the query is excessively slow. In this paper, we propose a novel framework for using the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose the prediction of the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used in establishing the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine

    Signature Files: An Integrated Access Method for Formatted and Unformatted Databases

    Get PDF
    The signature file approach is one of the most powerful information storage and retrieval techniques which is used for finding the data objects that are relevant to the user queries. The main idea of all signature based schemes is to reflect the essence of the data items into bit pattern (descriptors or signatures) and store them in a separate file which acts as a filter to eliminate the non aualifvine data items for an information reauest. It provides an integrated access method for both formattid and formatted databases. A complative overview and discussion of the proposed signatnre generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single and multiterm query cases, serial and paratlei architecture. static and dynamic environments are provided with a special emphasis on the multimedia databases where the pioneering prototype systems using signatnres yield highly encouraging results

    Physical Plan Instrumentation in Databases: Mechanisms and Applications

    Get PDF
    Database management systems (DBMSs) are designed with the goal set to compile SQL queries to physical plans that, when executed, provide results to the SQL queries. Building on this functionality, an ever-increasing number of application domains (e.g., provenance management, online query optimization, physical database design, interactive data profiling, monitoring, and interactive data visualization) seek to operate on how queries are executed by the DBMS for a wide variety of purposes ranging from debugging and data explanation to optimization and monitoring. Unfortunately, DBMSs provide little, if any, support to facilitate the development of this class of important application domains. The effect is such that database application developers and database system architects either rewrite the database internals in ad-hoc ways; work around the SQL interface, if possible, with inevitable performance penalties; or even build new databases from scratch only to express and optimize their domain-specific application logic over how queries are executed. To address this problem in a principled manner in this dissertation, we introduce a prototype DBMS, namely, Smoke, that exposes instrumentation mechanisms in the form of a framework to allow external applications to manipulate physical plans. Intuitively, a physical plan is the underlying representation that DBMSs use to encode how a SQL query will be executed, and providing instrumentation mechanisms at this representation level allows applications to express and optimize their logic on how queries are executed. Having such an instrumentation-enabled DBMS in-place, we then consider how to express and optimize applications that rely their logic on how queries are executed. To best demonstrate the expressive and optimization power of instrumentation-enabled DBMSs, we express and optimize applications across several important domains including provenance management, interactive data visualization, interactive data profiling, physical database design, online query optimization, and query discovery. Expressivity-wise, we show that Smoke can express known techniques, introduce novel semantics on known techniques, and introduce new techniques across domains. Performance-wise, we show case-by-case that Smoke is on par with or up-to several orders of magnitudes faster than state-of-the-art imperative and declarative implementations of important applications across domains. As such, we believe our contributions provide evidence and form the basis towards a class of instrumentation-enabled DBMSs with the goal set to express and optimize applications across important domains with core logic over how queries are executed by DBMSs

    Design issues in the production of hyper‐books and visual‐books

    Get PDF
    This paper describes an ongoing research project in the area of electronic books. After a brief overview of the state of the art in this field, two new forms of electronic book are presented: hyper‐books and visual‐books. A flexible environment allows them to be produced in a semi‐automatic way starting from different sources: electronic texts (as input for hyper‐books) and paper books (as input for visual‐books). The translation process is driven by the philosophy of preserving the book metaphor in order to guarantee that electronic information is presented in a familiar way. Another important feature of our research is that hyper‐books and visual‐books are conceived not as isolated objects but as entities within an electronic library, which inherits most of the features of a paper‐based library but introduces a number of new properties resulting from its non‐physical nature

    Anatomy of a Native XML Base Management System

    Full text link
    Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management comprising recovery and multi-user synchronisation as well as query processing for XML
    • 

    corecore