44 research outputs found

    Predictable performance and high query concurrency for data analytics

    Conventional data warehouses employ the query-at-a-time model, which maps each query to a distinct physical plan. When several queries execute concurrently, this model introduces contention and thrashing, because the physical plans, unaware of each other, compete for access to the underlying I/O and computation resources. As a result, while modern systems can efficiently optimize and evaluate a single complex data analysis query, their performance suffers significantly and can be highly erratic when multiple complex queries run at the same time. In this paper we present Cjoin, a new design that substantially improves throughput in large-scale data analytics systems processing many concurrent join queries. In contrast to the conventional query-at-a-time model, our approach employs a single physical plan that shares I/O, computation, and tuple storage across all in-flight join queries. We use an "always on" pipeline of non-blocking operators, managed by a controller that continuously examines the current query mix and optimizes the pipeline on the fly. Our design enables data analytics engines to scale gracefully to large data sets, provide predictable execution times, and reduce contention. We implemented Cjoin as an extension to the PostgreSQL DBMS. This prototype outperforms conventional commercial systems by an order of magnitude for tens to hundreds of concurrent queries.
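
The idea of one physical plan serving every in-flight join query can be illustrated with a small sketch. It is not Cjoin's implementation: the abstract does not describe the internals, so the per-row query bitmask, the function name `shared_scan`, and the toy data layout are all assumptions made for illustration. The sketch scans the fact rows once and uses a bitmask on each dimension row to record which registered queries select it, so the scan and the join lookup are shared by all queries.

```python
def shared_scan(fact_rows, dim_table, queries):
    """Answer several SUM-over-join queries with a single pass over the facts.

    fact_rows: iterable of (dim_key, measure) pairs        (hypothetical layout)
    dim_table: dict mapping dim_key -> attribute dict      (hypothetical layout)
    queries:   dict mapping query name -> predicate on a dimension row
    """
    names = list(queries)

    # Evaluate every query's predicate once per dimension row and store the
    # outcome as a bitmask: bit i is set if query i selects that row.
    dim_mask = {}
    for key, attrs in dim_table.items():
        mask = 0
        for i, name in enumerate(names):
            if queries[name](attrs):
                mask |= 1 << i
        dim_mask[key] = mask

    # Single shared scan: each fact row is read and joined once, then routed
    # to every query whose bit is set.
    totals = {name: 0 for name in names}
    for dim_key, measure in fact_rows:
        mask = dim_mask.get(dim_key, 0)
        if mask == 0:
            continue  # no in-flight query needs this row
        for i, name in enumerate(names):
            if mask & (1 << i):
                totals[name] += measure
    return totals


dims = {1: {"region": "EU"}, 2: {"region": "US"}}
facts = [(1, 10), (2, 5), (1, 7), (3, 100)]
result = shared_scan(
    facts,
    dims,
    {"eu_only": lambda row: row["region"] == "EU", "all_regions": lambda row: True},
)
```

In a query-at-a-time engine each of the two queries would scan `facts` separately; here the cost of the scan and the dimension lookup is paid once regardless of how many queries are registered.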

    Distributed file organization with scalable cost/performance

    Concurrency control protocols guaranteeing atomicity and serializability

    How to Manage Persistent State in DRM Systems

    Digital Rights Management (DRM) systems often must manage persistent state, which includes protected content, an audit trail, content usage counts, certificates and decryption keys. Ideally, persistent state that has monetary value should be stored in a physically secure server. However, the persistent state frequently needs to be stored in a hostile environment. For example, for good performance and to support disconnected operation, recent audit records may be stored on a consumer device. The device's user may have an incentive to alter the audit trail and thus obtain content for free. In this paper we explain the need for persistent state in DRM systems, describe several methods for maintaining persistent state depending on the system requirements, and then focus on the special case of protecting persistent state in hostile environments.

    Efficient Peer-To-Peer Lookup Based on a Distributed Trie

    Two main approaches have been taken for distributed key/value lookup operations in peer-to-peer systems: broadcast searches [1, 2] and location-deterministic algorithms [5, 6, 7, 9]. We describe a third alternative based on a distributed trie. This algorithm functions well in a very dynamic, hostile environment, offering security benefits over prior proposals. Our approach takes advantage of working-set temporal locality and global key/value distribution skews due to content popularity. Peers gradually learn system state during lookups, receiving the sought values and/or internal information used by the trie. The distributed trie converges to an accurate network map over time. We describe several modes of information piggybacking, as well as conservative and liberal variants of the basic algorithm for adversarial settings. Simulations show efficient lookups and low failure rates.
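
As a rough illustration of how a peer might accumulate routing knowledge in a trie, the sketch below keeps a local binary trie that maps key prefixes to peers believed to hold keys under that prefix. It is not the paper's algorithm: the class `PrefixTrie`, its methods `learn` and `candidates`, the bit-string key encoding, and the peer names are all hypothetical, and piggybacking modes and the adversarial variants are not modelled. It only shows the longest-prefix idea: the more lookups a peer observes, the deeper its trie matches a sought key.

```python
class PrefixTrie:
    """Local view of which peers are believed to serve which key prefixes."""

    def __init__(self):
        self.children = {}   # bit character ("0" or "1") -> PrefixTrie
        self.peers = set()   # peers known to hold some key under this prefix

    def learn(self, key_bits, peer):
        """Record that `peer` holds the key given as a string of '0'/'1'."""
        node = self
        for bit in key_bits:
            node = node.children.setdefault(bit, PrefixTrie())
            node.peers.add(peer)

    def candidates(self, key_bits):
        """Return (matched_prefix_length, peers) for the deepest known prefix."""
        node, depth = self, 0
        for bit in key_bits:
            child = node.children.get(bit)
            if child is None:
                break
            node, depth = child, depth + 1
        return depth, set(node.peers)


trie = PrefixTrie()
trie.learn("0101", "peerA")   # learned while performing an earlier lookup
trie.learn("0111", "peerB")
depth, peers = trie.candidates("0110")
```

A lookup for `"0110"` would be sent first to the peers returned for the longest matching prefix; anything learned from their replies can be fed back through `learn`, which is one simple way to picture the trie converging toward an accurate map over time.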