436 research outputs found

    The data cyclotron query processing scheme

    Get PDF
    Distributed database systems exploit static workload characteristics to steer data fragmentation and data allocation schemes. However, the grand challenge of distributed query processing is to come up with a self-organizing architecture, which exploits all resources to manage the hot data set, minimize query response time, and maximize throughput without global co-ordination. In this paper, we introduce the Data Cyclotron architecture which addresses the challenges using turbulent data movement through a storage ring built from distributed main memory capitalizing modern remote-DMA facilities. Queries assigned to individual nodes interact with the Data Cyclotron by picking up data fragments continuously flowing around, i.e., the hot set. Each data fragment carries a level of interest (LOI) metric, which represents the cumulative query interest as the fragment passes around the ring multiple times. A fragment with a LOI below a given threshold, inversely proportional to the ring load, is pulled o

    Peak Performance – Remote Memory Revisited

    Get PDF
    Many database systems share a need for large amounts of fast storage. However, economies of scale limit the utility of extending a single machine with an arbitrary amount of memory. The recent broad availability of the zero-copy data transfer protocol RDMA over low-latency and high throughput network connections such as InfiniBand prompts us to revisit the long-proposed usage of memory provided by remote machines. In this paper, we present a solution to make use of remote memory without manipulation of the operating system, and investigate the impact on database performance

    An architecture for recycling intermediates in a column-store

    Get PDF
    Automatically recycling (intermediate) results is a grand challenge for state-of-the-art databases to improve both query response time and throughput. Tuples are loaded and streamed through a tuple-at-a-time processing pipeline avoiding materialization of intermediates as much as possible. This limits the opportunities for reuse of overlapping computations to DBA-defined materialized views and function/result cache tuning. In contrast, the operator-at-a-time execution paradigm produces fully materialized results in each step of the query plan. To avoid resource contention, these intermediates are evicted as soon as possible. In this paper we study an architecture that harvests the by-products of the operator-at-a-time paradigm in a column store system using a lightweight mechanism, the recycler. The key challenge then becomes selection of the policies to admit intermediates to the resource pool, their retention period, and the eviction strategy when facing resource limitations. The proposed recycling architecture has been implemented in an open-source system. An experimental analysis against the TPC-H ad-hoc decision support benchmark and a complex, real-world application (SkyServer) demonstrates its effectiveness in terms of self-organizing behavior and its significant performance gains. The results indicate the potentials of recycling intermediates and charters a route for further development of database kernels

    Spinning Relations: High-Speed Networks for Distributed Join Processing

    Get PDF
    By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantag

    An architecture for recycling intermediates in a column-store

    Get PDF
    Automatic recycling intermediate results to improve both query response time and throughput is a grand c

    A column-store meets the point clouds

    Get PDF
    Dealing with LIDAR data in the context of database management systems calls for a re-assessment of their functionality, performance, and storage/processing limitations. The territory for efficient and scalable processing of LIDAR repositories using GIS enabled database systems is still largely unexplored. Bringing together hard core database management experts and GIS application developers is a sine qua non to advance the state of the art. In particular to assess the relative merits of both traditional row-based database engines and the modern column-oriented database engines

    Column-store support for RDF data management: not all swans are white

    Get PDF
    This paper reports on the results of an independent evaluation of the techniques presented in the VLDB 2007 paper "Scalable Semantic Web Data Management Using Vertical Partitioning", authored by D. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. We revisit the proposed benchmark and examine both the data and query space coverage. The benchmark is extended to cover a larger portion of the query space in a canonical way. Repeatability of the experiments is assessed using the code base obtained from the authors. Inspired by the proposed vertically-partitioned storage solution for RDF data and the performance figures using a column-store, we conduct a complementary analy- sis of state-of-the-art RDF storage solutions. To this end, we employ MonetDB/SQL, a fully-functional open source column-store, and a well-known --- for its performance --- commercial row-store DBMS.We implement two relational RDF storage solutions – triple-store and vertically-partitioned --- in both systems. This allows us to expand the scope of with the performance characterization along both dimensions --- triple-store vs. vertically-partitioned and row-store vs. column-store --- individually, before analyzing their combined effects. A detailed report of the experimental test-bed, as well as an in-depth analysis of the parameters involved, clarify the scope of the solution originally presented and position the results in a broader context by covering more systems

    Reactive phenotypes after acute and chronic NK‐cell activation

    Get PDF
    J Biol Regul Homeost Agents. 2004 Jul-Dec;18(3-4):331-4. Reactive phenotypes after acute and chronic NK-cell activation. Lima M, Almeida J, Teixeira MA, Santos AH, Queirós ML, Fonseca S, Moura J, Gonçalves M, Orfão A, Pinto Ribeiro AC. Service of Clinical Hematology, Laboratory of Cytometry, Hospital Geral de Santo António, Porto, Portugal. [email protected] Abstract Several phenotypic changes have been shown to occur after NK-cell stimulation, involving molecules that have been proved to regulate NK-cell migration into tissues and NK-cell activation and proliferation as well as target cell recognition and killing. Here, we review the reactive phenotypes observed in vivo after acute and chronic NK-cell activation. PMID: 15786700 [PubMed - indexed for MEDLINE

    Benchmarking and improving point cloud data management in MonetDB

    Get PDF
    The popularity, availability and sizes of point cloud data sets are increasing, thus raising interesting data management and processing challenges. Various software solutions are available for the management of point cloud data. A benchmark for point cloud data management systems was defined and it was executed for several solutions. In this paper we focus on the solutions based on the column-store MonetDB, the generic out-of-the-box approach is compared with two alternative approaches that exploit the spatial coherence of the data to improve the data access and to minimize the storage requirement
    corecore