2,635 research outputs found

    On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems

    Full text link
    A new emerging class of parallel database management systems (DBMS) is designed to take advantage of the partitionable workloads of on-line transaction processing (OLTP) applications. Transactions in these systems are optimized to execute to completion on a single node in a shared-nothing cluster without needing to coordinate with other nodes or use expensive concurrency control measures. But some OLTP applications cannot be partitioned such that all of their transactions execute within a single-partition in this manner. These distributed transactions access data not stored within their local partitions and subsequently require more heavy-weight concurrency control protocols. Further difficulties arise when the transaction's execution properties, such as the number of partitions it may need to access or whether it will abort, are not known beforehand. The DBMS could mitigate these performance issues if it is provided with additional information about transactions. Thus, in this paper we present a Markov model-based approach for automatically selecting which optimizations a DBMS could use, namely (1) more efficient concurrency control schemes, (2) intelligent scheduling, (3) reduced undo logging, and (4) speculative execution. To evaluate our techniques, we implemented our models and integrated them into a parallel, main-memory OLTP DBMS to show that we can improve the performance of applications with diverse workloads.Comment: VLDB201

    Implementing PRISMA/DB in an OOPL

    Get PDF
    PRISMA/DB is implemented in a parallel object-oriented language to gain insight in the usage of parallelism. This environment allows us to experiment with parallelism by simply changing the allocation of objects to the processors of the PRISMA machine. These objects are obtained by a strictly modular design of PRISMA/DB. Communication between the objects is required to cooperatively handle the various tasks, but it limits the potential for parallelism. From this approach, we hope to gain a better understanding of parallelism, which can be used to enhance the performance of PRISMA/DB.\ud The work reported in this document was conducted as part of the PRISMA project, a joint effort with Philips Research Eindhoven, partially supported by the Dutch "Stimuleringsprojectteam Informaticaonderzoek (SPIN)

    Dynamic Physiological Partitioning on a Shared-nothing Database Cluster

    Full text link
    Traditional DBMS servers are usually over-provisioned for most of their daily workloads and, because they do not show good-enough energy proportionality, waste a lot of energy while underutilized. A cluster of small (wimpy) servers, where its size can be dynamically adjusted to the current workload, offers better energy characteristics for these workloads. Yet, data migration, necessary to balance utilization among the nodes, is a non-trivial and time-consuming task that may consume the energy saved. For this reason, a sophisticated and easy to adjust partitioning scheme fostering dynamic reorganization is needed. In this paper, we adapt a technique originally created for SMP systems, called physiological partitioning, to distribute data among nodes, that allows to easily repartition data without interrupting transactions. We dynamically partition DB tables based on the nodes' utilization and given energy constraints and compare our approach with physical partitioning and logical partitioning methods. To quantify possible energy saving and its conceivable drawback on query runtimes, we evaluate our implementation on an experimental cluster and compare the results w.r.t. performance and energy consumption. Depending on the workload, we can substantially save energy without sacrificing too much performance

    Staring into the abyss: An evaluation of concurrency control with one thousand cores

    Get PDF
    Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. This unprecedented level of on-chip parallelism introduces a new dimension to scalability that current database management systems (DBMSs) were not designed for. In particular, as the number of cores increases, the problem of concurrency control becomes extremely challenging. With hundreds of threads running in parallel, the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts. To better understand just how unprepared current DBMSs are for future CPU architectures, we performed an evaluation of concurrency control for on-line transaction processing (OLTP) workloads on many-core chips. We implemented seven concurrency control algorithms on a main-memory DBMS and using computer simulations scaled our system to 1024 cores. Our analysis shows that all algorithms fail to scale to this magnitude but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation and argue that even state-of-the-art DBMSs suffer from these limitations. We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from ground up and is tightly coupled with the hardware.Intel Corporation (Science and Technology Center for Big Data

    Architecture independent environment for developing engineering software on MIMD computers

    Get PDF
    Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management
    • …
    corecore