To take full advantage of the parallelism offered by a multi-core machine, one must write parallel code. Writing parallel code is difficult. Even when the code is correct, there are numerous performance pitfalls. For example, an unrecognized data hotspot can force all threads to effectively serialize their access to the hotspot, dramatically reducing throughput. Previous work has demonstrated that database operations suffer from such hotspots when naively implemented to run in parallel on a multi-core processor. In this paper, we aim to provide a generic framework for performing certain kinds of concurrent database operations in parallel. The formalism is similar to user-defined aggregates and Google’s MapReduce in that users specify certain functions for the parts of the computation that need to be performed over large volumes of data. We provide infrastructure that allows multiple threads on a multi-core machine to concurrently perform read and write operations on shared data structures, automatically mitigating hotspots and other performance hazards. Our goal is not to squeeze the last drop of performance out of a particular platform. Rather, we aim to provide a framework within which a programmer can, without detailed knowledge of concurrent and parallel programming, develop code that efficiently utilizes a multi-core machine.
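To make the aggregate-style formalism concrete, the following is a minimal sketch of how a user might supply three functions (initialize, accumulate, merge) while the infrastructure handles the threading. It is not the paper's actual interface: the names `parallel_aggregate`, `count_init`, `count_accumulate`, and `count_merge` are hypothetical, and the hotspot mitigation shown here (one private partial state per thread, merged at the end) is just one simple strategy chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def parallel_aggregate(rows, init, accumulate, merge, num_threads=4):
    """Hypothetical driver: run a user-defined aggregate with per-thread state."""
    # Split the input into at most num_threads contiguous chunks.
    chunk = max(1, -(-len(rows) // num_threads))  # ceiling division
    chunks = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]

    def work(part):
        # Each worker updates only its own private state, so a frequently
        # occurring ("hot") key never becomes a point of contention.
        state = init()
        for row in part:
            state = accumulate(state, row)
        return state

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(work, chunks))

    # Combine the private partial states sequentially at the end.
    result = init()
    for partial in partials:
        result = merge(result, partial)
    return result


# User-supplied functions for a COUNT(*) ... GROUP BY key style aggregate.
def count_init():
    return Counter()


def count_accumulate(state, key):
    state[key] += 1
    return state


def count_merge(left, right):
    left.update(right)  # Counter.update adds counts rather than replacing them
    return left


# Skewed input: one hot key dominates the data.
rows = ["hot"] * 9000 + ["cold-%d" % i for i in range(1000)]
counts = parallel_aggregate(rows, count_init, count_accumulate, count_merge)
```

The user code contains no locks or thread management; only the driver knows about threads. Note that in standard CPython builds the global interpreter lock prevents threads from executing Python bytecode in parallel, so this sketch illustrates the programming model rather than actual multi-core speedup.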