Boosting Multi-Core Reachability Performance with Shared Hash Tables
This paper focuses on data structures for multi-core reachability, which is a
key component in model checking algorithms and other verification methods. A
cornerstone of an efficient solution is the storage of visited states. In
related work, static partitioning of the state space was combined with
thread-local storage and resulted in reasonable speedups, but left open whether
improvements are possible. In this paper, we present a scaling solution for
shared state storage which is based on a lockless hash table implementation.
The solution is specifically designed for the cache architecture of modern
CPUs. Because model checking algorithms impose loose requirements on the hash
table operations, their design can be streamlined substantially compared to
related work on lockless hash tables. Still, an implementation of the hash
table presented here has dozens of sensitive performance parameters (bucket
size, cache line size, data layout, probing sequence, etc.). We analyzed their
impact and compared the resulting speedups with related tools. Our
implementation outperforms two state-of-the-art multi-core model checkers (SPIN
and DiVinE) by a substantial margin, while placing fewer constraints on the
load balancing and search algorithms. Comment: preliminary report
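The "loose requirements" mentioned in the abstract above can be illustrated with a minimal sketch, assuming the table only needs a single find-or-put operation and never deletes or resizes. The class name `LocklessStateSet`, the method `find_or_put`, the use of 64-bit state fingerprints, and the reservation of 0 as the empty marker are all assumptions made for this illustration; this is not the paper's implementation and it ignores the cache-line layout tuning the abstract describes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical illustration: a fixed-size, insert-only set of 64-bit state
// fingerprints with open addressing (linear probing) and no locks.
class LocklessStateSet {
public:
    // Assumption: capacity is a power of two, so masking replaces modulo.
    explicit LocklessStateSet(std::size_t capacity)
        : mask_(capacity - 1), slots_(capacity) {
        for (auto& slot : slots_) slot.store(kEmpty, std::memory_order_relaxed);
    }

    // Returns true if `state` was already stored, false if this call stored it.
    // Assumption: the value 0 is reserved as the "empty slot" marker.
    bool find_or_put(std::uint64_t state) {
        assert(state != kEmpty);
        std::size_t i = std::hash<std::uint64_t>{}(state) & mask_;
        for (std::size_t probes = 0; probes <= mask_; ++probes) {
            std::uint64_t seen = slots_[i].load(std::memory_order_acquire);
            if (seen == kEmpty) {
                // Try to claim the empty slot atomically. If another thread
                // wins the race, `seen` is updated to the value it wrote.
                if (slots_[i].compare_exchange_strong(
                        seen, state,
                        std::memory_order_acq_rel,
                        std::memory_order_acquire)) {
                    return false;  // this call inserted the state
                }
            }
            if (seen == state) return true;  // present (possibly just inserted by a peer)
            i = (i + 1) & mask_;             // occupied by another state: probe onward
        }
        throw std::length_error("LocklessStateSet is full");
    }

private:
    static constexpr std::uint64_t kEmpty = 0;
    std::size_t mask_;
    std::vector<std::atomic<std::uint64_t>> slots_;
};
```

In a reachability search, each worker thread would call `find_or_put` on every successor state and only explore those for which it returns false. Because slots are never emptied or moved, a single compare-and-swap per insertion is enough in this sketch.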
BAG : Managing GPU as buffer cache in operating systems
This paper presents the design, implementation and evaluation of BAG, a system that manages the GPU as a buffer cache in operating systems. Unlike previous uses of GPUs, which have focused on their computational capabilities, BAG is designed to explore a new dimension in managing GPUs in heterogeneous systems, where the GPU memory is an exploitable but always ignored resource. With carefully designed data structures and algorithms, such as a concurrent hash table, a log-structured data store for the management of GPU memory, and highly parallel GPU kernels for garbage collection, BAG achieves good performance under various workloads. In addition, leveraging the existing abstractions of the operating system not only makes the implementation of BAG non-intrusive, but also facilitates system deployment.
Scalable Hash Tables
In this dissertation, the term scalability has two meanings: taking the best
possible advantage of the provided resources (both computational and memory
resources), and scaling data structures in the literal sense, i.e., growing
their capacity by “rescaling” the table.
Scaling well to computational resources implies constructing the fastest,
best-performing algorithms and data structures. On today’s many-core machines,
the best performance is immediately associated with parallelism. Since CPU
frequencies stopped growing about 10-15 years ago, parallelism is the only way
to take advantage of growing computational resources. But for data structures
in general, and hash tables in particular, performance is not only linked to
faster computations. Most execution time is actually spent waiting for memory.
Thus, optimizing data structures to reduce the number of memory accesses, or to
take better advantage of the memory hierarchy, especially through predictable
access patterns and prefetching, is just as important.
In terms of scaling the size of hash tables, we have identified three domains
where scalable hash-based data structures have previously been lacking:
space-efficient growing, concurrent hash tables, and approximate membership
query data structures (AMQ filters). Throughout this dissertation, we describe
the problems in these areas and develop efficient solutions. We highlight three
libraries that we have developed over the course of this dissertation, each
containing multiple implementations that have shown throughout our testing to be
among the best implementations in their respective domains. Together, they offer
a comprehensive toolbox that can be used to solve many kinds of hashing-related
problems or to develop individual solutions for further ones.
DySECT is a library for space-efficient hash tables, specifically growing
space-efficient hash tables that scale with their input size. It contains the
namesake DySECT data structure in addition to a number of different probing and
cuckoo-based implementations. Growt is a library for highly efficient concurrent
hash tables. It contains a very fast base table and a number of extensions to
adapt this table to match any purpose. All extensions can be combined to create
a variety of different interfaces. In our extensive experimental evaluation,
each adaptation has proven to be among the best hash tables for its specific
purpose. Lpqfilter is a library for concurrent approximate membership query
(AMQ) data structures. It contains some original data structures, like the
linear probing quotient filter, as well as some novel approaches to dynamically
sized quotient filters.
High Performance Computing using Infiniband-based clusters
The abstract is in the attachment.
Concurrent Deterministic Skiplist and Other Data Structures
Skiplists are used in a variety of applications for storing data subject to
order criteria. In this article we discuss the design, analysis and performance
of a concurrent deterministic skip list on many-core NUMA nodes. We also
evaluate the performance of a concurrent lock-free unbounded queue
implementation and three implementations of multi-writer, multi-reader (MWMR)
hash tables and compare their performance with equivalent implementations from
Intel's Thread Building Blocks (TBB) library. We focus on strategies for memory
management that reduce page faults and cache misses for the memory access
patterns in these data structures. This paper proposes hierarchical usage of
concurrent data structures in programs to improve memory latencies by reducing
memory accesses from remote NUMA nodes.