1,028 research outputs found
Cache-Oblivious Peeling of Random Hypergraphs
The computation of a peeling order in a randomly generated hypergraph is the
most time-consuming step in a number of constructions, such as perfect hashing
schemes, random -SAT solvers, error-correcting codes, and approximate set
encodings. While there exists a straightforward linear time algorithm, its poor
I/O performance makes it impractical for hypergraphs whose size exceeds the
available internal memory.
We show how to reduce the computation of a peeling order to a small number of
sequential scans and sorts, and analyze its I/O complexity in the
cache-oblivious model. The resulting algorithm requires
I/Os and time to peel a random hypergraph with edges.
We experimentally evaluate the performance of our implementation of this
algorithm in a real-world scenario by using the construction of minimal perfect
hash functions (MPHF) as our test case: our algorithm builds a MPHF of
billion keys in less than hours on a single machine. The resulting data
structure is both more space-efficient and faster than that obtained with the
current state-of-the-art MPHF construction for large-scale key sets
The Cost of Address Translation
Modern computers are not random access machines (RAMs). They have a memory
hierarchy, multiple cores, and virtual memory. In this paper, we address the
computational cost of address translation in virtual memory. Starting point for
our work is the observation that the analysis of some simple algorithms (random
scan of an array, binary search, heapsort) in either the RAM model or the EM
model (external memory model) does not correctly predict growth rates of actual
running times. We propose the VAT model (virtual address translation) to
account for the cost of address translations and analyze the algorithms
mentioned above and others in the model. The predictions agree with the
measurements. We also analyze the VAT-cost of cache-oblivious algorithms.Comment: A extended abstract of this paper was published in the proceedings of
ALENEX13, New Orleans, US
GPU LSM: A Dynamic Dictionary Data Structure for the GPU
We develop a dynamic dictionary data structure for the GPU, supporting fast
insertions and deletions, based on the Log Structured Merge tree (LSM). Our
implementation on an NVIDIA K40c GPU has an average update (insertion or
deletion) rate of 225 M elements/s, 13.5x faster than merging items into a
sorted array. The GPU LSM supports the retrieval operations of lookup, count,
and range query operations with an average rate of 75 M, 32 M and 23 M
queries/s respectively. The trade-off for the dynamic updates is that the
sorted array is almost twice as fast on retrievals. We believe that our GPU LSM
is the first dynamic general-purpose dictionary data structure for the GPU.Comment: 11 pages, accepted to appear on the Proceedings of IEEE International
Parallel and Distributed Processing Symposium (IPDPS'18
Computationally Data-Independent Memory Hard Functions
Memory hard functions (MHFs) are an important cryptographic primitive that are used to design egalitarian proofs of work and in the construction of moderately expensive key-derivation functions resistant to brute-force attacks. Broadly speaking, MHFs can be divided into two categories: data-dependent memory hard functions (dMHFs) and data-independent memory hard functions (iMHFs). iMHFs are resistant to certain side-channel attacks as the memory access pattern induced by the honest evaluation algorithm is independent of the potentially sensitive input e.g., password. While dMHFs are potentially vulnerable to side-channel attacks (the induced memory access pattern might leak useful information to a brute-force attacker), they can achieve higher cumulative memory complexity (CMC) in comparison than an iMHF. In particular, any iMHF that can be evaluated in N steps on a sequential machine has CMC at most ?((N^2 log log N)/log N). By contrast, the dMHF scrypt achieves maximal CMC ?(N^2) - though the CMC of scrypt would be reduced to just ?(N) after a side-channel attack.
In this paper, we introduce the notion of computationally data-independent memory hard functions (ciMHFs). Intuitively, we require that memory access pattern induced by the (randomized) ciMHF evaluation algorithm appears to be independent from the standpoint of a computationally bounded eavesdropping attacker - even if the attacker selects the initial input. We then ask whether it is possible to circumvent known upper bound for iMHFs and build a ciMHF with CMC ?(N^2). Surprisingly, we answer the question in the affirmative when the ciMHF evaluation algorithm is executed on a two-tiered memory architecture (RAM/Cache).
We introduce the notion of a k-restricted dynamic graph to quantify the continuum between unrestricted dMHFs (k=n) and iMHFs (k=1). For any ? > 0 we show how to construct a k-restricted dynamic graph with k=?(N^(1-?)) that provably achieves maximum cumulative pebbling cost ?(N^2). We can use k-restricted dynamic graphs to build a ciMHF provided that cache is large enough to hold k hash outputs and the dynamic graph satisfies a certain property that we call "amenable to shuffling". In particular, we prove that the induced memory access pattern is indistinguishable to a polynomial time attacker who can monitor the locations of read/write requests to RAM, but not cache. We also show that when k=o(N^(1/log log N))then any k-restricted graph with constant indegree has cumulative pebbling cost o(N^2). Our results almost completely characterize the spectrum of k-restricted dynamic graphs
Fast Scalable Construction of (Minimal Perfect Hash) Functions
Recent advances in random linear systems on finite fields have paved the way
for the construction of constant-time data structures representing static
functions and minimal perfect hash functions using less space with respect to
existing techniques. The main obstruction for any practical application of
these results is the cubic-time Gaussian elimination required to solve these
linear systems: despite they can be made very small, the computation is still
too slow to be feasible.
In this paper we describe in detail a number of heuristics and programming
techniques to speed up the resolution of these systems by several orders of
magnitude, making the overall construction competitive with the standard and
widely used MWHC technique, which is based on hypergraph peeling. In
particular, we introduce broadword programming techniques for fast equation
manipulation and a lazy Gaussian elimination algorithm. We also describe a
number of technical improvements to the data structure which further reduce
space usage and improve lookup speed.
Our implementation of these techniques yields a minimal perfect hash function
data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based
ones, and a static function data structure which reduces the multiplicative
overhead from 1.23 to 1.03
- …