An Associativity Threshold Phenomenon in Set-Associative Caches
In a k-way set-associative cache, the cache is partitioned into
disjoint sets of size k, and each item can be cached in only one set,
typically selected via a hash function. Set-associative caches are widely used
and have many benefits, e.g., in terms of latency or concurrency, over fully
associative caches, but they often incur more cache misses. As the set size
decreases, the benefits increase, but the paging costs worsen.
In this paper we characterize the performance of a k-way
set-associative LRU cache of total size n, as a function of k. We prove the following, assuming that sets are selected using a
fully random hash function:
- For k above the associativity threshold, the paging cost of a k-way
set-associative LRU cache is within a small additive term of that of a
fully-associative LRU cache of size n, with high probability,
for all request sequences of polynomial length.
- For k below the threshold, the paging cost of a k-way set-associative
LRU cache can exceed that of a fully-associative LRU cache of size n by a
large factor, for some request sequences.
- If the hash function can be occasionally changed, the paging cost of a
k-way set-associative LRU cache stays within a bounded factor of that of a
fully-associative LRU cache of size n, with high probability, for request
sequences of arbitrary (e.g., super-polynomial) length.
Some of our results generalize to other paging algorithms besides LRU, such
as least-frequently-used (LFU).
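The model above can be made concrete with a minimal k-way set-associative LRU cache. This is our own illustrative sketch: the class name is invented, and Python's built-in hash stands in for the fully random hash function the paper assumes.

```python
from collections import OrderedDict

class SetAssociativeLRU:
    """k-way set-associative cache with per-set LRU eviction.

    Total capacity n is split into n // k independent sets of size k;
    a hash of the key selects the set (the paper assumes a fully
    random hash function; we use Python's hash for illustration).
    """

    def __init__(self, n, k):
        assert n % k == 0
        self.k = k
        self.sets = [OrderedDict() for _ in range(n // k)]
        self.misses = 0

    def access(self, key):
        s = self.sets[hash(key) % len(self.sets)]
        if key in s:
            s.move_to_end(key)          # hit: mark most recently used
        else:
            self.misses += 1            # miss: fetch, evicting if full
            if len(s) == self.k:
                s.popitem(last=False)   # evict least recently used
            s[key] = True
```

Setting k = n yields a single set, i.e., the fully-associative LRU baseline that the paper's bounds compare against.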
Iceberg Hashing: Optimizing Many Hash-Table Criteria at Once
Despite being one of the oldest data structures in computer science, hash
tables continue to be the focus of a great deal of both theoretical and
empirical research. A central reason for this is that many of the fundamental
properties that one desires from a hash table are difficult to achieve
simultaneously; thus many variants offering different trade-offs have been
proposed.
This paper introduces Iceberg hashing, a hash table that simultaneously
offers the strongest known guarantees on a large number of core properties.
Iceberg hashing supports constant-time operations while improving on the state
of the art for space efficiency, cache efficiency, and low failure probability.
Iceberg hashing is also the first hash table to support a load factor close
to 1 while being stable, meaning that the position where an element is
stored only ever changes when resizes occur. In fact, the number of bits
that Iceberg hashing uses to store n items from a universe U matches a lower
bound by Demaine et al. that applies to any stable hash table.
Iceberg hashing introduces new general-purpose techniques for some of the
most basic aspects of hash-table design. Notably, our indirection-free
technique for dynamic resizing, which we call waterfall addressing, and our
techniques for achieving stability and very-high-probability guarantees, can
be applied to any hash table that makes use of the front-yard/backyard
paradigm for hash-table design.
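The front-yard/backyard paradigm can be illustrated with a toy two-level table: most items land in a fixed-capacity "front yard" bucket chosen by hash, and rare overflow spills into a small "backyard". This is a didactic sketch of the paradigm only, not Iceberg hashing itself (which adds stability, waterfall addressing, and strong probability bounds); all names here are our own.

```python
class FrontBackHashTable:
    """Toy illustration of the front-yard/backyard paradigm."""

    def __init__(self, num_buckets=64, bucket_cap=8):
        self.cap = bucket_cap
        self.front = [dict() for _ in range(num_buckets)]
        self.backyard = {}              # catches the rare overflow

    def _bucket(self, key):
        return self.front[hash(key) % len(self.front)]

    def insert(self, key, value):
        b = self._bucket(key)
        if key in b:
            b[key] = value              # update in place
        elif key in self.backyard:
            self.backyard[key] = value
        elif len(b) < self.cap:
            b[key] = value              # common case: front yard
        else:
            self.backyard[key] = value  # bucket full: backyard

    def lookup(self, key):
        b = self._bucket(key)
        if key in b:
            return b[key]
        return self.backyard.get(key)
```

The point of the paradigm is that the backyard stays small with high probability, so the front yard's simple, cache-friendly layout handles almost all operations.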
The I/O Complexity of Computing Prime Tables
We revisit classical sieves for computing primes and analyze their performance in the external-memory model. Most prior sieves are analyzed in the RAM model, where the focus is on minimizing both the total number of operations and the size of the working set. The hope is that if the working set fits in RAM, then the sieve will have good I/O performance, though such an outcome is by no means guaranteed by a small working-set size. We analyze our algorithms directly in terms of I/Os and operations. In the external-memory model, permutation can be the most expensive aspect of sieving, in contrast to the RAM model, where permutations are trivial. We show how to implement classical sieves so that they have both good I/O performance and good RAM performance, even when the problem size N becomes huge, even superpolynomially larger than RAM. Towards this goal, we give two I/O-efficient priority queues that are optimized for the operations incurred by these sieves.
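The classical starting point for memory-conscious sieving is the segmented sieve of Eratosthenes, which keeps only the base primes up to sqrt(N) plus one segment resident at a time. The sketch below is the textbook segmented sieve, shown for context; it is not the paper's priority-queue-based algorithms.

```python
import math

def segmented_sieve(n, segment_size=1 << 16):
    """Segmented sieve of Eratosthenes: sieve [2, sqrt(n)] in RAM,
    then sweep the rest of [2, n] one segment at a time, crossing
    off multiples of each base prime within the current segment."""
    limit = int(math.isqrt(n))
    base = [True] * (limit + 1)
    base[0:2] = [False, False]
    for p in range(2, int(math.isqrt(limit)) + 1):
        if base[p]:
            base[p * p::p] = [False] * len(base[p * p::p])
    base_primes = [p for p in range(2, limit + 1) if base[p]]

    primes = list(base_primes)
    lo = limit + 1
    while lo <= n:                       # one RAM-sized segment at a time
        hi = min(lo + segment_size - 1, n)
        seg = [True] * (hi - lo + 1)
        for p in base_primes:
            start = max(p * p, ((lo + p - 1) // p) * p)
            for m in range(start, hi + 1, p):
                seg[m - lo] = False
        primes.extend(lo + i for i, flag in enumerate(seg) if flag)
        lo = hi + 1
    return primes
```

Each base prime's crossings within a segment are a strided scan, which is exactly the access pattern whose I/O cost (effectively a permutation when segments are many) the paper's priority queues are designed to reduce.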
Fault-tolerant aggregation: Flow-Updating meets Mass-Distribution
Flow-Updating (FU) is a fault-tolerant technique that has proved to be efficient in practice for the distributed computation of aggregate functions in communication networks where individual processors do not have access to global information. Previous distributed aggregation protocols, based on repeated sharing of input values (or mass) among processors, sometimes called Mass-Distribution (MD) protocols, are not resilient to communication failures (or message loss) because such failures yield a loss of mass. In this paper, we present a protocol which we call Mass-Distribution with Flow-Updating (MDFU). We obtain MDFU by applying FU techniques to classic MD. We analyze the convergence time of MDFU, showing that stochastic message loss produces low overhead. This is the first convergence proof of an FU-based algorithm. We evaluate MDFU experimentally, comparing it with previous MD and FU protocols, and verifying the behavior predicted by the analysis. Finally, given that MDFU incurs a fixed deviation proportional to the message-loss rate, we adjust the accuracy of MDFU heuristically in a new protocol called MDFU with Linear Prediction (MDFU-LP). The evaluation shows that both MDFU and MDFU-LP behave very well in practice, even under high rates of message loss and even when the input values change dynamically. A preliminary version of this work appeared in [2]. This work was partially supported by the National Science Foundation (CNS-1408782, IIS-1247750); the National Institutes of Health (CA198952-01); EMC, Inc.; Pace University Seidenberg School of CSIS; and by Project "Coral - Sustainable Ocean Exploitation: Tools and Sensors/NORTE-01-0145-FEDER-000036" financed by the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF).
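Why plain Mass-Distribution is fragile can be seen in one synchronous sharing round. The function below is our own minimal sketch of the classic MD idea (uniform sharing over edges), not of MDFU; the name and the loss model are simplifications for illustration.

```python
def md_round(mass, edges, lost=()):
    """One round of a classic Mass-Distribution protocol: every node
    splits its mass into deg + 1 equal shares, keeps one, and sends
    one along each incident edge. A lost message (an edge index in
    `lost`, dropping both directions) destroys mass, which is exactly
    the failure mode Flow-Updating is designed to repair."""
    n = len(mass)
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    new = [mass[i] / (deg[i] + 1) for i in range(n)]  # the kept share
    for i, (u, v) in enumerate(edges):
        if i not in lost:
            new[u] += mass[v] / (deg[v] + 1)
            new[v] += mass[u] / (deg[u] + 1)
    return new
```

Without loss, the total mass (and hence the network average being computed) is preserved from round to round; with loss it shrinks permanently, which is why MD protocols alone are not fault-tolerant.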
The LCA problem revisited
Abstract. We present a very simple algorithm for the Least Common Ancestors problem. We thus dispel the frequently held notion that optimal LCA computation is unwieldy and unimplementable. Interestingly, this algorithm is a sequentialization of a previously known PRAM algorithm.
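The reduction behind simple LCA algorithms (LCA to range-minimum over an Euler tour of the tree) can be sketched as follows. For brevity this sketch uses an O(n log n)-space sparse table for the range-minimum queries rather than a linear-space +-1 RMQ structure, so queries are O(1) but preprocessing is slightly larger.

```python
def build_lca(children, root=0):
    """LCA via Euler tour + range-minimum query (RMQ): the LCA of u
    and v is the shallowest node between their first visits in an
    Euler tour of the tree. `children` maps a node to its child list;
    returns an O(1)-time query function."""
    euler, depth, first = [], [], {}
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:                          # iterative Euler tour
        node, d, it = stack[-1]
        first.setdefault(node, len(euler))
        euler.append(node)
        depth.append(d)
        child = next(it, None)
        if child is None:
            stack.pop()
        else:
            stack.append((child, d + 1, iter(children.get(child, ()))))

    # Sparse table: sparse[j][i] = index of minimum depth in
    # the window euler[i : i + 2**j].
    m = len(euler)
    sparse = [list(range(m))]
    j = 1
    while (1 << j) <= m:
        prev, half = sparse[-1], 1 << (j - 1)
        row = []
        for i in range(m - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if depth[a] <= depth[b] else b)
        sparse.append(row)
        j += 1

    def lca(u, v):
        l, r = sorted((first[u], first[v]))
        j = (r - l + 1).bit_length() - 1
        a, b = sparse[j][l], sparse[j][r - (1 << j) + 1]
        return euler[a if depth[a] <= depth[b] else b]

    return lca
```

The two overlapping power-of-two windows in `lca` cover the query range exactly, which is what makes each query constant time after preprocessing.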
Initializing Sensor Networks of Non-uniform Density in the Weak Sensor Model
Assumptions about node density in the sensor networks literature are frequently too strong. Neither adversarially chosen nor uniform random deployment seem realistic in many intended applications of sensor nodes. We define smooth distributions of sensor nodes to be those where the minimum density is guaranteed to achieve connectivity in random deployments, but higher densities may appear in certain areas. We study basic problems for smooth distributions of nodes. Most notably, we present a Weak Sensor Model-compliant distributed protocol for hop-optimal network initialization (NI), a fundamental problem in sensor networks. In order to prove lower bounds, we observe that all nodes must communicate with some other node in order to join the network, and we call the problem of achieving such a communication the group therapy (GT) problem. We show a tight lower bound for the GT problem in radio networks for any class of protocols, and a stronger lower bound for the important class of randomized uniform-oblivious protocols. Given that any NI protocol also solves GT, these lower bounds apply to NI. We also show that the same lower bound holds for a related problem that we call independent set, when nodes are distributed uniformly, even in expectation.