
    An Associativity Threshold Phenomenon in Set-Associative Caches

    In an α-way set-associative cache, the cache is partitioned into disjoint sets of size α, and each item can only be cached in one set, typically selected via a hash function. Set-associative caches are widely used and have many benefits, e.g., in terms of latency or concurrency, over fully associative caches, but they often incur more cache misses. As the set size α decreases, the benefits increase, but the paging costs worsen. In this paper we characterize the performance of an α-way set-associative LRU cache of total size k, as a function of α = α(k). We prove the following, assuming that sets are selected using a fully random hash function:
    - For α = ω(log k), the paging cost of an α-way set-associative LRU cache is within additive O(1) of that of a fully-associative LRU cache of size (1 − o(1))k, with probability 1 − 1/poly(k), for all request sequences of length poly(k).
    - For α = o(log k), and for all c = O(1) and r = O(1), the paging cost of an α-way set-associative LRU cache is not within a factor c of that of a fully-associative LRU cache of size k/r, for some request sequence of length O(k^1.01).
    - For α = ω(log k), if the hash function can be occasionally changed, the paging cost of an α-way set-associative LRU cache is within a factor 1 + o(1) of that of a fully-associative LRU cache of size (1 − o(1))k, with probability 1 − 1/poly(k), for request sequences of arbitrary (e.g., super-polynomial) length.
    Some of our results generalize to other paging algorithms besides LRU, such as least-frequently-used (LFU).
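
    As a concrete model of the cache analyzed above, here is a minimal Python sketch (a hypothetical illustration, not code from the paper) of an α-way set-associative LRU cache that counts misses; Python's built-in hash stands in for the fully random hash function:

        # Hypothetical sketch: an alpha-way set-associative LRU cache.
        from collections import OrderedDict

        class SetAssociativeLRU:
            def __init__(self, total_size: int, alpha: int):
                assert total_size % alpha == 0
                self.alpha = alpha
                self.num_sets = total_size // alpha
                self.sets = [OrderedDict() for _ in range(self.num_sets)]  # one LRU queue per set
                self.misses = 0

            def access(self, key) -> None:
                s = self.sets[hash(key) % self.num_sets]  # stand-in for a fully random hash
                if key in s:
                    s.move_to_end(key)           # hit: mark as most recently used
                else:
                    self.misses += 1             # miss: fetch the item
                    if len(s) >= self.alpha:
                        s.popitem(last=False)    # evict the LRU item within this set
                    s[key] = True

        cache = SetAssociativeLRU(total_size=1024, alpha=16)
        for key in [1, 2, 1, 3, 2]:
            cache.access(key)
        print(cache.misses)  # 3: the first accesses of 1, 2, and 3 miss

    Setting α = 1 gives a direct-mapped cache and α = total_size recovers a fully associative LRU cache; the abstract's threshold at α ≈ log k sits between these extremes.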

    Iceberg Hashing: Optimizing Many Hash-Table Criteria at Once

    Despite being one of the oldest data structures in computer science, hash tables continue to be the focus of a great deal of both theoretical and empirical research. A central reason for this is that many of the fundamental properties that one desires from a hash table are difficult to achieve simultaneously; thus many variants offering different trade-offs have been proposed. This paper introduces Iceberg hashing, a hash table that simultaneously offers the strongest known guarantees on a large number of core properties. Iceberg hashing supports constant-time operations while improving on the state of the art for space efficiency, cache efficiency, and low failure probability. Iceberg hashing is also the first hash table to support a load factor of up to 1 − o(1) while being stable, meaning that the position where an element is stored only ever changes when resizes occur. In fact, in the setting where keys are Θ(log n) bits, the space guarantee that Iceberg hashing offers, namely that it uses at most log(|U| choose n) + O(n log log n) bits to store n items from a universe U, matches a lower bound by Demaine et al. that applies to any stable hash table. Iceberg hashing introduces new general-purpose techniques for some of the most basic aspects of hash-table design. Notably, our indirection-free technique for dynamic resizing, which we call waterfall addressing, and our techniques for achieving stability and very-high-probability guarantees, can be applied to any hash table that makes use of the front-yard/backyard paradigm for hash-table design.
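
    The front-yard/backyard paradigm mentioned at the end can be pictured with a toy Python sketch; the class below is a hypothetical illustration of the paradigm only, not the paper's construction: most items land in fixed-capacity front-yard bins, and the rare overflows are deferred to a small backyard structure.

        # Toy front-yard/backyard table (hypothetical illustration, not
        # Iceberg hashing itself): fixed-capacity bins absorb most items;
        # the rare overflow items fall back to a small backyard dict.
        class FrontBackyardTable:
            def __init__(self, num_bins: int, bin_capacity: int):
                self.bins = [[] for _ in range(num_bins)]
                self.cap = bin_capacity
                self.backyard = {}                  # catch-all for overflow

            def insert(self, key, value) -> None:
                b = self.bins[hash(key) % len(self.bins)]
                if len(b) < self.cap:
                    b.append((key, value))          # common case: front yard
                else:
                    self.backyard[key] = value      # rare case: backyard

            def lookup(self, key):
                b = self.bins[hash(key) % len(self.bins)]
                for k, v in b:
                    if k == key:
                        return v
                return self.backyard.get(key)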

    The I/O Complexity of Computing Prime Tables

    We revisit classical sieves for computing primes and analyze their performance in the external-memory model. Most prior sieves are analyzed in the RAM model, where the focus is on minimizing both the total number of operations and the size of the working set. The hope is that if the working set fits in RAM, then the sieve will have good I/O performance, though such an outcome is by no means guaranteed by a small working-set size. We analyze our algorithms directly in terms of I/Os and operations. In the external-memory model, permutation can be the most expensive aspect of sieving, in contrast to the RAM model, where permutations are trivial. We show how to implement classical sieves so that they have both good I/O performance and good RAM performance, even when the problem size N becomes huge, even superpolynomially larger than RAM. Towards this goal, we give two I/O-efficient priority queues that are optimized for the operations incurred by these sieves.
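
    For reference, the kind of classical sieve under analysis is exemplified by the sieve of Eratosthenes; below is a standard RAM-model Python sketch (a baseline for intuition, not the paper's I/O-efficient implementation). The scattered writes when crossing off multiples are exactly the permutation cost that becomes dominant in external memory.

        # Classic sieve of Eratosthenes: O(N log log N) operations and a
        # Theta(N)-size working set in the RAM model.
        from math import isqrt

        def primes_up_to(N: int) -> list[int]:
            is_prime = bytearray([1]) * (N + 1)    # one byte per candidate
            is_prime[0:2] = b"\x00\x00"            # 0 and 1 are not prime
            for p in range(2, isqrt(N) + 1):
                if is_prime[p]:
                    # cross off multiples of p, starting at p*p; in external
                    # memory these scattered writes cost the most I/Os
                    is_prime[p * p :: p] = bytes(len(is_prime[p * p :: p]))
            return [i for i in range(2, N + 1) if is_prime[i]]

        print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]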

    Fault-tolerant aggregation: Flow-Updating meets Mass-Distribution

    Flow-Updating (FU) is a fault-tolerant technique that has proved to be efficient in practice for the distributed computation of aggregate functions in communication networks where individual processors do not have access to global information. Previous distributed aggregation protocols, based on repeated sharing of input values (or mass) among processors, sometimes called Mass-Distribution (MD) protocols, are not resilient to communication failures (or message loss) because such failures yield a loss of mass. In this paper, we present a protocol which we call Mass-Distribution with Flow-Updating (MDFU). We obtain MDFU by applying FU techniques to classic MD. We analyze the convergence time of MDFU, showing that stochastic message loss produces low overhead. This is the first convergence proof of an FU-based algorithm. We evaluate MDFU experimentally, comparing it with previous MD and FU protocols, and verifying the behavior predicted by the analysis. Finally, given that MDFU incurs a fixed deviation proportional to the message-loss rate, we adjust the accuracy of MDFU heuristically in a new protocol called MDFU with Linear Prediction (MDFU-LP). The evaluation shows that both MDFU and MDFU-LP behave very well in practice, even under high rates of message loss and even when the input values change dynamically. A preliminary version of this work appeared in [2]. This work was partially supported by the National Science Foundation (CNS-1408782, IIS-1247750); the National Institutes of Health (CA198952-01); EMC, Inc.; Pace University Seidenberg School of CSIS; and by Project "Coral - Sustainable Ocean Exploitation: Tools and Sensors/NORTE-01-0145-FEDER-000036", financed by the North Portugal Regional Operational Programme (NORTE 2020) under the PORTUGAL 2020 Partnership Agreement and through the European Regional Development Fund (ERDF).
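
    To see why message loss breaks plain Mass-Distribution, consider the toy averaging round below (an illustrative Python sketch with invented names, not the MDFU protocol): each node splits its mass between itself and its neighbors, and every dropped message permanently removes mass from the system.

        # Toy mass-distribution round (hypothetical sketch, not MDFU):
        # a dropped message deletes its share of mass from the system.
        import random

        def md_round(mass, neighbors, loss_rate=0.0):
            """mass: dict node -> float; neighbors: dict node -> list of nodes."""
            incoming = {u: 0.0 for u in mass}
            for u, m in mass.items():
                share = m / (len(neighbors[u]) + 1)
                incoming[u] += share                  # kept locally, never lost
                for v in neighbors[u]:
                    if random.random() >= loss_rate:  # message delivered
                        incoming[v] += share
                    # else: the share vanishes and the total mass shrinks
            return incoming

        nodes = {0: 4.0, 1: 0.0, 2: 0.0}
        nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
        for _ in range(20):
            nodes = md_round(nodes, nbrs, loss_rate=0.0)
        print(round(sum(nodes.values()), 6))  # 4.0: mass conserved when nothing is lost

    With loss_rate > 0 the total drifts below 4.0; this vanishing mass is exactly the failure mode that applying FU techniques is meant to counter.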

    The LCA problem revisited

    We present a very simple algorithm for the Least Common Ancestors problem. We thus dispel the frequently held notion that optimal LCA computation is unwieldy and unimplementable. Interestingly, this algorithm is a sequentialization of a previously known PRAM algorithm.
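
    For context, the simple route to LCA goes through range-minimum queries over an Euler tour of the tree; the Python sketch below implements the O(n log n)-preprocessing sparse-table variant of this well-known reduction (an illustration of the general approach, not necessarily the paper's final algorithm):

        # LCA via Euler tour + sparse-table RMQ (illustrative sketch):
        # O(n log n) preprocessing, O(1) per query.
        class LCA:
            def __init__(self, root, children):
                self.euler, self.depth, self.first = [], [], {}
                self._tour(root, children, 0)
                n = len(self.euler)
                # sparse[j][i]: index into euler of the min-depth node in
                # the window of length 2**j starting at position i
                self.sparse = [list(range(n))]
                j = 1
                while (1 << j) <= n:
                    prev, row = self.sparse[j - 1], []
                    for i in range(n - (1 << j) + 1):
                        a, b = prev[i], prev[i + (1 << (j - 1))]
                        row.append(a if self.depth[a] <= self.depth[b] else b)
                    self.sparse.append(row)
                    j += 1

            def _tour(self, u, children, d):
                self.first.setdefault(u, len(self.euler))
                self.euler.append(u)
                self.depth.append(d)
                for c in children.get(u, []):
                    self._tour(c, children, d + 1)
                    self.euler.append(u)     # return to u after each child
                    self.depth.append(d)

            def query(self, u, v):
                l, r = sorted((self.first[u], self.first[v]))
                j = (r - l + 1).bit_length() - 1   # largest power of two fitting the window
                a = self.sparse[j][l]
                b = self.sparse[j][r - (1 << j) + 1]
                return self.euler[a if self.depth[a] <= self.depth[b] else b]

        tree = {0: [1, 4], 1: [2, 3]}            # adjacency: node -> children
        lca = LCA(0, tree)
        print(lca.query(2, 3), lca.query(3, 4))  # 1 0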

    Initializing Sensor Networks of Non-uniform Density in the Weak Sensor Model

    Assumptions about node density in the sensor networks literature are frequently too strong. Neither adversarially chosen nor uniform random deployment seems realistic in many intended applications of sensor nodes. We define smooth distributions of sensor nodes to be those where the minimum density is guaranteed to achieve connectivity in random deployments, but higher densities may appear in certain areas. We study basic problems for smooth distributions of nodes. Most notably, we present a Weak Sensor Model-compliant distributed protocol for hop-optimal network initialization (NI), a fundamental problem in sensor networks. In order to prove lower bounds, we observe that all nodes must communicate with some other node in order to join the network, and we call the problem of achieving such communication the group therapy (GT) problem. We show a tight lower bound for the GT problem in radio networks that holds for any class of protocols, and a stronger lower bound for the important class of randomized uniform-oblivious protocols. Given that any NI protocol also solves GT, these lower bounds apply to NI. We also show that the same lower bound holds for a related problem that we call independent set, when nodes are distributed uniformly, even in expectation.