Linear Hashing is Awesome
We consider the hash function h(x) = ((ax + b) mod p) mod n, where a and b
are chosen uniformly at random from {0, 1, ..., p-1}. We prove that when we
use h in hashing with chaining to insert n elements into a table of size n,
the expected length of the longest chain is Õ(n^{1/3}). The proof also
generalises to give the same bound when we use the multiply-shift hash
function by Dietzfelbinger et al. [Journal of Algorithms 1997]. Comment: A
preliminary version appeared at FOCS'16
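The scheme this abstract studies is easy to demonstrate. The sketch below (illustrative only; the prime p, the table size n, and the structured key set are arbitrary choices, not taken from the paper) builds h(x) = ((ax + b) mod p) mod n and measures the longest chain it produces under chaining:

```python
import random

def make_linear_hash(p, n):
    """Draw a, b uniformly from {0, ..., p-1} and return
    h(x) = ((a*x + b) mod p) mod n, as in the abstract."""
    a = random.randrange(p)
    b = random.randrange(p)
    return lambda x: ((a * x + b) % p) % n

def longest_chain(keys, h, n):
    """Insert the keys into a chained hash table with n bins and
    report the length of the longest chain (the maxload)."""
    table = [0] * n
    for x in keys:
        table[h(x)] += 1
    return max(table)

p = 2**31 - 1          # a Mersenne prime larger than every key below
n = 1024               # table size equal to the number of inserted keys
keys = range(n)        # a structured input: a contiguous interval of keys
h = make_linear_hash(p, n)
print(longest_chain(keys, h, n))
```

The paper's Õ(n^{1/3}) bound concerns the expectation of this maxload over the random draw of a and b, for a worst-case choice of the key set.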
Theory and applications of hashing: report from Dagstuhl Seminar 17181
This report documents the program and the topics discussed at the 4-day Dagstuhl Seminar 17181 “Theory and Applications of Hashing”, which took place May 1–5, 2017. Four long and eighteen short talks covered a wide and diverse range of topics within the theme of the workshop. The program left sufficient space for informal discussions among the 40 participants.
How blockchain impacts cloud-based system performance: a case study for a groupware communication application
This paper examines the performance trade-off when implementing a blockchain architecture for a cloud-based groupware communication application. We measure the additional cloud-based resources and performance costs of the overhead required to implement a groupware collaboration system over a blockchain architecture. To evaluate our groupware application, we develop measuring instruments for testing the scalability and performance of computer systems deployed as cloud computing applications. While some details of our groupware collaboration application have been published in earlier work, in this paper we reflect on a generalized measuring method for blockchain-enabled applications, which may in turn lead to a general methodology for testing cloud-based system performance and scalability using blockchain. Response time and transaction throughput metrics are collected for both the blockchain and non-blockchain implementations, and some conclusions are drawn about the additional resources that a blockchain architecture for a groupware collaboration application imposes.
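As a rough illustration of the two metrics named above, response time and transaction throughput, here is a minimal timing harness (a sketch only; `operation` is a hypothetical stand-in for one groupware transaction, not the instrumentation used in the paper):

```python
import statistics
import time

def measure(operation, n_requests=100):
    """Issue n_requests calls to `operation` and report
    (mean response time in seconds, throughput in operations/second)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return statistics.mean(latencies), n_requests / elapsed

# Example: time a trivial local computation standing in for a transaction.
mean_latency, throughput = measure(lambda: sum(range(1000)))
print(mean_latency, throughput)
```

Comparing these two numbers for a blockchain-backed and a plain implementation of the same operation is the shape of the comparison the abstract describes.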
Linear Hashing: No Shift, Non-Prime Modulus, For Real!
In classical Linear Hashing, items x are mapped to n bins by a function such
as h(x) = ((ax + b) mod p) mod n, for a prime p and randomly chosen integers
a and b. Despite h's simplicity, understanding the expected maxload, i.e.,
the number of elements in a fullest bin, of h for worst-case inputs is a
notoriously challenging open question. For hashing n items, the best known
lower bound falls far short of the best known upper bound of Õ(n^{1/3}) due
to Knudsen.
In this paper we consider three modifications of classic h: (1) h without the
"+b" shift term, resulting in loss of pairwise independence; (2) h with a
composite, rather than prime, modulus; (3) h in a continuous setting where
the multiplier "a" is chosen from [0, 1) rather than from the integers modulo
p. We show that h is fairly robust to these changes, in particular by
demonstrating analogs of known maxload bounds for these new variants.
These results give several new perspectives on h, in particular showing that
properties of h such as pairwise independence, a prime modulus, or even its
setting in the integers may not be fundamental. We believe that these new
perspectives, beyond being independently interesting, may also be useful in
future work towards understanding the maxload of h. Comment: 11 pages
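The three variants enumerated in this abstract are simple to write down side by side. A sketch (the moduli and parameter values below are illustrative assumptions, not taken from the paper):

```python
import math
import random

P = 2**31 - 1  # prime modulus for the classical scheme

def classic(a, b, x, n):
    """Classical Linear Hashing: h(x) = ((a*x + b) mod p) mod n."""
    return ((a * x + b) % P) % n

def no_shift(a, x, n):
    """Variant (1): drop the '+ b' shift term, giving up
    pairwise independence."""
    return (a * x % P) % n

def composite_modulus(a, b, x, n, m=2**31):
    """Variant (2): replace the prime p by a composite modulus m."""
    return ((a * x + b) % m) % n

def continuous(alpha, x, n):
    """Variant (3): the multiplier alpha is drawn from [0, 1);
    the key lands in the bin indexed by the fractional part of alpha*x."""
    return int(math.floor((alpha * x % 1.0) * n))

n = 16
a, b = random.randrange(1, P), random.randrange(P)
alpha = random.random()
x = 12345
print(classic(a, b, x, n), no_shift(a, x, n),
      composite_modulus(a, b, x, n), continuous(alpha, x, n))
```

Each function maps a key to one of n bins; the paper's question is how the maxload behaves when these are used in place of the classical h.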
The Whalesong
Student and community leaders meet at banquet -- Peru inspires and amazes UAS group -- Egan Library wing opens to rave reviews -- Count yourself lucky -- Notice to Stafford Loan Borrowers -- VideoVersity - what is it? -- Global ethics brought to UAS -- Success is up to you -- Student Spotlight: Augie Stiehr -- Long Live Narcissus -- Media and computer services merge -- Special election to be held -- Paint misbehavin' -- Get wet at Squire's -- Preview -- The best album you never heard.
Locally Uniform Hashing
Hashing is a common technique used in data processing, with a strong impact
on the time and resources spent on computation. Hashing also affects the
applicability of theoretical results that often assume access to (unrealistic)
uniform/fully-random hash functions. In this paper, we are concerned with
designing hash functions that are practical and come with strong theoretical
guarantees on their performance.
To this end, we present tornado tabulation hashing, which is simple, fast,
and exhibits a certain full, local randomness property that provably makes
diverse algorithms perform almost as if (abstract) fully-random hashing was
used. For example, this includes classic linear probing, the widely used
HyperLogLog algorithm of Flajolet, Fusy, Gandouet, Meunier [AofA'07] for
counting distinct elements, and the one-permutation hashing of Li, Owen, and
Zhang [NIPS 12] for large-scale machine learning. We also provide a very
efficient solution for the classical problem of obtaining fully-random hashing
on a fixed (but unknown to the hash function) set of n keys using linear
space. As a consequence, we get more efficient implementations of the splitting
trick of Dietzfelbinger and Rink [ICALP'09] and the succinct space uniform
hashing of Pagh and Pagh [SICOMP'08].
Tornado tabulation hashing is based on a simple method to systematically
break dependencies in tabulation-based hashing techniques. Comment: FOCS 2023
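Tornado tabulation itself is not specified in this abstract, but the tabulation-based family it builds on is classical. Below is plain simple tabulation hashing as background (a sketch of the underlying family, not the tornado construction; the character count, character width, and 32-bit output are arbitrary choices):

```python
import random

def make_simple_tabulation(c=4, bits=8):
    """Simple tabulation hashing: view a key as c characters of `bits`
    bits each, fill c small lookup tables with random 32-bit words,
    and XOR together one table entry per character."""
    tables = [[random.getrandbits(32) for _ in range(2**bits)]
              for _ in range(c)]
    mask = 2**bits - 1
    def h(x):
        out = 0
        for i in range(c):
            out ^= tables[i][(x >> (i * bits)) & mask]
        return out
    return h

h = make_simple_tabulation()
print(h(0xDEADBEEF))
```

Schemes in this family are fast because each evaluation is just a few table lookups and XORs; the paper's contribution is a way to break the dependencies that limit the randomness of such schemes.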
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Computing the convolution a ⋆ b of two length-n vectors a and b is a ubiquitous computational primitive. Applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of a and b are nonnegative integers. The classical algorithm to compute a ⋆ b uses the Fast Fourier Transform and runs in time O(n log n). However, often a and b satisfy sparsity conditions, and hence one could hope for significant improvements. The ideal goal is an O(t log t)-time algorithm, where t is the number of non-zero elements in the output, i.e., the size of the support of a ⋆ b. This problem is referred to as sparse nonnegative convolution, and has received considerable attention in the literature; the fastest algorithms to date run in time O(t log^2 n). The main result of this paper is the first O(t log t)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of a and b are subexponential in t. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, up to lower-order terms, the time to convolve two sparse nonnegative vectors with output size t matches the time to convolve two dense nonnegative length-t vectors at the same constant success probability. Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function.
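The "classical algorithm" this abstract refers to, dense nonnegative convolution via the Fast Fourier Transform in O(n log n) time, can be sketched as follows (a textbook radix-2 implementation for illustration, not the paper's sparse algorithm):

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two.
    The inverse transform is unnormalized (caller divides by the length)."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def convolve(a, b):
    """Dense nonnegative convolution in O(n log n): pad to a power of two,
    transform, multiply pointwise, transform back, round to integers."""
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - len(a)))
    fb = fft([complex(x) for x in b] + [0j] * (size - len(b)))
    inv = fft([x * y for x, y in zip(fa, fb)], invert=True)
    return [round(v.real / size) for v in inv][: len(a) + len(b) - 1]

print(convolve([1, 2, 3], [4, 5]))  # → [4, 13, 22, 15]
```

In the sparse setting, the support size t is the number of non-zero entries of the result, e.g. `len([x for x in convolve(a, b) if x])`, and the paper's goal is to pay O(t log t) rather than O(n log n).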