Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket
We study wear-leveling techniques for cuckoo hashing, showing that it is
possible to achieve a memory wear bound of after the
insertion of items into a table of size for a suitable constant
using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically,
showing that it significantly improves on the memory wear performance for
classic cuckoo hashing and linear probing in practice.
Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on
Experimental Algorithms (SEA 2014)
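To make the notion of memory wear concrete, the following is a minimal sketch of classic two-choice cuckoo hashing that counts how many times each cell is written. It is the baseline the abstract compares against, not the wear-minimizing method of the paper. The class name, the salted use of Python's built-in `hash`, and the `max_kicks` limit are assumptions made for illustration.

```python
import random


class WearTrackingCuckooTable:
    """Classic cuckoo hashing (two candidate cells per key) with per-cell write counts."""

    def __init__(self, size, seed=0, max_kicks=500):
        self.size = size
        self.cells = [None] * size
        self.wear = [0] * size  # number of writes each cell has received
        self.max_kicks = max_kicks
        rng = random.Random(seed)
        # Two random salts stand in for two independent hash functions (assumption).
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))

    def _slots(self, key):
        return [hash((salt, key)) % self.size for salt in self.salts]

    def _write(self, slot, key):
        self.cells[slot] = key
        self.wear[slot] += 1

    def contains(self, key):
        return any(self.cells[s] == key for s in self._slots(key))

    def insert(self, key):
        """Insert key; returns False if the eviction chain exceeds max_kicks."""
        if self.contains(key):
            return True
        a, b = self._slots(key)
        for slot in (a, b):
            if self.cells[slot] is None:
                self._write(slot, key)
                return True
        slot = a
        for _ in range(self.max_kicks):
            evicted = self.cells[slot]
            self._write(slot, key)  # every eviction step costs one more write
            if evicted is None:
                return True
            key = evicted
            s0, s1 = self._slots(key)
            slot = s1 if slot == s0 else s0  # move the evicted key to its other cell
        # A full implementation would rehash here; this sketch drops the last evicted key.
        return False

    def max_wear(self):
        return max(self.wear)


table = WearTrackingCuckooTable(size=64, seed=1)
results = [table.insert(k) for k in range(16)]
print("max wear:", table.max_wear(), "total writes:", sum(table.wear))
```

The maximum entry of `wear` is the quantity a wear-leveling scheme tries to keep small; in this baseline, eviction chains can write the same cell repeatedly.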
Weakly Submodular Functions
Submodular functions are well-studied in combinatorial optimization, game
theory and economics. The natural diminishing returns property makes them
suitable for many applications. We study an extension of monotone submodular
functions, which we call weakly submodular functions. Our extension
includes some (mildly) supermodular functions. We show that several natural
functions belong to this class and relate our class to some other recent
submodular function extensions.
We consider the optimization problem of maximizing a weakly submodular
function subject to uniform and general matroid constraints. For a uniform
matroid constraint, the "standard greedy algorithm" achieves a constant
approximation ratio where the constant (experimentally) converges to 5.95 as
the cardinality constraint increases. For a general matroid constraint, a
simple local search algorithm achieves a constant approximation ratio where the
constant (analytically) converges to 10.22 as the rank of the matroid
increases.
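As an illustration of the "standard greedy algorithm" under a uniform matroid (cardinality) constraint, here is a minimal sketch that repeatedly adds the element with the largest marginal gain. The function name `greedy_max` and the coverage objective are assumptions chosen for the example; coverage is monotone submodular, so it is only a special case of the class studied in the paper.

```python
def greedy_max(ground_set, f, k):
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    chosen = []
    remaining = list(ground_set)
    for _ in range(min(k, len(remaining))):
        base = f(chosen)
        best = max(remaining, key=lambda e: f(chosen + [e]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical example objective: number of items covered by the chosen sets.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(names):
    covered = set()
    for name in names:
        covered |= sets[name]
    return len(covered)


picked = greedy_max(list(sets), coverage, 2)
print(picked, coverage(picked))  # ['c', 'a'] 7
```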
Twin-Width and Polynomial Kernels
We study the existence of polynomial kernels for parameterized problems that have no polynomial kernel on general graphs, when restricted to graphs of bounded twin-width. It was previously observed in [Bonnet et al., ICALP '21] that the problem k-Independent Set allows no polynomial kernel on graphs of bounded twin-width by a very simple argument, which extends to several other problems such as k-Independent Dominating Set, k-Path, k-Induced Path, and k-Induced Matching. In this work, we examine k-Dominating Set and variants of k-Vertex Cover for the existence of polynomial kernels.
As a main result, we show that k-Dominating Set does not admit a polynomial kernel on graphs of twin-width at most 4 under a standard complexity-theoretic assumption. The reduction is intricate, especially due to the effort to bring the twin-width down to 4, and it can be tweaked to work for Connected k-Dominating Set and Total k-Dominating Set with a slightly worse bound on the twin-width.
On the positive side, we obtain a simple quadratic vertex kernel for Connected k-Vertex Cover and Capacitated k-Vertex Cover on graphs of bounded twin-width. These kernels rely on the fact that graphs of bounded twin-width have Vapnik-Chervonenkis (VC) density 1, that is, for any vertex set X, the number of distinct neighborhoods in X is at most c·|X|, where c is a constant depending only on the twin-width. Interestingly, the kernel applies to any graph class of VC density 1, and does not require a witness sequence. We also present a more intricate O(k^{1.5}) vertex kernel for Connected k-Vertex Cover.
Finally, we show that deciding if a graph has twin-width at most 1 can be done in polynomial time, and observe that most graph optimization/decision problems can be solved in polynomial time on graphs of twin-width at most 1.
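The VC density 1 property mentioned in this abstract bounds the number of distinct neighborhoods in a vertex set X. The sketch below counts them for a small graph. The function name `distinct_traces`, the adjacency-dictionary representation, and the choice to count only vertices outside X are assumptions made for illustration.

```python
def distinct_traces(adj, X):
    """Return the set of distinct neighborhoods N(v) ∩ X over vertices v outside X."""
    X = set(X)
    return {frozenset(adj[v] & X) for v in adj if v not in X}


# Path graph 0 - 1 - 2 - 3 - 4 as an adjacency dictionary.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

traces = distinct_traces(path, {1, 3})
print(len(traces))  # 3: the traces are {1}, {1, 3}, and {3}
```

In a class of VC density 1, this count stays linear in |X| rather than growing like 2^|X|, which is the property the quadratic kernels described above rely on.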
A Case for Partitioned Bloom Filters
In a partitioned Bloom filter the bit vector is split into disjoint,
equal-sized parts, one per hash function. Contrary to hardware designs, where
they prevail, software implementations mostly adopt standard Bloom filters,
considering partitioned filters slightly worse, due to the slightly larger
false positive rate (FPR). In this paper, by performing an in-depth analysis,
first we show that the FPR advantage of standard Bloom filters is smaller than
thought; more importantly, by studying the per-element FPR, we show that
standard Bloom filters have weak spots in the domain: elements which will be
tested as false positives much more frequently than expected. This is relevant
in scenarios where an element is tested against many filters, e.g., in packet
forwarding. Moreover, standard Bloom filters are prone to exhibit extremely
weak spots if naive double hashing is used, something occurring in several,
even mainstream, libraries. Partitioned Bloom filters exhibit a uniform
distribution of the FPR over the domain and are robust to the naive use of
double hashing, having no weak spots. Finally, by surveying several usages
other than testing set membership, we point out the many advantages of having
disjoint parts: they can be individually sampled, extracted, added or retired,
leading to superior designs for, e.g., SIMD usage, size reduction, test of set
disjointness, or duplicate detection in streams. Partitioned Bloom filters are
better, and should replace the standard form, both in general purpose libraries
and as the base for novel designs.
Comment: 21 pages
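The structure described in this abstract can be sketched as follows: each hash function sets exactly one bit in its own part, so the parts never share bits. The class name, the use of SHA-256 with an index prefix as one hash per part, and the one-byte-per-bit storage are assumptions chosen for readability, not the authors' implementation.

```python
import hashlib


class PartitionedBloomFilter:
    """Bloom filter whose bit vector is split into disjoint parts, one per hash function."""

    def __init__(self, num_parts, part_size):
        self.num_parts = num_parts
        self.part_size = part_size
        # One bytearray per part; each byte holds a single bit for clarity.
        self.parts = [bytearray(part_size) for _ in range(num_parts)]

    def _positions(self, item):
        # A separate hash per part (index-prefixed SHA-256) rather than double hashing.
        for i in range(self.num_parts):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "little") % self.part_size

    def add(self, item):
        for part, pos in zip(self.parts, self._positions(item)):
            part[pos] = 1

    def __contains__(self, item):
        return all(part[pos] for part, pos in zip(self.parts, self._positions(item)))


bf = PartitionedBloomFilter(num_parts=4, part_size=64)
bf.add("alpha")
bf.add("beta")
print("alpha" in bf, "beta" in bf)  # True True (a Bloom filter has no false negatives)
```

Because the parts are independent arrays, any one of them can be sampled, extracted, or retired on its own, which is the property the abstract highlights for uses beyond membership testing.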
Prioritized Metric Structures and Embedding
Metric data structures (distance oracles, distance labeling schemes, routing
schemes) and low-distortion embeddings provide a powerful algorithmic
methodology, which has been successfully applied for approximation algorithms
\cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms
\cite{KKMPT12} and for computing sparsifiers \cite{ST04}. However, this
methodology appears to have a limitation: the worst-case performance inherently
depends on the cardinality of the metric, and one cannot specify in advance
which vertices/points should enjoy a better service (i.e., stretch/distortion,
label size/dimension) than that given by the worst-case guarantee.
In this paper we alleviate this limitation by devising a suite of
prioritized metric data structures and embeddings. We show that given a
priority ranking of the graph vertices (respectively,
metric points) one can devise a metric data structure (respectively, embedding)
in which the stretch (resp., distortion) incurred by any pair containing a
vertex will depend on the rank of the vertex. We also show that other
important parameters, such as the label size and (in some sense) the dimension,
may depend only on . In some of our metric data structures (resp.,
embeddings) we achieve both prioritized stretch (resp., distortion) and label
size (resp., dimension) simultaneously. The worst-case performance of our
metric data structures and embeddings is typically asymptotically no worse than
that of their non-prioritized counterparts.
Comment: To appear at STOC 201