Fast Similarity Sketching
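We consider the Similarity Sketching problem: Given a universe $[u] = \{0, \ldots, u-1\}$ we want a random function $S$ mapping subsets $A \subseteq [u]$ into vectors $S(A)$ of size $t$, such that similarity is preserved. More
precisely: Given sets $A, B \subseteq [u]$, define $X_i = [S(A)[i] = S(B)[i]]$ and
$X = \sum_{i \in [t]} X_i$. We want to have $E[X] = t \cdot J(A,B)$, where
$J(A,B) = |A \cap B| / |A \cup B|$ and furthermore to have strong concentration
guarantees (i.e. Chernoff-style bounds) for $X$. This is a fundamental problem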
which has found numerous applications in data mining, large-scale
classification, computer vision, similarity search, etc. via the classic
MinHash algorithm. The vectors $S(A)$ are also called sketches.
The seminal $t \times$MinHash algorithm uses $t$ random hash functions
$h_1, \ldots, h_t$, and stores $(\min_{a \in A} h_1(a), \ldots, \min_{a \in A} h_t(a))$ as the sketch of $A$. The main drawback of MinHash is,
however, its $O(t \cdot |A|)$ running time, and finding a sketch with similar
properties and faster running time has been the subject of several papers.
Addressing this, Li et al. [NIPS'12] introduced one permutation hashing (OPH),
which creates a sketch of size $t$ in $O(t + |A|)$ time, but with the drawback
that possibly some of the entries are "empty" when $|A| = O(t)$. One could
argue that sketching is not necessary in this case; however, the desire in most
applications is to have one sketching procedure that works for sets of all
sizes. Therefore, filling out these empty entries is the subject of several
follow-up papers initiated by Shrivastava and Li [ICML'14]. However, these
"densification" schemes fail to provide good concentration bounds exactly in
the case $|A| = O(t)$, where they are needed. (continued...
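As an illustration of the classic MinHash scheme described in this abstract, here is a minimal Python sketch. It is not the paper's fast sketch and not one permutation hashing. The names `make_hash`, `minhash_sketch` and `estimate_jaccard` are hypothetical, and the seeded linear hash modulo a prime is an assumed stand-in for the truly random hash functions the analysis requires.

```python
import random

PRIME = (1 << 61) - 1  # assumed modulus; items must be integers in [0, PRIME)

def make_hash(seed, prime=PRIME):
    """Return a seeded linear hash x -> (a*x + b) mod prime (an assumption,
    used here in place of a truly random hash function)."""
    rng = random.Random(seed)
    a = rng.randrange(1, prime)  # a != 0 keeps the map injective on [0, prime)
    b = rng.randrange(prime)
    return lambda x: (a * x + b) % prime

def minhash_sketch(items, hashes):
    """One entry per hash function: the minimum hash value over the set.
    Costs one hash evaluation per (item, hash function) pair."""
    return [min(h(x) for x in items) for h in hashes]

def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of coordinates on which the two sketches agree."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)

if __name__ == "__main__":
    hashes = [make_hash(seed) for seed in range(128)]
    a = set(range(0, 100))
    b = set(range(50, 150))
    true_jaccard = len(a & b) / len(a | b)
    estimate = estimate_jaccard(minhash_sketch(a, hashes), minhash_sketch(b, hashes))
    print(f"true={true_jaccard:.3f} estimate={estimate:.3f}")
```

Every sketch entry scans the whole set, which is the running-time drawback the abstract attributes to MinHash.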
Dynamic Algorithms for Graph Coloring
We design fast dynamic algorithms for proper vertex and edge colorings in a
graph undergoing edge insertions and deletions. In the static setting, there
are simple linear time algorithms for $(\Delta+1)$-vertex coloring and
$(2\Delta-1)$-edge coloring in a graph with maximum degree $\Delta$. It is
natural to ask if we can efficiently maintain such colorings in the dynamic
setting as well. We get the following three results. (1) We present a
randomized algorithm which maintains a $(\Delta+1)$-vertex coloring with
$O(\log \Delta)$ expected amortized update time. (2) We present a deterministic
algorithm which maintains a $(1+o(1))\Delta$-vertex coloring with
$O(\text{poly} \log \Delta)$ amortized update time. (3) We present a simple,
deterministic algorithm which maintains a $(2\Delta-1)$-edge coloring with
$O(\log \Delta)$ worst-case update time. This improves the recent
$O(\Delta)$-edge coloring algorithm with $\tilde{O}(\sqrt{\Delta})$ worst-case
update time by Barenboim and Maimon.
Comment: To appear in SODA 201
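For context, the simple static baseline this abstract refers to can be sketched as a greedy pass over the vertices. This is only the static linear-time idea, not any of the dynamic algorithms from the paper. The function name `greedy_vertex_coloring` and the adjacency-dict input format are assumptions made for illustration.

```python
def greedy_vertex_coloring(adj):
    """Greedy static coloring. `adj` maps each vertex to a list of neighbours
    (assumed symmetric, no self-loops). A vertex has at most deg(v) coloured
    neighbours, so some colour in {0, ..., deg(v)} is always free, giving at
    most (max degree + 1) colours overall."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:  # at most deg(v) + 1 iterations
            c += 1
        color[v] = c
    return color

if __name__ == "__main__":
    # A 5-cycle: maximum degree 2, so at most 3 colours are used.
    cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
    print(greedy_vertex_coloring(cycle))
```

The dynamic question is how to repair such a coloring after each edge insertion or deletion without rerunning the full pass.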
A Circuit-Based Approach to Efficient Enumeration
We study the problem of enumerating the satisfying valuations of a circuit while bounding the delay, i.e., the time needed to compute each successive valuation. We focus on the class of structured d-DNNF circuits originally introduced in knowledge compilation, a sub-area of artificial intelligence. We propose an algorithm for these circuits that enumerates valuations with linear preprocessing and delay linear in the Hamming weight of each valuation. Moreover, valuations of constant Hamming weight can be enumerated with linear preprocessing and constant delay.
Our results yield a framework for efficient enumeration that applies to all problems whose solutions can be compiled to structured d-DNNFs. In particular, we use it to recapture classical results in database theory, for factorized database representations and for MSO evaluation. This gives an independent proof of constant-delay enumeration for MSO formulae with first-order free variables on bounded-treewidth structures.
Two new results about quantum exact learning
We present two new results about exact learning by quantum computers. First,
we show how to exactly learn a $k$-Fourier-sparse $n$-bit Boolean function from
$O(k^{1.5} (\log k)^2)$ uniform quantum examples for that function. This
improves over the bound of $\widetilde{\Theta}(kn)$ uniformly random classical
examples (Haviv and Regev, CCC'15). Our main tool is an improvement of Chang's
lemma for the special case of sparse functions. Second, we show that if a
concept class $\mathcal{C}$ can be exactly learned using $Q$ quantum membership
queries, then it can also be learned using $O\left(\frac{Q^2}{\log Q} \log |\mathcal{C}|\right)$ classical membership queries. This improves the
previous-best simulation result (Servedio and Gortler, SICOMP'04) by a $\log Q$-factor.
Comment: v3: 21 pages. Small corrections and clarification