Search CORE

3,670 research outputs found

Succinct Filters for Sets of Unknown Sizes

Author: Liu Mingmou
Yin Yitong
Yu Huacheng
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)
Publication date: 01/01/2020
Field of study

The membership problem asks to maintain a set S ? [u], supporting insertions and membership queries, i.e., testing if a given element is in the set. A data structure that computes exact answers is called a dictionary. When a (small) false positive rate ? is allowed, the data structure is called a filter. The space usages of the standard dictionaries or filters usually depend on the upper bound on the size of S, while the actual set can be much smaller. Pagh, Segev and Wieder [Pagh et al., 2013] were the first to study filters with varying space usage based on the current |S|. They showed in order to match the space with the current set size n = |S|, any filter data structure must use (1-o(1))n(log(1/?)+(1-O(?))log log n) bits, in contrast to the well-known lower bound of N log(1/?) bits, where N is an upper bound on |S|. They also presented a data structure with almost optimal space of (1+o(1))n(log(1/?)+O(log log n)) bits provided that n > u^0.001, with expected amortized constant insertion time and worst-case constant lookup time. In this work, we present a filter data structure with improvements in two aspects: - it has constant worst-case time for all insertions and lookups with high probability; - it uses space (1+o(1))n(log (1/?)+log log n) bits when n > u^0.001, achieving optimal leading constant for all ? = o(1). We also present a dictionary that uses (1+o(1))nlog(u/n) bits of space, matching the optimal space in terms of the current size, and performs all operations in constant time with high probability

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Learning detectors quickly using structured covariance matrices

Author: Lucey Simon
Sridharan Sridha
Valmadre Jack
Publication venue
Publication date: 28/03/2014
Field of study

Computer vision is increasingly becoming interested in the rapid estimation of object detectors. Canonical hard negative mining strategies are slow as they require multiple passes of the large negative training set. Recent work has demonstrated that if the distribution of negative examples is assumed to be stationary, then Linear Discriminant Analysis (LDA) can learn comparable detectors without ever revisiting the negative set. Even with this insight, however, the time to learn a single object detector can still be on the order of tens of seconds on a modern desktop computer. This paper proposes to leverage the resulting structured covariance matrix to obtain detectors with identical performance in orders of magnitude less time and memory. We elucidate an important connection to the correlation filter literature, demonstrating that these can also be trained without ever revisiting the negative set

arXiv.org e-Print Archive

CiteSeerX

Bayesian wavelet de-noising with the caravan prior

Author: Gugushvili Shota
Schauer Moritz
Spreij Peter
van der Meulen Frank
Publication venue: 'EDP Sciences'
Publication date: 01/01/2019
Field of study

According to both domain expert knowledge and empirical evidence, wavelet coefficients of real signals tend to exhibit clustering patterns, in that they contain connected regions of coefficients of similar magnitude (large or small). A wavelet de-noising approach that takes into account such a feature of the signal may in practice outperform other, more vanilla methods, both in terms of the estimation error and visual appearance of the estimates. Motivated by this observation, we present a Bayesian approach to wavelet de-noising, where dependencies between neighbouring wavelet coefficients are a priori modelled via a Markov chain-based prior, that we term the caravan prior. Posterior computations in our method are performed via the Gibbs sampler. Using representative synthetic and real data examples, we conduct a detailed comparison of our approach with a benchmark empirical Bayes de-noising method (due to Johnstone and Silverman). We show that the caravan prior fares well and is therefore a useful addition to the wavelet de-noising toolbox.Comment: 32 pages, 15 figures, 4 table

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Chalmers Research

Radboud Repository

International Migration, Integration and Social Cohesion online publications

UvA-DARE

A Hash Table Without Hash Functions, and How to Get the Most Out of Your Random Bits

Author: Kuszmaul William
Publication venue
Publication date: 15/09/2022
Field of study

This paper considers the basic question of how strong of a probabilistic guarantee can a hash table, storing

n

(1 + \Theta(1)) \log n

-bit key/value pairs, offer? Past work on this question has been bottlenecked by limitations of the known families of hash functions: The only hash tables to achieve failure probabilities less than 1 / 2^{\polylog n} require access to fully-random hash functions -- if the same hash tables are implemented using the known explicit families of hash functions, their failure probabilities become 1 / \poly(n). To get around these obstacles, we show how to construct a randomized data structure that has the same guarantees as a hash table, but that \emph{avoids the direct use of hash functions}. Building on this, we are able to construct a hash table using

O(n)

random bits that achieves failure probability

1 / n^{n^{1 - \epsilon}}

for an arbitrary positive constant

\epsilon

. In fact, we show that this guarantee can even be achieved by a \emph{succinct dictionary}, that is, by a dictionary that uses space within a

1 + o(1)

factor of the information-theoretic optimum. Finally we also construct a succinct hash table whose probabilistic guarantees fall on a different extreme, offering a failure probability of 1 / \poly(n) while using only

\tilde{O}(\log n)

random bits. This latter result matches (up to low-order terms) a guarantee previously achieved by Dietzfelbinger et al., but with increased space efficiency and with several surprising technical components

arXiv.org e-Print Archive