4 research outputs found

    Efficient Sampling Methods for Discrete Distributions

    Dynamic sampling from a discrete probability distribution with a known distribution of rates

    In this paper, we consider several efficient data structures for the problem of sampling from a dynamically changing discrete probability distribution, where some prior information is known on the distribution of the rates, in particular the maximum and minimum rate, and where the number of possible outcomes N is large. We consider three basic data structures: the Acceptance–Rejection method, the Complete Binary Tree and the Alias method. These can be used as building blocks in a multi-level data structure, where at each level one of the basic data structures can be used, with the top level selecting a group of events and the bottom level selecting an element from a group. Depending on assumptions on the distribution of the rates of outcomes, different combinations of the basic structures can be used. We prove that for particular data structures the expected time of sampling and update is constant when the rate distribution satisfies certain conditions. We show that for any distribution, combining a tree structure with the Acceptance–Rejection method achieves an expected sampling and update time of O(log log(rmax/rmin)), where rmax is the maximum rate and rmin the minimum rate. We also discuss an implementation of a Two-Level Acceptance–Rejection data structure that allows expected constant time for sampling and amortized constant time for updates, assuming that rmax and rmin are known and the number of events is sufficiently large. We also present an experimental verification, highlighting the limits imposed by the constraints of a real-life setting.
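
    The paper's full multi-level structures are not reproduced here, but the following minimal Python sketch shows the basic Acceptance–Rejection building block under the stated assumptions: the rates are kept in a flat array, an upper bound rmax on every rate is known, and all names are illustrative rather than the authors' implementation.

        import random

        class AcceptanceRejectionSampler:
            """Sketch of the Acceptance-Rejection building block: draw index i
            with probability rates[i] / sum(rates), given a known upper bound
            r_max on all rates. Illustrative only, not the paper's code."""

            def __init__(self, rates, r_max):
                self.rates = list(rates)  # current (positive) rates
                self.r_max = r_max        # known upper bound on any rate

            def update(self, i, new_rate):
                # O(1) worst-case update; assumes 0 < new_rate <= r_max
                self.rates[i] = new_rate

            def sample(self):
                n = len(self.rates)
                while True:
                    i = random.randrange(n)  # propose an outcome uniformly
                    # accept with probability rates[i] / r_max
                    if random.random() * self.r_max < self.rates[i]:
                        return i

    The expected number of proposals per sample is n*rmax / sum(rates), so sampling runs in expected constant time when all rates are within a constant factor of rmax; the multi-level combinations discussed in the abstract exist precisely to relax this assumption.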

    An Adaptive Sublinear-Time Block Sparse Fourier Transform

    The problem of approximately computing the k dominant Fourier coefficients of a vector X quickly, using few samples in the time domain, is known as the Sparse Fourier Transform (sparse FFT) problem. A long line of work on the sparse FFT has resulted in algorithms with O(k log n log(n/k)) runtime [Hassanieh et al., STOC'12] and O(k log n) sample complexity [Indyk et al., FOCS'14]. These results are proved using non-adaptive algorithms, and the latter O(k log n) sample complexity result is essentially the best possible under the sparsity assumption alone: it is known that even adaptive algorithms must use Ω((k log(n/k)) / log log n) samples [Hassanieh et al., STOC'12]. By adaptive, we mean being able to exploit previous samples in guiding the selection of further samples. This paper revisits the sparse FFT problem with the added twist that the sparse coefficients approximately obey a (k0, k1)-block sparse model. In this model, signal frequencies are clustered in k0 intervals of width k1 in Fourier space, and k = k0*k1 is the total sparsity. Signals arising in applications are often well approximated by this model with k0 ≪ k. Our main result is the first sparse FFT algorithm for (k0, k1)-block sparse signals, with a sample complexity of O*(k0 k1 + k0 log(1 + k0) log n) at constant signal-to-noise ratios, and sublinear runtime. A similar sample complexity was previously achieved in the works on model-based compressive sensing using random Gaussian measurements, but with Ω(n) runtime. To the best of our knowledge, our result is the first sublinear-time algorithm for model-based compressed sensing, and the first sparse FFT result that goes below the O(k log n) sample complexity bound. Interestingly, the aforementioned model-based compressive sensing result that relies on Gaussian measurements is non-adaptive, whereas our algorithm crucially uses adaptivity to achieve the improved sample complexity bound. We prove that adaptivity is in fact necessary in the Fourier setting: any non-adaptive algorithm must use Ω(k0 k1 log(n/(k0 k1))) samples for the (k0, k1)-block sparse model, ruling out improvements over the vanilla sparsity assumption. Our main technical innovation for adaptivity is a new randomized energy-based importance sampling technique that may be of independent interest.
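
    As a point of intuition (not taken from the paper), the toy numpy sketch below constructs a signal whose spectrum follows the (k0, k1)-block sparse model: k0 intervals of width k1 carry all the Fourier energy, so the total sparsity is k = k0*k1. All sizes and variable names are assumptions made for illustration.

        import numpy as np

        rng = np.random.default_rng(0)
        n, k0, k1 = 1024, 4, 8  # illustrative sizes; total sparsity k = k0*k1 = 32

        # Place k0 non-overlapping intervals of width k1 in Fourier space.
        X_hat = np.zeros(n, dtype=complex)
        starts = rng.choice(n // k1, size=k0, replace=False) * k1
        for s in starts:
            X_hat[s:s + k1] = rng.standard_normal(k1) + 1j * rng.standard_normal(k1)

        # Time-domain signal; a block sparse FFT aims to recover X_hat from
        # few (adaptively chosen) samples of x, exploiting the clustered support.
        x = np.fft.ifft(X_hat)
        assert np.count_nonzero(X_hat) == k0 * k1  # total sparsity k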

    Maintaining Discrete Probability Distributions Optimally
