7,105 research outputs found

    GROTESQUE: Noisy Group Testing (Quick and Efficient)

    Full text link
    Group-testing refers to the problem of identifying (with high probability) a (small) subset of DD defectives from a (large) set of NN items via a "small" number of "pooled" tests. For ease of presentation in this work we focus on the regime when D = \cO{N^{1-\gap}} for some \gap > 0. The tests may be noiseless or noisy, and the testing procedure may be adaptive (the pool defining a test may depend on the outcome of a previous test), or non-adaptive (each test is performed independent of the outcome of other tests). A rich body of literature demonstrates that Θ(Dlog(N))\Theta(D\log(N)) tests are information-theoretically necessary and sufficient for the group-testing problem, and provides algorithms that achieve this performance. However, it is only recently that reconstruction algorithms with computational complexity that is sub-linear in NN have started being investigated (recent work by \cite{GurI:04,IndN:10, NgoP:11} gave some of the first such algorithms). In the scenario with adaptive tests with noisy outcomes, we present the first scheme that is simultaneously order-optimal (up to small constant factors) in both the number of tests and the decoding complexity (\cO{D\log(N)} in both the performance metrics). The total number of stages of our adaptive algorithm is "small" (\cO{\log(D)}). Similarly, in the scenario with non-adaptive tests with noisy outcomes, we present the first scheme that is simultaneously near-optimal in both the number of tests and the decoding complexity (via an algorithm that requires \cO{D\log(D)\log(N)} tests and has a decoding complexity of {O(D(logN+log2D)){\cal O}(D(\log N+\log^{2}D))}. Finally, we present an adaptive algorithm that only requires 2 stages, and for which both the number of tests and the decoding complexity scale as {O(D(logN+log2D)){\cal O}(D(\log N+\log^{2}D))}. For all three settings the probability of error of our algorithms scales as \cO{1/(poly(D)}.Comment: 26 pages, 5 figure

    Detecting Activations over Graphs using Spanning Tree Wavelet Bases

    Full text link
    We consider the detection of activations over graphs under Gaussian noise, where signals are piece-wise constant over the graph. Despite the wide applicability of such a detection algorithm, there has been little success in the development of computationally feasible methods with proveable theoretical guarantees for general graph topologies. We cast this as a hypothesis testing problem, and first provide a universal necessary condition for asymptotic distinguishability of the null and alternative hypotheses. We then introduce the spanning tree wavelet basis over graphs, a localized basis that reflects the topology of the graph, and prove that for any spanning tree, this approach can distinguish null from alternative in a low signal-to-noise regime. Lastly, we improve on this result and show that using the uniform spanning tree in the basis construction yields a randomized test with stronger theoretical guarantees that in many cases matches our necessary conditions. Specifically, we obtain near-optimal performance in edge transitive graphs, kk-nearest neighbor graphs, and ϵ\epsilon-graphs

    Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs

    Full text link
    We develop data structures for dynamic closest pair problems with arbitrary distance functions, that do not necessarily come from any geometric structure on the objects. Based on a technique previously used by the author for Euclidean closest pairs, we show how to insert and delete objects from an n-object set, maintaining the closest pair, in O(n log^2 n) time per update and O(n) space. With quadratic space, we can instead use a quadtree-like structure to achieve an optimal time bound, O(n) per update. We apply these data structures to hierarchical clustering, greedy matching, and TSP heuristics, and discuss other potential applications in machine learning, Groebner bases, and local improvement algorithms for partition and placement problems. Experiments show our new methods to be faster in practice than previously used heuristics.Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp. 619-628. For source code and experimental results, see http://www.ics.uci.edu/~eppstein/projects/pairs
    corecore