
    Nearly Optimal Sparse Group Testing

    Group testing is the process of pooling arbitrary subsets from a set of $n$ items so as to identify, with a minimal number of tests, a "small" subset of $d$ defective items. In "classical" non-adaptive group testing, it is known that when $d$ is substantially smaller than $n$, $\Theta(d\log(n))$ tests are both information-theoretically necessary and sufficient to guarantee recovery with high probability. Group testing schemes in the literature meeting this bound require most items to be tested $\Omega(\log(n))$ times, and most tests to incorporate $\Omega(n/d)$ items. Motivated by physical considerations, we study group testing models in which the testing procedure is constrained to be "sparse". Specifically, we consider (separately) scenarios in which (a) items are finitely divisible and hence may participate in at most $\gamma \in o(\log(n))$ tests; or (b) tests are size-constrained to pool no more than $\rho \in o(n/d)$ items per test. For both scenarios we provide information-theoretic lower bounds on the number of tests required to guarantee high-probability recovery. In both scenarios we provide both randomized constructions (under both $\epsilon$-error and zero-error reconstruction guarantees) and explicit constructions of designs with computationally efficient reconstruction algorithms that require a number of tests that is optimal up to constant or small polynomial factors in some regimes of $n$, $d$, $\gamma$, and $\rho$. The randomized design/reconstruction algorithm in the $\rho$-sized test scenario is universal -- independent of the value of $d$, as long as $\rho \in o(n/d)$. We also investigate the effect of unreliability/noise in test outcomes. For the full abstract, please see the full text PDF.
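    The finitely divisible scenario (a) can be illustrated with a small simulation. The sketch below is a hypothetical illustration, not the paper's construction: each item joins exactly $\gamma$ randomly chosen pools, outcomes are noiseless ORs, and decoding uses the standard COMP rule (any item appearing in a negative pool is cleared). The parameter values are arbitrary.

    ```python
    import random

    def sparse_design(n, m, gamma, rng):
        """Each item participates in exactly gamma randomly chosen tests
        (the finite-divisibility constraint)."""
        pools = [set() for _ in range(m)]
        for item in range(n):
            for t in rng.sample(range(m), gamma):
                pools[t].add(item)
        return pools

    def run_tests(pools, defectives):
        """Noiseless disjunctive outcomes: a pool is positive iff it
        contains at least one defective item."""
        return [bool(pool & defectives) for pool in pools]

    def comp_decode(pools, outcomes, n):
        """COMP: every item seen in a negative pool is non-defective;
        whatever remains is the candidate defective set."""
        cleared = set()
        for pool, positive in zip(pools, outcomes):
            if not positive:
                cleared |= pool
        return set(range(n)) - cleared

    rng = random.Random(0)
    n, d, gamma, m = 200, 3, 4, 60
    defectives = set(rng.sample(range(n), d))
    pools = sparse_design(n, m, gamma, rng)
    estimate = comp_decode(pools, run_tests(pools, defectives), n)
    # COMP never clears a true defective in the noiseless model,
    # so the estimate always contains the defective set.
    assert defectives <= estimate
    ```

    COMP may keep some false positives when $\gamma$ is small; the zero-false-negative guarantee, however, holds deterministically in the noiseless model.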

    Group testing: an information theory perspective

    The group testing problem concerns discovering a small number of defective items within a large population by performing tests on pools of items. A test is positive if the pool contains at least one defective, and negative if it contains no defectives. This is a sparse inference problem with a combinatorial flavour, with applications in medical testing, biology, telecommunications, information technology, data science, and more. In this monograph, we survey recent developments in the group testing problem from an information-theoretic perspective. We cover several related developments: efficient algorithms with practical storage and computation requirements, achievability bounds for optimal decoding methods, and algorithm-independent converse bounds. We assess the theoretical guarantees not only in terms of scaling laws, but also in terms of the constant factors, leading to the notion of the {\em rate} of group testing, indicating the amount of information learned per test. Considering both noiseless and noisy settings, we identify several regimes where existing algorithms are provably optimal or near-optimal, as well as regimes where there remains greater potential for improvement. In addition, we survey results concerning a number of variations on the standard group testing problem, including partial recovery criteria, adaptive algorithms with a limited number of stages, constrained test designs, and sublinear-time algorithms. Comment: Survey paper, 140 pages, 19 figures. To be published in Foundations and Trends in Communications and Information Theory.

    Noise-Resilient Group Testing: Limitations and Constructions

    We study combinatorial group testing schemes for learning $d$-sparse Boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we take this barrier to our advantage and show that approximate reconstruction (within a satisfactory degree of approximation) allows us to break the information-theoretic lower bound of $\tilde{\Omega}(d^2 \log n)$ that is known for exact reconstruction of $d$-sparse vectors of length $n$ via non-adaptive measurements, by a multiplicative factor $\tilde{\Omega}(d)$. Specifically, we give simple randomized constructions of non-adaptive measurement schemes, with $m = O(d \log n)$ measurements, that allow efficient reconstruction of $d$-sparse vectors up to $O(d)$ false positives even in the presence of $\delta m$ false positives and $O(m/d)$ false negatives within the measurement outcomes, for any constant $\delta < 1$. We show that, information-theoretically, none of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit constructions, in particular one matching the randomized trade-off but using $m = O(d^{1+o(1)} \log n)$ measurements. We also obtain explicit constructions that allow fast reconstruction in time $\mathrm{poly}(m)$, which would be sublinear in $n$ for sufficiently sparse vectors. The main tool used in our construction is the list-decoding view of randomness condensers and extractors. Comment: Full version. A preliminary summary of this work appears (under the same title) in proceedings of the 17th International Symposium on Fundamentals of Computation Theory (FCT 2009).
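    The effect of unreliable disjunctive measurements can be seen in a toy simulation. The sketch below is an illustrative assumption, not the paper's scheme: a Bernoulli random design, a budget of randomly flipped test outcomes standing in for the adversarial noise, and a simple majority-style threshold decoder (in the spirit of NCOMP-type rules) that tolerates a bounded fraction of corrupted tests per item.

    ```python
    import random

    def random_design(n, m, p, rng):
        """Bernoulli(p) test matrix: each item joins each pool independently."""
        return [{i for i in range(n) if rng.random() < p} for _ in range(m)]

    def noisy_outcomes(pools, defectives, flips, rng):
        """Disjunctive outcomes with a budget of flipped tests
        (flipped uniformly at random here, for illustration)."""
        out = [bool(pool & defectives) for pool in pools]
        for t in rng.sample(range(len(pools)), flips):
            out[t] = not out[t]
        return out

    def threshold_decode(pools, out, n, tau):
        """Declare an item defective if at least a tau fraction of the
        pools containing it came back positive."""
        est = set()
        for i in range(n):
            votes = [o for pool, o in zip(pools, out) if i in pool]
            if votes and sum(votes) >= tau * len(votes):
                est.add(i)
        return est

    rng = random.Random(0)
    n, d, m, p, flips = 100, 2, 200, 0.1, 5
    defectives = set(rng.sample(range(n), d))
    pools = random_design(n, m, p, rng)
    out = noisy_outcomes(pools, defectives, flips, rng)
    est = threshold_decode(pools, out, n, tau=0.8)
    ```

    Each defective sits in roughly $pm = 20$ pools, so a handful of flipped outcomes rarely pushes it below the threshold, while a non-defective's pools are positive only about a $1-(1-p)^d \approx 0.19$ fraction of the time, far below $\tau = 0.8$.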

    Estimation of Sparsity via Simple Measurements

    We consider several related problems of estimating the 'sparsity' or number of nonzero elements $d$ in a length-$n$ vector $\mathbf{x}$ by observing only $\mathbf{b} = M \odot \mathbf{x}$, where $M$ is a predesigned test matrix independent of $\mathbf{x}$, and the operation $\odot$ varies between problems. We aim to provide a $\Delta$-approximation of sparsity for some constant $\Delta$ with a minimal number of measurements (rows of $M$). This framework generalizes multiple problems, such as estimation of sparsity in group testing and compressed sensing. We use techniques from coding theory as well as probabilistic methods to show that $O(D \log D \log n)$ rows are sufficient when the operation $\odot$ is logical OR (i.e., group testing), and nearly this many are necessary, where $D$ is a known upper bound on $d$. When instead the operation $\odot$ is multiplication over $\mathbb{R}$ or a finite field $\mathbb{F}_q$, we show that respectively $\Theta(D)$ and $\Theta(D \log_q \frac{n}{D})$ measurements are necessary and sufficient. Comment: 13 pages; shortened version presented at ISIT 201
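    For the OR (group testing) case, the basic idea behind such estimators can be sketched with a simple moment-style argument: with a Bernoulli($p$) pool design, the fraction of positive pools concentrates around $1-(1-p)^d$, which can be inverted for $d$. The code below is a minimal hypothetical sketch of this inversion, not the paper's construction, with arbitrary parameter choices.

    ```python
    import math
    import random

    def estimate_sparsity(support, n, m, p, rng):
        """Estimate d = |support| from m random-OR pool outcomes.
        Each pool is positive iff it intersects the support, so the
        positive fraction concentrates around 1 - (1-p)^d; invert it."""
        positives = 0
        for _ in range(m):
            pool = {i for i in range(n) if rng.random() < p}
            positives += bool(pool & support)
        frac = positives / m
        if frac >= 1.0:
            return float('inf')  # every pool positive: d too large to resolve
        return math.log(1 - frac) / math.log(1 - p)

    rng = random.Random(1)
    n, d = 1000, 10
    support = set(rng.sample(range(n), d))
    # p on the order of 1/D keeps the positive fraction bounded away
    # from 0 and 1, where the inversion is well conditioned.
    est = estimate_sparsity(support, n, m=2000, p=1 / (2 * d), rng=rng)
    ```

    With these settings the positive fraction is near $1-(0.95)^{10} \approx 0.40$, so the estimate lands close to the true sparsity; achieving a guaranteed $\Delta$-approximation with few rows is exactly what the paper's constructions optimize.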