
    Estimation of Sparsity via Simple Measurements

    We consider several related problems of estimating the 'sparsity' or number of nonzero elements $d$ in a length-$n$ vector $\mathbf{x}$ by observing only $\mathbf{b} = M \odot \mathbf{x}$, where $M$ is a predesigned test matrix independent of $\mathbf{x}$, and the operation $\odot$ varies between problems. We aim to provide a $\Delta$-approximation of sparsity for some constant $\Delta$ with a minimal number of measurements (rows of $M$). This framework generalizes multiple problems, such as estimation of sparsity in group testing and compressed sensing. We use techniques from coding theory as well as probabilistic methods to show that $O(D \log D \log n)$ rows are sufficient when the operation $\odot$ is logical OR (i.e., group testing), and nearly this many are necessary, where $D$ is a known upper bound on $d$. When instead the operation $\odot$ is multiplication over $\mathbb{R}$ or a finite field $\mathbb{F}_q$, we show that respectively $\Theta(D)$ and $\Theta(D \log_q \frac{n}{D})$ measurements are necessary and sufficient.
    Comment: 13 pages; shortened version presented at ISIT 201
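
    To make the measurement framework concrete, the following is a minimal sketch (illustrative only, not code from the paper) of the three instantiations of $\odot$ for a single 0/1 test row $\mathbf{m}$ of $M$; the function names and parameters are assumptions of this note.

        import numpy as np

        # Sketch of the three measurement models b = M (.) x, applied to one
        # test row m. Assumes a 0/1 test row for simplicity.

        def measure_or(m, x):
            """Group-testing model: OR of the nonzero pattern of x selected by m (0/1 outcome)."""
            return int(np.any(m * (x != 0)))

        def measure_real(m, x):
            """Linear model over the reals: inner product <m, x>."""
            return float(np.dot(m, x))

        def measure_finite_field(m, x, q):
            """Linear model over F_q: inner product reduced mod q (q prime for simplicity)."""
            return int(np.dot(m, x) % q)

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            n, d = 20, 3
            x = np.zeros(n, dtype=int)
            x[rng.choice(n, size=d, replace=False)] = rng.integers(1, 5, size=d)
            m = rng.integers(0, 2, size=n)   # one random 0/1 test row
            print(measure_or(m, x), measure_real(m, x), measure_finite_field(m, x, q=7))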

    Engineering Competitive and Query-Optimal Minimal-Adaptive Randomized Group Testing Strategies

    Suppose we are given a collection of $n$ elements, $d$ of which are \emph{defective}. We can query an arbitrarily chosen subset of elements, which returns Yes if the subset contains at least one defective and No if the subset is free of defectives. The problem of group testing is to identify the defectives with a minimum number of such queries. By the information-theoretic lower bound, at least $\log_2 \binom{n}{d} \approx d\log_2(\frac{n}{d}) \approx d\log_2 n$ queries are needed. Using adaptive group testing, i.e., asking one query at a time, the lower bound can easily be achieved. However, strategies are preferred that work in a fixed small number of stages, where the queries in a stage are asked in parallel. A group testing strategy is called \emph{competitive} if it works for completely unknown $d$ and requires only $O(d\log_2 n)$ queries. Usually competitive group testing is based on sequential queries. We have shown that competitive group testing with expected $O(d\log_2 n)$ queries is actually possible in only $2$ or $3$ stages. We then focused on minimizing the hidden constant factor in the query number and proposed a systematic approach for this purpose. Another main result concerns the design of query-optimal and minimal-adaptive strategies. We have shown that a $2$-stage randomized strategy with prescribed success probability can asymptotically achieve the information-theoretic lower bound for $d \ll n$ growing much more slowly than $n$. Similarly, we can approach the entropy lower bound in $4$ stages when $d = o(n)$.
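
    As a point of reference for the adaptive lower bound mentioned above, here is a minimal sketch (not the strategy of this work) of fully adaptive group testing by binary splitting, which uses a number of queries within a small constant factor of $d\log_2(\frac{n}{d})$; the `is_positive` oracle is an assumption of this illustration.

        # Fully adaptive group testing by binary splitting: only pools whose
        # parent pool tested positive are split further, so the query count
        # stays within a small constant factor of d*log2(n/d).

        def find_defectives(items, is_positive):
            """Return the list of defective items, querying pools adaptively."""
            defectives = []
            stack = [list(items)]
            while stack:
                pool = stack.pop()
                if not pool or not is_positive(pool):
                    continue                      # pool is free of defectives
                if len(pool) == 1:
                    defectives.append(pool[0])    # a positive singleton is defective
                    continue
                mid = len(pool) // 2
                stack.append(pool[:mid])          # split the positive pool and recurse
                stack.append(pool[mid:])
            return defectives

        if __name__ == "__main__":
            defective_set = {3, 17, 42}
            oracle = lambda pool: any(e in defective_set for e in pool)
            print(sorted(find_defectives(range(100), oracle)))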

    New Constructions for Competitive and Minimal-Adaptive Group Testing

    Group testing (GT) was originally proposed during World War II in an attempt to minimize the \emph{cost} and \emph{waiting time} of performing identical blood tests on soldiers for a low-prevalence disease. Formally, the GT problem asks to find $d \ll n$ \emph{defective} elements out of $n$ elements by querying subsets (pools) for the presence of defectives. By the information-theoretic lower bound, essentially $d\log_2 n$ queries are needed in the worst case. An \emph{adaptive} strategy proceeds sequentially by performing one query at a time, and it can achieve the lower bound. In various applications, nothing is known about $d$ beforehand, and a strategy for this scenario is called \emph{competitive}. Such strategies are usually adaptive and achieve query optimality within a constant factor called the \emph{competitive ratio}. In many applications, queries are time-consuming. Therefore, \emph{minimal-adaptive} strategies that run in a small number $s$ of stages of parallel queries are favorable. This work is mainly devoted to the design of minimal-adaptive strategies combined with other demands of both theoretical and practical interest. First we target unknown $d$ and show that competitive GT is actually possible in as few as $2$ stages. The main ingredient is our randomized estimate of a previously unknown $d$ using nonadaptive queries. In addition, we have developed a systematic approach to obtain optimal competitive ratios for our strategies. When an upper bound on $d$ is known beforehand, we propose randomized GT strategies which asymptotically achieve query optimality in just $2$, $3$ or $4$ stages, depending on the growth of $d$ versus $n$. Inspired by application settings, such as at the American Red Cross, where GT is mostly applied to small instances (e.g., $n=16$), we extended our study of query-optimal GT strategies to solving a given problem instance with fixed values of $n$, $d$ and $s$. We also considered the situation when the elements to test cannot be divided physically (e.g., electronic devices), so that the pools must be disjoint. For GT with \emph{disjoint} simultaneous pools, we show that $\Theta(sd(n/d)^{1/s})$ tests are sufficient, and also necessary for certain ranges of the parameters.
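
    The following sketch (an illustration under the stated assumptions, not the thesis' construction) shows how roughly $s \cdot d \cdot (n/d)^{1/s}$ tests can suffice with pools that are disjoint within each stage: every positive group is split into about $(n/d)^{1/s}$ disjoint parts per stage; `is_positive` is the assumed pool oracle.

        import math

        def _split(group, k):
            """Partition `group` into at most k nonempty, disjoint parts of near-equal size."""
            if not group:
                return []
            k = min(k, len(group))
            size = math.ceil(len(group) / k)
            return [group[i:i + size] for i in range(0, len(group), size)]

        def multistage_disjoint(items, d, s, is_positive):
            """Identify all defectives in roughly s stages; returns (defectives, tests, stages)."""
            items = list(items)
            branching = max(2, math.ceil((len(items) / max(d, 1)) ** (1.0 / s)))
            # Stage 1 partitions all items into about d*branching disjoint groups.
            active = _split(items, max(d, 1) * branching)
            defectives, tests, stages = [], 0, 0
            while active:
                stages += 1
                tests += len(active)              # pools of one stage are queried in parallel
                positives = [g for g in active if is_positive(g)]
                defectives += [g[0] for g in positives if len(g) == 1]
                # Split every larger positive group into `branching` disjoint parts.
                active = [p for g in positives if len(g) > 1 for p in _split(g, branching)]
            return defectives, tests, stages

        if __name__ == "__main__":
            defective_set = {5, 111, 987}
            oracle = lambda pool: any(e in defective_set for e in pool)
            found, tests, stages = multistage_disjoint(range(2000), d=3, s=3, is_positive=oracle)
            print(sorted(found), tests, stages)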

    Bounds for nonadaptive group tests to estimate the amount of defectives

    Group testing is the problem of finding d defectives in a set of n elements by asking carefully chosen subsets (pools) whether they contain defectives. Strategies are preferred that use both a small number of tests, close to the information-theoretic lower bound d log n, and a small constant number of stages, where the tests in every stage are done in parallel in order to save time. They should work even if d is not known in advance. In fact, one can succeed with O(d log n) queries in two stages if certain tests are randomized and a constant failure probability is allowed. An essential ingredient of such strategies is to get an estimate of d within a constant factor. This problem is also interesting in its own right. It can be solved with O(log n) randomized group tests of a certain type. We prove that Ω(log n) tests are also necessary if the elements for the pools are chosen independently. The proof builds upon an analysis of the influence of tests on the searcher's ability to distinguish between any two candidate numbers with a constant ratio. The next challenge is to get optimal constant factors in the O(log n) test number, depending on the prescribed error probability and the accuracy of the estimate of d. We give practical methods to derive upper-bound tradeoffs and conjecture that they are already close to optimal. One of them uses a linear programming formulation.
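
    One way to realize such an estimate with O(log n) randomized tests of the "independently chosen elements" type is sketched below (a hedged illustration, not the paper's exact strategy): pools with geometrically increasing inclusion rates are queried, and the rate at which pools turn mostly positive determines the estimate. The oracle `is_positive` and the repetition parameter `reps` are assumptions of this sketch.

        import math, random

        # Estimate d within a constant factor using about reps*log2(n) random
        # pools: each element joins a pool independently with rate p, and a
        # pool at rate p ~ 1/d is positive with constant probability.

        def estimate_defectives(n, is_positive, reps=9, rng=random):
            for k in range(int(math.log2(n)) + 1):
                p = min(1.0, 2 ** k / n)          # geometrically increasing inclusion rate
                positives = 0
                for _ in range(reps):             # repetitions control the error probability
                    pool = [i for i in range(n) if rng.random() < p]
                    positives += is_positive(pool)
                if positives > reps // 2:         # rate has reached the scale p ~ 1/d
                    return max(1, round(n / 2 ** k))
            return 0                              # all pools negative: no defectives detected

        if __name__ == "__main__":
            defective_set = set(random.sample(range(10_000), 40))
            oracle = lambda pool: any(e in defective_set for e in pool)
            print(estimate_defectives(10_000, oracle))   # roughly within a constant factor of 40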

    Bounds for nonadaptive group tests to estimate the amount of defectives

    The classical and well-studied group testing problem is to find d defectives in a set of n elements by group tests, which tell us for any chosen subset whether it contains defectives or not. Strategies are preferred that use both a small number of tests, close to the information-theoretic lower bound d log n, and a small constant number of stages, where the tests in every stage are done in parallel in order to save time. They should work even if d is completely unknown in advance. An essential ingredient of such competitive and minimal-adaptive group testing strategies is an estimate of d within a constant factor. More precisely, d shall be underestimated only with some given error probability and overestimated only by a constant factor, called the competitive ratio. The latter problem is also interesting in its own right. It can be solved with O(log n) randomized group tests of a certain type. In this paper we prove that Ω(log n) tests are really needed. The proof is based on an analysis of the influence of tests on the searcher's ability to distinguish between any two candidate numbers with a constant ratio. Once we know this lower bound, the next challenge is to get optimal constant factors in the O(log n) test number, depending on the desired error probability and competitive ratio. We give a method to derive upper bounds and conjecture that our particular strategy is already optimal.
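
    For intuition behind both the estimation strategy and the lower-bound argument, the following standard calculation (an illustration, not taken from the paper) shows how a random pool with inclusion rate p separates two candidate defective counts with a constant ratio c:

        \[
          \Pr\bigl[\text{pool negative} \mid d \text{ defectives}\bigr] = (1-p)^d \approx e^{-pd}.
        \]
        Choosing the rate $p = 1/d_1$ gives
        \[
          \Pr[\text{negative} \mid d_1] \approx e^{-1} \approx 0.37,
          \qquad
          \Pr[\text{negative} \mid d_2 = c\,d_1] \approx e^{-c},
        \]
        so a single such test has a constant advantage in telling $d_1$ from $d_2$, and repeating it over $O(\log n)$ geometrically spaced rates covers all possible scales of $d$.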