Nearly Optimal Sparse Group Testing
Group testing is the process of pooling arbitrary subsets from a set of $n$ items so as to identify, with a minimal number of tests, a "small" subset of $d$ defective items. In "classical" non-adaptive group testing, it is known that when $d$ is substantially smaller than $n$, $\Theta(d \log n)$ tests are both information-theoretically necessary and sufficient to guarantee recovery with high probability. Group testing schemes in the literature meeting this bound require most items to be tested $\Omega(\log n)$ times, and most tests to incorporate $\Omega(n/d)$ items.
Motivated by physical considerations, we study group testing models in which the testing procedure is constrained to be "sparse". Specifically, we consider (separately) scenarios in which (a) items are finitely divisible and hence may participate in at most $\gamma \in o(\log n)$ tests; or (b) tests are size-constrained to pool no more than $\rho \in o(n/d)$ items per test. For both scenarios we provide information-theoretic lower bounds on the number of tests required to guarantee high probability recovery. In both scenarios we provide both randomized constructions (under both $\epsilon$-error and zero-error reconstruction guarantees) and explicit constructions of designs with computationally efficient reconstruction algorithms that require a number of tests that is optimal up to constant or small polynomial factors in some regimes of $n$, $d$, $\gamma$, and $\rho$. The randomized design/reconstruction algorithm in the $\rho$-sized test scenario is universal -- independent of the value of $d$, as long as $\rho \in o(n/d)$. We also investigate the effect of unreliability/noise in test outcomes.
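As a concrete illustration of the size-constrained model in scenario (b), the sketch below simulates a randomized design in which every test pools exactly $\rho$ items and decodes with the classical COMP rule (any item appearing in a negative test is non-defective). It is a toy under assumed parameter values, not the paper's near-optimal construction.

```python
import numpy as np

# Toy simulation of rho-sized non-adaptive group testing with COMP decoding.
# All sizes are illustrative choices, not tuned to the paper's bounds.
rng = np.random.default_rng(0)
n, d = 1000, 5            # items and defectives
rho = 50                  # cap on items pooled per test (here rho << n/d)
t = 400                   # number of tests

design = np.zeros((t, n), dtype=bool)
for i in range(t):                                  # each test pools exactly rho items
    design[i, rng.choice(n, size=rho, replace=False)] = True

x = np.zeros(n, dtype=bool)
x[rng.choice(n, size=d, replace=False)] = True      # hidden defective set

outcomes = (design & x).any(axis=1)                 # positive iff the pool hits a defective

# COMP: any item appearing in a negative test is certainly non-defective;
# everything that survives is declared defective.
claimed = np.ones(n, dtype=bool)
for i in np.flatnonzero(~outcomes):
    claimed[design[i]] = False

print("exact recovery:", np.array_equal(claimed, x))
```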
Group testing: an information theory perspective
The group testing problem concerns discovering a small number of defective
items within a large population by performing tests on pools of items. A test
is positive if the pool contains at least one defective, and negative if it
contains no defectives. This is a sparse inference problem with a combinatorial
flavour, with applications in medical testing, biology, telecommunications,
information technology, data science, and more. In this monograph, we survey
recent developments in the group testing problem from an information-theoretic
perspective. We cover several related developments: efficient algorithms with
practical storage and computation requirements, achievability bounds for
optimal decoding methods, and algorithm-independent converse bounds. We assess
the theoretical guarantees not only in terms of scaling laws, but also in terms
of the constant factors, leading to the notion of the {\em rate} of group
testing, indicating the amount of information learned per test. Considering
both noiseless and noisy settings, we identify several regimes where existing
algorithms are provably optimal or near-optimal, as well as regimes where there
remains greater potential for improvement. In addition, we survey results
concerning a number of variations on the standard group testing problem,
including partial recovery criteria, adaptive algorithms with a limited number
of stages, constrained test designs, and sublinear-time algorithms.
Comment: Survey paper, 140 pages, 19 figures. To be published in Foundations and Trends in Communications and Information Theory.
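The rate mentioned above has a simple closed form: a scheme that reliably identifies $k$ defectives among $n$ items using $T$ tests has rate $\log_2 \binom{n}{k} / T$ bits per test, so rate 1 corresponds to the counting (information-theoretic) bound $T \geq \log_2 \binom{n}{k}$. A small worked computation with illustrative $n$ and $k$:

```python
from math import comb, log2, ceil

def counting_bound(n, k):
    """Minimum tests any noiseless scheme needs: log2 of the number of
    possible defective sets it must distinguish."""
    return log2(comb(n, k))

def rate(n, k, T):
    """Bits of information learned per test by a scheme using T tests."""
    return counting_bound(n, k) / T

n, k = 10_000, 10
T_min = ceil(counting_bound(n, k))        # ~ k * log2(n/k) when k << n
print(f"counting bound: {T_min} tests")
print(f"rate of a scheme using {2 * T_min} tests: {rate(n, k, 2 * T_min):.3f}")
```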
Noise-Resilient Group Testing: Limitations and Constructions
We study combinatorial group testing schemes for learning $d$-sparse Boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we take this barrier to our advantage and show that approximate reconstruction (within a satisfactory degree of approximation) allows us to break the information-theoretic lower bound of $\tilde{\Omega}(d^2 \log n)$ that is known for exact reconstruction of $d$-sparse vectors of length $n$ via non-adaptive measurements, by a multiplicative factor $\tilde{\Omega}(d)$.
Specifically, we give simple randomized constructions of non-adaptive measurement schemes, with $m = O(d \log n)$ measurements, that allow efficient reconstruction of $d$-sparse vectors up to $O(d)$ false positives even in the presence of $\delta m$ false positives and $O(m/d)$ false negatives within the measurement outcomes, for any constant $\delta < 1$. We show that, information-theoretically, none of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit constructions, in particular one matching the randomized trade-off but using $m = O(d^{1+o(1)} \log n)$ measurements. We also obtain explicit constructions that allow fast reconstruction in time $\mathrm{poly}(m)$, which would be sublinear in $n$ for sufficiently sparse vectors. The main tool used in our construction is the list-decoding view of randomness condensers and extractors.
Comment: Full version. A preliminary summary of this work appears (under the same title) in proceedings of the 17th International Symposium on Fundamentals of Computation Theory (FCT 2009).
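To make the approximate-reconstruction phenomenon concrete, here is a hedged sketch: a Bernoulli design with randomly flipped outcomes (standing in for the adversary) decoded by a simple positive-fraction threshold, in the spirit of noisy-COMP decoders -- not the condenser-based schemes of the paper, and with made-up parameters. The decoder typically recovers all defectives while admitting a handful of false positives, mirroring the trade-off described above.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 2000, 10, 800                    # length, sparsity, measurements (illustrative)

design = rng.random((m, n)) < 1.0 / d      # include each item in each test w.p. 1/d
x = np.zeros(n, dtype=bool)
x[rng.choice(n, size=d, replace=False)] = True

clean = (design & x).any(axis=1)           # noiseless disjunctive outcomes
flips = np.zeros(m, dtype=bool)
flips[rng.choice(m, size=m // 20, replace=False)] = True   # corrupt 5% of outcomes
observed = clean ^ flips

# Declare an item defective when most of the tests containing it are positive.
tests_per_item = design.sum(axis=0)
positive_hits = (design & observed[:, None]).sum(axis=0)
estimate = positive_hits > 0.8 * tests_per_item

print("false positives:", int((estimate & ~x).sum()),
      "false negatives:", int((~estimate & x).sum()))
```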
Algorithms to Exploit Data Sparsity
While data in the real world is very high-dimensional, it generally has some underlying structure; for instance, if we think of an image as a set of pixels with associated color values, most possible settings of color values correspond to something more like random noise than what we typically think of as a picture. With an appropriate transformation of basis, this underlying structure can often be converted into sparsity in the data, giving an equivalent representation of the data where the magnitude is large in only a few directions relative to the ambient dimension. This motivates a variety of theoretical questions around designing algorithms that can exploit this data sparsity to achieve better performance than would be possible naively, and in this thesis we tackle several such questions.
We first examine the question of simply approximating the level of sparsity of a signal under several different measurement models, a natural first step if the sparsity is to be exploited by other algorithms. Second, we look at a particular sparse signal recovery problem called non-adaptive probabilistic group testing, and investigate exactly how sparse the signal needs to be before the methods used for recovering sparse signals outperform those used for non-sparse signals. Third, we prove novel upper bounds on the number of measurements needed to recover a sparse signal in the universal one-bit compressed sensing model of sparse signal recovery. Fourth, we give some approximations of an information-theoretic quantity called the index coding rate of a network modeled by a graph, in the special case that the graph is sparse or otherwise highly structured. For each of the problems considered, we also discuss some remaining open questions and conjectures, as well as possible directions towards their solutions.
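The opening point about converting structure into sparsity via a change of basis can be seen in a few lines. The toy example below (a synthetic two-tone signal, not taken from the thesis) is dense in the sample basis but 2-sparse in the Fourier basis:

```python
import numpy as np

# A smooth signal: every sample is nonzero in the standard basis, yet only
# two Fourier coefficients carry essentially all of the energy.
t = np.linspace(0, 1, 512, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)

coeffs = np.fft.rfft(signal)
energy = np.abs(coeffs) ** 2
significant = energy >= 1e-6 * energy.sum()
print(f"{int(significant.sum())} of {len(coeffs)} Fourier coefficients "
      "carry essentially all of the energy")      # prints: 2 of 257
```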
Estimation of Sparsity via Simple Measurements
We consider several related problems of estimating the 'sparsity', or number of nonzero elements $d$, in a length-$n$ vector $\mathbf{x}$ by observing only $M \odot \mathbf{x}$, where $M$ is a predesigned test matrix independent of $\mathbf{x}$, and the operation $\odot$ varies between problems. We aim to provide a constant-factor approximation of the sparsity with a minimal number of measurements (rows of $M$). This framework generalizes multiple problems, such as estimation of sparsity in group testing and compressed sensing. We use techniques from coding theory as well as probabilistic methods to give a sufficient number of rows when the operation $\odot$ is logical OR (i.e., group testing), together with a nearly matching lower bound; both bounds are stated in terms of a known upper bound on the sparsity $d$. When instead the operation $\odot$ is multiplication over the reals $\mathbb{R}$ or a finite field $\mathbb{F}_q$, we determine the number of measurements that is necessary and sufficient in each case.
Comment: 13 pages; shortened version presented at ISIT 201