Group Testing with Probabilistic Tests: Theory, Design and Application
Identification of defective members of large populations has been widely
studied in the statistics community under the name of group testing. It
involves grouping subsets of items into different pools and detecting defective
members based on the set of test results obtained for each pool.
In a classical noiseless group testing setup, it is assumed that the sampling
procedure is fully known to the reconstruction algorithm, in the sense that the
presence of a defective member in a pool makes the test outcome of that
pool positive. However, this assumption is not always valid in cases of
interest. In particular, we consider the case where the defective
items in a pool can become independently inactive with a certain probability.
Hence, one may obtain a negative test result in a pool despite containing some
defective items. As a result, any sampling and reconstruction method should be
able to cope with two different types of uncertainty, i.e., the unknown set of
defective items and the partially unknown, probabilistic testing procedure.
In this work, motivated by the application of detecting infected people in
viral epidemics, we design non-adaptive sampling procedures that allow
successful identification of the defective items through a set of probabilistic
tests. Our design requires only a small number of tests to single out the
defective items. In particular, for a population of size and at most
defective items with activation probability , our results show that tests is sufficient if the sampling procedure should
work for all possible sets of defective items, while
tests is enough to be successful for any single set of defective items.
Moreover, we show that the defective members can be recovered using a simple
reconstruction algorithm with complexity of .
Comment: Full version of the conference paper "Compressed Sensing with
Probabilistic Measurements: A Group Testing Solution" appearing in
proceedings of the 47th Annual Allerton Conference on Communication, Control,
and Computing, 2009 (arXiv:0909.3508). To appear in IEEE Transactions on
Information Theory
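As a rough illustration of the probabilistic test model described in the abstract above, the following Python sketch simulates pools in which each defective item is active independently with probability p, and decodes with a per-item threshold rule. The function names, the Bernoulli pool design and the threshold decoder are assumptions made for illustration; they are not claimed to be the paper's construction or its reconstruction algorithm.

```python
import random

def random_design(n_items, n_tests, q, rng):
    """Hypothetical design: each item joins each pool independently with probability q."""
    return [[i for i in range(n_items) if rng.random() < q] for _ in range(n_tests)]

def run_tests(design, defectives, p, rng):
    """A pool is positive iff at least one of its defective items is active.
    Each defective item is active independently with probability p in each pool."""
    outcomes = []
    for pool in design:
        positive = any(i in defectives and rng.random() < p for i in pool)
        outcomes.append(1 if positive else 0)
    return outcomes

def threshold_decode(design, outcomes, n_items, threshold):
    """Declare an item defective if the fraction of its pools that tested
    positive is at least `threshold` (an assumed, illustrative rule)."""
    hits = [0] * n_items
    total = [0] * n_items
    for pool, y in zip(design, outcomes):
        for i in pool:
            total[i] += 1
            hits[i] += y
    return {i for i in range(n_items)
            if total[i] > 0 and hits[i] / total[i] >= threshold}
```

The decoder visits each (pool, item) membership once, so its cost grows at most with the number of tests times the number of items. With dilution (p below 1), the threshold would need to be lowered, since even a truly defective item sees some negative pools.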
Non-adaptive pooling strategies for detection of rare faulty items
We study non-adaptive pooling strategies for detection of rare faulty items.
Given a binary sparse N-dimensional signal x, how can one construct a sparse binary
MxN pooling matrix F such that the signal can be reconstructed from the
smallest possible number M of measurements y=Fx? We show that a very low number
of measurements suffices for a random spatially coupled design of pools F. Our
design might find application in genetic screening or compressed genotyping. We
show that our results are robust with respect to the uncertainty in the matrix
F when some elements are mistaken.
Comment: 5 pages
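To make the measurement model y=Fx from the abstract above concrete, here is a minimal Python sketch that computes pool counts and recovers x with a simple peeling rule. The peeling decoder and the small example matrix are assumptions chosen for illustration; the paper's spatially coupled design and its reconstruction method are not reproduced here.

```python
def measure(F, x):
    """Pool counts y = F x for a binary matrix F (list of rows) and binary signal x."""
    return [sum(f * xi for f, xi in zip(row, x)) for row in F]

def peel(F, y):
    """Illustrative peeling decoder (an assumption, not the paper's algorithm).
    A pool whose residual count is 0 clears all its undecided items; a pool
    whose residual equals its number of undecided items marks them all faulty.
    Items that cannot be resolved stay None."""
    n = len(F[0])
    est = [None] * n
    changed = True
    while changed:
        changed = False
        for row, yj in zip(F, y):
            members = [i for i, f in enumerate(row) if f]
            unknown = [i for i in members if est[i] is None]
            if not unknown:
                continue
            residual = yj - sum(est[i] for i in members if est[i] is not None)
            if residual == 0:
                for i in unknown:
                    est[i] = 0
                changed = True
            elif residual == len(unknown):
                for i in unknown:
                    est[i] = 1
                changed = True
    return est
```

For example, with F = [[1,1,0,0],[0,1,1,0],[0,0,1,1]] and x = [0,0,1,0], three measurements are enough for the peeling rule to recover all four entries.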
Compressed Genotyping
Significant volumes of knowledge have been accumulated in recent years
linking subtle genetic variations to a wide variety of medical disorders from
Cystic Fibrosis to mental retardation. Nevertheless, there are still great
challenges in applying this knowledge routinely in the clinic, largely due to
the relatively tedious and expensive process of DNA sequencing. Since the
genetic polymorphisms that underlie these disorders are relatively rare in the
human population, the presence or absence of a disease-linked polymorphism can
be thought of as a sparse signal. Using methods and ideas from compressed
sensing and group testing, we have developed a cost-effective genotyping
protocol. In particular, we have adapted our scheme to a recently developed
class of high throughput DNA sequencing technologies, and assembled a
mathematical framework that has some important distinctions from 'traditional'
compressed sensing ideas in order to address different biological and technical
constraints.
Comment: Submitted to IEEE Transactions on Information Theory - Special Issue
on Molecular Biology and Neuroscience
Boolean Compressed Sensing and Noisy Group Testing
The fundamental task of group testing is to recover a small distinguished
subset of items from a large population while efficiently reducing the total
number of tests (measurements). The key contribution of this paper is in
adopting a new information-theoretic perspective on group testing problems. We
formulate the group testing problem as a channel coding/decoding problem and
derive a single-letter characterization for the total number of tests used to
identify the defective set. Although the focus of this paper is primarily on
group testing, our main result is generally applicable to other compressive
sensing models.
The single letter characterization is shown to be order-wise tight for many
interesting noisy group testing scenarios. Specifically, we consider an
additive Bernoulli() noise model where we show that, for items and
defectives, the number of tests is for arbitrarily
small average error probability and for a worst case
error criterion. We also consider dilution effects whereby a defective item in
a positive pool might get diluted with probability and potentially missed.
In this case, it is shown that is and
for the average and the worst case error
criteria, respectively. Furthermore, our bounds allow us to verify existing
known bounds for noiseless group testing including the deterministic noise-free
case and approximate reconstruction with bounded distortion. Our proof of
achievability is based on random coding and the analysis of a Maximum
Likelihood Detector, and our information theoretic lower bound is based on
Fano's inequality.
Comment: In this revision: reorganized the paper, added citations to related
work, and fixed some bugs
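The two noise models named in the abstract above (additive Bernoulli noise and dilution) can be sketched for a single pool as follows. This is a minimal Python illustration; the function names and the parameter names q (false-positive probability) and u (dilution probability) are assumptions, since the abstract's own symbols did not survive in this listing.

```python
import random

def clean_outcome(pool, defectives):
    """Noiseless Boolean test: 1 iff the pool contains at least one defective."""
    return int(any(i in defectives for i in pool))

def additive_outcome(pool, defectives, q, rng):
    """Additive noise: OR the clean outcome with a Bernoulli(q) bit,
    so errors can only be false positives."""
    return int(clean_outcome(pool, defectives) or rng.random() < q)

def dilution_outcome(pool, defectives, u, rng):
    """Dilution: each defective in the pool is independently dropped with
    probability u, so errors can only be false negatives."""
    survivors = [i for i in pool if i in defectives and rng.random() >= u]
    return int(len(survivors) > 0)
```

Setting q or u to 0 recovers the noiseless test, which makes the two functions easy to sanity-check before running larger simulations.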
On Finding a Subset of Healthy Individuals from a Large Population
In this paper, we derive mutual information based upper and lower bounds on
the number of nonadaptive group tests required to identify a given number of
"non defective" items from a large population containing a small number of
"defective" items. We show that a reduction in the number of tests is
achievable compared to the approach of first identifying all the defective
items and then picking the required number of non-defective items from the
complement set. In the asymptotic regime with the population size , to identify non-defective items out of a population
containing defective items, when the tests are reliable, our results show
that measurements are
sufficient, where is a constant independent of and , and
is a bounded function of and . Further, in the nonadaptive group
testing setup, we obtain rigorous upper and lower bounds on the number of tests
under both dilution and additive noise models. Our results are derived using a
general sparse signal model, by virtue of which, they are also applicable to
other important sparse signal based applications such as compressive sensing.
Comment: 32 pages, 2 figures, 3 tables, revised version of a paper submitted
to IEEE Trans. Inf. Theory
Compressed Sensing with Probabilistic Measurements: A Group Testing Solution
Detection of defective members of large populations has been widely studied
in the statistics community under the name "group testing", a problem which
dates back to World War II when it was suggested for syphilis screening. There
the main interest is to identify a small number of infected people among a
large population using collective samples. In viral epidemics, one way to
acquire collective samples is by sending agents inside the population. While in
classical group testing, it is assumed that the sampling procedure is fully
known to the reconstruction algorithm, in this work we assume that the decoder
possesses only partial knowledge about the sampling process. This assumption is
justified by the observation that, in a viral illness, there is a chance
that an agent remains healthy despite having contact with an infected person.
Therefore, the reconstruction method has to cope with two different types of
uncertainty; namely, identification of the infected population and the
partially unknown sampling procedure.
In this work, by using a natural probabilistic model for "viral infections",
we design non-adaptive sampling procedures that allow successful identification
of the infected population with overwhelming probability 1-o(1). We propose
both probabilistic and explicit design procedures that require a "small" number
of agents to single out the infected individuals. More precisely, for a
contamination probability p, the number of agents required by the probabilistic
and explicit designs for identification of up to k infected members is bounded
by m = O(k^2 (log n)/p^2) and m = O(k^2 (log n)^2 /p^2), respectively. In both
cases, a simple decoder is able to successfully identify the infected
population in time O(mn).
Comment: In Proceedings of the Forty-Seventh Annual Allerton Conference on
Communication, Control, and Computing
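The two bounds quoted in the abstract above, m = O(k^2 (log n)/p^2) and m = O(k^2 (log n)^2 /p^2), can be compared numerically with the short Python sketch below. The hidden constants are not given in the abstract, so the functions return only the scaling terms (constant assumed to be 1, natural logarithm assumed); the function names are hypothetical.

```python
import math

def probabilistic_scaling(n, k, p):
    """Scaling term k^2 * log(n) / p^2 of the probabilistic design's bound."""
    return k ** 2 * math.log(n) / p ** 2

def explicit_scaling(n, k, p):
    """Scaling term k^2 * log(n)^2 / p^2 of the explicit design's bound."""
    return k ** 2 * math.log(n) ** 2 / p ** 2

if __name__ == "__main__":
    n, k = 10_000, 5
    for p in (1.0, 0.5, 0.25):
        print(p, probabilistic_scaling(n, k, p), explicit_scaling(n, k, p))
```

Two things follow directly from the formulas: the explicit design pays an extra factor of log n over the probabilistic one, and halving the contamination probability p quadruples both terms.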