Learning Immune-Defectives Graph through Group Tests
This paper deals with an abstraction of a unified problem of drug discovery
and pathogen identification. Pathogen identification involves identification of
disease-causing biomolecules. Drug discovery involves finding chemical
compounds, called lead compounds, that bind to pathogenic proteins and
eventually inhibit the function of the protein. In this paper, the lead
compounds are abstracted as inhibitors, pathogenic proteins as defectives, and
the mixture of "ineffective" chemical compounds and non-pathogenic proteins as
normal items. A defective could be immune to the presence of an inhibitor in a
test. So, a test containing a defective is positive iff it does not contain its
"associated" inhibitor. The goal of this paper is to identify the defectives,
inhibitors, and their "associations" with high probability, or in other words,
learn the Immune Defectives Graph (IDG) efficiently through group tests. We
propose a probabilistic non-adaptive pooling design, a probabilistic two-stage
adaptive pooling design and decoding algorithms for learning the IDG. For the
two-stage adaptive-pooling design, we show that the sample complexity of the
number of tests required to guarantee recovery of the inhibitors, defectives,
and their associations with high probability, i.e., the upper bound, exceeds
the proposed lower bound by a logarithmic multiplicative factor in the number
of items. For the non-adaptive pooling design too, we show that the upper bound
exceeds the proposed lower bound by at most a logarithmic multiplicative factor
in the number of items.
Comment: Double column, 17 pages. Updated with tighter lower bounds and other
minor edits
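The outcome rule described in this abstract can be illustrated with a minimal sketch. It is not the paper's pooling design or decoder. It assumes that the IDG is stored as a dictionary from each defective to its set of associated inhibitors, and that a pool with several defectives is positive when at least one of them has no associated inhibitor in the pool. The name `idg_test_outcome` and the item labels are hypothetical.

```python
def idg_test_outcome(pool, associations):
    """Return True (positive) if some defective in the pool is uninhibited.

    associations: hypothetical representation of the IDG, mapping each
    defective to the set of inhibitors associated with it.
    """
    pool = set(pool)
    for defective, inhibitors in associations.items():
        # A defective triggers a positive outcome only when none of its
        # associated inhibitors is present in the same pool.
        if defective in pool and not (inhibitors & pool):
            return True
    return False


# Illustrative graph: d1 is inhibited by i1; d2 has no associated inhibitor.
example_graph = {"d1": {"i1"}, "d2": set()}
print(idg_test_outcome(["d1", "x"], example_graph))   # d1 present, i1 absent
print(idg_test_outcome(["d1", "i1"], example_graph))  # d1 masked by i1
```

Normal items such as `x` never affect the outcome in this sketch, which matches the abstract's description of them as neither defectives nor inhibitors.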
Generalized Group Testing
In the problem of classical group testing one aims to identify a small subset
of diseased individuals/defective items in a large population. This process is
based on a minimal number of suitably-designed group
tests on subsets of items, where the test outcome is positive iff the given
test contains at least one defective item. Motivated by physical
considerations, we consider a generalized setting that includes as special
cases multiple other group-testing-like models in the literature. In our
setting, which subsumes as special cases a variety of noiseless and noisy
group-testing models in the literature, the test outcome is positive with
a probability determined by the number of defectives tested in a pool through
an arbitrary monotonically increasing (stochastic) test function.
Our main contributions are as follows.
1. We present a non-adaptive scheme that identifies all defective items with a
specified probability. The number of tests it requires is bounded in terms of a
suitably defined "sensitivity parameter" of the test function; this parameter
admits a general upper bound, but may be substantially smaller for many test
functions.
2. We argue that any testing scheme (including adaptive schemes) needs at
least a number of tests governed by a suitably defined "concentration
parameter" of the test function to ensure reliable recovery.
3. We prove that for a variety of
sparse-recovery group-testing models in the literature, and for any other test function
Group testing: an information theory perspective
The group testing problem concerns discovering a small number of defective
items within a large population by performing tests on pools of items. A test
is positive if the pool contains at least one defective, and negative if it
contains no defectives. This is a sparse inference problem with a combinatorial
flavour, with applications in medical testing, biology, telecommunications,
information technology, data science, and more. In this monograph, we survey
recent developments in the group testing problem from an information-theoretic
perspective. We cover several related developments: efficient algorithms with
practical storage and computation requirements, achievability bounds for
optimal decoding methods, and algorithm-independent converse bounds. We assess
the theoretical guarantees not only in terms of scaling laws, but also in terms
of the constant factors, leading to the notion of the rate of group
testing, indicating the amount of information learned per test. Considering
both noiseless and noisy settings, we identify several regimes where existing
algorithms are provably optimal or near-optimal, as well as regimes where there
remains greater potential for improvement. In addition, we survey results
concerning a number of variations on the standard group testing problem,
including partial recovery criteria, adaptive algorithms with a limited number
of stages, constrained test designs, and sublinear-time algorithms.
Comment: Survey paper, 140 pages, 19 figures. To be published in Foundations
and Trends in Communications and Information Theory