20 research outputs found
Noise-Resilient Group Testing: Limitations and Constructions
We study combinatorial group testing schemes for learning -sparse Boolean
vectors using highly unreliable disjunctive measurements. We consider an
adversarial noise model that only limits the number of false observations, and
show that any noise-resilient scheme in this model can only approximately
reconstruct the sparse vector. On the positive side, we take this barrier to
our advantage and show that approximate reconstruction (within a satisfactory
degree of approximation) allows us to break the information theoretic lower
bound of that is known for exact reconstruction of
-sparse vectors of length via non-adaptive measurements, by a
multiplicative factor .
Specifically, we give simple randomized constructions of non-adaptive
measurement schemes, with measurements, that allow efficient
reconstruction of -sparse vectors up to false positives even in the
presence of false positives and false negatives within the
measurement outcomes, for any constant . We show that, information
theoretically, none of these parameters can be substantially improved without
dramatically affecting the others. Furthermore, we obtain several explicit
constructions, in particular one matching the randomized trade-off but using measurements. We also obtain explicit constructions
that allow fast reconstruction in time \poly(m), which would be sublinear in
for sufficiently sparse vectors. The main tool used in our construction is
the list-decoding view of randomness condensers and extractors.
Comment: Full version. A preliminary summary of this work appears (under the
same title) in proceedings of the 17th International Symposium on
Fundamentals of Computation Theory (FCT 2009).
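As a rough illustration of approximate reconstruction when some outcomes are false negatives, the following Python sketch keeps every item that appears in at most a fixed number of negative tests. The bit-pattern pools, the names `bit_pools` and `decode_threshold`, and the threshold parameter are assumptions made for this example; this is not the construction or the decoder from the paper.

```python
def bit_pools(num_bits):
    """Hypothetical toy design: one pool per (bit position, bit value) pair."""
    items = range(2 ** num_bits)
    return [{i for i in items if ((i >> b) & 1) == v}
            for b in range(num_bits) for v in (0, 1)]


def decode_threshold(pools, outcomes, n, max_false_negatives):
    """Keep each item that lies in at most `max_false_negatives` negative tests."""
    kept = []
    for item in range(n):
        violations = sum(1 for pool, positive in zip(pools, outcomes)
                         if item in pool and not positive)
        if violations <= max_false_negatives:
            kept.append(item)
    return kept


pools = bit_pools(3)                      # 6 tests over 8 items
clean = [bool(pool & {5}) for pool in pools]  # item 5 is the only defective
noisy = list(clean)
noisy[1] = False                          # one false negative in the outcomes
print(decode_threshold(pools, noisy, 8, max_false_negatives=1))  # [4, 5]
```

In this toy run the true defective (item 5) is still reported, together with one false positive (item 4), which mirrors the trade-off between noise tolerance and exactness described in the abstract.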
Efficiently decodable non-adaptive group testing
We consider the following "efficiently decodable" non-adaptive
group testing problem. There is an unknown string
$x \in \{0,1\}^n$ with at most d ones in it. We are allowed to test
any subset $S \subseteq [n]$ of the indices. The answer to the test
tells whether $x_i = 0$ for all $i \in S$ or not. The objective
is to design as few tests as possible (say, t tests) such that
x can be identified as fast as possible (say, poly(t)-time).
Efficiently decodable non-adaptive group testing has applications
in many areas, including data stream algorithms and
data forensics.
A non-adaptive group testing strategy can be represented
by a t x n matrix, which is the stacking of all the
characteristic vectors of the tests. It is well-known that if
this matrix is d-disjunct, then any test outcome corresponds
uniquely to an unknown input string. Furthermore, we know
how to construct d-disjunct matrices with $t = O(d^2 \log n)$
efficiently. However, these matrices so far only allow for a
"decoding" time of O(nt), which can be exponentially larger
than poly(t) for relatively small values of d.
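The O(nt) decoding mentioned above can be sketched in Python as follows: every item that appears in some negative test is ruled out, and whatever remains is reported. The grid design (rows and columns of a 3 x 3 arrangement as pools) and the function names are assumptions chosen for this example only; the design happens to be 1-disjunct, so a single defective is recovered exactly.

```python
def run_tests(pools, defectives):
    """Noiseless outcomes: a test is positive iff its pool contains a defective."""
    return [bool(pool & defectives) for pool in pools]


def decode_naive(pools, outcomes, n):
    """Rule out every item seen in a negative test; costs O(n * t) set work."""
    candidates = set(range(n))
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates -= pool
    return sorted(candidates)


# Hypothetical toy design: 9 items on a 3 x 3 grid, one pool per row and column.
rows = [{3 * r + c for c in range(3)} for r in range(3)]
cols = [{3 * r + c for r in range(3)} for c in range(3)]
pools = rows + cols

outcomes = run_tests(pools, {4})
print(decode_naive(pools, outcomes, 9))  # [4]
```

The decoder touches every item for every negative test, which is why its running time grows with n rather than with the number of tests t.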
This paper presents a randomness efficient construction
of d-disjunct matrices with $t = O(d^2 \log n)$ that can be decoded
in time $\mathrm{poly}(d) \cdot t \log^2 t + O(t^2)$. To the best of our
knowledge, this is the first result that achieves an efficient decoding
time and matches the best known $O(d^2 \log n)$ bound
on the number of tests. We also derandomize the construction,
which results in a polynomial time deterministic construction
of such matrices when $d = O(\log n / \log \log n)$.
A crucial building block in our construction is the
notion of (d,l)-list disjunct matrices, which represent the
more general "list group testing" problem whose goal is to
output fewer than d + l positions in x, including all the (at
most d) positions that have a one in them. List disjunct
matrices turn out to be interesting objects in their own right
and were also considered independently by [Cheraghchi,
FCT 2009]. We present connections between list disjunct
matrices, expanders, dispersers and disjunct matrices. List
disjunct matrices have applications in constructing (d,l)-
sparsity separator structures [Ganguly, ISAAC 2008] and in
constructing tolerant testers for Reed-Solomon codes in the
data stream model.
Funding: David & Lucile Packard Foundation; Center for Massive Data Algorithmics (MADALGO); National Science Foundation (U.S.) (Grant CCF-0728645); National Science Foundation (U.S.) (Grant CCF-0347565); National Science Foundation (U.S.) (CAREER Award CCF-0844796)
Construction of Almost Disjunct Matrices for Group Testing
In a group testing scheme, a set of tests is designed to identify a
small number of defective items among a large set (of size ) of items.
In the non-adaptive scenario the set of tests has to be designed in one-shot.
In this setting, designing a testing scheme is equivalent to the construction
of a disjunct matrix, an matrix where the union of supports
of any columns does not contain the support of any other column. In
principle, one wants to have such a matrix with minimum possible number of
rows (tests). One of the main ways of constructing disjunct matrices relies on
constant weight error-correcting codes and their minimum
distance. In this paper, we consider a relaxed definition of a disjunct matrix
known as an almost disjunct matrix. This concept is also studied under the
name of weakly separated design in the literature. The relaxed
definition allows one to come up with group testing schemes where a
close-to-one fraction of all possible sets of defective items are identifiable.
Our main contribution is twofold. First, we go beyond the minimum distance
analysis and connect the average distance of a constant weight code to
the parameters of an almost disjunct matrix constructed from it. Our second
contribution is to explicitly construct almost disjunct matrices based on our
average distance analysis, that have much smaller number of rows than any
previous explicit construction of disjunct matrices. The parameters of our
construction can be varied to cover a large range of relations for and .
Comment: 15 pages
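The disjunctness condition above can be checked by brute force on small instances. The Python sketch below is an illustration only: the function names are hypothetical, the example columns come from a toy 3 x 3 grid design rather than from any code-based construction, and `identifiable_fraction` encodes one plausible reading of the relaxed ("almost disjunct") requirement, namely the fraction of size-d sets whose union of supports covers no other column.

```python
from itertools import combinations


def is_d_disjunct(columns, d):
    """True if no column support lies inside the union of any d other columns."""
    for j, col in enumerate(columns):
        others = [c for k, c in enumerate(columns) if k != j]
        for group in combinations(others, d):
            if col <= set().union(*group):
                return False
    return True


def identifiable_fraction(columns, d):
    """Fraction of size-d column sets whose union covers no column outside the set."""
    n = len(columns)
    good = total = 0
    for subset in combinations(range(n), d):
        union = set().union(*(columns[i] for i in subset))
        total += 1
        if not any(columns[j] <= union for j in range(n) if j not in subset):
            good += 1
    return good / total


# Hypothetical toy matrix: item i is tested in row-pool i // 3 and column-pool 3 + i % 3.
columns = [{i // 3, 3 + i % 3} for i in range(9)]
print(is_d_disjunct(columns, 1))          # True
print(is_d_disjunct(columns, 2))          # False
print(identifiable_fraction(columns, 2))  # 0.5
```

The toy matrix is 1-disjunct but not 2-disjunct, yet half of all pairs of defectives would still be identified, which is the kind of gap between worst-case and typical behaviour that the relaxed definition is meant to exploit.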
Applications of Derandomization Theory in Coding
Randomized techniques play a fundamental role in theoretical computer science
and discrete mathematics, in particular for the design of efficient algorithms
and construction of combinatorial objects. The basic goal in derandomization
theory is to eliminate or reduce the need for randomness in such randomized
constructions. In this thesis, we explore some applications of the fundamental
notions in derandomization theory to problems outside the core of theoretical
computer science, and in particular, certain problems related to coding theory.
First, we consider the wiretap channel problem which involves a communication
system in which an intruder can eavesdrop on a limited portion of the
transmissions, and construct efficient and information-theoretically optimal
communication protocols for this model. Then we consider the combinatorial
group testing problem. In this classical problem, one aims to determine a set
of defective items within a large population by asking a number of queries,
where each query reveals whether a defective item is present within a specified
group of items. We use randomness condensers to explicitly construct optimal,
or nearly optimal, group testing schemes for a setting where the query outcomes
can be highly unreliable, as well as the threshold model where a query returns
positive if the number of defectives passes a certain threshold. Finally, we
design ensembles of error-correcting codes that achieve the
information-theoretic capacity of a large class of communication channels, and
then use the obtained ensembles for construction of explicit capacity achieving
codes.
[This is a shortened version of the actual abstract in the thesis.]
Comment: EPFL PhD Thesis
A single-photon sampling architecture for solid-state imaging
Advances in solid-state technology have enabled the development of silicon
photomultiplier sensor arrays capable of sensing individual photons. Combined
with high-frequency time-to-digital converters (TDCs), this technology opens up
the prospect of sensors capable of recording with high accuracy both the time
and location of each detected photon. Such a capability could lead to
significant improvements in imaging accuracy, especially for applications
operating with low photon fluxes such as LiDAR and positron emission
tomography.
The demands placed on on-chip readout circuitry impose stringent trade-offs
between fill factor and spatio-temporal resolution, causing many contemporary
designs to severely underutilize the technology's full potential. Concentrating
on the low photon flux setting, this paper leverages results from group testing
and proposes an architecture for a highly efficient readout of pixels using
only a small number of TDCs, thereby also reducing both cost and power
consumption. The design relies on a multiplexing technique based on binary
interconnection matrices. We provide optimized instances of these matrices for
various sensor parameters and give explicit upper and lower bounds on the
number of TDCs required to uniquely decode a given maximum number of
simultaneous photon arrivals.
To illustrate the strength of the proposed architecture, we note a typical
digitization result of a 120x120 photodiode sensor on a 30 µm x 30 µm pitch with
a 40 ps time resolution and an estimated fill factor of approximately 70%, using
only 161 TDCs. The design guarantees registration and unique recovery of up to
4 simultaneous photon arrivals using a fast decoding algorithm. In a series of
realistic simulations of scintillation events in clinical positron emission
tomography the design was able to recover the spatio-temporal location of 98.6%
of all photons that caused pixel firings.
Comment: 24 pages, 3 figures, 5 tables
GROTESQUE: Noisy Group Testing (Quick and Efficient)
Group-testing refers to the problem of identifying (with high probability) a
(small) subset of D defectives from a (large) set of N items via a "small"
number of "pooled" tests. For ease of presentation in this work we focus on the
regime when $D = \mathcal{O}(N^{1-\gap})$ for some $\gap > 0$. The tests may be
noiseless or noisy, and the testing procedure may be adaptive (the pool
defining a test may depend on the outcome of a previous test), or non-adaptive
(each test is performed independent of the outcome of other tests). A rich body
of literature demonstrates that $\Theta(D\log(N))$ tests are
information-theoretically necessary and sufficient for the group-testing
problem, and provides algorithms that achieve this performance. However, it is
only recently that reconstruction algorithms with computational complexity that
is sub-linear in N have started being investigated (recent work by
\cite{GurI:04,IndN:10, NgoP:11} gave some of the first such algorithms). In the
scenario with adaptive tests with noisy outcomes, we present the first scheme
that is simultaneously order-optimal (up to small constant factors) in both the
number of tests and the decoding complexity ($\mathcal{O}(D\log(N))$ in both the
performance metrics). The total number of stages of our adaptive algorithm is
"small" ($\mathcal{O}(\log(D))$). Similarly, in the scenario with non-adaptive tests
with noisy outcomes, we present the first scheme that is simultaneously
near-optimal in both the number of tests and the decoding complexity (via an
algorithm that requires $\mathcal{O}(D\log(D)\log(N))$ tests and has a decoding
complexity of {}). Finally, we present an
adaptive algorithm that only requires 2 stages, and for which both the number
of tests and the decoding complexity scale as {}. For all three settings the probability of error of our
algorithms scales as $\mathcal{O}(1/\mathrm{poly}(D))$.
Comment: 26 pages, 5 figures
Symmetric-key Corruption Detection : When XOR-MACs Meet Combinatorial Group Testing
We study a class of MACs, which we call corruption detectable MACs, that can not only check the integrity of the whole message but also identify the part of the message that is corrupted.
It can be seen as an application of classical Combinatorial Group Testing (CGT) to message authentication.
However, previous work on this application has an inherent limitation in communication cost.
We present a novel approach that combines CGT with a class of linear MACs (XOR-MAC) to break this limit. Our proposal, XOR-GTM, has a significantly smaller communication cost than any previous scheme while keeping the same corruption detection capability. Our numerical examples for a storage application show a reduction in communication by a factor of around 15 to 70 compared with previous schemes.
XOR-GTM is parallelizable and is as efficient as standard MACs.
We prove that XOR-GTM is secure under standard pseudorandomness assumptions.
Noisy Non-Adaptive Group Testing: A (Near-)Definite Defectives Approach
The group testing problem consists of determining a small set of defective
items from a larger set of items based on a number of possibly-noisy tests, and
is relevant in applications such as medical testing, communication protocols,
pattern matching, and many more. We study the noisy version of the problem,
where the output of each standard noiseless group test is subject to
independent noise, corresponding to passing the noiseless result through a
binary channel. We introduce a class of algorithms that we refer to as
Near-Definite Defectives (NDD), and study bounds on the required number of
tests for vanishing error probability under Bernoulli random test designs. In
addition, we study algorithm-independent converse results, giving lower bounds
on the required number of tests under Bernoulli test designs. Under reverse
Z-channel noise, the achievable rates and converse results match in a broad
range of sparsity regimes, and under Z-channel noise, the two match in a
narrower range of dense/low-noise regimes. We observe that although these two
channels have the same Shannon capacity when viewed as a communication channel,
they can behave quite differently when it comes to group testing. Finally, we
extend our analysis of these noise models to the symmetric noise model, and
show improvements over the best known existing bounds in broad scaling regimes.
Comment: Submitted to IEEE Transactions on Information Theory
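The noise setting described above can be simulated with a short Python sketch: a Bernoulli random test design, followed by independent one-sided or symmetric flips of the noiseless outcomes. The function names, the parameters `p_false_neg` and `p_false_pos`, and the numerical values are assumptions for illustration; the abstract does not specify which flip direction corresponds to which named channel, and the NDD decoder itself is not reproduced here.

```python
import random


def bernoulli_design(n, t, p, rng):
    """Each of the t tests includes each of the n items independently with probability p."""
    return [{i for i in range(n) if rng.random() < p} for _ in range(t)]


def noisy_outcomes(pools, defectives, rng, p_false_neg=0.0, p_false_pos=0.0):
    """Pass each noiseless outcome through a binary channel with the given flip rates."""
    outcomes = []
    for pool in pools:
        positive = bool(pool & defectives)
        if positive and rng.random() < p_false_neg:
            positive = False          # a positive test reads as negative
        elif not positive and rng.random() < p_false_pos:
            positive = True           # a negative test reads as positive
        outcomes.append(positive)
    return outcomes


rng = random.Random(0)                # fixed seed so runs are repeatable
pools = bernoulli_design(n=50, t=40, p=0.2, rng=rng)
defectives = {3, 17}
one_sided = noisy_outcomes(pools, defectives, rng, p_false_neg=0.1)
symmetric = noisy_outcomes(pools, defectives, rng, p_false_neg=0.1, p_false_pos=0.1)
print(sum(one_sided), sum(symmetric))
```

Setting only one of the two flip rates to a nonzero value gives a one-sided channel, and setting both to the same value gives the symmetric model.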