15 research outputs found
Linear-time list recovery of high-rate expander codes
We show that expander codes, when properly instantiated, are high-rate list
recoverable codes with linear-time list recovery algorithms. List recoverable
codes have been useful recently in constructing efficiently list-decodable
codes, as well as explicit constructions of matrices for compressive sensing
and group testing. Previous list recoverable codes with linear-time decoding
algorithms have all had rate at most 1/2; in contrast, our codes can have rate
arbitrarily close to 1. We can plug our high-rate codes into a
construction of Meir (2014) to obtain linear-time list recoverable codes of
arbitrary rates, which approach the optimal trade-off between the number of
non-trivial lists provided and the rate of the code. While list-recovery is
interesting on its own, our primary motivation is applications to
list-decoding. A slight strengthening of our result would imply linear-time
and optimally list-decodable codes for all rates, and our work is a step in the
direction of solving this important problem.
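As a toy illustration of the list-recovery task itself (not of the expander-code construction above), the following sketch brute-forces all codewords of a small parity code that are consistent with per-position candidate lists; the code and the lists are made up for illustration:

```python
from itertools import product

def list_recover(codewords, input_lists):
    """Return every codeword whose i-th symbol lies in input_lists[i]."""
    return [c for c in codewords
            if all(c[i] in input_lists[i] for i in range(len(c)))]

# Toy code: all even-weight binary words of length 4 (a single parity check).
code = [w for w in product([0, 1], repeat=4) if sum(w) % 2 == 0]

# Each position of the received word comes with a small list of candidates.
lists = [{0, 1}, {0}, {1}, {0, 1}]
recovered = list_recover(code, lists)
# The output list contains exactly the codewords agreeing with every list.
```

Real list-recoverable codes do this consistency check in linear time rather than by enumeration; the point of the sketch is only the input/output behavior of list recovery.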
Noise-Resilient Group Testing with Order-Optimal Tests and Fast-and-Reliable Decoding
Group testing (GT) is the Boolean counterpart of compressed sensing and a
marketplace of new ideas for related problems such as cognitive radio and
heavy-hitter detection. A GT scheme is considered good if it is nonadaptive,
uses an order-optimal number of tests, resists noise, can be decoded in
near-optimal time, and makes nearly no mistakes. In this paper, we propose
"Gacha GT", an elementary, self-contained, and unified randomized scheme that,
for the first time, satisfies all criteria for a fairly large region of
parameters. Outside this parameter region, Gacha can be
specialized to outperform the state-of-the-art partial-recovery GTs,
exact-recovery GTs, and worst-case GTs.
The new idea that runs through this paper, using an analogy, is to ask every
person to break her multi-digit "phone number" into three shorter numbers and
write each of them, together with her "birthday", on its own sticky note. This
way, one can sort the sticky notes by birthday to reassemble the phone numbers.
This birthday--number code
and other coded constructions can be stacked like a multipartite graph pyramid.
Gacha's encoder will synthesize the test results from the bottom up; and
Gacha's decoder will reassemble the phone numbers from the top down.
Comment: 23 pages, 8 figures
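The sticky-note analogy can be made concrete in a few lines. Everything below (the birthdays, the phone numbers, and the explicit chunk index used to order the notes within one person) is hypothetical illustration, not Gacha's actual encoding:

```python
import random
from collections import defaultdict

# Toy "birthday--number" idea: each person splits her 9-digit phone number
# into three 3-digit chunks and tags every chunk with her (here unique)
# birthday; all tagged sticky notes land in one unordered pile.
people = {"0114": "555867530", "0230": "123456789"}  # birthday -> number

notes = []
for birthday, phone in people.items():
    chunks = [phone[i:i + 3] for i in range(0, 9, 3)]
    for idx, chunk in enumerate(chunks):
        notes.append((birthday, idx, chunk))  # one sticky note

random.shuffle(notes)  # the pile carries no order

# Sorting the notes by birthday groups each person's chunks together; the
# chunk index then restores their order within a person.
piles = defaultdict(list)
for birthday, idx, chunk in notes:
    piles[birthday].append((idx, chunk))

reassembled = {b: "".join(c for _, c in sorted(p)) for b, p in piles.items()}
```

After the sort-and-join step, `reassembled` equals the original `people` mapping, which is the "reassemble the phone numbers" step of the analogy.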
Collision Resistant Hashing for Paranoids: Dealing with Multiple Collisions
A collision resistant hash (CRH) function is one that compresses its input, yet it is hard to find a collision, i.e. two distinct inputs x ≠ y such that H(x) = H(y). Collision resistant hash functions are among the more useful cryptographic primitives, both in theory and in practice; two prominent applications are signature schemes and succinct zero-knowledge arguments.
In this work we consider a relaxation of the above requirement that we call Multi-CRH: a function for which it is hard to find several inputs x_1, ..., x_k that are all distinct yet hash to the same value. We show that for some of the major applications of CRH functions it is possible to replace them by the weaker notion of a Multi-CRH, albeit at the price of adding interaction: we show a statistically hiding commitment scheme with succinct interaction (the communication is much shorter than the committed string) that can be opened locally (without revealing the full string). This in turn can be used to provide succinct arguments for any statement. On the other hand, we show black-box separation results from standard CRH and a hierarchy of such Multi-CRHs.
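A toy sketch of why compression forces multi-collisions to exist in the first place (the function `toy_hash` below is a made-up stand-in, not a cryptographic hash; a Multi-CRH only asks that such collisions be hard to *find*):

```python
from collections import defaultdict

def toy_hash(x: int) -> int:
    """Hypothetical toy 'hash' compressing 8-bit inputs to 4 bits.

    Real CRH candidates are keyed and cryptographic; this stand-in only
    illustrates that compression guarantees multi-collisions exist."""
    return (x * 37 + 11) % 16

# By pigeonhole, mapping 256 inputs into 16 outputs forces some output to
# have at least ceil(256 / 16) = 16 preimages: a 16-way multi-collision.
buckets = defaultdict(list)
for x in range(256):
    buckets[toy_hash(x)].append(x)

k = max(len(v) for v in buckets.values())
# Finding these collisions is trivial here; a Multi-CRH demands that
# producing k distinct inputs with equal hashes be computationally hard.
```

The pigeonhole bound is exactly why the definition is about computational hardness of finding collisions, never about their nonexistence.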
Towards Optimal Approximate Streaming Pattern Matching by Matching Multiple Patterns in Multiple Streams
Recently, there has been a growing focus on solving approximate pattern matching problems in the streaming model. Of particular interest are the pattern matching with k-mismatches (KMM) problem and the pattern matching with w-wildcards (PMWC) problem. Motivated by reductions from these problems in the streaming model to the dictionary matching problem, this paper focuses on designing algorithms for the dictionary matching problem in the multi-stream model where there are several independent streams of data (as opposed to just one in the streaming model), and the memory complexity of an algorithm is expressed using two quantities: (1) a read-only shared memory storage area which is shared among all the streams, and (2) local stream memory that each stream stores separately.
In the dictionary matching problem in the multi-stream model the goal is to preprocess a dictionary D={P_1,P_2,...,P_d} of d=|D| patterns (strings with maximum length m over alphabet Sigma) into a data structure stored in shared memory, so that given multiple independent streaming texts (where characters arrive one at a time) the algorithm reports occurrences of patterns from D in each one of the texts as soon as they appear.
We design two efficient algorithms for the dictionary matching problem in the multi-stream model. The first algorithm works when all the patterns in D have the same length m and costs O(d log m) words in shared memory, O(log m log d) words in stream memory, and O(log m) time per character. The second algorithm works for general D, but the time cost per character becomes O(log m + log d log log d). We also demonstrate the usefulness of our first algorithm in solving both the KMM and PMWC problems in the streaming model. In particular, we obtain the first almost optimal (up to poly-log factors) algorithm for the PMWC problem in the streaming model. We also design a new algorithm for the KMM problem in the streaming model that, up to poly-log factors, has the same bounds as the most recent results that use different techniques. Moreover, for most inputs, our algorithm for KMM is significantly faster on average.
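For readers unfamiliar with dictionary matching, the following sketch uses the classical Aho-Corasick automaton as a baseline (not the paper's space-efficient multi-stream algorithms): the automaton plays the role of read-only shared memory, and each stream's local memory is just its current state.

```python
from collections import deque

class DictionaryMatcher:
    """Classical Aho-Corasick automaton, shared by all streams."""

    def __init__(self, patterns):
        self.goto = [{}]   # trie / automaton transitions
        self.fail = [0]    # failure links
        self.out = [[]]    # patterns recognized at each node
        for p in patterns:
            node = 0
            for ch in p:
                if ch not in self.goto[node]:
                    self.goto[node][ch] = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                node = self.goto[node][ch]
            self.out[node].append(p)
        queue = deque(self.goto[0].values())  # build failure links by BFS
        while queue:
            u = queue.popleft()
            for ch, v in self.goto[u].items():
                queue.append(v)
                f = self.fail[u]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[v] = self.goto[f].get(ch, 0)
                self.out[v] += self.out[self.fail[v]]  # inherit suffix matches

    def step(self, state, ch):
        """Advance one stream by one character; local memory is `state`."""
        while state and ch not in self.goto[state]:
            state = self.fail[state]
        state = self.goto[state].get(ch, 0)
        return state, self.out[state]

matcher = DictionaryMatcher(["he", "she", "hers"])
state, hits = 0, []
for i, ch in enumerate("ushers"):  # characters arrive one at a time
    state, found = matcher.step(state, ch)
    hits.extend((i, p) for p in found)  # occurrences reported immediately
```

Note the catch the paper addresses: this baseline's automaton can use Θ(dm) words of shared memory, far above the O(d log m) bound of the first algorithm.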
Applications of Derandomization Theory in Coding
Randomized techniques play a fundamental role in theoretical computer science
and discrete mathematics, in particular for the design of efficient algorithms
and construction of combinatorial objects. The basic goal in derandomization
theory is to eliminate or reduce the need for randomness in such randomized
constructions. In this thesis, we explore some applications of the fundamental
notions in derandomization theory to problems outside the core of theoretical
computer science, and in particular, certain problems related to coding theory.
First, we consider the wiretap channel problem, which involves a communication
system in which an intruder can eavesdrop on a limited portion of the
transmissions, and construct efficient and information-theoretically optimal
communication protocols for this model. Then we consider the combinatorial
group testing problem. In this classical problem, one aims to determine a set
of defective items within a large population by asking a number of queries,
where each query reveals whether a defective item is present within a specified
group of items. We use randomness condensers to explicitly construct optimal,
or nearly optimal, group testing schemes for a setting where the query outcomes
can be highly unreliable, as well as the threshold model, where a query returns
positive if the number of defectives passes a certain threshold. Finally, we
design ensembles of error-correcting codes that achieve the
information-theoretic capacity of a large class of communication channels, and
then use the obtained ensembles for construction of explicit capacity achieving
codes.
[This is a shortened version of the actual abstract in the thesis.]
Comment: EPFL PhD Thesis
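The combinatorial group testing setting described in this abstract can be illustrated with the simple COMP decoder (a standard baseline, not the thesis's condenser-based schemes): any item that appears in a pool whose test comes back negative cannot be defective.

```python
import random

random.seed(1)
n, d, tests = 200, 3, 60          # population, defectives, nonadaptive tests
defectives = set(random.sample(range(n), d))

# Each test pools a random subset of items; a noiseless query returns
# positive iff the pool contains at least one defective item.
pools = [{i for i in range(n) if random.random() < 1 / d}
         for _ in range(tests)]
outcomes = [bool(pool & defectives) for pool in pools]

# COMP decoding: eliminate every item seen in a negative pool.
candidates = set(range(n))
for pool, positive in zip(pools, outcomes):
    if not positive:
        candidates -= pool

# COMP never eliminates a true defective; with enough random pools it also
# rules out almost every non-defective item.
```

The thesis's point is to replace the random pools with explicit, condenser-based test matrices, and to decode reliably even when outcomes are noisy; the sketch above only fixes the model and the noiseless baseline.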