Lower bounds for constant query affine-invariant LCCs and LTCs
Affine-invariant codes are codes whose coordinates form a vector space over a
finite field and which are invariant under affine transformations of the
coordinate space. They form a natural, well-studied class of codes; they
include popular codes such as Reed-Muller and Reed-Solomon. A particularly
appealing feature of affine-invariant codes is that they seem well-suited to
admit local correctors and testers.
In this work, we give lower bounds on the length of locally correctable and
locally testable affine-invariant codes with constant query complexity. We show
that if a code C ⊆ Sigma^{K^n} is an r-query locally correctable code (LCC),
where K is a finite field and Sigma is a finite alphabet, then the number of
codewords in C is at most exp(O_{K,r}(n^{r-1})). Also, we show that if
C ⊆ Sigma^{K^n} is an r-query locally testable code (LTC), then the number of
codewords in C is at most exp(O_{K,r}(n^{r-2})). The dependence on n in these
bounds is tight for constant-query LCCs/LTCs, since Guo, Kopparty and Sudan
(ITCS '13) construct affine-invariant codes via lifting that have the same
asymptotic tradeoffs. Note that our result holds for non-linear codes, whereas
previously, Ben-Sasson and Sudan (RANDOM '11) assumed linearity to derive
similar results.
Our analysis uses higher-order Fourier analysis. In particular, we show that
the codewords corresponding to an affine-invariant LCC/LTC must be far from
each other with respect to the Gowers norm of an appropriate order. This then
allows us to bound the number of codewords, using known decomposition theorems
which approximate any bounded function in terms of a finite number of
low-degree non-classical polynomials, up to a small error in the Gowers norm.
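The Gowers norm at the heart of this argument can be made concrete with a small brute-force computation. The sketch below (an illustration, not the paper's machinery) evaluates the U^2 norm of a function on F_2^n by direct averaging, which is feasible only for tiny domains; the toy character `chi` is a hypothetical example.

```python
from itertools import product

def gowers_u2(f, n):
    """Brute-force U^2 Gowers norm of f: F_2^n -> {-1, +1}.

    ||f||_{U^2}^4 = E_{x,h1,h2} f(x) f(x+h1) f(x+h2) f(x+h1+h2),
    where + is coordinatewise XOR on F_2^n.
    """
    def add(a, b):
        return tuple(u ^ v for u, v in zip(a, b))

    pts = list(product([0, 1], repeat=n))
    total = sum(f(x) * f(add(x, h1)) * f(add(x, h2)) * f(add(add(x, h1), h2))
                for x in pts for h1 in pts for h2 in pts)
    return (total / len(pts) ** 3) ** 0.25

# A linear character chi(x) = (-1)^(x1 + x2) has U^2 norm exactly 1;
# functions far from all characters have small U^2 norm.
chi = lambda x: (-1) ** (x[0] ^ x[1])
print(gowers_u2(chi, 2))  # -> 1.0
```

Being "far in Gowers norm" in the abstract's sense refers to higher-order norms U^k, which replace the two shifts h1, h2 with k shifts in the same averaged product.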
Spanoids - An Abstraction of Spanning Structures, and a Barrier for LCCs
We introduce a simple logical inference structure we call a spanoid (generalizing the notion of a matroid), which captures well-studied problems in several areas. These include combinatorial geometry (point-line incidences), algebra (arrangements of hypersurfaces and ideals), statistical physics (bootstrap percolation), network theory (gossip / infection processes) and coding theory. We initiate a thorough investigation of spanoids, from computational and structural viewpoints, focusing on parameters relevant to the application areas above and, in particular, to questions regarding Locally Correctable Codes (LCCs).
One central parameter we study is the rank of a spanoid, extending the rank of a matroid and related to the dimension of codes. This leads to one main application of our work, establishing the first known barrier to improving the nearly 20-year-old bound of Katz-Trevisan (KT) on the dimension of LCCs. On the one hand, we prove that the KT bound (and its more recent refinements) holds for the much more general setting of spanoid rank. On the other hand, we show that there exist (random) spanoids whose rank matches these bounds. Thus, to significantly improve the known bounds one must step out of the spanoid framework.
Another parameter we explore is the functional rank of a spanoid, which captures the possibility of turning a given spanoid into an actual code. The question of the relationship between rank and functional rank is one of the main questions we raise, as it may reveal new avenues for constructing new LCCs (perhaps even matching the KT bound). As a first step, we develop an entropy relaxation of functional rank to create a small constant gap, and amplify it by tensoring to construct a spanoid whose functional rank is smaller than its rank by a polynomial factor. This is evidence that the entropy method we develop can prove polynomially better bounds than KT-type methods on the dimension of LCCs.
To facilitate the above results, we also develop some basic structural results on spanoids, including an equivalent formulation of spanoids as set systems and properties of spanoid products. We feel that, given these initial findings and their motivations, the abstract study of spanoids merits further investigation. We leave plenty of concrete open problems and directions.
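The inference-structure view of a spanoid lends itself to a short executable sketch. The reading below is mine, not the paper's formalism: a spanoid is given as rules "a set containing all of `premises` derives x", closure iterates the rules, and rank is taken as the smallest subset whose closure is the whole ground set (for matroids this recovers matroid rank). The toy rules are hypothetical.

```python
from itertools import combinations

def closure(rules, seed):
    """Close `seed` under rules: a rule (premises, x) lets any set
    containing all of `premises` also derive x."""
    span = set(seed)
    changed = True
    while changed:
        changed = False
        for premises, x in rules:
            if x not in span and premises <= span:
                span.add(x)
                changed = True
    return span

def spanoid_rank(rules, ground):
    """Smallest spanning set, by brute force (fine for toy examples)."""
    for k in range(len(ground) + 1):
        for T in combinations(sorted(ground), k):
            if closure(rules, T) == ground:
                return k
    return len(ground)

# Toy spanoid on {0,1,2,3}: {0,1} derives 2, and {1,2} derives 3.
rules = [({0, 1}, 2), ({1, 2}, 3)]
print(spanoid_rank(rules, {0, 1, 2, 3}))  # -> 2, witnessed by {0, 1}
```

The brute-force search is exponential; part of what makes the rank and functional rank interesting parameters is precisely that they resist such naive computation at scale.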
Lower Bounds for 2-Query LCCs over Large Alphabet
A locally correctable code (LCC) is an error correcting code that allows correction of an arbitrary coordinate of a corrupted codeword by querying only a few coordinates. We show that any 2-query locally correctable code C:{0,1}^k -> Sigma^n that can correct a constant fraction of corrupted symbols must have n >= exp(k/log|Sigma|) under the assumption that the LCC is zero-error. We say that an LCC is zero-error if there exists a non-adaptive corrector algorithm that succeeds with probability 1 when the input is an uncorrupted codeword. All known constructions of LCCs are zero-error.
Our result is tight up to constant factors in the exponent. The only previous lower bound on the length of 2-query LCCs over large alphabet was Omega((k/log|Sigma|)^2), due to Katz and Trevisan (STOC 2000). Our bound implies that zero-error LCCs cannot yield 2-server private information retrieval (PIR) schemes with sub-polynomial communication. Since there exists a 2-server PIR scheme with sub-polynomial communication (STOC 2015) based on a zero-error 2-query locally decodable code (LDC), we also obtain a separation between LDCs and LCCs over large alphabet.
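A concrete instance of the zero-error 2-query model is the textbook corrector for the Hadamard code. The sketch below is a toy version: for determinism it takes the majority over all shifts r rather than sampling a few at random, which is the only liberty taken with the standard scheme.

```python
from itertools import product

def hadamard_encode(x):
    """Position a in F_2^k of the codeword holds <x, a> mod 2."""
    return {a: sum(xi & ai for xi, ai in zip(x, a)) % 2
            for a in product([0, 1], repeat=len(x))}

def correct(word, a):
    """2-query correction of position a: for any shift r,
    word[r] + word[a + r] = <x, a> whenever both queried positions are
    uncorrupted, so a majority vote tolerates a small corruption rate.
    (Here we vote over every r; the real corrector samples a few r's.)"""
    votes = sum((word[r] + word[tuple(ai ^ ri for ai, ri in zip(a, r))]) % 2
                for r in word)
    return int(2 * votes > len(word))

x = (1, 0, 1)
clean = hadamard_encode(x)
word = dict(clean)
word[(1, 1, 0)] ^= 1                      # corrupt one of the 8 positions
assert all(correct(word, a) == clean[a] for a in clean)
print("all positions corrected")
```

Note that the corrector is non-adaptive (both query positions are chosen before seeing any answer) and succeeds with probability 1 on an uncorrupted codeword, i.e. it is zero-error in the abstract's sense.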
Any Errors in this Dissertation are Probably Fixable: Topics in Probability and Error Correcting Codes.
We study two problems in coding theory, list-decoding and local-decoding. We take a probabilistic approach to these problems, in contrast to more typical algebraic approaches.
In list-decoding, we settle two open problems about the list-decodability of some well-studied ensembles of codes. First, we show that random linear codes are optimally list-decodable, and second, we show that there exist Reed-Solomon codes which are (nearly) optimally list-decodable. Our approach uses high-dimensional probability. We extend this framework to apply to a large family of codes obtained through random operations.
In local-decoding, we use expander codes to construct locally correctable linear codes with rate approaching 1. Until recently, such codes were conjectured not to exist, and before this work the only known constructions relied on algebraic, rather than probabilistic and combinatorial, methods.
PhD dissertation in Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108844/1/wootters_1.pd
String Measures: Computational Complexity and Related Problems in Communication
Strings are fundamental objects in computer science. Modern applications such as text processing, bioinformatics, and distributed data storage systems often need to deal with very large strings. These applications motivated the study of the computational complexity of string-related problems, as well as a better understanding of edit operations on strings in general. In this thesis, we study several problems related to edit-type string measures and error correcting codes for edit errors, i.e., insertions and deletions.
The results presented in this thesis can be roughly partitioned into two parts. The first part is about the space complexity of computing or approximating string measures. We study three classical string measures: edit distance (ED), longest common subsequence (LCS), and longest increasing subsequence (LIS). Our first main result shows that all these three string measures can be approximated to within a 1+o(1) multiplicative factor using only polylog space in polynomial time. We further study ED and LCS in the asymmetric streaming model introduced by Saks and Seshadhri (SODA, 2013). The model can be viewed as an intermediate model between the random access model and the standard streaming model. In this model, one has streaming access to one of the input strings and random access to the other. For both ED and LCS, we present new algorithms as well as several space lower bounds in the asymmetric streaming model.
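For readers less familiar with the three measures, compact baseline implementations help fix the definitions: the textbook O(nm)-time, O(m)-space dynamic programs for ED and LCS, and the O(n log n) patience-sorting algorithm for LIS. These are the classical baselines whose space usage the thesis asks to push down (to polylog) while approximating; a minimal sketch:

```python
import bisect

def edit_distance(s, t):
    """Levenshtein distance via the standard rolling-row DP."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # match / substitute
        prev = cur
    return prev[-1]

def lcs(s, t):
    """Length of the longest common subsequence, rolling-row DP."""
    prev = [0] * (len(t) + 1)
    for a in s:
        cur = [0]
        for j, b in enumerate(t, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[-1]))
        prev = cur
    return prev[-1]

def lis(seq):
    """Length of the longest strictly increasing subsequence
    (patience sorting: tails[i] = smallest tail of an IS of length i+1)."""
    tails = []
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

print(edit_distance("kitten", "sitting"))   # -> 3
print(lcs("ABCBDAB", "BDCABA"))             # -> 4
print(lis([5, 1, 6, 2, 7, 3, 8]))           # -> 4
```

Even the rolling-row trick still uses space linear in one input, which is why the polylog-space approximation results and the asymmetric streaming model are nontrivial.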
The second part of our results is about locally decodable codes (LDCs) that can tolerate edit errors. LDCs are a class of error correcting codes that allow quick recovery of a message symbol by looking at only a few positions of the encoded message (codeword). LDCs for Hamming errors have been extensively studied, while arguably little is known about LDCs for edit errors. In this thesis, we present exponential lower bounds for LDCs that can tolerate edit errors. In particular, we show that 2-query linear LDCs for edit errors do not exist, and the codeword length of any constant-query LDC for edit errors must be exponential. These bounds exhibit a strict separation between Hamming errors and edit errors. We also introduce the notion of LDCs with randomized encoding, which can be viewed as a relaxation of standard LDCs. We give constructions of LDCs with randomized encoding that achieve significantly better rate-query tradeoffs.
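The qualitative gap between Hamming and edit errors behind this separation can be seen in one line of arithmetic: a substitution perturbs a single position, while a single deletion misaligns every later position. A toy illustration (the particular string is arbitrary):

```python
def mismatches(a, b):
    """Number of positions where two equal-length strings disagree."""
    return sum(x != y for x, y in zip(a, b))

w = "0110100110010110"                                # reference string
sub = w[:5] + ("1" if w[5] == "0" else "0") + w[6:]   # one substitution
dele = w[:5] + w[6:] + "?"                            # one deletion, padded

print(mismatches(w, sub))   # -> 1: a Hamming error touches one position
print(mismatches(w, dele))  # a deletion misaligns most later positions
```

A local decoder that probes a few fixed positions therefore sees mostly shifted symbols after even one deletion, which is the intuition (though of course not the proof) behind the exponential lower bounds above.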
Point Location and Active Learning: Learning Halfspaces Almost Optimally
Given a finite set X in R^d and a binary linear classifier
c: R^d -> {0,1}, how many queries of the form c(x) are required
to learn the label of every point in X? Known as point location,
this problem has inspired over 35 years of research in the pursuit of an
optimal algorithm. Building on the prior work of Kane, Lovett, and Moran (ICALP
2018), we provide the first nearly optimal solution: a randomized linear
decision tree whose depth improves on the previous best bound
of Ezra and Sharir (Discrete and Computational
Geometry, 2019). As a corollary, we also provide the first nearly optimal
algorithm for actively learning halfspaces in the membership query model. En
route to these results, we prove a novel characterization of Barthe's Theorem
(Inventiones Mathematicae, 1998) of independent interest. In particular, we
show that X may be transformed into approximate isotropic position if and
only if there exists no k-dimensional subspace containing more than a
k/d-fraction of X, and provide a similar characterization for exact
isotropic position.
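The subspace condition in this characterization (no subspace of dimension k should contain more than a k/d fraction of the points of X in R^d) can be checked directly for tiny point sets, since it suffices to test subspaces spanned by subsets of X. The brute-force sketch below is pure Python, exponential time, and purely illustrative.

```python
from itertools import chain, combinations

def rank(rows, eps=1e-9):
    """Rank of a list of vectors via Gaussian elimination."""
    rows = [list(map(float, r)) for r in rows]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(r, len(rows))
                    if abs(rows[i][c]) > eps), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and abs(rows[i][c]) > eps:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def subspace_condition(X):
    """True iff no proper subspace of dimension k contains more than a
    k/d fraction of X (spans of subsets of X are the only candidates:
    any violating subspace contains one)."""
    n, d = len(X), len(X[0])
    for S in chain.from_iterable(combinations(X, m) for m in range(1, n + 1)):
        k = rank(S)
        if k == d:
            continue
        inside = sum(rank(list(S) + [x]) == k for x in X)
        if inside * d > k * n:     # fraction inside exceeds k/d
            return False
    return True

print(subspace_condition([(1, 0), (0, 1)]))          # -> True
print(subspace_condition([(1, 0), (2, 0), (0, 1)]))  # -> False: the x-axis
# is a 1-dimensional subspace holding 2/3 > 1/2 of the points
```

The reduction to spans of subsets keeps the check finite: if some subspace V of dimension k holds too large a fraction of X, then the span of X ∩ V is a subspace of dimension at most k holding the same points, so it violates the condition too.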