238 research outputs found
Near-Linear Time Insertion-Deletion Codes and (1+ε)-Approximating Edit Distance via Indexing
We introduce fast-decodable indexing schemes for edit distance which can be
used to speed up edit distance computations to near-linear time if one of the
strings is indexed by an indexing string I. In particular, for every length n
and every ε > 0, one can in near-linear time construct a string I ∈ Σ'^n
with |Σ'| = O_ε(1), such that indexing any string S ∈ Σ^n, symbol-by-symbol,
with I results in a string S' ∈ Σ''^n, where Σ'' = Σ × Σ', for which edit
distance computations are easy, i.e., one can compute a (1+ε)-approximation of
the edit distance between S' and any other string in O(n poly(log n)) time.
Our indexing schemes can be used to improve the decoding complexity of
state-of-the-art error correcting codes for insertions and deletions. In
particular, they lead to near-linear time decoding algorithms for the
insertion-deletion codes of [Haeupler, Shahrasbi; STOC '17] and faster decoding
algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi,
Sudan; ICALP '18]. Interestingly, the latter codes are a crucial ingredient in
the construction of fast-decodable indexing schemes.
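For orientation, the baseline that such indexing schemes aim to beat is the textbook quadratic-time dynamic program for edit distance. The sketch below is that classical algorithm (counting insertions, deletions, and substitutions), not the near-linear approximation described in the abstract. The helper `index_symbolwise` only illustrates what indexing a string symbol-by-symbol means; both function names are assumptions made for this example.

```python
def edit_distance(a, b):
    """Classical O(len(a) * len(b)) dynamic program over two rolling rows.

    Counts insertions, deletions, and substitutions (Levenshtein distance).
    Works on any sequences whose elements support equality comparison.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def index_symbolwise(s, index):
    """Pair each symbol of s with the symbol of the index string at the same position."""
    if len(s) != len(index):
        raise ValueError("string and index must have the same length")
    return list(zip(s, index))
```

Because `edit_distance` accepts arbitrary sequences, it can be applied directly to the list of pairs returned by `index_symbolwise`.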
Beyond Single-Deletion Correcting Codes: Substitutions and Transpositions
We consider the problem of designing low-redundancy codes in settings where one must correct deletions in conjunction with substitutions or adjacent transpositions, a combination of errors that is usually observed in DNA-based data storage. One of the most basic versions of this problem was settled more than 50 years ago by Levenshtein, who proved that binary Varshamov-Tenengolts codes correct one arbitrary edit error, i.e., one deletion or one substitution, with nearly optimal redundancy. However, this approach fails to extend to many simple and natural variations of the binary single-edit error setting. In this work, we make progress on the code design problem above in three such variations:
- We construct linear-time encodable and decodable length-n non-binary codes correcting a single edit error with nearly optimal redundancy log n+O(log log n), providing an alternative simpler proof of a result by Cai, Chee, Gabrys, Kiah, and Nguyen (IEEE Trans. Inf. Theory 2021). This is achieved by employing what we call weighted VT sketches, a new notion that may be of independent interest.
- We show the existence of a binary code correcting one deletion or one adjacent transposition with nearly optimal redundancy log n+O(log log n).
- We construct linear-time encodable and list-decodable binary codes with list-size 2 for one deletion and one substitution with redundancy 4log n+O(log log n). This matches the existential bound up to an O(log log n) additive term.
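As background for the binary Varshamov-Tenengolts (VT) codes mentioned above, here is a minimal sketch of the classical VT checksum and Levenshtein's single-deletion decoder. It is not the weighted VT sketch construction from this paper; the function names and the list-of-bits representation are assumptions made for illustration.

```python
def vt_syndrome(x):
    """VT checksum: sum of i * x_i over 1-based positions, modulo len(x) + 1."""
    return sum(i * bit for i, bit in enumerate(x, start=1)) % (len(x) + 1)


def vt_decode_deletion(y, a):
    """Recover the length-(len(y) + 1) word with VT syndrome `a` from `y`,
    assuming `y` was obtained from it by deleting exactly one bit."""
    n = len(y) + 1
    weight = sum(y)
    # How much the checksum dropped because of the deletion.
    deficiency = (a - sum(i * bit for i, bit in enumerate(y, start=1))) % (n + 1)
    if deficiency <= weight:
        # A 0 was deleted: reinsert it with exactly `deficiency` ones to its right.
        pos, ones = len(y), 0
        while ones < deficiency:
            pos -= 1
            ones += y[pos]
        return y[:pos] + [0] + y[pos:]
    # A 1 was deleted: reinsert it with `deficiency - weight - 1` zeros to its left.
    zeros_needed = deficiency - weight - 1
    pos, zeros = 0, 0
    while zeros < zeros_needed:
        zeros += 1 - y[pos]
        pos += 1
    return y[:pos] + [1] + y[pos:]
```

A quick way to exercise the decoder is to take any bit list `x`, compute `a = vt_syndrome(x)`, delete one position, and check that `vt_decode_deletion` returns `x`.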
Synchronization Strings: Explicit Constructions, Local Decoding, and Applications
This paper gives new results for synchronization strings, a powerful
combinatorial object that allows one to deal efficiently with insertions and
deletions in various communication settings:
We give a deterministic, linear time synchronization string
construction, improving over an time randomized construction.
Independently of this work, a deterministic time
construction was just put on arXiv by Cheng, Li, and Wu. We also give a
deterministic linear time construction of an infinite synchronization string,
which was not known to be computable before. Both constructions are highly
explicit, i.e., the symbol can be computed in time.
This paper also introduces a generalized notion we call
long-distance synchronization strings that allow for local and very fast
decoding. In particular, only time and access to logarithmically
many symbols is required to decode any index.
We give several applications for these results:
For any we provide an insdel correcting
code with rate which can correct any fraction
of insdel errors in time. This near linear computational
efficiency is surprising given that we do not even know how to compute the
(edit) distance between the decoding input and output in sub-quadratic time. We
show that such codes can not only efficiently recover from fraction of
insdel errors but, similar to [Schulman, Zuckerman; TransInf'99], also from any
fraction of block transpositions and replications.
We show that high explicitness and local decoding allow for
infinite channel simulations with exponentially smaller memory and decoding
time requirements. These simulations can be used to give the first near-linear
time interactive coding scheme for insdel errors.
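The abstract does not restate the definition of a synchronization string. The sketch below assumes the definition commonly used in this line of work: a string S of length n is an ε-synchronization string if, for all i < j < k, the insertion-deletion distance between S[i, j) and S[j, k) exceeds (1 - ε)(k - i). The brute-force checker is only meant to make that property concrete for short strings; it is unrelated to the linear-time constructions the paper describes, and the function names are hypothetical.

```python
def insdel_distance(a, b):
    """Insertion-deletion distance (no substitutions): len(a) + len(b) - 2 * LCS(a, b)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return len(a) + len(b) - 2 * prev[-1]


def is_synchronization_string(s, eps):
    """Brute-force check of the assumed definition over all 0-based i < j < k <= len(s)."""
    n = len(s)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n + 1):
                if insdel_distance(s[i:j], s[j:k]) <= (1 - eps) * (k - i):
                    return False
    return True
```

Under this definition, a string whose symbols are all distinct satisfies the property for any ε > 0, while a string with an immediate repetition such as "abab" fails it for ε = 0.5.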
An Iteratively Decodable Tensor Product Code with Application to Data Storage
The error pattern correcting code (EPCC) can be constructed to provide a
syndrome decoding table targeting the dominant error events of an inter-symbol
interference channel at the output of the Viterbi detector. For the size of the
syndrome table to be manageable and the list of possible error events to be
reasonable in size, the codeword length of EPCC needs to be short enough.
However, the rate of such a short length code will be too low for hard drive
applications. To accommodate the required large redundancy, it is possible to
record only a highly compressed function of the parity bits of EPCC's tensor
product with a symbol correcting code. In this paper, we show that the proposed
tensor error-pattern correcting code (T-EPCC) is linear time encodable and also
devise a low-complexity soft iterative decoding algorithm for EPCC's tensor
product with q-ary LDPC (T-EPCC-qLDPC). Simulation results show that
T-EPCC-qLDPC performs nearly as well as single-level qLDPC with a
1/2 KB sector at a 50% reduction in decoding complexity. Moreover, 1 KB
T-EPCC-qLDPC surpasses the performance of 1/2 KB single-level qLDPC at the same
decoder complexity.
Comment: Hakim Alhussien, Jaekyun Moon, "An Iteratively Decodable Tensor
Product Code with Application to Data Storage
It'll probably work out: improved list-decoding through random operations
In this work, we introduce a framework to study the effect of random
operations on the combinatorial list-decodability of a code. The operations we
consider correspond to row and column operations on the matrix obtained from
the code by stacking the codewords together as columns. This captures many
natural transformations on codes, such as puncturing, folding, and taking
subcodes; we show that many such operations can improve the list-decoding
properties of a code. There are two main points to this. First, our goal is to
advance our (combinatorial) understanding of list-decodability, by
understanding what structure (or lack thereof) is necessary to obtain it.
Second, we use our more general results to obtain a few interesting corollaries
for list decoding:
(1) We show the existence of binary codes that are combinatorially
list-decodable from fraction of errors with optimal rate
that can be encoded in linear time.
(2) We show that any code with relative distance, when randomly
folded, is combinatorially list-decodable fraction of errors with
high probability. This formalizes the intuition for why the folding operation
has been successful in obtaining codes with optimal list decoding parameters;
previously, all arguments used algebraic methods and worked only with specific
codes.
(3) We show that any code which is list-decodable with suboptimal list sizes
has many subcodes which have near-optimal list sizes, while retaining the error
correcting capabilities of the original code. This generalizes recent results
where subspace evasive sets have been used to reduce list sizes of codes that
achieve list decoding capacity.
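To make two of the transformations named in this abstract concrete, the sketch below shows puncturing (keeping a subset of coordinates) and folding (bundling coordinates into larger symbols after a coordinate permutation). It is an illustration only: the function names are assumptions, and drawing one random permutation and applying it to every codeword is one plausible reading of "randomly folded", not necessarily the paper's exact model.

```python
import random


def puncture(codeword, kept):
    """Keep only the coordinates listed in `kept`, in that order."""
    return [codeword[i] for i in kept]


def fold(codeword, m, perm):
    """Permute coordinates by `perm`, then bundle every m consecutive symbols into one tuple."""
    if len(codeword) % m != 0:
        raise ValueError("block length must be divisible by the folding parameter")
    permuted = [codeword[i] for i in perm]
    return [tuple(permuted[i:i + m]) for i in range(0, len(permuted), m)]


def random_permutation(n, seed=None):
    """Draw one permutation of range(n); reuse it for every codeword of the code."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def relative_distance(u, v):
    """Fraction of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)
```

Folding never increases the number of positions in which two codewords differ, but it shrinks the block length by a factor of m, so the relative distance of a folded pair can only stay the same or grow.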
Subquadratic time encodable codes beating the Gilbert-Varshamov bound
We construct explicit algebraic geometry codes built from the
Garcia-Stichtenoth function field tower beating the Gilbert-Varshamov bound for
alphabet sizes at least 192. Messages are identified with functions in certain
Riemann-Roch spaces associated with divisors supported on multiple places.
Encoding amounts to evaluating these functions at degree one places. By
exploiting algebraic structures particular to the Garcia-Stichtenoth tower, we
devise an intricate deterministic \omega/2 < 1.19 runtime exponent encoding and
1+\omega/2 < 2.19 expected runtime exponent randomized (unique and list)
decoding algorithms. Here \omega < 2.373 is the matrix multiplication exponent.
If \omega = 2, as widely believed, the encoding and decoding runtimes are
respectively nearly linear and nearly quadratic. Prior to this work, encoding
(resp. decoding) times of code families beating the Gilbert-Varshamov bound were
quadratic (resp. cubic) or worse.
Locally Encodable and Decodable Codes for Distributed Storage Systems
We consider the locality of encoding and decoding operations in distributed
storage systems (DSS), and propose a new class of codes, called locally
encodable and decodable codes (LEDC), that provides a higher degree of
operational locality compared to currently known codes. For a given locality
structure, we derive an upper bound on the global distance and demonstrate the
existence of an optimal LEDC for sufficiently large field size. In addition, we
also construct two families of optimal LEDC for fields with size linear in code
length.
Comment: 7 pages
Improved Nearly-MDS Expander Codes
A construction of expander codes is presented with the following three
properties:
(i) the codes lie close to the Singleton bound, (ii) they can be encoded in
time complexity that is linear in their code length, and (iii) they have a
linear-time bounded-distance decoder.
By using a version of the decoder that also corrects erasures, the codes can
replace MDS outer codes in concatenated constructions, thus resulting in
linear-time encodable and decodable codes that approach the Zyablov bound or
the capacity of memoryless channels. The presented construction improves on an
earlier result by Guruswami and Indyk in that any rate and relative minimum
distance that lies below the Singleton bound is attainable for a significantly
smaller alphabet size.
Comment: Part of this work was presented at the 2004 IEEE Int'l Symposium on
Information Theory (ISIT'2004), Chicago, Illinois (June 2004). This work was
submitted to IEEE Transactions on Information Theory on January 21, 2005. To
appear in IEEE Transactions on Information Theory, August 2006. 12 pages