5,085 research outputs found
Deletion codes in the high-noise and high-rate regimes
The noise model of deletions poses significant challenges in coding theory,
with basic questions like the capacity of the binary deletion channel still
being open. In this paper, we study the harder model of worst-case deletions,
with a focus on constructing efficiently decodable codes for the two extreme
regimes of high-noise and high-rate. Specifically, we construct polynomial-time
decodable codes with the following trade-offs (for any eps > 0):
(1) Codes that can correct a fraction 1-eps of deletions with rate poly(eps)
over an alphabet of size poly(1/eps);
(2) Binary codes of rate 1-O~(sqrt(eps)) that can correct a fraction eps of
deletions; and
(3) Binary codes that can be list decoded from a fraction (1/2-eps) of
deletions with rate poly(eps)
Our work is the first to achieve the qualitative goals of correcting a
deletion fraction approaching 1 over bounded alphabets, and correcting a
constant fraction of bit deletions with rate aproaching 1. The above results
bring our understanding of deletion code constructions in these regimes to a
similar level as worst-case errors
Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes
This work continues the study of linear error correcting codes against adversarial insertion deletion errors (insdel errors). Previously, the work of Cheng, Guruswami, Haeupler, and Li [Kuan Cheng et al., 2021] showed the existence of asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, or achieve rate arbitrarily close to 1/2 even over the binary alphabet. As shown in [Kuan Cheng et al., 2021], these bounds are also the best possible. However, known explicit constructions in [Kuan Cheng et al., 2021], and subsequent improved constructions by Con, Shpilka, and Tamo [Con et al., 2022] all fall short of meeting these bounds. Over any constant size alphabet, they can only achieve rate < 1/8 or correct < 1/4 fraction of errors; over the binary alphabet, they can only achieve rate < 1/1216 or correct < 1/54 fraction of errors. Apparently, previous techniques face inherent barriers to achieve rate better than 1/4 or correct more than 1/2 fraction of errors.
In this work we give new constructions of such codes that meet these bounds, namely, asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, and binary asymptotically good linear insdel codes that can achieve rate arbitrarily close to 1/2. All our constructions are efficiently encodable and decodable. Our constructions are based on a novel approach of code concatenation, which embeds the index information implicitly into codewords. This significantly differs from previous techniques and may be of independent interest. Finally, we also prove the existence of linear concatenated insdel codes with parameters that match random linear codes, and propose a conjecture about linear insdel codes
Near-Linear Time Insertion-Deletion Codes and (1+)-Approximating Edit Distance via Indexing
We introduce fast-decodable indexing schemes for edit distance which can be
used to speed up edit distance computations to near-linear time if one of the
strings is indexed by an indexing string . In particular, for every length
and every , one can in near linear time construct a string
with , such that, indexing
any string , symbol-by-symbol, with results in a string where for which edit
distance computations are easy, i.e., one can compute a
-approximation of the edit distance between and any other
string in time.
Our indexing schemes can be used to improve the decoding complexity of
state-of-the-art error correcting codes for insertions and deletions. In
particular, they lead to near-linear time decoding algorithms for the
insertion-deletion codes of [Haeupler, Shahrasbi; STOC `17] and faster decoding
algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi,
Sudan; ICALP `18]. Interestingly, the latter codes are a crucial ingredient in
the construction of fast-decodable indexing schemes
Guess & Check Codes for Deletions and Synchronization
We consider the problem of constructing codes that can correct
deletions occurring in an arbitrary binary string of length bits.
Varshamov-Tenengolts (VT) codes can correct all possible single deletions
with an asymptotically optimal redundancy. Finding similar codes
for deletions is an open problem. We propose a new family of
codes, that we call Guess & Check (GC) codes, that can correct, with high
probability, a constant number of deletions occurring at uniformly
random positions within an arbitrary string. The GC codes are based on MDS
codes and have an asymptotically optimal redundancy that is . We provide deterministic polynomial time encoding and decoding schemes for
these codes. We also describe the applications of GC codes to file
synchronization.Comment: Accepted in ISIT 201
Synchronization Strings: Explicit Constructions, Local Decoding, and Applications
This paper gives new results for synchronization strings, a powerful
combinatorial object that allows to efficiently deal with insertions and
deletions in various communication settings:
We give a deterministic, linear time synchronization string
construction, improving over an time randomized construction.
Independently of this work, a deterministic time
construction was just put on arXiv by Cheng, Li, and Wu. We also give a
deterministic linear time construction of an infinite synchronization string,
which was not known to be computable before. Both constructions are highly
explicit, i.e., the symbol can be computed in time.
This paper also introduces a generalized notion we call
long-distance synchronization strings that allow for local and very fast
decoding. In particular, only time and access to logarithmically
many symbols is required to decode any index.
We give several applications for these results:
For any we provide an insdel correcting
code with rate which can correct any fraction
of insdel errors in time. This near linear computational
efficiency is surprising given that we do not even know how to compute the
(edit) distance between the decoding input and output in sub-quadratic time. We
show that such codes can not only efficiently recover from fraction of
insdel errors but, similar to [Schulman, Zuckerman; TransInf'99], also from any
fraction of block transpositions and replications.
We show that highly explicitness and local decoding allow for
infinite channel simulations with exponentially smaller memory and decoding
time requirements. These simulations can be used to give the first near linear
time interactive coding scheme for insdel errors
Energy-Efficient Communication over the Unsynchronized Gaussian Diamond Network
Communication networks are often designed and analyzed assuming tight
synchronization among nodes. However, in applications that require
communication in the energy-efficient regime of low signal-to-noise ratios,
establishing tight synchronization among nodes in the network can result in a
significant energy overhead. Motivated by a recent result showing that
near-optimal energy efficiency can be achieved over the AWGN channel without
requiring tight synchronization, we consider the question of whether the
potential gains of cooperative communication can be achieved in the absence of
synchronization. We focus on the symmetric Gaussian diamond network and
establish that cooperative-communication gains are indeed feasible even with
unsynchronized nodes. More precisely, we show that the capacity per unit energy
of the unsynchronized symmetric Gaussian diamond network is within a constant
factor of the capacity per unit energy of the corresponding synchronized
network. To this end, we propose a distributed relaying scheme that does not
require tight synchronization but nevertheless achieves most of the energy
gains of coherent combining.Comment: 20 pages, 4 figures, submitted to IEEE Transactions on Information
Theory, presented at IEEE ISIT 201
Synchronization Strings: Codes for Insertions and Deletions Approaching the Singleton Bound
We introduce synchronization strings as a novel way of efficiently dealing
with synchronization errors, i.e., insertions and deletions. Synchronization
errors are strictly more general and much harder to deal with than commonly
considered half-errors, i.e., symbol corruptions and erasures. For every
, synchronization strings allow to index a sequence with an
size alphabet such that one can efficiently transform
synchronization errors into half-errors. This powerful new
technique has many applications. In this paper, we focus on designing insdel
codes, i.e., error correcting block codes (ECCs) for insertion deletion
channels.
While ECCs for both half-errors and synchronization errors have been
intensely studied, the later has largely resisted progress. Indeed, it took
until 1999 for the first insdel codes with constant rate, constant distance,
and constant alphabet size to be constructed by Schulman and Zuckerman. Insdel
codes for asymptotically large or small noise rates were given in 2016 by
Guruswami et al. but these codes are still polynomially far from the optimal
rate-distance tradeoff. This makes the understanding of insdel codes up to this
work equivalent to what was known for regular ECCs after Forney introduced
concatenated codes in his doctoral thesis 50 years ago.
A direct application of our synchronization strings based indexing method
gives a simple black-box construction which transforms any ECC into an equally
efficient insdel code with a slightly larger alphabet size. This instantly
transfers much of the highly developed understanding for regular ECCs over
large constant alphabets into the realm of insdel codes. Most notably, we
obtain efficient insdel codes which get arbitrarily close to the optimal
rate-distance tradeoff given by the Singleton bound for the complete noise
spectrum
- …