Synchronization Strings: Explicit Constructions, Local Decoding, and Applications
This paper gives new results for synchronization strings, a powerful
combinatorial object that allows one to deal efficiently with insertions and
deletions in various communication settings:
We give a deterministic, linear time synchronization string
construction, improving over an time randomized construction.
Independently of this work, a deterministic time
construction was just put on arXiv by Cheng, Li, and Wu. We also give a
deterministic linear time construction of an infinite synchronization string,
which was not known to be computable before. Both constructions are highly
explicit, i.e., the symbol can be computed in time.
This paper also introduces a generalized notion we call
long-distance synchronization strings that allow for local and very fast
decoding. In particular, only time and access to logarithmically
many symbols is required to decode any index.
We give several applications for these results:
For any we provide an insdel correcting
code with rate which can correct any fraction
of insdel errors in time. This near linear computational
efficiency is surprising given that we do not even know how to compute the
(edit) distance between the decoding input and output in sub-quadratic time. We
show that such codes can not only efficiently recover from fraction of
insdel errors but, similar to [Schulman, Zuckerman; TransInf'99], also from any
fraction of block transpositions and replications.
We show that high explicitness and local decoding allow for
infinite channel simulations with exponentially smaller memory and decoding
time requirements. These simulations can be used to give the first near linear
time interactive coding scheme for insdel errors.
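
For orientation, the quadratic-time baseline that the abstract contrasts with can be sketched as the textbook dynamic program for insertion-deletion distance. This is a generic illustration, not the paper's decoder, and the function name `insdel_distance` is a hypothetical choice made here.

```python
def insdel_distance(a, b):
    """Minimum number of insertions and deletions that turn a into b.

    Textbook O(len(a) * len(b)) dynamic program; substitutions are not
    allowed, so a mismatch costs one deletion plus one insertion.
    """
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]  # deleting all i symbols of a[:i] yields the empty string
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1])  # matching symbols cost nothing
            else:
                cur.append(1 + min(prev[j], cur[j - 1]))  # delete x or insert y
        prev = cur
    return prev[-1]
```

The two nested loops are what make this baseline quadratic in the input length.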
Near-Linear Time Insertion-Deletion Codes and (1+ε)-Approximating Edit Distance via Indexing
We introduce fast-decodable indexing schemes for edit distance which can be
used to speed up edit distance computations to near-linear time if one of the
strings is indexed by an indexing string . In particular, for every length
and every , one can in near linear time construct a string
with , such that, indexing
any string , symbol-by-symbol, with results in a string where for which edit
distance computations are easy, i.e., one can compute a
-approximation of the edit distance between and any other
string in time.
Our indexing schemes can be used to improve the decoding complexity of
state-of-the-art error correcting codes for insertions and deletions. In
particular, they lead to near-linear time decoding algorithms for the
insertion-deletion codes of [Haeupler, Shahrasbi; STOC '17] and faster decoding
algorithms for list-decodable insertion-deletion codes of [Haeupler, Shahrasbi,
Sudan; ICALP '18]. Interestingly, the latter codes are a crucial ingredient in
the construction of fast-decodable indexing schemes.
Coding for interactive communication correcting insertions and deletions
We consider the question of interactive communication, in which two remote
parties perform a computation while their communication channel is
(adversarially) noisy. Here we extend the discussion to a more general and
stronger class of noise: we allow the channel to perform insertions and
deletions of symbols. These types of errors may bring the parties "out of
sync", so that there is no consensus regarding the current round of the
protocol.
In this more general noise model, we obtain the first interactive coding
scheme that has a constant rate and resists noise rates of up to
. To this end we develop a novel primitive we name edit
distance tree code. The edit distance tree code is designed to replace the
Hamming distance constraints in Schulman's tree codes (STOC '93) with a
stronger edit distance requirement. However, the straightforward generalization
of tree codes to edit distance does not seem to yield a primitive that suffices
for communication in the presence of synchronization problems. Giving the
"right" definition of edit distance tree codes is a main conceptual
contribution of this work.
Synchronization Strings: Codes for Insertions and Deletions Approaching the Singleton Bound
We introduce synchronization strings as a novel way of efficiently dealing
with synchronization errors, i.e., insertions and deletions. Synchronization
errors are strictly more general and much harder to deal with than commonly
considered half-errors, i.e., symbol corruptions and erasures. For every
, synchronization strings allow to index a sequence with an
size alphabet such that one can efficiently transform
synchronization errors into half-errors. This powerful new
technique has many applications. In this paper, we focus on designing insdel
codes, i.e., error correcting block codes (ECCs) for insertion-deletion
channels.
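
As a rough illustration of the object being introduced, the sketch below brute-force checks the synchronization property. It assumes the definition usually given in the literature, which is not stated in this listing: a string S is an ε-synchronization string if, for all i < j < k, the insertion-deletion distance between S[i, j) and S[j, k) exceeds (1 − ε)(k − i). The function names are hypothetical, and the check is far too slow for anything but tiny examples.

```python
from itertools import combinations


def insdel_distance(a, b):
    """Insertion-deletion distance via the textbook quadratic DP."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] if x == y else 1 + min(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def is_eps_synchronization(s, eps):
    """Brute-force check of the assumed epsilon-synchronization property.

    Tests every pair of adjacent, non-empty substrings s[i:j] and s[j:k].
    """
    for i, j, k in combinations(range(len(s) + 1), 3):
        if insdel_distance(s[i:j], s[j:k]) <= (1 - eps) * (k - i):
            return False
    return True
```

A string with all-distinct symbols passes trivially, while any string containing two adjacent equal blocks (such as `"abab"`) fails; the interesting case is satisfying the property over a small alphabet.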
While ECCs for both half-errors and synchronization errors have been
intensely studied, the latter has largely resisted progress. Indeed, it took
until 1999 for the first insdel codes with constant rate, constant distance,
and constant alphabet size to be constructed by Schulman and Zuckerman. Insdel
codes for asymptotically large or small noise rates were given in 2016 by
Guruswami et al. but these codes are still polynomially far from the optimal
rate-distance tradeoff. This makes the understanding of insdel codes up to this
work equivalent to what was known for regular ECCs after Forney introduced
concatenated codes in his doctoral thesis 50 years ago.
A direct application of our synchronization-string-based indexing method
gives a simple black-box construction which transforms any ECC into an equally
efficient insdel code with a slightly larger alphabet size. This instantly
transfers much of the highly developed understanding for regular ECCs over
large constant alphabets into the realm of insdel codes. Most notably, we
obtain efficient insdel codes which get arbitrarily close to the optimal
rate-distance tradeoff given by the Singleton bound for the complete noise
spectrum.
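
The indexing idea can be sketched with a deliberately naive index: each symbol is paired with its position number, so the receiver can put surviving symbols back in place and treat everything else as an erasure. This is only an illustration under that assumption. Position numbers need an index alphabet that grows with the message length, which is the cost the abstract says synchronization strings avoid. The names `index_symbolwise` and `realign` are hypothetical.

```python
def index_symbolwise(message, index):
    """Pair each message symbol with the index symbol at the same position."""
    if len(message) != len(index):
        raise ValueError("message and index must have the same length")
    return list(zip(message, index))


def realign(received, n):
    """Place received (symbol, position) pairs back into n slots.

    A slot that receives no symbol, or more than one, is reported as an
    erasure (None). An inserted pair that lands alone in an otherwise empty
    slot shows up as a symbol corruption instead.
    """
    slots = [[] for _ in range(n)]
    for symbol, position in received:
        if 0 <= position < n:
            slots[position].append(symbol)
    return [slot[0] if len(slot) == 1 else None for slot in slots]


# Usage: delete the symbol at position 1 and insert a spurious ("z", 3).
sent = index_symbolwise("hello", range(5))
received = [sent[0], sent[2], ("z", 3), sent[3], sent[4]]
recovered = realign(received, 5)
```

After realignment the remaining damage consists of erasures and corruptions at known positions, which an ordinary error correcting code applied to the message beforehand could then handle.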
A Lower Bound on the List-Decodability of Insdel Codes
For codes equipped with metrics such as Hamming metric, symbol pair metric or
cover metric, the Johnson bound guarantees list-decodability of such codes.
That is, the Johnson bound provides a lower bound on the list-decoding radius
of a code in terms of its relative minimum distance, list size, and
alphabet size. For the study of list-decodability of codes with insertion
and deletion errors (we call such codes insdel codes), it is natural to ask, as
an open problem, whether there is also a Johnson-type bound. The problem was first
investigated by Wachter-Zeh and the result was amended by Hayashi and Yasunaga
where a lower bound on the list-decodability for insdel codes was derived.
The main purpose of this paper is to move a step further towards solving the
above open problem. In this work, we provide a new lower bound for the
list-decodability of an insdel code. As a consequence, we show that, unlike the
Johnson bound for codes under other metrics, which is tight, the bound on
list-decodability of insdel codes given by Hayashi and Yasunaga is not tight.
Our main idea is to show that if an insdel code with a given Levenshtein
distance is not list-decodable with a given list size, then the list decoding
radius is lower bounded by a bound involving that distance and list size. In
other words, if the list decoding radius is less than this lower bound, the code
must be list-decodable with that list size. At the end of the paper we use this
bound to provide an insdel list-decodability bound for various well-known codes,
which has not been extensively studied before.