An Upper Bound on the Capacity of non-Binary Deletion Channels
We derive an upper bound on the capacity of non-binary deletion channels.
Although binary deletion channels have received significant attention over the
years, and many upper and lower bounds on their capacity have been derived,
such studies for the non-binary case are largely missing. The state of the art
is as follows: the capacity of an erasure channel with the same input alphabet
as the deletion channel serves as a trivial upper bound, and the results of
Diggavi and Grossglauser provide a lower bound. In this paper, we
derive the first non-trivial non-binary deletion channel capacity upper bound
and reduce the gap with the existing achievable rates. To derive the results we
first prove an inequality between the capacity of a 2K-ary deletion channel
with deletion probability $d$, denoted by $C_{2K}(d)$, and the capacity of the
binary deletion channel with the same deletion probability, $C_2(d)$, that is,
$C_{2K}(d) \leq C_2(d) + (1-d)\log(K)$. Then by employing some existing upper
bounds on the capacity of the binary deletion channel, we obtain upper bounds
on the capacity of the 2K-ary deletion channel. We illustrate via examples the
use of the new bounds and discuss their asymptotic behavior as $d \to 0$.
Comment: accepted for presentation in ISIT 201
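
As a rough numerical illustration of how such a bound can be used, the sketch below assumes the inequality has the form $C_{2K}(d) \leq C_2(d) + (1-d)\log(K)$, with logarithms taken base 2 so that rates are in bits per channel use. The function names are hypothetical, and `binary_upper_bound` stands for whatever upper bound on $C_2(d)$ the reader supplies; no specific published value is assumed here.

```python
import math


def erasure_upper_bound(d, alphabet_size):
    """Trivial bound: capacity of an erasure channel with erasure
    probability d, i.e. (1 - d) * log2(alphabet_size) bits."""
    return (1.0 - d) * math.log2(alphabet_size)


def nonbinary_upper_bound(d, K, binary_upper_bound):
    """Upper bound on the 2K-ary deletion channel capacity obtained from
    any upper bound on the binary capacity C_2(d) (assumed form)."""
    return binary_upper_bound + (1.0 - d) * math.log2(K)


d, K = 0.1, 2  # illustrative values only

trivial = erasure_upper_bound(d, 2 * K)

# Plugging in the trivial binary bound C_2(d) <= 1 - d recovers the
# erasure bound exactly, so nothing is gained in that case.
recovered = nonbinary_upper_bound(d, K, binary_upper_bound=1.0 - d)


def improvement(d, binary_upper_bound):
    """Gap to the erasure bound, in bits: any binary bound tighter than
    1 - d carries over to the 2K-ary channel by the same amount."""
    return (1.0 - d) - binary_upper_bound
```

Under these assumptions, the improvement over the erasure bound does not depend on $K$: it equals the amount by which the chosen binary bound beats $1-d$.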
Efficient File Synchronization: a Distributed Source Coding Approach
The problem of reconstructing a source sequence in the presence of decoder
side-information that is mis-synchronized to the source due to deletions is
studied in a distributed source coding framework. Motivated by practical
applications, the deletion process is assumed to be bursty and is modeled by a
Markov chain. The minimum rate needed to reconstruct the source sequence with
high probability is characterized in terms of an information theoretic
expression, which is interpreted as the amount of information of the deleted
content and the locations of deletions, subtracting "nature's secret", that is,
the uncertainty of the locations given the source and side-information. For
small bursty deletion probability, the asymptotic expansion of the minimum rate
is computed.
Comment: 9 pages, 2 figures. A shorter version will appear in IEEE
International Symposium on Information Theory (ISIT), 201
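
To make the bursty deletion model more concrete, here is a minimal simulation sketch. It assumes a two-state Markov chain (deleting / not deleting) with hypothetical parameters `p_start` (probability of starting a burst) and `p_continue` (probability of extending one); the paper's exact parameterization may differ.

```python
import random


def bursty_deletion(source, p_start, p_continue, seed=None):
    """Pass `source` through a two-state Markov deletion process.

    Returns the surviving symbols (the mis-synchronized side information)
    and the list of deleted positions. Parameter names are illustrative.
    """
    rng = random.Random(seed)
    kept = []
    deleted_positions = []
    deleting = False  # state of the Markov chain before the first symbol
    for i, symbol in enumerate(source):
        # The deletion probability depends on whether the previous
        # symbol was deleted, which is what makes deletions bursty.
        p = p_continue if deleting else p_start
        deleting = rng.random() < p
        if deleting:
            deleted_positions.append(i)
        else:
            kept.append(symbol)
    return kept, deleted_positions


source = list("ACGTACGTACGTACGT")
side_info, deletions = bursty_deletion(source, p_start=0.1, p_continue=0.7, seed=0)
```

Choosing `p_continue` larger than `p_start` makes deletions cluster into runs, while setting them equal reduces the process to independent deletions.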
Fundamental Bounds and Approaches to Sequence Reconstruction from Nanopore Sequencers
Nanopore sequencers are emerging as promising new platforms for
high-throughput sequencing. As with other technologies, sequencer errors pose a
major challenge for their effective use. In this paper, we present a novel
information theoretic analysis of the impact of insertion-deletion (indel)
errors in nanopore sequencers. In particular, we consider the following
problems: (i) for given indel error characteristics and rate, what is the
probability of accurate reconstruction as a function of sequence length; (ii)
what is the number of 'typical' sequences within the distortion bound induced
by indel errors; (iii) using replicated extrusion (the process of passing a DNA
strand through the nanopore), what is the number of replicas needed to reduce
the distortion bound so that only one typical sequence exists within the
distortion bound.
Our results provide a number of important insights: (i) the maximum length of
a sequence that can be accurately reconstructed in the presence of indel and
substitution errors is relatively small; (ii) the number of typical sequences
within the distortion bound is large; and (iii) replicated extrusion is an
effective technique for unique reconstruction. In particular, we show that the
number of replicas grows slowly (logarithmically) with sequence length --
implying that through replicated extrusion, we can sequence large reads using
nanopore sequencers. Our model considers indel and substitution errors
separately. In this sense, it can be viewed as providing (tight) bounds on
reconstruction lengths and repetitions for accurate reconstruction when the two
error modes are considered in a single model.
Comment: 12 pages, 5 figures