Lossy compression of discrete sources via Viterbi algorithm
We present a new lossy compressor for discrete-valued sources. For coding a
sequence x^n, the encoder starts by assigning a certain cost to each possible
reconstruction sequence. It then finds the one that minimizes this cost and
describes it losslessly to the decoder via a universal lossless compressor. The
cost of each sequence is a linear combination of its distance from the sequence
x^n and a linear function of its k-th order empirical distribution.
The structure of the cost function allows the encoder to employ the Viterbi
algorithm to recover the minimizer of the cost. We identify a choice of the
coefficients comprising the linear function of the empirical distribution used
in the cost function which ensures that the algorithm universally achieves the
optimum rate-distortion performance of any stationary ergodic source in the
limit of large n, provided that k diverges as o(log n). Iterative
techniques for approximating the coefficients, which alleviate the
computational burden of finding the optimal coefficients, are proposed and
studied.
Comment: 26 pages, 6 figures, Submitted to IEEE Transactions on Information Theory.
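
The key structural fact is that a linear function of the k-th order empirical distribution splits into one additive term per position, indexed by the current length-k reconstruction context, so the cost minimization is a shortest-path problem over contexts. The following Python sketch of that Viterbi recursion is illustrative only: the function name, the coefficient table lam (one entry per k-tuple of reconstruction symbols), the weight alpha, and the zero boundary initialization are assumptions not taken from the abstract.

import itertools

def viterbi_lossy_encode(x, alphabet, k, lam, dist, alpha=1.0):
    # Minimize sum_t [ dist(x[t], xhat[t]) + alpha * lam[context_t] ] over
    # reconstruction sequences xhat, where context_t is the last k symbols
    # of xhat; lam encodes the linear function of the k-th order empirical
    # distribution, which splits into one additive term per position.
    states = list(itertools.product(alphabet, repeat=k - 1))  # last k-1 symbols
    cost = {s: 0.0 for s in states}   # boundary handled crudely in this sketch
    back = []
    for t in range(len(x)):
        new_cost, ptr = {}, {}
        for s in states:
            for a in alphabet:                 # candidate reconstruction symbol
                ns = (s + (a,))[1:]            # context after emitting a
                c = cost[s] + dist(x[t], a) + alpha * lam[s + (a,)]
                if ns not in new_cost or c < new_cost[ns]:
                    new_cost[ns], ptr[ns] = c, (s, a)
        cost = new_cost
        back.append(ptr)
    s = min(cost, key=cost.get)                # cheapest terminal context
    xhat = []
    for ptr in reversed(back):                 # trace back the minimizer
        s, a = ptr[s]
        xhat.append(a)
    return xhat[::-1]

The search costs on the order of n times |alphabet|^k operations, which is why letting k grow only as o(log n) keeps the encoder's complexity near-linear in the blocklength.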
Universal Sampling Rate Distortion
We examine the coordinated and universal rate-efficient sampling of a subset
of correlated discrete memoryless sources followed by lossy compression of the
sampled sources. The goal is to reconstruct a predesignated subset of sources
within a specified level of distortion. The combined sampling mechanism and
rate distortion code are universal in that they are devised to perform robustly
without exact knowledge of the underlying joint probability distribution of the
sources. In Bayesian as well as non-Bayesian settings, single-letter
characterizations are provided for the universal sampling rate distortion
function for fixed-set sampling, independent random sampling and memoryless
random sampling. It is illustrated how these sampling mechanisms are
successively better. Our achievability proofs bring forth new schemes for joint
source distribution-learning and lossy compression.
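
To make the three mechanisms concrete, the Python sketch below draws samples under each; the alphabet size, the set sizes, and the "k largest components" rule used to mimic output-dependent (memoryless random) sampling are invented for illustration and carry none of the rate-distortion coding itself.

import numpy as np

rng = np.random.default_rng(0)
m, n, k = 5, 8, 2                    # m sources, blocklength n, k sampled
X = rng.integers(0, 4, size=(m, n))  # one row per source component

# Fixed-set sampling: the same k components at every time instant.
fixed = X[[0, 3], :]

# Independent random sampling: the sampled set is redrawn at each time
# instant, independently of the source outputs.
ind_sets = [rng.choice(m, size=k, replace=False) for _ in range(n)]
independent = np.array([X[S, t] for t, S in enumerate(ind_sets)]).T

# Memoryless random sampling: the set at time t may depend on the current
# outputs X[:, t] (here, illustratively, the k largest components).
mem_sets = [np.argsort(X[:, t])[-k:] for t in range(n)]
memoryless = np.array([X[S, t] for t, S in enumerate(mem_sets)]).T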
Compression-Based Compressed Sensing
Modern compression algorithms exploit complex structures that are present in
signals to describe them very efficiently. On the other hand, the field of
compressed sensing is built upon the observation that "structured" signals can
be recovered from their under-determined set of linear projections. Currently,
there is a large gap between the complexity of the structures studied in the
area of compressed sensing and those employed by the state-of-the-art
compression codes. Recent results in the literature on deterministic signals
aim at bridging this gap through devising compressed sensing decoders that
employ compression codes. This paper focuses on structured stochastic processes
and studies the application of rate-distortion codes to compressed sensing of
such signals. The performance of the previously proposed compressible signal
pursuit (CSP) algorithm is studied in this stochastic setting. It is proved
that in the very low distortion regime, as the blocklength grows to infinity,
the CSP algorithm reliably and robustly recovers instances of a stationary
process from random linear projections as long as their count is slightly more
than n times the rate-distortion dimension (RDD) of the source, where n
denotes the blocklength. It is also
shown that under some regularity conditions, the RDD of a stationary process is
equal to its information dimension (ID). This connection establishes the
optimality of the CSP algorithm at least for memoryless stationary sources, for
which the fundamental limits are known. Finally, it is shown that the CSP
algorithm combined with a family of universal variable-length fixed-distortion
compression codes yields a family of universal compressed sensing recovery
algorithms.
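
In its idealized form, CSP searches the reconstruction codebook of a rate-distortion code for the codeword most consistent with the linear measurements. A minimal sketch, with an explicit toy codebook standing in for an actual compression code:

import numpy as np

def csp_decode(y, A, codebook):
    # Compressible signal pursuit, idealized: among the reconstruction
    # codewords of a rate-distortion code, return the codeword whose
    # linear projections best match the observed measurements y = A x.
    errors = [np.linalg.norm(y - A @ c) for c in codebook]
    return codebook[int(np.argmin(errors))]

# Toy usage: 3 measurements of a length-8 signal, 4 candidate codewords.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 8))
codebook = [rng.standard_normal(8) for _ in range(4)]
x = codebook[2]
print(np.allclose(csp_decode(A @ x, A, codebook), x))   # True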
Universal Compressed Sensing
In this paper, the problem of developing universal algorithms for compressed
sensing of stochastic processes is studied. First, Rényi's notion of
information dimension (ID) is generalized to analog stationary processes. This
provides a measure of complexity for such processes and is connected to the
number of measurements required for their accurate recovery. Then a minimum
entropy pursuit (MEP) optimization approach is proposed, and it is proven that
it can reliably recover any stationary process satisfying some mixing
constraints from a sufficient number of randomized linear measurements, without
having any prior information about the distribution of the process. It is
proved that a Lagrangian-type approximation of the MEP optimization problem,
referred to as Lagrangian-MEP problem, is identical to a heuristic
implementable algorithm proposed by Baron et al. It is shown that for the right
choice of parameters the Lagrangian-MEP algorithm, in addition to having the
same asymptotic performance as MEP optimization, is also robust to the
measurement noise. For memoryless sources with a discrete-continuous mixture
distribution, the fundamental limits on the minimum number of measurements
required by a non-universal compressed sensing decoder have been characterized
by Wu et al. For such sources, it is proved that there is no loss in universal
coding, and both the MEP and the Lagrangian-MEP asymptotically achieve the
optimal performance.
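
A hedged sketch of the Lagrangian-MEP idea follows: trade the empirical entropy of a quantized candidate signal against the measurement misfit. The greedy coordinate sweep below is only an illustrative optimizer (it is not the Baron et al. algorithm), and the first-order entropy estimate, the quantization grid, and the weight lam are assumptions.

import numpy as np
from collections import Counter

def empirical_entropy(seq):
    # First-order empirical entropy (bits per symbol) of a discrete sequence.
    counts = np.array(list(Counter(seq).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def lagrangian_mep(y, A, grid, lam, sweeps=50, seed=0):
    # Greedy coordinate descent on  H_emp(x) + lam * ||y - A x||^2,
    # with every coordinate of x restricted to the quantization grid.
    rng = np.random.default_rng(seed)
    x = rng.choice(grid, size=A.shape[1])
    for _ in range(sweeps):
        for i in range(A.shape[1]):
            best_v, best_val = x[i], None
            for v in grid:
                x[i] = v
                val = empirical_entropy(x) + lam * np.linalg.norm(y - A @ x) ** 2
                if best_val is None or val < best_val:
                    best_v, best_val = v, val
            x[i] = best_v
    return x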
Rate-Distortion via Markov Chain Monte Carlo
We propose an approach to lossy source coding, utilizing ideas from Gibbs
sampling, simulated annealing, and Markov Chain Monte Carlo (MCMC). The idea is
to sample a reconstruction sequence from a Boltzmann distribution associated
with an energy function that incorporates the distortion between the source and
reconstruction, the compressibility of the reconstruction, and the point sought
on the rate-distortion curve. To sample from this distribution, we use a `heat
bath algorithm': Starting from an initial candidate reconstruction (say the
original source sequence), at every iteration, an index i is chosen and the
i-th sequence component is replaced by drawing from the conditional probability
distribution for that component given all the rest. At the end of this process,
the encoder conveys the reconstruction to the decoder using universal lossless
compression. The complexity of each iteration is independent of the sequence
length and only linearly dependent on a certain context parameter (which grows
sub-logarithmically with the sequence length). We show that the proposed
algorithms achieve optimum rate-distortion performance in the limit of a large
number of iterations and sequence length, when employed on any stationary
ergodic source. Experimentation shows promising initial results. Employing our
lossy compressors on noisy data, with appropriately chosen distortion measure
and level, followed by a simple de-randomization operation, results in a family
of denoisers that compares favorably (both theoretically and in practice) with
other MCMC-based schemes, and with the Discrete Universal Denoiser (DUDE).
Comment: 35 pages, 16 figures, Submitted to IEEE Transactions on Information Theory.
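
A minimal Python sketch of the heat-bath step described above; the annealing schedule and the generic energy callback (which would combine the accumulated distortion with a conditional empirical-entropy term weighted by the target point on the rate-distortion curve) are assumed forms, not the paper's exact choices.

import numpy as np

def heat_bath_lossy(x, alphabet, energy, beta0=1.0, sweeps=100, seed=0):
    # Annealed Gibbs ("heat bath") sampling over reconstruction sequences:
    # each step redraws one component from its Boltzmann conditional given
    # all the others, at a slowly increasing inverse temperature beta.
    rng = np.random.default_rng(seed)
    xhat = list(x)                      # start from the source sequence itself
    for s in range(sweeps):
        beta = beta0 * np.log(2 + s)    # annealing schedule (assumed form)
        for i in rng.permutation(len(x)):
            e = np.empty(len(alphabet))
            for j, a in enumerate(alphabet):
                xhat[i] = a             # energy of each candidate value at i
                e[j] = energy(x, xhat)
            p = np.exp(-beta * (e - e.min()))
            xhat[i] = alphabet[rng.choice(len(alphabet), p=p / p.sum())]
    return xhat

Note that this sketch recomputes the full energy for each candidate, which costs time linear in the sequence length; in the paper's algorithm the conditional is formed from local context counts, which is what makes each iteration's complexity independent of the sequence length.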
Sampling Rate Distortion
Consider a memoryless multiple source with m components of which a (possibly randomized) subset of k ≤ m components are sampled at each time instant and jointly compressed with the objective of reconstructing a prespecified subset of the m components under a given distortion criterion. The combined sampling and lossy compression mechanisms are to be designed to perform robustly with or without exact knowledge of the underlying joint probability distribution of the source. In this dissertation, we introduce a new framework of sampling rate distortion to study the tradeoffs among sampling mechanism, encoder-decoder structure, compression rate
and the desired level of accuracy in the reconstruction.
We begin with a discrete memoryless multiple source whose joint probability mass function (pmf) is taken to be known. A notion of sampling rate distortion function is introduced to study the mentioned tradeoffs, and is characterized first for fixed-set sampling. Next,
for independent random sampling performed without the knowledge of the source outputs, it is shown that the sampling rate distortion
function is the same whether or not the decoder is informed of the sequence of sampled sets. For memoryless random sampling, with the sampling depending on the source outputs, it is shown that deterministic sampling, characterized by a conditional point-mass, is optimal and suffices to achieve the sampling rate distortion function.
Building on this, we consider a universal setting where the joint pmf of a discrete memoryless multiple source is known only to belong to a finite family of pmfs. In Bayesian and non-Bayesian settings, single-letter characterizations are provided for the universal sampling rate distortion function for fixed-set sampling, independent random sampling and memoryless random sampling. We show that these sampling mechanisms successively improve upon each other:
(i) in their ability to enable an associated encoder to approximate the underlying joint pmf, and
(ii) in their ability to choose appropriate subsets of the multiple source for compression by the encoder.
Lastly, we consider a jointly Gaussian multiple memoryless source, to be reconstructed under a mean-squared error distortion criterion, with joint probability distribution function known only to belong to an uncountable family of probability density functions (characterized by a convex compact subset in Euclidean space). For fixed-set sampling, we characterize the universal sampling rate distortion function in Bayesian and nonBayesian settings. We also provide optimal reconstruction algorithms, of reduced complexity, which compress and reconstruct the sampled source components first under a modified distortion criterion, and then form MMSE estimates for the unsampled components based on reconstructions of the former.
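
The second stage of those reduced-complexity algorithms is a standard Gaussian conditional-mean (MMSE) estimate of the unsampled components from the reconstructions of the sampled ones. A minimal sketch, with an invented three-component covariance matrix and sampled set:

import numpy as np

# Zero-mean jointly Gaussian source with an invented covariance; components
# S are sampled and reconstructed, component U is estimated afterwards.
K = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
S, U = [0, 2], [1]

def mmse_unsampled(xhat_S):
    # E[X_U | X_S = xhat_S]: the conditional mean of a Gaussian vector.
    return K[np.ix_(U, S)] @ np.linalg.solve(K[np.ix_(S, S)], xhat_S)

print(mmse_unsampled(np.array([0.8, -0.2])))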
The questions addressed in this dissertation are motivated by various applications, e.g., dynamic thermal management for multicore processors, in-network computation and satellite imaging.