The Rate-Distortion Function and Excess-Distortion Exponent of Sparse Regression Codes with Optimal Encoding
This paper studies the performance of sparse regression codes for lossy
compression with the squared-error distortion criterion. In a sparse regression
code, codewords are linear combinations of subsets of columns of a design
matrix. It is shown that with minimum-distance encoding, sparse regression
codes achieve the Shannon rate-distortion function for i.i.d. Gaussian sources
as well as the optimal excess-distortion exponent. This completes a
previous result which showed that the rate-distortion function and the
optimal exponent were achievable for distortions below a certain
threshold. The proof of the
rate-distortion result is based on the second moment method, a popular
technique to show that a non-negative random variable is strictly positive
with high probability. In our context, this random variable is the number
of codewords within target distortion of the source sequence. We first
identify the reason
behind the failure of the standard second moment method for certain
distortions, and illustrate the different failure modes via a stylized example.
We then use a refinement of the second moment method to show that the
rate-distortion function is achievable for all distortion values.
Finally, the refinement technique is
applied to Suen's correlation inequality to prove the achievability of the
optimal Gaussian excess-distortion exponent.
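For reference, the standard second moment method mentioned in the abstract is the following bound (written here in a generic form; the notation is not taken from the paper):

```latex
\Pr(X > 0) \;\ge\; \frac{\left(\mathbb{E}[X]\right)^2}{\mathbb{E}[X^2]},
\qquad X \ge 0 .
```

Here X counts the codewords within target distortion of the source sequence; the bound is useful only when the second moment does not grow much faster than the square of the first, which is precisely what fails for certain distortion values.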
Lossy Compression via Sparse Linear Regression: Computationally Efficient Encoding and Decoding
We propose computationally efficient encoders and decoders for lossy
compression using a Sparse Regression Code. The codebook is defined by a design
matrix and codewords are structured linear combinations of columns of this
matrix. The proposed encoding algorithm sequentially chooses columns of the
design matrix to successively approximate the source sequence. It is shown to
achieve the optimal distortion-rate function for i.i.d. Gaussian sources under
the squared-error distortion criterion. For a given rate, the parameters of the
design matrix can be varied to trade off distortion performance with encoding
complexity. An example of such a trade-off as a function of the block length n
is the following. With computational resource (space or time) per source sample
of O((n/log n)^2), for a fixed distortion-level above the Gaussian
distortion-rate function, the probability of excess distortion decays
exponentially in n. The Sparse Regression Code is robust in the following
sense: for any ergodic source, the proposed encoder achieves the optimal
distortion-rate function of an i.i.d. Gaussian source with the same variance.
Simulations show that the encoder has good empirical performance, especially at
low and moderate rates.
Comment: 14 pages, to appear in IEEE Transactions on Information Theory.
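The sequential column-selection idea described above can be illustrated with a minimal sketch: at each step, pick the column in the current section most aligned with the residual, subtract its scaled contribution, and continue. The function name, the coefficient vector c, and the matrix shapes are illustrative placeholders, not the authors' implementation.

```python
import numpy as np

def sparc_encode(x, A, L, M, c):
    """Greedy successive-approximation encoding sketch for a sparse
    regression code.

    A: n x (L*M) design matrix, viewed as L sections of M columns each.
    c: length-L vector of section coefficients.
    Returns the chosen column index per section and the final residual.
    """
    r = x.copy()
    chosen = []
    for l in range(L):
        section = A[:, l * M:(l + 1) * M]
        j = int(np.argmax(section.T @ r))   # column most aligned with residual
        chosen.append(l * M + j)
        r = r - c[l] * section[:, j]        # peel off that column's contribution
    return chosen, r
```

The chosen indices (one per section) form the compressed representation; the final residual determines the achieved distortion.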
Quantized Estimation of Gaussian Sequence Models in Euclidean Balls
A central result in statistical theory is Pinsker's theorem, which
characterizes the minimax rate in the normal means model of nonparametric
estimation. In this paper, we present an extension to Pinsker's theorem where
estimation is carried out under storage or communication constraints. In
particular, we place limits on the number of bits used to encode an estimator,
and analyze the excess risk in terms of this constraint, the signal size, and
the noise level. We give sharp upper and lower bounds for the case of a
Euclidean ball, which establishes the Pareto-optimal minimax tradeoff between
storage and risk in this setting.Comment: Appearing at NIPS 201
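The general shape of estimation under a bit budget — shrink the observations, then quantize each coordinate — can be sketched as follows. This is a toy illustration of the setting, not the coding scheme analyzed in the paper; the shrinkage factor, grid, and function name are all assumptions made for the example.

```python
import numpy as np

def quantized_estimate(y, sigma2, c2, bits_per_coord):
    """Toy quantized estimator for the normal means model.

    y: observations y_i = theta_i + noise, noise variance sigma2.
    c2: squared radius of the Euclidean ball assumed to contain theta.
    Each coordinate of a linear shrinkage estimate is uniformly
    quantized with bits_per_coord bits over [-sqrt(c2), sqrt(c2)].
    """
    shrink = c2 / (c2 + sigma2)            # simple linear shrinkage toward 0
    est = shrink * y
    levels = 2 ** bits_per_coord
    lo, hi = -np.sqrt(c2), np.sqrt(c2)
    step = (hi - lo) / levels
    idx = np.clip(np.floor((est - lo) / step), 0, levels - 1)  # stored index
    return lo + (idx + 0.5) * step         # midpoint reconstruction
```

Only the integer indices need to be stored or transmitted, so the total budget is n * bits_per_coord bits; the excess risk then has a quantization term that shrinks as the budget grows.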
Learning to compress and search visual data in large-scale systems
The problem of high-dimensional and large-scale representation of visual data
is addressed from an unsupervised learning perspective. The emphasis is put on
discrete representations, where the description length can be measured in bits
and hence the model capacity can be controlled. The algorithmic infrastructure
is developed based on the synthesis and analysis prior models whose
rate-distortion properties, as well as capacity vs. sample complexity
trade-offs are carefully optimized. These models are then extended to
multi-layers, namely the RRQ and the ML-STC frameworks, where the latter is
further evolved as a powerful deep neural network architecture with fast and
sample-efficient training and discrete representations. For the developed
algorithms, three important applications are developed. First, the problem of
large-scale similarity search in retrieval systems is addressed, where a
double-stage solution is proposed leading to faster query times and shorter
database storage. Second, the problem of learned image compression is targeted,
where the proposed models can capture more redundancies from the training
images than the conventional compression codecs. Finally, the proposed
algorithms are used to solve ill-posed inverse problems. In particular, the
problems of image denoising and compressive sensing are addressed with
promising results.
Comment: PhD thesis dissertation.
Design Techniques for Efficient Sparse Regression Codes
Sparse regression codes (SPARCs) are a recently introduced coding scheme for
the additive white Gaussian noise channel, for which polynomial-time decoding
algorithms that provably achieve the Shannon channel capacity have been
proposed. One such algorithm is the approximate message passing (AMP) decoder.
However, directly implementing these decoders does not yield good empirical
performance at practical block lengths. This thesis develops techniques for
improving both the error-rate performance and the time and memory complexity
of the AMP decoder. It focuses on practical and efficient implementations for
both single- and multi-user scenarios.
A key design parameter for SPARCs is the power allocation, a vector of
coefficients that determines how codewords are constructed. In this thesis,
novel power allocation schemes are proposed which reduce error rates by
several orders of magnitude compared to previous designs. Further
improvements to error rate come from investigating
the role of other SPARC construction parameters, and from performing an online estimation
of a key AMP parameter instead of using a pre-computed value.
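A common baseline in the SPARC literature is an exponentially decaying power allocation across sections, which the improved schemes in the thesis modify. A minimal sketch (function name and normalization are illustrative):

```python
import numpy as np

def exp_power_allocation(L, C, P):
    """Exponentially decaying power allocation across L sections.

    Section l receives power proportional to 2^(-2*C*l/L), where C is
    the channel capacity in bits and P is the total power budget.
    """
    p = 2.0 ** (-2.0 * C * np.arange(L) / L)
    return P * p / p.sum()                 # normalize so the powers sum to P
```

Earlier sections get more power and are decoded more reliably; tuning how quickly the allocation decays (or partially flattening it) is one of the levers behind the error-rate improvements described above.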
Another significant improvement to error rates comes from a novel three-stage decoder
which combines SPARCs with an outer code based on low-density parity-check codes. This
construction protects only vulnerable sections of the SPARC codeword with the outer code,
minimising the impact on the code rate. The combination provides a sharp waterfall in bit error
rates and very low overall codeword error rates.
Two changes to the basic SPARC structure are proposed to reduce computational and
memory complexity. First, the design matrix is replaced with an efficient in-place transform
based on Hadamard matrices, which dramatically reduces the overall decoder time and memory complexity with no impact on error rate. Second, an alternative SPARC design is developed, called Modulated SPARCs. These are shown to also achieve the Shannon channel capacity, while obtaining similar empirical error rates to the original SPARC, and permitting a
further reduction in time and memory complexity.
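The in-place Hadamard idea can be illustrated with a textbook fast Walsh-Hadamard transform, which applies an n x n Hadamard matrix in O(n log n) operations without ever storing the matrix. This is a generic sketch, not the thesis code:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform via butterfly updates on a copy.

    Computes H @ x for the Sylvester-ordered n x n Hadamard matrix H,
    where n must be a power of two. Replacing an explicit design matrix
    with this transform lets a decoder apply A and A^T in O(n log n)
    time and O(n) memory.
    """
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly step
        h *= 2
    return x
```

Since H squared equals n times the identity, applying the transform twice recovers n times the input, which gives a quick correctness check.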
Finally, SPARCs are implemented for the broadcast and multiple access channels, and for
the multiple description and Wyner-Ziv source coding models. Designs for appropriate power
allocations and decoding strategies are proposed and are found to give good empirical results,
demonstrating that SPARCs are also well suited to these multi-user settings.
Funded by a Doctoral Training Award from the Engineering and Physical Sciences Research Council.