Scanning and Sequential Decision Making for Multi-Dimensional Data - Part I: the Noiseless Case
We investigate the problem of scanning and prediction ("scandiction", for
short) of multidimensional data arrays. This problem arises in several aspects
of image and video processing, such as predictive coding, for example, where an
image is compressed by coding the error sequence resulting from scandicting it.
Thus, it is natural to ask what is the optimal method to scan and predict a
given image, what is the resulting minimum prediction loss, and whether there
exist specific scandiction schemes which are universal in some sense.
Specifically, we investigate the following problems: First, modeling the data
array as a random field, we wish to examine whether there exists a scandiction
scheme which is independent of the field's distribution, yet asymptotically
achieves the same performance as if this distribution was known. This question
is answered in the affirmative for the set of all spatially stationary random
fields and under mild conditions on the loss function. We then discuss the
scenario where a non-optimal scanning order is used, yet accompanied by an
optimal predictor, and derive bounds on the excess loss compared to optimal
scanning and prediction.
This paper is the first part of a two-part paper on sequential decision
making for multi-dimensional data. It deals with clean, noiseless data arrays.
The second part deals with noisy data arrays, namely, with the case where the
decision maker observes only a noisy version of the data, yet it is judged with
respect to the original, clean data.
Comment: 46 pages, 2 figures. Revised version: title changed, section 1
revised, section 3.1 added, a few minor/technical corrections made
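To make the notion of "scandiction" concrete, here is a minimal sketch in Python. It is not the paper's universal scheme: the toy image, the two scan orders, the squared-error loss, and the trivial "repeat the previously scanned value" predictor are all assumptions chosen for illustration. It only shows that, for a fixed predictor, the scan order alone can change the prediction loss.

```python
def raster_scan(rows, cols):
    """Row-by-row scan order, returned as a list of (row, col) sites."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def column_scan(rows, cols):
    """Column-by-column scan order, returned as a list of (row, col) sites."""
    return [(r, c) for c in range(cols) for r in range(rows)]


def scandiction_loss(image, scan):
    """Average squared error when each site is predicted by the value
    scanned just before it (illustrative predictor, not an optimal one)."""
    prev = 0.0  # assumed initial prediction before anything is observed
    total = 0.0
    for r, c in scan:
        total += (image[r][c] - prev) ** 2
        prev = image[r][c]
    return total / len(scan)


# Toy 4x4 image with horizontal stripes: rows alternate between 0 and 1.
image = [[float(r % 2)] * 4 for r in range(4)]

raster_loss = scandiction_loss(image, raster_scan(4, 4))
column_loss = scandiction_loss(image, column_scan(4, 4))
print(raster_loss, column_loss)
```

On this striped image the raster scan only mispredicts at the start of a new row, whereas the column scan crosses a stripe boundary at almost every step, so its loss is much larger.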
PEA265: Perceptual Assessment of Video Compression Artifacts
The most widely used video encoders share a common hybrid coding framework
that includes block-based motion estimation/compensation and block-based
transform coding. Despite their high coding efficiency, the encoded videos
often exhibit visually annoying artifacts, denoted as Perceivable Encoding
Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience
(QoE) of end users. To monitor and improve visual QoE, it is crucial to develop
subjective and objective measures that can identify and quantify various types
of PEAs. In this work, we make the first attempt to build a large-scale
subject-labelled database composed of H.265/HEVC compressed videos containing
various PEAs. The database, namely the PEA265 database, includes 4 types of
spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types
of temporal PEAs (i.e. flickering and floating), each containing at least
60,000 image or video patches with positive and negative labels. To objectively
identify these PEAs, we train Convolutional Neural Networks (CNNs) using the
PEA265 database. It appears that state-of-the-art ResNeXt is capable of
identifying each type of PEA with high accuracy. Furthermore, we define PEA
pattern and PEA intensity measures to quantify PEA levels of compressed video
sequences. We believe that the PEA265 database and our findings will benefit the
future development of video quality assessment methods and perceptually
motivated video encoders.
Comment: 10 pages, 15 figures, 4 tables
Near-Optimal Rates for Limited-Delay Universal Lossy Source Coding
We consider the problem of limited-delay lossy coding of individual sequences. Here, the goal is to design (fixed-rate) compression schemes to minimize the normalized expected distortion redundancy relative to a reference class of coding schemes, measured as the difference between the average distortion of the algorithm and that of the best coding scheme in the reference class. In compressing a sequence of length T, the best schemes available in the literature achieve an O(T^-1/3) normalized distortion redundancy relative to finite reference classes of limited delay and limited memory, and the same redundancy is achievable, up to logarithmic factors, when the reference class is the set of scalar quantizers. It has also been shown that the distortion redundancy is at least of order T^-1/2 in the latter case, and the lower bound can easily be extended to sufficiently powerful (possibly finite) reference coding schemes. In this paper, we narrow the gap between the upper and lower bounds, and give a compression scheme whose normalized distortion redundancy is O(ln(T)/T^1/2) relative to any finite class of reference schemes, only a logarithmic factor larger than the lower bound. The method is based on the recently introduced shrinking dartboard prediction algorithm, a variant of exponentially weighted average prediction. The algorithm is also extended to the problem of joint source-channel coding over a (known) stochastic noisy channel and to the case when side information is also available to the decoder (the Wyner–Ziv setting). The same improvements are obtained for these settings as in the case of a noiseless channel. Our method is also applied to the problem of zero-delay scalar quantization, where O(ln(T)/T^1/2) normalized distortion redundancy is achieved relative to the (infinite) class of scalar quantizers of a given rate, almost achieving the known lower bound of order T^-1/2. 
The computationally efficient algorithms known for scalar quantization and the Wyner–Ziv setting carry over to our (improved) coding schemes presented in this paper.
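The abstract above builds on exponentially weighted average prediction. The sketch below, in Python, shows only the plain exponentially weighted forecaster over a finite class of "experts" (standing in for reference schemes); it is not the shrinking dartboard algorithm and not the paper's coding scheme. The loss table, the number of experts, and the learning rate `eta` are illustrative assumptions. With losses in [0, 1] and `eta = sqrt(8 ln N / T)`, the standard guarantee for this forecaster is an expected per-step regret of at most `sqrt(ln N / (2T))`, which decays at the T^-1/2 order discussed in the abstract.

```python
import math


def exp_weights_expected_loss(loss_rows, eta):
    """Expected cumulative loss of a forecaster that picks expert i at time t
    with probability proportional to exp(-eta * cumulative_loss_i).

    loss_rows[t][i] is the loss in [0, 1] of expert i at time t.
    Returns (expected loss of the forecaster, loss of the best expert).
    """
    n = len(loss_rows[0])
    cum = [0.0] * n
    expected = 0.0
    for losses in loss_rows:
        m = min(cum)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (c - m)) for c in cum]
        z = sum(weights)
        expected += sum(w * l for w, l in zip(weights, losses)) / z
        cum = [c + l for c, l in zip(cum, losses)]
    return expected, min(cum)


# Hypothetical loss table: T steps, N = 3 experts, losses in [0, 1].
T = 200
loss_rows = [[float(t % 2), 0.25, float(t % 3 == 0)] for t in range(T)]

eta = math.sqrt(8 * math.log(3) / T)
expected, best = exp_weights_expected_loss(loss_rows, eta)
regret_per_step = (expected - best) / T
bound_per_step = math.sqrt(math.log(3) / (2 * T))
print(regret_per_step, bound_per_step)
```

The forecaster here may switch experts at every step; limiting how often the choice changes is the kind of refinement the shrinking dartboard variant addresses, and it is deliberately left out of this sketch.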
Implementation issues in source coding
An edge preserving image coding scheme which can be operated in both a lossy and a lossless manner was developed. The technique is an extension of the lossless encoding algorithm developed for the Mars Observer spectral data. It can also be viewed as a modification of the DPCM algorithm. A packet video simulator was also developed from an existing modified packet network simulator. The coding scheme for this system is a modification of the mixture block coding (MBC) scheme described in the last report. Coding algorithms for packet video were also investigated.
On Predictive Coding for Erasure Channels Using a Kalman Framework
We present a new design method for robust low-delay coding of autoregressive (AR) sources for transmission across erasure channels. It is a fundamental rethinking of existing concepts. It considers the encoder a mechanism that produces signal measurements from which the decoder estimates the original signal. The method is based on linear predictive coding and Kalman estimation at the decoder. We employ a novel encoder state-space representation with a linear quantization noise model. The encoder is represented by the Kalman measurement at the decoder. The presented method designs the encoder and decoder offline through an iterative algorithm based on closed-form minimization of the trace of the decoder state error covariance. The design method is shown to provide considerable performance gains, when the transmitted quantized prediction errors are subject to loss, in terms of signal-to-noise ratio (SNR) compared to the same coding framework optimized for no loss. The design method applies to stationary auto-regressive sources of any order. We demonstrate the method in a framework based on a generalized differential pulse code modulation (DPCM) encoder. The presented principles can be applied to more complicated coding systems that incorporate predictive coding as well.
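For readers unfamiliar with DPCM, the following Python sketch shows a basic closed-loop, first-order DPCM encoder and decoder with a uniform quantizer. It is a baseline illustration only: it assumes a lossless channel, uses no Kalman estimation, and is not the generalized DPCM design of the paper. The predictor coefficient `a`, the quantizer step `step`, and the sample values are hypothetical.

```python
def dpcm_encode(samples, step, a=1.0):
    """Closed-loop first-order DPCM: quantize the prediction error of each
    sample, predicting from the previous *reconstructed* value."""
    indices = []
    recon_prev = 0.0  # assumed initial decoder state
    for x in samples:
        pred = a * recon_prev
        q = round((x - pred) / step)  # uniform quantizer index
        indices.append(q)
        recon_prev = pred + q * step  # track what the decoder will rebuild
    return indices


def dpcm_decode(indices, step, a=1.0):
    """Rebuild the signal from quantizer indices (no erasures assumed)."""
    out = []
    recon_prev = 0.0
    for q in indices:
        recon_prev = a * recon_prev + q * step
        out.append(recon_prev)
    return out


# Hypothetical input signal and parameters.
samples = [0.0, 0.3, 0.9, 1.4, 1.1, 0.2]
step = 0.25
indices = dpcm_encode(samples, step, a=0.9)
decoded = dpcm_decode(indices, step, a=0.9)
max_error = max(abs(x - y) for x, y in zip(samples, decoded))
print(indices, max_error)
```

Because the encoder predicts from its own reconstruction, the per-sample error stays within half a quantizer step as long as every index arrives. If an index is lost, the decoder state drifts away from the encoder state, which is the failure mode that erasure-robust designs such as the one described above aim to mitigate.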
Universal Sampling Rate Distortion
We examine the coordinated and universal rate-efficient sampling of a subset
of correlated discrete memoryless sources followed by lossy compression of the
sampled sources. The goal is to reconstruct a predesignated subset of sources
within a specified level of distortion. The combined sampling mechanism and
rate distortion code are universal in that they are devised to perform robustly
without exact knowledge of the underlying joint probability distribution of the
sources. In Bayesian as well as non-Bayesian settings, single-letter
characterizations are provided for the universal sampling rate distortion
function for fixed-set sampling, independent random sampling and memoryless
random sampling. It is illustrated how these sampling mechanisms are
successively better. Our achievability proofs bring forth new schemes for joint
source distribution-learning and lossy compression.
Data Processing Bounds for Scalar Lossy Source Codes with Side Information at the Decoder
In this paper, we introduce new lower bounds on the distortion of scalar
fixed-rate codes for lossy compression with side information available at the
receiver. These bounds are derived by presenting the relevant random variables
as a Markov chain and applying generalized data processing inequalities a la
Ziv and Zakai. We show that by replacing the logarithmic function with other
functions, in the data processing theorem we formulate, we obtain new lower
bounds on the distortion of scalar coding with side information at the decoder.
The usefulness of these results is demonstrated for uniform sources and the
convex function , . The bounds in this case are
shown to be better than one can obtain from the Wyner-Ziv rate-distortion
function.
Comment: 35 pages, 9 figures