Search CORE

6 research outputs found

Multiterminal source coding: sum-rate loss, code designs, and applications to video sensor networks

Author: Yang Yang
Publication venue
Publication date: 15/05/2009
Field of study

Driven by a host of emerging applications (e.g., sensor networks and wireless video), distributed source coding (i.e., Slepian-Wolf coding, Wyner-Ziv coding and various other forms of multiterminal source coding), has recently become a very active research area. This dissertation focuses on multiterminal (MT) source coding problem, and consists of three parts. The first part studies the sum-rate loss of an important special case of quadratic Gaussian multi-terminal source coding, where all sources are positively symmetric and all target distortions are equal. We first give the minimum sum-rate for joint encoding of Gaussian sources in the symmetric case, and then show that the supremum of the sum-rate loss due to distributed encoding in this case is 1 2 log2 5 4 = 0:161 b/s when L = 2 and increases in the order of º L 2 log2 e b/s as the number of terminals L goes to infinity. The supremum sum-rate loss of 0:161 b/s in the symmetric case equals to that in general quadratic Gaussian two-terminal source coding without the symmetric assumption. It is conjectured that this equality holds for any number of terminals. In the second part, we present two practical MT coding schemes under the framework of Slepian-Wolf coded quantization (SWCQ) for both direct and indirect MT problems. The first, asymmetric SWCQ scheme relies on quantization and Wyner-Ziv coding, and it is implemented via source splitting to achieve any point on the sum-rate bound. In the second, conceptually simpler scheme, symmetric SWCQ, the two quantized sources are compressed using symmetric Slepian-Wolf coding via a channel code partitioning technique that is capable of achieving any point on the Slepian-Wolf sum-rate bound. Our practical designs employ trellis-coded quantization and turbo/LDPC codes for both asymmetric and symmetric Slepian-Wolf coding. Simulation results show a gap of only 0.139-0.194 bit per sample away from the sum-rate bound for both direct and indirect MT coding problems. The third part applies the above two MT coding schemes to two practical sources, i.e., stereo video sequences to save the sum rate over independent coding of both sequences. Experiments with both schemes on stereo video sequences using H.264, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give slightly smaller sum rate than separate H.264 coding of both sequences at the same video quality

Scalable video compression with optimized visual performance and random accessibility

Author: Leung Raymond
Publication venue: UNSW, Sydney
Publication date: 01/01/2006
Field of study

This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video

Sparse image approximation with application to flexible image coding

Author: Figueras i Ventura Rosa Maria
Publication venue: Lausanne, EPFL
Publication date: 25/05/2005
Field of study

Natural images are often modeled through piecewise-smooth regions. Region edges, which correspond to the contours of the objects, become, in this model, the main information of the signal. Contours have the property of being smooth functions along the direction of the edge, and irregularities on the perpendicular direction. Modeling edges with the minimum possible number of terms is of key importance for numerous applications, such as image coding, segmentation or denoising. Standard separable basis fail to provide sparse enough representation of contours, due to the fact that this kind of basis do not see the regularity of edges. In order to be able to detect this regularity, a new method based on (possibly redundant) sets of basis functions able to capture the geometry of images is needed. This thesis presents, in a first stage, a study about the features that basis functions should have in order to provide sparse representations of a piecewise-smooth image. This study emphasizes the need for edge-adapted basis functions, capable to accurately capture local orientation and anisotropic scaling of image structures. The need of different anisotropy degrees and orientations in the basis function set leads to the use of redundant dictionaries. However, redundant dictionaries have the inconvenience of giving no unique sparse image decompositions, and from all the possible decompositions of a signal in a redundant dictionary, just the sparsest is needed. There are several algorithms that allow to find sparse decompositions over redundant dictionaries, but most of these algorithms do not always guarantee that the optimal approximation has been recovered. To cope with this problem, a mathematical study about the properties of sparse approximations is performed. From this, a test to check whether a given sparse approximation is the sparsest is provided. The second part of this thesis presents a novel image approximation scheme, based on the use of a redundant dictionary. This scheme allows to have a good approximation of an image with a number of terms much smaller than the dimension of the signal. This novel approximation scheme is based on a dictionary formed by a combination of anisotropically refined and rotated wavelet-like mother functions and Gaussians. An efficient Full Search Matching Pursuit algorithm to perform the image decomposition in such a dictionary is designed. Finally, a geometric image coding scheme based on the image approximated over the anisotropic and rotated dictionary of basis functions is designed. The coding performances of this dictionary are studied. Coefficient quantization appears to be of crucial importance in the design of a Matching Pursuit based coding scheme. Thus, a quantization scheme for the MP coefficients has been designed, based on the theoretical energy upper bound of the MP algorithm and the empirical observations of the coefficient distribution and evolution. Thanks to this quantization, our image coder provides low to medium bit-rate image approximations, while it allows for on the fly resolution switching and several other affine image transformations to be performed directly in the transformed domain

Connecting the Speed-Accuracy Trade-Offs in Sensorimotor Control and Neurophysiology Reveals Diversity Sweet Spots in Layered Control Architectures

Author: Nakahira Yorie
Publication venue
Publication date: 01/01/2019
Field of study

Nervous systems sense, communicate, compute, and actuate movement using distributed components with trade-offs in speed, accuracy, sparsity, noise, and saturation. Nevertheless, the resulting control can achieve remarkably fast, accurate, and robust performance due to a highly effective layered control architecture. However, this architecture has received little attention from the existing research. This is in part because of the lack of theory that connects speed-accuracy trade-offs (SATs) in the components neurophysiology with system-level sensorimotor control and characterizes the overall system performance when different layers (planning vs. reflex layer) act work jointly. In thesis, we present a theoretical framework that provides a synthetic perspective of both levels and layers. We then use this framework to clarify the properties of effective layered architectures and explain why there exists extreme diversity across layers (planning vs. reflex layers) and within levels (sensorimotor versus neural/muscle hardware levels). The framework characterizes how the sensorimotor SATs are constrained by the component SATs of neurons communicating with spikes and their sensory and muscle endpoints, in both stochastic and deterministic models. The theoretical predictions are also verified using driving experiments. Our results lead to a novel concept, termed ``diversity sweet spots (DSSs)'': the appropriate diversity in the properties of neurons and muscles across layers and within levels help create systems that are both fast and accurate despite being built from components that are individually slow or inaccurate. At the component level, this concept explains why there are extreme heterogeneities in the neural or muscle composition. At the system level, DSSs explain the benefits of layering to allow extreme heterogeneities in speed and accuracy in different sensorimotor loops. Similar issues and properties also extend down to the cellular level in biology and outward to our most advanced network technologies from smart grid to the Internet of Things. We present our initial step in expanding our framework to that area and widely-open area of research for future direction

Caltech Theses and Dissertations