Distributed Source Coding Techniques for Lossless Compression of Hyperspectral Images
This paper deals with the application of distributed source coding (DSC) theory to remote sensing image compression. Although DSC exhibits significant potential in many application fields, the results obtained so far on real signals fall short of the theoretical bounds and often impose additional system-level constraints. The objective of this paper is to assess the potential of DSC for lossless image compression carried out onboard a remote platform. We first provide a brief overview of DSC of correlated information sources. We then focus on onboard lossless image compression and apply DSC techniques to reduce the complexity of the onboard encoder, at the expense of the decoder's, by exploiting the correlation between different bands of a hyperspectral dataset. Specifically, we propose two different compression schemes: one based on powerful binary error-correcting codes employed as source codes, and one based on simpler multilevel coset codes. The performance of both schemes is evaluated on a few AVIRIS scenes and compared with other state-of-the-art 2D and 3D coders. Both schemes achieve competitive compression performance, and one of them also has reduced complexity. Based on these results, we highlight the main issues that remain to be solved to further improve the performance of DSC-based remote sensing systems.
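The coset-code idea behind such schemes can be sketched with a toy syndrome coder. The snippet below is a minimal illustration, not the paper's actual codes: a Hamming(7,4) parity-check matrix compresses each 7-bit block of one band to its 3-bit syndrome, and the decoder recovers the block from the syndrome plus a correlated band, assuming at most one bit differs per block.

```python
import numpy as np

# Parity-check matrix of the Hamming(7,4) code: column j is the binary
# representation of j+1, so a single-bit error position can be read
# directly off the syndrome.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

def encode(x):
    """Compress a 7-bit block x to its 3-bit syndrome s = H x (mod 2)."""
    return H @ x % 2

def decode(s, y):
    """Recover x from syndrome s and side information y, assuming x and y
    differ in at most one position per block."""
    diff = (H @ y % 2) ^ s          # syndrome of the pattern x ^ y
    pos = diff[0] * 4 + diff[1] * 2 + diff[2]
    x_hat = y.copy()
    if pos:                          # non-zero syndrome -> flip that bit
        x_hat[pos - 1] ^= 1
    return x_hat

x = np.array([1, 0, 1, 1, 0, 0, 1])  # block from the current band
y = x.copy(); y[4] ^= 1              # side info: correlated band, one bit off
s = encode(x)                        # only 3 bits transmitted instead of 7
assert np.array_equal(decode(s, y), x)
```

The encoder never sees the side information, which is exactly what keeps the onboard complexity low; all the correlation exploitation happens at the decoder.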
Remote Sensing Data Compression
A huge amount of data is acquired nowadays by different remote sensing systems installed on satellites, aircraft, and UAVs. The acquired data then have to be transferred to image processing centres, stored, and/or delivered to customers. In scenarios with restricted transmission or storage resources, data compression is strongly desired or even necessary. A wide diversity of coding methods can be used, depending on the requirements and their priority. In addition, the types and properties of images differ a lot; thus, practical implementation aspects have to be taken into account. The Special Issue paper collection on which this book is based touches on all of the aforementioned items to some degree, giving the reader an opportunity to learn about recent developments and research directions in the field of image compression. In particular, lossless and near-lossless compression of multi- and hyperspectral images remains a topical problem, since such images constitute data arrays of extremely large size with rich information that can be retrieved for various applications. Another important aspect is the impact of lossy compression on image classification and segmentation, where a reasonable compromise between the characteristics of compression and the final tasks of data processing has to be achieved. The problems of data transmission from UAV-based acquisition platforms, as well as the use of FPGAs and neural networks, have also become very important. Finally, attempts to apply compressive sensing approaches to remote sensing image processing, with positive outcomes, are observed. We hope that readers will find our book useful and interesting.
Intra-Key-Frame Coding and Side Information Generation Schemes in Distributed Video Coding
In this thesis, improved schemes are proposed for intra-key-frame coding and side information (SI) generation in a distributed video coding (DVC) framework. DVC developments over the last few years have put increasing thrust on intra-frame coding and better-quality SI generation. The two are interrelated, since SI generation depends on the quality of the decoded key frames: superior-quality key frames produced by intra-key-frame coding are in turn utilized to generate good-quality SI frames. As a result, the DVC decoder needs fewer parity bits to reconstruct the WZ frames. Keeping this in mind, we have proposed two schemes for intra-key-frame coding, namely,
(a) Burrows-Wheeler Transform based H.264/AVC (Intra) intra-frame coding
(BWT-H.264/AVC(Intra))
(b) Dictionary based H.264/AVC (Intra) intra-frame coding using orthogonal
matching pursuit (DBOMP-H.264/AVC (Intra))
The BWT-H.264/AVC (Intra) scheme is a modified version of the H.264/AVC (Intra) scheme in which a regularized bit stream is generated prior to compression. This scheme yields higher compression efficiency as well as high-quality decoded key frames. The DBOMP-H.264/AVC (Intra) scheme is based on an adaptive dictionary combined with H.264/AVC (Intra) intra-frame coding: the traditional transform is replaced with a dictionary trained using the K-singular value decomposition (K-SVD) algorithm, and the sparse representations over this dictionary are computed using orthogonal matching pursuit (OMP).
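The OMP step named above can be sketched compactly. The snippet below is a simplified illustration, not the thesis implementation: it uses an orthonormal dictionary for clarity (K-SVD would learn an overcomplete one from training blocks) and greedily selects the atoms that best explain a sparse signal.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit: pick the atom most correlated
    with the residual, then re-fit all selected coefficients by least squares."""
    support, residual, coef = [], y.copy(), None
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return support, coef

rng = np.random.default_rng(0)
# Orthonormal dictionary for simplicity; recovery is then exact for
# exactly-sparse signals.
D, _ = np.linalg.qr(rng.standard_normal((16, 16)))
x0 = np.zeros(16)
x0[[2, 7, 11]] = [3.0, -2.0, 1.5]    # 3-sparse coefficient vector
y = D @ x0                           # observed block
support, coef = omp(D, y, 3)
assert sorted(support) == [2, 7, 11]            # true atoms found
assert np.allclose(D[:, support] @ coef, y)     # exact reconstruction
```

With an orthonormal dictionary the correlations D.T @ y equal the true coefficients, so the greedy picks are guaranteed correct; overcomplete learned dictionaries trade that guarantee for better sparsity on real image blocks.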
Further, two side information generation schemes have been proposed, namely,
(a) Multilayer Perceptron based side information generation (MLP - SI)
(b) Multivariable support vector regression based side information generation
(MSVR-SI)
The MLP-SI scheme utilizes a multilayer perceptron (MLP) to estimate SI frames from the decoded key frames block-by-block. The network is trained offline using training patterns collected from frames of standard video sequences. The MSVR-SI scheme uses an optimized multivariable support vector regression (M-SVR) to generate SI frames from decoded key frames block-by-block. Like the MLP, the M-SVR is trained offline with training patterns known a priori.
Both the intra-key-frame coding and SI generation schemes are embedded in the Stanford-based DVC architecture and studied individually to compare their performance with competing schemes. Visual as well as quantitative evaluations have been made to show the efficacy of the schemes. To exploit the usefulness of the intra-frame coding schemes in SI generation, four hybrid schemes have been formulated by combining the aforementioned schemes as follows:
(a) BWT-MLP scheme, which uses the BWT-H.264/AVC (Intra) intra-frame coding scheme and the MLP-SI side information generation scheme.
(b) BWT-MSVR scheme, which utilizes BWT-H.264/AVC (Intra) intra-frame coding followed by MSVR-SI side information generation.
(c) DBOMP-MLP scheme, which combines DBOMP-H.264/AVC (Intra) intra-frame coding with MLP-SI side information generation.
(d) DBOMP-MSVR scheme, which combines DBOMP-H.264/AVC (Intra) intra-frame coding with MSVR-SI side information generation.
The hybrid schemes are also incorporated into the Stanford-based DVC architecture, and simulations have been carried out on standard video sequences. Performance analysis with respect to overall rate distortion, number of requests per SI frame, temporal evaluation, and decoding time requirement has been made to derive an overall conclusion.
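The block-wise SI prediction described above can be illustrated with a much simpler stand-in. The sketch below replaces the thesis's trained MLP/M-SVR with a per-pixel least-squares predictor mapping two decoded key frames to the intermediate WZ frame; the frames and the half-half ground truth are synthetic, chosen so the learned weights are checkable.

```python
import numpy as np

# Toy side-information predictor: learn weights that map the two decoded
# key frames to the intermediate (WZ) frame.  The thesis trains an MLP or
# M-SVR block-by-block; plain least squares on pixel pairs is a stand-in.
rng = np.random.default_rng(1)
prev_kf = rng.uniform(0, 255, size=(64, 64))   # decoded key frame t-1
next_kf = rng.uniform(0, 255, size=(64, 64))   # decoded key frame t+1
wz_true = 0.5 * (prev_kf + next_kf)            # synthetic ground-truth WZ frame

# Training features: one (prev, next) pixel pair per sample.
X = np.stack([prev_kf.ravel(), next_kf.ravel()], axis=1)
w, *_ = np.linalg.lstsq(X, wz_true.ravel(), rcond=None)
si = (X @ w).reshape(64, 64)                   # generated side information

assert np.allclose(w, [0.5, 0.5])              # recovers the averaging rule
assert np.allclose(si, wz_true)
```

A real DVC decoder would apply such a predictor to overlapping blocks with motion handling; the point here is only the regression structure of SI generation.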
Layered Wyner-Ziv video coding: a new approach to video compression and delivery
Following recent theoretical works on successive Wyner-Ziv coding, we propose a practical layered Wyner-Ziv video coder using the DCT, nested scalar quantization, and irregular LDPC code based Slepian-Wolf coding (or lossless source coding with side information at the decoder). Our main novelty is to use the base layer of a standard scalable video coder (e.g., MPEG-4/H.26L FGS or H.263+) as the decoder side information and perform layered Wyner-Ziv coding for quality enhancement. Similar to FGS coding, there is no performance difference between layered and monolithic Wyner-Ziv coding when the enhancement bitstream is generated in our proposed coder. Using an H.26L coded version as the base layer, experiments indicate that Wyner-Ziv coding gives slightly worse performance than FGS coding when the channel (for both the base and enhancement layers) is noiseless. However, when the channel is noisy, extensive simulations of video transmission over wireless networks conforming to the CDMA2000 1X standard show that H.26L base layer coding plus Wyner-Ziv enhancement layer coding is more robust against channel errors than H.26L FGS coding. These results demonstrate that layered Wyner-Ziv video coding is a promising new technique for video streaming over wireless networks.
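The nested scalar quantization mentioned above can be shown in miniature. The parameters below (step size, coset modulus, sample values) are hypothetical, chosen only to make the mechanism visible: the encoder transmits just the coset index of the quantized sample, and the decoder resolves the ambiguity using its side information.

```python
# Minimal nested scalar quantization sketch (illustrative, not the paper's
# exact coder).  DELTA is the quantizer step; M is the coset modulus, so
# only log2(M) bits are sent per sample instead of the full quantizer index.
DELTA, M = 0.5, 4

def wz_encode(x):
    return round(x / DELTA) % M          # transmit only the coset index

def wz_decode(coset, y):
    # pick the quantizer index in the received coset closest to the side info
    idx = coset + M * round((y / DELTA - coset) / M)
    return idx * DELTA

x, y = 3.7, 3.5                          # correlated source sample and side info
assert wz_encode(x) == 3                 # round(3.7/0.5) = 7, 7 mod 4 = 3
assert wz_decode(wz_encode(x), y) == 3.5 # decoder lands on index 7 -> 3.5
```

Decoding succeeds whenever the source and side information are within about M*DELTA/2 of each other, which is exactly the correlation assumption Wyner-Ziv coding rests on.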
For scalable video transmission over the Internet and 3G wireless networks, we propose a system for receiver-driven layered multicast based on layered Wyner-Ziv video coding and digital fountain coding. Digital fountain codes are near-capacity erasure codes that are ideally suited for multicast applications because of their rateless property. By combining an error-resilient Wyner-Ziv video coder and rateless fountain codes, our system allows reliable multicast of high-quality video to an arbitrary number of heterogeneous receivers without the requirement of feedback channels. Extending this work on separate source-channel coding, we consider distributed joint source-channel coding by using a single channel code for both video compression (via Slepian-Wolf coding) and packet loss protection. We choose Raptor codes - the best approximation to a digital fountain - and address in detail both encoder and decoder designs. Simulation results show that, compared to one separate design using Slepian-Wolf compression plus erasure protection and another based on FGS coding plus erasure protection, the proposed joint design provides better video quality at the same number of transmitted packets.
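The rateless property of fountain codes can be demonstrated with a toy LT-style coder. This is a sketch only: it uses a naive uniform degree distribution rather than the robust soliton distribution of practical LT/Raptor codes, and it abstracts away how the receiver learns each symbol's composition. Encoded symbols are XORs of random packet subsets, and a peeling decoder keeps consuming symbols until all source packets are recovered.

```python
import random

def lt_symbol(rng, data):
    """One rateless encoded symbol: XOR of a random subset of packets."""
    d = rng.randint(1, len(data))                # naive degree distribution
    idxs = set(rng.sample(range(len(data)), d))
    val = 0
    for i in idxs:
        val ^= data[i]
    return idxs, val

def lt_receive(rng, data, k):
    """Keep drawing symbols and peeling until all k packets are known."""
    known, buffered = {}, []
    while len(known) < k:
        idxs, val = lt_symbol(rng, data)         # another symbol arrives
        buffered.append([idxs, val])
        progress = True
        while progress:                          # peeling rounds
            progress = False
            for sym in buffered:
                for i in [i for i in sym[0] if i in known]:
                    sym[0].discard(i)            # substitute recovered packets
                    sym[1] ^= known[i]
                if len(sym[0]) == 1:             # degree-1 after reduction
                    (i,) = sym[0]
                    known[i] = sym[1]
                    sym[0].clear()
                    progress = True
    return [known[i] for i in range(k)]

k = 8
source = [17, 3, 250, 0, 99, 42, 7, 128]         # k source packets (bytes)
recovered = lt_receive(random.Random(4), source, k)
assert recovered == source
```

No feedback is ever needed: a receiver simply collects symbols until decoding completes, which is what makes the construction attractive for multicast to heterogeneous receivers.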
Learning-based Wavelet-like Transforms For Fully Scalable and Accessible Image Compression
The goal of this thesis is to improve the existing wavelet transform with the aid of machine learning techniques, so as to enhance coding efficiency of wavelet-based image compression frameworks, such as JPEG 2000.
In this thesis, we first propose to augment the conventional base wavelet transform with two additional learned lifting steps -- a high-to-low step followed by a low-to-high step. The high-to-low step suppresses aliasing in the low-pass band by using the detail bands at the same resolution, while the low-to-high step aims to further remove redundancy from detail bands by using the corresponding low-pass band. These two additional steps reduce redundancy (notably aliasing information) amongst the wavelet subbands, and also improve the visual quality of reconstructed images at reduced resolutions.
To train these two networks in an end-to-end fashion, we develop a backward annealing approach to overcome the non-differentiability of the quantization and cost functions during back-propagation. Importantly, the two additional networks share a common architecture, named a proposal-opacity topology, which is inspired and guided by a specific theoretical argument related to geometric flow. This particular network topology is compact, with limited non-linearities, allowing a fully scalable system; one pair of trained network parameters is applied for all levels of decomposition and for all bit-rates of interest. By employing the additional lifting networks within the JPEG 2000 image coding standard, we achieve up to 17.4% average BD bit-rate saving over a wide range of bit-rates, while retaining the quality and resolution scalability features of JPEG 2000.
Built upon the success of the high-to-low and low-to-high steps, we then study more broadly the extension of neural networks to all lifting steps that correspond to the base wavelet transform. The purpose of this comprehensive study is to understand the most effective way to develop learned wavelet-like transforms for highly scalable and accessible image compression. Specifically, we examine the impact of the number of learned lifting steps, the number of layers and channels in each learned lifting network, and the kernel support in each layer. To facilitate the study, we develop a generic training methodology that is simultaneously appropriate to all of the lifting structures considered. Experimental results ultimately suggest that, to improve the existing wavelet transform, it is more profitable to augment a larger wavelet transform with more diverse high-to-low and low-to-high steps rather than to develop deep, fully learned lifting structures.
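The lifting structure that the learned steps augment can be shown in its simplest unlearned form. The sketch below is the integer Haar transform (S-transform) via lifting, with the classical predict and update steps that the thesis's learned high-to-low and low-to-high networks generalize; perfect reconstruction follows from the lifting structure itself, whatever the steps compute.

```python
import numpy as np

def haar_lift_forward(x):
    """Integer Haar transform via lifting: predict then update."""
    even, odd = x[0::2].copy(), x[1::2].copy()
    d = odd - even               # predict step: detail (high-pass) band
    s = even + d // 2            # update step: low-pass band
    return s, d

def haar_lift_inverse(s, d):
    even = s - d // 2            # undo the update step
    odd = d + even               # undo the predict step
    x = np.empty(2 * len(s), dtype=s.dtype)
    x[0::2], x[1::2] = even, odd
    return x

x = np.array([5, 9, 2, 2, 7, 1, 0, 4])
s, d = haar_lift_forward(x)
assert np.array_equal(haar_lift_inverse(s, d), x)   # perfect reconstruction
```

Because each lifting step is inverted by simply subtracting what was added, any network substituted for the predict or update operator preserves invertibility, which is why the learned transforms can retain JPEG 2000's scalability features.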
Robust density modelling using the Student's t-distribution for human action recognition
The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as a base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM that uses mixtures of t-distributions as observation probabilities and show through experiments over two well-known datasets (Weizmann, MuHAVi) a remarkable improvement in classification accuracy. © 2011 IEEE
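The robustness claim can be checked directly on synthetic data. The snippet below is an illustration, not the paper's HMM: it contaminates Gaussian-distributed features with gross outliers and compares the location estimated by a Gaussian fit (the sample mean) against the location from a maximum-likelihood Student's t fit.

```python
import numpy as np
from scipy import stats

# Inlier features around 0, plus a handful of gross outliers at 50, standing
# in for corrupted feature extractions.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 200),
                       np.full(10, 50.0)])

mean_gauss = data.mean()                  # Gaussian MLE of location
df, loc_t, scale_t = stats.t.fit(data)    # Student's t MLE (df, loc, scale)

assert mean_gauss > 1.5                   # mean dragged toward the outliers
assert abs(loc_t) < 1.0                   # t location stays near the true 0
```

The fitted degrees-of-freedom parameter ends up small, i.e. heavy tails, which is how the t-distribution absorbs the outliers without displacing its centre; the same effect carries over when t-mixtures replace Gaussians as HMM observation densities.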