254 research outputs found
Self-similarity and wavelet forms for the compression of still image and video data
This thesis is concerned with the methods used to reduce the data volume required to represent
still images and video sequences. The number of disparate still image and video
coding methods increases almost daily. Recently, two new strategies have emerged and
have stimulated widespread research. These are the fractal method and the wavelet transform.
In this thesis, it will be argued that the two methods share a common principle: that
of self-similarity. The two will be related concretely via an image coding algorithm which
combines the two, normally disparate, strategies.
The wavelet transform is an orientation selective transform. It will be shown that the
selectivity of the conventional transform is not sufficient to allow exploitation of self-similarity
while keeping computational cost low. To address this, a new wavelet transform
is presented which allows for greater orientation selectivity, while maintaining the
orthogonality and data volume of the conventional wavelet transform. Many designs for
vector quantizers have been published recently and another is added to the gamut by this
work. The tree structured vector quantizer presented here is on-line and self structuring,
requiring no distinct training phase. Combining these into a still image data compression
system produces results which are among the best that have been published to date.
An extension of the two dimensional wavelet transform to encompass the time dimension
is straightforward and this work attempts to extrapolate some of its properties into three
dimensions. The vector quantizer is then applied to three dimensional image data to
produce a video coding system which, while not optimal, produces very encouraging
results
Conjoint probabilistic subband modeling
Thesis (Ph. D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1997.Includes bibliographical references (leaves 125-133).by Ashok Chhabedia Popat.Ph.D
Sparse representation based hyperspectral image compression and classification
Abstract
This thesis presents a research work on applying sparse representation to lossy hyperspectral image
compression and hyperspectral image classification. The proposed lossy hyperspectral image
compression framework introduces two types of dictionaries distinguished by the terms sparse
representation spectral dictionary (SRSD) and multi-scale spectral dictionary (MSSD), respectively.
The former is learnt in the spectral domain to exploit the spectral correlations, and the
latter in wavelet multi-scale spectral domain to exploit both spatial and spectral correlations in
hyperspectral images. To alleviate the computational demand of dictionary learning, either a
base dictionary trained offline or an update of the base dictionary is employed in the compression
framework. The proposed compression method is evaluated in terms of different objective
metrics, and compared to selected state-of-the-art hyperspectral image compression schemes, including
JPEG 2000. The numerical results demonstrate the effectiveness and competitiveness of
both SRSD and MSSD approaches.
For the proposed hyperspectral image classification method, we utilize the sparse coefficients
for training support vector machine (SVM) and k-nearest neighbour (kNN) classifiers. In particular,
the discriminative character of the sparse coefficients is enhanced by incorporating contextual
information using local mean filters. The classification performance is evaluated and compared
to a number of similar or representative methods. The results show that our approach could outperform
other approaches based on SVM or sparse representation.
This thesis makes the following contributions. It provides a relatively thorough investigation
of applying sparse representation to lossy hyperspectral image compression. Specifically,
it reveals the effectiveness of sparse representation for the exploitation of spectral correlations
in hyperspectral images. In addition, we have shown that the discriminative character of sparse
coefficients can lead to superior performance in hyperspectral image classification.EM201
Frugal hypothesis testing and classification
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.Cataloged from PDF version of thesis.Includes bibliographical references (p. 157-175).The design and analysis of decision rules using detection theory and statistical learning theory is important because decision making under uncertainty is pervasive. Three perspectives on limiting the complexity of decision rules are considered in this thesis: geometric regularization, dimensionality reduction, and quantization or clustering. Controlling complexity often reduces resource usage in decision making and improves generalization when learning decision rules from noisy samples. A new margin-based classifier with decision boundary surface area regularization and optimization via variational level set methods is developed. This novel classifier is termed the geometric level set (GLS) classifier. A method for joint dimensionality reduction and margin-based classification with optimization on the Stiefel manifold is developed. This dimensionality reduction approach is extended for information fusion in sensor networks. A new distortion is proposed for the quantization or clustering of prior probabilities appearing in the thresholds of likelihood ratio tests. This distortion is given the name mean Bayes risk error (MBRE). The quantization framework is extended to model human decision making and discrimination in segregated populations.by Kush R. Varshney.Ph.D
Time and frequency domain algorithms for speech coding
The promise of digital hardware economies (due to recent advances in
VLSI technology), has focussed much attention on more complex and sophisticated
speech coding algorithms which offer improved quality at relatively
low bit rates.
This thesis describes the results (obtained from computer simulations)
of research into various efficient (time and frequency domain) speech
encoders operating at a transmission bit rate of 16 Kbps.
In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM)
systems employing both forward and backward adaptive prediction were
examined. A number of algorithms were proposed and evaluated, including
several variants of the Stochastic Approximation Predictor (SAP). A
Backward Block Adaptive (BBA) predictor was also developed and found to
outperform the conventional stochastic methods, even though its complexity
in terms of signal processing requirements is lower. A simplified
Adaptive Predictive Coder (APC) employing a single tap pitch predictor
considered next provided a slight improvement in performance over ADPCM,
but with rather greater complexity.
The ultimate test of any speech coding system is the perceptual performance
of the received speech. Recent research has indicated that this
may be enhanced by suitable control of the noise spectrum according to
the theory of auditory masking. Various noise shaping ADPCM
configurations were examined, and it was demonstrated that a proposed
pre-/post-filtering arrangement which exploits advantageously the
predictor-quantizer interaction, leads to the best subjective
performance in both forward and backward prediction systems.
Adaptive quantization is instrumental to the performance of ADPCM systems.
Both the forward adaptive quantizer (AQF) and the backward oneword
memory adaptation (AQJ) were examined. In addition, a novel method
of decreasing quantization noise in ADPCM-AQJ coders, which involves the
application of correction to the decoded speech samples, provided
reduced output noise across the spectrum, with considerable high frequency
noise suppression.
More powerful (and inevitably more complex) frequency domain speech
coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder
(SBC) offer good quality speech at 16 Kbps. To reduce complexity and
coding delay, whilst retaining the advantage of sub-band coding, a novel
transform based split-band coder (TSBC) was developed and found to compare
closely in performance with the SBC.
To prevent the heavy side information requirement associated with a
large number of bands in split-band coding schemes from impairing coding
accuracy, without forgoing the efficiency provided by adaptive bit
allocation, a method employing AQJs to code the sub-band signals together
with vector quantization of the bit allocation patterns was also
proposed.
Finally, 'pipeline' methods of bit allocation and step size estimation
(using the Fast Fourier Transform (FFT) on the input signal) were examined.
Such methods, although less accurate, are nevertheless useful in
limiting coding delay associated with SRC schemes employing Quadrature
Mirror Filters (QMF)
- …