1,132 research outputs found
Digital Signal Processing
Contains an introduction and reports on twenty research projects.National Science Foundation (Grant ECS 84-07285)U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742)National Science Foundation FellowshipSanders Associates, Inc.U.S. Air Force - Office of Scientific Research (Contract F19628-85-K-0028)Canada, Bell Northern Research ScholarshipCanada, Fonds pour la Formation de Chercheurs et l'Aide a la Recherche Postgraduate FellowshipCanada, Natural Science and Engineering Research Council Postgraduate FellowshipU.S. Navy - Office of Naval Research (Contract N00014-81-K-0472)Fanny and John Hertz Foundation FellowshipCenter for Advanced Television StudiesAmoco Foundation FellowshipU.S. Air Force - Office of Scientific Research (Contract F19628-85-K-0028
Differential encoding techniques applied to speech signals
The increasing use of digital communication systems has
produced a continuous search for efficient methods of speech
encoding.
This thesis describes investigations of novel differential
encoding systems. Initially Linear First Order DPCM systems
employing a simple delayed encoding algorithm are examined.
The systems detect an overload condition in the encoder, and
through a simple algorithm reduce the overload noise at the
expense of some increase in the quantization (granular) noise.
The signal-to-noise ratio (snr) performance of such d codec has
1 to 2 dB's advantage compared to the First Order Linear DPCM
system.
In order to obtain a large improvement in snr the high
correlation between successive pitch periods as well as the
correlation between successive samples in the voiced speech
waveform is exploited. A system called "Pitch Synchronous
First Order DPCM" (PSFOD) has been developed. Here the difference
Sequence formed between the samples of the input sequence in the
current pitch period and the samples of the stored decoded
sequence from the previous pitch period are encoded. This
difference sequence has a smaller dynamic range than the original
input speech sequence enabling a quantizer with better resolution
to be used for the same transmission bit rate. The snr is increased
by 6 dB compared with the peak snr of a First Order DPCM codea.
A development of the PSFOD system called a Pitch Synchronous
Differential Predictive Encoding system (PSDPE) is next investigated.
The principle of its operation is to predict the next sample in
the voiced-speech waveform, and form the prediction error which
is then subtracted from the corresponding decoded prediction
error in the previous pitch period. The difference is then
encoded and transmitted. The improvement in snr is approximately
8 dB compared to an ADPCM codea, when the PSDPE system uses an
adaptive PCM encoder. The snr of the system increases further
when the efficiency of the predictors used improve. However,
the performance of a predictor in any differential system is
closely related to the quantizer used. The better the quantization
the more information is available to the predictor and the better
the prediction of the incoming speech samples. This leads
automatically to the investigation in techniques of efficient
quantization. A novel adaptive quantization technique called
Dynamic Ratio quantizer (DRQ) is then considered and its theory
presented. The quantizer uses an adaptive non-linear element
which transforms the input samples of any amplitude to samples
within a defined amplitude range. A fixed uniform quantizer
quantizes the transformed signal. The snr for this quantizer
is almost constant over a range of input power limited in practice
by the dynamia range of the adaptive non-linear element, and it
is 2 to 3 dB's better than the snr of a One Word Memory adaptive
quantizer.
Digital computer simulation techniques have been used widely
in the above investigations and provide the necessary experimental
flexibility. Their use is described in the text
Selected topics in video coding and computer vision
Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment.;In the state-of-art video coding standard H.264/AVC, intra prediction is defined on the hierarchical quad-tree based block partitioning structure which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry adaptive algorithms.;In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principle Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both OSU Infrared Image Database and WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms.;Multi-view video alignment is a process to facilitate the fusion of non-synchronized multi-view video sequences for various applications including automatic video based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to 3-D case. We also present a novel method to obtain the ground-truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences
A Review on Block Matching Motion Estimation and Automata Theory based Approaches for Fractal Coding
Fractal compression is the lossy compression technique in the field of gray/color image and video compression. It gives high compression ratio, better image quality with fast decoding time but improvement in encoding time is a challenge. This review paper/article presents the analysis of most significant existing approaches in the field of fractal based gray/color images and video compression, different block matching motion estimation approaches for finding out the motion vectors in a frame based on inter-frame coding and intra-frame coding i.e. individual frame coding and automata theory based coding approaches to represent an image/sequence of images. Though different review papers exist related to fractal coding, this paper is different in many sense. One can develop the new shape pattern for motion estimation and modify the existing block matching motion estimation with automata coding to explore the fractal compression technique with specific focus on reducing the encoding time and achieving better image/video reconstruction quality. This paper is useful for the beginners in the domain of video compression
- …