A user's guide for the signal processing software for image and speech compression developed in the Communications and Signal Processing Laboratory (CSPL), version 1
Complete documentation of the software developed in the Communication and Signal Processing Laboratory (CSPL) during the period of July 1985 to March 1986 is provided. Utility programs and subroutines that were developed for a user-friendly image and speech processing environment are described. Additional programs for data compression of image and speech type signals are included. Also, programs for zero-memory and block transform quantization in the presence of channel noise are described. Finally, several routines for simulating the performance of image compression algorithms are included.
Time and frequency domain algorithms for speech coding
The promise of digital hardware economies (due to recent advances in
VLSI technology) has focussed much attention on more complex and sophisticated
speech coding algorithms which offer improved quality at relatively
low bit rates.
This thesis describes the results (obtained from computer simulations)
of research into various efficient (time and frequency domain) speech
encoders operating at a transmission bit rate of 16 Kbps.
In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM)
systems employing both forward and backward adaptive prediction were
examined. A number of algorithms were proposed and evaluated, including
several variants of the Stochastic Approximation Predictor (SAP). A
Backward Block Adaptive (BBA) predictor was also developed and found to
outperform the conventional stochastic methods, even though its complexity
in terms of signal processing requirements is lower. A simplified
Adaptive Predictive Coder (APC) employing a single tap pitch predictor
considered next provided a slight improvement in performance over ADPCM,
but with rather greater complexity.
The ultimate test of any speech coding system is the perceptual performance
of the received speech. Recent research has indicated that this
may be enhanced by suitable control of the noise spectrum according to
the theory of auditory masking. Various noise shaping ADPCM
configurations were examined, and it was demonstrated that a proposed
pre-/post-filtering arrangement which exploits advantageously the
predictor-quantizer interaction, leads to the best subjective
performance in both forward and backward prediction systems.
Adaptive quantization is instrumental to the performance of ADPCM systems.
Both the forward adaptive quantizer (AQF) and the backward one-word
memory adaptation (AQJ) were examined. In addition, a novel method
of decreasing quantization noise in ADPCM-AQJ coders, which involves the
application of correction to the decoded speech samples, provided
reduced output noise across the spectrum, with considerable high frequency
noise suppression.
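As an illustration of the backward one-word memory adaptation (AQJ) described above, the following is a minimal sketch, not the thesis's coder. It assumes a fixed first-order predictor with coefficient `a`, a 2-bit quantizer, and step-size multipliers of 0.8 and 1.6; all of these values and all function names are hypothetical choices made for the example. The step size is updated from the previous code word only, so the decoder can track it without side information.

```python
# Assumed step-size multipliers: shrink after an inner level, grow after an outer level.
MULTIPLIERS = (0.8, 1.6)


def _dequantize(code, step):
    """Map a 2-bit code (bit 1 = sign, bit 0 = level) to a quantized prediction error."""
    sign = -1.0 if code & 2 else 1.0
    return sign * ((code & 1) + 0.5) * step


def _next_step(code, step, lo=1e-4, hi=1.0):
    """One-word memory adaptation: the new step depends only on the last code word."""
    return min(hi, max(lo, step * MULTIPLIERS[code & 1]))


def encode(samples, a=0.9, step=0.05):
    """Return (codes, local reconstruction) for a list of float samples."""
    codes, recon, prev = [], [], 0.0
    for x in samples:
        err = x - a * prev                    # prediction error
        level = 1 if abs(err) >= step else 0  # inner or outer quantizer level
        code = (2 if err < 0 else 0) | level
        prev = a * prev + _dequantize(code, step)
        step = _next_step(code, step)
        codes.append(code)
        recon.append(prev)
    return codes, recon


def decode(codes, a=0.9, step=0.05):
    """Rebuild the signal from the code words alone, mirroring the encoder state."""
    out, prev = [], 0.0
    for code in codes:
        prev = a * prev + _dequantize(code, step)
        step = _next_step(code, step)
        out.append(prev)
    return out
```

With 2 bits per sample, an assumed 8 kHz sampling rate would give 16 kbit/s. Because the encoder predicts from its own reconstruction, the decoder output equals the encoder's local reconstruction when no channel errors occur.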
More powerful (and inevitably more complex) frequency domain speech
coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder
(SBC) offer good quality speech at 16 Kbps. To reduce complexity and
coding delay, whilst retaining the advantage of sub-band coding, a novel
transform based split-band coder (TSBC) was developed and found to compare
closely in performance with the SBC.
To prevent the heavy side information requirement associated with a
large number of bands in split-band coding schemes from impairing coding
accuracy, without forgoing the efficiency provided by adaptive bit
allocation, a method employing AQJs to code the sub-band signals together
with vector quantization of the bit allocation patterns was also
proposed.
Finally, 'pipeline' methods of bit allocation and step size estimation
(using the Fast Fourier Transform (FFT) on the input signal) were examined.
Such methods, although less accurate, are nevertheless useful in
limiting coding delay associated with SBC schemes employing Quadrature
Mirror Filters (QMF).
Picture coding in viewdata systems
Viewdata systems in commercial use at present offer the facility
for transmitting alphanumeric text and graphic displays via the public
switched telephone network. An enhancement to the system would be to
transmit true video images instead of graphics. Such a system, under
development in Britain at present, uses Differential Pulse Code Modulation
(DPCM) and a transmission rate of 1200 bits/sec. Error protection
is achieved by the use of error protection codes, which increases
the channel requirement.
In this thesis, error detection and correction of DPCM coded
video signals without the use of channel error protection is studied.
The scheme operates entirely at the receiver by examining the local
statistics of the received data to determine the presence of errors.
Error correction is then undertaken by interpolation from adjacent
correct or previously corrected data.
DPCM coding of pictures has the inherent disadvantage of a slow
build-up of the displayed picture at the receiver and difficulties with
image size manipulation. In order to fit the pictorial information
into a viewdata page, its size has to be reduced. Unitary transforms,
typically the discrete Fourier transform (DFT), the discrete cosine
transform (DCT) and the Hadamard transform (HT) enable lowpass filtering and decimation to be carried out in a single operation in the transform
domain. Size reductions of different orders are considered and the merits
of the DFT, DCT and HT are investigated.
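The idea of combining lowpass filtering and decimation in the transform domain can be sketched in one dimension as follows. This is an illustrative example under stated assumptions, not the thesis's procedure: it uses an orthonormal DCT-II, keeps the first `m` coefficients of an `n`-point transform, rescales them by sqrt(m/n) so that the mean level is preserved, and applies an `m`-point inverse. The function names are hypothetical, and a picture would need the same step applied along rows and then columns.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of floats (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def shrink(x, m):
    """Reduce len(x) samples to m samples by truncating the DCT spectrum."""
    n = len(x)
    gain = math.sqrt(m / n)          # keeps the DC (mean) level unchanged
    kept = dct(x)[:m]                # discarding high orders acts as the lowpass filter
    return idct([gain * c for c in kept])
```

For example, `shrink([3.0] * 8, 4)` returns four samples that are all 3.0 up to rounding, since a constant signal has only a DC coefficient.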
With limited channel capacity, it is desirable to remove the
redundancy present in the source picture in order to reduce the bit
rate. Orthogonal transformation decorrelates the spatial sample
distribution and packs most of the image energy in the low order
coefficients. This property is exploited in bit-reduction schemes
which are adaptive to the local statistics of the different source
pictures used. In some cases, bit rates of less than 1.0 bit/pel
are achieved with satisfactory received picture quality.
Unlike DPCM systems, transform coding has the advantage of being
able to display rapidly a picture of low resolution by initial inverse
transformation of the low order coefficients only. Picture resolution
is then progressively built up as more coefficients are received and
decoded. Different sequences of picture update are investigated to
find that which achieves the best subjective quality with the fewest
possible coefficients transmitted.
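The progressive build-up described above can be illustrated with a small sketch. It assumes an orthonormal Hadamard transform whose rows are sorted by sequency (number of sign changes) and a block length that is a power of two; sequency order is only one possible update sequence and is not claimed to be the one the thesis selects. All names are hypothetical.

```python
import math


def walsh_matrix(n):
    """Orthonormal Hadamard matrix of order n (a power of two), rows in sequency order."""
    h = [[1]]
    while len(h) < n:  # Sylvester construction doubles the order each pass
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]

    def sign_changes(row):
        return sum(1 for a, b in zip(row, row[1:]) if a != b)

    h.sort(key=sign_changes)
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def forward(x, w):
    """Transform coefficients of block x."""
    return [sum(r * v for r, v in zip(row, x)) for row in w]


def partial_inverse(coeffs, w, k):
    """Reconstruct the block from the first k coefficients only."""
    n = len(coeffs)
    return [sum(coeffs[j] * w[j][i] for j in range(k)) for i in range(n)]


def progressive_errors(x):
    """Squared reconstruction error after receiving 1, 2, ..., n coefficients."""
    w = walsh_matrix(len(x))
    c = forward(x, w)
    errors = []
    for k in range(1, len(x) + 1):
        approx = partial_inverse(c, w, k)
        errors.append(sum((a - b) ** 2 for a, b in zip(x, approx)))
    return errors
```

Because the transform is orthonormal, the error after `k` coefficients equals the energy of the coefficients not yet received, so it never increases as more arrive. With a single coefficient the receiver displays the block mean.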
Study of communications data compression methods
A simple monochrome conditional replenishment system was extended to higher compression and to higher motion levels by incorporating spatially adaptive quantizers and field repeating. Conditional replenishment combines intraframe and interframe compression, and both areas are investigated. The gain of conditional replenishment depends on the fraction of the image changing, since only changed parts of the image need to be transmitted. If the transmission rate is set so that only one fourth of the image can be transmitted in each field, greater change fractions will overload the system. A computer simulation was prepared which incorporated (1) field repeat of changes, (2) a variable change threshold, (3) frame repeat for high change, and (4) two-mode, variable-rate Hadamard intraframe quantizers. The field repeat gives 2:1 compression in moving areas without noticeable degradation. Variable change threshold allows some flexibility in dealing with varying change rates, but the threshold variation must be limited for acceptable performance.
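The basic mechanism of conditional replenishment with a variable change threshold can be sketched as below. This is a simplified illustration, not the simulated system: fields are flat lists of pixel values, the one-fourth capacity comes from the abstract, and the starting threshold, the limit `max_threshold`, the step of 1, and all function names are assumptions made for the example.

```python
def changed_pixels(reference, field, threshold):
    """List (index, value) for pixels that differ from the reference by more than threshold."""
    return [
        (i, v)
        for i, (r, v) in enumerate(zip(reference, field))
        if abs(v - r) > threshold
    ]


def encode_field(reference, field, threshold=4, capacity=0.25, max_threshold=16):
    """Pick updates to send, raising the threshold (up to a limit) when overloaded."""
    while True:
        updates = changed_pixels(reference, field, threshold)
        fits = len(updates) <= capacity * len(field)
        if fits or threshold >= max_threshold:
            return updates, threshold
        threshold += 1  # hypothetical adaptation step


def apply_updates(reference, updates):
    """Receiver side: replenish only the transmitted pixels."""
    out = list(reference)
    for i, v in updates:
        out[i] = v
    return out
```

For a reference of eight pixels at level 10 and a field `[10, 10, 30, 10, 10, 10, 10, 12]`, only the pixel at index 2 exceeds the threshold of 4, so one of eight pixels is sent. If the limit is reached while the field still does not fit, this sketch returns the oversized update list; the abstract's frame repeat for high change would be the natural fallback at that point.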
Orthogonal transforms and their application to image coding
Imperial Users onl
Digital television system design study
The use of digital techniques for transmission of pictorial data is discussed for multi-frame images (television). Video signals are processed in a manner which includes quantization and coding such that they are separable from the noise introduced into the channel. The performance of digital television systems is determined by the nature of the processing techniques (i.e., whether the video signal itself or, instead, something related to the video signal is quantized and coded) and by the quantization and coding schemes employed.