743 research outputs found
Low bit rate digital speech signal processing systems
Imperial Users only
Scalable and perceptual audio compression
This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the psychoacoustic parameters of the original signal are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is novel to this thesis and gives an insight into the perceptual distortion introduced by any coder analyzed in this manner.
The low bit-rate coding of speech signals
Imperial Users only
Picture coding in viewdata systems
Viewdata systems in commercial use at present offer the facility
for transmitting alphanumeric text and graphic displays via the public
switched telephone network. An enhancement to the system would be to
transmit true video images instead of graphics. Such a system, under
development in Britain at present uses Differential Pulse Code Modulation
(DPCM) and a transmission rate of 1200 bits/sec. Error protection
is achieved by the use of error protection codes, which increases
the channel requirement.
In this thesis, error detection and correction of DPCM coded
video signals without the use of channel error protection is studied.
The scheme operates entirely at the receiver by examining the local
statistics of the received data to determine the presence of errors.
Error correction is then undertaken by interpolation from adjacent
correct or previously corrected data.
DPCM coding of pictures has the inherent disadvantage of a slow
build-up of the displayed picture at the receiver and difficulties with
image size manipulation. In order to fit the pictorial information
into a viewdata page, its size has to be reduced. Unitary transforms,
typically the discrete Fourier transform (DFT), the discrete cosine
transform (DCT) and the Hadamard transform (HT) enable lowpass filtering and decimation to be carried out in a single operation in the transform
domain. Size reductions of different orders are considered and the merits
of the DFT, DCT and HT are investigated.
With limited channel capacity, it is desirable to remove the
redundancy present in the source picture in order to reduce the bit
rate. Orthogonal transformation decorrelates the spatial sample
distribution and packs most of the image energy in the low order
coefficients. This property is exploited in bit-reduction schemes
which are adaptive to the local statistics of the different source
pictures used. In some cases, bit rates of less than 1.0 bit/pel
are achieved with satisfactory received picture quality.
Unlike DPCM systems, transform coding has the advantage of being
able to display rapidly a picture of low resolution by initial inverse
transformation of the low order coefficients only. Picture resolution
is then progressively built up as more coefficients are received and
decoded. Different sequences of picture update are investigated to
find that which achieves the best subjective quality with the fewest
possible coefficients transmitted
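
The progressive build-up described in this abstract can be illustrated with a small sketch. It is not the thesis's scheme: it assumes a 1-D orthonormal DCT on a single hypothetical row of pixel values (`row`), and the function names `dct`, `idct` and `progressive_errors` are invented for illustration. It reconstructs the row from the first k low-order coefficients and records the squared error for each k.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def progressive_errors(row):
    """Squared reconstruction error when only the first k coefficients are used."""
    coeffs = dct(row)
    errors = []
    for k in range(1, len(row) + 1):
        # Coefficients not yet "received" are treated as zero.
        partial = coeffs[:k] + [0.0] * (len(row) - k)
        approx = idct(partial)
        errors.append(sum((a - b) ** 2 for a, b in zip(row, approx)))
    return errors

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel values
errors = progressive_errors(row)
```

Because the transform is orthonormal, the error can only shrink or stay level as coefficients arrive, and it vanishes once all of them are decoded. The order in which coefficients are sent is fixed here (lowest first), whereas the abstract reports comparing several update sequences.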
Time and frequency domain algorithms for speech coding
The promise of digital hardware economies (due to recent advances in
VLSI technology) has focussed much attention on more complex and sophisticated
speech coding algorithms which offer improved quality at relatively
low bit rates.
This thesis describes the results (obtained from computer simulations)
of research into various efficient (time and frequency domain) speech
encoders operating at a transmission bit rate of 16 Kbps.
In the time domain, Adaptive Differential Pulse Code Modulation (ADPCM)
systems employing both forward and backward adaptive prediction were
examined. A number of algorithms were proposed and evaluated, including
several variants of the Stochastic Approximation Predictor (SAP). A
Backward Block Adaptive (BBA) predictor was also developed and found to
outperform the conventional stochastic methods, even though its complexity
in terms of signal processing requirements is lower. A simplified
Adaptive Predictive Coder (APC) employing a single tap pitch predictor
considered next provided a slight improvement in performance over ADPCM,
but with rather greater complexity.
The ultimate test of any speech coding system is the perceptual performance
of the received speech. Recent research has indicated that this
may be enhanced by suitable control of the noise spectrum according to
the theory of auditory masking. Various noise shaping ADPCM
configurations were examined, and it was demonstrated that a proposed
pre-/post-filtering arrangement which exploits advantageously the
predictor-quantizer interaction, leads to the best subjective
performance in both forward and backward prediction systems.
Adaptive quantization is instrumental to the performance of ADPCM systems.
Both the forward adaptive quantizer (AQF) and the backward one-word
memory adaptive quantizer (AQJ) were examined. In addition, a novel method
of decreasing quantization noise in ADPCM-AQJ coders, which involves the
application of correction to the decoded speech samples, provided
reduced output noise across the spectrum, with considerable high frequency
noise suppression.
More powerful (and inevitably more complex) frequency domain speech
coders such as the Adaptive Transform Coder (ATC) and the Sub-band Coder
(SBC) offer good quality speech at 16 Kbps. To reduce complexity and
coding delay, whilst retaining the advantage of sub-band coding, a novel
transform based split-band coder (TSBC) was developed and found to compare
closely in performance with the SBC.
To prevent the heavy side information requirement associated with a
large number of bands in split-band coding schemes from impairing coding
accuracy, without forgoing the efficiency provided by adaptive bit
allocation, a method employing AQJs to code the sub-band signals together
with vector quantization of the bit allocation patterns was also
proposed.
Finally, 'pipeline' methods of bit allocation and step size estimation
(using the Fast Fourier Transform (FFT) on the input signal) were examined.
Such methods, although less accurate, are nevertheless useful in
limiting coding delay associated with SBC schemes employing Quadrature
Mirror Filters (QMF)
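
As a rough illustration of backward adaptive quantization with one-word memory in an ADPCM loop, the sketch below pairs a fixed first-order predictor with a 2-bit quantizer whose step size is scaled after every sample according to the previous code word only. Everything specific is an assumption made for the example: the multiplier values, the predictor coefficient, the step limits and all function names. None of it is taken from the thesis, whose predictors are adaptive.

```python
import math

MULTIPLIERS = (0.8, 1.6)        # assumed: shrink after inner level, grow after outer
STEP_MIN, STEP_MAX = 0.01, 10.0  # assumed step-size limits
PRED_COEFF = 0.9                 # assumed fixed first-order predictor

def _quantize(error, step):
    """2-bit code: sign bit plus magnitude bit (inner or outer level)."""
    sign = 1 if error >= 0 else 0
    mag = 1 if abs(error) > step else 0
    return sign, mag

def _dequantize(sign, mag, step):
    level = (mag + 0.5) * step
    return level if sign else -level

def _adapt(step, mag):
    """One-word memory: the next step depends only on the last magnitude code."""
    return min(STEP_MAX, max(STEP_MIN, step * MULTIPLIERS[mag]))

def encode(samples, step=0.1):
    codes, pred = [], 0.0
    for x in samples:
        sign, mag = _quantize(x - pred, step)
        codes.append((sign, mag))
        recon = pred + _dequantize(sign, mag, step)  # local decoder
        pred = PRED_COEFF * recon
        step = _adapt(step, mag)
    return codes

def decode(codes, step=0.1):
    out, pred = [], 0.0
    for sign, mag in codes:
        recon = pred + _dequantize(sign, mag, step)
        out.append(recon)
        pred = PRED_COEFF * recon
        step = _adapt(step, mag)
    return out

signal = [math.sin(2 * math.pi * n / 50) for n in range(200)]
decoded = decode(encode(signal))
mse = sum((a - b) ** 2 for a, b in zip(signal, decoded)) / len(signal)
```

The step size is derived from the transmitted codes alone, so the decoder tracks it without side information. This is the property that makes backward adaptation attractive when side information is costly, as in the split-band schemes mentioned above.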
Data compression techniques applied to high resolution high frame rate video technology
An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and an assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging systems. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended.
Speech coding
Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. From a transmission point of view, digital transmission has therefore been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized using the received set of codes. A term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the coding techniques are equally applicable to any voice signal whether or not it carries any intelligible information, as the term speech implies. Other terms that are commonly used are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or equivalently the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice shall be used interchangeably.
Scalable Speech Coding for IP Networks
The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue of transporting real-time voice packet over IP networks is the lack of guarantee for reasonable speech quality due to packet delay or loss.
Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes long-term prediction across frame boundaries and therefore causes error propagation in the case of packet loss, which creates a need to transmit redundant information in order to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks some of the key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support.
This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and the scalable algebraic vector quantization (AVQ), and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec.
The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition
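
The core-plus-enhancement layering idea can be shown in a much reduced form. The sketch below is only an analogy, not the dissertation's codec: it uses uniform scalar quantizers in place of the iLBC core and the AVQ, and omits the perceptual weighting and the MDCT. The step sizes, sample values and function names are all hypothetical.

```python
def quantize(value, step):
    """Uniform scalar quantizer returning an integer index."""
    return round(value / step)

def encode_layers(samples, core_step=0.5, enh_step=0.1):
    """Core layer codes the signal; enhancement layer codes the core's error."""
    core = [quantize(x, core_step) for x in samples]
    residual = [x - q * core_step for x, q in zip(samples, core)]
    enh = [quantize(r, enh_step) for r in residual]
    return core, enh

def decode_layers(core, enh=None, core_step=0.5, enh_step=0.1):
    """Decode the core alone, or refine it when the enhancement layer arrived."""
    out = [q * core_step for q in core]
    if enh is not None:
        out = [y + q * enh_step for y, q in zip(out, enh)]
    return out

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

samples = [0.12, -0.47, 0.83, 0.30, -0.91, 0.05]  # hypothetical frame
core, enh = encode_layers(samples)
core_only = decode_layers(core)
both = decode_layers(core, enh)
```

A receiver that gets only `core` still produces usable output, and one that also gets `enh` reduces the error. That graceful behaviour is what bit-rate scalability refers to.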
Delta modulation techniques for low bit-rate digital speech encoding
Includes bibliography. Two new hybrid companding delta modulators for speech encoding are presented here. These modulators differ from the Hybrid Companding Delta Modulator (HCDM) proposed by Un et al. in that the two new encoders employ Song Voice Adaptation, rather than Constant Factor adaptation, as the basis of the instantaneous compandor. A detailed analysis of the performance, both objective and subjective, of these hybrid codecs has been carried out. Results show that overall the two codecs developed as part of this project are better than the HCDM codec. In addition, the new codecs offer simpler implementation in digital hardware than the HCDM. A Computer Aided Test (CAT) system has been developed to simplify the design and test processes for speech codecs.
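
For orientation, here is a minimal sketch of instantaneous step-size adaptation by a constant factor, the rule this abstract contrasts with Song Voice Adaptation. It is not either of the proposed codecs, and Song adaptation is not shown. The factor `p`, the step limits and the function names are assumptions made for the example.

```python
import math

def _next_step(step, bit, prev_bit, p, step_min, step_max):
    """Grow the step when two successive bits agree, shrink it when they differ."""
    if prev_bit is None:
        return step
    factor = p if bit == prev_bit else 1.0 / p
    return min(step_max, max(step_min, step * factor))

def dm_encode(samples, step=0.05, p=1.5, step_min=0.005, step_max=1.0):
    bits, estimate, prev_bit = [], 0.0, None
    for x in samples:
        bit = 1 if x >= estimate else 0
        step = _next_step(step, bit, prev_bit, p, step_min, step_max)
        estimate += step if bit else -step   # staircase approximation
        bits.append(bit)
        prev_bit = bit
    return bits

def dm_decode(bits, step=0.05, p=1.5, step_min=0.005, step_max=1.0):
    out, estimate, prev_bit = [], 0.0, None
    for bit in bits:
        step = _next_step(step, bit, prev_bit, p, step_min, step_max)
        estimate += step if bit else -step
        out.append(estimate)
        prev_bit = bit
    return out

signal = [math.sin(2 * math.pi * n / 100) for n in range(400)]
bits = dm_encode(signal)
staircase = dm_decode(bits)
mse = sum((a - b) ** 2 for a, b in zip(signal, staircase)) / len(signal)
```

The decoder repeats the encoder's step-size rule from the bit stream alone, so one bit per sample is all that is transmitted.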