Speech Compression Using Discrete Wavelet Transform

Author: M.Ali Najih Abdulmawla
Publication date: 01/08/2003

Speech compression is an area of digital processing that focuses on reducing the bit rate of the speech signal for transmission or storage without significant loss of quality. The wavelet transform has recently been proposed for signal analysis, and speech signal compression using the wavelet transform is given considerable attention in this thesis. Speech coding is a lossy scheme and is implemented here to compress a one-dimensional speech signal. The scheme consists of four operations: the transform, thresholding (by-level and global thresholds), quantization, and entropy encoding. The reconstruction of the compressed signal and the detailed steps needed are discussed. The performance of wavelet compression is compared against Linear Predictive Coding (LPC) and Global System for Mobile Communication (GSM) algorithms using SNR, PSNR, NRMSE and compression ratio. Software simulating the lossy compression scheme is developed using Matlab 6. This software provides basic speech analysis as well as the compression and decompression operations. The results obtained show a reasonably high compression ratio and good signal quality.

Universiti Putra Malaysia Institutional Repository
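
To make the transform and thresholding steps named in the abstract above more concrete, here is a minimal sketch in pure Python. It is an illustration only and not the thesis's Matlab software: the choice of a one-level Haar wavelet, the threshold value 0.1, the toy sine input and all function names are assumptions, and the quantization and entropy-encoding stages are left out.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def global_threshold(coeffs, thr):
    """Hard thresholding: zero every coefficient with magnitude below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

def snr_db(original, reconstructed):
    """Signal-to-noise ratio in dB between a signal and its reconstruction."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    if noise_power == 0.0:
        return float("inf")
    return 10.0 * math.log10(signal_power / noise_power)

# Toy input (assumption): 64 samples of a sine wave, not real speech.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
approx, detail = haar_dwt(signal)
kept_detail = global_threshold(detail, 0.1)  # threshold value is illustrative
reconstructed = haar_idwt(approx, kept_detail)
zeroed = sum(1 for c in kept_detail if c == 0.0)
print("zeroed detail coefficients:", zeroed, "of", len(kept_detail))
print("SNR (dB):", snr_db(signal, reconstructed))
```

Zeroed coefficients are what a later entropy coder could represent cheaply, so the threshold trades compression ratio against SNR. A by-level threshold would apply a different value at each decomposition level instead of one global value.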

Scalable and perceptual audio compression

Author: Raad Mohammed
Publication venue: School of Electrical, Computer and Telecommunications Engineering
Publication date: 01/01/2003

This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal, whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale in both a waveform-matching manner and a psychoacoustic manner. To measure the psychoacoustic scalability of the systems investigated in this thesis, the original signal's psychoacoustic parameters are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is a novel method used in this thesis, and it allows an insight into the perceptual distortion introduced by any coder analyzed in this manner.

Research Online

On Predictive Coding for Erasure Channels Using a Kalman Framework

Author: Andersen Søren Vang
Arildsen Thomas
Jensen Søren Holdt
Murthi Manohar
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2009

We present a new design method for robust low-delay coding of autoregressive (AR) sources for transmission across erasure channels. It is a fundamental rethinking of existing concepts. It considers the encoder a mechanism that produces signal measurements from which the decoder estimates the original signal. The method is based on linear predictive coding and Kalman estimation at the decoder. We employ a novel encoder state-space representation with a linear quantization noise model. The encoder is represented by the Kalman measurement at the decoder. The presented method designs the encoder and decoder offline through an iterative algorithm based on closed-form minimization of the trace of the decoder state error covariance. The design method is shown to provide considerable performance gains, when the transmitted quantized prediction errors are subject to loss, in terms of signal-to-noise ratio (SNR) compared to the same coding framework optimized for no loss. The design method applies to stationary auto-regressive sources of any order. We demonstrate the method in a framework based on a generalized differential pulse code modulation (DPCM) encoder. The presented principles can be applied to more complicated coding systems that incorporate predictive coding as well

Lund University Publications

VBN

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY
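
For readers unfamiliar with the DPCM encoder mentioned in the preceding abstract, the following is a minimal sketch of a conventional closed-loop DPCM codec with a first-order predictor and a uniform quantizer. It does not implement the paper's Kalman-based design or any erasure handling; the predictor coefficient `a`, the quantizer step `step`, the synthetic AR(1) test source and all function names are assumptions chosen for illustration.

```python
import random

def dpcm_encode(samples, a=0.9, step=0.1):
    """Closed-loop DPCM: quantize the error between each sample and a
    prediction formed from the previously *reconstructed* sample."""
    indices = []
    prediction = 0.0
    for x in samples:
        error = x - prediction
        q = round(error / step)               # uniform quantizer index
        indices.append(q)
        reconstructed = prediction + q * step  # what the decoder will see
        prediction = a * reconstructed         # first-order linear prediction
    return indices

def dpcm_decode(indices, a=0.9, step=0.1):
    """Mirror of the encoder loop: rebuild samples from quantizer indices."""
    out = []
    prediction = 0.0
    for q in indices:
        reconstructed = prediction + q * step
        out.append(reconstructed)
        prediction = a * reconstructed
    return out

# Synthetic first-order autoregressive source (assumption, fixed seed).
rng = random.Random(0)
source = []
state = 0.0
for _ in range(200):
    state = 0.9 * state + rng.gauss(0.0, 0.3)
    source.append(state)

codes = dpcm_encode(source)
decoded = dpcm_decode(codes)
max_err = max(abs(x - y) for x, y in zip(source, decoded))
print("largest reconstruction error:", max_err)
```

Because the encoder predicts from its own reconstruction, the decoder stays in step with it and the per-sample error is bounded by half the quantizer step when no indices are lost. A lost index breaks that synchrony, which is the situation the paper's design method addresses.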

Proceedings of the Scientific Data Compression Workshop

Author: Ramapriyan H. K.

Continuing advances in space and Earth science require increasing amounts of data to be gathered from spaceborne sensors. NASA expects to launch sensors during the next two decades which will be capable of producing an aggregate of 1500 Megabits per second if operated simultaneously. Such high data rates cause stresses in all aspects of end-to-end data systems. Technologies and techniques are needed to relieve such stresses. Potential solutions to the massive data rate problems are: data editing, greater transmission bandwidths, higher density and faster media, and data compression. Through four subpanels on Science Payload Operations, Multispectral Imaging, Microwave Remote Sensing and Science Data Management, recommendations were made for research in data compression and scientific data applications to space platforms.

NASA Technical Reports Server

The development of speech coding and the first standard coder for public mobile telephony

Author: Sluijter R.J.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2005

This thesis describes in its core chapter (Chapter 4) the original algorithmic and design features of the first coder for public mobile telephony, the GSM full-rate speech coder, as standardized in 1988. It has never been described in so much detail as presented here. The coder is put in a historical perspective by two preceding chapters on the history of speech production models and the development of speech coding techniques until the mid 1980s, respectively. In the epilogue a brief review is given of later developments in speech coding. The introductory Chapter 1 starts with some preliminaries. It is defined what speech coding is and the reader is introduced to speech coding standards and the standardization institutes which set them. Then, the attributes of a speech coder playing a role in standardization are explained. Subsequently, several applications of speech coders, including mobile telephony, will be discussed and the state of the art in speech coding will be illustrated on the basis of some worldwide recognized standards. Chapter 2 starts with a summary of the features of speech signals and their source, the human speech organ. Then, historical models of speech production which form the basis of different kinds of modern speech coders are discussed. Starting with a review of ancient mechanical models, we will arrive at the electrical source-filter model of the 1930s. Subsequently, the acoustic-tube models as they arose in the 1950s and 1960s are discussed. Finally the 1970s are reviewed, which brought the discrete-time filter model on the basis of linear prediction. In a unique way the logical sequencing of these models is exposed, and the links are discussed. Whereas the historical models are discussed in a narrative style, the acoustic-tube models and the linear prediction technique as applied to speech are subject to more mathematical analysis in order to create a sound basis for the treatise of Chapter 4.
This trend continues in Chapter 3, whenever instrumental in completing that basis. In Chapter 3 the reader is taken by the hand on a guided tour through time during which successive speech coding methods pass in review. In an original way special attention is paid to the evolutionary aspect. Specifically, for each newly proposed method it is discussed what it added to the known techniques of the time. After presenting the relevant predecessors starting with Pulse Code Modulation (PCM) and the early vocoders of the 1930s, we will arrive at Residual-Excited Linear Predictive (RELP) coders, Analysis-by-Synthesis systems and Regular-Pulse Excitation in 1984. The latter forms the basis of the GSM full-rate coder. In Chapter 4, which constitutes the core of this thesis, explicit forms of Multi-Pulse Excited (MPE) and Regular-Pulse Excited (RPE) analysis-by-synthesis coding systems are developed. Starting from current pulse-amplitude computation methods in 1984, which included solving sets of equations (typically of order 10-16) two hundred times a second, several explicit-form designs are considered by which solving sets of equations in real time is avoided. Then, the design of a specific explicit-form RPE coder and an associated efficient architecture are described. The explicit forms and the resulting architectural features have never been published in so much detail as presented here. Implementation of such a codec enabled real-time operation on a state-of-the-art single-chip digital signal processor of the time. This coder, at a bit rate of 13 kbit/s, was selected as the Full-Rate GSM standard in 1988. Its performance is recapitulated. Chapter 5 is an epilogue briefly reviewing the major developments in speech coding technology after 1988. Many speech coding standards have been set, for mobile telephony as well as for other applications, since then. The chapter is concluded by an outlook.

Repository TU/e

Pure OAI Repository
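
The abstract above refers to the discrete-time filter model based on linear prediction. As a rough illustration of that idea, the sketch below estimates linear predictor coefficients from autocorrelation values using the Levinson-Durbin recursion, a standard textbook method. It is not claimed to be the procedure described in the thesis or used in the GSM full-rate coder; the function names and the toy autocorrelation sequence are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence (no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations of linear prediction.

    Returns (a, err) where the prediction of x[n] is
    a[0]*x[n-1] + a[1]*x[n-2] + ... and err is the residual energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Correlation not yet explained by the current predictor.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= (1.0 - k * k)
    return a, err

# Toy autocorrelation of a first-order process with coefficient 0.5 (assumption).
r = [0.5 ** k for k in range(3)]
coeffs, residual = levinson_durbin(r, 2)
print("predictor coefficients:", coeffs)
print("residual energy:", residual)
```

For this toy input the recursion recovers a single non-zero coefficient, since a first-order process needs no second predictor tap. With real speech, `autocorrelation` would be applied to a windowed frame first.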

Lossless compression of color filter array mosaic images with visualization via JPEG 2000

Author: Blanes Garcia Ian
Hernández-Cabronero Miguel
Marcellin Michael W.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

Digital cameras have become ubiquitous for amateur and professional applications. The raw images captured by digital sensors typically take the form of color filter array (CFA) mosaic images, which must be "developed" (via digital signal processing) before they can be viewed. Photographers and scientists often repeat the "development process" using different parameters to obtain images suitable for different purposes. Since the development process is generally not invertible, it is commonly desirable to store the raw (or undeveloped) mosaic images indefinitely. Uncompressed mosaic image file sizes can be more than 30 times larger than those of developed images stored in JPEG format. Thus, data compression is of interest. Several compression methods for mosaic images have been proposed in the literature. However, they all require a custom decompressor followed by development-specific software to generate a displayable image. In this paper, a novel compression pipeline that removes these requirements is proposed. Specifically, mosaic images can be losslessly recovered from the resulting compressed files, and, more significantly, images can be directly viewed (decompressed and developed) using only a JPEG 2000 compliant image viewer. Experiments reveal that the proposed pipeline attains excellent visual quality, while providing compression performance competitive to that of state-of-the-art compression algorithms for mosaic images

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Diposit Digital de Documents de la UAB

Optimisation techniques for low bit rate speech coding

Author: Shum Ellen
Publication date: 01/01/1998

This thesis extends the background theory of speech and major speech coding schemes used in existing networks to an implementation of GSM full-rate speech compression on a RISC DSP and a multirate application for speech coding. Speech coding is the field concerned with obtaining compact digital representations of speech signals for the purpose of efficient transmission. In this thesis, the background of speech compression, the characteristics of speech signals and the DSP algorithms used have been examined. The current speech coding schemes and requirements have been studied. The Global System for Mobile communication (GSM) is a digital mobile radio system which is extensively used throughout Europe, and also in many other parts of the world. The algorithm is standardised by the European Telecommunications Standardisation Institute (ETSI). The full-rate and half-rate speech compression of GSM have been analysed. A real-time implementation of the full-rate algorithm has been carried out on a RISC processor GEPARD by Austria Mikro Systeme International (AMS). The GEPARD code has been tested with all of the test sequences provided by ETSI and the results are bit-exact. The transcoding delay is lower than the ETSI requirement. A comparison of the half-rate and full-rate compression algorithms is discussed. Both algorithms offer near-toll speech quality comparable to or better than that of analogue cellular networks. The half-rate compression requires more computationally intensive operations, and therefore a more powerful processor will be needed due to the complexity of the code. Hence the cost of implementing a half-rate codec will be considerably higher than that of a full-rate codec. A description of multirate signal processing and its application to speech (SBC) and speech/audio (MPEG) has been given. The possibility of combining multirate filtering with the GSM full-rate speech algorithm has also been investigated.
The results showed that multirate signal processing cannot be directly applied to GSM full-rate speech compression, since this method requires more processing power and causes a longer coding delay but did not appreciably improve the bit rate. In order to achieve a lower bit rate, the GSM full-rate mathematical algorithm can be used instead of the standardised ETSI recommendation. Some changes, including the number of quantisation bits, have to be made before the application of multirate signal processing, and a new standard will be required.

Durham e-Theses