
    Vector quantization

    During the past ten years Vector Quantization (VQ) has developed from a theoretical possibility promised by Shannon's source coding theorems into a powerful and competitive technique for speech and image coding and compression at medium to low bit rates. In this survey, the basic ideas behind the design of vector quantizers are sketched and some comments are made on the state of the art and current research efforts.
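
    As an illustration of the kind of design procedure such surveys describe, the following is a minimal sketch of iterative codebook training that alternates a nearest-neighbor partition with a centroid update (the generalized Lloyd iteration). It is not taken from the survey itself: the squared-error distortion, the random initialization, and all function names are assumptions chosen for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codeword closest to `vector` (nearest-neighbor rule)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def train_codebook(training, size, iterations=20, seed=0):
    """Alternate partitioning and centroid updates for a fixed number of passes.

    The initial codebook is a random sample of the training set (an
    assumption of this sketch; other initializations are common).
    """
    rng = random.Random(seed)
    codebook = rng.sample(training, size)
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in training:
            cells[nearest_index(v, codebook)].append(v)
        # Keep the old codeword when a cell is empty.
        codebook = [centroid(cell) if cell else code
                    for cell, code in zip(cells, codebook)]
    return codebook


def mean_distortion(training, codebook):
    """Average squared error when each vector is replaced by its codeword."""
    total = sum(squared_distance(v, codebook[nearest_index(v, codebook)])
                for v in training)
    return total / len(training)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    book = train_codebook(data, size=2)
    print(book, mean_distortion(data, book))
```

    Encoding a vector then amounts to transmitting `nearest_index(vector, codebook)`, and decoding is a table lookup into the same codebook.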

    Quantization of Prior Probabilities for Hypothesis Testing

    Bayesian hypothesis testing is investigated when the prior probabilities of the hypotheses, taken as a random vector, are quantized. Nearest neighbor and centroid conditions are derived using mean Bayes risk error as a distortion measure for quantization. A high-resolution approximation to the distortion-rate function is also obtained. Human decision making in segregated populations is studied assuming Bayesian hypothesis testing with quantized priors.

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use.

    An identity of Chernoff bounds with an interpretation in statistical physics and applications in information theory

    An identity between two versions of the Chernoff bound on the probability of a certain large deviations event is established. This identity has an interpretation in statistical physics, namely, an isothermal equilibrium of a composite system that consists of multiple subsystems of particles. Several information-theoretic application examples, where the analysis of this large deviations probability naturally arises, are then described from the viewpoint of this statistical mechanical interpretation. This results in several relationships between information theory and statistical physics, which we hope the reader will find insightful. Comment: 29 pages, 1 figure. Submitted to IEEE Trans. on Information Theory.

    Comparison of VQ and DTW classifiers for speaker verification

    An investigation into the relative speaker verification performance of various types of vector quantisation (VQ) and dynamic time warping (DTW) classifiers is presented. The study covers a number of algorithmic issues involved in the above classifiers, and examines the effects of these on the verification accuracy. The experiments are based on the use of a subset from the Brent (telephone quality) speech database. This subset consists of repetitions of isolated digit utterances 1 to 9 and zero. The paper describes the experimental work, and presents an analysis of the results.
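
    For readers unfamiliar with DTW, the following is a minimal sketch of the classic dynamic-programming distance between two sequences of different lengths. It is not the classifier evaluated in the paper: scalar frames, the absolute-difference local cost, the unconstrained step pattern, and the function name are all assumptions made for illustration (speech systems normally compare vectors of spectral features).

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest accumulated cost of aligning the first
    i frames of `a` with the first j frames of `b`.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: advance in `a` only, in `b` only, or in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    template = [1, 2, 3]
    slower_utterance = [1, 2, 2, 3]
    print(dtw_distance(template, slower_utterance))
```

    In a verification setting the resulting distance would be compared against a threshold to accept or reject the claimed identity; the choice of threshold is outside the scope of this sketch.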

    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]

    Effects of noise suppression and envelope dynamic range compression on the intelligibility of vocoded sentences for a tonal language

    Vocoder simulation studies have suggested that the carrier signal type employed affects the intelligibility of vocoded speech. The present work further assessed how carrier signal type interacts with additional signal processing, namely, single-channel noise suppression and envelope dynamic range compression, in determining the intelligibility of vocoder simulations. In Experiment 1, Mandarin sentences that had been corrupted by speech spectrum-shaped noise (SSN) or two-talker babble (2TB) were processed by one of four single-channel noise-suppression algorithms before undergoing tone-vocoded (TV) or noise-vocoded (NV) processing. In Experiment 2, dynamic ranges of multiband envelope waveforms were compressed by scaling of the mean-removed envelope waveforms with a compression factor before undergoing TV or NV processing. TV Mandarin sentences yielded higher intelligibility scores with normal-hearing (NH) listeners than did NV sentences. The intelligibility advantage of noise-suppressed vocoded speech depended on the masker type (SSN vs 2TB). NV speech was more negatively influenced by envelope dynamic range compression than was TV speech. These findings suggest that an interactional effect exists between the carrier signal type employed in the vocoding process and envelope distortion caused by signal processing.
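
    The compression step in Experiment 2 can be illustrated with a minimal sketch for a single band. The abstract only states that the mean-removed envelope is scaled by a compression factor; adding the mean back afterwards, the per-band processing, and the function name are assumptions of this example, not details confirmed by the paper.

```python
def compress_envelope(envelope, factor):
    """Scale the mean-removed envelope by `factor`, then restore the mean.

    A factor below 1 narrows the dynamic range around the mean level;
    a factor of 1 leaves the envelope unchanged. Restoring the mean is an
    assumption of this sketch.
    """
    mean = sum(envelope) / len(envelope)
    return [mean + factor * (sample - mean) for sample in envelope]


if __name__ == "__main__":
    band_envelope = [1.0, 2.0, 3.0]
    print(compress_envelope(band_envelope, 0.5))
```

    Under these assumptions the peak-to-trough range of the envelope shrinks in direct proportion to the factor, which is one way to read the envelope distortion discussed in the abstract.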

    Comparison of Wideband Earpiece Integrations in Mobile Phone

    Speech in telecommunication networks has traditionally been narrowband, ranging from 300 Hz to 3400 Hz. Wideband speech call services can be expected to gain a stronger foothold in the market in the coming years. This thesis introduces the basics of speech coding together with the adaptive multi-rate wideband (AMR-WB) codec. The wideband codec widens the speech band to the range from 50 Hz to 7000 Hz using a 16 kHz sampling frequency. In practice, the wider band improves speech intelligibility and makes speech sound more natural and comfortable to listen to. The main focus of this thesis is to compare two different wideband earpiece integrations. The question is how much the end user benefits from a larger earpiece in a mobile phone. To determine earpiece performance, objective free-field measurements were made on the earpiece modules. Measurements were also made for the phone on a head and torso simulator (HATS), both with the earpieces wired directly to a power amplifier and over the air with an active call on GSM and WCDMA networks. The objective measurements showed differences between the earpiece integrations in frequency response and distortion, especially at low frequencies. Finally, a subjective listening test was carried out to see whether the end user notices the difference between the smaller and larger earpiece integrations using narrowband and wideband speech samples. Based on the listening test results, users can distinguish between the two integrations, and a male speaker benefits more from a larger earpiece than a female speaker when the wideband speech codec is used.

    A study of data coding technology developments in the 1980-1985 time frame, volume 2

    The source parameters of digitized analog data are discussed. Different data compression schemes are outlined and analyses of their implementation are presented. Finally, bandwidth compression techniques are given for video signals.