Differential encoding techniques applied to speech signals

Author: Constantinos S. Xydeas (7200770)
Publication date: 01/01/1978

The increasing use of digital communication systems has produced a continuous search for efficient methods of speech encoding. This thesis describes investigations of novel differential encoding systems. Initially, Linear First Order DPCM systems employing a simple delayed encoding algorithm are examined. The systems detect an overload condition in the encoder and, through a simple algorithm, reduce the overload noise at the expense of some increase in the quantization (granular) noise. The signal-to-noise ratio (SNR) performance of such a codec has a 1 to 2 dB advantage over the First Order Linear DPCM system. In order to obtain a large improvement in SNR, the high correlation between successive pitch periods, as well as the correlation between successive samples in the voiced speech waveform, is exploited. A system called "Pitch Synchronous First Order DPCM" (PSFOD) has been developed. Here the difference sequence formed between the samples of the input sequence in the current pitch period and the samples of the stored decoded sequence from the previous pitch period is encoded. This difference sequence has a smaller dynamic range than the original input speech sequence, enabling a quantizer with better resolution to be used for the same transmission bit rate. The SNR is increased by 6 dB compared with the peak SNR of a First Order DPCM codec. A development of the PSFOD system, called a Pitch Synchronous Differential Predictive Encoding system (PSDPE), is investigated next. The principle of its operation is to predict the next sample in the voiced-speech waveform and form the prediction error, which is then subtracted from the corresponding decoded prediction error in the previous pitch period. The difference is then encoded and transmitted. The improvement in SNR is approximately 8 dB compared to an ADPCM codec when the PSDPE system uses an adaptive PCM encoder. The SNR of the system increases further when the efficiency of the predictors used improves.
However, the performance of a predictor in any differential system is closely related to the quantizer used. The better the quantization, the more information is available to the predictor and the better the prediction of the incoming speech samples. This leads naturally to the investigation of techniques for efficient quantization. A novel adaptive quantization technique called the Dynamic Ratio quantizer (DRQ) is then considered and its theory presented. The quantizer uses an adaptive non-linear element which transforms input samples of any amplitude to samples within a defined amplitude range. A fixed uniform quantizer quantizes the transformed signal. The SNR for this quantizer is almost constant over a range of input power limited in practice by the dynamic range of the adaptive non-linear element, and it is 2 to 3 dB better than the SNR of a One Word Memory adaptive quantizer. Digital computer simulation techniques have been used widely in the above investigations and provide the necessary experimental flexibility. Their use is described in the text.
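
The baseline that the abstract compares against, a linear first-order DPCM codec, can be sketched as follows. This is a minimal illustration only, not the thesis's implementation: the function names, the predictor coefficient `a`, the step size and the number of quantizer levels are all assumptions chosen for the example. It shows the two noise sources the abstract mentions: clipping of the quantizer index (overload) and rounding to the nearest step (granular noise).

```python
import math

def dpcm_encode(samples, step, levels=16, a=0.9):
    """First-order DPCM encoder (illustrative; parameter values are assumptions).

    Each sample is predicted as a * (previous decoded sample); only the
    quantized prediction error is transmitted.
    """
    half = levels // 2
    codes = []
    prev = 0.0  # previous decoded sample, tracked exactly as the decoder does
    for x in samples:
        pred = a * prev
        q = int(round((x - pred) / step))   # granular (rounding) noise
        q = max(-half, min(half - 1, q))    # clipping here is the overload condition
        codes.append(q)
        prev = pred + q * step              # local decoder inside the encoder
    return codes

def dpcm_decode(codes, step, a=0.9):
    """Rebuild the signal from the quantizer indices."""
    out = []
    prev = 0.0
    for q in codes:
        prev = a * prev + q * step
        out.append(prev)
    return out

def snr_db(original, decoded):
    """Signal-to-noise ratio in dB between an input and its reconstruction."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    return 10.0 * math.log10(signal / noise)
```

For example, encoding a slowly varying sine wave with `step=0.05` and decoding it again gives a reconstruction whose error stays within half a step as long as the quantizer never clips; a faster or larger input would push the index into the clipping range and produce overload noise instead.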

Loughborough University Institutional Repository

The voice activity detection (VAD) recorder and VAD network recorder : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University

Author: Liu Feng
Publication venue: 'Massey University'
Publication date: 01/01/2001

The project provides a feasibility study for the AudioGraph tool, focusing on two application areas: the VAD (voice activity detector) recorder and the VAD network recorder. The first achieves low bit-rate speech recording on the fly, using a GSM compression coder with a simple VAD algorithm; the second provides two-way speech over IP, achieving echo cancellation with a simplex channel. The latter is required for implementing a synchronous AudioGraph. In the first chapter we introduce the background of this project, specifically the VoIP technology, the AudioGraph tool, and the VAD algorithms. We also discuss the problems set for this project. The second chapter presents all the relevant techniques in detail, including sound representation, speech-coding schemes, sound file formats, PowerPlant and Macintosh programming issues, and the simple VAD algorithm we have developed. The third chapter discusses the implementation issues, including the systems' objective, architecture, the problems encountered and the solutions used. The fourth chapter illustrates the results of the two applications. The user documentation for the applications is given, and after that we analyse the parameters based on the results. We also present the default settings of the parameters, which could be used in the AudioGraph system. The last chapter provides conclusions and future work.

Massey Research Online

Multimodal person recognition for human-vehicle interaction

Author: Abut Hüseyin
Erdoğan Hakan
Erzin Engin
Erçil Aytül
Tekalp A. Murat
Yemez Yücel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2006

Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. Today's technology prevents such systems from operating satisfactorily under adverse conditions. A proposed framework for achieving person recognition successfully combines different biometric modalities, as borne out in two case studies.

Sabanci University Research Database

Dental drill noise reduction using a combination of active noise control, passive noise control and adaptive filtering

Author: Atherton MA
Kaymak E
Millar B
Rotter K
Publication venue: Turkish Acoustical Society
Publication date: 01/01/2007

Dental drills produce a characteristic high-frequency, narrow-band noise that is uncomfortable for patients and is also known to be harmful to dentists under prolonged exposure. It is therefore desirable to protect the patient and dentist whilst allowing two-way communication. A solution is to use a combination of the three main noise control methods, namely Passive Noise Control (PNC), Adaptive Filtering (AF) and Active Noise Control (ANC). This paper discusses the application of the three methods to reduce dental drill noise while allowing two-way communication. The experimental setup for measuring the noise reduction by PNC is explained, and results from different headphones and headphone types are presented. The implementation and results of an AF system using the Least Mean Square (LMS) algorithm are shown. ANC requires a modification of the LMS algorithm because of the electro-acoustical cancellation path transfer function, which must be accounted for to compensate for the delays introduced by the control system. Therefore a cancellation path transfer function modeling method based on the filtered-reference LMS (FXLMS) algorithm is presented, along with preliminary results of the implementation.
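
The adaptive filtering stage named in the abstract relies on the LMS algorithm. The sketch below shows the textbook LMS noise canceller only; it is not the paper's implementation and does not include the cancellation-path model that FXLMS adds. The function name, the number of taps and the step size `mu` are assumptions made for the example. A reference pickup of the drill noise is filtered to match the noise in the primary signal, and the error output is what remains after cancellation.

```python
import math

def lms_cancel(reference, primary, num_taps=4, mu=0.05):
    """Plain LMS adaptive noise canceller (illustrative; names and values are assumptions).

    reference: samples correlated with the noise only (e.g. a drill pickup)
    primary:   samples containing the wanted signal plus noise
    Returns the error sequence (the cleaned output) and the final weights.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps  # most recent reference samples, newest first
    errors = []
    for x, d in zip(reference, primary):
        taps = [x] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))  # noise estimate
        e = d - y                                      # canceller output
        # LMS update: step along the instantaneous gradient estimate
        weights = [w + 2.0 * mu * e * t for w, t in zip(weights, taps)]
        errors.append(e)
    return errors, weights
```

With a narrow-band reference such as a single tone and a primary signal that contains only a scaled copy of that tone, the error output decays towards zero as the weights converge. The step size trades convergence speed against stability and must be kept small relative to the reference signal power.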

Brunel University Research Archive

Network planning for third-generation mobile radio systems

Author: Beach MA
Cheung JCS
McGeehan JP
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/1994

Explore Bristol Research

Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web

Author: Díaz de María Fernando
Gallardo Antolín Ascensión
Peláez Moreno Carmen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001

The Internet Protocol (IP) environment poses two relevant sources of distortion to the speech recognition problem: lossy speech coding and packet loss. In this paper, we propose a new front-end for speech recognition over IP networks. Specifically, we suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bit stream) instead of decoding it and subsequently extracting the feature vectors. This approach offers two significant benefits. First, the recognition system is only affected by the quantization distortion of the spectral envelope. Thus, we avoid the influence of other sources of distortion due to the encoding-decoding process. Second, when packet loss occurs, our front-end becomes more effective since it is not constrained by the error handling mechanism of the codec. We have considered the ITU G.723.1 standard codec, which is one of the most prevalent coding algorithms in voice over IP (VoIP), and compared the proposed front-end with the conventional approach in two automatic speech recognition (ASR) tasks, namely speaker-independent isolated digit recognition and speaker-independent continuous speech recognition. In general, our approach outperforms the conventional procedure for a variety of simulated packet loss rates. Furthermore, the improvement is higher as network conditions worsen.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Micro protocol engineering for unstructured carriers: On the embedding of steganographic control protocols into audio transmissions

Author: Keller Jörg
Mazurczyk Wojciech
Naumann Matthias
Wendzel Steffen
Publication date: 28/05/2015

Network steganography conceals the transfer of sensitive information within unobtrusive data in computer networks. So-called micro protocols are communication protocols placed within the payload of a network steganographic transfer. They enrich this transfer with features such as reliability, dynamic overlay routing, or performance optimization, to mention just a few. We present different design approaches for the embedding of hidden channels with micro protocols in digitized audio signals under consideration of different requirements. On the basis of experimental results, our design approaches are compared and introduced into a protocol engineering approach for micro protocols. Comment: 20 pages, 7 figures, 4 tables

arXiv.org e-Print Archive

Fraunhofer-ePrints