Echo Cancellation - A Likelihood Ratio Test for Double-talk Versus Channel Change

Author: Bershad Neil J.
Tourneret Jean-Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Echo cancellers are in wide use in both electrical (four-wire to two-wire mismatch) and acoustic (speaker-microphone coupling) applications. One of the main design problems is the control logic for adaptation. In essence, the algorithm weights should be frozen in the presence of double-talk and should adapt quickly in its absence. The control logic can be quite complicated, since it is often not easy to discriminate between the echo signal and the near-end speaker. This paper derives a log likelihood ratio test (LRT) for deciding between double-talk (freeze weights) and a channel change (adapt quickly) using a stationary Gaussian stochastic input signal model. The probability density function of a sufficient statistic under each hypothesis is obtained, and the performance of the test is evaluated as a function of the system parameters. The receiver operating characteristics (ROCs) indicate that it is difficult to decide correctly between double-talk and a channel change based upon a single look. However, post-detection integration of approximately one hundred sufficient statistic samples yields a detection probability close to unity (0.99) with a small false alarm probability (0.01).
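The benefit of post-detection integration described in the abstract can be illustrated with a deliberately simplified model. The sketch below is not the paper's sufficient statistic or its density: it assumes, purely for illustration, that each look is an independent unit-variance Gaussian sample with mean 0 under one hypothesis and mean `deflection` under the other. The function name and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detection_probability(deflection, n_looks, p_false_alarm):
    """Detection probability of a threshold test on the sum of n_looks samples.

    Assumed model (not taken from the paper): i.i.d. unit-variance Gaussian
    samples with mean 0 under H0 and mean `deflection` under H1.
    """
    std_normal = NormalDist()
    # Normalized threshold that yields the requested false alarm probability.
    threshold = std_normal.inv_cdf(1.0 - p_false_alarm)
    # Summing n_looks samples scales the normalized mean shift by sqrt(n_looks).
    return 1.0 - std_normal.cdf(threshold - deflection * sqrt(n_looks))


# Compare a single look with one hundred integrated looks at the same
# false alarm probability (the deflection value 0.5 is an arbitrary choice).
single_look = detection_probability(0.5, 1, 0.01)
hundred_looks = detection_probability(0.5, 100, 0.01)
```

Under this assumed model the separation between the two hypotheses grows with the square root of the number of integrated samples, which is why a test that is unreliable on one look can become reliable after many.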

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Creativity First, Science Follows: Lessons in Digital Signal Processing Education

Author: Alty Stephen
Cheong Took Clive
Howard David
Yardim Anush
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/04/2021

Royal Holloway - Pure

Study of adaptive signal processing

Author: Patra Manas Ranjan
Publication date: 01/01/2013

An adaptive filter is a digital filter that can adjust its coefficients to give the best match to a given desired signal. When an adaptive filter operates in a changing environment, the filter coefficients can adapt in response to changes in the applied input signals. Adaptive filters depend on recursive algorithms to update their coefficients and train them toward the optimum solution. An everyday example of adaptive filtering is the telephone system, where impedance mismatches that cause echoes of a signal are a significant source of annoyance to users. Here the adaptive filter estimates the echo path, generates a replica of the echo, and compensates for it. To do this, the echo path is viewed as an unknown system with some impulse response, and the adaptive filter must mimic this response. Adaptive filters are generally implemented in the time domain, which works well in most scenarios; in many applications, however, the impulse response becomes long, increasing the complexity of the filter beyond the level at which it can be implemented efficiently in the time domain. Acoustic echo cancellation in hands-free telephony systems is one example. An alternative solution is to implement the filters in the frequency domain. The Discrete Fourier Transform, computed by the Fast Fourier Transform (FFT), converts signals from the time domain to the frequency domain in an efficient manner. Despite the efficiency of the FFT, the overhead involved in converting the signals to the frequency domain does restrict the use of the algorithm. When the impulse response of the unknown system, and hence of the filter, is long enough, however, this is not an issue, since the computational cost of the conversion is much less than that of the time-domain algorithm. The actual filtering of the signals requires little computation in the frequency domain. Investigating the so-called crossover point, the point at which the frequency-domain implementation becomes more efficient than the time-domain implementation, is important for establishing where frequency-domain implementation becomes practical.
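The time-domain approach described in the abstract, an adaptive filter that mimics an unknown echo path, can be sketched as follows. The abstract does not name a specific update rule, so this example uses the normalized LMS (NLMS) algorithm as one common choice. The echo path coefficients, step size, and signal length are hypothetical values picked for illustration, and the far-end signal is simulated white noise rather than speech.

```python
import random


def nlms_echo_canceller(far_end, microphone, num_taps, step_size=0.5, eps=1e-8):
    """Adapt an FIR filter so its output tracks the echo in `microphone`.

    Returns the final weights and the residual (echo-cancelled) signal.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate
        # Normalize the step by the input energy; eps avoids division by zero.
        norm = sum(h * h for h in history) + eps
        weights = [w + step_size * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual


# Simulated scenario: a short, hypothetical echo path and a white far-end signal.
rng = random.Random(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
microphone = [
    sum(echo_path[k] * far_end[n - k]
        for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
weights, residual = nlms_echo_canceller(far_end, microphone, num_taps=len(echo_path))
```

Each sample costs a number of multiplications proportional to the filter length, which is the per-sample burden that motivates frequency-domain implementations when the impulse response is long.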

ethesis@nitr

Virtual Audio - Three-Dimensional Audio in Virtual Environments

Author: Adler Daniel
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/1996

Three-dimensional interactive audio has a variety of potential uses in human-machine interfaces. After lagging seriously behind the visual components, the importance of sound is now becoming increasingly accepted. This paper mainly discusses background and techniques to implement three-dimensional audio in computer interfaces. A case study of a system for three-dimensional audio, implemented by the author, is described in great detail. The audio system was moreover integrated with a virtual reality system, and conclusions on user tests and use of the audio system are presented along with proposals for future work at the end of the paper. The thesis begins with a definition of three-dimensional audio and a survey of the human auditory system to give the reader the needed knowledge of what three-dimensional audio is and how human auditory perception works.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Enhancing Usability, Security, and Performance in Mobile Computing

Author: Yi Shanhe
Publication venue: W&M ScholarWorks
Publication date: 01/01/2018

We have witnessed the prevalence of smart devices in every aspect of human life. However, the ever-growing number of smart devices presents significant challenges in terms of usability, security, and performance. First, we need to design new interfaces to improve device usability, which has been neglected during the rapid shift from hand-held mobile devices to wearables. Second, we need to protect smart devices holding abundant private data against unauthorized users. Last, new applications with compute-intensive tasks demand the integration of emerging mobile backend infrastructure. This dissertation focuses on addressing these challenges. First, we present GlassGesture, a system that improves the usability of Google Glass through a head gesture user interface with gesture recognition and authentication. We accelerate the recognition by employing a novel similarity search scheme, and improve the authentication performance by applying new features of head movements in an ensemble learning method. As a result, GlassGesture achieves 96% gesture recognition accuracy. Furthermore, GlassGesture accepts authorized users in nearly 92% of trials, and rejects attackers in nearly 99% of trials. Next, we investigate the authentication between a smartphone and a paired smartwatch. We design and implement WearLock, a system that utilizes one's smartwatch to unlock one's smartphone via acoustic tones. We build an acoustic modem with sub-channel selection and adaptive modulation, which generates modulated acoustic signals to maximize the unlocking success rate against ambient noise. We leverage the motion similarities of the devices to eliminate unnecessary unlocking. We also offload heavy computation tasks from the smartwatch to the smartphone to shorten response time and save energy. The acoustic modem achieves a low bit error rate (BER) of 8%. Compared to traditional manual entry of personal identification numbers (PINs), WearLock not only automates the unlocking but also speeds it up by at least 18%. Last, we consider low-latency video analytics on mobile devices, leveraging emerging mobile backend infrastructure. We design and implement LAVEA, a system which offloads computation from mobile clients to edge nodes, to accomplish tasks with intensive computation at places closer to users in a timely manner. We formulate an optimization problem for offloading task selection and prioritize offloading requests received at the edge node to minimize the response time. We design and compare various task placement schemes for inter-edge collaboration to further improve the overall response time. Our results show that the client-edge configuration has a speedup ranging from 1.3x to 4x against running solely on the client and 1.2x to 1.7x against the client-cloud configuration.

College of William & Mary: W&M Publish

BSA Practice guidance: an overview of current management of auditory processing disorder (APD)

Author: Alles R.
Bamiou D.
Batchelor L.
Campbell N.G. (Lead Author)
Canning D.
Grant P.
Luxon L.
Moore D.
Murray P.
Nairn S.
Rosen S.
Sirimanna T.
Treharne D.
Wakeham K.
Publication venue: British Society of Audiology
Publication date: 17/10/2011

Southampton (e-Prints Soton)

The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: McKeown Kathleen
Cole Ron
Hirschman Lynette
Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Garcia Oscar
Hanson Brian
Hermansky Hynek
Levinson Steve
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication date: 01/01/1995

A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Cole Ron
Garcia Oscar
Hanson Brian
Hermansky Hynek
Hirschman Lynette
Levinson Steve
McKeown Kathleen
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1995

Columbia University Academic Commons

An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

Author: Cahill Niall M.
Publication date: 01/02/2012

In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the user's signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform (STFT) domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the incoming far-end user's speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false double-talk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
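The idea of a normalized, correlation-based decision variable compared against a fixed threshold can be sketched generically. The abstract does not give the thesis's exact statistic or normalization, so the version below is an assumption: it uses the magnitude of the normalized cross-correlation between one block of the echo estimate and the same block of the microphone signal. The function names and the threshold value of 0.9 are hypothetical.

```python
from math import sqrt


def normalized_correlation(echo_estimate, microphone):
    """Magnitude of the normalized cross-correlation of two equal-length blocks."""
    dot = sum(e * m for e, m in zip(echo_estimate, microphone))
    energy_e = sum(e * e for e in echo_estimate)
    energy_m = sum(m * m for m in microphone)
    if energy_e == 0.0 or energy_m == 0.0:
        # A silent block carries no evidence; returning 0.0 means it is
        # flagged as double-talk, which conservatively freezes adaptation.
        return 0.0
    return abs(dot) / sqrt(energy_e * energy_m)


def is_double_talk(echo_estimate, microphone, threshold=0.9):
    """Declare double-talk when the microphone block is poorly explained by the echo."""
    return normalized_correlation(echo_estimate, microphone) < threshold


# Toy blocks: the microphone carries the echo alone, then echo plus near-end speech.
echo_block = [1.0, -1.0, 1.0, -1.0]
near_end_block = [2.0, 2.0, 2.0, 2.0]
echo_only_mic = list(echo_block)
double_talk_mic = [e + s for e, s in zip(echo_block, near_end_block)]
```

When the microphone block contains only echo, the decision variable stays near one and adaptation can continue; a strong near-end component lowers it below the threshold.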

MURAL - Maynooth University Research Archive Library

Irish Universities

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive