1,343 research outputs found

Analogue CMOS Cochlea Systems: A Historic Retrospective

Authors: Andreas Katsiamis; Emmanuel Drakakis
Publication venue: IntechOpen
Publication date: 26/04/2011

Coding Strategies for Cochlear Implants Under Adverse Environments

Author: Tahmina Qudsia
Publication venue: UWM Digital Commons
Publication date: 01/05/2016

Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, there remain limitations on speech perception under adverse environments such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone-processed speech, we propose to examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information; this study therefore provides support for the design of algorithms that extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sounds consist of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy.
The proposed strategy operates with little to no prior information on the signal and the room characteristics and can therefore potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on the development of algorithms to regenerate harmonics of voiced segments in the presence of noise.
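The abstract does not say how the late-reverberant energy is estimated, so the following is only a minimal sketch of spectral subtraction under one common assumption: reverberant energy decays exponentially, losing 60 dB over RT60, so the late-reflection power in a frame can be approximated by a delayed, attenuated copy of an earlier frame's power. The function name `late_reverb_gains` and the parameters `frame_shift`, `delay_frames` and `floor` are hypothetical, and passing a known `rt60` is a simplification that the thesis's strategy reportedly does not need.

```python
import math

def late_reverb_gains(power_frames, rt60, frame_shift, delay_frames, floor=0.1):
    """Amplitude gains that subtract an estimate of late-reverberant power.

    power_frames: list of frames, each a list of per-bin power values |X|^2.
    rt60:         assumed reverberation time in seconds (illustrative input).
    frame_shift:  time between consecutive frames in seconds.
    delay_frames: how many frames back the late reflections are assumed to start.
    floor:        minimum amplitude gain, to limit over-subtraction artifacts.
    """
    # Decay rate such that energy drops by 60 dB after rt60 seconds.
    delta = 3.0 * math.log(10.0) / rt60
    decay = math.exp(-2.0 * delta * delay_frames * frame_shift)

    gains = []
    for m, frame in enumerate(power_frames):
        if m < delay_frames:
            # Not enough history yet: pass the frame through unchanged.
            gains.append([1.0] * len(frame))
            continue
        past = power_frames[m - delay_frames]
        row = []
        for p_now, p_past in zip(frame, past):
            late = decay * p_past  # estimated late-reverberant power
            power_gain = 1.0 - late / p_now if p_now > 0.0 else 0.0
            power_gain = max(power_gain, floor * floor)
            row.append(math.sqrt(power_gain))
        gains.append(row)
    return gains
```

Each returned gain would be multiplied with the corresponding short-time spectrum bin before resynthesis; the spectral floor is a design choice that trades residual reverberation against musical-noise artifacts.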

University of Wisconsin-Milwaukee

Perceptually motivated blind source separation of convolutive audio mixtures

Author: Guddeti Ram Mohana Reddy
Publication venue: The University of Edinburgh
Publication date: 01/01/2005

Edinburgh Research Archive

Acoustics of ancient Greek and Roman theaters in use today

Authors: Angelakis Konstantinos; Gade Anders Christian
Publication date: 01/01/2006

Crossref

Online Research Database In Technology

Performance of optimized sound field control techniques in simulated and real acoustic environments

Authors: Jan Abildgaard Pedersen; Marek Olik; Martin Møller; Martin Olsen; Philip Coleman; Philip Jackson
Publication venue: Acoustical Society of America (ASA)

Crossref

Studies on noise robust automatic speech recognition

Authors: Kurimo Mikko; Palomäki Kalle J.; Remes Ulpu
Publication venue: Teknillinen korkeakoulu
Publication date: 01/01/2009

Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise-robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise-robust automatic speech recognition (course code T-61.6060) held at TKK.

Aaltodoc Publication Archive

Quantitative assessment of spatial sound distortion by the semi-ideal recording point of a hear-through device

Authors: Christensen Flemming; Hammershøi Dorte; Hoffmann Pablo F.
Publication venue: Acoustical Society of America (ASA)
Publication date: 01/01/2013

Crossref

VBN

TimeScaleNet: a Multiresolution Approach for Raw Audio Recognition using Learnable Biquadratic IIR Filters and Residual Networks of Depthwise-Separable One-Dimensional Atrous Convolutions

Authors: Bavu Eric; Garcia Alexandre; Pujol Hadrien; Ramamonjy Aro
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

In the present paper, we show the benefit of a multi-resolution approach that encodes the relevant information contained in unprocessed time-domain acoustic signals. TimeScaleNet aims at learning an efficient representation of a sound by learning time dependencies both at the sample level and at the frame level. The proposed approach improves the interpretability of the learning scheme by unifying advanced deep learning and signal processing techniques. In particular, TimeScaleNet's architecture introduces a new form of recurrent neural layer, which is directly inspired by digital IIR signal processing. This layer acts as a learnable band-pass biquadratic digital IIR filterbank. The learnable filterbank builds a time-frequency-like feature map that self-adapts to the specific recognition task and dataset, with a large receptive field and very few learnable parameters. The obtained frame-level feature map is then processed using a residual network of depthwise-separable atrous convolutions. This second scale of analysis aims at efficiently encoding relationships between the time fluctuations at the frame timescale, in different learnt pooled frequency bands, in the range of 20 ms to 200 ms. TimeScaleNet is tested on both the Speech Commands Dataset and the ESC-10 Dataset. We report a very high mean accuracy of 94.87 ± 0.24% (macro-averaged F1-score: 94.9 ± 0.24%) for speech recognition, and a rather moderate accuracy of 69.71 ± 1.91% (macro-averaged F1-score: 70.14 ± 1.57%) for the environmental sound classification task.
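To make the idea of a band-pass biquadratic IIR filter concrete, here is a minimal fixed-parameter sketch. It is not TimeScaleNet's layer: the abstract does not give the parametrization, so the coefficients below follow the widely used "audio EQ cookbook" band-pass design with unit gain at the center frequency, and the function names `bandpass_biquad`, `biquad_filter` and `magnitude_response` are hypothetical. In a learnable version, quantities such as the center frequency `f0` and quality factor `q` would presumably be trained rather than fixed.

```python
import cmath
import math

def bandpass_biquad(f0, q, fs):
    """Normalized (b, a) coefficients of a band-pass biquad with unit peak gain."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad_filter(x, b, a):
    """Apply the second-order recursion sample by sample (direct form I)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def magnitude_response(b, a, f, fs):
    """|H| of the filter at frequency f, evaluated on the unit circle."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)
```

The recursion in `biquad_filter` is what makes such a layer "recurrent": each output sample depends on the two previous outputs, which gives a long effective receptive field with only a handful of parameters per band.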

Enhancement of Periodic Signals: with Application to Speech Signals

Author: Jensen Jesper Rindom
Publication date: 01/01/2012

VBN