Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays
In the analysis of acoustic scenes, the occurring sounds often have to be
detected in time, recognized, and localized in space. Usually, each of these
tasks is done separately. In this paper, a model-based approach to jointly
carry them out for the case of multiple simultaneous sources is presented and
tested. The recognized event classes and their respective room positions are
obtained with a single system that maximizes the combination of a large set of
scores, each one resulting from a different acoustic event model and a
different beamformer output signal, which comes from one of several
arbitrarily-located small microphone arrays. Using a two-step method,
experimental work is reported for a specific scenario consisting of
meeting-room acoustic events, either isolated or overlapped with speech.
Tests carried out with two datasets show the advantage of the proposed
approach over several usual techniques, and that including estimated priors
brings a further performance improvement.
Comment: computational acoustic scene analysis, microphone array signal processing, acoustic event detection
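The core idea, combining scores from every (acoustic event model, beamformer output) pair and maximizing over candidate classes and positions, can be sketched in a few lines of Python. This is only an illustration: `beamform`, `score`, and the position grid are hypothetical stand-ins, not the paper's actual two-step implementation.

```python
import numpy as np

def joint_recognize_localize(beamform, score, event_models, positions,
                             num_arrays, log_priors=None):
    """Return the (event class, position) pair maximizing the combined score.

    beamform(a, p): output of array a's beamformer steered to position p
    score(m, x):    acoustic-model log-score of signal x under model m
    All names here are illustrative, not the paper's actual interface.
    """
    best_cls, best_pos, best_val = None, None, -np.inf
    for p in positions:                          # candidate room positions
        # one enhanced signal per arbitrarily-located small array
        signals = [beamform(a, p) for a in range(num_arrays)]
        for cls, model in event_models.items():
            total = sum(score(model, x) for x in signals)
            if log_priors is not None:           # optional estimated priors
                total += log_priors[cls]
            if total > best_val:
                best_cls, best_pos, best_val = cls, p, total
    return best_cls, best_pos
```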
ChordMics: Acoustic Signal Purification with Distributed Microphones
Acoustic signals are an essential input to many systems. However, a pure
acoustic signal is very difficult to extract, especially in noisy environments.
Existing beamforming systems are able to extract the signal transmitted from
certain directions. However, since microphones are centrally deployed, these
systems have limited coverage and low spatial resolution. We overcome the above
limitations and present ChordMics, a distributed beamforming system. By
leveraging the spatial diversity of the distributed microphones, ChordMics is
able to extract the acoustic signal from arbitrary points. To realize such a
system, we further address the fundamental challenge in distributed
beamforming: aligning the signals captured by distributed and unsynchronized
microphones. We implement ChordMics and evaluate its performance under both LOS
and NLOS scenarios. The evaluation results show that ChordMics delivers higher
SINR than a centralized microphone array, with an average performance gain of
up to 15 dB.
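The key step, aligning streams from unsynchronized, distributed microphones, can be illustrated with a classical technique: estimate each stream's offset against a reference by GCC-PHAT, then shift and sum. This is a minimal sketch assuming equal-length streams and constant integer-sample offsets; it is not ChordMics' actual alignment algorithm.

```python
import numpy as np

def gcc_phat_delay(x, ref):
    """Estimate the integer-sample lag of x relative to ref via GCC-PHAT."""
    n = len(x) + len(ref)
    cross = np.fft.rfft(x, n) * np.conj(np.fft.rfft(ref, n))
    cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    lag = int(np.argmax(np.abs(cc)))
    return lag if lag < n // 2 else lag - n   # map wrapped lags to negative

def align_and_sum(signals):
    """Delay-and-sum over distributed equal-length streams after alignment."""
    ref = signals[0]
    out = np.zeros(len(ref))
    for s in signals:
        out += np.roll(s, -gcc_phat_delay(s, ref))
    return out / len(signals)
```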
Microphone array signal processing for robot audition
Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background and interfering noise sources (speakers) as well as room reverberation. In addition, the movements of a robot and its actuators cause ego-noise, which degrades the recorded signals significantly. The movement of the robot body and its head also complicates the detection and tracking of the desired, possibly moving, sound sources. This paper presents an overview of the concepts in microphone array processing for robot audition and some recent achievements.
Adaptive Signal Processing Techniques and Realistic Propagation Modeling for Multiantenna Vital Sign Estimation
This thesis addresses the problem of vital sign estimation using radio-frequency measurements and adaptive signal enhancement techniques with a multiantenna continuous-wave radar. Different adaptive processing techniques are proposed in a novel approach to combine the signals from multiple receivers that carry the cardiopulmonary micro-Doppler phase modulation caused by breathing and heartbeat.
The results are based on extensive simulations using a realistic model of radio signal propagation and reflection losses derived in the thesis. It is shown that the adaptive techniques provide a significant increase in vital sign rate estimation accuracy and enable monitoring at lower SNR conditions.
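As a rough illustration of multi-receiver combining (not the adaptive algorithms developed in the thesis), the sketch below extracts the micro-Doppler phase from each receiver's I/Q stream, combines the channels with simple variance-based weights, and reads the breathing rate off the spectral peak; every parameter here is an assumption.

```python
import numpy as np

def breathing_rate(iq, fs, band=(0.1, 0.5)):
    """Estimate breathing rate (Hz) from multi-receiver CW-radar I/Q data.

    iq: complex array of shape (num_receivers, num_samples).
    Channels are combined with variance-proportional weights, a crude
    stand-in for the adaptive combining studied in the thesis.
    """
    phase = np.unwrap(np.angle(iq), axis=1)         # micro-Doppler phase
    phase -= phase.mean(axis=1, keepdims=True)
    w = np.var(phase, axis=1)                       # rough per-channel quality
    combined = w @ phase / w.sum()
    spectrum = np.abs(np.fft.rfft(combined * np.hanning(combined.size)))
    freqs = np.fft.rfftfreq(combined.size, 1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])  # 6-30 breaths per minute
    return freqs[mask][np.argmax(spectrum[mask])]
```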
Speech enhancement for robust automatic speech recognition: Evaluation using a baseline system and instrumental measures
Automatic speech recognition in everyday environments must be robust to significant levels of reverberation and noise. One strategy for achieving such robustness is multi-microphone speech enhancement. In this study, we present the results of an evaluation of different speech enhancement pipelines using a state-of-the-art ASR system over a wide range of reverberation and noise conditions. The evaluation exploits the recently released ACE Challenge database, which includes measured multichannel acoustic impulse responses from 7 different rooms with reverberation times ranging from 0.33 s to 1.34 s. The reverberant speech is mixed with ambient, fan, and babble noise recordings made with the same microphone setups in each of the rooms. In the first experiment, the performance of the ASR system without speech enhancement is evaluated; the results clearly indicate the deleterious effect of both noise and reverberation. In the second experiment, different speech enhancement pipelines are evaluated, with relative word error rate reductions of up to 82%. Finally, the ability of selected instrumental metrics to predict ASR performance improvement is assessed. The best-performing metric, the Short-Time Objective Intelligibility (STOI) measure, is shown to have a Pearson correlation coefficient of 0.79, suggesting that it is a useful predictor of algorithm performance in these tests.
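The two headline quantities, relative WER reduction and the metric-to-WER correlation, are straightforward to compute; the sketch below shows the form of the calculation with placeholder numbers, not the paper's data.

```python
import numpy as np
from scipy.stats import pearsonr

def relative_wer_reduction(wer_baseline, wer_enhanced):
    """Relative word error rate reduction in percent."""
    return 100.0 * (wer_baseline - wer_enhanced) / wer_baseline

# Placeholder values: STOI improvement and (baseline, enhanced) WER in %
# per condition. The paper correlates such pairs across its test conditions.
stoi_gain = np.array([0.02, 0.05, 0.08, 0.11, 0.15])
wer_pairs = [(40, 38), (45, 40), (55, 44), (60, 45), (70, 48)]
wer_gain = np.array([relative_wer_reduction(b, e) for b, e in wer_pairs])

r, p = pearsonr(stoi_gain, wer_gain)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```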
Automatic Speech Separation for Brain-Controlled Hearing Technologies
Speech perception in crowded acoustic environments is particularly challenging for hearing-impaired listeners. While assistive hearing devices can suppress background noises distinct from speech, they struggle to attenuate interfering speakers without knowing which speaker the listener is focusing on. The human brain has a remarkable ability to pick out individual voices in a noisy environment like a crowded restaurant or a busy city street. This ability inspires brain-controlled hearing technologies: a brain-controlled hearing aid acts as an intelligent filter, reading the wearer's brainwaves and enhancing the voice they want to focus on.
Two essential elements form the core of brain-controlled hearing aids: automatic speech separation (SS), which isolates individual speakers from the mixed audio of an acoustic scene, and auditory attention decoding (AAD), in which the listener's brainwaves are compared with the separated speakers to determine the attended one, which can then be amplified to facilitate hearing. This dissertation focuses on speech separation and its integration with AAD, aiming to propel the evolution of brain-controlled hearing technologies. The goal is to help users engage in conversations with the people around them seamlessly and efficiently.
This dissertation is structured into two parts. The first part focuses on automatic speech separation models, beginning with the introduction of a real-time monaural speech separation model, followed by more advanced real-time binaural speech separation models. The binaural models use both spectral and spatial features to separate speakers and are more robust to noise and reverberation. Beyond performing speech separation, the binaural models preserve the interaural cues of the separated sound sources, a significant step towards immersive augmented hearing. Additionally, the first part explores using speaker identification to improve the performance and robustness of models in long-form speech separation, and delves into unsupervised learning methods for multi-channel speech separation, aiming to improve the models' ability to generalize to real-world audio.
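One way to see how interaural cues can survive separation: if a single real-valued time-frequency mask per speaker is applied to the left and right channels independently, each ear keeps its own phase and level, and hence the source's ITD and ILD. A minimal sketch assuming the mask is already estimated (the dissertation's models are far more elaborate):

```python
import numpy as np
from scipy.signal import stft, istft

def binaural_apply_mask(left, right, mask, fs, nperseg=512):
    """Apply one speaker's real-valued time-frequency mask to each ear
    separately; left/right phase and level differences are untouched,
    so interaural time and level cues (ITD/ILD) are preserved.
    mask: array matching the STFT shapes of left and right."""
    _, _, L = stft(left, fs=fs, nperseg=nperseg)
    _, _, R = stft(right, fs=fs, nperseg=nperseg)
    _, out_left = istft(mask * L, fs=fs, nperseg=nperseg)
    _, out_right = istft(mask * R, fs=fs, nperseg=nperseg)
    return out_left, out_right
```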
The second part of the dissertation integrates the speech separation models introduced in the first part with auditory attention decoding (SS-AAD) to develop brain-controlled augmented hearing systems. It is demonstrated that auditory attention decoding with automatically separated speakers is as accurate and fast as with clean speech sounds. Furthermore, to better align the experimental environment of SS-AAD systems with real-life scenarios, the second part introduces a new AAD task that closely simulates real-world complex acoustic settings. The results show that the SS-AAD system improves speech intelligibility and facilitates tracking of the attended speaker in realistic acoustic environments. Finally, this part presents the use of self-supervised speech representations in SS-AAD systems to enhance the neural decoding of attentional selection.
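The decision step of an SS-AAD system can be sketched as an envelope comparison: reconstruct a speech envelope from the listener's EEG (assumed given here, e.g. from a pretrained stimulus-reconstruction decoder) and pick the separated speaker it correlates with best. A minimal illustration, not the dissertation's decoder:

```python
import numpy as np

def decode_attention(eeg_envelope, separated, smooth=256):
    """Return the index of the attended speaker: the separated stream whose
    (crude, magnitude-based) envelope best correlates with the envelope
    reconstructed from the listener's EEG."""
    def envelope(x):
        return np.convolve(np.abs(x), np.ones(smooth) / smooth, mode="same")

    scores = []
    for s in separated:
        e = envelope(s)[: len(eeg_envelope)]
        scores.append(np.corrcoef(eeg_envelope[: len(e)], e)[0, 1])
    return int(np.argmax(scores))
```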
Improving speech intelligibility in hearing aids. Part I: Signal processing algorithms
The improvement of speech intelligibility in hearing aids is a traditional problem that remains open and unsolved. Modern devices may include signal processing algorithms to improve intelligibility: automatic gain control, automatic environmental classification, or speech enhancement. However, the design of such algorithms is strongly restricted by engineering constraints caused by the reduced dimensions of hearing aid devices. In this paper, we discuss the application of state-of-the-art signal processing algorithms to improve speech intelligibility in digital hearing aids, with particular emphasis on speech enhancement algorithms. Different alternatives for both monaural and binaural speech enhancement are considered, arguing whether or not they are suitable for implementation in a commercial hearing aid.
This work has been funded by the Spanish Ministry of Science and Innovation under project TEC2012-38142-C04-02.
Ayllón, D.; Gil Pita, R.; Rosa Zurera, M.; Padilla, L.; Piñero Sipán, M.G.; Diego Antón, M.D.; Ferrer Contreras, M.; ... (2014). Improving speech intelligibility in hearing aids. Part I: Signal processing algorithms. Waves, 6, 61-71. http://hdl.handle.net/10251/57901
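As a generic example of the kind of low-complexity monaural enhancement such a survey covers (not any specific algorithm from the paper), here is a minimal single-channel Wiener-gain sketch; the noise-only leading frames and the frame length are assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def wiener_enhance(noisy, fs, noise_frames=10, nperseg=256):
    """Minimal single-channel Wiener-gain enhancement. Short frames keep
    latency low, a key constraint in hearing aids; the first `noise_frames`
    STFT frames are assumed noise-only for the noise PSD estimate."""
    _, _, X = stft(noisy, fs=fs, nperseg=nperseg)
    noise_psd = np.mean(np.abs(X[:, :noise_frames]) ** 2, axis=1, keepdims=True)
    snr = np.maximum(np.abs(X) ** 2 / noise_psd - 1.0, 0.0)  # rough a-priori SNR
    gain = snr / (snr + 1.0)                                  # Wiener gain
    _, enhanced = istft(gain * X, fs=fs, nperseg=nperseg)
    return enhanced
```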
Space time transceiver design over multipath fading channels