Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays
In the analysis of acoustic scenes, the occurring sounds often have to be
detected in time, recognized, and localized in space. Usually, each of these
tasks is done separately. In this paper, a model-based approach to jointly
carry them out for the case of multiple simultaneous sources is presented and
tested. The recognized event classes and their respective room positions are
obtained with a single system that maximizes the combination of a large set of
scores, each one resulting from a different acoustic event model and a
different beamformer output signal, which comes from one of several
arbitrarily-located small microphone arrays. Using a two-step method,
experimental work is reported for a specific scenario consisting of
meeting-room acoustic events, either isolated or overlapped with speech.
Tests carried out with two datasets show the advantage of the proposed
approach over several usual techniques, and that including estimated priors
brings a further performance improvement.
Comment: computational acoustic scene analysis, microphone array signal processing, acoustic event detection
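The core idea, combining scores from every (acoustic event model, beamformer output) pair and maximizing over candidate classes and positions, can be sketched in a few lines of Python. This is only an illustration: `beamform`, `score`, and the position grid are hypothetical stand-ins, not the paper's actual two-step implementation.

```python
import numpy as np

def joint_recognize_localize(beamform, score, event_models, positions,
                             num_arrays, log_priors=None):
    """Return the (event class, position) pair maximizing the combined score.

    beamform(a, p): output of array a's beamformer steered to position p
    score(m, x):    acoustic-model log-score of signal x under model m
    All names here are illustrative, not the paper's actual interface.
    """
    best_cls, best_pos, best_val = None, None, -np.inf
    for p in positions:                          # candidate room positions
        # one enhanced signal per arbitrarily-located small array
        signals = [beamform(a, p) for a in range(num_arrays)]
        for cls, model in event_models.items():
            total = sum(score(model, x) for x in signals)
            if log_priors is not None:           # optional estimated priors
                total += log_priors[cls]
            if total > best_val:
                best_cls, best_pos, best_val = cls, p, total
    return best_cls, best_pos
```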
ChordMics: Acoustic Signal Purification with Distributed Microphones
Acoustic signals are an essential input to many systems. However, a pure
acoustic signal is very difficult to extract, especially in noisy environments.
Existing beamforming systems are able to extract the signal transmitted from
certain directions. However, since microphones are centrally deployed, these
systems have limited coverage and low spatial resolution. We overcome the above
limitations and present ChordMics, a distributed beamforming system. By
leveraging the spatial diversity of the distributed microphones, ChordMics is
able to extract the acoustic signal from arbitrary points. To realize such a
system, we further address the fundamental challenge in distributed
beamforming: aligning the signals captured by distributed and unsynchronized
microphones. We implement ChordMics and evaluate its performance under both LOS
and NLOS scenarios. The evaluation results show that ChordMics delivers higher
SINR than a centralized microphone array, with an average performance gain of
up to 15 dB.
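The key step, aligning streams from unsynchronized, distributed microphones, can be illustrated with a classical technique: estimate each stream's offset against a reference by GCC-PHAT, then shift and sum. This is a minimal sketch assuming equal-length streams and constant integer-sample offsets; it is not ChordMics' actual alignment algorithm.

```python
import numpy as np

def gcc_phat_delay(x, ref):
    """Estimate the integer-sample lag of x relative to ref via GCC-PHAT."""
    n = len(x) + len(ref)
    cross = np.fft.rfft(x, n) * np.conj(np.fft.rfft(ref, n))
    cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n)
    lag = int(np.argmax(np.abs(cc)))
    return lag if lag < n // 2 else lag - n   # map wrapped lags to negative

def align_and_sum(signals):
    """Delay-and-sum over distributed equal-length streams after alignment."""
    ref = signals[0]
    out = np.zeros(len(ref))
    for s in signals:
        out += np.roll(s, -gcc_phat_delay(s, ref))
    return out / len(signals)
```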
Microphone array signal processing for robot audition
Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background and interfering noise sources (speakers) as well as room reverberation. In addition, the movements of a robot and its actuators cause ego-noise, which degrades the recorded signals significantly. The movement of the robot body and its head also complicates the detection and tracking of the desired, possibly moving, sound sources. This paper presents an overview of the concepts in microphone array processing for robot audition and some recent achievements.
Adaptive Signal Processing Techniques and Realistic Propagation Modeling for Multiantenna Vital Sign Estimation
This thesis addresses the problem of vital sign estimation using radio-frequency measurements and adaptive signal enhancement techniques with a multiantenna continuous-wave radar. Different adaptive processing techniques are proposed in a novel approach to combine the signals from multiple receivers that carry the cardiopulmonary micro-Doppler phase modulation caused by breathing and heartbeat.
The results are based on extensive simulations using a realistic model of radio signal propagation and reflection losses derived in the thesis. It is shown that the adaptive techniques provide a significant increase in vital sign rate estimation accuracy and enable monitoring at lower SNR conditions.
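As a rough illustration of multi-receiver combining (not the adaptive algorithms developed in the thesis), the sketch below extracts the micro-Doppler phase from each receiver's I/Q stream, combines the channels with simple variance-based weights, and reads the breathing rate off the spectral peak; every parameter here is an assumption.

```python
import numpy as np

def breathing_rate(iq, fs, band=(0.1, 0.5)):
    """Estimate breathing rate (Hz) from multi-receiver CW-radar I/Q data.

    iq: complex array of shape (num_receivers, num_samples).
    Channels are combined with variance-proportional weights, a crude
    stand-in for the adaptive combining studied in the thesis.
    """
    phase = np.unwrap(np.angle(iq), axis=1)         # micro-Doppler phase
    phase -= phase.mean(axis=1, keepdims=True)
    w = np.var(phase, axis=1)                       # rough per-channel quality
    combined = w @ phase / w.sum()
    spectrum = np.abs(np.fft.rfft(combined * np.hanning(combined.size)))
    freqs = np.fft.rfftfreq(combined.size, 1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])  # 6-30 breaths per minute
    return freqs[mask][np.argmax(spectrum[mask])]
```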
Speech enhancement for robust automatic speech recognition: Evaluation using a baseline system and instrumental measures
Automatic speech recognition in everyday environments must be robust to significant levels of reverberation and noise. One strategy for achieving such robustness is multi-microphone speech enhancement. In this study, we present the results of an evaluation of different speech enhancement pipelines using a state-of-the-art ASR system over a wide range of reverberation and noise conditions. The evaluation exploits the recently released ACE Challenge database, which includes measured multichannel acoustic impulse responses from 7 different rooms with reverberation times ranging from 0.33 s to 1.34 s. The reverberant speech is mixed with ambient, fan, and babble noise recordings made with the same microphone setups in each of the rooms. In the first experiment, the performance of the ASR system without speech enhancement is evaluated; the results clearly indicate the deleterious effect of both noise and reverberation. In the second experiment, different speech enhancement pipelines are evaluated, with relative word error rate reductions of up to 82%. Finally, the ability of selected instrumental metrics to predict ASR performance improvement is assessed. The best-performing metric, the Short-Time Objective Intelligibility (STOI) measure, is shown to have a Pearson correlation coefficient of 0.79, suggesting that it is a useful predictor of algorithm performance in these tests.
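The two headline quantities, relative WER reduction and the metric-to-WER correlation, are straightforward to compute; the sketch below shows the form of the calculation with placeholder numbers, not the paper's data.

```python
import numpy as np
from scipy.stats import pearsonr

def relative_wer_reduction(wer_baseline, wer_enhanced):
    """Relative word error rate reduction in percent."""
    return 100.0 * (wer_baseline - wer_enhanced) / wer_baseline

# Placeholder values: STOI improvement and (baseline, enhanced) WER in %
# per condition. The paper correlates such pairs across its test conditions.
stoi_gain = np.array([0.02, 0.05, 0.08, 0.11, 0.15])
wer_pairs = [(40, 38), (45, 40), (55, 44), (60, 45), (70, 48)]
wer_gain = np.array([relative_wer_reduction(b, e) for b, e in wer_pairs])

r, p = pearsonr(stoi_gain, wer_gain)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```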
Automatic Speech Separation for Brain-Controlled Hearing Technologies
Speech perception in crowded acoustic environments is particularly challenging for hearing-impaired listeners. While assistive hearing devices can suppress background noises distinct from speech, they struggle to attenuate interfering speakers without knowing which speaker the listener is focusing on. The human brain has a remarkable ability to pick out individual voices in a noisy environment like a crowded restaurant or a busy city street. This ability inspires brain-controlled hearing technologies: a brain-controlled hearing aid acts as an intelligent filter, reading the wearer's brainwaves and enhancing the voice they want to focus on.
Two essential elements form the core of brain-controlled hearing aids: automatic speech separation (SS), which isolates individual speakers from the mixed audio of an acoustic scene, and auditory attention decoding (AAD), in which the listener's brainwaves are compared with the separated speakers to determine the attended one, which can then be amplified to facilitate hearing. This dissertation focuses on speech separation and its integration with AAD, aiming to propel the evolution of brain-controlled hearing technologies. The goal is to help users engage in conversations with the people around them seamlessly and efficiently.
This dissertation is structured into two parts. The first part focuses on automatic speech separation models, beginning with the introduction of a real-time monaural speech separation model, followed by more advanced real-time binaural speech separation models. The binaural models use both spectral and spatial features to separate speakers and are more robust to noise and reverberation. Beyond performing speech separation, the binaural models preserve the interaural cues of the separated sound sources, a significant step towards immersive augmented hearing. Additionally, the first part explores using speaker identification to improve the performance and robustness of models in long-form speech separation, and delves into unsupervised learning methods for multi-channel speech separation, aiming to improve the models' ability to generalize to real-world audio.
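One way to see how interaural cues can survive separation: if a single real-valued time-frequency mask per speaker is applied to the left and right channels independently, each ear keeps its own phase and level, and hence the source's ITD and ILD. A minimal sketch assuming the mask is already estimated (the dissertation's models are far more elaborate):

```python
import numpy as np
from scipy.signal import stft, istft

def binaural_apply_mask(left, right, mask, fs, nperseg=512):
    """Apply one speaker's real-valued time-frequency mask to each ear
    separately; left/right phase and level differences are untouched,
    so interaural time and level cues (ITD/ILD) are preserved.
    mask: array matching the STFT shapes of left and right."""
    _, _, L = stft(left, fs=fs, nperseg=nperseg)
    _, _, R = stft(right, fs=fs, nperseg=nperseg)
    _, out_left = istft(mask * L, fs=fs, nperseg=nperseg)
    _, out_right = istft(mask * R, fs=fs, nperseg=nperseg)
    return out_left, out_right
```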
The second part of the dissertation integrates the speech separation models introduced in the first part with auditory attention decoding (SS-AAD) to develop brain-controlled augmented hearing systems. It is demonstrated that auditory attention decoding with automatically separated speakers is as accurate and fast as with clean speech sounds. Furthermore, to better align the experimental environment of SS-AAD systems with real-life scenarios, the second part introduces a new AAD task that closely simulates real-world complex acoustic settings. The results show that the SS-AAD system improves speech intelligibility and facilitates tracking of the attended speaker in realistic acoustic environments. Finally, this part presents the use of self-supervised speech representations in SS-AAD systems to enhance the neural decoding of attentional selection.
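The decision step of an SS-AAD system can be sketched as an envelope comparison: reconstruct a speech envelope from the listener's EEG (assumed given here, e.g. from a pretrained stimulus-reconstruction decoder) and pick the separated speaker it correlates with best. A minimal illustration, not the dissertation's decoder:

```python
import numpy as np

def decode_attention(eeg_envelope, separated, smooth=256):
    """Return the index of the attended speaker: the separated stream whose
    (crude, magnitude-based) envelope best correlates with the envelope
    reconstructed from the listener's EEG."""
    def envelope(x):
        return np.convolve(np.abs(x), np.ones(smooth) / smooth, mode="same")

    scores = []
    for s in separated:
        e = envelope(s)[: len(eeg_envelope)]
        scores.append(np.corrcoef(eeg_envelope[: len(e)], e)[0, 1])
    return int(np.argmax(scores))
```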
Improving speech intelligibility in hearing aids. Part I: Signal processing algorithms
The improvement of speech intelligibility in hearing aids is a traditional problem that remains open and unsolved. Modern devices may include signal processing algorithms to improve intelligibility: automatic gain control, automatic environmental classification, or speech enhancement. However, the design of such algorithms is strongly restricted by engineering constraints caused by the reduced dimensions of hearing aid devices. In this paper, we discuss the application of state-of-the-art signal processing algorithms to improve speech intelligibility in digital hearing aids, with particular emphasis on speech enhancement algorithms. Different alternatives for both monaural and binaural speech enhancement are considered, arguing whether or not they are suitable for implementation in a commercial hearing aid.
This work has been funded by the Spanish Ministry of Science and Innovation under project TEC2012-38142-C04-02.
Ayllón, D.; Gil Pita, R.; Rosa Zurera, M.; Padilla, L.; Piñero Sipán, M.G.; Diego Antón, M.D.; Ferrer Contreras, M.; ... (2014). Improving speech intelligibility in hearing aids. Part I: Signal processing algorithms. Waves, 6, 61-71. http://hdl.handle.net/10251/57901
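As a generic example of the kind of low-complexity monaural enhancement such a survey covers (not any specific algorithm from the paper), here is a minimal single-channel Wiener-gain sketch; the noise-only leading frames and the frame length are assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def wiener_enhance(noisy, fs, noise_frames=10, nperseg=256):
    """Minimal single-channel Wiener-gain enhancement. Short frames keep
    latency low, a key constraint in hearing aids; the first `noise_frames`
    STFT frames are assumed noise-only for the noise PSD estimate."""
    _, _, X = stft(noisy, fs=fs, nperseg=nperseg)
    noise_psd = np.mean(np.abs(X[:, :noise_frames]) ** 2, axis=1, keepdims=True)
    snr = np.maximum(np.abs(X) ** 2 / noise_psd - 1.0, 0.0)  # rough a-priori SNR
    gain = snr / (snr + 1.0)                                  # Wiener gain
    _, enhanced = istft(gain * X, fs=fs, nperseg=nperseg)
    return enhanced
```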
Space time transceiver design over multipath fading channels