Partially Adaptive Multichannel Joint Reduction of Ego-noise and Environmental Noise
Human-robot interaction relies on a noise-robust audio processing module
capable of estimating target speech from audio recordings impacted by
environmental noise, as well as self-induced noise, so-called ego-noise. While
external ambient noise sources vary from environment to environment, ego-noise
is mainly caused by the internal motors and joints of a robot. Ego-noise and
environmental noise reduction are often decoupled, i.e., ego-noise reduction is
performed without considering environmental noise. Recently, a variational
autoencoder (VAE)-based speech model has been combined with a fully adaptive
non-negative matrix factorization (NMF) noise model to recover clean speech
under different environmental noise disturbances. However, its enhancement
performance is limited in adverse acoustic scenarios involving, e.g., ego-noise.
In this paper, we propose a multichannel partially adaptive scheme to jointly
model ego-noise and environmental noise utilizing the VAE-NMF framework, where
we take advantage of spatially and spectrally structured characteristics of
ego-noise by pre-training the ego-noise model, while retaining the ability to
adapt to unknown environmental noise. Experimental results show that our
proposed approach outperforms the methods based on a completely fixed scheme
and a fully adaptive scheme when ego-noise and environmental noise are present
simultaneously.
Comment: Accepted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023).
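The partially adaptive idea in the abstract above can be sketched in code. The following is a generic illustration, not the paper's exact VAE-NMF algorithm: it uses plain Euclidean-cost NMF with multiplicative updates, and the function name `partially_adaptive_nmf` is invented for this sketch. The key point it demonstrates is that the pre-trained ego-noise bases stay fixed while the environmental noise bases and all activations adapt to the observed spectrogram.

```python
import numpy as np

rng = np.random.default_rng(0)

def partially_adaptive_nmf(V, W_ego, n_env=8, n_iter=100, eps=1e-12):
    """Factor a noise power spectrogram V (freq x time) as
    V ~ [W_ego, W_env] @ H.  The pre-trained ego-noise bases W_ego stay
    FIXED; only the environmental bases W_env and the activations H are
    adapted (multiplicative updates for the Euclidean cost)."""
    F, T = V.shape
    K_ego = W_ego.shape[1]
    W_env = rng.random((F, n_env)) + eps
    H = rng.random((K_ego + n_env, T)) + eps
    for _ in range(n_iter):
        W = np.hstack([W_ego, W_env])
        # adapt all activations (for both ego-noise and environmental parts)
        H *= (W.T @ V) / (W.T @ (W @ H) + eps)
        # adapt ONLY the environmental bases; the ego-noise bases are kept
        V_hat = np.hstack([W_ego, W_env]) @ H
        H_env = H[K_ego:]
        W_env *= (V @ H_env.T) / (V_hat @ H_env.T + eps)
    return W_env, H
```

Because only the ego-noise part is pre-trained, the factorization retains the ability to explain previously unseen environmental noise through `W_env`.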
Ego-Noise Reduction for Robots
In robotics, it is desirable to equip robots with a sense of hearing so that they can better interact with users and their environment. However, the noise caused by robot actuators, called ego-noise, considerably degrades the quality of audio recordings. Consequently, the performance of speech recognition and sound event detection techniques is limited by the amount of noise the robot produces while moving. The noise generated by robots differs considerably depending on the environment, the motors, the materials used, and even the condition of the various mechanical components. The goal of this project is to design a robust ego-noise reduction model that uses multiple microphones and can be calibrated quickly on a mobile robot.
This thesis presents an ego-noise reduction method that combines the learning of noise covariance matrix templates with a minimum variance distortionless response (MVDR) beamforming algorithm. The approach used to learn the covariance matrices captures the spatial characteristics of the ego-noise in under two minutes for each new environment. The beamforming algorithm, in turn, reduces the ego-noise in the noisy signal without adding nonlinear distortion to the resulting signal. The method is implemented under Robot Operating System for quick and easy use on different robots.
The new method was evaluated on a real robot in three different environments: a small room, a large room, and an office corridor. The signal-to-noise ratio improves by about 10 dB, consistently across the three rooms. The word error rate of speech recognition is reduced by 30% to 55%. The model was also tested for sound event detection: an increase of 7% to 20% in average precision was measured for music detection, but no significant improvement for speech, shouting, closing doors, or alarms. The proposed method makes speech recognition more accessible on noisy robots.
In addition, an analysis of the main parameters validated their impact on system performance. Performance is better when the system is calibrated with more robot noise and when longer segments are used. The Short-Time Fourier Transform (STFT) size can be reduced to shorten the system's processing time; however, this size also determines the spectral resolution of the resulting signal. A trade-off must be made between low processing time and the quality of the system's output signal.
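The two ingredients of the method described above, covariance template learning and MVDR beamforming, can be sketched per frequency bin as follows. This is a minimal illustration of the standard MVDR formulation, not code from the thesis; the function names and the diagonal-loading detail are assumptions made for this sketch.

```python
import numpy as np

def ego_noise_covariance(N_stft):
    """Template learning sketch: average the outer products of multichannel
    ego-noise STFT frames (channels x frames) into one spatial covariance
    matrix for a given frequency bin."""
    return (N_stft @ N_stft.conj().T) / N_stft.shape[1]

def mvdr_weights(R_noise, d, diag_load=1e-6):
    """MVDR weights for one frequency bin: w = R^{-1} d / (d^H R^{-1} d),
    i.e. minimize output noise power subject to a distortionless response
    toward the steering vector d.  Diagonal loading keeps R invertible."""
    M = R_noise.shape[0]
    R = R_noise + diag_load * (np.trace(R_noise).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (np.conj(d) @ Rinv_d)
```

By construction the weights satisfy w^H d = 1, which is why the beamformer suppresses the ego-noise captured in the covariance template without adding nonlinear distortion to the target signal.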
Audio-Motor Integration for Robot Audition
In the context of robotics, audio signal processing in the wild amounts to dealing with sounds recorded by a system that moves and whose actuators produce noise. This creates additional challenges in sound source localization, signal enhancement and recognition. But the specificity of such platforms also brings interesting opportunities: can information about the robot actuators' states be meaningfully integrated in the audio processing pipeline to improve performance and efficiency? While robot audition grew to become an established field, methods that explicitly use motor-state information as a complementary modality to audio are scarcer. This chapter proposes a unified view of this endeavour, referred to as audio-motor integration. A literature review and two learning-based methods for audio-motor integration in robot audition are presented, with application to single-microphone sound source localization and ego-noise reduction on real data.
ODAS: Open embeddeD Audition System
Artificial audition aims at providing hearing capabilities to machines,
computers and robots. Existing frameworks in robot audition offer interesting
sound source localization, tracking and separation performance, but involve a
significant amount of computations that limit their use on robots with embedded
computing capabilities. This paper presents ODAS, the Open embeddeD Audition
System framework, which includes strategies to reduce the computational load
and perform robot audition tasks on low-cost embedded computing systems. It
presents key features of ODAS, along with cases illustrating its use in
different robots and artificial audition applications.
3D reconstruction and motion estimation using forward looking sonar
Autonomous Underwater Vehicles (AUVs) are increasingly used in different domains,
including archaeology, the oil and gas industry, coral reef monitoring, harbour
security, and mine countermeasure missions. As electromagnetic signals do not
penetrate the underwater environment, GPS signals cannot be used for AUV
navigation, and optical cameras have a very short range underwater, which limits
their use in most underwater environments.
Motion estimation for AUVs is a critical requirement for successful vehicle recovery
and meaningful data collection. Classical inertial sensors, usually used for AUV motion
estimation, suffer from large drift error, while accurate inertial sensors are very
expensive, which limits their deployment to costly AUVs. Furthermore, acoustic
positioning systems (APS) used for AUV navigation require costly installation and
calibration, and offer poor resolution.
Underwater 3D imaging is another challenge in the AUV industry, as 3D information is
increasingly demanded to accomplish different AUV missions. Different systems have
been proposed for underwater 3D imaging, such as planar-array sonar and T-configured
3D sonar. While the former generally features good resolution, it is very expensive and
requires huge computational power; the latter is cheaper to implement but requires a
long time for a full 3D scan, even at short ranges.
In this thesis, we aim to tackle AUV motion estimation and underwater 3D imaging by
proposing relatively affordable methodologies and studying the different parameters
affecting their performance. We introduce a new motion estimation framework for AUVs
that relies on successive acoustic images to infer AUV ego-motion. We also propose an
Acoustic Stereo Imaging (ASI) system for underwater 3D reconstruction based on
forward-looking sonars; the proposed system is cheaper to implement than planar-array
sonars and solves the delay problem of T-configured 3D sonars.
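Inferring ego-motion from successive acoustic images is commonly built on image registration. The sketch below shows generic phase correlation for recovering an integer translation between two consecutive frames; it is an illustrative building block under that assumption, not the actual framework proposed in the thesis.

```python
import numpy as np

def phase_correlation_shift(img_a, img_b):
    """Estimate the integer (row, col) translation taking img_a to img_b by
    phase correlation: normalize the cross-power spectrum so only the phase
    difference (i.e. the shift) remains, then locate the correlation peak."""
    Fa = np.fft.fft2(img_a)
    Fb = np.fft.fft2(img_b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12
    corr = np.fft.ifft2(cross).real
    idx = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap peak indices into signed shifts (circular FFT convention)
    return tuple(s if s <= n // 2 else s - n for s, n in zip(idx, corr.shape))
```

Chaining such frame-to-frame displacements (with range/bearing scaling for the sonar geometry) yields a drift-prone but inexpensive ego-motion estimate, which is the kind of image-based cue the framework above builds on.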