429 research outputs found

    Partially Adaptive Multichannel Joint Reduction of Ego-noise and Environmental Noise

    Human-robot interaction relies on a noise-robust audio processing module capable of estimating target speech from audio recordings impacted by environmental noise as well as self-induced noise, so-called ego-noise. While external ambient noise sources vary from environment to environment, ego-noise is mainly caused by the internal motors and joints of a robot. Ego-noise and environmental noise reduction are often decoupled, i.e., ego-noise reduction is performed without considering environmental noise. Recently, a variational autoencoder (VAE)-based speech model has been combined with a fully adaptive non-negative matrix factorization (NMF) noise model to recover clean speech under different environmental noise disturbances. However, its enhancement performance is limited in adverse acoustic scenarios involving, e.g., ego-noise. In this paper, we propose a multichannel partially adaptive scheme to jointly model ego-noise and environmental noise within the VAE-NMF framework. The scheme exploits the spatially and spectrally structured characteristics of ego-noise by pre-training the ego-noise model, while retaining the ability to adapt to unknown environmental noise. Experimental results show that the proposed approach outperforms methods based on a completely fixed scheme and on a fully adaptive scheme when ego-noise and environmental noise are present simultaneously. Comment: Accepted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023).
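The "partially adaptive" idea in the abstract above (a pre-trained ego-noise model kept fixed, plus a part that adapts to unknown environmental noise) can be illustrated with a much simpler stand-in. The sketch below is not the paper's multichannel VAE-NMF method: it assumes a single-channel magnitude spectrogram `V`, a Euclidean cost with standard multiplicative updates, and a hypothetical function name `partially_adaptive_nmf`. The only point it shows is how a subset of dictionary columns can stay frozen while the rest are updated.

```python
import numpy as np

def partially_adaptive_nmf(V, W_fixed, n_adaptive, n_iter=100, eps=1e-12, seed=0):
    """Factor V ~= [W_fixed, W_adapt] @ H, updating only W_adapt and H.

    V        : non-negative array, shape (n_freq, n_frames)
    W_fixed  : pre-trained non-negative basis, shape (n_freq, k_fixed)
    Assumption: Euclidean cost with Lee-Seung multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = W_fixed.shape[1]
    # Stack the frozen basis with a randomly initialised adaptive basis.
    W = np.hstack([W_fixed, rng.random((n_freq, n_adaptive)) + eps])
    H = rng.random((k_fixed + n_adaptive, n_frames)) + eps
    for _ in range(n_iter):
        # Activations of both the fixed and the adaptive columns are updated.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Only the adaptive columns of W are updated; the rest stay frozen.
        num = V @ H.T
        den = W @ H @ H.T + eps
        W[:, k_fixed:] *= num[:, k_fixed:] / den[:, k_fixed:]
    return W, H
```

Setting `n_adaptive=0` would correspond to a completely fixed dictionary, and passing an empty `W_fixed` (zero columns) to a fully adaptive one, which are the two baselines the abstract compares against.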

    Robot Ego-Noise Reduction (Réduction de l'égo-bruit de robots)

    In robotics, it is desirable to equip robots with a sense of hearing so that they can interact better with users and the environment. However, the noise caused by a robot's actuators, called ego-noise, considerably reduces the quality of audio segments. Consequently, the performance of speech recognition and sound event detection techniques is limited by the amount of noise the robot produces while moving. The noise generated by robots differs considerably depending on the environment, the motors, the materials used, and even the integrity of the various mechanical components. The objective of the project is to design a robust ego-noise reduction model that uses several microphones and can be calibrated quickly on a mobile robot. This thesis presents an ego-noise reduction method that combines the learning of noise covariance matrix templates with a minimum variance distortionless response (MVDR) beamforming algorithm. The approach used to learn the covariance matrices captures the spatial characteristics of the ego-noise in less than two minutes for each new environment. The beamforming algorithm, in turn, reduces the ego-noise in the noisy signal without adding nonlinear distortion to the resulting signal. The method is implemented under Robot Operating System for simple and fast use on different robots. The new method was evaluated on a real robot in three different environments: a small room, a large room, and an office corridor. The increase in signal-to-noise ratio is about 10 dB and is constant across the three rooms. The reduction in the word error rate of speech recognition is between 30% and 55%. The model was also tested for sound event detection. An increase of 7% to 20% in average precision was measured for music detection, but no significant increase was measured for speech, screams, closing doors, and alarms. The proposed method makes speech recognition more accessible on noisy robots. In addition, an analysis of the main parameters validated their impact on system performance. Performance is better when the system is calibrated with more robot noise and when the segments used are longer. The size of the Short-Time Fourier Transform can be reduced to lower the system's processing time. However, the size of this transform also affects the resolution of the features of the resulting signal. A trade-off must be made between a low processing time and the quality of the system's output signal.
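The combination described in the abstract above, a noise covariance matrix learned from robot noise followed by an MVDR beamformer, can be sketched for a single frequency bin as follows. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, the steering vector toward the target is assumed to be known, and the diagonal loading term is an assumption added here to keep the covariance matrix invertible.

```python
import numpy as np

def estimate_noise_cov(noise_frames, loading=1e-6):
    """Estimate a spatial noise covariance matrix for one frequency bin.

    noise_frames : complex STFT values of noise-only audio,
                   shape (n_frames, n_mics).
    """
    n_frames, n_mics = noise_frames.shape
    # R[i, j] = mean over frames of x_i * conj(x_j)
    cov = noise_frames.T @ noise_frames.conj() / n_frames
    # Diagonal loading (an assumption of this sketch) for numerical stability.
    return cov + loading * np.trace(cov).real / n_mics * np.eye(n_mics)

def mvdr_weights(noise_cov, steering):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin."""
    # Solve R x = d instead of forming the explicit inverse.
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / np.vdot(steering, r_inv_d)

def apply_beamformer(weights, mic_frames):
    """Beamformer output y_t = w^H x_t for frames of shape (n_frames, n_mics)."""
    return mic_frames @ weights.conj()
```

By construction the weights satisfy `w^H d = 1`, which is the "distortionless" constraint: a signal arriving from the steered direction passes unchanged while the output noise power `w^H R w` is minimised. In a full system these steps would be repeated for every STFT frequency bin.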

    Acoustic Echo Estimation using the model-based approach with Application to Spatial Map Construction in Robotics


    A Blind Source Separation Framework for Ego-Noise Reduction on Multi-Rotor Drones


    Audio-Motor Integration for Robot Audition

    In the context of robotics, audio signal processing in the wild amounts to dealing with sounds recorded by a system that moves and whose actuators produce noise. This creates additional challenges in sound source localization, signal enhancement and recognition. But the specificity of such platforms also brings interesting opportunities: can information about the robot actuators' states be meaningfully integrated in the audio processing pipeline to improve performance and efficiency? While robot audition has grown into an established field, methods that explicitly use motor-state information as a complementary modality to audio are scarcer. This chapter proposes a unified view of this endeavour, referred to as audio-motor integration. A literature review and two learning-based methods for audio-motor integration in robot audition are presented, with application to single-microphone sound source localization and ego-noise reduction on real data.

    ODAS: Open embeddeD Audition System

    Artificial audition aims at providing hearing capabilities to machines, computers and robots. Existing frameworks in robot audition offer interesting sound source localization, tracking and separation performance, but involve a significant amount of computation, which limits their use on robots with embedded computing capabilities. This paper presents ODAS, the Open embeddeD Audition System framework, which includes strategies to reduce the computational load and perform robot audition tasks on low-cost embedded computing systems. It presents key features of ODAS, along with cases illustrating its uses in different robots and artificial audition applications.

    3D reconstruction and motion estimation using forward looking sonar

    Autonomous Underwater Vehicles (AUVs) are increasingly used in different domains including archaeology, the oil and gas industry, coral reef monitoring, harbour security, and mine countermeasure missions. As electromagnetic signals do not penetrate the underwater environment, GPS signals cannot be used for AUV navigation, and optical cameras have a very short range underwater, which limits their use in most underwater environments. Motion estimation for AUVs is a critical requirement for successful vehicle recovery and meaningful data collection. Classical inertial sensors, usually used for AUV motion estimation, suffer from large drift error. On the other hand, accurate inertial sensors are very expensive, which limits their deployment to costly AUVs. Furthermore, acoustic positioning systems (APS) used for AUV navigation require costly installation and calibration. Moreover, they perform poorly in terms of the inferred resolution. Underwater 3D imaging is another challenge in the AUV industry, as 3D information is increasingly demanded to accomplish different AUV missions. Different systems have been proposed for underwater 3D imaging, such as planar-array sonar and T-configured 3D sonar. The former generally offers good resolution but is very expensive and requires huge computational power; the latter is cheaper to implement but requires a long time for a full 3D scan, even at short ranges. In this thesis, we aim to tackle AUV motion estimation and underwater 3D imaging by proposing relatively affordable methodologies and studying the different parameters affecting their performance. We introduce a new motion estimation framework for AUVs which relies on successive acoustic images to infer AUV ego-motion. We also propose an Acoustic Stereo Imaging (ASI) system for underwater 3D reconstruction based on forward-looking sonars; the proposed system is cheaper to implement than planar-array sonars and solves the delay problem of T-configured 3D sonars.