
    Microphone array signal processing for robot audition

    Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background noise and interfering sound sources (e.g., competing speakers), as well as by room reverberation. In addition, the movements of the robot and its actuators cause ego-noise, which significantly degrades the recorded signals. The movement of the robot body and its head also complicates the detection and tracking of the desired, possibly moving, sound sources of interest. This paper presents an overview of the concepts in microphone array processing for robot audition and some recent achievements.
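
    As a concrete illustration of the kind of microphone array processing surveyed here, the sketch below implements a minimal frequency-domain delay-and-sum beamformer. It is not taken from the paper; the plane-wave far-field model, the array geometry and the steering direction doa are assumptions made only for this example.

        import numpy as np

        def delay_and_sum(frames, mic_positions, doa, fs, c=343.0):
            """Minimal frequency-domain delay-and-sum beamformer (illustrative sketch).

            frames        : (n_mics, n_samples) time-domain signals, one row per microphone
            mic_positions : (n_mics, 3) microphone coordinates in metres
            doa           : (3,) unit vector pointing from the array towards the source
            fs            : sampling rate in Hz
            c             : speed of sound in m/s
            """
            n_mics, n_samples = frames.shape
            # Far-field model: the wavefront reaches microphones further along doa earlier,
            # so channel m must be delayed by (p_m . doa) / c to align it with the array origin.
            delays = mic_positions @ doa / c                 # seconds, one value per microphone
            freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)   # FFT bin frequencies in Hz
            spectra = np.fft.rfft(frames, axis=1)
            # A delay of d seconds corresponds to multiplying the spectrum by exp(-2j*pi*f*d).
            aligned = spectra * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
            # Averaging the aligned channels reinforces the look direction and
            # attenuates diffuse noise and sources arriving from other directions.
            return np.fft.irfft(aligned.mean(axis=0), n=n_samples)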

    ODAS: Open embeddeD Audition System

    Artificial audition aims at providing hearing capabilities to machines, computers and robots. Existing frameworks in robot audition offer interesting sound source localization, tracking and separation performance, but involve a significant amount of computation that limits their use on robots with embedded computing capabilities. This paper presents ODAS, the Open embeddeD Audition System framework, which includes strategies to reduce the computational load and perform robot audition tasks on low-cost embedded computing systems. It presents the key features of ODAS, along with cases illustrating its use in different robots and artificial audition applications.
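
    A core building block of localization in frameworks such as ODAS is estimating the time difference of arrival (TDOA) between microphone pairs, commonly with the GCC-PHAT cross-correlation. The snippet below is a generic Python implementation shown only to illustrate the operation; it does not use or reproduce the actual ODAS C API.

        import numpy as np

        def gcc_phat(sig, ref, fs, max_tau=None, interp=1):
            """Estimate the delay (in seconds) of sig relative to ref with GCC-PHAT."""
            n = sig.shape[0] + ref.shape[0]
            SIG = np.fft.rfft(sig, n=n)
            REF = np.fft.rfft(ref, n=n)
            R = SIG * np.conj(REF)
            # Phase transform: discard magnitude so only phase (timing) information remains,
            # which sharpens the correlation peak and improves robustness to reverberation.
            cc = np.fft.irfft(R / (np.abs(R) + 1e-15), n=interp * n)
            max_shift = interp * n // 2
            if max_tau is not None:
                max_shift = min(int(interp * fs * max_tau), max_shift)
            # Re-centre the correlation so that index max_shift corresponds to zero delay.
            cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
            shift = np.argmax(np.abs(cc)) - max_shift
            return shift / float(interp * fs)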

    Integration of a voice recognition system in a social robot

    Human-Robot Interaction (HRI) is one of the main fields in the study and research of robotics. Within this field, dialog systems and interaction by voice play a very important role. When speaking about natural human-robot dialog we assume that the robot can accurately recognize the utterance that the human wants to transmit verbally, and even its semantic meaning, but this is not always achieved. In this paper we describe the steps and requirements that we went through in order to endow the personal social robot Maggie, developed at the University Carlos III of Madrid, with the capability of understanding natural language spoken by any human. We have analyzed the different possibilities offered by current software/hardware alternatives by testing them in real environments. We have obtained accurate data on speech recognition capabilities in different environments, using the most modern audio acquisition systems and analyzing less typical parameters such as user age, sex, intonation, volume and language. Finally, we propose a new model to classify recognition results as accepted or rejected, based on a second ASR opinion. This new approach takes into account the pre-calculated success rate in noise intervals for each recognition framework, decreasing the false positive and false negative rates. The funds have been provided by the Spanish Government through the project 'Peer to Peer Robot-Human Interaction' (R2H) of MEC (Ministry of Science and Education) and the project 'A new approach to social robotics' (AROS) of MICINN (Ministry of Science and Innovation). The research leading to these results has received funding from the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.
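
    The abstract does not give the exact formulation of the second-opinion model, so the sketch below is only a hypothetical illustration of such a rule: two ASR hypotheses are compared, and disagreements are resolved using a pre-calculated success rate for the measured noise interval. The table values, threshold and function names are invented for the example and are not the ones used on Maggie.

        # Hypothetical accept/reject rule based on a second ASR opinion.
        # Success rates per background-noise interval (dB) are illustrative values,
        # standing in for the pre-calculated rates described in the paper.
        SUCCESS_RATE = {
            "primary":   {(0, 40): 0.92, (40, 60): 0.78, (60, 120): 0.55},
            "secondary": {(0, 40): 0.88, (40, 60): 0.74, (60, 120): 0.60},
        }

        def rate_for(engine, noise_db):
            """Look up the pre-calculated success rate of an engine at a noise level."""
            for (lo, hi), rate in SUCCESS_RATE[engine].items():
                if lo <= noise_db < hi:
                    return rate
            return 0.0

        def accept(primary_hyp, secondary_hyp, noise_db, threshold=0.8):
            """Accept when both engines agree; otherwise fall back on the primary
            engine only if it is known to be reliable at the measured noise level."""
            agree = primary_hyp.strip().lower() == secondary_hyp.strip().lower()
            if agree:
                return True
            return rate_for("primary", noise_db) >= threshold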

    Simultaneous asynchronous microphone array calibration and sound source localisation

    In this paper, an approach is proposed to solve sound source localisation and the calibration of an asynchronous microphone array simultaneously, using a graph-based Simultaneous Localisation and Mapping (SLAM) method. Traditional sound source localisation using a microphone array has two main requirements. Firstly, geometrical information about the microphone array is needed. Secondly, a multichannel analog-to-digital converter is required to obtain synchronous readings of the audio signal. Recent works aim at relaxing these two requirements by estimating the time offset between each pair of microphones. However, they assume that the clock timing in each microphone's sound card is exactly the same, which requires the clocks in the sound cards to be identically manufactured. A methodology is hereby proposed to calibrate an asynchronous microphone array using a graph-based optimisation method borrowed from the SLAM literature, effectively estimating the array geometry, the time offset and the clock difference/drift rate of each microphone, together with the sound source locations. Simulation and experimental results are presented, which demonstrate the effectiveness of the proposed methodology in achieving accurate estimates of the microphone array characteristics needed for use in realistic settings with asynchronous sound devices.
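
    To make the estimation problem concrete, the sketch below encodes a simplified version of the measurement model behind such joint calibration: each observed arrival time is a time of flight plus a per-microphone start-time offset and a clock-drift term, and all unknowns are refined together by nonlinear least squares. Unlike the paper's graph-based SLAM formulation, this toy version assumes the sound event positions are (approximately) known; all variable names and values are assumptions for illustration.

        import numpy as np
        from scipy.optimize import least_squares

        C = 343.0  # speed of sound in m/s

        def residuals(params, obs_times, src_positions, n_mics):
            """Arrival-time residuals for the simplified model
            t[j, i] = ||s_j - p_i|| / C + offset_i + drift_i * j."""
            mics    = params[:3 * n_mics].reshape(n_mics, 3)   # microphone positions
            offsets = params[3 * n_mics:4 * n_mics]            # per-mic start-time offsets
            drifts  = params[4 * n_mics:5 * n_mics]            # per-mic clock drift rates
            res = []
            for j, s in enumerate(src_positions):
                tof = np.linalg.norm(mics - s, axis=1) / C     # time of flight to each mic
                res.append(tof + offsets + drifts * j - obs_times[j])
            return np.concatenate(res)

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            n_mics, n_events = 6, 20
            true_mics = rng.uniform(-1.0, 1.0, (n_mics, 3))
            src_positions = rng.uniform(-3.0, 3.0, (n_events, 3))
            true_offsets = rng.uniform(0.0, 0.01, n_mics)
            true_drifts = rng.uniform(-1e-4, 1e-4, n_mics)
            # Synthetic observed arrival times generated from the same model.
            obs_times = np.array([np.linalg.norm(true_mics - s, axis=1) / C
                                  + true_offsets + true_drifts * j
                                  for j, s in enumerate(src_positions)])
            # Initial guess: perturbed geometry, zero offsets and zero drift rates.
            x0 = np.concatenate([true_mics.ravel() + 0.05 * rng.standard_normal(3 * n_mics),
                                 np.zeros(n_mics), np.zeros(n_mics)])
            sol = least_squares(residuals, x0, args=(obs_times, src_positions, n_mics))
            print(sol.x[:3 * n_mics].reshape(n_mics, 3))       # refined microphone positions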