    Vision-Guided Robot Hearing

    Natural human-robot interaction (HRI) in complex and unpredictable environments is important and has many potential applications. While vision-based HRI has been thoroughly investigated, robot hearing and audio-based HRI are emerging research topics in robotics. In typical real-world scenarios, humans are at some distance from the robot, and hence the sensory (microphone) data are strongly impaired by background noise, reverberations and competing auditory sources. In this context, the detection and localization of speakers plays a key role, enabling several tasks such as improving the signal-to-noise ratio for speech recognition, speaker recognition and speaker tracking. In this paper we address the problem of how to detect and localize people that are both seen and heard. We introduce a hybrid deterministic/probabilistic model. The deterministic component allows us to map 3D visual data onto a 1D auditory space. The probabilistic component of the model enables the visual features to guide the grouping of the auditory features in order to form audiovisual (AV) objects. The proposed model and the associated algorithms are implemented in real time (17 FPS) using a stereoscopic camera pair and two microphones embedded in the head of the humanoid robot NAO. We perform experiments with (i) synthetic data, (ii) publicly available data gathered with an audiovisual robotic head, and (iii) data acquired using the NAO robot. The results validate the approach and encourage further investigation of how vision and hearing could be combined for robust HRI.

    Embodied & Situated Language Processing


    Alignment of Binocular-Binaural Data Using a Moving Audio-Visual Target

    Best Paper Award. In this paper we address the problem of aligning visual (V) and auditory (A) data using a sensor that is composed of a camera pair and a microphone pair. The original contribution of the paper is a method for AV data alignment through estimation of the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera pair. We exploit the fact that these two distinct data sets are conditioned by a common set of parameters, namely the (unknown) 3D trajectory of an AV object, and derive an EM-like algorithm that alternates between the estimation of the microphone-pair position and the estimation of the AV object trajectory. The proposed algorithm has a number of built-in features: it can deal with A and V observations that are misaligned in time, it estimates the reliability of the data, it is robust to outliers in both modalities, and it has proven theoretical convergence. We report experiments with both simulated and real data.

    Perception in real and artificial insects: a robotic investigation of cricket phonotaxis

    The aim of this thesis is to investigate a methodology for studying perceptual systems by building artificial ones. It is proposed that useful results can be obtained from detailed robotic modelling of specific sensorimotor mechanisms in lower animals. By looking at the sensory control of behaviour in simple biological organisms, and in working robots, it is argued that proper appreciation of the physical interaction of the system with the environment and the task is essential for discovering how perceptual mechanisms function. Although links to biology, and concern with perceptual competence, are fields of growing interest in Artificial Intelligence, much of the current research fails to adequately address these issues, as the model systems being built do not represent real sensorimotor problems. By analyzing what is required for a model of a system to contribute to explaining that system, a particular approach to modelling perceptual systems is suggested. This involves choosing an appropriate target system to model, building a system that validly represents the target with respect to a particular hypothesis, and properly evaluating the behaviour of the model system to draw conclusions about the target. The viability and potential contribution of this approach are demonstrated in the design, implementation and evaluation of a mobile robot model of a hypothesised mechanism for phonotaxis in the cricket. The result is a robot that successfully locates a specific sound source under a variety of conditions, with a range of behaviour that resembles the cricket in many ways. This provides some support for the hypothesis that the neural mechanism for phonotaxis in crickets does not involve separate processing for recognition and location of the signal, as is generally supposed. It also shows the importance of understanding the physical interaction of the system's structure with its environment in devising and implementing perceptual systems. Both these results vindicate the proposed methodology.