
4 research outputs found

Computational Audiovisual Scene Analysis

Author: Yan Rujiao
Publication venue: Universitätsbibliothek Bielefeld
Publication date: 01/01/2014

Yan R. Computational Audiovisual Scene Analysis. Bielefeld: Universitätsbibliothek Bielefeld; 2014. In most real-world situations, a robot interacts with multiple people. In this case, understanding the dialogs is essential. However, dialog scene analysis is missing in most existing systems of human-robot interaction. In such systems, only one speaker can talk with the robot, or each speaker wears an attached microphone or a headset. The goal of Computational AudioVisual Scene Analysis (CAVSA) is therefore to make dialogs between humans and robots more natural and flexible. The CAVSA system is able to learn how many speakers are in the scenario, where the speakers are, and who is currently speaking. CAVSA is a challenging task due to the complexity of dialog scenarios. First, speakers are unknown in advance, so a database for training high-level features beforehand to recognize faces or voices is not available. Second, people can dynamically enter and leave the scene, may move all the time, and may even change their locations outside the camera field of view. Third, the robot cannot see all the people at the same time due to its limited camera field of view and head movements. Moreover, a sound could be related to a person who stands outside the camera field of view and has never been seen. I will show that the CAVSA system is able to assign words to corresponding speakers. A speaker is recognized again when he leaves and enters the scene, or changes his position even with a newly appearing person

Publications at Bielefeld University

Auditory Localization Using Direction-Dependent Spectral Information

Author: Daniel Gill
Israel Nelken
Lidror Troyansky
Publication date: 01/01/2000

This work presents a biologically motivated neuronal model for detecting the elevation of unfamiliar natural sound sources using monaural cues, based on head-related transfer functions. This model can determine the elevation of an unfamiliar sound source to within less than 4° with no error using a very small number of training samples. In addition, we suggest that the approximate logarithmic response of the cells in the cochlea is beneficial for localizing unfamiliar sound sources. © 2000 Elsevier Science B.V. All rights reserved

CiteSeerX

Auditory localization using direction-dependent spectral information

Author: Blauert
Daniel Gill
Israel Nelken
Lidror Troyansky
Middlebrooks
Musicant
Nandy
Nelken
Neti
Rice
Spirou
Publication venue: Elsevier BV

Crossref