
6 research outputs found

Multi-modal particle filtering tracking using appearance, motion and audio likelihoods

Author: Andrea Cavallaro
Matteo Bregonzio
Murtaza Taj
Publication date: 01/01/2007

We propose a multi-modal object tracking algorithm that combines appearance, motion and audio information in a particle filter. The proposed tracker fuses, at the likelihood level, the audio-visual observations captured with a video camera coupled with two microphones. Two video likelihoods are computed, one based on a 3D color histogram appearance model and one on color change detection, whereas an audio likelihood provides information about the direction of arrival of a target. The direction of arrival is computed with a multi-band generalized cross-correlation function enhanced by noise suppression and reverberation filtering that uses the precedence effect. We evaluate the tracker on single- and multi-modality tracking and quantify the performance improvement introduced by integrating audio and visual information in the tracking process.

CiteSeerX

Tracking Identities and Attention in Smart Environments - Contributions and Progress in the CHIL Project

Author: Bernardin K.
Ekenel H.
Stiefelhagen Rainer
Voit M.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2008

KITopen

Detection and localization of 3d audio-visual objects using unsupervised clustering

Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2008

Crossref

Detection and Localization of 3D Audio-Visual Objects Using Unsupervised Clustering

Author: Arnaud Elise
Forbes Florence
Hansard Miles
Horaud Radu
Khalidov Vasil
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/10/2008

This paper addresses the issues of detecting and localizing objects in a scene that are both seen and heard. We explain the benefits of a human-like configuration of sensors (binaural and binocular) for gathering auditory and visual observations. It is shown that the detection and localization problem can be recast as the task of clustering the audio-visual observations into coherent groups. We propose a probabilistic generative model that captures the relations between audio and visual observations. This model maps the data into a common audio-visual 3D representation via a pair of mixture models. Inference is performed by a version of the expectation-maximization algorithm, which is formally derived and which provides cooperative estimates of both the auditory activity and the 3D position of each object. We describe several experiments with single- and multiple-speaker detection and localization in the presence of other audio sources.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

The always best positioned paradigm for mobile indoor applications

Author: Schwartz Tim
Publication venue: Other institutions. DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Publication date: 01/01/2012

This dissertation investigates methods for personal positioning in outdoor and indoor environments. It defines the Always Best Positioned paradigm, whose goal is to provide self-positioning that is as consistent as possible. It also presents the localization toolkit LOCATO, which makes it easy to build positioning systems that follow the paradigm. New algorithms were developed that specifically address the robustness of positioning systems with respect to the Always Best Positioned paradigm. With this toolkit, three example positioning systems were implemented, each designed for different applications and requirements: a low-cost system that can be used in conjunction with user-adaptive public displays; a so-called opportunistic system that enables positioning with room-level accuracy in any building that provides a WiFi infrastructure; and a high-accuracy system for instrumented environments that works with active RFID tags and infrared beacons. Furthermore, a novel evaluation method for positioning systems is presented, which uses step-accurate natural walking traces as ground truth. Finally, six location-based services are presented, each realized either with the tools provided by LOCATO or with one of the example positioning systems.

A Generative Approach to Audio-Visual Person Tracking

Author: Brunelli Roberto
Brutti Alessio
Chippendale Paul Ian
Lanz Oswald
Omologo Maurizio
Svaizer Piergiorgio
Tobia Francesco
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

This paper focuses on the integration of acoustic and visual information for people tracking. The system presented relies on a probabilistic framework within which information from multiple sources is integrated at an intermediate stage. An advantage of the proposed method is its use of a generative approach, which supports easy and robust integration of multi-source information by means of sampled projection instead of triangulation. The system described has been developed in the research activities of the EU-funded CHIL Project. Experimental results from the CLEAR evaluation workshop are reported.

Archivio della ricerca - Fondazione Bruno Kessler