6 research outputs found

    Multi-modal particle filtering tracking using appearance, motion and audio likelihoods

    We propose a multi-modal object tracking algorithm that combines appearance, motion and audio information in a particle filter. The proposed tracker fuses, at the likelihood level, the audio-visual observations captured with a video camera coupled with two microphones. Two video likelihoods are computed, one based on a 3D color histogram appearance model and one on color change detection, whereas an audio likelihood provides information about the direction of arrival of a target. The direction of arrival is computed from a multi-band generalized cross-correlation function enhanced with noise suppression and with reverberation filtering that uses the precedence effect. We evaluate the tracker in single-modality and multi-modality tracking and quantify the performance improvement introduced by integrating audio and visual information in the tracking process.
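
    Likelihood-level fusion in a particle filter can be illustrated with a minimal sketch. This is not the authors' tracker: it assumes a one-dimensional state (a horizontal position), Gaussian observation models, a random-walk motion model, and conditional independence of the two modalities given the state, so that the fused likelihood is a product. All function names and noise parameters below are hypothetical.

```python
import math
import random


def gaussian(x, mean, sigma):
    """Unnormalized Gaussian likelihood of x around mean."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def fused_weight(particle_x, video_obs, audio_obs,
                 sigma_video=5.0, sigma_audio=20.0):
    # Assumption: modalities are conditionally independent given the state,
    # so the joint likelihood is the product of the per-modality likelihoods.
    return (gaussian(video_obs, particle_x, sigma_video)
            * gaussian(audio_obs, particle_x, sigma_audio))


def particle_filter_step(particles, video_obs, audio_obs, rng,
                         motion_sigma=3.0):
    # Predict: propagate each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_sigma) for p in particles]
    # Update: weight each particle by the fused audio-visual likelihood.
    weights = [fused_weight(p, video_obs, audio_obs) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle set.
    estimate = sum(p * w for p, w in zip(predicted, weights))
    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def run_demo(seed=0):
    rng = random.Random(seed)
    true_x = 100.0  # hypothetical static target position
    particles = [rng.uniform(0.0, 200.0) for _ in range(500)]
    estimate = None
    for _ in range(20):
        video_obs = true_x + rng.gauss(0.0, 5.0)   # precise modality
        audio_obs = true_x + rng.gauss(0.0, 20.0)  # coarse modality
        particles, estimate = particle_filter_step(
            particles, video_obs, audio_obs, rng)
    return estimate
```

    Calling `run_demo()` returns the position estimate after 20 steps. Because the weights are a product, the modality with the smaller assumed noise dominates whenever both observations are available, while the coarser one still constrains the particles.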

    Tracking Identities and Attention in Smart Environments - Contributions and Progress in the CHIL Project


    Detection and localization of 3d audio-visual objects using unsupervised clustering


    Detection and Localization of 3D Audio-Visual Objects Using Unsupervised Clustering

    This paper addresses the problem of detecting and localizing objects in a scene that are both seen and heard. We explain the benefits of a human-like configuration of sensors (binaural and binocular) for gathering auditory and visual observations. We show that the detection and localization problem can be recast as the task of clustering the audio-visual observations into coherent groups. We propose a probabilistic generative model that captures the relations between audio and visual observations. This model maps the data into a common audio-visual 3D representation via a pair of mixture models. Inference is performed by a version of the expectation-maximization algorithm, which is formally derived and which provides cooperative estimates of both the auditory activity and the 3D position of each object. We describe several experiments with single- and multiple-speaker detection and localization in the presence of other audio sources.
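
    The idea of clustering observations with expectation-maximization can be sketched on a much simpler model than the one in the paper. The example below assumes a plain one-dimensional Gaussian mixture rather than the paper's pair of coupled audio-visual mixture models; the function names and the initial variances are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture with EM, starting from init_means."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k          # assumed initial variances
    weights = [1.0 / k] * k        # uniform initial mixing weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint) or 1e-300  # guard against underflow
            resp.append([v / total for v in joint])
        # M-step: re-estimate parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)
            weights[j] = nj / len(data)
    return weights, means, variances


def make_demo_data(seed=0):
    """Two synthetic clusters standing in for two observed objects."""
    rng = random.Random(seed)
    return ([rng.gauss(-2.0, 0.5) for _ in range(200)]
            + [rng.gauss(3.0, 0.5) for _ in range(200)])
```

    For example, `em_gmm_1d(make_demo_data(), [-1.0, 1.0])` returns the mixing weights, means, and variances of the two fitted components. In the paper's setting, each component would correspond to one audio-visual object and the means to 3D positions.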

    The always best positioned paradigm for mobile indoor applications

    This dissertation investigates methods for personal positioning in outdoor and indoor environments. It defines the Always Best Positioned paradigm, whose goal is to provide self-positioning that is as consistent and gap-free as possible. It also presents the localization toolkit LOCATO, which makes it easy to build positioning systems that follow the paradigm. New algorithms were developed that specifically address the robustness of positioning systems with respect to the Always Best Positioned paradigm. With the help of this toolkit, three example positioning systems were implemented, each designed for different applications and requirements: a low-cost system, which can be used in conjunction with user-adaptive public displays; a so-called opportunistic system, which enables positioning with room-level accuracy in any building that provides a WiFi infrastructure; and a high-accuracy (meter-level) system for instrumented environments, which works with active RFID tags and infrared beacons. Furthermore, a new evaluation method for positioning systems is presented, which uses step-accurate natural walking traces as ground truth. Finally, six location-based services are presented, which were realized either with the tools provided by LOCATO or with one of the example positioning systems.

    A Generative Approach to Audio-Visual Person Tracking

    This paper focuses on the integration of acoustic and visual information for people tracking. The system presented relies on a probabilistic framework within which information from multiple sources is integrated at an intermediate stage. An advantage of the proposed method is its generative approach, which supports easy and robust integration of multi-source information by means of sampled projection instead of triangulation. The system was developed as part of the research activities of the EU-funded CHIL Project. Experimental results from the CLEAR evaluation workshop are reported.