Unsupervised cross-modal deep-model adaptation for audio-visual re-identification with wearable cameras

Abstract

Model adaptation is important for the analysis of audio-visual data from body-worn cameras in order to cope with rapidly changing scene conditions, varying object appearance and limited training data. In this paper, we propose a new approach for the online and unsupervised adaptation of deep-learning models for audio-visual target re-identification. Specifically, we adapt each mono-modal model using the unsupervised labelling provided by the other modality. To limit the detrimental effects of erroneous labels, we use a regularisation term based on the Kullback-Leibler divergence between the initial model and the one being adapted. The proposed adaptation strategy complements common audio-visual late-fusion approaches and remains beneficial even when one modality is no longer reliable. We show the contribution of the proposed strategy in improving overall re-identification performance on a challenging public dataset captured with body-worn cameras.
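To make the adaptation scheme concrete, the following is a minimal sketch, assuming PyTorch, of one unsupervised adaptation step: a mono-modal model is trained on pseudo-labels produced by the other modality, with a Kullback-Leibler regulariser that keeps its posterior close to that of a frozen copy of the initial model. All names here (`adapt_step`, `frozen_init`, `kl_weight`, the toy models) are illustrative assumptions, not the authors' implementation.

```python
import copy
import torch
import torch.nn.functional as F

def adapt_step(model, frozen_init, inputs, pseudo_labels, optimiser, kl_weight=0.1):
    """One unsupervised adaptation step for a mono-modal model.

    pseudo_labels: class indices predicted by the *other* modality.
    frozen_init:   a frozen copy of the model before adaptation, used to
                   regularise the update via a KL-divergence term.
    """
    model.train()
    logits = model(inputs)

    # Cross-entropy against the (possibly noisy) cross-modal pseudo-labels.
    ce_loss = F.cross_entropy(logits, pseudo_labels)

    # KL(initial || adapted): penalises drift of the adapted posterior away
    # from the initial model, limiting the damage of erroneous labels.
    with torch.no_grad():
        init_probs = F.softmax(frozen_init(inputs), dim=1)
    kl_loss = F.kl_div(F.log_softmax(logits, dim=1), init_probs,
                       reduction='batchmean')

    loss = ce_loss + kl_weight * kl_loss
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-ins for the audio and video classifiers (hypothetical).
    audio_model = torch.nn.Linear(16, 4)
    video_model = torch.nn.Linear(32, 4)
    audio_init = copy.deepcopy(audio_model).eval()
    for p in audio_init.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(audio_model.parameters(), lr=1e-2)

    audio_batch, video_batch = torch.randn(8, 16), torch.randn(8, 32)
    # The video model pseudo-labels the audio stream; the symmetric
    # direction (audio labelling video) would mirror this call.
    with torch.no_grad():
        pseudo = video_model(video_batch).argmax(dim=1)
    print(adapt_step(audio_model, audio_init, audio_batch, pseudo, opt))
```

Since the KL term is computed against a frozen snapshot rather than the previous iterate, repeated online updates cannot drift arbitrarily far from the initial model even under a sustained run of wrong pseudo-labels.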
