Deep Perceptual Mapping for Thermal to Visible Face Recognition

Authors: Sarfraz M. Saquib; Stiefelhagen Rainer
Publication date: 10/07/2015

Cross-modal face matching between the thermal and visible spectrum is a much desired capability for night-time surveillance and security applications. Due to a very large modality gap, thermal-to-visible face recognition is one of the most challenging face matching problems. In this paper, we present an approach to bridge this modality gap by a significant margin. Our approach captures the highly non-linear relationship between the two modalities by using a deep neural network. Our model attempts to learn a non-linear mapping from the visible to the thermal spectrum while preserving the identity information. We show substantive performance improvement on a difficult thermal-visible face dataset. The presented approach improves the state of the art by more than 10% in terms of Rank-1 identification and bridges the drop in performance due to the modality gap by more than 40%. Comment: BMVC 2015 (oral)
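The abstract reports its gains as Rank-1 identification. As a rough illustration of how such a figure is commonly computed, the sketch below scores each probe feature against every gallery feature and counts a hit when the single best match has the probe's identity. Everything here is an assumption for illustration: the function names, the use of cosine similarity, and the toy feature vectors are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank1_identification_rate(probe_features, probe_ids,
                              gallery_features, gallery_ids):
    """Fraction of probes whose top-scoring gallery entry has the same identity.

    Hypothetical helper: probes could be thermal-image features and the
    gallery visible-image features after mapping into a shared space.
    """
    correct = 0
    for feature, true_id in zip(probe_features, probe_ids):
        scores = [cosine_similarity(feature, g) for g in gallery_features]
        # Index of the most similar gallery entry (the Rank-1 match).
        best = max(range(len(scores)), key=scores.__getitem__)
        if gallery_ids[best] == true_id:
            correct += 1
    return correct / len(probe_features)


# Toy data: two gallery identities, three probes (the last one is mismatched).
gallery = [[1.0, 0.0], [0.0, 1.0]]
gallery_labels = ["a", "b"]
probes = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
probe_labels = ["a", "b", "b"]

rate = rank1_identification_rate(probes, probe_labels, gallery, gallery_labels)
print(f"Rank-1 identification rate: {rate:.3f}")
```

In this toy setup the third probe lies closer to identity "a" than to its true identity "b", so two of the three probes are matched correctly at Rank-1.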

arXiv.org e-Print Archive

KITopen

Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts

Authors: Ho Kalun; Keuper Janis; Keuper Margret
Publication date: 01/01/2020

Multiple Object Tracking (MOT) is a long-standing task in computer vision. Current approaches based on the tracking-by-detection paradigm require either some sort of domain knowledge or supervision to associate data correctly into tracks. In this work, we present an unsupervised multiple object tracking approach based on visual features and minimum cost lifted multicuts. Our method is based on straightforward spatio-temporal cues that can be extracted from neighboring frames in an image sequence without supervision. Clustering based on these cues enables us to learn the required appearance invariances for the tracking task at hand and train an autoencoder to generate suitable latent representations. Thus, the resulting latent representations can serve as robust appearance cues for tracking even over large temporal distances where no reliable spatio-temporal features could be extracted. We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.

arXiv.org e-Print Archive

Hochschulschriftenserver der Hochschule Offenburg