44 research outputs found
Highly Efficient Regression for Scalable Person Re-Identification
Existing person re-identification models are poor for scaling up to large
data required in real-world applications due to: (1) Complexity: They employ
complex models for optimal performance resulting in high computational cost for
training at a large scale; (2) Inadaptability: Once trained, they are
unsuitable for incremental update to incorporate any new data available. This
work proposes a truly scalable solution to re-id by addressing both problems.
Specifically, a Highly Efficient Regression (HER) model is formulated by
embedding the Fisher's criterion to a ridge regression model for very fast
re-id model learning with scalable memory/storage usage. Importantly, this new
HER model supports faster than real-time incremental model updates therefore
making real-time active learning feasible in re-id with human-in-the-loop.
Extensive experiments show that such a simple and fast model not only
outperforms notably the state-of-the-art re-id methods, but also is more
scalable to large data with additional benefits to active learning for reducing
human labelling effort in re-id deployment
Multi-View Face Recognition From Single RGBD Models of the Faces
This work takes important steps towards solving the following problem of current interest: Assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated above is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) Generating a large set of viewpoint dependent face images from a single RGBD frontal image for each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face as recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets which include a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to successfully recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks
Multi-task mutual learning for vehicle re-identification
Vehicle re-identification (Re-ID) aims to search a specific vehicle instance across non-overlapping camera views. The main challenge of vehicle Re-ID is that the visual appearance of vehicles may drastically changes according to diverse viewpoints and illumination. Most existing vehicle Re-ID models cannot make full use of various complementary vehicle information, e.g. vehicle type and orientation. In this paper, we propose a novel Multi-Task Mutual Learning (MTML) deep model to learn discriminative features simultaneously from multiple branches. Specifically, we design a consensus learning loss function by fusing features from the final convolutional feature maps from all branches. Extensive comparative evaluations demonstrate the effectiveness of our proposed MTML method in comparison to the state-of-the-art vehicle Re-ID techniques on a large-scale benchmark dataset, VeRi-776. We also yield competitive performance on the NVIDIA 2019 AI City Challenge Track 2
Information theoretic sensor management for multi-target tracking with a single pan-tilt-zoom camera
Automatic multiple target tracking with pan-tilt-zoom (PTZ) cameras is a hard task, with few approaches in the lit-erature, most of them proposing simplistic scenarios. In this paper, we present a PTZ camera management framework which lies on information theoretic principles: at each time step, the next camera pose (pan, tilt, focal length) is chosen, according to a policy which ensures maximum information gain. The formulation takes into account occlusions, phys-ical extension of targets, realistic pedestrian detectors and the mechanical constraints of the camera. Convincing com-parative results on synthetic data, realistic simulations and the implementation on a real video surveillance camera val-idate the effectiveness of the proposed method. 1