Real-time Person Re-identification at the Edge: A Mixed Precision Approach
A critical part of multi-person, multi-camera tracking is the person
re-identification (re-ID) algorithm, which recognizes and retains the identities of
all detected unknown people throughout the video stream. Many re-ID algorithms
today achieve state-of-the-art results, but little work has been done to
explore the deployment of such algorithms in computation- and power-constrained
real-time scenarios. In this paper, we study the effect of using a lightweight
model, MobileNet-V2, for re-ID and investigate the impact of single (FP32)
precision versus half (FP16) precision for training on the server and inference
on the edge nodes. We further compare the results with the baseline model, which
uses ResNet-50, on state-of-the-art benchmarks including CUHK03, Market-1501,
and Duke-MTMC. The MobileNet-V2 mixed-precision training method improves
both inference throughput on the edge node and training time on the server,
reaching 27.77 fps and , respectively, and decreases
power consumption on the edge node by , while it reduces
accuracy by only 5.6% relative to single-precision ResNet-50, averaged over
three different datasets. The code and pre-trained networks are publicly
available at https://github.com/TeCSAR-UNCC/person-reid.
Comment: This is a pre-print of an article published in International
Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in
Computer Science. The final authenticated version is available online at
https://doi.org/10.1007/978-3-030-27272-2_
Few-Shot Deep Adversarial Learning for Video-based Person Re-identification
Video-based person re-identification (re-ID) refers to matching people across
camera views from arbitrary unaligned video footage. Existing methods rely on
supervision signals to optimise a projected space under which the distances
between inter/intra-videos are maximised/minimised. However, this demands
exhaustively labelling people across camera views, which prevents these methods
from scaling to large camera networks. Moreover, learning effective
view-invariant video representations is not explicitly addressed, and without
view invariance features exhibit different distributions. Thus, matching videos
for person re-ID demands flexible models to capture the dynamics in time-series
observations and learn view-invariant representations with access to limited
labeled training samples. In this paper, we propose a novel few-shot deep
learning approach to video-based person re-ID that learns comparable
representations that are discriminative and view-invariant. The proposed method
builds on variational recurrent neural networks (VRNNs) and is trained
adversarially to produce latent variables with temporal dependencies that are
highly discriminative yet view-invariant in matching persons. Through extensive
experiments conducted on three benchmark datasets, we empirically show the
capability of our method to create view-invariant temporal features and the
state-of-the-art performance it achieves.
Comment: Appearing at IEEE Transactions on Image Processin