Learning Feature Aggregation in Temporal Domain for Re-Identification
Person re-identification is a well-established problem in the computer vision
community. In recent years, vehicle re-identification has also been receiving
increasing attention. In this paper, we focus on both tasks and propose a
method for aggregating features in the temporal domain, as it is common to have
multiple observations of the same object. The aggregation weights different
elements of the feature vectors with different weights and is trained in an
end-to-end manner by a Siamese network. The experimental results show that our
method outperforms other existing methods for feature aggregation in the
temporal domain on both vehicle and person re-identification tasks.
Furthermore, to push research in vehicle re-identification further, we
introduce a novel dataset CarsReId74k. The dataset is not limited to
frontal/rear viewpoints. It contains 17,681 unique vehicles, 73,976 observed
tracks, and 277,236 positive pairs. The dataset was captured by 66 cameras from
various angles.
Comment: Under consideration at Computer Vision and Image Understanding
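The element-wise temporal weighting described in the abstract can be sketched in a few lines: per element of the feature vector, the per-frame scores are softmax-normalized over the track and used as aggregation weights. This is a minimal pure-Python illustration, not the authors' implementation; the function names and the score inputs are hypothetical.

```python
import math

def aggregate_track(feats, scores):
    """Aggregate the per-frame feature vectors of one track into a single
    descriptor. feats and scores are both T x D lists (T frames, D elements).
    For each element d, the scores are softmax-normalized over the T frames,
    so each feature element gets its own temporal weighting."""
    T, D = len(feats), len(feats[0])
    agg = []
    for d in range(D):
        col = [scores[t][d] for t in range(T)]
        m = max(col)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in col]
        z = sum(exps)
        agg.append(sum((e / z) * feats[t][d] for t, e in enumerate(exps)))
    return agg
```

With uniform scores this reduces to a plain temporal average; in the paper the scores themselves would be produced by the learned network.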
Attribute-Aware Attention Model for Fine-grained Representation Learning
How to learn a discriminative fine-grained representation is a key point in
many computer vision applications, such as person re-identification,
fine-grained classification, and fine-grained image retrieval. Most previous
methods focus on learning metrics or ensembles to derive a better global
representation, which usually lacks local information. Based on these
considerations, we propose a novel Attribute-Aware Attention Model that can
learn local attribute representations and a global category representation
simultaneously in an end-to-end manner. The proposed model
contains two attention models: attribute-guided attention module uses attribute
information to help select category features in different regions, at the same
time, category-guided attention module selects local features of different
attributes with the help of category cues. Through this attribute-category
reciprocal process, local and global features benefit from each other. Finally,
the resulting feature contains more intrinsic information for image recognition
instead of the noisy and irrelevant features. Extensive experiments conducted
on Market-1501, CompCars, CUB-200-2011 and CARS196 demonstrate the
effectiveness of our model. Code is available at
https://github.com/iamhankai/attribute-aware-attention.
Comment: Accepted by ACM Multimedia 2018 (Oral)
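The core idea of attribute-guided attention — using attribute cues to score and select region features — can be sketched as a dot-product attention over regions. This is a hedged, stdlib-only sketch of the mechanism, not the released code; all names are hypothetical.

```python
import math

def attribute_guided_attention(region_feats, attr_vec):
    """Score each region feature by its dot product with an attribute vector,
    softmax the scores into attention weights, and return the
    attention-weighted pooled feature together with the weights."""
    scores = [sum(a * r for a, r in zip(attr_vec, region))
              for region in region_feats]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(region_feats[0])
    pooled = [sum(w * region[d] for w, region in zip(weights, region_feats))
              for d in range(dim)]
    return pooled, weights
```

In the paper's reciprocal scheme, a symmetric category-guided module would score attribute-local features the same way, with category cues in place of `attr_vec`.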
Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-temporal Path Proposals
Vehicle re-identification is an important problem and has many applications
in video surveillance and intelligent transportation. It gains increasing
attention because of the recent advances of person re-identification
techniques. However, unlike person re-identification, the visual differences
between pairs of vehicle images are usually subtle and even challenging for
humans to distinguish. Incorporating additional spatio-temporal information is
vital for solving this challenging re-identification task. Existing vehicle
re-identification methods have ignored or used over-simplified models for the
spatio-temporal relations between vehicle images. In this paper, we propose a
two-stage framework that incorporates complex spatio-temporal information for
effectively regularizing the re-identification results. Given a pair of vehicle
images with their spatio-temporal information, a candidate
visual-spatio-temporal path is first generated by a chain MRF model with a
deeply learned potential function, where each visual-spatio-temporal state
corresponds to an actual vehicle image with its spatio-temporal information. A
Siamese-CNN+Path-LSTM model takes the candidate path as well as the pairwise
queries to generate their similarity score. Extensive experiments and analysis
show the effectiveness of our proposed method and its individual components.
Comment: To appear in ICCV 2017
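The two-stage idea — judge a query-gallery pair not only by direct visual similarity but also by how plausibly a candidate path of intermediate observations connects them — can be caricatured with a simple fused score. This is only an illustrative sketch of the intuition (the paper uses a chain MRF and a Path-LSTM); the cosine features, the smoothness proxy, and the blending weight `alpha` are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def path_consistency(path):
    """Average similarity of consecutive states on a candidate
    visual-spatio-temporal path; a plausible path varies smoothly."""
    sims = [cosine(a, b) for a, b in zip(path, path[1:])]
    return sum(sims) / len(sims)

def pair_score(query, gallery, path, alpha=0.5):
    # Hypothetical fusion: blend direct visual similarity with the
    # consistency of the candidate path connecting the two cameras.
    return alpha * cosine(query, gallery) + (1 - alpha) * path_consistency(path)
```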
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
The paper presents the futuristic challenges discussed in the
cvpaper.challenge. In 2015 and 2016, we thoroughly studied 1,600+ papers in
several conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.
AI Oriented Large-Scale Video Management for Smart City: Technologies, Standards and Beyond
Deep learning has achieved substantial success in a series of tasks in
computer vision. Intelligent video analysis, which can be broadly applied to
video surveillance in various smart city applications, can also be driven by
such powerful deep learning engines. However, practically deploying deep
neural network models for large-scale video analysis still poses unprecedented
challenges for large-scale video data management. Deep feature coding,
instead of video coding, provides a practical solution for handling the
large-scale video surveillance data. To enable interoperability in the context
of deep feature coding, standardization is urgent and important. However, due
to the explosion of deep learning algorithms and the particularity of feature
coding, there are numerous remaining problems in the standardization process.
This paper envisions a future deep feature coding standard for AI-oriented
large-scale video management, and discusses existing techniques, standards, and
possible solutions for these open problems.
Comment: 8 pages, 8 figures, 5 tables
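The appeal of feature coding over video coding is that a compact, quantized descriptor can be stored and transmitted in place of the raw frames. As a toy illustration only (any real standard would specify its own quantizer and entropy coding), a uniform scalar quantizer for a float feature vector might look like:

```python
def encode_feature(vec, lo=-1.0, hi=1.0, levels=256):
    """Uniformly quantize each element of a float feature vector in [lo, hi]
    into one of `levels` integer codes, so it fits in one byte per element."""
    step = (hi - lo) / (levels - 1)
    return [min(levels - 1, max(0, round((x - lo) / step))) for x in vec]

def decode_feature(codes, lo=-1.0, hi=1.0, levels=256):
    """Reconstruct an approximate float vector from the integer codes."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]
```

The round-trip error is bounded by half a quantization step, which is the basic storage/fidelity trade-off a feature coding standard would have to pin down.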
Joint Discriminative and Generative Learning for Person Re-identification
Person re-identification (re-id) remains challenging due to significant
intra-class variations across different cameras. Recently, there has been a
growing interest in using generative models to augment training data and
enhance the invariance to input changes. The generative pipelines in existing
methods, however, stay relatively separate from the discriminative re-id
learning stages. Accordingly, re-id models are often trained in a
straightforward manner on the generated data. In this paper, we seek to improve
learned re-id embeddings by better leveraging the generated data. To this end,
we propose a joint learning framework that couples re-id learning and data
generation end-to-end. Our model involves a generative module that separately
encodes each person into an appearance code and a structure code, and a
discriminative module that shares the appearance encoder with the generative
module. By switching the appearance or structure codes, the generative module
is able to generate high-quality cross-id composed images, which are fed back
online to the appearance encoder and used to improve the discriminative module.
The proposed joint learning framework renders significant improvement over the
baseline without using generated data, leading to the state-of-the-art
performance on several benchmark datasets.
Comment: CVPR 2019 (Oral)
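The code-swapping step that produces cross-id training images can be sketched combinatorially: take the appearance code of one identity and the structure code of another. This is only a schematic of the pairing logic, not the generative network itself; the triple layout and labeling convention are assumptions.

```python
def cross_id_compose(samples):
    """Given (identity, appearance_code, structure_code) triples, enumerate
    cross-id compositions: appearance from one identity combined with the
    structure of a different identity. Each composed sample keeps the
    appearance identity as its label, mirroring the feedback to the
    appearance encoder."""
    composed = []
    for id_a, app_a, _ in samples:
        for id_b, _, str_b in samples:
            if id_a != id_b:
                composed.append((id_a, app_a, str_b))
    return composed
```

For N samples of distinct identities this yields N*(N-1) composed images, which is what makes the generated data a rich online augmentation source.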
Part-Guided Attention Learning for Vehicle Instance Retrieval
Vehicle instance retrieval often requires one to recognize the fine-grained
visual differences between vehicles. Besides the holistic appearance of
vehicles which is easily affected by the viewpoint variation and distortion,
vehicle parts also provide crucial cues to differentiate near-identical
vehicles. Motivated by these observations, we introduce a Part-Guided Attention
Network (PGAN) to pinpoint the prominent part regions and effectively combine
the global and part information for discriminative feature learning. PGAN first
detects the locations of different part components and salient regions
regardless of the vehicle identity, which serve as the bottom-up attention to
narrow down the possible searching regions. To estimate the importance of
detected parts, we propose a Part Attention Module (PAM) to adaptively locate
the most discriminative regions with high-attention weights and suppress the
distraction of irrelevant parts with relatively low weights. The PAM is guided
by the instance retrieval loss and therefore provides top-down attention that
enables attention to be calculated at the level of car parts and other salient
regions. Finally, we aggregate the global appearance and part features to
improve the feature performance further. The PGAN combines part-guided
bottom-up and top-down attention, global and part visual features in an
end-to-end framework. Extensive experiments demonstrate that the proposed
method achieves new state-of-the-art vehicle instance retrieval performance on
four large-scale benchmark datasets.
Comment: 12 pages
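The final fusion step — softmax part-attention weights pooling the part features, then combining with the global descriptor — can be sketched as follows. This is a hedged, stdlib-only sketch of the aggregation pattern (here via concatenation), with all names hypothetical; PGAN's actual PAM is a learned module.

```python
import math

def fuse_global_and_parts(global_feat, part_feats, part_scores):
    """Softmax the per-part scores (top-down attention), pool the part
    features by the resulting weights so low-scoring, irrelevant parts are
    suppressed, and concatenate the pooled part feature with the global
    descriptor."""
    m = max(part_scores)
    exps = [math.exp(s - m) for s in part_scores]
    z = sum(exps)
    w = [e / z for e in exps]
    dim = len(part_feats[0])
    pooled = [sum(wi * p[d] for wi, p in zip(w, part_feats))
              for d in range(dim)]
    return global_feat + pooled  # list concatenation = feature concat
```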
Attribute-guided Feature Learning Network for Vehicle Re-identification
Vehicle re-identification (reID) plays an important role in the automatic
analysis of the increasing urban surveillance videos, which has become a hot
topic in recent years. However, it remains a critical yet challenging problem
caused by the various viewpoints of vehicles, diversified illumination, and
complicated environments. To date, most existing vehicle reID approaches focus
on learning metrics or ensembles to derive better representations, but they
only take the identity labels of vehicles into consideration. However, vehicle
attributes, which contain detailed descriptions, are beneficial for training a
reID model. Hence, this paper proposes a novel Attribute-Guided Network (AGNet),
which could learn global representation with the abundant attribute features in
an end-to-end manner. Specifically, an attribute-guided module is proposed in
AGNet to generate an attribute mask that in turn guides the selection of
discriminative features for category classification. In addition, an
attribute-based label smoothing (ALS) loss is presented to better train the
reID model; it strengthens the model's discriminative ability by regularizing
AGNet according to the attributes. Comprehensive
experimental results clearly demonstrate that our method achieves excellent
performance on both the VehicleID dataset and the VeRi-776 dataset.
Comment: arXiv admin note: text overlap with arXiv:1912.1019
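An attribute-based label smoothing loss can be pictured as ordinary label smoothing whose off-target mass is distributed in proportion to attribute similarity rather than uniformly. The paper does not publish this exact formula here, so the following is a hypothetical sketch of one plausible instantiation; `attr_sim` and `eps` are assumed inputs.

```python
def attribute_label_smoothing(num_classes, target, attr_sim, eps=0.1):
    """Build a smoothed target distribution: keep 1 - eps on the true class
    and spread eps over the remaining classes in proportion to their
    attribute similarity to the true class (attr_sim[c] >= 0, and at least
    one non-target similarity must be positive)."""
    others = [c for c in range(num_classes) if c != target]
    total = sum(attr_sim[c] for c in others)
    dist = [0.0] * num_classes
    dist[target] = 1.0 - eps
    for c in others:
        dist[c] = eps * attr_sim[c] / total
    return dist
```

Classes sharing attributes with the true identity receive more of the smoothing mass, which is one way attribute cues could regularize the classifier.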
Vehicle Re-Identification in Context
Existing vehicle re-identification (re-id) evaluation benchmarks consider
strongly artificial test scenarios by assuming the availability of high quality
images and fine-grained appearance at an almost constant image scale,
reminiscent of images required for Automatic Number Plate Recognition, e.g.
VeRi-776. Such assumptions are often invalid in realistic vehicle re-id
scenarios where arbitrarily changing image resolutions (scales) are the norm.
This makes the existing vehicle re-id benchmarks limited for testing the true
performance of a re-id method. In this work, we introduce a more realistic and
challenging vehicle re-id benchmark, called Vehicle Re-Identification in
Context (VRIC). In contrast to existing datasets, VRIC is uniquely
characterised by vehicle images subject to more realistic and unconstrained
variations in resolution (scale), motion blur, illumination, occlusion, and
viewpoint. It contains 60,430 images of 5,622 vehicle identities captured by 60
different cameras at heterogeneous road traffic scenes in both day-time and
night-time.
Comment: Dataset available at: http://qmul-vric.github.io. To appear at the
German Conference on Pattern Recognition (GCPR) 2018
Attribute-guided Feature Extraction and Augmentation Robust Learning for Vehicle Re-identification
Vehicle re-identification is one of the core technologies of intelligent
transportation systems and smart cities, but large intra-class diversity and
inter-class similarity pose great challenges for existing methods. In this
paper, we propose a multi-guided learning approach that utilizes attribute
information and introduces two novel random augmentations to improve
robustness during training. Moreover, we propose an attribute constraint
method and a group re-ranking strategy to refine matching results. Our method
achieves an mAP of 66.83% and a rank-1 accuracy of 76.05% in the CVPR 2020 AI
City Challenge.
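The mAP and rank-1 figures quoted above are the standard re-id retrieval metrics: average precision of each query's ranked gallery list, averaged over queries, plus the fraction of queries whose top match is correct. As a self-contained sketch of how they are computed (for illustration only, on binary relevance lists):

```python
def average_precision(relevance):
    """relevance: a ranked list of 0/1 flags for one query (1 = correct
    match at that rank). Returns the average of the precision values at
    each correct hit."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

def evaluate(rankings):
    """rankings: one relevance list per query. Returns (mAP, rank-1)."""
    m_ap = sum(average_precision(r) for r in rankings) / len(rankings)
    rank1 = sum(r[0] for r in rankings) / len(rankings)
    return m_ap, rank1
```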