Pose-based Deep Gait Recognition
Human gait or walking manner is a biometric feature that allows
identification of a person when other biometric features such as the face or
iris are not visible. In this paper, we present a new pose-based convolutional
neural network model for gait recognition. Unlike many methods that consider
the full-height silhouette of a moving person, we consider the motion of points
in the areas around human joints. To extract motion information, we estimate
the optical flow between consecutive frames. We propose a deep convolutional
model that computes pose-based gait descriptors. We compare different network
architectures and aggregation methods and experimentally assess various sets of
body parts to determine which are the most important for gait recognition. In
addition, we investigate the generalization ability of the developed algorithms
by transferring them between datasets. The results of these experiments show
that our approach outperforms state-of-the-art methods.
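The core idea above, describing gait by the motion around joints rather than by a full-height silhouette, can be sketched as follows. This is a minimal illustration, assuming a dense optical-flow field and joint coordinates are already available; the function name and patch size are hypothetical, and the paper's actual pipeline uses learned pose estimation and a deep convolutional model on top of such patches:

```python
import numpy as np

def joint_motion_patches(flow, joints, patch=8):
    """Crop a motion patch around each joint and concatenate them.

    flow:   (H, W, 2) dense optical flow between two consecutive frames
    joints: list of (row, col) joint coordinates
    """
    H, W, _ = flow.shape
    patches = []
    for r, c in joints:
        r0, r1 = max(r - patch, 0), min(r + patch, H)
        c0, c1 = max(c - patch, 0), min(c + patch, W)
        patches.append(flow[r0:r1, c0:c1].reshape(-1))
    # Concatenate per-joint motion vectors into one gait descriptor
    return np.concatenate(patches)

# Toy example: a 64x48 flow field and three joints well inside the frame
flow = np.random.randn(64, 48, 2)
desc = joint_motion_patches(flow, [(20, 24), (40, 20), (40, 28)])
print(desc.shape)
```

Each joint contributes a 16x16x2 block here, so three joints yield a 1,536-dimensional descriptor; in the paper this role is played by learned pose-based gait descriptors rather than raw flow values.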
Cross-Domain Visual Matching via Generalized Similarity Measure and Feature Learning
Cross-domain visual data matching is one of the fundamental problems in many
real-world vision tasks, e.g., matching persons across ID photos and
surveillance videos. Conventional approaches to this problem usually involve
two steps: i) projecting samples from different domains into a common space,
and ii) computing (dis-)similarity in this space based on a certain distance.
In this paper, we present a novel pairwise similarity measure that advances
existing models by i) expanding traditional linear projections into affine
transformations and ii) fusing affine Mahalanobis distance and Cosine
similarity by a data-driven combination. Moreover, we unify our similarity
measure with feature representation learning via deep convolutional neural
networks. Specifically, we incorporate the similarity measure matrix into the
deep architecture, enabling an end-to-end way of model optimization. We
extensively evaluate our generalized similarity model in several challenging
cross-domain matching tasks: person re-identification under different views and
face verification over different modalities (i.e., faces from still images and
videos, older and younger faces, and sketch and photo portraits). The
experimental results demonstrate the superior performance of our model over other
state-of-the-art methods. Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (T-PAMI), 201
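The measure described above fuses an affine Mahalanobis distance with a cosine similarity after projecting both samples into a common space. A minimal numeric sketch, with fixed parameters standing in for the quantities the paper learns end-to-end (the affine map, the PSD matrix, and the data-driven mixing weight are all hypothetical here):

```python
import numpy as np

def generalized_similarity(x, y, A, b, M, lam=0.5):
    """Toy fusion of an affine Mahalanobis distance and cosine similarity.

    x, y : feature vectors from the two domains
    A, b : affine projection (the paper learns one per domain)
    M    : positive semi-definite matrix for the Mahalanobis term
    lam  : mixing weight (fixed here; data-driven in the paper)
    """
    u, v = A @ x + b, A @ y + b          # affine projection into common space
    d = u - v
    mahal = float(d @ M @ d)             # squared Mahalanobis distance
    cos = float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))
    # Higher is more similar: penalize the distance term, reward the cosine term
    return -lam * mahal + (1 - lam) * cos

rng = np.random.default_rng(0)
x = rng.normal(size=4)
A, b, M = np.eye(3, 4), np.zeros(3), np.eye(3)
same = generalized_similarity(x, x, A, b, M)
diff = generalized_similarity(x, -x, A, b, M)
print(same > diff)  # identical inputs score higher than opposed ones
```

The paper goes further by incorporating this similarity matrix into a deep ConvNet so the projection and the metric are optimized jointly; the sketch only shows the scoring function's shape.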
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR 2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading
all the papers through the generation of ideas to the writing of papers. Comment: Survey Paper
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
The paper presents futuristic challenges discussed in the cvpaper.challenge. In
2015 and 2016, we thoroughly studied 1,600+ papers in several
conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV.
Sparsifying Neural Network Connections for Face Recognition
This paper proposes to learn high-performance deep ConvNets with sparse
neural connections, referred to as sparse ConvNets, for face recognition. The
sparse ConvNets are learned in an iterative way, each time one additional layer
is sparsified and the entire model is re-trained given the initial weights
learned in previous iterations. One important finding is that directly training
the sparse ConvNet from scratch fails to find good solutions for face
recognition, whereas properly initializing a sparser model with a previously
learned denser model is critical for continuing to learn effective features
for face recognition. This paper also proposes a new neural correlation-based
weight selection criterion and empirically verifies its effectiveness in
selecting informative connections from previously learned models in each
iteration. With a moderately sparse structure (26%-76% of the weights in the
dense model), the proposed sparse ConvNet model significantly improves the face
recognition performance of the previous state-of-the-art DeepID2+ models given
the same training data, while it matches the performance of the baseline model
with only 12% of the original parameters.
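The correlation-based selection idea can be illustrated on a single layer: score every connection by how correlated its input and output neurons are on training data, and keep only the top fraction. This is a rough sketch under simplifying assumptions (the function name and keep fraction are hypothetical, and the paper's criterion is applied layer by layer inside an iterative retraining loop, not in one shot):

```python
import numpy as np

def correlation_prune(W, acts_in, acts_out, keep_frac=0.25):
    """Keep the weights whose input/output neurons are most correlated,
    zeroing the rest (a sketch of correlation-based weight selection).

    W        : (n_out, n_in) weight matrix of one layer
    acts_in  : (samples, n_in) input activations on training data
    acts_out : (samples, n_out) corresponding output activations
    """
    # |Pearson correlation| between every (output, input) neuron pair
    zi = (acts_in - acts_in.mean(0)) / (acts_in.std(0) + 1e-12)
    zo = (acts_out - acts_out.mean(0)) / (acts_out.std(0) + 1e-12)
    corr = np.abs(zo.T @ zi) / len(acts_in)          # (n_out, n_in)
    k = int(keep_frac * W.size)
    thresh = np.sort(corr.ravel())[-k]               # k-th largest score
    mask = corr >= thresh
    return W * mask, mask

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))
x = rng.normal(size=(100, 16))
y = np.tanh(x @ W.T)                                 # toy layer activations
W_sparse, mask = correlation_prune(W, x, y)
print(int(mask.sum()), W.size)                       # surviving vs. total weights
```

In the paper, each sparsification step is followed by retraining the whole model from the previous weights, which the abstract identifies as the critical ingredient.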
Relevance Subject Machine: A Novel Person Re-identification Framework
We propose a novel method called the Relevance Subject Machine (RSM) to solve
the person re-identification (re-id) problem. RSM falls under the category of
Bayesian sparse recovery algorithms and uses the sparse representation of the
input video under a pre-defined dictionary to identify the subject in the
video. Our approach focuses on the multi-shot re-id problem, which is the
prevalent problem in many video analytics applications. RSM captures the
essence of the multi-shot re-id problem by constraining the support of the
sparse codes for each input video frame to be the same. Our proposed approach
is also robust enough to deal with time varying outliers and occlusions by
introducing a sparse, non-stationary noise term in the model error. We provide
a novel Variational Bayesian based inference procedure along with an intuitive
interpretation of the proposed update rules. We evaluate our approach over
several commonly used re-id datasets and show superior performance over current
state-of-the-art algorithms. Specifically, on iLIDS-VID, a recent large-scale
re-id dataset, RSM shows significant improvement over all published approaches,
achieving an 11.5% (absolute) improvement in rank-1 accuracy over the closest
competing algorithm considered. Comment: Submitted to IEEE Transactions on Pattern Analysis and Machine
Intelligence
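The shared-support constraint at the heart of the multi-shot formulation can be sketched with a greedy simultaneous sparse-coding routine: all frames of a probe video must be explained by the same dictionary atoms. This is only a proxy for illustration; RSM itself performs variational Bayesian inference rather than this greedy selection, and the dictionary/atom counts below are made up:

```python
import numpy as np

def simultaneous_omp(D, Y, n_atoms=3):
    """Greedy joint sparse coding: all columns of Y (video frames) must use
    the SAME dictionary atoms, mirroring the shared-support constraint.

    D : (d, n) dictionary, columns are gallery templates
    Y : (d, m) observed frames of one probe video
    """
    support, R = [], Y.copy()
    for _ in range(n_atoms):
        # Pick the atom most correlated with the residual across ALL frames
        scores = np.linalg.norm(D.T @ R, axis=1)
        if support:
            scores[support] = -np.inf                # never re-pick an atom
        support.append(int(np.argmax(scores)))
        Ds = D[:, support]
        X, *_ = np.linalg.lstsq(Ds, Y, rcond=None)   # joint coefficients
        R = Y - Ds @ X                               # shared residual
    return sorted(support)

rng = np.random.default_rng(2)
D = rng.normal(size=(20, 10))
Y = D[:, [2, 7]] @ rng.normal(size=(2, 5))           # 5 frames, shared support
chosen = simultaneous_omp(D, Y, n_atoms=2)
print(chosen)                                        # atoms shared by all frames
```

The identity decision then reduces to asking which gallery subject's atoms dominate the recovered support; the paper additionally models time-varying outliers with a sparse, non-stationary noise term.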
Transfer Adaptation Learning: A Decade Survey
The world we see is ever-changing, and it always changes with people, things,
and the environment. A domain refers to the state of the world at a
certain moment. A research problem is characterized as transfer adaptation
learning (TAL) when it requires knowledge correspondence between different
moments/domains. Conventional machine learning aims to find a model with the
minimum expected risk on test data by minimizing the regularized empirical risk
on the training data, which, however, supposes that the training and test data
share a similar joint probability distribution. TAL aims to build models that
can perform tasks in a target domain by learning knowledge from a semantically
related but differently distributed source domain. It is an energetic research
field of increasing influence and importance, showing an explosive publication
trend. This paper surveys the advances of TAL methodologies in the past decade,
and the technical challenges and essential problems of TAL are observed
and discussed with deep insights and new perspectives. The broader solutions to
transfer adaptation learning created by researchers are identified, i.e.,
instance re-weighting adaptation, feature adaptation, classifier adaptation,
deep network adaptation, and adversarial adaptation, which go beyond the early
semi-supervised and unsupervised split. The survey helps researchers rapidly
but comprehensively understand and identify the research foundations, current
status, theoretical limitations, future challenges, and under-studied issues
(universality, interpretability, and credibility) to be addressed in the field
toward universal representation and safe applications in open-world scenarios. Comment: 26 pages, 4 figures
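Of the solution families the survey identifies, instance re-weighting is the easiest to sketch: train a discriminator to tell source samples from target samples, then weight each source sample by the implied density ratio p(target|x)/p(source|x). A minimal, self-contained sketch of this classic recipe (not any specific method from the survey; the function name and hyperparameters are assumptions):

```python
import numpy as np

def importance_weights(Xs, Xt, steps=500, lr=0.1):
    """Instance re-weighting via a domain discriminator: fit logistic
    regression to separate source (label 0) from target (label 1), then
    weight each source sample by p(target|x) / p(source|x)."""
    X = np.vstack([Xs, Xt])
    y = np.concatenate([np.zeros(len(Xs)), np.ones(len(Xt))])
    X1 = np.hstack([X, np.ones((len(X), 1))])        # append bias column
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w -= lr * X1.T @ (p - y) / len(y)            # gradient descent step
    Xs1 = np.hstack([Xs, np.ones((len(Xs), 1))])
    ps = 1.0 / (1.0 + np.exp(-Xs1 @ w))              # p(target | x) on source
    return ps / (1.0 - ps + 1e-12)                   # density-ratio weights

rng = np.random.default_rng(3)
Xs = rng.normal(0.0, 1.0, size=(200, 2))             # source domain
Xt = rng.normal(1.5, 1.0, size=(200, 2))             # shifted target domain
wts = importance_weights(Xs, Xt)
# Source points near the target cloud should receive larger weights
print(wts[Xs.sum(1) > 2].mean() > wts[Xs.sum(1) < -2].mean())
```

Training a source-domain model on these weighted samples approximately minimizes target-domain risk under a covariate-shift assumption; the other families in the survey (feature, classifier, deep network, and adversarial adaptation) move the correction from sample weights into the representation itself.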
Face Recognition: From Traditional to Deep Learning Methods
Starting in the seventies, face recognition has become one of the most
researched topics in computer vision and biometrics. Traditional methods based
on hand-crafted features and traditional machine learning techniques have
recently been superseded by deep neural networks trained with very large
datasets. In this paper we provide a comprehensive and up-to-date literature
review of popular face recognition methods including both traditional
(geometry-based, holistic, feature-based and hybrid methods) and deep learning
methods.
Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems
We cast visual retrieval as a regression problem by posing triplet loss as a
regression loss. This enables epistemic uncertainty estimation using dropout as
a Bayesian approximation framework in retrieval. Accordingly, Monte Carlo (MC)
sampling is leveraged to boost retrieval performance. Our approach is evaluated
on two applications: person re-identification and autonomous car driving.
Results comparable to the state of the art are achieved on multiple datasets for
the former application.
We leverage the Honda driving dataset (HDD) for the autonomous car driving
application. It provides multiple modalities and similarity notions for
ego-motion action understanding. Hence, we present a multi-modal conditional
retrieval network. It disentangles embeddings into separate representations to
encode different similarities. This form of joint learning eliminates the need
to train multiple independent networks without any performance degradation.
Quantitative evaluation highlights our approach's competence, achieving a 6%
improvement in a highly uncertain environment.
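The dropout-as-Bayesian-approximation idea invoked above amounts to keeping dropout active at test time and running the model several times: the spread of the resulting embeddings estimates epistemic uncertainty. A toy sketch with a linear embedding standing in for the retrieval network (the function name and shapes are made up for illustration):

```python
import numpy as np

def mc_dropout_embed(x, W, p=0.5, T=50, rng=None):
    """Monte Carlo dropout at test time: embed x T times with random
    dropout masks; the per-dimension std estimates epistemic uncertainty."""
    rng = rng if rng is not None else np.random.default_rng(0)
    embs = []
    for _ in range(T):
        mask = rng.random(x.shape) > p               # drop inputs with prob p
        embs.append(W @ (x * mask) / (1 - p))        # inverted-dropout scaling
    embs = np.stack(embs)
    return embs.mean(0), embs.std(0)                 # mean embedding, uncertainty

rng = np.random.default_rng(4)
W = rng.normal(size=(8, 32))                         # toy embedding layer
x = rng.normal(size=32)                              # one query feature vector
mu, sigma = mc_dropout_embed(x, W, rng=rng)
print(mu.shape, sigma.shape)
```

In the retrieval setting described above, the MC samples can also be averaged at ranking time, which is how the abstract's performance boost is obtained.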
Learning Channel Inter-dependencies at Multiple Scales on Dense Networks for Face Recognition
We propose a new deep network structure for unconstrained face recognition.
The proposed network integrates several key components together in order to
characterize complex data distributions, such as in unconstrained face images.
Inspired by recent progress in deep networks, we consider some important
concepts, including multi-scale feature learning, dense connections of network
layers, and weighting different network flows, for building our deep network
structure. The developed network is evaluated in unconstrained face matching,
showing the capability of learning complex data distributions caused by face
images with various qualities. Comment: 12 pages