20,969 research outputs found
SANet: Structure-Aware Network for Visual Tracking
Convolutional neural network (CNN) has drawn increasing interest in visual
tracking owing to its powerfulness in feature extraction. Most existing
CNN-based trackers treat tracking as a classification problem. However, these
trackers are sensitive to similar distractors because their CNN models mainly
focus on inter-class classification. To address this problem, we use
self-structure information of object to distinguish it from distractors.
Specifically, we utilize recurrent neural network (RNN) to model object
structure, and incorporate it into CNN to improve its robustness to similar
distractors. Considering that convolutional layers in different levels
characterize the object from different perspectives, we use multiple RNNs to
model object structure in different levels respectively. Extensive experiments
on three benchmarks, OTB100, TC-128 and VOT2015, show that the proposed
algorithm outperforms other methods. Code is released at
http://www.dabi.temple.edu/~hbling/code/SANet/SANet.html.Comment: In CVPR Deep Vision Workshop, 201
Online Feature Selection for Visual Tracking
Object tracking is one of the most important tasks in many applications of computer vision. Many tracking methods use a fixed set of features ignoring that appearance of a target object may change drastically due to intrinsic and extrinsic factors. The ability to dynamically identify discriminative features would help in handling the appearance variability by improving tracking performance. The contribution of this work is threefold. Firstly, this paper presents a collection of several modern feature selection approaches selected among filter, embedded, and wrapper methods. Secondly, we provide extensive tests regarding the classification task intended to explore the strengths and weaknesses of the proposed methods with the goal to identify the right candidates for online tracking. Finally, we show how feature selection mechanisms can be successfully employed for ranking the features used by a tracking system, maintaining high frame rates. In particular, feature selection mounted on the Adaptive Color Tracking (ACT) system operates at over 110 FPS. This work demonstrates the importance of feature selection in online and realtime applications, resulted in what is clearly a very impressive performance, our solutions improve by 3% up to 7% the baseline ACT while providing superior results compared to 29 state-of-the-art tracking methods
Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach
Feature selection is playing an increasingly significant role with respect to
many computer vision applications spanning from object recognition to visual
object tracking. However, most of the recent solutions in feature selection are
not robust across different and heterogeneous set of data. In this paper, we
address this issue proposing a robust probabilistic latent graph-based feature
selection algorithm that performs the ranking step while considering all the
possible subsets of features, as paths on a graph, bypassing the combinatorial
problem analytically. An appealing characteristic of the approach is that it
aims to discover an abstraction behind low-level sensory data, that is,
relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired
generative process that allows the investigation of the importance of a feature
when injected into an arbitrary set of cues. The proposed method has been
tested on ten diverse benchmarks, and compared against eleven state of the art
feature selection methods. Results show that the proposed approach attains the
highest performance levels across many different scenarios and difficulties,
thereby confirming its strong robustness while setting a new state of the art
in feature selection domain.Comment: Accepted at the IEEE International Conference on Computer Vision
(ICCV), 2017, Venice. Preprint cop
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in computer vision
community due to it plays an important role in video surveillance. Many
algorithms has been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criterion. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we shown some applications which takes pedestrian
attributes into consideration and achieve better performance. Finally, we
summarized this paper and give several possible research directions for
pedestrian attributes recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
- …