Online Feature Selection for Visual Tracking
Object tracking is one of the most important tasks in many applications of computer vision. Many tracking methods use a fixed set of features, ignoring that the appearance of a target object may change drastically due to intrinsic and extrinsic factors. The ability to dynamically identify discriminative features would help in handling appearance variability and thereby improve tracking performance. The contribution of this work is threefold. Firstly, this paper presents a collection of several modern feature selection approaches selected among filter, embedded, and wrapper methods. Secondly, we provide extensive tests on the classification task, intended to explore the strengths and weaknesses of the proposed methods with the goal of identifying the right candidates for online tracking. Finally, we show how feature selection mechanisms can be successfully employed for ranking the features used by a tracking system while maintaining high frame rates. In particular, feature selection mounted on the Adaptive Color Tracking (ACT) system operates at over 110 FPS. This work demonstrates the importance of feature selection in online and real-time applications: our solutions improve the baseline ACT by 3% to 7% while providing superior results compared to 29 state-of-the-art tracking methods.
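As an illustration of what ranking features with a filter method can look like, the following is a minimal sketch that scores each feature with a Fisher criterion on labeled target/background samples. The Fisher score is chosen here only as a simple, well-known filter; the abstract does not say which filter, embedded, or wrapper methods the paper evaluates, and the function names `fisher_scores` and `rank_features` are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature for a binary problem.

    samples: list of equal-length feature vectors.
    labels:  list of 0 (background) / 1 (target), one per sample.
    Assumption: both classes contain at least one sample.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [x[j] for x, y in zip(samples, labels) if y == 1]
        neg = [x[j] for x, y in zip(samples, labels) if y == 0]
        # Squared distance between class means over summed class variances.
        between = (mean(pos) - mean(neg)) ** 2
        within = pvariance(pos) + pvariance(neg)
        scores.append(between / (within + 1e-12))  # epsilon avoids division by zero
    return scores


def rank_features(samples, labels):
    """Return feature indices sorted from most to least discriminative."""
    scores = fisher_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


if __name__ == "__main__":
    # Toy data: feature 1 separates the classes, feature 0 does not.
    samples = [[0.5, 0.0], [0.4, 0.1], [0.5, 1.0], [0.4, 0.9]]
    labels = [0, 0, 1, 1]
    print(rank_features(samples, labels))
```

In an online setting, a ranking of this kind would be recomputed on samples drawn from recent frames, and only the top-ranked features would be passed to the tracker.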
Coarse-to-Fine Adaptive People Detection for Video Sequences by Maximizing Mutual Information
Applying people detectors to unseen data is challenging since pattern distributions, such
as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may significantly differ
from the ones of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt
frame by frame people detectors during runtime classification, without requiring any additional
manually labeled ground truth apart from the offline training of the detection model. Such adaptation
makes use of the mutual information of multiple detectors, i.e., similarities and dissimilarities between detectors
estimated and agreed upon by pair-wise correlating their outputs. Globally, the proposed adaptation
discriminates between relevant instants in a video sequence, i.e., identifies the representative frames
for an adaptation of the system. Locally, the proposed adaptation identifies the best configuration
(i.e., detection threshold) of each detector under analysis by maximizing the mutual information.
The proposed coarse-to-fine approach does not
require training the detectors for each new scenario and uses standard people detector outputs, i.e.,
bounding boxes. The experimental results demonstrate that the proposed approach outperforms
state-of-the-art detectors whose optimal threshold configurations are previously determined and
fixed from offline training data.
This work has been partially supported by the Spanish government under the project TEC2014-53176-R
(HAVideo
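The local step described in the abstract above, choosing each detector's threshold so that the detectors agree as much as possible, can be sketched for two detectors on a single frame. This is only an illustration under stated assumptions: the agreement measure below is a Dice-style score over greedily IoU-matched bounding boxes, standing in for the paper's pair-wise correlation, and the names `iou`, `agreement`, and `select_thresholds` are hypothetical.

```python
from itertools import product


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def agreement(boxes_a, boxes_b, min_iou=0.5):
    """Dice-style agreement in [0, 1] between two sets of boxes.

    Assumption: boxes are matched greedily, one-to-one, when IoU >= min_iou.
    Two empty sets, or one empty set, count as no agreement.
    """
    if not boxes_a or not boxes_b:
        return 0.0
    unmatched = list(boxes_b)
    matches = 0
    for a in boxes_a:
        best = max(unmatched, key=lambda b: iou(a, b), default=None)
        if best is not None and iou(a, best) >= min_iou:
            unmatched.remove(best)
            matches += 1
    return 2.0 * matches / (len(boxes_a) + len(boxes_b))


def select_thresholds(dets_a, dets_b, candidates):
    """Return the (threshold_a, threshold_b) pair with the highest agreement.

    dets_a, dets_b: lists of (box, score) produced by two detectors.
    candidates:     thresholds to try for each detector.
    """
    def keep(dets, thr):
        return [box for box, score in dets if score >= thr]

    return max(
        product(candidates, candidates),
        key=lambda t: agreement(keep(dets_a, t[0]), keep(dets_b, t[1])),
    )


if __name__ == "__main__":
    dets_a = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.3)]
    dets_b = [((1, 1, 11, 11), 0.8), ((100, 100, 110, 110), 0.2)]
    print(select_thresholds(dets_a, dets_b, [0.1, 0.5]))
```

No ground truth is used: low-scoring boxes that only one detector produces lower the agreement, so the search favors thresholds that discard them.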
Comparison of fusion methods for thermo-visual surveillance tracking
In this paper, we evaluate the appearance tracking performance of multiple fusion schemes that combine information from standard CCTV and thermal infrared spectrum video for the tracking of surveillance objects, such as people, faces, bicycles and vehicles. We show results on numerous real-world multimodal surveillance sequences, tracking challenging objects whose appearance changes rapidly. Based on these results, we determine the most promising fusion scheme.