Coarse-to-Fine Adaptive People Detection for Video Sequences by Maximizing Mutual Information
Applying people detectors to unseen data is challenging since pattern distributions, such
as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may differ significantly
from those of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt
frame-by-frame people detectors during runtime classification, without requiring any additional
manually labeled ground truth apart from the offline training of the detection model. Such adaptation
makes use of the mutual information of multiple detectors, i.e., similarities and dissimilarities of detectors
estimated and agreed by pair-wise correlating their outputs. Globally, the proposed adaptation
discriminates between relevant instants in a video sequence, i.e., identifies the representative frames
for an adaptation of the system. Locally, the proposed adaptation identifies the best configuration
(i.e., detection threshold) of each detector under analysis by maximizing the mutual information.
The proposed coarse-to-fine approach does not
require training the detectors for each new scenario and uses standard people detector outputs, i.e.,
bounding boxes. The experimental results demonstrate that the proposed approach outperforms
state-of-the-art detectors whose optimal threshold configurations are previously determined and
fixed from offline training data. This work has been partially supported by the Spanish government under the project TEC2014-53176-R
(HAVideo
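The abstract above does not say how detector outputs are correlated pair-wise. As a rough illustration only, the sketch below assumes that two detectors "agree" on a detection when their bounding boxes overlap with an intersection-over-union (IoU) of at least 0.5, and counts agreements and disagreements for one frame. The function names `iou` and `count_agreements`, the `(x1, y1, x2, y2)` box format, the 0.5 threshold and the greedy matching are all assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_agreements(boxes_a, boxes_b, min_iou=0.5):
    """Greedy one-to-one matching between the boxes of two detectors.

    Returns (agreed, only_in_a, only_in_b). The IoU criterion and the
    greedy strategy are assumptions made for this illustration.
    """
    unmatched_b = list(boxes_b)
    agreed = 0
    for a in boxes_a:
        best_j, best_iou = None, min_iou
        for j, b in enumerate(unmatched_b):
            overlap = iou(a, b)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            unmatched_b.pop(best_j)  # each box of detector B is matched at most once
            agreed += 1
    return agreed, len(boxes_a) - agreed, len(unmatched_b)
```

Counts of this kind, gathered per frame and per candidate threshold, are one plausible input for an agreement statistic such as mutual information. The abstract alone does not confirm that the authors compute it this way.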
A Large-scale Distributed Video Parsing and Evaluation Platform
Visual surveillance systems have become one of the largest data sources of
Big Visual Data in the real world. However, existing systems for video analysis
still lack the ability to handle the problems of scalability, expansibility and
error-proneness, though great advances have been achieved in a number of visual
recognition tasks and surveillance applications, e.g., pedestrian/vehicle
detection, people/vehicle counting. Moreover, few algorithms explore the
specific values/characteristics in large-scale surveillance videos. To address
these problems in large-scale video analysis, we develop a scalable video
parsing and evaluation platform through combining some advanced techniques for
Big Data processing, including Spark Streaming, Kafka and Hadoop Distributed
Filesystem (HDFS). Also, a Web User Interface is designed in the system, to
collect users' degrees of satisfaction on the recognition tasks so as to
evaluate the performance of the whole system. Furthermore, the highly
extensible platform running on the long-term surveillance videos makes it
possible to develop more intelligent incremental algorithms to enhance the
performance of various visual recognition tasks. Comment: Accepted by Chinese Conference on Intelligent Visual Surveillance
201
A framework for evaluating stereo-based pedestrian detection techniques
Automated pedestrian detection, counting, and tracking have received significant attention in the computer vision community of late. As such, a variety of techniques have been investigated using both traditional 2-D computer vision techniques and, more recently, 3-D stereo information. However, to date, a quantitative assessment of the performance of stereo-based pedestrian detection has been problematic, mainly due to the lack of standard stereo-based test data and an agreed methodology for carrying out the evaluation. This has forced researchers into making subjective comparisons between competing approaches. In this paper, we propose a framework for the quantitative evaluation of a short-baseline stereo-based pedestrian detection system. We provide freely available synthetic and real-world test data and recommend a set of evaluation metrics. This allows researchers to benchmark systems, not only with respect to other stereo-based approaches, but also with more traditional 2-D approaches. In order to illustrate its usefulness, we demonstrate the application of this framework to evaluate our own recently proposed technique for pedestrian detection and tracking.
Radar-based Road User Classification and Novelty Detection with Recurrent Neural Network Ensembles
Radar-based road user classification is an important yet still challenging
task towards autonomous driving applications. The resolution of conventional
automotive radar sensors results in a sparse data representation which is tough
to recover by subsequent signal processing. In this article, classifier
ensembles originating from a one-vs-one binarization paradigm are enriched by
one-vs-all correction classifiers. They are utilized to efficiently classify
individual traffic participants and also identify hidden object classes which
have not been presented to the classifiers during training. For each classifier
of the ensemble an individual feature set is determined from a total set of 98
features. Thereby, the overall classification performance can be improved when
compared to previous methods and, additionally, novel classes can be identified
much more accurately. Furthermore, the proposed structure provides new
insights into the importance of features for the recognition of individual
classes, which is crucial for the development of new algorithms and sensor
requirements. Comment: 8 pages, 9 figures, accepted paper for 2019 IEEE Intelligent Vehicles
Symposium (IV), Paris, France, June 201
Detection thresholding using mutual information
In this paper, we introduce a novel non-parametric thresholding method that we term Mutual-Information
Thresholding. In our approach, we choose the two detection thresholds for two input signals such that the
mutual information between the thresholded signals is maximised. Two efficient algorithms implementing our
idea are presented: one using dynamic programming to fully explore the quantised search space and the other
method using the Simplex algorithm to perform gradient ascent to significantly speed up the search, under the
assumption of surface convexity. We demonstrate the effectiveness of our approach in foreground detection
(using multi-modal data) and as a component in a person detection system.
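The core idea of this abstract, choosing two thresholds so that the mutual information between the two thresholded signals is maximised, can be sketched as follows. This is a minimal illustration that uses a plain exhaustive search over candidate thresholds; it is neither the dynamic-programming nor the Simplex-based algorithm mentioned in the abstract. The function names `binary_mutual_information` and `mi_thresholds`, the use of NumPy, and the convention that a sample is "detected" when it exceeds the threshold are assumptions.

```python
import numpy as np


def binary_mutual_information(a, b):
    """Mutual information (in bits) between two equal-length boolean arrays."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    mi = 0.0
    for va in (False, True):
        for vb in (False, True):
            p_ab = np.mean((a == va) & (b == vb))  # joint probability
            p_a = np.mean(a == va)                 # marginal of a
            p_b = np.mean(b == vb)                 # marginal of b
            if p_ab > 0:
                mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return float(mi)


def mi_thresholds(x, y, candidates_x, candidates_y):
    """Exhaustively search threshold pairs and keep the pair with maximal MI.

    Returns (best_mi, threshold_x, threshold_y). Ties keep the first pair found.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = (-1.0, None, None)
    for tx in candidates_x:
        for ty in candidates_y:
            mi = binary_mutual_information(x > tx, y > ty)
            if mi > best[0]:
                best = (mi, tx, ty)
    return best
```

For example, `mi_thresholds([0.1, 0.2, 0.8, 0.9], [1, 2, 8, 9], [0.0, 0.5, 1.0], [0, 5, 10])` selects the thresholds 0.5 and 5, where the two thresholded signals coincide exactly and split the samples evenly. The exhaustive search costs one mutual-information evaluation per threshold pair, which is presumably what the faster algorithms in the paper are meant to avoid.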