3,346 research outputs found
Future Frame Prediction for Anomaly Detection -- A New Baseline
Anomaly detection in videos refers to the identification of events that do
not conform to expected behavior. However, almost all existing methods tackle
the problem by minimizing the reconstruction errors of training data, which
cannot guarantee a larger reconstruction error for an abnormal event. In this
paper, we propose to tackle the anomaly detection problem within a video
prediction framework. To the best of our knowledge, this is the first work that
leverages the difference between a predicted future frame and its ground truth
to detect an abnormal event. To predict a future frame with higher quality for
normal events, other than the commonly used appearance (spatial) constraints on
intensity and gradient, we also introduce a motion (temporal) constraint in
video prediction by enforcing the optical flow between predicted frames and
ground truth frames to be consistent, and this is the first work that
introduces a temporal constraint into the video prediction task. Such spatial
and motion constraints facilitate the future frame prediction for normal
events, and consequently facilitate to identify those abnormal events that do
not conform the expectation. Extensive experiments on both a toy dataset and
some publicly available datasets validate the effectiveness of our method in
terms of robustness to the uncertainty in normal events and the sensitivity to
abnormal events.Comment: IEEE Conference on Computer Vision and Pattern Recognition 201
Exploring Human Vision Driven Features for Pedestrian Detection
Motivated by the center-surround mechanism in the human visual attention
system, we propose to use average contrast maps for the challenge of pedestrian
detection in street scenes due to the observation that pedestrians indeed
exhibit discriminative contrast texture. Our main contributions are first to
design a local, statistical multi-channel descriptorin order to incorporate
both color and gradient information. Second, we introduce a multi-direction and
multi-scale contrast scheme based on grid-cells in order to integrate
expressive local variations. Contributing to the issue of selecting most
discriminative features for assessing and classification, we perform extensive
comparisons w.r.t. statistical descriptors, contrast measurements, and scale
structures. This way, we obtain reasonable results under various
configurations. Empirical findings from applying our optimized detector on the
INRIA and Caltech pedestrian datasets show that our features yield
state-of-the-art performance in pedestrian detection.Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT
Interactively Test Driving an Object Detector: Estimating Performance on Unlabeled Data
In this paper, we study the problem of `test-driving' a detector, i.e.
allowing a human user to get a quick sense of how well the detector generalizes
to their specific requirement. To this end, we present the first system that
estimates detector performance interactively without extensive ground truthing
using a human in the loop. We approach this as a problem of estimating
proportions and show that it is possible to make accurate inferences on the
proportion of classes or groups within a large data collection by observing
only of samples from the data. In estimating the false detections (for
precision), the samples are chosen carefully such that the overall
characteristics of the data collection are preserved. Next, inspired by its use
in estimating disease propagation we apply pooled testing approaches to
estimate missed detections (for recall) from the dataset. The estimates thus
obtained are close to the ones obtained using ground truth, thus reducing the
need for extensive labeling which is expensive and time consuming.Comment: Published at Winter Conference on Applications of Computer Vision,
201
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
In recent years, numerous effective multi-object tracking (MOT) methods are
developed because of the wide range of applications. Existing performance
evaluations of MOT methods usually separate the object tracking step from the
object detection step by using the same fixed object detection results for
comparisons. In this work, we perform a comprehensive quantitative study on the
effects of object detection accuracy to the overall MOT performance, using the
new large-scale University at Albany DETection and tRACking (UA-DETRAC)
benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging
video sequences captured from real-world traffic scenes (over 140,000 frames
with rich annotations, including occlusion, weather, vehicle category,
truncation, and vehicle bounding boxes) for object detection, object tracking
and MOT system. We evaluate complete MOT systems constructed from combinations
of state-of-the-art object detection and object tracking methods. Our analysis
shows the complex effects of object detection accuracy on MOT system
performance. Based on these observations, we propose new evaluation tools and
metrics for MOT systems that consider both object detection and object tracking
for comprehensive analysis.Comment: 18 pages, 11 figures, accepted by CVI
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
- …