A Consistent Metric for Performance Evaluation of Multi-Object Filters
The concept of a miss-distance, or error, between a reference quantity and its estimated/controlled value plays a fundamental role in any filtering/control problem. Yet there is no satisfactory notion of a miss-distance in the well-established field of multi-object filtering. In this paper, we outline the inconsistencies of existing metrics in the context of multi-object miss-distances for performance evaluation. We then propose a new mathematically and intuitively consistent metric that addresses the drawbacks of current multi-object performance evaluation metrics.
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
Average precision (AP), the area under the recall-precision (RP) curve, is
the standard performance measure for object detection. Despite its wide
acceptance, it has a number of shortcomings, the most important of which are
(i) the inability to distinguish very different RP curves, and (ii) the lack of
directly measuring bounding box localization accuracy. In this paper, we
propose 'Localization Recall Precision (LRP) Error', a new metric which we
specifically designed for object detection. LRP Error is composed of three
components related to localization, false negative (FN) rate and false positive
(FP) rate. Based on LRP, we introduce the 'Optimal LRP', the minimum achievable
LRP error representing the best achievable configuration of the detector in
terms of recall-precision and the tightness of the boxes. In contrast to AP,
which considers precisions over the entire recall domain, Optimal LRP
determines the 'best' confidence score threshold for a class, which balances
the trade-off between localization and recall-precision. In our experiments, we
show that, for state-of-the-art (SOTA) object detectors, Optimal LRP provides
richer and more discriminative information than AP. We also demonstrate that
the best confidence score thresholds vary significantly among classes and
detectors. Moreover, we present LRP results of a simple online video object
detector which uses a SOTA still image object detector and show that the
class-specific optimized thresholds increase the accuracy against the common
approach of using a general threshold for all classes. At
https://github.com/cancam/LRP we provide the source code that can compute LRP
for the PASCAL VOC and MSCOCO datasets. Our source code can easily be adapted
to other datasets as well.
Comment: to appear in ECCV 201
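The three-component structure described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation (that is at the URL given in the abstract). The function names `lrp_error` and `optimal_lrp`, the normalization by `1 - tau`, and the simplification that each detection arrives pre-matched with a fixed IoU are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def lrp_error(tp_ious: Sequence[float], num_fp: int, num_fn: int,
              tau: float = 0.5) -> float:
    """LRP-style error for one score threshold (illustrative form).

    Sums a normalized localization error over true positives with the
    FP and FN counts, then divides by TP + FP + FN. The exact
    normalization is an assumption of this sketch.
    """
    total = len(tp_ious) + num_fp + num_fn
    if total == 0:
        return 0.0
    localization = sum((1.0 - iou) / (1.0 - tau) for iou in tp_ious)
    return (localization + num_fp + num_fn) / total


def optimal_lrp(detections: Sequence[Tuple[float, float]], num_gt: int,
                tau: float = 0.5) -> Tuple[float, Optional[float]]:
    """Return (minimum error, score threshold that achieves it).

    `detections` holds hypothetical (score, iou) pairs. Simplifying
    assumption: matching is fixed in advance, a detection is a true
    positive when iou >= tau, and no ground truth is matched twice.
    """
    # With every detection rejected, all ground truths are missed.
    best_error = lrp_error([], 0, num_gt, tau)
    best_threshold: Optional[float] = None
    for threshold in sorted({score for score, _ in detections}, reverse=True):
        kept = [iou for score, iou in detections if score >= threshold]
        tp_ious: List[float] = [iou for iou in kept if iou >= tau]
        num_fp = len(kept) - len(tp_ious)
        num_fn = num_gt - len(tp_ious)
        error = lrp_error(tp_ious, num_fp, num_fn, tau)
        if error < best_error:
            best_error, best_threshold = error, threshold
    return best_error, best_threshold
```

Running `optimal_lrp` once per class would yield a separate threshold for each class, which is the class-specific thresholding idea the abstract contrasts with a single general threshold.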
Interpretable Convolutional Neural Networks
This paper proposes a method to modify traditional convolutional neural
networks (CNNs) into interpretable CNNs, in order to clarify knowledge
representations in high conv-layers of CNNs. In an interpretable CNN, each
filter in a high conv-layer represents a certain object part. We do not need
any annotations of object parts or textures to supervise the learning process.
Instead, the interpretable CNN automatically assigns an object part to each
filter in a high conv-layer during the learning process. Our method can be
applied to different types of CNNs with different structures. The clear
knowledge representation in an interpretable CNN can help people understand the
logic inside a CNN, i.e., based on which patterns the CNN makes the decision.
Experiments showed that filters in an interpretable CNN were more semantically
meaningful than those in traditional CNNs.
Comment: In this version, we release the website of the code. Compared to the
previous version, we have corrected all values of location instability in
Table 3--6 by dividing the values by sqrt(2), i.e., a=a/sqrt(2). Such
revisions do NOT decrease the significance of the superior performance of our
method, because we make the same correction to location-instability values of
all baseline
Poisson multi-Bernoulli conjugate prior for multiple extended object filtering
This paper presents a Poisson multi-Bernoulli mixture (PMBM) conjugate prior
for multiple extended object filtering. A Poisson point process is used to
describe the existence of yet undetected targets, while a multi-Bernoulli
mixture describes the distribution of the targets that have been detected. The
prediction and update equations are presented for the standard transition
density and measurement likelihood. Both the prediction and the update preserve
the PMBM form of the density, and in this sense the PMBM density is a conjugate
prior. However, the unknown data associations lead to an intractably large
number of terms in the PMBM density, and approximations are necessary for
tractability. A gamma Gaussian inverse Wishart implementation is presented,
along with methods to handle the data association problem. A simulation study
shows that the extended target PMBM filter performs well in comparison to the
extended target d-GLMB and LMB filters. An experiment with Lidar data
illustrates the benefit of tracking both detected and undetected targets.
Spatio-Temporal Image Boundary Extrapolation
Boundary prediction in images as well as video has been a very active topic
of research and organizing visual information into boundaries and segments is
believed to be a cornerstone of visual perception. While prior work has
focused on predicting boundaries for observed frames, our work aims at
predicting boundaries of future unobserved frames. This requires our model to
learn about the fate of boundaries and extrapolate motion patterns. We
experiment on an established real-world video segmentation dataset, which provides
a testbed for this new task. We show for the first time spatio-temporal
boundary extrapolation in this challenging scenario. Furthermore, we show
long-term prediction of boundaries in situations where the motion is governed
by the laws of physics. We successfully predict boundaries in a billiard
scenario without any assumptions of a strong parametric model or any object
notion. We argue that our model has, with minimal model assumptions, derived
a notion of 'intuitive physics' that can be applied to novel scenes.
Multi-Bernoulli Sensor-Control via Minimization of Expected Estimation Errors
This paper presents a sensor-control method for choosing the best next state
of the sensor(s), that provide(s) accurate estimation results in a multi-target
tracking application. The proposed solution is formulated for a multi-Bernoulli
filter and works via minimization of a new estimation error-based cost
function. Simulation results demonstrate that the proposed method can
outperform the state-of-the-art methods in terms of computation time and
robustness to clutter while delivering similar accuracy.