38,406 research outputs found

    A Consistent Metric for Performance Evaluation of Multi-Object Filters

    Get PDF
    The concept of a miss-distance, or error, between a reference quantity and its estimated/controlled value, plays a fundamental role in any filtering/control problem. Yet there is no satisfactory notion of a miss-distance in the well-established field of multi-object filtering. In this paper, we outline the inconsistencies of existing metrics in the context of multi-object miss-distances for performance evaluation. We then propose a new mathematically and intuitively consistent metric that addresses the drawbacks of current multi-object performance evaluation metrics

    Localization Recall Precision (LRP): A New Performance Metric for Object Detection

    Get PDF
    Average precision (AP), the area under the recall-precision (RP) curve, is the standard performance measure for object detection. Despite its wide acceptance, it has a number of shortcomings, the most important of which are (i) the inability to distinguish very different RP curves, and (ii) the lack of directly measuring bounding box localization accuracy. In this paper, we propose 'Localization Recall Precision (LRP) Error', a new metric which we specifically designed for object detection. LRP Error is composed of three components related to localization, false negative (FN) rate and false positive (FP) rate. Based on LRP, we introduce the 'Optimal LRP', the minimum achievable LRP error representing the best achievable configuration of the detector in terms of recall-precision and the tightness of the boxes. In contrast to AP, which considers precisions over the entire recall domain, Optimal LRP determines the 'best' confidence score threshold for a class, which balances the trade-off between localization and recall-precision. In our experiments, we show that, for state-of-the-art object (SOTA) detectors, Optimal LRP provides richer and more discriminative information than AP. We also demonstrate that the best confidence score thresholds vary significantly among classes and detectors. Moreover, we present LRP results of a simple online video object detector which uses a SOTA still image object detector and show that the class-specific optimized thresholds increase the accuracy against the common approach of using a general threshold for all classes. At https://github.com/cancam/LRP we provide the source code that can compute LRP for the PASCAL VOC and MSCOCO datasets. Our source code can easily be adapted to other datasets as well.Comment: to appear in ECCV 201

    Interpretable Convolutional Neural Networks

    Full text link
    This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in high conv-layers of CNNs. In an interpretable CNN, each filter in a high conv-layer represents a certain object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns each filter in a high conv-layer with an object part during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logics inside a CNN, i.e., based on which patterns the CNN makes the decision. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.Comment: In this version, we release the website of the code. Compared to the previous version, we have corrected all values of location instability in Table 3--6 by dividing the values by sqrt(2), i.e., a=a/sqrt(2). Such revisions do NOT decrease the significance of the superior performance of our method, because we make the same correction to location-instability values of all baseline

    Poisson multi-Bernoulli conjugate prior for multiple extended object filtering

    Full text link
    This paper presents a Poisson multi-Bernoulli mixture (PMBM) conjugate prior for multiple extended object filtering. A Poisson point process is used to describe the existence of yet undetected targets, while a multi-Bernoulli mixture describes the distribution of the targets that have been detected. The prediction and update equations are presented for the standard transition density and measurement likelihood. Both the prediction and the update preserve the PMBM form of the density, and in this sense the PMBM density is a conjugate prior. However, the unknown data associations lead to an intractably large number of terms in the PMBM density, and approximations are necessary for tractability. A gamma Gaussian inverse Wishart implementation is presented, along with methods to handle the data association problem. A simulation study shows that the extended target PMBM filter performs well in comparison to the extended target d-GLMB and LMB filters. An experiment with Lidar data illustrates the benefit of tracking both detected and undetected targets

    Spatio-Temporal Image Boundary Extrapolation

    Get PDF
    Boundary prediction in images as well as video has been a very active topic of research and organizing visual information into boundaries and segments is believed to be a corner stone of visual perception. While prior work has focused on predicting boundaries for observed frames, our work aims at predicting boundaries of future unobserved frames. This requires our model to learn about the fate of boundaries and extrapolate motion patterns. We experiment on established real-world video segmentation dataset, which provides a testbed for this new task. We show for the first time spatio-temporal boundary extrapolation in this challenging scenario. Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics. We successfully predict boundaries in a billiard scenario without any assumptions of a strong parametric model or any object notion. We argue that our model has with minimalistic model assumptions derived a notion of 'intuitive physics' that can be applied to novel scenes

    Multi-Bernoulli Sensor-Control via Minimization of Expected Estimation Errors

    Full text link
    This paper presents a sensor-control method for choosing the best next state of the sensor(s), that provide(s) accurate estimation results in a multi-target tracking application. The proposed solution is formulated for a multi-Bernoulli filter and works via minimization of a new estimation error-based cost function. Simulation results demonstrate that the proposed method can outperform the state-of-the-art methods in terms of computation time and robustness to clutter while delivering similar accuracy
    • …
    corecore