1,277 research outputs found
Detecting Lesion Bounding Ellipses With Gaussian Proposal Networks
Lesions characterized by computed tomography (CT) scans, are arguably often
elliptical objects. However, current lesion detection systems are predominantly
adopted from the popular Region Proposal Networks (RPNs) that only propose
bounding boxes without fully leveraging the elliptical geometry of lesions. In
this paper, we present Gaussian Proposal Networks (GPNs), a novel extension to
RPNs, to detect lesion bounding ellipses. Instead of directly regressing the
rotation angle of the ellipse as the common practice, GPN represents bounding
ellipses as 2D Gaussian distributions on the image plain and minimizes the
Kullback-Leibler (KL) divergence between the proposed Gaussian and the ground
truth Gaussian for object localization. We show the KL divergence loss
approximately incarnates the regression loss in the RPN framework when the
rotation angle is 0. Experiments on the DeepLesion dataset show that GPN
significantly outperforms RPN for lesion bounding ellipse detection thanks to
lower localization error. GPN is open sourced at
https://github.com/baidu-research/GP
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
Average precision (AP), the area under the recall-precision (RP) curve, is
the standard performance measure for object detection. Despite its wide
acceptance, it has a number of shortcomings, the most important of which are
(i) the inability to distinguish very different RP curves, and (ii) the lack of
directly measuring bounding box localization accuracy. In this paper, we
propose 'Localization Recall Precision (LRP) Error', a new metric which we
specifically designed for object detection. LRP Error is composed of three
components related to localization, false negative (FN) rate and false positive
(FP) rate. Based on LRP, we introduce the 'Optimal LRP', the minimum achievable
LRP error representing the best achievable configuration of the detector in
terms of recall-precision and the tightness of the boxes. In contrast to AP,
which considers precisions over the entire recall domain, Optimal LRP
determines the 'best' confidence score threshold for a class, which balances
the trade-off between localization and recall-precision. In our experiments, we
show that, for state-of-the-art object (SOTA) detectors, Optimal LRP provides
richer and more discriminative information than AP. We also demonstrate that
the best confidence score thresholds vary significantly among classes and
detectors. Moreover, we present LRP results of a simple online video object
detector which uses a SOTA still image object detector and show that the
class-specific optimized thresholds increase the accuracy against the common
approach of using a general threshold for all classes. At
https://github.com/cancam/LRP we provide the source code that can compute LRP
for the PASCAL VOC and MSCOCO datasets. Our source code can easily be adapted
to other datasets as well.Comment: to appear in ECCV 201
Sequential optimization for efficient high-quality object proposal generation
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING ++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically our BING++ can run at half speed of BING on CPU, but significantly improve the localization quality by 18.5 and 16.7 percent on both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster
Sequential Optimization for Efficient High-Quality Object Proposal Generation
We are motivated by the need for a generic object proposal generation
algorithm which achieves good balance between object detection recall, proposal
localization quality and computational efficiency. We propose a novel object
proposal algorithm, BING++, which inherits the virtue of good computational
efficiency of BING but significantly improves its proposal localization
quality. At high level we formulate the problem of object proposal generation
from a novel probabilistic perspective, based on which our BING++ manages to
improve the localization quality by employing edges and segments to estimate
object boundaries and update the proposals sequentially. We propose learning
the parameters efficiently by searching for approximate solutions in a
quantized parameter space for complexity reduction. We demonstrate the
generalization of BING++ with the same fixed parameters across different object
classes and datasets. Empirically our BING++ can run at half speed of BING on
CPU, but significantly improve the localization quality by 18.5% and 16.7% on
both VOC2007 and Microhsoft COCO datasets, respectively. Compared with other
state-of-the-art approaches, BING++ can achieve comparable performance, but run
significantly faster.Comment: Accepted by TPAM
Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors
We present a method to infer 3D pose and shape of vehicles from a single
image. To tackle this ill-posed problem, we optimize two-scale projection
consistency between the generated 3D hypotheses and their 2D
pseudo-measurements. Specifically, we use a morphable wireframe model to
generate a fine-scaled representation of vehicle shape and pose. To reduce its
sensitivity to 2D landmarks, we jointly model the 3D bounding box as a coarse
representation which improves robustness. We also integrate three task priors,
including unsupervised monocular depth, a ground plane constraint as well as
vehicle shape priors, with forward projection errors into an overall energy
function.Comment: Proc. of the AAAI, September 201
- …