Metric-aligned Sample Selection and Critical Feature Sampling for Oriented Object Detection
Arbitrary-oriented object detection is an emerging but challenging
task. Although remarkable progress has been made, there still remain many
unsolved issues due to the large diversity of patterns in orientation, scale,
aspect ratio, and visual appearance of objects in aerial images. Most of the
existing methods adopt a coarse-grained fixed label assignment strategy and
suffer from the inconsistency between the classification score and localization
accuracy. First, to resolve the metric inconsistency between sample selection and
regression loss calculation caused by a fixed IoU strategy, we introduce an affine
transformation to evaluate the quality of samples and propose a distance-based
label assignment strategy. The proposed metric-aligned selection (MAS) strategy
can dynamically select samples according to the shape and rotation
characteristics of objects. Second, to further address the inconsistency between
classification and localization, we propose a critical feature sampling (CFS)
module, which performs localization refinement on the sampling location for
the classification task to extract critical features accurately. Third, we present
a scale-controlled smooth loss (SC-Loss) to adaptively select high-quality
samples by changing the form of the regression loss function based on the
statistics of proposals during training. Extensive experiments are conducted on
four challenging rotated object detection datasets: DOTA, FAIR1M-1.0, HRSC2016,
and UCAS-AOD. The results show that the proposed detector achieves
state-of-the-art accuracy.
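
The abstract above mentions using an affine transformation to score candidate samples by distance. As a minimal sketch of that general idea only (not the paper's MAS formulation, which the abstract does not spell out), the hypothetical helper below maps a candidate point into an oriented box's own normalized frame and measures its distance to the box center there. The box parameterization `(cx, cy, w, h, theta)`, the function names, and the `max_dist` threshold are all assumptions made for illustration.

```python
import math

def normalized_distance(px, py, cx, cy, w, h, theta):
    """Distance from a point to the center of an oriented box, measured in
    the box's own normalized frame (hypothetical quality measure).

    The affine map translates by the box center, rotates by -theta, and
    scales each axis by the half-extent, so the box becomes [-1, 1] x [-1, 1].
    """
    dx, dy = px - cx, py - cy
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Inverse rotation into the box-aligned frame, then per-axis scaling.
    u = (cos_t * dx + sin_t * dy) / (w / 2.0)
    v = (-sin_t * dx + cos_t * dy) / (h / 2.0)
    return math.hypot(u, v)

def select_samples(points, box, max_dist=1.0):
    """Keep candidate points whose normalized distance is at most max_dist."""
    return [p for p in points if normalized_distance(p[0], p[1], *box) <= max_dist]
```

Because the distance is computed after the rotation and scaling, a long, thin, rotated box accepts candidates along its long axis and rejects equally distant candidates across its short axis, which is the kind of shape- and rotation-dependent behavior the abstract describes.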
Context-Aware Single-Shot Detector
SSD is one of the state-of-the-art object detection algorithms, and it
combines high detection accuracy with real-time speed. However, it is widely
recognized that SSD is less accurate in detecting small objects compared to
large objects, because it ignores the context from outside the proposal boxes.
In this paper, we present CSSD--a shorthand for context-aware single-shot
multibox object detector. CSSD is built on top of SSD, with additional layers
modeling multi-scale contexts. We describe two variants of CSSD, which differ
in their context layers, using dilated convolution layers (DiCSSD) and
deconvolution layers (DeCSSD) respectively. The experimental results show that
the multi-scale context modeling significantly improves the detection accuracy.
In addition, we study the relationship between effective receptive fields
(ERFs) and the theoretical receptive fields (TRFs), particularly on a VGGNet.
The empirical results further strengthen our conclusion that SSD coupled with
context layers achieves better detection results especially for small objects
( on MS-COCO compared to the newest SSD), while
maintaining comparable runtime performance.
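
For readers unfamiliar with the theoretical receptive field (TRF) mentioned above, the following sketch computes it for a plain stack of convolution or pooling layers using the standard recurrence, in which a dilated kernel of size k and dilation d acts like a kernel of size d(k - 1) + 1. The function name and the layer lists are hypothetical and do not reproduce the CSSD or VGGNet configurations.

```python
def theoretical_receptive_field(layers):
    """Return the theoretical receptive field (in input pixels) of a stack.

    layers: list of (kernel_size, stride, dilation) tuples, ordered from
    input to output. Assumes a simple sequential stack with no branches.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance, in input pixels, between adjacent output positions
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf
```

For example, two stacked 3x3 convolutions with stride 1 and a single 3x3 convolution with dilation 2 both cover a 5x5 input window, which illustrates how dilation enlarges the context seen by a layer without adding parameters.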
Object Detection and Classification in Occupancy Grid Maps using Deep Convolutional Networks
Detailed environment perception is a crucial component of automated
vehicles. However, to deal with the amount of perceived information, we also
require segmentation strategies. Based on a grid map environment
representation, well-suited for sensor fusion, free-space estimation and
machine learning, we detect and classify objects using deep convolutional
neural networks. As input for our networks we use a multi-layer grid map
efficiently encoding 3D range sensor information. The inference output consists
of a list of rotated bounding boxes with associated semantic classes. We
conduct extensive ablation studies, highlight important design considerations
when using grid maps and evaluate our models on the KITTI Bird's Eye View
benchmark. Qualitative and quantitative benchmark results show that we achieve
robust detection and state-of-the-art accuracy solely using top-view grid maps
from range sensor data.
Comment: 6 pages, 4 tables, 4 figures
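
To make the notion of a multi-layer grid map concrete, here is a minimal sketch that rasterizes 3D range points into a top-view grid with two layers. The choice of layers (hit count and maximum height per cell), the function name, and all parameters are assumptions for illustration; the abstract does not state which layers the authors use.

```python
def build_grid_map(points, x_range, y_range, cell_size):
    """Rasterize (x, y, z) points into a two-layer top-view grid.

    Returns (counts, max_height), each a list of rows indexed [row][col].
    Cells with no points have count 0 and max_height None.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_cols = int(round((x_max - x_min) / cell_size))
    n_rows = int(round((y_max - y_min) / cell_size))
    counts = [[0] * n_cols for _ in range(n_rows)]
    max_height = [[None] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        # Skip points that fall outside the mapped area.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        col = min(int((x - x_min) / cell_size), n_cols - 1)
        row = min(int((y - y_min) / cell_size), n_rows - 1)
        counts[row][col] += 1
        if max_height[row][col] is None or z > max_height[row][col]:
            max_height[row][col] = z
    return counts, max_height
```

Each layer is a 2D array over the same cells, so the layers can be stacked as channels of an image-like input to a convolutional network.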