42,058 research outputs found
Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network
The detection performance of small objects in remote sensing images is not
satisfactory compared to large objects, especially in low-resolution and noisy
images. A generative adversarial network (GAN)-based model called enhanced
super-resolution GAN (ESRGAN) shows remarkable image enhancement performance,
but reconstructed images miss high-frequency edge information. Therefore,
object detection performance degrades for small objects on recovered noisy and
low-resolution remote sensing images. Inspired by the success of edge-enhanced
GAN (EEGAN) and ESRGAN, we apply a new edge-enhanced super-resolution GAN
(EESRGAN) to improve the image quality of remote sensing images and use
different detector networks in an end-to-end manner where detector loss is
backpropagated into the EESRGAN to improve the detection performance. We
propose an architecture with three components: ESRGAN, an Edge Enhancement
Network (EEN), and a detection network. We use residual-in-residual dense blocks
(RRDB) for both the ESRGAN and EEN, and for the detector network, we use the
faster region-based convolutional network (FRCNN) (two-stage detector) and
single-shot multi-box detector (SSD) (one-stage detector). Extensive experiments on a
public (car overhead with context) and a self-assembled (oil and gas storage
tank) satellite dataset show superior performance of our method compared to the
standalone state-of-the-art object detectors.
Comment: This paper contains 27 pages and has been accepted for publication in the MDPI
Remote Sensing journal. GitHub Repository:
https://github.com/Jakaria08/EESRGAN (Implementation
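The end-to-end idea in the abstract above, where the detector loss is backpropagated into the enhancement network, can be illustrated with a toy scalar sketch. This is not the authors' implementation: the one-weight "generator" and "detector", the squared loss, and the names `forward`, `detector_loss_grads`, and `train_step` are all hypothetical stand-ins chosen only to show the gradient path.

```python
# Toy scalar stand-ins (hypothetical): a one-weight "generator" and a
# one-weight "detector". Real EESRGAN and FRCNN/SSD models are deep networks.

def forward(g_weight, d_weight, low_res):
    """Run the generator, then feed its output to the detector."""
    enhanced = g_weight * low_res   # generator: enhances the input
    score = d_weight * enhanced     # detector: scores the enhanced image
    return enhanced, score


def detector_loss_grads(g_weight, d_weight, low_res, target):
    """Gradients of the squared detector loss w.r.t. both weights (chain rule)."""
    enhanced, score = forward(g_weight, d_weight, low_res)
    dloss_dscore = 2.0 * (score - target)
    grad_d = dloss_dscore * enhanced            # stays inside the detector
    grad_g = dloss_dscore * d_weight * low_res  # flows back into the generator
    return grad_g, grad_d


def train_step(g_weight, d_weight, low_res, target, lr=0.01):
    """One gradient-descent step that updates generator and detector together."""
    grad_g, grad_d = detector_loss_grads(g_weight, d_weight, low_res, target)
    return g_weight - lr * grad_g, d_weight - lr * grad_d
```

The point of the sketch is that `grad_g` is nonzero: the generator weight is updated by a loss computed only at the detector output, which is what distinguishes end-to-end training from enhancing images first and detecting afterwards.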
Efficient Yet Deep Convolutional Neural Networks for Semantic Segmentation
Semantic segmentation with deep convolutional neural networks is a demanding,
GPU-intensive task. Because the network has to compute millions of parameters,
memory consumption is high. Moreover, extracting finer features and conducting
supervised training tend to increase the complexity. The Fully Convolutional
Neural Network, which uses finer strides and deconvolutional layers for
upsampling, has become the go-to approach for image segmentation. In this
paper, we propose two segmentation architectures that need only one-third of
the parameters yet give better accuracy than similar architectures. The model
weights are transferred from popular networks, VGG19 and VGG16, trained on the
ImageNet classification dataset. We then transform all fully connected layers
into convolutional layers and use dilated convolution to reduce the number of
parameters. Lastly, we add finer strides and attach four skip architectures,
which are element-wise summed with the deconvolutional layers in steps. We
train and test on sparse and fine datasets, namely Pascal VOC2012,
Pascal-Context, and NYUDv2, and show how much better our model performs on
these tasks. In addition, our model has a faster inference time and consumes
less memory for training and testing on NVIDIA Pascal GPUs, making it a more
efficient and less memory-consuming architecture for pixel-wise segmentation.
Comment: 8 page
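To see why replacing a converted fully connected layer with a dilated convolution can cut parameters, here is a small arithmetic sketch. The layer sizes are an assumption for illustration: VGG16's first fully connected layer, viewed as a convolution, is 7x7 over 512 input channels with 4096 outputs. The abstract does not say which kernel sizes or dilation rates the paper uses. The helper names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size, in_channels, out_channels):
    """Number of weights in a square 2-D convolution (bias terms excluded)."""
    return kernel_size * kernel_size * in_channels * out_channels


# Assumed sizes: a 7x7 convolution versus a 3x3 convolution with dilation 3.
dense_extent = effective_kernel_size(7, 1)    # 7
dilated_extent = effective_kernel_size(3, 3)  # 7, the same spatial coverage
dense_weights = conv_weight_count(7, 512, 4096)
dilated_weights = conv_weight_count(3, 512, 4096)
```

Under these assumed sizes, both kernels span a 7x7 window, but the dilated one stores 9 weights per channel pair instead of 49.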
Selective Refinement Network for High Performance Face Detection
High performance face detection remains a very challenging problem,
especially when there are many tiny faces. This paper presents a novel
single-shot face detector, named Selective Refinement Network (SRN), which
introduces novel two-step classification and regression operations selectively
into an anchor-based face detector to reduce false positives and improve
location accuracy simultaneously. In particular, the SRN consists of two
modules: the Selective Two-step Classification (STC) module and the Selective
Two-step Regression (STR) module. The STC aims to filter out most simple
negative anchors from low level detection layers to reduce the search space for
the subsequent classifier, while the STR is designed to coarsely adjust the
locations and sizes of anchors from high level detection layers to provide
better initialization for the subsequent regressor. Moreover, we design a
Receptive Field Enhancement (RFE) block to provide more diverse receptive
fields, which help to better capture faces in some extreme poses. As a
consequence, the proposed SRN detector achieves state-of-the-art performance on
all the widely used face detection benchmarks, including AFW, PASCAL face,
FDDB, and WIDER FACE datasets. Code will be released to facilitate further
studies on the face detection problem.
Comment: The first two authors contributed equally. Corresponding author:
Shifeng Zhang ([email protected]
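The filtering step that the abstract attributes to the STC module, where easy negative anchors are dropped before the second classifier sees them, might look roughly like the sketch below. The function name, the list-based anchor representation, and the default threshold of 0.99 are assumptions for illustration; the abstract does not specify them.

```python
def filter_easy_negatives(anchors, negative_scores, threshold=0.99):
    """Keep only anchors the first-step classifier is not confident are background.

    anchors: list of anchor boxes (any representation).
    negative_scores: first-step background confidence per anchor, in [0, 1].
    threshold: assumed cut-off; anchors scoring above it are discarded.
    """
    if len(anchors) != len(negative_scores):
        raise ValueError("anchors and negative_scores must have the same length")
    return [a for a, s in zip(anchors, negative_scores) if s <= threshold]


# Usage: three hypothetical anchors as (x1, y1, x2, y2) boxes.
anchors = [(0, 0, 8, 8), (4, 4, 12, 12), (16, 16, 24, 24)]
negative_scores = [0.999, 0.40, 0.95]
kept = filter_easy_negatives(anchors, negative_scores)
```

Only the first anchor is removed here, so the second-step classifier works on a smaller and harder set of candidates.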