1,321 research outputs found

    Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network

    Full text link
    The detection performance of small objects in remote sensing images is not satisfactory compared to large objects, especially in low-resolution and noisy images. A generative adversarial network (GAN)-based model called enhanced super-resolution GAN (ESRGAN) shows remarkable image enhancement performance, but reconstructed images miss high-frequency edge information. Therefore, object detection performance degrades for small objects on recovered noisy and low-resolution remote sensing images. Inspired by the success of edge enhanced GAN (EEGAN) and ESRGAN, we apply a new edge-enhanced super-resolution GAN (EESRGAN) to improve the image quality of remote sensing images and use different detector networks in an end-to-end manner where detector loss is backpropagated into the EESRGAN to improve the detection performance. We propose an architecture with three components: ESRGAN, Edge Enhancement Network (EEN), and Detection network. We use residual-in-residual dense blocks (RRDB) for both the ESRGAN and EEN, and for the detector network, we use the faster region-based convolutional network (FRCNN) (two-stage detector) and single-shot multi-box detector (SSD) (one stage detector). Extensive experiments on a public (car overhead with context) and a self-assembled (oil and gas storage tank) satellite dataset show superior performance of our method compared to the standalone state-of-the-art object detectors.Comment: This paper contains 27 pages and accepted for publication in MDPI remote sensing journal. GitHub Repository: https://github.com/Jakaria08/EESRGAN (Implementation

    Crowd Counting with Decomposed Uncertainty

    Full text link
    Research in neural networks in the field of computer vision has achieved remarkable accuracy for point estimation. However, the uncertainty in the estimation is rarely addressed. Uncertainty quantification accompanied by point estimation can lead to a more informed decision, and even improve the prediction quality. In this work, we focus on uncertainty estimation in the domain of crowd counting. With increasing occurrences of heavily crowded events such as political rallies, protests, concerts, etc., automated crowd analysis is becoming an increasingly crucial task. The stakes can be very high in many of these real-world applications. We propose a scalable neural network framework with quantification of decomposed uncertainty using a bootstrap ensemble. We demonstrate that the proposed uncertainty quantification method provides additional insight to the crowd counting problem and is simple to implement. We also show that our proposed method exhibits the state of the art performances in many benchmark crowd counting datasets.Comment: Accepted in AAAI 2020 (Main Technical Track

    An AI-Horticulture Monitoring and Prediction System with Automatic Object Counting

    Get PDF
    Estimating density maps and counting the number of objects of interest from images has a wide range of applications, such as crowd counting, traffic monitoring, cell microscopy in biomedical imaging, plant counting in agronomy, as well as environmental survey. Manual counting is a labor-intensive and time-consuming process. Over the past few years, the topic of automatic object counting by computers has been actively evolving from the classic machine learning methods based on handcrafted image features to end-to-end deep learning methods using data-driven feature engineering, for example by Convolutional Neural Networks (CNNs). In our research, we focus on the task of counting plants for large-scale nursery farms to build an AI-horticulture monitoring and prediction system using unmanned aerial vehicle (UAV) images. The common challenges of automatic object counting as other computer vision tasks are scenario difference, object occlusion, scale variation of views, non-uniform distribution, and perspective difference. For an AI-horticulture monitoring and prediction system for large-scale analysis, the plant species various a lot, so that the image features are different based on different appearance of species. In order to solve these complex problems, the deep convolutional neural network-based approaches are proposed. Our method uses the density map as the ground truth to train the modified classic deep neural networks for object counting regression. Experiments are conducted comparing our proposed models with the state-of-the-art object counting and density estimation approaches. The results demonstrate that our proposed counting model outperforms state-of-the-art approaches by achieving the best counting performance with a mean absolute error of 1.93 and a mean square error of 2.68 on our horticulture nursery plant dataset

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Full text link
    Vehicle classification is a hot computer vision topic, with studies ranging from ground-view up to top-view imagery. In remote sensing, the usage of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and others. However, there are some difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for this task, (b) creating instance segmentation datasets is laborious, and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure considered: (1) label a small number of vehicles, (2) train on those samples, (3) use the model to classify the entire image, (4) convert the image prediction into a polygon shapefile, (5) correct some areas with errors and include them in the training data, and (6) repeat until results are satisfactory. To separate instances, we considered vehicle interior and vehicle borders, and the DL model was the U-net with the Efficient-net-B7 backbone. When removing the borders, the vehicle interior becomes isolated, allowing for unique object identification. To recover the deleted 1-pixel borders, we proposed a simple method to expand each prediction. The results show better pixel-wise metrics when compared to the Mask-RCNN (82% against 67% in IoU). On per-object analysis, the overall accuracy, precision, and recall were greater than 90%. This pipeline applies to any remote sensing target, being very efficient for segmentation and generating datasets.Comment: 38 pages, 10 figures, submitted to journa

    DOTA: A Large-scale Dataset for Object Detection in Aerial Images

    Get PDF
    Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to aerial imagery, not only because of the huge variation in the scale, orientation and shape of the object instances on the earth's surface, but also due to the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collect 28062806 aerial images from different sensors and platforms. Each image is of the size about 4000-by-4000 pixels and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images are then annotated by experts in aerial image interpretation using 1515 common object categories. The fully annotated DOTA images contains 188,282188,282 instances, each of which is labeled by an arbitrary (8 d.o.f.) quadrilateral To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and are quite challenging.Comment: Accepted to CVPR 201
    • 

    corecore