111 research outputs found

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Full text link
    Vehicle classification is a hot computer vision topic, with studies ranging from ground-view up to top-view imagery. In remote sensing, the usage of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and others. However, there are some difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for this task, (b) creating instance segmentation datasets is laborious, and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure considered: (1) label a small number of vehicles, (2) train on those samples, (3) use the model to classify the entire image, (4) convert the image prediction into a polygon shapefile, (5) correct some areas with errors and include them in the training data, and (6) repeat until results are satisfactory. To separate instances, we considered vehicle interior and vehicle borders, and the DL model was the U-net with the Efficient-net-B7 backbone. When removing the borders, the vehicle interior becomes isolated, allowing for unique object identification. To recover the deleted 1-pixel borders, we proposed a simple method to expand each prediction. The results show better pixel-wise metrics when compared to the Mask-RCNN (82% against 67% in IoU). On per-object analysis, the overall accuracy, precision, and recall were greater than 90%. This pipeline applies to any remote sensing target, being very efficient for segmentation and generating datasets.Comment: 38 pages, 10 figures, submitted to journa

    Road Segmentation for Remote Sensing Images using Adversarial Spatial Pyramid Networks

    Full text link
    Road extraction in remote sensing images is of great importance for a wide range of applications. Because of the complex background, and high density, most of the existing methods fail to accurately extract a road network that appears correct and complete. Moreover, they suffer from either insufficient training data or high costs of manual annotation. To address these problems, we introduce a new model to apply structured domain adaption for synthetic image generation and road segmentation. We incorporate a feature pyramid network into generative adversarial networks to minimize the difference between the source and target domains. A generator is learned to produce quality synthetic images, and the discriminator attempts to distinguish them. We also propose a feature pyramid network that improves the performance of the proposed model by extracting effective features from all the layers of the network for describing different scales objects. Indeed, a novel scale-wise architecture is introduced to learn from the multi-level feature maps and improve the semantics of the features. For optimization, the model is trained by a joint reconstruction loss function, which minimizes the difference between the fake images and the real ones. A wide range of experiments on three datasets prove the superior performance of the proposed approach in terms of accuracy and efficiency. In particular, our model achieves state-of-the-art 78.86 IOU on the Massachusetts dataset with 14.89M parameters and 86.78B FLOPs, with 4x fewer FLOPs but higher accuracy (+3.47% IOU) than the top performer among state-of-the-art approaches used in the evaluation

    Remote Sensing Object Detection Meets Deep Learning: A Meta-review of Challenges and Advances

    Full text link
    Remote sensing object detection (RSOD), one of the most fundamental and challenging tasks in the remote sensing field, has received longstanding attention. In recent years, deep learning techniques have demonstrated robust feature representation capabilities and led to a big leap in the development of RSOD techniques. In this era of rapid technical evolution, this review aims to present a comprehensive review of the recent achievements in deep learning based RSOD methods. More than 300 papers are covered in this review. We identify five main challenges in RSOD, including multi-scale object detection, rotated object detection, weak object detection, tiny object detection, and object detection with limited supervision, and systematically review the corresponding methods developed in a hierarchical division manner. We also review the widely used benchmark datasets and evaluation metrics within the field of RSOD, as well as the application scenarios for RSOD. Future research directions are provided for further promoting the research in RSOD.Comment: Accepted with IEEE Geoscience and Remote Sensing Magazine. More than 300 papers relevant to the RSOD filed were reviewed in this surve

    Towards Large-Scale Small Object Detection: Survey and Benchmarks

    Full text link
    With the rise of deep convolutional neural networks, object detection has achieved prominent advances in past years. However, such prosperity could not camouflage the unsatisfactory situation of Small Object Detection (SOD), one of the notoriously challenging tasks in computer vision, owing to the poor visual appearance and noisy representation caused by the intrinsic structure of small targets. In addition, large-scale dataset for benchmarking small object detection methods remains a bottleneck. In this paper, we first conduct a thorough review of small object detection. Then, to catalyze the development of SOD, we construct two large-scale Small Object Detection dAtasets (SODA), SODA-D and SODA-A, which focus on the Driving and Aerial scenarios respectively. SODA-D includes 24828 high-quality traffic images and 278433 instances of nine categories. For SODA-A, we harvest 2513 high resolution aerial images and annotate 872069 instances over nine classes. The proposed datasets, as we know, are the first-ever attempt to large-scale benchmarks with a vast collection of exhaustively annotated instances tailored for multi-category SOD. Finally, we evaluate the performance of mainstream methods on SODA. We expect the released benchmarks could facilitate the development of SOD and spawn more breakthroughs in this field. Datasets and codes are available at: \url{https://shaunyuan22.github.io/SODA}

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Full text link
    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery.Comment: 145 pages with 32 figure

    Understanding cities with machine eyes: A review of deep computer vision in urban analytics

    Get PDF
    Modelling urban systems has interested planners and modellers for decades. Different models have been achieved relying on mathematics, cellular automation, complexity, and scaling. While most of these models tend to be a simplification of reality, today within the paradigm shifts of artificial intelligence across the different fields of science, the applications of computer vision show promising potential in understanding the realistic dynamics of cities. While cities are complex by nature, computer vision shows progress in tackling a variety of complex physical and non-physical visual tasks. In this article, we review the tasks and algorithms of computer vision and their applications in understanding cities. We attempt to subdivide computer vision algorithms into tasks, and cities into layers to show evidence of where computer vision is intensively applied and where further research is needed. We focus on highlighting the potential role of computer vision in understanding urban systems related to the built environment, natural environment, human interaction, transportation, and infrastructure. After showing the diversity of computer vision algorithms and applications, the challenges that remain in understanding the integration between these different layers of cities and their interactions with one another relying on deep learning and computer vision. We also show recommendations for practice and policy-making towards reaching AI-generated urban policies

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    Full text link
    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, of advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as it relates to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin
    • …
    corecore