421 research outputs found

    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development over the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, and text detection, and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication

    DEEP FULLY RESIDUAL CONVOLUTIONAL NEURAL NETWORK FOR SEMANTIC IMAGE SEGMENTATION

    Department of Computer Science and Engineering. The goal of semantic image segmentation is to partition the pixels of an image into semantically meaningful parts and to classify those parts according to a predefined label set. Although object recognition models have recently achieved remarkable performance, even surpassing human ability to recognize objects, semantic segmentation models still lag behind. One reason semantic segmentation is a relatively hard problem is that, unlike object recognition, it requires understanding the image at the pixel level while taking global context into account. Another challenge is transferring the knowledge of an object recognition model to the task of semantic segmentation. In this thesis, we delineate some of the main challenges we faced in approaching semantic image segmentation with machine learning algorithms. Our main focus was on how to use deep learning algorithms for this task, since they require the least feature engineering and have been shown to scale to large datasets with remarkable performance. More precisely, we worked on a variation of convolutional neural networks (CNNs) suited to the semantic segmentation task. We proposed a model called the deep fully residual convolutional network (DFRCN) to tackle this problem. Residual learning makes training deep models feasible, which ultimately leads to a rich, powerful visual representation. Our model also benefits from skip connections, which ease the propagation of information from the encoder module to the decoder module. This enables our model to have fewer parameters in the decoder module while also achieving better performance. We also benchmarked the effective variant of the proposed model on a semantic segmentation benchmark.
    We first thoroughly review current high-performance models and the problems one might face when trying to replicate them, which arise mainly from a lack of sufficient published detail. Then, we describe our own novel method, which we call the deep fully residual convolutional network (DFRCN). We show that our method exhibits state-of-the-art performance on a challenging benchmark for aerial image segmentation.

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures
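
    The normalization and chipping steps named in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `min_max_normalize` and `chip`, the single-band list-of-rows image type, and the choice of min-max scaling are all assumptions made for illustration, and the sketch simply drops chips that would run past the image edge.

```python
from typing import List

# Assumption: a single-band image stored as rows of pixel values.
Image = List[List[float]]


def min_max_normalize(image: Image) -> Image:
    """Linearly rescale pixel values to the range [0, 1]."""
    flat = [value for row in image for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant image has no contrast; map every pixel to 0.0.
        return [[0.0 for _ in row] for row in image]
    return [[(value - lowest) / span for value in row] for row in image]


def chip(image: Image, size: int, stride: int) -> List[Image]:
    """Cut an image into square chips of side `size`, stepping by `stride`.

    Chips that would extend past the image edge are skipped.
    """
    height, width = len(image), len(image[0])
    chips = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            chips.append([row[left:left + size] for row in image[top:top + size]])
    return chips


if __name__ == "__main__":
    # A hypothetical 4x4 image with pixel values 0..15.
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    chips = chip(min_max_normalize(image), size=2, stride=2)
    print(len(chips))
```

    Choosing a stride smaller than the chip size produces overlapping chips, which is one simple way to obtain more training samples from a single scene.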