
    Deep invariant feature learning for remote sensing scene classification

    Image classification, as the core task in the computer vision field, has proceeded at a breakneck pace. This progress is largely attributable to recent advances in deep learning, which has outperformed conventional statistical methods on a plethora of benchmarks and can even surpass humans in specific image classification tasks. Despite exceeding alternative techniques, deep learning models have several apparent disadvantages that prevent them from being deployed for general-purpose use. Specifically, deep learning always requires a considerable amount of well-annotated data to circumvent the problems of over-fitting and the lack of prior knowledge. However, manually labelled data is expensive to acquire and cannot capture the full range of real-world variations. Consequently, deep learning models usually fail when confronted with variations that are underrepresented in the training data. This is the main reason why deep learning models remain barely satisfactory on challenging image recognition tasks that contain nuisance variations, such as Remote Sensing Scene Classification (RSSC). Remote sensing scene classification is the procedure of assigning semantic labels to satellite images that contain complicated variations, such as texture and appearance. Algorithms that effectively understand and recognise remote sensing scene images have the potential to be employed in a broad range of applications, such as urban planning, Land Use and Land Cover (LULC) determination, natural hazard detection, vegetation mapping, and environmental monitoring. This inspires us to design frameworks that can automatically predict precise labels for satellite images. In our research project, we identify and define the challenges in the RSSC community compared with general scene image recognition tasks. Specifically, we summarise the problems from the following perspectives.
1) Visual-semantic ambiguity: the discrepancy between visual features and semantic concepts; 2) Variations: intra-class diversity and inter-class similarity; 3) Cluttered backgrounds; 4) The small size of the training set; 5) Unsatisfactory classification accuracy on large-scale datasets. To address these challenges, we explore a way to dynamically expand the capability to incorporate prior knowledge by transforming the input data, so that we can learn globally invariant second-order features from the transformed data and improve the performance of RSSC tasks. First, we devise a recurrent transformer network (RTN) to progressively discover the discriminative regions of input images and learn the corresponding second-order features. The model is optimised with a pairwise ranking loss so that localising discriminative parts and learning the corresponding features reinforce each other. Second, we observe that existing remote sensing image datasets lack ontological structure. Therefore, a multi-granularity canonical appearance pooling (MG-CAP) model is proposed to automatically discover the implied hierarchical structure of datasets and produce covariance features that contain multi-granularity information. Third, we explore a way to improve the discriminative power of second-order features. To accomplish this, we present a covariance feature embedding (CFE) model that improves the distinctiveness of covariance pooling by using suitable matrix normalisation methods and a low-norm cosine similarity loss to accurately measure distances between high-dimensional features. Finally, we improve the performance of RSSC while using fewer model parameters: an invariant deep compressible covariance pooling (IDCCP) model is presented to boost classification accuracy for RSSC tasks. Meanwhile, we prove the generalisability of our IDCCP model using group theory and manifold optimisation techniques.
All of the proposed frameworks can be optimised in an end-to-end manner and are well supported by GPU acceleration. We conduct extensive experiments on well-known remote sensing scene image datasets to demonstrate the substantial improvements of our proposed methods over state-of-the-art approaches.
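The covariance (second-order) pooling shared by the models above can be illustrated with a minimal sketch. The abstract gives no implementation details, so the snippet below is a generic log-Euclidean covariance-pooling example, not the authors' code; the function name `covariance_pooling`, the `eps` regulariser, and the toy feature map are illustrative assumptions:

```python
import numpy as np

def covariance_pooling(features: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Second-order (covariance) pooling of a CNN feature map.

    features: array of shape (C, H, W), a channel-first feature map.
    Returns a (C, C) log-Euclidean covariance descriptor.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)            # C x N sample matrix
    x = x - x.mean(axis=1, keepdims=True)     # centre each channel
    cov = (x @ x.T) / (h * w - 1)             # C x C covariance matrix
    cov += eps * np.eye(c)                    # regularise so the matrix is SPD
    # Matrix logarithm via eigendecomposition (log-Euclidean normalisation)
    vals, vecs = np.linalg.eigh(cov)
    return (vecs * np.log(vals)) @ vecs.T

# Toy 8-channel, 4x4 feature map
fmap = np.random.default_rng(0).standard_normal((8, 4, 4))
desc = covariance_pooling(fmap)
print(desc.shape)  # (8, 8)
```

The matrix logarithm is one common normalisation for covariance descriptors; the CFE and IDCCP models in the abstract explore further normalisation and compression schemes beyond this basic form.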

    Land scene classification from remote sensing images using improved artificial bee colony optimization algorithm

    Images obtained from remote sensing exhibit background complexity and similarity among objects, which makes the classification of land scenes challenging. Land scene classification is utilized in various fields, such as agriculture, urbanization, and disaster management, to assess the condition of land surfaces and to identify their suitability for planting crops and building construction. Existing methods classify land scenes from remotely sensed images, but background complexity and the presence of similar objects prevent them from providing better results. To overcome these issues, an improved artificial bee colony optimization algorithm with a convolutional neural network (IABC-CNN) model is proposed to achieve better results in classifying land scenes. The images are collected from the Aerial Image Dataset (AID), Northwestern Polytechnical University Remote Sensing Image Scene Classification 45 (NWPU-RESISC45), and University of California Merced (UCM) datasets. IABC effectively selects the best features from those extracted with Visual Geometry Group-16 (VGG-16). The features selected by the IABC are then passed to a multiclass support vector machine (MSVM) for classification. The proposed IABC-CNN achieves a classification accuracy of 96.40% with an error rate of 3.6%.
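The select-then-classify idea in this abstract can be sketched as a search over binary feature masks driven by a fitness function. The sketch below is a deliberately simplified, numpy-only stand-in: it keeps only the greedy employed-bee step of ABC (omitting the onlooker and scout phases) and uses nearest-centroid accuracy as a proxy fitness instead of the paper's VGG-16 features and MSVM; the names `abc_select` and `fitness` and the toy data are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(mask, X, y):
    """Proxy fitness: nearest-centroid accuracy on the selected features."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in np.unique(y)])
    pred = np.argmin(((Xs[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    return (pred == y).mean()

def abc_select(X, y, n_bees=10, n_iters=30):
    """Simplified bee-colony-style search over binary feature masks."""
    d = X.shape[1]
    pop = rng.random((n_bees, d)) < 0.5            # random initial food sources
    fit = np.array([fitness(m, X, y) for m in pop])
    for _ in range(n_iters):
        for i in range(n_bees):                    # employed-bee phase:
            cand = pop[i].copy()                   # flip one feature and keep
            j = rng.integers(d)                    # the candidate if it is no
            cand[j] = ~cand[j]                     # worse (greedy replacement)
            f = fitness(cand, X, y)
            if f >= fit[i]:
                pop[i], fit[i] = cand, f
    best = fit.argmax()
    return pop[best], fit[best]

# Toy data: two informative features among six noisy ones
X = rng.standard_normal((60, 6))
y = (X[:, 0] + X[:, 2] > 0).astype(int)
mask, score = abc_select(X, y)
print(mask, round(score, 3))
```

In the full IABC-CNN pipeline the fitness would instead be the accuracy of an MSVM trained on the masked VGG-16 features.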

    Parsing Objects at a Finer Granularity: A Survey

    Fine-grained visual parsing, including fine-grained part segmentation and fine-grained object recognition, has attracted considerable attention due to its importance in many real-world applications, e.g., agriculture, remote sensing, and space technologies. Predominant research efforts tackle these fine-grained sub-tasks following different paradigms, while the inherent relations between these tasks are neglected. Moreover, given that most of the research remains fragmented, we conduct an in-depth study of the advanced work from the new perspective of learning the part relationship. From this perspective, we first consolidate recent research and benchmark syntheses with new taxonomies. Based on this consolidation, we revisit the universal challenges in fine-grained part segmentation and recognition tasks and propose new solutions by part relationship learning for these important challenges. Furthermore, we identify several promising lines of research in fine-grained visual parsing for future work. Comment: Survey of fine-grained part segmentation and object recognition; accepted by Machine Intelligence Research (MIR).

    Generic Object Detection and Segmentation for Real-World Environments


    Deep Learning-Based Part Labeling of Tree Components in Point Cloud Data

    Point cloud data analysis plays a crucial role in forest management, remote sensing, and wildfire monitoring and mitigation, necessitating robust computer algorithms and pipelines for the segmentation and labeling of tree components. This thesis presents a novel pipeline that employs deep learning models, such as the Point-Voxel Transformer (PVT), and synthetic tree point clouds for automatic tree part segmentation. The pipeline leverages the expertise of environmental artists to enhance the quality and diversity of training data and investigates alternative subsampling methods to optimize model performance. Furthermore, we evaluate various label propagation techniques to improve the labeling of synthetic tree point clouds. By comparing different community detection methods and graph connectivity inference techniques, we demonstrate that K-NN connectivity inference and carefully selected community detection methods significantly enhance labeling accuracy, efficiency, and coverage. The proposed methods hold the potential to improve the quality of forest management and monitoring applications, enable better assessment of wildfire hazards, and facilitate advancements in remote sensing and forestry.
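The K-NN connectivity inference and label propagation mentioned above can be sketched in a few lines. The thesis does not specify its exact propagation rule, so the snippet below is a generic example: a brute-force k-nearest-neighbour graph followed by iterative majority-vote spreading of seed labels (`knn_graph`, `propagate_labels`, the `-1` unlabelled convention, and the toy two-cluster data are all illustrative assumptions):

```python
import numpy as np

def knn_graph(points, k=4):
    """Infer connectivity by linking each point to its k nearest neighbours."""
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]  # (N, k) neighbour indices

def propagate_labels(points, labels, k=4, n_iters=10):
    """Spread seed labels (-1 = unlabelled) over the k-NN graph by majority vote."""
    nbrs = knn_graph(points, k)
    labels = labels.copy()
    for _ in range(n_iters):
        for i in np.where(labels == -1)[0]:
            votes = labels[nbrs[i]]
            votes = votes[votes >= 0]    # ignore still-unlabelled neighbours
            if votes.size:
                labels[i] = np.bincount(votes).argmax()
    return labels

# Two well-separated clusters, one seed label each
rng = np.random.default_rng(2)
pts = np.vstack([rng.normal(0, 0.1, (10, 3)), rng.normal(5, 0.1, (10, 3))])
seeds = -np.ones(20, dtype=int)
seeds[0], seeds[10] = 0, 1
out = propagate_labels(pts, seeds)
print(out)
```

A real pipeline would replace the brute-force distance matrix with a spatial index and could swap the majority vote for one of the community detection methods the thesis compares.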