317 research outputs found

    Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks

    Full text link
    Semantic labeling (or pixel-level land-cover classification) in ultra-high resolution imagery (< 10cm) requires statistical models able to learn high level concepts from spatial data, with large appearance variations. Convolutional Neural Networks (CNNs) achieve this goal by learning discriminatively a hierarchy of representations of increasing abstraction. In this paper we present a CNN-based system relying on an downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolutions. By doing so, the CNN learns to densely label every pixel at the original resolution of the image. This results in many advantages, including i) state-of-the-art numerical accuracy, ii) improved geometric accuracy of predictions and iii) high efficiency at inference time. We test the proposed system on the Vaihingen and Potsdam sub-decimeter resolution datasets, involving semantic labeling of aerial images of 9cm and 5cm resolution, respectively. These datasets are composed by many large and fully annotated tiles allowing an unbiased evaluation of models making use of spatial information. We do so by comparing two standard CNN architectures to the proposed one: standard patch classification, prediction of local label patches by employing only convolutions and full patch labeling by employing deconvolutions. All the systems compare favorably or outperform a state-of-the-art baseline relying on superpixels and powerful appearance descriptors. The proposed full patch labeling CNN outperforms these models by a large margin, also showing a very appealing inference time.Comment: Accepted in IEEE Transactions on Geoscience and Remote Sensing, 201

    Kernel Manifold Alignment

    Full text link
    We introduce a kernel method for manifold alignment (KEMA) and domain adaptation that can match an arbitrary number of data sources without needing corresponding pairs, just few labeled examples in all domains. KEMA has interesting properties: 1) it generalizes other manifold alignment methods, 2) it can align manifolds of very different complexities, performing a sort of manifold unfolding plus alignment, 3) it can define a domain-specific metric to cope with multimodal specificities, 4) it can align data spaces of different dimensionality, 5) it is robust to strong nonlinear feature deformations, and 6) it is closed-form invertible which allows transfer across-domains and data synthesis. We also present a reduced-rank version for computational efficiency and discuss the generalization performance of KEMA under Rademacher principles of stability. KEMA exhibits very good performance over competing methods in synthetic examples, visual object recognition and recognition of facial expressions tasks

    Non-convex regularization in remote sensing

    Get PDF
    In this paper, we study the effect of different regularizers and their implications in high dimensional image classification and sparse linear unmixing. Although kernelization or sparse methods are globally accepted solutions for processing data in high dimensions, we present here a study on the impact of the form of regularization used and its parametrization. We consider regularization via traditional squared (2) and sparsity-promoting (1) norms, as well as more unconventional nonconvex regularizers (p and Log Sum Penalty). We compare their properties and advantages on several classification and linear unmixing tasks and provide advices on the choice of the best regularizer for the problem at hand. Finally, we also provide a fully functional toolbox for the community.Comment: 11 pages, 11 figure

    Detecting animals in African Savanna with UAVs and the crowds

    Full text link
    Unmanned aerial vehicles (UAVs) offer new opportunities for wildlife monitoring, with several advantages over traditional field-based methods. They have readily been used to count birds, marine mammals and large herbivores in different environments, tasks which are routinely performed through manual counting in large collections of images. In this paper, we propose a semi-automatic system able to detect large mammals in semi-arid Savanna. It relies on an animal-detection system based on machine learning, trained with crowd-sourced annotations provided by volunteers who manually interpreted sub-decimeter resolution color images. The system achieves a high recall rate and a human operator can then eliminate false detections with limited effort. Our system provides good perspectives for the development of data-driven management practices in wildlife conservation. It shows that the detection of large mammals in semi-arid Savanna can be approached by processing data provided by standard RGB cameras mounted on affordable fixed wings UAVs

    Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models

    Full text link
    In remote sensing images, the absolute orientation of objects is arbitrary. Depending on an object's orientation and on a sensor's flight path, objects of the same semantic class can be observed in different orientations in the same image. Equivariance to rotation, in this context understood as responding with a rotated semantic label map when subject to a rotation of the input image, is therefore a very desirable feature, in particular for high capacity models, such as Convolutional Neural Networks (CNNs). If rotation equivariance is encoded in the network, the model is confronted with a simpler task and does not need to learn specific (and redundant) weights to address rotated versions of the same object class. In this work we propose a CNN architecture called Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation equivariance in the network itself. By using rotating convolutions as building blocks and passing only the the values corresponding to the maximally activating orientation throughout the network in the form of orientation encoding vector fields, RotEqNet treats rotated versions of the same object with the same filter bank and therefore achieves state-of-the-art performances even when using very small architectures trained from scratch. We test RotEqNet in two challenging sub-decimeter resolution semantic labeling problems, and show that we can perform better than a standard CNN while requiring one order of magnitude less parameters

    Advances in Hyperspectral Image Classification: Earth monitoring with statistical learning methods

    Full text link
    Hyperspectral images show similar statistical properties to natural grayscale or color photographic images. However, the classification of hyperspectral images is more challenging because of the very high dimensionality of the pixels and the small number of labeled examples typically available for learning. These peculiarities lead to particular signal processing problems, mainly characterized by indetermination and complex manifolds. The framework of statistical learning has gained popularity in the last decade. New methods have been presented to account for the spatial homogeneity of images, to include user's interaction via active learning, to take advantage of the manifold structure with semisupervised learning, to extract and encode invariances, or to adapt classifiers and image representations to unseen yet similar scenes. This tutuorial reviews the main advances for hyperspectral remote sensing image classification through illustrative examples.Comment: IEEE Signal Processing Magazine, 201

    Network-based correlated correspondence for unsupervised domain adaptation of hyperspectral satellite images

    Get PDF
    International audienceAdapting a model to changes in the data distribution is a relevant problem in machine learning and pattern recognition since such changes degrade the performances of classifiers trained on undistorted samples. This paper tackles the problem of domain adaptation in the context of hyperspectral satellite image analysis. We propose a new correlated correspondence algorithm based on network analysis. The algorithm finds a matching between two distributions, which preserves the geometrical and topological information of the corresponding graphs. We evaluate the performance of the algorithm on a shadow compensation problem in hyperspectral image analysis: the land use classification obtained with the compensated data is improve
    corecore