905 research outputs found

    Toulouse Hyperspectral Data Set: a benchmark data set to assess semi-supervised spectral representation learning and pixel-wise classification techniques

    Airborne hyperspectral images can be used to map the land cover in large urban areas, thanks to their very high spatial and spectral resolutions over a wide spectral domain. While the spectral dimension of hyperspectral images is highly informative of the chemical composition of the land surface, the use of state-of-the-art machine learning algorithms to map the land cover has been dramatically limited by the availability of training data. To cope with the scarcity of annotations, semi-supervised and self-supervised techniques have lately attracted a lot of interest in the community. Yet the publicly available hyperspectral data sets commonly used to benchmark machine learning models are not fully suited to evaluating their generalization performance, due to one or several of the following properties: a limited geographical coverage (which does not reflect the spectral diversity in metropolitan areas), a small number of land cover classes, and a lack of appropriate standard train / test splits for semi-supervised and self-supervised learning. Therefore, we release in this paper the Toulouse Hyperspectral Data Set, which stands out from other data sets in the above-mentioned respects in order to address key issues in spectral representation learning and classification over large-scale hyperspectral images with very few labeled pixels. In addition, we discuss and experiment with the self-supervised task of Masked Autoencoders and establish a baseline for pixel-wise classification based on a conventional autoencoder combined with a Random Forest classifier, achieving 82% overall accuracy and 74% F1 score. The Toulouse Hyperspectral Data Set and our code are publicly available at https://www.toulouse-hyperspectral-data-set.com and https://www.github.com/Romain3Ch216/tlse-experiments, respectively. Comment: 17 pages, 13 figures

    Harmonized Landsat 8 and Sentinel-2 Time Series Data to Detect Irrigated Areas: An Application in Southern Italy

    The lack of accurate and up-to-date data on irrigated areas and the related irrigation amounts is hampering full implementation of, and compliance with, the Water Framework Directive (WFD). In this paper, we describe the framework that we developed and implemented within the DIANA project to map the actual extent of irrigated areas in the Campania region (Southern Italy) during the 2018 irrigation season. For this purpose, we considered 202 images from the Harmonized Landsat Sentinel-2 (HLS) products (57 images from Landsat 8 and 145 images from Sentinel-2). These data were preprocessed to extract a multitemporal Normalized Difference Vegetation Index (NDVI) map, which was then smoothed through a gap-filling algorithm. We further integrated data from the high-resolution (4 km) global satellite precipitation product Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN)-Cloud Classification System (CCS). We collected extensive ground truth in the field, comprising 2992 data points from three main thematic classes: bare soil and rainfed (class 0), herbaceous (class 1), and tree crop (class 2). This information was used to generate irrigated area maps with a machine learning classification approach. We compared six different types of classifiers through cross-validation and found that, in general, random forests, support vector machines, and boosted decision trees exhibited the best performance in terms of classification accuracy and robustness across the tested scenarios. We found an overall accuracy close to 90% in discriminating among the three thematic classes, which highlights the promising capabilities of HLS products for detecting irrigated areas.
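
    The NDVI step mentioned in the abstract above can be illustrated with a minimal sketch. NDVI is the standard ratio (NIR - Red) / (NIR + Red). The abstract does not say which gap-filling algorithm was used, so the linear interpolation below is only an assumption for illustration, and the function names `ndvi` and `fill_gaps` are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when the denominator is zero (index undefined).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def fill_gaps(series):
    """Fill None entries in a time series by linear interpolation.

    ASSUMPTION: linear interpolation stands in for the unspecified
    gap-filling algorithm of the paper. Gaps at the edges copy the
    nearest valid value.
    """
    valid = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    if not valid:
        return filled  # nothing to interpolate from
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


# Example: three acquisition dates, the middle one lost to cloud cover.
observations = [ndvi(0.5, 0.1), None, ndvi(0.6, 0.2)]
smoothed = fill_gaps(observations)
```

    In practice the same computation would be applied per pixel across the full image stack; the scalar version only shows the arithmetic.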

    Semi-supervised learning with constrained virtual support vector machines for classification of remote sensing image data

    We introduce two semi-supervised models for the classification of remote sensing image data. The models are built upon the framework of Virtual Support Vector Machines (VSVM). Generally, VSVM follow a two-step learning procedure: A Support Vector Machines (SVM) model is learned to determine and extract labeled samples that constitute the decision boundary with the maximum margin between thematic classes, i.e., the Support Vectors (SVs). The SVs govern the creation of so-called virtual samples. This is done by modifying, i.e., perturbing, the image features to which a decision boundary needs to be invariant. Subsequently, the classification model is learned for a second time by using the newly created virtual samples in addition to the SVs to eventually find a new optimal decision boundary. Here, we extend this concept by (i) integrating a constrained set of semilabeled samples when establishing the final model. Thereby, the model constrainment, i.e., the selection mechanism for including solely informative semi-labeled samples, is built upon a self-learning procedure composed of two active learning heuristics. Additionally, (ii) we consecutively deploy semi-labeled samples for the creation of semi-labeled virtual samples by modifying the image features of semi-labeled samples that have become semi-labeled SVs after an initial model run. We present experimental results from classifying two multispectral data sets with a sub-meter geometric resolution. The proposed semi-supervised VSVM models exhibit the most favorable performance compared to related SVM and VSVM-based approaches, as well as (semi-)supervised CNNs, in situations with a very limited amount of available prior knowledge, i.e., labeled samples

    Deep multitask learning with label interdependency distillation for multicriteria street-level image classification

    Multitask learning (MTL) aims at beneficial joint solving of multiple prediction problems by sharing information across different tasks. However, without adequate consideration of interdependencies, MTL models are prone to miss valuable information. In this paper, we introduce a novel deep MTL architecture that specifically encodes cross-task interdependencies within the setting of multiple image classification problems. Based on task-wise interim class label probability predictions by an intermediately supervised hard parameter sharing convolutional neural network, interdependencies are inferred in two ways: i) by directly stacking label probability sequences to the image feature vector (i.e., multitask stacking), and ii) by passing probability sequences to gated recurrent unit-based recurrent neural networks to explicitly learn cross-task interdependency representations and stacking those to the image feature vector (i.e., interdependency representation learning). The proposed MTL architecture is applied as a tool for generic multi-criteria building characterization using street-level imagery related to risk assessments toward multiple natural hazards. Experimental results for classifying buildings according to five vulnerability-related target variables (i.e., five learning tasks), namely height, lateral load-resisting system material, seismic building structural type, roof shape, and block position, are obtained for the Chilean capital Santiago de Chile. Our MTL methods with cross-task label interdependency modeling consistently outperform single task learning (STL) and classical hard parameter sharing MTL alike. Even when starting already from high classification accuracy levels, estimated generalization capabilities can be further improved by considerable margins of accumulated task-specific residuals beyond +6% κ. Thereby, the combination of multitask stacking and interdependency representation learning attains the highest accuracy estimates for the addressed task and data setting (up to cross-task accuracy mean values of 88.43% overall accuracy and 84.49% κ). From an efficiency perspective, the proposed MTL methods turn out to be substantially favorable compared to STL in terms of training time consumption.
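
    The abstract above reports results as overall accuracy and κ. Assuming κ denotes Cohen's kappa (the abstract does not define it), the following sketch shows how both figures could be derived from a confusion matrix. The function names and the example matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy
    and p_e the agreement expected from the row and column marginals.
    Undefined (division by zero) when p_e equals 1.
    """
    n = sum(sum(row) for row in cm)
    size = len(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(size)) for j in range(size)]
    p_o = overall_accuracy(cm)
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Made-up two-class example: rows are reference labels, columns predictions.
example = [[40, 10],
           [5, 45]]
oa = overall_accuracy(example)
kappa = cohen_kappa(example)
```

    Kappa is lower than overall accuracy whenever some agreement is expected by chance, which is why the two figures in the abstract differ.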

    Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects

    Hyperspectral Imaging (HSI) has been extensively utilized in many real-life applications because it benefits from the detailed spectral information contained in each pixel. Notably, the complex characteristics of HSI data, i.e., the nonlinear relation between the captured spectral information and the corresponding object, make accurate classification challenging for traditional methods. In the last few years, Deep Learning (DL) has been established as a powerful feature extractor that effectively addresses the nonlinear problems arising in a number of computer vision tasks. This has prompted the deployment of DL for HSI classification (HSIC), where it has shown good performance. This survey provides a systematic overview of DL for HSIC and compares state-of-the-art strategies on the topic. We first summarize the main challenges of traditional machine learning for HSIC and then explain the advantages of DL in addressing these problems. The survey breaks down the state-of-the-art DL frameworks into spectral-feature, spatial-feature, and joint spatial-spectral-feature approaches to systematically analyze the achievements of these frameworks for HSIC, as well as future research directions. Moreover, we consider the fact that DL requires a large number of labeled training examples, whereas acquiring such a number for HSIC is challenging in terms of time and cost. Therefore, this survey discusses some strategies to improve the generalization performance of DL strategies, which can provide some future guidelines.

    Multi-target regressor chains with repetitive permutation scheme for characterization of built environments with remote sensing

    Multi-task learning techniques allow the beneficial joint estimation of multiple target variables. Here, we propose a novel multi-task regression (MTR) method called ensemble of regressor chains with repetitive permutation scheme. It belongs to the family of problem transformation based MTR methods which foresee the creation of an individual model per target variable. Subsequently, the combination of the separate models allows obtaining an overall prediction. Our method builds upon the concept of so-called ensemble of regressor chains which align single-target models along a flexible permutation, i.e., chain. However, in order to particularly address situations with a small number of target variables, we equip ensemble of regressor chains with a repetitive permutation scheme. Thereby, estimates of the target variables are cascaded to subsequent models as additional features when learning along a chain, whereby one target variable can occupy multiple elements of the chain. We provide experimental evaluation of the method by jointly estimating built-up height and built-up density based on features derived from Sentinel-2 data for the four largest cities in Germany in a comparative setup. We also consider single-target stacking, multi-target stacking, and ensemble of regressor chains without repetitive permutation. Empirical results underline the beneficial performance properties of MTR methods. Our ensemble of regressor chain with repetitive permutation scheme approach achieved most frequently the highest accuracies compared to the other MTR methods, whereby mean improvements across the experiments of 14.5% compared to initial single-target models could be achieved

    Pixel-level semantic understanding of ophthalmic images and beyond

    Computer-assisted semantic image understanding constitutes the substrate of applications that range from biomarker detection to intraoperative guidance or street scene understanding for self-driving systems. This PhD thesis is on the development of deep learning-based, pixel-level, semantic segmentation methods for medical and natural images. For vessel segmentation in OCT-A, a method comprising iterative refinement of the extracted vessel maps and an auxiliary loss function that penalizes structural inaccuracies is proposed and tested on data captured under real clinical conditions comprising various pathological cases. Ultimately, the presented method enables the extraction of a detailed vessel map of the retina, with potential applications to diagnostics or intraoperative localization. Furthermore, for scene segmentation in cataract surgery, class imbalance is identified as the major challenge among several factors. Subsequently, a method addressing it is proposed, achieving state-of-the-art performance on a challenging public dataset. Accurate semantic segmentation in this domain can be used to monitor interactions between tools and anatomical parts for intraoperative guidance and safety. Finally, this thesis proposes a novel contrastive learning framework for supervised semantic segmentation that aims to improve the discriminative power of features in deep neural networks. The proposed approach leverages a contrastive loss function applied both at multiple model layers and across them. Importantly, the proposed framework is easy to combine with various model architectures and is experimentally shown to significantly improve performance in both the natural and medical domains.

    Contributions to Ensemble Classifiers with Image Analysis Applications

    134 pp. This thesis has two fundamental aspects: on the one hand, the proposal of new classifier architectures and, on the other, their application to image analysis.

    In terms of new classification architectures, the thesis makes two main contributions. The first is an innovative classifier ensemble based on randomized architectures such as Extreme Learning Machines (ELM), Random Forest (RF), and Rotation Forest, called Hybrid Extreme Rotation Forest (HERF), together with its improvement, Anticipative HERF (AHERF), which adds model selection based on prediction performance for each specific data set. In addition, we provide a formal proof both for AHERF and for the convergence of ensembles of ELM regressors, which improves the usability and reproducibility of the results.

    On the application side, we worked with two types of images: hyperspectral remote sensing images, and medical images, covering both specific blood vessel pathologies and the diagnosis of Alzheimer's disease. In all cases, classifier ensembles were the common tool, along with specific active learning strategies based on those ensembles. For blood vessel segmentation in particular, we tackled two problems: one related to the thrombi of Abdominal Aortic Aneurysms in 3D computed tomography images, and the other the segmentation of blood vessels in the retina. In both cases, the results in terms of classification performance and time saved in human segmentation allow us to recommend these approaches for clinical practice.

    Chapter 1. Background and contributions

    Given the limited space available for the thesis summary, we have chosen to include a general summary of the most important points, a short introduction that can serve as background for understanding the basic concepts of each topic covered, and a list of the most important contributions.

    1.1 Classifier ensembles

    The idea of classifier ensembles was proposed by Hansen and Salamon [4] in the context of learning with artificial neural networks. Their work showed that an ensemble of neural networks with a group consensus scheme could improve on the result obtained with a single neural network. Classifier ensembles seek better classification results by combining weak and diverse classifiers [8, 9]. The initial ensemble proposal contained a homogeneous collection of individual classifiers. Random Forest is a clear example, since it combines the output of a collection of decision trees by majority voting [2, 3], and it is built using a resampling technique on the data set together with random variable selection.

    1.2 Active learning

    Building a supervised classifier consists of learning a mapping from data features to a set of classes, given a labeled training set. In many real-life situations, obtaining the labels of the training set is costly, slow, and error-prone. This makes building the training set a cumbersome task that requires an exhaustive manual analysis of the image, normally carried out by visual inspection of the images and pixel-by-pixel labeling. As a consequence, the training set is highly redundant, which makes the model training phase very slow. Moreover, noisy pixels can interfere with the statistics of each class, which can lead to classification errors and/or overfitting. It is therefore desirable to build the training set intelligently, meaning that it should correctly represent the class boundaries by sampling discriminative pixels. Generalization is the ability to correctly label data that have not been seen before and are therefore new to the model. Active learning tries to exploit interaction with a user, who provides the labels of the training samples, with the goal of obtaining the most accurate classification using the smallest possible training set.

    1.3 Alzheimer's disease

    Alzheimer's disease is one of the most important causes of disability in the elderly. Given the population aging that is a reality in many countries, with rising life expectancy and a growing number of elderly people, the number of patients with dementia will also increase. Because of the socioeconomic importance of the disease in Western countries, there is a strong international effort focused on Alzheimer's disease. In the early stages of the disease, brain atrophy is usually subtle and spatially distributed over different brain regions, including the entorhinal cortex, the hippocampus, the lateral and inferior temporal structures, and the anterior and posterior cingulate. Many efforts are devoted to designing computational algorithms that try to find imaging biomarkers for the non-invasive diagnosis of Alzheimer's disease and other neurodegenerative diseases.

    1.4 Blood vessel segmentation

    Blood vessel segmentation [1, 7, 6] is one of the essential computational tools for the clinical evaluation of vascular diseases. It consists of partitioning an angiogram into two non-overlapping regions: the vascular region and the background. Based on the results of this partition, vascular surfaces can be extracted, modeled, manipulated, measured, and visualized. These structures are very useful and play a very important role in the endovascular treatment of vascular diseases, which are one of the main sources of morbidity and mortality worldwide.

    Abdominal Aortic Aneurysm. An Abdominal Aortic Aneurysm (AAA) is a local dilation of the aorta that occurs between the renal and iliac arteries. The weakening of the aortic wall leads to its deformation and the formation of a thrombus. Generally, an AAA is diagnosed when the minimum anteroposterior diameter of the aorta reaches 3 centimeters [5]. Most aortic aneurysms are asymptomatic and uncomplicated. Aneurysms that cause symptoms carry a higher risk of rupture. Abdominal pain or back pain are the two main clinical features suggesting either recent expansion or leakage. Complications are often a matter of life or death and can occur within a short period of time. The challenge is therefore to diagnose the onset of symptoms as early as possible.

    Retinal images. The evaluation of fundus images is a diagnostic tool for vascular and non-vascular pathology. Such inspection can reveal hypertension, diabetes, arteriosclerosis, cardiovascular disease, and stroke. The main challenges for retinal vessel segmentation are: (1) the presence of lesions that can be misinterpreted as blood vessels; (2) low contrast around the thinnest vessels; (3) multiple scales of vessel size.

    1.5 Contributions

    This thesis makes two types of contributions: computational contributions, and application-oriented or practical contributions.

    From a computational point of view, the contributions are the following:
    • A new active learning scheme using Random Forest and uncertainty estimation that enables fast, accurate, and interactive image segmentation.
    • Hybrid Extreme Rotation Forest.
    • Adaptative Hybrid Extreme Rotation Forest.
    • Spectral-spatial semi-supervised learning methods.
    • Nonlinear unmixing and reconstruction using ensembles of ELM regressors.

    From a practical point of view:
    • Medical images
        • Active learning combined with HERF for the segmentation of computed tomography images.
        • Improving active learning for the segmentation of computed tomography images with domain information.
        • Active learning with the bootstrapped dendritic classifier applied to medical image segmentation.
        • Meta-ensembles of classifiers for Alzheimer's disease detection with magnetic resonance images.
        • Random Forest combined with active learning for retinal image segmentation.
        • Automatic segmentation of subcutaneous and visceral fat using magnetic resonance imaging.
    • Hyperspectral images
        • Nonlinear unmixing and reconstruction using ensembles of ELM regressors.
        • Spectral-spatial semi-supervised learning methods with spatial correction using AHERF.
        • Semi-supervised classification method using ensembles of ELMs with spatial regularization.
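
    Two ideas from the thesis summary above can be sketched briefly: the majority voting that Random Forest uses to combine its trees (section 1.1), and the use of ensemble uncertainty to choose which samples a user should label in active learning. The summary does not specify the uncertainty measure, so the vote entropy used here is an assumption, and all function names are hypothetical.

```python
import math
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by most ensemble members.

    Ties go to the label that was encountered first.
    """
    return Counter(votes).most_common(1)[0][0]


def vote_entropy(votes):
    """Shannon entropy of the vote distribution (0 means full agreement).

    ASSUMPTION: vote entropy stands in for the thesis's unspecified
    uncertainty measure.
    """
    n = len(votes)
    counts = Counter(votes)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def most_uncertain(votes_per_sample, k):
    """Indices of the k samples on which the ensemble disagrees most.

    In an active learning loop these would be shown to the user
    for labeling before retraining the ensemble.
    """
    ranked = sorted(
        range(len(votes_per_sample)),
        key=lambda i: vote_entropy(votes_per_sample[i]),
        reverse=True,
    )
    return ranked[:k]


# Votes of a three-member ensemble on three pixels (made-up labels).
votes = [
    ["vessel", "vessel", "vessel"],
    ["vessel", "background", "vessel"],
    ["vessel", "background", "lesion"],
]
labels = [majority_vote(v) for v in votes]
to_label = most_uncertain(votes, k=1)
```

    Querying only the most ambiguous pixels is what lets active learning keep the training set small, as the summary describes.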

    A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges

    Vehicle re-identification (ReID) endeavors to associate vehicle images collected from a distributed network of cameras spanning diverse traffic environments. This task assumes paramount importance within the spectrum of vehicle-centric technologies, playing a pivotal role in deploying Intelligent Transportation Systems (ITS) and advancing smart city initiatives. Rapid advancements in deep learning have significantly propelled the evolution of vehicle ReID technologies in recent years. Consequently, undertaking a comprehensive survey of methodologies centered on deep learning for vehicle re-identification has become imperative and inescapable. This paper extensively explores deep learning techniques applied to vehicle ReID. It outlines the categorization of these methods, encompassing supervised and unsupervised approaches, delves into existing research within these categories, introduces datasets and evaluation criteria, and delineates forthcoming challenges and potential research directions. This comprehensive assessment examines the landscape of deep learning in vehicle ReID and establishes a foundation and starting point for future works. It aims to serve as a complete reference by highlighting challenges and emerging trends, fostering advancements and applications in vehicle ReID utilizing deep learning models