    Supervised learning on graphs of spatio-temporal similarity in satellite image sequences

    High-resolution satellite image sequences are multidimensional signals composed of spatio-temporal patterns associated with numerous and varied phenomena. Bayesian methods were previously proposed (Heas and Datcu, 2005) to encode the information contained in satellite image sequences in a graph representation. Based on such a representation, this paper presents a supervised methodology for learning the semantics associated with spatio-temporal patterns occurring in satellite image sequences. It enables the recognition and probabilistic retrieval of similar events. The graphs are attached to statistical models of spatio-temporal processes, which in turn describe physical changes in the observed scene. We therefore fit a parametric model that evaluates similarity types between graph patterns in order to represent user-specific semantics attached to spatio-temporal phenomena. Learning proceeds by incrementally defining similarity types from user-provided examples of spatio-temporal patterns labeled with positive and/or negative semantics. From these examples, probabilities are inferred using a Bayesian network and a Dirichlet model, which links the user's interest to a specific similarity model between graph patterns. According to the current state of learning, semantic posterior probabilities are updated for all possible graph patterns so that similar spatio-temporal phenomena can be recognized and retrieved from the image sequence. A few experiments on a multi-spectral SPOT image sequence illustrate the proposed spatio-temporal recognition method.
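As a rough illustration of how a Dirichlet model can turn user-labeled examples into probabilities, the sketch below updates a symmetric Dirichlet prior with counts of positive and negative labels and returns the posterior mean. It is not the model of the paper: the function name `dirichlet_posterior_mean`, the two-outcome label set, and the uniform prior are all assumptions made for the example, and the Bayesian network over graph patterns is omitted.

```python
from collections import Counter

def dirichlet_posterior_mean(prior, observations):
    """Posterior mean of a categorical distribution under a Dirichlet prior.

    prior: dict mapping each outcome to its Dirichlet pseudo-count (assumed).
    observations: iterable of observed outcomes (here, user-provided labels).
    """
    counts = Counter(observations)
    # Conjugacy: posterior pseudo-count = prior pseudo-count + observed count.
    posterior = {k: prior[k] + counts.get(k, 0) for k in prior}
    total = sum(posterior.values())
    return {k: v / total for k, v in posterior.items()}

# Hypothetical labels given by a user to four example patterns.
prior = {"positive": 1.0, "negative": 1.0}
examples = ["positive", "positive", "negative", "positive"]
posterior = dirichlet_posterior_mean(prior, examples)
```

Each new labeled example only increments a count, which is what makes this kind of model convenient for incremental, interactive learning.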

    Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation

    Remote sensing (RS) image retrieval is of great significance for geological information mining. Over the past two decades, a large amount of research on this task has been carried out, focusing mainly on three core issues: feature extraction, similarity metric, and relevance feedback. Due to the complexity and diversity of ground objects in high-resolution remote sensing (HRRS) images, there is still room for improvement in current retrieval approaches. In this paper, we analyze the three core issues of RS image retrieval and provide a comprehensive review of existing methods. Furthermore, to advance the state of the art in HRRS image retrieval, we focus on the feature extraction issue and investigate how to use powerful deep representations to address this task. We systematically evaluate the factors that may affect the performance of deep features. By optimizing each factor, we obtain remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental observations in detail and draw conclusions from our analysis. Our work can serve as a guide for research on content-based RS image retrieval.

    Remote Sensing Image Scene Classification: Benchmark and State of the Art

    Remote sensing image scene classification plays an important role in a wide range of applications and has therefore been receiving remarkable attention. In recent years, significant efforts have been made to develop datasets and to present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small number of scene classes and images, the lack of image variation and diversity, and the saturation of accuracy. These shortcomings severely limit the development of new approaches, especially deep learning-based methods. This paper first provides a comprehensive review of recent progress. Then, we propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale in both the number of scene classes and the total number of images, (ii) holds large variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset, and the results are reported as a useful baseline for future research. Comment: This manuscript is the accepted version for Proceedings of the IEE

    Content-based Information Retrieval via Nearest Neighbor Search

    Content-based information retrieval (CBIR) has attracted significant interest in the past few years. Given a search query, the search engine compares the query with all the stored information in the database through nearest neighbor search and returns the most similar items. We make the following contributions to CBIR research. First, Distance Metric Learning (DML) is studied to improve the retrieval accuracy of nearest neighbor search. Second, Hash Function Learning (HFL) is considered to accelerate the retrieval process. On the one hand, a new local metric learning framework is proposed: Reduced-Rank Local Metric Learning (R2LML). By considering a conical combination of Mahalanobis metrics, the proposed method is better able to capture information such as the data's similarity and location. A regularization term that suppresses noise and avoids over-fitting is also incorporated into the formulation. Based on the method used to infer the weights of the local metrics, we consider two frameworks: Transductive Reduced-Rank Local Metric Learning (T-R2LML), which uses transductive learning, and Efficient Reduced-Rank Local Metric Learning (E-R2LML), which employs a simpler and faster approximate method. We also study the convergence properties of the proposed block coordinate descent algorithms for both frameworks. Extensive experiments show the superiority of our approaches. On the other hand, *Supervised Hash Learning (*SHL), which can be used in supervised, semi-supervised and unsupervised learning scenarios, is proposed in the dissertation. By considering several codewords that can be learned from the data, the proposed method naturally leads to several Support Vector Machine (SVM) problems. After providing an efficient training algorithm, we also study the theoretical generalization bound of the new hashing framework. In the final experiments, *SHL outperforms many other popular hash function learning methods.
Additionally, in order to cope with large data sets, we also conducted experiments on big data using a parallel computing software package, namely LIBSKYLARK.
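To make the idea of nearest neighbor search under a conical combination of Mahalanobis metrics concrete, here is a minimal sketch. It is not the R2LML algorithm: the metric matrices and the non-negative weights are fixed by hand rather than learned, the weights are global instead of local to each point, and the function names are hypothetical.

```python
def quad_form(diff, M):
    """Return diff^T M diff for a vector and a square matrix (lists of floats)."""
    n = len(diff)
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))

def combined_sq_distance(x, y, metrics, weights):
    """Squared distance under a conical (non-negative) combination of metrics."""
    if any(w < 0 for w in weights):
        raise ValueError("a conical combination requires non-negative weights")
    diff = [a - b for a, b in zip(x, y)]
    return sum(w * quad_form(diff, M) for w, M in zip(weights, metrics))

def nearest_neighbors(query, database, metrics, weights, k=1):
    """Indices of the k database items closest to the query (brute force)."""
    dists = [combined_sq_distance(query, item, metrics, weights) for item in database]
    return sorted(range(len(database)), key=lambda i: dists[i])[:k]

# Two hand-picked positive semi-definite metrics, one per coordinate axis.
metrics = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
weights = [1.0, 4.0]  # assumed values; a learned model would infer these
database = [[0.0, 0.0], [3.0, 0.0], [0.0, 1.0]]
ranking = nearest_neighbors([1.0, 1.0], database, metrics, weights, k=2)
```

With these values the squared distances from the query are 5, 8 and 1, so the third item is retrieved first. The brute-force scan is the step that hash function learning aims to accelerate.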

    Interactive models for latent information discovery in satellite images

    The recent increase in Earth Observation (EO) missions has resulted in unprecedented volumes of multi-modal data to be processed, understood, used and stored in archives. The advanced capabilities of satellite sensors become useful only when translated into accurate, focused information, ready to be used by decision makers from various fields. Two key problems emerge when trying to bridge the gap between research, science and multi-user platforms: (1) current data access systems permit queries only by geographic location, time of acquisition, and type of sensor, but this information is often less important than the latent, conceptual content of the scenes; (2) at the same time, many new applications relying on EO data require knowledge of complex image processing and computer vision methods for understanding and extracting information from the data. This dissertation designs two important concept modules of a theoretical image information mining (IIM) system for EO: semantic knowledge discovery in large databases and data visualization techniques. These modules allow users to discover and extract relevant conceptual information directly from satellite images and to generate an optimum visualization of this information. The first contribution of this dissertation is a theoretical solution that bridges the gap and discovers the semantic rules between the output of state-of-the-art classification algorithms and the semantic, human-defined, manually applied terminology of cartographic data. The set of rules explains the contents of satellite images in latent, linguistic concepts and links the low-level machine language to high-level human understanding. The second contribution of this dissertation is an adaptive visualization methodology that assists the image analyst in understanding the satellite image through optimum representations and offers cognitive support in discovering relevant information in the scenes.
It is an interactive technique for discovering the optimum combination of three spectral features of a multi-band satellite image that enhances the visualization of learned targets and phenomena of interest. The visual mining module is essential for an IIM system because all EO-based applications involve several steps of visual inspection, and the final decision about the information derived from satellite data is always made by a human operator. To ensure maximum correlation between the requirements of the analyst and the possibilities of the computer, the visualization tool models the human visual system and ensures that a change in the image space is equivalent to a change in the perception space of the operator. This thesis presents novel concepts and methods that help users access and discover latent information in archives and visualize satellite scenes in an interactive, human-centered and information-driven workflow.
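The search for a combination of three spectral features can be pictured as an exhaustive scan over band triples, as in the sketch below. The scoring criterion used here (squared distance between the mean target and mean background values in the selected bands) is an assumption chosen for simplicity; the thesis describes an interactive method that models the human visual system, which this sketch does not attempt to reproduce. All names and numbers are hypothetical.

```python
from itertools import combinations

def triple_score(bands, target_mean, background_mean):
    """Squared separation of target and background in the chosen bands (assumed criterion)."""
    return sum((target_mean[b] - background_mean[b]) ** 2 for b in bands)

def best_band_triple(target_mean, background_mean):
    """Return the 3-band combination with the highest separation score."""
    n_bands = len(target_mean)
    return max(
        combinations(range(n_bands), 3),
        key=lambda bands: triple_score(bands, target_mean, background_mean),
    )

# Hypothetical per-band mean reflectances for a learned target and its background.
target_mean = [0.2, 0.8, 0.5, 0.9, 0.1]
background_mean = [0.25, 0.3, 0.45, 0.2, 0.6]
best = best_band_triple(target_mean, background_mean)
```

Because this particular score is additive over bands, the scan simply picks the three most discriminative bands; a perceptual criterion would generally not decompose this way, which is why an exhaustive or interactive search is kept in the sketch.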

    Novel neural network-based algorithms for urban classification and change detection from satellite imagery

    Human activity dominates the Earth's ecosystems with structural modifications. The rapid population growth over recent decades and the concentration of this population in and around urban areas have significantly impacted the environment. Although urban areas represent a small fraction of the land surface, they affect large areas due to the magnitude of the associated energy, food, water, and raw material demands. Reliable information on populated areas is essential for urban planning and strategic decision making, for example by civil protection departments in cases of emergency. Remote sensing is increasingly being used as a timely and cost-effective source of information in a wide range of applications, from environmental monitoring to location-aware systems.
However, mapping human settlements represents one of the most challenging areas for the remote sensing community due to their high spatial and spectral diversity. From the point of view of physical composition, several different materials can be used for the same man-made element (for example, building roofs can be made of clay tiles, metal, asphalt, concrete, plastic, grass or stones). On the other hand, the same material can be used for different purposes (for example, concrete can be found in paved roads or building roofs). Moreover, urban areas are often made up of materials present in the surrounding region, making them indistinguishable from natural or agricultural areas (examples are unpaved roads and bare soil, clay tiles and bare soil, or parks and vegetated open spaces) [1]. During the last two decades, significant progress has been made in developing and launching satellites with instruments, in both the optical/infrared and microwave regions of the spectrum, well suited for Earth observation with increasingly fine spatial, spectral and temporal resolution. Fine spatial sensors with metric or sub-metric resolution allow the detection of small-scale objects, such as elements of residential housing, commercial buildings, transportation systems and utilities. Multi-spectral and hyper-spectral remote sensing systems provide additional discriminative features for classes that are spectrally similar, due to their higher spectral resolution. The temporal component, integrated with the spectral and spatial dimensions, provides essential information, for example on vegetation dynamics. Moreover, the delineation of temporally homogeneous patches reduces the effect of local spatial heterogeneity that often masks larger spatial patterns. Nevertheless, higher resolution (spatial, spectral or temporal) imagery comes with limits and challenges that equal the advantages and improvements, and this is valid for both optical and synthetic aperture radar data [2].
This thesis addresses the different aspects of mapping and change detection of human settlements, discussing the main issues related to the use of optical and synthetic aperture radar data. Novel approaches and techniques are proposed and critically discussed to cope with the challenges of urban areas, including data fusion, image information mining, and active learning. The chapters are subdivided into three main parts. Part I addresses the theoretical aspects of neural networks, including their different architectures, design, and training. The proposed neural network-based algorithms, their applications to classification and change detection problems, and the experimental results are described in Part II and Part III.