223 research outputs found

    BiGSeT: Binary Mask-Guided Separation Training for DNN-based Hyperspectral Anomaly Detection

    Hyperspectral anomaly detection (HAD) aims to recognize a minority of anomalies that are spectrally different from their surrounding background without prior knowledge. Deep neural networks (DNNs), including autoencoders (AEs), convolutional neural networks (CNNs) and vision transformers (ViTs), have shown remarkable performance in this field due to their powerful ability to model the complicated background. However, for reconstruction tasks, DNNs tend to incorporate both background and anomalies into the estimated background, which is referred to as the identical mapping problem (IMP) and leads to significantly decreased performance. To address this limitation, we propose a model-independent binary mask-guided separation training strategy for DNNs, named BiGSeT. Our method introduces a separation training loss based on a latent binary mask to separately constrain the background and anomalies in the estimated image. The background is preserved, while the potential anomalies are suppressed by using an efficient second-order Laplacian of Gaussian (LoG) operator, generating a pure background estimate. To maintain separability during training, we periodically update the mask using a robust proportion threshold estimated before training. In our experiments, we adopt a vanilla AE as the network to validate our training strategy on several real-world datasets. Our results show superior performance compared to some state-of-the-art methods. Specifically, we achieved a 90.67% AUC score on the HyMap Cooke City dataset. Additionally, we applied our training strategy to other deep network structures, achieving improved detection performance compared to their original versions, demonstrating its effective transferability. The code of our method will be available at https://github.com/enter-i-username/BiGSeT. Comment: 13 pages, 13 figures, submitted to IEEE TRANSACTIONS ON IMAGE PROCESSIN
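As a rough illustration of the Laplacian of Gaussian (LoG) operator mentioned in the abstract above, the following sketch applies a LoG filter to a single-band image. It is not the BiGSeT loss or the authors' code; the function names `log_kernel` and `log_response`, the kernel radius, and the edge padding are all assumptions made for this example. It only shows the general property that a LoG filter responds strongly to small, isolated deviations and gives near-zero response on a smoothly varying background.

```python
import numpy as np

def log_kernel(sigma, radius=None):
    """Unnormalized 2-D Laplacian-of-Gaussian kernel (hypothetical helper)."""
    if radius is None:
        radius = int(np.ceil(4 * sigma))
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    # LoG up to a constant factor: (r^2 - 2*sigma^2) / sigma^4 * exp(-r^2 / (2*sigma^2))
    k = (r2 - 2.0 * sigma**2) / sigma**4 * np.exp(-r2 / (2.0 * sigma**2))
    # Force the kernel to sum to zero so constant regions give zero response.
    return k - k.mean()

def log_response(img, sigma=1.0):
    """Filter a 2-D image with the LoG kernel, using edge padding."""
    k = log_kernel(sigma)
    r = k.shape[0] // 2
    padded = np.pad(img, r, mode="edge")
    # One (2r+1) x (2r+1) window per output pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, k.shape)
    # The kernel is symmetric, so correlation equals convolution here.
    return np.einsum("ijkl,kl->ij", windows, k)
```

For example, on a linear ramp background with one bright pixel added, the absolute response peaks at that pixel while the interior of the ramp stays at approximately zero, which is the behavior that makes a LoG-type operator a plausible tool for picking out small anomalies.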

    Semi-supervised and unsupervised kernel-based novelty detection with application to remote sensing images

    The main challenge of new information technologies is to retrieve intelligible information from the large volume of digital data gathered every day. Among the variety of existing data sources, the satellites continuously observing the surface of the Earth are key to the monitoring of our environment. The new generation of satellite sensors is tremendously increasing the possibilities of applications but also the need for efficient processing methodologies that extract information relevant to the users' needs in an automatic or semi-automatic way. This is where machine learning comes into play to transform complex data into simplified products such as maps of land-cover changes or classes by learning from data examples annotated by experts. These annotations, also called labels, may be difficult or costly to obtain since they are established on the basis of ground surveys. As an example, it is extremely difficult to access a region recently flooded or affected by wildfires. In these situations, the detection of changes has to be done with only annotations from unaffected regions. Similarly, it is difficult to have information on all the land-cover classes present in an image when only a single class is of interest. These challenging situations are called novelty detection or one-class classification in machine learning. In these situations, the learning phase has to rely only on a very limited set of annotations, but can exploit the large set of unlabeled pixels available in the images. This setting, called semi-supervised learning, can significantly improve detection. In this thesis we address the development of methods for novelty detection and one-class classification with few or no labeled information. 
The proposed methodologies build upon kernel methods, which provide a principled but flexible framework for learning with data showing potentially non-linear feature relations. The thesis is divided into two parts, each one having a different assumption on the data structure and both addressing unsupervised (automatic) and semi-supervised (semi-automatic) learning settings. The first part assumes the data to be formed by arbitrary-shaped and overlapping clusters and studies the use of kernel machines, such as Support Vector Machines or Gaussian Processes. An emphasis is put on the robustness to noise and outliers and on the automatic retrieval of parameters. Experiments on multi-temporal multispectral images for change detection are carried out using only information from unchanged regions or none at all. The second part assumes high-dimensional data to lie on multiple low-dimensional structures, called manifolds. We propose a method seeking a sparse and low-rank representation of the data mapped in a non-linear feature space. This representation allows us to build a graph, which is cut into several groups using spectral clustering. For the semi-supervised case where few labels of one class of interest are available, we study several approaches incorporating the graph information. The class labels can be propagated on the graph, used to constrain spectral clustering, or used to train a one-class classifier regularized by the given graph. Experiments on the unsupervised and one-class classification of hyperspectral images demonstrate the effectiveness of the proposed approaches

    MSDH: matched subspace detector with heterogeneous noise

    The matched subspace detector (MSD) is a classical subspace-based method for hyperspectral subpixel target detection. However, the model assumes that noise has the same variance over different bands, which is usually unrealistic in practice. In this letter, we relax the equal variance assumption and propose a matched subspace detector with heterogeneous noise (MSDH). In essence, the noise variances are different for different bands and they can be estimated by using iteratively reweighted least squares methods. Experiments on two benchmark real hyperspectral datasets demonstrate the superiority of MSDH over MSD for subpixel target detection
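The idea of estimating a separate noise variance for each band by iteratively reweighted least squares can be sketched as follows. This is a minimal illustration under stated assumptions, not the MSDH algorithm from the letter: it assumes every pixel follows a linear model x = S a + n with a known basis `S` and independent Gaussian noise whose variance differs per band. The function name `estimate_band_variances`, the fixed iteration count, and the leverage-based correction of the residual variance are choices made for this example.

```python
import numpy as np

def estimate_band_variances(X, S, n_iter=10, eps=1e-12):
    """Estimate per-band noise variances by iteratively reweighted least squares.

    X: (n_pixels, n_bands) spectra; S: (n_bands, p) known subspace basis.
    Returns an array of shape (n_bands,).
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(S, dtype=float)
    var = np.ones(X.shape[1])  # start from the equal-variance assumption
    for _ in range(n_iter):
        w = 1.0 / np.maximum(var, eps)            # weights = inverse variances
        Sw = S * w[:, None]                       # W S with W = diag(w)
        M = S.T @ Sw                              # S^T W S
        coeffs = np.linalg.solve(M, Sw.T @ X.T)   # weighted LS fit for every pixel
        resid = X - (S @ coeffs).T
        # Leverage of each band, h_b = w_b * s_b^T M^{-1} s_b, used to correct
        # the downward bias of the residual variance.
        h = w * np.einsum("bi,bi->b", S, np.linalg.solve(M, S.T).T)
        var = np.mean(resid**2, axis=0) / np.maximum(1.0 - h, eps)
    return var
```

With the variances in hand, one natural use (again an assumption, not a statement of the published method) is to divide each band by its estimated noise standard deviation before applying a standard matched subspace detector, so that noisy bands contribute less to the detection statistic.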

    Exploiting Cross Domain Relationships for Target Recognition

    Cross domain recognition extracts knowledge from one domain to recognize samples from another domain of interest. The key to solving problems under this umbrella is to find the latent connections between different domains. In this dissertation, three different cross domain recognition problems are studied by explicitly exploiting the relationships between different domains according to the specific real problems. First, the problem of cross view action recognition is studied. The same action might seem quite different when observed from different viewpoints. Thus, the key point is how to use the training samples from a given camera view to perform recognition in a new view. In this work, reconstructable paths between different views are built to mirror labeled actions from a source view into a target view for learning an adaptable classifier. The path learning takes advantage of joint dictionary learning techniques while exploiting hidden information in the seemingly useless samples, making the recognition performance robust and effective. Second, the problem of person re-identification is studied, which tries to match pedestrian images in non-overlapping camera views based on appearance features. In this work, we propose to learn a random kernel forest to discriminatively assign a specific distance metric to each pair of local patches from the two images in matching. The forest is composed of multiple decision trees, which are designed to partition the overall space of local patch-pairs into substantial subspaces, where a simple but effective local metric kernel can be defined to minimize the distance of true matches. Third, the problem of multi-event detection and recognition in smart grid is studied. The signal of a multi-event might not be a straightforward combination of some single-event signals because of the correlation among devices. 
In this work, a concept of "root-pattern" is proposed that can be extracted from a collection of single-event signals and is also transferable to analyse the constituent components of multi-cascading-event signals based on an over-complete dictionary, which is designed according to the "root-patterns" with temporal information subtly embedded. The correctness and effectiveness of the proposed approaches have been evaluated by extensive experiments

    Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing

    Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience and remote sensing (RS). In the past decade, enormous efforts have been made to process and analyze these hyperspectral (HS) products, mainly by seasoned experts. However, with the ever-growing volume of data, the bulk of costs in manpower and material resources poses new challenges for reducing the burden of manual labor and improving efficiency. For this reason, it is urgent to develop more intelligent and automatic approaches for various HS RS applications. Machine learning (ML) tools with convex optimization have successfully undertaken the tasks of numerous artificial intelligence (AI)-related applications. However, their ability to handle complex practical problems remains limited, particularly for HS data, due to the effects of various spectral variabilities in the process of HS imaging and the complexity and redundancy of higher-dimensional HS signals. Compared to the convex models, non-convex modeling, which is capable of characterizing more complex real scenes and providing model interpretability technically and theoretically, has been proven to be a feasible solution to reduce the gap between challenging HS vision tasks and currently advanced intelligent data processing models

    Spatial Analysis for Landscape Changes

    Recent increasing trends of the occurrence of natural and anthropic processes have a strong impact on landscape modification, and there is a growing need for the implementation of effective instruments, tools, and approaches to understand and manage landscape changes. A great improvement in the availability of high-resolution DEMs, GIS tools, and algorithms of automatic extraction of landform features and change detections has favored an increase in the analysis of landscape changes, which became an essential instrument for the quantitative evaluation of landscape changes in many research fields. One of the most effective ways of investigating natural landscape changes is the geomorphological one, which benefits from recent advances in the development of digital elevation model (DEM) comparison software and algorithms, image change detection, and landscape evolution models. This Special Issue collects six papers concerning the application of traditional and innovative multidisciplinary methods in several application fields, such as geomorphology, urban and territorial systems, vegetation restoration, and soil science. The papers include multidisciplinary studies that highlight the usefulness of quantitative analyses of satellite images and UAV-based DEMs, the application of Landscape Evolution Models (LEMs) and automatic landform classification algorithms to solve multidisciplinary issues of landscape changes. A review article is also presented, dealing with the bibliometric analysis of the research topic

    Image-set, Temporal and Spatiotemporal Representations of Videos for Recognizing, Localizing and Quantifying Actions

    This dissertation addresses the problem of learning video representations, which is defined here as transforming the video so that its essential structure is made more visible or accessible for action recognition and quantification. In the literature, a video can be represented by a set of images, by modeling motion or temporal dynamics, and by a 3D graph with pixels as nodes. This dissertation contributes in proposing a set of models to localize, track, segment, recognize and assess actions such as (1) image-set models via aggregating subset features given by regularizing normalized CNNs, (2) image-set models via inter-frame principal recovery and sparsely coding residual actions, (3) temporally local models with spatially global motion estimated by robust feature matching and local motion estimated by action detection with motion model added, (4) spatiotemporal models 3D graph and 3D CNN to model time as a space dimension, (5) supervised hashing by jointly learning embedding and quantization, respectively. State-of-the-art performances are achieved for tasks such as quantifying facial pain and human diving. 
Primary conclusions of this dissertation are categorized as follows: (i) Image set can capture facial actions that are about collective representation; (ii) Sparse and low-rank representations can have the expression, identity and pose cues untangled and can be learned via an image-set model and also a linear model; (iii) Norm is related with recognizability; similarity metrics and loss functions matter; (iv) Combining the MIL based boosting tracker with the Particle Filter motion model induces a good trade-off between the appearance similarity and motion consistency; (v) Segmenting object locally makes it amenable to assign shape priors; it is feasible to learn knowledge such as shape priors online from Web data with weak supervision; (vi) It works locally in both space and time to represent videos as 3D graphs; 3D CNNs work effectively when given temporally meaningful clips as input; (vii) the rich labeled images or videos help to learn better hash functions after learning binary embedded codes than the random projections. In addition, models proposed for videos can be adapted to other sequential images such as volumetric medical images which are not included in this dissertation