77 research outputs found

    An Overview on the Generation and Detection of Synthetic and Manipulated Satellite Images

    Due to the reduction of technological costs and the increase in satellite launches, satellite images are becoming more popular and easier to obtain. Besides serving benevolent purposes, satellite data can also be used for malicious purposes such as misinformation. In fact, satellite images can be easily manipulated with general image editing tools. Moreover, with the surge of Deep Neural Networks (DNNs) that can generate realistic synthetic imagery belonging to various domains, additional threats related to the diffusion of synthetically generated satellite images are emerging. In this paper, we review the State of the Art (SOTA) on the generation and manipulation of satellite images. In particular, we focus on both the generation of synthetic satellite imagery from scratch, and the semantic manipulation of satellite images by means of image-transfer technologies, including the transformation of images obtained from one type of sensor to another. We also describe forensic detection techniques that have been researched so far to classify and detect synthetic image forgeries. While we focus mostly on forensic techniques explicitly tailored to the detection of AI-generated synthetic contents, we also review some methods designed for general splicing detection, which can in principle also be used to spot AI-manipulated images. Comment: 25 pages, 17 figures, 5 tables, APSIPA 202

    Exploiting gan as an oversampling method for imbalanced data augmentation with application to the fault diagnosis of an industrial robot

    Machine learning based intelligent fault diagnosis often requires a balanced data set to yield acceptable performance. However, obtaining faulty data from industrial equipment is challenging, often resulting in an imbalance between data acquired in normal conditions and data acquired in the presence of faults. Data augmentation techniques are among the most promising approaches to mitigate this issue. Generative adversarial networks (GAN) are a type of generative model consisting of a generator module and a discriminator. Through adversarial learning between these modules, the optimised generator can produce synthetic patterns that can be used for data augmentation. We investigate whether GAN can be used as an oversampling tool to compensate for an imbalanced data set in an industrial robot fault diagnosis task. A series of experiments is performed to validate the feasibility of this approach. The approach is compared with six scenarios, including the classical oversampling method SMOTE. Results show that GAN outperforms all the compared scenarios. Two proposals address two recognised issues in GAN training, namely training instability and mode collapse. To mitigate training instability, we propose a generalization of both the mean square error GAN (MSE GAN) and the Wasserstein GAN with gradient penalty (WGAN-GP), referred to as VGAN (the V-matrix based GAN). We also propose a novel criterion to keep track of the most suitable model during training. Experiments on both the MNIST and the industrial robot data set show that the proposed VGAN outperforms other competitive models. The cycle consistency generative adversarial network (CycleGAN) aims at dealing with mode collapse, a condition where the generator yields little to no variability. We investigate the sliced Wasserstein distance (SWD) for CycleGAN. SWD is evaluated in both the unconditional CycleGAN and the conditional CycleGAN, with and without squeeze-and-excitation mechanisms. Again, two data sets are evaluated, i.e., the MNIST and the industrial robot data set. Results show that SWD has a lower computational cost and outperforms the conventional CycleGAN.
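The abstract above names the sliced Wasserstein distance (SWD) without defining it. As a rough illustration only, the following NumPy sketch estimates a sliced Wasserstein-2 distance between two equally sized sample sets by projecting them onto random directions and comparing the sorted projections. The function name, the number of projections, and the equal-size restriction are assumptions made for this sketch; it is not the thesis's implementation and says nothing about how SWD is wired into CycleGAN there.

```python
import numpy as np


def sliced_wasserstein_distance(x, y, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance.

    x, y: arrays of shape (n_samples, n_features) with identical shapes
    (an assumption of this sketch, so sorted samples can be paired directly).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized sample sets")
    rng = np.random.default_rng(seed)
    # Draw random directions and normalise them to unit length.
    directions = rng.normal(size=(n_projections, x.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project onto each direction; in 1-D the optimal transport plan
    # simply matches the sorted samples of both sets.
    x_proj = np.sort(x @ directions.T, axis=0)
    y_proj = np.sort(y @ directions.T, axis=0)
    # Average the squared 1-D transport cost over samples and directions.
    return float(np.sqrt(np.mean((x_proj - y_proj) ** 2)))
```

Each projection costs one matrix product and one sort, which is one common reason SWD is considered cheap compared with solving a full optimal transport problem.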

    QMRNet: Quality Metric Regression for EO Image Quality Assessment and Super-Resolution

    The latest advances in super-resolution (SR) have been tested with general-purpose images such as faces, landscapes and objects, but have rarely been applied to the task of super-resolving earth observation (EO) images. In this research paper, we benchmark state-of-the-art SR algorithms for distinct EO datasets using both full-reference and no-reference image quality assessment metrics. We also propose a novel Quality Metric Regression Network (QMRNet) that is able to predict the quality (as a no-reference metric) by training on any property of the image (e.g., its resolution, its distortions, etc.) and is also able to optimize SR algorithms for a specific metric objective. This work is part of the implementation of the framework IQUAFLOW, which has been developed for the evaluation of image quality and the detection and classification of objects as well as image compression in EO use cases. We integrated our experimentation and tested our QMRNet algorithm on predicting features such as blur, sharpness, SNR, RER and ground sampling distance, and obtained validation medRs below 1.0 (out of N = 50) and recall rates above 95%. The overall benchmark shows promising results for LIIF, CAR and MSRN and also the potential use of QMRNet as a loss for optimizing SR predictions. Due to its simplicity, QMRNet could also be used for other use cases and image domains, as its architecture and data processing are fully scalable.
    The project was financed by the Ministry of Science and Innovation (MICINN) and by the European Union within the framework of FEDER RETOS-Collaboration of the State Program of Research (RTC2019-007434-7), Development and Innovation Oriented to the Challenges of Society, within the State Plan for Scientific and Technical Research and Innovation 2017-2020, with the main objective of promoting technological development, innovation and quality research.
    Berga, D.; Gallés, P.; Takáts, K.; Mohedano, E.; Riordan-Chen, L.; García-Moll, C.; Vilaseca, D.... (2023). QMRNet: Quality Metric Regression for EO Image Quality Assessment and Super-Resolution. Remote Sensing. 15(9). https://doi.org/10.3390/rs1509245115

    Impact Of Semantics, Physics And Adversarial Mechanisms In Deep Learning

    Deep learning has greatly advanced the performance of algorithms on tasks such as image classification, speech enhancement, sound separation, and generative image models. However, many current popular systems are driven by empirical rules that do not fully exploit the underlying physics of the data. Many speech and audio systems fix STFT preprocessing before their networks. Hyperspectral Image (HSI) methods often don't deliberately consider the spectral-spatial trade-off that is not present in normal images. Generative Adversarial Networks (GANs) that learn a generative distribution of images don't prioritize semantic labels of the training data. To meet these opportunities we propose to alter known deep learning methods to be more dependent on the semantic and physical underpinnings of the data to create better performing and more robust algorithms for sound separation and classification, image generation, and HSI segmentation. Our approaches take inspiration from Harmonic Analysis, SVMs, and classical statistical detection theory, and further the state of the art in source separation, defense against audio adversarial attacks, HSI classification, and GANs. Recent deep learning approaches have achieved impressive performance on speech enhancement and separation tasks. However, these approaches have not been investigated for separating mixtures of arbitrary sounds of different types, a task we refer to as universal sound separation. To study this question, we develop a dataset of mixtures containing arbitrary sounds, and use it to investigate the space of mask-based separation architectures, varying both the overall network architecture and the framewise analysis-synthesis basis for signal transformations. We compare using a short-time Fourier transform (STFT) with a learnable basis at variable window sizes for the feature extraction stage of our sound separation network. We also compare the robustness to adversarial examples of speech classification networks that similarly hybridize established time-frequency (TF) methods with learnable filter weights. We analyze HSI images for material classification. For hyperspectral image cubes, TF methods decompose spectra into multi-spectral bands, while Neural Networks (NNs) incorporate spatial information across scales and model multiple levels of dependencies between spectral features. The Fourier scattering transform is an amalgamation of time-frequency representations with neural network architectures. We propose and test a three-dimensional Fourier scattering method on hyperspectral datasets, and present results that indicate that the Fourier scattering transform is highly effective at representing spectral data when compared with other state-of-the-art methods. We study the spectral-spatial trade-off that our scattering approach allows. We also use a similar multi-scale approach to develop a defense against audio adversarial attacks. We propose a unification of a computational model of speech processing in the brain with commercial wake-word networks to create a cortical network, and show that it can increase resistance to adversarial noise without a degradation in performance. Generative Adversarial Networks are an attractive approach to constructing generative models that mimic a target distribution, and typically use conditional information (cGANs) such as class labels to guide the training of the discriminator and the generator. We propose a loss that ensures generator updates are always class specific. Rather than training a function that measures the information-theoretic distance between the generative distribution and one target distribution, we generalize the successful hinge loss that has become an essential ingredient of many GANs to the multi-class setting and use it to train a single generator-classifier pair. While the canonical hinge loss updates the generator according to a class-agnostic margin learned by a real/fake discriminator, our multi-class hinge-loss GAN updates the generator according to many classification margins. With this modification, we are able to accelerate training and achieve state-of-the-art Inception and FID scores on Imagenet128. We study the trade-off between class fidelity and overall diversity of generated images, and show that modifications of our method can prioritize either during training. We show that there is a limit to how closely classification and discrimination can be combined while maintaining sample diversity, with some theoretical results on K+1 GANs.
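The abstract above contrasts the canonical (binary) GAN hinge loss with a multi-class generalization but gives no formulas. The NumPy sketch below writes out the widely used binary hinge objectives for the discriminator and generator, followed by a textbook Crammer-Singer multi-class hinge loss as one possible way to define per-class margins. All function names are hypothetical, and the multi-class form is a generic stand-in chosen for illustration, not the loss proposed in the thesis.

```python
import numpy as np


def d_hinge_loss(real_scores, fake_scores):
    """Binary hinge loss for a real/fake discriminator.

    Real samples are pushed above a margin of +1, fakes below -1.
    """
    real_scores = np.asarray(real_scores, dtype=float)
    fake_scores = np.asarray(fake_scores, dtype=float)
    real_term = np.mean(np.maximum(0.0, 1.0 - real_scores))
    fake_term = np.mean(np.maximum(0.0, 1.0 + fake_scores))
    return float(real_term + fake_term)


def g_hinge_loss(fake_scores):
    """Generator loss paired with the hinge discriminator: raise fake scores."""
    return float(-np.mean(np.asarray(fake_scores, dtype=float)))


def multiclass_hinge_loss(scores, labels):
    """Crammer-Singer hinge loss (assumed stand-in for a multi-class margin).

    scores: (n_samples, n_classes) classifier outputs, n_classes >= 2.
    labels: (n_samples,) integer class indices.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    rows = np.arange(scores.shape[0])
    true_scores = scores[rows, labels]
    # Mask out the true class to find the strongest competing class.
    masked = scores.copy()
    masked[rows, labels] = -np.inf
    best_other = masked.max(axis=1)
    return float(np.mean(np.maximum(0.0, 1.0 + best_other - true_scores)))
```

In the binary case the generator sees a single class-agnostic margin; with a multi-class margin, the penalty for each sample depends on its own label and its strongest competing class.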

    Exploiting generative self-supervised learning for the assessment of biological images with lack of annotations

    Computer-aided analysis of biological images typically requires extensive training on large-scale annotated datasets, which is not viable in many situations. In this paper, we present Generative Adversarial Network Discriminator Learner (GAN-DL), a novel self-supervised learning paradigm based on the StyleGAN2 architecture, which we employ for self-supervised image representation learning in the case of fluorescent biological images.

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    The development of any computer vision application starts with acquiring images and data, followed by preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets is inevitable in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, disaster prediction, etc. The performance of computer vision algorithms can significantly deteriorate when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have gained immense attention from researchers across a variety of application domains due to their capability to model complex real-world image data. Notably, GANs can be used not only to generate synthetic images; their adversarial learning idea has also shown good potential in restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GAN-based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and their existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy to summarize GAN-based techniques for addressing imbalance problems in computer vision tasks into three major categories: (1) image-level imbalances in classification, (2) object-level imbalances in object detection, and (3) pixel-level imbalances in segmentation tasks. We elaborate on the imbalance problems of each group, and provide GAN-based solutions for each group. Readers will understand how GAN-based techniques can handle the problem of imbalances and boost the performance of computer vision algorithms.

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for data preparation are also covered. These include methods for image normalization and chipping, strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures
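The abstract above lists image normalization and chipping among the pre-processing steps without describing them. As a minimal sketch under stated assumptions, the NumPy code below standardises each spectral band of an image and cuts the result into fixed-size square chips. The function names, the (height, width, bands) layout, and the choice to drop partial edge chips are illustrative assumptions, not recommendations taken from the review.

```python
import numpy as np


def normalize_per_band(image, eps=1e-8):
    """Standardise each band of an (H, W, C) image to zero mean, unit variance."""
    image = np.asarray(image, dtype=float)
    mean = image.mean(axis=(0, 1), keepdims=True)
    std = image.std(axis=(0, 1), keepdims=True)
    # eps guards against division by zero for constant bands.
    return (image - mean) / (std + eps)


def chip_image(image, chip_size, stride=None):
    """Cut an (H, W, C) image into square chips of side chip_size.

    stride defaults to chip_size (no overlap). Partial chips at the right and
    bottom edges are dropped; the image must be at least chip_size per side.
    """
    image = np.asarray(image)
    stride = chip_size if stride is None else stride
    height, width = image.shape[:2]
    chips = [
        image[r:r + chip_size, c:c + chip_size]
        for r in range(0, height - chip_size + 1, stride)
        for c in range(0, width - chip_size + 1, stride)
    ]
    return np.stack(chips)
```

A stride smaller than the chip size produces overlapping chips, which is one simple way to obtain more training samples from a single scene.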