104 research outputs found

    Audio Splicing Detection and Localization Based on Acquisition Device Traces

    Get PDF
    In recent years, the multimedia forensic community has put a great effort in developing solutions to assess the integrity and authenticity of multimedia objects, focusing especially on manipulations applied by means of advanced deep learning techniques. However, in addition to complex forgeries as the deepfakes, very simple yet effective manipulation techniques not involving any use of state-of-the-art editing tools still exist and prove dangerous. This is the case of audio splicing for speech signals, i.e., to concatenate and combine multiple speech segments obtained from different recordings of a person in order to cast a new fake speech. Indeed, by simply adding a few words to an existing speech we can completely alter its meaning. In this work, we address the overlooked problem of detection and localization of audio splicing from different models of acquisition devices. Our goal is to determine whether an audio track under analysis is pristine, or it has been manipulated by splicing one or multiple segments obtained from different device models. Moreover, if a recording is detected as spliced, we identify where the modification has been introduced in the temporal dimension. The proposed method is based on a Convolutional Neural Network (CNN) that extracts model-specific features from the audio recording. After extracting the features, we determine whether there has been a manipulation through a clustering algorithm. Finally, we identify the point where the modification has been introduced through a distance-measuring technique. The proposed method allows to detect and localize multiple splicing points within a recording

    Landmine Detection from GPR Data Using Convolutional Neural Networks

    Get PDF
    The presence of buried landmines is a serious threat in many areas around the World. Despite various techniques have been proposed in the literature to detect and recognize buried objects, automatic and easy to use systems providing accurate performance are still under research. Given the incredible results achieved by deep learning in many detection tasks, in this paper we propose a pipeline for buried landmine detection based on convolutional neural networks (CNNs) applied to ground-penetrating radar (GPR) images. The proposed algorithm is capable of recognizing whether a B-scan profile obtained from GPR acquisitions contains traces of buried mines. Validation of the presented system is carried out on real GPR acquisitions, albeit system training can be performed simply relying on synthetically generated data. Results show that it is possible to reach 95% of detection accuracy without training in real acquisition of landmine profiles

    Ray space transform interpolation with convolutional autoencoder

    Get PDF
    In this paper we propose an algorithm for the reconstruction of the Ray Space Transform (RST) through the use of neural networks. In particular, our aim is to reconstruct the magnitude of the RST acquired from a linear microphone array, as if the array were composed by a larger amount of microphones. This is useful for applications that need a higher RST resolution when only a limited amount of microphones can be used due to practical constraints or physical limitations. The proposed solution leverages recent advancements in deep learning as it is based on a fully convolutional autoencoder. To validate our method, we show through a simulative campaign that it is possible to improve sound source localization using the reconstructed RST compared to the use of the original RST

    Convolutional Autoencoder for Landmine Detection on GPR Scans

    Get PDF
    Buried unexploded landmines are a serious threat in many countries all over the World. As many landmines are nowadays mostly plastic made, the use of ground penetrating radar (GPR) systems for their detection is gaining the trend. However, despite several techniques have been proposed, a safe automatic solution is far from being at hand. In this paper, we propose a landmine detection method based on convolutional autoencoder applied to B-scans acquired with a GPR. The proposed system leverages an anomaly detection pipeline: the autoencoder learns a description of B-scans clear of landmines, and detects landmine traces as anomalies. In doing so, the autoencoder never uses data containing landmine traces at training time. This allows to avoid making strong assumptions on the kind of landmines to detect, thus paving the way to detection of novel landmine models

    On the use of Benford's law to detect GAN-generated images

    Get PDF
    The advent of Generative Adversarial Network (GAN) architectures has given anyone the ability of generating incredibly realistic synthetic imagery. The malicious diffusion of GAN-generated images may lead to serious social and political consequences (e.g., fake news spreading, opinion formation, etc.). It is therefore important to regulate the widespread distribution of synthetic imagery by developing solutions able to detect them. In this paper, we study the possibility of using Benford's law to discriminate GAN-generated images from natural photographs. Benford's law describes the distribution of the most significant digit for quantized Discrete Cosine Transform (DCT) coefficients. Extending and generalizing this property, we show that it is possible to extract a compact feature vector from an image. This feature vector can be fed to an extremely simple classifier for GAN-generated image detection purpose

    Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision

    Get PDF
    Convolutional Neural Networks (CNNs) have proved very accurate in multiple computer vision image classification tasks that required visual inspection in the past (e.g., object recognition, face detection, etc.). Motivated by these astonishing results, researchers have also started using CNNs to cope with image forensic problems (e.g., camera model identification, tampering detection, etc.). However, in computer vision, image classification methods typically rely on visual cues easily detectable by human eyes. Conversely, forensic solutions rely on almost invisible traces that are often very subtle and lie in the fine details of the image under analysis. For this reason, training a CNN to solve a forensic task requires some special care, as common processing operations (e.g., resampling, compression, etc.) can strongly hinder forensic traces. In this work, we focus on the effect that JPEG has on CNN training considering different computer vision and forensic image classification problems. Specifically, we consider the issues that rise from JPEG compression and misalignment of the JPEG grid. We show that it is necessary to consider these effects when generating a training dataset in order to properly train a forensic detector not losing generalization capability, whereas it is almost possible to ignore these effects for computer vision tasks

    Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision

    Get PDF
    Convolutional Neural Networks (CNNs) have proved very accurate in multiple computer vision image classification tasks that required visual inspection in the past (e.g., object recognition, face detection, etc.). Motivated by these astonishing results, researchers have also started using CNNs to cope with image forensic problems (e.g., camera model identification, tampering detection, etc.). However, in computer vision, image classification methods typically rely on visual cues easily detectable by human eyes. Conversely, forensic solutions rely on almost invisible traces that are often very subtle and lie in the fine details of the image under analysis. For this reason, training a CNN to solve a forensic task requires some special care, as common processing operations (e.g., resampling, compression, etc.) can strongly hinder forensic traces. In this work, we focus on the effect that JPEG has on CNN training considering different computer vision and forensic image classification problems. Specifically, we consider the issues that rise from JPEG compression and misalignment of the JPEG grid. We show that it is necessary to consider these effects when generating a training dataset in order to properly train a forensic detector not losing generalization capability, whereas it is almost possible to ignore these effects for computer vision tasks
    • …
    corecore