Discriminating multiple JPEG compression using first digit features
The analysis of JPEG double-compressed images is a problem widely studied by the multimedia forensics community, as it might be exploited, e.g., for tampering localization or source device identification. In many practical scenarios, like photos uploaded on blogs, on-line albums, and photo sharing web sites, images might be JPEG compressed several times. However, the identification of the number of compression stages applied to an image remains an open issue. We propose a forensic method based on the analysis of the distribution of the first significant digits of the discrete cosine transform coefficients, which follow Benford's law in images compressed just once. Then, the detector is optimized and extended in order to accurately identify the number of compression stages applied to an image. The experimental validation considers up to four consecutive compression stages and shows that the proposed approach extends and outperforms the previously published algorithms for double JPEG compression detection.
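The first-digit idea in this abstract can be illustrated with a minimal sketch. It assumes the quantized DCT coefficients are already available as a flat list of integers; the function names and the chi-square-style distance are illustrative choices, not the detector described in the paper.

```python
import math


def benford_pmf():
    """Benford's law for base-10 first digits 1..9: p(d) = log10(1 + 1/d)."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(coefficient):
    """First significant digit of an integer coefficient, or None for zero."""
    n = abs(int(coefficient))
    if n == 0:
        return None
    while n >= 10:
        n //= 10
    return n


def first_digit_histogram(coefficients):
    """Relative frequency of first digits 1..9 over the nonzero coefficients."""
    counts = [0] * 9
    for c in coefficients:
        d = first_digit(c)
        if d is not None:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no nonzero coefficients to analyze")
    return [k / total for k in counts]


def chi_square_distance(observed, expected):
    """Chi-square-style distance between two distributions (illustrative)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Toy coefficients, made up for illustration only.
    toy = [12, -3, 0, 145, 1, 1, -27, 9, 0, 18]
    hist = first_digit_histogram(toy)
    print(chi_square_distance(hist, benford_pmf()))
```

A larger distance suggests a first-digit distribution that departs from Benford's law. How such a statistic maps to a number of compression stages is the subject of the paper and is not reproduced here.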
Localization of JPEG double compression through multi-domain convolutional neural networks
When an attacker wants to falsify an image, in most cases he or she will
perform a JPEG recompression. Different techniques have been developed based on
diverse theoretical assumptions, but very effective solutions have not been
developed yet. Recently, machine-learning-based approaches have started to
appear in the field of image forensics to solve diverse tasks such as
acquisition source identification and forgery detection. In this last case, the
ultimate aim would be a trained neural network able, given a to-be-checked
image, to reliably localize the forged areas. With this in mind, our paper
takes a step forward in this direction by analyzing how a single or double
JPEG compression can be revealed and localized using convolutional neural
networks (CNNs). Different kinds of input to the CNN have been taken into
consideration, and various experiments have been carried out, also aiming to
highlight potential issues to be further investigated.
Comment: Accepted to CVPRW 2017, Workshop on Media Forensics
On the use of Benford's law to detect GAN-generated images
The advent of Generative Adversarial Network (GAN) architectures has given
anyone the ability to generate incredibly realistic synthetic imagery. The
malicious diffusion of GAN-generated images may lead to serious social and
political consequences (e.g., fake news spreading, opinion formation, etc.). It
is therefore important to regulate the widespread distribution of synthetic
imagery by developing solutions able to detect it. In this paper, we study
the possibility of using Benford's law to discriminate GAN-generated images
from natural photographs. Benford's law describes the distribution of the most
significant digit for quantized Discrete Cosine Transform (DCT) coefficients.
Extending and generalizing this property, we show that it is possible to
extract a compact feature vector from an image. This feature vector can be fed
to an extremely simple classifier to detect GAN-generated images.
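One way to picture a compact Benford-based feature vector is sketched below. The choices here are assumptions made for illustration and are not taken from the paper: first digits are computed in several number bases (10, 20, 40), and each feature is the Jensen-Shannon divergence between the empirical first-digit histogram and Benford's law in that base. All function names are hypothetical.

```python
import math


def benford_pmf(base):
    """Benford's law in a given base: p(d) = log_base(1 + 1/d), d = 1..base-1."""
    return [math.log(1 + 1 / d) / math.log(base) for d in range(1, base)]


def first_digit(coefficient, base):
    """First significant digit of an integer in the given base (None for zero)."""
    n = abs(int(coefficient))
    if n == 0:
        return None
    while n >= base:
        n //= base
    return n


def histogram(coefficients, base):
    """Relative frequency of first digits 1..base-1 over nonzero coefficients."""
    counts = [0] * (base - 1)
    for c in coefficients:
        d = first_digit(c, base)
        if d is not None:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no nonzero coefficients to analyze")
    return [k / total for k in counts]


def kl(p, q):
    """Kullback-Leibler divergence, skipping zero-probability terms of p."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), bounded by ln 2."""
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def feature_vector(coefficients, bases=(10, 20, 40)):
    """One divergence per base; the base set is an illustrative assumption."""
    return [js_divergence(histogram(coefficients, b), benford_pmf(b)) for b in bases]


if __name__ == "__main__":
    # Toy coefficients, made up for illustration only.
    print(feature_vector([12, -3, 0, 145, 1, 1, -27, 9, 0, 18]))
```

Under these assumptions the vector has only as many entries as there are bases, which is the kind of size a very simple classifier can handle.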
Digital image forensics
Digital image forensics is a relatively new research field that aims to expose the origin and composition of digital images, as well as the history of processing applied to them. Hence, digital image forensics is expected to be of significant importance to our modern society, in which digital media are becoming more and more popular. In this thesis, image tampering detection and classification of double JPEG compression are the two major subjects studied. Since any manipulation applied to digital images changes image statistics, identifying statistical artifacts becomes critically important in image forensics. In this thesis, a few typical forensic techniques have been studied. Finally, it is foreseen that investigations into the endless conflict between forensics and anti-forensics will deepen our understanding of image statistics and advance the civilization of our society.
Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision
Convolutional Neural Networks (CNNs) have proved very accurate in multiple
computer vision image classification tasks that required visual inspection in
the past (e.g., object recognition, face detection, etc.). Motivated by these
astonishing results, researchers have also started using CNNs to cope with
image forensic problems (e.g., camera model identification, tampering
detection, etc.). However, in computer vision, image classification methods
typically rely on visual cues easily detectable by human eyes. Conversely,
forensic solutions rely on almost invisible traces that are often very subtle
and lie in the fine details of the image under analysis. For this reason,
training a CNN to solve a forensic task requires some special care, as common
processing operations (e.g., resampling, compression, etc.) can strongly hinder
forensic traces. In this work, we focus on the effect that JPEG has on CNN
training considering different computer vision and forensic image
classification problems. Specifically, we consider the issues that arise from
JPEG compression and misalignment of the JPEG grid. We show that it is
necessary to consider these effects when generating a training dataset in order
to properly train a forensic detector without losing generalization capability,
whereas these effects can be largely ignored for computer vision
tasks.
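JPEG grid misalignment can be made concrete with a small sketch of patch sampling. It assumes the image is an uncropped JPEG whose 8x8 block grid starts at pixel (0, 0); the helper names are hypothetical and the sampling policy is not the one used in the paper.

```python
import random

BLOCK = 8  # JPEG codes images in 8x8 pixel blocks


def grid_offset(x, y, block=BLOCK):
    """Offset of a patch origin relative to the JPEG block grid."""
    return (x % block, y % block)


def aligned_origin(x, y, block=BLOCK):
    """Snap a patch origin down to the nearest block boundary."""
    return (x - x % block, y - y % block)


def sample_patch_origin(width, height, patch, aligned, rng=random):
    """Pick a random patch origin, optionally aligned to the JPEG grid."""
    x = rng.randint(0, width - patch)
    y = rng.randint(0, height - patch)
    if aligned:
        x, y = aligned_origin(x, y)
    return x, y


if __name__ == "__main__":
    rng = random.Random(0)
    for aligned in (True, False):
        x, y = sample_patch_origin(512, 384, 64, aligned, rng)
        print(aligned, (x, y), grid_offset(x, y))
```

Patches with a nonzero offset see block boundaries at shifted positions, so a training set could mix or control these offsets depending on the alignment expected at test time.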
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder
network to segment out manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating the encoder and LSTM network. Finally,
the decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets.