Analysis of adversarial attacks against CNN-based image forgery detectors
With the ubiquitous spread of social networks, images are becoming a
dominant and powerful communication channel. Not surprisingly, they are also
increasingly subject to manipulations aimed at distorting information and
spreading fake news. In recent years, the scientific community has devoted
major efforts to countering this threat, and many image forgery detectors have
been proposed. Currently, owing to the success of deep learning in many
multimedia processing tasks, there is great interest in CNN-based
detectors, and early results are already very promising. Recent studies in
computer vision, however, have shown CNNs to be highly vulnerable to
adversarial attacks, small perturbations of the input data which drive the
network towards erroneous classification. In this paper we analyze the
vulnerability of CNN-based image forensics methods to adversarial attacks,
considering several detectors and several types of attack, and testing
performance on a wide range of common manipulations, from easily detectable
to barely detectable ones.
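The kind of attack studied here can be illustrated with the fast gradient sign method (FGSM), one of the standard adversarial perturbations from the computer-vision literature, applied to a toy logistic "detector". This is a minimal sketch, not any of the paper's actual attacks or models; the weights and data are random illustrative values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_attack(x, w, b, y_true, eps):
    """One FGSM step against a logistic classifier: perturb x by
    eps in the sign of the loss gradient w.r.t. the input."""
    p = sigmoid(np.dot(w, x) + b)   # predicted prob. of 'forged'
    grad_x = (p - y_true) * w       # d(cross-entropy)/dx for logistic model
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=16)             # toy detector weights
b = 0.0
x = rng.normal(size=16)             # toy input features
y = 1.0                             # ground truth: forged

p_before = sigmoid(np.dot(w, x) + b)
x_adv = fgsm_attack(x, w, b, y, eps=0.1)
p_after = sigmoid(np.dot(w, x_adv) + b)
# the small perturbation pushes the detector's score away from the true label
```

For a linear model the sign-gradient step provably moves the score away from the correct class, which is why even very small `eps` values suffice, mirroring the "small perturbations" vulnerability the paper analyzes for CNNs.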
Cross-Domain Local Characteristic Enhanced Deepfake Video Detection
As ultra-realistic face forgery techniques emerge, deepfake detection has
attracted increasing attention due to security concerns. Many detectors cannot
achieve accurate results when detecting unseen manipulations despite excellent
performance on known forgeries. In this paper, we are motivated by the
observation that the discrepancies between real and fake videos are extremely
subtle and localized, and inconsistencies or irregularities can exist in some
critical facial regions across various information domains. To this end, we
propose a novel pipeline, Cross-Domain Local Forensics (XDLF), for more general
deepfake video detection. In the proposed pipeline, a specialized framework is
presented to simultaneously exploit local forgery patterns from space,
frequency, and time domains, thus learning cross-domain features to detect
forgeries. Moreover, the framework leverages four high-level forgery-sensitive
local regions of a human face to guide the model to enhance subtle artifacts
and localize potential anomalies. Extensive experiments on several benchmark
datasets demonstrate the impressive performance of our method, and we achieve
superiority over several state-of-the-art methods on cross-dataset
generalization. We also examine the factors that contribute to its performance
through ablation studies, which suggest that exploiting cross-domain local
characteristics is a noteworthy direction for developing more general deepfake
detectors.
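The idea of gathering local statistics from several information domains over a few critical facial regions can be sketched with plain spatial and FFT-magnitude features. This is an illustrative toy, not the XDLF framework: the region coordinates, the feature choices, and the omission of the time domain are all assumptions for brevity:

```python
import numpy as np

def region_features(frame, regions):
    """For each (hypothetical) facial region, concatenate simple
    space-domain and frequency-domain statistics."""
    feats = []
    for (r0, r1, c0, c1) in regions:
        patch = frame[r0:r1, c0:c1]
        spatial = [patch.mean(), patch.std()]   # space domain
        mag = np.abs(np.fft.fft2(patch))        # frequency domain
        freq = [mag.mean(), mag.max()]
        feats.extend(spatial + freq)
    return np.array(feats)

rng = np.random.default_rng(1)
frame = rng.random((64, 64))  # stand-in for a grayscale face crop
# four made-up forgery-sensitive regions (e.g. eyes, nose, mouth)
regions = [(8, 24, 8, 24), (8, 24, 40, 56), (28, 44, 24, 40), (44, 60, 20, 44)]
f = region_features(frame, regions)  # 4 regions x 4 stats = 16 features
```

A real detector would learn these local descriptors (and add temporal differences across frames for the time domain) rather than hand-craft them; the sketch only shows how per-region, per-domain features can be pooled into one cross-domain vector.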
Digital image forensics via meta-learning and few-shot learning
Digital images are a substantial portion of the information conveyed by social media, the Internet, and television in our daily lives. In recent years, digital images have become not only one of the public information carriers, but also a crucial piece of evidence. The widespread availability of low-cost, user-friendly, and potent image editing software and mobile phone applications facilitates altering images without professional expertise. Consequently, safeguarding the originality and integrity of digital images has become a challenge. Forgers commonly use digital image manipulation to transmit misleading information. Digital image forensics investigates the irregular patterns that might result from image alteration. It is crucial to information security.
Over the past several years, machine learning techniques have been effectively used to identify image forgeries. Convolutional Neural Networks (CNNs) are a common machine learning approach. A standard CNN model can distinguish between original and manipulated images. In this dissertation, two CNN models are introduced to recognize seam carving and Gaussian filtering.
To train a conventional CNN model for a new but similar image forgery detection task, one must start from scratch. Additionally, many types of tampered image data are challenging to acquire or simulate.
Meta-learning is an alternative learning paradigm in which a machine learning model gains experience across numerous related tasks and uses this expertise to improve its future learning performance. Few-shot learning is a method for acquiring knowledge from only a few examples; it can classify images with as few as one or two examples per class. Inspired by meta-learning and few-shot learning, this dissertation proposes a prototypical networks model capable of solving a collection of related image forgery detection problems. Unlike traditional CNN models, the proposed prototypical networks model does not need to be trained from scratch for a new task. Additionally, it drastically decreases the quantity of training images required.
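The core mechanism of prototypical networks, classifying a query by its distance to per-class mean embeddings, fits in a few lines. This is a generic sketch of the technique, not the dissertation's model: the 2-D "embeddings" are toy values standing in for learned CNN features:

```python
import numpy as np

def prototypes(support_x, support_y):
    """Class prototype = mean embedding of that class's support examples."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query to the class of the nearest prototype
    (Euclidean distance in embedding space)."""
    d = np.linalg.norm(query_x[:, None, :] - protos[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]

# toy 2-class, 2-shot episode: class 0 near (0,0), class 1 near (5,5)
support_x = np.array([[0.1, 0.0], [0.0, 0.2], [5.1, 4.9], [4.8, 5.2]])
support_y = np.array([0, 0, 1, 1])
classes, protos = prototypes(support_x, support_y)
pred = classify(np.array([[0.2, 0.1], [5.0, 5.0]]), classes, protos)
# pred -> [0, 1]
```

Because a new task only requires computing fresh prototypes from a handful of support examples, no retraining from scratch is needed, which is exactly the property the dissertation exploits for new forgery types.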
GLFF: Global and Local Feature Fusion for AI-synthesized Image Detection
With the rapid development of deep generative models (such as Generative
Adversarial Networks and Diffusion models), AI-synthesized images are now of
such high quality that humans can hardly distinguish them from pristine ones.
Although existing detection methods have shown high performance in specific
evaluation settings, e.g., on images from seen models or on images without
real-world post-processing, they tend to suffer serious performance degradation
in real-world scenarios where testing images can be generated by more powerful
generation models or combined with various post-processing operations. To
address this issue, we propose a Global and Local Feature Fusion (GLFF)
framework to learn rich and discriminative representations by combining
multi-scale global features from the whole image with refined local features
from informative patches for AI synthesized image detection. GLFF fuses
information from two branches: the global branch to extract multi-scale
semantic features and the local branch to select informative patches for
detailed local artifact extraction. Due to the lack of a synthesized image
dataset simulating real-world applications for evaluation, we further create a
challenging fake image dataset, named DeepFakeFaceForensics (DF³), which
covers 6 state-of-the-art generation models and a variety of post-processing
techniques to approach real-world scenarios. Experimental results
demonstrate the superiority of our method over state-of-the-art methods on
the proposed DF³ dataset and three other open-source datasets.
Comment: 13 pages, 6 figures, 8 tables
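The two-branch fuse-global-with-local idea can be sketched with hand-rolled features: multi-scale pooled statistics for the global branch, and the highest-variance patches as a crude stand-in for "informative" patch selection. All feature choices here are illustrative assumptions, not the GLFF architecture:

```python
import numpy as np

def global_features(img, scales=(1, 2)):
    """Global branch sketch: mean-pool the image over grids of
    several scales (multi-scale global features)."""
    feats = []
    h, w = img.shape
    for s in scales:
        bh, bw = h // s, w // s
        for i in range(s):
            for j in range(s):
                feats.append(img[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean())
    return np.array(feats)

def top_patch_features(img, patch=8, k=2):
    """Local branch sketch: keep stats from the k highest-variance
    patches (variance as a crude 'informativeness' proxy)."""
    h, w = img.shape
    stats = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            p = img[i:i+patch, j:j+patch]
            stats.append((p.var(), p.mean(), p.std()))
    stats.sort(reverse=True)                      # most informative first
    return np.array([[m, s] for _, m, s in stats[:k]]).ravel()

rng = np.random.default_rng(2)
img = rng.random((32, 32))  # stand-in for a grayscale test image
fused = np.concatenate([global_features(img), top_patch_features(img)])
# 5 global features (1x1 + 2x2 grids) + 2 patches x 2 stats = 9 features
```

In GLFF both branches are learned and fused inside the network; the sketch only shows the shape of the idea, i.e. that whole-image semantics and patch-level artifact cues end up in one joint representation.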
A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection
Due to limited computational and memory resources, current deep learning
models accept only rather small images in input, calling for preliminary image
resizing. This is not a problem for high-level vision problems, where
discriminative features are barely affected by resizing. In image forensics,
on the contrary, resizing tends to destroy precious high-frequency details,
heavily impacting performance. One can avoid resizing by means of patch-wise
processing, at the cost of forgoing whole-image analysis. In this work, we
propose a CNN-based image forgery detection framework which makes decisions
based on full-resolution information gathered from the whole image. Thanks to
gradient checkpointing, the framework is trainable end-to-end with limited
memory resources and weak (image-level) supervision, allowing for the joint
optimization of all parameters. Experiments on widely used image forensics
datasets demonstrate the good performance of the proposed approach, which
largely outperforms all baselines and all reference methods.
Comment: 13 pages, 12 figures, journal
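Gradient checkpointing, the memory-saving trick the paper relies on, trades compute for memory: store activations only at a few layers, and recompute the rest during the backward pass. A minimal scalar sketch of the idea (pure Python, hand-written derivatives, not the paper's framework or PyTorch's `torch.utils.checkpoint`):

```python
import math

def checkpointed_grad(x, layers, every=2):
    """Output and d(out)/dx for a chain of (f, df) layers, storing
    activations only every `every` layers and recomputing the rest
    on the backward pass -- the gradient-checkpointing idea."""
    ckpts = {0: x}                     # checkpoint: input of layer 0
    a = x
    for i, (f, _) in enumerate(layers):
        a = f(a)
        if (i + 1) % every == 0 and (i + 1) < len(layers):
            ckpts[i + 1] = a           # store input of layer i+1
    out = a

    grad = 1.0
    starts = sorted(ckpts)             # segment boundaries
    for s in reversed(range(len(starts))):
        start = starts[s]
        end = starts[s + 1] if s + 1 < len(starts) else len(layers)
        # recompute inputs of layers [start, end) from the checkpoint
        acts = [ckpts[start]]
        for i in range(start, end - 1):
            acts.append(layers[i][0](acts[-1]))
        # chain rule, walking this segment backwards
        for i in reversed(range(start, end)):
            grad *= layers[i][1](acts[i - start])
    return out, grad

# toy chain: four sin layers; d(sin)/dx = cos
layers = [(math.sin, math.cos)] * 4
out, g = checkpointed_grad(0.5, layers, every=2)

# reference: plain backprop storing every activation
acts = [0.5]
for f, _ in layers:
    acts.append(f(acts[-1]))
ref = 1.0
for i, (_, df) in enumerate(layers):
    ref *= df(acts[i])
```

With `every=2` only half the intermediate activations are held at once, yet `g` matches the full-memory reference exactly; scaled up to a deep CNN over full-resolution images, this is what makes end-to-end training fit in limited memory.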
Robust Sequential DeepFake Detection
Since photorealistic faces can now be readily generated by facial manipulation
technologies, the potential malicious abuse of these technologies has
raised great concern. Numerous deepfake detection methods have thus been proposed.
However, existing methods focus only on detecting one-step facial manipulation.
With the emergence of easily accessible facial editing applications, people can
easily manipulate facial components through multi-step operations in a sequential
manner. This new threat requires us to detect a sequence of facial
manipulations, which is vital for both detecting deepfake media and recovering
original faces afterwards. Motivated by this observation, we emphasize the need
and propose a novel research problem called Detecting Sequential DeepFake
Manipulation (Seq-DeepFake). Unlike the existing deepfake detection task only
demanding a binary label prediction, detecting Seq-DeepFake manipulation
requires correctly predicting a sequential vector of facial manipulation
operations. To support a large-scale investigation, we construct the first
Seq-DeepFake dataset, where face images are manipulated sequentially with
corresponding annotations of sequential facial manipulation vectors. Based on
this new dataset, we cast detecting Seq-DeepFake manipulation as a specific
image-to-sequence task and propose a concise yet effective Seq-DeepFake
Transformer (SeqFakeFormer). To better reflect real-world deepfake data
distributions, we further apply various perturbations on the original
Seq-DeepFake dataset and construct the more challenging Sequential DeepFake
dataset with perturbations (Seq-DeepFake-P). To exploit deeper correlation
between images and sequences when facing Seq-DeepFake-P, a dedicated
Seq-DeepFake Transformer with Image-Sequence Reasoning (SeqFakeFormer++) is
devised, which builds stronger correspondence between image-sequence pairs for
more robust Seq-DeepFake detection.
Comment: Extension of our ECCV 2022 paper: arXiv:2207.02204. Code:
https://github.com/rshaojimmy/SeqDeepFak