Cross-Domain Local Characteristic Enhanced Deepfake Video Detection
As ultra-realistic face forgery techniques emerge, deepfake detection has
attracted increasing attention due to security concerns. Many detectors cannot
achieve accurate results when detecting unseen manipulations despite excellent
performance on known forgeries. In this paper, we are motivated by the
observation that the discrepancies between real and fake videos are extremely
subtle and localized, and inconsistencies or irregularities can exist in some
critical facial regions across various information domains. To this end, we
propose a novel pipeline, Cross-Domain Local Forensics (XDLF), for more general
deepfake video detection. In the proposed pipeline, a specialized framework is
presented to simultaneously exploit local forgery patterns from space,
frequency, and time domains, thus learning cross-domain features to detect
forgeries. Moreover, the framework leverages four high-level forgery-sensitive
local regions of a human face to guide the model to enhance subtle artifacts
and localize potential anomalies. Extensive experiments on several benchmark
datasets demonstrate the impressive performance of our method, and we achieve
superiority over several state-of-the-art methods on cross-dataset
generalization. We also examined the factors that contribute to its performance
through ablations, which suggests that exploiting cross-domain local
characteristics is a noteworthy direction for developing more general deepfake
detectors
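The cross-domain idea, combining spatial, frequency, and temporal cues from a
local facial region, can be sketched in miniature as follows (a toy
pure-Python illustration, not the authors' XDLF framework; all function names
are hypothetical, and a real detector would learn such features with deep
networks rather than hand-craft them):

```python
import math

def spatial_feature(patch):
    """Space-domain cue: variance of pixel intensities in a flat patch."""
    n = len(patch)
    mean = sum(patch) / n
    return sum((p - mean) ** 2 for p in patch) / n

def frequency_feature(patch):
    """Frequency-domain cue: energy in the upper half of a naive 1-D DFT."""
    n = len(patch)
    energy = 0.0
    for k in range(n // 2, n):
        re = sum(p * math.cos(2 * math.pi * k * t / n)
                 for t, p in enumerate(patch))
        im = -sum(p * math.sin(2 * math.pi * k * t / n)
                  for t, p in enumerate(patch))
        energy += re * re + im * im
    return energy / n

def temporal_feature(patch_t, patch_t1):
    """Time-domain cue: mean absolute frame-to-frame difference."""
    return sum(abs(a - b) for a, b in zip(patch_t, patch_t1)) / len(patch_t)

def cross_domain_descriptor(region_frames):
    """Concatenate space/frequency/time cues for one facial region
    across consecutive frames."""
    feats = []
    for t in range(len(region_frames) - 1):
        cur, nxt = region_frames[t], region_frames[t + 1]
        feats.extend([spatial_feature(cur),
                      frequency_feature(cur),
                      temporal_feature(cur, nxt)])
    return feats
```

Here each region is a flat list of pixel intensities per frame; concatenating
the three per-domain cues over consecutive frames yields a small cross-domain
descriptor per region.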
Feature Aggregation and Region-Aware Learning for Detection of Splicing Forgery
Detection of image splicing forgery has become an increasingly difficult task due to the scale variations of the forged areas and the traces of manipulation concealed by post-processing techniques. Most existing methods fail to jointly exploit multi-scale local and global information and ignore the correlations between tampered and real regions across images, which limits the detection of multi-scale tampered regions. To tackle these challenges, in this paper we propose a novel method based on feature aggregation and region-aware learning to detect manipulated areas of varying scales. Specifically, we first integrate multi-level adjacency features using a feature selection mechanism to improve feature representation. Second, a cross-domain correlation aggregation module is devised to enhance the correlation between local features from a CNN and global representations from a Transformer, allowing a complementary fusion of dual-domain information. Third, a region-aware learning mechanism is designed to improve feature discrimination by comparing the similarities and differences of features between different regions. Extensive evaluations on benchmark datasets demonstrate the method's effectiveness in detecting multi-scale spliced tampered regions.
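The region-aware learning step, comparing similarities and differences of
features between regions, can be illustrated with a small sketch (a
hypothetical toy metric on plain feature vectors, not the paper's actual
loss; in the paper this contrast would be computed on learned deep features):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def region_contrast(tampered_feats, authentic_feats):
    """Average within-class similarity minus cross-class similarity:
    higher values mean tampered and authentic regions are better
    separated in feature space."""
    def avg_sim(xs, ys):
        pairs = [(x, y) for x in xs for y in ys if x is not y]
        return sum(cosine_similarity(x, y) for x, y in pairs) / len(pairs)
    within = (avg_sim(tampered_feats, tampered_feats)
              + avg_sim(authentic_feats, authentic_feats)) / 2
    across = avg_sim(tampered_feats, authentic_feats)
    return within - across
```

A training objective of this flavor would push `region_contrast` up, so that
features from tampered regions cluster together and away from authentic ones.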
D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization
Recently, many detection methods based on convolutional neural networks
(CNNs) have been proposed for image splicing forgery detection. Most of these
detection methods focus on the local patches or local objects. In fact, image
splicing forgery detection is a global binary classification task that
distinguishes the tampered and non-tampered regions by image fingerprints.
However, certain image contents are hardly retained by CNN-based detection
networks, even though retaining them would improve detection accuracy. To
resolve these issues, we propose a novel network called
dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs
an unfixed encoder and a fixed encoder. The unfixed encoder autonomously learns
the image fingerprints that differentiate between the tampered and non-tampered
regions, whereas the fixed encoder intentionally provides the direction
information that assists the learning and detection of the network. This
dual-encoder is followed by a spatial pyramid global-feature extraction module
that expands the global insight of D-Unet for classifying the tampered and
non-tampered regions more accurately. In an experimental comparison study of
D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in
image-level and pixel-level detection, without requiring pre-training or
training on a large number of forgery images. Moreover, it was stably robust to
different attacks.
Comment: 13 pages, 13 figures
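The abstract does not specify what the fixed encoder computes; as an assumed
illustration only, a hand-crafted high-pass residual filter is a common way
to expose splicing fingerprints that content-focused networks discard. A
minimal sketch (this Laplacian-style kernel is a stand-in, not D-Unet's
actual fixed encoder):

```python
def high_pass_residual(image):
    """Apply a fixed 3x3 high-pass kernel to a 2-D grayscale image:
    zero response on smooth regions, large response at abrupt
    transitions such as splicing boundaries."""
    kernel = [[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]]
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out
```

Because the kernel's weights sum to zero, natural image content is largely
suppressed while residual discontinuities survive, which is the kind of
"direction information" a fixed encoder can hand to the learned branch.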
A Survey on Passive Digital Video Forgery Detection Techniques
Digital media devices such as smartphones, cameras, and notebooks are becoming increasingly popular. Through digital platforms such as Facebook, WhatsApp, Twitter, and others, people share digital images, videos, and audio in large quantities. Especially in a crime scene investigation, digital evidence plays a crucial role in the courtroom. Manipulating video content with high-quality software tools has become easier, which makes fabricating video content more efficient. It is therefore necessary to develop authentication methods for detecting and verifying manipulated videos. The objective of this paper is to provide a comprehensive review of passive methods for detecting video forgeries. The primary goal of this survey is to study and analyze the existing passive techniques for detecting video forgeries. First, an overview of the basic information needed to understand video forgery detection is presented. Then, an in-depth review is provided of the techniques used for spatial, temporal, and spatio-temporal domain analysis of videos, the datasets used, and their limitations. In the following sections, standard benchmark video forgery datasets and the generalized architecture for passive video forgery detection techniques are discussed in more depth. Finally, loopholes in existing surveys are identified so that forged videos can be detected much more effectively in the future.
On Generative Adversarial Network Based Synthetic Iris Presentation Attack And Its Detection
Human iris is considered a reliable and accurate modality for biometric recognition due to its unique texture information. Reliability and accuracy of iris biometric modality have prompted its large-scale deployment for critical applications such as border control and national identification projects. The extensive growth of iris recognition systems has raised apprehensions about the susceptibility of these systems to various presentation attacks.
In this thesis, a novel iris presentation attack using deep learning based synthetically generated iris images is presented. Utilizing the generative capability of deep convolutional generative adversarial networks and iris quality metrics, a new framework, named iDCGAN, is proposed for creating realistic-appearing synthetic iris images. In-depth analysis is performed using quality score distributions of real and synthetically generated iris images to understand the effectiveness of the proposed approach. We also demonstrate that synthetically generated iris images can be used to attack existing iris recognition systems.
As synthetically generated iris images can be effectively deployed in iris presentation attacks, it is important to develop accurate iris presentation attack detection algorithms that can distinguish such synthetic iris images from real iris images. For this purpose, a novel structural and textural feature-based iris presentation attack detection framework (DESIST) is proposed. The key emphasis of DESIST is on developing a unified framework for detecting a medley of iris presentation attacks, including synthetic iris. Experimental evaluations showcase the efficacy of the proposed DESIST framework in detecting synthetic iris presentation attacks.
Image and Video Forensics
Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated the instant distribution and sharing of digital images on social platforms, generating a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts on the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges and to ensure media authenticity.
Towards Robust GAN-generated Image Detection: a Multi-view Completion Representation
GAN-generated image detection has become the first line of defense against
the malicious use of machine-synthesized image manipulations such as
deepfakes. Although some existing detectors work well on clean, known GAN
samples, their success is largely attributable to overfitting to unstable
features such as frequency artifacts, which causes failures when facing
unknown GANs or perturbation attacks. To overcome this issue, we propose a
robust detection framework based on a novel multi-view image completion
representation. The framework first learns various view-to-image tasks to model
the diverse distributions of genuine images. Frequency-irrelevant features can
be represented from the distributional discrepancies characterized by the
completion models, which are stable, generalized, and robust for detecting
unknown fake patterns. Then, a multi-view classification is devised with
elaborated intra- and inter-view learning strategies to enhance view-specific
feature representation and cross-view feature aggregation, respectively. We
evaluated the generalization ability of our framework across six popular GANs
at different resolutions and its robustness against a broad range of
perturbation attacks. The results confirm our method's improved
effectiveness, generalization, and robustness over various baselines.
Comment: Accepted to IJCAI 202
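The completion-based cue, measuring how poorly an image region is predicted
from its surroundings, can be caricatured as follows (a toy stand-in: the
border-mean predictor below replaces the paper's learned view-to-image
completion models, and the function name is hypothetical):

```python
def completion_residual(image, y0, y1, x0, x1):
    """Completion-based anomaly cue: predict the masked block
    image[y0:y1][x0:x1] from the mean of its one-pixel border (a crude
    stand-in for a learned completion model), then return the mean
    absolute reconstruction residual. Large residuals flag content
    that does not follow the statistics of its surroundings."""
    border = []
    for y in range(y0 - 1, y1 + 1):          # left/right columns (with corners)
        border.extend([image[y][x0 - 1], image[y][x1]])
    for x in range(x0, x1):                  # top/bottom rows
        border.extend([image[y0 - 1][x], image[y1][x]])
    predicted = sum(border) / len(border)
    residual = [abs(image[y][x] - predicted)
                for y in range(y0, y1) for x in range(x0, x1)]
    return sum(residual) / len(residual)
```

A genuine image whose local statistics are consistent yields a small
residual, while content a completion model cannot explain yields a large
one; the paper's detector aggregates such discrepancies across multiple
learned views instead of this single crude predictor.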