476 research outputs found

    Cross-Domain Local Characteristic Enhanced Deepfake Video Detection

    Full text link
    As ultra-realistic face forgery techniques emerge, deepfake detection has attracted increasing attention due to security concerns. Many detectors cannot achieve accurate results when detecting unseen manipulations despite excellent performance on known forgeries. In this paper, we are motivated by the observation that the discrepancies between real and fake videos are extremely subtle and localized, and inconsistencies or irregularities can exist in some critical facial regions across various information domains. To this end, we propose a novel pipeline, Cross-Domain Local Forensics (XDLF), for more general deepfake video detection. In the proposed pipeline, a specialized framework is presented to simultaneously exploit local forgery patterns from space, frequency, and time domains, thus learning cross-domain features to detect forgeries. Moreover, the framework leverages four high-level forgery-sensitive local regions of a human face to guide the model to enhance subtle artifacts and localize potential anomalies. Extensive experiments on several benchmark datasets demonstrate the impressive performance of our method, and we achieve superiority over several state-of-the-art methods on cross-dataset generalization. We also examined the factors that contribute to its performance through ablations, which suggests that exploiting cross-domain local characteristics is a noteworthy direction for developing more general deepfake detectors

    Feature aggregation and region-aware learning for detection of splicing forgery.

    Get PDF
    Detection of image splicing forgery become an increasingly difficult task due to the scale variations of the forged areas and the covered traces of manipulation from post-processing techniques. Most existing methods fail to jointly multi-scale local and global information and ignore the correlations between the tampered and real regions in inter-image, which affects the detection performance of multi-scale tampered regions. To tackle these challenges, in this paper, we propose a novel method based on feature aggregation and region-aware learning to detect the manipulated areas with varying scales. In specific, we first integrate multi-level adjacency features using a feature selection mechanism to improve feature representation. Second, a cross-domain correlation aggregation module is devised to perform correlation enhancement of local features from CNN and global representations from Transformer, allowing for a complementary fusion of dual-domain information. Third, a region-aware learning mechanism is designed to improve feature discrimination by comparing the similarities and differences of the features between different regions. Extensive evaluations on benchmark datasets indicate the effectiveness in detecting multi-scale spliced tampered regions

    D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization

    Full text link
    Recently, many detection methods based on convolutional neural networks (CNNs) have been proposed for image splicing forgery detection. Most of these detection methods focus on the local patches or local objects. In fact, image splicing forgery detection is a global binary classification task that distinguishes the tampered and non-tampered regions by image fingerprints. However, some specific image contents are hardly retained by CNN-based detection networks, but if included, would improve the detection accuracy of the networks. To resolve these issues, we propose a novel network called dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder. The unfixed encoder autonomously learns the image fingerprints that differentiate between the tampered and non-tampered regions, whereas the fixed encoder intentionally provides the direction information that assists the learning and detection of the network. This dual-encoder is followed by a spatial pyramid global-feature extraction module that expands the global insight of D-Unet for classifying the tampered and non-tampered regions more accurately. In an experimental comparison study of D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in image-level and pixel-level detection, without requiring pre-training or training on a large number of forgery images. Moreover, it was stably robust to different attacks.Comment: 13 pages, 13 figure

    A survey on passive digital video forgery detection techniques

    Get PDF
    Digital media devices such as smartphones, cameras, and notebooks are becoming increasingly popular. Through digital platforms such as Facebook, WhatsApp, Twitter, and others, people share digital images, videos, and audio in large quantities. Especially in a crime scene investigation, digital evidence plays a crucial role in a courtroom. Manipulating video content with high-quality software tools is easier, which helps fabricate video content more efficiently. It is therefore necessary to develop an authenticating method for detecting and verifying manipulated videos. The objective of this paper is to provide a comprehensive review of the passive methods for detecting video forgeries. This survey has the primary goal of studying and analyzing the existing passive techniques for detecting video forgeries. First, an overview of the basic information needed to understand video forgery detection is presented. Later, it provides an in-depth understanding of the techniques used in the spatial, temporal, and spatio-temporal domain analysis of videos, datasets used, and their limitations are reviewed. In the following sections, standard benchmark video forgery datasets and the generalized architecture for passive video forgery detection techniques are discussed in more depth. Finally, identifying loopholes in existing surveys so detecting forged videos much more effectively in the future are discussed

    On Generative Adversarial Network Based Synthetic Iris Presentation Attack And Its Detection

    Get PDF
    Human iris is considered a reliable and accurate modality for biometric recognition due to its unique texture information. Reliability and accuracy of iris biometric modality have prompted its large-scale deployment for critical applications such as border control and national identification projects. The extensive growth of iris recognition systems has raised apprehensions about the susceptibility of these systems to various presentation attacks. In this thesis, a novel iris presentation attack using deep learning based synthetically generated iris images is presented. Utilizing the generative capability of deep convolutional generative adversarial networks and iris quality metrics, a new framework, named as iDCGAN is proposed for creating realistic appearing synthetic iris images. In-depth analysis is performed using quality score distributions of real and synthetically generated iris images to understand the effectiveness of the proposed approach. We also demonstrate that synthetically generated iris images can be used to attack existing iris recognition systems. As synthetically generated iris images can be effectively deployed in iris presentation attacks, it is important to develop accurate iris presentation attack detection algorithms which can distinguish such synthetic iris images from real iris images. For this purpose, a novel structural and textural feature-based iris presentation attack detection framework (DESIST) is proposed. The key emphasis of DESIST is on developing a unified framework for detecting a medley of iris presentation attacks, including synthetic iris. Experimental evaluations showcase the efficacy of the proposed DESIST framework in detecting synthetic iris presentation attacks

    Image and Video Forensics

    Get PDF
    Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, determining a great amount of exchange data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity

    Towards Robust GAN-generated Image Detection: a Multi-view Completion Representation

    Full text link
    GAN-generated image detection now becomes the first line of defense against the malicious uses of machine-synthesized image manipulations such as deepfakes. Although some existing detectors work well in detecting clean, known GAN samples, their success is largely attributable to overfitting unstable features such as frequency artifacts, which will cause failures when facing unknown GANs or perturbation attacks. To overcome the issue, we propose a robust detection framework based on a novel multi-view image completion representation. The framework first learns various view-to-image tasks to model the diverse distributions of genuine images. Frequency-irrelevant features can be represented from the distributional discrepancies characterized by the completion models, which are stable, generalized, and robust for detecting unknown fake patterns. Then, a multi-view classification is devised with elaborated intra- and inter-view learning strategies to enhance view-specific feature representation and cross-view feature aggregation, respectively. We evaluated the generalization ability of our framework across six popular GANs at different resolutions and its robustness against a broad range of perturbation attacks. The results confirm our method's improved effectiveness, generalization, and robustness over various baselines.Comment: Accepted to IJCAI 202
    • …
    corecore