24 research outputs found

    Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection

    Full text link
    Copy-move forgery is a manipulation in which specific patches of an image are copied and pasted elsewhere within the same image, with potentially illegal or unethical uses. Recent advances in forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness. However, for images with high self-similarity or strong signal corruption, existing algorithms often exhibit inefficient processing and unreliable results. This is mainly due to the inherent semantic gap between low-level visual representations and high-level semantic concepts. In this paper, we present the first study that attempts to mitigate the semantic-gap problem in copy-move forgery detection, using spatial pooling of local moment invariants for mid-level image representation. Our detection method extends prior work in two respects: 1) we introduce the bag-of-visual-words model into this field for the first time, which may open a new perspective for forensic study; 2) we propose a word-to-phrase feature description and matching pipeline that covers the spatial structure and visual saliency information of digital images. Extensive experimental results show the superior performance of our framework over state-of-the-art algorithms in overcoming the related problems caused by the semantic gap. Comment: 13 pages, 11 figures.
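
    The abstract does not detail the pipeline, so the following is only a minimal sketch of the underlying bag-of-visual-words idea, assuming OpenCV SIFT keypoints and k-means quantization in place of the paper's local moment invariants and word-to-phrase matching; the function name, thresholds, and distance test are illustrative, not the authors' settings.

```python
# Minimal bag-of-visual-words sketch for copy-move candidate matching.
# Illustrative only: the paper's moment invariants and word-to-phrase
# pipeline are not reproduced here.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bovw_copy_move_candidates(image_path, n_words=64, min_dist=50.0, rel_thresh=0.3):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)
    # Quantize local descriptors into visual words.
    words = KMeans(n_clusters=n_words, n_init=10).fit(descriptors).labels_
    pts = np.array([kp.pt for kp in keypoints])
    candidates = []
    # Keypoints sharing a visual word but lying far apart are copy-move suspects.
    for w in range(n_words):
        idx = np.where(words == w)[0]
        for i in range(len(idx)):
            for j in range(i + 1, len(idx)):
                a, b = idx[i], idx[j]
                if np.linalg.norm(pts[a] - pts[b]) > min_dist:
                    # Confirm with a relative raw-descriptor distance test.
                    d = np.linalg.norm(descriptors[a] - descriptors[b])
                    if d < rel_thresh * np.linalg.norm(descriptors[a]):
                        candidates.append((tuple(pts[a]), tuple(pts[b])))
    return candidates
```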

    An Evaluation of Popular Copy-Move Forgery Detection Approaches

    Full text link
    A copy-move forgery is created by copying and pasting content within the same image, and potentially post-processing it. In recent years, the detection of copy-move forgeries has become one of the most actively researched topics in blind image forensics. A considerable number of different algorithms have been proposed, focusing on different types of post-processed copies. In this paper, we aim to answer which copy-move forgery detection algorithms and processing steps (e.g., matching, filtering, outlier detection, affine transformation estimation) perform best in various post-processing scenarios. The focus of our analysis is to evaluate the performance of previously proposed feature sets. We achieve this by casting existing algorithms into a common pipeline. We examined the 15 most prominent feature sets, analyzing detection performance on a per-image basis and on a per-pixel basis. We created a challenging real-world copy-move dataset and a software framework for systematic image manipulation. Experiments show that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA, and Zernike features, perform very well. These feature sets exhibit the best robustness against various noise sources and downsampling, while reliably identifying the copied regions. Comment: Main paper: 14 pages, supplemental material: 12 pages, main paper appeared in IEEE Transactions on Information Forensics and Security.
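
    For readers unfamiliar with the common pipeline the paper refers to, the sketch below shows one block-based instantiation: DCT features per block, lexicographic sorting for matching, and shift-vector voting as a simple filtering and outlier-removal step. Block size, coefficient count, and thresholds are illustrative assumptions, not values from the paper.

```python
# Sketch of a block-based copy-move pipeline: feature extraction,
# lexicographic sorting for matching, and shift-vector voting for filtering.
import numpy as np
from scipy.fftpack import dctn

def detect_copy_move(img, block=16, n_coeffs=16, min_shift=20, min_votes=10):
    h, w = img.shape
    feats, coords = [], []
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            patch = img[y:y + block, x:x + block].astype(float)
            c = dctn(patch, norm='ortho')
            feats.append(c.flatten()[:n_coeffs])  # low-frequency DCT coefficients
            coords.append((y, x))
    feats = np.array(feats)
    order = np.lexsort(feats.T[::-1])             # lexicographic sort of features
    votes = {}
    # Neighbors in sorted order have near-identical features; collect their shifts.
    for a, b in zip(order[:-1], order[1:]):
        (y1, x1), (y2, x2) = coords[a], coords[b]
        shift = (y1 - y2, x1 - x2)
        if shift[0] ** 2 + shift[1] ** 2 >= min_shift ** 2:
            votes.setdefault(shift, []).append((coords[a], coords[b]))
    # Keep only shift vectors supported by many block pairs (outlier removal).
    return {s: p for s, p in votes.items() if len(p) >= min_votes}
```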

    Manipulation Mask Generator: High-Quality Image Manipulation Mask Generation Method Based on Modified Total Variation Noise Reduction

    Full text link
    In artificial intelligence, any model that aims to achieve good results depends on a large amount of high-quality data. This is especially true in the field of tamper detection. This paper proposes a modified total-variation noise reduction method to acquire high-quality tampered images. We automatically crawl original and tampered images from the Baidu PS Bar, a website where users post countless tampered images. Subtracting the tampered image from the original image highlights the tampered area. However, the resulting difference image also contains substantial noise, so these images cannot be used directly in a deep learning model. Our modified total-variation noise reduction method is aimed at solving this problem. Because text strokes are thin, text information is easily lost after morphological opening and closing operations; we therefore use MSER (Maximally Stable Extremal Regions) and NMS (Non-Maximum Suppression) to extract text regions. We then apply the modified total-variation noise reduction to the difference image. Finally, by combining the denoised image with the extracted text information, we obtain an image with little noise that largely retains the text. Datasets generated in this way can be used in deep learning models and help them achieve better results.
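
    A minimal sketch of the described idea follows, assuming scikit-image's standard Chambolle TV denoiser as a stand-in for the authors' modified total-variation method and OpenCV's MSER for text recovery; the weights and thresholds are illustrative.

```python
# Sketch of the mask-generation idea: subtract, TV-denoise the difference,
# recover thin text regions with MSER, then merge. Uses scikit-image's
# standard TV denoiser in place of the paper's modified variant.
import cv2
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def manipulation_mask(original, tampered, weight=0.2, diff_thresh=30):
    # The difference image highlights tampered pixels but is noisy.
    diff = cv2.absdiff(original, tampered)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    # Total-variation denoising suppresses small isolated noise.
    smooth = denoise_tv_chambolle(gray.astype(float) / 255.0, weight=weight)
    mask = (smooth * 255 > diff_thresh).astype(np.uint8) * 255
    # MSER recovers thin text strokes that smoothing tends to erase.
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(cv2.cvtColor(tampered, cv2.COLOR_BGR2GRAY))
    text = np.zeros_like(mask)
    for pts in regions:
        cv2.fillPoly(text, [pts.reshape(-1, 1, 2)], 255)
    # Keep text regions only where the raw difference also fired.
    text = cv2.bitwise_and(text, (gray > diff_thresh).astype(np.uint8) * 255)
    return cv2.bitwise_or(mask, text)
```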

    Perceptual Video Hashing for Content Identification and Authentication

    Get PDF
    Perceptual hashing has been broadly used in the literature to identify similar contents for video copy detection. It has also been adopted to detect malicious manipulations for video authentication. However, targeting both applications with a single system using the same hash is highly desirable, as it saves storage space and reduces computational complexity. This paper proposes a perceptual video hashing system for content identification and authentication. The objective is to design a hash extraction technique that can withstand signal processing operations on the one hand and detect malicious attacks on the other. The proposed system relies on a new signal calibration technique for extracting the hash using the discrete cosine transform (DCT) and the discrete sine transform (DST). This consists of determining the number of samples, called the normalizing shift, required for shifting a digital signal so that the shifted version matches a certain pattern according to the DCT/DST coefficients. The rationale for the calibration idea is that the normalizing shift resists signal processing operations while remaining sensitive to local tampering (i.e., replacing a small portion of the signal with a different one). While the same hash serves both applications, two different similarity measures are proposed for video identification and authentication, respectively. Through extensive experiments with various types of video distortions and manipulations, the proposed system has been shown to outperform related state-of-the-art video hashing techniques in terms of identification and authentication, with the advantageous ability to locate tampered regions.
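
    The abstract does not specify the pattern used for calibration, so the sketch below only illustrates the general mechanism on a 1-D signal: try every circular shift and keep the one whose DCT coefficients best match a chosen criterion. The criterion used here (maximizing the first AC DCT coefficient) is an assumption, not the paper's.

```python
# Illustrative "normalizing shift" on a 1-D signal: circularly shift until a
# DCT-based criterion is met. The pattern criterion below is an assumption.
import numpy as np
from scipy.fft import dct

def normalizing_shift(signal):
    # Try every circular shift; keep the one maximizing the first AC coefficient.
    best_shift, best_value = 0, -np.inf
    for s in range(len(signal)):
        c = dct(np.roll(signal, s), norm='ortho')[1]
        if c > best_value:
            best_shift, best_value = s, c
    return best_shift

# Global processing shifts all candidates alike, so the winning shift is stable;
# local tampering perturbs it, which is the sensitivity the hash exploits.
```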

    Towards Learning Representations in Visual Computing Tasks

    Get PDF
    The performance of most visual computing tasks depends on the quality of the features extracted from the raw data. An insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for unobserved input. A good representation should also handle anomalies in the data, such as missing samples and noisy input caused by undesired external factors of variation, and should reduce data redundancy. Over the years, many feature extraction processes have been devised to produce good representations of raw images and videos. These processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task; hand-engineering features requires domain expertise and manual labor, but the resulting feature extraction process is interpretable and explainable. The second group contains latent-feature extraction processes. While the original features lie in a high-dimensional space, the factors relevant to a task often lie on a lower-dimensional manifold. Latent-feature extraction employs hidden variables to expose underlying data properties that cannot be measured directly from the input, and imposes a specific structure, such as sparsity or low rank, on the derived representation through sophisticated optimization techniques. The last category is that of deep features, obtained by passing raw input data with minimal pre-processing through a deep network whose parameters are computed by iteratively minimizing a task-based loss. In this dissertation, I present four pieces of work in which I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks pairs of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For the last two tasks, I propose novel deep architectures and show significant improvement over previous state-of-the-art approaches. A suitable combination of feature representations, augmented with an appropriate learning approach, can increase performance for most visual computing tasks. Doctoral Dissertation, Computer Science, 201
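
    The latent-feature category described above can be made concrete with a small sketch: both structures named in the abstract, low rank and sparsity, are available in scikit-learn. The data, dimensions, and hyperparameters below are illustrative only.

```python
# Minimal sketch contrasting two latent-feature structures: a low-rank
# projection (PCA) and a sparse code (dictionary learning).
import numpy as np
from sklearn.decomposition import PCA, DictionaryLearning

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))          # stand-in for raw image patches

# Low rank: project each sample onto the top principal components.
low_rank = PCA(n_components=8).fit_transform(X)

# Sparsity: represent each sample as a sparse combination of dictionary atoms.
sparse = DictionaryLearning(n_components=16, transform_algorithm='lasso_lars',
                            transform_alpha=0.1, max_iter=20).fit_transform(X)

print(low_rank.shape, sparse.shape)     # (200, 8) (200, 16)
```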

    Image and Video Forensics

    Get PDF
    Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated the instant distribution and sharing of digital images on social platforms, generating a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts on identifying the source and detecting manipulation. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake-news debunking, information warfare, and cyberattacks), forensic technologies that help determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book collects a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges in ensuring media authenticity.

    Handbook of Digital Face Manipulation and Detection

    Get PDF
    This open access book provides the first comprehensive collection of studies on the hot topic of digital face manipulation, such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics, including contributions from academia and industry. Appealing to a broad readership, the introductory chapters provide a comprehensive overview of the topic for readers wishing to gain a brief view of the state of the art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing to further literature. Hence, the primary readership comprises academic institutions and industry currently involved in digital face manipulation and detection. The book can easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.