Perceptual MAE for Image Manipulation Localization: A High-level Vision Learner Focusing on Low-level Features
Nowadays, multimedia forensics faces unprecedented challenges due to the
rapid advancement of multimedia generation technology, thereby making Image
Manipulation Localization (IML) crucial in the pursuit of truth. The key to IML
lies in revealing the artifacts or inconsistencies between the tampered and
authentic areas, which are evident under pixel-level features. Consequently,
existing studies treat IML as a low-level vision task, focusing on predicting
tamper masks by crafting pixel-level features such as RGB noise, edge
signals, or high-frequency features. However, in practice, tampering commonly
occurs at the object level, and different classes of objects have varying
likelihoods of becoming targets of tampering. Therefore, object semantics are
also vital in identifying the tampered areas in addition to pixel-level
features. This necessitates IML models to carry out a semantic understanding of
the entire image. In this paper, we reformulate the IML task as a high-level
vision task that greatly benefits from low-level features. Based on such an
interpretation, we propose a method to enhance the Masked Autoencoder (MAE) by
incorporating high-resolution inputs and a perceptual loss supervision module,
which is termed Perceptual MAE (PMAE). While MAE has demonstrated an impressive
understanding of object semantics, PMAE can also compensate for low-level
semantics with our proposed enhancements. As evidenced by extensive experiments,
this paradigm effectively unites the low-level and high-level features of the
IML task and outperforms state-of-the-art tampering localization methods on all
five publicly available datasets.
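A minimal sketch of the kind of objective the abstract describes: a masked pixel reconstruction term combined with a feature-space (perceptual) term. The abstract does not give the loss formulation, so everything here is an assumption for illustration: the names `toy_features` and `pmae_style_loss`, the `weight` parameter, and the use of finite differences as a stand-in for a frozen pretrained feature network.

```python
import numpy as np

def toy_features(img):
    """Hypothetical stand-in for a frozen pretrained feature network:
    horizontal and vertical finite differences of a 2-D image."""
    dx = img[:, 1:] - img[:, :-1]
    dy = img[1:, :] - img[:-1, :]
    return dx, dy

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def pmae_style_loss(recon, target, mask, weight=0.5):
    """Masked pixel reconstruction term plus a weighted perceptual term.

    mask is nonzero where patches were hidden from the encoder; the pixel
    term is computed only there, while the perceptual term compares
    features of the whole image. Both choices are assumptions.
    """
    mask = mask.astype(bool)
    pixel = mse(recon[mask], target[mask]) if mask.any() else 0.0
    perceptual = sum(
        mse(f_recon, f_target)
        for f_recon, f_target in zip(toy_features(recon), toy_features(target))
    )
    return pixel + weight * perceptual
```

In a real setting the feature extractor would be a pretrained network and the inputs would be high-resolution RGB images; the structure of the loss, not the toy features, is the point of the sketch.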
ReLoc: A Restoration-Assisted Framework for Robust Image Tampering Localization
With the spread of tampered images, locating the tampered regions in digital
images has drawn increasing attention. The existing image tampering
localization methods, however, suffer from severe performance degradation when
the tampered images are subjected to some post-processing, as the tampering
traces would be distorted by the post-processing operations. The poor
robustness against post-processing has become a bottleneck for the practical
applications of image tampering localization techniques. In order to address
this issue, this paper proposes a novel restoration-assisted framework for
image tampering localization (ReLoc). The ReLoc framework mainly consists of an
image restoration module and a tampering localization module. The key idea of
ReLoc is to use the restoration module to recover a high-quality counterpart of
the distorted tampered image, such that the distorted tampering traces can be
re-enhanced, helping the tampering localization module identify the
tampered regions. To achieve this, the restoration module is optimized not only
with the conventional constraints on image visual quality but also with a
forensics-oriented objective function. Furthermore, the restoration module and
the localization module are trained alternately, which can stabilize the
training process and is beneficial for improving the performance. The proposed
framework is evaluated against JPEG compression, the most commonly used
post-processing operation. Extensive experimental results show that ReLoc can
significantly improve the robustness against JPEG compression. The restoration
module in a well-trained ReLoc model is also transferable: it remains
effective when deployed directly with another tampering localization
module. Comment: 12 pages, 5 figures
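The alternating training scheme described in the abstract can be illustrated with a toy scalar model. This is a sketch under strong assumptions, not the ReLoc implementation: the "restoration module" is a single gain `a`, the "localization module" is a single weight `w`, the distortion is a fixed attenuation, and all names (`total_loss`, `train_alternately`, `beta`) are hypothetical. It only shows the pattern of updating one module while the other is frozen, with the restoration step driven by both a quality term and a forensics-oriented term.

```python
import numpy as np

def total_loss(a, w, x_dist, x_clean, y, beta=1.0):
    """Quality term (restored vs. clean) plus beta times the localization term."""
    quality = np.mean((a * x_dist - x_clean) ** 2)
    forensic = np.mean((w * a * x_dist - y) ** 2)
    return float(quality + beta * forensic)

def train_alternately(x_dist, x_clean, y, rounds=500, lr=0.2, beta=1.0):
    """Alternate gradient steps on the toy restoration gain a and localizer weight w."""
    a, w = 1.0, 1.0
    for _ in range(rounds):
        # Restoration step: localizer weight w is frozen.
        # The gradient combines the quality term and the forensic term.
        restored = a * x_dist
        grad_a = np.mean(2 * (restored - x_clean) * x_dist) \
            + beta * np.mean(2 * (w * restored - y) * w * x_dist)
        a -= lr * grad_a

        # Localization step: restoration gain a is frozen.
        restored = a * x_dist
        grad_w = np.mean(2 * (w * restored - y) * restored)
        w -= lr * grad_w
    return a, w

# Toy data: the "distortion" halves the signal; the toy label is 2 * clean.
x_clean = np.linspace(0.1, 1.0, 10)
x_dist = 0.5 * x_clean
y = 2.0 * x_clean
a, w = train_alternately(x_dist, x_clean, y)
print(total_loss(1.0, 1.0, x_dist, x_clean, y), total_loss(a, w, x_dist, x_clean, y))
```

In this toy setup both terms can be driven to zero at the same time, so the alternation simply behaves like block coordinate descent; in practice the two modules are deep networks and the two objectives may conflict.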
Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review
With large volumes of social media data being created daily and the
parallel rise of realistic multimedia tampering methods, detecting and
localising tampering in images and videos has become essential. This survey
focusses on approaches for tampering detection in multimedia data using deep
learning models. Specifically, it presents a detailed analysis of benchmark
datasets for malicious manipulation detection that are publicly available. It
also offers a comprehensive list of tampering clues and commonly used deep
learning architectures. Next, it discusses the current state-of-the-art
tampering detection methods, categorizing them into meaningful types such as
deepfake detection methods, splice tampering detection methods, copy-move
tampering detection methods, etc. and discussing their strengths and
weaknesses. Top results achieved on benchmark datasets, comparison of deep
learning approaches against traditional methods and critical insights from the
recent tampering detection methods are also discussed. Lastly, research
gaps, future directions, and conclusions are presented to provide an in-depth
understanding of the tampering detection research arena.
Media Forensics and DeepFakes: an overview
With the rapid progress of recent years, techniques that generate and
manipulate multimedia content can now guarantee a very advanced level of
realism. The boundary between real and synthetic media has become very thin. On
the one hand, this opens the door to a series of exciting applications in
different fields such as creative arts, advertising, film production, and video
games. On the other hand, it poses enormous security threats. Software packages
freely available on the web allow any individual, without special skills, to
create very realistic fake images and videos. So-called deepfakes can be used
to manipulate public opinion during elections, commit fraud, or discredit or
blackmail people. Potential abuses are limited only by human imagination.
Therefore, there is an urgent need for automated tools capable of detecting
false multimedia content and avoiding the spread of dangerous false
information. This review paper aims to present an analysis of the methods for
visual media integrity verification, that is, the detection of manipulated
images and videos. Special emphasis will be placed on the emerging phenomenon
of deepfakes and, from the point of view of the forensic analyst, on modern
data-driven forensic methods. The analysis will help to highlight the limits of
current forensic tools, the most relevant issues, the upcoming challenges, and
suggest future directions for research.
Image and Video Forensics
Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on social platforms, generating a large volume of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks), forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity.