148 research outputs found
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder
network to segment manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating an encoder and an LSTM network. Finally,
the decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets.
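The abstract above relies on resampling features to expose operations such as upsampling. As a minimal sketch of why resampling is detectable at all (a 1-D toy example, not the authors' feature extractor), linear interpolation makes every inserted sample an exact average of its neighbours, so a simple neighbour-prediction residual vanishes periodically. The function names and the test signal are hypothetical.

```python
def upsample_linear_2x(signal):
    """Upsample a 1-D signal by a factor of 2 using linear interpolation."""
    out = []
    for a, b in zip(signal, signal[1:]):
        out.append(a)
        out.append((a + b) / 2.0)  # inserted (interpolated) sample
    out.append(signal[-1])
    return out


def neighbour_residual(signal):
    """Error of predicting each interior sample as the mean of its two neighbours."""
    return [signal[i] - (signal[i - 1] + signal[i + 1]) / 2.0
            for i in range(1, len(signal) - 1)]


# Hypothetical test signal; any non-linear sequence behaves the same way.
original = [3.0, 7.0, 2.0, 9.0, 4.0, 8.0, 1.0]
resampled = upsample_linear_2x(original)
residual = neighbour_residual(resampled)

# residual[0::2] belongs to interpolated samples, residual[1::2] to original ones.
interpolated_residual = residual[0::2]
original_residual = residual[1::2]
```

The residual is zero at every second position and non-zero elsewhere. Periodic structure of this kind is the sort of trace a resampling detector can look for in the frequency domain.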
TriPINet: Tripartite Progressive Integration Network for Image Manipulation Localization
Image manipulation localization aims at distinguishing forged regions from
the whole test image. Although many outstanding prior arts have been proposed
for this task, there are still two issues that need to be further studied: 1)
how to fuse diverse types of features with forgery clues; 2) how to
progressively integrate multistage features for better localization
performance. In this paper, we propose a tripartite progressive integration
network (TriPINet) for end-to-end image manipulation localization. First, we
extract both visually perceptible information, e.g., RGB input images, and visually
imperceptible features, e.g., frequency and noise traces, for forensic feature
learning. Second, we develop a guided cross-modality dual-attention (gCMDA)
module to fuse different types of forged clues. Third, we design a set of
progressive integration squeeze-and-excitation (PI-SE) modules to improve
localization performance by appropriately incorporating multiscale features in
the decoder. Extensive experiments are conducted to compare our method with
state-of-the-art image forensics approaches. The proposed TriPINet obtains
competitive results on several benchmark datasets.
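The PI-SE modules named above build on squeeze-and-excitation. The following is a minimal sketch of a generic squeeze-and-excitation gate in pure Python, not the PI-SE module itself. The weights `w1` and `w2` and the tiny feature maps are hypothetical values chosen only for illustration.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Generic squeeze-and-excitation over C channels.

    feature_maps: list of C channels, each an H x W nested list.
    w1: bottleneck weights, shape (C_reduced x C).
    w2: expansion weights, shape (C x C_reduced).
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excitation: fully connected + ReLU, then fully connected + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_maps)]
    return scaled, gates


# Two hypothetical 2x2 channels with channel means 1.0 and 2.0.
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[4.0, 0.0], [0.0, 4.0]]]
w1 = [[0.5, 0.5]]      # reduces 2 channels to 1 hidden unit
w2 = [[2.0], [-2.0]]   # expands back to 2 channel gates
scaled, gates = squeeze_excite(features, w1, w2)
```

With these weights the first channel is kept almost unchanged and the second is strongly suppressed, which shows how channel gates can reweight features before they are merged in a decoder.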
A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection
Due to limited computational and memory resources, current deep learning
models accept only rather small input images, calling for preliminary image
resizing. This is not a problem for high-level vision problems, where
discriminative features are barely affected by resizing. On the contrary, in
image forensics, resizing tends to destroy precious high-frequency details,
impacting heavily on performance. One can avoid resizing by means of patch-wise
processing, at the cost of renouncing whole-image analysis. In this work, we
propose a CNN-based image forgery detection framework which makes decisions
based on full-resolution information gathered from the whole image. Thanks to
gradient checkpointing, the framework is trainable end-to-end with limited
memory resources and weak (image-level) supervision, allowing for the joint
optimization of all parameters. Experiments on widespread image forensics
datasets prove the good performance of the proposed approach, which largely
outperforms all baselines and all reference methods. Comment: 13 pages, 12 figures, journa
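The abstract above credits gradient checkpointing for making full-resolution training fit in limited memory. A minimal sketch of the idea on a toy chain of scalar `tanh` layers (not the authors' CNN) follows. The layer definition, the weights, and the segment length are all assumptions made for illustration, and the storage counts are rough tallies of retained values rather than real memory measurements.

```python
import math


def layer(w, x):
    return math.tanh(w * x)


def layer_grads(w, x, upstream):
    """Gradients of tanh(w * x) w.r.t. w and x, multiplied by the upstream gradient."""
    d = (1.0 - math.tanh(w * x) ** 2) * upstream
    return d * x, d * w


def backprop_full(weights, x0):
    """Standard backpropagation: every activation is stored."""
    acts = [x0]
    for w in weights:
        acts.append(layer(w, acts[-1]))
    grads, upstream = [0.0] * len(weights), 1.0
    for i in reversed(range(len(weights))):
        grads[i], upstream = layer_grads(weights[i], acts[i], upstream)
    return acts[-1], grads, len(acts)


def backprop_checkpointed(weights, x0, segment):
    """Store only each segment's input; recompute the rest during the backward pass."""
    checkpoints, x = {}, x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = layer(w, x)
    output = x
    grads, upstream, peak = [0.0] * len(weights), 1.0, 0
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, len(weights))
        # Recompute the inputs of layers start..end-1 from the checkpoint.
        acts = [checkpoints[start]]
        for i in range(start, end - 1):
            acts.append(layer(weights[i], acts[-1]))
        peak = max(peak, len(checkpoints) + len(acts))
        for i in reversed(range(start, end)):
            grads[i], upstream = layer_grads(weights[i], acts[i - start], upstream)
    return output, grads, peak


# Hypothetical 9-layer chain, checkpointed every 3 layers.
weights = [0.9, -1.1, 0.7, 1.3, -0.8, 1.05, 0.6, -1.2, 0.95]
out_full, grads_full, stored_full = backprop_full(weights, 0.5)
out_ckpt, grads_ckpt, stored_ckpt = backprop_checkpointed(weights, 0.5, segment=3)
```

Both routines return the same output and gradients, but the checkpointed version holds fewer values at once and pays for it with a second forward pass through each segment.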
Biomedical Image Splicing Detection using Uncertainty-Guided Refinement
Recently, a surge in biomedical academic publications suspected of image
manipulation has led to numerous retractions, turning biomedical image
forensics into a research hotspot. While general manipulation detectors have drawn attention,
the specific detection of splicing traces in biomedical images remains
underexplored. The disruptive factors within biomedical images, such as
artifacts, abnormal patterns, and noise, exhibit misleading features that resemble
splicing traces, greatly increasing the difficulty of this task. Moreover, the
scarcity of high-quality spliced biomedical images also limits potential
advancements in this field. In this work, we propose an Uncertainty-guided
Refinement Network (URN) to mitigate the effects of these disruptive factors.
Our URN can explicitly suppress the propagation of unreliable information flow
caused by disruptive factors among regions, thereby obtaining robust features.
Moreover, URN enables a concentration on the refinement of uncertainly
predicted regions during the decoding phase. Besides, we construct a dataset
for Biomedical image Splicing (BioSp) detection, which consists of 1,290
spliced images. Compared with existing datasets, BioSp comprises the largest
number of spliced images and the most diverse sources. Comprehensive
experiments on three benchmark datasets demonstrate the superiority of the
proposed method. Meanwhile, we verify the generalizability of URN against
cross-dataset domain shifts and its robustness to post-processing
approaches. Our BioSp dataset will be released upon acceptance.
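The abstract above describes refining regions that are predicted with uncertainty, but it does not say how uncertainty is measured. As a minimal sketch under the assumption that per-pixel binary entropy serves as the uncertainty score, the snippet below flags the pixels a refinement stage might focus on. The probability map and the threshold are hypothetical.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy in bits of a Bernoulli(p) prediction; 1.0 means maximally uncertain."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertain_mask(prob_map, threshold=0.8):
    """Mark pixels whose predicted splicing probability has entropy >= threshold."""
    return [[binary_entropy(p) >= threshold for p in row] for row in prob_map]


# Hypothetical 3x3 map of predicted splicing probabilities.
prob_map = [[0.02, 0.10, 0.55],
            [0.05, 0.45, 0.97],
            [0.50, 0.90, 0.99]]
mask = uncertain_mask(prob_map)
```

Only the pixels with probabilities near 0.5 are flagged. Confident predictions at either extreme are left alone.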
- …