Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder
network to segment out manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency-domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating an encoder and an LSTM network. Finally,
the decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets.
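
The final step described above, a softmax layer producing a pixel-wise mask that is compared against a ground-truth mask, can be illustrated with a minimal sketch. The abstract does not state the loss function, so the per-pixel cross-entropy below is an assumption, and the function names and the two-class layout `[pristine, manipulated]` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_mask(logit_map):
    """Turn per-pixel scores into a binary mask (1 = manipulated).

    logit_map[y][x] is assumed to hold [score_pristine, score_manipulated].
    """
    return [[1 if softmax(px)[1] > 0.5 else 0 for px in row] for row in logit_map]

def pixel_cross_entropy(logit_map, gt_mask):
    """Mean cross-entropy against a ground-truth mask (assumed loss, not
    confirmed by the abstract)."""
    total, count = 0.0, 0
    for row, gt_row in zip(logit_map, gt_mask):
        for px, label in zip(row, gt_row):
            total += -math.log(softmax(px)[label])
            count += 1
    return total / count

# A 1x2 toy "image": the first pixel looks pristine, the second manipulated.
toy_logits = [[[2.0, 0.0], [0.0, 3.0]]]
toy_mask = predict_mask(toy_logits)
toy_loss = pixel_cross_entropy(toy_logits, [[0, 1]])
```

In a real network the scores would come from the decoder output and the loss gradient would be back-propagated through the decoder, LSTM, and encoder; this sketch only covers the mask and loss computation.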
Hybrid Scene Compression for Visual Localization
Localizing an image with respect to a 3D scene model is a core task for many
computer vision applications. An increasing number of real-world applications
of visual localization on mobile devices, e.g., Augmented Reality or autonomous
robots such as drones or self-driving cars, demand localization approaches to
minimize storage and bandwidth requirements. Compressing the 3D models used for
localization thus becomes a practical necessity. In this work, we introduce a
new hybrid compression algorithm that uses a given memory limit in a more
effective way. Rather than treating all 3D points equally, it represents a
small set of points with full appearance information and an additional, larger
set of points with compressed information. This enables our approach to obtain
a more complete scene representation without increasing the memory
requirements, leading to superior performance compared to previous
compression schemes. As part of our contribution, we show how to handle
ambiguous matches arising from point compression during RANSAC. Besides
outperforming previous compression techniques in terms of pose accuracy under
the same memory constraints, our compression scheme itself is also more
efficient. Furthermore, the localization rates and accuracy obtained with our
approach are comparable to state-of-the-art feature-based methods, while using
a small fraction of the memory.
Comment: Published at CVPR 201
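
The memory trade-off behind the hybrid representation can be made concrete with a small budget calculation. This is a sketch under stated assumptions, not the paper's algorithm: the byte sizes (a 12-byte position plus a 128-byte descriptor for a full point, a 12-byte position plus a 4-byte quantized ID for a compressed point), the function name, and the `full_fraction` parameter are all hypothetical.

```python
def split_budget(budget_bytes, full_fraction,
                 full_point_bytes=140, compressed_point_bytes=16):
    """Split a fixed memory budget between full and compressed 3D points.

    All byte sizes are illustrative assumptions, not values from the paper.
    Returns (number of full points, number of compressed points).
    """
    # Spend the requested share of the budget on points with full descriptors.
    n_full = int(budget_bytes * full_fraction) // full_point_bytes
    # Whatever is left goes to the cheaper compressed points.
    remaining = budget_bytes - n_full * full_point_bytes
    n_compressed = remaining // compressed_point_bytes
    return n_full, n_compressed

budget = 1_000_000  # hypothetical 1 MB limit
n_full, n_compressed = split_budget(budget, full_fraction=0.25)
n_uniform = budget // 140  # every point stored with its full descriptor
```

Under these assumed sizes, the hybrid split stores several times more points in total than storing only full descriptors within the same budget, which is the intuition behind the more complete scene representation.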
Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images
Convolutional neural networks (CNNs) show impressive performance for image
classification and detection, extending heavily to the medical image domain.
Nevertheless, medical experts are sceptical of these predictions as the
nonlinear multilayer structure resulting in a classification outcome is not
directly graspable. Recently, approaches have been proposed that help the user
understand which discriminative regions within an image are decisive for
the CNN's choice of a certain class. Although these approaches could help to
build trust in the CNN's predictions, they have rarely been shown to work with
medical image data, which often poses a challenge as the decision for a class
relies on different lesion areas scattered around the entire image. Using the
DiaretDB1 dataset, we show that on retina images different lesion areas
fundamental for diabetic retinopathy are detected on an image level with high
accuracy, comparable to or exceeding supervised methods. On lesion level, we
achieve few false positives with high sensitivity, even though the network is
solely trained on image-level labels which do not include information about
existing lesions. Classifying between diseased and healthy images, we achieve
an AUC of 0.954 on the DiaretDB1.
Comment: Accepted in Proc. IEEE International Conference on Image Processing
(ICIP), 201
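
For readers unfamiliar with the AUC figure quoted above, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen diseased image receives a higher score than a randomly chosen healthy one. The function name and the toy scores are hypothetical and unrelated to the paper's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half).

    labels: 1 = diseased, 0 = healthy.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: three of the four (diseased, healthy) pairs are ranked correctly.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 0.75
```

The pairwise form is quadratic in the number of images; it is used here only because it makes the definition explicit.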