Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers.
Identifying these manipulations is a very challenging task because the
manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which uses
resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder
network to segment manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating an encoder and an LSTM network. Finally,
the decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at the pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets.
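
The training signal described in this abstract (a softmax output compared against ground-truth masks) can be illustrated with a minimal sketch. It assumes a standard pixel-wise cross-entropy loss over two classes, which the abstract does not state explicitly; the function names and the two-score-per-pixel layout are hypothetical and not taken from the paper.

```python
import math


def softmax2(a, b):
    """Numerically stable softmax over two scores (non-manipulated, manipulated)."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    s = ea + eb
    return ea / s, eb / s


def pixelwise_cross_entropy(scores, mask):
    """Mean cross-entropy between per-pixel softmax outputs and a binary mask.

    scores: list of (non_manipulated_score, manipulated_score), one pair per pixel
    mask:   list of 0/1 ground-truth labels (1 = manipulated), same length
    Assumption: a plain cross-entropy loss; the paper may use a different one.
    """
    total = 0.0
    for (a, b), label in zip(scores, mask):
        probs = softmax2(a, b)
        total -= math.log(probs[label])
    return total / len(mask)


# Two pixels: the first scored as manipulated, the second as non-manipulated.
scores = [(-2.0, 2.0), (3.0, -1.0)]
loss_correct = pixelwise_cross_entropy(scores, [1, 0])
loss_wrong = pixelwise_cross_entropy(scores, [0, 1])
```

In this sketch the loss is small when the predicted mask agrees with the ground truth and large when it does not, which is the quantity back-propagation would reduce during end-to-end training.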
Ultra-fast basic geometrical transformations on linear image data structure
This paper presents a general, ultra-fast approach for geometrical image transformations, based on the usage of linear lookup hash tables. The new method is developed to fix distortions on document images as part of a real-time optical character recognition (OCR) system. The approach is generalized and uses linear image representation combined with pre-computed lookup tables. Backward mapping is used for generation of lookup tables, while forward mapping is presented as an alternative and more efficient mapping model for specific cases. Also, a theoretical space and time complexity analysis of the proposed method is provided. To achieve maximal computational performance, pointer arithmetic and highly optimized low-level machine code implementations are provided, including specialized implementations for horizontal mirror, vertical mirror, and 90° rotation. Also, a modified variant of the approach, based on auto-generated machine code, is presented. Very high computational performance is achieved at the expense of memory usage. The performance from the perspective of time complexity is analyzed and compared with the classical implementation, an FPGA implementation, and other implementations of image rotation. Numerical results are given for a set of different PC specifications to provide full insight into the implementation performance. The processing times for very large images are below 200 ms for backward mapping and below 100 ms for forward mapping on most machines, which is 30–60 times faster than the classical implementation, 5–20 times faster than the FPGA implementation, and up to 6 times faster than other implementations of image rotation. Original documents belonging to Nikola Tesla are used for visual demonstration of performance.
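
The idea of combining a linear image representation with a pre-computed lookup table built by backward mapping can be sketched as follows. This is only an illustration in plain Python, not the paper's optimized machine-code implementation; the function names, the use of -1 for unmapped pixels, and the fill value are assumptions made for the example.

```python
def build_backward_lut(width, height, out_width, out_height, inverse_map):
    """Pre-compute, for every output pixel, the flat index of its source pixel.

    inverse_map(x, y) returns the source coordinates (sx, sy) for output (x, y).
    Output pixels whose source falls outside the input get -1 (an assumed convention).
    """
    lut = [-1] * (out_width * out_height)
    for y in range(out_height):
        for x in range(out_width):
            sx, sy = inverse_map(x, y)
            if 0 <= sx < width and 0 <= sy < height:
                lut[y * out_width + x] = sy * width + sx
    return lut


def apply_lut(pixels, lut, fill=0):
    """Transform a flat (row-major) image with one table lookup per output pixel."""
    return [pixels[i] if i >= 0 else fill for i in lut]


# A 3x2 image stored linearly in row-major order:
#   1 2 3
#   4 5 6
width, height = 3, 2
image = [1, 2, 3, 4, 5, 6]

# 90° clockwise rotation: output (x, y) comes from input (y, height - 1 - x).
rot_lut = build_backward_lut(width, height, height, width,
                             lambda x, y: (y, height - 1 - x))
rotated = apply_lut(image, rot_lut)  # 2x3 result, rows: [4, 1], [5, 2], [6, 3]

# Horizontal mirror: output (x, y) comes from input (width - 1 - x, y).
mirror_lut = build_backward_lut(width, height, width, height,
                                lambda x, y: (width - 1 - x, y))
mirrored = apply_lut(image, mirror_lut)  # rows: [3, 2, 1], [6, 5, 4]
```

The table is built once per transformation and image size, so repeated transformations cost only one lookup per pixel, which reflects the trade-off between speed and memory usage that the abstract mentions.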