3 research outputs found

    Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries

    Full text link
    With advanced image journaling tools, one can easily alter the semantic meaning of an image by exploiting certain manipulation techniques such as copy-clone, object splicing, and removal, which mislead the viewers. In contrast, the identification of these manipulations becomes a very challenging task as manipulated regions are not visually apparent. This paper proposes a high-confidence manipulation localization architecture which utilizes resampling features, Long-Short Term Memory (LSTM) cells, and encoder-decoder network to segment out manipulated regions from non-manipulated ones. Resampling features are used to capture artifacts like JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating encoder and LSTM network. Finally, decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With predicted mask provided by final layer (softmax) of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation using ground-truth masks. Furthermore, a large image splicing dataset is introduced to guide the training process. The proposed method is capable of localizing image manipulations at pixel level with high precision, which is demonstrated through rigorous experimentation on three diverse datasets

    Ultra-fast basic geometrical transformations on linear image data structure

    Get PDF
    This paper presents a general, ultra-fast approach for geometrical image transformations, based on the usage of linear lookup hash tables. The new method is developed to fix distortions on document images as part of a real-time optical character recognition (OCR) system. The approach is generalized and uses linear image representation combined with pre-computed lookup tables. Backward mapping is used for generation of lookup tables, while forward mapping is presented as an alternative and more efficient mapping model for specific cases. Also, a theoretical space and time complexity analysis of the proposed method is provided. To achieve maximal computational performance, pointer arithmetic and highly-optimized low-level machine code implementations are provided, including the specialized implementations for horizontal mirror, vertical mirror, and 90° rotation. Also, a modified variant of the approach, based on auto-generated machine code is presented. Very high computational performances are achieved at the expense of memory usage. The performances from the perspective of time complexity are analyzed and compared with classical implementation, FPGA implementation, and other implementations of the image rotation. Numerical results are given for a set of different PC specifications to provide full insight into the implementation performances. The processing time for very large images are below 200 ms for backward mapping and below 100 ms for forward mapping for most machines, which is 30–60 times faster than the classical implementation, 5–20 times faster than the FPGA implementation, and up to 6 times faster than other implementations of image rotation. Original documents belonging to Nikola Tesla are used for visual demonstration of performance.</p

    Ultra-fast basic geometrical transformations on linear image data structure

    Get PDF
    This paper presents a general, ultra-fast approach for geometrical image transformations, based on the usage of linear lookup hash tables. The new method is developed to fix distortions on document images as part of a real-time optical character recognition (OCR) system. The approach is generalized and uses linear image representation combined with pre-computed lookup tables. Backward mapping is used for generation of lookup tables, while forward mapping is presented as an alternative and more efficient mapping model for specific cases. Also, a theoretical space and time complexity analysis of the proposed method is provided. To achieve maximal computational performance, pointer arithmetic and highly-optimized low-level machine code implementations are provided, including the specialized implementations for horizontal mirror, vertical mirror, and 90° rotation. Also, a modified variant of the approach, based on auto-generated machine code is presented. Very high computational performances are achieved at the expense of memory usage. The performances from the perspective of time complexity are analyzed and compared with classical implementation, FPGA implementation, and other implementations of the image rotation. Numerical results are given for a set of different PC specifications to provide full insight into the implementation performances. The processing time for very large images are below 200 ms for backward mapping and below 100 ms for forward mapping for most machines, which is 30–60 times faster than the classical implementation, 5–20 times faster than the FPGA implementation, and up to 6 times faster than other implementations of image rotation. Original documents belonging to Nikola Tesla are used for visual demonstration of performance.</p
    corecore