
    Low computational complexity variable block size (VBS) partitioning for motion estimation using the Walsh Hadamard transform (WHT)

    Variable Block Size (VBS) based motion estimation has been adopted in state-of-the-art video coding standards such as H.264/AVC and VC-1. However, a low-complexity H.264/AVC encoder cannot take advantage of VBS because of its power consumption requirements. In this paper, we present a VBS partition algorithm based on a binary motion edge map that requires neither initial motion estimation nor Rate-Distortion (R-D) optimization for mode selection. The proposed algorithm uses the Walsh Hadamard Transform (WHT) to create the binary edge map, which is more cost-effective in computational complexity than other lightweight segmentation methods typically used to detect the required regions.
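
    As a rough, hypothetical illustration of the WHT-based edge-map idea above (not the paper's algorithm, and operating on a single frame rather than motion data), the sketch below flags 4x4 blocks whose non-DC Walsh-Hadamard energy exceeds a threshold. The block size and threshold are arbitrary assumptions for the example.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def wht_edge_map(frame, block=4, thresh=0.25):
    """Flag each block whose AC (non-DC) WHT energy exceeds a fraction of its
    total energy; returns one boolean per block as a crude binary edge map."""
    H = hadamard(block)
    h, w = frame.shape
    edge = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            blk = frame[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(float)
            coeffs = H @ blk @ H.T                  # 2-D Walsh-Hadamard transform
            total = np.sum(coeffs ** 2)
            ac = total - coeffs[0, 0] ** 2          # energy outside the DC term
            edge[by, bx] = ac > thresh * total
    return edge

# Toy frame with a vertical step edge crossing the second column of blocks
frame = np.zeros((16, 16)); frame[:, 6:] = 255.0
print(wht_edge_map(frame).astype(int))
```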

    Visual saliency guided high dynamic range image compression

    Recent years have seen the emergence of visual saliency-based image and video compression for low dynamic range (LDR) visual content. High dynamic range (HDR) imaging has yet to follow such an approach, as state-of-the-art visual saliency detection models are mainly concerned with LDR content. Although a few HDR saliency detection models have been proposed in recent years, they lack comprehensive validation. Current HDR image compression schemes do not differentiate between salient and non-salient regions, which has been shown to be redundant with respect to the Human Visual System. In this paper, we propose a novel visual saliency guided layered compression scheme for HDR images. The proposed saliency detection model is robust and correlates highly with ground-truth saliency maps obtained from an eye tracker. The results show bit-rate reductions of up to 50% while retaining the same high visual quality in the salient regions in terms of the HDR Visual Difference Predictor (HDR-VDP) and the visual saliency-induced index for perceptual image quality assessment (VSI) metrics.
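
    To illustrate the general principle of saliency-guided bit allocation (not the layered scheme proposed in the paper), the following sketch assumes a per-pixel saliency map and assigns a finer quantization step to salient pixels of a log-encoded HDR luminance. The step sizes and the log-domain choice are assumptions for the example.

```python
import numpy as np

def saliency_guided_quantize(hdr, saliency, q_fine=0.005, q_coarse=0.05):
    """Quantize a log-domain HDR luminance with a step that varies with saliency:
    salient pixels get the fine step, non-salient pixels the coarse one."""
    s = (saliency - saliency.min()) / (np.ptp(saliency) + 1e-9)   # normalize to [0, 1]
    step = q_coarse + (q_fine - q_coarse) * s                     # blend the two steps
    log_l = np.log10(np.clip(hdr, 1e-6, None))                    # perceptually flatter domain
    return np.round(log_l / step) * step                          # mid-tread quantizer

# Toy example: a 4x4 luminance patch and a saliency map peaking in the centre
hdr = np.logspace(-2, 4, 16).reshape(4, 4)
sal = np.zeros((4, 4)); sal[1:3, 1:3] = 1.0
print(saliency_guided_quantize(hdr, sal))
```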

    Detection of Nonaligned Double JPEG Compression Based on Integer Periodicity Maps

    In this paper, a simple yet reliable algorithm to detect the presence of nonaligned double JPEG compression (NA-JPEG) in compressed images is proposed. The method evaluates a single feature based on the integer periodicity of the blockwise discrete cosine transform (DCT) coefficients when the DCT is computed according to the grid of the previous JPEG compression. Even though the proposed feature relies only on DC coefficient statistics, a simple threshold detector can classify NA-JPEG images with improved accuracy with respect to existing methods, and on smaller image sizes, without resorting to a properly trained classifier. Moreover, the proposed scheme is able to accurately estimate the grid shift and the quantization step of the DC coefficient of the primary JPEG compression, allowing one to perform a more detailed analysis of possibly forged images.
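
    The sketch below is a simplified, assumed version of the integer-periodicity idea rather than the feature defined in the paper: for a candidate 8x8 grid shift and a candidate quantization step q, it measures how close the blockwise DCT DC coefficients lie to integer multiples of q. The scoring function and parameters are illustrative only.

```python
import numpy as np
from scipy.fft import dctn

def dc_periodicity(img, shift, q):
    """Score how strongly the blockwise DCT DC coefficients, computed on the
    8x8 grid offset by `shift`, cluster around integer multiples of step `q`."""
    dy, dx = shift
    h, w = img.shape
    dcs = []
    for y in range(dy, h - 8 + 1, 8):
        for x in range(dx, w - 8 + 1, 8):
            block = img[y:y+8, x:x+8].astype(float) - 128.0
            dcs.append(dctn(block, norm='ortho')[0, 0])
    dcs = np.array(dcs)
    frac = np.abs(dcs / q - np.round(dcs / q))   # distance to the nearest multiple
    return 1.0 - 2.0 * frac.mean()               # 1 = perfectly periodic, ~0.5 = random

# Scan all 64 grid shifts for a fixed candidate step and keep the best.
# On uncompressed random data no shift stands out; on an NA-JPEG image the
# score peaks at the grid shift of the primary compression.
img = np.random.randint(0, 256, (64, 64))
scores = {(dy, dx): dc_periodicity(img, (dy, dx), q=16.0)
          for dy in range(8) for dx in range(8)}
print(max(scores, key=scores.get))
```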

    Semantic Image Segmentation via Deep Parsing Network

    This paper addresses semantic image segmentation by incorporating rich information into a Markov Random Field (MRF), including high-order relations and a mixture of label contexts. Unlike previous works that optimized MRFs with iterative algorithms, we solve the MRF by proposing a Convolutional Neural Network (CNN), namely the Deep Parsing Network (DPN), which enables deterministic end-to-end computation in a single forward pass. Specifically, DPN extends a contemporary CNN architecture to model unary terms, and additional layers are carefully devised to approximate the mean field algorithm (MF) for pairwise terms. It has several appealing properties. First, unlike recent works that combined CNN and MRF, where many iterations of MF were required for each training image during back-propagation, DPN is able to achieve high performance by approximating one iteration of MF. Second, DPN represents various types of pairwise terms, making many existing models its special cases. Third, DPN makes MF easier to parallelize and speed up on a Graphics Processing Unit (GPU). DPN is thoroughly evaluated on the PASCAL VOC 2012 dataset, where a single DPN model yields a new state-of-the-art segmentation accuracy. (Comment: to appear in the International Conference on Computer Vision (ICCV) 2015.)
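
    As a hedged illustration of why one mean-field (MF) iteration maps naturally onto convolutional layers, the sketch below performs a single MF update for a grid MRF with a simple Potts-style smoothness term, implementing the pairwise message as a local average of label marginals. The potentials, filter, and weight are toy assumptions and not DPN's actual layers.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def mean_field_step(unary, weight=1.0, size=3):
    """One mean-field update for a grid MRF with Potts-style smoothness:
    the pairwise message is a local average of the current label marginals,
    i.e. a spatial filtering of each label plane.
    unary: (H, W, L) negative log-probabilities from a unary model."""
    q = np.exp(-unary)
    q /= q.sum(axis=-1, keepdims=True)              # initial label marginals
    msg = np.stack([uniform_filter(q[..., l], size=size)
                    for l in range(q.shape[-1])], axis=-1)
    logits = -unary + weight * msg                  # combine unary and pairwise terms
    q = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return q / q.sum(axis=-1, keepdims=True)

# Toy example: 2 labels on an 8x8 grid with random unary scores
unary = np.random.rand(8, 8, 2)
print(mean_field_step(unary).shape)                 # (8, 8, 2)
```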

    WG1N5315 - Response to Call for AIC evaluation methodologies and compression technologies for medical images: LAR Codec

    This document presents the LAR image codec as the IETR response to the Call for AIC evaluation methodologies and compression technologies for medical images, i.e., the specific call for contributions of medical imaging technologies to be considered for AIC. The philosophy behind our coder is not to outperform JPEG2000 in compression; our goal is to propose an open-source, royalty-free alternative image coder with integrated services. While keeping compression performance in the same range as JPEG2000 but with lower complexity, our coder also provides services such as scalability, cryptography, data hiding, lossy-to-lossless compression, region of interest, and free region representation and coding.

    Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries

    With advanced image journaling tools, one can easily alter the semantic meaning of an image by exploiting manipulation techniques such as copy-clone, object splicing, and removal, which mislead viewers. Identifying these manipulations is a very challenging task, since manipulated regions are not visually apparent. This paper proposes a high-confidence manipulation localization architecture that utilizes resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder network to segment manipulated regions from non-manipulated ones. Resampling features are used to capture artifacts such as JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency-domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating the encoder and LSTM network. Finally, the decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With the predicted mask provided by the final (softmax) layer of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation using ground-truth masks. Furthermore, a large image splicing dataset is introduced to guide the training process. The proposed method is capable of localizing image manipulations at the pixel level with high precision, which is demonstrated through rigorous experimentation on three diverse datasets.
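
    The training setup described above (a final softmax producing a per-pixel mask, trained against ground-truth masks) amounts to a pixel-wise cross-entropy objective. The sketch below is a minimal NumPy version of that kind of loss; the two-class (manipulated vs. non-manipulated) shape convention is assumed rather than taken from the paper.

```python
import numpy as np

def pixelwise_cross_entropy(logits, mask):
    """Per-pixel softmax over {non-manipulated, manipulated} followed by
    cross-entropy against the ground-truth binary mask.
    logits: (H, W, 2) raw decoder scores; mask: (H, W) with values in {0, 1}."""
    z = logits - logits.max(axis=-1, keepdims=True)                  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))    # log-softmax
    picked = np.take_along_axis(log_probs, mask[..., None], axis=-1)[..., 0]
    return -picked.mean()

# Toy check: confident correct predictions give a loss close to zero
logits = np.zeros((4, 4, 2)); mask = np.random.randint(0, 2, (4, 4))
logits[np.arange(4)[:, None], np.arange(4)[None, :], mask] = 5.0
print(pixelwise_cross_entropy(logits, mask))
```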