
    Fidelity-Naturalness Evaluation of Single Image Super Resolution

    We study the problem of evaluating super resolution methods. Traditional evaluation usually judges the quality of super-resolved images by a single measure of their difference from the original high-resolution images. In this paper, we propose to use both fidelity (the difference from the original images) and naturalness (human visual perception of the super-resolved images) for evaluation. For fidelity evaluation, a new metric is proposed to solve the bias problem of traditional evaluation. For naturalness evaluation, we let humans label their preferences among super resolution results using pair-wise comparison, and test the correlation between the human labels and the outputs of image quality assessment metrics. Experimental results show that our fidelity-naturalness evaluation is better than the traditional evaluation of super resolution methods, which could help future research on single-image super resolution.
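The two-sided protocol above can be sketched concretely. The snippet below is an illustrative stand-in, not the paper's actual metrics: it uses plain PSNR for fidelity (the paper proposes a bias-corrected metric) and per-method win rates over pair-wise human votes for naturalness.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Fidelity score: peak signal-to-noise ratio against the ground truth."""
    mse = np.mean((reference.astype(np.float64) - reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def naturalness_win_rates(pairwise_votes, methods):
    """Naturalness score: fraction of pair-wise comparisons each method wins.

    `pairwise_votes` is a list of (winner, loser) method names collected
    from human raters, as in a pair-wise comparison protocol.
    """
    wins = {m: 0 for m in methods}
    totals = {m: 0 for m in methods}
    for winner, loser in pairwise_votes:
        wins[winner] += 1
        totals[winner] += 1
        totals[loser] += 1
    return {m: (wins[m] / totals[m] if totals[m] else 0.0) for m in methods}
```

Reporting both numbers side by side exposes the trade-off the abstract describes: a method can score high on fidelity yet lose most pair-wise naturalness comparisons.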

    Image Reconstruction with Predictive Filter Flow

    We propose a simple, interpretable framework for solving a wide range of image reconstruction problems such as denoising and deconvolution. Given a corrupted input image, the model synthesizes a spatially varying linear filter which, when applied to the input image, reconstructs the desired output. The model parameters are learned using supervised or self-supervised training. We test this model on three tasks: non-uniform motion blur removal, lossy-compression artifact reduction and single image super resolution. We demonstrate that our model substantially outperforms state-of-the-art methods on all these tasks and is significantly faster than optimization-based approaches to deconvolution. Unlike models that directly predict output pixel values, the predicted filter flow is controllable and interpretable, which we demonstrate by visualizing the space of predicted filters for different tasks. Comment: https://www.ics.uci.edu/~skong2/pff.htm
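The core operation, applying a per-pixel predicted filter to the input, can be sketched as follows. This is a minimal reference implementation of the filter application step only (the network that synthesizes the filters is not shown); the function name and the dense (H, W, k, k) filter layout are assumptions for illustration.

```python
import numpy as np

def apply_filter_flow(image, filters):
    """Apply a spatially varying linear filter to a grayscale image.

    image:   (H, W) array.
    filters: (H, W, k, k) array; filters[y, x] is the kernel synthesized
             for output pixel (y, x), with k odd.
    """
    H, W = image.shape
    k = filters.shape[2]
    r = k // 2
    padded = np.pad(image.astype(np.float64), r, mode="edge")
    out = np.empty((H, W))
    for y in range(H):
        for x in range(W):
            # Each output pixel is a linear combination of its local patch,
            # weighted by that pixel's own predicted kernel.
            patch = padded[y:y + k, x:x + k]
            out[y, x] = np.sum(patch * filters[y, x])
    return out
```

Because every output pixel is an explicit weighted sum over its input neighborhood, the predicted kernels can be visualized directly, which is the interpretability property the abstract highlights.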

    PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

    Despite exciting progress on high-quality image generation from structured (scene graph) or free-form (sentence) descriptions, most methods only guarantee image-level semantic consistency, i.e. the generated image matches the semantic meaning of the description. They still lack investigation into synthesizing images in a more controllable way, such as finely manipulating the visual appearance of every object. Therefore, to generate images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating an image from a scene graph and image crops, where the spatial arrangement of the objects and their pair-wise relationships are defined by the scene graph and the object appearances are determined by the given object crops. To enhance the interactions of the objects in the output, we design a Crop Refining Network and an Object-Image Fuser to embed the objects as well as their relationships into one map. Multiple losses work collaboratively to guarantee that the generated images closely respect the crops and comply with the scene graphs while maintaining excellent image quality. If crops are not provided, a crop selector is proposed to pick the most-compatible crops from our external object tank by encoding the interactions around the objects in the scene graph. Evaluated on the Visual Genome and COCO-Stuff datasets, our proposed method significantly outperforms the SOTA methods on Inception Score, Diversity Score and Fréchet Inception Distance. Extensive experiments also demonstrate our method's ability to generate complex and diverse images with given objects. Comment: 10 pages, 6 figures; Accepted by NeurIPS 201
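The crop selector's job, scoring candidate crops against an object's scene-graph context, can be sketched at a high level. This is a speculative simplification: the paper encodes interactions with learned networks, whereas the snippet below just assumes both context and crops are already embedded as vectors and ranks crops by cosine similarity.

```python
import numpy as np

def select_crop(context_embedding, crop_embeddings):
    """Pick the most-compatible crop from an external object tank.

    Compatibility is scored here as cosine similarity between an embedding
    of the object's scene-graph context and each candidate crop embedding.
    Returns the index of the best-matching crop.
    """
    ctx = context_embedding / np.linalg.norm(context_embedding)
    crops = crop_embeddings / np.linalg.norm(crop_embeddings, axis=1, keepdims=True)
    scores = crops @ ctx  # one similarity score per candidate crop
    return int(np.argmax(scores))
```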

    Pixels to Graphs by Associative Embedding

    Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation. Comment: Updated numbers. Code and pretrained models available at https://github.com/umich-vl/px2grap
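The "piece them together" step relies on associative embeddings: elements belonging to the same graph entity are trained to emit nearby embedding values, so grouping reduces to clustering. A minimal greedy grouping over scalar embeddings, an assumption for illustration rather than the paper's exact decoding, might look like:

```python
import numpy as np

def group_by_embedding(embeddings, threshold=0.5):
    """Greedy grouping of detected elements by scalar associative embeddings.

    An element whose embedding falls within `threshold` of an existing
    group's mean joins that group; otherwise it starts a new group.
    Returns a list of index lists, one per group.
    """
    groups = []   # list of lists of element indices
    means = []    # running mean embedding of each group
    for i, e in enumerate(embeddings):
        best, best_dist = None, threshold
        for g, m in enumerate(means):
            d = abs(e - m)
            if d < best_dist:
                best, best_dist = g, d
        if best is None:
            groups.append([i])
            means.append(float(e))
        else:
            groups[best].append(i)
            means[best] = float(np.mean([embeddings[j] for j in groups[best]]))
    return groups
```

In the scene-graph setting, the same mechanism associates predicted edges with the object detections they connect, which is what lets the network emit a full graph in one stage.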

    Hierarchical Watermarking Framework Based on Analysis of Local Complexity Variations

    Increasing production and exchange of multimedia content has increased the need for better copyright protection by means of watermarking. Different methods have been proposed to satisfy the tradeoff between imperceptibility and robustness, two important characteristics of watermarking, while maintaining proper data-embedding capacity. Many watermarking methods use an image-independent set of parameters, yet different images possess different potential for robust and transparent hosting of watermark data. To overcome this deficiency, in this paper we propose a new hierarchical adaptive watermarking framework. At the higher level of the hierarchy, the complexity of an image is ranked against the complexities of the images in a dataset; for a typical image dataset, the statistical distribution of block complexities is found. At the lower level of the hierarchy, the complexities of the blocks of a single cover image are found. The local complexity variation (LCV) between a block and its neighbors is used to adaptively control the watermark strength factor of each block. Such local complexity analysis creates an adaptive embedding scheme, which results in higher transparency by reducing blockiness effects. This two-level hierarchy enables our method to exploit all image blocks to elevate the embedding capacity while preserving imperceptibility. To test the effectiveness of the proposed framework, the contourlet transform (CT) in conjunction with the discrete cosine transform (DCT) is used to embed pseudo-random binary sequences as the watermark. Experimental results show that the proposed framework elevates the performance of the watermarking routine in terms of both robustness and transparency. Comment: 12 pages, 14 figures, 8 table
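The lower level of the hierarchy, per-block complexity and an LCV-driven strength factor, can be sketched as below. The complexity measure (block variance), the 8-neighbor comparison, and the clipping range are assumptions for illustration; the paper's exact definitions may differ.

```python
import numpy as np

def block_complexities(image, block=8):
    """Complexity of each non-overlapping block, measured here as variance."""
    H, W = image.shape
    h, w = H // block, W // block
    comp = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            comp[i, j] = np.var(image[i*block:(i+1)*block, j*block:(j+1)*block])
    return comp

def adaptive_strength(comp, base=1.0):
    """Scale a base watermark strength by local complexity variation (LCV).

    LCV compares each block's complexity with the mean of its 8 neighbors;
    blocks more complex than their surroundings tolerate a stronger mark.
    """
    padded = np.pad(comp, 1, mode="edge")
    h, w = comp.shape
    strength = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            nb = padded[i:i+3, j:j+3]
            neigh_mean = (nb.sum() - comp[i, j]) / 8.0
            lcv = comp[i, j] / (neigh_mean + 1e-8)
            # Clip so flat blocks still get a (weak) mark and busy blocks
            # are not over-amplified.
            strength[i, j] = base * min(2.0, max(0.5, lcv))
    return strength
```

Because the strength factor varies smoothly with the neighborhood, adjacent blocks receive similar embedding energy, which is how the scheme reduces the blockiness effects mentioned in the abstract.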

    (Quasi)Periodicity Quantification in Video Data, Using Topology

    This work introduces a novel framework for quantifying the presence and strength of recurrent dynamics in video data. Specifically, we provide continuous measures of periodicity (perfect repetition) and quasiperiodicity (superposition of periodic modes with non-commensurate periods), in a way which does not require segmentation, training, object tracking or 1-dimensional surrogate signals. Our methodology operates directly on video data. The approach combines ideas from nonlinear time series analysis (delay embeddings) and computational topology (persistent homology) by translating the problem of finding recurrent dynamics in video data into the problem of determining the circularity or toroidality of an associated geometric space. Through extensive testing, we show the robustness of our scores with respect to several noise models and levels; we show that our periodicity score is superior to other methods when compared against human-generated periodicity rankings; and we show that our quasiperiodicity score clearly indicates the presence of biphonation in videos of vibrating vocal folds, which had never before been accomplished end-to-end quantitatively. Comment: 27 pages, 1 column, 23 figures, SIAM Journal on Imaging Sciences, 201
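The delay-embedding half of the pipeline is simple to sketch: each point in the embedded cloud stacks a sliding window of consecutive frames, so temporal recurrence becomes geometric circularity (a loop for periodic motion, a torus for quasiperiodic motion). The persistent-homology scoring step is omitted here; the function name and signature are assumptions for illustration.

```python
import numpy as np

def delay_embedding(frames, window, step=1):
    """Sliding-window (delay) embedding of a video.

    frames: (T, d) array of flattened video frames.
    Each embedded point stacks `window` consecutive frames, turning
    recurrent dynamics in time into loops in the embedded point cloud.
    Returns an (N, window * d) array of embedded points.
    """
    T, d = frames.shape
    points = [frames[t:t + window].reshape(-1)
              for t in range(0, T - window + 1, step)]
    return np.stack(points)
```

A persistence computation (e.g. 1-dimensional persistent homology of this point cloud) would then measure how prominently the cloud traces out a circle or torus.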

    Image Generation from Sketch Constraint Using Contextual GAN

    In this paper we investigate image generation guided by a hand sketch. When the input sketch is badly drawn, the output of common image-to-image translation follows the input edges because of the hard condition imposed by the translation process. Instead, we propose to use the sketch as a weak constraint, where the output edges do not necessarily follow the input edges. We address this problem using a novel joint image completion approach, where the sketch provides the image context for completing, or generating, the output image. We train a generative adversarial network, i.e., a contextual GAN, to learn the joint distribution of a sketch and the corresponding image by using joint images. Our contextual GAN has several advantages. First, the simple joint image representation allows for simple and effective learning of the joint distribution in the same image-sketch space, which avoids the complications of cross-domain learning. Second, while the output is related to its input overall, the generated features exhibit more freedom in appearance and do not strictly align with the input features as in previous conditional GANs. Third, from the joint image's point of view, image and sketch are no different, so exactly the same deep joint image completion network can be used for image-to-sketch generation. Experiments on three different datasets show that our contextual GAN can generate more realistic images than state-of-the-art conditional GANs on challenging inputs and generalizes well on common categories. Comment: ECCV 201
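The joint-image construction that makes this "completion, not translation" framing work can be sketched in a few lines. A side-by-side concatenation and a half-image mask are assumptions for illustration (the paper's exact joint layout may differ), but they show the key idea: at test time the sketch half is visible context and the photo half is the hole to be filled.

```python
import numpy as np

def make_joint_image(sketch, image):
    """Concatenate a sketch and its photo side by side into one joint image."""
    return np.concatenate([sketch, image], axis=1)

def mask_photo_half(joint):
    """Mask out the photo half, leaving the sketch half as completion context.

    Generation then becomes joint image completion: the GAN fills the
    masked half conditioned on the visible sketch half.
    """
    masked = joint.astype(np.float64).copy()
    w = joint.shape[1] // 2
    masked[:, w:] = 0.0
    return masked
```

Swapping which half is masked turns the very same completion network into an image-to-sketch generator, the symmetry the abstract's third advantage points out.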

    Two-Stream Neural Networks for Tampered Face Detection

    We propose a two-stream network for face tampering detection. We train GoogLeNet to detect tampering artifacts in a face classification stream, and train a patch-based triplet network as a second stream to leverage features capturing local noise residuals and camera characteristics. In addition, we use two different online face swapping applications to create a new dataset of 2010 tampered images, each containing a tampered face. We evaluate the proposed two-stream network on our newly collected dataset. Experimental results demonstrate the effectiveness of our method.
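The patch stream is trained with a triplet objective: patches from the same source image (same camera/noise characteristics) are pulled together in embedding space, patches from different sources pushed apart. A minimal squared-distance triplet margin loss, shown on plain NumPy vectors as an illustration of the objective rather than the paper's training code, looks like:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss on patch embeddings.

    Encourages d(anchor, positive) + margin <= d(anchor, negative),
    i.e. same-source patches closer than different-source patches.
    """
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

At test time, inconsistent patch embeddings within one face region then signal that the region came from a different source than the rest of the image.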

    Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification

    Single image super-resolution has been a popular research topic for the last two decades and has recently received a new wave of interest due to deep neural networks. In this paper, we approach this problem from a different perspective. With respect to a downsampled low-resolution image, we model a high-resolution image as a combination of two components: a deterministic component and a stochastic component. The deterministic component can be recovered from the low-frequency signals in the downsampled image. The stochastic component, on the other hand, contains the signals that have little correlation with the low-resolution image. We adopt two complementary methods for generating these two components: generative adversarial networks are used for the stochastic component, while deterministic component reconstruction is formulated as a regression problem solved with deep neural networks. Since the deterministic component exhibits clearer local orientations, we design novel loss functions tailored to such properties for training the deep regression network. These two methods are first applied to the entire input image to produce two distinct high-resolution images. Afterwards, these two images are fused together by another deep neural network that also performs local statistical rectification, which tries to make the local statistics of the fused image match those of the ground-truth image. Quantitative results and a user study indicate that the proposed method outperforms existing state-of-the-art algorithms by a clear margin. Comment: to appear in SIGGRAPH Asia 201
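The deterministic/stochastic split can be made concrete with a toy decomposition. Box downsampling and nearest-neighbor upsampling stand in here for the paper's actual degradation model; the point is the additive split: the deterministic part is what survives the downsample, and the stochastic residual is the high-frequency detail weakly correlated with the LR observation.

```python
import numpy as np

def decompose(hr_image, scale=4):
    """Split an HR image into deterministic and stochastic components.

    The deterministic component is the HR image box-downsampled by `scale`
    and upsampled back by nearest-neighbor repetition; the stochastic
    component is the residual. The two always sum back to the HR image.
    """
    H, W = hr_image.shape
    img = hr_image.astype(np.float64)
    # Box downsample: average each scale x scale block.
    lr = img.reshape(H // scale, scale, W // scale, scale).mean(axis=(1, 3))
    # Nearest-neighbor upsample back to HR size.
    deterministic = np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)
    stochastic = img - deterministic
    return deterministic, stochastic
```

In the paper's pipeline, a regression network predicts the deterministic component from the LR input while a GAN hallucinates a plausible stochastic component, and a fusion network recombines them.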

    An Evolutionary Computing Enriched RS Attack Resilient Medical Image Steganography Model for Telemedicine Applications

    The recent advancement in computing technologies and the resulting vision-based applications have given rise to a novel practice called telemedicine, in which patient diagnosis images or allied information are used to recommend, or even perform, diagnosis remotely. However, accurate and optimal telemedicine requires seamless, flawless biomedical information about the patient. On the contrary, medical data transmitted over an insecure channel remains prone to manipulation or corruption by attackers. Existing cryptosystems alone are not sufficient to deal with these issues, and hence in this paper a highly robust reversible image steganography model has been developed for secret information hiding. Unlike traditional wavelet transform techniques, we incorporate the Discrete Ripplet Transform (DRT) for message embedding in the medical cover images. In addition, to assure seamless communication over an insecure channel, a dual cryptosystem model combining the proposed steganography scheme and the RSA cryptosystem has been developed. One of the key novelties of the proposed work is the use of an adaptive genetic algorithm (AGA) for the optimal pixel adjustment process (OPAP), which enriches both data hiding capacity and imperceptibility. The performance assessment reveals that the proposed steganography model outperforms other wavelet-transform-based approaches in terms of high PSNR, embedding capacity, imperceptibility, etc. Comment: 14 pages, 3 figures, 6 tables, Multidimensional Systems and Signal Processing 201
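The OPAP step that the AGA tunes can be sketched in its plain, classical form: after substituting k secret bits into a pixel's LSBs, the pixel is optionally shifted by ±2^k when that shrinks the embedding error without disturbing the embedded bits. This is standard OPAP only, the genetic-algorithm optimization from the paper is not shown.

```python
def opap_embed(pixel, bits, k=2):
    """Embed k secret bits into a pixel's LSBs with optimal pixel adjustment.

    After LSB substitution, OPAP adds or subtracts 2**k when doing so
    reduces the distance to the original pixel; either shift leaves the
    k embedded LSBs unchanged.
    """
    value = int("".join(str(b) for b in bits), 2)
    stego = (pixel & ~((1 << k) - 1)) | value      # plain LSB substitution
    # Candidate adjustments that keep the embedded bits intact.
    candidates = [stego, stego + (1 << k), stego - (1 << k)]
    candidates = [c for c in candidates if 0 <= c <= 255]
    return min(candidates, key=lambda c: abs(c - pixel))
```

For example, embedding bits 00 into pixel 7 (binary 111) gives 4 by plain substitution (error 3), but OPAP shifts it to 8 (error 1) while the two LSBs remain 00, which is exactly the imperceptibility gain the abstract attributes to this step.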