Fidelity-Naturalness Evaluation of Single Image Super Resolution
We study the problem of evaluating super resolution methods. Traditional evaluation methods usually judge the quality of super resolved images based on a single measure of their difference from the original high resolution images. In this paper, we propose to use both fidelity (the difference from the original images) and naturalness (human visual perception of the super resolved images) for evaluation. For fidelity evaluation, a new metric is proposed to solve the bias problem of traditional evaluation. For naturalness evaluation, we let humans label their preferences among super resolution results using pair-wise comparisons, and test the correlation between the human labels and the outputs of image quality assessment metrics. Experimental results show that our fidelity-naturalness method is better than the traditional evaluation method for assessing super resolution methods, which could help future research on single-image super resolution.
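As a rough illustration of the naturalness side of such a protocol, the sketch below aggregates pair-wise human preferences into per-method scores and checks their rank correlation with an IQA metric's outputs. The vote matrix, the win-fraction aggregation, and the metric values are illustrative assumptions, not the paper's data or exact procedure.

```python
# Minimal sketch: aggregate pair-wise human preferences into per-method scores
# and correlate them with an image quality assessment metric's outputs.
# `pairwise_votes` and `metric_scores` are made-up values for illustration.
import numpy as np
from scipy.stats import spearmanr

# pairwise_votes[i][j] = number of times method i was preferred over method j
pairwise_votes = np.array([
    [0, 12, 18],
    [8, 0, 15],
    [2, 5, 0],
])

# Simple aggregation: fraction of comparisons each method won.
wins = pairwise_votes.sum(axis=1)
totals = wins + pairwise_votes.sum(axis=0)
human_scores = wins / totals

# Hypothetical IQA metric outputs for the same three methods.
metric_scores = np.array([0.81, 0.74, 0.55])

# Rank correlation tells us how well the metric agrees with human preference.
rho, p_value = spearmanr(human_scores, metric_scores)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
```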
Image Reconstruction with Predictive Filter Flow
We propose a simple, interpretable framework for solving a wide range of
image reconstruction problems such as denoising and deconvolution. Given a
corrupted input image, the model synthesizes a spatially varying linear filter
which, when applied to the input image, reconstructs the desired output. The
model parameters are learned using supervised or self-supervised training. We
test this model on three tasks: non-uniform motion blur removal,
lossy-compression artifact reduction and single image super resolution. We
demonstrate that our model substantially outperforms state-of-the-art methods
on all these tasks and is significantly faster than optimization-based
approaches to deconvolution. Unlike models that directly predict output pixel
values, the predicted filter flow is controllable and interpretable, which we
demonstrate by visualizing the space of predicted filters for different tasks.
Comment: https://www.ics.uci.edu/~skong2/pff.htm
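A minimal sketch of the core operation described here, applying a spatially varying (per-pixel) linear filter to the input image. The filter-predicting network itself is omitted; `apply_filter_flow` and its toy inputs are illustrative names rather than the paper's code.

```python
# Apply one k x k kernel per output pixel to the (reflect-padded) input image.
import numpy as np

def apply_filter_flow(image, filters):
    """image: (H, W) grayscale; filters: (H, W, k, k), one kernel per pixel."""
    H, W = image.shape
    k = filters.shape[-1]
    pad = k // 2
    padded = np.pad(image, pad, mode="reflect")
    out = np.empty_like(image, dtype=np.float64)
    for y in range(H):
        for x in range(W):
            patch = padded[y:y + k, x:x + k]
            out[y, x] = np.sum(patch * filters[y, x])
    return out

# Toy example: every pixel gets a 3x3 box filter, so the output is a blurred image.
img = np.random.rand(8, 8)
box = np.full((3, 3), 1.0 / 9.0)
filters = np.tile(box, (8, 8, 1, 1))
blurred = apply_filter_flow(img, filters)
```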
PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph
Despite some exciting progress on high-quality image generation from structured (scene graph) or free-form (sentence) descriptions, most existing methods only guarantee image-level semantic consistency, i.e., that the generated image matches the semantic meaning of the description. They have not yet investigated synthesizing images in a more controllable way, such as finely manipulating the visual appearance of every object. Therefore, to generate images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating an image from the scene graph
and the image crops, where spatial arrangements of the objects and their
pair-wise relationships are defined by the scene graph and the object
appearances are determined by the given object crops. To enhance the
interactions of the objects in the output, we design a Crop Refining Network
and an Object-Image Fuser to embed the objects as well as their relationships
into one map. Multiple losses work collaboratively to ensure that the generated images closely respect the crops and comply with the scene graphs while
maintaining excellent image quality. A crop selector is also proposed to pick
the most-compatible crops from our external object tank by encoding the
interactions around the objects in the scene graph if the crops are not
provided. Evaluated on the Visual Genome and COCO-Stuff datasets, our proposed method significantly outperforms the state-of-the-art methods on Inception Score, Diversity Score, and Fréchet Inception Distance. Extensive experiments also demonstrate our method's ability to generate complex and diverse images with given objects.
Comment: 10 pages, 6 figures; Accepted by NeurIPS 201
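The spatial "pasting" step can be illustrated very roughly as below: object crops are resized into the boxes implied by the scene graph and placed on a canvas. The learned Crop Refining Network and Object-Image Fuser are omitted; `paste_crops` and the toy crops are purely illustrative.

```python
# Place object crops onto a scene canvas at given boxes (a stand-in for the
# spatial-arrangement step; the learned fusion modules are not shown).
import numpy as np
from PIL import Image

def paste_crops(canvas_size, crops, boxes):
    """crops: list of HxWx3 uint8 arrays; boxes: list of (x0, y0, x1, y1) in pixels."""
    canvas = np.zeros((canvas_size, canvas_size, 3), dtype=np.uint8)
    for crop, (x0, y0, x1, y1) in zip(crops, boxes):
        w, h = x1 - x0, y1 - y0
        resized = np.array(Image.fromarray(crop).resize((w, h)))
        canvas[y0:y1, x0:x1] = resized
    return canvas

# Toy usage: two flat-colored "crops" arranged as sky over grass.
sky = np.full((32, 32, 3), (120, 180, 255), dtype=np.uint8)
grass = np.full((32, 32, 3), (60, 160, 60), dtype=np.uint8)
layout = paste_crops(128, [sky, grass], [(0, 0, 128, 64), (0, 64, 128, 128)])
```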
Pixels to Graphs by Associative Embedding
Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene, but they can also capture the interactions between pairs of objects. We present a method for training a
convolutional neural network such that it takes in an input image and produces
a full graph definition. This is done end-to-end in a single stage with the use
of associative embeddings. The network learns to simultaneously identify all of
the elements that make up a graph and piece them together. We benchmark on the
Visual Genome dataset, and demonstrate state-of-the-art performance on the
challenging task of scene graph generation.
Comment: Updated numbers. Code and pretrained models available at https://github.com/umich-vl/px2grap
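A minimal sketch of the associative-embedding decoding idea suggested by the abstract: relationships carry embedding "tags", and each is attached to the object whose embedding is nearest. The scalar tags below are made-up values, not actual network outputs.

```python
# Match each predicted relationship to objects by nearest embedding tag.
import numpy as np

object_embeddings = np.array([0.1, 0.9, 1.7])   # one scalar tag per detected object
relation_src_tags = np.array([0.12, 1.68])       # "subject" tag predicted for each relation
relation_dst_tags = np.array([0.88, 0.95])       # "object" tag predicted for each relation

def match(tags, obj_emb):
    # For every relation tag, pick the object with the closest embedding.
    return np.argmin(np.abs(tags[:, None] - obj_emb[None, :]), axis=1)

subjects = match(relation_src_tags, object_embeddings)   # -> [0, 2]
objects_ = match(relation_dst_tags, object_embeddings)   # -> [1, 1]
edges = list(zip(subjects.tolist(), objects_.tolist()))  # graph edges as (subject, object) pairs
print(edges)  # [(0, 1), (2, 1)]
```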
Hierarchical Watermarking Framework Based on Analysis of Local Complexity Variations
Increasing production and exchange of multimedia content has increased the
need for better protection of copyright by means of watermarking. Different
methods have been proposed to satisfy the tradeoff between imperceptibility and
robustness as two important characteristics in watermarking while maintaining
proper data-embedding capacity. Many watermarking methods use an image-independent set of parameters. Different images possess different potentials for robust and
transparent hosting of watermark data. To overcome this deficiency, in this
paper we propose a new hierarchical adaptive watermarking framework. At
the higher level of the hierarchy, the complexity of an image is ranked in comparison with the complexities of the images of a dataset. For a typical dataset of images, the
statistical distribution of block complexities is found. At the lower level of
the hierarchy, for a single cover image that is to be watermarked, complexities
of blocks can be found. Local complexity variation (LCV) between a block and its
neighbors is used to adaptively control the watermark strength factor of each
block. Such local complexity analysis creates an adaptive embedding scheme,
which results in higher transparency by reducing blockiness effects. This two-level hierarchy enables our method to take advantage of all image blocks to
elevate the embedding capacity while preserving imperceptibility. For testing
the effectiveness of the proposed framework, contourlet transform (CT) in
conjunction with discrete cosine transform (DCT) is used to embed pseudo-random
binary sequences as watermark. Experimental results show that the proposed
framework elevates the performance of the watermarking routine in terms of both robustness and transparency.
Comment: 12 pages, 14 figures, 8 tables
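As a rough sketch of the lower level of the hierarchy, the code below computes per-block complexities, the local complexity variation (LCV) of each block against its neighborhood, and an adaptive strength factor per block. The variance-based complexity measure and the linear mapping to a strength factor are assumptions for illustration, not the paper's exact formulas.

```python
# Per-block complexity, LCV against the 8-neighborhood, and adaptive strength.
import numpy as np

def block_complexities(image, block=8):
    H, W = image.shape
    h, w = H // block, W // block
    blocks = image[:h * block, :w * block].reshape(h, block, w, block)
    return blocks.var(axis=(1, 3))  # one complexity value per block

def local_complexity_variation(comp):
    padded = np.pad(comp, 1, mode="edge")
    neigh_sum = sum(
        padded[1 + dy:1 + dy + comp.shape[0], 1 + dx:1 + dx + comp.shape[1]]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    return np.abs(comp - neigh_sum / 8.0)

def strength_factors(image, base=0.05, gain=0.10):
    lcv = local_complexity_variation(block_complexities(image))
    lcv = lcv / (lcv.max() + 1e-8)   # normalize to [0, 1]
    return base + gain * lcv         # stronger embedding in busier blocks

img = np.random.rand(64, 64)
alpha = strength_factors(img)        # one watermark strength per 8x8 block
```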
(Quasi)Periodicity Quantification in Video Data, Using Topology
This work introduces a novel framework for quantifying the presence and
strength of recurrent dynamics in video data. Specifically, we provide
continuous measures of periodicity (perfect repetition) and quasiperiodicity
(superposition of periodic modes with non-commensurate periods), in a way which
does not require segmentation, training, object tracking or 1-dimensional
surrogate signals. Our methodology operates directly on video data. The
approach combines ideas from nonlinear time series analysis (delay embeddings)
and computational topology (persistent homology), by translating the problem of
finding recurrent dynamics in video data into the problem of determining the circularity or toroidality of an associated geometric space. Through extensive testing, we show the robustness of our scores with respect to several noise models and levels, show that our periodicity score is superior to other methods when compared against human-generated periodicity rankings, and, furthermore, show that our quasiperiodicity score clearly indicates the presence of biphonation in videos of vibrating vocal folds, which had never before been accomplished quantitatively end to end.
Comment: 27 pages, 1 column, 23 figures, SIAM Journal on Imaging Sciences, 201
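The overall shape of such a pipeline can be sketched as follows: a sliding-window (delay) embedding of the frame sequence, persistent homology of the resulting point cloud, and a periodicity score from the most persistent 1-dimensional class. The `ripser` package is an assumed dependency here, and the paper's exact normalization and quasiperiodicity (torus) scoring are omitted.

```python
# Delay embedding of video frames -> persistent homology -> periodicity score.
import numpy as np
from ripser import ripser

def sliding_window_embedding(frames, window):
    # frames: (T, D) array of flattened video frames; one point per window.
    T = frames.shape[0]
    return np.stack([frames[t:t + window].ravel() for t in range(T - window + 1)])

def periodicity_score(frames, window=10):
    cloud = sliding_window_embedding(frames, window)
    h1 = ripser(cloud, maxdim=1)["dgms"][1]       # H1 persistence diagram
    if len(h1) == 0:
        return 0.0
    return float(np.max(h1[:, 1] - h1[:, 0]))     # lifetime of the strongest loop

# Toy usage: a synthetic "video" whose pixels oscillate periodically scores high.
t = np.linspace(0, 8 * np.pi, 200)
frames = np.outer(np.sin(t), np.random.rand(50))  # (T=200, D=50)
print(periodicity_score(frames))
```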
Image Generation from Sketch Constraint Using Contextual GAN
In this paper we investigate image generation guided by a hand sketch. When the input sketch is badly drawn, the output of common image-to-image translation follows the input edges due to the hard condition imposed by the translation process. Instead, we propose to use the sketch as a weak constraint, where the output edges do not necessarily follow the input edges. We address this problem using a novel joint image completion approach, where the sketch provides the image context for completing, or generating, the output image. We train a generative adversarial network, i.e., a contextual GAN, to learn the joint distribution of
sketch and the corresponding image by using joint images. Our contextual GAN
has several advantages. First, the simple joint image representation allows for
simple and effective learning of joint distribution in the same image-sketch
space, which avoids complicated issues in cross-domain learning. Second, while
the output is related to its input overall, the generated features exhibit more
freedom in appearance and do not strictly align with the input features as
previous conditional GANs do. Third, from the joint image's point of view,
image and sketch are of no difference, thus exactly the same deep joint image
completion network can be used for image-to-sketch generation. Experiments
evaluated on three different datasets show that our contextual GAN can generate
more realistic images than state-of-the-art conditional GANs on challenging
inputs and generalize well on common categories.
Comment: ECCV 201
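A minimal sketch of the joint-image representation this formulation relies on: sketch and photo are concatenated into one joint image, and generation becomes completing the masked half. The GAN itself is omitted and all names are illustrative.

```python
# Build joint sketch+photo images and a completion mask over the unknown half.
import numpy as np

def make_joint_image(sketch, photo):
    """sketch, photo: (H, W, 3) arrays of the same size -> (H, 2W, 3) joint image."""
    return np.concatenate([sketch, photo], axis=1)

def completion_mask(height, width, hide="photo"):
    """1 where pixels are observed, 0 where the network must generate them."""
    mask = np.ones((height, 2 * width, 3), dtype=np.float32)
    if hide == "photo":
        mask[:, width:] = 0.0     # image-from-sketch: photo half is unknown
    else:
        mask[:, :width] = 0.0     # sketch-from-image: sketch half is unknown
    return mask

sketch = np.zeros((64, 64, 3), dtype=np.float32)
photo = np.ones((64, 64, 3), dtype=np.float32)
joint = make_joint_image(sketch, photo)         # (64, 128, 3)
mask = completion_mask(64, 64, hide="photo")    # zeros over the photo half
masked_input = joint * mask                     # what the completion net sees
```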
Two-Stream Neural Networks for Tampered Face Detection
We propose a two-stream network for face tampering detection. We train
GoogLeNet to detect tampering artifacts in a face classification stream, and
train a patch-based triplet network to leverage features capturing local noise
residuals and camera characteristics as a second stream. In addition, we use
two different online face swapping applications to create a new dataset that
consists of 2010 tampered images, each of which contains a tampered face. We
evaluate the proposed two-stream network on our newly collected dataset.
Experimental results demonstrate the effectiveness of our method.
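For the patch-based second stream, the training signal is a triplet loss; a minimal, generic version is sketched below with hypothetical patch features and margin, not the paper's exact settings.

```python
# Generic triplet loss: pull anchor and positive patch features together,
# push the negative (e.g., tampered) patch features away by a margin.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """anchor/positive/negative: (D,) patch feature vectors."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Toy usage with hypothetical patch features.
rng = np.random.default_rng(0)
anchor, positive = rng.normal(size=128), rng.normal(size=128)
negative = rng.normal(loc=2.0, size=128)
print(triplet_loss(anchor, positive, negative))
```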
Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification
Single image superresolution has been a popular research topic in the last
two decades and has recently received a new wave of interest due to deep neural
networks. In this paper, we approach this problem from a different perspective.
With respect to a downsampled low resolution image, we model a high resolution
image as a combination of two components, a deterministic component and a
stochastic component. The deterministic component can be recovered from the
low-frequency signals in the downsampled image. The stochastic component, on
the other hand, contains the signals that have little correlation with the low
resolution image. We adopt two complementary methods for generating these two
components. While generative adversarial networks are used for the stochastic
component, deterministic component reconstruction is formulated as a regression
problem solved using deep neural networks. Since the deterministic component
exhibits clearer local orientations, we design novel loss functions tailored
for such properties for training the deep regression network. These two methods
are first applied to the entire input image to produce two distinct
high-resolution images. Afterwards, these two images are fused together using
another deep neural network that also performs local statistical rectification,
which tries to make the local statistics of the fused image match those of the groundtruth image. Quantitative results and a user study indicate that the proposed method outperforms existing state-of-the-art algorithms by a clear margin.
Comment: to appear in SIGGRAPH Asia 201
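As a rough illustration of what local statistical rectification aims at, the sketch below shifts the local mean and standard deviation of one image toward those of a reference within sliding windows. In the paper this is performed by a learned fusion network (with groundtruth statistics available only during training); the window size and moment-matching form here are assumptions.

```python
# Local moment matching: give `source` the windowed mean/std of `reference`.
import numpy as np
from scipy.ndimage import uniform_filter

def local_moment_match(source, reference, size=9, eps=1e-6):
    """source, reference: (H, W) float images; returns source with reference's local stats."""
    mu_s = uniform_filter(source, size)
    mu_r = uniform_filter(reference, size)
    var_s = uniform_filter(source ** 2, size) - mu_s ** 2
    var_r = uniform_filter(reference ** 2, size) - mu_r ** 2
    sigma_s = np.sqrt(np.maximum(var_s, 0)) + eps
    sigma_r = np.sqrt(np.maximum(var_r, 0))
    return (source - mu_s) / sigma_s * sigma_r + mu_r

fused = np.random.rand(64, 64)
reference = np.random.rand(64, 64)
rectified = local_moment_match(fused, reference)
```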
An Evolutionary Computing Enriched RS Attack Resilient Medical Image Steganography Model for Telemedicine Applications
Recent advancements in computing technologies and the resulting vision-based applications have given rise to a novel practice called telemedicine, in which patient diagnosis images or allied information are used to recommend or even perform diagnosis remotely. However, accurate and optimal telemedicine requires seamless and flawless biomedical information about the patient. Unfortunately, medical data transmitted over insecure channels remain prone to manipulation or corruption by attackers. Existing cryptosystems alone are not sufficient to deal with these issues, and hence in this paper a highly robust reversible image steganography model has been developed for secret information hiding. Unlike traditional wavelet transform techniques, we incorporate the Discrete Ripplet Transformation (DRT) technique for message embedding in the medical cover images. In addition, to assure seamless communication over insecure channels, a dual cryptosystem model containing the proposed steganography scheme and an RSA cryptosystem has been developed. One of the key novelties of the proposed research work is the use of an adaptive genetic algorithm (AGA) for the optimal pixel
adjustment process (OPAP), which enhances the data hiding capacity as well as the imperceptibility. The performance assessment reveals that the proposed
steganography model outperforms other wavelet transformation based approaches in terms of PSNR, embedding capacity, imperceptibility, etc.
Comment: 14 pages / 3 figures / 6 tables, Multidimensional Systems and Signal Processing 201
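The optimal pixel adjustment process (OPAP) that the adaptive genetic algorithm tunes can be sketched in its classic spatial-domain form: after k-bit LSB substitution, the stego pixel is shifted by 2^k whenever that reduces the error without disturbing the embedded bits. The genetic-algorithm search and the ripplet-domain embedding are omitted; this is only the standard OPAP rule, not the paper's full scheme.

```python
# k-bit LSB substitution followed by the classic OPAP adjustment.
import numpy as np

def embed_lsb_opap(cover, secret_bits_value, k=2):
    """cover: uint8 pixel; secret_bits_value: integer in [0, 2^k) to embed."""
    stego = (int(cover) & ~((1 << k) - 1)) | secret_bits_value   # plain LSB substitution
    error = stego - int(cover)
    step = 1 << k
    # OPAP: move by one step if it reduces |error| and stays in the valid range.
    if error > step // 2 and stego - step >= 0:
        stego -= step
    elif error < -(step // 2) and stego + step <= 255:
        stego += step
    return np.uint8(stego)

# Toy usage: embed the 2-bit value 3 into pixel 100.
print(embed_lsb_opap(np.uint8(100), 0b11, k=2))   # 99: closer to 100 than 103 would be
```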