Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
Automatically evaluating the quality of dialogue responses for unstructured
domains is a challenging problem. Unfortunately, existing automatic evaluation
metrics are biased and correlate very poorly with human judgements of response
quality. Yet having an accurate automatic evaluation procedure is crucial for
dialogue research, as it allows rapid prototyping and testing of new models
with fewer expensive human evaluations. In response to this challenge, we
formulate automatic dialogue evaluation as a learning problem. We present an
evaluation model (ADEM) that learns to predict human-like scores to input
responses, using a new dataset of human response scores. We show that the ADEM
model's predictions correlate significantly, and at a level much higher than
word-overlap metrics such as BLEU, with human judgements at both the utterance
and system-level. We also show that ADEM can generalize to evaluating dialogue
models unseen during training, an important step for automatic dialogue
evaluation.
Comment: ACL 2017
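As a minimal sketch of the kind of learned scorer the abstract describes (not the exact ADEM architecture), the snippet below regresses a human-like score from bilinear interactions between context, reference, and model-response embeddings; the class name, the embedding dimension, and the assumption of pretrained sentence embeddings are all illustrative.

import torch
import torch.nn as nn

class LearnedDialogueScorer(nn.Module):
    # Hypothetical scorer: rates a model response r_hat against the dialogue
    # context c and the reference response r via learned bilinear interactions.
    def __init__(self, dim=512):
        super().__init__()
        self.M = nn.Parameter(torch.eye(dim))  # context/response interaction
        self.N = nn.Parameter(torch.eye(dim))  # reference/response interaction

    def forward(self, c, r, r_hat):
        # c, r, r_hat: (batch, dim) sentence embeddings from any encoder
        ctx_term = (c @ self.M * r_hat).sum(dim=1)
        ref_term = (r @ self.N * r_hat).sum(dim=1)
        return ctx_term + ref_term  # trained with MSE against human scores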
Semantically Invariant Text-to-Image Generation
Work on image captioning has demonstrated models capable of generating
plausible text given input images or videos. Further, recent work in image
generation has shown significant improvements in image quality when text is
used as a prior. Our work ties these concepts together by creating an
architecture that can enable bidirectional generation of images and text. We
call this network Multi-Modal Vector Representation (MMVR). Along with MMVR, we
propose two improvements to text-conditioned image generation. Firstly, an
n-gram metric-based cost function is introduced that generalizes the caption
with respect to the image. Secondly, multiple semantically similar sentences
are shown to help in generating better images. Qualitative and quantitative
evaluations demonstrate that MMVR improves upon existing text conditioned image
generation results by over 20%, while integrating visual and text modalities.
Comment: 5 pages, 5 figures, published in the 2018 25th IEEE International
Conference on Image Processing (ICIP)
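The n-gram metric based cost function is only named in the abstract; as a rough illustration of what an n-gram overlap term looks like, the hypothetical helper below computes a simple n-gram precision between a generated caption and a reference, whose complement could serve as a cost during training.

from collections import Counter

def ngram_precision(candidate, reference, n=2):
    # Fraction of candidate n-grams that also appear in the reference caption.
    cand, ref = candidate.split(), reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(count, ref_ngrams[g]) for g, count in cand_ngrams.items())
    return overlap / max(sum(cand_ngrams.values()), 1)

# 1.0 - ngram_precision(generated, reference) could then act as a cost term.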
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
While it is nearly effortless for humans to quickly assess the perceptual
similarity between two images, the underlying processes are thought to be quite
complex. Despite this, the most widely used perceptual metrics today, such as
PSNR and SSIM, are simple, shallow functions, and fail to account for many
nuances of human perception. Recently, the deep learning community has found
that features of the VGG network trained on ImageNet classification have been
remarkably useful as a training loss for image synthesis. But how perceptual
are these so-called "perceptual losses"? What elements are critical for their
success? To answer these questions, we introduce a new dataset of human
perceptual similarity judgments. We systematically evaluate deep features
across different architectures and tasks and compare them with classic metrics.
We find that deep features outperform all previous metrics by large margins on
our dataset. More surprisingly, this result is not restricted to
ImageNet-trained VGG features, but holds across different deep architectures
and levels of supervision (supervised, self-supervised, or even unsupervised).
Our results suggest that perceptual similarity is an emergent property shared
across deep visual representations.
Comment: Accepted to CVPR 2018; code and data available at
https://www.github.com/richzhang/PerceptualSimilarity
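For readers who want to try the metric, usage with the authors' released package looks roughly like the following (assuming pip install lpips; images are (N, 3, H, W) tensors scaled to [-1, 1]):

import torch
import lpips

loss_fn = lpips.LPIPS(net='vgg')          # perceptual distance from deep VGG features
img0 = torch.rand(1, 3, 64, 64) * 2 - 1   # stand-in images for illustration
img1 = torch.rand(1, 3, 64, 64) * 2 - 1
distance = loss_fn(img0, img1)            # larger value = more perceptually different
print(distance.item())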
Learned Perceptual Image Enhancement
Learning a typical image enhancement pipeline involves minimization of a loss
function between enhanced and reference images. While L1 and L2 losses are
perhaps the most widely used functions for this purpose, they do not
necessarily lead to perceptually compelling results. In this paper, we show
that adding a learned no-reference image quality metric to the loss can
significantly improve enhancement operators. This metric is implemented using a
CNN (convolutional neural network) trained on a large-scale dataset labelled
with aesthetic preferences of human raters. This loss allows us to conveniently
perform back-propagation in our learning framework to simultaneously optimize
for similarity to a given ground truth reference and perceptual quality. This
perceptual loss is only used to train parameters of image processing operators,
and does not impose any extra complexity at inference time. Our experiments
demonstrate that this loss can be effective for tuning a variety of operators
such as local tone mapping and dehazing.
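A minimal sketch of the training objective described above, assuming a pretrained, frozen no-reference quality network (here the hypothetical quality_net) that maps an image to a scalar aesthetic score:

import torch.nn.functional as F

def enhancement_loss(enhanced, reference, quality_net, weight=0.1):
    # Fidelity to the ground-truth reference image...
    fidelity = F.l1_loss(enhanced, reference)
    # ...plus a learned perceptual-quality term. quality_net stays frozen, so the
    # extra cost is paid only during training, not at inference time.
    quality = quality_net(enhanced).mean()
    return fidelity - weight * quality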
Metric Learning for Generalizing Spatial Relations to New Objects
Human-centered environments are rich with a wide variety of spatial relations
between everyday objects. For autonomous robots to operate effectively in such
environments, they should be able to reason about these relations and
generalize them to objects with different shapes and sizes. For example, having
learned to place a toy inside a basket, a robot should be able to generalize
this concept using a spoon and a cup. This requires a robot to have the
flexibility to learn arbitrary relations in a lifelong manner, making it
challenging for an expert to pre-program it with sufficient knowledge to do so
beforehand. In this paper, we address the problem of learning spatial relations
by introducing a novel method from the perspective of distance metric learning.
Our approach enables a robot to reason about the similarity between pairwise
spatial relations, thereby enabling it to use its previous knowledge when
presented with a new relation to imitate. We show how this makes it possible to
learn arbitrary spatial relations from non-expert users using a small number of
examples and in an interactive manner. Our extensive evaluation with real-world
data demonstrates the effectiveness of our method in reasoning about a
continuous spectrum of spatial relations and generalizing them to new objects.
Comment: Accepted at the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems. The new Freiburg Spatial Relations Dataset and a demo
video of our approach running on the PR-2 robot are available at our project
website: http://spatialrelations.cs.uni-freiburg.de
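As a rough sketch of the distance-metric-learning view taken here (feature extraction and the paper's exact formulation are abstracted away; all names are illustrative), a learned linear embedding can be trained with a triplet loss so that similar spatial relations end up close together:

import torch
import torch.nn as nn

class RelationMetric(nn.Module):
    # Hypothetical learned metric over precomputed spatial-relation features.
    def __init__(self, feat_dim=32, embed_dim=8):
        super().__init__()
        self.proj = nn.Linear(feat_dim, embed_dim, bias=False)

    def distance(self, a, b):
        return torch.norm(self.proj(a) - self.proj(b), dim=-1)

metric = RelationMetric()
triplet = nn.TripletMarginLoss(margin=1.0)
anchor, similar, dissimilar = (torch.rand(4, 32) for _ in range(3))
loss = triplet(metric.proj(anchor), metric.proj(similar), metric.proj(dissimilar))
loss.backward()  # pulls similar relations together under the learned metric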