760 research outputs found
Generative Image Modeling Using Spatial LSTMs
Modeling the distribution of natural images is challenging, partly because of
strong statistical dependencies which can extend over hundreds of pixels.
Recurrent neural networks have been successful in capturing long-range
dependencies in a number of problems but only recently have found their way
into generative image models. We here introduce a recurrent image model based
on multi-dimensional long short-term memory units which are particularly suited
for image modeling due to their spatial structure. Our model scales to images
of arbitrary size and its likelihood is computationally tractable. We find that
it outperforms the state of the art in quantitative comparisons on several
image datasets and produces promising results when used for texture synthesis
and inpainting
A Neural Algorithm of Artistic Style
In fine art, especially painting, humans have mastered the skill to create
unique visual experiences through composing a complex interplay between the
content and style of an image. Thus far the algorithmic basis of this process
is unknown and there exists no artificial system with similar capabilities.
However, in other key areas of visual perception such as object and face
recognition near-human performance was recently demonstrated by a class of
biologically inspired vision models called Deep Neural Networks. Here we
introduce an artificial system based on a Deep Neural Network that creates
artistic images of high perceptual quality. The system uses neural
representations to separate and recombine content and style of arbitrary
images, providing a neural algorithm for the creation of artistic images.
Moreover, in light of the striking similarities between performance-optimised
artificial neural networks and biological vision, our work offers a path
forward to an algorithmic understanding of how humans create and perceive
artistic imagery
A note on the evaluation of generative models
Probabilistic generative models can be used for compression, denoising,
inpainting, texture synthesis, semi-supervised learning, unsupervised feature
learning, and other tasks. Given this wide range of applications, it is not
surprising that a lot of heterogeneity exists in the way these models are
formulated, trained, and evaluated. As a consequence, direct comparison between
models is often difficult. This article reviews mostly known but often
underappreciated properties relating to the evaluation and interpretation of
generative models with a focus on image models. In particular, we show that
three of the currently most commonly used criteria---average log-likelihood,
Parzen window estimates, and visual fidelity of samples---are largely
independent of each other when the data is high-dimensional. Good performance
with respect to one criterion therefore need not imply good performance with
respect to the other criteria. Our results show that extrapolation from one
criterion to another is not warranted and generative models need to be
evaluated directly with respect to the application(s) they were intended for.
In addition, we provide examples demonstrating that Parzen window estimates
should generally be avoided
Neural system identification for large populations separating "what" and "where"
Neuroscientists classify neurons into different types that perform similar
computations at different locations in the visual field. Traditional methods
for neural system identification do not capitalize on this separation of 'what'
and 'where'. Learning deep convolutional feature spaces that are shared among
many neurons provides an exciting path forward, but the architectural design
needs to account for data limitations: While new experimental techniques enable
recordings from thousands of neurons, experimental time is limited so that one
can sample only a small fraction of each neuron's response space. Here, we show
that a major bottleneck for fitting convolutional neural networks (CNNs) to
neural data is the estimation of the individual receptive field locations, a
problem that has been scratched only at the surface thus far. We propose a CNN
architecture with a sparse readout layer factorizing the spatial (where) and
feature (what) dimensions. Our network scales well to thousands of neurons and
short recordings and can be trained end-to-end. We evaluate this architecture
on ground-truth data to explore the challenges and limitations of CNN-based
system identification. Moreover, we show that our network model outperforms
current state-of-the art system identification models of mouse primary visual
cortex.Comment: NIPS 201
A Generative Model of Natural Texture Surrogates
Natural images can be viewed as patchworks of different textures, where the
local image statistics is roughly stationary within a small neighborhood but
otherwise varies from region to region. In order to model this variability, we
first applied the parametric texture algorithm of Portilla and Simoncelli to
image patches of 64X64 pixels in a large database of natural images such that
each image patch is then described by 655 texture parameters which specify
certain statistics, such as variances and covariances of wavelet coefficients
or coefficient magnitudes within that patch.
To model the statistics of these texture parameters, we then developed
suitable nonlinear transformations of the parameters that allowed us to fit
their joint statistics with a multivariate Gaussian distribution. We find that
the first 200 principal components contain more than 99% of the variance and
are sufficient to generate textures that are perceptually extremely close to
those generated with all 655 components. We demonstrate the usefulness of the
model in several ways: (1) We sample ensembles of texture patches that can be
directly compared to samples of patches from the natural image database and can
to a high degree reproduce their perceptual appearance. (2) We further
developed an image compression algorithm which generates surprisingly accurate
images at bit rates as low as 0.14 bits/pixel. Finally, (3) We demonstrate how
our approach can be used for an efficient and objective evaluation of samples
generated with probabilistic models of natural images.Comment: 34 pages, 9 figure
- …