Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization
We propose a MultiScale AutoEncoder (MSAE)-based extreme image compression
framework to offer visually pleasing reconstruction at a very low bitrate. Our
method leverages "priors" at different resolution scales to improve compression
efficiency, and employs a generative adversarial network (GAN) with multiscale
discriminators to perform end-to-end trainable rate-distortion optimization. We
compare the perceptual quality of our reconstructions with traditional
compression algorithms, namely High-Efficiency Video Coding (HEVC) Intra
Profile and JPEG2000, on the public Cityscapes and ADE20K datasets,
demonstrating significant subjective quality improvement.
Comment: Accepted to IEEE VCIP 2019 as an oral presentation.
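The rate-distortion tradeoff that such frameworks optimize end to end can be illustrated in miniature: quantizing a latent representation more coarsely lowers the bitrate (the entropy of the quantized symbols) while increasing the distortion. The sketch below uses a random Gaussian stand-in for an encoder output and a uniform quantizer; it is a generic illustration, not the paper's learned transforms or entropy model:

```python
import numpy as np

def entropy_bits(symbols):
    """Empirical entropy (bits per symbol) of a discrete symbol array."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def rate_distortion(latent, step):
    """Uniformly quantize the latent with the given step size and return
    (rate in bits per sample, mean squared distortion)."""
    symbols = np.round(latent / step).astype(int)
    recon = symbols * step
    return entropy_bits(symbols), float(np.mean((latent - recon) ** 2))

rng = np.random.default_rng(0)
latent = rng.normal(size=10_000)  # stand-in for an encoder output
rate_fine, dist_fine = rate_distortion(latent, step=0.1)      # high rate, low distortion
rate_coarse, dist_coarse = rate_distortion(latent, step=1.0)  # low rate, high distortion
```

A learned codec replaces the fixed quantizer and empirical entropy with trainable transforms and an entropy model, and the adversarial term trades some pixel-wise distortion for perceptual quality.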
Siamese Encoding and Alignment by Multiscale Learning with Self-Supervision
We propose a method of aligning a source image to a target image, where the
transform is specified by a dense vector field. The two images are encoded as
feature hierarchies by siamese convolutional nets. Then a hierarchy of aligner
modules computes the transform in a coarse-to-fine recursion. Each module
receives as input the transform that was computed by the module at the level
above, aligns the source and target encodings at the same level of the
hierarchy, and then computes an improved approximation to the transform using a
convolutional net. The entire architecture of encoder and aligner nets is
trained in a self-supervised manner to minimize the squared error between
source and target remaining after alignment. We show that siamese encoding
enables more accurate alignment than the image pyramids of SPyNet, a previous
deep learning approach to coarse-to-fine alignment. Furthermore,
self-supervision applies even without target values for the transform, unlike
the strongly supervised SPyNet. We also show that our approach outperforms
one-shot approaches to alignment, because the fine pathways in the latter
approach may fail to contribute to alignment accuracy when displacements are
large. As shown by previous one-shot approaches, good results from
self-supervised learning require that the loss function additionally penalize
non-smooth transforms. We demonstrate that "masking out" the penalty function
near discontinuities leads to correct recovery of non-smooth transforms. Our
claims are supported by empirical comparisons using images from serial section
electron microscopy of brain tissue.
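The coarse-to-fine recursion can be reduced to its essentials with a toy 1D example: estimate a transform at the coarsest level, then pass it down and refine it with a small local search at each finer level. The sketch below restricts the transform to a single global integer shift and uses plain downsampling instead of learned siamese encodings, so it illustrates only the recursion, not the paper's architecture:

```python
import numpy as np

def best_shift(src, tgt, center, radius):
    """Exhaustively search shifts in [center-radius, center+radius]
    and return the one minimizing the squared alignment error."""
    cands = list(range(center - radius, center + radius + 1))
    errs = [np.sum((np.roll(src, s) - tgt) ** 2) for s in cands]
    return cands[int(np.argmin(errs))]

def coarse_to_fine_shift(src, tgt, levels=3, radius=2):
    """Estimate the shift aligning src to tgt by refining a coarse
    estimate level by level over a factor-2 downsampling pyramid."""
    shift = 0
    for lvl in range(levels - 1, -1, -1):
        f = 2 ** lvl
        shift = best_shift(src[::f], tgt[::f], shift, radius)
        if lvl > 0:
            shift *= 2  # carry the estimate to the next finer level
    return shift

x = np.arange(64)
tgt = np.exp(-((x - 30) ** 2) / 20.0)  # smooth bump
src = np.roll(tgt, -5)                 # displaced source
```

Because each level only searches a small radius around the upsampled coarse estimate, large displacements are resolved at low resolution, which is exactly the failure mode of one-shot fine pathways noted in the abstract.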
The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?
An open problem in machine learning is whether flat minima generalize better
and how to compute such minima efficiently. This is a very challenging problem.
As a first step towards understanding this question we formalize it as an
optimization problem with weakly interacting agents. We review appropriate
background material from the theory of stochastic processes and provide
insights that are relevant to practitioners. We propose an algorithmic
framework for an extended stochastic gradient Langevin dynamics and illustrate
its potential. The paper is written as a tutorial, and presents an alternative
use of multi-agent learning. Our primary focus is on the design of algorithms
for machine learning applications; however, the underlying mathematical
framework is suitable for understanding large-scale systems of agent-based
models that are popular in the social sciences, economics, and finance.
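As a concrete point of reference, plain (single-agent) stochastic gradient Langevin dynamics adds Gaussian noise, scaled by the step size and an inverse temperature, to each gradient step; the noise is what lets the iterate escape narrow minima. The toy below applies it to a hypothetical double-well potential and is only a baseline sketch, not the paper's interacting-agent extension:

```python
import numpy as np

def grad_U(x):
    """Gradient of a toy double-well potential U(x) = (x^2 - 1)^2 + 0.3*x,
    whose deeper minimum lies near x = -1 (hypothetical objective)."""
    return 4.0 * x * (x ** 2 - 1.0) + 0.3

def sgld(x0, eta=0.01, beta=5.0, steps=5000, seed=0):
    """Stochastic gradient Langevin dynamics:
    x <- x - eta * grad_U(x) + sqrt(2 * eta / beta) * N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = x0
    traj = np.empty(steps)
    for t in range(steps):
        x = x - eta * grad_U(x) + np.sqrt(2.0 * eta / beta) * rng.normal()
        traj[t] = x
    return traj

traj = sgld(x0=1.0)  # start in the shallower basin
```

Lowering the temperature (raising `beta`) recovers plain gradient descent; the multi-agent schemes in the paper couple several such dynamics through an interaction term.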
Convolutional nets for reconstructing neural circuits from brain images acquired by serial section electron microscopy
Neural circuits can be reconstructed from brain images acquired by serial
section electron microscopy. Image analysis has been performed by manual labor
for half a century, and efforts at automation date back almost as far.
Convolutional nets were first applied to neuronal boundary detection a dozen
years ago, and have now achieved impressive accuracy on clean images. Robust
handling of image defects is a major outstanding challenge. Convolutional nets
are also being employed for other tasks in neural circuit reconstruction:
finding synapses and identifying synaptic partners, extending or pruning
neuronal reconstructions, and aligning serial section images to create a 3D
image stack. Computational systems are being engineered to handle petavoxel
images of cubic-millimeter brain volumes.
A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems
A deep-learning-based surrogate model is developed and applied for predicting
dynamic subsurface flow in channelized geological models. The surrogate model
is based on deep convolutional and recurrent neural network architectures,
specifically a residual U-Net and a convolutional long short-term memory
recurrent network. Training samples entail global pressure and saturation maps,
at a series of time steps, generated by simulating oil-water flow in many (1500
in our case) realizations of a 2D channelized system. After training, the
`recurrent R-U-Net' surrogate model is shown to be capable of accurately
predicting dynamic pressure and saturation maps and well rates (e.g.,
time-varying oil and water rates at production wells) for new geological
realizations. Assessments demonstrating high surrogate-model accuracy are
presented for an individual geological realization and for an ensemble of 500
test geomodels. The surrogate model is then used for the challenging problem of
data assimilation (history matching) in a channelized system. For this study,
posterior reservoir models are generated using the randomized maximum
likelihood method, with the permeability field represented using the recently
developed CNN-PCA parameterization. The flow responses required during the data
assimilation procedure are provided by the recurrent R-U-Net. The overall
approach is shown to lead to substantial reduction in prediction uncertainty.
High-fidelity numerical simulation results for the posterior geomodels
(generated by the surrogate-based data assimilation procedure) are shown to be
in essential agreement with the recurrent R-U-Net predictions. The accuracy and
dramatic speedup provided by the surrogate model suggest that it may eventually
enable the application of more formal posterior sampling methods in realistic
problems.
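The randomized maximum likelihood step has a closed form in the linear-Gaussian setting, which is useful for intuition even though the paper applies it with the CNN-PCA parameterization and the surrogate's nonlinear flow responses. The sketch below uses a hypothetical 2x2 forward operator; each sample perturbs both the prior realization and the observed data, then solves the resulting least-squares problem:

```python
import numpy as np

def rml_sample(G, m_prior, C_m, d_obs, C_d, rng=None):
    """One randomized-maximum-likelihood sample for a linear forward
    model d = G m. With rng=None no perturbation is applied, so the
    result is the MAP estimate."""
    if rng is None:
        m_pr, d = m_prior, d_obs
    else:
        m_pr = rng.multivariate_normal(m_prior, C_m)  # perturbed prior draw
        d = rng.multivariate_normal(d_obs, C_d)       # perturbed data
    K = C_m @ G.T @ np.linalg.inv(G @ C_m @ G.T + C_d)  # Kalman-type gain
    return m_pr + K @ (d - G @ m_pr)

G = np.array([[1.0, 0.0], [1.0, 1.0]])  # hypothetical forward operator
C_m = np.eye(2)                          # prior covariance
C_d = 0.1 * np.eye(2)                    # data-noise covariance
m_prior = np.zeros(2)
d_obs = np.array([1.0, 2.0])

m_map = rml_sample(G, m_prior, C_m, d_obs, C_d)  # deterministic MAP estimate
samples = [rml_sample(G, m_prior, C_m, d_obs, C_d,
                      rng=np.random.default_rng(i)) for i in range(200)]
```

In the paper's setting the forward evaluations `G m` are replaced by recurrent R-U-Net predictions, which is what makes repeated minimization affordable.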
Prediction of Discretization of GMsFEM using Deep Learning
In this paper, we propose a deep-learning-based approach to a class of
multiscale problems. The Generalized Multiscale Finite Element Method (GMsFEM)
has been proven successful as a model reduction technique for flow problems in
heterogeneous and high-contrast porous media. The key ingredients of GMsFEM
include multiscale basis functions and coarse-scale parameters, which are
obtained by solving local problems in each coarse neighborhood. Given a fixed
medium, these quantities are precomputed in an offline stage, resulting in a
reduced-order model. However, they have to be recomputed whenever the medium
varies. The objective of our work is
to make use of deep learning techniques to mimic the nonlinear relation between
the permeability field and the GMsFEM discretizations, and use neural networks
to perform fast computation of GMsFEM ingredients repeatedly for a class of
media. We provide numerical experiments to investigate the predictive power of
neural networks and the usefulness of the resultant multiscale model in solving
channelized porous media flow problems.
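The offline/online split can be mimicked with any expensive per-medium quantity and any cheap regressor. In the sketch below, the smallest eigenvalue of a permeability-dependent matrix stands in for the GMsFEM ingredients, and a linear least-squares fit stands in for the paper's deep network; both choices are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

def offline_quantity(perm):
    """Expensive stand-in for a GMsFEM ingredient: the smallest
    eigenvalue of a stiffness-like matrix built from the permeability."""
    A = np.diag(perm) + 0.1 * np.ones((len(perm), len(perm)))
    return np.linalg.eigvalsh(A)[0]

# "offline" stage: many media and their precomputed quantities
X = rng.uniform(0.5, 2.0, size=(200, 8))          # toy permeability fields
y = np.array([offline_quantity(p) for p in X])

# cheap surrogate: linear least squares (stand-in for the neural network)
Xb = np.hstack([X, np.ones((200, 1))])            # append intercept column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
pred = Xb @ w                                     # fast "online" predictions
mse = float(np.mean((pred - y) ** 2))
```

The point is structural: once the map from medium to discretization quantities is fitted, new media cost a single forward pass instead of a batch of local solves.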
InGAN: Capturing and Remapping the "DNA" of a Natural Image
Generative Adversarial Networks (GANs) typically learn a distribution of
images in a large image dataset, and are then able to generate new images from
this distribution. However, each natural image has its own internal statistics,
captured by its unique distribution of patches. In this paper we propose an
"Internal GAN" (InGAN) - an image-specific GAN - which trains on a single input
image and learns its internal distribution of patches. It is then able to
synthesize a plethora of new natural images of significantly different sizes,
shapes and aspect-ratios - all with the same internal patch-distribution (same
"DNA") as the input image. In particular, despite large changes in global
size/shape of the image, all elements inside the image maintain their local
size/shape. InGAN is fully unsupervised, requiring no additional data other
than the input image itself. Once trained on the input image, it can remap the
input to any size or shape in a single feedforward pass, while preserving the
same internal patch distribution. InGAN provides a unified framework for a
variety of tasks, bridging the gap between textures and natural images.
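The notion of an image's patch "DNA" can be made concrete by summarizing the statistics of all k-by-k patches: a crop of the same texture shares those statistics, while a different texture does not. The sketch below uses synthetic random textures and only the first-order moment, a far weaker summary than the full patch distribution InGAN matches:

```python
import numpy as np

def patch_stats(img, k=5):
    """Summarize the internal patch distribution of an image by the
    mean and standard deviation over all k-by-k patches."""
    H, W = img.shape
    patches = np.stack([img[i:i + k, j:j + k].ravel()
                        for i in range(H - k + 1)
                        for j in range(W - k + 1)])
    return float(patches.mean()), float(patches.std())

rng = np.random.default_rng(0)
texture = rng.uniform(size=(48, 48))     # stand-in for a natural image
crop = texture[:32, :40]                 # different size, same "DNA"
other = rng.uniform(size=(32, 40)) ** 3  # different patch statistics

m_tex, _ = patch_stats(texture)
m_crop, _ = patch_stats(crop)
m_other, _ = patch_stats(other)
```

InGAN's discriminator plays the role of this summary, judging whether the output's patches are distributed like the input's regardless of the output's global size or shape.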
Bandwidth Extension on Raw Audio via Generative Adversarial Networks
Neural network-based methods have recently demonstrated state-of-the-art
results on image synthesis and super-resolution tasks, in particular by using
variants of generative adversarial networks (GANs) with supervised feature
losses. Nevertheless, previous feature loss formulations rely on the
availability of large auxiliary classifier networks, and labeled datasets that
enable such classifiers to be trained. Furthermore, there has been
comparatively little work to explore the applicability of GAN-based methods to
domains other than images and video. In this work we explore a GAN-based method
for audio processing, and develop a convolutional neural network architecture
to perform audio super-resolution. In addition to several new architectural
building blocks for audio processing, a key component of our approach is the
use of an autoencoder-based loss that enables training in the GAN framework,
with feature losses derived from unlabeled data. We explore the impact of our
architectural choices, and demonstrate significant improvements over previous
works in terms of both objective and perceptual quality.
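The key idea of a feature loss is to compare signals in a feature space rather than sample by sample. The sketch below uses fixed log-magnitude spectra of short frames as a hand-crafted stand-in for the learned autoencoder features, and a crude sample-and-hold signal as a stand-in for bandwidth-limited audio:

```python
import numpy as np

def log_spec_features(x, frame=64):
    """Log-magnitude spectra of non-overlapping frames: a fixed,
    hand-crafted stand-in for learned autoencoder features."""
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def feature_loss(x, y):
    """Squared distance between feature maps rather than raw waveforms."""
    return float(np.mean((log_spec_features(x) - log_spec_features(y)) ** 2))

t = np.linspace(0.0, 1.0, 1024, endpoint=False)
clean = np.sin(2 * np.pi * 220 * t)
degraded = np.sin(2 * np.pi * 220 * t[::2]).repeat(2)  # crude low-bandwidth version
```

The paper's contribution is to learn such a feature space with an autoencoder from unlabeled audio, removing the need for a large labeled classifier network.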
FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models
A promising class of generative models maps points from a simple distribution
to a complex distribution through an invertible neural network.
Likelihood-based training of these models requires restricting their
architectures to allow cheap computation of Jacobian determinants.
Alternatively, the Jacobian trace can be used if the transformation is
specified by an ordinary differential equation. In this paper, we use
Hutchinson's trace estimator to give a scalable unbiased estimate of the
log-density. The result is a continuous-time invertible generative model with
unbiased density estimation and one-pass sampling, while allowing unrestricted
neural network architectures. We demonstrate our approach on high-dimensional
density estimation, image generation, and variational inference, achieving the
state-of-the-art among exact likelihood methods with efficient sampling.
Comment: 8 pages, 6 figures.
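Hutchinson's estimator is easy to state on its own: for a random probe vector eps with identity covariance (e.g., Rademacher), E[eps^T A eps] = tr(A), so the trace can be estimated from matrix-vector products alone, without materializing the Jacobian. A minimal sketch:

```python
import numpy as np

def hutchinson_trace(matvec, dim, n_samples=10_000, seed=0):
    """Unbiased trace estimate tr(A) ~ E[eps^T A eps] using Rademacher
    probe vectors; only matrix-vector products with A are required."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.choice([-1.0, 1.0], size=dim)
        total += eps @ matvec(eps)
    return total / n_samples

A = np.array([[2.0, 0.3, 0.0],
              [0.1, 1.0, 0.2],
              [0.0, 0.4, 3.0]])
est = hutchinson_trace(lambda v: A @ v, dim=3)
# est is close to the exact trace 6.0
```

In FFJORD the matrix-vector product is a vector-Jacobian product obtained by automatic differentiation, so the log-density can be estimated for unrestricted architectures.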
Fast-Slow Recurrent Neural Networks
Processing sequential data of variable length is a major challenge in a wide
range of applications, such as speech recognition, language modeling,
generative image modeling and machine translation. Here, we address this
challenge by proposing a novel recurrent neural network (RNN) architecture, the
Fast-Slow RNN (FS-RNN). The FS-RNN incorporates the strengths of both
multiscale RNNs and deep transition RNNs as it processes sequential data on
different timescales and learns complex transition functions from one time step
to the next. We evaluate the FS-RNN on two character-level language modeling
data sets, Penn Treebank and Hutter Prize Wikipedia, and improve the
state-of-the-art results in bits-per-character (BPC) on both. In addition, an
ensemble of two FS-RNNs achieves a BPC on Hutter Prize Wikipedia that
outperforms the best known compression algorithm with respect to the BPC
measure. We also present an empirical investigation of the learning and
network dynamics of the FS-RNN, which explains the improved performance
compared to other RNN architectures. Our approach is general as any kind of RNN
cell is a possible building block for the FS-RNN architecture, and thus can be
flexibly applied to different tasks.
Comment: Corrected minor typos in Figure 1 and Zoneout citation.
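The fast-slow structure can be sketched independently of the cell type: per time step, the first fast cell reads the input, the slow cell updates once, and the remaining fast cells refine the fast state, the second one conditioned on the slow state. The toy below uses plain tanh cells with tied fast-cell weights for brevity, whereas the paper uses separate LSTM cells:

```python
import numpy as np

def rnn_cell(W, U, h, x):
    """Plain tanh RNN cell (stand-in for the LSTM cells in the paper)."""
    return np.tanh(W @ h + U @ x)

def fs_rnn_step(params, h_fast, h_slow, x):
    """One FS-RNN time step: k fast updates and one slow update."""
    Wf, Uf, Ws, Us, k = params
    h_fast = rnn_cell(Wf, Uf, h_fast, x)       # fast cell 1: reads the input
    h_slow = rnn_cell(Ws, Us, h_slow, h_fast)  # slow cell: single update
    h_fast = rnn_cell(Wf, Uf, h_fast, h_slow)  # fast cell 2: reads slow state
    for _ in range(k - 2):                     # fast cells 3..k: no new input
        h_fast = rnn_cell(Wf, Uf, h_fast, np.zeros_like(h_slow))
    return h_fast, h_slow

rng = np.random.default_rng(0)
d = 8
params = (0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d)),
          0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d)), 4)
h_fast, h_slow = np.zeros(d), np.zeros(d)
for x in rng.normal(size=(5, d)):  # run over a toy input sequence
    h_fast, h_slow = fs_rnn_step(params, h_fast, h_slow, x)
```

The fast chain provides the deep transitions within a time step, while the slow cell carries information across many steps on a longer timescale.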