Decomposed Adversarial Learned Inference
Effective inference for a generative adversarial model remains an important
and challenging problem. We propose a novel approach, Decomposed Adversarial
Learned Inference (DALI), which explicitly matches prior and conditional
distributions in both data and code spaces, and puts a direct constraint on the
dependency structure of the generative model. We derive an equivalent form of
the prior and conditional matching objective that can be optimized efficiently
without any parametric assumption on the data. We validate the effectiveness of
DALI on the MNIST, CIFAR-10, and CelebA datasets by conducting quantitative and
qualitative evaluations. Results demonstrate that DALI significantly improves
both reconstruction and generation as compared to other adversarial inference
models.
A Plug-in Method for Representation Factorization in Connectionist Models
In this article, we focus on decomposing latent representations in generative
adversarial networks or learned feature representations in deep autoencoders
into semantically controllable factors in a semisupervised manner, without
modifying the original trained models. Particularly, we propose factors'
decomposer-entangler network (FDEN) that learns to decompose a latent
representation into mutually independent factors. Given a latent
representation, the proposed framework draws a set of interpretable factors,
each aligned to independent factors of variations by minimizing their total
correlation in an information-theoretic means. As a plug-in method, we have
applied our proposed FDEN to the existing networks of adversarially learned
inference and pioneer network and performed computer vision tasks of
image-to-image translation in semantic ways, e.g., changing styles, while
keeping the identity of a subject, and object classification in a few-shot
learning scheme. We have also validated the effectiveness of the proposed
method with various ablation studies involving qualitative, quantitative, and
statistical examinations.
Comment: in IEEE Transactions on Neural Networks and Learning Systems, 202
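The total-correlation quantity FDEN minimizes has a closed form under a Gaussian approximation of the factors, which makes the idea easy to illustrate. The sketch below is our own simplification (the function name and the Gaussian assumption are ours; FDEN itself estimates this quantity with neural networks):

```python
import numpy as np

def gaussian_total_correlation(z):
    """TC(z) = sum_i H(z_i) - H(z), computed for a Gaussian fit of z.

    TC is zero iff the factors are independent (for Gaussians, iff the
    covariance matrix is diagonal), so minimizing it pushes the latent
    representation toward mutually independent factors.
    """
    cov = np.cov(z, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

rng = np.random.default_rng(0)
independent = rng.standard_normal((5000, 3))          # already independent factors
mixed = independent @ rng.standard_normal((3, 3))     # entangled factors
```

On the independent sample the estimate is near zero, while mixing the factors through a random linear map raises it, which is the signal a decomposer network would minimize.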
Robust Question Answering Through Sub-part Alignment
Current textual question answering models achieve strong performance on
in-domain test sets, but often do so by fitting surface-level patterns in the
data, so they fail to generalize to out-of-distribution settings. To make a
more robust and understandable QA system, we model question answering as an
alignment problem. We decompose both the question and context into smaller
units based on off-the-shelf semantic representations (here, semantic roles),
and align the question to a subgraph of the context in order to find the
answer. We formulate our model as a structured SVM, with alignment scores
computed via BERT, and we can train end-to-end despite using beam search for
approximate inference. Our explicit use of alignments allows us to explore a
set of constraints with which we can prohibit certain types of bad model
behavior arising in cross-domain settings. Furthermore, by investigating
differences in scores across different potential answers, we can seek to
understand what particular aspects of the input lead the model to choose the
answer without relying on post-hoc explanation techniques. We train our model
on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets.
The results show that our model is more robust cross-domain than the standard
BERT QA model, and constraints derived from alignment scores allow us to
effectively trade off coverage and accuracy.
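The alignment view can be pictured with a toy matcher: a small score matrix stands in for the BERT-derived alignment scores between question units and context units, and a greedy one-to-one matcher stands in for the paper's structured beam-search inference (all names and score values below are illustrative assumptions, not the paper's model):

```python
import numpy as np

# toy stand-in for BERT alignment scores:
# rows = question units, columns = context units (e.g., semantic-role spans)
scores = np.array([[0.9, 0.1, 0.2],
                   [0.2, 0.8, 0.3],
                   [0.1, 0.4, 0.7]])

alignment, used = {}, set()
# greedily align the most confident question unit first, enforcing a
# one-to-one constraint (a crude proxy for constrained beam search)
for q in np.argsort(-scores.max(axis=1)):
    c = max((j for j in range(scores.shape[1]) if j not in used),
            key=lambda j: scores[q, j])
    alignment[int(q)] = c
    used.add(c)
```

The one-to-one constraint here is the simplest example of the kind of alignment constraint that can prohibit degenerate model behavior in cross-domain settings.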
Attribute Guided Unpaired Image-to-Image Translation with Semi-supervised Learning
Unpaired Image-to-Image Translation (UIT) focuses on translating images among
different domains by using unpaired data, which has received increasing
research focus due to its practical usage. However, existing UIT schemes suffer
from the need for supervised training, as well as the lack of encoded domain
information. In this paper, we propose an Attribute Guided UIT model termed
AGUIT to tackle these two challenges. AGUIT considers multi-modal and
multi-domain tasks of UIT jointly with a novel semi-supervised setting, which
also offers representation disentanglement and fine control of outputs.
In particular, AGUIT benefits in two ways: (1) It adopts a novel semi-supervised
learning process by translating attributes of labeled data to unlabeled data,
and then reconstructing the unlabeled data by a cycle consistency operation.
(2) It decomposes image representation into domain-invariant content code and
domain-specific style code. The redesigned style code embeds image style into
two variables drawn from a standard Gaussian distribution and the distribution
of domain labels, which facilitates fine control of translation due to the
continuity of both variables. Finally, we introduce a new challenge, i.e.,
disentangled transfer, for UIT models, which adopts the disentangled
representation to translate data less related to the training set. Extensive
experiments demonstrate the capacity of AGUIT over existing state-of-the-art
models.
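The cycle-consistency operation in step (1) can be sketched with invertible linear maps standing in for the translator networks: translate unlabeled data to the target domain, translate back, and penalize the reconstruction error. This is a toy illustration under our own assumptions, not AGUIT's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))            # stand-in "translate to target domain"
W_inv = np.linalg.inv(W)                   # stand-in "translate back to source"

x_unlabeled = rng.standard_normal((8, 4))  # unlabeled source samples (toy)
x_translated = x_unlabeled @ W.T           # forward translation
x_cycled = x_translated @ W_inv.T          # back-translation

# cycle-consistency loss: reconstruction error on the unlabeled data
cycle_loss = np.mean((x_cycled - x_unlabeled) ** 2)
```

With a perfect inverse translator the loss vanishes; in the semi-supervised setting, minimizing this loss is what lets unlabeled data contribute to training.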
An inner-loop free solution to inverse problems using deep neural networks
We propose a new method that uses deep learning techniques to accelerate the
popular alternating direction method of multipliers (ADMM) solution for inverse
problems. The ADMM updates consist of a proximity operator, a least squares
regression that includes a big matrix inversion, and an explicit solution for
updating the dual variables. Typically, inner loops are required to solve the
first two sub-minimization problems due to the intractability of the prior and
the matrix inversion. To avoid such drawbacks or limitations, we propose an
inner-loop free update rule with two pre-trained deep convolutional
architectures. More specifically, we learn a conditional denoising auto-encoder
which imposes an implicit data-dependent prior/regularization on ground-truth
in the first sub-minimization problem. This design follows an empirical
Bayesian strategy, leading to so-called amortized inference. For matrix
inversion in the second sub-problem, we learn a convolutional neural network to
approximate the matrix inversion, i.e., the inverse mapping is learned by
feeding the input through the learned forward network. Note that training this
neural network does not require ground-truth or measurements, i.e., it is
data-independent. Extensive experiments on both synthetic data and real
datasets demonstrate the efficiency and accuracy of the proposed method
compared with the conventional ADMM solution using inner loops for solving
inverse problems.
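The update structure described above can be sketched as plain ADMM with the two inner loops replaced by plug-in operators. Here a soft-threshold denoiser and an explicit matrix inverse stand in for the paper's two pre-trained convolutional networks; the stand-ins, function names, and toy problem are all our assumptions:

```python
import numpy as np

def admm_inner_loop_free(y, A, denoise, approx_inverse, rho=1.0, iters=200):
    """ADMM for min_x prior(x) + ||Ax - y||^2 / 2 with an x = z split.

    `denoise` replaces the inner loop for the proximity operator of the
    prior; `approx_inverse` replaces the inner loop for the big matrix
    inversion (A^T A + rho I)^{-1}.
    """
    x = np.zeros(A.shape[1]); z = x.copy(); u = x.copy()
    for _ in range(iters):
        x = denoise(z - u)                            # proximity step (prior)
        z = approx_inverse(A.T @ y + rho * (x + u))   # least-squares step
        u = u + x - z                                 # explicit dual update
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10); x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true

rho = 1.0
inv = np.linalg.inv(A.T @ A + rho * np.eye(10))       # stand-in for the learned inverse
soft = lambda v: np.sign(v) * np.maximum(np.abs(v) - 0.01, 0.0)  # stand-in denoiser
x_hat = admm_inner_loop_free(y, A, soft, lambda b: inv @ b, rho=rho)
```

Each iteration is now a fixed sequence of three cheap updates, which is the point of replacing both sub-minimizations with learned (here, hand-picked) operators.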
Quantization-Based Regularization for Autoencoders
Autoencoders and their variations provide unsupervised models for learning
low-dimensional representations for downstream tasks. Without proper
regularization, autoencoder models are susceptible to the overfitting problem
and the so-called posterior collapse phenomenon. In this paper, we introduce a
quantization-based regularizer in the bottleneck stage of autoencoder models to
learn meaningful latent representations. We combine both perspectives of Vector
Quantized-Variational AutoEncoders (VQ-VAE) and classical denoising
regularization methods of neural networks. We interpret quantizers as
regularizers that constrain latent representations while fostering a
similarity-preserving mapping at the encoder. Before quantization, we impose
noise on the latent codes and use a Bayesian estimator to optimize the
quantizer-based representation. The introduced bottleneck Bayesian estimator
outputs the posterior mean of the centroids to the decoder, and thus, is
performing soft quantization of the noisy latent codes. We show that our
proposed regularization method results in improved latent representations for
both supervised learning and clustering downstream tasks when compared to
autoencoders using other bottleneck structures.
Comment: AAAI 202
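The bottleneck Bayesian estimator lends itself to a compact sketch: under an assumed Gaussian noise model on the latent codes, the posterior over centroids is a softmax of negative squared distances, and the estimator outputs the posterior mean of the centroids. The function name and the temperature parameter `beta` (an inverse noise variance) are our assumptions:

```python
import numpy as np

def soft_quantize(z_noisy, centroids, beta=1.0):
    """Posterior-mean (soft) quantization of noisy latent codes.

    Assuming isotropic Gaussian noise, the posterior over centroids is a
    softmax over negative squared distances; the decoder then receives
    the posterior mean of the centroids rather than a hard assignment.
    """
    # squared distance from each noisy code to each centroid
    d2 = ((z_noisy[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # posterior responsibilities
    return w @ centroids                          # posterior mean of centroids

centroids = np.array([[0.0, 0.0], [4.0, 4.0]])
z = np.array([[0.1, -0.1]])                       # noisy latent code near centroid 0
z_q = soft_quantize(z, centroids, beta=5.0)
```

As `beta` grows the softmax sharpens and the scheme approaches the hard nearest-centroid assignment of VQ-VAE, which is how the soft quantizer interpolates between denoising regularization and vector quantization.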
InverseNet: Solving Inverse Problems with Splitting Networks
We propose a new method that uses deep learning techniques to solve the
inverse problems. The inverse problem is cast in the form of learning an
end-to-end mapping from observed data to the ground-truth. Inspired by the
splitting strategy widely used in regularized iterative algorithm to tackle
inverse problems, the mapping is decomposed into two networks, with one
handling the inversion of the physical forward model associated with the data
term and one handling the denoising of the output from the former network,
i.e., the inverted version, associated with the prior/regularization term. The
two networks are trained jointly to learn the end-to-end mapping, getting rid
of a two-step training. The training is annealed as the intermediate variable
between these two networks bridges the gap between the input (the degraded
version of the output) and the output, progressively approaching the ground-truth.
The proposed network, referred to as InverseNet, is flexible in the sense that
most of the existing end-to-end network structure can be leveraged in the first
network and most of the existing denoising network structure can be used in the
second one. Extensive experiments on both synthetic data and real datasets on
the tasks of motion deblurring, super-resolution, and colorization demonstrate
the efficiency and accuracy of the proposed method compared with other image
processing algorithms.
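The two-network split can be pictured with simple linear stand-ins: a pseudo-inverse plays the role of the first (forward-model inversion) network and a small smoothing filter plays the role of the second (denoising) network. Both operators and the toy deconvolution problem are placeholders of our choosing, not trained models:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
A = np.tril(np.ones((n, n))) / n                  # toy forward model (running average)
x_true = np.sin(np.linspace(0, np.pi, n))         # clean signal
y = A @ x_true + 0.001 * rng.standard_normal(n)   # degraded, noisy observation

# stage 1: "inversion network" -> here, the pseudo-inverse of A
x_inverted = np.linalg.pinv(A) @ y

# stage 2: "denoising network" -> here, a 3-tap moving-average filter
kernel = np.array([0.25, 0.5, 0.25])
x_denoised = np.convolve(x_inverted, kernel, mode="same")
```

The first stage handles the data term (undoing the physics) and the second handles the prior term (cleaning up the amplified noise), mirroring the split in the regularized iterative algorithms that inspired the architecture.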
Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections
Modeling uncertainty in deep neural networks, despite recent important
advances, is still an open problem. Bayesian neural networks are a powerful
solution, where the prior over network weights is a design choice, often a
normal distribution or other distribution encouraging sparsity. However, this
prior is agnostic to the generative process of the input data, which might lead
to unwarranted generalization for out-of-distribution tested data. We suggest
the presence of a confounder for the relation between the input data and the
discriminative function given the target label. We propose an approach for
modeling this confounder by sharing neural connectivity patterns between the
generative and discriminative networks. This approach leads to a new deep
architecture, where networks are sampled from the posterior of local causal
structures, and coupled into a compact hierarchy. We demonstrate that sampling
networks from this hierarchy, proportionally to their posterior, is efficient
and enables estimating various types of uncertainties. Empirical evaluations of
our method demonstrate significant improvement compared to state-of-the-art
calibration and out-of-distribution detection methods.
Causal Generative Domain Adaptation Networks
An essential problem in domain adaptation is to understand and make use of
distribution changes across domains. For this purpose, we first propose a
flexible Generative Domain Adaptation Network (G-DAN) with specific latent
variables to capture changes in the generating process of features across
domains. By explicitly modeling the changes, one can even generate data in new
domains using the generating process with new values for the latent variables
in G-DAN. In practice, the process to generate all features together may
involve high-dimensional latent variables, requiring dealing with distributions
in high dimensions and making it difficult to learn domain changes from few
source domains. Interestingly, by further making use of the causal
representation of joint distributions, we then decompose the joint distribution
into separate modules, each of which involves different low-dimensional latent
variables and can be learned separately, leading to a Causal G-DAN (CG-DAN).
This improves both statistical and computational efficiency of the learning
procedure. Finally, by matching the feature distribution in the target domain,
we can recover the target-domain joint distribution and derive the learning
machine for the target domain. We demonstrate the efficacy of both G-DAN and
CG-DAN in domain generation and cross-domain prediction on both synthetic and
real data experiments.
Comment: 12 page
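The causal modularization can be sketched with a two-feature toy: the joint factorizes as p(x1 | theta1) * p(x2 | x1, theta2), with each module governed by its own low-dimensional latent. A "new domain" is then generated by changing one module's latent while the others stay fixed. The function, parameters, and linear-Gaussian modules below are assumed toy choices, not the paper's learned networks:

```python
import numpy as np

def sample_domain(theta1, theta2, n=1000, seed=0):
    """Sample from the modular generative process of a toy CG-DAN-style
    factorization: p(x1, x2) = p(x1 | theta1) * p(x2 | x1, theta2)."""
    rng = np.random.default_rng(seed)
    x1 = theta1 + rng.standard_normal(n)           # module 1: p(x1 | theta1)
    x2 = theta2 * x1 + rng.standard_normal(n)      # module 2: p(x2 | x1, theta2)
    return x1, x2

# changing only theta2 yields a new domain where p(x1) is untouched
x1_a, x2_a = sample_domain(theta1=0.0, theta2=1.0)
x1_b, x2_b = sample_domain(theta1=0.0, theta2=2.0)
```

Because each module has its own low-dimensional latent, distribution changes can be learned and manipulated per module instead of over one high-dimensional joint latent, which is the statistical and computational gain the abstract describes.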
Kernel Implicit Variational Inference
Recent progress in variational inference has paid much attention to the
flexibility of variational posteriors. One promising direction is to use
implicit distributions, i.e., distributions without tractable densities as the
variational posterior. However, existing methods on implicit posteriors still
face challenges of noisy estimation and computational infeasibility when
applied to models with high-dimensional latent variables. In this paper, we
present a new approach named Kernel Implicit Variational Inference that
addresses these challenges. To the best of our knowledge, this is the first
time implicit variational inference has been successfully applied to Bayesian
neural networks, showing promising results on both regression and
classification tasks.
Comment: Published as a conference paper at ICLR 201