OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Reinforcement learning has shown promise in learning policies that can solve
complex problems. However, manually specifying a good reward function can be
difficult, especially for intricate tasks. Inverse reinforcement learning
offers a useful paradigm to learn the underlying reward function directly from
expert demonstrations. Yet in reality, the corpus of demonstrations may contain
trajectories arising from a diverse set of underlying reward functions rather
than a single one. Thus, in inverse reinforcement learning, it is useful to
consider such a decomposition. The options framework in reinforcement learning
is specifically designed to decompose policies along similar lines. We therefore
extend the options framework and propose a method to simultaneously recover
reward options in addition to policy options. We leverage adversarial methods
to learn joint reward-policy options using only observed expert states. We show
that this approach works well in both simple and complex continuous control
tasks and shows significant performance increases in one-shot transfer
learning.
Comment: Accepted to the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 201
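The core idea of joint reward-policy options can be illustrated with a minimal NumPy sketch: a shared gating network assigns each state a distribution over options, and both the policy output and the reward estimate are gate-weighted mixtures of per-option components. This is only a schematic of the decomposition, not the paper's adversarial training procedure; all dimensions, weight matrices, and function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical dimensions: 4-dim state, 2 options, 2-dim action.
STATE_DIM, N_OPTIONS, ACTION_DIM = 4, 2, 2

# Gating network: maps a state to a distribution over options.
W_gate = rng.normal(size=(N_OPTIONS, STATE_DIM))

# One linear policy head and one linear reward head per option.
W_policy = rng.normal(size=(N_OPTIONS, ACTION_DIM, STATE_DIM))
W_reward = rng.normal(size=(N_OPTIONS, STATE_DIM))

def mixture_policy(state):
    """Gate-weighted mixture of per-option policy outputs."""
    gate = softmax(W_gate @ state)      # option probabilities, sums to 1
    actions = W_policy @ state          # (N_OPTIONS, ACTION_DIM)
    return gate @ actions, gate

def mixture_reward(state):
    """Gate-weighted mixture of per-option reward estimates."""
    gate = softmax(W_gate @ state)
    return float(gate @ (W_reward @ state))

state = rng.normal(size=STATE_DIM)
action, gate = mixture_policy(state)
reward = mixture_reward(state)
```

In the full method, the per-option reward heads would be trained adversarially against expert state distributions; here they are just random linear maps to show how a single gate ties the policy and reward decompositions together.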
Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures
Probabilistic graphical models are a central tool in AI; however, they are
generally not as expressive as deep neural models, and inference is notoriously
hard and slow. In contrast, deep probabilistic models such as sum-product
networks (SPNs) capture joint distributions in a tractable fashion, but still
lack the expressive power of intractable models based on deep neural networks.
Therefore, we introduce conditional SPNs (CSPNs), conditional density
estimators for multivariate and potentially hybrid domains which allow
harnessing the expressive power of neural networks while still maintaining
tractability guarantees. One way to implement CSPNs is to use an existing SPN
structure and condition its parameters on the input, e.g., via a deep neural
network. This approach, however, might misrepresent the conditional
independence structure present in data. Consequently, we also develop a
structure-learning approach that derives both the structure and parameters of
CSPNs from data. Our experimental evidence demonstrates that CSPNs are
competitive with other probabilistic models and yield superior performance on
multilabel image classification compared to mean field and mixture density
networks. Furthermore, they can successfully be employed as building blocks for
structured probabilistic models, such as autoregressive image models.
Comment: 13 pages, 6 figures
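The parameter-conditioning variant described above can be sketched in a few lines: a tiny sum-product network over two variables whose mixture weights and leaf means are all produced by a neural network applied to the conditioning input. This is a toy illustration of the idea, not the authors' architecture or structure learner; the network sizes and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gaussian_logpdf(y, mu, sigma=1.0):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2

# Hypothetical tiny CSPN over y = (y1, y2): a sum node mixing two
# product nodes, each a product of independent Gaussian leaves.
# Every SPN parameter (2 mixture logits + 4 leaf means) comes from a
# small neural network evaluated on the conditioning input x.
X_DIM, H_DIM = 3, 8
W1 = rng.normal(size=(H_DIM, X_DIM))
W2 = rng.normal(size=(2 + 4, H_DIM))

def cspn_logdensity(y, x):
    h = np.tanh(W1 @ x)
    params = W2 @ h
    log_w = np.log(softmax(params[:2]))   # sum-node weights
    mus = params[2:].reshape(2, 2)        # leaf means per product node
    # Product nodes: sum of leaf log-densities over their scopes.
    log_prod = gaussian_logpdf(y, mus).sum(axis=1)
    # Sum node: log-sum-exp of weighted children (inference stays tractable).
    return float(np.logaddexp(*(log_w + log_prod)))

x = rng.normal(size=X_DIM)
y = np.array([0.5, -1.0])
logp = cspn_logdensity(y, x)
```

Because the neural network only sets parameters while the SPN structure stays fixed, evaluating the conditional density remains a single bottom-up pass, which is the tractability property the abstract refers to.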
Generative Image Modeling Using Spatial LSTMs
Modeling the distribution of natural images is challenging, partly because of
strong statistical dependencies which can extend over hundreds of pixels.
Recurrent neural networks have been successful in capturing long-range
dependencies in a number of problems but only recently have found their way
into generative image models. We here introduce a recurrent image model based
on multi-dimensional long short-term memory units which are particularly suited
for image modeling due to their spatial structure. Our model scales to images
of arbitrary size and its likelihood is computationally tractable. We find that
it outperforms the state of the art in quantitative comparisons on several
image datasets and produces promising results when used for texture synthesis
and inpainting.
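The tractable-likelihood property rests on an autoregressive factorization: the image log-likelihood is the sum of per-pixel conditional log-densities, each predicted from a recurrent state summarizing the pixels scanned so far. The sketch below uses a plain 1D row-major scan with a tanh recurrence rather than the paper's multi-dimensional LSTM; all weights and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical minimal recurrent image model: scan pixels row-major,
# carry a hidden state, and emit a unit-variance Gaussian conditional
# for each pixel given everything scanned before it. The image
# log-likelihood is the sum of these conditional log-densities.
H = 6
W_h = rng.normal(size=(H, H)) * 0.3
W_x = rng.normal(size=H) * 0.3
w_out = rng.normal(size=H) * 0.3

def image_loglik(img):
    h = np.zeros(H)
    ll = 0.0
    for x in img.ravel():                  # row-major pixel scan
        mu = w_out @ h                     # conditional mean for this pixel
        ll += -0.5 * np.log(2 * np.pi) - 0.5 * (x - mu) ** 2
        h = np.tanh(W_h @ h + W_x * x)     # fold observed pixel into state
    return ll

img = rng.normal(size=(4, 4))
ll = image_loglik(img)
```

The same scan-and-predict loop supports inpainting in principle: fixed pixels are fed in as observed, and missing ones can be sampled from the predicted conditionals instead.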