CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a
framework for learning reusable, variable-length segments of
hierarchically-structured behavior from demonstration data. CompILE uses a
novel unsupervised, fully-differentiable sequence segmentation module to learn
latent encodings of sequential data that can be re-composed and executed to
perform new tasks. Once trained, our model generalizes to sequences of longer
length and from environment instances not seen during training. We evaluate
CompILE in a challenging 2D multi-task environment and a continuous control
task, and show that it can find correct task boundaries and event encodings in
an unsupervised manner. Latent codes and associated behavior policies
discovered by CompILE can be used by a hierarchical agent, where the high-level
policy selects actions in the latent code space, and the low-level,
task-specific policies are simply the learned decoders. We found that our
CompILE-based agent could learn given only sparse rewards, where agents without
task-specific policies struggle.
Comment: ICML (2019)
Constrained Deep Networks: Lagrangian Optimization via Log-Barrier Extensions
This study investigates the optimization aspects of imposing hard inequality
constraints on the outputs of CNNs. In the context of deep networks,
constraints are commonly handled with penalties for their simplicity, and
despite their well-known limitations. Lagrangian-dual optimization has been
largely avoided, except for a few recent works, mainly due to the computational
complexity and stability/convergence issues caused by alternating explicit dual
updates/projections and stochastic optimization. Several studies showed that,
surprisingly for deep CNNs, the theoretical and practical advantages of
Lagrangian optimization over penalties do not materialize in practice. We
propose log-barrier extensions, which approximate Lagrangian optimization of
constrained-CNN problems with a sequence of unconstrained losses. Unlike
standard interior-point and log-barrier methods, our formulation does not need
an initial feasible solution. Furthermore, we provide a new technical result,
which shows that the proposed extensions yield an upper bound on the duality
gap. This generalizes the duality-gap result of standard log-barriers, yielding
sub-optimality certificates for feasible solutions. While sub-optimality is not
guaranteed for non-convex problems, our result shows that log-barrier
extensions are a principled way to approximate Lagrangian optimization for
constrained CNNs via implicit dual variables. We report comprehensive weakly
supervised segmentation experiments, with various constraints, showing that our
formulation substantially outperforms existing constrained-CNN methods in
accuracy, constraint satisfaction, and training stability, more so when
dealing with a large number of constraints.
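As a rough illustration of the idea in this abstract, the sketch below implements one piecewise barrier that stays finite at infeasible points: the standard log-barrier for a constraint z <= 0, continued by its tangent line beyond z = -1/t^2. The abstract does not state the formula, so the exact form, the function name `log_barrier_extension`, and the parameter `t` are assumptions made for illustration only.

```python
import math


def log_barrier_extension(z, t):
    """Penalty for a constraint z <= 0 (assumed form, not taken from the abstract).

    For z <= -1/t**2 this is the standard log-barrier -(1/t) * log(-z).
    Beyond that point it continues along the tangent line, so the value is
    finite and differentiable even when the constraint is violated (z > 0).
    """
    threshold = -1.0 / t**2
    if z <= threshold:
        return -math.log(-z) / t
    # Linear branch: matches the value and the slope (= t) at the threshold.
    return t * z - math.log(1.0 / t**2) / t + 1.0 / t


if __name__ == "__main__":
    for z in (-2.0, -0.04, 0.0, 0.5):
        print(z, log_barrier_extension(z, t=5.0))
```

Because the linear branch is defined for z > 0, a violated constraint still produces a finite loss and a usable gradient, which is consistent with the claim that no initial feasible solution is needed. Increasing `t` across a sequence of unconstrained losses tightens the approximation.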
Learning Opposites Using Neural Networks
Many research works have successfully extended algorithms such as
evolutionary algorithms, reinforcement agents and neural networks using
"opposition-based learning" (OBL). Two types of "opposites" have been
defined in the literature, namely type-I and type-II. The
former are linear in nature and applicable to the variable space, hence easy to
calculate. On the other hand, type-II opposites capture the "oppositeness" in
the output space. In fact, type-I opposites are considered a special case of
type-II opposites where inputs and outputs have a linear relationship. However,
in many real-world problems, inputs and outputs do in fact exhibit a nonlinear
relationship. Therefore, type-II opposites are expected to be better at
capturing the sense of "opposition" in terms of the input-output relation. In
the absence of any knowledge about the problem at hand, there seems to be no
intuitive way to calculate the type-II opposites. In this paper, we introduce
an approach to learn type-II opposites from the given inputs and their outputs
using artificial neural networks (ANNs). We first perform opposition
mining on the sample data, and then use the mined data to learn the
relationship between an input and its opposite. We have validated
our algorithm using various benchmark functions to compare it against an
evolving fuzzy inference approach that has been recently introduced. The
results show that the neural approach performs better at learning the
opposites. This will create new possibilities for integrating oppositional
schemes within existing algorithms, promising a potential increase in
convergence speed and/or accuracy.
Comment: To appear in proceedings of the 23rd International Conference on
Pattern Recognition (ICPR 2016), Cancun, Mexico, December 201
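
The distinction between the two kinds of opposites in this abstract can be sketched in a few lines. The code below assumes the commonly used definitions: the type-I opposite of x on an interval [lo, hi] is lo + hi - x, and the type-II opposite of x is an input whose output is closest to y_min + y_max - f(x). The abstract does not spell out either formula or the mining procedure, so the function names and the nearest-output search are hypothetical and serve only as an illustration.

```python
def type1_opposite(x, lo, hi):
    """Type-I opposite: reflection of x within the input interval [lo, hi]."""
    return lo + hi - x


def mine_type2_opposites(xs, ys):
    """Pair each sample input with a mined type-II opposite (assumed procedure).

    For a sample (x, y), the target opposite output is y_min + y_max - y.
    The mined opposite is the sample input whose output is closest to it.
    """
    y_lo, y_hi = min(ys), max(ys)
    pairs = []
    for x, y in zip(xs, ys):
        target = y_lo + y_hi - y
        j = min(range(len(ys)), key=lambda k: abs(ys[k] - target))
        pairs.append((x, xs[j]))
    return pairs


if __name__ == "__main__":
    xs = [0, 1, 2, 3]
    ys = [x * x for x in xs]  # a nonlinear input-output relation
    print(type1_opposite(1, 0, 3))       # reflection in the input space
    print(mine_type2_opposites(xs, ys))  # opposites defined in the output space
```

For this quadratic example the type-I opposite of 1 is 2, while the mined type-II opposite of 1 is 3, which shows how the two notions diverge once the relation is nonlinear. Pairs mined this way could then serve as training data for a regressor such as a neural network.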