Deep Kernels for Optimizing Locomotion Controllers
Sample efficiency is important when optimizing parameters of locomotion
controllers, since hardware experiments are time consuming and expensive.
Bayesian Optimization, a sample-efficient optimization framework, has recently
been widely applied to address this problem, but further improvements in sample
efficiency are needed for practical applicability to real-world robots and
high-dimensional controllers. To address this, prior work has proposed using
domain expertise for constructing custom distance metrics for locomotion. In
this work we show how to learn such a distance metric automatically. We use a
neural network to learn an informed distance metric from data obtained in
high-fidelity simulations. We conduct experiments on two different controllers
and robot architectures. First, we demonstrate improvement in sample efficiency
when optimizing a 5-dimensional controller on the ATRIAS robot hardware. We
then conduct simulation experiments to optimize a 16-dimensional controller for
a 7-link robot model and obtain significant improvements even when optimizing
in perturbed environments. This demonstrates that our approach is able to
enhance sample efficiency for two different controllers, making it a fitting
candidate for further hardware experiments.
Comment: Rika Antonova and Akshara Rai contributed equally.
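The core idea above, replacing a hand-crafted distance metric with a kernel evaluated in a neural-network feature space, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the network weights here are random stand-ins for a network that would be trained on high-fidelity simulation data, and the toy objective stands in for a hardware walking-cost measurement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the learned metric: a small fixed-weight
# neural-network feature map (in the paper, trained from simulation data).
W1 = rng.normal(size=(5, 16)) / np.sqrt(5)
W2 = rng.normal(size=(16, 4)) / np.sqrt(16)

def phi(X):
    """Map 5-D controller parameters into the learned feature space."""
    return np.tanh(X @ W1) @ W2

def deep_kernel(XA, XB, lengthscale=1.0):
    """RBF kernel evaluated in the learned feature space (a 'deep kernel')."""
    A, B = phi(XA), phi(XB)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X_tr, y_tr, X_te, noise=1e-4):
    """GP regression posterior mean/variance under the deep kernel."""
    K = deep_kernel(X_tr, X_tr) + noise * np.eye(len(X_tr))
    Ks = deep_kernel(X_te, X_tr)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr))
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - (v ** 2).sum(axis=0)  # k(x, x) = 1 for this RBF
    return mu, np.maximum(var, 1e-12)

# One Bayesian-optimization step on a toy 5-D objective standing in for
# the hardware experiment.
X_tr = rng.uniform(size=(8, 5))
y_tr = -np.linalg.norm(X_tr - 0.5, axis=1)
X_cand = rng.uniform(size=(256, 5))
mu, var = gp_posterior(X_tr, y_tr, X_cand)
x_next = X_cand[np.argmax(mu + 2.0 * np.sqrt(var))]  # UCB acquisition
```

The deep kernel stays positive semi-definite because it is an ordinary RBF kernel applied to transformed inputs, so the standard GP machinery carries over unchanged.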
Gaussian process domain experts for model adaptation in facial behavior analysis
We present a novel approach for supervised domain adaptation that is based upon the probabilistic framework of Gaussian processes (GPs). Specifically, we introduce domain-specific GPs as local experts for facial expression classification from face images. The adaptation of the classifier is facilitated in a probabilistic fashion by conditioning the target expert on multiple source experts. Furthermore, in contrast to existing adaptation approaches, we also learn a target expert solely from available target data. A single, confident classifier is then obtained by combining the predictions from multiple experts based on their confidence. Learning of the model is efficient and requires no retraining/reweighting of the source classifiers. We evaluate the proposed approach on two publicly available datasets for multi-class (MultiPIE) and multi-label (DISFA) facial expression classification. To this end, we perform adaptation of two contextual factors: where (view) and who (subject). Our experiments show that the proposed approach consistently outperforms both source and target classifiers, while using as few as 30 target examples, and that it also outperforms state-of-the-art approaches for supervised domain adaptation.
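Confidence-based combination of Gaussian experts can be illustrated with standard precision weighting, where each expert's predictive variance measures its confidence. This is a simplified sketch of that fusion step only, not the paper's full conditioning of the target expert on source experts; all numbers below are made up.

```python
import numpy as np

def combine_experts(means, variances):
    """Fuse per-expert Gaussian predictions by precision weighting:
    experts with low predictive variance (high confidence) dominate."""
    means = np.asarray(means, dtype=float)
    prec = 1.0 / np.asarray(variances, dtype=float)  # confidences
    var = 1.0 / prec.sum(axis=0)                     # fused variance
    mu = var * (prec * means).sum(axis=0)            # fused mean
    return mu, var

# Toy use: two source-view experts and one target expert predict an
# expression score for the same test face.
mu, var = combine_experts(
    means=[[0.9], [0.2], [0.8]],
    variances=[[0.05], [1.0], [0.1]],
)
```

The fused prediction lands near the two confident experts (0.9 and 0.8), while the uncertain expert contributes little, and the fused variance is smaller than any single expert's.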
Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net
We propose a novel method to directly learn a stochastic transition operator
whose repeated application provides generated samples. Traditional undirected
graphical models approach this problem indirectly by learning a Markov chain
model whose stationary distribution obeys detailed balance with respect to a
parameterized energy function. The energy function is then modified so the
model and data distributions match, with no guarantee on the number of steps
required for the Markov chain to converge. Moreover, the detailed balance
condition is highly restrictive: energy based models corresponding to neural
networks must have symmetric weights, unlike biological neural circuits. In
contrast, we develop a method for directly learning arbitrarily parameterized
transition operators capable of expressing non-equilibrium stationary
distributions that violate detailed balance, thereby enabling us to learn more
biologically plausible asymmetric neural networks and more general non-energy
based dynamical systems. The proposed training objective, which we derive via
principled variational methods, encourages the transition operator to "walk
back" along multi-step trajectories that start at data points, returning to
the original data points as quickly as possible. We present a series of experimental
results illustrating the soundness of the proposed approach, Variational
Walkback (VW), on the MNIST, CIFAR-10, SVHN and CelebA datasets, demonstrating
superior samples compared to earlier attempts to learn a transition operator.
We also show that although each rapid training trajectory is limited to a
finite but variable number of steps, our transition operator continues to
generate good samples well past the length of such trajectories, thereby
demonstrating the match of its non-equilibrium stationary distribution to the
data distribution. Source code: http://github.com/anirudh9119/walkback_nips17
Comment: To appear at NIPS 2017.
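The sampling procedure, repeatedly applying a stochastic transition operator until its stationary distribution is reached, can be sketched with a hand-coded operator. Here an OU-like step (drift toward a data mean plus noise) stands in for the neural-network operator that Variational Walkback would actually train; the point is only that repeated application mixes to a stationary distribution near the data, well past any fixed trajectory length.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for a learned stochastic transition operator:
# drift toward the data mean plus Gaussian noise. VW would parameterize
# this step with a neural net trained by the walkback objective.
DATA_MEAN, ETA, SIGMA = 2.0, 0.3, 0.2

def transition(x):
    return x + ETA * (DATA_MEAN - x) + SIGMA * rng.normal(size=x.shape)

# Generate samples by repeated application, starting far from the data
# and running well past a short training-trajectory length.
x = rng.normal(size=1000) * 5.0  # 1000 independent chains
for _ in range(200):
    x = transition(x)
```

After many steps the chains concentrate around the data mean with variance SIGMA**2 / (1 - (1 - ETA)**2), the operator's stationary distribution, even though it was never specified via an energy function.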