Learning Contact-Rich Manipulation Skills with Guided Policy Search
Autonomous learning of object manipulation skills can enable robots to
acquire rich behavioral repertoires that scale to the variety of objects found
in the real world. However, current motion skill learning methods typically
restrict the behavior to a compact, low-dimensional representation, limiting
its expressiveness and generality. In this paper, we extend a recently
developed policy search method \cite{la-lnnpg-14} and use it to learn a range
of dynamic manipulation behaviors with highly general policy representations,
without using known models or example demonstrations. Our approach learns a set
of trajectories for the desired motion skill by using iteratively refitted
time-varying linear models, and then unifies these trajectories into a single
control policy that can generalize to new situations. To enable this method to
run on a real robot, we introduce several improvements that reduce the sample
count and automate parameter selection. We show that our method can acquire
fast, fluent behaviors after only minutes of interaction time, and can learn
robust controllers for complex tasks, including putting together a toy
airplane, stacking tight-fitting Lego blocks, placing wooden rings onto
tight-fitting pegs, inserting a shoe tree into a shoe, and screwing bottle caps
onto bottles.
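The abstract's "iteratively refitted time-varying linear models" can be illustrated with a minimal sketch. It assumes a scalar state and control and a model of the form x_{t+1} = a_t x_t + b_t u_t + c_t, fitted independently at each time step by ordinary least squares over several rollouts. This is plain regression chosen for illustration, not the paper's actual fitting procedure, and all function names and numeric values below are hypothetical.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_time_varying_linear(rollouts):
    """Fit (a_t, b_t, c_t) for every time step t.

    rollouts: list of trajectories, each a list of (x_t, u_t, x_next) tuples.
    Each time step gets its own least-squares fit across all rollouts.
    """
    horizon = len(rollouts[0])
    models = []
    for t in range(horizon):
        feats = [(traj[t][0], traj[t][1], 1.0) for traj in rollouts]
        targets = [traj[t][2] for traj in rollouts]
        # Normal equations: (F^T F) theta = F^T y.
        gram = [[sum(f[i] * f[j] for f in feats) for j in range(3)]
                for i in range(3)]
        moment = [sum(f[i] * y for f, y in zip(feats, targets))
                  for i in range(3)]
        models.append(tuple(solve_linear_system(gram, moment)))
    return models


def simulate(true_models, num_rollouts, seed=0):
    """Generate noise-free rollouts from known dynamics (illustration only)."""
    rng = random.Random(seed)
    rollouts = []
    for _ in range(num_rollouts):
        x = rng.uniform(-1.0, 1.0)
        traj = []
        for a, b, c in true_models:
            u = rng.uniform(-1.0, 1.0)  # random exploratory control
            x_next = a * x + b * u + c
            traj.append((x, u, x_next))
            x = x_next
        rollouts.append(traj)
    return rollouts


# Hypothetical ground-truth dynamics, one (a_t, b_t, c_t) triple per time step.
true_models = [(0.9, 0.5, 0.0), (1.1, -0.3, 0.2), (0.7, 0.8, -0.1)]
fitted = fit_time_varying_linear(simulate(true_models, num_rollouts=20))
```

Because each time step has its own parameters, the fitted model only needs to be accurate near the sampled trajectories, which is what makes a few rollouts per iteration sufficient in this kind of local scheme.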
Quadratic Multi-Dimensional Signaling Games and Affine Equilibria
This paper studies the decentralized quadratic cheap talk and signaling game
problems when an encoder and a decoder, viewed as two decision makers, have
misaligned objective functions. The main contributions of this study are the
extension of Crawford and Sobel's cheap talk formulation to multi-dimensional
sources and to noisy channel setups. We consider both (simultaneous) Nash
equilibria and (sequential) Stackelberg equilibria. We show that for arbitrary
scalar sources, in the presence of misalignment, the quantized nature of all
equilibrium policies holds for Nash equilibria in the sense that all Nash
equilibria are equivalent to those achieved by quantized encoder policies. On
the other hand, all Stackelberg equilibrium policies are fully informative. For
multi-dimensional setups, unlike the scalar case, Nash equilibrium policies may
be of non-quantized nature, and even linear. In the noisy setup, a Gaussian
source is to be transmitted over an additive Gaussian channel. The goals of the
encoder and the decoder are misaligned by a bias term, and the encoder's cost also
includes a penalty term on signal power. Conditions for the existence of affine
Nash equilibria as well as general informative equilibria are presented. For
the noisy setup, the only Stackelberg equilibrium is the linear equilibrium
when the variables are scalar. Our findings provide further conditions on when
affine policies may be optimal in decentralized multi-criteria control problems
and lead to conditions for the presence of active information transmission in
strategic environments.
Comment: 15 pages, 4 figures
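The quantized structure of scalar Nash equilibria mentioned above can be made concrete with a short worked condition. The notation here is an assumption for illustration and is not taken from the abstract: a scalar source $x$, decoder action $u$, bias $b$, decoder cost $(x-u)^2$, encoder cost $(x-u-b)^2$, and bins $[m_{k-1}, m_k)$ with one decoder action $u_k$ per bin.

```latex
% Decoder best response: the conditional mean of the source within each bin.
u_k = \mathbb{E}\left[\, x \mid m_{k-1} \le x < m_k \,\right]

% Encoder indifference at each bin edge: a source value x = m_k must be
% indifferent between the two neighbouring actions u_k and u_{k+1}.
(m_k - u_k - b)^2 = (m_k - u_{k+1} - b)^2
\quad\Longrightarrow\quad
m_k - b = \frac{u_k + u_{k+1}}{2}

% Special case: x uniform on [0,1], so u_k = (m_{k-1} + m_k)/2, which gives
m_{k+1} - m_k = (m_k - m_{k-1}) - 4b
```

In the uniform special case each bin width differs from its neighbour by $4|b|$, so only finitely many bins fit in the unit interval when $b \neq 0$. This is the classical reason a biased encoder cannot reveal a scalar source exactly in a Nash equilibrium.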