"Guess what I'm doing": Extending legibility to sequential decision tasks
In this paper we investigate the notion of legibility in sequential decision
tasks under uncertainty. Previous works that extend legibility to scenarios
beyond robot motion either focus on deterministic settings or are
computationally too expensive. Our proposed approach, dubbed PoL-MDP, is able
to handle uncertainty while remaining computationally tractable. We establish
the advantages of our approach against state-of-the-art approaches in several
simulated scenarios of different complexity. We also showcase the use of our
legible policies as demonstrations for an inverse reinforcement learning agent,
establishing their superiority against the commonly used demonstrations based
on the optimal policy. Finally, we assess the legibility of our computed
policies through a user study where people are asked to infer the goal of a
mobile robot following a legible policy by observing its actions.
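The goal-inference task described in the abstract above can be illustrated with a generic Bayesian observer. This is a minimal sketch under stated assumptions, not the PoL-MDP method itself: the function name `goal_posterior`, the dictionary layout of `policies`, and the uniform default prior are all hypothetical choices made for illustration.

```python
def goal_posterior(observed, policies, prior=None):
    """Posterior over goals given observed (state, action) pairs.

    observed: list of (state, action) tuples seen by the observer.
    policies: dict mapping goal -> dict mapping (state, action) -> probability
              (hypothetical per-goal action models, assumed known to the observer).
    prior:    dict mapping goal -> prior probability; uniform if omitted.
    """
    goals = list(policies)
    if prior is None:
        prior = {g: 1.0 / len(goals) for g in goals}

    scores = {}
    for g in goals:
        # Bayes' rule: prior times the likelihood of every observed action.
        likelihood = prior[g]
        for state, action in observed:
            likelihood *= policies[g].get((state, action), 0.0)
        scores[g] = likelihood

    total = sum(scores.values())
    if total == 0.0:
        # Observations are impossible under every goal: fall back to the prior.
        return dict(prior)
    return {g: s / total for g, s in scores.items()}


# Two hypothetical goals; moving "left" in state "s0" is more typical of goal A.
example_policies = {
    "A": {("s0", "left"): 0.9, ("s0", "right"): 0.1},
    "B": {("s0", "left"): 0.5, ("s0", "right"): 0.5},
}
example_posterior = goal_posterior([("s0", "left")], example_policies)
```

Under this reading, a policy is more legible when the actions it takes push such a posterior toward the true goal early in the trajectory.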
Experimental evidence of shock mitigation in a Hertzian tapered chain
We present an experimental study of the mechanical impulse propagation
through a horizontal alignment of elastic spheres of progressively decreasing
diameter, namely a tapered chain. Experimentally, the diameters of
spheres which interact via the Hertz potential are selected to keep as close as
possible to an exponential decrease, where the
experimental tapering factor takes one of two percentage values.
In agreement with recent numerical results, an impulse initiated in a
monodisperse chain (a chain of identical beads) propagates without shape
changes, and progressively transfers its energy and momentum to a propagating
tail when it further travels in a tapered chain. As a result, the front pulse
of this wave decreases in amplitude and accelerates. Both effects are
satisfactorily described by the hard-sphere approximation, and basically, the
shock mitigation is due to partial transmissions, from one bead to the next, of
momentum and energy of the front pulse. In addition, when a small dissipation is
included, a better agreement with experiments is found. A close analysis of the
loading part of the experimental pulses demonstrates that the front wave itself
adopts a self-similar solution as it propagates in the tapered chain. Finally,
our results corroborate the capability of these chains to thermalize
propagating impulses and thereby act as shock absorbing devices.
Comment: ReVTeX, 7 pages with 6 eps, accepted for Phys. Rev. E (Related papers
on http://www.supmeca.fr/perso/jobs/
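The partial-transmission picture in the abstract above can be sketched with textbook elastic collisions. The code is an illustration under explicit assumptions, not the authors' model: beads are treated as hard spheres of equal density colliding one pair at a time, each diameter is taken to be (1 - q) times the previous one, and the function name and the value q = 0.05 are hypothetical.

```python
def tapered_chain_hard_sphere(q, n_beads, v0=1.0):
    """Independent-collision (hard-sphere) estimate along a tapered chain.

    q:       tapering factor, assuming diameter_{n+1} = (1 - q) * diameter_n.
    n_beads: number of beads in the chain.
    v0:      velocity of the first bead.
    Returns (velocities, energies): the front-bead velocity and its kinetic
    energy relative to the first bead, after each successive collision.
    """
    # Equal-density spheres: mass scales with diameter cubed.
    mass_ratio = (1.0 - q) ** 3  # m_{n+1} / m_n

    velocities = [v0]
    energies = [1.0]
    for _ in range(n_beads - 1):
        # Head-on elastic collision of a moving bead with a bead at rest:
        # v_{n+1} = 2 m_n / (m_n + m_{n+1}) * v_n
        velocities.append(velocities[-1] * 2.0 / (1.0 + mass_ratio))
        # Fraction of kinetic energy handed to the next bead:
        # 4 m_n m_{n+1} / (m_n + m_{n+1})^2
        energies.append(energies[-1] * 4.0 * mass_ratio / (1.0 + mass_ratio) ** 2)
    return velocities, energies


# Hypothetical tapering of 5 % over 10 beads, compared with a monodisperse chain.
tapered_v, tapered_e = tapered_chain_hard_sphere(0.05, 10)
mono_v, mono_e = tapered_chain_hard_sphere(0.0, 10)
```

In this sketch the monodisperse case (q = 0) passes the impulse on unchanged, while any q > 0 makes the front bead faster and less energetic at each step, with the remaining energy left behind in the trailing beads.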
Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback
We study the problem of teaching via demonstrations in sequential
decision-making tasks. In particular, we focus on the situation when the
teacher has no access to the learner's model and policy, and the feedback from
the learner is limited to trajectories that start from states selected by the
teacher. The necessity to select the starting states and infer the learner's
policy creates an opportunity for using the methods of inverse reinforcement
learning and active learning by the teacher. In this work, we formalize the
teaching process with limited feedback and propose an algorithm that solves
this teaching problem. The algorithm uses a modified version of the active
value-at-risk method to select the starting states, a modified maximum causal
entropy algorithm to infer the policy, and the difficulty score ratio method to
choose the teaching demonstrations. We test the algorithm in a synthetic car
driving environment and conclude that the proposed algorithm is an effective
solution when the learner's feedback is limited.
Comment: 7 pages, 3 figures
An analysis of reinforcement learning with function approximation
In this paper, we propose a reinforcement learning approach to address multi-robot cooperative navigation tasks in infinite settings. We propose an algorithm to simultaneously address the problems of learning and coordination in multi-robot problems. The proposed algorithm extends those existing in the literature, allowing it to address simultaneous learning and coordination in problems with an infinite state-space. We also present the results obtained in several test scenarios featuring multi-robot navigation situations with partial observability.
Hexagons, Kinks and Disorder in Oscillated Granular Layers
Experiments on vertically oscillated granular layers in an evacuated
container reveal a sequence of well-defined pattern bifurcations as the
container acceleration is increased. Period doublings of the layer center of
mass motion and a parametric wave instability interact to produce hexagons and
more complicated patterns composed of distinct spatial domains of different
relative phase separated by kinks (phase discontinuities). Above a critical
acceleration, the layer becomes disordered in both space and time.
Comment: 4 pages. The RevTeX file has a macro allowing various styles. The
appropriate style is "myprint" which is the default