Implicit imitation in multiagent reinforcement learning
Imitation is actively being studied as an effective means of learning in multi-agent environments. It allows an agent to learn how to act well (perhaps optimally) by passively observing the actions of cooperative teachers or other more experienced agents in its environment. We propose a straightforward imitation mechanism called model extraction that can be integrated easily into standard model-based reinforcement learning algorithms. Roughly, by observing a mentor with similar capabilities, an agent can extract information about its own capabilities in unvisited parts of state space. The extracted information can accelerate learning dramatically. We illustrate the benefits of model extraction by integrating it with prioritized sweeping, and demonstrating improved performance and convergence through observation of single and multiple mentors. Though we make some stringent assumptions regarding observability, possible interactions and common abilities, we briefly comment on extensions of the model that relax these.
Time-asymmetry of probabilities versus relativistic causal structure: an arrow of time
There is an incompatibility between the symmetries of causal structure in
relativity theory and the signaling abilities of probabilistic devices with
inputs and outputs: while time-reversal in relativity will not introduce the
ability to signal between spacelike separated regions, this is not the case for
probabilistic devices with space-like separated input-output pairs. We
explicitly describe a non-signaling device which becomes a perfect signaling
device under time-reversal, where time-reversal can be conceptualized as
playing backwards a videotape of an agent manipulating the device. This leads
to an arrow of time that is identifiable when studying the correlations of
events for spacelike separated regions. Somewhat surprisingly, although
time-reversal of Popescu-Rohrlich boxes also allows agents to signal, it does
not yield a perfect signaling device. Finally, we realize time-reversal using
post-selection, which could lead to experimental implementation. Comment: 4 pages, some figures; replaces arXiv:1010.4572 [quant-ph]
Law Librarianship: A Forum
Law librarianship is a profession that has a proud history and a bright future, yet it is not without its problems and concerns. For this issue of Law Library Lights, we have gathered together a number of luminaries in the field and asked them a number of questions related to the most important issues facing our community: professional image, additional roles, education and training, ethics, minority recruitment, budget crunch, technology, vendors, and the future.
This exchange was published in volume 35, issue number 5, May/June 1992
Deriving and improving CMA-ES with Information geometric trust regions
CMA-ES is one of the most popular stochastic search algorithms.
It performs favourably in many tasks without the need of extensive
parameter tuning. The algorithm has many beneficial properties,
including automatic step-size adaptation, efficient covariance updates
that incorporate the current samples as well as the evolution
path and its invariance properties. Its update rules are composed
of well established heuristics where the theoretical foundations of
some of these rules are also well understood. In this paper we
will fully derive all CMA-ES update rules within the framework of
expectation-maximisation-based stochastic search algorithms using
information-geometric trust regions. We show that the use of the trust
region results in similar updates to CMA-ES for the mean and the
covariance matrix while it allows for the derivation of an improved
update rule for the step-size. Our new algorithm, Trust-Region Covariance
Matrix Adaptation Evolution Strategy (TR-CMA-ES), is
fully derived from first-order optimization principles and performs
favourably in comparison to the standard CMA-ES algorithm.
Contextual covariance matrix adaptation evolutionary strategies
Many stochastic search algorithms are designed to optimize a fixed objective function to learn a task, i.e., if the objective function changes slightly, for example, due to a change in the situation or context of the task, relearning is required to adapt to the new context. For instance, if we want to learn a kicking movement for a soccer robot, we have to relearn the movement for different ball locations. Such relearning is undesirable because it is highly inefficient, and many applications require fast adaptation to a new context or situation. Therefore, we investigate contextual stochastic search algorithms that can learn multiple, similar tasks simultaneously. Current contextual stochastic search methods are based on policy search algorithms and suffer from premature convergence and the need for parameter tuning. In this paper, we extend the well-known CMA-ES algorithm to the contextual setting and illustrate its performance on several contextual tasks. Our new algorithm, called contextual CMAES, benefits from contextual learning while preserving all the features of standard CMA-ES, such as stability, avoidance of premature convergence, step-size control and a minimal amount of parameter tuning. This research was funded by the European Union’s FP7 under EuRoC grant agreement CP-IP 608849 and LIACC (UID/CEC/00027/2015) and IEETA (UID/CEC/00127/2015), and was also partially funded by PARC.
Probing student motivation for studying introductory chemistry at UWA
The introductory chemistry unit at The University of Western Australia is designed to give students with little or no background in chemistry an opportunity to gain an understanding of basic chemistry concepts. Because students' backgrounds differ, it is not clear how they came to be enrolled in this unit, what previous chemistry experience they have, or what future aspirations they have for their field of study. This study aims to investigate students' perceived motivations for undertaking the introductory chemistry unit and to assess how these motivations change with respect to different aspects common to chemistry education. The study consisted of three student surveys and two focus group interviews conducted over the course of the semester. Student responses were followed individually across the three surveys to track any changes in student perceptions of the unit and of chemistry in general. The results obtained from this study will play a key role in improving the delivery of this introductory chemistry unit for future cohorts, which we hope will ultimately result in improved student outcomes.