Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
Intrinsically motivated spontaneous exploration is a key enabler of
autonomous lifelong learning in human children. It enables the discovery and
acquisition of large repertoires of skills through self-generation,
self-selection, self-ordering and self-experimentation of learning goals. We
present an algorithmic approach called Intrinsically Motivated Goal Exploration
Processes (IMGEP) to enable similar properties of autonomous or self-supervised
learning in machines. The IMGEP algorithmic architecture relies on several
principles: 1) self-generation of goals, generalized as fitness functions; 2)
selection of goals based on intrinsic rewards; 3) exploration with incremental
goal-parameterized policy search and exploitation of the gathered data with a
batch learning algorithm; 4) systematic reuse of information acquired when
targeting a goal for improving towards other goals. We present a particularly
efficient form of IMGEP, called Modular Population-Based IMGEP, that uses a
population-based policy and an object-centered modularity in goals and
mutations. We provide several implementations of this architecture and
demonstrate their ability to automatically generate a learning curriculum
within several experimental setups including a real humanoid robot that can
explore multiple spaces of goals with several hundred continuous dimensions.
While no particular target goal is provided to the system, this curriculum
allows the discovery of skills that act as stepping stones for learning more
complex skills, e.g. nested tool use. We show that learning diverse spaces of
goals with intrinsic motivations is more efficient for learning complex skills
than only trying to directly learn these complex skills.
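The goal-exploration loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy environment, the parameter ranges and all function names are hypothetical, and goal selection by intrinsic reward (principle 2) is replaced here by uniform goal sampling to keep the example short. What it does show is the population-based idea: every rollout is stored, and the stored rollout whose outcome is closest to a newly sampled goal is mutated and reused (principle 4).

```python
import math
import random


def toy_environment(params):
    """Hypothetical environment: maps 2 policy parameters to a 2-D outcome."""
    a, b = params
    return (math.sin(a) + 0.5 * b, math.cos(b) * a)


def distance(u, v):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def goal_exploration(n_iterations=200, n_bootstrap=10, sigma=0.1, seed=0):
    """Run a simplified population-based goal exploration loop.

    Returns the memory of (params, outcome) pairs gathered during exploration.
    """
    rng = random.Random(seed)
    memory = []  # population of (params, outcome) pairs
    for i in range(n_iterations):
        if i < n_bootstrap:
            # Bootstrap phase: random parameters, no goal yet.
            params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        else:
            # Self-generate a goal (assumption: uniform sampling, not intrinsic reward).
            goal = (rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5))
            # Reuse past data: pick the stored rollout whose outcome is nearest the goal.
            best_params, _ = min(memory, key=lambda entry: distance(entry[1], goal))
            # Explore around it with a Gaussian mutation of the parameters.
            params = tuple(p + rng.gauss(0, sigma) for p in best_params)
        outcome = toy_environment(params)
        memory.append((params, outcome))
    return memory
```

Calling `goal_exploration()` returns the full memory, whose outcomes can then be inspected to see how much of the outcome space was reached.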
Information driven self-organization of complex robotic behaviors
Information theory is a powerful tool to express principles to drive
autonomous systems because it is domain invariant and allows for an intuitive
interpretation. This paper studies the use of the predictive information (PI),
also called excess entropy or effective measure complexity, of the sensorimotor
process as a driving force to generate behavior. We study nonlinear and
nonstationary systems and introduce the time-local predicting information
(TiPI) which allows us to derive exact results together with explicit update
rules for the parameters of the controller in the dynamical systems framework.
In this way the information principle, formulated at the level of behavior, is
translated to the dynamics of the synapses. We underpin our results with a
number of case studies with high-dimensional robotic systems. We show the
spontaneous cooperativity in a complex physical system with decentralized
control. Moreover, a jointly controlled humanoid robot develops a high
behavioral variety depending on its physics and the environment it is
dynamically embedded into. The behavior can be decomposed into a succession of
low-dimensional modes that increasingly explore the behavior space. This is a
promising way to avoid the curse of dimensionality, which prevents learning
systems from scaling well.
Comment: 29 pages, 12 figure
Recognising the Clothing Categories from Free-Configuration Using Gaussian-Process-Based Interactive Perception
In this paper, we propose a Gaussian Process-based interactive perception approach for recognising highly-wrinkled clothes. We have integrated this recognition method within a clothes sorting pipeline for the pre-washing stage of an autonomous laundering process. Our approach differs from reported clothing manipulation approaches by allowing the robot to update its perception confidence via numerous interactions with the garments. The classifiers predominantly reported in clothing perception studies (e.g. SVM, Random Forest) do not provide true classification probabilities, due to their inherent structure. In contrast, probabilistic classifiers (of which the Gaussian Process is a popular example) are able to provide predictive probabilities. In our approach, we employ a multi-class Gaussian Process classification using the Laplace approximation for posterior inference and optimising hyper-parameters via marginal likelihood maximisation. Our experimental results show that our approach is able to recognise unknown garments from highly-occluded and wrinkled configurations and demonstrates a substantial improvement over non-interactive perception approaches.
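The idea of updating perception confidence over repeated interactions can be sketched as follows. This is an illustration only: the abstract does not state how predictions from successive interactions are combined, so the fusion rule below (multiplying class probabilities and renormalising, which treats interactions as independent) is an assumption, and the function names and the 0.9 threshold are hypothetical. Each inner list stands in for the predictive probabilities a probabilistic classifier would return after one interaction.

```python
def fuse(belief, observation):
    """Combine the current belief with one new class-probability vector.

    Assumption: interactions are treated as independent, so probabilities
    are multiplied element-wise and renormalised.
    """
    unnormalised = [b * o for b, o in zip(belief, observation)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("belief and observation have no class in common")
    return [u / total for u in unnormalised]


def interactive_recognition(predictions, threshold=0.9):
    """Fuse predictions one interaction at a time until one class is confident.

    predictions: non-empty list of class-probability vectors, one per interaction.
    Returns (best_class_index, final_belief, interactions_used).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    n_classes = len(predictions[0])
    belief = [1.0 / n_classes] * n_classes  # start from a uniform prior
    best, used = 0, 0
    for used, probs in enumerate(predictions, start=1):
        belief = fuse(belief, probs)
        best = max(range(n_classes), key=belief.__getitem__)
        if belief[best] >= threshold:
            break  # confident enough: stop interacting with the garment
    return best, belief, used
```

With three moderately confident predictions such as `[[0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.7, 0.2, 0.1]]`, no single interaction reaches the threshold on its own, but the fused belief does after the third.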
Post-Westgate SWAT : C4ISTAR Architectural Framework for Autonomous Network Integrated Multifaceted Warfighting Solutions Version 1.0 : A Peer-Reviewed Monograph
Police SWAT teams and Military Special Forces face mounting pressure and
challenges from adversaries that can only be resolved by way of ever more
sophisticated inputs into tactical operations. Lethal Autonomy provides
constrained military/security forces with a viable option, but only if its
implementation rests on sound, empirically supported foundations. Autonomous
weapon systems can be designed and developed to conduct ground, air and naval
operations. This monograph offers some insights into the challenges of
developing legal, reliable and ethical forms of autonomous weapons that
address the rapidly narrowing gap between Police or Law Enforcement and
Military operations. National adversaries are today in many
instances hybrid threats that manifest criminal and military traits; these
often require deployment of hybrid-capability autonomous weapons imbued with
the capability to take on Military and/or Security objectives. The
Westgate Terrorist Attack of 21st September 2013 in the Westlands suburb of
Nairobi, Kenya is a very clear manifestation of the hybrid combat scenario that
required military response and police investigations against a fighting cell of
the Somalia-based, globally networked Al Shabaab terrorist group.
Comment: 52 pages, 6 Figures, over 40 references, reviewed by a reade
Black-Box Data-efficient Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning (RL) in
robotics are based on uncertain dynamical models: after each episode, they
first learn a dynamical model of the robot, then they use an optimization
algorithm to find a policy that maximizes the expected return given the model
and its uncertainties. It is often believed that this optimization can be
tractable only if analytical, gradient-based algorithms are used; however,
these algorithms require using specific families of reward functions and
policies, which greatly limits the flexibility of the overall approach. In this
paper, we introduce a novel model-based RL algorithm, called Black-DROPS
(Black-box Data-efficient RObot Policy Search) that: (1) does not impose any
constraint on the reward function or the policy (they are treated as
black-boxes), (2) is as data-efficient as the state-of-the-art algorithm for
data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical
approaches when several cores are available. The key idea is to replace the
gradient-based optimization algorithm with a parallel, black-box algorithm that
takes into account the model uncertainties. We demonstrate the performance of
our new algorithm on two standard control benchmark problems (in simulation)
and a low-cost robotic manipulator (with a real robot).
Comment: Accepted at the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS) 2017; Code at
http://github.com/resibots/blackdrops; Video at http://youtu.be/kTEyYiIFGP
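The key idea in this abstract, scoring a policy on an uncertain model with a black-box optimizer, can be sketched in a few lines. This is not the Black-DROPS code: the dynamics model below is a hand-written stand-in for a learned model, the optimizer is a simple evolution strategy chosen for brevity, evaluation is sequential rather than parallel, and every name and constant is hypothetical. Each candidate policy is scored by Monte Carlo rollouts that sample the next state from the model's predictive distribution, so model uncertainty affects the score, and neither the reward nor the policy needs to be differentiable.

```python
import random


def model_predict(state, action):
    """Hypothetical learned dynamics: predictive mean and std of the next state."""
    mean = 0.9 * state + 0.5 * action
    std = 0.05 + 0.02 * abs(action)
    return mean, std


def reward(state):
    """Black-box reward: any Python callable works; here, stay close to 1.0."""
    return -abs(state - 1.0)


def expected_return(gain, rng, n_rollouts=20, horizon=15):
    """Average return of a clipped proportional policy over sampled model rollouts."""
    total = 0.0
    for _ in range(n_rollouts):
        state = 0.0
        for _ in range(horizon):
            # Policy treated as a black box: clipping makes it non-smooth.
            action = max(-1.0, min(1.0, gain * (1.0 - state)))
            mean, std = model_predict(state, action)
            state = rng.gauss(mean, std)  # sample from the model uncertainty
            total += reward(state)
    return total / n_rollouts


def policy_search(n_generations=30, population=8, sigma=0.3, seed=0):
    """Simple evolution strategy over the single policy parameter `gain`."""
    rng = random.Random(seed)
    best_gain = 0.0
    best_score = expected_return(best_gain, rng)
    for _ in range(n_generations):
        candidates = [best_gain + rng.gauss(0, sigma) for _ in range(population)]
        scores = [expected_return(g, rng) for g in candidates]
        i = max(range(population), key=scores.__getitem__)
        if scores[i] > best_score:
            best_gain, best_score = candidates[i], scores[i]
    return best_gain, best_score
```

Because each candidate is evaluated independently, the list comprehension that computes `scores` is the natural place to distribute work across cores, which is the property the abstract highlights.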