Constructing Parsimonious Analytic Models for Dynamic Systems via Symbolic Regression
Developing mathematical models of dynamic systems is central to many
disciplines of engineering and science. Models facilitate simulations, analysis
of the system's behavior, decision making and design of automatic control
algorithms. Even inherently model-free control techniques such as reinforcement
learning (RL) have been shown to benefit from the use of models, typically
learned online. Any model construction method must address the tradeoff between
the accuracy of the model and its complexity, a balance that is difficult to strike. In
this paper, we propose to employ symbolic regression (SR) to construct
parsimonious process models described by analytic equations. We have equipped
our method with two different state-of-the-art SR algorithms which
automatically search for equations that fit the measured data: Single Node
Genetic Programming (SNGP) and Multi-Gene Genetic Programming (MGGP). In
addition to the standard problem formulation in the state-space domain, we show
how the method can also be applied to input-output models of the NARX
(nonlinear autoregressive with exogenous input) type. We present the approach
on three simulated examples with up to 14-dimensional state space: an inverted
pendulum, a mobile robot, and a bipedal walking robot. A comparison with deep
neural networks and local linear regression shows that SR in most cases
outperforms these commonly used alternative methods. We demonstrate on a real
pendulum system that the analytic model found enables an RL controller to
successfully perform the swing-up task, based on a model constructed from only
100 data samples.
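As a rough illustration of the input-output (NARX) formulation mentioned in this abstract, the sketch below arranges measured input and output sequences into regressor/target pairs that any regression method, SR included, could then fit. The function name `build_narx_dataset`, the lag orders `ny` and `nu`, and the toy system are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def build_narx_dataset(
    y: Sequence[float], u: Sequence[float], ny: int, nu: int
) -> Tuple[List[List[float]], List[float]]:
    """Arrange sequences into NARX pairs.

    Each regressor is [y(k-1), ..., y(k-ny), u(k-1), ..., u(k-nu)]
    and the matching target is y(k).
    """
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    start = max(ny, nu)  # first index with a full history of lags
    regressors: List[List[float]] = []
    targets: List[float] = []
    for k in range(start, len(y)):
        past_outputs = [y[k - i] for i in range(1, ny + 1)]
        past_inputs = [u[k - j] for j in range(1, nu + 1)]
        regressors.append(past_outputs + past_inputs)
        targets.append(y[k])
    return regressors, targets


# Toy data from a hypothetical system y(k) = 0.5 * y(k-1) + u(k-1).
u = [1.0, 0.0, 2.0, 1.0, 0.0]
y = [0.0]
for k in range(1, len(u)):
    y.append(0.5 * y[k - 1] + u[k - 1])

X, t = build_narx_dataset(y, u, ny=1, nu=1)
```

With this layout, a model found by regression plays the role of the unknown function mapping each row of `X` to the corresponding entry of `t`.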
Intrinsic Motivation Systems for Autonomous Mental Development
Exploratory activities seem to be intrinsically rewarding
for children and crucial for their cognitive development.
Can a machine be endowed with such an intrinsic motivation
system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations
which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology.
Key words: Active learning, autonomy, behavior, complexity,
curiosity, development, developmental trajectory, epigenetic
robotics, intrinsic motivation, learning, reinforcement learning,
values
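The idea of preferring situations with maximal learning progress, as described in the abstract above, can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's mechanism: here learning progress is assumed to be the drop in mean prediction error between two consecutive windows, and the region names, error values, and function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def learning_progress(errors: List[float], window: int) -> float:
    """Decrease in mean prediction error between the previous and the latest window."""
    if len(errors) < 2 * window:
        return 0.0  # not enough history to estimate progress
    older = mean(errors[-2 * window:-window])
    recent = mean(errors[-window:])
    return older - recent


def pick_region(history: Dict[str, List[float]], window: int) -> str:
    """Choose the region whose prediction error is currently falling fastest."""
    return max(history, key=lambda region: learning_progress(history[region], window))


# Hypothetical prediction-error histories for three kinds of situation.
history = {
    "predictable": [0.01, 0.01, 0.01, 0.01],   # already learned, no progress
    "learnable": [0.8, 0.6, 0.4, 0.2],         # error is dropping
    "unpredictable": [0.9, 1.0, 0.9, 1.0],     # high error, no progress
}
chosen = pick_region(history, window=2)
```

In this toy setting both the fully predictable and the unpredictable region show zero progress, so the selection falls on the region where the error is still decreasing.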
Learning Equations for Extrapolation and Control
We present an approach to identify concise equations from data using a
shallow neural network approach. In contrast to ordinary black-box regression,
this approach allows understanding functional relations and generalizing them
from observed data to unseen parts of the parameter space. We show how to
extend the class of learnable equations for a recently proposed equation
learning network to include divisions, and we improve the learning and model
selection strategy to be useful for challenging real-world data. For systems
governed by analytical expressions, our method can in many cases identify the
true underlying equation and extrapolate to unseen domains. We demonstrate its
effectiveness by experiments on a cart-pendulum system, where only 2 random
rollouts are required to learn the forward dynamics and successfully achieve
the swing-up task.
Comment: 9 pages, 9 figures, ICML 201
Benchmarking Deep Reinforcement Learning for Continuous Control
Recently, researchers have made significant progress combining the advances
in deep learning for learning feature representations with reinforcement
learning. Some notable examples include training agents to play Atari games
based on raw pixel data and to acquire advanced manipulation skills using raw
sensory inputs. However, it has been difficult to quantify progress in the
domain of continuous control due to the lack of a commonly adopted benchmark.
In this work, we present a benchmark suite of continuous control tasks,
including classic tasks like cart-pole swing-up, tasks with very high state and
action dimensionality such as 3D humanoid locomotion, tasks with partial
observations, and tasks with hierarchical structure. We report novel findings
based on the systematic evaluation of a range of implemented reinforcement
learning algorithms. Both the benchmark and reference implementations are
released at https://github.com/rllab/rllab in order to facilitate experimental
reproducibility and to encourage adoption by other researchers.
Comment: 14 pages, ICML 201
Evolutionary Reinforcement Learning: A Survey
Reinforcement learning (RL) is a machine learning approach that trains agents
to maximize cumulative rewards through interactions with environments. The
integration of RL with deep learning has recently resulted in impressive
achievements in a wide range of challenging tasks, including board games,
arcade games, and robot control. Despite these successes, there remain several
crucial challenges, including brittle convergence properties caused by
sensitive hyperparameters, difficulties in temporal credit assignment with long
time horizons and sparse rewards, a lack of diverse exploration, especially in
continuous search space scenarios, difficulties in credit assignment in
multi-agent reinforcement learning, and conflicting objectives for rewards.
Evolutionary computation (EC), which maintains a population of learning agents,
has demonstrated promising performance in addressing these limitations. This
article presents a comprehensive survey of state-of-the-art methods for
integrating EC into RL, referred to as evolutionary reinforcement learning
(EvoRL). We categorize EvoRL methods according to key research fields in RL,
including hyperparameter optimization, policy search, exploration, reward
shaping, meta-RL, and multi-objective RL. We then discuss future research
directions in terms of efficient methods, benchmarks, and scalable platforms.
This survey serves as a resource for researchers and practitioners interested
in the field of EvoRL, highlighting the important challenges and opportunities
for future research. With the help of this survey, researchers and
practitioners can develop more efficient methods and tailored benchmarks for
EvoRL, further advancing this promising cross-disciplinary research field.
Multiobjective optimization of electromagnetic structures based on self-organizing migration
This thesis describes a novel stochastic multi-objective optimization algorithm called MOSOMA (Multi-Objective Self-Organizing Migrating Algorithm). It is shown that MOSOMA is able to solve various types of multi-objective optimization problems (with any number of objectives, constrained or unconstrained, with continuous or discrete decision space). The efficiency of MOSOMA is compared with other commonly used optimization techniques on a large suite of test problems. A new procedure for computing the spread metric for problems with more than two objectives, based on finding the minimum spanning tree, is proposed. Recommended values of the parameters controlling the run of MOSOMA are derived from their sensitivity analysis. The ability of MOSOMA to solve real-life problems from electromagnetics is shown in a few examples (design of Yagi-Uda antennas and dielectric filters, adaptive beam forming in the time domain…).
Analytical Programming - a Novel Approach for Evolutionary Synthesis of Symbolic Structures
This chapter discusses an alternative approach to the synthesis of symbolic structures and solutions and compares it with other methods, for example Genetic Programming (GP) and Grammatical Evolution (GE). Generally, there are two well-known methods that can be used for computer-based synthesis of symbolic structures: GP and GE. Other interesting research has been carried out with Artificial Immune Systems (AIS) and with systems that do not use tree structures, such as linear GP and similar algorithms like Multi Expression Programming (MEP). In this chapter, a different method called Analytic Programming (AP) is presented. AP is a grammar-free algorithmic superstructure that can be used in any programming language and with any Evolutionary Algorithm (EA) or other class of numerical optimization method. This chapter describes not only the theoretical principles of AP but also a comparative study on selected well-known case examples from GP, as well as applications to the synthesis of controllers, systems of deterministic chaos, electronic circuits, etc. For simulation purposes, AP has been combined with EAs such as Differential Evolution (DE), the Self-Organising Migrating Algorithm (SOMA), Genetic Algorithms (GA) and Simulated Annealing (SA). All case studies have been carefully prepared and repeated in order to obtain valid statistical data for proper conclusions.
P(ED2.1.00/03.0089), P(GA102/09/1680), S, Z(MSM7088352101