107 research outputs found
Gradient-free Policy Architecture Search and Adaptation
We develop a method for policy architecture search and adaptation via
gradient-free optimization which can learn to perform autonomous driving tasks.
By learning from both demonstration and environmental reward, we obtain a model
that learns with relatively few early catastrophic failures. We first learn
an architecture of appropriate complexity to perceive aspects of world state
relevant to the expert demonstration, and then mitigate the effect of
domain-shift during deployment by adapting a policy demonstrated in a source
domain to rewards obtained in a target environment. We show that our approach
allows safer learning than baseline methods, offering a reduced cumulative
crash metric over the agent's lifetime as it learns to drive in a realistic
simulated environment.
Comment: Accepted in Conference on Robot Learning, 201
Evolving Artificial Neural Networks To Imitate Human Behaviour In Shinobi III : Return of the Ninja Master
Our society is increasingly fond of computational tools. This trend has
accelerated over the past decade following, among other factors, the emergence
of a new Artificial Intelligence paradigm. Specifically, the coupling of two
algorithmic techniques, Deep Neural Networks and Stochastic Gradient Descent,
propelled by exponentially increasing computing capacity, has become and
continues to be a major asset in many modern technologies. However, as progress
takes its course, some still wonder whether other methods could benefit
similarly, or even more, from these hardware advances. To further this line of
inquiry, this thesis delves into Evolutionary Algorithms and their application
to Dynamic Neural Networks, two techniques that, despite many advantageous
properties, have yet to find their niche in contemporary Artificial
Intelligence. We find that by devising new methods while exploiting substantial
computational resources, it becomes possible to develop high-performing agents
on a variety of benchmarks, as well as agents that behave very similarly to
human subjects on the video game Shinobi III : Return of The Ninja Master, a
typical example of the complex tasks previously out of reach for
non-gradient-based optimization.
Evolving artificial neural networks to imitate human behaviour in Shinobi III : return of the Ninja master
Our society is increasingly fond of computational tools. This trend has accelerated over the past decade following, among other factors, the emergence of a new Artificial Intelligence paradigm. Specifically, the coupling of two algorithmic techniques, Deep Neural Networks and Stochastic Gradient Descent, propelled by exponentially increasing computing capacity, has become and continues to be a major asset in many modern technologies. However, as progress takes its course, some still wonder whether other methods could benefit similarly, or even more, from these hardware advances.
To further this line of inquiry, this thesis delves into Evolutionary Algorithms and their application to Dynamic Neural Networks, two techniques that, despite many advantageous properties, have yet to find their niche in contemporary Artificial Intelligence. We find that by devising new methods while exploiting substantial computational resources, it becomes possible to develop high-performing agents on a variety of benchmarks, as well as agents that behave very similarly to human subjects on the video game Shinobi III : Return of The Ninja Master, a typical example of the complex tasks previously out of reach for non-gradient-based optimization.
Evolutionary Reinforcement Learning: A Survey
Reinforcement learning (RL) is a machine learning approach that trains agents
to maximize cumulative rewards through interactions with environments. The
integration of RL with deep learning has recently resulted in impressive
achievements in a wide range of challenging tasks, including board games,
arcade games, and robot control. Despite these successes, there remain several
crucial challenges, including brittle convergence properties caused by
sensitive hyperparameters, difficulties in temporal credit assignment with long
time horizons and sparse rewards, a lack of diverse exploration, especially in
continuous search space scenarios, difficulties in credit assignment in
multi-agent reinforcement learning, and conflicting objectives for rewards.
Evolutionary computation (EC), which maintains a population of learning agents,
has demonstrated promising performance in addressing these limitations. This
article presents a comprehensive survey of state-of-the-art methods for
integrating EC into RL, referred to as evolutionary reinforcement learning
(EvoRL). We categorize EvoRL methods according to key research fields in RL,
including hyperparameter optimization, policy search, exploration, reward
shaping, meta-RL, and multi-objective RL. We then discuss future research
directions in terms of efficient methods, benchmarks, and scalable platforms.
This survey serves as a resource for researchers and practitioners interested
in the field of EvoRL, highlighting the important challenges and opportunities
for future research. With the help of this survey, researchers and
practitioners can develop more efficient methods and tailored benchmarks for
EvoRL, further advancing this promising cross-disciplinary research field.
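As a rough illustration of the idea summarized above (a population of agents scored by cumulative reward), the following minimal sketch evolves a single policy parameter on a toy 1-D task. It is not taken from the survey: the task, the elitist selection scheme, and every name (`episode_return`, `evolve`, `sigma`) are assumptions made for the example.

```python
import random


def episode_return(k, steps=20):
    """Cumulative reward of the linear policy a = -k * s on a toy 1-D task."""
    s, total = 1.0, 0.0
    for _ in range(steps):
        s = s + (-k * s)   # toy dynamics (assumed): the action shifts the state
        total += -s * s    # reward penalizes distance from the origin
    return total


def evolve(pop_size=20, elite=5, generations=30, sigma=0.2, seed=0):
    """Elitist evolutionary search over the policy parameter k."""
    rng = random.Random(seed)
    population = [rng.uniform(-1.0, 3.0) for _ in range(pop_size)]
    history = []  # best cumulative reward seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=episode_return, reverse=True)
        history.append(episode_return(ranked[0]))
        parents = ranked[:elite]  # elites survive unchanged
        children = [
            rng.choice(parents) + rng.gauss(0.0, sigma)  # Gaussian mutation
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    best = max(population, key=episode_return)
    return best, history


if __name__ == "__main__":
    best_k, history = evolve()
    print(best_k, history[0], history[-1])
```

Because the elites are carried over unchanged, the best return per generation never decreases. No gradient of the return is computed at any point; the search only ranks candidates by their episode return.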
Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks
Biological plastic neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence. Inspired by such intricate natural phenomena, Evolved Plastic
Artificial Neural Networks (EPANNs) use simulated evolution in silico to breed
plastic neural networks with a large variety of dynamics, architectures, and
plasticity rules: these artificial systems are composed of inputs, outputs, and
plastic components that change in response to experiences in an environment.
These systems may autonomously discover novel adaptive algorithms, and lead to
hypotheses on the emergence of biological adaptation. EPANNs have seen
considerable progress over the last two decades. Current scientific and
technological advances in artificial neural networks are now setting the
conditions for radically new approaches and results. In particular, the
limitations of hand-designed networks could be overcome by more flexible and
innovative solutions. This paper brings together a variety of inspiring ideas
that define the field of EPANNs. The main methods and results are reviewed.
Finally, new opportunities and developments are presented.
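To make "plastic components that change in response to experiences" concrete, here is a minimal sketch of one well-known plasticity rule, the plain Hebbian update, applied to a single neuron. The abstract does not say which rules EPANNs use, so the choice of rule, the `tanh` activation, and the names `hebbian_step`, `lifetime`, and `eta` are assumptions for illustration. In an evolved setting, a parameter such as `eta` would be set by the evolutionary search rather than by hand.

```python
import math


def hebbian_step(weights, inputs, eta):
    """One plain Hebbian update: dw_i = eta * pre_i * post."""
    post = math.tanh(sum(w * x for w, x in zip(weights, inputs)))
    new_weights = [w + eta * x * post for w, x in zip(weights, inputs)]
    return new_weights, post


def lifetime(weights, inputs, eta, steps=10):
    """Present the same input repeatedly and record the neuron's outputs."""
    outputs = []
    for _ in range(steps):
        weights, post = hebbian_step(weights, inputs, eta)
        outputs.append(post)
    return weights, outputs


if __name__ == "__main__":
    final_weights, outputs = lifetime([0.1, 0.1], [1.0, 0.5], eta=0.1)
    print(final_weights, outputs)
```

With positive inputs and a positive initial response, each update strengthens the active connections, so the response to the repeated input grows over the neuron's "lifetime". Setting `eta` to zero recovers a fixed, non-plastic neuron.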
An Intelligent Social Learning-based Optimization Strategy for Black-box Robotic Control with Reinforcement Learning
Implementing intelligent control of robots is a difficult task, especially
when dealing with complex black-box systems, because of the lack of visibility
and understanding of how these robots work internally. This paper proposes an
Intelligent Social Learning (ISL) algorithm to enable intelligent control of
black-box robotic systems. Inspired by mutual learning among individuals in
human social groups, ISL includes learning, imitation, and self-study styles.
Individuals in the learning style use the Levy flight search strategy to learn
from the best performer and form the closest relationships. In the imitation
style, individuals mimic the best performer with a second-level rapport by
employing a random perturbation strategy. In the self-study style, individuals
learn independently using a normal distribution sampling method while
maintaining a distant relationship with the best performer. Individuals in the
population are regarded as autonomous intelligent agents in each style. In all
three styles, neural networks select actions to interact with the environment
and the robot, and the network policy is optimized iteratively. Overall, ISL
builds on the principles of intelligent optimization, incorporates ideas from
reinforcement learning, and offers strong search capabilities, fast
computation, few hyperparameters, and insensitivity to sparse rewards. The
proposed ISL algorithm is compared with four state-of-the-art methods on six
continuous control benchmark cases in MuJoCo to verify its effectiveness and
advantages. Furthermore, ISL is validated on simulated and experimental
grasping tasks with the UR3 robot, where it yields satisfactory solutions.
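The three styles described above can be loosely sketched as a population search over parameter vectors. This is not the authors' implementation. The split of the population into thirds by rank, the step sizes, the greedy acceptance rule, the use of Mantegna's algorithm for the Levy steps, and the toy `reward` function (standing in for a black-box robot rollout) are all assumptions made for the example.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = max(abs(rng.gauss(0.0, 1.0)), 1e-12)  # guard against division by zero
    return u / v ** (1 / beta)


def reward(x):
    """Toy black-box objective (assumed); higher is better, optimum at 0."""
    return -sum(xi * xi for xi in x)


def isl_sketch(dim=4, pop_size=30, iterations=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    best = max(pop, key=reward)
    history = [reward(best)]
    third = pop_size // 3
    for _ in range(iterations):
        pop.sort(key=reward, reverse=True)
        for rank, x in enumerate(pop):
            if rank < third:
                # Learning style: Levy-flight move toward the best performer.
                cand = [xi + 0.01 * levy_step(rng) * (bi - xi)
                        for xi, bi in zip(x, best)]
            elif rank < 2 * third:
                # Imitation style: random perturbation of the best performer.
                cand = [bi + rng.uniform(-0.1, 0.1) for bi in best]
            else:
                # Self-study style: normal sampling around one's own position.
                cand = [rng.gauss(xi, 0.5) for xi in x]
            if reward(cand) > reward(x):  # greedy acceptance (assumed)
                pop[rank] = cand
        best = max(pop + [best], key=reward)
        history.append(reward(best))
    return best, history


if __name__ == "__main__":
    best, history = isl_sketch()
    print(history[0], history[-1])
```

In the paper's setting, each vector would hold neural-network policy weights and `reward` would be the return of a rollout on the robot or simulator. The toy objective here only keeps the sketch self-contained.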