Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Evolution strategies (ES) are a family of black-box optimization algorithms
able to train deep neural networks roughly as well as Q-learning and policy
gradient methods on challenging deep reinforcement learning (RL) problems, but
are much faster (e.g. hours vs. days) because they parallelize better. However,
many RL problems require directed exploration because they have reward
functions that are sparse or deceptive (i.e. contain local optima), and it is
unknown how to encourage such exploration with ES. Here we show that algorithms
that have been invented to promote directed exploration in small-scale evolved
neural networks via populations of exploring agents, specifically novelty
search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to
improve its performance on sparse or deceptive deep RL tasks, while retaining
scalability. Our experiments confirm that the resultant new algorithms, NS-ES
and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES
to achieve higher performance on Atari and simulated robots learning to walk
around a deceptive trap. This paper thus introduces a family of fast, scalable
algorithms for reinforcement learning that are capable of directed exploration.
It also adds this new family of exploration algorithms to the RL toolbox and
raises the interesting possibility that analogous algorithms with multiple
simultaneous paths of exploration might also combine well with existing RL
algorithms outside ES.
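As an illustration of the novelty signal this abstract refers to, the sketch below computes a novelty score the way novelty search commonly defines it: the mean distance from an agent's behavior characterization to its k nearest neighbors in an archive of past behaviors. The function name `novelty`, the 2-D behavior vectors, and the choice of Euclidean distance are assumptions made for this example; the abstract does not specify how NS-ES, NSR-ES, or NSRA-ES compute or combine novelty and reward.

```python
import math


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries.

    `behavior` and every archive entry are equal-length sequences of floats
    (a hypothetical behavior characterization, e.g. a final (x, y) position).
    """
    if not archive:
        raise ValueError("archive must contain at least one behavior")
    # Sort distances ascending and keep the k closest (fewer if the archive is small).
    nearest = sorted(math.dist(behavior, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


# Hypothetical archive of previously seen behaviors.
archive = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near_score = novelty((0.1, 0.1), archive, k=2)  # close to known behaviors
far_score = novelty((5.0, 5.0), archive, k=2)   # far from known behaviors
```

A behavior far from everything in the archive receives a higher score, which is what rewards exploration instead of task performance.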
Discovering Evolutionary Stepping Stones through Behavior Domination
Behavior domination is proposed as a tool for understanding and harnessing
the power of evolutionary systems to discover and exploit useful stepping
stones. Novelty search has shown promise in overcoming deception by collecting
diverse stepping stones, and several algorithms have been proposed that combine
novelty with a more traditional fitness measure to refocus search and help
novelty search scale to more complex domains. However, combinations of novelty
and fitness do not necessarily preserve the stepping stone discovery that
novelty search affords. In several existing methods, competition between
solutions can lead to an unintended loss of diversity. Behavior domination
defines a class of algorithms that avoid this problem, while inheriting
theoretical guarantees from multiobjective optimization. Several existing
algorithms are shown to be in this class, and a new algorithm is introduced
based on fast non-dominated sorting. Experimental results show that this
algorithm outperforms existing approaches in domains that contain useful
stepping stones, and its advantage is sustained with scale. The conclusion is
that behavior domination can help illuminate the complex dynamics of
behavior-driven search, and can thus lead to the design of more scalable and
robust algorithms.
Comment: To appear in Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO 2017).
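The abstract mentions non-dominated sorting from multiobjective optimization. The sketch below shows ordinary Pareto dominance and a naive way to peel a set of points into non-dominated fronts. It is not the paper's behavior domination relation, which the abstract does not define, and it omits the bookkeeping that makes the "fast" variant fast. The names `dominates` and `non_dominated_fronts`, and the two-objective example points, are assumptions for illustration.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated_fronts(points):
    """Partition point indices into successive non-dominated fronts.

    Naive version: repeatedly collect the points that no remaining
    point dominates, then remove them and repeat.
    """
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Hypothetical (novelty, fitness) pairs for five candidate solutions.
points = [(1, 5), (5, 1), (3, 3), (2, 2), (1, 1)]
fronts = non_dominated_fronts(points)
```

Here the first three points trade off against one another and form the first front, while `(2, 2)` and `(1, 1)` fall into later fronts because each is dominated by a point ahead of it.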
Fusing novelty and surprise for evolving robot morphologies
Traditional evolutionary algorithms tend to converge to a single
good solution, which can limit their chance of discovering more
diverse and creative outcomes. Divergent search, on the other hand,
aims to counter convergence to local optima by avoiding selection
pressure towards the objective. Forms of divergent search such as
novelty or surprise search have proven to be beneficial for both
the efficiency and the variety of the solutions obtained in deceptive
tasks. Importantly for this paper, early results in maze navigation
have shown that combining novelty and surprise search yields an
even more effective search strategy due to their orthogonal nature.
Motivated by the largely unexplored potential of coupling novelty
and surprise as a search strategy, in this paper we investigate how
fusing the two can affect the evolution of soft robot morphologies.
We test the capacity of the combined search strategy against objective,
novelty, and surprise search, by comparing their efficiency and
robustness, and the variety of robots they evolve. Our key results
demonstrate that novelty-surprise search is generally more efficient
and robust across eight different resolutions. Further, surprise
search explores the space of robot morphologies more broadly than
any other algorithm examined.
Peer-reviewed
Will This Paper Increase Your h-index? Scientific Impact Prediction
Scientific impact plays a central role in the evaluation of the output of
scholars, departments, and institutions. A widely used measure of scientific
impact is citations, with a growing body of literature focused on predicting
the number of citations obtained by any given publication. The effectiveness of
such predictions, however, is fundamentally limited by the power-law
distribution of citations, whereby publications with few citations are
extremely common and publications with many citations are relatively rare.
Given this limitation, in this work we instead address a related question asked
by many academic researchers in the course of writing a paper, namely: "Will
this paper increase my h-index?" Using a real academic dataset with over 1.7
million authors, 2 million papers, and 8 million citation relationships from
the premier online academic service ArnetMiner, we formalize a novel scientific
impact prediction problem to examine several factors that can drive a paper to
increase the primary author's h-index. We find that the researcher's authority
on the publication topic and the venue in which the paper is published are
crucial factors to the increase of the primary author's h-index, while the
topic popularity and the co-authors' h-indices are of surprisingly little
relevance. By leveraging relevant factors, we find a greater than 87.5%
potential predictability for whether a paper will contribute to an author's
h-index within five years. As a further experiment, we generate a
self-prediction for this paper, estimating that there is a 76% probability that
it will contribute to the h-index of the co-author with the highest current
h-index in five years. We conclude that our findings on the quantification of
scientific impact can help researchers to expand their influence and more
effectively leverage their position of "standing on the shoulders of giants."
Comment: Proc. of the 8th ACM International Conference on Web Search and Data
Mining (WSDM'15).
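For readers unfamiliar with the metric, the sketch below computes an h-index under its standard definition (the largest h such that h papers each have at least h citations) and checks whether adding one more paper would raise it. The helper name `paper_raises_h_index` and the citation counts are hypothetical. This is only the arithmetic behind the question in the abstract, not the paper's prediction model.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def paper_raises_h_index(existing, new_paper_citations):
    """True if adding a paper with the given citation count increases the h-index."""
    return h_index(existing + [new_paper_citations]) > h_index(existing)


# Hypothetical citation counts for one author's existing papers.
existing = [5, 5, 5, 5, 3]
current_h = h_index(existing)                        # four papers with >= 4 citations
raised_by_strong = paper_raises_h_index(existing, 6)
raised_by_weak = paper_raises_h_index(existing, 1)
```

A new paper raises the h-index only if it collects more citations than the current h and the author already has enough other well-cited papers, which is why most papers leave it unchanged.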
Toward behavioural innovation economics – Heuristics and biases in choice under novelty
A framework for ‘behavioural innovation economics’ is proposed here as a synthesis of behavioural economics and innovation economics in the specific context of choice under novelty. We seek to apply the heuristics and biases framework of behavioural economics to the study of the innovation process in order to map and analyze systematic choice failures in the innovation process. We elaborate the distinction between choice under uncertainty and choice under novelty, as well as drawing out the ‘efficient innovation hypothesis’ implicit in most behavioural models of innovation. The subject domain of a research program for behavioural innovation economics is then briefly outlined in terms of a catalogue of characteristic ways in which choice under novelty renders innovation processes subject to failure.
Quality Diversity: Harnessing Evolution to Generate a Diversity of High-Performing Solutions
Evolution in nature has designed countless solutions to innumerable interconnected problems, giving birth to the impressive array of complex modern life observed today. Inspired by this success, the practice of evolutionary computation (EC) abstracts evolution artificially as a search operator to find solutions to problems of interest primarily through the adaptive mechanism of survival of the fittest, where stronger candidates are pursued at the expense of weaker ones until a solution of satisfying quality emerges. At the same time, research in open-ended evolution (OEE) draws different lessons from nature, seeking to identify and recreate processes that lead to the type of perpetual innovation and indefinitely increasing complexity observed in natural evolution. New algorithms in EC such as MAP-Elites and Novelty Search with Local Competition harness the toolkit of evolution for a related purpose: finding as many types of good solutions as possible (rather than merely the single best solution). With the field in its infancy, no empirical studies previously existed comparing these so-called quality diversity (QD) algorithms. This dissertation (1) contains the first extensive and methodical effort to compare different approaches to QD (including both existing published approaches as well as some new methods presented for the first time here) and to understand how they operate to help inform better approaches in the future. It also (2) introduces a new technique for encoding neural networks for evolution with indirect encoding that contain multiple sensory or output modalities. Further, it (3) explores the idea that QD can act as an engine of open-ended discovery by introducing an expressive platform called Voxelbuild where QD algorithms continually evolve robots that stack blocks in new ways. A culminating experiment (4) is presented that investigates evolution in Voxelbuild over a very long timescale. 
This research thus stands to advance the OEE community's desire to create and understand open-ended systems while also laying the groundwork for QD to realize its potential within EC as a means to automatically generate an endless progression of new content in real-world applications.