6,803 research outputs found

    Learning Behavior Characterizations for Novelty Search

    Get PDF

    Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

    Full text link
    Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms that have been invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES to achieve higher performance on Atari and simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES

    Discovering Evolutionary Stepping Stones through Behavior Domination

    Full text link
    Behavior domination is proposed as a tool for understanding and harnessing the power of evolutionary systems to discover and exploit useful stepping stones. Novelty search has shown promise in overcoming deception by collecting diverse stepping stones, and several algorithms have been proposed that combine novelty with a more traditional fitness measure to refocus search and help novelty search scale to more complex domains. However, combinations of novelty and fitness do not necessarily preserve the stepping stone discovery that novelty search affords. In several existing methods, competition between solutions can lead to an unintended loss of diversity. Behavior domination defines a class of algorithms that avoid this problem, while inheriting theoretical guarantees from multiobjective optimization. Several existing algorithms are shown to be in this class, and a new algorithm is introduced based on fast non-dominated sorting. Experimental results show that this algorithm outperforms existing approaches in domains that contain useful stepping stones, and its advantage is sustained with scale. The conclusion is that behavior domination can help illuminate the complex dynamics of behavior-driven search, and can thus lead to the design of more scalable and robust algorithms.Comment: To Appear in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017

    Fusing novelty and surprise for evolving robot morphologies

    Get PDF
    Traditional evolutionary algorithms tend to converge to a single good solution, which can limit their chance of discovering more diverse and creative outcomes. Divergent search, on the other hand, aims to counter convergence to local optima by avoiding selection pressure towards the objective. Forms of divergent search such as novelty or surprise search have proven to be beneficial for both the efficiency and the variety of the solutions obtained in deceptive tasks. Importantly for this paper, early results in maze navigation have shown that combining novelty and surprise search yields an even more effective search strategy due to their orthogonal nature. Motivated by the largely unexplored potential of coupling novelty and surprise as a search strategy, in this paper we investigate how fusing the two can affect the evolution of soft robot morphologies. We test the capacity of the combined search strategy against objective, novelty, and surprise search, by comparing their efficiency and robustness, and the variety of robots they evolve. Our key results demonstrate that novelty-surprise search is generally more efficient and robust across eight different resolutions. Further, surprise search explores the space of robot morphologies more broadly than any other algorithm examined.peer-reviewe

    Will This Paper Increase Your h-index? Scientific Impact Prediction

    Full text link
    Scientific impact plays a central role in the evaluation of the output of scholars, departments, and institutions. A widely used measure of scientific impact is citations, with a growing body of literature focused on predicting the number of citations obtained by any given publication. The effectiveness of such predictions, however, is fundamentally limited by the power-law distribution of citations, whereby publications with few citations are extremely common and publications with many citations are relatively rare. Given this limitation, in this work we instead address a related question asked by many academic researchers in the course of writing a paper, namely: "Will this paper increase my h-index?" Using a real academic dataset with over 1.7 million authors, 2 million papers, and 8 million citation relationships from the premier online academic service ArnetMiner, we formalize a novel scientific impact prediction problem to examine several factors that can drive a paper to increase the primary author's h-index. We find that the researcher's authority on the publication topic and the venue in which the paper is published are crucial factors to the increase of the primary author's h-index, while the topic popularity and the co-authors' h-indices are of surprisingly little relevance. By leveraging relevant factors, we find a greater than 87.5% potential predictability for whether a paper will contribute to an author's h-index within five years. As a further experiment, we generate a self-prediction for this paper, estimating that there is a 76% probability that it will contribute to the h-index of the co-author with the highest current h-index in five years. We conclude that our findings on the quantification of scientific impact can help researchers to expand their influence and more effectively leverage their position of "standing on the shoulders of giants."Comment: Proc. of the 8th ACM International Conference on Web Search and Data Mining (WSDM'15

    Toward behavioural innovation economics – Heuristics and biases in choice under novelty

    Get PDF
    A framework for ‘behavioural innovation economics’ is proposed here as a synthesis of behavioural economics and innovation economics in the specific context of choice under novelty. We seek to apply the heuristics and biases framework of behavioural economics to the study of the innovation process in order to map and analyze systematic choice failures in the innovation process. We elaborate the distinction between choice under uncertainty and choice under novelty, as well as drawing out the ‘efficient innovation hypothesis’ implicit in most behavioural models of innovation. The subject domain of a research program for behavioural innovation economics is then briefly outlined in terms of a catalogue of characteristic ways in which choice under novelty renders innovation processes subject to failure.

    Quality Diversity: Harnessing Evolution to Generate a Diversity of High-Performing Solutions

    Get PDF
    Evolution in nature has designed countless solutions to innumerable interconnected problems, giving birth to the impressive array of complex modern life observed today. Inspired by this success, the practice of evolutionary computation (EC) abstracts evolution artificially as a search operator to find solutions to problems of interest primarily through the adaptive mechanism of survival of the fittest, where stronger candidates are pursued at the expense of weaker ones until a solution of satisfying quality emerges. At the same time, research in open-ended evolution (OEE) draws different lessons from nature, seeking to identify and recreate processes that lead to the type of perpetual innovation and indefinitely increasing complexity observed in natural evolution. New algorithms in EC such as MAP-Elites and Novelty Search with Local Competition harness the toolkit of evolution for a related purpose: finding as many types of good solutions as possible (rather than merely the single best solution). With the field in its infancy, no empirical studies previously existed comparing these so-called quality diversity (QD) algorithms. This dissertation (1) contains the first extensive and methodical effort to compare different approaches to QD (including both existing published approaches as well as some new methods presented for the first time here) and to understand how they operate to help inform better approaches in the future. It also (2) introduces a new technique for encoding neural networks for evolution with indirect encoding that contain multiple sensory or output modalities. Further, it (3) explores the idea that QD can act as an engine of open-ended discovery by introducing an expressive platform called Voxelbuild where QD algorithms continually evolve robots that stack blocks in new ways. A culminating experiment (4) is presented that investigates evolution in Voxelbuild over a very long timescale. This research thus stands to advance the OEE community\u27s desire to create and understand open-ended systems while also laying the groundwork for QD to realize its potential within EC as a means to automatically generate an endless progression of new content in real-world applications
    • …
    corecore