16,604 research outputs found

    Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

    Full text link
    While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. This paper proposes a solution by introducing a family of safe mutation (SM) operators that aim within the mutation operator itself to find a degree of change that does not alter network behavior too much, but still facilitates exploration. Importantly, these SM operators do not require any additional interactions with the environment. The most effective SM variant capitalizes on the intriguing opportunity to scale the degree of mutation of each individual weight according to the sensitivity of the network's outputs to that weight, which requires computing the gradient of outputs with respect to the weights (instead of the gradient of error, as in conventional deep learning). This safe mutation through gradients (SM-G) operator dramatically increases the ability of a simple genetic algorithm-based neuroevolution method to find solutions in high-dimensional domains that require deep and/or recurrent neural networks (which tend to be particularly brittle to mutation), including domains that require processing raw pixels. By improving our ability to evolve deep neural networks, this new safer approach to mutation expands the scope of domains amenable to neuroevolution

    Adaptive Investment Strategies For Periodic Environments

    Full text link
    In this paper, we present an adaptive investment strategy for environments with periodic returns on investment. In our approach, we consider an investment model where the agent decides at every time step the proportion of wealth to invest in a risky asset, keeping the rest of the budget in a risk-free asset. Every investment is evaluated in the market via a stylized return on investment function (RoI), which is modeled by a stochastic process with unknown periodicities and levels of noise. For comparison reasons, we present two reference strategies which represent the case of agents with zero-knowledge and complete-knowledge of the dynamics of the returns. We consider also an investment strategy based on technical analysis to forecast the next return by fitting a trend line to previous received returns. To account for the performance of the different strategies, we perform some computer experiments to calculate the average budget that can be obtained with them over a certain number of time steps. To assure for fair comparisons, we first tune the parameters of each strategy. Afterwards, we compare the performance of these strategies for RoIs with different periodicities and levels of noise.Comment: Paper submitted to Advances in Complex Systems (November, 2007) 22 pages, 9 figure

    A comparative study of game theoretic and evolutionary models for software agents

    No full text
    Most of the existing work in the study of bargaining behaviour uses techniques from game theory. Game theoretic models for bargaining assume that players are perfectly rational and that this rationality in common knowledge. However, the perfect rationality assumption does not hold for real-life bargaining scenarios with humans as players, since results from experimental economics show that humans find their way to the best strategy through trial and error, and not typically by means of rational deliberation. Such players are said to be boundedly rational. In playing a game against an opponent with bounded rationality, the most effective strategy of a player is not the equilibrium strategy but the one that is the best reply to the opponent's strategy. The evolutionary model provides a means for studying the bargaining behaviour of boundedly rational players. This paper provides a comprehensive comparison of the game theoretic and evolutionary approaches to bargaining by examining their assumptions, goals, and limitations. We then study the implications of these differences from the perspective of the software agent developer

    Evolutionary Algorithms for Reinforcement Learning

    Full text link
    There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications
    corecore