Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients
While neuroevolution (evolving neural networks) has a successful track record
across a variety of domains from reinforcement learning to artificial life, it
is rarely applied to large, deep neural networks. A central reason is that
while random mutation generally works in low dimensions, a random perturbation
of thousands or millions of weights is likely to break existing functionality,
providing no learning signal even if some individual weight changes were
beneficial. This paper proposes a solution by introducing a family of safe
mutation (SM) operators that aim within the mutation operator itself to find a
degree of change that does not alter network behavior too much, but still
facilitates exploration. Importantly, these SM operators do not require any
additional interactions with the environment. The most effective SM variant
capitalizes on the intriguing opportunity to scale the degree of mutation of
each individual weight according to the sensitivity of the network's outputs to
that weight, which requires computing the gradient of outputs with respect to
the weights (instead of the gradient of error, as in conventional deep
learning). This safe mutation through gradients (SM-G) operator dramatically
increases the ability of a simple genetic algorithm-based neuroevolution method
to find solutions in high-dimensional domains that require deep and/or
recurrent neural networks (which tend to be particularly brittle to mutation),
including domains that require processing raw pixels. By improving our ability
to evolve deep neural networks, this new safer approach to mutation expands the
scope of domains amenable to neuroevolution.
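The idea of scaling each weight's mutation by the sensitivity of the network's outputs to that weight can be illustrated with a minimal sketch. This is not the paper's implementation: the one-layer tanh network, the function names (`forward`, `output_sensitivity`, `safe_mutate`), the `sigma` and `floor` parameters, and the use of finite differences in place of backpropagation are all assumptions made for illustration.

```python
import math
import random


def forward(weights, x):
    """Hypothetical one-layer tanh network; `weights` holds one row per output."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def output_sensitivity(weights, inputs, eps=1e-5):
    """Estimate, per weight, how strongly the outputs react to that weight.

    Central finite differences stand in for backpropagation here; the result
    is the root-mean-square output gradient over the sample inputs.
    """
    sens = [[0.0 for _ in row] for row in weights]
    for k, row in enumerate(weights):
        for j in range(len(row)):
            original = row[j]
            total = 0.0
            for x in inputs:
                row[j] = original + eps
                up = forward(weights, x)
                row[j] = original - eps
                down = forward(weights, x)
                total += sum(((u - d) / (2 * eps)) ** 2 for u, d in zip(up, down))
            row[j] = original  # restore the weight before moving on
            sens[k][j] = math.sqrt(total / len(inputs))
    return sens


def safe_mutate(weights, inputs, sigma=0.1, floor=1e-3, rng=random):
    """Gaussian mutation in which sensitive weights receive smaller steps.

    `floor` (an assumed safeguard) caps the step size of weights whose
    estimated sensitivity is close to zero.
    """
    sens = output_sensitivity(weights, inputs)
    return [
        [w + rng.gauss(0.0, sigma) / max(s, floor) for w, s in zip(row, srow)]
        for row, srow in zip(weights, sens)
    ]
```

The sensitivities are computed from stored input samples only, so no additional interaction with the environment is needed, which matches the property the abstract highlights.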
Adaptive Investment Strategies For Periodic Environments
In this paper, we present an adaptive investment strategy for environments
with periodic returns on investment. In our approach, we consider an investment
model where the agent decides at every time step the proportion of wealth to
invest in a risky asset, keeping the rest of the budget in a risk-free asset.
Every investment is evaluated in the market via a stylized return on investment
function (RoI), which is modeled by a stochastic process with unknown
periodicities and levels of noise. For comparison, we present two
reference strategies which represent the case of agents with zero-knowledge and
complete-knowledge of the dynamics of the returns. We consider also an
investment strategy based on technical analysis to forecast the next return by
fitting a trend line to previously received returns. To assess the
performance of the different strategies, we perform computer experiments
to calculate the average budget that can be obtained with them over a certain
number of time steps. To ensure fair comparisons, we first tune the
parameters of each strategy. Afterwards, we compare the performance of these
strategies for RoIs with different periodicities and levels of noise.
Comment: Paper submitted to Advances in Complex Systems (November 2007), 22
pages, 9 figures
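The trend-line strategy described in this abstract can be sketched roughly as follows. Everything specific here is an assumption for illustration, not taken from the paper: the sinusoidal-plus-Gaussian RoI, the multiplicative budget update `budget *= 1 + r * q`, the rule "invest a fixed proportion when the forecast is positive", and all function and parameter names.

```python
import math
import random


def periodic_roi(t, period=20, amplitude=0.5, noise=0.1, rng=random):
    """Assumed stylized return: a sine wave with additive Gaussian noise."""
    return amplitude * math.sin(2 * math.pi * t / period) + rng.gauss(0.0, noise)


def trend_forecast(history):
    """Fit a least-squares line to `history` and extrapolate one step ahead."""
    n = len(history)
    if n < 2:
        return 0.0  # not enough data to fit a line
    mean_t = (n - 1) / 2
    mean_r = sum(history) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in enumerate(history))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return mean_r + slope * (n - mean_t)


def simulate(steps=200, window=5, q_invest=0.5, seed=0):
    """Run the hypothetical trend-following agent and return its final budget."""
    rng = random.Random(seed)
    budget = 1.0
    history = []
    for t in range(steps):
        forecast = trend_forecast(history[-window:])
        # Proportion of wealth placed in the risky asset at this time step.
        q = q_invest if forecast > 0 else 0.0
        r = periodic_roi(t, rng=rng)
        budget *= 1.0 + r * q  # the remainder sits in a zero-return safe asset
        history.append(r)
    return budget
```

Averaging `simulate` over many seeds for different `period` and `noise` values would mirror the kind of comparison the abstract describes, with `window` and `q_invest` being the parameters one would tune first.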
A comparative study of game theoretic and evolutionary models for software agents
Most of the existing work in the study of bargaining behaviour uses techniques from game theory. Game theoretic models for bargaining assume that players are perfectly rational and that this rationality is common knowledge. However, the perfect rationality assumption does not hold for real-life bargaining scenarios with humans as players, since results from experimental economics show that humans find their way to the best strategy through trial and error, and not typically by means of rational deliberation. Such players are said to be boundedly rational. In playing a game against an opponent with bounded rationality, the most effective strategy of a player is not the equilibrium strategy but the one that is the best reply to the opponent's strategy. The evolutionary model provides a means for studying the bargaining behaviour of boundedly rational players. This paper provides a comprehensive comparison of the game theoretic and evolutionary approaches to bargaining by examining their assumptions, goals, and limitations. We then study the implications of these differences from the perspective of the software agent developer.
Evolutionary Algorithms for Reinforcement Learning
There are two distinct approaches to solving reinforcement learning problems,
namely, searching in value function space and searching in policy space.
Temporal difference methods and evolutionary algorithms are well-known examples
of these approaches. Kaelbling, Littman and Moore recently provided an
informative survey of temporal difference methods. This article focuses on the
application of evolutionary algorithms to the reinforcement learning problem,
emphasizing alternative policy representations, credit assignment methods, and
problem-specific genetic operators. Strengths and weaknesses of the
evolutionary approach to reinforcement learning are presented, along with a
survey of representative applications.