On learning soccer strategies

By R. Salustowicz, M.A. Wiering and J. Schmidhuber

Abstract

We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and a policy but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively when goals are scored. We conduct simulations with varying team sizes and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Probabilistic Incremental Program Evolution (PIPE). TD-Q is based on evaluation functions (EFs) mapping input/action pairs to expected reward, while PIPE searches policy space directly. PIPE uses an adaptive probability distribution to synthesize programs that calculate action probabilities from current inputs. Our results show that TD-Q has difficulty learning appropriate shared EFs. PIPE, however, does not depend on EFs and finds good policies faster and more reliably.
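
To make the notion of an evaluation function concrete, the following is a minimal sketch of a generic one-step Q-learning update with a linear function approximator, where each action has its own weight vector and the value of an input/action pair is a dot product. It is an illustration under stated assumptions, not the paper's exact TD-Q variant: the function names, the learning rate `alpha`, the discount `gamma`, and the one-step target are all hypothetical choices made for this example.

```python
from typing import List


def q_values(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """Linear evaluation function: one weight vector per action (assumed layout)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def td_q_update(
    weights: List[List[float]],
    inputs: List[float],
    action: int,
    reward: float,
    next_inputs: List[float],
    alpha: float = 0.1,   # learning rate (illustrative value)
    gamma: float = 0.9,   # discount factor (illustrative value)
    terminal: bool = False,
) -> float:
    """Apply one Q-learning step in place and return the TD error."""
    q_sa = q_values(weights, inputs)[action]
    # One-step target: the reward alone at episode end, otherwise the reward
    # plus the discounted best value estimated for the next inputs.
    if terminal:
        target = reward
    else:
        target = reward + gamma * max(q_values(weights, next_inputs))
    td_error = target - q_sa
    # Gradient step for a linear model: only the chosen action's weights move.
    weights[action] = [
        w + alpha * td_error * x for w, x in zip(weights[action], inputs)
    ]
    return td_error


# Usage: two inputs, two actions, a single rewarded terminal step.
shared_weights = [[0.0, 0.0], [0.0, 0.0]]
error = td_q_update(shared_weights, [1.0, 0.0], action=1, reward=1.0,
                    next_inputs=[0.0, 1.0], terminal=True)
```

In the shared-policy setting described in the abstract, every agent on a team would read from and write to the same weights, which is one plausible reading of why learning a single shared EF can be difficult. PIPE avoids this step entirely because it searches policy space directly rather than estimating values.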

Topics: Wiskunde en Informatica (Mathematics and Computer Science)
Year: 1997
OAI identifier: oai:dspace.library.uu.nl:1874/25434