586 research outputs found
Using Monte Carlo Search With Data Aggregation to Improve Robot Soccer Policies
RoboCup soccer competitions are considered among the most challenging
multi-robot adversarial environments, due to their high dynamism and the
partial observability of the environment. In this paper we introduce a method
based on a combination of Monte Carlo search and data aggregation (MCSDA) to
adapt discrete-action soccer policies for a defender robot to the strategy of
the opponent team. By exploiting a simple representation of the domain, a
supervised learning algorithm is trained over an initial collection of data
consisting of several simulations of human expert policies. Monte Carlo policy
rollouts are then generated and aggregated to previous data to improve the
learned policy over multiple epochs and games. The proposed approach has been
extensively tested both on a soccer-dedicated simulator and on real robots.
Using this method, our learning robot soccer team achieves an improvement in
ball interceptions, as well as a reduction in the number of opponents' goals.
Alongside this improved performance, the method also yields a more efficient
positioning of the whole team within the field.
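The aggregation loop described above (train on expert demonstrations, generate Monte Carlo rollouts, aggregate them back into the dataset, retrain) can be sketched in a DAgger-like form. This is a minimal toy sketch, not the paper's implementation: the action names are hypothetical, the simulator and rollout policy are abstracted into a `simulate` callback, and the classifier is a trivial tabular one:

```python
from collections import Counter, defaultdict

ACTIONS = ["intercept", "hold", "retreat"]  # hypothetical discrete actions


def train_policy(dataset):
    """Fit a simple tabular classifier: most frequent action per state."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}


def mc_label(simulate, states, n_samples=5):
    """Label each state with the action whose average sampled return is best."""
    return [(s, max(ACTIONS,
                    key=lambda a: sum(simulate(s, a) for _ in range(n_samples))))
            for s in states]


def mcsda(expert_data, simulate, states, epochs=3):
    """Aggregate Monte Carlo rollout labels with expert data over epochs."""
    dataset = list(expert_data)
    for _ in range(epochs):
        dataset += mc_label(simulate, states)  # aggregate rollout data
    return train_policy(dataset)
```

Over epochs, rollout-derived labels outvote the initial expert labels wherever the Monte Carlo estimates disagree with them, which is the aggregation effect the abstract describes.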
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
We investigate whether Deep Reinforcement Learning (Deep RL) is able to
synthesize sophisticated and safe movement skills for a low-cost, miniature
humanoid robot that can be composed into complex behavioral strategies in
dynamic environments. We used Deep RL to train a humanoid robot with 20
actuated joints to play a simplified one-versus-one (1v1) soccer game. We first
trained individual skills in isolation and then composed those skills
end-to-end in a self-play setting. The resulting policy exhibits robust and
dynamic movement skills such as rapid fall recovery, walking, turning, kicking
and more; and transitions between them in a smooth, stable, and efficient
manner - well beyond what is intuitively expected from the robot. The agents
also developed a basic strategic understanding of the game, and learned, for
instance, to anticipate ball movements and to block opponent shots. The full
range of behaviors emerged from a small set of simple rewards. Our agents were
trained in simulation and transferred to real robots zero-shot. We found that a
combination of sufficiently high-frequency control, targeted dynamics
randomization, and perturbations during training in simulation enabled
good-quality transfer, despite significant unmodeled effects and variations
across robot instances. Although the robots are inherently fragile, minor
hardware modifications together with basic regularization of the behavior
during training led the robots to learn safe and effective movements while
still performing in a dynamic and agile way. Indeed, even though the agents
were optimized for scoring, in experiments they walked 156% faster, took 63%
less time to get up, and kicked 24% faster than a scripted baseline, while
efficiently combining the skills to achieve the longer term objectives.
Examples of the emergent behaviors and full 1v1 matches are available on the
supplementary website (project website: https://sites.google.com/view/op3-socce).
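The targeted dynamics randomization that the abstract credits for zero-shot transfer amounts to resampling physical parameters each training episode. The sketch below is illustrative only; the parameter names and ranges are assumptions, not the paper's actual settings:

```python
import random

# Illustrative randomization ranges (assumed, not the paper's values).
RANDOMIZATION = {
    "joint_friction": (0.8, 1.2),  # multiplicative scale on friction
    "body_mass":      (0.9, 1.1),  # multiplicative scale on link masses
    "latency_steps":  (0, 2),      # control delay, in timesteps
}


def sample_dynamics(rng=random):
    """Draw one set of physics parameters for a training episode."""
    params = {}
    for name, (lo, hi) in RANDOMIZATION.items():
        if isinstance(lo, int) and isinstance(hi, int):
            params[name] = rng.randint(lo, hi)  # discrete parameter
        else:
            params[name] = rng.uniform(lo, hi)  # continuous parameter
    return params
```

A policy trained across many such draws cannot overfit any one simulated robot, which is what makes transfer robust to unmodeled effects and variation across physical robot instances.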
Programming Robosoccer agents by modelling human behavior
The Robosoccer simulator is a challenging environment for artificial intelligence, where a human has to program a team of agents and introduce it into a virtual soccer environment. Usually, Robosoccer agents are programmed by hand. In some cases, agents make use of machine learning (ML) to adapt to and predict the behavior of the opposing team, but the bulk of the agent is preprogrammed. The main aim of this paper is to transform Robosoccer into an interactive game and let a human control a Robosoccer agent. ML techniques can then be used to model his or her behavior from training instances generated during play. This model is later used to control a Robosoccer agent, thus imitating the human behavior. We have focused our research on low-level behaviors, such as looking for the ball, conducting the ball towards the goal, or scoring in the presence of opponent players. Results have shown that Robosoccer agents can indeed be controlled by programs that model human play.
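The imitation setup described here (record perception/command pairs while a human plays, then fit a model that reproduces the human's choices) can be sketched with a toy nearest-neighbour classifier. The feature vectors and command names below are hypothetical, not the paper's actual representation:

```python
import math

# Hypothetical training instances recorded during human play:
# ((ball_distance, ball_angle), low-level command)
traces = [
    ((0.5, 0.0), "dash"),
    ((3.0, 1.2), "turn"),
    ((0.1, 0.0), "kick"),
]


def nearest_neighbour(model, features):
    """Replay the command from the recorded situation closest to the current one."""
    return min(model, key=lambda ex: math.dist(ex[0], features))[1]


# At run-time the agent queries the model with its current perception.
command = nearest_neighbour(traces, (0.2, 0.1))
```

Any classifier trained on the same (features, command) pairs would fill the same role; nearest-neighbour is just the shortest illustration of "control the agent with a model of the human".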
Pyrus Base: An Open Source Python Framework for the RoboCup 2D Soccer Simulation
Soccer, also known as football in some parts of the world, involves two teams
of eleven players whose objective is to score more goals than the opposing
team. To simulate this game and attract scientists from all over the world to
conduct research and participate in an annual computer-based soccer world cup,
Soccer Simulation 2D (SS2D) was one of the leagues initiated in the RoboCup
competition. In every SS2D game, two teams of 11 players and one coach connect
to the RoboCup Soccer Simulation Server and compete against each other. Over
the past few years, several C++ base codes have been employed to control
agents' behavior and their communication with the server. Although these C++
base codes have laid the foundation for SS2D, developing them requires an
advanced level of C++ programming, and the language's complexity is a barrier
for all users, especially beginners. To overcome these challenges and provide a
powerful baseline for
developing machine learning concepts, we introduce Pyrus, the first Python base
code for SS2D. Pyrus is developed to encourage researchers to efficiently
develop their ideas and integrate machine learning algorithms into their teams.
The Pyrus base is open source and publicly available on GitHub under the MIT
License.
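Messages exchanged with the RoboCup Soccer Simulation Server are s-expressions, e.g. a visual sensor message like `(see 0 ((b) 12.3 7))`. A minimal parser of that format (independent of Pyrus's actual API, which this sketch does not claim to reproduce) might look like:

```python
import re


def parse_sexp(msg):
    """Parse a server message like '(see 0 ((b) 12.3 7))' into nested lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", msg)

    def read(pos):
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == "(":
                sub, pos = read(pos + 1)   # recurse into a sub-expression
                items.append(sub)
            elif tok == ")":
                return items, pos + 1      # close the current expression
            else:
                items.append(tok)          # atom: keep as a string
                pos += 1
        return items, pos

    tree, _ = read(0)
    return tree[0] if len(tree) == 1 else tree
```

Parsing these messages into Python lists is the first step any SS2D base code must take before higher-level decision making, which is where a Python base lowers the barrier compared with the C++ equivalents.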
A plan classifier based on Chi-square distribution tests
To make good decisions in a social context, humans often need to recognize the plan underlying the behavior of
others, and make predictions based on this recognition. This process, when carried out by software agents or robots, is known
as plan recognition, or agent modeling. Most existing techniques for plan recognition assume the availability of carefully
hand-crafted plan libraries, which encode the a priori known behavioral repertoire of the observed agents; during run-time,
plan recognition algorithms match the observed behavior of the agents against the plan-libraries, and matches are reported
as hypotheses. Unfortunately, techniques for automatically acquiring plan-libraries from observations, e.g., by learning or
data-mining, are only beginning to emerge.
We present an approach for automatically creating a model of an agent's behavior based on the observation and
analysis of its atomic behaviors. In this approach, observations of the agent's behavior are transformed into a
sequence of atomic behaviors (events). This stream is analyzed to derive the corresponding behavior model,
represented by a distribution of relevant
events. Once the model has been created, the approach provides a method based on a statistical test for classifying
an observed behavior. In this research, the problem of behavior classification is therefore examined as a problem of
learning to characterize an agent's behavior in terms of sequences of atomic behaviors. The experimental results show
that a system based on our approach can efficiently recognize different behaviors in different domains, in particular
UNIX command-line data and RoboCup soccer simulation. This work has been partially supported by the Spanish Government under project TRA2007-67374-C02-0.
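The classification idea (compare the distribution of observed atomic events against each learned behavior model with a chi-square test) can be sketched as follows. The event names and model distributions are hypothetical placeholders:

```python
def chi_square(observed, expected_probs):
    """Pearson chi-square statistic of observed event counts vs a model distribution."""
    n = sum(observed.values())
    stat = 0.0
    for event, p in expected_probs.items():
        exp = n * p                       # expected count under this model
        obs = observed.get(event, 0)      # observed count (0 if event unseen)
        stat += (obs - exp) ** 2 / exp
    return stat


def classify(observed, models):
    """Assign the observed behavior to the model with the best (lowest) fit."""
    return min(models, key=lambda name: chi_square(observed, models[name]))
```

In a full system, the statistic would also be compared against a chi-square critical value for the chosen significance level, so that behavior matching no known model can be rejected rather than forced into the nearest class.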