586 research outputs found

    Using Monte Carlo Search With Data Aggregation to Improve Robot Soccer Policies

    Full text link
    RoboCup soccer competitions are considered among the most challenging multi-robot adversarial environments, due to their high dynamism and the partial observability of the environment. In this paper we introduce a method based on a combination of Monte Carlo search and data aggregation (MCSDA) to adapt discrete-action soccer policies for a defender robot to the strategy of the opponent team. By exploiting a simple representation of the domain, a supervised learning algorithm is trained over an initial collection of data consisting of several simulations of human expert policies. Monte Carlo policy rollouts are then generated and aggregated with previous data to improve the learned policy over multiple epochs and games. The proposed approach has been extensively tested both on a soccer-dedicated simulator and on real robots. Using this method, our learning robot soccer team achieves an improvement in ball interceptions, as well as a reduction in the number of opponents' goals. Together with this better performance, the whole team also achieves a more efficient overall positioning within the field.
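
    As a rough illustration of the MCSDA loop described above (supervised initialization from expert data, Monte Carlo labelling of visited states, DAgger-style aggregation, retraining), the following self-contained Python sketch uses a toy transition and reward model. The 4-feature state, the step function, and the decision-tree classifier are all illustrative assumptions, not the paper's actual domain representation.

        import random
        from sklearn.tree import DecisionTreeClassifier

        ACTIONS = [0, 1, 2]  # toy discrete defender actions

        def step(state, action):
            # hypothetical transition and reward; a real system queries the simulator
            nxt = [s + 0.1 * (action - 1) + random.gauss(0, 0.1) for s in state]
            return nxt, -abs(nxt[0])  # e.g. negative distance to the ball

        def monte_carlo_action(state, policy, n_rollouts=8, horizon=5):
            # score each action with short policy rollouts; keep the best one
            def rollout(first_action):
                s, a, total = state, first_action, 0.0
                for _ in range(horizon):
                    s, r = step(s, a)
                    total += r
                    a = int(policy.predict([s])[0])
                return total
            return max(ACTIONS, key=lambda a: sum(rollout(a) for _ in range(n_rollouts)))

        def mcsda(X, y, n_epochs=5, states_per_epoch=50):
            # X, y: initial dataset from simulations of human expert policies
            policy = DecisionTreeClassifier().fit(X, y)
            for _ in range(n_epochs):
                for _ in range(states_per_epoch):
                    s = [random.uniform(-1, 1) for _ in range(4)]
                    X.append(s)
                    y.append(monte_carlo_action(s, policy))  # aggregate MC-labelled data
                policy = DecisionTreeClassifier().fit(X, y)  # retrain on the aggregated set
            return policy

        X0 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
        y0 = [random.choice(ACTIONS) for _ in X0]  # placeholder for expert labels
        policy = mcsda(X0, y0)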

    Behavior Acquisition in RoboCup Middle Size League Domain

    Get PDF

    Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

    Full text link
    We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. We first trained individual skills in isolation and then composed those skills end-to-end in a self-play setting. The resulting policy exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking, and more, and transitions between them in a smooth, stable, and efficient manner, well beyond what is intuitively expected from the robot. The agents also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. The full range of behaviors emerged from a small set of simple rewards. Our agents were trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer, despite significant unmodeled effects and variations across robot instances. Although the robots are inherently fragile, minor hardware modifications together with basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way. Indeed, even though the agents were optimized for scoring, in experiments they walked 156% faster, took 63% less time to get up, and kicked 24% faster than a scripted baseline, while efficiently combining the skills to achieve the longer-term objectives. Examples of the emergent behaviors and full 1v1 matches are available on the supplementary website. Comment: Project website: https://sites.google.com/view/op3-soccer
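
    The sim-to-real recipe mentioned above (high-frequency control, per-episode dynamics randomization, random perturbations during training) can be sketched in a few lines. The ToySim stub, the parameter names and ranges, and the 40 Hz rate below are illustrative assumptions standing in for a real physics simulator and RL learner, not the authors' actual setup.

        import random

        class ToySim:
            # stand-in for a physics simulator; real training would use one
            def __init__(self): self.params = {}
            def set_param(self, name, value): self.params[name] = value
            def reset(self): return [0.0] * 4
            def apply_push(self, force): pass  # external perturbation hook
            def step(self, action):
                obs = [random.gauss(0, 1) for _ in range(4)]
                return obs, random.random(), random.random() < 0.01  # obs, reward, done

        def train_episode(sim, policy, control_hz=40, seconds=10):
            # resample physical parameters every episode (dynamics randomization)
            sim.set_param("joint_friction", random.uniform(0.8, 1.2))
            sim.set_param("mass_scale", random.uniform(0.9, 1.1))
            obs = sim.reset()
            for _ in range(control_hz * seconds):  # high-frequency control loop
                if random.random() < 0.01:
                    sim.apply_push(force=random.uniform(5.0, 15.0))  # random shove
                obs, reward, done = sim.step(policy(obs))
                if done:
                    break

        train_episode(ToySim(), policy=lambda obs: [0.0] * 20)  # 20 actuated joints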

    Programming Robosoccer agents by modelling human behavior

    Get PDF
    The Robosoccer simulator is a challenging environment for artificial intelligence, where a human has to program a team of agents and introduce it into a soccer virtual environment. Usually, Robosoccer agents are programmed by hand. In some cases, agents make use of Machine Learning (ML) to adapt and predict the behavior of the opposite team, but the bulk of the agent has been preprogrammed. The main aim of this paper is to transform Robosoccer into an interactive game and let a human control a Robosoccer agent. ML techniques can then be used to model his/her behavior from training instances generated during play. This model is later used to control a Robosoccer agent, thus imitating the human behavior. We have focused our research on low-level behaviors, like looking for the ball, conducting the ball towards the goal, or scoring in the presence of opponent players. Results have shown that, indeed, Robosoccer agents can be controlled by programs that model human play.
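
    A compressed sketch of that pipeline, log (state, action) pairs while a human plays, fit a classifier, then let the model drive the agent, might look as follows. The three-feature state, the action set, and the k-nearest-neighbors model are illustrative assumptions, and toy_human_policy stands in for real key presses logged during interactive play.

        import random
        from sklearn.neighbors import KNeighborsClassifier

        def toy_human_policy(state):
            # stand-in for the human's logged decisions during interactive play
            ball_dist, ball_angle, goal_dist = state
            if abs(ball_angle) > 0.5: return "turn_to_ball"
            if ball_dist > 0.2:       return "dash_to_ball"
            if goal_dist > 0.3:       return "dribble"
            return "shoot"

        # 1) record (state, action) training instances generated during play
        X = [[random.random(), random.uniform(-1, 1), random.random()] for _ in range(500)]
        y = [toy_human_policy(s) for s in X]

        # 2) fit a model of the human's low-level behavior
        model = KNeighborsClassifier(n_neighbors=5).fit(X, y)

        # 3) the model now controls the agent: state in, imitated action out
        print(model.predict([[0.5, 0.1, 0.6]]))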

    Pyrus Base: An Open Source Python Framework for the RoboCup 2D Soccer Simulation

    Full text link
    Soccer, also known as football in some parts of the world, involves two teams of eleven players whose objective is to score more goals than the opposing team. To simulate this game and attract scientists from all over the world to conduct research and participate in an annual computer-based soccer world cup, Soccer Simulation 2D (SS2D) was one of the leagues initiated in the RoboCup competition. In every SS2D game, two teams of 11 players and one coach connect to the RoboCup Soccer Simulation Server and compete against each other. Over the past few years, several C++ base codes have been employed to control agents' behavior and their communication with the server. Although C++ base codes have laid the foundation for SS2D, developing them requires an advanced level of C++ programming, and the complexity of the language is a limiting disadvantage for all users, especially beginners. To overcome the challenges of C++ base codes and provide a powerful baseline for developing machine learning concepts, we introduce Pyrus, the first Python base code for SS2D. Pyrus is developed to encourage researchers to efficiently develop their ideas and integrate machine learning algorithms into their teams. The Pyrus base is open source and publicly available under the MIT License on GitHub.
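
    Whatever the base code, every SS2D client talks to the RoboCup Soccer Simulation Server by exchanging S-expression messages over UDP; this raw protocol is what a base like Pyrus wraps. The minimal sketch below follows the standard rcssserver conventions (port 6000, init handshake, dash/turn/kick commands); the team name and actions are illustrative, and Pyrus's actual API will differ, so consult its GitHub repository for the real interface.

        import socket

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(b"(init Pyrus (version 15))", ("localhost", 6000))  # join as a player
        msg, server = sock.recvfrom(8192)      # server replies from a new port
        print(msg.decode())                    # e.g. (init l 1 before_kick_off)

        sock.sendto(b"(dash 80 0)", server)    # accelerate forward
        sock.sendto(b"(turn 30)", server)      # turn 30 degrees
        sock.sendto(b"(kick 100 0)", server)   # kick at full power, straight ahead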

    A plan classifier based on Chi-square distribution tests

    Get PDF
    To make good decisions in a social context, humans often need to recognize the plan underlying the behavior of others, and make predictions based on this recognition. This process, when carried out by software agents or robots, is known as plan recognition, or agent modeling. Most existing techniques for plan recognition assume the availability of carefully hand-crafted plan libraries, which encode the a priori known behavioral repertoire of the observed agents; at run-time, plan recognition algorithms match the observed behavior of the agents against the plan libraries, and matches are reported as hypotheses. Unfortunately, techniques for automatically acquiring plan libraries from observations, e.g., by learning or data mining, are only beginning to emerge. We present an approach for automatically creating the model of an agent's behavior based on the observation and analysis of its atomic behaviors. In this approach, observations of an agent's behavior are transformed into a sequence of atomic behaviors (events). This stream is analyzed to obtain the corresponding behavior model, represented by a distribution of relevant events. Once the model has been created, the proposed approach provides a method, based on a statistical test, for classifying an observed behavior. Therefore, in this research, the problem of behavior classification is examined as a problem of learning to characterize the behavior of an agent in terms of sequences of atomic behaviors. The experimental results of this paper show that a system based on our approach can efficiently recognize different behaviors in different domains, in particular UNIX command-line data and RoboCup soccer simulation. This work has been partially supported by the Spanish Government under project TRA2007-67374-C02-0
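
    The classification step lends itself to a short sketch: compare the event distribution of an observed trace against each stored behavior model with a chi-square goodness-of-fit test and report the best non-rejected match. The event labels, the relative-frequency model format, and the 0.05 threshold below are illustrative assumptions, not the paper's exact procedure.

        from collections import Counter
        from scipy.stats import chisquare

        def classify(observed_events, models, alpha=0.05):
            # models: {name: {event: relative_frequency}} learned from past traces
            counts = Counter(observed_events)
            n = len(observed_events)
            best_name, best_p = None, alpha
            for name, dist in models.items():
                events = sorted(set(counts) | set(dist))
                f_obs = [counts.get(e, 0) for e in events]
                f_exp = [max(dist.get(e, 0.0) * n, 1e-9) for e in events]
                scale = sum(f_obs) / sum(f_exp)          # chisquare needs equal totals
                _, p_value = chisquare(f_obs, [f * scale for f in f_exp])
                if p_value > best_p:                      # not rejected: plausible model
                    best_name, best_p = name, p_value
            return best_name

        models = {"attacker": {"dash": 0.6, "kick": 0.4},
                  "goalie": {"catch": 0.5, "dash": 0.5}}
        print(classify(["dash", "kick", "dash", "dash", "kick"], models))  # -> attacker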