27 research outputs found

Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions

Author: Leottau David L.
Lobos-Tsunekawa Kenzo
Ruiz-del-Solar Javier
Publication date: 01/01/2017

This paper addresses the design and implementation of complex Reinforcement Learning (RL) behaviors where multi-dimensional action spaces are involved, as well as the need to execute the behaviors in real time using robotic platforms with limited computational resources and training times. For this purpose, we propose the use of decentralized RL, in combination with finite support basis functions as alternatives to Gaussian RBF, in order to alleviate the effects of the curse of dimensionality on the action and state spaces respectively, and to reduce the computation time. As a testbed, an RL-based controller for the in-walk kick in NAO robots, a challenging and critical problem for soccer robotics, is used. The reported experiments show empirically that our solution saves up to 99.94% of execution time and 98.82% of memory consumption during execution, without diminishing performance compared to classical approaches. Comment: Accepted in the RoboCup Symposium 2017. Final version will be published at Springer.

arXiv.org e-Print Archive

Repositorio Académico de la Universidad de Chile
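
To illustrate the idea in the abstract above, here is a minimal sketch contrasting a Gaussian RBF with a finite support basis function. The triangular shape, the centers, the widths, and the function names are assumptions chosen for illustration; the listing does not say which finite support functions the paper uses.

```python
import math

def gaussian_rbf(x, center, sigma):
    # Gaussian RBF: strictly positive everywhere, so every feature is "active".
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def triangular_basis(x, center, width):
    # Hypothetical finite support basis: exactly zero outside [center - width, center + width].
    return max(0.0, 1.0 - abs(x - center) / width)

# Assumed setup: 11 evenly spaced centers on [0, 1] and one query state.
centers = [i * 0.1 for i in range(11)]
x = 0.42

gauss = [gaussian_rbf(x, c, 0.1) for c in centers]
tri = [triangular_basis(x, c, 0.1) for c in centers]

# Count the features that contribute to a linear value estimate at x.
active_gauss = sum(1 for v in gauss if v > 0.0)
active_tri = sum(1 for v in tri if v > 0.0)

print(active_gauss, active_tri)
```

In this sketch only the two basis functions adjacent to the query point are non-zero, so a linear value estimate needs to touch two weights rather than all of them. That is one plausible source of the kind of savings in execution time and memory that the abstract reports.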

Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations

Author: Fujii Keisuke
Kawahara Yoshinobu
Nakahara Hiroshi
Scott Atom
Takeishi Naoya
Tsutsui Kazushi
Publication date: 26/05/2023

Modeling of real-world biological multi-agents is a fundamental problem in various scientific and engineering fields. Reinforcement learning (RL) is a powerful framework to generate flexible and diverse behaviors in cyberspace; however, when modeling real-world biological multi-agents, there is a domain gap between behaviors in the source (i.e., real-world data) and the target (i.e., cyberspace for RL), and the source environment parameters are usually unknown. In this paper, we propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios. We adopt an approach that combines RL and supervised learning by selecting actions of demonstrations in RL based on the minimum distance of dynamic time warping for utilizing the information of the unknown source dynamics. This approach can be easily applied to many existing neural network architectures and provides us with an RL model balanced between reproducibility as imitation and generalization ability to obtain rewards in cyberspace. In the experiments, using chase-and-escape and football tasks with different dynamics between the unknown source and target environments, we show that our approach achieved a balance between the reproducibility and the generalization ability compared with the baselines. In particular, we used the tracking data of professional football players as expert demonstrations in football and show successful performances despite the larger gap between behaviors in the source and target environments than the chase-and-escape task. Comment: 14 pages, 5 figures

arXiv.org e-Print Archive
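
The abstract above selects demonstration actions by minimum dynamic time warping (DTW) distance. The following is a minimal sketch of the textbook DTW recurrence on 1-D sequences, plus a hypothetical helper `nearest_demonstration` that picks the closest demonstration. Both names and the scalar trajectories are assumptions for illustration, not the authors' implementation.

```python
import math

def dtw_distance(a, b):
    # Textbook O(len(a) * len(b)) dynamic time warping between two 1-D sequences.
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def nearest_demonstration(agent_traj, demos):
    # Hypothetical helper: index of the demonstration with minimum DTW distance.
    distances = [dtw_distance(agent_traj, d) for d in demos]
    return min(range(len(demos)), key=distances.__getitem__)

demos = [[5, 5, 5], [1, 2, 2, 3]]
print(nearest_demonstration([1, 2, 3], demos))
```

Because DTW aligns sequences that differ in timing, the trajectory `[1, 2, 3]` matches `[1, 2, 2, 3]` at zero cost in this sketch. That property is the reason DTW is a natural fit when source and target dynamics differ.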

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning

Author: Chen Jiayu
Fang Fei
Li Yunfei
Song Jiaming
Wang Yu
Wu Yi
Xu Zelai
Yang Huazhong
Yu Chao
Publication date: 16/12/2023

Learning Nash equilibrium (NE) in complex zero-sum games with multi-agent reinforcement learning (MARL) can be extremely computationally expensive. Curriculum learning is an effective way to accelerate learning, but an under-explored dimension for generating a curriculum is the difficulty-to-learn of the subgames -- games induced by starting from a specific state. In this work, we present a novel subgame curriculum learning framework for zero-sum games. It adopts an adaptive initial state distribution by resetting agents to some previously visited states where they can quickly learn to improve performance. Building upon this framework, we derive a subgame selection metric that approximates the squared distance to NE values and further adopt a particle-based state sampler for subgame generation. Integrating these techniques leads to our new algorithm, Subgame Automatic Curriculum Learning (SACL), which is a realization of the subgame curriculum learning framework. SACL can be combined with any MARL algorithm such as MAPPO. Experiments in the particle-world environment and Google Research Football environment show SACL produces much stronger policies than baselines. In the challenging hide-and-seek quadrant environment, SACL produces all four emergent stages and uses only half the samples of MAPPO with self-play. The project website is at https://sites.google.com/view/sacl-rl

arXiv.org e-Print Archive

Hypernetworks Analysis of RoboCup Interactions

Author: Rossi Ruggero
Publication date: 20/05/2022

Robotic soccer simulations are controlled environments in which the rich variety of interactions among agents makes them good candidates to be studied as complex adaptive systems. The challenge is to create an autonomous team of soccer agents that can adapt and improve its behaviour as it plays other teams. By analogy with chess, the movements of the soccer agents and the ball form ever-changing networks as players in one team form structures that give their team an advantage. For example, the Defender’s Dilemma involves relationships between an attacker with the ball, a team-mate and a defender. The defender must choose between tackling the player with the ball, or taking a position to intercept a pass to the other attacker. Since these structures involve more than two interacting entities it is necessary to go beyond networks to multidimensional hypernetworks. In this context, this thesis investigates (i) is it possible to identify patterns of play that lead a team to obtain an advantage? (ii) is it possible to forecast with a good degree of accuracy if a certain game action or sequence of game actions is going to be successful, before it has been completed? and (iii) is it possible to make behavioural patterns emerge in the game without specifying the behavioural rules in detail? To investigate these research questions we devised two methods to analyse the interactions between robotic players, one based on traditional programming and one based on Deep Learning. The first method identified thousands of Defender’s Dilemma configurations from RoboCup 2D simulator games and found a statistically significant association between winning and the creation of the defender’s dilemma by the attackers of the winning team. The second method showed that a feedforward Artificial Neural Network trained on thousands of games can take as input the current game configuration and forecast to a high degree of accuracy if the current action will end up in a goal or not.
Finally, we designed our own fast and simple robotic soccer simulator for investigating Reinforcement Learning. This showed that Reinforcement Learning using Proximal Policy Optimization could train two agents in the task of scoring a goal, using only basic actions without using pre-built hand-programmed skills. These experiments provide evidence that it is possible: to identify advantageous patterns of play; to forecast if an action or sequence of actions will be successful; and to make behavioural patterns emerge in the game without specifying the behavioural rules in detail.

Open Research Online (The Open University)

Multiagent reactive plan application learning in dynamic environments

Author: Costas Tsatsoulis
Hüseyin Sevay
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004

Crossref

Motion Synthesis and Control for Autonomous Agents using Generative Models and Reinforcement Learning

Author: Xu Pei
Publication venue: Clemson University Libraries
Publication date: 01/08/2023

Imitating and predicting human motions have wide applications in both graphics and robotics, from developing realistic models of human movement and behavior in immersive virtual worlds and games to improving autonomous navigation for service agents deployed in the real world. Traditional approaches for motion imitation and prediction typically rely on pre-defined rules to model agent behaviors or use reinforcement learning with manually designed reward functions. Despite impressive results, such approaches cannot effectively capture the diversity of motor behaviors and the decision making capabilities of human beings. Furthermore, manually designing a model or reward function to explicitly describe human motion characteristics often involves laborious fine-tuning and repeated experiments, and may suffer from generalization issues. In this thesis, we explore data-driven approaches using generative models and reinforcement learning to study and simulate human motions. Specifically, we begin with motion synthesis and control of physically simulated agents imitating a wide range of human motor skills, and then focus on improving the local navigation decisions of autonomous agents in multi-agent interaction settings. For physics-based agent control, we introduce an imitation learning framework built upon generative adversarial networks and reinforcement learning that enables humanoid agents to learn motor skills from a few examples of human reference motion data. Our approach generates high-fidelity motions and robust controllers without needing to manually design and finetune a reward function, allowing at the same time interactive switching between different controllers based on user input. Based on this framework, we further propose a multi-objective learning scheme for composite and task-driven control of humanoid agents. 
Our multi-objective learning scheme balances the simultaneous learning of disparate motions from multiple reference sources and multiple goal-directed control objectives in an adaptive way, enabling the training of efficient composite motion controllers. Additionally, we present a general framework for fast and robust learning of motor control skills. Our framework exploits particle filtering to dynamically explore and discretize the high-dimensional action space involved in continuous control tasks, and provides a multi-modal policy as a substitute for the commonly used Gaussian policies. For navigation learning, we leverage human crowd data to train a human-inspired collision avoidance policy by combining knowledge distillation and reinforcement learning. Our approach enables autonomous agents to take human-like actions during goal-directed steering in fully decentralized, multi-agent environments. To inform better control in such environments, we propose SocialVAE, a variational autoencoder based architecture that uses timewise latent variables with socially-aware conditions and a backward posterior approximation to perform agent trajectory prediction. Our approach improves current state-of-the-art performance on trajectory prediction tasks in daily human interaction scenarios and more complex scenes involving interactions between NBA players. We further extend SocialVAE by exploiting semantic maps as context conditions to generate map-compliant trajectory prediction. Our approach processes context conditions and social conditions occurring during agent-agent interactions in an integrated manner through the use of a dual-attention mechanism. We demonstrate the real-time performance of our approach and its ability to provide high-fidelity, multi-modal predictions on various large-scale vehicle trajectory prediction tasks

Clemson University: TigerPrints

Scaling multi-agent reinforcement learning to eleven aside simulated robot soccer

Author: Smit Andries
Publication date: 01/12/2022
Field of study

Electrical and Electronic Engineering

Stellenbosch University SUNScholar Repository

Robotics 2010

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Without a doubt, robotics has made incredible progress over the last decades. The vision of developing, designing and creating technical systems that help humans to achieve hard and complex tasks has led to an incredible variety of solutions. There are barely any technical fields that exhibit more interdisciplinary interconnections than robotics. This is a consequence of the highly complex challenges imposed by robotic systems, especially the requirement of intelligent and autonomous operation. This book tries to give an insight into the evolutionary process that takes place in robotics. It provides articles covering a wide range of this exciting area. The progress of technical challenges and concepts may illuminate the relationship between developments that seem to be completely different at first sight. Robotics remains an exciting scientific and engineering field. The community looks optimistically ahead and looks forward to future challenges and new developments.

Directory of Open Access Books (DOAB)

Talent Identification and Development in Sports Performance

Author: Calleja-Gonzalez Julio
Calvo Alberto
Cumming Sean
Gonçalves Bruno
Leite Nuno
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

The identification and development of talent have always been a relevant topic in sports performance. In fact, a significant body of research is available worldwide discussing this longitudinal process, the qualities that underpin elite sports performance, and how coaches can facilitate the developmental process of talented athletes. Despite the continued interest given to issues of talent identification and development, recent literature highlights the low predictive value of applied and theoretical talent identification models. Talent is the expression of a complex and multidimensional phenomenon, where, despite the existing practical recommendations, many coaches and stakeholders continue to fail to adequately value the distinction between growth, maturation, and training age. Technological resources have enabled important advances; however, these have been limited essentially to defining or validating motor skills variables or genetic markers that characterize the most talented athletes. Emerging technological resources and recent methodological advances are enabling integrated assessment and monitoring to include maturational, physiological, biomechanical, and perceptual skills while also creating optimal environments for performance and dealing with injury prevention and recovery.

Repositório Científico da Universidade de Évora

The frequency of falls in children judo training

Author: Reguli Zdenko
Vít Michal
Publication venue: 'Faculty of Kinesiology, University of Zagreb'
Publication date: 01/01/2017

Purpose: Falling techniques are an inseparable part of youth judo training. Falling techniques are related to injury-prevention exercises (Nauta et al., 2013). There is little good evidence about how frequently children fall during training. Methods: 26 children (age 8.88±1.88) were video-recorded during ten training sessions for further indirect observation and performance analysis. Results: The research protocol consisted of recording falls and falling techniques (Reguli et al., 2015) in the warm-up, combat games, falling techniques, throwing techniques and free fighting (randori) parts of the training session. While children were taught almost exclusively the forward slapping roll, backward slapping roll and sideward direct slapping fall, other types of falls, such as the forward fall onto the knees, occurred naturally in other parts of training. Conclusions: Judo coaches should also emphasize teaching unorthodox falls in addition to the standard judo curriculum (Koshida et al., 2014). Various falling games that teach children safe falling in different conditions should be incorporated into judo training. Further research is needed to gain more data from groups of different ages in various combat and non-combat sports.

Univerzitní repozitář Masarykovy univerzity