608 research outputs found

    Boosting Studies of Multi-Agent Reinforcement Learning on Google Research Football Environment: the Past, Present, and Future

    Full text link
    Even though Google Research Football (GRF) was initially benchmarked and studied as a single-agent environment in its original paper, recent years have witnessed an increasing focus on its multi-agent nature by researchers utilizing it as a testbed for Multi-Agent Reinforcement Learning (MARL). However, the absence of standardized environment settings and unified evaluation metrics for multi-agent scenarios hampers the consistent understanding of various studies. Furthermore, the challenging 5-vs-5 and 11-vs-11 full-game scenarios have received limited thorough examination due to their substantial training complexities. To address these gaps, this paper extends the original environment by not only standardizing the environment settings and benchmarking cooperative learning algorithms across different scenarios, including the most challenging full-game scenarios, but also by discussing approaches to enhance football AI from diverse perspectives and introducing related research tools. Specifically, we provide a distributed and asynchronous population-based self-play framework with diverse pre-trained policies for faster training, two football-specific analytical tools for deeper investigation, and an online leaderboard for broader evaluation. The overall expectation of this work is to advance the study of Multi-Agent Reinforcement Learning on Google Research Football environment, with the ultimate goal of benefiting real-world sports beyond virtual games

    Exposing Attention-Decision-Learning Cycles in Engineering Project Teams through Collaborative Design Experiments

    Get PDF
    Engineering project outcomes are driven by a dynamic mix of the social physics of teams, the unique complexities of the engineering challenge at hand, and stakeholder pressures in context. Various related research has demonstrated formal experiments for tightly controlled problems and in small teams, including work in organizational psychology, computational organization theory, design thinking, and coordination science. We realize there is room for testing these foundational concepts in quasi-controlled environments with distributed teams challenged by problem, solution, and organization complexity common today. This paper presents a quasi-experiment to study how engineers proceed through attention, decision, and learning cycles in the design of a System of Systems. The experiment utilized an ensemble of an agent-based model, a decision-support interface, and a variety of sensors to record behavior and activity. Four pilots were conducted for a maritime industry challenge with experienced industry experts, followed by a primary experiment for data collection. Though this work is preliminary, the experimental approach detects (for this case) how designers focused on different variables (attention), manipulated variables to accomplish desired outcomes (decisions), and explored the system performance trade space variously over time to reveal false assumptions and uncover better decisions (learning). Lessons learned from this quasi-experiment are guiding this research team to prepare scalable and reproducible engineering teamwork experiments that include sensors of events over time in the problem, solution, and socials spaces of engineering projects

    Automatic Evaluation of Information Provider Reliablity and Expertise

    Get PDF
    Q&A social media have gained a lot of attention during the recent years. People rely on these sites to obtain information due to a number of advantages they offer as compared to conventional sources of knowledge (e.g., asynchronous and convenient access). However, for the same question one may find highly contradicting answers, causing an ambiguity with respect to the correct information. This can be attributed to the presence of unreliable and/or non-expert users. These two attributes (reliability and expertise) significantly affect the quality of the answer/information provided. We present a novel approach for estimating these user's characteristics relying on human cognitive traits. In brief, we propose each user to monitor the activity of his peers (on the basis of responses to questions asked by him) and observe their compliance with predefined cognitive models. These observations lead to local assessments that can be further fused to obtain a reliability and expertise consensus for every other user in the social network (SN). For the aggregation part we use subjective logic. To the best of our knowledge this is the first study of this kind in the context of Q&A SNs. Our proposed approach is highly distributed; each user can individually estimate the expertise and the reliability of his peers using his direct interactions with them and our framework. The online SN (OSN), which can be considered as a distributed database, performs continuous data aggregation for users expertise and reliability assesment in order to reach a consensus. In our evaluations, we first emulate a Q&A SN to examine various performance aspects of our algorithm (e.g., convergence time, responsiveness etc.). Our evaluations indicate that it can accurately assess the reliability and the expertise of a user with a small number of samples and can successfully react to the latter's behavior change, provided that the cognitive traits hold in practice. Furthermore, the use of the consensus operator for the aggregation of multiple opinions on a specific user, reduces the uncertainty with regards to the final assessment. However, as real data obtained from Yahoo! Answers imply, the pairwise interactions between specific users are limited. Hence, we consider the aggregate set of questions as posted from the system itself and we assess the expertise and realibility of users based on their response behavior. We observe, that users have different behaviors depending on the level at which we are observing them. In particular, while their activity is focused on a few general categories, yielding them reliable, their microscopic (within general category) activity is highly scattered

    Game Plan: What AI can do for Football, and What Football can do for AI

    Get PDF
    The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players’ and coordinated teams’ behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state-of-theart and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual)

    Developing Enculturated Agents:Pitfalls and Strategies

    Get PDF

    A Contextual Approach To Learning Collaborative Behavior Via Observation

    Get PDF
    This dissertation describes a novel technique to creating a simulated team of agents through observation. Simulated human teamwork can be used for a number of purposes, such as expert examples, automated teammates for training purposes and realistic opponents in games and training simulation. Current teamwork simulations require the team member behaviors be programmed into the simulation, often requiring a great deal of time and effort. None are able to observe a team at work and replicate the teamwork behaviors. Machine learning techniques for learning by observation and learning by demonstration have proven successful at observing behavior of humans or other software agents and creating a behavior function for a single agent. The research described here combines current research in teamwork simulations and learning by observation to effectively train a multi-agent system in effective team behavior. The dissertation describes the background and work by others as well as a detailed description of the learning method. A prototype built to evaluate the developed approach as well as the extensive experimentation conducted is also described

    On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters

    Get PDF
    This paper casts coordination of a team of robots within the framework of game theoretic learning algorithms. In particular a novel variant of fictitious play is proposed, by considering multi-model adaptive filters as a method to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they should take decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can occur either in terms of noisy observations or various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.</p

    Contributions to the Modelling of Auditory Hallucinations, Social robotics, and Multiagent Systems

    Get PDF
    165 p.The Thesis covers three diverse lines of work that have been tackled with the central endeavor of modeling and understanding the phenomena under consideration. Firstly, the Thesis works on the problem of finding brain connectivity biomarkers of auditory hallucinations, a rather frequent phenomena that can be related some pathologies, but which is also present in healthy population. We apply machine learning techniques to assess the significance of effective brain connections extracted by either dynamical causal modeling or Granger causality. Secondly, the Thesis deals with the usefulness of social robotics strorytelling as a therapeutic tools for children at risk of exclussion. The Thesis reports on the observations gathered in several therapeutic sessions carried out in Spain and Bulgaria, under the supervision of tutors and caregivers. Thirdly, the Thesis deals with the spatio-temporal dynamic modeling of social agents trying to explain the phenomena of opinion survival of the social minorities. The Thesis proposes a eco-social model endowed with spatial mobility of the agents. Such mobility and the spatial perception of the agents are found to be strong mechanisms explaining opinion propagation and survival
    corecore