
    Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

    Learning agents that can not only take tests but also innovate are becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for the others. However, existing evaluation platforms are either incompatible with multi-agent settings or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/

    Evolutionary Tournament-Based Comparison of Learning and Non-Learning Algorithms for Iterated Games

    Evolutionary tournaments have been used effectively as a tool for comparing game-playing algorithms. For instance, in the late 1970s, Axelrod organized tournaments to compare algorithms for playing the iterated prisoner's dilemma (PD) game. These tournaments capture the dynamics in a population of agents that periodically adopt relatively successful algorithms in the environment. While these tournaments have provided us with a better understanding of the relative merits of algorithms for iterated PD, our understanding is less clear about algorithms for playing iterated versions of arbitrary single-stage games in an environment of heterogeneous agents. While the Nash equilibrium solution concept has been used to recommend using Nash equilibrium strategies for rational players playing general-sum games, learning algorithms like fictitious play may be preferred for playing against sub-rational players. In this paper, we study the relative performance of learning and non-learning algorithms in an evolutionary tournament where agents periodically adopt relatively successful algorithms in the population. The tournament is played over a testbed composed of all possible structurally distinct 2×2 conflicted games with ordinal payoffs: a baseline, neutral testbed for comparing algorithms. Before analyzing results from the evolutionary tournament, we discuss the testbed, our choice of representative learning and non-learning algorithms and relative rankings of these algorithms in a round-robin competition. The results from the tournament highlight the advantage of learning algorithms over players using static equilibrium strategies for repeated plays of arbitrary single-stage games. The results are likely to be of more benefit compared to work on static analysis of equilibrium strategies for choosing decision procedures for an open, adapting agent society consisting of a variety of competitors. Keywords: Repeated Games, Evolution, Simulation
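The tournament mechanics described in this abstract (a round-robin competition followed by a population that periodically shifts toward relatively successful algorithms) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's setup: it uses only the iterated prisoner's dilemma with the conventional payoffs 5/3/1/0, three fixed non-learning strategies, and a discrete replicator-style update in place of whatever adoption rule the authors used. The function names (`play_match`, `round_robin`, `evolve`) are hypothetical.

```python
from itertools import product

# Row player's payoff in the prisoner's dilemma (assumed conventional values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def always_cooperate(my_hist, opp_hist):
    return "C"


def always_defect(my_hist, opp_hist):
    return "D"


def tit_for_tat(my_hist, opp_hist):
    # Cooperate first, then copy the opponent's previous move.
    return opp_hist[-1] if opp_hist else "C"


STRATEGIES = {"ALLC": always_cooperate, "ALLD": always_defect, "TFT": tit_for_tat}


def play_match(strat_a, strat_b, rounds=50):
    """Return strat_a's average per-round payoff against strat_b."""
    hist_a, hist_b = [], []
    score_a = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a / rounds


def round_robin(strategies, rounds=50):
    """Average payoff of every strategy against every strategy (itself included)."""
    return {
        (i, j): play_match(strategies[i], strategies[j], rounds)
        for i, j in product(strategies, repeat=2)
    }


def evolve(shares, scores, generations=100):
    """Discrete replicator-style update: shares grow in proportion to fitness."""
    shares = dict(shares)
    for _ in range(generations):
        fitness = {
            i: sum(shares[j] * scores[(i, j)] for j in shares) for i in shares
        }
        mean_fitness = sum(shares[i] * fitness[i] for i in shares)
        shares = {i: shares[i] * fitness[i] / mean_fitness for i in shares}
    return shares


if __name__ == "__main__":
    scores = round_robin(STRATEGIES)
    start = {name: 1 / len(STRATEGIES) for name in STRATEGIES}
    print(evolve(start, scores, generations=50))
```

A learning algorithm such as fictitious play could be added to `STRATEGIES` under the same two-argument interface, since each strategy already receives both move histories.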

    Evaluating the Use of Inclusive Teaching Materials for Learners with Autism

    In the last decade, the field of applied behavior analysis (ABA) has committed to working on diversity, equity, and inclusion (DEI). The work began with call-to-action papers, continued with empirical work on cultural accommodations, and most recently, the certifying board has changed the professional standards for board-certified behavior analysts (BCBAs). An objective and measurable step that BCBAs can take to adhere to the new ethical and professional standards is to use inclusive teaching materials. Inclusive teaching materials are teaching materials that reflect the diversity of society. This study compared the rate of learning and generalization between an inclusive and non-inclusive set of teaching materials during an occupations identification task (e.g., “Touch Scientist”). We attempted to teach six preschool-aged children diagnosed with autism spectrum disorder (ASD) to identify occupations using an inclusive set of 2-D stimuli and a non-inclusive set of 2-D stimuli. The purpose of this study was to begin empirically evaluating inclusion within the field of ABA by comparing the rate of learning and generalization across the two teaching materials. All of the participants had difficulty in learning to identify occupations, except for one. Two participants only met the mastery criteria of the occupations assigned to the inclusive materials conditions, and three participants were withdrawn from the study. While there were many limitations to participant learning in this study, based on an occupation by condition analysis, it did not seem that the type of teaching materials was a variable. The potential limitations and future research related to inclusive teaching materials, stimulus feature manipulation, and instructional procedures for children with ASD are discussed.

    Comparing procedures on the acquisition and generalization of tacts for children with autism spectrum disorder

    Generalization is a critical outcome for individuals with autism spectrum disorder (ASD) who display new skills in a limited range of contexts. In the absence of proper planning, generalization may not be observed. The purpose of the current study was to directly compare serial to concurrent multiple exemplar training using total training time per exemplar, mean total training time, and exposures to mastery across three children diagnosed with ASD. Additionally, we assessed the efficiency of presenting secondary targets in the antecedent and consequence portions of learning trials and evaluated generalization to tacts not associated with direct teaching. Results suggested that all training conditions produced acquisition and generalization for trained and untrained exemplars. However, the serial multiple exemplar training condition was more efficient for two participants, whereas the instructive feedback condition was the most efficient for the third. Findings are discussed considering previous studies and areas for future research.

    Assessing Preference for Home Language or English Praise in English Language Learners with Disabilities

    Assessing preference for stimuli has been shown to be of value when determining potential reinforcers for individuals with disabilities. Researchers have found that preference for forms of social interaction can be identified for persons with disabilities. Furthermore, these same social interactions can be used as reinforcers for these same persons. This study conceptualized different languages as different types of social interactions. Assessing preference for languages may be of use to identify forms of social reinforcement that can be used with English Language Learners (ELLs) with disabilities. Identifying reinforcers may be of value for this population to inform how to structure language supports in their environment. Five ELLs with disabilities between the ages of 10 and 17 years old participated in the study. We conducted a paired-stimulus preference assessment for specific language praise statements in English and Spanish to determine the language in which the participants preferred praise. Following the preference assessment, we conducted a concurrent-chains reinforcer assessment to determine reinforcing efficacy of praise in each language. We found two of five participants preferred Spanish praise to English praise. Three of five participants’ preference was undifferentiated between Spanish and English praise. For four of the five participants, praise in different languages functioned as a reinforcer. All participants’ preference assessments predicted, to a degree, the results of their reinforcer assessments. From these results we concluded our paired-stimulus preference assessment was effective for evaluating preference for different types of praise. Preference was also indicative of reinforcing efficacy of praise.

    The evolving link between learning and assessment: from 'transmission check' to 'learning support'

    Learning and assessment are now considered two sides of the same coin: we simply cannot speak of one without also referring to the other. This paper, which traces the evolution of the link between learning and assessment, explores what led to our shift in understanding of the learning process from the behaviourist to the constructivist model, and the implications that this 'revolution' has had for assessment. Putting assessment at the service of learning is subsequently identified as the challenge ahead for the educational community. Peer-reviewed.