Modelling collective learning in design
In this paper, a model of collective learning in design is developed in the context of team design. It explains that a team design activity uses input knowledge, environmental information, and design goals to produce output knowledge. A collective learning activity uses input knowledge from different agents and produces learned knowledge through a process of knowledge acquisition and transformation between agents, which may be triggered by learning goals and rationale triggers. Different forms of collective learning were observed with respect to agent interactions, the goal(s) of learning, and the involvement of individual agents. Three types of links between team design and collective learning were identified: teleological, rationale, and epistemic. Hypotheses about collective learning were formulated based on existing theories and models in design and learning, and were tested using a protocol analysis approach. The model of collective learning in design is derived from the test results. The proposed model can serve as a basis for developing agent-based learning systems in design. Future work may investigate collective learning between design teams, the links between collective learning and creativity, and computational support for collective learning.
Traffic Signal Control with Communicative Deep Reinforcement Learning Agents: a Case Study
In this work we theoretically and experimentally analyze Multi-Agent
Advantage Actor-Critic (MA2C) and Independent Advantage Actor-Critic (IA2C),
two recently proposed multi-agent reinforcement learning methods that can be
applied to control traffic signals in urban areas. The two methods differ in
their use of a reward calculated locally or globally and in the management of
agents' communication. We analyze the methods theoretically within the framework of non-Markov decision processes, which provides useful insights into the behavior of the algorithms. Moreover, we analyze the efficacy and robustness of the methods experimentally by testing them in two traffic areas of Bologna (Italy), simulated with SUMO, an open-source traffic simulation tool. The
experimental results indicate that MA2C achieves the best performance in the
majority of cases, outperforms the alternative method considered, and displays
sufficient stability during the learning process.
Comment: 41 pages, 16 figures
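The local/global reward distinction between the two methods can be sketched minimally. In this toy version (the agent names, queue penalties, and spatial discount factor are illustrative, not the paper's actual setup), an IA2C-style agent optimizes only its own intersection's reward, while an MA2C-style agent also receives discounted rewards from neighboring intersections:

```python
def local_reward(queues, agent):
    # IA2C-style: each agent sees only its own intersection's queue penalty.
    return -queues[agent]

def global_reward(queues, neighbors, agent, alpha=0.75):
    # MA2C-style: own penalty plus spatially discounted neighbor penalties,
    # so an agent is also rewarded for easing congestion nearby.
    r = -queues[agent]
    for n in neighbors[agent]:
        r += alpha * -queues[n]
    return r

# Three intersections in a line: A - B - C, with current queue lengths.
queues = {"A": 4.0, "B": 2.0, "C": 6.0}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

print(local_reward(queues, "B"))              # -2.0
print(global_reward(queues, neighbors, "B"))  # -2.0 + 0.75*(-4.0 - 6.0) = -9.5
```

The discounted-neighbor signal gives each agent partial visibility into network-level congestion without requiring a fully centralized reward.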
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
Offline reinforcement learning (RL), which refers to decision-making from a
previously-collected dataset of interactions, has received significant
attention over the past years. Much effort has focused on improving offline RL
practicality by addressing the prevalent issue of partial data coverage through
various forms of conservative policy learning. While the majority of algorithms
do not have finite-sample guarantees, several provable conservative offline RL
algorithms are designed and analyzed within the single-policy concentrability
framework that handles partial coverage. Yet, in the nonlinear function
approximation setting where confidence intervals are difficult to obtain,
existing provable algorithms suffer from computational intractability,
prohibitively strong assumptions, and suboptimal statistical rates. In this
paper, we leverage the marginalized importance sampling (MIS) formulation of RL
and present the first set of offline RL algorithms that are statistically
optimal and practical under general function approximation and single-policy
concentrability, bypassing the need for uncertainty quantification. We identify
that the key to successfully solving the sample-based approximation of the MIS
problem is ensuring that certain occupancy validity constraints are nearly
satisfied. We enforce these constraints by a novel application of the augmented
Lagrangian method and prove the following result: with the MIS formulation,
augmented Lagrangian is enough for statistically optimal offline RL. In stark
contrast to prior algorithms that induce additional conservatism through
methods such as behavior regularization, our approach provably eliminates this
need and reinterprets regularizers as "enforcers of occupancy validity" rather
than "promoters of conservatism."
Comment: 49 pages, 1 figure
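As a rough sketch of the construction described (the notation here is mine, not necessarily the paper's): with behavior distribution \(\mu\) and marginalized importance weights \(w(s,a) \approx d^{\pi}(s,a)/\mu(s,a)\), the MIS problem and its augmented Lagrangian relaxation take the form

```latex
% MIS objective: find weights w maximizing expected reward under the data
% distribution \mu, subject to occupancy-validity (Bellman-flow) constraints:
\max_{w \ge 0} \; \mathbb{E}_{(s,a) \sim \mu}\!\left[ w(s,a)\, r(s,a) \right]
\quad \text{s.t.} \quad e(w) = 0,
% where e(w) collects the flow-conservation residuals. The augmented
% Lagrangian replaces the hard constraint with a multiplier term and a
% quadratic penalty of strength \rho:
\mathcal{L}_{\rho}(w, \lambda)
  \;=\; \mathbb{E}_{\mu}\!\left[ w\, r \right]
  \;-\; \lambda^{\top} e(w)
  \;-\; \frac{\rho}{2}\, \big\| e(w) \big\|_2^2 .
```

The abstract's claim is that driving the residuals \(e(w)\) toward zero via this penalty, rather than adding explicit pessimism, already suffices for statistical optimality.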
Approximate universal artificial intelligence and self-play learning for games
This thesis is split into two independent parts.
The first is an investigation of some practical aspects of Marcus Hutter's Universal Artificial Intelligence theory.
The main contributions are to show how a very general agent can be built and analysed using the mathematical tools of this theory.
Before the work presented in this thesis, it was an open question whether this theory was of any relevance to reinforcement learning practitioners.
This work suggests that it is indeed relevant and worthy of future investigation.
The second part of this thesis looks at self-play learning in two player, deterministic, adversarial turn-based games.
The main contribution is the introduction of a new technique for training the weights of a heuristic evaluation function from data collected by classical game tree search algorithms.
This method is shown to outperform previous self-play training routines based on Temporal Difference learning when applied to the game of Chess.
In particular, the main highlight was using this technique to construct a Chess program that learnt to play master-level Chess by tuning a set of initially random weights through self-play games.
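The core idea of fitting evaluation weights to values produced by game-tree search can be loosely sketched as follows (a toy linear evaluation, made-up features, and a fixed search value; the thesis's actual training routines are more involved):

```python
import random

def evaluate(weights, features):
    # Toy linear evaluation function: score = w . features(position).
    return sum(w * f for w, f in zip(weights, features))

def search_bootstrap_update(weights, features, search_value, lr=0.01):
    # Gradient step on the squared error between the static evaluation and
    # the (deeper, more accurate) value returned by game-tree search.
    error = search_value - evaluate(weights, features)
    return [w + lr * error * f for w, f in zip(weights, features)]

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(3)]  # initially random, as in the thesis
features = [1.0, -2.0, 0.5]   # e.g. material, mobility, king safety (illustrative)
search_value = 0.3            # value a minimax/alpha-beta search might return

for _ in range(200):
    weights = search_bootstrap_update(weights, features, search_value)

print(round(evaluate(weights, features), 3))  # converges toward 0.3
```

Rather than waiting for a game outcome as in Temporal Difference self-play, each position's search result serves as an immediate regression target for the static evaluator.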
Learning Actions and Action Verbs from Human-Agent Interaction
The goal of my research is to design agents that learn from human-agent interaction. Specifically, I am interested in the acquisition of procedural, conceptual, and linguistic knowledge related to novel actions from human-agent collaborative task execution.
Asymmetric Interpretations of Positive and Negative Human Feedback for a Social Learning Agent
Abstract — The ability for people to interact with robots and teach them new skills will be crucial to the successful application of robots in everyday human environments. To design agents that learn efficiently and effectively from instruction, it is important to understand how people who are not experts in Machine Learning or robotics will try to teach social robots. In prior work we showed that human trainers use positive and negative feedback differentially when interacting with a Reinforcement Learning agent. In this paper we present experiments and implementations on two platforms, a robotic platform and a computer game platform, that explore the multiple communicative intents of positive and negative feedback from a human partner, in particular that negative feedback is both about the past and about intentions for future action.
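The asymmetry described above can be sketched as a toy learner (the class, actions, and update rule are mine, not the paper's implementation): positive feedback only reinforces the last action, while negative feedback both penalizes it and signals an intention to avoid it on the next step.

```python
class FeedbackLearner:
    """Toy agent with asymmetric handling of human feedback."""

    def __init__(self, actions):
        self.values = {a: 0.0 for a in actions}
        self.avoid = None  # action the trainer has just vetoed

    def give_feedback(self, last_action, reward):
        # Both signs update the value of the past action...
        self.values[last_action] += 0.5 * reward
        if reward < 0:
            # ...but negative feedback is also about future intentions:
            # do not repeat the vetoed action on the next step.
            self.avoid = last_action
        else:
            self.avoid = None

    def choose(self):
        candidates = {a: v for a, v in self.values.items() if a != self.avoid}
        return max(candidates, key=candidates.get)

agent = FeedbackLearner(["wave", "sit", "spin"])
agent.give_feedback("spin", -1.0)  # trainer disapproves of "spin"
print(agent.choose())              # "spin" is excluded from the next choice
```

A purely symmetric learner would treat the negative signal only as a value update, missing the "don't do that next" intent that the paper attributes to human trainers.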