    Achieving Cooperative Behavior Based on Intention Estimation by Learning Combinations of Modules

    A robot needs to process information appropriately depending on the environment or context. However, some of the abilities a robot requires are common across environments and contexts. In such cases, the learning agent should not relearn those abilities but instead reuse the results of previous tasks. In the study of intelligent systems, models have been proposed that solve complex problems by combining modules, each of which serves a specific function such as recognition, planning, or action selection. These models can reuse the learning results of previous tasks in different environments or contexts by recombining the modules they have learned. In this paper, we focus on cooperative behavior based on intention estimation, and propose a model for a learning agent that acquires combinations of modules with which it can achieve such behavior. The experimental results indicate that a desirable combination of modules was acquired and that the learning process progressed suitably.
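As a rough illustration of the modular idea described above (all names and the pipeline shape here are hypothetical, not taken from the paper), modules learned in earlier tasks can share a common interface and be recombined into an agent for a new context:

```python
# Hypothetical sketch of module combination; the paper's actual
# architecture and learning rule are not specified in the abstract.

class Module:
    """A reusable learned component with a fixed call interface."""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

def compose(modules):
    """Chain modules so earlier learning results are reused in new tasks."""
    def agent(observation):
        out = observation
        for m in modules:
            out = m(out)
        return out
    return agent

# Modules "learned" in previous tasks, recombined for a new context.
recognize = Module("recognition",
                   lambda obs: {"intent": "handover"} if obs == "reach"
                   else {"intent": "unknown"})
plan = Module("planning",
              lambda state: "approach" if state["intent"] == "handover"
              else "wait")
act = Module("action", lambda p: f"execute:{p}")

agent = compose([recognize, plan, act])
print(agent("reach"))  # execute:approach
```

Swapping in a different recognition or planning module reuses the rest of the pipeline unchanged, which is the kind of reuse the abstract argues for.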

    Multi-agent reinforcement learning algorithm to handle beliefs of other agents' policies and embedded beliefs

    We have developed a new series of multi-agent reinforcement learning algorithms that choose a policy based on beliefs about co-players' policies. The algorithms are applicable to situations where the state is fully observable by the agents, but there is no limit on the number of players. Some of the algorithms employ embedded beliefs to handle cases in which co-players also choose their policies based on beliefs about others' policies. Simulation experiments on Iterated Prisoner's Dilemma games show that the algorithms using policy-based beliefs converge to highly mutually cooperative behavior, unlike existing algorithms based on action-based beliefs.
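A minimal sketch of the policy-based-belief idea, under assumed simplifications (the toy policy set, payoff matrix, and likelihood update below are illustrative, not the authors' algorithm): the agent keeps a distribution over candidate co-player policies, reweights it from observed actions, and then selects its own policy by expected payoff under that belief.

```python
# Toy sketch: belief over the co-player's *policy* (not its next action)
# in Iterated Prisoner's Dilemma. All specifics here are assumptions
# for illustration.

C, D = "C", "D"
POLICIES = {
    "AllC": lambda opp_hist: C,
    "AllD": lambda opp_hist: D,
    "TFT":  lambda opp_hist: opp_hist[-1] if opp_hist else C,  # copy opponent's last move
}
# Row player's payoff for (my_action, their_action)
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}

def update_belief(belief, my_hist, their_action):
    """Reweight each candidate policy by how well it predicted the observed action."""
    new = {}
    for name, policy in POLICIES.items():
        likelihood = 0.9 if policy(my_hist) == their_action else 0.1
        new[name] = belief[name] * likelihood
    total = sum(new.values())
    return {k: v / total for k, v in new.items()}

def simulate(my_policy, their_policy, rounds=10):
    """Average payoff of playing my_policy against their_policy."""
    my_hist, their_hist, total = [], [], 0
    for _ in range(rounds):
        a = my_policy(their_hist)   # my policy reacts to their history
        b = their_policy(my_hist)   # their policy reacts to mine
        total += PAYOFF[(a, b)]
        my_hist.append(a)
        their_hist.append(b)
    return total / rounds

def choose_policy(belief):
    """Pick my own policy with the highest expected score under the belief."""
    def score(name):
        return sum(p * simulate(POLICIES[name], POLICIES[t])
                   for t, p in belief.items())
    return max(POLICIES, key=score)
```

Against a belief concentrated on TFT, this selects a cooperative policy, whereas a myopic action-based best response would simply defect each round; that is the contrast the abstract draws between policy-based and action-based beliefs.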