
    Crowd behavioural simulation via multi-agent reinforcement learning

    A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. Johannesburg, 2015. Crowd simulation can be thought of as a group of entities interacting with one another. Traditionally, an animated entity would require precise scripts to function autonomously in a virtual environment. Previous studies on crowd simulation have been applied in real-world settings, but these methods do not use learning agents and are therefore unable to adapt and change their behaviours. State-of-the-art crowd simulation methods include flow-based, particle-based and strategy-based models. A reinforcement learning agent could learn how to navigate, behave and interact in an environment without explicit design, and a group of such agents should then be able to act in a way that simulates a crowd. This thesis investigates the believability of crowd behavioural simulation via three multi-agent reinforcement learning methods: Q-learning in a multi-agent Markov decision process model, joint state-action Q-learning, and a joint state value iteration algorithm. All three learning methods are able to produce believable and realistic crowd behaviours.
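
    As a rough illustration of the tabular Q-learning building block behind these methods, the sketch below runs several independent Q-learners in a toy corridor world; the environment, constants and reward are invented for illustration and stand in for the thesis's joint-state and joint-action formulations.

```python
# Minimal sketch: independent tabular Q-learning for several agents in a
# hypothetical 1-D corridor where each agent walks toward a goal cell.
# Environment, state encoding and rewards are illustrative, not from the thesis.
import random

N_STATES, N_ACTIONS = 10, 2      # corridor cells; actions: 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
GOAL = N_STATES - 1

def step(state, action):
    """One environment transition: reward 1 only at the goal cell."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

# One Q-table per agent: agents learn independently in this sketch.
agents = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(5)]

for episode in range(500):
    for q in agents:
        s = 0
        for _ in range(200):                 # cap episode length
            # Epsilon-greedy action selection.
            a = random.randrange(N_ACTIONS) if random.random() < EPS \
                else max(range(N_ACTIONS), key=lambda i: q[s][i])
            s2, r, done = step(s, a)
            # Standard Q-learning update toward the greedy bootstrap target.
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
```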

    Adaptive Group-based Signal Control by Reinforcement Learning

    Group-based signal control is one of the most prevalent control schemes in European countries. The major advantage of group-based control is its capability to provide flexible phase structures. Current group-based control systems are usually implemented with rather simple timing logics, e.g. vehicle-actuated logic. However, such timing logic is not sufficient to respond to a traffic environment whose inputs, i.e. traffic demands, change dynamically over time. The primary objective of this paper is therefore to formulate the existing group-based signal controller as a multi-agent system. The proposed signal control system is capable of making intelligent timing decisions by utilizing machine learning techniques. In this regard, reinforcement learning is a potential solution because of its self-learning properties in a dynamic environment. This paper thus proposes an adaptive signal control system, enabled by a reinforcement learning algorithm, in the context of the group-based phasing technique. Two different learning algorithms, Q-learning and SARSA, have been investigated and tested on a four-legged intersection. The experiments are carried out by means of an open-source traffic simulation tool, SUMO. The traffic-mobility performance of the adaptive group-based signal control systems is compared against that of a well-established group-based fixed-time control system. In the testbed experiments, simulation results reveal that the learning-based adaptive signal controller outperforms the group-based fixed-time signal controller with regard to improvements in traffic mobility efficiency. In addition, SARSA learning is a more suitable implementation for the proposed adaptive group-based signal control system than the Q-learning approach.
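
    For readers comparing the two algorithms, the sketch below contrasts their update rules; the Q-table layout and the transition tuple are placeholders, and the paper's actual state, action and reward design for group-based phasing is not reproduced here.

```python
# Hedged sketch contrasting the two update rules the paper compares.
ALPHA, GAMMA = 0.1, 0.9

def q_learning_update(Q, s, a, r, s2):
    # Off-policy: bootstrap from the best action in the next state.
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2].values()) - Q[s][a])

def sarsa_update(Q, s, a, r, s2, a2):
    # On-policy: bootstrap from the action actually taken next (a2),
    # so the exploration policy shapes the learned values.
    Q[s][a] += ALPHA * (r + GAMMA * Q[s2][a2] - Q[s][a])

# Toy usage with an invented 4-state, 2-action table.
Q = {s: {a: 0.0 for a in (0, 1)} for s in range(4)}
q_learning_update(Q, s=0, a=1, r=0.5, s2=1)
sarsa_update(Q, s=0, a=1, r=0.5, s2=1, a2=0)
```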

    Risk perception and behavioral change during epidemics: Comparing models of individual and collective learning

    Copyright © 2020 Abdulkareem et al. Modern societies are exposed to a myriad of risks, ranging from disease to natural hazards and technological disruptions. Exploring how the awareness of risk spreads and how it triggers a diffusion of coping strategies is prominent in the research agenda of various domains. It requires a deep understanding of how individuals perceive risks and communicate about the effectiveness of protective measures, highlighting learning and social interaction as the core mechanisms driving such processes. Methodological approaches that range from purely physics-based diffusion models to data-driven environmental methods rely on agent-based modeling to accommodate context-dependent learning and social interactions in a diffusion process. Mixing agent-based modeling with data-driven machine learning has become popular. However, little attention has been paid to the role of intelligent learning in risk appraisal and protective decisions, whether used in an individual or a collective process. The differences between collective learning and individual learning have not been sufficiently explored in diffusion modeling in general and in agent-based models of socio-environmental systems in particular. To address this research gap, we explored the implications of intelligent learning on the gradient from individual to collective learning, using an agent-based model enhanced by machine learning. Our simulation experiments showed that, for intelligent judgement about risks and the selection of coping strategies, majority-vote groups were outperformed by leader-based groups and even by individuals deciding alone. Social interactions appeared essential for both individual and group learning. The choice of how to represent social learning in an agent-based model could be driven by the cultural and social norms prevalent in the modeled society.
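
    A minimal sketch of the three decision modes compared here (individuals alone, majority-vote groups, leader-based groups) is given below; the agents' risk estimates are random placeholders rather than the paper's trained machine-learning appraisals, and the leader-selection rule is one possible reading of "leader-based".

```python
# Toy sketch of three decision modes in an agent-based model: individuals
# deciding alone, groups using a majority vote, and leader-based groups.
import random

class Agent:
    def __init__(self):
        # Hypothetical learned risk appraisal in [0, 1]; in the paper this
        # comes from intelligent learning, not a random draw.
        self.perceived_risk = random.random()

    def protects(self, threshold=0.5):
        return self.perceived_risk > threshold

def majority_vote(group):
    # The group protects itself if more than half its members would.
    return sum(a.protects() for a in group) > len(group) / 2

def leader_based(group):
    # The most risk-aware member decides for everyone (an assumption;
    # the paper's leader selection may differ).
    leader = max(group, key=lambda a: a.perceived_risk)
    return leader.protects()

group = [Agent() for _ in range(9)]
print("individual:", [a.protects() for a in group])
print("majority vote:", majority_vote(group))
print("leader-based:", leader_based(group))
```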

    On Learning by Exchanging Advice

    One of the main questions concerning learning in Multi-Agent Systems is: (how) can agents benefit from mutual interaction during the learning process? This paper describes the study of an interactive advice-exchange mechanism as a possible way to improve agents' learning performance. The advice-exchange technique discussed here uses supervised learning (backpropagation), where reinforcement does not come directly from the environment but is based on advice given by peers with a better performance score (higher confidence), to enhance the performance of a heterogeneous group of Learning Agents (LAs). The LAs face similar problems in an environment where only reinforcement information is available. Each LA applies a different, well-known learning technique: Random Walk (hill-climbing), Simulated Annealing, Evolutionary Algorithms and Q-Learning. The problem used for evaluation is a simplified traffic-control simulation. Initial results indicate that advice exchange can improve learning speed, although bad advice and/or blind reliance can disturb learning performance. Comment: 12 pages, 6 figures, 1 table, accepted in Second Symposium on Adaptive Agents and Multi-Agent Systems (AAMAS-II), 200
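
    The sketch below illustrates the advice-exchange idea in miniature: a learner queries the best-scoring peer and adopts its action as a supervised target. The class and names are invented, and a simple table lookup stands in for the paper's backpropagation step.

```python
# Minimal sketch of advice exchange: a struggling learner asks the current
# best-scoring peer for an action and imitates it as a supervised target.
import random

class Learner:
    def __init__(self, name):
        self.name, self.score = name, 0.0
        self.policy = {}                      # state -> preferred action

    def act(self, state):
        return self.policy.get(state, random.randrange(4))

    def take_advice(self, state, advised_action):
        # Stand-in for the supervised (backpropagation) step in the paper:
        # adopt the advised action for this state.
        self.policy[state] = advised_action

# Four heterogeneous peers, echoing the paper's mix of learning techniques.
peers = [Learner(n) for n in ("rw", "sa", "ea", "q")]
for p, s in zip(peers, (0.2, 0.5, 0.4, 0.9)):  # illustrative scores
    p.score = s

novice, state = peers[0], 7
best_peer = max(peers, key=lambda p: p.score)
if best_peer is not novice and best_peer.score > novice.score:
    novice.take_advice(state, best_peer.act(state))
print(novice.policy)
```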

    Talking Nets: A Multi-Agent Connectionist Approach to Communication and Trust between Individuals

    A multi-agent connectionist model is proposed that consists of a collection of individual recurrent networks that communicate with each other, and as such is a network of networks. The individual recurrent networks simulate the process of information uptake, integration and memorization within individual agents, while the communication of beliefs and opinions between agents is propagated along connections between the individual networks. A crucial aspect of belief updating based on information from other agents is trust in the information provided. In the model, trust is determined by consistency with the receiving agent's existing beliefs, and results in changes to the connections between individual networks, called trust weights. Thus activation spreading and weight change between individual networks are analogous to standard connectionist processes, although trust weights take on a specific function. Specifically, they lead to a selective propagation, and thus a filtering out, of less reliable information, and they implement Grice's (1975) maxims of quality and quantity in communication. The unique contribution of communicative mechanisms beyond the intra-personal processing of individual networks was explored in simulations of key phenomena involving persuasive communication and polarization, lexical acquisition, the spreading of stereotypes and rumors, and the failure to share unique information in group decisions.
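
    The trust-weight mechanism can be sketched as follows: a receiver's belief moves toward a communicated belief in proportion to its trust in the sender, while trust itself drifts toward the consistency between the two beliefs. The learning rates and the consistency measure below are assumptions, not the model's actual connectionist equations.

```python
# Schematic sketch of trust-weighted belief updating between two agents.
# Beliefs are activations in [0, 1]; rates and the consistency measure
# are placeholders for the model's recurrent-network dynamics.

def communicate(sender_belief, receiver_belief, trust,
                lr_belief=0.3, lr_trust=0.1):
    # Consistency in [0, 1]: 1 when beliefs agree, 0 when maximally opposed.
    consistency = 1.0 - abs(sender_belief - receiver_belief)
    # Trusted senders move the receiver's belief more (selective filtering).
    receiver_belief += lr_belief * trust * (sender_belief - receiver_belief)
    # The trust weight itself adapts toward the observed consistency.
    trust += lr_trust * (consistency - trust)
    return receiver_belief, trust

belief, trust = 0.2, 0.5
for _ in range(10):
    belief, trust = communicate(0.9, belief, trust)
    print(f"belief={belief:.2f} trust={trust:.2f}")
```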

    Navigating the Ocean with DRL: Path following for marine vessels

    Human error is a substantial factor in marine accidents, accounting for 85% of all reported incidents. By reducing the need for human intervention in vessel navigation, AI-based methods can potentially reduce the risk of accidents. AI techniques such as Deep Reinforcement Learning (DRL) have the potential to improve vessel navigation in challenging conditions, such as in restricted waterways and in the presence of obstacles, because DRL algorithms can optimize multiple objectives, such as path following and collision avoidance, while being more efficient to implement than traditional methods. In this study, a DRL agent is trained using the Deep Deterministic Policy Gradient (DDPG) algorithm for path following and waypoint tracking. The trained agent is then evaluated against a traditional PD controller with an Integral Line of Sight (ILOS) guidance system on the same tasks. The study uses the Kriso Container Ship (KCS) as a test case for evaluating the performance of the different controllers. The ship's dynamics are modeled using the Maneuvering Modelling Group (MMG) model. This mathematical simulation is used to train the DRL-based controller and to tune the gains of the traditional PD controller. The simulation environment is also used to assess the controllers' effectiveness in the presence of wind. Comment: Proceedings of the Sixth International Conference in Ocean Engineering (ICOE2023)
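
    As a sketch of the baseline, the snippet below combines an ILOS guidance law (desired heading from cross-track error plus an integral term) with a PD rudder controller; the gains, lookahead distance and simplified integral update are placeholders rather than the paper's values tuned on the MMG model of the KCS hull.

```python
# Rough sketch of the baseline controller: ILOS guidance for the desired
# heading, tracked by a PD rudder law. All constants are assumptions.
import math

DELTA = 2.0        # lookahead distance (assumed units)
SIGMA = 0.05       # integral gain of the ILOS law
KP, KD = 1.5, 0.8  # PD gains for the rudder

def ilos_heading(path_angle, cross_track, integ, dt):
    # Accumulate cross-track error to cancel steady drift (e.g. wind);
    # a simplified stand-in for the full ILOS integral dynamics.
    integ += SIGMA * cross_track * dt
    psi_d = path_angle - math.atan2(cross_track + integ, DELTA)
    return psi_d, integ

def pd_rudder(psi_d, psi, yaw_rate):
    # Rudder command from wrapped heading error, damped by yaw rate.
    err = math.atan2(math.sin(psi_d - psi), math.cos(psi_d - psi))
    return KP * err - KD * yaw_rate

psi_d, integ = ilos_heading(path_angle=0.0, cross_track=1.0, integ=0.0, dt=0.1)
rudder = pd_rudder(psi_d, psi=0.1, yaw_rate=0.02)
print(f"desired heading {psi_d:.3f} rad, rudder {rudder:.3f} rad")
```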