245 research outputs found

Reinforcement learning for business modeling.

Author: Oliveira Fernando S.
Publication venue: IGI Global
Publication date: 01/01/2014

This chapter summarizes reinforcement learning theory, emphasizing its relationship with dynamic programming (reinforcement learning algorithms may replace dynamic programming when a full model of the environment is not available) and analysing its limitations as a tool for modelling human and organizational behaviour.

Crossref

Open Access Institutional Repository at Robert Gordon University

Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior

Authors: Bouneffouf Djallel; Cecchi Guillermo; Lin Baihan
Publication date: 10/09/2020

The Prisoner's Dilemma mainly treats the choice to cooperate or defect as an atomic action. We propose to study online learning algorithm behavior in the Iterated Prisoner's Dilemma (IPD) game, where we explore the full spectrum of reinforcement learning agents: multi-armed bandits, contextual bandits and reinforcement learning. We evaluate them in a tournament of iterated prisoner's dilemma games in which multiple agents compete in a sequential fashion. This allows us to analyze the dynamics of policies learned by multiple self-interested, independent, reward-driven agents, and to study the capacity of these algorithms to fit human behaviors. Results suggest that considering the current situation to make decisions is the worst in this kind of social dilemma game. Multiple discoveries on online learning behaviors and clinical validations are stated. Comment: To the best of our knowledge, this is the first attempt to explore the full spectrum of reinforcement learning agents (multi-armed bandits, contextual bandits and reinforcement learning) in the sequential social dilemma. This mental variants section supersedes and extends our work arXiv:1706.02897 (MAB), arXiv:2005.04544 (CB) and arXiv:1906.11286 (RL) into the multi-agent setting.

arXiv.org e-Print Archive

Automatic Game Parameter Tuning using General Video Game Agents

Author: Kunanusont Kamolwan
Publication date: 19/04/2018

Automatic Game Design is a subfield of Game Artificial Intelligence that studies the use of AI algorithms to assist in game design tasks. This dissertation presents research in this field, focusing on applying an evolutionary algorithm to video game parameterization. The task we are interested in is player experience. The N-Tuple Bandit Evolutionary Algorithm (NTBEA) is a recently proposed evolutionary algorithm that has been successfully applied to game parameterization in a simple domain, which is the first experiment included in this project. To further investigate its ability to evolve game parameters, we applied NTBEA to evolve parameter sets for three General Video Game AI (GVGAI) games, because GVGAI supplies a variety of video games of different types and the framework has already been prepared for parameterization. Nine positive increasing functions were picked as target functions to represent the expected trends in player score. Our initial assumption was that the evolved games should provide game environments that allow players to obtain scores following the same trend as one of these functions. The experimental results confirm this for some functions, and prove that NTBEA is very much capable of evolving GVGAI games to satisfy this task.

University of Essex Research Repository

Towards Thompson Sampling for Complex Bayesian Reasoning

Author: Glimsdal Sondre
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Papers III, IV, and VI are not available as part of the dissertation due to copyright. Thompson Sampling (TS) is a state-of-the-art algorithm for bandit problems set in a Bayesian framework. Both the theoretical foundation and the empirical efficiency of TS are well explored for plain bandit problems. However, the Bayesian underpinning of TS means that TS could potentially be applied to other, more complex problems beyond the bandit problem, if suitable Bayesian structures can be found. The objective of this thesis is the development and analysis of TS-based schemes for more complex optimization problems, founded on Bayesian reasoning. We address several complex optimization problems where the previous state of the art relies on a relatively myopic perspective on the problem. These include stochastic searching on the line, the Goore game, the knapsack problem, travel time estimation, and equipartitioning. Instead of employing Bayesian reasoning to obtain a solution, existing approaches rely on carefully engineered rules. In brief, we recast each of these optimization problems in a Bayesian framework, introducing dedicated TS-based solution schemes. For all of the addressed problems, the results show that besides being more effective, the TS-based approaches we introduce are also capable of solving more adverse versions of the problems, such as dealing with stochastic liars.

Agder University Research Archive

The 2016 Two-Player GVGAI Competition

Authors: Couetoux A; Gaina RD; Kirchgessner F; Liu J; Lucas SM; Perez-Liebana D; Soemers DJNJ; Vodopivec T; Winands MHM
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/01/2018

This paper showcases the setting and results of the first Two-Player General Video Game AI competition, which ran in 2016 at the IEEE World Congress on Computational Intelligence and the IEEE Conference on Computational Intelligence and Games. The challenges for the general game AI agents are expanded in this track from the single-player version, looking at direct player interaction in both competitive and cooperative environments of various types and degrees of difficulty. The focus is on the agents not only handling multiple problems, but also having to account for another intelligent entity in the game, who is expected to work towards their own goals (winning the game). This other player may interact with the first agent in a more engaging way than the environment or any non-playing character would. The top competition entries are analyzed in detail and the performance of all agents is compared across the four sets of games. The results validate the competition system in assessing generality, and show Monte Carlo Tree Search continuing to dominate by winning the overall Championship. However, this approach is closely followed by Rolling Horizon Evolutionary Algorithms, employed by the winner of the second leg of the contest.

University of Essex Research Repository

Maastricht University Research Portal

Crossref

Queen Mary Research Online