Addiction beyond pharmacological effects: The role of environment complexity and bounded rationality
Several decision-making vulnerabilities have been identified as underlying causes for addictive behaviours, or the repeated execution of stereotyped actions despite their adverse consequences. These vulnerabilities are mostly associated with brain alterations caused by the consumption of substances of abuse. However, addiction can also arise in the absence of a pharmacological component, as in pathological gambling and videogaming. We use a new reinforcement learning model to highlight a previously neglected vulnerability that, we suggest, interacts with those already identified, whilst playing a prominent role in non-pharmacological forms of addiction. Specifically, we show that a dual-learning system (i.e. one combining model-based and model-free learning) can be vulnerable to highly rewarding but suboptimal actions that are followed by a complex ramification of stochastic adverse effects. This phenomenon is caused by the overload of the capabilities of an agent, as the time and cognitive resources required for exploration, deliberation, situation recognition, and habit formation all increase as a function of the depth and richness of detail of an environment. Furthermore, the cognitive overload can be aggravated by alterations (e.g. caused by stress) in the bounded rationality, i.e. the limited amount of resources available to the model-based component, in turn increasing the agent's chances of developing or maintaining addictive behaviours. Our study demonstrates that, independent of drug consumption, addictive behaviours can arise from the interaction between environmental complexity and the biologically finite resources available to explore and represent it.
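The core mechanism described above (a resource-bounded model-based planner that cannot look far enough ahead to see delayed adverse consequences) can be illustrated with a minimal sketch. This is not the paper's actual model: the toy MDP, the state chain, and all reward values below are invented for illustration. A large immediate reward is followed by a long chain of delayed penalties; a planner whose depth budget is smaller than the chain underestimates the total cost and prefers the harmful action.

```python
# Toy illustration (NOT the paper's model): a depth-limited model-based
# planner on a hand-made MDP in which a large immediate reward is
# followed by a chain of delayed penalties. All values are invented.

CHAIN = 6          # length of the adverse-effect chain after "binge"
PENALTY = -4.0     # penalty per chain step (total -24 < +10 immediate gain)

def plan_value(state, depth):
    """Depth-limited lookahead value of a chain state (0 = terminal)."""
    if state == 0 or depth == 0:
        return 0.0
    # each chain state yields PENALTY, then moves one step down the chain
    return PENALTY + plan_value(state - 1, depth - 1)

def choose(depth):
    """Compare 'binge' (+10, then the penalty chain) vs 'abstain' (+1)."""
    q_binge = 10.0 + plan_value(CHAIN, depth)
    q_abstain = 1.0
    return "binge" if q_binge > q_abstain else "abstain"
```

With a planning budget of 2 steps the agent only sees -8 of the -24 total penalty and picks "binge"; with a budget covering the full chain it picks "abstain". Shrinking the depth budget plays the role of reduced bounded rationality in the abstract above.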
To UCT, or not to UCT? (Position Paper)
Monte-Carlo search is successfully used in simulation-based planning for various large-scale sequential decision problems, and the UCT algorithm seems to be the algorithm of choice in most (if not all) such recent success stories. Based on some recent discoveries in the theory and empirical analysis of Monte-Carlo search, here we argue that, if online sequential decision making is your problem, and Monte-Carlo tree search is your way to go, then UCT is unlikely to be the best fit for your needs.
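For context, UCT's defining step is applying the UCB1 bandit rule at each tree node: pick the child maximising mean value plus an exploration bonus. A minimal sketch of that selection step (the node representation and the constant c = sqrt(2) are illustrative choices, not part of any specific implementation):

```python
import math

def uct_select(node_visits, children):
    """UCB1 child selection as used by UCT.

    children: list of (value_sum, visits) pairs for each child.
    Returns the index maximising value_sum/visits + c*sqrt(ln(N)/visits);
    unvisited children are expanded first.
    """
    c = math.sqrt(2)  # common default exploration constant
    best, best_score = None, float("-inf")
    for i, (value_sum, visits) in enumerate(children):
        if visits == 0:
            return i  # always try an unvisited child first
        score = value_sum / visits + c * math.sqrt(math.log(node_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best
```

The position paper's point is precisely that this logarithmic exploration bonus, inherited from the bandit setting, is not automatically the right trade-off for online planning.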
The effect of simulation bias on action selection in Monte Carlo Tree Search
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfilment of the requirements for the degree of Master of Science, August 2016.

Monte Carlo Tree Search (MCTS) is a family of directed search algorithms that has gained widespread attention in recent years. It combines a traditional tree-search approach with Monte Carlo simulations, using the outcomes of these simulations (also known as playouts or rollouts) to evaluate states in a look-ahead tree. Because MCTS does not require an evaluation function, it is particularly well-suited to the game of Go (seen by many as chess's successor as a grand challenge of artificial intelligence), with MCTS-based agents recently able to achieve expert-level play on 19×19 boards. Its domain-independent nature also makes it a focus in a variety of other fields, such as Bayesian reinforcement learning and general game-playing.

Despite the vast amount of research into MCTS, the dynamics of the algorithm are still not fully understood. In particular, the effect of using knowledge-heavy or biased simulations in MCTS remains unknown, with interesting results indicating that better-informed rollouts do not necessarily produce stronger agents. This research provides support for the notion that MCTS is well-suited to a class of domains possessing a smoothness property. In such domains, biased rollouts are more likely to produce strong agents. Conversely, any error due to incorrect bias is compounded in non-smooth domains, particularly for low-variance simulations. This is demonstrated empirically in a number of single-agent domains.
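A biased rollout of the kind studied in the dissertation can be sketched as follows. This is a generic illustration, not the dissertation's implementation: the interface functions `legal_moves`, `step`, `heuristic`, and `value` are assumed placeholders, and the `bias` parameter controls how strongly the rollout policy follows the heuristic rather than the uniform-random default.

```python
import random

def biased_rollout(state, legal_moves, step, heuristic, value,
                   bias=0.8, rng=random):
    """Simulate from `state` to a terminal state.

    With probability `bias`, play the move the (possibly incorrect)
    heuristic favours; otherwise play uniformly at random. Returns the
    terminal state's value, as backed up by MCTS.
    """
    while legal_moves(state):
        moves = legal_moves(state)
        if rng.random() < bias:
            move = max(moves, key=lambda m: heuristic(state, m))
        else:
            move = rng.choice(moves)
        state = step(state, move)
    return value(state)
```

Raising `bias` lowers the variance of the rollout estimates; per the abstract, in non-smooth domains this is exactly the regime where an incorrect heuristic does the most damage, since the low-variance estimates confidently point the search the wrong way.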
Planning Algorithms for Multi-Robot Active Perception
A fundamental task of robotic systems is to use on-board sensors and perception algorithms to understand high-level semantic properties of an environment. These semantic properties may include a map of the environment, the presence of objects, or the parameters of a dynamic field. Observations are highly viewpoint-dependent and, thus, the performance of perception algorithms can be improved by planning the motion of the robots to obtain high-value observations. This motivates the problem of active perception, where the goal is to plan the motion of robots to improve perception performance. This fundamental problem is central to many robotics applications, including environmental monitoring, planetary exploration, and precision agriculture. The core contribution of this thesis is a suite of planning algorithms for multi-robot active perception. These algorithms are designed to improve system-level performance on many fronts: online and anytime planning, addressing uncertainty, optimising over a long time horizon, decentralised coordination, robustness to unreliable communication, predicting the plans of other agents, and exploiting characteristics of perception models. We first propose the decentralised Monte Carlo tree search algorithm as a generally-applicable, decentralised algorithm for multi-robot planning. We then present a self-organising map algorithm designed to find paths that maximally observe points of interest. Finally, we consider the problem of mission monitoring, where a team of robots monitors the progress of a robotic mission. A spatiotemporal optimal stopping algorithm is proposed, along with a generalisation for decentralised monitoring. Experimental results are presented for a range of scenarios, such as marine operations and object recognition. Our analytical and empirical results demonstrate theoretically-interesting and practically-relevant properties that support the use of the approaches in practice.
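The optimal-stopping flavour of the mission-monitoring problem can be conveyed with a much simpler classical relative: the secretary-style 1/e rule. This sketch is not the thesis's spatiotemporal algorithm; it only illustrates the basic commit-after-observing structure that optimal stopping shares, with an invented score sequence standing in for candidate rendezvous opportunities.

```python
import math

def stop_index(scores):
    """Classical 1/e stopping rule (illustrative, not the thesis method).

    Observe the first n/e candidates without committing, then stop at
    the first later candidate that beats everything seen so far,
    falling back to the last candidate if none does.
    """
    n = len(scores)
    k = max(1, int(n / math.e))       # size of the observation phase
    best_seen = max(scores[:k])
    for i in range(k, n):
        if scores[i] > best_seen:
            return i
    return n - 1
```

The spatiotemporal version in the thesis additionally accounts for where the candidates are and how long it takes to reach them, but the underlying trade-off between observing longer and committing sooner is the same.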