Multiagent Deep Reinforcement Learning: Challenges and Directions Towards Human-Like Approaches
This paper surveys the field of multiagent deep reinforcement learning. The
combination of deep neural networks with reinforcement learning has gained
increased traction in recent years and is slowly shifting the focus from
single-agent to multiagent environments. Dealing with multiple agents is
inherently more complex as (a) the future rewards depend on the joint actions
of multiple players and (b) the computational complexity of functions
increases. We present the most common multiagent problem representations and
their main challenges, and identify five research areas that address one or
more of these challenges: centralised training and decentralised execution,
opponent modelling, communication, efficient coordination, and reward shaping.
We find that many computational studies rely on unrealistic assumptions or are
not generalisable to other settings; they struggle to overcome the curse of
dimensionality or nonstationarity. Approaches from psychology and sociology
capture promising relevant behaviours such as communication and coordination.
We suggest that, for multiagent reinforcement learning to be successful, future research should address these challenges with an interdisciplinary approach, opening up new possibilities for more human-oriented solutions in multiagent reinforcement learning.

Comment: 37 pages, 6 figures
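The joint-action dependence in point (a) can be made concrete with a toy sketch. The following is a minimal, hypothetical example (not drawn from the paper): two independent Q-learners in a one-shot coordination game, where each agent's reward depends on the joint action and the environment is therefore nonstationary from either agent's point of view.

```python
import random

# Two-action coordination game (hypothetical payoffs): each agent is
# rewarded only when the JOINT action matches, so from either agent's
# viewpoint the reward landscape shifts as the other agent learns.
PAYOFF = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0}

def independent_q_learning(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # one stateless Q-table per agent
    for _ in range(episodes):
        acts = tuple(
            rng.randrange(2) if rng.random() < eps
            else max(range(2), key=lambda a: q[agent][a])
            for agent in range(2)
        )
        r = PAYOFF[acts]  # the reward depends on the joint action
        for agent, a in enumerate(acts):
            q[agent][a] += alpha * (r - q[agent][a])  # TD update
    return q

q_tables = independent_q_learning()
```

With epsilon-greedy exploration the two learners typically lock into a matching action, but nothing in the update guarantees this: each agent's value estimates chase a moving target defined by the other agent's current policy, which is exactly the nonstationarity the survey highlights.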
A Multilingual Virtual Guide for Self-Attachment Technique
In this work, we propose a computational framework that leverages existing
out-of-language data to create a conversational agent for the delivery of
Self-Attachment Technique (SAT) in Mandarin. Our framework does not require
large-scale human translations, yet it achieves a comparable performance whilst
also maintaining safety and reliability. We propose two different methods of
augmenting available response data through empathetic rewriting. We evaluate
our chatbot against a previous, English-only SAT chatbot through non-clinical
human trials (N=42), each lasting five days, and quantitatively show that we
are able to attain a comparable level of performance to the English SAT
chatbot. We provide a qualitative analysis of the limitations of our study, along with suggestions aimed at guiding future improvements.
A Survey of Zero-shot Generalisation in Deep Reinforcement Learning
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning
(RL) aims to produce RL algorithms whose policies generalise well to novel
unseen situations at deployment time, avoiding overfitting to their training
environments. Tackling this is vital if we are to deploy reinforcement learning
algorithms in real world scenarios, where the environment will be diverse,
dynamic and unpredictable. This survey is an overview of this nascent field. We
rely on a unifying formalism and terminology for discussing different ZSG
problems, building upon previous works. We go on to categorise existing
benchmarks for ZSG, as well as current methods for tackling these problems.
Finally, we provide a critical discussion of the current state of the field,
including recommendations for future work. Among other conclusions, we argue
that taking a purely procedural content generation approach to benchmark design
is not conducive to progress in ZSG; we suggest fast online adaptation and
tackling RL-specific problems as areas for future work on ZSG methods;
and we recommend building benchmarks in underexplored problem settings such as
offline RL ZSG and reward-function variation.
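The ZSG problem setting described above can be sketched as an evaluation protocol. The following is an illustrative sketch (function names and the 80/20 split are assumptions, not from the survey): train a policy on a finite set of environment instances, then evaluate it zero-shot, with no further adaptation, on held-out instances.

```python
import random

def split_contexts(n_contexts=100, n_train=80, seed=0):
    # Contexts stand in for environment instances (e.g. level seeds).
    rng = random.Random(seed)
    contexts = list(range(n_contexts))
    rng.shuffle(contexts)
    return contexts[:n_train], contexts[n_train:]

def mean_return(policy, contexts, rollout):
    """Average episodic return of `policy` over `contexts`.
    `rollout(policy, context)` runs one episode in that instance."""
    return sum(rollout(policy, c) for c in contexts) / len(contexts)

train_ctx, test_ctx = split_contexts()
# Generalisation gap = mean_return on train_ctx minus mean_return on
# test_ctx; "zero-shot" means the policy never interacts with test_ctx
# before evaluation.
```

The point of the split is that overfitting to the training environments shows up directly as a large gap between the two averages.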
Causal Reasoning: Charting a Revolutionary Course for Next-Generation AI-Native Wireless Networks
Despite the basic premise that next-generation wireless networks (e.g., 6G)
will be artificial intelligence (AI)-native, to date, most existing efforts
remain either qualitative or incremental extensions to existing "AI for
wireless" paradigms. Indeed, creating AI-native wireless networks faces
significant technical challenges due to the limitations of data-driven,
training-intensive AI. These limitations include the black-box nature of the AI
models, their curve-fitting nature, which can limit their ability to reason and
adapt, their reliance on large amounts of training data, and the energy
inefficiency of large neural networks. In response, this article presents a
comprehensive, forward-looking vision that addresses these shortcomings by
introducing a novel framework for building AI-native wireless networks,
grounded in the emerging field of causal reasoning. Causal reasoning,
founded on causal discovery, causal representation learning, and causal
inference, can help build explainable, reasoning-aware, and sustainable
wireless networks. Towards fulfilling this vision, we first highlight several
wireless networking challenges that can be addressed by causal discovery and
representation, including ultra-reliable beamforming for terahertz (THz)
systems, near-accurate physical twin modeling for digital twins, training data
augmentation, and semantic communication. We showcase how incorporating causal
discovery can assist in achieving dynamic adaptability, resilience, and
cognition in addressing these challenges. Furthermore, we outline potential
frameworks that leverage causal inference to achieve the overarching objectives
of future-generation networks, including intent management, dynamic
adaptability, human-level cognition, reasoning, and the critical element of
time sensitivity.
Human-AI complex task planning
The process of complex task planning is ubiquitous and arises in a variety of compelling applications. Leading examples include designing a personalized course plan or trip plan, designing music playlists or work sessions in web applications, and planning routes of naval assets to collaboratively discover an unknown destination. In all of these applications, creating a plan requires satisfying a basic construct: composing a sequence of sub-tasks (or items) that optimizes several criteria and satisfies constraints. For instance, in course planning, the sub-tasks or items are core and elective courses, and degree requirements capture their complex dependencies as constraints. In trip planning, the sub-tasks are points of interest (POIs), and the constraints represent time and monetary budgets or user-specified requirements. Moreover, task plans must be individualized and designed under uncertainty. When done manually, the process is human-intensive, tedious, and unlikely to scale. The goal of this dissertation is to present computational frameworks that synthesize the capabilities of humans and AI algorithms to enable task planning at scale while satisfying multiple objectives and complex constraints.
This dissertation makes significant contributions in four main areas: (i) proposing novel models, (ii) designing principled scalable algorithms, (iii) conducting rigorous experimental analysis, and (iv) deploying the designed solutions in the real world. A suite of constrained and multi-objective optimization problems has been formalized, with a focus on their applicability across diverse domains. From an algorithmic perspective, the dissertation proposes principled algorithms with theoretical guarantees adapted from discrete optimization techniques, as well as Reinforcement Learning based solutions. The memory and computational efficiency of these algorithms have been studied, and optimization opportunities have been proposed. The designed solutions are extensively evaluated on various large-scale real-world and synthetic datasets and compared against multiple baseline solutions after appropriate adaptation. This dissertation also presents user study results involving human subjects to validate the effectiveness of the proposed models. Lastly, a notable outcome of this dissertation is the deployment of one of the developed solutions at the Naval Postgraduate School; this deployment enables simultaneous route planning for multiple assets that is robust to uncertainty under multiple contexts.
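The basic construct in the trip-planning example, composing a sequence of items that maximizes value under constraints, can be illustrated with a toy baseline. The POI names, values, and costs below are invented, and this greedy value-per-cost heuristic is only a sketch; the dissertation's actual contributions are principled discrete-optimization and RL algorithms with guarantees.

```python
def plan_trip(pois, budget):
    """pois: list of (name, value, cost) with cost > 0.
    Greedily selects POIs within a monetary budget."""
    plan, spent, total_value = [], 0.0, 0.0
    # Consider POIs in decreasing value-per-cost order.
    for name, value, cost in sorted(pois, key=lambda p: p[1] / p[2], reverse=True):
        if spent + cost <= budget:  # budget constraint
            plan.append(name)
            spent += cost
            total_value += value
    return plan, total_value

pois = [("museum", 8.0, 20.0), ("park", 3.0, 5.0),
        ("tower", 9.0, 30.0), ("market", 4.0, 8.0)]
plan, value = plan_trip(pois, budget=35.0)
# -> (['park', 'market', 'museum'], 15.0): the high-value "tower" no
#    longer fits once cheaper, better-ratio POIs are chosen.
```

Real plans also involve ordering, dependencies between items, and uncertainty, which is why the dissertation moves beyond greedy heuristics of this kind.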
Generalization Through the Lens of Learning Dynamics
A machine learning (ML) system must learn not only to match the output of a
target function on a training set, but also to generalize to novel situations
in order to yield accurate predictions at deployment. In most practical
applications, the user cannot exhaustively enumerate every possible input to
the model; strong generalization performance is therefore crucial to the
development of ML systems which are performant and reliable enough to be
deployed in the real world. While generalization is well-understood
theoretically in a number of hypothesis classes, the impressive generalization
performance of deep neural networks has stymied theoreticians. In deep
reinforcement learning (RL), our understanding of generalization is further
complicated by the conflict between generalization and stability in widely-used
RL algorithms. This thesis will provide insight into generalization by studying
the learning dynamics of deep neural networks in both supervised and
reinforcement learning tasks.

Comment: PhD Thesis