2,310 research outputs found

    Multiagent Deep Reinforcement Learning: Challenges and Directions Towards Human-Like Approaches

    Full text link
    This paper surveys the field of multiagent deep reinforcement learning. The combination of deep neural networks with reinforcement learning has gained increased traction in recent years and is slowly shifting the focus from single-agent to multiagent environments. Dealing with multiple agents is inherently more complex as (a) the future rewards depend on the joint actions of multiple players and (b) the computational complexity of functions increases. We present the most common multiagent problem representations and their main challenges, and identify five research areas that address one or more of these challenges: centralised training and decentralised execution, opponent modelling, communication, efficient coordination, and reward shaping. We find that many computational studies rely on unrealistic assumptions or are not generalisable to other settings; they struggle to overcome the curse of dimensionality or nonstationarity. Approaches from psychology and sociology capture promising relevant behaviours such as communication and coordination. We suggest that, for multiagent reinforcement learning to be successful, future research addresses these challenges with an interdisciplinary approach to open up new possibilities for more human-oriented solutions in multiagent reinforcement learning.Comment: 37 pages, 6 figure

    A Multilingual Virtual Guide for Self-Attachment Technique

    Full text link
    In this work, we propose a computational framework that leverages existing out-of-language data to create a conversational agent for the delivery of Self-Attachment Technique (SAT) in Mandarin. Our framework does not require large-scale human translations, yet it achieves a comparable performance whilst also maintaining safety and reliability. We propose two different methods of augmenting available response data through empathetic rewriting. We evaluate our chatbot against a previous, English-only SAT chatbot through non-clinical human trials (N=42), each lasting five days, and quantitatively show that we are able to attain a comparable level of performance to the English SAT chatbot. We provide qualitative analysis on the limitations of our study and suggestions with the aim of guiding future improvements

    A Survey of Zero-shot Generalisation in Deep Reinforcement Learning

    Get PDF
    The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We rely on a unifying formalism and terminology for discussing different ZSG problems, building upon previous works. We go on to categorise existing benchmarks for ZSG, as well as current methods for tackling these problems. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in ZSG, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for ZSG, and we recommend building benchmarks in underexplored problem settings such as offline RL ZSG and reward-function variation

    Causal Reasoning: Charting a Revolutionary Course for Next-Generation AI-Native Wireless Networks

    Full text link
    Despite the basic premise that next-generation wireless networks (e.g., 6G) will be artificial intelligence (AI)-native, to date, most existing efforts remain either qualitative or incremental extensions to existing ``AI for wireless'' paradigms. Indeed, creating AI-native wireless networks faces significant technical challenges due to the limitations of data-driven, training-intensive AI. These limitations include the black-box nature of the AI models, their curve-fitting nature, which can limit their ability to reason and adapt, their reliance on large amounts of training data, and the energy inefficiency of large neural networks. In response to these limitations, this article presents a comprehensive, forward-looking vision that addresses these shortcomings by introducing a novel framework for building AI-native wireless networks; grounded in the emerging field of causal reasoning. Causal reasoning, founded on causal discovery, causal representation learning, and causal inference, can help build explainable, reasoning-aware, and sustainable wireless networks. Towards fulfilling this vision, we first highlight several wireless networking challenges that can be addressed by causal discovery and representation, including ultra-reliable beamforming for terahertz (THz) systems, near-accurate physical twin modeling for digital twins, training data augmentation, and semantic communication. We showcase how incorporating causal discovery can assist in achieving dynamic adaptability, resilience, and cognition in addressing these challenges. Furthermore, we outline potential frameworks that leverage causal inference to achieve the overarching objectives of future-generation networks, including intent management, dynamic adaptability, human-level cognition, reasoning, and the critical element of time sensitivity

    Human-AI complex task planning

    Get PDF
    The process of complex task planning is ubiquitous and arises in a variety of compelling applications. A few leading examples include designing a personalized course plan or trip plan, designing music playlists/work sessions in web applications, or even planning routes of naval assets to collaboratively discover an unknown destination. For all of these aforementioned applications, creating a plan requires satisfying a basic construct, i.e., composing a sequence of sub-tasks (or items) that optimizes several criteria and satisfies constraints. For instance, in course planning, sub-tasks or items are core and elective courses, and degree requirements capture their complex dependencies as constraints. In trip planning, sub-tasks are points of interest (POIs) and constraints represent time and monetary budget, or user-specified requirements. Needless to say, task plans are to be individualized and designed considering uncertainty. When done manually, the process is human-intensive and tedious, and unlikely to scale. The goal of this dissertation is to present computational frameworks that synthesize the capabilities of human and AI algorithms to enable task planning at scale while satisfying multiple objectives and complex constraints. This dissertation makes significant contributions in four main areas, (i) proposing novel models, (ii) designing principled scalable algorithms, (iii) conducting rigorous experimental analysis, and (iv) deploying designed solutions in the real-world. A suite of constrained and multi-objective optimization problems has been formalized, with a focus on their applicability across diverse domains. From an algorithmic perspective, the dissertation proposes principled algorithms with theoretical guarantees adapted from discrete optimization techniques, as well as Reinforcement Learning based solutions. The memory and computational efficiency of these algorithms have been studied, and optimization opportunities have been proposed. The designed solutions are extensively evaluated on various large-scale real-world and synthetic datasets and compared against multiple baseline solutions after appropriate adaptation. This dissertation also presents user study results involving human subjects to validate the effectiveness of the proposed models. Lastly, a notable outcome of this dissertation is the deployment of one of the developed solutions at the Naval Postgraduate School. This deployment enables simultaneous route planning for multiple assets that are robust to uncertainty under multiple contexts

    Generalization Through the Lens of Learning Dynamics

    Full text link
    A machine learning (ML) system must learn not only to match the output of a target function on a training set, but also to generalize to novel situations in order to yield accurate predictions at deployment. In most practical applications, the user cannot exhaustively enumerate every possible input to the model; strong generalization performance is therefore crucial to the development of ML systems which are performant and reliable enough to be deployed in the real world. While generalization is well-understood theoretically in a number of hypothesis classes, the impressive generalization performance of deep neural networks has stymied theoreticians. In deep reinforcement learning (RL), our understanding of generalization is further complicated by the conflict between generalization and stability in widely-used RL algorithms. This thesis will provide insight into generalization by studying the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.Comment: PhD Thesi
    • …
    corecore