4,727 research outputs found

    Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

    This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. Previous methods which aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and with long planning horizons exist. This work addresses these gaps by proposing an iterative sampling based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than the state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment. Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017).

    Evolvability: What Is It and How Do We Get It?

    Biological organisms exhibit spectacular adaptation to their environments. However, another marvel of biology lurks behind the adaptive traits that organisms exhibit over the course of their lifespans: it is hypothesized that biological organisms also exhibit adaptation to the evolutionary process itself. That is, biological organisms are thought to possess traits that facilitate evolution. The term evolvability was coined to describe this type of adaptation. The question of evolvability has special practical relevance to computer science researchers engaged in longstanding efforts to harness evolution as an algorithm for automated design. It is hoped that a more nuanced understanding of biological evolution will translate to more powerful digital evolution techniques. This thesis presents a theoretical overview of evolvability, illustrated with examples from biology and evolutionary computing.

    Design Principles for Model Generalization and Scalable AI Integration in Radio Access Networks

    Artificial intelligence (AI) has emerged as a powerful tool for addressing complex and dynamic tasks in radio communication systems. Research in this area, however, has focused on AI solutions for specific, limited conditions, hindering models from learning and adapting to generic situations, such as those met across radio communication systems. This paper emphasizes the pivotal role of achieving model generalization in enhancing performance and enabling scalable AI integration within radio communications. We outline design principles for model generalization in three key domains: environment for robustness, intents for adaptability to system objectives, and control tasks for reducing AI-driven control loops. Implementing these principles can decrease the number of models deployed and increase adaptability in diverse radio communication environments. To address the challenges of model generalization in communication systems, we propose a learning architecture that leverages centralization of training and data management functionalities, combined with distributed data generation. We illustrate these concepts by designing a generalized link adaptation algorithm, demonstrating the benefits of our proposed approach.

    The Alberta Plan for AI Research

    Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan. The Alberta Plan is pursued within our research groups in Alberta and by others who are like minded throughout the world. We welcome all who would join us in this pursuit.

    A Framework for Integrated Assessment Modelling

    “Air quality plans” according to Air Quality Directive 2008/50/EC Art. 23 are the strategic element to be developed, with the aim of reliably meeting ambient air quality standards in a cost-effective way. This chapter provides a general framework to develop and assess such plans along the lines of the European Commission’s basic ideas to implement effective emission reduction measures at local, regional, and national levels. This methodological point of view also allows the existing integrated approaches to be analysed.

    A Survey on Causal Reinforcement Learning

    While Reinforcement Learning (RL) achieves tremendous success in sequential decision-making problems of many domains, it still faces key challenges of data inefficiency and the lack of interpretability. Interestingly, many researchers have recently leveraged insights from the causality literature, bringing forth a flourishing body of work that unifies the merits of causality and addresses these challenges of RL. As such, it is of great necessity and significance to collate these Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate the potential functionality from causality toward RL. In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not. We further analyze each category in terms of the formalization of different models, including the Markov Decision Process (MDP), Partially Observed Markov Decision Process (POMDP), Multi-Arm Bandits (MAB), and Dynamic Treatment Regime (DTR). Moreover, we summarize the evaluation matrices and open sources while we discuss emerging applications, along with promising prospects for the future development of CRL. Comment: 29 pages, 20 figures.

    Bridging adaptive management and reinforcement learning for more robust decisions

    From out-competing grandmasters in chess to informing high-stakes healthcare decisions, emerging methods from artificial intelligence are increasingly capable of making complex and strategic decisions in diverse, high-dimensional, and uncertain situations. But can these methods help us devise robust strategies for managing environmental systems under great uncertainty? Here we explore how reinforcement learning, a subfield of artificial intelligence, approaches decision problems through a lens similar to adaptive environmental management: learning through experience to gradually improve decisions with updated knowledge. We review where reinforcement learning (RL) holds promise for improving evidence-informed adaptive management decisions even when classical optimization methods are intractable. For example, model-free deep RL might help identify quantitative decision strategies even when models are nonidentifiable. Finally, we discuss technical and social issues that arise when applying reinforcement learning to adaptive management problems in the environmental domain. Our synthesis suggests that environmental management and computer science can learn from one another about the practices, promises, and perils of experience-based decision-making. Comment: In press at Philosophical Transactions of the Royal Society.

    Towards full-scale autonomy for multi-vehicle systems planning and acting in extreme environments

    Currently, robotic technology offers flexible platforms for addressing many challenging problems that arise in extreme environments. The nature of these problems favours the use of heterogeneous multi-vehicle systems which can coordinate and collaborate to achieve a common set of goals. While such applications have previously been explored in limited contexts, long-term deployments in such settings often require an advanced level of autonomy to maintain operability. The success of planning and acting approaches for multi-robot systems is conditioned by including reasoning regarding temporal, resource and knowledge requirements, and world dynamics. Automated planning provides the tools to enable intelligent behaviours in robotic systems. However, whilst many planning approaches and plan execution techniques have been proposed, these solutions highlight an inability to consistently build and execute high-quality plans. Motivated by these challenges, this thesis presents developments advancing state-of-the-art temporal planning and acting to address multi-robot problems. We propose a set of advanced techniques, methods and tools to build a high-level temporal planning and execution system that can devise, execute and monitor plans suitable for long-term missions in extreme environments. We introduce a new task allocation strategy, called HRTA, that optimises the task distribution amongst the heterogeneous fleet, relaxes the planning problem and boosts the plan search. We implement the TraCE planner that enforces contingent planning considering propositional temporal and numeric constraints to deal with partial observability about the initial state. Our developments regarding robust plan execution and mission adaptability include the HLMA, which efficiently optimises the task allocation and refines the planning model considering the experience from robots’ previous mission executions. We introduce the SEA failure solver that, combined with online planning, overcomes unexpected situations during mission execution, deals with joint goals implementation, and enhances mission operability in long-term deployments. Finally, we demonstrate the efficiency of our approaches with a series of experiments using a new set of real-world planning domains. Engineering and Physical Sciences Research Council (EPSRC) grant EP/R026173/