4,727 research outputs found
Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions
This paper presents a data-driven approach for multi-robot coordination in
partially-observable domains based on Decentralized Partially Observable Markov
Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a
general framework for cooperative sequential decision making under uncertainty
and MAs allow temporally extended and asynchronous action execution. To date,
most methods assume the underlying Dec-POMDP model is known a priori or a full
simulator is available during planning time. Previous methods which aim to
address these issues suffer from local optimality and sensitivity to initial
conditions. Additionally, few hardware demonstrations involving a large team of
heterogeneous robots and with long planning horizons exist. This work addresses
these gaps by proposing an iterative sampling-based Expectation-Maximization
algorithm (iSEM) to learn policies using only trajectory data containing
observations, MAs, and rewards. Our experiments show the algorithm is able to
achieve better solution quality than the state-of-the-art learning-based
methods. We implement two variants of multi-robot Search and Rescue (SAR)
domains (with and without obstacles) on hardware to demonstrate the learned
policies can effectively control a team of distributed robots to cooperate in a
partially observable stochastic environment.
Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017).
Evolvability: What Is It and How Do We Get It?
Biological organisms exhibit spectacular adaptation to their environments. However, another marvel of biology lurks behind the adaptive traits that organisms exhibit over the course of their lifespans: it is hypothesized that biological organisms also exhibit adaptation to the evolutionary process itself. That is, biological organisms are thought to possess traits that facilitate evolution. The term evolvability was coined to describe this type of adaptation. The question of evolvability has special practical relevance to computer science researchers engaged in longstanding efforts to harness evolution as an algorithm for automated design. It is hoped that a more nuanced understanding of biological evolution will translate to more powerful digital evolution techniques. This thesis presents a theoretical overview of evolvability, illustrated with examples from biology and evolutionary computing.
Design Principles for Model Generalization and Scalable AI Integration in Radio Access Networks
Artificial intelligence (AI) has emerged as a powerful tool for addressing
complex and dynamic tasks in radio communication systems. Research in this
area, however, has focused on AI solutions for specific, limited conditions,
hindering models from learning and adapting to the generic situations
encountered across radio communication systems.
This paper emphasizes the pivotal role of achieving model generalization in
enhancing performance and enabling scalable AI integration within radio
communications. We outline design principles for model generalization in three
key domains: environment for robustness, intents for adaptability to system
objectives, and control tasks for reducing the number of AI-driven control loops.
Implementing these principles can decrease the number of models deployed and
increase adaptability in diverse radio communication environments. To address
the challenges of model generalization in communication systems, we propose a
learning architecture that leverages centralization of training and data
management functionalities, combined with distributed data generation. We
illustrate these concepts by designing a generalized link adaptation algorithm,
demonstrating the benefits of our proposed approach.
The Alberta Plan for AI Research
Herein we describe our approach to artificial intelligence research, which we
call the Alberta Plan. The Alberta Plan is pursued within our research groups
in Alberta and by like-minded researchers throughout the world. We welcome
all who would join us in this pursuit.
A Framework for Integrated Assessment Modelling
“Air quality plans” according to Air Quality Directive 2008/50/EC Art. 23 are the strategic element to be developed with the aim of reliably meeting ambient air quality standards in a cost-effective way. This chapter provides a general framework for developing and assessing such plans along the lines of the European Commission’s basic ideas for implementing effective emission reduction measures at the local, regional, and national levels. This methodological point of view also allows analysis of the existing integrated approaches.
A Survey on Causal Reinforcement Learning
While Reinforcement Learning (RL) achieves tremendous success in sequential
decision-making problems of many domains, it still faces key challenges of data
inefficiency and the lack of interpretability. Interestingly, many researchers
have recently leveraged insights from the causality literature, producing a
flourishing body of work that unifies the merits of causality and effectively
addresses these challenges of RL. It is therefore both necessary and timely to
collate these Causal Reinforcement Learning (CRL) works, offer a review of CRL
methods, and investigate the potential contributions of causality to RL.
In particular, we divide existing CRL approaches into two categories according
to whether their causality-based information is given in advance or not. We
further analyze each category in terms of the formalization of different
models, including the Markov Decision Process (MDP), the Partially Observable
Markov Decision Process (POMDP), Multi-Armed Bandits (MAB), and the Dynamic
Treatment Regime (DTR). Moreover, we summarize the evaluation metrics and open-source resources
while we discuss emerging applications, along with promising prospects for the
future development of CRL.
Comment: 29 pages, 20 figures.
Bridging adaptive management and reinforcement learning for more robust decisions
From out-competing grandmasters in chess to informing high-stakes healthcare
decisions, emerging methods from artificial intelligence are increasingly
capable of making complex and strategic decisions in diverse, high-dimensional,
and uncertain situations. But can these methods help us devise robust
strategies for managing environmental systems under great uncertainty? Here we
explore how reinforcement learning, a subfield of artificial intelligence,
approaches decision problems through a lens similar to adaptive environmental
management: learning through experience to gradually improve decisions with
updated knowledge. We review where reinforcement learning (RL) holds promise
for improving evidence-informed adaptive management decisions even when
classical optimization methods are intractable. For example, model-free deep RL
might help identify quantitative decision strategies even when models are
nonidentifiable. Finally, we discuss technical and social issues that arise
when applying reinforcement learning to adaptive management problems in the
environmental domain. Our synthesis suggests that environmental management and
computer science can learn from one another about the practices, promises, and
perils of experience-based decision-making.
Comment: In press at Philosophical Transactions of the Royal Society.
Towards full-scale autonomy for multi-vehicle systems planning and acting in extreme environments
Currently, robotic technology offers flexible platforms for addressing many challenging problems that arise in extreme environments. The nature of these problems favours the use of heterogeneous multi-vehicle systems that can coordinate and collaborate to achieve a common set of goals. While such applications have previously been explored in limited contexts, long-term deployments in such settings often require an advanced level of autonomy to maintain operability.
The success of planning and acting approaches for multi-robot systems is conditioned on reasoning about temporal, resource, and knowledge requirements, as well as world dynamics. Automated planning provides the tools to enable intelligent behaviours in robotic systems. However, whilst many planning approaches and plan execution techniques have been proposed, these solutions still struggle to consistently build and execute high-quality plans.
Motivated by these challenges, this thesis presents developments advancing state-of-the-art temporal planning and acting to address multi-robot problems. We propose a set of advanced techniques, methods and tools to build a high-level temporal
planning and execution system that can devise, execute and monitor plans suitable for long-term missions in extreme environments. We introduce a new task
allocation strategy, called HRTA, that optimises the task distribution amongst the
heterogeneous fleet, relaxes the planning problem and boosts the plan search. We
implement the TraCE planner, which performs contingent planning under propositional, temporal, and numeric constraints to deal with partial observability of the initial state. Our developments regarding robust plan execution and mission
adaptability include the HLMA, which efficiently optimises the task allocation and
refines the planning model considering the experience from robots’ previous mission
executions. We introduce the SEA failure solver which, combined with online planning, overcomes unexpected situations during mission execution, handles the implementation of joint goals, and enhances mission operability in long-term deployments.
Finally, we demonstrate the efficiency of our approaches with a series of experiments
using a new set of real-world planning domains.
Engineering and Physical Sciences Research Council (EPSRC) grant EP/R026173/