Safe Policy Synthesis in Multi-Agent POMDPs via Discrete-Time Barrier Functions
A multi-agent partially observable Markov decision process (MPOMDP) is a
modeling paradigm used for high-level planning of heterogeneous autonomous
agents subject to uncertainty and partial observation. Despite their modeling
efficiency, MPOMDPs have not received significant attention in safety-critical
settings. In this paper, we use barrier functions to design policies for
MPOMDPs that ensure safety. Notably, our method does not rely on discretization
of the belief space or on finite memory. To this end, we formulate sufficient and
necessary conditions for the safety of a given set based on discrete-time
barrier functions (DTBFs) and we demonstrate that our formulation also allows
for Boolean compositions of DTBFs for representing more complicated safe sets.
We show that the proposed method can be implemented online by a sequence of
one-step greedy algorithms as a standalone safe controller or as a
safety-filter given a nominal planning policy. We illustrate the efficiency of
the proposed methodology based on DTBFs using a high-fidelity simulation of
heterogeneous robots. Comment: 8 pages and 4 figures
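The safety-filter idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the barrier condition, so the code assumes a commonly used discrete-time form, h(b') - h(b) >= -alpha * h(b), applied to a predicted belief. The barrier `h(b) = max_risk - P(unsafe)`, the names `predict_belief`, `barrier`, and `safety_filter`, the parameter `alpha`, and the two-state example are all hypothetical. The sketch also ignores observations and treats the joint action of all agents as a single action label.

```python
def predict_belief(belief, transition):
    """Propagate a belief one step: b'(s') = sum_s b(s) * T[s][s'].

    Simplification (assumption): no observation update is applied.
    """
    n = len(belief)
    return [sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)]


def barrier(belief, unsafe_states, max_risk):
    """Hypothetical barrier h(b) = max_risk - P(unsafe); h(b) >= 0 means safe."""
    return max_risk - sum(belief[s] for s in unsafe_states)


def safety_filter(belief, nominal_action, transitions, unsafe_states, max_risk, alpha=0.5):
    """One-step greedy filter: keep the nominal action if it satisfies the
    assumed condition h(b') - h(b) >= -alpha * h(b); otherwise return the
    action with the largest margin on that condition."""
    h_now = barrier(belief, unsafe_states, max_risk)

    def margin(action):
        b_next = predict_belief(belief, transitions[action])
        h_next = barrier(b_next, unsafe_states, max_risk)
        # Rearranged condition: h_next - (1 - alpha) * h_now >= 0.
        return h_next - (1.0 - alpha) * h_now

    if margin(nominal_action) >= 0.0:
        return nominal_action
    return max(transitions, key=margin)


# Hypothetical two-state example: state 1 is unsafe.
transitions = {
    "advance": [[0.5, 0.5], [0.0, 1.0]],  # risks entering the unsafe state
    "hold": [[1.0, 0.0], [0.0, 1.0]],     # stays in place
}
belief = [0.9, 0.1]
chosen = safety_filter(belief, "advance", transitions, unsafe_states=[1], max_risk=0.2)
print(chosen)  # prints "hold": the nominal "advance" violates the assumed condition
```

Used this way, the filter only overrides the nominal planner when the predicted belief would violate the barrier condition, which is the sense in which the abstract describes a safety filter on top of a nominal policy.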
Deep Reinforcement Learning for Flipper Control of Tracked Robots
The autonomous control of flippers plays an important role in enhancing the
intelligent operation of tracked robots within complex environments. While
existing methods mainly rely on hand-crafted control models, in this paper, we
introduce a novel approach that leverages deep reinforcement learning (DRL)
techniques for autonomous flipper control in complex terrains. Specifically, we
propose a new DRL network named AT-D3QN, which ensures safe and smooth flipper
control for tracked robots. It comprises two modules, a feature extraction and
fusion module for extracting and integrating robot and environment state
features, and a deep Q-Learning control generation module for incorporating
expert knowledge to obtain a smooth and efficient control strategy. To train
the network, a novel reward function is proposed, considering both learning
efficiency and passing smoothness. A simulation environment is constructed
using the Pymunk physics engine for training. We then directly apply the
trained model to a more realistic Gazebo simulation for quantitative analysis.
The consistently high performance of the proposed approach validates its
superiority over manual teleoperation.
Partially Observable Games for Secure Autonomy
Technology development efforts in autonomy and cyber-defense have been evolving independently of each other over the past decade. In this paper, we report our ongoing effort to integrate these two presently distinct areas into a single framework. To this end, we propose the two-player partially observable stochastic game formalism to capture both high-level autonomous mission planning under uncertainty and adversarial decision making subject to imperfect information. We show that synthesizing sub-optimal strategies for such games is possible under finite-memory assumptions for both the autonomous decision maker and the cyber-adversary. We then describe an experimental testbed to evaluate the efficacy of the proposed framework.
- …