The exponential cost optimality for finite horizon semi-Markov decision processes
This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation, and prove the existence of an optimal policy by using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and the ε-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm.
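The abstract does not spell out the paper's value iteration algorithm, so the following is only a minimal sketch of the general idea in a simpler setting: backward induction for a discrete-time, finite-horizon MDP under an exponential cost criterion. It is not the SMDP algorithm from the paper. The function name, the risk coefficient `gamma`, and the two-state machine with its transition probabilities and costs are all hypothetical and chosen for illustration.

```python
import math


def exp_cost_backward_induction(P, c, gamma, horizon):
    """Backward induction for a finite-horizon, discrete-time MDP under an
    exponential cost criterion (a simplification, not the paper's SMDP setting).

    P[a][s][t] : probability of moving from state s to state t under action a
    c[s][a]    : one-step cost of taking action a in state s
    gamma      : risk-sensitivity coefficient (> 0)
    horizon    : number of decision epochs
    Returns (V, policy), where V[s] is the minimal expected
    exp(gamma * total cost) from s in this model and policy[k][s] is the
    minimizing action at epoch k.
    """
    n_states, n_actions = len(c), len(c[0])
    V = [1.0] * n_states  # no cost after the horizon: exp(gamma * 0) = 1
    policy = []
    for _ in range(horizon):
        new_V, decision = [], []
        for s in range(n_states):
            # Multiplicative Bellman update: exp(gamma * cost) * E[V(next state)]
            q = [
                math.exp(gamma * c[s][a])
                * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=q.__getitem__)
            new_V.append(q[best])
            decision.append(best)
        V = new_V
        policy.insert(0, decision)  # epochs are processed last to first
    return V, policy


# Hypothetical two-state machine: state 0 = working, state 1 = failed;
# action 0 = keep running, action 1 = repair. All numbers are made up.
P = [
    [[0.9, 0.1], [0.0, 1.0]],  # keep running
    [[1.0, 0.0], [1.0, 0.0]],  # repair
]
c = [[0.0, 1.0], [2.0, 1.0]]
V, policy = exp_cost_backward_induction(P, c, gamma=1.0, horizon=3)
```

Because the criterion is multiplicative, the terminal value is 1 rather than 0, and each update multiplies the exponentiated one-step cost by the expected continuation value instead of adding them.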
Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory
Guaranteeing safe behaviour of reinforcement learning (RL) policies poses
significant challenges for safety-critical applications, despite RL's
generality and scalability. To address this, we propose a new approach to apply
verification methods from control theory to learned value functions. By
analyzing task structures for safety preservation, we formalize original
theorems that establish links between value functions and control barrier
functions. Further, we propose novel metrics for verifying value functions in
safe control tasks and practical implementation details to improve learning.
Our work presents a novel method for certificate learning, which unlocks a
diversity of verification techniques from control theory for RL policies, and
marks a significant step towards a formal framework for the general, scalable,
and verifiable design of RL-based control systems.
Analysis and Design of Vehicle Platooning Operations on Mixed-Traffic Highways
Platooning of connected and autonomous vehicles (CAVs) has a significant
potential for throughput improvement. However, the interaction between CAVs and
non-CAVs may limit the practically attainable improvement due to platooning. To
better understand and address this limitation, we introduce a new fluid model
of mixed-autonomy traffic flow and use this model to analyze and design platoon
coordination strategies. We propose a tandem-link fluid model that considers
randomly arriving platoons sharing highway capacity with non-CAVs. We derive
verifiable conditions for stability of the fluid model by analyzing an
underlying M/D/1 queuing process and establishing a Foster-Lyapunov drift
condition for the fluid model. These stability conditions enable a quantitative
analysis of highway throughput under various scenarios. The model is useful for
designing platoon coordination strategies that maximize throughput and minimize
delay. Such coordination strategies are provably optimal in the fluid model and
are practically relevant. We also validate our results using standard
macroscopic (cell transmission model, CTM) and microscopic (Simulation for
Urban Mobility, SUMO) simulation models.
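The abstract's stability conditions are derived for the tandem-link fluid model and are not stated here. As a rough illustration of the underlying queuing idea only, the sketch below checks the textbook stability condition of a single M/D/1 queue (utilization below 1) and computes the mean waiting time from the Pollaczek-Khinchine formula. The function name and the sample arrival rate and service time are assumptions, not values from the paper.

```python
import math


def md1_metrics(arrival_rate, service_time):
    """Utilization, stability flag, and mean queueing delay of an M/D/1 queue.

    arrival_rate : Poisson arrival rate (customers per unit time)
    service_time : deterministic service time per customer
    """
    rho = arrival_rate * service_time  # utilization
    if rho >= 1.0:
        # The queue is not stable, so the mean wait is unbounded.
        return rho, False, math.inf
    # Pollaczek-Khinchine mean wait with deterministic service:
    # W_q = rho * service_time / (2 * (1 - rho))
    mean_wait = rho * service_time / (2.0 * (1.0 - rho))
    return rho, True, mean_wait


# Hypothetical numbers: 0.5 arrivals per second, 1 second of service each.
rho, stable, mean_wait = md1_metrics(arrival_rate=0.5, service_time=1.0)
```

A single queue of this kind ignores the interaction between platoons and non-CAV traffic that the fluid model is built to capture, so it should be read as background for the M/D/1 reference rather than as the paper's result.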