24 research outputs found
A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications
We propose to synthesize a control policy for a Markov decision process (MDP)
such that the resulting traces of the MDP satisfy a linear temporal logic (LTL)
property. We construct a product MDP that incorporates a deterministic Rabin
automaton generated from the desired LTL property. The reward function of the
product MDP is defined from the acceptance condition of the Rabin automaton.
This construction allows us to apply techniques from learning theory to the
problem of synthesis for LTL specifications even when the transition
probabilities are not known a priori. We prove that our method is guaranteed to
find a controller that satisfies the LTL property with probability one if such
a policy exists, and we suggest empirically with a case study in traffic
control that our method produces reasonable control strategies even when the
LTL property cannot be satisfied with probability one.
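The core construction can be sketched in a few lines. The toy MDP, the two-state automaton (a simple reachability surrogate standing in for a full Rabin acceptance condition), and all parameters below are illustrative assumptions, not the paper's actual models; the point is that a learner can run Q-learning on the product state (MDP state, automaton state) using a reward derived from acceptance, without knowing the transition probabilities.

```python
import random

# Hypothetical toy MDP: 2 states, 2 actions. The transition probabilities
# are only sampled by the learner, never read directly, matching the
# "unknown a priori" setting described in the abstract.
P = {  # (state, action) -> list of (next_state, probability)
    (0, 0): [(0, 0.9), (1, 0.1)],
    (0, 1): [(1, 0.8), (0, 0.2)],
    (1, 0): [(1, 1.0)],
    (1, 1): [(0, 1.0)],
}
ACCEPTING = {1}  # automaton states whose visitation yields reward

def rabin_step(q, s):
    # Illustrative deterministic automaton: remember whether state 1
    # was ever visited (a reachability stand-in for Rabin acceptance).
    return 1 if (q == 1 or s == 1) else 0

def sample_next(s, a, rng):
    r, acc = rng.random(), 0.0
    for s2, p in P[(s, a)]:
        acc += p
        if r < acc:
            return s2
    return P[(s, a)][-1][0]

def q_learning(episodes=2000, horizon=10, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = {}  # ((mdp_state, automaton_state), action) -> value
    for _ in range(episodes):
        s, q = 0, 0
        for _ in range(horizon):
            key = (s, q)
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda a_: Q.get((key, a_), 0.0))
            s2 = sample_next(s, a, rng)
            q2 = rabin_step(q, s2)
            r = 1.0 if q2 in ACCEPTING else 0.0  # reward from acceptance
            best_next = max(Q.get(((s2, q2), a_), 0.0) for a_ in range(2))
            Q[(key, a)] = Q.get((key, a), 0.0) + alpha * (
                r + gamma * best_next - Q.get((key, a), 0.0))
            s, q = s2, q2
    return Q

Q = q_learning()
# Greedy action in the initial product state: action 1 reaches state 1
# (and hence the accepting automaton state) with high probability.
best = max(range(2), key=lambda a: Q.get(((0, 0), a), 0.0))
```

In the product state space the learned greedy policy prefers the action that drives the automaton toward acceptance, which is the mechanism the abstract describes.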
Towards Assume-Guarantee Profiles for Autonomous Vehicles
Rules or specifications for autonomous vehicles are currently formulated on a case-by-case basis and put together in a rather ad hoc fashion. As a step towards eliminating this practice, we propose a systematic procedure for generating a set of supervisory specifications for self-driving cars that are 1) associated with a distributed assume-guarantee structure and 2) characterized by the notions of consistency and completeness. Besides helping autonomous vehicles make better decisions on the road, the assume-guarantee contract structure also helps address the notion of blame when undesirable events occur. We give several game-theoretic examples to demonstrate the applicability of our framework.
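A minimal sketch of the consistency/completeness idea, under assumptions of my own: each supervisory rule is encoded as an assume-guarantee pair (a predicate on the scenario plus a set of allowed actions), and the rule set is complete if every scenario triggers some rule and consistent if the triggered guarantees never rule out every action. The rule names, scenarios, and action sets below are illustrative, not from the paper.

```python
# Hypothetical rules: (name, assumption predicate, guaranteed-allowed actions).
RULES = [
    ("keep_distance", lambda s: s["lead_gap_m"] < 10, {"brake", "keep_lane"}),
    ("free_road",     lambda s: s["lead_gap_m"] >= 10, {"accelerate", "keep_lane"}),
]
ALL_ACTIONS = {"brake", "keep_lane", "accelerate"}

def allowed_actions(scenario):
    """Intersect the guarantees of every rule whose assumption holds."""
    allowed = set(ALL_ACTIONS)
    for _, assume, guarantee in RULES:
        if assume(scenario):
            allowed &= guarantee
    return allowed

def consistent_and_complete(scenarios):
    """Completeness: every scenario triggers at least one rule.
    Consistency: the triggered guarantees always leave some action allowed."""
    for s in scenarios:
        triggered = [name for name, assume, _ in RULES if assume(s)]
        if not triggered or not allowed_actions(s):
            return False
    return True

scenarios = [{"lead_gap_m": 5}, {"lead_gap_m": 50}]
ok = consistent_and_complete(scenarios)
```

Because each rule records which assumption licensed which guarantee, a violated guarantee can be traced back to a specific rule, which is how an assume-guarantee structure supports assigning blame.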
Vehicle Independent Road Resistance Estimation Using Connected Vehicle Data
This paper investigates whether vehicle log data can be used to estimate vehicle-independent road resistance parameters that can have large local variations and change rapidly, such as wind speed, wind direction and road surface conditions. The estimated parameters can be used to improve range estimation, route planning and vehicle energy management. The advantage of using vehicle-independent parameters is that data from any vehicle can be used to improve the estimation and that all vehicles can benefit from the estimated data. An analytical solution previously presented for parameter estimation is verified on vehicle log data. Results show that the method works reasonably well for wind speed estimation and that changes in road conditions can be detected. Side wind effects need to be considered in future work.
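As a rough illustration of this kind of estimation (not the paper's analytical solution), a simplified longitudinal model F_traction = m·a + θ_roll·m·g + θ_aero·v² is linear in the unknown rolling-resistance and lumped aerodynamic-drag coefficients, so they can be recovered from logged (speed, acceleration, force) samples by least squares; a wind component would shift the v² term. All values below are synthetic.

```python
import numpy as np

m, g = 1500.0, 9.81                   # assumed vehicle mass [kg], gravity
rng = np.random.default_rng(0)
v = rng.uniform(5, 30, 200)           # logged speed samples [m/s]
a = rng.uniform(-1, 1, 200)           # logged acceleration samples [m/s^2]

theta_true = np.array([0.012, 0.45])  # ground truth for the synthetic data
F = m * a + theta_true[0] * m * g + theta_true[1] * v**2  # traction force

# Regressor matrix: columns multiply theta_roll and theta_aero respectively.
A = np.column_stack([np.full_like(v, m * g), v**2])
theta_hat, *_ = np.linalg.lstsq(A, F - m * a, rcond=None)
```

Because the regressors depend only on the road and air, not on the specific vehicle (once mass is normalized out), logs from any vehicle could feed the same estimate, which is the vehicle-independence argument made in the abstract.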
Improving Automated Driving through Planning with Human Internal States
This work examines the hypothesis that partially observable Markov decision
process (POMDP) planning with human driver internal states can significantly
improve both safety and efficiency in autonomous freeway driving. We evaluate
this hypothesis in a simulated scenario where an autonomous car must safely
perform three lane changes in rapid succession. Approximate POMDP solutions are
obtained through the partially observable Monte Carlo planning with observation
widening (POMCPOW) algorithm. This approach outperforms over-confident and
conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP
baselines, POMCPOW typically cuts the rate of unsafe situations in half or
increases the success rate by 50%.
Comment: Preprint before submission to IEEE Transactions on Intelligent Transportation Systems. arXiv admin note: text overlap with arXiv:1702.0085
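The ingredient that distinguishes this POMDP approach from the MDP baselines is a belief over a hidden driver internal state, updated from observed behavior. A hedged sketch of that Bayes update, with a made-up two-type driver model (the types and yield likelihoods are illustrative assumptions, not the paper's model):

```python
# Assumed likelihoods of a driver yielding to a lane-change attempt,
# conditioned on a hidden internal state ("aggressive" vs "timid").
P_YIELD = {"aggressive": 0.1, "timid": 0.8}

def update_belief(belief_aggressive, observed_yield):
    """Bayes update of P(driver is aggressive) after observing whether
    the driver yielded to a lane-change attempt."""
    pa = belief_aggressive
    la = P_YIELD["aggressive"] if observed_yield else 1 - P_YIELD["aggressive"]
    lt = P_YIELD["timid"] if observed_yield else 1 - P_YIELD["timid"]
    return (la * pa) / (la * pa + lt * (1 - pa))

b = 0.5                               # uninformative prior
for obs in [False, False, True]:      # refused twice, then yielded once
    b = update_belief(b, obs)
```

A planner that carries this belief can probe (attempt a nudge, observe the response) before committing to a lane change, which is what lets it beat both the over-confident and the conservative MDP baselines.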
Risk-aware motion planning for automated vehicle among human-driven cars
We consider the maneuver planning problem for automated vehicles when they share the road with human-driven cars and interact with each other using a finite set of maneuvers. Each maneuver is calculated considering input constraints, actuator disturbances and sensor noise, so that we can use a maneuver automaton to perform higher-level planning that is robust against lower-level effects. In order to model the behavior of human-driven cars in response to the intent of the automated vehicle, we use control improvisation to build a probabilistic model. To accommodate potential mismatches between the learned human model and human driving behaviors, we use a conditional value-at-risk objective function to obtain the optimal policy for the automated vehicle. We demonstrate through simulations that our motion planning framework, consisting of an interactive human driving model and a risk-aware motion planning strategy, makes it possible to adapt to different traffic conditions and confidence levels.
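The conditional value-at-risk (CVaR) criterion mentioned here ranks a maneuver not by its mean cost but by the mean of its worst (1 − α) fraction of sampled costs, so heavy-tailed outcomes are penalized even when the average looks good. A minimal sketch with made-up cost samples (the maneuvers and numbers are illustrative only):

```python
def cvar(costs, alpha=0.9):
    """Average of the worst (1 - alpha) tail of the sampled costs."""
    s = sorted(costs)
    tail = s[int(alpha * len(s)):]
    return sum(tail) / len(tail)

# Two hypothetical maneuvers: B has the lower mean cost but a heavy tail
# (e.g., a rare near-collision when the human model is wrong).
costs_a = [1.0] * 9 + [2.0]   # mean 1.10, worst case 2.0
costs_b = [0.1] * 9 + [10.0]  # mean 1.09, worst case 10.0
```

A risk-neutral planner would pick B (lower mean), while the CVaR objective picks A, which is exactly the hedge against model mismatch the abstract describes.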
Correct-by-Construction Advanced Driver Assistance Systems based on a Cognitive Architecture
Research into safety in autonomous and semi-autonomous vehicles has, so far,
largely been focused on testing and validation through simulation. Because
failure of these autonomous systems is potentially life-endangering,
formal methods arise as a complementary approach. This paper studies the
application of formal methods to the verification of a human driver model built
using the cognitive architecture ACT-R, and to the design of
correct-by-construction Advanced Driver Assistance Systems (ADAS). The novelty
lies in the integration of ACT-R in the formal analysis and an abstraction
technique that enables finite representation of a high-dimensional, continuous
system in the form of a Markov process. The situation considered is a
multi-lane highway driving scenario and the interactions that arise. The
efficacy of the method is illustrated in two case studies with various driving
conditions.
Comment: Proceedings at IEEE CAVS 201
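One standard way to obtain a finite Markov-process representation of a continuous system (sketched here as an assumption; the paper's exact abstraction technique may differ) is to partition the continuous state into cells and estimate cell-to-cell transition probabilities by sampling the dynamics. A toy 1-D example:

```python
import random

def dynamics(x, rng):
    """Illustrative continuous dynamics on [0, 1]: relax toward 0.5
    with additive Gaussian noise (stand-in for, e.g., lane position)."""
    return min(1.0, max(0.0, x + 0.5 * (0.5 - x) + rng.gauss(0, 0.05)))

def abstract_to_markov(n_cells=4, samples=2000, seed=0):
    """Estimate a finite transition matrix over equal-width cells by
    Monte Carlo sampling of the continuous dynamics from each cell."""
    rng = random.Random(seed)
    counts = [[0] * n_cells for _ in range(n_cells)]
    for i in range(n_cells):
        for _ in range(samples):
            x = (i + rng.random()) / n_cells           # sample inside cell i
            j = min(n_cells - 1, int(dynamics(x, rng) * n_cells))
            counts[i][j] += 1
    return [[c / samples for c in row] for row in counts]

P = abstract_to_markov()
```

The resulting finite Markov process can then be handed to a probabilistic model checker, which is what makes formal verification of the combined driver-model/ADAS system tractable.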
Voluntary lane-change policy synthesis with reactive control improvisation
In this paper, we propose reactive control improvisation
to synthesize a voluntary lane-change policy that meets
human preferences under given traffic environments. We first
train Markov models to describe traffic patterns and the motion
of vehicles responding to such patterns using traffic data. The
trained parameters are calibrated using control improvisation
to ensure the traffic scenario assumptions are satisfied. Based
on the traffic pattern, vehicle response models, and Bayesian
switching rules, the lane-change environment for an automated
vehicle is modeled as a Markov decision process. Based on
human lane-change behaviors, we train a voluntary lane-change
policy using an explicit-duration Markov decision process.
Parameters in the lane-change policy are calibrated through
reactive control improvisation to allow an automated car to
pursue faster speed while maintaining a desired frequency of
lane-change maneuvers in various traffic environments.
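The "explicit-duration" aspect can be sketched as follows: instead of a fixed per-step switching probability, each visit to the lane-keeping state draws an explicit dwell duration, which directly controls the lane-change frequency that the calibration targets. The dynamics and parameters below are illustrative assumptions, not the trained model.

```python
import random

def simulate_lane_changes(horizon=1000, mean_keep_duration=50, seed=0):
    """Explicit-duration Markov model sketch: sample a dwell time in the
    'keep lane' state, then emit one lane change, and repeat."""
    rng = random.Random(seed)
    t, changes = 0, 0
    while t < horizon:
        # Geometric duration with the given mean (an assumed choice;
        # any duration distribution could be plugged in here).
        d = 1
        while rng.random() > 1.0 / mean_keep_duration:
            d += 1
        t += d
        if t < horizon:
            changes += 1  # one lane change ends each dwell period
    return changes

n = simulate_lane_changes()
```

Calibrating `mean_keep_duration` against observed human behavior is the knob that lets the policy trade speed gains against a desired lane-change frequency.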