Search-based optimal motion planning for automated driving
This paper presents a framework for fast and robust motion planning designed
to facilitate automated driving. The framework allows for real-time computation
even for horizons of several hundred meters, thus enabling automated driving
in urban conditions. This is achieved through several features. Firstly, a
convenient geometrical representation of both the search space and the driving
constraints enables the use of classical path planning approaches, so a wide
variety of constraints (other vehicles, traffic lights, etc.) can be tackled
simultaneously. Secondly, an exact cost-to-go map, obtained by solving a relaxed
problem, is used by an A*-based algorithm with a model predictive flavour to
compute the optimal motion trajectory. The algorithm takes into account both
distance and time horizons. The approach is validated in a simulation study
with realistic traffic scenarios. We demonstrate the capability of the
algorithm to devise plans in both fast and slow driving conditions, even when
a full stop is required.
Comment: Preprint accepted to the 2018 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2018). A supplementary video is
available at https://youtu.be/D5XJ5ncSuq
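The cost-to-go idea described above can be illustrated in a few lines: solve a relaxed problem (here, a plain shortest path on an obstacle grid, ignoring dynamics) backwards from the goal with Dijkstra, then use that exact cost-to-go map as the A* heuristic. The grid, unit step costs, and 4-connectivity are illustrative choices, not taken from the paper.

```python
import heapq

def dijkstra_cost_to_go(grid, goal):
    """Relaxed problem: shortest path on the grid from every free cell
    to the goal. The result is an exact cost-to-go map, and therefore
    an admissible heuristic for the full search."""
    rows, cols = len(grid), len(grid[0])
    cost = {goal: 0.0}
    pq = [(0.0, goal)]
    while pq:
        c, (r, k) = heapq.heappop(pq)
        if c > cost.get((r, k), float("inf")):
            continue  # stale queue entry
        for dr, dk in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nk = r + dr, k + dk
            if 0 <= nr < rows and 0 <= nk < cols and grid[nr][nk] == 0:
                nc = c + 1.0
                if nc < cost.get((nr, nk), float("inf")):
                    cost[(nr, nk)] = nc
                    heapq.heappush(pq, (nc, (nr, nk)))
    return cost

def a_star(grid, start, goal):
    """A* guided by the precomputed cost-to-go map."""
    h = dijkstra_cost_to_go(grid, goal)
    pq = [(h.get(start, float("inf")), 0.0, start, [start])]
    seen = set()
    while pq:
        _, g, node, path = heapq.heappop(pq)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, k = node
        for dr, dk in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, k + dk)
            if nxt in h and nxt not in seen:
                heapq.heappush(pq, (g + 1.0 + h[nxt], g + 1.0, nxt, path + [nxt]))
    return None
```

Because the heuristic is exact for the relaxed problem, A* expands very few nodes off the optimal corridor, which is what makes long planning horizons affordable.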
Strategic decision making for automated driving on two-lane, one way roads using model predictive control
This paper presents an algorithm for strategic decision making regarding when lane change and overtake manoeuvres are desirable and feasible. By treating the task of driving on two-lane, one-way roads as the selection of a desired lane and velocity profile, the algorithm provides useful results in terms of velocity control as well as a decision variable indicating whether a lane change manoeuvre should be performed. The decision process is modelled as a mixed logical dynamical system and solved through model predictive control using a mixed integer programming formulation. The performance of the proposed control system is explored through simulations of varying driving scenarios on a two-lane, one-way road, which show the capability of the system to achieve appropriate longitudinal and lateral control strategies depending on the traffic situation.
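The decision structure described above, a binary lane variable coupled to a velocity profile over a horizon, can be sketched without a mixed-integer solver by brute-force enumeration of the lane decisions. The cost terms and the toy gap-based velocity rule below are invented stand-ins for the paper's mixed logical dynamical model.

```python
from itertools import product

def plan_lanes(ego_v, lead_gap, horizon=3, v_des=25.0, dt=1.0):
    """Enumerate binary lane decisions (1 = left lane at step k) and
    score each sequence with a velocity-tracking cost plus penalties
    for occupying the left lane and for changing lanes.
    lead_gap[lane] is the distance to the preceding vehicle in lane."""
    best_cost, best_lanes = None, None
    for lanes in product((0, 1), repeat=horizon):
        cost, v = 0.0, ego_v
        for k, lane in enumerate(lanes):
            gap = lead_gap[lane] - v * dt * (k + 1)
            v = min(v_des, max(0.0, gap / 2.0))  # slow down for short gaps
            cost += (v_des - v) ** 2             # velocity tracking term
            cost += 0.5 * lane                   # keep-right preference
            if k > 0 and lane != lanes[k - 1]:
                cost += 1.0                      # lane-change penalty
        if best_cost is None or cost < best_cost:
            best_cost, best_lanes = cost, lanes
    return best_lanes
```

With a slow leader blocking the right lane, the cheapest sequence moves to the left lane; with both lanes free, the keep-right penalty keeps the vehicle in its lane. A real mixed-integer formulation avoids the exponential enumeration, which is the point of the MPC approach in the paper.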
A Potential Field-Based Model Predictive Path-Planning Controller for Autonomous Road Vehicles
© IEEE 2017. Rasekhipour, Y., Khajepour, A., Chen, S.-K., & Litkouhi, B. (2016). A Potential Field-Based Model Predictive Path-Planning Controller for Autonomous Road Vehicles. IEEE Transactions on Intelligent Transportation Systems, 18(5), 1255–1267. https://doi.org/10.1109/TITS.2016.2604240
Artificial potential fields and optimal controllers are two common methods for path planning of autonomous vehicles. An artificial potential field method can assign different potential functions to different types of obstacles and road structures and plans the path based on these functions; it does not, however, include the vehicle dynamics in the path-planning process. An optimal path-planning controller integrated with vehicle dynamics, on the other hand, plans an optimal feasible path that guarantees vehicle stability in following it, but the obstacles and road boundaries are usually included in the optimal control problem as constraints rather than as arbitrary functions. This paper introduces a model predictive path-planning controller whose objective includes potential functions alongside vehicle dynamics terms. The path-planning system is therefore capable of treating different obstacles and road structures distinctly while planning the optimal path using vehicle dynamics. The controller is modeled and simulated on a CarSim vehicle model for several complicated test scenarios. The results show that, with this controller, the vehicle avoids the obstacles and observes road regulations with appropriate vehicle dynamics. Moreover, since obstacles and road regulations can be defined with different functions, the path-planning system plans paths corresponding to their importance and priorities.
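The core idea, a repulsive potential term sitting inside the planner's objective next to dynamics-like terms, can be sketched in one dimension. The Gaussian potential shape, the gains, and the greedy one-step minimization below are illustrative simplifications; the paper uses a full MPC over a vehicle model.

```python
import math

def obstacle_potential(y, y_obs, strength=5.0, width=1.0):
    """Repulsive Gaussian potential around an obstacle at lateral
    offset y_obs; shape and gains are invented for illustration."""
    return strength * math.exp(-((y - y_obs) / width) ** 2)

def plan_lateral(y0, y_obs, y_ref=0.0, horizon=5, step=0.5):
    """Greedy stand-in for the MPC: at each step pick the lateral move
    minimizing the potential-field cost plus simple lane-keeping and
    steering-effort terms."""
    path, y = [y0], y0
    for _ in range(horizon):
        y = min((y - step, y, y + step),
                key=lambda c: obstacle_potential(c, y_obs)
                + 0.1 * (c - y_ref) ** 2   # stay near reference lane
                + 0.05 * (c - y) ** 2)     # penalize large moves
        path.append(y)
    return path
```

Starting on top of an obstacle, the planner drifts sideways until the repulsive term and the lane-keeping term balance, which mirrors how the weighted potential encodes the relative priority of each obstacle or regulation.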
Game Theoretic Model Predictive Control for Autonomous Driving
This study presents two closely related solutions to autonomous vehicle control problems in highway driving scenarios using game theory and model predictive control. We first develop a game theoretic four-stage model predictive controller (GT4SMPC). The controller is responsible for both the longitudinal and lateral movements of the subject vehicle (SV). It includes a Stackelberg game as a high-level controller and a model predictive controller (MPC) as a low-level one. Specifically, GT4SMPC constantly establishes and solves games corresponding to multiple gaps in front of multiple candidate vehicles (GCVs) while the SV interacts with them by signaling a lane change intention through the turn light or by a small lateral movement. The SV's payoff is the negative of the MPC's cost function, which ensures a strong connection between the game and the controller and makes the solution of the game more likely to be achieved by a hybrid MPC (HMPC). A GCV's payoff is a linear combination of speed, headway, and acceleration payoffs. We use a decreasing acceleration model to predict the target vehicle's (TV's) future motion, which is used both to define the TV's payoffs over the prediction horizon in the game and as the reference of the MPC. Solving the games gives the optimal gap and the target vehicle. At the low level, the lane change process is divided into four stages: traveling in the current lane, leaving the current lane, crossing the lane marking, and traveling in the target lane. The division identifies the time at which the SV should initiate the actual lateral movement and specifies the constraints the HMPC must handle at each step of the prediction horizon. The four-stage HMPC then controls the SV's actual longitudinal motion and executes the lane change at the right moment. Simulations showed that GT4SMPC is able to intelligently drive the SV into the selected gap and accomplish both discretionary lane changes (DLC) and mandatory lane changes (MLC) in dynamic situations.
Human-in-the-loop driving simulation indicated that GT4SMPC can reliably control the SV to complete lane changes in the presence of human drivers. Second, we propose a differential game theoretic model predictive controller (DGTMPC) to address the drawbacks of GT4SMPC. In GT4SMPC, the games are defined as table games, meaning each player has only a limited set of choices for a given game and those choices remain fixed over the prediction horizon. In addition, GT4SMPC assumes a known model for traffic vehicles, whereas in reality drivers' preferences are partly unknown. DGTMPC allows the TV to make multiple decisions within the prediction horizon and measures the TV's driving style online. The high level of the hierarchical DGTMPC is a two-player differential lane-change Stackelberg game. We assume each player uses an MPC to control its motion, so the optimal solution of the leader's MPC depends on the solution of the follower's. We therefore convert the differential game into a bi-level optimization problem and solve it with a branch and bound algorithm. Besides the game, we propose an inverse model predictive control algorithm (IMPC) to estimate the MPC weights of other drivers online from surrounding vehicles' real-time behavior, assuming they are controlled by MPCs as well. The estimates lead to a more appropriate game solution against a driver of a specific type. The solution of the algorithm indicates the future motion of the TV, which serves as the reference for the low-level controller. The low-level HMPC controls both the SV's longitudinal motion and its real-time lane decision. Simulations showed that DGTMPC can accurately identify the weights of traffic vehicles' MPC cost functions and behave intelligently during the interaction. Comparison with a level-k controller indicates DGTMPC's superior performance.
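The table Stackelberg game at GT4SMPC's high level can be solved by plain enumeration: the leader announces an action, the follower best-responds, and the leader keeps the announcement with the best resulting payoff. The action names and payoff numbers below are invented for illustration; in the controller the leader's payoff is the negative MPC cost.

```python
def stackelberg(leader_payoff, follower_payoff):
    """Solve a two-player table Stackelberg game by enumeration.
    Payoffs are dicts keyed by (leader_action, follower_action)."""
    best = None
    for a_l in {a for a, _ in leader_payoff}:
        # follower best-responds to the announced leader action
        a_f = max((af for al, af in follower_payoff if al == a_l),
                  key=lambda af: follower_payoff[(a_l, af)])
        if best is None or leader_payoff[(a_l, a_f)] > leader_payoff[best]:
            best = (a_l, a_f)
    return best

# Hypothetical gap game: merging pays off only if the follower yields.
leader = {("merge", "yield"): 3, ("merge", "block"): -2,
          ("stay", "yield"): 0, ("stay", "block"): 0}
follower = {("merge", "yield"): 1, ("merge", "block"): 0,
            ("stay", "yield"): 0, ("stay", "block"): 1}
```

Here the follower prefers to yield once a merge is announced, so the leader merges. The differential game in DGTMPC replaces this finite table with each player's full MPC, which is why it becomes a bi-level optimization instead of a lookup.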
A Detection and Mitigation System for Unintended Acceleration: An Integrated Hybrid Data-driven and Model-based Approach
This study presents an integrated hybrid data-driven and model-based approach to detecting abnormal driving conditions. Vehicle data (e.g., velocity and gas pedal position) and traffic data (e.g., positions and velocities of cars nearby) are proposed for use in the detection process. In this study, the abnormal driving condition mainly refers to unintended acceleration (UA), which is the unintended, unexpected, uncontrolled acceleration of a vehicle. It is often accompanied by an apparent loss of braking effectiveness. UA has become one of the most complained-about vehicle problems in recent history.
The data-driven algorithm aims to use historical data to develop a model that describes the boundary between normal and abnormal vehicle behavior in the vehicle data space. At first, several detection models were created by analyzing historical vehicle data at specific moments such as acceleration peaks and gear shifting. After that, these models were incorporated into a detection system. The system decided if a UA event had occurred by sending real-time vehicle data to the models and comprehensively analyzing their diagnostic results. Besides the data-driven algorithm, a driver model-based approach is proposed. An adaptive and rational driver model based on game theory was developed for a human driver. It was combined with a vehicle model to predict future vehicle behavior. The differences between real driving behavior and predicted driving behavior were recorded and analyzed by the detection system. An unusually large difference indicated a high probability of an abnormal event.
Both the data-driven approach and the model-based approach were tested in the Simulink/dSPACE environment, which allowed a human driver to use an analog steering wheel and pedals to control a virtual vehicle in real time, making the tests more realistic. Vehicle and traffic models were created in dSPACE to study the influence of UA and ineffective brakes in various roadway driving situations. Test results show that the integrated system was capable of detecting UA within one second with high accuracy. Finally, a brake assist system was designed to cooperate with the detection system, reducing the risk of accidents.
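The model-based half of the detector reduces to a residual check: predict acceleration from driver input, and flag an anomaly when the measured value keeps exceeding the prediction. The linear pedal-to-acceleration model, the gains, and the window length below are toy assumptions; the study combines several such models with a game-theoretic driver model.

```python
def detect_ua(pedal, accel, window=5, gain=3.0, threshold=2.0):
    """Flag unintended acceleration when the residual between measured
    acceleration and the pedal-predicted value stays large for a full
    window of samples. Returns the index where the sustained anomaly
    begins, or None."""
    residuals = [a - gain * p for p, a in zip(pedal, accel)]
    for i in range(len(residuals) - window + 1):
        if all(r > threshold for r in residuals[i:i + window]):
            return i
    return None
```

Requiring the residual to persist over a window is what keeps single-sample noise (a pothole, a sensor glitch) from triggering the brake assist.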
Learning coordination in multi-agent systems
The ability for an agent to coordinate with others within a system is a
valuable property in multi-agent systems. Agents either cooperate as a team
to accomplish a common goal, or adapt to opponents to complete different
goals without being exploited. Research has shown that learning multi-agent
coordination is significantly more complex than learning policies in single-agent
environments, and requires a variety of techniques to deal with the
properties of a system where agents learn concurrently. This thesis aims to
determine how machine learning can be used to achieve coordination within
a multi-agent system. It asks what techniques can be used to tackle the
increased complexity of such systems and their credit assignment challenges,
how to achieve coordination, and how to use communication to improve the
behavior of a team.
Many algorithms for competitive environments are tabular, preventing
their use with high-dimensional or continuous state spaces, and may be
biased against specific equilibrium strategies. This thesis proposes multiple
deep learning extensions for competitive environments, allowing algorithms
to reach equilibrium strategies in complex and partially-observable environments,
relying only on local information. A tabular algorithm is also extended
with a new update rule that eliminates its bias against deterministic strategies.
Current state-of-the-art approaches for cooperative environments rely
on deep learning to handle the environment’s complexity and benefit from a
centralized learning phase. Solutions that incorporate communication between
agents often prevent agents from being executed in a distributed
manner. This thesis proposes a multi-agent algorithm where agents learn
communication protocols to compensate for local partial-observability, and
remain independently executed. A centralized learning phase can incorporate
additional environment information to increase the robustness and speed with
which a team converges to successful policies. The algorithm outperforms
current state-of-the-art approaches in a wide variety of multi-agent environments.
A permutation invariant network architecture is also proposed
to increase the scalability of the algorithm to large team sizes. Further research
is needed to identify how the techniques proposed in this thesis,
for cooperative and competitive environments, can be used in unison for mixed
environments, and whether they are adequate for general artificial intelligence.
Financial support from FCT and FSE under the III Community Support Framework. Doctoral Programme in Informatics
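The permutation-invariant architecture mentioned above can be shown in miniature: a shared linear map applied to every agent's features, mean pooling across agents, then an output layer. The weights and features are arbitrary; the point the sketch demonstrates is that reordering agents cannot change the result, which is what lets one network scale across team sizes.

```python
def team_value(agent_features, w_shared, w_out):
    """Shared per-agent linear map, mean pooling over agents, then an
    output layer. Mean pooling is symmetric in its inputs, so the
    result is invariant to the ordering of agent_features."""
    def linear(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    hidden = [linear(w_shared, f) for f in agent_features]
    n = len(hidden)
    pooled = [sum(h[j] for h in hidden) / n for j in range(len(hidden[0]))]
    return linear(w_out, pooled)[0]
```

Because nothing downstream of the pooling sees individual agents, the same weights work for any number of agents, trading per-agent identity for scalability.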