
    Image-based multi-agent reinforcement learning for demand–capacity balancing

    Air traffic flow management (ATFM) is of crucial importance to the European Air Traffic Control System for two reasons: first, the impact of ATFM, including its safety implications, on ATC operations; second, the possible consequences of ATFM measures for both airport and airline operations. The central flow management unit therefore continually seeks to improve traffic flow management to reduce delays and congestion. In this work, we investigated the use of reinforcement learning (RL) methods to compute policies that resolve demand–capacity imbalances (a.k.a. congestion) during the pre-tactical phase. To address cases where the expected demand exceeds the airspace sector capacity, we considered agents representing flights that must jointly decide on ground delays. To overcome scalability issues, we propose using raw pixel images as input, which can represent an arbitrary number of agents without changing the system's architecture. This article compares deep Q-learning and deep deterministic policy gradient algorithms under different configurations. Experimental results, using real-world data for training and validation, confirm the effectiveness of our approach to resolving demand–capacity balancing problems and show the robustness of the RL approach presented in this article. This work was funded by EUROCONTROL under Ph.D. research contract no. 18-220569-C2 and by the Ministry of Economy, Industry, and Competitiveness of Spain under grant number PID2020-116377RB-C21. Peer Reviewed. Postprint (published version).
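    The demand–capacity imbalance this abstract addresses can be illustrated with a toy, non-RL baseline: a first-scheduled, first-served ground-delay regulation. This is a hedged sketch under an assumed hourly sector model (the flight list, capacity value, and function name are illustrative, not the paper's RL formulation):

    ```python
    # Toy ground-delay sketch for demand-capacity balancing.
    # Hypothetical model: flights request entry into a sector in hourly
    # periods; when demand exceeds capacity, later-scheduled flights are
    # pushed to the next period (a ground delay).

    def assign_ground_delays(requested_hours, capacity_per_hour):
        """Return per-flight delays (in hours) from a first-scheduled,
        first-served greedy regulation."""
        delays = []
        load = {}  # occupancy per hour after regulation
        for hour in requested_hours:
            h = hour
            while load.get(h, 0) >= capacity_per_hour:
                h += 1  # push the flight to the next period
            load[h] = load.get(h, 0) + 1
            delays.append(h - hour)
        return delays

    # Three flights request hour 0 with capacity 2: the third is delayed.
    print(assign_ground_delays([0, 0, 0, 1], capacity_per_hour=2))  # → [0, 0, 1, 0]
    ```

    The RL agents in the paper jointly learn such delay decisions instead of applying a fixed rule; the sketch only shows the underlying constraint being balanced.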

    Decision-Making in Autonomous Driving using Reinforcement Learning

    The main topic of this thesis is tactical decision-making for autonomous driving. An autonomous vehicle must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. Therefore, learning-based strategies are considered in this thesis, which introduces different approaches based on reinforcement learning (RL). A general decision-making agent, derived from the Deep Q-Network (DQN) algorithm, is proposed. With few modifications, this method can be applied to different driving environments, which is demonstrated for various simulated highway and intersection scenarios. A more sample-efficient agent can be obtained by incorporating more domain knowledge, which is explored by combining planning and learning in the form of Monte Carlo tree search and RL. In different highway scenarios, the combined method outperforms either a planning-based or a learning-based strategy on its own, while requiring an order of magnitude fewer training samples than the DQN method. A drawback of many learning-based approaches is that they create black-box solutions, which do not indicate the confidence of the agent's decisions. Therefore, the Ensemble Quantile Networks (EQN) method is introduced, which combines distributional RL with an ensemble approach to provide an estimate of both the aleatoric and the epistemic uncertainty of each decision. The results show that the EQN method can balance risk and time efficiency in different occluded intersection scenarios, while also identifying situations that the agent has not been trained for. Thereby, the agent can avoid making unfounded, potentially dangerous decisions outside of the training distribution. Finally, this thesis introduces a neural network architecture that is invariant to permutations of the order in which surrounding vehicles are listed. This architecture improves the sample efficiency of the agent by the factorial of the number of surrounding vehicles.
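    The permutation-invariant architecture mentioned at the end can be illustrated with a Deep Sets-style sketch: encode each surrounding vehicle with the same function, then pool with a sum, so the result cannot depend on listing order. The encoder, weights, and feature values below are illustrative assumptions, not the thesis's actual network:

    ```python
    # Sketch of a permutation-invariant scene encoding (Deep Sets style).
    # The per-vehicle encoder and its weights are toy stand-ins.

    def encode_vehicle(features, weights):
        # Toy per-vehicle encoder: weighted sum passed through a ReLU.
        z = sum(f * w for f, w in zip(features, weights))
        return max(0.0, z)

    def encode_scene(vehicles, weights):
        # Summing over vehicles makes the output independent of their order
        # and lets one network handle any number of surrounding vehicles.
        return sum(encode_vehicle(v, weights) for v in vehicles)

    w = [0.5, -0.25, 0.125]
    scene = [[2, 4, 8], [1, 2, 16], [4, 0, 8]]
    shuffled = [scene[2], scene[0], scene[1]]
    assert encode_scene(scene, w) == encode_scene(shuffled, w)
    ```

    Because every ordering of n vehicles maps to the same encoding, the agent no longer has to learn each of the n! orderings separately, which is the sample-efficiency argument the abstract makes.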

    ENLISTING AI IN COURSE OF ACTION ANALYSIS AS APPLIED TO NAVAL FREEDOM OF NAVIGATION OPERATIONS

    Navy Planning Process (NPP) Course of Action (COA) analysis requires time and subject matter experts (SMEs) to function properly. Independent steamers (lone destroyers) can soon find themselves short on time, on SMEs (having only one or two), or both. Artificial intelligence (AI) techniques implemented in real-time strategy (RTS) wargames can be applied to military wargaming to aid military decision-makers' COA analysis. Using a deep Q-network (DQN) and the ATLATL wargaming framework, I was able to train AI agents that could operate as the opposing force (OPFOR) commander at both satisfactory and near-optimal levels of performance after less than 24 hours of training, or 500,000 learning steps. I also show that under 6 hours, or 150,000 learning steps, of training does not yield a satisfactory AI admiral capable of playing the role of the OPFOR commander in a similarly sized freedom of navigation operation (FONOP) scenario. Applying these AI techniques can save both time onboard and time for reachback personnel. Training AI admirals as quality OPFOR commanders can enhance the NPP for the entire Navy without adding additional strain and without creating analysis paralysis. The meaningful insights and localized flashpoints revealed through hundreds of thousands of constructive operations, and experienced by the crew in live simulation or simulation replays, will lead to real-world, combat-ready naval forces capable of deterring aggression and maintaining freedom of the seas. Lieutenant, United States Navy. Approved for public release. Distribution is unlimited.
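    The learning-step-budget finding above can be illustrated with a much smaller stand-in: tabular Q-learning on a toy "close with the contact" task, where policy quality likewise depends on the number of learning steps. Everything here (the 1-D task, hyperparameters, function names) is an illustrative assumption, not the DQN/ATLATL setup from the thesis:

    ```python
    import random

    # Toy stand-in for budgeted RL training: tabular Q-learning on a 1-D
    # task where the agent must move right to reach state size-1.

    def train(steps, size=5, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
        rng = random.Random(seed)
        q = {(s, a): 0.0 for s in range(size) for a in (-1, 1)}
        s = 0
        for _ in range(steps):
            # Epsilon-greedy action selection over the two moves.
            if rng.random() < eps:
                a = rng.choice((-1, 1))
            else:
                a = max((1, -1), key=lambda act: q[(s, act)])
            s2 = min(size - 1, max(0, s + a))
            r = 1.0 if s2 == size - 1 else 0.0
            # One-step Q-learning update.
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
            s = 0 if s2 == size - 1 else s2  # reset the episode at the goal
        return q

    q = train(steps=2000)
    # With a sufficient step budget the greedy policy moves toward the
    # goal from every interior state.
    policy = [max((1, -1), key=lambda act: q[(s, act)]) for s in range(4)]
    print(policy)
    ```

    The thesis makes the analogous comparison at scale: a 500,000-step budget produced a satisfactory OPFOR agent, while a 150,000-step budget did not.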