546 research outputs found

    Independent learners in abstract traffic scenarios

    Traffic is a phenomenon that emerges from individual, uncoordinated and, most of the time, selfish route choices made by drivers. In general, this leads to poor global and individual performance regarding travel times and road network load balance. This work presents a reinforcement learning based approach for route choice which relies solely on drivers' experience to guide their decisions. There is no coordinated learning mechanism, thus driver agents are independent learners. Our approach is tested on two abstract traffic scenarios and compared to other route choice methods. Experimental results show that drivers learn routes in complex scenarios with no prior knowledge. In addition, the approach outperforms the compared route choice methods regarding drivers' travel time, and it achieves satisfactory performance regarding road network load balance. The simplicity, realistic assumptions and performance of the proposed approach suggest that it is a feasible candidate for implementation in navigation systems that guide drivers' route choice decisions.
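The idea of independent learners can be illustrated with a minimal sketch. It is not the paper's implementation: the stateless Q-learning update, the linear congestion cost, and every name and parameter value (`simulate_route_choice`, `free_flow`, `slope`, `alpha`, `epsilon`) are assumptions chosen only for illustration. Each driver keeps a private value table and learns from its own travel times, with no shared information.

```python
import random


def simulate_route_choice(n_drivers=100, n_routes=2, episodes=200,
                          alpha=0.1, epsilon=0.1, seed=0):
    """Stateless Q-learning in which every driver learns independently.

    Travel time on a route grows linearly with the number of drivers using it
    (a hypothetical cost function, not taken from the paper).
    """
    rng = random.Random(seed)
    # Hypothetical free-flow times and congestion slopes per route.
    free_flow = [10.0 + 5.0 * r for r in range(n_routes)]
    slope = [0.2] * n_routes
    # One private Q-table per driver; nothing is shared between drivers.
    q = [[0.0] * n_routes for _ in range(n_drivers)]
    history = []
    load = [0] * n_routes
    for _ in range(episodes):
        # Each driver picks a route epsilon-greedily from its own estimates.
        choices = []
        for d in range(n_drivers):
            if rng.random() < epsilon:
                choices.append(rng.randrange(n_routes))
            else:
                choices.append(q[d].index(max(q[d])))
        # Congestion emerges from the combined, uncoordinated choices.
        load = [choices.count(r) for r in range(n_routes)]
        times = [free_flow[r] + slope[r] * load[r] for r in range(n_routes)]
        # Each driver updates only the route it experienced.
        for d, r in enumerate(choices):
            reward = -times[r]
            q[d][r] += alpha * (reward - q[d][r])
        history.append(sum(times[r] for r in choices) / n_drivers)
    # Average travel time per episode, and route loads in the last episode.
    return history, load
```

Calling `simulate_route_choice()` returns the average travel time per episode, which can be plotted to see whether the drivers' choices settle over time.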

    Sharing diverse information gets driver agents to learn faster : an application in en route trip building

    With the increase in the use of private transportation, developing more efficient ways to distribute routes in a traffic network has become increasingly important. Several attempts to address this issue have already been proposed, either by using a central authority to assign routes to the vehicles, or by means of a learning process where drivers select their best routes based on their previous experiences. The present work addresses a way to connect reinforcement learning to new technologies such as car-to-infrastructure communication in order to augment the drivers' knowledge and thereby accelerate the learning process. Our method was compared to both a classical, iterative approach and standard reinforcement learning without communication. Results show that our method outperforms both of them. Further, we have performed robustness tests by allowing messages to be lost and by reducing the storage capacity of the communication devices. We were able to show that our method is not only tolerant to information loss, but also shows improved performance when not all agents get the same information. Hence, we stress that, before deploying communication in urban scenarios, it is necessary to take into consideration that the quality and diversity of the information shared are key aspects.

    Improving Pan-African research and education networks through traffic engineering: A LISP/SDN approach

    The UbuntuNet Alliance, a consortium of National Research and Education Networks (NRENs), runs an exclusive data network for education and research in east and southern Africa. Despite a high degree of route redundancy in the Alliance's topology, a large portion of Internet traffic between the NRENs is circuitously routed through Europe. This thesis proposes a performance-based strategy for dynamic ranking of inter-NREN paths to reduce latencies. The thesis makes two contributions: firstly, mapping Africa's inter-NREN topology and quantifying the extent and impact of circuitous routing; and, secondly, a dynamic traffic engineering scheme based on Software Defined Networking (SDN), Locator/Identifier Separation Protocol (LISP) and Reinforcement Learning. To quantify the extent and impact of circuitous routing among Africa's NRENs, active topology discovery was conducted. Traceroute results showed that up to 75% of traffic from African sources to African NRENs went through inter-continental routes and experienced much higher latencies than traffic routed within Africa. An efficient mechanism for topology discovery was implemented by incorporating prior knowledge of overlapping paths to minimize redundancy during measurements. Evaluation of the network probing mechanism showed a 47% reduction in packets required to complete measurements. An interactive geospatial topology visualization tool was designed to evaluate how NREN stakeholders could identify routes between NRENs. Usability evaluation showed that users were able to identify routes with an accuracy level of 68%. NRENs are faced with at least three problems to optimize traffic engineering, namely: how to discover alternate end-to-end paths; how to measure and monitor performance of different paths; and how to reconfigure alternate end-to-end paths.
This work designed and evaluated a traffic engineering mechanism for dynamic discovery and configuration of alternate inter-NREN paths using SDN, LISP and Reinforcement Learning. A LISP/SDN based traffic engineering mechanism was designed to enable NRENs to dynamically rank alternate gateways. Emulation-based evaluation of the mechanism showed that dynamic path ranking was able to achieve 20% lower latencies compared to the default static path selection. SDN and Reinforcement Learning were used to enable dynamic packet forwarding in a multipath environment, through hop-by-hop ranking of alternate links based on latency and available bandwidth. The solution achieved minimum latencies with significant increases in aggregate throughput compared to static single path packet forwarding. Overall, this thesis provides evidence that integration of LISP, SDN and Reinforcement Learning, as well as ranking and dynamic configuration of paths could help Africa's NRENs to minimise latencies and to achieve better throughputs
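Dynamic path ranking of the kind described above can be sketched as follows. This is only an illustrative assumption, not the thesis's mechanism: the class `PathRanker`, the exponentially weighted latency estimate, the epsilon-greedy probing, and the path names are all hypothetical, and no LISP or SDN controller API is modelled.

```python
import random


class PathRanker:
    """Ranks alternate paths by a running latency estimate (lower is better)."""

    def __init__(self, paths, alpha=0.2, epsilon=0.1, seed=0):
        self.alpha = alpha        # weight given to the newest measurement
        self.epsilon = epsilon    # probability of probing a random path
        self.rng = random.Random(seed)
        self.estimate = {p: None for p in paths}  # None means not yet measured

    def update(self, path, latency_ms):
        """Fold a new latency measurement into the running estimate."""
        old = self.estimate[path]
        if old is None:
            self.estimate[path] = float(latency_ms)
        else:
            self.estimate[path] = old + self.alpha * (latency_ms - old)

    def ranking(self):
        """Return paths ordered best first; unmeasured paths come first
        so that each one is probed at least once."""
        return sorted(self.estimate,
                      key=lambda p: (self.estimate[p] is not None,
                                     self.estimate[p] or 0.0))

    def choose(self):
        """Mostly pick the best-ranked path, occasionally explore another."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.estimate))
        return self.ranking()[0]
```

In use, a gateway would call `update` after each latency probe and `choose` when selecting a path for new flows; the occasional exploration keeps estimates fresh for paths that are currently ranked low.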

    Sample Efficient Policy Search for Optimal Stopping Domains

    Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on 3 domains taken from diverse fields. Comment: To appear in IJCAI-201

    Learning-based perception and control with adaptive stress testing for safe autonomous air mobility

    The use of electrical vertical takeoff and landing (eVTOL) aircraft to provide efficient, high-speed, on-demand air transportation within a metropolitan area is a topic of increasing interest, which is expected to bring fundamental changes to the city infrastructures and daily commutes. NASA, Uber, and Airbus have been exploring this exciting concept of Urban Air Mobility (UAM), which has the potential to provide meaningful door-to-door trip time savings compared with automobiles. However, successfully bringing such vehicles and airspace operations to fruition will require introducing orders-of-magnitude more aircraft to a given airspace volume, and the ability to manage many of these eVTOL aircraft safely in a congested urban area presents a challenge unprecedented in air traffic management. Although there are existing solutions for communication technology, onboard computing capability, and sensor technology, the computation guidance algorithm to enable safe, efficient, and scalable flight operations for dense self-organizing air traffic still remains an open question. In order to enable safe and efficient autonomous on-demand free flight operations in this UAM concept, a suite of tools in learning-based perception and control systems with stress testing for safe autonomous air mobility is proposed in this dissertation. First, a key component for the safe autonomous operation of unmanned aircraft is an effective onboard perception system, which will support sense-and-avoid functions. For example, in a package delivery mission, or an emergency landing event, pedestrian detection could help unmanned aircraft with safe landing zone identification. In this dissertation, we developed a deep-learning-based onboard computer vision algorithm on unmanned aircraft for pedestrian detection and tracking. 
In contrast with existing research on ground-level pedestrian detection, the developed algorithm achieves highly accurate multiple pedestrian detection from a bird's-eye view, when both the pedestrians and the aircraft platform are moving. Second, for aircraft guidance, a message-based decentralized computational guidance algorithm with separation assurance capability is designed and analyzed in this dissertation, for both the single aircraft case and the multiple cooperative aircraft case. The proposed algorithm formulates this problem as a Markov Decision Process (MDP) and solves it using an online algorithm, Monte Carlo Tree Search (MCTS). For the multiple cooperative aircraft case, a novel coordination strategy is introduced by using the logit level-k model in behavioral game theory. To achieve higher scalability, we introduce the airspace sector concept into the UAM environment by dividing the airspace into sectors, so that each aircraft only needs to coordinate with aircraft in the same sector. At each decision step, all of the aircraft will run the proposed computational guidance algorithm onboard, which can guide all the aircraft to their respective destinations while avoiding potential conflicts among them. In addition, to make the proposed algorithm more practical, we also consider the communication constraints and communication loss among the aircraft by modifying our computational guidance algorithms given certain communication constraints (time, bandwidth, and communication loss) and designing air-to-air and air-to-ground communication frameworks to facilitate the computational guidance algorithm. To demonstrate the performance of the proposed computational guidance algorithm, a free-flight airspace simulator that incorporates environment uncertainty is built in an OpenAI Gym environment.
Numerical experiment results over several case studies including the roundabout test problem show that the proposed computational guidance algorithm has promising performance even with the high-density air traffic case. Third, to ensure the developed autonomous systems meet the high safety standards of aviation, we propose a novel, simulation driven approach for validation that can automatically discover the failure modes of a decision-making system, and optimize the parameters that configure the system to improve its safety performance. Using simulation, we demonstrate that the proposed validation algorithm is able to discover failure modes in the system that would be challenging for humans to find and fix, and we show how the algorithm can learn from these failure modes to improve the performance of the decision-making system under test