
    Deep learning for video game playing

    In this article, we review recent deep learning advances in the context of how they have been applied to play different types of video games, such as first-person shooters, arcade games, and real-time strategy games. We analyze the unique requirements that different game genres pose to a deep learning system and highlight important open challenges in applying these machine learning methods to video games, such as general game playing, dealing with extremely large decision spaces, and sparse rewards.

    Driver Digital Twin for Online Prediction of Personalized Lane Change Behavior

    Connected and automated vehicles (CAVs) are expected to share the road with human-driven vehicles (HDVs) in the foreseeable future. It is therefore more pragmatic to consider the mixed traffic environment, as the well-planned operation of CAVs may be interrupted by HDVs. In circumstances where human behaviors have significant impacts, CAVs need to understand HDV behaviors to make safe actions. In this study, we develop a Driver Digital Twin (DDT) for the online prediction of personalized lane change behavior, allowing CAVs to predict surrounding vehicles' behaviors with the help of digital twin technology. DDT is deployed on a vehicle-edge-cloud architecture, where the cloud server models the driver behavior for each HDV based on historical naturalistic driving data, while the edge server processes the real-time data from each driver with his/her digital twin on the cloud to predict the lane change maneuver. The proposed system is first evaluated on a human-in-the-loop co-simulation platform, and then in a field implementation with three passenger vehicles connected through the 4G/LTE cellular network. The lane change intention can be recognized on average 6 seconds before the vehicle crosses the lane separation line, and the Mean Euclidean Distance between the predicted trajectory and the GPS ground truth is 1.03 meters within a 4-second prediction window. Compared to the general model, using a personalized model can improve prediction accuracy by 27.8%. The demonstration video of the proposed system can be watched at https://youtu.be/5cbsabgIOdM
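The 1.03-meter figure above is a Mean Euclidean Distance (MED) between predicted and ground-truth trajectories. A minimal sketch of that metric, using hypothetical toy trajectories rather than the paper's data:

```python
import numpy as np

def mean_euclidean_distance(pred, truth):
    """Mean Euclidean Distance between a predicted trajectory and the
    ground truth, both given as (T, 2) arrays of x/y positions."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # per-timestep Euclidean distance, averaged over the horizon
    return float(np.mean(np.linalg.norm(pred - truth, axis=1)))

# toy 4-point trajectories (illustrative, not from the paper)
pred = [[0, 0], [1, 0], [2, 0], [3, 0]]
truth = [[0, 1], [1, 1], [2, 1], [3, 1]]
print(mean_euclidean_distance(pred, truth))  # 1.0
```

Within a 4-second prediction window this is simply averaged over the predicted timesteps that fall inside the window.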

    Deep Reinforcement Learning and Game Theoretic Monte Carlo Decision Process for Safe and Efficient Lane Change Maneuver and Speed Management

    Predicting the states of the surrounding traffic is one of the major problems in automated driving. Maneuvers such as lane change, merge, and exit management can pose challenges in the absence of intervehicular communication and can benefit from driver behavior prediction. Predicting the motion of surrounding vehicles and planning trajectories need to be computationally efficient for real-time implementation. This dissertation presents a decision process model for real-time automated lane change and speed management in highway and urban traffic. In lane change and merge maneuvers, it is important to know how neighboring vehicles will act in the imminent future. Human driver models, probabilistic approaches, rule-based techniques, and machine learning approaches have addressed this problem only partially, as they do not focus on the behavioral features of the vehicles. The main goal of this research is to develop a fast algorithm that predicts the future states of the neighboring vehicles, runs a fast decision process, and learns the regret and reward of the executed decisions. The presented algorithm is developed based on level-K game theory to model and predict the interaction between the vehicles. Using deep reinforcement learning, the algorithm encodes and memorizes past experiences that are recurrently used to reduce computation and speed up motion planning. We also use Monte Carlo Tree Search (MCTS), an effective tool for fast planning in complex and dynamic game environments. This development leverages the computational power efficiently and shows promising outcomes for maneuver planning and predicting the environment's dynamics. In the absence of traffic connectivity, whether due to the passenger's choice of privacy or the vehicle's lack of technology, this development can be extended and employed in automated vehicles for real-world, practical applications.
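The dissertation's models are not reproduced here, but the bandit-style selection step at the heart of MCTS can be sketched as a UCB1 choice over discrete lane-change maneuvers. The reward model below is a hypothetical stand-in, not the level-K game-theoretic model the abstract describes:

```python
import math
import random

# Discrete maneuver choices for the ego vehicle
ACTIONS = ["keep", "left", "right"]

def rollout_reward(action):
    # hypothetical stochastic reward: "keep" is safest on average
    base = {"keep": 0.8, "left": 0.5, "right": 0.4}[action]
    return base + random.uniform(-0.1, 0.1)

def uct_choose(n_iters=3000, c=1.4):
    """Pick the most-visited action after n_iters UCB1-guided rollouts."""
    counts = {a: 0 for a in ACTIONS}
    values = {a: 0.0 for a in ACTIONS}
    for t in range(1, n_iters + 1):
        def ucb(a):
            if counts[a] == 0:
                return float("inf")  # try each action at least once
            return values[a] / counts[a] + c * math.sqrt(math.log(t) / counts[a])
        a = max(ACTIONS, key=ucb)
        values[a] += rollout_reward(a)
        counts[a] += 1
    return max(ACTIONS, key=lambda a: counts[a])

random.seed(0)
print(uct_choose())  # "keep" under this toy reward model
```

A full MCTS additionally expands a tree of future states and backs up rollout values along the visited path; the UCB1 rule above is what balances exploring rarely-tried maneuvers against exploiting the ones that have paid off.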

    Parallel driving in CPSS: a unified approach for transport automation and vehicle intelligence

    The emerging development of connected and automated vehicles imposes a significant challenge on current vehicle control and transportation systems. This paper proposes a novel unified approach, Parallel Driving, a cloud-based cyber-physical-social systems (CPSS) framework aimed at synergizing connected automated driving. This study first introduces CPSS and ACP-based intelligent machine systems. Then parallel driving is proposed in the cyber-physical-social space, considering interactions among vehicles, human drivers, and information. Within the framework, parallel testing, parallel learning, and parallel reinforcement learning are developed and concisely reviewed. Developments on the intelligent horizon (iHorizon) and its applications are also presented with a view toward the parallel horizon. The proposed parallel driving offers an ample solution for achieving smooth, safe, and efficient cooperation among connected automated vehicles with different levels of automation in future road transportation systems.

    Autonomous Navigation in (the Animal and) the Machine

    Understanding the principles underlying autonomous navigation might be the most enticing quest a computational neuroscientist can undertake. Autonomous operation, also known as voluntary behavior, is the result of higher cognitive mechanisms and of what psychology calls executive function. A rudimentary knowledge of the brain can explain where, and to a certain degree how, parts of a computation are expressed. However, achieving a satisfactory understanding of the neural computation involved in voluntary behavior is beyond today's neuroscience. In contrast with the study of the brain, which has a comprehensive body of theory for trying to understand a system of unmatched complexity, the field of AI is to a larger extent guided by examples of achievements. Although the two sciences differ in methods, theoretical foundation, scientific rigour, and direct applicability, the intersection between the two may be a viable approach toward understanding autonomy. This project is an example of how both fields may benefit from such a venture. The findings presented in this thesis may be of interest to behavioral neuroscience, exploring how operant functions can be combined to form voluntary behavior. The presented theory can also be read as documentation of a successful implementation of autonomous navigation in Euclidean space. Findings are grouped into three parts. First, pertinent background theory is presented in Part I, collecting key findings from psychology and from AI relating to autonomous navigation. Part II presents a theoretical contribution to RL theory developed during the design and implementation of the emulator for navigational autonomy, before experimental findings from a selection of published papers are attached as Part III.
    Note how this thesis emphasizes the understanding of volition and autonomous navigation rather than accomplishments by the agent, reflecting the aim of this project: to understand the basic principles of autonomous navigation to a sufficient degree to be able to recreate its effect from first principles.

    Bio-inspired approaches to the control and modelling of an anthropomimetic robot

    Introducing robots into human environments requires them to handle settings designed specifically for human size and morphology; however, large, conventional humanoid robots with stiff, high-powered joint actuators pose a significant danger to humans. By contrast, "anthropomimetic" robots mimic both human morphology and internal structure: skeleton, muscles, compliance, and high redundancy. Although far safer, their resultant compliant structure presents a formidable challenge to conventional control. Here we review, and seek to address, characteristic control issues of this class of robot, whilst exploiting their biomimetic nature by drawing upon biological motor control research. We derive a novel learning controller for discovering effective reaching actions created through sustained activation of one or more muscle synergies, an approach which draws upon strong, recent evidence from animal and human studies but is almost unexplored to date in the musculoskeletal robot literature. Since the best synergies for a given robot will be unknown, we derive a deliberately simple reinforcement learning approach intended to allow their emergence, in particular those patterns which aid linearization of control. We also draw upon optimal control theories to encourage the emergence of smoother movement by incorporating signal-dependent noise and trial repetition. In addition, we argue the utility of developing a detailed dynamic model of a complete robot and present a stable, physics-based model of the anthropomimetic ECCERobot, running in real time with 55 muscles and 88 degrees of freedom. Using the model, we find that effective reaching actions can be learned which employ only two sequential motor co-activation patterns, each controlled by just a single common driving signal.
    Factor analysis shows the emergent muscle co-activations can be reconstructed to significant accuracy using weighted combinations of only 13 common fragments, labelled "candidate synergies". Using these synergies as drivable units, the same controller learns the same task both faster and better; however, other reaching tasks perform less well, in proportion to their dissimilarity. We therefore propose that modifications enabling the emergence of a more generic set of synergies are required. Finally, we propose a continuous controller for the robot, based on model predictive control, incorporating our model as a predictive component for state estimation, delay compensation, and planning, including merging of the robot and the sensed environment into a single model. We test the delay compensation mechanism by controlling a second copy of the model acting as a proxy for the real robot, finding that performance is significantly improved if a precise degree of compensation is applied, and show how rapidly an uncompensated controller fails as the model accuracy degrades.
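The reconstruction idea above, expressing many muscle activations as weighted combinations of a few synergies, can be sketched with a least-squares fit. The dimensions match the abstract (55 muscles, 13 synergies), but the data below is synthetic and the thesis's actual factor-analysis procedure is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(0)
n_muscles, n_synergies, n_samples = 55, 13, 200

# synthetic ground truth: activations generated exactly from 13 synergies
W = rng.random((n_muscles, n_synergies))   # synergy basis (muscles x synergies)
H = rng.random((n_synergies, n_samples))   # time-varying synergy weights
A = W @ H                                  # observed muscle activations

# recover the weights for a known basis by least squares, then reconstruct
H_hat, *_ = np.linalg.lstsq(W, A, rcond=None)
A_hat = W @ H_hat
err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
print(err < 1e-8)  # True: 13 synergies suffice for this synthetic data
```

On real data the basis itself must also be estimated (e.g. via factor analysis or non-negative matrix factorization), and the relative reconstruction error measures how much of the co-activation structure the 13 fragments capture.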