
    Modeling Human Driving Behavior through Generative Adversarial Imitation Learning

    Full text link
    Imitation learning is an approach for generating intelligent behavior when the cost function is unknown or difficult to specify. Building upon work in inverse reinforcement learning (IRL), Generative Adversarial Imitation Learning (GAIL) aims to provide effective imitation even for problems with large or continuous state and action spaces. Driver modeling is one example of a problem where the state and action spaces are continuous. Human driving behavior is characterized by non-linearity and stochasticity, and the underlying cost function is unknown. As a result, learning from human driving demonstrations is a promising approach for generating human-like driving behavior. This article describes the use of GAIL for learning-based driver modeling. Because driver modeling is inherently a multi-agent problem, where the interaction between agents needs to be modeled, this paper describes a parameter-sharing extension of GAIL called PS-GAIL to tackle multi-agent driver modeling. In addition, GAIL is domain agnostic, making it difficult to encode specific knowledge relevant to driving in the learning process. This paper describes Reward Augmented Imitation Learning (RAIL), which modifies the reward signal to provide domain-specific knowledge to the agent. Finally, human demonstrations are dependent upon latent factors that may not be captured by GAIL. This paper describes Burn-InfoGAIL, which allows for disentanglement of latent variability in demonstrations. Imitation learning experiments are performed using NGSIM, a real-world highway driving dataset. Experiments show that these modifications to GAIL can successfully model highway driving behavior, accurately replicating human demonstrations and generating realistic, emergent behavior in the traffic flow arising from the interaction between driving agents.
    Comment: 28 pages, 8 figures. arXiv admin note: text overlap with arXiv:1803.0104
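    The adversarial core of GAIL can be sketched in a few lines: a discriminator learns to separate expert state-action pairs from the policy's rollouts, and its output is turned into a surrogate reward that the policy then maximizes. The following is a minimal sketch on synthetic data, assuming a toy logistic discriminator; the variable names and data are illustrative, not the paper's implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for expert and policy (state, action) pairs.
    expert_sa = rng.normal(loc=1.0, size=(256, 4))
    policy_sa = rng.normal(loc=-1.0, size=(256, 4))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Train a logistic discriminator D(s, a): label expert pairs 1, policy pairs 0.
    w, b = np.zeros(4), 0.0
    X = np.vstack([expert_sa, policy_sa])
    y = np.concatenate([np.ones(256), np.zeros(256)])
    for _ in range(200):
        p = sigmoid(X @ w + b)
        w -= 0.5 * (X.T @ (p - y)) / len(y)
        b -= 0.5 * np.mean(p - y)

    # GAIL-style surrogate reward: large where D believes a pair looks expert-like.
    def surrogate_reward(sa):
        return -np.log(1.0 - sigmoid(sa @ w + b) + 1e-8)
    ```

    In the full algorithm this reward feeds a policy-gradient update (e.g., TRPO), and discriminator and policy are trained alternately; PS-GAIL additionally shares the policy parameters across all simulated drivers.
    
    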

    A Probabilistic Framework for Imitating Human Race Driver Behavior

    Full text link
    Understanding and modeling human driver behavior is crucial for advanced vehicle development. However, unique driving styles, inconsistent behavior, and complex decision processes render it a challenging task, and existing approaches often lack variability or robustness. To approach this problem, we propose Probabilistic Modeling of Driver behavior (ProMoD), a modular framework which splits the task of driver behavior modeling into multiple modules. A global target trajectory distribution is learned with Probabilistic Movement Primitives, clothoids are utilized for local path generation, and the corresponding choice of actions is performed by a neural network. Experiments in a simulated car racing setting show considerable advantages in imitation accuracy and robustness compared to other imitation learning algorithms. The modular architecture of the proposed framework facilitates straightforward extensibility in driving line adaptation and sequencing of multiple movement primitives for future research.
    Comment: updated references [17] and [33]; added journal information
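    The Probabilistic Movement Primitives used for ProMoD's global trajectory distribution represent a trajectory as a weighted sum of basis functions, with a Gaussian distribution over the weights capturing demonstration variability. A minimal sketch, with illustrative basis settings and randomly initialized weight statistics standing in for parameters that would be fit to demonstrations:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Radial basis functions over normalized time (shapes chosen for illustration).
    T = 50
    centers = np.linspace(0, 1, 8)
    t = np.linspace(0, 1, T)
    Phi = np.exp(-((t[:, None] - centers[None, :]) ** 2) / 0.02)  # (T, 8)
    Phi /= Phi.sum(axis=1, keepdims=True)                         # normalize rows

    # Gaussian over basis weights; in practice fit to demonstrated trajectories.
    mu_w = rng.normal(size=8)
    Sigma_w = 0.05 * np.eye(8)

    # Sample a target trajectory: draw weights, project through the basis.
    w = rng.multivariate_normal(mu_w, Sigma_w)
    trajectory = Phi @ w  # shape (T,)
    ```

    Sampling different weight vectors from the same distribution yields distinct but stylistically consistent trajectories, which is how the framework preserves driver variability.
    
    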

    Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics

    Full text link
    Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning structurally similar tasks to those used to construct the skill space. We first propose accelerating exploration in the skill space using state-conditioned generative models to directly bias the high-level agent towards only sampling skills relevant to a given state based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration, outperforming prior works. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
    Comment: 6th Conference on Robot Learning (CoRL), 202
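    The residual idea above composes the final action from two parts: a nominal action decoded from the sampled latent skill, plus a small state-conditioned correction. A toy sketch, assuming hypothetical linear stand-ins for the learned skill decoder and residual policy (these functions are illustrative, not the project's code):

    ```python
    import numpy as np

    # Hypothetical learned skill decoder: maps latent skill z and state to a
    # nominal action.
    def skill_decoder(z, state):
        return 0.5 * z + 0.1 * state

    # Hypothetical residual policy: a small correction, clipped so the skill
    # prior still dominates the behavior.
    def residual_policy(state, nominal_action):
        return np.clip(-0.2 * nominal_action + 0.05 * state, -0.1, 0.1)

    z = np.array([0.8, -0.4])      # latent skill sampled by the high-level agent
    state = np.array([1.0, 2.0])
    nominal = skill_decoder(z, state)
    action = nominal + residual_policy(state, nominal)  # executed on the robot
    ```

    Bounding the residual is the key design choice: the downstream agent can adapt decoded skills to unseen task variations without being able to override the prior entirely.
    
    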

    Exploiting Structure for Scalable and Robust Deep Learning

    Get PDF
    Deep learning has seen great success training deep neural networks for complex prediction problems, such as large-scale image recognition, short-term time-series forecasting, and learning behavioral models for games with simple dynamics. However, neural networks have a number of weaknesses: 1) they are not sample-efficient and 2) they are often not robust against (adversarial) input perturbations. Hence, it is challenging to train neural networks for problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics, or noisy high-resolution image data. This thesis contributes methods to improve the sample efficiency, expressive power, and robustness of neural networks, by exploiting various forms of low-dimensional structure, such as spatiotemporal hierarchy and multi-agent coordination. We show the effectiveness of this approach in multiple learning paradigms: in both the supervised learning (e.g., imitation learning) and reinforcement learning settings. First, we introduce hierarchical neural networks that model both short-term actions and long-term goals from data, and can learn human-level behavioral models for spatiotemporal multi-agent games, such as basketball, using imitation learning. Second, in reinforcement learning, we show that behavioral policies with a hierarchical latent structure can efficiently learn forms of multi-agent coordination, which enables a form of structured exploration for faster learning. Third, we showcase tensor-train recurrent neural networks that can model high-order multiplicative structure in dynamical systems (e.g., Lorenz dynamics). We show that this model class gives state-of-the-art long-term forecasting performance with very long time horizons for both simulation and real-world traffic and climate data.
    Finally, we demonstrate two methods for neural network robustness: 1) stability training, a form of stochastic data augmentation to make neural networks more robust, and 2) neural fingerprinting, a method that detects adversarial examples by validating the network’s behavior in the neighborhood of any given input. In sum, this thesis takes a step to enable machine learning for the next scale of problem complexity, such as rich spatiotemporal multi-agent games and large-scale robust predictions.
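    Stability training, as described above, can be sketched as a two-term objective: the usual task loss on clean inputs plus a penalty on how much the network's output changes under a small input perturbation. The toy one-layer model, variable names, and coefficients below are illustrative assumptions, not the thesis's implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Toy one-layer network standing in for an arbitrary model.
    def model(x, W):
        return np.tanh(x @ W)

    def stability_loss(x, W, y_true, sigma=0.1, alpha=0.5):
        # Perturb a copy of the input with Gaussian noise.
        x_noisy = x + sigma * rng.normal(size=x.shape)
        y_clean = model(x, W)
        y_noisy = model(x_noisy, W)
        task = np.mean((y_clean - y_true) ** 2)        # standard task loss
        stability = np.mean((y_clean - y_noisy) ** 2)  # penalize output drift
        return task + alpha * stability

    x = rng.normal(size=(32, 3))
    W = rng.normal(size=(3, 2))
    y_true = model(x, W) + 0.01   # synthetic targets near the model's output
    loss = stability_loss(x, W, y_true)
    ```

    Minimizing the stability term pushes the network toward locally flat, perturbation-insensitive predictions, which is the source of the robustness gain.
    
    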

    Engage D1.2 Final Project Results Report

    Get PDF
    This deliverable summarises the activities and results of Engage, the SESAR 2020 Knowledge Transfer Network (KTN). The KTN initiated and supported multiple activities for SESAR and the European air traffic management (ATM) community, including PhDs, focused catalyst fund projects, thematic workshops, summer schools and the launch of a wiki as the one-stop, go-to source for ATM research and knowledge in Europe. Key throughout was the integration of exploratory and industrial research, thus expediting the innovation pipeline and bringing researchers together. These activities laid valuable foundations for the SESAR Digital Academy.

    “Economic man” in cross-cultural perspective: Behavioral experiments in 15 small-scale societies

    Get PDF
    Researchers from across the social sciences have found consistent deviations from the predictions of the canonical model of self-interest in hundreds of experiments from around the world. This research, however, cannot determine whether the uniformity results from universal patterns of human behavior or from the limited cultural variation available among the university students used in virtually all prior experimental work. To address this, we undertook a cross-cultural study of behavior in ultimatum, public goods, and dictator games in a range of small-scale societies exhibiting a wide variety of economic and cultural conditions. We found, first, that the canonical model, based on self-interest, fails in all of the societies studied. Second, our data reveal substantially more behavioral variability across social groups than has been found in previous research. Third, group-level differences in economic organization and the structure of social interactions explain a substantial portion of the behavioral variation across societies: the higher the degree of market integration and the higher the payoffs to cooperation in everyday life, the greater the level of prosociality expressed in experimental games. Fourth, the available individual-level economic and demographic variables do not consistently explain game behavior, either within or across groups. Fifth, in many cases experimental play appears to reflect the common interactional patterns of everyday life.