
    Motion Synthesis and Control for Autonomous Agents using Generative Models and Reinforcement Learning

    Imitating and predicting human motion has wide applications in both graphics and robotics, from developing realistic models of human movement and behavior in immersive virtual worlds and games to improving autonomous navigation for service agents deployed in the real world. Traditional approaches for motion imitation and prediction typically rely on pre-defined rules to model agent behaviors or use reinforcement learning with manually designed reward functions. Despite impressive results, such approaches cannot effectively capture the diversity of motor behaviors and the decision-making capabilities of human beings. Furthermore, manually designing a model or reward function to explicitly describe human motion characteristics often involves laborious fine-tuning and repeated experiments, and may suffer from generalization issues. In this thesis, we explore data-driven approaches using generative models and reinforcement learning to study and simulate human motions. Specifically, we begin with motion synthesis and control of physically simulated agents imitating a wide range of human motor skills, and then focus on improving the local navigation decisions of autonomous agents in multi-agent interaction settings. For physics-based agent control, we introduce an imitation learning framework built upon generative adversarial networks and reinforcement learning that enables humanoid agents to learn motor skills from a few examples of human reference motion data. Our approach generates high-fidelity motions and robust controllers without needing to manually design and fine-tune a reward function, while also allowing interactive switching between different controllers based on user input. Based on this framework, we further propose a multi-objective learning scheme for composite and task-driven control of humanoid agents. Our multi-objective learning scheme adaptively balances the simultaneous learning of disparate motions from multiple reference sources and multiple goal-directed control objectives, enabling the training of efficient composite motion controllers. Additionally, we present a general framework for fast and robust learning of motor control skills. Our framework exploits particle filtering to dynamically explore and discretize the high-dimensional action space involved in continuous control tasks, and provides a multi-modal policy as a substitute for the commonly used Gaussian policies. For navigation learning, we leverage human crowd data to train a human-inspired collision avoidance policy by combining knowledge distillation and reinforcement learning. Our approach enables autonomous agents to take human-like actions during goal-directed steering in fully decentralized, multi-agent environments. To inform better control in such environments, we propose SocialVAE, a variational autoencoder based architecture that uses timewise latent variables with socially-aware conditions and a backward posterior approximation to perform agent trajectory prediction. Our approach improves current state-of-the-art performance on trajectory prediction tasks in daily human interaction scenarios and more complex scenes involving interactions between NBA players. We further extend SocialVAE by exploiting semantic maps as context conditions to generate map-compliant trajectory prediction. Our approach processes context conditions and social conditions occurring during agent-agent interactions in an integrated manner through the use of a dual-attention mechanism. We demonstrate the real-time performance of our approach and its ability to provide high-fidelity, multi-modal predictions on various large-scale vehicle trajectory prediction tasks.

    On Partially Controlled Multi-Agent Systems

    Motivated by the control theoretic distinction between controllable and uncontrollable events, we distinguish between two types of agents within a multi-agent system: controllable agents, which are directly controlled by the system's designer, and uncontrollable agents, which are not under the designer's direct control. We refer to such systems as partially controlled multi-agent systems, and we investigate how one might influence the behavior of the uncontrolled agents through appropriate design of the controlled agents. In particular, we wish to understand which problems are naturally described in these terms, what methods can be applied to influence the uncontrollable agents, the effectiveness of such methods, and whether similar methods work across different domains. Using a game-theoretic framework, this paper studies the design of partially controlled multi-agent systems in two contexts: in one context, the uncontrollable agents are expected utility maximizers, while in the other they are reinforcement learners. We suggest different techniques for controlling agents' behavior in each domain, assess their success, and examine their relationship. Comment: See http://www.jair.org/ for any accompanying file

    CoRide: Joint Order Dispatching and Fleet Management for Multi-Scale Ride-Hailing Platforms

    How to optimally dispatch orders to vehicles and how to trade off immediate against future returns are fundamental questions for a typical ride-hailing platform. We model ride-hailing as a large-scale parallel ranking problem and study the joint decision-making task of order dispatching and fleet management in online ride-hailing platforms. This task brings unique challenges in the following four aspects. First, to enable a huge number of vehicles to act and learn efficiently and robustly, we treat each region cell as an agent and build a multi-agent reinforcement learning framework. Second, to coordinate the agents from different regions to achieve long-term benefits, we leverage the geographical hierarchy of the region grids to perform hierarchical reinforcement learning. Third, to deal with the heterogeneous and variable action space for joint order dispatching and fleet management, we design the action as a ranking weight vector that ranks and selects either a specific order or a fleet management destination in a unified formulation. Fourth, to support a multi-scale ride-hailing platform, we conduct the decision-making process hierarchically, using a multi-head attention mechanism to incorporate the impact of neighboring agents and capture the key agent at each scale. We name the overall framework CoRide. Extensive experiments on real-world data from multiple cities, as well as analytic synthetic data, demonstrate that CoRide outperforms strong baselines in terms of platform revenue and user experience in the task of city-wide hybrid order dispatching and fleet management. Comment: CIKM 201
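
The "action as ranking weight vector" idea in the CoRide abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rank_candidates`, the three feature columns, and all numeric values are hypothetical, and the score is assumed here to be a plain dot product between the weight vector and each candidate's features. The point is only that orders and repositioning destinations can share one candidate list and be ranked by the same vector.

```python
from typing import List, Sequence


def rank_candidates(weights: Sequence[float],
                    candidates: List[Sequence[float]]) -> List[int]:
    """Return candidate indices sorted by descending score.

    The score of a candidate is assumed to be the dot product of the
    weight vector with the candidate's feature vector.
    """
    scores = [sum(w * x for w, x in zip(weights, feats))
              for feats in candidates]
    return sorted(range(len(candidates)),
                  key=lambda i: scores[i], reverse=True)


# Hypothetical features per candidate: [fare, pickup_distance, is_reposition].
# Orders and fleet-management (reposition) moves live in the same list.
candidates = [
    [12.0, 3.0, 0.0],  # order A
    [8.0, 0.5, 0.0],   # order B
    [0.0, 1.0, 1.0],   # reposition to a neighboring cell
]

# The agent's action: one weight vector used to rank every candidate.
weights = [1.0, -2.0, 5.0]

ranking = rank_candidates(weights, candidates)
best = ranking[0]  # index of the top-ranked candidate
```

With these made-up numbers the scores are 6, 7 and 3, so order B is ranked first. Changing `weights` changes the ranking without changing the size of the action, which is one way to read the abstract's claim about handling a variable set of candidates in a unified formulation.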