
    Toward Crowd-Sensitive Path Planning

    If a robot can predict crowds in parts of its environment that are inaccessible to its sensors, then it can plan to avoid them. This paper proposes a fast, online algorithm that learns average crowd densities in different areas. It also describes how these densities can be incorporated into existing navigation architectures. In simulation across multiple challenging crowd scenarios, the robot reaches its target faster, travels less, and risks fewer collisions than if it were to plan with the traditional A* algorithm. Comment: Accepted at AAAI fall symposium 201
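
The abstract above does not spell out the learning rule or how densities enter the planner, so the following is only a minimal sketch under stated assumptions: a running mean of observed crowd counts per grid cell, added as a penalty to the A* step cost. The names `CrowdDensityMap`, `plan`, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import heapq


class CrowdDensityMap:
    """Running mean of observed crowd counts per grid cell (hypothetical helper)."""

    def __init__(self):
        self.mean = {}
        self.count = {}

    def update(self, cell, observed):
        # Incremental mean: no need to store past observations.
        n = self.count.get(cell, 0) + 1
        m = self.mean.get(cell, 0.0)
        self.count[cell] = n
        self.mean[cell] = m + (observed - m) / n

    def density(self, cell):
        # Cells never observed are assumed to be empty.
        return self.mean.get(cell, 0.0)


def plan(grid, start, goal, densities=None, weight=0.0):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.

    Assumed step cost: 1 + weight * learned density of the entered cell.
    With weight >= 0 the Manhattan heuristic stays admissible.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0.0, start)]
    best = {start: 0.0}
    parent = {start: None}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best[cell]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            extra = weight * densities.density(nxt) if densities else 0.0
            ng = g + 1.0 + extra
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None


# Usage: a crowded middle cell makes the density-aware plan take a detour.
grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
crowd = CrowdDensityMap()
crowd.update((1, 1), 4)
crowd.update((1, 1), 6)
plain_path = plan(grid, (1, 0), (1, 2))
aware_path = plan(grid, (1, 0), (1, 2), densities=crowd, weight=1.0)
```

With `weight=0.0` or no density map, the sketch reduces to ordinary A*, which is the baseline the abstract compares against.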

    LeTS-Drive: Driving in a Crowd by Learning from Tree Search

    Autonomous driving in a crowded environment, e.g., a busy traffic intersection, is an unsolved challenge for robotics. The robot vehicle must contend with a dynamic and partially observable environment, noisy sensors, and many agents. A principled approach is to formalize it as a Partially Observable Markov Decision Process (POMDP) and solve it through online belief-tree search. To handle a large crowd and achieve real-time performance in this very challenging setting, we propose LeTS-Drive, which integrates online POMDP planning and deep learning. It consists of two phases. In the offline phase, we learn a policy and the corresponding value function by imitating the belief tree search. In the online phase, the learned policy and value function guide the belief tree search. LeTS-Drive leverages the robustness of planning and the runtime efficiency of learning to enhance the performance of both. Experimental results in simulation show that LeTS-Drive outperforms either planning or imitation learning alone and develops sophisticated driving skills.

    RoboBrain: Large-Scale Knowledge Engine for Robots

    In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities including symbols, natural language, haptic senses, robot trajectories, visual features and many others. The knowledge stored in the engine comes from multiple sources including physical interactions that robots have while performing tasks (perception, planning and control), knowledge bases from the Internet and learned representations from several robotics research groups. We discuss various technical aspects and associated challenges such as modeling the correctness of knowledge, inferring latent information and formulating different robotic tasks as queries to the knowledge engine. We describe the system architecture and how it supports different mechanisms for users and robots to interact with the engine. Finally, we demonstrate its use in three important research areas: grounding natural language, perception, and planning, which are the key building blocks for many robotic tasks. This knowledge engine is a collaborative effort and we call it RoboBrain. Comment: 10 pages, 9 figures

    ALAN: Adaptive Learning for Multi-Agent Navigation

    In multi-agent navigation, agents need to move towards their goal locations while avoiding collisions with other agents and static obstacles, often without communication with each other. Existing methods compute motions that are optimal locally but do not account for the aggregated motions of all agents, producing inefficient global behavior, especially when agents move in a crowded space. In this work, we develop methods to allow agents to dynamically adapt their behavior to their local conditions. We accomplish this by formulating the multi-agent navigation problem as an action-selection problem, and propose an approach, ALAN, that allows agents to compute time-efficient and collision-free motions. ALAN is highly scalable because each agent makes its own decisions on how to move using a set of velocities optimized for a variety of navigation tasks. Experimental results show that agents using ALAN, in general, reach their destinations faster than agents using ORCA, a state-of-the-art collision avoidance framework, the Social Forces model for pedestrian navigation, or a Predictive collision avoidance model. Comment: Submitted to the Autonomous Robots Journal, Special Issue on Distributed Robot
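
The abstract frames navigation as an action-selection problem over a fixed set of velocities but does not give the selection rule. The sketch below is therefore a generic bandit-style stand-in, not ALAN's actual method: each agent keeps a running reward estimate per candidate velocity and picks the best one, with optional epsilon-greedy exploration. The class `ActionSelector`, the reward function `goal_progress`, and the `epsilon` parameter are all assumptions made for illustration.

```python
import random


class ActionSelector:
    """Per-agent choice among candidate velocities (hypothetical sketch)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}
        self.pulls = {a: 0 for a in self.actions}

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def update(self, action, reward):
        # Incremental mean of the rewards observed for this action.
        self.pulls[action] += 1
        self.value[action] += (reward - self.value[action]) / self.pulls[action]


def goal_progress(velocity, goal_direction):
    """Assumed reward: velocity component along the unit direction to the goal."""
    return velocity[0] * goal_direction[0] + velocity[1] * goal_direction[1]


# Usage: try each candidate velocity once, then act greedily.
velocities = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
agent = ActionSelector(velocities, epsilon=0.0)
for v in velocities:
    agent.update(v, goal_progress(v, (1.0, 0.0)))
chosen = agent.choose()
```

Because every agent holds its own estimates and needs no communication, a scheme of this shape scales with the number of agents, which matches the scalability argument in the abstract. A realistic reward would also have to penalize imminent collisions.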

    Design Challenges of Multi-UAV Systems in Cyber-Physical Applications: A Comprehensive Survey, and Future Directions

    Unmanned Aerial Vehicles (UAVs) have recently grown rapidly to facilitate a wide range of innovative applications that can fundamentally change the way cyber-physical systems (CPSs) are designed. CPSs are a modern generation of systems with synergic cooperation between computational and physical potentials that can interact with humans through several new mechanisms. The main advantage of using UAVs in CPS applications is their exceptional features, including their mobility, dynamism, effortless deployment, adaptive altitude, agility, adjustability, and effective appraisal of real-world functions anytime and anywhere. Furthermore, from the technology perspective, UAVs are predicted to be a vital element of the development of advanced CPSs. Therefore, in this survey, we aim to pinpoint the most fundamental and important design challenges of multi-UAV systems for CPS applications. We highlight key and versatile aspects that span the coverage and tracking of targets and infrastructure objects, energy-efficient navigation, and image analysis using machine learning for fine-grained CPS applications. Key prototypes and testbeds are also investigated to show how these practical technologies can facilitate CPS applications. We present and propose state-of-the-art algorithms to address design challenges with both quantitative and qualitative methods and map these challenges to important CPS applications to draw insightful conclusions on the challenges of each application. Finally, we summarize potential new directions and ideas that could shape future research in these areas.

    Modeling and Inferring Human Intents and Latent Functional Objects for Trajectory Prediction

    This paper is about detecting functional objects and inferring human intentions in surveillance videos of public spaces. People in the videos are expected to intentionally take shortest paths toward functional objects subject to obstacles, where people can satisfy certain needs (e.g., a vending machine can quench thirst), by following one of three possible intent behaviors: reach a single functional object and stop, or sequentially visit several functional objects, or initially start moving toward one goal but then change the intent to move toward another. Since detecting functional objects in low-resolution surveillance videos is typically unreliable, we call them "dark matter" characterized by the functionality to attract people. We formulate the Agent-based Lagrangian Mechanics wherein human trajectories are probabilistically modeled as motions of agents in many layers of "dark-energy" fields, where each agent can select a particular force field to affect its motions, and thus define the minimum-energy Dijkstra path toward the corresponding source "dark matter". For evaluation, we compiled and annotated a new dataset. The results demonstrate the effectiveness of our approach in predicting human intent behaviors and trajectories, and localizing functional objects, as well as discovering distinct functional classes of objects by clustering human motion behavior in the vicinity of functional objects.

    A Survey of Data Fusion in Smart City Applications

    The advancement of various research sectors such as Internet of Things (IoT), Machine Learning, Data Mining, Big Data, and Communication Technology has shed some light on transforming an urban city that integrates the aforementioned techniques into what is commonly known as a Smart City. With the emergence of the smart city, a plethora of data sources have been made available for a wide variety of applications. The common technique for handling multiple data sources is data fusion, which improves data output quality or extracts knowledge from the raw data. In order to cater to ever-growing, highly complicated applications, studies in smart city have to utilize data from various sources and evaluate their performance based on multiple aspects. To this end, we introduce a multi-perspectives classification of data fusion to evaluate smart city applications. Moreover, we apply the proposed multi-perspectives classification to evaluate selected applications in each domain of the smart city. We conclude the paper by discussing potential future directions and challenges of data fusion integration. Comment: Accepted and to be published in Elsevier Information Fusion

    Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding

    There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and learn to transfer manipulation strategy across different objects by embedding point-cloud, natural language, and manipulation trajectory data into a shared embedding space using a deep neural network. In order to learn semantically meaningful spaces throughout our network, we introduce a method for pre-training its lower layers for multimodal feature embedding and a method for fine-tuning this embedding space using a loss-based margin. In order to collect a large number of manipulation demonstrations for different objects, we develop a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects and appliances with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot with our model can even prepare a cup of latte with appliances it has never seen before. Comment: Journal Version

    Robobarista: Object Part based Transfer of Manipulation Trajectories from Crowd-sourcing in 3D Pointclouds

    There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory. In order to collect a large number of manipulation demonstrations for different objects, we developed a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot can even manipulate objects it has never seen before. Comment: In International Symposium on Robotics Research (ISRR) 201

    Discovering Underlying Plans Based on Distributed Representations of Actions

    Plan recognition aims to discover target plans (i.e., sequences of actions) behind observed actions, with history plan libraries or domain models in hand. Previous approaches either discover plans by maximally "matching" observed actions to plan libraries, assuming target plans are from plan libraries, or infer plans by executing domain models to best explain the observed actions, assuming complete domain models are available. In real-world applications, however, target plans are often not from plan libraries and complete domain models are often not available, since building complete sets of plans and complete domain models is often difficult or expensive. In this paper we view plan libraries as corpora and learn vector representations of actions using the corpora; we then discover target plans based on the vector representations. Our approach is capable of discovering underlying plans that are not from plan libraries, without requiring that domain models be provided. We empirically demonstrate the effectiveness of our approach by comparing its performance to traditional plan recognition approaches in three planning domains.
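
To make the "plan libraries as corpora" idea concrete, here is a deliberately simplified sketch. It does not reproduce the paper's learned embeddings; as a stand-in it builds count-based context vectors, so that actions appearing in the same positions relative to their neighbors end up with similar vectors. The function names and the toy plan library are hypothetical.

```python
import math


def context_vectors(plans):
    """Count-based action vectors from a plan library (assumed stand-in for learned embeddings).

    The first half of each vector counts immediate predecessors,
    the second half counts immediate successors.
    """
    vocab = sorted({action for plan in plans for action in plan})
    index = {action: i for i, action in enumerate(vocab)}
    n = len(vocab)
    vectors = {action: [0.0] * (2 * n) for action in vocab}
    for plan in plans:
        for i, action in enumerate(plan):
            if i > 0:
                vectors[action][index[plan[i - 1]]] += 1.0
            if i + 1 < len(plan):
                vectors[action][n + index[plan[i + 1]]] += 1.0
    return vectors


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def most_similar(action, vectors):
    """Return the other action whose context vector is closest by cosine similarity."""
    others = [a for a in vectors if a != action]
    return max(others, key=lambda a: cosine(vectors[action], vectors[a]))


# Usage: a toy library in which "pick-up" and "grasp" play the same role.
library = [
    ["pick-up", "move", "put-down"],
    ["grasp", "move", "put-down"],
]
vectors = context_vectors(library)
nearest = most_similar("pick-up", vectors)
```

Vectors like these let an unobserved or unfamiliar action be related to known ones by similarity rather than by exact matching against a library, which is the property the abstract relies on.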