348 research outputs found

    A Data-Efficient Deep Learning Approach for Deployable Multimodal Social Robots

    The deep supervised and reinforcement learning paradigms (among others) have the potential to endow interactive multimodal social robots with the ability to acquire skills autonomously. However, it is not yet clear how they can best be deployed in real-world applications. As a step in this direction, we propose a deep learning-based approach for efficiently training a humanoid robot to play multimodal games, and use the game of 'Noughts & Crosses' with two variants as a case study. Its minimum requirements for learning to perceive and interact are based on a few hundred example images, a few example multimodal dialogues and physical demonstrations of robot manipulation, and automatic simulations. In addition, we propose novel algorithms for robust visual game tracking and for competitive policy learning with high winning rates, which substantially outperform DQN-based baselines. While an automatic evaluation shows evidence that the proposed approach can be easily extended to new games with competitive robot behaviours, a human evaluation with 130 humans playing with the Pepper robot confirms that highly accurate visual perception is required for successful game play.

    CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction

    Pedestrian path prediction is an essential topic in computer vision and video understanding. Having insight into the movement of pedestrians is crucial for ensuring safe operation in a variety of applications including autonomous vehicles, social robots, and environmental monitoring. Current works in this area utilize complex generative or recurrent methods to capture many possible futures. However, despite the inherent real-time nature of predicting future paths, little work has been done to explore accurate and computationally efficient approaches for this task. To this end, we propose a convolutional approach for real-time pedestrian path prediction, CARPe. It utilizes a variation of Graph Isomorphism Networks in combination with an agile convolutional neural network design to form a fast and accurate path prediction approach. Notable results in both inference speed and prediction accuracy are achieved, improving FPS considerably in comparison to current state-of-the-art methods while delivering competitive accuracy on well-known path prediction datasets. Comment: AAAI-21 Camera Read

    Cyber-Agricultural Systems for Crop Breeding and Sustainable Production

    The Cyber-Agricultural System (CAS) represents an overarching framework of agriculture that leverages recent advances in ubiquitous sensing, artificial intelligence, smart actuators, and scalable cyberinfrastructure (CI) in both breeding and production agriculture. We discuss the recent progress and perspective of the three fundamental components of CAS (sensing, modeling, and actuation) and the emerging concept of agricultural digital twins (DTs). We also discuss how scalable CI is becoming a key enabler of smart agriculture. In this review, we shed light on the significance of CAS in revolutionizing crop breeding and production by enhancing efficiency, productivity, sustainability, and resilience to changing climate. Finally, we identify underexplored and promising future directions for CAS research and development.

    Reinforcement Learning Approaches in Social Robotics

    This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts through trial-and-error with its environment to discover an optimal behavior. Since interaction is a key component in both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is focused particularly on studies that include social physical robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to a survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the papers (whether or not they use subjective and algorithmic measures), a discussion in view of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including the approaches that have thus far received less attention. Thus, this paper aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.

    Software-hardware Integration and Human-centered Benchmarking for Socially-compliant Robot Navigation

    Social compatibility (SC) is one of the most important parameters for service robots. It characterises the interaction quality between a robot and a human. In this paper, we first introduce an open-source software-hardware integration scheme for socially-compliant robot navigation and then propose a human-centered benchmarking framework. For the former, we integrate one 3D lidar, one 2D lidar, and four RGB-D cameras for robot exterior perception. The software system is entirely based on the Robot Operating System (ROS) with high modularity and is fully deployed to the embedded hardware-based edge while running at a rate that exceeds the release frequency of sensor data. For the latter, we propose a new human-centered performance evaluation metric that can be used to measure SC quickly and efficiently. The values of this metric correlate with the results of the Godspeed questionnaire, which is believed to be a gold standard approach for SC measurements. Together with other commonly used metrics, we benchmark two open-source socially-compliant robot navigation methods in an end-to-end manner. We clarify all aspects of the benchmarking to ensure the reproducibility of the experiments. We also show that the proposed new metric can provide further justification for the selection of numerical metrics (objective) from a human perspective (subjective). Comment: 8 pages, 8 figure

    Reward-Based Environment States for Robot Manipulation Policy Learning

    Training robot manipulation policies is a challenging and open problem in robotics and artificial intelligence. In this paper we propose a novel and compact state representation based on the rewards predicted from an image-based task success classifier. Our experiments, using the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, reveal that our proposed state representation can achieve up to 97% task success using our best policies.

    ViNT: A Foundation Model for Visual Navigation

    General-purpose pre-trained models ("foundation models") have enabled practitioners to produce generalizable solutions for individual machine learning problems with datasets that are significantly smaller than those required for learning from scratch. Such models are typically trained on large and diverse datasets with weak supervision, consuming much more training data than is available for any individual downstream application. In this paper, we describe the Visual Navigation Transformer (ViNT), a foundation model that aims to bring the success of general-purpose pre-trained models to vision-based robotic navigation. ViNT is trained with a general goal-reaching objective that can be used with any navigation dataset, and employs a flexible Transformer-based architecture to learn navigational affordances and enable efficient adaptation to a variety of downstream navigational tasks. ViNT is trained on a number of existing navigation datasets, comprising hundreds of hours of robotic navigation from a variety of different robotic platforms, and exhibits positive transfer, outperforming specialist models trained on singular datasets. ViNT can be augmented with diffusion-based subgoal proposals to explore novel environments, and can solve kilometer-scale navigation problems when equipped with long-range heuristics. ViNT can also be adapted to novel task specifications with a technique inspired by prompt-tuning, where the goal encoder is replaced by an encoding of another task modality (e.g., GPS waypoints or routing commands) embedded into the same space of goal tokens. This flexibility and ability to accommodate a variety of downstream problem domains establishes ViNT as an effective foundation model for mobile robotics. For videos, code, and model checkpoints, see our project page at https://visualnav-transformer.github.io. Comment: Accepted for oral presentation at CoRL 202

    How to Raise a Robot - A Case for Neuro-Symbolic AI in Constrained Task Planning for Humanoid Assistive Robots

    Humanoid robots will be able to assist humans in their daily life, in particular due to their versatile action capabilities. However, while these robots need a certain degree of autonomy to learn and explore, they should also respect various constraints, for access control and beyond. We explore the novel field of incorporating privacy, security, and access control constraints with robot task planning approaches. We report preliminary results on the classical symbolic approach, deep-learned neural networks, and modern ideas using large language models as a knowledge base. From analyzing their trade-offs, we conclude that a hybrid approach is necessary, and thereby present a new use case for the emerging field of neuro-symbolic artificial intelligence.