A Data-Efficient Deep Learning Approach for Deployable Multimodal Social Robots
The deep supervised and reinforcement learning paradigms (among others) have the potential to endow interactive multimodal social robots with the ability to acquire skills autonomously. But it is still not clear how they can best be deployed in real-world applications. As a step in this direction, we propose a deep learning-based approach for efficiently training a humanoid robot to play multimodal games, and use the game of 'Noughts & Crosses' with two variants as a case study. Its minimum requirements for learning to perceive and interact are based on a few hundred example images, a few example multimodal dialogues and physical demonstrations of robot manipulation, and automatic simulations. In addition, we propose novel algorithms for robust visual game tracking and for competitive policy learning with high winning rates, which substantially outperform DQN-based baselines. While an automatic evaluation shows evidence that the proposed approach can be easily extended to new games with competitive robot behaviours, a human evaluation with 130 humans playing with the Pepper robot confirms that highly accurate visual perception is required for successful game play.
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Pedestrian path prediction is an essential topic in computer vision and video
understanding. Having insight into the movement of pedestrians is crucial for
ensuring safe operation in a variety of applications including autonomous
vehicles, social robots, and environmental monitoring. Current works in this
area utilize complex generative or recurrent methods to capture many possible
futures. However, despite the inherent real-time nature of predicting future
paths, little work has been done to explore accurate and computationally
efficient approaches for this task. To this end, we propose a convolutional
approach for real-time pedestrian path prediction, CARPe. It utilizes a
variation of Graph Isomorphism Networks in combination with an agile
convolutional neural network design to form a fast and accurate path prediction
approach. Notable results in both inference speed and prediction accuracy are
achieved, improving FPS considerably in comparison to current state-of-the-art
methods while delivering competitive accuracy on well-known path prediction
datasets.
Comment: AAAI-21 Camera Read
Cyber-Agricultural Systems for Crop Breeding and Sustainable Production
The Cyber-Agricultural System (CAS) represents an overarching framework of agriculture that leverages recent advances in ubiquitous sensing, artificial intelligence, smart actuators, and scalable cyberinfrastructure (CI) in both breeding and production agriculture. We discuss the recent progress and perspective of the three fundamental components of CAS – sensing, modeling, and actuation – and the emerging concept of agricultural digital twins (DTs). We also discuss how scalable CI is becoming a key enabler of smart agriculture. In this review, we shed light on the significance of CAS in revolutionizing crop breeding and production by enhancing efficiency, productivity, sustainability, and resilience to changing climate. Finally, we identify underexplored and promising future directions for CAS research and development.
Reinforcement Learning Approaches in Social Robotics
This article surveys reinforcement learning approaches in social robotics.
Reinforcement learning is a framework for decision-making problems in which an
agent interacts through trial-and-error with its environment to discover an
optimal behavior. Since interaction is a key component in both reinforcement
learning and social robotics, it can be a well-suited approach for real-world
interactions with physically embodied social robots. The scope of the paper is
focused particularly on studies that include social physical robots and
real-world human-robot interactions with users. We present a thorough analysis
of reinforcement learning approaches in social robotics. In addition to a
survey, we categorize existing reinforcement learning approaches based on the
method used and the design of the reward mechanisms. Moreover, since
communication capability is a prominent feature of social robots, we discuss
and group the papers based on the communication medium used for reward
formulation. Considering the importance of designing the reward function, we
also provide a categorization of the papers based on the nature of the reward.
This categorization includes three major themes: interactive reinforcement
learning, intrinsically motivated methods, and task performance-driven methods.
The paper also covers the benefits and challenges of reinforcement learning in
social robotics, the evaluation methods of the papers in terms of whether they
use subjective and algorithmic measures, a discussion of real-world
reinforcement learning challenges and proposed solutions, and the points that
remain to be explored, including the approaches that have thus far received
less attention. Thus, this paper aims to become a starting point for
researchers interested in using and applying reinforcement learning methods in
this particular research field.
Software-hardware Integration and Human-centered Benchmarking for Socially-compliant Robot Navigation
Social compatibility (SC) is one of the most important parameters for
service robots. It characterises the interaction quality between a robot and a
human. In this paper, we first introduce an open-source software-hardware
integration scheme for socially-compliant robot navigation and then propose a
human-centered benchmarking framework. For the former, we integrate one 3D
lidar, one 2D lidar, and four RGB-D cameras for robot exterior perception. The
software system is entirely based on the Robot Operating System (ROS) with high
modularity and fully deployed to the embedded hardware-based edge while running
at a rate that exceeds the release frequency of sensor data. For the latter, we
propose a new human-centered performance evaluation metric that can be used to
measure SC quickly and efficiently. The values of this metric correlate with
the results of the Godspeed questionnaire, which is believed to be a gold
standard approach for SC measurements. Together with other commonly used
metrics, we benchmark two open-source socially-compliant robot navigation
methods, in an end-to-end manner. We clarify all aspects of the benchmarking to
ensure the reproducibility of the experiments. We also show that the proposed
new metric can provide further justification for the selection of numerical
metrics (objective) from a human perspective (subjective).
Comment: 8 pages, 8 figure
Reward-Based Environment States for Robot Manipulation Policy Learning
Training robot manipulation policies is a challenging and open problem in robotics and artificial intelligence. In this paper we propose a novel and compact state representation based on the rewards predicted from an image-based task success classifier. Our experiments, using the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, reveal that our proposed state representation can achieve up to 97% task success using our best policies.
ViNT: A Foundation Model for Visual Navigation
General-purpose pre-trained models ("foundation models") have enabled
practitioners to produce generalizable solutions for individual machine
learning problems with datasets that are significantly smaller than those
required for learning from scratch. Such models are typically trained on large
and diverse datasets with weak supervision, consuming much more training data
than is available for any individual downstream application. In this paper, we
describe the Visual Navigation Transformer (ViNT), a foundation model that aims
to bring the success of general-purpose pre-trained models to vision-based
robotic navigation. ViNT is trained with a general goal-reaching objective that
can be used with any navigation dataset, and employs a flexible
Transformer-based architecture to learn navigational affordances and enable
efficient adaptation to a variety of downstream navigational tasks. ViNT is
trained on a number of existing navigation datasets, comprising hundreds of
hours of robotic navigation from a variety of different robotic platforms, and
exhibits positive transfer, outperforming specialist models trained on singular
datasets. ViNT can be augmented with diffusion-based subgoal proposals to
explore novel environments, and can solve kilometer-scale navigation problems
when equipped with long-range heuristics. ViNT can also be adapted to novel
task specifications with a technique inspired by prompt-tuning, where the goal
encoder is replaced by an encoding of another task modality (e.g., GPS
waypoints or routing commands) embedded into the same space of goal tokens.
This flexibility and ability to accommodate a variety of downstream problem
domains establishes ViNT as an effective foundation model for mobile robotics.
For videos, code, and model checkpoints, see our project page at
https://visualnav-transformer.github.io.
Comment: Accepted for oral presentation at CoRL 202
How to Raise a Robot - A Case for Neuro-Symbolic AI in Constrained Task Planning for Humanoid Assistive Robots
Humanoid robots will be able to assist humans in their daily life, in particular due to their versatile action capabilities. However, while these robots need a certain degree of autonomy to learn and explore, they should also respect various constraints, for access control and beyond. We explore the novel field of incorporating privacy, security, and access control constraints with robot task planning approaches. We report preliminary results on the classical symbolic approach, deep-learned neural networks, and modern ideas using large language models as a knowledge base. From analyzing their trade-offs, we conclude that a hybrid approach is necessary, and thereby present a new use case for the emerging field of neuro-symbolic artificial intelligence.