219 research outputs found
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Building general-purpose robots that can operate seamlessly, in any
environment, with any object, and utilizing various skills to complete diverse
tasks has been a long-standing goal in Artificial Intelligence. Unfortunately,
however, most existing robotic systems have been constrained - having been
designed for specific tasks, trained on specific datasets, and deployed within
specific environments. These systems usually require extensively-labeled data,
rely on task-specific models, have numerous generalization issues when deployed
in real-world scenarios, and struggle to remain robust to distribution shifts.
Motivated by the impressive open-set performance and content generation
capabilities of web-scale, large-capacity pre-trained models (i.e., foundation
models) in research fields such as Natural Language Processing (NLP) and
Computer Vision (CV), we devote this survey to exploring (i) how these existing
foundation models from NLP and CV can be applied to the field of robotics, and
also exploring (ii) what a robotics-specific foundation model would look like.
We begin by providing an overview of what constitutes a conventional robotic
system and the fundamental barriers to making it universally applicable. Next,
we establish a taxonomy to discuss current work exploring ways to leverage
existing foundation models for robotics and develop ones catered to robotics.
Finally, we discuss key challenges and promising future directions in using
foundation models for enabling general-purpose robotic systems. We encourage
readers to view our living GitHub repository of resources, including papers
reviewed in this survey as well as related projects and repositories for
developing foundation models for robotics
Spatial context-aware person-following for a domestic robot
Domestic robots are in the focus of research in
terms of service providers in households and even as robotic
companion that share the living space with humans. A major
capability of mobile domestic robots that is joint exploration
of space. One challenge to deal with this task is how could we
let the robots move in space in reasonable, socially acceptable
ways so that it will support interaction and communication
as a part of the joint exploration. As a step towards this
challenge, we have developed a context-aware following behav-
ior considering these social aspects and applied these together
with a multi-modal person-tracking method to switch between
three basic following approaches, namely direction-following,
path-following and parallel-following. These are derived from
the observation of human-human following schemes and are
activated depending on the current spatial context (e.g. free
space) and the relative position of the interacting human.
A combination of the elementary behaviors is performed in
real time with our mobile robot in different environments.
First experimental results are provided to demonstrate the
practicability of the proposed approach
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling
Active Vision for Scene Understanding
Visual perception is one of the most important sources of information for both humans and robots. A particular challenge is the acquisition and interpretation of complex unstructured scenes. This work contributes to active vision for humanoid robots. A semantic model of the scene is created, which is extended by successively changing the robot\u27s view in order to explore interaction possibilities of the scene
- …