Teaching robots parametrized executable plans through spoken interaction
While operating in domestic environments, robots will necessarily
face difficulties not envisioned by their developers at programming
time. Moreover, the tasks to be performed by a robot will often
have to be specialized and/or adapted to the needs of specific users
and specific environments. Hence, learning how to operate by interacting
with the user seems a key enabling feature to support the
introduction of robots in everyday environments.
In this paper we contribute a novel approach for learning, through
the interaction with the user, task descriptions that are defined as a
combination of primitive actions. The proposed approach makes
a significant step forward by making task descriptions parametric
with respect to domain specific semantic categories. Moreover, by
mapping the task representation into a task representation language,
we are able to express complex execution paradigms and to revise
the learned tasks in a high-level fashion. The approach is evaluated
in multiple practical applications with a service robot.
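
As a rough illustration of what a task description that is parametric over semantic categories could look like, here is a minimal Python sketch. It is not the paper's representation or its task representation language: the category table, the `Step` and `Task` classes, the action names, and the `bring` task are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Hypothetical semantic categories and their known instances (assumed for illustration).
CATEGORIES = {
    "Object": {"book", "cup"},
    "Location": {"kitchen", "bedroom"},
}


@dataclass(frozen=True)
class Step:
    action: str   # name of a primitive action (hypothetical names)
    args: tuple   # parameter names, resolved when the task is instantiated


@dataclass
class Task:
    name: str
    params: dict  # parameter name -> semantic category
    steps: list   # ordered list of Step objects

    def instantiate(self, **bindings):
        """Bind each parameter to a concrete value and return a ground plan."""
        for param, category in self.params.items():
            if param not in bindings:
                raise ValueError(f"missing value for parameter '{param}'")
            if bindings[param] not in CATEGORIES[category]:
                raise ValueError(f"'{bindings[param]}' is not a known {category}")
        # Replace parameter names with their bound values, step by step.
        return [
            (step.action, tuple(bindings.get(arg, arg) for arg in step.args))
            for step in self.steps
        ]


# A task defined once as a combination of primitive actions, reusable for any
# Object/Location pair.
bring = Task(
    name="bring",
    params={"obj": "Object", "dest": "Location"},
    steps=[
        Step("find", ("obj",)),
        Step("grasp", ("obj",)),
        Step("goto", ("dest",)),
        Step("release", ("obj",)),
    ],
)

plan = bring.instantiate(obj="book", dest="kitchen")
```

In this sketch the same `bring` definition yields a different ground plan for each valid binding, and a value outside the declared category is rejected before anything is executed.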
MaestROB: A Robotics Framework for Integrated Orchestration of Low-Level Control and High-Level Reasoning
This paper describes a framework called MaestROB. It is designed to let
robots perform complex tasks with high precision from simple high-level
instructions given through natural language or demonstration. To realize this, it
handles a hierarchical structure by using knowledge stored in the form of
an ontology and rules to bridge the different levels of instructions.
Accordingly, the framework has multiple layers of processing components:
perception and actuation control at the low level, a symbolic planner and Watson
APIs for cognitive capabilities and semantic understanding, and orchestration
of these components by a new open source robot middleware called Project Intu
at its core. We show how this framework can be used in a complex scenario where
multiple actors (human, a communication robot, and an industrial robot)
collaborate to perform a common industrial task. A human teaches an assembly task
to Pepper (a humanoid robot from SoftBank Robotics) using natural language
conversation and demonstration. Our framework helps Pepper perceive the human
demonstration and generate a sequence of actions for UR5 (collaborative robot
arm from Universal Robots), which ultimately performs the assembly (e.g.
insertion) task.
Comment: IEEE International Conference on Robotics and Automation (ICRA) 2018.
Video: https://www.youtube.com/watch?v=19JsdZi0TW
Language-based sensing descriptors for robot object grounding
In this work, we consider an autonomous robot that is required
to understand commands given by a human through natural language.
Specifically, we assume that this robot is provided with an internal
representation of the environment. However, such a representation is unknown
to the user. In this context, we address the problem of allowing a
human to understand the robot's internal representation through dialog.
To this end, we introduce the concept of sensing descriptors. Such representations
are used by the robot to recognize unknown object properties
in the given commands and warn the user about them. Additionally, we
show how these properties can be learned over time by leveraging past
interactions in order to enhance the grounding capabilities of the robot.
VirtualHome: Simulating Household Activities via Programs
In this paper, we are interested in modeling complex activities that occur in
a typical household. We propose to use programs, i.e., sequences of atomic
actions and interactions, as a high level representation of complex tasks.
Programs are interesting because they provide a non-ambiguous representation of
a task and allow agents to execute them. However, there is currently no
database providing this type of information. Towards this goal, we first
crowd-source programs for a variety of activities that happen in people's
homes, via a game-like interface used for teaching kids how to code. Using the
collected dataset, we show how we can learn to extract programs directly from
natural language descriptions or from videos. We then implement the most common
atomic (inter)actions in the Unity3D game engine, and use our programs to
"drive" an artificial agent to execute tasks in a simulated household
environment. Our VirtualHome simulator allows us to create a large activity
video dataset with rich ground-truth, enabling training and testing of video
understanding models. We further showcase examples of our agent performing
tasks in our VirtualHome based on language descriptions.
Comment: CVPR 2018 (Oral
- …
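
To make the idea of a program as a sequence of atomic actions and interactions more concrete, here is a minimal Python sketch of a toy interpreter. It is not the VirtualHome simulator or its program format: the `(action, target)` tuple encoding, the three action names, and the state dictionary are assumptions made only for this example.

```python
def execute(program, state):
    """Run a list of (action, target) steps against a toy household state.

    The state layout and the action set are hypothetical, not VirtualHome's.
    """
    # Work on a copy so the caller's state is left untouched.
    state = {
        "agent_at": state["agent_at"],
        "holding": state["holding"],
        "on": set(state["on"]),
    }
    for action, target in program:
        if action == "walk":
            state["agent_at"] = target
        elif action == "grab":
            # Interactions require the agent to be next to the target.
            if state["agent_at"] != target:
                raise RuntimeError(f"cannot grab {target}: agent is at {state['agent_at']}")
            state["holding"] = target
        elif action == "switch_on":
            if state["agent_at"] != target:
                raise RuntimeError(f"cannot switch on {target}: agent is at {state['agent_at']}")
            state["on"].add(target)
        else:
            raise ValueError(f"unknown action: {action}")
    return state


# A hypothetical "watch TV" activity written as an unambiguous step sequence.
watch_tv = [
    ("walk", "remote"),
    ("grab", "remote"),
    ("walk", "television"),
    ("switch_on", "television"),
]

final = execute(watch_tv, {"agent_at": "door", "holding": None, "on": set()})
```

Because every step is explicit, a program that skips a required step (for example, grabbing an object without walking to it first) fails at a well-defined point instead of being silently misinterpreted.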