16,260 research outputs found
MULTI-MODAL TASK INSTRUCTIONS TO ROBOTS BY NAIVE USERS
This thesis presents a theoretical framework for the design of user-programmable
robots. The objective of the work is to investigate multi-modal unconstrained natural
instructions given to robots in order to design a learning robot. A corpus-centred
approach is used to design an agent that can reason, learn and interact with a human in a
natural unconstrained way. The corpus-centred design approach is formalised and
developed in detail. It requires the developer to record a human during interaction and
analyse the recordings to find instruction primitives. These are then implemented into a
robot. The focus of this work has been on how to combine speech and gesture using
rules extracted from the analysis of a corpus. A multi-modal integration algorithm is
presented that can use timing and semantics to group, match and unify gesture and
language. The algorithm always achieves correct pairings on a corpus and initiates
questions to the user in ambiguous cases or when information is missing. The domain of card
games has been investigated, because of its variety of games which are rich in rules and
contain sequences. A further focus of the work is on the translation of rule-based
instructions. Most multi-modal interfaces to date have only considered sequential
instructions. The combination of frame-based reasoning, a knowledge base organised as
an ontology and a problem solver engine is used to store these rules. The understanding
of rule instructions, which contain conditional and imaginary situations, requires an agent
with complex reasoning capabilities. A test system of the agent implementation is also
described. Tests to confirm the implementation by playing back the corpus are
presented. Furthermore, deployment test results with the implemented agent and human
subjects are presented and discussed. The tests showed that the rate of errors
due to sentences not being defined in the grammar does not decrease at an
acceptable rate when new grammar is introduced. This was particularly the case for
complex verbal rule instructions, which can be expressed in a large variety of ways.
Knowledge Representation for Robots through Human-Robot Interaction
The representation of the knowledge needed by a robot to perform complex
tasks is restricted by the limitations of perception. One possible way of
overcoming this situation and designing "knowledgeable" robots is to rely on
the interaction with the user. We propose a multi-modal interaction framework
that allows knowledge about the environment where the robot operates to be
acquired effectively. In particular, in this paper we present a rich representation
framework that can be automatically built from the metric map annotated with
the indications provided by the user. Such a representation then allows the
robot to ground complex referential expressions for motion commands and to
devise topological navigation plans to reach the target locations.
Comment: Knowledge Representation and Reasoning in Robotics Workshop at ICLP
201
Robot Learning from Human Demonstration: Interpretation, Adaptation, and Interaction
Robot Learning from Demonstration (LfD) is a research area that focuses on how robots can learn new skills by observing how people perform various activities. As humans, we have a remarkable ability to imitate other humans' behaviors and adapt to new situations. Endowing robots with these critical capabilities is a significant but very challenging problem considering the complexity and variation of human activities in highly dynamic environments.
This research focuses on how robots can learn new skills by interpreting human activities, adapting the learned skills to new situations, and naturally interacting with humans. This dissertation begins with a discussion of challenges in each of these three problems. A new unified representation approach is introduced to enable robots to simultaneously interpret the high-level semantic meanings and generalize the low-level trajectories of a broad range of human activities. An adaptive framework based on feature space decomposition is then presented for robots to not only reproduce skills, but also autonomously and efficiently adjust the learned skills to new environments that are significantly different from demonstrations. To achieve natural Human Robot Interaction (HRI), this dissertation presents a Recurrent Neural Network based deep perceptual control approach, which is capable of integrating multi-modal perception sequences with actions for robots to interact with humans in long-term tasks.
Overall, by combining the above approaches, an autonomous system is created for robots to acquire important skills that can be applied to human-centered applications. Finally, this dissertation concludes with a discussion of future directions that could accelerate the upcoming technological revolution of robot learning from human demonstration.
Spatial context-aware person-following for a domestic robot
Domestic robots are a focus of research as
service providers in households and even as robotic
companions that share the living space with humans. A major
capability of mobile domestic robots is joint exploration
of space. One challenge in this task is how to
let the robot move in space in reasonable, socially acceptable
ways so that it supports interaction and communication
as a part of the joint exploration. As a step towards this
challenge, we have developed a context-aware following behavior
that considers these social aspects and applied it together
with a multi-modal person-tracking method to switch between
three basic following approaches, namely direction-following,
path-following and parallel-following. These are derived from
the observation of human-human following schemes and are
activated depending on the current spatial context (e.g. free
space) and the relative position of the interacting human.
A combination of the elementary behaviors is performed in
real time with our mobile robot in different environments.
First experimental results are provided to demonstrate the
practicability of the proposed approach.
Who am I talking with? A face memory for social robots
In order to provide personalized services and to
develop human-like interaction capabilities, robots need to recognize
their human partner. Face recognition has been studied
exhaustively in the past decade in the context of security systems
and with significant progress on huge datasets. However, these
capabilities are not in focus when it comes to social interaction
situations. Humans are able to remember people seen for a
short moment in time and apply this knowledge directly in
their engagement in conversation. In order to equip a robot with
capabilities to recall human interlocutors and to provide user-aware
services, we adopt human-human interaction schemes to
propose a face memory on the basis of active appearance models
integrated with the active memory architecture. This paper
presents the concept of the interactive face memory, the applied
recognition algorithms, and their embedding into the robot's
system architecture. Performance measures are discussed for
general face databases as well as scenario-specific datasets.
Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply them
with cognitive abilities in order to allow ordinary users to program them
easily and intuitively. One way of such programming is teaching work tasks by
interactive demonstration. To make this effective and convenient for the user,
the machine must be capable of establishing a common focus of attention and be
able to use and integrate spoken instructions, visual perceptions, and
non-verbal clues like gestural commands. We report progress in building a
hybrid architecture that combines statistical methods, neural networks, and
finite state machines into an integrated system for instructing grasping tasks
by man-machine interaction. The system combines the GRAVIS-robot for visual
attention and gestural instruction with an intelligent interface for speech
recognition and linguistic interpretation, and a modality fusion module to
allow multi-modal task-oriented man-machine communication with respect to
dextrous robot manipulation of objects.
Comment: 7 pages, 8 figures
Understanding of Object Manipulation Actions Using Human Multi-Modal Sensory Data
Object manipulation actions represent an important share of the Activities of
Daily Living (ADLs). In this work, we study how to enable service robots to use
human multi-modal data to understand object manipulation actions, and how they
can recognize such actions when humans perform them during human-robot
collaboration tasks. The multi-modal data in this study consists of videos,
hand motion data, applied forces as represented by the pressure patterns on the
hand, and measurements of the bending of the fingers, collected as human
subjects performed manipulation actions. We investigate two different
approaches. In the first one, we show that the multi-modal signal (motion, finger
bending and hand pressure) generated by the action can be decomposed into a set
of primitives that can be seen as its building blocks. These primitives are
used to define 24 multi-modal primitive features. The primitive features can in
turn be used as an abstract representation of the multi-modal signal and
employed for action recognition. In the second approach, the visual features
are extracted from the data using a pre-trained image classification deep
convolutional neural network. The visual features are subsequently used to
train the classifier. We also investigate whether adding data from other
modalities produces a statistically significant improvement in the classifier
performance. We show that both approaches produce comparable performance.
This implies that image-based methods can successfully recognize human actions
during human-robot collaboration. On the other hand, in order to provide
training data for the robot so it can learn how to perform object manipulation
actions, multi-modal data provides a better alternative.
A real-time human-robot interaction system based on gestures for assistive scenarios
Natural and intuitive human interaction with robotic systems is key to developing robots that assist people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding, which are recognized using a Dynamic Time Warping approach based on gesture-specific features computed from depth maps. A static gesture consisting of pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear to the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would allow the robot to pick up an object on behalf of a user who may have difficulty doing so unaided. The overall system (composed of a NAO and a Wifibot robot, a Kinect™ v2 sensor and two laptops) is first evaluated in a structured lab setup. Then, a broad set of user tests is completed, which allows performance to be assessed in terms of recognition rates, ease of use and response times.
Postprint (author's final draft)
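As a rough illustration of the Dynamic Time Warping matching mentioned in the abstract above, the following Python sketch computes the textbook DTW distance between two one-dimensional sequences and assigns a sample to its nearest template. The function names, the 1-D features and the template values are all illustrative assumptions; the paper itself uses gesture-specific features computed from depth maps, and this is not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (O(len(a) * len(b)))."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed: absolute difference)
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s predecessor step
                cost[i][j - 1],      # b[j-1] repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical 1-D templates standing in for real gesture features.
templates = {
    "wave": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0],
    "nod": [0.0, -0.5, 0.0, -0.5, 0.0],
}
sample = [0.0, 0.9, 1.0, 0.1, -0.9, 0.0, 1.1, 0.0]
print(classify(sample, templates))
```

Because DTW lets one sequence stretch or compress against the other, a gesture performed faster or slower than the template can still match it closely, which is the usual motivation for using it on dynamic gestures.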