Robust Spoken Language Understanding for House Service Robots
Service robotics has been growing significantly in recent years, leading to several research results and to a number of consumer products. One of the essential features of these robotic platforms is the ability to interact with users through natural language. Spoken commands can be processed by a Spoken Language Understanding chain in order to obtain the desired behavior of the robot. The entry point of such a process is an Automatic Speech Recognition (ASR) module that provides a list of transcriptions for a given spoken utterance. Although several well-performing ASR engines are available off-the-shelf, they operate in a general-purpose setting. Hence, they may not be well suited to the recognition of utterances given to robots in specific domains. In this work, we propose a practical yet robust strategy to re-rank lists of transcriptions. This approach improves the quality of ASR systems in situated scenarios, i.e., the transcription of robotic commands. The proposed method relies upon evidence derived from a semantic grammar with semantic actions, designed to model typical commands expressed in scenarios that are specific to human service robotics. The outcomes obtained through an experimental evaluation show that the approach is able to effectively outperform the ASR baseline, obtained by selecting the first transcription suggested by the AS
A Review of Verbal and Non-Verbal Human-Robot Interactive Communication
In this paper, an overview of human-robot interactive communication is
presented, covering verbal as well as non-verbal aspects of human-robot
interaction. Following a historical introduction, and motivation towards fluid
human-robot communication, ten desiderata are proposed, which provide an
organizational axis both of recent as well as of future research on human-robot
communication. Then, the ten desiderata are examined in detail, culminating in
a unifying discussion, and a forward-looking conclusion
Exploiting Deep Semantics and Compositionality of Natural Language for Human-Robot-Interaction
We develop a natural language interface for human robot interaction that
implements reasoning about deep semantics in natural language. To realize the
required deep analysis, we employ methods from cognitive linguistics, namely
the modular and compositional framework of Embodied Construction Grammar (ECG)
[Feldman, 2009]. Using ECG, robots are able to solve fine-grained reference
resolution problems and other issues related to deep semantics and
compositionality of natural language. This also includes verbal interaction
with humans to clarify commands and queries that are too ambiguous to be
executed safely. We implement our NLU framework as a ROS package and present
proof-of-concept scenarios with different robots, as well as a survey on the
state of the art
Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply them
with cognitive abilities in order to allow ordinary users to program them
easily and intuitively. One way of such programming is teaching work tasks by
interactive demonstration. To make this effective and convenient for the user,
the machine must be capable of establishing a common focus of attention and be
able to use and integrate spoken instructions, visual perceptions, and
non-verbal clues like gestural commands. We report progress in building a
hybrid architecture that combines statistical methods, neural networks, and
finite state machines into an integrated system for instructing grasping tasks
by man-machine interaction. The system combines the GRAVIS-robot for visual
attention and gestural instruction with an intelligent interface for speech
recognition and linguistic interpretation, and a modality fusion module to
allow multi-modal task-oriented man-machine communication with respect to
dextrous robot manipulation of objects. Comment: 7 pages, 8 figure
A discriminative approach to grounded spoken language understanding in interactive robotics
Spoken Language Understanding in Interactive Robotics provides computational models of human-machine communication based on vocal input. However, robots operate in specific environments, and the correct interpretation of spoken sentences depends on the physical, cognitive and linguistic aspects triggered by the operational environment. Grounded language processing should exploit both the physical constraints of the context and the knowledge assumptions of the robot. These include the subjective perception of the environment that explicitly affects linguistic reasoning. In this work, a standard linguistic pipeline for semantic parsing is extended toward a form of perceptually informed natural language processing that combines discriminative learning and distributional semantics. Empirical results show a relative error reduction of up to 40%
Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
In this paper, we propose the Interactive Text2Pickup (IT2P) network for
human-robot collaboration which enables an effective interaction with a human
user despite the ambiguity in the user's commands. We focus on the task where a
robot is expected to pick up an object instructed by a human, and to interact
with the human when the given instruction is vague. The proposed network
understands the command from the human user and estimates the position of the
desired object first. To handle the inherent ambiguity in human language
commands, a suitable question which can resolve the ambiguity is generated. The
user's answer to the question is combined with the initial command and given
back to the network, resulting in more accurate estimation. The experimental
results show that given unambiguous commands, the proposed method can estimate
the position of the requested object with an accuracy of 98.49% based on our
test dataset. Given ambiguous language commands, we show that the accuracy of
the pick up task increases by 1.94 times after incorporating the information
obtained from the interaction. Comment: 8 pages, 9 figure