A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction
Picking up objects requested by a human user is a common task in human-robot
interaction. When multiple objects match the user's verbal description, the
robot needs to clarify which object the user is referring to before executing
the action. Previous research has focused on perceiving the user's multimodal
behaviour to complement verbal commands, or on minimising the number of
follow-up questions to reduce task time. In this paper, we propose a system for reference
disambiguation based on visualisation and compare three methods to disambiguate
natural language instructions. In a controlled experiment with a YuMi robot, we
investigated real-time augmentations of the workspace in three conditions --
mixed reality, augmented reality, and a monitor as the baseline -- using
objective measures such as time and accuracy, and subjective measures like
engagement, immersion, and display interference. Significant differences were
found in accuracy and engagement between the conditions, but no differences
were found in task time. Despite the higher error rates in the mixed reality
condition, participants found that modality more engaging than the other two,
but overall preferred the augmented reality condition over the monitor and
mixed reality conditions.
Multimodal One-Shot Learning of Speech and Images
Imagine a robot is shown new concepts visually together with spoken tags,
e.g. "milk", "eggs", "butter". After seeing one paired audio-visual example per
class, it is shown a new set of unseen instances of these objects, and asked to
pick the "milk". Without receiving any hard labels, could it learn to match the
new continuous speech input to the correct visual instance? Although unimodal
one-shot learning has been studied, where one labelled example in a single
modality is given per class, this example motivates multimodal one-shot
learning. Our main contribution is to formally define this task, and to propose
several baseline and advanced models. We use a dataset of paired spoken and
visual digits to specifically investigate recent advances in Siamese
convolutional neural networks. Our best Siamese model achieves twice the
accuracy of a nearest neighbour model using pixel-distance over images and
dynamic time warping over speech in 11-way cross-modal matching.
Comment: 5 pages, 1 figure, 3 tables; accepted to ICASSP 201
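The nearest-neighbour baseline (pixel distance over images, dynamic time warping over speech) could be sketched along the following lines; the function names, raw-array features and the class setup are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping cost between two feature sequences of shape (T, D)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def cross_modal_match(query_speech, support_speech, support_images, test_images):
    """Pick the test image matching a spoken query:
    (1) DTW-match the query against the one support utterance per class,
    (2) pixel-match that class's support image against the test instances."""
    cls = min(range(len(support_speech)),
              key=lambda k: dtw_distance(query_speech, support_speech[k]))
    ref = support_images[cls]
    return min(range(len(test_images)),
               key=lambda k: np.linalg.norm(test_images[k] - ref))
```

The query is resolved to a class in the speech modality and then to an instance in the visual modality, which is the cross-modal matching the Siamese models are compared against.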
Composable Deep Reinforcement Learning for Robotic Manipulation
Model-free deep reinforcement learning has been shown to exhibit good
performance in domains ranging from video games to simulated robotic
manipulation and locomotion. However, model-free methods are known to perform
poorly when the interaction time with the environment is limited, as is the
case for most real-world robotic tasks. In this paper, we study how maximum
entropy policies trained using soft Q-learning can be applied to real-world
robotic manipulation. The application of this method to real-world manipulation
is facilitated by two important features of soft Q-learning. First, soft
Q-learning can learn multimodal exploration strategies by learning policies
represented by expressive energy-based models. Second, we show that policies
learned with soft Q-learning can be composed to create new policies, and that
the optimality of the resulting policy can be bounded in terms of the
divergence between the composed policies. This compositionality provides an
especially valuable tool for real-world manipulation, where constructing new
policies by composing existing skills can provide a large gain in efficiency
over training from scratch. Our experimental evaluation demonstrates that soft
Q-learning is substantially more sample efficient than prior model-free deep
reinforcement learning methods, and that compositionality can be performed for
both simulated and real-world tasks.
Comment: Videos: https://sites.google.com/view/composing-real-world-policies
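The composition idea can be sketched for a discrete action space: averaging the Q-values of two learned skills and taking the corresponding maximum-entropy (softmax) policy. The function names and Q-values below are illustrative assumptions, and the discrete softmax stands in for the energy-based policies used in the paper:

```python
import numpy as np

def softmax_policy(q, temperature=1.0):
    """Maximum-entropy policy over discrete actions: pi(a) is proportional
    to exp(Q(a) / T)."""
    z = q / temperature
    z = z - z.max()          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def compose(q1, q2):
    """Compose two soft-Q skills by averaging their Q-values; the softmax of
    the averaged Q is the composed maximum-entropy policy."""
    return softmax_policy(0.5 * (q1 + q2))
```

With q1 favouring one action and q2 another, but both tolerating a shared middle action, the composed policy concentrates on the action acceptable to both skills, which is the intuition behind composing existing skills instead of training from scratch.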
Collaborating with a Mobile Robot: An Augmented Reality Multimodal Interface
Invited paper.
We have created an infrastructure that allows a human to collaborate in a natural manner with a robotic system. In this paper we describe our system and its implementation with a mobile robot. In our
prototype the human communicates with the mobile robot using natural speech and gestures, for example, by selecting a point in 3D space and saying "go here" or "go behind that". The robot responds using
speech so the human is able to understand its intentions and beliefs. Augmented Reality (AR) technology is used to facilitate natural use of gestures and provide a common 3D spatial reference for both the robot and human, thus providing a means for grounding of communication and maintaining spatial awareness.
This paper first discusses related work, then gives a brief overview of AR and its capabilities. The architectural design we have developed is outlined and then a case study is discussed.
Speech & Multimodal Resources: the Herme Database of Spontaneous Multimodal Human-Robot Dialogues
This paper presents methodologies and tools for language resource (LR) construction. It describes a database of interactive speech collected over a three-month period at the Science Gallery in Dublin, where visitors could take part in a conversation with a robot. The system collected samples of informal, chatty dialogue, normally difficult to capture under laboratory conditions for human-human dialogue, and particularly so for human-machine interaction. The conversations were based on a script followed by the robot consisting largely of social chat with some task-based elements. The interactions were audio-visually recorded using several cameras together with microphones. As part of the conversation the participants were asked to sign a consent form giving permission to use their data for human-machine interaction research. The multimodal corpus will be made available to interested researchers, and the technology developed during the three-month exhibition is being extended for use in education and assisted-living applications.
Including universal design in a summer camp workshop on robotics
In this paper we will describe a summer camp short course intended for high-school students with excellent qualifications. The course is aimed at students who are considering studying for a technical degree and, for the first time, includes a section on universal design.
The Department of Mathematics and Computer Science at Universitat de Barcelona will host a workshop on robotics next summer within the context of Campus Científicos de Verano by Fundación Española para la Ciencia y la Tecnología. High-school students will be selected from around Spain based on their qualifications and motivation to attend the workshop.
The first activity in the summer camp will be building Lego Mindstorms robots. These robots contain several sensors and actuators that can be programmed to do different tasks. One of the robots will be programmed to track a line, and another two will be programmed to do a Sumo fight on their own. Students will learn how to use sensors and actuators and how to program control algorithms.
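As an illustration of the kind of control logic the line-tracking activity involves, a minimal bang-bang line follower might look like this; the sensor threshold and motor powers are made-up values, not part of the course materials:

```python
def line_follow_step(light, threshold=50, base=30, turn=15):
    """One control step of a bang-bang line follower: a reading darker than
    the threshold means the sensor is over the line, so the robot steers one
    way; a brighter reading means it has drifted off, so it steers back.
    Returns (left_motor_power, right_motor_power)."""
    if light < threshold:                 # dark reading: over the line
        return base - turn, base + turn   # steer towards the line's edge
    return base + turn, base - turn       # bright reading: steer back
```

Calling this in a loop with fresh sensor readings makes the robot zigzag along the edge of the line, which is the standard first exercise with a single light sensor.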
For the second activity, the students will develop a mobile app with the MIT App Inventor 2 software [1] in order to control the robots. In this activity students will learn how to program apps in a simple way, rounding out their understanding of programming.
Taking into account European Higher Education Area requirements for accessibility in technical careers, this workshop will introduce an innovation: the third activity will consist of adapting the app and robots for multimodal access (including redundant sound and sight warnings) and readjusting the app's buttons for users with motor and visual disabilities (e.g. making the buttons bigger and giving them non-repeating behaviour).
Students attending the summer camp will be introduced to the needs and skills of different user profiles of people with disabilities. After this theoretical introduction, they will experience motor and visual disabilities through simulations inspired by the Inclusive Design Toolkit resource [2]. Finally, they will modify the app based on the IEEE RWEP Accessible Apps project by Ayanna Howard [3] so as to maximise the accessibility possibilities of App Inventor. Complementary resources will be made available to those students showing interest in this area, such as the RWEP prosthetic hands projects, other toolkits and a bibliography.
This will serve as a first experience for the students, and there is no provision for including technical aids such as GRID2 or similar [4] due to budget restrictions. No students with disabilities are registered for this year's edition, so the course does not seek accessibility for participants as authors. We will consider working on accessibility for participants in future editions of this workshop, building on past experiences reaching this goal [5] [6] [7].
The main focus of the workshop is to encourage the creative learning of a robotics summer camp [8], [9] with the inclusion of universal design as an essential requirement in the design and development of computer applications or systems. With this initiative we want to increase awareness of accessibility requirements among future technical students.
Young children in an education context: apps, cultural agency and expanding communicative repertoires
This chapter examines video-recorded interactions of children's engagement with touchscreens in an early education setting. The extracts are taken from an ethnographic research study that explored children's expanding repertoires for meaning-making as these emerged throughout their first year of school. The episodes presented in this chapter draw on observations of children's spontaneous interactions with and around two iPad apps. The findings reveal how children's engagement with iPads has the potential to simultaneously confer children's cultural agency and further expand children's repertoires for meaning-making. The discussion provides nuanced interpretations of how touchscreens might contribute positively to young children's early learning and play experiences.