
    A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction

    Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands, or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions -- mixed reality, augmented reality, and a monitor as the baseline -- using objective measures such as time and accuracy, and subjective measures like engagement, immersion, and display interference. Significant differences were found in accuracy and engagement between the conditions, but no differences were found in task time. Despite the higher error rates in the mixed reality condition, participants found that modality more engaging than the other two, but overall preferred the augmented reality condition over the monitor and mixed reality conditions.
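
    A minimal sketch of the candidate-filtering step such a system needs, under assumed names (WorkspaceObject and disambiguate are illustrative, not the paper's code): objects detected in the workspace are filtered against the parsed verbal description, and if more than one candidate survives, the candidates are highlighted visually instead of triggering a follow-up question.

    ```python
    # Hypothetical sketch of reference disambiguation by visualisation.
    # All names are illustrative; the paper's actual system is not public.
    from dataclasses import dataclass

    @dataclass
    class WorkspaceObject:
        name: str        # e.g. "cup"
        colour: str      # e.g. "red"
        position: tuple  # (x, y, z) in metres

    def disambiguate(objects, noun, colour=None):
        """Return all detected objects matching the verbal description."""
        return [o for o in objects if o.name == noun
                and (colour is None or o.colour == colour)]

    scene = [WorkspaceObject("cup", "red", (0.2, 0.1, 0.0)),
             WorkspaceObject("cup", "blue", (0.4, 0.1, 0.0)),
             WorkspaceObject("box", "red", (0.6, 0.3, 0.0))]

    matches = disambiguate(scene, "cup")
    if len(matches) == 1:
        print("pick up", matches[0])
    else:
        # Ambiguous: highlight the candidates in AR/MR or on the monitor
        # rather than asking a follow-up question.
        print("highlight", [(m.colour, m.position) for m in matches])
    ```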

    Multimodal One-Shot Learning of Speech and Images

    Imagine a robot is shown new concepts visually together with spoken tags, e.g. "milk", "eggs", "butter". After seeing one paired audio-visual example per class, it is shown a new set of unseen instances of these objects and asked to pick the "milk". Without receiving any hard labels, could it learn to match the new continuous speech input to the correct visual instance? Although unimodal one-shot learning has been studied, where one labelled example in a single modality is given per class, this example motivates multimodal one-shot learning. Our main contribution is to formally define this task and to propose several baseline and advanced models. We use a dataset of paired spoken and visual digits to specifically investigate recent advances in Siamese convolutional neural networks. Our best Siamese model achieves twice the accuracy of a nearest neighbour model using pixel-distance over images and dynamic time warping over speech in 11-way cross-modal matching. (Comment: 5 pages, 1 figure, 3 tables; accepted to ICASSP 2019.)
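
    As a hedged illustration of the nearest-neighbour baseline mentioned above (the way the two distances are chained is our reading, and dtw, match, and the random stand-in data are not the authors' code): a spoken query is matched to the support set by dynamic time warping over acoustic feature frames, and the paired support image then selects the closest test image by pixel distance.

    ```python
    # Illustrative cross-modal nearest-neighbour matching: DTW over speech
    # features, pixel distance over images. Names and shapes are assumptions.
    import numpy as np

    def dtw(a, b):
        """DTW alignment cost between two sequences of feature frames."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def match(query_speech, support_speech, support_images, test_images):
        """Return the index of the test image matching the spoken query."""
        # 1. Nearest support speech example under DTW.
        k = int(np.argmin([dtw(query_speech, s) for s in support_speech]))
        # 2. Nearest test image to that class's support image (pixel distance).
        ref = support_images[k].ravel()
        return int(np.argmin([np.linalg.norm(t.ravel() - ref)
                              for t in test_images]))

    # Random stand-ins for MFCC sequences and images, just to run the code.
    rng = np.random.default_rng(0)
    speech = [rng.normal(size=(20, 13)) for _ in range(3)]  # 3 support classes
    images = [rng.normal(size=(28, 28)) for _ in range(3)]
    tests = [rng.normal(size=(28, 28)) for _ in range(5)]
    print(match(speech[1], speech, images, tests))
    ```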

    Composable Deep Reinforcement Learning for Robotic Manipulation

    Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using soft Q-learning can be applied to real-world robotic manipulation. The application of this method to real-world manipulation is facilitated by two important features of soft Q-learning. First, soft Q-learning can learn multimodal exploration strategies by learning policies represented by expressive energy-based models. Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies. This compositionality provides an especially valuable tool for real-world manipulation, where constructing new policies by composing existing skills can provide a large gain in efficiency over training from scratch. Our experimental evaluation demonstrates that soft Q-learning is substantially more sample efficient than prior model-free deep reinforcement learning methods, and that composition is effective for both simulated and real-world tasks. (Comment: videos at https://sites.google.com/view/composing-real-world-policies)
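
    A minimal sketch of the composition idea, assuming a discrete action set and known Q-values (the paper works with continuous actions and energy-based samplers, so this only illustrates the principle): a maximum entropy policy is a Boltzmann distribution over soft Q-values, and two skills are composed by averaging their Q-functions, with the resulting policy near-optimal for the averaged reward up to an error that grows with the divergence between the composed policies.

    ```python
    # Toy illustration of maximum-entropy policy composition with discrete
    # actions. Q-values are made up; this is not the authors' implementation.
    import numpy as np

    def soft_policy(q_values, alpha=1.0):
        """Boltzmann policy pi(a) proportional to exp(Q(a)/alpha)."""
        z = q_values / alpha
        z -= z.max()              # numerical stability
        p = np.exp(z)
        return p / p.sum()

    q_reach = np.array([1.0, 0.2, -0.5])  # skill 1: reach the target
    q_avoid = np.array([-1.0, 0.8, 0.6])  # skill 2: avoid the obstacle

    # Composed skill: average the soft Q-functions of the two skills.
    q_composed = 0.5 * (q_reach + q_avoid)
    print(soft_policy(q_composed))
    ```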

    Collaborating with a Mobile Robot: An Augmented Reality Multimodal Interface

    Invited paper. We have created an infrastructure that allows a human to collaborate in a natural manner with a robotic system. In this paper we describe our system and its implementation with a mobile robot. In our prototype the human communicates with the mobile robot using natural speech and gestures, for example, by selecting a point in 3D space and saying “go here” or “go behind that”. The robot responds using speech so that the human is able to understand its intentions and beliefs. Augmented Reality (AR) technology is used to facilitate natural use of gestures and to provide a common 3D spatial reference for both the robot and the human, thus providing a means for grounding communication and maintaining spatial awareness. This paper first discusses related work, then gives a brief overview of AR and its capabilities, outlines the architectural design we have developed, and concludes with a case study.
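
    A speculative sketch of the multimodal fusion step such an interface performs (all names, offsets, and frames here are hypothetical, not the described system): the deictic gesture supplies a 3D point in the shared AR reference frame, and the utterance supplies the action to perform at that point.

    ```python
    # Hypothetical fusion of a speech command with a gestured 3D point.
    from dataclasses import dataclass

    @dataclass
    class Goal:
        action: str
        point: tuple  # (x, y, z) in the shared AR reference frame

    def fuse(utterance, pointed_at):
        """Resolve 'go here' / 'go behind that' against the gestured point."""
        if "go here" in utterance:
            return Goal("goto", pointed_at)
        if "go behind" in utterance:
            x, y, z = pointed_at
            # Assumed: 0.5 m behind the object along an assumed axis.
            return Goal("goto", (x, y + 0.5, z))
        raise ValueError("unrecognised command: " + utterance)

    print(fuse("go here", (1.0, 2.0, 0.0)))
    ```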

    Speech & Multimodal Resources: the Herme Database of Spontaneous Multimodal Human-Robot Dialogues

    This paper presents methodologies and tools for language resource (LR) construction. It describes a database of interactive speech collected over a three-month period at the Science Gallery in Dublin, where visitors could take part in a conversation with a robot. The system collected samples of informal, chatty dialogue – normally difficult to capture under laboratory conditions for human-human dialogue, and particularly so for human-machine interaction. The conversations were based on a script followed by the robot, consisting largely of social chat with some task-based elements. The interactions were audio-visually recorded using several cameras together with microphones. As part of the conversation, participants were asked to sign a consent form giving permission to use their data for human-machine interaction research. The multimodal corpus will be made available to interested researchers, and the technology developed during the three-month exhibition is being extended for use in education and assisted-living applications.

    Including universal design in a summer camp workshop on robotics

    In this paper we describe a summer camp short course intended for high-school students with excellent qualifications. The course is aimed at students who are considering a technical degree and, for the first time, includes a section on universal design. The Department of Mathematics and Computer Science at Universitat de Barcelona will host a workshop on robotics next summer within the context of Campus Científicos de Verano, organised by the Fundación Española para la Ciencia y la Tecnología. High-school students from around Spain will be selected to attend the workshop based on their qualifications and motivation. The first activity in the summer camp will be building Lego Mindstorms robots. These robots contain several sensors and actuators that can be programmed to perform different tasks. One of the robots will be programmed to track a line, and another two will be programmed to carry out a Sumo fight on their own. Students will learn how to use sensors and actuators and how to code programming algorithms; a sketch of the line-tracking task appears below. For the second activity, the students will develop a mobile app with the MIT App Inventor 2 software [1] in order to control the robots. In this activity students will learn how to program apps in a simple way, completing their understanding of programming. Taking into account European Higher Education Area requirements for accessibility in technical degrees, this workshop will introduce an innovation: the third activity will consist of adapting the app and robots for multimodal access (including redundant sound and sight warnings) and readjusting the app’s buttons for users with motor and visual disabilities (e.g. making the buttons bigger and giving them non-repeating behaviour). Students attending the summer camp will be introduced to the needs and skills of different user profiles of people with disabilities. After this theoretical introduction, they will experience motor and visual disabilities through simulations inspired by the Inclusive Design Toolkit resource [2]. Finally, they will modify the app based on the IEEE RWEP Accessible Apps by Ayanna Howard [3] so as to maximise the accessibility possibilities of App Inventor. Complementary resources, such as the RWEP prosthetic hands projects, other toolkits, and a bibliography, will be made available to students showing interest in this area. This will serve as a first experience for the students; there are no plans to include technical aids such as GRID2 or similar [4] due to budget restrictions. No students with disabilities are registered for this year’s edition, so the course does not seek accessibility for participants as authors. We will consider working on accessibility for participants in following editions of this workshop, building on past experiences in reaching this goal [5] [6] [7]. The main focus of the workshop is to encourage the creative learning of a robotics summer camp [8], [9] with the inclusion of universal design as an essential requirement in the design and development of computer applications or systems. With this initiative we want to increase awareness of accessibility requirements among future technical students.
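
    The line-tracking activity could, for instance, use a simple bang-bang controller like the following sketch, written against a hypothetical robot API (read_light and drive are placeholders; Lego Mindstorms robots would use their own blocks or firmware API).

    ```python
    # Illustrative bang-bang line follower; read_light and drive are
    # hypothetical callables standing in for the robot's real API.
    THRESHOLD = 50   # reflected-light reading separating line from floor
    BASE_SPEED = 30  # percent motor power

    def follow_line_step(read_light, drive):
        """One control step: steer toward the edge of the dark line."""
        if read_light() < THRESHOLD:   # on the dark line: curve left
            drive(left=BASE_SPEED - 10, right=BASE_SPEED + 10)
        else:                          # off the line: curve right
            drive(left=BASE_SPEED + 10, right=BASE_SPEED - 10)
    ```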

    Young children in an education context : apps, cultural agency and expanding communicative repertoires

    This chapter examines video-recorded interactions of children’s engagement with touchscreens in an early education setting. The extracts are taken from an ethnographic research study that explored children’s expanding repertoires for meaning-making as these emerged throughout their first year of school. The episodes presented in this chapter draw on observations of children’s spontaneous interactions with and around two iPad apps. The findings reveal how children’s engagement with iPads has the potential simultaneously to confer cultural agency on children and to further expand their repertoires for meaning-making. The discussion provides nuanced interpretations of how touchscreens might contribute positively to young children’s early learning and play experiences.