3,236 research outputs found
Recommended from our members
A corpus-based analysis of route instructions in human-robot interaction
This paper investigates how users employ spatial descriptions to navigate a speech-enabled robot. We created a simulated environment in which users gave route instructions in a dialogic real-time interaction with a robot, which was
operated by naĂŻve participants. The ability of robot monitoring was also manipulated in two experimental conditions. The results provide evidence that the content of the instructions and strategies of the users vary depending on the conditions and
demands of the interaction. As expected, the route instructions frequently were underspecified and arbitrary. The findings of
this study elucidate the complexity in interpreting spatial language in HRI. However, they also point to the need for
endowing mobile robots with richer dialogue resources to compensate for the uncertainties arising from language as well
as the environment
Overcoming barriers and increasing independence: service robots for elderly and disabled people
This paper discusses the potential for service robots to overcome barriers and increase independence of
elderly and disabled people. It includes a brief overview of the existing uses of service robots by disabled and elderly
people and advances in technology which will make new uses possible and provides suggestions for some of these new
applications. The paper also considers the design and other conditions to be met for user acceptance. It also discusses
the complementarity of assistive service robots and personal assistance and considers the types of applications and
users for which service robots are and are not suitable
Natural user interfaces for interdisciplinary design review using the Microsoft Kinect
As markets demand engineered products faster, waiting on the cyclical design processes of the past is not an option. Instead, industry is turning to concurrent design and interdisciplinary teams. When these teams collaborate, engineering CAD tools play a vital role in conceptualizing and validating designs. These tools require significant user investment to master, due to challenging interfaces and an overabundance of features. These challenges often prohibit team members from using these tools for exploring designs. This work presents a method allowing users to interact with a design using intuitive gestures and head tracking, all while keeping the model in a CAD format. Specifically, Siemens\u27 Teamcenter® Lifecycle Visualization Mockup (Mockup) was used to display design geometry while modifications were made through a set of gestures captured by a Microsoft KinectTM in real time. This proof of concept program allowed a user to rotate the scene, activate Mockup\u27s immersive menu, move the immersive wand, and manipulate the view based on head position.
This work also evaluates gesture usability and task completion time for this proof of concept system. A cognitive model evaluation method was used to evaluate the premise that gesture-based user interfaces are easier to use and learn with regards to time than a traditional mouse and keyboard interface. Using a cognitive model analysis tool allowed the rapid testing of interaction concepts without the significant overhead of user studies and full development cycles. The analysis demonstrated that using the KinectTM is a feasible interaction mode for CAD/CAE programs. In addition, the analysis pointed out limitations in the gesture interfaces ability to compete time wise with easily accessible customizable menu options
Real-time crowd control of existing interfaces
Crowdsourcing has been shown to be an effective approach for solving difficult problems, but current crowdsourcing systems suffer two main limitations: (i) tasks must be repackaged for proper display to crowd workers, which generally requires substantial one-off programming effort and support infrastructure, and (ii) crowd workers generally lack a tight feedback loop with their task. In this paper, we introduce Legion, a system that allows end users to easily capture existing GUIs and outsource them for collaborative, real-time control by the crowd. We present mediation strategies for integrating the input of multiple crowd workers in real-time, evaluate these mediation strategies across several applications, and further validate Legion by exploring the space of novel applications that it enables
From Verbs to Tasks: An Integrated Account of Learning Tasks from Situated Interactive Instruction.
Intelligent collaborative agents are becoming common in the human society. From virtual assistants such as Siri and Google Now to assistive robots, they contribute to human activities in a variety of ways. As they become more pervasive, the challenge of customizing them to a variety of environments and tasks becomes critical. It is infeasible for engineers to program them for each individual use. Our research aims at building interactive robots and agents that adapt to new environments autonomously by interacting with human users using natural modalities.
This dissertation studies the problem of learning novel tasks from human-agent dialog. We propose a novel approach for interactive task learning, situated interactive instruction (SII), and investigate approaches to three computational challenges that arise in designing SII agents: situated comprehension, mixed-initiative interaction, and interactive task learning. We propose a novel mixed-modality grounded representation for task verbs which encompasses their lexical, semantic, and
task-oriented aspects. This representation is useful in situated comprehension and can be learned through human-agent interactions. We introduce the Indexical Model of comprehension that can exploit
extra-linguistic contexts for resolving semantic ambiguities in situated comprehension of task commands. The Indexical model is integrated with a mixed-initiative interaction model that facilitates
a flexible task-oriented human-agent dialog. This dialog serves as the basis of interactive task learning. We propose an interactive variation of explanation-based learning that can acquire the proposed
representation. We demonstrate that our learning paradigm is efficient, can transfer knowledge between structurally similar tasks, integrates agent-driven exploration with instructional learning, and can acquire several tasks. The methods proposed in this thesis are integrated in Rosie - a generally instructable agent developed in the Soar cognitive architecture and embodied on a table-top robot.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111573/1/shiwali_1.pd
Recommended from our members
Continually improving grounded natural language understanding through human-robot dialog
As robots become ubiquitous in homes and workplaces such as hospitals and factories, they must be able to communicate with humans. Several kinds of knowledge are required to understand and respond to a human's natural language commands and questions. If a person requests an assistant robot to take me to Alice's office, the robot must know that Alice is a person who owns some unique office, and that take me means it should navigate there. Similarly, if a person requests bring me the heavy, green mug, the robot must have accurate mental models of the physical concepts heavy, green, and mug. To avoid forcing humans to use key phrases or words robots already know, this thesis focuses on helping robots understanding new language constructs through interactions with humans and with the world around them. To understand a command in natural language, a robot must first convert that command to an internal representation that it can reason with. Semantic parsing is a method for performing this conversion, and the target representation is often semantic forms represented as predicate logic with lambda calculus. Traditional semantic parsing relies on hand-crafted resources from a human expert: an ontology of concepts, a lexicon connecting language to those concepts, and training examples of language with abstract meanings. One thrust of this thesis is to perform semantic parsing with sparse initial data. We use the conversations between a robot and human users to induce pairs of natural language utterances with the target semantic forms a robot discovers through its questions, reducing the annotation effort of creating training examples for parsing. We use this data to build more dialog-capable robots in new domains with much less expert human effort (Thomason et al., 2015; Padmakumar et al., 2017). Meanings of many language concepts are bound to the physical world. Understanding object properties and categories, such as heavy, green, and mug requires interacting with and perceiving the physical world. Embodied robots can use manipulation capabilities, such as pushing, picking up, and dropping objects to gather sensory data about them. This data can be used to understand non-visual concepts like heavy and empty (e.g. get the empty carton of milk from the fridge), and assist with concepts that have both visual and non-visual expression (e.g. tall things look big and also exert force sooner than short things when pressed down on). A second thrust of this thesis focuses on strategies for learning these concepts using multi-modal sensory information. We use human-in-the-loop learning to get labels between concept words and actual objects in the environment (Thomason et al., 2016, 2017). We also explore ways to tease out polysemy and synonymy in concept words (Thomason and Mooney, 2017) such as light, which can refer to a weight or a color, the latter sense being synonymous with pale. Additionally, pushing, picking up, and dropping objects to gather sensory information is prohibitively time-consuming, so we investigate strategies for using linguistic information and human input to expedite exploration when learning a new concept (Thomason et al., 2018). Finally, we build an integrated agent with both parsing and perception capabilities that learns from conversations with users to improve both components over time. We demonstrate that parser learning from conversations (Thomason et al., 2015) can be combined with multi-modal perception (Thomason et al., 2016) using predicate-object labels gathered through opportunistic active learning (Thomason et al., 2017) during those conversations to improve performance for understanding natural language commands from humans. Human users also qualitatively rate this integrated learning agent as more usable after it has improved from conversation-based learning.Computer Science
- …