99 research outputs found

    The Assistive Multi-Armed Bandit

    Learning preferences implicit in the choices humans make is a well-studied problem in both economics and computer science. However, most work assumes that humans act (noisily) optimally with respect to their preferences. Such approaches can fail when people are themselves still learning about what they want. In this work, we introduce the assistive multi-armed bandit, in which a robot assists a human playing a bandit task to maximize cumulative reward. In this problem, the human does not know the reward function but can learn it through the rewards received from arm pulls; the robot observes only which arms the human pulls, not the reward associated with each pull. We offer necessary and sufficient conditions for successfully assisting the human in this framework. Surprisingly, better human performance in isolation does not necessarily lead to better performance when assisted by the robot: a human policy can do better by effectively communicating its observed rewards to the robot. We conduct proof-of-concept experiments that support these results. We see this work as contributing towards a theory behind algorithms for human-robot interaction.
    Comment: Accepted to HRI 201
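
    To make the setup concrete, the following is a minimal, illustrative simulation of the assistive bandit loop in Python: the human observes rewards and proposes arms, while the robot sees only the proposals and decides which arm is actually pulled. The epsilon-greedy human model and the robot's "defer to the most-proposed arm" rule are assumptions made for this sketch, not the policies or conditions analyzed in the paper.

        import numpy as np

        rng = np.random.default_rng(0)
        n_arms, horizon = 5, 500
        true_means = rng.uniform(0.0, 1.0, n_arms)      # unknown to both agents

        # Human side: empirical reward estimates built from the pulls it observes.
        human_counts = np.zeros(n_arms)
        human_sums = np.zeros(n_arms)
        # Robot side: it only ever sees which arm the human proposed.
        proposal_counts = np.zeros(n_arms)

        total_reward = 0.0
        for t in range(horizon):
            # Human proposes an arm (epsilon-greedy on its own estimates).
            if rng.random() < 0.1 or human_counts.sum() == 0:
                proposal = int(rng.integers(n_arms))
            else:
                est = np.divide(human_sums, human_counts,
                                out=np.zeros(n_arms), where=human_counts > 0)
                proposal = int(np.argmax(est))

            # Robot policy (naive): pull the arm the human has proposed most often.
            proposal_counts[proposal] += 1
            pulled = int(np.argmax(proposal_counts))

            # Environment returns a noisy reward; only the human observes it.
            reward = true_means[pulled] + rng.normal(0.0, 0.1)
            human_counts[pulled] += 1
            human_sums[pulled] += reward
            total_reward += reward

        print(f"cumulative reward: {total_reward:.1f}, best arm: {int(np.argmax(true_means))}")

    Note how the robot's picture of what the human wants depends entirely on how informative the human's proposals are, which is the communication effect the abstract highlights.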

    Adaptive modality selection algorithm in robot-assisted cognitive training

    Interaction of socially assistive robots with users is based on social cues conveyed through different interaction modalities, such as speech or gestures. However, using all modalities at all times may be inefficient, as it can overload the user with redundant information and increase the task completion time. Additionally, users may favor certain modalities over others as a result of their disability or personal preference. In this paper, we propose an Adaptive Modality Selection (AMS) algorithm that chooses modalities depending on the state of the user and the environment, as well as on user preferences. The variables that describe the environment and the user state are defined as resources, and we posit that a modality is successful if certain resources hold specific values during its use. Besides the resources, the proposed algorithm takes into account user preferences, which it learns while interacting with users. We tested our algorithm in simulations, and we implemented it on a robotic system that provides cognitive training, specifically sequential memory exercises. Experimental results show that it is possible to use only a subset of the available modalities without compromising the interaction. Moreover, we see a trend for users to perform better when interacting with a system that implements the AMS algorithm.
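
    A minimal Python sketch of the resource-gated selection idea described above: each modality is usable only when its required resources hold specific values, and among the feasible ones the algorithm picks the modality with the highest learned preference. The resource names, modalities, and preference-update rule are assumptions chosen for illustration, not the paper's AMS specification.

        from dataclasses import dataclass

        @dataclass
        class Modality:
            name: str
            required: dict            # resource -> value that must hold for this modality
            preference: float = 0.5   # learned user preference, updated online

        def feasible(modality, resources):
            """A modality is usable only if every required resource has the needed value."""
            return all(resources.get(k) == v for k, v in modality.required.items())

        def select_modality(modalities, resources):
            """Pick the feasible modality the user currently prefers most."""
            candidates = [m for m in modalities if feasible(m, resources)]
            return max(candidates, key=lambda m: m.preference) if candidates else None

        def update_preference(modality, success, lr=0.2):
            """Nudge the preference toward 1 on success and toward 0 on failure."""
            target = 1.0 if success else 0.0
            modality.preference += lr * (target - modality.preference)

        # Hypothetical resources: speech needs a quiet room, gestures need the user's gaze.
        modalities = [
            Modality("speech", {"noise_low": True}),
            Modality("gesture", {"user_gaze_on_robot": True}),
            Modality("screen_text", {}),
        ]
        state = {"noise_low": False, "user_gaze_on_robot": True}
        chosen = select_modality(modalities, state)     # -> gesture
        update_preference(chosen, success=True)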

    Learning robot policies using a high-level abstraction persona-behaviour simulator

    Collecting data for training learning agents in Human-Robot Interaction can be a hard task to accomplish. This is especially true when the target users are older adults with dementia, since data collection usually requires hours of interaction and places a considerable workload on the user. This paper addresses the problem of importing the Personas technique from HRI to create fictional patients’ profiles. We propose a Persona-Behaviour Simulator tool that provides, at a high level of abstraction, the user’s actions during an HRI task, and we apply it to cognitive training exercises for older adults with dementia. It consists of a Persona Definition that characterizes a patient along four dimensions and a Task Engine that provides information regarding the task complexity. We build a simulated environment in which the high-level user actions are provided by the simulator and the robot’s initial policy is learned using a Q-learning algorithm. The results show that the simulator provides a reasonable initial policy for a defined Persona profile. Moreover, the learned robot assistance has proved robust to potential changes in the user’s behaviour. In this way, we can speed up the fine-tuning of the rough policy during real interactions to tailor the assistance to the given user. We believe the presented approach can easily be extended to other types of HRI tasks, for example when input data is required to train a learning algorithm but data collection is very expensive or infeasible. We advocate that simulation is a convenient tool in these cases.
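
    As a rough illustration of the learning loop described above, the sketch below learns an assistance policy with tabular Q-learning against a simulated user in Python. The persona's success probabilities, the state and action spaces, and the reward shaping are hypothetical stand-ins, not the published Persona Definition or Task Engine.

        import random

        random.seed(0)

        # Hypothetical persona: success probability per level of robot assistance
        # (0 = no help, 1 = hint, 2 = full guidance); stronger help is penalised.
        persona_success_prob = {0: 0.3, 1: 0.6, 2: 0.9}
        assist_cost = {0: 0.0, 1: 0.2, 2: 0.5}

        n_states = 5            # position within a sequential exercise
        actions = [0, 1, 2]
        alpha, gamma, eps = 0.1, 0.9, 0.1
        Q = [[0.0] * len(actions) for _ in range(n_states)]

        def step(state, action):
            """Simulated user: advances on success, which depends on the assistance given."""
            success = random.random() < persona_success_prob[action]
            reward = (1.0 if success else -0.1) - assist_cost[action]
            next_state = min(state + 1, n_states - 1) if success else state
            done = success and state == n_states - 1
            return next_state, reward, done

        for episode in range(2000):
            s = 0
            for _ in range(50):
                a = random.choice(actions) if random.random() < eps else max(actions, key=lambda x: Q[s][x])
                s2, r, done = step(s, a)
                Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
                s = s2
                if done:
                    break

        greedy_policy = [max(actions, key=lambda a: Q[s][a]) for s in range(n_states)]
        print(greedy_policy)    # per-state assistance level suggested as the initial policy

    The resulting table is only a rough starting point; as the abstract notes, the policy would still be fine-tuned during real interactions with the user.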