1,925 research outputs found

    The Assistive Multi-Armed Bandit

    Full text link
    Learning preferences implicit in the choices humans make is a well studied problem in both economics and computer science. However, most work makes the assumption that humans are acting (noisily) optimally with respect to their preferences. Such approaches can fail when people are themselves learning about what they want. In this work, we introduce the assistive multi-armed bandit, where a robot assists a human playing a bandit task to maximize cumulative reward. In this problem, the human does not know the reward function but can learn it through the rewards received from arm pulls; the robot only observes which arms the human pulls but not the reward associated with each pull. We offer sufficient and necessary conditions for successfully assisting the human in this framework. Surprisingly, better human performance in isolation does not necessarily lead to better performance when assisted by the robot: a human policy can do better by effectively communicating its observed rewards to the robot. We conduct proof-of-concept experiments that support these results. We see this work as contributing towards a theory behind algorithms for human-robot interaction.Comment: Accepted to HRI 201

    Role Playing Learning for Socially Concomitant Mobile Robot Navigation

    Full text link
    In this paper, we present the Role Playing Learning (RPL) scheme for a mobile robot to navigate socially with its human companion in populated environments. Neural networks (NN) are constructed to parameterize a stochastic policy that directly maps sensory data collected by the robot to its velocity outputs, while respecting a set of social norms. An efficient simulative learning environment is built with maps and pedestrians trajectories collected from a number of real-world crowd data sets. In each learning iteration, a robot equipped with the NN policy is created virtually in the learning environment to play itself as a companied pedestrian and navigate towards a goal in a socially concomitant manner. Thus, we call this process Role Playing Learning, which is formulated under a reinforcement learning (RL) framework. The NN policy is optimized end-to-end using Trust Region Policy Optimization (TRPO), with consideration of the imperfectness of robot's sensor measurements. Simulative and experimental results are provided to demonstrate the efficacy and superiority of our method

    Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits

    Get PDF
    Optimizing lower-body exoskeleton walking gaits for user comfort requires understanding users’ preferences over a high-dimensional gait parameter space. However, existing preference-based learning methods have only explored low-dimensional domains due to computational limitations. To learn user preferences in high dimensions, this work presents LINECOSPAR, a human-in-the-loop preference-based framework that enables optimization over many parameters by iteratively exploring one-dimensional subspaces. Additionally, this work identifies gait attributes that characterize broader preferences across users. In simulations and human trials, we empirically verify that LINECOSPAR is a sample-efficient approach for high-dimensional preference optimization. Our analysis of the experimental data reveals a correspondence between human preferences and objective measures of dynamicity, while also highlighting differences in the utility functions underlying individual users’ gait preferences. This result has implications for exoskeleton gait synthesis, an active field with applications to clinical use and patient rehabilitation

    An Adaptive Behaviour-Based Strategy for SARs interacting with Older Adults with MCI during a Serious Game Scenario

    Full text link
    The monotonous nature of repetitive cognitive training may cause losing interest in it and dropping out by older adults. This study introduces an adaptive technique that enables a Socially Assistive Robot (SAR) to select the most appropriate actions to maintain the engagement level of older adults while they play the serious game in cognitive training. The goal is to develop an adaptation strategy for changing the robot's behaviour that uses reinforcement learning to encourage the user to remain engaged. A reinforcement learning algorithm was implemented to determine the most effective adaptation strategy for the robot's actions, encompassing verbal and nonverbal interactions. The simulation results demonstrate that the learning algorithm achieved convergence and offers promising evidence to validate the strategy's effectiveness

    Computational neurorehabilitation: modeling plasticity and learning to predict recovery

    Get PDF
    Despite progress in using computational approaches to inform medicine and neuroscience in the last 30 years, there have been few attempts to model the mechanisms underlying sensorimotor rehabilitation. We argue that a fundamental understanding of neurologic recovery, and as a result accurate predictions at the individual level, will be facilitated by developing computational models of the salient neural processes, including plasticity and learning systems of the brain, and integrating them into a context specific to rehabilitation. Here, we therefore discuss Computational Neurorehabilitation, a newly emerging field aimed at modeling plasticity and motor learning to understand and improve movement recovery of individuals with neurologic impairment. We first explain how the emergence of robotics and wearable sensors for rehabilitation is providing data that make development and testing of such models increasingly feasible. We then review key aspects of plasticity and motor learning that such models will incorporate. We proceed by discussing how computational neurorehabilitation models relate to the current benchmark in rehabilitation modeling – regression-based, prognostic modeling. We then critically discuss the first computational neurorehabilitation models, which have primarily focused on modeling rehabilitation of the upper extremity after stroke, and show how even simple models have produced novel ideas for future investigation. Finally, we conclude with key directions for future research, anticipating that soon we will see the emergence of mechanistic models of motor recovery that are informed by clinical imaging results and driven by the actual movement content of rehabilitation therapy as well as wearable sensor-based records of daily activity
    corecore