    Using informative behavior to increase engagement while learning from human reward

    In this work, we address a relatively unexplored aspect of designing agents that learn from human reward. We investigate how an agent’s non-task behavior can affect a human trainer’s training and the agent’s learning. We use the TAMER framework, which facilitates the training of agents through human-generated reward signals, i.e., judgements of the quality of the agent’s actions, as the foundation for our investigation. Then, starting from the premise that the interaction between the agent and the trainer should be bi-directional, we propose two new training interfaces to increase a human trainer’s active involvement in the training process and thereby improve the agent’s task performance. One provides information on the agent’s uncertainty, a metric calculated from data coverage; the other on its performance. Our results from a 51-subject user study show that these interfaces can induce trainers to train longer and give more feedback. The agent’s performance, however, increases only in response to the addition of performance-oriented information, not to the sharing of uncertainty levels. These results suggest that the organizational maxim about human behavior, “you get what you measure” (i.e., sharing metrics with people causes them to focus on optimizing those metrics while de-emphasizing other objectives), also applies to the training of agents. Using principal component analysis, we show how trainers in the two conditions train agents differently. In addition, by simulating the influence of the agent’s uncertainty-informative behavior on a human’s training behavior, we show that trainers can be distracted by the agent sharing its uncertainty levels about its actions, giving poor feedback for the sake of reducing the agent’s uncertainty without improving the agent’s performance.
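
    As a rough illustration of the setup described above, the sketch below shows a minimal TAMER-style agent in Python: it regresses a model of the human's reward, acts greedily on that model, and derives a coverage-based uncertainty from visit counts. The class, its methods, and the specific uncertainty formula are illustrative assumptions; the paper's actual implementation is not given in the abstract.

        # Minimal TAMER-style sketch (hypothetical names and formulas).
        import numpy as np

        class TamerAgent:
            """Learns a model H(s, a) of human reward and acts greedily on it."""

            def __init__(self, n_states, n_actions, lr=0.1):
                self.H = np.zeros((n_states, n_actions))       # predicted human reward
                self.visits = np.zeros((n_states, n_actions))  # data-coverage counts
                self.lr = lr

            def act(self, state):
                # TAMER selects the action with the highest predicted
                # human reward, not the highest environmental return.
                return int(np.argmax(self.H[state]))

            def update(self, state, action, human_reward):
                # Incremental regression toward the trainer's judgement.
                self.visits[state, action] += 1
                self.H[state, action] += self.lr * (human_reward - self.H[state, action])

            def uncertainty(self, state, action):
                # Coverage-based uncertainty: 1.0 for unseen pairs, shrinking
                # as training data accumulates (an assumption standing in for
                # the paper's data-coverage metric).
                return 1.0 / (1.0 + self.visits[state, action])

    Under this sketch, the uncertainty-oriented interface would surface uncertainty(state, action) to the trainer, while the performance-oriented interface would surface the agent's task score instead.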

    Social interaction for efficient agent learning from human reward

    Learning from rewards generated by a human trainer observing an agent in action has proven to be a powerful method for teaching autonomous agents to perform challenging tasks, especially for non-technical users. Since the efficacy of this approach depends critically on the reward the trainer provides, we consider how the interaction between the trainer and the agent should be designed so as to increase the efficiency of the training process. This article investigates the influence of the agent’s socio-competitive feedback on the human trainer’s training behavior and the agent’s learning. The results of our user study with 85 participants suggest that the agent’s passive socio-competitive feedback (showing the performance and scores of all trainers’ agents in a leaderboard) substantially increases the engagement of the participants in the game task and improves the agents’ performance, even though the participants do not directly play the game but instead train the agent to do so. Moreover, making this feedback active (sending each trainer her agent’s performance relative to others) induces even more participants to train agents longer and further improves the agents’ learning. Our further analysis shows that agents trained by trainers who received both the passive and active social feedback obtain higher performance under a score mechanism that trainers can optimize, and that the agent’s additional active social feedback keeps participants training their agents, yielding policies that score higher under such a mechanism.
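
    To make the two feedback conditions concrete, the sketch below separates them as two Python functions: passive feedback renders a leaderboard visible to everyone, while active feedback pushes a trainer's relative standing directly to her. The data structures and message wording are assumptions; the study's actual system is not detailed in the abstract.

        # Hypothetical sketch of passive vs. active socio-competitive feedback.
        def passive_feedback(scores):
            """Passive: a leaderboard of all trainers' agent scores."""
            board = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
            return "\n".join(f"{rank}. {name}: {score}"
                             for rank, (name, score) in enumerate(board, start=1))

        def active_feedback(scores, trainer):
            """Active: send one trainer her agent's standing relative to others."""
            board = sorted(scores, key=scores.get, reverse=True)
            rank = board.index(trainer) + 1
            return f"Your agent ranks {rank} of {len(board)} with score {scores[trainer]}."

        scores = {"alice": 420, "bob": 515, "carol": 390}
        print(passive_feedback(scores))          # shown to all participants
        print(active_feedback(scores, "alice"))  # pushed to one participant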

    Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning

    Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback is the time needed to satisfactorily learn a task, especially in large state-action spaces. To address this issue, interactive reinforcement learning proposes the use of externally-sourced information to speed up the learning process. Up to now, different information sources have been used to give advice to the learner agent, among them human-sourced advice. When interacting with a learner agent, humans may provide either evaluative or informative advice. From the agent's perspective these styles of interaction are commonly referred to as reward-shaping and policy-shaping respectively. With evaluative advice, the human provides feedback on the quality of the action just performed; with informative advice, the human suggests the best action to select in a given situation. Prior research has focused on the effect of human-sourced advice on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent while reducing the amount of human engagement required. This work presents an experimental setup for a human trial designed to compare the two advice-delivery methods in terms of human engagement. The results show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach indicates that the agent's ability to follow the advice is higher, and therefore they judge their own advice to be of higher accuracy compared to people providing evaluative advice.
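
    The distinction between the two advice styles can be made concrete in a tabular Q-learning loop, as in the Python sketch below: evaluative advice enters as an additive shaping term on the reward, while informative advice overrides action selection. The function names and shaping rules are illustrative assumptions, not the paper's experimental implementation.

        # Hypothetical sketch: reward shaping vs. policy shaping in Q-learning.
        import numpy as np

        def evaluative_update(Q, s, a, r_env, human_eval, s_next,
                              alpha=0.1, gamma=0.99):
            """Evaluative advice (reward shaping): the human's judgement of
            the prior action is added to the environmental reward."""
            target = r_env + human_eval + gamma * np.max(Q[s_next])
            Q[s, a] += alpha * (target - Q[s, a])

        def informative_action(Q, s, advised_action=None, epsilon=0.1):
            """Informative advice (policy shaping): a suggested action
            overrides the agent's own epsilon-greedy selection."""
            if advised_action is not None:
                return advised_action
            if np.random.random() < epsilon:
                return np.random.randint(Q.shape[1])
            return int(np.argmax(Q[s]))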

    Adapting robot task planning to user preferences: an assistive shoe dressing example

    Healthcare robots will be the next big advance in domestic welfare, with robots able to assist elderly people and users with disabilities. However, each user has his/her own preferences, needs and abilities. Therefore, robotic assistants will need to adapt to them, behaving accordingly. Towards this goal, we propose a method to adapt a robot’s behavior to user preferences using symbolic task planning. A user model is built from the user’s answers to simple questions with a fuzzy inference system, and it is then integrated into the planning domain. We describe an adaptation method based on both the user’s satisfaction and the execution outcome, which determine the penalizations applied to the planner’s rules. We demonstrate the application of the adaptation method in a simple shoe-fitting scenario, with experiments performed in a simulated user environment. The results show quick behavior adaptation, even when the user’s behavior changes, as well as robustness to wrong inference of the initial user model. Finally, some insights from a non-simulated, real-world shoe-fitting setup are also provided.
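
    The penalization idea can be sketched as a simple cost update over planning rules, as in the Python snippet below: a rule's cost grows with user dissatisfaction and execution failure, and the planner prefers the cheapest applicable rule. The rule names, weights, and update formula are illustrative assumptions, not the authors' exact method.

        # Hypothetical sketch of preference-driven rule penalization.
        def update_rule_cost(cost, satisfaction, success, w_sat=1.0, w_exec=1.0):
            """Penalize a rule when the user is dissatisfied or execution fails.

            satisfaction: user satisfaction in [0, 1], e.g. inferred by a
                          fuzzy inference system from questionnaire answers.
            success:      True if the assistive action executed correctly.
            """
            penalty = w_sat * (1.0 - satisfaction) + w_exec * (0.0 if success else 1.0)
            return cost + penalty

        # The planner then selects the applicable rule with the lowest cost.
        rules = {"slide_shoe_fast": 0.0, "slide_shoe_slow": 0.0}
        rules["slide_shoe_fast"] = update_rule_cost(rules["slide_shoe_fast"],
                                                    satisfaction=0.3, success=True)
        best_rule = min(rules, key=rules.get)  # now prefers "slide_shoe_slow"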

    Globalising employee engagement: myths and reality; a Middle East perspective.

    The purpose of this research was to investigate whether selected cultural and national aspects affect employee engagement drivers. Another aim was to find out whether applying global engagement tools in different cultures would produce an accurate engagement report. Finally, a new tool was proposed and examined in this study by companies operating in the Middle and Near East regions. Employee engagement has been of growing concern to business leaders as well as occupational psychologists, since it has been claimed to relate to organisational productivity and long-term success. Despite this growing concern and the various consultancy solutions on offer, few academic studies have tackled cross-cultural aspects of employee engagement. In this research, both qualitative and quantitative methodologies were used. The qualitative data consisted of two in-depth interviews with employees working in the Middle and Near East regions. The quantitative data were gathered with the aid of two questionnaires. One hundred and eighty-nine responses were received out of two hundred and seventeen questionnaires sent, a response rate of eighty-seven per cent. This research produced a number of key findings: (a) cultural, national and organisational factors affect engagement drivers; (b) engagement drivers change over time, at least in priority; (c) measuring engagement through a globally designed fixed tool is not likely to produce accurate results that management can use to plan actions. The main conclusion drawn from this research was that current approaches to measuring employee engagement take it for granted that engagement drivers are universal, and this assumption should be revised. The author recommends that leaders investigate and analyse engagement drivers before any engagement survey is undertaken. A new tool developed in this research, which builds engagement questionnaires from the key drivers identified in a specific work culture, was tested by a number of organisations.

    The Effects Of Social Media Influencer Attributes On Collaborating Brand Credibility And Advocacy

    This thesis investigates characteristics and dimensions of social media influencers that might affect brand outcomes when a brand is endorsed by, or collaborates with, the influencer in brand communications. This study specifically examines the impact of three dimensions (social media influencer credibility, attractiveness, and endorsement content quality) on the collaborating brand’s credibility. It also examines the influence of brand credibility on brand advocacy, and explores the mediating role of brand credibility and the moderating role of digital experience. To achieve these aims, the researcher employed the premises of two theories: the stimulus–organism–response theory and the social learning theory. The data were collected using an online questionnaire from 281 respondents. The findings reveal that social media influencer credibility significantly influences the credibility of the collaborating brand, which in turn exerts a significant impact on brand advocacy. A mediating effect of collaborating brand credibility is identified between social media influencer credibility and brand advocacy. The findings have essential managerial implications that assist managers in choosing the most effective social media influencer for their brand.

    Interactive Imitation Learning in Robotics: A Survey

    Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL) in which human feedback is provided intermittently during robot execution, allowing online improvement of the robot's behavior. In recent years, IIL has increasingly carved out its own space as a promising data-driven alternative for solving complex robotic tasks. The advantages of IIL are its data efficiency, as the human feedback guides the robot directly towards improved behavior, and its robustness, as the distribution mismatch between the teacher and learner trajectories is minimized by providing feedback directly over the learner's trajectories. Nevertheless, despite the opportunities that IIL presents, its terminology, structure, and applicability are neither clear nor unified in the literature, slowing down its development and, therefore, the research of innovative formulations and discoveries. In this article, we attempt to facilitate research in IIL and lower entry barriers for new practitioners by providing a survey of the field that unifies and structures it. In addition, we aim to raise awareness of its potential, what has been accomplished, and what open research questions remain. We organize the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception of the learning process), applications, and benchmarks. Furthermore, we analyze similarities and differences between IIL and RL, providing a discussion on how the concepts of offline, online, off-policy and on-policy learning should be transferred to IIL from the RL literature. We particularly focus on robotic applications in the real world and discuss their implications, limitations, and promising future areas of research.
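
    The core loop the survey organizes can be summarized schematically, as in the Python sketch below: the learner acts on its own trajectory, the human intermittently corrects, and corrections are aggregated into a dataset used for online policy updates (a DAgger-style scheme). All names here (policy, env, human.maybe_correct) are placeholders, not an API from the surveyed literature.

        # Schematic IIL loop with intermittent human corrections (placeholders).
        def iil_loop(policy, env, human, episodes=10):
            dataset = []
            for _ in range(episodes):
                obs, done = env.reset(), False
                while not done:
                    action = policy.act(obs)
                    correction = human.maybe_correct(obs, action)  # None if silent
                    if correction is not None:
                        # Feedback arrives over the learner's own trajectory,
                        # which limits teacher/learner distribution mismatch.
                        dataset.append((obs, correction))
                        action = correction
                    obs, done = env.step(action)  # simplified step signature
                policy.fit(dataset)  # online improvement from aggregated data
            return policy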