
    Efficiently Learning from Revealed Preference

    In this paper, we consider the revealed preferences problem from a learning perspective. Every day, a price vector and a budget are drawn from an unknown distribution, and a rational agent buys his most preferred bundle according to some unknown utility function, subject to the given prices and budget constraint. We wish not only to find a utility function which rationalizes a finite set of observations, but to produce a hypothesis valuation function which accurately predicts the behavior of the agent in the future. We give efficient algorithms with polynomial sample complexity for agents with linear valuation functions, as well as for agents with linearly separable, concave valuation functions with bounded second derivative.
    Comment: Extended abstract appears in WINE 201
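    As a rough illustration of the learning problem in this abstract, the sketch below fits a linear valuation to simulated purchase observations. It is not the paper's algorithm: the synthetic data, the max-margin linear program, and every variable name are assumptions made for the example, relying only on the fact that a linear utility under a budget constraint is maximized by spending the whole budget on the good with the best value-per-price ratio.

    # A minimal sketch (not the paper's method) of recovering a linear valuation
    # v that rationalizes observed purchases. With utility v.x and budget p.x <= b,
    # the optimal bundle spends everything on the good with the best "bang per
    # buck" v_i / p_i, so each observation gives constraints v_j / p_j >= v_i / p_i
    # for the purchased good j. A max-margin LP (SciPy's linprog) picks one
    # valuation consistent with all observations.
    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    d, T = 5, 40                        # number of goods, number of observations
    v_true = rng.uniform(0.1, 1.0, d)   # hidden linear valuation (for simulation only)

    prices = rng.uniform(0.5, 2.0, (T, d))
    chosen = np.argmax(v_true / prices, axis=1)   # best bang-per-buck good each day

    # Decision variables: [v_1, ..., v_d, m]; maximize the margin m subject to
    #   v_j / p_j - v_i / p_i >= m  for every observation and every i != j,
    #   sum(v) = 1,  v >= 0.
    A_ub, b_ub = [], []
    for t in range(T):
        j = chosen[t]
        for i in range(d):
            if i == j:
                continue
            row = np.zeros(d + 1)
            row[j] = -1.0 / prices[t, j]
            row[i] = 1.0 / prices[t, i]
            row[d] = 1.0                 # + m <= 0 after moving m to the left side
            A_ub.append(row)
            b_ub.append(0.0)

    A_eq = [np.append(np.ones(d), 0.0)]
    b_eq = [1.0]
    c = np.zeros(d + 1)
    c[d] = -1.0                          # minimize -m, i.e. maximize the margin m

    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * d + [(None, None)])
    v_hat = res.x[:d]
    print("recovered valuation (normalized):", np.round(v_hat, 3))
    print("agrees with observed choices:",
          np.all(np.argmax(v_hat / prices, axis=1) == chosen))

    The margin objective merely selects one point from the (typically large) set of valuations consistent with the observations; the paper's sample-complexity results concern how well such a hypothesis predicts future behavior.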

    Task Transfer by Preference-Based Cost Learning

    The goal of task transfer in reinforcement learning is to migrate an agent's action policy from the source task to the target task. Given their successes in robotic action planning, current methods mostly rely on two requirements: exactly relevant expert demonstrations or an explicitly coded cost function for the target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework in which expert preferences serve as guidance. In particular, we alternate between two steps: first, experts apply pre-defined preference rules to select expert demonstrations relevant to the target task; second, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via an enhanced Adversarial MaxEnt IRL, and generate more trajectories from the learned target distribution for the next round of preference selection. A theoretical analysis of the distribution learning and of the convergence of the proposed algorithm is provided. Extensive simulations on several benchmarks further verify the effectiveness of the proposed method.
    Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed equally to this work.
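    The sketch below is a heavily simplified, self-contained illustration of the alternation this abstract describes (preference-based selection followed by cost learning). It is not the authors' code: trajectories are reduced to feature vectors, the "expert preference rule" is a hard-coded filter, the cost is linear, and the update is a plain MaxEnt-style feature-matching step rather than the paper's Adversarial MaxEnt IRL; all names are invented for the example.

    import numpy as np

    rng = np.random.default_rng(1)

    def preference_rule(traj_features):
        """Stand-in for the expert's pre-defined preference: prefer trajectories
        whose second feature (say, 'reaches the target region') is high."""
        return traj_features[:, 1] > 0.6

    def sample_trajectories(pool, cost_w, n=50):
        """Sample from the pool with probability proportional to exp(-cost)."""
        logits = -pool @ cost_w
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        idx = rng.choice(len(pool), size=n, p=probs)
        return pool[idx]

    # Source-task demonstrations, summarized as 3-d feature vectors.
    source_demos = rng.uniform(0, 1, (200, 3))
    cost_w = np.zeros(3)                      # linear cost weights to be learned
    candidates = source_demos.copy()

    for it in range(20):
        # Step 1: the expert preference selects demonstrations relevant to the target.
        selected = candidates[preference_rule(candidates)]
        if len(selected) == 0:
            break
        # Step 2: update the cost so selected trajectories look cheap relative to
        # the current trajectory distribution (a feature-matching gradient step,
        # loosely in the spirit of MaxEnt IRL rather than the adversarial version).
        current = sample_trajectories(candidates, cost_w)
        grad = selected.mean(axis=0) - current.mean(axis=0)
        cost_w -= 0.5 * grad                  # lower the cost of preferred features
        # Step 3: generate new trajectories from the learned distribution and feed
        # them back into the next round of preference selection.
        candidates = np.vstack([source_demos,
                                sample_trajectories(candidates, cost_w)])

    print("learned cost weights:", np.round(cost_w, 3))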

    Engaging low skilled employees in workplace learning: UK Commission for Employment and Skills Evidence Report no. 43

    The Employee Demand study (UKCES, 2009) highlighted the significant barriers to learning faced by a number of UK employees. This report sets out the findings of a study into the motivators for and barriers to participation in workplace learning by low skilled employees, a group which has been overlooked in previous research. The study was carried out by the Employment Research Institute (ERI) at Edinburgh Napier University on behalf of the UK Commission for Employment and Skills (the UK Commission). The report presents the results of a survey of both employee and employer views on participation in workplace learning in the care sector in north east England and the hotel sector in Yorkshire and Humberside. Alongside the standard survey, the report also outlines the stated preference approach adopted, which allows employees to consider a hypothetical case of participation in workplace learning: employees were given choices between combinations of job- and learning-related factors that might affect their preference for or against workplace learning. In conclusion, the report suggests many positive features which employers, individuals and policy makers could build on in developing the skills of people in low skilled jobs, which is important in securing our competitive advantage in the long term.
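    Purely as an illustration of the stated-preference idea mentioned above, the sketch below simulates respondents choosing between hypothetical combinations of job- and learning-related attributes and fits a simple logit model to recover how much each attribute drives the choice. The attributes, coefficients, and simulated responses are all invented; this is not the report's survey design or analysis.

    import numpy as np

    rng = np.random.default_rng(3)

    # Each hypothetical option is described by three binary attributes, e.g.
    # [training during paid hours, leads to a qualification, extra weekly hour of work].
    true_weights = np.array([1.2, 0.8, -1.0])   # hidden respondent preferences (simulated)
    n_choices = 500

    # Simulate pairwise choice tasks: option A vs option B, pick the higher utility.
    A = rng.integers(0, 2, (n_choices, 3)).astype(float)
    B = rng.integers(0, 2, (n_choices, 3)).astype(float)
    util_diff = (A - B) @ true_weights + rng.logistic(size=n_choices)
    chose_A = (util_diff > 0).astype(float)

    # Fit a binary logit on the attribute differences by gradient ascent.
    w = np.zeros(3)
    X = A - B
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += 0.5 * X.T @ (chose_A - p) / n_choices
    print("estimated attribute weights:", np.round(w, 2))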

    Preference-Based Learning for Exoskeleton Gait Optimization

    This paper presents a personalized gait optimization framework for lower-body exoskeletons. Rather than optimizing numerical objectives such as the mechanical cost of transport, our approach learns directly from user preferences, e.g., for comfort. Building upon work in preference-based interactive learning, we present the CoSpar algorithm. CoSpar prompts the user to give pairwise preferences between trials and to suggest improvements; as exoskeleton walking is a non-intuitive behavior, users can provide preferences more easily and reliably than numerical feedback. We show that CoSpar performs competitively in simulation and demonstrate a prototype implementation of CoSpar on a lower-body exoskeleton to optimize human walking trajectory features. In the experiments, CoSpar consistently found user-preferred parameters of the exoskeleton’s walking gait, which suggests that it is a promising starting point for adapting and personalizing exoskeletons (or other assistive devices) to individual users.
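    To illustrate learning from pairwise preferences rather than numerical scores, the sketch below optimizes a single gait parameter with a Bradley-Terry style utility model. It is not the CoSpar algorithm: the hidden "comfort" function standing in for the user, the logistic update, and the random exploratory proposals are all assumptions made for the example.

    import numpy as np

    rng = np.random.default_rng(2)

    params = np.linspace(0.1, 0.9, 17)             # candidate gait parameters (e.g. step length)
    true_comfort = -(params - 0.55) ** 2           # hidden user preference, peaked at 0.55
    utility = np.zeros_like(params)                # learned utility per candidate

    def user_prefers(i, j, noise=0.02):
        """Simulated user: noisy comparison of the two trials' comfort."""
        return (true_comfort[i] + noise * rng.normal()
                > true_comfort[j] + noise * rng.normal())

    def update(i_win, i_lose, lr=0.5):
        """One logistic (Bradley-Terry) gradient step on a single comparison."""
        p_win = 1.0 / (1.0 + np.exp(-(utility[i_win] - utility[i_lose])))
        utility[i_win] += lr * (1.0 - p_win)
        utility[i_lose] -= lr * (1.0 - p_win)

    best = rng.integers(len(params))
    for trial in range(60):
        challenger = rng.integers(len(params))     # simple exploratory proposal
        if challenger == best:
            continue
        if user_prefers(challenger, best):
            update(challenger, best)
            best = int(np.argmax(utility))         # exploit the learned utility
        else:
            update(best, challenger)

    print("preferred gait parameter:", params[int(np.argmax(utility))])

    The point of the exercise is the feedback channel: every update uses only which of two trials the user preferred, which is the kind of signal the abstract argues is easier and more reliable to elicit than a numerical comfort score.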