23 research outputs found

    Coactive Learning for Locally Optimal Problem Solving

    Full text link
    Coactive learning is an online problem-solving setting in which the solutions provided by a solver are interactively improved by a domain expert, which in turn drives learning. In this paper we extend the study of coactive learning to problems where obtaining a globally optimal or near-optimal solution may be intractable, or where an expert can only be expected to make small, local improvements to a candidate solution. The goal of learning in this new setting is to minimize the cost, measured as the expert effort over time. We first establish theoretical bounds on the average cost of the existing coactive Perceptron algorithm. In addition, we consider new online algorithms that use cost-sensitive and Passive-Aggressive (PA) updates, showing similar or improved theoretical bounds. We provide an empirical evaluation of the learners in various domains, which shows that the Perceptron-based algorithms are quite effective and that, unlike in online classification, the PA algorithms do not yield significant performance gains. Comment: AAAI 2014 paper, including appendices.
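
    A minimal sketch of the update this line of work builds on may help: the learner proposes a solution, the expert returns an (often only locally) improved one, and the Perceptron-style learner shifts its weights toward the expert's preference. All names and signatures below are illustrative assumptions, not the paper's code.

        import numpy as np

        def coactive_perceptron(phi, solve, improve, instances, dim):
            """Sketch of a coactive (preference) Perceptron.

            phi(x, y)     -- joint feature map, returns a length-`dim` array
            solve(x, w)   -- learner's (possibly only locally optimal) solution
            improve(x, y) -- expert's small, local improvement of candidate y
            """
            w = np.zeros(dim)
            for x in instances:
                y = solve(x, w)        # present a candidate solution
                y_bar = improve(x, y)  # expert improves it, driving learning
                # If the expert returns y unchanged, the feature difference
                # is zero and the update is a no-op.
                w = w + phi(x, y_bar) - phi(x, y)
            return w

    A Passive-Aggressive variant of the kind the paper studies would instead scale the same feature difference by a margin-based loss divided by its squared norm.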

    The development of an adaptive upper-limb stroke rehabilitation robotic system

    Get PDF
    Background: Stroke is the primary cause of adult disability. To support this large population in recovery, robotic technologies are being developed to assist in the delivery of rehabilitation. This paper presents an automated system for a rehabilitation robotic device that guides stroke patients through an upper-limb reaching task. The system uses a decision-theoretic model, a partially observable Markov decision process (POMDP), as its primary engine for decision making. The POMDP allows the system to automatically modify exercise parameters to account for the specific needs and abilities of different individuals, and to use these parameters to make appropriate decisions about stroke rehabilitation exercises.

    Methods: The performance of the system was evaluated by comparing the decisions made by the system with those of a human therapist. A single patient participant was paired with a therapist participant for the duration of the study, for a total of six sessions. Each session was an hour long and occurred three times a week for two weeks. During each session, three steps were followed: (A) after the system made a decision, the therapist either agreed or disagreed with it; (B) the researcher had the device execute the decision made by the therapist; (C) the patient then performed the reaching exercise. These steps were repeated in the order A-B-C until the end of the session. Qualitative and quantitative questions were asked of both participants at the end of each session and at the completion of the study.

    Results: Overall, the therapist agreed with the system's decisions approximately 65% of the time. In general, the therapist found the system's decisions believable and could envision the system being used in both clinical and home settings. The patient was satisfied with the system and would use it as his/her primary method of rehabilitation.

    Conclusions: Because the sample size was limited, the data collected in this study can only provide insight into the performance of the system. The next stage for this project is to test the system with a larger sample size to obtain significant results.
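
    The POMDP at the heart of the system is not specified in the abstract, but the belief update any discrete POMDP controller performs can be sketched as follows (array shapes and names are assumptions for illustration):

        import numpy as np

        def belief_update(b, a, o, T, O):
            """Bayesian belief update for a discrete POMDP.

            b : belief over hidden states, shape (S,)
            a : action index
            o : observation index
            T : T[a, s, s2] = P(s2 | s, a), shape (A, S, S)
            O : O[a, s2, o] = P(o | s2, a), shape (A, S, num_obs)
            """
            predicted = b @ T[a]                   # predict: P(s2 | b, a)
            unnormalized = O[a, :, o] * predicted  # weight by observation likelihood
            return unnormalized / unnormalized.sum()

    In the rehabilitation setting, the hidden state would capture the patient's current ability, and the chosen actions would adjust the exercise parameters.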

    The Use of Domain Knowledge in Reinforcement Learning (Het gebruik van domeinkennis in reinforcement learning)

    No full text
    Reinforcement learning, or "learning from rewards", is the branch of artificial intelligence that studies how agents can learn optimal behaviour in sequential decision problems, where the value of an earlier decision sometimes only becomes clear several actions later. This learning happens through exploration of the environment. Most classical reinforcement learning techniques treat the environment as a black box about whose behaviour nothing is known, so only sufficient exploration can reveal which decisions lead to which consequences. In many domains, however, expert knowledge of the underlying processes is available, and it is plausible that using this information can drastically simplify the learning task.

    The first part of this thesis investigates several ways in which such knowledge can be used to speed up the learning process, including the case where a complete and correct model of the environment is given, and a way of treating different states and actions as if they were identical. We introduce two new algorithms: the first efficiently solves a given sequential decision problem; the second uses a distance metric between state-action pairs to partition the environment, yielding a large reduction in problem size without introducing a large error.

    The second part of this work focuses on problems where extra information about the domain is only available at a cost. More information can lead to better behaviour, but it brings a higher cost with it. This part of the thesis studies the balance between the cost and the value of information, both for ordinary numerical prediction tasks and for sequential decision problems. Three new algorithms are introduced, based respectively on linear regression, regression trees, and linear model trees. All three are shown never to acquire information that is not worth its cost.
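
    The second algorithm of the first part, as summarized, partitions the environment using a distance metric over state-action pairs. The thesis's actual procedure is not given here; the following hypothetical sketch only illustrates merging pairs whose mutual distance stays below a threshold:

        def aggregate_state_actions(pairs, distance, threshold):
            """Greedy aggregation of state-action pairs (hypothetical sketch).

            A pair within `threshold` of every member of an existing cluster
            is merged into it, shrinking the problem at the price of a
            bounded approximation error.
            """
            clusters = []                          # each cluster: list of indices
            for i in range(len(pairs)):
                for cluster in clusters:
                    if all(distance(pairs[i], pairs[j]) < threshold
                           for j in cluster):
                        cluster.append(i)
                        break
                else:
                    clusters.append([i])           # start a new abstract pair
            return clusters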

    Cost-sensitive reinforcement learning

    No full text
    We introduce cost-sensitive regression as a way to incorporate information obtained by planning as background knowledge into a relational reinforcement learning algorithm. By offering a trade-off between knowledge-rich but computationally expensive results from planning-like approaches such as minimax search, and computationally cheap but possibly incorrect generalizations, the reinforcement learning agent can automatically learn when to apply planning and when to build a generalizing strategy. This approach would be useful for problem domains where a model is given but the problem is too large to solve by search. We discuss some difficulties that arise when trying to define costs that are semantically well founded for reinforcement learning problems, and present a preliminary algorithm that illustrates the feasibility of the approach.
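
    The trade-off the abstract describes can be pictured with a hypothetical decision rule (all names are illustrative; as the authors note, defining well-founded costs is the subtle part):

        def choose_action(state, greedy_policy, minimax_search,
                          error_probability, miss_cost, search_cost):
            """Invoke the expensive, knowledge-rich planner only when the
            expected cost of a wrong generalized decision exceeds the cost
            of running the search."""
            if error_probability(state) * miss_cost > search_cost:
                return minimax_search(state)   # expensive but reliable
            return greedy_policy(state)        # cheap generalization, may err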

    Reinforcement learning with state-action-pair generalized aggregation

    No full text

    Linear Regression using Costly Features

    No full text
    In this paper we consider the problem of linear regression where some features might only be observable at a certain cost. We assume that this cost can be expressed in the same units and thus be compared t
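
    One way to picture the setting described above is a greedy, hypothetical procedure (not the paper's algorithm) that acquires a feature only while its reduction in training error outweighs its cost, the two being expressed in the same units:

        import numpy as np

        def greedy_costly_regression(X, y, costs, ridge=1e-6):
            """Buy features whose error reduction exceeds their cost."""
            n, d = X.shape
            y = y - y.mean()                   # work with a centered target
            chosen = []
            err = float(y @ y)                 # RSS of the trivial model
            while True:
                best_gain, best_j, best_err = 0.0, None, err
                for j in range(d):
                    if j in chosen:
                        continue
                    A = X[:, chosen + [j]]
                    # Ridge-regularized least squares on the candidate set.
                    w = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]),
                                        A.T @ y)
                    rss = float(np.sum((y - A @ w) ** 2))
                    gain = (err - rss) - costs[j]   # net of acquisition cost
                    if gain > best_gain:
                        best_gain, best_j, best_err = gain, j, rss
                if best_j is None:
                    return chosen              # no feature is worth its cost
                chosen.append(best_j)
                err = best_err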