A Formal Framework for Speedup Learning from Problems and Solutions
Speedup learning seeks to improve the computational efficiency of problem
solving with experience. In this paper, we develop a formal framework for
learning efficient problem solving from random problems and their solutions. We
apply this framework to two different representations of learned knowledge,
namely control rules and macro-operators, and prove theorems that identify
sufficient conditions for learning in each representation. Our proofs are
constructive in that they are accompanied with learning algorithms. Our
framework captures both empirical and explanation-based speedup learning in a
unified fashion. We illustrate our framework with implementations in two
domains: symbolic integration and the Eight Puzzle. This work integrates many
strands of experimental and theoretical work in machine learning, including
empirical learning of control rules, macro-operator learning, Explanation-Based
Learning (EBL), and Probably Approximately Correct (PAC) Learning.
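The empirical flavor of one of the two representations can be sketched in a few lines. The Python snippet below is a hypothetical toy, not the paper's algorithm: `learn_macros`, `min_support`, and the operator-name traces are all assumptions. It mines operator subsequences that recur across sample solutions and returns them as candidate composite (macro) operators; the paper's actual contribution is the PAC-style sufficient conditions under which such learning provably speeds up problem solving, which this sketch does not check.

```python
# Toy sketch of empirical macro-operator learning from solved problems
# (illustrative only; no PAC-style guarantees are checked here).
from collections import Counter

def learn_macros(solutions, min_len=2, max_len=4, min_support=2):
    """Extract operator subsequences that recur across sample solutions.

    solutions: list of operator-name sequences, one per solved problem.
    Returns subsequences appearing in at least `min_support` solutions,
    to be installed as composite (macro) operators.
    """
    counts = Counter()
    for sol in solutions:
        seen = set()
        for length in range(min_len, max_len + 1):
            for i in range(len(sol) - length + 1):
                seen.add(tuple(sol[i:i + length]))
        counts.update(seen)  # count each candidate macro once per solution
    return [macro for macro, c in counts.items() if c >= min_support]

# Example: Eight Puzzle style operator traces (made-up data).
sols = [["up", "left", "down", "right"],
        ["left", "down", "right", "up"],
        ["up", "left", "down", "left"]]
print(learn_macros(sols))  # e.g. ('up', 'left'), ('left', 'down'), ...
```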
Between MDPs and Semi-MDPs: Learning, Planning, and Representing Knowledge at Multiple Temporal Scales
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key challenges for AI. In this paper we develop an approach to these problems based on the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We extend the usual notion of action to include options: whole courses of behavior that may be temporally extended, stochastic, and contingent on events. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint torques. Options may be given a priori, learned by experience, or both. They may be used interchangeably with actions in a variety of planning and learning methods. The theory of semi-Markov decision processes (SMDPs) can be applied to model the consequences of options and as a basis for planning and learning methods using them. In this paper we develop these connections, building on prior work by Bradtke and Duff (1995), Parr (in prep.), and others. Our main novel results concern the interface between the MDP and SMDP levels of analysis. We show how a set of options can be altered by changing only their termination conditions to improve over SMDP methods with no additional cost. We also introduce intra-option temporal-difference methods that are able to learn from fragments of an option's execution. Finally, we propose a notion of subgoal which can be used to improve the options themselves. Overall, we argue that options and their models provide hitherto missing aspects of a powerful, clear, and expressive framework for representing and organizing knowledge.
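The intra-option idea can be made concrete with a small tabular sketch. The `Option` type, the state/action encoding, and the function names below are assumptions for illustration; the update itself follows the standard one-step intra-option Q-learning form, which backs up every option whose policy is consistent with the executed action, weighting the successor value by the option's termination probability beta(s').

```python
# Illustrative one-step intra-option Q-learning update (tabular).
# Data structures and names are assumptions, not the paper's code.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Option:
    name: str
    policy: Callable[[str], str]   # pi_o(s) -> primitive action
    beta: Callable[[str], float]   # termination probability in state s

def intra_option_update(Q: Dict[Tuple[str, str], float],
                        options, s, a, r, s_next,
                        alpha=0.1, gamma=0.99):
    """Update Q(s, o) for every option whose policy would take `a` in `s`."""
    v_max = max(Q.get((s_next, o.name), 0.0) for o in options)
    for o in options:
        if o.policy(s) != a:
            continue  # option inconsistent with the executed action
        b = o.beta(s_next)
        # Continue with o's own value unless it terminates (prob. b),
        # in which case bootstrap from the best available option.
        target = r + gamma * ((1.0 - b) * Q.get((s_next, o.name), 0.0)
                              + b * v_max)
        q = Q.get((s, o.name), 0.0)
        Q[(s, o.name)] = q + alpha * (target - q)

# Example: one transition updates only the consistent option.
opts = [Option("go-left", lambda s: "left", lambda s: 0.1),
        Option("go-up", lambda s: "up", lambda s: 1.0)]
Q = {}
intra_option_update(Q, opts, "s0", "left", 1.0, "s1")
print(Q)  # {('s0', 'go-left'): 0.1}
```

Because every consistent option is updated from a single primitive transition, learning proceeds from fragments of an option's execution rather than waiting for the option to terminate.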
A Statistical Approach to Solving the EBL Utility Problem
Many "learning from experience" systems use information extracted from problem solving experiences to modify a performance element PE, forming a new element PE 0 that can solve these and similar problems more efficiently. However, as transformations that improve performance on one set of problems can degrade performance on other sets, the new PE 0 is not always better than the original PE; this depends on the distribution of problems. We therefore seek the performance element whose expected performance, over this distribution, is optimal. Unfortunately, the actual distribution, which is needed to determine which element is optimal, is usually not known. Moreover, the task of finding the optimal element, even knowing the distribution, is intractable for most interesting spaces of elements. This paper presents a method, palo, that side-steps these problems by using a set of samples to estimate the unknown distribution, and by using a set of transformations to hill-climb to a local o..