A Domain Independent Explanation-Based Generalizer
Coordinated Science Laboratory (formerly the Control Systems Laboratory). Funding: ONR N00014-86-K-0309; National Science Foundation IST-83-1788
Multi-objective evolution for Generalizable Policy Gradient Algorithms
Performance, generalizability, and stability are three Reinforcement Learning
(RL) challenges that arise in combination in many practical applications. Yet
state-of-the-art RL algorithms fall short when addressing multiple RL
objectives simultaneously, and current human-driven design practices may not
be well suited for multi-objective RL. In this paper we present MetaPG, an
evolutionary method that discovers new RL algorithms represented as graphs,
following a multi-objective search criterion in which different RL objectives
are encoded as separate fitness scores. Our findings
show that, when using a graph-based implementation of Soft Actor-Critic (SAC)
to initialize the population, our method is able to find new algorithms that
improve upon SAC's performance and generalizability by 3% and 17%,
respectively, and reduce instability by up to 65%. In addition, we analyze the
graph structure of the best algorithms in the population and offer an
interpretation of specific elements that help trade performance for
generalizability and vice versa. We validate our findings in three different
continuous control tasks: RWRL Cartpole, RWRL Walker, and Gym Pendulum.
Comment: 23 pages, 12 figures, 10 tables
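The multi-objective search idea in the abstract above can be sketched as Pareto-based evolution over separate fitness scores. This is a minimal illustration under assumptions: the function names and mutation scheme are invented here, and MetaPG's actual graph-based algorithm representation is not modeled.

```python
import random

def dominates(a, b):
    """True if fitness vector a Pareto-dominates b (higher is better on every axis)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population, fitness):
    """Keep candidates not dominated by any other candidate.

    Each candidate is scored on separate objectives (e.g. performance,
    generalizability, stability) rather than a single scalar reward.
    """
    scores = {c: fitness(c) for c in population}
    return [c for c in population
            if not any(dominates(scores[o], scores[c])
                       for o in population if o is not c)]

def evolve(init_population, fitness, mutate, generations=10, rng=random.Random(0)):
    """Toy evolutionary loop: keep the nondominated set, refill by mutation."""
    pop = list(init_population)
    for _ in range(generations):
        front = pareto_front(pop, fitness)
        children = [mutate(rng.choice(front), rng)
                    for _ in range(len(pop) - len(front))]
        pop = front + children
    return pareto_front(pop, fitness)
```

The key design point mirrored from the abstract is that selection never collapses the objectives into one score, so candidates trading performance for generalizability can survive side by side on the front.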
Explanation-based generalization of partially ordered plans
Most previous work in analytic generalization of plans dealt with totally ordered plans. These methods cannot be directly applied to generalizing partially ordered plans, since they do not capture all interactions among plan operators for all total orders of such plans. We introduce a new method for generalizing partially ordered plans. This method is based on providing explanation-based generalization (EBG) with explanations which systematically capture the interactions among plan operators for all the total orders of a partially ordered plan. The explanations are based on the Modal Truth Criterion, which states the necessary and sufficient conditions for ensuring the truth of a proposition at any point in a plan, for a class of partially ordered plans. The generalizations obtained by this method guarantee successful and interaction-free execution of any total order of the generalized plan. In addition, the systematic derivation of the generalization algorithms from the Modal Truth Criterion obviates the need for carrying out a separate formal proof of correctness of the EBG algorithms.
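The requirement the abstract emphasizes, that a generalization must hold for *every* total order (linear extension) of a partially ordered plan, can be illustrated with a brute-force sketch. The step names and predicate below are hypothetical; the paper's actual algorithms derive conditions from the Modal Truth Criterion rather than enumerating orders.

```python
from itertools import permutations

def linear_extensions(steps, before):
    """Yield every total order of `steps` consistent with the partial order
    `before`, given as a set of (a, b) pairs meaning a must precede b."""
    for order in permutations(steps):
        pos = {s: i for i, s in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in before):
            yield order

def holds_in_all_orders(steps, before, predicate):
    """True iff `predicate` (a check on one total order) holds for every
    linear extension -- the property an explanation must capture before
    the generalized plan can be executed in any total order."""
    return all(predicate(order) for order in linear_extensions(steps, before))
```

Enumerating extensions is exponential in general, which is exactly why deriving the conditions analytically from a truth criterion, as the paper does, matters.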
LT revisited : explanation-based learning and the logic of Principia mathematica
This paper describes an explanation-based learning (EBL) system based on a version of Newell, Shaw and Simon's LOGIC-THEORIST (LT). Results of applying this system to propositional calculus problems from Principia Mathematica are compared with results of applying several other versions of the same performance element to these problems. The primary goal of this study is to characterize and analyze differences between not learning, rote learning (LT's original learning method), and EBL. Another aim is to provide baseline characterizations of the performance of a simple problem solver in the context of the Principia problems, in the hope that these problems can be used as a benchmark for testing improved learning methods, just as problems like chess and the eight puzzle have been used as benchmarks in research on search methods.
Induction over explanations : a method that exploits domain knowledge to learn from examples
We introduce five criteria by which to judge the suitability of a method for solving the problem of learning concepts from examples: correctness (the correct concept should be identified), performance efficiency (the learned definition should be efficient to apply to the performance task), flexibility (the method should be able to learn a variety of different concepts), ease of engineering (the method should be easy to implement in new domains) and learning efficiency (the method should learn from few examples efficiently). We analyze two existing methods for learning from examples, similarity-based learning (SBL) and explanation-based learning (EBL), and find them inappropriate for solving an important sub-problem: learning functional concepts from examples. In SBL, the performance efficiency goal is incompatible with the other goals, because the representation best for performance is ineffective for learning. In EBL, it is difficult to satisfy the flexibility or correctness goals, because the concepts are identified from a single example and an inflexible generalization policy. We introduce a new method, called induction over explanations (IOE), that overcomes these difficulties. The method applies a domain theory to construct explanations from the training examples as in EBL, but forms the concept definition by employing an SBL generalization policy over the explanations. The concept definition is then compiled into a form efficient for the performance task. The method has the advantage that an explicit domain theory can be exploited to aid the learning process, the vocabulary engineering of representations is significantly reduced, and the correct concepts can be learned from few examples.
We illustrate the method in an implemented system, called Wyl2, that learns concepts in a variety of domains including the concepts "skewer" and "knight-fork" in chess.
Key words: Learning from examples, induction over explanations, explanation-based learning, similarity-based learning, inductive learning, evaluation of learning methods
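The two-stage process the abstract describes, explain each example with a domain theory, then generalize over the explanations with an SBL policy, can be sketched in a toy form. The domain theory and feature names below are invented for illustration; the actual Wyl2 system is far richer.

```python
def explain(example, domain_theory):
    """Apply domain-theory rules to derive the features that explain why
    the example is an instance of the concept (the EBL-style step)."""
    return {name for name, rule in domain_theory.items() if rule(example)}

def induce_over_explanations(examples, domain_theory):
    """Generalize by intersecting the explanations of the positive examples
    (an SBL-style generalization policy applied to explanations, not to the
    raw examples), then compile the result into a recognizer."""
    explanations = [explain(e, domain_theory) for e in examples]
    concept = set.intersection(*explanations)
    # The compiled definition: an example matches if its explanation
    # covers every feature common to the training explanations.
    return lambda e: concept <= explain(e, domain_theory)
```

Because the intersection runs over derived features rather than surface attributes, a handful of examples can suffice, which is the abstract's point about learning correct concepts from few examples.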