185 research outputs found

    Learning Representations in Model-Free Hierarchical Reinforcement Learning

    Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences (trajectories) of the agent. When combined with an intrinsic motivation learning mechanism, this method learns both subgoals and skills, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the first screen of the ATARI 2600 Montezuma's Revenge game

    Massively parallel support for a case-based planning system

    Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases

    Planning And Scheduling For Large-scale Distributed Systems

    Many applications require computing resources well beyond those available on any single system. Simulations of atomic and subatomic systems with application to material science, computations related to study of natural sciences, and computer-aided design are examples of applications that can benefit from the resource-rich environment provided by a large collection of autonomous systems interconnected by high-speed networks. To transform such a collection of systems into a user's virtual machine, we have to develop new algorithms for coordination, planning, scheduling, resource discovery, and other functions that can be automated. Then we can develop societal services based upon these algorithms, which hide the complexity of the computing system for users. In this dissertation, we address the problem of planning and scheduling for large-scale distributed systems. We discuss a model of the system, analyze the need for planning, scheduling, and plan switching to cope with a dynamically changing environment, present algorithms for the three functions, report the simulation results to study the performance of the algorithms, and introduce an architecture for an intelligent large-scale distributed system

    An adaptive deductive planning system

    A generic planning system is introduced which allows for custom building of planners able to generate plans for different plan consumers in the context of intelligent support systems. All planners are adapted to the peculiarities of different plan consumers, to their domain knowledge, their typical behavior, their preferences, and their utilization of plans. The necessary knowledge sources of the generic planner are fixed in order to enable it to produce plans of a certain specificity. Its control strategy is described in a formal specification language containing constructs which allow for the configuration of characteristic parts of the control strategy. The customized planners are defined by executable specifications. An application of the approach to deductive planning based on a modal temporal logic is shown. It is demonstrated in an example how needs of different plan consumers in an intelligent help system can be met by a deductive planner

    Multi-agent planning using an abductive event calculus

    Temporal reasoning within distributed Artificial Intelligence Systems is faced with the problem of concurrent streams of action. Well known, logic-based systems using the SITUATION CALCULUS solve the frame problem in a purely linear manner. Recent research, however, has revealed that the EVENT CALCULUS under the abduction principle is capable of nonlinear planning. In this report, we present a planning service module which incorporates this approach into a constraint logic framework and even allows a notion of strong nonlinearity. The work includes the axiomatisation of appropriate versions of the EVENT CALCULUS, the development of a suitably sound and complete proof procedure that supports abduction and the implementation of both of these layers on the constraint platform OZ. We demonstrate prototypically how this module, EVE, can be integrated into an existing multi-agent architecture and evaluate the behaviour of such agents within an application domain, the loading dock scenario

    Storing and Indexing Plan Derivations through Explanation-based Analysis of Retrieval Failures

    Case-Based Planning (CBP) provides a way of scaling up domain-independent planning to solve large problems in complex domains. It replaces the detailed and lengthy search for a solution with the retrieval and adaptation of previous planning experiences. In general, CBP has been demonstrated to improve performance over generative (from-scratch) planning. However, the performance improvements it provides are dependent on adequate judgements as to problem similarity. In particular, although CBP may substantially reduce planning effort overall, it is subject to a mis-retrieval problem. The success of CBP depends on these retrieval errors being relatively rare. This paper describes the design and implementation of a replay framework for the case-based planner DERSNLP+EBL. DERSNLP+EBL extends current CBP methodology by incorporating explanation-based learning techniques that allow it to explain and learn from the retrieval failures it encounters. These techniques are used to refine judgements about case similarity in response to feedback when a wrong decision has been made. The same failure analysis is used in building the case library, through the addition of repairing cases. Large problems are split and stored as single goal subproblems. Multi-goal problems are stored only when these smaller cases fail to be merged into a full solution. An empirical evaluation of this approach demonstrates the advantage of learning from experienced retrieval failure. Comment: See http://www.jair.org/ for any accompanying file

    Decision Representation Language (DRL) and Its Support Environment

    In this report, I describe a language, called Decision Representation Language (DRL), for representing the qualitative aspects of decision making processes such as the alternatives being evaluated, goals to satisfy, and the arguments evaluating the alternatives. Once a decision process is represented in this language, the system can provide a set of services that support people making the decision. These services, together with the interface such as the object and the different presentation formats, form the support environment for using the language. I describe the services that have so far been identified as useful: the management of dependency, plausibility, viewpoints, and precedents. I also discuss how this work on DRL is related to other studies on decision making. MIT Artificial Intelligence Laboratory

    A foundation for machine learning in design

    This paper presents a formalism for considering the issues of learning in design. A foundation for machine learning in design (MLinD) is defined so as to provide answers to basic questions on learning in design, such as, "What types of knowledge can be learnt?", "How does learning occur?", and "When does learning occur?". Five main elements of MLinD are presented as the input knowledge, knowledge transformers, output knowledge, goals/reasons for learning, and learning triggers. Using this foundation, published systems in MLinD were reviewed. The systematic review presents a basis for validating the presented foundation. The paper concludes that there is considerable work to be carried out in order to fully formalize the foundation of MLinD

    Landmark-based approaches for goal recognition as planning

    This article is a revised and extended version of two papers published at AAAI 2017 (Pereira et al., 2017b) and ECAI 2016 (Pereira and Meneguzzi, 2016). We thank the anonymous reviewers that helped improve the research in this article. The authors thank Shirin Sohrabi for discussing the way in which the algorithms of Sohrabi et al. (2016) should be configured, and Yolanda Escudero-Martín for providing code for the approach of E-Martín et al. (2015) and engaging with us. We also thank Miquel Ramírez and Mor Vered for various discussions, and Andre Grahl Pereira for a discussion of properties of our algorithm. Felipe thanks CNPq for partial financial support under its PQ fellowship, grant number 305969/2016-1. Peer reviewed. Postprint

    Cognitive Modeling for Computer Animation: A Comparative Review

    Cognitive modeling is a provocative new paradigm that paves the way towards intelligent graphical characters by providing them with logic and reasoning skills. Cognitively empowered self-animating characters will see in the near future a widespread use in the interactive game, multimedia, virtual reality and production animation industries. This review covers three recently-published papers from the field of cognitive modeling for computer animation. The approaches and techniques employed are very different. The cognition model in the first paper is built on top of Soar, which is intended as a general cognitive architecture for developing systems that exhibit intelligent behaviors. The second paper uses an active plan tree and a plan library to achieve the fast and robust reactivity to the environment changes. The third paper, based on an AI formalism known as the situation calculus, develops a cognitive modeling language called CML and uses it to specify a behavior outline or sketch plan to direct the characters in terms of goals. Instead of presenting each paper in isolation then comparatively analyzing them, we take a top-down approach by first classifying the field into three different categories and then attempting to put each paper into a proper category. Hopefully in this way it can provide a more cohesive, systematic view of cognitive modeling approaches employed in computer animation