
    Efficient, Safe, and Probably Approximately Complete Learning of Action Models

    In this paper we explore the theoretical boundaries of planning in a setting where no model of the agent's actions is given. Instead of an action model, a set of successfully executed plans is given, and the task is to generate a plan that is safe, i.e., guaranteed to achieve the goal without failing. To this end, we show how to learn a conservative model of the world in which actions are guaranteed to be applicable. This conservative model is then given to an off-the-shelf classical planner, resulting in a plan that is guaranteed to achieve the goal. However, this reduction from model-free planning to model-based planning is not complete: in some cases a plan will not be found even when one exists. We analyze the relation between the number of observed plans and the likelihood that our conservative approach will fail to solve a solvable problem. Our analysis shows that the number of trajectories needed scales gracefully.
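    The conservative model described above can essentially be read off the observed transitions. Below is a minimal sketch, assuming a propositional STRIPS-like representation (not the paper's code): preconditions are the intersection of all pre-states in which an action was observed, so any state satisfying them also satisfies the true preconditions, and effects are the literals seen to change.

```python
from typing import Dict, FrozenSet, List, Tuple

State = FrozenSet[str]                       # set of propositions true in a state
Transition = Tuple[State, str, State]        # (pre-state, action name, post-state)

def learn_conservative_model(trajectories: List[List[Transition]]) -> Dict[str, dict]:
    """Learn a conservative STRIPS model from successfully executed plans:
    preconditions = literals true in every observed pre-state of the action,
    effects = literals observed to change when the action was executed."""
    pre: Dict[str, set] = {}
    add: Dict[str, set] = {}
    dels: Dict[str, set] = {}
    for trajectory in trajectories:
        for s, a, s_next in trajectory:
            if a not in pre:
                pre[a] = set(s)              # first observation: assume everything is required
                add[a], dels[a] = set(), set()
            else:
                pre[a] &= s                  # keep only literals true in every observed pre-state
            add[a] |= (s_next - s)           # literals that became true
            dels[a] |= (s - s_next)          # literals that became false
    return {a: {"pre": pre[a], "add": add[a], "del": dels[a]} for a in pre}
```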

    Generalizing Agent Plans and Behaviors with Automated Staged Observation in The Real-Time Strategy Game Starcraft

    In this thesis we investigate the processes involved in learning to play a game. It was inspired by two observations about how human players learn to play. First, learning the domain is intertwined with goal pursuit. Second, games are designed to ramp up in complexity, walking players through a gradual cycle of acquiring, refining, and generalizing knowledge about the domain. This approach does not rely on traces of expert play. We created an integrated planning, learning, and execution system that uses StarCraft as its domain. The planning module creates command/event groupings based on the data received. Observations of unit behavior are collected during execution and returned to the learning module, which tests the generalization hypotheses. The planner uses those test results to generate events that will pursue the goal and facilitate learning the domain. We demonstrate that this approach can efficiently learn the subtle traits of commands through multiple scenarios.
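    The abstract does not detail the hypothesis tests themselves, so the following is only a hypothetical sketch of how collected unit-behavior observations might confirm a "this command generalizes across unit types" hypothesis and report which unit types still need probing; the data format and thresholds are invented for illustration.

```python
from collections import defaultdict

def test_generalization(observations, command, min_trials=3, min_success=0.9):
    """Check whether a command's observed behaviour generalizes across unit types.
    observations: iterable of (unit_type, command, succeeded) triples gathered
    during execution.  Returns (hypothesis_holds_so_far, unit_types_needing_more_trials),
    so the planner knows which units to exercise next."""
    trials = defaultdict(lambda: [0, 0])          # unit_type -> [successes, attempts]
    for unit_type, cmd, succeeded in observations:
        if cmd != command:
            continue
        trials[unit_type][1] += 1
        if succeeded:
            trials[unit_type][0] += 1
    untested = [t for t, (_, n) in trials.items() if n < min_trials]
    holds = all(s / n >= min_success for s, n in trials.values() if n >= min_trials)
    return holds, untested
```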

    On-line learning of macro planning operators using probabilistic estimations of cause-effects

    In this work we propose an on-line learning method for learning action rules for planning. The system uses a probabilistic approach to constructive induction that combines a beam search with an example-based search over candidate rules to find those that most concisely describe the world dynamics. The approach permits rapid integration of the knowledge acquired from experience. Exploration of the world dynamics is guided by the planner and, if the planner fails because of incomplete knowledge, by a teacher through action instructions.
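    As an illustration of the general idea only (not the paper's method), the sketch below runs a beam search over candidate rule preconditions, scoring each by how reliably, and how concisely, it predicts the observed effects; examples are (state, changed-literals) pairs and all parameters are assumptions.

```python
from collections import Counter

def rule_score(precondition, examples, penalty=0.1):
    """Probabilistic score of a candidate rule: empirical frequency of its most
    common effect over the examples it covers, minus a conciseness penalty."""
    covered = [changed for s, changed in examples if precondition <= s]
    if not covered:
        return 0.0
    _effect, hits = Counter(covered).most_common(1)[0]
    return hits / len(covered) - penalty * len(precondition)

def beam_search_rules(examples, literals, beam_width=5, max_size=3):
    """Beam search over candidate preconditions, guided by example-based scores.
    examples: list of (frozenset state literals, frozenset changed literals)."""
    beam = sorted((frozenset([l]) for l in literals),
                  key=lambda p: rule_score(p, examples), reverse=True)[:beam_width]
    for _ in range(max_size - 1):
        candidates = {p | {l} for p in beam for l in literals if l not in p}
        beam = sorted(beam + list(candidates),
                      key=lambda p: rule_score(p, examples), reverse=True)[:beam_width]
    return beam
```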

    INCREMENTAL LEARNING OF PROCEDURAL PLANNING KNOWLEDGE IN CHALLENGING ENVIRONMENTS

    Peer reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/75646/1/j.1467-8640.2005.00280.x.pd

    Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal

    In imitation learning for planning, parameters of heuristic functions are optimized against a set of solved problem instances. This work revisits the necessary and sufficient conditions for strictly optimally efficient heuristics for forward search algorithms, mainly A* and greedy best-first search, which expand only states on the returned optimal path. It then proposes a family of loss functions based on ranking, tailored for a given variant of the forward search algorithm. Furthermore, from a learning-theory point of view, it discusses why optimizing the cost-to-goal h* is unnecessarily difficult. The experimental comparison on a diverse set of problems unequivocally supports the derived theory.
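    A minimal sketch of one ranking-style objective in this spirit (not the paper's exact loss family): for greedy best-first search to expand only the states on the returned plan, every on-path state must score better than the off-path states it competed with, which a pairwise hinge term encourages. Here h is any parameterized heuristic whose parameters would be tuned to drive the loss toward zero.

```python
def ranking_loss(h, on_path_states, off_path_states, margin=1.0):
    """Pairwise hinge ranking loss: each state on the returned plan should be
    scored strictly lower (better) than the off-path states that competed with
    it in the open list.  `h` is a callable mapping a state to a score."""
    loss = 0.0
    for s_good in on_path_states:
        for s_bad in off_path_states:
            loss += max(0.0, margin + h(s_good) - h(s_bad))
    return loss / max(1, len(on_path_states) * len(off_path_states))
```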

    Webots inteligentes autónomos

    An autonomous intelligent system (SIA) is defined as such if it meets the following conditions: (i) it transforms perceptions of its environment into situations (sets of essential data about the state of the environment), (ii) it chooses its own sub-goals guided by its design objective, (iii) it builds its own plans to achieve its goals, based on its own experience (perceptions stored in memory), (iv) it executes the plan it has built, and (v) it learns from its interactions with the environment. In other words, a SIA is a system that perceives its environment, plans its actions, executes those plans, and learns from previous experience. In turn, a webot is currently defined as a virtual robot (a software artifact) that "inhabits" the web and performs there certain tasks for which it has been programmed. This project seeks to merge the concepts of SIA and webot, laying the conceptual foundations for defining an autonomous intelligent webot and exploring its potential applications. Track: Innovación en Sistemas de Software. Red de Universidades con Carreras en Informática (RedUNCI).
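    The five conditions above amount to an agent interface; the skeleton below is only an illustrative rendering of it, with method names invented for the sketch rather than taken from the project.

```python
from abc import ABC, abstractmethod

class AutonomousIntelligentSystem(ABC):
    """Skeleton of the five SIA capabilities; method names are illustrative."""

    @abstractmethod
    def perceive(self, raw_input):                 # (i) turn perceptions into a situation
        ...

    @abstractmethod
    def choose_subgoal(self, situation):           # (ii) pick sub-goals guided by the design objective
        ...

    @abstractmethod
    def build_plan(self, situation, subgoal):      # (iii) plan from stored experience
        ...

    @abstractmethod
    def execute(self, plan):                       # (iv) carry out the plan, returning outcomes
        ...

    @abstractmethod
    def learn(self, situation, plan, outcome):     # (v) update experience from the interaction
        ...

    def step(self, raw_input):
        situation = self.perceive(raw_input)
        subgoal = self.choose_subgoal(situation)
        plan = self.build_plan(situation, subgoal)
        outcome = self.execute(plan)
        self.learn(situation, plan, outcome)
        return outcome
```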

    Integrating Planning, Execution, and Learning to Improve Plan Execution

    Algorithms for planning under uncertainty require accurate action models that explicitly capture the uncertainty of the environment. Unfortunately, obtaining these models is usually complex. In environments with uncertainty, actions may produce countless outcomes, and hence specifying them and their probabilities is a hard task. As a consequence, when implementing agents with planning capabilities, practitioners frequently opt for architectures that interleave classical planning and execution monitoring following a replan-on-failure paradigm. Though this approach is more practical, it may produce fragile plans that need continuous replanning episodes or, even worse, that result in execution dead-ends. In this paper, we propose a new architecture to alleviate these shortcomings. The architecture is based on the integration of a relational learning component with the traditional planning and execution-monitoring components. The new component allows the architecture to learn probabilistic rules of the success of actions from the execution of plans and to automatically upgrade the planning model with these rules. The upgraded models can be used by any classical planner that handles metric functions or, alternatively, by any probabilistic planner. The architecture is designed to integrate off-the-shelf, interchangeable planning and learning components, so it can profit from the latest advances in both fields without modifying the architecture.
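    The paper's exact model-upgrade step is not given in this abstract, so the snippet below is only an illustrative sketch of one common encoding: estimate each action's success probability from execution traces and expose it to a metric-aware classical planner as an additive -log(p) cost, so that minimizing total cost favors reliable plans. The smoothing prior and names are assumptions.

```python
import math
from collections import defaultdict

def estimate_success_rates(execution_log, prior_success=1, prior_failure=1):
    """Estimate per-action success probabilities from execution traces using a
    Laplace-smoothed ratio.  execution_log is a list of (action, succeeded) pairs."""
    counts = defaultdict(lambda: [prior_success, prior_failure])
    for action, succeeded in execution_log:
        counts[action][0 if succeeded else 1] += 1
    return {a: s / (s + f) for a, (s, f) in counts.items()}

def as_metric_costs(success_rates):
    """Turn success probabilities into additive costs (-log p) so a classical
    planner that minimizes a metric function prefers reliable actions."""
    return {a: -math.log(max(p, 1e-6)) for a, p in success_rates.items()}
```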

    PLTOOL: a knowledge engineering tool for planning and learning

    Artificial intelligence (AI) planning solves the problem of generating a correct and efficient ordered set of instantiated activities, from a knowledge base of generic actions, which when executed will transform some initial state into some desirable end state. There is a long tradition of work in AI on developing planners that make use of heuristics shown to improve their performance in many real-world and artificial domains. The developers of planners have chosen between two extremes when defining those heuristics. Domain-independent planners use domain-independent heuristics, which exploit information only from the ‘syntactic’ structure of the problem space and of the search tree; therefore, they do not need any ‘semantic’ information from a given domain in order to guide the search. From a knowledge engineering (KE) perspective, planners that use this type of heuristic have the advantage that users of the technology need only focus on defining the domain theory, not on defining how to make the planner efficient (how to obtain ‘good’ solutions with minimal computational resources). In contrast, domain-dependent planners require users to manually represent knowledge not only about the domain theory, but also about how to make the planner efficient. This approach has the advantage of using either better domain-theory formulations or domain knowledge for defining the heuristics, thus potentially making planners more efficient. However, the efficiency of these domain-dependent planners relies strongly on the KE and planning expertise of the user. When the user is an expert in these two types of knowledge, domain-dependent planners clearly outperform domain-independent planners in terms of the number of solved problems and the quality of solutions. Machine-learning (ML) techniques applied to planning have focused on providing middle-ground solutions between these two extremes. Here, the user first defines a domain theory and then executes ML techniques that automatically modify or generate new knowledge with respect to both the domain theory and the heuristics. In this paper, we present our work on building a tool, PLTOOL (planning and learning tool), to help users interact with a set of ML techniques and planners. The goal is to provide a KE framework for mixed-initiative generation of efficient and good planning knowledge. This work has been partially supported by the Spanish MCyT project TIC2002-04146-C05-05, MEC project TIN2005-08945-C06-05 and regional CAM-UC3M project UC3M-INF-05-016.
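    PLTOOL's interfaces are not described in this abstract, so the outline below is purely hypothetical: it sketches the middle-ground workflow of defining a domain theory, solving training problems, and letting interchangeable ML components refine the domain and derive control knowledge. planner.solve and learner.learn are placeholder methods for off-the-shelf components, not PLTOOL's API.

```python
def knowledge_engineering_cycle(domain, training_problems, planner, learners):
    """Sketch of the mixed-initiative cycle described above: the user supplies
    `domain` (a hand-written domain theory) and `training_problems`; the planner
    solves the training problems, and each ML component uses those solutions to
    refine the domain and/or produce control knowledge for future planning."""
    training_plans = [planner.solve(domain, p) for p in training_problems]
    control_knowledge = []
    for learner in learners:
        domain, knowledge = learner.learn(domain, training_plans)
        control_knowledge.append(knowledge)
    return domain, control_knowledge
```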