    Knowledge representation issues in control knowledge learning

    Seventeenth International Conference on Machine Learning, Stanford, CA, USA, 29 June-2 July, 2000. Knowledge representation is a key issue for any machine learning task. There have already been many comparative studies of knowledge representation with respect to machine learning in classification tasks. However, apart from some work on reinforcement learning techniques in relation to state representation, very few studies have concentrated on the effect of knowledge representation for machine learning applied to problem solving and, more specifically, to planning. In this paper, we present an experimental comparative study of the effect of changing the input representation of planning domain knowledge on control knowledge learning. We show results in two classical domains using three different machine learning systems that have previously shown their effectiveness at learning planning control knowledge: a pure EBL mechanism, a combination of EBL and induction (HAMLET), and a Genetic Programming based system (EVOCK).

    Learning to solve planning problems efficiently by means of genetic programming

    Declarative problem solving, such as planning, poses interesting challenges for Genetic Programming (GP). There have been recent attempts to apply GP to planning that fit two approaches: (a) using GP to search in plan space or (b) to evolve a planner. In this article, we propose to evolve only the heuristics to make a particular planner more efficient. This approach is more feasible than (b) because it does not have to build a planner from scratch but can take advantage of already existing planning systems. It is also more efficient than (a) because once the heuristics have been evolved, they can be used to solve a whole class of different planning problems in a planning domain, instead of running GP for every new planning problem. Empirical results show that our approach (EVOCK) is able to evolve heuristics in two planning domains (the blocks world and the logistics domain) that improve PRODIGY4.0 performance. Additionally, we experiment with a new genetic operator - Instance-Based Crossover - that is able to use traces of the base planner as raw genetic material to be injected into the evolving population.

    Two steps reinforcement learning

    When applying reinforcement learning in domains with very large or continuous state spaces, the experience obtained by the learning agent in its interaction with the environment must be generalized. Generalization methods are usually based on approximating the value functions used to compute the action policy, and this is tackled in two different ways: on the one hand, by approximating the value functions with a supervised learning method; on the other hand, by discretizing the environment so that a tabular representation of the value functions can be used. In this work, we propose an algorithm that combines both approaches to obtain the benefits of both mechanisms, allowing a higher performance. The approach is based on two learning phases. In the first phase, a learner is used as a supervised function approximator, but with a machine learning technique that also outputs a state space discretization of the environment, as nearest prototype classifiers or decision trees do. In the second phase, the space discretization computed in the first phase is used to obtain a tabular representation of the value function computed in that phase, which allows the value function approximation to be tuned. Experiments in different domains show that executing both learning phases improves on the results obtained by executing only the first one. The results take into account the resources used and the performance of the learned behavior. This research was partially conducted while the first author was visiting Carnegie Mellon University from the Universidad Carlos III de Madrid, supported by a generous grant from the Spanish Ministry of Education and Fulbright. Both authors were partially sponsored by the Spanish MEC project TIN2005-08945-C06-05 and regional CAM-UC3M project number CCG06-UC3M/TIC-0831.
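
    The two phases described above can be illustrated with a minimal sketch for a one-dimensional state space. It assumes that the first phase has already produced interval thresholds (for instance, the split points of a decision tree) together with approximated Q-values for sampled state-action pairs. The names `cell`, `init_table` and `refine`, and the learning parameters, are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

def cell(thresholds, s):
    """Map a continuous 1-D state to the index of its interval."""
    return bisect_right(thresholds, s)

def init_table(thresholds, samples, n_actions):
    """Build a table from phase-1 output by averaging approximated Q-values per cell.

    `samples` holds (state, action, approximated_q) triples; this input format
    is an assumption made for the sketch.
    """
    n_cells = len(thresholds) + 1
    sums = [[0.0] * n_actions for _ in range(n_cells)]
    counts = [[0] * n_actions for _ in range(n_cells)]
    for s, a, q in samples:
        c = cell(thresholds, s)
        sums[c][a] += q
        counts[c][a] += 1
    return [[sums[c][a] / counts[c][a] if counts[c][a] else 0.0
             for a in range(n_actions)] for c in range(n_cells)]

def refine(table, thresholds, transitions, alpha=0.5, gamma=0.9):
    """Phase 2: tabular Q-learning updates over the fixed discretization."""
    for s, a, r, s_next in transitions:
        c, c_next = cell(thresholds, s), cell(thresholds, s_next)
        target = r + gamma * max(table[c_next])
        table[c][a] += alpha * (target - table[c][a])
    return table

# Usage: one threshold splits the state space into two cells.
thresholds = [0.5]
table = init_table(thresholds, [(0.1, 0, 1.0), (0.2, 0, 3.0), (0.9, 1, 4.0)], n_actions=2)
table = refine(table, thresholds, [(0.1, 0, 0.0, 0.9)])
```

    The table starts from the phase-1 estimates rather than from zeros, which is what lets the second phase act as a tuning step instead of learning from scratch.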

    VQQL. Applying vector quantization to reinforcement learning

    Proceeding of: RoboCup-99: Robot Soccer World Cup III, July 27 to August 6, 1999, Stockholm, Sweden. Reinforcement learning has proven to be a set of successful techniques for finding optimal policies in uncertain and/or dynamic domains, such as the RoboCup. One of the problems with using such techniques appears with large state and action spaces, as is the case with the input information coming from the Robosoccer simulator. In this paper, we describe a new mechanism for solving the state generalization problem in reinforcement learning algorithms. This clustering mechanism is based on the vector quantization technique for analog-to-digital signal conversion and compression, and on the Generalized Lloyd Algorithm for the design of vector quantizers. Furthermore, we present the VQQL model, which integrates Q-Learning as the reinforcement learning technique and vector quantization as the state generalization technique. We show some results of applying this model to learning the interception skill for Robosoccer agents.
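
    The combination described above can be sketched as follows: a Lloyd-style iteration designs a codebook from sampled states, and Q-Learning then runs on codeword indices instead of raw states. This is only an illustration under stated assumptions (squared Euclidean distortion, random initial codewords, fixed learning parameters); the function names `nearest`, `lloyd` and `q_update` are hypothetical and do not come from the paper.

```python
import random

def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)))

def lloyd(samples, k, iters=20, seed=0):
    """Lloyd iteration: assign samples to their nearest codeword,
    then move each codeword to the centroid of its cell."""
    rng = random.Random(seed)
    codebook = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for s in samples:
            cells[nearest(codebook, s)].append(s)
        for i, members in enumerate(cells):
            if members:  # an empty cell keeps its previous codeword
                codebook[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return codebook

def q_update(q, codebook, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update on quantized states."""
    i, j = nearest(codebook, s), nearest(codebook, s_next)
    q[i][a] += alpha * (r + gamma * max(q[j]) - q[i][a])
    return q[i][a]

# Usage: quantize four 2-D states into two codewords, then update the Q-table.
samples = [(0, 0), (0, 1), (10, 10), (10, 11)]
codebook = lloyd(samples, k=2)
q = [[0.0, 0.0] for _ in codebook]  # one row per codeword, two actions
q_update(q, codebook, s=(0, 0), a=1, r=1.0, s_next=(10, 10))
```

    The size of the Q-table depends only on the number of codewords, not on the dimensionality or resolution of the raw state vector.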

    Distributed reinforcement learning in multi-agent decision systems

    Proceeding of: 6th Ibero-American Conference on AI (IBERAMIA '98), Lisbon, Portugal, October 5–9, 1998. Decision problems can usually be solved using systems that implement different paradigms. These systems may be integrated into a single distributed system, with the expectation of obtaining a group performance that is more satisfactory than the individual performances. Such a distributed system is what we call a Multi Agent Decision System (MADES), a special kind of Multi Agent System that integrates several heterogeneous autonomous decision systems (agents). A MADES must produce a single solution proposal for the problem instance it faces, despite the fact that its decision making is distributed and every agent produces solution proposals according to its local view and its idiosyncrasy. We present a distributed reinforcement algorithm for learning how to combine, in a distributed way, the decisions the agents make into a single group decision (solution proposal).

    Método incremental de compilación de conocimiento inteligente: MINNIE (Incremental method for intelligent knowledge compilation: MINNIE)

    This work puts forward the thesis that human beings learn by studying their correct and incorrect decisions, discarding those situations in which no alternative is available. It also specifies the construction of an algorithm that models human behavior in domains in which decisions must be made. The results of the algorithm show not only that the behavior of the system is analogous to human behavior in those domains, but also that it is flexible enough to be integrated with other learning models.

    An integrated approach of learning, planning, and execution

    Agents (hardware or software) that act autonomously in an environment have to be able to integrate three basic behaviors: planning, execution, and learning. This integration is mandatory when the agent has no knowledge about how its actions can affect the environment or how the environment reacts to its actions, or when the agent does not receive the goals it must achieve as an explicit input. Without an a priori theory, autonomous agents should be able to self-propose goals, set up plans for achieving the goals according to previously learned models of the agent and the environment, and learn those models from past experiences of successful and failed executions of plans. Planning involves selecting a goal to reach and computing a set of actions that will allow the autonomous agent to achieve the goal. Execution deals with the interaction with the environment by application of planned actions, observation of resulting perceptions, and control of successful achievement of the goals. Learning is needed to predict the reactions of the environment to the agent's actions, thus guiding the agent to achieve its goals more efficiently. In this context, most of the learning systems applied to problem solving have been used to learn control knowledge for guiding the search for a plan, but few systems have focused on the acquisition of planning operator descriptions. As an example, one of the most widely used techniques for the integration of (a form of) planning, execution, and learning is currently reinforcement learning. However, reinforcement learning systems usually do not consider the representation of action descriptions, so they cannot reason in terms of goals and ways of achieving those goals. In this paper, we present an integrated architecture, LOPE, that learns operator definitions, plans using those operators, and executes the plans in order to modify the acquired operators. The resulting system is domain-independent, and we have performed experiments in a robotic framework. The results clearly show that the integrated planning, learning, and executing system outperforms the basic planner in that domain.

    Efficient approaches for multi-agent planning

    Multi-agent planning (MAP) deals with planning systems that reason on long-term goals by multiple collaborative agents which want to maintain privacy on their knowledge. Recently, new MAP techniques have been devised to provide efficient solutions. Most approaches expand distributed searches using modified planners, where agents exchange public information. They present two drawbacks: they are planner-dependent, and they incur a high communication cost. Instead, we present two algorithms whose search processes are monolithic (no communication during individual planning) and in which MAP tasks are compiled such that they are planner-independent (no programming effort is needed when replacing the base planner). Our two approaches first assign each public goal to a subset of agents. In the first, distributed approach, agents iteratively solve problems by receiving plans, goals and states from previous agents. After generating new plans by reusing previous agents' plans, they share the new plans and some obfuscated private information with the following agents. In the second, centralized approach, agents generate an obfuscated version of their problems to protect privacy and then submit it to an agent that performs centralized planning. The resulting approaches are efficient, outperforming other state-of-the-art approaches. This work has been partially supported by MICINN projects TIN2008-06701-C03-03, TIN2011-27652-C03-02 and TIN2014-55637-C2-1-R.

    Transferring learned control-knowledge between planners

    Like any other problem solving task that employs search, AI Planning needs heuristics to efficiently guide the problem-space exploration. Machine learning (ML) provides several techniques for automatically acquiring those heuristics. Usually, a planner solves a problem, and an ML technique generates knowledge from the search episode in terms of complete plans (macro-operators or cases), or heuristics (also named control knowledge in planning). In this paper, we present a novel way of generating planning heuristics: we learn heuristics in one planner and transfer them to another planner. This approach is based on the fact that different planners employ different search biases. We want to extract knowledge from the search performed by one planner and use the learned knowledge on another planner that uses a different search bias. The goal is to improve the efficiency of the second planner by capturing regularities of the domain that it would not capture by itself due to its bias. We employ a deductive learning method (EBL) that is able to automatically acquire control knowledge by generating bounded explanations of the problem-solving episodes in a Graphplan-based planner. Then, we transform the learned knowledge so that it can be used by a bidirectional planner. 20th International Joint Conference on Artificial Intelligence (IJCAI-07), Hyderabad, India, 9-12 Jan 2007.

    A context vector model for information retrieval

    In the vector space model for information retrieval, term vectors are pair-wise orthogonal, that is, terms are assumed to be independent. It is well known that this assumption is too restrictive. In this article, we present our work on an indexing and retrieval method that, based on the vector space model, incorporates term dependencies and thus obtains semantically richer representations of documents. First, we generate term context vectors based on the co-occurrence of terms in the same documents. These vectors are used to calculate context vectors for documents. We present different techniques for estimating the dependencies among terms. We also define term weights that can be employed in the model. Experimental results on four text collections (MED, CRANFIELD, CISI, and CACM) show that incorporating term dependencies in the retrieval process performs statistically significantly better than the classical vector space model with IDF weights. We also show that the degree of semantic matching versus direct word matching that performs best varies across the four collections. We conclude that the model performs well for certain types of queries and, generally, for information tasks with high recall requirements. Therefore, we propose the use of the context vector model in combination with other, direct word-matching methods.
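
    The idea of term and document context vectors can be illustrated with a minimal sketch. It assumes raw document-level co-occurrence counts as the dependency estimate and a plain sum of term context vectors per document; the article describes several estimation techniques and term weights that are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def term_context_vectors(docs):
    """For each term, count how many documents it shares with every term
    (including itself). Raw counts are an assumption of this sketch."""
    ctx = {}
    for doc in docs:
        terms = set(doc)
        for t in terms:
            ctx.setdefault(t, Counter())[t] += 1
        for t, u in combinations(sorted(terms), 2):
            ctx[t][u] += 1
            ctx[u][t] += 1
    return ctx

def doc_context_vector(doc, ctx):
    """Sum the context vectors of the terms occurring in a document or query."""
    vec = Counter()
    for t in doc:
        vec.update(ctx.get(t, {}))
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Usage: "car" and "automobile" never co-occur, but both co-occur with "engine".
docs = [["car", "engine"], ["automobile", "engine"], ["banana", "fruit"]]
ctx = term_context_vectors(docs)
query = doc_context_vector(["car"], ctx)
related = cosine(query, doc_context_vector(["automobile"], ctx))
unrelated = cosine(query, doc_context_vector(["banana"], ctx))
```

    In this toy collection, direct word matching would give the query "car" no overlap with a document containing only "automobile", whereas the context vectors share the "engine" dimension, so `related` is greater than `unrelated`.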