355 research outputs found

    Aprendizaje por Refuerzo: Fundamentos Teóricos y Aplicación al cubo de Rubik

    Final-year project for the Double Degree in Computer Engineering and Mathematics, Facultad de Informática UCM, Departamento de Sistemas Informáticos y Computación, academic year 2021/2022. The techniques employed and developed in the area of reinforcement learning have been evolving since their origins at the end of the 20th century. Thanks to the various advances in this field, it has been possible to solve increasingly complicated problems. The influence of other areas of machine learning and artificial intelligence has enabled applications of reinforcement learning that initially posed great challenges due to their computational requirements. One such problem is the one we will discuss in this work, which is characterized by a large state space and a single final state. First, a theoretical introduction to the area of reinforcement learning will be given, focusing on those aspects most relevant to the solution of our problem. Then, a theoretical description will be presented of the DeepCubeA algorithm, which was designed to solve the 3x3x3 Rubik's Cube, a puzzle with a large state space and only one final state. Finally, we will design and implement a version of the DeepCubeA algorithm, adding some relevant aspects of its previous version (DeepCube), and we will study its behavior with the 2x2x2 and 3x3x3 Rubik's Cubes and the Towers of Hanoi.

    Planning And Scheduling For Large-scale Distributed Systems

    Many applications require computing resources well beyond those available on any single system. Simulations of atomic and subatomic systems with application to material science, computations related to the study of natural sciences, and computer-aided design are examples of applications that can benefit from the resource-rich environment provided by a large collection of autonomous systems interconnected by high-speed networks. To transform such a collection of systems into a user's virtual machine, we have to develop new algorithms for coordination, planning, scheduling, resource discovery, and other functions that can be automated. Then we can develop societal services based upon these algorithms, which hide the complexity of the computing system from users. In this dissertation, we address the problem of planning and scheduling for large-scale distributed systems. We discuss a model of the system, analyze the need for planning, scheduling, and plan switching to cope with a dynamically changing environment, present algorithms for the three functions, report simulation results on the performance of the algorithms, and introduce an architecture for an intelligent large-scale distributed system.

    Combining high-level causal reasoning with low-level geometric reasoning and motion planning for robotic manipulation

    We present a modular planning framework for manipulation tasks that combines high-level representation and causality-based reasoning with low-level geometric reasoning and motion planning. This framework features bilateral interaction between task and motion planning, and embeds geometric reasoning in causal reasoning. The causal reasoner guides the motion planner by finding an optimal task-plan; if there is no feasible kinematic solution for that task-plan, then the motion planner guides the causal reasoner by modifying the planning problem with new temporal constraints. The geometric reasoner guides the causal reasoner to find feasible kinematic solutions by means of external predicates/functions. We show the applicability of this method on two sample problems: extended towers of Hanoi and multiple robot manipulation inside a maze. We focus on two main problems in this planning framework: i) a systemic analysis of various levels of integration between high-level representation and causality-based reasoning with low-level geometric reasoning and motion planning and ii) generalization of the planning framework to continuous domains. For the former, we consider various levels of integration in the two domains mentioned above, to check which level of integration achieves better performance. For the latter, we abstract configurations at the representation level by continuous regions instead of discrete positions, and introduce an incremental sampling-based method coupled to a goal region-based probabilistic path planner for extracting specific goal configurations required for generating valid plans for execution. This way, we tightly integrate high-level reasoning and region-based motion planning and provide a general framework for addressing a wide spectrum of manipulation problems.

    Symbolic Reachability Analysis of B through ProB and LTSmin

    We present a symbolic reachability analysis approach for B that can provide a significant speedup over traditional explicit state model checking. The symbolic analysis is implemented by linking ProB to LTSmin, a high-performance language independent model checker. The link is achieved via LTSmin's PINS interface, allowing ProB to benefit from LTSmin's analysis algorithms, while only writing a few hundred lines of glue-code, along with a bridge between ProB and C using ZeroMQ. ProB supports model checking of several formal specification languages such as B, Event-B, Z and TLA. Our experiments are based on a wide variety of B-Method and Event-B models to demonstrate the efficiency of the new link. Among the tested categories are state space generation and deadlock detection; but action detection and invariant checking are also feasible in principle. In many cases we observe speedups of several orders of magnitude. We also compare the results with other approaches for improving model checking, such as partial order reduction or symmetry reduction. We thus provide a new scalable, symbolic analysis algorithm for the B-Method and Event-B, along with a platform to integrate other model checking improvements via LTSmin in the future.
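
    For readers unfamiliar with the baseline that the abstract compares against, explicit-state reachability with deadlock detection can be sketched as a breadth-first search over states. This is only an illustrative sketch: the `explore` function and the toy transition system are hypothetical and are not taken from ProB or LTSmin, and the symbolic approach itself (which operates on sets of states rather than individual ones) is not shown.

```python
from collections import deque


def explore(initial, successors):
    """Explicit-state BFS: return (reachable states, deadlocked states).

    A state is treated as deadlocked when it has no successors.
    """
    seen = {initial}
    deadlocks = set()
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        next_states = list(successors(state))
        if not next_states:
            deadlocks.add(state)
        for nxt in next_states:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, deadlocks


# Hypothetical toy model: a counter that may step by 1 or 2
# and has no enabled transition once it reaches 5 or more.
def toy_successors(n):
    return [n + 1, n + 2] if n < 5 else []


reachable, deadlocks = explore(0, toy_successors)
print(sorted(reachable), sorted(deadlocks))
```

    Every state is stored and visited individually here, which is the cost that symbolic methods aim to avoid on large models.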

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
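
    As a rough illustration of how tree search and sampling statistics meet in MCTS, many variants choose which child to descend into with the UCB1 rule, as in the UCT algorithm. The sketch below covers only that selection step. The function names, the `(name, total_reward, visits)` tuple layout, and the sample statistics are assumptions made for illustration, not the survey's notation.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus.

    Unvisited children score infinity so that each is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, parent_visits):
    """Return the name of the child with the highest UCB1 score.

    `children` is a list of (name, total_reward, visits) tuples.
    """
    best = max(children, key=lambda ch: ucb1(ch[1], ch[2], parent_visits))
    return best[0]


# Hypothetical statistics gathered from earlier random playouts.
children = [("a", 6.0, 10), ("b", 3.0, 4), ("c", 0.0, 0)]
print(select_child(children, parent_visits=14))
```

    The constant `c` trades exploitation against exploration; the value used here is the one that reproduces the standard UCB1 formula, and other values are common in practice.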

    Proceedings of the Workshop on Change of Representation and Problem Reformulation

    The proceedings of the third Workshop on Change of Representation and Problem Reformulation are presented. In contrast to the first two workshops, this workshop was focused on analytic or knowledge-based approaches, as opposed to statistical or empirical approaches called 'constructive induction'. The organizing committee believes that there is a potential for combining analytic and inductive approaches at a future date. However, it became apparent at the previous two workshops that the communities pursuing these different approaches are currently interested in largely non-overlapping issues. The constructive induction community has been holding its own workshops, principally in conjunction with the machine learning conference. While this workshop is more focused on analytic approaches, the organizing committee has made an effort to include more application domains. We have greatly expanded from the origins in the machine learning community. Participants in this workshop come from the full spectrum of AI application domains including planning, qualitative physics, software engineering, knowledge representation, and machine learning.