19 research outputs found

    Solving Multi-agent planning tasks by using automated planning

    This dissertation develops a control system for an autonomous multi-agent system that uses Automated Planning and Computer Vision to solve warehouse organization tasks. The work presents a heterogeneous multi-agent system in which each robot has different capabilities, so the robots must collaborate to complete the proposed task. On one hand, coordinator robots use Computer Vision to collect information about the boxes and determine each box's destination storage position. On the other hand, cargo robots push the boxes more easily than the coordinators but have no cameras to identify them. The two kinds of robot must therefore collaborate to solve the warehouse problem, because of the different sensors and actuators available to each. The system was developed in Java. It uses JNAOqi to communicate with the NAO robots (coordinators) and rosjava to communicate with the P3DX robots (cargo). The control modules are deployed in the PELEA architecture. The empirical evaluation was conducted in a real environment using two robots: one NAO robot and one P3DX robot. Ingeniería Informátic

    Self-Motivated Composition of Strategic Action Policies

    In the last 50 years computers have made dramatic progress in their capabilities, but at the same time their failings have demonstrated that we, as designers, do not yet understand the nature of intelligence. Chess playing, for example, was long offered up as an example of the unassailability of the human mind to Artificial Intelligence, but now a chess engine on a smartphone can beat a grandmaster. Yet, at the same time, computers struggle to beat amateur players in simpler games, such as Stratego, where sheer processing power cannot substitute for a lack of deeper understanding. The task of developing that deeper understanding is overwhelming, and has previously been underestimated. There are many threads and all must be investigated. This dissertation explores one of those threads, namely asking the question “How might an artificial agent decide on a sensible course of action, without being told what to do?”. To this end, this research builds upon empowerment, a universal utility which provides an entirely general method for allowing an agent to measure the preferability of one state over another. Empowerment requires no explicit goals, and instead favours states that maximise an agent’s control over its environment. Several extensions to the empowerment framework are proposed, which drastically increase the array of scenarios to which it can be applied, and allow it to evaluate actions in addition to states. These extensions are motivated by concepts such as bounded rationality, sub-goals, and anticipated future utility. In addition, the novel concept of strategic affinity is proposed as a general method for measuring the strategic similarity between two (or more) potential sequences of actions. It does this in a general fashion, by examining how similar the distribution of future possible states would be in the case of enacting either sequence. This allows an agent to group action sequences, even in an unknown task space, into ‘strategies’. 
Strategic affinity is combined with the empowerment extensions to form soft-horizon empowerment, which is capable of composing action policies in a variety of unknown scenarios. A Pac-Man-inspired prey game and the Gambler’s Problem are used to demonstrate this self-motivated action selection, and a Sokoban-inspired box-pushing scenario is used to highlight the capability to pick strategically diverse actions. The culmination of this is that soft-horizon empowerment demonstrates a variety of ‘intuitive’ behaviours, which are not dissimilar to what we might expect a human to try. This line of thinking demonstrates compelling results, and it is suggested there are a couple of avenues for immediate further research. One of the most promising of these would be applying the self-motivated methodology and strategic affinity method to a wider range of scenarios, with a view to developing improved heuristic approximations that generate similar results. A goal of replicating similar results, whilst reducing the computational overhead, could help drive an improved understanding of how we may get closer to replicating a human-like approach.
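
The idea that empowerment favours states giving the agent more control can be illustrated with a minimal sketch. It relies on a standard simplification: in a deterministic world, n-step empowerment reduces to the base-2 logarithm of the number of distinct states reachable by n-step action sequences. The 5x5 grid, the action set, and all function names below are assumptions made for illustration; this is not the dissertation's soft-horizon empowerment or its test scenarios.

```python
import math

# Hypothetical deterministic grid world (illustrative only).
SIZE = 5
ACTIONS = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0), "stay": (0, 0)}


def step(state, action):
    """Apply one action; moving into a wall leaves the agent where it is."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < SIZE and 0 <= ny < SIZE:
        return (nx, ny)
    return state


def reachable(state, horizon):
    """Set of states the agent can occupy after exactly `horizon` actions."""
    frontier = {state}
    for _ in range(horizon):
        frontier = {step(s, a) for s in frontier for a in ACTIONS}
    return frontier


def empowerment(state, horizon):
    """n-step empowerment in bits, valid for deterministic dynamics only."""
    return math.log2(len(reachable(state, horizon)))


if __name__ == "__main__":
    for cell in [(2, 2), (0, 0)]:
        print(cell, round(empowerment(cell, 2), 3))
```

Under these assumptions the centre cell scores higher than the corner cell, because walls cut off some of the corner's action sequences. An agent that greedily moves toward higher-empowerment cells therefore drifts away from walls without being given any explicit goal.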

    Ghost In the Grid: Challenges for Reinforcement Learning in Grid World Environments

    The current state-of-the-art deep reinforcement learning techniques require agents to gather large amounts of diverse experiences to train effective and general models. In addition, there are also many other factors that have to be taken into consideration: for example, how the agent interacts with its environment; parameter optimization techniques; environment exploration methods; and finally the diversity of environments that is provided to an agent. In this thesis, we investigate several of these factors. First, we introduce Griddly, a high-performance grid-world game engine that provides a state-of-the-art combination of high performance and flexibility. We demonstrate that grid worlds provide a principled and expressive substrate for fundamental research questions in reinforcement learning, whilst filtering out noise inherent in physical systems. We show that although grid worlds are constructed with simple rules-based mechanics, they can be used to construct complex, open-ended, and procedurally generated environments. We improve upon Griddly with GriddlyJS, a web-based tool for designing and testing grid-world environments for reinforcement learning research. GriddlyJS provides a rich suite of features that assist researchers in a multitude of different learning approaches. To highlight the features of GriddlyJS we present a dataset of 100 complex escape-room puzzle levels. In addition to these complex puzzle levels, we provide human-generated trajectories and a baseline policy that can be run in a web browser. We show that this tooling enables significantly faster research iteration in many sub-fields. We then explore several areas of RL research that are made accessible by the features introduced by Griddly. First, we explore learning grid-world game mechanics using deep neural networks. The neural game engine is introduced, which has competitive performance in terms of sample efficiency and predicting states accurately over long time horizons. 
Secondly, conditional action trees are introduced, which describe a method for compactly expressing complex hierarchical action spaces. Expressing hierarchical action spaces as trees leads to action spaces that are additive rather than multiplicative over the factors of the action space. It is shown that these compressed action spaces reduce the required output size of neural networks without compromising performance. This makes the interfaces to complex environments significantly simpler to implement. Finally, we explore the inherent symmetry in common observation spaces, using the concept of geometric deep learning. We show that certain geometric data augmentation methods do not conform to the underlying assumptions in several training algorithms. We provide solutions to these problems in the form of novel regularization functions and demonstrate that these methods fix the underlying assumptions.
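
The "additive rather than multiplicative" claim can be made concrete with a small sketch. The factor names and sizes, the toy tree, and the function names are all hypothetical and chosen only for illustration; this is not the thesis's implementation or Griddly's API.

```python
from math import prod

# Hypothetical action factors: which unit acts, what it does, and in which direction.
FACTOR_SIZES = {"unit": 10, "action_type": 5, "direction": 4}


def flat_output_size(sizes):
    """One network output per full combination of factors (multiplicative)."""
    return prod(sizes.values())


def tree_output_size(sizes):
    """One network output per option at each level of the tree (additive)."""
    return sum(sizes.values())


# A toy tree: the options valid at each level depend on the choices made above it.
TREE = {
    "move": {"north": {}, "south": {}, "east": {}, "west": {}},
    "attack": {"north": {}, "east": {}},
    "wait": {},
}


def valid_options(tree, prefix):
    """Options available after the partial choice sequence `prefix`."""
    node = tree
    for choice in prefix:
        node = node[choice]
    return sorted(node)


if __name__ == "__main__":
    print(flat_output_size(FACTOR_SIZES), tree_output_size(FACTOR_SIZES))
    print(valid_options(TREE, []), valid_options(TREE, ["attack"]))
```

With these assumed sizes the flat encoding needs 200 outputs while the tree encoding needs 19. Walking the tree level by level also yields the set of valid options at each step, which a policy could use to mask out invalid choices.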

    Learning Hierarchical Compositional Task Definitions through Online Situated Interactive Language Instruction

    Artificial agents, from robots to personal assistants, have become competent workers in many settings and embodiments, but for the most part, they are limited to performing the capabilities and tasks with which they were initially programmed. Learning in these settings has predominately focused on learning to improve the agent’s performance on a task, and not on learning the actual definition of a task. The primary method for imbuing an agent with the task definition has been through programming by humans, who have detailed knowledge of the task, domain, and agent architecture. In contrast, humans quickly learn new tasks from scratch, often from instruction by another human. If we desire AI agents to be flexible and dynamically extendable, they will need to emulate these learning capabilities, and not be stuck with the limitation that task definitions must be acquired through programming. This dissertation explores the problem of how an Interactive Task Learning agent can learn the complete definition or formulation of novel tasks rapidly through online natural language instruction from a human instructor. Recent advances in natural language processing, memory systems, computer vision, spatial reasoning, robotics, and cognitive architectures make the time ripe to study how knowledge can be automatically acquired, represented, transferred, and operationalized. We present a learning approach embodied in an ITL agent that interactively learns the meaning of task concepts, the goals, actions, failure conditions, and task-specific terms, for 60 games and puzzles. In our approach, the agent learns hierarchical symbolic representations of task knowledge that enable it to transfer and compose knowledge, analyze and debug multiple interpretations, and communicate with the teacher to resolve ambiguity. 
Our results show that the agent can correctly generalize, disambiguate, and transfer concepts across variations of language descriptions and world representations, even with distractors present. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/153434/1/jrkirk_1.pd

    BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference


    Symbolic Search in Planning and General Game Playing

    Search is an important topic in many areas of AI. Search problems often result in an immense number of states. This work addresses this by using a special data structure, BDDs, which can represent large sets of states efficiently, often saving space compared to explicit representations. The first part is concerned with an analysis of the complexity of BDDs for some search problems, resulting in lower or upper bounds on BDD sizes for these. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. This part presents symbolic algorithms for finding optimal solutions for two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner was able to win the International Planning Competition IPC 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance what game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.

    Generation and Analysis of Content for Physics-Based Video Games

    The development of artificial intelligence (AI) techniques that can assist with the creation and analysis of digital content is a broad and challenging task for researchers. This topic has been most prevalent in the field of game AI research, where games are used as a testbed for solving more complex real-world problems. One of the major issues with prior AI-assisted content creation methods for games has been a lack of direct comparability to real-world environments, particularly those with realistic physical properties to consider. Creating content for such environments typically requires physics-based reasoning, which imposes many additional complications and restrictions that must be considered. Addressing and developing methods that can deal with these physical constraints, even if they are only within simulated game environments, is an important and challenging task for AI techniques that intend to be used in real-world situations. The research presented in this thesis describes several approaches to creating and analysing levels for the physics-based puzzle game Angry Birds, which features a realistic 2D environment. This research was multidisciplinary in nature and covers a wide variety of different AI fields, leading to this thesis being presented as a compilation of published work. The central part of this thesis consists of procedurally generating levels for physics-based games similar to those in Angry Birds. This predominantly involves creating and placing stable structures made up of many smaller blocks, as well as other level elements. Multiple approaches are presented, including both fully autonomous and human-AI collaborative methodologies. In addition, several analyses of Angry Birds levels were carried out using current state-of-the-art agents. A hyper-agent was developed that uses machine learning to estimate the performance of each agent in a portfolio for an unknown level, allowing it to select the one most likely to succeed. 
Agent performance on levels that contain deceptive or creative properties was also investigated, allowing determination of the current strengths and weaknesses of different AI techniques. The observed variability in performance across levels for different AI techniques led to the development of an adaptive level generation system, allowing for the dynamic creation of increasingly challenging levels over time based on agent performance analysis. An additional study also investigated the theoretical complexity of Angry Birds levels from a computational perspective. While this research is predominately applied to video games with physics-based simulated environments, the challenges and problems solved by the proposed methods also have significant real-world potential and applications.

    ICAPS 2012. Proceedings of the third Workshop on the International Planning Competition

    22nd International Conference on Automated Planning and Scheduling. June 25-29, 2012, Atibaia, Sao Paulo (Brazil). Proceedings of the 3rd Workshop on the International Planning Competition. Contents: The Academic Advising Planning Domain / Joshua T. Guerin, Josiah P. Hanna, Libby Ferland, Nicholas Mattei, and Judy Goldsmith. -- Leveraging Classical Planners through Translations / Ronen I. Brafman, Guy Shani, and Ran Taig. -- Advances in BDD Search: Filtering, Partitioning, and Bidirectionally Blind / Stefan Edelkamp, Peter Kissmann, and Álvaro Torralba. -- A Multi-Agent Extension of PDDL3.1 / Daniel L. Kovacs. -- Mining IPC-2011 Results / Isabel Cenamor, Tomás de la Rosa, and Fernando Fernández. -- How Good is the Performance of the Best Portfolio in IPC-2011? / Sergio Nuñez, Daniel Borrajo, and Carlos Linares López. -- “Type Problem in Domain Description!” or, Outsiders’ Suggestions for PDDL Improvement / Robert P. Goldman and Peter Keller. En prens

    Using Plan Decomposition for Continuing Plan Optimisation and Macro Generation

    This thesis addresses three problems in the field of classical AI planning: decomposing a plan into meaningful subplans, continuing plan quality optimisation, and macro generation for efficient planning. The importance and difficulty of each of these problems is outlined below. (1) Decomposing a plan into meaningful subplans can facilitate a number of post-plan-generation tasks, including plan quality optimisation and macro generation – the two key concerns of this thesis. However, conventional plan decomposition techniques are often unable to decompose plans because they consider dependencies among steps, rather than subplans. (2) Finding high quality plans for large planning problems is hard. Planners that guarantee optimal, or bounded suboptimal, plan quality often cannot solve them. In one experiment with the Genome Edit Distance domain, optimal planners solved only 11.5% of problems. Anytime planners promise a way to successively produce better plans over time. However, current anytime planners tend to reach a limit where they stop finding any further improvement, and the plans produced are still very far from the best possible. In the same experiment, the LAMA anytime planner solved all problems but found plans whose average quality is 1.57 times worse than the best known. (3) Finding solutions quickly, or even finding any solution for large problems within some resource constraint, is also difficult. The best-performing planner in the 2014 International Planning Competition still failed to solve 29.3% of problems. Re-engineering a domain model by capturing and exploiting structural knowledge in the form of macros has been found very useful in speeding up planners. However, existing planner-independent macro generation techniques often fail to capture some promising macro candidates because the constituent actions are not found in sequence in the totally ordered training plans. 
This thesis contributes to plan decomposition by developing a new plan deordering technique, named block deordering, that allows two subplans to be unordered even when their constituent steps cannot. Based on the block-deordered plan, this thesis further contributes to plan optimisation and macro generation, and their implementations in two systems, named BDPO2 and BloMa. Key to BDPO2 is a decomposition into subproblems of improving parts of the current best plan, rather than the plan as a whole. BDPO2 can be seen as an application of the large neighbourhood search strategy to planning. We use several windowing strategies to extract subplans from the block deordering of the current plan, and on-line learning for applying the most promising subplanners to the most promising subplans. We demonstrate empirically that even starting with the best plans found by other means, BDPO2 is still able to continue improving plan quality, and often produces better plans than other anytime planners when all are given enough runtime. BloMa uses an automatic planner-independent technique to extract and filter “self-contained” subplans as macros from the block-deordered training plans. These macros represent important longer activities that are useful for improving planners' coverage and efficiency compared to the traditional macro generation approaches.