64 research outputs found

    Optimal Planning with State Constraints

    In the classical planning model, state variables are assigned values in the initial state and remain unchanged unless explicitly affected by action effects. However, some properties of states are more naturally modelled not as direct effects of actions but as values derived, in each state, from the primary variables via a set of rules. We refer to those rules as state constraints. The two types of state constraints discussed here are numeric state constraints and logical rules that we refer to as axioms. When using state constraints we distinguish between primary variables, whose values are directly affected by action effects, and secondary variables, whose values are determined by state constraints. While primary variables have finite and discrete domains, as in classical planning, there is no such requirement for secondary variables. For example, numeric state constraints allow secondary variables whose values are real numbers. We show that state constraints are a construct that lets us combine classical planning methods with specialised solvers developed for other types of problems. For example, introducing numeric state constraints enables us to apply planning techniques in domains involving interconnected physical systems, such as power networks. To solve these types of problems optimally, we adapt commonly used methods from optimal classical planning, namely state-space search guided by admissible heuristics. In heuristics based on the monotonic relaxation, each variable in a relaxed state assumes a set of values instead of a single value. With state constraints, the challenge becomes evaluating the conditions, such as goals and action preconditions, that involve secondary variables. We employ consistency-checking tools to evaluate whether these conditions are satisfied in a relaxed state: for numeric constraints we use linear programming, while for axioms we use answer set programming and three-valued semantics. This allows us to build a relaxed planning graph and compute constraint-aware versions of heuristics based on the monotonic relaxation. We also adapt pattern database heuristics. We observe that an abstract state can be viewed as a state in the monotonic relaxation in which the variables in the pattern hold exactly one value, while the variables not in the pattern simultaneously hold all the values in their domains. This means we can apply the same technique for evaluating conditions on secondary variables as in the monotonic relaxation, and build pattern databases much as in classical planning. To make better use of our heuristics, we modify the A* algorithm by combining two techniques that were previously used independently: partial expansion and preferred operators. Our modified algorithm, which we call PrefPEA, is most beneficial when the heuristic is expensive to compute but accurate, and states have many successors.
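
    As an illustration of the consistency check described above, the following is a minimal sketch assuming an interval relaxation and illustrative names: a primary variable holds a set of values in the relaxed state, a linear state constraint defines a secondary variable, and linear programming decides whether a condition on the secondary variable can be satisfied.

    ```python
    # Feasibility-style consistency check for a numeric condition over a
    # relaxed state. The constraint y = 2x + 3 and the condition y >= 5
    # are illustrative; the thesis's actual encoding may differ.
    from scipy.optimize import linprog

    def condition_possibly_satisfied(x_values):
        """Can y >= 5 hold for some x in the relaxed set, given y = 2x + 3?"""
        lo, hi = min(x_values), max(x_values)    # interval relaxation of the set
        res = linprog(
            c=[0, 0],                            # pure feasibility: minimise 0
            A_eq=[[-2, 1]], b_eq=[3],            # state constraint: y - 2x = 3
            A_ub=[[0, -1]], b_ub=[-5],           # condition y >= 5 as -y <= -5
            bounds=[(lo, hi), (None, None)],     # x confined to its relaxed range
            method="highs",
        )
        return res.status == 0                   # status 0 means a feasible point exists

    print(condition_possibly_satisfied({0, 1, 2}))  # True: x = 1 gives y = 5
    print(condition_possibly_satisfied({0}))        # False: y is fixed at 3
    ```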

    Symbolic Search in Planning and General Game Playing

    Search is an important topic in many areas of AI. Search problems often involve an immense number of states. This work addresses this by using a special data structure, binary decision diagrams (BDDs), which can represent large sets of states efficiently, often saving space compared to explicit representations. The first part analyses the complexity of BDDs for some search problems, resulting in lower or upper bounds on BDD sizes for these problems. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. This part presents symbolic algorithms for finding optimal solutions in two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner won the International Planning Competition IPC 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance what game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.
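
    The set-at-a-time character of symbolic search can be sketched as follows. This is an illustrative stand-in: a plain Python set plays the role of the BDD, and image is a hypothetical function computing all successors of a state set at once.

    ```python
    # Symbolic-style breadth-first search: the frontier is a whole set of
    # states, advanced in one image step per layer. In a symbolic planner the
    # sets would be BDDs and the image a relational product; plain sets are
    # used here only for illustration.
    def symbolic_bfs(init_states, goal_states, image):
        """image(S) must return the set of all successors of the states in S."""
        reached = set(init_states)
        frontier = set(init_states)
        depth = 0
        while frontier:
            if frontier & goal_states:            # goal test on the entire set
                return depth                      # length of a shortest plan
            frontier = image(frontier) - reached  # expand the whole layer at once
            reached |= frontier
            depth += 1
        return None                               # goal unreachable
    ```

    Swapping BDDs in for the sets changes only the representation: the union, difference, and image steps become BDD operations, which is what makes very large state sets affordable.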

    A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

    A* search is an informed search algorithm that uses a heuristic function to guide the order in which nodes are expanded. Since the computation required to expand a node and compute the heuristic values for all of its generated children grows linearly with the size of the action space, A* search can become impractical for problems with large action spaces. This computational burden becomes even more apparent when heuristic functions are learned by general, but computationally expensive, deep neural networks. To address this problem, we introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a heuristic function that, with a single forward pass through a deep neural network, computes the sum of the transition cost and the heuristic value for every child of a node without explicitly generating any of the children, eliminating the need for node expansions. DeepCubeAQ then uses a novel variant of A* search, called AQ* search, that uses the deep Q-network to guide the search. We use DeepCubeAQ to solve the Rubik's cube when it is formulated with a large action space that includes 1872 meta-actions, and show that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time when performing AQ* search, and that AQ* search is orders of magnitude faster than A* search.
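
    The expansion-free step can be sketched as follows, assuming a hypothetical PyTorch network qnet that maps a state to one Q-value per action, each trained to estimate the transition cost plus the child's heuristic value.

    ```python
    # One forward pass scores every child of a node without generating any of
    # them; a child state is only materialised if its entry is later popped.
    # `qnet` is a hypothetical learned network, not the paper's exact model.
    import heapq
    import itertools

    import torch

    tiebreak = itertools.count()   # keeps heapq from ever comparing states

    def push_children(open_list, qnet, state, g):
        with torch.no_grad():
            q = qnet(state.unsqueeze(0)).squeeze(0)   # one Q-value per action
        for action, qa in enumerate(q.tolist()):
            f = g + qa             # f = g(n) + (cost(n, a) + h(child_a))
            heapq.heappush(open_list, (f, next(tiebreak), state, action))
    ```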

    Solving planning problems with deep reinforcement learning and tree search

    Deep reinforcement learning methods are capable of learning complex heuristics starting with no prior knowledge, but they struggle in environments where the learning signal is sparse. In contrast, planning methods can discover the optimal path to a goal in the absence of external rewards, but they often require a hand-crafted heuristic function to be effective. In this thesis, we describe a model-based reinforcement learning method that bridges the middle ground between these two approaches. When evaluated on the complex domain of Sokoban, the model-based method was found to be more performant, stable, and sample-efficient than a model-free baseline.
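
    The abstract does not spell out the algorithm, but the general pattern it points at, searching through a learned model guided by a learned value function, can be sketched as below; model, value, and the goal test are all illustrative assumptions.

    ```python
    # Best-first search through a learned model: model(s, a) predicts the next
    # state and value(s) scores it. Both are assumed to be learned components;
    # no interaction with the real environment happens during search.
    import heapq
    import itertools

    def model_based_search(root, model, value, actions, is_goal, budget=1000):
        tick = itertools.count()   # tie-breaker so heapq never compares states
        frontier = [(-value(root), next(tick), root, [])]
        for _ in range(budget):
            if not frontier:
                break
            _, _, s, plan = heapq.heappop(frontier)
            if is_goal(s):                      # goal test on a predicted state
                return plan                     # action sequence found in imagination
            for a in actions:
                s2 = model(s, a)                # imagined transition
                heapq.heappush(frontier, (-value(s2), next(tick), s2, plan + [a]))
        return None
    ```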

    Sokoban game and artificial intelligence

    The thesis is focused on solving the Sokoban game using artificial intelligence algorithms. The first part describes the Sokoban game, its state space, and selected state-space search methods. In the second part, the selected methods were implemented and a graphical user interface was created in Python. Comparative experiments were carried out in the final part.
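
    To make the state-space view concrete, a minimal Sokoban solver can represent each state as the player position plus a frozenset of box positions and search it breadth-first; the coordinate encoding below is an assumption, not the thesis's implementation.

    ```python
    # Uninformed breadth-first search over Sokoban states. walls and goals are
    # sets of (row, col) cells; a state is (player, frozenset_of_boxes).
    from collections import deque

    MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

    def bfs_sokoban(walls, goals, player, boxes):
        start = (player, frozenset(boxes))
        seen, queue = {start}, deque([(start, "")])
        while queue:
            (pos, bxs), path = queue.popleft()
            if bxs <= goals:                        # every box on a goal square
                return path
            for move, (dr, dc) in MOVES.items():
                nxt = (pos[0] + dr, pos[1] + dc)
                if nxt in walls:
                    continue
                nb = bxs
                if nxt in bxs:                      # stepping into a box pushes it
                    push = (nxt[0] + dr, nxt[1] + dc)
                    if push in walls or push in bxs:
                        continue                    # blocked push: illegal move
                    nb = (bxs - {nxt}) | {push}
                state = (nxt, nb)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, path + move))
        return None                                 # level unsolvable
    ```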

    Improving the efficiency of the Pre-Optimization Plan Techniques

    Automated planning is an important research area of Artificial Intelligence (AI). In classical planning, a sub-area of automated planning, attention is given to 'agile' planning, i.e., solving planning problems as quickly as possible regardless of the quality of the solution plans. Obtaining solutions quickly is important for real-time applications as well as in situations of imminent danger. Post-planning optimisation techniques that improve the quality of solution plans are a good option for improving poor-quality plans; since such techniques run as post-processing, they avoid situations where there is a risk of not having a solution plan in time. This thesis focuses on an important sub-area of post-planning optimisation: identifying and removing redundant actions from solution plans. In particular, this study extends the existing Action Elimination (AE) and Greedy Action Elimination (GAE) algorithms by introducing two approaches that improve their efficiency; the AE and GAE algorithms are thereby developed into the UAIAE and UGAIAE systems respectively. The key to our approaches is optimising the process while keeping the same 'elimination power' (identifying and removing the same number of redundant actions). The first approach improves the algorithms by considering situations where inverse actions are redundant, while the other identifies a subset of actions that cannot be present in any set of redundant actions. This subset is named justified unique actions. The study's approach to identifying this subset was motivated by a promising heuristic concept called 'landmarks': facts or actions that cannot be avoided if the goal is to be achieved. The approaches in this study have been empirically evaluated on several benchmark domains, using several planning engines that participated in the Agile track of the International Planning Competition 2014. In addition, they have been evaluated against state-of-the-art optimal and satisficing planners, as well as against a plan repair technique. The methods of the AE family can be understood as polynomial-time methods that improve the quality of a plan by removing redundant actions, or as tools that complement more sophisticated plan optimisation techniques.
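
    The redundancy-removal idea at the core of the AE family can be sketched as follows. The STRIPS-style action encoding (precondition, add, delete sets) is an assumption, and this single-action sweep is a simplification: AE proper removes whole sets of interdependent redundant actions at once.

    ```python
    def plan_is_valid(plan, init, goal):
        """Simulate a STRIPS-style plan; each action is (pre, add, dele) sets."""
        state = set(init)
        for pre, add, dele in plan:
            if not pre <= state:            # precondition not satisfied
                return False
            state = (state - dele) | add    # apply the action's effects
        return goal <= state

    def eliminate_redundant(plan, init, goal):
        """Greedy sweep: drop any action whose removal keeps the plan valid."""
        i = 0
        while i < len(plan):
            candidate = plan[:i] + plan[i + 1:]
            if plan_is_valid(candidate, init, goal):
                plan = candidate            # action i was redundant
            else:
                i += 1
        return plan
    ```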

    BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference


    Active Learning for Reducing Labeling Effort in Text Classification Tasks

    Labeling data can be an expensive task, as it is usually performed manually by domain experts. This is cumbersome for deep learning, which depends on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce labeling effort by using only the data the model deems most informative. Little research has been done on AL in a text classification setting, and next to none has involved the more recent, state-of-the-art Natural Language Processing (NLP) models. Here, we present an empirical study that compares different uncertainty-based algorithms with BERT-base as the classifier. We evaluate the algorithms on two NLP classification datasets: Stanford Sentiment Treebank and KvK-Frontpages. Additionally, we explore heuristics that aim to solve presupposed problems of uncertainty-based AL, namely that it is unscalable and prone to selecting outliers. Furthermore, we explore the influence of the query-pool size on the performance of AL. While the proposed heuristics did not improve the performance of AL, our results show that using uncertainty-based AL with BERT-base outperforms random sampling of data. This difference in performance can decrease as the query-pool size grows.
    Comment: Accepted as a conference paper at the joint 33rd Benelux Conference on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2021). This camera-ready version adds several improvements, including a more thorough discussion of related work and an extended discussion section. 28 pages including references and appendices.
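
    A minimal sketch of the uncertainty-based selection step evaluated in the paper: rank the unlabeled pool by predictive entropy and query the top of the ranking. predict_proba stands in for the BERT-base classifier and is an assumption.

    ```python
    import numpy as np

    def select_most_uncertain(predict_proba, pool, query_size):
        """One AL step: return indices of the `query_size` most uncertain items."""
        probs = predict_proba(pool)                       # shape: [n, n_classes]
        entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
        return np.argsort(-entropy)[:query_size]          # most uncertain first
    ```

    The selected items would then be labeled, added to the training set, and the classifier retrained before the next round.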