Optimal Planning with State Constraints
In the classical planning model, state variables are assigned
values in the initial state and remain unchanged unless
explicitly affected by action effects. However, some properties
of states are more naturally modelled not as direct effects of
actions but instead as derived, in each state, from the primary
variables via a set of rules. We refer to those rules as state
constraints. The two types of state constraints that will be
discussed here are numeric state constraints and logical rules
that we will refer to as axioms.
When using state constraints we make a distinction between
primary variables, whose values are directly affected by action
effects, and secondary variables, whose values are determined by
state constraints. While primary variables have finite and
discrete domains, as in classical planning, there is no such
requirement for secondary variables. For example, using numeric
state constraints allows us to have secondary variables whose
values are real numbers. We show that state constraints are a
construct that lets us combine classical planning methods with
specialised solvers developed for other types of problems. For
example, introducing numeric state constraints enables us to
apply planning techniques in domains involving interconnected
physical systems, such as power networks.
To solve these types of problems optimally, we adapt commonly
used methods from optimal classical planning, namely state-space
search guided by admissible heuristics. In heuristics based on
monotonic relaxation, the idea is that in a relaxed state each
variable assumes a set of values instead of just a single value.
With state constraints, the challenge becomes to evaluate the
conditions, such as goals and action preconditions, that involve
secondary variables. We employ consistency checking tools to
evaluate whether these conditions are satisfied in the relaxed
state. In our work with numeric constraints we use linear
programming, while with axioms we use answer set programming and
three-valued semantics. This allows us to build a relaxed planning
graph and compute constraint-aware versions of heuristics based on
monotonic relaxation.
We also adapt pattern database heuristics. We notice that an
abstract state can be thought of as a state in the monotonic
relaxation in which the variables in the pattern hold only one
value, while the variables not in the pattern simultaneously hold
all the values in their domains. This means that we can apply the
same technique for evaluating conditions on secondary variables
as we did for the monotonic relaxation and build pattern
databases in much the same way as in classical planning.
To make better use of our heuristics, we modify the A* algorithm
by combining two techniques that were previously used
independently – partial expansion and preferred operators. Our
modified algorithm, which we call PrefPEA, is most beneficial in
cases where the heuristic is expensive to compute but accurate, and
states have many successors.
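The monotonic relaxation described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy switch domain, and the brute-force enumeration that stands in for the linear-programming or answer-set consistency check are all assumptions made for illustration. Each variable holds a set of values, and a condition counts as satisfied when some choice of one value per variable satisfies it.

```python
from itertools import product

def condition_possible(relaxed, variables, predicate):
    """True if some choice of one value per variable satisfies the predicate.
    Brute-force stand-in (assumption) for an LP or ASP consistency check."""
    domains = [sorted(relaxed[v]) for v in variables]
    return any(predicate(*combo) for combo in product(*domains))

def relaxed_layers(initial, actions, goal):
    """Grow value sets layer by layer; return the first layer in which the
    goal becomes possible, or None if a fixpoint is reached without it."""
    relaxed = {v: {val} for v, val in initial.items()}
    layers = 0
    while True:
        if condition_possible(relaxed, *goal):
            return layers
        grown = {v: set(vals) for v, vals in relaxed.items()}
        for precondition, effects in actions:
            if condition_possible(relaxed, *precondition):
                for v, val in effects.items():
                    grown[v].add(val)  # values accumulate, never disappear
        if grown == relaxed:
            return None
        relaxed = grown
        layers += 1

# Hypothetical toy domain: two switches; the "flow" s1 + s2 plays the role
# of a secondary variable derived from the primary ones.
initial = {"s1": 0, "s2": 0}
actions = [
    ((("s1",), lambda s1: s1 == 0), {"s1": 1}),  # close switch 1
    ((("s1",), lambda s1: s1 == 1), {"s2": 1}),  # close switch 2 once 1 is closed
]
goal = (("s1", "s2"), lambda s1, s2: s1 + s2 >= 2)
depth = relaxed_layers(initial, actions, goal)
```

In this toy instance the goal first becomes possible in the second layer, because the second switch can only be closed after the relaxed state already contains the closed value of the first.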
Symbolic Search in Planning and General Game Playing
Search is an important topic in many areas of AI. Search problems often result in an immense number of states. This work addresses the problem by using a special data structure, binary decision diagrams (BDDs), which can represent large sets of states efficiently, often saving space compared to explicit representations. The first part is concerned with an analysis of the complexity of BDDs for some search problems, resulting in lower or upper bounds on the BDD sizes for them. The second part is concerned with action planning, an area where the programmer does not know in advance what the search problem will look like. This part presents symbolic algorithms for finding optimal solutions in two different settings, classical and net-benefit planning, as well as several improvements to these algorithms. The resulting planner was able to win the International Planning Competition IPC 2008. The third part is concerned with general game playing, which is similar to planning in that the programmer does not know in advance what game will be played. This work proposes algorithms for instantiating the input and solving games symbolically. For playing, a hybrid player based on UCT and the solver is presented.
A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks
A* search is an informed search algorithm that uses a heuristic function to
guide the order in which nodes are expanded. Since the computation required to
expand a node and compute the heuristic values for all of its generated
children grows linearly with the size of the action space, A* search can become
impractical for problems with large action spaces. This computational burden
becomes even more apparent when heuristic functions are learned by general, but
computationally expensive, deep neural networks. To address this problem, we
introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that
builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a
heuristic function that, with a single forward pass through a deep neural
network, computes the sum of the transition cost and the heuristic value of all
of the children of a node without explicitly generating any of the children,
eliminating the need for node expansions. DeepCubeAQ then uses a novel variant
of A* search, called AQ* search, that uses the deep Q-network to guide search.
We use DeepCubeAQ to solve the Rubik's cube when formulated with a large action
space that includes 1872 meta-actions and show that this 157-fold increase in
the size of the action space incurs less than a 4-fold increase in computation
time when performing AQ* search and that AQ* search is orders of magnitude
faster than A* search.
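The idea of searching without expanding nodes can be sketched as a best-first search over (state, action) pairs. This is a minimal illustration under stated assumptions, not the DeepCubeAQ code: `aq_star_sketch`, `step`, and the hand-written `q_values` function (which stands in for a trained deep Q-network) are hypothetical names. One call to `q_values` scores every action of a state, and a child state is generated only when its entry is popped from the queue.

```python
import heapq
import math
from itertools import count

def aq_star_sketch(start, is_goal, step, q_values):
    """q_values(state) -> {action: estimated transition cost + cost-to-go}
    step(state, action) -> (child_state, transition_cost)"""
    tie = count()  # unique tiebreaker so states and paths are never compared
    frontier = [(0.0, next(tie), 0.0, start, None, [])]
    best_g = {}
    while frontier:
        _, _, g, state, action, path = heapq.heappop(frontier)
        if action is not None:
            # The child is generated only now, when its entry is popped.
            state, cost = step(state, action)
            g += cost
            path = path + [action]
        if is_goal(state):
            return path, g
        if state in best_g and best_g[state] <= g:
            continue
        best_g[state] = g
        # One call scores all actions; no child states are created here.
        for a, q in q_values(state).items():
            heapq.heappush(frontier, (g + q, next(tie), g, state, a, path))
    return None

# Hypothetical toy problem: walk from 0 to 5 with steps of 1 or 2, unit cost.
GOAL = 5
ACTIONS = (1, 2)

def step(state, action):
    return state + action, 1.0

def q_values(state):
    # Hand-written stand-in for a learned Q-function (assumption): a trained
    # network would produce these numbers from the parent state alone.
    out = {}
    for a in ACTIONS:
        child = state + a
        out[a] = 1.0 + math.ceil((GOAL - child) / 2) if child <= GOAL else math.inf
    return out

path, cost = aq_star_sketch(0, lambda s: s == GOAL, step, q_values)
```

The point of the design is that the number of calls to the scoring function grows with the number of popped entries rather than with the number of generated children, which is what matters when the action space is large.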
Solving planning problems with deep reinforcement learning and tree search
Deep reinforcement learning methods are capable of learning complex heuristics starting with no prior knowledge, but struggle in environments where the learning signal is sparse. In contrast, planning methods can discover the optimal path to a goal in the absence of external rewards, but often require a hand-crafted heuristic function to be effective. In this thesis, we describe a model-based reinforcement learning method that bridges the middle ground between these two approaches. When evaluated on the complex domain of Sokoban, the model-based method was found to be more performant, stable and sample-efficient than a model-free baseline.
Sokoban game and artificial intelligence
The thesis focuses on solving the Sokoban game using artificial intelligence methods. The first part describes the Sokoban game, the state space, and the principles of selected state-space search algorithms. In the second part, the described algorithms were implemented in Python and a graphical user interface was created. Comparative experiments were carried out in the final part.
Improving the efficiency of the Pre-Optimization Plan Techniques
Automated planning is an important research area of Artificial Intelligence (AI). In classical planning, which is a sub-area of automated planning, attention is given to ‘agile’ planning, i.e., solving planning problems as quickly as possible regardless of the quality of solution plans. Obtaining solutions quickly is important for real-time applications as well as in situations of imminent danger. Post-planning optimisation techniques are a good option for improving poor-quality plans. Since such techniques run as post-processing, they avoid the risk of not having a solution plan in time. This thesis focuses on an important sub-area of post-planning optimisation; that is, on identifying and removing redundant actions from solution plans. In particular, this study extends the existing Action Elimination (AE) and Greedy Action Elimination (GAE) algorithms by introducing two approaches to improve their efficiency. The AE and GAE algorithms are thereby developed into the UAIAE and UGAIAE systems respectively. The key to our approaches is to optimise the process while keeping the same ‘elimination power’ (identifying and removing the same number of redundant actions). The first approach improves the algorithms by considering situations where inverse actions are redundant, while the other identifies a subset of actions that cannot be present in any set of redundant actions. This subset is named justified unique actions. The study’s approach to identifying this subset has been motivated by a promising heuristic approach called ‘landmarks’, which are facts or actions that cannot be eliminated to achieve the goal.
The approaches in this study have been empirically evaluated using several benchmark domains, as well as several planning engines that participated in the Agile track of the International Planning Competition 2014. In addition, they have been evaluated against state-of-the-art optimal and satisficing planners, as well as against a plan repair technique.
The methods of the AE family can be understood as polynomial-time methods that improve the quality of a plan by removing redundant actions, or as tools to complement more sophisticated plan optimisation techniques.
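For readers unfamiliar with Action Elimination, the following is a minimal sketch of the basic procedure as it is commonly described; it does not show the UAIAE or UGAIAE extensions from the thesis. The STRIPS-style tuple encoding, the function names, and the toy plan are assumptions made for illustration. Each action is tentatively removed, later actions whose preconditions then fail are removed with it, and the removal is kept only if the goal still holds.

```python
def apply_action(state, action):
    """Apply a STRIPS-style action (name, pre, add, delete) to a state."""
    _, _, add, delete = action
    return (state - delete) | add

def action_elimination(plan, initial, goal):
    """Greedily drop actions whose removal still leaves a valid plan."""
    plan = list(plan)
    i = 0
    while i < len(plan):
        # State reached by the untouched prefix plan[:i].
        state = frozenset(initial)
        for action in plan[:i]:
            state = apply_action(state, action)
        removed = {i}  # tentatively remove action i
        for j in range(i + 1, len(plan)):
            if plan[j][1] <= state:
                state = apply_action(state, plan[j])
            else:
                removed.add(j)  # inapplicable without the removed actions
        if goal <= state:
            plan = [a for k, a in enumerate(plan) if k not in removed]
        else:
            i += 1  # action i is needed; keep it and move on
    return plan

# Hypothetical toy plan containing a pair of inverse moves.
move_a_b = ("move_a_b", frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"}))
move_b_a = ("move_b_a", frozenset({"at_b"}), frozenset({"at_a"}), frozenset({"at_b"}))
plan = [move_a_b, move_b_a, move_a_b]
reduced = action_elimination(plan, {"at_a"}, frozenset({"at_b"}))
```

On this toy plan the first move and its inverse are removed together, leaving a single move from a to b.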
BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference
Active Learning for Reducing Labeling Effort in Text Classification Tasks
Labeling data can be an expensive task as it is usually performed manually by
domain experts. This is cumbersome for deep learning, as it is dependent on
large labeled datasets. Active learning (AL) is a paradigm that aims to reduce
labeling effort by labeling only the data that the model deems most
informative. Little research has been done on AL in a text classification
setting and next to none has involved the more recent, state-of-the-art Natural
Language Processing (NLP) models. Here, we present an empirical study that
compares different uncertainty-based algorithms with BERT as the
classifier. We evaluate the algorithms on two NLP classification datasets:
Stanford Sentiment Treebank and KvK-Frontpages. Additionally, we explore
heuristics that aim to solve presupposed problems of uncertainty-based AL;
namely, that it is unscalable and that it is prone to selecting outliers.
Furthermore, we explore the influence of the query-pool size on the performance
of AL. Although the proposed heuristics did not improve the
performance of AL, our results show that using uncertainty-based AL with
BERT outperforms random sampling of data. This difference in
performance can decrease as the query-pool size gets larger.
Comment: Accepted as a conference paper at the joint 33rd Benelux Conference
on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine
Learning (BNAIC/BENELEARN 2021). This camera-ready version, submitted to
BNAIC/BENELEARN, adds several improvements, including a more thorough
discussion of related work plus an extended discussion section. 28 pages
including references and appendices.
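As an illustration of uncertainty-based query selection in active learning, the sketch below ranks unlabeled pool items by the entropy of their predicted class probabilities and picks the most uncertain ones. Entropy is only one common uncertainty measure and is an assumption here; the abstract does not say which measures the study compares. The function names and the probability values are hypothetical and would in practice come from the classifier.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(pool_probs, k):
    """Return the indices of the k pool items with the highest entropy."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical predicted probabilities for four unlabeled examples.
pool_probs = [
    [0.99, 0.01],  # confident prediction
    [0.50, 0.50],  # maximally uncertain
    [0.80, 0.20],
    [0.60, 0.40],
]
queries = select_queries(pool_probs, k=2)
```

Here the two selected items are those with predictions closest to an even split; a random-sampling baseline would instead pick indices without looking at the predictions.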