Goal reasoning for autonomous agents using automated planning

Author: Pozanco Lancho Alberto
Publication date: 03/03/2021

International Mention in the doctoral degree.

Automated planning deals with the task of finding a sequence of actions, namely a plan, which achieves a goal from a given initial state. Most planning research assumes that goals are provided by an external user, and that agents just have to find a plan to achieve them. However, there exist many real-world domains where agents should reason not only about their actions but also about their goals, generating new ones or changing them according to the perceived environment. In this thesis we aim at broadening the goal reasoning capabilities of planning-based agents, both when acting in isolation and when operating in the same environment as other agents. In single-agent settings, we first explore a special type of planning task where we aim at discovering states that fulfill certain cost-based requirements with respect to a given set of goals. By computing these states, agents are able to solve interesting tasks such as finding escape plans that move agents into safe places, hiding their true goal from a potential observer, or anticipating dynamically arriving goals. We also show how learning the environment's dynamics may help agents to solve some of these tasks. Experimental results show that these states can be quickly found in practice, making agents able to solve new planning tasks and helping them in solving some existing ones. In multi-agent settings, we study the automated generation of goals based on other agents' behavior. We focus on competitive scenarios, where we are interested in computing counterplans that prevent opponents from achieving their goals. We frame these tasks as counterplanning, providing theoretical properties of the counterplans that solve them. We also show how agents can benefit from computing some of the states we propose in the single-agent setting to anticipate their opponent's movements, thus increasing the odds of blocking them. Experimental results show how counterplans can be found in different environments ranging from competitive planning domains to real-time strategy games.

Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. Committee: President: Eva Onaindía de la Rivaherrera; Secretary: Ángel García Olaya; Member: Mark Robert
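The planning task defined at the start of the abstract above (find a sequence of actions that reaches a goal from an initial state) can be illustrated with a minimal sketch. This is not the thesis's method: it is a plain breadth-first forward search over STRIPS-style actions, and the `Action` type, the `plan` function, and the three-room example are all hypothetical names chosen for illustration.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    """A STRIPS-style action over sets of facts (illustrative type)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset


def plan(initial_state, goal, actions):
    """Return a shortest list of action names reaching the goal, or None."""
    start = frozenset(initial_state)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return path
        for action in actions:
            if action.preconditions <= state:
                successor = (state - action.delete_effects) | action.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, path + [action.name]))
    return None  # goal unreachable with the given actions


def move(src, dst):
    """Build a hypothetical 'move' action between two rooms."""
    return Action(
        name=f"move-{src}-{dst}",
        preconditions=frozenset({f"at-{src}"}),
        add_effects=frozenset({f"at-{dst}"}),
        delete_effects=frozenset({f"at-{src}"}),
    )


ACTIONS = [move("A", "B"), move("B", "C")]
route = plan({"at-A"}, {"at-C"}, ACTIONS)
```

Goal reasoning, as the abstract describes it, would sit one level above such a search: the agent would decide which `goal` set to pass in, rather than receiving it from a user.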

Universidad Carlos III de Madrid e-Archivo

Robotic ubiquitous cognitive ecology for smart homes

Author: A Cesta
A Gaddam
A Gerevini
A Lotfi
A Moustapha
A. K. Ray
A. Micheli
A. Renteria
A. Saffiotti
AK Ray
C Gallicchio
C Liming
C Watkins
C. Gallicchio
C. Gennaro
C. Vairo
D Bacciu
D Cook
D De
D Peebles
D Roggen
D Vernon
D Verstraeten
D. Bacciu
D. Swords
DJ Cook
G Edelman
G Leng
G. Amato
H Hongmei
H Jaeger
H. Lozano
JR Anderson
M Alam
M Kurz
M Lukosevicius
M Sokolova
M. Broxvall
M. Di Rocco
M. Dragone
MB Do
P Doherty
P Langley
P Rashidi
P. Vance
R Kulkarni
R Sun
S Fratini
S Knight
S Zhang
S. Chessa
S. Coleman
T. M. McGinnity
W Duch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Robotic ecologies are networks of heterogeneous robotic devices pervasively embedded in everyday environments, where they cooperate to perform complex tasks. While their potential makes them increasingly popular, one fundamental problem is how to make them both autonomous and adaptive, so as to reduce the amount of preparation, pre-programming and human supervision that they require in real world applications. The project RUBICON develops learning solutions which yield cheaper, adaptive and efficient coordination of robotic ecologies. The approach we pursue builds upon a unique combination of methods from cognitive robotics, machine learning, planning and agent-based control, and wireless sensor networks. This paper illustrates the innovations advanced by RUBICON in each of these fronts before describing how the resulting techniques have been integrated and applied to a smart home scenario. The resulting system is able to provide useful services and pro-actively assist the users in their activities. RUBICON learns through an incremental and progressive approach driven by the feedback received from its own activities and from the user, while also self-organizing the manner in which it uses available sensors, actuators and other functional components in the process. This paper summarises some of the lessons learned by adopting such an approach and outlines promising directions for future work.

Crossref

Heriot Watt Pure

Nottingham Trent Institutional Repository (IRep)

Archivio della Ricerca - Università di Pisa

TECNALIA Publications

Knowledge engineering techniques for automated planning

Author: Shah Mohammad Munshi Shahin

Formulating knowledge for use in AI Planning engines is currently something of an ad hoc process, where the skills of knowledge engineers and the tools they use may significantly influence the quality of the resulting planning application. There is little in the way of guidelines or standard procedures, however, for knowledge engineers to use when formulating knowledge into planning domain languages such as PDDL. Also, there is little published research to inform engineers on which method and tools to use in order to effectively engineer a new planning domain model. This is of growing importance, as domain independent planning engines are now being used in a wide range of applications, with the consequence that operational problem encodings and domain models have to be developed in a standard language. In particular, at the difficult stage of domain knowledge formulation, changing a statement of the requirements into something formal (a PDDL domain model) is still somewhat of an ad hoc process, usually conducted by a team of AI experts using text editors. On the other hand, the use of tools such as itSIMPLE or GIPO, with which experts generate a high level diagrammatic description and automatically generate the domain model, has not yet been proven to be more effective than hand coding. The major contribution of this thesis is the evaluation of knowledge engineering tools and techniques involved in the formulation of knowledge. To support this, we introduce and encode a new planning domain called Road Traffic Accidents (RTA), and discuss a set of requirements that we have derived, in consultation with stakeholders and analysis of accident management manuals, for the planning part of the management task. We then use and evaluate three separate strategies for knowledge formulation, encoding domain models from a textual, structural description of requirements using (i) the traditional method of a PDDL expert and text editor, (ii) a leading planning GUI with built-in UML modelling tools, and (iii) an object-based notation inspired by formal methods. We evaluate these three approaches using process and product metrics. The results give insights into the strengths and weaknesses of the approaches, highlight lessons learned regarding knowledge encoding, and point to important lines of research for knowledge engineering for planning. In addition, we discuss a range of state-of-the-art modelling tools to find the types of features that the knowledge engineering tools possess. These features have also been used for evaluating the methods used. We benchmark our evaluation approach by comparing it with the method used in the previous International Competition for Knowledge Engineering for Planning & Scheduling (ICKEPS). We conclude by providing a set of guidelines for building future knowledge engineering tools.

University of Huddersfield Repository

On the connection of probabilistic model checking, planning, and learning for system verification

Author: Klauck Michaela
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022

This thesis presents approaches using techniques from the model checking, planning, and learning community to make systems more reliable and perspicuous. First, two heuristic search and dynamic programming algorithms are adapted to be able to check extremal reachability probabilities, expected accumulated rewards, and their bounded versions, on general Markov decision processes (MDPs). Thereby, the problem space originally solvable by these algorithms is enlarged considerably. Correctness and optimality proofs for the adapted algorithms are given, and in a comprehensive case study on established benchmarks it is shown that the implementation, called Modysh, is competitive with state-of-the-art model checkers and even outperforms them on very large state spaces. Second, Deep Statistical Model Checking (DSMC) is introduced, usable for quality assessment and learning pipeline analysis of systems incorporating trained decision-making agents, like neural networks (NNs). The idea of DSMC is to use statistical model checking to assess NNs resolving nondeterminism in systems modeled as MDPs. The versatility of DSMC is exemplified in a number of case studies on Racetrack, an MDP benchmark designed for this purpose, flexibly modeling the autonomous driving challenge. In a comprehensive scalability study it is demonstrated that DSMC is a lightweight technique tackling the complexity of NN analysis in combination with the state space explosion problem.
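The core idea of statistical model checking a fixed policy on an MDP, as the abstract describes it for DSMC, can be sketched as Monte Carlo estimation of a goal-reaching probability. The sketch below is an illustration under stated assumptions and not the Modysh or DSMC implementation: the corridor MDP, the constants `START`, `GOAL` and `PIT`, and the hand-written `always_forward` policy (standing in for a trained neural network) are all hypothetical. The sample-size helper uses the standard Hoeffding bound.

```python
import math
import random

# Hypothetical toy MDP: a corridor with a pit behind the start and a goal ahead.
START, GOAL, PIT = 0, 5, -1


def runs_needed(epsilon, delta):
    """Runs so the estimate is within epsilon with probability >= 1 - delta
    (Hoeffding bound for a Bernoulli mean)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def corridor_step(position, action, rng, slip):
    """Apply action (+1 or -1); with probability `slip` it is reversed."""
    if rng.random() < slip:
        action = -action
    return position + action


def estimate_goal_probability(policy, slip, runs, horizon=50, seed=0):
    """Estimate P(reach GOAL before PIT within `horizon` steps) under `policy`."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(runs):
        position = START
        steps = 0
        while steps < horizon and position not in (GOAL, PIT):
            position = corridor_step(position, policy(position), rng, slip)
            steps += 1
        if position == GOAL:
            successes += 1
    return successes / runs


def always_forward(position):
    """Stand-in for a learned policy: always try to move toward the goal."""
    return +1


estimate = estimate_goal_probability(always_forward, slip=0.3, runs=2000)
```

Because the policy resolves every nondeterministic choice, each run is a sample from a Markov chain, so the analysis needs only simulation and never builds the full state space.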

Universaar

Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder

Author: Hu Cong
Hu Yue
Qin Long
Yin Quanjun
Zeng Junjie
Publication venue: 'MDPI AG'
Publication date: 05/11/2018

In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning), to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean that (1) path planning is efficient enough to eliminate first-move lags, and (2) the resulting paths are collision-free, smooth, and satisfy the agents' kinematic constraints. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies which can generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG overcomes the sparse-reward and local-minima limitations of RL agents; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (e.g., unexpected obstacles appear), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can deal with uncertainties by exploiting its generalization ability to handle changes in environments. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL can work well on large-scale maps with relatively low action-switching frequencies and shorter path lengths, and SG-RL can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies.

Comment: 20 pages

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals