24 research outputs found
Resilience, reliability, and coordination in autonomous multi-agent systems
Acknowledgements: The research reported in this paper was funded and supported by various grants over the years: Robotics and AI in Nuclear (RAIN) Hub (EP/R026084/1); Future AI and Robotics for Space (FAIR-SPACE) Hub (EP/R026092/1); Offshore Robotics for Certification of Assets (ORCA) Hub (EP/R026173/1); the Royal Academy of Engineering under the Chair in Emerging Technologies scheme; Trustworthy Autonomous Systems "Verifiability Node" (EP/V026801); Scrutable Autonomous Systems (EP/J012084/1); Supporting Security Policy with Effective Digital Intervention (EP/P011829/1); The International Technology Alliance in Network and Information Sciences. Peer reviewed. Postprint.
PRP Rebooted: Advancing the State of the Art in FOND Planning
Fully Observable Non-Deterministic (FOND) planning is a variant of classical
symbolic planning in which actions are nondeterministic, with an action's
outcome known only upon execution. It is a popular planning paradigm with
applications ranging from robot planning to dialogue-agent design and reactive
synthesis. Over the last 20 years, a number of approaches to FOND planning have
emerged. In this work, we establish a new state of the art, following in the
footsteps of some of the most powerful FOND planners to date. Our planner, PR2,
decisively outperforms the four leading FOND planners, at times by a large
margin, in 17 of 18 domains that represent a comprehensive benchmark suite.
Ablation studies demonstrate the impact of various techniques we introduce,
with the largest improvement coming from our novel FOND-aware heuristic.
Comment: 13 pages, 4 figures, AAAI conference paper. Update: Fixed abstract and typo.
Goal reasoning for autonomous agents using automated planning
International Mention in the doctoral degree.
Automated planning deals with the task of finding a sequence of actions, namely
a plan, which achieves a goal from a given initial state. Most planning research
assumes that goals are provided by an external user, and that agents only have to find a
plan to achieve them. However, there exist many real-world domains where
agents should not only reason about their actions but also about their goals,
generating new ones or changing them according to the perceived environment.
In this thesis we aim at broadening the goal reasoning capabilities of planning-based
agents, both when acting in isolation and when operating in the same
environment as other agents.
In single-agent settings, we first explore a special type of planning task
where we aim at discovering states that fulfill certain cost-based requirements
with respect to a given set of goals. By computing these states, agents are able
to solve interesting tasks such as finding escape plans that move agents to safe
places, hiding their true goal from a potential observer, or anticipating dynamically arriving
goals. We also show how learning the environment's dynamics may help
agents to solve some of these tasks. Experimental results show that these states
can be quickly found in practice, making agents able to solve new planning
tasks and helping them in solving some existing ones.
In multi-agent settings, we study the automated generation of goals based on
other agents' behavior. We focus on competitive scenarios, where we are interested
in computing counterplans that prevent opponents from achieving their
goals. We frame these tasks as counterplanning, providing theoretical properties
of the counterplans that solve them. We also show how agents can benefit
from computing some of the states we propose in the single-agent setting to
anticipate their opponent's movements, thus increasing the odds of blocking
them. Experimental results show how counterplans can be found in different
environments ranging from competitive planning domains to real-time strategy
games.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Eva Onaindía de la Rivaherrera. Secretary: Ángel García Olaya. Member: Mark Robert
Receding Horizon Re-ordering of Multi-Agent Execution Schedules
The trajectory planning for a fleet of Automated Guided Vehicles (AGVs) on a
roadmap is commonly referred to as the Multi-Agent Path Finding (MAPF) problem,
the solution to which dictates each AGV's spatial and temporal location until
it reaches its goal without collision. When executing MAPF plans in dynamic
workspaces, AGVs can be frequently delayed, e.g., due to encounters with humans
or third-party vehicles. If the remaining AGVs keep following their
individual plans, synchrony of the fleet is lost and some AGVs may pass through
roadmap intersections in a different order than originally planned. Although
this could reduce the cumulative route completion time of the AGVs, generally,
a change in the original ordering can cause conflicts such as deadlocks. In
practice, synchrony is therefore often enforced by using a MAPF execution
policy employing, e.g., an Action Dependency Graph (ADG) to maintain ordering.
To safely re-order without introducing deadlocks, we present the concept of the
Switchable Action Dependency Graph (SADG). Using the SADG, we formulate a
comparatively low-dimensional Mixed-Integer Linear Program (MILP) that
repeatedly re-orders AGVs in a recursively feasible manner, thus maintaining
deadlock-free guarantees, while dynamically minimizing the cumulative route
completion time of all AGVs. Various simulations validate the efficiency of our
approach when compared to the original ADG method as well as robust MAPF
solution approaches.
Comment: IEEE Transactions on Robotics (T-Ro) preprint, 17 pages, 32 figures
Adaptive search techniques in AI planning and heuristic search
State-space search is a common approach to solve problems appearing in artificial intelligence and other subfields of computer science. In such problems, an agent must find a sequence of actions leading from an initial state to a goal state. However, the state spaces of practical applications are often too large to explore exhaustively. Hence, heuristic functions that estimate the distance to a goal state (such as straight-line distance for navigation tasks) are used to guide the search more effectively. Heuristic search is typically viewed as a static process. The heuristic function is assumed to be unchanged throughout the search, and its resulting values are directly used for guidance without applying any further reasoning to them. Yet critical aspects of the task may only be discovered during the search, e.g., regions of the state space where the heuristic does not yield reliable values. Our work here aims to make this process more dynamic, allowing the search to adapt to such observations. One form of adaptation that we consider is online refinement of the heuristic function. We design search algorithms that detect weaknesses in the heuristic, and address them with targeted refinement operations. If the heuristic converges to perfect estimates, this results in a secondary method of progress, causing search algorithms that are otherwise incomplete to eventually find a solution. We also consider settings that inherently require adaptation: In online replanning, a plan that is being executed must be amended for changes in the environment. Similarly, in real-time search, an agent must act under strict time constraints with limited information. The search algorithms we introduce in this work share a common pattern of online adaptation, allowing them to effectively react to challenges encountered during the search. We evaluate our contributions on a wide range of standard benchmarks. 
Our results show that the flexibility of these algorithms makes them more robust than traditional approaches, and they often yield substantial improvements over current state-of-the-art planners.
Funding: DFG grant 389792660 as part of TRR 248 – CPEC (see https://perspicuous-computing.science), and DFG grant HO 2169/5-1, "Critically Constrained Planning via Partial Delete Relaxation"
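To make the "static" baseline in this abstract concrete, the following is a minimal sketch of heuristic search with a fixed straight-line-distance heuristic on a grid. It is plain A*, not one of the adaptive algorithms the thesis introduces; the names `astar` and `straight_line` and the grid encoding (`#` for a blocked cell) are assumptions made for illustration.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]  # (row, column)


def straight_line(a: Point, b: Point) -> float:
    """Euclidean distance, used as a fixed (never refined) heuristic."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def astar(grid: List[str], start: Point, goal: Point) -> Optional[List[Point]]:
    """Return a shortest 4-connected path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    best_cost: Dict[Point, float] = {start: 0.0}
    parent: Dict[Point, Optional[Point]] = {start: None}
    # Heap entries are (f = g + h, g, cell).
    open_heap = [(straight_line(start, goal), 0.0, start)]
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path: List[Point] = []
            current: Optional[Point] = cell
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale entry: a cheaper route to this cell was found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_cost = cost + 1.0
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(
                    open_heap, (new_cost + straight_line(nxt, goal), new_cost, nxt)
                )
    return None
```

The heuristic values here are consumed directly and never revised during the search, which is the behaviour the abstract describes as the typical, static view of heuristic search.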
A Comparison of SAT Encodings for Acyclicity of Directed Graphs
Many practical applications require synthesizing directed graphs that satisfy the acyclicity constraint along with some side constraints. Several methods have been devised for encoding acyclicity of directed graphs into SAT, each of which is based on a cycle-detecting algorithm. The leaf-elimination encoding (LEE) repeatedly eliminates leaves from the graph, and judges the graph to be acyclic if the graph becomes empty at a certain time. The vertex-elimination encoding (VEE) exploits the property that the cyclicity of the resulting graph produced by the vertex-elimination operation entails the cyclicity of the original graph. While VEE is significantly smaller than the transitive-closure encoding for sparse graphs, it generates prohibitively large encodings for large dense graphs. This paper reports on a comparative study of four SAT encodings for acyclicity of directed graphs, namely, LEE using unary encoding for time variables (LEE-u), LEE using binary encoding for time variables (LEE-b), VEE, and a hybrid encoding which combines LEE-b and VEE. The results show that the hybrid encoding significantly outperforms the others.
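The cycle-detecting procedure that underlies the leaf-elimination encoding can be sketched directly as an algorithm. The code below is only the procedural check, not the SAT encoding itself; the function name and the convention that a leaf is a vertex with no outgoing edge to a remaining vertex are assumptions made for illustration.

```python
from typing import Dict, Hashable, Sequence, Set


def is_acyclic_by_leaf_elimination(graph: Dict[Hashable, Sequence[Hashable]]) -> bool:
    """Return True if the directed graph contains no cycle.

    `graph` maps each vertex to the vertices it points to. Vertices that
    only appear as targets are added automatically.
    """
    remaining: Set[Hashable] = set(graph)
    for successors in graph.values():
        remaining.update(successors)

    while remaining:
        # A leaf (assumed convention) has no edge into the remaining graph.
        leaves = {
            v
            for v in remaining
            if not any(s in remaining for s in graph.get(v, ()))
        }
        if not leaves:
            # Every remaining vertex still has a successor, so a cycle exists.
            return False
        remaining -= leaves
    # The graph became empty, so it is judged acyclic.
    return True
```

Each pass of the loop corresponds to one time step in the abstract's description: the graph is acyclic exactly when repeated elimination empties it.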
Smoke Test Planning using Answer Set Programming
Smoke testing is an important method to increase stability and reliability of hardware-depending systems. Due to concurrent access to the same physical resource and the impracticality of the use of virtualization, smoke testing requires some form of planning. In this paper, we propose to decompose test cases in terms of atomic actions consisting of preconditions and effects. We present a solution based on answer set programming with multi-shot solving that automatically generates short parallel test plans. Experiments suggest that the approach is feasible for non-inherently sequential test cases and scales up to thousands of test cases.
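The idea of decomposing test cases into atomic actions with preconditions and effects can be illustrated with a small sketch. This uses a simple greedy layering in Python rather than answer set programming, so it is not the paper's method and gives no guarantee of shortest plans. The `Action` class, its `resource` field, the add-only effects, and the example action names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: FrozenSet[str]
    effects: FrozenSet[str]  # facts added; deletions are not modelled here
    resource: str            # physical resource held exclusively while running


def greedy_parallel_plan(actions: Iterable[Action], initial_facts: Iterable[str]) -> List[List[str]]:
    """Group actions into steps; actions in one step may run in parallel."""
    facts = set(initial_facts)
    pending = list(actions)
    plan: List[List[str]] = []
    while pending:
        step: List[Action] = []
        busy = set()
        for action in pending:
            # Applicable if its preconditions held before this step began
            # and no other action in the step uses the same resource.
            if action.preconditions <= facts and action.resource not in busy:
                step.append(action)
                busy.add(action.resource)
        if not step:
            names = [a.name for a in pending]
            raise ValueError(f"no applicable action among {names}")
        for action in step:
            facts |= action.effects
        pending = [a for a in pending if a not in step]
        plan.append([a.name for a in step])
    return plan
```

For three hypothetical actions, `power_on` (board), `flash` (board, needs `powered`) and `ping` (nic, needs `powered`), the sketch yields two steps: `power_on` alone, then `flash` and `ping` together because they use different resources.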