Concurrent decision making in Markov decision processes

Abstract

This dissertation investigates concurrent decision making and coordination in systems that can execute multiple actions simultaneously to perform tasks more efficiently. Concurrent decision making is a fundamental problem in many areas of robotics, control, and computer science, and in Artificial Intelligence in particular it is recognized as a formidable challenge. By concurrent decision making we refer to a class of problems that require agents to accomplish long-term goals by concurrently executing multiple activities. In general, the problem is difficult to solve because it requires learning and planning with a combinatorial set of interacting concurrent activities that have uncertain outcomes and compete for limited resources in the system. The dissertation presents a general framework for modeling the concurrent decision making problem based on semi-Markov decision processes (SMDPs). Our approach is based on a centralized control formalism, in which we assume that a central control mechanism initiates, executes, and monitors concurrent activities. This view also captures the type of concurrency that exists in single-agent domains, where a single agent is capable of performing multiple activities simultaneously by exploiting the degrees of freedom (DOF) in the system. We present a set of coordination mechanisms employed by our model for monitoring the execution and termination of concurrent activities. These coordination mechanisms incorporate various natural activity completion mechanisms based on the individual termination of each activity. We provide theoretical results that assert the correctness of the model semantics, which allows us to apply standard SMDP learning and planning techniques to the concurrent decision making problem.

SMDP solution methods, however, do not scale to concurrent decision making systems with many degrees of freedom. This is a classic example of the curse of dimensionality in the action space: the size of the set of concurrent activities grows exponentially as the system admits more degrees of freedom. To alleviate this problem, we develop a novel decision-theoretic framework motivated by the coarticulation phenomenon investigated in speech and motor control research. The key idea is that in many concurrent decision making problems, the overall objective can be viewed as the concurrent optimization of a set of interacting and possibly simpler subgoals that the agent has already acquired the skills to achieve. We show that applying coarticulation to systems with excess degrees of freedom generates concurrency naturally. We present a set of theoretical results that characterize the efficiency of concurrent decision making based on the coarticulation framework compared to the case in which the agent is allowed to execute activities only sequentially (i.e., no coarticulation). (Abstract shortened by UMI.)
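As a rough illustration of the action-space growth described above, the following Python sketch counts concurrent (joint) activities for a toy system. It is not taken from the dissertation: the function names `joint_activities` and `joint_count`, the activity labels, and the simplifying assumption that every degree of freedom independently selects exactly one activity are all hypothetical.

```python
from itertools import product


def joint_activities(activities_per_dof):
    """Enumerate every joint activity: one activity chosen per degree of freedom.

    Assumption (illustrative only): each degree of freedom independently
    picks exactly one activity from its own list.
    """
    return list(product(*activities_per_dof))


def joint_count(num_dof, activities_each):
    """Number of joint activities when each DOF has the same number of choices."""
    return activities_each ** num_dof


if __name__ == "__main__":
    # Hypothetical toy system with three degrees of freedom.
    dofs = [
        ["reach", "hold"],
        ["look-left", "look-right"],
        ["walk", "stop"],
    ]
    print(len(joint_activities(dofs)), "joint activities for 3 DOF")

    # Growth with four activities per degree of freedom.
    for k in (1, 2, 4, 8, 16):
        print(k, "DOF:", joint_count(k, 4), "joint activities")
```

Under these assumptions the count is the number of activities per degree of freedom raised to the number of degrees of freedom, so any method that enumerates joint activities explicitly becomes impractical after a handful of degrees of freedom.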
