36 research outputs found

    Planning in POMDPs Using Multiplicity Automata

    Get PDF
    Planning and learning in Partially Observable MDPs (POMDPs) are among the most challenging tasks in both the AI and Operation Research communities. Although solutions to these problems are intractable in general, there might be special cases, such as structured POMDPs, which can be solved efficiently. A natural and possibly efficient way to represent a POMDP is through the predictive state representation (PSR) - a representation which recently has been receiving increasing attention. In this work, we relate POMDPs to multiplicity automata- showing that POMDPs can be represented by multiplicity automata with no increase in the representation size. Furthermore, we show that the size of the multiplicity automaton is equal to the rank of the predictive state representation. Therefore, we relate both the predictive state representation and POMDPs to the well-founded multiplicity automata literature. Based on the multiplicity automata representation, we provide a planning algorithm which is exponential only in the multiplicity automata rank rather than the number of states of the POMDP. As a result, whenever the predictive state representation is logarithmic in the standard POMDP representation, our planning algorithm is efficient.Comment: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005

    Minimization via duality

    Get PDF
    We show how to use duality theory to construct minimized versions of a wide class of automata. We work out three cases in detail: (a variant of) ordinary automata, weighted automata and probabilistic automata. The basic idea is that instead of constructing a maximal quotient we go to the dual and look for a minimal subalgebra and then return to the original category. Duality ensures that the minimal subobject becomes the maximally quotiented object

    On knowledge representation and decision making under uncertainty

    Get PDF
    Designing systems with the ability to make optimal decisions under uncertainty is one of the goals of artificial intelligence. However, in many applications the design of optimal planners is complicated due to imprecise inputs and uncertain outputs resulting from stochastic dynamics. Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework to model these kinds of problems. However, the high computational demand of solution methods for POMDPs is a drawback for applying them in practice.In this thesis, we present a two-fold approach for improving the tractability of POMDP planning. First, we focus on designing good heuristics for POMDP approximation algorithms. We aim to scale up the efficiency of a class of POMDP approximations called point-based planning methods by designing a good planning space. We study the effect of three properties of reachable belief state points that may influence the performance of point-based approximation methods. Second, we investigate approaches to designing good controllers using an alternative representation of systems with partial observability called Predictive State Representation (PSR). This part of the thesis advocates the usefulness and practicality of PSRs in planning under uncertainty. We also attempt to move some useful characteristics of the PSR model, which has a predictive view of the world, to the POMDP model, which has a probabilistic view of the hidden states of the world. We propose a planning algorithm motivated by the connections between the two models

    On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes

    Get PDF
    We consider infinite-horizon stationary γ\gamma-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error ϵ\epsilon at each iteration, it is well-known that one can compute stationary policies that are 2γ(1γ)2ϵ\frac{2\gamma}{(1-\gamma)^2}\epsilon-optimal. After arguing that this guarantee is tight, we develop variations of Value and Policy Iteration for computing non-stationary policies that can be up to 2γ1γϵ\frac{2\gamma}{1-\gamma}\epsilon-optimal, which constitutes a significant improvement in the usual situation when γ\gamma is close to 1. Surprisingly, this shows that the problem of "computing near-optimal non-stationary policies" is much simpler than that of "computing near-optimal stationary policies"

    A Spectral Algorithm for Learning Hidden Markov Models

    Get PDF
    Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and practitioners typically resort to search heuristics which suffer from the usual local optima issues. We prove that under a natural separation condition (bounds on the smallest singular value of the HMM parameters), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations---it implicitly depends on this quantity through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing where the space of observation is sometimes the words in a language. The algorithm is also simple, employing only a singular value decomposition and matrix multiplications.Comment: Published in JCSS Special Issue "Learning Theory 2009

    On Regularity of Unary Probabilistic Automata

    Get PDF
    The quantitative verification of Probabilistic Automata (PA) is undecidable in general. Unary PA are a simpler model where the choice of action is fixed. Still, the quantitative verification problem is open and known to be as hard as Skolem\u27s problem, a problem on linear recurrence sequences, whose decidability is open for at least 40 years. In this paper, we approach this problem by studying the languages generated by unary PAs (as defined below), whose regularity would entail the decidability of quantitative verification. Given an initial distribution, we represent the trajectory of a unary PA over time as an infinite word over a finite alphabet, where the n-th letter represents a probability range after n steps. We extend this to a language of trajectories (a set of words), one trajectory for each initial distribution from a (possibly infinite) set. We show that if the eigenvalues of the transition matrix associated with the unary PA are all distinct positive real numbers, then the language is effectively regular. Further, we show that this result is at the boundary of regularity, as non-regular languages can be generated when the restrictions are even slightly relaxed. The regular representation of the language allows us to reason about more general properties, e.g., robustness of a regular property in a neighbourhood around a given distribution

    Multi-agent persistent surveillance under temporal logic constraints

    Full text link
    This thesis proposes algorithms for the deployment of multiple autonomous agents for persistent surveillance missions requiring repeated, periodic visits to regions of interest. Such problems arise in a variety of domains, such as monitoring ocean conditions like temperature and algae content, performing crowd security during public events, tracking wildlife in remote or dangerous areas, or watching traffic patterns and road conditions. Using robots for surveillance is an attractive solution to scenarios in which fixed sensors are not sufficient to maintain situational awareness. Multi-agent solutions are particularly promising, because they allow for improved spatial and temporal resolution of sensor information. In this work, we consider persistent monitoring by teams of agents that are tasked with satisfying missions specified using temporal logic formulas. Such formulas allow rich, complex tasks to be specified, such as "visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles." The agents must determine how to satisfy such missions according to fuel, communication, and other constraints. Such problems are inherently difficult due to the typically infinite horizon, state space explosion from planning for multiple agents, communication constraints, and other issues. Therefore, computing an optimal solution to these problems is often infeasible. Instead, a balance must be struck between computational complexity and optimality. This thesis describes solution methods for two main classes of multi-agent persistent surveillance problems. First, it considers the class of problems in which persistent surveillance goals are captured entirely by TL constraints. Such problems require agents to repeatedly visit a set of surveillance regions in order to satisfy their mission. We present results for agents solving such missions with charging constraints, with noisy observations, and in the presence of adversaries. The second class of problems include an additional optimality criterion, such as minimizing uncertainty about the location of a target or maximizing sensor information among the team of agents. We present solution methods and results for such missions with a variety of optimality criteria based on information metrics. For both classes of problems, the proposed algorithms are implemented and evaluated via simulation, experiments with robots in a motion capture environment, or both

    Tools and Algorithms for the Construction and Analysis of Systems

    Get PDF
    This open access book constitutes the proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2022, which was held during April 2-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 46 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 159 submissions. The proceedings also contain 16 tool papers of the affiliated competition SV-Comp and 1 paper consisting of the competition report. TACAS is a forum for researchers, developers, and users interested in rigorously based tools and algorithms for the construction and analysis of systems. The conference aims to bridge the gaps between different communities with this common interest and to support them in their quest to improve the utility, reliability, exibility, and efficiency of tools and algorithms for building computer-controlled systems

    Tools and Algorithms for the Construction and Analysis of Systems

    Get PDF
    This open access book constitutes the proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2022, which was held during April 2-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 46 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 159 submissions. The proceedings also contain 16 tool papers of the affiliated competition SV-Comp and 1 paper consisting of the competition report. TACAS is a forum for researchers, developers, and users interested in rigorously based tools and algorithms for the construction and analysis of systems. The conference aims to bridge the gaps between different communities with this common interest and to support them in their quest to improve the utility, reliability, exibility, and efficiency of tools and algorithms for building computer-controlled systems
    corecore