Distributed Synthesis in Continuous Time
We introduce a formalism modelling communication of distributed agents
strictly in continuous time. Within this framework, we study the problem of
synthesising local strategies for individual agents such that a specified set
of goal states is reached, or reached with at least a given probability. The
flow of time is modelled explicitly based on continuous-time randomness, with
two natural implications: First, the non-determinism stemming from interleaving
disappears. Second, when we restrict to a subclass of non-urgent models, the
quantitative value problem for two players can be solved in EXPTIME. Indeed,
the explicit continuous time enables players to communicate their states by
delaying synchronisation (which is unrestricted for non-urgent models). In
general, the problems are already undecidable for two players in the
quantitative case and for three players in the qualitative case. The qualitative
undecidability is shown by a reduction to decentralized POMDPs, for which we
provide the strongest (and rather surprising) undecidability result so far.
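The key mechanism, communicating state through the timing of an otherwise unrestricted synchronisation, can be pictured with a toy sketch; the delay-encoding scheme and all names below are hypothetical illustrations of the idea, not the paper's construction.

```python
def encode_delay(local_state: int, unit: float = 1.0) -> float:
    """An agent waits (state + 1) time units before offering the handshake."""
    return (local_state + 1) * unit

def decode_delay(observed_delay: float, unit: float = 1.0) -> int:
    """The partner recovers that state from when the synchronisation occurred."""
    return round(observed_delay / unit) - 1

# Agent A is in local state 3, so it delays the synchronisation by 4 time
# units; agent B observes the delay and thereby learns A's state.
assert decode_delay(encode_delay(3)) == 3
```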
Sensor Scheduling for Optimal Observability Using Estimation Entropy
We consider sensor scheduling as the optimal observability problem for
partially observable Markov decision processes (POMDPs). This model fits cases
where a Markov process is observed either by a single sensor that needs to be
dynamically adjusted, or by a set of sensors that are selected one at a time
so as to maximize the information acquired from the process. As in
conventional POMDP problems, the control action is based on all past
measurements; here, however, the action does not control the state process,
which is autonomous, but influences how that process is measured. This POMDP
is a controlled version of the hidden Markov process, and
we show that its optimal observability problem can be formulated as an average
cost Markov decision process (MDP) scheduling problem. In this problem, a
policy is a rule for selecting sensors or adjusting the measuring device based
on the measurement history. Given a policy, we can evaluate the estimation
entropy of the joint state-measurement process, which inversely measures the
observability of the state process under that policy. Taking estimation entropy
as the cost of a policy, we show that finding an optimal policy is equivalent
to an average cost MDP scheduling problem whose cost function is the entropy
function over the belief space. This allows the policy iteration algorithm to
be applied to find the policy achieving minimum estimation entropy, and thus
optimal observability.
Comment: 5 pages, submitted to the 2007 IEEE PerCom/PerSeNS conference
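Since the abstract's pipeline, belief filtering, estimation entropy as a stage cost, and sensor selection over the belief space, is concrete enough to sketch, a small numerical example follows; the two-state model, the matrices and the greedy one-step rule are illustrative assumptions (the paper itself applies policy iteration to the average cost MDP).

```python
import numpy as np

# Illustrative two-state model; matrices are assumptions, not the paper's model.
T = np.array([[0.9, 0.1],                      # autonomous state process
              [0.2, 0.8]])
O = {                                          # one observation model per sensor
    "sensor_a": np.array([[0.95, 0.05],
                          [0.40, 0.60]]),
    "sensor_b": np.array([[0.60, 0.40],
                          [0.05, 0.95]]),
}

def belief_update(b, sensor, obs):
    """Bayes filter step: predict with T, correct with the chosen sensor's likelihood."""
    pred = b @ T
    post = pred * O[sensor][:, obs]
    return post / post.sum()

def entropy(b):
    """Estimation entropy of a belief (bits); lower means better observability."""
    nz = b[b > 0]
    return -float(np.sum(nz * np.log2(nz)))

def greedy_sensor(b):
    """One-step lookahead: pick the sensor minimising expected posterior entropy."""
    pred = b @ T
    costs = {
        s: sum(float(pred @ O[s][:, o]) * entropy(belief_update(b, s, o))
               for o in range(O[s].shape[1]))
        for s in O
    }
    return min(costs, key=costs.get)

b = np.array([0.5, 0.5])
print(greedy_sensor(b), entropy(b))
```

The one-step greedy rule here only stands in for the average cost criterion; policy iteration over the belief-space MDP, as in the abstract, would optimise the long-run average of the same entropy cost.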
Severity-sensitive norm-governed multi-agent planning
This research was funded by Selex ES. The software developed during this research, including the norm analysis and planning algorithms, the simulator, and the harbour protection scenario used during evaluation, is freely available from doi:10.5258/SOTON/D0139.
Technical Report: Distribution Temporal Logic: Combining Correctness with Quality of Estimation
We present a new temporal logic called Distribution Temporal Logic (DTL)
defined over predicates of belief states and hidden states of partially
observable systems. DTL can express properties involving uncertainty and
likelihood that cannot be described by existing logics. A co-safe formulation
of DTL is defined and algorithmic procedures are given for monitoring
executions of a partially observable Markov decision process with respect to
such formulae. A simulation case study of a rescue robotics application
outlines our approach.
Comment: Extended version of "Distribution Temporal Logic: Combining
Correctness with Quality of Estimation", to appear in IEEE CDC 201
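To make the monitoring procedure concrete, here is a hedged sketch of checking one co-safe-style property over the belief states produced while executing a POMDP; the specific property ("keep the belief entropy below H_MAX until the goal set carries probability at least P_GOAL"), the thresholds and all identifiers are illustrative assumptions, not DTL's actual syntax or the paper's algorithm.

```python
import numpy as np

# Illustrative thresholds and goal set; not DTL syntax, only the monitoring idea.
H_MAX = 1.5      # bound on belief entropy (quality of estimation, in bits)
P_GOAL = 0.9     # required belief mass on the goal states
GOAL = {2}       # indices of goal states in the POMDP

def entropy(b):
    nz = b[b > 0]
    return -float(np.sum(nz * np.log2(nz)))

def monitor(belief_trace):
    """Check 'entropy stays below H_MAX until the goal set is believed with
    probability >= P_GOAL' along a finite execution prefix.
    Returns True (satisfied), False (violated) or None (verdict still open)."""
    for b in belief_trace:
        if float(sum(b[s] for s in GOAL)) >= P_GOAL:
            return True
        if entropy(b) > H_MAX:
            return False
    return None

trace = [np.array([0.6, 0.3, 0.1]),     # entropy below the bound, goal unlikely
         np.array([0.1, 0.8, 0.1]),     # still estimable, goal not yet likely
         np.array([0.02, 0.03, 0.95])]  # goal believed with probability 0.95
print(monitor(trace))                   # True: the co-safe obligation is met
```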
Stochastic Shortest Path with Energy Constraints in POMDPs
We consider partially observable Markov decision processes (POMDPs) with a
set of target states and positive integer costs associated with every
transition. The traditional optimization objective (stochastic shortest path)
asks to minimize the expected total cost until the target set is reached. We
extend the traditional framework of POMDPs to model energy consumption, which
represents a hard constraint. The energy level may increase or decrease with
transitions, and the hard constraint requires that it remain positive at every
step until the target is reached. First, we present a novel algorithm for
solving POMDPs with energy levels, building on existing POMDP solvers and
using real-time dynamic programming (RTDP) as its main method. Our second
contribution concerns policy representation. For larger POMDP instances, the
policies computed by existing solvers are too large to be understandable. We
present an automated procedure, based on machine learning techniques, that
extracts the important decisions of a policy, allowing us to compute succinct,
human-readable policies. Finally, we show experimentally that our algorithm
performs well and computes succinct policies on a number of POMDP instances
from the literature, naturally enhanced with energy levels.
Comment: Technical report accompanying a paper published in the proceedings of AAMAS 201
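A hedged sketch of the energy-augmented search state may help picture the extension: the belief is paired with an integer energy level, and any successor whose energy would drop to zero or below is pruned. The interface and model arrays below are assumptions for illustration, not the authors' RTDP-based solver.

```python
import numpy as np

def successors(belief, energy, action, T, O, energy_delta, cost):
    """Yield (next_belief, next_energy, step_cost, obs_prob) for one action,
    pruning branches that would violate the positive-energy hard constraint.
    T, O, energy_delta and cost are hypothetical per-action model arrays."""
    next_energy = energy + energy_delta[action]   # energy may increase or decrease
    if next_energy <= 0:
        return                                    # hard constraint: prune this action
    pred = belief @ T[action]
    for o in range(O[action].shape[1]):
        p_obs = float(pred @ O[action][:, o])
        if p_obs == 0.0:
            continue
        post = pred * O[action][:, o]
        yield post / post.sum(), next_energy, cost[action], p_obs
```

An RTDP-style trial would sample an observation branch with probability obs_prob and back up expected costs over these successors; fitting a small decision tree to the resulting (belief, energy) to action pairs is one way to obtain the kind of succinct, human-readable policy the abstract mentions.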