A Projective Simulation Scheme for Partially-Observable Multi-Agent Systems
We introduce a form of partial observability into the projective simulation
(PS) learning method by adding a belief projection operator and an
observability parameter to the original framework of the PS model. We provide
theoretical formulations, network representations, and situated scenarios
derived from the invasion toy problem as a starting point for some multi-agent
PS models.
Comment: 28 pages, 21 figures
Quantum partially observable Markov decision processes
We present quantum observable Markov decision processes (QOMDPs), the quantum analogs of partially observable Markov decision processes (POMDPs). In a QOMDP, an agent acts in a world whose state is represented as a quantum state, and the agent can choose a superoperator to apply. This is similar to the POMDP belief state, which is a probability distribution over world states and evolves via a stochastic matrix. We show that deciding whether a policy of at least a certain value exists has the same complexity for QOMDPs as for POMDPs in the polynomial-horizon and infinite-horizon cases. However, we also prove that the existence of a policy that can reach a goal state is decidable for goal POMDPs and undecidable for goal QOMDPs.
Funding: National Science Foundation (U.S.) Grants 0844626 and 1122374; National Science Foundation (U.S.) Waterman Award.
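The analogy the abstract draws can be made concrete: a POMDP belief vector evolves by a stochastic matrix and an observation reweighting, while the QOMDP "belief" is a density matrix updated by the Kraus operators of the chosen superoperator. A minimal sketch (the matrices below are toy examples, not from the paper):

```python
import numpy as np

# POMDP side: belief b is a probability vector; T is the stochastic
# transition matrix for the chosen action, o the observation likelihoods.
def pomdp_belief_update(b, T, o):
    b_pred = T @ b                 # predict: stochastic-matrix evolution
    b_post = o * b_pred            # correct: weight by observation likelihood
    return b_post / b_post.sum()   # renormalise

# QOMDP side: the state itself is the belief (a density matrix rho); an
# action is a superoperator given by Kraus operators {K_i}, and observing
# outcome i maps rho to K_i rho K_i^dag / tr(K_i rho K_i^dag).
def qomdp_update(rho, kraus, outcome):
    K = kraus[outcome]
    post = K @ rho @ K.conj().T
    return post / np.trace(post).real

# Toy check: a qubit prepared in |+><+| and measured in the computational
# basis collapses to |0><0| on outcome 0.
K0 = np.array([[1.0, 0.0], [0.0, 0.0]])   # project onto |0>
K1 = np.array([[0.0, 0.0], [0.0, 1.0]])   # project onto |1>
rho = np.array([[0.5, 0.5], [0.5, 0.5]])  # |+><+|
rho_post = qomdp_update(rho, [K0, K1], 0)
```

The undecidability gap the paper proves lives entirely in this difference of state spaces: probability simplices versus density matrices evolving under superoperators.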
Dynamics of Social Networks: Multi-agent Information Fusion, Anticipatory Decision Making and Polling
This paper surveys mathematical models, structural results and algorithms in
controlled sensing with social learning in social networks.
Part 1, namely Bayesian Social Learning with Controlled Sensing, addresses the
following questions: How does risk averse behavior in social learning affect
quickest change detection? How can information fusion be priced? How is the
convergence rate of state estimation affected by social learning? The aim is to
develop and extend structural results in stochastic control and Bayesian
estimation to answer these questions. Such structural results yield fundamental
bounds on the optimal performance, give insight into what parameters affect the
optimal policies, and yield computationally efficient algorithms.
Part 2, namely, Multi-agent Information Fusion with Behavioral Economics
Constraints generalizes Part 1. The agents exhibit sophisticated decision
making in a behavioral economics sense; namely the agents make anticipatory
decisions (thus the decision strategies are time inconsistent and interpreted
as subgame Bayesian Nash equilibria).
Part 3, namely Interactive Sensing in Large Networks, addresses the
following questions: How to track the degree distribution of an infinite random
graph with dynamics (via a stochastic approximation on a Hilbert space)? How
can the infected degree distribution of a Markov modulated power law network
and its mean field dynamics be tracked via Bayesian filtering given incomplete
information obtained by sampling the network? We also briefly discuss how the
glass ceiling effect emerges in social networks.
Part 4, namely Efficient Network Polling, deals with polling in large
scale social networks. In such networks, only a fraction of nodes can be polled
to determine their decisions. Which nodes should be polled to achieve a
statistically accurate estimate?
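The quickest-change-detection question in Part 1 builds on a classical Bayesian filtering recursion that is easy to state in code. The sketch below is the standard Shiryaev posterior for a geometrically distributed change point, not the survey's risk-averse or social-learning variant; the densities and threshold in the usage are illustrative.

```python
import math

def shiryaev_filter(ys, theta, f0, f1, p0=0.0):
    """Posterior probability that a change has occurred, per observation.

    The change time is geometric with parameter theta; f0 and f1 are the
    pre- and post-change observation densities (any common normalising
    constant cancels). Classical recursion:
        predict:  prior = p + (1 - p) * theta
        correct:  p = prior * f1(y) / (prior * f1(y) + (1 - prior) * f0(y))
    Quickest detection declares a change once p crosses a threshold.
    """
    p = p0
    out = []
    for y in ys:
        prior = p + (1.0 - p) * theta
        num = prior * f1(y)
        p = num / (num + (1.0 - prior) * f0(y))
        out.append(p)
    return out

# Illustrative densities: unit-variance Gaussians with means 0 (pre-change)
# and 2 (post-change); normalising constants omitted since they cancel.
f0 = lambda y: math.exp(-y * y / 2.0)
f1 = lambda y: math.exp(-((y - 2.0) ** 2) / 2.0)
posterior = shiryaev_filter([2.0] * 20, 0.05, f0, f1)
```

The survey's structural results concern how social learning (agents observing only each other's actions) distorts this posterior and slows its convergence.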
Improving quantum state detection with adaptive sequential observations
For many quantum systems intended for information processing, one detects the
logical state of a qubit by integrating a continuously observed quantity over
time. For example, ion and atom qubits are typically measured by driving a
cycling transition and counting the number of photons observed from the
resulting fluorescence. Instead of recording only the total observed count in a
fixed time interval, one can observe the photon arrival times and get a state
detection advantage by using the temporal structure in a model such as a Hidden
Markov Model. We study what further advantage may be achieved by applying
pulses to adaptively transform the state during the observation. We give a
three-state example where adaptively chosen transformations yield a clear
advantage, and we compare performance on an ion example, where we see
improvements in some regimes. We provide a software package that can be used
for exploration of temporally resolved strategies with and without adaptively
chosen transformations.
Comment: Submitted for publication in Quantum Science and Technology. 26 pages, 8 figures. Corrected typos in appendix, updated acknowledgement
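The temporally resolved strategy the abstract describes amounts to running a hidden Markov model forward recursion over time-binned photon counts. The sketch below is a generic two-state version (bright/dark with Poisson emissions and a small bright-to-dark leak); the transition probabilities and count rates are illustrative, not fitted to any ion species, and the adaptive pulses studied in the paper are not modelled.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def detect_initial_state(counts, T, rates, prior=(0.5, 0.5)):
    """Classify a qubit's initial state from time-binned photon counts.

    Hidden states: 0 = 'bright', 1 = 'dark'. T[i][j] is the per-bin
    probability of moving from state i to j (e.g. the bright state decaying
    mid-detection); emissions are Poisson with mean rates[state] per bin.
    For each hypothesised initial state the HMM forward recursion gives the
    sequence likelihood (transition applied before each bin's emission),
    then Bayes' rule gives P(initial state = bright | counts).
    """
    likelihoods = []
    for s0 in (0, 1):
        alpha = [0.0, 0.0]
        alpha[s0] = 1.0
        for k in counts:
            alpha = [sum(alpha[i] * T[i][j] for i in (0, 1))
                     * poisson_pmf(k, rates[j])
                     for j in (0, 1)]
        likelihoods.append(sum(alpha))
    num = prior[0] * likelihoods[0]
    return num / (num + prior[1] * likelihoods[1])

# Illustrative parameters: 2% per-bin bright->dark leak, mean counts of
# 10 (bright) vs 0.2 (dark) per bin.
T = [[0.98, 0.02], [0.0, 1.0]]
p_bright = detect_initial_state([9, 11, 8], T, (10.0, 0.2))
```

Using the full arrival-time structure this way, rather than a single thresholded total count, is what gives the detection advantage the abstract refers to; the paper's contribution is adding adaptively chosen transformations on top of it.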
Contributions on complexity bounds for Deterministic Partially Observed Markov Decision Process
Markov Decision Processes (Mdps) form a versatile framework used to model a
wide range of optimization problems. The Mdp model consists of sets of states,
actions, time steps, rewards, and probability transitions. When in a given
state and at a given time, the decision maker's action generates a reward and
determines the state at the next time step according to the probability
transition function. However, Mdps assume that the decision maker knows the
state of the controlled dynamical system. Hence, when one needs to optimize
controlled dynamical systems under partial observation, one often turns toward
the formalism of Partially Observed Markov Decision Processes (Pomdp). Pomdps
are often intractable in the general case, as Dynamic Programming suffers from
the curse of dimensionality. Instead of focusing on general Pomdps, we
present a subclass where the transition and observation mappings are
deterministic: Deterministic Partially Observed Markov Decision Processes
(Det-Pomdp). That subclass of problems has been studied by Littman (1996) and
Bonet (2009). It was first considered as a limit case of Pomdps by Littman,
mainly to illustrate the complexity of Pomdps when considering as few
sources of uncertainty as possible. In this paper, we improve on Littman's
complexity bounds. We then introduce and study an even simpler class, Separated
Det-Pomdps, and give new complexity bounds for this class. This new class
of problems uses a property of the dynamics and observations to push back the
curse of dimensionality.
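The structure that makes Det-Pomdps tractable is easy to exhibit: with deterministic dynamics and observations, the belief state collapses from a probability distribution to the set of states consistent with the history, so only finitely many beliefs (subsets of the state space) are reachable. A generic sketch, with `f` and `h` standing in for a concrete model:

```python
def det_pomdp_belief_update(belief, action, obs, f, h):
    """One belief step in a deterministic POMDP (Det-Pomdp).

    With deterministic dynamics x' = f(x, a) and deterministic observation
    o = h(x'), the belief is just the SET of states consistent with the
    history: push every candidate state through the dynamics, then keep
    those matching the observation. At most 2^|X| beliefs are reachable,
    instead of a continuum -- the structure behind the complexity bounds.
    """
    successors = {f(x, action) for x in belief}
    return {x for x in successors if h(x) == obs}

# Toy model: four states on a ring, dynamics step forward, parity observed.
f = lambda x, a: (x + 1) % 4
h = lambda x: x % 2
b1 = det_pomdp_belief_update({0, 1, 2, 3}, None, 1, f, h)  # odd observed
```

Each observation can only shrink (never split probabilistically) the candidate set, which is why bounding the number of reachable sets bounds the complexity of the problem.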