Reduction of Markov Chains using a Value-of-Information-Based Approach
In this paper, we propose an approach to obtain reduced-order models of
Markov chains. Our approach is composed of two information-theoretic processes.
The first is a means of comparing pairs of stationary chains on different state
spaces, which is done via the negative Kullback-Leibler divergence defined on a
model joint space. Model reduction is achieved by solving a
value-of-information criterion with respect to this divergence. Optimizing the
criterion leads to a probabilistic partitioning of the states in the high-order
Markov chain. A single free parameter that emerges through the optimization
process dictates both the partition uncertainty and the number of state groups.
We provide a data-driven means of choosing the 'optimal' value of this free
parameter, which sidesteps the need to know the number of state groups
in an arbitrary chain a priori. Comment: Submitted to Entrop
Optimal Kullback-Leibler Aggregation via Information Bottleneck
In this paper, we present a method for reducing a regular, discrete-time
Markov chain (DTMC) to another DTMC with a given, typically much smaller number
of states. The cost of reduction is defined as the Kullback-Leibler divergence
rate between a projection of the original process through a partition function
and a DTMC on the correspondingly partitioned state space. Finding the reduced
model with minimal cost is computationally expensive, as it requires an
exhaustive search among all state space partitions, and an exact evaluation of
the reduction cost for each candidate partition. Our approach deals with the
latter problem by minimizing an upper bound on the reduction cost instead of
minimizing the exact cost. The proposed upper bound is easy to compute and it
is tight if the original chain is lumpable with respect to the partition. Then,
we express the problem in the form of information bottleneck optimization, and
propose using the agglomerative information bottleneck algorithm to search
greedily, rather than exhaustively, for a sub-optimal partition. The theory is
illustrated with examples and one application scenario in the context of
modeling bio-molecular interactions. Comment: 13 pages, 4 figure
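
The abstract above states that its upper bound is tight when the original chain is lumpable with respect to the partition. As a rough illustration of what that condition means (not the paper's algorithm), the following Python sketch checks strong lumpability: every state in a block must send the same total probability into each block. The function name `is_lumpable`, the tolerance, and the 3-state matrix are illustrative assumptions.

```python
from math import isclose


def is_lumpable(P, partition, tol=1e-9):
    """Check strong lumpability of a transition matrix w.r.t. a partition.

    P is a row-stochastic matrix given as a list of rows.
    partition is a list of blocks, each a list of state indices.
    (Both the name and the interface are hypothetical, for illustration.)
    """
    for block in partition:
        for target in partition:
            # Total probability of moving from each state in `block` into `target`.
            masses = [sum(P[i][j] for j in target) for i in block]
            # Lumpability requires these masses to agree within the block.
            if any(not isclose(m, masses[0], abs_tol=tol) for m in masses):
                return False
    return True


# Toy 3-state chain (made-up numbers).
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]

lumpable_split = is_lumpable(P, [[0], [1, 2]])
non_lumpable_split = is_lumpable(P, [[0, 1], [2]])
```

Here states 1 and 2 both move to state 0 with probability 0.2, so merging them preserves the Markov property, whereas states 0 and 1 enter state 2 with different probabilities (0.25 and 0.4), so that merge does not.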
Compositional Performance Modelling with the TIPPtool
Stochastic process algebras have been proposed as compositional specification formalisms for performance models. In this paper, we describe a tool which aims at realising all beneficial aspects of compositional performance modelling, the TIPPtool. It incorporates methods for compositional specification as well as solution, based on state-of-the-art techniques, and wrapped in a user-friendly graphical front end. Apart from highlighting the general benefits of the tool, we also discuss some lessons learned during development and application of the TIPPtool. A non-trivial model of a real-life communication system serves as a case study to illustrate benefits and limitations.
Extreme State Aggregation Beyond MDPs
We consider a Reinforcement Learning setup where an agent interacts with an
environment in observation-reward-action cycles without any (esp. MDP)
assumptions on the environment. State aggregation and, more generally, feature
reinforcement learning are concerned with mapping histories/raw-states to
reduced/aggregated states. The idea behind both is that the resulting reduced
process (approximately) forms a small stationary finite-state MDP, which can
then be efficiently solved or learnt. We considerably generalize existing
aggregation results by showing that even if the reduced process is not an MDP,
the (q-)value functions and (optimal) policies of an associated MDP with the same
state-space size solve the original problem, as long as the solution can
approximately be represented as a function of the reduced states. This implies
an upper bound on the required state space size that holds uniformly for all RL
problems. It may also explain why RL algorithms designed for MDPs sometimes
perform well beyond MDPs. Comment: 28 LaTeX pages. 8 Theorem
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
Estimation of Markov Chain via Rank-Constrained Likelihood
This paper studies the estimation of low-rank Markov chains from empirical
trajectories. We propose a non-convex estimator based on rank-constrained
likelihood maximization. Statistical upper bounds are provided for the
Kullback-Leibler divergence and the risk between the estimator and the
true transition matrix. The estimator reveals a compressed state space of the
Markov chain. We also develop a novel DC (difference of convex function)
programming algorithm to tackle the rank-constrained non-smooth optimization
problem. Convergence results are established. Experiments show that the
proposed estimator achieves better empirical performance than other popular
approaches. Comment: Accepted at ICML 201
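
For context on estimating a chain from empirical trajectories, the sketch below computes the plain, unconstrained maximum-likelihood estimate of a transition matrix by counting observed transitions. This is only the baseline that a rank-constrained estimator would refine, not the estimator proposed in the abstract above; the function name, the uniform fallback for unvisited states, and the toy trajectory are assumptions made for illustration.

```python
def empirical_transition_matrix(trajectory, n_states):
    """Unconstrained count-based estimate of a transition matrix.

    trajectory is a sequence of integer states in range(n_states).
    (Hypothetical helper; no rank constraint is imposed here.)
    """
    counts = [[0] * n_states for _ in range(n_states)]
    # Count each observed one-step transition s -> t.
    for s, t in zip(trajectory, trajectory[1:]):
        counts[s][t] += 1

    P_hat = []
    for row in counts:
        total = sum(row)
        if total:
            # Normalise counts into a probability distribution.
            P_hat.append([c / total for c in row])
        else:
            # Assumption: states never left get a uniform row.
            P_hat.append([1.0 / n_states] * n_states)
    return P_hat


# Toy trajectory over three states (made-up data).
trajectory = [0, 1, 1, 0, 2, 0, 1]
P_hat = empirical_transition_matrix(trajectory, 3)
```

Each row of `P_hat` is the observed frequency of next states given the current state, so the estimate has no built-in low-rank structure.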