26,711 research outputs found

    Reduction of Markov Chains using a Value-of-Information-Based Approach

    Full text link
    In this paper, we propose an approach to obtain reduced-order models of Markov chains. Our approach is composed of two information-theoretic processes. The first is a means of comparing pairs of stationary chains on different state spaces, which is done via the negative Kullback-Leibler divergence defined on a model joint space. Model reduction is achieved by solving a value-of-information criterion with respect to this divergence. Optimizing the criterion leads to a probabilistic partitioning of the states in the high-order Markov chain. A single free parameter that emerges through the optimization process dictates both the partition uncertainty and the number of state groups. We provide a data-driven means of choosing the `optimal' value of this free parameter, which sidesteps needing to a priori know the number of state groups in an arbitrary chain.Comment: Submitted to Entrop

    Optimal Kullback-Leibler Aggregation via Information Bottleneck

    Full text link
    In this paper, we present a method for reducing a regular, discrete-time Markov chain (DTMC) to another DTMC with a given, typically much smaller number of states. The cost of reduction is defined as the Kullback-Leibler divergence rate between a projection of the original process through a partition function and a DTMC on the correspondingly partitioned state space. Finding the reduced model with minimal cost is computationally expensive, as it requires an exhaustive search among all state space partitions, and an exact evaluation of the reduction cost for each candidate partition. Our approach deals with the latter problem by minimizing an upper bound on the reduction cost instead of minimizing the exact cost; The proposed upper bound is easy to compute and it is tight if the original chain is lumpable with respect to the partition. Then, we express the problem in the form of information bottleneck optimization, and propose using the agglomerative information bottleneck algorithm for searching a sub-optimal partition greedily, rather than exhaustively. The theory is illustrated with examples and one application scenario in the context of modeling bio-molecular interactions.Comment: 13 pages, 4 figure

    Compositional Performance Modelling with the TIPPtool

    Get PDF
    Stochastic process algebras have been proposed as compositional specification formalisms for performance models. In this paper, we describe a tool which aims at realising all beneficial aspects of compositional performance modelling, the TIPPtool. It incorporates methods for compositional specification as well as solution, based on state-of-the-art techniques, and wrapped in a user-friendly graphical front end. Apart from highlighting the general benefits of the tool, we also discuss some lessons learned during development and application of the TIPPtool. A non-trivial model of a real life communication system serves as a case study to illustrate benefits and limitations

    Extreme State Aggregation Beyond MDPs

    Full text link
    We consider a Reinforcement Learning setup where an agent interacts with an environment in observation-reward-action cycles without any (esp.\ MDP) assumptions on the environment. State aggregation and more generally feature reinforcement learning is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.Comment: 28 LaTeX pages. 8 Theorem

    Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

    Full text link
    Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices. The devices cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes are to make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool to develop adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs

    Estimation of Markov Chain via Rank-Constrained Likelihood

    Full text link
    This paper studies the estimation of low-rank Markov chains from empirical trajectories. We propose a non-convex estimator based on rank-constrained likelihood maximization. Statistical upper bounds are provided for the Kullback-Leiber divergence and the ℓ2\ell_2 risk between the estimator and the true transition matrix. The estimator reveals a compressed state space of the Markov chain. We also develop a novel DC (difference of convex function) programming algorithm to tackle the rank-constrained non-smooth optimization problem. Convergence results are established. Experiments show that the proposed estimator achieves better empirical performance than other popular approaches.Comment: Accepted at ICML 201
    • 

    corecore