2 research outputs found

Sequential Empirical Coordination Under an Output Entropy Constraint

Authors: Raginsky Maxim, Shafieepoorfard Ehsan
Publication date: 11/06/2018

This paper considers the problem of sequential empirical coordination, where the objective is to achieve a given value of the expected uniform deviation between state-action empirical averages and statistical expectations under a given strategic probability measure, with respect to a given universal Glivenko-Cantelli class of test functions. A communication constraint is imposed on the Shannon entropy of the resulting action sequence. It is shown that the fundamental limit on the output entropy is given by the minimum of the mutual information between the state and the action processes under all strategic measures that have the same marginal state process as the target measure and approximate the target measure to desired accuracy with respect to the underlying Glivenko-Cantelli seminorm. The fundamental limit is shown to be asymptotically achievable by tree-structured codes.

Comment: 12 pages, double-column format, accepted to IEEE Transactions on Information Theory

arXiv.org e-Print Archive

Transfer-Entropy-Regularized Markov Decision Processes

Authors: Sandberg Henrik, Skoglund Mikael, Tanaka Takashi
Publication date: 27/05/2020

We consider the framework of transfer-entropy-regularized Markov Decision Processes (TERMDPs), in which the weighted sum of the classical state-dependent cost and the transfer entropy from the state random process to the control random process is minimized. Although TERMDPs are generally formulated as nonconvex optimization problems, we derive an analytical necessary optimality condition expressed as a finite set of nonlinear equations, based on which an iterative forward-backward computational procedure similar to the Arimoto-Blahut algorithm is proposed. It is shown that every limit point of the sequence generated by the proposed algorithm is a stationary point of the TERMDP. Applications of TERMDPs are discussed in the context of networked control systems theory and non-equilibrium thermodynamics. The proposed algorithm is applied to an information-constrained maze navigation problem, whereby we study how the price of information qualitatively alters the optimal decision policies.
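For readers unfamiliar with the Arimoto-Blahut algorithm that the abstract compares its procedure to, the following is a minimal sketch of the classical iteration for a single-stage rate-distortion problem. It is not the paper's forward-backward TERMDP procedure. The function name `blahut_arimoto` and the parameters `p_x`, `dist`, and `beta` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, n_iter=500, tol=1e-12):
    """Classical Blahut-Arimoto iteration (illustrative sketch only).

    Minimizes I(X;Y) + beta * E[d(X,Y)] over conditionals q(y|x).
    p_x  : (n,) source distribution (hypothetical input name)
    dist : (n, m) distortion matrix d(x, y)
    beta : trade-off weight between distortion and information
    Returns (q_y_given_x, rate_in_nats, expected_distortion).
    """
    p_x = np.asarray(p_x, dtype=float)
    dist = np.asarray(dist, dtype=float)
    m = dist.shape[1]
    q_y = np.full(m, 1.0 / m)  # start from a uniform output marginal
    for _ in range(n_iter):
        # Conditional update: q(y|x) proportional to q(y) * exp(-beta * d(x,y))
        w = q_y[None, :] * np.exp(-beta * dist)
        q_y_given_x = w / w.sum(axis=1, keepdims=True)
        # Marginal update: q(y) = sum_x p(x) q(y|x)
        new_q_y = p_x @ q_y_given_x
        converged = np.max(np.abs(new_q_y - q_y)) < tol
        q_y = new_q_y
        if converged:
            break
    joint = p_x[:, None] * q_y_given_x
    mask = joint > 0  # skip zero-probability terms in the log
    ratio = q_y_given_x[mask] / np.broadcast_to(q_y, joint.shape)[mask]
    rate = float(np.sum(joint[mask] * np.log(ratio)))
    distortion = float(np.sum(joint * dist))
    return q_y_given_x, rate, distortion
```

Larger values of `beta` penalize distortion more heavily and so yield a higher information rate; `beta = 0` makes information the only cost, and the output becomes independent of the input.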

arXiv.org e-Print Archive