Sequential Empirical Coordination Under an Output Entropy Constraint
This paper considers the problem of sequential empirical coordination, where
the objective is to achieve a given value of the expected uniform deviation
between state-action empirical averages and statistical expectations under a
given strategic probability measure, with respect to a given universal
Glivenko-Cantelli class of test functions. A communication constraint is
imposed on the Shannon entropy of the resulting action sequence. It is shown
that the fundamental limit on the output entropy is given by the minimum of the
mutual information between the state and the action processes under all
strategic measures that have the same marginal state process as the target
measure and approximate the target measure to desired accuracy with respect to
the underlying Glivenko-Cantelli seminorm. The fundamental limit is shown to
be asymptotically achievable by tree-structured codes.
Comment: 12 pages, double-column format, accepted to IEEE Transactions on
Information Theor
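The limit stated in this abstract can be written compactly. The notation is an assumption made here for illustration and is not taken from the abstract: P is the target strategic measure on the state and action processes (X, U), P_X is its state marginal, \|\cdot\|_{\mathcal{F}} is the Glivenko-Cantelli seminorm induced by the test-function class \mathcal{F}, and \Delta is the desired accuracy. Whether the mutual information is a per-symbol rate or a single-letter quantity depends on the paper's setup, which the abstract does not specify.

```latex
R(\Delta) \;=\; \min_{\substack{Q \,:\, Q_X = P_X \\ \|Q - P\|_{\mathcal{F}} \le \Delta}} I_Q(X; U)
```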
Transfer-Entropy-Regularized Markov Decision Processes
We consider the framework of transfer-entropy-regularized Markov decision
processes (TERMDPs), in which the weighted sum of the classical state-dependent
cost and the transfer entropy from the state random process to the control
random process is minimized. Although TERMDPs are generally formulated as
nonconvex optimization problems, we derive an analytical necessary optimality
condition expressed as a finite set of nonlinear equations, based on which an
iterative forward-backward computational procedure similar to the
Arimoto-Blahut algorithm is proposed. It is shown that every limit point of the
sequence generated by the proposed algorithm is a stationary point of the
TERMDP. Applications of TERMDPs are discussed in the context of networked
control systems theory and non-equilibrium thermodynamics. The proposed
algorithm is applied to an information-constrained maze navigation problem,
whereby we study how the price of information qualitatively alters the optimal
decision policies.
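For readers unfamiliar with the Arimoto-Blahut algorithm mentioned above, the following is a minimal sketch of the classical iteration for a static rate-distortion trade-off. It is not the forward-backward TERMDP procedure from the paper, whose details the abstract does not give; it only illustrates the alternating-update structure the abstract alludes to. The function name `blahut_arimoto`, its parameters, and the binary example are all hypothetical.

```python
import numpy as np


def blahut_arimoto(p_x, dist, beta, iters=200):
    """Classical Blahut-Arimoto iteration (illustrative sketch only).

    p_x  : source distribution over x, assumed strictly positive
    dist : distortion matrix d(x, y), shape (n_x, n_y)
    beta : nonnegative trade-off weight between rate and distortion
    Returns the conditional q(y|x), the rate in nats, and the distortion.
    """
    n_x, n_y = dist.shape
    r_y = np.full(n_y, 1.0 / n_y)  # output marginal, initialised uniform
    for _ in range(iters):
        # Step 1: best conditional q(y|x) for the current marginal r(y).
        q = r_y[None, :] * np.exp(-beta * dist)
        q /= q.sum(axis=1, keepdims=True)
        # Step 2: marginal induced by that conditional.
        r_y = p_x @ q
    joint = p_x[:, None] * q
    mask = joint > 0  # skip zero-probability entries in the log
    r_full = np.broadcast_to(r_y, q.shape)
    rate = float(np.sum(joint[mask] * np.log(q[mask] / r_full[mask])))
    distortion = float(np.sum(joint * dist))
    return q, rate, distortion


# Hypothetical example: fair binary source with Hamming distortion.
p_x = np.array([0.5, 0.5])
hamming = np.array([[0.0, 1.0], [1.0, 0.0]])
q, rate, distortion = blahut_arimoto(p_x, hamming, beta=2.0)
print(rate, distortion)
```

Increasing `beta` in this sketch penalises distortion more heavily, so the resulting conditional spends more rate. In the TERMDP setting the roles are analogous but not identical: the abstract describes a weighted sum of a state-dependent cost and a transfer-entropy term.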