Rewards typically express desirabilities or preferences over a set of
alternatives. Here we propose that rewards can be defined for any probability
distribution based on three desiderata, namely that rewards should be
real-valued, additive and order-preserving, where the latter implies that more
probable events should also be more desirable. Our main result states that
rewards are then uniquely determined by the negative information content. To
analyze stochastic processes, we define the utility of a realization as its
reward rate. Under this interpretation, we show that the expected utility of a
stochastic process is its negative entropy rate. Furthermore, we apply our
results to analyze agent-environment interactions. We show that the expected
utility that will actually be achieved by the agent is given by the negative
cross-entropy from the input-output (I/O) distribution of the coupled
interaction system and the agent's I/O distribution. Thus, our results allow
for an information-theoretic interpretation of the notion of utility and the
characterization of agent-environment interactions in terms of entropy
dynamics.Comment: AGI-2010. 6 pages, 1 figur