Four problems related to information divergence measures defined on finite
alphabets are considered. In three of these cases, we illustrate a contrast
that arises between the binary-alphabet and larger-alphabet settings.
This is surprising in some instances, since characterizations for the
larger-alphabet settings do not generalize their binary-alphabet counterparts.
Specifically, we show that f-divergences are not the unique decomposable
divergences on binary alphabets that satisfy the data processing inequality,
thereby clarifying claims that have previously appeared in the literature. We
also show that KL divergence is the unique Bregman divergence that is also an
f-divergence for any alphabet size. We further show that KL divergence is the
unique Bregman divergence that is invariant to statistically sufficient
transformations of the data, even when non-decomposable divergences are
considered. As with some of the other problems we consider, this result holds only when
the alphabet size is at least three.

Comment: to appear in IEEE Transactions on Information Theory
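
As a rough numerical illustration of the statement that KL divergence is both an f-divergence and a Bregman divergence, the sketch below evaluates it three ways on a ternary alphabet. It assumes the standard textbook definitions: the f-divergence generated by f(t) = t log t, and the Bregman divergence generated by negative Shannon entropy. The function names and the example distributions are hypothetical and chosen only for illustration. The check shows membership in both families for one pair of distributions; it says nothing about the uniqueness results.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats, for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def f_divergence(p, q, f):
    """f-divergence sum_i q_i * f(p_i / q_i), for strictly positive q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def bregman_neg_entropy(p, q):
    """Bregman divergence G(p) - G(q) - <grad G(q), p - q>,
    with generator G(x) = sum_i x_i log x_i (negative Shannon entropy),
    whose gradient has components log(q_i) + 1."""
    def G(x):
        return sum(xi * math.log(xi) for xi in x)
    inner = sum((math.log(qi) + 1.0) * (pi - qi) for pi, qi in zip(p, q))
    return G(p) - G(q) - inner

# Hypothetical example distributions on a three-letter alphabet.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

kl = kl_divergence(p, q)
as_f = f_divergence(p, q, lambda t: t * math.log(t))
as_bregman = bregman_neg_entropy(p, q)

# All three values should agree up to floating-point error.
print(kl, as_f, as_bregman)
```

The three quantities coincide because the linear term of the Bregman expansion reduces to the cross term in KL once both arguments sum to one.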