Scientists often seek simplified representations of complex systems to
facilitate prediction and understanding. If the factors that make up a
representation allow us to make accurate predictions about a system, but
obscuring any subset of those factors destroys our predictive ability,
we say that the representation exhibits informational synergy. We argue that
synergy is an undesirable feature in learned representations and that
explicitly minimizing synergy can help disentangle the true factors of
variation underlying data. We explore different ways of quantifying synergy,
deriving new closed-form expressions in some cases, and then show how to modify
learning to produce representations that are minimally synergistic. We
introduce a benchmark task to disentangle separate characters from images of
words. We demonstrate that Minimally Synergistic (MinSyn) representations
correctly disentangle characters while methods relying on statistical
independence fail.

Comment: 8 pages, 4 figures, 55th Annual Allerton Conference on Communication, Control, and Computing, 2017
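
As a concrete illustration of the definition above (our own sketch, not an example from the paper), the canonical case of pure synergy is the XOR relation Y = X1 ⊕ X2 over independent fair bits: neither input alone carries any information about Y, yet together they determine it exactly, so hiding either factor destroys all predictive ability. The Python snippet below makes this quantitative by computing the relevant mutual informations; the `mutual_information` helper and the variable names are ours, chosen for the example.

```python
import itertools
import math

def mutual_information(joint):
    """I(A; B) in bits for a discrete joint distribution {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# Y = X1 XOR X2, with X1 and X2 independent fair coin flips.
joint_x1_y, joint_x2_y, joint_both_y = {}, {}, {}
for x1, x2 in itertools.product([0, 1], repeat=2):
    y, p = x1 ^ x2, 0.25
    joint_x1_y[(x1, y)] = joint_x1_y.get((x1, y), 0.0) + p
    joint_x2_y[(x2, y)] = joint_x2_y.get((x2, y), 0.0) + p
    joint_both_y[((x1, x2), y)] = p

print(mutual_information(joint_x1_y))    # 0.0 bits: X1 alone predicts nothing
print(mutual_information(joint_x2_y))    # 0.0 bits: X2 alone predicts nothing
print(mutual_information(joint_both_y))  # 1.0 bit: jointly they determine Y
```

Here I(X1; Y) = I(X2; Y) = 0 while I(X1, X2; Y) = 1 bit, which is exactly the all-or-nothing dependence the abstract identifies as undesirable in a learned representation.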