Sparse coding is a proven principle for learning compact representations of
images. However, sparse coding by itself often leads to highly redundant
dictionaries. With images, this redundancy typically takes the form of similar
edge detectors replicated many times at various positions, scales, and
orientations.
An immediate consequence of this observation is that the estimation of the
dictionary components is not statistically efficient. We propose a factored
model in which factors of variation (e.g., position, scale, and orientation) are
untangled from the underlying Gabor-like filters. There is so much redundancy
in sparse codes for natural images that our model requires only a single
dictionary element (a Gabor-like edge detector) to outperform standard sparse
coding. Our model scales naturally to arbitrarily sized images while achieving
much greater statistical efficiency during learning. We validate this claim
with a number of experiments showing, in part, superior compression of
out-of-sample data using a sparse coding dictionary learned with only a single
image.
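
To make the redundancy picture concrete, the following is a minimal sketch (in Python/NumPy, not the authors' implementation; all function names and parameter values are illustrative assumptions) of how a single Gabor-like prototype, transformed over orientation, scale, and position, already yields a complete sparse-coding dictionary for image patches. The factored model described above differs in that the prototype and the factors of variation are learned from data rather than fixed by hand.

# Illustrative sketch only: structure and parameters are assumptions, not the paper's model.
import numpy as np

def gabor(size, sigma, theta, freq, phase=0.0):
    """Gabor-like edge detector sampled on a size x size grid (unit norm)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * freq * xr + phase)
    return g / np.linalg.norm(g)

def factored_dictionary(patch_size=16, kernel_size=9,
                        thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4),
                        scales=(1.0, 2.0), base_sigma=1.5, base_freq=0.3):
    """One Gabor-like prototype replicated over orientation, scale, and position."""
    atoms = []
    for theta in thetas:
        for s in scales:
            # Scaling the prototype: widen the envelope and lower the frequency together.
            k = gabor(kernel_size, base_sigma * s, theta, base_freq / s)
            for dy in range(patch_size - kernel_size + 1):
                for dx in range(patch_size - kernel_size + 1):
                    atom = np.zeros((patch_size, patch_size))
                    atom[dy:dy + kernel_size, dx:dx + kernel_size] = k
                    atoms.append(atom.ravel())
    return np.stack(atoms, axis=1)  # columns are unit-norm dictionary atoms

def matching_pursuit(D, x, n_nonzero=10):
    """Greedy sparse code of a vectorized patch x in dictionary D."""
    residual, code = x.astype(float).copy(), np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        c = D[:, k] @ residual
        code[k] += c
        residual = residual - c * D[:, k]
    return code, residual

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = factored_dictionary()
    x = rng.standard_normal(16 * 16)   # stand-in for a vectorized image patch
    code, residual = matching_pursuit(D, x)
    print("atoms:", D.shape[1],
          "nonzeros:", np.count_nonzero(code),
          "residual norm:", round(float(np.linalg.norm(residual)), 3))

Swapping the greedy matching pursuit for an L1 solver and learning the prototype from image patches, rather than fixing it, would bring this sketch closer to the kind of factored model the abstract describes.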