The unification of low-level perception and high-level reasoning is a
long-standing problem in artificial intelligence, one that has the potential
not only to bring the areas of logic and learning closer together but also to
demonstrate how abstract concepts might emerge from sensory data. Precisely
because deep learning methods dominate perception-based learning, including
vision, speech, and language, there is a fast-growing literature on how to
integrate symbolic reasoning with deep learning. Broadly, these efforts fall
into three camps: those that define a logic whose formulas capture deep
learning, those that integrate symbolic constraints into deep learning, and
those that allow neural computation and symbolic reasoning to co-exist as
separate modules, so as to enjoy the strengths of both worlds. In this paper,
we identify
another dimension to this inquiry: what do the hidden layers really capture,
and how can we reason about them logically? In particular, we consider
autoencoders, which are widely used for dimensionality reduction, and inject a
symbolic generative framework over the feature layer. This allows us, among
other things, to generate example images for a class so as to get a sense of
what was learned. Moreover, the modular structure of the proposed model makes
it possible to learn relations over multiple images at a time, as well as to
handle noisy labels. Our empirical evaluations show the promise of this
inquiry.