We introduce a new approach to probabilistic unsupervised learning based on
the recognition-parametrised model (RPM): a normalised semi-parametric
hypothesis class for joint distributions over observed and latent variables.
Under the key assumption that observations are conditionally independent given
latents, the RPM combines parametric prior and observation-conditioned latent
distributions with non-parametric observation marginals. This approach leads to
a flexible learnt recognition model capturing latent dependence between
observations, without the need for an explicit, parametric generative model.
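Concretely (a sketch in our own notation, with $f_{\theta_j}$ the $j$-th recognition factor and $p_0$ the data marginal): for $J$ observations $x_1, \dots, x_J$ assumed conditionally independent given a latent $z$, the joint takes the form

\[
p_\theta(z, x_1, \dots, x_J) \;=\; p_\theta(z) \prod_{j=1}^{J} \frac{f_{\theta_j}(z \mid x_j)\, p_0(x_j)}{F_{\theta_j}(z)},
\qquad
F_{\theta_j}(z) \;=\; \int f_{\theta_j}(z \mid x_j)\, p_0(x_j)\, \mathrm{d}x_j .
\]

Each factor integrates to one over $x_j$, so the product with the parametric prior $p_\theta(z)$ is a normalised joint; taking $p_0$ to be the empirical distribution of the training data makes $F_{\theta_j}(z)$ a finite average over training points.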
The RPM admits exact maximum-likelihood learning for discrete latents, even for
powerful neural-network-based recognition. We develop effective approximations
applicable in the continuous-latent case.
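For a discrete latent, the objective above is computable in closed form. The following is a minimal sketch (illustrative names and architecture, not the paper's code) of exact maximum-likelihood learning with neural recognition factors in PyTorch:

    import math
    import torch
    import torch.nn as nn

    # Sketch of exact RPM maximum likelihood for a discrete latent z in {0..K-1}
    # with J conditionally independent observation streams and the empirical
    # training marginal as the non-parametric p0. All names are illustrative.

    class RecognitionNet(nn.Module):
        """Recognition factor f_j(z | x_j), returned as log-probabilities over K."""
        def __init__(self, x_dim, K, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, K))

        def forward(self, x):                       # x: (N, x_dim)
            return self.net(x).log_softmax(dim=-1)  # (N, K)

    def rpm_log_likelihood(xs, recognisers, prior_logits):
        """Exact log-likelihood, up to the parameter-free sum of log p0(x_j) terms.

        xs           : list of J tensors, each of shape (N, x_dim_j)
        recognisers  : list of J RecognitionNet modules
        prior_logits : (K,) learnable logits of the parametric prior p_theta(z)
        """
        N = xs[0].shape[0]
        total = prior_logits.log_softmax(dim=-1).unsqueeze(0)  # (1, K)
        for x_j, rec_j in zip(xs, recognisers):
            log_f = rec_j(x_j)                                 # (N, K)
            # log F_j(z) = log (1/N) sum_n f_j(z | x_j^(n)): a finite sum,
            # because p0 is the empirical marginal over the N training points.
            log_F = torch.logsumexp(log_f, dim=0) - math.log(N)
            total = total + log_f - log_F                      # (N, K)
        # Marginalise the discrete latent exactly; average over data points.
        return torch.logsumexp(total, dim=-1).mean()

Gradient ascent on this quantity (over recogniser weights and prior logits) is exact maximum likelihood: both the normalisers $F_{\theta_j}$ and the latent marginalisation are finite sums, so no variational bound is required.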
Experiments demonstrate the effectiveness of the RPM on high-dimensional data:
learning image classification from weak indirect supervision; direct image-level
latent Dirichlet allocation; and recognition-parametrised Gaussian process factor
analysis (RP-GPFA) applied to multi-factorial spatiotemporal datasets. The RPM
provides a powerful framework to discover meaningful latent structure
underlying observational data, a function critical to both animal and
artificial intelligence.