We present a Bayesian nonparametric framework for multilevel clustering which
utilizes group-level context information to simultaneously discover
low-dimensional structure in the group contents and partition the groups into
clusters. Using the Dirichlet process as the building block, our model
constructs a product base-measure with a nested structure to accommodate
content and context observations at multiple levels. The proposed model
links the nested Dirichlet process (nDP) and the Dirichlet process mixture
model (DPM): integrating out all contents yields a DPM over contexts, whereas
integrating out group-specific contexts yields an nDP mixture over content
variables. We
provide a Pólya-urn view of the model and an efficient collapsed Gibbs
inference procedure. Extensive experiments on real-world datasets demonstrate
the advantage of utilizing context information via our model in both text and
image domains.

Comment: Full version of ICML 201