Co-occurrence Data is a common and important information source in many
areas, such as the word co-occurrence in the sentences, friends co-occurrence
in social networks and products co-occurrence in commercial transaction data,
etc, which contains rich correlation and clustering information about the
items. In this paper, we study co-occurrence data using a general energy-based
probabilistic model, and we analyze three different categories of energy-based
model, namely, the L1β, L2β and Lkβ models, which are able to capture
different levels of dependency in the co-occurrence data. We also discuss how
several typical existing models are related to these three types of energy
models, including the Fully Visible Boltzmann Machine (FVBM) (L2β), Matrix
Factorization (L2β), Log-BiLinear (LBL) models (L2β), and the Restricted
Boltzmann Machine (RBM) model (Lkβ). Then, we propose a Deep Embedding Model
(DEM) (an Lkβ model) from the energy model in a \emph{principled} manner.
Furthermore, motivated by the observation that the partition function in the
energy model is intractable and the fact that the major objective of modeling
the co-occurrence data is to predict using the conditional probability, we
apply the \emph{maximum pseudo-likelihood} method to learn DEM. In consequence,
the developed model and its learning method naturally avoid the above
difficulties and can be easily used to compute the conditional probability in
prediction. Interestingly, our method is equivalent to learning a special
structured deep neural network using back-propagation and a special sampling
strategy, which makes it scalable on large-scale datasets. Finally, in the
experiments, we show that the DEM can achieve comparable or better results than
state-of-the-art methods on datasets across several application domains