4,226 research outputs found
Recruitment Market Trend Analysis with Sequential Latent Variable Models
Recruitment market analysis provides valuable understanding of
industry-specific economic growth and plays an important role for both
employers and job seekers. With the rapid development of online recruitment
services, massive recruitment data have been accumulated and enable a new
paradigm for recruitment market analysis. However, traditional methods for
recruitment market analysis largely rely on the knowledge of domain experts and
classic statistical models, which are usually too general to model large-scale
dynamic recruitment data, and have difficulties to capture the fine-grained
market trends. To this end, in this paper, we propose a new research paradigm
for recruitment market analysis by leveraging unsupervised learning techniques
for automatically discovering recruitment market trends based on large-scale
recruitment data. Specifically, we develop a novel sequential latent variable
model, named MTLVM, which is designed for capturing the sequential dependencies
of corporate recruitment states and is able to automatically learn the latent
recruitment topics within a Bayesian generative framework. In particular, to
capture the variability of recruitment topics over time, we design hierarchical
dirichlet processes for MTLVM. These processes allow to dynamically generate
the evolving recruitment topics. Finally, we implement a prototype system to
empirically evaluate our approach based on real-world recruitment data in
China. Indeed, by visualizing the results from MTLVM, we can successfully
reveal many interesting findings, such as the popularity of LBS related jobs
reached the peak in the 2nd half of 2014, and decreased in 2015.Comment: 11 pages, 30 figure, SIGKDD 201
The Discrete Infinite Logistic Normal Distribution
We present the discrete infinite logistic normal distribution (DILN), a
Bayesian nonparametric prior for mixed membership models. DILN is a
generalization of the hierarchical Dirichlet process (HDP) that models
correlation structure between the weights of the atoms at the group level. We
derive a representation of DILN as a normalized collection of gamma-distributed
random variables, and study its statistical properties. We consider
applications to topic modeling and derive a variational inference algorithm for
approximate posterior inference. We study the empirical performance of the DILN
topic model on four corpora, comparing performance with the HDP and the
correlated topic model (CTM). To deal with large-scale data sets, we also
develop an online inference algorithm for DILN and compare with online HDP and
online LDA on the Nature magazine, which contains approximately 350,000
articles.Comment: This paper will appear in Bayesian Analysis. A shorter version of
this paper appeared at AISTATS 2011, Fort Lauderdale, FL, US
- …