Learning the Structure of Deep Sparse Graphical Models
Deep belief networks are a powerful way to model complex probability
distributions. However, learning the structure of a belief network,
particularly one with hidden units, is difficult. The Indian buffet process has
been used as a nonparametric Bayesian prior on the directed structure of a
belief network with a single infinitely wide hidden layer. In this paper, we
introduce the cascading Indian buffet process (CIBP), which provides a
nonparametric prior on the structure of a layered, directed belief network that
is unbounded in both depth and width, yet allows tractable inference. We use
the CIBP prior with the nonlinear Gaussian belief network so each unit can
additionally vary its behavior between discrete and continuous representations.
We provide Markov chain Monte Carlo algorithms for inference in these belief
networks and explore the structures learned on several image data sets.
Comment: 20 pages, 6 figures, AISTATS 2010, Revise
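As a rough illustration of the prior described in this abstract, the sketch below draws a binary connectivity matrix from the standard one-parameter Indian buffet process and then cascades it, so that the hidden units created in one layer become the units to be explained by the next. This is a simplified sketch under stated assumptions, not the authors' implementation: the function names `sample_ibp` and `sample_cascade`, the single concentration parameter `alpha`, and the `max_depth` safety cap are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng):
    """Draw a binary matrix Z from the one-parameter Indian buffet process.

    Rows are customers (units in the layer below), columns are dishes
    (hidden units in the layer above). Z[i, k] = 1 means unit i has
    hidden unit k as a parent.
    """
    dish_counts = []  # how many earlier customers took each dish
    rows = []
    for i in range(1, num_customers + 1):
        # Take each existing dish k with probability m_k / i.
        row = [bool(rng.random() < m / i) for m in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += int(taken)
        # Then try Poisson(alpha / i) brand-new dishes.
        num_new = int(rng.poisson(alpha / i))
        row.extend([True] * num_new)
        dish_counts.extend([1] * num_new)
        rows.append(row)

    Z = np.zeros((num_customers, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        if row:
            Z[i, :len(row)] = row
    return Z

def sample_cascade(num_visible, alpha, rng, max_depth=50):
    """Stack IBP draws: the dishes of one layer are the customers of the next.

    Stops when a layer has no hidden units, or at max_depth (an
    assumption added here only to bound the loop).
    """
    layers = []
    width = num_visible
    while width > 0 and len(layers) < max_depth:
        Z = sample_ibp(width, alpha, rng)
        layers.append(Z)
        width = Z.shape[1]
    return layers

rng = np.random.default_rng(0)
layers = sample_cascade(num_visible=8, alpha=1.5, rng=rng)
for depth, Z in enumerate(layers):
    print(f"layer {depth}: {Z.shape[0]} units below, {Z.shape[1]} hidden units above")
```

Each matrix in `layers` has as many rows as the previous matrix has columns, which is what makes both the width and the depth of the sampled network data-dependent rather than fixed in advance.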
Self-Adaptive Hierarchical Sentence Model
The ability to accurately model a sentence at varying stages (e.g.,
word-phrase-sentence) plays a central role in natural language processing. As
an effort towards this goal we propose a self-adaptive hierarchical sentence
model (AdaSent). AdaSent effectively forms a hierarchy of representations from
words to phrases and then to sentences through recursive gated local
composition of adjacent segments. We design a competitive mechanism (through
gating networks) to allow the representations of the same sentence to be
engaged in a particular learning task (e.g., classification), therefore
effectively mitigating the gradient vanishing problem persistent in other
recursive models. Both qualitative and quantitative analyses show that AdaSent
can automatically form and select the representations suitable for the task at
hand during training, yielding superior classification performance over
competitor models on 5 benchmark data sets.
Comment: 8 pages, 7 figures, accepted as a full paper at IJCAI 201
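The recursive gated local composition mentioned in this abstract can be pictured as a pyramid: each level combines every pair of adjacent segments from the level below, so a sentence of T words yields T levels ending in a single vector. The sketch below is one plausible reading, not the authors' implementation. The function name `build_pyramid`, the parameter names `W_l`, `W_r`, `G_l`, `G_r`, the tanh candidate, and the three-way softmax gate over (left child, right child, new composition) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, shifted for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def build_pyramid(words, W_l, W_r, G_l, G_r):
    """Build a hierarchy of segment representations.

    words: (T, d) array of word vectors.
    W_l, W_r: (d, d) composition weights (hypothetical).
    G_l, G_r: (d, 3) gate weights (hypothetical).
    Returns a list of T arrays; level t has T - t rows.
    """
    levels = [words]
    while levels[-1].shape[0] > 1:
        prev = levels[-1]
        left, right = prev[:-1], prev[1:]          # adjacent segment pairs
        candidate = np.tanh(left @ W_l + right @ W_r)
        gates = softmax(left @ G_l + right @ G_r)  # (n - 1, 3), rows sum to 1
        # Each new unit is a convex mix of its left child, its right child,
        # and the freshly composed candidate.
        combined = (gates[:, [0]] * left
                    + gates[:, [1]] * right
                    + gates[:, [2]] * candidate)
        levels.append(combined)
    return levels

rng = np.random.default_rng(0)
T, d = 5, 4
words = rng.normal(size=(T, d))
W_l, W_r = rng.normal(size=(d, d)), rng.normal(size=(d, d))
G_l, G_r = rng.normal(size=(d, 3)), rng.normal(size=(d, 3))
levels = build_pyramid(words, W_l, W_r, G_l, G_r)
print([level.shape for level in levels])
```

Because the gate can pass a child through unchanged, a level can carry a word or phrase upward without recomposing it. A task-specific selector over the levels, which the abstract describes as a competitive mechanism, is left out of this sketch.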