Sequence Classification Restricted Boltzmann Machines With Gated Units
For the classification of sequential data, dynamic Bayesian networks and recurrent neural networks (RNNs) are the preferred models. The former can explicitly model the temporal dependences between the variables, while the latter have the capability of learning representations. The recurrent temporal restricted Boltzmann machine (RTRBM) is a model that combines these two features. However, learning and inference in RTRBMs can be difficult because of the exponential nature of their gradient computations when maximizing log likelihoods. In this article, first, we address this intractability by optimizing a conditional rather than a joint probability distribution when performing sequence classification. This results in the "sequence classification restricted Boltzmann machine" (SCRBM). Second, we introduce gated SCRBMs (gSCRBMs), which use an information processing gate, as an integration of SCRBMs with long short-term memory (LSTM) models. In the experiments reported in this article, we evaluate the proposed models on optical character recognition, chunking, and multiresident activity recognition in smart homes. The experimental results show that gSCRBMs achieve performance comparable to that of the state of the art in all three tasks. gSCRBMs require far fewer parameters than other recurrent networks with memory gates, in particular LSTMs and gated recurrent units (GRUs).
Dialogue Act Recognition via CRF-Attentive Structured Network
Dialogue Act Recognition (DAR) is a challenging problem in dialogue
interpretation, which aims to attach semantic labels to utterances and
characterize the speaker's intention. Existing approaches formulate DAR as
problems ranging from multi-class classification to structured prediction, and
they suffer from handcrafted feature extensions and attentive contextual
structural dependencies. In this paper, we consider the problem of
DAR from the viewpoint of extending richer Conditional Random Field (CRF)
structural dependencies without abandoning end-to-end training. We incorporate
hierarchical semantic inference with a memory mechanism into utterance
modeling. We then extend the structured attention network to the linear-chain
conditional random field layer, which takes into account both contextual
utterances and corresponding dialogue acts. Extensive experiments on two
major benchmark datasets, Switchboard Dialogue Act (SWDA) and Meeting Recorder
Dialogue Act (MRDA), show that our method achieves better performance
than other state-of-the-art solutions to the problem. Notably, our method
comes close to the human annotator's performance on SWDA, within a 2% gap.
Comment: 10 pages, 4 figures
Interacting Attention-gated Recurrent Networks for Recommendation
Capturing the temporal dynamics of user preferences over items is important
for recommendation. Existing methods mainly assume that all time steps in
user-item interaction history are equally relevant to recommendation, which,
however, does not hold in real-world scenarios where user-item interactions can
often happen accidentally. More importantly, they learn user and item dynamics
separately, thus failing to capture their joint effects on user-item
interactions. To better model user and item dynamics, we present the
Interacting Attention-gated Recurrent Network (IARN) which adopts the attention
model to measure the relevance of each time step. In particular, we propose a
novel attention scheme to learn the attention scores of user and item history
in an interacting way, so as to account for the dependencies between user and
item dynamics in shaping user-item interactions. By doing so, IARN can
selectively memorize different time steps of a user's history when predicting
her preferences over different items. Our model can therefore provide
meaningful interpretations for recommendation results, which could be further
enhanced by auxiliary features. Extensive validation on real-world datasets
shows that IARN consistently outperforms state-of-the-art methods.
Comment: Accepted by ACM International Conference on Information and Knowledge
Management (CIKM), 201