Effective Spoken Language Labeling with Deep Recurrent Neural Networks
Understanding spoken language is a highly complex problem, which can be
decomposed into several simpler tasks. In this paper, we focus on Spoken
Language Understanding (SLU), the module of spoken dialog systems responsible
for extracting a semantic interpretation from the user utterance. The task is
treated as a labeling problem. In the past, SLU has been performed with a wide
variety of probabilistic models. The rise of neural networks in the last
couple of years has opened interesting new research directions in this domain.
Recurrent Neural Networks (RNNs) in particular are able not only to represent
several pieces of information as embeddings but also, thanks to their recurrent
architecture, to encode relatively long contexts as embeddings. Such long
contexts are in general out of reach for the models previously used for SLU. In
this paper we propose novel RNN architectures for SLU which outperform
previous ones. Starting from a published idea as a base block, we design new deep
RNNs achieving state-of-the-art results on two widely used SLU corpora:
ATIS (Air Travel Information System), in English, and MEDIA (hotel
information and reservation in France), in French.
Comment: 8 pages. Rejected from IJCAI 2017; good remarks overall, but slightly
off-topic according to the global meta-reviews. Recommendations: 8, 6, 6, 4. arXiv
admin note: text overlap with arXiv:1706.0174
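The abstract does not spell out the exact deep architecture, so the following is only a minimal sketch of an RNN slot tagger of the kind the paper builds on: a stacked bidirectional GRU over word embeddings with a per-token label classifier. All hyperparameters (embedding size, depth, the label-set size shown for ATIS) are illustrative assumptions, not the paper's settings.

    # Minimal sketch of an RNN sequence labeler for SLU (slot tagging).
    # Layer choices and sizes are assumptions, not the paper's architecture.
    import torch
    import torch.nn as nn

    class RNNSlotTagger(nn.Module):
        def __init__(self, vocab_size, num_labels, emb_dim=100, hidden_dim=200, depth=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # A stacked bidirectional GRU encodes a long left/right context per word.
            self.rnn = nn.GRU(emb_dim, hidden_dim, num_layers=depth,
                              bidirectional=True, batch_first=True)
            self.out = nn.Linear(2 * hidden_dim, num_labels)

        def forward(self, word_ids):                  # (batch, seq_len)
            h, _ = self.rnn(self.embed(word_ids))     # (batch, seq_len, 2*hidden_dim)
            return self.out(h)                        # per-token label scores

    # Hypothetical sizes, e.g. an ATIS-like label inventory:
    tagger = RNNSlotTagger(vocab_size=10000, num_labels=127)
    logits = tagger(torch.randint(0, 10000, (1, 12)))  # one 12-word utterance
    print(logits.shape)  # torch.Size([1, 12, 127])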
Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models
Abstraction tasks are challenging for multi-modal sequences, as they
require a deeper semantic understanding of the data and novel text generation.
Although recurrent neural networks (RNNs) can be used to model the
context of time-sequences, in most cases the long-term dependencies of
multi-modal data cause the gradients of back-propagation-through-time training
of RNNs to vanish in the time domain. Recently, inspired by the Multiple
Time-scale Recurrent Neural Network (MTRNN), an extension of the Gated Recurrent
Unit (GRU), called the Multiple Time-scale Gated Recurrent Unit (MTGRU), has been
proposed to learn long-term dependencies in natural language processing. In
particular, it is also able to accomplish the abstraction task for paragraphs,
provided the time constants are well defined. In this paper, we compare the MTRNN
and MTGRU in terms of their learning performance as well as their abstraction
representation at the higher level (with slower neural activation). This was done
by conducting two studies based on a smaller data-set (two-dimensional time
sequences from non-linear functions) and a relatively large data-set
(43-dimensional time sequences from iCub manipulation tasks with multi-modal
data). We conclude that gated recurrent mechanisms may be necessary for
learning long-term dependencies in high-dimensional multi-modal data-sets (e.g.
learning of robot manipulation), even when no natural language commands are
involved. But for smaller learning tasks with simple time-sequences, generic
recurrent models such as the MTRNN are sufficient to accomplish the
abstraction task.
Comment: Accepted by IJCNN 201
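One plausible reading of the MTGRU idea is a GRU whose state update is low-pass filtered by a time constant tau, as in the MTRNN's continuous-time update; a layer with large tau changes slowly and can hold a more abstract representation. The sketch below is a reconstruction under that assumption, not the paper's exact equations.

    # Hedged sketch of a multiple time-scale GRU layer: a standard GRUCell
    # whose state is mixed with its previous value by a time constant tau
    # (tau > 1 gives a "slow", more abstract layer). This is an assumed
    # reconstruction of the MTGRU update, not the published formulation.
    import torch
    import torch.nn as nn

    class MTGRULayer(nn.Module):
        def __init__(self, input_dim, hidden_dim, tau=4.0):
            super().__init__()
            self.cell = nn.GRUCell(input_dim, hidden_dim)
            self.tau = tau  # larger tau -> slower state change

        def forward(self, x_seq):                       # (seq_len, batch, input_dim)
            h = x_seq.new_zeros(x_seq.size(1), self.cell.hidden_size)
            outputs = []
            for x_t in x_seq:
                h_cand = self.cell(x_t, h)              # ordinary GRU update
                # Leaky integration, MTRNN-style: h_t = (1 - 1/tau) h_{t-1} + (1/tau) h~_t
                h = (1.0 - 1.0 / self.tau) * h + (1.0 / self.tau) * h_cand
                outputs.append(h)
            return torch.stack(outputs)                 # slow-layer states per step

Stacking layers with increasing tau would give the fast-to-slow hierarchy the abstract describes.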
Label-Dependencies Aware Recurrent Neural Networks
In the last few years, Recurrent Neural Networks (RNNs) have proved effective
on several NLP tasks. Despite such great success, their ability to model
sequence labeling is still limited. This has led research toward solutions
where RNNs are combined with models that have already proved effective in this
domain, such as CRFs. In this work we propose a far simpler but very
effective solution: an evolution of the simple Jordan RNN, where labels are
re-injected as input into the network and converted into embeddings, in the same
way as words. We compare this RNN variant to the other main RNN models, the Elman
and Jordan RNNs, LSTM and GRU, on two well-known Spoken Language
Understanding (SLU) tasks. Thanks to label embeddings and their combination at the
hidden layer, the proposed variant, which uses more parameters than the Elman and
Jordan RNNs but far fewer than LSTM and GRU, is not only more effective than the
other RNNs but also outperforms sophisticated CRF models.
Comment: 22 pages, 3 figures. Accepted at the CICLing 2017 conference. Best
Verifiability, Reproducibility, and Working Description award
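The core mechanism, as described in the abstract, is that the previous label is embedded like a word and fed back into the network. A minimal sketch of that idea follows; the cell type, dimensions, and the choice to re-inject the argmax prediction (rather than, say, the gold label at training time) are assumptions for illustration.

    # Sketch of Jordan-style label re-injection: the previous predicted label
    # is embedded like a word and concatenated with the word embedding before
    # the recurrent update. Sizes and the GRUCell choice are assumptions.
    import torch
    import torch.nn as nn

    class LabelAwareRNN(nn.Module):
        def __init__(self, vocab_size, num_labels, emb=100, hid=200):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, emb)
            self.label_emb = nn.Embedding(num_labels, emb)   # labels embedded like words
            self.cell = nn.GRUCell(2 * emb, hid)             # word + previous-label input
            self.out = nn.Linear(hid, num_labels)

        def forward(self, word_ids):                         # (batch, seq_len)
            batch = word_ids.size(0)
            h = torch.zeros(batch, self.cell.hidden_size)
            prev_label = torch.zeros(batch, dtype=torch.long)  # index 0 = start label
            scores = []
            for t in range(word_ids.size(1)):
                x = torch.cat([self.word_emb(word_ids[:, t]),
                               self.label_emb(prev_label)], dim=-1)
                h = self.cell(x, h)
                logits = self.out(h)
                prev_label = logits.argmax(dim=-1)           # re-inject the prediction
                scores.append(logits)
            return torch.stack(scores, dim=1)                # (batch, seq_len, num_labels)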
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
State-of-the-art slot-filling models for goal-oriented human/machine
conversational language understanding systems rely on deep learning methods.
While multi-task training of such models alleviates the need for large
in-domain annotated datasets, bootstrapping a semantic parsing model for a new
domain using only its semantic frame, such as the back-end API or knowledge
graph schema, is still one of the holy-grail tasks of language understanding
for dialogue systems. This paper proposes a deep-learning-based approach that
can use only the slot description in context, without the need for any
labeled or unlabeled in-domain examples, to quickly bootstrap a new domain. The
main idea is to leverage the encoding of slot names and descriptions within a
multi-task deep-learned slot-filling model to implicitly align slots across
domains. The proposed approach is promising for solving the domain scaling
problem and eliminating the need for any manually annotated data or explicit
schema alignment. Furthermore, our experiments on multiple domains show that
this approach yields significantly better slot-filling performance than using
only in-domain data, especially in the low-data regime.
Comment: 4 pages + 1 reference
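To make the zero-shot idea concrete: encode the slot's natural-language description into a vector, then score each utterance token against it, so a new domain needs only slot descriptions rather than labeled examples. The encoders and bilinear scorer below are illustrative assumptions, not the paper's model.

    # Hedged sketch of zero-shot slot filling: a slot description is encoded
    # to a vector, and each utterance token is scored against it, so unseen
    # slots need only a description. Components here are assumptions.
    import torch
    import torch.nn as nn

    class ZeroShotSlotFiller(nn.Module):
        def __init__(self, vocab_size, emb=100, hid=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb)
            self.utt_enc = nn.GRU(emb, hid, bidirectional=True, batch_first=True)
            self.desc_enc = nn.GRU(emb, hid, batch_first=True)
            self.score = nn.Bilinear(2 * hid, hid, 1)   # token vs. slot description

        def forward(self, utt_ids, desc_ids):            # (1, T) words, (1, D) description
            tokens, _ = self.utt_enc(self.embed(utt_ids))   # (1, T, 2*hid)
            _, desc = self.desc_enc(self.embed(desc_ids))   # final state (1, 1, hid)
            desc = desc.transpose(0, 1).expand(-1, tokens.size(1), -1).contiguous()
            return self.score(tokens, desc).squeeze(-1)     # (1, T) per-token slot score

Running the same utterance against each candidate slot's description yields per-slot token scores, which is one way the implicit cross-domain slot alignment described above could be realized.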