78 research outputs found
Combining Word Feature Vector Method with the Convolutional Neural Network for Slot Filling in Spoken Language Understanding
Slot filling is an important problem in Spoken Language Understanding (SLU)
and Natural Language Processing (NLP), which involves identifying a user's
intent and assigning a semantic concept to each word in a sentence. This paper
presents a word feature vector method and combines it with a convolutional
neural network (CNN). We consider 18 word features, each constructed by
merging similar word labels. By introducing the concept of an external
library, we propose a feature set approach that is beneficial for
building the relationship between a word from the training dataset and the
feature. Computational results are reported using the ATIS dataset and
comparisons with a traditional CNN as well as a bi-directional sequential CNN
are also presented.
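The external-library idea above can be sketched as a lexicon lookup: each word's feature vector records which curated word lists it belongs to. The lexicon names and contents below are illustrative stand-ins, not the paper's actual 18 features.

```python
# Hypothetical lexicons ("external libraries") of merged, similar word labels.
LEXICONS = {
    "city": {"boston", "denver", "dallas"},
    "airline": {"delta", "united"},
    "day": {"monday", "tuesday", "sunday"},
}

def word_features(word):
    """Return a binary feature vector: one component per lexicon."""
    w = word.lower()
    return [1 if w in words else 0 for words in LEXICONS.values()]

def sentence_features(sentence):
    """Feature vectors for every word, ready to feed a CNN tagger."""
    return [word_features(w) for w in sentence.split()]
```

In the paper these features are concatenated with word embeddings before the convolutional layers; here only the lookup step is shown.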
End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning
In this paper, we present a neural network based task-oriented dialogue
system that can be optimized end-to-end with deep reinforcement learning (RL).
The system is able to track dialogue state, interface with knowledge bases, and
incorporate query results into the agent's responses to successfully complete
task-oriented dialogues. Dialogue policy learning is conducted with a hybrid
of supervised learning and deep RL. We first train the dialogue agent in a
supervised manner by learning directly from task-oriented dialogue corpora, and
further optimize it with deep RL during its interaction with users. In the
experiments on two different dialogue task domains, our model demonstrates
robust performance in tracking dialogue state and producing reasonable system
responses. We show that deep RL based optimization leads to significant
improvement in task success rate and reduction in dialogue length compared to
the supervised training model. We further show the benefits of training the
task-oriented dialogue model end-to-end compared to component-wise
optimization, with experimental results on dialogue simulations and human
evaluations.
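The two-stage policy learning described above can be sketched with a tabular softmax policy: a cross-entropy step toward demonstrated actions (supervised phase), then a REINFORCE-style step weighted by a task-success reward (RL phase). The states, actions, and learning rates are toy placeholders, not the paper's neural policy.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

class TabularPolicy:
    def __init__(self, n_states, n_actions):
        self.logits = [[0.0] * n_actions for _ in range(n_states)]

    def probs(self, state):
        return softmax(self.logits[state])

    def supervised_update(self, state, action, lr=0.5):
        # cross-entropy gradient step toward the demonstrated action
        p = self.probs(state)
        for a in range(len(p)):
            target = 1.0 if a == action else 0.0
            self.logits[state][a] += lr * (target - p[a])

    def reinforce_update(self, state, action, reward, lr=0.5):
        # REINFORCE: scale the log-prob gradient of the taken action by reward
        p = self.probs(state)
        for a in range(len(p)):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.logits[state][a] += lr * reward * grad
```

The supervised phase bootstraps a reasonable policy from corpora; the RL phase then refines it from interaction, mirroring the paper's training order.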
Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding
The goal of this paper is to learn cross-domain representations for the slot
filling task in spoken language understanding (SLU). Most of the recently
published SLU models are domain-specific ones that work on individual task
domains. Annotating data for each individual task domain is both financially
costly and non-scalable. In this work, we propose an adversarial training
method for learning common features and representations that can be shared
across multiple domains. A model that produces such shared representations can be
combined with models trained on individual domain SLU data to reduce the amount
of training samples required for developing a new domain. In our experiments
using data sets from multiple domains, we show that adversarial training helps
in learning better domain-general SLU models, leading to improved slot filling
F1 scores. We further show that applying adversarial learning to the
domain-general model also helps in achieving higher slot filling performance
when the model is jointly optimized with domain-specific models.
Neural CRF transducers for sequence labeling
Conditional random fields (CRFs) have been shown to be one of the most
successful approaches to sequence labeling. Various linear-chain neural CRFs
(NCRFs) have been developed to implement non-linear node potentials in CRFs,
while still keeping the linear-chain hidden structure. In this paper, we
propose NCRF transducers, which consist of two RNNs, one extracting features from
observations and the other capturing (theoretically infinite) long-range
dependencies between labels. Different sequence labeling methods are evaluated
over POS tagging, chunking and NER (English, Dutch). Experimental results show
that NCRF transducers achieve consistent improvements over linear-chain NCRFs
and RNN transducers across all four tasks, and can improve state-of-the-art
results.
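For contrast with the transducer's long-range label dependencies, here is plain Viterbi decoding for a linear-chain model, whose hidden structure only scores adjacent label pairs (first-order dependencies). Scores and shapes are illustrative.

```python
def viterbi(emissions, transitions):
    """emissions: [T][K] node scores; transitions: [K][K] edge scores.
    Returns the highest-scoring label sequence under a first-order chain."""
    T, K = len(emissions), len(emissions[0])
    score = list(emissions[0])
    back = []
    for t in range(1, T):
        new, ptr = [], []
        for k in range(K):
            best_prev = max(range(K),
                            key=lambda j: score[j] + transitions[j][k])
            new.append(score[best_prev] + transitions[best_prev][k]
                       + emissions[t][k])
            ptr.append(best_prev)
        score, back = new, back + [ptr]
    path = [max(range(K), key=lambda k: score[k])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

An NCRF transducer replaces this fixed pairwise transition table with an RNN over the label history, which is exactly what lets it capture (theoretically infinite) long-range label dependencies.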
Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding
We investigate the usage of convolutional neural networks (CNNs) for the slot
filling task in spoken language understanding. We propose a novel CNN
architecture for sequence labeling which takes into account the previous
context words with preserved order information and pays special attention to
the current word with its surrounding context. Moreover, it combines the
information from the past and the future words for classification. Our proposed
CNN architecture outperforms even the previous best ensemble recurrent
neural network model and achieves state-of-the-art results with an F1-score of
95.61% on the ATIS benchmark dataset without using any additional linguistic
knowledge and resources.
Comment: Accepted at Interspeech 201
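The windowed input this kind of sequential CNN relies on can be sketched as follows: for each position, gather the previous context with order preserved, the current word with its surrounding window, and the future context. The padding token and window size are illustrative choices, not the paper's exact settings.

```python
PAD = "<pad>"

def context_views(words, i, surr=1):
    """Split a sentence into (past, current-window, future) views for
    position i; the current-word window has fixed width 2*surr + 1."""
    past = words[:i]
    future = words[i + 1:]
    lo, hi = max(0, i - surr), min(len(words), i + surr + 1)
    current = words[lo:hi]
    current = ([PAD] * (surr - (i - lo))
               + current
               + [PAD] * (surr - (hi - 1 - i)))
    return past, current, future
```

In the paper's architecture, separate convolutions over these views are combined before the label is predicted for the current word.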
Speech Model Pre-training for End-to-End Spoken Language Understanding
Whereas conventional spoken language understanding (SLU) systems map speech
to text, and then text to intent, end-to-end SLU systems map speech directly to
intent through a single trainable model. Achieving high accuracy with these
end-to-end models without a large amount of training data is difficult. We
propose a method to reduce the data requirements of end-to-end SLU in which the
model is first pre-trained to predict words and phonemes, thus learning good
features for SLU. We introduce a new SLU dataset, Fluent Speech Commands, and
show that our method improves performance both when the full dataset is used
for training and when only a small subset is used. We also describe preliminary
experiments to gauge the model's ability to generalize to new phrases not heard
during training.
Comment: Accepted to Interspeech 201
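The two-stage recipe can be sketched schematically: pre-train an encoder on word/phoneme prediction, then fit a small intent head on its features. The lookup-table "encoder" and toy data below are hypothetical stand-ins for the learned networks.

```python
class PhonemePretrainedEncoder:
    def __init__(self):
        self.table = {}  # frame -> phoneme, learned during pre-training

    def pretrain(self, frames, phonemes):
        for f, p in zip(frames, phonemes):
            self.table[f] = p
        return self

    def encode(self, frames):
        # features = predicted phoneme sequence (the transferable signal)
        return tuple(self.table.get(f, "?") for f in frames)

def fit_intent_head(encoder, labeled):
    # memorize encoder-feature -> intent; stands in for a small classifier
    return {encoder.encode(frames): intent for frames, intent in labeled}
```

The point the sketch makes is the data efficiency argument: the intent head sees pre-trained features rather than raw speech, so far fewer labeled SLU examples are needed.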
Combining Textual Content and Structure to Improve Dialog Similarity
Chatbots, taking advantage of the success of messaging apps and recent
advances in Artificial Intelligence, have become very popular, from helping
businesses improve customer service to chatting with users for the sake of
conversation and engagement (celebrity or personal bots). However, developing
and improving a chatbot requires understanding the data generated by its
users. Dialog data differs in nature from a simple question-and-answer
interaction: context and temporal properties (turn order) create a different
understanding of such data. In this paper, we propose a novel metric to
compute dialog similarity based not only on the text content but also on
information related to the dialog structure. Our experimental results on the
Switchboard dataset show that using evidence from both textual content and
the dialog structure leads to more accurate results than using each measure
in isolation.
Comment: 5 pages
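A combined similarity in the spirit described above can be sketched as a weighted blend of a textual measure (here, Jaccard over word sets) and a structural measure over the ordered speaker-turn sequence. Both component measures and the blend weight are illustrative choices, not the paper's metric.

```python
def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def turn_similarity(turns_a, turns_b):
    # fraction of positions with the same speaker, over the longer dialog
    same = sum(1 for x, y in zip(turns_a, turns_b) if x == y)
    return same / max(len(turns_a), len(turns_b))

def dialog_similarity(d1, d2, alpha=0.5):
    """d1, d2: lists of (speaker, utterance) pairs in turn order."""
    text = jaccard(" ".join(t for _, t in d1), " ".join(t for _, t in d2))
    struct = turn_similarity([s for s, _ in d1], [s for s, _ in d2])
    return alpha * text + (1 - alpha) * struct
```

The key design point matches the abstract: neither component alone sees both what was said and how the conversation was shaped.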
FollowNet: Robot Navigation by Following Natural Language Directions with Deep Reinforcement Learning
Understanding and following directions provided by humans can enable robots
to navigate effectively in unknown situations. We present FollowNet, an
end-to-end differentiable neural architecture for learning multi-modal
navigation policies. FollowNet maps natural language instructions as well as
visual and depth inputs to locomotion primitives. FollowNet processes
instructions using an attention mechanism conditioned on its visual and depth
input to focus on the relevant parts of the command while performing the
navigation task. Deep reinforcement learning (RL) with a sparse reward
simultaneously learns the state representation, the attention function, and
the control policies. We evaluate our agent on a dataset of complex natural language
directions that guide the agent through a rich and realistic dataset of
simulated homes. We show that the FollowNet agent learns to execute previously
unseen instructions described with a similar vocabulary, and successfully
navigates along paths not encountered during training. The agent shows a 30%
improvement over a baseline model without the attention mechanism, achieving
a 52% success rate on novel instructions.
Comment: 7 pages, 8 figures
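The conditioned attention at the heart of this architecture can be sketched in a few lines: instruction-token vectors are scored against a context vector derived from the visual/depth input, and the softmax-normalized scores weight the tokens. The dot-product scoring and toy vectors are illustrative; FollowNet's actual attention is learned end-to-end.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attend(token_vecs, context_vec):
    """Weight instruction tokens by relevance to the current observation."""
    scores = [sum(t * c for t, c in zip(tok, context_vec))
              for tok in token_vecs]
    weights = softmax(scores)
    attended = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
                for d in range(len(context_vec))]
    return weights, attended
```

As the agent moves, the context vector changes, so attention shifts to the part of the command relevant to the current step.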
Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction
This paper addresses the problem of key phrase extraction from sentences.
Existing state-of-the-art supervised methods require large amounts of annotated
data to achieve good performance and generalization. Collecting labeled data
is, however, often expensive. In this paper, we redefine the problem as
question-answer extraction, and present SAMIE: Self-Asking Model for
Information Extraction, a semi-supervised model which dually learns to ask
and to answer questions by itself. Briefly, given a sentence and an answer,
the model needs to choose the most appropriate question; meanwhile, for the
same sentence and the question selected in the previous step, the model will
predict an answer. The model can support few-shot
learning with very limited supervision. It can also be used to perform
clustering analysis when no supervision is provided. Experimental results show
that the proposed method outperforms typical supervised methods especially when
given little labeled data.
Comment: 7 pages, 5 figures, submission to IJCAI1
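The ask-then-answer step can be sketched as scoring candidate questions for a (sentence, answer) pair and picking the best. The keyword-overlap scorer below is a hypothetical stand-in for SAMIE's learned question/answer models.

```python
def choose_question(sentence, answer, candidates, scorer):
    """Pick the candidate question that best fits the (sentence, answer)."""
    return max(candidates, key=lambda q: scorer(sentence, answer, q))

def overlap_scorer(sentence, answer, question):
    # toy relevance score: shared words between question and sentence
    return len(set(question.lower().split())
               & set(sentence.lower().split()))
```

In the full model, the answer predicted for the chosen question is compared against the original answer, and that round-trip consistency is what provides the self-supervision signal.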
Elastic CRFs for Open-ontology Slot Filling
Slot filling is a crucial component in task-oriented dialog systems; it
parses (user) utterances into semantic concepts called slots. An ontology is
defined by the collection of slots and the values that each slot can take. The
widely-used practice of treating slot filling as a sequence labeling task
suffers from two drawbacks. First, the ontology is usually pre-defined and
fixed. Most current methods are unable to predict new labels for unseen slots.
Second, the one-hot encoding of slot labels ignores the semantic meanings and
relations for slots, which are implicit in their natural language descriptions.
These observations motivate us to propose a novel model called elastic
conditional random field (eCRF), for open-ontology slot filling. eCRFs can
leverage the neural features of both the utterance and the slot descriptions,
and are able to model the interactions between different slots. Experimental
results show that eCRFs outperform existing models on both the in-domain and
the cross-domain tasks, especially in predictions of unseen slots and values.
Comment: 5 pages
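The open-ontology idea can be sketched as scoring each slot by the similarity between utterance-span features and an embedding of the slot's natural-language description, so a slot unseen in training can still be predicted if a description is supplied. The bag-of-words "embeddings" and cosine scoring below are illustrative stand-ins for the eCRF's neural features.

```python
import math

def embed(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def best_slot(span, slot_descriptions, vocab):
    """Score every slot description against the span; new slots can be
    added to slot_descriptions without retraining this scorer."""
    v = embed(span, vocab)
    return max(slot_descriptions,
               key=lambda s: cosine(v, embed(slot_descriptions[s], vocab)))
```

This is the contrast with one-hot slot labels the abstract draws: description embeddings carry the semantic relations between slots that one-hot encoding discards.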