60 research outputs found
BERT for Joint Intent Classification and Slot Filling
Intent classification and slot filling are two essential tasks for natural
language understanding. They often suffer from small-scale human-labeled
training data, resulting in poor generalization capability, especially for rare
words. Recently, a new language representation model, BERT (Bidirectional
Encoder Representations from Transformers), has enabled pre-training of deep
bidirectional representations on large-scale unlabeled corpora and, after
simple fine-tuning, has produced state-of-the-art models for a wide variety of
natural language processing tasks. However, there has been little work
exploring BERT for natural language understanding. In this work, we propose a joint
intent classification and slot filling model based on BERT. Experimental
results demonstrate that our proposed model achieves significant improvement on
intent classification accuracy, slot filling F1, and sentence-level semantic
frame accuracy on several public benchmark datasets, compared to the
attention-based recurrent neural network models and slot-gated models.
Comment: 4 pages, 1 figure
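A minimal sketch of the joint architecture this abstract describes, written with Hugging Face Transformers: a sentence-level intent head on BERT's pooled [CLS] output and a token-level slot head on the sequence output. The checkpoint and label counts are illustrative assumptions, not the authors' exact configuration.

```python
# Joint intent classification and slot filling on a shared BERT encoder.
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class JointBert(nn.Module):
    def __init__(self, num_intents: int, num_slots: int):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.bert.config.hidden_size
        self.intent_head = nn.Linear(hidden, num_intents)  # sentence level
        self.slot_head = nn.Linear(hidden, num_slots)      # token level

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        intent_logits = self.intent_head(out.pooler_output)  # from [CLS]
        slot_logits = self.slot_head(out.last_hidden_state)  # one per token
        return intent_logits, slot_logits

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
batch = tokenizer(["book a flight to boston"], return_tensors="pt")
model = JointBert(num_intents=21, num_slots=120)  # roughly ATIS-sized sets
intent_logits, slot_logits = model(batch["input_ids"], batch["attention_mask"])
```

In practice both heads are trained together with a summed cross-entropy loss, so the shared encoder learns features useful for both tasks.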
A Novel Task-Oriented Text Corpus in Silent Speech Recognition and its Natural Language Generation Construction Method
Millions of people with severe speech disorders around the world may regain
their communication capabilities through techniques of silent speech
recognition (SSR). Using electroencephalography (EEG) as a biomarker for speech
decoding has been popular for SSR. However, the lack of an SSR text corpus has
impeded the development of this technique. Here, we construct a novel
task-oriented text corpus for use in SSR. For its construction, we propose a
task-oriented hybrid method based on natural language generation. The method
focuses on data-to-text generation and offers two advantages, linguistic
quality and high diversity, achieved with a template-based approach and deep
neural networks, respectively. In an SSR experiment with the generated text
corpus, analysis shows that our hybrid construction method outperforms pure
methods such as template-based or neural natural language generation alone.
Comment: Accepted for publication in the 3rd International Conference on Natural Language Processing and Information Retrieval, 2019
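To make the hybrid strategy concrete, below is a toy illustration of its template-based half: slot values filled into hand-written templates yield linguistically well-formed text, while diversity would come from a neural generator (not shown). The templates and slot values are invented for illustration; the paper's actual corpus is SSR-task-specific.

```python
# Template-based data-to-text generation: sample slot values into templates.
import random

TEMPLATES = [
    "please turn {state} the {device} in the {room}",
    "could you switch {state} the {room} {device}",
]
SLOTS = {
    "state": ["on", "off"],
    "device": ["light", "fan", "heater"],
    "room": ["kitchen", "bedroom"],
}

def generate(n: int) -> list[str]:
    """Sample n sentences by filling random slot values into templates."""
    out = []
    for _ in range(n):
        template = random.choice(TEMPLATES)
        values = {k: random.choice(v) for k, v in SLOTS.items()}
        out.append(template.format(**values))
    return out

print(generate(3))
```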
ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers
Automatic speech recognition (ASR) via call is essential for various
applications, including AI for contact center (AICC) services. Despite the
advancement of ASR, however, most publicly available call-based speech corpora
such as Switchboard are old-fashioned. Also, most existing call corpora are in
English and mainly focus on open domain dialog or general scenarios such as
audiobooks. Here we introduce a new large-scale Korean call-based speech corpus
under a goal-oriented dialog scenario, collected from more than 11,000 people:
the ClovaCall corpus. ClovaCall includes approximately 60,000 pairs of short
sentences and their corresponding spoken utterances in a restaurant reservation
domain. We validate the effectiveness of our dataset with intensive experiments
using two standard ASR models. Furthermore, we release our ClovaCall dataset
and baseline source code at https://github.com/ClovaAI/ClovaCall.
Comment: 5 pages, 2 figures, 4 tables. The first two authors contributed equally to this work
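A short sketch of how such a corpus is typically validated: compare ASR hypotheses against the reference sentences with word error rate. The example pair below is invented, and jiwer is just one common choice of metric library; see the repository above for the authors' actual data format and baselines.

```python
# Word error rate between reference transcripts and ASR hypotheses.
import jiwer

references = ["저는 내일 저녁 일곱 시에 두 명 예약하고 싶어요"]  # invented example
hypotheses = ["저는 내일 저녁 일곱시에 두명 예약하고 싶어요"]   # simulated ASR output

wer = jiwer.wer(references, hypotheses)  # aggregate WER over the batch
print(f"WER: {wer:.3f}")
```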
TATL at W-NUT 2020 Task 2: A Transformer-based Baseline System for Identification of Informative COVID-19 English Tweets
As the COVID-19 outbreak continues to spread throughout the world, more and
more information about the pandemic has been shared publicly on social media.
For example, there are a huge number of COVID-19 English Tweets daily on
Twitter. However, the majority of those Tweets are uninformative, and hence it
is important to be able to automatically select only the informative ones for
downstream applications. In this short paper, we present our participation in
the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English
Tweets. Inspired by the recent advances in pretrained Transformer language
models, we propose a simple yet effective baseline for the task. Despite its
simplicity, our proposed approach shows very competitive results on the
leaderboard, ranking 8th out of 56 participating teams.
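A hedged sketch of the kind of baseline the paper describes: fine-tuning a pretrained Transformer for binary INFORMATIVE/UNINFORMATIVE tweet classification. The checkpoint, label mapping, and example tweet are assumptions, not the authors' exact setup.

```python
# Pretrained Transformer with a 2-way classification head for tweet filtering.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = UNINFORMATIVE, 1 = INFORMATIVE
)

batch = tokenizer(
    ["Confirmed: 120 new COVID-19 cases reported in the city today."],
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    logits = model(**batch).logits
print(logits.softmax(dim=-1))  # head is untrained here: fine-tune before use
```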
Joint Intent Detection And Slot Filling Based on Continual Learning Model
Slot filling and intent detection have become a significant theme in the
field of natural language understanding. Even though slot filling is closely
tied to intent detection, the two tasks require information with different
characteristics, and most existing approaches do not fully account for this.
In addition, effectively balancing the accuracy of the two tasks is an
unavoidable problem for joint learning models. In this paper, a Continual
Learning Interrelated Model (CLIM)
is proposed to consider semantic information with different characteristics and
balance the accuracy between intent detection and slot filling effectively. The
experimental results show that CLIM achieves state-of-the-art performance on
slot filling and intent detection on ATIS and Snips.
Comment: Accepted to ICASSP 202
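CLIM's specific balancing mechanism is not spelled out in the abstract; as a point of reference, the generic starting point such joint models share is a weighted sum of the two task losses, sketched below with a tunable balance coefficient alpha. This is an assumption for illustration, not CLIM itself.

```python
# Generic joint objective: weighted sum of intent and slot losses.
import torch.nn.functional as F

def joint_loss(intent_logits, intent_labels, slot_logits, slot_labels,
               alpha: float = 0.5, ignore_index: int = -100):
    """alpha trades off intent accuracy against slot accuracy."""
    intent_loss = F.cross_entropy(intent_logits, intent_labels)
    slot_loss = F.cross_entropy(
        slot_logits.transpose(1, 2),  # (batch, num_slots, seq_len) for CE
        slot_labels, ignore_index=ignore_index,  # ignore padding positions
    )
    return alpha * intent_loss + (1.0 - alpha) * slot_loss
```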
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling
Slot filling and intent detection are two main tasks in spoken language
understanding (SLU) systems. In this paper, we propose a novel
non-autoregressive model named SlotRefine for joint intent detection and slot
filling. In addition, we design a novel two-pass iteration mechanism to handle
the uncoordinated slots problem caused by the conditional independence of the
non-autoregressive model. Experiments demonstrate that our model significantly
outperforms previous models on the slot filling task while considerably
speeding up decoding (by up to 10.77x). In-depth analyses show that 1)
pretraining schemes could further enhance our model and 2) the two-pass
mechanism indeed remedies the uncoordinated slots problem.
Comment: To appear in the main conference of EMNLP 2020
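A simplified illustration of the two-pass idea: a non-autoregressive tagger predicts all slot tags in parallel, then its B- tag predictions are fed back as hints so a second parallel pass can repair inconsistent I- tags. The tagger is left as an opaque callable here; SlotRefine's actual network and feedback mechanism differ in detail.

```python
# Two-pass decoding for a non-autoregressive BIO slot tagger.
from typing import Callable, List

def two_pass_decode(
    tagger: Callable[[List[str], List[str]], List[str]],
    tokens: List[str],
    no_hint: str = "O",
) -> List[str]:
    # Pass 1: predict every tag independently (conditionally independent).
    first = tagger(tokens, [no_hint] * len(tokens))
    # Keep only the B- predictions as hints for the second pass.
    hints = [t if t.startswith("B-") else no_hint for t in first]
    # Pass 2: re-predict with the B- hints visible, repairing "uncoordinated"
    # outputs such as an I- tag whose type disagrees with the preceding B- tag.
    return tagger(tokens, hints)
```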
STIL -- Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++
Slot-filling, Translation, Intent classification, and Language
identification, or STIL, is a newly-proposed task for multilingual Natural
Language Understanding (NLU). By performing simultaneous slot filling and
translation into a single output language (English in this case), some portion
of downstream system components can be monolingual, reducing development and
maintenance cost. Results are given using the multilingual BART model (Liu et
al., 2020) fine-tuned on 7 languages using the MultiATIS++ dataset. When no
translation is performed, mBART's performance is comparable to the current
state-of-the-art system (Cross-Lingual BERT; Xu et al., 2020) for the
languages tested, with better average intent classification accuracy (96.07%
versus 95.50%) but worse average slot F1 (89.87% versus 90.81%). When
simultaneous translation is performed, average intent classification accuracy
degrades by only 1.7% relative and average slot F1 degrades by only 1.2%
relative.
Comment: 4 pages; To be published at AACL 2020; For code, see: https://github.com/jgmfitz/stil-mbart-multiatispp-aacl202
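One plausible way to pose STIL as a single sequence-to-sequence problem with mBART is sketched below: the source is the raw utterance and the target packs the English translation, inline slot tags, and the intent label into one string. The target format shown is an assumption for illustration; the linked repository has the authors' actual setup.

```python
# Fine-tuning mBART on a combined translate-tag-classify target sequence.
from transformers import MBartForConditionalGeneration, MBartTokenizerFast

tokenizer = MBartTokenizerFast.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="de_DE", tgt_lang="en_XX"
)
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

source = "zeige mir flüge von berlin nach boston"
# Hypothetical target encoding: translation + inline slot tags + intent label.
target = "show me flights from [B-fromloc berlin] to [B-toloc boston] <intent:flight>"

batch = tokenizer(source, text_target=target, return_tensors="pt")
loss = model(**batch).loss  # standard cross-entropy fine-tuning objective
```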
Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents
Techniques for automatically extracting important content elements from
business documents such as contracts, statements, and filings have the
potential to make business operations more efficient. This problem can be
formulated as a sequence labeling task, and we demonstrate the adaptation of BERT
to two types of business documents: regulatory filings and property lease
agreements. There are aspects of this problem that make it easier than
"standard" information extraction tasks and other aspects that make it more
difficult, but on balance we find that modest amounts of annotated data (less
than 100 documents) are sufficient to achieve reasonable accuracy. We integrate
our models into an end-to-end cloud platform that provides both an easy-to-use
annotation interface and an inference interface that allows users to upload
documents and inspect model outputs.
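One practical detail such documents force (an assumption about the pipeline, not something the abstract states): contracts and filings far exceed BERT's 512-token limit, so a common step is to split each document into overlapping windows before sequence labeling, e.g.:

```python
# Split a long document into overlapping 512-token windows for BERT.
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def windows(text: str, max_len: int = 512, stride: int = 128):
    """Yield overlapping token-id windows ready for token classification."""
    enc = tokenizer(
        text,
        max_length=max_len,
        stride=stride,
        truncation=True,
        return_overflowing_tokens=True,  # emit one encoding per window
    )
    for ids in enc["input_ids"]:
        yield ids

n = sum(1 for _ in windows("This lease agreement ... " * 500))
print(f"{n} overlapping windows")
```

Predictions from overlapping regions are then merged, e.g. by keeping the label predicted for each token's most central window.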
Zero-Shot Visual Slot Filling as Question Answering
This paper presents a new approach to visual zero-shot slot filling. The
approach extends previous approaches by reformulating the slot filling task as
Question Answering. Slot tags are converted to rich natural language questions
that capture the semantics of visual information and lexical text on the GUI
screen. These questions are paired with the user's utterance and slots are
extracted from the utterance using a state-of-the-art ALBERT-based Question
Answering system trained on the Stanford Question Answering Dataset (SQuAD2).
An approach to further refine the model with multi-task training is presented.
The multi-task approach facilitates the incorporation of a large number of
successive refinements and transfer learning across similar tasks. A new Visual
Slot dataset and a visual extension of the popular ATIS dataset are introduced
to support research and experimentation on visual slot filling. Results show F1
scores between 0.52 and 0.60 on the Visual Slot and ATIS datasets with no
training data (zero-shot).
Comment: 5 pages, 6 figures, 4 tables
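The core reformulation can be reproduced with off-the-shelf components: each slot tag becomes a natural-language question, and an extractive QA model pulls the slot value out of the utterance. The question templates, confidence threshold, and checkpoint below (a RoBERTa SQuAD2 model standing in for the paper's ALBERT system) are illustrative assumptions.

```python
# Zero-shot slot filling by asking one question per slot tag.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

utterance = "show me flights from denver to boston tomorrow"
slot_questions = {
    "fromloc": "What city is the flight leaving from?",
    "toloc": "What city is the flight going to?",
}

for slot, question in slot_questions.items():
    ans = qa(question=question, context=utterance)
    if ans["score"] > 0.1:  # threshold filters unanswerable slots (SQuAD2 style)
        print(slot, "=", ans["answer"])
```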
Warped Language Models for Noise Robust Language Understanding
Masked Language Models (MLM) are self-supervised neural networks trained to
fill in the blanks in a given sentence with masked tokens. Despite the
tremendous success of MLMs on various text-based tasks, they are not robust
for spoken language understanding, especially under the noise of spontaneous
conversational speech recognition. In this work we introduce Warped Language Models
(WLM) in which input sentences at training time go through the same
modifications as in MLM, plus two additional modifications, namely inserting
and dropping random tokens. These two modifications extend and contract the
sentence in addition to the modifications in MLMs, hence the word "warped" in
the name. The insertion and drop modifications of the input text during WLM
training resemble the types of noise caused by Automatic Speech Recognition
(ASR) errors, and as a result WLMs are likely to be more robust to ASR noise.
Through computational results we show that natural language understanding
systems built on top of WLMs perform better than those built on MLMs,
especially in the presence of ASR errors.
Comment: To appear at IEEE SLT 2021
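A sketch of the warping operations the abstract lists: alongside MLM-style masking, random tokens are inserted and dropped so that training inputs mimic ASR-corrupted text. The probabilities and vocabulary below are illustrative assumptions, not the paper's settings.

```python
# Warp a token sequence: random drops, insertions, and MLM-style masking.
import random

def warp(tokens, vocab, p_mask=0.12, p_drop=0.04, p_insert=0.04,
         mask_token="[MASK]"):
    """Return a warped copy of the token sequence."""
    out = []
    for tok in tokens:
        r = random.random()
        if r < p_drop:
            continue                          # drop: contracts the sentence
        if r < p_drop + p_insert:
            out.append(random.choice(vocab))  # insert: extends the sentence
        out.append(mask_token if random.random() < p_mask else tok)
    return out

print(warp("please play some jazz music".split(),
           vocab=["the", "uh", "play", "some"]))
```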