60 research outputs found

    BERT for Joint Intent Classification and Slot Filling

    Intent classification and slot filling are two essential tasks for natural language understanding. They often suffer from small-scale human-labeled training data, resulting in poor generalization capability, especially for rare words. Recently a new language representation model, BERT (Bidirectional Encoder Representations from Transformers), facilitates pre-training deep bidirectional representations on large-scale unlabeled corpora, and has created state-of-the-art models for a wide variety of natural language processing tasks after simple fine-tuning. However, there has not been much effort on exploring BERT for natural language understanding. In this work, we propose a joint intent classification and slot filling model based on BERT. Experimental results demonstrate that our proposed model achieves significant improvement on intent classification accuracy, slot filling F1, and sentence-level semantic frame accuracy on several public benchmark datasets, compared to the attention-based recurrent neural network models and slot-gated models. Comment: 4 pages, 1 figure
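
    The abstract names three metrics without defining them. The sketch below illustrates two of them under the usual convention, which is an assumption here and not taken from the paper: intent accuracy counts matching intent labels, and sentence-level semantic frame accuracy counts an utterance as correct only when the intent and every slot tag are right. Slot F1 is omitted for brevity. The function names and the toy data are hypothetical.

```python
from typing import Sequence


def intent_accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of utterances whose predicted intent matches the gold intent."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def frame_accuracy(
    gold_intents: Sequence[str],
    pred_intents: Sequence[str],
    gold_slots: Sequence[Sequence[str]],
    pred_slots: Sequence[Sequence[str]],
) -> float:
    """Fraction of utterances with the intent AND every slot tag correct."""
    correct = sum(
        gi == pi and list(gs) == list(ps)
        for gi, pi, gs, ps in zip(gold_intents, pred_intents, gold_slots, pred_slots)
    )
    return correct / len(gold_intents)


# Hypothetical toy data: three utterances with BIO-style slot tags.
gold_intents = ["flight", "flight", "weather"]
pred_intents = ["flight", "flight", "flight"]
gold_slots = [["O", "B-city"], ["B-city", "O"], ["O", "O"]]
pred_slots = [["O", "B-city"], ["O", "O"], ["O", "O"]]

print(intent_accuracy(gold_intents, pred_intents))
print(frame_accuracy(gold_intents, pred_intents, gold_slots, pred_slots))
```

    Frame accuracy is the stricter of the two: in the toy data the second utterance has the right intent but a wrong slot tag, so it counts toward intent accuracy and not toward frame accuracy.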

    A Novel Task-Oriented Text Corpus in Silent Speech Recognition and its Natural Language Generation Construction Method

    Millions of people with severe speech disorders around the world may regain their communication capabilities through techniques of silent speech recognition (SSR). Using electroencephalography (EEG) as a biomarker for speech decoding has been popular for SSR. However, the lack of an SSR text corpus has impeded the development of this technique. Here, we construct a novel task-oriented text corpus for use in the field of SSR. In the process of construction, we propose a task-oriented hybrid construction method based on a natural language generation algorithm. The algorithm focuses on the strategy of data-to-text generation and has two advantages, linguistic quality and high diversity, which come from a template-based method and deep neural networks, respectively. In an SSR experiment with the generated text corpus, analysis results show that our hybrid construction method outperforms pure methods such as template-based natural language generation or neural natural language generation models. Comment: Accepted for publication in the 3rd International Conference on Natural Language Processing and Information Retrieval, 201

    ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

    Automatic speech recognition (ASR) via call is essential for various applications, including AI for contact center (AICC) services. Despite the advancement of ASR, however, most publicly available call-based speech corpora such as Switchboard are old-fashioned. Also, most existing call corpora are in English and mainly focus on open domain dialog or general scenarios such as audiobooks. Here we introduce a new large-scale Korean call-based speech corpus under a goal-oriented dialog scenario from more than 11,000 people, i.e., the ClovaCall corpus. ClovaCall includes approximately 60,000 pairs of a short sentence and its corresponding spoken utterance in a restaurant reservation domain. We validate the effectiveness of our dataset with intensive experiments using two standard ASR models. Furthermore, we release our ClovaCall dataset and baseline source code via https://github.com/ClovaAI/ClovaCall. Comment: 5 pages, 2 figures, 4 tables, The first two authors equally contributed to this work

    TATL at W-NUT 2020 Task 2: A Transformer-based Baseline System for Identification of Informative COVID-19 English Tweets

    As the COVID-19 outbreak continues to spread throughout the world, more and more information about the pandemic has been shared publicly on social media. For example, there are a huge number of COVID-19 English Tweets daily on Twitter. However, the majority of those Tweets are uninformative, and hence it is important to be able to automatically select only the informative ones for downstream applications. In this short paper, we present our participation in the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. Inspired by the recent advances in pretrained Transformer language models, we propose a simple yet effective baseline for the task. Despite its simplicity, our proposed approach shows very competitive results on the leaderboard, ranking 8th out of 56 participating teams.

    Joint Intent Detection And Slot Filling Based on Continual Learning Model

    Slot filling and intent detection have become a significant theme in the field of natural language understanding. Even though slot filling is closely associated with intent detection, the characteristics of the information required for the two tasks are different, and most approaches may not be fully aware of this problem. In addition, balancing the accuracy of the two tasks effectively is an inevitable problem for a joint learning model. In this paper, a Continual Learning Interrelated Model (CLIM) is proposed to consider semantic information with different characteristics and balance the accuracy between intent detection and slot filling effectively. The experimental results show that CLIM achieves state-of-the-art performance on slot filling and intent detection on ATIS and Snips. Comment: Accepted to ICASSP 202

    SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling

    Slot filling and intent detection are two main tasks in spoken language understanding (SLU) systems. In this paper, we propose a novel non-autoregressive model named SlotRefine for joint intent detection and slot filling. Besides, we design a novel two-pass iteration mechanism to handle the uncoordinated slots problem caused by the conditional independence of non-autoregressive models. Experiments demonstrate that our model significantly outperforms previous models on the slot filling task, while considerably speeding up decoding (up to ×10.77). In-depth analyses show that 1) pretraining schemes could further enhance our model; 2) the two-pass mechanism indeed remedies the uncoordinated slots. Comment: To appear in the main conference of EMNLP 202

    STIL -- Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++

    Slot-filling, Translation, Intent classification, and Language identification, or STIL, is a newly-proposed task for multilingual Natural Language Understanding (NLU). By performing simultaneous slot filling and translation into a single output language (English in this case), some portion of downstream system components can be monolingual, reducing development and maintenance cost. Results are given using the multilingual BART model (Liu et al., 2020) fine-tuned on 7 languages using the MultiATIS++ dataset. When no translation is performed, mBART's performance is comparable to the current state of the art system (Cross-Lingual BERT by Xu et al. (2020)) for the languages tested, with better average intent classification accuracy (96.07% versus 95.50%) but worse average slot F1 (89.87% versus 90.81%). When simultaneous translation is performed, average intent classification accuracy degrades by only 1.7% relative and average slot F1 degrades by only 1.2% relative. Comment: 4 pages; To be published at AACL 2020; For code, see: https://github.com/jgmfitz/stil-mbart-multiatispp-aacl202
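
    The abstract reports degradation as a relative percentage, which is easy to misread as percentage points. The sketch below shows the arithmetic, assuming the relative figures are taken against the no-translation mBART scores quoted in the same abstract (the abstract does not say so explicitly). The helper names are hypothetical, and the resulting values are implied estimates, not results reported by the authors.

```python
def relative_change(baseline: float, new: float) -> float:
    """Relative change from baseline to new, in percent (negative = degradation)."""
    return (new - baseline) / baseline * 100.0


def apply_relative_drop(baseline: float, drop_percent: float) -> float:
    """Score left after a relative drop of drop_percent applied to baseline."""
    return baseline * (1.0 - drop_percent / 100.0)


# Baselines quoted in the abstract (no translation, mBART), in percent.
intent_acc_no_translation = 96.07
slot_f1_no_translation = 89.87

# Assumption: the 1.7% and 1.2% relative drops apply to these baselines.
implied_intent_acc = apply_relative_drop(intent_acc_no_translation, 1.7)
implied_slot_f1 = apply_relative_drop(slot_f1_no_translation, 1.2)

print(f"implied intent accuracy with translation: {implied_intent_acc:.2f}%")
print(f"implied slot F1 with translation: {implied_slot_f1:.2f}%")
```

    A 1.7% relative drop from a score near 96% is a little over 1.6 percentage points, so the relative and absolute readings differ only slightly here; the gap grows as the baseline score falls.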

    Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents

    Techniques for automatically extracting important content elements from business documents such as contracts, statements, and filings have the potential to make business operations more efficient. This problem can be formulated as a sequence labeling task, and we demonstrate the adaptation of BERT to two types of business documents: regulatory filings and property lease agreements. There are aspects of this problem that make it easier than "standard" information extraction tasks and other aspects that make it more difficult, but on balance we find that modest amounts of annotated data (less than 100 documents) are sufficient to achieve reasonable accuracy. We integrate our models into an end-to-end cloud platform that provides both an easy-to-use annotation interface and an inference interface that allows users to upload documents and inspect model outputs.

    Zero-Shot Visual Slot Filling as Question Answering

    This paper presents a new approach to visual zero-shot slot filling. The approach extends previous approaches by reformulating the slot filling task as Question Answering. Slot tags are converted to rich natural language questions that capture the semantics of visual information and lexical text on the GUI screen. These questions are paired with the user's utterance and slots are extracted from the utterance using a state-of-the-art ALBERT-based Question Answering system trained on the Stanford Question Answering dataset (SQuAD2). An approach to further refine the model with multi-task training is presented. The multi-task approach facilitates the incorporation of a large number of successive refinements and transfer learning across similar tasks. A new Visual Slot dataset and a visual extension of the popular ATIS dataset are introduced to support research and experimentation on visual slot filling. Results show F1 scores between 0.52 and 0.60 on the Visual Slot and ATIS datasets with no training data (zero-shot). Comment: 5 pages, 6 figures, 4 tables

    Warped Language Models for Noise Robust Language Understanding

    Masked Language Models (MLM) are self-supervised neural networks trained to fill in the blanks in a given sentence with masked tokens. Despite the tremendous success of MLMs for various text-based tasks, they are not robust for spoken language understanding, especially to the recognition noise in spontaneous conversational speech. In this work we introduce Warped Language Models (WLM), in which input sentences at training time go through the same modifications as in MLM, plus two additional modifications, namely inserting and dropping random tokens. These two modifications extend and contract the sentence in addition to the modifications in MLMs, hence the word "warped" in the name. The insertion and drop modifications of the input text during training of WLMs resemble the types of noise due to Automatic Speech Recognition (ASR) errors, and as a result WLMs are likely to be more robust to ASR noise. Through computational results we show that natural language understanding systems built on top of WLMs perform better than those built on MLMs, especially in the presence of ASR errors. Comment: To appear at IEEE SLT 202
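
    The warping step described in this abstract can be sketched as a token-level corruption function. This is a minimal illustration and not the authors' implementation: the probabilities, the `warp` name, and the toy vocabulary are assumptions, and the MLM side is simplified to plain masking (it leaves out the random-replacement and keep-as-is cases that BERT-style masking also uses).

```python
import random
from typing import List, Sequence

MASK = "[MASK]"


def warp(
    tokens: Sequence[str],
    vocab: Sequence[str],
    rng: random.Random,
    p_mask: float = 0.10,    # assumed rate, not from the paper
    p_insert: float = 0.05,  # assumed rate, not from the paper
    p_drop: float = 0.05,    # assumed rate, not from the paper
) -> List[str]:
    """Corrupt a sentence by dropping, inserting, or masking tokens."""
    out: List[str] = []
    for tok in tokens:
        r = rng.random()  # uniform in [0, 1)
        if r < p_drop:
            continue  # drop: the sentence contracts
        if r < p_drop + p_insert:
            out.append(rng.choice(vocab))  # insert a random token: the sentence extends
            out.append(tok)
        elif r < p_drop + p_insert + p_mask:
            out.append(MASK)  # MLM-style masking
        else:
            out.append(tok)  # leave the token unchanged
    return out


# Hypothetical usage with a toy vocabulary and a fixed seed.
sentence = "book a table for two at seven".split()
toy_vocab = ["the", "uh", "to", "please"]
print(warp(sentence, toy_vocab, random.Random(0)))
```

    Unlike masking alone, dropping and inserting change the sentence length, so the output no longer aligns position-by-position with the input; that misalignment is the point, since ASR deletion and insertion errors shift transcripts in the same way.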