67 research outputs found

    Data Augmentation Techniques for Natural Language Processing Using Deep Learning-Based Generative Models

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Sang-goo Lee.
    Recent advances in the generation capability of deep learning models have spurred interest in using deep generative models for unsupervised generative data augmentation (GDA). Generative data augmentation aims to improve the performance of a downstream machine learning model by augmenting the original dataset with samples generated from a deep latent variable model. This approach is attractive to the natural language processing community because (1) there is a shortage of text augmentation techniques that require little supervision and (2) resource scarcity is prevalent. In this dissertation, we explore the feasibility of exploiting deep latent variable models for data augmentation on three NLP tasks: sentence classification, spoken language understanding (SLU), and dialogue state tracking (DST). These tasks vary in complexity and properties: SLU requires multi-task learning of text classification and sequence tagging, while DST requires the understanding of hierarchical and recurrent data structures. For each of the three tasks, we propose a task-specific latent variable model based on conditional, hierarchical, and sequential variational autoencoders (VAE) for multi-modal joint modeling of linguistic features and the relevant annotations. We conduct extensive experiments to statistically justify our hypothesis that deep generative data augmentation is beneficial for all subject tasks. Our experiments show that deep generative data augmentation is effective for the selected tasks, supporting the idea that the technique can potentially be applied to a wider range of NLP tasks. Ablation and qualitative studies reveal deeper insight into the underlying mechanisms of generative data augmentation.
    As a secondary contribution, we also shed light on the recurring posterior collapse phenomenon in autoregressive VAEs and, subsequently, propose novel techniques to reduce the model risk, which is crucial for proper training of complex VAE models, enabling them to synthesize better samples for data augmentation. In summary, this work intends to demonstrate and analyze the effectiveness of unsupervised generative data augmentation in NLP. Ultimately, our approach enables standardized adoption of generative data augmentation, which can be applied orthogonally to existing regularization techniques.

    Abstract (translated from Korean): Rapid recent advances in deep learning-based generative models have raised expectations for the feasibility of generative data augmentation (GDA) built on them. Generative data augmentation refers to techniques that improve performance on an associated task by adding samples generated from a deep latent variable model to the original dataset. It can therefore be regarded as a form of regularization carried out in data space. This new use of deep generative models is especially important in natural language processing because (1) general-purpose text data augmentation techniques are lacking and (2) an alternative is needed to overcome the scarcity of text data. To cover problems of varied complexity and characteristics, this dissertation examines the validity of data augmentation with deep generative models on three problems: text classification; spoken language understanding (SLU), which requires sequence labeling and multi-task learning; and dialogue state tracking (DST), which requires handling hierarchical and recursive data structures.
    Building on conditional, hierarchical, and sequential variational autoencoders (VAE), this work presents specialized deep generative models that jointly generate text and the associated annotations for each NLP problem, and statistically demonstrates the effect of deep generative data augmentation through broad experiments covering a variety of downstream models and datasets. As a secondary study, we investigate the posterior collapse problem that frequently occurs in autoregressive VAEs and propose new methods to mitigate it. We verify that applying these methods to the complex VAE models needed for generative data augmentation improves generation quality and can thereby have a positive effect on augmentation. Through this dissertation, we can look forward to the standardization of unsupervised data augmentation techniques in NLP that can be applied alongside existing regularization techniques.

    Contents:
    1 Introduction
      1.1 Motivation
      1.2 Dissertation Overview
    2 Background and Related Work
      2.1 Deep Latent Variable Models
        2.1.1 Variational Autoencoder (VAE)
        2.1.2 Deep Generative Models and Text Generation
      2.2 Data Augmentation
        2.2.1 General Description
        2.2.2 Categorization of Data Augmentation
        2.2.3 Theoretical Explanations
      2.3 Summary
    3 Basic Task: Text Classification
      3.1 Introduction
      3.2 Our Approach
        3.2.1 Proposed Models
        3.2.2 Training with I-VAE
      3.3 Experiments
        3.3.1 Datasets
        3.3.2 Experimental Settings
        3.3.3 Implementation Details
        3.3.4 Data Augmentation Results
        3.3.5 Ablation Studies
        3.3.6 Qualitative Analysis
      3.4 Summary
    4 Multi-task Learning: Spoken Language Understanding
      4.1 Introduction
      4.2 Related Work
      4.3 Model Description
        4.3.1 Framework Formulation
        4.3.2 Joint Generative Model
      4.4 Experiments
        4.4.1 Datasets
        4.4.2 Experimental Settings
        4.4.3 Generative Data Augmentation Results
        4.4.4 Comparison to Other State-of-the-art Results
        4.4.5 Ablation Studies
      4.5 Summary
    5 Complex Data: Dialogue State Tracking
      5.1 Introduction
      5.2 Background and Related Work
        5.2.1 Task-oriented Dialogue
        5.2.2 Dialogue State Tracking
        5.2.3 Conversation Modeling
      5.3 Variational Hierarchical Dialogue Autoencoder (VHDA)
        5.3.1 Notations
        5.3.2 Variational Hierarchical Conversational RNN
        5.3.3 Proposed Model
        5.3.4 Posterior Collapse
      5.4 Experimental Results
        5.4.1 Experimental Settings
        5.4.2 Data Augmentation Results
        5.4.3 Intrinsic Evaluation - Language Evaluation
        5.4.4 Qualitative Results
      5.5 Summary
    6 Conclusion
      6.1 Summary
      6.2 Limitations
      6.3 Future Work
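The core idea of generative data augmentation described in this abstract, adding model-generated samples to the original training set, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `augment_dataset`, `toy_sampler`, and the `ratio` parameter are hypothetical and are not taken from the dissertation, and the toy sampler merely stands in for drawing a latent code and decoding a (text, label) pair from a trained VAE.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label)


def augment_dataset(
    original: List[Example],
    sample_fn: Callable[[random.Random], Example],
    ratio: float = 0.5,
    seed: int = 0,
) -> List[Example]:
    """Return the original examples followed by synthetic ones.

    The number of synthetic examples is round(ratio * len(original)).
    `sample_fn` is a hypothetical hook standing in for a trained
    generative model that emits one labeled example per call.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    rng = random.Random(seed)
    n_synthetic = round(ratio * len(original))
    synthetic = [sample_fn(rng) for _ in range(n_synthetic)]
    return list(original) + synthetic


def toy_sampler(rng: random.Random) -> Example:
    """Toy stand-in for sampling z from the prior and decoding (text, label)."""
    label = rng.choice(["positive", "negative"])
    word = "great" if label == "positive" else "poor"
    return (f"the service was {word}", label)


if __name__ == "__main__":
    train = [
        ("i loved it", "positive"),
        ("not worth it", "negative"),
        ("would buy again", "positive"),
        ("broke in a day", "negative"),
    ]
    augmented = augment_dataset(train, toy_sampler, ratio=0.5)
    print(len(train), "->", len(augmented))
```

The downstream model would then be trained on the augmented list instead of the original one. How many synthetic samples to add, and whether to filter them, are design choices this sketch leaves open.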

    Data-efficient methods for dialogue systems

    The Conversational User Interface (CUI) has become ubiquitous in everyday life, in consumer-focused products like Siri and Alexa or more business-oriented customer support automation solutions. Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts, which dramatically increases the cost of deploying such systems in production setups and reduces their flexibility as software products. Trained on less data, these methods end up severely lacking robustness to various phenomena of spoken language (e.g. disfluencies) and to out-of-domain input, and often have too little power to generalise to other tasks and domains. In this thesis, we address the above issues by introducing a series of methods for bootstrapping robust dialogue systems from minimal data. Firstly, we study two orthogonal approaches to dialogue, a linguistically informed model (DyLan) and a machine learning-based one (MemN2N), from the data efficiency perspective, i.e. their potential to generalise from minimal data and their robustness to natural spontaneous input. We outline the steps to obtain data-efficient solutions with either approach and proceed with the neural models for the rest of the thesis. We then introduce the core contributions of this thesis, two data-efficient models for dialogue response generation: the Dialogue Knowledge Transfer Network (DiKTNet), based on transferable latent dialogue representations, and the Generative-Retrieval Transformer (GRTr), combining response generation logic with a retrieval mechanism as the fallback. GRTr ranked first at the Dialog System Technology Challenge 8 Fast Domain Adaptation task. Next, we address the problem of training robust neural models from minimal data. As such, we look at robustness to disfluencies and propose a multitask LSTM-based model for domain-general disfluency detection.
    We then go on to explore robustness to anomalous, or out-of-domain (OOD), input. We address this problem by (1) presenting Turn Dropout, a data-augmentation technique that facilitates training for anomalous input using only in-domain data, and (2) introducing VHCN and AE-HCN, autoencoder-augmented models for efficient training with Turn Dropout, based on the Hybrid Code Networks (HCN) model family. With all the above work addressing goal-oriented dialogue, our final contribution in this thesis focuses on social dialogue, where the main objective is maintaining natural, coherent, and engaging conversation for as long as possible. We introduce a neural model for response ranking in social conversation used in Alana, the 3rd place winner in the Amazon Alexa Prize 2017 and 2018. For our model, we employ a novel technique of predicting the dialogue length as the main objective for ranking. We show that this approach matches the performance of its counterpart based on the conventional, human rating-based objective, and surpasses it given more raw dialogue transcripts, thus reducing the dependence on costly and cumbersome dialogue annotations.
    EPSRC project BABBLE (grant EP/M01553X/1)
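The idea of using dialogue length as a ranking objective can be made concrete with a small sketch. This is one possible reading, not the thesis's implementation: the function names `length_targets` and `rank_candidates` are hypothetical, the choice of total turn count as the target is an assumption (the abstract does not say whether total or remaining length is predicted), and `predict_length` stands in for a trained neural model.

```python
from typing import Callable, List, Sequence, Tuple

Dialogue = List[str]  # alternating turns; assumed here to start with the user
TrainingExample = Tuple[List[str], str, int]  # (context, response, target length)


def length_targets(dialogues: Sequence[Dialogue]) -> List[TrainingExample]:
    """Build training triples from raw transcripts without human ratings.

    Each system response is paired with the turns preceding it and labeled
    with the total number of turns in its dialogue (an assumed target).
    """
    examples: List[TrainingExample] = []
    for dialogue in dialogues:
        total_turns = len(dialogue)
        # System responses sit at odd indices when the user speaks first.
        for i in range(1, total_turns, 2):
            examples.append((dialogue[:i], dialogue[i], total_turns))
    return examples


def rank_candidates(
    context: List[str],
    candidates: Sequence[str],
    predict_length: Callable[[List[str], str], float],
) -> List[str]:
    """Order candidates so the one with the longest predicted dialogue comes first."""
    return sorted(candidates, key=lambda c: predict_length(context, c), reverse=True)
```

Because the targets come directly from transcript lengths, this kind of setup needs no per-response human ratings, which is the property the abstract highlights.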

    Data Augmentation for Conversational AI

    Advancements in conversational systems have revolutionized information access, surpassing the limitations of single queries. However, developing dialogue systems requires a large amount of training data, which is a challenge in low-resource domains and languages. Traditional data collection methods like crowd-sourcing are labor-intensive and time-consuming, making them ineffective in this context. Data augmentation (DA) is an effective approach to alleviate the data scarcity problem in conversational systems. This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems. It highlights recent advances in conversation augmentation, open-domain and task-oriented conversation generation, and different paradigms of evaluating these models. We also discuss current challenges and future directions to help researchers and practitioners further advance this area.

    Goal-Embedded Dual Hierarchical Model for Task-Oriented Dialogue Generation

    Hierarchical neural networks are often used to model inherent structures within dialogues. For goal-oriented dialogues, these models lack a mechanism for adhering to the goals and neglect the distinct conversational patterns between two interlocutors. In this work, we propose the Goal-Embedded Dual Hierarchical Attentional Encoder-Decoder (G-DuHA), which is able to center around goals and capture interlocutor-level disparity while modeling goal-oriented dialogues. Experiments on dialogue generation, response generation, and human evaluations demonstrate that the proposed model successfully generates higher-quality, more diverse, and goal-centric dialogues. Moreover, we apply data augmentation via goal-oriented dialogue generation to task-oriented dialog systems and achieve better performance.
    Comment: Accepted by CoNLL-201
    • …