Learning to Rank Question-Answer Pairs using Hierarchical Recurrent Encoder with Latent Topic Clustering
In this paper, we propose a novel end-to-end neural architecture for ranking
candidate answers that adapts a hierarchical recurrent neural network and a
latent topic clustering module. With our proposed model, a text is encoded into a
vector representation from the word level to the chunk level to effectively
capture its entire meaning. In particular, by adapting the hierarchical
structure, our model shows very small performance degradation in longer-text
comprehension, whereas other state-of-the-art recurrent neural network models
degrade noticeably. Additionally, the latent topic clustering module extracts
semantic information from target samples. This clustering module is useful for
any text-related task because it allows each data sample to find its nearest topic
cluster, thus helping the neural network model analyze the entire data. We
evaluate our models on the Ubuntu Dialogue Corpus and a consumer electronics
domain question answering dataset, which is related to Samsung products. The
proposed model shows state-of-the-art results for ranking question-answer
pairs.
Comment: 10 pages, Accepted as a conference paper at NAACL 201
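The abstract describes the latent topic clustering module only at a high level: each sample finds its nearest topic cluster. The following is a minimal sketch of one way such a step could work, assuming dot-product similarity between a text encoding and a small set of topic vectors, with a softmax-weighted topic summary appended to the encoding. The function names, the similarity measure, and the toy vectors are illustrative assumptions, not the paper's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def latent_topic_features(encoding, topics):
    """Append a soft 'nearest topic' summary to a text encoding.

    encoding: list of floats (hypothetical output of a text encoder)
    topics:   list of K topic vectors, each the same length as encoding
              (hypothetical latent topic memory; learned in a real model)
    """
    # Similarity of the sample to each topic (dot product is an assumption).
    scores = [sum(e * t for e, t in zip(encoding, topic)) for topic in topics]
    weights = softmax(scores)
    # Weighted sum of topic vectors: a soft version of "nearest cluster".
    summary = [
        sum(w * topic[i] for w, topic in zip(weights, topics))
        for i in range(len(encoding))
    ]
    return encoding + summary, weights


# Toy example: three 2-dimensional topics, one sample.
x = [1.0, 0.0]
topics = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
features, weights = latent_topic_features(x, topics)
nearest = max(range(len(topics)), key=lambda k: weights[k])
```

In this toy setup the sample aligns with the first topic vector, so that topic receives the largest weight. The returned feature vector is twice the length of the input and could feed a downstream ranking layer.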
Deep Neural Networks for Text Ranking in Question Answering Systems
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2020. 8. Kyomin Jung.
The question answering (QA) system has attracted great interest due to its applicability in real-world applications. This dissertation proposes novel ranking algorithms for the QA system based on deep neural networks. We first tackle long-text QA, which requires the model to understand excessively long sequences of text inputs. To solve this problem, we propose a hierarchical recurrent dual encoder that encodes texts from the word level to the paragraph level. We further propose a latent topic clustering method that utilizes semantic information in the target corpus and thus increases the performance of the QA system. Secondly, we investigate short-text QA, where the information in text pairs is limited. To overcome this insufficiency, we combine a pretrained language model and an enhanced latent clustering method with the QA model. This novel architecture enables the model to utilize additional information, achieving state-of-the-art performance on the standard answer-selection tasks (i.e., WikiQA, TREC-QA). Finally, we investigate detecting supporting sentences for complex QA systems. In contrast to the previous studies, the model needs to understand the relationships between sentences to answer the question. Inspired by the hierarchical nature of text, we propose a graph neural network-based model that iteratively propagates necessary information between text nodes and achieves the best performance among existing methods.
This dissertation proposes models for question answering systems based on deep neural networks. First, to answer questions over long text, we propose a recurrent neural network model with a hierarchical structure. This lets the model handle the given text efficiently in units of short sequences, yielding a large performance gain. We also propose a model that automatically classifies the topics latent in the data during training, and merging it into the existing question answering model brings a further improvement. In the follow-up study, we propose a question answering model for short sentences. As sentences get shorter, the amount of information obtainable from them decreases. To address this problem, we apply a pretrained language model and a new topic clustering technique. The proposed model achieves the best performance among prior short-sentence question answering studies. Finally, we study question answering in which the answer must be found using the relationships among multiple sentences. We represent the sentences in a document as a graph and propose a graph neural network that can learn from it. The proposed model successfully computes the relationships between sentences and thereby achieves the best performance among previously proposed models on highly complex question answering systems.
1 Introduction
2 Background
2.1 Textual Data Representation
2.2 Encoding Sequential Information in Text
3 Question-Answer Pair Ranking for Long Text
3.1 Related Work
3.2 Method
3.2.1 Baseline Approach
3.2.2 Proposed Approaches (HRDE+LTC)
3.3 Experimental Setup and Dataset
3.3.1 Dataset
3.3.2 Consumer Product Question Answering Corpus
3.3.3 Implementation Details
3.4 Empirical Results
3.4.1 Comparison with Other Methods
3.4.2 Degradation Comparison for Longer Texts
3.4.3 Effects of the LTC Numbers
3.4.4 Comprehensive Analysis of LTC
3.5 Further Investigation on Ranking Lengthy Document
3.5.1 Problem and Dataset
3.5.2 Methods
3.5.3 Experimental Results
3.6 Conclusion
4 Answer-Selection for Short Sentence
4.1 Related Work
4.2 Method
4.2.1 Baseline Approach
4.2.2 Proposed Approaches (Comp-Clip+LM+LC+TL)
4.3 Experimental Setup and Dataset
4.3.1 Dataset
4.3.2 Implementation Details
4.4 Empirical Results
4.4.1 Comparison with Other Methods
4.4.2 Impact of Latent Clustering
4.5 Conclusion
5 Supporting Sentence Detection for Question Answering
5.1 Related Work
5.2 Method
5.2.1 Baseline Approaches
5.2.2 Proposed Approach (Propagate-Selector)
5.3 Experimental Setup and Dataset
5.3.1 Dataset
5.3.2 Implementation Details
5.4 Empirical Results
5.4.1 Comparisons with Other Methods
5.4.2 Hop Analysis
5.4.3 Impact of Various Graph Topologies
5.4.4 Impact of Node Representation
5.5 Discussion
5.6 Conclusion
6 Conclusion
Docto