A Survey of Natural Language Generation
This paper offers a comprehensive review of research on Natural Language Generation (NLG) over the past two decades, focusing on deep learning methods for data-to-text and text-to-text generation as well as new applications of NLG technology. The survey aims to (a) give the latest synthesis of deep learning research on the core NLG tasks and the architectures adopted in the field; (b) detail meticulously and comprehensively the various NLG tasks and datasets, and draw attention to the challenges of NLG evaluation, covering different evaluation methods and their relationships; and (c) highlight future emphases and relatively recent research issues arising from the increasing synergy between NLG and other areas of artificial intelligence, such as computer vision, text, and computational creativity.
Comment: Accepted by ACM Computing Surveys (CSUR) 202
Modeling Relationships Between Sentences Using Deep Neural Network-Based Sentence Encoders
Thesis (Ph.D.)--Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2020. Advisor: Sang-goo Lee.
Sentence matching is the task of predicting the degree of match between two sentences.
Since a high level of natural language understanding is needed for a model to identify the relationship between two sentences, sentence matching is an important component of various natural language processing applications.
In this dissertation, we seek to improve the sentence matching module along three dimensions: the sentence encoder, the matching function, and semi-supervised learning.
To enhance the sentence encoder, the network responsible for extracting useful features from a sentence, we propose two new sentence encoder architectures: Gumbel Tree-LSTM and Cell-aware Stacked LSTM (CAS-LSTM).
Gumbel Tree-LSTM is based on the recursive neural network (RvNN) architecture; however, unlike typical RvNN architectures, it does not need structured input.
Instead, it learns from data a parsing strategy that is optimized for a specific task.
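For intuition, here is a minimal NumPy sketch of the straight-through Gumbel-Softmax trick that lets such a model make discrete merge decisions while remaining trainable; the merge scores below are hypothetical placeholders, not the thesis's actual model:

    import numpy as np

    def gumbel_softmax(logits, temperature=0.5, rng=np.random.default_rng(0)):
        # Add Gumbel noise and apply a temperature-controlled softmax, giving a
        # differentiable relaxation of sampling one candidate.
        g = -np.log(-np.log(rng.uniform(size=logits.shape) + 1e-20) + 1e-20)
        y = (logits + g) / temperature
        y = np.exp(y - y.max())
        return y / y.sum()

    # Hypothetical scores for three candidate pairs of adjacent nodes; the hard
    # argmax is used in the forward pass, the soft weights carry the gradient.
    scores = np.array([0.2, 1.5, -0.3])
    weights = gumbel_softmax(scores)
    merge_index = int(np.argmax(weights))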
The latter, CAS-LSTM, extends the stacked long short-term memory (LSTM) architecture by introducing an additional forget gate for better handling of vertical information flow.
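The following sketch shows the core idea under assumed shapes and parameterisation (not the thesis's exact equations): besides the usual gates, an extra forget gate decides how much of the lower layer's cell state enters the current layer's cell:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cas_lstm_step(h_below, c_below, h_prev, c_prev, W):
        # One step of one layer. Gates: input i, forget f, output o, plus an
        # extra vertical forget gate v applied to the lower layer's cell state.
        z = W @ np.concatenate([h_below, h_prev])
        i, f, o, v, u = np.split(z, 5)
        c = sigmoid(f) * c_prev + sigmoid(v) * c_below + sigmoid(i) * np.tanh(u)
        h = sigmoid(o) * np.tanh(c)
        return h, c

    d = 4
    rng = np.random.default_rng(0)
    W = 0.1 * rng.normal(size=(5 * d, 2 * d))
    h, c = cas_lstm_step(rng.normal(size=d), rng.normal(size=d),
                         np.zeros(d), np.zeros(d), W)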
Next, as a new matching function, we present the element-wise bilinear sentence matching (ElBiS) function.
It aims to automatically find an aggregation scheme that fuses two sentence representations into a single one suitable for a specific task.
From the fact that the same sentence encoder is shared across inputs, we hypothesize, and empirically verify, that considering only the element-wise bilinear interaction is sufficient for comparing two sentence vectors.
By restricting the interaction, we can greatly reduce the number of required parameters compared with full bilinear pooling, without losing the advantage of automatically discovering useful aggregation schemes.
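As a rough illustration, one plausible per-dimension parameterisation (not necessarily the exact form used in the thesis) keeps only interactions between matching dimensions, so the parameter count grows linearly with the dimensionality rather than cubically as in full bilinear pooling:

    import numpy as np

    def elbis_like_fusion(u, v, a, b, c, d):
        # Each output element depends only on the matching pair (u_i, v_i):
        # a bilinear term plus linear and bias terms, all per dimension.
        return a * u * v + b * u + c * v + d

    dim = 300
    rng = np.random.default_rng(0)
    u, v = rng.normal(size=dim), rng.normal(size=dim)
    a, b, c, d = (rng.normal(size=dim) for _ in range(4))
    fused = elbis_like_fusion(u, v, a, b, c, d)  # 4*dim parameters vs O(dim**3)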
Finally, to facilitate semi-supervised training, i.e., to make use of both labeled and unlabeled data during training, we propose the cross-sentence latent variable model (CS-LVM).
Its generative model assumes that a target sentence is generated from the latent representation of a source sentence together with a variable indicating the relationship between the source and the target.
Because it considers the two sentences of a pair together in a single model, its training objectives are defined more naturally than in prior approaches based on the variational auto-encoder (VAE).
We also define semantic constraints that push the generator toward semantically more plausible sentences.
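The generative story can be illustrated with a toy, runnable sketch; the decoder parameters here are hypothetical, and a unit-norm latent stands in for the von Mises-Fisher latent referenced in Chapter 6:

    import numpy as np

    NUM_LABELS, LATENT_DIM, VOCAB_SIZE, LENGTH = 3, 8, 50, 10
    rng = np.random.default_rng(0)
    # Hypothetical per-label decoder weights mapping the latent to token logits.
    W = rng.normal(size=(NUM_LABELS, VOCAB_SIZE, LATENT_DIM))

    def generate_target(z_source, y):
        # Target tokens are generated from the source latent z and the relation
        # label y together, which is the core assumption of a CS-LVM-style model.
        logits = W[y] @ z_source
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return rng.choice(VOCAB_SIZE, size=LENGTH, p=probs)

    z = rng.normal(size=LATENT_DIM)
    z /= np.linalg.norm(z)              # unit-norm latent (vMF-like support)
    y = rng.integers(NUM_LABELS)        # relation label (e.g., entailment)
    target_tokens = generate_target(z, y)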
We believe that the improvements proposed in this dissertation will advance the effectiveness of various natural language processing applications that involve modeling sentence pairs.

Chapter 1 Introduction
1.1 Sentence Matching
1.2 Deep Neural Networks for Sentence Matching
1.3 Scope of the Dissertation
Chapter 2 Background and Related Work
2.1 Sentence Encoders
2.2 Matching Functions
2.3 Semi-Supervised Training
Chapter 3 Sentence Encoder: Gumbel Tree-LSTM
3.1 Motivation
3.2 Preliminaries
3.2.1 Recursive Neural Networks
3.2.2 Training RvNNs without Tree Information
3.3 Model Description
3.3.1 Tree-LSTM
3.3.2 Gumbel-Softmax
3.3.3 Gumbel Tree-LSTM
3.4 Implementation Details
3.5 Experiments
3.5.1 Natural Language Inference
3.5.2 Sentiment Analysis
3.5.3 Qualitative Analysis
3.6 Summary
Chapter 4 Sentence Encoder: Cell-aware Stacked LSTM
4.1 Motivation
4.2 Related Work
4.3 Model Description
4.3.1 Stacked LSTMs
4.3.2 Cell-aware Stacked LSTMs
4.3.3 Sentence Encoders
4.4 Experiments
4.4.1 Natural Language Inference
4.4.2 Paraphrase Identification
4.4.3 Sentiment Classification
4.4.4 Machine Translation
4.4.5 Forget Gate Analysis
4.4.6 Model Variations
4.5 Summary
Chapter 5 Matching Function: Element-wise Bilinear Sentence Matching
5.1 Motivation
5.2 Proposed Method: ElBiS
5.3 Experiments
5.3.1 Natural Language Inference
5.3.2 Paraphrase Identification
5.4 Summary and Discussion
Chapter 6 Semi-Supervised Training: Cross-Sentence Latent Variable Model
6.1 Motivation
6.2 Preliminaries
6.2.1 Variational Auto-Encoders
6.2.2 von Mises-Fisher Distribution
6.3 Proposed Framework: CS-LVM
6.3.1 Cross-Sentence Latent Variable Model
6.3.2 Architecture
6.3.3 Optimization
6.4 Experiments
6.4.1 Natural Language Inference
6.4.2 Paraphrase Identification
6.4.3 Ablation Study
6.4.4 Generated Sentences
6.4.5 Implementation Details
6.5 Summary and Discussion
Chapter 7 Conclusion
Appendix A Appendix
A.1 Sentences Generated from CS-LVM
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision community because it plays a significant role in video surveillance. Many algorithms have been proposed to handle this task. The goal of this paper is to review existing works, whether based on traditional methods or on deep learning networks. Firstly, we introduce the background of pedestrian attribute recognition (PAR, for short), including the fundamental concepts of pedestrian attributes and the corresponding challenges. Secondly, we introduce existing benchmarks, including popular datasets and evaluation criteria. Thirdly, we analyse the concepts of multi-task learning and multi-label learning, and explain the relations between these two learning paradigms and pedestrian attribute recognition. We also review some popular network architectures that have been widely applied in the deep learning community. Fourthly, we analyse popular solutions for this task, such as attribute grouping, part-based methods, \emph{etc}. Fifthly, we show some applications that take pedestrian attributes into consideration and achieve better performance. Finally, we summarize the paper and give several possible research directions for pedestrian attribute recognition. The project page for this paper can be found at the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.
Comment: Check our project page for a high-resolution version of this survey: https://sites.google.com/view/ahu-pedestrianattributes
Deep Generative Models for Natural Language
Generative models aim to simulate the process by which a set of data is generated. They are intuitive, interpretable, and naturally suited to learning from unlabelled data. This is particularly appealing in natural language processing, where labels are often costly to obtain and can require significant manual input from trained annotators. However, traditional generative modelling approaches can often be inflexible due to the need to maintain tractable maximum likelihood training. On the other hand, deep learning methods are powerful, flexible, and have achieved significant success on a wide variety of natural language processing tasks. In recent years, algorithms have been developed for training generative models that incorporate neural networks to parametrise their conditional distributions. These approaches aim to combine the intuitiveness and interpretability of generative models with the power and flexibility of deep learning. In this work, we investigate how to leverage such algorithms to develop deep generative models for natural language. Firstly, we present an attention-based latent variable model, trained on unlabelled data, for learning representations of sentences. Experiments such as missing word imputation and sentence similarity matching suggest that the representations capture semantic information about the sentences. We then present an RNN-based latent variable model for performing machine translation. Trained using semi-supervised learning, our approach achieves strong results even with very limited labelled data. Finally, we present a locally-contextual conditional random field for performing sequence labelling tasks. Our method consistently outperforms the linear-chain conditional random field and achieves state-of-the-art performance on two of the four tasks evaluated.
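On the last point, a toy scorer may help show what "locally-contextual" could mean in practice. In this sketch (an assumed parameterisation, not the thesis's model), the emission potential at each position sees the previous, current, and next word embeddings, while tag-tag transitions remain those of a linear-chain CRF:

    import numpy as np

    def score_sequence(emb, tags, E, T):
        # Score one tag sequence: locally-contextual emissions plus transitions.
        n = len(tags)
        padded = np.vstack([np.zeros_like(emb[0]), emb, np.zeros_like(emb[0])])
        total = 0.0
        for t in range(n):
            context = np.concatenate([padded[t], padded[t + 1], padded[t + 2]])
            total += E[tags[t]] @ context            # emission sees neighbours
            if t > 0:
                total += T[tags[t - 1], tags[t]]     # linear-chain transition
        return total

    dim, n_tags, n = 5, 4, 6
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(n, dim))                  # toy word embeddings
    E = rng.normal(size=(n_tags, 3 * dim))           # per-tag emission weights
    T = rng.normal(size=(n_tags, n_tags))            # tag transition scores
    print(score_sequence(emb, rng.integers(n_tags, size=n), E, T))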
Automatic Distractor Generation for Multiple Choice Questions in Standard Tests
To assess the knowledge proficiency of a learner, the multiple choice question is an efficient and widespread format in standard tests. However, composing a multiple choice question, and especially constructing its distractors, is quite challenging. The distractors are required to be both incorrect and plausible enough to confuse learners who have not mastered the knowledge. Currently, distractors are created by domain experts, which is both expensive and time-consuming. This motivates automatic distractor generation, which can benefit standard tests across a wide range of domains. In this paper, we propose a question and answer guided distractor generation (EDGE) framework to automate distractor generation. EDGE consists of three major modules: (1) the Reforming Question Module and the Reforming Passage Module apply gate layers to guarantee the inherent incorrectness of the generated distractors; (2) the Distractor Generator Module applies an attention mechanism to control the level of plausibility.
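To make the gating idea concrete, here is a schematic sketch with hypothetical shapes and inputs (the paper's exact gating computation may differ): a gate computed from each hidden state and the answer representation down-weights answer-revealing content so that generated distractors stay incorrect:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def reforming_gate(h, answer_vec, W, b):
        # Gate values in (0, 1) computed from the token state and the answer
        # representation; multiplying suppresses answer-revealing dimensions.
        gate = sigmoid(W @ np.concatenate([h, answer_vec]) + b)
        return gate * h

    dim = 8
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(dim, 2 * dim)), np.zeros(dim)
    h_reformed = reforming_gate(rng.normal(size=dim), rng.normal(size=dim), W, b)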
Experimental results on a large-scale public dataset demonstrate that our model significantly outperforms existing models and achieves a new state of the art.
Comment: accepted by COLING202
Language modelling for clinical natural language understanding and generation
One of the long-standing objectives of Artificial Intelligence (AI) is to design and develop algorithms for social good, including tackling public health challenges. In the era of digitisation, with an unprecedented amount of healthcare data being captured in digital form, analysing healthcare data at scale can lead to better research on diseases, better monitoring of patient conditions and, more importantly, improved patient outcomes. However, many AI-based analytic algorithms rely solely on structured healthcare data, such as bedside measurements and test results, which account for only 20% of all healthcare data; the remaining 80% is unstructured and includes textual data such as clinical notes and discharge summaries, which remain underexplored.
Conventional Natural Language Processing (NLP) algorithms designed for clinical applications rely on shallow matching, templates and non-contextualised word embeddings, which leads to a limited understanding of contextual semantics. Although recent advances in NLP have demonstrated promising performance on a variety of tasks in the general domain using contextualised language models, most of these generic algorithms struggle on clinical NLP tasks that require biomedical knowledge and reasoning. Moreover, there is limited research on generative NLP algorithms that automatically produce clinical reports and summaries by attending to salient clinical information.
This thesis aims to design and develop novel NLP algorithms, especially clinical-driven contextualised language models, to understand textual healthcare data and generate clinical narratives that can potentially support clinicians, medical scientists and patients. The first contribution of this thesis focuses on capturing phenotypic information of patients from clinical notes, which is important for profiling a patient's situation and improving patient outcomes. The thesis proposes a novel self-supervised language model, named Phenotypic Intelligence Extraction (PIE), to annotate phenotypes from clinical notes, with detection of contextual synonyms and enhanced reasoning over numerical values. The second contribution is to demonstrate the utility and benefits of using phenotypic features of patients in clinical use cases, by predicting patient outcomes in Intensive Care Units (ICU) and identifying patients at risk of specific diseases with better accuracy and model interpretability. The third contribution is to propose generative models that produce clinical narratives, automating and accelerating the report writing and summarisation done by clinicians. The thesis first proposes a novel summarisation language model named PEGASUS, which surpasses or is on par with the state of the art on 12 downstream datasets, including biomedical literature from PubMed. PEGASUS is further extended to generate medical scientific documents from input tabular data.
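As a toy illustration of the gap-sentence pretraining idea behind PEGASUS (using a crude word-overlap score in place of the ROUGE-based sentence selection of the original work):

    def make_gsg_example(sentences, num_masked=1):
        # Gap-sentence generation in miniature: mask whole sentences from a
        # document and train a seq2seq model to regenerate them, which mimics
        # abstractive summarisation.
        def overlap(i):
            rest = set(w for j, s in enumerate(sentences) if j != i
                       for w in s.split())
            words = sentences[i].split()
            return sum(w in rest for w in words) / max(len(words), 1)
        ranked = sorted(range(len(sentences)), key=overlap, reverse=True)
        masked = set(ranked[:num_masked])
        src = " ".join("<MASK>" if i in masked else s
                       for i, s in enumerate(sentences))
        tgt = " ".join(sentences[i] for i in sorted(masked))
        return src, tgt

    doc = ["The patient was admitted with chest pain.",
           "An ECG showed no acute changes.",
           "The patient was discharged with follow-up advice."]
    print(make_gsg_example(doc))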