
    A Deep Generative Framework for Paraphrase Generation

    Paraphrase generation is an important problem in NLP, with applications in question answering, information retrieval, information extraction, and conversation systems, to name a few. In this paper, we address the problem of generating paraphrases automatically. Our proposed method combines a deep generative model (VAE) with a sequence-to-sequence model (LSTM) to generate paraphrases of a given input sentence. Traditional VAEs combined with recurrent neural networks can generate free-form text, but they are not suited to generating paraphrases of a particular sentence. We address this problem by conditioning both the encoder and decoder sides of the VAE on the original sentence, so that the model generates paraphrases of that sentence. Unlike most existing models, our model is simple and modular, and it can generate multiple paraphrases for a given sentence. Quantitative evaluation on a benchmark paraphrase dataset demonstrates its efficacy and a significant improvement over state-of-the-art methods, while qualitative human evaluation indicates that the generated paraphrases are well-formed, grammatically correct, and relevant to the input sentence. Furthermore, we evaluate our method on a newly released question paraphrase dataset and establish a new baseline for future research.
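A minimal sketch of the conditioning idea described above, assuming a PyTorch implementation. The module name `CVAEParaphraser` and all layer sizes are illustrative, not the authors' released code; the point is that the original sentence's encoding is fed to both the recognition (encoder) side and the decoder side of the VAE, so the latent variable captures only the paraphrastic variation.

```python
# Hypothetical sketch of a VAE + seq2seq (LSTM) paraphrase model in which
# both the VAE encoder and decoder are conditioned on the original sentence.
import torch
import torch.nn as nn

class CVAEParaphraser(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, z_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.orig_enc = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.para_enc = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        self.dec = nn.LSTM(emb_dim + hid_dim + z_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, orig, para):
        # Encode the original sentence once; its final state conditions
        # both the recognition network and the decoder.
        _, (h_o, _) = self.orig_enc(self.emb(orig))
        cond = h_o[-1]                                      # (B, hid_dim)
        p = self.emb(para)
        cond_seq = cond.unsqueeze(1).expand(-1, p.size(1), -1)
        _, (h_p, _) = self.para_enc(torch.cat([p, cond_seq], dim=-1))
        mu, logvar = self.to_mu(h_p[-1]), self.to_logvar(h_p[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        z_seq = z.unsqueeze(1).expand(-1, p.size(1), -1)
        dec_out, _ = self.dec(torch.cat([p, cond_seq, z_seq], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return self.out(dec_out), kl
```

Training would minimize token-level cross-entropy against the shifted paraphrase plus the KL term; drawing several values of z at inference time is what yields multiple distinct paraphrases for one input.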

    Cell Type Classification Via Deep Learning On Single-Cell Gene Expression Data

    Single-cell sequencing is a recently developed, revolutionary technology that enables researchers to obtain genomic, transcriptomic, or multi-omics information through gene expression analysis. Compared with traditional sequencing methods, it offers the advantage of analyzing highly heterogeneous cell-type information, which has made it increasingly popular in the biomedical field. Such analysis can also aid early diagnosis and drug development for tumor cells and cancer cell types. In the gene expression profiling workflow, identifying cell types is an important task, but it faces challenges such as the curse of dimensionality, sparsity, batch effects, and overfitting. These challenges can be mitigated by feature selection, which retains the most relevant features while reducing the feature dimensionality. In this work, a recurrent neural network-based feature selection model is proposed to extract relevant features from high-dimensional, low-sample-size data. In addition, a deep learning-based gene embedding model is proposed to reduce the sparsity of single-cell data for cell type identification. The proposed frameworks were implemented with different recurrent neural network architectures and evaluated on real-world microarray datasets and single-cell RNA-seq data, where they outperformed other feature selection models. Because labeling data is cumbersome, time-consuming, and requires manual effort and domain expertise, a semi-supervised model was also implemented using the same gene embedding workflow, and different ratios of labeled data were used in the experiments to validate the approach. Experimental results show that the proposed semi-supervised approach achieves encouraging performance even with a limited amount of labeled data. In addition, a graph attention-based autoencoder model was studied to learn latent features by incorporating prior knowledge with gene expression data for cell type classification. Index Terms: Single-Cell Gene Expression Data, Gene Embedding, Semi-Supervised Model, Incorporating Prior Knowledge, Gene-Gene Interaction Network, Deep Learning, Graph Autoencoder.
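As a rough illustration of the gene embedding idea, the sketch below (hypothetical; `GeneEmbeddingClassifier` and the layer sizes are invented for exposition, not the dissertation's architecture) compresses a sparse expression profile into a dense embedding with a reconstruction objective, while a semi-supervised loss applies cross-entropy only to the labeled fraction of cells.

```python
# Illustrative sketch: a dense gene embedding learned by reconstruction,
# with a classifier head trained only on the labeled subset of cells.
import torch
import torch.nn as nn

class GeneEmbeddingClassifier(nn.Module):
    def __init__(self, n_genes, emb_dim=64, n_cell_types=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_genes, 256), nn.ReLU(), nn.Linear(256, emb_dim))
        self.decoder = nn.Sequential(
            nn.Linear(emb_dim, 256), nn.ReLU(), nn.Linear(256, n_genes))
        self.head = nn.Linear(emb_dim, n_cell_types)

    def forward(self, x):
        z = self.encoder(x)        # dense embedding of a sparse profile
        return self.decoder(z), self.head(z)

def semi_supervised_loss(recon, x, logits, labels, labeled_mask):
    # Reconstruction on all cells; cross-entropy only on labeled ones.
    loss = nn.functional.mse_loss(recon, x)
    if labeled_mask.any():
        loss = loss + nn.functional.cross_entropy(
            logits[labeled_mask], labels[labeled_mask])
    return loss
```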

    μžκΈ°νšŒκ·€λͺ¨λΈ 기반 ν…μŠ€νŠΈ 생성을 μœ„ν•œ 효과적인 ν•™μŠ΅ 방법에 κ΄€ν•œ 연ꡬ

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021. Advisor: κΉ€νš¨μ„.
    The rise of deep neural networks has driven tremendous advances in natural language processing research. Natural language generation, a subfield of natural language processing, is indispensable to building human-like artificial intelligence, since it is responsible for delivering the decisions of machines in natural language. For neural-network-based text generation techniques, which have achieved most state-of-the-art results, autoregressive methods are generally adopted because they correspond to the word-by-word nature of human language production. In this dissertation, we investigate two different ways to train autoregressive text generation models based on deep neural networks. We first focus on token-level training for question generation, which aims to generate a question related to a given input passage. The proposed Answer-Separated Seq2Seq effectively mitigates a problem of previous question generation models, namely that a significant proportion of the generated questions include words from the target answer. While autoregressive methods are primarily trained with maximum likelihood estimation, they suffer from several problems, such as exposure bias. As a remedy, we propose a sequence-level GAN-based approach to text generation that promotes collaborative training in both continuous and discrete representations of text. Finally, consolidating the lines of research above, we propose a novel way of training a sequence-level question generation model that adopts a pre-trained language model, one of the most significant breakthroughs in natural language processing, together with Proximal Policy Optimization.

    Table of Contents:
    1 INTRODUCTION
      1.1 Contributions
    2 BACKGROUND
      2.1 Sequence-to-Sequence Model
        2.1.1 Sequence-to-Sequence Model with Attention Mechanism
      2.2 Autoregressive Text Generation
        2.2.1 Maximum Likelihood Training
        2.2.2 Pros and Cons of Autoregressive Methods
      2.3 Non-autoregressive Text Generation
      2.4 Transformers
      2.5 Reinforcement Learning
        2.5.1 Policy Gradient
    3 TOKEN-LEVEL TRAINING OF CONDITIONAL TEXT GENERATION MODEL
      3.1 Related Work
      3.2 Task Definition
      3.3 Base Model: Encoder-Decoder with Attention
      3.4 Answer-Separated Seq2Seq
        3.4.1 Encoder
        3.4.2 Answer-Separated Decoder
      3.5 Experimental Settings
        3.5.1 Dataset
        3.5.2 Implementation Details
        3.5.3 Evaluation Methods
      3.6 Results
        3.6.1 Performance Comparison
        3.6.2 Impact of Answer Separation
        3.6.3 Question Generation for Machine Comprehension
      3.7 Conclusion
    4 SEQUENCE-LEVEL TRAINING OF UNCONDITIONAL TEXT GENERATION
      4.1 Background
        4.1.1 Generative Adversarial Networks
        4.1.2 Continuous-space Methods
        4.1.3 Discrete-space Methods
      4.2 ConcreteGAN
        4.2.1 Autoencoder Reconstruction
        4.2.2 Adversarial Training in the Latent Code Space
        4.2.3 Adversarial Training with Textual Outputs
      4.3 Experiments
        4.3.1 Dataset
        4.3.2 Experimental Settings
        4.3.3 Evaluation Metrics
        4.3.4 Experimental Results for Quality & Diversity
        4.3.5 Experimental Results for FD Score
        4.3.6 Human Evaluation
        4.3.7 Analyses of Code Space
      4.4 Conclusion
    5 SEQUENCE-LEVEL TRAINING OF CONDITIONAL TEXT GENERATION
      5.1 Introduction
      5.2 Background
        5.2.1 Pre-trained Language Model
        5.2.2 Proximal Policy Optimization
      5.3 Methods
        5.3.1 Step One: Token-level Fine-tuning
        5.3.2 Step Two: Sequence-level Fine-tuning with Question-specific Reward
      5.4 Experiments
        5.4.1 Implementation Details
        5.4.2 Quantitative Analysis
        5.4.3 Qualitative Analysis
      5.5 Conclusion
    6 CONCLUSION
    7 APPENDIX*
      7.1 Generated Samples
      7.2 Comparison of ARAE and ARAE*
      7.3 Human Evaluation Criteria
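Returning to the training methods the abstract describes: the token-level maximum-likelihood (teacher-forcing) setup the dissertation builds on can be sketched generically as follows (this is not the thesis code). The exposure bias mentioned above arises precisely because the model only ever conditions on gold prefixes during this kind of training, which motivates the sequence-level approaches.

```python
# Generic sketch of token-level MLE (teacher forcing) for an autoregressive
# text generation model: predict token t+1 from the gold prefix 0..t.
import torch
import torch.nn as nn

class ARDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)

def mle_step(model, batch, optimizer):
    inputs, targets = batch[:, :-1], batch[:, 1:]   # gold prefix -> next token
    logits = model(inputs)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```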

    A survey of text representation methods and their genealogy

    In recent years, with the advent of highly scalable artificial-neural-network-based text representation methods, the field of natural language processing has seen unprecedented growth and sophistication. It has become possible to distill the complex linguistic information of text into multidimensional dense numeric vectors by exploiting the distributional hypothesis. As a consequence, text representation methods have been evolving at such a quick pace that the research community struggles to retain knowledge of the methods and their interrelations. We address this lack of compilation, composition, and systematization in three ways: by providing a survey of current approaches, by arranging them in a genealogy, and by conceptualizing a taxonomy of text representation methods to examine and explain the state of the art. Our survey serves as a guide and reference for artificial intelligence researchers and practitioners interested in natural language processing applications such as recommender systems, chatbots, and sentiment analysis.
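The distributional hypothesis mentioned above can be made concrete with a toy example: represent each word by its co-occurrence counts and compress those counts into dense vectors, here via truncated SVD (the corpus, window size, and dimensionality are invented for exposition).

```python
# Toy distributional-hypothesis demo: co-occurrence counts -> SVD -> dense
# word vectors; distributionally similar words end up close in the space.
import numpy as np

corpus = ["the cat sat on the mat",
          "the dog sat on the rug",
          "a cat and a dog played"]
tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

co = np.zeros((len(vocab), len(vocab)))
for sent in tokens:                       # symmetric +/-2 word window
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if i != j:
                co[idx[w], idx[sent[j]]] += 1

u, s, _ = np.linalg.svd(co, full_matrices=False)
vectors = u[:, :4] * s[:4]                # 4-dimensional dense embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

print(cosine(vectors[idx["cat"]], vectors[idx["dog"]]))   # relatively high
```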

    Deep Learning for Learning Representation and Its Application to Natural Language Processing

    As the web evolves even faster than expected, the exponential growth of data is becoming overwhelming. Textual data is generated at an ever-increasing pace via emails, documents on the web, tweets, online user reviews, blogs, and so on. As the amount of unstructured text data grows, so does the need to process and understand it intelligently. The focus of this dissertation is on developing learning models that automatically induce representations of human language in order to solve higher-level language tasks. In contrast to most conventional learning techniques, which employ shallow-structured learning architectures, deep learning is a more recently developed machine learning technique that uses supervised and/or unsupervised strategies to automatically learn hierarchical representations in deep architectures, and it has been employed in varied tasks such as classification and regression. Deep learning was inspired by biological observations of human brain mechanisms for processing natural signals, and it has attracted tremendous attention from both academia and industry in recent years due to its state-of-the-art performance in research domains such as computer vision, speech recognition, and natural language processing. This dissertation focuses on how to represent unstructured text data and how to model it with deep learning models in different natural language processing applications such as sequence tagging, sentiment analysis, and semantic similarity. Specifically, this dissertation addresses the following research topics. In Chapter 3, we examine one of the fundamental problems in NLP, text classification, by leveraging contextual information [MLX18a]. In Chapter 4, we propose a unified framework for generating an informative map from a review corpus [MLX18b]. Chapter 5 discusses tagging address queries in map search [Mok18]; this research was performed in collaboration with Microsoft. In Chapter 6, we discuss ongoing research on the neural sentence matching problem, which we are extending to a recommendation system.
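As a hypothetical sketch of the kind of sequence-tagging model Chapter 5 concerns (the class name `BiLSTMTagger` and the tag inventory are invented; the actual address-tagging system is not detailed in this abstract), a bidirectional LSTM assigns one label, such as street, city, or zip, to each token of a query.

```python
# Hypothetical BiLSTM sequence tagger: one tag logit vector per token.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=100, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(2 * hid_dim, n_tags)

    def forward(self, tokens):              # tokens: (B, T) token ids
        h, _ = self.rnn(self.emb(tokens))   # (B, T, 2 * hid_dim)
        return self.out(h)                  # per-token tag logits

tagger = BiLSTMTagger(vocab_size=5000, n_tags=7)
query = torch.randint(0, 5000, (1, 6))      # a 6-token query, random ids
print(tagger(query).argmax(-1))             # one predicted tag id per token
```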