
    Sentence Simplification with Memory-Augmented Neural Networks

    Sentence simplification aims to simplify the content and structure of complex sentences, and thus make them easier for human readers to interpret and easier for downstream NLP applications to process. Recent advances in neural machine translation have paved the way for novel approaches to the task. In this paper, we adapt an architecture with augmented memory capacities called Neural Semantic Encoders (Munkhdalai and Yu, 2017) for sentence simplification. Our experiments demonstrate the effectiveness of our approach on different simplification datasets, both in terms of automatic evaluation measures and human judgments. Comment: Accepted as a conference paper at NAACL HLT 2018.
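    As a concrete illustration of the augmented-memory idea, here is a minimal NumPy sketch of one read-compose-write step over an external memory. It is only a sketch under simplifying assumptions: the actual Neural Semantic Encoder uses learned LSTM read, compose, and write functions, so the shapes, the tanh composition, and the soft overwrite rule below are illustrative choices, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def nse_step(x_t, memory, W_c):
    """One read-compose-write step over an external memory.
    x_t: (d,) token encoding; memory: (slots, d); W_c: (d, 2d)."""
    scores = memory @ x_t                    # read: score each memory slot
    z = softmax(scores)                      # attention over slots
    m_t = z @ memory                         # retrieved memory summary
    c_t = np.tanh(W_c @ np.concatenate([x_t, m_t]))        # compose
    memory = (1 - z)[:, None] * memory + z[:, None] * c_t  # soft overwrite
    return c_t, memory

d, slots = 4, 6
rng = np.random.default_rng(1)
c, mem = nse_step(rng.standard_normal(d),
                  rng.standard_normal((slots, d)),
                  rng.standard_normal((d, 2 * d)))
```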

    Integrating Transformer and Paraphrase Rules for Sentence Simplification

    Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification have adopted ideas from machine translation studies and implicitly learned simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture, and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature, and (2) through analysis of rule utilization, the model tends to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification.
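    The paper integrates the rules inside the Transformer's attention and training objective, but the shape of a Simple PPDB entry (a complex phrase, a simpler phrase, and a confidence score) is easy to show. The toy rules and the greedy post-hoc application below are hypothetical stand-ins, not the paper's learned rule selection.

```python
# Illustrative Simple PPDB-style rules: complex phrase -> (simpler phrase, score).
RULES = {
    "in order to": ("to", 0.9),
    "a large number of": ("many", 0.8),
    "utilize": ("use", 0.95),
}

def apply_rules(sentence, rules=RULES, threshold=0.5):
    """Apply the highest-confidence matching rules to a sentence."""
    for complex_phrase, (simple_phrase, score) in sorted(
            rules.items(), key=lambda kv: -kv[1][1]):
        if score >= threshold and complex_phrase in sentence:
            sentence = sentence.replace(complex_phrase, simple_phrase)
    return sentence

print(apply_rules("We utilize rules in order to simplify text."))
# -> "We use rules to simplify text."
```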

    Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification

    Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics. Comment: 11 pages, North American Chapter of the Association for Computational Linguistics (NAACL 2019).
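    A hedged PyTorch sketch of the first technique: weight each target token's cross-entropy by a per-token complexity score so the model is nudged toward simpler content words. The scores would come from the paper's leveled word complexity model; the linear weighting scheme here is one plausible formulation, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def complexity_weighted_loss(logits, targets, complexity, alpha=1.0):
    """logits: (T, V); targets: (T,); complexity: (T,) in [0, 1],
    higher = harder word. Simpler target tokens get larger weight,
    one plausible way to fold complexity into the training loss."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # (T,)
    weights = 1.0 + alpha * (1.0 - complexity)
    return (weights * ce).mean()

T, V = 5, 100
loss = complexity_weighted_loss(torch.randn(T, V),
                                torch.randint(V, (T,)),
                                torch.rand(T))
```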

    Neural Task Planning with And-Or Graph Representations

    This paper focuses on semantic task planning, i.e., predicting a sequence of actions toward accomplishing a specific task in a given scene, which is a new problem in computer vision research. The primary challenges are how to model task-specific knowledge and how to integrate this knowledge into the learning procedure. In this work, we propose training a recurrent long short-term memory (LSTM) network to address this problem, i.e., taking a scene image (including pre-located objects) and the specified task as input and recurrently predicting action sequences. However, training such a network generally requires large numbers of annotated samples to cover the semantic space (e.g., diverse action decomposition and ordering). To overcome this issue, we introduce a knowledge and-or graph (AOG) for task description, which hierarchically represents a task as atomic actions. With this AOG representation, we can produce many valid samples (i.e., action sequences according to common sense) by training another auxiliary LSTM network with a small set of annotated samples. Furthermore, these generated samples (i.e., task-oriented action sequences) effectively facilitate training of the model for semantic task planning. In our experiments, we create a new dataset that contains diverse daily tasks and extensively evaluate the effectiveness of our approach. Comment: Submitted to TMM, under minor revision. arXiv admin note: text overlap with arXiv:1707.0467
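    The sampling trick is simple to demonstrate: an and-node expands all of its children in order, while an or-node commits to one alternative, so every traversal yields a valid action sequence. The toy graph below is invented for illustration and is unrelated to the paper's dataset.

```python
import random

# A task as a nested and-or graph: ("and", ...), ("or", ...), or a leaf action.
AOG = ("and",
       ("or", "open_fridge", "open_cupboard"),
       "take_cup",
       ("or", "pour_water", ("and", "boil_water", "pour_water")))

def sample(node):
    """Sample one valid action sequence from an and-or graph."""
    if isinstance(node, str):                    # leaf: atomic action
        return [node]
    kind, *children = node
    if kind == "and":                            # expand every child, in order
        return [a for child in children for a in sample(child)]
    return sample(random.choice(children))       # or-node: pick one branch

print(sample(AOG))  # e.g. ['open_fridge', 'take_cup', 'boil_water', 'pour_water']
```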

    Metalearning with Hebbian Fast Weights

    We unify recent neural approaches to one-shot learning with older ideas of associative memory in a model for metalearning. Our model learns jointly to represent data and to bind class labels to representations in a single shot. It builds representations via slow weights, learned across tasks through SGD, while fast weights constructed by a Hebbian learning rule implement one-shot binding for each new task. On the Omniglot, Mini-ImageNet, and Penn Treebank one-shot learning benchmarks, our model achieves state-of-the-art results. Comment: 8 pages, 3 figures, 4 tables. arXiv admin note: text overlap with arXiv:1712.0992
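    The fast-weight mechanism reduces to a rank-one Hebbian outer-product update that binds a representation to a label in a single shot. The sketch below shows only that associative step; the slow weights that produce the representations are trained across tasks with SGD and are omitted here.

```python
import numpy as np

def bind(fast_W, key, value, eta=1.0):
    """Hebbian one-shot binding: associate a representation (key)
    with a label embedding (value) via an outer-product update."""
    return fast_W + eta * np.outer(value, key)

def retrieve(fast_W, key):
    """Read out the label embedding bound to a (possibly noisy) key."""
    return fast_W @ key

d = 8
key = np.random.randn(d); key /= np.linalg.norm(key)
value = np.eye(d)[3]                     # one-hot "class 3" label
W = bind(np.zeros((d, d)), key, value)
print(np.argmax(retrieve(W, key)))       # -> 3
```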

    ConvAMR: Abstract meaning representation parsing for legal document

    Convolutional neural networks (CNN) have recently achieved remarkable performance in a wide range of applications. In this research, we equip a convolutional sequence-to-sequence (seq2seq) model with an efficient graph linearization technique for abstract meaning representation parsing. Our linearization method is better than the prior method at signaling the turns of graph traversal. Additionally, the convolutional seq2seq model is more appropriate and considerably faster than recurrent neural network models for this task. Our method outperforms previous methods by a large margin on the standard dataset LDC2014T12. Our results indicate that future work still has room to improve parsing models using graph linearization approaches. Comment: SCIDOCA 2017, Japan.
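    To show what "signaling the turns of graph traversal" means, here is a toy depth-first linearization of an AMR-style graph: the parentheses mark where the traversal descends into a node and where it returns, which is exactly the cue a seq2seq decoder has to reproduce. The traversal scheme is a generic sketch, not the paper's specific technique.

```python
def linearize(graph, node, visited=None):
    """Depth-first linearization of a rooted, possibly re-entrant graph."""
    visited = set() if visited is None else visited
    if node in visited:                 # re-entrant node: emit a bare reference
        return [node]
    visited.add(node)
    tokens = ["(", node]
    for relation, child in graph.get(node, []):
        tokens += [relation] + linearize(graph, child, visited)
    tokens.append(")")
    return tokens

# "The boy wants to go": go-01 re-enters the same boy node.
g = {"want-01": [(":arg0", "boy"), (":arg1", "go-01")],
     "go-01":   [(":arg0", "boy")]}
print(" ".join(linearize(g, "want-01")))
# ( want-01 :arg0 ( boy ) :arg1 ( go-01 :arg0 boy ) )
```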

    EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing

    We present the first sentence simplification model that learns explicit edit operations (ADD, DELETE, and KEEP) via a neural programmer-interpreter approach. Most current neural sentence simplification systems are variants of sequence-to-sequence models adopted from machine translation. These methods learn to simplify sentences as a byproduct of being trained on complex-simple sentence pairs. By contrast, our neural programmer-interpreter is directly trained to predict explicit edit operations on targeted parts of the input sentence, resembling the way that humans might perform simplification and revision. Our model outperforms previous state-of-the-art neural sentence simplification models (without external knowledge) by large margins on three benchmark text simplification corpora in terms of SARI (+0.95 WikiLarge, +1.89 WikiSmall, +1.41 Newsela), and is judged by humans to produce overall better and simpler output sentences. Comment: 9 pages, 1 figure, accepted at ACL 2019.
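    The core of the approach is that a predicted program of edit operations, executed left to right over the source sentence, yields the simplified sentence. The interpreter below captures that execution semantics in a few lines; the model that predicts the program is the paper's contribution, while this toy program and its operation encoding are invented for illustration.

```python
def execute_edits(source_tokens, program):
    """KEEP copies the current source token, DELETE skips it,
    and ("ADD", w) inserts w without consuming a source token."""
    out, i = [], 0
    for op in program:
        if op == "KEEP":
            out.append(source_tokens[i]); i += 1
        elif op == "DELETE":
            i += 1
        else:                                  # ("ADD", word)
            out.append(op[1])
    return out

src = "the defendant was apprehended by officers".split()
prog = ["KEEP", "KEEP", "KEEP", ("ADD", "caught"), "DELETE",
        "KEEP", ("ADD", "police"), "DELETE"]
print(" ".join(execute_edits(src, prog)))
# -> the defendant was caught by police
```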

    A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification

    Current lexical simplification approaches rely heavily on heuristics and corpus-level features that do not always align with human judgment. We create a human-rated word-complexity lexicon of 15,000 English words and propose a novel neural readability ranking model with a Gaussian-based feature vectorization layer that utilizes these human ratings to measure the complexity of any given word or phrase. Our model performs better than state-of-the-art systems on different lexical simplification tasks and evaluation datasets. We also produce SimplePPDB++, a lexical resource of over 10 million simplifying paraphrase rules, by applying our model to the Paraphrase Database (PPDB). Comment: 12 pages; EMNLP 2018.
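    A Gaussian-based vectorization layer turns a single scalar feature (say, a normalized frequency or a human complexity rating) into a smooth vector by evaluating it against a row of Gaussian bins, which lets the ranker learn non-linear responses to the feature. The bin centers and bandwidth below are arbitrary illustrative choices, not the paper's settings.

```python
import numpy as np

def gaussian_vectorize(value, centers, sigma=0.1):
    """Project a scalar onto Gaussian radial-basis bins."""
    return np.exp(-((value - centers) ** 2) / (2 * sigma ** 2))

centers = np.linspace(0.0, 1.0, 10)   # 10 bins over a normalized scale
print(gaussian_vectorize(0.32, centers).round(3))
```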

    Instance-based Inductive Deep Transfer Learning by Cross-Dataset Querying with Locality Sensitive Hashing

    Supervised learning models are typically trained on a single dataset, and their performance relies heavily on the size of that dataset, i.e., the amount of data available with ground truth. Learning algorithms try to generalize solely from the data presented during training. In this work, we propose an inductive transfer learning method that can augment learning models by infusing similar instances from different learning tasks in the Natural Language Processing (NLP) domain. We propose to use instance representations from a source dataset, without inheriting anything from the source learning model. Representations of the instances of the source and target datasets are learned, relevant source instances are retrieved using a soft-attention mechanism and locality sensitive hashing, and these instances are then augmented into the model during training on the target dataset. Our approach simultaneously exploits local instance-level information as well as the macro statistical viewpoint of the dataset. Using this approach, we show significant improvements over the baseline for three major news classification datasets. Experimental evaluations also show that the proposed approach reduces dependency on labeled data by a significant margin for comparable performance. With our proposed cross-dataset learning procedure, we show that one can achieve competitive or better performance than learning from a single dataset.
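    The retrieval step can be sketched with random-hyperplane LSH: vectors with high cosine similarity tend to produce the same sign pattern, so candidate source instances for a target example can be fetched from a hash bucket without a full scan. This is a generic LSH sketch on random data, not the paper's pipeline (which also applies soft attention over the retrieved instances).

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_signature(x, planes):
    """Sign pattern of projections onto random hyperplanes = hash key."""
    return tuple((planes @ x > 0).astype(int))

dim, n_planes = 64, 12
planes = rng.standard_normal((n_planes, dim))

# Index source-dataset instance representations by bucket.
source = {i: rng.standard_normal(dim) for i in range(1000)}
buckets = {}
for idx, vec in source.items():
    buckets.setdefault(lsh_signature(vec, planes), []).append(idx)

# Retrieve candidate neighbours for a (noisy) target-dataset instance.
query = source[42] + 0.05 * rng.standard_normal(dim)
candidates = buckets.get(lsh_signature(query, planes), [])
print(42 in candidates)   # usually True: near-duplicates share a bucket
```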

    Syntax-guided Controlled Generation of Paraphrases

    Given a sentence (e.g., "I like mangoes") and a constraint (e.g., sentiment flip), the goal of controlled text generation is to produce a sentence that adapts the input sentence to meet the requirements of the constraint (e.g., "I hate mangoes"). Going beyond such simple constraints, recent works have started exploring the incorporation of complex syntactic guidance as constraints in the task of controlled paraphrase generation. In these methods, syntactic guidance is sourced from a separate exemplar sentence. However, prior works have only utilized limited syntactic information available in the parse tree of the exemplar sentence. We address this limitation and propose Syntax Guided Controlled Paraphraser (SGCP), an end-to-end framework for syntactic paraphrase generation. We find that SGCP can generate syntax-conforming sentences without compromising on relevance. We perform extensive automated and human evaluations over multiple real-world English language datasets to demonstrate the efficacy of SGCP over state-of-the-art baselines. To drive future research, we have made SGCP's source code available. Comment: 16 pages, 3 figures, accepted to TACL 2020.
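    One concrete piece of "how much of the exemplar's parse to use" can be shown directly: truncate the exemplar's constituency tree to a chosen height and use the result as a coarse syntactic template. The sketch assumes nltk is installed and that a parse is already available as a bracketed string; SGCP's actual tree encoder and depth tuning are more involved than this truncation step.

```python
from nltk import Tree   # assumes nltk is installed

def prune(tree, height):
    """Keep only the top `height` levels of a constituency parse."""
    if not isinstance(tree, Tree):
        return tree
    if height == 1:
        return tree.label()
    return Tree(tree.label(), [prune(child, height - 1) for child in tree])

exemplar = Tree.fromstring(
    "(S (NP (PRP I)) (VP (VBP hate) (NP (NNS mangoes))))")
print(prune(exemplar, 2))   # -> (S NP VP)
```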