
28 research outputs found

Text Simplification Using Neural Machine Translation

Authors: Chen Ping, Qiang Jipeng, Rochford John, Wang Tong
Publication venue: eScholarship@UMassChan
Publication date: 05/03/2016

Text simplification (TS) is the technique of reducing the lexical and syntactic complexity of text. Existing automatic TS systems can simplify text only by lexical simplification or by manually defined rules. Neural Machine Translation (NMT) is a recently proposed approach to Machine Translation (MT) that is receiving a lot of research interest. In this paper, we regard original English and simplified English as two languages, and apply an NMT model, a Recurrent Neural Network (RNN) encoder-decoder, to TS so that the neural network learns text simplification rules by itself. We then discuss challenges and strategies in applying an NMT model to the task of text simplification.

eScholarship@UMMS

Association for the Advancement of Artificial Intelligence: AAAI Publications

Lexical Simplification with Pretrained Encoders

Authors: Li Yun, Qiang Jipeng, Wu Xindong, Yuan Yunhao, Zhu Yi
Publication date: 03/04/2020

Lexical simplification (LS) aims to replace complex words in a given sentence with simpler alternatives of equivalent meaning. Recent unsupervised lexical simplification approaches rely only on the complex word itself, regardless of the given sentence, to generate candidate substitutions, which inevitably produces a large number of spurious candidates. We present a simple LS approach that makes use of Bidirectional Encoder Representations from Transformers (BERT), which can consider both the given sentence and the complex word when generating candidate substitutions for the complex word. Specifically, we mask the complex word in the original sentence and feed the sentence into BERT to predict the masked token. The predicted results are used as candidate substitutions. Despite being entirely unsupervised, our approach obtains clear improvements in experiments over baselines that leverage linguistic databases and parallel corpora, outperforming the state of the art by more than 12 accuracy points on three well-known benchmarks.
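
The mask-and-predict step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the masked language model is abstracted as a callable, `predict_masked`, which is a hypothetical interface, and `toy_predictor` is a hand-written stand-in with made-up scores rather than a real BERT model. Dropping the complex word itself from the candidates is also an assumption made here for illustration.

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"  # BERT-style mask token

# Hypothetical interface: takes the masked sentence, returns (word, score) pairs.
Predictor = Callable[[List[str]], List[Tuple[str, float]]]


def mask_complex_word(tokens: List[str], index: int) -> List[str]:
    """Return a copy of the sentence with the complex word replaced by the mask."""
    return tokens[:index] + [MASK] + tokens[index + 1:]


def candidate_substitutions(
    tokens: List[str], index: int, predict_masked: Predictor, top_k: int = 5
) -> List[str]:
    """Mask the complex word, ask the model for fillers, keep the top_k candidates."""
    complex_word = tokens[index].lower()
    masked = mask_complex_word(tokens, index)
    # Rank the model's predictions for the masked position by score.
    ranked = sorted(predict_masked(masked), key=lambda pair: pair[1], reverse=True)
    candidates: List[str] = []
    for word, _score in ranked:
        word = word.lower()
        # Assumption: skip the complex word itself and any duplicates.
        if word != complex_word and word not in candidates:
            candidates.append(word)
        if len(candidates) == top_k:
            break
    return candidates


def toy_predictor(masked_tokens: List[str]) -> List[Tuple[str, float]]:
    """Stand-in for a masked language model; the scores are invented."""
    assert MASK in masked_tokens
    return [("sat", 0.40), ("perched", 0.35), ("rested", 0.15), ("lay", 0.10)]
```

With the toy predictor, `candidate_substitutions("the cat perched on the mat".split(), 2, toy_predictor, top_k=3)` returns `["sat", "rested", "lay"]`. In practice the callable would wrap a real masked language model, so the candidates depend on the whole sentence rather than on the complex word alone.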

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Multilingual Lexical Simplification via Paraphrase Generation

Authors: Hua Kaixun, Li Yun, Liu Kang, Qiang Jipeng, Yuan Yunhao, Zhu Yi
Publication date: 27/07/2023

Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis of its contextual surroundings. However, these methods require separate pretrained models for different languages and disregard the preservation of sentence meaning. In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence's meaning. We regard paraphrasing as a zero-shot translation task within multilingual neural machine translation that supports hundreds of languages. After feeding the input sentence into the encoder of the paraphrase model, we generate the substitutes with a novel decoding strategy that concentrates solely on the lexical variations of the complex word. Experimental results demonstrate that our approach significantly surpasses BERT-based methods and a zero-shot GPT-3-based method on English, Spanish, and Portuguese.
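
The abstract does not spell out the decoding strategy. One plausible reading, sketched here in Python purely as an assumption, is to force the decoder to copy the sentence up to the complex word and then rank the model's next-token scores at that position. The `next_token_scores` callable is a hypothetical interface standing in for a paraphrase model's decoder, and `toy_scores` returns invented numbers rather than real model output.

```python
from typing import Callable, Dict, List

# Hypothetical interface: (source sentence, forced decoder prefix) -> token scores.
Scorer = Callable[[List[str], List[str]], Dict[str, float]]


def substitutes_by_prefix_decoding(
    source_tokens: List[str], index: int, next_token_scores: Scorer, top_k: int = 3
) -> List[str]:
    """Rank alternatives for the word at `index`, keeping the preceding words fixed."""
    prefix = source_tokens[:index]       # decoder is forced to reproduce this part
    complex_word = source_tokens[index]  # the word to be replaced
    scores = next_token_scores(source_tokens, prefix)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Assumption: the complex word itself is not a useful substitute.
    return [token for token, _ in ranked if token != complex_word][:top_k]


def toy_scores(source_tokens: List[str], prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a paraphrase decoder; the scores are invented."""
    assert source_tokens[: len(prefix)] == prefix
    return {"difficult": 0.50, "hard": 0.30, "tough": 0.15, "challenging": 0.05}
```

With the toy scorer, `substitutes_by_prefix_decoding("the exam was difficult".split(), 3, toy_scores, top_k=2)` returns `["hard", "tough"]`. Because the full sentence is encoded and only the position of the complex word varies, the substitutes stay tied to the sentence's meaning.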

arXiv.org e-Print Archive