Noisy Channel for Automatic Text Simplification
In this paper we present a simple re-ranking method for Automatic Sentence Simplification based on the noisy channel scheme. Instead of directly computing the best simplification given a complex text, the re-ranking method also considers the probability that the simple sentence produces the complex counterpart, as well as the probability of the simple text itself according to a language model. Our experiments show that combining these scores outperforms the original system on three different English datasets, yielding the best known result on one of them. Adopting the noisy channel scheme opens new ways to infuse additional information into ATS systems, and thus to control important aspects of their output, a known limitation of end-to-end neural seq2seq generative models.
Comment: 8 page
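Under the noisy channel scheme described above, a candidate simplification s for a complex sentence c is scored by combining P(c|s), the channel model, with P(s), a language model over simple text. A minimal sketch of such a re-ranker; the candidate list and the two log-probability functions are toy stand-ins, not the paper's actual models:

```python
import math

def rerank(candidates, channel_logprob, lm_logprob, lam=1.0):
    """Re-rank candidate simplifications of a complex sentence.

    Each candidate s is scored by log P(c|s) + lam * log P(s):
    the channel model asks how likely the simple sentence is to
    'produce' the complex one, and the language model scores the
    fluency/simplicity of s itself.
    """
    scored = [(channel_logprob(s) + lam * lm_logprob(s), s) for s in candidates]
    scored.sort(reverse=True)
    return [s for _, s in scored]

# Toy stand-in models: fixed log-probabilities per candidate.
channel = {"the cat sat": math.log(0.6), "a cat sat down": math.log(0.3)}.get
lm = {"the cat sat": math.log(0.2), "a cat sat down": math.log(0.5)}.get

best = rerank(["the cat sat", "a cat sat down"], channel, lm)[0]
```

With these toy scores, the second candidate wins because its higher language-model probability outweighs its lower channel probability, which is exactly the trade-off the re-ranker is meant to expose.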
Automated text simplification as a preprocessing step for machine translation into an under-resourced language
In this work, we investigate the possibility of using a fully automatic text simplification system on the English source in machine translation (MT) to improve its translation into an under-resourced language. We use a state-of-the-art automatic text simplification (ATS) system to lexically and syntactically simplify source sentences, which are then translated with two state-of-the-art English-to-Serbian MT systems: phrase-based MT (PBMT) and neural MT (NMT). We explore three scenarios for using ATS in MT: (1) using the raw output of the ATS; (2) automatically filtering out sentences with low grammaticality and meaning-preservation scores; and (3) performing a minimal manual correction of the ATS output. Our results show improvement in translation fluency regardless of the chosen scenario, and differences in the success of the three scenarios, depending on the MT approach used (PBMT or NMT), with regard to improving translation fluency and post-editing effort.
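Scenario (2) above amounts to a simple threshold filter over per-sentence quality scores. A minimal sketch, assuming each simplified sentence comes with grammaticality and meaning-preservation scores; the score values and thresholds here are illustrative, not the paper's:

```python
def filter_simplifications(sentences, min_grammar=0.8, min_meaning=0.8):
    """Keep only simplified sentences whose estimated grammaticality
    and meaning-preservation scores both clear a threshold; the rest
    fall back to the original (unsimplified) source sentence."""
    kept = []
    for original, simplified, grammar, meaning in sentences:
        if grammar >= min_grammar and meaning >= min_meaning:
            kept.append(simplified)
        else:
            kept.append(original)
    return kept

data = [
    ("The committee deliberated at length.", "The committee talked a long time.", 0.9, 0.85),
    ("He absconded with the funds.", "He funds.", 0.4, 0.2),
]
result = filter_simplifications(data)
```

Falling back to the original sentence (rather than dropping it) keeps the MT input corpus line-aligned with the source, which matters when comparing the three scenarios on the same test set.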
Lexico-syntactic Text Simplification And Compression With Typed Dependencies
We describe two systems for text simplification using typed dependency structures: one that performs lexical and syntactic simplification, and another that performs sentence compression optimised to satisfy global text constraints such as lexical density, the ratio of difficult words, and text length. We report a substantial evaluation that demonstrates the superiority of our systems, individually and in combination, over the state of the art, and also report a comprehension-based evaluation of contemporary automatic text simplification systems with target non-native readers.
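Global text constraints like those above have straightforward operational definitions. A minimal sketch of two of them, lexical density (content words over total words) and difficult-word ratio, using a POS-tag heuristic for content words and a hypothetical easy-word list standing in for the paper's actual resources:

```python
def lexical_density(tokens, tags, content_tags):
    """Fraction of tokens that are content words (nouns, verbs,
    adjectives, adverbs), given one POS tag per token."""
    content = sum(1 for t in tags if t in content_tags)
    return content / len(tokens)

def difficult_word_ratio(tokens, easy_words):
    """Fraction of tokens not found in a list of 'easy' words."""
    hard = sum(1 for w in tokens if w.lower() not in easy_words)
    return hard / len(tokens)

tokens = ["The", "committee", "deliberated", "at", "length"]
tags = ["DET", "NOUN", "VERB", "ADP", "NOUN"]
density = lexical_density(tokens, tags, {"NOUN", "VERB", "ADJ", "ADV"})
ratio = difficult_word_ratio(tokens, {"the", "at", "length"})
```

A compression system optimised against such constraints would search for the deletion of dependency subtrees that brings these statistics, plus text length, under target values.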
Text Simplification Using Neural Machine Translation
Text simplification (TS) is the technique of reducing the lexical and syntactic complexity of text. Existing automatic TS systems can simplify text only by lexical simplification or by manually defined rules. Neural Machine Translation (NMT) is a recently proposed approach to Machine Translation (MT) that is receiving a lot of research interest. In this paper, we regard original English and simplified English as two languages, and apply an NMT model, a Recurrent Neural Network (RNN) encoder-decoder, to TS so that the neural network learns text simplification rules by itself. We then discuss challenges and strategies for applying an NMT model to the task of text simplification.
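Treating original and simplified English as a language pair means a standard NMT pipeline can be trained on complex/simple sentence pairs without modification. A minimal sketch of the data-preparation step only; the file names and toy pairs are illustrative, and the paper's actual corpus and toolkit are not specified here:

```python
pairs = [
    ("The legislation was subsequently ratified.", "The law was later approved."),
    ("He endeavoured to ascertain the facts.", "He tried to find out the facts."),
]

# MT toolkits typically expect one sentence per line in two parallel,
# line-aligned files: the "source" language (complex English) and the
# "target" language (simplified English).
with open("train.complex", "w") as src, open("train.simple", "w") as tgt:
    for complex_sent, simple_sent in pairs:
        src.write(complex_sent + "\n")
        tgt.write(simple_sent + "\n")
```

From here, training an RNN encoder-decoder on these files is identical to training a bilingual MT system; the "translation rules" it learns are the simplification rules.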
Towards Personalised Simplification based on L2 Learners' Native Language
We present an approach to improve the selection of complex words for automatic text simplification, addressing the need to take L2 learners' native language into account during simplification. In particular, we develop a methodology that automatically identifies ‘difficult’ terms (i.e. false friends) for L2 learners in order to simplify them. We evaluate not only the quality of the detected false friends but also the impact of this methodology on text simplification, compared with a standard frequency-based approach.
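A false friend looks similar to a word in the learner's native language but means something different, so it misleads precisely because it seems familiar. A minimal sketch of flagging candidates by orthographic similarity against a small bilingual lexicon; the toy lexicon, the similarity measure, and the threshold are illustrative assumptions, not the paper's method:

```python
from difflib import SequenceMatcher

def false_friend_candidates(english_words, native_lexicon, threshold=0.6):
    """Flag English words that are orthographically close to a word in
    the learner's native language (per the lexicon) but whose meaning
    differs, i.e. the pair is not a valid translation."""
    flagged = []
    for w in english_words:
        for native_word, english_meaning in native_lexicon:
            similar = SequenceMatcher(None, w.lower(), native_word.lower()).ratio() >= threshold
            if similar and english_meaning.lower() != w.lower():
                flagged.append(w)
                break
    return flagged

# Toy Spanish lexicon: (Spanish word, its actual English meaning).
lexicon = [("embarazada", "pregnant"), ("actual", "current")]
hits = false_friend_candidates(["embarrassed", "actual", "table"], lexicon)
```

For a Spanish L2 reader, "embarrassed" and "actual" would be flagged as difficult and become simplification targets, while "table", which resembles nothing misleading in the lexicon, would not.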
DEPLAIN: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification
Text simplification is an intralingual translation task in which documents or sentences of a complex source text are simplified for a target audience. The success of automatic text simplification systems is highly dependent on the quality of the parallel data used for training and evaluation. To advance sentence simplification and document simplification in German, this paper presents DEplain, a new dataset of parallel, professionally written and manually aligned simplifications in plain German ("plain DE", or in German: "Einfache Sprache"). DEplain consists of a news-domain corpus (approx. 500 document pairs, approx. 13k sentence pairs) and a web-domain corpus (approx. 150 aligned documents, approx. 2k aligned sentence pairs). In addition, we are building a web harvester and experimenting with automatic alignment methods to facilitate the integration of non-aligned and yet-to-be-published parallel documents. Using this approach, we are dynamically growing the web-domain corpus, which currently extends to approx. 750 document pairs and approx. 3.5k aligned sentence pairs. We show that using DEplain to train a transformer-based seq2seq text simplification model achieves promising results. We make the corpus, the adapted alignment methods for German, the web harvester, and the trained models available at https://github.com/rstodden/DEPlain.
Comment: Accepted to ACL 202
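Automatic sentence alignment of the kind mentioned above typically pairs each complex sentence with its most similar simple sentence when the similarity clears a threshold. A minimal sketch using token-overlap (Jaccard) similarity as a stand-in for the adapted alignment methods released in the repository:

```python
def jaccard(a, b):
    """Token-overlap similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def align(complex_sents, simple_sents, threshold=0.3):
    """Pair each complex sentence with its most similar simple
    sentence, keeping only pairs above a similarity threshold."""
    pairs = []
    for c in complex_sents:
        score, best = max((jaccard(c, s), s) for s in simple_sents)
        if score >= threshold:
            pairs.append((c, best))
    return pairs

complex_sents = ["die regierung hat das gesetz verabschiedet",
                 "die lage bleibt weiterhin unklar"]
simple_sents = ["die regierung hat das gesetz beschlossen",
                "niemand weiss genau was passiert"]
aligned = align(complex_sents, simple_sents)
```

The threshold is what lets the aligner leave unmatched sentences out entirely, which is essential when a plain-language version omits or merges parts of the source document.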