Evaluating prose style transfer with the Bible
In the prose style transfer task a system, provided with text input and a
target prose style, produces output which preserves the meaning of the input
text but alters the style. These systems require parallel data for evaluation
of results and usually make use of parallel data for training. Currently, there
are few publicly available corpora for this task. In this work, we identify a
high-quality source of aligned, stylistically distinct text in different
versions of the Bible. We provide a standardized split, into training,
development and testing data, of the public domain versions in our corpus. This
corpus is highly parallel since many Bible versions are included. Sentences are
aligned due to the presence of chapter and verse numbers within all versions of
the text. In addition to the corpus, we present the results, as measured by the
BLEU and PINC metrics, of several models trained on our data which can serve as
baselines for future research. While we present these data as a style transfer
corpus, we believe that they are of unmatched quality and may be useful for other
natural language tasks as well.
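The verse-based alignment and the PINC metric mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The function names (`align_by_verse`, `ngrams`, `pinc`), the `(book, chapter, verse)` key layout, whitespace tokenization, and the toy sentences are all assumptions made for the example. PINC is taken here in its usual form: the average, over n-gram orders 1 to 4, of the fraction of candidate n-grams that do not appear in the source. Skipping orders for which the candidate has no n-grams is also an assumption of this sketch.

```python
def align_by_verse(version_a, version_b):
    """Pair up texts that share the same (book, chapter, verse) key.

    Both arguments are dicts mapping a verse key to its text (assumed layout).
    """
    shared = sorted(set(version_a) & set(version_b))
    return [(key, version_a[key], version_b[key]) for key in shared]


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source, candidate, max_n=4):
    """Mean fraction of candidate n-grams that do NOT occur in the source.

    Higher values mean the candidate differs more from the source wording.
    """
    src = source.lower().split()      # naive whitespace tokenization (assumption)
    cand = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        if not cand_ngrams:           # candidate shorter than n tokens: skip this order
            continue
        overlap = len(cand_ngrams & ngrams(src, n))
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy placeholder sentences, not real verse text.
    version_a = {("Book", 1, 1): "the old text says this", ("Book", 1, 2): "only in a"}
    version_b = {("Book", 1, 1): "the new text reads so"}
    for key, text_a, text_b in align_by_verse(version_a, version_b):
        print(key, round(pinc(text_a, text_b), 3))
```

PINC rewards lexical change from the input while BLEU rewards overlap with a reference, which is why the two are typically reported together for this kind of task.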
Lexico-syntactic Text Simplification And Compression With Typed Dependencies
We describe two systems for text simplification using typed dependency structures: one that performs lexical and syntactic simplification, and another that performs sentence compression optimised to satisfy global text constraints such as lexical density, the ratio of difficult words, and text length. We report a substantial evaluation that demonstrates the superiority of our systems, individually and in combination, over the state of the art, and also report a comprehension-based evaluation of contemporary automatic text simplification systems with target non-native readers.
Chapter Bibliography
authored support system; contextual machine translation; controlled document authoring; controlled language; document structure; terminology management; translation technology; usability evaluation
Applying Estonian Digital Resources and Technologies in a Text Simplification Program
The purpose of this Bachelor's thesis was to research text simplification methods and to create a web-based application that simplifies Estonian texts. The web application uses language resources such as the Estonian Wordnet, a word2vec model, a frequency list, a foreign word lexicon and a basic vocabulary dictionary, which are used to determine the complexity of words and their suitability to the text.
A Spoken Dialogue System for Enabling Comfortable Information Acquisition and Consumption
Waseda University degree record number: 新8137. Waseda University
Intelligent text processing to help readers with autism
© 2018, Springer International Publishing AG. Autistic Spectrum Disorder (ASD) is a neurodevelopmental disorder which has a life-long impact on the lives of people diagnosed with the condition. In many cases, people with ASD are unable to derive the gist or meaning of written documents due to their inability to process complex sentences, understand non-literal text, and understand uncommon and technical terms. This paper presents FIRST, an innovative project which developed language technology (LT) to make documents more accessible to people with ASD. The project has produced a powerful editor which enables carers of people with ASD to prepare texts suitable for this population. Assessment of the texts generated using the editor showed that they are not less readable than those generated more slowly as a result of onerous unaided conversion and were significantly more readable than the originals. Evaluation of the tool shows that it can have a positive impact on the lives of people with ASD.