11,267 research outputs found

    Measuring text readability with machine comprehension: a pilot study

    Get PDF
    This article studies the relationship between text readability indices and automatic machine comprehension systems. Our hypothesis is that the simpler a text is, the better it should be understood by a machine. We thus expect a strong correlation between readability levels on the one hand and the performance of automatic reading systems on the other. We test this hypothesis with several comprehension systems based on language models of varying strengths, measuring this correlation on two corpora of journalistic texts. Our results suggest that this correlation is rather small and that existing comprehension systems are far from reproducing the gradual improvement of their performance on texts of decreasing complexity.
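
    A minimal sketch of the measurement this abstract describes, i.e. rank-correlating a per-text readability index with a reading system's accuracy. The use of textstat's Flesch index and SciPy, and the per-text accuracy list, are assumptions for illustration, not the authors' actual tooling:

        # Illustrative sketch: correlate per-text readability with comprehension accuracy.
        # textstat's Flesch Reading Ease stands in for whichever readability indices were used.
        import textstat
        from scipy.stats import spearmanr

        def readability_vs_comprehension(texts, qa_accuracy):
            """texts: raw article texts; qa_accuracy: per-text accuracy of a reading system."""
            readability = [textstat.flesch_reading_ease(t) for t in texts]
            rho, p_value = spearmanr(readability, qa_accuracy)
            return rho, p_value

    A strong positive rho would support the stated hypothesis; the abstract reports that the observed correlation is in fact small.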

    Getting creative in the languages classroom

    Get PDF
    The following principles are central to the work of ‘Linguistic Creativity in Language Learning’, a research strand of Creative Multilingualism: we create language every day; language diversity facilitates creative diversity; and linguistic diversity nurtures diverse expression of feelings, thoughts and identities, and diverse ways of knowing and seeing the world. In this chapter we outline how these principles might be considered in relation to classroom language learning. One of the authors of this chapter…

    Reading and Rereading Shakespeare’s Sonnets: Combining Quantitative Narrative Analysis and Predictive Modeling

    Get PDF
    Natural reading is rather like a juggling feat, as our eyes and minds are kept on several things at the same time. In contrast, reading texts constructed by researchers (so-called “textoids”; Graesser, Millis, & Zwaan, 1997) may be fairly simple, since this facilitates experimental investigation and thus allows clear statements about the effect of predefined variables. Accordingly, most empirical studies focused on only a few selected features while ignoring the great diversity of possibly important others (e.g., Rayner et al., 2001; Reichle, Rayner, & Pollatsek, 2003; Rayner & Pollatsek, 2006; Engbert et al., 2005; Rayner, 2009). However, results generated from textoids cannot be directly transferred to natural reading, since more than 100 features on different hierarchical levels have been identified that may influence the processing of a natural text (Graf, Nagler, & Jacobs, 2005; Jacobs, 2015a, b; Jacobs et al., 2017). The present dissertation differed from past research in that it used a literary text, i.e., Shakespeare’s sonnets, instead of texts constructed by the experimenter. Its goal was to investigate how psycholinguistic features may influence reading behavior during poem perception. To this end, two problems needed to be handled. Firstly, complex natural texts need to be broken up into measurable and testable features by “turning words into numbers” (Franzosi, 2010) for the sake of statistical analysis. Secondly, statistical ways were sought to deal with the non-linear webs of correlations among different features, which has long been a concern of Jacobs’ working group (e.g., Willems, 2015; Willems & Jacobs, 2016; Jacobs & Willems, 2018). A quantitative narrative analysis (QNA) based predictive modeling approach was suggested to solve these problems (e.g., Jacobs et al., 2017; Jacobs, 2017, 2018a, b). Since it is impossible to identify all relevant features of a natural text [e.g., over 50 features mentioned for single word recognition (Graf et al., 2005) or over 100 features computed for the corpus of Shakespeare sonnets (Jacobs et al., 2017)], and since including more inter-/supra-lexical features also requires larger samples (i.e., more/longer texts and more participants), my dissertation focuses on lexical features. Seven of these are surface features (word length, word frequency, orthographic neighborhood density, higher frequency neighbors, orthographic dissimilarity index, consonant vowel quotient, and the sonority score) and two are affective-semantic features (valence and arousal). Applying the QNA-based predictive modeling approach, I conducted three eye-tracking studies: study 1 (Chapter 5) asked native English speakers to read three of Shakespeare’s sonnets (sonnets 27, 60, and 66), aiming to investigate the role of seven surface psycholinguistic features in sonnet reading. Study 2 (Chapter 6) used a rereading paradigm and had another group of native English speakers read two of the three sonnets (sonnets 27 and 66), to find out whether the roles of the surface psycholinguistic features change in rereading. In study 3 (Chapter 7), I reanalyzed the data of study 2, attending beyond the surface features to the affective-semantic features, in order to examine whether the roles of surface and affective-semantic features differ across reading sessions.
    The three studies provide highly reliable evidence for high feature importance of the surface variables and, in rereading, for an increasing impact of the affective-semantic features in reading Shakespeare’s sonnets. From a methodological viewpoint, all three studies show that the neural net approach is far more adequate than the classical general linear model approach in psycholinguistic eye-tracking research. In the rereading studies, compared to the first reading, rereading generally improved fluency at the poem level (shorter total reading times, shorter regression times, and lower fixation probability) and the depth of comprehension (e.g., Hakemulder, 2004; Kuijpers & Hakemulder, 2018). Contrary to other rereading studies using literary texts (e.g., Dixon et al., 1993; Millis, 1995; Kuijpers & Hakemulder, 2018), no increase in appreciation was apparent. In summary, this dissertation shows that predictive modeling may be far more suitable for capturing the highly interactive, non-linear composition of the linguistic features in natural texts that guide reading behavior and reception. Moreover, surface features seem to influence reading during all reading sessions, while affective-semantic features seem to gain importance with processing depth, as indicated by their higher influence during rereading. The results appear stable and valid, as I could replicate these novel findings using machine learning algorithms within my dissertation project. This project is a first step towards a more differentiated picture of the guiding factors of poetry reception and towards a poetry-specific reading model.
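
    The abstract does not specify the models behind the QNA-based predictive approach beyond “machine learning algorithms” and a neural net approach, so the following sketch uses a random forest’s importance ranking purely as a hypothetical stand-in for how the nine lexical features could be ranked against an eye-tracking measure:

        # Hypothetical sketch: rank the nine lexical features named in the abstract by
        # importance for predicting an eye-tracking measure (e.g. total reading time per
        # word). A random forest is an illustrative stand-in for the actual models.
        from sklearn.ensemble import RandomForestRegressor

        FEATURES = ["word_length", "word_frequency", "neighborhood_density",
                    "higher_frequency_neighbors", "orthographic_dissimilarity",
                    "consonant_vowel_quotient", "sonority_score", "valence", "arousal"]

        def rank_features(X, y):
            """X: n_words x 9 matrix of the features above; y: per-word reading measure."""
            model = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
            return sorted(zip(FEATURES, model.feature_importances_),
                          key=lambda pair: pair[1], reverse=True)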

    Automated Reading Passage Generation with OpenAI's Large Language Model

    Full text link
    The widespread use of computer-based assessments and individualized learning platforms has resulted in an increased demand for the rapid production of high-quality items. Automated item generation (AIG), the process of using item models to generate new items with the help of computer technology, was proposed to reduce reliance on human subject experts at each step of the process. AIG has been used in test development for some time; still, machine learning algorithms have introduced the potential to greatly improve the efficiency and effectiveness of the process. The approach presented in this paper utilizes OpenAI's latest transformer-based language model, GPT-3, to generate reading passages. Existing reading passages were used in carefully engineered prompts to ensure the AI-generated text has content and structure similar to a fourth-grade reading passage. For each prompt, we generated multiple passages; the final passage was selected according to its Lexile score agreement with the original passage. In the final round, the selected passage went through a simple revision by a human editor to ensure the text was free of grammatical and factual errors. All AI-generated passages, along with the original passages, were evaluated by human judges according to their coherence, appropriateness for fourth graders, and readability.
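
    The selection step, picking the generated candidate whose readability best matches the original passage, can be sketched as below. Lexile scoring is proprietary, so textstat's Flesch-Kincaid grade serves here as an assumed stand-in:

        # Sketch of candidate selection by readability agreement. Flesch-Kincaid grade
        # is an assumed proxy for the Lexile scores used in the paper.
        import textstat

        def select_passage(original, candidates):
            """Return the generated candidate closest in grade level to the original."""
            target = textstat.flesch_kincaid_grade(original)
            return min(candidates,
                       key=lambda c: abs(textstat.flesch_kincaid_grade(c) - target))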

    An investigation of challenges in machine translation of literary texts : the case of the English–Chinese language pair

    Get PDF
    Given the absence of a focus on literary text translation in studies of machine translation (MT), this study investigates some challenges of this application of the technology. First, the most commonly used types of MT are reviewed in chronological order of their development, and, to identify challenges for MT in literary text translation, the challenges human translators face are linked to corresponding aspects of MT. A qualitative method is used to investigate the research questions: which challenges MT systems face in literary text translation, and whether MT can establish equivalence in such translation. Areas such as the challenges for MT in the establishment of corpora, the achievement of equivalence, and the realisation of creativity in literary texts are examined in order to reveal some of the potential contributing factors to the difficulties MT faces in literary text translation. Through text analysis of chosen sample literary texts on three online MT platforms (Google Translate, DeepL and Youdao Translate), all based on highly advanced neural machine translation engines, this study offers a pragmatic view of some challenging areas in literary text translation using these widely acclaimed online platforms, and offers insights on potential research opportunities in studies of literary text translation using MT.

    Multilingual Unsupervised Sentence Simplification

    Full text link
    Progress in Sentence Simplification has been hindered by the lack of supervised data, particularly in languages other than English. Previous work has aligned sentences from original and simplified corpora such as English Wikipedia and Simple English Wikipedia, but this limits corpus size, domain, and language. In this work, we propose using unsupervised mining techniques to automatically create training corpora for simplification in multiple languages from raw Common Crawl web data. When coupled with a controllable generation mechanism that can flexibly adjust attributes such as length and lexical complexity, these mined paraphrase corpora can be used to train simplification systems in any language. We further incorporate multilingual unsupervised pretraining methods to create even stronger models and show that by training on mined data rather than supervised corpora, we outperform the previous best results. We evaluate our approach on English, French, and Spanish simplification benchmarks and reach state-of-the-art performance with a totally unsupervised approach. We will release our models and code to mine the data in any language included in Common Crawl.
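
    The controllable generation mechanism conditions a sequence-to-sequence model on target attributes by prepending control tokens to the source sentence; the token names and values below are illustrative, not the paper's exact vocabulary:

        # Illustrative sketch of control-token conditioning: the simplification model is
        # trained on mined paraphrase pairs whose sources carry tokens encoding target
        # attributes (e.g. length ratio, lexical complexity), so those attributes can be
        # adjusted at inference time simply by changing the tokens.
        def add_control_tokens(source, length_ratio=0.8, lexical_complexity=0.75):
            prefix = f"<NbChars_{length_ratio}> <WordRank_{lexical_complexity}> "
            return prefix + source

        # add_control_tokens("The cat perched imperiously on the armchair.")
        # -> "<NbChars_0.8> <WordRank_0.75> The cat perched imperiously on the armchair."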

    Sentiment and Sentence Similarity as Predictors of Integrated and Independent L2 Writing Performance

    Get PDF
    This study aimed to utilize sentiment and sentence similarity analyses, two Natural Language Processing techniques, to see if and how well they could predict L2 writing performance in integrated and independent task conditions. The data sources were an integrated L2 writing corpus of 185 literary analysis essays and an independent L2 writing corpus of 500 argumentative essays, both compiled in higher education contexts. Both essay groups were scored between 0 and 100. Two Python libraries, TextBlob and spaCy, were used to generate sentiment and sentence similarity data. Using sentiment (polarity and subjectivity) and sentence similarity variables, regression models were built and 95% prediction intervals were compared for the integrated and independent corpora. The results showed that integrated L2 writing performance could be predicted by subjectivity and sentence similarity; however, only subjectivity predicted independent L2 writing performance. The prediction interval of subjectivity for the independent writing model was narrower than the same interval for integrated writing. The results show that sentiment and sentence similarity analysis algorithms can generate complementary data to improve more complex multivariate models of L2 writing performance prediction.
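
    Since the abstract names both libraries, the feature extraction can be sketched directly; the choice of spaCy model and the definition of sentence similarity as the mean similarity of adjacent sentence pairs are assumptions, as the study's exact procedure is not given here:

        # Sketch of the three predictors using the libraries named in the abstract.
        from textblob import TextBlob
        import spacy

        nlp = spacy.load("en_core_web_md")  # medium model ships word vectors for similarity

        def essay_features(text):
            blob = TextBlob(text)
            sents = list(nlp(text).sents)
            # assumed definition: mean similarity of adjacent sentence pairs
            sims = [a.similarity(b) for a, b in zip(sents, sents[1:])]
            return {
                "polarity": blob.sentiment.polarity,          # -1 (negative) .. +1 (positive)
                "subjectivity": blob.sentiment.subjectivity,  # 0 (objective) .. 1 (subjective)
                "sentence_similarity": sum(sims) / len(sims) if sims else 0.0,
            }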

    VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models

    Full text link
    The VNHSGE (VietNamese High School Graduation Examination) dataset, developed exclusively for evaluating large language models (LLMs), is introduced in this article. The dataset, which covers nine subjects, was generated from the Vietnamese National High School Graduation Examination and comparable tests; it includes 300 literary essays and over 19,000 multiple-choice questions on a range of topics. By including both textual data and accompanying images, the dataset assesses LLMs in multitasking situations such as question answering, text generation, reading comprehension, visual question answering, and more. Using ChatGPT and BingChat, we evaluated LLMs on the VNHSGE dataset and compared their performance with that of Vietnamese students. The results show that ChatGPT and BingChat both perform at a human level in a number of areas, including literature, English, history, geography, and civics education. They still have room to improve, though, especially in mathematics, physics, chemistry, and biology. With its wide-ranging coverage and variety of tasks, the VNHSGE dataset seeks to provide an adequate benchmark for assessing the abilities of LLMs. By making this dataset available to the scientific community, we intend to promote future developments in the creation of LLMs, especially in addressing their limitations in mathematics and the natural sciences.
    Comment: 74 pages, 44 figures
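
    Scoring an LLM on the multiple-choice portion reduces to prompting and comparing answer letters; the record format and the ask_llm callable below are hypothetical, standing in for the ChatGPT/BingChat calls the authors made:

        # Hypothetical sketch of multiple-choice evaluation on the dataset.
        def multiple_choice_accuracy(questions, ask_llm):
            """questions: dicts with 'question', 'choices' (letter -> text), 'answer'."""
            correct = 0
            for q in questions:
                options = "\n".join(f"{letter}. {text}"
                                    for letter, text in q["choices"].items())
                reply = ask_llm(f"{q['question']}\n{options}\nAnswer with a single letter.")
                correct += reply.strip().upper().startswith(q["answer"])
            return correct / len(questions)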

    Sentence Simplification for Text Processing

    Get PDF
    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.
    Propositional density and syntactic complexity are two features of sentences which affect the ability of humans and machines to process them effectively. In this thesis, I present a new approach to automatic sentence simplification which processes sentences containing compound clauses and complex noun phrases (NPs) and converts them into sequences of simple sentences which contain fewer of these constituents and have reduced per-sentence propositional density and syntactic complexity. My overall approach is iterative and relies on both machine learning and handcrafted rules. It implements a small set of sentence transformation schemes, each of which takes one sentence containing compound clauses or complex NPs and converts it into one or two simplified sentences containing fewer of these constituents (Chapter 5). The iterative algorithm applies the schemes repeatedly and is able to simplify sentences which contain arbitrary numbers of compound clauses and complex NPs.
    The transformation schemes rely on automatic detection of these constituents, which may take a variety of forms in input sentences. In the thesis, I present two new shallow syntactic analysis methods which facilitate the detection process. The first identifies various explicit signs of syntactic complexity in input sentences and classifies them according to their specific syntactic linking and bounding functions; I present the annotated resources used to train and evaluate this sign tagger (Chapter 2) and the machine learning method used to implement it (Chapter 3). The second syntactic analysis method exploits the sign tagger and identifies the spans of compound clauses and complex NPs in input sentences. In Chapter 4, I describe the development and evaluation of a machine learning approach performing this task, together with a new annotated dataset supporting this activity.
    The thesis presents two implementations of my approach to sentence simplification: one exploits handcrafted rule activation patterns to detect the parts of input sentences relevant to the simplification process, while the other uses my machine learning method to identify compound clauses and complex NPs for this purpose. Intrinsic evaluation of the two implementations is presented in Chapter 6, together with a comparison of their performance with several baseline systems. The evaluation includes comparisons of system output with human-produced simplifications, automated estimations of the readability of system output, and surveys of human opinions on the grammaticality, accessibility, and meaning of automatically produced simplifications. Chapter 7 presents extrinsic evaluation of the sentence simplification method exploiting handcrafted rule activation patterns, involving three NLP tasks: multidocument summarisation, semantic role labelling, and information extraction. Finally, in Chapter 8, conclusions are drawn and directions for future research are considered.
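
    The iterative algorithm the abstract describes can be sketched as a worklist over sentences; detect_target and apply_scheme below are hypothetical stand-ins for the thesis's sign tagger, span identifier, and transformation schemes:

        # Minimal sketch of the iterative simplification loop, under the assumptions above.
        def simplify(sentence, detect_target, apply_scheme, max_steps=50):
            """Rewrite until no compound clause or complex NP remains (or max_steps)."""
            agenda, done = [sentence], []
            for _ in range(max_steps):
                if not agenda:
                    break
                s = agenda.pop(0)
                target = detect_target(s)  # compound-clause / complex-NP span, or None
                if target is None:
                    done.append(s)         # already simple
                else:
                    agenda.extend(apply_scheme(s, target))  # one or two simpler sentences
            return done + agenda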