1,106 research outputs found

    Intralingual translation and cascading crises: evaluating the impact of semi-automation on the readability and comprehensibility of health content

    During crises, intralingual translation (or simplification) of medical content can facilitate comprehension among lay readers and foster their compliance with instructions aimed to avoid or mitigate the cascading effects of crises. The onus of simplifying health-related texts often falls on medical experts, and the task of intralingual translation tends to be nonautomated. Medical authors are asked to check and remember different sets of plain language guidelines, while also relying on their interpretation of how and when to implement these guidelines. Accordingly, even simplified health-related texts present characteristics that make them difficult to read and comprehend, particularly for an audience with low (health) literacy. Against this background, this chapter describes an experimental study aimed at testing the impact that using a controlled language (CL) checker to semi-automate intralingual translation has on the readability and comprehensibility of medical content. The study focused on the plain language summaries and abstracts produced by the non-profit organisation Cochrane. Using Coh-Metrix and recall, this investigation found that the introduction of a CL checker influenced some readability features, but not lay readers’ comprehension, regardless of their native language. Finally, strategies to enhance the comprehensibility of health content and reduce the vulnerability of readers in crises are discussed

    Automatic detection of parallel sentences from comparable biomedical texts

    Parallel sentences provide semantically similar information which can vary on a given dimension, such as language or register. Parallel sentences with register variation (such as expert and non-expert documents) can be exploited for automatic text simplification. The aim of automatic text simplification is to make given information easier to access and understand. In the biomedical field, simplification may allow patients to understand medical and health texts. Yet no such resources are currently available. We propose to exploit comparable corpora which are distinguished by their registers (specialized and simplified versions) to detect and align parallel sentences. These corpora are in French and are related to the biomedical area. Our purpose is to decide whether a given pair of specialized and simplified sentences should be aligned or not. Manually created reference data show 0.76 inter-annotator agreement. We treat this task as binary classification (alignment/non-alignment). We perform experiments on balanced and imbalanced data. The results on balanced data reach up to 0.96 F-measure. On imbalanced data, the results are lower but remain competitive when using classification models trained on balanced data. In addition, among the three datasets exploited (semantic equivalence and inclusions), the detection of equivalence pairs is more efficient

    Effects of lexical simplification toward vocabulary mastery and reading comprehension of the Eleventh Grade IPA Students at SMA Muhammadiyah Palangka Raya

    The study, titled “Effect of Lexical Simplification Toward Vocabulary Mastery and Reading Comprehension of The Eleventh Grade IPA Students at SMA Muhammadiyah 1 Palangka Raya”, was motivated by a reading comprehension problem among Eleventh Grade IPA students at SMA Muhammadiyah 1 Palangka Raya, caused by limited vocabulary mastery. The writer chose lexical simplification to address this problem. The purpose of the study was to find out whether lexical simplification has a significant effect on students’ vocabulary mastery and reading comprehension, and whether there is an interaction effect between vocabulary mastery and reading comprehension with and without lexical simplification. The research used a quantitative approach with an ex-post facto design. The writer designed the lesson plan, conducted the treatment, and observed the students’ scores through a pre-test and a post-test. The population of the study was the Eleventh Grade IPA students at SMA Muhammadiyah 1 Palangka Raya, 58 students in total. The writer used cluster random sampling and took 29 students as the sample. The writer then used one-way ANOVA to analyze the data. The results showed significant differences between students’ scores on narrative text with and without lexical simplification for vocabulary mastery and reading comprehension after the treatment: the F value was higher than the F table value (vocabulary = 2.723 > 2.71; reading comprehension = 5.725 > 2.71). The study also reported that the significance value was lower than alpha (0.01 < 0.05), which the writer interpreted as meaning that there was no significant interaction effect between vocabulary mastery and reading comprehension with and without lexical simplification. Finally, based on these results, the writer recommended that teachers apply lexical simplification in reading comprehension. 
Given the study results, the use of lexical simplification is effective because students’ vocabulary mastery and reading comprehension improved.

    Investigation into the efficacy of text modification: What type of text do learners of Japanese authenticate?

    The dissertation is a study of the efficacy of reading materials for learners of Japanese as a foreign language (JFL). It discusses the merits of 'authentic' materials written primarily for native speaker-readers compared to 'modified' texts adapted in some way for learners. Further, it compares various sorts of modifications: simplification, elaboration, marginal glosses and the use of onscreen computer pop-ups. More broadly, it locates the study within the wider discourse of pedagogy concerning reading materials for second language learners, especially JFL learners. Reading in Japanese as a second language is generally thought to be more demanding than reading in some other second languages. The study therefore argues that the authenticity debate and efficacy of text modification must be addressed specifically in the JFL reading pedagogy. In the context of the authenticity debate, there are, broadly, two opposing views. One favours the predominant use of unmodified texts while the other promotes the efficacy of modified texts. While there have been numerous theoretical discussions and empirical findings in the reading pedagogy of English as a second or foreign language (ESL/EFL), the JFL reading pedagogy is currently lacking such academic endeavours. Hence, the present study seeks to fill the gap. The study is mixed methods research, consisting of three projects in which both qualitative and quantitative methods are employed. This approach investigates equally the effects of text modification on participating learners' cognitive changes (reading comprehension) and affective changes (motivation and perception). The results indicate that learners of Japanese comprehend modified texts statistically significantly better than they do unmodified texts. Findings include that modified texts for Japanese are more efficacious than they are in the ESL/EFL context. However, modified texts that are insufficiently challenging fail to enhance learners' motivation. 
Advanced learners especially were found to have a negative attitude toward reading modified Japanese texts

    An Automatic Modern Standard Arabic Text Simplification System: A Corpus-Based Approach

    This thesis brings together an overview of Text Readability (TR) in relation to Text Simplification (TS) with an application of both to Modern Standard Arabic (MSA). It presents our findings on using automatic TR and TS tools to teach MSA, along with challenges, limitations, and recommendations for enhancing the TR and TS models. Reading is one of the most vital tasks that provide language input for communication and comprehension skills. Long sentences, connected sentences, embedded phrases, passive voice, non-standard word orders, and infrequent words have been shown to increase text difficulty for people with low literacy levels, as well as for second language learners. The thesis compares the use of sentence embeddings of different types (fastText, mBERT, XLM-R and Arabic-BERT), as well as traditional language features such as POS tags, dependency trees, readability scores and frequency lists for language learners. The 3-way CEFR (Common European Framework of Reference for Languages proficiency levels) classification reaches an F-1 of 0.80 with Arabic-BERT and 0.75 with XLM-R, and the regression task reaches a Spearman correlation of 0.71. The binary difficulty classifier reaches an F-1 of 0.94, and the sentence-pair semantic similarity classifier reaches an F-1 of 0.98. TS is an NLP task aiming to reduce the linguistic complexity of the text while maintaining its meaning and original information (Siddharthan, 2002; Camacho Collados, 2013; Saggion, 2017). The simplification study experimented with two approaches: (i) a classification approach and (ii) a generative approach. It then evaluated the effectiveness of these methods using the BERTScore (Zhang et al., 2020) evaluation metric. The simple sentences produced by the mT5 model achieved P 0.72, R 0.68 and F-1 0.70 via BERTScore, while combining Arabic-BERT and fastText achieved P 0.97, R 0.97 and F-1 0.97. 
To reiterate, this research demonstrated the effectiveness of the implementation of a corpus-based method combined with extracting extensive linguistic features via the latest NLP techniques. It provided insights which can be of use in various Arabic corpus studies and NLP tasks such as translation for educational purposes
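
The precision, recall and F-1 figures quoted in the abstract above can be related through the usual F-1 definition, the harmonic mean of precision and recall. The snippet below is only an illustrative sketch, not the thesis's evaluation code: the function name `f1_score` and the dictionary layout are assumptions made here, and BERTScore normally computes F per sentence pair before averaging, so a corpus-level F-1 need not equal the harmonic mean of the averaged P and R exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F-1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as quoted in the abstract above.
# The labels are shorthand chosen for this example.
reported = {
    "mT5": (0.72, 0.68),
    "Arabic-BERT + fastText": (0.97, 0.97),
}

for system, (p, r) in reported.items():
    print(f"{system}: F-1 = {f1_score(p, r):.2f}")
```

Rounded to two decimals, both results agree with the F-1 values quoted in the abstract (0.70 and 0.97).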

    Extensive reading and L2 development: a study of Hong Kong secondary learners of English

    Although extensive reading is regarded by many practitioners as a potentially very useful means of assisting L2 development, experimental enquiry into its effectiveness has so far produced little more than a collection of somewhat disparate findings. Nor has any attempt been made to categorically link any such research findings with second language acquisition theory. Consequently, we have no coherent, research-based theory of L2 extensive reading. Using data from a large-scale project implemented in Hong Kong secondary schools, the L2 English writing of students participating in an extensive reading scheme as part of the school curriculum was compared to that of non-participant students. Samples of timed narrative writing from 392 students in Secondaries 2 and 3 were rated holistically on a scale of 1-6 for overall quality, grammatical complexity, grammatical accuracy, vocabulary range, coherence, spelling and conventions of presentation. A subset of 150 compositions from two control and two experimental classes were further evaluated on a range of objective measures. Results from the two evaluation procedures were cross-referenced, and indicate that extensive reading in an L2 may benefit language development in quite specific ways. Findings are discussed within the context of current psycholinguistic and neurolinguistic theory and an explanation consistent with such theory is proposed. It is argued that, because it is likely to be subserved by a different memory system from that which subserves formal classroom instruction, extensive reading may enhance levels of automaticity, thus favouring the development of fluency, and, concomitantly, complexity and coherence. At low levels of L2 competence, extensive reading may also accelerate the acquisition of basic grammar through frequency effects

    Détection automatique de phrases parallèles dans un corpus biomédical comparable technique/simplifié

    Automatic detection of parallel sentences in comparable biomedical corpora. Parallel sentences provide identical or semantically similar information which gives important clues on language. When sentences vary by their register (such as expert vs. non-expert), they can be exploited for automatic text simplification. The aim of text simplification is to improve the understanding of texts. For instance, in the biomedical field, simplification may allow patients to better understand medical texts related to their health. Yet there are currently very few resources for the simplification of French texts. We propose to exploit comparable corpora, which are distinguished by their technicality, to detect parallel sentences and to align them. The reference data are created manually and show 0.76 inter-annotator agreement. We perform experiments on balanced and imbalanced data. The results on balanced data reach up to 0.94 F-measure. On imbalanced data, the results are lower (up to 0.92 F-measure) but remain competitive when using classification models trained on balanced data.

    Predicting lexical complexity in English texts: the CompLex 2.0 dataset

    © 2022 The Authors. Published by Springer. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1007/s10579-022-09588-2 Identifying words which may cause difficulty for a reader is an essential step in most lexical text simplification systems prior to lexical substitution and can also be used for assessing the readability of a text. This task is commonly referred to as complex word identification (CWI) and is often modelled as a supervised classification problem. For training such systems, annotated datasets in which words and sometimes multi-word expressions are labelled regarding complexity are required. In this paper we analyze previous work carried out in this task and investigate the properties of CWI datasets for English. We develop a protocol for the annotation of lexical complexity and use this to annotate a new dataset, CompLex 2.0. We present experiments using both new and old datasets to investigate the nature of lexical complexity. We found that a Likert-scale annotation protocol provides an objective setting that is superior for identifying the complexity of words compared to a binary annotation protocol. We release a new dataset using our new protocol to promote the task of Lexical Complexity Prediction

    Facilitating second language acquisition (SLA) through computer-mediated communication (CMC) in an English for Civil Engineering (ECE) environment

    This study explores the application of computer-mediated communication (CMC) in an English for Civil Engineering (ECE) learning setting. The aim is to examine the interactional opportunities present in the computer-mediated environment for evidence of conditions deemed facilitative of second language acquisition, based on the tenets prescribed by the Interaction Hypothesis. This theory emphasizes the importance of interaction in language learning and the necessity for learners to have access to meaningful and comprehensible input. It is based on the premise that acquisition will occur through interaction where learners are provided opportunities to negotiate meaning in order to develop mutual understanding. In turn, this allows for hypothesis testing related to learners' developing interlanguage systems. It also provides opportunities for learners to produce comprehensible output and have access to feedback related to their attempts. All these are regarded as crucial for language acquisition. Most of the studies on interaction work reported in the literature are related to oral interaction. Nevertheless, studies on the use of CMC have reported that this medium can promote meaningful interaction that can foster interlanguage development through meaning negotiation and focus on form. The participants in this study consist of one English language teacher and a group of seventy-three students. The task employed for this study is based on one of the requirements of the ECE program, specifically for the students to engage in a discussion forum on current and relevant social, economic and environmental issues related to the civil engineering field and profession. For a more in-depth and thorough understanding of the entire perspective in the application of CMC in this ECE setting, both qualitative and quantitative procedures are adopted for the purpose of data analysis. 
The analysis of interactional exchanges reveals that this on-line platform serves as a suitable context and a conducive environment for interlanguage development. Both student-to-teacher and student-to-student interactional exchanges provide evidence of opportunities for modified input, feedback and modified output. The interview responses also provide important insights into the subjective dimension of learning in terms of students' overall opinion and perception of the on-line interactional exchange