
    Lexical Parameters, Based on Corpus Analysis of English and Swedish Cancer Data, of Relevance for NLG

    Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 333-336

    A Classifier to Evaluate Language Specificity in Medical Documents

    Consumer health information written by health care professionals is often inaccessible to the consumers it is written for. Traditional readability formulas examine syntactic features like sentence length and number of syllables, ignoring the target audience's grasp of the words themselves. The use of specialized vocabulary disrupts the understanding of patients with low reading skills, causing a decrease in comprehension. A naive Bayes classifier for three levels of increasing medical terminology specificity (consumer/patient, novice health learner, medical professional) was created with a lexicon generated from a representative medical corpus. Ninety-six percent accuracy in classification was attained. The classifier was then applied to existing consumer health web pages. We found that only 4% of pages were classified at a layperson level, regardless of the Flesch reading ease scores, while the remaining pages were at the level of medical professionals. This indicates that consumer health web pages are not using appropriate language for their target audience.
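
For orientation, the Flesch reading ease score mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the coefficients are the standard published ones, but the syllable counter is a crude vowel-group heuristic assumed here for brevity, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (a rough heuristic, assumed here)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch formula: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


plain = "Take the pill with food."
jargon = "Administer the medication concomitantly with nourishment."
print(flesch_reading_ease(plain) > flesch_reading_ease(jargon))
```

The score depends only on sentence length and syllable counts, which is why, as the abstract argues, it cannot tell whether the reader actually knows the words.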

    A corpus based, lexical analysis of patient information for radiography

    Despite the importance and the ubiquity of medical patient information in many healthcare systems in the world, we know very little about the lexical characteristics of the register. We do not know how patients perceive the information in the leaflets or whether the messages are transmitted effectively and fully understood. How a medical authority instructs and obliges patients in written information is also unclear. While the number of radiographic examinations performed globally increases year on year, studies consistently show that patients lack basic knowledge regarding the commonly-performed exams and show very poor understanding of the concomitant risks associated with radiation. There is, then, a pressing need to investigate radiography patient information in order to better understand why, and where, it is less effective. This thesis applies three approaches common to the field of corpus linguistics to uncover some of the lexical characteristics of patient information for radiography. The approaches used in this thesis are a keyword extraction, a lexical bundles analysis and an investigation of modal verbs used to express obligation. The findings suggest that patient information for radiography possesses characteristics more common to academic prose than conversation, although the high informational content of the register goes some way to explaining this and suggests that the reliance on these structures may, to a certain extent, be unavoidable. Results also suggest that the reliance on should to oblige and instruct is problematic as it may cause interpretation problems for certain patients, including those for whom English is not a primary language. 
Certain other characteristics of patient information revealed by the analyses may also cause comprehension problems, and while further research is needed, none of these characteristics would be evaluated as problematic by standard readability measures, furthering doubts about the suitability of such measures for the evaluation of medical information.
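
As a rough illustration of the lexical bundles analysis named in the abstract above, the sketch below counts recurrent four-word sequences across a set of texts. The thresholds (`min_count`, `min_texts`), the tokenizer, and the function name are assumptions made for this example; the thesis's actual criteria are not given in the abstract.

```python
import re
from collections import Counter


def lexical_bundles(texts, n=4, min_count=2, min_texts=2):
    """Return n-word sequences that recur often enough and in enough texts."""
    counts = Counter()  # total occurrences of each n-gram
    spread = Counter()  # number of distinct texts containing each n-gram
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        counts.update(grams)
        spread.update(set(grams))
    return {
        gram: count
        for gram, count in counts.items()
        if count >= min_count and spread[gram] >= min_texts
    }


leaflets = [
    "You should not eat before the scan.",
    "You should not eat on the day.",
    "Please arrive early.",
]
print(lexical_bundles(leaflets))
```

Requiring a bundle to appear in several texts, not just several times, guards against sequences that are frequent only because one leaflet repeats itself.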

    The effects of individual differences and linguistic features on reading comprehension of health-related texts

    Background. Relatively little attention has been focused on whether or how the effects of reader characteristics, or of the linguistic properties of a text, predict reading comprehension of health-related information. In addition, there is little evidence for the utility of any of the writing guidelines promulgated by the National Health Service (NHS) in order to improve the comprehension of health information. Nonetheless, some previous research suggests that health-related texts could be adapted for different groups of users to optimise understanding. Thus, existing knowledge presents important limitations, and raises concerns with potentially far-reaching practical implications. To address these concerns, I investigated how variation in individual differences and in text features predicts the comprehension of health-related texts, examining how the effects of textual features may differ for different kinds of readers. Method. The focus of this thesis is on Study 3, in which I investigated the predictors of tested comprehension, but I report preliminary studies where I examined the readability of a sample of health-related texts (Study 1), and the perceived comprehension of a sample of health-related texts (Study 2). In the primary study (Study 3), I used Bayesian mixed-effects models to analyse the influences that affect the accuracy of responses to questions probing the comprehension of a sample of health-related texts. I measured variation among 200 participants in their cognitive abilities, to capture the effects of individual differences, as well as variation in the linguistic features of texts, to capture the effects of text structure and content. Results. I found that tested comprehension was less likely to be accurate among older participants. However, comprehension accuracy was greater given higher levels of education, health literacy, and English language proficiency levels. 
In addition, self-rated evaluations of perceived comprehension predicted comprehension, but only in the absence of other individual-differences-related predictors. Variation in text features, including readability estimates, did not predict comprehension accuracy, and there was no evidence for the modulation of the effects of individual differences by text features. Discussion. Text features did not modulate the effects of individual differences to influence comprehension accuracy in any meaningful way. This suggests that adapting health-related texts to different groups of the population may be of limited practical value. Implications. Individual differences really matter to comprehension. Thus, optimally, understanding of health-related texts amongst the end-users should be tested, and interventions to aid readers, such as those with relatively low health literacy levels, could be used to improve comprehension of health-texts. In the absence of sensitive measures of reader characteristics, and when testing of understanding is not possible, the use of end-user evaluations of health-related texts may serve as a useful proxy of tested comprehension. However, looking for text effects, and guidance focusing on text effects, seems less useful given the reported evidence. Consequently, the effectiveness of designing health-related texts with the consideration of NHS’s text writing guidelines is likely to be limited.

    Is Electronic Access to Medical Records an Empowering and Patient-centered Initiative? – a Qualitative Contextual and Linguistic Analysis of Danish Electronic Records

    Political correctness demands a patient empowering and patient-centered approach to health care and today patients are increasingly involved in, and responsible for, their own health. Patients are potentially subjected to large amounts of health information and, in a Danish context, patients have recently gained easy electronic access to their hospital records. Access, which used to be by application, is now only a few clicks away. This initiative is praised as patient empowering and patient-centered even though the e-records are not written for patients, but are the working tool of health professionals. Thus, an expert language text, as it stands, has to function as patient information. In this article, we examine the language of the e-records with a view to determining potential lay-friendliness and thus patient-centeredness. We also discuss whether access, by definition, is a progressive initiative and whether patient empowerment is always the same as patient-centeredness

    Proposing Thematic Diversity Reference Model for Essay Writing and Assessment

    Tertiary-level learners show inadequate thematic diversity in their English essay writing, especially in non-English-speaking countries. The learners’ effectiveness in developing the topic is unsatisfactory, affecting their creativity and performance and, finally, the evaluation of their essays. Though there are guidelines and textbooks for developing writing skills, inadequate training with a proper practice model is reflected in weak performance at the university level. Writers can develop a topic in diverse ways from multidisciplinary, spatial, and temporal perspectives. Therefore, a simplified guiding Model is proposed. This paper describes the Model and instructions for applying it in the pre-writing, writing and post-writing processes. It can be an efficient tool for students in their written course assignments, essay questions in examinations, and creative writing. In addition, it can be an effective tool for teachers to teach and assess expressive language skills like essay writing. A visually designed Thematic Diversity Reference Model for textual production and assessment developed in this research addresses this thematic non-diversity. The topic is approached at four levels: 1) Content, 2) Strategic, 3) Tactic, and 4) Logical Approaches. It stimulates feelings, ideas and activities in abstract and concrete ways and from different perspectives, as described in the schema theory of J. Piaget (1957) and the Social Learning Theory of Albert Bandura (1999), thus diversifying and enhancing the theme or topic and, consequently, the sub-themes. The researchers analysed an essay written in English to demonstrate the Model's usefulness in writing and assessment. The analysis was predominantly qualitative. The application reveals a significant failure in all four approaches. 
It identifies the absence of multidisciplinary nature, critical, creative, temporal, spatial, and contextual expansion, statistical, ideological, experiential, and referential information, causal and logical relationships, cohesion, and coherence. Thus, the Model assists the writer and the reader, advocating pedagogical solutions in developing writing strategies for better performance and assessment. It applies to interdisciplinary and multidisciplinary writings, social sciences, and humanities. KEYWORDS: Essay writing, Thematic diversity, Creativity; Evaluation; Reference mode

    Effects of Two Prereading Activities on Comprehending Science Text: Reading Abridged Text and Learning Vocabulary Words

    The present study examined the effects of two prereading activities designed to improve fifth-grade students’ vocabulary learning and comprehension of science textbook content containing those words. Ninety-three fifth grade students participated in this study. The prereading activities consisted of students reading an abridged version of the text or receiving instruction on vocabulary words drawn from the text before reading the full text once. Students receiving these treatments were compared to a control condition in which students reread the full text passage twice but did not receive any prereading treatment. Students were grouped by reading ability levels into above average, average, and below average readers. ANOVAs confirmed that the treatment/control groups did not differ on any of the pretests. ANOVAs were performed to examine the effects of the prereading treatments on measures of students’ vocabulary learning and reading comprehension of the science text. Results showed that students in the vocabulary training condition and the abridged text condition performed similarly in defining the vocabulary words and generating sentences containing the words, and both groups outperformed the control group on these measures. In addition, the vocabulary trained group outperformed the other two groups on a prompted recall measure of text comprehension. Treatment effects conditioned by reader ability were found on the sentence generation measure. The difference favoring the vocabulary group over the control group was evident for above-average and average readers but not for below average readers. The difference favoring the abridged group over the control group was evident for average and below average readers but not for above average readers. 
Students in the abridged text condition performed similarly across all reading levels, whereas students in the vocabulary and the control conditions differed across reading levels, with performance declining linearly as reading level declined. Better readers outperformed poorer readers on all the vocabulary measures and all but one of the reading comprehension measures. Results of this study suggest that having students read an abridged version of a difficult science text can help students learn vocabulary words in the text. Teaching students vocabulary words contained in a difficult science text prior to reading the text can help students learn the vocabulary words and improve their comprehension of the text

    Simplifying, reading, and machine translating health content: an empirical investigation of usability

    Text simplification, through plain language (PL) or controlled language (CL), is adopted to increase readability, comprehension and machine translatability of (health) content. Cochrane is a non-profit organisation where volunteer authors summarise and simplify health-related English texts on the impact of treatments and interventions into plain language summaries (PLS), which are then disseminated online to the lay audience and translated. Cochrane’s simplification approach is non-automated, and involves the manual checking and implementation of different sets of PL guidelines, which can be an unsatisfactory, challenging and time-consuming task. This thesis examined if using the Acrolinx CL checker to automatically and consistently check PLS for readability and translatability issues would increase the usability of Cochrane’s simplification approach and, more precisely: (i) authors’ satisfaction; and (ii) authors’ effectiveness in terms of readability, comprehensibility, and machine translatability into Spanish. Data on satisfaction were collected from twelve Cochrane authors by means of the System Usability Scale and follow-up preference questions. Readability was analysed through the computational tool Coh-Metrix. Evidence on comprehensibility was gathered through ratings and recall protocols produced by lay readers, both native and non-native speakers of English. Machine translatability was assessed in terms of adequacy and fluency with forty-one Cochrane contributors, all native speakers of Spanish. Authors seemed to welcome the introduction of Acrolinx, and the adoption of this CL checker reduced word length, sentence length, and syntactic complexity. No significant impact on comprehensibility and machine translatability was identified. We observed that reading skills and characteristics other than simplified language (e.g. formatting) might influence comprehension. Machine translation quality was relatively high, with mainly style issues. 
This thesis presented an environment that could boost volunteer authors’ satisfaction and foster their adoption of simple language. We also discussed strategies to increase the accessibility of online health content among lay readers with different skills and language backgrounds
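
The System Usability Scale used in the abstract above to measure author satisfaction is conventionally scored as in the sketch below. This shows the standard SUS scoring rule only; the function name is hypothetical and nothing here reproduces the thesis's data or analysis.

```python
def sus_score(responses):
    """Convert ten SUS item responses (each 1-5) into a 0-100 score.

    Odd-numbered items are positively worded and contribute (response - 1);
    even-numbered items are negatively worded and contribute (5 - response).
    The summed contributions are multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


print(sus_score([3] * 10))  # a neutral answer to every item
```

Because the items alternate between positive and negative wording, a respondent who answers every item identically lands at the midpoint of the scale.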

    Writing Science for Diverse Audiences: A Corpus-based Discourse Analysis of the Language of Science News and Research Articles

    Despite the historic prevalence of the research article (RA) genre in the English for Academic Purposes (ESP) literature, work examining the ways that academic research is communicated with broader audiences, sometimes referred to as ‘popularization’ or ‘popular science’ (Gotti, 2014), is on the rise. Scholars from diverse fields have shown interest in contexts of popular science in part because they represent a meeting point between the general public and academia. However, much of the research examining the language of popular science has adopted a rhetorical rather than linguistic lens (Pérez-Llantada, 2021). In addition, the recent interest in digital multi-modal genres (e.g., Luzón, 2023; Xia, 2023) has left the linguistic features of written discourse comparatively under-examined, and studies adopting corpus approaches have often included texts which are out-of-date, few in number, or under-described with regard to their place under the umbrella of popular science. This dissertation applies a mixed methods design to a new corpus representative of one variety of popular science writing, namely online science news articles (SNAs). It uses computer programs to compare the linguistic profiles of 400 SNAs with a matching corpus of 400 RAs. Specifically, this dissertation investigates features of the verb phrase, namely short verb phrase variation, long verb phrase variation, and attribution of knowledge via reporting clauses. These features offer a useful contrast to the current noun-focused approach to grammatical complexity research (see Lan, Liu, & Staples, 2019). To inform interpretation of corpus findings, discourse-based interviews (Conrad, 2014) with seven SNA writers were also employed. Findings from the linguistic analyses, analyses of the registers’ situational characteristics, and informant interviews highlight the many differences between the registers, differences motivated especially by characteristics of audience, textual layout, and purpose. 
SNAs are short texts which function to inform and entertain an audience of mixed expertise. As a result, they utilize more verbs overall, as well as features of short and long verb phrases which allow writers to report research activities as stories involving researchers, their beliefs, and their words. Implications relating to contexts of science communication and pedagogical applications are discussed