
    All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch

    Readability research has a long and rich tradition, but there has been too little focus on general readability prediction that does not target a specific audience or text genre. Moreover, although NLP-inspired research has focused on adding more complex readability features, there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for English and Dutch generic text using supervised machine learning. Based on readability assessments by both experts and a crowd, we implement different types of text characteristics, ranging from easy-to-compute superficial features to features requiring deep linguistic processing, resulting in ten different feature groups. Both a regression and a classification setup are investigated, reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations by using a wrapper-based genetic algorithm optimization approach is promising, as it provides considerable insight into which feature combinations contribute to the overall readability prediction. Since we also have gold-standard information available for those features requiring deep processing, we are able to investigate the true upper bound of our Dutch system. Interestingly, we observe that the performance of our fully automatic readability prediction pipeline is on par with the pipeline using gold-standard deep syntactic and semantic information.
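The wrapper-based genetic algorithm feature selection mentioned in this abstract can be sketched as follows. This is an illustrative toy only: in the paper the fitness of a feature subset is the cross-validated performance of the readability predictor trained on it, whereas `evaluate` here is an invented stand-in that rewards a hypothetical set of "useful" feature groups.

```python
import random

def evaluate(subset, useful=frozenset({0, 3, 7})):
    # Toy fitness: reward overlap with a hypothetical set of useful
    # feature groups and lightly penalise subset size. In a real wrapper
    # setup this would be cross-validated prediction accuracy.
    return len(subset & useful) - 0.1 * len(subset)

def ga_feature_selection(n_features=10, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    # Individuals are bit-lists: 1 = feature group included.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]

    def fitness(ind):
        return evaluate({i for i, bit in enumerate(ind) if bit})

    for _ in range(generations):
        def pick():
            # Binary tournament selection.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ (rng.random() < 0.05) for bit in child]  # mutation
            nxt.append(child)
        pop = nxt

    best = max(pop, key=fitness)
    return {i for i, bit in enumerate(best) if bit}

print(ga_feature_selection())
```

The returned set is the best-scoring feature-group subset found; with a real fitness function it would indicate which of the ten feature groups to keep.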

    Readability, presentation and quality of allergy-related patient information leaflets: a cross-sectional and longitudinal study

    Objective: Patient information leaflets (PILs) are widely used to reinforce or illustrate health information and to complement verbal consultation. The objectives of the study were to assess the readability and presentation of PILs published by Allergy UK, and to conduct a longitudinal assessment to evaluate the impact of leaflet amendment and revision on readability. Methods: Readability of Allergy UK leaflets available in 2013 was assessed using the Simple Measure of Gobbledegook (SMOG) and the Flesch-Kincaid Reading Grade formula. Leaflet presentation was evaluated using the Clear Print Guidelines of the Royal National Institute of Blind People (RNIB) and the Patient Information Appraisal System developed by the British Medical Association (BMA). Changes in the leaflets’ readability scores over five years were investigated. Results: 108 leaflets, covering a wide range of allergic conditions and treatment options, were assessed. The leaflets had average SMOG and Flesch-Kincaid scores of 13.9 (range 11-18, SD 1.2) and 10.9 (range 5-17, SD 2.1) respectively. All leaflets met the RNIB Clear Print guidelines, with the exception of font size, which was universally inadequate. The leaflets scored on average 10 (median 10, range 7-15) out of a maximum of 27 on the BMA checklist. The overall average SMOG score of 31 leaflets available in both 2008 and 2013 had not changed significantly. The process of leaflet revision resulted in a 1% change in readability scores overall, with a predominantly upward trend: six leaflets increased their readability score by >10% and only three decreased by >10%. Conclusion: Allergy-related patient information leaflets are well presented but have readability levels that are higher than those recommended for health information.
Involving service users in the process of leaflet design, together with systematic pre-publication screening of readability, would enhance the accessibility and comprehensibility of written information for people with allergy and their carers.
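The two readability measures used in this study are standard published formulas, and both can be computed from simple word, sentence and syllable counts. A minimal sketch with the standard coefficients (the syllable and sentence counts themselves would come from a tokeniser, which is not shown here):

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Reading Grade: estimated US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def smog_grade(polysyllables, sentences):
    """SMOG grade (McLaughlin's formula).

    polysyllables: count of words with 3+ syllables in a 30-sentence
    sample; for other sample sizes the count is scaled to 30 sentences.
    """
    return 1.0430 * (polysyllables * 30 / sentences) ** 0.5 + 3.1291

# Example: a leaflet passage with 100 words, 8 sentences, 160 syllables.
print(flesch_kincaid_grade(100, 8, 160))   # grade level ~8.2
# 15 polysyllabic words across a 30-sentence sample.
print(smog_grade(15, 30))                  # grade level ~7.2
```

Scores around 13-14 on SMOG, as reported for these leaflets, correspond to college-level reading, well above the usual grade 6-8 recommendation for patient materials.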

    Inter-Rater Agreement Study on Readability Assessment in Bengali

    An inter-rater agreement study is performed for readability assessment in Bengali. A 1-7 rating scale was used to indicate different levels of readability. We obtained moderate to fair agreement among seven independent annotators on 30 text passages written by four eminent Bengali authors. As a by-product of our study, we obtained a readability-annotated ground truth dataset in Bengali. Comment: 6 pages, 4 tables, Accepted in ICCONAC, 201
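The abstract does not name the agreement statistic used; Fleiss' kappa is one common choice for a fixed number of raters assigning categorical (or discretised) ratings, and is shown here purely as an illustration of how such agreement is computed:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for multiple raters.

    ratings: one row per item; each row is a list of counts of how many
    raters chose each category. Every item must have the same total
    number of raters.
    """
    n = sum(ratings[0])                       # raters per item
    N = len(ratings)                          # number of items
    k = len(ratings[0])                       # number of categories

    # Observed agreement: mean pairwise agreement per item.
    p_obs = sum((sum(c * c for c in row) - n) / (n * (n - 1))
                for row in ratings) / N
    # Expected agreement from the marginal category proportions.
    totals = [sum(row[j] for row in ratings) for j in range(k)]
    p_exp = sum((t / (N * n)) ** 2 for t in totals)

    return (p_obs - p_exp) / (1 - p_exp)

# Two items, three raters, two categories; raters agree perfectly.
print(fleiss_kappa([[3, 0], [0, 3]]))   # 1.0
```

Values in roughly the 0.2-0.6 band are conventionally read as "fair" to "moderate" agreement, the range reported in this study.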

    Do Linguistic Style and Readability of Scientific Abstracts affect their Virality?

    Reactions to textual content posted in an online social network show different dynamics depending on the linguistic style and readability of the submitted content. Do similar dynamics exist for responses to scientific articles? Our intuition, supported by previous research, suggests that the success of a scientific article depends on its content, rather than on its linguistic style. In this article, we examine a corpus of scientific abstracts and three forms of associated reactions: article downloads, citations, and bookmarks. Through a class-based psycholinguistic analysis and readability indices tests, we show that certain stylistic and readability features of abstracts clearly concur in determining the success and viral capability of a scientific article. Comment: Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), 4-8 June 2012, Dublin, Ireland

    Controlled language and readability

    Controlled Language (CL) rules specify constraints on lexicon, grammar and style with the objective of improving text translatability, comprehensibility, readability and usability. A significant body of research demonstrates the positive effects CL rules can have on machine translation quality (e.g. Mitamura and Nyberg 1995; Kamprath et al. 1998; Bernth 1999; Nyberg et al. 2003), acceptability (Roturier 2006), and post-editing effort (O’Brien 2006). Since CL rules aim to reduce complexity and ambiguity, claims have been made that they consequently improve the readability of text (e.g., Spaggiari, Beaujard and Cannesson 2003; Reuther 2003). Little work, however, has been done on the effects of CL on readability. This paper represents an attempt to investigate the relationship in an empirical manner, using both qualitative and quantitative methods.

    Readability and understandability of andrology questionnaires

    Objective: Medical questionnaires, which enable collection, comparison and analysis of appropriate data as a means of written communication between a patient and a doctor, must be easily readable and understandable. Here, we measure the readability and understandability of questionnaires used in andrology and examine the relationship between the educational status of the patients and the understandability of the forms. Material and methods: Seven questionnaires used to diagnose andrological diseases were selected from the European Association of Urology guidelines. The number of syllables per word, the number of words in a sentence, and the average word and sentence lengths were calculated for each Turkish validated form. Readability scores were calculated, and cloze tests were used to measure the understandability of the texts. Results: Three hundred and twenty-seven male volunteers participated in the study. Two hundred and sixteen of the participants (66%) had a high school or college education. The readability level of the seven forms was determined to be ''Difficult'' or ''Very Difficult,'' and at least a high school education level was required to understand the forms. As education level and monthly income increased, the understandability of the forms increased; as the readability of the forms became more difficult, their understandability decreased (p<0.001). Conclusion: The readability levels of questionnaires used in andrology are well above the average reading level in Turkey. Health providers can help patients to fill out forms to improve doctor-patient communication.
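The cloze tests mentioned here follow a standard procedure: every nth word of a passage is replaced by a blank, and the reader's ability to restore the missing words is scored. A minimal sketch of the gap-making and exact-word scoring steps (the gap interval of 5 is a common choice, not one stated in the abstract):

```python
def make_cloze(text, every_nth=5):
    """Replace every nth word with a blank; return gapped text and answers."""
    words = text.split()
    answers = []
    for i in range(every_nth - 1, len(words), every_nth):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers

def cloze_score(responses, answers):
    """Exact-word scoring: proportion of gaps filled with the original word."""
    correct = sum(r.strip().lower() == a.strip().lower()
                  for r, a in zip(responses, answers))
    return correct / len(answers)

gapped, answers = make_cloze(
    "one two three four five six seven eight nine ten")
print(gapped)                                  # blanks at words 5 and 10
print(cloze_score(["five", "TEN"], answers))   # 1.0
```

Cloze scores are then compared against conventional thresholds (commonly around 60% for independent comprehension) to judge whether a text is understandable to a given reader group.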

    Readability of PBE reporting

    The standardisation of Public Benefit Entity (PBE) reporting has developed since 1992. Beneficial PBE reporting requires representations of position and performance congruent with the Qualitative Characteristics of the conceptual framework. Non-regulation, optional adoption and sector-neutral standards led to issues of erroneous, complicated and misleading language in past reports. After calls for change, sector-specific regulations and a tier system were introduced to address negative impacts on PBE reporting, catering to different PBE types and users. This study aims to investigate whether current reporting is meeting the expected outcomes of regulation, specifically: has the 2015 adoption of sector-specific standards impacted the readability of New Zealand PBEs' annual reports? Data was collected as a convenience sample of PBE-compliant annual reports and the corresponding sector-neutral reports. These reports were converted, cleaned and measured for readability (by applying Flesch Reading Ease, Flesch-Kincaid Grade Level and passive-sentence measures). The resulting data was analysed with a paired t-test for a significant difference. FRE results indicated 93% of reports were tougher than ‘slightly difficult to read’. Most reports showed a difference of one point or more: 53% of reports improved, while 33% declined after implementing the PBE regulations. This study concludes that sector-specific standards have not resulted in a consistent, statistically significant difference in PBE report readability for any of the measures studied. The use of jargon and the lack of specificity in readability measures are possible limitations of this research. However, for PBEs to deliver efficient annual reports for users, further changes may be needed.
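The two computations at the core of this study's method, the Flesch Reading Ease score and the paired t-test, can be sketched as follows. The FRE coefficients are the standard published ones; the t statistic shown is the textbook paired-samples form (in practice the p-value would be read from a t distribution with n-1 degrees of freedom, which is not implemented here):

```python
from math import sqrt
from statistics import mean, stdev

def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (standard coefficients); higher = easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def paired_t_statistic(before, after):
    """t statistic for paired samples (e.g. pre- vs post-regulation scores)."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Example passage: 100 words, 10 sentences, 130 syllables.
print(flesch_reading_ease(100, 10, 130))   # ~86.7, "easy" on the FRE scale
# Hypothetical FRE scores for three reports before and after the standards.
print(paired_t_statistic([10, 12, 14], [11, 12, 16]))
```

Scores below roughly 50 on the FRE scale correspond to "difficult" text, which is the band the study's 93% "tougher than slightly difficult" finding refers to.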

    Estimating readability with the Strathclyde readability measure

    Despite their significant limitations, readability measures that are easy to apply have definite appeal. With this in mind, we have been exploring the prospects for more insightful measures that are computer-based and, thereby, still easily applied. The orthodox reliance on intrinsic syntactic features is an inherent limitation of most readability measures, since they have no reference to the likelihood that readers will be acquainted with the constituent words and phrases. To accommodate this feature of 'human familiarity', we have devised a metric that combines traditional factors, such as Average Sentence Length, with a measure of word 'commonality' based upon word frequency. This paper details the derivation, nature and application of the Strathclyde Readability Measure (SRM).
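The combination the abstract describes, a traditional factor such as Average Sentence Length plus a frequency-based word-commonality term, might look something like the sketch below. This is a hypothetical illustration only: the SRM's actual formula, weights and frequency list are not given in the abstract, so everything here (the weights, the tiny word list, the sentence-splitting heuristic) is invented for demonstration.

```python
# Hypothetical stand-in for a real high-frequency word list.
COMMON_WORDS = {"the", "a", "of", "and", "to", "in", "is", "was", "it", "for"}

def commonality(words, frequent=COMMON_WORDS):
    """Share of tokens drawn from a high-frequency list (familiarity proxy)."""
    return sum(w.lower() in frequent for w in words) / len(words)

def srm_like_score(text, w_asl=0.5, w_common=50.0):
    """Toy SRM-style score: lower = easier. Weights are invented."""
    words = text.split()
    sentences = max(text.count(".") + text.count("!") + text.count("?"), 1)
    asl = len(words) / sentences               # average sentence length
    # Long sentences raise the score; familiar words lower it.
    return w_asl * asl - w_common * commonality(words)

easy = "The cat is in the bag."
hard = "Electroencephalography confounds nominalisation heuristics."
print(srm_like_score(easy), srm_like_score(hard))
```

The point of the design is simply that two texts with identical sentence lengths can still differ in difficulty if one uses rarer vocabulary, which is the gap the SRM's commonality term is meant to close.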