
    Readability of Wikipedia Pages on Autoimmune Disorders: Systematic Quantitative Assessment

    Background: In the era of new information and communication technologies, the Internet is being increasingly accessed for health-related information. Indeed, recently published surveys of patients with autoimmune disorders confirmed that the Internet is reported as one of the most important sources of health information. Wikipedia, a free online encyclopedia launched in 2001, is consistently one of the most visited websites worldwide and is often consulted for health-related information. Objective: The main objective of this investigation was to quantitatively assess whether the Wikipedia pages related to autoimmune disorders can be easily accessed by patients and their families, in terms of readability. Methods: We obtained and downloaded a list of autoimmune disorders from the American Autoimmune Related Diseases Association (AARDA) website. We analyzed the Wikipedia articles for their overall level of readability with 6 different quantitative readability scales: (1) the Flesch Reading Ease, (2) the Gunning Fog Index, (3) the Coleman-Liau Index, (4) the Flesch-Kincaid Grade Level, (5) the Automated Readability Index (ARI), and (6) the Simple Measure of Gobbledygook (SMOG). Further, we investigated the correlation between readability and clinical, pathological, and epidemiological parameters. Moreover, each Wikipedia article was assessed according to its content, breaking down the readability indices by the main topic of each part (namely, pathogenesis, treatment, diagnosis, and prognosis, plus a section containing paragraphs not falling into any of the previous categories). Results: We retrieved 134 diseases from the AARDA website. The Flesch Reading Ease yielded a mean score of 24.34 (SD 10.73), indicating that the pages were very difficult to read and best understood by university graduates, while the mean Gunning Fog Index and ARI scores were 16.87 (SD 2.03) and 14.06 (SD 2.12), respectively. The Coleman-Liau Index and the Flesch-Kincaid Grade Level yielded mean scores of 14.48 (SD 1.57) and 14.86 (SD 1.95), respectively, while the mean SMOG score was 15.38 (SD 1.37). All the readability indices confirmed that the pages were suited to a university graduate reading level. We found no correlation between readability and clinical, pathological, and epidemiological parameters. Differences among the different sections of the Wikipedia pages were statistically significant. Conclusions: Wikipedia pages related to autoimmune disorders are characterized by a low level of readability. The onus is, therefore, on physicians and health authorities to improve the health literacy skills of patients and their families and to create, together with patients themselves, disease-specific readable sites, disseminating health-related online information that is highly accessible in terms of both clarity and conciseness.
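
    As a concrete illustration of how such scores are produced, the following is a minimal Python sketch of two of the six metrics used in the study, the Flesch Reading Ease and the Flesch-Kincaid Grade Level. The syllable counter is a crude vowel-group heuristic rather than the dictionary-based counting used by dedicated readability tools, so the numbers are approximate.

```python
import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_statistics(text: str):
    """Return (sentence count, word count, syllable count) for a passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    return len(sentences), len(words), sum(count_syllables(w) for w in words)

def flesch_reading_ease(text: str) -> float:
    # 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)
    s, w, syl = text_statistics(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)

def flesch_kincaid_grade(text: str) -> float:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    s, w, syl = text_statistics(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59

if __name__ == "__main__":
    sample = ("Autoimmune disorders arise when the immune system "
              "mistakenly attacks the body's own tissues.")
    print(round(flesch_reading_ease(sample), 2))
    print(round(flesch_kincaid_grade(sample), 2))
```

    On these scales, lower Flesch Reading Ease values and higher grade-level values both indicate harder text, so the means reported above (roughly 24 and 14-15) point in the same direction: university-level material.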

    The Use of Chinese-language Internet Information about Cancer by Chinese Health Consumers

    We investigated the use of Chinese-language Internet information about cancer by Chinese health consumers and its impact on their cancer care. We applied a grounded theory approach and undertook semi-structured interviews with 20 participants in China to learn about their experiences of using the Internet for cancer information as patients or family members. Thematic analysis of the interview data identified three key themes: (1) information needs evolve during the treatment journey; (2) Traditional Chinese Medicine (TCM) and adverse effects of treatment are the topics of greatest interest; and (3) most participants had encountered Internet health information of questionable quality. These findings suggest that although the Internet has great potential to empower Chinese cancer patients and their families throughout the cancer care journey, information quality issues, cultural considerations, and the current health care paradigm constrain this potential. Further research is needed to address these issues and improve cancer care in China.

    New Zealand information on the Internet: the Power to Find the Knowledge

    In a world of apparently ubiquitous information, does knowledge still equal power? Whatever the answer to this question, we will not have power unless we can retrieve our knowledge. Despite the advances of recent decades, issues remain in finding information on the Web relating to Aotearoa. These include the efficiency with which the global search engines index the NZ web space, searching for macronised words, the quality of Wikipedia information about NZ, and the availability of open access NZ research.
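
    As a small illustration of the macronised-word issue, a search index that does not fold diacritics treats "Māori" and "Maori" as different terms. The Python sketch below shows one standard way to fold macrons using Unicode decomposition; it says nothing about how any particular search engine actually normalises text.

```python
import unicodedata

def fold_macrons(text: str) -> str:
    """Strip combining marks (including macrons) after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# "Māori" and "Maori" fold to the same form, so either query can match both.
assert fold_macrons("Māori") == "Maori"
assert fold_macrons("Ngāi Tahu") == "Ngai Tahu"
print(fold_macrons("kōrero about Aotearoa"))  # -> "korero about Aotearoa"
```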

    Wikipedia and Large Language Models: Perfect Pairing or Perfect Storm?

    This is the submitted manuscript; the article is due to be published in 2023. Purpose: The purpose of this paper is to explore the potential benefits and challenges of using large language models (LLMs) like ChatGPT to edit Wikipedia. Approach: The first portion of the paper provides background about Wikipedia and LLMs, briefly explaining how each works. The paper's second section then explores both the ways that LLMs can be used to make Wikipedia a stronger site and the challenges that these technologies pose to Wikipedia editors. The paper's final section explores the implications for information professionals. Findings: The paper argues that LLMs can be used to proofread Wikipedia articles, outline potential articles, and generate usable Wikitext. The pitfalls include the technology's potential to generate text that is plagiarized or violates copyright, its tendency to produce "original research," and its tendency to generate incorrect or biased information. Originality: While there has been limited discussion among Wikipedia editors about the use of LLMs when editing the site, hardly any scholarship has examined how these models can affect Wikipedia's development and quality. This paper thus aims to fill this gap by examining both the potential benefits and pitfalls of using LLMs on Wikipedia.

    Readability of web content: An analysis by topic

    Readability is determined by the characteristics of a text that influence how easily it can be understood. The web is composed of content on various topics, and the results retrieved in the top positions by the main search engines are expected to be those with the highest number of views. In this study, we analyzed the readability of web pages according to the topic to which they belong and their position in the search results. To that end, we collected the top-20 results retrieved by Google for 23,779 queries from 20 topics and applied several readability metrics. The analysis showed that content from organizations (such as colleges and other institutions) and health-related content has lower readability values, while the Games and Home categories are at the opposite end. For the categories identified as having lower readability, tools can be developed that help users understand the content. We also found that top-ranked pages have higher readability values. One can conclude that, directly or indirectly, readability is a factor that seems to be considered by the Google search engine or that influences page popularity.
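
    The core of this analysis is an aggregation of per-page readability scores by topic and by rank position. A minimal pandas sketch of that aggregation is shown below; the column names (topic, rank, flesch) and the values are illustrative, not data from the study.

```python
import pandas as pd

# Illustrative results table: one row per page retrieved for a query.
pages = pd.DataFrame({
    "topic":  ["Health", "Health", "Games", "Games", "Home", "Home"],
    "rank":   [1, 20, 1, 20, 1, 20],
    "flesch": [38.0, 31.5, 72.4, 68.9, 70.1, 66.3],
})

# Mean readability per topic: lower Flesch scores indicate harder text.
print(pages.groupby("topic")["flesch"].mean().sort_values())

# Mean readability per rank position: do top-ranked pages read more easily?
print(pages.groupby("rank")["flesch"].mean())
```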

    The accuracy and completeness of drug information in Google snippet blocks

    Introduction: Consumers commonly use the Internet for immediate drug information. In 2014, Google introduced the snippet block, which programmatically searches available websites to answer a question entered into the search engine without the need for the user to visit any website. This study compared the accuracy and completeness of drug information found in Google snippet blocks with US Food and Drug Administration (FDA) medication guides. Methods: Ten outpatient drugs were selected from the 2018 Clinical Drugstats Database Medical Expenditure Panel Survey. Six questions from the medication guide for each drug were entered into the Google search engine to find the snippet block. The accuracy and completeness of drug information in the Google snippet block were quantified by two different pharmacists using a scoring system from 1 (less than 25% accurate/complete information) to 5 (100% accurate/complete information). Descriptive statistics were used to summarize the scores. Results: For five out of the six questions, the information in the Google snippets had less than 50% accuracy and completeness compared to the medication guides. The average accuracy and completeness scores of the Google snippets were highest for "What are the ingredients of [the drug]?", with scores of 3.38 (51–75%) and 3.00 (51–75%), respectively. The question "How to take [the drug]?" had the lowest scores, with averages of 1.00 (<25%) for both accuracy and completeness. Conclusion: Google snippets provide inaccurate and incomplete drug information compared with FDA-approved medication guides. This may cause patient harm; therefore, it is imperative for health care and health information professionals to provide reliable drug resources to patients and consumers when written information is needed.

    Situating Wikipedia as a health information resource in various contexts: A scoping review

    Background: Wikipedia's health content is the most frequently visited resource for health information on the internet. While the literature provides strong evidence for its high usage, a comprehensive literature review of Wikipedia's role within the health context has not yet been reported. Objective: To conduct a comprehensive review of peer-reviewed, published literature to learn what the existing body of literature says about Wikipedia as a health information resource and what publication trends exist, if any. Methods: A comprehensive literature search in OVID Medline, OVID Embase, CINAHL, LISTA, Wilson's Web, AMED, and Web of Science was performed. Through a two-stage screening process, records were excluded if: Wikipedia was not a major or exclusive focus of the article; Wikipedia was not discussed within the context of a health or medical topic; the article was not available in English; the article was irretrievable; or the article was a letter, commentary, editorial, or popular media article. Results: 89 articles and conference proceedings were selected for inclusion in the review. Four categories of literature emerged: (1) studies that situate Wikipedia as a health information resource; (2) investigations into the quality of Wikipedia; (3) explorations of the utility of Wikipedia in education; and (4) studies that demonstrate the utility of Wikipedia in research. Conclusion: The literature positions Wikipedia as a prominent health information resource in various contexts for the public, patients, students, and practitioners seeking health information online. Wikipedia's health content is accessed frequently, and its pages regularly rank highly in Google search results. While Wikipedia itself is well into its second decade, the academic discourse around Wikipedia within the context of health is still young, and the academic literature is limited when attempts are made to understand Wikipedia as a health information resource. Possibilities for future research are discussed.

    Assessing the Readability of Medical Documents: A Ranking Approach

    BACKGROUND: The use of electronic health record (EHR) systems with patient engagement capabilities, including viewing, downloading, and transmitting health information, has recently grown tremendously. However, using these resources to engage patients in managing their own health remains challenging due to the complex and technical nature of the EHR narratives. OBJECTIVE: Our objective was to develop a machine learning-based system to assess readability levels of complex documents such as EHR notes. METHODS: We collected difficulty ratings of EHR notes and Wikipedia articles using crowdsourcing from 90 readers. We built a supervised model to assess readability based on relative orders of text difficulty using both surface text features and word embeddings. We evaluated system performance using the Kendall coefficient of concordance against human ratings. RESULTS: Our system achieved significantly higher concordance (.734) with human annotators than did a baseline using the Flesch-Kincaid Grade Level, a widely adopted readability formula (.531). The improvement was also consistent across different disease topics. This method's concordance with an individual human user's ratings was also higher than the concordance between different human annotators (.658). CONCLUSIONS: We explored methods to automatically assess the readability levels of clinical narratives. Our ranking-based system using simple textual features and easy-to-learn word embeddings outperformed a widely used readability formula. Our ranking-based method can predict relative difficulties of medical documents. It is not constrained to a predefined set of readability levels, a common design in many machine learning-based systems. Furthermore, the feature set does not rely on complex processing of the documents. One potential application of our readability ranking is personalization, allowing patients to better accommodate their own background knowledge.
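
    The authors' system is not reproduced here, but the core idea of learning readability from relative difficulty judgments can be sketched with a standard pairwise-ranking reduction: represent each document with simple surface features, form feature differences for pairs with a known "harder than" label, and fit a binary classifier on those differences. The features, classifier, and training pairs below are illustrative choices, not those used in the study.

```python
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def surface_features(text: str) -> np.ndarray:
    """Mean sentence length, mean word length, and fraction of long words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    return np.array([
        len(words) / max(1, len(sentences)),
        sum(len(w) for w in words) / max(1, len(words)),
        sum(len(w) > 6 for w in words) / max(1, len(words)),
    ])

# Illustrative training pairs: (easier text, harder text) as judged by readers.
pairs = [
    ("The doctor checked your heart. It sounded fine.",
     "Auscultation revealed no murmurs, rubs, or gallops on cardiac examination."),
    ("Take one pill every morning with food.",
     "Administer 10 mg orally once daily, concomitantly with a meal."),
]

# Pairwise reduction: label 1 when the feature difference points to harder text.
X, y = [], []
for easy, hard in pairs:
    diff = surface_features(hard) - surface_features(easy)
    X.extend([diff, -diff])
    y.extend([1, 0])

ranker = LogisticRegression().fit(np.array(X), np.array(y))

# The fitted weights give a relative difficulty score for ranking new documents.
new_doc = "Hypertension management requires consistent medication adherence."
score = ranker.decision_function(surface_features(new_doc).reshape(1, -1))
print(float(score[0]))
```

    Agreement between such predicted orderings and human judgments could then be checked with a rank-correlation statistic such as Kendall's tau (scipy.stats.kendalltau); the study itself used the Kendall coefficient of concordance across multiple raters.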
