5 research outputs found

    Using machine learning for automatic identification of evidence-based health information on the web

    Automatic assessment of the quality of online health information is needed, especially given the massive growth of online content. In this paper, we present an approach to assessing the quality of health webpages based on their content rather than on purely technical features, by applying machine learning techniques to the automatic identification of evidence-based health information. Several machine learning approaches were applied to learn classifiers using different combinations of features. Three datasets were used in this study, one for each of three diseases: shingles, flu and migraine. The results obtained using the classifiers were promising in terms of precision and recall, especially for diseases with few different pathogenic mechanisms.
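As an illustration of the two metrics named in this abstract, here is a minimal sketch of how precision and recall could be computed for a binary "evidence-based vs. not" classifier. The function name `precision_recall` and the label encoding (1 for evidence-based) are assumptions made for this example; the abstract does not describe the authors' evaluation code.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for one positive class.

    y_true and y_pred are equal-length sequences of labels.
    The label encoding is an assumption made for this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision answers "of the pages flagged as evidence-based, how many really were?", while recall answers "of the truly evidence-based pages, how many were found?".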

    Automatic identification of information quality metrics in health news stories

    Objective: Many online and printed media publish health news of questionable trustworthiness, and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-related news stories using natural language processing and machine learning. Materials and Methods: We used a database from the website HealthNewsReview.org, which aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria to critically analyze health care interventions' claims. In this work, we attempt to automate the evaluation process by identifying the indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT. Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%, while for others the results were less satisfactory due to the dataset size, the need for external knowledge, or the subjectivity of the evaluation process. Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why it varied between criteria. For some criteria, the obtained results were promising; however, automated evaluation of the other criteria may not yet replace the manual evaluation process, in which human experts interpret text senses and make use of external knowledge in their assessment.
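To make the "simple n-grams" and "F1 measure" in this abstract concrete, the following is a minimal sketch, not the authors' pipeline. The helper names (`word_ngrams`, `ngram_counts`, `f1_score`), the whitespace tokenization and the lowercase normalization are all assumptions chosen for illustration.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text as space-joined strings.

    Tokenization here is plain lowercase whitespace splitting,
    which is an assumption made for this sketch.
    """
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count all n-grams from unigrams up to max_n-grams."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Counts like these could serve as a bag-of-n-grams feature vector for a per-criterion classifier, for example one that detects whether a story mentions costs.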

    Fake news or weak science? Visibility and characterization of anti-vaccine webpages returned by Google in different languages and countries

    The 1998 Lancet paper by Wakefield et al., despite subsequent retraction and evidence indicating no causal link between vaccinations and autism, triggered significant parental concern. The aim of this study was to analyse the online information available on this topic. Using localized versions of Google, we searched “autism vaccine” in English, French, Italian, Portuguese, Mandarin and Arabic and analyzed 200 websites for each search engine result page (SERP). A common feature was the newsworthiness of the topic, with news outlets representing 25-50% of the SERP, followed by unaffiliated websites (blogs, social media) that represented 27-41% and included most of the vaccine-negative websites. Between 12% and 24% of websites had a negative stance on vaccines, while most websites were pro-vaccine (43-70%). However, their ranking by Google varied. While in Google.com the first vaccine-negative website was the 43rd in the SERP, there was one vaccine-negative webpage in the top 10 websites in both the British and Australian localized versions and in French, and two in Italian, Portuguese and Mandarin, suggesting that the information quality algorithm used by Google may work better in English. Many webpages mentioned celebrities in the context of the link between vaccines and autism, most frequently Donald Trump. Few websites (1-5%) promoted complementary and alternative medicine (CAM), but 50-100% of these were also vaccine-negative, suggesting that CAM users are more exposed to vaccine-negative information. This analysis highlights the need for monitoring the web for information impacting on vaccine uptake.

    Corrigendum: fake news or weak science? Visibility and characterization of antivaccine webpages returned by Google in different languages and countries

    In the original article, there was a mistake in Supplementary Material Data Sheet 1 as published. In the tab “Australia” the value of cell M191 should be “0” instead of “1” and that of cell N191 should be “1” instead of “0.” The corrected Supplementary Material Data Sheet 1 has been replaced in the original article. In addition, there was a mistake in Table 6 as published. The percentage of vaccine-negative websites for Australia should be “33.3%” instead of “16.7%.” The corrected Table 6 appears below. The authors apologize for this error and state that this does not change the scientific conclusions of the article in any way. The original article has been updated.