Background: Tools used to appraise the credibility of health information are
time-consuming to apply and require context-specific expertise, limiting their
use for quickly identifying and mitigating the spread of misinformation as it
emerges. Our aim was to estimate the proportion of vaccination-related posts on
Twitter that are likely to be misinformation, and to measure how unevenly
exposure to misinformation was distributed among Twitter users.
Methods: Sampling from 144,878 vaccination-related web pages shared on
Twitter between January 2017 and March 2018, we used a seven-point checklist
adapted from two validated tools to appraise the credibility of a small subset
of 474. These appraised pages were used to train several classifiers (random forest, support
vector machines, and a recurrent neural network with transfer learning), using
the text from a web page to predict whether the information satisfies each of
the seven criteria.
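As a rough illustration of this setup, the sketch below trains one random forest per checklist criterion on TF-IDF features of page text. It is a minimal sketch under stated assumptions, not the authors' pipeline: the use of scikit-learn, the TF-IDF features, the hyperparameters, and the toy pages and labels are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: page text plus seven 0/1 labels,
# one label per checklist criterion (1 = criterion satisfied).
pages = [
    "vaccine study cites peer reviewed sources and lists authors",
    "shocking secret they do not want you to know about vaccines",
    "health agency guidance on the vaccination schedule with references",
    "miracle cure replaces vaccines buy now",
]
labels = np.array([
    [1, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
])

# Assumed feature choice: TF-IDF over page text.
# MultiOutputClassifier fits one independent random forest per criterion.
model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(RandomForestClassifier(n_estimators=50, random_state=0)),
)
model.fit(pages, labels)

# One row per page, one column per criterion.
predicted = model.predict(["new page text about vaccine references"])
```

Treating each criterion as a separate binary target keeps the predictions interpretable against the checklist, at the cost of ignoring correlations between criteria.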
Results: Applying the best performing classifier to the 144,878 web pages, we
found that 14.4\% of relevant posts linking to text-based communications pointed
to webpages of low credibility, and these posts made up 9.2\% of all potential
vaccination-related exposures. However, the 100 most popular links to
misinformation were potentially seen by between 2 million and 80 million
Twitter users, and for a substantial sub-population of Twitter users engaging
with vaccination-related information, links to misinformation appear to
dominate the vaccination-related information to which they were exposed.
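The gap between the share of posts and the share of exposures can be made concrete with a small calculation. The sketch below assumes, hypothetically, that the potential exposure of a post is approximated by the follower count of the account that posted it; the function name and the toy numbers are illustrative only and are not taken from the study.

```python
def misinformation_shares(posts):
    """Return (share of posts, share of potential exposures) that are low credibility.

    posts: iterable of (followers, is_low_credibility) pairs, where followers
    is the assumed proxy for how many users could have seen the post.
    """
    posts = list(posts)
    total_posts = len(posts)
    total_exposure = sum(followers for followers, _ in posts)
    low_posts = sum(1 for _, is_low in posts if is_low)
    low_exposure = sum(followers for followers, is_low in posts if is_low)
    return low_posts / total_posts, low_exposure / total_exposure


# Hypothetical toy data: one low-credibility post from a small account,
# three other posts from larger accounts.
toy_posts = [(100, True), (900, False), (500, False), (500, False)]
post_share, exposure_share = misinformation_shares(toy_posts)
```

In this toy case a quarter of the posts are low credibility but they account for only 5\% of potential exposures, which shows how the two proportions can diverge when the accounts sharing low-credibility links have fewer followers on average.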
Conclusions: We proposed a new method for automatically appraising the
credibility of webpages based on a combination of validated checklist tools.
The results suggest that an automatic credibility appraisal tool can be used to
find populations at higher risk of exposure to misinformation or applied
proactively to add friction to the sharing of low credibility vaccination
information.
Comment: 8 Pages, 5 Figures