Digital Pharmacovigilance: The MedWatcher System for Monitoring Adverse Events Through Automated Processing of Internet Social Media and Crowdsourcing
Thesis (Ph.D.)--Boston University
Half of Americans take a prescription drug, medical devices are in broad use, and population coverage for many vaccines exceeds 90%. Nearly all medical products carry a risk of adverse events (AEs), sometimes severe. However, pre-approval trials use small populations and exclude participants by specific criteria, making them insufficient to determine the risks of a product as used in the general population. Existing post-marketing reporting systems are critical but suffer from underreporting. Meanwhile, recent years have seen an explosion in the adoption of Internet services and smartphones. MedWatcher is a new system that harnesses emerging technologies for pharmacovigilance in the general population. MedWatcher consists of two components: a text-processing module,
MedWatcher Social, and a crowdsourcing module, MedWatcher Personal. With the natural language processing component, we acquire public data from the Internet, apply classification algorithms, and extract AE signals. With the crowdsourcing application, we provide software allowing consumers to submit AE reports directly.
Our MedWatcher Social algorithm for identifying symptoms performs with 77% precision and 88% recall on a sample of Twitter posts. Our machine learning algorithm for identifying AE-related posts performs with 68% precision and 89% recall on a labeled Twitter corpus. For zolpidem tartrate, certolizumab pegol, and dimethyl fumarate, we compared AE profiles from Twitter with reports from the FDA spontaneous reporting system. We find some concordance (Spearman's rho = 0.85, 0.77, and 0.82, respectively, for symptoms at the MedDRA System Organ Class level). Where the sources differ, milder effects are overrepresented in Twitter. We also compared post-marketing profiles with trial results and found little concordance.
MedWatcher Personal saw substantial user adoption, receiving 550 AE reports in a one-year period, including over 400 for one device, Essure. We categorized 400 Essure reports by symptom, compared them to 129 reports from the FDA spontaneous reporting system, and found high concordance (rho = 0.65) at MedDRA Preferred Term granularity. We also compared Essure Twitter posts with MedWatcher and FDA reports, and found rho = 0.25 and 0.31, respectively.
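The concordance measure used in both comparisons above, Spearman's rank correlation between symptom frequency profiles from two sources, can be sketched as follows. This is a minimal dependency-free illustration, not the thesis's code; the symptom counts are invented, standing in for per-category AE counts from two reporting channels.

```python
# Sketch: Spearman's rho between two adverse-event symptom profiles
# (e.g. Twitter counts vs. FDA spontaneous-report counts per category).
# All counts below are invented for illustration.

def rank(values):
    """Assign 1-based average ranks to values (ties share the mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented symptom counts per category, one vector per source:
twitter_counts = [120, 85, 60, 33, 10, 4]
fda_counts     = [300, 90, 200, 80, 40, 5]
rho = spearman_rho(twitter_counts, fda_counts)
```

Because only the rank ordering of symptom frequencies matters, this comparison is robust to the large absolute differences in reporting volume between social media and spontaneous-report systems.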
MedWatcher represents a novel pharmacoepidemiology surveillance informatics system; our analysis is the first to compare AEs across social media, direct reporting, FDA spontaneous reports, and pre-approval trials.
Information Retrieval of Opioid Dependence Medications Reviews from Health-Related Social Media
Social media provides a convenient platform for patients to share their drug usage experiences with others; consequently, health researchers can leverage these data to gain valuable information about users' drug satisfaction. Since the 1990s, opioid drug abuse has become a national crisis. To reduce opioid dependency, several drugs have been brought to market, but little is known about patient satisfaction with these treatments. Sentiment analysis is a method to measure and interpret patients' satisfaction. In the first phase of this study, we aimed to use social media posts to predict patients' sentiment toward opioid dependency treatment. We focused on Suboxone, a well-known opioid dependence medication, as our targeted treatment, and Drugs.com, an online healthcare forum, as our data source. For our analysis, we first collected 1,532 posts to create a training dataset, split the posts into sentences, and annotated 1,100 sentences for sentiment analysis. To predict patients' sentiment, we extracted features from patients' posts, including bigrams, trigrams, and features derived from topic modeling. To develop the prediction model, we used two machine learning methods, Naïve Bayes and SVM. The SVM achieved the best performance, with an accuracy of 61%. In the second phase of this study, we aimed to understand the behavior of patients toward the targeted medication. To accomplish this goal, we used the Health Belief Model (HBM), a social psychological model that describes and predicts patients' health-related attitudes in terms of action, benefit, barrier, and threat categories, to predict such behavior from patients' reviews. We used the same combinations of features and machine learning methods as in the first phase; the best accuracy was 47% for the SVM classifier, compared to a 43% baseline.
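The first-phase setup described above, sentence-level sentiment classification over word n-gram features, can be sketched as follows. The study used Naïve Bayes and SVM classifiers; here a tiny multinomial Naïve Bayes is implemented from scratch to keep the sketch dependency-free, and all sentences and labels are invented rather than taken from the Drugs.com corpus.

```python
# Sketch: multinomial Naive Bayes over unigram/bigram/trigram features,
# mirroring the feature set named in the abstract. Toy data only.
import math
from collections import Counter, defaultdict

def ngrams(text, n_max=3):
    """Extract word n-grams for n = 1..n_max."""
    toks = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return feats

def train_nb(examples):
    """examples: list of (sentence, label) pairs. Returns model parameters."""
    counts = defaultdict(Counter)   # label -> n-gram counts
    priors = Counter()              # label -> sentence count
    vocab = set()
    for text, label in examples:
        priors[label] += 1
        for f in ngrams(text):
            counts[label][f] += 1
            vocab.add(f)
    return counts, priors, vocab

def predict(model, text):
    """Pick the label maximizing log P(label) + sum of log P(feature|label)."""
    counts, priors, vocab = model
    total = sum(priors.values())
    best, best_lp = None, float("-inf")
    for label in priors:
        lp = math.log(priors[label] / total)
        denom = sum(counts[label].values()) + len(vocab)  # Laplace smoothing
        for f in ngrams(text):
            lp += math.log((counts[label][f] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

train = [
    ("suboxone really helped me stay clean", "positive"),
    ("this medication saved my life", "positive"),
    ("side effects were terrible and made me sick", "negative"),
    ("withdrawal from this drug was awful", "negative"),
]
model = train_nb(train)
```

In practice one would use a standard toolkit (e.g. an SVM with the same n-gram features), but the scoring logic, class priors plus smoothed per-feature likelihoods, is the same idea.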
A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration
The Semantic Web and Linked Data movements, which aim to create, publish and interconnect machine-readable information, have gained traction in recent years.
However, the majority of information is still contained in and exchanged via unstructured documents, such as Web pages, text documents, images and videos.
Nor can this be expected to change, since text, images and videos are the natural way in which humans interact with information.
Semantic structuring of content on the other hand provides a wide range of advantages compared to unstructured information.
Semantically-enriched documents facilitate information search and retrieval, presentation, integration, reusability, interoperability and personalization.
Looking at the life-cycle of semantic content on the Web of Data, we see considerable progress on the backend side in storing structured content and in linking data and schemata.
Nevertheless, from our point of view, the least developed aspect of the semantic content life-cycle is the user-friendly manual and semi-automatic creation of rich semantic content.
In this thesis, we propose a semantics-based user interface model, which aims to reduce the complexity of underlying technologies for semantic enrichment of content by Web users.
By surveying existing tools and approaches for semantic content authoring, we extracted a set of guidelines for designing efficient and effective semantic authoring user interfaces.
We applied these guidelines to devise a semantics-based user interface model called WYSIWYM (What You See Is What You Mean) which enables integrated authoring, visualization and exploration of unstructured and (semi-)structured content.
To assess the applicability of our proposed WYSIWYM model, we incorporated the model into four real-world use cases comprising two general and two domain-specific applications.
These use cases address four aspects of the WYSIWYM implementation:
1) Its integration into existing user interfaces,
2) Utilizing it for lightweight text analytics to incentivize users,
3) Dealing with crowdsourcing of semi-structured e-learning content,
4) Incorporating it for the authoring of semantic medical prescriptions.
Health Misinformation in Search and Social Media
People increasingly rely on the Internet in order to search for and share health-related information. Indeed, searching for and sharing information about medical treatments are among the most frequent uses of online data. While this is a convenient and fast method to collect information, online sources may contain incorrect information that has the potential to cause harm, especially if people believe what they read without further research or professional medical advice.
The goal of this thesis is to address the misinformation problem in two of the most commonly used online services: search engines and social media platforms. We examined how people use these platforms to search for and share health information. To achieve this, we designed controlled laboratory user studies and employed large-scale social media data analysis tools. The solutions proposed in this thesis can be used to build systems that better support people's health-related decisions.
The techniques described in this thesis addressed online searching and social media sharing in the following manner. First, with respect to search engines, we aimed to determine the extent to which people can be influenced by search engine results when trying to learn about the efficacy of various medical treatments. We conducted a controlled laboratory study wherein we biased the search results towards either correct or incorrect information. We then asked participants to determine the efficacy of different medical treatments. Results showed that people were significantly influenced both positively and negatively by search results bias. More importantly, when the subjects were exposed to incorrect information, they made more incorrect decisions than when they had no interaction with the search results.
Following from this work, we extended the study to gain insights into strategies people use during this decision-making process, via the think-aloud method. We found that, even with verbalization, people were strongly influenced by the search results bias. We also noted that people paid attention to what the majority states, authoritativeness, and content quality when evaluating online content. Understanding the effects of cognitive biases that can arise during online search is a complex undertaking because of the presence of unconscious biases (such as the search results ranking) that the think-aloud method fails to show.
Moving to social media, we first proposed a solution to detect and track misinformation in social media. Using Zika as a case study, we developed a tool for tracking misinformation on Twitter. We collected 13 million tweets regarding the Zika outbreak and tracked rumors outlined by the World Health Organization and the Snopes fact-checking website. We incorporated health professionals, crowdsourcing, and machine learning to capture health-related rumors as well as clarification communications. In this way, we illustrated insights that the proposed tools provide into potentially harmful information on social media, allowing public health researchers and practitioners to respond with targeted and timely action.
From identifying rumor-bearing tweets, we moved to examining individuals on social media who post questionable health-related information, in particular those promoting cancer treatments that have been shown to be ineffective. Specifically, we studied 4,212 Twitter users who had posted about one of 139 ineffective "treatments" and compared them to a baseline of users generally interested in cancer. Considering features that capture user attributes, writing style, and sentiment, we built a classifier able to identify users prone to propagating such misinformation. This classifier achieved an accuracy of over 90%, providing a potential tool for public health officials to identify such individuals for preventive intervention.
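The three feature families named above (user attributes, writing style, sentiment) could be assembled into a classifier input roughly as follows. This is a hypothetical sketch: the field names, the tiny sentiment lexicon, and the example data are invented for illustration and are not the study's actual features.

```python
# Sketch: turning a user profile plus their tweets into a fixed set of
# features of the three kinds named in the abstract. Invented names/data.

POSITIVE = {"cure", "miracle", "natural", "healing"}   # toy lexicon
NEGATIVE = {"scam", "toxic", "dangerous", "fake"}      # toy lexicon

def user_features(profile, tweets):
    """profile: dict of account attributes; tweets: list of tweet texts."""
    words = [w.lower().strip(".,!?") for t in tweets for w in t.split()]
    n = max(len(words), 1)
    return {
        # user attributes
        "followers": profile.get("followers", 0),
        "tweets_per_day": profile.get("statuses", 0)
                          / max(profile.get("account_age_days", 1), 1),
        # writing style
        "avg_word_len": sum(len(w) for w in words) / n,
        "exclamations": sum(t.count("!") for t in tweets) / max(len(tweets), 1),
        # lexicon-based sentiment
        "pos_ratio": sum(w in POSITIVE for w in words) / n,
        "neg_ratio": sum(w in NEGATIVE for w in words) / n,
    }

profile = {"followers": 150, "statuses": 3000, "account_age_days": 300}
tweets = ["This natural cure works!", "Doctors hide the miracle healing."]
feats = user_features(profile, tweets)
```

Vectors like these would then be fed to any standard classifier; the study's reported >90% accuracy comes from its own, richer feature set.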
Distilling the Outcomes of Personal Experiences: A Propensity-scored Analysis of Social Media
Millions of people regularly report the details of their real-world experiences on social media. This provides an opportunity to observe the outcomes of common and critical situations. Identifying and quantifying these outcomes may provide better decision-support and goal-achievement for individuals, and help policy-makers and scientists better understand important societal phenomena. We address several open questions about using social media data for open-domain outcome identification: Are the words people are more likely to use after some experience relevant to this experience? How well do these words cover the breadth of outcomes likely to occur for an experience? What kinds of outcomes are discovered? Studying three months of Twitter data capturing people who experienced 39 distinct situations across a variety of domains, we find that the discovered outcomes are generally relevant (55-100% on average) and that causally related concepts are more likely to be discovered than conceptually or semantically related ones.
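The core idea above, finding words a population is disproportionately likely to use after an experience, can be sketched by comparing smoothed word rates in before-experience and after-experience tweets. The tweets below are invented, and the paper's full method additionally propensity-matches users, which this sketch omits.

```python
# Sketch: rank words by how much more frequent they are in "after" tweets
# than in "before" tweets, with add-one smoothing. Toy data only.
from collections import Counter

def outcome_words(before, after, smoothing=1.0):
    """Return words sorted by P(word | after) / P(word | before), descending."""
    cb = Counter(w for t in before for w in t.lower().split())
    ca = Counter(w for t in after for w in t.lower().split())
    nb, na = sum(cb.values()), sum(ca.values())
    vocab = set(cb) | set(ca)
    ratio = {
        w: ((ca[w] + smoothing) / (na + smoothing * len(vocab)))
           / ((cb[w] + smoothing) / (nb + smoothing * len(vocab)))
        for w in vocab
    }
    return sorted(ratio, key=ratio.get, reverse=True)

before = ["looking for a new apartment", "apartment hunting all day"]
after  = ["signed the lease today", "moving boxes everywhere",
          "new lease feels great"]
top = outcome_words(before, after)  # words most characteristic of "after"
```

Words that appear only after the experience ("lease") surface at the top, while words tied to the pre-experience state ("apartment") sink to the bottom, which is the behavior an outcome-discovery method needs.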
TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations.
BACKGROUND: Work on pharmacovigilance systems using texts from PubMed and Twitter typically targets different elements and uses different annotation guidelines, resulting in a scenario where there is no comparable set of documents from both Twitter and PubMed annotated in the same manner. OBJECTIVE: This study aimed to provide a comparable corpus of texts from PubMed and Twitter that can be used to study drug reports from these two sources of information, allowing researchers in the area of pharmacovigilance using natural language processing (NLP) to perform experiments to better understand the similarities and differences between drug reports in Twitter and PubMed. METHODS: We produced a corpus comprising 1000 tweets and 1000 PubMed sentences selected using the same strategy and annotated at entity level by the same experts (pharmacists) using the same set of guidelines. RESULTS: The resulting corpus, annotated by two pharmacists, comprises semantically correct annotations for a set of drugs, diseases, and symptoms. This corpus contains the annotations for 3144 entities, 2749 relations, and 5003 attributes. CONCLUSIONS: We present a corpus that is unique in its characteristics, as this is the first corpus for pharmacovigilance curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest for researchers who wish to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) when using data from Twitter and from PubMed. We hope that, given the comprehensive set of drug names and the annotated entities and relations, this corpus becomes a standard resource for comparing results from different pharmacovigilance studies in the area of NLP. This research project was supported by a grant from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT).