
    Utilizing Consumer Health Posts for Pharmacovigilance: Identifying Underlying Factors Associated with Patients’ Attitudes Towards Antidepressants

    Non-adherence to antidepressants is a major obstacle to antidepressants’ therapeutic benefits, resulting in increased risk of relapse, emergency visits, and significant burden on individuals and the healthcare system. Several studies showed that non-adherence is weakly associated with personal and clinical variables, but strongly associated with patients’ beliefs and attitudes towards medications. The traditional methods for identifying the key dimensions of patients’ attitudes towards antidepressants are associated with some methodological limitations, such as concern about confidentiality of personal information. This study attempts to address these limitations by utilizing patients’ self-reported experiences in online healthcare forums to identify underlying factors affecting patients’ attitudes towards antidepressants. The data source of the study was a healthcare forum called “askapatients.com”. A total of 892 patient reviews were randomly collected from the forum for the four most commonly prescribed antidepressants, including Sertraline (Zoloft) and Escitalopram (Lexapro) from the SSRI class, and Venlafaxine (Effexor) and Duloxetine (Cymbalta) from the SNRI class. The methodology of this study is composed of two main phases: I) generating structured data from unstructured patients’ drug reviews and testing hypotheses concerning attitude, and II) identifying and normalizing Adverse Drug Reactions (ADRs), Withdrawal Symptoms (WDs) and Drug Indications (DIs) from the posts, and mapping them to both UMLS and SNOMED CT concepts. Phase II also includes testing the association between ADRs and attitude. The results of the first phase of this study showed that “experience of adverse drug reactions”, “perceived distress received from ADRs”, “lack of knowledge about medication’s mechanism”, “withdrawal experience”, “duration of usage”, and “drug effectiveness” are strongly associated with patients’ attitudes. However, demographic variables including “age” and “gender” are not associated with attitude. Analysis of the data in the second phase of the study showed that of the 6,534 identified entities, 73% are ADRs, 12% are WDs, and 15% are drug indications. In addition, psychological and cognitive expressions have higher variability than physiological expressions. All three types of entities were mapped to 811 UMLS and SNOMED CT concepts. Testing the association between ADRs and attitude showed that, of the twenty-one physiological ADRs specified in the ASEC questionnaire, “dry mouth”, “increased appetite”, “disorientation”, “yawning”, “weight gain”, and “problem with sexual dysfunction” are associated with attitude. A set of psychological and cognitive ADRs, such as “emotional indifference” and “memory problem”, were also tested and showed a significant association with attitude. The findings of this study have important implications for designing clinical interventions aiming to improve patients' adherence to antidepressants. In addition, the dataset generated in this study has significant implications for improving the performance of text-mining algorithms aiming to identify health-related information from consumer health posts. Moreover, the dataset can be used for generating and testing hypotheses related to ADRs associated with psychiatric medications, and for identifying factors associated with discontinuation of antidepressants. The dataset and guidelines of this study are available at https://sites.google.com/view/pharmacovigilanceinpsychiatry/hom

    Opinion Mining and Sentiment Analysis of Online Drug Reviews as a Pharmacovigilance Technique

    Pharmacovigilance is the science that focuses on identification and characterization of adverse effects of medications in populations when released to market. The focus of this paper is to study the prospects of exploiting drug related online reviews contributed by social media groups for finding the adverse effects of drugs using opinion mining and sentiment analysis. The experiences and opinions related to drug adverse reactions by patients or other contributors in these forums can be mined and analyzed as a facilitator for pharmacovigilance. This review paper highlights the usability of opinion mining and sentiment analysis as one of the approaches for pharmacovigilance. DOI: 10.17762/ijritcc2321-8169.150711

    Predicting Medication Prescription Rankings with Medication Relation Network

    Predicting medication prescription rankings and demand could benefit both medication consumers and pharmaceutical companies in various ways. Our study predicts medication prescription rankings by focusing on patients’ medication switch and combination behavior, a novel kind of medication knowledge that can be learned from unstructured patient-generated content. We first construct two supervised machine learning systems for identifying medication references and classifying medication relations in unstructured patient reviews. We then map the medication switch and combination relations into directed and undirected networks, respectively. An adjusted transition in and out (ATIO) system is proposed for predicting medication prescription rankings. The proposed system demonstrates the highest positive correlation with actual medication prescription amounts compared to other network-based measures. To predict changes in prescription demand, we compare four predictive regression models. The model incorporating the network-based measure from the ATIO system achieves the lowest mean squared errors.

    Digital Pharmacovigilance: the medwatcher system for monitoring adverse events through automated processing of internet social media and crowdsourcing

    Thesis (Ph.D.)--Boston University. Half of Americans take a prescription drug, medical devices are in broad use, and population coverage for many vaccines is over 90%. Nearly all medical products carry risk of adverse events (AEs), sometimes severe. However, pre-approval trials use small populations and exclude participants by specific criteria, making them insufficient to determine the risks of a product as used in the population. Existing post-marketing reporting systems are critical, but suffer from underreporting. Meanwhile, recent years have seen an explosion in adoption of Internet services and smartphones. MedWatcher is a new system that harnesses emerging technologies for pharmacovigilance in the general population. MedWatcher consists of two components: a text-processing module, MedWatcher Social, and a crowdsourcing module, MedWatcher Personal. With the natural language processing component, we acquire public data from the Internet, apply classification algorithms, and extract AE signals. With the crowdsourcing application, we provide software allowing consumers to submit AE reports directly. Our MedWatcher Social algorithm for identifying symptoms performs with 77% precision and 88% recall on a sample of Twitter posts. Our machine learning algorithm for identifying AE-related posts performs with 68% precision and 89% recall on a labeled Twitter corpus. For zolpidem tartrate, certolizumab pegol, and dimethyl fumarate, we compared AE profiles from Twitter with reports from the FDA spontaneous reporting system. We find some concordance (Spearman's rho = 0.85, 0.77, 0.82, respectively, for symptoms at MedDRA System Organ Class level). Where the sources differ, milder effects are overrepresented in Twitter. We also compared post-marketing profiles with trial results and found little concordance. MedWatcher Personal saw substantial user adoption, receiving 550 AE reports in a one-year period, including over 400 for one device, Essure. We categorized 400 Essure reports by symptom, compared them to 129 reports from the FDA spontaneous reporting system, and found high concordance (rho = 0.65) using MedDRA Preferred Term granularity. We also compared Essure Twitter posts with MedWatcher and FDA reports, and found rho = 0.25 and 0.31, respectively. MedWatcher represents a novel pharmacoepidemiology surveillance informatics system; our analysis is the first to compare AEs across social media, direct reporting, FDA spontaneous reports, and pre-approval trials.
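
    The concordance figures in this abstract are Spearman rank correlations between adverse-event profiles from two sources. As a rough illustration only, the sketch below computes Spearman's rho between two dictionaries of per-category report counts. The function names and the toy category labels are hypothetical and are not taken from the thesis; treating a category missing from one source as a count of zero is also an assumption.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; raises if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one profile is constant")
    return cov / (vx * vy) ** 0.5


def spearman_rho(counts_a: Dict[str, int], counts_b: Dict[str, int]) -> float:
    """Spearman's rho over the union of categories (missing counts as 0)."""
    categories = sorted(set(counts_a) | set(counts_b))
    xa = [float(counts_a.get(c, 0)) for c in categories]
    xb = [float(counts_b.get(c, 0)) for c in categories]
    return pearson(average_ranks(xa), average_ranks(xb))


if __name__ == "__main__":
    # Toy counts, invented for illustration only.
    social = {"nervous": 40, "gastrointestinal": 25, "skin": 5, "cardiac": 2}
    spontaneous = {"nervous": 30, "gastrointestinal": 35, "skin": 4, "cardiac": 9}
    print(round(spearman_rho(social, spontaneous), 3))
```

    Because rho depends only on ranks, two sources with very different report volumes can still be compared directly, which suits a comparison between a large social media corpus and a smaller regulatory dataset.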

    Use of Text Data in Identifying and Prioritizing Potential Drug Repositioning Candidates

    New drug development costs between 500 million and 2 billion dollars and takes 10-15 years, with a success rate of less than 10%. Drug repurposing (defined as discovering new indications for existing drugs) could play a significant role in drug development, especially considering the declining success rates of developing novel drugs. In the period 2007-2009, drug repurposing led to the launching of 30-40% of new drugs. Typically, new indications for existing medications are identified by accident. However, new technologies and a large number of available resources enable the development of systematic approaches to identify and validate drug-repurposing candidates with significantly lower cost. A variety of resources have been utilized to identify novel drug repurposing candidates such as biomedical literature, clinical notes, and genetic data. In this dissertation, we focused on using text data in identifying and prioritizing drug repositioning candidates and conducted five studies. In the first study, we aimed to assess the feasibility of using patient reviews from social media to identify potential candidates for drug repurposing. We retrieved patient reviews of 180 medications from an online forum, WebMD. Using dictionary-based and machine learning approaches, we identified disease names in the reviews. Several publicly available resources were used to exclude comments containing known indications and adverse drug effects. After manually reviewing some of the remaining comments, we implemented a rule-based system to identify beneficial effects. The dictionary-based system and machine learning system identified 2178 and 6171 disease names respectively in 64,616 patient comments. We provided a list of 10 common patterns that patients used to report any beneficial effects or uses of medication. After manually reviewing the comments tagged by our rule-based system, we identified five potential drug repurposing candidates. 
To our knowledge, this was the first study to consider using social media data to identify drug-repurposing candidates. We found that even a rule-based system, with a limited number of rules, could identify beneficial effect mentions in the comments of patients. Our preliminary study shows that social media has the potential to be used in drug repurposing. In the second study, we investigated the significance of extracting information from multiple sentences, specifically in the context of drug-disease relation discovery. We used multiple resources such as Semantic Medline, a literature-based resource, and Medline search (for filtering spurious results) and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5%) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in the Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75% (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the need for discourse-level analysis in extracting relations from biomedical literature. In the third and fourth studies, we focused on prioritizing drug repositioning candidates extracted from biomedical literature, which we refer to as Literature-Based Discovery (LBD). In the third study, we used drug-gene and gene-disease semantic predications extracted from Medline abstracts to generate a list of potential drug-disease pairs. We further ranked the generated pairs by assigning scores based on the predicates that qualify drug-gene and gene-disease relationships. 
On comparing the top-ranked drug-disease pairs against the Comparative Toxicogenomics Database, we found that a significant percentage of top-ranked pairs appeared in CTD. Co-occurrence of these high-ranked pairs in Medline abstracts is then used to improve the rankings of the inferred drug-disease relations. Finally, manual evaluation of the top-ten pairs ranked by our approach revealed that nine of them have good potential for biological significance based on expert judgment. In the fourth study, we proposed a method, utilizing information surrounding causal findings, to prioritize discoveries generated by LBD systems. We focused on discovering drug-disease relations, which have the potential to identify drug repositioning candidates or adverse drug reactions. Our LBD system used drug-gene and gene-disease semantic predication in SemMedDB as causal findings and Swanson’s ABC model to generate potential drug-disease relations. Using sentences, as a source of causal findings, our ranking method trained a binary classifier to classify generated drug-disease relations into desired classes. We trained and tested our classifier for three different purposes: a) drug repositioning b) adverse drug-event detection and c) drug-disease relation detection. The classifier obtained 0.78, 0.86, and 0.83 F-measures respectively for these tasks. The number of causal findings of each hypothesis, which were classified as positive by the classifier, is the main metric for ranking hypotheses in the proposed method. To evaluate the ranking method, we counted and compared the number of true relations in the top 100 pairs, ranked by our method and one of the previous methods. Out of 181 true relations in the test dataset, the proposed method ranked 20 of them in the top 100 relations while this number was 13 for the other method. 
In the last study, we used biomedical literature and clinical trials in ranking potential drug repositioning candidates identified by Phenome-Wide Association Studies (PheWAS). Unlike previous approaches, in this study, we did not limit our method to LBD. First, we generated a list of potential drug repositioning candidates using PheWAS. We retrieved 212,851 gene-disease associations from PheWAS catalog and 14,169 gene-drug relationships from DrugBank. Following Swanson’s model, we generated 52,966 potential drug repositioning candidates. Then, we developed an information retrieval system to retrieve any evidence of those candidates co-occurring in the biomedical literature and clinical trials. We identified nearly 14,800 drug-disease pairs with some evidence of support. In addition, we identified more than 38,000 novel candidates for re-purposing, encompassing hundreds of different disease states and over 1,000 individual medications. We anticipate that these results will be highly useful for hypothesis generation in the field of drug repurposing
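
    The candidate-generation step described above follows Swanson's ABC pattern: a drug (A) linked to a gene (B) and that gene linked to a disease (C) together suggest a drug-disease hypothesis (A-C). The sketch below is a minimal illustration of that join on toy data. It is not the dissertation's implementation; the function name `abc_candidates`, the tuple layout, and the `known_pairs` filter are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def abc_candidates(
    gene_drug: Iterable[Pair],      # (gene, drug) links, e.g. from a drug database
    gene_disease: Iterable[Pair],   # (gene, disease) links, e.g. from an association catalog
    known_pairs: Set[Pair],         # (drug, disease) pairs already known, to be excluded
) -> Dict[Pair, Set[str]]:
    """Join the two link sets on the shared gene (the 'B' term).

    Returns a mapping from each candidate (drug, disease) pair to the set
    of genes that connect them.
    """
    drugs_by_gene: Dict[str, Set[str]] = defaultdict(set)
    for gene, drug in gene_drug:
        drugs_by_gene[gene].add(drug)

    support: Dict[Pair, Set[str]] = defaultdict(set)
    for gene, disease in gene_disease:
        for drug in drugs_by_gene.get(gene, ()):
            if (drug, disease) not in known_pairs:
                support[(drug, disease)].add(gene)
    return dict(support)


if __name__ == "__main__":
    # Toy identifiers, invented for illustration only.
    links_gene_drug = [("G1", "drugA"), ("G2", "drugA"), ("G2", "drugB")]
    links_gene_disease = [("G1", "diseaseX"), ("G2", "diseaseX"), ("G3", "diseaseZ")]
    known = {("drugB", "diseaseX")}
    for pair, genes in sorted(abc_candidates(links_gene_drug, links_gene_disease, known).items()):
        print(pair, sorted(genes))
```

    Keeping the connecting genes for each pair, rather than only the pair itself, gives a simple count of supporting links that a later ranking step could use.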

    Mining social media data for biomedical signals and health-related behavior

    Social media data has been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to planetary-level analyses of sentiment, social media has provided scientists with unprecedented amounts of data to study human behavior and response associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance, sentiment analysis especially for mental health, and other areas. We also discuss a variety of innovative uses of social media data for health-related applications and important limitations in social media data access and use. Comment: To appear in the Annual Review of Biomedical Data Science.

    A Study on Extracting Clinical Information from Unstructured Text for Pharmacovigilance

    Thesis (Ph.D.) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Applied Bioengineering, 2023. 2. 이형기. Pharmacovigilance is a scientific activity to detect, evaluate and understand the occurrence of adverse drug events or other problems related to drug safety. However, concerns have been raised over the quality of drug safety information for pharmacovigilance, and there is also a need to secure a new data source to acquire drug safety information. Meanwhile, the rise of pre-trained language models based on a transformer architecture has accelerated the application of natural language processing (NLP) techniques in diverse domains. In this context, I define two problems in pharmacovigilance as NLP tasks and provide baseline models for the defined tasks: 1) extracting comprehensive drug safety information from adverse drug event narratives reported through a spontaneous reporting system (SRS) and 2) extracting drug-food interaction information from abstracts of biomedical articles. I developed annotation guidelines and performed manual annotation, demonstrating that strong NLP models can be trained to extract clinical information from unstructured free text by fine-tuning transformer-based language models on a high-quality annotated corpus. Finally, I discuss issues to consider when developing annotation guidelines for extracting clinical information related to pharmacovigilance. The annotated corpora and the NLP models in this dissertation can streamline pharmacovigilance activities by enhancing the data quality of reported drug safety information and expanding the data sources.
    Contents. Chapter 1: 1.1 Contributions of this dissertation; 1.2 Overview of this dissertation; 1.3 Other works. Chapter 2: 2.1 Pharmacovigilance; 2.2 Biomedical NLP for pharmacovigilance; 2.2.1 Pre-trained language models; 2.2.2 Corpora to extract clinical information for pharmacovigilance. Chapter 3: 3.1 Motivation; 3.2 Proposed Methods; 3.2.1 Data source and text corpus; 3.2.2 Annotation of ADE narratives; 3.2.3 Quality control of annotation; 3.2.4 Pretraining KAERS-BERT; 3.2.6 Named entity recognition; 3.2.7 Entity label classification and sentence extraction; 3.2.8 Relation extraction; 3.2.9 Model evaluation; 3.2.10 Ablation experiment; 3.3 Results; 3.3.1 Annotated ICSRs; 3.3.2 Corpus statistics; 3.3.3 Performance of NLP models to extract drug safety information; 3.3.4 Ablation experiment; 3.4 Discussion; 3.5 Conclusion. Chapter 4: 4.1 Motivation; 4.2 Proposed Methods; 4.2.1 Data source; 4.2.2 Annotation; 4.2.3 Quality control of annotation; 4.2.4 Baseline model development; 4.3 Results; 4.3.1 Corpus statistics; 4.3.2 Annotation Quality; 4.3.3 Performance of baseline models; 4.3.4 Qualitative error analysis; 4.4 Discussion; 4.5 Conclusion. Chapter 5: 5.1 Issues around defining a word entity; 5.2 Issues around defining a relation between word entities; 5.3 Issues around defining entity labels; 5.4 Issues around selecting and preprocessing annotated documents. Chapter 6: 6.1 Dissertation summary; 6.2 Limitation and future works; 6.2.1 Development of end-to-end information extraction models from free-texts to database based on existing structured information; 6.2.2 Application of in-context learning framework in clinical information extraction. Chapter 7: 7.1 Annotation Guideline for "Extraction of Comprehensive Drug Safety Information from Adverse Event Narratives Reported through Spontaneous Reporting System"; 7.2 Annotation Guideline for "Extraction of Drug-Food Interactions from the Abstracts of Biomedical Articles".

    Text Mining Methods for Analyzing Online Health Information and Communication

    The Internet provides an alternative way to share health information. Specifically, social network systems such as Twitter, Facebook, Reddit, and disease-specific online support forums are increasingly being used to share information on health-related topics. This could take the form of disclosing personal health information to seek suggestions, or answering other patients' questions based on one's own history. This social media uptake offers a new angle for improving the current health communication landscape with consumer-generated content from social platforms. With these online modes of communication, health providers can offer more immediate support to people seeking advice. Non-profit organizations and federal agencies can also diffuse preventative information in such networks for better outcomes. Researchers in health communication can mine user-generated content on social networks to understand themes and derive insights into patient experiences that may be impractical to glean through traditional surveys. The main difficulty in mining social health data is separating the signal from the noise. Social data is characterized by informal content, typos, emoticons, tonal variations (e.g., sarcasm), and ambiguities arising from polysemous words, all of which make it difficult to build automated systems for deriving insights from such sources. In this dissertation, we present four efforts to mine health-related insights from user-generated social data. In the first effort, we build a model to identify marketing tweets on electronic cigarettes (e-cigs) and assess different topics in marketing and non-marketing messages on e-cigs on Twitter. In our next effort, we build ensemble models to classify messages on a mental health forum for triaging posts whose authors need immediate attention from trained moderators to prevent self-harm. The third effort deals with models from our participation in a shared task on identifying tweets that discuss adverse drug reactions and those that mention medication intake. In the final task, we build a classifier that identifies whether a particular tweet about the popular Juul e-cig indicates that the tweeter actually uses the product. Our methods range from linear classifiers (e.g., logistic regression) and classical nonlinear models (e.g., nearest neighbors) to recent deep neural networks (e.g., convolutional neural networks) and ensembles of all these models, using different supervised training regimens (e.g., co-training). The focus is more on task-specific system building than on building specific individual models. Overall, we demonstrate that it is possible to glean insights from social data on health-related topics through natural language processing and machine learning, with use cases from substance use and mental health.

    Improving information accessibility using online patient drug reviews

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 85-92). We address the problem of information accessibility for patients concerned about pharmaceutical drug side effects and experiences. We create a new corpus of online patient-provided drug reviews and present our initial experiments on that corpus. We detect biases in term distributions that show a statistically significant association between a class of cholesterol-lowering drugs called statins and a wide range of alarming disorders, including depression, memory loss, and heart failure. We also develop an initial language model for speech recognition in the medical domain, with transcribed data on sample patient comments collected with Amazon Mechanical Turk. Our findings show that patient-reported drug experiences have great potential to empower consumers to make more informed decisions about medical drugs, and our methods will be used to increase information accessibility for consumers. By Yueyang Alice Li. M.Eng.

    Articulating the new normal(s) : mental disability, medical discourse, and rhetorical action.

    “Articulating the New Normal(s): Mental Disability, Medical Discourse, and Rhetorical Action” studies the writing of people diagnosed with autism and post-traumatic stress disorder within online discussion boards related to mental health and outlines their unique rhetorical strategies for interacting with biomedical ideologies of psychiatry and activist discourses. The opening chapter situates this dissertation in relation to previous scholarship in Rhetoric, Disability Studies, and other fields. I also provide a summary of the set of mixed methods I use to gather and analyze my data, including rhetorical analysis, corpus analysis, and qualitative interviews. In Chapter 2, “Medical Terminology and Discourse Features of Online Discussions of Mental Health,” I explore the ways in which medical discourse appears in discussions of mental disability through medical terms that writers and speakers use when discussing a diagnosis. Using methods borrowed from linguistics, I demonstrate that the writers in my study make different linguistic choices than the general public, and that the most prominent differences are related to the social construction of mental health and medicine. In Chapter 3, “Inhabiting Biological Primacy with Chiasmic Rhetoric in Mental Health Forums,” I describe and analyze a variety of common topics in online conversations that connect mental health and expert knowledge of the brain. I argue that this connection of mental experience and brain science constitutes a chiasmic rhetoric. The writers foregrounded in this chapter acknowledge and accept many of the claims of medicine and neuroscience regarding the brain but, uniquely, work to divide that knowledge from the path of normativity and optimization. Chapter 4, “Classified Conversations: Psychiatry and Technical Communication in Online Spaces,” examines the practices of participants in online mental health discussion forum conversations as they interpret technical documents. 
I detail four salient forms of the manipulation of medical discourse in online communities. At the close of this chapter, I explain how these insights can inform academic study of writing in mental health contexts and transform the content and application of medical and technical texts. In Chapter 5, “Re-Forming Mental Health: Rhetorical Innovation and the Language of Advocacy,” I summarize and synthesize the core arguments of earlier chapters, with an extended caveat regarding the ethical dilemmas of this study. Finally, I offer a set of practical recommendations for different communities with which my research has been conversant, the fields of Rhetoric and Rhetoric of Health and Medicine, Disability Studies, and activism related to mental disabilities