2,392 research outputs found

    Automatic text filtering using limited supervision learning for epidemic intelligence

    [no abstract]

    Social Media Text Mining Framework for Drug Abuse: An Opioid Crisis Case Analysis

    Social media is considered a promising and viable source of data for gaining insights into various disease conditions, patients’ attitudes and behaviors, and medications. The daily use of social media provides new opportunities for analyzing several aspects of communication. Social media as a big data source can be used to recognize communication and behavioral themes of problematic use of prescription drugs. Mining and analyzing such media have challenges and limitations with respect to topic deduction and data quality. There is a need for a structured approach to efficiently and effectively analyze social media content related to drug abuse in a manner that can mitigate the challenges surrounding the use of this data source. Following a design science research methodology, the research aims at developing and evaluating a framework for mining and analyzing social media content related to drug abuse in a manner that will mitigate challenges and limitations related to topic deduction and data quality. The framework consists of four phases: Topic Discovery and Detection; Data Collection; Data Preparation and Quality; and Analysis and Results. The topic discovery and detection phase consists of a topic expansion stage for the drug abuse related topics that address the research domain and objectives. The topic expansion is based on different terms related to keywords, categories, and characteristics of the topic of interest and the objective of monitoring. To formalize the process and supporting artifacts, we create an ontology for drug abuse that captures the different categories that exist in the topic expansion and the literature. The data collection phase is characterized by the date range, social media platforms, search keywords, and a set of inclusion/exclusion criteria. The data preparation and quality phase is mainly concerned with obtaining high-quality data to mitigate problems with data veracity.
In this phase, we pre-process the collected data and then evaluate its quality, with respect to the terms and objectives of the research topic phase, using a data quality evaluation matrix. Finally, in the data analysis phase, the researcher can choose the suitable analysis approach. We used a combination of unsupervised and supervised machine learning approaches, including opinion and content analysis modeling. We demonstrate and evaluate the applicability of the proposed framework to identify common concerns toward the opioid crisis from two perspectives: the addicted users’ perspective and the public’s (non-addicted users’) perspective. In both cases, data is collected from Twitter using Crimson Hexagon, a social media analytics tool for data collection and analysis. Natural language processing is used for data preparation and pre-processing. Different data visualization techniques, such as word clouds and clustering visualization, are used to form a deeper understanding of the relationships among the identified themes for the selected communities. The results help in understanding the concerns of the public and opioid addicts about the opioid crisis in the United States. Results of this study could help in understanding the problem aspects and provide key input when it comes to defining and implementing innovative solutions/strategies to face the opioid epidemic. From a theoretical perspective, this study highlights the importance of developing and adapting text mining techniques to social media for drug abuse. This study proposes a social media text mining framework for drug abuse research that leads to good-quality datasets. Emphasis is placed on developing methods for improving the discovery and identification of topics in social media domains characterized by a plethora of highly diverse terms and the lack of a commonly available dictionary/language in the community, as in the opioid and drug abuse case.
From a practical perspective, automatically analyzing social media users’ posts using machine learning tools can help in understanding the public themes and topics that exist in the recent discussions of online users of social media networks. This could help in developing proper mitigation strategies. One example of such a strategy is using insights from the discussion topics to make opioid media campaigns more effective in preventing opioid misuse. Finally, the study helps address parts of the U.S. Department of Health and Human Services (HHS) five-point strategy by providing a systematic approach that could support conducting better research on addiction and drug abuse and strengthening public health data reporting and collection using social media data.
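
The first three phases described in this abstract (topic expansion, collection with inclusion/exclusion criteria, and pre-processing) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the expansion table, the exclusion patterns, and the function names `expand_keywords`, `preprocess`, and `select_posts` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical topic expansion: seed topic -> related search terms.
# The abstract's ontology is not reproduced here; these terms are illustrative.
TOPIC_EXPANSION = {
    "opioid": ["opioid", "oxycodone", "fentanyl"],
}

# Hypothetical exclusion criteria: drop retweets and posts containing links.
EXCLUDE_PATTERNS = [r"^rt\b", r"https?://"]


def expand_keywords(seeds, expansion=TOPIC_EXPANSION):
    """Return the sorted set of search terms for the given seed topics."""
    terms = set()
    for seed in seeds:
        # Fall back to the seed itself when no expansion is known.
        terms.update(expansion.get(seed, [seed]))
    return sorted(terms)


def preprocess(text):
    """Lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def select_posts(posts, keywords, exclude=EXCLUDE_PATTERNS):
    """Keep pre-processed posts that match a keyword and no exclusion pattern."""
    kept = []
    for post in posts:
        clean = preprocess(post)
        if any(re.search(pattern, clean) for pattern in exclude):
            continue
        if any(re.search(rf"\b{re.escape(k)}\b", clean) for k in keywords):
            kept.append(clean)
    return kept
```

Calling `select_posts(posts, expand_keywords(["opioid"]))` on a list of raw post strings returns only the cleaned posts that pass both criteria. A real study would replace the toy tables with the ontology and quality matrix the abstract describes.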

    Machine Learning Algorithms in Analysis, Diagnosing and Predicting COVID-19: A Systematic Literature Review

    Since the COVID-19 coronavirus first appeared at the end of 2019 in Wuhan, China, the analysis, diagnosis, and prognosis of COVID-19 (SARS-CoV-2) has attracted the greatest attention. Since then, every part of the world has needed some sort of system or instrument to assist judgments for prompt quarantine and medical treatment. For a variety of uses, including prediction, classification, and analysis, machine learning (MLR) algorithms have demonstrated their accuracy and efficiency in the fields of education, health, and security. In this paper, three main questions related to COVID-19 analysis, prediction, and diagnosis are answered. The performance evaluation, fast processing and identification, quick learning, and accurate results of MLR algorithms make them a base for all models in analyzing, diagnosing, and predicting COVID-19 infection. Supervised and unsupervised MLR can be used to estimate the spread level of COVID-19 in order to make proper strategic decisions. The researchers next compared the effects of various data types on diagnosing, forecasting, and assessing the severity of COVID-19 infection in order to examine the effects of MLRs. Three fields are associated with COVID-19, according to the analysis of the chosen studies (analysis, diagnosing, and predicting). The majority of studies focus on COVID-19 diagnosis, where they use their models to identify the infection. In the selected studies, several algorithms are employed; however, a study revealed that the neural network is the most used method when compared to other algorithms. The most used method for identifying, forecasting, and evaluating COVID-19 infection is supervised MLR.

    Toward unsupervised outbreak detection through visual perception of new patterns

    Background: Statistical algorithms are routinely used to detect outbreaks of well-defined syndromes, such as influenza-like illness. These methods cannot be applied to the detection of emerging diseases for which no preexisting information is available. This paper presents a method aimed at facilitating the detection of outbreaks when there is no a priori knowledge of the clinical presentation of cases. Methods: The method uses a visual representation of the symptoms and diseases coded during a patient consultation according to the International Classification of Primary Care, 2nd version (ICPC-2). The surveillance data are transformed into color-coded cells, ranging from white to red, reflecting the increasing frequency of observed signs. They are placed in a graphic reference frame mimicking body anatomy. Simple visual observation of color-change patterns over time, concerning a single code or a combination of codes, enables detection in the setting of interest. Results: The method is demonstrated through retrospective analyses of two data sets: description of the patients referred to the hospital by their general practitioners (GPs) participating in the French Sentinel Network and description of patients directly consulting at a hospital emergency department (HED). Informative image color-change alert patterns emerged in both cases: the health consequences of the August 2003 heat wave were visualized with GPs' data (but passed unnoticed with conventional surveillance systems), and the flu epidemics, which are routinely detected by standard statistical techniques, were recognized visually with HED data. Conclusion: Using human visual pattern-recognition capacities to detect the onset of unexpected health events implies a convenient image representation of epidemiological surveillance and well-trained "epidemiology watchers". Once these two conditions are met, one could imagine that the epidemiology watchers could signal epidemiological alerts, based on "image walls" presenting the local, regional and/or national surveillance patterns, with specialized field epidemiologists assigned to validate the signals detected.
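
The white-to-red cell coding described in this abstract can be sketched as a simple mapping from an observed frequency to an RGB color. The abstract does not state the scale the authors used, so the linear interpolation below, and the function name `frequency_to_rgb`, are assumptions made for illustration only.

```python
def frequency_to_rgb(count, max_count):
    """Map a count in [0, max_count] to an RGB tuple from white to red.

    Assumes a linear scale (not specified in the abstract): a count of 0
    gives white (255, 255, 255) and max_count gives pure red (255, 0, 0).
    """
    if max_count <= 0:
        # No reference maximum: render the cell as white.
        return (255, 255, 255)
    # Clamp the ratio so out-of-range counts still give a valid color.
    ratio = min(max(count / max_count, 0.0), 1.0)
    # Green and blue channels fade out as the frequency rises.
    fade = int(255 * (1.0 - ratio))
    return (255, fade, fade)
```

Applied to each ICPC-2 code in a time window, a mapping of this kind would produce the color-change patterns that the "epidemiology watchers" are meant to observe.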

    Detection and analysis of drug non-compliance in internet fora using information retrieval approaches

    In the health-related field, drug non-compliance situations happen when patients do not follow their prescriptions and take actions that lead to potentially harmful situations. Although such situations are dangerous, patients usually do not report them to their physicians. Hence, it is necessary to study other sources of information. We propose to study online health fora with information retrieval methods in order to identify messages that contain drug non-compliance information. Information retrieval methods detect non-compliance messages with up to 0.529 F-measure, compared to 0.824 F-measure reached with supervised machine learning methods. For some fine-grained categories and on new data, they reach up to 0.70 Precision.
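
For readers unfamiliar with the metric reported above, the F-measure is the weighted harmonic mean of precision and recall. The sketch below shows the standard formula; the function name `f_measure` and the example values are illustrative and are not results from the paper, which does not report the recall behind its figures in this abstract.

```python
def f_measure(precision, recall, beta=1.0):
    """Return the F-beta score; beta=1.0 gives the usual F1 harmonic mean."""
    if precision == 0 and recall == 0:
        # Avoid dividing by zero when nothing relevant was retrieved.
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Illustrative values only: precision 0.6 and recall 0.4 give an F1 near 0.48.
example_f1 = f_measure(0.6, 0.4)
```

Because the harmonic mean is pulled toward the lower of the two values, a system can show high precision on some categories while still having a modest overall F-measure.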