435 research outputs found

    Global disease monitoring and forecasting with Wikipedia

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data such as social media and search queries are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art. Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarity

    Infoveillance of infectious diseases in USA: STDs, tuberculosis, and hepatitis

    Big Data Analytics have become an integral part of Health Informatics over the past years, with the analysis of Internet data becoming increasingly popular in health assessment across various topics. In this study, we first examine the geographical distribution of the online behavioral variations towards Chlamydia, Gonorrhea, Syphilis, Tuberculosis, and Hepatitis in the United States by year from 2004 to 2017. Next, we examine the correlations between Google Trends data and official health data from the ‘Centers for Disease Control and Prevention’ (CDC) on said diseases, followed by estimating linear regressions for the respective relationships. The results show that Infoveillance can assist with exploring public awareness and accurately measure the behavioral changes towards said diseases. The correlations between Google Trends data and CDC data on Chlamydia cases are statistically significant at a national level and in most of the states, while the forecasting performs well in many states. For Hepatitis, significant correlations are observed for several US States, while forecasting also exhibits promising results. However, several factors can affect the applicability of this forecasting method, as in the cases of Gonorrhea, Syphilis, and Tuberculosis, where the correlations are statistically significant in fewer states. Thus this study highlights that the analysis of Google Trends data should be done with caution in order for the results to be robust. In addition, we suggest that the applicability of this method is neither trivial nor universal, and that several factors need to be taken into account when using online data in this line of research. Nevertheless, this study also supports previous findings suggesting that the analysis of real-time online data is important in health assessment, as it avoids the long procedure of data collection and analysis in traditional survey methods, and provides us with information that would not be accessible otherwise

    Analyzing the Association of Google Trends and Temperature with Rocky Mountain Spotted Fever in the United States 2004-2015

    INTRODUCTION: Rocky Mountain Spotted Fever (RMSF) is a vector-borne disease spread through infected ticks. Climate influences the survival and distribution of ticks, which has an effect on human exposure. Since 2000, the incidence has increased from 1.7 cases per million person-years to 14.3 cases per million person-years in 2012. Around this time, Google was founded and has since become the premier search engine in the United States market, with the number of unique monthly visitors surpassing one billion for the first time in May 2011. Google Trends is a public tool provided by Google Inc. that shows how often a search term is entered relative to total search volume across various regions, tracking data since Google’s public offering in 2004. AIM: This study examines the association between Google Trends, temperature, and onset cases of Rocky Mountain Spotted Fever in the United States from 2004-2015. METHODS: This is a retrospective cross-sectional study; data were obtained from the National Notifiable Disease Surveillance System (NNDSS), National Oceanic and Atmospheric Administration (NOAA), and Google Trends. Thirty-four states were examined based on Spotted Fever Rickettsiosis (SFR) incidence in 2014 according to the Centers for Disease Control and Prevention (CDC). The average minimum temperature, average temperature, and average maximum temperature for the 34 states were collected from the NOAA website. Google Trends data were based on the search term “Rocky Mountain Spotted Fever”. SAS was used to conduct simple and multiple regression analysis to examine the association between SFR onset cases, temperature, and Google Trends data. RESULTS: From 2004-2015, a total of 25,993 onset cases were recorded across 34 states. North Carolina (5777 onset cases) had the most recorded while Connecticut (2 onset cases) had the fewest. Statistical significance was measured at p ≤ 0.05. 
When examining the United States, the model (onset case = Interest Over Time) was statistically significant; the predictor Interest Over Time explained 52.62% of the variance (R² = 0.5262, F(1,143) = 157.69, p < 0.0001). Interest Over Time was also found to be statistically significant (β = 6.57, t(1) = 12.56, p < 0.0001). When examining data at a state level, average temperature, as a predictor for onset cases, was statistically significant in 31 out of 34 states (31/34). Average minimum temperature (31/34) and average maximum temperature (31/34) were statistically significant in the same proportion of states. Google Trends was statistically significant for 14 out of 34 states (14/34). Only 5 out of 34 states had both variables as statistically significant when measured as predictors. CONCLUSION: The results from this study show that Google Trends has at best modest reliability in determining the epidemiology of Rocky Mountain Spotted Fever. Temperature does show an association with onset cases, but we must keep in mind that temperature primarily describes the association for exposure to infection rather than actual onset cases. Overall, it is unclear what kind of influence Google Trends has, and further studies are required.

    Chinese Social Media Reaction to Information about 42 Notifiable Infectious Diseases

    This study aimed to identify what information triggered social media users' responses regarding infectious diseases. Chinese microblogs in 2012 regarding 42 infectious diseases were obtained through a keyword search in the Weiboscope database. Qualitative content analysis was performed for the posts pertinent to each keyword on the day of the year with the highest daily count. Similar posts were grouped and coded. We identified five categories of information that increased microblog traffic pertaining to infectious diseases: news of an outbreak or a case; health education / information; alternative health information / Traditional Chinese Medicine; commercial advertisement / entertainment; and social issues. News unrelated to the specified infectious diseases also led to elevated microblog traffic. Our study showcases the diverse contexts in which increased social media traffic occurs. Our results will facilitate better health communication as causes underlying increased social media traffic are revealed.

    ASPREN surveillance system for influenza-like illness - a comparison with FluTracking and the National Notifiable Diseases Surveillance System

    Public health surveillance systems are fundamental to the prevention and control of infectious diseases. Data obtained by sentinel surveillance systems may be used to inform public health decision making, priority setting and subsequent action

    Is there a duty to participate in digital epidemiology?

    This paper poses the question of whether people have a duty to participate in digital epidemiology. While an implied duty to participate has been argued for in relation to biomedical research in general, digital epidemiology involves processing of non-medical, granular and proprietary data types that pose different risks to participants. We first describe traditional justifications for epidemiology that imply a duty to participate for the general public, which take account of the immediacy and plausibility of threats, and the identifiability of data. We then consider how these justifications translate to digital epidemiology, understood as an evolution of traditional epidemiology that includes personal and proprietary digital data alongside formal medical datasets. We consider the risks imposed by re-purposing such data for digital epidemiology and propose eight justificatory conditions that should be met in justifying a duty to participate for specific digital epidemiological studies. The conditions are then applied to three hypothetical cases involving usage of social media data for epidemiological purposes. We conclude with a list of questions to be considered in public negotiations of digital epidemiology, including the application of a duty to participate to third-party data controllers, and the important distinction between moral and legal obligations to participate in research

    Disease surveillance systems

    Recent advances in information and communication technologies have made the development and operation of complex disease surveillance systems technically feasible, and many systems have been proposed to interpret diverse data sources for health-related signals. Implementing these systems for daily use and efficiently interpreting their output, however, remains a technical challenge. This thesis presents a method for understanding disease surveillance systems structurally, examines four existing systems, and discusses the implications of developing such systems. The discussion is followed by two papers. The first paper describes the design of a national outbreak detection system for daily disease surveillance. It is currently in use at the Swedish Institute for Communicable Disease Control. The source code has been licenced under GNU v3 and is freely available. The second paper discusses methodological issues in computational epidemiology, and presents the lessons learned from a software development project in which a spatially explicit micro-meso-macro model for the entire Swedish population was built based on registry data

    Surveillance of vaccine-preventable diseases (VPDs): Update 2017: Viral VPDs

    Session I: CDC subject matter experts for viral VPDs (mumps, acute flaccid myelitis/polio, varicella, measles, and rotavirus)