932 research outputs found

    Transfer learning for unsupervised influenza-like illness models from online search data

    A considerable body of research has demonstrated that online search data can be used to complement current syndromic surveillance systems. The vast majority of previous work proposes solutions that are based on supervised learning paradigms, in which historical disease rates are required for training a model. However, for many geographical regions this information is either sparse or not available due to a poor health infrastructure. It is these regions that have the most to benefit from inferring population health statistics from online user search activity. To address this issue, we propose a statistical framework in which we first learn a supervised model for a region with adequate historical disease rates, and then transfer it to a target region, where no syndromic surveillance data exists. This transfer learning solution consists of three steps: (i) learn a regularized regression model for a source country, (ii) map the source queries to target ones using semantic and temporal similarity metrics, and (iii) re-adjust the weights of the target queries. It is evaluated on the task of estimating influenza-like illness (ILI) rates. We learn a source model for the United States, and subsequently transfer it to three other countries, namely France, Spain and Australia. Overall, the transferred (unsupervised) models achieve strong performance in terms of Pearson correlation with the ground truth (> .92 on average), and their mean absolute error does not deviate greatly from a fully supervised baseline.
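    The three steps named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: ridge regression stands in for the unspecified "regularized regression", the blend weight `gamma` and the mean-frequency rescaling in step (iii) are assumptions, and all function names are hypothetical. It also assumes the source and target frequency series are aligned in time and strictly positive.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Step (i): closed-form ridge regression on the source region (no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cosine_matrix(A, B):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def correlation_matrix(S, T):
    """Pairwise Pearson correlation between columns of S and columns of T."""
    S = (S - S.mean(axis=0)) / S.std(axis=0)
    T = (T - T.mean(axis=0)) / T.std(axis=0)
    return S.T @ T / S.shape[0]

def map_queries(src_emb, tgt_emb, src_freq, tgt_freq, gamma=0.5):
    """Step (ii): for each source query, pick the target query with the highest
    blend of semantic (embedding cosine) and temporal (correlation) similarity.
    gamma is an assumed blend weight."""
    score = (gamma * cosine_matrix(src_emb, tgt_emb)
             + (1.0 - gamma) * correlation_matrix(src_freq, tgt_freq))
    return score.argmax(axis=1)

def transfer_weights(w_src, mapping, src_freq, tgt_freq):
    """Step (iii), assumed form: rescale each source weight by the ratio of
    mean source frequency to mean frequency of the mapped target query."""
    scale = src_freq.mean(axis=0) / tgt_freq[:, mapping].mean(axis=0)
    return w_src * scale
```

    With `w_tgt = transfer_weights(...)`, target-region estimates would then be `tgt_freq[:, mapping] @ w_tgt`. The rescaling only corrects for differences in overall query volume; the abstract does not say how the weights are actually re-adjusted.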

    Tracking COVID-19 using online search

    Previous research has demonstrated that various properties of infectious diseases can be inferred from online search behaviour. In this work we use time series of online search query frequencies to gain insights about the prevalence of COVID-19 in multiple countries. We first develop unsupervised modelling techniques based on associated symptom categories identified by the United Kingdom's National Health Service and Public Health England. We then attempt to minimise an expected bias in these signals caused by public interest -- as opposed to infections -- using the proportion of news media coverage devoted to COVID-19 as a proxy indicator. Our analysis indicates that models based on online searches precede the reported confirmed cases and deaths by 16.7 (10.2 - 23.2) and 22.1 (17.4 - 26.9) days, respectively. We also investigate transfer learning techniques for mapping supervised models from countries where the spread of disease has progressed extensively to countries that are in earlier phases of their respective epidemic curves. Furthermore, we compare time series of online search activity against confirmed COVID-19 cases or deaths jointly across multiple countries, uncovering interesting querying patterns, including the finding that rarer symptoms are better predictors than common ones. Finally, we show that web searches improve the short-term forecasting accuracy of autoregressive models for COVID-19 deaths. Our work provides evidence that online search data can be used to develop complementary public health surveillance methods to help inform the COVID-19 response in conjunction with more established approaches. Comment: Published in Nature Digital Medicine. Please note that the published version differs from this preprint.
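    One simple way to estimate how far a search signal runs ahead of reported cases is to shift one series against the other and keep the shift with the highest Pearson correlation. The sketch below illustrates that idea only; the abstract does not state how its lead times were computed, and the function name and the `max_lag` default are hypothetical.

```python
import numpy as np

def best_lead(search, cases, max_lag=30):
    """Return (lag, r): the number of days by which `search` leads `cases`
    that maximises their Pearson correlation, and that correlation."""
    search = np.asarray(search, dtype=float)
    cases = np.asarray(cases, dtype=float)
    best_lag, best_r = 0, -np.inf
    for lag in range(max_lag + 1):
        x = search[: len(search) - lag]  # earlier search activity
        y = cases[lag:]                  # cases observed `lag` days later
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

    Both inputs are assumed to be daily series of equal length covering the same dates. A single best lag hides uncertainty, so an interval like the ones quoted above would need something extra, for example repeating the estimate per country or over resampled windows.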

    Machine learning in drug supply chain management during disease outbreaks: a systematic review

    The drug supply chain is inherently complex. The challenge lies not only in the number of stakeholders and the chain from producers to users but also in gaps between production and demand. Downstream, drug demand is related to the type of disease outbreak. This study identifies the correlation between drug supply chain management and the use of predictive parameters in research on the spread of disease, especially with machine learning methods, in the last five years. Using the Publish or Perish 8 application, 71 articles were found that meet the inclusion criteria and keyword search requirements according to Kitchenham's systematic review methodology. The findings fall into three broad categories of disease outbreaks, each of which uses machine learning algorithms to predict the spread of disease outbreaks. The use of parameters for prediction with machine learning has a correlation with drug supply management in the coronavirus disease case. The area of drug supply risk management has not been heavily involved in the prediction of disease outbreaks.

    Data-Centric Epidemic Forecasting: A Survey

    The COVID-19 pandemic has brought forth the importance of epidemic forecasting for decision makers in multiple domains, ranging from public health to the economy as a whole. While forecasting epidemic progression is frequently conceptualized as being analogous to weather forecasting, it has some key differences and remains a non-trivial task. The spread of diseases is subject to multiple confounding factors spanning human behavior, pathogen dynamics, weather and environmental conditions. Research interest has been fueled by the increased availability of rich data sources capturing previously unobservable facets and by initiatives from government public health and funding agencies. This has resulted, in particular, in a spate of work on 'data-centered' solutions which have shown potential in enhancing our forecasting capabilities by leveraging non-traditional data sources as well as recent innovations in AI and machine learning. This survey delves into various data-driven methodological and practical advancements and introduces a conceptual framework to navigate through them. First, we enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting, capturing various factors like symptomatic online surveys, retail and commerce, mobility, genomics data and more. Next, we discuss methods and modeling paradigms focusing on the recent data-driven statistical and deep-learning based methods as well as on the novel class of hybrid models that combine domain knowledge of mechanistic models with the effectiveness and flexibility of statistical approaches. We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems including decision-making informed by forecasts. Finally, we highlight some challenges and open problems found across the forecasting pipeline. Comment: 67 pages, 12 figures.

    Analysis of Tweets for Social Media Health Applications

    Social networking sites like Twitter have provided people a platform to connect with each other, to discuss and share information and news or to entertain themselves. As the number of users continues to grow there has been explosive growth in the data generated by these users. Such a vast data source has provided researchers a way to study and monitor public health. Accurately analyzing tweets is a difficult task mainly because of their short length, the inventive spellings and creative language expressions. Instead of focusing on the topic level, identifying tweets that have personal health experience mentions would be more helpful to researchers, governments and other organizations. Another important limitation in the current systems for social media health applications is the use of a disease-specific model and dataset to study a particular disease. Identifying adverse drug reactions is an important part of the drug development process. Detecting and extracting adverse drug mentions in tweets can supplement the list of adverse drug reactions that result from the drug trials and can help in the improvement of the drugs. This thesis aims to address these two challenges and proposes three systems: a generalizable system to identify personal health experience mentions across different disease domains, a system for automatic classification of adverse effect mentions in tweets, and a system to extract adverse drug mentions from tweets. The proposed systems use transfer learning from language models to achieve notable scores on the Social Media Mining for Health Applications (SMM4H) 2019 (Weissenbacher et al. 2019) shared tasks. Dissertation/Thesis. Masters Thesis Computer Science 201

    Artificial Intelligence for Sustainability—A Systematic Review of Information Systems Literature

    The booming adoption of Artificial Intelligence (AI) poses both benefits and challenges. In this paper, we particularly focus on the bright side of AI and its promising potential to face our society’s grand challenges. Given this potential, different studies have already conducted valuable work by conceptualizing specific facets of AI and sustainability, including reviews on AI and Information Systems (IS) research or AI and business values. Nonetheless, there is still little holistic knowledge at the intersection of IS, AI, and sustainability. This is problematic because the IS discipline, with its socio-technical nature, has the ability to integrate perspectives beyond the currently dominant technological one and can advance both theory and the development of purposeful artifacts. To bridge this gap, we disclose how IS research currently makes use of AI to boost sustainable development. Based on a systematically collected corpus of 95 articles, we examine the sustainability goals, data inputs, technologies and algorithms, and evaluation approaches that shape the current state of the art within the IS discipline. This comprehensive overview enables us to make more informed investments (e.g., policy and practice) as well as to discuss blind spots and possible directions for future research.