9 research outputs found

    Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data

    Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and to offer support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and prevent large-scale outbreaks. This paper presents a spatio-temporal Bayesian framework for early detection of COVID-19 hotspots (at the county level) in the United States. We assume both the observed number of cases and hotspots depend on a class of latent random variables, which encode the underlying spatio-temporal dynamics of the transmission of COVID-19. Such latent variables follow a zero-mean Gaussian process, whose covariance is specified by a non-stationary kernel function. The most salient feature of our kernel function is that deep neural networks are introduced to enhance the model's representative power while still enjoying the interpretability of the kernel. We derive a sparse model and fit the model using a variational learning strategy to circumvent the computational intractability for large data sets. Our model demonstrates better interpretability and superior hotspot-detection performance compared to other baseline methods.
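
The abstract describes latent variables drawn from a zero-mean Gaussian process whose non-stationary covariance involves deep neural networks. The following is a minimal sketch of one common construction of that kind, assuming a stationary RBF kernel applied to the output of a small neural-network feature map. It is not claimed to be the paper's kernel; the names `feature_map` and `deep_kernel`, the layer sizes, the random weights, and the (longitude, latitude, time) inputs are all hypothetical.

```python
import numpy as np

def feature_map(x, weights, biases):
    """Pass inputs through a small tanh network (hypothetical architecture)."""
    h = x
    for W, b in zip(weights, biases):
        h = np.tanh(h @ W + b)
    return h

def deep_kernel(xa, xb, weights, biases, lengthscale=1.0):
    """RBF kernel on network features; non-stationary in the original inputs."""
    fa = feature_map(xa, weights, biases)
    fb = feature_map(xb, weights, biases)
    # Squared Euclidean distances between all pairs of feature vectors.
    sq = (np.sum(fa**2, axis=1)[:, None]
          + np.sum(fb**2, axis=1)[None, :]
          - 2.0 * fa @ fb.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

rng = np.random.default_rng(0)
# Assumed inputs: 5 space-time points given as (longitude, latitude, time).
points = rng.uniform(size=(5, 3))
weights = [rng.normal(size=(3, 8)), rng.normal(size=(8, 2))]
biases = [np.zeros(8), np.zeros(2)]

K = deep_kernel(points, points, weights, biases)
# One draw of the latent variables from the zero-mean Gaussian process prior;
# the small diagonal jitter keeps the covariance numerically positive definite.
latent = rng.multivariate_normal(np.zeros(5), K + 1e-6 * np.eye(5))
```

Because the RBF kernel is applied to learned features rather than raw coordinates, the covariance between two locations depends on where they sit in the input space and not only on their separation, which is what makes the resulting kernel non-stationary.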

    Curating a COVID-19 data repository and forecasting county-level death counts in the United States

    Presented online on October 23, 2020 at 2:00 p.m.
    Bin Yu is Chancellor’s Distinguished Professor and Class of 1936 Second Chair in the Departments of Statistics and of Electrical Engineering & Computer Sciences at the University of California at Berkeley and a former chair of Statistics at UC Berkeley. Yu's research focuses on practice, algorithm, and theory of statistical machine learning and causal inference. Her group is engaged in interdisciplinary research with scientists from genomics, neuroscience, and precision medicine.
    Runtime: 58:26 minutes
    As the COVID-19 outbreak evolves, accurate forecasting continues to play an extremely important role in informing policy decisions. In this paper, we present our continuous curation of a large data repository containing COVID-19 information from a range of sources. We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative death counts at the county level in the United States up to two weeks ahead. Using data from January 22 to June 20, 2020, we develop and combine multiple forecasts using ensembling techniques, resulting in an ensemble we refer to as Combined Linear and Exponential Predictors (CLEP). Our individual predictors include county-specific exponential and linear predictors, a shared exponential predictor that pools data together across counties, an expanded shared exponential predictor that uses data from neighboring counties, and a demographics-based shared exponential predictor. We use prediction errors from the past five days to assess the uncertainty of our death predictions, resulting in generally applicable prediction intervals, Maximum (absolute) Error Prediction Intervals (MEPI). MEPI achieves a coverage rate of more than 94% when averaged across counties for predicting cumulative recorded death counts two weeks in the future. Our forecasts are currently being used by the non-profit organization Response4Life to determine the medical supply need for individual hospitals and have directly contributed to the distribution of medical supplies across the country. We hope that our forecasts and data repository at this https URL can help guide necessary county-specific decision-making and help counties prepare for their continued fight against COVID-19.
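
The MEPI idea described above, using the largest prediction error from the past five days to size an interval around a new forecast, can be sketched as follows. This is an illustrative sketch only: the abstract does not give the formula, so normalizing each error by its prediction and clipping the lower bound at zero are assumptions, and the function name `mepi_interval` and the sample numbers are hypothetical.

```python
def mepi_interval(recent_observed, recent_predicted, new_prediction):
    """Interval around new_prediction sized by the largest recent relative error."""
    if not recent_observed or len(recent_observed) != len(recent_predicted):
        raise ValueError("need equally long, non-empty histories")
    # Assumption: errors are normalized by the prediction they belong to.
    normalized_errors = [
        abs(obs - pred) / max(abs(pred), 1e-12)
        for obs, pred in zip(recent_observed, recent_predicted)
    ]
    max_error = max(normalized_errors)
    # Assumption: death counts cannot be negative, so clip the lower bound.
    lower = max(new_prediction * (1.0 - max_error), 0.0)
    upper = new_prediction * (1.0 + max_error)
    return lower, upper

# Hypothetical cumulative death counts and forecasts for the past five days.
observed = [100, 104, 109, 115, 120]
predicted = [98, 105, 112, 113, 125]
lower, upper = mepi_interval(observed, predicted, 150.0)
```

With these made-up numbers the largest relative error is 5/125 = 0.04, so the interval around a forecast of 150 is roughly 144 to 156.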

    New Spatio-temporal Hawkes Process Models For Social Good

    Indiana University-Purdue University Indianapolis (IUPUI)
    As more and more datasets with self-exciting properties become available, the demand for robust models that capture contagion across events is also getting stronger. Hawkes processes stand out given their ability to capture a wide range of contagion and self-excitation patterns, including the transmission of infectious disease, earthquake aftershock distributions, near-repeat crime patterns, and overdose clusters. The Hawkes process is flexible in modeling these various applications through parametric and non-parametric kernels that model event dependencies in space, time and on networks. In this thesis, we develop new frameworks that integrate Hawkes process models with multi-armed bandit algorithms, high-dimensional marks, and high-dimensional auxiliary data to solve problems in search and rescue, forecasting infectious disease, and early detection of overdose spikes.
    In Chapter 3, we develop a method with applications to the crisis of increasing overdose mortality over the last decade. We first encode the molecular substructures found in a drug overdose toxicology report. We then cluster these overdose encodings into different overdose categories and model these categories with spatio-temporal multivariate Hawkes processes. Our results demonstrate that the proposed methodology can improve estimation of the magnitude of an overdose spike based on the substances found in an initial overdose.
    In Chapter 4, we build a framework for multi-armed bandit problems arising in event detection where the underlying process is self-exciting. We derive the expected number of events for Hawkes processes given a parametric model for the intensity and then analyze the regret bound of a Hawkes process UCB-normal algorithm. By introducing Hawkes process modeling into the upper confidence bound construction, our models can detect more events of interest under the multi-armed bandit problem setting. We apply the Hawkes bandit model to spatio-temporal data on crime events and earthquake aftershocks. We show that the model can quickly learn to detect hotspot regions, when events are unobserved, while striking a balance between exploitation and exploration.
    In Chapter 5, we present a new spatio-temporal framework for integrating Hawkes processes with multi-armed bandit algorithms. Compared to the methods proposed in Chapter 4, the upper confidence bound is constructed through Bayesian estimation of a spatial Hawkes process to balance the trade-off between exploiting and exploring geographic regions. The model is validated through simulated datasets and real-world datasets such as flooding events and improvised explosive device (IED) attack records. The experimental results show that our model outperforms baseline spatial MAB algorithms through rewards and ranking metrics.
    In Chapter 6, we demonstrate that the Hawkes process is a powerful tool to model infectious disease transmission. We develop models using Hawkes processes with spatio-temporal covariates to forecast COVID-19 transmission at the county level. In the proposed framework, we show how to estimate the dynamic reproduction number of the virus within an EM algorithm through a regression on Google mobility indices. We also include demographic covariates as spatial information to enhance the accuracy. Such an approach is tested on both short-term and long-term forecasting tasks. The results show that the Hawkes process outperforms several benchmark models published in a public forecast repository. The model also provides insights on important covariates and mobility that impact COVID-19 transmission in the U.S.
    Finally, in Chapter 7, we discuss implications of the research and future research directions.
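
The self-excitation that the thesis abstract refers to can be illustrated with the conditional intensity of a purely temporal Hawkes process. The sketch below assumes an exponential triggering kernel, which is only one of the parametric choices the abstract alludes to, and it leaves out the spatial, network, and covariate components of the thesis models. The function name `hawkes_intensity` and the parameter values are hypothetical.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity: mu + sum over past events of alpha*beta*exp(-beta*(t - s)).

    mu    : background rate of events that are not triggered by earlier events
    alpha : expected number of events directly triggered by one event
    beta  : decay rate of the triggering effect
    """
    triggered = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t  # only events strictly before t can excite the process
    )
    return mu + triggered

# Hypothetical event history: two nearby events raise the rate shortly afterwards.
history = [1.0, 1.2]
rate_before = hawkes_intensity(0.5, history, mu=0.5, alpha=0.8, beta=1.0)
rate_after = hawkes_intensity(1.5, history, mu=0.5, alpha=0.8, beta=1.0)
```

Before any event the intensity equals the background rate `mu`; after a cluster of events it rises and then decays back, which is the behavior that makes the model suitable for contagion-like data. In epidemic applications, `alpha` plays a role comparable to a reproduction number, since under this kernel it is the expected number of direct offspring per event.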

    PREVENT Symposium - Session 3, Population Level Theme

    Presented online February 23, 2021, 10:30 a.m.-1:10 p.m.
    National Symposium on Predicting Emergence of Virulent Entities by Novel Technologies (PREVENT): What Advances In Science, Technology, And Human Behavior Will Enable Prediction And Prevention Of Future Pandemics?
    Chairs: B. Aditya Prakash and Paul Torrens
    Bryan Grenfell is a population biologist, distinguished for his investigation into the spatiotemporal dynamics of pathogens and other populations. Bryan studies processes that occur in populations at different scales and how infections move through such groups of organisms. His work is crucial in helping to control disease in humans and animals. His research is theoretical as well as based on large datasets, demonstrating how the density of a population and randomness interact to change the size and composition of populations. Alongside colleagues from the National University of Singapore, he studied measles in developed countries and is now extending his investigations to whooping cough and other infectious diseases. Bryan is currently Professor of Ecology and Evolutionary Biology and Public Affairs at Princeton University in New Jersey. He was awarded the T. H. Huxley Medal from Imperial College London in 1991, and the Scientific Medal of the Zoological Society of London in 1995.
    Bin Yu is Chancellor’s Distinguished Professor and Class of 1936 Second Chair in the Departments of Statistics and of Electrical Engineering & Computer Sciences at the University of California at Berkeley and a former chair of Statistics at UC Berkeley. Yu's research focuses on practice, algorithm, and theory of statistical machine learning and causal inference. Her group is engaged in interdisciplinary research with scientists from genomics, neuroscience, and precision medicine. In order to augment empirical evidence for decision-making, they are investigating methods/algorithms (and associated statistical inference problems) such as dictionary learning, non-negative matrix factorization (NMF), EM and deep learning (CNNs and LSTMs), and heterogeneous effect estimation in randomized experiments (X-learner). Their recent algorithms include staNMF for unsupervised learning, iterative Random Forests (iRF) and signed iRF (s-iRF) for discovering predictive and stable high-order interactions in supervised learning, contextual decomposition (CD) and aggregated contextual decomposition (ACD) for interpretation of Deep Neural Networks (DNNs). Yu is a member of the U.S. National Academy of Sciences and a fellow of the American Academy of Arts and Sciences. She was a Guggenheim Fellow in 2006, and the Tukey Memorial Lecturer of the Bernoulli Society in 2012. She was President of IMS (Institute of Mathematical Statistics) in 2013-2014 and the Rietz Lecturer of IMS in 2016. She received the E. L. Scott Award from COPSS (Committee of Presidents of Statistical Societies) in 2018. Moreover, Yu was a founding co-director of the Microsoft Research Asia (MSR) Lab at Peking University and is a member of the scientific advisory board at the UK Alan Turing Institute for data science and AI.
    Jordan Peccia is the Thomas E. Golden Jr. Professor of environmental engineering at Yale University. His research mixes genetics with engineering to study childhood exposure to bacteria, fungi and viruses in buildings. Peccia is a member of the Connecticut Academy of Science and Engineering and associate editor for the journal Indoor Air. He earned his PhD in environmental engineering from the University of Colorado.
    Runtime: 60:02 minutes
    Bryan Grenfell - Plenary Talk TITLE: "What Cross-Scale Research Can Tell Us About Predicting, Understanding And Mitigating Future Pandemics?" We briefly review the epidemic and evolutionary dynamics of directly-transmitted infections and their transition from pandemics to endemicity. We discuss how cross-scale dynamics, from protein to pandemic, determine key issues in understanding, predicting and mitigating outbreaks, then build on this to discuss future cross-scale research and public health priorities.
    Bin Yu - Presentation TITLE: "Curating a COVID-19 Data Repository and Forecasting County-Level Death Counts in the United States". As the COVID-19 outbreak continues to evolve, accurate forecasting continues to play an extremely important role in informing policy decisions. In this talk, I will describe a large data repository containing COVID-19 information curated from a range of different sources. This data is then used to develop several predictors and prediction intervals for forecasting the short-term (e.g., over the next week) trajectory of COVID-19-related recorded deaths at the county level in the United States.
    Jordan Peccia - Presentation TITLE: "Tracking Epidemics at the Population Level Through Wastewater-Based Epidemiology". Throughout the world, wastewater is continually collected from human populations and conveyed to central locations for treatment and/or discharge. The chemical and biological features of wastewater contain insight into the disease state and behavior of a community. This talk reports on the Yale COVID-19 wastewater project, where daily samples were collected from eight different wastewater treatment facilities representing 20 Connecticut towns and cities and covering a population of more than one million. Tracking SARS-CoV-2 concentrations in these treatment facilities during the COVID-19 pandemic and linking these concentrations to public health data demonstrate how wastewater-based epidemiology can be a rapid, cost-effective, and accurate measure of disease dynamics within a community.
    National Science Foundation (U.S.)