90 research outputs found

On the predictability of infectious disease outbreaks

Author: Petri Giovanni
Scarpino Samuel V.
Publication date: 10/10/2018

Infectious disease outbreaks recapitulate biology: they emerge from the multi-level interaction of hosts, pathogens, and their shared environment. As a result, predicting when, where, and how far diseases will spread requires a complex systems approach to modeling. Recent studies have demonstrated that predicting different components of outbreaks--e.g., the expected number of cases, pace and tempo of cases needing treatment, demand for prophylactic equipment, importation probability, etc.--is feasible. Therefore, advancing both the science and practice of disease forecasting now requires testing for the presence of fundamental limits to outbreak prediction. To investigate the question of outbreak prediction, we study the information theoretic limits to forecasting across a broad set of infectious diseases using permutation entropy as a model independent measure of predictability. Studying the predictability of a diverse collection of historical outbreaks--including chlamydia, dengue, gonorrhea, hepatitis A, influenza, measles, mumps, polio, and whooping cough--we identify a fundamental entropy barrier for infectious disease time series forecasting. However, we find that for most diseases this barrier to prediction is often well beyond the time scale of single outbreaks. We also find that the forecast horizon varies by disease and demonstrate that both shifting model structures and social network heterogeneity are the most likely mechanisms for the observed differences across contagions. Our results highlight the importance of moving beyond time series forecasting, by embracing dynamic modeling approaches, and suggest challenges for performing model selection across long time series. We further anticipate that our findings will contribute to the rapidly growing field of epidemiological forecasting and may relate more broadly to the predictability of complex adaptive systems.
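The abstract above names permutation entropy as its predictability measure but does not say how it is computed. As a rough illustration only, the sketch below implements the standard normalized permutation entropy of a time series (ordinal patterns of a sliding window). The function name, the default `order` and `delay`, and the tie-breaking rule are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy of a sequence, in [0, 1].

    Illustrative sketch: `order` is the window length and `delay` the
    spacing between sampled points; both defaults are assumptions.
    """
    n_windows = len(series) - (order - 1) * delay
    if order < 2 or n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + j * delay] for j in range(order)]
        # Ordinal pattern: the indices that sort the window.
        # sorted() is stable, so ties are broken by position.
        pattern = tuple(sorted(range(order), key=lambda idx: window[idx]))
        patterns[pattern] += 1

    # Shannon entropy (in bits) of the ordinal-pattern distribution.
    entropy = sum(
        -(count / n_windows) * math.log2(count / n_windows)
        for count in patterns.values()
    )
    # Divide by the maximum possible entropy, log2(order!).
    return entropy / math.log2(math.factorial(order))
```

Under this definition a strictly increasing series has a single ordinal pattern and therefore an entropy of 0, while values near 1 indicate that all orderings occur about equally often, which corresponds to low predictability.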

arXiv.org e-Print Archive

Directory of Open Access Journals

Epidemiological consequences of an ineffective Bordetella pertussis vaccine

Author: Althouse Benjamin M.
Scarpino Samuel V.
Publication date: 28/02/2014

The recent increase in Bordetella pertussis incidence (whooping cough) presents a challenge to global health. Recent studies have called into question the effectiveness of acellular B. pertussis vaccination in reducing transmission. Here we examine the epidemiological consequences of an ineffective B. pertussis vaccine. Using a dynamic transmission model, we find that: 1) an ineffective vaccine can account for the observed increase in B. pertussis incidence; 2) asymptomatic infections can bias surveillance and upset situational awareness of B. pertussis; and 3) vaccinating individuals in close contact with infants too young to receive vaccine (so-called "cocooning" unvaccinated children) may be ineffective. Our results have important implications for B. pertussis vaccination policy and paint a complicated picture for achieving herd immunity and possible B. pertussis eradication. Comment: 7 pages, 3 figures, with supplemen

arXiv.org e-Print Archive

A message-passing approach for recurrent-state epidemic models on networks

Author: Moore Cristopher
Scarpino Samuel V.
Shrestha Munik
Publication venue: 'American Physical Society (APS)'
Publication date: 08/05/2015

Epidemic processes are common out-of-equilibrium phenomena of broad interdisciplinary interest. Recently, dynamic message-passing (DMP) has been proposed as an efficient algorithm for simulating epidemic models on networks, and in particular for estimating the probability that a given node will become infectious at a particular time. To date, DMP has been applied exclusively to models with one-way state changes, as opposed to models like SIS (susceptible-infectious-susceptible) and SIRS (susceptible-infectious-recovered-susceptible) where nodes can return to previously inhabited states. Because many real-world epidemics can exhibit such recurrent dynamics, we propose a DMP algorithm for complex, recurrent epidemic models on networks. Our approach takes correlations between neighboring nodes into account while preventing causal signals from backtracking to their immediate source, and thus avoids "echo chamber effects" where a pair of adjacent nodes each amplify the probability that the other is infectious. We demonstrate that this approach well approximates results obtained from Monte Carlo simulation and that its accuracy is often superior to the pair approximation (which also takes second-order correlations into account). Moreover, our approach is more computationally efficient than the pair approximation, especially for complex epidemic models: the number of variables in our DMP approach grows as $2mk$, where $m$ is the number of edges and $k$ is the number of states, as opposed to $mk^2$ for the pair approximation. We suspect that the resulting reduction in computational effort, as well as the conceptual simplicity of DMP, will make it a useful tool in epidemic modeling, especially for inference tasks where there is a large parameter space to explore. Comment: 12 pages, 8 figures
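To make the scaling comparison in this abstract concrete, the sketch below treats the two quoted growth rates as exact counts, which is an assumption since the abstract only says how the counts grow. The function names and the example network size are hypothetical.

```python
def dmp_variable_count(num_edges, num_states):
    """Variables under the quoted DMP scaling: 2 * m * k."""
    return 2 * num_edges * num_states


def pair_approx_variable_count(num_edges, num_states):
    """Variables under the quoted pair-approximation scaling: m * k**2."""
    return num_edges * num_states ** 2


# Hypothetical example: a network with 1,000 edges and a
# three-state model such as SIRS (k = 3).
edges, states = 1000, 3
print(dmp_variable_count(edges, states))          # 2 * 1000 * 3
print(pair_approx_variable_count(edges, states))  # 1000 * 3**2
```

Under these scalings the ratio of the pair-approximation count to the DMP count is k/2, so the saving grows with the number of states in the model.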

arXiv.org e-Print Archive

multiDimBio: An R Package for the Design, Analysis, and Visualization of Systems Biology Experiments

Author: Crews David
Gillette Ross
Scarpino Samuel V.
Publication date: 02/04/2014

The past decade has witnessed a dramatic increase in the size and scope of biological and behavioral experiments. These experiments are providing an unprecedented level of detail and depth of data. However, this increase in data presents substantial statistical and graphical hurdles to overcome, namely how to distinguish signal from noise and how to visualize multidimensional results. Here we present a series of tools designed to support a research project from inception to publication. We provide implementation of dimension reduction techniques and visualizations that function well with the types of data often seen in animal behavior studies. This package is designed to be used with experimental data but can also be used for experimental design and sample justification. The goal for this project is to create a package that will evolve over time, thereby remaining relevant and reflective of current methods and techniques.

arXiv.org e-Print Archive

Prudent behaviour accelerates disease transmission

Author: Allard Antoine
Hebert-Dufresne Laurent
Scarpino Samuel V.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/09/2015

Infectious diseases often spread faster near their peak than would be predicted given early data on transmission. Despite the commonality of this phenomenon, there are no known general mechanisms able to cause an exponentially spreading disease to begin spreading faster. Indeed, most features of real-world social networks, e.g., clustering [1,2] and community structure [3], and of human behaviour, e.g., social distancing [4] and increased hygiene [5], will slow disease spread. Here, we consider a model where individuals with essential societal roles (e.g., teachers, first responders, and health-care workers) who fall ill are replaced with healthy individuals. We refer to this process as relational exchange. Relational exchange is also a behavioural process, but one whose effect on disease transmission is less obvious. By incorporating this behaviour into a dynamic network model, we demonstrate that replacing individuals can accelerate disease transmission. Furthermore, we find that the effects of this process are trivial when considering a standard mass-action model, but dramatic when considering network structure. This result highlights another critical shortcoming in mass-action models, namely their inability to account for behavioural processes. Lastly, using empirical data, we find that this mechanism parsimoniously explains observed patterns across more than seventeen years of influenza and dengue virus data. We anticipate that our findings will advance the emerging field of disease forecasting and will better inform public health decision making during outbreaks.

arXiv.org e-Print Archive

Estimation with Binned Data

Author: Holas Igor
Scarpino Samuel V.
von Hippel Paul T.
Publication date: 02/10/2012

Variables such as household income are sometimes binned, so that we only know how many households fall in each of several bins such as $0-10,000, $10,000-15,000, or $200,000+. We provide a SAS macro that estimates the mean and variance of binned data by fitting the extended generalized gamma (EGG) distribution, the power normal (PN) distribution, and a new distribution that we call the power logistic (PL). The macro also implements a "best-of-breed" estimator that chooses from among the EGG, PN, and PL estimates on the basis of likelihood and finite variance. We test the macro by estimating the mean family and household incomes of approximately 13,000 US school districts between 1970 and 2009. The estimates have negligible bias (0-2%) and a root mean squared error of just 3-6%. The estimates compare favorably with estimates obtained by fitting the Dagum, generalized beta (GB2), or logspline distributions. Comment: 16 pages + 2 tables + 4 figures

arXiv.org e-Print Archive

The Interhospital Transfer Network for Very Low Birth Weight Infants in the United States

Author: Edwards Erika M.
Greenberg Lucy T.
Horbar Jeffrey D.
Scarpino Samuel V.
Shrestha Munik
Publication date: 02/07/2018

Very low birth weight (VLBW) infants require specialized care in neonatal intensive care units. In the United States (U.S.), such infants frequently are transferred between hospitals. Although these neonatal transfer networks are important, both economically and for infant morbidity and mortality, the national-level pattern of neonatal transfers is largely unknown. Using data from Vermont Oxford Network on 44,753 births, 2,122 hospitals, and 9,722 inter-hospital infant transfers from 2015, we performed the largest analysis to date on the inter-hospital transfer network for VLBW infants in the U.S. We find that transfers are organized around regional communities, but that despite being largely within state boundaries, most communities often contain at least two hospitals in different states. To classify the structural variation in transfer pattern amongst these communities, we applied a spectral measure for regionalization and found an association between a community's degree of regionalization and their infant transfer rate, which was not utilized in detecting communities. We also demonstrate that the established measures of network centrality and hierarchy, e.g., the community-wide entropy in PageRank or betweenness centrality and number of distinct 'layers' within a community, correlate weakly with our regionalization index and were not significantly associated with metrics on infant transfer rate. Our results suggest that the regionalization index captures novel information about the structural properties of VLBW infant transfer networks, have the practical implication of characterizing neonatal care in the U.S., and may apply more broadly to the role of centralizing forces in organizing complex adaptive systems.

arXiv.org e-Print Archive

Interacting contagions are indistinguishable from social reinforcement

Author: Hébert-Dufresne Laurent
Scarpino Samuel V.
Young Jean-Gabriel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/06/2019

From fake news to innovative technologies, many contagions spread via a process of social reinforcement, where multiple exposures are distinct from prolonged exposure to a single source. Contrarily, biological agents such as Ebola or measles are typically thought to spread as simple contagions. Here, we demonstrate that interacting simple contagions are indistinguishable from complex contagions. In the social context, our results highlight the challenge of identifying and quantifying mechanisms, such as social reinforcement, in a world where an innumerable amount of ideas, memes and behaviors interact. In the biological context, this parallel allows the use of complex contagions to effectively quantify the non-trivial interactions of infectious diseases. Comment: Supplementary Material containing details of our simulation and inference procedures is available as an ancillary fil

arXiv.org e-Print Archive

Robust estimation of inequality from binned incomes

Author: Holas Igor
Scarpino Samuel V.
von Hippel Paul T.
Publication venue: 'SAGE Publications'
Publication date: 06/06/2016

Researchers must often estimate income inequality using data that give only the number of cases (e.g., families or households) whose incomes fall in "bins" such as $0-9,999, $10,000-14,999, ..., $200,000+. We find that popular methods for estimating inequality from binned incomes are not robust in small samples, where popular methods can produce infinite, undefined, or arbitrarily large estimates. To solve these and other problems, we develop two improved estimators: the robust Pareto midpoint estimator (RPME) and the multimodel generalized beta estimator (MGBE). In a broad evaluation using US national, state, and county data from 1970 to 2009, we find that both estimators produce very good estimates of the mean and Gini, but less accurate estimates of the Theil and mean log deviation. Neither estimator is uniformly more accurate, but the RPME is much faster, which may be a consideration when many estimates must be obtained from many datasets. We have made the methods available as the rpme and mgbe commands for Stata and the binequality package for R. Comment: 39 pages, 7 tables, 7 figures

arXiv.org e-Print Archive

Socioeconomic bias in influenza surveillance

Author: Clements Bruce
Dimitrov Nedialko B.
Eggo Rosalind M.
Meyers Lauren Ancel
Scarpino Samuel V.
Scott James G.
Publication date: 01/04/2018

Individuals in low socioeconomic brackets are considered at-risk for developing influenza-related complications and often exhibit higher than average influenza-related hospitalization rates. This disparity has been attributed to various factors, including restricted access to preventative and therapeutic health care, limited sick leave, and household structure. Adequate influenza surveillance in these at-risk populations is a critical precursor to accurate risk assessments and effective intervention. However, the United States of America's primary national influenza surveillance system (ILINet) monitors outpatient healthcare providers, which may be largely inaccessible to lower socioeconomic populations. Recent initiatives to incorporate internet-source and hospital electronic medical records data into surveillance systems seek to improve the timeliness, coverage, and accuracy of outbreak detection and situational awareness. Here, we use a flexible statistical framework for integrating multiple surveillance data sources to evaluate the adequacy of traditional (ILINet) and next generation (BioSense 2.0 and Google Flu Trends) data for situational awareness of influenza across poverty levels. We find that zip codes in the highest poverty quartile are a critical blind spot for ILINet that the integration of next generation data fails to ameliorate.

arXiv.org e-Print Archive