43 research outputs found

    Estimating influenza incidence using search query deceptiveness and generalized ridge regression

    Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data. These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression. We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically selected features perform as well as or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates. Comment: 27 pages, 8 figures
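
As a rough illustration of the idea named in the title above, generalized ridge regression gives each feature its own penalty instead of one shared penalty. The sketch below is not the authors' implementation: the function name `generalized_ridge`, the synthetic data, the `deceptiveness` scores, and the rule that maps deceptiveness to a penalty are all assumptions made for the example.

```python
import numpy as np

def generalized_ridge(X, y, penalties):
    """Minimize ||y - X b||^2 + sum_j penalties[j] * b[j]**2 in closed form."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    lam = np.diag(np.asarray(penalties, dtype=float))  # one penalty per feature
    return np.linalg.solve(X.T @ X + lam, X.T @ y)

# Synthetic data (hypothetical): one informative feature, one coincidental one.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
informative = signal + 0.1 * rng.normal(size=n)
coincidental = rng.normal(size=n)
X = np.column_stack([informative, coincidental])
y = 2.0 * signal + 0.1 * rng.normal(size=n)

# Hypothetical deceptiveness scores in [0, 1]; a larger score means a larger penalty.
deceptiveness = np.array([0.05, 0.95])
penalties = 100.0 * deceptiveness  # assumed mapping, chosen only for illustration

beta_plain = generalized_ridge(X, y, np.zeros(2))   # ordinary least squares
beta_weighted = generalized_ridge(X, y, penalties)  # deceptiveness-weighted fit
print("unpenalized:", beta_plain)
print("weighted:   ", beta_weighted)
```

With all penalties set to zero the function reduces to ordinary least squares, and with equal penalties it reduces to standard ridge regression; the per-feature form is what allows a deceptiveness estimate to shrink suspect features more strongly than trusted ones.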

    Epidemiological data challenges: planning for a more robust future through data standards

    Accessible epidemiological data are of great value for emergency preparedness and response, understanding disease progression through a population, and building statistical and mechanistic disease models that enable forecasting. The status quo, however, renders acquiring and using such data difficult in practice. In many cases, a primary way of obtaining epidemiological data is through the internet, but the methods by which the data are presented to the public often differ drastically among institutions. As a result, there is a strong need for better data sharing practices. This paper identifies, in detail and with examples, the three key challenges one encounters when attempting to acquire and use epidemiological data: 1) interfaces, 2) data formatting, and 3) reporting. These challenges are used to provide suggestions and guidance for improvement as these systems evolve in the future. If these suggested data and interface recommendations were adhered to, epidemiological and public health analysis, modeling, and informatics work would be significantly streamlined, which can in turn yield better public health decision-making capabilities. Comment: v2 includes several typo fixes; v3 adds a paragraph on backfill; v4 adds 2 new paragraphs to the conclusion that address Frontiers reviewer comments; v5 adds some minor modifications that address additional reviewer comments

    Salivary microbiomes of indigenous Tsimane mothers and infants are distinct despite frequent premastication

    Background: Premastication, the transfer of pre-chewed food, is a common infant and young child feeding practice among the Tsimane, forager-horticulturalists living in the Bolivian Amazon. Research conducted primarily with Western populations has shown that infants harbor distinct oral microbiota from their mothers. Premastication, which is less common in these populations, may influence the colonization and maturation of infant oral microbiota, including via transmission of oral pathogens. We collected premasticated food and saliva samples from Tsimane mothers and infants (9–24 months of age) to test for evidence of bacterial transmission in premasticated foods and overlap in maternal and infant salivary microbiota. We extracted bacterial DNA from two premasticated food samples and 12 matched salivary samples from maternal-infant pairs. DNA sequencing was performed with MiSeq (Illumina). We evaluated maternal and infant microbial composition in terms of relative abundance of specific taxa, alpha and beta diversity, and dissimilarity distances.
    Results: The bacteria in saliva and premasticated food were mapped to 19 phyla and 400 genera and were dominated by Firmicutes, Proteobacteria, Actinobacteria, and Bacteroidetes. The oral microbial communities of Tsimane mothers and infants who frequently share premasticated food were well-separated in a non-metric multi-dimensional scaling (NMDS) ordination plot. Infant microbiotas clustered together, with weighted UniFrac distances significantly differing between mothers and infants. Infant saliva contained more Firmicutes (p < 0.01) and fewer Proteobacteria (p < 0.05) than did maternal saliva. Many genera previously associated with dental and periodontal infections, e.g., Neisseria, Gemella, Rothia, Actinomyces, Fusobacterium, and Leptotrichia, were more abundant in mothers than in infants.
    Conclusions: Salivary microbiota of Tsimane infants and young children up to two years of age do not appear closely related to those of their mothers, despite frequent premastication and preliminary evidence that maternal bacteria are transmitted to premasticated foods. Infant physiology and diet may constrain colonization by maternal bacteria, including several oral pathogens.

    The Biosurveillance Analytics Resource Directory (BARD): Facilitating the Use of Epidemiological Models for Infectious Disease Surveillance

    Epidemiological modeling for infectious disease is important for disease management, and its routine implementation needs to be facilitated through better description of models in an operational context. A standardized model characterization process that allows users to select or manually compare available models and their results is currently lacking. A key need is a universal framework to facilitate model description and understanding of model features. Los Alamos National Laboratory (LANL) has developed a comprehensive framework that can be used to characterize an infectious disease model in an operational context. The framework was developed through a consensus among a panel of subject matter experts. In this paper, we describe the framework, its application to model characterization, and the development of the Biosurveillance Analytics Resource Directory (BARD; http://brd.bsvgateway.org/brd/), to facilitate the rapid selection of operational models for specific infectious/communicable diseases. We offer this framework and associated database to stakeholders of the infectious disease modeling field as a tool for standardizing model description and facilitating the use of epidemiological models.

    Evaluation of Point of Need Diagnostic Tests for Use in California Influenza Outbreaks

    Because of the potential threats flu viruses pose, the United States, like many developed countries, has a well-established flu surveillance system consisting of 10 components collecting laboratory data, mortality data, hospitalization data, and sentinel outpatient care data. Currently, this surveillance system is estimated to lag behind the actual seasonal outbreak by one to two weeks. As new data streams come online, it is important to understand what added benefit they bring to the flu surveillance system complex. For data streams to be effective, they should provide data in a more timely fashion or provide additional data that current surveillance systems cannot provide. Two multiplexed diagnostic tools designed to test syndromically relevant pathogens and wirelessly upload data for rapid integration and interpretation were evaluated to see how they fit into the influenza surveillance scheme in California.

    Even a good influenza forecasting model can benefit from internet-based nowcasts, but those benefits are limited.

    The ability to produce timely and accurate flu forecasts in the United States can significantly impact public health. Augmenting forecasts with internet data has shown promise for improving forecast accuracy and timeliness in controlled settings, but results in practice are less convincing, as models augmented with internet data have not consistently outperformed models without internet data. In this paper, we perform a controlled experiment, taking into account data backfill, to clarify the benefits and limitations of augmenting an already good flu forecasting model with internet-based nowcasts. Our results show that a good flu forecasting model can benefit from the augmentation of internet-based nowcasts in practice for all considered public health-relevant forecasting targets. The degree of forecast improvement due to nowcasting, however, is uneven across forecasting targets, with short-term forecasting targets seeing the largest improvements and seasonal targets such as the peak timing and intensity seeing relatively marginal improvements. The uneven forecasting improvements across targets hold even when "perfect" nowcasts are used. These findings suggest that further improvements to flu forecasting, particularly seasonal targets, will need to derive from other, non-nowcasting approaches.