4,405 research outputs found

Global disease monitoring and forecasting with Wikipedia

Author: Del Valle Sara Y.
Deshpande Alina
Fairchild Geoffrey
Generous Nicholas
Priedhorsky Reid
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 15/07/2014

Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data such as social media and search queries are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with r^2 up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art.

Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarit
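
As a rough illustration of the kind of measurement the abstract above reports, the sketch below fits a one-predictor least-squares line between a page-view series and a case-count series shifted by a lead time, and reports r^2. It is a simplification under stated assumptions, not the authors' procedure: the function names, the single-predictor model, and the `VIEWS` and `CASES` series are all hypothetical, and the numbers are synthetic.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def lagged_r_squared(views, cases, lead_days):
    """Fit cases on day t + lead_days against views on day t."""
    if lead_days < 0:
        raise ValueError("lead_days must be non-negative")
    if lead_days > 0:
        # Drop the unmatched tail of views and head of cases.
        views, cases = views[:-lead_days], cases[lead_days:]
    return r_squared(views, cases)


# Synthetic series (not real data): cases follow views two days later.
VIEWS = [10, 12, 15, 20, 26, 30, 28, 22, 17, 13]
CASES = [23, 24, 25, 29, 35, 45, 57, 65, 61, 49]

if __name__ == "__main__":
    for lead in (0, 1, 2, 3):
        print(lead, round(lagged_r_squared(VIEWS, CASES, lead), 3))
```

Comparing r^2 across lead times is one simple way to check whether a signal has forecasting value; in this toy data the fit is best at a lead of two days because the series were constructed that way.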

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

PubMed Central

FigShare

Deep-Learning for Classification of Colorectal Polyps on Whole-Slide Images

Author: Hassanpour Saeed
Korbar Bruno
Miraflor Allen P.
Nicka Katherine M.
Olofson Andrea M.
Suriawinata Arief A.
Suriawinata Matthew A.
Torresani Lorenzo
Publication date: 01/01/2017

Histopathological characterization of colorectal polyps is an important principle for determining the risk of colorectal cancer and future rates of surveillance for patients. This characterization is time-intensive, requires years of specialized training, and suffers from significant inter-observer and intra-observer variability. In this work, we built an automatic image-understanding method that can accurately classify different types of colorectal polyps in whole-slide histology images to help pathologists with histopathological characterization and diagnosis of colorectal polyps. The proposed image-understanding method is based on deep-learning techniques, which rely on numerous levels of abstraction for data representation and have shown state-of-the-art results for various image analysis tasks. Our image-understanding method covers all five polyp types (hyperplastic polyp, sessile serrated polyp, traditional serrated adenoma, tubular adenoma, and tubulovillous/villous adenoma) that are included in the US multi-society task force guidelines for colorectal cancer risk assessment and surveillance, and encompasses the most common occurrences of colorectal polyps. Our evaluation on 239 independent test samples shows our proposed method can identify the types of colorectal polyps in whole-slide images with a high efficacy (accuracy: 93.0%, precision: 89.7%, recall: 88.3%, F1 score: 88.8%). The presented method in this paper can reduce the cognitive burden on pathologists and improve their accuracy and efficiency in histopathological characterization of colorectal polyps, and in subsequent risk assessment and follow-up recommendations
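
For readers unfamiliar with the four figures quoted in the abstract above, the following sketch computes accuracy, precision, recall, and F1 for a multi-class classifier. The abstract does not say how the per-class values were averaged, so macro-averaging here is an assumption; the function name and the toy labels are hypothetical and are not the study's data.

```python
from collections import Counter


def classification_metrics(true_labels, predicted_labels):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    k = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(true_labels),
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


# Toy labels for illustration only.
TRUE = ["hyperplastic", "hyperplastic", "tubular", "tubular", "sessile", "sessile"]
PRED = ["hyperplastic", "tubular", "tubular", "tubular", "sessile", "hyperplastic"]

if __name__ == "__main__":
    print(classification_metrics(TRUE, PRED))
```

Macro-averaging weights every polyp type equally regardless of how many slides it has, which is why the averaged precision and recall can differ from overall accuracy.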

arXiv.org e-Print Archive

Directory of Open Access Journals

Doctor of Philosophy

Author: Davis Kailah T.
Publication venue: University of Utah
Publication date: 01/08/2014

Public health surveillance systems are crucial for the timely detection and response to public health threats. Since the terrorist attacks of September 11, 2001, and the release of anthrax in the following month, there has been a heightened interest in public health surveillance. The years immediately following these attacks were met with increased awareness and funding from the federal government which has significantly strengthened the United States surveillance capabilities; however, despite these improvements, there are substantial challenges faced by today's public health surveillance systems. Problems with the current surveillance systems include: a) lack of leveraging unstructured public health data for surveillance purposes; and b) lack of information integration and the ability to leverage resources, applications or other surveillance efforts due to systems being built on a centralized model. This research addresses these problems by focusing on the development and evaluation of new informatics methods to improve the public health surveillance. To address the problems above, we first identified a current public surveillance workflow which is affected by the problems described and has the opportunity for enhancement through current informatics techniques. The 122 Mortality Surveillance for Pneumonia and Influenza was chosen as the primary use case for this dissertation work. The second step involved demonstrating the feasibility of using unstructured public health data, in this case death certificates. For this we created and evaluated a pipeline composed of a detection rule and natural language processor, for the coding of death certificates and the identification of pneumonia and influenza cases. The second problem was addressed by presenting the rationale of creating a federated model by leveraging grid technology concepts and tools for the sharing and epidemiological analyses of public health data.
As a case study of this approach, a secured virtual organization was created where users are able to access two grid data services, using death certificates from the Utah Department of Health, and two analytical grid services, MetaMap and R. A scientific workflow was created using the published services to replicate the mortality surveillance workflow. To validate these approaches, and provide proofs-of-concepts, a series of real-world scenarios were conducted

The University of Utah: J. Willard Marriott Digital Library

Investigating Sociodemographic Disparities in Cancer Risk Using Web-Based Informatics

Author: Tourassi Georgia
Yoon Hong-Jun
Publication venue: 'Purdue University (bepress)'
Publication date: 24/01/2018

Cancer health disparities due to demographic and socioeconomic factors are an area of great interest in the epidemiological community. Adjusting for such factors is important when developing cancer risk models. However, for digital epidemiology studies relying on online sources such information is not readily available. This paper presents a novel method for extracting demographic and socioeconomic information from openly available online obituaries. The method relies on tailored language processing rules and a probabilistic scheme to map subjects’ occupation history to the occupation classification codes and related earnings provided by the U.S. Census Bureau. Using this information, a case-control study is executed fully in silico to investigate how age, gender, parity, and income level impact breast and lung cancer risk. Based on 48,368 online obituaries (4,643 for breast cancer, 6,274 for lung cancer, and 37,451 cancer-free) collected automatically and a generalized cancer risk model, our study shows strong association between age, parity, and socioeconomic status and cancer risk. Although for breast cancer the observed trends are very consistent with traditional epidemiological studies, some inconsistency is observed for lung cancer with respect to socioeconomic status

Purdue E-Pubs

Public Health and Epidemiology Informatics: Recent Research and Trends in the United States

Author: Dixon Brian E.
Kharrazi H.
Lehmann H. P.
Publication venue: 'Georg Thieme Verlag KG'
Publication date: 01/01/2015

Objectives To survey advances in public health and epidemiology informatics over the past three years. Methods We conducted a review of English-language research works conducted in the domain of public health informatics (PHI), and published in MEDLINE between January 2012 and December 2014, where information and communication technology (ICT) was a primary subject, or a main component of the study methodology. Selected articles were synthesized using a thematic analysis using the Essential Services of Public Health as a typology. Results Based on themes that emerged, we organized the advances into a model where applications that support the Essential Services are, in turn, supported by a socio-technical infrastructure that relies on government policies and ethical principles. That infrastructure, in turn, depends upon education and training of the public health workforce, development that creates novel or adapts existing infrastructure, and research that evaluates the success of the infrastructure. Finally, the persistence and growth of infrastructure depends on financial sustainability. Conclusions Public health informatics is a field that is growing in breadth, depth, and complexity. Several Essential Services have benefited from informatics, notably, “Monitor Health,” “Diagnose & Investigate,” and “Evaluate.” Yet many Essential Services still have not yet benefited from advances such as maturing electronic health record systems, interoperability amongst health information systems, analytics for population health management, use of social media among consumers, and educational certification in clinical informatics. There is much work to be done to further advance the science of PHI as well as its impact on public health practice

IUPUIScholarWorks

PubMed Central

Enhancing Drug Overdose Mortality Surveillance through Natural Language Processing and Machine Learning

Author: Ward Patrick J.
Publication venue: UKnowledge
Publication date: 01/01/2021

Epidemiological surveillance is key to monitoring and assessing the health of populations. Drug overdose surveillance has become an increasingly important part of public health practice as overdose morbidity and mortality has increased due in large part to the opioid crisis. Monitoring drug overdose mortality relies on death certificate data, which has several limitations including timeliness and the coding structure used to identify specific substances that caused death. These limitations stem from the need to analyze the free-text cause-of-death sections of the death certificate that are completed by the medical certifier during death investigation. Other fields, including clinical sciences, have utilized natural language processing (NLP) methods to gain insight from free-text data, but thus far, adoption of NLP methods in epidemiological surveillance has been limited. Through a narrative review of NLP methods currently used in public health surveillance and the integration of two NLP tasks, classification and named entity recognition, this dissertation enhances the capabilities of public health practitioners and researchers to perform drug overdose mortality surveillance. This dissertation advances both surveillance science and public health practice by integrating methods from bioinformatics into the surveillance pipeline which provides more timely and increased quality overdose mortality surveillance, which is essential to guiding effective public health response to the continuing drug overdose epidemic

University of Kentucky

Addendum to Informatics for Health 2017: Advancing both science and practice

Author: Cornet Ronald
McCowan Colin
Peek Niels
Scott Philip
Publication venue: 'BCS Learning and Development Limited'
Publication date: 01/10/2017

This article presents presentation and poster abstracts that were mistakenly omitted from the original publication

Informatics in Primary Care (BCS, The Chartered Institute for IT)

Directory of Open Access Journals

Enlighten

University of St. Andrews - Pure

Extracting information from the text of electronic medical records to improve case detection: a systematic review

Author: Donia Scott
Elizabeth Ford
Helen E Smith
Jackie A Cassell
John A Carroll
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2016

Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall)
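
The comparison described in the abstract above (codes alone versus codes plus text) can be sketched on a toy scale as follows. Everything here is hypothetical: the record layout, the `DX_PNEUMONIA` code, the keyword rule, and the four records are invented for illustration, and the simple keyword match ignores negation ("no pneumonia"), which real systems must handle.

```python
import re

# Hypothetical structured code and free-text keyword rule for one condition.
TARGET_CODE = "DX_PNEUMONIA"
KEYWORD = re.compile(r"\bpneumonia\b", re.IGNORECASE)

# Invented records: structured codes, free text, and a reference-standard label.
RECORDS = [
    {"codes": {"DX_PNEUMONIA"}, "text": "Admitted with cough.", "is_case": True},
    {"codes": set(), "text": "Chest x-ray consistent with pneumonia.", "is_case": True},
    {"codes": set(), "text": "Routine follow-up, no complaints.", "is_case": False},
    {"codes": set(), "text": "Productive cough and fever.", "is_case": True},
]


def detect_by_codes(record):
    """Flag a record when the structured code is present."""
    return TARGET_CODE in record["codes"]


def detect_by_codes_and_text(record):
    """Flag a record when either the code or the keyword is present."""
    return detect_by_codes(record) or bool(KEYWORD.search(record["text"]))


def sensitivity(records, detector):
    """Share of true cases that the detector flags (recall)."""
    cases = [r for r in records if r["is_case"]]
    return sum(detector(r) for r in cases) / len(cases)


def positive_predictive_value(records, detector):
    """Share of flagged records that are true cases (precision)."""
    flagged = [r for r in records if detector(r)]
    return sum(r["is_case"] for r in flagged) / len(flagged) if flagged else 0.0


if __name__ == "__main__":
    for name, det in [("codes", detect_by_codes),
                      ("codes + text", detect_by_codes_and_text)]:
        print(name, sensitivity(RECORDS, det),
              positive_predictive_value(RECORDS, det))
```

In this toy set the text rule recovers one case the code misses, while a third case is missed by both, mirroring the pattern the review reports in which text raises sensitivity without reaching full coverage.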

Crossref

PubMed Central

Sussex Research Online

International Society for Disease Surveillance Conference 2011: Building the Future of Public Health Surveillance

Author: Reidpath Daniel
Sarran Christophe
Soyiri Ireneous
Publication venue: Co-Action Publishing
Publication date: 01/01/2011

Daniel Reidpath - ORCID: 0000-0002-8796-0420 https://orcid.org/0000-0002-8796-0420

Repository@Hull - Worktribe

Crossref

PubMed Central

Queen Margaret University eResearch

Public health surveillance : preparing for the future


Suggested citation: Office of Public Health Scientific Services. Centers for Disease Control and Prevention. Public Health Surveillance: Preparing for the Future. Atlanta, GA: Centers for Disease Control and Prevention; September 2018.

CDC Stacks