69 research outputs found

    Evaluation of natural language processing from emergency department computerized medical records for intra-hospital syndromic surveillance

    Background: The identification of patients who pose an epidemic hazard when they are admitted to a health facility plays a role in preventing the risk of hospital acquired infection. An automated clinical decision support system to detect suspected cases, based on the principle of syndromic surveillance, is being developed at the University of Lyon's Hôpital de la Croix-Rousse. This tool will analyse structured data and narrative reports from computerized emergency department (ED) medical records. The first step consists of developing an application (UrgIndex) which automatically extracts and encodes information found in narrative reports. The purpose of the present article is to describe and evaluate this natural language processing system. Methods: Narrative reports have to be pre-processed before utilizing the French-language medical multi-terminology indexer (ECMT) for standardized encoding. UrgIndex identifies and excludes syntagmas containing a negation and replaces non-standard terms (abbreviations, acronyms, spelling errors, etc.). Then, the phrases are sent to the ECMT through an Internet connection. The indexer's reply, based on Extensible Markup Language, returns codes and literals corresponding to the concepts found in phrases. UrgIndex filters codes corresponding to suspected infections. Recall is defined as the number of relevant processed medical concepts divided by the number of concepts evaluated (coded manually by the medical epidemiologist). Precision is defined as the number of relevant processed concepts divided by the number of concepts proposed by UrgIndex. Recall and precision were assessed for respiratory and cutaneous syndromes. Results: Evaluation of 1,674 processed medical concepts contained in 100 ED medical records (50 for respiratory syndromes and 50 for cutaneous syndromes) showed an overall recall of 85.8% (95% CI: 84.1-87.3). Recall varied from 84.5% for respiratory syndromes to 87.0% for cutaneous syndromes. The most frequent cause of lack of processing was non-recognition of the term by UrgIndex (9.7%). Overall precision was 79.1% (95% CI: 77.3-80.8). It varied from 81.4% for respiratory syndromes to 77.0% for cutaneous syndromes. Conclusions: This study demonstrates the feasibility of and interest in developing an automated method for extracting and encoding medical concepts from ED narrative reports, the first step required for the detection of potentially infectious patients at epidemic risk.
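The recall and precision definitions given in this abstract can be sketched in a few lines of Python. This is only an illustration: the function name `recall_precision` and the counts passed to it are hypothetical and are not taken from the study.

```python
def recall_precision(relevant_processed, manually_coded, proposed):
    """Return (recall, precision) following the abstract's definitions.

    relevant_processed: concepts the tool processed correctly
    manually_coded:     concepts coded by the human reviewer (recall denominator)
    proposed:           concepts the tool proposed (precision denominator)
    """
    if manually_coded <= 0 or proposed <= 0:
        raise ValueError("denominators must be positive")
    recall = relevant_processed / manually_coded
    precision = relevant_processed / proposed
    return recall, precision


# Hypothetical counts for illustration only (not the study's data).
recall, precision = recall_precision(
    relevant_processed=80, manually_coded=100, proposed=90
)
print(f"recall={recall:.3f} precision={precision:.3f}")
```

Both measures share a numerator and differ only in the denominator, which is why a tool can score well on one and poorly on the other.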

    Reliability and validity of EMS dispatch code-based categorization of emergency patients for syndromic surveillance.

    A retrospective study involving the secondary analysis of public health surveillance records was undertaken to characterize the reliability and validity of an EMS dispatch data-based scheme for assigning emergency patients to surveillance syndromes in relation to two other schemes, one based on hospital ED clinicians' manual categorization according to patients' chief complaint and clinical presentation, and one based on ICD-9 coded hospital ED diagnoses. Comparisons of a sample of individual emergency patients' syndrome assignments according to the EMS versus each of the two hospital categorization schemes were made by matching EMS run records to their corresponding emergency department patient encounter records. This new, linked dataset was analyzed to assess the level of agreement beyond chance between the three possible pairs of syndrome categorization schemes in assigning patients to a respiratory or non-respiratory syndrome and to a gastrointestinal or non-gastrointestinal syndrome. Cohen's kappa statistics were used to measure chance-adjusted agreement between categorization schemes (raters). Z-tests and a chi-square-like test based on the variance of the kappa statistic were used to test the equivalence of kappa coefficients across syndromes, population subgroups and pairs of syndrome assignment schemes. The sensitivity, specificity, predictive value positive and predictive value negative of EMS dispatch and chief complaint-based categorization schemes were also calculated, using the ICD-9-coded ED diagnosis-based categorization scheme as the criterion standard. Comparisons of all performance characteristic (i.e. sensitivity, specificity, predictive value positive and predictive value negative) values were made across categorization schemes and surveillance syndromes to determine whether they were significantly different. 
The use of EMS dispatch codes for assigning emergency patients to surveillance syndromes was found to have limited but statistically significant reliability in relation to more commonly used syndrome grouping methods based on chief complaints or ICD-9 coded ED diagnoses. The reliability of EMS-based syndrome assignment varied significantly by syndrome, age group and comparison rater. When ICD-9 coded ED diagnosis-based grouping is taken as the criterion standard of syndrome definition, the validity of EMS-based syndrome assignment was limited but comparable to chief complaint-based assignment. The validity of EMS-based syndrome assignment varied significantly by syndrome
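Cohen's kappa, the chance-adjusted agreement measure named in this abstract, can be illustrated with a minimal Python sketch. The function name `cohens_kappa` and the two rating lists are hypothetical; they stand in for two categorization schemes labeling the same patients as respiratory (1) or non-respiratory (0).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical assignments (1 = respiratory, 0 = non-respiratory).
ems_scheme = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
ed_scheme = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(ems_scheme, ed_scheme)
print(f"kappa={kappa:.3f}")
```

In this toy case the raters agree on 8 of 10 patients, but chance alone would produce agreement on about half of them, so kappa is noticeably lower than the raw agreement rate.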

    Exploratory analysis of methods for automated classification of laboratory test orders into syndromic groups in veterinary medicine

    Background: Recent focus on earlier detection of pathogen introduction in human and animal populations has led to the development of surveillance systems based on automated monitoring of health data. Real- or near real-time monitoring of pre-diagnostic data requires automated classification of records into syndromes (syndromic surveillance) using algorithms that incorporate medical knowledge in a reliable and efficient way, while remaining comprehensible to end users. Methods: This paper describes the application of two machine learning methods (Naïve Bayes and Decision Trees) and rule-based methods to extract syndromic information from laboratory test requests submitted to a veterinary diagnostic laboratory. Results: High performance (F1-macro = 0.9995) was achieved through the use of a rule-based syndrome classifier, based on rule induction followed by manual modification during the construction phase, which also resulted in clear interpretability of the resulting classification process. An unmodified rule induction algorithm achieved an F1-micro score of 0.979, though this fell to 0.677 when performance for individual classes was averaged in an unweighted manner (F1-macro), because the algorithm failed to learn 3 of the 16 classes from the training set. Decision Trees showed equal interpretability to the rule-based approaches, but achieved an F1-micro score of 0.923 (falling to 0.311 when classes are given equal weight). A Naïve Bayes classifier learned all classes and achieved high performance (F1-micro = 0.994 and F1-macro = 0.955); however, the classification process is not transparent to the domain experts. Conclusion: The use of a manually customised rule set allowed for the development of a system for classification of laboratory tests into syndromic groups with very high performance, and high interpretability by the domain experts. 
Further research is required to develop internal validation rules in order to establish automated methods to update model rules without user input
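The gap between F1-micro and F1-macro described in this abstract arises whenever a classifier never predicts a rare class. A minimal sketch, assuming single-label classification; the function name `f1_micro_macro` and the class labels are hypothetical and the data are invented for illustration.

```python
def f1_micro_macro(y_true, y_pred):
    """Return (micro-F1, macro-F1) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool counts over classes, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of per-class F1, so each class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical data: the rare class "B" is never predicted.
y_true = ["A"] * 9 + ["B"]
y_pred = ["A"] * 10
micro, macro = f1_micro_macro(y_true, y_pred)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

Here the unlearned class contributes an F1 of zero to the macro average while barely affecting the pooled micro score, which mirrors the pattern the abstract reports for the unmodified rule induction and Decision Tree classifiers.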

    A Space–Time Permutation Scan Statistic for Disease Outbreak Detection

    BACKGROUND: The ability to detect disease outbreaks early is important in order to minimize morbidity and mortality through timely implementation of disease prevention and control measures. Many national, state, and local health departments are launching disease surveillance systems with daily analyses of hospital emergency department visits, ambulance dispatch calls, or pharmacy sales for which population-at-risk information is unavailable or irrelevant. METHODS AND FINDINGS: We propose a prospective space–time permutation scan statistic for the early detection of disease outbreaks that uses only case numbers, with no need for population-at-risk data. It makes minimal assumptions about the time, geographical location, or size of the outbreak, and it adjusts for natural purely spatial and purely temporal variation. The new method was evaluated using daily analyses of hospital emergency department visits in New York City. Four of the five strongest signals were likely local precursors to citywide outbreaks due to rotavirus, norovirus, and influenza. The number of false signals was at most modest. CONCLUSION: If such results hold up over longer study times and in other locations, the space–time permutation scan statistic will be an important tool for local and national health departments that are setting up early disease detection surveillance systems
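The idea of using only case numbers can be sketched as follows. The abstract does not give formulas, so this sketch assumes the commonly described formulation of the space-time permutation model: the expected count for a zone and day is proportional to the product of the zone total and the day total, and a candidate cluster is scored with a likelihood ratio. The function names, the toy counts, and the chosen cluster are hypothetical, and the Monte Carlo permutation step used to assess significance is omitted.

```python
import math


def expected_counts(counts):
    """counts[z][d] = cases in zone z on day d; returns expected counts."""
    total = sum(sum(row) for row in counts)
    zone_totals = [sum(row) for row in counts]
    day_totals = [sum(col) for col in zip(*counts)]
    # Expected cases if each zone's share of cases were the same every day.
    return [[zt * dt / total for dt in day_totals] for zt in zone_totals]


def log_likelihood_ratio(counts, zones, days):
    """Score a candidate cluster given by sets of zone and day indices."""
    total = sum(sum(row) for row in counts)
    mu = expected_counts(counts)
    observed = sum(counts[z][d] for z in zones for d in days)
    expected = sum(mu[z][d] for z in zones for d in days)
    if observed <= expected:
        return 0.0  # only excesses of cases are of interest
    score = observed * math.log(observed / expected)
    if total > observed:
        score += (total - observed) * math.log(
            (total - observed) / (total - expected)
        )
    return score


# Hypothetical daily counts for two zones over three days.
counts = [[2, 2, 2],
          [2, 2, 8]]
mu = expected_counts(counts)
score = log_likelihood_ratio(counts, zones={1}, days={2})
print(f"expected in cluster={mu[1][2]:.2f} score={score:.3f}")
```

Because the expected counts are built from the row and column totals of the case table itself, no population-at-risk data are needed, and purely spatial or purely temporal differences are absorbed into those totals.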

    Modeling emergency department visit patterns for infectious disease complaints: results and application to disease surveillance

    BACKGROUND: Concern over bio-terrorism has led to recognition that traditional public health surveillance for specific conditions is unlikely to provide timely indication of some disease outbreaks, either naturally occurring or induced by a bioweapon. In non-traditional surveillance, the use of health care resources is monitored in near real-time for the first signs of an outbreak, such as increases in emergency department (ED) visits for respiratory, gastrointestinal or neurological chief complaints (CC). METHODS: We collected ED CCs from 2/1/94 – 5/31/02 as a training set. A first-order model was developed for each of seven CC categories by accounting for long-term, day-of-week, and seasonal effects. We assessed predictive performance on subsequent data from 6/1/02 – 5/31/03, compared CC counts to predictions and confidence limits, and identified anomalies (simulated and real). RESULTS: Each CC category exhibited significant day-of-week differences. For most categories, counts peaked on Monday. There were seasonal cycles in both respiratory and undifferentiated infection complaints, and the season-to-season variability in peak date was summarized using a hierarchical model. For example, the average peak date for respiratory complaints was January 22, with a season-to-season standard deviation of 12 days. This season-to-season variation makes it challenging to predict respiratory CCs, so we focused our effort and discussion on prediction performance for this difficult category. Total ED visits increased over the study period by 4%, but respiratory complaints decreased by roughly 20%, illustrating that long-term averages in the data set need not reflect future behavior in data subsets. CONCLUSION: We found that ED CCs provided timely indicators for outbreaks. 
Our approach led to successful identification of a respiratory outbreak one-to-two weeks in advance of reports from the state-wide sentinel flu surveillance and of a reported increase in positive laboratory test results

    Assessing and improving the accuracy of surveillance case definitions using administrative data

    BACKGROUND Keeping pace with the rapidly evolving demands of infectious disease monitoring requires constant advances in surveillance methodology and infrastructure. A promising new method is syndromic surveillance, where health department staff, assisted by automated data acquisition and statistical alerts, monitor health indicators in near real-time. Several syndromic surveillance systems use diagnoses in administrative databases. However, physician claim diagnoses are not audited, and the effect of diagnostic coding variation on surveillance case definitions is not known. Furthermore, syndromic surveillance systems are limited by high false-positive (FP) rates. Almost no effort has been made to reduce FP rates by improving the positive predictive value (PPV) of surveilled data. OBJECTIVES 1) To evaluate the feasibility of identifying syndrome cases using diagnoses in physician claims. 2) To assess the accuracy of syndrome definitions based on diagnoses in physician claims. 3) To identify physician, patient, encounter and billing characteristics associated with the PPV of syndrome definitions. METHODS & RESULTS STUDY 1: We focused on a subset of diagnoses from a single syndrome (respiratory). We compared cases and non-cases identified from physician claims to medical charts. A convenience sample of 9 Montreal-area family physicians participated. 3,526 visits among 729 patients were abstracted from medical charts and linked to physician claims. The sensitivity and PPV of physician claims for identifying respiratory infections were 0.49, 95%CI (0.45, 0.53) and 0.93, 95%CI (0.91, 0.94). This pilot work demonstrated the feasibility of the proposed method and contributed to planning a full-scale validation of several syndrome definitions. STUDY 2: We focused on 5 syndromes: fever, gastrointestinal, neurological, rash, and respiratory. 
We selected a random sample of 3,600 physicians practicing in the province of Quebec in 2005-2007, then a stratified random sample of 10 visits per physician from their claims. We obtained chart diagnoses for all sampled visits through double-blinded chart reviews. Sensitivity, specificity, PPV, and negative predictive value (NPV) of syndrome definitions based on diagnoses in physician claims were estimated by comparison to chart review. 1,098 (30.5%) physicians completed the chart review and 10,529 visits were validated. The sensitivity of syndrome definitions ranged from 0.11, 95%CI (0.10, 0.13) for fever to 0.44, 95%CI (0.41, 0.47) for respiratory syndrome. The specificity and NPV were high for all syndromes. The PPV ranged from 0.59, 95%CI (0.55, 0.64) for fever to 0.85, 95%CI (0.83, 0.88) for respiratory syndrome. STUDY 3: We focused on the 4,330 syndrome cases identified from the claims of the 1,098 physicians who participated in study 2. We estimated the association between claim-chart agreement and physician, patient, encounter and billing characteristics using multivariate logistic regression. The likelihood of the medical chart agreeing with the physician claim about the presence of a syndrome was higher when the physician had billed many visits for the same syndrome recently (RR per 10 visits, 1.05; 95%CI, 1.01-1.08), had a lower workload (RR per 10 claims, 0.93; 95%CI, 0.90-0.97), and when the patient was younger (RR per 5 years, 0.96; 95%CI, 0.94-0.97) and less socially deprived (RR most vs least deprived, 0.76; 95%CI, 0.60-0.95). CONCLUSIONS This was the first population-based validation of syndromic surveillance case definitions based on diagnoses in physician claims. We found that the sensitivity of syndrome definitions was low, the PPV was moderate to high, and the specificity and NPV were high. 
We identified several physician, patient, encounter and billing characteristics associated with the PPV of syndrome definitions, many of which are readily accessible to public health departments and could be used to reduce the FP rate of syndromic surveillance systems.
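The four accuracy measures reported in this abstract (sensitivity, specificity, PPV and NPV) all derive from a 2x2 table comparing the claim diagnosis with the chart review taken as the reference. A minimal sketch follows; the function name `diagnostic_accuracy` and the counts are hypothetical and are not the study's data.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return accuracy measures from a 2x2 table against a reference standard.

    tp: claim positive, chart positive    fp: claim positive, chart negative
    fn: claim negative, chart positive    tn: claim negative, chart negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # chart-confirmed cases found by claims
        "specificity": tn / (tn + fp),  # chart non-cases correctly excluded
        "ppv": tp / (tp + fp),          # claim-positive visits confirmed by chart
        "npv": tn / (tn + fn),          # claim-negative visits confirmed by chart
    }


# Hypothetical counts chosen to show low sensitivity alongside high PPV.
metrics = diagnostic_accuracy(tp=40, fp=10, fn=60, tn=890)
for name, value in metrics.items():
    print(f"{name}={value:.3f}")
```

With these invented counts the definition misses most true cases (low sensitivity) while most flagged visits are genuine (high PPV), the same qualitative pattern the abstract describes.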

    Instituting a Regional Syndromic Surveillance System: Barriers and Opportunities

    Syndromic surveillance is a relatively new tool being explored for early detection of disease outbreaks in communities. To signal an early outbreak, syndromic surveillance utilizes nontraditional indicators such as over-the-counter drug sales, physician and emergency room visits, laboratory tests ordered, absenteeism and calls to nurse hotlines or poison control centers. Methodological issues, costs, legal issues, technological issues and lack of rigorous evaluation may all be barriers to instituting syndromic surveillance within a local region. However, exploring the feasibility of developing this system within a region can bring opportunities for increased communication and understanding between public health, medical providers and the emergency response community. Nurses can be instrumental in facilitating this process. Master of Public Health

    Beyond traditional surveillance: applying syndromic surveillance to developing settings – opportunities and challenges

    Background: All countries need effective disease surveillance systems for early detection of outbreaks. The revised International Health Regulations [IHR], which entered into force for all 194 World Health Organization member states in 2007, have expanded traditional infectious disease notification to include surveillance for public health events of potential international importance, even if the causative agent is not yet known. However, there are no clearly established guidelines for how countries should conduct this surveillance, which types of emerging disease syndromes should be reported, nor any means for enforcement. Discussion: The commonly established concept of syndromic surveillance in developed regions encompasses the use of pre-diagnostic information in a near real time fashion for further investigation for public health action. Syndromic surveillance is widely used in North America and Europe, and is typically thought of as a highly complex, technology driven automated tool for early detection of outbreaks. Nonetheless, low technology applications of syndromic surveillance are being used worldwide to augment traditional surveillance. Summary: In this paper, we review examples of these novel applications in the detection of vector-borne diseases, foodborne illness, and sexually transmitted infections. We hope to demonstrate that syndromic surveillance in its basic version is a feasible and effective tool for surveillance in developing countries and may facilitate compliance with the new IHR guidelines.