
    Natural language processing to extract medical problems from electronic clinical documents: Performance evaluation

    Abstract: In this study, we evaluate the performance of a Natural Language Processing (NLP) application designed to extract medical problems from narrative-text clinical documents. The documents come from a patient’s electronic medical record, and the extracted medical problems are proposed for inclusion in the patient’s electronic problem list. The application was developed to help maintain the problem list and make it more accurate, complete, and up-to-date. The NLP component of the system, analyzed in this study, uses the UMLS MetaMap Transfer (MMTx) application and a negation detection algorithm called NegEx to extract 80 different medical problems selected for their frequency of use in our institution. When using MMTx with its default data set, we measured a recall of 0.74 and a precision of 0.756. A custom data subset for MMTx was then created, making the application faster and significantly improving recall to 0.896, with a non-significant reduction in precision.
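    For orientation only, here is a minimal sketch of how recall and precision figures like those reported above are typically computed for extracted problems against a reference standard; the data structures, document identifiers, and problem strings below are hypothetical and are not taken from the paper’s evaluation code (Python).

    # Minimal sketch: precision/recall over (document_id, problem) pairs.
    # All identifiers and example values are invented for illustration.
    def precision_recall(predicted: set, reference: set) -> tuple[float, float]:
        true_positives = len(predicted & reference)
        precision = true_positives / len(predicted) if predicted else 0.0
        recall = true_positives / len(reference) if reference else 0.0
        return precision, recall

    # Hypothetical system output vs. a physician-built reference problem list
    predicted = {("doc1", "diabetes mellitus"), ("doc1", "hypertension"), ("doc2", "asthma")}
    reference = {("doc1", "diabetes mellitus"), ("doc2", "asthma"), ("doc2", "copd")}
    p, r = precision_recall(predicted, reference)
    print(f"precision={p:.3f} recall={r:.3f}")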

    Evaluating the informatics for integrating biology and the bedside system for clinical research

    Background: Selecting patient cohorts is a critical, iterative, and often time-consuming aspect of studies involving human subjects; informatics tools that help streamline the process have been identified as important infrastructure components for enabling clinical and translational research. We describe the evaluation of a free and open-source cohort selection tool from the Informatics for Integrating Biology and the Bedside (i2b2) group: the i2b2 hive.
    Methods: Our evaluation covered the usability and functionality of the i2b2 hive using several real-world examples of research data requests received electronically at the University of Utah Health Sciences Center between 2006 and 2008. The hive server component and the visual query tool application were evaluated for their suitability as a cohort selection tool on the basis of the types of data elements requested, as well as the effort required to fulfill each research data request using the i2b2 hive alone.
    Results: We found the i2b2 hive suitable for obtaining estimates of cohort sizes and for generating research cohorts based on simple inclusion/exclusion criteria, which accounted for about 44% of the clinical research data requests sampled at our institution. Data requests whose inclusion/exclusion criteria relied on post-coordinated clinical concepts, aggregate values of clinical findings, or temporal conditions could not be fulfilled using the i2b2 hive alone and required one or more intermediate data steps in the form of pre- or post-processing, modifications to the hive metadata, etc.
    Conclusion: The i2b2 hive was found to be a useful cohort selection tool for fulfilling common types of requests for research data, especially for estimating initial cohort sizes. An institution that wants to use the i2b2 hive for clinical research should have structured, coded clinical data and metadata that can be transformed to fit the logical data models of the i2b2 hive, strategies for extracting relevant clinical data from source systems, and the ability to perform substantial pre- and post-processing of these data.
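    As a hedged illustration of the "simple inclusion/exclusion criteria" the abstract says the i2b2 hive handles well, the snippet below filters a toy patient list by a coded diagnosis and an age threshold; the records, codes, and field names are invented and do not reflect i2b2’s actual data model or query tool (Python).

    # Hypothetical coded patient records; not i2b2's star-schema data model.
    patients = [
        {"id": 1, "dx_codes": {"E11.9"}, "age": 67},         # type 2 diabetes
        {"id": 2, "dx_codes": {"I10"}, "age": 45},            # hypertension only
        {"id": 3, "dx_codes": {"E11.9", "I10"}, "age": 72},   # both
    ]

    # Inclusion: coded type 2 diabetes diagnosis; exclusion: age under 50.
    cohort = [p["id"] for p in patients if "E11.9" in p["dx_codes"] and p["age"] >= 50]
    print(f"estimated cohort size: {len(cohort)}")  # -> 2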

    An Extensible Evaluation Framework Applied to Clinical Text Deidentification Natural Language Processing Tools: Multisystem and Multicorpus Study

    Background: Clinical natural language processing (NLP) researchers need access to directly comparable evaluation results for applications such as text deidentification across a range of corpus types, along with the means to easily test new systems or corpora within the same framework. Current systems, reported metrics, and the personally identifiable information (PII) categories evaluated are not easily comparable.
    Objective: This study presents an open-source and extensible end-to-end framework for comparing clinical NLP system performance across corpora, even when the annotation categories do not align.
    Methods: As a use case for this framework, we use 6 off-the-shelf text deidentification systems (CliniDeID, deid from PhysioNet, MITRE Identity Scrubber Toolkit [MIST], NeuroNER, National Library of Medicine [NLM] Scrubber, and Philter) across 3 standard clinical text corpora for the task (2 of which are publicly available) and 1 private corpus (all in English), with annotation categories that are not directly analogous. The framework is built on shell scripts that can be extended to include new systems, corpora, and performance metrics. We present this open tool, multiple means for aligning PII categories during evaluation, and our initial timing and performance metric findings. Code for running this framework, with all settings needed to run all system-corpus pairs, is available via Codeberg and GitHub.
    Results: From this case study, we found large differences in processing speed between systems. The fastest system (MIST) processed an average of 24.57 (SD 26.23) notes per second, while the slowest (CliniDeID) processed an average of 1.00 notes per second. No system uniformly outperformed the others at identifying PII across corpora and categories; instead, distinct performance trade-offs emerged across PII categories. CliniDeID and Philter prioritize recall over precision (with average recall 6.9 and 11.2 points higher, respectively, for partially matching spans of text in any PII category), while the other 4 systems consistently have higher precision (MIST’s precision is 20.2 points higher, NLM Scrubber’s 4.4 points higher, NeuroNER’s 7.2 points higher, and deid’s 17.1 points higher). The macroaverage recall across corpora for identifying names, one of the more sensitive PII categories, ranged from deid (48.8%) and MIST (66.9%) at the low end to NeuroNER (84.1%), NLM Scrubber (88.1%), and CliniDeID (95.9%) at the high end. A variety of metrics across categories and corpora are reported, with a wider variety (e.g., F2-score) available via the tool.
    Conclusions: NLP systems in general, and the deidentification systems and corpora in our use case, tend to be evaluated in stand-alone research articles that include only a limited set of comparators. We hold that a single evaluation pipeline across multiple systems and corpora allows for more nuanced comparisons. Our open pipeline should reduce barriers to evaluation and system advancement.
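    To make two ideas from this abstract concrete (mapping non-aligned PII annotation categories onto a shared set before scoring, and macroaveraging recall across corpora), here is a small hedged sketch; the label mapping and per-corpus counts are invented, and the actual framework is shell-script based and distributed via Codeberg and GitHub (Python).

    # Map system- or corpus-specific labels onto a shared PII category set.
    # The mapping and counts below are illustrative, not from the study.
    LABEL_MAP = {"PATIENT": "NAME", "DOCTOR": "NAME", "PHONE": "CONTACT", "FAX": "CONTACT"}

    def normalize(label: str) -> str:
        return LABEL_MAP.get(label, label)

    def recall(true_positives: int, false_negatives: int) -> float:
        denom = true_positives + false_negatives
        return true_positives / denom if denom else 0.0

    # Invented per-corpus (TP, FN) counts for the shared NAME category.
    name_counts = {"corpus_a": (190, 10), "corpus_b": (84, 16), "corpus_c": (45, 5)}

    # Macroaverage: mean of per-corpus recalls, each corpus weighted equally.
    macro_recall = sum(recall(tp, fn) for tp, fn in name_counts.values()) / len(name_counts)
    assert normalize("DOCTOR") == normalize("PATIENT") == "NAME"
    print(f"macroaveraged NAME recall: {macro_recall:.3f}")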