Assessing Information Congruence of Documented Cardiovascular Disease between Electronic Dental and Medical Records
Dentists increasingly treat patients with Cardiovascular Disease (CVD) in their clinics and may therefore need to alter treatment plans when CVD is present. However, it is unclear to what extent patient-reported CVD information is accurately captured in Electronic Dental Records (EDRs). In this pilot study, we aimed to measure the reliability of patient-reported CVD conditions in EDRs. We assessed information congruence by comparing patients' self-reported dental histories to the original diagnoses assigned by their medical providers in the Electronic Medical Record (EMR). To enable this comparison, we encoded patients' CVD information from the free-text data of EDRs into a structured format using natural language processing (NLP). Overall, our NLP approach achieved promising performance in extracting patients' CVD-related information. We observed disagreement between self-reported EDR data and physician-diagnosed EMR data.
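The encoding step described in this abstract — mapping free-text CVD mentions to a structured format — could, in its simplest form, be sketched as a dictionary lookup. The term list, codes, and function below are illustrative assumptions, not the study's actual NLP pipeline:

```python
import re

# Hypothetical term-to-code dictionary; the study's real lexicon is not given here.
CVD_TERMS = {
    "hypertension": "HTN",
    "high blood pressure": "HTN",
    "congestive heart failure": "CHF",
    "myocardial infarction": "MI",
    "heart attack": "MI",
}

def extract_cvd(note):
    """Return the set of normalized CVD codes mentioned in a free-text note."""
    text = note.lower()
    return {
        code
        for term, code in CVD_TERMS.items()
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    }

print(sorted(extract_cvd("Pt reports high blood pressure and a prior heart attack.")))
# → ['HTN', 'MI']
```

A production pipeline would also need negation handling ("denies chest pain") and abbreviation expansion, which this sketch omits.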
The Impact of Automatic Pre-annotation in Clinical Note Data Element Extraction - the CLEAN Tool
Objective. Annotation is expensive but essential for clinical note review and clinical natural language processing (cNLP). However, the extent to which computer-generated pre-annotation benefits human annotation is still an open question. Our study introduces CLEAN (CLinical note rEview and ANnotation), a pre-annotation-based cNLP annotation system that improves clinical note annotation of data elements, and comprehensively compares CLEAN with the widely used Brat Rapid Annotation Tool (BRAT).
Materials and Methods. CLEAN includes an ensemble pipeline (CLEAN-EP) with a newly developed annotation tool (CLEAN-AT). A domain expert and a novice annotator participated in a comparative usability test, tagging 87 data elements related to Congestive Heart Failure (CHF) and Kawasaki Disease (KD) cohorts in 84 public notes.
Results. CLEAN achieved a higher note-level F1-score (0.896) than BRAT (0.820), with a significant difference in correctness (P < 0.001); the factor most strongly associated with this difference was the system/software (P < 0.001). No significant difference (P = 0.188) in annotation time was observed between CLEAN (7.262 minutes/note) and BRAT (8.286 minutes/note); annotation time was mostly associated with note length (P < 0.001) and system/software (P = 0.013). The expert reported CLEAN to be useful and satisfactory, while the novice reported slight improvements.
Discussion. CLEAN improves the correctness of annotation and increases usefulness and satisfaction at the same level of efficiency. Limitations include the untested impact of the pre-annotation correctness rate, the small sample size, the small number of users, and a gold standard with only limited validation.
Conclusion. CLEAN with pre-annotation can be beneficial for an expert dealing with complex annotation tasks involving numerous and diverse target data elements.
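The note-level F1-score reported above compares a system's annotations against a gold standard. A minimal sketch of such a score, assuming exact-match comparison of annotated data elements per note (the paper's precise matching criteria are not reproduced here):

```python
def note_f1(predicted, gold):
    """F1-score over one note's annotated data elements (exact-match sets).

    predicted, gold: sets of data-element annotations for the same note.
    """
    tp = len(predicted & gold)  # elements both the system and gold agree on
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy example: 2 of 3 predictions match 2 of 3 gold elements.
print(round(note_f1({"CHF", "HTN", "MI"}, {"CHF", "HTN", "KD"}), 3))
# → 0.667
```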
Ensembles of NLP Tools for Data Element Extraction from Clinical Notes.
Natural Language Processing (NLP) is essential for concept extraction from narrative text in electronic health records (EHRs). To extract numerous and diverse concepts, such as data elements (i.e., important concepts related to a certain medical condition), a plausible solution is to combine various NLP tools into an ensemble to improve extraction performance. However, it is unclear to what extent ensembles of popular NLP tools improve the extraction of numerous and diverse concepts. Therefore, we built an NLP ensemble pipeline to synergize the strengths of popular NLP tools using seven ensemble methods, and to quantify the improvement in performance achieved by ensembles in the extraction of data elements for three very different cohorts. Evaluation results show that the pipeline can improve the performance of NLP tools, but with high variability depending on the cohort.
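One common way to combine the outputs of several NLP tools is majority voting: keep a concept only if enough tools extracted it. This is a generic sketch of one such method, not a reproduction of the paper's seven ensemble methods; the tool outputs below are hypothetical:

```python
from collections import Counter

def majority_vote(tool_outputs, threshold=None):
    """Ensemble by voting: keep a concept extracted by >= `threshold` tools.

    tool_outputs: list of sets, one set of extracted concepts per NLP tool.
    threshold: minimum vote count; defaults to a strict majority.
    """
    if threshold is None:
        threshold = len(tool_outputs) // 2 + 1
    counts = Counter(concept for output in tool_outputs for concept in output)
    return {concept for concept, votes in counts.items() if votes >= threshold}

# Three hypothetical tools; only "CHF" is extracted by a majority (3 of 3).
tools = [{"CHF", "HTN"}, {"CHF"}, {"CHF", "KD"}]
print(sorted(majority_vote(tools)))
# → ['CHF']
```

Lowering the threshold to 1 yields the union of all tool outputs (higher recall, lower precision); raising it toward the number of tools yields the intersection (the opposite trade-off), which is one reason ensemble performance can vary across cohorts.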