7 research outputs found

    Development of a Pipeline for Adverse Drug Reaction Identification in Clinical Notes: Word Embedding Models and String Matching

    BACKGROUND: Knowledge about adverse drug reactions (ADRs) in the population is limited because of underreporting, which hampers surveillance and assessment of drug safety. Therefore, gathering accurate information that can be retrieved from clinical notes about the incidence of ADRs is of great relevance. However, manual labeling of these notes is time-consuming, and automatization can improve the use of free-text clinical notes for the identification of ADRs. Furthermore, tools for language processing in languages other than English are not widely available. OBJECTIVE: The aim of this study is to design and evaluate a method for automatic extraction of medication and Adverse Drug Reaction Identification in Clinical Notes (ADRIN). METHODS: Dutch free-text clinical notes (N=277,398) and medication registrations (N=499,435) from the Cardiology Centers of the Netherlands database were used. All clinical notes were used to develop word embedding models. Vector representations of word embedding models and string matching with a medical dictionary (Medical Dictionary for Regulatory Activities [MedDRA]) were used for identification of ADRs and medication in a test set of clinical notes that were manually labeled. Several settings, including search area and punctuation, could be adjusted in the prototype to evaluate the optimal version of the prototype. RESULTS: The ADRIN method was evaluated using a test set of 988 clinical notes written on the stop date of a drug. Multiple versions of the prototype were evaluated for a variety of tasks. Binary classification of ADR presence achieved the highest accuracy of 0.84. Reduced search area and inclusion of punctuation improved performance, whereas incorporation of the MedDRA did not improve the performance of the pipeline. CONCLUSIONS: The ADRIN method and prototype are effective in recognizing ADRs in Dutch clinical notes from cardiac diagnostic screening centers. Surprisingly, incorporation of the MedDRA did not result in improved identification on top of word embedding models. The implementation of the ADRIN tool may help increase the identification of ADRs, resulting in better care and saving substantial health care costs.
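
The abstract above mentions string matching within an adjustable search area and the effect of keeping punctuation. The following is a minimal sketch of that general idea, not the ADRIN implementation: the term list `ADR_TERMS`, the function names, the `window` parameter, and the English example notes are all hypothetical, and the word embedding step is left out entirely.

```python
import re

# Hypothetical mini-lexicon; a real system would load a full dictionary.
ADR_TERMS = {"dizziness", "cough", "muscle pain"}
SENTENCE_END = {".", ";", "!", "?"}


def tokenize(note, keep_punctuation=True):
    """Lowercase the note and split it into word (and optionally punctuation) tokens."""
    pattern = r"\w+|[.;!?]" if keep_punctuation else r"\w+"
    return re.findall(pattern, note.lower())


def has_adr(note, drug, window=6, keep_punctuation=True):
    """Return True if an ADR term occurs in the search area around a drug mention.

    The search area is `window` tokens on each side of the drug name. When
    punctuation is kept, the area is cut at the nearest sentence boundary.
    """
    tokens = tokenize(note, keep_punctuation)
    drug = drug.lower()
    for i, token in enumerate(tokens):
        if token != drug:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        if keep_punctuation:
            # Shrink the area so it does not cross a sentence boundary.
            for j in range(i - 1, lo - 1, -1):
                if tokens[j] in SENTENCE_END:
                    lo = j + 1
                    break
            for j in range(i + 1, hi):
                if tokens[j] in SENTENCE_END:
                    hi = j
                    break
        area = " ".join(tokens[lo:hi])
        if any(re.search(r"\b" + re.escape(term) + r"\b", area) for term in ADR_TERMS):
            return True
    return False


def accuracy(predictions, labels):
    """Fraction of notes whose predicted ADR presence equals the manual label."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)
```

In this sketch, `has_adr("Stopped metoprolol. Cough after a cold last week.", "metoprolol")` keeps the search area inside the first sentence and so ignores the unrelated cough, whereas calling it with `keep_punctuation=False` lets the window run into the next sentence. That is one plausible way punctuation could affect performance; the abstract does not state the actual mechanism.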

    Deep neural networks reveal novel sex-specific electrocardiographic features relevant for mortality risk.

    AIMS: Incorporation of sex in study design can lead to discoveries in medical research. Deep neural networks (DNNs) accurately predict sex based on the electrocardiogram (ECG) and we hypothesized that misclassification of sex is an important predictor for mortality. Therefore, we first developed and validated a DNN that classified sex based on the ECG and investigated the outcome. Second, we studied ECG drivers of DNN-classified sex and mortality. METHODS AND RESULTS: A DNN was trained to classify sex based on 131 673 normal ECGs. The algorithm was validated on internal (68 500 ECGs) and external data sets (3303 and 4457 ECGs). The survival of sex (mis)classified groups was investigated using time-to-event analysis and sex-stratified mediation analysis of ECG features. The DNN successfully distinguished female from male ECGs {internal validation: area under the curve (AUC) 0.96 [95% confidence interval (CI): 0.96, 0.97]; external validations: AUC 0.89 (95% CI: 0.88, 0.90), 0.94 (95% CI: 0.93, 0.94)}. Sex-misclassified individuals (11%) had a 1.4 times higher mortality risk compared with correctly classified peers. The ventricular rate was the strongest mediating ECG variable (41%, 95% CI: 31%, 56%) in males, while the maximum amplitude of the ST segment was strongest in females (18%, 95% CI: 11%, 39%). Short QRS duration was associated with higher mortality risk. CONCLUSION: Deep neural networks accurately classify sex based on ECGs. While the proportion of ECG-based sex misclassifications is low, it is an interesting biomarker. Investigation of the causal pathway between misclassification and mortality uncovered new ECG features that might be associated with mortality. Increased emphasis on sex as a biological variable in artificial intelligence is warranted.

    Coronary calcification measures predict mortality in symptomatic women and men

    OBJECTIVE: To assess the prognostic value of absolute and sex-specific, age-specific and race/ethnicity-specific (Multi-Ethnic Study of Atherosclerosis, MESA) percentiles of coronary artery calcification in symptomatic women and men. METHODS: The study population consisted of 4985 symptomatic patients (2793 women, 56%) visiting a diagnostic outpatient cardiology clinic between 2009 and 2018 who were referred for cardiac CT to determine Coronary Artery Calcium Score (CACS). Regular care data were used and these data were linked to the databases of Statistics Netherlands for all-cause mortality data. Kaplan-Meier curves, multivariate Cox proportional hazards regression and concordance statistics were used to evaluate the prognostic value of CACS and MESA percentiles. RESULTS: Women were older compared with men (60 vs 59 years). Median CACS was 0 (IQR: 0-54) in women and 42 (IQR: 0-54) in men. After a median follow-up of 4.4 years (IQR: 3.1-6.3), 116 (2.3%; 53 women and 63 men) patients died. MESA percentiles did not perform better compared with absolute CACS (C-statistic 0.65, 95% CI 0.57 to 0.73, vs 0.66, 95% CI 0.58 to 0.74, in women and 0.59, 95% CI 0.51 to 0.67, vs 0.62, 95% CI 0.55 to 0.69, in men, for the percentiles and absolute CACS, respectively). CONCLUSIONS: In symptomatic individuals absolute CACS predicts mortality with a moderately good performance. MESA percentiles did not perform better compared with absolute CACS, thus there is no need to use them. Including degree of stenosis in the model might slightly improve mortality risk prediction in women, but not in men.
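
The C-statistics quoted above measure how well a risk score ranks patients by survival. As a rough illustration of how such a number can be computed, here is a minimal sketch of Harrell's concordance index for right-censored data. It is not the study's analysis code; the function name and the toy data in the usage note are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    scores : risk score per patient (higher means higher predicted risk)

    A pair (i, j) is comparable when patient i died strictly before
    patient j's follow-up ended. It is concordant when patient i also has
    the higher score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

For example, with hypothetical follow-up times `[2, 4, 6, 8]`, event flags `[1, 1, 0, 1]` and scores `[300, 0, 100, 10]`, five pairs are comparable and three of them are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.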
