Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical Notes
ICD coding from electronic clinical records is a manual, time-consuming and
expensive process. Code assignment is, however, an important task for billing
purposes and database organization. While many works have studied the problem
of automated ICD coding from free text using machine learning techniques, most
use records in the English language, especially from the MIMIC-III public
dataset. This work presents results for a dataset with Brazilian Portuguese
clinical notes. We develop and optimize a Logistic Regression model, a
Convolutional Neural Network (CNN), a Gated Recurrent Unit Neural Network and a
CNN with Attention (CNN-Att) for prediction of diagnosis ICD codes. We also
report our results for the MIMIC-III dataset, which outperform previous work
among models of the same families, as well as the state of the art. Compared to
MIMIC-III, the Brazilian Portuguese dataset contains far fewer words per
document, when only discharge summaries are used. We experiment with concatenating
additional documents available in this dataset, achieving a great boost in
performance. The CNN-Att model achieves the best results on both datasets, with
micro-averaged F1 score of 0.537 on MIMIC-III and 0.485 on our dataset with
additional documents. Comment: Accepted at BRACIS 202
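The micro-averaged F1 score reported above pools true positives, false positives and false negatives over all documents and codes before computing a single F1 value. The following is a minimal sketch of that metric for multi-label code assignment, not the authors' evaluation script; the function name `micro_f1` and the example ICD-10 codes are illustrative assumptions.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over per-document label sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # codes predicted and present in the gold set
        fp += len(p - g)   # codes predicted but absent from the gold set
        fn += len(g - p)   # gold codes the model missed
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when nothing is labeled or predicted
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: two notes with illustrative ICD-10 codes
gold = [{"I10", "E11"}, {"J18"}]
pred = [{"I10"}, {"J18", "N39"}]
score = micro_f1(gold, pred)
print(round(score, 3))
```

Because counts are pooled before averaging, frequent codes dominate the score, which is why micro-averaged F1 is often reported alongside a macro-averaged value.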
Estimating the health‐related quality of life of kidney stone patients: initial results from the Wisconsin Stone Quality of Life Machine‐Learning Algorithm (WISQOL‐MLA)
Semantic relatedness and similarity of biomedical terms: examining the effects of recency, size, and section of biomedical publications on the performance of word2vec
Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches
Background: Text mining and natural language processing of clinical text, such as notes from electronic health records, require specific consideration of the specialized characteristics of these texts. Deep learning methods could potentially mitigate domain-specific challenges such as limited access to in-domain tools and data sets. Methods: A bi-directional Long Short-Term Memory network is applied to clinical notes in Spanish and Swedish for the task of medical named entity recognition. Several types of embeddings, generated from both in-domain and out-of-domain text corpora, and a number of generation and combination strategies for embeddings have been evaluated in order to investigate different input representations and the influence of domain on the final results. Results: For Spanish, a micro-averaged F1-score of 75.25 was obtained and for Swedish, the corresponding score was 76.04. The best results for both languages were achieved using embeddings generated from in-domain corpora extracted from electronic health records, but embeddings generated from related domains were also found to be beneficial. Conclusions: A recurrent neural network with in-domain embeddings improved medical named entity recognition compared to shallow learning methods, showing this combination to be suitable for entity recognition in clinical text for both languages. The publication cost of this article was funded by Stockholm University Librar
Longitudinal cognitive biomarkers predicting symptom onset in presymptomatic frontotemporal dementia
Introduction: We performed 4-year follow-up neuropsychological assessment to investigate cognitive decline and the prognostic abilities from presymptomatic to symptomatic familial frontotemporal dementia (FTD). Methods: Presymptomatic MAPT (n = 15) and GRN mutation carriers (n = 31), and healthy controls (n = 39) underwent neuropsychological assessment every 2 years. Eight mutation carriers (5 MAPT, 3 GRN) became symptomatic. We investigated cognitive decline with multilevel regression modeling; the prognostic performance was assessed with ROC analyses and stepwise logistic regression. Results: MAPT converters declined on language, attention, executive function, social cognition, and memory, and GRN converters declined on attention and executive function (p < 0.05). Cognitive decline in ScreeLing phonology (p = 0.046) and letter fluency (p = 0.046) was predictive for conversion to non-fluent variant PPA, and decline on categorical fluency (p = 0.025) for an underlying MAPT mutation. Discussion: Using longitudinal neuropsychological assessment, we detected a mutation-specific pattern of cognitive decline, potentially suggesting prognostic value of neuropsychological trajectories in conversion to symptomatic FTD.
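The ROC analyses mentioned in the Methods summarize how well a score separates converters from non-converters. As a rough illustration only, the area under the ROC curve can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The function name `roc_auc` and the scores below are hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical decline scores; label 1 = converted, 0 = remained presymptomatic
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 0, 1, 1, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.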