3,235 research outputs found

    Enhance Representation Learning of Clinical Narrative with Neural Networks for Clinical Predictive Modeling

    Medicine is undergoing a technological revolution. Understanding human health from clinical data poses major challenges from technical and practical perspectives, prompting methods that can handle large, complex, and noisy data. Such methods are particularly necessary for natural language data from clinical narratives/notes, which contain some of the richest information on a patient. Meanwhile, deep neural networks have achieved superior performance in a wide variety of natural language processing (NLP) tasks because of their capacity to encode meaningful yet abstract representations and to learn tasks end-to-end. In this thesis, I investigate representation learning of clinical narratives with deep neural networks through tasks ranging from clinical concept extraction and clinical note modeling to patient-level language representation. I present methods that use representation learning with neural networks to support understanding of clinical text documents. I first introduce the notion of representation learning from the perspectives of natural language processing and patient data modeling. Then, I investigate word-level representation learning to improve clinical concept extraction from clinical notes. I present two studies on learning word representations and evaluate them on extracting important concepts from clinical notes: the first focuses on cancer-related information, and the second evaluates shared-task data; both aim to automatically extract important entities from clinical notes. Next, I present a series of deep neural networks that encode hierarchical, longitudinal, and contextual information for modeling a series of clinical notes, and I evaluate the models by predicting clinical outcomes of interest, including mortality, length of stay, and phenotypes. Finally, I propose a novel representation learning architecture to develop a generalized and transferable language representation at the patient level, and I identify pre-training tasks appropriate for constructing such a generalizable representation. The main focus is to improve phenotype prediction when labeled data are limited, a challenging setting. Overall, this dissertation addresses issues in natural language processing for medicine, including clinical text classification and modeling. These studies highlight major barriers to understanding large-scale clinical notes. Developing deep representation learning methods for distilling enormous amounts of heterogeneous data into patient-level language representations is expected to improve evidence-based clinical understanding, and the approach of learning representations could be applied across clinical applications despite noisy data. I conclude that accounting for different linguistic components of natural language and for the sequential structure of clinical events is important. These results have implications beyond the immediate context of the predictions and suggest future directions for clinical machine learning research to improve clinical outcomes; they could serve as a starting point for future NLP-based phenotyping methods that construct patient-level language representations to improve clinical predictions. While significant progress has been made, many open questions remain, so I highlight a few works that demonstrate promising directions.
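    The thesis abstract includes no code; as a minimal, purely illustrative sketch of word-level representation learning for clinical concept extraction (not the author's actual models), the snippet below tags tokens with BIO concept labels using a BiLSTM over learned word embeddings. Vocabulary size, tag count, and dimensions are hypothetical assumptions.

```python
# Minimal sketch (not the thesis's code) of word-level representation learning
# for clinical concept extraction: a BiLSTM tagger over token embeddings that
# emits per-token BIO labels (e.g. B-PROBLEM, I-PROBLEM, O). Vocabulary size,
# tag set, and dimensions are hypothetical placeholders.
import torch
import torch.nn as nn

class ConceptTagger(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, hidden_dim=128, num_tags=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        embedded = self.embed(token_ids)        # (batch, seq_len, emb_dim)
        contextual, _ = self.encoder(embedded)  # (batch, seq_len, 2*hidden_dim)
        return self.classifier(contextual)      # per-token tag logits

# Toy usage: one "note" of 6 tokens; random indices stand in for real text.
model = ConceptTagger()
tokens = torch.randint(1, 20000, (1, 6))
predicted_tags = model(tokens).argmax(dim=-1)   # (1, 6) BIO tag ids
```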

    A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review

    Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared with more recent, fast-evolving Artificial Intelligence (AI) methods. This study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods, in (i) office-based and (ii) stress-test laboratory settings. Methods: A total of 265 CVD-based studies were selected using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Given their popularity and recent development, the study analyzed the three paradigms within machine learning (ML) frameworks. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended to mobile and cloud-based infrastructure. Findings: The most popular biomarkers were office-based, laboratory-based, and image-based phenotypes, along with medication usage. Surrogate carotid scanning for coronary artery risk prediction has shown promising results. Ground-truth (GT) selection for AI-based training, along with scientific and clinical validation, is very important for CVD stratification to avoid RoB. The most popular classification paradigm was multiclass, followed by ensemble and multi-label methods. The use of deep learning techniques in CVD risk stratification is at a very early stage of development. Mobile and cloud-based AI technologies are likely to be the future. Conclusions: AI-based methods for CVD risk assessment are the most promising and successful. The choice of GT is vital in AI-based models to prevent RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
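    The review itself is narrative and ships no code; purely to illustrate the three paradigms it compares, the sketch below fits a multiclass, a multi-label, and an ensemble classifier on synthetic "risk factor" data with scikit-learn. The feature matrix, labels, and class meanings are made-up placeholders, not taken from any reviewed study.

```python
# Illustrative sketch of the three CVD risk-stratification paradigms compared
# in the review: multiclass, multi-label, and ensemble classification.
# All data, labels, and class meanings here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))              # 8 office/lab/image-derived features

# (i) Multiclass: exactly one label per patient, e.g. low / medium / high risk.
y_multiclass = rng.integers(0, 3, size=200)
multiclass_model = RandomForestClassifier(random_state=0).fit(X, y_multiclass)

# (ii) Multi-label: several binary outcomes predicted jointly per patient.
y_multilabel = rng.integers(0, 2, size=(200, 2))
multilabel_model = MultiOutputClassifier(
    LogisticRegression(max_iter=1000)).fit(X, y_multilabel)

# (iii) Ensemble: combine heterogeneous base learners by soft voting.
ensemble_model = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0)),
                ("lr", LogisticRegression(max_iter=1000))],
    voting="soft",
).fit(X, y_multiclass)

print(multiclass_model.predict(X[:3]))     # one risk class per patient
print(multilabel_model.predict(X[:3]))     # a vector of binary outcomes each
print(ensemble_model.predict(X[:3]))       # voted risk class per patient
```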

    Deep learning-enabled technologies for bioimage analysis.

    Deep learning (DL) is a subfield of machine learning (ML) that has recently demonstrated its potential to significantly improve quantification and classification workflows in biomedical and clinical applications. Among the end applications benefiting most profoundly from DL, cellular morphology quantification is one of the pioneers. Here, we first briefly explain fundamental concepts in DL and then review some of the emerging DL-enabled applications in cell morphology quantification in embryology, point-of-care ovulation testing, prediction of fetal heart pregnancy, cancer diagnostics via classification of cancer histology images, and assessment of autosomal polycystic kidney disease and chronic kidney diseases.
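    No code accompanies this review; as a rough sketch of the kind of DL model it surveys for cell morphology quantification, the snippet below defines a small convolutional classifier over single-cell image patches. The patch size, channel count, and two morphology classes are arbitrary assumptions rather than details from any cited application.

```python
# Rough sketch of a DL model for cell-morphology classification: a small CNN
# over grayscale image patches. The 64x64 patch size, single channel, and two
# morphology classes are hypothetical assumptions.
import torch
import torch.nn as nn

class CellMorphologyCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        # x: (batch, 1, 64, 64) image patches
        return self.head(self.features(x))

# Toy usage on a random batch of 4 patches.
model = CellMorphologyCNN()
patches = torch.randn(4, 1, 64, 64)
class_logits = model(patches)               # (4, 2) morphology-class logits
```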

    Computer-based identification of relationships between medical concepts and cluster analysis in clinical notes

    Clinical notes contain information about medical concepts or entities (such as diseases, treatments, and drugs) that provide a comprehensive overall impression of the patient's health. The automatic extraction of these entities is relevant for health experts and researchers, as it enables them to identify associations between the entities. However, automatically extracting information from clinical notes is challenging due to their narrative format. This research describes a process to automatically extract and aggregate medical entities from clinical notes, as well as a process to identify clusters of patients and disease-treatment relationships. The i2b2 2008 Obesity dataset was used, which consists of 1237 discharge summaries of overweight and diabetic patients; this thesis therefore focuses on obesity-related diseases. For the automatic extraction of medical entities, MetaMap and cTAKES were used, and the extraction capacity of the two tools was compared; UMLS enabled the aggregation of the extracted entities. Two approaches were applied for cluster analysis. First, a sparse K-means algorithm was applied to a patient-disease matrix with 14 comorbidities related to obesity. Second, to visualize and analyze other diseases present in the clinical notes, 86 diseases were used to identify clusters of patients with a network-based approach. Furthermore, bipartite graphs were used to explore disease-treatment relationships among some of the clusters obtained. The results of the experiments show cTAKES slightly outperforming MetaMap, although this could change with other configuration options in the respective tools, such as an abbreviation list. Moreover, concept aggregation (across similar and different semantic types) was shown to be a good strategy for improving medical entity extraction. Sparse K-means enabled identification of three types of clusters (high, medium, and low), based on the number of comorbidities and the percentage of patients suffering from them. These results show that diabetes, hypercholesterolemia, atherosclerotic cardiovascular diseases, congestive heart failure, obstructive sleep apnea, and depression were the most prevalent diseases. With the network approach, it was possible to visualize and analyze patient information; three sub-graphs or clusters were identified: obese patients with metabolic problems, obese patients with infection problems, and obese patients with mechanical problems. Bipartite graphs of disease-treatment relationships showed treatments for different types of diseases, indicating that obese patients suffer from multiple diseases. This work shows that clinical notes are a rich source of information that can be used to explore, visualize, and analyze patient information through different approaches. More work is needed to explore the relationships between the different medical entities found in clinical notes and in different disease datasets. Also, since some medical documents express events in time, this temporal characteristic should be considered in future work to form a personalized portrait of clusters, diseases, and patients.
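    As a loose illustration of the clustering and graph analyses described above, the sketch below clusters a binary patient-by-comorbidity matrix and builds a tiny bipartite disease-treatment graph. It substitutes scikit-learn's standard KMeans for the sparse K-means used in the thesis, and the random matrix, disease names, and treatment names are placeholders, not data from i2b2.

```python
# Loose illustration of the two analysis steps described above:
# (1) clustering patients from a binary patient-by-comorbidity matrix and
# (2) building a bipartite disease-treatment graph. Standard KMeans stands in
# for the thesis's sparse K-means; all names and values are placeholders.
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
patient_disease = rng.integers(0, 2, size=(1237, 14))   # patients x comorbidities

# Cluster patients into three groups (the thesis reports high/medium/low
# comorbidity-burden clusters).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(patient_disease)
cluster_labels = kmeans.labels_

# Bipartite disease-treatment graph: one node set for diseases, one for
# treatments; an edge means the treatment co-occurs with the disease in notes.
G = nx.Graph()
diseases = ["obesity", "diabetes", "hypertension"]        # hypothetical
treatments = ["diet therapy", "metformin", "lisinopril"]  # hypothetical
G.add_nodes_from(diseases, bipartite=0)
G.add_nodes_from(treatments, bipartite=1)
G.add_edges_from([("obesity", "diet therapy"),
                  ("diabetes", "metformin"),
                  ("hypertension", "lisinopril")])

print(np.bincount(cluster_labels))   # number of patients in each cluster
print(list(G.edges))                 # disease-treatment pairs
```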

    The Quality Application of Deep Learning in Clinical Outcome Predictions Using Electronic Health Record Data: A Systematic Review

    Introduction: The Electronic Health Record (EHR) is a significant source of medical data that can be used to develop predictive models with therapeutically useful outcomes. Predictive modelling using EHR data has been increasingly utilized in healthcare, achieving outstanding performance and improving healthcare outcomes. Objectives: The main goal of this review is to examine the different deep learning approaches and techniques applied to EHR data processing. Methods: To find potentially pertinent articles that applied deep learning to EHR data, the PubMed database was searched. We assessed and summarized deep learning performance on EHR data across a number of clinical applications focused on predicting specific clinical outcomes, and compared the results with those of conventional machine learning models. Results: A total of 57 papers were chosen for this study. Five categories of clinical outcome prediction were identified: illness (n=33), intervention (n=6), mortality (n=5), hospital readmission (n=7), and length of stay (n=1). The majority of studies (39 out of 57) used structured EHR data. RNNs were the most frequently used deep learning models (LSTM: 17 studies, GRU: 6 studies). The analysis shows that deep learning models have excelled when applied to a variety of clinical outcome predictions. While deep learning's application to EHR data has advanced rapidly, it is crucial that these models remain reliable, offering critical insights to assist clinicians in making informed decisions. Conclusions: The findings demonstrate that deep learning can outperform classic machine learning techniques because it can exploit extensive and sophisticated datasets, such as the longitudinal data found in EHRs. We expect the use of deep learning to keep expanding, given its success in enhancing healthcare outcomes using EHR data.
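    The review contains no code; as a minimal sketch of the model family it reports as most common (an LSTM over longitudinal EHR data), the snippet below scores a patient's sequence of visit feature vectors for a binary outcome such as mortality. Feature count, sequence length, and layer sizes are arbitrary assumptions.

```python
# Minimal sketch of the review's most common model family: an LSTM over a
# patient's sequence of visit feature vectors, reduced to a single binary
# clinical-outcome logit (e.g. mortality). Dimensions are assumptions.
import torch
import torch.nn as nn

class EHRSequenceClassifier(nn.Module):
    def __init__(self, num_features=40, hidden_dim=64):
        super().__init__()
        self.encoder = nn.LSTM(num_features, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, visits):
        # visits: (batch, num_visits, num_features) longitudinal EHR input
        _, (last_hidden, _) = self.encoder(visits)
        return self.head(last_hidden[-1])        # (batch, 1) outcome logit

# Toy usage: 8 patients with 12 visits of 40 features each.
model = EHRSequenceClassifier()
visits = torch.randn(8, 12, 40)
risk_probability = torch.sigmoid(model(visits))  # predicted outcome probability
```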