9 research outputs found

Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction

Author: Rasmy Laila
Tao Cui
Xiang Yang
Xie Ziqian
Zhi Degui
Publication date: 22/05/2020

Deep learning (DL) based predictive models from electronic health records (EHR) deliver impressive performance in many clinical tasks. Large training cohorts, however, are often required to achieve high accuracy, hindering the adoption of DL-based models in scenarios with limited training data. Recently, bidirectional encoder representations from transformers (BERT) and related models have achieved tremendous success in the natural language processing domain. Pre-training BERT on a very large corpus generates contextualized embeddings that can boost the performance of models trained on smaller datasets. We propose Med-BERT, which adapts the BERT framework for pre-training contextualized embedding models on structured diagnosis data from an EHR dataset of 28,490,650 patients. Fine-tuning experiments are conducted on two disease-prediction tasks: (1) prediction of heart failure in patients with diabetes and (2) prediction of pancreatic cancer, using two clinical databases. Med-BERT substantially improves prediction accuracy, boosting the area under the receiver operating characteristic curve (AUC) by 2.02-7.12%. In particular, pre-trained Med-BERT substantially improves performance on tasks with very small fine-tuning training sets (300-500 samples), boosting the AUC by more than 20%, or matching the AUC obtained with a training set 10 times larger. We believe that Med-BERT will benefit disease-prediction studies with small local training datasets, reduce data collection expenses, and accelerate the pace of artificial intelligence aided healthcare. Comment: L.R., X.Y., and Z.X. share first authorship of this work.

arXiv.org e-Print Archive

PubMed Central

DigitalCommons@The Texas Medical Center

Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies.

Author: Rasmy Laila
Tao Cui
Tiryaki Firat
Xiang Yang
Xu Hua
Zhi Degui
Zhou Yujia
Publication venue: DigitalCommons@TMC
Publication date: 15/09/2020

OBJECTIVE: Predictive disease modeling using electronic health record data is a growing field. Although clinical data in their raw form can be used directly for predictive modeling, it is common practice to map data to standard terminologies to facilitate data aggregation and reuse. There is, however, a lack of systematic investigation of how different representations affect the performance of predictive models, especially in the context of machine learning and deep learning. MATERIALS AND METHODS: We projected the input diagnosis data in the Cerner HealthFacts database to the Unified Medical Language System (UMLS) and 5 other terminologies (CCS, CCSR, ICD-9, ICD-10, and PheWAS) and evaluated the prediction performance of these terminologies on 2 tasks: risk prediction of heart failure in diabetes patients and risk prediction of pancreatic cancer. Two popular models were evaluated: logistic regression and a recurrent neural network. RESULTS: For logistic regression, UMLS delivered the best area under the receiver operating characteristic curve (AUROC) in both the heart failure in diabetes patients (81.15%) and pancreatic cancer (80.53%) tasks. For the recurrent neural network, UMLS worked best for pancreatic cancer prediction (AUROC 82.24%) and was second (AUROC 85.55%) only to PheWAS (AUROC 85.87%) for predicting heart failure in diabetes patients. DISCUSSION/CONCLUSION: In our experiments, terminologies with larger vocabularies and finer-grained representations were associated with better prediction performance. In particular, UMLS is consistently one of the best-performing terminologies. We believe that our work may help to inform better designs of predictive models, although further investigation is warranted.
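The AUROC values quoted in these abstracts can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The Python sketch below illustrates that pairwise definition only; the function name `auroc` and the toy labels and scores are assumptions for illustration, not data or code from any of the listed papers.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Illustrative sketch only. `labels` holds 0/1 outcomes, `scores` holds the
    model's predicted risks. Ties between a positive and a negative count 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

The quadratic loop is chosen for readability; a rank-based formulation would be preferable for large cohorts.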

PubMed Central

DigitalCommons@The Texas Medical Center

Recalage d'images avec la corrélation d'images basée sur la méthode de Fourier (Image registration with Fourier-based image correlation)

Author: RASMY Laila
Publication venue: IAV Hassan II
Publication date: 17/03/2023

Image registration is an important technique in many computer vision applications, such as image fusion, object tracking, face recognition, and change detection. Registration of multi-date images relies on the primitive (feature) space, the similarity measure, and the search and optimization strategy. Each component plays a fundamental role in estimating the best spatial transformation, which directly affects the robustness and accuracy of these methods. In this paper, we discuss classical and recent image registration methods, including their fundamental principles. This review provides a comprehensive reference for researchers involved in image registration with Fourier-based image correlation by describing Fourier-based image correlation methods, describing existing subpixel techniques in the frequency domain, and summarizing comparative studies of subpixel techniques. Keywords: sub-pixel registration, matching, phase correlation, Fourier transform.

Revue Marocaine des Sciences Agronomiques et Vétérinaires

Automatic Sub-Pixel Co-Registration of Remote Sensing Images Using Phase Correlation and Harris Detector

Author: Imane Sebari
Laila Rasmy
Mohamed Ettarid
Publication venue: MDPI AG
Publication date: 12/06/2021

In this paper, we propose a new approach for sub-pixel co-registration based on Fourier phase correlation combined with the Harris detector. Because the standard phase correlation method achieves only pixel-level accuracy, another approach is required to reach sub-pixel matching precision. We first applied the Harris corner detector to extract corners from both the reference and sensed images. Then, we identified their corresponding points using phase correlation between the image pairs. To achieve sub-pixel registration accuracy, two optimization algorithms were used. The effectiveness of the proposed method was tested with very high-resolution (VHR) remote sensing images, including Pleiades satellite images and aerial imagery. Compared with the speeded-up robust features (SURF)-based method, phase correlation with the Blackman window function produced 91% more matches with high reliability. Moreover, the optimization analysis revealed that the Nelder–Mead algorithm performs better than the two-point step size gradient algorithm in terms of localization accuracy and computation time. The proposed approach achieves an accuracy better than 0.5 pixels and outperforms the SURF-based method. It can achieve sub-pixel accuracy in the presence of noise and produces large numbers of correct matching points.
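As a rough illustration of the pixel-level phase correlation step that this abstract builds on, the Python sketch below estimates an integer translation between two equally sized images. It is not the authors' implementation: the function name `phase_correlation_shift`, the use of NumPy, and the way the Blackman window is applied are assumptions, and the Harris corner detection and sub-pixel optimization stages described in the abstract are omitted.

```python
import numpy as np


def phase_correlation_shift(reference, sensed, eps=1e-12):
    """Estimate the integer (row, col) shift of `sensed` relative to `reference`.

    Illustrative sketch only: pixel-level accuracy, pure translation assumed.
    """
    reference = np.asarray(reference, dtype=float)
    sensed = np.asarray(sensed, dtype=float)
    if reference.shape != sensed.shape:
        raise ValueError("images must have the same shape")

    rows, cols = reference.shape
    # Separable 2-D Blackman window to reduce edge effects (assumed usage).
    window = np.outer(np.blackman(rows), np.blackman(cols))

    f_ref = np.fft.fft2(reference * window)
    f_sen = np.fft.fft2(sensed * window)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_sen * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + eps

    # The inverse transform peaks at the translation between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
image = rng.standard_normal((128, 128))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))
print(phase_correlation_shift(image, shifted))
```

Reaching sub-pixel precision would require refining the location of this correlation peak, for example with an optimizer as the abstract describes.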

Multidisciplinary Digital Publishing Institute

Deep learning model for personalized prediction of positive MRSA culture using time-series electronic health records

Author: Bijun Sai Kannadath
Bingyu Mao
Degui Zhi
Laila Rasmy
Masayuki Nigo
Ziqian Xie
Publication venue: Nature Portfolio
Publication date: 01/03/2024

Methicillin-resistant Staphylococcus aureus (MRSA) causes significant morbidity and mortality in hospitals. Rapid, accurate risk stratification of MRSA is crucial for optimizing antibiotic therapy. Our study introduced a deep learning model, PyTorch_EHR, which leverages electronic health record (EHR) time-series data, including a wide variety of patient-specific data, to predict MRSA culture positivity within two weeks. A total of 8,164 MRSA and 22,393 non-MRSA patient events from Memorial Hermann Hospital System, Houston, Texas, were used for model development. PyTorch_EHR outperforms logistic regression (LR) and light gradient boosting machine (LGBM) models in accuracy (AUROC: PyTorch_EHR = 0.911, LR = 0.857, LGBM = 0.892). External validation with 393,713 patient events from the Medical Information Mart for Intensive Care (MIMIC)-IV dataset in Boston confirms its superior accuracy (AUROC: PyTorch_EHR = 0.859, LR = 0.816, LGBM = 0.838). Our model effectively stratifies patients into high-, medium-, and low-risk categories, potentially optimizing antimicrobial therapy and reducing unnecessary MRSA-specific antimicrobials. This highlights the advantage of deep learning models in predicting MRSA-positive cultures, surpassing traditional machine learning models and supporting clinicians’ judgments.

Directory of Open Access Journals

Drug discovery utilizing biotechnological methodologies and Northern Africa biodiversity

Author: Aly Mohamed
El-Menshawi Bassem
Emara Laila
Mahrous Karima
Osman Abdel-Monem
Rasmy Farouk
Youssif Fouad
Publication venue: National Research Centre (NRC), Cairo, EG
Publication date: 01/01/2005

PowerPoint presentation

International Development Research Centre: IDRC Digital Library

Time-sensitive clinical concept embeddings learned from large electronic health records

Author: Cui Tao
Degui Zhi
Fang Li
Firat Tiryaki
Hua Xu
Jun Xu
Laila Rasmy
Wenjin Jim Zheng
Xiaoqian Jiang
Yang Xiang
Yaoyun Zhang
Yonghui Wu
Yujia Zhou
Yuqi Si
Zhiheng Li
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2019

Background: Learning distributional representations of clinical concepts (e.g., diseases, drugs, and labs) is an important research area of deep learning in the medical domain. However, many existing methods do not consider temporal dependencies along the longitudinal sequence of a patient’s records, which may lead to incorrect selection of contexts. Methods: To address this issue, we extended three popular concept embedding learning methods, word2vec, positive pointwise mutual information (PPMI), and FastText, to consider time-sensitive information. We then trained them on a large electronic health records (EHR) database containing about 50 million patients to generate concept embeddings, and evaluated them with intrinsic evaluations focusing on a concept similarity measure and an extrinsic evaluation assessing the use of the generated concept embeddings in predicting disease onset. Results: Our experiments show that embeddings learned from information within one visit (time window zero) improve performance on the concept similarity measure, and that the FastText algorithm usually performed better than the other two algorithms. For the predictive modeling task, the best result was achieved by word2vec embeddings with a 30-day sliding window. Conclusions: Considering time constraints is important in training clinical concept embeddings. We expect they can benefit a series of downstream applications.
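To make the idea of time-sensitive context selection concrete, the Python sketch below counts concept co-occurrences only when two events in a patient's record fall within a given number of days of each other, then scores each pair with PPMI, i.e. max(0, log(p(a, b) / (p(a) p(b)))). This is a simplified illustration under stated assumptions, not the paper's method: the function name `time_windowed_ppmi`, the `(day, concept)` record layout, and the toy patients are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def time_windowed_ppmi(patients, window_days=0):
    """Compute PPMI for concept pairs co-occurring within `window_days`.

    Illustrative sketch only. `patients` is a list of records; each record is
    a list of (day, concept) tuples (hypothetical layout).
    Returns a dict mapping ordered pairs (a, b) to their PPMI score.
    """
    cooc = Counter()   # symmetric co-occurrence counts
    row = Counter()    # marginal count per concept
    total = 0

    for events in patients:
        for (day_a, a), (day_b, b) in combinations(events, 2):
            # Keep the pair only if both events fall inside the time window.
            if a != b and abs(day_a - day_b) <= window_days:
                cooc[(a, b)] += 1
                cooc[(b, a)] += 1
                row[a] += 1
                row[b] += 1
                total += 2

    scores = {}
    for (a, b), n_ab in cooc.items():
        pmi = math.log(n_ab * total / (row[a] * row[b]))
        scores[(a, b)] = max(0.0, pmi)   # clip negative PMI to zero
    return scores


# Toy records (hypothetical): day offsets and concept names are made up.
toy_patients = [
    [(0, "diabetes"), (0, "metformin"), (40, "heart_failure")],
    [(0, "diabetes"), (0, "metformin")],
    [(0, "heart_failure"), (0, "furosemide")],
]
same_visit = time_windowed_ppmi(toy_patients, window_days=0)
sixty_days = time_windowed_ppmi(toy_patients, window_days=60)
print(("diabetes", "heart_failure") in same_visit)   # pair is 40 days apart
print(("diabetes", "heart_failure") in sixty_days)
```

Widening the window admits more distant pairs as context, which is the trade-off the abstract explores with window sizes from zero days (same visit) to a 30-day sliding window.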

Directory of Open Access Journals

Vasopressor treatment and mortality following nontraumatic subarachnoid hemorrhage: a nationwide electronic health record analysis

Author: Aguilar David
Brown Derek
DeSantis Stacia M
Leon Novelo Luis
Maroufy Vahed
Miao Hongyu
Rasmy Laila
Talebi Yashar
Thomas Emy
Wang Xueying
Williams George
Wu Hulin
Yamal Jose-Miguel
Yaseen Ashraf
Yu Duo
Zhi Degui
Zhu Gen
Zhu Hai
Zhu Hongjian
Publication venue: Journal of Neurosurgery Publishing Group (JNSPG)
Publication date: 01/05/2020

OBJECTIVE: Subarachnoid hemorrhage (SAH) is a devastating cerebrovascular condition, not only due to the effect of the initial hemorrhage, but also due to the complication of delayed cerebral ischemia (DCI). While hypertension facilitated by vasopressors is often initiated to prevent DCI, it is not known which vasopressor is most effective in improving outcomes. The objective of this study was to determine associations between initial vasopressor choice and mortality in patients with nontraumatic SAH. METHODS: The authors conducted a retrospective cohort study using a large, national electronic medical record data set from 2000-2014 to identify patients with a new diagnosis of nontraumatic SAH (based on ICD-9 codes) who were treated with the vasopressors dopamine, phenylephrine, or norepinephrine. The relationship between the initial choice of vasopressor therapy and the primary outcome, defined as in-hospital death or discharge to hospice care, was examined. RESULTS: In total, 2634 patients with nontraumatic SAH who were treated with a vasopressor were identified. In this cohort, the average age was 56.5 years, 63.9% were female, and 36.5% of patients developed the primary outcome. The incidence of the primary outcome was higher in those initially treated with either norepinephrine (47.6%) or dopamine (50.6%) than with phenylephrine (24.5%). After adjusting for possible confounders using propensity score methods, the adjusted OR of the primary outcome was higher with dopamine (OR 2.19, 95% CI 1.70-2.81) and norepinephrine (OR 2.24, 95% CI 1.80-2.80) compared with phenylephrine. Sensitivity analyses using different variable selection procedures, causal inference models, and machine-learning methods confirmed the main findings. CONCLUSIONS: In patients with nontraumatic SAH, phenylephrine was significantly associated with reduced mortality compared with dopamine or norepinephrine. Prospective randomized clinical studies are warranted to confirm this finding.

Crossref

DigitalCommons@The Texas Medical Center

Dynamic Prognosis Prediction for Patients on DAPT After Drug‐Eluting Stent Implantation: Model Development and Validation

Author: Abhijeet Dhoble
Ahmed Abdelhameed
Cui Tao
David Aguilar
Degui Zhi
Fang Li
Jianfu Li
Jiang Bian
JianPing He
Jingcheng Du
Jingna Feng
Laila Rasmy
Mattia Prosperi
Qing Wang
Shuteng Niu
Xinyuan Zhang
Xinyue Hu
Yang Xiang
Yi Nian
Yifang Dang
Yujia Zhou
Zenan Sun
Ziqian Xie
Publication venue: Wiley
Publication date: 01/02/2024

Background: The rapid evolution of artificial intelligence (AI), in conjunction with recent updates in dual antiplatelet therapy (DAPT) management guidelines, emphasizes the need for innovative models to predict ischemic or bleeding events after drug‐eluting stent implantation. Leveraging AI for dynamic prediction has the potential to revolutionize risk stratification and provide personalized decision support for DAPT management. Methods and Results: We developed and validated a new AI‐based pipeline using retrospective data of drug‐eluting stent‐treated patients, sourced from the Cerner Health Facts data set (n=98 236) and Optum's de‐identified Clinformatics Data Mart Database (n=9978). The 36 months following drug‐eluting stent implantation were designated as the primary forecasting interval, further segmented into 6 sequential prediction windows. We evaluated 5 distinct AI algorithms for their ability to predict ischemic and bleeding risks. Model discriminative accuracy was assessed using the area under the receiver operating characteristic curve, among other metrics. The weighted light gradient boosting machine performed best and was selected as our AI‐DAPT model. AI‐DAPT demonstrated peak accuracy in the 30 to 36 months window, with an area under the receiver operating characteristic curve of 90% [95% CI, 88%–92%] for ischemia and 84% [95% CI, 82%–87%] for bleeding predictions. Conclusions: Our AI‐DAPT model produces iterative, refined dynamic predictions by incorporating ongoing updates from patients' clinical profiles, and holds value as a novel clinical tool to support optimal DAPT duration management with high accuracy and adaptability.

Directory of Open Access Journals