
    Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies.

    OBJECTIVE: Predictive disease modeling using electronic health record data is a growing field. Although clinical data in their raw form can be used directly for predictive modeling, it is common practice to map data to standard terminologies to facilitate data aggregation and reuse. There is, however, a lack of systematic investigation into how different representations affect the performance of predictive models, especially in the context of machine learning and deep learning. MATERIALS AND METHODS: We projected the input diagnosis data in the Cerner HealthFacts database to the Unified Medical Language System (UMLS) and 5 other terminologies, including CCS, CCSR, ICD-9, ICD-10, and PheWAS, and evaluated the prediction performance of these terminologies on 2 tasks: the risk prediction of heart failure in diabetes patients and the risk prediction of pancreatic cancer. Two popular models were evaluated: logistic regression and a recurrent neural network. RESULTS: For logistic regression, UMLS delivered the best area under the receiver operating characteristic curve (AUROC) on both the heart failure (81.15%) and pancreatic cancer (80.53%) tasks. For the recurrent neural network, UMLS worked best for pancreatic cancer prediction (AUROC 82.24%) and was second (AUROC 85.55%) only to PheWAS (AUROC 85.87%) for heart failure prediction. DISCUSSION/CONCLUSION: In our experiments, terminologies with larger vocabularies and finer-grained representations were associated with better prediction performance. In particular, UMLS was consistently one of the best-performing terminologies. We believe this work can help inform better designs of predictive models, although further investigation is warranted.
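    As a rough illustration of the setup this abstract describes, the sketch below projects raw diagnosis codes onto a target terminology and trains a logistic regression risk model evaluated by AUROC. The toy cohort, the code_to_umls mapping, and the project helper are hypothetical stand-ins; a real pipeline would draw the projection from the UMLS Metathesaurus and the cohort from the EHR database.

```python
# Minimal sketch: comparing diagnosis-code representations for risk prediction
# with logistic regression. The mapping table, cohort, and labels are toy
# stand-ins, not the authors' actual data or pipeline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Each patient is a sequence of raw diagnosis codes (toy ICD-9 codes here).
patients = [
    ["250.00", "428.0", "401.9"],   # diabetes, heart failure, hypertension
    ["250.00", "401.9"],
    ["157.9", "577.0"],             # pancreatic cancer, pancreatitis
    ["401.9", "272.4"],
]
labels = [1, 0, 1, 0]  # outcome of interest (e.g., disease onset)

# Hypothetical mapping from raw codes to UMLS concept identifiers (CUIs);
# in practice this projection would come from the UMLS Metathesaurus.
code_to_umls = {
    "250.00": "C0011860", "428.0": "C0018802", "401.9": "C0020538",
    "157.9": "C0235974", "577.0": "C0001339", "272.4": "C0020473",
}

def project(seq, mapping):
    """Replace each raw code with its mapped concept, keeping unmapped codes."""
    return [mapping.get(code, code) for code in seq]

# Bag-of-concepts features: one binary column per concept in the vocabulary.
docs = [" ".join(project(p, code_to_umls)) for p in patients]
X = CountVectorizer(binary=True, token_pattern=r"\S+").fit_transform(docs)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.5, random_state=0, stratify=labels)
model = LogisticRegression().fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```

    Swapping code_to_umls for a mapping into CCS, ICD-10, or PheWAS codes, with everything else held fixed, reproduces the kind of representation comparison the study performs.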

    Parsing clinical text using the state-of-the-art deep learning based parsers: a systematic comparison

    Abstract Background A shareable repository of clinical notes is critical for advancing natural language processing (NLP) research. Many NLP researchers therefore aim to create a repository that has breadth (notes from multiple institutions) as well as depth (as much data per individual as possible). Methods We aimed to assess the degree to which individuals would be willing to contribute their health data to such a repository. A compact e-survey probed willingness to share demographic and clinical data categories. Participants were faculty, staff, and students at two geographically diverse major medical centers (Utah and New York). Such a sample can be expected to respond like a typical potential participant from the general public who receives complete, fully informed consent about the pros and cons of participating in a research study. Results 2140 respondents completed the surveys. 56% of respondents were “somewhat/definitely willing” to share clinical data with identifiers, while 89% were “somewhat (17%) / definitely (72%) willing” to share without identifiers. Results were consistent across gender, age, and education, but there were some differences by geographic region. Individuals were most reluctant (50–74%) to share mental health, substance abuse, and domestic violence data. Conclusions We conclude that a substantial fraction of potential patient participants, once educated about risks and benefits, would be willing to donate de-identified clinical data to a shared research repository. A slight majority would even be willing to share without de-identification, suggesting that perceptions of data misuse are not a major concern. Such a repository of clinical notes should be invaluable for clinical NLP research and advancement.

    Time-sensitive clinical concept embeddings learned from large electronic health records

    Abstract Background Learning distributional representations of clinical concepts (e.g., diseases, drugs, and labs) is an important research area for deep learning in the medical domain. However, many existing methods do not consider temporal dependencies along the longitudinal sequence of a patient’s records, which may lead to the incorrect selection of contexts. Methods To address this issue, we extended three popular concept embedding methods: word2vec, positive pointwise mutual information (PPMI), and FastText, to incorporate time-sensitive information. We then trained them on a large electronic health record (EHR) database containing about 50 million patients to generate concept embeddings, and evaluated the embeddings both intrinsically, on concept similarity measures, and extrinsically, on the task of predicting disease onset. Results Our experiments show that embeddings learned from information within one visit (a time window of zero) improve performance on the concept similarity measure, and that the FastText algorithm usually outperformed the other two. For the predictive modeling task, the best result was achieved by word2vec embeddings with a 30-day sliding window. Conclusions Considering time constraints is important when training clinical concept embeddings. We expect such embeddings to benefit a range of downstream applications.
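    A minimal sketch of the core idea under stated assumptions: when generating skip-gram training pairs, constrain the context window by elapsed time between records rather than by token position alone. The record format, concept codes, and the build_pairs helper below are illustrative, not the paper’s implementation; pairs like these would then feed a standard skip-gram (word2vec-style) objective.

```python
# Illustrative sketch: time-window-constrained context selection for
# clinical concept embeddings. Concepts and dates are hypothetical.
from datetime import date

# A patient's longitudinal record: (concept, date) pairs, sorted by time.
record = [
    ("C0011849", date(2020, 1, 5)),   # diabetes mellitus
    ("C0020538", date(2020, 1, 5)),   # hypertension (same visit)
    ("C0018802", date(2020, 2, 1)),   # heart failure (27 days later)
    ("C0027051", date(2020, 6, 1)),   # myocardial infarction (months later)
]

def build_pairs(record, window_days):
    """Yield (target, context) pairs whose timestamps differ by at most
    window_days. window_days=0 keeps only same-visit co-occurrences."""
    for i, (target, t_i) in enumerate(record):
        for j, (context, t_j) in enumerate(record):
            if i != j and abs((t_i - t_j).days) <= window_days:
                yield target, context

print(list(build_pairs(record, 0)))    # same-visit contexts only
print(list(build_pairs(record, 30)))   # 30-day sliding window, as in the paper
```

    With a 0-day window only the two same-visit concepts pair up; widening to 30 days additionally pairs them with the heart failure record while still excluding the distant one, which is the temporal filtering the paper adds to position-based windows.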