2,362 research outputs found

    The Quality Application of Deep Learning in Clinical Outcome Predictions Using Electronic Health Record Data: A Systematic Review

    Introduction: Electronic health records (EHRs) are a significant source of medical data that can be used to develop predictive models with therapeutically useful outcomes. Predictive modelling using EHR data has been increasingly utilized in healthcare, achieving outstanding performance and improving healthcare outcomes. Objectives: The main goal of this review is to examine the deep learning approaches and techniques used for EHR data processing. Methods: The PubMed database was searched to find potentially pertinent articles that have used deep learning on EHR data. We assessed and summarized deep learning performance on EHR data in a number of clinical applications that focus on making specific predictions about clinical outcomes, and we compared the results with those of conventional machine learning models. Results: A total of 57 papers were selected for this study. Five types of clinical outcome prediction were identified: illness (n=33), intervention (n=6), mortality (n=5), hospital readmission (n=7), and duration of stay (n=1). The majority of studies (39 out of 57) used structured EHR data. RNNs were the most frequently used deep learning models (LSTM: 17 studies, GRU: 6 studies). The analysis shows that deep learning models have excelled when applied to a variety of clinical outcome predictions. While deep learning's application to EHR data has advanced rapidly, it is crucial that these models remain reliable, offering critical insights to assist clinicians in making informed decisions. Conclusions: The findings demonstrate that deep learning can outperform classic machine learning techniques since it has the advantage of utilizing extensive and sophisticated datasets, such as the longitudinal data seen in EHRs. We think that deep learning will keep expanding because it has been quite successful in enhancing healthcare outcomes utilizing EHR data.

    The Convergence of Human and Artificial Intelligence on Clinical Care - Part I

    This edited book contains twelve studies, both large studies and pilots, in five main categories: (i) adaptive imputation to increase the density of clinical data for improving downstream modeling; (ii) machine-learning-empowered diagnosis models; (iii) machine learning models for outcome prediction; (iv) innovative use of AI to improve our understanding of the public view; and (v) understanding of the attitude of providers in trusting insights from AI for complex cases. This collection is an excellent example of how technology can add value in healthcare settings and hints at some of the pressing challenges in the field. Artificial intelligence is gradually becoming a go-to technology in clinical care; therefore, it is important to work collaboratively and to shift from performance-driven outcomes to risk-sensitive model optimization, improved transparency, and better patient representation, to ensure more equitable healthcare for all.

    Secondary use of Structured Electronic Health Records Data: From Observational Studies to Deep Learning-based Predictive Modeling

    With the wide adoption of electronic health records (EHRs), researchers, as well as large healthcare organizations, governmental institutions, insurance, and pharmaceutical companies have been interested in leveraging this rich clinical data source to extract clinical evidence and develop predictive algorithms. Large vendors have been able to compile structured EHR data from sites all over the United States, de-identify these data, and make them available to data science researchers in a more usable format. For this dissertation, we leveraged one of the earliest and largest secondary EHR data sources and conducted three studies of increasing scope. In the first study, which was of limited scope, we conducted a retrospective observational study to compare the effect of three drugs on a specific population of approximately 3,000 patients. Using a novel statistical method, we found evidence that the selection of phenylephrine as the primary vasopressor to induce hypertension for the management of nontraumatic subarachnoid hemorrhage is associated with better outcomes as compared to selecting norepinephrine or dopamine. In the second study, we widened our scope, using a cohort of more than 100,000 patients to train generalizable models for the risk prediction of specific clinical events, such as heart failure in diabetes patients or pancreatic cancer. In this study, we found that recurrent neural network-based predictive models trained on expressive terminologies, which preserve a high level of granularity, are associated with better prediction performance as compared with other baseline methods, such as logistic regression. Finally, we widened our scope again, to train Med-BERT, a foundation model, on more than 20 million patients' diagnosis data. Med-BERT was found to improve the prediction performance of downstream tasks that have a small sample size, which otherwise would limit the ability of the model to learn good representation. 
In conclusion, we found that we can extract useful information and train helpful deep learning-based predictive models. Given the limitations of secondary EHR data and taking into consideration that the data were originally collected for administrative and not research purposes, however, the findings need clinical validation. Therefore, clinical trials are warranted to further validate any new evidence extracted from such data sources before updating clinical practice guidelines. The implementability of the developed predictive models, which are in an early development phase, also warrants further evaluation

    ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜ ์ƒ์กด๋ถ„์„์ด ์ ์šฉ๋œ ์‹ฌํ˜ˆ๊ด€์งˆํ™˜ ์œ„ํ—˜ ํ‰๊ฐ€ ๋ชจ๋ธ ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ ์ฝ•์Šค ๋ชจํ˜•๊ณผ ๊ฒฐํ•ฉ๋œ ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ์ ‘๊ทผ๋ฒ•: ํ—ฌ์Šค์ผ€์–ด-ํ™˜๊ฒฝ ์—ฐ๊ณ„ ๋ฐ์ดํ„ฐ ํ™œ์šฉ ์—ฐ๊ตฌ

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Medicine, Department of Biomedical Sciences, August 2020. 박상민. Background and aims: The contribution of different cardiovascular disease (CVD) risk factors to risk evaluation and predictive modeling for incident CVD is often debated. Also, despite the increasing availability of relevant data sources, it is uncertain to what extent data on CVD risk factors from multiple data categories should be collected for comprehensive risk assessment and predictive modeling of CVD risk using survival analysis. This study aimed to evaluate the contribution of different data categories derived from integrated data on healthcare and environmental exposure to risk evaluation and prediction models for CVD risk, using deep learning-based survival analysis in combination with Cox proportional hazards regression, and using Cox proportional hazards regression alone. Methods: Information on the comprehensive list of CVD risk factors was collected from systematic reviews of variables included in conventional CVD risk assessment tools and of observational studies in medical literature databases (PubMed and Embase). Each risk factor was screened for availability in the National Health Insurance Service-National Sample Cohort (NHIS-NSC) linked to environmental exposure data on cumulative particulate matter and urban green space using residential area code. Individual records of 137,249 patients more than 40 years of age who underwent the biennial national health screening between 2009 and 2010 without a previous history of CVD were followed up for incident CVD events from January 1, 2011 to December 31, 2013 in the NHIS-NSC with data linkage to environmental exposure. 
Statistics-based variable selection methods were implemented as follows: statistical significance, the subset with the minimum (best) Akaike Information Criterion (AIC), variables selected by regularized Cox proportional hazards regression with an elastic net penalty, and finally a variable set that meets all of the above criteria. Prediction models using a Cox proportional hazards deep neural network (DeepSurv) and Cox proportional hazards regression were constructed in the training set (80% of the total sample) using input feature sets selected by the above strategies and by progressively adding input features by data category to examine the relative contribution of each data type to the predictive performance for CVD risk. The DeepSurv and Cox proportional hazards regression models for CVD risk were evaluated in the test set (20% of the total sample) with Uno's concordance statistic (C-index), which is the most up-to-date evaluation metric for survival models with right-censored data. Results: After the comprehensive review, data synthesis, and availability check, a total of 31 risk factors in the categories of sociodemographic, clinical laboratory test and measurement, lifestyle behavior, family history, underlying medical conditions, dental health, medication, and environmental exposure were identified in the NHIS-NSC linked to environmental exposure data. Among the models constructed with different variable selection methods, DeepSurv using statistically significant variables (Uno's C-index: 0.7069) and Cox proportional hazards regression using all of the variables (Uno's C-index: 0.7052) showed improved predictive performance for CVD risk, a statistically significant increase (p-value for difference in Uno's C-index: <0.0001 for both comparisons) compared with the respective models with basic clinical factors (age, sex, and body mass index). 
When all variables, or only the statistically significant variables, in each data category from sociodemographic to environmental exposure were progressively added as input features to DeepSurv and Cox proportional hazards regression for predictive modeling of CVD risk, the DeepSurv model with statistically significant variables from the sociodemographic, clinical laboratory test and measurement, and lifestyle behavior categories outperformed the Cox proportional hazards regression model with statistically significant variables added up to the medication category. Extensive data linkage to environmental exposure on cumulative particulate matter and urban green space offered only marginal improvement in the predictive performance of the DeepSurv and Cox proportional hazards regression models for CVD risk. Conclusion: To obtain the best predictive performance of the DeepSurv model for CVD risk with the minimum number of input features, information on sociodemographic factors, clinical laboratory tests and measurements, and lifestyle behavior should be collected first and used as input features in the NHIS-NSC. Also, the overall performance of DeepSurv for CVD risk assessment was improved with a hybrid approach using statistically significant variables from Cox proportional hazards regression as input features. When all the data categories in the NHIS-NSC linked to environmental exposure data are available, progressively adding variables in each data category could incrementally increase the predictive performance of the DeepSurv model for CVD risk with the hybrid approach. 
Data linkage to environmental exposure by residential area code in the NHIS-NSC offered marginally improved performance for CVD risk in both the DeepSurv model with the hybrid approach and the Cox proportional hazards regression model.
Contents: I. Introduction: 1. Background; 2. Research problem; 3. Hypothesis and objective [3.1. Hypothesis; 3.2. Objective]. II. Materials and methods: 1. Comprehensive review and identification of cardiovascular disease (CVD) risk factors [1.1. Systematic review on variables included in conventional CVD risk assessment tools; 1.2. Systematic review on traditional and emerging CVD risk factors from observational studies; 1.3. Integration of the comprehensive list of CVD risk factors; 1.4. Screening for data availability]; 2. Cohort analysis for measuring strength of association between risk factors and incident cardiovascular disease [2.1. Study population and linkage to environmental exposure data; 2.2. Variable selection and data processing; 2.3. Population-based cohort analysis]; 3. Predictive modeling using survival analysis: DeepSurv and Cox proportional hazards regression [3.1. Model development; 3.2. Evaluation of the predictive performance of the models]. III. Results: 1. Identification and categorization of cardiovascular disease risk factors; 2. Magnitude of association between selected risk factors with cardiovascular disease; 3. Model performance evaluation. IV. Discussion: 1. Key findings and contributions; 2. Comparison to other studies; 3. Strengths and limitations; 4. Implications; 5. Future perspectives. V. Conclusion. Reference. Abstract in Korean.
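The abstract above evaluates survival models with a concordance statistic on right-censored data. As a rough illustration of what such a statistic measures, the sketch below computes Harrell's C-index in plain Python. This is a simpler relative of the Uno's C-index used in the thesis, which additionally reweights pairs by inverse probability of censoring; that reweighting is not implemented here. The function name, argument names, and the toy data in the usage note are assumptions made for illustration and do not come from the thesis.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time per subject
    events: 1 if the event was observed, 0 if the subject was censored
    risk_scores: model output, where a higher score means higher predicted risk

    Simplification (assumption of this sketch): pairs with identical
    follow-up times are skipped rather than handled specially.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time is an observed event.
        if not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

For example, with hypothetical follow-up times `[2, 4, 6, 8]`, event indicators `[1, 1, 0, 1]`, and risk scores `[0.9, 0.1, 0.4, 0.7]`, five pairs are comparable and three of them are ordered correctly, so the function returns 0.6. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.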

    Artificial Intelligence in Acute Ischemic Stroke Subtypes According to TOAST Classification: A Comprehensive Narrative Review

    Correct recognition of the etiology of ischemic stroke (IS) allows timely therapeutic interventions aimed at treating the cause and preventing a new cerebral ischemic event. Nevertheless, identifying the cause is often challenging and relies on clinical features and data obtained by imaging techniques and other diagnostic exams. The TOAST classification system describes the different etiologies of ischemic stroke and includes five subtypes: LAAS (large-artery atherosclerosis), CEI (cardioembolism), SVD (small vessel disease), ODE (stroke of other determined etiology), and UDE (stroke of undetermined etiology). AI models, which provide computational methodologies for quantitative and objective evaluation, seem to increase the sensitivity of detecting the main causes of IS, for example in the tomographic diagnosis of carotid stenosis, the electrocardiographic recognition of atrial fibrillation, and the identification of small vessel disease in magnetic resonance images. The aim of this review is to provide an overview of the most effective AI models used in the differential diagnosis of ischemic stroke etiology according to the TOAST classification. According to our results, AI has proven to be a useful tool for identifying predictive factors capable of subtyping acute stroke patients in large heterogeneous populations and, in particular, for clarifying the etiology of UDE IS, especially by detecting cardioembolic sources.

    Learning Medical Concept and Patient Representations with Deep Neural Networks and Their Application to Healthcare Problems

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2022. 정교민. This dissertation proposes deep neural network-based methods for learning medical concept and patient representations from medical claims data in order to solve two healthcare tasks, i.e., clinical outcome prediction and post-marketing adverse drug reaction (ADR) signal detection. First, we propose SAF-RNN, a Recurrent Neural Network (RNN)-based model that learns a deep patient representation based on clinical sequences and patient characteristics. Our proposed model fuses different types of patient records using feature-based gating and self-attention. We demonstrate that high-level associations between two heterogeneous records are effectively extracted by our model, thus achieving state-of-the-art performance in predicting the risk probability of cardiovascular disease. Secondly, based on the observation that distributed medical code embeddings represent temporal proximity between medical codes, we introduce a graph structure to enhance the code embeddings with such temporal information. We construct a graph using the distributed code embeddings and statistical information from the claims data. We then propose Graph Neural Network (GNN)-based representation learning for post-marketing ADR detection. Our model shows competitive performance and provides valid ADR candidates. Finally, rather than using patient records alone, we utilize a knowledge graph to augment the patient representation with prior medical knowledge. Using SAF-RNN and GNN, the deep patient representation is learned from the clinical sequences and the personalized medical knowledge. 
It is then used to predict clinical outcomes, i.e., next diagnosis prediction and CVD risk prediction, resulting in state-of-the-art performance.
Contents: 1 Introduction. 2 Background: 2.1 Medical Concept Embedding; 2.2 Encoding Sequential Information in Clinical Records. 3 Deep Patient Representation with Heterogeneous Information: 3.1 Related Work; 3.2 Problem Statement; 3.3 Method [3.3.1 RNN-based Disease Prediction Model; 3.3.2 Self-Attentive Fusion (SAF) Encoder]; 3.4 Dataset and Experimental Setup [3.4.1 Dataset; 3.4.2 Experimental Design; 3.4.3 Implementation Details]; 3.5 Experimental Results [3.5.1 Evaluation of CVD Prediction; 3.5.2 Sensitivity Analysis; 3.5.3 Ablation Studies]; 3.6 Further Investigation [3.6.1 Case Study: Patient-Centered Analysis; 3.6.2 Data-Driven CVD Risk Factors]; 3.7 Conclusion. 4 Graph-Enhanced Medical Concept Embedding: 4.1 Related Work; 4.2 Problem Statement; 4.3 Method [4.3.1 Code Embedding Learning with Skip-gram Model; 4.3.2 Drug-disease Graph Construction; 4.3.3 A GNN-based Method for Learning Graph Structure]; 4.4 Dataset and Experimental Setup [4.4.1 Dataset; 4.4.2 Experimental Design; 4.4.3 Implementation Details]; 4.5 Experimental Results [4.5.1 Evaluation of ADR Detection; 4.5.2 Newly-Described ADR Candidates]; 4.6 Conclusion. 5 Knowledge-Augmented Deep Patient Representation: 5.1 Related Work [5.1.1 Incorporating Prior Medical Knowledge for Clinical Outcome Prediction; 5.1.2 Inductive KGC based on Subgraph Learning]; 5.2 Method [5.2.1 Extracting Personalized KG; 5.2.2 KA-SAF: Knowledge-Augmented Self-Attentive Fusion Encoder; 5.2.3 KGC as a Pre-training Task; 5.2.4 Subgraph Infomax: SGI]; 5.3 Dataset and Experimental Setup [5.3.1 Clinical Outcome Prediction; 5.3.2 Next Diagnosis Prediction]; 5.4 Experimental Results [5.4.1 Cardiovascular Disease Prediction; 5.4.2 Next Diagnosis Prediction; 5.4.3 KGC on SemMed KG]; 5.5 Conclusion. 6 Conclusion. Abstract (In Korean). Acknowledgement.
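The dissertation above learns distributed embeddings of medical codes with a skip-gram model and observes that they capture temporal proximity between codes. A minimal sketch of the data preparation step behind that idea follows: generating (center, context) training pairs from one patient's time-ordered code sequence. The function name, the `window` parameter, and the example ICD-10 codes are hypothetical and chosen only for illustration. The dissertation's actual preprocessing is not described in the abstract.

```python
def skipgram_pairs(codes, window=2):
    """Return (center, context) pairs from a time-ordered list of medical codes.

    Each code is paired with every other code that lies at most `window`
    positions before or after it in the sequence, so codes recorded close
    together in time end up as training pairs for a skip-gram model.
    """
    pairs = []
    for i, center in enumerate(codes):
        start = max(0, i - window)
        stop = min(len(codes), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, codes[j]))
    return pairs
```

With a hypothetical sequence `["I10", "E11", "N18"]` and `window=1`, the pairs are `("I10", "E11")`, `("E11", "I10")`, `("E11", "N18")`, and `("N18", "E11")`. A wider window treats codes that are further apart in the record as related, which trades temporal specificity for more training pairs.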

    Machine learning risk prediction model for acute coronary syndrome and death from use of non-steroidal anti-inflammatory drugs in administrative data

    Our aim was to investigate the usefulness of machine learning approaches on linked administrative health data at the population level in predicting older patients' one-year risk of acute coronary syndrome (ACS) and death following the use of non-steroidal anti-inflammatory drugs (NSAIDs). Patients from a Western Australian cardiovascular population who were supplied with NSAIDs between 1 Jan 2003 and 31 Dec 2004 were identified from Pharmaceutical Benefits Scheme data. Comorbidities from linked hospital admissions data and medication history were inputs. Admissions for ACS or death within one year from the first supply date were outputs. Machine learning classification methods were used to build models to predict ACS and death. Model performance was measured by the area under the receiver operating characteristic curve (AUC-ROC), sensitivity and specificity. There were 68,889 patients in the NSAIDs cohort, with a mean age of 76 years; 54% were female. Within one year after their first supply of NSAIDs, 1882 patients were admitted for ACS and 5405 patients died. The multi-layer neural network, gradient boosting machine and support vector machine were applied to build various classification models. The gradient boosting machine achieved the best performance, with an average AUC-ROC of 0.72 for predicting ACS and 0.84 for predicting death. Machine learning models applied to linked administrative data can potentially improve adverse outcome risk prediction. Further investigation of additional data and approaches is required to improve the performance of adverse outcome risk prediction.
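The study above reports AUC-ROC, sensitivity and specificity. The sketch below shows one way these three quantities can be computed from labels and model scores in plain Python. The function names and the toy labels and scores in the usage note are assumptions for illustration, not data or code from the study. The AUC is computed from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))
```

For hypothetical labels `[1, 1, 0, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.3, 0.1]`, five of the six positive/negative pairs are ranked correctly, so the AUC is 5/6. Thresholding the same scores at 0.45 gives a sensitivity of 1/2 and a specificity of 2/3, which illustrates why AUC, being threshold-free, is usually reported alongside the two threshold-dependent measures.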

    Recurrent Stroke Prediction using Machine Learning Algorithms with Clinical Public Datasets: An Empirical Performance Evaluation

    Recurrent strokes can be devastating, often resulting in severe disability or death. However, nearly 90% of the causes of recurrent stroke are modifiable, which means recurrent strokes can be averted by controlling risk factors, which are mainly behavioral and metabolic in nature. Previous work thus suggests that a recurrent stroke prediction model could help reduce the likelihood of a recurrent stroke. Previous work has shown promising results in predicting first-time stroke cases with machine learning approaches. However, there is limited work on recurrent stroke prediction using machine learning methods. Hence, this work performs an empirical analysis of machine learning algorithms applied to recurrent stroke prediction models. This research aims to investigate and compare the performance of machine learning algorithms using public clinical datasets on recurrent stroke. In this study, an Artificial Neural Network (ANN), a Support Vector Machine (SVM) and a Bayesian Rule List (BRL) are applied and their performance compared in the domain of recurrent stroke prediction. The results of the empirical experiments show that the ANN scores the highest accuracy at 80.00%, followed by BRL with 75.91% and SVM with 60.45%.