
    Development and utilization of artificial intelligence technology based on electronic medical records for improving hospital processes

    Background and objective: As the global healthcare market expands and medical standards improve, healthcare costs are increasing. At the same time, the financial burden of these costs on individuals and on nations with limited budgets is growing. Various approaches to reducing and controlling healthcare costs have been proposed to address this issue. Among them, hospital process improvement is considered an important way to benefit patients and treatment services, and process optimization in particular is regarded as an effective way to reduce inefficiencies for both hospitals and patients. Recent research has actively applied artificial intelligence (AI) to medicine, and efforts to overcome the constraints of medical resources are increasing. Particular attention has been paid to predicting emergency department (ED) overcrowding and hospital bed occupancy rates (BORs). ED overcrowding can lead to increased mortality, longer wait times, treatment errors, and diagnostic and procedural delays. High BORs can also negatively affect the health of healthcare staff and increase the risk of infections. In this study, we developed AI models that use electronic medical records (EMR) to improve hospital processes.
Method: In the first chapter, we focus on creating models to predict the likelihood of admission within 24 hours for patients passing through the ED and to forecast expected wait times. These models were developed to support quick decision-making by ED physicians and showed strong performance on both tasks. Furthermore, we enhanced model performance by leveraging unstructured text data, and we demonstrated the importance of unstructured text in ED notes by using explainable artificial intelligence (XAI) to examine variable influences.
In the second chapter, we conducted research on predicting the BORs of individual wards and rooms. We combined time-series data on bed occupancy recorded at hourly intervals with static room data to create various datasets. Using these datasets, we developed two models for predicting ward BORs and four models for predicting room BORs. These models demonstrated high performance, and the model that combined dynamic and static data and predicted BORs at weekly intervals performed best, which emphasizes the importance of static data.
Results: In chapter 1, among the evaluated models, the extreme gradient boosting (XGB) model that incorporated text data yielded the best performance, with an area under the receiver operating characteristic curve (AUROC) of 0.922 and an area under the precision-recall curve (AUPRC) of 0.687. For wait-time prediction, the mean absolute error (MAE) was approximately 3 hours. Through XAI, we identified the important variables affecting the classification and found that the most influential were mainly unstructured text variables. In chapter 2, the ward-level prediction model achieved an MAE of 0.057, a mean squared error (MSE) of 0.007, a root mean squared error (RMSE) of 0.082, and an R² score of 0.582. Among the room-level prediction models, the model that incorporated static data exhibited superior performance, with an MAE of 0.123, an MSE of 0.051, an RMSE of 0.226, and an R² score of 0.320. Model results can be displayed on an electronic dashboard for easy access via the web.
Conclusions: Research aimed at improving hospital processes must produce practical results that healthcare institutions can use. We have therefore proposed a virtual web application that can be applied in practice, with the aim of significantly improving the economic efficiency of hospital and ED operations. Applying AI models to hospital processes can simplify procedures and make efficient use of limited medical resources. This not only enhances medical services but also offers the potential for cost savings.
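
As an illustration of the error measures reported in the Results, the following minimal Python sketch computes MAE, MSE, RMSE, and R² from paired observed and predicted values. The function name `regression_metrics` and the sample occupancy rates are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for paired observations and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 = 1 - SS_res / SS_tot; not defined when y_true is constant
    r2 = 1.0 - (mse * n) / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical bed occupancy rates (fractions of beds in use), for illustration only
observed = [0.80, 0.90, 0.70, 1.00]
predicted = [0.75, 0.95, 0.80, 0.90]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the reported pairs can be cross-checked: the square root of 0.051 is about 0.226 for the room-level model, and 0.082 squared is about 0.007 for the ward-level model.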

    Self-Training With Quantile Errors for Multivariate Missing Data Imputation for Regression Problems in Electronic Medical Records: Algorithm Development Study

    Background: When machine learning is used in the real world, missing values are the first problem encountered. Methods for imputing missing values include statistical methods such as mean imputation, expectation-maximization, and multiple imputation by chained equations (MICE), as well as machine learning methods such as multilayer perceptron, k-nearest neighbor, and decision tree.
Objective: The objective of this study was to impute numeric medical data such as physical data and laboratory data. We aimed to impute data effectively using a progressive method called self-training in the medical field, where training data are scarce.
Methods: In this paper, we propose a self-training method that gradually increases the available data. Models trained with complete data predict the missing values in incomplete data. Among the incomplete data, the records whose missing values are validly predicted are incorporated into the complete data. Using the predicted value as the actual value is called pseudolabeling. This process is repeated until a stopping condition is satisfied. The most important part of this process is how to evaluate the accuracy of pseudolabels. They can be evaluated by observing the effect of the pseudolabeled data on the performance of the model.
Results: In self-training using random forest (RF), mean squared error was up to 12% lower than with pure RF, and the Pearson correlation coefficient was 0.1% higher. This difference was confirmed statistically. In the Friedman test performed on MICE and RF, self-training showed a P value between .003 and .02. A Wilcoxon signed-rank test performed on the mean imputation showed the lowest possible P value, 3.05e-5, in all situations.
Conclusions: Self-training showed significant results in comparing the predicted values and actual values, but it needs to be verified in an actual machine learning system. Self-training also has the potential to improve performance depending on the pseudolabel evaluation method, which will be the main subject of our future research.
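
The self-training loop described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a one-feature k-nearest-neighbor regressor instead of random forest, and it accepts a pseudolabel only when adding it does not increase the error on a held-out validation set, which is a simplified stand-in for the paper's pseudolabel evaluation. All function names and data values are hypothetical.

```python
def knn_predict(train, x, k=2):
    """Predict y at x as the mean y of the k nearest (x, y) rows in train."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def val_mse(train, validation, k=2):
    """Mean squared error of the k-NN model on held-out (x, y) rows."""
    return sum((knn_predict(train, x, k) - y) ** 2
               for x, y in validation) / len(validation)


def self_train(complete, incomplete_xs, validation, k=2, max_rounds=5):
    """Grow the labeled set with pseudolabels that do not hurt validation MSE."""
    labeled = list(complete)
    pending = list(incomplete_xs)
    best_err = val_mse(labeled, validation, k)
    for _ in range(max_rounds):
        still_pending = []
        for x in pending:
            # Pseudolabel: treat the model's prediction as the actual value
            candidate = (x, knn_predict(labeled, x, k))
            trial_err = val_mse(labeled + [candidate], validation, k)
            if trial_err <= best_err:      # assumed acceptance rule
                labeled.append(candidate)
                best_err = trial_err
            else:
                still_pending.append(x)
        if len(still_pending) == len(pending):  # nothing accepted: stop
            break
        pending = still_pending
    return labeled, pending, best_err


# Hypothetical rows: x is an observed variable, y is the variable to impute
complete = [(1.0, 2.0), (2.0, 4.1), (4.0, 8.2), (5.0, 9.9)]
incomplete_xs = [3.0, 6.0]  # rows whose y value is missing
validation = [(1.5, 3.0), (3.5, 7.0), (4.5, 9.0)]

labeled, pending, final_err = self_train(complete, incomplete_xs, validation)
print(len(labeled), pending, final_err)
```

Accepting candidates one at a time keeps the validation error from increasing between steps. A stricter or looser acceptance rule would change how many pseudolabels enter the complete data, which is the design choice the abstract identifies as most important.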