2,348 research outputs found
A Conceptual Framework to Predict Disease Progressions in Patients with Chronic Kidney Disease, Using Machine Learning and Process Mining
Process Mining is a technique for analyzing and mining existing process flows. Machine Learning, on the other hand, is a data science field and a sub-branch of Artificial Intelligence whose main purpose is to replicate human behavior through algorithms. The separate application of Process Mining and Machine Learning for healthcare purposes has been widely explored, with numerous published works discussing their use. However, the simultaneous application of Process Mining and Machine Learning algorithms is still a growing field with ongoing studies on its application. This paper proposes a feasible framework in which Process Mining and Machine Learning can be used in combination within the healthcare environment.
Centralized and distributed learning methods for predictive health analytics
The U.S. health care system is considered costly and highly inefficient, devoting substantial resources to the treatment of acute conditions in a hospital setting rather than focusing on prevention and keeping patients out of the hospital. The potential for cost savings is large; in the U.S. more than $30 billion is spent each year on hospitalizations deemed preventable, 31% of which is attributed to heart diseases and 20% to diabetes. Motivated by this, our work focuses on developing centralized and distributed learning methods to predict future heart- or diabetes-related hospitalizations based on patient Electronic Health Records (EHRs).
We explore a variety of supervised classification methods and we present a novel likelihood ratio based method (K-LRT) that predicts hospitalizations and offers interpretability by identifying the K most significant features that lead to a positive prediction for each patient. Next, assuming that the positive class consists of multiple clusters (hospitalized patients due to different reasons), while the negative class is drawn from a single cluster (non-hospitalized patients healthy in every aspect), we present an alternating optimization approach, which jointly discovers the clusters in the positive class and optimizes the classifiers that separate each positive cluster from the negative samples. We establish the convergence of the method and characterize its VC dimension. Last, we develop a decentralized cluster Primal-Dual Splitting (cPDS) method for large-scale problems, that is computationally efficient and privacy-aware.
Such a distributed learning scheme is relevant for multi-institutional collaborations or peer-to-peer applications, allowing the agents to collaborate, while keeping every participant's data private. cPDS is proved to have an improved convergence rate compared to existing centralized and decentralized methods. We test all methods on real EHR data from the Boston Medical Center and compare results in terms of prediction accuracy and interpretability.
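The idea of a likelihood-ratio classifier that reports the K most influential features per patient can be sketched as follows. This is only an illustration under loud assumptions that are not taken from the abstract: features are binary, they are treated as conditionally independent given the class, rates are Laplace-smoothed, and the score is the sum of the K largest per-feature log-likelihood ratios. The function names, the toy data, and the threshold are hypothetical, and the published K-LRT may differ in its details.

```python
import math

def fit_feature_rates(X, y, alpha=1.0):
    """Estimate P(feature = 1 | class) per feature with Laplace smoothing.

    Assumption: binary features and binary labels (1 = hospitalized).
    """
    n_features = len(X[0])
    rates = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        rates[label] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return rates

def k_lrt_predict(x, rates, k=1, threshold=0.0):
    """Score one patient using only the k largest log-likelihood ratios.

    Returns (prediction, indices of the k selected features, score).
    """
    llrs = []
    for j, v in enumerate(x):
        p1 = rates[1][j] if v else 1.0 - rates[1][j]
        p0 = rates[0][j] if v else 1.0 - rates[0][j]
        llrs.append((math.log(p1 / p0), j))
    top = sorted(llrs, reverse=True)[:k]
    score = sum(llr for llr, _ in top)
    return int(score > threshold), [j for _, j in top], score

# Made-up toy data: 6 patients, 3 binary features.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0], [0, 1, 1]]
y = [1, 1, 1, 0, 0, 0]

rates = fit_feature_rates(X, y)
prediction, top_features, score = k_lrt_predict([1, 1, 0], rates, k=1)
print(prediction, top_features, round(score, 3))
```

The returned feature indices are what would give the interpretability described above: they name the features that contributed most to a positive prediction for that one patient.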
Real-time data mining models for predicting length of stay in intensive care units
Nowadays, efficient planning of costs and resources in hospitals plays a critical role in the
management of these units. Length of Stay (LOS) is a good metric when the goal is to decrease costs and to
optimize resources. In Intensive Care Units (ICUs), optimization is even more important because of
the high costs associated with inpatients. This study presents two data mining approaches to predict LOS
in an ICU. The first approach considered the admission variables and some other physiological variables
collected during the first 24 hours of the inpatient stay. The second approach considered admission data and
supplementary clinical data of the patient (vital signs and laboratory results) collected in real time. The
results achieved with the first approach are very poor (accuracy of 73%). However, when the prediction is
made using the data collected in real time, the results are very interesting (sensitivity of 96.104%). The
models induced in the second experiment are sensitive to the patient's clinical situation and can predict LOS
according to the monitored variables. Models for predicting LOS at admission are not suited to the
particularities of the ICU. Instead, models should be induced in real time, using online learning and
considering the most recent patient condition when the model is induced.
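The two figures quoted in this abstract are different metrics (accuracy for the first approach, sensitivity for the second), so they are not directly comparable. A minimal sketch of how both are derived from the same confusion counts is given below; the counts are made up for illustration and do not come from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of actual positives that were found
        "specificity": tn / (tn + fp),      # share of actual negatives that were found
    }

# Hypothetical counts: high sensitivity can coexist with modest accuracy.
metrics = classification_metrics(tp=48, fp=30, tn=20, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these invented counts the classifier finds almost every positive case while still misclassifying many negatives, which is why a sensitivity figure alone says little about overall accuracy.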
Data mining for prediction of length of stay of cardiovascular accident inpatients
The healthcare sector generates large amounts of data on a daily basis. This data holds valuable knowledge that, beyond supporting a wide range of medical and healthcare functions such as clinical decision support, can be used to improve profits and cut wasted overhead. The evaluation and analysis of stored clinical data may lead to the discovery of trends and patterns that can significantly enhance overall understanding of disease progression and clinical management. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes the implementation of a data mining project to predict the hospitalization period of cardiovascular accident patients, providing an effective tool for hospital cost containment and management efficiency. The data used for this project contains information about patients hospitalized in a Cardiovascular Accident unit in 2016 after suffering a stroke. The Weka software was used as the machine learning toolkit. Funding: Fundação para a Ciência e a Tecnologia (UID/CEC/00319/2013).
Benchmarking machine learning models on multi-centre eICU critical care dataset
Progress of machine learning in critical care has been difficult to track, in
part due to the absence of public benchmarks. Other fields of research (such as
computer vision and natural language processing) have established various
competitions and public benchmarks. Recent availability of large clinical
datasets has enabled the possibility of establishing public benchmarks. Taking
advantage of this opportunity, we propose a public benchmark suite to address
four areas of critical care, namely mortality prediction, estimation of length
of stay, patient phenotyping and risk of decompensation. We define each task
and compare the performance of clinical models as well as baseline and
deep learning models using the eICU critical care dataset of around 73,000
patients. This is the first public benchmark on a multi-centre critical care
dataset, comparing the performance of clinical gold standard with our
predictive model. We also investigate the impact of numerical variables as well
as handling of categorical variables on each of the defined tasks. The source
code, detailing our methods and experiments, is publicly available such that
anyone can replicate our results and build upon our work.
Comment: Source code to replicate the results:
https://github.com/mostafaalishahi/eICU_Benchmar
Predicting Chronic Disease Hospitalizations from Electronic Health Records: An Interpretable Classification Approach
Urban living in modern large cities has significant adverse effects on
health, increasing the risk of several chronic diseases. We focus on the two
leading clusters of chronic disease, heart disease and diabetes, and develop
data-driven methods to predict hospitalizations due to these conditions. We
base these predictions on the patients' medical history, recent and more
distant, as described in their Electronic Health Records (EHR). We formulate
the prediction problem as a binary classification problem and consider a
variety of machine learning methods, including kernelized and sparse Support
Vector Machines (SVM), sparse logistic regression, and random forests. To
strike a balance between accuracy and interpretability of the prediction, which
is important in a medical setting, we propose two novel methods: K-LRT, a
likelihood ratio test-based method, and a Joint Clustering and Classification
(JCC) method which identifies hidden patient clusters and adapts classifiers to
each cluster. We develop theoretical out-of-sample guarantees for the latter
method. We validate our algorithms on large datasets from the Boston Medical
Center, the largest safety-net hospital system in New England.
Prediction of length of stay for stroke patients using artificial neural networks
Strokes are neurological events that affect a certain area of the brain. Since the brain controls fundamental body activities, brain cell deterioration and death can lead to serious disabilities and poor quality of life. This makes strokes a leading cause of disability and mortality worldwide. Patients who suffer strokes are hospitalized in order to undergo surgery and receive recovery therapies. Thus, it is important to predict the length of stay for these patients, since it can be costly to them and their families, as well as to the medical institutions. The aim of this study is to predict the number of days of patients’ hospital stays based on information available about the neurological event, the patient’s health status, and surgery details. A neural network was tested with three attribute subsets of different sizes. The best result was obtained with the subset with fewer features, yielding an RMSE and an MAE of 5.9451 and 4.6354, respectively. Funding: FCT - Fundação para a Ciência e a Tecnologia (UID/CEC/00319/2019).
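For readers unfamiliar with the two error measures reported in this abstract, the sketch below shows how RMSE and MAE are computed for length-of-stay predictions measured in days. The stay lengths are invented for illustration and are unrelated to the study's data.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors, in days."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: like MAE, but penalizes large errors more."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical actual and predicted stays (days) for four patients.
actual_days = [3, 10, 7, 21]
predicted_days = [5, 8, 7, 15]

print("MAE :", mae(actual_days, predicted_days))
print("RMSE:", rmse(actual_days, predicted_days))
```

RMSE is never smaller than MAE on the same data, and a large gap between the two suggests that a few stays are predicted with large errors.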