2,348 research outputs found

    A Conceptual Framework to Predict Disease Progressions in Patients with Chronic Kidney Disease, Using Machine Learning and Process Mining

    Process Mining is a technique for analyzing and mining existing process flows. Machine Learning, by contrast, is a data science field and a sub-branch of Artificial Intelligence whose main purpose is to replicate human behavior through algorithms. The separate application of Process Mining and Machine Learning in healthcare has been widely explored, with numerous published works discussing their use. However, the simultaneous application of Process Mining and Machine Learning algorithms is still a growing field, with ongoing studies on its application. This paper proposes a feasible framework in which Process Mining and Machine Learning can be used in combination within the healthcare environment.

    Centralized and distributed learning methods for predictive health analytics

    The U.S. health care system is considered costly and highly inefficient, devoting substantial resources to the treatment of acute conditions in a hospital setting rather than focusing on prevention and keeping patients out of the hospital. The potential for cost savings is large; in the U.S. more than $30 billion is spent each year on hospitalizations deemed preventable, 31% of which is attributed to heart diseases and 20% to diabetes. Motivated by this, our work focuses on developing centralized and distributed learning methods to predict future heart- or diabetes-related hospitalizations based on patient Electronic Health Records (EHRs). We explore a variety of supervised classification methods and present a novel likelihood-ratio-based method (K-LRT) that predicts hospitalizations and offers interpretability by identifying the K most significant features that lead to a positive prediction for each patient. Next, assuming that the positive class consists of multiple clusters (patients hospitalized for different reasons), while the negative class is drawn from a single cluster (non-hospitalized patients healthy in every aspect), we present an alternating optimization approach, which jointly discovers the clusters in the positive class and optimizes the classifiers that separate each positive cluster from the negative samples. We establish the convergence of the method and characterize its VC dimension. Last, we develop a decentralized cluster Primal-Dual Splitting (cPDS) method for large-scale problems, which is computationally efficient and privacy-aware. Such a distributed learning scheme is relevant for multi-institutional collaborations or peer-to-peer applications, allowing the agents to collaborate while keeping every participant's data private. cPDS is proved to have an improved convergence rate compared to existing centralized and decentralized methods. We test all methods on real EHR data from the Boston Medical Center and compare results in terms of prediction accuracy and interpretability.

    Real-time data mining models for predicting length of stay in intensive care units

    Efficient cost and resource planning now plays a critical role in hospital management. Length Of Stay (LOS) is a good metric when the goal is to decrease costs and optimize resources. In Intensive Care Units (ICUs), optimization assumes even greater importance because of the high costs associated with inpatients. This study presents two data mining approaches to predict LOS in an ICU. The first approach considered the admission variables and some other physiologic variables collected during the first 24 hours of the stay. The second approach considered admission data and supplementary clinical data of the patient (vital signs and laboratory results) collected in real time. The first approach performed poorly (accuracy of 73%). However, when the prediction is made using the data collected in real time, the results are much more promising (sensitivity of 96.104%). The models induced in the second experiment are sensitive to the patient's clinical situation and can predict LOS according to the monitored variables. Models for predicting LOS at admission are not suited to the particularities of the ICU. Instead, they should be induced in real time, using online learning and considering the most recent patient condition when the model is induced.

    Data mining for prediction of length of stay of cardiovascular accident inpatients

    The healthcare sector generates large amounts of data on a daily basis. This data holds valuable knowledge that, beyond supporting a wide range of medical and healthcare functions such as clinical decision support, can be used for improving profits and cutting down on wasted overhead. The evaluation and analysis of stored clinical data may lead to the discovery of trends and patterns that can significantly enhance overall understanding of disease progression and clinical management. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes an implementation of a data mining project approach to predict the hospitalization period of cardiovascular accident patients. This provides an effective tool for hospital cost containment and management efficiency. The data used for this project contains information about patients hospitalized in the Cardiovascular Accident unit in 2016 after suffering a stroke. The Weka software was used as the machine learning toolkit. Fundação para a Ciência e a Tecnologia (UID/CEC/00319/2013)

    Benchmarking machine learning models on multi-centre eICU critical care dataset

    Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research (such as computer vision and natural language processing) have established various competitions and public benchmarks. The recent availability of large clinical datasets has made it possible to establish public benchmarks. Taking advantage of this opportunity, we propose a public benchmark suite to address four areas of critical care, namely mortality prediction, estimation of length of stay, patient phenotyping and risk of decompensation. We define each task and compare the performance of clinical models, baseline models and deep learning models using the eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of the clinical gold standard with our predictive model. We also investigate the impact of numerical variables as well as the handling of categorical variables on each of the defined tasks. The source code detailing our methods and experiments is publicly available so that anyone can replicate our results and build upon our work. Comment: Source code to replicate the results https://github.com/mostafaalishahi/eICU_Benchmar

    Predicting Chronic Disease Hospitalizations from Electronic Health Records: An Interpretable Classification Approach

    Urban living in modern large cities has significant adverse effects on health, increasing the risk of several chronic diseases. We focus on the two leading clusters of chronic disease, heart disease and diabetes, and develop data-driven methods to predict hospitalizations due to these conditions. We base these predictions on the patients' medical history, recent and more distant, as described in their Electronic Health Records (EHR). We formulate the prediction problem as a binary classification problem and consider a variety of machine learning methods, including kernelized and sparse Support Vector Machines (SVM), sparse logistic regression, and random forests. To strike a balance between accuracy and interpretability of the prediction, which is important in a medical setting, we propose two novel methods: K-LRT, a likelihood ratio test-based method, and a Joint Clustering and Classification (JCC) method, which identifies hidden patient clusters and adapts classifiers to each cluster. We develop theoretical out-of-sample guarantees for the latter method. We validate our algorithms on large datasets from the Boston Medical Center, the largest safety-net hospital system in New England.

    Prediction of length of stay for stroke patients using artificial neural networks

    Strokes are neurological events that affect a certain area of the brain. Since the brain controls fundamental body activities, brain cell deterioration and death can lead to serious disabilities and poor quality of life. This makes strokes a leading cause of disability and mortality worldwide. Patients who suffer strokes are hospitalized to undergo surgery and receive recovery therapies. Thus, it is important to predict the length of stay for these patients, since it can be costly to them and their families, as well as to the medical institutions. The aim of this study is to predict the number of days of patients' hospital stays based on information available about the neurological event, the patient's health status and surgery details. A neural network was tested with three attribute subsets of different sizes. The best result was obtained with the subset containing the fewest features, with an RMSE of 5.9451 and an MAE of 4.6354. FCT - Fundação para a Ciência e a Tecnologia (UID/CEC/00319/2019)