Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation
Achilles tendon rupture (ATR) is a typical soft tissue injury.
Rehabilitation after such a musculoskeletal injury remains a prolonged process
with a highly variable outcome. Accurately predicting rehabilitation outcome is
crucial for treatment decision support. However, it is challenging to train an
automatic method for predicting the ATR rehabilitation outcome from treatment
data, due to the large number of missing entries in the data recorded from ATR
patients, as well as complex nonlinear relations between measurements and
outcomes. In this work, we design an end-to-end probabilistic framework to
impute missing data entries and predict rehabilitation outcomes simultaneously.
We evaluate our model on a real-life ATR clinical cohort, comparing it with
various baselines. The proposed method demonstrates clear superiority over
traditional methods, which typically perform imputation and prediction in two
separate stages.
Predicting Unplanned Hospital Readmissions using Patient Level Data
The rate of unplanned hospital readmissions in the US is likely to rise steadily after 2020. Hence, this issue has received considerable critical attention from policymakers. The majority of hospitals in the US pay millions of dollars in penalties for readmitting patients within 30 days, owing to strict norms imposed by the Hospital Readmissions Reduction Program. In this study, we develop two novel models, PURE (Predicting Unplanned Readmissions using Embeddings) and Hybrid DeepR, which use the historical medical events of patients to predict readmissions within 30 days. Both are hybrid sequence models that leverage sequential events (the history of events) and static features (such as gender and blood pressure) of the patients to mine patterns in the data. Our results are promising, and they benchmark previous results in predicting hospital readmissions. The contributions of this study add to the existing literature on healthcare analytics.
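The idea of combining sequential events with static features can be illustrated with a minimal sketch. This is not the PURE or Hybrid DeepR architecture, which the abstract does not specify. As a stand-in for a sequence model, the sketch mean-pools per-event embedding vectors (which discards event order) and concatenates the result with the static features before a logistic score. The function names, the embedding table, and the weights are all hypothetical.

```python
import math


def mean_pool(event_ids, embedding_table):
    """Average the embedding vectors of a patient's historical events.

    Assumption: mean pooling replaces a real sequence model here, so the
    order of events is ignored.
    """
    dim = len(next(iter(embedding_table.values())))
    if not event_ids:
        return [0.0] * dim
    pooled = [0.0] * dim
    for event in event_ids:
        vector = embedding_table[event]
        for i in range(dim):
            pooled[i] += vector[i]
    return [value / len(event_ids) for value in pooled]


def hybrid_features(event_ids, static_features, embedding_table):
    """Concatenate the pooled event embedding with the static features."""
    return mean_pool(event_ids, embedding_table) + list(static_features)


def readmission_score(features, weights, bias):
    """Logistic score in (0, 1); weights and bias are hypothetical."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical two-dimensional embeddings for two event codes.
table = {"admit": [1.0, 0.0], "lab": [0.0, 1.0]}
features = hybrid_features(["admit", "lab"], [1.0, 120.0], table)
score = readmission_score(features, [0.0, 0.0, 0.0, 0.0], 0.0)
```

In a trained model, the embedding table and the weights would be learned from data rather than fixed by hand as they are here.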
Learning Patient Static Information from Time-series EHR and an Approach for Safeguarding Privacy and Fairness
Recent work in machine learning for healthcare has raised concerns about
patient privacy and algorithmic fairness. For example, previous work has shown
that patient self-reported race can be predicted from medical data that does
not explicitly contain racial information. However, the extent of data
identification is unknown, and we lack ways to develop models whose outcomes
are minimally affected by such information. Here we systematically investigated
the ability of time-series electronic health record data to predict patient
static information. We found that not only the raw time-series data, but also
learned representations from machine learning models, can be used to predict
a variety of static information with area under the receiver operating
characteristic curve as high as 0.851 for biological sex, 0.869 for binarized
age and 0.810 for self-reported race. Such high predictive performance can be
extended to a wide range of comorbidity factors and exists even when the model
was trained for different tasks, using different cohorts, model
architectures, and databases. Given the privacy and fairness concerns these
findings pose, we develop a variational autoencoder-based approach that learns
a structured latent space to disentangle patient-sensitive attributes from
time-series data. Our work thoroughly investigates the ability of machine
learning models to encode patient static information from time-series
electronic health records and introduces a general approach to protect
patient-sensitive attribute information for downstream tasks.
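For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve (AUROC) for a binary attribute can be computed as the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as half. The sketch below is a generic pairwise implementation, not the authors' evaluation code, and the labels and scores are made-up illustration values.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison (the Mann-Whitney U formulation).

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of model scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up example: 3 of the 4 positive-negative pairs are ordered correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported figures of 0.851, 0.869, and 0.810 should be read.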