1,420 research outputs found
Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare
Missing data is one of the most common issues encountered in the data cleaning process, especially when dealing with medical datasets. A dataset collected in the real world is prone to be incomplete, inconsistent, noisy and redundant for reasons such as human errors, instrumental failures, and adverse death. Therefore, to deal accurately with incomplete data, sophisticated algorithms have been proposed to impute the missing values. Many machine learning algorithms have been applied to impute missing data with plausible values. Among these, the KNN algorithm has been widely adopted for imputing missing data because of its robustness and simplicity, and it shows promise of outperforming other machine learning methods. This paper provides a comprehensive review of different imputation techniques used to replace missing data. The goal of the review is to draw attention to potential improvements to existing methods and to give readers a better grasp of trends in imputation techniques.
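To make the KNN idea mentioned in this abstract concrete, here is a minimal sketch in plain Python. It is not taken from the reviewed paper: the function name `knn_impute`, the use of `None` to mark missing entries, the distance (root mean squared difference over the columns both rows have observed), and the unweighted mean of the neighbours' values are all assumptions chosen for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration only; `rows` is a list of equal-length
    lists of floats, with None marking a missing value.
    """

    def distance(a, b):
        # Compare only the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common columns, so the rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]  # copy so the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that have column j observed and are comparable.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the gap if no donor exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy data: the second record is missing its third measurement.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [10.0, 20.0, 30.0],
    [1.2, 1.9, 3.2],
]
filled = knn_impute(rows, k=2)
print(filled[1])
```

In this toy example the missing entry is filled from the two records closest to the incomplete one, while the distant third record is ignored. Practical implementations usually also scale the features and may weight neighbours by distance; both are omitted here for brevity.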
Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation
Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries.
Rehabilitation after such a musculoskeletal injury remains a prolonged process
with a very variable outcome. Accurately predicting rehabilitation outcome is
crucial for treatment decision support. However, it is challenging to train an
automatic method for predicting the ATR rehabilitation outcome from treatment
data, due to the large number of missing entries in the data recorded from ATR
patients, as well as complex nonlinear relations between measurements and
outcomes. In this work, we design an end-to-end probabilistic framework to
impute missing data entries and predict rehabilitation outcomes simultaneously.
We evaluate our model on a real-life ATR clinical cohort, comparing it with
various baselines. The proposed method demonstrates its clear superiority over
traditional methods which typically perform imputation and prediction in two
separate stages.
Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning
Missing data in time series is a challenging issue affecting time series
analysis. Missing data occurs due to problems like data drops or sensor
malfunctioning. Imputation methods are used to fill in these values, with
quality of imputation having a significant impact on downstream tasks like
classification. In this work, we propose a semi-supervised imputation method,
ST-Impute, which uses both unlabeled data and the downstream task's labeled
data. ST-Impute is based on sparse self-attention and trains on tasks that
mimic the imputation process. Our results indicate that the proposed method
outperforms the existing supervised and unsupervised time series imputation
methods, measured both on imputation quality and on the downstream tasks
that ingest the imputed time series.