COMET: A Recipe for Learning and Using Large Ensembles on Massive Data
COMET is a single-pass MapReduce algorithm for learning on large-scale data.
It builds multiple random forest ensembles on distributed blocks of data and
merges them into a mega-ensemble. This approach is appropriate when learning
from massive-scale data that is too large to fit on a single machine. To get
the best accuracy, IVoting should be used instead of bagging to generate the
training subset for each decision tree in the random forest. Experiments with
two large datasets (5GB and 50GB compressed) show that COMET compares favorably
(in both accuracy and training time) to learning on a subsample of data using a
serial algorithm. Finally, we propose a new Gaussian approach for lazy ensemble
evaluation which dynamically decides how many ensemble members to evaluate per
data point; this can reduce evaluation cost by 100X or more.
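
The lazy evaluation idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact stopping rule, so the following is an assumption: it treats each member's vote as 0 or 1, uses a normal approximation to the running mean vote, and stops once the decision threshold lies more than `z` standard errors from that mean. The function name `lazy_vote` and the parameters `z`, `min_members`, and `threshold` are hypothetical and are not taken from COMET.

```python
import math


def lazy_vote(members, x, z=3.0, min_members=10, threshold=0.5):
    """Evaluate binary-voting ensemble members one at a time and stop early.

    Returns (label, n_evaluated). All names here are illustrative assumptions.
    """
    total = 0.0
    n = 0
    for member in members:
        total += float(member(x))  # each member is assumed to return 0 or 1
        n += 1
        if n < min_members:
            continue
        mean = total / n
        # For 0/1 votes the variance of a single vote is mean * (1 - mean).
        variance = max(mean * (1.0 - mean), 0.0)
        std_err = math.sqrt(variance / n)
        # Stop once the threshold is more than z standard errors from the mean.
        if abs(mean - threshold) > z * std_err:
            break
    if n == 0:
        raise ValueError("ensemble has no members")
    label = 1 if total / n >= threshold else 0
    return label, n


# Usage: a unanimous ensemble stops as soon as min_members votes are in.
label, used = lazy_vote([lambda x: 1] * 1000, x=None)
print(label, used)
```

When the first votes are unanimous the standard error is zero, so evaluation stops at `min_members`; an evenly split ensemble never meets the stopping condition and every member is evaluated. A larger `z` trades evaluation cost for a lower chance that the early decision differs from the full vote.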
Early hospital mortality prediction using vital signals
Early hospital mortality prediction is critical as intensivists strive to
make efficient medical decisions about the severely ill patients staying in
intensive care units. As a result, various methods have been developed to
address this problem based on clinical records. However, some laboratory test
results take time to obtain and must be processed first. In this paper, we
propose a novel method to predict mortality using features extracted from the
heart signals of patients within the first hour of ICU admission. In order to
predict the risk, quantitative features have been computed based on the heart
rate signals of ICU patients. Each signal is described in terms of 12
statistical and signal-based features. The extracted features are fed into
eight classifiers: decision tree, linear discriminant, logistic regression,
support vector machine (SVM), random forest, boosted trees, Gaussian SVM, and
K-nearest neighbors (K-NN). To derive insight into the performance of the
proposed method, several experiments have been conducted using the well-known
clinical dataset named Medical Information Mart for Intensive Care III
(MIMIC-III). The experimental results demonstrate the capability of the
proposed method in terms of precision, recall, F1-score, and area under the
receiver operating characteristic curve (AUC). The decision tree classifier
balances accuracy and interpretability better than the other classifiers,
producing an F1-score of 0.91 and an AUC of 0.93. This indicates that heart rate
signals can be used for predicting mortality in ICU patients, achieving
performance comparable to existing predictions that rely on high-dimensional
features from clinical records, which need to be processed and may contain
missing information.
Comment: 11 pages, 5 figures, preprint of accepted paper in IEEE&ACM CHASE
2018 and published in Smart Health journal
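
As an illustration of describing a heart rate signal by statistical features, here is a minimal sketch. The abstract does not list the 12 features used in the paper, so the six computed below are assumptions chosen only as examples, and the function name `heart_rate_features` is hypothetical.

```python
import statistics


def heart_rate_features(signal):
    """Summarize one heart-rate series (beats per minute) as a feature dict.

    The feature set is an illustrative assumption, not the paper's list.
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    # Successive differences capture short-term variability of the signal.
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.pstdev(signal),
        "min": min(signal),
        "max": max(signal),
        "range": max(signal) - min(signal),
        "mean_abs_diff": statistics.fmean([abs(d) for d in diffs]),
    }


# Usage: one short, made-up series of heart-rate readings.
features = heart_rate_features([60, 62, 61, 65])
print(features)
```

A feature dictionary like this, computed per patient, could then be turned into a row of a feature matrix and passed to any of the classifiers named in the abstract.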