17 research outputs found

AutoMLBench: A Comprehensive Experimental Evaluation of Automated Machine Learning Frameworks

Author: Aldallal Abdelrahman
Eldeeb Hassan
Elshawi Radwa
Maher Mohamed
Matsuk Oleh
Sakr Sherif
Publication date: 18/04/2022

Machine learning now plays a crucial role in harnessing the massive amounts of data produced every day. With the booming demand for machine learning applications, it has been recognized that the number of knowledgeable data scientists cannot scale with growing data volumes and application needs. In response, several automated machine learning (AutoML) techniques and frameworks have been developed to fill the gap in human expertise by automating the process of building machine learning pipelines. In this study, we present a comprehensive evaluation and comparison of the performance characteristics of six popular AutoML frameworks, namely Auto-Weka, AutoSKlearn, TPOT, Recipe, ATM, and SmartML, across 100 data sets from established AutoML benchmark suites. Our experimental evaluation considers several aspects, including the performance impact of design decisions such as time budget, size of search space, meta-learning, and ensemble construction. The results reveal various insights that can guide the design of AutoML frameworks.

arXiv.org e-Print Archive

Using machine learning on cardiorespiratory fitness data for predicting hypertension: The Henry Ford ExercIse Testing (FIT) Project

Author: Ahmed Amjad
Al-Mallah Mouaz H
Blaha Michael J
Brawner Clinton
Elshawi Radwa
Keteyian Steven J
Qureshi Waqas T
Sakr Sherif
Publication venue: Henry Ford Health System Scholarly Commons
Publication date: 01/01/2018

This study evaluates and compares the performance of different machine learning techniques in predicting which individuals are at risk of developing hypertension, and who are likely to benefit most from interventions, using cardiorespiratory fitness data. The dataset contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. The variables include information on vital signs, diagnosis, and clinical laboratory measurements. Six machine learning techniques were investigated: LogitBoost (LB), Bayesian Network classifier (BN), Locally Weighted Naive Bayes (LWB), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Tree Forest (RTF). Using different validation methods, the RTF model showed the best performance (AUC = 0.93) and outperformed all other machine learning techniques examined in this study. The results also show that it is critical to carefully explore and evaluate the performance of machine learning models using various model evaluation methods, as the prediction accuracy can differ significantly.

Henry Ford Health System Scholarly Commons

Directory of Open Access Journals

A flowchart of our experimental process.

Author: Amjad Ahmed (5117423)
Clinton Brawner (4267915)
Michael J. Blaha (507123)
Mouaz H. Al-Mallah (206468)
Radwa Elshawi (5117426)
Sherif Sakr (4267924)
Steven Keteyian (4267921)
Waqas T. Qureshi (823529)

A flowchart of our experimental process.

FigShare

The performance of the Different Machine Learning Models evaluated using the 10-fold cross validation method using SMOTE.

Author: Amjad Ahmed (5117423)
Clinton Brawner (4267915)
Michael J. Blaha (507123)
Mouaz H. Al-Mallah (206468)
Radwa Elshawi (5117426)
Sherif Sakr (4267924)
Steven Keteyian (4267921)
Waqas T. Qureshi (823529)

The RTF model achieves the highest AUC (0.93), F-Score (86.70%), sensitivity (69.96%) and specificity (91.71%).

FigShare

Using Machine Learning to Define the Association between Cardiorespiratory Fitness and All-Cause Mortality (from the Henry Ford Exercise Testing Project)

Author: Ahmed Amjad M
Ahmed Haitham M
Al-Mallah Mouaz H
Blaha Michael J
Brawner Clinton
Ehrman Jonathan K
Elshawi Radwa
Keteyian Steven J
Qureshi Waqas T
Sakr Sherif
Publication venue: Henry Ford Health System Scholarly Commons
Publication date: 01/12/2017

Previous studies have demonstrated that cardiorespiratory fitness is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of the analysis is to compare the prediction of 10-year all-cause mortality (ACM) using statistical logistic regression (LR) and ML approaches in a cohort of patients who underwent exercise stress testing. We included 34,212 patients (55% male, mean age 54 ± 13 years) free of coronary artery disease or heart failure who underwent exercise treadmill stress testing between 1991 and 2009 and had complete 10-year follow-up. The primary outcome of this analysis was ACM at 10 years. The probability of 10-year ACM was calculated using statistical LR and ML, and the accuracy of these methods was calculated and compared. A total of 3,921 patients died at 10 years. Using statistical LR, the sensitivity to predict ACM was 44.9% (95% confidence interval [CI] 43.3% to 46.5%), whereas the specificity was 93.4% (95% CI 93.1% to 93.7%). The sensitivity of ML to predict ACM was 87.4% (95% CI 86.3% to 88.4%), whereas the specificity was 97.2% (95% CI 97.0% to 97.4%). The ML approach was associated with improved model discrimination (area under the curve for ML [0.923 (95% CI 0.917 to 0.928)]) compared with statistical LR (0.836 [95% CI 0.829 to 0.846],

Henry Ford Health System Scholarly Commons

AUC Curves for the Different Machine Learning Models using SMOTE evaluated using 10-fold cross-validation.

Author: Amjad Ahmed (5117423)
Clinton Brawner (4267915)
Michael J. Blaha (507123)
Mouaz H. Al-Mallah (206468)
Radwa Elshawi (5117426)
Sherif Sakr (4267924)
Steven Keteyian (4267921)
Waqas T. Qureshi (823529)

AUC Curves for the Different Machine Learning Models using SMOTE evaluated using 10-fold cross-validation.

FigShare

The performance of the Different Machine Learning Models evaluated using the Hold Out method (70/30) using SMOTE.

Author: Amjad Ahmed (5117423)
Clinton Brawner (4267915)
Michael J. Blaha (507123)
Mouaz H. Al-Mallah (206468)
Radwa Elshawi (5117426)
Sherif Sakr (4267924)
Steven Keteyian (4267921)
Waqas T. Qureshi (823529)

The RTF model achieves the highest AUC (0.88), sensitivity (74.30%), precision (73.50%) and F-Score (73.90%).

FigShare

Comparison of the performance of the Artificial Neural Network (ANN) classifier with gradient descent back-propagation using hidden units {1, 2, 4, 8} and momentum {0, 0.2, 0.5}, evaluated using 10-fold cross-validation with SMOTE.

Author: Amjad Ahmed (5117423)
Clinton Brawner (4267915)
Michael J. Blaha (507123)
Mouaz H. Al-Mallah (206468)
Radwa Elshawi (5117426)
Sherif Sakr (4267924)
Steven Keteyian (4267921)
Waqas T. Qureshi (823529)

Comparison of the performance of the Artificial Neural Network (ANN) classifier with gradient descent back-propagation using hidden units {1, 2, 4, 8} and momentum {0, 0.2, 0.5}, evaluated using 10-fold cross-validation with SMOTE.

FigShare