Predictive Models for Bariatric Surgery Risks with Imbalanced Medical Datasets

Authors: Ewing Joseph; Razzaghi Talayeh; Sadrfaridpour Ehsan; Safro Ilya; Scott John D.
Publication venue: Clemson University Libraries
Publication date: 01/08/2017

Bariatric surgery (BAR) has become a popular treatment for type 2 diabetes mellitus (T2DM), which is among the most critical obesity-related comorbidities. Patients who undergo bariatric surgery are exposed to complications after surgery. Furthermore, the mid- to long-term complications after bariatric surgery can be deadly and increase both the complexity of managing the safety of these operations and healthcare costs. Current studies on BAR complications have mainly used risk scoring to identify patients who are more likely to have complications after surgery. However, these studies do not take into consideration the imbalanced nature of the data, where the class of interest (patients who have complications after surgery) is relatively small. We propose the use of imbalanced classification techniques to tackle the imbalanced bariatric surgery data: synthetic minority oversampling technique (SMOTE), random undersampling, and ensemble learning classification methods including Random Forest, Bagging, and AdaBoost. Moreover, we improve classification performance by using Chi-Squared, Information Gain, and Correlation-based feature selection (CFS) techniques. We study the Premier Healthcare Database with a focus on the most frequent complications, including Diabetes, Angina, Heart Failure, and Stroke. Our results show that the ensemble learning-based classification techniques, using any of the feature selection methods mentioned above, are the best approach for handling the imbalanced nature of the bariatric surgical outcome data. In our evaluation, we find a slight preference toward the SMOTE method over the random undersampling method. These results demonstrate the potential of machine-learning tools as clinical decision support in identifying risks/outcomes associated with bariatric surgery and their effectiveness in reducing surgery complications as well as improving patient care.
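
The two resampling strategies named in this abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names `smote` and `random_undersample`, the toy 2-D points, and the choice of `k=3` neighbours are all assumptions made here for illustration. The SMOTE variant shown follows the usual description of the technique, in which a synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k minority samples closest to the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a uniformly random subset of the majority class (no replacement)."""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)


# Hypothetical toy data: 100 majority points on a grid, 5 minority points.
majority = [[float(x), float(y)] for x in range(10) for y in range(10)]
minority = [[20.0, 20.0], [21.0, 20.0], [20.0, 21.0], [21.0, 21.0], [20.5, 20.5]]

# Option 1: grow the minority class until both classes have the same size.
synthetic = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
oversampled_minority = minority + synthetic

# Option 2: shrink the majority class down to the minority size.
undersampled_majority = random_undersample(majority, n_keep=len(minority))

print(len(oversampled_minority), len(undersampled_majority))
```

Oversampling keeps every original record at the cost of adding interpolated ones, while undersampling discards majority records; which trade-off is preferable depends on the dataset.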

Imbalanced Ensemble Classifier for learning from imbalanced business school data set

Author: Chakraborty Tanujit
Publication date: 17/10/2018

Private business schools in India face a common problem of selecting quality students for their MBA programs to achieve the desired placement percentage. Generally, such data sets are biased towards one class, i.e., imbalanced in nature, and learning from an imbalanced dataset is difficult. This paper proposes an imbalanced ensemble classifier that can handle the imbalanced nature of the dataset and achieves higher accuracy on the combined feature selection (selection of important characteristics of students) and classification (prediction of placements based on the students' characteristics) problem for an Indian business school dataset. The optimal value of an important model parameter is found. Numerical evidence on the Indian business school dataset is also provided to demonstrate the outstanding performance of the proposed classifier.
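
One reason learning from a class-biased dataset is difficult can be shown with a small sketch: a classifier that ignores the minority class entirely can still score a high overall accuracy. The 95/5 class split and the helper names below are hypothetical and are not taken from the paper or its dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples that were labelled correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of the samples of class `positive` that were recovered."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total


# Hypothetical labels: 95 majority-class samples (0), 5 minority-class samples (1).
y_true = [0] * 95 + [1] * 5

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)

overall_accuracy = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)

print(overall_accuracy, minority_recall)
```

Reporting per-class figures alongside overall accuracy makes this failure mode visible when comparing classifiers on imbalanced data.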

arXiv.org e-Print Archive