Machine Learning to Identify Dialysis Patients at High Death Risk.
Introduction: Given the high mortality rate within the first year of dialysis initiation, an accurate estimate of postdialysis mortality could help patients and clinicians decide whether to initiate dialysis. We aimed to use machine learning (ML), incorporating complex information from electronic health records, to identify patients at risk of short-term postdialysis mortality.
Methods: This study was carried out on a contemporary cohort of 27,615 US veterans with incident end-stage renal disease (ESRD). We applied a random forest method to 49 variables obtained before dialysis transition to predict 30-, 90-, 180-, and 365-day all-cause mortality after dialysis initiation.
Results: The mean (±SD) age of our cohort was 68.7 ± 11.2 years, 98.1% of patients were men, 29.4% were African American, and 71.4% were diabetic. The final random forest model provided C-statistics (95% confidence intervals) of 0.7185 (0.6994-0.7377), 0.7446 (0.7346-0.7546), 0.7504 (0.7425-0.7583), and 0.7488 (0.7421-0.7554) for predicting risk of death within the 4 time windows, respectively. The models showed good internal validity, replicated well in patients with various demographic and clinical characteristics, and performed as well as or better than other ML algorithms. Results may not be generalizable to non-veterans. Restricting predictors to those available in electronic medical records limited the number of predictors that could be assessed.
Conclusion: We implemented an ML-based method to accurately predict short-term postdialysis mortality in patients with incident ESRD. Our models could help patients and clinicians decide on the best course of action for patients approaching ESRD.
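As an illustration of the performance measure reported above, the following minimal sketch computes a C-statistic for a binary outcome at a fixed horizon (for example, death within 90 days). It assumes the simplest form of the measure, equivalent to the area under the ROC curve; the abstract does not state which variant the authors used, and the function name `c_statistic` and the risk values are hypothetical.

```python
def c_statistic(risks, outcomes):
    """Fraction of (case, non-case) pairs in which the case has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes (1 = died within the window).
example_risks = [0.9, 0.2, 0.8, 0.1]
example_outcomes = [1, 1, 0, 0]
print(c_statistic(example_risks, example_outcomes))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which puts the reported values of roughly 0.72 to 0.75 in context.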
A New Supervised Classification of Credit Approval Data via the Hybridized RBF Neural Network Model Using Information Complexity
In this paper, we introduce a new approach to supervised classification that handles mixed data structures (i.e., categorical, binary, and continuous) using a hybrid radial basis function neural network (HRBF-NN). HRBF-NN supervised classification combines regression trees, ridge regression, and the genetic algorithm (GA) with radial basis function (RBF) neural networks (NN), using the information complexity (ICOMP) criterion as the fitness function, to carry out both classification and subset selection of the best predictors that discriminate between the classes. In this manner, we reduce the dimensionality of the data and at the same time improve the classification accuracy of the fitted predictive model. We apply HRBF-NN supervised classification to a real benchmark credit approval mixed-data set to classify customers into good/bad classes for credit approval. Our results show the excellent performance of the HRBF-NN method in supervised classification tasks.
A novel Hybrid RBF Neural Networks model as a forecaster
We introduce a novel predictive statistical modeling technique called Hybrid Radial Basis Function Neural Networks (HRBF-NN) as a forecaster. HRBF-NN is a flexible forecasting technique that integrates regression trees and ridge regression with radial basis function (RBF) neural networks (NN). We develop a new computational procedure that uses an information-theoretic model selection criterion as the fitness function within the genetic algorithm (GA) to carry out subset selection of the best predictors. Because of the dynamic and chaotic nature of the underlying stock market process, generating economically useful stock market forecasts is well known to be difficult, if not impossible. HRBF-NN is well suited for modeling complex non-linear relationships and dependencies between stock indices. We propose HRBF-NN as our forecaster and as a predictive modeling tool to study the daily movements of stock indices. We show numerical examples to determine a predictive relationship between the Istanbul Stock Exchange National 100 Index (ISE100) and seven other international stock market indices. We select the best subset of predictors by minimizing the information complexity (ICOMP) criterion as the fitness function within the GA. Using the best subset of variables, we construct out-of-sample forecasts for the ISE100 index to determine the daily directional movements. Our results demonstrate the utility and flexibility of HRBF-NN as a clever predictive modeling tool for highly dependent and nonlinear data.
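To make one ingredient of the approach above concrete, here is a minimal sketch of fitting the output weights of an RBF network by ridge regression. It is not the authors' HRBF-NN: the regression-tree step and the GA subset selection are omitted, the centers and radius are simply assumed to be given, inputs are scalar, and all function names and parameter values are hypothetical.

```python
import math


def gaussian_design(xs, centers, radius):
    """Design matrix with entries exp(-(x - c)^2 / (2 r^2)) for scalar inputs."""
    return [[math.exp(-((x - c) ** 2) / (2.0 * radius ** 2)) for c in centers]
            for x in xs]


def solve(a, b):
    """Solve the square system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_ridge_rbf(xs, ys, centers, radius, lam):
    """Ridge estimate of the output weights: (Phi^T Phi + lam I) w = Phi^T y."""
    phi = gaussian_design(xs, centers, radius)
    p = len(centers)
    gram = [[sum(row[i] * row[j] for row in phi) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(p)]
    return solve(gram, rhs)


def predict(xs, centers, radius, weights):
    """Network output: weighted sum of the Gaussian basis functions."""
    phi = gaussian_design(xs, centers, radius)
    return [sum(w * v for w, v in zip(weights, row)) for row in phi]


# Toy data (hypothetical): one basis function centered on each training input.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [math.sin(x) for x in train_x]
weights = fit_ridge_rbf(train_x, train_y, train_x, radius=1.0, lam=1e-3)
print(predict([0.5, 1.5], train_x, 1.0, weights))
```

The ridge penalty `lam` shrinks the weights and keeps the linear system well conditioned when basis functions overlap heavily, which is one plausible reason for pairing ridge regression with RBF networks.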
Model selection using information criteria under a new estimation method: least squares ratio
In this study, we evaluate several forms of both Akaike-type and Information Complexity (ICOMP)-type information criteria, in the context of selecting an optimal subset least squares ratio (LSR) regression model. Our simulation studies are designed to mimic many characteristics present in real data -- heavy tails, multicollinearity, redundant variables, and completely unnecessary variables. Our findings are that LSR in conjunction with one of the ICOMP criteria is very good at selecting the true model. Finally, we apply these methods to the familiar body fat data set.
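For orientation, the basic forms of the two families of criteria named above can be written as follows. This is a sketch of the standard definitions only; the abstract says several forms of each were evaluated, and the specific variants used in the study are not stated here. The symbols are assumptions introduced for illustration: $L(\hat{\theta})$ is the maximized likelihood, $k$ the number of estimated parameters, $\hat{F}^{-1}$ the estimated inverse Fisher information matrix, and $s$ the rank of $\Sigma$.

```latex
\mathrm{AIC} = -2 \log L(\hat{\theta}) + 2k

\mathrm{ICOMP}(\mathrm{IFIM}) = -2 \log L(\hat{\theta}) + 2\, C_1\!\left(\hat{F}^{-1}\right),
\qquad
C_1(\Sigma) = \frac{s}{2} \log\!\left(\frac{\operatorname{tr}(\Sigma)}{s}\right)
            - \frac{1}{2} \log \lvert \Sigma \rvert
```

Both criteria reward goodness of fit through the first term. AIC penalizes only the parameter count, whereas the ICOMP penalty also grows with the interdependence among the parameter estimates, which is relevant to the multicollinearity scenarios mentioned in the simulation design.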
Predicting Decline in Residual Renal Urea Clearance via Machine Learning
Identifying Sociomarkers of Pediatric Asthma Patients at Risk of Hospital Revisiting
Objective: Asthma is one of the most common chronic childhood diseases in the United States [2, 3]. In addition to its pervasiveness, pediatric asthma shows high sensitivity to the environment. Combining a medical-social dataset with machine learning methods, we demonstrate how socio-markers play an important role in identifying patients at risk of hospital revisits due to pediatric asthma within a year.
Introduction: A socio-marker is a measurable indicator of the social conditions in which a patient is embedded and to which the patient is exposed, analogous to a biomarker indicating the presence or severity of a disease state. Social factors are among the most important health determinants [1] and play a critical role in explaining health outcomes. Socio-markers can help medical practitioners and researchers reliably identify high-risk individuals in a timely manner.
Methods: We collected data from three sources: pediatric asthma encounter records from Jan 1st, 2016 to Dec 31st, 2016 at a children’s hospital, the 2010 U.S. census data, and neighborhood quality survey data from Memphis Property Hub. After merging these datasets, we examined the effect of social features in identifying the patients who visited the hospital more than once during the observation period. We used only each patient's first visit (3,678 cases) to avoid counting the same patient more than once. In addition to demographic features (age, gender, insurance type, and race (African American and White)), we incorporated social features such as the proportion of individuals living below the federal poverty level, blight prevalence, neighborhood quality, neighborhood quality inequality, trash dumping presence, and broken window pervasiveness within the zip code area of the patient’s residence. We then implemented a Support Vector Machine (SVM) based classification model using the abovementioned 11 social features.
The classification outcome is either that the patient visits the hospital only once (class 0) or that the patient revisits the hospital within a year (class 1). Among the 3,678 unique patients in the dataset, only 823 pediatric patients revisited the hospital with asthma. To overcome this class imbalance, we used data from 823 patients in each class (randomly selected in each of 1,000 iterations). Further, to avoid overfitting and ensure generalizability, we divided the dataset into training, test, and validation sets in proportions of 60%, 20%, and 20%, respectively. The reported test accuracy (5-fold cross-validation using the training and test data) and validation accuracy of the SVM method are averaged over 1,000 iterations to reduce sampling error and bias.
Results: The proposed socio-marker model achieved an average classification accuracy of 63.70% on the test set and 63.67% on the validation set. Further, the average specificity (the number of true negative cases divided by the sum of true negative and false positive cases) and sensitivity (the number of true positive cases divided by the sum of true positive and false negative cases) were 62.79% and 64.77%, respectively, for the test set and 62.79% and 64.83%, respectively, for the validation set. These results suggest that socio-marker features that are not directly related to a patient’s medical condition can still predict, with approximately 64% accuracy, whether the patient will return to the hospital within a year.
Conclusions: Bringing socio-marker features into the surveillance system may ease the burden of detecting patients at risk of revisiting the hospital. The results should be interpreted with caution because we used only a 12-month observation period, and visits beyond the observation window are not considered. Also, patients may have visited other hospitals, and such visits are not captured in the data.
References:
1. Booske BC, Athens JK, Kindig DA, Park H, Remington PL: Different perspectives for assigning weights to determinants of health. University of Wisconsin: Population Health Institute 2010.
2. Subbarao P, Mandhane PJ, Sears MR: Asthma: epidemiology, etiology and risk factors. Canadian Medical Association Journal 2009, 181(9):E181-E190.
3. Gold DR, Wright R: Population disparities in asthma. Annu Rev Public Health 2005, 26:89-113.
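The resampling and evaluation steps described in the Methods of the pediatric asthma abstract above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the SVM itself and the 5-fold cross-validation are omitted, the labels are synthetic, all function names are hypothetical, and specificity and sensitivity use the standard definitions TN / (TN + FP) and TP / (TP + FN).

```python
import random


def balanced_undersample(labels, rng):
    """Return shuffled indices with equally many class-1 and class-0 samples,
    drawn at random so that the larger class is cut to the smaller one's size."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(positives), len(negatives))
    picked = rng.sample(positives, k) + rng.sample(negatives, k)
    rng.shuffle(picked)
    return picked


def split_60_20_20(indices):
    """Split indices into training, test, and validation parts (60/20/20)."""
    n = len(indices)
    n_train = n * 6 // 10
    n_test = n * 2 // 10
    return (indices[:n_train],
            indices[n_train:n_train + n_test],
            indices[n_train + n_test:])


def specificity_sensitivity(y_true, y_pred):
    """Specificity = TN / (TN + FP); sensitivity = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp), tp / (tp + fn)


# Synthetic labels matching the reported counts: 823 revisits among 3,678 patients.
labels = [1] * 823 + [0] * (3678 - 823)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
balanced = balanced_undersample(labels, rng)
train_idx, test_idx, val_idx = split_60_20_20(balanced)
print(len(train_idx), len(test_idx), len(val_idx))
```

In the study this draw was repeated 1,000 times and the metrics averaged; in the sketch that would correspond to calling `balanced_undersample` in a loop with a fresh random draw on each iteration.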