Machine Learning to Identify Dialysis Patients at High Death Risk.
Introduction: Given the high mortality rate within the first year of dialysis initiation, an accurate estimate of postdialysis mortality could help patients and clinicians make decisions about initiating dialysis. We aimed to use machine learning (ML), incorporating complex information from electronic health records, to identify patients at risk for short-term postdialysis mortality. Methods: This study was carried out on a contemporary cohort of 27,615 US veterans with incident end-stage renal disease (ESRD). We implemented a random forest method on 49 variables obtained before the dialysis transition to predict 30-, 90-, 180-, and 365-day all-cause mortality after dialysis initiation. Results: The mean (±SD) age of our cohort was 68.7 ± 11.2 years; 98.1% of patients were men, 29.4% were African American, and 71.4% were diabetic. The final random forest model provided C-statistics (95% confidence intervals) of 0.7185 (0.6994-0.7377), 0.7446 (0.7346-0.7546), 0.7504 (0.7425-0.7583), and 0.7488 (0.7421-0.7554) for the four time windows, respectively. The models showed good internal validity, replicated well in patients with various demographic and clinical characteristics, and performed as well as or better than other ML algorithms. Results may not be generalizable to non-veterans, and restricting predictors to those available in electronic medical records limited the number of predictors we could assess. Conclusion: We implemented an ML-based method to accurately predict short-term postdialysis mortality in patients with incident ESRD. Our models could aid patients and clinicians in better decision making about the best course of action for patients approaching ESRD.
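The C-statistics reported above measure discrimination: the probability that a randomly chosen patient who died receives a higher predicted risk than one who survived the window. A minimal sketch of that computation (illustrative toy data, not the study's):

```python
# Hedged sketch: computing a C-statistic (concordance index) for a binary
# mortality outcome from predicted risk scores. Data are illustrative only.

def c_statistic(risks, outcomes):
    """Fraction of (event, non-event) pairs in which the event case got
    the higher predicted risk; ties count one half."""
    pairs = concordant = 0.0
    for r1, y1 in zip(risks, outcomes):
        for r2, y2 in zip(risks, outcomes):
            if y1 == 1 and y2 == 0:
                pairs += 1
                if r1 > r2:
                    concordant += 1
                elif r1 == r2:
                    concordant += 0.5
    return concordant / pairs

risks    = [0.9, 0.8, 0.7, 0.3, 0.2]
outcomes = [1,   0,   1,   0,   0]
print(c_statistic(risks, outcomes))  # 0.8333... (5 of 6 event/non-event pairs ranked correctly)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the study's values near 0.75 indicate moderate discrimination.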
A Novel Regression Approach: Least Squares Ratio
Regression analysis (RA) is one of the most frequently used tools for forecasting. The ordinary least squares (OLS) technique is the basic instrument of RA, and many regression techniques build on OLS. This paper introduces a new regression approach, called Least Squares Ratio (LSR), and compares OLS and LSR in terms of the mean square errors of the estimated theoretical regression parameters (mse β) and of the dependent value (mse y).
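One plausible reading of the LSR idea (an assumption here, not taken verbatim from the paper) is to minimize squared relative residuals ((y − Xb)/y)² rather than the squared absolute residuals of OLS; that objective reduces to weighted least squares with weights 1/y², which has a closed form:

```python
# Hedged sketch contrasting OLS with a ratio-based least-squares fit.
# The relative-residual objective is an assumed form of LSR; it requires
# nonzero (here strictly positive) responses y.
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lsr(X, y):
    w = 1.0 / y**2                          # down-weight large responses
    Xw = X * w[:, None]                     # W X with W = diag(1/y_i^2)
    return np.linalg.solve(X.T @ Xw, X.T @ (w * y))

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(1, 10, 50)])
y = X @ np.array([2.0, 3.0]) + rng.normal(0, 1, 50)
print(ols(X, y), lsr(X, y))                 # both near the true (2, 3)
```

On noiseless data both estimators recover the true parameters exactly; they differ in how estimation error is distributed across small and large responses.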
A New Supervised Classification of Credit Approval Data via the Hybridized RBF Neural Network Model Using Information Complexity
In this paper, we introduce a new approach for supervised classification to handle mixed (i.e., categorical, binary, and continuous) data structures using a hybrid radial basis function neural network (HRBF-NN). HRBF-NN supervised classification combines regression trees, ridge regression, and the genetic algorithm (GA) with radial basis function (RBF) neural networks (NN), using the information complexity (ICOMP) criterion as the fitness function, to carry out both classification and subset selection of the best predictors that discriminate between the classes. In this manner, we reduce the dimensionality of the data and at the same time improve the classification accuracy of the fitted predictive model. We apply HRBF-NN supervised classification to a real benchmark credit approval mixed-data set to classify customers into good/bad classes for credit approval. Our results show the excellent performance of the HRBF-NN method in supervised classification tasks.
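The RBF-plus-ridge core of such a classifier can be sketched as follows; the tree-based center selection, GA, and ICOMP machinery are omitted, and the Gaussian basis, width, regularization strength, and 0.5 threshold here are illustrative assumptions:

```python
# Hedged sketch of the RBF + ridge-regression core of an HRBF-NN
# classifier. Centers are simply the training points; a full HRBF-NN
# would choose centers and predictors via trees, the GA, and ICOMP.
import numpy as np

def rbf_features(X, centers, width):
    # Gaussian basis: exp(-||x - c||^2 / (2 * width^2)) for each center c
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width**2))

def fit_rbf_ridge(X, y, width=1.0, lam=1e-3):
    Phi = rbf_features(X, X, width)
    # ridge regression on the basis expansion for the output weights
    W = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(X)), Phi.T @ y)
    return lambda Xnew: rbf_features(Xnew, X, width) @ W

# toy two-class problem: class is determined by the sign of x
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
predict = fit_rbf_ridge(X, y)
labels = (predict(X) > 0.5).astype(int)
print(labels)  # recovers [0 0 0 1 1 1]
```

The ridge penalty keeps the output weights stable when basis functions overlap heavily, which is the role ridge regression plays inside the hybrid model.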
A novel Hybrid RBF Neural Networks model as a forecaster
We introduce a novel predictive statistical modeling technique called Hybrid Radial Basis Function Neural Networks (HRBF-NN) as a forecaster. HRBF-NN is a flexible forecasting technique that integrates regression trees and ridge regression with radial basis function (RBF) neural networks (NN). We develop a new computational procedure that uses model selection based on information-theoretic principles as the fitness function of the genetic algorithm (GA) to carry out subset selection of the best predictors. Due to the dynamic and chaotic nature of the underlying stock market process, as is well known, the task of generating economically useful stock market forecasts is difficult, if not impossible. HRBF-NN is well suited for modeling complex non-linear relationships and dependencies between stock indices. We propose HRBF-NN as our forecaster and predictive modeling tool to study the daily movements of stock indices. We show numerical examples to determine a predictive relationship between the Istanbul Stock Exchange National 100 Index (ISE100) and seven other international stock market indices. We select the best subset of predictors by minimizing the information complexity (ICOMP) criterion as the fitness function within the GA. Using the best subset of variables, we construct out-of-sample forecasts for the ISE100 index to determine its daily directional movements. Our results demonstrate the utility and flexibility of HRBF-NN as a clever predictive modeling tool for highly dependent and nonlinear data.
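The GA-driven subset-selection loop can be sketched roughly as below, with plain AIC standing in for ICOMP as the fitness function; the data, mutation rate, and population sizes are illustrative assumptions, not the paper's settings:

```python
# Hedged sketch of GA subset selection: binary chromosomes encode which
# predictors enter an OLS fit, and an information criterion (AIC here,
# as a stand-in for ICOMP) is the fitness to minimize.
import numpy as np

rng = np.random.default_rng(1)

def aic_of_subset(X, y, mask):
    if not mask.any():
        return np.inf
    Xs = X[:, mask]
    b, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = ((y - Xs @ b) ** 2).sum()
    n, k = len(y), mask.sum()
    return n * np.log(rss / n) + 2 * k

def ga_select(X, y, pop=20, gens=30):
    p = X.shape[1]
    popn = rng.integers(0, 2, (pop, p)).astype(bool)
    for _ in range(gens):
        fit = np.array([aic_of_subset(X, y, m) for m in popn])
        popn = popn[np.argsort(fit)]
        elite = popn[: pop // 2]                 # keep the fitter half
        kids = []
        for _ in range(pop - len(elite)):
            a, b_ = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, p)             # single-point crossover
            child = np.concatenate([a[:cut], b_[cut:]])
            child ^= rng.random(p) < 0.05        # bit-flip mutation
            kids.append(child)
        popn = np.vstack([elite, kids])
    fit = np.array([aic_of_subset(X, y, m) for m in popn])
    return popn[np.argmin(fit)]

# true model uses predictors 0 and 2 out of five candidates
n = 200
X = rng.normal(size=(n, 5))
y = 2 * X[:, 0] - 3 * X[:, 2] + rng.normal(0, 0.5, n)
mask = ga_select(X, y)
print(mask)  # predictors 0 and 2 are selected
```

Because the signal variables reduce the residual sum of squares dramatically, any chromosome containing them dominates the fitness ranking, so the GA reliably retains them.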
Model selection using information criteria under a new estimation method: least squares ratio
In this study, we evaluate several forms of both Akaike-type and Information Complexity (ICOMP)-type information criteria, in the context of selecting an optimal subset least squares ratio (LSR) regression model. Our simulation studies are designed to mimic many characteristics present in real data -- heavy tails, multicollinearity, redundant variables, and completely unnecessary variables. Our findings are that LSR in conjunction with one of the ICOMP criteria is very good at selecting the true model. Finally, we apply these methods to the familiar body fat data set.
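One common ICOMP form adds a covariance-complexity penalty C1 to the lack-of-fit term. The sketch below is an assumed ICOMP(IFIM)-style variant, not necessarily the exact criterion used in the study; it applies the penalty to the estimated coefficient covariance of an OLS fit and shows that a nearly redundant predictor is penalized:

```python
# Hedged sketch of an ICOMP-style criterion for OLS: n*log(RSS/n) plus a
# C1 covariance-complexity penalty. The exact ICOMP variant is assumed.
import numpy as np

def icomp_ols(X, y):
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = ((y - X @ b) ** 2).sum() / n
    cov = sigma2 * np.linalg.inv(X.T @ X)       # estimated covariance of b-hat
    s = cov.shape[0]
    # C1 >= 0, with equality when all eigenvalues of cov are equal
    c1 = (s / 2) * np.log(np.trace(cov) / s) - 0.5 * np.log(np.linalg.det(cov))
    return n * np.log(sigma2) + 2 * c1

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(0, 0.5, n)
XA = np.column_stack([np.ones(n), x1])                               # parsimonious model
XB = np.column_stack([np.ones(n), x1, x1 + rng.normal(0, 1e-4, n)])  # near-duplicate predictor
print(icomp_ols(XA, y) < icomp_ols(XB, y))  # True: the complexity penalty flags redundancy
```

Unlike AIC's fixed 2k penalty, C1 grows with the imbalance among the covariance eigenvalues, so multicollinearity, one of the features the simulations are designed to mimic, is penalized directly.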
A meta-analysis of carbon capture and storage technology assessments: Understanding the driving factors of variability in cost estimates
The estimated costs of reducing carbon emissions through the deployment of carbon capture and storage (CCS) in power systems vary by a factor of five or more across studies published over the past 8 years. The objective of this paper is to use statistical methods to understand how techno-economic variables and modeling assumptions explain the large variability in the cost of avoided CO2 (CACO2) reported in the published international literature. We carry out a meta-analysis of the variations in reported CACO2 for coal and natural gas power plants with CCS, using regression and correlation analysis to explain the variation. The regression models built in our analysis have strong predictive power (R2 > 0.90) for all power plant types. We find that the parameters with high variability and a large influence on the estimated value of CACO2 are the levelized cost of electricity (LCOE) penalty, the capital cost of CCS, and the efficiency penalty. In addition, careful selection of baseline technologies and greater attention and transparency around the calculation of capital costs would reduce the variability across studies, better reflect technology uncertainty, and improve comparability between studies.
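The regression-and-R² methodology can be illustrated on synthetic data; the variable ranges and coefficients below are assumptions for illustration, not values from the study:

```python
# Illustrative sketch (synthetic data, not the paper's): regress an
# avoided-CO2 cost proxy on the three drivers the study highlights and
# report R^2, mirroring the regression-based variance analysis.
import numpy as np

rng = np.random.default_rng(3)
n = 60
lcoe_penalty = rng.uniform(20, 60, n)       # $/MWh, assumed range
capex_ccs    = rng.uniform(500, 1500, n)    # $/kW, assumed range
eff_penalty  = rng.uniform(5, 12, n)        # percentage points, assumed range
caco2 = (0.8 * lcoe_penalty + 0.02 * capex_ccs + 2.0 * eff_penalty
         + rng.normal(0, 3, n))             # assumed linear data-generating process

X = np.column_stack([np.ones(n), lcoe_penalty, capex_ccs, eff_penalty])
b, *_ = np.linalg.lstsq(X, caco2, rcond=None)
resid = caco2 - X @ b
r2 = 1 - resid @ resid / ((caco2 - caco2.mean()) @ (caco2 - caco2.mean()))
print(round(r2, 3))                          # high R^2: the three drivers explain most variance
```

When the chosen predictors capture the dominant sources of variation, R² approaches 1, which is the pattern (R² > 0.90) the meta-analysis reports.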