1,386 research outputs found
Proximal boosting and its acceleration
Gradient boosting is a prediction method that iteratively combines weak
learners to produce a complex and accurate model. From an optimization point of
view, the learning procedure of gradient boosting mimics a gradient descent on
a functional variable. This paper proposes to build upon the proximal point
algorithm when the empirical risk to minimize is not differentiable to
introduce a novel boosting approach, called proximal boosting. Besides being
motivated by non-differentiable optimization, the proposed algorithm benefits
from Nesterov's acceleration in the same way as gradient boosting [Biau et al.,
2018]. This leads to a variant, called accelerated proximal boosting.
Advantages of leveraging proximal methods for boosting are illustrated by
numerical experiments on simulated and real-world data. In particular, we
exhibit a favorable comparison over gradient boosting regarding convergence
rate and prediction accuracy
Methods to Improve the Prediction Accuracy and Performance of Ensemble Models
The application of ensemble predictive models has been an important research area in predicting medical diagnostics, engineering diagnostics, and other related smart devices and
related technologies. Most of the current predictive models are complex and not reliable despite numerous efforts in the past by the research community. The performance accuracy of the predictive models have not always been realised due to many factors such as complexity and class imbalance. Therefore there is a need to improve the predictive accuracy of current ensemble models and to enhance their applications and reliability and non-visual predictive tools.
The research work presented in this thesis has adopted a pragmatic phased approach to propose and develop new ensemble models using multiple methods and validated the methods through rigorous testing and implementation in different phases. The first phase comprises of empirical investigations on standalone and ensemble algorithms that were carried out to ascertain their performance effects on complexity and simplicity of the classifiers. The second phase comprises of an improved ensemble model based on the integration of Extended Kalman Filter (EKF), Radial Basis Function Network (RBFN) and AdaBoost algorithms. The third phase comprises of an extended model based on early stop concepts, AdaBoost algorithm, and statistical performance of the training samples to minimize overfitting performance of the proposed model. The fourth phase comprises of an enhanced analytical multivariate logistic regression predictive model developed to minimize the complexity and improve prediction accuracy of logistic regression model.
To facilitate the practical application of the proposed models; an ensemble non-invasive analytical tool is proposed and developed. The tool links the gap between theoretical concepts and practical application of theories to predict breast cancer survivability. The empirical findings suggested that: (1) increasing the complexity and topology of algorithms does not necessarily lead to a better algorithmic performance, (2) boosting by resampling performs slightly better than boosting by reweighting, (3) the prediction accuracy of the proposed ensemble EKF-RBFN-AdaBoost model performed better than several established ensemble models, (4) the proposed early stopped model converges faster and minimizes overfitting better compare with other models, (5) the proposed multivariate logistic regression concept minimizes the complexity models (6) the performance of the proposed analytical non-invasive tool performed comparatively better than many of the benchmark analytical tools used in predicting breast cancers and diabetics ailments.
The research contributions to ensemble practice are: (1) the integration and development of EKF, RBFN and AdaBoost algorithms as an ensemble model, (2) the development and validation of ensemble model based on early stop concepts, AdaBoost, and statistical concepts of the training samples, (3) the development and validation of predictive logistic regression model based on breast cancer, and (4) the development and validation of a non-invasive breast cancer analytic tools based on the proposed and developed predictive models in this thesis.
To validate prediction accuracy of ensemble models, in this thesis the proposed models were applied in modelling breast cancer survivability and diabetics’ diagnostic tasks. In comparison with other established models the simulation results of the models showed improved predictive accuracy.
The research outlines the benefits of the proposed models, whilst proposes new directions for future work that could further extend and improve the proposed models discussed in this
thesis
Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms
Many different machine learning algorithms exist; taking into account each
algorithm's hyperparameters, there is a staggeringly large number of possible
alternatives overall. We consider the problem of simultaneously selecting a
learning algorithm and setting its hyperparameters, going beyond previous work
that addresses these issues in isolation. We show that this problem can be
addressed by a fully automated approach, leveraging recent innovations in
Bayesian optimization. Specifically, we consider a wide range of feature
selection techniques (combining 3 search and 8 evaluator methods) and all
classification approaches implemented in WEKA, spanning 2 ensemble methods, 10
meta-methods, 27 base classifiers, and hyperparameter settings for each
classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup
09, variants of the MNIST dataset and CIFAR-10, we show classification
performance often much better than using standard selection/hyperparameter
optimization methods. We hope that our approach will help non-expert users to
more effectively identify machine learning algorithms and hyperparameter
settings appropriate to their applications, and hence to achieve improved
performance.Comment: 9 pages, 3 figure
- …