
    Improved customer choice predictions using ensemble methods

    In this paper various ensemble learning methods from machine learning and statistics are considered and applied to the customer choice modeling problem. The application of ensemble learning usually improves the prediction quality of flexible models like decision trees and thus leads to improved predictions. We give experimental results for two real-life marketing datasets using decision trees, ensemble versions of decision trees and the logistic regression model, which is a standard approach for this problem. The ensemble models are found to improve upon individual decision trees and outperform logistic regression. Next, an additive decomposition of the prediction error of a model, the bias/variance decomposition, is considered. A model with a high bias lacks the flexibility to fit the data well. A high variance indicates that a model is unstable with respect to different datasets. Decision trees have a high variance component and a low bias component in the prediction error, whereas logistic regression has a high bias component and a low variance component. It is shown that ensemble methods aim at minimizing the variance component in the prediction error while leaving the bias component unaltered. Bias/variance decompositions for all models for both customer choice datasets are given to illustrate these concepts.
    Keywords: brand choice; data mining; boosting; choice models; bias/variance decomposition; bagging; CART; ensembles
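    A minimal sketch of the kind of bias/variance measurement the abstract describes, assuming scikit-learn and a synthetic stand-in for the (unavailable) marketing datasets: predictions are collected over bootstrap resamples of the training set, the spread across resamples estimates the variance component, and the error of the averaged prediction estimates the squared bias (plus irreducible noise).

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import BaggingClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.tree import DecisionTreeClassifier

        def bias_variance(make_model, X_tr, y_tr, X_te, y_te, n_rounds=50, seed=0):
            """Estimate squared bias (plus noise) and variance over bootstrap refits."""
            rng = np.random.default_rng(seed)
            preds = np.empty((n_rounds, len(X_te)))
            for r in range(n_rounds):
                idx = rng.integers(0, len(X_tr), len(X_tr))  # bootstrap resample
                model = make_model().fit(X_tr[idx], y_tr[idx])
                preds[r] = model.predict_proba(X_te)[:, 1]
            bias2 = np.mean((preds.mean(axis=0) - y_te) ** 2)  # error of mean prediction
            variance = np.mean(preds.var(axis=0))              # instability across resamples
            return bias2, variance

        X, y = make_classification(n_samples=600, random_state=0)  # synthetic stand-in
        X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]
        for name, make_model in [
            ("single tree", lambda: DecisionTreeClassifier(random_state=0)),
            ("bagged trees", lambda: BaggingClassifier(
                DecisionTreeClassifier(), n_estimators=25, random_state=0)),
            ("logistic regression", lambda: LogisticRegression(max_iter=1000)),
        ]:
            print(name, bias_variance(make_model, X_tr, y_tr, X_te, y_te))

    On such a run one would expect the bagged trees to show markedly lower variance than the single tree at a similar bias, while logistic regression shows the opposite profile, mirroring the abstract's argument.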

    Building Combined Classifiers

    This chapter covers different approaches that may be taken when building an ensemble method, through studying specific examples of each approach from research conducted by the authors. A method called Negative Correlation Learning illustrates a decision-level combination approach in which the individual classifiers are trained co-operatively. The model-level combination paradigm is illustrated via a tree combination method. Finally, another variant of the decision-level paradigm, with individuals trained independently instead of co-operatively, is discussed as applied to churn prediction in the telecommunications industry.
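    A hedged sketch of the co-operative, decision-level idea behind Negative Correlation Learning, with plain linear models standing in for neural networks (the penalty strength lam, the learning rate and the data are illustrative). Each member's gradient combines its own squared error with a term that pushes its prediction away from the ensemble mean, so members specialise instead of duplicating one another.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 5))
        y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

        M, lam, lr = 5, 0.5, 0.01
        W = 0.1 * rng.normal(size=(M, 5))           # one weight vector per member

        for epoch in range(500):
            F = X @ W.T                             # per-member predictions, shape (n, M)
            fbar = F.mean(axis=1, keepdims=True)    # ensemble (decision-level) mean
            # simplified NCL gradient per member: (f_i - y) - lam * (f_i - fbar)
            G = (F - y[:, None]) - lam * (F - fbar)
            W -= lr * (G.T @ X) / len(X)            # all members updated co-operatively

        print("ensemble MSE:", np.mean(((X @ W.T).mean(axis=1) - y) ** 2))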

    A Modern Take on the Bias-Variance Tradeoff in Neural Networks

    The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve. However, recent empirical results with over-parameterized neural networks are marked by a striking absence of the classic U-shaped test error curve: test error keeps decreasing in wider networks. This suggests that there might not be a bias-variance tradeoff in neural networks with respect to network width, contrary to what was originally claimed by, e.g., Geman et al. (1992). Motivated by the shaky evidence used to support this claim in neural networks, we measure bias and variance in the modern setting. We find that both bias and variance can decrease as the number of parameters grows. To better understand this, we introduce a new decomposition of the variance to disentangle the effects of optimization and data sampling. We also provide theoretical analysis in a simplified setting that is consistent with our empirical findings.
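    A small sketch of the kind of measurement described, assuming scikit-learn's MLPRegressor on a synthetic sine task (the widths, trial counts and task are illustrative stand-ins, not the paper's settings): bias-squared and variance are estimated from networks retrained on independently sampled training sets.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)

        def sample_task(n):
            X = rng.uniform(-3, 3, size=(n, 1))
            return X, np.sin(X).ravel() + 0.1 * rng.normal(size=n)

        X_test = np.linspace(-3, 3, 200).reshape(-1, 1)
        f_true = np.sin(X_test).ravel()

        for width in [4, 16, 64, 256]:
            preds = []
            for trial in range(10):                 # independent training sets
                X_tr, y_tr = sample_task(100)
                net = MLPRegressor(hidden_layer_sizes=(width,), max_iter=2000,
                                   random_state=trial).fit(X_tr, y_tr)
                preds.append(net.predict(X_test))
            preds = np.array(preds)
            bias2 = np.mean((preds.mean(axis=0) - f_true) ** 2)
            variance = np.mean(preds.var(axis=0))
            print(f"width={width:4d}  bias^2={bias2:.4f}  variance={variance:.4f}")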

    A Framework for Unbiased Model Selection Based on Boosting

    Variable selection and model choice are of major concern in many statistical applications, especially in high-dimensional regression models. Boosting is a convenient statistical method that combines model fitting with intrinsic model selection. We investigate the impact of base-learner specification on the performance of boosting as a model selection procedure. We show that variable selection may be biased if the covariates are of different nature. Important examples are models combining continuous and categorical covariates, especially if the number of categories is large. In this case, least squares base-learners offer increased flexibility for the categorical covariate and lead to it being preferred even if it is non-informative. Similar difficulties arise when comparing linear and nonlinear base-learners for a continuous covariate. The additional flexibility in the nonlinear base-learner again yields a preference for the more complex modeling alternative. We investigate these problems from a theoretical perspective and suggest a framework for unbiased model selection based on a general class of penalized least squares base-learners. Making all base-learners comparable in terms of their degrees of freedom strongly reduces the selection bias observed in naive boosting specifications. The importance of unbiased model selection is demonstrated in simulations and an application to forest health models.
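    A hedged Python sketch of the mechanism the abstract describes, not the authors' implementation: component-wise L2 boosting in which every candidate base-learner is a ridge-penalised least-squares fit, with the ridge parameter tuned so that all learners share the same degrees of freedom (taken here as the trace of the hat matrix, one common definition). The design, target df and step length are illustrative.

        import numpy as np
        from scipy.optimize import brentq

        def hat_matrix(Z, lam):
            """Ridge hat matrix H: fitted values are H @ y."""
            return Z @ np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T)

        def lam_for_df(Z, target_df):
            """Ridge penalty such that trace(H) equals the target degrees of freedom."""
            return brentq(lambda lam: np.trace(hat_matrix(Z, lam)) - target_df,
                          1e-8, 1e8)

        def boost_selections(hats, y, nu=0.1, steps=100):
            """Component-wise L2 boosting; count how often each learner is selected."""
            resid, sel = y - y.mean(), []
            for _ in range(steps):
                fits = [H @ resid for H in hats]
                j = int(np.argmin([np.sum((resid - f) ** 2) for f in fits]))
                sel.append(j)
                resid = resid - nu * fits[j]
            return [sel.count(k) for k in range(len(hats))]

        rng = np.random.default_rng(0)
        n = 300
        blocks = [rng.normal(size=(n, 1)),            # one continuous covariate
                  np.eye(8)[rng.integers(0, 8, n)]]   # one 8-level categorical covariate
        y = rng.normal(size=n)                        # both covariates non-informative

        naive = [hat_matrix(Z, 1e-8) for Z in blocks]                # ~unpenalised
        fair = [hat_matrix(Z, lam_for_df(Z, 0.9)) for Z in blocks]   # equal df = 0.9
        print("naive selections (cont, cat):     ", boost_selections(naive, y))
        print("df-matched selections (cont, cat):", boost_selections(fair, y))

    With both covariates uninformative, the unpenalised learners pick the flexible categorical block in most iterations, while the df-matched learners split their selections far more evenly; that gap is the selection bias the paper sets out to remove.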

    An efficient randomised sphere cover classifier

    This paper describes an efficient randomised sphere cover classifier (aRSC) that reduces the training data set size without loss of accuracy when compared to nearest neighbour classifiers. The motivation for developing this algorithm is the desire to have a non-deterministic, fast, instance-based classifier that performs well in isolation but is also ideal for use with ensembles. We use 24 benchmark datasets from the UCI repository and six gene expression datasets for evaluation. The first set of experiments demonstrates the basic benefits of sphere covering. The second set of experiments demonstrates that when we set the α parameter through cross-validation, the resulting aRSC algorithm outperforms several well-known classifiers when compared using the Friedman rank sum test. Thirdly, we test the usefulness of aRSC when used with three feature selection filters on six gene expression datasets. Finally, we highlight the benefits of pruning with a bias/variance decomposition.
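    A hedged reading of the sphere-covering step in a few lines of Python, not the authors' reference implementation: spheres are grown around randomly ordered training instances out to the nearest enemy (different-class) point, spheres covering fewer than alpha instances are discarded, and test points take the label of the nearest sphere surface. The random instance order is what makes the classifier non-deterministic and hence ensemble-friendly.

        import numpy as np

        class SphereCover:
            def __init__(self, alpha=3, seed=0):
                self.alpha, self.rng = alpha, np.random.default_rng(seed)

            def fit(self, X, y):
                order = self.rng.permutation(len(X))    # random instance order
                covered = np.zeros(len(X), dtype=bool)
                self.spheres = []                       # (centre, radius, label)
                for i in order:
                    if covered[i]:
                        continue
                    d = np.linalg.norm(X - X[i], axis=1)
                    radius = d[y != y[i]].min()         # nearest enemy point
                    inside = (d < radius) & (y == y[i])
                    if inside.sum() >= self.alpha:      # prune small spheres
                        self.spheres.append((X[i], radius, y[i]))
                        covered |= inside
                return self

            def predict(self, X):
                C = np.array([c for c, _, _ in self.spheres])
                R = np.array([r for _, r, _ in self.spheres])
                L = np.array([l for _, _, l in self.spheres])
                # distance to each sphere surface (0 if inside); nearest sphere wins
                D = np.maximum(np.linalg.norm(X[:, None] - C[None], axis=2) - R, 0)
                return L[D.argmin(axis=1)]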