COMPARISON OF PREDICTIVE PERFORMANCES OF MARS AND CART ALGORITHMS THROUGH R SOFTWARE

Abstract

Within the framework of the general linear model, there is a lack of information comparing data mining algorithms, viz. CART, CHAID, C5.0, Exhaustive CHAID, MLP, RBF and, in particular, MARS, which yields a convenient prediction equation. All of these algorithms can be more informative than a classical method such as multiple linear regression when some distributional assumptions about the variables under study are violated. The aims of the current investigation were to compare the MARS and CART algorithms with multiple linear regression in the free software R in terms of the general linear model, and to show step by step how to use R to apply these statistical approaches. The MARS data mining algorithm, which is also used as an alternative to the response surface method in optimization, was examined in detail with respect to generalized cross validation (GCV) for the first time. In R, "penalty = -1" and a backward pruning method were specified for MARS, so that GCV reduces to RSS/n, where RSS is the residual sum of squares and n is the sample size. The model evaluation criteria estimated to compare the three approaches were R2, R2ADJUSTED, SDRATIO and the Pearson correlation between predicted and actual values of the dependent variable. As a result, the current investigation should be a valuable reference for researchers who perform similar studies in the future.
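
The evaluation criteria named in the abstract can be illustrated with a small sketch. It is written in Python rather than R only to keep it dependency-free, and it is not the authors' code. The helper name `evaluation_criteria`, the toy data, and the exact formulas are assumptions: the commonly used definitions are taken here (adjusted R2 with k predictors, SDRATIO as the standard deviation of the residuals divided by the standard deviation of the observed values), and the paper's own definitions may differ.

```python
import math
import statistics


def evaluation_criteria(actual, predicted, n_predictors):
    """Goodness-of-fit criteria for one fitted model (hypothetical helper).

    Assumes the usual textbook definitions; not taken from the paper.
    """
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    rss = sum(e * e for e in residuals)              # residual sum of squares
    mean_y = statistics.fmean(actual)
    tss = sum((a - mean_y) ** 2 for a in actual)     # total sum of squares

    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    sd_ratio = statistics.stdev(residuals) / statistics.stdev(actual)

    # Pearson correlation between actual and predicted values
    mean_p = statistics.fmean(predicted)
    cov = sum((a - mean_y) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    pearson_r = cov / math.sqrt(tss * var_p)

    return {
        "r2": r2,
        "r2_adjusted": r2_adj,
        "sd_ratio": sd_ratio,
        "pearson_r": pearson_r,
        "rss_over_n": rss / n,  # the quantity GCV reduces to without a penalty
    }


# Toy data, made up for illustration only
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
crit = evaluation_criteria(actual, predicted, n_predictors=1)
for name, value in crit.items():
    print(f"{name}: {value:.4f}")
```

Applying the same function to the predictions of each of the three models puts them on a common scale: higher R2, adjusted R2 and Pearson correlation, and lower SDRATIO and RSS/n, indicate a better fit.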
