Mass appraisal of residential apartments: An application of Random forest for valuation and a CART-based approach for model diagnostics
To the best of the authors' knowledge, this is the first attempt to use Random forest for mass appraisal of residential real estate. In an empirical study using data on residential apartments, the method outperformed CHAID, CART, KNN, multiple regression analysis, Artificial Neural Networks (MLP and RBF) and Boosted Trees. The paper also introduces an approach for automatically detecting segments where a model significantly underperforms and segments where predictions are systematically under- or overestimated. This segmentation approach is applicable to various expert systems, including but not limited to those used for mass appraisal.
Keywords: Random forest, mass appraisal, CART, model diagnostics, real estate, automatic valuation model
Applying CHAID for logistic regression diagnostics and classification accuracy improvement
In this study, a CHAID-based approach to detecting heterogeneity in classification accuracy across segments of observations is proposed. This helps to solve two important problems facing a model builder:
1. How to automatically detect segments in which the model significantly underperforms?
2. How to use knowledge about classification accuracy heterogeneity across segments to partition observations in order to achieve better predictive accuracy?
The approach was applied to churn data from the UCI Repository of Machine Learning Databases. By splitting the dataset into 4 parts based on the decision tree and building a separate logistic regression scoring model for each segment, we increased accuracy by more than 7 percentage points on the test sample. A significant increase in recall and precision was also observed. It was shown that different segments may have entirely different churn predictors. Such partitioning therefore gives better insight into the factors influencing customer behavior.
Keywords: CHAID; logistic regression; churn prediction; performance improvement; segmentwise prediction; decision tree; classification tree
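The first question in the abstract above (finding segments where a classifier underperforms) can be illustrated with a small sketch. It is not the authors' procedure: it assumes the segments are already given and applies only a Pearson chi-square test of "segment vs. rest" against "correct vs. incorrect", which is the kind of test CHAID uses when choosing splits. The function names `chi2_2x2` and `segment_report`, the segment labels and the toy predictions are all hypothetical.

```python
from collections import defaultdict

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_05 = 3.841


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def segment_report(segments, y_true, y_pred):
    """Per-segment accuracy plus a chi-square test of segment vs. rest."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for s, t, p in zip(segments, y_true, y_pred):
        totals[s] += 1
        hits[s] += int(t == p)
    all_hits, all_n = sum(hits.values()), sum(totals.values())
    overall = all_hits / all_n
    report = {}
    for s in totals:
        a, b = hits[s], totals[s] - hits[s]          # correct / incorrect inside segment
        c = all_hits - a                              # correct outside segment
        d = (all_n - totals[s]) - c                   # incorrect outside segment
        stat = chi2_2x2(a, b, c, d)
        acc = a / totals[s]
        report[s] = {
            "accuracy": acc,
            "chi2": stat,
            # Flag only segments that are both worse than average and significant.
            "underperforms": acc < overall and stat > CHI2_CRIT_DF1_05,
        }
    return report


# Hypothetical toy data: two segments of 20 observations each.
segments = ["A"] * 20 + ["B"] * 20
y_true = [1] * 40
y_pred = [1] * 18 + [0] * 2 + [1] * 8 + [0] * 12
report = segment_report(segments, y_true, y_pred)
```

In this toy example segment B has markedly lower accuracy than segment A, so it is the one flagged. With small segments the chi-square approximation becomes unreliable, so a real analysis would need a minimum segment size or an exact test.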
Applying a CART-based approach for the diagnostics of mass appraisal models
This paper introduces an approach for automatically detecting segments where a regression model significantly underperforms and segments where predictions are systematically under- or overestimated. This segmentation approach is applicable to various expert systems, including but not limited to those used for mass appraisal. It may be useful for various regression analysis applications, especially those with strong heteroscedasticity, and helps to reveal segments for which separate models or appraiser assistance are desirable. The approach has been applied to a mass appraisal model based on the Random Forest algorithm.
Keywords: CART, model diagnostics, mass appraisal, real estate, Random forest, heteroscedasticity
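The idea of growing a tree on a model's errors can be sketched as follows. This is a generic variance-reduction regression tree written in plain Python, offered as an illustration under stated assumptions rather than as the authors' implementation. The helper names (`sse`, `best_split`, `grow`), the features `area` and `floor`, and the signed relative errors are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (CART's regression impurity)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, errors, min_leaf=2):
    """Return (feature, threshold, gain) for the split that most reduces SSE, or None."""
    parent = sse(errors)
    best = None
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [e for r, e in zip(rows, errors) if r[feature] <= t]
            right = [e for r, e in zip(rows, errors) if r[feature] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[2]:
                best = (feature, t, gain)
    return best


def grow(rows, errors, depth=2, min_leaf=2):
    """Grow a small regression tree on the errors; leaves are diagnostic segments."""
    split = best_split(rows, errors, min_leaf) if depth > 0 else None
    if split is None or split[2] <= 1e-12:
        return {"n": len(errors), "mean_error": mean(errors)}
    feature, t, _ = split
    left = [(r, e) for r, e in zip(rows, errors) if r[feature] <= t]
    right = [(r, e) for r, e in zip(rows, errors) if r[feature] > t]
    return {
        "feature": feature,
        "threshold": t,
        "left": grow([r for r, _ in left], [e for _, e in left], depth - 1, min_leaf),
        "right": grow([r for r, _ in right], [e for _, e in right], depth - 1, min_leaf),
    }


# Hypothetical apartments and signed relative errors, (predicted - actual) / actual,
# of some already fitted valuation model.
rows = [
    {"area": 40, "floor": 1}, {"area": 55, "floor": 3},
    {"area": 60, "floor": 2}, {"area": 75, "floor": 5},
    {"area": 120, "floor": 4}, {"area": 130, "floor": 1},
    {"area": 150, "floor": 6}, {"area": 160, "floor": 2},
]
errors = [0.01, -0.02, 0.00, 0.02, 0.20, 0.18, 0.22, 0.19]

tree = grow(rows, errors, depth=1)  # one split: large vs. small apartments
```

Growing the tree on signed errors reveals segments with systematic over- or underestimation. Growing it on absolute errors instead would highlight segments where the model is simply inaccurate, which are the two kinds of segments the abstract distinguishes.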
Accounting for latent classes in movie box office modeling
This paper addresses the issue of unobserved heterogeneity in the influence of film characteristics on box office. We argue that the analysis of pooled samples, the most common approach among researchers, does not shed light on underlying segmentations and leads to significantly different estimates being obtained by researchers running similar regressions for movie success modeling. For instance, a restrictive MPAA rating may be expected to be box office poison for a family comedy, while it has an insignificant influence on an action movie's revenues. Using a finite mixture model, we extract two latent groups, the differences between which can be explained in part by the movie genre, the source, the creative type and the production method. Based on this result, the authors recommend developing separate movie success models for different segments, rather than following the approach common in previous research, in which one explanatory or predictive model is developed for the whole sample of movies.
Keywords: finite mixture model, box office, latent class, movie success, quantile regression, unobserved heterogeneity