
    Maximize What Matters: Predicting Customer Churn With Decision-Centric Ensemble Selection

    Churn modeling is important to sustain profitable customer relationships in saturated consumer markets. A churn model predicts the likelihood of customer defection. This is important to target retention offers to the right customers and to use marketing resources efficiently. The prevailing approach toward churn model development, supervised learning, suffers from an important limitation: it does not allow the marketing analyst to account for campaign planning objectives and constraints during model building. Our key proposition is that creating a churn model in awareness of actual business requirements increases the performance of the final model for marketing decision support. To demonstrate this, we propose a decision-centric framework to create churn models. We test our modeling framework on eight real-life churn data sets and find that it performs significantly better than state-of-the-art churn models. Further analysis suggests that this improvement comes directly from incorporating business objectives into model building, which confirms the effectiveness of the proposed framework. In particular, we estimate that our approach increases the per-customer profits of retention campaigns by $0.47 on average.

    A Metric for Measuring Customer Turnover Prediction Models

    Interest in data mining techniques has increased tremendously over the past decades, and numerous classification techniques have been applied in a wide range of business applications. Hence, the need for adequate performance measures has become more important than ever. In this application, a cost-benefit analysis framework is formalized in order to define performance measures that are aligned with the main objective of the end users, i.e., profit maximization. A new performance measure is defined, the expected maximum profit criterion. This general framework is then applied to the customer churn problem with its particular cost-benefit structure. The advantage of this approach is that it assists companies in selecting the classifier that maximizes profit. Moreover, it aids practical implementation in the sense that it provides guidance about the fraction of the customer base to be included in the retention campaign.
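
The idea of scoring a classifier by the profit of the campaign it would drive, and reading off the fraction of customers to target, can be sketched as follows. This is a minimal illustration, not the authors' expected maximum profit criterion: the function name, the parameter names (`clv`, `incentive`, `contact`, `accept`) and the payoff structure are assumptions made for this example, and all cost-benefit parameters are held fixed rather than treated as uncertain.

```python
def max_profit_fraction(scores, churned, clv=200.0, incentive=10.0,
                        contact=1.0, accept=0.5):
    """Return (best_profit, best_fraction) for a hypothetical retention campaign.

    Assumed payoff structure (illustrative only):
      - every targeted customer costs `contact`;
      - a targeted would-be churner accepts the offer with probability `accept`,
        which retains `clv` at the price of `incentive`;
      - a targeted non-churner simply takes the incentive.
    """
    if not scores:
        return 0.0, 0.0
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    best_profit, best_k = 0.0, 0  # targeting nobody yields zero profit
    running = 0.0
    for k, (_, is_churner) in enumerate(ranked, start=1):
        if is_churner:
            running += accept * (clv - incentive) - contact
        else:
            running += -incentive - contact
        if running > best_profit:
            best_profit, best_k = running, k
    return best_profit, best_k / len(ranked)


profit, fraction = max_profit_fraction(
    scores=[0.9, 0.8, 0.7, 0.4, 0.2, 0.1],
    churned=[1, 1, 0, 1, 0, 0],
)
print(f"maximum profit {profit:.2f} when targeting {fraction:.0%} of customers")
```

Comparing two classifiers under this sketch means ranking the same customers by each model's scores and preferring the one with the higher maximum profit; the accompanying fraction is the campaign size that achieves it.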

    Ensemble of Example-Dependent Cost-Sensitive Decision Trees

    Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. In previous works, some methods that incorporate the financial costs into the training of different algorithms have been proposed, with the example-dependent cost-sensitive decision tree algorithm being the one that gives the highest savings. In this paper we propose a new framework of ensembles of example-dependent cost-sensitive decision trees. The framework consists of creating different example-dependent cost-sensitive decision trees on random subsamples of the training set, and then combining them using three different combination approaches. Moreover, we propose two new cost-sensitive combination approaches: cost-sensitive weighted voting and cost-sensitive stacking, the latter being based on the cost-sensitive logistic regression method. Finally, using five different databases from four real-world applications (credit card fraud detection, churn modeling, credit scoring and direct marketing), we evaluate the proposed method against state-of-the-art example-dependent cost-sensitive techniques, namely cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision trees. The results show that the proposed algorithms give better results for all databases, in the sense of higher savings. Comment: 13 pages, 6 figures, submitted for possible publication.
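
A minimal sketch of what a savings measure and a cost-sensitive weighted vote could look like is given below. It is one plausible reading of the abstract, not the authors' implementation: the definition of savings relative to the cheaper of the two trivial policies, the clipping of negative weights to zero, and all function names are assumptions for illustration. Only false-positive and false-negative costs are modeled, and they vary per example.

```python
def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum example-dependent misclassification costs (correct predictions cost 0)."""
    cost = 0.0
    for y, p, c_fp, c_fn in zip(y_true, y_pred, cost_fp, cost_fn):
        if p == 1 and y == 0:
            cost += c_fp
        elif p == 0 and y == 1:
            cost += c_fn
    return cost


def savings(y_true, y_pred, cost_fp, cost_fn):
    """Relative cost reduction versus the cheaper of 'predict all 0' / 'predict all 1'."""
    n = len(y_true)
    base = min(total_cost(y_true, [0] * n, cost_fp, cost_fn),
               total_cost(y_true, [1] * n, cost_fp, cost_fn))
    if base == 0:
        return 0.0  # the trivial policy is already free; nothing left to save
    return (base - total_cost(y_true, y_pred, cost_fp, cost_fn)) / base


def weighted_vote(member_preds, weights):
    """Combine 0/1 predictions of ensemble members using non-negative weights."""
    weights = [max(w, 0.0) for w in weights]  # assumption: ignore members that lose money
    if sum(weights) == 0:
        weights = [1.0] * len(member_preds)   # fall back to plain majority voting
    total = sum(weights)
    combined = []
    for i in range(len(member_preds[0])):
        score = sum(w * preds[i] for w, preds in zip(weights, member_preds)) / total
        combined.append(1 if score >= 0.5 else 0)
    return combined


y_true = [1, 0, 1, 0]
cost_fp = [1.0, 1.0, 1.0, 1.0]
cost_fn = [10.0, 10.0, 50.0, 10.0]
members = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 1, 0]]
weights = [savings(y_true, m, cost_fp, cost_fn) for m in members]
print(weighted_vote(members, weights))
```

In practice the weights would be estimated on data the member was not trained on, so that a member's vote reflects out-of-sample savings rather than training fit.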

    Profit Maximizing Logistic Regression Modeling for Credit Scoring

    Multiple classification techniques have been employed for different business applications. In the particular case of credit scoring, a classifier which maximizes the total profit is preferable. The recently proposed expected maximum profit (EMP) measure for credit scoring makes it possible to select the most profitable classifier. Taking the idea of the EMP one step further, it is desirable to integrate the measure into model construction, and thus obtain a profit-maximizing model. Therefore, in this work we propose a method based on the ProfLogit classifier, which optimizes the coefficients of a logistic regression model using a genetic algorithm. The proposed technique shows a significant improvement compared to regular maximum-likelihood-based logistic regression models on real-life data sets in terms of total profit, which is the ultimate goal for most businesses.
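
The general idea of fitting logistic regression coefficients by directly maximizing profit with an evolutionary search can be sketched as below. This is not ProfLogit itself: the fitness here is a simple fixed-cutoff portfolio profit rather than the EMP measure, and the genetic operators (elitist selection, uniform crossover, Gaussian mutation), the payoff values `gain` and `loss`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def default_probability(coefs, row):
    """Logistic model: coefs[0] is the intercept, coefs[1:] the feature weights."""
    z = coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def portfolio_profit(coefs, X, defaulted, gain=1.0, loss=5.0, cutoff=0.5):
    """Hypothetical profit: approve when predicted default risk is below `cutoff`."""
    profit = 0.0
    for row, d in zip(X, defaulted):
        if default_probability(coefs, row) < cutoff:
            profit += -loss if d else gain
    return profit


def evolve_coefficients(X, defaulted, pop_size=30, generations=50,
                        sigma=0.5, seed=0):
    """Search for profit-maximizing coefficients; needs pop_size >= 4."""
    rng = random.Random(seed)
    n_coefs = len(X[0]) + 1

    def fitness(coefs):
        return portfolio_profit(coefs, X, defaulted)

    # Start from the all-zero model plus random candidates.
    population = [[0.0] * n_coefs] + [
        [rng.gauss(0.0, 1.0) for _ in range(n_coefs)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            children.append([rng.choice(pair) + rng.gauss(0.0, sigma)
                             for pair in zip(mum, dad)])
        population = survivors + children
    best = max(population, key=fitness)
    return best, fitness(best)


X = [[0.1], [0.2], [0.8], [0.9]]
defaulted = [0, 0, 1, 1]
best, profit = evolve_coefficients(X, defaulted)
print("coefficients:", best, "profit:", profit)
```

Because the profit function is piecewise constant in the coefficients, gradient-based fitting does not apply directly, which is one reason a population-based search is a natural fit for this kind of objective.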

    Predictive Customer Lifetime value modeling: Improving customer engagement and business performance

    CookUnity, a meal subscription service, has witnessed substantial annual revenue growth over the past three years. However, this growth has primarily been driven by the acquisition of new users to expand the customer base, rather than an evident increase in customers' spending levels. If it weren't for the raised subscription prices, the company's customer lifetime value (CLV) would have remained the same as it was three years ago. Consequently, the company's leadership recognizes the need to adopt a holistic approach to unlock an enhancement in CLV. The objective of this thesis is to develop a comprehensive understanding of CLV, its implications, and how companies leverage it to inform strategic decisions. Throughout the course of this study, our central focus is to deliver a fully functional and efficient machine learning solution to CookUnity. This solution will possess exceptional predictive capabilities, enabling accurate forecasting of each customer's future CLV. By equipping CookUnity with this powerful tool, our aim is to empower the company to strategically leverage CLV for sustained growth. To achieve this objective, we analyze various methodologies and approaches to CLV analysis, evaluating their applicability and effectiveness within the context of CookUnity. We thoroughly explore available data sources that can serve as predictors of CLV, ensuring the incorporation of the most relevant and meaningful variables in our model. Additionally, we assess different research methodologies to identify the top-performing approach and examine its implications for implementation at CookUnity. By implementing data-driven strategies based on our predictive CLV model, CookUnity will be able to optimize order levels and maximize the lifetime value of its customer base. The outcome of this thesis will be a robust ML solution with remarkable prediction accuracy and practical usability within the company. 
    Furthermore, the insights gained from our research will contribute to a broader understanding of CLV in the subscription-based business context, stimulating further exploration and advancement in this field of study.

    Holistic Measures for Evaluating Prediction Models in Smart Grids

    The performance of prediction models is often based on "abstract metrics" that estimate the model's ability to limit residual errors between the observed and predicted values. However, meaningful evaluation and selection of prediction models for end-user domains requires holistic and application-sensitive performance measures. Inspired by energy consumption prediction models used in the emerging "big data" domain of Smart Power Grids, we propose a suite of performance measures to rationally compare models along the dimensions of scale independence, reliability, volatility and cost. We include both application-independent and application-dependent measures, the latter parameterized to allow customization by domain experts to fit their scenario. While our measures are generalizable to other domains, we offer an empirical analysis using real energy use data for three Smart Grid applications: planning, customer education and demand response, which are relevant for energy sustainability. Our results underscore the value of the proposed measures in offering deeper insight into models' behavior and their impact on real applications, which benefits both data mining researchers and practitioners. Comment: 14 pages, 8 figures, accepted and to appear in IEEE Transactions on Knowledge and Data Engineering, 2014. Authors' final version. Copyright transferred to IEEE.
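
As a small illustration of the "scale independence" dimension, the sketch below computes the coefficient of variation of the root mean squared error (CVRMSE), a common residual-based measure that is normalized by the mean of the observations. The abstract does not say which measures make up the proposed suite, so the choice of CVRMSE and the function name are assumptions for this example.

```python
import math


def cvrmse(observed, predicted):
    """RMSE divided by the mean observed value, so the result is unit-free."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    mean_obs = sum(observed) / len(observed)
    if mean_obs == 0:
        raise ValueError("mean of observed values is zero; CVRMSE is undefined")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / mean_obs


# The same relative errors at two very different consumption scales
# (e.g. a single building versus a whole campus) give the same score.
small = cvrmse([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
large = cvrmse([10000.0, 20000.0, 30000.0], [12000.0, 18000.0, 33000.0])
print(small, large)
```

A measure like this allows models for consumers of different sizes to be compared directly, which a raw error in kWh would not.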