1,606 research outputs found

    Ensemble of Example-Dependent Cost-Sensitive Decision Trees

    Several real-world classification problems are example-dependent cost-sensitive in nature: the costs of misclassification vary between examples, not only between classes. However, standard classification methods do not take these costs into account and assume a constant cost for misclassification errors. In previous work, several methods that incorporate financial costs into the training of different algorithms have been proposed, with the example-dependent cost-sensitive decision tree algorithm giving the highest savings. In this paper we propose a new framework of ensembles of example-dependent cost-sensitive decision trees. The framework consists of training different example-dependent cost-sensitive decision trees on random subsamples of the training set and then combining them using three different combination approaches. Moreover, we propose two new cost-sensitive combination approaches: cost-sensitive weighted voting and cost-sensitive stacking, the latter based on the cost-sensitive logistic regression method. Finally, using five different databases from four real-world applications (credit card fraud detection, churn modeling, credit scoring and direct marketing), we evaluate the proposed method against state-of-the-art example-dependent cost-sensitive techniques, namely cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision trees. The results show that the proposed algorithms achieve higher savings on all databases. Comment: 13 pages, 6 figures, submitted for possible publication
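
    The notion of "savings" under example-dependent costs can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that correct predictions cost nothing and that savings are measured relative to the cheaper of the two trivial classifiers, predict-all-negative and predict-all-positive. The function names and the toy cost values are hypothetical.

```python
from typing import Sequence


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost_fp: Sequence[float], cost_fn: Sequence[float]) -> float:
    """Sum per-example misclassification costs.

    Assumption: correct predictions cost 0; each example carries its own
    false-positive and false-negative cost.
    """
    total = 0.0
    for yt, yp, cfp, cfn in zip(y_true, y_pred, cost_fp, cost_fn):
        if yp == 1 and yt == 0:
            total += cfp  # false positive on this example
        elif yp == 0 and yt == 1:
            total += cfn  # false negative on this example
    return total


def savings(y_true: Sequence[int], y_pred: Sequence[int],
            cost_fp: Sequence[float], cost_fn: Sequence[float]) -> float:
    """Relative cost reduction versus the cheaper trivial classifier (assumed baseline)."""
    n = len(y_true)
    cost_all_neg = total_cost(y_true, [0] * n, cost_fp, cost_fn)
    cost_all_pos = total_cost(y_true, [1] * n, cost_fp, cost_fn)
    baseline = min(cost_all_neg, cost_all_pos)
    if baseline == 0:
        raise ValueError("baseline cost is zero; savings is undefined")
    return (baseline - total_cost(y_true, y_pred, cost_fp, cost_fn)) / baseline


# Hypothetical toy data: two positives with different false-negative costs.
y_true = [1, 0, 0, 1]
y_pred = [1, 0, 1, 1]
cost_fp = [20.0, 20.0, 20.0, 20.0]
cost_fn = [100.0, 0.0, 0.0, 50.0]

print(total_cost(y_true, y_pred, cost_fp, cost_fn))
print(savings(y_true, y_pred, cost_fp, cost_fn))
```

    A classifier with the same accuracy can have very different savings depending on which examples it gets wrong, which is the motivation for cost-sensitive training.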

    An Efficient Hybrid Classifier Model for Customer Churn Prediction

    Customer churn prediction is used to retain customers at the highest risk of churn by proactively engaging with them. Many machine learning-based data mining approaches have previously been used to predict client churn. However, single-model classifiers increase the scatter of predictions and perform poorly, which degrades the reliability of the model. Hence, bag-of-learners-based classification is used, in which high-performing learners are selected to estimate wrongly and correctly classified instances, thereby increasing the robustness of model performance. Furthermore, loss of interpretability in the model during prediction leads to insufficient prediction accuracy. Hence, an associative classifier with the Apriori algorithm is introduced as a booster that integrates classification and association rule mining to build a strong classification model, in which frequent items are obtained using the Apriori algorithm. In addition, accurate prediction is provided by testing the wrongly classified instances from the bagging phase against the rules generated in the associative classifier. The proposed models are then simulated in Python, and the results show high accuracy, ROC score, precision, specificity, F-measure, and recall.

    Improving customer churn prediction by data augmentation using pictorial stimulus-choice data

    The purpose of this paper is to determine the added value of pictorial stimulus-choice data in customer churn prediction. Using Random Forests and 5 times 2-fold cross-validation, this study analyzes how much pictorial stimulus-choice data and survey data increase the AUC of a churn model over and above administrative, operational and complaints data. The finding is that pictorial stimulus-choice data significantly increase the AUC of models with administrative and operational data. The practical implication of this finding is that companies should start considering mining pictorial data from social media sites (e.g. Pinterest) in order to augment their internal customer database. This study is original in that it is the first to assess the added value of pictorial stimulus-choice data in predictive models. This is important because more and more social media websites are focusing on pictures.
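
    The "5 times 2-fold" scheme mentioned above can be sketched as follows: the data are shuffled five times, each time split into two halves, and each half serves once for training and once for testing, giving ten train/test splits. This is only an illustration of the resampling scheme; the function name is hypothetical, and whether the study stratified its splits is not stated in the abstract (this sketch does not stratify).

```python
import random
from typing import Iterator, List, Tuple


def five_by_two_splits(n_samples: int, seed: int = 0,
                       repeats: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for repeated 2-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    half = n_samples // 2
    for _ in range(repeats):
        rng.shuffle(indices)  # new random partition for each repetition
        first, second = indices[:half], indices[half:]
        yield first, second   # train on the first half, test on the second
        yield second, first   # swap the roles of the two halves


splits = list(five_by_two_splits(10))
for train_idx, test_idx in splits:
    # A model would be fitted on train_idx and its AUC measured on test_idx.
    print(sorted(train_idx), sorted(test_idx))
```

    The ten resulting scores (here, AUC values) are what a paired comparison between two feature sets would be based on.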

    Customer Churn Prediction in Telecom Sector: A Survey and way a head

    © 2021 International Journal of Scientific & Technology Research. This work is licensed under a Creative Commons Attribution 4.0 International License. The telecommunication (telecom) industry is a highly technological domain that has developed rapidly over the previous decades as a result of the commercial success of mobile communication and the internet. Due to strong competition in the telecom market, companies use business strategies to better understand their customers' needs and measure their satisfaction. This helps telecom companies improve their retention power and reduce the probability of churn. Knowing the reasons behind customer churn and using Machine Learning (ML) approaches to analyze customers' information can be of great value for churn management. This paper aims to study the importance of Customer Churn Prediction (CCP) and recent research in the field of CCP. Challenges and open issues that need further research and development for CCP in the telecom sector are explored. Peer reviewed

    Research trends in customer churn prediction: A data mining approach

    This study presents a recent literature review on customer churn prediction based on 40 relevant articles published between 2010 and June 2020. The 40 most relevant articles according to Google Scholar ranking were selected and collected. Each article was then scrutinized according to six main dimensions: Reference; Areas of Research; Main Goal; Dataset; Techniques; Outcomes. The review finds that the most widely used data mining techniques are decision trees (DT), support vector machines (SVM) and logistic regression (LR). The massive accumulation of data in the telecom industry, combined with increasingly mature data mining technology, motivates the development and application of customer churn models to predict customer behavior. Telecom companies can therefore effectively predict customer churn and then avoid it by taking measures such as reducing monthly fixed fees. The present literature review offers recent insights into the scientific literature on customer churn prediction, revealing research gaps, providing evidence of current trends and helping to understand how to develop accurate and efficient marketing strategies. The most important finding is that artificial intelligence techniques have clearly become more widely used in recent years for telecom customer churn prediction. In particular, artificial neural networks (NN) are widely recognized as a competent prediction method. This is a relevant topic for journals related to other social sciences, such as banking, and telecom data make up an outstanding source for developing novel prediction modeling techniques. Thus, this study can lead to recommendations for future customer churn prediction improvement, in addition to providing an overview of current research trends.

    A predict-and-optimize approach to profit-driven churn prevention

    In this paper, we introduce a novel predict-and-optimize method for profit-driven churn prevention. We frame the task of targeting customers for a retention campaign as a regret minimization problem. The main objective is to leverage individual customer lifetime values (CLVs) to ensure that only the most valuable customers are targeted. In contrast, many profit-driven strategies focus on churn probabilities while considering average CLVs. This often results in significant information loss due to data aggregation. Our proposed model aligns with the guidelines of Predict-and-Optimize (PnO) frameworks and can be efficiently solved using stochastic gradient descent methods. Results from 12 churn prediction datasets underscore the effectiveness of our approach, which achieves the best average performance compared to other well-established strategies in terms of average profit. Comment: 15 pages, 4 figures, submitted to INFORMATION SCIENCE
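
    The information loss from using average CLVs can be seen in a toy ranking example. The sketch below is not the paper's predict-and-optimize model; it only compares two simple targeting scores, churn probability times the average CLV versus churn probability times the individual CLV. The customer names, probabilities and CLV values are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k customers with the highest score."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:k]


# Hypothetical customers: name -> (churn probability, individual CLV).
customers: Dict[str, Tuple[float, float]] = {
    "A": (0.9, 50.0),
    "B": (0.4, 400.0),
    "C": (0.6, 100.0),
}

mean_clv = sum(clv for _, clv in customers.values()) / len(customers)

# Score using the average CLV: the ranking depends on churn probability only.
score_avg = {name: p * mean_clv for name, (p, _) in customers.items()}
# Score using individual CLVs: expected value lost if the customer churns.
score_ind = {name: p * clv for name, (p, clv) in customers.items()}

target_avg = top_k(score_avg, 1)
target_ind = top_k(score_ind, 1)
print(target_avg, target_ind)
```

    With a budget for one retention offer, the average-CLV score selects the most likely churner (A), while the individual-CLV score selects the customer whose expected loss is largest (B).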

    Churn Identification and Prediction from a Large-Scale Telecommunication Dataset Using NLP

    The identification of customer churn is a major issue for large telecom businesses. A substantial volume of data is generated every day in order to manage the data of current customers as well as to acquire and manage new customers. It is therefore crucial to identify the causes of client churn so that appropriate steps can be taken to lower it. Numerous researchers have already discussed their efforts to combine static and dynamic approaches in order to reduce churn in big data sets, but these systems still have many issues when it comes to actually identifying churn. In this paper, we propose two methods: the first is churn identification, and the second uses Natural Language Processing (NLP) methods and machine learning techniques to make predictions based on a vast telecommunication data set. The NLP process involves data pre-processing, normalization, feature extraction, and feature selection. For feature extraction, we employ techniques such as TF-IDF, Stanford NLP, and occurrence correlation methods. Throughout, a machine learning classification algorithm is used for training and testing. Finally, the system employs a variety of cross-validation techniques for training and evaluating machine learning algorithms. The experimental analysis shows the system's efficacy and accuracy.
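
    As an illustration of the TF-IDF feature extraction step named above, here is a minimal sketch. It assumes one common variant (term frequency as count divided by document length, inverse document frequency as the natural log of N over document frequency, no smoothing); the abstract does not say which variant the authors used. The function name and the toy customer messages are hypothetical, and documents are assumed to be non-empty.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute TF-IDF weights per document (unsmoothed variant, assumed here)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized customer messages.
docs = [
    "network outage again cancel my plan".split(),
    "happy with my plan".split(),
    "billing error cancel my plan".split(),
]

weights = tf_idf(docs)
for w in weights:
    print(sorted(w.items(), key=lambda item: item[1], reverse=True))
```

    Terms that occur in every message (here "my" and "plan") receive zero weight, while rarer terms such as "outage" are weighted more heavily than terms shared by several messages.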