Ensemble of Example-Dependent Cost-Sensitive Decision Trees
Several real-world classification problems are example-dependent
cost-sensitive in nature, where the costs due to misclassification vary between
examples and not only within classes. However, standard classification methods
do not take these costs into account, and assume a constant cost of
misclassification errors. Previous work has proposed methods that incorporate
financial costs into the training of different algorithms, with the
example-dependent cost-sensitive decision tree algorithm yielding the highest
savings. In this paper we propose a new
framework of ensembles of example-dependent cost-sensitive decision trees. The
framework consists of training different example-dependent cost-sensitive
decision trees on random subsamples of the training set, and then combining
them using three different combination approaches. Moreover, we propose two new
cost-sensitive combination approaches: cost-sensitive weighted voting and
cost-sensitive stacking, the latter being based on the cost-sensitive logistic
regression method. Finally, using five different databases, from four
real-world applications: credit card fraud detection, churn modeling, credit
scoring and direct marketing, we evaluate the proposed method against
state-of-the-art example-dependent cost-sensitive techniques, namely,
cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision
trees. The results show that the proposed algorithms achieve higher savings on
all databases.
Comment: 13 pages, 6 figures, submitted for possible publication
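As a rough illustration of the ideas in this abstract, the sketch below computes an example-dependent misclassification cost, a savings score, and a savings-weighted vote over base classifiers. The abstract gives no formulas, so everything here is an assumption: the per-example cost tuple layout, the definition of savings as the relative cost reduction against the cheaper of the two trivial policies (predict all 0 or all 1), and the use of non-negative savings as voting weights. The function names `total_cost`, `savings`, and `weighted_vote` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed layout: (cost_fp, cost_fn, cost_tp, cost_tn) for each example.
Cost = Tuple[float, float, float, float]


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               costs: Sequence[Cost]) -> float:
    """Sum the cost incurred on each example given its own cost tuple."""
    total = 0.0
    for y, p, (c_fp, c_fn, c_tp, c_tn) in zip(y_true, y_pred, costs):
        if y == 1:
            total += c_tp if p == 1 else c_fn
        else:
            total += c_fp if p == 1 else c_tn
    return total


def savings(y_true: Sequence[int], y_pred: Sequence[int],
            costs: Sequence[Cost]) -> float:
    """Relative cost reduction versus the cheaper trivial policy (assumed definition)."""
    n = len(y_true)
    base = min(total_cost(y_true, [0] * n, costs),
               total_cost(y_true, [1] * n, costs))
    if base == 0:
        return 0.0
    return (base - total_cost(y_true, y_pred, costs)) / base


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[float]) -> List[int]:
    """Combine binary predictions; each base classifier votes with its weight."""
    w = [max(x, 0.0) for x in weights]  # negative savings get no vote
    if sum(w) == 0:
        w = [1.0] * len(w)  # fall back to plain majority voting
    total = sum(w)
    n = len(predictions[0])
    combined = []
    for i in range(n):
        score = sum(wj * preds[i] for preds, wj in zip(predictions, w))
        combined.append(1 if 2 * score >= total else 0)
    return combined


# Toy data: a missed positive costs 10, any positive prediction costs 1.
y_true = [1, 0, 1, 0]
costs = [(1.0, 10.0, 1.0, 0.0)] * 4
base_preds = [[1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
base_weights = [savings(y_true, p, costs) for p in base_preds]
ensemble_pred = weighted_vote(base_preds, base_weights)
```

In practice the weights would be estimated on data held out from each tree's training subsample rather than on the evaluation set, as done here only to keep the example short.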
An Efficient Hybrid Classifier Model for Customer Churn Prediction
Customer churn prediction is used to retain customers at the highest risk of churn by proactively engaging with them. Many machine learning-based data mining approaches have previously been used to predict customer churn. However, single-model classifiers increase the scatter of predictions and perform poorly, which degrades the reliability of the model. Hence, bag-of-learners-based classification is used, in which high-performing learners are selected to estimate wrongly and correctly classified instances, thereby increasing the robustness of model performance. Furthermore, loss of interpretability in the model during prediction leads to insufficient prediction accuracy. Hence, an associative classifier with the Apriori algorithm is introduced as a booster that integrates classification and association rule mining to build a strong classification model, in which frequent items are obtained using the Apriori algorithm. In addition, prediction accuracy is improved by testing the wrongly classified instances from the bagging phase against the rules generated by the associative classifier. The proposed models are then simulated in Python, and the results show high accuracy, ROC score, precision, specificity, F-measure, and recall.
Improving customer churn prediction by data augmentation using pictorial stimulus-choice data
The purpose of this paper is to determine the added value of pictorial stimulus-choice data in customer churn prediction. Using Random Forests and 5×2-fold cross-validation, this study analyzes how much pictorial stimulus-choice data and survey data increase the AUC of a churn model over and above administrative, operational and complaints data. The finding is that pictorial stimulus-choice data significantly increases the AUC of models with administrative and operational data. The practical implication of this finding is that companies should start considering mining pictorial data from social media sites (e.g. Pinterest) in order to augment their internal customer database. This study is original in that it is the first to assess the added value of pictorial stimulus-choice data in predictive models. This is important because more and more social media websites are focusing on pictures.
Customer Churn Prediction in Telecom Sector: A Survey and way a head
© 2021 International Journal of Scientific & Technology Research. This work is licensed under a Creative Commons Attribution 4.0 International License. The telecommunication (telecom) industry is a highly technological domain that has developed rapidly over the previous decades as a result of the commercial success of mobile communication and the internet. Due to the strong competition in the telecom market, companies use business strategies to better understand their customers' needs and measure their satisfaction. This helps telecom companies improve their retention power and reduce the probability of churn. Knowing the reasons behind customer churn and using Machine Learning (ML) approaches to analyze customers' information can be of great value for churn management. This paper aims to study the importance of Customer Churn Prediction (CCP) and recent research in the field of CCP. Challenges and open issues that need further research and development in CCP for the telecom sector are explored. Peer reviewed.
Research trends in customer churn prediction: A data mining approach
This study presents a recent literature review on customer churn prediction based on 40 relevant articles published between 2010 and June 2020. The 40 most relevant articles according to Google Scholar ranking were selected and collected. Each article was then scrutinized along six main dimensions: Reference; Areas of Research; Main Goal; Dataset; Techniques; Outcomes. The review shows that the most widely used data mining techniques are decision trees (DT), support vector machines (SVM) and logistic regression (LR). The massive accumulation of data in the telecom industry, combined with increasingly mature data mining technology, motivates the development and application of customer churn models to predict customer behavior. Telecom companies can therefore predict customer churn effectively and then avoid it by taking measures such as reducing monthly fixed fees. The present literature review offers recent insights into the scientific literature on customer churn prediction, revealing research gaps, providing evidence on current trends and helping to understand how to develop accurate and efficient marketing strategies. The most important finding is that artificial intelligence techniques have clearly become more widely used in recent years for telecom customer churn prediction. In particular, artificial neural networks (NN) are widely recognized as a competent prediction method. The topic is also relevant for journals in other social sciences, such as banking, and telecom data are an outstanding source for developing novel prediction modeling techniques. Thus, this study can lead to recommendations for improving future customer churn prediction, in addition to providing an overview of current research trends.
A predict-and-optimize approach to profit-driven churn prevention
In this paper, we introduce a novel predict-and-optimize method for
profit-driven churn prevention. We frame the task of targeting customers for a
retention campaign as a regret minimization problem. The main objective is to
leverage individual customer lifetime values (CLVs) to ensure that only the
most valuable customers are targeted. In contrast, many profit-driven
strategies focus on churn probabilities while considering average CLVs. This
often results in significant information loss due to data aggregation. Our
proposed model aligns with the guidelines of Predict-and-Optimize (PnO)
frameworks and can be efficiently solved using stochastic gradient descent
methods. Results from 12 churn prediction datasets underscore the effectiveness
of our approach, which achieves the best average performance compared to other
well-established strategies in terms of average profit.
Comment: 15 pages, 4 figures, submitted to INFORMATION SCIENCE
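The contrast this abstract draws between individual and average CLVs can be illustrated with a small targeting sketch. It is not the paper's regret-minimization model, which the abstract does not specify. It simply ranks customers by churn probability times value and compares the two rankings. The scoring rule, the fixed campaign size `k`, the toy numbers, and the helper names `select_top_k` and `value_at_risk` are all assumptions.

```python
from typing import List, Sequence


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring customers."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def value_at_risk(targets: Sequence[int], churn_prob: Sequence[float],
                  clv: Sequence[float]) -> float:
    """Expected customer value at risk among the targeted customers."""
    return sum(churn_prob[i] * clv[i] for i in targets)


# Toy data (assumed): likely churners happen to be low-value customers.
churn_prob = [0.9, 0.8, 0.3, 0.2]
clv = [10.0, 20.0, 500.0, 1000.0]
k = 2  # campaign budget: number of customers that can be contacted

# Strategy A: every customer is scored with the same average CLV,
# so the ranking is driven by churn probability alone.
avg_clv = sum(clv) / len(clv)
targets_avg = select_top_k([p * avg_clv for p in churn_prob], k)

# Strategy B: each customer is scored with their own CLV.
targets_ind = select_top_k([p * v for p, v in zip(churn_prob, clv)], k)

value_avg = value_at_risk(targets_avg, churn_prob, clv)
value_ind = value_at_risk(targets_ind, churn_prob, clv)
```

With these toy numbers the average-CLV ranking targets the two most likely churners, while the individual-CLV ranking targets the two customers whose potential loss is largest, which is the information loss from aggregation that the abstract refers to.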
Churn Identification and Prediction from a Large-Scale Telecommunication Dataset Using NLP
The identification of customer churn is a major issue for large telecom businesses. A substantial volume of data is generated every day in managing current customers and in acquiring and managing new ones. It is therefore crucial to identify the causes of customer churn so that appropriate steps can be taken to lower it. Numerous researchers have already discussed their efforts to combine static and dynamic approaches in order to reduce churn in big data sets, but these systems still have many issues when it comes to actually identifying churn. In this paper, we suggest two methods, the first of which is churn identification. Using Natural Language Processing (NLP) methods and machine learning techniques, we make predictions based on a vast telecommunication data set. The NLP process involves data pre-processing, normalization, feature extraction, and feature selection. For feature extraction, techniques such as TF-IDF, Stanford NLP, and occurrence correlation methods are employed. Throughout the process, a machine learning classification algorithm is used for training and testing. Finally, the system employs a variety of cross-validation techniques for training and evaluating machine learning algorithms. The experimental analysis shows the system's efficacy and accuracy.
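Of the feature-extraction techniques named in this abstract, TF-IDF is simple enough to sketch directly. The version below assumes one common weighting (term frequency normalized by document length, times the natural log of the number of documents over the document frequency). The abstract does not say which variant the authors use, and the tokenized toy documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def tf_idf(docs: Sequence[Sequence[str]]) -> List[Dict[str, float]]:
    """Compute a sparse TF-IDF vector (term -> weight) for each tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy, already tokenized customer messages (assumed data).
docs = [
    ["slow", "network", "cancel"],
    ["great", "network"],
    ["cancel", "contract", "cancel"],
]
vectors = tf_idf(docs)
```

The resulting per-document weights could then be fed, after feature selection, to whichever classifier is being trained and cross-validated.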