A SLR on Customer Dropout Prediction
Dropout prediction is a problem increasingly addressed with machine learning algorithms, and appropriate approaches to reducing the dropout rate are needed. Selecting an algorithm to predict the dropout rate is only one part of the problem. Other aspects must also be considered, such as which features to select and how to measure accuracy, while assessing whether those features are appropriate to the business context in which they are employed. To address these questions, this paper develops a systematic literature review that evaluates existing studies on predicting the dropout rate in contractual settings with machine learning, in order to identify current trends and research opportunities. The results identify trends in the use of machine learning algorithms across different business areas, including which metrics and which features are being adopted. Finally, research opportunities and gaps that could be explored in future work are presented.
Churn prediction based on text mining and CRM data analysis
Within quantitative marketing, churn prediction at the single-customer level has become a major issue. An extensive body of literature shows that, today, churn prediction is mainly based on structured CRM data. However, in recent years, more and more digitized customer text data has become available, originating from emails, surveys or transcripts of phone calls. To date, this data source remains vastly untapped for churn prediction, and corresponding methods are rarely described in the literature.
Filling this gap, we present a method for estimating churn probabilities directly from text data, by adopting classical text mining methods and combining them with state-of-the-art statistical prediction modelling. We transform every customer text document into a vector in a high-dimensional word space, after applying text mining pre-processing steps such as removal of stop words, stemming and word selection. The churn probability is then estimated by statistical modelling, using random forest models. We applied these methods to customer text data of a major Swiss telecommunication provider, with data originating from transcripts of phone calls between customers and call-centre agents.
In addition to the analysis of the text data, a similar churn prediction was performed for the same customers, based on structured CRM data. This second approach serves as a benchmark for the text data churn prediction, and is performed by using random forest on the structured CRM data which contains more than 300 variables.
Comparing the churn prediction based on text data to classical churn prediction based on structured CRM data, we found that the churn prediction based on text data performs as well as the prediction using structured CRM data. Furthermore we found that by combining both structured and text data, the prediction accuracy can be increased up to 10%.
These results clearly show that text data contains valuable information and should be considered for churn estimation.
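The pipeline described above (text pre-processing, word-space vectors, random forest) can be sketched roughly as follows. This is a minimal illustration only: the transcripts and labels are toy stand-ins, not the Swiss telecom data, and TF-IDF with built-in stop-word removal substitutes for the paper's exact pre-processing steps (stemming and word selection would be added in the full method).

```python
# Hedged sketch: transform call transcripts into word-space vectors,
# then estimate churn probabilities with a random forest.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier

transcripts = [
    "customer unhappy wants to cancel contract",
    "asked about upgrading to a larger data plan",
    "complained about billing errors threatened to leave",
    "praised network coverage renewed subscription",
]
churned = [1, 0, 1, 0]  # 1 = customer later churned (toy labels)

# Stop-word removal is built into the vectorizer; stemming and word
# selection from the paper's pipeline would be applied here as well.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(transcripts)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, churned)
# Estimated churn probability for a new, unseen transcript.
proba = model.predict_proba(vectorizer.transform(["wants to cancel now"]))[0, 1]
print(round(proba, 2))
```

In practice the structured CRM features would be concatenated with the text vectors to obtain the combined model whose accuracy gain the study reports.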
Research trends in customer churn prediction: A data mining approach
This study presents a recent literature review on customer churn prediction based on the 40 most relevant articles published between 2010 and June 2020, selected and collected according to Google Scholar ranking. Each article was then scrutinized along six main dimensions: reference, area of research, main goal, dataset, techniques and outcomes. The research shows that the most widely used data mining techniques are decision trees (DT), support vector machines (SVM) and logistic regression (LR). The massive data accumulation in the telecom industry, combined with increasingly mature data mining technology, motivates the development and application of customer churn models to predict customer behaviour; a telecom company can thus effectively predict churn and then avoid it by taking measures such as reducing monthly fixed fees. The review offers recent insights into the customer churn prediction literature, revealing research gaps, providing evidence of current trends and helping to understand how to develop accurate and efficient marketing strategies. The most important finding is that artificial intelligence techniques are clearly being used more in recent years for telecom customer churn prediction; in particular, artificial neural networks stand out as a competent prediction method. The topic is relevant to journals in other social sciences, such as banking, and telecom data constitute an outstanding source for developing novel prediction modelling techniques. This study can therefore lead to recommendations for future customer churn prediction improvements, in addition to providing an overview of current research trends.
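The three techniques the review found most common (DT, SVM, LR) can be compared side by side in a few lines. This sketch uses synthetic, class-imbalanced data as a stand-in for a telecom churn dataset; the feature names, class weights and sample sizes are illustrative assumptions, not taken from any reviewed study.

```python
# Illustrative comparison of the three most-used techniques on synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for telecom churn features (tenure, fees, usage, ...);
# weights=[0.8] gives the class imbalance typical of churn data.
X, y = make_classification(n_samples=500, n_features=10, weights=[0.8],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}
# Held-out accuracy for each technique.
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
print(scores)
```

On imbalanced churn data, plain accuracy flatters majority-class predictors, which is why the reviewed studies also report metrics such as AUC, precision and recall.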
Customer churn prediction in telecommunication industry using data certainty
Customer Churn Prediction (CCP) is a challenging activity for decision makers and the machine learning community because, most of the time, churn and non-churn customers have similar features. Experiments on customer churn and related data show that a classifier exhibits different accuracy levels in different zones of a dataset, and in such situations a correlation can easily be observed between the classifier's accuracy and the certainty of its predictions. If a mechanism can be defined to estimate the classifier's certainty for different zones within the data, the expected accuracy can be estimated even before classification. This paper presents a novel CCP approach based on this concept of certainty estimation using a distance factor. The dataset is grouped into zones based on the distance factor, which are then divided into two categories: (i) data with high certainty and (ii) data with low certainty, for predicting customers exhibiting churn and non-churn behaviour. Evaluation with standard measures (e.g., accuracy, F-measure, precision and recall) on publicly available Telecommunication Industry (TCI) datasets shows that (i) the distance factor is strongly correlated with the certainty of the classifier, and (ii) the classifier achieves higher accuracy in zones with larger distance-factor values (customers churning or not churning with high certainty) than in zones with smaller distance-factor values (churn and non-churn with low certainty).
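The core intuition (samples in a high-certainty zone are classified more accurately than samples in a low-certainty zone) can be demonstrated with a simple proxy. Note the hedge: the paper defines its own distance factor, whereas this sketch approximates it with the absolute logistic-regression decision score and a median split, both of which are assumptions for illustration.

```python
# Hedged illustration: samples far from the decision boundary ("high
# certainty") should be classified more accurately than samples near it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# flip_y=0.1 injects label noise, as in realistic churn data.
X, y = make_classification(n_samples=1000, n_features=8, flip_y=0.1,
                           random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

distance = np.abs(clf.decision_function(X))   # crude distance-factor proxy
high = distance >= np.median(distance)        # split into two certainty zones
pred = clf.predict(X)

acc_high = (pred[high] == y[high]).mean()     # accuracy in high-certainty zone
acc_low = (pred[~high] == y[~high]).mean()    # accuracy in low-certainty zone
print(f"high-certainty zone accuracy: {acc_high:.2f}")
print(f"low-certainty zone accuracy:  {acc_low:.2f}")
```

The gap between the two zone accuracies is exactly the kind of correlation between distance factor and classifier certainty that the paper exploits.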
Cross-company customer churn prediction in telecommunication: A comparison of data transformation methods
Cross-Company Churn Prediction (CCCP) is a domain of research in which one company (the target) lacks sufficient data and can use data from another company (the source) to predict customer churn successfully. To support CCCP, the cross-company data is usually transformed to approximate the distribution of the target company's data prior to building a CCCP model. However, it is still unclear which data transformation method is most effective in CCCP, and the impact of data transformation methods on CCCP model performance across different classifiers has not been comprehensively explored in the telecommunication sector. In this study, we devised a model for CCCP using data transformation methods (i.e., log, z-score, rank and box-cox) and present not only an extensive comparison validating the impact of these transformation methods on CCCP, but also an evaluation of the underlying baseline classifiers (i.e., Naive Bayes (NB), K-Nearest Neighbour (KNN), Gradient Boosted Tree (GBT), Single Rule Induction (SRI) and Deep learner Neural net (DP)) for customer churn prediction in the telecommunication sector using the above-mentioned data transformation methods. We performed experiments on publicly available datasets related to the telecommunication sector. The results demonstrate that most of the data transformation methods (e.g., log, rank and box-cox) improve the performance of CCCP significantly, whereas the z-score transformation could not achieve better results than the other transformation methods in this study. Moreover, the CCCP model based on NB outperforms the others on transformed data; DP, KNN and GBT perform on average, while the SRI classifier does not show significant results in terms of the commonly used evaluation measures (i.e., probability of detection, probability of false alarm, area under the curve and g-mean).
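The four transformations compared in the study are standard and easy to state concretely. This sketch applies each to a single skewed synthetic feature (e.g., monthly call minutes); the data is an illustrative assumption, and Box-Cox additionally requires strictly positive values.

```python
# The four data transformations from the study, applied to a skewed feature.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(mean=3.0, sigma=1.0, size=1000)  # skewed, positive feature

log_x = np.log(x)                        # log transform
z_x = (x - x.mean()) / x.std()           # z-score standardisation
rank_x = stats.rankdata(x)               # rank transform
boxcox_x, lam = stats.boxcox(x)          # Box-Cox with fitted lambda

# Box-Cox (and log) pull a skewed feature towards a normal shape, which is
# the point of transforming source-company data before building a CCCP model.
print(f"original skew: {stats.skew(x):.2f}, "
      f"box-cox skew: {stats.skew(boxcox_x):.2f}")
```

In a CCCP setting the same transformation would be fitted on the source company's data and applied consistently to both source and target features.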
Intelligent data analysis approaches to churn as a business problem: a survey
Globalization processes and market deregulation policies are rapidly changing the competitive environments of many economic sectors. The appearance of new competitors and technologies leads to an increase in competition and, with it, a growing preoccupation among service-providing companies with creating stronger customer bonds. In this context, anticipating the customer's intention to abandon the provider, a phenomenon known as churn, becomes a competitive advantage. Such anticipation can be the result of the correct application of information-based knowledge extraction in the form of business analytics. In particular, the use of intelligent data analysis, or data mining, for the analysis of market-surveyed information can be of great assistance to churn management. In this paper, we provide a detailed survey of recent applications of business analytics to churn, with a focus on computational intelligence methods. This is preceded by an in-depth discussion of churn within the context of customer continuity management. The survey is structured according to the stages identified as basic for building predictive models of churn, as well as according to the different types of predictive methods employed and the business areas of their application.
Cost-sensitive probabilistic predictions for support vector machines
Support vector machines (SVMs) are widely used and constitute one of the most thoroughly studied machine learning models for two-class classification. Classification in the SVM is based on a score procedure that yields a deterministic classification rule, which can be transformed into a probabilistic rule (as implemented in off-the-shelf SVM libraries) but is not probabilistic in nature. On the other hand, tuning the SVM's regularization parameters is known to require high computational effort and generates information that is not fully exploited, since it is not used to build a probabilistic classification rule. In this paper we propose a novel approach to generating probabilistic outputs for the SVM. The new method has three properties. First, it is designed to be cost-sensitive, so the differing importance of sensitivity (true positive rate, TPR) and specificity (true negative rate, TNR) is readily accommodated in the model; as a result, the model can deal with the imbalanced datasets common in operational business problems such as churn prediction or credit scoring. Second, the SVM is embedded in an ensemble method to improve its performance, making use of the valuable information generated during the parameter tuning process. Finally, probability estimation is done via bootstrap estimates, avoiding the parametric models used by competing approaches. Numerical tests on a wide range of datasets show the advantages of our approach over benchmark procedures. (European Journal of Operational Research, 2023)
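The general idea of non-parametric, bootstrap-based SVM probabilities can be sketched as follows. This is not the authors' algorithm: the resample count, the use of `class_weight="balanced"` as a stand-in for the explicit TPR/TNR trade-off, and the vote-fraction probability are all simplifying assumptions for illustration.

```python
# Rough sketch: train SVMs on bootstrap resamples and use the fraction of
# positive votes as a non-parametric probability estimate, instead of a
# parametric (Platt-style) link as in off-the-shelf SVM libraries.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Imbalanced synthetic data, as in churn prediction or credit scoring.
X, y = make_classification(n_samples=300, n_features=6, weights=[0.8],
                           random_state=0)
rng = np.random.default_rng(0)

votes = np.zeros(len(X))
n_boot = 25
for _ in range(n_boot):
    idx = rng.integers(0, len(X), len(X))   # bootstrap resample
    # class_weight="balanced" is a simple cost-sensitive stand-in for the
    # paper's explicit sensitivity/specificity weighting.
    svm = SVC(class_weight="balanced").fit(X[idx], y[idx])
    votes += svm.predict(X)

proba = votes / n_boot                      # bootstrap probability estimate
print(proba[:5].round(2))
```

Because the probability is a vote fraction rather than a fitted sigmoid, no parametric assumption is imposed on the score-to-probability mapping.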
Towards a More Reliable Interpretation of Machine Learning Outputs for Safety-Critical Systems Using Feature Importance Fusion
When machine learning supports decision-making in safety-critical systems, it is important to verify and understand why a particular output is produced. Although feature importance calculation approaches assist in interpretation, there is a lack of consensus on how feature importance should be quantified, which makes the explanations offered for the outcomes largely unreliable. A possible solution to this lack of agreement is to combine the results from multiple feature importance quantifiers to reduce the variance of the estimates. Our hypothesis is that this will lead to more robust and trustworthy interpretations of each feature's contribution to machine learning predictions. To test this hypothesis, we propose an extensible framework divided into four main parts: (i) traditional data pre-processing and preparation for predictive machine learning models; (ii) predictive machine learning; (iii) feature importance quantification; and (iv) feature importance decision fusion using an ensemble strategy. We also introduce a novel fusion metric and compare it to the state of the art. Our approach is tested on synthetic data, where the ground truth is known. We compare different fusion approaches and their results for both training and test sets, and we investigate how different characteristics of the datasets affect the feature importance ensembles studied. Results show that our feature importance ensemble framework produces 15% less feature importance error overall than existing methods. Additionally, the results reveal that different levels of noise in the datasets do not affect the ensembles' ability to accurately quantify feature importance, whereas the quantification error increases with the number of features and the number of orthogonal informative features.
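The fusion idea in part (iv) can be sketched minimally: compute importances with two different quantifiers, normalise each, and fuse them. The paper's framework uses more quantifiers and a novel fusion metric; the plain average below, and the choice of impurity-based and permutation importance as the two quantifiers, are illustrative assumptions only.

```python
# Minimal feature-importance-fusion sketch: two quantifiers, normalised,
# fused by averaging.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic data with known informative features (ground truth, as in the
# paper). shuffle=False keeps the 3 informative features in columns 0-2.
X, y = make_classification(n_samples=400, n_features=6, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

imp_tree = model.feature_importances_                 # quantifier 1: impurity
imp_perm = permutation_importance(model, X, y, n_repeats=5,
                                  random_state=0).importances_mean  # quantifier 2

def normalise(v):
    v = np.clip(v, 0, None)   # permutation importances can be negative
    return v / v.sum()

fused = (normalise(imp_tree) + normalise(imp_perm)) / 2
print(fused.round(3))
```

With ground truth known, the feature importance error the paper reports would be measured as the gap between `fused` and the true informative-feature pattern.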