84 research outputs found

    Research trends in customer churn prediction: A data mining approach

    This study presents a recent literature review on customer churn prediction based on 40 relevant articles published between 2010 and June 2020. The 40 most relevant articles according to Google Scholar ranking were selected and collected. Each article was then scrutinized along six main dimensions: Reference; Areas of Research; Main Goal; Dataset; Techniques; Outcomes. The review finds that the most widely used data mining techniques are decision tree (DT), support vector machines (SVM) and Logistic Regression (LR). Massive data accumulation in the telecom industry, combined with increasingly mature data mining technology, motivates the development and application of customer churn models to predict customer behavior. A telecom company can therefore effectively predict customer churn and then avoid it by taking measures such as reducing monthly fixed fees. The present literature review offers recent insights into the scientific literature on customer churn prediction, revealing research gaps, providing evidence on current trends and helping to understand how to develop accurate and efficient marketing strategies. The most important finding is that artificial intelligence techniques have become more widely used in recent years for telecom customer churn prediction. In particular, artificial neural networks (NN) are recognized as a competent prediction method. This is a relevant topic for journals related to other social sciences, such as Banking, and telecom data also make up an outstanding source for developing novel prediction modeling techniques. Thus, this study can lead to recommendations for future customer churn prediction improvement, in addition to providing an overview of current research trends.

    A comparative study of tree-based models for churn prediction: a case study in the telecommunication sector

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM. In recent years the topic of customer churn, the phenomenon of customers abandoning one company for another, has gained increasing importance. Customer churn plays an important role especially in more saturated industries such as telecommunications, since existing customers are very valuable and the acquisition cost of new customers is very high. Companies want to know which of their customers are going to churn to another provider, and when, so that measures can be taken to retain the customers who are at risk of churning. Such measures could take the form of incentives to the churners, but the downside is that misclassifying churners is costly, especially when incentives are given to customers who would not have churned. The common challenge in predicting customer churn is how to pre-process the data and which algorithm to choose, especially when the dataset is heterogeneous, which is very common for telecommunication companies’ datasets. This thesis aims at predicting customer churn for the telecommunication sector using different decision tree algorithms and their ensemble models.

    Cross-company customer churn prediction in telecommunication: A comparison of data transformation methods

    © 2018 Elsevier Ltd. Cross-Company Churn Prediction (CCCP) is a domain of research where one company (target) lacks enough data and can use data from another company (source) to predict customer churn successfully. To support CCCP, the cross-company data is usually transformed to a normal distribution similar to that of the target company data prior to building a CCCP model. However, it is still unclear which data transformation method is most effective in CCCP. Also, the impact of data transformation methods on CCCP model performance using different classifiers has not been comprehensively explored in the telecommunication sector. In this study, we devised a model for CCCP using data transformation methods (i.e., log, z-score, rank and box-cox) and not only presented an extensive comparison to validate the impact of these transformation methods in CCCP, but also evaluated the performance of underlying baseline classifiers (i.e., Naive Bayes (NB), K-Nearest Neighbour (KNN), Gradient Boosted Tree (GBT), Single Rule Induction (SRI) and Deep learner Neural net (DP)) for customer churn prediction in the telecommunication sector using the above-mentioned data transformation methods. We performed experiments on publicly available datasets related to the telecommunication sector. The results demonstrated that most of the data transformation methods (e.g., log, rank, and box-cox) improve the performance of CCCP significantly. However, the Z-Score data transformation method could not achieve better results than the rest of the data transformation methods in this study. Moreover, the investigation shows that the CCCP model based on NB performed best on transformed data, DP, KNN and GBT showed average performance, and the SRI classifier did not show significant results in terms of the commonly used evaluation measures (i.e., probability of detection, probability of false alarm, area under the curve and g-mean).
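
    As a rough illustration of the four data transformation methods named in the abstract above (log, z-score, rank and box-cox), here is a minimal standard-library Python sketch. It is not the authors' code: the function names, the use of log(1 + x), average ranks for ties, and a fixed Box-Cox lambda are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def log_transform(values):
    """Natural log of (1 + x); assumes non-negative inputs (an assumption of this sketch)."""
    return [math.log1p(v) for v in values]


def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_transform(values):
    """Replace each value by its 1-based rank; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def box_cox(values, lam):
    """Box-Cox transform with a fixed lambda; requires strictly positive inputs."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


# Hypothetical feature column, for example monthly call minutes.
minutes = [10.0, 30.0, 20.0, 30.0]
print(rank_transform(minutes))
print(z_score(minutes))
```

    In practice the Box-Cox lambda is usually estimated from the data rather than fixed; the fixed value here only keeps the sketch short.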

    An Optimized Approach for Maximizing Business Intelligence using Machine Learning

    The subject of study known as business intelligence is responsible for the development of techniques and tools for the analysis of business information with the goal of assisting in the management and decision-making processes of corporations. In the current climate, business intelligence is essential to the process of formulating a strategy and carrying out operations that are data-driven. Throughout the many stages of the company operation, an organization will need assistance evaluating data and making decisions; a decision support system may provide this assistance by including business intelligence as an essential component. The fact that this enormous quantity of data is distributed over a number of different types of platforms, however, makes it a difficult challenge, in particular to understand the information that is actually relevant and to make efficient use of it for business intelligence. One of the most important challenges facing modern society is maximizing business intelligence through the application of machine learning. It offers a full analysis that is based on predictions and is extracted for Business Intelligence techniques along with current application fields. This gap has been pointed out, and solutions and future research areas have been offered to overcome it in order to create effective business strategies.

    A SLR on Customer Dropout Prediction

    Dropout prediction is a problem that is being addressed with machine learning algorithms; thus, appropriate approaches to address the dropout rate are needed. The selection of an algorithm to predict the dropout rate is only one problem to be addressed. Other aspects should also be considered, such as which features should be selected and how to measure accuracy while considering whether the features are appropriate according to the business context in which they are employed. To solve these questions, the goal of this paper is to develop a systematic literature review to evaluate the development of existing studies and to predict the dropout rate in contractual settings using machine learning to identify current trends and research opportunities. The results of this study identify trends in the use of machine learning algorithms in different business areas and in the adoption of machine learning algorithms, including which metrics are being adopted and what features are being applied. Finally, some research opportunities and gaps that could be explored in future research are presented.

    Twitter Analysis to Predict the Satisfaction of Saudi Telecommunication Companies’ Customers

    The flexibility in mobile communications allows customers to quickly switch from one service provider to another, making customer churn one of the most critical challenges for the data and voice telecommunication service industry. In 2019, the percentage of post-paid telecommunication customers in Saudi Arabia decreased; this represents a great deal of customer dissatisfaction and subsequent corporate fiscal losses. Many studies correlate customer satisfaction with customer churn. Telecom companies have depended on historical customer data to measure customer churn. However, historical data does not reveal current customer satisfaction or the future likelihood of switching between telecom companies. Current methods of analysing churn rates are inadequate and face some issues, particularly in the Saudi market. This research was conducted to understand the relationship between customer satisfaction and customer churn and how to use social media mining to measure customer satisfaction and predict customer churn. This research conducted a systematic review to address the problems of churn prediction models and their relation to Arabic Sentiment Analysis (ASA). The findings show that the current churn models lack integration of structural data frameworks with real-time analytics to target customers in real time. In addition, the findings show that the specific issues in the existing churn prediction models in Saudi Arabia relate to the Arabic language itself, its complexity, and lack of resources. As a result, I have constructed the first gold standard corpus of Saudi tweets related to telecom companies, comprising 20,000 manually annotated tweets. It has been generated as a dialect sentiment lexicon extracted from a larger Twitter dataset collected by me to capture text characteristics in social media. I developed a new ASA prediction model for telecommunication that fills the detected gaps in the ASA literature and fits the telecommunication field. The proposed model proved its effectiveness for Arabic sentiment analysis and churn prediction. This is the first work using Twitter mining to predict potential customer loss (churn) in Saudi telecom companies. Different fields, such as education, have different features, which makes applying the proposed model to them interesting because it is based on text mining.

    Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study

    Customer retention is a major issue for various service-based organizations, particularly the telecom industry, where predictive models for observing the behavior of customers are one of the great instruments in the customer retention process and in inferring the future behavior of customers. However, the performance of predictive models is greatly affected when the real-world data set is highly imbalanced. A data set is called imbalanced if the sample size of one class is very much smaller or larger than that of the other classes. The most commonly used technique for handling the class-imbalance problem (CIP) in various domains is over/under sampling. In this paper, we survey six well-known sampling techniques and compare the performances of these key techniques, i.e., mega-trend diffusion function (MTDF), synthetic minority oversampling technique, adaptive synthetic sampling approach, couples top-N reverse k-nearest neighbor, majority weighted minority oversampling technique, and immune centroids oversampling technique. Moreover, this paper also presents an evaluation of four rules-generation algorithms (the learning from example module, version 2 (LEM2), covering, exhaustive, and genetic algorithms) using publicly available data sets. The empirical results demonstrate that MTDF and rules-generation based on genetic algorithms achieved the best overall predictive performance compared with the rest of the evaluated oversampling methods and rule-generation algorithms.
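
    To make the class-imbalance definition and the oversampling idea in the abstract above more concrete, here is a simplified Python sketch of the interpolation step behind the synthetic minority oversampling technique: a synthetic sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. This is an illustrative sketch only, not the paper's implementation; the function names, the default k, and the toy data are assumptions.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def smote_like(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating towards nearest neighbours.

    minority: list of equal-length numeric tuples (minority-class samples only).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbours of the chosen base sample.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random point on the segment between the base sample and the neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy data: 9 non-churners (label 0) for every churner (label 1).
labels = [0] * 90 + [1] * 10
print(imbalance_ratio(labels))

# Hypothetical two-feature churner samples.
churners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(smote_like(churners, n_synthetic=3))
```

    Because every synthetic point is a convex combination of two real minority samples, it always falls within the range already covered by the minority class.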

    Can bank interaction during rating measurement of micro and very small enterprises ipso facto determine the collapse of PD status?

    This paper begins with an analysis of trends over the period 2012-2018 for total bank loans, non-performing loans, and the number of active, working enterprises. A review survey was done on national data from Italy, with a comparison developed on a local subset from the Sardinia Region. Empirical evidence appears to support the hypothesis of the paper: can the rating class assigned by banks, using current IRB and A-IRB systems, to micro and very small enterprises, whose ability to replace financial resources using endogenous means is structurally impaired, ipso facto orient the results of performance in the same terms of PD assigned by the algorithm, thereby upending the principle of cause and effect? The thesis is developed through mathematical modeling that demonstrates the interaction of the measurement tool (the rating algorithm applied by banks) on the collapse of the loan status (default, performing, or some intermediate point) of the assessed micro-entity. Emphasis is given, in conclusion, to the phenomenon using evidence of the intrinsically mutualistic link of the two populations of banks and (micro) enterprises provided by a system of differential equations.