1,411 research outputs found
Customer Churn Prediction of Telecom Company Using Machine Learning Algorithms
Telecommunications has become a significant part of everyday life, and the industry has become even more important since the Covid-19 pandemic, which has opened up growth opportunities. This study uses six supervised machine learning algorithms, namely KNN, Random Forest (RF), AdaBoost, Logistic Regression (LR), XGBoost, and Support Vector Machine (SVM), to predict customer churn at a telecom company in California. The first goal is to identify the classifier that predicts customer churn most effectively. With an accuracy of 79.67%, precision of 64.67%, recall of 51.87%, and F1-score of 57.57%, XGBoost is the most effective classifier overall in this study. The second goal is to identify the characteristics of customers who are most likely to leave the telecom company. These characteristics were identified from customers’ demographics and account information. Lastly, the study advises the company on how to retain customers: personalize the customer experience, implement a customer loyalty program, and apply AI in customer relationship management.
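The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that check; the function name f1_score is chosen here for illustration and is not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for XGBoost in the abstract, in percent.
reported_f1 = f1_score(64.67, 51.87)
print(round(reported_f1, 2))  # 57.57, matching the reported F1-score
```

The gap between accuracy (79.67%) and recall (51.87%) is the kind of pattern one would expect when churners are the minority class, although the abstract does not state the class balance.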
Un-factorize non-food NPS on a food-based retailer
Master's dissertation in Statistics for Data Science
More and more companies realise that understanding their customers can be a way to improve
customer satisfaction and, consequently, customer loyalty, which in turn can result in an increase
in sales. The NPS has been widely adopted by managers as a measure of customer loyalty and
predictor of sales growth.
In this regard, this dissertation aims to create a classification model focused not only on identifying the customer’s NPS class, namely, classifying the customer as Detractor, Passive or Promoter,
but also in understanding which factors have the most impact on the customer’s classification. The
goal in doing so is to collect relevant business insights as a way to identify areas that can help to
improve customer satisfaction.
We propose a Data Mining approach to the NPS multi-class classification problem. Our approach leverages survey data, as well as transactional data collected through a retailer’s loyalty
card, building a data set from which we can extract information, such as NPS ratings, customer
behaviour and store details. Initially, an exploratory analysis is done on the data. Several resampling techniques are applied to the data set to handle class imbalance. Two different machine
learning algorithms are applied: Decision Trees and Random Forests. The results did not show
good model performance. An error analysis was then performed on the latter model, where it was
concluded that the classifier has difficulty distinguishing the classes Detractors and Passives, but
has a good performance when predicting the class Promoters.
In a business sense, this methodology can be leveraged to distinguish the Promoters from the
rest of the consumers, since the Promoters are more likely to provide good value in the long term and
can benefit the company by spreading the word and attracting new customers
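For readers unfamiliar with the three NPS classes, the sketch below shows the conventional mapping from a 0-10 survey rating to Detractor, Passive or Promoter, and the resulting score. The thresholds (0-6, 7-8, 9-10) are the standard NPS convention and are an assumption here, since the abstract does not state them; the function names and the sample ratings are hypothetical.

```python
from collections import Counter


def nps_class(score: int) -> str:
    """Map a 0-10 rating to its NPS class (standard thresholds, assumed)."""
    if not 0 <= score <= 10:
        raise ValueError("NPS ratings must be between 0 and 10")
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"


def net_promoter_score(scores: list[int]) -> float:
    """Percentage of Promoters minus percentage of Detractors."""
    if not scores:
        raise ValueError("at least one rating is required")
    counts = Counter(nps_class(s) for s in scores)
    return 100.0 * (counts["Promoter"] - counts["Detractor"]) / len(scores)


# Hypothetical ratings: 4 Promoters, 2 Passives, 2 Detractors.
sample = [10, 9, 8, 7, 6, 3, 10, 9]
print(net_promoter_score(sample))  # 25.0
```

Under this mapping Passives sit in a narrow band between the other two classes, which may be one reason a classifier could find Detractors and Passives hard to separate; the abstract itself does not give a cause.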
Customer churn prediction in telecom using machine learning and social network analysis in big data platform
Customer churn is a major problem and one of the most important concerns for
large companies. Due to the direct effect on the revenues of the companies,
especially in the telecom field, companies are seeking to develop means to
predict which customers are likely to churn. Therefore, finding factors that increase
customer churn is important to take necessary actions to reduce this churn. The
main contribution of our work is to develop a churn prediction model which
assists telecom operators to predict customers who are most likely subject to
churn. The model developed in this work uses machine learning techniques on a big
data platform and introduces a new approach to feature engineering and selection. In
order to measure the performance of the model, the Area Under Curve (AUC)
standard measure is adopted, and the AUC value obtained is 93.3%. Another main
contribution is to use customer social network in the prediction model by
extracting Social Network Analysis (SNA) features. The use of SNA enhanced the
performance of the model from 84% to 93.3% AUC. The model was
prepared and tested in a Spark environment using a large dataset
created by transforming big raw data provided by SyriaTel telecom company. The
dataset contained all customers' information over 9 months, and was used to
train, test, and evaluate the system at SyriaTel. Four algorithms were tested:
Decision Tree, Random Forest, Gradient Boosted Machine Tree "GBM"
and Extreme Gradient Boosting "XGBOOST". The best results were
obtained with the XGBOOST algorithm, which was used for
classification in this churn predictive model.
Comment: 24 pages, 14 figures. PDF https://rdcu.be/budK
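The AUC measure used above can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below computes it directly from that pairwise definition; it is an illustration only, not the Spark-based evaluation used in the paper, and the labels and scores are made up.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical churn labels (1 = churned) and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise form is quadratic in the number of customers, so it suits small checks like this one rather than a dataset of the size described in the abstract.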
INTEGRATING KANO MODEL WITH DATA MINING TECHNIQUES TO ENHANCE CUSTOMER SATISFACTION
The business world is becoming increasingly competitive; therefore, businesses are forced to improve their strategies in every aspect. Determining the elements that contribute to clients' contentment is thus one of the critical needs of businesses that want to develop successful products. The Kano model is one of the models that help determine which features must be included in a product or service to improve customer satisfaction. The model focuses on highlighting the most relevant attributes of a product or service along with customers’ estimation of how these attributes can be used to predict satisfaction with specific services or products. This research aims at developing a method to integrate the Kano model and data mining approaches to select relevant attributes that drive customer satisfaction, with a specific focus on higher education. The significant contribution of this research is to improve the quality of the academic support and development services that United Arab Emirates University provides to its students by solving the problem of selecting features that are not methodically correlated to customer satisfaction, which could reduce the risk of investing in features that could ultimately be irrelevant to enhancing customer satisfaction. Questionnaire data were collected from 646 students from United Arab Emirates University. The experiment suggests that Extreme Gradient Boosting Regression can produce the best results for this kind of problem. Based on the integration of the Kano model and the feature selection method, the number of features used to predict customer satisfaction is reduced to four. It was found that integrating either the Chi-Square or the Analysis of Variance (ANOVA) feature selection model with the Kano model gives higher values of the Pearson correlation coefficient and R2.
Moreover, a prediction was made using the union of the Kano model's most important features and the most frequent features among 8 clusters, and it shows high-performance results
Profitable Retail Customer Identification Based on a Combined Prediction Strategy of Customer Lifetime Value
As a fundamental concept of customer relationship management, customer lifetime value (CLV) serves as a crucial metric to identify profitable retail customers. Various methods are available to predict CLV in different contexts. With the development of consumer big data, modern statistics and machine learning algorithms have been gradually adopted in CLV modeling. We introduce two machine learning algorithms, the gradient boosting decision tree (GBDT) and the random forest (RF), in retail customer CLV modeling and compare their predictive performance with two classical models, the Pareto/NBD (HB) and the Pareto/GGG. To ensure robust CLV prediction and customer identification, we combined the predictions of the four models to determine which customers are the most, or least, profitable. Using 43 weeks of customer transaction data from a large retailer in China, we predicted customer value over the following 20 weeks. The results show that the predictive performance of GBDT and RF is generally better than that of the Pareto/NBD (HB) and Pareto/GGG models. Because the predictions are not entirely consistent, we combine them to identify profitable and unprofitable customers
Combined artificial bee colony algorithm and machine learning techniques for prediction of online consumer repurchase intention
Delivering services through the web is a new paradigm in the service sector and a progressive mechanism for rendering offerings over diverse environments. The Internet provides huge opportunities for companies to offer personalized online services to their customers. However, the rapid introduction of new web services may adversely affect quality and user satisfaction. Consequently, predicting consumer intention is of great importance when selecting web services for an application. The aim of this study is to predict online consumer repurchase intention; to achieve this, a hybrid approach combining machine learning techniques and the Artificial Bee Colony (ABC) algorithm is used. The study is divided into three phases. First, shopping mall and consumer characteristics relevant to repurchase intention are identified through an extensive literature review. Second, ABC is used to select features among consumers’ characteristics and shopping malls’ attributes (with a threshold value > 0.1) for the prediction model. Finally, K-fold cross-validation is employed to measure the robustness of the best classification model. The classification models, viz. Decision Trees (C5.0), AdaBoost, Random Forest (RF), Support Vector Machine (SVM) and Neural Network (NN), are used to predict consumer purchase intention. Performance evaluation of these models on training-testing partitions (70-30%) of the data set shows that the AdaBoost method outperforms the other classification models, with sensitivity and accuracy of 0.95 and 97.58% respectively on the testing data set. This study is a revolutionary attempt that considers both shopping mall and consumer characteristics in examining consumer purchase intention.
Customer Churn Detection and Marketing Retention Strategies in the Online Food Delivery Business
The purpose of this thesis is to analyze the behavior of customers within the
Online Food Delivery industry, through which it is proposed to develop a prediction
model that allows detecting, based on valuable active customers, those who will leave
the services of Alpha Corporation in the near future.
Firstly, valuable customers are defined as those consumers who have made at
least 8 orders in the last 12 months. In this way, considering the historical behavior of
said users, as well as applying Feature Engineering techniques, a first approach is
proposed based on the implementation of a Random Forest algorithm and, later, a
boosting algorithm: XGBoost.
Once the performance of each of the models developed is analyzed, and potential
churners are identified, different marketing suggestions are proposed in order to retain
said customers. Retention strategies will be based on how Alpha Corporation works, as
well as on the output of the predictive model. Other development alternatives will also
be discussed: a clustering model based on potential churners or an unstructured data
model to analyze the emotions of those users according to the NPS surveys. The aim of
these proposals is to complement the prediction to design more specific retention
marketing strategies
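The definition of a valuable customer given above (at least 8 orders in the last 12 months) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the function name, the parameter names, and the approximation of 12 months as 365 days are illustrative choices, not details from the thesis.

```python
from datetime import date, timedelta


def is_valuable(
    order_dates: list[date],
    as_of: date,
    min_orders: int = 8,      # threshold stated in the abstract
    window_days: int = 365,   # assumption: 12 months taken as 365 days
) -> bool:
    """Return True if enough orders fall inside the look-back window."""
    window_start = as_of - timedelta(days=window_days)
    recent = [d for d in order_dates if window_start < d <= as_of]
    return len(recent) >= min_orders


# Hypothetical customer with one order every 30 days before the cut-off.
as_of = date(2023, 1, 1)
orders = [as_of - timedelta(days=30 * i) for i in range(1, 10)]
print(is_valuable(orders, as_of))  # True: 9 orders inside the window
```

Only customers passing this filter would then be scored by the churn model, so the threshold directly controls how many customers the retention strategies can reach.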
Forecasting credit card attrition using machine learning models
The objective of this work is the implementation and evaluation of Machine Learning models to identify which customers want to cancel their credit cards. The banking industry uses this technology to obtain more reliable predictions when identifying opportunities for purchase, investment, or fraud. These models can be adapted independently, by recognizing patterns and algorithms based on mathematical calculations.
Four models (LightGBM, XGBoost, Random Forest and Logistic Regression) were implemented and evaluated to predict, using data about the customers of a bank in Colombia and the products they hold, the likelihood of customers cancelling their credit cards. By analysing the ROC curves using the AUC metric, it is concluded that, of the selected models, the model chosen for deployment would be LightGBM, since it was the one that performed best in the experiments conducted. Furthermore, the "Score Acierta" variable, a customer rating provided by the Colombian credit rating agency, was found to be the most discriminating in the prediction models
Study about customer segmentation and application in a real case
The hospitality industry generates a huge variety of data that grows by the day, and it is becoming
increasingly difficult to analyse this data manually in order to build a good data model. A
thorough understanding of current customer profiles enables better resource allocation
and leads to better definition of product and market development strategies. Dividing
customers into similar groups helps develop more objective and focused marketing
messages for each of the segments. Thus, the present dissertation studies methods for
classifying and segmenting data found in the literature review. Then,
a real case study is presented, using data from Property Management Systems of eight
Portuguese hotels, four city hotels and four resort hotels. This data set consists of forty-one
attributes but, after selection of the most predictive variables, only a subset of
attributes is used for data modeling. Next, the classification and segmentation methods
studied in the literature review are applied for extracting the relevant information. The
results are analyzed and discussed to understand their suitability to study the particular
characteristics of hotel reservations