
    A comparative study of tree-based models for churn prediction : a case study in the telecommunication sector

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM.
    In recent years the topic of customer churn, the phenomenon of customers abandoning a company for a competitor, has gained increasing importance. Customer churn plays an especially important role in saturated industries such as telecommunications, where existing customers are very valuable and the cost of acquiring new customers is high. Companies want to know which of their customers are going to churn to another provider, and when, so that measures can be taken to retain the customers at risk of churning. Such measures could take the form of incentives offered to likely churners, but misclassification is costly, especially when incentives are given to customers who would not have churned. The common challenges in predicting customer churn are how to pre-process the data and which algorithm to choose, especially when the dataset is heterogeneous, as is typical of telecommunication companies' datasets. The presented thesis aims at predicting customer churn in the telecommunication sector using different decision tree algorithms and their ensemble models.
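    As an illustration of the kind of comparison the thesis describes, the minimal sketch below trains and scores several tree-based classifiers on a churn dataset. The file name, the "churn" column, and the specific models and hyperparameters are assumptions made for the example, not the thesis's actual setup.

```python
# Minimal sketch: comparing tree-based models for churn prediction.
# Assumes a CSV with a binary "churn" target column (0/1); all names are hypothetical.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

df = pd.read_csv("telecom_churn.csv")            # hypothetical telecom dataset
X = pd.get_dummies(df.drop(columns=["churn"]))   # one-hot encode categorical features
y = df["churn"]

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=6, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Cross-validated F1 penalizes both missed churners and incentives wasted on non-churners.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```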

    Advances and applications in Ensemble Learning


    Stacking Ensemble Approach for Churn Prediction: Integrating CNN and Machine Learning Models with CatBoost Meta-Learner

    In the telecom industry, predicting customer churn is crucial for improving customer retention. In the literature, the use of single classifiers predominates. Customer data is complex due to class imbalance and contains multiple factors that exhibit nonlinear dependencies. In these complex scenarios, single classifiers may be unable to fully utilize the available information to capture the underlying interactions effectively. In contrast, ensemble learning, which combines various base classifiers, enables a more thorough data analysis, leading to improved prediction performance. In this paper, a heterogeneous ensemble model is proposed for churn prediction in the telecom industry. The model involves exploratory data analysis, data pre-processing and data resampling to handle class imbalance. In the proposed model, multiple trained base classifiers with different characteristics are integrated through a stacking ensemble technique. Specifically, a convolutional neural network, logistic regression, a decision tree and a Support Vector Machine (SVM) are considered as the base classifiers in this work. The proposed stacking ensemble model utilizes the unique strengths of each base classifier and leverages their collective knowledge through a meta-learner to improve prediction performance. The efficacy of the proposed model is assessed on a real-world dataset, Cell2Cell. The empirical results demonstrate the superiority of the proposed model in churn prediction, with a 62.4% F1-score and 60.62% recall.
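    The sketch below illustrates the stacking idea described above, with logistic regression, a decision tree and an SVM as base classifiers and CatBoost as the meta-learner. The CNN branch is omitted for brevity, and the resampling step, hyperparameters and variable names are assumptions rather than the paper's actual configuration.

```python
# Minimal sketch of a stacking ensemble with a CatBoost meta-learner (CNN branch omitted).
# Hyperparameters, SMOTE resampling and variable names are illustrative assumptions.
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from catboost import CatBoostClassifier

def build_stacking_model():
    base_learners = [
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=8, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ]
    # CatBoost acts as the meta-learner that combines the base classifiers' predictions.
    meta_learner = CatBoostClassifier(iterations=300, verbose=0, random_state=0)
    return StackingClassifier(estimators=base_learners,
                              final_estimator=meta_learner,
                              stack_method="predict_proba",
                              cv=5)

# Hypothetical usage with pre-processed Cell2Cell features (X_train, y_train):
# X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)  # handle class imbalance
# model = build_stacking_model().fit(X_res, y_res)
```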

    Machine learning techniques in churn rate analysis

    This work aims to provide a simplified view of the importance of the study of the churn rate by companies. It also explains different data mining techniques that serve to measure this rate and provide information on how to reduce it. The motivation for this analysis is the need to know which techniques are most used to measure churn and which are most effective in different scenarios. The work begins by defining the concept of churn rate and its importance, and then continues with a definition of data mining. Four methods for calculating this rate are then explained, with practical examples of their effectiveness. All of this is supported by a bibliographic review of different studies related to the topic. The objective of this review is to discover which methods have been most used in churn rate analysis over the last five years, and which are the most effective for this calculation. The analysis concludes that the most used data mining techniques in recent years are Support Vector Machines and Artificial Neural Networks; these two techniques are the most closely related to artificial intelligence, since the aim is to build automated learning models that obtain better results. Regressions and decision trees are less used in this field but offer more precise results, at least in the short term, perhaps due to their simplicity of application. The size of the sample used for the analysis is also important: the larger the sample, the lower the accuracy, but the greater the chance of developing an automated learning model that yields better long-term results.
    Máster en Empresa y Tecnologías de la Información
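    For reference, the central quantity the review builds on can be computed directly; the sketch below shows the basic churn-rate calculation, using made-up example figures (the function name and numbers are purely illustrative, not data from the study).

```python
# Illustrative churn-rate calculation with hypothetical figures.
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of the starting customer base lost during the period."""
    return customers_lost / customers_at_start

# Example: 450 of 10,000 customers lost during the period -> 4.5% churn rate.
print(f"{churn_rate(customers_at_start=10_000, customers_lost=450):.1%}")
```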

    Adaptive algorithms for real-world transactional data mining.

    The accurate identification of the right customer to target with the right product at the right time, through the right channel, to satisfy the customer's evolving needs, is a key performance driver and enhancer for businesses. Data mining is an analytic process designed to explore usually large amounts of data (typically business or market related) in search of consistent patterns and/or systematic relationships between variables for the purpose of generating explanatory/predictive data models from the detected patterns. It provides an effective and established mechanism for accurate identification and classification of customers. Data models derived from the data mining process can aid in effectively recognizing the status and preferences of customers, both individually and as a group. Such data models can be incorporated into the business's market segmentation, customer targeting and channelling decisions with the goal of maximizing the total customer lifetime profit. However, for cost, privacy and/or data protection reasons, the customer data available for data mining is often restricted to verified and validated data (in most cases, only the business-owned transactional data is available). Transactional data is a valuable resource for generating such data models: it can be collected electronically and readily made available for data mining in large quantity at minimum extra cost. Transactional data is, however, inherently sparse and skewed, and these characteristics give rise to poor performance in data models built on transactional customer data. Data models for identifying, describing, and classifying customers that are constructed from evolving transactional data thus need to handle its inherent sparseness and skewness effectively in order to be efficient and accurate. Using real-world transactional data, this thesis presents the findings and results from the investigation of data mining algorithms for analysing, describing, identifying and classifying customers with evolving needs. In particular, methods for handling the issues of scalability, uncertainty and adaptation whilst mining evolving transactional data are analysed and presented. A novel application of a new framework for integrating transactional data binning and classification techniques is presented alongside an effective prototype selection algorithm for efficient transactional data model building. A new change mining architecture for monitoring, detecting and visualizing the change in customer behaviour using transactional data is proposed and discussed as an effective means for analysing and understanding the change in customer buying behaviour over time. Finally, the challenging problem of discerning between a change in the customer profile (which may necessitate changing the customer's label) and a change in the performance of the model(s) (which may necessitate changing or adapting the model(s)) is introduced and discussed by way of a novel, flexible and efficient architecture for classifier model adaptation and customer profile class relabeling.
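    As a loose illustration of the binning-plus-classification idea mentioned above, the sketch below discretizes skewed per-customer transactional aggregates into quantile bins before fitting a classifier. The RFM-style column names, bin count and choice of classifier are assumptions, not the thesis's actual framework.

```python
# Illustrative sketch: quantile-binning sparse, skewed transactional features
# before classification. Column names and parameters are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def bin_transactional_features(df: pd.DataFrame, n_bins: int = 5) -> pd.DataFrame:
    """Discretize per-customer transactional aggregates into quantile bins."""
    binned = pd.DataFrame(index=df.index)
    for col in ["recency_days", "frequency", "monetary_value"]:  # hypothetical columns
        # rank(method="first") breaks ties so qcut always finds distinct bin edges
        binned[col + "_bin"] = pd.qcut(df[col].rank(method="first"),
                                       q=n_bins, labels=False)
    return binned

# Hypothetical usage with per-customer aggregates and known class labels:
# X = bin_transactional_features(transactions)
# clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
```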

    Antecedents of ESG-Related Corporate Misconduct: Theoretical Considerations and Machine Learning Applications

    The core objective of this cumulative dissertation is to generate new insights into the occurrence and prediction of unethical firm behavior disclosure. The first two papers investigate predictors and antecedents of (severe) unethical firm behavior disclosure. The third paper addresses frequently occurring methodological issues when applying machine learning approaches within marketing research. Hence, the three papers of this dissertation contribute to two recent topics within the field of marketing. First, marketing research has already focused intensively on the consequences of corporate misconduct and the accompanying media coverage, while the prediction and the process of occurrence of such threatening events have so far been examined only sporadically. Second, companies and researchers are increasingly implementing machine learning as a methodology to solve marketing-specific tasks. In this context, users of machine learning methods often face methodological challenges, for which this dissertation reviews possible solutions. Specifically, in study 1, machine learning algorithms are used to predict the future occurrence of severe threatening news coverage of corporate misconduct. Study 2 identifies relationships between the specific competitive situation of a company within its industry and unethical firm behavior disclosure. Study 3 addresses machine learning-based issues for marketing researchers and presents possible solutions by reviewing the computer science literature.