119 research outputs found
Enhancing Unbalanced Data Classification with Cross-Validation and Extreme Gradient Boosting: A Comprehensive Analysis
As a novel and efficient ensemble learning algorithm, XGBoost has been widely applied due to its multiple advantages, but its classification performance on imbalanced data is often not ideal. To address this problem, efforts were made to optimize XGBoost and the cross-validation algorithm. The main idea is to combine cross-validation and XGBoost for data processing on unbalanced data, and then obtain the final XGBoost-based model through training. At the same time, optimal parameters are searched and adjusted automatically through optimization algorithms to achieve more accurate classification predictions. In the testing phase, the area under the curve (AUC) is used as an evaluation indicator to compare and analyze the classification performance of various sampling methods and algorithm models. The results of the model analysis using AUC are expected to verify the feasibility and effectiveness of the proposed algorithm.
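The two evaluation ingredients named in this abstract, cross-validation on unbalanced data and AUC, can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the function names `stratified_folds` and `auc` are hypothetical, the XGBoost model itself is left out, and stratification is assumed here as one common way to keep the minority class present in every fold.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split sample indices into k folds with similar class ratios.

    Assumption: stratification is used so that each fold still contains
    minority-class samples; the abstract does not specify the scheme.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    # Deal the indices of each class round-robin across the folds.
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In such a setup, each fold would serve once as the held-out set, a classifier would be trained on the remaining folds, and `auc` would be computed on the held-out scores. The pairwise definition is quadratic in the sample count, which is acceptable for a sketch but not for large datasets.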
Prediction of Customers Churn in Telecommunication Industry
In the developed world, mobile markets have reached saturation in subscriber penetration and connections growth. The challenge for operators has evolved from attracting new customers to retaining existing ones. Various components have an impact on churn. It is therefore very important to understand customers' behaviour, encourage them to spend more, and predict and prevent their attrition. As the industry evolves, the biggest challenge for operators is to engage with consumers and retain their loyalty by delivering more competitive and innovative value-added services. While understanding consumer needs remains essential to improving customer retention, other emerging tariffs and services are likely to have a long-term impact on churn (including national, international and roaming bundle tariffs and mobile services). Churn may be voluntary, when customers choose to leave the network they are currently using, or involuntary, as in the case of unpaid bills. The methodologies used for evaluation in this field are numerous and varied. The scope of this thesis is to identify and analyse appropriate models that can help data analysts find churners in the telecommunication industry. The thesis discusses two important topics in telecommunication markets and their respective predictive models, which aim to capture customer behaviour towards different competitors: market share in the telecommunication industry and customer churn.
Churn Identification and Prediction from a Large-Scale Telecommunication Dataset Using NLP
The identification of customer churn is a major issue for large telecom businesses. A substantial volume of data is generated every day in managing the data of current customers and in acquiring and managing new ones. It is therefore crucial to identify the causes of client churn so that appropriate steps can be taken to lower it. Numerous researchers have already discussed their efforts to combine static and dynamic approaches in order to reduce churn in big data sets, but these systems still have many issues when it comes to actually identifying churn. In this paper, we propose two methods: churn identification, and churn prediction on a vast telecommunication data set using Natural Language Processing (NLP) methods and machine learning techniques. The NLP process involves data pre-processing, normalization, feature extraction, and feature selection. For feature extraction, techniques such as TF-IDF, Stanford NLP, and occurrence correlation methods are employed. A machine learning classification algorithm is used for training and testing throughout. Finally, the system employs a variety of cross-validation techniques for training and evaluating the machine learning algorithms. The experimental analysis shows the system's efficacy and accuracy.
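As a rough illustration of the TF-IDF feature extraction step mentioned in this abstract, the sketch below computes the textbook weighting (term frequency times the log of inverse document frequency) in plain Python. The function name `tf_idf` and the whitespace tokenization are assumptions made for the example; the paper's exact variant and pre-processing are not stated.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumptions: lowercase whitespace tokenization, tf = count / length,
    idf = log(N / document frequency), with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

With this weighting, a term that occurs in every document receives a weight of zero, so only terms that distinguish one customer record from another contribute to the resulting feature vector.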
Toward a model of customer experience
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Retaining high-value and profitable customers is a major strategic objective for many companies. In mature mobile phone markets where growth has slowed, the defection of customers from one network to another has intensified and is strongly fuelled by poor Customer Experience. Trends in the service economy suggest that experience can be exploited as a means of supplying the basis of a new economic offering, ignited in part by the shift that is taking place in the analysis of people’s interaction with digital products. In this light, the research describes a strategic approach to the use of Information Systems as a means of improving Customer Experience. Using Action Research in a mobile telecommunications operator, a Customer Experience Monitoring and Action Response model (CEMAR) is developed that evaluates disparate customer data, residing across many systems, builds experience profiles and suggests appropriate contextual actions where experience is poor. The model provides value in identifying issues, understanding them in the context of the overall Customer Experience (over time) and dealing with them appropriately. The novelty of the approach is the synthesis of data analysis with an enhanced understanding of Customer Experience which is developed implicitly, in real-time and in advance of any instigation by the customer. Royal Academy of Engineerin
A data-driven framework for investigating customer retention
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. This study presents a data-driven simulation framework in order to understand customer behaviour and therefore improve customer retention. The overarching system design methodology used for this study is aligned with the design science paradigm. The Social Media Domain Analysis (SoMeDoA) approach is adopted and evaluated to build a model on the determinants of customer satisfaction in the mobile services industry. Furthermore, the most popular machine learning algorithms for analysing customer churn are applied to analyse customer retention based on the derived determinants. Finally, a data-driven approach for agent-based modelling is proposed to investigate the social effect of customer retention. The key contribution of this study is the customer agent decision trees (CADET) approach and a data-driven approach for Agent-Based Modelling (ABM). The CADET approach is applied to a dataset provided by a UK mobile services company. One of the major findings of using the CADET approach to investigate customer retention is that social influence, specifically word of mouth, has an impact on customer retention. The second contribution of this study is the method used to uncover customer satisfaction determinants. The SoMeDoA framework was applied to uncover determinants of customer satisfaction in the mobile services industry. Customer service, coverage quality and price are found to be key determinants of customer satisfaction in the mobile services industry. The third contribution of this study is the approach used to build customer churn prediction models. The most popular machine learning techniques are used to build customer churn prediction models based on identified customer satisfaction determinants. Overall, for the identified determinants, decision trees have the highest accuracy scores for building customer churn prediction models.
Customer churn prediction for web browsers
In the competitive web browser market, identifying potential churners is critical to decreasing the loss of existing customers. Churn prediction based on customer behaviors plays a vital role in customer retention strategies. However, traditional churn prediction algorithms such as tree-based models cannot exploit the temporal characteristics of browser customers' behaviors, while sequence models cannot explicitly extract the information between multiple behaviors. To meet this challenge, we propose a novel model named Multivariate Behavior Sequence Transformer (MBST) with two complementary attention mechanisms to explore the temporal and behavioral information separately. Furthermore, a tree-based classifier is attached for churn prediction instead of a multilayer perceptron. Extensive experiments on a real-world Tencent QQ browser dataset with over 600,000 samples demonstrate that the proposed MBST achieves an F-score of 82.72% and an Area Under Curve (AUC) of 93.75%, significantly outperforming state-of-the-art methods in churn prediction.
Characterization of the clients retention in the telecommunications companies
A company's ability to predict churn precisely, so that it can act on it, is paramount. For this reason, Deloitte set me the challenge of characterizing client retention in telecom companies. To do so, a comprehensive tool was created that enables Deloitte to evaluate the churn management maturity level of a telecom operator and highlight its strengths and weaknesses. The development of this matrix was based on in-depth churn research, market research comprising 40 interviews and 2 focus groups, and valuable feedback from Deloitte consultants.
X-TIME: An in-memory engine for accelerating machine learning on tabular data with CAMs
Structured, or tabular, data is the most common format in data science. While deep learning models have proven formidable in learning from unstructured data such as images or speech, they are less accurate than simpler approaches when learning from tabular data. In contrast, modern tree-based Machine Learning (ML) models shine in extracting relevant information from structured data. An essential requirement in data science is to reduce model inference latency in cases where, for example, models are used in a closed loop with simulation to accelerate scientific discovery. However, the hardware acceleration community has mostly focused on deep neural networks and largely ignored other forms of machine learning. Previous work has described the use of an analog content addressable memory (CAM) component for efficiently mapping random forests. In this work, we focus on an overall analog-digital architecture implementing a novel increased precision analog CAM and a programmable network on chip allowing the inference of state-of-the-art tree-based ML models, such as XGBoost and CatBoost. Results evaluated in a single chip at 16nm technology show 119x lower latency at 9740x higher throughput compared with a state-of-the-art GPU, with a 19W peak power consumption.