37 research outputs found

    Research trends in customer churn prediction: A data mining approach

    This study presents a recent literature review on customer churn prediction based on 40 relevant articles published between 2010 and June 2020. The 40 most relevant articles according to Google Scholar ranking were selected and collected. Each article was then scrutinized along six main dimensions: Reference; Areas of Research; Main Goal; Dataset; Techniques; Outcomes. The review finds that the most widely used data mining techniques are decision trees (DT), support vector machines (SVM) and logistic regression (LR). Massive data accumulation in the telecom industry, combined with increasingly mature data mining technology, motivates the development and application of customer churn models to predict customer behavior. A telecom company can thus effectively predict customer churn and then avoid it by taking measures such as reducing monthly fixed fees. The present literature review offers recent insights into the scientific literature on customer churn prediction, revealing research gaps, providing evidence on current trends and helping to understand how to develop accurate and efficient marketing strategies. The most important finding is that artificial intelligence techniques have become more widely used in recent years for telecom customer churn prediction. In particular, artificial neural networks (NN) are recognized as a competent prediction method. This is a relevant topic for journals related to other social sciences, such as banking, and telecom data are an outstanding source for developing novel prediction modeling techniques. Thus, this study can lead to recommendations for future improvement of customer churn prediction, in addition to providing an overview of current research trends.

    Feature selection by multi-objective optimization: application to network anomaly detection by hierarchical self-organizing maps.

    Feature selection is an important and active issue in clustering and classification problems. Choosing an adequate feature subset reduces the dimensionality of a dataset, which decreases the computational complexity of classification and improves classifier performance by avoiding redundant or irrelevant features. Although feature selection can be formally defined as an optimisation problem with only one objective, that is, the classification accuracy obtained by using the selected feature subset, some multi-objective approaches to this problem have been proposed in recent years. In the case of supervised classifiers, these select features that improve not only the classification accuracy but also the generalisation capability; in the case of unsupervised classifiers, they counterbalance the bias toward lower or higher numbers of features exhibited by some methods used to validate the clustering/classification. The main contribution of this paper is a multi-objective approach for feature selection and its application to an unsupervised clustering procedure based on Growing Hierarchical Self-Organizing Maps (GHSOM) that includes a new method for unit labelling and efficient determination of the winning unit. In the network anomaly detection problem considered here, this multi-objective approach makes it possible not only to differentiate between normal and anomalous traffic but also among different anomalies. The efficiency of our proposals has been evaluated using the well-known DARPA/NSL-KDD datasets, which contain extracted features and labeled attacks from around 2 million connections. The selected feature sets computed in our experiments provide detection rates up to 99.8% with normal traffic and up to 99.6% with anomalous traffic, as well as accuracy values up to 99.12%. This work has been funded by FEDER funds and the Ministerio de Ciencia e Innovación of the Spanish Government under Project No. TIN2012-32039.

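    PENINGKATAN AKURASI ALGORITMA BACKPROPAGATION DENGAN SELEKSI FITUR PARTICLE SWARM OPTIMIZATION DALAM PREDIKSI PELANGGAN TELEKOMUNIKASI YANG HILANG [Improving the Accuracy of the Backpropagation Algorithm with Particle Swarm Optimization Feature Selection in Predicting Telecommunications Customer Churn]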

    Abstract: Telecommunications is an industry in which customers require special attention, so the management of a telecommunications company wants a churn prediction model that efficiently identifies customers who are likely to leave. Neural networks are a frequently used prediction method, and the most popular technique within this method is the backpropagation algorithm. However, backpropagation has weaknesses: it needs large amounts of training data and its optimization is inefficient. Particle Swarm Optimization (PSO) is an optimization algorithm that can effectively address these problems in neural networks, which commonly use backpropagation. A model based on backpropagation with Particle Swarm Optimization was tested on telecommunications customer churn data. The resulting models were evaluated to obtain the accuracy and AUC values of each algorithm. Backpropagation alone achieved an accuracy of 85.48% and an AUC of 0.531. Backpropagation with Particle Swarm Optimization, using attribute selection and parameter tuning, achieved an accuracy of 86.05% and an AUC of 0.637. It can therefore be concluded that, on telecommunications churn data, backpropagation with Particle Swarm Optimization and attribute selection predicts customer churn more accurately than backpropagation alone, with an increase of 0.57 percentage points in accuracy and 0.106 in AUC, a value that falls into the "fair" classification accuracy category. Keywords: Telecommunications, Neural Network, Backpropagation, Particle Swarm Optimization
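    The improvement figures quoted in the abstract above can be checked from the reported accuracy and AUC values. The short sketch below only redoes that arithmetic; the variable names are illustrative and do not come from the paper.

```python
# Reported values from the abstract (accuracy in percent, AUC unitless).
acc_backprop, acc_backprop_pso = 85.48, 86.05
auc_backprop, auc_backprop_pso = 0.531, 0.637

# Differences, rounded to the precision used in the abstract.
acc_gain = round(acc_backprop_pso - acc_backprop, 2)  # percentage points
auc_gain = round(auc_backprop_pso - auc_backprop, 3)

print(f"accuracy gain: {acc_gain} percentage points")
print(f"AUC gain: {auc_gain}")
```

    Both differences agree with the gains stated in the abstract (0.57 in accuracy and 0.106 in AUC).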

    Modeling Attrition in Organizations from Email Communication

    Modeling people's online behavior in relation to their real-world social context is an interesting and important research problem. In this paper, we present our preliminary study of attrition behavior in real-world organizations based on two online datasets: a dataset from a small startup (40+ users) and a dataset from one large US company (3600+ users). The small startup dataset is collected using our privacy-preserving data logging tool, which removes personally identifiable information from content data and extracts only aggregated statistics such as word frequency counts and sentiment features. The privacy-preserving measures have enabled us to recruit participants to support this study. Correlation analysis over the startup dataset has shown that statistically there is often a change point in people's online behavior, and that the data exhibit weak trends that may be a manifestation of real-world attrition. The same findings are also verified in the large company dataset. Furthermore, we have trained a classifier to predict real-world attrition with a moderate accuracy of 60-65% on the large company dataset. Given the incompleteness and noisy nature of the data, the accuracy is encouraging.

    Systematic Literature Review on Customer Switching Behaviour from Marketing and Data Science Perspectives

    This paper systematically reviews the literature on customer switching behavior. The review shows that customer switching behavior has been widely researched across various industries, particularly banking and telecommunications. Research in this area has grown in recent years, and the amount of research in marketing and in data science is relatively balanced. In marketing, correlational studies predominate, with a focus on relating customer satisfaction, price-related variables, attractiveness of alternatives, service failure, quality, and switching costs to switching behavior. The PPM model is also gaining popularity as an important development for switching behavior because it considers both push and pull factors. Data science research has shown promising results in predicting customer switching behavior, with each research paper achieving good predictive accuracy. However, research gaps spanning the fields of marketing and data science need to be addressed to provide a comprehensive understanding of the drivers of customer switching behavior. Overall, the literature review shows that customer switching behavior is an important concern for businesses, and further research in this area is essential to gain a better understanding of customer behavior and develop effective strategies to retain customers.

    An Interval-based Multiobjective Approach to Feature Subset Selection Using Joint Modeling of Objectives and Variables

    This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluating feature subsets and as the objectives of the problem. One characteristic of these objective functions is the noise in their values, which should be handled appropriately during optimization. Our proposed algorithm consists of two major techniques that are specially designed for the feature subset selection problem. The first is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second is a model estimation method for learning a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, ℓ1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm is able to obtain comparable or better performance on the tested datasets.

    Çok amaçlı değişken seçimine etkileşimli evrimsel yaklaşımlar. [Interactive evolutionary approaches to multi-objective feature selection.]

    In feature selection problems, the aim is to select a subset of features to characterize an output of interest. In characterizing an output, we may want to consider multiple objectives, such as maximizing classification performance or minimizing the number of selected features or their cost. We develop a preference-based approach for multi-objective feature selection problems. Finding all Pareto optimal subsets may be computationally demanding, and we would still need to select a single solution eventually. Therefore, we develop interactive evolutionary approaches that aim to converge to a subset that is highly preferred by the decision maker. We test our approach on several instances, simulating decision-maker preferences with underlying preference functions, and demonstrate that it works well. M.S. - Master of Science
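    To make the notion of Pareto optimal feature subsets in the abstract above concrete, here is a minimal sketch of a brute-force Pareto filter. It is not the thesis's interactive evolutionary method. The two objectives (error rate and number of features, both minimized) and the candidate values are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate: (error_rate, n_features); both objectives are minimized.
Candidate = Tuple[float, int]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates (input order preserved)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Made-up evaluations of five feature subsets.
candidates = [(0.10, 12), (0.12, 5), (0.10, 8), (0.20, 5), (0.08, 20)]
front = pareto_front(candidates)
print(front)
```

    Even in this tiny example several subsets remain on the front, which is why a decision maker's preferences are still needed to pick one of them.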

    Optimizing Ontology Alignments through NSGA-II without Using Reference Alignment

    Ontologies are widely used to solve data heterogeneity problems on the semantic web, but the available ontologies can themselves introduce heterogeneity. To reconcile these ontologies and achieve semantic interoperability, we need to find the relationships among the entities in various ontologies; the process of identifying them is called ontology alignment. In all existing matching systems that use evolutionary approaches to optimize their parameters, a reference alignment between the two ontologies to be aligned must be given in advance, which can be very expensive to obtain, especially when the ontologies are large. To address this issue, in this paper we propose a novel approach that uses NSGA-II to optimize ontology alignments without a reference alignment. In our approach, an adaptive aggregation strategy is presented to improve the efficiency of the optimization process, and two approximate evaluation measures, namely match coverage and match ratio, are introduced to replace the classic recall and precision on a reference alignment when evaluating the quality of the alignments. Experimental results show that our approach is effective and can find solutions that are very close to those obtained by approaches using a reference alignment, and that the quality of the alignments is in general better than that of state-of-the-art ontology matching systems such as GOAL and SAMBO.