272 research outputs found

    Online churn detection on high dimensional cellular data using adaptive hierarchical trees

    We study online sequential logistic regression for churn detection in cellular networks when the feature vectors lie in a high dimensional space on a time varying manifold. We escape the curse of dimensionality by tracking the subspace of the underlying manifold using a hierarchical tree structure. We use the projections of the original high dimensional feature space onto the underlying manifold as the modified feature vectors. By using the proposed algorithm, we provide significant classification performance with significantly reduced computational complexity as well as memory requirement. We reduce the computational complexity to the order of the depth of the tree and the memory requirement to only linear in the intrinsic dimension of the manifold. We provide several results with real life cellular network data for churn detection. © 2016 IEEE
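The idea of running online logistic regression on projected features can be illustrated with a minimal sketch. It assumes a fixed, known two-dimensional subspace (`basis`) and synthetic data. It does not reproduce the adaptive hierarchical tree that the abstract uses to track a time-varying manifold, and the names `project`, `OnlineLogistic`, and the data-generating process are hypothetical.

```python
import math
import random


def project(x, basis):
    """Return the coordinates of x along each row of `basis`."""
    return [sum(b_i * x_i for b_i, x_i in zip(row, x)) for row in basis]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogistic:
    """Logistic regression updated one sample at a time with SGD."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, z):
        return sigmoid(sum(w * v for w, v in zip(self.w, z)) + self.b)

    def update(self, z, y):
        # Gradient of the log loss for one sample (y is 0 or 1).
        err = self.predict_proba(z) - y
        self.w = [w - self.lr * err * v for w, v in zip(self.w, z)]
        self.b -= self.lr * err


# Synthetic stream (assumed data): 20 raw features, only the first is informative.
random.seed(0)
DIM = 20
# Assumed fixed subspace spanned by the first two coordinate axes.
basis = [[1.0 if j == i else 0.0 for j in range(DIM)] for i in range(2)]
model = OnlineLogistic(dim=len(basis))

for _ in range(500):
    y = random.randint(0, 1)
    x = [random.gauss(0.0, 1.0) for _ in range(DIM)]
    x[0] = (1.0 if y == 1 else -1.0) + random.gauss(0.0, 0.3)
    model.update(project(x, basis), y)  # per-sample cost depends on the subspace size

churn_like = project([1.0] + [0.0] * (DIM - 1), basis)
stay_like = project([-1.0] + [0.0] * (DIM - 1), basis)
print(model.predict_proba(churn_like), model.predict_proba(stay_like))
```

In this sketch the classifier stores only as many weights as the subspace has dimensions, which loosely mirrors the memory claim in the abstract. The projection itself is a stand-in for the tree-based tracking.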

    Quadri-dimensional approach for data analytics in mobile networks

    The telecommunication market is growing at a very fast pace with the evolution of new technologies to support high speed throughput and the availability of a wide range of services and applications in the mobile networks. This has led to a need for communication service providers (CSPs) to shift their focus from monitoring network elements towards monitoring services and subscribers’ satisfaction by introducing service quality management (SQM) and customer experience management (CEM). Both require fast responses to reduce the time to find and solve network problems, to ensure efficiency and proactive maintenance, and to improve the quality of service (QoS) and the quality of experience (QoE) of the subscribers. While both SQM and CEM demand information from several different interfaces, managing multiple data sources adds an extra layer of complexity to the collection of data. While several studies have been conducted on data analytics in mobile networks, most of them did not consider analytics based on the four dimensions involved in the mobile network environment, which are the subscriber, the handset, the service and the network element, with multiple interface correlation. The main objective of this research was to develop mobile network analytics models applied to the 3G packet-switched domain by analysing data from the radio network with the Iub interface and the core network with the Gn interface to provide a fast root cause analysis (RCA) approach considering the four dimensions involved in the mobile networks. This was achieved by using the latest computer engineering advancements, which are Big Data platforms and data mining techniques through machine learning algorithms. Electrical and Mining Engineering. M. Tech. (Electrical Engineering)

    Sequential churn prediction and analysis of cellular network users - A multi-class, multi-label perspective

    We investigate the problem of churn detection and prediction using sequential cellular network data. We introduce a cleaning and preprocessing procedure that makes the dataset suitable for the analysis. We compare the churn prediction results of state-of-the-art algorithms such as Gradient Boosting Trees, Random Forests, basic Long Short-Term Memory (LSTM) and Support Vector Machines (SVM). We achieve a significant performance boost by incorporating the sequential nature of the data, imputing missing information and analyzing the effects of various features. This in turn makes the classifier rigorous enough to give highly accurate results. We emphasize the sequential nature of the problem and seek algorithms that can track the variations in the data. We test and compare the performance of the proposed algorithms using performance measures on real life cellular network data for churn detection. © 2017 IEEE

    Cost-Sensitive Learning-based Methods for Imbalanced Classification Problems with Applications

    Analysis and predictive modeling of massive datasets is an extremely significant problem that arises in many practical applications. The task of predictive modeling becomes even more challenging when data are imperfect or uncertain. The real data are frequently affected by outliers, uncertain labels, and uneven distribution of classes (imbalanced data). Such uncertainties create bias and make predictive modeling an even more difficult task. In the present work, we introduce a cost-sensitive learning method (CSL) to deal with the classification of imperfect data. Typically, most traditional approaches for classification demonstrate poor performance in an environment with imperfect data. We propose the use of CSL with Support Vector Machine, which is a well-known data mining algorithm. The results reveal that the proposed algorithm produces more accurate classifiers and is more robust with respect to imperfect data. Furthermore, we explore the best performance measures to tackle imperfect data along with addressing real problems in quality control and business analytics
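As a rough illustration of combining cost-sensitive learning with a Support Vector Machine, the sketch below trains a linear SVM by stochastic gradient descent on a class-weighted hinge loss, so that margin violations on the minority class are penalised more heavily. This is one common way to make an SVM cost-sensitive and is not necessarily the formulation used in the work above. The function names, the cost values, and the synthetic data are all assumptions.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_cost_sensitive_svm(X, y, class_cost, lr=0.01, lam=0.01, epochs=50, seed=0):
    """Linear SVM via SGD on a class-weighted hinge loss. Labels are +1 / -1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            cost = class_cost[y[i]]
            # Check the margin before any update for this sample.
            violated = y[i] * (dot(w, X[i]) + b) < 1.0
            # L2 regularisation shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # The hinge subgradient is scaled by the class-specific cost.
                w = [wj + lr * cost * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y[i]
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0.0 else -1


# Assumed imbalanced data: 90 majority (-1) and 10 minority (+1) points in 1-D.
rng = random.Random(1)
X = [[rng.gauss(-1.0, 1.0)] for _ in range(90)] + [[rng.gauss(1.0, 1.0)] for _ in range(10)]
y = [-1] * 90 + [1] * 10

# Cost ratio 9:1 roughly offsets the 90:10 class imbalance (an assumed choice).
w, b = train_cost_sensitive_svm(X, y, class_cost={1: 9.0, -1: 1.0})
minority_recall = sum(predict(w, b, x) == 1 for x, t in zip(X, y) if t == 1) / 10
print(w, b, minority_recall)
```

Setting both costs to 1.0 recovers an ordinary linear SVM, which makes it easy to compare minority-class recall with and without the weighting.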

    A comparative study of tree-based models for churn prediction : a case study in the telecommunication sector

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM. In recent years the topic of customer churn, the phenomenon of customers abandoning one company for another, has gained increasing importance. Customer churn plays an important role especially in more saturated industries such as the telecommunication industry, since existing customers are very valuable and the acquisition cost of new customers is very high nowadays. Companies want to know which of their customers are going to churn to another provider, and when, so that measures can be taken to retain the customers who are at risk of churning. Such measures could take the form of incentives to the churners, but the downside is that misclassifying churners will cost the company a lot, especially when incentives are given to non-churning customers. The common challenge in predicting customer churn is how to pre-process the data and which algorithm to choose, especially when the dataset is heterogeneous, which is very common for telecommunication companies’ datasets. The presented thesis aims at predicting customer churn for the telecommunication sector using different decision tree algorithms and their ensemble models.

    Data mining in manufacturing: a review based on the kind of knowledge

    In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas, including product and process design, assembly, materials planning, quality control, scheduling, maintenance, fault detection etc. Data mining has emerged as an important tool for knowledge acquisition from the manufacturing databases. This paper reviews the literature dealing with knowledge discovery and data mining applications in the broad domain of manufacturing with a special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis. The papers reviewed have therefore been categorized in these five categories. It has been shown that there is a rapid growth in the application of data mining in the context of manufacturing processes and enterprises in the last 3 years. This review reveals the progressive applications and existing gaps identified in the context of data mining in manufacturing. A novel text mining approach has also been used on the abstracts and keywords of 150 papers to identify the research gaps and find the linkages between knowledge area, knowledge type and the applied data mining tools and techniques

    Flexible Application-Layer Multicast in Heterogeneous Networks

    This work develops a set of peer-to-peer-based protocols and extensions in order to provide Internet-wide group communication. The focus is on the question of how different access technologies can be integrated in order to face the growing traffic load problem. To this end, protocols are developed that allow autonomous adaptation to the current network situation on the one hand and the integration of WiFi domains where applicable on the other hand.

    Twitter Analysis to Predict the Satisfaction of Saudi Telecommunication Companies’ Customers

    The flexibility in mobile communications allows customers to quickly switch from one service provider to another, making customer churn one of the most critical challenges for the data and voice telecommunication service industry. In 2019, the percentage of post-paid telecommunication customers in Saudi Arabia decreased; this represents a great deal of customer dissatisfaction and subsequent corporate fiscal losses. Many studies correlate customer satisfaction with customer churn. Telecom companies have depended on historical customer data to measure customer churn. However, historical data does not reveal current customer satisfaction or the future likelihood of switching between telecom companies. Current methods of analysing churn rates are inadequate and face some issues, particularly in the Saudi market. This research was conducted to understand the relationship between customer satisfaction and customer churn and how to use social media mining to measure customer satisfaction and predict customer churn. This research conducted a systematic review to address the problems of churn prediction models and their relation to Arabic Sentiment Analysis (ASA). The findings show that the current churn models do not integrate structural data frameworks with real-time analytics to target customers in real time. In addition, the findings show that the specific issues in the existing churn prediction models in Saudi Arabia relate to the Arabic language itself, its complexity, and lack of resources. As a result, I have constructed the first gold standard corpus of Saudi tweets related to telecom companies, comprising 20,000 manually annotated tweets. It has been generated as a dialect sentiment lexicon extracted from a larger Twitter dataset collected by me to capture text characteristics in social media. I developed a new ASA prediction model for telecommunication that fills the detected gaps in the ASA literature and fits the telecommunication field. The proposed model proved its effectiveness for Arabic sentiment analysis and churn prediction. This is the first work using Twitter mining to predict potential customer loss (churn) in Saudi telecom companies, which has not been attempted before. Different fields, such as education, have different features, which makes applying the proposed model to them interesting because it is based on text mining.

    A bottom-up approach to real-time search in large networks and clouds
