335 research outputs found

    Dropout Prediction: A Systematic Literature Review

    Dropout prediction is a challenging analysis process that requires appropriate approaches to address dropout. Existing approaches are applied in different areas such as education, telecommunications, retail, social networks, and banking services. The goal is to identify customers at risk of dropout to support retention strategies. This research developed a systematic literature review to evaluate the development of existing studies that predict dropout using machine learning, following the guidelines recommended by Kitchenham and Peterson. The systematic review followed three phases: planning, conducting, and reporting. The most relevant articles were selected with the Active Systematic Review tool, which uses artificial intelligence algorithms. The criteria identified 28 articles, and several research lines were identified. Dropout is a transversal problem for several sectors of economic activity, where countermeasures can be taken before it happens if it is detected early.

    The misty crystal ball: Efficient concealment of privacy-sensitive attributes in predictive analytics

    Individuals are becoming increasingly concerned with privacy. This curtails their willingness to share sensitive attributes like age, gender or personal preferences; yet firms largely rely upon customer data in any type of predictive analytics. Hence, organizations are confronted with a dilemma in which they need to make a tradeoff between a sparse use of data and the utility from better predictive analytics. This paper proposes a masking mechanism that obscures sensitive attributes while maintaining a large degree of predictive power. More precisely, we efficiently identify data partitions that are best suited for (i) shuffling, (ii) swapping and, as a form of randomization, (iii) perturbing attributes by conditional replacement. By operating on data partitions that are derived from a predictive algorithm, we achieve the objective of masking privacy-sensitive attributes with marginal downsides for predictive modeling. The resulting trade-off between masking and predictive utility is empirically evaluated in the context of customer churn where, for instance, a stratified shuffling of attribute values rarely reduces predictive accuracy by more than a percentage point. Our proposed framework entails direct managerial implications, as a growing share of firms adopts predictive analytics and thus requires mechanisms that better adhere to user demands for information privacy.
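
    The stratified shuffling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper derives its partitions from a predictive algorithm, whereas this example assumes each record already carries a precomputed partition label. The function name `shuffle_within_partitions` and the field names `segment` and `age` are hypothetical.

```python
import random
from collections import defaultdict


def shuffle_within_partitions(records, partition_key, sensitive_attr, seed=0):
    """Return copies of `records` with `sensitive_attr` shuffled inside each partition.

    Assumption: `partition_key` names a precomputed partition label. How the
    partitions are derived is outside the scope of this sketch.
    """
    rng = random.Random(seed)

    # Group record positions by their partition label.
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[rec[partition_key]].append(idx)

    # Copy the records so the input is left untouched.
    masked = [dict(rec) for rec in records]

    # Permute the sensitive values among the records of the same partition only.
    for indices in groups.values():
        values = [records[i][sensitive_attr] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            masked[i][sensitive_attr] = value
    return masked


# Hypothetical toy data: two partitions, one sensitive attribute.
customers = [
    {"id": 1, "segment": "A", "age": 23},
    {"id": 2, "segment": "A", "age": 41},
    {"id": 3, "segment": "A", "age": 35},
    {"id": 4, "segment": "B", "age": 58},
    {"id": 5, "segment": "B", "age": 62},
]
masked_customers = shuffle_within_partitions(customers, "segment", "age", seed=42)
```

    Because values only move between records in the same partition, the per-partition distribution of the attribute is preserved while the link between an individual and their own value is weakened.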

    A data mining-based framework for supply chain risk management

    Increased risk exposure levels, technological developments and the growing information overload in supply chain networks drive organizations to embrace data-driven approaches in Supply Chain Risk Management (SCRM). Data Mining (DM) employs multiple analytical techniques for intelligent and timely decision making; however, its potential is not entirely explored for SCRM. The paper aims to develop a DM-based framework for the identification, assessment and mitigation of different types of risks in supply chains. A holistic approach integrates DM and risk management activities in a unique framework for effective risk management. The framework is validated with a case study based on a series of semi-structured interviews, discussions and a focus group study. The study showcases how DM supports the discovery of hidden and useful information from unstructured risk data for making intelligent risk management decisions.

    Comparison of Classification Algorithms and Undersampling Methods on Employee Churn Prediction: A Case Study of a Tech Company

    Churn prediction is a common data mining problem that many companies face across industries. More commonly, customer churn has been studied extensively within the telecommunications industry, where there is low customer retention due to high market competition. Similar to customer churn, employee churn is very costly to a company; without proper risk mitigation strategies, profits cannot be maximized and valuable employees may leave the company. The cost of replacing an employee is far higher than the cost of retaining one, so it is in any company’s best interest to prioritize employee retention. This research combines machine learning techniques with undersampling in the hope of identifying employees at risk of churn so that retention strategies can be implemented before it is too late. Four different classification algorithms are tested on a variety of undersampled datasets in order to find the most effective undersampling and classification method for predicting employee churn. Statistical analysis is conducted on the appropriate evaluation metrics to find the most significant methods. The results of this study can be used by the company to target individuals at risk of churn so that risk mitigation strategies can be effective in retaining valuable employees. The methods and results can be tested and applied across different industries and companies.
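
    The abstract does not say which undersampling methods were compared. As an illustration only, the sketch below shows random undersampling, one common variant, which discards majority-class rows until every class is as small as the minority class. The function name `random_undersample` and the `churned` field are hypothetical.

```python
import random


def random_undersample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest one."""
    rng = random.Random(seed)

    # Bucket the rows by class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    # Every class is reduced to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    # Shuffle so the classes are not grouped together in the output.
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced data: 8 employees who stayed, 2 who left.
employees = [{"id": i, "churned": 0} for i in range(8)]
employees += [{"id": 8, "churned": 1}, {"id": 9, "churned": 1}]
balanced_employees = random_undersample(employees, "churned", seed=1)
```

    A classifier would then be trained on the balanced rows. Undersampling should be applied to the training split only, so that evaluation still reflects the real class distribution.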

    Predicting HR Churn with Python and Machine Learning

    Employee turnover imposes a substantial financial burden, necessitating proactive retention strategies. The aim is to leverage HR analytics, specifically a systematic machine learning approach, to predict the likelihood of active employees leaving the company. Using a systematic approach for supervised classification, the study leverages data on former employees to predict the probability of current employees leaving. Factors such as recruitment costs, sign-on bonuses, and onboarding productivity loss are analysed to explain when and why employees are prone to leave. The project aims to empower companies to take pre-emptive measures for retention. Contributing to HR analytics, it provides a methodological framework applicable to various machine learning problems, optimizing human resource management and enhancing overall workforce stability. This research not only contributes to predicting turnover but also proposes policies and strategies derived from the model's results. By understanding the root causes and timing of employee departures, companies can proactively implement measures to mitigate turnover, thereby minimizing the associated financial and operational burdens.

    A comparative study of tree-based models for churn prediction: a case study in the telecommunication sector

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM. In recent years, the topic of customer churn, the phenomenon of customers abandoning one company for another, has gained increasing importance. Customer churn plays an important role especially in more saturated industries like the telecommunication industry, since existing customers are very valuable and the acquisition cost of new customers is very high nowadays. Companies want to know which of their customers are going to churn to another provider, and when, so that measures can be taken to retain the customers who are at risk of churning. Such measures could take the form of incentives to the churners, but the downside is that misclassifying churners will cost the company a lot, especially when incentives are given to non-churning customers. The common challenge in predicting customer churn is how to pre-process the data and which algorithm to choose, especially when the dataset is heterogeneous, which is very common for telecommunication companies’ datasets. The presented thesis aims at predicting customer churn for the telecommunication sector using different decision tree algorithms and their ensemble models.

    Next-generation big data analytics: state of the art, challenges, and future research topics

    The term big data occurs more frequently now than ever before. A large number of fields and subjects, ranging from everyday life to traditional research fields (e.g., geography and transportation, biology and chemistry, medicine and rehabilitation), involve big data problems. The popularization of various types of networks has diversified the types, issues, and solutions for big data more than ever before. In this paper, we review recent research in data types, storage models, privacy, data security, analysis methods, and applications related to network big data. Finally, we summarize the challenges and development of big data to predict current and future trends. This work was supported in part by the “Open3D: Collaborative Editing for 3D Virtual Worlds” [EPSRC (EP/M013685/1)], in part by the “Distributed Java Infrastructure for Real-Time Big-Data” (CAS14/00118), in part by eMadrid (S2013/ICE-2715), in part by the HERMES-SMARTDRIVER (TIN2013-46801-C4-2-R), and in part by the AUDACity (TIN2016-77158-C4-1-R). Paper no. TII-16-1