2 research outputs found

    Identification of important features and data mining classification techniques in predicting employee absenteeism at work

    Employee absenteeism costs organizations billions a year. Predicting employees’ absenteeism and the reasons behind it helps organizations reduce expenses and increase productivity. Data mining turns the vast volume of human resources data into information that can support decision-making and prediction. Although feature selection is a critical step in data mining that improves the efficiency of the final prediction, it is not yet known which feature selection method performs best. Therefore, this paper compares the performance of three well-known feature selection methods in absenteeism prediction: relief-based feature selection, correlation-based feature selection, and information-gain feature selection. In addition, this paper seeks the combination of feature selection method and data mining technique that best improves absenteeism prediction accuracy. Seven classification techniques were used as prediction models, and cross-validation was used to assess them in order to obtain more realistic and reliable results. The dataset consists of records of absenteeism at work collected at a courier company in Brazil. In the experimental results, correlation-based feature selection outperformed the other methods across the performance measurements. Furthermore, the bagging classifier was the best-performing data mining technique when features were selected using correlation-based feature selection, with an accuracy of 92%.
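As an illustration of one of the three feature selection methods named in the abstract above, the following is a minimal sketch of information-gain ranking for discrete features, using only the Python standard library. It is not the paper's implementation: the function names (`entropy`, `information_gain`, `rank_features`) and the toy records are hypothetical, and it assumes every feature is already categorical (continuous attributes would need to be discretized first).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature name, gain) pairs sorted by decreasing information gain."""
    names = rows[0].keys()
    gains = {name: information_gain([r[name] for r in rows], labels) for name in names}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records for illustration only; not taken from the paper's dataset.
rows = [
    {"season": "winter", "smoker": "no"},
    {"season": "winter", "smoker": "yes"},
    {"season": "summer", "smoker": "no"},
    {"season": "summer", "smoker": "yes"},
]
labels = ["absent", "absent", "present", "present"]
ranking = rank_features(rows, labels)
```

In this toy data, `season` separates the two classes completely while `smoker` carries no information about the label, so `season` is ranked first. A feature selection step would then keep only the top-ranked features before training a classifier.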

    Studying employee absenteeism due to health-related factors: A data-science approach

    United States employers are spending approximately $950 billion on healthcare benefits, and these costs are impeding their ability to compete in their respective markets. Furthermore, these costs do not include the cost of employee absenteeism, that is, of employees failing to show up for scheduled work. Research has shown that the primary reason for employee absenteeism is poor health. However, management research has primarily focused on controllable factors related to avoidable absences (e.g., job burnout, work attitudes, and personality characteristics). Therefore, the critical issue I address in this dissertation is: How can employers understand, predict, and decrease the effect of absenteeism related to the health conditions of their workforces? A data-science approach was used to explore this question, focusing on the leading cause of disability, musculoskeletal disorders (MSDs), and how they affect employee absenteeism. First, I created a well-formed combined dataset by applying advanced data preparation methods to the datasets of three self-insured employers: their medical claims, pharmacy claims, human resource records, and attendance data. Next, I ran machine learning algorithms to examine prediction accuracy and the most probable risk factors influencing health-related employee absenteeism. For example, factors influencing the risk of increased absence related to poor health include demographic features of the employees and their position (e.g., age, gender, salary, department, and workload), existing health conditions at the time of absence (e.g., diabetes, behavioral health, arthritis, cardiac, and gastrointestinal), treatments for the health condition (e.g., drugs, physical therapy, non-surgical procedures, and surgical procedures), and other medical-related variables (e.g., provider types, locations, imaging, labs, and tests).
    The impact of the time taken to obtain treatment was also investigated, because research indicates that shorter wait times correlate with better outcomes for MSD treatments. A post hoc analysis was conducted to compare the essential variables that predict long-term employee absenteeism with the critical variables that predict high medical costs. This analysis provides insight into which kinds of healthcare services are associated with a quality outcome (e.g., lower employee absence).