    An application of machine learning to explore relationships between factors of organisational silence and culture, with specific focus on predicting silence behaviours

    Research indicates that there are many individual reasons why people do not speak up when confronted with situations that concern them in their working environment. One area that requires more focused research is the role culture plays in why a person may remain silent when such situations arise. The purpose of this study is to use data science techniques to explore the patterns in a data set that would lead a person to engage in organisational silence. The main research question the thesis asks is: is Machine Learning a tool that social scientists can use with respect to Organisational Silence and Culture, one that augments the statistical analysis approaches commonly used in this domain? This study forms part of a larger study being run by the third supervisor of this thesis. A questionnaire was developed by organisational psychologists within this group to collect data covering six traits of silence as well as cultural and individual attributes that could be used to determine whether someone would engage in silence. This thesis explores three of those cultures to find main effects and interactions between variables that could influence silence behaviours. Data analysis was carried out on data collected in three European countries: Italy, Germany and Poland (n=774). The data analysis comprised (1) exploring the characteristics of the data and determining the validity and reliability of the questionnaire; (2) identifying a suitable classification algorithm which displayed good predictive accuracy and modelled the data well based on eight already confirmed hypotheses from the organisational silence literature; and (3) investigating newly discovered patterns and interactions within the data, previously undocumented in the silence literature, on how culture plays a role in predicting silence. It was found that all the silence constructs showed good validity with the exception of Opportunistic Silence and Disengaged Silence.
Validation of the cultural dimensions was found to be poor for all constructs when aggregated to the individual level, with the exception of Humane Orientation Organisational Practices, Power Distance Organisational Practices, Humane Orientation Societal Practices and Power Distance Societal Practices. In addition, not all constructs were invariant across countries. For example, a number of constructs showed invariance across the Poland and Germany samples, but failed for the Italian sample. Ten models were trained to identify predictors of a binary variable: whether the respondent engaged in Organisational Silence. Two of the most accurate models were chosen for further analysis of the main effects and interactions within the dataset, namely Random Forest (AUC = 0.655) and Conditional Inference Forests (AUC = 0.647). The models confirmed 9 out of 16 of the known relationships and identified three additional potential interactions within the data that were previously not documented in the silence literature on how culture plays a role in predicting silence. For example, Climate for Authenticity was discovered to moderate the effect of both Power Distance Societal Practices and Diffident Silence in reducing the probability of someone engaging in silence. This is the first time this instrument has been validated via statistical techniques for suitability for use across cultures. Modelling the silence data using classification algorithms with Partial Dependency Plots is a novel and previously unexplored method of exploring organisational silence. In addition, the results identified new information on how culture plays a role in silence behaviours. The results also highlighted that models such as ensembles, which identify non-linear relationships without making assumptions about the data, and visualisations depicting the interactions identified by such models can offer new insights over and above the current toolbox of analysis techniques prevalent in social science research.
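
The two quantities this abstract leans on, AUC and partial dependence, can be illustrated with a minimal sketch. It is not the thesis's code: the `toy_model` function, the feature names `power_distance` and `authenticity`, and all numbers are hypothetical stand-ins for a fitted classifier and the questionnaire data. AUC is computed as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and partial dependence as the average prediction when one feature is forced to a fixed value.

```python
from itertools import product


def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over `rows` with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


# Hypothetical stand-in for a fitted classifier (an assumption, not the
# thesis's model): power distance raises the probability of silence only
# when climate for authenticity is low.
def toy_model(row):
    return 0.2 + 0.6 * row["power_distance"] * (1 - row["authenticity"])


# Made-up respondents covering both authenticity levels.
rows = [{"power_distance": p, "authenticity": a}
        for p in (0.0, 0.5, 1.0) for a in (0.0, 1.0)]

# AUC on four made-up cases: 3 of the 4 positive/negative pairs are ordered correctly.
auc_value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Partial dependence on power distance, computed separately for low and
# high authenticity; diverging curves indicate a moderating effect.
grid = [0.0, 1.0]
pd_low = partial_dependence(
    toy_model, [r for r in rows if r["authenticity"] == 0.0], "power_distance", grid)
pd_high = partial_dependence(
    toy_model, [r for r in rows if r["authenticity"] == 1.0], "power_distance", grid)

print(auc_value, pd_low, pd_high)
```

In this toy setup the curve for low authenticity rises with power distance while the curve for high authenticity stays flat, which is the kind of pattern a moderating effect would produce in a partial dependence plot.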

    A CRITICAL EXPLORATION OF THE POTENTIAL UTILITY OF RULE INDUCTION DATA MINING METHODS TO “ORTHODOX” EDUCATION RESEARCH

    Despite some theoretical promise, it is unclear whether rule induction data mining approaches (e.g., classification trees and association rules) add methodological value to "orthodox" education research, i.e., research unrelated to computer-based education. To better understand whether and how rule induction methods could be useful to education researchers, I explored whether they, relative to regression approaches, (1) improve classification accuracy, and/or (2) offer new avenues of explanation. Additionally, I aimed to illustrate a practical and principled way to use the various rule induction approaches so that researchers can more easily choose to use them. To these ends, I conducted an extended literature review on rule induction methods, and re-analyzed two regression studies (Byrnes & Miller, 2007; Thomas, 2006) on the National Education Longitudinal Study of 1988 using ten rule induction approaches. Data mining happened in two rounds for each study: first, using only the predictors used in the original study, and second, using all reasonable and available predictors. I compared results across methods and rounds to better understand whether, how, and why rule induction may provide additional insights. I found that while rule induction approaches can be labor intensive and are not necessarily more predictive than regression, they can provide unique descriptions of the sample that show, at a glance, how key predictors relate to each other and to the outcome. They can also help identify relationships between variables that hold for some subgroups but not others.
For example: (i) rulesets induced from Byrnes and Miller's dataset suggested that Algebra 2 and math self-concept were positively related to 12th grade math scores, but only for those who were higher achieving in 8th grade math; (ii) association rules mined from Thomas' dataset suggested that factors such as school safety and honors program participation were more strongly associated with 12th grade achievement for lower-income students and students with lower parental education. Thus, when relationships between the predictors and outcome may not be uniform across the population, rule induction can provide more information than regression in exploring those relationships. Lessons learned and recommendations on how to apply rule induction approaches are also discussed.
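
As a rough illustration of how an association rule can expose a subgroup-specific relationship, the sketch below computes the standard support, confidence and lift of a rule on a handful of made-up records. The item names (`low_income`, `safe_school`, `high_achievement`) and the eight records are hypothetical and are not drawn from either re-analyzed dataset.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(records)
    has_a = [r for r in records if antecedent <= r]      # records matching the "if" part
    has_both = [r for r in has_a if consequent <= r]     # ... that also match the "then" part
    has_c = [r for r in records if consequent <= r]
    support = len(has_both) / n
    confidence = len(has_both) / len(has_a)
    lift = confidence / (len(has_c) / n)                 # confidence relative to the base rate
    return support, confidence, lift


# Eight made-up student records, each a set of attributes that are present.
records = [
    {"low_income", "safe_school", "high_achievement"},
    {"low_income", "safe_school", "high_achievement"},
    {"low_income"},
    {"low_income"},
    {"safe_school", "high_achievement"},
    {"safe_school"},
    {"high_achievement"},
    set(),
]

outcome = {"high_achievement"}
overall = rule_metrics(records, {"safe_school"}, outcome)
subgroup = rule_metrics(records, {"low_income", "safe_school"}, outcome)

print("safe_school -> high_achievement:", overall)
print("low_income & safe_school -> high_achievement:", subgroup)
```

In this toy data the rule restricted to the lower-income subgroup has higher confidence and lift than the unrestricted rule, which is the kind of non-uniform relationship the abstract describes; a single regression coefficient for school safety would average over both groups unless an interaction term were specified.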