A machine learning approach to modeling and predicting training effectiveness

Abstract

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 357-372).

Developments in online and computer-based training (CBT) technologies have enabled improvements in the efficiency, efficacy, and scalability of modern training programs. The use of computer-based methods in training programs allows trainee assessment metrics to be collected at much higher levels of detail. The resulting datasets may provide new opportunities for training evaluation and trainee intervention through descriptive and predictive modeling. In particular, descriptive approaches have the potential to provide greater understanding of trainee behavior and indicate similarities between trainees, while accurate prediction models of future performance, available early in a training program, could help inform trainee intervention methods. However, traditional analysis techniques and human intuition are of limited use in so-called "big data" environments, and one of the most promising areas to prepare for this influx of complex training data is the field of machine learning. Thus, the objective of this thesis was to lay the foundations for the use of machine learning algorithms in computer-based training settings. First, a taxonomy of training domains was developed to identify typical properties of training data. Second, the theoretical and practical considerations of traditional machine learning applications and of various training domains were identified and compared. This analysis identified the potential impacts of training data on machine learning performance and presented countermeasures to overcome some of the challenges associated with data from human training.
Third, analyses of machine learning performance were conducted on datasets from two different training domains: a rule-based nuclear reactor CBT, and a knowledge-based classroom environment with online components. These analyses discussed the results of the machine learning algorithms with a particular focus on the usefulness of the model outputs for training evaluation. Additionally, the applications of machine learning to the two training domains were compared, providing a set of lessons for the future use of machine learning in training. Several consistent themes emerged from these analyses that can inform both research and applied use of machine learning in training. On the tested datasets, simple machine learning algorithms performed similarly to complex methods for both unsupervised and supervised learning, and had the additional benefit of being easier for training supervisors to interpret. Process-level assessment metrics generally provided little improvement over traditional summative metrics when the latter were available, but made strong contributions when summative information was limited. In particular, process-level information improved early prediction to inform trainee intervention for longer training programs, and improved descriptive modeling of the data for shorter programs. The frequency with which process-level information is collected further allows accurate predictions to be made earlier in the training program, which allows for greater certainty and earlier application of targeted interventions. These lessons provide the groundwork for the study of machine learning on training domain data, enabling the efficient use of new data opportunities in computer-based training programs.

by Alexander James Stimpson. Ph. D.
