    Determining systematic differences in human graders for machine learning-based automated hiring

    Firms routinely use natural language processing combined with other machine learning (ML) tools to assess prospective employees through automated resume classification based on pre-codified skill databases. The rush to automation can, however, backfire by encoding unintentional bias against groups of candidates. We run two experiments with human evaluators from two different countries to determine how cultural differences may affect hiring decisions. We use hiring materials provided by an international skill-testing firm that runs hiring assessments for Fortune 500 companies. The company conducts a video-based interview assessment using machine learning, which grades job applicants automatically based on verbal and visual cues. Our study has three objectives: to compare the automatic assessments of the video interviews with assessments of the same interviews by human graders; to examine which characteristics of human graders may lead to systematic differences in their assessments; and to propose a method for correcting human evaluations using automation. We find that systematic differences can exist across human graders and that some of these differences can be accounted for by an ML tool if measured at the time of training.
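    The abstract does not spell out the proposed correction method. As a hedged illustration only, one simple way to estimate and remove systematic grader differences is a consensus-gap adjustment: estimate each grader's average deviation from the per-applicant consensus and subtract it. All column names and data below are hypothetical, not the paper's actual scheme:

```python
# Hypothetical sketch: estimate systematic grader effects and adjust scores.
# Column names (grader_id, applicant_id, score) are assumptions, not the
# paper's actual schema; the paper's own correction method may differ.
import pandas as pd

ratings = pd.DataFrame({
    "grader_id":    ["g1", "g1", "g2", "g2", "g3", "g3"],
    "applicant_id": ["a1", "a2", "a1", "a3", "a2", "a3"],
    "score":        [4.0, 3.5, 3.0, 2.5, 4.5, 3.5],
})

# A grader's bias is estimated as the mean gap between their scores and
# the per-applicant consensus (mean across all graders).
consensus = ratings.groupby("applicant_id")["score"].transform("mean")
ratings["gap"] = ratings["score"] - consensus
bias = ratings.groupby("grader_id")["gap"].mean()

# Corrected score: subtract each grader's estimated systematic offset.
ratings["corrected"] = ratings["score"] - ratings["grader_id"].map(bias)
print(bias)
print(ratings[["grader_id", "applicant_id", "score", "corrected"]])
```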

    The Revised SAT score and its potential benefit for minority students in higher education

    This paper investigates the predictive validity of the Revised SAT (R-SAT) score, proposed by Freedle (2003) as an alternative to compensate minority students for the potential harm caused by the relationship between item difficulty and ethnic DIF observed in the SAT. The R-SAT score is the score minority students would have received if only the hardest questions from the test had been considered; it was computed using a formula score and a regression approach. In this article we examine the potential effects of using the R-SAT of minority students in admissions decisions at selective institutions, its capacity to predict short- and long-term academic outcomes, and its potential benefits regarding differential prediction of college grades for minority students. To test this, we compared the performance of the R-SAT score to the standard SAT score in a sample of graduates from California public schools and in a subsample of students who enrolled in the University of California. We found that, in terms of the potential for college admission of minority students, prediction power, and the issue of overprediction, the R-SAT score did not perform significantly better than the SAT score.
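    As a rough illustration of the scoring idea described above, the sketch below computes a classic correction-for-guessing formula score, R - W/(k - 1), restricted to the hardest items. The difficulty cutoff, item counts, omission rate, and five-option assumption are invented for the example and are not Freedle's exact specification:

```python
# Illustrative sketch of a formula score computed on the hardest items only,
# in the spirit of the R-SAT. All values here are simulated assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_items, n_options = 60, 5
difficulty = rng.uniform(0.2, 0.9, n_items)           # P(wrong) per item
responses = rng.random((100, n_items)) > difficulty   # True = correct (simulated)
omitted = rng.random((100, n_items)) < 0.05           # simulated omitted items

hard = difficulty > 0.6                                # keep only the hardest items
right = (responses & ~omitted)[:, hard].sum(axis=1)
wrong = (~responses & ~omitted)[:, hard].sum(axis=1)

# Classic correction-for-guessing formula score: R - W/(k - 1).
formula_score = right - wrong / (n_options - 1)
print(formula_score[:10])
```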

    Explainable AI (XAI): Improving At-Risk Student Prediction with Theory-Guided Data Science, K-means Classification, and Genetic Programming

    This research explores the use of eXplainable Artificial Intelligence (XAI) in Educational Data Mining (EDM) to improve the performance and explainability of artificial intelligence (AI) and machine learning (ML) models that predict at-risk students. Explainable predictions give students and educators more insight into at-risk indicators and causes, which facilitates instructional intervention guidance. Historically, low student retention has been prevalent across the globe, and nations have implemented a wide range of interventions (e.g., policies, funding, and academic strategies) with only minimal improvements in recent years. In the US, recent attrition rates indicate that two out of five first-time freshmen will not graduate from the same four-year institution within six years. In response, emerging AI research leveraging recent advances in deep learning has demonstrated high predictive accuracy for identifying at-risk students, which is useful for planning instructional interventions. However, research suggests a general trade-off between the performance and the explainability of predictive models: those that perform best, such as deep neural networks (DNNs), are highly complex and considered black boxes (i.e., systems that are difficult to explain, interpret, and understand). This lack of model transparency yields shallow predictions with limited feedback, precluding useful intervention guidance, and raises concerns about trust and ethical use in decision-making applications that involve humans, such as health, safety, and education. The scope of this study is a hybrid research design comprising: (1) a systematic literature review of XAI and EDM applications in education; (2) the development of a theory-guided feature selection (TGFS) conceptual learning model; and (3) an EDM study exploring the efficacy of a TGFS XAI model. The EDM study implemented K-Means classification for exploratory (unsupervised) and predictive (supervised) analysis, and assessed the predictive performance and explainability of Genetic Programming (GP), a type of XAI model, against common AI/ML models. Online student activity and performance data were collected from the learning management system (LMS) of a four-year higher education institution; student data were anonymized and protected to ensure privacy and security. Data were aggregated at weekly intervals to compute and assess predictive performance (sensitivity, recall, and F1 score) over time. Mean differences and effect sizes are reported at the .05 significance level, and reliability and validity are strengthened by following research best practices.
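    The study's exact pipeline is not given here. A minimal sketch of the K-Means-as-classifier idea (cluster weekly LMS features without labels, then label each cluster by its majority outcome) might look like the following; all feature names and data are simulated assumptions, not the study's dataset:

```python
# Hedged sketch: unsupervised K-Means on weekly LMS activity features,
# used predictively by labeling each cluster with its majority outcome.
# Feature names, k=2, and the data are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Simulated weekly features: logins, time-on-task, quiz average.
X = rng.normal(size=(200, 3))
at_risk = (X[:, 2] < -0.3).astype(int)   # simulated ground-truth labels

Xs = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(Xs)

# Map each cluster to the majority class among its members, turning the
# unsupervised clustering into a simple classifier.
labels = {c: int(at_risk[km.labels_ == c].mean() > 0.5) for c in range(2)}
pred = np.array([labels[c] for c in km.labels_])
print("agreement with simulated labels:", (pred == at_risk).mean())
```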

    Meeting the Challenges to Measurement in an Era of Accountability

    Under pressure and support from the federal government, states have increasingly turned to indicators based on student test scores to evaluate teachers and schools, as well as students themselves. The focus thus far has been on test scores in subject areas with a sequence of consecutive tests, such as mathematics or English/language arts in grades 4-8. Teachers in these subject areas, however, constitute less than thirty percent of a district's teacher workforce. Comparatively little has been written about the measurement of achievement in the other grades and subjects. This volume seeks to remedy this imbalance by focusing on the assessment of student achievement in a broad range of grade levels and subject areas, with particular attention to its use in the evaluation of teachers and schools. It addresses traditional end-of-course tests as well as alternative measures such as portfolios, exhibitions, and student learning objectives. In each case, issues related to design and development, psychometric considerations, and validity challenges are covered from both a generic and a content-specific perspective. The NCME Applications of Educational Measurement and Assessment series comprises edited volumes designed to inform research-based applications of educational measurement and assessment. Edited by leading experts, these books are comprehensive and practical resources on the latest developments in the field. The NCME series editorial board comprises Michael J. Kolen (chair), Robert L. Brennan, Wayne Camara, Edward H. Haertel, Suzanne Lane, and Rebecca Zwick.

    Preadmission academic achievement criteria as predictors of nursing program completion and NCLEX-RN success

    Admission policies and practices in higher education, including those in nursing programs, are diverse; yet administrators have traditionally relied upon preadmission academic achievement to select qualified students. Higher education administrators have a responsibility to serve the institution and all of its constituents, and sound admission policies, together with regular, systematic evaluation of those policies, are important aspects of that service. The nursing shortage and limited resources have pressed nursing schools to implement innovative strategies to increase the number of qualified graduates. State University's School of Nursing has used a score sheet to rank associate degree nursing applicants since 1984. The preadmission score sheet includes cumulative GPA, standardized test scores, prerequisite and support course grades, and LPN (licensed practical nurse) licensure. Students cannot become registered nurses unless they complete the nursing program and pass the National Council Licensure Examination for Registered Nurses (NCLEX-RN). The purpose of this study was to determine the ability of various preadmission academic achievement-related variables to predict nursing program completion and NCLEX-RN success. The sample consisted of 294 students admitted to the State University associate degree nursing program in the Fall of 2005, 2006, and 2007. Logistic regression models were used to determine which preadmission academic achievement variables were most predictive of program completion and NCLEX-RN success. TEAS science scores were predictive of both program completion and NCLEX-RN success. TEAS reading scores were predictive of NCLEX-RN success but not program completion. Science GPA was predictive of program completion, and health-related coursework GPA was predictive of NCLEX-RN success. Demographic factors were also evaluated for their ability to predict success; of those variables, student type (traditional versus nontraditional) was predictive of both outcome variables, with nontraditional students the most likely to succeed. Specific recommendations were presented for policy and future research. This study suggested greater emphasis on variables predictive of student success in admission policy, caution when using test scores without context for admission decisions, and variety when selecting the measures used to rank applicants. It also suggested that the largest share of variance in student success remains unexplained, and it presented recommendations for replication and expansion of the study.
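    As a hedged sketch of the analytic approach described, the following fits a logistic regression of a binary completion outcome on a few preadmission variables. Predictor names, coefficients, and data are simulated for illustration, not the study's records:

```python
# Minimal sketch: logistic regression predicting a binary outcome
# (e.g., program completion) from preadmission variables. All variable
# names and data here are hypothetical, simulated stand-ins.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 294
teas_science = rng.normal(70, 10, n)     # simulated TEAS science score
science_gpa = rng.normal(3.0, 0.5, n)    # simulated science GPA
nontraditional = rng.integers(0, 2, n)   # simulated student-type flag

# Simulated outcome with assumed (not estimated) effect sizes.
logit_p = -8 + 0.08 * teas_science + 0.9 * science_gpa + 0.5 * nontraditional
completed = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(np.column_stack([teas_science, science_gpa, nontraditional]))
model = sm.Logit(completed, X).fit(disp=0)
print(model.summary(xname=["const", "teas_science", "science_gpa", "nontrad"]))
```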

    Academic and non-academic factors as predictors of early academic success in baccalaureate nursing programs

    Nurses are in very high demand, placing an unprecedented call on higher education institutions and their faculty to produce more graduates. With more students applying to nursing programs and a limited number of slots available, admission has become increasingly competitive. Given these conditions, a trend toward rising admission standards has emerged as program leaders and faculty work to institute sorting methods that select the applicants most likely to succeed in their nursing education programs. Entrance examinations have increasingly been used as a major part of admission criteria, on the assumption that high entrance-test scores correlate with program success. The purpose of this study was twofold: first, to examine selected academic and non-academic variables of first-term nursing majors to identify those that correlate with early in-program success, and second, to compare the predictive efficiency of widely used nursing entrance tests (i.e., NET, TEAS, CCTST, ATI-CTT). The study was a retrospective, descriptive, correlational investigation of 651 baccalaureate nursing students at a single study site. The researcher compiled data from academic student records to examine 18 independent variables for predictive correlation with the criterion variable of term-one success. The analysis showed that two main variables, pre-nursing grade-point average (GPA) and critical thinking test score, predicted 43% to 48% of the variance in the term-one outcome; nursing entrance test scores did not add to the prediction of term-one success. Multiple regression analyses demonstrated the strongest predictive efficiency for the model using pre-nursing GPA and ATI Critical Thinking Test scores. The researcher also found significantly lower term-one pass rates among minority, African American, and English-as-a-second-language students, an area that should be studied further. Using these results, the researcher developed the Early Academic Success (EAS) Prediction model for nursing leaders and faculty interested in investigating predictors of early academic success in their baccalaureate programs.
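    The incremental-variance question above can be illustrated with a small model comparison: does adding an entrance-test score raise R-squared beyond pre-nursing GPA and a critical-thinking score? Everything below is simulated for illustration and does not reproduce the study's data or coefficients:

```python
# Hedged sketch of the model-comparison logic: compare R² of a base model
# (GPA + critical-thinking score) against a full model that adds an
# entrance-exam score. All names and data are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 651
gpa = rng.normal(3.2, 0.4, n)        # simulated pre-nursing GPA
ctt = rng.normal(75, 8, n)           # simulated critical-thinking test score
entrance = rng.normal(60, 10, n)     # simulated entrance-exam score (no signal)
term_one = 0.6 * gpa + 0.02 * ctt + rng.normal(0, 0.3, n)

base = LinearRegression().fit(np.column_stack([gpa, ctt]), term_one)
full = LinearRegression().fit(np.column_stack([gpa, ctt, entrance]), term_one)

r2_base = base.score(np.column_stack([gpa, ctt]), term_one)
r2_full = full.score(np.column_stack([gpa, ctt, entrance]), term_one)
print(f"R² base {r2_base:.3f} -> full {r2_full:.3f} "
      f"(increment {r2_full - r2_base:.3f})")
```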

    Classroom Assessment and Educational Measurement

    Classroom Assessment and Educational Measurement explores the ways in which the theory and practice of both educational measurement and the assessment of student learning in classroom settings mutually inform one another. Chapters by assessment and measurement experts consider the nature of classroom assessment information, from student achievement to affective and socio-emotional attributes; how teachers interpret and work with assessment results; and emerging issues in assessment such as digital technologies and diversity/inclusion. This book uniquely considers the limitations of applying large-scale educational measurement theory to classroom assessment and the adaptations necessary to make this transfer useful. Researchers, graduate students, industry professionals, and policymakers will come away with an essential understanding of how the classroom assessment context is essential to broadening contemporary educational measurement perspectives.