66 research outputs found
Parameters and Structure of Neural Network Databases for Assessment of Learning Outcomes
The purpose of this study is to determine the methodology, develop a construction theory, algorithmize and implement the functionality of a hybrid intelligent system for assessing trainees' educational outcomes, based on identified keyword parameters and the structure of an artificial neural network, using expert systems and fuzzy simulation; to develop a methodology for constructing structural-logic, hierarchical, functional and fractal schemes for structuring databases of the didactic field of learning elements; and to determine the content and structure of parameters and database components, selection criteria, and the content of complexes of educational standards. The methodology for introducing intelligent systems into mathematical education rests on the Hegelian triad: thesis (implementation of the coherence principle), antithesis (implementation of the principles of fractality and historiogenesis), and synthesis (implementation of the principles of self-organization and reflection of the complex system's inversion integrity). Requirements for organizing and constructing an artificial neural network for assessing personal achievements on the basis of fuzzy simulation have been developed. Using elements of fractal geometry, the technological structures of the clusters that constitute the basis of generalized structures have been developed. In particular, the study reveals that the didactic field of learning elements is equipped with a system of multi-level hierarchical databases of exercises and of motivational-applied, research and practice-oriented tasks, using expert systems and the integration of mathematical, information, natural-science and humanities knowledge and procedures.
Identifiable Cognitive Diagnosis with Encoder-decoder for Modelling Students' Performance
Cognitive diagnosis aims to diagnose students' knowledge proficiencies based on their response scores on exam questions, which underpins many domains such as computerized adaptive testing. Existing cognitive diagnosis models (CDMs) follow a proficiency-response paradigm, which views diagnostic results as learnable embeddings that cause students' responses and learns those results through optimization. However, this paradigm can easily lead to unidentifiable diagnostic results and to explainability overfitting, both of which are harmful to the quantification of students' learning performance. To address these problems, we propose a novel identifiable cognitive diagnosis framework. Specifically, we first propose a flexible diagnostic module that directly diagnoses identifiable and explainable examinee traits and question features from response logs. Next, we leverage a general predictive module to reconstruct response logs from the diagnostic results, ensuring the preciseness of those results. We further propose an implementation of the framework, ID-CDM, to demonstrate its feasibility. Finally, we demonstrate the identifiability, explainability and preciseness of ID-CDM's diagnostic results through experiments on four public real-world datasets.
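The diagnose-then-reconstruct idea in this abstract can be sketched with a deterministic toy example. This is not ID-CDM's actual neural encoder-decoder, and the response log and Q-matrix below are hypothetical: the "encoder" here maps response logs directly to concept-level traits (so the diagnosis is identifiable by construction), and the "decoder" reconstructs expected responses from those traits.

```python
import numpy as np

# Hypothetical response log: rows = students, columns = questions (1 = correct).
responses = np.array([
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
])
# Hypothetical Q-matrix: rows = questions, columns = knowledge concepts.
q_matrix = np.array([
    [1, 0],
    [1, 1],
    [0, 1],
    [1, 0],
])

def encode_traits(responses, q_matrix):
    """'Encoder': map response logs directly to concept-level proficiencies,
    as the fraction of correctly answered questions touching each concept."""
    hits = responses @ q_matrix        # correct answers per concept
    totals = q_matrix.sum(axis=0)      # questions touching each concept
    return hits / totals

def decode_responses(traits, q_matrix):
    """'Decoder': reconstruct expected response probabilities from traits,
    as the mean proficiency over the concepts each question requires."""
    weights = q_matrix / q_matrix.sum(axis=1, keepdims=True)
    return traits @ weights.T

traits = encode_traits(responses, q_matrix)          # shape (students, concepts)
reconstruction = decode_responses(traits, q_matrix)  # shape (students, questions)
```

Because the traits are a fixed function of the observed logs rather than free embeddings fit by optimization, two runs on the same data always produce the same diagnosis, which is the identifiability property the abstract emphasizes.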
Integrating Timing Considerations to Improve Testing Practices
Integrating Timing Considerations to Improve Testing Practices synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring. One of the major advantages of computer-based testing is the capability to passively record test-taking metadata—including how examinees use time and how time affects testing outcomes. This has opened many questions for testing administrators. Is there a trade-off between speed and accuracy in test taking? What considerations should influence equitable decisions about extended-time accommodations? How can test administrators use timing data to balance the costs and resulting validity of tests administered at commercial testing centers? In this comprehensive volume, experts in the field discuss the impact of timing considerations, constraints, and policies on valid score interpretations; administrative accommodations, test construction, and examinees’ experiences and behaviors; and how to implement the findings into practice. These 12 chapters provide invaluable resources for testing professionals to better understand the inextricable links between effective time allocation and the purposes of high-stakes testing
I’ve (Urn)ed This: An Application and Criterion-based Evaluation of the Urnings Algorithm
There is increased interest in personalized learning and in making e-learning environments more adaptable. Some e-learning systems use an Item Response Theory (IRT)-based assessment system. An important distinction between assessment and learning contexts is that learner proficiency is expected to remain constant across an assessment, while it is expected to change over time in a learning context. Constant learner proficiency during an assessment enables conventional approaches to estimating person and item parameters with IRT. These IRT-based systems could be abandoned in favor of alternative approaches to modeling learners and system learning content, but assessments may serve more functions than adapting learning material to students. This raises the question: how can e-learning systems with IRT-based assessment components adapt their learning content more dynamically? Is there a solution that leverages IRT for adapting the learning content of the system? A promising solution is the Urnings algorithm. Like other candidate algorithms it is computationally light, but it also has mechanisms for preventing variance inflation and is well suited to e-learning contexts. It also provides a measure of uncertainty around its estimates. It has been studied both through simulations and through applications to e-learning systems. Results are promising; however, the Urnings algorithm has not yet been applied in an e-learning context where conventionally estimated person parameters are available for comparison with the algorithm's estimates. This study addresses this gap by applying the Urnings algorithm to a K–8 reading and mathematics learning platform. In data from this platform, we have person parameter estimates across academic years from an in-system diagnostic assessment. Results from this study will help industry researchers understand the feasibility of the Urnings algorithm for large e-learning systems with IRT-based assessment components.
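The urn intuition behind the rating scheme described above can be sketched as follows. This is an illustrative simplification, not the published Urnings algorithm: the real update pairs a simulated outcome with the observed one and uses an acceptance step to keep estimates unbiased and prevent variance inflation, and it updates item urns symmetrically. The urn size and true success rate below are hypothetical.

```python
import random

def urn_update(green, total, outcome, rng):
    """One simplified urn-style update: draw a ball at random and replace it
    with a green ball on a success (outcome=1) or a red ball on a failure.
    The green fraction green/total then tracks the owner's success rate,
    while the fixed urn size bounds how fast the estimate can move."""
    drawn_green = 1 if rng.random() < green / total else 0
    return green - drawn_green + outcome

rng = random.Random(42)
green, total = 32, 64          # start the learner at 0.5 proficiency
for _ in range(5000):
    outcome = 1 if rng.random() < 0.8 else 0   # hypothetical true rate 0.8
    green = urn_update(green, total, outcome, rng)

estimate = green / total        # hovers near the true success rate
```

The urn size plays the role the abstract attributes to the uncertainty measure: a larger urn gives a more stable estimate with slower adaptation, so the proportion's binomial spread quantifies how much the current estimate can be trusted.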
Advancing Human Assessment: The Methodological, Psychological and Policy Contributions of ETS
This book describes the extensive contributions made toward the advancement of human assessment by scientists from one of the world's leading research institutions, Educational Testing Service. The book's four major sections detail research and development in measurement and statistics, education policy analysis and evaluation, scientific psychology, and validity. Many of the developments presented have become de facto standards in educational and psychological measurement, including in item response theory (IRT), linking and equating, differential item functioning (DIF), and educational surveys like the National Assessment of Educational Progress (NAEP), the Programme for International Student Assessment (PISA), the Progress in International Reading Literacy Study (PIRLS) and the Trends in International Mathematics and Science Study (TIMSS). In addition to its comprehensive coverage of contributions to the theory and methodology of educational and psychological measurement and statistics, the book gives significant attention to ETS work in cognitive, personality, developmental, and social psychology, and to education policy analysis and program evaluation. The chapter authors are long-standing experts who provide broad coverage and thoughtful insights that build upon decades of experience in research and best practices for measurement, evaluation, scientific psychology, and education policy analysis. Opening with a chapter on the genesis of ETS and closing with a synthesis of the enormously diverse set of contributions made over its 70-year history, the book is a useful resource for all interested in the improvement of human assessment.
Comparative Study of Algorithms That Measure the Precision of the Item Selection System for Computerized Adaptive Tests
According to the definition of the Real Academia Española, a test is an examination designed to evaluate knowledge or aptitudes, in which the correct answer must be chosen from several previously fixed options. Tests may be oral or written, with a fixed or flexible structure, depending on what is to be evaluated, how it is to be evaluated, and the context in which the assessment is applied.
Assessments are generally the most common and effective way to evaluate a student's knowledge or ability. Traditional assessments do not always meet the need to discriminate among students' knowledge, and attributes such as test completion time and the degree of difficulty of a test are hard to control.
Computerized adaptive tests (known by the Spanish acronym TAI) have proven more effective and efficient than traditional paper-and-pencil tests for several reasons:
● Students usually have to answer fewer questions when assessed with a computerized adaptive test, since they are only administered items appropriate to their knowledge level, whereas on a traditional test a student may face very difficult questions or rush through very easy ones.
● Because computerized adaptive tests are taken online, students receive test feedback sooner.
● Computerized adaptive tests provide uniformly precise scores for most examinees (Thissen and Mislevy, 2000). In contrast, traditional tests tend to be very precise when evaluating students of average ability and less accurate for students whose ability is very high or very low.
With user interactivity and adaptability, computerized adaptive testing extends the possibilities of testing beyond the limitations of traditional paper-and-pencil tests. Includes bibliographical references (pages 53-55).
- …