7,508 research outputs found

    Nonlinear sequential designs for logistic item response theory models with applications to computerized adaptive tests

    Full text link
    Computerized adaptive testing is becoming increasingly popular due to advances in modern computer technology. It differs from conventional standardized testing in that the selection of test items is tailored to each individual examinee's ability level. Arising from this selection strategy is a nonlinear sequential design problem. In this paper we study the sequential design problem in the context of logistic item response theory models. We show that the adaptive design obtained by maximizing the item information leads to a consistent and asymptotically normal ability estimator in the case of the Rasch model. Modifications to the maximum information approach are proposed for the two- and three-parameter logistic models. Similar asymptotic properties are established for the modified designs and the resulting estimator. Examples are also given in the case of the two-parameter logistic model to show that, without such modifications, the maximum likelihood estimator of the ability parameter may not be consistent. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org/) at http://dx.doi.org/10.1214/08-AOS614
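The maximum-information rule the abstract refers to is easy to sketch for the Rasch model: an item's Fisher information at ability theta is P(1 - P), which is largest when the item's difficulty is closest to the current ability estimate. Below is a minimal illustrative sketch in Python; the item bank, difficulty values, and function names are hypothetical and not taken from the paper.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta: I = P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def select_next_item(theta_hat, bank, administered):
    """Index of the unadministered item with maximum information at theta_hat
    (for the Rasch model: the item whose difficulty is nearest theta_hat)."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, bank[i]))

# Hypothetical item bank: each entry is a difficulty parameter b.
bank = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]

# With a current ability estimate of 0.4 and item 3 (b = 0.5) already
# administered, the rule picks the next most informative item.
nxt = select_next_item(0.4, bank, administered={3})
```

Note that maximizing Rasch information reduces to choosing the difficulty nearest the interim ability estimate; the modifications the paper proposes for the two- and three-parameter models are not shown here.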

    Psychometrics in Practice at RCEC

    Get PDF
    A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically. All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The topics were selected and edited so that the book should be of special interest to educational researchers, psychometricians and practitioners in educational assessment.

    The Road Ahead for State Assessments

    Get PDF
    The adoption of the Common Core State Standards offers an opportunity to make significant improvements to the large-scale statewide student assessments that exist today, and the two US DOE-funded assessment consortia -- the Partnership for the Assessment of Readiness for College and Careers (PARCC) and the SMARTER Balanced Assessment Consortium (SBAC) -- are making big strides forward. But to take full advantage of this opportunity, the states must focus squarely on making assessments both fair and accurate. A new report commissioned by the Rennie Center for Education Research & Policy and Policy Analysis for California Education (PACE), The Road Ahead for State Assessments, offers a blueprint for strengthening assessment policy, pointing out how new technologies are opening up new possibilities for fairer, more accurate evaluations of what students know and are able to do. Not all of the promises can yet be delivered, but the report provides a clear set of assessment-policy recommendations. The Road Ahead for State Assessments includes three papers on assessment policy. The first, by Mark Reckase of Michigan State University, provides an overview of computer adaptive assessment. Computer adaptive assessment is an established technology that offers detailed information on where students are on a learning continuum rather than a summary judgment about whether or not they have reached an arbitrary standard of "proficiency" or "readiness."
Computer adaptivity will support the fair and accurate assessment of English learners (ELs) and lead to a serious engagement with the multiple dimensions of "readiness" for college and careers. The second and third papers give specific attention to two areas in which we know that current assessments are inadequate: assessments in science and assessments for English learners. In science, paper-and-pencil, multiple-choice tests provide only weak and superficial information about students' knowledge and skills -- most specifically about their abilities to think scientifically and actually do science. In their paper, Chris Dede and Jody Clarke-Midura of Harvard University illustrate the potential for richer, more authentic assessments of students' scientific understanding with a case study of a virtual performance assessment now under development at Harvard. With regard to English learners, administering tests in English to students who are learning the language, or to speakers of non-standard dialects, inevitably confounds students' content knowledge with their fluency in Standard English, to the detriment of many students. In his paper, Robert Linquanti of WestEd reviews key problems in the assessment of ELs and identifies the essential features of an assessment system equipped to provide fair and accurate measures of their academic performance. The report's contributors offer deeply informed recommendations for assessment policy, but three are especially urgent. First, build a system that ensures continued development of and increased reliance on computer adaptive testing. Computer adaptive assessment provides the essential foundation for a system that can produce fair and accurate measurement of English learners' knowledge and of all students' knowledge and skills in science and other subjects.
Developing computer adaptive assessments is a necessary intermediate step toward a system that makes assessment more authentic by tightly linking its tasks to instructional activities and ultimately embedding assessment in instruction. It is vital for both consortia to keep these goals in mind, even in light of current technological and resource constraints. Second, integrate the development of new assessments with assessments of English language proficiency (ELP). The next generation of ELP assessments should take into consideration an English learner's specific level of proficiency in English. They will need to be based on ELP standards that sufficiently specify the target academic language competencies that English learners need to progress in, and gain mastery of, the Common Core Standards. One of the report's authors, Robert Linquanti, states: "Acknowledging and overcoming the challenges involved in fairly and accurately assessing ELs is integral and not peripheral to the task of developing an assessment system that serves all students well. Treating the assessment of ELs as a separate problem -- or, worse yet, as one that can be left for later -- calls into question the basic legitimacy of assessment systems that drive high-stakes decisions about students, teachers, and schools." Third, include virtual performance assessments as part of comprehensive state assessment systems. Virtual performance assessments have considerable promise for measuring students' inquiry and problem-solving skills in science and in other subject areas, because authentic assessment can be closely tied to or even embedded in instruction. The simulation of authentic practices in settings similar to the real world opens the way to assessment of students' deeper learning and their mastery of 21st century skills across the curriculum.
We are just setting out on the road toward assessments that ensure fair and accurate measurement of performance for all students, and support for sustained improvements in teaching and learning. Developing assessments that realize these goals will take time, resources and long-term policy commitment. PARCC and SBAC are taking the essential first steps down a long road, and new technologies have begun to illuminate what's possible. This report seeks to keep policymakers' attention focused on the road ahead, to ensure that the choices they make now move us further toward the goal of college and career success for all students. This publication was released at an event on May 16, 2011.

    Systems identification and application systems development for monitoring the physiological and health status of crewmen in space

    Get PDF
    The use of automated, analytical techniques to aid medical support teams is suggested. Recommendations are presented for characterizing crew health in terms of: (1) whole-body function, including physiological, psychological and performance factors; (2) a combination of critical performance indexes which consist of multiple factors of measurable parameters; (3) specific responses to low noise level stress tests; and (4) probabilities of future performance based on present and periodic examination of past performance. A concept is proposed for a computerized real-time biomedical monitoring and health care system that would have the capability to integrate monitored data, detect off-nominal conditions based on current knowledge of spaceflight responses, predict future health status, and assist in diagnosis and alternative therapies. Mathematical models could play an important role in this approach, especially when operating in a real-time mode. Recommendations are presented to update the present health monitoring systems in terms of recent advances in computer technology and biomedical monitoring systems.

    Turkish Prospective Teachers Perspective of Different Types of Exams: Multiple Choice, Essay and Computerized-type Testing

    Get PDF
    The major aim of this study was to compare prospective teachers’ attitudes toward teacher-made essay and multiple-choice exams versus computerized exams. The study was conducted on a sample of 393 prospective teachers (33 from physics education, 93 from science education, 66 from computer education, and 201 from elementary education departments) at Necatibey Faculty of Education in Balıkesir, Turkey, who were administered a test attitude inventory specifically designed to assess prospective teachers’ attitudes toward essay, multiple-choice and computerized formats on a variety of critical dimensions. The data from the study pointed to more favorable attitudes towards multiple-choice exams compared to essay and computerized formats on most dimensions assessed. However, prospective teachers generally did not want to choose one type over another; they were willing to use several assessment types together, or a combination of at least two types (multiple choice and essay). Many see the computerized exam as a more contemporary approach than the others, and many have a positive attitude toward using it in their future teaching. Nevertheless, many still do not find the computerized exam convenient or comfortable to use yet.

    Principles and practice of on-demand testing

    Get PDF

    WorkKeys Mathematics Skill-Level Scores as Predictors for Placement into College-Level Mathematics

    Get PDF
    A study to determine the relationship between WorkKeys applied mathematics skill-level scores and COMPASS/ESL mathematics placement scores as predictors for placement into college-level mathematics.

    Testing in the Professions

    Get PDF
    Testing in the Professions focuses on current practices in credentialing testing as a guide for practitioners. With a broad focus on the key components, issues, and concerns surrounding the test development and validation process, this book brings together a wide range of research and theory—from design and analysis of tests to security, scoring, and reporting. Written by leading experts in the field of measurement and assessment, each chapter includes authentic examples of how various practices are implemented or current issues observed in credentialing programs. The volume begins with an exploration of the various types of credentialing programs as well as key differences in the interpretation and evaluation of test scores. The next set of chapters discusses key test development steps, including test design, content development, analysis, and evaluation. The final set of chapters addresses specific topics that span the testing process, including communication with stakeholders, security, program evaluation, and legal principles. As a response to the growing number of professions and professional designations that are tied to testing requirements, Testing in the Professions is a comprehensive source for up-to-date measurement and credentialing practices.

    Facilitating Variable-Length Computerized Classification Testing Via Automatic Racing Calibration Heuristics

    Get PDF
    Thesis (Ph.D.) - Indiana University, School of Education, 2015. Computer Adaptive Tests (CATs) have been used successfully with standardized tests. However, CATs are rarely practical for assessment in instructional contexts, because large numbers of examinees are required a priori to calibrate items using item response theory (IRT). Computerized Classification Tests (CCTs) provide a practical alternative to IRT-based CATs. CCTs show promise for instructional contexts, since many fewer examinees are required for item parameter estimation. However, there is a paucity of clear guidelines indicating when items are sufficiently calibrated in CCTs. Is there an efficient and accurate CCT algorithm which can estimate item parameters adaptively? Automatic Racing Calibration Heuristics (ARCH) was invented as a new CCT method and was empirically evaluated in two studies. Monte Carlo simulations were run on previous administrations of a computer literacy test, consisting of 85 items answered by 104 examinees. Simulations resulted in determination of thresholds needed by the ARCH method for parameter estimates. These thresholds were subsequently used in 50 sets of computer simulations in order to compare the accuracy and efficiency of ARCH with the sequential probability ratio test (SPRT) and with an enhanced method called EXSPRT. In the second study, 5,729 examinees took an online plagiarism test, where ARCH was implemented in parallel with SPRT and EXSPRT for comparison. Results indicated that new statistics were needed by ARCH to establish thresholds and to determine when ARCH could begin. The ARCH method resulted in test lengths significantly shorter than SPRT, and slightly longer than EXSPRT, without sacrificing accuracy of classification of examinees as masters and nonmasters. This research was the first of its kind in evaluating the ARCH method. ARCH appears to be a viable CCT method, which could be particularly useful in massive open online courses (MOOCs).
Additional studies with different test content and contexts are needed.
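The SPRT baseline that ARCH and EXSPRT were compared against can be sketched as Wald's sequential probability ratio test: each response updates a running log-likelihood ratio of 'master' versus 'nonmaster' proportion-correct hypotheses, and testing stops as soon as either Wald bound is crossed. The cut proportions, error rates, and function name below are hypothetical illustration values, not the settings used in the dissertation.

```python
import math

def sprt_classify(responses, p_master=0.85, p_nonmaster=0.60,
                  alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test for mastery classification.

    Walks through the responses in order, accumulating the log-likelihood
    ratio of 'master' vs 'nonmaster', and stops as soon as either Wald
    bound is crossed. Returns 'master', 'nonmaster', or 'continue'.
    """
    upper = math.log((1.0 - beta) / alpha)   # cross above -> classify master
    lower = math.log(beta / (1.0 - alpha))   # cross below -> classify nonmaster
    llr = 0.0
    for correct in responses:
        if correct:
            llr += math.log(p_master / p_nonmaster)
        else:
            llr += math.log((1.0 - p_master) / (1.0 - p_nonmaster))
        if llr >= upper:
            return "master"
        if llr <= lower:
            return "nonmaster"
    return "continue"  # bounds not crossed: administer another item
```

With these illustrative settings, nine consecutive correct answers are enough to classify an examinee as a master, and four consecutive incorrect answers classify a nonmaster; variable-length methods such as EXSPRT and ARCH aim to shorten such tests further without sacrificing classification accuracy.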
