19,300 research outputs found
Influence of Context on Item Parameters in Forced-Choice Personality Assessments
A fundamental assumption in computerized adaptive testing (CAT) is that item parameters are invariant with respect to context, that is, the items surrounding the administered item. This assumption, however, may not hold in forced-choice (FC) assessments, where explicit comparisons are made between items included in the same block. We empirically examined the influence of context on item parameters by comparing parameter estimates from two FC instruments. The first instrument was composed of blocks of three items, whereas in the second, the context was manipulated by adding one item to each block, resulting in blocks of four. The item parameter estimates were highly similar. However, a small number of significant deviations were observed, confirming the importance of context when designing adaptive FC assessments. Two patterns of such deviations were identified, and methods to reduce their occurrence in an FC CAT setting were proposed. It was shown that with a small proportion of violations of the parameter invariance assumption, score estimation remained stable.
Psychometrics in Practice at RCEC
A broad range of topics is dealt with in this volume: from combining the psychometric generalizability and item response theories to the ideas for an integrated formative use of data-driven decision making, assessment for learning and diagnostic testing. A number of chapters pay attention to computerized (adaptive) and classification testing. Other chapters treat the quality of testing in a general sense, but for topics like maintaining standards or the testing of writing ability, the quality of testing is dealt with more specifically.
All authors are connected to RCEC as researchers. They present one of their current research topics and provide some insight into the focus of RCEC. The selection of the topics and the editing intends that the book should be of special interest to educational researchers, psychometricians and practitioners in educational assessment
Optimizing the assessment of suicidal behavior: the application of curtailment techniques
Background:
Given their length, commonly used scales to assess suicide risk, such as the Beck Scale for Suicide Ideation (SSI), are of limited use as screening tools. In the current study we tested whether deterministic and stochastic curtailment can be applied to shorten the 19-item SSI without compromising its accuracy.
Methods:
Data from 366 patients, who were seen by a liaison psychiatry service in a general hospital in Scotland after a suicide attempt, were used. Within 24 h of admission, the SSI was administered; 15 months later, it was determined whether a patient was re-admitted to a hospital as the result of another suicide attempt. We fitted a Receiver Operating Characteristic curve to derive the best cut-off value of the SSI for predicting future suicidal behavior. Using this cut-off, both deterministic and stochastic curtailment were simulated on the item score patterns of the SSI.
Results:
A cut-off value of SSI≥6 provided the best classification accuracy for future suicidal behavior. Using this cut-off, we found that both deterministic and stochastic curtailment reduce the length of the SSI without reducing the accuracy of the final classification decision. With stochastic curtailment, on average, fewer than 8 items are needed to assess whether administration of the full-length test will result in an SSI score below or above the cut-off value of 6.
Limitations:
New studies using other datasets should re-validate the optimal cut-off for risk of repeated suicidal behavior after being treated in a hospital following an attempt.
Conclusions:
Curtailment can be used to simplify the assessment of suicidal behavior, and should be considered as an alternative to the full scale
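The idea of deterministic curtailment described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name, the return labels, and the assumption that each of the 19 items is scored from 0 to 2 are illustrative choices, and only the cut-off of 6 comes from the abstract. Administration stops as soon as the remaining items can no longer change which side of the cut-off the final score falls on.

```python
def deterministic_curtailment(scores, cutoff=6, max_item_score=2):
    """Stop testing once the classification can no longer change.

    scores: item scores in administration order (assumed 0..max_item_score).
    Returns (decision, items_used). Names and labels are hypothetical.
    """
    total = 0
    n_items = len(scores)
    for used, score in enumerate(scores, start=1):
        total += score
        remaining = n_items - used
        # The cut-off is already reached: later items cannot lower the sum.
        if total >= cutoff:
            return "at_or_above", used
        # Even maximum scores on all remaining items cannot reach the cut-off.
        if total + remaining * max_item_score < cutoff:
            return "below", used
    return ("at_or_above" if total >= cutoff else "below"), n_items


if __name__ == "__main__":
    high_risk = [2, 2, 2] + [0] * 16
    low_risk = [0] * 19
    print(deterministic_curtailment(high_risk))
    print(deterministic_curtailment(low_risk))
```

Stochastic curtailment, as described above, would stop earlier by halting when the final classification is merely highly probable rather than certain; that requires a probability model for the unseen items and is not shown here.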
Macaques preferentially attend to visual patterns with higher fractal dimension contours.
Animals' sensory systems evolved to efficiently process information from their environmental niches. Niches often include irregular shapes and rough textures (e.g., jagged terrain, canopy outlines) that must be navigated to find food, escape predators, and master other fitness-related challenges. For most primates, vision is the dominant sensory modality and thus, primates have evolved systems for processing complicated visual stimuli. One way to quantify information present in visual stimuli in natural scenes is evaluating their fractal dimension. We hypothesized that sensitivity to complicated geometric forms, indexed by fractal dimension, is an evolutionarily conserved capacity, and tested this capacity in rhesus macaques (Macaca mulatta). Monkeys viewed paired black and white images of simulated self-similar contours that systematically varied in fractal dimension while their attention to the stimuli was measured using noninvasive infrared eye tracking. They fixated more frequently on, dwelled for longer durations on, and had attentional biases towards images that contain boundary contours with higher fractal dimensions. This indicates that, like humans, they discriminate between visual stimuli on the basis of fractal dimension and may prefer viewing informationally rich visual stimuli. Our findings suggest that sensitivity to fractal dimension may be a wider ability of the vertebrate vision system
Facilitating Variable-Length Computerized Classification Testing Via Automatic Racing Calibration Heuristics
Thesis (Ph.D.) - Indiana University, School of Education, 2015. Computer Adaptive Tests (CATs) have been used successfully with standardized tests. However, CATs are rarely practical for assessment in instructional contexts, because large numbers of examinees are required a priori to calibrate items using item response theory (IRT). Computerized Classification Tests (CCTs) provide a practical alternative to IRT-based CATs. CCTs show promise for instructional contexts, since many fewer examinees are required for item parameter estimation. However, there is a paucity of clear guidelines indicating when items are sufficiently calibrated in CCTs.
Is there an efficient and accurate CCT algorithm which can estimate item parameters adaptively? Automatic Racing Calibration Heuristics (ARCH) was invented as a new CCT method and was empirically evaluated in two studies.
Monte Carlo simulations were run on previous administrations of a computer literacy test, consisting of 85 items answered by 104 examinees. Simulations resulted in determination of thresholds needed by the ARCH method for parameter estimates. These thresholds were subsequently used in 50 sets of computer simulations in order to compare accuracy and efficiency of ARCH with the sequential probability ratio test (SPRT) and with an enhanced method called EXSPRT. In the second study, 5,729 examinees took an online plagiarism test, where ARCH was implemented in parallel with SPRT and EXSPRT for comparison.
Results indicated that new statistics were needed by ARCH to establish thresholds and to determine when ARCH could begin. The ARCH method resulted in test lengths significantly shorter than SPRT, and slightly longer than EXSPRT without sacrificing accuracy of classification of examinees as masters and nonmasters.
This research was the first of its kind in evaluating the ARCH method. ARCH appears to be a viable CCT method, which could be particularly useful in massively open online courses (MOOCs). Additional studies with different test content and contexts are needed
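For readers unfamiliar with the SPRT baseline mentioned in this abstract, the following is a minimal sketch of Wald's sequential probability ratio test applied to mastery classification. It does not implement ARCH or EXSPRT. The probabilities of a correct answer for masters and nonmasters, the error rates, and all names are assumed values chosen for illustration, not figures from the thesis.

```python
import math


def sprt_classify(responses, p_master=0.85, p_nonmaster=0.60,
                  alpha=0.05, beta=0.05):
    """Classify an examinee with Wald's SPRT.

    responses: iterable of booleans (True = correct answer), in order.
    p_master / p_nonmaster: assumed probability of a correct answer
    for each group (hypothetical values).
    Returns (decision, items_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept "master" at or above this
    lower = math.log(beta / (1 - alpha))   # accept "nonmaster" at or below this
    log_ratio = 0.0
    used = 0
    for used, correct in enumerate(responses, start=1):
        if correct:
            log_ratio += math.log(p_master / p_nonmaster)
        else:
            log_ratio += math.log((1 - p_master) / (1 - p_nonmaster))
        if log_ratio >= upper:
            return "master", used
        if log_ratio <= lower:
            return "nonmaster", used
    # Item supply exhausted before either threshold was crossed.
    return "undecided", used


if __name__ == "__main__":
    print(sprt_classify([True] * 20))
    print(sprt_classify([False] * 20))
```

Because every item here shares the same two probabilities, no item-level calibration is needed; methods that estimate item-specific parameters, as the abstract describes, would replace the two constants with per-item values.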
A study on exponential-size neighborhoods for the bin packing problem with conflicts
We propose an iterated local search based on several classes of local and
large neighborhoods for the bin packing problem with conflicts. This problem,
which combines the characteristics of both bin packing and vertex coloring,
arises in various application contexts such as logistics and transportation,
timetabling, and resource allocation for cloud computing. We introduce
evaluation procedures for classical local-search moves, polynomial variants of
ejection chains and assignment neighborhoods, an adaptive set covering-based
neighborhood, and finally a controlled use of 0-cost moves to further diversify
the search. The overall method produces solutions of good quality on the
classical benchmark instances and scales very well with an increase of problem
size. Extensive computational experiments are conducted to measure the
respective contribution of each proposed neighborhood. In particular, the
0-cost moves and the large neighborhood based on set covering contribute very
significantly to the search. Several research perspectives are open in relation
to possible hybridizations with other state-of-the-art mathematical programming
heuristics for this problem. Comment: 26 pages, 8 figures
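To make the problem definition concrete, here is a minimal sketch of a first-fit decreasing heuristic that respects conflicts. This is a simple constructive baseline, not the iterated local search proposed in the paper; the function name and the instance data are hypothetical. It shows the two feasibility conditions the problem combines: the capacity constraint from bin packing and the pairwise incompatibility constraint from vertex coloring.

```python
def first_fit_decreasing_with_conflicts(weights, capacity, conflicts):
    """Pack items into bins so that no bin exceeds capacity and no two
    conflicting items share a bin.

    weights: list of item weights (item i has weight weights[i]).
    conflicts: iterable of (i, j) pairs that may not share a bin.
    Returns a list of bins, each a list of item indices.
    """
    conflict_set = {frozenset(pair) for pair in conflicts}
    # Consider heavier items first; ties keep their original order.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    bins, loads = [], []
    for i in order:
        if weights[i] > capacity:
            raise ValueError(f"item {i} does not fit in any bin")
        for b, items in enumerate(bins):
            fits = loads[b] + weights[i] <= capacity
            compatible = all(frozenset((i, j)) not in conflict_set
                             for j in items)
            if fits and compatible:
                items.append(i)
                loads[b] += weights[i]
                break
        else:
            # No existing bin is feasible, so open a new one.
            bins.append([i])
            loads.append(weights[i])
    return bins


if __name__ == "__main__":
    weights = [4, 3, 3, 2]
    print(first_fit_decreasing_with_conflicts(weights, 6, []))
    print(first_fit_decreasing_with_conflicts(weights, 6, [(0, 3)]))
```

In the small instance above, a single conflict between items 0 and 3 forces a third bin even though the weights alone would fit in two, which is the kind of interaction a local search has to handle.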
Optimal test designs with content balancing and variable target information functions as constraints.
Optimal test design involves the application of an item selection heuristic to construct a test that fits a target information function, so that the standard error of the test can be controlled at different regions of the ability continuum. The real data simulation study assessed the efficiency of binary programming in optimal item selection by comparing how closely the obtained test information approximated different target information functions under binary programming and under a manual heuristic. The effects of imposing a content balancing constraint were studied in conventional, two-stage and adaptive tests designed using the automated procedure. Results showed that the automated procedure improved upon the manual procedure significantly when a uniform target information function was used. However, when a peaked target information function was used, the improvement over the manual procedure was marginal. Both procedures were affected by the distribution of the item parameters in the item pool. The degree to which the examinee empirical scores were recovered was lower when a content balancing constraint was imposed in the conventional test designs. The effect of uneven item parameter distribution in the item pool was shown by the poorer recovery of the empirical scores at the higher regions of the ability continuum. Two-stage tests were shown to limit the effects of content balancing. Content balanced adaptive tests using optimal item selection were shown to be efficient in empirical score recovery, especially in maintaining equiprecision in measurement over a wide ability range despite the imposition of a content balancing constraint in the test design. The study has implications for implementing automated test designs in school systems supported by hardware and expertise in measurement theory, and addresses the issue of content balancing using optimal test designs within an adaptive testing framework.