93 research outputs found
Cut-off values for the applied version of the Beck Depression Inventory in a general working population
Rasch scaling procedures for informing development of a valid Fetal Surveillance Education Program multiple-choice assessment
<p>Abstract</p> <p>Background</p> <p>It is widely recognised that deficiencies in fetal surveillance practice continue to contribute significantly to the burden of adverse outcomes. This has prompted the development of evidence-based clinical practice guidelines by the Royal Australian and New Zealand College of Obstetricians and Gynaecologists and an associated Fetal Surveillance Education Program to deliver the corresponding learning. This article describes initial steps in the validation of a corresponding multiple-choice assessment of the relevant educational outcomes through a combination of item response modelling and expert judgement.</p> <p>Methods</p> <p>The Rasch item response model was employed for item and test analysis and to empirically derive the substantive interpretation of the assessment variable. This interpretation was then compared to the hierarchy of competencies specified a priori by a team of eight subject-matter experts. Classical Test Theory analyses were also conducted.</p> <p>Results</p> <p>A high level of agreement between the hypothesised and derived variable provided evidence of construct validity. Item and test indices from Rasch analysis and Classical Test Theory analysis suggested that the current test form was of moderate quality. However, the analyses made clear the required steps for establishing a valid assessment of sufficient psychometric quality. These steps included: increasing the number of items from 40 to 50 in the first instance, reviewing ineffective items, targeting new items to specific content and difficulty gaps, and formalising the assessment blueprint in light of empirical information relating item structure to item difficulty.</p> <p>Conclusion</p> <p>The application of the Rasch model for criterion-referenced assessment validation with an expert stakeholder group is herein described. Recommendations for subsequent item and test construction are also outlined in this article.</p>
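The dichotomous Rasch model at the core of this abstract's methodology can be illustrated with a minimal sketch. The function names and the logit-based difficulty approximation below are illustrative only; they are not the authors' implementation, and a full Rasch analysis would use joint or conditional maximum likelihood estimation rather than this first-step approximation.

```python
import math

def rasch_prob(theta, b):
    """Probability that a person of ability theta answers an item of
    difficulty b correctly, under the dichotomous Rasch model:
    P = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_difficulty(p_correct):
    """Crude item difficulty on the logit scale from the observed
    proportion correct. This is only a starting approximation used to
    illustrate the logit metric, not a full Rasch calibration."""
    return math.log((1.0 - p_correct) / p_correct)
```

On the logit scale, an item answered correctly by half the sample sits at difficulty 0, and harder items (lower proportion correct) receive higher difficulty values, which is what makes "targeting new items to specific difficulty gaps" a well-defined operation.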
Creating an Instrument to Measure Student Response to Instructional Practices
Background: Calls for the reform of education in science, technology, engineering, and mathematics (STEM) have inspired many instructional innovations, some research based. Yet adoption of such instruction has been slow. Research has suggested that students' response may significantly affect an instructor's willingness to adopt different types of instruction. Purpose: We created the Student Response to Instructional Practices (StRIP) instrument to measure the effects of several variables on student response to instructional practices. We discuss the step-by-step process for creating this instrument. Design/Method: The development process had six steps: item generation and construct development, validity testing, implementation, exploratory factor analysis, confirmatory factor analysis, and instrument modification and replication. We discuss pilot testing of the initial instrument, construct development, and validation using exploratory and confirmatory factor analyses. Results: This process produced 47 items measuring three parts of our framework. Types of instruction separated into four factors (interactive, constructive, active, and passive); strategies for using in-class activities into two factors (explanation and facilitation); and student responses to instruction into five factors (value, positivity, participation, distraction, and evaluation). Conclusions: We describe the design process and final results for our instrument, a useful tool for understanding the relationship between type of instruction and students' response.
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/136692/1/jee20162_am.pdf
Iterative Linking With the Differential Functioning of Items and Tests (DFIT) Method: Comparison of Testwide and Item Parameter Replication (IPR) Critical Values
A Monte Carlo study was conducted to examine the accuracy of differential item functioning (DIF) detection using the differential functioning of items and tests (DFIT) method. Specifically, the performance of DFIT was compared using "testwide" critical values suggested by Flowers, Oshima, and Raju, based on simulations involving large numbers of DIF-free items, with item-specific critical values obtained via the newer item parameter replication (IPR) method. Also examined were the benefits of single-stage, two-stage, and iterative linking for dichotomous and ordered polytomous data involving samples of various size, tests of different length, types and percentages of DIF items, and levels of impact. Overall, the results indicated that testwide and IPR-based critical values corresponding to a nominal alpha of .01 provided similar power for detecting DIF due to shifts in extremity parameters, but IPR power was generally lower when DIF was due to differences in discrimination. In addition, IPR-based critical values provided as good as or better control of Type I error under most conditions, and results improved for all methods when using two-stage and iterative linking rather than single-stage linking and DIF analysis. The implications of these findings for future research involving the DFIT method with various item response models are discussed.
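The DFIT family quantifies item-level DIF with indices such as NCDIF, the expected squared difference between the focal- and reference-group item response functions over the focal ability distribution. A minimal Monte Carlo sketch under the two-parameter logistic model is given below; the function names and parameter values are illustrative assumptions, not the study's simulation design.

```python
import math
import random

def p2pl(theta, a, b):
    """Two-parameter logistic item response function:
    P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ncdif(a_ref, b_ref, a_foc, b_foc, n=10000, seed=0):
    """Monte Carlo approximation of the NCDIF index: the mean squared
    difference between focal- and reference-group response functions,
    averaged over abilities drawn from a standard normal focal
    distribution. Identical parameters give exactly zero."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        d = p2pl(theta, a_foc, b_foc) - p2pl(theta, a_ref, b_ref)
        total += d * d
    return total / n
```

A larger shift in the difficulty parameter between groups yields a larger NCDIF value, which is the kind of "shift in extremity parameters" the abstract reports both critical-value schemes detecting with similar power.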
Development and validation of SEER (Seeking, Engaging with and Evaluating Research): a measure of policymakers' capacity to engage with and use research
- …