93 research outputs found

    Rasch scaling procedures for informing development of a valid Fetal Surveillance Education Program multiple-choice assessment

    Abstract
    Background: It is widely recognised that deficiencies in fetal surveillance practice continue to contribute significantly to the burden of adverse outcomes. This has prompted the development of evidence-based clinical practice guidelines by the Royal Australian and New Zealand College of Obstetricians and Gynaecologists and an associated Fetal Surveillance Education Program to deliver the associated learning. This article describes initial steps in the validation of a corresponding multiple-choice assessment of the relevant educational outcomes through a combination of item response modelling and expert judgement.
    Methods: The Rasch item response model was employed for item and test analysis and to empirically derive the substantive interpretation of the assessment variable. This interpretation was then compared to the hierarchy of competencies specified a priori by a team of eight subject-matter experts. Classical Test Theory analyses were also conducted.
    Results: A high level of agreement between the hypothesised and derived variable provided evidence of construct validity. Item and test indices from Rasch analysis and Classical Test Theory analysis suggested that the current test form was of moderate quality. However, the analyses made clear the required steps for establishing a valid assessment of sufficient psychometric quality. These steps included: increasing the number of items from 40 to 50 in the first instance, reviewing ineffective items, targeting new items to specific content and difficulty gaps, and formalising the assessment blueprint in light of empirical information relating item structure to item difficulty.
    Conclusion: The application of the Rasch model for criterion-referenced assessment validation with an expert stakeholder group is herein described. Recommendations for subsequent item and test construction are also outlined in this article.
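
    The pairing of Rasch analysis with Classical Test Theory described above can be illustrated with a short sketch. The code below is not the study's analysis: the response data, sample size, and the logit-of-facility shortcut for item difficulty are all hypothetical, standing in for a proper Rasch calibration. It simulates dichotomous responses to a 40-item test, recovers crude Rasch-style item difficulties, and computes CTT item-total correlations and Cronbach's alpha.

```python
# Minimal sketch, assuming simulated 0/1 response data; not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def rasch_prob(theta, b):
    """P(correct) under the Rasch model for ability theta and item difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Simulate 200 candidates answering a 40-item test (hypothetical numbers).
n_persons, n_items = 200, 40
theta = rng.normal(0.0, 1.0, n_persons)          # person abilities
b = np.linspace(-2.0, 2.0, n_items)              # item difficulties
p = rasch_prob(theta[:, None], b[None, :])
responses = (rng.random((n_persons, n_items)) < p).astype(int)

# Crude Rasch-style difficulty estimates: the logit of each item's facility.
facility = responses.mean(axis=0).clip(1e-3, 1 - 1e-3)
difficulty_hat = -np.log(facility / (1.0 - facility))

# Classical Test Theory indices: corrected item-total correlations and
# Cronbach's alpha for test reliability.
total = responses.sum(axis=1)
item_total_r = np.array(
    [np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
     for j in range(n_items)]
)
item_var = responses.var(axis=0, ddof=1)
alpha = (n_items / (n_items - 1)) * (1 - item_var.sum() / total.var(ddof=1))

print("estimated difficulties:", np.round(difficulty_hat[:5], 2))
print("item-total correlations:", np.round(item_total_r[:5], 2))
print("Cronbach's alpha: %.2f" % alpha)
```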

    Creating an Instrument to Measure Student Response to Instructional Practices

    Background: Calls for the reform of education in science, technology, engineering, and mathematics (STEM) have inspired many instructional innovations, some research-based. Yet adoption of such instruction has been slow. Research has suggested that students' response may significantly affect an instructor's willingness to adopt different types of instruction.
    Purpose: We created the Student Response to Instructional Practices (StRIP) instrument to measure the effects of several variables on student response to instructional practices. We discuss the step-by-step process for creating this instrument.
    Design/Method: The development process had six steps: item generation and construct development, validity testing, implementation, exploratory factor analysis, confirmatory factor analysis, and instrument modification and replication. We discuss pilot testing of the initial instrument, construct development, and validation using exploratory and confirmatory factor analyses.
    Results: This process produced 47 items measuring three parts of our framework. Types of instruction separated into four factors (interactive, constructive, active, and passive); strategies for using in-class activities into two factors (explanation and facilitation); and student responses to instruction into five factors (value, positivity, participation, distraction, and evaluation).
    Conclusions: We describe the design process and final results for our instrument, a useful tool for understanding the relationship between type of instruction and students' response.
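
    As a rough illustration of the exploratory-factor-analysis step mentioned above, the sketch below simulates Likert-style responses driven by two latent traits and recovers the loading pattern with scikit-learn's FactorAnalysis. The item pool, factor count, and loadings are hypothetical (not the StRIP data), and the varimax rotation argument assumes scikit-learn 0.24 or later.

```python
# Minimal EFA sketch on simulated survey data; not the StRIP development code.
import numpy as np
from sklearn.decomposition import FactorAnalysis  # rotation= needs scikit-learn >= 0.24

rng = np.random.default_rng(1)

# Simulate 300 students answering 12 items driven by two latent traits
# (hypothetically, something like "value" and "participation").
n_students, n_items, n_factors = 300, 12, 2
latent = rng.normal(size=(n_students, n_factors))
true_loadings = np.zeros((n_items, n_factors))
true_loadings[:6, 0] = 0.8      # items 1-6 load on factor 1
true_loadings[6:, 1] = 0.8      # items 7-12 load on factor 2
items = latent @ true_loadings.T + rng.normal(scale=0.5, size=(n_students, n_items))

# Exploratory factor analysis with varimax rotation.
efa = FactorAnalysis(n_components=n_factors, rotation="varimax")
efa.fit(items)
loadings = efa.components_.T    # shape (n_items, n_factors)

for j, row in enumerate(loadings, start=1):
    print(f"item {j:2d} loadings: {np.round(row, 2)}")
```

    In a real instrument study the rotated loadings would be inspected for cross-loading or weak items before moving on to confirmatory factor analysis on a separate sample, as the six-step process above describes.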

    Iterative Linking With the Differential Functioning of Items and Tests (DFIT) Method: Comparison of Testwide and Item Parameter Replication (IPR) Critical Values

    A Monte Carlo study was conducted to examine the accuracy of differential item functioning (DIF) detection using the differential functioning of items and tests (DFIT) method. Specifically, the performance of DFIT was compared using “testwide” critical values suggested by Flowers, Oshima, and Raju, based on simulations involving large numbers of DIF-free items, with item-specific critical values obtained via the newer item parameter replication (IPR) method. Also examined were the benefits of single-stage, two-stage, and iterative linking for dichotomous and ordered polytomous data involving samples of various sizes, tests of different lengths, types and percentages of DIF items, and levels of impact. Overall, the results indicated that testwide and IPR-based critical values corresponding to a nominal alpha of .01 provided similar power for detecting DIF due to shifts in extremity parameters, but IPR power was generally lower when DIF was due to differences in discrimination. In addition, IPR-based critical values provided as good as or better control of Type I error under most conditions, and results improved for all methods when using two-stage and iterative linking rather than single-stage linking and DIF analysis. The implications of these findings for future research involving the DFIT method with various item response models are discussed.
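
    To make the DFIT idea concrete, the simplified sketch below computes a DFIT-style noncompensatory DIF (NCDIF) index for hypothetical 2PL items by averaging the squared difference between focal- and reference-group item response functions over a simulated focal ability distribution. It assumes the item parameters are already on a common metric and omits the linking stages and the testwide/IPR critical values that the study compares.

```python
# Simplified NCDIF sketch with hypothetical 2PL parameters; not the full DFIT/IPR procedure.
import numpy as np

rng = np.random.default_rng(2)

def irf_2pl(theta, a, b):
    """2PL item response function: P(correct | theta) with discrimination a, difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Item parameters assumed to be on a common metric (i.e., already linked).
a_ref, b_ref = np.array([1.2, 0.9, 1.5]), np.array([-0.5, 0.0, 0.8])
a_foc, b_foc = np.array([1.2, 0.9, 0.7]), np.array([-0.5, 0.4, 0.8])
# Item 2 differs in difficulty (uniform DIF); item 3 differs in discrimination.

theta_focal = rng.normal(0.0, 1.0, 10_000)   # simulated focal-group abilities

# NCDIF: mean squared difference between the two groups' expected item scores,
# taken over the focal group's ability distribution.
ncdif = np.mean(
    (irf_2pl(theta_focal[:, None], a_foc, b_foc)
     - irf_2pl(theta_focal[:, None], a_ref, b_ref)) ** 2,
    axis=0,
)
print("NCDIF per item:", np.round(ncdif, 4))
```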
