Standardized conceptual assessments are widely used tools for
educational researchers interested in student learning within the standard
undergraduate physics curriculum. For example, these assessments are often used
to measure student learning across educational contexts and instructional
strategies. However, to support the large-scale implementation often required
for cross-institutional testing, these instruments must have
question formats that facilitate easy grading. Previously, we created a
multiple-response version of an existing, validated, upper-division
electrostatics diagnostic with the goal of increasing the instrument's
potential for large-scale implementation. Here, we report on the validity and
reliability of this new version as an independent instrument. These findings
establish the validity of the multiple-response version as measured by multiple
test statistics including item difficulty, item discrimination, and internal
consistency. Moreover, we demonstrate that the majority of student responses to
the new version are internally consistent even when they are incorrect, and
provide an example of how the new format can be used to gain insight into
student difficulties with specific content in electrostatics.

Comment: 8 pages, 6 figures, submitted to Phys. Rev. ST-PE