National curriculum assessment: how to make it better

By Dylan Wiliam


In a series of papers over the last ten years, I have outlined various problems affecting the assessment of the national curriculum in England, which are the subject of a critique by Paul Newton (this issue). In responding to this critique, I acknowledge that his summary of my position is fair, and agree that, by the standards of analytic rationality, the evidence for some of the problems I identify is not compelling. However, I argue that by standards of reasonableness (e.g. on the balance of probabilities) the evidence is sufficiently serious to warrant a re-examination of national curriculum assessment, and the alternatives. In particular, I argue that the current system provides assessments that are not sufficiently reliable for the inferences that are made on the basis of the results, and that it has also caused a narrowing of the curriculum. I propose that the first of these weaknesses can be addressed through the increased use of teacher assessment, and the second by testing a greater proportion of the curriculum. In order to effect these changes without increasing the burden on students and teachers, I propose that these two changes are combined in the form of a light sampling scheme, which would both increase reliability and minimise curricular backwash, although the price paid for this would be the lack of a direct, transparent and objective link between the results achieved by individual students on tests and the reported levels of a school’s performance.

Year: 2003