
Exam marking consistency: Ofqual must publish the data

Around 330,000 GCSE, AS and A level grade challenges in 2024 uncovered some 220,000 marking errors, resulting in almost 75,000 grade changes. Given the numbers involved, Ofqual must begin routinely publishing marking consistency metrics, says Dennis Sherwood.
What if? More than one in five exam grades challenged in 2024 were changed (almost all upgraded). This raises the question: how many unchallenged exam grades might also have been changed had they been challenged? (Image: Adobe Stock)

In December, exams watchdog Ofqual published its statistics relating to reviews of marking and moderation for the summer 2024 GCSE, AS and A level exams in England (Ofqual, 2024).

As shown in the left-hand column of the table below (figure 1), the overall number of grades changed, expressed as a percentage of entries (1.2%), has been remarkably constant each year since 2016, when Ofqual introduced the current rules for appeals.

As with all statistics, it is always wise to look behind the numbers, and doing so reveals that the number of grades changed in 2024 (74,665) is indeed 1.2% of the number of grades awarded in 2024 (6,423,785).

However, since a grade can be changed only if it has been challenged, surely a more valid measure is to compare the number of grade changes to the number of grades challenged (331,330).

This calculation reveals that about 23% of challenges resulted in a grade change – almost all of which (99.6%) were upgrades.

So more than one challenge in every five “wins”. That number too has been pretty consistent since 2016.
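The two percentages above can be checked with a few lines of arithmetic. The sketch below uses only the 2024 figures quoted in this article; the variable and function names are illustrative and do not come from Ofqual's data tables.

```python
# Summer 2024 figures as quoted in the article (Ofqual, 2024).
grades_awarded = 6_423_785
grades_challenged = 331_330
grades_changed = 74_665


def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Grade changes measured against every grade awarded.
changed_vs_awarded = percentage(grades_changed, grades_awarded)

# Grade changes measured against only those grades that were challenged.
changed_vs_challenged = percentage(grades_changed, grades_challenged)

print(f"Changes as a share of grades awarded:    {changed_vs_awarded:.1f}%")
print(f"Changes as a share of grades challenged: {changed_vs_challenged:.1f}%")
```

The same numerator gives a very different picture depending on the denominator chosen, which is the point of the comparison.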

 

Figure 1: Statistics relating to GCSE, AS and A level exam grade challenges and grade changes for exams in England from 2016 to 2024, excluding the Covid years of 2020 and 2021 (Source: Ofqual, 2024)

 

Also important in this context is a figure not explicitly published, but that can be computed: the number of grades that are not challenged (6,092,455). This equates to some 95% of awards, another long-time constant.

This figure prompts a key question: How many of those unchallenged grades might have been changed if only they had been challenged? No-one knows. No-one has looked.
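The unpublished figure can be derived by simple subtraction. This is a minimal sketch based on the two 2024 totals quoted above; the variable names are illustrative only.

```python
# Summer 2024 figures as quoted in the article (Ofqual, 2024).
grades_awarded = 6_423_785
grades_challenged = 331_330

# Grades that were never challenged, and so never re-examined.
grades_unchallenged = grades_awarded - grades_challenged
unchallenged_share = 100 * grades_unchallenged / grades_awarded

print(f"Unchallenged grades: {grades_unchallenged:,}")
print(f"Share of all grades awarded: {unchallenged_share:.1f}%")
```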

 

Marking errors

A grade, of course, can be changed only if a challenge discovers a marking error. Not all marking errors, however, trigger a grade change, for it is possible that both the original incorrect mark and the corrected mark are within the same grade width.

Furthermore, a “review of marking” might identify and correct several marking errors in the same script, but there is only a single grade change.

The number of marking errors is therefore likely to be greater than the number of grades changed, as is indeed the case, with 221,555 marking errors resulting in 74,665 grade changes (see figure 2).

 

 

Figure 2: Statistics relating to the number of marking errors discovered in GCSE, AS and A level scripts after the summer 2024 exams (Source: Ofqual, 2024)

 

Those 221,555 marking errors were discovered following 331,330 challenges. To me, the fact that, on average across all exams, about two challenges in every three identify a marking error is most alarming. It also raises another important question: how many marking errors lie undiscovered in the 6,092,455 grades that were not challenged?
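As a rough check on the "two in every three" figure, the calculation can be sketched as follows. It uses only the 2024 totals quoted in this article, and the variable names are illustrative. Because a single review can correct several errors in one script, the first ratio is strictly errors per 100 challenges rather than the share of challenges that found at least one error.

```python
# Summer 2024 figures as quoted in the article (Ofqual, 2024).
marking_errors = 221_555
grades_challenged = 331_330
grades_changed = 74_665

# Marking errors discovered per 100 challenges.
errors_per_100_challenges = 100 * marking_errors / grades_challenged

# Marking errors discovered for each grade that was actually changed.
errors_per_grade_change = marking_errors / grades_changed

print(f"Marking errors per 100 challenges: {errors_per_100_challenges:.1f}")
print(f"Marking errors per grade change:   {errors_per_grade_change:.2f}")
```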

Furthermore, that average figure of 67% masks the particularly high number for AS and A level (83%), as compared to 63% for GCSE – a gap that has significantly widened since the Covid years (see figure 3).

 

Figure 3: Marking errors as a percentage of challenges across GCSE and AS/A level over recent years (Source: Ofqual annual reviews of marking and moderation)

 

For GCSE – subject to the blip in 2017, which might be attributable to the introduction of the 9 to 1 grading system that year – this percentage has remained at the almost constant (albeit high) level of about 63%.

But for AS and A level, this percentage first became significantly greater than the GCSE figure in 2018, leaping up in 2022. And although reducing in 2023 and again in 2024, it still exceeds 80%, implying that four in every five AS and A level challenges discover a marking error.

Why is marking so poor in general – and in particular at AS and A level? And what are the implications with regard to the quality control of marking?

 

Measures of grade (un)reliability still not disclosed

The analysis so far has been based on just a few of the 3,500 or so numbers in Ofqual’s 22 data tables, which were published alongside the review of marking and moderation for the summer 2024 exam series (Ofqual, 2024).

We might expect that so much data will tell us everything we might ever wish to know about last summer’s exams. But no.

Some numbers are missing. And, arguably, the most important ones.

Ofqual’s statistics contain no information as regards the reliability of the grades awarded: the probability that a grade, as shown on a candidate’s certificate, is the same as the grade that would have been awarded had a subject senior examiner marked that script (the grade that Ofqual designates as “definitive” or “true”).

For a wider discussion of definitive or true grades, see my previous article for SecEd – Can GCSE and A level exam grades be trusted? (Sherwood, 2023). (See also Ofqual, 2016; 2018.)

A reliability of 100%, or 1.00, means that it is certain that the grade on the certificate is “definitive”; a reliability of 50%, or 0.50, implies that you might as well toss a coin as to whether the grade on the certificate is “definitive” or not.

These numbers are important, and Ofqual published them for 14 subjects, just once, in November 2018. See page 21 (figure 12 to be exact) of its document Marking consistency metrics: An update (Ofqual, 2018).

Yet, according to an Ofqual board paper dated January 25, 2017 (paragraphs 22, 23), Ofqual can “routinely create marking consistency metrics for GCSEs and A levels”, analysed by subject and by exam board, from which measures of grade reliability are easily determined – measures that this paper describes as “informative”.

So why are they not “routinely” presented in Ofqual’s annual statistics?

 

Further information and resources