Progress 8 – Duncan Baldwin presents the story so far and what to do (and what not to do) next

With the first universal set of Progress 8 scores now published, it is a good time to review what we have learned about this new performance measure so far, and what lessons there are for both schools and the government.

To recap, it became pretty clear in 2012 that the accountability system needed reform. The cause of the problem was the performance measure of five A* to C grades including English and maths, which was trying to do too many things at once: it served as a measure of pupil-level success, the headline school accountability measure, and the standard passport into further education or employment.

So many things rested on a single edifice that the pressure to get as many pupils as possible over the C/D boundary became too great.

Something had to be done, and the government’s solution, Progress 8 (P8 for short), was broadly welcomed by the profession, including the Association of School and College Leaders (ASCL). We liked it for a number of reasons:

  • It was inclusive. The grade of every pupil counts, not just those likely to get C grades.
  • It recognised progress at every grade boundary in pretty much every subject. No longer was it all down to English and maths departments (although they still featured heavily in the new measure).
  • The floor became a function of entry, connected to the prior attainment of each school’s pupils rather than an arbitrary line in the sand, unachievable for some and trivial for others. Potentially, even schools with high GCSE attainment could be exposed as making insufficient progress for their pupils rather than being masked by a threshold measure which offered no challenge, and schools doing a remarkable job with weaker intakes could be identified and celebrated.

In short, P8 resonated with the moral purpose of school leaders – getting the best attainment for every pupil in every subject.

P8 was launched as one of a set of four measures, alongside Attainment 8, the “basics” measure for English and maths, and the English Baccalaureate. Although a better set could have been chosen (the EBacc measure largely overlaps the other three, rendering it redundant), the fact that there is a range of measures is itself a good thing.

We have seen the distortion which happens when there is an over-emphasis on one measure alone.

But the new measure needs a new mindset to go with it. Understanding how P8 works and grasping its behaviour over time is vital. For example, a key point is that P8 is a relative measure – performance in your school depends on the performance of all other schools.

This is a bit like switching between stroke play and match play in golf. Now it doesn’t matter how many strokes you take to get round; all that matters is that you do better than your opponent.
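To make the relative nature of the measure concrete, here is a minimal sketch in Python. It assumes the usual description of the calculation: a pupil's score is their Attainment 8 minus the national average Attainment 8 for pupils with the same prior attainment, divided by 10, and the school's score is the mean of its pupils' scores. The function names, the three prior-attainment bands and all the figures are hypothetical, chosen only for illustration.

```python
from statistics import mean


def pupil_p8(attainment8, national_avg_a8):
    """Pupil score: own Attainment 8 minus the national average for pupils
    with the same prior attainment, divided by 10 (assumed method)."""
    return (attainment8 - national_avg_a8) / 10


def school_p8(pupils, national_averages):
    """School score: plain mean of the pupil scores.

    pupils: list of (prior_attainment_band, attainment8) tuples.
    national_averages: dict mapping band -> national average Attainment 8.
    """
    return mean(pupil_p8(a8, national_averages[band]) for band, a8 in pupils)


# Hypothetical cohort: the school's own results are identical in both years.
cohort = [("low", 32.0), ("middle", 50.0), ("high", 66.0)]

# Hypothetical national averages; only these change between the two years.
last_year = {"low": 30.0, "middle": 48.0, "high": 64.0}
this_year = {"low": 33.0, "middle": 51.0, "high": 67.0}

score_last = school_p8(cohort, last_year)  # positive: above the old averages
score_this = school_p8(cohort, this_year)  # negative: below the new averages
print(round(score_last, 2), round(score_this, 2))
```

In this made-up example the school's grades do not move at all, yet its score swings from positive to negative purely because other schools' results shifted the national averages.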

With universal publication in 2016 (some schools had opted in to publication a year early, but all schools saw their shadow P8 data in 2015), an early lesson was that P8 is impossible to predict.

Many schools cried foul and blamed their data systems when their provisional scores were published. They had made the mistake of using last year’s averages to estimate this year’s results. Trying to do that when the system of qualifications and entry patterns is in so much flux is almost pointless and very risky. It is not until 2022 at the earliest that we can expect stability, as system changes continue to feed through until then.

No performance measure is perfect. P8 is better than its predecessor in many ways, but it has several issues at system level, some of which need to be put right urgently.

Its biggest strength, the fact that the grades of every pupil count, has also proved to be its biggest weakness. There have always been pupils, often those who are the most vulnerable and in greatest need, who achieved few qualifications or none at all.

Under the five A* to C including English and maths system, those pupils formed part of the group who didn’t pass the threshold. Crucially though, they didn’t affect the school’s figure any more or less than a pupil who might have achieved four C grades and a D and narrowly missed out.

This is not the case with P8. A single pupil with few qualifications can make a disproportionately large difference to the score. If the school has several such pupils then the effect can be dramatic.

I know of one school which had a group of eight pupils who missed exams for various reasons, including serious illness. That made an overall difference of -0.2 to the school’s P8 score. Or to put it another way, it took the positive progress of its top 33 pupils to compensate for the impact of those eight.
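The arithmetic behind this effect can be sketched as follows. This is an illustration under stated assumptions, not the data of the school described: the cohort size of 160, the pupil scores of -0.3 and -4.0, and the function name `threshold_rate` are all hypothetical. It compares a threshold measure, where a pupil either passes or does not, with a mean-based measure, where the size of the shortfall matters.

```python
from statistics import mean


def threshold_rate(passed_flags):
    """Share of pupils over the threshold; how far the others fall short is ignored."""
    return sum(passed_flags) / len(passed_flags)


# Hypothetical cohort of 160: 152 pupils exactly on the national average,
# plus eight pupils who fall below it.
others_p8 = [0.0] * 152
narrow_miss = [-0.3] * 8   # hypothetical: eight pupils slightly below average
no_results = [-4.0] * 8    # hypothetical: eight pupils with almost no grades

# Mean-based measure: the size of each shortfall feeds straight into the score.
p8_narrow = mean(others_p8 + narrow_miss)
p8_none = mean(others_p8 + no_results)

# Threshold measure: the eight pupils count as "not passed" in either
# scenario, so the pass rate is the same for both.
passed = [True] * 152 + [False] * 8
rate = threshold_rate(passed)

print(round(p8_narrow, 3), round(p8_none, 3), rate)
```

With these invented numbers the pass rate is 95 per cent in both scenarios, while the mean-based score moves from about -0.015 to -0.2 depending on how far the eight pupils fall short.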

What does “good” look like under P8? It is a brave decision (minister) to use decimals and negative numbers with the general public, who are likely to struggle to understand them.

The DfE has used a grading system for schools’ P8 on the performance tables’ website in an effort to help, but the use of confidence intervals in their definition has brought its own complexity.

It is quite possible (and is the case for several schools this year) for one school to have a higher P8 score than another but a lower grading from the DfE. Whether confidence intervals are appropriate in this instance is a debate for much better statisticians than me, but they don’t seem to bring clarity in this case.
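A rough sketch shows how this can happen. It assumes a simplified grading rule (a school is "above average" only if its whole confidence interval lies above zero), a 95 per cent interval whose width shrinks with the square root of cohort size, and a hypothetical pupil-level standard deviation of 1.25. None of these figures or function names is taken from the DfE's definition, which has more bands than the three used here.

```python
import math


def confidence_interval(score, n_pupils, pupil_sd=1.25, z=1.96):
    """Approximate 95% interval around a school score.

    pupil_sd is a hypothetical pupil-level standard deviation; the interval
    narrows as the cohort grows.
    """
    half_width = z * pupil_sd / math.sqrt(n_pupils)
    return score - half_width, score + half_width


def band(score, n_pupils):
    """Simplified three-band grading (an assumption, not the official rule)."""
    lower, upper = confidence_interval(score, n_pupils)
    if lower > 0:
        return "above average"
    if upper < 0:
        return "below average"
    return "average"


# Hypothetical small school: higher score, wide interval that straddles zero.
small_school = band(0.30, 40)

# Hypothetical large school: lower score, narrow interval entirely above zero.
large_school = band(0.15, 300)

print(small_school, "|", large_school)
```

Under these assumptions the small school, with a score of 0.30, is graded "average", while the large school, with a score of only 0.15, is graded "above average".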

While those and other issues (including the lack of contextualisation of P8 to school type and the impact of reformed GCSEs) need attention at system level, schools themselves have their own traps to avoid.

I have mentioned the unpredictability of P8 already, but another obvious temptation for schools is to try to use P8 at pupil, subject or even teacher level. In my view, it is wrong to do any of those things, and schools should be very wary of anyone who claims they can be done (and even more wary of anyone trying to sell them a solution).

We should not try to shoehorn P8 into being anything other than a whole-school measure. Although there is a P8 score for individual pupils, this is connected to which subjects they are studying and isn’t purely a measure of their progress. You can (and should) look at the component buckets of P8 for English and maths, but that’s as far as individual subjects go.

I hope schools embrace the opportunity to separate out what matters for the school (Progress and Attainment 8), from what matters for the pupils: the teaching they receive and the grades they achieve.

  • Duncan Baldwin is deputy director of policy at the Association of School and College Leaders.

Further information

ASCL is holding a workshop in London on using data better on Thursday, February 9, and courses on understanding national data in London on Tuesday, February 7, and Manchester on Tuesday, February 28. For further information, visit www.ascl.org.uk or email pd@ascl.org.uk