In March, students in 50 schools across England will participate in a trial of the PISA 2015 assessments, just four months after the international PISA 2012 results were published and attracted considerable attention from educators, politicians and the media. But just how confident can we be that the results stack up? Are they a fair measure of countries’ performance?
PISA stands for the Programme for International Student Assessment. Every three years, 15-year-olds from around 70 countries complete tests and questionnaires. The tests cover mathematical and scientific literacy, reading, collaborative problem-solving and financial literacy.
Detractors and supporters alike will acknowledge that the international PISA results capture public and media attention. In such a bright political spotlight, those running the assessments must ensure that they are translated consistently, administered consistently and marked consistently in all the participating countries.
The first and most striking feature of PISA is that it represents a global force for innovation and advancement in educational assessment. In 2015, PISA will include at least three unparalleled developments:
In all participating countries, the tests and questionnaires will be administered on computer. No other high-profile assessment programme has gone so far in implementing a computer-based solution.
The 2015 tests will include computer-based assessments of investigative scientific literacy. Students will find “questions” which consist of micro-worlds in which they conduct investigations, collect and analyse data and draw conclusions. As such, the 2015 assessments will provide a glimpse of what national science exams (GCSEs and A levels) could look like in the future.
The tests will also include a unique assessment of collaborative problem-solving. In 2015 the tests will place students in virtual teams to discuss, develop and agree solutions to everyday problems.
Development and innovation are all very well, but how can we trust the data? I am generally asked two questions about the reliability of PISA data:
What latitude is there for countries to select or skew the sample of schools participating in the triennial exercise?
Don’t countries mark the tests leniently, favouring their own students?
The PISA programme is run by a core team. One part of that team, a company called Westat in the USA, is responsible for ensuring that the schools participating in each survey are representative of their country. Each country sends Westat a database of all schools, including their background and performance data.
In the case of the UK, Westat selects a sample of around 50 schools for a field trial (which is taking place next month) and a second sample of around 200 schools for the main study (which will next take place in 2015).
In case any school cannot participate, Westat identifies a specific replacement school. There is no latitude in the programme for countries to decide which schools to choose or for low-performing schools to drop out.
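As a rough illustration of why this removes national discretion, the logic can be sketched as a sampler that pairs each selected school with a pre-assigned replacement. This is a deliberately simplified sketch: the function name, the simple random draw and the school names are my own assumptions, not PISA's actual stratified sampling procedure.

```python
import random

def select_with_replacements(schools, sample_size, seed=2015):
    """Illustrative only: draw a main sample of schools and pair each
    with a pre-assigned replacement, so that if a school drops out the
    substitute is already fixed and cannot be chosen by the country."""
    rng = random.Random(seed)
    shuffled = rng.sample(schools, len(schools))  # random permutation
    main = shuffled[:sample_size]
    replacements = shuffled[sample_size:2 * sample_size]
    # Pair each main-sample school with exactly one designated replacement.
    return list(zip(main, replacements))

# Hypothetical register of 12 schools, main sample of 4.
register = [f"school_{i:02d}" for i in range(12)]
pairs = select_with_replacements(register, 4)
```

The real exercise stratifies by school background and performance data, but the key property is the same: the substitute for any non-participating school is decided in advance, not after the fact.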
The programme also provides stringent controls on marking to ensure reliability and consistency between countries. Students’ answers to test questions are “scored” rather than “marked”. Scoring involves allocating a code to indicate what answer a student gave: correct responses receive one set of codes, incorrect or partial answers a second, and missing or spoiled answers a third.
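To make the distinction between “scoring” and “marking” concrete, here is a minimal sketch of such a coding scheme. The specific code values and their credit mapping are assumptions for illustration; the actual PISA codebooks vary by question.

```python
# Hypothetical coding scheme covering the three categories described above:
# correct, incorrect/partial, and missing/spoiled. Values are illustrative.
CODES = {
    "2": "full credit",
    "1": "partial credit",
    "0": "incorrect",
    "9": "missing",
    "7": "spoiled",
}

def credit(code: str) -> int:
    """Translate a scorer's code into credit for later analysis."""
    if code not in CODES:
        raise ValueError(f"unknown code: {code}")
    # Non-responses and incorrect answers all translate to zero credit,
    # but the codes preserve *why* no credit was given.
    return {"2": 2, "1": 1}.get(code, 0)
```

The point of coding rather than marking is that the scorer records what the student did; the translation into credit happens uniformly, in software, for every country.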
Understanding of the scoring framework is developed through extensive training. PISA countries have just attended a six-day training programme covering the 2015 tests in mathematics, reading and science. Experienced exam markers will be recruited in the UK to complete the scoring of the field trial next month and their scoring will be overseen by a senior marker.
The codes allocated by markers will be analysed daily by marking software that looks for aberrations, leniency or severity in any marker’s work.
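One simple way such software could flag leniency or severity is to compare each marker’s average awarded score against the cohort. The sketch below is my own illustration, assuming a basic z-score check, not the actual algorithm used by the programme.

```python
from statistics import mean, stdev

def flag_aberrant_markers(scores_by_marker, z_threshold=2.0):
    """Flag markers whose mean awarded score deviates sharply from the
    cohort mean -- an illustrative leniency/severity check."""
    marker_means = {m: mean(s) for m, s in scores_by_marker.items()}
    cohort = list(marker_means.values())
    mu, sigma = mean(cohort), stdev(cohort)
    flags = {}
    for marker, m_mean in marker_means.items():
        z = (m_mean - mu) / sigma if sigma else 0.0
        if abs(z) >= z_threshold:
            # A high mean suggests leniency, a low mean severity.
            flags[marker] = "lenient" if z > 0 else "severe"
    return flags
```

A production system would also account for the difficulty of the scripts each marker happened to receive; seeding identical standardisation scripts across markers (described below) is one way to control for that.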
In line with international best practice in exam marking, PISA scoring includes independent (or blind) double-marking of scripts to measure the consistency of markers’ work.
An international standardisation sample of scripts is seeded throughout the PISA scripts. No marker will know whether they are marking a “real” script or a standardisation script.
The same set of standardisation scripts is used across all PISA countries, enabling analysis of the consistency of marking between countries.
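Blind double-marking produces pairs of codes for the same script, and a standard way to quantify how consistently two markers agree is Cohen's kappa, which discounts agreement expected by chance. The sketch below is illustrative; I am not claiming this is the specific statistic PISA reports.

```python
def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two markers' codes on the same scripts:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    labels = set(codes_a) | set(codes_b)
    # Chance agreement: probability both markers pick the same code at random,
    # given each marker's own code frequencies.
    expected = sum(
        (codes_a.count(lbl) / n) * (codes_b.count(lbl) / n) for lbl in labels
    )
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0
```

A kappa of 1.0 means perfect agreement; values near zero mean the markers agree no more often than chance, which is exactly the kind of aberration the monitoring is designed to surface.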
Providing even greater levels of confidence in scoring standards, the 2015 tests will be computer delivered, meaning that students’ responses to objective questions will be auto-scored.
Approximately one-third of the questions will be scored by human subject experts and the remainder will be marked in all countries by the same computer algorithms, implementing precisely the same standards. This mix of objective questions and open-ended questions requiring expert marking represents the best of both worlds of test design.
PISA may not be flawless, but it is a significant international programme, implementing world-leading best practice in test design and test marking.
The 2015 PISA assessments are being managed in the UK by a consortium of RM, the Institute of Education and World Class Arena Limited (WCAL). Martin Ripley is managing director of WCAL and can be contacted by teachers/markers interested in getting involved in the scoring of PISA scripts. Martin is also a director of the 21st Century Learning Alliance.