A teacher’s guide to retrieval practice: Metacognition

Written by: Kristian Still

Continuing his series on the potential of retrieval practice, spaced learning, and successive relearning in the classroom, Kristian Still turns to metacognition, the importance of students accurately regulating their learning, and how we can help them to do this

In this series, I am attempting to elaborate and share what the recipe of test-enhanced learning (more commonly known as retrieval practice), spaced learning, interleaving, feedback, metacognition, and motivation might look like in and out of the classroom.

I am reviewing the research and cognitive science behind these concepts and the modulators underpinning the effective retention of knowledge.

In writing this series, nine clear but interlinked elements emerged. I am considering these elements across nine distinct but related articles.

I would urge readers to also listen to a recent episode of the SecEd Podcast (SecEd, 2022) looking at retrieval practice, spaced learning, and interleaving and featuring a practical discussion between myself, teacher Helen Webb, and Dr Tom Perry, who led the Education Endowment Foundation’s Cognitive Science Approaches in the Classroom review (Perry et al, 2021).

This series, in reviewing the evidence-base, seeks to help you reflect on what will work for you, your classroom, and your pupils. This is article seven and it focuses on metacognition.


As we have discovered throughout this series, a clear and present danger with test-enhanced learning and spaced practice is that they seem harder to students than massed or blocked practice and so are often rejected as learning strategies. This is why metacognition is a crucial ingredient.

Metacognition refers to learners’ understanding and regulation of their own learning process, including their beliefs and perceptions about learning, monitoring the state of their knowledge, and controlling their learning activities (Dunlosky & Metcalfe, 2009).

Students’ metacognitions about what processing fluency signals about learning quality, their a priori beliefs about which study methods are most effective, their willingness to implement those methods, and their appreciation of how to apply them all influence whether their choices of study strategy and schedule will be adaptive and produce conditions that maximise learning (Bjork et al, 2013).

Metacognition and self-regulation approaches that encourage learners to think about their own learning (often by teaching them specific strategies for planning, monitoring and evaluating their learning) have a consistently high level of impact with learners.

That seems a very reasonable place to start.

The components of metacognition

Broadly speaking, metacognition consists of three primary components: beliefs (including knowledge and perceptions), monitoring, and control (Flavell, 1979; Bjork et al, 2013; Rivers, 2021).

  • Beliefs: What do we believe to be effective when studying? Why do some learners correctly endorse the effectiveness of retrieval practice, spacing, and interleaving for learning whereas others do not?
  • Monitoring: Knowing how to assess and manage one’s own learning and react accordingly.
  • Control: Decisions about when, why, and how to study.

Metacognition and retrieval practice

“The fact that conditions of learning that make performance improve rapidly often fail to support long-term retention and transfer, whereas conditions that create challenges (i.e. difficulties) and slow the rate of apparent learning often optimise long-term retention and transfer, means that learners – and teachers – are vulnerable to misassessing whether learning has or has not occurred.”
Bjork & Bjork, 2020.

More simply, as cognitive psychologist Pooja Agarwal has said: "Easy learning is easy forgetting. Challenging learning strengthens our remembering, our learning, our memories." And our role as teachers is to ensure our pupils know and believe that. But this is easier said than done.

A learner’s beliefs about effort and their subjective experiences of difficulty during encoding and retrieval influence their evaluation of the effectiveness of the methods and, as a result, their metacognitive and self-regulatory decisions.

As we have already highlighted in this series, the learning benefits of retrieval practice, spacing and interleaving are all too often ignored by students when selecting study activities (Karpicke, 2009; Rivers, 2021). We know that learners have a “tendency to conflate short-term performance with long-term learning when, in fact, there is overwhelming evidence that learning and performance are dissociable” (Soderstrom & Bjork, 2013).

As we referenced in article five on feedback, while 90% of students reported using self-testing, most of them did so in order to identify gaps in their knowledge rather than because they believed that self-testing conferred a direct learning benefit.

Moreover, 64% of students reported not revisiting material once they felt they had mastered it, while only 36% of students reported that they would restudy or test themselves later on that information (Kornell & Bjork, 2007; Hartwig & Dunlosky, 2012).

Let’s meet Anoara

To help illustrate the three components, let’s consider Anoara, a year 11 pupil preparing for end-of-term assessments in two of her classes. She has not been using spaced retrieval practice or interleaving as part of her learning. Almost impossibly, Anoara scored 60/100 in both subjects in her end of year 10 assessments and, as unbelievable as it sounds, the assessments were of exactly the same difficulty. I know I am pushing it, but Anoara enjoys her subjects equally and invests her time and effort commensurately.

For science she decides to prepare by rereading her textbook. For geography she decides to write her own flashcards and quiz herself (a form of retrieval practice testing).

She even adopts the Leitner system, which she has seen on social media – and which was first described in Sebastian Leitner’s 1972 book, So Lernt Man Lernen (How to Learn).

In a nutshell: the system uses flashcards and a “learning box” separated into three to five compartments. All flashcards start in compartment 1. Correctly answered cards move to the next compartment, and so on. Compartment 1 is reviewed daily, compartment 2 every other day, compartment 3 every third day, etc, slowly extending the spacing for correctly retrieved cards (Leitner, 1972).
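For readers who like to see the mechanics, the review schedule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Leitner's own specification: the class name `LeitnerBox` and its methods are hypothetical, three compartments are chosen from the 3-5 range for brevity, and the rule that a missed card returns to compartment 1 is an assumption, since the description above only covers correctly answered cards.

```python
from dataclasses import dataclass, field

NUM_COMPARTMENTS = 3  # the description allows 3-5; 3 keeps the example short


@dataclass
class LeitnerBox:
    """Hypothetical helper mapping each card to its compartment (1-based)."""
    compartments: dict[str, int] = field(default_factory=dict)

    def add_card(self, card: str) -> None:
        # All flashcards start in compartment 1.
        self.compartments[card] = 1

    def due_cards(self, day: int) -> list[str]:
        # Compartment n is reviewed every n-th day, counting days from 1:
        # compartment 1 daily, 2 every other day, 3 every third day.
        return [c for c, n in self.compartments.items() if day % n == 0]

    def record_answer(self, card: str, correct: bool) -> None:
        if correct:
            # Correct cards move up one compartment, capped at the last one.
            self.compartments[card] = min(self.compartments[card] + 1,
                                          NUM_COMPARTMENTS)
        else:
            # Assumption (not stated above): a missed card returns to
            # compartment 1 so that it is seen again every day.
            self.compartments[card] = 1


box = LeitnerBox()
for card in ["erosion", "delta", "meander"]:
    box.add_card(card)

# Day 1: every card is due. Suppose "delta" is the only one answered wrongly.
for card in box.due_cards(day=1):
    box.record_answer(card, correct=(card != "delta"))

print(box.due_cards(day=3))  # ['delta']
```

On day 3 only the missed card is due, because the two correctly answered cards now sit in compartment 2 and are reviewed on even-numbered days. That widening gap is the "slowly extending the spacing" described above.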

As Anoara approaches the assessment, she feels like she is learning more of the material for science than for geography, which leads her to believe that rereading is the more effective strategy. Informed by her on-going monitoring and beliefs, she decides to adopt rereading as her primary learning strategy for both science and geography.

In contrast to her (misguided) belief about which strategy would be more effective, she performs comparatively better in geography than in science. Will she adopt and stick with flashcards for her final exam revision?

Judgements of learning (beliefs)

Metacognitive monitoring concerns learners’ ability to assess the progress of their learning. The accuracy of their metacognitive monitoring influences study choices and, consequently, how well information is learned and retained.

Learners like Anoara misinterpret momentary accessibility of knowledge as a marker of long-term storage strength. Swayed by illusions of familiarity and fluency (Bjork et al, 2013; Kirk-Johnson et al, 2019), pupils prefer massed and blocked practice and restudy. Moreover, these beliefs and feelings can be difficult to overcome, with students remaining “insensitive to their own performance” even when presented with contrary results and despite fairly extensive debiasing attempts (Yan et al, 2016).

Thus, identifying and understanding the conditions that promote accurate metacognition is critical for promoting efficient and effective learning.

Of course, spaced retrieval practice is inherently metacognitive and when I first began reading retrieval research I would often encounter JOLs – judgements of learning.

JOLs are prospective predictions of how likely participants feel they are to later remember material that they learned with a particular strategy.

Regrettably, “students’ predictions were almost always higher than the grade they earned and this was particularly true for low-performing students” (Miller & Geraci, 2011).

It is worthwhile knowing that learners tend to be overconfident in predicting their own learning (Soderstrom & Bjork, 2015) and tend to terminate their encoding (learning) before materials are sufficiently committed to memory (Kornell & Bjork, 2008).

The accuracy of JOLs can play a large role in determining how adaptive (or maladaptive) study decisions end up being (Kornell & Metcalfe, 2006). JOLs are inspired and formed by learning beliefs and learning experiences.

Belief-based cues refer to what one consciously believes about learning, about memory, about remembering, with learners having been shown to use the heuristic “easily learned means easily remembered” (Koriat, 2008), and all too often failing to recognise the mnemonic benefits that testing provides as a learning strategy (Karpicke & Roediger, 2008; Tullis et al, 2013).

It is important to remember, therefore, that people (learners) are more sensitive to experience-based cues than to belief-based cues (Kornell et al, 2011). Experience-based cues include anything learners can directly experience, including monitoring and control metacognitions.

You will not be surprised that “JOLs about restudied and tested material tend to be inaccurate – whereas learners recall more tested than restudied material, their predictions often do not reflect this recall difference” (Rivers, 2021).

That is, learners experience or perceive they learn more effectively and efficiently when restudying. And this is a big problem.

Accurate calibration (monitoring)

Calibration is the process of matching the learner’s perception of their performance with the actual level of performance.

Although learners’ JOLs – or predictions – are often dissociated from test performance, “postdictions” (made after a test) tend to more accurately reflect test performance (Siedlecka et al, 2016).

The research also suggests we are more likely to be slightly under-confident about the knowledge we feel less sure of, and a little over-confident about the knowledge we feel sure of.

One of the metacognitive benefits of retrieval practice is that it improves the calibration of students’ judgements, improves relative monitoring accuracy, and reduces overconfidence (Hacker & Bol, 2019).

Essentially, testing provides diagnostic feedback to inform learners about the gap between their anticipated and actual learning level (Szpunar et al, 2014), which then motivates them to expend more effort to narrow the perceived gap.

Monitoring accuracy

In between prospective and retrospective judgements are real-time, concurrent metacognitive judgements, predominantly measured using confidence-based ratings. Although technically retrospective, these ratings are taken immediately after each item, question or test in order to capture feelings of confidence, as opposed to beliefs formed before and after a test.

Couchman et al (2016) reported that confidence ratings for each individual question accurately predicted performance and were a much better decisional guide than retrospective judgements and, as such, that the best strategy for learning is to "record confidence, as a decision is being made, and use that information when reviewing".

In your classroom, what might this look like? Well, for example, we may ask pupils to forecast their performance on each individual question as they work through a test or mock paper, giving a score or confidence rating, rather than asking for their overall grade prediction or confidence on the paper before or after the activity. Following the exam, you would then compare their predictions with their actual performance.
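If you collect these question-by-question forecasts in a spreadsheet or similar, the comparison with actual performance can be a very small calculation. The Python sketch below is one simple way to do it, not a method taken from the studies cited here. The function name `calibration_summary` is hypothetical, and it assumes each confidence rating has been converted to a 0-1 scale (0 meaning "sure I got it wrong", 1 meaning "sure I got it right").

```python
def calibration_summary(confidence: list[float],
                        correct: list[bool]) -> dict[str, float]:
    """Compare per-question confidence (assumed 0.0-1.0) with outcomes."""
    if len(confidence) != len(correct) or not confidence:
        raise ValueError("need one confidence rating per question")
    n = len(confidence)
    mean_confidence = sum(confidence) / n
    accuracy = sum(correct) / n  # True counts as 1, False as 0
    return {
        "mean_confidence": mean_confidence,
        "accuracy": accuracy,
        # Positive bias = overconfident overall; negative = underconfident.
        "bias": mean_confidence - accuracy,
        # Average per-question gap between the forecast and what happened.
        "mean_abs_error": sum(abs(c - float(k))
                              for c, k in zip(confidence, correct)) / n,
    }


# A pupil who was certain on two questions and unsure on two others.
ratings = [1.0, 1.0, 0.5, 0.5]
marks = [True, False, True, False]
print(calibration_summary(ratings, marks))
# {'mean_confidence': 0.75, 'accuracy': 0.5, 'bias': 0.25, 'mean_abs_error': 0.5}
```

Here the pupil's average confidence (0.75) sits above their actual accuracy (0.5), so the positive bias of 0.25 flags overconfidence. The per-question error is the more useful figure when reviewing with the pupil, because it points to the specific questions where confidence and outcome diverged.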

Barenberg and Dutke (2019) examined the potential of retrieval practice during learning to improve the accuracy of learners’ confidence judgements in future retrieval. In the final test, the proportion of correct answers and the proportion of confident answers were higher with retrieval practice than in the control condition.

They concluded that the confidence judgements can "stimulate the learners to reflect (on) their understanding of learning topics and the quality of knowledge they acquired".

What is more, this reflection, as I mentioned earlier, can help them to identify learning topics that need further clarification and help them to develop the accuracy of their confidence judgements.

Coutinho et al (2020) report that confidence judgements were accurate indicators of performance and that students who scored higher in monitoring accuracy (calibration) performed better on the exam than those who scored lower. Why? “It prompted the students to engage in an analysis of knowledge.”

Here comes the good news. Retrieval practice not only enhances students’ memory performance, it also improves their metacognitive monitoring accuracy. It helps learners make more accurate predictions about future performances (Ariel & Karpicke, 2018).

As Tullis et al (2013) observed: “Retrieval has enormous potential to enhance long-term retention, particularly if learners appreciate its benefits and utilise it properly during self-regulated learning.”

And there is good news for Anoara too: “Building digital flashcards provides a potentially powerful authentic assessment task” (Colbran et al, 2015), one shown to yield medium to large effects on comprehension, recall, and problem-solving (Song, 2016) as a result of deeper processing of, and reflection on, the learning material.

And while we are here, on self-assessment: two meta-analyses (Graham et al, 2015; Sanchez et al, 2017) demonstrated a positive association between self-assessment and learning. On average, “students who engaged in self-grading performed better on subsequent tests than did students who did not”.

So why not tell your pupils this and develop a self-assessment routine for marking retrieval practice? Knowing how to assess and manage one’s own learning is critical for becoming an efficient and effective learner.

They have the power (control)

Directly linked with monitoring is control: the decisions about what, when, and how to study. When monitoring is inaccurate, decisions about studying can be suboptimal.

Here we cite again Kornell & Bjork’s (2007) findings that 90% of students reported using self-testing but most only do so in order to identify gaps in knowledge rather than for the direct learning benefits (see also Hartwig & Dunlosky, 2012).

Karpicke et al (2009) asked a large group of college students about their study approach: 57% said they would reread their notes or textbook and 21% said they would do something else. Only 18% said they would attempt to recall material after reading it.

Even after students were shown their results and the benefits of retrieval practice, little changed: 42% of students said they would practise retrieval and then reread, but 41% still said they would only reread (17% said they would do something else).

In other words, 58% of students “who knew better” indicated that they would not practise active retrieval even when they would have the opportunity to reread afterwards.

Rivers (2021) reported that 58% of students adopted ineffective or low-utility learning techniques (rereading 43%, copying notes 11%, highlighting 4%).

Endres et al (2017) and Hui et al (2021) are more encouraging. Both studies report that after students were shown the results, the proportion choosing retrieval practice increased significantly in the following review phase and, in the Hui et al study, in the long term also.

The solution? Hui et al (2021) simply suggest that “feedback about the perceived learning as well as the actual learning may make students realise the mismatching of the perceived learning and the actual learning”.

As Rivers (2021) states: “Metacognition and self-regulation approaches that encourage learners to think about their own learning (often by teaching them specific strategies for planning, monitoring and evaluating their learning) have consistently high levels of impact with learners.”

And Cogliano et al (2019) reported that students who completed metacognitive retrieval practice training scored higher on the final exam, with improved metacognitive awareness, accuracy for well-learned and yet-to-be-learned topics, exam preparation, monitoring strategy, control strategy, and practice-test frequency all contributing to this improved success. What is more, trained students continued to repeatedly use practice tests more often than the control group.

Similarly, Ariel and Karpicke (2018) found that students who had experienced a similar intervention also showed potential for long-term changes in their self-regulated learning, spontaneously using repeated retrieval practice one week later to learn new materials.

Bringing this together

Of course, metacognition and self-regulation is a much larger area of study than this article’s restricted focus on the elements of retrieval practice. But for our purposes, allow me to summarise.

Learners hold the belief that testing is only useful under certain conditions, where retrieval is either “easy” or successful, or when preparing for an assessment (driven by the goal of identifying which information is well known or not, rather than the goal of increasing retention).

Yet retrieval practice is most effective when learners engage in multiple successful retrieval attempts. However, learners rarely engage in such a strategy, often not knowing the value of successive relearning and how memory works (or alternatively due to overconfidence).

But retrieval practice also improves metacognition processes, creating a virtuous learning cycle, with higher calibration associated with higher overall achievement in test-takers (Bol & Hacker, 2001).

Considering that as learners travel through their education careers more and more learning takes place without direct supervision, a failure to understand and monitor one’s own learning process and adopt some of the most powerful learning strategies – spacing, retrieval practice and interleaving – would be shortsighted.

Students may also choose to avoid difficult learning strategies if they are focused on a short-term performance outcome, if their competence beliefs are low, or if their performance expectations are less ambitious or distinct from those of the instructor (see Ariel et al, 2009; Kirk-Johnson et al, 2019).


  • Teaching pupils about metacognitive approaches is consistently reported to have a high level of impact with learners.
  • Forewarn pupils of the misleading metacognitive influence of fluency and familiarity they feel when undertaking massed or blocked practice.
  • Changing “beliefs” is very difficult. Drawing upon experience is easier. But without changing beliefs, you are unlikely to influence pupils’ control decisions.
  • Know that the introduction or first cycle of retrieval practice will be met with skepticism and complaints. Be ready for this!
  • Test to high success rates (or very low failure rates). Revisit the table from article three showing how to up the success rate in your quizzes.
  • Retrieval practice helps learners make more accurate learning predictions. More accurate learning predictions lead to better study (control) decisions and better outcomes.
  • Concurrent metacognition may have potential applications in the classroom. For example, comparing pupils’ concurrent confidence judgements on individual exam questions with their actual marks may provide useful insights.
  • And if you only have time to read one paper on this topic: Metacognition about practice testing: A review of learners’ beliefs, monitoring, and control of test-enhanced learning (Rivers, 2021): https://bit.ly/3ol4LWM

  • Kristian Still is deputy head academic at Boundary Oak School in Fareham. A school leader by day, together with his co-creator Alex Warren, a full-time senior software developer, he is also working with Leeds University and Dr Richard Allen on RememberMore, a project offering resources to teachers and pupils to support personalised spaced retrieval practice. Read his previous articles for SecEd via https://bit.ly/seced-kristianstill

References: For all research references relating to this article, go to https://bit.ly/3lkNdZb

Acknowledgement: Where the investigations of test-enhanced learning collided with the research of Dr Michelle Rivers – retrieval and metacognition – Anoara Mughal (@anoara_a) was the first person I contacted. Not only did she bring a wealth of metacognitive expertise, she also offered a wealth of expertise from a sector I was less knowledgeable about, namely primary education. Thank you, Anoara, for your support, advice and generosity.

RememberMore: RememberMore delivers free, personalised, and adaptive spaced retrieval practice with feedback. For details, visit www.remembermore.app or try the app and resources via https://classroom.remembermore.app/

