A teacher’s guide to retrieval practice: Let’s make some memories (stick)

Written by: Kristian Still

The potential impact of retrieval practice, spaced learning, successive relearning, and metacognitive approaches in the classroom should not be underestimated. In this series, Kristian Still looks at nine interlinked elements crucial to these approaches and draws out important lessons for teachers from the research evidence. In part one of nine, he tackles memory.

In a recent SecEd article, I made the case that despite the “wealth of evidence” about the “reliable advantage” of test-enhanced learning – more commonly referred to as the testing effect or retrieval practice – it is more complicated than that and “retrieval practice alone is not enough” (Still, 2021).

In this series, I will attempt to elaborate and share what the recipe of repeated retrieval, spaced learning, interleaving, feedback, metacognition, and motivation might look like in and out of the classroom.

I will review the research and cognitive science behind these concepts and the modulators underpinning the effective retention of knowledge. In writing this series, nine clear but interlinked elements emerged, and so I will consider these elements across nine distinct but related articles:

  1. Memory (this article)
  2. Repeated retrieval practice or test-enhanced learning (Published: April 6, 2022)
  3. Spaced retrieval practice or spaced learning (Published: April 20, 2022)
  4. Interleaving (Published: April 27, 2022)
  5. Feedback and elaboration (Published: May 4, 2022)
  6. Successive relearning (Published: May 11, 2022)
  7. Metacognition (Published: May 18, 2022)
  8. Overcoming illusions of knowing or competence (Published: May 25, 2022)
  9. Testing, motivation and achievement (Published: June 8, 2022)

The launch of this series coincides with an episode of the SecEd Podcast looking at retrieval practice, spaced learning, and interleaving and featuring a practical discussion between myself, teacher Helen Webb, and Dr Tom Perry (SecEd, 2022).

Giving his final thoughts at the end of the podcast, Dr Perry, who led the Education Endowment Foundation’s Cognitive Science Approaches in the Classroom review (Perry et al, 2021), had an exciting message: “We know a huge amount about learning, memory, cognition, attention, and it’s creating some really powerful and practical principles that we can trust.”

However, “quite a lot of this has not been tested out in realistic classroom circumstances – we don’t have a strong applied science; we don’t have a large amount of trials”.

Thus his “big message” was that: “The onus is on educators to really work out what good looks like in their own contexts. The evidence can give you good principles, but I don’t think that it can define specific practices – those are the things that need to be worked out on the ground for particular kids and in particular subjects.”

Which basically means that the researchers have had their fun – now it’s our turn! This series, in reviewing the evidence-base behind these approaches, seeks to help you reflect on what will work for you, your classroom, and your pupils. This is article one and it focuses on memory.

Memory, learning, encoding and retrieval

The title of Dr John Sweller’s presentation at ResearchEd Melbourne in 2017 left a mark: “Without an understanding of human cognitive architecture, instruction is blind.”

In his presentation (available online), Dr Sweller presents what teachers really need to understand about cognitive load theory (2017). Cue that frequently cited Professor Dylan Wiliam quote (from a 2017 tweet): “Sweller’s cognitive load theory is the single most important thing for teachers to know.”

So why is an understanding of “human cognitive architecture” – memory to you and me – so important? Well, before I review the importance of Dr Sweller’s research, I would like to introduce Professor Dan Willingham’s Simple Model of Memory.

Memory is as thinking does

The above is a famous quote from Prof Willingham (2003), professor of psychology at the University of Virginia and author of Why Don’t Students Like School? (2009).

Here’s another: “Teachers need what might be called a mental model of the learner: knowledge of children’s cognitive, emotional, and motivational make-up.”

While this series focuses on the many elements of test-enhanced learning (retrieval practice), we absolutely must begin with an understanding of memory. This is the lens through which we need to view the many elements of test-enhanced learning.

So – memory is as thinking does. When it comes to cognition – to thinking – theory is unavoidable. Therefore, a model is almost always inevitable. For the purpose of a shared understanding, let’s go with Willingham’s (2009) Simple Model of Memory. Other models are available.

Model of memory: Willingham’s Simple Model of Memory (Willingham, 2009) as interpreted by Oliver Caviglioli (image reproduced with kind permission from @olicav)

In this model, we have the environment on the left, which is all the things we see, hear, feel and so on. What we attend to or give our attention to – what we selectively concentrate upon – is moved to our working memory on the right.

Working memory holds all those things we are thinking about: our reflections at this moment, thoughts about the last lesson, those dark clouds on the horizon, and so on. And of course you can also be aware of things that are not currently “in” the environment – the smell of food from the school canteen or how you expect the next lesson to unfold.

Below working memory are “learning” and “remembering”, which we will return to in a moment. And below all that, long-term memory is the “big mental warehouse” (Clark et al, 2012) in which we maintain our knowledge of the world.

Importantly, long-term memory resides outside of your awareness until called upon, and then these thoughts enter your working memory and so become conscious.

Thinking occurs when you combine information from the environment and from long-term memory in new ways. That combination happens in working memory as either learning (encoding) or remembering (relearning). A learner's prior knowledge will, therefore, always be a key consideration.

Incidentally, researchers have recently investigated the impact of “pretesting” – i.e. testing pupils on what they have not yet been taught or do not yet know – with some rather unexpected results. We shall come back to this later in the series.

Another helpful diagram can be found in the useful and accessible evidence review Cognitive science approaches in the classroom (Perry et al, 2021).

The review reminds us of the key principle that working memory can be overloaded: “Many of the strategies derived from cognitive science focus on the crucial interactions between working memory and long-term memory and the important observation from cognitive science that our working memories have limited capacity.”

Sensory, working and long-term: The three elements of memory as illustrated in Perry et al (2021).

Let's add a little meat to the bones. Working memory is the memory system where conscious processing of sensory information occurs. Sensory and working memory are needed even for the simplest activities: if you cannot remember the beginning of this sentence, you cannot understand its ending. Working memory is where small amounts of information are stored for a very short duration – what Clark et al (2012) call “the limited mental ‘space’ in which we think”.

Again, working memory is extremely limited in both capacity and duration. Almost all information in working memory that is not consciously processed or rehearsed is lost within 30 seconds (Peterson & Peterson, 1959).

An average person can hold only a few chunks of information in their working memory at any one time. Go ahead and forget Miller’s (1956) seven plus or minus two unique concepts, as it is very likely to be a generous overestimate: Cowan (2001) suggests the figure to be as low as four.

There is evidence to indicate differences in working memory capacity between individuals and that working memory capacity increases gradually until the teenage years (Swanson, 1999).

An interesting aside: Agarwal et al (2017) report that retrieval practice with feedback yields a greater benefit for pupils with lower working memory capacity, although the wider literature shows mixed results.

Of equal importance, Ericsson and Kintsch (1995) and Baddeley (2001) distinguish between the capacity of working memory when it is processing new information and new relationships (encoding/learning) compared with processing prior knowledge from long-term memory (remembering/relearning). This again emphasises the importance of prior knowledge.

Long-term memory is the memory system where large amounts of information are stored “semi-permanently” – what Clark et al (2012) call “that big mental warehouse of things we know” – be that explicit vocabulary, people’s names and faces, chess moves, or sports skills.

Long-term memory holds a virtually unlimited amount of knowledge and yet remains severely impeded by working memory. And for good reason – protecting us from catastrophic interference (McClelland, 2013), namely newly learnt information suddenly and completely erasing previously learnt information.

While we can access countless autobiographical events in vivid detail and sing along to hundreds of songs, it is impossible for most of us to keep more than a few digits in working memory at the same time.

Making the most of human memory requires understanding “important peculiarities” of the storage and retrieval processes (Bjork & Bjork, 1992), being aware of its weaknesses, and exploiting its strengths.

So, coming back to Dr Sweller: “The implications of working memory limitations on instructional design can hardly be overestimated.

“Anything beyond the simplest cognitive activities appear to overwhelm working memory. Prima facie, any instructional design that flouts or merely ignores working memory limitations inevitably is deficient.” (Sweller et al, 1998)

You can see why Sweller’s presentation left a mark.

Cognitive load theory

Cognitive load theory is based on a number of widely accepted assumptions about how human brains process and store information.

As with Willingham’s Simple Model of Memory, these assumptions include: that human memory can be divided into working memory and long-term memory; that information is stored in the long-term memory in the form of schemas; and that processing new information results in “cognitive load” on working memory which can affect learning outcomes. Let’s pause for a moment on schemas.

A schema provides a system for organising and storing knowledge according to how it will be used, with “skilled performance” developed through building ever greater numbers of increasingly complex schemas and by combining elements of lower level schemas into higher level schemas. Indeed, renowned cognitive scientists Fiorella and Mayer emphasise that pupils make meaning when they select, organise, and integrate information (Mayer, 2014).

I am not a big fan of the word “performance”, as it suggests something transitory or fleeting. So I prefer to think of “knowledge” being developed as opposed to performance.

Crucially for cognitive load theory, schemas reduce working memory load or, in effect, bypass the limits of working memory; they enable working memory to be reallocated by allowing information to be accessed automatically from long-term memory.

Conversely, cognitive overload is most likely with new information. As expertise grows (more knowledge and schemas in long-term memory), cognitive overload becomes less of a concern and therefore optimal instruction changes.

More simply, if working memory is overloaded, there is a greater risk that what is being taught will not be understood by the learner, will be misinterpreted or confused, or will not be effectively encoded and transferred to long-term memory, and so learning will be slowed down.

Cognitive load research demonstrates that instructional methods are most effective when designed to fit within the known limits of working memory, and therefore strongly supports guided models of instruction, especially for teaching novice learners in “technical” subjects such as mathematics, science and technology.

However, cognitive load theory does not consider, for example, how factors such as pupils’ motivation and their metacognitive beliefs about their own ability might influence the effectiveness of learning. Nor do cognitive load theorists advocate using all aspects of explicit instruction all of the time.

Put simply, Sweller’s “human cognitive architecture” with which we began this article describes the necessary and sufficient conditions for learning.

When designing learning, educators need to be cognisant of the possibilities and limitations of our memory and the interactions within it, including where advantages may be gained (such as in developing routines) and where “choke points” constrain or impede learning (such as the selective nature of attention).

What is more, far too many of us assume that providing learners with additional, peripheral information is beneficial, or at worst harmless. But additional information or redundancy is anything but harmless as Dr Sweller himself has said (2016): “Providing unnecessary information can be a major reason for instructional failure.”

Another reason for failure is requiring the learner to split their attention between two or more separate sources of information in order to understand the material (for example, when a diagram is used to illustrate or explain a separate piece of explanatory text). It is for this reason that the RememberMore app I have been involved in developing (see further information) shows cues and responses together on the same app screen and does not employ animations and transitions.

Making memories

“Learning is the residue of thought.” Yes, we’re back to Prof Willingham (2009).

“Your memory is a product of what you think most carefully about. What students think about most carefully is what they will remember.” (Willingham, 2008)

The popular three-stage model, whereby human memory is conceptualised as occurring in three stages – encoding, storage, and retrieval – dates back to work by Köhler (1947) and Melton (1963).

The three-stage framework holds that after we acquire new information, some of it is stored in long-term memory, and then after a while some of this previously encoded information can be retrieved. The stages are considered logically and temporally separate. Memory is therefore discussed as having two dimensions (Bjork & Bjork, 1992):

  • Storage strength: How deep-rooted/interconnected memories are.
  • Retrieval strength: How accessible memories are.

Sensory memory and attention

Incoming sensory information from the environment that is attended to (that makes it through the attention bottleneck) moves to our working memory. Neurons are activated, and the information is either encoded or, after a brief period of time, lost.

Attention is the primary gatekeeper of learning and relearning and the ultimate commodity of our classrooms. It is our responsibility as teachers to harness and direct attention and to minimise all other distractions or competition for it. Attention is “the” choke point.

Once we have got students' attention, connections are formed within the information presented (and potentially with prior knowledge). After consolidation, the memory traces are considered "stabilised" and stored. Learning is said to occur when information from working memory is transferred to long-term memory through conscious processing – linking new knowledge to what's already in our memory, or prior knowledge.

It is one possible reason why pupils with greater prior knowledge learn more, and often learn faster.

When retrieving that information, an "associative chain reaction" of activating neural representations modifies that memory, promotes consolidation (storage), and slows forgetting.

And yet, much of everyday, real-world learning is iterative: pupils encounter cross-curricular themes, discuss lessons between themselves, are set targeted homework, watch television programmes, read books, play games, prepare for tests in various ways, and sit tests and exams.

The proposition that there are discrete encoding and retrieval phases does not describe education very well at all, nor does it capture the interactive nature of encoding and retrieval, learning and relearning, whereby retrieval can actually be considered a (re)encoding event.

A quick word about forgetting

Our memories (what we attend to) are encoded (consolidated, stored to be retrieved) or forgotten. As we will point out over this series, knowledge of forgetting is very much a part of spaced retrieval practice, as much as knowledge of remembering is.

Possibly the most succinct explanation of the relationship between memory and the importance of forgetting comes from Henry L Roediger III: “Remembering is greatly aided if the first presentation is forgotten to some extent before the repetition occurs.” (Roediger & Karpicke, 2011)

This is why educators would be well advised to forewarn pupils about an upcoming test, prompting them to attend to the teaching in the first place, with spaced retrieval practice then aiding consolidation and storage of that teaching in long-term memory. Or, more simply, building storage strength and retrieval strength for when that knowledge or memory is next required.

And remember, performance during learning is a poor predictor of future performance because it reflects the momentary accessibility of knowledge rather than how well it has been stored in memory (Bjork & Bjork, 1992).

Learning, remembering and relearning is a process, not a product. We can only infer that it has occurred from pupils’ “performances” over time and where we have evidence of their capabilities.

A final thought

Learning involves a relatively permanent change in long-term memory and, as Willingham has said (2009), “understanding is remembering in disguise”.

This change unfolds over time; it is not fleeting but rather has a lasting impact on how pupils think and act. Learning is not something done to pupils, but rather something pupils themselves do. It is the result of how pupils interpret and respond to their experiences – conscious and unconscious, past and present.

And of course, even if we do all this, as teachers we still need to understand pupils’ “emotional and motivational make-up” (Willingham, 2009), by which we mean that understanding the cognitive science behind these approaches is fine, but we also need to understand what engages and motivates our pupils so we can ensure they can realise the benefits of test-enhanced learning. But then you knew that anyway…

I will conclude with Schweppe & Rummer, 2013 (as quoted in Mccrea 2019): "We learn what we think about, and what we think about is determined by what we attend to."

Encoding: This illustration shows us what encoding, consolidation, storage and retrieval might look like. Diagram adapted and used with kind permission from @efratfurst


  • Attention is the primary gatekeeper of learning and relearning.
  • Learning is a relatively permanent change in long-term memory.
  • Working memory, where conscious processing occurs, is severely limited in duration and capacity.
  • Long-term memory is virtually unlimited, yet severely constrained by working memory.
  • Knowledge of forgetting is very much a part of spaced retrieval practice – just as much as knowledge of remembering.
  • A pupil’s performance during learning/encoding is a poor predictor of future performance.
  • Test-enhanced learning includes “pretesting” and potentiation, retrieval and relearning, and much more.
  • And finally, if you only have time to read one paper on this topic: Ask the cognitive scientist: What will improve a student’s memory? (Willingham, 2008): https://bit.ly/3svkTGD

  • Kristian Still @KristianStill is deputy head academic at Boundary Oak School in Fareham. A school leader by day, he is also working with his co-creator Alex Warren, a full-time senior software developer, and with Leeds University and Dr Richard Allen on RememberMore, a project offering resources to teachers and pupils to support personalised spaced retrieval practice. Read his previous articles for SecEd via https://bit.ly/seced-kristianstill

References: For all research references relating to this article, go to https://bit.ly/3JY0C3V

Acknowledgement: This article would not have been possible without the author’s conversations with educational neuroscience expert Sarah Cottingham and her musings over at https://overpractised.wordpress.com/

ResearchED: Kristian will be speaking at the first ever ResearchED Berkshire taking place at Desborough College in Maidenhead on May 7. Visit https://researched.org.uk/event/researched-berkshire/

RememberMore: RememberMore delivers free, personalised, and adaptive spaced retrieval practice with feedback. For details, visit www.remembermore.app or try the app and resources via https://classroom.remembermore.app/

