Cognitive Load and Language Teaching – What Teachers Need to Know

Cognitive science and educational psychology seem to be popping up everywhere lately, or perhaps they have just been on my radar because I recently followed The Learning Scientists on Twitter and have been listening to their podcasts on effective learning strategies. These ideas fit in nicely with the trend of the last few years towards more evidence-based teaching strategies, largely because such strategies are based on what we know about how the brain learns. While much work still needs to be done on how this applies to second language learning and teaching, there are a number of articles that support a more cognitively informed instructional approach. Recently (August, 2017), the Centre for Education Statistics and Evaluation published a white paper that specifically describes cognitive load theory and its instructional implications. This Research Bite will briefly summarize that paper and synthesize it with an article published online by John Sweller (May, 2017), which focuses on the second language implications of cognitive load theory.

Why Cognitive Load Theory?

Cognitive load theory is important because it offers an evidence-based explanation for key learning mechanisms. In addition, it has clear practical implications for teaching. Dylan Wiliam, a professor emeritus of educational assessment, described the theory as “the single most important thing for teachers to know”.

Underlying Concepts

There are several important concepts that make up cognitive load theory. First is working memory, a form of conscious memory that can hold and process only a limited amount of information for a short time. Research suggests that the average working memory can process about four “chunks” of information at any one time. A chunk can be simple or complex: 1, 7, 3, 9; 1993, 1997, 1999, 2001; blue, honey, cup, mouse; The Dark World, Age of Ultron, Civil War, Ragnarok. Each item in these lists is an example of a chunk.

In addition to working memory, we have long-term memory, which has no limits on storage (as far as we know). The knowledge it stores is organized into “schemas”, which are cumulative. In other words, proficient skill is created by combining lower-level schemas into more complex, higher-level ones. An entire schema can serve as a single chunk within working memory despite its complexity. This is one explanation for how cognitive load can be reduced.

So, then, what is cognitive load? It is the demand that processing new information places on our limited working memory. It has a clear impact on learning. Too much load can lead to poor understanding, confusion, or the inability to store new information in long-term memory.

Types of Cognitive Load

There are three types of cognitive load. Intrinsic cognitive load is the “inherent difficulty of the subject matter,” based on the complexity of the material itself combined with the learner’s prior knowledge. With more prior knowledge, inherently complex material becomes easier. Therefore, intrinsic cognitive load is relative. Scaffolding and introducing material through its basic individual parts first (a part-whole approach) are ways to limit intrinsic load. Extraneous cognitive load, which is a negative load, is related to how the material is taught. In particular, this refers to “Poorly designed instruction that does not facilitate schema construction and automation.” An example of this would be minimally guided instruction – getting learners to figure out complex material themselves without support. Germane cognitive load is considered positive. This is the load that comes from the very process of learning – moving information from working to long-term memory. Although germane cognitive load is positive, the three loads are treated as additive, so when their combined total exceeds the capacity of working memory, the result is cognitive overload.

Where’s the Evidence?

A number of experiments and randomized controlled trials (RCTs) support these findings. Much of the research concerns worked examples. In a worked example, a math problem is presented step by step, often already fully worked out, with explanations along the way. Evidence shows that students learning math through worked examples learned more quickly and were able to transfer skills. Other evidence comes from indirect measurements. A number of studies comparing two instructional techniques – one conventional and one focusing on reducing cognitive load – have shown positive results for cognitive-load-based interventions.

Still, there are some points of contention regarding cognitive load theory. These concern not its underlying assumptions but its more specific elements: the lack of clear definitions, whether the three types of cognitive load are really additive, the lack of direct measurements of cognitive load, and the generalizability of the evidence.

General Teaching Implications

Most research recommends focusing on explicit teaching for novice learners or when introducing new and complex material in order to reduce intrinsic load. In addition, explicit teaching with explicit guidance will reduce extraneous load. A number of studies support this. Some evidence suggests explicit teaching actually increases motivation. Note that this is for novices and new information. Once this new information has been learned, learners should be free to practice it independently and in less structured ways.

Teach specific skills rather than generic skills. Teach how to solve a specific problem rather than a general “problem-solving strategy”: “teaching domain-specific skills is more effective because, while general problem-solving skills are innate to humans and therefore do not need to be explicitly taught, domain-specific skills are not automatically acquired by learners without explicit teaching” (p. 6). Other recommendations come in the form of “effects”:

  • The worked-example effect, discussed above.
  • The expertise-reversal effect suggests worked examples become less effective as learners’ knowledge increases. They can become counter-productive. Therefore, they are only useful for novices or new material.
  • The redundancy effect is related to extraneous load. This is when students are given unnecessary or redundant information that overloads their working memory. Examples include text and a diagram that repeat the same information, or a presenter reading text aloud from a PPT.
  • Related to this is the split attention effect; that is, having to process multiple separate sources of information at the same time. An example would be when “a diagram is used to explain a concept, but it cannot be understood without referring to a separate piece of explanatory text.” This requires holding two separate sources of information in working memory while trying to synthesize them.
  • Conversely, the modality effect shows that non-redundant sources of visual and auditory information can be effective when presented at the same time. Visual and auditory information are processed in separate streams of working memory and therefore will not cause overload.

Language Teaching Implications

Sweller (2017), who is one of the foundational scientists behind cognitive load theory, explains that the theory and its research have specific relevance for adult second language learning, much of which runs counter to established teaching methods. It’s important to point out that Sweller writes with the adult learner in mind, and he makes sure to emphasize that a child’s first language acquisition – “picking up a language” – is a different phenomenon from an adult’s second language learning. I would add here that first language acquisition should really refer to first languages, and that second language acquisition for adults is decidedly deliberate, whereas children’s language learning often is not.

His recommendations are that language teaching should be explicit: present grammar and vocabulary explicitly rather than requiring students to discover them. Avoid the split attention effect by providing translations nearby or next to the original word, such as in a reading or worksheet, or by connecting words and translations with arrows on a PPT. Don’t split learners’ attention by making them search through a dictionary or a separate piece of paper. Avoid the redundancy and expertise-reversal effects by not providing translations of words they already know (obviously!). In terms of language immersion, understand that immersion is not suitable for novice learners; however, those with more expertise will certainly benefit. This is also related to the above effects.


The raft of evidence elucidating the underlying mechanisms of learning makes it quite clear why understanding cognitive load theory is important. Of course, there needs to be more research in SLA. However, the research suggests a number of practical takeaways. With a little creative thinking, I’m sure we can generate some interesting and effective ideas on how to transform this knowledge into useful pedagogy (beyond what Sweller has suggested). So, how do you use, or how can you imagine using, cognitive load theory to inform your teaching and instructional/material design?


Centre for Education Statistics and Evaluation. (August, 2017). Cognitive load theory: Research that teachers really need to understand.  Downloaded from

Sweller, J. (May 16, 2017). Cognitive load theory and teaching English as a second language to adult learners. Contact (TESL Ontario’s magazine). Retrieved from

Anthony Schmidt
English language Instructor at University of Tennessee, Knoxville
Anthony Schmidt is editor of ELT Research Bites. He also has his own blog. Offline, he is a full-time English language instructor in a university IEP. He is interested in all aspects of applied linguistics, in particular English for Academic Purposes.

13 thoughts on “Cognitive Load and Language Teaching – What Teachers Need to Know”

  1. Your blog remains an excellent source of thought provocation! I’ve recently become aware of Cognitive Load Theory and, like Mura and many others, think that its real value lies in the area of instructional design. Which is pretty fundamental to what we do as teachers!

    Thanks for raising awareness of this theory in the field of ELT. I hope now to be able to take this further with colleagues at work and we can explore what relevance CLT has to our day-to-day work.

  2. I must agree with Mura. Sweller is not very good at explaining language learning (L1 learning based on explicit learning (Anderson’s ACT, for example) really doesn’t cut it) and even the teaching tips are questionable. As Mura suggests, input elaboration (involving adding redundancy and regularity to texts) has been shown to help reading comprehension, listening comprehension and incidental vocab. acquisition. See, for example, Kim, Y. (2006) The Effects of input elaboration on vocabulary acquisition…. TESOL Quarterly, 40, 2, 341-373.

    1. I’m not doubting that input enhancement is helpful. In fact, I think Sweller recommends that when he is talking about glosses. I think I have misrepresented the redundancy effect a bit. It is mostly concerned with receiving redundant information through two different channels that compete for the same processing resources. For example, reading while listening. has a good overview, including research on language learning.

  3. hi Anthony
    some quick comments:

    – assertion by Sweller that adult language learning is “biologically secondary” seems to ignore work which assumes the implicit nature of acquiring language (e.g. generative grammar, usage based theories)?
    – some critics assess the distinction of Geary’s model of two systems may not be as clearly thought out as say other two system models? (e.g.
    – from the Sweller link – “Learners should not be required to search for needed information.” how does this tie with say involvement load hypothesis of Hulstijn and Laufer (2001) where “search” is one of the cognitive components and more “search” e.g. consulting a dictionary is said to lead to better vocabulary retention?
    – considering the reported benefits of elaborated input (not translations but “redundancy and clearer signaling of thematic structure in the form of examples, paraphrases and repetition of original information, and synonyms and definitions of low- frequency words” – Oh, 2001) for novice language learners is there evidence that such elaborated input is not beneficial for more expert language learners?

    i think cog load is relevant for instructional design, but find it harder to see relevancy for acquiring language

    1. Quick responses
      1. I think Sweller was trying to set the context for differentiating between 1st and 2nd language learning by emphasizing that they are learned differently, and that 2nd language acquisition/learning is often more deliberate, especially by adults. Of course you can acquire it implicitly or through immersion, but I’m not sure how cognitive load affects this. Overstimulation might be a synonym here. Most adults learn in a deliberate context (e.g. a classroom), and for them we need to consider cognitive load for good instructional design.
      2. I have no clue about this.
      3. If vocabulary learning is the task, then switching seems OK. But if reading is the task and you must switch to look up words, you are clearly diminishing your attentional resources. Multi-tasking doesn’t work.
      4. I think he was saying that redundancy is not useful for proficient learners (those proficient in the redundant material) as it is more of a distractor than an aid.

      A cursory search for cognitive load and ESL brings up a number of related articles. I think the relevancy has been well established.

      Thanks for commenting!

      1. hi
        what articles did you find?
        i chased up the Sweller & Sweller (2006) ref and in there they refer to +literacy+ as “biologically secondary” no ref to adult language learning, me thinks Sweller in that TesolOntario post is jumping the shark!

        1. Does he mean that the act of literacy and learning to read is not a natural act like learning to speak and understand a language? If so, that makes sense since reading and writing are biologically unnatural. What is the reference?

          1. yes he takes his cue from Geary; seems plausible but what’s the relevance to language learning? the ref is Sweller, J. & Sweller, S. (2006). Natural information processing systems. Evolutionary Psychology, 4, 434–458.
