Testing-effect research is growing — but not for every learner

A review of nearly 24,000 scientific papers suggests retrieval-practice research still pays limited attention to neurodivergent learners and learning disabilities.

Illustration: contrasting learning settings and memory — conceptual image, not from the cited study.

Short version

Active retrieval — trying to remember information instead of simply rereading it — is one of the most strongly supported ideas in learning science.

That is why flashcards, quizzes, spaced repetition, and self-testing systems have become so popular in recent years. But a new scientific perspective highlights an important issue: most research in this area still focuses on “typical” learners and rarely examines how these methods work for people with learning differences such as dyslexia, ADHD, or other neurodivergent profiles.


What the paper explored

The authors analyzed nearly 24,000 scientific publications connected to the testing effect — the finding that recalling information from memory tends to strengthen learning more effectively than passive review.

Rather than asking only whether retrieval practice works, the researchers wanted to understand how the field itself has changed over time. They traced the shift from controlled laboratory experiments toward real educational settings such as classrooms, schools, and university courses.

In earlier decades, many testing-effect studies were highly artificial. Participants might memorize word lists or short passages and then complete memory tests a short time later. More recent work increasingly looks at real-world learning environments involving teachers, study habits, homework, and long-term educational outcomes.

The paper also uses dyslexia as a case study. The authors examined how often learners with reading difficulties appear in this literature and whether researchers can safely assume that standard retrieval-based techniques benefit them in the same way they benefit typical readers.

Importantly, this is not a new classroom experiment. It is a large-scale analysis and commentary on the research landscape itself.


What the researchers found

One major conclusion is that retrieval practice has clearly moved beyond the laboratory. Researchers are paying much more attention to practical educational use, including how teachers apply retrieval strategies in real classrooms and how students use them during independent study.

However, the authors argue that inclusion within the field remains uneven.

Despite the growing educational focus, learners with disabilities or neurodivergent traits still appear surprisingly rarely in testing-effect research. Much of the evidence behind modern study tools comes from relatively narrow participant groups — often university students without major learning difficulties.

The paper suggests this matters more than it may seem.

When retrieval practice becomes embedded in educational technology, flashcard systems, and classroom recommendations, there is a temptation to assume the same strategy works equally well for everyone. But the authors argue there are good theoretical reasons to be cautious about that assumption.

For example, many retrieval tasks rely heavily on reading fluency, spelling speed, written recall, or sustained attention. A learner with dyslexia may experience retrieval exercises very differently from a typical reader. Someone with ADHD might benefit from frequent testing in one context but feel overwhelmed in another, depending on cognitive load, timing, or task structure.

The authors are not claiming retrieval practice is ineffective for neurodivergent learners. In fact, it may still be highly beneficial. Their point is that the field does not yet have enough targeted evidence to support strong universal claims.


Why this matters

Modern learning culture is increasingly built around a simple idea:

“Test yourself more often to learn better.”

And in many situations, that advice is supported by strong evidence.

But real learners vary enormously in how they process information, maintain attention, manage working memory, or tolerate cognitive fatigue. A study strategy that works well for one student may need significant adaptation for another.

For some learners, standard flashcard systems may create unnecessary stress or overload. Others may benefit more from visual prompts, shorter sessions, adjustable repetition schedules, or alternative response formats.

The broader message of the paper is that educational effectiveness depends not only on the algorithm or technique itself, but also on the person using it.


Limitations

The paper is bibliometric in nature, meaning it analyzes patterns within scientific literature rather than directly measuring educational outcomes.

As a result, it can show which groups are understudied, but it cannot by itself determine exactly which learning approaches work best for specific individuals.

The dyslexia discussion is also primarily illustrative. Other areas — including autism, ADHD, dyscalculia, and related conditions — still require their own dedicated evidence bases.


Final thoughts

The testing effect remains one of the strongest and most replicated findings in learning science.

But as retrieval-based systems become more common in schools, apps, and self-education platforms, a broader question is emerging:

Not only “Does this method work?” but “Who does it work best for — and under what conditions?”

The most effective learning systems of the future will likely combine retrieval practice with flexibility, accessibility, adjustable difficulty, and support for different cognitive styles and learning paths.


This is a plain-language summary of: “Trends in testing effect research: from lab to classroom, but not yet for all learners”.

Source: npj Science of Learning (2026).