Can the brain learn faster from rare events than from repetition? A UCSF study changes the view on associative learning
For more than a century, the image of Pavlov's dog expecting food at the sound of a bell has served as a textbook illustration of the idea that the link between a stimulus and a reward is built through repetition: the more often the sound preceded the food, it was assumed, the stronger and faster the learning. But new research by scientists at the University of California, San Francisco (UCSF) proposes a different, and for many provocative, conclusion: the number of repetitions is not in itself decisive; what matters is how much time passes between rewards.
The paper, published on 12 February 2026 in Nature Neuroscience, argues that associative learning is strongly shaped by the spacing between outcomes, that is, between rewards. When rewards follow too closely on one another, the brain "extracts" less from each individual episode. When the spacing is larger, learning per trial becomes more efficient, even if there are fewer trials overall.
From “practice makes perfect” to “timing is everything”
In the classic explanation of associative learning, an animal (or a human) recognizes through repetition that a certain cue in the environment predicts an outcome. In modern neuroscience, this is often described through the role of dopamine: at first, dopamine is released more strongly when the reward arrives, and over time that signal “shifts” to the stimulus that predicts the reward. Such a shift in the dopamine response is interpreted as a mechanism by which the brain builds predictions, strengthens useful connections, and weakens those that are not confirmed.
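To make that textbook picture concrete, here is a minimal temporal-difference (TD) style sketch in Python; it illustrates the standard account described above, not the UCSF model, and every constant in it is arbitrary:

```python
# A minimal TD-style sketch (an illustration of the textbook account, not the
# authors' model; all constants are arbitrary). The cue's onset is treated as
# unpredictable, so the value of the background state is pinned at 0.
alpha = 0.1      # learning rate
V_cue = 0.0      # learned predictive value of the cue

for trial in range(1, 201):
    pe_at_cue = V_cue - 0.0           # prediction error when the cue appears
    pe_at_reward = 1.0 - V_cue        # prediction error when the reward arrives
    V_cue += alpha * pe_at_reward     # the cue absorbs the reward prediction
    if trial in (1, 10, 50, 200):
        print(f"trial {trial:3d}: cue PE = {pe_at_cue:.2f}, reward PE = {pe_at_reward:.2f}")

# Early on the error sits at the reward; with training it shrinks there and
# grows at the cue -- the "shift" of the dopamine signal described above.
```

In this standard account, learning speed is governed by the number of trials and the fixed learning rate; the interval between trials plays no role, which is exactly the assumption the UCSF team set out to test.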
The UCSF team, led by neuroscientist Vijay Mohan K. Namboodiri, wanted to test how closely that process is actually tied to the number of trials. In experiments on mice, they used a simple task: a sound (the stimulus) predicts sugar-sweetened water (the reward). Instead of varying the difficulty of the task or the type of reward, they varied something usually taken for granted: the spacing between trials.
Experiments in mice: fewer rewards, yet the same learning
In the first series of experiments, the researchers arranged trials so that one group had a short interval, approximately 30 to 60 seconds, while the other had a much longer one, from five to ten minutes or more. This created a situation that, by the old logic, should have clearly favored a “dense” schedule: mice with short intervals received many more rewards within the same time frame because they could go through more trials.
The result, however, went in the opposite direction. Groups that had significantly fewer trials, but with rewards spaced out, learned just as fast in terms of the total time needed to show the learned behavior. In other words, more trials did not mean proportionally faster learning. What changed was the number of trials needed to “catch” the association: with longer intervals, mice needed far fewer repetitions before they began to respond to the sound by expecting a reward.
In the published data, the authors state that, for example, mice with an interval of 600 seconds between trials learned on average in a single-digit number of trials, whereas the group with a 60-second interval needed many more trials to reach the same outcome. Although “rarer” trials looked like a slower path, the total time to the emergence of the learned behavior was comparable.
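To see why those two observations are consistent, consider a toy model (with made-up constants, not the paper's data) in which the amount learned per trial grows in proportion to the inter-trial interval:

```python
# A back-of-the-envelope illustration with made-up constants (not the paper's
# data): assume the amount learned per trial scales with the inter-trial
# interval, then count trials until a response criterion is reached.
k = 0.0005          # assumed learning gain per second of spacing
criterion = 0.5     # assumed associative strength needed to respond to the cue

for interval_s in (60, 300, 600):
    alpha = k * interval_s                     # learning rate grows with spacing
    strength, trials = 0.0, 0
    while strength < criterion:
        strength += alpha * (1.0 - strength)   # Rescorla-Wagner-style step
        trials += 1
    print(f"{interval_s:4d} s interval -> {trials:3d} trials, "
          f"~{trials * interval_s / 60:.0f} min total")

# Output pattern: ~23 trials at 60 s, single digits at 600 s, with the total
# time to criterion staying in the same ballpark -- the study's headline result.
```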
Dopamine as a “timer” for the interval between rewards
To understand what was happening in the brain, the researchers tracked dopamine activity during learning. Dopamine in this context is often described as a signal that helps the brain update expectations: when something better or worse than expected happens, the dopamine response can strengthen or weaken the connection between stimulus and outcome.
In the UCSF model, however, dopamine does not act only as a reaction to surprise, but also as part of a mechanism that takes into account the time interval between rewards. When rewards were rarer, the dopamine response to the stimulus appeared earlier, after fewer repetitions, as if the brain more quickly “concluded” that the cue truly carries information. When rewards were frequent and clustered, the brain learned less from each episode, so it took more repetitions for the dopamine signal to stably shift to the stimulus.
The authors summarize this with the thesis that associative learning is less “practice makes perfect” and more “timing is everything”: the efficiency of learning per trial increases when the interval between rewards is larger.
Not only spacing, but also the rarity of the reward
An interesting part of the study involved a scenario in which the stimulus is present regularly, but the reward appears rarely. In one variant, the researchers played the sound at intervals of about 60 seconds, but provided sugar-sweetened water in only about 10% of trials. This design mimics real-life situations in which a certain signal is present often, but the “payoff” happens occasionally and unpredictably.
In that case, mice began showing a dopamine response to the sound after a relatively small number of received rewards, and did so even on trials when the sound was not followed by a reward. Notably, with a cue every 60 seconds but a reward on only about one trial in ten, the average time between received rewards stretches to roughly ten minutes, so the rare-reward schedule is, in effect, another spaced schedule. This matters because it suggests the brain can build strong expectations and incentives from rare but informative outcomes. Such a mechanism could help explain why some behaviors become persistent and hard to extinguish, especially when rewards are intermittent.
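A quick simulation makes that arithmetic explicit; the cue interval and reward probability below come from the article's description, while everything else is purely illustrative:

```python
# A quick check of the intermittent-reward arithmetic (schedule parameters
# from the article's description; the simulation itself is illustrative).
import random

random.seed(0)
cue_interval_s, p_reward = 60, 0.10   # cue every 60 s, reward on ~10% of trials
reward_times, t = [], 0

for trial in range(10_000):
    t += cue_interval_s
    if random.random() < p_reward:
        reward_times.append(t)

gaps = [b - a for a, b in zip(reward_times, reward_times[1:])]
print(f"mean time between received rewards: {sum(gaps) / len(gaps):.0f} s")  # ~600 s
```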
Why “cramming” often fails: a possible link to school learning
Although the study deals with basic mechanisms of learning in the brains of mice, the authors and commentators point to potentially broader implications. One is intuitive: when information is “packed” into a short time, like intense studying the night before an exam, each individual repetition episode may have a smaller effect. Conversely, distributed learning over a longer period gives the brain the time gap that, according to this theory, increases the amount of “learning per event.”
In practice, this is close to what educational psychology has long recognized as the spacing effect. But the UCSF work seeks to offer a more precise neurobiological and mathematical description: it is not only that spacing is “better,” but that the learning rate can scale with the time between rewards or outcomes, while the total time needed to learn something remains approximately stable and the number of repetitions varies.
Implications for addictions: intermittent “triggers” and lasting habits
Even more sensitive consequences concern addictive behaviors. Smoking is often an example of a habit that involves many cues in the environment: the smell of smoke, the sight of a pack, a particular place, or company. The reward (nicotine and the accompanying dopamine response) does not have to arrive at perfectly regular intervals; in reality it can be intermittent, depending on the situation and availability. If the brain truly learns more strongly from rare, spaced, or unpredictable rewards, that could strengthen the link between such cues and craving.
The UCSF account also suggests why therapies that provide a continuous, stable dose (such as nicotine patches) might help some people. If the dopamine "signature" of the reward is constantly present and less tied to specific stimuli, the association between cues and reward can be disrupted or weakened. This could, at least in theory, reduce the power of the triggers that otherwise drive the urge for a cigarette.
Such an interpretation does not mean the solution is universal, nor that addiction can be reduced to a single mechanism. But it provides an additional framework for understanding why intermittent reinforcement and environmental triggers can be so powerful, and why treatment strategies often try to change the relationship between cues, expectations, and outcomes.
What this means for artificial intelligence: faster learning from fewer examples?
The authors also raise the question of whether such a principle could be transferred to artificial intelligence systems. Many modern learning algorithms, especially those relying on variants of reinforcement learning, update their estimates after an enormous number of interactions. This “trial-by-trial” approach resembles the older assumption about associative learning: each new episode brings a small correction, and progress is built through billions of repetitions.
If the brain can increase the learning rate per episode when outcomes are rarer or spaced out, this suggests models could be more efficient if they built the temporal structure of experience into the logic of learning itself. In that scenario, a system would extract more information from individual, “costlier” events, instead of relying on endless repetition with minimal shifts. The researchers emphasize that this is a direction for future work, not a finished recipe: transferring biological principles into computational models requires caution, testing, and clear limits of applicability.
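As one speculative illustration of that direction (not the authors' algorithm; every name and constant below is an assumption), a delta-rule value update whose step size scales with the time since the last outcome would learn more per rare event:

```python
# A speculative sketch, not the authors' algorithm: a delta-rule value update
# whose step size grows with the time elapsed since the last outcome. Every
# constant and name here is an assumption for illustration only.
def update_value(value, reward, seconds_since_last,
                 base_alpha=0.01, ref_interval_s=60.0, max_alpha=0.5):
    """Scale the learning rate by elapsed time (capped), then apply a
    standard delta-rule update toward the observed reward."""
    alpha = min(max_alpha, base_alpha * seconds_since_last / ref_interval_s)
    return value + alpha * (reward - value)

v = 0.0
for gap_s, r in [(60, 1.0), (600, 1.0), (60, 1.0)]:
    v = update_value(v, r, gap_s)
    print(f"gap {gap_s:4d} s -> value estimate {v:.3f}")

# The single 600 s gap moves the estimate roughly ten times further than
# either 60 s gap: rarer events carry more weight per event.
```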
A broader question: how complete was the old theory, really?
It is important to emphasize that the UCSF study does not claim that repetition is unimportant. In many skills, repetition builds automatism, precision, and endurance. What is being questioned is the simple equation "more trials = faster learning" in the domain of basic associative learning, especially when it comes to linking stimuli and outcomes in relatively short laboratory tasks.
The paper in Nature Neuroscience introduces the idea that the brain tracks the temporal "economy" of rewards: when rewards pile up, each carries less information about its cause; when rewards are spaced out, the brain treats each episode as more important for inferring what in the environment truly predicts the outcome. The authors tested this relationship across different intervals and showed that the number of trials to learning can change roughly in proportion to the change in spacing, while the total time to learning remains similar.
Additionally, the results in the paper were extended to learning about unpleasant outcomes, where the learning rate likewise appeared to scale with the time between outcomes. This suggests the principle is not limited to sweet rewards but may apply more broadly to how the brain sets expectations, whether it is approaching a reward or avoiding a threat.
What is currently clear, and what remains open
According to the available data, the study firmly shows that under a controlled task in mice, the interval between rewards strongly changes the efficiency of learning per trial, with clear changes in dopamine signaling. What still needs to be clarified is how these rules map onto complex human situations, where a “reward” can be abstract, delayed, or socially mediated, and stimuli are multiple and often ambiguous.
Still, the practical takeaway is easy to state: not every repetition is equally valuable. If the brain truly learns more when there is spacing between "payoffs", then learning, habits, and therapies may need to be viewed through the lens of rhythm and schedule, not only through the sum of trials.
Sources:
- UC San Francisco (UCSF) – study overview and key statements by the authors
- Nature Neuroscience – scientific paper "Duration between rewards controls the rate of behavioral and dopaminergic learning", DOI: 10.1038/s41593-026-02206-2
- Crossref Crossmark – official metadata on the online publication date (12 February 2026)