Neuroscience & Neurotechnology

The Brain Times Its Lessons by the Clock, Not the Count

A UCSF study in mice finds that how fast an animal learns a cue-reward link depends on the time between rewards, not how many pairings it sees. The result contradicts a core assumption behind decades of dopamine learning models.

Abel Chen · February 21, 2026 · 4 min

Imagine two students studying the same flashcards. One drills forty cards in an hour. The other sees four in that same hour. Common sense says the first student learns more. That is also what the standard textbook picture of the brain says: more pairings, more learning. A new study in mice says common sense is wrong.

Researchers at the University of California, San Francisco report that the speed at which an animal learns to link a cue with a reward does not track the number of times it experiences that pairing. Instead, it tracks the time that passes between one reward and the next. Cram more cue-reward pairings into a fixed stretch of time and, over that stretch, the animal ends up learning the same amount. The count stops mattering. The clock takes over.

Why this dents a decades-old model

For years, the dominant account of reward learning has run through dopamine and a quantity called the reward prediction error, the gap between what an animal expected and what it got. Most versions of this idea are what the authors call trial-based. Learning accumulates one experience at a time, pairing by pairing, like beads on a string. Built into that framing is a quiet assumption: over a fixed window, the more cue-reward pairings you rack up, the more you should learn.
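To make that assumption concrete, here is a minimal sketch of a textbook trial-based update in the Rescorla-Wagner style, where every pairing moves the learned value by a fixed fraction of the prediction error. It is an illustration only, not the model tested in the study; the function name `trial_based_value` and the learning rate `alpha` are assumptions chosen for the example.

```python
def trial_based_value(n_pairings, alpha=0.1, reward=1.0):
    """Learned cue value after n_pairings with a fixed per-pairing learning rate.

    alpha is a hypothetical value picked for illustration.
    """
    value = 0.0
    for _ in range(n_pairings):
        prediction_error = reward - value  # what the animal got minus what it expected
        value += alpha * prediction_error  # same update rule on every pairing
    return value


few = trial_based_value(4)    # the student who sees four cards
many = trial_based_value(40)  # the student who drills forty
```

Under this rule `many` always comes out larger than `few`, whatever the spacing between pairings, because time never enters the update. That is the bead-counting prediction.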

The UCSF team, led by Dennis A. Burke and senior author Vijay Mohan K. Namboodiri, set out to test that assumption directly. Across many experimental conditions in mice, they measured both the animals' behavior and the activity of their dopamine system as the animals figured out which cues predicted reward. The pattern was consistent. Behavioral and dopaminergic learning rates scaled with the interval between rewards, or between punishments: the longer the gap, the more each pairing taught. Because of that scaling, the total amount learned over a fixed duration came out roughly independent of how many individual pairings the animal had lived through.

Put bluntly, packing rewards closer together did not speed learning up. Spacing them out did not slow it down in the way a bead-counting model would predict. The brain seemed to be measuring elapsed time, not tallying events.
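A toy calculation shows why an interval-scaled learning rate makes the count drop out. The sketch below assumes the per-pairing learning rate is simply proportional to the gap between rewards. That functional form, the one-hour session, and the constants `RATE_PER_S` and `FIXED_ALPHA` are all hypothetical choices for illustration, not values from the study.

```python
SESSION_S = 3600.0   # hypothetical one-hour session
RATE_PER_S = 0.0002  # hypothetical learning rate per second of inter-reward interval
FIXED_ALPHA = 0.018  # hypothetical fixed rate for the trial-based comparison


def learn(n_pairings, alpha, reward=1.0):
    """Apply the same prediction-error update n_pairings times."""
    value = 0.0
    for _ in range(n_pairings):
        value += alpha * (reward - value)
    return value


def interval_scaled(interval_s):
    """Value learned in one session when alpha grows with the interval (assumed proportional)."""
    n_pairings = int(SESSION_S // interval_s)
    alpha = min(1.0, RATE_PER_S * interval_s)  # capped so a single step cannot overshoot
    return learn(n_pairings, alpha)


def fixed_rate(interval_s):
    """Value learned in one session when alpha ignores the interval."""
    n_pairings = int(SESSION_S // interval_s)
    return learn(n_pairings, FIXED_ALPHA)


dense_s, sparse_s = 90.0, 900.0  # 40 pairings vs 4 pairings in the hour

scaled_dense, scaled_sparse = interval_scaled(dense_s), interval_scaled(sparse_s)
fixed_dense, fixed_sparse = fixed_rate(dense_s), fixed_rate(sparse_s)
```

With these made-up numbers the two interval-scaled sessions end within a few hundredths of each other despite a tenfold difference in pairings, while the fixed-rate sessions differ several-fold. The match is only approximate, which fits the article's "roughly independent."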

A retrospective account of dopamine

To explain the data, the researchers turned to a different flavor of dopamine model, one based on retrospective learning. Rather than marching forward trial by trial and asking what a cue predicts next, this framing has the animal looking backward from a reward and asking what preceded it, and how long ago. That backward-looking, time-sensitive logic reproduces the interval effect the experiments turned up. It also, the authors argue, pulls several threads together into one account of how the biology of learning actually works.

This matters beyond mouse behavior rooms. Reward prediction error is one of the most influential ideas linking neuroscience to artificial intelligence, and it sits under a large body of work on how brains and machines learn from feedback. If a core assumption of the trial-based version does not survive contact with the timing data, the models that inherit that assumption may need adjusting too.

What the study does not settle

The work was done in mice, and the leap to human learning is not automatic. The experiments center on relatively simple cue-reward and cue-punishment associations, not the tangled, multi-step learning that fills a real day. The retrospective model fits the results well, but a good fit is an argument, not a closed case, and competing explanations will get their turn. There is also a boundary question the paper itself raises: intervals cannot stretch to infinity and keep working the same way, so where the rule bends at the extremes is left for later.

Still, the core observation is hard to wave away. Two knobs that most models treat as interchangeable, the number of experiences and the time they span, came apart in the data, and time won. According to PubMed, the study appears in Nature Neuroscience (doi.org/10.1038/s41593-026-02206-2). The finding suggests the brain's tutor is watching a clock as much as counting repetitions, which is a strange and useful thing to know about how any of us come to expect a reward.
