Misalignment Between Backpropagation and the Hierarchy of Brain Responses to Images

Jun 08, 2026

Authors: Joséphine Raugel, Maximilian Seitzer, Marc Szafraniec, Huy V. Vo, Jérémy Rapin, Patrick Labatut, Piotr Bojanowski, Valentin Wyart, Jean-Remi King
Paper: https://arxiv.org/abs/2605.28693
Code: N/A
Model: N/A

TL;DR

WHAT was done? The researchers mapped both the forward-processing signals and the backward-learning signals (gradients) of state-of-the-art vision models directly onto high-resolution human brain scans (fMRI and MEG) to see if the brain uses a biological equivalent of backpropagation to learn.

WHY it matters? While artificial neural networks and the human brain form highly similar internal representations when looking at images, this study reveals that their underlying learning processes are profoundly different. This suggests that the brain relies on alternative, potentially more efficient learning mechanisms yet to be fully exploited in artificial intelligence.

Details

The Mirage of Shared Vision

For over a decade, computational neuroscientists have been struck by a fascinating parallel: when a deep artificial neural network is trained to recognize images, its internal hierarchy of representations mirrors that of the human visual system. If you show a picture to both a vision model and a human, the early layers of the model predict activity in the early visual cortex, while the deeper layers predict activity in the late, high-level visual areas that process semantic concepts. This remarkable alignment has tempted many to assume that deep learning models and the human brain are fundamentally doing the same thing.

However, this correspondence has only ever been verified in the “forward pass”: the rapid cascade of activations that flows from the retina or input layer to the deeper processing centers of the brain or model. The far deeper mystery lies in the “backward pass,” the learning process itself. In modern artificial intelligence, learning is driven by backpropagation, an algorithm that computes error signals, or gradients, from the top of the network down to the bottom in order to adjust individual connection weights. Whether the biological brain implements something equivalent to backpropagation remains one of the most fiercely debated questions in cognitive science. If the brain and AI arrive at the same representational destination, do they take the same learning route to get there?

Chasing the Backward Ripple

To answer this, we can think of forward activations as water flowing down a terraced, multi-tier waterfall. The water starts at the top (representing low-level features like edges and shapes) and ripples down to the bottom pools (representing complex objects and concepts). Backpropagation is like sending a series of dye-pulses backward from the bottom pool up to the very top spring to adjust the gates of each terrace based on how the water is flowing.

This analogy holds up to a point: in an artificial network, these backward dye-pulses are mathematically precise updates computed in a strict, step-by-step sequence. In the physical brain, however, forward and backward signals are constantly intermingling in a messy, real-time feedback loop. Still, the metaphor highlights what we should expect if the brain utilizes backpropagation: we should be able to track these upward-flowing dye-pulses traveling backward across both time and space in the human visual cortex.

To catch these elusive learning signals in biological tissue, the researchers designed a conservative, sensitive metric they call Residual Backpropagation. Because the forward signals and the backward learning signals in a neural network are highly correlated, standard comparison tools can easily mistake one for the other. The authors solved this by measuring how well a joint combination of forward and backward model features could predict brain activity, and then subtracting the prediction score of the forward activations alone:

$$R_{\text{Residual Backprop}} = R(\text{Forward} + \text{Backprop}) - R(\text{Forward})$$

In this equation, R is the encoding score: the Pearson correlation between the brain activity predicted from the model’s internal states and the activity actually recorded, reported as a vector R ∈ [−1, 1]^m. Think of it as a measure of how well a computer model’s internal states mirror actual patterns of human brain activity. Subtracting the forward score isolates the predictive power contributed solely by the backpropagated learning signals, beyond what forward activations already explain. If this residual score is significantly above zero, brain activity contains patterns that correspond to artificial learning signals and that forward activations alone cannot account for.
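The subtraction can be made concrete with a small sketch. This is not the authors' code: it assumes a closed-form ridge regression as the encoding model, a simple train/test split, synthetic data, and that m indexes brain channels (voxels or sensors). All function names are hypothetical.

```python
import numpy as np

def ridge_fit_predict(X_train, Y_train, X_test, alpha=1.0):
    """Fit ridge regression in closed form and predict held-out responses."""
    mu_x, mu_y = X_train.mean(axis=0), Y_train.mean(axis=0)
    Xc, Yc = X_train - mu_x, Y_train - mu_y
    gram = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    W = np.linalg.solve(gram, Xc.T @ Yc)
    return (X_test - mu_x) @ W + mu_y

def pearson_per_channel(Y_true, Y_pred):
    """Pearson correlation computed separately for each channel (column)."""
    a = Y_true - Y_true.mean(axis=0)
    b = Y_pred - Y_pred.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return (a * b).sum(axis=0) / denom

def encoding_score(X, Y, n_train, alpha=1.0):
    """Train on the first n_train samples, score on the remaining ones."""
    pred = ridge_fit_predict(X[:n_train], Y[:n_train], X[n_train:], alpha)
    return pearson_per_channel(Y[n_train:], pred)

def residual_backprop(forward, backward, Y, n_train, alpha=1.0):
    """R(Forward + Backprop) - R(Forward), one value per channel."""
    joint = np.hstack([forward, backward])
    return (encoding_score(joint, Y, n_train, alpha)
            - encoding_score(forward, Y, n_train, alpha))

# Synthetic demo: backward features are correlated with forward features,
# but also carry an independent component that the simulated brain uses.
rng = np.random.default_rng(0)
n, d, m = 600, 8, 5
forward = rng.standard_normal((n, d))
backward = 0.7 * forward + 0.7 * rng.standard_normal((n, d))
Y = (forward @ rng.standard_normal((d, m))
     + backward @ rng.standard_normal((d, m))
     + 0.5 * rng.standard_normal((n, m)))

scores = residual_backprop(forward, backward, Y, n_train=400)
print(scores)  # one residual score per simulated channel
```

If the backward features were an exact linear function of the forward features, the joint model would predict nothing new and the residual would sit near zero, which is the sense in which the metric is conservative.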

Mapping Gradients to Gray Matter

The experimental architecture of this study, traced conceptually in the schematic of Figure 1, brings together two worlds. On one side is a suite of state-of-the-art vision models, led by DINOv3-S, a self-supervised Vision Transformer whose student-teacher training framework mimics the unsupervised, predictive way human brains learn from their environment. On the other side is an exceptionally rich set of human neuroimaging data. The researchers used ultra-high-field 7 Tesla functional Magnetic Resonance Imaging (fMRI), which provides sharp spatial maps of brain activity, and Magnetoencephalography (MEG), which captures the magnetic fields produced by the brain’s electrical activity at millisecond resolution.

The walkthrough of the experiment is straightforward. Natural images are fed into the vision model to extract its layer-by-layer forward activations and its corresponding backpropagated gradients, which are computed using the model’s natural training objectives. These high-dimensional model signals are then compressed using Principal Component Analysis to keep the math tractably small. At the same time, human subjects view the same images while their brain responses are recorded. Finally, linear regression models—acting as translation bridges—are trained to predict the physical brain activity from the compressed model features. By comparing where (fMRI) and when (MEG) these mappings succeed, the team could directly track the biological footprint of backpropagation.
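The pipeline above can be sketched end to end on toy data. The example below is only an illustration under strong assumptions: a two-layer NumPy network stands in for the vision model, a squared-error loss stands in for the model's actual training objective, the brain responses are simulated, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_pixels, n_hidden, n_out = 200, 32, 64, 10

# Toy stand-in for a vision model: x -> h = tanh(x W1) -> y = h W2
W1 = rng.standard_normal((n_pixels, n_hidden)) / np.sqrt(n_pixels)
W2 = rng.standard_normal((n_hidden, n_out)) / np.sqrt(n_hidden)
images = rng.standard_normal((n_images, n_pixels))
targets = rng.standard_normal((n_images, n_out))  # hypothetical training targets

def forward_and_gradient(x, t):
    """Return hidden activations and the loss gradient with respect to them."""
    h = np.tanh(x @ W1)      # forward activation of the hidden layer
    y = h @ W2               # model output
    dL_dy = y - t            # gradient of 0.5 * ||y - t||^2 w.r.t. the output
    dL_dh = dL_dy @ W2.T     # error signal backpropagated to the hidden layer
    return h, dL_dh

def pca_compress(features, n_components):
    """Project centered features onto their top principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

activations, gradients = forward_and_gradient(images, targets)
act_pcs = pca_compress(activations, 8)
grad_pcs = pca_compress(gradients, 8)

# Simulated brain responses (3 channels), then a linear map from model
# features (plus an intercept column) to those responses.
brain = act_pcs @ rng.standard_normal((8, 3)) + 0.1 * rng.standard_normal((n_images, 3))
X = np.hstack([act_pcs, grad_pcs, np.ones((n_images, 1))])
weights, *_ = np.linalg.lstsq(X, brain, rcond=None)
print(weights.shape)  # (17, 3)
```

In a real analysis the regression would be regularized and evaluated on held-out images; the plain least-squares fit here only shows where each stage of the pipeline sits.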

Simultaneous Peaks and Reversed Maps

The results, visualized across several key figures, deliver a striking paradox. On the positive side, the backpropagated gradients from the model did indeed reliably predict human brain responses. As shown in Figure 3, across eight diverse vision architectures—including vision-language models like CLIP and supervised architectures like ConvNeXt-L—residual backpropagation explained a statistically significant portion of neural variance that forward activations alone could not capture. This signal was most prominent in higher-level visual cortices and at later processing delays, peaking between 0.5 and 1.7 seconds after an image appeared.

Yet, when the researchers zoomed in on the temporal and spatial organization of these signals, the biological plausibility of backpropagation fell apart. Standard backpropagation requires a strict, sequential top-down cascade: the final layer must compute its errors first, which are then passed backward to earlier layers over time.

But the temporal records of the human brain told a completely different story. As detailed in the MEG curves of Figure 2F and Figure 4F, the gradient-related signals across all layers of the model peaked in the brain simultaneously. There was no sequential, layer-by-layer delay as the learning signal traveled backward.
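One way to express this contrast numerically is to compare peak latencies across layers. The sketch below uses synthetic time courses, not the paper's data; the peak times, widths, and function names are illustrative assumptions. A sequential top-down cascade would give a clear negative slope of latency against layer depth (deepest layer first), while simultaneous peaks give a slope near zero.

```python
import numpy as np

def peak_latencies(scores, times):
    """Time of the maximal score for each layer; scores is (n_layers, n_times)."""
    return times[np.argmax(scores, axis=1)]

def latency_slope(latencies):
    """Least-squares slope of peak latency against layer index (seconds per layer)."""
    depth = np.arange(len(latencies))
    return np.polyfit(depth, latencies, 1)[0]

times = np.linspace(0.0, 2.0, 201)  # seconds after image onset (synthetic)
n_layers = 6

def bump(center, width=0.2):
    """Gaussian-shaped synthetic score time course peaking at `center`."""
    return np.exp(-0.5 * ((times - center) / width) ** 2)

# Scenario A: sequential top-down cascade, the deepest layer peaks first.
sequential = np.stack(
    [bump(1.0 + 0.1 * (n_layers - 1 - layer)) for layer in range(n_layers)]
)
# Scenario B: every layer peaks at the same moment.
simultaneous = np.stack([bump(1.0) for _ in range(n_layers)])

slope_sequential = latency_slope(peak_latencies(sequential, times))
slope_simultaneous = latency_slope(peak_latencies(simultaneous, times))
print(slope_sequential, slope_simultaneous)
```

The pattern described in the MEG results corresponds to Scenario B: no systematic delay between layers.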

Spatially, the disconnect was just as glaring. Instead of early sensory areas aligning with shallow gradients and association areas aligning with deep gradients, the maps were entirely inverted: early visual areas best matched the deepest, late-stage gradients of the model, shown by the negative correlation trend in Figure 2D.
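The spatial comparison can be summarized in the same spirit: for each brain region, find the model layer whose features score best, then correlate that layer index with the region's position in the visual hierarchy. The sketch below uses made-up score matrices and a hypothetical four-region ordering; it only illustrates what an aligned map versus an inverted map looks like as a single number.

```python
import numpy as np

def best_layer_per_region(scores):
    """scores is (n_layers, n_regions); return the best-scoring layer per region."""
    return np.argmax(scores, axis=0)

def pearson(a, b):
    """Pearson correlation between two 1-D sequences."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

# Hypothetical regions ordered from early sensory (0) to high-level (3).
region_rank = np.arange(4)

# Aligned map: shallow layers fit early regions, deep layers fit late regions.
aligned_scores = np.eye(4) + 0.1
# Inverted map: the deepest layers fit the earliest regions.
inverted_scores = aligned_scores[::-1]

aligned_trend = pearson(region_rank, best_layer_per_region(aligned_scores))
inverted_trend = pearson(region_rank, best_layer_per_region(inverted_scores))
print(aligned_trend, inverted_trend)
```

A positive trend is what the forward-pass literature reports for activations; the negative trend described for the gradients is the inverted case.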

Additionally, by tracking these signals across the actual training lifecycle of the DINOv3-S model (shown in Figure 5 and Figure 6), the researchers uncovered a fascinating developmental trajectory. The residual backpropagation signal in the brain did not grow continuously. Instead, it followed a non-monotonic curve—climbing to a transient peak early in training (around 10,000 iterations) before quietly dropping off again as the model’s teacher and student networks converged on stable representations. This matches the intuition that learning signals should be most active when a system is actively absorbing new information, but it still occurred without the spatiotemporal hierarchy that defines artificial backpropagation.

The Deepening Rift in Credit Assignment

This work sits at a critical intersection in computational neuroscience. Over the years, researchers like Lillicrap et al. (2020) have devised ingenious, “biologically plausible” approximations of backpropagation to work around biological constraints, such as the requirement that feedback synaptic connections exactly mirror the feedforward connections (the weight transport problem).

This puzzle lies at the heart of credit assignment, the fundamental algorithmic challenge of determining which specific synapses or pathways among billions of options are responsible for an overall success or error. Importantly, these biologically plausible variants preserve the core computational structure of backpropagation: error signals are computed at higher levels and sequentially propagated downward to guide synaptic updates at earlier stages. By finding no trace of this sequential, top-down temporal progression in human fMRI and MEG recordings, this study provides a strong empirical constraint. It suggests that while theories of biological backpropagation are mathematically feasible on paper, they do not match the learning-related signals that can be measured in the human brain in vivo.

What the Mind Can Do That Models Can’t

Despite its rigorous design, this study must be interpreted within the constraints of modern neuroimaging. Non-invasive tools like fMRI and MEG are marvels of science, but they operate at a macroscopic scale. They cannot directly resolve the microscopic, synapse-level changes or the complex signal processing that occurs within the individual dendrites of a single neuron. It is possible that sequential learning signals are occurring at sub-millimeter scales that are simply smoothed out by macroscopic scans.

Furthermore, models like DINOv3-S operate on static, isolated images. The human brain, conversely, exists in a continuous, dynamic loop of sensory feedback, constantly using recurrent connections to predict the next millisecond of visual input. A static, feedforward-feedback network is an imperfect proxy for a recurrent, active system.

A Call for Alternative Intelligence

The broader significance of this research is a sobering but exciting reminder: representational similarity does not imply algorithmic similarity. The fact that artificial neural networks and human brains both learn to see the world using similar hierarchical building blocks is a beautiful example of convergent evolution. But they are scaling different sides of the same mountain.

By showing that the brain’s learning dynamics diverge from the canonical top-down sequence of backpropagation, this work strongly motivates the exploration of alternative credit-assignment theories. Frameworks like local Hebbian learning, contrastive predictive coding, and real-time predictive coding do not rely on a globally coordinated, backward mathematical pass. Unlocking these biological secrets is not just a quest for neuroscience; it may hold the key to designing a new class of AI models that approach the data efficiency and low power consumption of the human mind.

ArXivIQ
