Generative Recursive Reasoning

May 20, 2026

∙ Paid

Authors: Junyeob Baek, Mingyu Jo, Minsu Kim, Mengye Ren, Yoshua Bengio, Sungjin Ahn
Paper: https://arxiv.org/abs/2605.19376
Code: https://ahn-ml.github.io/gram-website
Model: N/A

TL;DR

WHAT was done? The authors introduce Generative Recursive reAsoning Models (GRAM), a probabilistic framework that transforms recursive latent reasoning from a deterministic sequence of updates into a stochastic, multi-trajectory computation. By integrating learned, state-dependent Gaussian residual perturbations into latent transitions and training the system via amortized variational inference, GRAM models both conditional reasoning p_θ(y∣x) and unconditional generation p_θ(x) over continuous-latent trajectories.

WHY it matters? Existing recursive reasoning models suffer from mode collapse in multi-solution landscapes because their state trajectories are fundamentally deterministic. GRAM breaks this bottleneck by enabling width-based inference-time scaling (parallel trajectory sampling) as a latency-friendly complement to traditional depth-based scaling. It consistently outperforms leading deterministic recursive baselines on complex reasoning and constraint-satisfaction tasks (such as Sudoku-Extreme, N-Queens, and Graph Coloring) while remaining parameter-efficient and demonstrating strong unconditional generation capabilities.

Details

The Deterministic Attractor Trap in Latent State Refinement

A major focus in current AI reasoning research is the implementation of extended computation. While large language models typically scale computation by extending explicit sequence lengths via Chain-of-Thought tokens, recursive reasoning architectures decouple parameter scale and sequence length by refining a persistent latent state over time using shared weight blocks. Representing this recurrent paradigm are architectures like Universal Transformers ^[review] (representing the classic class of Universal Recursive Models, hereafter referred to as URMs or Looped Transformers), the Hierarchical Reasoning Model (HRM) ^[review], and the Tiny Recursive Model (TRM) ^[review]. Recent variations like the Universal Reasoning Model (URM) ^[review] attempt to improve this deterministic bias using convolutional layers and truncated backpropagation. However, a profound bottleneck across these systems is their deterministic nature. Given a specific input and initialization, these models follow a single, fixed trajectory through latent space and converge to a single attractor. When solving problems with multi-solution landscapes or hard constraints, deterministic systems easily get trapped in suboptimal local minima with no mechanism for recovery or path exploration. GRAM addresses this limitation by reformulating recursive latent reasoning as a stochastic, multi-hypothesis generative process.

Continue reading this post for free, courtesy of Grigory Sapunov.

Or purchase a paid subscription.

ArXivIQ

Generative Recursive Reasoning

TL;DR

Details

The Deterministic Attractor Trap in Latent State Refinement

Continue reading this post for free, courtesy of Grigory Sapunov.