Neural Computers
Authors: Mingchen Zhuge, Changsheng Zhao, Haozhe Liu, Zijian Zhou, Shuming Liu, Wenyi Wang, Ernie Chang, Gael Le Lan, Junjie Fei, Wenxuan Zhang, Yasheng Sun, Zhipeng Cai, Zechun Liu, Yunyang Xiong, Yining Yang, Yuandong Tian, Yangyang Shi, Vikas Chandra, Jürgen Schmidhuber
Paper: https://arxiv.org/abs/2604.06425v1
Code: https://github.com/metauto-ai/NeuralComputer
Blog: https://metauto.ai/neuralcomputer/index_eng.html
TL;DR
WHAT was done? Researchers from Meta AI and KAUST propose a new architectural paradigm called a Neural Computer (NC), which unifies computation, memory, and I/O operations into a single learned latent runtime state. Rather than treating an AI as an agent that manipulates an external operating system, they instantiate the computer directly within the weights of a diffusion transformer (built on Wan2.1), demonstrating this concept via two prototypes: NCCLIGen for terminal environments and NCGUIWorld for desktop graphical interfaces.
WHY it matters? This work outlines a fundamental shift from the modular Von Neumann hardware/software stack to a unified “neural latent stack.” If this trajectory holds, future systems would be differentiably configured rather than explicitly coded. By showing that early runtime primitives, such as I/O alignment and short-horizon control, can emerge solely from observing interface traces, the paper sketches a roadmap toward Completely Neural Computers (CNCs) that could eventually replace traditional digital computing substrates.
Executive summary: For strategic leaders and systems researchers, this paper highlights a critical divergence in AI system design. While the industry invests heavily in tool-using agents that interact with external software, this research suggests an alternative in which the model itself absorbs the execution environment. Through extensive ablations on data quality and action injection, the authors show that models can render highly accurate interfaces and respond to user inputs. However, they also reveal a severe limitation in native symbolic reasoning: current video-based instantiations are exceptional renderers but fragile reasoners.
Details
The Mediation Bottleneck
The current trajectory of interactive AI is fundamentally mediated: the model always reaches the computer through an intermediary layer. We build autonomous agents that use low-bandwidth APIs or visual streams to manipulate external execution environments such as browsers, codebases, and operating systems. Conversely, we build world models that simulate the physical or digital dynamics of these environments for planning. In both paradigms, the ultimate source of truth (the executable state, the memory allocation, the system contract) remains isolated in a conventional, non-neural software stack. The authors identify this gap and propose that the neural network should no longer merely predict or interact with the computer; the neural network should be the computer. They define this as a Neural Computer (NC), a system that sheds the traditional division of memory, compute, and I/O, folding them into a continuous neural manifold. To test this, they move beyond the generic generation capabilities of Sora 2 or Veo 3.1 and construct specific, conditionable video-based prototypes, contrasting their approach against existing system architectures.
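
To make the "network is the computer" idea concrete, here is a deliberately tiny sketch in Python. It is not the authors' Wan2.1-based implementation: the class `ToyNeuralComputer`, the helper `one_hot`, the random untrained weights, and all dimensions are hypothetical and chosen only for illustration. The one point it tries to show is structural: a single latent vector acts as memory, the update rule acts as compute, and the action-in/frame-out interface acts as I/O, with no external software state consulted.

```python
import math
import random


class ToyNeuralComputer:
    """Hypothetical toy: one latent vector is memory, compute, and I/O state."""

    def __init__(self, state_dim=8, action_dim=4, frame_dim=6, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            # Random, untrained weights; a real system would learn these.
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_state = matrix(state_dim, state_dim)    # recurrent "compute"
        self.w_action = matrix(state_dim, action_dim)  # user input injection
        self.w_frame = matrix(frame_dim, state_dim)    # rendering to a "frame"
        self.state = [0.0] * state_dim                 # latent runtime state

    @staticmethod
    def _matvec(m, v):
        return [sum(w * x for w, x in zip(row, v)) for row in m]

    def step(self, action):
        """Consume one user action, update the latent state, emit a frame."""
        recurrent = self._matvec(self.w_state, self.state)
        injected = self._matvec(self.w_action, action)
        self.state = [math.tanh(r + i) for r, i in zip(recurrent, injected)]
        return self._matvec(self.w_frame, self.state)


def one_hot(index, size=4):
    """Encode a discrete action (e.g. a key press) as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


nc = ToyNeuralComputer()
frames = [nc.step(one_hot(k)) for k in (0, 2, 1)]
print(len(frames), len(frames[0]))  # prints: 3 6
```

In an agent or world-model setup, the equivalent of `self.state` would live in a conventional operating system outside the network. Here, by construction, nothing exists outside the latent vector and the weights, which is the property the NC definition emphasizes.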





