AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem
The Shift from Graphical Interfaces to Intent-Driven Ecosystems
Authors: Rui Liu, Tao Zhe, Dongjie Wang, Zijun Yao, Kunpeng Liu, Yanjie Fu, Huan Liu, Jian Pei
Paper: https://arxiv.org/abs/2603.08938
TL;DR
WHAT was done? The authors propose a conceptual and architectural redesign of the computing environment called AgentOS. It replaces traditional Graphical User Interfaces (GUIs) and isolated applications with a “Single Port” natural language interface and an “Agent Kernel” that dynamically translates user intent into composable “Skills-as-Modules.”
WHY it matters? Deploying probabilistic, autonomous large language model (LLM) agents on top of legacy, deterministic operating systems creates fragile interaction loops and severe security vulnerabilities. By reimagining the operating system as a continuous Knowledge Discovery and Data Mining (KDD) pipeline, AgentOS offers a structurally native way to orchestrate multi-agent workflows, maintain persistent contextual memory, and enforce semantic security boundaries.
Executive summary: The current paradigm of forcing autonomous agents to navigate systems built for human visual processing is fundamentally mismatched, resulting in what the authors term the “Screen-as-Interface” bottleneck. AgentOS resolves this by subsuming the traditional desktop beneath an intelligent intent-routing layer. This shift demands a pivot from conventional systems engineering to real-time data mining, where the operating system must continuously construct personalized knowledge graphs, recommend executable logic, and optimize action sequences to safely operationalize ambiguous human intent.
Details
The “Shadow AI” and Screen-as-Interface Bottleneck
The rapid deployment of locally hosted agents has exposed a critical architectural mismatch in modern computing. Operating systems like Windows, macOS, and Linux are fundamentally designed for explicit human interaction via Graphical User Interfaces (GUIs). When autonomous agents are layered on top of these environments as user-space applications, they are forced to operate through a “Screen-as-Interface” paradigm, relying on visual scraping or simulated keystrokes. As illustrated in Figure 1, this GUI-centric interaction obscures underlying structured data, leading to severe semantic loss.
Furthermore, this approach breeds fragile execution paths that break whenever an application updates its layout. More critically, delegating system-level permissions to these opaque, autonomous processes creates a “Shadow AI” crisis. Legacy operating systems lack the semantic understanding necessary to distinguish between an agent executing a benign file organization task and one maliciously exfiltrating data via an indirect prompt injection.



