Abstract
Large language model (LLM) agents are evolving from request-response assistants into long-running software actors: they maintain state across model calls, fork subtasks, wait for external events, request human authority, generate tools, and perform side effects that must be resumed and audited. This paper presents Agent libOS, a library-OS-inspired runtime substrate for LLM agents. Agent libOS runs above a conventional host operating system; it does not implement hardware drivers, kernel-mode isolation, or a POSIX-compatible operating system. Instead, it treats an agent as an AgentProcess: a schedulable execution subject with process identity, parent-child lineage, lifecycle state, a tool table derived from an AgentImage, typed Object Memory, explicit capabilities, human queues, checkpoints, events, and audit records. Its central design rule is that tools are libc-like wrappers, whereas runtime primitives are the authority boundary. Filesystem access, object access, sleeps, human approval, JIT tool registration, and external side effects are checked at primitive boundaries under explicit capabilities and policy.
We describe the design, threat model, Python prototype, and safety-oriented evaluation. The current prototype implements async scheduling, namespace-local Object Memory, runtime-integrated human approval, one-shot permission grants, per-process working directories, shell and image-registration primitives, Deno/TypeScript JIT tools over a libOS syscall broker, filesystem/object bridge tools, an injectable Resource Provider Substrate, deterministic demos, real-model smoke scripts, and 123 regression tests at the time of writing. Rather than improving planner accuracy, Agent libOS demonstrates a runtime substrate in which long-running LLM agents can be scheduled, authorized, resumed, and audited without treating tool dispatch as the trust boundary.
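The design rule stated above (tools as thin wrappers, runtime primitives as the authority boundary) can be illustrated with a minimal Python sketch. This is not the Agent libOS API: the names `Runtime`, `fs_read`, `read_file_tool`, `CapabilityError`, the capability string `"fs.read"`, and the audit tuple layout are all hypothetical, and the filesystem is replaced by an in-memory dictionary. The sketch only shows the shape of the idea: the tool performs no checks, while the primitive enforces standing or one-shot grants and records every decision.

```python
from dataclasses import dataclass, field


class CapabilityError(PermissionError):
    """Raised when a process calls a primitive without the needed capability."""


@dataclass
class AgentProcess:
    # Hypothetical, heavily reduced stand-in for the paper's AgentProcess.
    pid: int
    capabilities: set = field(default_factory=set)  # standing grants
    one_shot: set = field(default_factory=set)      # consumed on first use
    audit_log: list = field(default_factory=list)   # (pid, cap, target, decision)


class Runtime:
    """Hypothetical runtime: every primitive authorizes and audits the call."""

    def __init__(self, files):
        self.files = files  # in-memory stand-in for a real filesystem

    def _authorize(self, proc, cap, target):
        if cap in proc.capabilities:
            decision = "allow"
        elif cap in proc.one_shot:
            proc.one_shot.discard(cap)  # a one-shot grant is spent here
            decision = "allow-once"
        else:
            decision = "deny"
        # Denied attempts are audited too, before the error is raised.
        proc.audit_log.append((proc.pid, cap, target, decision))
        if decision == "deny":
            raise CapabilityError(f"pid {proc.pid} lacks capability {cap!r}")

    def fs_read(self, proc, path):
        """Primitive: the authority boundary for reading a file."""
        self._authorize(proc, "fs.read", path)
        return self.files[path]


def read_file_tool(rt, proc, path):
    """Tool: a libc-like wrapper that carries no authority of its own."""
    return rt.fs_read(proc, path)


# Usage: a process holding a single one-shot read grant.
rt = Runtime({"/notes.txt": "hello"})
proc = AgentProcess(pid=1, one_shot={"fs.read"})

first = read_file_tool(rt, proc, "/notes.txt")  # consumes the one-shot grant
try:
    read_file_tool(rt, proc, "/notes.txt")
    second = "allowed"
except CapabilityError:
    second = "denied"

decisions = [entry[3] for entry in proc.audit_log]
```

Because the check lives in `fs_read` rather than in `read_file_tool`, a tool registered later (for example a generated one) would reach the same check without having to be trusted itself, which is the sense in which tool dispatch is not the trust boundary in this sketch.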