SaliMory: Orchestrating Cognitive Memory for Conversational Agents
Kai Zhang, Xinyuan Zhang, Hongda Jiang, Shiun-Zu Kuo, Hyokun Yun, Ejaz Ahmed, Shereen Oraby, Ziyun Li, Sanat Sharma, Ann Lee, Ahmed A Aly, Anuj Kumar, Raffay Hamid, Xin Luna Dong
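AI summary: Proposes the SALIMORY framework, which uses a hierarchical stage-wise process reward and reward-decomposed contrastive refinement to train a single language model end-to-end to manage a cognitively structured memory, significantly reducing memory-related errors and improving personalization.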
Conversational agents that serve as lifelong companions must maintain persistent memory across all interactions. However, simply expanding context windows with raw retrieval degrades reasoning quality, while training memory agents via standard reinforcement learning creates a severe credit assignment bottleneck in a multi-stage pipeline. To address this, we introduce SALIMORY, a framework that trains a single language model to manage a cognitively structured memory spanning user facts, preferences, and working memory. By introducing a hierarchical stage-wise process reward and reward-decomposed contrastive refinement, SALIMORY provides isolated supervision for each distinct memory operation (selective filtering, consolidation, and cue-driven recall) while still training end-to-end. SALIMORY cuts memory-attributed failures by one-third, outperforms the state of the art by over 10% in end-to-end accuracy, and more than doubles the Good Personalization rate.