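arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.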

2601.08323 2026-03-30 cs.AI

AtomMem: Learnable Dynamic Agentic Memory with Atomic Memory Operation

Yupeng Huo, Yaxi Lu, Zhong Zhang, Haotian Chen, Yankai Lin

Abstract

Equipping agents with memory is essential for solving real-world long-horizon problems. However, most existing agent memory mechanisms rely on static and hand-crafted workflows. This limits the performance and generalization ability of these memory designs, which highlights the need for a more flexible, learning-based memory framework. In this paper, we propose AtomMem, which reframes memory management as a dynamic decision-making problem. We deconstruct high-level memory processes into fundamental atomic CRUD (Create, Read, Update, Delete) operations, transforming the memory workflow into a learnable decision process. By combining supervised fine-tuning with reinforcement learning, AtomMem learns an autonomous, task-aligned policy to orchestrate memory behaviors tailored to specific task demands. Experimental results across 3 long-context benchmarks demonstrate that the trained AtomMem-8B consistently outperforms prior static-workflow memory methods. Further analysis of training dynamics shows that our learning-based formulation enables the agent to discover structured, task-aligned memory management strategies, highlighting a key advantage over predefined routines.
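
To make the CRUD framing in the abstract concrete, here is a minimal sketch of a memory store exposing the four atomic operations, plus a dispatcher that applies one action at a time. The names `MemoryStore` and `apply_action` are hypothetical and not taken from the paper, and the learned policy (SFT plus RL) that would choose the actions is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MemoryStore:
    """Toy memory with the four atomic CRUD operations (illustrative only)."""
    entries: Dict[int, str] = field(default_factory=dict)
    next_id: int = 0

    def create(self, text: str) -> int:
        # Store a new entry and return its id.
        entry_id = self.next_id
        self.entries[entry_id] = text
        self.next_id += 1
        return entry_id

    def read(self, entry_id: int) -> Optional[str]:
        # Return the entry text, or None if the id is unknown.
        return self.entries.get(entry_id)

    def update(self, entry_id: int, text: str) -> bool:
        # Overwrite an existing entry; report whether it existed.
        if entry_id not in self.entries:
            return False
        self.entries[entry_id] = text
        return True

    def delete(self, entry_id: int) -> bool:
        # Remove an entry; report whether it existed.
        return self.entries.pop(entry_id, None) is not None


def apply_action(store: MemoryStore, op: str, *args):
    """Apply one atomic action, e.g. as emitted by a (hypothetical) policy."""
    allowed = {
        "create": store.create,
        "read": store.read,
        "update": store.update,
        "delete": store.delete,
    }
    if op not in allowed:
        raise ValueError(f"unknown memory operation: {op}")
    return allowed[op](*args)
```

In this sketch a memory workflow is just a sequence of such actions, for example `apply_action(store, "create", "some fact")` followed by `apply_action(store, "read", 0)`, which is the kind of sequence a learned policy would be trained to produce.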

2601.07855 2026-03-30 cs.CV cs.AI

RoAD Benchmark: How LiDAR Models Fail under Coupled Domain Shifts and Label Evolution

Subeen Lee, Siyeong Lee, Namil Kim, Jaesik Choi

2601.01200 2026-03-30 cs.CV eess.IV

MS-ISSM: Objective Quality Assessment of Point Clouds Using Multi-scale Implicit Structural Similarity

Zhang Chen, Shuai Wan, Yuezhe Zhang, Siyu Ren, Fuzheng Yang, Junhui Hou

2601.00680 2026-03-30 cs.CL

Sigmoid Head for Quality Estimation under Language Ambiguity

Tu Anh Dinh, Jan Niehues

2512.20660 2026-03-30 cs.LG cs.AI cs.SE

The Dual-State Architecture for Reliable LLM Agents

Matthew Thompson

Comments 18 pages, 2 figures, 5 tables. V2 extends and supersedes V1, introducing tri-state guard semantics, a three-level recovery hierarchy, and SWE-Bench boundary analysis

2512.19692 2026-03-30 cs.CV

Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models

Pablo Ruiz-Ponce, Sergio Escalera, José García-Rodríguez, Jiankang Deng, Rolandos Alexandros Potamias

Comments Project Page: https://pabloruizponce.com/papers/Interact2Ar

2512.16145 2026-03-30 cs.CL cs.AI

MRG-R1: Reinforcement Learning for Clinically Aligned Medical Report Generation

Pengyu Wang, Shuchang Ye, Usman Naseem, Jinman Kim

Comments 10 pages

2512.14549 2026-03-30 cs.CL cs.AI

Dual-objective Language Models: Training Efficiency Without Overfitting

David Samuel, Lucas Georges Gabriel Charpentier

2512.13607 2026-03-30 cs.CL cs.AI cs.LG

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Boxin Wang, Chankyu Lee, Nayeon Lee, Sheng-Chieh Lin, Wenliang Dai, Yang Chen, Yangyi Chen, Zhuolin Yang, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping

Comments We publicly release the Nemotron-Cascade models and the full collection of training data at: https://huggingface.co/collections/nvidia/nemotron-cascade

2512.13478 2026-03-30 cs.CL cs.AI cs.LG

NRR-Core: Non-Resolution Reasoning as a Computational Framework for Contextual Identity and Ambiguity Preservation

Kei Saito

Comments 12 pages, 2 figures, 2 tables. Replacement synced to repository snapshot v40. Series hub link: https://github.com/kei-saito-research/nrr-series-hub

2512.13442 2026-03-30 cs.LG

XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders

Khawla Elhadri, Jörg Schlötterer, Christin Seifert

Comments Accepted at the 4th World Conference on eXplainable Artificial Intelligence (XAI-2026)

2512.11798 2026-03-30 cs.CV cs.AI cs.GR

Particulate: Feed-Forward 3D Object Articulation

Ruining Li, Yuxin Yao, Chuanxia Zheng, Christian Rupprecht, Joan Lasenby, Shangzhe Wu, Andrea Vedaldi

Comments CVPR 2026. Project page: https://ruiningli.com/particulate

2512.09435 2026-03-30 cs.CV

UniPart: Part-Level 3D Generation with Unified 3D Geom-Seg Latents

Xufan He, Yushuang Wu, Xiaoyang Guo, Chongjie Ye, Jiaqing Zhou, Tianlei Hu, Xiaoguang Han, Dong Du

Comments Project page: https://xfanhe.github.io/projects/unipart/

2512.08777 2026-03-30 cs.CL cs.AI

Fluent Alignment with Disfluent Judges: Post-training for Lower-resource Languages

David Samuel, Lilja Øvrelid, Erik Velldal, Andrey Kutuzov

2512.08029 2026-03-30 cs.LG cs.CV

CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space

Tianxingjian Ding, Yuanhao Zou, Chen Chen, Mubarak Shah, Yu Tian

2512.02650 2026-03-30 cs.CV cs.LG cs.MM cs.SD eess.AS

Hear What Matters! Text-conditioned Selective Video-to-Audio Generation

Junwon Lee, Juhan Nam, Jiyoung Lee

Comments Accepted to CVPR 2026

2512.02425 2026-03-30 cs.CV cs.AI cs.CL cs.IR cs.LG

WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning

Woongyeong Yeo, Kangsan Kim, Jaehong Yoon, Sung Ju Hwang

Comments CVPR 2026. Project page: https://worldmm.github.io

2512.00850 2026-03-30 cs.CV

Smol-GS: Compact Representations for Abstract 3D Gaussian Splatting

Haishan Wang, Mohammad Hassan Vali, Arno Solin

2511.21075 2026-03-30 cs.LG cs.AI

Aligning LLMs with Biomedical Knowledge using Balanced Fine-Tuning

Zhenchao Tang, Fang Wang, Haohuai He, Jiale Zhou, Tianxu Lv, Jun Zhu, Shouzhi Chen, Minghao Yang, Yu Wang, Jiayang Wu, Yidong Song, Yaokun Li, Jiehui Huang, Dawei Huang, Zhi Song, Jianhua Yao

2511.20620 2026-03-30 cs.CV cs.RO

Wanderland: Geometrically Grounded Simulation for Open-World Embodied AI

Xinhao Liu, Jiaqi Li, Youming Deng, Ruxin Chen, Yingjia Zhang, Yifei Ma, Li Guo, Yiming Li, Jing Zhang, Chen Feng

Comments CVPR 2026

2511.18910 2026-03-30 cs.RO

An Efficient Closed-Form Solution to Full Visual-Inertial State Initialization

Samuel Cerezo, Seong Hun Lee, Javier Civera

Comments 8 pages, 3 figures, 6 tables. Accepted to RA-L

2511.18746 2026-03-30 cs.CV cs.AI

Any4D: Open-Prompt 4D Generation from Natural Language and Images

Hao Li, Qiao Sun

Comments The authors identified issues in the 4D generation pipeline and evaluation that affect result validity. To ensure scientific accuracy, we will revise the methodology and experiments thoroughly before resubmitting. This version should not be cited or relied upon

2511.18090 2026-03-30 cs.CV

Versatile Recompression-Aware Perceptual Image Super-Resolution

Mingwei He, Tongda Xu, Xingtong Ge, Ming Sun, Chao Zhou, Yan Wang

2511.17339 2026-03-30 cs.LG

ReBaPL: Repulsive Bayesian Prompt Learning

Yassir Bendou, Omar Ezzahir, Eduardo Fernandes Montesuma, Gabriel Mahuas, Victoria Shevchenko, Mike Gartrell

Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

Abstract

Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt learning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning has been proposed, which frames prompt optimization as a Bayesian inference problem to enhance robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel method for Bayesian prompt learning, designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with a stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm, enabling alternating phases of exploration to discover new modes, and exploitation to refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach allows for a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, our method provides a modular plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art prompt learning methods.
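
The abstract does not give the exact form of the cyclical step-size schedule. As an illustration only, the sketch below uses the cosine schedule commonly paired with cyclical stochastic-gradient MCMC, which is an assumption here and not necessarily the authors' choice. The function name and parameters are hypothetical. Each cycle starts with a large step (exploration) and decays toward zero (exploitation) before restarting.

```python
import math


def cyclical_step_size(k: int, total_steps: int, num_cycles: int,
                       base_step: float) -> float:
    """Cosine cyclical step size at iteration k (0-indexed).

    Assumed form: base_step / 2 * (cos(pi * (k mod L) / L) + 1),
    where L = ceil(total_steps / num_cycles) is the cycle length.
    """
    cycle_len = math.ceil(total_steps / num_cycles)
    position = (k % cycle_len) / cycle_len  # in [0, 1) within the cycle
    return 0.5 * base_step * (math.cos(math.pi * position) + 1.0)
```

With `total_steps=100` and `num_cycles=4`, the step size equals `base_step` at iterations 0, 25, 50, and 75 and shrinks toward zero just before each restart.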

2511.16928 2026-03-30 cs.CV

Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features

Jingyi Xu, Meisong Zheng, Ying Chen, Minglang Qiao, Xin Deng, Mai Xu

Comments Accepted by CVPR 2026, 20 pages

2511.16542 2026-03-30 cs.CV

EOGS++: Earth Observation Gaussian Splatting with Internal Camera Refinement and Direct Panchromatic Rendering

Pierrick Bournez, Luca Savant Aira, Thibaud Ehret, Gabriele Facciolo

Comments 8 pages, ISPRS

2511.15613 2026-03-30 cs.CV cs.CL

When to Think and When to Look: Uncertainty-Guided Lookback

Jing Bi, Filippos Bellos, Junjia Guo, Yayuan Li, Chao Huang, Yolo Y. Tang, Luchuan Song, Susan Liang, Zhongfei Mark Zhang, Jason J. Corso, Chenliang Xu

Comments Accepted to CVPR 2026

2511.14510 2026-03-30 cs.LG

LiteCache: A Query Similarity-Driven, GPU-Centric KVCache Subsystem for Efficient LLM Inference

Jiawei Yi, Ping Gong, Youhui Bai, Zewen Jin, Shengnan Wang, Jiaqi Ruan, Jia He, Jiaan Zhu, Pengcheng Wang, Haibo Wang, Weiguang Wang, Xia Zhu, Cheng Li

2511.10983 2026-03-30 cs.CV cs.AI

Binary Verification for Zero-Shot Vision

Rongbin Hu, Jeffrey Liu

2511.10938 2026-03-30 cs.LG cs.DC

Cascading Bandits With Feedback

R Sri Prakash, Nikhil Karamchandani, Sharayu Moharir