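arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.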

2511.09771 2026-05-14 cs.CV

STORM: Segment, Track, and Object Re-Localization from a Single Image

Yu Deng, Teng Cao, Hikaru Shindo, Quentin Delfosse, Jiahong Xue, Kristian Kersting

Affiliations: Department of Computer Science, Technical University of Darmstadt, Darmstadt, Hesse, Germany; Hessian Center for Artificial Intelligence (hessian.AI), Darmstadt, Hesse, Germany; German Research Center for Artificial Intelligence (DFKI), Darmstadt, Hesse, Germany; Centre for Cognitive Science, Technical University of Darmstadt, Darmstadt, Hesse, Germany; Google Intrinsic AI Research, Germany. † Work done while at the AIML research lab, now working at Intrinsic, Google.

AI summary: STORM is a unified framework for 6D pose estimation and tracking conditioned on a single reference image, with high robustness and little manual input. It combines a hierarchical spatial fusion attention mechanism with a tracking verifier trained with BCE, and it stably recovers the target pose in difficult scenes involving occlusion and fast motion. Experiments show that STORM outperforms existing methods without requiring annotations and copes with severe occlusion and viewpoint changes.

Comments 21 pages. Accepted at the 43rd International Conference on Machine Learning (ICML 2026); camera-ready version

2510.13385 2026-05-14 cs.LG

Probabilistic Prediction Markets with Intermittent Contributions

Michael Vitali, Pierre Pinson

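Affiliations: Dyson School of Design Engineering, Imperial College London; Halfspace, Denmark; Department of Technology, Management and Economics, Technical University of Denmark; CoRE, Aarhus University

AI summary: This paper studies how prediction-market mechanisms can foster collaboration among multiple parties to produce accurate forecasts when data ownership and competing interests limit sharing. It proposes a prediction-market framework in which agents can enter and leave the market freely, and which adapts to changing environments and accounts for historical performance. A robust regression model handles missing submissions, and a payoff allocation mechanism balances in-sample and out-of-sample performance. Experiments on simulated and real data indicate that the design is effective and adaptable.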

2509.22123 2026-05-14 cs.CL

Multilingual Vision-Language Models, A Survey

Andrei-Alexandru Manea, Jindřich Libovický

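Affiliations: Faculty of Mathematics and Physics, Charles University, V Holešovičkách 747/2, Prague, Czech Republic

AI summary: This survey covers multilingual vision-language models, which process text and images across languages. It systematically reviews 33 models and 23 benchmarks, analyzes trends in encoder and generative architectures, and identifies a key tension between language neutrality and cultural awareness. Current training methods tend to pursue language neutrality through contrastive learning, whereas cultural awareness depends on diverse data. Most evaluation benchmarks prioritize semantic consistency, but recent work has begun to introduce culturally grounded content to close this gap.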

2509.21543 2026-05-14 cs.RO

Self-CriTeach: LLM Self-Teaching and Self-Critiquing for Improving Robotic Planning via Automated Domain Generation

Jinbang Huang, Zhiyuan Li, Yuanzhao Hu, Zhanguang Zhang, Mark Coates, Xingyue Quan, Yingxue Zhang

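Affiliations: Huawei Noah's Ark Lab; University of Toronto; University of British Columbia; McGill University

AI summary: This work proposes Self-CriTeach, a framework that improves robotic planning through self-teaching and self-critiquing by a large language model (LLM). The LLM autonomously generates symbolic planning domains, which are used both to produce large-scale robot task-plan pairs for supervised fine-tuning and as structured reward functions that give dense feedback for reinforcement learning. This unified training pipeline significantly improves the LLM's planning success rate and cross-task generalization, while reducing inference cost and sensitivity to imperfect logical states.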

Comments International Conference on Machine Learning (ICML) 2026

2509.20786 2026-05-14 cs.LG

LiLAW: Lightweight Learnable Adaptive Weighting to Learn Sample Difficulty & Improve Noisy Training

Abhishek Moturu, Muhammad Muzammil, Anna Goldenberg, Babak Taati

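Affiliations: Department of Computer Science; University of Toronto; Department of Mathematics; The Hospital for Sick Children; Department of Statistics; UHN KITE Research Institute; T-CAIREM; Vector Institute; Institute of Biomedical Engineering; Rehabilitation Sciences Institute

AI summary: This paper proposes LiLAW, a lightweight learnable adaptive weighting method for improving deep neural network training under noise and data heterogeneity. It dynamically adjusts each sample's loss weight using three global learnable scalar parameters, adapting to sample difficulty (easy, moderate, hard), and updates them with a single gradient descent step on a validation mini-batch after each training mini-batch, without requiring a clean validation set. Experiments show that LiLAW improves accuracy and AUROC across a range of datasets and noise settings, especially under high noise, and that it is computationally efficient and suited to resource-constrained settings.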

2509.18993 2026-05-14 cs.LG

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

Boao Kong, Junzhu Liang, Yuxi Liu, Renjia Deng, Kun Yuan

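Affiliations: Peking University

AI summary: This paper proposes CR-Net, a parameter-efficient pre-training framework that addresses the shortcomings of current low-rank methods in model performance, computational overhead, and activation-memory savings. Building on the finding that cross-layer activation residuals are low-rank, CR-Net uses a dual-path architecture that efficiently reconstructs each layer's activations by combining the previous layer's output with a low-rank difference, which preserves high-rank information while greatly reducing the parameter count. Experiments show that CR-Net outperforms existing low-rank methods across model scales from 60M to 7B parameters, with lower compute and memory consumption.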

Comments 32 pages. Accepted by ICLR 2026

2509.13316 2026-05-14 cs.CL cs.LG

Do Activation Verbalization Methods Convey Privileged Information?

Millicent Li, Alberto Mario Ceballos Arroyo, Giordano Rogers, Naomi Saphra, Byron C. Wallace

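Affiliations: Northeastern University; Kempner Institute, Harvard University; Boston University

AI summary: This paper examines whether activation verbalization methods reveal the internal workings of large language models (LLMs). It finds that existing methods may largely reflect the parametric knowledge of the verbalizer model itself rather than the internal state of the target model. Experiments show that these methods can perform well even without access to the target model's internals, which suggests that current datasets are insufficient for evaluating verbalization methods and that stricter benchmarks and experimental controls are needed to validate their explanatory power.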

Comments ICML 2026. 41 pages, 23 tables, 6 figures

2508.09479 2026-05-14 cs.CV

SkySplat: Generalizable 3D Gaussian Splatting from Multi-Temporal Sparse Satellite Images

Xuejun Huang, Xinyi Liu, Yi Wan, Zhi Zheng, Bin Zhang, Mingtao Xiong, Yingying Pei, Yongjun Zhang

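Affiliations: School of Remote Sensing and Information Engineering, Wuhan University; Technology Innovation Center for Collaborative Applications of Natural Resources Data in GBA, Ministry of Natural Resources; Department of Geography and Resource Management, The Chinese University of Hong Kong; China Railway Siyuan Survey and Design Group Co., LTD

AI summary: This paper proposes SkySplat, a self-supervised framework for generalizable 3D Gaussian splatting reconstruction from multi-temporal sparse satellite images. By integrating the rational polynomial coefficient (RPC) model into a generalizable 3D Gaussian splatting pipeline, it addresses the weak geometric constraints, transient-object interference, and radiometric inconsistency that limit existing methods on satellite imagery. SkySplat relies only on RGB images and robust relative-height supervision, so it reconstructs efficiently and accurately without ground-truth height maps, and it shows strong performance and cross-dataset generalization on several benchmark datasets.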

Comments AAAI 2026. Code is available at https://github.com/NanCheng2001/SkySplat-main

2508.09320 2026-05-14 cs.LG cs.AI cs.CR

Exact Verification of Graph Neural Networks with Incremental Constraint Solving

Minghao Liu, Chia-Hsuan Lu, Marta Kwiatkowska

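Affiliations: University of Oxford

AI summary: This paper proposes an exact verification method for graph neural networks (GNNs) that certifies robustness against adversarial attacks under attribute and structural perturbations. The method combines constraint solving with bound tightening and exploits the solver's incremental solving capability for efficiency. It supports three aggregation functions (sum, max, and mean), the latter two for the first time. Experiments on several real-world datasets show good practicality and superior classification performance.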

Comments Extended version of the paper accepted at FM 2026

2507.12720 2026-05-14 cs.CL

FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Abraham Toluwase Owodunni, Orevaoghene Ahia, Sachin Kumar

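Affiliations: The Ohio State University; University of Washington

AI summary: This paper studies how language models adapt to new data distributions, noting that the rigidity of conventional subword tokenizers causes over-fragmentation of text in out-of-distribution domains and in unseen languages or scripts. The authors propose a learnable byte-level tokenizer that tokenizes adaptively by predicting boundaries in the input byte sequence, together with FLEXITOKENS, a simplified training objective that makes tokenization substantially more flexible. Experiments on several multilingual benchmarks and generation tasks show that the method reduces over-fragmentation and improves classification and generation results by about 10 percentage points over BPE and other conventional tokenizers.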

Comments Accepted to ACL (findings) 2026

2507.09205 2026-05-14 cs.CL

From Curated Data to Scalable Models: Continual Pre-training of Dense and MoE Large Language Models for Tibetan

Lei Yang, Leiyu Pan, Bojian Xiong, Renren Jin, Shaowei Zhang, Yue Chen, Ling Shi, Jiang Zhou, Junru Wu, Zhen Wang, Jianxiang Peng, Juesi Xiao, Tianyu Dong, Zhuowen Han, Zhuo Chen, Yuqi Ren, Deyi Xiong

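Affiliations: TJUNLP Lab; School of Computer Science and Technology; Tianjin University

AI summary: To address the underdevelopment of large language models for low-resource languages such as Tibetan, this work presents a complete pipeline: it builds a 72 GB high-quality Tibetan corpus and adapts Qwen2.5-7B through multilingual continual pre-training and instruction tuning. To increase model capacity, it further scales the model into a 50B-10B mixture-of-experts (MoE) architecture and constructs several high-quality evaluation datasets. Experiments show that both the dense and MoE models outperform existing models of similar scale across a range of tasks, providing a useful reference for LLM research on Tibetan and other low-resource languages.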

2507.07316 2026-05-14 cs.LG cs.CR

AdeptHEQ-FL: Adaptive Homomorphic Encryption for Federated Learning of Hybrid Classical-Quantum Models with Dynamic Layer Sparing

Md Abrar Jahin, Taufikur Rahman Fuad, M. F. Mridha, Nafiz Fahad, Md. Jakir Hossen

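Affiliations: University of Southern California; Islamic University of Technology; American International University-Bangladesh; Multimedia University

AI summary: This work proposes AdeptHEQ-FL, a unified hybrid classical-quantum federated learning framework that balances model performance, privacy protection, and communication efficiency in non-IID settings. It combines a hybrid CNN-PQC architecture, a differential-privacy-based accuracy-weighted aggregation strategy, selective homomorphic encryption, and dynamic layer-wise adaptive freezing, which together provide secure aggregation of sensitive model layers while minimizing communication overhead. Experiments on datasets such as CIFAR-10 show notable gains in accuracy and communication efficiency over existing methods.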

Comments Accepted in 1st International Workshop on ICCV'25 BISCUIT (Biomedical Image and Signal Computing for Unbiasedness, Interpretability, and Trustworthiness)

Journal ref 1st International Workshop on BISCUIT at ICCV 2025

2505.21238 2026-05-14 cs.CV

3D-UIR: 3D Gaussian for Underwater 3D Scene Reconstruction via Physics Based Appearance-Medium Decoupling

Jieyu Yuan, Yujun Li, Yuanlin Zhang, Chunle Guo, Xiongxin Tang, Ruixing Wang, Chongyi Li

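Affiliations: VCIP, College of Computer Science, Nankai University; Institute of Software, Chinese Academy of Sciences; DJI

AI summary: This paper proposes 3D-UIR, a physics-based 3D Gaussian splatting method that addresses light-medium coupling in underwater 3D scene reconstruction. It decouples object appearance from water-medium effects and introduces an explicit medium embedding, which improves scene consistency and rendering quality. Combined with a depth-guided optimization strategy that improves geometric accuracy, the method delivers notable improvements in view synthesis and scene restoration for underwater scenes.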

Comments Accepted to IEEE TIP 2026. Project webpage: https://bilityniu.github.io/3D-UIR

2505.15616 2026-05-14 cs.CV

LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models

Ruilin Yao, Bo Zhang, Jirui Huang, Xinwei Long, Yifang Zhang, Tianyu Zou, Yufei Wu, Shichao Su, Yifan Xu, Wenxi Zeng, Zhaoyu Yang, Guoyou Li, Shilan Zhang, Zichan Li, Yaxiong Chen, Shengwu Xiong, Peng Xu, Jiajun Zhang, Bowen Zhou, David Clifton, Luc Van Gool

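Affiliations: Wuhan University of Technology; Tsinghua University; Institute of Automation, Chinese Academy of Sciences; Shanghai AI Lab; University of Oxford; INSAIT, Sofia Un. St Kliment Ohridski

AI summary: This work introduces LENS, a multi-level benchmark for evaluating multimodal large language models on perception, understanding, and reasoning tasks. LENS contains 3,400 contemporary images and more than 60,000 human-written question-answer pairs covering eight tasks and twelve everyday scenarios, supporting evaluation from basic perception to complex reasoning. With rich annotations and high-quality images collected from social media, the dataset reflects real-world model behavior more faithfully. Experiments show that no current frontier model exceeds 60% accuracy on the reasoning tasks.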

Comments Published as a conference paper at ICLR 2026

2505.09760 2026-05-14 cs.RO cs.NE

Neural Associative Skill Memories for safer robotics and modelling human sensorimotor repertoires

Pranav Mahajan, Mufeng Tang, T. Ed Li, Ioannis Havoutis, Ben Seymour

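Affiliations: University of Oxford; Yale University

AI summary: This paper proposes Neural Associative Skill Memories, a framework aimed at making robots safer and more adaptive in complex environments. It uses self-supervised predictive coding to unify skill learning and expression, recognizing and executing skills from context without explicit skill selection, and it supports fault detection. Unlike conventional approaches, the model uses local learning rules and exhibits a speed-accuracy trade-off relevant to biological motor preparation, offering a new computational perspective on neurorobotics and human sensorimotor learning.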

Journal ref Neural Computation (2026) 38 (1): 1-27

DOI: 10.1162/NECO.a.1475

Abstract

Modern robots face challenges shared by humans, where machines must learn multiple sensorimotor skills and express them adaptively. Equipping robots with a human-like memory of how it feels to do multiple stereotypical movements can make robots more aware of normal operational states and help develop self-preserving safer robots. Associative Skill Memories (ASMs) aim to address this by linking movement primitives to sensory feedback, but existing implementations rely on hard-coded libraries of individual skills. A key unresolved problem is how a single neural network can learn a repertoire of skills while enabling fault detection and context-aware execution. Here we introduce Neural Associative Skill Memories (ASMs), a framework that utilises self-supervised predictive coding for temporal prediction to unify skill learning and expression, using biologically plausible learning rules. Unlike traditional ASMs which require explicit skill selection, Neural ASMs implicitly recognize and express skills through contextual inference, enabling fault detection across learned behaviours without an explicit skill selection mechanism. Compared to recurrent neural networks trained via backpropagation through time, our model achieves comparable qualitative performance in skill memory expression while using local learning rules and predicts a biologically relevant speed-accuracy trade-off during skill memory expression. This work advances the field of neurorobotics by demonstrating how predictive coding principles can model adaptive robot control and human motor preparation. By unifying fault detection, reactive control, skill memorisation and expression into a single energy-based architecture, Neural ASMs contribute to safer robotics and provide a computational lens to study biological sensorimotor learning.

2502.18917 2026-05-14 cs.AI cs.PL cs.SE

ClassInvGen: Class Invariant Synthesis using Large Language Models

Chuyue Sun, Viraj Agashe, Saikat Chakraborty, Jubi Taneja, Clark Barrett, David Dill, Xiaokang Qiu, Shuvendu K. Lahiri

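Affiliations: Stanford University; Microsoft Research; Purdue University

AI summary: ClassInvGen uses large language models (LLMs) to synthesize high-quality class invariants for mainstream programming languages such as C++. By co-generating executable class invariants and test inputs, it improves the correctness and completeness of the invariants and outperforms both pure-LLM and traditional data-driven approaches in experiments. The work also builds a benchmark of standard C++ data structures and validates the method on a real-world codebase through a case study.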

2502.05157 2026-05-14 cs.LG cs.DS

Efficient distributional regression trees learning algorithms for calibrated non-parametric probabilistic forecasts

Quentin Duchemin, Guillaume Obozinski

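Affiliations: Swiss Data Science Center; EPFL; ETH Zürich

AI summary: This paper proposes efficient learning algorithms for distributional regression trees that produce calibrated non-parametric probabilistic forecasts under the weighted interval score (WIS) or continuous ranked probability score (CRPS) loss. Data structures such as min-max heaps, weight-balanced binary trees, and Fenwick trees substantially improve computational efficiency. In numerical experiments the method performs on par with existing approaches. It also inherits the interpretability of tree models and is suited to conformal prediction with group-conditional coverage guarantees.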
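The summary names Fenwick trees among the supporting data structures without saying how the paper uses them. As a rough illustration only, and not the authors' algorithm, the sketch below shows a plain Fenwick (binary indexed) tree, which maintains running counts with point updates and prefix-sum queries in O(log n) time each. The class name, the `ranks` data, and the idea of counting targets by rank are assumptions made for this example.

```python
class FenwickTree:
    """Binary indexed tree over positions 0..n-1: point updates and prefix sums."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.tree = [0.0] * (n + 1)  # internal array is 1-based

    def add(self, i: int, delta: float) -> None:
        """Add delta to position i (0-based)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node that covers position i

    def prefix_sum(self, i: int) -> float:
        """Return the sum of positions 0..i inclusive (0-based)."""
        i += 1
        total = 0.0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


# Hypothetical data: the rank of each target among 4 sorted unique values.
ranks = [2, 0, 3, 2, 1]
counts = FenwickTree(4)
for r in ranks:
    counts.add(r, 1.0)

# Number of targets whose rank is at most 2.
at_or_below_2 = counts.prefix_sum(2)
```

Cumulative counts of this kind are what an empirical distribution function is built from, which is presumably why such a structure helps when a tree split must be re-scored as samples move from one side to the other.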

2501.10598 2026-05-14 cs.LG

Addressing Finite-Horizon MDPs via Low-Rank Tensor Value Approximation

Sergio Rozada, Jose Luis Orejuela, Antonio G. Marques

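Affiliations: Department of Signal Theory and Comms.; King Juan Carlos University

AI summary: This paper studies learning optimal policies for finite-horizon Markov decision processes (MDPs) by approximating value functions with low-rank tensors. Because value functions in finite-horizon MDPs are non-stationary, the problem is high-dimensional and has high sample complexity. The authors model the value functions as a low-rank tensor to obtain a scalable representation and, within a policy iteration framework, combine low-rank policy evaluation with greedy policy improvement to compute near-optimal policies. The method introduces an optimization-based framework for solving the Bellman equations together with a block coordinate descent algorithm, and it estimates the value functions from sampled trajectories when the system dynamics are unknown. Experiments show advantages in both computational efficiency and policy performance.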
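To make the low-rank idea concrete, here is a minimal sketch that assumes a CP (PARAFAC) factorization; the summary does not state which tensor format the paper uses, so that choice, the function names, and all numbers below are illustrative assumptions. The value tensor is indexed by time step and two discretized state dimensions, and each entry is a sum over rank-one terms, so only the factor matrices need to be stored.

```python
from typing import List

Matrix = List[List[float]]


def cp_value(factors: List[Matrix], index: List[int]) -> float:
    """Evaluate one tensor entry: sum over r of prod over d of factors[d][index[d]][r]."""
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for factor, i in zip(factors, index):
            term *= factor[i][r]
        total += term
    return total


def cp_parameter_count(dims: List[int], rank: int) -> int:
    """Number of stored parameters for a rank-`rank` CP model of a tensor with shape dims."""
    return rank * sum(dims)


# Hypothetical problem: horizon 3, state dimensions of size 4 and 5, rank 2.
time_factor = [[1.0, 0.5], [0.8, 0.25], [0.5, 0.0]]
s1_factor = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
s2_factor = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0], [1.0, 1.0], [2.0, 0.5]]
factors = [time_factor, s1_factor, s2_factor]

# Value at time step 1 in state (0, 1).
value = cp_value(factors, [1, 0, 1])

full_size = 3 * 4 * 5                            # entries in the dense tensor
low_rank_size = cp_parameter_count([3, 4, 5], 2)  # entries in the factors
```

In this toy case the factors hold 24 numbers against 60 for the dense tensor. The gap widens quickly as state dimensions are added, because the dense size grows as a product of the dimension sizes while the factor size grows as their sum.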

2501.05982 2026-05-14 cs.LG eess.SP

Deep Variational Sequential Monte Carlo for High-Dimensional Observations

Wessel L. van Nierop, Nir Shlezinger, Ruud J. G. van Sloun

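Affiliations: Dept. of Electrical Engineering; Eindhoven University of Technology; Dept. of Electrical and Computer Engineering; Ben-Gurion University of the Negev

AI summary: This paper proposes a deep variational sequential Monte Carlo (SMC) method for nonlinear state-space systems with high-dimensional observations. It parameterizes the proposal and state-transition distributions with neural networks and learns them with an unsupervised variational SMC objective, improving particle-filter performance. Experiments show that the method outperforms existing baselines when tracking the Lorenz attractor under high-dimensional, partial observations, and evidence lower bound evaluations indicate more accurate modeling of the posterior.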
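For readers unfamiliar with the particle filter that this work builds on, the sketch below is a plain bootstrap particle filter on a one-dimensional random-walk model. It is not the paper's method: the fixed Gaussian transition and likelihood stand in for the learned neural proposal and transition, and the function name, noise levels, and observations are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence


def bootstrap_particle_filter(
    observations: Sequence[float],
    num_particles: int = 500,
    process_std: float = 1.0,
    obs_std: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Return the filtered posterior-mean estimate of the state at each time step."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    estimates = []
    for y in observations:
        # Propose: sample each particle from the transition model (random walk).
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight: Gaussian observation log-likelihood, shifted by the max for stability.
        log_w = [-0.5 * ((y - x) / obs_std) ** 2 for x in particles]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates


# Hypothetical observations of a slowly increasing state.
estimates = bootstrap_particle_filter([0.5, 1.0, 1.5, 2.0])
```

The bootstrap filter proposes particles from the transition model alone, ignoring the current observation. Learning a proposal that does use the observation, as the summary describes, is a common way to reduce weight degeneracy when observations are high-dimensional.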

Journal ref ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025

2410.22643 2026-05-14 cs.RO

An Overtaking Trajectory Planning Framework Based on Spatio-temporal Topology and Reachable Set Analysis Ensuring Time Efficiency

Wule Mao, Zhouheng Li, Entao Sun, Lei Xie, Hongye Su

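Affiliations: State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, 310027, China; Shanghai STEP Electric Corporation, Shanghai, 201802, China

AI summary: This paper proposes SROP, an overtaking trajectory planning framework based on spatio-temporal topology and reachable set analysis, to address the tendency of conventional hierarchical planners to get stuck in local optima and their low computational efficiency in high-speed scenarios. The framework uses topological classes to represent distinct overtaking behaviors: an upper-level planner performs spatio-temporal search to generate diverse initial paths, and a lower-level planner uses reachable sets to evaluate trajectories in parallel, which decouples the vehicle kinematic constraints and speeds up computation. Experiments show notable gains in trajectory smoothness and computational efficiency, and tests on the F1TENTH simulation platform confirm practicality and robustness in complex scenarios.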

2409.02708 2026-05-14 cs.LG stat.ME

Few-shot Multi-Task Learning of Linear Invariant Features with Meta Subspace Pursuit

Chaozhi Zhang, Lin Liu, Xiaoqun Zhang

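Affiliations: School of Mathematical Sciences, Shanghai Jiao Tong University; Institute of Natural Sciences, MOE-LSC, Shanghai Jiao Tong University; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University

AI summary: This paper studies how to extract linear invariant features through multi-task learning when data are scarce. It proposes Meta Subspace Pursuit (Meta-SP), a new algorithm that learns a low-rank invariant subspace shared across tasks. The method comes with theoretical guarantees at both the algorithmic and statistical levels, and extensive experiments show that it outperforms several competing methods, including ANIL.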

Journal ref CSIAM Transactions on Applied Mathematics (2026)

2409.02038 2026-05-14 cs.CL cs.AI cs.DB

BEAVER: An Enterprise Benchmark for Text-to-SQL

Peter Baile Chen, Devin Yang, Weiyue Li, Fabian Wenz, Yi Zhang, Nesime Tatbul, Michael Cafarella, Çağatay Demiralp, Michael Stonebraker

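Affiliations: MIT; Harvard University; Greenshoe, Inc.

AI summary: BEAVER is the first text-to-SQL benchmark built from private data warehouses, designed to evaluate large language models in complex enterprise environments. It contains 9128 question-SQL pairs drawn from real-world query logs across 19 domains, with complex database schemas and specialized domain knowledge. To deal with the scarcity of enterprise data and the limits of existing evaluation metrics, BEAVER synthesizes high-quality, expert-verified queries and introduces fine-grained subtask metrics, revealing a significant performance gap for current state-of-the-art models in real enterprise settings.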

Comments Dataset and code are available at https://beaverbench.github.io/

Abstract

Existing text-to-SQL benchmarks have largely been constructed from public databases with well-structured schemas and simplistic question-SQL pairs. While large language models (LLMs) excel on these settings, their efficacy in complex private enterprise environments, characterized by intricate schemas, domain knowledge, and analytical user queries involving sophisticated structures and functions, remains unproven. To bridge this gap, we introduce BEAVER, the first text-to-SQL benchmark derived from private data warehouses. It comprises 9128 question-SQL pairs sourced from real-world query logs and 812 tables across 19 diverse domains. Building this benchmark is challenging because (1) enterprise query logs are scarce due to privacy constraints, and (2) existing all-or-nothing evaluation metrics based on accuracy make error diagnosis difficult -- especially when producing a correct query involves solving multiple compounded challenges, such as domain knowledge and query complexity. We address these issues at two levels. At the dataset level, we synthesize high-fidelity, expert-verified queries that increase dataset size and isolate individual challenges or combine them, producing queries focused on domain knowledge, query complexity, and both. At the evaluation level, we provide human annotations and evaluation metrics for five critical subtasks to enable fine-grained analysis. Our evaluation reveals a significant performance gap compared to existing benchmarks: SOTA agentic frameworks using the advanced model GPT-5.2 achieve only 10.8% accuracy. When provided with all subtask annotations as oracle hints, accuracy increases to 30.1%, confirming that a major bottleneck lies in correctly resolving these subtasks. Finally, we provide a taxonomy of the residual errors that persist even with subtask hints, identifying specific challenges such as the use of advanced functions.

2110.00062 2026-05-14 cs.RO cs.SY eess.SY

Simulation-based multi-criteria comparison of mono-articular and bi-articular exoskeletons during walking with and without load

Ali KhalilianMotamed Bonab, Volkan Patoglu

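Affiliations: Faculty of Engineering and Natural Sciences

AI summary: Using simulation, this paper carries out a multi-criteria comparison of mono-articular and bi-articular exoskeletons during walking under different loading conditions, studying how device dynamics and assistance torques affect metabolic cost, muscle activation, and joint reaction forces. The authors introduce a Pareto-optimization-based multi-criteria design approach that simultaneously optimizes exoskeleton power consumption and the reduction in human metabolic rate, and that accounts for device inertia and electrical regeneration. The results indicate that, although the two devices provide similar assistance levels, mono-articular exoskeletons are better at reducing peak joint reaction forces, while the power consumption of bi-articular exoskeletons is less sensitive to load and their inertia has a smaller detrimental effect on metabolic cost.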

DOI: 10.1109/TNSRE.2026.3658597

Abstract

Developing exoskeletons that can reduce the metabolic cost of assisted subjects is challenging since a systematic design approach is required to capture the effects of device dynamics and the assistance torques on human performance. Design studies that rely on musculoskeletal models hold high promise in providing effective design guidelines, as the effect of various devices and different assistance torque profiles on metabolic cost can be studied systematically. In this paper, we present a simulation-based multi-criteria design approach to systematically study the effect of different device kinematics and corresponding optimal assistive torque profiles under actuator saturation on the metabolic cost, muscle activation, and joint reaction forces of subjects walking under different loading conditions. For the multi-criteria comparison of exoskeletons, we introduce a Pareto optimization approach to simultaneously optimize the exoskeleton power consumption and the human metabolic rate reduction during walking, under different loading conditions. We further superpose the effects of device inertia and electrical regeneration on the metabolic rate and power consumption, respectively. Our results explain the effects of heavy loads on the optimal assistance profiles of the exoskeletons and provide guidelines on choosing optimal device configurations under actuator torque limitations, device inertia, and regeneration effects. The multi-criteria comparison of devices indicates that despite the similar assistance levels of both devices, mono-articular exoskeletons show better performance on reducing the peak reaction forces, while the power consumption of bi-articular devices is less sensitive to the loading. Furthermore, for the bi-articular exoskeletons, the device inertia has lower detrimental effects on the metabolic cost of subjects and does not affect the Pareto-optimality of solutions.
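The Pareto comparison described in the abstract can be illustrated with a small sketch. It assumes each candidate design is summarized by two numbers, power consumption (lower is better) and metabolic rate reduction (higher is better). The candidate values are made up for illustration and are not results from the paper, and the brute-force filter below is a generic technique rather than the authors' optimization procedure.

```python
from typing import List, Tuple

# (power_consumption, metabolic_reduction); units are left unspecified on purpose.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is at least as good as b on both criteria and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep the designs that no other design dominates (O(n^2) brute force)."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical candidate designs.
candidates = [(10.0, 5.0), (12.0, 9.0), (15.0, 8.0), (20.0, 12.0), (11.0, 4.0)]
front = pareto_front(candidates)
```

Here `(15.0, 8.0)` is dropped because `(12.0, 9.0)` uses less power and gives a larger reduction, and `(11.0, 4.0)` is dropped for the same reason relative to `(10.0, 5.0)`. The remaining designs represent genuine trade-offs between the two criteria.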

2008.03496 2026-05-14 cs.AI cs.LO cs.RO

Human Robot Collaborative Assembly Planning: An Answer Set Programming Approach

Momina Rizwan, Volkan Patoglu, Esra Erdem

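Affiliations: Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul, Turkey

AI summary: This paper studies planning for human-robot collaborative assembly and proposes an answer set programming approach that combines commonsense reasoning with a rich set of communication actions to handle uncertainty in human behavior. By extending hybrid conditional planning, the approach couples high-level planning of the assembly action sequence with geometric feasibility checks, and it is validated in a real-world scenario in which a bi-manual robot collaborates with a human to assemble furniture.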

Comments 36th International Conference on Logic Programming (ICLP 2020), University Of Calabria, Rende (CS), Italy, September 2020, 15 pages

1811.12784 2026-05-14 cs.CV

The GAN that Warped: Semantic Attribute Editing with Unpaired Data

Gara Dorta, Sara Vicente, Neill D. F. Campbell, Ivor J. A. Simpson

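Affiliations: University of Bath; Anthropics Technology Ltd.; University of Sussex

AI summary: This work proposes a semantic image editing method based on smooth warp fields that produces high-quality edits without paired data. Drawing on recent advances in generative adversarial networks (GANs), it can be trained with unpaired data, preserves the identity of the image subject, and edits high-resolution images (such as 4K) efficiently. Experiments on face and bird image datasets show strong editing quality and robustness.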

Comments CVPR 2020

1804.05261 2026-05-14 cs.CV cs.GR

Physics-driven Fire Modeling from Multi-view Images

Gara Dorta, Luca Benedetti, Dmitry Kit, Yong-Liang Yang

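Affiliations: University of Bath

AI summary: This work proposes a new method for reconstructing physically plausible fire models from multi-view images, avoiding the reliance on complex physical simulation or simplifying assumptions in conventional fire modeling. For the first time using RGB cameras, it obtains plausible estimates of the physical properties of the fire volume (such as temperature and density), which enables new effects such as global fire illumination. The method is validated on a variety of input data and applied to realistic lighting of virtual scenes.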

1307.7494 2026-05-14 cs.AI cs.LO cs.RO

ReAct! An Interactive Tool for Hybrid Planning in Robotics

Zeynep Dogmus, Esra Erdem, Volkan Patoglu

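Affiliations: Sabancı University

AI summary: This paper introduces ReAct!, an interactive tool for hybrid planning in robotics. It lets researchers describe a robot's behavior in a dynamic environment and solve planning problems without knowing the syntax and semantics of the underlying formalism. ReAct! supports modeling of complex dynamic domains, including concurrency, indirect effects of actions, and state and transition constraints. It can also embed external computations (such as checking for collision-free trajectories) into the representation of hybrid domains, tightly integrating discrete high-level reasoning with continuous geometric reasoning, and it applies to complex scenarios ranging from service robotics to cognitive factories.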

2605.13340 2026-05-14 cs.LG

Shortcut Mitigation via Spurious-Positive Samples

Phuong Quynh Le, Jörg Schlötterer, Sari Sadiya, Gemma Roig, Christin Seifert

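Affiliations: University of Marburg; Goethe University Frankfurt

AI summary: This paper studies how to mitigate a model's reliance on spurious attributes. The authors propose a method that needs no extra annotations or balanced data: by analyzing the model's prediction process, it identifies samples on which the model relies on spurious features, then uses them to locate the neurons in intermediate layers associated with those features and regularizes them. The method improves robustness, so that the model relies on genuinely discriminative features rather than on predictions that are correct by coincidence.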

Comments preprint

2605.13335 2026-05-14 cs.AI cs.CV

Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning

Qinchuan Cheng, Zhantao Gong, Pengzhan Sun, Angela Yao, Xulei Yang, Shijie Li

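Affiliations: Xi’an Jiaotong University; Nankai University; National University of Singapore; A*STAR

AI summary: This paper presents Ego2World, a benchmark that compiles egocentric cooking videos into executable symbolic worlds for evaluating how embodied agents plan under partial observability. It extracts reusable state-transition rules from video annotations and executes them in a hidden symbolic world graph, forcing the agent to plan and update its memory using only local observations and execution feedback. Experiments show that conventional action-overlap metrics can overestimate task success, and that maintaining a persistent belief memory improves task completion efficiency and reduces repeated visual exploration.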

Comments Project page: https://sj-li.com/PROJ/Ego2World/

2605.13334 2026-05-14 cs.CL

LLM-Based Persuasion Enables Guardrail Override in Frontier LLMs

Rodrigo Nogueira, Thales Sales Almeida, Giovana Kerche Bonás, Andrea Roque, Ramon Pires, Hugo Abonizio, Thiago Laitz, Celio Larcher, Roseval Malaquias Junior, Marcos Piau

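Affiliations: Maritaca AI; JusBrasil

AI summary: This study examines the guardrails of frontier large language models (LLMs) on sensitive topics. It finds that, although these models refuse direct requests for controversial content, another LLM simulating a persuasive user in conversation can steer them into producing it. Using natural-language persuasion strategies such as peer comparison and reframing of epistemic responsibility, the attacker LLM gets the target LLM to override its safety restrictions without explicit instructions. Experiments show that different model pairings produce controversial articles on multiple scientific-consensus topics, exposing a potential weakness of current LLM safety mechanisms in interactive settings.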
