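arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.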

2605.07886 2026-05-11 stat.ML cs.LG

Characterizing and Correcting Effective Target Shift in Online Learning

Ziyan Li, Naoki Hiratani

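AI summary: This paper studies effective target shift in online learning caused by distribution shift. Through the lens of kernel regression it relates online and offline learning, showing that online kernel regression is equivalent to offline regression on shifted target outputs. Using target correction, the authors show that online learning can match the predictive performance of offline learning, and they propose closed-form and iterative target-correction methods. Experiments indicate that the approach outperforms online gradient descent on the true targets in continual-learning tasks, giving a theoretical framework for analyzing and improving online learning in non-stationary environments.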

Comments 22 pages; 6 figures

2605.07838 2026-05-11 q-bio.QM cs.AI cs.LG

PPI-Net connects molecular protein interactions to functional processes in disease

Kyle Higgins, Guadalupe Gonzalez, Dennis Veselkov, Ivan Laponogov, Kirill Veselkov

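AI summary: This study proposes PPI-Net, a hierarchical graph neural network that integrates protein-protein interaction networks with pathway-level representations to show how molecular interactions drive functional processes in disease. The model uses graph attention to propagate patient-specific molecular features over a shared biological interaction network, aggregating signal from genes up to higher-order biological processes. Experiments report strong predictive performance on several cancer datasets; integrating multi-omics data improves interpretability and highlights key cancer-related signaling pathways and biological mechanisms.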

Comments 17 pages, 3 figures, 2 tables

2605.07830 2026-05-11 cs.CR cs.AI

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim, Hoki Kim

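AI summary: This paper introduces CyBiasBench, a benchmark for evaluating bias in large language model (LLM) agents in cyber-attack scenarios. The authors find that agents show significant bias in attack selection, concentrating on particular attack types, and that this bias is unaffected by prompt changes. By systematically evaluating five agents across different targets and prompt conditions, they report differences in the entropy of attack distributions and a bias-inertia effect, indicating that an agent's attack preference is an intrinsic trait rather than a function of attack success rate.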
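The summary above compares agents by the entropy of their attack distributions. The sketch below shows one way such an entropy could be computed, assuming each agent's attack choices are logged as category labels. The label names and logs are hypothetical and are not taken from CyBiasBench.

```python
import math
from collections import Counter


def attack_entropy(choices):
    """Shannon entropy (in bits) of the empirical distribution over attack labels."""
    counts = Counter(choices)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical logs: one agent concentrates on a single attack family,
# the other spreads its attempts evenly over four families.
concentrated = ["sqli"] * 8 + ["xss"] * 2
spread = ["sqli", "xss", "ssrf", "rce"] * 3
```

Lower entropy means the agent's choices are concentrated on fewer attack types, which is the kind of bias the summary describes.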

Comments Under Review

2605.07825 2026-05-11 cs.MM cs.CV

Anisotropic Modality Align

Xiaomin Yu, Yijiang Li, Yuhui Zhang, Hanzhen Zhao, Yue Yang, Hao Tang, Yue Song, Xiaobin Hu, Chengwei Qin, Shuicheng Yan, Hui Xiong

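AI summary: Training multimodal large language models has long been limited by the scarcity of high-quality paired data. This paper finds that different modalities exhibit an anisotropic residual structure in the shared representation space, and identifies this as the main obstacle to modality interchange. Building on this, the authors propose AnisoAlign, an anisotropic modality alignment framework that applies bounded corrections to source-modality representations using geometric priors of the target modality, achieving alignment without paired data. Experiments report strong results on both geometric diagnostics and text-only training.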

2605.07812 2026-05-11 cs.CR cs.LG

GRASP -- Graph-Based Anomaly Detection Through Self-Supervised Classification

Robin Buchta, Carsten Kleiner, Felix Heine, Gabi Dreo Rodosek

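AI summary: This paper proposes GRASP, a graph-based self-supervised classification method for detecting advanced persistent threat (APT) attacks. The method masks process executable information and learns to infer it from the two-hop provenance-graph neighborhood, identifying anomalous behavior without preset thresholds and improving robustness and generalization. Experiments show that GRASP outperforms existing systems on several datasets, detecting known attack behavior and uncovering potentially malicious activity that is not labeled in the dataset documentation.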

Comments 17 pages

2605.07810 2026-05-11 physics.optics cs.CV

Pre-training Enables Extraordinary All-optical Image Denoising

Xudong Lv, Yuxiang Sun, Shuo Wang, Nanxing Chen, Jun Guan, Jingtian Hu

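AI summary: This paper proposes a pre-training-based all-optical image denoising method that improves the performance of optical neural networks on severely noisy images. It uses a two-step optimization pipeline: pre-training on a large dataset of simple images, then fine-tuning on the task-specific dataset. This raises the signal-to-noise ratio from below 8 dB to above 18 dB and generalizes well to image data of varied styles. The authors also show the method's value in vision-based applications such as face recognition, license plate recognition, and drone localization.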

2605.07768 2026-05-11 eess.SY cs.LG cs.SY

Interactive Trajectory Planning with Learning-based Distributionally Robust Model Predictive Control and Markov Systems

Erik Börve, Nikolce Murgovski, Morteza Haghir Chehreghani, Leo Laine

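AI summary: This paper studies interactive trajectory planning when the decisions of surrounding agents are uncertain. The authors propose a learning-based distributionally robust model predictive control (DR-MPC) method that draws on PAC learning theory to account for errors in the learned distribution. As the number of samples varies, the method interpolates between robust MPC and idealized stochastic MPC (SMPC), improving the robustness and adaptability of trajectory planning.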

2605.07758 2026-05-11 cs.FL cs.LG

SMT-Based Active Learning of Weighted Automata

Tiago Ferreira, Kevin Batz, Alexandra Silva

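AI summary: This paper proposes an SMT-based active learning algorithm for nondeterministic weighted automata (WFA), as a practical and robust alternative to Hankel/L*-style methods. The algorithm is parameterized by a given semiring; if it terminates, it is guaranteed to produce a minimal WFA, and the authors prove partial correctness and give termination conditions. Experiments show that the algorithm learns minimal WFAs over both finite and infinite semirings, significantly outperforms a naive baseline, and is competitive in producing smaller automata and reducing interactions with the teacher.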

Comments Appearing in CAV 2026

2605.07751 2026-05-11 cs.CY cs.AI

Vibe coding before the trend

Leon van Bokhorst, Koen Suilen

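AI summary: In early 2025, the researchers ran a "vibe coding" challenge with four groups of students from different disciplines. They observed that using AI tools moved students from a focus on syntax toward higher-order thinking and from memorization toward evaluation, and increased the value students placed on AI skills. Students from non-technical backgrounds particularly appreciated the accessibility of the tools. The authors argue that the relationship between AI and learners resembles a partnership rather than a replacement. The paper summarizes observations from the classroom experiment and offers practical experience and reflections for educators.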

Comments 10 pages

2605.07746 2026-05-11 stat.ML cs.LG q-bio.QM

Flow Matching for Count Data

Ganchao Wei, John Pearson

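AI summary: This paper studies generative modeling of high-dimensional count data, such as single-cell RNA sequencing and neural spike trains, and proposes count-FM, a flow matching framework based on continuous-time birth-death processes. The method learns marginal transition rates on count space in a simulation-free manner, enabling efficient generation and transport between arbitrary source and target count distributions. Experiments show that count-FM outperforms existing methods in sample quality, model efficiency, and path interpretability, and applies to unconditional generation, data transport, and conditional generation.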

2605.07738 2026-05-11 physics.comp-ph cs.LG

Physics-Informed Reduced-Order Operator Learning for Hyperelasticity in Continuum Micromechanics

Hamidreza Eivazi, Henning Wessels

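AI summary: This study proposes a physics-informed reduced-order operator learning method for hyperelasticity problems in continuum micromechanics. By combining the equilibrium neural operator (EquiNO) with the QR-based discrete empirical interpolation method (Q-DEIM), it reduces the computational cost of loss evaluation while enforcing periodicity and mechanical equilibrium constraints. The method is validated on three-dimensional representative volume elements (RVEs), where it substantially improves computational efficiency while retaining high accuracy in predicting microscopic stress fields and macroscopic stresses.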

Comments 22 pages, 12 figures

2605.07723 2026-05-11 cs.DL cs.AI cs.CY physics.soc-ph

LLM hallucinations in the wild: Large-scale evidence from non-existent citations

Zhenyue Zhao, Yihe Wang, Toby Stuart, Mathijs De Vaan, Paul Ginsparg, Yian Yin

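AI summary: By analyzing 111 million citations from 2.5 million papers on arXiv, bioRxiv, SSRN, and PubMed Central, this study documents large language model (LLM) "hallucinations" in practice, in the form of citations to works that do not exist. The number of fabricated citations has risen markedly with the spread of LLMs, with an estimated 146,932 such citations in 2025 alone. These errors are especially common in fields with heavy AI adoption, in papers written with AI assistance, and in papers by small or early-career teams. They also tend to misattribute work to already prominent and male scholars, which may deepen existing inequalities in science. The study notes that current preprint moderation and journal publication processes do little to stop these errors from spreading.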

2605.07705 2026-05-11 cs.LO cs.AI

Cross-Attention and Encoder-Decoder Transformers: A Logical Characterization

Veeti Ahvonen, Damian Heiman, Antti Kuusisto, Miguel Moreno, Matias Selin

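AI summary: This paper gives a new logical characterization of encoder-decoder transformers (the architecture underlying large language models) in a practical text-processing setting with floating-point numbers and soft attention. The authors introduce a new temporal logic that extends propositional logic with a counting global modality over the encoder input and a past modality over the decoder input. They complement this with a characterization via distributed automata, and show that the results are general enough to accommodate architectural variations such as masking. Finally, the paper discusses encoder-decoder transformers in the autoregressive setting.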

2605.07694 2026-05-11 eess.AS cs.AI cs.SD eess.SP

Dependence on Early and Late Reverberation of Single-Channel Speaker Distance Estimation

Michael Neri, Archontis Politis, Tuomas Virtanen

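AI summary: This paper studies how single-channel speaker distance estimation models depend on early reflections and late reverberation in the room impulse response (RIR). Simulated RIRs are decomposed into four variants and evaluated under different calibration conditions. Without time calibration the model relies mainly on early-reflection information, whereas with time calibration the propagation delay alone yields accurate distance estimates. The study also shows that accuracy improves with stronger early energy and weaker ambient reverberation.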

Comments Submitted to IWAENC 2026

2605.07677 2026-05-11 cs.IR cs.AI cs.CL

TRACE: Tourism Recommendation with Accountable Citation Evidence

Zixu Zhao, Sijin Wang, Yu Hou, Yuanyuan Xu, Yufan Sheng, Xike Xie, Wenjie Zhang, Won-Yong Shin, Xin Cao

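AI summary: This paper introduces TRACE, an accountable conversational recommendation dataset for tourism that targets the shortcomings of existing systems in trustworthiness, verifiability, and adaptivity. TRACE consists of multi-turn dialogues with citations to real user reviews and explicit rejection turns, covering 2,400 POIs and 34,208 reviews, and is accompanied by 14 baselines and 25 evaluation metrics. The study reveals a "Three-Competency Gap" in tourism recommendation and shows that its review-grounded citation score agrees closely with human annotation, pointing toward more reliable and explainable tourism recommender systems.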

Abstract

Tourism is a high-stakes setting for conversational recommender systems (CRS): a plausible-sounding suggestion can waste real money and trip time once a traveler acts on it. Existing CRS benchmarks primarily evaluate systems with a single Recall@k score over entity mentions, and tourism-specific resources add spatial or knowledge-graph context, yet none of them couple multi-turn recommendation with verbatim review-span evidence and rejection recovery. This leaves an evaluation gap for tourism recommendation that is simultaneously trustworthy, verifiable, and adaptive: recommend the right point of interest (POI) for multi-aspect preferences (such as cuisine, price, atmosphere, walking distance), justify each suggestion with verifiable evidence from prior visitors so the traveler can act without trial and error, and recover when the first recommendation is rejected mid-dialogue. We introduce TRACE, where each item is a multi-turn tourism recommendation dialogue with review-span citations and explicit rejection turns: 10,000 dialogues over 2,400 Yelp POIs and 34,208 reviews across eight U.S. cities, paired with 14 retrieval, planning, and LLM baselines, along with 25 metrics organized under Accuracy, Grounding, and Recovery. Across these baselines, TRACE reveals the Three-Competency Gap: LLM Zero-Shot leads in closed-set Recall@1 and rejection recovery but cites less densely than retrievers; non-LLM retrievers achieve surface-verbatim grounding but with low accuracy; Multi-Review Synthesis fails at recovery. The Grounding Score agrees with human citation precision (Spearman rho=+0.80, p<10^-20), and paired t-tests reproduce the per-baseline ranking (p<0.01 on the dominant contrasts). TRACE reframes accountable tourism recommendation as a joint target (right POI, verifiable evidence, adaptive repair) rather than a single-axis leaderboard.


2605.07674 2026-05-11 cs.GT cs.CR cs.LG

Differentially Private Auditing Under Strategic Response

Florian A. D. Burnat

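AI summary: This paper studies how to design differentially private auditing mechanisms when developers can respond strategically to a privacy-constrained audit interface. The author models privacy-constrained auditing as a bilevel Stackelberg game and introduces the welfare-weighted undetected gap $B_w$ as a measure of audit effectiveness, proving that conventional differentially private auditing leads to greater undetected risk under certain conditions. To address this, the author proposes Strategic Private Audit Design (SPAD), which uses the developer's KKT system to reduce the bilevel problem to a single-level optimization problem and solves it with a projected-gradient algorithm.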

2605.07671 2026-05-11 cs.GT cs.AI cs.MA econ.TH math.OC

The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting

Lauri Lovén, Sasu Tarkoma

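AI summary: This paper studies scored reporting by autonomous agents, where an endogenous link between the scoring rule and the agent's own interests makes truthful reporting hard to incentivize. The central finding is that when a non-affine approval function is used for type screening, truthful reporting is no longer the agent's optimal strategy under undetectable deviations, which breaks score calibration. The authors further show that step-function approval thresholds achieve optimal type screening without harming calibration. Under the Brier score in particular, the welfare gap between second-best and first-best can be eliminated, a property that does not hold for other scoring rules.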

Comments 38 pages, no figures. Targeting ACM Transactions on Economics and Computation (TEAC); preprint

2605.07665 2026-05-11 stat.ML cs.LG

Debiased Counterfactual Generation via Flow Matching from Observations

Hugh Dance, Johnny Xi, Peter Orbanz, Benjamin Bloem-Reddy

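AI summary: This paper studies the estimation of counterfactual distributions under intervention and proposes a debiased flow matching method based on observational data. By exploiting the close connection between the observational and counterfactual distributions, it improves the accuracy of counterfactual generation. The method combines a flow matching framework with a semiparametrically efficient estimator, learns minimum-energy flows in high-dimensional spaces, and overcomes the bias and failure modes of existing approaches.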

2605.07663 2026-05-11 cs.GT cs.CR cs.LG

Quotient Semivalues for False-Name-Resistant Data Attribution

Florian A. D. Burnat, Brittany I. Davidson

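AI summary: This paper studies how to prevent contributors in machine learning data attribution from inflating their contribution through false identities, for example by splitting data, duplicating it, or submitting synthetic data. The authors propose a mechanism based on quotient semivalues, which computes Shapley, Banzhaf, and related values over evidence-supported attribution clusters. Duplicated contributions within a cluster are absorbed, which improves fairness and resistance to false-name attacks. Experiments on synthetic classification tasks show that the method significantly reduces the gains from Sybil attacks and makes data attribution more robust.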
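The idea of computing a Shapley value over clusters rather than over individual identities can be illustrated with a toy game. The sketch below is not the paper's mechanism: it assumes the clusters are already given, uses exact enumeration (feasible only for a handful of players), and uses a hypothetical coverage utility (number of distinct records held). The names `RECORDS`, `coverage`, `shapley`, and `quotient_shapley` are all illustrative.

```python
from itertools import permutations


def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    players = list(players)
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += value(grown) - value(coalition)
            coalition = grown
    return {p: s / len(orders) for p, s in totals.items()}


def quotient_shapley(clusters, value):
    """Shapley values of the quotient game whose players are clusters of identities."""
    def cluster_value(names):
        members = frozenset().union(*(clusters[n] for n in names))
        return value(members)
    return shapley(clusters, cluster_value)


# Hypothetical data: identity "A2" is a duplicate (Sybil) of "A".
RECORDS = {"A": {1, 2}, "A2": {1, 2}, "B": {2, 3}}


def coverage(identities):
    """Toy utility: number of distinct records held by the coalition."""
    covered = set()
    for i in identities:
        covered |= RECORDS[i]
    return len(covered)


naive = shapley(RECORDS, coverage)  # Sybil identities counted separately
grouped = quotient_shapley({"alice": {"A", "A2"}, "bob": {"B"}}, coverage)
```

In this toy game the duplicate raises the attacker's combined naive share (`naive["A"] + naive["A2"]`) above what the single cluster `"alice"` receives in the quotient game, which is the effect the clustering is meant to absorb.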

2605.07654 2026-05-11 stat.ML cs.CL cs.LG

Reliable Chain-of-Thought via Prefix Consistency

Naoto Iwase, Yuki Ichihara, Mohammad Atif Quamar, Junpei Komiyama

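AI summary: This study proposes "prefix consistency", a new method for improving the reliability of large language models on reasoning tasks. It builds on the observation that when a chain of thought is truncated, chains leading to correct answers are more likely to be regenerated with the same answer, and uses this as a reliability signal to weight candidate answers. Experiments on several math and science benchmarks show that the method matches the accuracy of majority voting with less compute.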
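A rough sketch of how such a reliability signal could weight candidate answers is given below. It is an illustration under stated assumptions, not the authors' procedure: `resample` stands in for a model call that continues a truncated chain and returns a final answer, and `keep_fraction` and `n_resamples` are hypothetical parameters. The stub `fake_resample` exists only so the example runs without a model.

```python
from collections import defaultdict


def prefix_consistency_vote(candidates, resample, n_resamples=4, keep_fraction=0.5):
    """Weight each candidate answer by how often its truncated chain regenerates it.

    candidates: non-empty list of (chain_steps, answer) pairs.
    resample:   callable taking a prefix (list of steps) and returning an answer;
                a placeholder for a real model call (assumption).
    """
    scores = defaultdict(float)
    for chain, answer in candidates:
        cut = max(1, int(len(chain) * keep_fraction))
        prefix = chain[:cut]
        agree = sum(resample(prefix) == answer for _ in range(n_resamples))
        scores[answer] += agree / n_resamples
    return max(scores, key=scores.get), dict(scores)


def fake_resample(prefix):
    """Deterministic stand-in for a model: only careful prefixes reach "42"."""
    return "42" if "carry" in prefix else "40"


candidates = [
    (["guess", "so", "41", "done"], "41"),
    (["guess", "quick", "41", "done"], "41"),
    (["add", "carry", "sum", "42"], "42"),
]
winner, scores = prefix_consistency_vote(candidates, fake_resample)
```

Plain majority voting over these three candidates would pick "41"; weighting by regeneration agreement picks the answer whose chain is reproducible from its prefix.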

Comments See our project page at https://naoto-iwase.github.io/prefix-consistency-page

2605.07634 2026-05-11 math.OC cs.LG math.ST stat.TH

Robust stochastic first order methods in heavy-tailed noise via medoid mini-batch gradient sampling

Manojlo Vukovic, Dusan Jakovetic

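AI summary: This paper studies robust stochastic first-order optimization under heavy-tailed noise and proposes a new stochastic gradient descent algorithm (R-SGD-Mini) based on medoid gradient sampling. The method splits each data batch into several blocks, computes a gradient for each block, and updates the parameters along the medoid of these gradients, which reduces the influence of noise. The theoretical analysis shows that in the non-convex setting the algorithm converges at rate $\mathcal{O}(T^{-1})$ and attains an $\mathcal{O}(T^{-1/2})$ rate when the time horizon is known in advance; experiments confirm that it outperforms conventional methods.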
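The block-and-select step can be sketched as follows. This is a minimal illustration of medoid selection over block gradients, assuming a least-squares loss and Euclidean distance; the actual R-SGD-Mini update (for example any normalization or step-size rule) may differ, and the function names are illustrative.

```python
import numpy as np


def medoid(vectors):
    """Return the row minimizing the summed Euclidean distance to all other rows."""
    diffs = vectors[:, None, :] - vectors[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return vectors[np.argmin(dists.sum(axis=1))]


def medoid_sgd_step(w, X, y, n_blocks, lr):
    """One least-squares step: split the batch into blocks, step along the medoid gradient."""
    grads = []
    for Xb, yb in zip(np.array_split(X, n_blocks), np.array_split(y, n_blocks)):
        grads.append(Xb.T @ (Xb @ w - yb) / len(yb))
    return w - lr * medoid(np.stack(grads))


# Toy batch: the last block carries a gross outlier in the targets.
X = np.ones((6, 1))
y = np.array([1.0, 1.0, 1.0, 1.0, 1e6, 1e6])
w_new = medoid_sgd_step(np.zeros(1), X, y, n_blocks=3, lr=0.5)
```

Unlike a coordinate-wise median, the medoid is always one of the actual block gradients, so the corrupted block is simply not selected as long as most blocks are clean.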

2605.07536 2026-05-11 cs.CR cs.LG

GESR: Graph-Based Edge Semantic Reconstruction for Stealthy Communication Detection with Benign-Only Training

Henghui Xu, Yuchen Zhang, Xiaobo Ma

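AI summary: Detecting stealthy malicious communications when only benign traffic is available for training is a major challenge in network security. To address it, this paper proposes GESR, a graph-based framework that reconstructs the semantics of communication edges and captures communication patterns from local structural context, identifying anomalous communications and hosts. The method needs no labeled attack samples, relies on the consistency of graph structure for anomaly detection, and achieves strong detection performance on several datasets.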

Abstract

Detecting stealthy malicious communications from flow logs under benign-only training remains a critical challenge in network security. Malicious communications often camouflage as normal traffic like standard HTTPS flows. Conventional intrusion detectors rely strictly on known labeled attacks. Alternatively, they score flows completely independently. These approaches fail against sparse and context-dependent suspicious activity. To capture this essential context, graph anomaly detectors have been introduced to add valuable relational information to the analysis. However, existing methods fail to test the structural consistency of specific communication edges. To overcome these fundamental limitations, we present GESR, a novel graph-based framework for detecting suspicious communications and anomalous hosts under a benign-only training setting. GESR models complex network activity as attributed communication graphs. It cleverly reconstructs edge semantics entirely from local structural context rather than isolated features. This non-intuitive design forces the framework to predict expected communication patterns from neighborhood topologies. Attackers cannot easily manipulate this deep structural dependency. The model then converts the resulting structural inconsistencies into host-level anomaly scores. It utilizes robust Median Absolute Deviation (MAD) calibration for this final step. We evaluate GESR extensively on CTU-13 and CICIDS2017 datasets. These evaluations strictly impose tight false-positive operating constraints. On CICIDS2017, GESR achieves an outstanding ROC-AUC of 0.9753. It also yields a high TPR of 0.8569 at a strict 5% FPR threshold. GESR consistently outperforms existing methods across both evaluated benchmarks. The results prove that structure-conditioned edge reconstruction is a credible direction for practical intrusion detection.
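The abstract mentions Median Absolute Deviation (MAD) calibration for host-level scores but does not give the formula. A common form of MAD-based robust scoring is sketched below, assuming one aggregated reconstruction error per host; the constant 1.4826 makes the MAD comparable to a standard deviation under normality. How GESR aggregates edge errors and applies the calibration is not stated, so this is only an assumed shape.

```python
import numpy as np


def mad_scores(errors, eps=1e-12):
    """Robust z-scores: deviation from the median, scaled by 1.4826 * MAD."""
    errors = np.asarray(errors, dtype=float)
    med = np.median(errors)
    mad = np.median(np.abs(errors - med))
    return (errors - med) / (1.4826 * mad + eps)


# Hypothetical per-host reconstruction errors; the last host is inconsistent.
host_errors = [1.0, 2.0, 3.0, 4.0, 100.0]
scores = mad_scores(host_errors)
```

Because the median and MAD are insensitive to a few extreme values, the anomalous host does not inflate the scale used to score the others, which suits a benign-only training setting.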


2605.01041 2026-05-11 cs.MA cs.AI cs.GT cs.LG cs.RO

Separation Assurance between Heterogeneous Fleets of Small Unmanned Aerial Systems via Multi-Agent Reinforcement Learning

Iman Sharifi, Hyeong Tae Kim, Maheed Hatem Ahmed, Mahsa Ghasemi, Peng Wei

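AI summary: This paper studies how multi-agent reinforcement learning can assure safe separation when different companies operate heterogeneous fleets of small unmanned aerial systems in future dense urban airspace. It employs an attention-enhanced Proximal Policy Optimization-based Advantage Actor-Critic (PPOA2C) framework to resolve intra- and inter-fleet conflicts, with each fleet training its own policy independently to preserve privacy. Experiments show that two fleets, each with its own PPOA2C policy shared across its aircraft, can reach an equilibrium that maintains safe separation, and that PPOA2C policies adapt well in conflict resolution and when interacting with rule-based policies. The equilibria nevertheless tend to favor some fleets, underscoring the need for fairness-aware conflict management in heterogeneous sUAS operations.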

Comments 8 pages, 3 figures, 1 table

Abstract

In the envisioned future dense urban airspace, multiple companies will operate heterogeneous fleets of small unmanned aerial systems (sUASs), where each fleet includes several homogeneous aircraft with identical policies and configurations, e.g., equipage, sensing, and communication ranges, making tactical deconfliction highly complex for the aircraft. This paper aims to address two core questions: (1) Can tactical deconfliction policies converge or reach an equilibrium to ensure a conflict-free airspace when companies operate heterogeneous fleets of homogeneous aircraft? (2) If so, will the converged policies discriminate against companies operating sUASs with weaker configurations? We investigate a multi-agent reinforcement learning paradigm in which homogeneous aircraft within heterogeneous fleets operate concurrently to perform package delivery missions over Dallas, Texas, USA. An attention-enhanced Proximal Policy Optimization-based Advantage Actor-Critic (PPOA2C) framework is employed to resolve intra- and inter-fleet conflicts, with each fleet independently training its own policy while preserving privacy. Experimental results show that two fleets with distinct, shared PPOA2C policies can reach an equilibrium to maintain safe separation. While two PPOA2C policies outperform two strong rule-based baselines in terms of conflict resolution, a PPOA2C policy exhibits safer interaction with a rule-based policy, indicating adaptive capabilities of PPOA2C policies. Furthermore, we conducted extensive policy-configuration evaluations, which reveal that equilibria between similar policy types tend to favor fleets with stronger configurations. Even under similar configurations but different policy types, the equilibrium favors one of the heterogeneous policies, underscoring the need for fairness-aware conflict management in heterogeneous sUAS operations.


2605.00932 2026-05-11 cs.SE cs.AI

Code World Model Preparedness Report

Daniel Song, Peter Ney, Cristina Menghini, Faizan Ahmad, Aidan Boyd, Nathaniel Li, Ziwen Han, Jean-Christophe Testud, Saisuke Okabayashi, Maeve Ryan, Jinpeng Miao, Hamza Kwisaba, Felix Binder, Spencer Whitman, Jim Gust, Esteban Arcaute, Dhaval Kapil, Jacob Kahn, Ayaz Minhas, Tristan Goodman, Lauren Deason, Alexander Vaughan, Shengjia Zhao, Summer Yue

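AI summary: This report presents a preparedness assessment of Code World Model (CWM), a model developed by Meta for code generation and code reasoning. Based on pre-release testing in domains that could pose catastrophic risks and an evaluation of the model's potential biases, the authors find that CWM does not introduce additional risk beyond the current AI ecosystem, and it is therefore released as an open-weight model.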

Comments 25 pages, 3 figures

2605.00754 2026-05-11 cs.SE cs.LG

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

Indraneil Paul, Goran Glavaš, Iryna Gurevych

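AI summary: This study presents Themis-RM, a suite of robust reward models for multilingual code generation that support flexible multi-criteria scoring. To address the fact that existing code reward models rely mainly on execution feedback and score along a single dimension, the authors build the Themis-CodeRewardBench benchmark and collect more than 350,000 code preference pairs for training multilingual, multi-criteria code reward models. Experiments show that Themis-RM performs well on multilingual transfer and multi-criteria scoring, substantially improving the flexibility and reliability of code reward models.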

2604.06276 2026-05-11 eess.IV cs.CV

Structural Regularities of Cinema SDR-to-HDR Mapping in a Controlled Mastering Workflow: A Pixel-wise Case Study on ASC StEM2

Xin Zhang, Xiaoyi Chen

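AI summary: Using the ASC StEM2 dataset, this paper presents a pixel-wise empirical study of cinema mapping from standard dynamic range (SDR) to high dynamic range (HDR), analyzing the regular differences in luminance and color structure between SDR and HDR versions in a controlled mastering workflow. The SDR and HDR versions show a stable, globally monotonic correspondence in luminance, while in color they show consistent hue with adjusted saturation distributions. Using the EXR source data as reference, the study builds pixel-level decision maps that separate regions requiring recovery of original scene information from regions requiring content-adaptive adjustment, giving an interpretable quantitative baseline for structure-aware SDR-to-HDR mapping analysis.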

Comments 15 pages, 6 figures. Empirical case study on cinema SDR-to-HDR mapping using ASC StEM2

2604.04891 2026-05-11 math.OC cs.AI stat.ML

Muon Dynamics as a Spectral Wasserstein Flow

Gabriel Peyré

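AI summary: This paper studies the continuous-time dynamics of gradient normalization methods in deep learning and proposes a spectral-norm-based Wasserstein distance to describe the evolution of probability measures over parameter space. By introducing spectral Wasserstein distances indexed by different matrix norms, it interprets normalized training as a gradient flow and establishes theoretical links to results such as the Benamou-Brenier formula. Contributions include a static Kantorovich formulation, a robust cost representation, a Gaussian reduction, and numerical validation on several models, offering a new geometric view of normalized training.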

2603.24914 2026-05-11 math.HO cs.AI

Shaping the Future of Mathematics in the Age of AI

Johan Commelin, Mateja Jamnik, Rodrigo Ochigame, Lenny Taelman, Akshay Venkatesh

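AI summary: This paper discusses the changes and challenges facing mathematics in the age of AI, focusing on five key areas: values, practice, teaching, technology, and ethics. The authors make a series of recommendations aimed at preserving the autonomy of the mathematical community, rethinking research practice, broadening curricula, building academically oriented infrastructure, and establishing shared ethical guidelines, so that the future of mathematics is shaped by the mathematical community itself.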

Comments To appear in Notices of the American Mathematical Society. Based on discussions at a September 2025 workshop on "Mechanization and Mathematical Research" held at the Lorentz center, Leiden

2602.08786 2026-05-11 cs.CY cs.LG

On the Meta-Design of Allocation Problems

Unai Fischer-Abaigar, Emily Aiken, Christoph Kern, Juan Carlos Perdomo

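AI summary: This paper studies the meta-design of resource allocation problems: how to optimize high-level decisions about prediction, capacity constraints, and intervention quality, rather than only finding the optimal allocation policy once these parameters are fixed. It gives the first formal definition of the meta-design space for allocation problems and develops empirical tools to help practitioners analyze it systematically. Case studies on German employment services and a targeted cash transfer program in Ethiopia demonstrate the framework's effectiveness and practicality.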

2602.04774 2026-05-11 cond-mat.dis-nn cs.LG stat.ML

Theory of Optimal Learning Rate Schedules and Scaling Laws for a Random Feature Model

Blake Bordelon, Francesco Mori

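AI summary: This paper develops a theory of optimal learning rate schedules in deep learning, analyzing a random feature model trained with stochastic gradient descent (SGD) using optimal control theory. It finds two regimes, an "easy phase" and a "hard phase", each with its own decay strategy, and examines how jointly optimizing learning rate and batch size affects training efficiency. Experiments show that the theory applies to both image classification and language modeling tasks, providing theoretical guidance and a practical reference for learning rate scheduling.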

Abstract

Setting the learning rate (LR) for a deep learning model is a critical part of successful training. Choosing LRs is often done empirically with trial and error. In this work, we explore a solvable model of optimal LR schedules for a powerlaw random feature model trained with stochastic gradient descent (SGD). We consider the optimal schedule $\eta_T^\star(t)$ where $t$ is the current iterate and $T$ is the training horizon. This schedule is computed both as a numerical optimization problem and also analytically using optimal control theory. Our analysis reveals two regimes which we term the easy phase and hard phase. In the easy phase the optimal schedule is a polynomial decay $\eta_T^\star(t) \simeq T^{-\xi} (1-t/T)^{\delta}$ where $\xi$ and $\delta$ depend on the properties of the features and task. In the hard phase, the optimal schedule resembles warmup-stable-decay with constant initial LR and annealing performed over a vanishing fraction of training steps. We investigate joint optimization of LR and batch size and find batch ramps can improve the wall-clock time in the easy phase. Beyond SGD, we derive optimal schedules for momentum parameter $\beta(t)$ and show that it improves the loss-scaling exponent in the hard phase. We compare our optimal schedule to various benchmarks including (1) optimal constant learning rates $\eta_T(t) \sim T^{-\xi}$ (2) optimal power laws $\eta_T(t) \sim T^{-\xi} t^{-\chi}$, finding that our schedule achieves better rates than either of these. Our theory suggests that LR transfer across training horizon depends on the structure of the model and task. For ResNet image classification on CIFAR-5M, the learning curves exhibit hard-phase behavior where optimal base LRs are constant under sufficient annealing. GPT-2 style transformers trained in language modeling exhibit easy-phase behavior where optimal LRs shift even under annealing.
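The two schedule shapes described in the abstract can be written down directly. The sketch below is illustrative only: the exponents passed for xi and delta are placeholders (the abstract says they depend on the features and task), the `base` and `peak` prefactors are hypothetical, and the linear shape of the final anneal in `hard_phase_lr` is an assumption, since the abstract only states a constant initial LR with annealing over a vanishing fraction of steps.

```python
def easy_phase_lr(t, T, xi, delta, base=1.0):
    """Polynomial decay: base * T^(-xi) * (1 - t/T)^delta, for 0 <= t <= T."""
    return base * T ** (-xi) * (1.0 - t / T) ** delta


def hard_phase_lr(t, T, peak, decay_fraction=0.1):
    """Constant LR, then annealing to zero over the last `decay_fraction` of steps.

    The linear anneal is an assumed shape, not taken from the paper.
    """
    decay_start = (1.0 - decay_fraction) * T
    if t < decay_start:
        return peak
    return peak * (T - t) / (T - decay_start)


# Example with placeholder exponents over a 100-step horizon.
easy = [easy_phase_lr(t, 100, xi=0.5, delta=1.0) for t in range(101)]
hard = [hard_phase_lr(t, 100, peak=0.1) for t in range(101)]
```

In the easy-phase form the starting LR itself shrinks with the horizon through the factor T^(-xi), whereas the hard-phase form keeps the same starting LR for any horizon, which matches the abstract's remark about LR transfer across training horizons.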
