arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.09383 2026-06-19 cs.RO Version update

Safety-Critical LiDAR-Inertial Odometry with On-Manifold Deterministic Protection Level

Safety-critical LiDAR-inertial odometry with on-manifold deterministic protection levels

Yueqi Zhu, Yan Pan, Chufan Rui, Jiasheng Luo, Shihua Li, Bo Zhou

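Affiliations * School of Automation, Southeast University; Key Laboratory of Measurement and Control of CSE, Ministry of Education

AI summary: This paper proposes a safety-critical LiDAR-inertial odometry that provides deterministic protection levels through on-manifold deterministic state estimation, improving navigation safety for mobile robots in safety-critical scenarios.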

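Details
AI-generated abstract (translated from Chinese)

In safety-critical scenarios, the protection level of an autonomous navigation system is crucial for mobile robots to carry out tasks safely. However, existing studies of probabilistic navigation systems for robots usually evaluate accuracy offline on limited datasets and assume that the results carry over to unknown real-world environments. As a result, current autonomous mobile robots often lack a protection level for online safety assessment. To fill this gap, we propose a safety-critical LiDAR-inertial odometry (LIO) that provides deterministic protection levels based on on-manifold deterministic state estimation. Adopting the unknown-but-bounded assumption, we derive a concise closed-form relationship between point cloud noise and the estimation uncertainty of the iterated closest point algorithm. Using this relationship, we design an on-manifold ellipsoidal set-membership filter and implement it in an LIO system. Exploiting the properties of the set-membership filter, our system takes the feasible set of the estimated position as the deterministic protection level, which serves as a safety reference for the robot's downstream autonomous operations. Experimental results show that our system provides effective deterministic online safety references for different robots in various environments.

English abstract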

In safety-critical scenarios, the protection level of the autonomous navigation system is crucial for enabling mobile robots to perform safe tasks. However, existing studies on probabilistic navigation systems for robots usually perform offline accuracy evaluations using limited datasets and assume that the results can be applied to unknown real-world environments. As a result, current autonomous mobile robots often lack protection levels for online safety assessment. To fill this gap, we propose a safety-critical LiDAR-inertial odometry (LIO) that provides deterministic protection levels based on on-manifold deterministic state estimation. By adopting the unknown but bounded assumption, we derive a neat closed-form relationship between point cloud noise and the uncertainty of the estimation from the iterated closest point algorithm. Using this relationship, we design an on-manifold ellipsoidal set-membership filter and implement it within the LIO system. Leveraging the properties of the set-membership filter, our system offers the feasible sets of the estimated locations as the deterministic protection levels, serving as safety references for the robots' downstream autonomous operations. The experimental results show that our system can provide effective deterministic online safety references for diverse robots in various environments.
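
As a rough illustration of how an ellipsoidal feasible set can be read as a protection level, the sketch below computes the half-width of an ellipsoid {x : (x - c)^T P^{-1} (x - c) <= 1} along a chosen direction, using the standard identity sqrt(d^T P d). The shape matrix `P`, the function name `half_width`, and the numbers are assumptions for illustration only; the abstract does not specify how the paper converts its feasible sets into reported bounds.

```python
import math

def half_width(shape, direction):
    """Half-width of {x : (x - c)^T P^{-1} (x - c) <= 1} along a unit direction.

    `shape` is the symmetric positive-definite matrix P as nested lists.
    The support function of the ellipsoid gives sqrt(d^T P d).
    """
    n = len(direction)
    quad = sum(direction[i] * shape[i][j] * direction[j]
               for i in range(n) for j in range(n))
    return math.sqrt(quad)

# Hypothetical 2-D position ellipsoid (entries in m^2), not taken from the paper.
P = [[4.0, 1.0],
     [1.0, 9.0]]
pl_x = half_width(P, [1.0, 0.0])  # guaranteed bound on |x - c_x| in metres
pl_y = half_width(P, [0.0, 1.0])  # guaranteed bound on |y - c_y| in metres
print(pl_x, pl_y)
```

Under the unknown-but-bounded assumption such a bound is deterministic: the true position lies inside the ellipsoid whenever the noise bounds hold, rather than with some stated probability.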

2605.08525 2026-06-19 cs.RO cs.SY eess.SY Version update

Model-Reference Adaptive Flight Control of a 95-mg Insect-Scale Flapping-Wing Aerial Robot

Model-reference adaptive flight control of a 95-milligram insect-scale flapping-wing aerial robot

Francisco M. F. R. Gonçalves, Conor K. Trygstad, Néstor O. Pérez-Arancibia

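Affiliations * Washington State University

AI summary: To address parameter uncertainty and disturbances in insect-scale flapping-wing aerial robots, the paper proposes a model-reference adaptive control (MRAC) architecture combined with a hybrid multiplicative extended Kalman filter to achieve high-precision position control, and validates hovering and trajectory-tracking performance in experiments with a 95-mg robot.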

Comments Under review, 8 pages, 7 figures

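Details
AI-generated abstract (translated from Chinese)

Owing to the system's scale and complex fabrication, models describing the dynamics of flapping-wing insect-scale aerial robots are subject to parameter uncertainty, for example in the inertia matrix and the flier's actuator mapping. Moreover, because of its low inertia, such a robot is strongly affected in flight by stochastic and systematic disturbances, including power-wire tension, gusts, and undesired aerodynamic forces caused by wing misalignment. High-performance execution of complex maneuvers at the subdecigram scale therefore requires the robot to adapt its behavior to counteract disturbances and model uncertainty. To this end, we introduce a model-reference adaptive control (MRAC) architecture for high-performance position control of flapping-wing robotic insects that can be modeled as rigid bodies in 3D space. We also show how running a hybrid multiplicative extended Kalman filter in flight to estimate the current and desired angular velocities significantly damps attitude vibrations, especially along the roll and pitch degrees of freedom, and improves flight performance. To demonstrate the suitability, functionality, and high performance of the proposed approach, we conducted real-time hovering and trajectory-tracking six-degree-of-freedom flight control experiments with a 95-mg insect-scale aerial robot.

English abstract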

Due to the system's scale and complex fabrication, the model describing the dynamics of a flapping-wing insect-scale aerial robot is subject to parameter uncertainty; for example, in the inertia matrix and the actuator mapping of the flier. Furthermore, due to its low inertia, this type of robot is greatly affected by stochastic and systematic disturbances during flight, including power-wire tension, gusts, and undesired aerodynamic forces produced by wing misalignment. Therefore, the high-performance execution of complex maneuvers at the subdecigram scale requires the robot to adapt its behavior to counteract disturbances and model uncertainty. Toward this objective, we introduce a model-reference adaptive control (MRAC) architecture for high-performance position control of flapping-wing robotic insects that can be modeled as rigid bodies in the three-dimensional (3D) space. In addition, we demonstrate how the implementation of a hybrid multiplicative extended Kálmán filter for estimating current and desired angular velocities during flight significantly dampens attitude vibrations, especially along the roll and pitch degrees of freedom (DOFs), and also improves flight performance. To show the suitability, functionality, and high performance of the proposed approach, we conducted real-time hovering and trajectory-tracking 6-DOF flight control experiments with a 95-mg insect-scale aerial robot.

2605.07821 2026-06-19 cs.CV cs.AI Version update

Mitigating Simplicity Bias in OOD Detection through Object Co-occurrence Analysis

Mitigating simplicity bias in OOD detection through object co-occurrence analysis

Boyang Dai, Chaoqi Chen, Yizhou Yu

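Affiliations * The University of Hong Kong; Shenzhen University; Shenzhen Loop Area Institute

AI summary: Proposes an OOD detection framework based on object co-occurrence that uses disentangled representations and a divide-and-conquer strategy to distinguish near-OOD samples and mitigate simplicity bias, achieving competitive results across multiple settings.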

Comments This paper has been accepted by CVPR2026

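Details
AI-generated abstract (translated from Chinese)

Out-of-distribution (OOD) detection is crucial for ensuring the reliability of deep learning models. Most existing methods rely on regular entangled representations to separate in-distribution (ID) from OOD data and overlook the rich contextual information in images. The problem is especially hard for near-OOD detection, because models with simplicity bias struggle to learn discriminative features in disentangled representations. The human visual system can exploit the co-occurrence of objects in natural environments to aid scene understanding. Inspired by this, we propose an object-centric OOD detection framework that learns to capture object co-occurrence (OCO) patterns in images. The method introduces a new OOD detection paradigm: it predicts disentangled representations of a test sample to understand the object co-occurrence in the image, adaptively assigns the patterns to one of three scenarios based on the co-occurrence patterns observed in the ID training data, and finally performs OOD detection in a divide-and-conquer manner. In this way, OCO distinguishes near-OOD samples by considering the semantic contextual relationships in the image and avoids focusing only on simple, easily learned regions. We evaluate OCO in challenging and full-spectrum OOD settings, showing competitive results and confirming its ability to handle both semantic and covariate shifts. Code is released at https://github.com/Michael-McQueen/OCO.

English abstract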

Out-of-distribution (OOD) detection is crucial for ensuring the reliability of deep learning models. Existing methods mostly focus on regular entangled representations to discriminate in-distribution (ID) and OOD data, neglecting the rich contextual information within images. This issue is particularly challenging for detecting near-OOD, as models with simplicity bias struggle to learn discriminative features in disentangled representations. The human visual system can use the co-occurrence of objects in the natural environment to facilitate scene understanding. Inspired by this, we propose an Object-Centric OOD detection framework that learns to capture Object CO-occurrence (OCO) patterns within images. The proposed method introduces a new OOD detection paradigm that understands object co-occurrence within an image by predicting disentangled representations for the test sample, then adaptively divides patterns into three scenarios based on object co-occurrence patterns observed in ID training data, and finally performs OOD detection in a divide-and-conquer manner. By doing so, OCO can distinguish near-OOD by considering the semantic contextual relationships present in their images, avoiding the tendency to focus solely on simple, easily learnable regions. We evaluate OCO through experiments across challenging and full-spectrum OOD settings, demonstrating competitive results and confirming its ability to address both semantic and covariate shifts. Code is released at https://github.com/Michael-McQueen/OCO.

2604.23938 2026-06-19 cs.CL Version update

TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment

TSAssistant: a human-in-the-loop agentic framework for automated target safety assessment

Xiaochen Zheng, Zhiwen Jiang, David Tokar, Yexiang Cheng, Alvaro Serra, Melanie Guerard, Klas Hatje, Tatyana Doktorova

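Affiliations * Computational Sciences Center of Excellence

AI summary: Proposes TSAssistant, a multi-agent framework that uses a hierarchical instruction architecture and an interactive refinement loop to decompose target safety assessment report generation into specialized subtasks, achieving high reproducibility and evidence traceability.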

Comments Updated with quantitative and expert evaluations

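Details
AI-generated abstract (translated from Chinese)

Target safety assessment (TSA) requires systematically integrating genetic, transcriptomic, target-homology, pharmacological, and clinical data to evaluate the potential safety risks of therapeutic targets. The process is labor-intensive and expert-dependent, which challenges scalability and reproducibility. We present TSAssistant, a human-in-the-loop multi-agent framework that decomposes TSA report generation into a workflow of specialized subagents: research subagents that each ground and cite a single TSA domain, and synthesis subagents that integrate findings across domains. Subagents retrieve and synthesize evidence from curated biomedical sources through standardized tool interfaces and produce individually citable, evidence-based sections; their behavior is shaped by a hierarchical instruction architecture that separates coordination logic from domain expertise and user intent. To complement these soft constraints, programmatic execution hooks and persistent memory stores enforce hard constraints throughout the workflow, while an interactive refinement loop lets experts review and revise individual sections with the conversational context fully preserved across iterations. Rather than making a single holistic comparison, we decompose report quality into reproducibility, evidential grounding, task-level accuracy, and controllability under expert oversight, and we find high reproducibility and grounding, strong agreement with the human reference, and net-positive expert-driven refinement.

English abstract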

Target Safety Assessment (TSA) requires systematic integration of genetic, transcriptomic, target homology, pharmacological, and clinical data to evaluate potential safety liabilities of therapeutic targets. This process is labor-intensive and expert-dependent, posing challenges in scalability and reproducibility. We present TSAssistant, a human-in-the-loop multi-agent framework that decomposes TSA report generation into a workflow of specialized subagents: Research Subagents that each ground and cite a single TSA domain, and Synthesis Subagents that integrate findings across domains. Subagents retrieve and synthesize evidence from curated biomedical sources through standardized tool interfaces and produce individually citable, evidence-grounded sections, with behavior shaped by a hierarchical instruction architecture that separates coordination logic from domain expertise and user intent. To complement these soft constraints, programmatic execution hooks and persistent memory stores enforce hard constraints across the workflow, while an interactive refinement loop allows experts to review and revise individual sections with full conversational context preserved across iterations. Rather than a single holistic comparison, we decompose report quality into reproducibility, evidential grounding, task-level accuracy, and controllability under expert oversight, finding high reproducibility and grounding, substantial agreement with the human reference, and net-positive expert-driven refinement.

2605.05481 2026-06-19 cs.LG Version update

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

Approximate next policy sampling: replacing conservative target policy updates in deep reinforcement learning

Dillon Sandhu, Ronald Parr

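AI summary: Proposes approximate next policy sampling (ANPS), which addresses the "chicken-and-egg" problem in reinforcement learning by modifying the training distribution rather than constraining the policy update, and builds on it the Stable Value Approximate Policy Iteration (SV-API) algorithm, which makes larger target policy updates on Atari and continuous control tasks while matching or improving performance.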

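Details
AI-generated abstract (translated from Chinese)

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to improve a policy safely, the value function must be accurate on the state-visitation distribution of the updated policy. That state distribution is unknown and cannot be sampled to train the value function. Conservative updates solve the problem, but at the cost of shrinking the policy update. This paper explores an alternative, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS holds if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and effectiveness of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop: it keeps the target policy fixed while an iteratively updated behavior policy collects relevant experience, and it commits to a new policy only after a convergence criterion is met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise it is no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while making substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

English abstract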

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to safely improve a policy, the value function must be accurate on the state-visitation distribution of the updated policy. That distribution over states is unknown and cannot be sampled for the purposes of training the value function. Conservative updates solve this problem, but at the cost of shrinking the policy update. This paper explores an alternative solution, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS is satisfied if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and efficacy of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop to hold the target policy fixed while an iteratively updated behavioral policy gathers relevant experience. It only commits to a new policy once a convergence criterion has been met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise, it remains no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while executing substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

2605.00665 2026-06-19 cs.CV Version update

Prediction of Alzheimer's Disease Risk Factors from Retinal Images via Deep Learning: Development and Validation of Biologically Relevant Morphological Associations in the UK Biobank

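Deep-learning-based prediction of Alzheimer's disease risk factors from retinal images: development and validation of biologically relevant morphological associations in the UK Biobank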

Seowung Leem, Yunchao Yang, Adam J. Woods, Ruogu Fang

Affiliations * J. Crayton Pruitt Family Dept. of Biomedical Engineering, University of Florida; University of Florida Research Computing; Meta AI (FAIR); School of Behavioral and Brain Sciences, University of Texas at Dallas; Dept. of Electrical and Computer Engineering, University of Florida; Dept. of Computer and Information Science and Engineering, University of Florida; Center for Cognitive Aging and Memory, University of Florida

AI summary: Uses deep learning to predict 12 Alzheimer's disease-related risk factors from color fundus photographs of the retina and reveals the retinal structural features behind the predictions, finding that regions such as the optic nerve head and retinal vasculature are associated with the risk factors and with preclinical changes in Alzheimer's disease.

Comments Accepted to the "Journal of Alzheimer's Disease" for publication

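Details
AI-generated abstract (translated from Chinese)

Systemic, metabolic, and lifestyle factors have been linked to Alzheimer's disease (AD) through epidemiologic and AD-specific biomarker studies. Whether color fundus photography (CFP) contains retinal structural signatures corresponding to these AD-related risk domains remains unclear. The aim is to determine whether deep learning (DL) models can predict 12 AD-related risk factors from CFP and to characterize the retinal structures behind those predictions, thereby assessing whether CFP reflects pathways to AD vulnerability. Using 62,876 CFPs from 44,501 unique UK Biobank participants, DL models were trained to predict 12 factors associated with AD incidence: 6 categorical (sex, smoking, sleeplessness, economic status, alcohol use, depression) and 6 continuous (age, age at completing education, BMI, systolic blood pressure, diastolic blood pressure, HbA1c). Model performance, model saliency, and saliency-derived scores (CAM-Score) were evaluated and compared with retinal morphometry. The scores were also compared between incident-AD cases (on average 8.55 years before onset) and matched controls. DL performance ranged from AUROC = 0.5654 to 0.9480 for categorical factors and from R² = -0.0291 to 0.7620 for continuous factors, outperforming most morphometry-based machine learning models. Saliency-based scores consistently highlighted biologically meaningful regions, particularly the optic nerve head and the retinal vasculature, and agreed with existing morphometric variation. Several saliency-based scores differed significantly between incident-AD cases and matched controls, suggesting a potential overlap between the retinal correlates of risk factors and preclinical AD-related changes. CFP encodes retinal features linked to AD risk factors. Although not diagnostic, DL-derived retinal representations may reveal biologically meaningful, risk-related structural changes that reflect underlying AD vulnerability.

English abstract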

Systemic, metabolic, and lifestyle factors have established associations with Alzheimer's Disease (AD) through epidemiologic and AD-specific biomarker studies. Whether colored fundus photography (CFP) contains retinal structural signatures corresponding to these AD-related risk domains remains unclear. This study aims to determine whether deep learning (DL) models can predict 12 AD-related risk factors from CFP and to characterize the retinal structures underlying these predictions, thereby assessing whether CFP reflects pathways to AD vulnerability. Using 62,876 CFPs from 44,501 unique participants from the UK Biobank, DL models were trained to predict 12 factors linked to AD incidence: 6 categorical (sex, smoking, sleeplessness, economic status, alcohol use, depression) and 6 continuous (age, age at completing education, BMI, systolic and diastolic blood pressure, HbA1c). Model performance, model saliency, and saliency-derived scores (CAM-Score) were evaluated and compared to retinal morphometry. The scores were also compared between incident-AD cases (average 8.55 years before onset) and matched controls. Performance of DL ranged from AUROC = 0.5654 to 0.9480 for categorical factors and from R² = -0.0291 to 0.7620 for continuous factors, outperforming most of the morphometry-based machine learning models. Saliency-based scores consistently highlighted biologically meaningful regions, particularly the optic nerve head and retinal vasculature. They also aligned with existing morphometric variations. Several saliency-based scores differed significantly between incident-AD cases and matched controls, suggesting potential overlap between retinal correlates of risk factors and preclinical AD-associated changes. CFP encodes retinal signatures linked to AD risk factors. Although not diagnostic, DL-derived retinal representations may uncover biologically meaningful risk-related structural changes mirroring potential AD vulnerability.
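
The reported lower end of the R² range is negative, which can look surprising. The minimal sketch below uses the usual definition R² = 1 - SS_res / SS_tot to show that a model scoring worse than always predicting the mean gets a negative value. The function name and toy numbers are illustrative assumptions, and the abstract does not state which exact R² variant the authors computed.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]                      # toy targets, not study data
print(r_squared(y, y))                        # perfect prediction
print(r_squared(y, [2.5, 2.5, 2.5, 2.5]))     # always predicting the mean
print(r_squared(y, [4.0, 3.0, 2.0, 1.0]))     # worse than the mean, so negative
```

A value slightly below zero therefore indicates that, for that factor, the predictions carried essentially no usable signal on the evaluation data.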

2605.00457 2026-06-19 cs.NI cs.LG cs.SY eess.SY Version update

Utility-Aware DRL-Based TXOP Adaptation for NR-U and Wi-Fi Coexistence Networks

Policy-driven DRL-based TXOP adaptation for NR-U and Wi-Fi coexistence

Po-Heng Chou, Yi-Fang Yu, Shou-Yu Chen, Chiapin Wang

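Affiliations * Research Center for Information Technology Innovation (CITI), Academia Sinica (AS); Department of Electrical Engineering, National Taiwan Normal University (NTNU)

AI summary: To address unbalanced spectrum utilization when NR-U and Wi-Fi coexist in unlicensed spectrum, the paper proposes a policy-driven deep reinforcement learning framework whose reward design allows flexible control of the trade-off among fairness, throughput, and utility.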

Comments 15 pages, 13 figures, 2 tables, submitted to IEEE Open Journal of the Communications Society

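Details
AI-generated abstract (translated from Chinese)

The coexistence of NR-U and Wi-Fi in unlicensed spectrum poses a challenging coexistence management problem, in which heterogeneous channel access mechanisms cause markedly unbalanced spectrum utilization and degrade Wi-Fi performance. To address this challenge, we propose a policy-driven deep reinforcement learning (DRL) framework for adaptive transmission opportunity (TXOP) control, in which the coexistence process is modeled as a Markov decision process (MDP) and a deep Q-network (DQN) learns the control policy through online interaction. A key contribution is a policy layer introduced through reward design, which gives explicit control over the coexistence trade-off among fairness, throughput, and utility. Three policies are developed, namely absolute fairness, moderate fairness, and utility-based fairness, to reach different operating points. Simulation results show that the proposed framework achieves a Jain fairness index above 0.9 under strict fairness control. Compared with absolute fairness, moderate fairness raises aggregate throughput by 68.22%, and the utility-based policy further raises utility by 177.6%. These results show that policy-driven control offers a flexible and effective way to manage trade-offs in heterogeneous coexistence networks.

English abstract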

The coexistence of NR-U and Wi-Fi in the unlicensed spectrum introduces a challenging resource management problem, where heterogeneous channel access mechanisms can lead to unbalanced spectrum utilization and severe Wi-Fi performance degradation. To address this issue, this paper proposes a utility-aware deep reinforcement learning (DRL) framework for adaptive transmission opportunity (TXOP) control in NR-U/Wi-Fi coexistence networks. The coexistence process is formulated as a Markov decision process (MDP), in which the NR-U TXOP duration is treated as a controllable variable for regulating post-access channel occupancy. A deep Q-network (DQN) is then employed to learn adaptive TXOP control policies through online interaction with the coexistence environment. A key feature of the proposed framework is the integration of a configurable reward and criterion design, which enables explicit control of the fairness-efficiency-utility tradeoff. Three operating policies are developed, namely absolute fairness, moderate fairness, and utility-oriented moderate fairness, to characterize different coexistence operating points. Simulation results show that the proposed framework achieves a Jain fairness index above 0.9 under strict fairness control. Compared with the absolute fairness policy, the moderate fairness policy improves aggregate throughput by 68.22%, while the utility-oriented policy achieves a 177.6% improvement under the adopted utility evaluation metric. These results demonstrate that the proposed utility-aware DRL framework provides an effective and flexible solution for adaptive TXOP control and tradeoff management in heterogeneous unlicensed coexistence networks.
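
For readers unfamiliar with the fairness metric quoted above, the following sketch computes Jain's fairness index, J = (sum x_i)^2 / (n * sum x_i^2), over per-system throughputs. The throughput values are made up for illustration and are not results from the paper; how the authors aggregate NR-U and Wi-Fi throughput before applying the index is not stated in the abstract.

```python
def jain_index(throughputs):
    """Jain's fairness index: 1.0 means equal shares, 1/n means one user takes all."""
    n = len(throughputs)
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    return (total * total) / (n * sum_sq)

# Hypothetical throughputs in Mbit/s for (NR-U, Wi-Fi).
print(jain_index([10.0, 10.0]))  # equal shares
print(jain_index([30.0, 10.0]))  # NR-U dominates the channel
```

With two systems the index ranges from 0.5 to 1.0, so a value above 0.9 means the two throughputs are fairly close to each other.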

2604.27456 2026-06-19 cs.CR Version update

Federated Generation of Synthetic RNA-seq Data

Secure cross-institutional generation of synthetic genomic data

Daniil Filienko, Martine De Cock, Sikha Pentyala

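AI summary: This paper proposes a secure method that lets multiple data holders jointly train a synthetic data generator while protecting privacy. Using secure multiparty computation and differential privacy, it generates high-utility synthetic genomic data from data distributed across institutions.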

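Details
AI-generated abstract (translated from Chinese)

Access to genomic data is strictly regulated because of its sensitivity. Although safeguards are essential, cumbersome data access procedures are a major obstacle to developing AI methods for genomics. Synthetic data generation can ease this tension by allowing broader data sharing without exposing sensitive information. Synthetic genomic data are produced by training a generative model on real data and then sampling artificial data, preserving the relevant statistics while limiting disclosure about the underlying individuals. In some cases a single data holder has enough data to train such a model; in many applications, however, data must be combined across multiple sites to reach sufficient scale. This need arises, for example, in rare disease studies, where an individual hospital typically holds data for only a few patients. The solution presented in this paper lets multiple data holders jointly train a synthetic data generator without revealing their raw data. Our approach combines secure multiparty computation (MPC), which ensures input privacy so that no party ever discloses its data in unencrypted form, with differential privacy (DP), which provides output privacy by mitigating information leakage from the released synthetic data. We empirically demonstrate the effectiveness of the method by generating high-utility synthetic datasets from multiple real RNA-seq cohorts in a federated setting, showing that it enables privacy-preserving data synthesis even when the data are distributed across institutions.

English abstract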

Access to genomic data is highly regulated due to its sensitive nature. While safeguards are essential, cumbersome data access processes pose a significant barrier to the development of AI methods for genomics. Synthetic data generation can mitigate this tension by enabling broader data sharing without exposing sensitive information. Synthetic genomic data are produced by training generative models on real data and subsequently sampling artificial data that preserves relevant statistics while limiting disclosures about the underlying individuals. In some settings, a single data holder may have sufficient data to train such generative models; however, in many applications data must be combined across multiple sites to achieve adequate scale. This need arises, e.g., in rare disease studies, where individual hospitals typically hold data for only a small number of patients. The solution we present in this paper enables multiple data holders to jointly train a synthetic data generator without revealing their raw data. Our approach combines secure multiparty computation (MPC) to ensure input privacy, so that no party ever discloses its data in unencrypted form, with differential privacy (DP) to provide output privacy by mitigating information leakage from the released synthetic data. We empirically demonstrate the effectiveness of the proposed method by generating high-utility synthetic datasets from multiple real RNA-seq cohorts in federated settings, showing that our approach enables privacy-preserving data synthesis even when data are distributed across institutions.
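
To make the input-privacy idea concrete, here is a minimal sketch of additive secret sharing, one common building block of MPC, used to sum private values without any party seeing another's input. It is not the paper's protocol: the modulus, the three-hospital setup, and the function names are assumptions for illustration, and the differential-privacy noise that would provide output privacy is deliberately left out.

```python
import secrets

PRIME = 2**61 - 1  # illustrative modulus; all arithmetic is done mod PRIME

def share(value, n_parties):
    """Split `value` into n random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares; any strict subset reveals nothing about the value."""
    return sum(shares) % PRIME

# Hypothetical per-hospital statistics (e.g. a count), not real data.
private_counts = [12, 7, 30]
all_shares = [share(v, 3) for v in private_counts]

# Party k only ever sees the k-th share from each hospital and adds them locally.
partial_sums = [sum(s[k] for s in all_shares) % PRIME for k in range(3)]
total = reconstruct(partial_sums)
print(total)
```

Only the aggregate is revealed at the end; in a full pipeline of the kind the abstract describes, calibrated noise would be added before release so that the output also satisfies differential privacy.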

2604.27276 2026-06-19 cs.GT cs.CC Version update

Fisher Markets with Approximately Optimal Bundles and the Need for a PCP Theorem for PPAD

Fisher markets with approximately optimal bundles and the need for a PCP theorem for PPAD

Argyrios Deligkas, John Fearnley, Alexandros Hollender, Themistoklis Melissourgos

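AI summary: Studies the PPAD-hardness of computing competitive equilibria with approximately optimal bundles in Fisher markets with SPLC utility functions, proving that under the PCP-for-PPAD conjecture the problem is PPAD-hard for some constant δ > 0, and that the conjecture is necessary for proving this hardness.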

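Details
AI-generated abstract (translated from Chinese)

We study the problem of computing a competitive equilibrium with approximately optimal bundles in Fisher markets with separable piecewise-linear concave (SPLC) utility functions, meaning that every buyer receives a $(1-\delta)$-optimal bundle rather than a perfectly optimal one. We establish the first intractability result for this problem by showing that, under the PCP-for-PPAD conjecture, it is PPAD-hard for some constant $\delta > 0$. The hardness result holds even if all buyers have identical budgets (competitive equilibrium with equal incomes) and linear capped utilities, and even if $\varepsilon$-approximate clearing is allowed instead of perfect clearing, for any constant $\varepsilon < 1/9$. Importantly, we show that the PCP-for-PPAD conjecture is in fact necessary for proving hardness for constant $\delta$: showing PPAD-hardness of finding such approximate market equilibria in a broad class of markets that includes those generated by our hardness result would prove the conjecture. This is the first natural problem for which the conjecture is provably required to establish hardness.

English abstract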

We study the problem of computing a competitive equilibrium with approximately optimal bundles in Fisher markets with separable piecewise-linear concave (SPLC) utility functions, meaning that every buyer receives a $(1-\delta)$-optimal bundle, instead of a perfectly optimal one. We establish the first intractability result for the problem by showing that it is PPAD-hard for some constant $\delta > 0$, assuming the PCP-for-PPAD conjecture. This hardness result holds even if all buyers have identical budgets (competitive equilibrium with equal incomes), linear capped utilities, and even if we also allow $\varepsilon$-approximate clearing instead of perfect clearing, for any constant $\varepsilon < 1/9$. Importantly, we show that the PCP-for-PPAD conjecture is in fact required to show hardness for constant $\delta$: showing PPAD-hardness for finding such approximate market equilibria in a broad class of markets encompassing those generated by our hardness result would prove the conjecture. This is the first natural problem where the conjecture is provably required to establish hardness for it.
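
The notion of a $(1-\delta)$-optimal bundle can be illustrated in the simplest special case of plain linear utilities, where a buyer's optimal utility at prices p and budget b is b times the best utility-per-price ratio. The sketch below checks the condition for that case only; it does not handle the SPLC or capped utilities studied in the paper, and the function names and numbers are assumptions.

```python
def optimal_utility(utilities, prices, budget):
    """Best achievable linear utility: spend the whole budget on the best bang-per-buck good."""
    return budget * max(u / p for u, p in zip(utilities, prices))

def is_approx_optimal(bundle, utilities, prices, budget, delta):
    """True if the bundle is affordable and worth at least (1 - delta) of the optimum."""
    cost = sum(p * x for p, x in zip(prices, bundle))
    value = sum(u * x for u, x in zip(utilities, bundle))
    best = optimal_utility(utilities, prices, budget)
    return cost <= budget + 1e-12 and value >= (1.0 - delta) * best

# Hypothetical buyer: values good 0 twice as much as good 1, unit prices, budget 1.
u, p, b = [2.0, 1.0], [1.0, 1.0], 1.0
bundle = [0.75, 0.25]  # utility 1.75 versus optimum 2.0
print(is_approx_optimal(bundle, u, p, b, delta=0.2))  # threshold 1.6
print(is_approx_optimal(bundle, u, p, b, delta=0.1))  # threshold 1.8
```

Relaxing every buyer's optimality requirement in this way enlarges the set of acceptable equilibria, which is why hardness for a constant $\delta$ is a stronger statement than hardness for exact optimality.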

2602.17315 2026-06-19 cs.LG cs.AI Version update

Flickering Multi-Armed Bandits

Flickering multi-armed bandits

Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

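Affiliations * University of Colorado Boulder; INRIA Paris

AI summary: Proposes the flickering multi-armed bandit model, in which random graphs constrain action availability, designs a two-phase lazy random walk algorithm that achieves sublinear regret bounds, and shows optimality via an information-theoretic lower bound.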

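Details
AI-generated abstract (translated from Chinese)

We introduce flickering multi-armed bandits (FMAB) to model sequential decision-making in environments where action availability changes and the next accessible actions are restricted to a subset that depends on the agent's current choice. We formalize these constraints with stochastically evolving graphs in which actions are limited to local neighborhoods. This mobility-constrained structure poses a dual challenge: the statistical requirement of information acquisition and the physical overhead of navigation. We analyze FMAB under i.i.d. Erdős–Rényi and edge-Markovian processes and propose a two-phase lazy random walk algorithm for robust exploration. We establish high-probability sublinear regret bounds and prove near-optimality via a matching information-theoretic lower bound. Our results characterize the intrinsic cost of learning under local-move constraints and are complemented by a robotic disaster-response simulation.

English abstract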

We introduce Flickering Multi-Armed Bandits (FMAB) to model sequential decision-making in environments with changing action availability, where accessibility of the next action is restricted to a subset dependent on the agent's current choice. We formalize these constraints through stochastically evolving graphs where actions are limited to local neighborhoods. This mobility-constrained structure imposes a dual challenge: the statistical requirement of information acquisition and the physical overhead of navigation. We analyze FMAB under i.i.d. Erdős–Rényi and edge-Markovian processes, proposing a two-phase lazy random walk algorithm for robust exploration. We establish high-probability sublinear regret bounds and prove near-optimality via a matching information-theoretic lower bound. Our results characterize the intrinsic cost of learning under local-move constraints, complemented by a robotic disaster-response simulation.
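
The exploration primitive mentioned above can be sketched as a lazy random walk over arms whose connectivity is resampled each round from an Erdős–Rényi graph G(n, p). This is only an illustration of the walk itself under assumed parameters (`stay_prob`, starting arm 0); it is not the paper's two-phase algorithm and includes no reward estimation.

```python
import random

def sample_er_neighbors(node, n_arms, p, rng):
    """Neighbors of `node` in a fresh G(n, p) sample; only edges at `node` matter for one step."""
    return [v for v in range(n_arms) if v != node and rng.random() < p]

def lazy_random_walk(n_arms, p, steps, rng, stay_prob=0.5):
    """Walk for `steps` rounds; each round stay put with probability `stay_prob`,
    otherwise move to a uniformly random currently available neighbor."""
    node = 0
    visited = {node}
    for _ in range(steps):
        neighbors = sample_er_neighbors(node, n_arms, p, rng)
        if neighbors and rng.random() >= stay_prob:
            node = rng.choice(neighbors)
        visited.add(node)
    return node, visited

rng = random.Random(0)
final_arm, visited = lazy_random_walk(n_arms=8, p=0.3, steps=200, rng=rng)
print(final_arm, sorted(visited))
```

Because the graphs are i.i.d. across rounds in this setting, sampling only the edges incident to the current arm is enough to simulate the walk; an edge-Markovian process would instead need the edge states carried over between rounds.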

2604.21097 2026-06-19 stat.ML cs.LG Version update

Learning to Emulate Chaos: Adversarial Optimal Transport Regularization

Learning to emulate chaos: adversarial optimal transport regularization

Gabriel Melo, Leonardo Santiago, Peter Y. Lu

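Affiliations * Department of Mechanical and Aerospace Engineering, North Carolina State University, Raleigh, NC; Department of Electrical and Computer Engineering, Tufts University, Medford, MA; Work performed while at the University of Campinas

AI summary: To address the low long-term statistical fidelity of emulators for chaotic dynamics, the paper proposes adversarial optimal transport objectives that jointly learn high-quality summary statistics and a physically consistent emulator; theoretical analysis and experiments validate the Sinkhorn divergence and WGAN-style dual formulations.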

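Details
AI-generated abstract (translated from Chinese)

Chaos arises in many complex dynamical systems, from weather to power grids, but is hard to model accurately with data-driven methods such as machine learning emulators. Although emulators are promising tools for accelerating simulations and solving inverse problems, they still struggle to learn chaotic dynamics, where sensitivity to initial conditions makes exact long-term forecasts infeasible, especially with noisy data. Recent work instead trains emulators to match the statistical properties of chaotic attractors, but these methods usually rely on handcrafted summary statistics or on large, diverse multi-environment datasets. In this work, we propose a family of adversarial optimal transport objectives that jointly learn high-quality summary statistics and a physically consistent emulator from a single noisy trajectory. We analyze theoretically and validate experimentally a Sinkhorn divergence formulation (2-Wasserstein) and a WGAN-style dual formulation (1-Wasserstein) of our approach. Numerical experiments on a variety of chaotic systems, including systems with high-dimensional spatiotemporal chaos, show that emulators trained with the proposed objectives have significantly better long-term statistical fidelity.

English abstract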

Chaos arises in many complex dynamical systems, from weather to power grids, but is difficult to accurately model with data-driven methods such as machine learning emulators. While emulators are promising tools for accelerating simulations and solving inverse problems, they still struggle to learn chaotic dynamics, where sensitivity to initial conditions renders exact long-term forecasts infeasible, especially given noisy data. Recent work instead trains emulators to match the statistical properties of chaotic attractors, but these approaches often rely on handcrafted summary statistics or large, diverse multi-environment datasets. In this work, we propose a family of adversarial optimal transport objectives that can jointly learn high-quality summary statistics and a physically consistent emulator from a single noisy trajectory. We theoretically analyze and experimentally validate a Sinkhorn divergence formulation (2-Wasserstein) and a WGAN-style dual formulation (1-Wasserstein) of our approach. Numerical experiments across a variety of chaotic systems, including ones with high-dimensional spatiotemporal chaos, show that emulators trained using our proposed objectives have significantly improved long-term statistical fidelity.
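
As background for the Sinkhorn divergence mentioned above, the sketch below computes a debiased entropic optimal transport distance between two small 1-D samples with a squared-distance cost, using plain Sinkhorn iterations. It is a generic textbook-style illustration under assumed settings (uniform weights, `eps=1.0`, transport cost of the entropic plan as the OT value); definitions differ across references on whether the entropy term is included, and nothing here reproduces the paper's adversarially learned summary statistics.

```python
import math

def sinkhorn_cost(xs, ys, eps=1.0, iters=200):
    """Transport cost <P, C> of the entropic-OT plan between uniform empirical measures."""
    n, m = len(xs), len(ys)
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the uniform marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

def sinkhorn_divergence(xs, ys, eps=1.0):
    """Debiased form: OT(x, y) - OT(x, x) / 2 - OT(y, y) / 2."""
    return (sinkhorn_cost(xs, ys, eps)
            - 0.5 * sinkhorn_cost(xs, xs, eps)
            - 0.5 * sinkhorn_cost(ys, ys, eps))

same = sinkhorn_divergence([0.0, 1.0], [0.0, 1.0])
apart = sinkhorn_divergence([0.0, 1.0], [3.0, 4.0])
print(same, apart)
```

The subtraction of the two self-terms removes the bias that entropic regularization introduces, so identical samples score zero while separated samples score clearly above zero.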

2604.19196 2026-06-19 cs.CV Version update

Benchmarking Vision Foundation Models for Domain-Generalizable Face Anti-Spoofing

Benchmarking vision foundation models for domain-generalizable face anti-spoofing

Mika Feng, Pierre Gallin-Martel, Koichi Ito, Takafumi Aoki

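Affiliations * Graduate School of Information Sciences, Tohoku University, Japan

AI summary: Systematically evaluates 15 pre-trained vision models for domain generalization in face anti-spoofing and finds that self-supervised ViTs, especially DINOv2 with Registers, combined with data augmentation and an attention-weighted loss, achieve the best results on the MICO protocol while remaining computationally efficient.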

Comments 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

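Details
AI-generated abstract (translated from Chinese)

Face anti-spoofing (FAS) remains challenging because it requires robust domain generalization to unseen environments. Although recent work uses vision-language models (VLMs) for semantic supervision, these multimodal approaches usually demand prohibitive computational resources and show high inference latency. Their effectiveness is also inherently limited by the quality of the underlying visual features. This paper revisits the potential of vision-only foundation models to establish an efficient and robust FAS baseline. We systematically benchmark 15 pre-trained models, such as supervised CNNs, supervised ViTs, and self-supervised ViTs, under severe cross-domain scenarios, including the MICO and Limited Source Domains (LSD) protocols. Our comprehensive analysis shows that self-supervised vision models, particularly DINOv2 with Registers, significantly suppress attention artifacts and capture critical fine-grained spoofing cues. Combined with Face Anti-Spoofing Data Augmentation (FAS-Aug), Patch-wise Data Augmentation (PDA), and Attention-weighted Patch Loss (APL), our proposed vision-only baseline achieves state-of-the-art performance on the MICO protocol. The baseline outperforms existing methods under the data-constrained LSD protocol while maintaining superior computational efficiency. This work provides a definitive vision-only baseline for FAS and shows that optimized self-supervised vision transformers can serve as the backbone for both vision-only and future multimodal FAS systems. Project page: https://gsisaoki.github.io/FAS-VFMbenchmark-CVPRW2026/.

English abstract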

Face Anti-Spoofing (FAS) remains challenging due to the requirement for robust domain generalization across unseen environments. While recent trends leverage Vision-Language Models (VLMs) for semantic supervision, these multimodal approaches often demand prohibitive computational resources and exhibit high inference latency. Furthermore, their efficacy is inherently limited by the quality of the underlying visual features. This paper revisits the potential of vision-only foundation models to establish a highly efficient and robust baseline for FAS. We conduct a systematic benchmarking of 15 pre-trained models, such as supervised CNNs, supervised ViTs, and self-supervised ViTs, under severe cross-domain scenarios including the MICO and Limited Source Domains (LSD) protocols. Our comprehensive analysis reveals that self-supervised vision models, particularly DINOv2 with Registers, significantly suppress attention artifacts and capture critical, fine-grained spoofing cues. Combined with Face Anti-Spoofing Data Augmentation (FAS-Aug), Patch-wise Data Augmentation (PDA) and Attention-weighted Patch Loss (APL), our proposed vision-only baseline achieves state-of-the-art performance in the MICO protocol. This baseline outperforms existing methods under the data-constrained LSD protocol while maintaining superior computational efficiency. This work provides a definitive vision-only baseline for FAS, demonstrating that optimized self-supervised vision transformers can serve as a backbone for both vision-only and future multimodal FAS systems. The project page is available at: https://gsisaoki.github.io/FAS-VFMbenchmark-CVPRW2026/ .

2604.07328 2026-06-19 cs.LG Version update

How to sketch a learning algorithm

How to sketch a learning algorithm

Sam Gunn

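Affiliations * UC Berkeley

AI summary: Proposes a data deletion scheme that, under a stability assumption, locally sketches arithmetic circuits using higher-order derivatives in random complex directions, predicting deep learning model outputs with vanishing error and failure probability, with precomputation and inference slower by only a logarithmic factor.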

Comments Improved presentation and simplified Algorithm 4

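Details
AI-generated abstract (translated from Chinese)

How does the choice of training data affect an AI model? This broad question is central to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of the training data had been excluded from the learning algorithm. We present a data deletion scheme that predicts model outputs in the deep learning setting with vanishing error $\varepsilon$ and failure probability $\delta$. Our precomputation and prediction algorithms are slower than regular training and inference, respectively, by only a factor of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$. The storage requirement is that of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof rests on an assumption that we call stability. In contrast to the assumptions made in prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation makes these derivatives cheap to compute.

English abstract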

How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
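
The last two sentences can be illustrated with truncated Taylor series ("jets"), a standard way to implement higher-order forward-mode differentiation: each intermediate value of an arithmetic circuit carries the coefficients of its expansion in t along x + t*v. The sketch below does this for a toy polynomial circuit and a fixed complex direction. The helper names, the circuit, and the direction are assumptions for illustration; the paper's actual sketching procedure, with random directions and its error analysis, is not reproduced here.

```python
def jet_add(a, b):
    """Sum of two truncated Taylor series (coefficient lists of equal length)."""
    return [x + y for x, y in zip(a, b)]

def jet_mul(a, b):
    """Product of two truncated series: coefficient n is the Cauchy convolution sum."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(len(a))]

def variable(x, v, order):
    """Series of the input x + t*v, truncated after the t**order term."""
    return [complex(x), complex(v)] + [0j] * (order - 1)

# Toy arithmetic circuit f(x) = x*x*x + x, expanded at x = 2 along the complex direction v = i.
order = 3
x = variable(2.0, 1j, order)
f = jet_add(jet_mul(jet_mul(x, x), x), x)
print(f)  # coefficient n equals f^(n)(2) * v**n / n!
```

One forward pass through the circuit yields all derivatives up to the chosen order in the direction v, at a cost that grows only with the truncation order rather than requiring a separate pass per derivative.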

2604.18105 2026-06-19 eess.AS cs.CL cs.SD Version update

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

NIM4-ASR: towards efficient, robust, and customizable real-time LLM-based speech recognition

Yuan Xie, Jiaqi Song, Guang Qiu, Xianliang Wang, Kai Qiao, Junfeng Yuan, Shengqing Liu, Yi Zhang, Bowen Chen, Ming Lei, Jie Gao, Jie Wu

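Affiliations * Advanced Intelligent Systems Group, NIO

AI summary: Proposes the NIM4-ASR framework, which redesigns the multi-stage training paradigm (pre-training architecture optimization, iterative asynchronous SFT, and ASR-specialized reinforcement learning) and adds production optimizations (noise robustness, streaming inference, and RAG-based hotword customization), reaching state-of-the-art performance with 2.3B parameters.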

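Details
AI-generated abstract (translated from Chinese)

Integrating large language models (LLMs) into automatic speech recognition (ASR) has become a mainstream paradigm in recent years. Although existing LLM-based ASR models perform well on public benchmarks, their training remains largely data-driven and does not adequately address key practical challenges, in particular limited downward scalability in resource-constrained deployments and hallucinations under acoustically challenging conditions. To address these issues, we present NIM4-ASR, a production-oriented LLM-based ASR framework optimized for efficiency and robustness. Based on a principled division of functional roles between the encoder and the LLM, we redesign the multi-stage training paradigm so that each module is aligned with its intended capability boundary. Specifically, we reformulate the pre-training architecture and objective to reduce the modality gap and improve parameter efficiency; introduce an iterative asynchronous SFT stage to preserve acoustic fidelity and constrain representation drift; and design an ASR-specialized reinforcement learning stage to further improve recognition quality and robustness. We also add a set of production-oriented optimizations, including robustness under noisy and silent conditions, real-time streaming inference, and hotword customization via retrieval-augmented generation (RAG). Experiments show that NIM4-ASR reaches state-of-the-art performance on multiple public benchmarks with only 2.3B parameters, while substantially outperforming larger competitors on internal benchmarks, especially in entity-dense real-world scenarios. NIM4-ASR further supports million-scale hotword customization via RAG with sub-millisecond retrieval latency, enabling efficient adaptation to emerging entities and personalized user needs.

English abstract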

Integrating large language models (LLMs) into automatic speech recognition (ASR) has become a mainstream paradigm in recent years. Although existing LLM-based ASR models demonstrate impressive performance on public benchmarks, their training remains predominantly data-driven, leaving key practical challenges insufficiently addressed -- particularly limited downward scalability in resource-constrained deployments and hallucinations under acoustically challenging conditions. To address these issues, we present NIM4-ASR, a production-oriented LLM-based ASR framework optimized for both efficiency and robustness. Grounded in a principled delineation of functional roles between the encoder and the LLM, we redesign the multi-stage training paradigm to align each module with its intended capability boundary. Specifically, we reformulate the pre-training architecture and objective to mitigate the modality gap and improve parameter efficiency; introduce an iterative asynchronous SFT stage to preserve acoustic fidelity and constrain representation drift; and design an ASR-specialized reinforcement learning stage to further enhance recognition quality and robustness. We additionally incorporate a suite of production-oriented optimizations, including robustness under noisy and silent conditions, real-time streaming inference, and hotword customization via retrieval-augmented generation (RAG). Experiments show that NIM4-ASR achieves state-of-the-art performance on multiple public benchmarks with merely 2.3B parameters, while substantially outperforming larger-scale competitors on internal benchmarks -- particularly in entity-intensive real-world scenarios. NIM4-ASR further supports million-scale hotword customization via RAG with sub-millisecond retrieval latency, enabling efficient adaptation to emerging entities and personalized user requirements.

2603.04531 2026-06-19 cs.RO 版本更新

PTLD: Sim-to-real Privileged Tactile Latent Distillation for Dexterous Manipulation

PTLD: 从仿真到现实的特权触觉潜在蒸馏用于灵巧操作

Rosy Chen, Mustafa Mukadam, Michael Kaess, Tingfan Wu, Francois R Hogan, Jitendra Malik, Akash Sharma

发表机构 * Carnegie Mellon University(卡内基梅隆大学) University of Washington(华盛顿大学) FAIR at Meta(Meta的FAIR团队) UC Berkeley(加州大学伯克利分校)

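AI summary: Proposes PTLD, which distills a robust tactile state estimator from real-world tactile policy data collected with privileged sensors, avoiding the need for tactile simulation. Compared with a proprioception-only policy, it improves in-hand rotation by 182% and raises the number of goals reached in tactile in-hand reorientation by 57%.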

详情
AI中文摘要

触觉灵巧操作对于自动化复杂家务任务至关重要,但学习有效控制策略仍然是一个挑战。虽然最近的工作依赖于模仿学习,但通过机器人遥操作或动觉教学获取多指手的高质量演示是困难的。另一种方法是,通过强化学习我们可以在仿真中学习技能,但快速且真实的触觉观测仿真具有挑战性。为了弥合这一差距,我们引入了PTLD:从仿真到现实的特权触觉潜在蒸馏,这是一种无需触觉仿真即可学习触觉操作技能的新方法。我们的关键思想不是模拟触觉传感器或纯粹依赖本体感策略进行零样本从仿真到现实的迁移,而是利用现实世界中的特权传感器收集真实的触觉策略数据。然后,这些数据用于蒸馏一个鲁棒的状态估计器,该估计器基于触觉输入运行。我们的实验表明,PTLD可以通过结合触觉感知显著改善在仿真中训练的本体感操作策略。在基准的掌内旋转任务中,PTLD相比纯本体感策略实现了182%的提升。我们还展示了PTLD能够学习具有挑战性的触觉掌内重定向任务,在该任务中,我们观察到达到的目标数量相比仅使用本体感提高了57%。网站:https://akashsharma02.github.io/ptld-website/

英文摘要

Tactile dexterous manipulation is essential to automating complex household tasks, yet learning effective control policies remains a challenge. While recent work has relied on imitation learning, obtaining high-quality demonstrations for multi-fingered hands via robot teleoperation or kinesthetic teaching is prohibitive. Alternatively, with reinforcement learning we can learn skills in simulation, but fast and realistic simulation of tactile observations is challenging. To bridge this gap, we introduce PTLD: sim-to-real Privileged Tactile Latent Distillation, a novel approach to learning tactile manipulation skills without requiring tactile simulation. Instead of simulating tactile sensors or relying purely on proprioceptive policies to transfer zero-shot sim-to-real, our key idea is to leverage privileged sensors in the real world to collect real-world tactile policy data. This data is then used to distill a robust state estimator that operates on tactile input. Our experiments demonstrate that PTLD can significantly improve proprioceptive manipulation policies trained in simulation by incorporating tactile sensing. On the benchmark in-hand rotation task, PTLD achieves a 182% improvement over a proprioception-only policy. We also show that PTLD enables learning the challenging task of tactile in-hand reorientation, where we see a 57% improvement in the number of goals reached over using proprioception alone. Website: https://akashsharma02.github.io/ptld-website/.

2511.22486 2026-06-19 physics.plasm-ph cs.LG 版本更新

The Machine Learning Approach to Moment Closure Relations for Plasma: A Review

等离子体矩闭包关系的机器学习方法:综述

Samuel Burles, Enrico Camporeale

发表机构 * School of Physical and Chemical Sciences, Queen Mary University of London(伦敦大学女王学院物理与化学科学学院) Space Weather TREC, University of Colorado(科罗拉多大学空间天气TREC)

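AI summary: Reviews machine learning approaches to improved closure models for plasma fluid models, covering neural-network surrogates and equation-discovery methods, and discusses the challenges of offline testing versus online simulation along with future directions.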

Comments 58 pages, 6 figures

详情
AI中文摘要

大规模等离子体全局模拟的需求是空间和实验室等离子体物理学中持续存在的挑战。任何基于流体模型的模拟都固有地需要高阶等离子体矩的闭包关系。本综述汇编并分析了近期涌现的机器学习方法,这些方法旨在开发改进的等离子体闭包模型,能够在等离子体流体模型中捕捉动力学现象。我们调查了两类方法:神经网络代理(从多层感知器到傅里叶神经算子,后者最近在流体求解器内在线复现了线性和非线性朗道阻尼)和方程发现方法(如稀疏回归);并根据这些研究是离线对照参考数据测试还是在线在时间演化求解器内测试进行组织。我们概述了与机器学习闭包相关的挑战,包括非对角压力张量精度、超出训练分布的泛化能力以及稳定集成到大尺度模拟中,并指出了未来研究可能解决这些问题的方向。

英文摘要

The requirement for large-scale global simulations of plasma is an ongoing challenge in both space and laboratory plasma physics. Any simulation based on a fluid model inherently requires a closure relation for the high-order plasma moments. This review compiles and analyses the recent surge of machine learning approaches developing improved plasma closure models capable of capturing kinetic phenomena within plasma fluid models. We survey two methodological families: neural-network surrogates (from multilayer perceptrons to Fourier neural operators, the latter recently reproducing both linear and non-linear Landau damping online within a fluid solver) and equation-discovery methods such as sparse regression; and organise the studies by whether they are tested offline against reference data or online within a time-evolving solver. We outline the challenges associated with machine-learning closures, including off-diagonal pressure-tensor accuracy, generalisation beyond the training distribution, and stable integration into large-scale simulations, and the directions future research might take to address them.

2604.15838 2026-06-19 cs.LG 版本更新

Reversible Residual Normalization Alleviates Spatio-Temporal Distribution Shift

可逆残差归一化缓解时空分布偏移

Zhaobo Hu, Vincent Gauthier, Mehdi Naima

发表机构 * CNRS, LIP6, Sorbonne Université

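AI summary: To address spatio-temporal distribution shift, proposes the Reversible Residual Normalization (RRN) framework, which applies spatially-aware invertible transformations to handle shift along both spatial and temporal dimensions, combining Center Normalization with spectral-constrained graph neural networks for adaptive normalization.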

详情
AI中文摘要

分布偏移严重降低了深度预测模型的性能。虽然这一问题在单变量时间序列中已有充分研究,但在时空领域中仍然是一个重大挑战。有效的解决方案如实例归一化及其变体可以通过标准化统计量来缓解时间偏移。然而,图上的分布偏移更为复杂,不仅涉及单个节点序列的漂移,还涉及空间网络中的异质性,其中不同节点表现出不同的统计特性。为了解决这个问题,我们提出了可逆残差归一化(RRN),一种新颖的框架,执行空间感知的可逆变换以解决空间和时间维度上的分布偏移。我们的方法在可逆残差块中集成了图卷积操作,实现了在保持可逆性的同时尊重底层图结构的自适应归一化。通过将中心归一化与谱约束图神经网络相结合,我们的方法以数据驱动的方式捕获和归一化复杂的时空关系。我们框架的双向性允许模型在归一化的潜在空间中学习,并通过逆变换恢复原始分布特性,为动态时空系统上的预测提供了一种鲁棒且模型无关的解决方案。

英文摘要

Distribution shift severely degrades the performance of deep forecasting models. While this issue is well-studied for individual time series, it remains a significant challenge in the spatio-temporal domain. Effective solutions like instance normalization and its variants can mitigate temporal shifts by standardizing statistics. However, distribution shift on a graph is far more complex, involving not only the drift of individual node series but also heterogeneity across the spatial network where different nodes exhibit distinct statistical properties. To tackle this problem, we propose Reversible Residual Normalization (RRN), a novel framework that performs spatially-aware invertible transformations to address distribution shift in both spatial and temporal dimensions. Our approach integrates graph convolutional operations within invertible residual blocks, enabling adaptive normalization that respects the underlying graph structure while maintaining reversibility. By combining Center Normalization with spectral-constrained graph neural networks, our method captures and normalizes complex Spatio-Temporal relationships in a data-driven manner. The bidirectional nature of our framework allows models to learn in a normalized latent space and recover original distributional properties through inverse transformation, offering a robust and model-agnostic solution for forecasting on dynamic spatio-temporal systems.
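The instance-normalization baseline that the abstract mentions can be sketched as below. This is only a minimal illustration of the reversible idea (standardize each series, work in the normalized space, then invert with the stored statistics); it is not the authors' RRN and contains no graph component. The function names `normalize` and `denormalize` and the `eps` constant are assumptions made for this sketch.

```python
import math

def normalize(series, eps=1e-8):
    """Standardize one series and return the statistics needed to invert the step."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(var + eps)  # eps (assumed) guards against constant series
    return [(x - mean) / std for x in series], (mean, std)

def denormalize(values, stats):
    """Map values from the normalized space back to the original scale."""
    mean, std = stats
    return [v * std + mean for v in values]

# Hypothetical toy window for one node of the spatial network.
window = [10.0, 12.0, 11.0, 13.0]
z, stats = normalize(window)
restored = denormalize(z, stats)
```

Because the statistics are kept per series, a forecaster can be trained on `z` and its outputs passed through `denormalize`, which is the bidirectional pattern the abstract describes in general terms.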

2604.13416 2026-06-19 cs.CV cs.AI 版本更新

DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis

DF3DV-1K:用于无干扰新视角合成的大规模数据集与基准

Cheng-You Lu, Yi-Shan Hung, Wei-Ling Chi, Hao-Ping Wang, Charlie Li-Ting Tsai, Yu-Cheng Chang, Yu-Lun Liu, Thomas Do, Chin-Teng Lin

发表机构 * University of Technology Sydney(悉尼科技大学) University of Sydney(悉尼大学) National Yang Ming Chiao Tung University(阳明交通大学)

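AI summary: To fill the lack of a large-scale real-world dataset for distractor-free radiance fields, builds DF3DV-1K, with 1,048 scenes that each provide clean and cluttered image sets, and uses it to benchmark nine recent distractor-free methods plus 3D Gaussian Splatting, identifying the most robust methods and the most challenging scenarios.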

详情
AI中文摘要

辐射场领域的进展已实现逼真的新视角合成。在多个领域中,已开发出大规模真实世界数据集以支持全面基准测试并促进超越场景特定重建的进展。然而,对于无干扰辐射场,每个场景同时包含干净和杂乱图像的大规模数据集仍然缺乏,限制了发展。为填补这一空白,我们引入了DF3DV-1K,一个包含1048个场景的大规模真实世界数据集,每个场景提供干净和杂乱的图像集用于基准测试。该数据集总共包含89,924张使用消费级相机拍摄的图像,模拟随意拍摄,涵盖128种干扰类型和161种场景主题,包括室内和室外环境。一个精心挑选的41个场景子集DF3DV-41被系统设计用于评估无干扰辐射场方法在挑战性场景下的鲁棒性。利用DF3DV-1K,我们对九种最新的无干扰辐射场方法和3D高斯泼溅进行了基准测试,识别出最鲁棒的方法和最具挑战的场景。除了基准测试,我们还展示了DF3DV-1K的一个应用:微调基于扩散的2D增强器以改进辐射场方法,在保留集(例如DF3DV-41)和On-the-go数据集上实现了平均0.96 dB PSNR和0.057 LPIPS的提升。我们希望DF3DV-1K能促进无干扰视觉的发展,并推动超越场景特定方法的进步。数据集和排行榜可在以下网址获取:https://johnnylu305.github.io/df3dv1k_web/

英文摘要

Advances in radiance fields have enabled photorealistic novel view synthesis. In several domains, large-scale real-world datasets have been developed to support comprehensive benchmarking and to facilitate progress beyond scene-specific reconstruction. However, for distractor-free radiance fields, a large-scale dataset with clean and cluttered images per scene remains lacking, limiting the development. To address this gap, we introduce DF3DV-1K, a large-scale real-world dataset comprising 1,048 scenes, each providing clean and cluttered image sets for benchmarking. In total, the dataset contains 89,924 images captured using consumer cameras to mimic casual capture, spanning 128 distractor types and 161 scene themes across indoor and outdoor environments. A curated subset of 41 scenes, DF3DV-41, is systematically designed to evaluate the robustness of distractor-free radiance field methods under challenging scenarios. Using DF3DV-1K, we benchmark nine recent distractor-free radiance field methods and 3D Gaussian Splatting, identifying the most robust methods and the most challenging scenarios. Beyond benchmarking, we demonstrate an application of DF3DV-1K by fine-tuning a diffusion-based 2D enhancer to improve radiance field methods, achieving average improvements of 0.96 dB PSNR and 0.057 LPIPS on the held-out set (e.g., DF3DV-41) and the On-the-go dataset. We hope DF3DV-1K facilitates the development of distractor-free vision and promotes progress beyond scene-specific approaches. The dataset and leaderboard are available at https://johnnylu305.github.io/df3dv1k_web/.
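Since the reported gains are stated in dB of PSNR, a short sketch of how PSNR is computed may help in reading them. This uses the standard definition, 10 * log10(MAX^2 / MSE); the flat-list image representation, the `max_value` default of 1.0, and the toy pixel values are assumptions for illustration and are not taken from the benchmark.

```python
import math

def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images given as flat intensity lists."""
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-pixel images with intensities in [0, 1].
reference = [0.0, 0.5, 1.0, 0.25]
rendered = [0.1, 0.5, 0.9, 0.25]
score = psnr(reference, rendered)  # MSE is 0.005 here
```

For a single image, a gain of 0.96 dB corresponds to multiplying the MSE by 10^(-0.096), roughly 0.80. An average over many images does not translate this directly, so the figure is only a rough guide.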

2604.13240 2026-06-19 cs.CV cs.LG 版本更新

A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models

基于概念的可解释AI的高分辨率景观数据集及其在物种分布模型中的应用

Augustin de la Brosse, Damien Garreau, Thomas Houet, Thomas Corpetti

发表机构 * Université Rennes 2, CNRS, Nantes Université, Univ Brest, LETG, UMR 6554(雷恩第二大学、法国国家科学研究中心、南特大学、布雷斯特大学、LETG、UMR 6554) LTSER Zone Atelier Armorique(Armorique 领域实验室区) University of Würzburg, Center for Artificial Intelligence and Data Science(维尔茨堡大学人工智能与数据科学中心)

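AI summary: Proposes the first concept-based explainable AI (XAI) approach for species distribution models. A landscape concept dataset is built from high-resolution multispectral and LiDAR drone imagery, and Robust TCAV quantifies the influence of landscape concepts on model predictions; a case study on two aquatic insects illustrates the approach.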

详情
AI中文摘要

绘制物种空间分布对于保护政策和入侵物种管理至关重要。物种分布模型(SDMs)是完成此任务的主要工具,具有两个目的:实现稳健的预测性能,同时提供关于分布驱动因素的生态见解。然而,深度学习SDMs日益增长的复杂性使得提取这些见解更具挑战性。为了调和这些目标,我们提出了首个基于概念的可解释AI(XAI)在SDMs中的实现。我们利用Robust TCAV(测试与概念激活向量)方法量化景观概念对模型预测的影响。为此,我们提供了一个新的开放获取的景观概念数据集,该数据集源自高分辨率多光谱和LiDAR无人机影像。它包括跨越15个不同景观概念的653个斑块和1,450个随机参考斑块,旨在适用于广泛的物种。我们通过两个水生昆虫(襀翅目和毛翅目)的案例研究,使用两个卷积神经网络和一个视觉Transformer来展示这种方法。结果表明,基于概念的XAI有助于根据专家知识验证SDMs,同时发现产生新生态假说的新颖关联。Robust TCAV还提供了景观层面的信息,对政策制定和土地管理有用。代码和数据集公开可用。

英文摘要

Mapping the spatial distribution of species is essential for conservation policy and invasive species management. Species distribution models (SDMs) are the primary tools for this task, serving two purposes: achieving robust predictive performance while providing ecological insights into the driving factors of distribution. However, the increasing complexity of deep learning SDMs has made extracting these insights more challenging. To reconcile these objectives, we propose the first implementation of concept-based Explainable AI (XAI) for SDMs. We leverage the Robust TCAV (Testing with Concept Activation Vectors) methodology to quantify the influence of landscape concepts on model predictions. To enable this, we provide a new open-access landscape concept dataset derived from high-resolution multispectral and LiDAR drone imagery. It includes 653 patches across 15 distinct landscape concepts and 1,450 random reference patches, designed to suit a wide range of species. We demonstrate this approach through a case study of two aquatic insects, Plecoptera and Trichoptera, using two Convolutional Neural Networks and one Vision Transformer. Results show that concept-based XAI helps validate SDMs against expert knowledge while uncovering novel associations that generate new ecological hypotheses. Robust TCAV also provides landscape-level information, useful for policy-making and land management. Code and datasets are publicly available.
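To make the TCAV idea concrete, here is a minimal sketch of the basic TCAV score: the fraction of inputs whose directional derivative along a concept activation vector (CAV) is positive. It follows the original TCAV formulation, not necessarily the Robust TCAV variant used in the paper, whose details the abstract does not give. The gradient values and the CAV are hypothetical toy numbers.

```python
def tcav_score(gradients, cav):
    """Fraction of inputs whose directional derivative along the concept vector is positive."""
    positive = 0
    for grad in gradients:
        # Directional derivative: dot product of the gradient with the CAV.
        directional_derivative = sum(g * v for g, v in zip(grad, cav))
        if directional_derivative > 0:
            positive += 1
    return positive / len(gradients)

# Hypothetical values: each row is the gradient of the model output with respect
# to a layer's activations for one input patch; cav is the concept direction.
cav = [1.0, 0.0, -1.0]
gradients = [
    [0.5, 0.2, 0.1],
    [-0.3, 0.9, 0.4],
    [0.2, -0.1, -0.6],
    [0.1, 0.0, 0.3],
]
score = tcav_score(gradients, cav)
```

A score near 1 would suggest that moving activations toward the concept (say, a particular landscape type) tends to raise the predicted presence of the species. A score near 0.5 suggests no consistent effect.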

2604.11556 2026-06-19 cs.SE cs.AI 版本更新

FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning

FM-Agent: 通过基于LLM的Hoare风格推理将形式化方法扩展到大型系统

Haoran Ding, Zhaoguo Wang, Haibo Chen

发表机构 * Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University(并行与分布式系统研究所,上海交通大学)

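AI summary: Proposes FM-Agent, a framework that uses LLMs to generate function-level specifications automatically and so enables compositional reasoning about large systems. It reasons about systems of up to 143k lines of code within 2 days and finds 522 new bugs.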

详情
AI中文摘要

LLM辅助的软件开发已日益普遍,并能生成如编译器这样的大型系统。增强生成代码的正确性变得至关重要。然而,由于代码复杂性,大型系统的自动推理仍然具有挑战性。Hoare逻辑提供了一种将大型系统分解为较小组件并分别推理(即组合式推理)的方法。然而,现有工作仍难以扩展,因为Hoare逻辑要求为每个函数编写形式化规范,给人类带来沉重负担。当代码由LLM生成时,问题更加严重,因为开发人员缺乏对每个函数预期行为的深入理解。本文提出FM-Agent,这是第一个实现大型系统自动化组合式推理的框架。利用LLM,FM-Agent引入了一种自顶向下的范式来自动生成函数级规范。具体来说,FM-Agent从调用者期望函数如何行为中推导出函数的规范,因此即使实现有缺陷,生成的规范也能反映开发者的意图。开发者的意图通常用自然语言表达,而现有的验证器只支持公式。因此,FM-Agent推广了Hoare风格推理,以针对自然语言规范推理函数。最后,为了确认错误存在并解释错误原因,FM-Agent自动生成测试用例以触发潜在错误。在我们的评估中,FM-Agent在2天内成功推理了大型系统,每个系统最多有143k行代码。这些系统已经由开发者测试过,但FM-Agent仍然发现了522个新错误。这些错误可能导致严重后果,包括系统崩溃和错误的执行结果。

英文摘要

LLM-assisted software development has become increasingly prevalent, and can generate large-scale systems, such as compilers. It becomes crucial to strengthen the correctness of the generated code. However, automated reasoning for large-scale systems remains challenging due to code complexity. Hoare logic offers an approach to decomposing a large system into smaller components and reasoning about them separately (i.e., compositional reasoning). However, existing works still struggle to scale, because Hoare logic requires writing formal specifications for each function, imposing a heavy human burden. The problem is exacerbated when code is generated by LLMs, as developers lack a deep understanding of each function's expected behavior. This paper presents FM-Agent, the first framework that realizes automated compositional reasoning for large-scale systems. Leveraging LLMs, FM-Agent introduces a top-down paradigm to automatically generate function-level specifications. Specifically, FM-Agent derives the specification of a function from how its callers expect the function to behave, so the generated specifications can reflect the developer's intent of a function even if the implementation is buggy. Developers' intent is usually expressed in natural language, while existing verifiers only support formulas. Therefore, FM-Agent generalizes Hoare-style inference to reason about functions against natural-language specifications. Finally, to confirm bug existence and explain bug causes, FM-Agent automatically generates test cases to trigger potential bugs. In our evaluation, FM-Agent successfully reasons about large-scale systems within 2 days, each of which has up to 143k LoC. These systems have already been tested by their developers, but FM-Agent still finds 522 newly discovered bugs. These bugs can cause serious consequences, including system crashes and incorrect execution results.
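The notion of checking a function against a caller-derived specification can be illustrated with a runtime analogue of a Hoare triple {P} f {Q}. This is a testing sketch under stated assumptions, not a proof procedure and not FM-Agent's implementation. The helper `check_triple`, the deliberately faulty `buggy_abs`, and the predicates are all hypothetical.

```python
def check_triple(pre, func, post, inputs):
    """Return the inputs that violate {pre} func {post}; inputs failing pre are skipped."""
    violations = []
    for x in inputs:
        if not pre(x):
            continue
        result = func(x)
        if not post(x, result):
            violations.append(x)
    return violations

def buggy_abs(x):
    # Deliberately wrong for x == -1, to mimic an implementation bug.
    return x if x >= -1 else -x

# Specification written from the caller's expectation, not from the implementation:
# the result is non-negative and equals x or -x.
def pre(x):
    return isinstance(x, int)

def post(x, result):
    return result >= 0 and result in (x, -x)

violations = check_triple(pre, buggy_abs, post, range(-3, 4))
```

Because the postcondition comes from what callers expect, the check flags `-1` even though the implementation looks internally consistent. This mirrors the abstract's point that specifications should reflect intent and not the possibly buggy code.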

2601.02149 2026-06-19 cond-mat.mes-hall cond-mat.dis-nn cs.AI 版本更新

AI-enhanced tuning of quantum dot Hamiltonians toward Majorana modes

基于人工智能的量子点哈密顿量调优以实现马约拉纳模式

Mateusz Krawczyk, Jarosław Pawłowski

发表机构 * Institute of Theoretical Physics, Wrocław University of Science and Technology(弗罗茨瓦夫科技大学理论物理研究所)

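AI summary: Proposes a neural-network model that learns the working regimes of quantum dot simulators and uses transport measurements to autotune devices toward Majorana modes. The model is trained without supervision on synthetic conductance maps, with a physics-informed loss that encodes key properties of Majorana zero modes.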

Comments 12 pages, 8 figures, 2 tables

Journal ref Phys. Rev. Applied 25, 064032 (2026)

详情
AI中文摘要

我们提出了一种基于神经网络的模型,能够学习量子点模拟器广泛的工作区域,并利用此知识通过输运测量自动调优这些设备,以在结构中获得马约拉纳模式。模型在无监督条件下训练于电导图形式的合成数据,采用融合马约拉纳零模关键性质的物理引导损失函数。我们展示了通过适当训练,深度视觉变换器网络可以高效记忆哈密顿量参数与电导图结构之间的关系,并利用此提出量子点链参数更新,驱动系统进入拓扑相。从参数空间中广泛的初始失谐范围开始,单步更新足以生成非平凡零模。此外,通过启用迭代调优过程(系统在每一步获得更新的电导图),我们证明该方法可以处理参数空间中大得多的区域。

英文摘要

We propose a neural network-based model capable of learning the broad landscape of working regimes in quantum dot simulators and of using this knowledge to autotune these devices, based on transport measurements, toward obtaining Majorana modes in the structure. The model is trained in an unsupervised manner on synthetic data in the form of conductance maps, using a physics-informed loss that incorporates key properties of Majorana zero modes. We show that, with appropriate training, a deep vision-transformer network can efficiently memorize the relation between Hamiltonian parameters and structures on conductance maps and use it to propose parameter updates for a quantum dot chain that drive the system toward the topological phase. Starting from a broad range of initial detunings in parameter space, a single update step is sufficient to generate nontrivial zero modes. Moreover, by enabling an iterative tuning procedure, where the system acquires updated conductance maps at each step, we demonstrate that the method can address a much larger region of the parameter space.

2604.03725 2026-06-19 quant-ph cs.IT eess.SP math.IT 版本更新

Quantum Algebraic Diversity: Single-Copy Density Matrix Estimation via Group-Structured Measurements

量子代数多样性:通过群结构测量进行单副本密度矩阵估计

Mitchell A. Thornton

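AI summary: Extends the algebraic diversity framework to quantum measurement and proposes the Quantum Algebraic Diversity Theorem: a group-structured POVM applied to a single copy of a quantum state yields a full-rank, group-averaged density matrix estimator, with fidelity above 0.90 in Monte Carlo simulations for d = 2 to 13. Also establishes a Classical-Quantum Duality Map and an Optimality Inheritance Theorem.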

Comments v3: copy-reduction claim corrected; fidelities fixed; 1 figure removed

详情
AI中文摘要

我们将代数多样性(AD)框架从经典信号处理扩展到量子测量理论。量子代数多样性(QAD)定理表明,应用于量子态单副本的群结构正算子值测度(POVM)会产生一个满秩的群平均密度矩阵估计量,其特征基和特征值排序追踪真实密度矩阵的特征基和特征值排序,并偏向对称化态,类似于从单个观测中恢复协方差特征结构的经典情况。我们建立了一个经典-量子对偶映射,将经典协方差估计与量子态层析成像联系起来,以及一个最优性继承定理,表明经典群最优性通过Born映射在群平均族内转移到量子设置。SIC-POVM被识别为Heisenberg-Weyl群的AD,互无偏基被识别为Clifford群的AD,揭示了层次结构$\mathrm{HW}(d) \subseteq \mathcal{C}(d) \subseteq S_d$,这镜像了经典的$\mathbb{Z}_M \subseteq G_{\min} \subseteq S_M$。双对易子特征值定理给出了多项式时间自适应POVM选择。一个量子比特算例展示了来自单个计算基测量的群平均估计量,在匹配的$\mathbb{Z}_2$群上平均后,达到保真度0.99,而标准单基层析成像给出的秩1估计保真度为0.80。对于$d=2$到13的蒙特卡洛模拟证实,来自单个结果的保真度高于0.90,而标准保真度按$\sim 1/d$退化。增长比率反映了秩1标准估计量的崩溃,而不是每个参数的更少副本:有偏的单副本估计量减少了不同测量设置的数目,而不是每个参数的采样成本,并且真正的副本减少仅在精确对称下成立。

英文摘要

We extend the algebraic diversity (AD) framework from classical signal processing to quantum measurement theory. The Quantum Algebraic Diversity (QAD) Theorem establishes that a group-structured positive operator-valued measure (POVM) applied to a single copy of a quantum state produces a full-rank, group-averaged density matrix estimator whose eigenbasis and eigenvalue ordering track those of the true density matrix, with a bias toward the symmetrized state, analogous to the classical recovery of covariance eigenstructure from a single observation. We establish a Classical-Quantum Duality Map connecting classical covariance estimation to quantum state tomography, and an Optimality Inheritance Theorem showing that classical group optimality transfers to quantum settings via the Born map within the group-averaged family. SIC-POVMs are identified as AD with the Heisenberg-Weyl group and mutually unbiased bases as AD with the Clifford group, revealing the hierarchy $\mathrm{HW}(d) \subseteq \mathcal{C}(d) \subseteq S_d$ that mirrors the classical $\mathbb{Z}_M \subseteq G_{\min} \subseteq S_M$. The double-commutator eigenvalue theorem gives polynomial-time adaptive POVM selection. A worked qubit example shows the group-averaged estimator from a single computational-basis measurement, averaged over a matched $\mathbb{Z}_2$ group, reaching fidelity 0.99 where standard single-basis tomography gives a rank-1 estimate of fidelity 0.80. Monte Carlo simulations for $d = 2$ to $13$ confirm fidelity above 0.90 from a single outcome while standard fidelity degrades as $\sim 1/d$. The growing ratio reflects collapse of the rank-1 standard estimator, not fewer copies per parameter: the biased single-copy estimator reduces the number of distinct measurement settings, not the per-parameter sampling cost, and a genuine copy reduction holds only under exact symmetry.

2604.09795 2026-06-19 eess.SY cs.RO cs.SY 版本更新

On Feedback Speed Control for a Planar Tracking

平面跟踪中的反馈速度控制

Xincheng Li, Tengyue Liu, Udit Halder

发表机构 * Department of Mechanical and Aerospace Engineering, University of South Florida(南佛罗里达大学机械与航空航天工程系)

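AI summary: For leader-follower planar tracking, proposes a feedback speed control law paired with a constant bearing steering strategy that maintains an abreast formation, proves asymptotic stability when the leader's steering is known, and extends the law to an N-agent chain network.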

详情
AI中文摘要

本文研究了领航者和跟随者之间的平面跟踪问题。我们提出了一种新颖的反馈速度控制律,结合恒定方位角转向策略,以保持两个智能体之间的并排编队。我们证明了当领航者的转向已知时,所提出的控制使闭环系统渐近稳定。对于跟随者无法获取领航者转向的情况,我们表明系统相对于被视为输入的领航者转向仍然是输入-状态稳定的。此外,我们证明如果领航者的转向是周期性的,跟随者将渐近收敛到具有相同周期的周期轨道。我们通过数值模拟和移动机器人实验验证了这些结果。最后,我们通过将两智能体控制律扩展到N智能体链网络,展示了所提出方法的可扩展性,并说明了其在生物和工程群体中方向信息传播的意义。

英文摘要

This paper investigates a planar tracking problem between a leader and a follower agent. We propose a novel feedback speed control law, paired with a constant bearing steering strategy, to maintain an abreast formation between the two agents. We prove that the proposed control yields asymptotic stability of the closed-loop system when the steering of the leader is known. For the case when the leader's steering is unavailable to the follower, we show that the system is still input-to-state stable with respect to the leader's steering viewed as an input. Furthermore, we demonstrate that if the leader's steering is periodic, the follower will asymptotically converge to a periodic orbit with the same period. We validate these results through numerical simulations and experimental implementations on mobile robots. Finally, we demonstrate the scalability of the proposed approach by extending the two-agent control law to an N-agent chain network, illustrating its implications for directional information propagation in biological and engineered flocks.

2604.08552 2026-06-19 cs.DB cs.AI 版本更新

Automated Standardization of Legacy Biomedical Metadata Using an Ontology-Constrained LLM Agent

使用本体约束的LLM代理自动化标准化遗留生物医学元数据

Josef Hardi, Martin J. O'Connor, Marcos Martinez-Romero, Jean G. Rosario, Stephen A. Fisher, Mark A. Musen

发表机构 * Division of Computational Medicine, Stanford University(斯坦福大学计算医学部) Department of Biology, University of Pennsylvania(宾夕法尼亚大学生物学系)

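AI summary: Proposes an LLM-based metadata standardization system that queries reporting guidelines and ontology services in real time. Evaluated on 839 HuBMAP records, it consistently improves prediction accuracy over the LLM alone.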

详情
AI中文摘要

科学元数据通常不完整且不符合社区标准,限制了数据集的可发现性、互操作性和重用。即使存在标准元数据报告指南,它们通常缺乏机器可操作的表征。生成FAIR数据集需要将元数据标准编码为具有丰富字段规范和精确值约束的机器可操作模板。最近的研究表明,由字段名称和本体约束引导的LLM可以改善元数据标准化,但这些方法将约束视为静态文本提示,仅依赖模型的训练知识。我们提出了一种基于LLM的元数据标准化系统,该系统实时查询标准报告指南和权威生物医学术语服务,以按需检索规范正确的标准。我们在来自人类生物分子图谱计划(HuBMAP)的839条遗留元数据记录上评估了该方法,使用专家策划的金标准进行精确匹配评估。我们的评估表明,与仅使用LLM相比,通过实时工具访问增强LLM在受本体约束和不受本体约束的字段上均持续提高了预测准确性,展示了一种实用的生物医学元数据自动化标准化方法。

英文摘要

Scientific metadata are often incomplete and noncompliant with community standards, limiting dataset findability, interoperability, and reuse. Even when standard metadata reporting guidelines exist, they typically lack machine-actionable representations. Producing FAIR datasets requires encoding metadata standards as machine-actionable templates with rich field specifications and precise value constraints. Recent work has shown that LLMs guided by field names and ontology constraints can improve metadata standardization, but these approaches treat constraints as static text prompts, relying on the model's training knowledge alone. We present an LLM-based metadata standardization system that queries standard reporting guidelines and authoritative biomedical terminology services in real time to retrieve canonically correct standards on demand. We evaluate this approach on 839 legacy metadata records from the Human BioMolecular Atlas Program (HuBMAP) using an expert-curated gold standard for exact-match assessment. Our evaluation shows that augmenting the LLM with real-time tool access consistently improves prediction accuracy over the LLM alone across both ontology-constrained and non-ontology-constrained fields, demonstrating a practical approach to automated standardization of biomedical metadata.

2602.22495 2026-06-19 cs.LG cs.AI 版本更新

Reinforcement-aware Knowledge Distillation for LLM Reasoning

面向LLM推理的强化学习感知知识蒸馏

Zhaoyang Zhang, Shuli Jiang, Yantao Shen, Yuting Zhang, Dhananjay Ram, Shuo Yang, Zhuowen Tu, Wei Xia, Stefano Soatto

发表机构 * Meta

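AI summary: Proposes RL-aware distillation (RLAD), which uses Trust Region Ratio Distillation (TRRD) for selective imitation during RL post-training, addressing distribution mismatch and objective interference; it outperforms offline distillation, standard GRPO, and KL-based on-policy distillation on logic reasoning and math benchmarks.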

详情
AI中文摘要

强化学习(RL)后训练最近推动了长链思维推理大语言模型(LLM)的重大进展,但这类模型的高推理成本促使将其蒸馏到更小的学生模型中。大多数现有的知识蒸馏(KD)方法是为监督微调(SFT)设计的,依赖于固定的教师轨迹或基于教师-学生KL散度的正则化。当与RL结合时,这些方法常常遭受分布不匹配和目标干扰:教师监督可能与学生不断变化的rollout分布不一致,并且KL正则化项可能与奖励最大化竞争,需要仔细的损失平衡。为了解决这些问题,我们提出了RL感知蒸馏(RLAD),它在RL期间执行选择性模仿——仅在改进当前策略更新时引导学生向教师学习。我们的核心组件,信任区域比率蒸馏(TRRD),用基于PPO/GRPO风格似然比的目标替代教师-学生KL正则化项,该目标锚定到教师-旧策略混合,从而在学生rollout上产生优势感知、信任区域约束的蒸馏,并自然平衡探索、利用和模仿。在多种逻辑推理和数学基准上,RLAD始终优于离线蒸馏、标准GRPO和基于KL的在策略教师-学生知识蒸馏。

英文摘要

Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference cost of such models motivates distillation into smaller students. Most existing knowledge distillation (KD) methods are designed for supervised fine-tuning (SFT), relying on fixed teacher traces or teacher-student Kullback-Leibler (KL) divergence-based regularization. When combined with RL, these approaches often suffer from distribution mismatch and objective interference: teacher supervision may not align with the student's evolving rollout distribution, and the KL regularizer can compete with reward maximization and require careful loss balancing. To address these issues, we propose RL-aware distillation (RLAD), which performs selective imitation during RL -- guiding the student toward the teacher only when it improves the current policy update. Our core component, Trust Region Ratio Distillation (TRRD), replaces the teacher-student KL regularizer with a PPO/GRPO-style likelihood-ratio objective anchored to a teacher--old-policy mixture, yielding advantage-aware, trust-region-bounded distillation on student rollouts and naturally balancing exploration, exploitation, and imitation. Across diverse logic reasoning and math benchmarks, RLAD consistently outperforms offline distillation, standard GRPO, and KL-based on-policy teacher-student knowledge distillation.

2604.07593 2026-06-19 cs.AI 版本更新

Too long; didn't solve

太长;没解决

Lucía M. Cabrera, Isaac Saxton-Knight, Jocelyn D'Arcy

发表机构 * Instituto Balseiro(巴尔塞罗研究所) Poindexter Labs(波因迪克斯实验室)

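AI summary: Studies how prompt length and solution length relate to large language model performance on mathematics problems, finding that both correlate positively with model failure.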

详情
AI中文摘要

由一系列数学问题组成的数学基准被广泛用于评估大型语言模型的推理能力,但关于其结构特性如何影响模型行为的研究很少。在这项工作中,我们研究了两个结构长度变量——提示长度和解答长度,并分析了它们如何与模型在新构建的、由专家编写的对抗性数学问题数据集上的性能相关。我们发现,提示长度和解答长度均与模型失败率的增加呈正相关。我们还进行了跨模型分歧的探索性辅助分析。在难度调整的归一化分析下,两个变量与实现模型分离仍保持弱负相关,提示长度的关联稍强。总体而言,我们的主要稳健发现是,结构长度与该数据集中的经验难度相关。

英文摘要

Mathematical benchmarks consisting of a range of mathematics problems are widely used to evaluate the reasoning abilities of large language models, yet little is known about how their structural properties influence model behaviour. In this work, we investigate two structural length variables, prompt length and solution length, and analyse how they relate to model performance on a newly constructed adversarial dataset of expert-authored mathematics problems. We find that both prompt and solution lengths correlate positively with increased model failure across models. We also include a secondary, exploratory analysis of cross-model disagreement. Under a difficulty-adjusted normalised analysis, both variables retain weak negative associations with realised model separation, slightly stronger for prompt length. Overall, our main robust finding is that structural length is linked to empirical difficulty in this dataset.

2505.05306 2026-06-19 cs.LO 版本更新

The calculus of neo-Peircean relations

新皮尔士关系演算

Filippo Bonchi, Alessandro Di Giorgio, Nathan Haydon, Pawel Sobocinski

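AI summary: By moving from a cartesian syntax to a monoidal, diagrammatic one, proposes the calculus of neo-Peircean relations, which is as expressive as first-order logic and therefore circumvents the classical no-go results on finite axiomatisation of the calculus of relations.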

Comments arXiv admin note: substantial text overlap with arXiv:2401.07055

详情
AI中文摘要

关系演算由德摩根和皮尔士在19世纪下半叶引入,作为布尔类代数的扩展。后来弗雷格和皮尔士本人对量化理论的发展,为今天所知的一阶逻辑铺平了道路,导致关系演算被长期遗忘。直到1941年,塔斯基提出了关于其是否存在完全公理化的问题。这个问题只得到了否定的答案:关系演算及其许多片段没有有限公理化,后来由若干不可能定理证明。在本文中,我们表明,通过从传统语法(笛卡尔)转向图解语法(幺半),可以为整个演算提供完全公理化。不可能定理被规避,因为我们的演算,称为新皮尔士关系演算,比关系演算更具表达力,实际上与一阶逻辑一样具有表达力。公理是通过结合两个著名的范畴结构:笛卡尔双范畴和线性双范畴而获得的。

英文摘要

The calculus of relations was introduced by De Morgan and Peirce during the second half of the 19th century, as an extension of Boole's algebra of classes. Later developments on quantification theory by Frege and Peirce himself paved the way to what is known today as first-order logic, causing the calculus of relations to be long forgotten. This was until 1941, when Tarski raised the question of the existence of a complete axiomatisation for it. This question found only negative answers: there is no finite axiomatisation for the calculus of relations and many of its fragments, as shown later by several no-go theorems. In this paper we show that -- by moving from traditional syntax (cartesian) to a diagrammatic one (monoidal) -- it is possible to have complete axiomatisations for the full calculus. The no-go theorems are circumvented by the fact that our calculus, named the calculus of neo-Peircean relations, is more expressive than the calculus of relations and, actually, as expressive as first-order logic. The axioms are obtained by combining two well-known categorical structures: cartesian and linear bicategories.

2604.06464 2026-06-19 cs.LG physics.app-ph stat.ML 版本更新

Weighted Bayesian Conformal Prediction

加权贝叶斯共形预测

Xiayin Lou, Peng Luo

发表机构 * Technical University of Munich(慕尼黑工业大学) Massachusetts Institute of Technology(麻省理工学院)

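AI summary: Proposes Weighted Bayesian Conformal Prediction (WBCP), which generalizes Bayesian conformal prediction to importance-weighted settings via a weighted Dirichlet distribution, proves that the posterior standard deviation decays as O(1/sqrt(n_eff)) in Kish's effective sample size n_eff, and provides richer uncertainty information while maintaining coverage guarantees.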

详情
AI中文摘要

共形预测提供具有有限样本覆盖保证的分布自由预测区间,Snell & Griffiths 最近的工作将其重新解释为贝叶斯求积(BQ-CP),通过阈值上的 Dirichlet 后验产生强大的数据条件保证。然而,BQ-CP 根本上要求 i.i.d. 假设。同时,加权共形预测通过重要性权重处理分布偏移,但仍然是频率学派方法,仅产生点估计阈值。我们提出加权贝叶斯共形预测(WBCP),它将 BQ-CP 推广到任意重要性加权设置,用加权 Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$ 替换均匀 Dirichlet $\mathrm{Dir}(1,\ldots,1)$,其中 $n_{\mathrm{eff}}$ 是 Kish 有效样本量。我们证明了四个理论结果:(1) $n_{\mathrm{eff}}$ 是匹配频率学派和贝叶斯方差的唯一集中参数;(2) 后验标准差以 $O(1/\sqrt{n_{\mathrm{eff}}})$ 衰减;(3) BQ-CP 的随机占优保证扩展到每个权重轮廓的数据条件保证;(4) HPD 阈值在条件覆盖上提供 $O(1/\sqrt{n_{\mathrm{eff}}})$ 的改进。我们将 WBCP 实例化为地理贝叶斯共形预测(Geographical BQ-CP),其中基于核的空间权重产生每个位置的后验,并具有可解释的诊断。在合成和真实空间数据集上的实验表明,WBCP 在保持覆盖保证的同时提供了更丰富的不确定性信息。

英文摘要

Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell & Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose Weighted Bayesian Conformal Prediction (WBCP), which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\mathrm{Dir}(1,\ldots,1)$ with a weighted Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$, where $n_{\mathrm{eff}}$ is Kish's effective sample size. We prove four theoretical results: (1) $n_{\mathrm{eff}}$ is the unique concentration parameter matching frequentist and Bayesian variances; (2) posterior standard deviation decays as $O(1/\sqrt{n_{\mathrm{eff}}})$; (3) BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4) the HPD threshold provides $O(1/\sqrt{n_{\mathrm{eff}}})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as Geographical BQ-CP, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
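The weighted Dirichlet parameters described in the abstract can be sketched in a few lines. Kish's effective sample size is the standard formula (sum of weights)^2 / (sum of squared weights). Taking the normalized weights to be the raw importance weights divided by their sum is an assumption of this sketch, chosen because it makes the uniform case reduce to Dir(1, ..., 1) as the abstract requires. The function names are hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total ** 2 / sum(w ** 2 for w in weights)

def weighted_dirichlet_concentration(weights):
    """Concentration parameters n_eff * w_i / sum(w) for the weighted Dirichlet."""
    total = sum(weights)
    n_eff = kish_effective_sample_size(weights)
    # Assumption: the normalized weights are the raw weights scaled to sum to 1.
    return [n_eff * w / total for w in weights]

# Equal weights: n_eff equals n, so every parameter is 1 (the uniform Dirichlet).
uniform = weighted_dirichlet_concentration([1.0, 1.0, 1.0, 1.0])

# One dominant weight: n_eff drops below n and the mass concentrates on that point.
skewed = weighted_dirichlet_concentration([4.0, 1.0, 1.0, 1.0, 1.0])
```

In the skewed case the five calibration points carry an effective sample size of 3.2, so the parameters sum to 3.2, not 5. This is the mechanism by which uneven importance weights widen the posterior over thresholds.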

2604.06265 2026-06-19 cs.LG cond-mat.stat-mech quant-ph 版本更新

SMT-AD: a scalable quantum-inspired anomaly detection approach

SMT-AD:一种可扩展的量子启发式异常检测方法

Apimuk Sornsaeng, Si Min Chan, Wenxuan Zhang, Swee Liang Wong, Joshua Lim, Jonathan Pan, Dario Poletti

发表机构 * Science, Mathematics and Technology Cluster, Singapore University of Technology and Design(新加坡科技设计大学科学、数学与技术集群) Centre for Quantum Technologies, National University of Singapore(新加坡国立大学量子技术中心) Artificial Intelligence and Data Analytics Strategic Technology Centre, ST Engineering(ST工程人工智能与数据分析战略技术中心) Engineering Product Development Pillar, Singapore University of Technology and Design(新加坡科技设计大学工程产品开发支柱)

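AI summary: Proposes SMT-AD, a quantum-inspired anomaly detection method based on a superposition of multiresolution tensors. Using Fourier-assisted feature embedding and bond-dimension-1 matrix product operators, its parameter count scales linearly, and it performs competitively on standard datasets.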

Comments 12 pages, 5 figures

Details
Abstract

Quantum-inspired tensor network algorithms have been shown to be effective and efficient models for machine learning tasks, including anomaly detection. Here, we propose a highly parallelizable quantum-inspired approach that we call SMT-AD, for Superposition of Multiresolution Tensors for Anomaly Detection. It is based on the superposition of bond-dimension-1 matrix product operators to transform the input data with Fourier-assisted feature embedding, where the number of learnable parameters grows linearly with feature size, embedding resolutions, and the number of additional components in the matrix product operator structure. We demonstrate successful anomaly detection on standard datasets, including credit card transactions, and find that, even with minimal configurations, it achieves competitive performance against established anomaly detection baselines. Furthermore, it provides a straightforward way to reduce the weight of the model and even improve performance by highlighting the most relevant input features.

2604.06001 2026-06-19 physics.comp-ph cs.LG version update

A deep learning framework for jointly solving transient Fokker-Planck equations with arbitrary parameters and initial distributions

Xiaolong Wang, Jing Feng, Qi Liu, Chengli Tan, Yuanyuan Liu, Yong Xu

Affiliations: School of Mathematics and Statistics, Shaanxi Normal University; School of Mathematics and Statistics, Northwestern Polytechnical University; MOE Key Laboratory for Complexity Science in Aerospace, Northwestern Polytechnical University; School of Science, Xi’an University of Posts and Telecommunications; Department of Systems and Control Engineering, Institute of Science Tokyo

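AI summary: Proposes a deep-learning-based pseudo-analytical probability solution (PAPS) that, with a single training run, simultaneously solves transient Fokker-Planck equations for arbitrary multi-modal initial distributions, system parameters, and time points, with inference four orders of magnitude faster than GPU-accelerated Monte Carlo.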

Details
Abstract

Efficiently solving the Fokker-Planck equation (FPE) is central to analyzing complex parameterized stochastic systems. However, current numerical methods lack parallel computation capabilities across varying conditions, severely limiting comprehensive parameter exploration and transient analysis. This paper introduces a deep learning-based pseudo-analytical probability solution (PAPS) that, via a single training process, simultaneously resolves transient FPE solutions for arbitrary multi-modal initial distributions, system parameters, and time points. The core idea is to unify initial, transient, and stationary distributions via Gaussian mixture distributions (GMDs) and develop a constraint-preserving autoencoder that bijectively maps constrained GMD parameters to unconstrained, low-dimensional latent representations. In this representation space, the panoramic transient dynamics across varying initial conditions and system parameters can be modeled by a single evolution network. Extensive experiments on paradigmatic systems demonstrate that the proposed PAPS maintains high accuracy while achieving inference speeds four orders of magnitude faster than GPU-accelerated Monte Carlo simulations. This efficiency leap enables previously intractable real-time parameter sweeps and systematic investigations of stochastic bifurcations. By decoupling representation learning from physics-informed transient dynamics, our work establishes a scalable paradigm for probabilistic modeling of multi-dimensional, parameterized stochastic systems.
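
The idea of mapping constrained Gaussian mixture parameters to an unconstrained representation can be illustrated with a hand-written bijection. This is only a sketch under stated assumptions: it uses a one-dimensional mixture, an additive log-ratio transform for the weights, and a log transform for the standard deviations, none of which come from the abstract. The paper learns its mapping with a constraint-preserving autoencoder, which this example does not reproduce, and the function names are hypothetical.

```python
import math


def to_unconstrained(weights, means, stds):
    """Map (weights on the simplex, means, positive stds) to a free vector."""
    # Additive log-ratio: K-1 free logits relative to the last component.
    logits = [math.log(w / weights[-1]) for w in weights[:-1]]
    log_stds = [math.log(s) for s in stds]
    return logits + list(means) + log_stds


def to_constrained(vec, k):
    """Inverse map: any real vector of length 3k-1 gives valid parameters."""
    logits = vec[:k - 1] + [0.0]          # last component is the reference
    means = vec[k - 1:2 * k - 1]
    log_stds = vec[2 * k - 1:]
    shift = max(logits)                   # subtract the max for stability
    exps = [math.exp(l - shift) for l in logits]
    norm = sum(exps)
    weights = [e / norm for e in exps]    # positive and summing to one
    stds = [math.exp(s) for s in log_stds]
    return weights, means, stds


# Round trip on a three-component mixture.
vec = to_unconstrained([0.2, 0.5, 0.3], [-1.0, 0.0, 2.0], [0.5, 1.0, 1.5])
weights, means, stds = to_constrained(vec, 3)
```

Because every real vector of the right length decodes to valid mixture parameters, a network operating in the unconstrained space never needs to enforce the simplex or positivity constraints explicitly.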