arXivDaily: arXiv daily academic digest, updated Monday through Friday
2603.19545 2026-05-21 eess.SY cs.LG cs.SY math.OC

Verifiable Error Bounds for Physics-Informed Neural Network Solutions of Lyapunov and Hamilton-Jacobi-Bellman Equations

Jun Liu

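AI summary This paper studies verifiable error bounds for physics-informed neural network solutions of Lyapunov and Hamilton-Jacobi-Bellman equations. It proposes a way to compute such bounds and shows how a residual bound yields a relative error estimate with respect to the true solution as well as a posteriori estimates in terms of the approximate solution.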

Comments The paper will appear in the IEEE Control Systems Letters

Abstract

Many core problems in nonlinear systems analysis and control can be recast as solving partial differential equations (PDEs) such as Lyapunov and Hamilton-Jacobi-Bellman (HJB) equations. Physics-informed neural networks (PINNs) have emerged as a promising mesh-free approach for approximating their solutions, but in most existing works there is no rigorous guarantee that a small PDE residual implies a small solution error. This paper develops verifiable error bounds for approximate solutions of Lyapunov and HJB equations, with particular emphasis on PINN-based approximations. For both the Lyapunov and HJB PDEs, we show that a verifiable residual bound yields relative error bounds with respect to the true solutions as well as computable a posteriori estimates in terms of the approximate solutions. For the HJB equation, this also yields certified upper and lower bounds on the optimal value function on compact sublevel sets and quantifies the optimality gap of the induced feedback policy. We further show that one-sided residual bounds already imply that the approximation itself defines a valid Lyapunov or control Lyapunov function. We illustrate the results with numerical examples.
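
To make the link between residual and solution error concrete, here is a minimal numerical sketch on a toy problem that is not taken from the paper. It assumes the scalar system x' = -x with the Lyapunov equation V'(x) f(x) = -w(x), w(x) = x^2, whose exact solution is V(x) = x^2/2, and a hypothetical approximation V_hat(x) = 0.55 x^2. If the residual satisfies |V_hat'(x) f(x) + w(x)| <= eps * w(x) with eps < 1, integrating along trajectories gives V_hat/(1 + eps) <= V <= V_hat/(1 - eps). The paper's bounds may take a different form, and sampling on a grid as done here is only a sanity check, not a verified bound.

```python
# Toy check of an a posteriori bound for the Lyapunov equation V'(x) f(x) = -w(x).
# All functions below are illustrative assumptions, not the paper's examples.

def f(x):          # dynamics x' = -x
    return -x

def w(x):          # positive definite cost
    return x * x

def v_true(x):     # exact solution for this toy problem
    return 0.5 * x * x

def v_hat(x):      # hypothetical approximate solution (e.g. a trained PINN)
    return 0.55 * x * x

def dv_hat(x):     # derivative of the approximation
    return 1.1 * x

def relative_residual_bound(points):
    """Smallest eps with |dv_hat*f + w| <= eps * w on the sample points."""
    return max(abs(dv_hat(x) * f(x) + w(x)) / w(x) for x in points)

points = [i / 10 for i in range(-20, 21) if i != 0]
eps = relative_residual_bound(points)

def bounds_hold(x, eps, tol=1e-12):
    lower = v_hat(x) / (1 + eps)
    upper = v_hat(x) / (1 - eps)
    return lower - tol <= v_true(x) <= upper + tol

all_ok = all(bounds_hold(x, eps) for x in points)
```

In this toy case the relative residual is 0.1 everywhere, so the lower bound is tight and the upper bound overestimates V by about 22%.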

2603.10726 2026-05-21 cs.CR cs.DC cs.LG

PrefixWall: Mitigating Prefix Caching Side Channels in Shared LLM Systems

Panagiotis Georgios Pennas, Konstantinos Papaioannou, Marco Guarnieri, Thaleia Dimitra Doudali

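AI summary This paper presents PrefixWall, a system that monitors cache reuse and selectively isolates prefixes to mitigate the security risk posed by Automatic Prefix Caching (APC) side channels in multi-tenant LLM serving systems, while improving cache reuse and inference latency relative to defenses that isolate users.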

Abstract

Large Language Models (LLMs) rely on optimizations like Automatic Prefix Caching (APC) to accelerate inference. APC works by reusing previously computed states for the beginning part of a request (prefix), when another request starts with the same text. While APC improves throughput, it introduces timing side channels: cache hits are faster than misses, creating observable latency differences. In multi-tenant systems, attackers can exploit these differences to infer sensitive information, e.g., by incrementally reconstructing another user's request by observing hit/miss patterns. Current defenses take a sledgehammer approach: they disable APC and cache sharing, isolating users, and sacrificing efficiency for regular users. This paper presents PrefixWall, a system that secures multi-tenant LLM serving systems against APC side channels without sacrificing performance and efficiency. PrefixWall monitors cache reuse across users, flags suspicious sharing, and selectively isolates prefixes, restricting their reuse only when necessary. Evaluation shows that PrefixWall enables up to 70% higher cache reuse and 30% lower inference latency compared to existing defenses that isolate users. PrefixWall's lightweight design demonstrates how security in LLM serving does not have to come at the cost of unnecessarily reduced performance or unbearable overheads.
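
The hit/miss signal and the idea of isolating only selected prefixes can be illustrated with a toy cache. The class below and its per-prefix `isolated` flag are hypothetical and much simpler than a real serving stack; they are not PrefixWall's actual design, which also has to decide which sharing is suspicious.

```python
# Toy model of a shared prefix cache with selective per-prefix isolation.
# The policy below (a per-prefix "isolated" flag) is a hypothetical
# illustration, not PrefixWall's actual design.

class ToyPrefixCache:
    def __init__(self):
        self.owners = {}        # prefix -> set of users who computed it
        self.isolated = set()   # prefixes whose cross-user reuse is blocked

    def isolate(self, prefix):
        self.isolated.add(prefix)

    def lookup(self, user, prefix):
        """Return True on a cache hit (fast path), False on a miss."""
        owners = self.owners.setdefault(prefix, set())
        if prefix in self.isolated:
            hit = user in owners          # only own entries are reusable
        else:
            hit = bool(owners)            # anyone's entry is reusable
        owners.add(user)                  # the request populates the cache
        return hit

cache = ToyPrefixCache()
cache.lookup("alice", "my password is")            # miss, now cached
leak = cache.lookup("mallory", "my password is")   # hit: mallory learns it was seen

cache2 = ToyPrefixCache()
cache2.isolate("my password is")
cache2.lookup("alice", "my password is")
blocked = cache2.lookup("mallory", "my password is")   # miss: no cross-user signal
own_hit = cache2.lookup("alice", "my password is")     # alice still benefits
```

Unflagged prefixes stay shared across users, which is where the efficiency gain over full per-user isolation would come from.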

2602.16399 2026-05-21 eess.AS cs.LG cs.SD

Multi-Channel Replay Speech Detection using Acoustic Maps

Michael Neri, Tuomas Virtanen

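AI summary This paper proposes acoustic maps as a new spatial feature representation for replay speech detection from multi-channel recordings. A lightweight convolutional neural network achieves competitive performance on the ReMASC dataset, and the maps provide a compact, physically interpretable feature space across devices and acoustic environments.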

Comments Accepted in EUSIPCO 2026

Abstract

Replay attacks remain a critical vulnerability for automatic speaker verification systems, particularly in real-time voice assistant applications. In this work, we propose acoustic maps as a novel spatial feature representation for replay speech detection from multi-channel recordings. Derived from classical beamforming over discrete azimuth and elevation grids, acoustic maps encode directional energy distributions that reflect physical differences between human speech radiation and loudspeaker-based replay. A lightweight convolutional neural network is designed to operate on this representation, achieving competitive performance on the ReMASC dataset with approximately 6k trainable parameters. Experimental results show that acoustic maps provide a compact and physically interpretable feature space for replay attack detection across different devices and acoustic environments.
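
As a rough illustration of how beamforming over a direction grid produces a map of directional energy, the sketch below computes a narrowband delay-and-sum response over azimuth only. The array geometry (a four-microphone uniform linear array), the frequency, and the grid are assumptions made for the example; the paper works with azimuth and elevation grids on real devices.

```python
import cmath
import math

# Narrowband delay-and-sum "acoustic map" over an azimuth grid for a uniform
# linear array. Geometry, frequency, and grid are illustrative assumptions.
SPEED_OF_SOUND = 343.0   # m/s
NUM_MICS = 4
SPACING = 0.04           # m, below half a wavelength at FREQ
FREQ = 2000.0            # Hz

def steering_vector(azimuth_deg):
    """Phase of a far-field plane wave at each microphone."""
    theta = math.radians(azimuth_deg)
    return [
        cmath.exp(-2j * math.pi * FREQ * m * SPACING * math.cos(theta) / SPEED_OF_SOUND)
        for m in range(NUM_MICS)
    ]

def acoustic_map(snapshot, grid_deg):
    """Normalized beamformer output power for each look direction."""
    powers = []
    for az in grid_deg:
        a = steering_vector(az)
        y = sum(am.conjugate() * xm for am, xm in zip(a, snapshot)) / NUM_MICS
        powers.append(abs(y) ** 2)
    return powers

grid = list(range(0, 181, 5))
snapshot = steering_vector(60)            # noise-free source at 60 degrees
energy = acoustic_map(snapshot, grid)
peak_azimuth = grid[energy.index(max(energy))]
```

The resulting list of powers is a one-dimensional slice of the kind of map a small CNN could take as input.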

2602.12283 2026-05-21 eess.SY cs.HC cs.RO cs.SY

A Lightweight Cubature Kalman Filter for Attitude and Heading Reference Systems Using Simplified Prediction Equations

Shunsei Yamagishi, Lei Jing

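AI summary This paper proposes an improved Cubature Kalman Filter (CKF), named the "Kaisoku Cubature Kalman Filter (KCKF)", that lowers computational cost while maintaining estimation accuracy. Lightweight prediction equations are derived by simplifying the CKF equations while preserving equivalent mathematical relations. Experiments show that the KCKF needs fewer floating-point operations (FLOPs) than the CKF and cuts computation time by about 19% on a high-performance computer and about 15% on a low-cost single-board computer, with the same attitude estimation accuracy.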

Journal ref
IEEE Access, vol. 14, 2026, pp. 73686-73697

Abstract

Attitude and Heading Reference Systems (AHRSs) are broadly applied wherever reliable orientation and motion sensing is required. In this paper, we present an improved Cubature Kalman Filter (CKF) with lower computational cost while maintaining estimation accuracy, which is named "Kaisoku Cubature Kalman Filter (KCKF)". The computationally efficient equations of the KCKF are derived by simplifying those of the CKF, while preserving equivalent mathematical relations. The lightweight prediction equations in the KCKF are derived by expanding the summation terms in the CKF and simplifying the result. This paper shows that the KCKF requires fewer floating-point operations (FLOPs) than the CKF. The controlled experimental results show that the KCKF reduces the computation time by approximately 19% compared to the CKF on a high-performance computer, whereas the KCKF reduces the computation time by approximately 15% compared to the CKF on a low-cost single-board computer. In addition, the KCKF maintains the attitude estimation accuracy of the CKF.
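
For readers unfamiliar with the CKF, the sketch below generates the 2n cubature points of the standard third-degree spherical-radial rule and checks that their equally weighted mean and covariance reproduce the input. The symmetric plus and minus terms cancel in the sums, which is the kind of structure that expanding summation terms can exploit. The state and covariance values are made up, and the KCKF equations themselves are not reproduced here.

```python
import math

# Cubature points of the standard third-degree spherical-radial rule used by
# the CKF: x_hat +/- sqrt(n) * S * e_i with P = S S^T. Values are illustrative.

def cholesky(P):
    """Lower-triangular S with S S^T = P for a symmetric positive definite P."""
    n = len(P)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(S[i][k] * S[j][k] for k in range(j))
            if i == j:
                S[i][j] = math.sqrt(P[i][i] - s)
            else:
                S[i][j] = (P[i][j] - s) / S[j][j]
    return S

def cubature_points(x_hat, P):
    n = len(x_hat)
    S = cholesky(P)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for i in range(n):
            points.append([x_hat[r] + sign * scale * S[r][i] for r in range(n)])
    return points   # 2n points, each with weight 1 / (2n)

def mean_and_cov(points):
    m, n = len(points), len(points[0])
    mean = [sum(p[r] for p in points) / m for r in range(n)]
    cov = [[sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in points) / m
            for c in range(n)] for r in range(n)]
    return mean, cov

x_hat = [1.0, -2.0]
P = [[4.0, 1.0], [1.0, 3.0]]
mean, cov = mean_and_cov(cubature_points(x_hat, P))
```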

2602.10989 2026-05-21 math.ST cs.IT cs.LG math.IT math.PR stat.ML stat.TH

Variational Optimality of Föllmer Processes in Generative Diffusions

Yifan Chen, Eric Vanden-Eijnden

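AI summary This paper constructs and analyzes generative diffusions within the stochastic interpolant framework, estimates the drift as a conditional expectation, proves that the variationally optimal choice of diffusion coefficient yields a Föllmer process that minimizes relative entropy on path space, and provides a data-driven, simulation-free estimation method.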

Abstract

We construct and analyze generative diffusions that transport a point mass to a prescribed target distribution over a finite time horizon using the stochastic interpolant framework. The drift is expressed as a conditional expectation that can be estimated from independent samples without simulating stochastic processes. We show that the diffusion coefficient can be tuned a posteriori without changing the time-marginal distributions. Among all such tunings, we prove that minimizing the impact of estimation error on the path-space Kullback-Leibler divergence selects, in closed form, a Föllmer process, that is, a diffusion whose path measure minimizes relative entropy with respect to a reference process determined by the interpolation schedules alone. This yields a new variational characterization of Föllmer processes, complementing classical formulations via Schrödinger bridges and stochastic control, and provides a conditional-expectation representation of the Föllmer drift that enables simulation-free estimation from data. We further establish that, under this optimal diffusion coefficient, the path-space Kullback-Leibler divergence becomes independent of the interpolation schedule, rendering different schedules statistically equivalent in this variational sense. We provide numerical experiments to illustrate the impact of path-space variational optimality of Föllmer processes in probabilistic forecasting and data assimilation applications.
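
One standard way to see why a diffusion coefficient can be changed without altering the time marginals is the Fokker-Planck identity below. This is a generic sketch: the symbols b_t, \sigma_t, \epsilon_t, and \rho_t are notation assumed here rather than taken from the paper, and the paper's own construction via stochastic interpolants may be phrased differently.

```latex
% If X_t solves dX_t = b_t(X_t) dt + \sigma_t dW_t with density \rho_t, then
\partial_t \rho_t = -\nabla\cdot(b_t \rho_t) + \tfrac{1}{2}\sigma_t^2 \Delta \rho_t .
% For any other coefficient \epsilon_t \ge 0, define the adjusted drift
\tilde b_t = b_t + \tfrac{1}{2}\left(\epsilon_t^2 - \sigma_t^2\right)\nabla \log \rho_t .
% Since \nabla\cdot(\rho_t \nabla \log \rho_t) = \Delta \rho_t,
-\nabla\cdot(\tilde b_t \rho_t) + \tfrac{1}{2}\epsilon_t^2 \Delta \rho_t
  = -\nabla\cdot(b_t \rho_t) + \tfrac{1}{2}\sigma_t^2 \Delta \rho_t ,
% so d\tilde X_t = \tilde b_t dt + \epsilon_t dW_t has the same marginals \rho_t.
```

The marginals agree for every choice of \epsilon_t, but the path measures do not, which is why a path-space criterion can single out one coefficient.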

2602.10455 2026-05-21 cs.IR cs.LG

Compute Only Once: UG-Separation for Efficient Large Recommendation Models

Hui Lu, Zheng Chai, Shipeng Bai, Hao Zhang, Zhifang Fan, Kunmin Bai, Ke Sun, Yingwen Wu, Bingzheng Wei, Xiang Sun, Ziyan Gong, Tianyi Liu, Hua Chen, Deping Xie, Zhongkai Chen, Zhiliang Guo, Qiwei Chen, Yuchao Zheng

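AI summary This paper proposes UG-Sep, which explicitly separates user-side and item-side information flows in TokenMixer-based dense interaction models so that user-side computation can be reused, cutting redundant inference cost. An Information Compensation strategy and weight-only quantization further improve efficiency.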

Comments Large Recommender Model, Industrial Recommenders, Scaling Law

Abstract

Driven by scaling laws, recommender systems increasingly rely on larger-scale models to capture complex feature interactions and user behaviors, but this trend also leads to prohibitive training and inference costs. While long-sequence models can reuse user-side computation through KV caching, such reuse is difficult in TokenMixer-based dense feature interaction architectures, where user and group features are deeply entangled and mixed across layers. In this work, we present User-Group Separation (UG-Sep), an industrial large-scale framework that, for the first time, makes user-side computation reusable in TokenMixer-based dense interaction models. UG-Sep explicitly disentangles user-side and item-side information flows within token-mixing layers, ensuring that a subset of tokens preserves purely user-side representations across layers. This design allows the corresponding per-token computations to be reused across multiple samples, significantly reducing redundant inference cost. To compensate for the potential loss of expressive capacity induced by masking, we further propose an Information Compensation strategy that adaptively reconstructs suppressed user-item interactions. Moreover, as UG-Sep substantially reduces user-side FLOPs and exposes memory-bound components, we incorporate W8A16 (8-bit weight, 16-bit activation) weight-only quantization to alleviate memory bandwidth bottlenecks and achieve additional acceleration. We conduct extensive offline evaluations and large-scale online A/B experiments at ByteDance to validate the effectiveness of UG-Sep. Results show that, compared to TokenMixer at ByteDance, UG-Sep reduces inference latency by up to 20% without causing adverse changes to online user experience or commercial metrics in multiple influential business scenarios, including Douyin Feed Recommendation, Hongguo Feed Recommendation, Chuanshanjia Ads, and Qianchuan Ads.
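
The W8A16 idea mentioned above can be sketched in a few lines: weights are stored as int8 codes with one scale per row, while activations stay in floating point. The code below is a simplified illustration with made-up weights, using Python floats in place of 16-bit activations; it is not the kernel or the quantization scheme used in the paper.

```python
# Symmetric per-row int8 weight-only quantization (the "W8" in W8A16).
# Activations stay in floating point; Python floats stand in for 16-bit ones.
# A simplified sketch, not the kernel used in the paper.

def quantize_row(row):
    """Map a row of float weights to int8 codes plus one float scale."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in row]
    return codes, scale

def dequantize_row(codes, scale):
    return [c * scale for c in codes]

def matvec_w8(quantized_rows, activations):
    """y = W x, dequantizing each int8 row on the fly."""
    out = []
    for codes, scale in quantized_rows:
        acc = sum(c * x for c, x in zip(codes, activations))
        out.append(acc * scale)
    return out

weights = [[0.50, -1.27, 0.03], [0.10, 0.20, -0.40]]
quantized = [quantize_row(r) for r in weights]
x = [1.0, 2.0, 3.0]
y_quant = matvec_w8(quantized, x)
y_exact = [sum(w * xi for w, xi in zip(r, x)) for r in weights]
max_weight_error = max(
    abs(w - d)
    for r, (codes, scale) in zip(weights, quantized)
    for w, d in zip(r, dequantize_row(codes, scale))
)
```

Each weight is off by at most half a quantization step, and the stored weights shrink to one byte each, which is what relieves a memory-bandwidth bottleneck.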

2602.00933 2026-05-21 cs.SE cs.AI

MCP-Atlas: A Large-Scale Benchmark for Tool-Use Competency with Real MCP Servers

Chaithanya Bandi, Razvan-Gabriel Dumitru, Ben Hertzberg, Divyansh Agarwal, Geobio Boo, Tejas Polakam, Sami Hassaan, Jeff Da, HiJae Kim, Vipul Gupta, Manasi Sharma, Andrew Park, Martin Dimakis, Ernesto Gabriel Hernandez Montoya, Dan Rambado, Ivan Salazar, Rafael Cruz, MohammadHossein Rezaei, Chetan Rane, Ben Levin, Daniel Yue Zhang, Brad Kenstler, Bing Liu

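AI summary This paper presents MCP-Atlas, a benchmark for evaluating tool-use competency against real MCP servers. It contains 1,000 tasks written and verified by human experts, covering 36 real MCP servers and 220 tools, and its task-level evaluation separates tool-call failures from cognitive failures.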

Comments 25 pages, 3 figures, 9 tables

Abstract

The Model Context Protocol (MCP) is emerging as a standard interface through which large language model (LLM) agents discover and invoke external tools. However, existing MCP evaluations fall short along three key axes: realistic multi-step workflows with cross-server orchestration, breadth across authentic MCP servers rather than mocks, and structured, reproducible claim-level scoring disentangled from agent verbosity or style. We introduce MCP-Atlas, a benchmark for measuring tool-use competency against production MCP servers. MCP-Atlas contains 1,000 natural-language tasks written and verified by human experts spanning 36 real MCP servers and 220 tools. Prompts do not specify servers, tools, or parameters, requiring agents to identify relevant tools among semantically plausible distractors and to compose multi-step, cross-server workflows. Each task is scored with a claim-level rubric, where final answers are scored against atomic factual claims grounded in tool outputs. This answer-centric scoring permits valid alternative tool-call trajectories to receive credit. We pair this with an 11-category diagnostic taxonomy that disentangles tool-call failures from cognitive failures in task understanding, synthesis, parsing, and stopping. Evaluating 20 frontier models from six providers under matched task-level conditions, we find pass rates up to 82.2% at a 0.75 claim coverage threshold and a clear three-tier performance structure. Automated diagnostics show that 63.3% of diagnosed failures are cognitive rather than tool-call related. Notably, several high-performing models fail after successful tool execution due to premature stopping or incorrect synthesis. We release the task schema, containerized harness, claim evaluator, and a 500-task public split, while reserving a 500-task private split to preserve leaderboard integrity. The code is at https://github.com/scaleapi/mcp-atlas.
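
The pass criterion described above, a claim coverage threshold of 0.75, can be sketched as follows. The data layout and the all-or-nothing judgment per claim are assumptions made for illustration; the released claim evaluator in the repository may score claims differently.

```python
# Claim-level scoring sketch: a task passes when the fraction of rubric
# claims satisfied by the final answer reaches a coverage threshold.
# The data layout and all-or-nothing claim judgments are assumptions here.

PASS_THRESHOLD = 0.75   # threshold reported in the abstract

def claim_coverage(judgments):
    """judgments: list of booleans, one per atomic claim in the rubric."""
    if not judgments:
        raise ValueError("a task needs at least one claim")
    return sum(judgments) / len(judgments)

def task_passes(judgments, threshold=PASS_THRESHOLD):
    return claim_coverage(judgments) >= threshold

def pass_rate(tasks):
    """tasks: list of per-task judgment lists."""
    return sum(task_passes(j) for j in tasks) / len(tasks)

tasks = [
    [True, True, True, False],    # coverage 0.75 -> pass
    [True, False],                # coverage 0.50 -> fail
    [True, True, True],           # coverage 1.00 -> pass
]
rate = pass_rate(tasks)
```

Because only the final answer is scored against claims, two agents that reach the same facts through different tool calls receive the same score.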

2601.22292 2026-05-21 cs.MA cs.LG

Learning Incentive Structures for Cooperative Resilience in Multi-Agent Systems under Social Dilemmas

Manuela Chacon-Chamorro, Luis Felipe Giraldo, Nicanor Quijano

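AI summary This paper studies how multi-agent reinforcement learning systems can learn incentive structures that promote collective well-being in social dilemmas. It uses a resilience metric to score and rank agent trajectories and evaluates three incentive structures in resource-sharing environments.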

Comments Supplementary material in https://github.com/mavivi95/supplementary_files/blob/main/Learning_TCSS___Supplementary_File__AN_.pdf Updated version submitted to IEEE Transactions on Computational Social Systems (TCSS). This preprint is under review for possible publication in IEEE

Abstract

Multi-agent social dilemmas, such as the tragedy of the commons, capture settings where individual incentives conflict with collective well-being, making these systems highly vulnerable to collapse under disruptions. In this context, this work studies cooperative resilience, understood as the system-level ability to maintain collective well-being under perturbations through adaptive agent behavior. We propose a framework for learning incentive structures aligned with collective well-being in multi-agent reinforcement learning systems, where reward functions shape individual decision-making and collective behavior. A resilience metric is used to score and rank agent trajectories, allowing the inference of reward functions that promote resilient collective behavior. These inferred reward functions are integrated into the multi-agent reinforcement learning process to shape agent interactions in social dilemma settings. The approach is evaluated in resource-sharing environments subject to disruptions, using three incentive structures: individual incentives, resilience-aligned incentives, and a hybrid incentive structure that combines both individual and collective components. The results show that the hybrid incentive structure promotes sustained collective behavior, reduces collapse events associated with resource depletion, and preserves system performance under disruption. These findings highlight the role of incentive design as a mechanism for promoting resilient collective behavior and provide a computational framework for multi-agent social dilemmas under disruptions.

2601.03019 2026-05-21 q-bio.GN cs.CL

DNACHUNKER: Learnable Tokenization for DNA Language Models

Taewon Kim, Jihwan Shin, Hyomin Kim, Youngmok Jung, Jonghoon Lee, Won-Chul Lee, Sungsoo Ahn, Insu Han

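AI summary This paper proposes DNAChunker, a learnable tokenization approach for DNA in which a dynamic segmentation module produces context-dependent, variable-length units, improving the robustness and efficiency of DNA language models on genomic sequence.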

Comments ICML 2026 camera-ready version

Abstract

DNA language models are increasingly used to represent genomic sequence, yet their effectiveness depends critically on how raw nucleotides are converted into model inputs. Unlike natural language, DNA offers no canonical boundaries, making fixed tokenizations a brittle design choice under shifts, indels, and local repeats. We introduce DNAChunker, a masked DNA language model that incorporates a learnable adaptive segmentation module to produce context-dependent, variable-length units. Building on a dynamic segmentation procedure, DNAChunker learns to allocate finer granularity to functionally enriched regions while compressing repetitive or redundant sequence. We pretrain DNAChunker on the human reference genome and evaluate it across five benchmarks, where it consistently improves over strong fixed-tokenization baselines. Further analyses and ablations indicate that unlike fixed tokenizations, segmentation is learned in a biologically-informed, mutation-resilient manner.
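
The brittleness of fixed tokenization under shifts is easy to reproduce. The sketch below uses non-overlapping 3-mers on a made-up sequence and shows that a single inserted base changes every token; it illustrates the problem only and does not implement DNAChunker's learned segmentation.

```python
# Fixed non-overlapping k-mer tokenization and its sensitivity to a
# single-base insertion. The sequence is a made-up example.

def kmer_tokens(sequence, k=3):
    return [sequence[i:i + k] for i in range(0, len(sequence), k)]

original = "ATGCGTACCTGA"
shifted = "G" + original          # one inserted base at the 5' end

tokens_original = kmer_tokens(original)
tokens_shifted = kmer_tokens(shifted)
shared = set(tokens_original) & set(tokens_shifted)
```

Here `tokens_original` is `['ATG', 'CGT', 'ACC', 'TGA']`, `tokens_shifted` is `['GAT', 'GCG', 'TAC', 'CTG', 'A']`, and the two share no token even though the sequences differ by one base.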

2601.00418 2026-05-21 cs.CR cs.DC cs.LG

Secure, Verifiable, and Scalable Multi-Client Data Sharing via Consensus-Based Privacy-Preserving Data Distribution

Prajwal Panth, Sahaj Raj Malla

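AI summary This paper proposes the Consensus-Based Privacy-Preserving Data Distribution (CPPDD) framework, a lightweight, post-setup autonomous protocol for secure multi-client data aggregation. It enforces unanimous-release confidentiality through a dual-layer mechanism that combines per-client affine masking with priority-driven sequential consensus locking, and it uses step (sigma_S) and data (sigma_D) checksums for decentralized integrity, enabling autonomous detection of malicious deviations and atomic abort without persistent coordination. The design supports scalar, vector, and matrix payloads with O(N*D) computation and communication complexity and resists collusion under N-1 corruptions. Formal analysis proves correctness, Consensus-Dependent Integrity and Fairness (CDIF), and IND-CPA security assuming a pseudorandom function family. Evaluations on MNIST-derived vectors show linear scalability up to N = 500, 100% detection of malicious deviations, exact data recovery, and three to four orders of magnitude fewer FLOPs than MPC and HE baselines.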

Comments 25 pages, 6 figures, preprint

Abstract

We propose the Consensus-Based Privacy-Preserving Data Distribution (CPPDD) framework, a lightweight and post-setup autonomous protocol for secure multi-client data aggregation. The framework enforces unanimous-release confidentiality through a dual-layer protection mechanism that combines per-client affine masking with priority-driven sequential consensus locking. Decentralized integrity is verified via step (sigma_S) and data (sigma_D) checksums, facilitating autonomous malicious deviation detection and atomic abort without requiring persistent coordination. The design supports scalar, vector, and matrix payloads with O(N*D) computation and communication complexity, optional edge-server offloading, and resistance to collusion under N-1 corruptions. Formal analysis proves correctness, Consensus-Dependent Integrity and Fairness (CDIF) with overwhelming-probability abort on deviation, and IND-CPA security assuming a pseudorandom function family. Empirical evaluations on MNIST-derived vectors demonstrate linear scalability up to N = 500 with sub-millisecond per-client computation times. The framework achieves 100% malicious deviation detection, exact data recovery, and three-to-four orders of magnitude lower FLOPs compared to MPC and HE baselines. CPPDD enables atomic collaboration in secure voting, consortium federated learning, blockchain escrows, and geo-information capacity building, addressing critical gaps in scalability, trust minimization, and verifiable multi-party computation for regulated and resource-constrained environments.
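
Affine masking in general can be illustrated over a prime field: a value x is hidden as a * x + b mod p and recovered with the modular inverse of a. The sketch below is a generic example with arbitrary constants. How CPPDD derives its per-client coefficients, and how masking interacts with the consensus locking and checksums, is not described in the abstract and is not reproduced here.

```python
# Affine masking over a prime field: masked = a * x + b (mod p).
# A generic illustration; the key derivation and protocol flow of CPPDD are
# not reproduced here, and the constants are arbitrary.

P = 2_147_483_647          # the Mersenne prime 2**31 - 1

def mask(values, a, b, p=P):
    if a % p == 0:
        raise ValueError("a must be invertible modulo p")
    return [(a * v + b) % p for v in values]

def unmask(masked, a, b, p=P):
    a_inv = pow(a, -1, p)  # modular inverse, Python 3.8+
    return [((m - b) * a_inv) % p for m in masked]

payload = [3, 141, 59, 26]
a, b = 1_234_567, 7_654_321
masked = mask(payload, a, b)
recovered = unmask(masked, a, b)
```

Both operations cost one multiplication and one addition per element, which is consistent with a cost that grows linearly in the payload size.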

2512.08013 2026-05-21 eess.SY cs.LG cs.SY math.OC

Learning Dynamics from Infrequent Output Measurements for Uncertainty-Aware Optimal Control

Robert Lefringhausen, Theodor Springer, Sandra Hirche

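AI summary This work places a Bayesian prior over the continuous-time dynamics and latent state trajectory, updates it with a targeted Metropolis-Hastings sampler equipped with a numerical ODE integrator, and solves the resulting optimal control problem under uncertainty with a scenario-based approach. The method is validated on glucose regulation with a Type 1 diabetes model.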

Comments Accepted for publication in the Proceedings of the 2026 IFAC World Congress

Abstract

Reliable optimal control is challenging when the dynamics of a nonlinear system are unknown and only infrequent, noisy output measurements are available. This work addresses this setting of limited sensing by formulating a Bayesian prior over the continuous-time dynamics and latent state trajectory in state-space form and updating it through a targeted Metropolis-Hastings sampler equipped with a numerical ODE integrator. The resulting posterior samples are used to formulate a scenario-based optimal control problem that accounts for the uncertainty in the dynamics and latent state and is solved using standard nonlinear programming methods. The approach is validated in a numerical case study on glucose regulation using a Type 1 diabetes model.
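
The combination of a sampler with an ODE integrator can be sketched on a one-parameter toy model. The code below uses plain random-walk Metropolis-Hastings with a forward Euler integrator on dx/dt = -theta * x and three noise-free measurements. The model, noise level, prior, and step sizes are assumptions for illustration, and the paper's targeted sampler over dynamics and latent trajectories is considerably more elaborate.

```python
import math
import random

# Random-walk Metropolis-Hastings over one ODE parameter, with an Euler
# integrator inside the likelihood. Model, noise level, and step sizes are
# illustrative; the paper's targeted sampler is more elaborate.

def simulate(theta, times, x0=1.0, dt=0.01):
    """Euler integration of dx/dt = -theta * x, sampled at `times`."""
    outputs, x, t = [], x0, 0.0
    for target in times:
        while t < target - 1e-9:
            x += dt * (-theta * x)
            t += dt
        outputs.append(x)
    return outputs

def log_likelihood(theta, times, data, sigma=0.05):
    if theta <= 0:
        return -math.inf              # flat prior restricted to theta > 0
    pred = simulate(theta, times)
    return -sum((y - p) ** 2 for y, p in zip(data, pred)) / (2 * sigma ** 2)

def metropolis_hastings(times, data, n_samples=2000, step=0.1, theta0=1.0, seed=0):
    rng = random.Random(seed)
    theta, logp = theta0, log_likelihood(theta0, times, data)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_likelihood(proposal, times, data)
        if math.log(rng.random() + 1e-300) < logp_new - logp:
            theta, logp = proposal, logp_new
        samples.append(theta)
    return samples

times = [1.0, 2.0, 3.0]               # infrequent measurements
data = simulate(0.5, times)           # noise-free data from theta = 0.5
samples = metropolis_hastings(times, data)
posterior_mean = sum(samples[500:]) / len(samples[500:])
```

Each retained sample is one plausible model; a scenario-based controller would then optimize a single input sequence against a set of such samples.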

2512.07420 2026-05-21 hep-ph cs.LG hep-ex

E-PCN: Jet Tagging with Explainable Particle Chebyshev Networks Using Kinematic Features

Md Raqibul Islam, Adrita Khan, Mir Sazzat Hossain, Choudhury Ben Yamin Siddiqui, Md. Zakir Hossan, Tanjib Khan, M. Arshad Momen, Amin Ahsan Ali, AKM Mahbubur Rahman

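AI summary This paper proposes E-PCN, an explainable Particle Chebyshev Network for jet tagging that incorporates kinematic features by building four graph representations per jet, improving both the interpretability and the accuracy of classification.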

Comments 25 pages, 3 figures

Abstract

The identification and classification of collimated particle sprays, or jets, are essential for interpreting data from high-energy collider experiments. While deep learning has improved jet classification, it often lacks interpretability. We introduce the Explainable Particle Chebyshev Network (E-PCN), a graph neural network extending the Particle Chebyshev Network (PCN). E-PCN integrates kinematic variables into jet classification by constructing four graph representations per jet, each weighted by a distinct variable: angular separation ($Δ$), transverse momentum ($k_T$), momentum fraction ($z$), and invariant mass squared ($m^2$). We use the concept of Gradient-weighted Class Activation Mapping (Grad-CAM) to determine which kinematic variables dominate classification outcomes. Analysis reveals that angular separation and transverse momentum collectively account for approximately 76% of classification decisions (40.72% and 35.67%, respectively), with momentum fraction and invariant mass contributing the remaining 24%. Evaluated on the JetClass dataset with 10 signal classes, E-PCN achieves a macro-accuracy of 94.67%, macro-AUC of 96.78%, and macro-AUPR of 86.79%, representing improvements of 2.36%, 4.13%, and 24.88% respectively over the baseline PCN implementation, while demonstrating physically interpretable feature learning.
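
The four pairwise variables named above can be computed from particle four-momenta. The sketch below follows a convention that is common in jet physics (angular distance in the rapidity-azimuth plane, k_T = min(p_T) * Delta, z = min(p_T) / sum(p_T), and the invariant mass squared of the pair). Whether E-PCN uses exactly these definitions as edge weights is an assumption.

```python
import math

# Pairwise kinematic features for two particles given as (E, px, py, pz).
# Definitions follow a common jet-physics convention; E-PCN's exact
# definitions may differ.

def pt(p):
    return math.hypot(p[1], p[2])

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def phi(p):
    return math.atan2(p[2], p[1])

def pair_features(a, b):
    dphi = (phi(a) - phi(b) + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
    delta = math.hypot(rapidity(a) - rapidity(b), dphi)
    pt_min = min(pt(a), pt(b))
    k_t = pt_min * delta
    z = pt_min / (pt(a) + pt(b))
    e, px, py, pz = (a[i] + b[i] for i in range(4))
    m2 = e * e - (px * px + py * py + pz * pz)
    return {"delta": delta, "k_t": k_t, "z": z, "m2": m2}

# Two massless particles, back to back in the transverse plane.
features = pair_features((1.0, 1.0, 0.0, 0.0), (1.0, -1.0, 0.0, 0.0))
```

For this pair the angular distance is pi, z is 0.5, and the invariant mass squared is 4 in the units of the inputs.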

2511.21223 2026-05-21 stat.ML cs.LG

Maxitive Donsker-Varadhan Formulation for Possibilistic Variational Inference

Jasraj Singh, Shelvia Wongso, Jeremie Houssineau, Badr-Eddine Chérief-Abdellatif

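AI summary This paper proposes a variational inference method grounded in possibility theory. By establishing a maxitive Donsker-Varadhan formulation, it removes the reliance of conventional variational inference on additivity, and it introduces the CBOpt family of optimizers, which achieve competitive performance on image classification tasks.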

Comments 37 pages, 3 figures, 13 tables

Abstract

Variational inference (VI) is a cornerstone of modern Bayesian learning, enabling approximate inference in complex models. However, its formulation depends on expectations and divergences defined through high-dimensional integrals, often rendering analytical treatment impossible and necessitating heavy reliance on approximations. Possibility theory, an imprecise probability framework, allows us to directly model epistemic uncertainty instead of relying on a subjective interpretation of probabilities. While this framework provides robustness and interpretability under sparse or imprecise information, adapting VI to the possibilistic setting requires rethinking core concepts such as divergences, which presuppose additivity. In this work, we develop a principled formulation for performing possibilistic VI by establishing a maxitive analogue of the classical Donsker-Varadhan formulation. The resulting framework enables us to derive a learning rule for possibilistic VI with exponential-family candidates and practical update rules for neural-network training, giving rise to a family of optimizers termed CBOpt. Finally, we demonstrate that CBOpt achieves competitive performance on both in-domain and out-of-domain image classification tasks.
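
As background, the classical (additive) Donsker-Varadhan identity that the paper builds an analogue of can be checked numerically on a finite set. The distribution P and function f below are arbitrary, and the maxitive version developed in the paper is not shown.

```python
import math

# Numerical check of the classical Donsker-Varadhan identity on a finite set:
#   log E_P[exp(f)] = max over Q of ( E_Q[f] - KL(Q || P) ),
# with the maximum attained at Q*(i) proportional to P(i) * exp(f(i)).
# Only the classical (additive) formula is shown; P and f are arbitrary.

def kl(q, p):
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def dv_objective(q, p, f):
    return sum(qi * fi for qi, fi in zip(q, f)) - kl(q, p)

p = [0.2, 0.5, 0.3]
f = [1.0, -0.5, 2.0]

log_mgf = math.log(sum(pi * math.exp(fi) for pi, fi in zip(p, f)))

weights = [pi * math.exp(fi) for pi, fi in zip(p, f)]
q_star = [w / sum(weights) for w in weights]

value_at_optimum = dv_objective(q_star, p, f)
value_at_prior = dv_objective(p, p, f)      # any other Q gives a lower value
```

Both sides rely on sums over the space, which is the additivity that has to be replaced by maxima in a possibilistic setting.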

2510.09724 2026-05-21 cs.SE cs.AI

InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation

Qiaosheng Chen, Yang Liu, Lei Li, Kai Chen, Qipeng Guo, Gong Cheng, Fei Yuan

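AI summary This paper presents the InteractScience benchmark for evaluating how well large language models combine scientific knowledge with interactive front-end coding when generating interactive scientific demonstration code. Using programmatic tests together with visually grounded tests, it evaluates 30 open- and closed-source LLMs and reveals persistent weaknesses in integrating domain knowledge with interactive front-end coding.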

Comments 27 pages, 17 figures

Abstract

Large Language Models (LLMs) are increasingly capable of generating complete applications from natural language instructions, creating new opportunities in science and education. In these domains, interactive scientific demonstrations are particularly valuable for explaining concepts, supporting new teaching methods, and presenting research findings. Generating such demonstrations requires models to combine accurate scientific knowledge with the ability to implement interactive front-end code that behaves correctly and responds to user actions. This capability goes beyond the scope of existing benchmarks, which typically evaluate either knowledge question answering without grounding in code or static web code generation without scientific interactivity. To evaluate this integrated ability, we design a hybrid framework that combines programmatic functional testing to rigorously verify interaction logic with visually-grounded qualitative testing to assess rendered outputs against reference snapshots. Building on this framework, we present InteractScience, a benchmark consisting of a substantial set of carefully designed questions across five scientific domains, each paired with unit tests, reference snapshots, and checklists. We evaluate 30 leading open- and closed-source LLMs and report results that highlight ongoing weaknesses in integrating domain knowledge with interactive front-end coding. Our work positions InteractScience as the first benchmark to automatically measure this combined capability with realistic interactive operations, providing a foundation for advancing reliable and educationally useful scientific demonstration code generation. All code and data are publicly available at https://github.com/open-compass/InteractScience.

2510.00171 2026-05-21 quant-ph cs.LG

Quantum reservoir computing in Jaynes-Cummings models: Nonlinear memory and time-series prediction

Sreetama Das, Gian Luca Giorgi, Roberta Zambrini

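AI summary This paper studies quantum reservoir computing based on Jaynes-Cummings models, examining nonlinear memory and time-series prediction, and shows its value for processing complex dynamical signals.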

Comments 16 pages, 14 figures, published version

Journal ref
Phys. Rev. Research 8, 023148 (2026)

英文摘要

We investigate quantum reservoir computing (QRC) using a hybrid qubit-boson system described by the Jaynes-Cummings (JC) Hamiltonian and its dispersive limit (DJC). These models provide high-dimensional Hilbert spaces and intrinsic nonlinear dynamics, making them powerful substrates for temporal information processing. We systematically benchmark both reservoirs through linear and nonlinear memory tasks, demonstrating that, unusually, their nonlinear memory capacity exceeds their linear memory capacity. We further test their predictive performance on the Mackey-Glass time series, a widely used benchmark for chaotic dynamics, and show comparable forecasting ability. We also investigate how memory and prediction accuracy vary with reservoir parameters, and show the role of higher-order bosonic observables and time multiplexing in enhancing expressivity, even in minimal spin-boson configurations. Our results establish JC- and DJC-based reservoirs as versatile platforms for time-series processing and as elementary units that overcome the setting of equivalent qubit pairs and offer pathways toward tunable, high-performance quantum machine learning architectures.

2509.08010 2026-05-21 cs.CY cs.AI cs.CL cs.HC

Measuring and mitigating overreliance to build human-compatible AI

测量和缓解过度依赖以构建人类兼容的AI

Lujain Ibrahim, Katherine M. Collins, Sunnie S. Y. Kim, Anka Reuel, Max Lamparth, Kevin Feng, Lama Ahmad, Prajna Soni, Alia El Kattan, Merlin Stein, Siddharth Swaroop, Vishakh Padmakumar, Ilia Sucholutsky, Andrew Strait, Diyi Yang, Q. Vera Liao, Umang Bhatt

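AI summary This paper examines the risks of overreliance on large language models and discusses ways to measure and mitigate it, so that AI augments rather than undermines human capabilities.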

AI中文摘要

大型语言模型(LLMs)通过作为协作的『思想伙伴』而区别于先前的技术,能够在多种任务上更流畅地进行自然语言交互。随着LLMs在医疗、个人建议等不同领域中日益影响关键决策,过度依赖LLMs的风险也随之增加。本文认为,测量和缓解过度依赖必须成为LLMs研究和部署的核心。首先,我们汇总了个体和社会层面的过度依赖风险,包括高风险错误、治理挑战和认知退化。然后,我们探讨了LLMs的特点、系统设计特征和用户认知偏见,这些因素共同引发了关于实际中过度依赖LLMs的严重且独特的问题。我们还审查了历史上的过度依赖测量方法,识别出三个重要的差距,并提出三个有前景的方向来改进测量。最后,我们提出了可以采取的缓解策略,以确保LLMs增强而非削弱人类能力。

英文摘要

Large language models (LLMs) distinguish themselves from previous technologies by functioning as collaborative "thought partners," capable of engaging more fluidly in natural language on a range of tasks. As LLMs increasingly influence consequential decisions across diverse domains from healthcare to personal advice, the risk of overreliance (relying on LLMs beyond their capabilities) grows. This paper argues that measuring and mitigating overreliance must become central to LLM research and deployment. First, we consolidate risks from overreliance at both the individual and societal levels, including high-stakes errors, governance challenges, and cognitive deskilling. Then, we explore LLM characteristics, system design features, and user cognitive biases that together raise serious and unique concerns about overreliance on LLMs in practice. We also examine historical approaches for measuring overreliance, identifying three important gaps and proposing three promising directions to improve measurement. Finally, we propose mitigation strategies that can be pursued to ensure LLMs augment rather than undermine human capabilities.

2509.00303 2026-05-21 cs.DB cs.AI cs.IR

Access Paths for Efficient Ordering with Large Language Models

利用大型语言模型实现高效的排序访问路径

Fuheng Zhao, Jiayue Chen, Yiming Pan, Tahseen Rabbani, Sohaib, Divyakant Agrawal, Amr El Abbadi, Paritosh Aggarwal, Anupam Datta, Dimitris Tsirogiannis

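AI summary This paper proposes an LLM-based semantic ordering operator and systematically studies its physical implementations. After improving existing semantic sorting algorithms and introducing a semantic-aware external merge sort, the authors find that no single implementation is optimal on all datasets. They therefore design a budget-aware optimizer that uses heuristic rules, LLM-as-Judge evaluation, and consensus aggregation to dynamically select the best access path. Experiments show that the optimizer achieves ranking accuracy on par with or better than the best static method on all benchmarks.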

AI中文摘要

在本工作中,我们提出了LLM ORDER BY语义运算符作为一种逻辑抽象,并对其物理实现进行了系统研究。首先,我们对现有的语义排序算法提出了若干改进,并引入了一种语义感知的外部归并排序算法。我们的广泛评估表明,没有单一的实现能在所有数据集上提供普遍最优性。从我们的评估中,我们观察到基于比较的算法中排序成本与排序质量之间存在一种普遍的测试时扩展(test-time scaling)关系。基于这些见解,我们设计了一个预算感知的优化器,该优化器利用启发式规则、LLM-as-Judge评估和共识聚合来动态选择LLM ORDER BY的近最优访问路径。在我们的广泛评估中,我们的优化器在所有基准测试中均能实现与最佳静态方法相当或更优的排名准确性。我们相信,这项工作为语义运算符的原则性优化提供了基础性见解,而这种优化对于构建稳健、大规模的LLM驱动分析系统至关重要。

英文摘要

In this work, we present the LLM ORDER BY semantic operator as a logical abstraction and conduct a systematic study of its physical implementations. First, we propose several improvements to existing semantic sorting algorithms and introduce a semantic-aware external merge sort algorithm. Our extensive evaluation reveals that no single implementation offers universal optimality on all datasets. From our evaluations, we observe a general test-time scaling relationship between sorting cost and the ordering quality for comparison-based algorithms. Building on these insights, we design a budget-aware optimizer that utilizes heuristic rules, LLM-as-Judge evaluation, and consensus aggregation to dynamically select the near-optimal access path for LLM ORDER BY. In our extensive evaluations, our optimizer consistently achieves ranking accuracy on par with or superior to the best static methods across all benchmarks. We believe that this work provides foundational insights into the principled optimization of semantic operators essential for building robust, large-scale LLM-powered analytic systems.

2508.16474 2026-05-21 eess.SY cs.LG cs.SY math.OC

Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)

基于Y-wise仿射神经网络的强化学习控制

Austin Braniff, Yuhe Tian

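AI summary This paper proposes a new reinforcement learning algorithm based on Y-wise Affine Neural Networks (YANNs). Exploiting the interpretability of YANNs, it reformulates the explicit solution of multi-parametric linear model predictive control and uses it to initialize the RL actor and critic networks with the confidence of linear optimal control, with the aim of eventually learning the solution of general nonlinear optimal control problems.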

Journal ref
Computers & Chemical Engineering, Volume 209, 109610 (2026)
AI中文摘要

本文提出了一种基于Y-wise仿射神经网络(YANNs)的新型强化学习算法。YANNs提供了一种可解释的神经网络,能够精确表示任意输入和输出维度的分段仿射函数,定义在任意数量的多面体子域上。YANNs的一个典型应用是重新表述多参数线性模型预测控制的显式解。在此基础上,本文提出利用YANNs初始化RL的策略网络和评估网络,使由此产生的YANN-RL控制算法能够以线性最优控制的自信度开始。YANN-策略网络通过使用离线计算获得的多参数控制解,利用近似的线性系统模型进行初始化。YANN-评估网络表示线性系统中状态-动作价值函数的显式形式以及作为优化控制问题(OCP)目标函数的奖励函数。此外,通过注入额外的网络层来扩展YANNs以实现非线性表达,这些层可以在线通过直接与真实复杂的非线性系统交互进行训练。这样,策略和状态价值函数最初精确表示线性OCP,并能够最终学习一般非线性OCP的解。此外,还实现了连续策略改进,以提供启发式信心,即线性OCP的解作为RL策略性能的有效下界。YANN-RL算法在裁剪摆和安全关键的化学反应系统上进行了演示。实验结果表明,YANN-RL在考虑安全约束时显著优于使用深度确定性策略梯度的现代RL算法。

英文摘要

This work presents a novel reinforcement learning (RL) algorithm based on Y-wise Affine Neural Networks (YANNs). YANNs provide an interpretable neural network which can exactly represent known piecewise affine functions of arbitrary input and output dimensions defined on any number of polytopic subdomains. One representative application of YANNs is to reformulate explicit solutions of multi-parametric linear model predictive control. Built on this, we propose the use of YANNs to initialize RL actor and critic networks, which enables the resulting YANN-RL control algorithm to start with the confidence of linear optimal control. The YANN-actor is initialized by representing the multi-parametric control solutions obtained via offline computation using an approximated linear system model. The YANN-critic represents the explicit form of the state-action value function for the linear system and the reward function as the objective in an optimal control problem (OCP). Additional network layers are injected to extend YANNs for nonlinear expressions, which can be trained online by directly interacting with the true complex nonlinear system. In this way, both the policy and state-value functions exactly represent a linear OCP initially and are able to eventually learn the solution of a general nonlinear OCP. Continuous policy improvement is also implemented to provide heuristic confidence that the linear OCP solution serves as an effective lower bound on the performance of the RL policy. The YANN-RL algorithm is demonstrated on a clipped pendulum and a safety-critical chemical-reactive system. Our results show that YANN-RL significantly outperforms the modern RL algorithm using deep deterministic policy gradient, especially when considering safety constraints.

2508.16453 2026-05-21 cs.SI cs.CL cs.LG

Anti-establishment sentiment on TikTok: Implications for understanding influence(rs) and expertise on social media

TikTok上的反建制情绪:对理解社交媒体中影响力(影响者)和专业知识的启示

Tianliang Xu, Ariel Hasell, Sabina Tomkins

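AI summary This paper studies the prevalence of anti-establishment sentiment on TikTok, using a computational approach to analyze its distribution across finance, wellness, and conspiracy-theory content, and discusses how anti-establishment sentiment relates to user engagement and platform incentives.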

Comments 10 pages excluding references; 14 pages in total; 4 figures; Accepted by the AAAI Conference on Web and Social Media (ICWSM-2026)

AI中文摘要

对公共服务机构的不信任和反建制观点正在上升(尤其是在美国)。随着人们转向社交媒体获取信息,有必要了解社交媒体环境是否以及如何促进对机构的不信任。在社交媒体中,内容创作者、影响者和其他意见领袖往往将自己定位为在健康、政治等众多话题上具有专业知识和权威性,并在许多情况下贬低和否定机构专业知识以建立追随者并增加自身可见性。然而,这种内容的普及程度以及此类内容是否增加参与度仍不清楚。本研究分析了TikTok平台上反建制情绪(AES)的普遍性。尽管TikTok作为信息来源非常流行,但针对它的研究仍然相对较少,而它可能为人们如何形成对机构的态度提供重要见解。我们采用计算方法,对TikTok帖子进行标注,判断其是否包含AES,涵盖内容创作者通常将自己定位为专家的主题领域:金融和健康养生。作为比较,我们还考虑了阴谋论主题,其中AES预期较为常见。我们发现,AES在阴谋论内容中最为普遍,而在其他两个主题的内容中相对罕见。然而,我们发现此类内容的参与模式因领域而异,并且平台可能存在激励,促使用户发布表达反建制情绪的内容。

英文摘要

Distrust of public-serving institutions and anti-establishment views are on the rise (especially in the U.S.). As people turn to social media for information, it is imperative to understand whether and how social media environments may be contributing to distrust of institutions. In social media, content creators, influencers, and other opinion leaders often position themselves as having expertise and authority on a range of topics from health to politics, and in many cases devalue and dismiss institutional expertise to build a following and increase their own visibility. However, the extent to which this content appears and whether such content increases engagement is unclear. This study analyzes the prevalence of anti-establishment sentiment (AES) on the social media platform TikTok. Despite its popularity as a source of information, TikTok remains relatively understudied and may provide important insights into how people form attitudes towards institutions. We employ a computational approach to label TikTok posts as containing AES or not across topical domains where content creators tend to frame themselves as experts: finance and wellness. As a comparison, we also consider the topic of conspiracy theories, where AES is expected to be common. We find that AES is most prevalent in conspiracy theory content, and relatively rare in content related to the other two topics. However, we find that engagement patterns with such content vary by area, and that there may be platform incentives for users to post content that expresses anti-establishment sentiment.

2507.06929 2026-05-21 cond-mat.mtrl-sci cs.LG physics.comp-ph

Machine-Learned Force Fields for Lattice Dynamics at Coupled-Cluster Level Accuracy

基于耦合簇水平精度的机器学习力场用于晶格动力学

Sita Schönbauer, Johanna P. Carbone, Fredrik V. Eriksson, Florian Libisch, Andreas Grüneis

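AI summary This paper studies machine-learned force fields trained on approximate density functional theory and coupled-cluster-level potential energy surfaces. Phonon dispersions and vibrational densities of states are compared with experiment and reference ab initio results to assess accuracy and precision for carbon diamond and lithium hydride, and the paper explores a delta-learning approach based on the difference between coupled-cluster and density functional results as well as a charge-aware machine-learned force field.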

Comments 17 pages, 7 figures

AI中文摘要

我们研究了基于近似密度泛函理论(DFT)和耦合簇(CC)水平势能面训练的机器学习力场(MLFFs),研究对象为金刚石碳和氢化锂固体。我们通过计算声子色散关系和振动态密度(VDOS),并与实验和参考ab initio结果进行比较,评估了MLFFs的准确性和精度。为克服长程效应以及CC训练数据中缺乏原子力所带来的限制,我们探讨了基于CC和DFT结果差异的delta学习方法以及电荷感知的MLFF方法。与DFT相比,基于CC理论训练的MLFFs给出的光学模振动频率更高,与实验结果更为吻合。此外,MLFFs还用于在CC理论水平上估计非谐效应对氢化锂VDOS的影响。

英文摘要

We investigate Machine-Learned Force Fields (MLFFs) trained on approximate Density Functional Theory (DFT) and Coupled Cluster (CC) level potential energy surfaces for the carbon diamond and lithium hydride solids. We assess the accuracy and precision of the MLFFs by calculating phonon dispersions and vibrational densities of states (VDOS) that are compared to experiment and reference ab initio results. To overcome limitations from long-range effects and the lack of atomic forces in the CC training data, we explore a delta-learning approach based on the difference between CC and DFT results, as well as a charge-aware MLFF approach. Compared to DFT, MLFFs trained on CC theory yield higher vibrational frequencies for optical modes, agreeing better with experiment. Furthermore, the MLFFs are used to estimate anharmonic effects on the VDOS of lithium hydride at the level of CC theory.

2507.06344 2026-05-21 quant-ph cs.CC cs.LG

Gradient Scalability and Taylor Surrogation of Quantum Cost Landscapes

量子成本景观的梯度可扩展性与泰勒代理

Sabri Meyer, Francesco Scala, Francesco Tacchino, Aurelien Lucchi

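AI summary This paper studies the relationship between gradient scalability and computational complexity in variational quantum algorithms. It proposes the Taylor surrogate, a classical simulation technique, and introduces the Linear Clifford Encoder to ensure constant-scaling gradients; numerical experiments suggest that gradients may decay polynomially rather than exponentially in regions of super-polynomial complexity.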

Comments 12 pages, 6 figures, 54 pages of supplementary material

AI中文摘要

变分量子算法是近期量子计算的有希望候选者,但因贫瘠高原(barren plateaus)问题导致梯度相对于系统规模指数衰减,从而面临可扩展性挑战。最近的猜想认为,避免这些高原可能本质上导致经典可模拟性,从而限制量子优势的机会。在本文中,我们推进了对初始化时梯度可扩展性与变分量子算法计算复杂性之间关系的理论理解。我们首先提出了泰勒代理(Taylor surrogate),一种经典模拟技术,它在近Clifford区域上匹配泡利路径的运行时间保证,并在特定情形下提供运行时间优势。利用此代理,我们证明在之前已确立的经典可模拟区域之外,计算复杂性至少为超多项式。接着,我们引入了线性Clifford编码器(Linear Clifford Encoder),一种经典高效的拟设(ansatz)修改器,确保在接近Clifford电路的景观区域中梯度保持常数量级。最后,对这些修改后的景观进行的数值实验提供了初步的经验证据,表明存在一个过渡区,其中常数量级的梯度在超多项式复杂区域中可能呈多项式而非指数衰减。这些发现提示可能存在非消失梯度与超多项式复杂性共存的情形,这凸显了未来进行形式化证明的必要性。

英文摘要

Variational Quantum Algorithms are promising candidates for near-term quantum computing, yet they face scalability challenges due to barren plateaus, where gradients vanish exponentially relative to system size. Recent conjectures suggest that avoiding these plateaus might inherently lead to classical simulability, thereby limiting the opportunities for quantum advantage. In this work, we advance the theoretical understanding of the relationship between gradient scalability at initialization and the computational complexity of variational quantum algorithms. We first present the Taylor surrogate, a classical simulation technique that matches Pauli path runtime guarantees on near-Clifford regions while offering runtime advantages in specific regimes. Leveraging this surrogate, we prove that beyond previously established classically simulable regions, the computational complexity is at least super-polynomial. Next, we introduce the Linear Clifford Encoder, a classically efficient ansatz modifier that ensures constant-scaling gradients within landscape regions close to Clifford circuits. Finally, numerical experiments on these modified landscapes provide preliminary empirical evidence of a transition zone where constant-scaling gradients may decay polynomially in super-polynomially complex regions rather than exponentially. These findings suggest speculative instances where non-vanishing gradients and super-polynomial complexity could potentially coexist, vindicating the need for future formal proofs.

2507.01053 2026-05-21 cs.IR cs.AI cs.DB

M3: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis

M3: 对话式大语言模型简化安全的临床数据访问、理解与分析

Rafi Al Attrach, Pedro Moreira, Rajna Fani, Renato Umeton, Amelia Fiske, Leo Anthony Celi

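AI summary This paper presents M3, a system that enables natural language querying of the MIMIC-IV database through the Model Context Protocol, lowering the technical barrier to clinical data access and analysis, and reports on its security measures and performance.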

Comments 18 pages, 4 figures, 3 tables

AI中文摘要

大规模的临床数据库为医学研究提供了机会,但其复杂性却阻碍了有效利用。重症监护医学信息库(MIMIC-IV)是世界上最大的开源电子健康记录数据库之一,传统上需要SQL专业知识和临床领域专业知识。我们引入M3,一种通过模型上下文协议实现对MIMIC-IV数据的自然语言查询的系统。通过单条命令,M3可以从PhysioNet获取MIMIC-IV,启动本地SQLite实例或连接到托管的BigQuery,并允许研究者用普通英语提出临床问题。我们使用EHRSQL 2024基准的样本和两个语言模型对M3进行了评估。在一百个可回答的问题上,专有模型Claude Sonnet 4达到了94%的准确率,开放权重模型gpt-oss-20B(可在消费级硬件上本地部署)达到了93%;在匹配的一百个不可回答问题样本上(正确行为是拒答而不是生成SQL),gpt-oss-20B在69%的问题上正确拒答。两个模型都将自然语言转换为SQL,对MIMIC-IV执行查询,并返回结构化结果以及底层查询以供验证。错误分析表明,大多数失败源于复杂的时序推理或模糊的问题表述,而不是根本性的架构限制。较小的开放权重模型取得了相当的性能,表明隐私保护的本地部署对于敏感的临床数据分析是可行的。M3降低了重症监护数据分析的技术门槛,并在设计中包含了OAuth2认证、查询验证和审计日志等安全措施。

英文摘要

Large-scale clinical databases offer opportunities for medical research, but their complexity creates barriers to effective use. The Medical Information Mart for Intensive Care (MIMIC-IV), one of the world's largest open-source electronic health record databases, traditionally requires both SQL proficiency and clinical domain expertise. We introduce M3, a system that enables natural language querying of MIMIC-IV data through the Model Context Protocol. With a single command, M3 retrieves MIMIC-IV from PhysioNet, launches a local SQLite instance or connects to hosted BigQuery, and allows researchers to pose clinical questions in plain English. We evaluated M3 using samples from the EHRSQL 2024 benchmark with two language models. On one hundred answerable questions, the proprietary Claude Sonnet 4 achieved 94% accuracy and the open-weights gpt-oss-20B (deployable locally on consumer hardware) achieved 93%; on a matched sample of one hundred unanswerable questions, where correct behavior is to abstain rather than produce SQL, gpt-oss-20B correctly abstained on 69%. Both models translate natural language into SQL, execute queries against MIMIC-IV, and return structured results alongside the underlying query for verification. Error analysis revealed that most failures stemmed from complex temporal reasoning or ambiguous question phrasing rather than fundamental architectural limitations. The comparable performance of a smaller open-weights model demonstrates that privacy-preserving local deployment is viable for sensitive clinical data analysis. M3 lowers technical barriers to critical care data analysis and is designed with security measures including OAuth2 authentication, query validation, and audit logging.

2505.07054 2026-05-21 eess.SY cs.LG cs.SY math.OC

YANNs: Y-wise Affine Neural Networks for Exact and Efficient Representations of Piecewise Linear Functions

Austin Braniff, Yuhe Tian

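AI summary This paper proposes YANNs, Y-wise Affine Neural Networks that represent piecewise linear functions exactly and efficiently and achieve a functionally equivalent representation without training. Multi-parametric model predictive control serves as the application showcase, demonstrating efficient real-time computation together with control-theoretic guarantees.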

Journal ref
Computers & Chemical Engineering, Volume 208, 109589 (2026)
AI中文摘要

本文正式介绍了Y-wise仿射神经网络(YANNs),一种完全可解释的网络架构,能够连续且高效地表示具有多面体子域的分段仿射函数。根据证明,YANNs的开发无需训练即可实现功能等效表示。YANNs因此保留了原始公式的全部数学性质。多参数模型预测控制被用作YANNs的应用展示,理论上计算最优控制律作为状态、输出、设定点和扰动的分段仿射函数。通过精确表示多参数控制律,YANNs保留了如递归可行性与稳定性等关键控制理论保证。这使YANNs区别于现有工作,后者将神经网络用于近似最优控制律而非精确表示。通过优化网络推理速度,YANNs在实时计算中比传统分段仿射函数计算快得多。数值案例研究展示了算法在输入/输出维度和子域数量方面的可扩展性。YANNs在控制领域代表了重大进展,作为首个内在确保可行性和稳定性的神经网络控制器。未来应用可将其作为数据驱动建模/控制的高效且可解释的起点。

英文摘要

This work formally introduces Y-wise Affine Neural Networks (YANNs), a fully-explainable network architecture that continuously and efficiently represents piecewise affine functions with polytopic subdomains. Following from the proofs, it is shown that the development of YANNs requires no training to achieve the functionally equivalent representation. YANNs thus maintain all mathematical properties of the original formulations. Multi-parametric model predictive control is utilized as an application showcase of YANNs, which theoretically computes optimal control laws as a piecewise affine function of states, outputs, setpoints, and disturbances. With the exact representation of multi-parametric control laws, YANNs retain essential control-theoretic guarantees such as recursive feasibility and stability. This sets YANNs apart from the existing works which apply neural networks for approximating optimal control laws instead of exactly representing them. By optimizing the inference speed of the networks, YANNs can evaluate substantially faster in real-time compared to traditional piecewise affine function calculations. Numerical case studies are presented to demonstrate the algorithmic scalability with respect to the input/output dimensions and the number of subdomains. YANNs represent a significant advancement in control as the first neural network-based controller that inherently ensures both feasibility and stability. Future applications can leverage them as an efficient and interpretable starting point for data-driven modeling/control.
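
For context, one common way to evaluate a piecewise affine law with polytopic subdomains, the kind of calculation the abstract above uses as its point of comparison, is to search for the region {x : A_i x <= b_i} containing the query point and then apply that region's affine map F_i x + g_i. The sketch below illustrates only this conventional lookup and is not the YANN architecture. The function name `evaluate_pwa`, the tolerance, and the one-dimensional saturated-feedback example in `SATURATION_REGIONS` are hypothetical choices made for illustration.

```python
import numpy as np

def evaluate_pwa(x, regions, tol=1e-9):
    """Evaluate a piecewise affine function by sequential region search.

    Each region is a tuple (A, b, F, g): the polytope {x : A x <= b}
    carries the affine law F x + g.
    """
    x = np.asarray(x, dtype=float)
    for A, b, F, g in regions:
        if np.all(A @ x <= b + tol):      # x lies inside this polytope
            return F @ x + g
    raise ValueError("x lies outside every subdomain")

# Hypothetical example: u = -x on [-1, 1], saturated at -1 and +1 outside.
SATURATION_REGIONS = [
    (np.array([[1.0], [-1.0]]), np.array([1.0, 1.0]),
     np.array([[-1.0]]), np.array([0.0])),             # -1 <= x <= 1
    (np.array([[-1.0]]), np.array([-1.0]),
     np.array([[0.0]]), np.array([-1.0])),             # x >= 1
    (np.array([[1.0]]), np.array([-1.0]),
     np.array([[0.0]]), np.array([1.0])),              # x <= -1
]

print(evaluate_pwa([0.5], SATURATION_REGIONS))
```

The cost of this lookup grows with the number of regions, which is why the speed of evaluation matters when the number of subdomains is large.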

2504.13048 2026-05-21 cond-mat.mtrl-sci cs.AI

Design Topological Materials by Reinforcement Fine-Tuned Generative Model

通过强化微调生成模型设计拓扑材料

Haosheng Xu, Dongheng Qian, Zhixuan Liu, Yadong Jiang, Jing Wang

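AI summary This paper proposes designing topological insulators and topological crystalline insulators with a reinforcement fine-tuned generative model, shows that the approach is effective at generating new topological materials with a full band gap, and highlights Ge₂Bi₂O₆ as a representative topological insulator.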

Journal ref
Nature Communications (2026)
AI中文摘要

拓扑绝缘体(TIs)和拓扑晶体绝缘体(TCIs)是具有非常规电子性质的材料,其发现对实际应用具有高度价值。然而,特别是具有完整能隙的此类材料仍然稀少。鉴于传统方法在已知材料中扫描候选材料的局限性,我们专注于通过生成模型生成新拓扑材料。具体而言,我们应用强化微调(ReFT)到预训练的生成模型,从而将模型的目标与材料设计目标对齐。我们证明ReFT在增强模型生成TIs和TCIs的能力方面是有效的,且对生成材料的稳定性影响很小。使用微调后的模型,我们成功识别了大量新的拓扑材料,Ge₂Bi₂O₆作为代表性的例子——一个具有0.26 eV完整能隙的TI,是该类材料中已知的最大之一。

英文摘要

Topological insulators (TIs) and topological crystalline insulators (TCIs) are materials with unconventional electronic properties, making their discovery highly valuable for practical applications. However, such materials, particularly those with a full band gap, remain scarce. Given the limitations of traditional approaches that scan known materials for candidates, we focus on the generation of new topological materials through a generative model. Specifically, we apply reinforcement fine-tuning (ReFT) to a pre-trained generative model, thereby aligning the model's objectives with our material design goals. We demonstrate that ReFT is effective in enhancing the model's ability to generate TIs and TCIs, with minimal compromise on the stability of the generated materials. Using the fine-tuned model, we successfully identify a large number of new topological materials, with Ge$_2$Bi$_2$O$_6$ serving as a representative example: a TI with a full band gap of 0.26 eV, ranking among the largest known in this category.

2503.22693 2026-05-21 q-fin.ST cs.AI cs.CL

Bridging Language Models and Financial Analysis

连接语言模型与金融分析

Alejandro Lopez-Lira, Jihoon Kwon, Sangwoon Yoon, Jy-yong Sohn, Chanyeol Choi

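AI summary This paper surveys recent advances in language model research and explores their potential applications in finance, aiming to close the gap between research progress and practical adoption of language models in the financial industry.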

Comments 28 pages

AI中文摘要

大规模语言模型(LLMs)的快速进步为自然语言处理领域带来了革命性可能性,特别是在金融领域。金融数据通常嵌套在文本内容、数值表格和视觉图表之间复杂的相互关系中,这对传统方法来说是一个挑战。然而,LLMs的出现为处理和分析这种多维数据提供了更高效和深入的途径。尽管LLMs研究进展迅速,但在金融行业中的实际应用仍存在显著差距,因为金融行业更倾向于谨慎整合和长期验证。这种差异导致新兴LLM技术的实施速度较慢,尽管它们在金融应用中具有巨大潜力。因此,许多最新的LLM技术进展仍未被充分探索或利用。本文旨在通过提供对最近LLM研究进展的全面概述,并探讨其在金融领域的适用性,来弥合这一差距。基于之前的文献综述,我们突出几种新的LLM方法,探讨其独特的功能及其在金融数据分析中的潜在相关性。通过综合广泛研究的见解,本文旨在为研究人员和从业者提供有价值的资源,指出有前途的研究方向,并概述未来进一步推进LLM在金融应用中的机会。

英文摘要

The rapid advancements in Large Language Models (LLMs) have unlocked transformative possibilities in natural language processing, particularly within the financial sector. Financial data is often embedded in intricate relationships across textual content, numerical tables, and visual charts, posing challenges that traditional methods struggle to address effectively. However, the emergence of LLMs offers new pathways for processing and analyzing this multifaceted data with increased efficiency and insight. Despite the fast pace of innovation in LLM research, there remains a significant gap in their practical adoption within the finance industry, where cautious integration and long-term validation are prioritized. This disparity has led to a slower implementation of emerging LLM techniques, despite their immense potential in financial applications. As a result, many of the latest advancements in LLM technology remain underexplored or not fully utilized in this domain. This survey seeks to bridge this gap by providing a comprehensive overview of recent developments in LLM research and examining their applicability to the financial sector. Building on previous survey literature, we highlight several novel LLM methodologies, exploring their distinctive capabilities and their potential relevance to financial data analysis. By synthesizing insights from a broad range of studies, this paper aims to serve as a valuable resource for researchers and practitioners, offering direction on promising research avenues and outlining future opportunities for advancing LLM applications in finance.

2502.03545 2026-05-21 cs.GT cs.AI cs.MA cs.SI

Proportional Selection in Networks

网络中的比例选择

Georgios Papasotiropoulos, Oskar Skibski, Piotr Skowron, Tomasz Wąs

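AI summary This paper studies how to select k representative nodes from a network, aiming both to identify the most influential nodes and to ensure that the selection proportionally reflects the network's diversity; it proposes two approaches, analyzes them theoretically, and validates them experimentally.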

Comments This version has been accepted for publication at IJCAI'26

AI中文摘要

我们解决了从网络中选择k个代表性节点的问题,旨在实现两个目标:识别最有影响力的节点和确保选择比例反映网络的多样性。我们提出了两种方法来完成这一任务,进行了理论分析,并通过一系列实验展示了它们的有效性。

英文摘要

We address the problem of selecting $k$ representative nodes from a network, aiming to achieve two objectives: identifying the most influential nodes and ensuring the selection proportionally reflects the network's diversity. We propose two approaches to accomplish this, analyze them theoretically, and demonstrate their effectiveness through a series of experiments.

2410.23212 2026-05-21 stat.ML cs.LG math.ST stat.TH

Improved convergence rate of kNN graph Laplacians: differentiable self-tuned affinity

kNN图拉普拉斯算子的改进收敛速度:可微自调亲和力

Xiuyuan Cheng, Yixuan Tan, Nan Wu

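AI summary This paper studies the convergence rate of kNN graph Laplacians built with a differentiable self-tuned affinity. A refined analysis shows that, in the manifold data setting, the kNN graph Laplacian converges to the limiting manifold operator at the rate O(N^{-2/(d+6)}); numerical experiments validate the theory.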

AI中文摘要

在基于图的数据分析中,k最近邻(kNN)图因其对局部数据密度的适应性而被广泛应用。允许图中边的加权,核化图亲和力提供了一种更一般的kNN图,其中kNN距离用于自适应地设置核带宽。在本文中,我们考虑了一类一般的kNN图,其中图亲和力为W_{ij}=ε^{-d/2}k_0(||x_i -x_j||^2/εϕ(ρ(x_i),ρ(x_j))^2),其中ρ(x)是点x的(重新缩放的)kNN距离,ϕ是一个对称双变量函数,k_0是一个非负函数。在流形数据设定下,其中N个i.i.d.样本x_i从一个未知的d维流形上的密度p中抽取,我们证明了在k_0和ϕ具有C^3正则性并满足其他技术条件时,kNN图拉普拉斯算子以O(N^{-2/(d+6)})的速度收敛到极限流形算子(取决于p),并验证了理论结果。

英文摘要

In graph-based data analysis, $k$-nearest neighbor ($k$NN) graphs are widely used due to their adaptivity to local data densities. Allowing weighted edges in the graph, the kernelized graph affinity provides a more general type of $k$NN graph where the $k$NN distance is used to set the kernel bandwidth adaptively. In this work, we consider a general class of $k$NN graph where the graph affinity is $W_{ij} = \epsilon^{-d/2} \, k_0\left( \frac{\| x_i - x_j \|^2}{\epsilon \, \phi( \hat\rho(x_i), \hat\rho(x_j) )^2} \right)$, with $\hat\rho(x)$ being the (rescaled) $k$NN distance at the point $x$, $\phi$ a symmetric bi-variate function, and $k_0$ a non-negative function on $[0,\infty)$. Under the manifold data setting, where $N$ i.i.d. samples $x_i$ are drawn from a density $p$ on a $d$-dimensional unknown manifold embedded in a high dimensional Euclidean space, we prove the operator pointwise convergence of the $k$NN graph Laplacian to the limiting manifold operator (depending on $p$) at the rate of $O(N^{-2/(d+6)})$, up to a log factor, when $k_0$ and $\phi$ have $C^3$ regularity and satisfy other technical conditions. This is obtained when $\epsilon \sim N^{-2/(d+6)}$ and $k \sim N^{6/(d+6)}$, both at the optimal order to balance the theoretical bias and variance errors. Our improved convergence rate is based on a refined analysis of the $k$NN estimator, which can be of independent interest. We validate our theory by numerical experiments on simulated data.
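
The affinity defined in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' code. The choices k_0(r) = exp(-r) and phi(u, v) = sqrt(u v), the use of the unrescaled kNN distance in place of the rescaled one, and the function names `knn_distance` and `self_tuned_affinity` are all assumptions made for illustration.

```python
import numpy as np

def knn_distance(X, k):
    """Return each point's distance to its k-th nearest neighbour
    (self excluded) and the matrix of squared pairwise distances."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff**2, axis=-1)
    # After sorting a row, column 0 is the point itself (distance 0),
    # so column k holds the k-th nearest neighbour.
    rho = np.sqrt(np.sort(d2, axis=1)[:, k])
    return rho, d2

def self_tuned_affinity(X, k, eps, d):
    """W_ij = eps^(-d/2) * k0(|x_i - x_j|^2 / (eps * phi(rho_i, rho_j)^2)).

    Assumed for illustration: k0(r) = exp(-r), phi(u, v) = sqrt(u * v),
    and rho is the raw (unrescaled) kNN distance.
    """
    rho, d2 = knn_distance(X, k)
    phi_sq = rho[:, None] * rho[None, :]          # phi(rho_i, rho_j)^2
    return eps ** (-d / 2.0) * np.exp(-d2 / (eps * phi_sq))

X = np.random.default_rng(0).normal(size=(50, 2))
W = self_tuned_affinity(X, k=5, eps=0.5, d=2)
print(W.shape)
```

With these choices the matrix is symmetric because phi is symmetric, and every diagonal entry equals eps^(-d/2) because k_0(0) = 1.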

2410.12771 2026-05-21 cond-mat.mtrl-sci cs.AI physics.comp-ph

Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models

开放材料2024(OMat24)无机材料数据集和模型

Luis Barroso-Luque, Muhammed Shuaibi, Xiang Fu, Brandon M. Wood, Misko Dzamba, Meng Gao, Ammar Rizvi, C. Lawrence Zitnick, Zachary W. Ulissi

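AI summary This work presents OMat24, a large-scale open dataset, together with pre-trained models, to address the shortage of publicly available training data and open pre-trained models in materials discovery, using density functional theory calculations and state-of-the-art models to advance AI for materials science.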

Comments 19 pages

AI中文摘要

发现具有理想性能的新材料对于从缓解气候变化到下一代计算硬件的进步至关重要。AI有潜力通过更有效地探索化学空间来加速材料发现和设计,比其他计算方法或试错法更有效。尽管在AI用于材料数据、基准和模型方面取得了显著进展,但一个障碍是缺乏公开可用的训练数据和开放预训练模型。为此,我们提出了Open Materials 2024(OMat24)大规模公开数据集的Meta FAIR发布以及一组预训练模型。OMat24包含超过1.1亿个密度泛函理论(DFT)计算,专注于结构和组成多样性。我们的EquiformerV2模型在Matbench Discovery排行榜上实现了最先进的性能,并能够预测基态稳定性和形成能量,F1分数超过0.9,准确率达到20 meV/atom。我们探讨了模型大小、辅助去噪目标和微调对性能的影响,涵盖了包括OMat24、MPtraj和Alexandria在内的多种数据集。OMat24数据集和模型的公开发布使研究社区能够在此基础上进一步推动AI辅助材料科学的发展。

英文摘要

The ability to discover new materials with desirable properties is critical for numerous applications from helping mitigate climate change to advances in next generation computing hardware. AI has the potential to accelerate materials discovery and design by exploring the chemical space more effectively than other computational methods or trial and error. While substantial progress has been made on AI for materials data, benchmarks, and models, a barrier that has emerged is the lack of publicly available training data and open pre-trained models. To address this, we present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models. OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity. Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard and are capable of predicting ground-state stability and formation energies to an F1 score above 0.9 and an accuracy of 20 meV/atom, respectively. We explore the impact of model size, auxiliary denoising objectives, and fine-tuning on performance across a range of datasets including OMat24, MPtraj, and Alexandria. The open release of the OMat24 dataset and models enables the research community to build upon our efforts and drive further advancements in AI-assisted materials science.

2407.08976 2026-05-21 stat.ML cs.LG math.ST stat.TH

Computational-Statistical Trade-off in Kernel Two-Sample Testing with Random Fourier Features

核两样本检验中计算与统计的权衡:随机傅里叶特征

Ikjun Choi, Ilmun Kim

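AI summary This paper studies the trade-off between computational cost and statistical power when the MMD test is approximated with random Fourier features, and proves that, with a suitable number of random features, the same minimax separation rate as the MMD test can be attained in sub-quadratic time.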

AI中文摘要

近年来,两样本检验方法得到了快速发展,其中最大均值差异(MMD)检验已成为处理复杂和高维数据的有效工具。尽管MMD检验在成功和广泛应用方面表现突出,但其二次时间复杂度限制了大规模分析的应用。为了解决这一问题,本文重新审视了使用随机傅里叶特征近似的MMD检验,并研究其计算-统计权衡。我们首先揭示,只有当随机特征数量趋于无穷时,近似MMD检验才能在点估计上保持一致性。随后,我们考虑检验的均匀功效,并在最小最大检验框架下研究时间-功效权衡。我们的结果表明,通过精心选择随机特征数量,可以在亚二次时间内达到与MMD检验相同的最小最大分离率。我们基于不同的分布假设(如Sobolev球内的密度)展示了这一点。理论发现通过模拟研究得到验证。

英文摘要

Recent years have seen a surge in methods for two-sample testing, among which the Maximum Mean Discrepancy (MMD) test has emerged as an effective tool for handling complex and high-dimensional data. Despite its success and widespread adoption, the primary limitation of the MMD test has been its quadratic-time complexity, which poses challenges for large-scale analysis. While various approaches have been proposed to expedite the procedure, it has been unclear whether it is possible to attain the same power guarantee as the MMD test at sub-quadratic time cost. To fill this gap, we revisit the approximated MMD test using random Fourier features, and investigate its computational-statistical trade-off. We start by revealing that the approximated MMD test is pointwise consistent in power only when the number of random features approaches infinity. We then consider the uniform power of the test and study the time-power trade-off under the minimax testing framework. Our result shows that, by carefully choosing the number of random features, it is possible to attain the same minimax separation rates as the MMD test within sub-quadratic time. We demonstrate this point under different distributional assumptions such as densities in a Sobolev ball. Our theoretical findings are corroborated by simulation studies.
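
As a rough illustration of the approximation discussed in the abstract above, the sketch below estimates the squared MMD with random Fourier features for a Gaussian kernel. It is a generic textbook construction, not the authors' procedure: the kernel choice, the simple plug-in estimate, the default number of features, and the names `rff_features` and `rff_mmd2` are assumptions, and the calibration step needed to turn the statistic into a test (for example by permutation) is omitted.

```python
import numpy as np

def rff_features(Z, omegas):
    """Map rows of Z to random Fourier features [cos(w.z), sin(w.z)] / sqrt(R)."""
    proj = Z @ omegas.T                            # shape (n, R)
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(omegas.shape[0])

def rff_mmd2(X, Y, num_features=64, bandwidth=1.0, seed=0):
    """Plug-in estimate of MMD^2 for the Gaussian kernel
    exp(-|x - y|^2 / (2 * bandwidth^2)), using num_features random frequencies."""
    rng = np.random.default_rng(seed)
    # Frequencies of a Gaussian kernel are drawn from N(0, I / bandwidth^2).
    omegas = rng.normal(scale=1.0 / bandwidth, size=(num_features, X.shape[1]))
    diff = rff_features(X, omegas).mean(axis=0) - rff_features(Y, omegas).mean(axis=0)
    return float(diff @ diff)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = rng.normal(size=(200, 2)) + 2.0                # shifted distribution
print(rff_mmd2(X, Y, num_features=256))
```

The statistic costs on the order of (n + m) R d operations for n and m samples in d dimensions with R random features, compared with the quadratic cost in the sample size of the exact kernel statistic, so R is the quantity that governs the time-power trade-off the abstract describes.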

2407.01734 2026-05-21 quant-ph cs.AI

Optical Quantum Mixed-State Reconstruction With Multiple Deep Learning Approaches

光学量子混合态重构与多种深度学习方法

Nhan Trong Luu, Tuyen Quang Nguyen, Duong Trung Luu, Thang Cong Truong

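AI summary This paper proposes two neural-network-based approaches to quantum state reconstruction for pure and mixed states, achieving high-accuracy reconstruction of both by leveraging class information.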

Journal ref
SN Computer Science (2026)
AI中文摘要

量子态重构是表征量子系统状态的关键技术,对许多量子技术应用至关重要。近年来,利用神经网络增强量子态重构的效率和精度引起了广泛关注。然而,适用于多种重构场景的通用方法仍较为有限。本文提出两种基于神经网络的重构方法:受限特征神经网络和混合态神经网络。通过在重构过程中利用类别信息,我们实现了对纯态和混合态的高精度重构。

英文摘要

Quantum state tomography is a crucial technique for characterizing the state of a quantum system, which is essential for many applications in quantum technologies. In recent years, there has been growing interest in leveraging neural networks to enhance the efficiency and accuracy of quantum state tomography. However, versatile methods that are broadly applicable across diverse reconstruction scenarios remain relatively underexplored. In this paper, we present two neural network-based reconstruction approaches for both pure and mixed quantum state tomography: Restricted Feature Based Neural Network and Mixed States Neural Network. By leveraging class information during reconstruction, we are able to achieve state-of-the-art performance of tomography for both pure and mixed quantum states.