arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.08258 2026-06-09 cs.GR cs.CV cs.LG New submission

MS-COOT: Comparing Morse-Smale Complexes with Co-Optimal Transport

Guangyu Meng, Mingzhe Li, Erin Wolf Chambers

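Affiliations: Department of Computer Science and Engineering, University of Notre Dame

AI summary: Proposes the MS-COOT distance, which represents Morse-Smale complexes as hypergraphs and jointly matches critical points and regions via co-optimal transport, enabling region-level structural comparison and outperforming graph-based methods on tasks such as classification.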

Abstract

Understanding and comparing structures in scalar fields is a central challenge in scientific visualization, with applications ranging from feature analysis to temporal and structural comparison. The Morse-Smale (MS) complex provides a natural representation by decomposing a scalar field into regions induced by gradient flow. However, existing approaches typically rely on graph-based representations, capturing relationships between critical points while discarding region-level structure. In this work, we represent the MS complex as a hypergraph, where critical points form nodes and regions define hyperedges. We introduce MS-COOT, a co-optimal transport distance that jointly computes correspondences between critical points and regions. This formulation enables explicit region-to-region matching within a distance-based framework, allowing identification of region-level events such as splitting and merging. We instantiate this framework with domain-specific components, including a hypernetwork function encoding critical point-region relationships, persistence-based probability measures that emphasize topologically significant features, and a sample cost term that incorporates critical point attributes. We evaluate MS-COOT on five datasets spanning 2D simulations, 3D surface meshes, and volumetric data. Our results show that MS-COOT captures region-level structural changes that are not reflected by graph-based distances, while achieving strong performance in downstream tasks such as classification and resolution discrimination.
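
Illustrative sketch (not from the paper): the co-optimal transport objective that MS-COOT builds on scores a pair of couplings, one between critical points and one between regions, against two hypergraph matrices. The toy below evaluates that objective for a plain 0/1 incidence matrix with uniform weights. The paper's hypernetwork function, persistence-based measures, and sample cost term are not reproduced, and every matrix and name here is a made-up example.

```python
# Toy COOT objective: rows are critical points, columns are regions.
# Every matrix and coupling here is a hypothetical example.

def coot_cost(X, Y, pi_nodes, pi_regions):
    """Sum of (X[i][k] - Y[j][l])**2 * pi_nodes[i][j] * pi_regions[k][l]."""
    total = 0.0
    for i, x_row in enumerate(X):
        for j, y_row in enumerate(Y):
            for k, x in enumerate(x_row):
                for l, y in enumerate(y_row):
                    total += (x - y) ** 2 * pi_nodes[i][j] * pi_regions[k][l]
    return total

# Incidence matrix: 3 critical points, 2 regions.
X = [[1, 0], [1, 0], [1, 1]]
# The same hypergraph with regions swapped and the last node listed first.
Y = [[1, 1], [0, 1], [0, 1]]

third, half = 1 / 3, 1 / 2
matching_nodes = [[0, third, 0], [0, 0, third], [third, 0, 0]]
matching_regions = [[0, half], [half, 0]]
identity_nodes = [[third, 0, 0], [0, third, 0], [0, 0, third]]
identity_regions = [[half, 0], [0, half]]

cost_matched = coot_cost(X, Y, matching_nodes, matching_regions)
cost_identity = coot_cost(X, Y, identity_nodes, identity_regions)
```

The relabelled hypergraph costs nothing under the matching couplings and a positive amount under the identity couplings, which is the behaviour a joint node-and-region matching is meant to expose.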

2606.08247 2026-06-09 eess.AS cs.AI cs.LG eess.SP New submission

AeroSpectra Sentinel: An Auditable LLM Prompt-Chaining Decision-Support Workflow for Acute Asthma Risk Assessment from Respiratory Sounds and Clinical Signals

Aueaphum Aueawatthanaphisut

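Affiliations: School of Information, Computer, and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University

AI summary: Proposes AeroSpectra Sentinel, which combines STFT respiratory sound analysis, lightweight ML screening, clinical feature fusion, and a five-stage LLM prompt chain for auditable acute asthma risk assessment. The audio screening is evaluated on a public dataset and the LLM workflow on 40 simulated clinical vignettes.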

Comments 10 pages, 8 figures, 5 tables, 14 equations

Abstract

Acute asthma risk assessment requires rapid interpretation of respiratory sounds, oxygenation, airflow limitation, speech ability, work of breathing, mental status, and response to reliever therapy. Conventional audio-only classifiers can detect wheeze-like patterns but often lack transparent clinical reasoning and safe escalation logic. This paper presents AeroSpectra Sentinel, a client-side research prototype and decision-support workflow that combines short-time Fourier transform (STFT) respiratory sound analysis, lightweight machine-learning screening, clinical feature fusion, and a five-stage large language model (LLM) prompt-chaining process. The workflow separates signal acquisition, preprocessing, acoustic feature extraction, ML screening, clinical guardrails, and FHIR-ready reporting. We evaluated the audio screening component on a public respiratory sound dataset containing 1,211 WAV recordings from five labels. Using a stratified subset of 584 recordings, a random forest achieved 91.10% binary accuracy and 78.69% F1-score for asthma-vs-non-asthma screening, while a feature-based multilayer perceptron achieved 89.73% accuracy and 78.26% F1-score. A compact log-spectrogram CNN achieved 73.29% accuracy and 55.17% F1-score. Multiclass classification achieved 77.40% accuracy and 77.23% macro-F1. To evaluate the LLM workflow, we conducted a scenario-based audit on 40 simulated clinical vignettes comparing one-shot prompting, prompt chaining, prompt chaining with guardrails, and prompt chaining with guardrails plus FHIR schema validation. The guardrail-plus-schema variant achieved the strongest simulated safety and documentation consistency. AeroSpectra Sentinel is intended as a research prototype, not as a diagnostic medical device or clinically validated risk-assessment product.
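
Illustrative check (an inference, not stated in the abstract): the gap between 91.10% accuracy and 78.69% F1 is what one would expect when the asthma class is a minority. The counts below are hypothetical; they assume a 146-recording held-out set and an arbitrary split of the 13 errors into false positives and false negatives, chosen only because they happen to reproduce the reported random-forest figures.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F1) for the positive class from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

# Hypothetical confusion counts (assumed, not reported by the authors).
accuracy, f1 = binary_metrics(tp=24, fp=6, fn=7, tn=109)
accuracy_pct = round(accuracy * 100, 2)
f1_pct = round(f1 * 100, 2)
```

Because true negatives count toward accuracy but not toward F1, a screen can be right on most recordings while still missing or over-calling a noticeable share of the asthma cases.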

2606.08228 2026-06-09 q-fin.TR cs.LG q-fin.CP q-fin.ST New submission

Post-Rejection Follow-up Sampling: A Methodology for Counterfactual Outcome Measurement in Algorithmic DEX Trading

Arati Uday Kamat

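Affiliations: Independent Researcher

AI summary: Proposes Post-Rejection Follow-up Sampling (PRFS), in which a separate tracking subsystem samples the price and liquidity of rejected tokens so that filter precision can be evaluated; the companion dataset contains 67,000 observation rows across 2,997 rejection events.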

Comments 12 pages. Companion methodology paper to RED-2400 (arXiv:2605.12151). Currently under review at Ledger. SSRN abstract ID 6607301. Zenodo concept DOI 10.5281/zenodo.20043516

Abstract

Algorithmic trading systems on decentralised exchanges (DEXs) reject most candidate tokens they evaluate. The counterfactual outcome of rejected candidates (what would have happened had the system entered) is rarely measured. This paper introduces Post-Rejection Follow-up Sampling (PRFS). A separate tracking subsystem samples each rejected token's price and liquidity at a configurable cadence, over a horizon of up to twenty-four hours. PRFS produces the data needed to evaluate filter precision against actual market outcomes of rejected candidates, not against synthetic backtest reconstructions. The methodology, data architecture, and deposit format are described in Section III. The companion dataset contains 67,000 forward-outcome observation rows across 2,997 rejection events spanning 457 unique mints, collected over a continuous eight-day window (2026-04-10 to 2026-04-19, UTC). Approximately 55 percent of rejection events receive at least one forward observation; coverage at the mint level is complete. The principal binding constraint on downstream classification is per-event horizon density, not event-level coverage. PRFS is dataset-independent. It generalises to any algorithmic decision system in which rejections substantially outnumber executions.
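
Illustrative sketch (not the paper's schema): the abstract distinguishes event-level coverage, mint-level coverage, and per-event horizon density. The toy below computes all three from made-up records; the tuple layouts and field order are assumptions for illustration only.

```python
from collections import Counter

def coverage(rejections, observations):
    """rejections: (event_id, mint); observations: (event_id, minutes_after)."""
    rows_per_event = Counter(event_id for event_id, _ in observations)
    covered_events = [e for e, _ in rejections if rows_per_event[e] > 0]
    all_mints = {mint for _, mint in rejections}
    covered_mints = {mint for e, mint in rejections if rows_per_event[e] > 0}
    return {
        "event_coverage": len(covered_events) / len(rejections),
        "mint_coverage": len(covered_mints) / len(all_mints),
        "rows_per_covered_event": len(observations) / len(covered_events),
    }

# Hypothetical data: four rejection events over two mints.
rejections = [(1, "mintA"), (2, "mintA"), (3, "mintB"), (4, "mintB")]
observations = [(1, 5), (1, 60), (1, 240), (3, 5)]
stats = coverage(rejections, observations)
```

Here every mint is observed even though only half of the events are, and event 3 has a single forward sample, which mirrors the abstract's point that horizon density rather than coverage is the binding constraint.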

2606.08210 2026-06-09 eess.AS cs.CL cs.SD New submission

Paediatric-HGNN: A Hybrid Heterogeneous Graph Neural Network for Detecting Disfluency in Children's Speech via Multiscale Acoustic Fusion

Rashini Liyanarachchi, Rachael Mackay, Alison Short, Aditya Joshi, Erik Meijering

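Affiliations: University of New South Wales; Western Sydney University; Resourced Music Therapy

AI summary: To address the high acoustic variability of children's speech and the difficulty of separating pathological stuttering from developmental disfluency, proposes the Paediatric-HGNN framework, which builds a heterogeneous graph capturing hierarchical relationships between lexical units and acoustic segments, achieving 82.4% weighted accuracy and a Typical Disfluency F1-score of 0.386 on paediatric corpora.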

Comments Accepted at INTERSPEECH 2026 (Main)

Abstract

Automated stuttering detection (ASD) systems struggle with paediatric speech due to high acoustic variability in developing voices and the subtle distinction between pathological stuttering and typical developmental disfluencies. We introduce Paediatric-HGNN, a framework using a Context-aware Part-whole Interaction Network (CaPIN) tailored for paediatric data. Instead of conventional 1D signal modelling, our approach builds a heterogeneous graph capturing hierarchical relationships between lexical units (word nodes) and fine-grained acoustic segments (frame nodes). Trained on curated paediatric corpora (UCLASS and FluencyBank), Paediatric-HGNN achieves 82.4% weighted accuracy and a Typical Disfluency F1-score of 0.386. Modelling hierarchical lexical-acoustic interactions captures developmental "searching" behaviour, offering a more robust and interpretable tool for early clinical intervention.
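
Illustrative sketch (not the authors' construction): a heterogeneous graph with word nodes and frame nodes can be assembled from word time alignments and a fixed frame hop. The three edge types, the containment rule, and all names below are assumptions; the abstract does not specify how CaPIN connects nodes.

```python
def build_graph(words, n_frames, hop_ms):
    """words: list of (token, start_ms, end_ms). Returns edge lists by type."""
    edges = {"word-word": [], "frame-frame": [], "word-frame": []}
    # Sequential links within each node type.
    edges["word-word"] = [(w, w + 1) for w in range(len(words) - 1)]
    edges["frame-frame"] = [(f, f + 1) for f in range(n_frames - 1)]
    # Part-whole links: a frame belongs to the word whose interval contains it.
    for w, (_token, start_ms, end_ms) in enumerate(words):
        for f in range(n_frames):
            if start_ms <= f * hop_ms < end_ms:
                edges["word-frame"].append((w, f))
    return edges

# Hypothetical alignment: a short word followed by a repeated-onset word.
words = [("I", 0, 200), ("w-w-want", 200, 800)]
graph = build_graph(words, n_frames=8, hop_ms=100)
```

A disfluent word spans many more frames than a fluent one, so the part-whole edges carry duration and repetition cues that a flat frame sequence does not make explicit.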

2606.08203 2026-06-09 math.NA cs.LG cs.NA stat.ML New submission

Stable and Scalable Probabilistic Numerical Solvers for Stiff and High-Dimensional ODEs

Nathanael Bosch

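Affiliations: EPFL

AI summary: For stiff and high-dimensional ordinary differential equations, proposes two complementary strategies, a matrix-free update step that achieves linear scaling and iterative re-linearization that improves stability, yielding probabilistic solvers that are both stable and scalable.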

Abstract

Filtering-based probabilistic numerical solvers for ordinary differential equations (ODEs) have been established as a flexible and efficient simulation framework with built-in numerical uncertainty quantification. However, problems that are both stiff and high-dimensional remain a challenge, as current methods are either stable and have cubic cost in the ODE dimension, or scale linearly at the expense of stability. In this paper, we close this gap and develop probabilistic ODE solvers that are both stable and scalable. We propose two complementary strategies. First, we develop a matrix-free update step that uses Jacobian-vector products, iterative linear solvers, and stochastic covariance estimation to enable linear scaling, all while retaining stability. Second, we propose iterative re-linearization to further improve stability without sacrificing scalability, turning probabilistic ODE solvers into fully implicit methods. We evaluate the proposed approaches on a range of stiff and high-dimensional problems and demonstrate improved stability and scalability over established probabilistic solvers.
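
Illustrative sketch (not the paper's solver): a matrix-free method only ever needs the action of the Jacobian on a vector, never the full d-by-d matrix. The toy below approximates a Jacobian-vector product with a forward finite difference, which is a stand-in assumption; an actual implementation would more likely use automatic differentiation. The stiff van der Pol right-hand side and its parameter are chosen here only as a familiar example.

```python
def jvp(f, x, v, eps=1e-6):
    """Approximate J_f(x) @ v with one extra evaluation of f."""
    fx = f(x)
    shifted = f([xi + eps * vi for xi, vi in zip(x, v)])
    return [(a - b) / eps for a, b in zip(shifted, fx)]

def van_der_pol(x, mu=1000.0):
    """Stiff van der Pol vector field (example problem, assumed here)."""
    return [x[1], mu * ((1 - x[0] ** 2) * x[1] - x[0])]

# Exact Jacobian at (2, 0) applied to (1, 0) is (0, -mu).
product = jvp(van_der_pol, [2.0, 0.0], [1.0, 0.0])
```

Each product costs one function evaluation regardless of dimension, which is why pairing such products with an iterative linear solver avoids the cubic cost of forming and factorising the Jacobian.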

2606.08202 2026-06-09 stat.ML cs.LG physics.data-an q-bio.NC New submission

Vector Space of Cycles

Moo K. Chung, Anass B. El-Yaagoubi, Hernando Ombao

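Affiliations: Department of Biostatistics and Medical Informatics, University of Wisconsin Madison; Statistics Program, King Abdullah University of Science and Technology

AI summary: Proposes a variational framework that represents cyclic interactions as edge flows on a simplicial complex and uses energy-minimizing dynamics to separate transient components from persistent harmonic flows, yielding a low-dimensional cycle space that supports projection, averaging, comparison, and statistical inference on cyclic structure.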

Abstract

Most statistical and machine learning methods for directed interactions focus on pairwise effects among variables. Even existing cyclic models represent feedback primarily through node-level dependencies, making large-scale recurrent organization difficult to estimate and compare. This limitation is particularly acute in biological and neural systems, where interactions are highly recurrent and involve many overlapping cycles. We introduce a variational framework for statistical inference on cyclic interactions. Directed interactions are represented as edge flows on a simplicial complex and evolved under an energy-minimizing dynamical system. The resulting dynamics separate transient interaction components from persistent harmonic flows, yielding a low-dimensional cycle space that captures stable recurrent organization. Rather than enumerating individual cycles, the proposed framework represents cyclic interactions as elements of a Hilbert space, enabling projection, averaging, comparison, and population-level statistical inference. We establish theoretical properties of the harmonic projection, including characterization of the cycle space, variance reduction, and population inference. Simulations demonstrate substantially improved recovery of cyclic structure in dense recurrent systems compared with existing directed-interaction methods. Applied to resting-state fMRI from 400 human subjects, the framework reveals reproducible large-scale cyclic organization that is not detectable through edgewise averaging. These results provide a scalable statistical framework for studying recurrent interactions in high-dimensional dynamical systems.
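
Illustrative sketch (a simplified stand-in, not the paper's dynamics): on a graph with no filled triangles, the harmonic edge flows are exactly the divergence-free flows, so gradient descent on the energy 0.5 * ||div(x)||^2 drives any edge flow to its projection onto the cycle space. The graph, step size, and starting flow below are made-up; the paper works on simplicial complexes and its energy functional may differ.

```python
# Oriented edges (tail, head): a 3-cycle 0 -> 1 -> 2 -> 0 plus a dangling edge.
EDGES = [(0, 1), (1, 2), (2, 0), (2, 3)]
N_NODES = 4

def divergence(flow):
    """Net inflow at each node (the boundary operator applied to the flow)."""
    div = [0.0] * N_NODES
    for (tail, head), x in zip(EDGES, flow):
        div[tail] -= x
        div[head] += x
    return div

def energy_step(flow, eta=0.2):
    """One gradient step on 0.5 * ||divergence(flow)||^2."""
    div = divergence(flow)
    return [x - eta * (div[head] - div[tail])
            for (tail, head), x in zip(EDGES, flow)]

flow = [3.0, 1.0, 2.0, 5.0]  # hypothetical observed interactions
for _ in range(200):
    flow = energy_step(flow)
```

The transient part, including everything on the dangling edge, decays, and what remains is a constant circulation of 2 around the 3-cycle: the average of the three cycle-edge values in the starting flow.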

2606.08196 2026-06-09 stat.ML cs.AI cs.LG stat.ME New submission

Beyond Additivity: Causal Discovery in Location-Scale Noise Models with Hidden Variables

Mariyam Khan, Shohei Shimizu, Thong Pham

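Affiliations: RIKEN AIP; University of Bergen; The University of Osaka; Shiga University

AI summary: For causal discovery with hidden variables when the data-generating process follows a location-scale noise model (LSNM), proves that acyclic directed mixed graphs (ADMGs) satisfying a bow-free condition are identifiable and proposes the two-stage algorithm LSNM-UV, which outperforms additive baselines on heteroscedastic data.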

Comments 33 pages, 4 figures

Abstract

We study causal discovery from observational data when some variables are hidden and the data-generating process follows a location-scale noise model (LSNM). Existing methods that handle hidden confounders typically assume additive noise, but in practice, causes often modulate not just the mean but also the variance of their effects. We prove that acyclic directed mixed graphs (ADMGs) satisfying a bow-free condition are identifiable under LSNM with hidden variables, establishing the first identifiability result for causally insufficient models beyond noise additivity. We further provide sufficient conditions for identifying causal direction even when the bow-free assumption is violated. Our two-stage algorithm, LSNM-UV, is sound and complete, and experiments demonstrate improved performance over additive baselines on heteroscedastic data.
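
Illustrative sketch (data generation only, not LSNM-UV): in a location-scale noise model the effect is Y = f(X) + g(X) * N with g positive, so the cause shifts both the mean and the spread of the effect. The toy below simulates one such pair without any hidden variable; the choices f(x) = sin(x) and g(x) = 0.2 + |x| are arbitrary assumptions.

```python
import math
import random

def sample_lsnm(n, seed=0):
    """Draw (x, y) pairs with y = sin(x) + (0.2 + |x|) * noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        noise = rng.gauss(0.0, 1.0)
        data.append((x, math.sin(x) + (0.2 + abs(x)) * noise))
    return data

def residual_variance(data, lo, hi):
    """Mean squared residual around the true mean for lo <= |x| < hi."""
    residuals = [y - math.sin(x) for x, y in data if lo <= abs(x) < hi]
    return sum(r * r for r in residuals) / len(residuals)

data = sample_lsnm(20000)
var_near = residual_variance(data, 0.0, 0.5)
var_far = residual_variance(data, 1.5, 2.0)
```

The residual variance grows with |x|, which is the heteroscedastic signature that an additive-noise method would misattribute and that the LSNM setting is designed to model.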

2606.08188 2026-06-09 math.OC cs.LG New submission

Latent Structural Categorical Matrix Completion with Application to Quasispecies Analysis

Qian Zhang, Meixia Lin

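Affiliations: Engineering Systems and Design, Singapore University of Technology and Design; Institute of Statistics and Big Data, Renmin University of China

AI summary: Proposes LCMC, a double-loop optimization framework that performs latent factorization of categorical matrices via a binary tensor representation: the outer loop adaptively estimates the latent dimension and the inner loop reconstructs the matrix through tensor factorization, outperforming existing methods on viral quasispecies reconstruction.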

Abstract

Matrix completion has been extensively studied for real-valued data, but existing methods are often limited in handling categorical variables. We propose LCMC, a double-loop optimization framework for categorical matrix completion via latent factorization based on a binary tensor representation. In this setting, each categorical entry is encoded as a one-hot vector along a third tensor mode, thereby preserving its discrete, non-ordinal nature. The outer loop adaptively estimates the latent dimension by iteratively updating it with feedback from the inner loop, while the inner loop reconstructs the categorical matrix through tensor factorization, supported by a corresponding theoretical analysis. To further improve scalability and robustness, we introduce enhancements including a split-merge-refine strategy and an adaptive data reduction technique. Experiments on synthetic and real-world datasets in viral quasispecies reconstruction demonstrate that LCMC achieves superior accuracy and efficiency compared to existing methods.
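
Illustrative sketch (encoding step only, not LCMC itself): the binary tensor representation turns each observed categorical entry into a one-hot vector along a third mode, with a mask marking missing entries. The nucleotide alphabet and the read matrix below are assumptions chosen to match the quasispecies setting.

```python
def one_hot_tensor(matrix, categories):
    """Return (tensor, mask); missing entries (None) become all-zero vectors."""
    index = {category: k for k, category in enumerate(categories)}
    tensor, mask = [], []
    for row in matrix:
        tensor_row, mask_row = [], []
        for entry in row:
            vector = [0] * len(categories)
            if entry is not None:
                vector[index[entry]] = 1
            tensor_row.append(vector)
            mask_row.append(entry is not None)
        tensor.append(tensor_row)
        mask.append(mask_row)
    return tensor, mask

# Hypothetical reads-by-positions matrix with gaps.
reads = [["A", "C", None],
         ["A", None, "T"]]
tensor, mask = one_hot_tensor(reads, "ACGT")
```

Because the categories occupy separate coordinates, no artificial ordering such as A < C < G < T is imposed, which is the non-ordinal property the abstract emphasises.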

2606.08179 2026-06-09 cs.DS cs.CR cs.LG New submission

Differentially Private Range Subgraph Counting

Xian Chen, Ruobing Bai, Pan Peng

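Affiliations: School of Computer Science and Technology, University of Science and Technology of China

AI summary: To address privacy in subgraph counting, introduces the differentially private range subgraph counting (DPRSC) problem, reduces it to weighted orthogonal range counting through a subgraph projection, combines range trees with local sensitivity estimation to answer private queries with low error, and proves a lower bound on the error that is exponential in the dimension.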

Comments ICML2026

Abstract

Subgraph counting is a fundamental problem in graph analysis. Motivated by practical scenarios where graph analytics are performed on subgraphs induced by selected vertices -- rather than on the entire graph -- and by growing privacy concerns, we initiate the study of differentially private range subgraph counting (DPRSC). The goal is to privately count occurrences of a fixed pattern graph within induced subgraphs defined by multi-dimensional attribute ranges. Unlike classical point counting, subgraph counting is inherently nonlinear and exhibits high sensitivity: a single edge modification can affect many subgraph occurrences. We present the first efficient algorithms for DPRSC with small additive error. Our approach introduces a subgraph projection that reduces DPRSC to weighted orthogonal range counting, enabling the use of range trees and local sensitivity estimation to achieve accurate private query answering. We complement our algorithms with matching lower bounds, obtained by reducing reconstruction attacks to DPRSC and leveraging discrepancy theory. In particular, we show that any differentially private algorithm for DPRSC must incur additive error exponential in the dimension. Empirical evaluations demonstrate that our algorithms significantly outperform baseline methods in accuracy and runtime while maintaining strong privacy guarantees.
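
Illustrative sketch (non-private, not the paper's algorithm): the toy below counts triangles in the subgraph induced by vertices whose attribute falls in a query range, then shows how removing a single edge changes the count. It uses one attribute dimension and a brute-force count; no noise is added, so it demonstrates only the quantity being protected and its sensitivity. The graph and attribute values are made up.

```python
from itertools import combinations

def triangles_in_range(edges, attribute, lo, hi):
    """Count triangles among vertices v with lo <= attribute[v] <= hi."""
    keep = {v for v, a in attribute.items() if lo <= a <= hi}
    adjacent = {v: set() for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            adjacent[u].add(v)
            adjacent[v].add(u)
    return sum(1 for a, b, c in combinations(sorted(keep), 3)
               if b in adjacent[a] and c in adjacent[a] and c in adjacent[b])

# Edge (0, 1) shared by four triangles, one per common neighbour 2..5.
edges = [(0, 1)] + [(hub, k) for k in range(2, 6) for hub in (0, 1)]
attribute = {0: 10, 1: 20, 2: 30, 3: 40, 4: 50, 5: 60}

count_all = triangles_in_range(edges, attribute, 10, 60)
count_sub = triangles_in_range(edges, attribute, 10, 40)
count_without_edge = triangles_in_range(edges[1:], attribute, 10, 60)
```

Deleting the single edge (0, 1) removes every triangle, so the change in the answer scales with the number of common neighbours rather than being bounded by one as in point counting.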

2606.08173 2026-06-09 cs.CR cs.LG cs.NI New submission

AI-Native Closed-Loop Security for 6G-Enabled Cyber-Physical Systems: From Edge Detection to Network-Wide Mitigation

Bilal Hussain, Muhammad Bilal, Tan Li, Haris Pervaiz, Xiao Tang, Qinghe Du, Fawad Ahmad, Muhammad Azhar, Jun Zhang

Affiliations: Division of Science, Engineering, and Health Studies, School of Professional Education and Executive Development, The Hong Kong Polytechnic University; School of Computing and Communications, Lancaster University; Department of Computer Science, The Hang Seng University of Hong Kong; School of Computer Science and Electronic Engineering, University of Essex; School of Information and Communication Engineering, Xi’an Jiaotong University; Department of Applied Data Science, Hong Kong Shue Yan University; Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology

AI summary: Proposes an AI-native closed-loop security architecture for 6G cyber-physical systems that detects at the edge using MEC and mitigates network-wide via SDN/NFV/O-RAN, formalises per-slice latency constraints, systematically reviews 128 studies, and consolidates open problems into five directions.

Comments 30 pages, 12 figures, survey paper, submitted to IEEE Communications Surveys & Tutorials (IEEE COMST)

Abstract

In sixth-generation (6G) networks, billions of cyber-physical systems (CPSs) - autonomous vehicles, smart grids, industrial robots, and remote-surgical equipment - will run over ultra-reliable low-latency slices, collapsing the gap between a remote breach and physical harm to milliseconds, a budget perimeter firewalls and centralised security operations centres cannot meet. This survey reframes 6G CPS security as a closed-loop, AI-native pipeline that senses at the multi-access edge computing (MEC) tier, using minute-scale call-detail records (CDRs) for baseline learning and sub-millisecond RAN/Open-RAN (O-RAN) telemetry for the latency-critical path. It decides locally with compressed deep models, mitigates network-wide via SDN, NFV, and O-RAN controllers, and retrains through federated learning (FL) and digital-twin (DT) replay. We formalise a per-slice, tail-bounded latency contract on the sense, detect, and mitigate stages, enforced at a slice-dependent tail percentile (p99 for safety-critical URLLC slices). Organising 128 peer-reviewed studies (2017-2026) under a PRISMA 2020 protocol, we (i) map the 6G/CPS threat surface to MITRE ATT&CK and a CDR-observable feature space; (ii) unify edge anomaly detection and DDoS classification across twelve datasets and statistical, graph, and transformer models; (iii) synthesise SDN/NFV/O-RAN primitives into one closed-loop reference architecture; (iv) treat FL, large language models (LLMs), DT, post-quantum cryptography (PQC), zero-trust architecture (ZTA), and explainable AI as cross-cutting enablers, not parallel pillars; and (v) consolidate open problems into five directions spanning data, latency, trust, standardisation, and evaluation.
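
Illustrative sketch (one possible reading, not the survey's formal definition): a tail-bounded latency contract can be checked by comparing a high percentile of the sense, detect, and mitigate latency against a slice budget. Summing the three stages into a single end-to-end figure, the nearest-rank percentile, and the 1000-microsecond budget are all assumptions made for this example.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[rank - 1]

def meets_contract(sense_us, detect_us, mitigate_us, budget_us, p=99):
    """True if the p-th percentile of end-to-end latency is within budget."""
    end_to_end = [s + d + m
                  for s, d, m in zip(sense_us, detect_us, mitigate_us)]
    return percentile(end_to_end, p) <= budget_us

# Hypothetical per-event latencies in microseconds.
sense = [200] * 100
detect = [300] * 100
one_outlier = [400] * 99 + [5000]
two_outliers = [400] * 98 + [5000, 5000]

ok = meets_contract(sense, detect, one_outlier, budget_us=1000)
violated = not meets_contract(sense, detect, two_outliers, budget_us=1000)
```

A p99 contract tolerates one slow event in a hundred but not two, which is why the choice of tail percentile per slice matters more than the mean latency.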

2606.08172 2026-06-09 cs.HC cs.AI cs.CY New submission

The Governance of Human-LLM Interaction: Safety Gating, Civility Steering, and Affective Default Lock-In

Manuele Reani, Hongjian Zhang, Hongyu Tian

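Affiliations: School of Management and Economics, The Chinese University of Hong Kong, Shenzhen, China

AI summary: Using a deterministic multi-agent evaluation pipeline, measures prompt steerability and style drift of LLMs in long-horizon dialogue, proposes a governance framework distinguishing safety gating, civility steering, and affective default lock-in, and discusses what provider control over communicative form implies for pluralism, autonomy, and democratic agency.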

Abstract

Large language models (LLMs) increasingly mediate high-stakes interactions in finance, medicine, and mental-health support, yet users have limited control over how these systems communicate. We frame interaction style as a governance object: provider-side alignment not only blocks harmful content, but also stabilizes communicative defaults that shape users' epistemic distance, relational expectations, and capacity to opt out of emotionalized or anthropomorphic interaction. We introduce a deterministic multi-agent evaluation pipeline for measuring prompt steerability and style drift in long-horizon dialogue. The study replays 100 frozen user-only scripts across four domains and three runnable persona conditions: default, sarcastic, and cold, using three generator models, yielding 90,000 assistant replies scored by a human-calibrated LLM judge on harmfulness, negative emotion, inappropriateness, empathic language, anthropomorphism, and refusal behavior. A fourth harmful persona is evaluated separately as a safety-gating test. The paper contributes a reproducible method for quantifying whether prompt-specified styles remain stable over time and a governance framework distinguishing safety gating, civility steering, and affective default lock-in. Overall, we show that prompt steerability and regression-to-default are observable indicators of provider control over communicative form, with implications for pluralism, autonomy, and democratic agency in human-LLM interaction.
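
Illustrative arithmetic and sketch (assumptions, not the paper's metric): if the 100 scripts are fully crossed with the three runnable personas and three generators, and every conversation has the same length, the 90,000 replies imply 100 assistant turns per conversation. The slope function is a hypothetical way to summarise style drift as the trend of a per-turn judge score; the abstract does not say how drift is computed.

```python
scripts, personas, generators = 100, 3, 3
total_replies = 90_000

# Assumes a full crossing of scripts, personas, and generators.
conversations = scripts * personas * generators
replies_per_conversation = total_replies // conversations

def drift_slope(scores):
    """Least-squares slope of a per-turn score against the turn index."""
    n = len(scores)
    mean_turn = (n - 1) / 2
    mean_score = sum(scores) / n
    covariance = sum((t - mean_turn) * (s - mean_score)
                     for t, s in enumerate(scores))
    variance = sum((t - mean_turn) ** 2 for t in range(n))
    return covariance / variance

# Hypothetical sarcasm ratings that fade by one point per turn.
example_slope = drift_slope([5, 4, 3, 2, 1])
```

A slope near zero would indicate that a prompted persona holds over the conversation, while a steady decline toward the default persona's score is the regression-to-default pattern the abstract describes.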

2606.08168 2026-06-09 cs.CR cs.AI New submission

Closing the Sim-to-Real Gap: An Evaluation Framework for Autonomous Cyber Defense Configuration of Commercial EDR

Kerri Prinos, Lilianne Brush

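Affiliations: GitHub

AI summary: Proposes the first evaluation framework for autonomous defense agents hardening commercial EDR; an instantiation in a GOAD lab with Microsoft Defender XDR surfaces three lessons that simulation and open-source-EDR evaluation cannot reveal.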

Comments 12 pages including references

Abstract

Leading commercial endpoint detection and response (EDR) products have shifted from operator-configured rule sets to multi-component systems where autonomous AI components operate alongside, and increasingly in place of, operator-deployed policies. Autonomous defense agents using commercial EDR as their hardening tool are no longer tuning a passive tool, but a black-box autonomous system capable of making vendor-specific decisions. We present the first evaluation framework for autonomous defense agents hardening commercial EDR. We instantiate it in a Game of Active Directory (GOAD) lab with Horizon3.ai's NodeZero as the autonomous pentester and Microsoft Defender XDR as the EDR. We run a sample benchmark of defense agents with two large language model (LLM) backbones (Claude Sonnet 4.6 and Cisco Foundation-Sec-8B). We report three lessons learned that neither simulation nor open-source-EDR evaluation can surface: (i) commercial EDR telemetry is engineered for Security Operations Center (SOC) analyst workflows rather than scientific benchmarking; (ii) the importance of per-policy attribution to separate defense agent actions from autonomous EDR actions; and (iii) the EDR's autonomous behavior varies during the evaluation window. Together, these findings highlight a sim-to-real gap for enterprise defense and motivate evaluation methodology for benchmarking autonomous defense agents in environments with black-box, autonomous tools.
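
Illustrative sketch (hypothetical record format): lesson (ii) concerns separating blocks caused by policies the defense agent deployed from blocks the EDR made on its own. The toy below tallies the two; the event fields, policy identifiers, and labels are invented for illustration and do not correspond to Microsoft Defender XDR telemetry or to the authors' tooling.

```python
from collections import Counter

def attribute_blocks(events, agent_policy_ids):
    """Count block actions by whether an agent-deployed policy triggered them."""
    counts = Counter()
    for event in events:
        if event["action"] != "block":
            continue
        if event.get("policy_id") in agent_policy_ids:
            counts["defense_agent"] += 1
        else:
            counts["edr_autonomous"] += 1
    return counts

# Hypothetical event records (made-up schema).
events = [
    {"action": "block", "policy_id": "agent-asr-01"},
    {"action": "block", "policy_id": None},
    {"action": "alert", "policy_id": "agent-asr-01"},
    {"action": "block", "policy_id": "vendor-default"},
]
result = attribute_blocks(events, {"agent-asr-01"})
```

Without such a split, every block would be credited to the agent under test, which would overstate its contribution whenever the EDR's own autonomous components intervene.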

2606.08148 2026-06-09 cond-mat.mtrl-sci cs.LG New submission

Inverse design of bespoke interatomic potentials via active learning by information-matching

Yonatan Kurniawan, Logan D. Williams, Amit Samanta, Ilia Nikiforov, Daniel Schwalbe-Koda, Mark K. Transtrum, Ellad B. Tadmor, Vincenzo Lordi, Vasily V. Bulatov

Affiliations: Department of Physics and Astronomy, Brigham Young University; Lawrence Livermore National Laboratory; Department of Aerospace Engineering and Mechanics, University of Minnesota; Department of Materials Science and Engineering, University of California; Cross Stream Consulting

AI summary: Applies the information-matching approach, which actively selects training data so that parameter uncertainty meets prescribed targets, to predict plastic strength in metals precisely from little data, with a post hoc correction to mitigate model error.

Abstract

Interatomic potentials (IPs) enable large-scale atomistic simulations beyond the reach of first-principles methods, but their predictive reliability depends critically on the selection of training data, quantified uncertainty, and model expressiveness. Active learning (AL) provides a principled framework for constructing efficient and accurate IPs, yet most strategies reduce parameter uncertainty without explicitly accounting for the specific material properties being predicted. The information-matching (IM) approach addresses this limitation by requiring that the selected training data provide at least as much parameter space information as needed to achieve prescribed uncertainty targets for selected quantities of interest (QoIs). Here, we apply IM to develop bespoke IPs specifically tailored for predicting plastic strength in metals. Due to the high computational cost of simulating plastic strength, we employ an indirect IM strategy that targets inexpensive intermediate QoIs that correlate with strength. The IM method enables precise parameter constraints with minimal training data, yielding precise predictions for both the intermediate QoIs and plastic strength. Yet, model error remains a key limitation, and a post hoc uncertainty inflation correction provides a viable means to mitigate this limitation. These findings illustrate both the promise and limits of uncertainty-aware AL for predicting complex material properties.
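
Illustrative sketch (a one-parameter simplification, not the authors' method): for a single parameter, the information needed to hit a target standard deviation on a quantity of interest follows from first-order error propagation, and candidate training data can be added until their summed Fisher information reaches that level. The greedy selection, Gaussian noise model, and all numbers below are assumptions; the paper's multi-parameter formulation is not reproduced.

```python
def select_training_data(sensitivities, noise_std, qoi_sensitivity, target_std):
    """Greedily pick data until parameter information meets the QoI target.

    sensitivities[m] is d(prediction_m)/d(theta) for candidate datum m.
    """
    # var(QoI) ~ qoi_sensitivity**2 / information <= target_std**2
    required = (qoi_sensitivity / target_std) ** 2
    information = [(s / noise_std) ** 2 for s in sensitivities]
    order = sorted(range(len(information)),
                   key=lambda m: information[m], reverse=True)
    chosen, total = [], 0.0
    for m in order:
        if total >= required:
            break
        chosen.append(m)
        total += information[m]
    return chosen, total, required

# Hypothetical candidates: datum 3 is the most sensitive to the parameter.
chosen, total, required = select_training_data(
    sensitivities=[0.5, 3.0, 1.0, 4.0],
    noise_std=1.0, qoi_sensitivity=2.0, target_std=0.5)
```

One well-chosen datum already supplies the required information here, which is the sense in which matching information to the target can keep the training set small.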

2606.08147 2026-06-09 q-bio.GN cs.LG New submission

Biological Reasoning-Informed Regression for Interpretable Regulatory DNA Activity Prediction

Yi Duan, Zhao Yang, Jiwei Zhu, Ying Ba, Chuan Cao, Bing Su

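Affiliations: Gaoling School of Artificial Intelligence, Renmin University of China; Zhongguancun Academy

AI summary: Proposes the R3LM framework, which teaches LLMs reasoning-informed regression through structured biological knowledge, achieving state-of-the-art performance on enhancer prediction while providing interpretable mechanistic explanations.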

Comments Accepted at KDD 2026 AI4Sciences Track

Abstract

DNA cis-regulatory elements (CREs) such as enhancers control gene expression levels. Accurately predicting regulatory activity from DNA sequences is valuable but challenging, as it requires understanding complex biological regulatory processes. Existing methods typically regress activity scores from sequences in a black-box manner, limiting both interpretability and regression performance. Meanwhile, large language models (LLMs) benefit from explicit reasoning processes, yet directly applying LLMs to raw DNA sequences performs poorly. In this paper, we bridge this gap by introducing R3LM, a framework that teaches LLMs reasoning-informed regression on regulatory DNA through structured biological knowledge. Specifically, we design a biologically grounded data format that structures DNA's regulatory information for improved LLM understanding, and construct CRE-ReasonBench, the first dataset that associates DNA sequences and activity scores with mechanistic reasoning traces. Through two-stage training that first teaches LLMs reasoning over structured biological information and then performs regression, R3LM achieves state-of-the-art performance on enhancer prediction across three cell types, outperforming both LLMs with raw sequence input and specialized DNA models while providing interpretable mechanistic explanations. We expect R3LM to serve as an interpretable reward model that can effectively assist biologists in CRE design. Code is available at https://github.com/DuanYi516/R3LM.

2606.08131 2026-06-09 cs.HC cs.AI 新提交

LCAM: A Framework for Diagnosing Interactional Alignment Failures in Conversational AI

LCAM:诊断对话式AI中交互对齐失败的框架

Manuele Reani, Hongyu Tian

发表机构 * School of Management and Economics, The Chinese University of Hong Kong, Shenzhen(香港中文大学(深圳)经管学院)

AI总结 提出分层认知对齐模型(LCAM),通过五层对齐和两种失调极性诊断对话式AI的交互失败,应用于LLM咨询案例揭示潜在危害。

详情
AI中文摘要

对话式AI越来越多地用于用户可能脆弱、不确定或依赖系统表面能力的场景中,提供建议、解释、安慰和决策支持。现有的对齐工作通常关注模型目标、偏好优化或输出正确性。然而,许多危害源于交互:系统如何构建权威、表达不确定性、模拟共情、支持推理以及使边界清晰。本文介绍了分层认知对齐模型(LCAM),这是一个用于诊断对话式AI中交互对齐失败的概念性和规范性框架。LCAM将对齐定义为系统行为、用户目标、任务需求和规范性上下文之间的校准匹配。它区分了五个匹配层:感知层、语义层、情感层、认知层和伦理层,以及两种失调极性:欠拟合和过度延伸。我们将LCAM应用于一个已发表的LLM咨询示例,展示了一个看似支持性的回应如何强化有害信念、模拟不适当的关怀并模糊角色边界。通过将对话失败转化为关于过度依赖、虚假亲密、自主性侵蚀、边界混淆和不适当信任的审计和治理问题,LCAM提供了一个超越准确性、有用性或信任度的评估对话式AI的理论和规范性视角。

英文摘要

Conversational AI is increasingly used for advice, interpretation, reassurance, and decision support in contexts where users may be vulnerable, uncertain, or dependent on the system's apparent competence. Existing alignment work often focuses on model objectives, preference optimization, or output correctness. Yet many harms arise through interaction: how systems frame authority, express uncertainty, simulate empathy, support reasoning, and make boundaries legible. This paper introduces the Layered Cognitive Alignment Model (LCAM), a conceptual and normative framework for diagnosing interactional alignment failures in conversational AI. LCAM defines alignment as a calibrated fit among system behavior, user goals, task demands, and normative context. It distinguishes five layers of fit: perceptual, semantic, affective, cognitive, and ethical, and two diagnostic polarities of misalignment: underfit and overreach. We apply LCAM to a published LLM counseling example, showing how an apparently supportive response can reinforce harmful beliefs, simulate inappropriate care, and obscure role boundaries. By translating conversational failures into audit and governance questions concerning over-reliance, false intimacy, autonomy erosion, boundary confusion, and inappropriate trust, LCAM offers a theoretical and normative lens for evaluating conversational AI beyond accuracy, helpfulness, or trust.

2606.08110 2026-06-09 math.FA cs.LG 新提交

New Fractional Ambiguity Function Integrated with CNN-Based Machine Learning for Signal Classification

基于CNN机器学习的分数阶模糊函数新方法用于信号分类

Aamir H. Dar, Prakhar Kumar Sonkar, Neeraj Kumar Sharma

发表机构 * Mehta Family School of Data Science & Artificial Intelligence(Mehta家族数据科学与人工智能学院) Indian Institute of Technology Guwahati(印度理工学院古瓦哈提分校)

AI总结 提出一种新的分数阶模糊函数(NFrAF),并集成到CNN框架中,用于信号分类,相比传统方法提高了分类精度。

详情
AI中文摘要

从分数阶傅里叶变换导出的新分数阶模糊函数(NFrAF)作为经典模糊函数的推广被引入。严格建立了NFrAF的基本分析性质,包括对称性、边缘性和Moyal型恒等式。在验证其检测和定位单分量及多分量线性调频(LFM)信号的能力后,将NFrAF集成到基于卷积神经网络的机器学习框架中用于信号分类。由于其优越的时频分辨率和定位能力,NFrAF比传统方法(如谱图和经典模糊函数)提供了更具信息量的输入表示。在模拟数据集上的实验结果表明分类精度持续提高,突显了所提表示在数据驱动信号分析中的有效性。

英文摘要

A new fractional ambiguity function (NFrAF) derived from the fractional Fourier transform is introduced as a generalization of the classical ambiguity function. The fundamental analytical properties of the NFrAF, including symmetry, marginality, and Moyal-type identities, are rigorously established. After verifying its ability to detect and localize monocomponent and multicomponent linear frequency modulated (LFM) signals, the NFrAF is integrated into a convolutional-neural-network-based machine learning framework for signal classification. Owing to its superior time-frequency resolution and localization, the NFrAF provides a more informative input representation than conventional methods such as the spectrogram and the classical ambiguity function. Experimental results on simulated datasets demonstrate consistent improvements in classification accuracy, highlighting the effectiveness of the proposed representation for data-driven signal analysis.

2606.08043 2026-06-09 cs.GR cs.CV 新提交

OmniFaceRig: Fully Automatic Inner-Mouth-Aware Face Rigging Across Diverse 3D Character Topologies

OmniFaceRig: 跨多种3D角色拓扑的全自动内口感知面部绑定

Chao Wang, Guangyao Ma, John Doublestein, Junming Chen, Yiming Lin, Zhaoen Su, Xiaomin Luo, Shiyang Cheng, Jie Shen, Doug Roble, Dilin Wang, Yilei Li, Rakesh Ranjan

发表机构 * Reality Labs, Meta(Meta现实实验室)

AI总结 提出全自动端到端管道OmniFaceRig,将静态表面网格转换为含内口几何的FACS绑定,支持人类、人形及多种动物拓扑,无需手动标注或模板。

详情
AI中文摘要

面部绑定——创建基于FACS的混合形状以及内口几何(牙齿、牙龈和舌头)——仍然是3D角色制作中的主要瓶颈。现有流程仍需要大量设计工作,特别是手动地标标注、每个角色的模板调整和内口放置。我们提出OmniFaceRig,一个全自动端到端管道,将静态表面仅3D角色网格(无预建模口腔)转换为内口感知的FACS绑定,包含多达155个混合形状、程序化拟合的牙齿、牙龈和舌头,以及重新打包的UV/纹理。OmniFaceRig支持多种拓扑——人类、人形、长吻动物(如狗、狼、狐狸)和短吻动物(如猫、熊、兔子、老虎)——无需手动地标、无需用户提供模板、无需每个资产的设置。该管道结合了混合VLM+CV可绑定性检查、多模型面部解析、密集关键点驱动的模板配准、程序化内口构建以及碰撞感知的混合形状迁移。对于非人类角色,OmniFaceRig选择拓扑特定的面部和内口模板,并使用碰撞感知的内口拟合来减少牙齿-面部交叉,而无需用户暴露于类别特定的调整。我们还公开发布了Omni-Bench,一个包含1000个双足3D角色的免费基准数据集,带有FACS面部混合形状和内口几何,涵盖人类、人形、猫、狗和其他动物。实验表明,在筛选后的Omni-Bench输入上,最终绑定成功率很高,分割集成几乎实现了完全的面部检测召回,以及可靠的内口放置和低穿透率。总之,OmniFaceRig为从静态生成的角色到动画就绪的面部绑定提供了一条自动化路径,适用于人类和非人类拓扑。

英文摘要

Facial rigging - creating FACS-based blendshapes together with inner-mouth geometry (teeth, gums, and tongue) - remains a major bottleneck in 3D character production. Existing pipelines still require substantial designer effort, especially for manual landmark annotation, per-character template adjustment, and inner-mouth placement. We present OmniFaceRig, a fully automatic end-to-end pipeline that converts a static surface-only 3D character mesh, with no pre-modeled oral cavity, into an inner-mouth-aware FACS rig with up to 155 blendshapes, procedurally fitted teeth, gums, and tongue, and re-packed UV/texture. OmniFaceRig supports diverse topologies - humans, humanoids, long-muzzled animals (e.g., dogs, wolves, foxes), and short-muzzled animals (e.g., cats, bears, rabbits, tigers) - with no manual landmarks, no user-provided templates, and no per-asset setup. The pipeline combines hybrid VLM+CV riggability checking, multi-model face parsing, dense keypoint-driven template registration, procedural inner-mouth construction, and collision-aware blendshape transfer. For non-human characters, OmniFaceRig selects topology-specific face and inner-mouth templates and uses collision-aware inner-mouth fitting to reduce teeth-face intersections without exposing users to category-specific tuning. We also publicly release Omni-Bench, a freely available benchmark dataset of 1,000 biped 3D characters with FACS facial blendshapes and inner-mouth geometry, spanning humans, humanoids, cats, dogs, and other animals. Experiments show high final rigging success on screened Omni-Bench inputs, nearly complete face detection recall from the segmentation ensemble and reliable inner-mouth placement with low penetration. Together, OmniFaceRig provides an automatic path from static generated characters to animation-ready facial rigs across both human and non-human topologies.

2606.08041 2026-06-09 cs.GR cs.CV 新提交

Wispy to Voluminous: Prior-free Multi-view Capture of Strand-level Facial Hair

从稀疏到浓密:无先验的多视角发丝级面部毛发采集

Jaeseong Lee, Giljoo Nam, Adrian Jarabo, Carlos Aliaga

发表机构 * KAIST(韩国科学技术院) Meta Codec Avatar Lab(Meta 编码人像实验室) Meta Reality Labs Research(Meta 现实实验室研究)

AI总结 提出从多视角图像自动重建面部毛发(胡须、眉毛等)的管线,将无结构3D高斯表示转换为显式曲线发丝,解决几何歧义,实现高保真发丝重建。

Comments 27 pages, 16 figures, supplementary included

详情
AI中文摘要

面部毛发是个人身份的一个决定性特征,但仍然是数字头像的关键瓶颈。最近的体积方法实现了照片级真实感,但将毛发烘焙到面部几何中,阻碍了可编辑性,并且无法解析稀疏的、发丝状结构。同时,头皮毛发重建方法针对密集的毛发体积,无法适应面部毛发稀疏、空间变化的特性。我们提出了一种管线,从多视角图像自动重建面部毛发——胡须、髭须、睫毛和眉毛,将无结构的3D高斯表示转换为显式的基于曲线的发丝表示。我们分四个阶段解决几何歧义:(i)优化由跟踪头部几何约束的3D高斯,以强制早期光线终止并抑制次表面噪声;(ii)追踪连续发丝,对频繁交叉和极端曲率具有鲁棒性;(iii)将发丝接地到表面,并通过物理动机的先验解决根尖歧义;(iv)通过光度优化下的不透明度驱动密度控制来细化重建。据我们所知,这是第一个从3D高斯表示重建高保真面部毛发发丝的方法。恢复的发丝忠实地保留了面部毛发特征的朝向和稀疏模式,并生成可直接用于下游生产任务的资产,包括面部动画和物理模拟、几何梳理和转移、外观编辑以及基于物理的渲染。

英文摘要

Facial hair is a defining trait of personal identity, yet remains a critical bottleneck for digital avatars. Recent volumetric methods achieve photorealism but bake hair into the underlying face geometry, preventing editability and failing to resolve sparse, strand-like structures. Meanwhile, scalp-hair reconstruction methods target dense hair volumes and do not transfer to the sparse, spatially-varying nature of facial hair. We present a pipeline that automatically reconstructs facial hair -- beard, mustache, lashes, and brows -- from multi-view images, converting an unstructured 3D Gaussian representation into an explicit curve-based strand representation. We resolve geometric ambiguities in four stages: (i) optimizing 3D Gaussians constrained by tracked head geometry to enforce early ray termination and suppress sub-surface noise; (ii) tracing continuous strands robust to frequent crossings and extreme curvature; (iii) grounding strands to the surface and resolving root-tip ambiguity via a physically-motivated prior; and (iv) refining the reconstruction through opacity-driven density control under photometric optimization. To our knowledge, this is the first method to reconstruct high-fidelity facial hair strands from a 3D Gaussian representation. The recovered strands faithfully preserve the orientation and sparsity patterns characteristic of facial hair, and yield assets immediately suitable for downstream production tasks, including facial animation and physical simulation, geometric grooming and transfer, appearance editing, and physics-based rendering.

2606.08036 2026-06-09 cs.IR cs.AI cs.CL 新提交

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

GIScholarBench: 在GIS研究中评估大语言模型的过度自信

Zongrng Li, Mingzheng Yang, Lei Zou, Hongxu Ma, Hao Tian, Siqi Zhou, Wenjing Gong, Kaili Zhang, Bingqian Chen, Mitch Zhang, Yifan Yang

发表机构 * Texas A&M University(德克萨斯农工大学) Google(谷歌) Department of Geography(地理系) Department of Landscape Architecture and Urban Planning(景观建筑与城市规划系)

AI总结 针对大语言模型在学术研究中的过度自信问题,构建了包含10865篇论文的GIScholarBench基准,通过元数据检索、文献链接和研究方向生成三项任务评估模型表现,发现所有模型均存在任务不变的过度自信现象。

详情
AI中文摘要

大型语言模型(LLMs)越来越多地用于学术研究工作流程,但学术任务需要高事实精度,因此暴露了一个关键弱点:过度自信。这里,过度自信被行为定义为即使在底层知识不完整或不可验证时,也倾向于产生自信、果断且格式良好的输出,而不是陈述信心与准确性之间的校准差距。为了研究这一问题,我们引入了GIScholarBench,这是一个基于2020年至2025年间发表在25个核心GIScience期刊上的10865篇论文构建的基准。该基准涵盖三个认知复杂度递增的任务:元数据检索、文献链接和研究方向生成。我们通过原生网络界面在真实用户条件下评估了Claude Sonnet 4.5、Gemini 3和ChatGPT 5.3。结果显示所有任务均存在一致的过度自信。在元数据检索中,ChatGPT 5.3取得了最高准确率,但所有模型在预测错误时仍生成确定的标题和DOI。在文献链接中,Claude Sonnet 4.5恢复了最多的参考文献,但所有模型在排名靠前的检索和更长的引文列表之间显示出明显差距,表明参考文献被扩展到可靠检索能力之外。在研究方向生成中,AI生成的方向相比真实未来引用论文显示出更低的主题覆盖率、更高的新颖性缺失率和更低的语义多样性。这些发现表明,LLM的过度自信是任务不变的,但表现形式不同:检索中的事实过度生成、文献链接中不可靠的引文扩展,以及研究构思中输出完整性的过度自信。

英文摘要

Large language models (LLMs) are increasingly used in academic research workflows, but scholarly tasks require high factual precision and therefore expose a key weakness: overconfidence. Here, overconfidence is defined behaviorally as the tendency to produce confident, assertive, and well-formatted outputs even when the underlying knowledge is incomplete or unverifiable, rather than as a calibration gap between stated confidence and accuracy. To examine this issue, we introduce GIScholarBench, a benchmark built from 10,865 papers published in 25 core GIScience journals between 2020 and 2025. The benchmark covers three tasks with increasing cognitive complexity: metadata retrieval, literature linking, and research direction generation. We evaluate Claude Sonnet 4.5, Gemini 3, and ChatGPT 5.3 through their native web interfaces under real-world user-facing conditions. Results show consistent overconfidence across all tasks. In metadata retrieval, ChatGPT 5.3 achieves the highest accuracy, but all models still generate definitive titles and DOIs when predictions are wrong. In literature linking, Claude Sonnet 4.5 recovers the most references, but all models show a clear gap between top-ranked retrieval and longer citation lists, suggesting that references are extended beyond reliable retrieval capacity. In research direction generation, AI-generated directions show lower topic coverage, higher novel miss rates, and lower semantic diversity than real future-citing papers. These findings suggest that LLM overconfidence is task-invariant but takes different forms: factual overgeneration in retrieval, unreliable citation expansion in literature linking, and overconfidence in output completeness during research ideation.

2606.08032 2026-06-09 stat.ML cs.LG 新提交

Variational Proximal Policy Optimization

变分近端策略优化

Ousmane Amadou Dia

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出变分近端策略优化(VP₂O),利用粒子变分推理和专家混合架构,通过几何近端控制机制解决强化学习中的策略模式崩溃和分布漂移问题,在复杂推理任务上取得显著提升。

详情
AI中文摘要

通过近端策略优化进行的人类反馈强化学习经常遭受策略模式崩溃、脆弱的探索循环和分布漂移。本文引入了变分近端策略优化(\(\textsc{VP}_2\textsc{O}\)),这是一种基于粒子的变分推理框架,将策略优化映射到专家混合架构中的Stein变分梯度下降。通过利用局部化专家原型上的函数核以及专家正交化损失,\(\textsc{VP}_2\textsc{O}\)引入了一种基于几何的近端控制机制,可以减少对固定裁剪或KL计划的依赖。我们在33B/4B稀疏专家混合模型上的结果显示,在复杂推理基准测试中取得了多项改进,在Codeforces上建立了\(+\mathbf{179}\) ELO增益,并在AIME数学推理任务上减少了\(\mathbf{32\%}\)的令牌数量。

英文摘要

Reinforcement Learning from Human Feedback via Proximal Policy Optimization often suffers from policy mode collapse, brittle exploration loops, and distribution drift. This paper introduces Variational Proximal Policy Optimization (\(\textsc{VP}_2\textsc{O}\)), a particle-based variational inference framework that maps policy optimization to Stein Variational Gradient Descent within a Mixture-of-Experts architecture. By leveraging functional kernels over localized expert prototypes alongside an expert orthogonalization loss, \(\textsc{VP}_2\textsc{O}\) introduces a geometry-based proximal-control mechanism that can reduce reliance on fixed clipping or KL schedules. Our results on a 33B/4B sparse Mixture-of-Experts model show several improvements across complex reasoning benchmarks, establishing a \(+\mathbf{179}\) ELO gain on Codeforces and a \(\mathbf{32\%}\) reduction in token count on AIME mathematical reasoning tasks.
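The abstract maps policy optimization to Stein Variational Gradient Descent (SVGD). As a rough illustration of what a plain SVGD update looks like, here is a minimal sketch for 1-D particles with an RBF kernel. It is not the VP₂O method: the fixed bandwidth, the step size, the standard normal target, and the function names `svgd_step` and `run_svgd` are all assumptions made for this example.

```python
import math


def svgd_step(particles, grad_log_p, bandwidth=1.0, step_size=0.1):
    """Apply one SVGD update to a list of 1-D particles (RBF kernel, fixed bandwidth)."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            diff = xj - xi
            k = math.exp(-diff * diff / (2.0 * bandwidth ** 2))
            # Attractive term: kernel-weighted score pulls xi toward high-density regions.
            attract = k * grad_log_p(xj)
            # Repulsive term: derivative of k(xj, xi) w.r.t. xj pushes xi away from xj.
            repulse = -diff / bandwidth ** 2 * k
            phi += attract + repulse
        updated.append(xi + step_size * phi / n)
    return updated


def run_svgd(particles, grad_log_p, steps=500):
    """Iterate svgd_step a fixed number of times (hypothetical driver)."""
    for _ in range(steps):
        particles = svgd_step(particles, grad_log_p)
    return particles


# Assumed target: standard normal, whose score function is -x.
final_particles = run_svgd([3.0, 3.5, 4.0, 4.5, 5.0], lambda x: -x)
```

The repulsive term is what keeps the particles from collapsing onto a single mode, which is the property the abstract appeals to when it mentions policy mode collapse.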

2606.08030 2026-06-09 cs.MA cs.AI 新提交

Voting Protocols as Coordination Mechanisms for Role-Constrained Multi-Agent Tutoring Systems

投票协议作为角色约束的多智能体辅导系统的协调机制

Eric S. Qiu, Joyce Gill

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 研究投票协议如何塑造四个角色约束的教学智能体之间的协调,通过比较四种投票协议在模拟辅导环境中的效果,发现协议选择显著影响集体决策和协调行为。

Comments Accepted to ICML 2026 Workshop on AI4Good

详情
AI中文摘要

智能体辅导系统带来了一个协调挑战:多个智能体可能提出不同但合理的干预措施,但只能向学习者提供一个响应。在本文中,我们研究了投票协议如何塑造四个角色约束的教学智能体之间的合作,这些智能体分别负责支架式引导、误解纠正、动机激发和元认知。我们在基于SciQ和HumanEval基准的两个模拟辅导环境中比较了四种投票协议——简单投票、排名投票、累积投票和批准投票。我们不是将投票用作简单的聚合步骤,而是用它来分析在部分教学冲突下集体决策规则如何塑造协调。在1200次模拟交互中,我们发现智能体审议和投票协议类型经常改变最终获胜的响应,表明两者都显著影响集体决策。不同的投票规则也产生不同的协调行为,即使是短暂的辅导回合也在模拟学生中显示出可测量的学习收益。总体而言,我们表明协议选择与角色专门化的教学智能体之间的不同协调模式相关。

英文摘要

Agentic tutoring systems introduce a coordination challenge: multiple agents may propose different but reasonable interventions, yet only one response can be delivered to the learner. In this paper, we study how voting protocols shape cooperation among four role-constrained pedagogical agents responsible for scaffolding, misconception, motivation, and metacognition. We compare four voting protocols -- simple, ranked, cumulative, and approval voting -- across two simulated tutoring environments on SciQ and HumanEval benchmarks. Rather than using voting as a simple aggregation step, we use it to analyze how collective decision rules shape coordination under partial pedagogical conflict. Across 1,200 simulated interactions, we find that agent deliberation and voting protocol type frequently change which response ultimately wins, showing that both meaningfully shape the collective decision. Different voting rules also produce distinct coordination behaviors, and even brief tutoring turns show measurable learning gains in simulated students. Overall, we show that protocol choice is associated with distinct coordination patterns among role-specialized pedagogical agents.
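To make the four protocols concrete, the sketch below tallies the same four agents under each rule. It is an illustration only: the abstract does not say how ranked voting is scored, so a Borda count is assumed here, and the ballots, the four-point cumulative budget, and the alphabetical tie-break are all hypothetical.

```python
from collections import Counter


def _winner(scores):
    """Return the highest-scoring candidate; ties are broken alphabetically."""
    return min(scores, key=lambda candidate: (-scores[candidate], candidate))


def simple_vote(rankings):
    """Plurality: each agent's top-ranked response gets one vote."""
    return _winner(Counter(ranking[0] for ranking in rankings))


def ranked_vote(rankings):
    """Borda count (an assumed reading of 'ranked'): last place scores 0, first scores n-1."""
    scores = Counter()
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            scores[candidate] += len(ranking) - 1 - position
    return _winner(scores)


def cumulative_vote(point_ballots):
    """Cumulative: each agent splits a fixed point budget across responses."""
    scores = Counter()
    for ballot in point_ballots:
        scores.update(ballot)
    return _winner(scores)


def approval_vote(approval_sets):
    """Approval: each agent approves any number of responses."""
    scores = Counter()
    for approved in approval_sets:
        scores.update(approved)
    return _winner(scores)


# Hypothetical ballots from the scaffolding, misconception, motivation,
# and metacognition agents over candidate responses A, B, and C.
rankings = [["A", "B", "C"], ["A", "B", "C"], ["B", "C", "A"], ["C", "B", "A"]]
points = [{"A": 4}, {"A": 3, "B": 1}, {"B": 2, "C": 2}, {"C": 4}]
approvals = [{"A", "B"}, {"A"}, {"B", "C"}, {"B", "C"}]
```

With these ballots, plurality and cumulative voting select response A while the Borda and approval rules select B, a small instance of the abstract's observation that protocol choice can change which response wins.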

2606.08020 2026-06-09 quant-ph cs.AI 新提交

Repair Before Veto, When Repair Is Hidden: Quantum-Accessible Features for Repair-Augmented Constraint Learning

在修复被隐藏时先修复再否决:面向修复增强约束学习的量子可访问特征

Yifan Wang

发表机构 * Yifan Wang(王一帆)

AI总结 提出Q-RACL框架,在硬约束决策中引入修复优先于否决的语义,通过量子特征访问解决离散对数隐藏的修复可行性推理问题,显著降低假否决率。

Comments 7 pages, 2 figures

详情
AI中文摘要

硬约束决策系统通常会否决不可行的候选方案。当系统可以采取行动时,这种做法过于僵化:如果已知一个可承受的修复能使不可行但有价值的候选变得可行,那么拒绝就是一个错误的否决,而非排序错误。我们引入了Q-RACL(量子修复增强约束学习),这是一个先修复再否决的框架,首先定义RACL决策语义,然后识别出量子特征访问可以承担关键作用的单一推理环节。RACL在顺序修复计划能恢复可行性和偏好时接受候选方案;否则返回结构化的拒绝理由。关键环节是修复可行性推理:从观察到的候选和上下文来看,哪个修复类别能恢复可行性。我们构建了一个离散对数隐藏的RACL族,其中修复类别是潜在指数a = log_g(x)中的移位区间规则,而学习器只观察到x = g^a mod p。在标准的基于DLP的学习分离下,这个坐标对高效的原始输入经典策略是不可访问的,但通过Shor/Fourier结构对量子智能体是可访问的。在六个素数和十个随机种子下,有界的原始输入经典策略和错误的原始傅里叶编码仍接近随机水平,而Q-DLP策略将假否决率保持在1.1%以下,赢得所有配对种子,并产生QNI_cond在0.9777到0.9972之间。一个经典的DLog预言机与之匹配,隔离了特征访问而非分类器容量。因此,量子AI不是作为通用模型升级添加的;对于这个DLP隐藏的修复族,它提供了缺失的特征,从而闭合了先修复再否决的循环。

英文摘要

Hard-constraint decision systems usually veto infeasible candidates. This is too rigid when the system can act: if a known affordable repair would make an infeasible candidate feasible and valuable, rejection is a false veto rather than a ranking error. We introduce Q-RACL (Quantum Repair-Augmented Constraint Learning), a repair-before-veto framework that first defines RACL decision semantics and then identifies the single inference link where quantum feature access can be load-bearing. RACL accepts a candidate when a sequential repair plan restores feasibility and preference; otherwise it returns structured rejection credit. The hard link is repair-feasibility inference: which repair class restores feasibility from an observed candidate and context. We construct a discrete-logarithm-hidden RACL family where the repair class is a shifted interval rule in the latent exponent a = log_g(x), while the learner observes only x = g^a mod p. Under standard DLP-based learning separation, this coordinate is inaccessible to efficient raw-input classical policies but accessible to a quantum agent through Shor/Fourier structure. Across six primes and ten seeds, bounded raw-input classical policies and a wrong raw-Fourier encoding remain near chance, whereas the Q-DLP policy keeps false-veto rate below 1.1%, wins all paired seeds, and yields QNI_cond = 0.9777 to 0.9972. A classical DLog oracle matches it, isolating feature access rather than classifier capacity. Thus quantum AI is not added as a generic model upgrade; for this DLP-hidden repair family, it supplies the missing feature that closes the repair-before-veto loop.
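The discrete-logarithm-hidden construction can be pictured with a toy instance. In the sketch below the repair class is an interval rule on the latent exponent a, while the learner only sees x = g^a mod p. The prime, generator, shift, and width are illustrative assumptions, and a brute-force search stands in for the classical DLog oracle; that search is only feasible because p is tiny, and it is not the quantum procedure the abstract refers to.

```python
P = 23          # small prime, chosen only for illustration
G = 5           # 5 is a primitive root modulo 23
ORDER = P - 1   # size of the multiplicative group


def observe(a):
    """What the learner sees: x = g^a mod p."""
    return pow(G, a, P)


def repair_class(a, shift=7, width=11):
    """Hypothetical shifted-interval rule on the latent exponent a."""
    return int((a - shift) % ORDER < width)


def discrete_log(x):
    """Brute-force log_g(x) mod p; a stand-in for a DLog oracle on tiny instances."""
    value = 1
    for a in range(ORDER):
        if value == x:
            return a
        value = value * G % P
    raise ValueError("x is not in the group generated by G")


def oracle_policy(x):
    """Predict the repair class from x by first recovering the latent exponent."""
    return repair_class(discrete_log(x))
```

Once the exponent is recovered, the rule is a trivial interval test, which mirrors the abstract's point that the difficulty lies in feature access rather than classifier capacity.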

2606.07968 2026-06-09 cs.CR cs.AI 新提交

RecurGuard: Runtime Monitoring for Reasoning-Token Consumption Attacks

RecurGuard: 推理令牌消耗攻击的运行时监控

Abid Aziz, Hafsa Binte Kibria

发表机构 * Department of Electrical & Computer Engineering(电气与计算机工程系) Rajshahi University of Engineering & Technology(拉杰沙希工程技术大学)

AI总结 RecurGuard通过监控推理轨迹的重复率、体积增长和查询进展三个信号,实时检测并阻止推理令牌消耗攻击,在DS-R1-Qwen-7B上对OverThink和ExtendAttack的检测率分别达99%和92%,且误报率接近零。

详情
AI中文摘要

具有推理能力的大型语言模型可能被诱导将其生成预算花在注入的诱饵任务上,而不是回答用户的问题,导致在没有产生最终答案时发生拒绝服务,以及在输出令牌计费时造成钱包耗尽。输入侧的安全分类器通常会漏掉这些攻击,因为注入的提示可能在语法上看起来是良性的。我们构建了RecurGuard,这是一个运行时监控器,用于在模型暴露推理轨迹时检测推理链消耗攻击。RecurGuard在推理轨迹生成时对其进行分析,并跟踪三个信号:重复率、体积增长以及向用户查询的进展。如果所有三个信号在连续三个块中保持异常,RecurGuard会提前终止生成。我们在开源推理模型上评估了RecurGuard对抗OverThink和ExtendAttack的效果,并在DS-R1-Qwen-7B上进行了自适应压力测试。在该模型上,RecurGuard检测到99%的OverThink攻击和92%的ExtendAttack实例,同时在问答、代码生成、数学和摘要任务上保持接近零的误报率。自适应评估揭示了该防御的局限性:主题攻击仍保持11.9倍的放大效果,联合漏检率约为50%,而完全语义规避将放大倍数从22.8倍降至2.2倍。当推理轨迹不可用时,QDM提供基于最终输出的事后回退监控器。

英文摘要

Reasoning-capable large language models can be induced to spend their generation budget on injected decoy tasks rather than answering the user's question, causing denial of service when no final answer is produced and denial of wallet when excess output tokens are billed. Input-side safety classifiers often miss these attacks because the injected prompts can appear syntactically benign. We build RecurGuard, a runtime monitor for detecting reasoning-chain consumption attacks when reasoning traces are exposed by the model. RecurGuard analyzes reasoning traces as they are generated and tracks three signals: recurrence rate, volume growth, and progress toward the user's query. If all three signals remain anomalous over three consecutive chunks, RecurGuard terminates generation early. We evaluate RecurGuard against OverThink and ExtendAttack across open-weight reasoning models and conduct adaptive stress tests on DS-R1-Qwen-7B. On this model, RecurGuard detects 99% of OverThink attacks and 92% of ExtendAttack instances while maintaining near-zero false positive rates on question answering, code generation, mathematics, and summarization. Adaptive evaluation reveals the limit of the defense: topical attacks retain 11.9x amplification with an approximately 50% joint miss rate, whereas full semantic evasion reduces amplification from 22.8x to 2.2x. When reasoning traces are unavailable, QDM provides a post-hoc fallback monitor based on the final output.
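The termination rule described in the abstract (stop when all three signals stay anomalous for three consecutive chunks) can be sketched as a small streak counter. How RecurGuard actually computes each signal is not stated in the abstract, so the sketch takes the per-chunk signal values as given; the threshold values and all class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkSignals:
    recurrence_rate: float  # share of repeated content in the chunk
    volume_growth: float    # growth of the trace relative to some baseline
    query_progress: float   # estimated progress toward the user's query


@dataclass(frozen=True)
class Thresholds:
    # Placeholder values chosen for illustration, not taken from the paper.
    max_recurrence_rate: float = 0.5
    max_volume_growth: float = 2.0
    min_query_progress: float = 0.1


class ConsecutiveAnomalyMonitor:
    """Signal a stop once all three signals are anomalous for `patience` chunks in a row."""

    def __init__(self, thresholds=Thresholds(), patience=3):
        self.thresholds = thresholds
        self.patience = patience
        self.streak = 0

    def is_anomalous(self, signals):
        t = self.thresholds
        return (signals.recurrence_rate > t.max_recurrence_rate
                and signals.volume_growth > t.max_volume_growth
                and signals.query_progress < t.min_query_progress)

    def observe(self, signals):
        """Update the streak with one chunk; return True if generation should stop."""
        self.streak = self.streak + 1 if self.is_anomalous(signals) else 0
        return self.streak >= self.patience


def first_stop_index(chunks, monitor):
    """Return the index of the chunk that triggers termination, or None."""
    for index, signals in enumerate(chunks):
        if monitor.observe(signals):
            return index
    return None
```

Requiring all three signals at once, and over consecutive chunks, trades detection speed for fewer false positives: a single benign chunk resets the streak.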

2606.07949 2026-06-09 q-bio.PE cs.CV eess.IV 新提交

Feasibility to detect rapid change and disappearance of seagrass: Lessons from nearly 80 years of vegetation change in the Ako, Seto Inland Sea, Japan

检测海草快速变化和消失的可行性:来自日本濑户内海Ako近80年植被变化的教训

Takehisa Yamakita, Yoji Igarashi, Akira Eto, Ken Ishida, Masaaki Iiyama

发表机构 * Japan Agency for Marine-Earth Science and Technology (JAMSTEC)(日本海洋地球科学技术机构) The University of Tokyo(东京大学) Tokyo University of Marine Science and Technology(东京海洋大学) Shiga University(滋贺大学)

AI总结 本研究利用近80年的航拍和卫星影像,结合YOLO深度学习分割,分析了日本Ako潮滩海草床的长期动态,发现2025年Zostera marina在一年内几乎完全消失,表明这是一次由夏季水温升高驱动的快速生态系统转变,并提出了改进海草监测指标的建议。

详情
AI中文摘要

本研究分析了日本濑户内海的Ako潮滩,该地的大叶藻(Zostera marina)在2025年一年内几乎全部消失。利用1940年代以来的航拍照片、高分辨率卫星影像、GRUS图像(2.5-5米)以及每月Sentinel-2合成图像(10米),我们重建了约80年的海草分布。基于深度学习的YOLO分割在这些数据集上实现了高精度(总体精度≥0.9);尽管无法区分物种,但模型捕捉了植被面积的主要时间动态。长期平均海草面积为6.8公顷,但数值波动很大,从1974年的3.5公顷到1989年的41.3公顷,除2025年的0.2公顷外。2019年至2026年的Sentinel-2合成图像显示出明显的季节性,植被在初夏增加,秋季开始减少。然而,2025年夏季后面积急剧下降,并在2025-2026年整个冬季保持异常低值。我们的结果表明,2025年的事件并非正常波动,而是一次快速生态系统转变,涉及优势冠层物种的丧失,最可能的原因是区域夏季水温升高。这些发现对海草基本海洋变量(EOVs)和TNFD对齐的自然相关披露中使用的自然状态(SoN)指标也有影响。与森林不同,海草草甸需要更精细的时间分辨率,因为显著的季节性和突然崩溃都会强烈影响面积指标。因此,除了先前指出的物种级分类精度等问题外,我们建议:(1)基线应在最长的可用记录上定义并进行生态学论证;(2)在年际比较前应用季节性标准化;(3)将面积异常极端的年份标记出来,而非用作参考点。

英文摘要

This study analyses the Ako tidal flat in the Seto Inland Sea, Japan, where nearly all Zostera marina disappeared within a single year in 2025. Using aerial photographs from the 1940s onward, high-resolution satellite imagery, GRUS images (2.5-5 m), and monthly Sentinel-2 composites (10 m), we reconstructed approximately 80 years of seagrass distribution. YOLO-based segmentation using deep learning achieved high accuracy (overall accuracy >= 0.9) across these datasets; although species could not be discriminated, the models captured the major temporal dynamics in vegetation area. The long-term mean seagrass area was 6.8 ha, but values fluctuated widely, from 3.5 ha in 1974 to 41.3 ha in 1989, apart from the 0.2 ha recorded in 2025. Sentinel-2 composites from 2019 to 2026 revealed clear seasonality, with vegetation increasing in early summer and declining from autumn. In 2025, however, the area decreased sharply after summer and remained anomalously low throughout the winter of 2025-2026. Our results indicate that the 2025 event was not a normal fluctuation but a rapid ecosystem shift involving the loss of the dominant canopy-forming species, most plausibly driven by regionally elevated summer water temperatures. The findings also have implications for seagrass Essential Ocean Variables (EOVs) and the State of Nature (SoN) metrics used in TNFD-aligned nature-related disclosures. Unlike forests, seagrass meadows require finer temporal resolution because both pronounced seasonality and abrupt collapse strongly influence area-based indicators. Therefore, in addition to previously noted issues such as species-level classification accuracy, we recommend that (1) baselines be defined over the longest available record and justified ecologically, (2) seasonal standardization be applied before inter-annual comparisons, and (3) years with extreme area anomalies be flagged rather than used as reference points.

2606.07943 2026-06-09 cs.CR cs.AI cs.CL 新提交

POISE: Position-Aware Undetectable Skill Injection on LLM Agents

POISE:面向LLM智能体的位置感知不可检测技能注入攻击

Haochang Hao, Dehai Min, Zhifang Zhang, Yunbei Zhang, Miao Xu, Yingqiang Ge, Lu Cheng

发表机构 * University of Illinois at Chicago(伊利诺伊大学芝加哥分校) University of Queensland(昆士兰大学) Tulane University(杜兰大学) Rutgers University(罗格斯大学)

AI总结 提出POISE攻击方法,通过位置感知将恶意指令压缩为单一良性指令嵌入技能正文,在保持隐蔽性的同时实现89.3%的攻击成功率,比随机位置基线高28.0个百分点。

Comments 20 pages, 2 figures, 5 tables

详情
AI中文摘要

智能体技能为扩展通用智能体提供了一种轻量级机制,但其开放格式使其容易受到技能投毒攻击。实际危险的注入必须保持不可见:如果执行有效载荷破坏了用户的合法任务,由此产生的失败信号会引发对技能的检查。因此,我们通过攻击成功率(ASR)来评估攻击,这要求注入的有效载荷得以执行,并且用户的任务在同一试验中仍能通过验证器。先前的技能投毒攻击在此视角下面临可靠性-隐蔽性权衡:YAML头部注入可靠加载但易被检查,而将显式恶意命令置于技能正文中的更隐蔽的注入方式则可靠性较低,因为脱离上下文的命令会引发智能体自身的怀疑。我们提出POISE,一种位置感知攻击,将触发器压缩为单个看似良性的正文指令,将其放置在可行位置,并使用上下文感知生成器使其与附近的设置或前提步骤融合。在Skill-Inject(使用codex+gpt-5.2)上,POISE实现了89.3%的ASR,比随机位置正文基线高28.0个百分点,比仅YAML基线高2.6个百分点,同时保留了正文放置的隐蔽性优势。这种隐蔽性是决定性的优势:由于合法的技能正文自然需要特权工具操作,LLM扫描器高度敏感,在四个评判者和两个基准测试中平均将74.6%的干净技能误报为高风险。融入这些误报中,POISE仅导致5.6%的投毒变体相比其干净基线获得新的高风险警报,使得当前的静态防御无效。

英文摘要

Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format exposes them to skill-poisoning attacks. A practically dangerous injection must stay invisible: if executing the payload derails the user's legitimate task, the resulting failure signal invites inspection of the skill. We therefore evaluate attacks by Attack Success Rate, which requires the injected payload to execute and the user's task to still pass its verifier in the same trial. Prior skill-poisoning attacks face a reliability-stealth trade-off under this lens: YAML-header injections are reliably loaded but easily inspected, whereas stealthier body injections that place explicit malicious commands in the skill prose are less reliable because out-of-context commands invite the agent's own suspicion. We introduce POISE, a position-aware attack that compresses the trigger into a single, benign-looking body instruction, placing it at a feasible position and using a context-aware generator to blend it with nearby setup or prerequisite steps. On Skill-Inject with codex+gpt-5.2, POISE achieves an 89.3% ASR, 28.0 points above a random-placement body baseline and 2.6 points above a YAML-only baseline, while retaining the stealth advantage of body placement. That stealth is the decisive margin: because legitimate skill bodies naturally require privileged tool operations, LLM scanners are hyper-sensitive, falsely flagging 74.6% of clean skills on average across four judges and both benchmarks. Blending into these false alarms, POISE causes only 5.6% of poisoned variants to gain a new high-risk alert over their clean baselines, rendering current static defenses ineffective.
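The Attack Success Rate used in the abstract is a joint condition: a trial counts only if the payload executed and the user's task still passed its verifier. A minimal sketch of that metric follows; the `Trial` record and its field names are hypothetical, and the sample trials are made up purely to exercise the function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    payload_executed: bool  # did the injected payload run in this trial?
    task_passed: bool       # did the user's legitimate task pass its verifier?


def attack_success_rate(trials):
    """Fraction of trials where the payload ran AND the user's task still passed."""
    if not trials:
        raise ValueError("at least one trial is required")
    successes = sum(1 for t in trials if t.payload_executed and t.task_passed)
    return successes / len(trials)


# Made-up trials: two joint successes, one derailed task, one payload that never ran.
example_trials = [
    Trial(True, True),
    Trial(True, False),
    Trial(False, True),
    Trial(True, True),
]
```

Under this definition a payload that executes but breaks the user's task does not count as a success, which is why the metric also reflects stealth.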

2606.07931 2026-06-09 math.PR cond-mat.stat-mech cs.IT cs.LG math.IT math.ST stat.TH 新提交

Pointwise Complexity for Gaussian Fields: Upper Envelopes, Algorithmic Lower Bounds, and Separation

高斯场的逐点复杂度:上包络、算法下界与分离

Yunbei Xu

发表机构 * National University of Singapore(新加坡国立大学)

AI总结 本文证明了一个方差感知的逐点主测度定理,为高斯过程提供高概率上包络,并通过贝叶斯算法下界和加权基示例,揭示了逐点复杂度与全局极小极大风险之间的分离。

详情
AI中文摘要

我们针对中心化高斯过程证明了一个方差感知的逐点主测度定理。经典的通用链(generic chaining)刻画了标量量$\mathbb E\sup_{x\in T}X_x$;这里的定理给出了整个场的同时高概率包络。对于环境先验$\mu$,在$x$处的包络由逐点Fernique-Talagrand泛函\[\Phi_\mu(x):=\int_0^{4\sigma(x)}\sqrt{\log\frac{1}{\mu(B_d(x,\varepsilon))}}\,d\varepsilon\]以及相应的高斯尾项控制。该定理提供了经典通用链的可重用场级精化,以及深度神经网络逐点经验过程界的高斯过程对应物。我们还从交互式Fano/数据处理原理记录了一个贝叶斯算法下包络。对于已知先验$\pi$、观测信道和具体估计量$\widehat t(Y)$,下界通过精确的幽灵小球(ghost small-ball)质量$\mathbb E_{Y\sim Q}\pi(B_d(\widehat t(Y),\Delta))$表示,而非最坏情况覆盖数。在高斯位置实验中,比较译码器将贝叶斯位置误差转化为决策对齐高斯范围的下界。然后我们构造一个简单的加权基示例,将固定先验的通常Fano松弛、贝叶斯算法下包络、选定子图集上的逐点高斯包络以及全类极小极大风险/全局高斯尺度分离开来。这些结果共同表明,在经典极小极大理论变得过于粗糙或依赖预言机的过参数化环境类中,算法下界为固定估计量提供了逐点复杂性的局部几何证书。

英文摘要

We prove a variance-aware pointwise majorizing-measure theorem for centered Gaussian processes. Classical generic chaining characterizes the scalar quantity $\mathbb E\sup_{x\in T}X_x$; the theorem here gives a simultaneous high-probability envelope for the entire field. For an ambient prior $\mu$, the envelope at $x$ is governed by a pointwise Fernique-Talagrand functional \[\Phi_\mu(x):=\int_0^{4\sigma(x)}\sqrt{\log\frac{1}{\mu(B_d(x,\varepsilon))}}\,d\varepsilon,\] together with the corresponding Gaussian tail term. The theorem provides a reusable field-level refinement of classical generic chaining and a Gaussian-process counterpart of pointwise empirical-process bounds for deep neural networks. We also record a Bayesian algorithmic lower envelope from the interactive Fano/data-processing principle. For a known prior $\pi$, an observation channel, and a concrete estimator $\widehat t(Y)$, the lower bound is expressed through the exact ghost small-ball mass $\mathbb E_{Y\sim Q}\pi(B_d(\widehat t(Y),\Delta))$, rather than a worst-case covering number. In Gaussian location experiments, comparison decoders convert Bayes location error into lower bounds on decision-aligned Gaussian ranges. We then construct an elementary weighted-basis example separating the usual Fano relaxation for a fixed prior, the Bayesian algorithmic lower envelope, the pointwise Gaussian envelope on the selected subatlas, and the full-class minimax risk/global Gaussian scale. Together, these results show that algorithmic lower bounds provide local-geometric certificates of pointwise complexity for fixed estimators in overparameterized ambient classes, precisely in regimes where classical minimax theory becomes either too coarse or oracle-dependent.

2606.07926 2026-06-09 stat.ML cs.LG 新提交

Barycentric Projections of Optimal Transport Plans on Riemannian Manifolds

黎曼流形上最优传输计划的重心投影

Kisung You

发表机构 * Baruch College(巴彻学院)

AI总结 提出黎曼流形上传输耦合的重心投影框架,通过条件Fréchet均值得到最佳确定性映射,并定义条件方差Monge缺陷,实验验证了内在投影与切向投影的不同作用。

详情
AI中文摘要

最优传输耦合是概率对象,而许多学习流程需要确定性映射。在欧几里得空间中,重心投影通过取条件期望将耦合转换为映射,但在黎曼流形上,曲率和割迹使这一操作变得不平凡。我们开发了一个黎曼流形上传输耦合的重心投影框架。内在投影将每个源点映射到其目标分布的条件Fréchet均值,并证明它是平方测地线损失下的最佳确定性代表。相应的最小值是积分条件Fréchet方差,该方差对于由映射诱导的耦合恰好为零,因此定义了一个条件方差Monge缺陷。我们还研究了一个切向log-exp投影,证明了其欧几里得精确性、在Monge情况下与Brenier-McCann映射的兼容性,以及其作为内在目标的第一单位黎曼梯度更新的解释。对于离散耦合,两种构造都按行分解为加权Fréchet均值和log-exp问题。在球面数据、合成SPD数据和真实EEG协方差矩阵上的实验支持所提出的角色分工:内在投影是变分代表,而切向投影是有用的局部位移代理。

英文摘要

Optimal transport couplings are probabilistic objects, while many learning pipelines require deterministic maps. In Euclidean space, barycentric projection converts a coupling into a map by taking conditional expectations, but on a Riemannian manifold curvature and cut loci make this operation nontrivial. We develop a framework for barycentric projections of transport couplings on Riemannian manifolds. The intrinsic projection maps each source point to the conditional Fréchet mean of its destination law and is shown to be the best deterministic representative under squared geodesic loss. The corresponding minimum value is an integrated conditional Fréchet variance, which vanishes exactly for map-induced couplings and therefore defines a conditional-variance Monge defect. We also study a tangential log-exp projection, prove its Euclidean exactness, its compatibility with Brenier-McCann maps in the Monge case, and its interpretation as the first unit Riemannian gradient update for the intrinsic objective. For discrete couplings, both constructions decompose row-wise into weighted Fréchet mean and log-exp problems. Experiments on spherical data, synthetic SPD data, and real EEG covariance matrices support the proposed division of roles: the intrinsic projection is the variational representative, while the tangential projection is a useful local displacement surrogate.
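For a discrete coupling, both projections reduce to row-wise computations. The sketch below works on the unit sphere with one row of coupling weights: the tangential projection averages log-mapped targets and applies the exponential map once, and the intrinsic projection is approximated by repeating that step as a fixed-point iteration for the weighted Fréchet mean. This is an illustration under assumptions, not the paper's implementation: the function names, the iteration count, and the ½·d² normalization (under which one unit gradient step coincides with the tangential projection) are choices made here.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    c = max(-1.0, min(1.0, _dot(p, q)))
    theta = math.acos(c)
    tangent = [qi - c * pi for pi, qi in zip(p, q)]
    norm = math.sqrt(_dot(tangent, tangent))
    if norm < 1e-12:
        # q coincides with p; antipodal q lies on the cut locus and is not handled here.
        return [0.0] * len(p)
    return [theta * t / norm for t in tangent]


def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p along v."""
    norm = math.sqrt(_dot(v, v))
    if norm < 1e-12:
        return list(p)
    return [math.cos(norm) * pi + math.sin(norm) * vi / norm for pi, vi in zip(p, v)]


def tangential_projection(x, targets, weights):
    """Log-exp projection: average the log-mapped targets at x, then map back."""
    total = sum(weights)
    mean_tangent = [0.0] * len(x)
    for y, w in zip(targets, weights):
        log_y = sphere_log(x, y)
        mean_tangent = [m + (w / total) * l for m, l in zip(mean_tangent, log_y)]
    return sphere_exp(x, mean_tangent)


def intrinsic_projection(x, targets, weights, iterations=50):
    """Approximate the weighted Fréchet mean by repeating the log-exp step from x."""
    m = list(x)
    for _ in range(iterations):
        m = tangential_projection(m, targets, weights)
    return m
```

When a row of the coupling puts all its mass on a single target, the tangential projection returns that target, matching the Monge case mentioned in the abstract. The fixed-point iteration is only expected to behave well when the targets are concentrated away from the cut locus.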

2606.07923 2026-06-09 cs.DB cs.AI cs.LG 新提交

Larch: Learned Query Optimization for Semantic Predicates

Larch: 面向语义谓词的学习型查询优化

Fuheng Zhao, Pawel Liskowski, Zihan Li, Benjamin Han, Puxuan Yu, Varich Boonsanong, Dimitris Tsirogiannis, Anupam Datta

发表机构 * Snowflake Inc.(Snowflake公司)

AI总结 提出Larch框架,利用嵌入增强的图神经网络和强化学习或监督学习优化AI SQL查询中语义过滤器的执行顺序,显著降低令牌开销。

详情
AI中文摘要

随着大型语言模型(LLM)的出现,许多数据库系统引入了语义运算符,使得能够对非结构化数据(如文本、图像、视频)进行分析查询。语义运算符通常会产生高昂的推理成本和延迟,使得语义(AI)SQL查询难以应用于大规模数据集。同时,其语义性质导致数据库引擎将其视为黑盒,使得AISQL查询难以优化。在本文中,我们介绍了Larch,一个用于优化AI SQL查询中语义过滤器执行的框架。Larch的灵感来自两个关键观察:i) 语义运算符的高延迟为计算密集型运行时优化技术留下了显著空间,ii) 非结构化数据通常伴随着嵌入形式的语义信息,允许在AI_FILTER提示和数据值之间进行高效的语义比较。基于这两个关键观察,我们提出了两种Larch变体:Larch-A2C和Larch-Sel。Larch-A2C使用嵌入增强的门控图神经网络编码任意语义过滤器表达式树,并将过滤器评估顺序表述为马尔可夫决策过程。相比之下,Larch-Sel利用监督学习模型预测过滤器选择性,随后应用动态规划为每个输入行找到接近最优的评估顺序。在多样化的真实世界数据集和全面的合成工作负载上进行评估,两种Larch变体在令牌使用方面始终优于现有的语义过滤器优化技术。我们的结果表明,Larch在不同工作负载下具有鲁棒性,与Palimpzest和Quest相比,将总令牌成本开销降低了3倍至19倍。

English abstract

With the advent of Large Language Models (LLMs), many database systems have introduced semantic operators that enable analytical queries over unstructured data (e.g., text, images, videos). Semantic operators typically incur high inference costs and latencies, making semantic (AI) SQL queries challenging to apply to large-scale datasets. At the same time, their semantic nature leads database engines to treat them as black boxes, making AI SQL queries difficult to optimize. In this paper, we introduce Larch, a framework for optimizing the execution of semantic filters in AI SQL queries. Larch is inspired by two key observations: i) the high latency of semantic operators leaves significant room for computationally heavy runtime optimization techniques, and ii) unstructured data are typically accompanied by semantic information in the form of embeddings, allowing for efficient semantic comparisons between AI_FILTER prompts and data values. Based on these observations, we present two Larch variants: Larch-A2C and Larch-Sel. Larch-A2C encodes arbitrary semantic filter expression trees using an embedding-augmented Gated Graph Neural Network and formulates the filter evaluation order as a Markov decision process. In contrast, Larch-Sel leverages a supervised learning model to predict filter selectivities, subsequently applying dynamic programming to find a near-optimal evaluation order for each input row. Evaluated across diverse real-world datasets and comprehensive synthetic workloads, both Larch variants consistently outperform existing semantic filter optimization techniques in terms of token usage. Our results demonstrate that Larch is robust across diverse workloads, reducing total token cost overhead by 3x-19x compared to Palimpzest and Quest.
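
The ordering problem behind Larch-Sel can be illustrated in its simplest setting: a conjunction of semantic filters with independent outcomes, where each filter has a token cost and a predicted pass rate (selectivity). The sketch below is not Larch's dynamic program and does not cover arbitrary expression trees; it applies the classical rank rule for independent conjunctive predicates and checks it against brute force. The costs, selectivities, and function names are hypothetical.

```python
from itertools import permutations

def expected_cost(order, cost, sel):
    """Expected token cost of evaluating AND-ed filters in the given order.
    A filter is only evaluated if every earlier filter passed."""
    total, reach = 0.0, 1.0
    for i in order:
        total += reach * cost[i]  # pay cost[i] with probability `reach`
        reach *= sel[i]           # probability the row survives filter i
    return total

def rank_order(cost, sel):
    """Classical rule for independent predicates: sort by cost / (1 - selectivity).
    Filters that never reject (selectivity 1) go last."""
    def rank(i):
        return cost[i] / (1.0 - sel[i]) if sel[i] < 1.0 else float("inf")
    return sorted(range(len(cost)), key=rank)

def brute_force_order(cost, sel):
    """Exhaustive search over all orders, usable only for a handful of filters."""
    return min(permutations(range(len(cost))),
               key=lambda order: expected_cost(order, cost, sel))

# Hypothetical per-filter token costs and predicted selectivities for one row.
cost = [120.0, 40.0, 300.0]
sel = [0.5, 0.9, 0.1]

order = rank_order(cost, sel)
print(order, expected_cost(order, cost, sel))
print(list(brute_force_order(cost, sel)))
```

The cheap filter is not evaluated first here because it rejects almost nothing; what matters is cost per unit of rejection probability, which is why predicted selectivities are useful inputs to an ordering step.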

2606.07914 2026-06-09 stat.ML cs.LG New submission

Identifiability and Estimation for Unlabeled Finite Mixtures under Marginal Independence


Takafumi Kanamori, Yushi Hirose, Shohei Yamamoto

Affiliations: Department of Mathematical and Computing Science, Institute of Science Tokyo; RIKEN Center for Advanced Intelligence Project

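AI summary: Studies how a marginal independence assumption can be used to recover latent components and estimate the mixing matrix in unlabeled finite mixture models; proposes the PM-MMD estimator and proves its convergence.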


English abstract

We study component recovery and mixing-matrix estimation from unlabeled finite mixtures whose observable distributions share the same latent components but have unknown mixing weights. The main identifying signal is marginal independence: each component is assumed to be independent on at least one coordinate pair, but no labels, clean component samples, or mixing weights are observed. We first prove a structural result for product components: under linear independence of the univariate marginals, any independent affine combination of the components must coincide with a single component. We then extend this principle to observable mixtures and show that, under full-rank and no-cancellation conditions, marginally independent affine combinations recover the corresponding latent components. When every component is independent on some coordinate pair, all components are identifiable, and the mixing matrix is recoverable under the stated completion conditions. Finally, we propose a Product-Marginal Maximum Mean Discrepancy (PM-MMD) estimator over affine combinations of the observable mixtures and prove uniform convergence and stability under approximate marginal independence. This framework also separates the empirical roles of the assumptions: irreducibility is, in general, not directly testable from the unlabeled mixtures alone, whereas marginal independence yields a candidate-level diagnostic through held-out PM-MMD. Controlled and flow-cytometry experiments show when marginal independence provides a useful recovery signal. In the reported multi-component comparisons, condition-aware representative selection stabilizes PM-MMD and improves recovery relative to clustering, factorization, and pairwise mixture-proportion baselines using the same unlabeled mixtures.

2606.07896 2026-06-09 physics.optics cs.CV New submission

Beyond the Thin-Layer Limit: Differentiable Volumetric Training for Visible-Range Diffractive Neural Networks


Dineth Jayakody, Dushan N. Wadduwage

Affiliations: Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA; School of Data Science, Old Dominion University, Norfolk, VA 23529, USA; Department of Physics, Old Dominion University, Norfolk, VA 23529, USA

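AI summary: Addresses the poor performance of visible-range diffractive neural networks caused by the thin-layer approximation by proposing a differentiable beam-propagation layer that models each diffractive element as a finite-thickness volume, substantially reducing design-to-device mismatch; under FDTD validation, classification accuracy rises from 50% to 90%.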


English abstract

Diffractive deep neural networks (D2NNs) promise miniaturized, power-efficient, light-speed optical front-ends for machine vision, yet the most mature demonstrations remain in the terahertz regime, built from readily fabricated millimeter-scale neurons. The failure to translate D2NNs to the visible range, where nearly all vision pipelines operate, was long blamed on the difficulty of fabricating nanoscale neurons. Yet even after recent advances removed that barrier, visible-range D2NNs matching their terahertz counterparts remain out of reach. We identify the true obstacle as the thin-layer approximation underlying nearly all D2NN training, which treats each diffractive layer as an infinitely thin mask. It fails not because of the short wavelength, as is commonly assumed, but because the low-refractive-index materials ($n \approx 1.3$-$1.5$) used at visible wavelengths require relief structures thick enough that intra-layer diffraction and phase accumulation become significant. To overcome this, we introduce a differentiable beam-propagation ($\partial$BPM) layer that models each element as a finite-thickness volume and propagates light through it during training, keeping the fabrication-compatible height map end-to-end trainable without full-wave simulation in the loop. Across MNIST, Fashion-MNIST, and CIFAR-100 classification and imaging, $\partial$BPM training substantially reduces the design-to-device mismatch, and full-wave FDTD validation raises classification accuracy from 50% to 90% without re-optimization. The $\partial$BPM layer thus offers a scalable, physics-aware bridge between efficient optical neural-network optimization and fabrication-consistent diffractive design.
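
To make the thickness argument concrete, the short calculation below uses the standard thin-element phase relation, in which a relief of height h in a material of index n delays the wave by 2π(n − n_ambient)h/λ relative to the surrounding medium. The 550 nm wavelength, the air ambient, and the function names are assumptions chosen for illustration; this is not the paper's beam-propagation model.

```python
import math

def thin_layer_phase(height, wavelength, n, n_ambient=1.0):
    """Phase delay (radians) a thin-mask model assigns to a relief of given height."""
    return 2.0 * math.pi * (n - n_ambient) * height / wavelength

def height_for_full_phase(wavelength, n, n_ambient=1.0):
    """Relief height needed to cover a full 2*pi phase range."""
    return wavelength / (n - n_ambient)

wavelength = 550e-9  # assumed visible wavelength in meters
for n in (1.3, 1.5):
    h = height_for_full_phase(wavelength, n)
    # Report the height both in micrometers and in units of the wavelength.
    print(n, round(h * 1e6, 3), round(h / wavelength, 2))
```

Under these assumptions, indices between 1.3 and 1.5 call for relief heights of roughly two to three wavelengths, which is consistent with the abstract's point that such layers are too thick to be treated as infinitely thin masks.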