arXivDaily: arXiv daily academic digest, updated Monday through Friday

2606.19635 2026-06-19 cs.IR cs.AI cs.LG New submission

Token Factory: Efficiently Integrating Diverse Signals into Large Recommendation Models

Xilun Chen, Shao-Chuan Wang, Baykal Cakici, Lukasz Heldt, Lichan Hong, Raghu Keshavan, Aniruddh Nath, Li Wei, Xinyang Xi

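AI summary: Proposes the Token Factory framework, which converts traditional signals into soft tokens and integrates them efficiently into transformer-based large recommendation models, avoiding prompt-length explosion while improving performance.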

Comments 8 pages, 10 figures

Abstract

Large Recommendation Models (LRMs) have demonstrated promising capabilities in industry-scale recommendation tasks. However, holistically integrating traditional signals into these transformer-based architectures effectively and efficiently remains a major challenge. Conventional approaches that "textualize" these signals directly or create discrete item representations often lead to excessively long prompts, substantial memory footprints, and high computational overhead. To overcome these limitations, we propose "Token Factory", a framework designed to transform traditional signals into "soft tokens" that can be directly processed by LRMs. This approach enables efficient integration and compression of heterogeneous input features, preventing prompt length explosion while enhancing model performance. We detail the architecture of Token Factory and present experimental results validating its effectiveness in a production-scale recommendation environment.
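
The soft-token idea can be illustrated with a minimal sketch. The abstract does not say how signals are converted, so this assumes (hypothetically) that each traditional signal is a dense feature vector and that a learned linear projection maps it to a single token in the model's embedding space. The function name `make_soft_tokens`, the dimensions, and the random weights are all illustrative and not taken from the paper.

```python
import numpy as np

def make_soft_tokens(features, projections):
    """Map each feature vector to one soft token of size d_model.

    features:    list of 1-D arrays, one per signal (lengths may differ)
    projections: list of (len(feature), d_model) matrices, one per signal
    Returns an array of shape (num_signals, d_model).
    """
    return np.stack([f @ W for f, W in zip(features, projections)])

rng = np.random.default_rng(0)
d_model = 8  # hypothetical embedding width

# Two hypothetical signals: a 5-dim user vector and a 3-dim context vector.
features = [rng.normal(size=5), rng.normal(size=3)]
projections = [rng.normal(size=(5, d_model)), rng.normal(size=(3, d_model))]

soft_tokens = make_soft_tokens(features, projections)

# Stand-in for the embeddings of four ordinary prompt tokens.
text_embeddings = rng.normal(size=(4, d_model))

# Prepend the soft tokens: the sequence grows by one position per signal,
# however many text tokens a textualized version would have required.
sequence = np.concatenate([soft_tokens, text_embeddings], axis=0)
```

In this sketch the prompt length grows with the number of signals rather than with their textual size, which is the property the abstract attributes to soft tokens.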

2606.19566 2026-06-19 eess.SY cs.AI cs.SY New submission

GDGU: A Gradient Difference-based Graph Unlearning Method for Cyberattack Localization in Electric Vehicle Charging Networks

Nanhong Liu, Mucun Sun, Jie Zhang

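AI summary: To meet data-deletion requests from electric vehicle charging stations, proposes gradient difference-based graph unlearning (GDGU), which unlearns efficiently through a first-order parameter correction, preserving localization performance while substantially reducing computational cost.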

Abstract

Electric vehicle charging stations (EVCSs) can expose distribution feeders to cyberattacks. While machine learning methods, including graph neural networks, can localize which bus is compromised, significant challenges remain in data sharing and model training. For example, privacy regulations grant EVCS owners the right to delete their training data from a deployed model, yet retraining from scratch on every request is computationally prohibitive. To address this, we study graph unlearning (GU) for EVCS cyberattack localization, formulated as a feature-level unlearning problem on a graph-level multi-label classification task. Specifically, we propose gradient difference-based graph unlearning (GDGU), which removes the influence of the requested deletion data through a first-order parameter correction. The correction is computed from the gradient difference between the original training data and a modified dataset in which only the charging power features at the requested EVCS buses are unlearned. Then, a batch-normalization recalibration and a brief recovery fine-tuning step are applied to restore localization utility. We benchmark GDGU against two second-order GU baselines on the IEEE 34-bus, 123-bus, and 8500-node distribution networks across three graph neural network backbones and cumulative unlearning scenarios. GDGU matches the strongest baseline on localization utility and reaches forgetting fidelity close to full-retraining, while unlearning 10 to 12 times faster than retraining from scratch and using far less memory than the second-order GU baselines.
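
A first-order correction driven by a gradient difference can be sketched on a toy linear model. This is only an assumed form of the update: the abstract does not give the exact GDGU rule, and the graph neural network, the batch-normalization recalibration, and the recovery fine-tuning are all omitted. The step size `eta` and the choice of feature 2 as the one to unlearn are hypothetical.

```python
import numpy as np

def grad(theta, X, y):
    """Gradient of the loss 0.5 * mean((X @ theta - y) ** 2)."""
    return X.T @ (X @ theta - y) / len(y)

def loss(theta, X, y):
    return 0.5 * np.mean((X @ theta - y) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

# Model trained on the original data (closed-form least squares).
theta = np.linalg.lstsq(X, y, rcond=None)[0]

# Modified dataset: feature 2 stands in for the feature to be unlearned.
X_mod = X.copy()
X_mod[:, 2] = 0.0

# Correct the parameters along the gradient difference between the
# modified and the original data, evaluated at the trained parameters.
eta = 0.1  # hypothetical step size
delta = grad(theta, X_mod, y) - grad(theta, X, y)
theta_unlearned = theta - eta * delta
```

Because `theta` is a stationary point of the original loss, the gradient difference here reduces to a single descent step on the modified data, which costs one gradient evaluation per dataset instead of a full retraining run.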

2606.19533 2026-06-19 cs.AR cs.AI New submission

A Tool for the Synthesis of Adaptive Probabilistic Processors Based on the Ising Model

Jonathan Juracy Carneiro da Silva, Leonardo R. Gobatto, Jose Rodrigo Azambuja

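AI summary: Proposes a tool that automatically synthesizes and simulates probabilistic architectures by mapping combinatorial optimization problems to the Ising model and adaptively selecting the update algorithm, improving convergence behavior and supporting hardware implementation.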

Comments ACM/IEEE/SBC/SBMICRO Symposium on Integrated Circuits and Systems Design 2026

Abstract

This work presents a tool for the synthesis and simulation of probabilistic architectures for solving combinatorial optimization problems by mapping them to the Ising model. The proposed approach automatically constructs the Ising Hamiltonian and determines the number of probabilistic elements (p-bits) based on problem characteristics such as size and topology. Furthermore, the tool introduces an adaptive strategy for selecting the most suitable update algorithm among Gibbs Sampling, Simulated Annealing (SA), Simulated Quantum Annealing (SQA), and cluster-based methods. Experimental results using benchmark problems demonstrate improved convergence behavior and flexibility compared to fixed approaches. The proposed framework enables systematic evaluation of probabilistic computing strategies and supports the development of future hardware implementations based on MTJs and p-bits.
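
The mapping from a combinatorial problem to an Ising model, followed by p-bit updates, can be sketched for Max-Cut on a 4-cycle. This is not the tool described in the abstract: the Hamiltonian sign convention, the linear annealing schedule, and all function names are assumptions chosen for illustration. Each spin is resampled with the standard p-bit rule, which is Gibbs sampling at inverse temperature `beta`.

```python
import numpy as np

def ising_from_maxcut(n, edges):
    """Coupling matrix J for H(s) = sum over edges of s_i * s_j.

    Minimizing H maximizes the number of cut edges.
    """
    J = np.zeros((n, n))
    for i, j in edges:
        J[i, j] = J[j, i] = 1.0
    return J

def energy(J, s):
    return 0.5 * s @ J @ s

def anneal(J, sweeps=200, beta_max=3.0, seed=0):
    """Sequential p-bit (Gibbs) updates while beta rises linearly."""
    rng = np.random.default_rng(seed)
    n = J.shape[0]
    s = rng.choice([-1.0, 1.0], size=n)
    best_s, best_e = s.copy(), energy(J, s)
    for beta in np.linspace(0.1, beta_max, sweeps):
        for i in range(n):
            local_field = -J[i] @ s  # J[i, i] is 0, so s[i] drops out
            # p-bit rule: s_i = +1 with probability (1 + tanh(beta * I)) / 2
            s[i] = 1.0 if np.tanh(beta * local_field) > rng.uniform(-1, 1) else -1.0
        e = energy(J, s)
        if e < best_e:
            best_s, best_e = s.copy(), e
    return best_s, best_e

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
J = ising_from_maxcut(4, edges)
best_s, best_e = anneal(J)
# Each cut edge contributes -1 to H and each uncut edge +1.
best_cut = int(round((len(edges) - best_e) / 2))
```

Holding `beta` fixed gives plain Gibbs sampling and raising it gives simulated annealing, so a selector among update algorithms like the one the abstract mentions could switch between such schedules.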

2606.20022 2026-06-19 stat.ML cs.LG math.OC New submission

Stochastic Linear Contextual Bandits with Bounded Noise: A Set-Membership Approach

Haonan Xu, Yingying Li

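AI summary: For stochastic linear contextual bandits with bounded reward noise, proposes the SME-OFU algorithm, based on set-membership estimation and the optimism principle, which achieves an $O(\log T)$ regret bound, improving on the optimal bound under sub-Gaussian noise.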

Comments 23 pages, 1 figure

Abstract

This paper considers stochastic linear contextual bandits (SLCB) with bounded reward noise. Existing works typically assume sub-Gaussian reward noise and bounded expected rewards, under which the optimal regret bound scales as $\tilde{O}(\sqrt{T})$ in terms of horizon $T$. However, in many applications, realized/observed rewards are also naturally bounded, implying bounded reward noise. Bounded noise is more informative than the sub-Gaussian condition but has not been leveraged explicitly in the SLCB literature. In this paper, we propose a novel algorithm SME-OFU by utilizing an uncertainty quantification method called set-membership estimation (SME) and applying the principle of optimism in the face of uncertainty (OFU). Our algorithm enjoys an improved regret bound $O(\log T)$. Notice that this does not contradict the existing optimal bound $\tilde{O}(\sqrt{T})$ for sub-Gaussian noise because bounded noise is a stronger condition. Finally, simulations show empirical improvements of SME-OFU over a benchmark algorithm designed for sub-Gaussian noise when the reward noise is bounded.
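
The two ingredients named in the abstract, set-membership estimation and optimism, can be sketched in one dimension. This is not the SME-OFU algorithm itself: it assumes a scalar parameter, a fixed finite arm set, and a known noise bound `w`, all of which are illustrative choices. With noise bounded by `w`, each observation `(x, y)` confines the parameter to the interval where `|y - x * theta| <= w`, and intersecting those intervals never discards the true value.

```python
import numpy as np

def sme_update(lo, hi, x, y, w):
    """Intersect [lo, hi] with {theta : |y - x * theta| <= w}, for x != 0."""
    a, b = (y - w) / x, (y + w) / x
    if a > b:
        a, b = b, a
    return max(lo, a), min(hi, b)

def optimistic_arm(arms, lo, hi):
    """OFU: pick the arm whose best-case reward over [lo, hi] is largest."""
    return max(arms, key=lambda x: max(x * lo, x * hi))

rng = np.random.default_rng(0)
theta_star, w = 0.7, 0.5      # hypothetical true parameter and noise bound
arms = [-1.0, 0.5, 1.0]       # hypothetical arm set
lo, hi = -2.0, 2.0            # prior bounds on theta

widths = []
for t in range(200):
    x = optimistic_arm(arms, lo, hi)
    y = x * theta_star + rng.uniform(-w, w)  # noise bounded by w
    lo, hi = sme_update(lo, hi, x, y, w)
    widths.append(hi - lo)
```

Unlike a confidence ellipsoid built from sub-Gaussian tail bounds, the feasible set here is exact: it contains the true parameter with certainty and can only shrink as data arrive.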

2606.20356 2026-06-19 math.OC cs.AI cs.LG math.PR stat.ML New submission

Robust $Q$-learning for mean-field control under Wasserstein uncertainty in common noise

Mathieu Laurière, Ariel Neufeld, Kyunghyun Park

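AI summary: Proposes a robust $Q$-learning algorithm for discrete-time mean-field control under Wasserstein uncertainty in the common-noise distribution, combining quantization-and-projection with Wasserstein duality; proves convergence and finite-time bounds for synchronous and asynchronous learning, and demonstrates the robustness-performance tradeoff on systemic risk and epidemic models.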

Abstract

In this article, we present a robust $Q$-learning algorithm for discrete-time mean-field control problems under Wasserstein uncertainty in the common noise law. The algorithm combines a quantization-and-projection scheme with a Wasserstein dual reformulation on the common-noise space. We establish its convergence together with finite-time iteration bounds for both synchronous and asynchronous learning schemes. Numerical experiments on systemic risk and epidemic models compare the asynchronous implementation with an idealized Bellman iteration, illustrate the robustness-performance tradeoff under common-noise misspecification, and report the observed convergence behavior of the asynchronous $Q$-learning algorithm.

2606.20082 2026-06-19 math.OC cs.DS cs.LG New submission

Beyond Averaging in John Ellipsoid Approximation: High-Accuracy Algorithms in the Leverage-Score Model

Xiaoyu Li, Junwei Yu, Jiaojiao Jiang, Junbin Gao, Andi Han

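AI summary: Separates the cost of John ellipsoid approximation algorithms into certification, identification, and accuracy; shows that the accuracy dependence is only doubly logarithmic; and proposes an accelerated method and a damped Newton method that achieve high-accuracy approximation in the leverage-score model.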

Abstract

The John ellipsoid of a symmetric polytope $P=\{\mathbf{x}\in\mathbb{R}^d:\|\mathbf{A}\mathbf{x}\|_\infty\le1\}$, $\mathbf{A}\in\mathbb{R}^{n\times d}$, is computed by a long line of leverage-score algorithms, from Cohen, Cousins, Lee and Yang (COLT 2019) to its successors [WY24, CLS+25], all reaching a $(1+\varepsilon)$-approximation in $\Theta(\varepsilon^{-1}\log(n/d))$ iterations. We separate this complexity into three costs the modern line conflates (certification, identification, and accuracy) and locate the historical $\varepsilon^{-1}$ in the first alone. In the equivalent D-optimal-design form $\min_{\mathbf{p}\in\Delta_n}-\log\det(\sum_i p_i\mathbf{a}_i\mathbf{a}_i^\top)$, the leverage-score oracle is exactly the first-order oracle and the $(1+\varepsilon)$-John guarantee the Frank-Wolfe gap $g(\mathbf{p})\le\varepsilon d$; through this dictionary the costs come apart. The $\varepsilon^{-1}$ is a certification artifact: the uniform average of the iterates, the certificate used throughout the line, has gap exactly $\Theta(1/T)$, however cheap each iteration is made. Pointed instead at the last iterate the same oracle is fast: a warm-started accelerated method reaches the guarantee in $C(\mathbf{A})+O(\sqrt{\kappa}\log(1/\varepsilon))$ queries after an $\varepsilon$-independent setup $C(\mathbf{A})$, and once the optimal face is identified the facial problem is an unconstrained self-concordant minimization whose Hessian the oracle recovers exactly, so damped Newton needs only $O(\log\log(1/\varepsilon))$ steps, for a total of $C(\mathbf{A})+O(d^2\log\log(1/\varepsilon))$ queries. The accuracy dependence is thus doubly logarithmic after an $\varepsilon$-independent, condition-dependent setup; the open problem is the remaining identification cost (a condition-free bound on reaching the optimal face) and lower bounds. Accuracy is not the obstruction.
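
The dictionary between leverage scores and the Frank-Wolfe gap can be checked numerically. The sketch below assumes the weights are normalized to the simplex and uses the classical multiplicative fixed-point update for D-optimal design; it is not the accelerated or damped Newton method of the paper, and the random test matrix is illustrative. For the objective $-\log\det\mathbf{M}(\mathbf{p})$ with $\mathbf{M}(\mathbf{p})=\sum_i p_i\mathbf{a}_i\mathbf{a}_i^\top$, the gradient entries are $-\sigma_i$ with $\sigma_i=\mathbf{a}_i^\top\mathbf{M}(\mathbf{p})^{-1}\mathbf{a}_i$, and since $\sum_i p_i\sigma_i=d$ the gap is $\max_i\sigma_i-d$.

```python
import numpy as np

def leverage_scores(A, p):
    """sigma_i = a_i^T M(p)^{-1} a_i with M(p) = sum_i p_i a_i a_i^T."""
    M = A.T @ (p[:, None] * A)
    return np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)

def fw_gap(A, p):
    """Frank-Wolfe gap of -log det M(p) over the simplex: max_i sigma_i - d."""
    return leverage_scores(A, p).max() - A.shape[1]

def fixed_point_iterates(A, T):
    """Multiplicative update p_i <- p_i * sigma_i / d, from uniform weights."""
    n, d = A.shape
    p = np.full(n, 1.0 / n)
    iterates = []
    for _ in range(T):
        p = p * leverage_scores(A, p) / d  # stays on the simplex
        iterates.append(p)
    return iterates

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 3))  # hypothetical instance with n = 30, d = 3

iterates = fixed_point_iterates(A, T=50)
p_last = iterates[-1]
p_avg = np.mean(iterates, axis=0)  # uniform average of the iterates

gap_last = fw_gap(A, p_last)
gap_avg = fw_gap(A, p_avg)
```

Comparing `gap_last` and `gap_avg` for increasing `T` is one way to observe the difference between certifying the averaged iterate and certifying the last one.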

2606.20062 2026-06-19 math.OC cs.LG math.PR New submission

Optimal Coarse Correlated Equilibria in Mean Field Games: Linear Programming and No-Regret Learning

Luciano Campi, Federico Cannerozzi, Ioannis Tzouanas

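AI summary: For continuous-time mean field games, gives a linear programming characterization of optimal coarse correlated equilibria and designs a no-regret learning algorithm based on Lagrangian duality, with convergence rates.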

Comments 55 pages, 3 figures

Abstract

We introduce optimal coarse correlated equilibria for continuous-time mean field games. A coarse correlated equilibrium is a randomized recommendation scheme from which no player can gain by ignoring the recommendation and switching to an alternative strategy. The problem is as follows: a moderator selects, among all mean-field coarse correlated equilibria, one that optimizes a prescribed performance criterion, which may differ from the representative player's objective. After formulating the problem, we develop a linear programming (LP) formulation, prove the existence of optimal LP coarse correlated equilibria, and relate the LP characterization to the original probabilistic setting. Building on this characterization, we design a no-regret primal-dual algorithm, based on an equivalent Lagrangian formulation of the external-regret constraint, for learning such equilibria. We provide explicit convergence rates for the learning algorithm, and numerical examples illustrate the method.

2606.19859 2026-06-19 cs.IT cs.LG math.IT math.PR math.ST stat.TH New submission

Doeblin Curves

Dongmin Lee, William Lu, Anuran Makur, Japneet Singh

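AI summary: Introduces Doeblin curves, which quantify the contraction behavior of Markov kernels at different levels of divergence and power, and applies them to finer-grained contraction analysis in noisy iterative optimization, reliable computation with noisy circuits, and differential privacy.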

Comments 42 pages, 2 figures

Journal ref IEEE Transactions on Information Theory, vol. 72, no. 6, pp. 3556-3596, June 2026

Abstract

Recent research on Doeblin coefficients has shed light on their usefulness as a multi-way generalization of the Dobrushin contraction coefficient for TV distance, in a separate vein from their classic role in the theory of Markov chain ergodicity. However, strong conditions, such as being bounded away from 0, are typically necessary for Doeblin coefficients to establish the existence of information contraction. Building on recently formulated concepts of nonlinear information contraction, we aim to propose a finer-grained Doeblin-based characterization of multi-way contraction behavior which yields non-vacuous contraction guarantees even for channels whose Doeblin coefficient is 0. To this end, we introduce the notion of a Doeblin curve -- a nonlinear function which quantifies the contraction behavior of a Markov kernel on collections of input distributions at specific levels of divergence and power. Through the course of our analysis, we develop a new variational characterization of Doeblin coefficients, present several properties of Doeblin curves, define several versions of power-constrained Doeblin curves, and derive upper and lower bounds using our aforementioned variational characterization. We then utilize these results in diverse areas, including generalization bounds for noisy iterative optimization, error bounds for reliable computation with noisy circuits, and differential privacy guarantees for online iterative algorithms. In particular, we extend results in these areas to broader domains or group settings, leveraging Doeblin curves to reveal finer-grained contraction phenomena than Doeblin coefficients.

2606.19368 2026-06-19 math.NA cs.LG cs.NA math.OC New submission

Neural Architectures as Functional Priors in Physics-Informed Control Problems

Sonia Rubio Herranz, Fernando Carlos López Hernández, Antonio López Montes

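AI summary: Studies neural architectures as implicit functional priors in control problems governed by ordinary differential equations, finding that different architectures (MLPs and Fourier-based KANs) produce qualitatively different controls under identical conditions, a functional specialization phenomenon.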

Comments 17 pages, 6 figures. Physics-informed neural networks, optimal control, spectral bias, Kolmogorov-Arnold Networks

Abstract

In this work we investigate the role of neural architectures as implicit functional priors in control problems governed by ordinary differential equations. Rather than focusing on highly complex problems, our objective is to investigate architecture-dependent effects in controlled dynamical systems within the simplest physically interpretable settings possible. In particular, we study a controlled linear RLC electrical circuit and a nonlinear Duffing-type dynamical system. Both systems are analyzed first through classical optimal-control formulations and later through PINN-based approaches. We compare different combinations of multilayer perceptrons (MLPs) and Fourier-based KAN-like architectures, and analyze their influence on the resulting controls. The numerical experiments suggest that different architectural choices systematically generate qualitatively distinct controls, even under identical governing equations, loss functionals, initial and target states, training parameters and physical constraints. Significant differences appear in the spectral structure, smoothness, energy distribution, and phase-space behavior of the learned solutions. A central observation of this work is the emergence of a functional specialization phenomenon when the neural architectures are allowed sufficient freedom to shape the structure of the learned controls. More specifically, in the systems considered here, Fourier-based architectures tend to produce trajectories with richer oscillatory content, whereas smoother low-frequency-biased architectures tend to generate more regular and energetically efficient controls. This suggests that different functional components of the control problem may be handled more efficiently by different neural architectures, leading to an implicit specialization between state representation and control generation.

2606.20485 2026-06-19 q-fin.RM cs.AI nlin.AO physics.soc-ph New submission

Optimal Order of Multi-Agent and General Many-Body Systems

Jake J. Xia

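AI summary: Proposes a general framework for analyzing multi-agent systems based on agents' power and response functions, derives macroscopic properties, and introduces a risk-appetite coefficient to study the trade-off between growth and resilience, yielding an optimal degree of order.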

Comments Key Words: Many body systems, multi agent crowd interactions, feedback loops, agent power, response function, utility function, risk appetite, order, optimal order, fragility, mobility, synchronization, useful energy, entropy, concentration, correlation, task dependency, receiver dependency, collective intelligence, AI model scaling law

Abstract

This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents' actions and collective observations. The framework is built on two fundamental agent-level variables: power, which measures agent influence on collective outcomes, and response functions, which determine how agents react to observations. We derive how macroscopic properties, including total power, useful power, entropy, order, fragility, and mobility, emerge from these two variables of heterogeneous agents. To study the trade-off between growth and resilience, we introduce a system-level utility function parameterized by a risk-appetite coefficient and derive an optimal degree of order that balances productivity, stability, and adaptability. The analysis suggests that stronger synchronization can increase collective output but may also increase systemic fragility and reduce mobility. We further argue that order, entropy, information, and useful energy are task-dependent and system-relative concepts whose meanings depend on the objectives of the system. By measuring and designing agent power distributions and response functions, it may be possible to better understand, predict, and optimize collective behavior and identify the conditions under which collective intelligence and optimal order emerge.

2606.20299 2026-06-19 stat.ML cs.LG hep-ph physics.data-an New submission

Statistical Properties of Training & Generalization

Itay Lavie, Noam Levi, Yonatan Kahn

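AI summary: Examines the key features and surprises of deep learning from a physics perspective, reviewing neural scaling laws and their interplay with the constraints and inductive biases of physics problems.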

Comments 32 pages, 3 figures. Part of the VERaiPHY initiative

Abstract

Deep learning has managed to evade numerous intuitions from classical statistics to achieve unprecedented performance on a number of real-world tasks. In this article, we investigate the key features and surprises of deep learning from a physics-informed perspective, taking care to point out and justify where possible the many choices inherent in constructing a deep learning model. In particular, we review the phenomenon of neural scaling laws and discuss their interplay with the constraints and inductive biases which may be present when applying machine learning to problems in physics.

2606.19781 2026-06-19 hep-ex cs.AI New submission

Towards Engineering Scaling Laws with Pretraining Data Composition

Jan-Lucas Uslu, Kevin Greif, Daniel Whiteson, Benjamin Nachman

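AI summary: Studies how engineering the composition of pretraining data (more diversity and better alignment with the downstream task) changes the scaling behavior of neural networks in particle physics, shifting it toward data scaling rather than model scaling.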

Abstract

Neural scaling laws describe how model performance improves as a power law in compute, model size, and dataset size. While well-established for large language models, these relationships are emerging for large models in particle physics. As with language, empirical studies show that the performance scales as a power law. However, unlike natural language or image domains, fundamental physics has high-fidelity simulators that produce synthetic data cheaply. This favors scaling regimes where additional data is cheaper than additional parameters, and allows the pretraining dataset itself to be engineered to influence the scaling. For the task of classifying hadronic jets produced in collisions of high-energy particle beams, we show that the scaling behavior can be engineered towards requiring more data rather than larger models by inclusion of pretraining data which is more diverse and better aligned with the downstream classification task.
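
A power-law relationship of the kind described here is usually estimated by a straight-line fit in log-log space. The sketch below assumes the simplest form, loss equal to `a * n**(-b)` with no irreducible-loss offset, and uses synthetic noiseless data; the constants are illustrative and do not come from the paper.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss = a * n**(-b) by least squares on logs; returns (a, b)."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic example: dataset sizes and losses that follow an exact power law.
n = np.array([1e3, 1e4, 1e5, 1e6])
loss = 5.0 * n ** -0.3

a_hat, b_hat = fit_power_law(n, loss)
```

Fitting one exponent against dataset size and another against parameter count is a simple way to compare how strongly performance responds to more data versus a larger model.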

2606.19149 2026-06-19 cs.CR cs.LG New submission

OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing

Nahum Korda, Gadi Evron

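AI summary: Presents OpenAnt, a system that combines static analysis with LLM reasoning in a three-stage pipeline of code decomposition, adversarial verification, and dynamic testing, discovering previously unknown vulnerabilities while reducing false positives.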

Abstract

Automated vulnerability discovery in large codebases remains challenging: traditional static analysis produces high false-positive rates, while dynamic approaches such as fuzzing require substantial infrastructure and often target narrow classes of bugs. Recent advances in large language models (LLMs) enable semantic reasoning about program behavior, but applying LLMs to repository-scale security analysis introduces challenges related to context management, cost, and verification. We present OpenAnt, an open-source vulnerability discovery system that integrates static program analysis with LLM-based reasoning in a multi-stage pipeline. OpenAnt introduces three key techniques. First, codebases are decomposed into self-contained analysis units filtered by reachability from external entry points, reducing the analysis surface by up to 97% while preserving attack-relevant code. Second, candidate vulnerabilities undergo adversarial verification through constrained attacker simulation, where the model evaluates exploitability under realistic attacker capabilities. Third, findings are validated through dynamic verification, in which exploit environments are generated automatically, executed in sandboxed containers, and discarded after use. Evaluation on widely used open-source projects including OpenSSL, WordPress, and Flowise shows that this architecture can identify previously unknown vulnerabilities while maintaining manageable analysis cost and substantially reducing false positives. Our results suggest that closed-loop vulnerability discovery pipelines, combining semantic reasoning with exploit validation, provide a practical path toward scalable automated security analysis. OpenAnt is released as open source under the Apache 2.0 license at https://github.com/knostic/OpenAnt.

2503.04507 2026-06-19 q-bio.QM cs.CG cs.LG Cross-list

The Morse Transform for Discrete Shape Analysis

Alexander M. Tanaka, Aras T. Asaad, Richard Cooper, Vidit Nanda

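AI summary: Proposes a topological transform based on directional piecewise-linear Morse theory that quantifies the geometry of embedded objects by recording critical points across multiple height functions; the resulting feature vectors achieve the best mean AUROC in ligand-based virtual screening.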

Comments 37 pages, 3 main figures, 2 main tables, 12 appendix figures and 4 appendix tables

Abstract

The geometry of an object plays a vital role in modulating its interactions with the physical world. It nevertheless remains difficult to describe geometric information numerically for the purposes of statistical inference or classification tasks. Here, we introduce a new topological transform which leverages directional piecewise-linear Morse theory to quantify the geometry of an embedded object by cataloguing critical points across multiple height-functions. The output of this Morse transform records both the heights and the local topological type (peak, trough or saddle) of the critical points that characterise the underlying shape, retaining finer information than the Euler characteristic transform whilst naturally prioritising a shape's outermost regions. Crucially, this output can be further compressed into a rich but compact feature vector. We benchmark the Morse feature vector as a descriptor for ligand-based virtual screening (LBVS), which intrinsically depends on the shape of molecules. Under a common gradient-boosted tree classification pipeline, Morse descriptors achieve the highest mean AUROC when compared to other topological transform descriptors and to standard shape-based LBVS descriptors.

2606.04101 2026-06-19 cs.DC cs.LG Replacement

UltraEP: Unleash MoE Training and Inference on Rack-Scale Nodes with Near-Optimal Load Balancing

Xinming Wei, Chao Jin, Tuo Dai, Yinmin Zhong, Shan Yu, Chengxu Yang, Bingyang Wu, Zili Zhang, Jing Mai, Qianchao Zhu, Zhouyang Li, Yuliang Liu, Guojie Luo

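AI summary: Presents UltraEP, the first exact-load, real-time balancer, which co-designs plan solving and expert-replication communication to rebalance every microbatch and layer for MoE training and inference on rack-scale nodes, reaching 94.3% of the force-balanced ideal throughput.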

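AI-generated abstract (translated from Chinese)

Large-scale expert parallelism (EP) is becoming key to training and serving frontier MoE models, but it also worsens device-level expert load imbalance, causing compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers periodically redistribute experts based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first exact-load, real-time balancer for large-EP MoE training and serving prefill on rack-scale nodes (RSNs). Building on the extended scale-up connectivity of RSNs, UltraEP rebalances every microbatch and layer on the critical path, which requires a nontrivial co-design of plan solving and expert-replication communication to minimize exposed overhead. To this end, UltraEP reacts eagerly to post-gating load through efficient quota-driven planning, and executes the resulting irregular expert-state transfers using RSN-native persistent tile streaming and relay-based fan-out mitigation. Averaged over training and prefill across MoE models from 106B to 671B parameters, UltraEP achieves 94.3% of the force-balanced ideal throughput, a 1.49$\times$ improvement over no balancing, while reducing the final inter-rank imbalance from 1.30$-$4.01 to 1.01$-$1.04. In addition, we validate the scalability and robustness of UltraEP in production MoE training on 2,560 GPUs.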

Abstract

Large-scale expert parallelism (EP) is becoming pivotal for training and serving frontier MoE models, but it also amplifies device-level expert load imbalance into compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers redistribute experts periodically based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first exact-load, real-time balancer for large-EP MoE training and serving prefill on rack-scale nodes (RSNs). Leveraging the extended scale-up connectivity among dozens of GPUs within RSNs, UltraEP rebalances every microbatch and layer on critical paths, which requires nontrivial co-design of plan solving and expert replication communication to minimize exposed overhead. To this end, UltraEP eagerly reacts to post-gating load with an efficient quota-driven planner, and executes the resulting irregular expert-state transfers with RSN-native persistent tile streaming and relay-based fan-out mitigation. We evaluate UltraEP in a multi-RSN deployment of up to 256 GPUs, using cutting-edge MoE models from 106B to 671B parameters. Averaged across training and serving, UltraEP achieves 94.3% of the force-balanced ideal throughput, delivering 1.49$\times$ improvement over no-balancing, while reducing the final inter-rank imbalance from 1.30$-$4.01 to 1.01$-$1.04.

2502.19193 2026-06-19 cs.SI cs.AI cs.NE Replacement

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

Jinyu Cai, Yusei Ishimizu, Mingyue Zhang, Munan Li, Jialong Li, Kenji Tei

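AI summary: Proposes an LLM-based multi-agent framework that uses a genetic algorithm to simulate the iterative evolution of users' language strategies under regulation; experiments show that more dialogue rounds improve the accuracy of information transmission and the persistence of dialogue.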

Comments The manuscript has been accepted to IEEE Transactions on Computational Social Systems

Abstract

Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints. In this framework, participant agents, as social media users, continuously evolve their language expression, while supervisory agents emulate platform-level regulation by assessing policy violations. To achieve a more faithful simulation, we employ a dual design of language strategies (constraint and expression) to differentiate conflicting goals and utilize an LLM-driven GA (Genetic Algorithm) for the selection, mutation, and crossover of language strategies. The framework is evaluated using two distinct scenarios: an abstract password game and a realistic simulated illegal pet trade scenario. Experimental results demonstrate that as the number of dialogue rounds increases, both the number of uninterrupted dialogue turns and the accuracy of information transmission improve significantly. Furthermore, a user study with 40 participants validates the real-world relevance of the generated dialogues and strategies. Moreover, ablation studies validate the importance of the GA, emphasizing its contribution to long-term adaptability and improved overall results.

2606.20435 2026-06-19 econ.EM New submission

Choosing A Headline Estimand from Matching, DID, and Hybrid Designs: A Minimax-Regret Approach

从匹配、DID和混合设计中选择标题估计量:一种极小化最大遗憾方法

Yechan Park, Yuya Sasaki

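AI summary Shows that, for causal-effect estimation with panel data, the estimand of the hybrid design (DIDM) lies between those of matching (M) and difference-in-differences (DID) and is the minimax-regret choice under a broad class of loss functions; recommends reporting DIDM as the headline estimate, with matching and DID as bounds.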

Details
AI Chinese abstract

使用面板数据估计因果效应的研究人员通常从三种利用过去结果的方法中选择:双重差分(DID)、对滞后结果进行条件化(匹配,M)以及同时进行两者的混合方法(DIDM)。相应的识别假设是非嵌套的,因此对于报告哪种方法几乎没有指导。我们给出了相应估计量有序的条件,其中DIDM介于匹配和DID之间。这使得DIDM在宽泛的损失函数类中成为三者中的极小化最大遗憾选择。我们建议将DIDM报告为标题估计量,匹配和DID作为边界。我们在应用中进行了说明。

English abstract

Researchers using panel data to estimate causal effects routinely choose among three approaches to using past outcomes: difference-in-differences (DID), conditioning on lagged outcomes (matching, M), and a hybrid that does both (DIDM). The corresponding identifying assumptions are non-nested, leaving little guidance on which to report. We give conditions under which the corresponding estimands are ordered, with DIDM bracketed between matching and DID. This makes DIDM the minimax-regret choice among the three under a broad class of loss functions. We recommend reporting DIDM as the headline estimate, with matching and DID as bounds. We illustrate in applications.
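
The bracketing argument can be illustrated with a toy calculation. The sketch below assumes, for illustration only, that the truth coincides with one of the three estimands and that loss is a nondecreasing function of absolute error; the function name and the numeric values are hypothetical and not taken from the paper.

```python
def minimax_regret_choice(estimates, loss=abs):
    """Pick the candidate whose worst-case loss is smallest.

    Assumption (illustrative): the true effect equals one of the candidate
    values, so the worst case for a reported value is its largest loss
    against any of the other candidates.
    """
    worst = {
        name: max(loss(value - truth) for truth in estimates.values())
        for name, value in estimates.items()
    }
    best = min(worst, key=worst.get)
    return best, worst


# Hypothetical point values with DIDM bracketed between matching and DID.
estimates = {"M": 0.8, "DIDM": 1.1, "DID": 1.6}

choice, worst_case = minimax_regret_choice(estimates)
squared_choice, _ = minimax_regret_choice(estimates, loss=lambda err: err * err)
print(choice, squared_choice, worst_case)
```

Under these assumptions the bracketed value can be wrong by at most the larger of its two gaps, whereas either endpoint can be wrong by the full width of the interval, which is why the middle candidate wins for both absolute and squared loss here.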

2606.20286 2026-06-19 econ.EM New submission

Institutions, Inputs, and Agricultural Growth in China: Revisiting Several Controversies, 1949–1986

制度、投入与中国农业增长:重访若干争议(1949–1986)

Jiyuan Lyu

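AI summary Uses a single dataset and econometric methods to revisit four controversies over China's agricultural growth: the price scissors, heavy industrial investment, the 1978 reforms, and the effect of decollectivization on irrigation.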

Details
AI Chinese abstract

关于1949年至1986年间中国农业增长的学术争论在价格剪刀差的程度、重工业投资的影响、1978年改革的作用以及去集体化对灌溉的影响等方面持续存在分歧。本文利用单一数据集和互补的计量经济学方法,逐一回应了这些争议。结果表明,1952–1957年是唯一一个通过所有三个渠道实现净提取的时期,此后国家通过财政和信贷工具向农业净流入约1686亿元。重工业投资对农业产生了显著的正向滞后效应,而同期负相关源于投资份额指标的零和性质。投入产出弹性在1970年突然变化,集体农业贷款在1971年断裂,两者均指向华北农业会议的整顿效果。防灾能力从集体时期的0.70下降到家庭承包后的0.53,主要原因是集体维护体系崩溃而非国家投资减少。1979年后农业供给的价格弹性趋近于零,表明1979年的收购价格提高更像是一次性重新校准而非持续的边际激励。

English abstract

Scholarly debates on China's agricultural growth between 1949 and 1986 continue to differ over the extent of the price scissors, the effect of heavy industrial investment, the role of the 1978 reforms, and the impact of decollectivization on irrigation. Using a single dataset and complementary econometric methods, this paper addresses each of these controversies. The results show that 1952–1957 was the only net extraction period across all three channels, after which the state channelled a net inflow of about 168.6 billion yuan into agriculture via fiscal and credit instruments. Heavy industrial investment exerted a significant positive lagged effect on agriculture, while the contemporaneous negative correlation stemmed from the zero-sum nature of the investment share indicator. The input-output elasticity shifted abruptly in 1970, and collective agricultural loans broke in 1971, both pointing to the rectification effects of the North China Agricultural Conference. Disaster prevention capacity fell from 0.70 under the collective era to 0.53 after household contracting, mainly because the collective maintenance system collapsed rather than because state investment declined. After 1979 the price elasticity of agricultural supply approached zero, suggesting that the 1979 procurement price increase acted more like a one-off recalibration than a sustained marginal incentive.

2606.19972 2026-06-19 econ.EM New submission

Biodiversity Media Narratives and Stock Market Performance: Evidence from Europe

生物多样性媒体叙事与股市表现:来自欧洲的证据

Andres Azqueta-Gavaldon, Ben Jabeur Sami, Leila Hedhili

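AI summary Builds biodiversity media risk indicators for France, Germany, Italy, and Spain over 2015-2025 from the GDELT Global Knowledge Graph; panel Granger causality tests and an augmented inverse probability weighting event study show that biodiversity risk significantly lowers stock prices, and that the positive effects of low-risk episodes outweigh the negative effects of high-risk episodes.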

Details
AI Chinese abstract

本研究为法国、德国、意大利和西班牙构建了2015-2025年间新颖的生物多样性相关媒体风险指标,利用GDELT全球知识图谱捕捉媒体对生物多样性威胁的关注。通过面板格兰杰因果检验和增广逆概率加权(AIPW)事件研究设计,我们发现了高度显著的证据表明生物多样性风险会降低股票价格,其影响在冲击后3至10个月达到峰值。此外,我们揭示了一个明显的非对称性,即低生物多样性风险期的正面效应大于高风险期的负面效应。结果在收益分布的分位数上稳健,并在控制欧洲股票市场波动性和经济政策不确定性时依然成立。我们的发现首次提供了生物多样性媒体叙事驱动欧洲股市估值的证据。

English abstract

This study constructs novel biodiversity-related media risk indicators for France, Germany, Italy, and Spain over 2015-2025, capturing media attention to biodiversity threats using the GDELT Global Knowledge Graph. Using panel Granger causality tests and an augmented inverse probability weighting (AIPW) event-study design, we find highly significant evidence that biodiversity risk reduces stock prices, with effects peaking between 3 and 10 months after a shock. Moreover, we uncover a marked asymmetry whereby the positive effects of low biodiversity risk episodes outweigh the negative effects of high-risk episodes. Results are robust across quantiles of the return distribution and hold when controlling for European equity market volatility and economic policy uncertainty. Our findings provide the first evidence that biodiversity media narratives drive stock market valuations in Europe.
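
For readers unfamiliar with AIPW, the standard cross-sectional form of the estimator combines an outcome model with a propensity model. The sketch below shows only that textbook average-treatment-effect formula; it is not the paper's event-study specification, and the function name, toy data, and nuisance values are hypothetical.

```python
def aipw_ate(y, t, e, mu1, mu0):
    """Textbook AIPW estimate of the average treatment effect.

    y   : observed outcomes
    t   : treatment indicators (0 or 1)
    e   : estimated propensity scores P(T=1 | X)
    mu1 : predicted outcomes under treatment
    mu0 : predicted outcomes under control
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0) + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
    return total / len(y)


# Hypothetical toy data: outcome = covariate + 2 * treatment (true effect 2).
x = [0.0, 1.0, 2.0, 3.0]
t = [1, 0, 1, 0]
y = [xi + 2 * ti for xi, ti in zip(x, t)]
e = [0.5] * len(x)            # assumed known propensity
mu1 = [xi + 2 for xi in x]    # assumed (here exact) outcome model under treatment
mu0 = list(x)                 # assumed (here exact) outcome model under control

estimate = aipw_ate(y, t, e, mu1, mu0)
print(estimate)
```

With an exact outcome model the weighted residual terms vanish and the estimate reduces to the mean model contrast.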

2606.20478 2026-06-19 eess.AS New submission

Beyond Speaker Independence: Evaluating Cross-Lingual Acoustic-to-Articulatory Inversion Across Finnish and Russian

超越说话人独立性:跨语言声学到发音反演在芬兰语和俄语上的评估

Ruchi Pandey, Tomi Kinnunen

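AI summary Systematically evaluates acoustic-to-articulatory inversion (AAI) under cross-speaker and cross-language domain shift on FROST-EMA, a newly built Finnish-Russian bilingual EMA corpus, comparing articulatory targets, acoustic front-ends, and inversion back-ends; the cross-gender performance drop is moderate (about 0.05-0.10) and the cross-language drop is larger (about 0.10-0.20).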

Details
AI Chinese abstract

声学到发音反演(AAI)在域偏移下仍然具有挑战性,其中说话人属性的变化和跨语言条件常常导致性能下降。我们在这种偏移下进行了系统评估,并在FROST-EMA(一个芬兰语-俄语双语EMA语料库)上建立了基线基准。FROST-EMA解决了现有资源的英语偏见和有限的说话人多样性。我们基准测试了(i)发音目标(原始EMA坐标与声道变量),(ii)声学前端(MFCC与SSL特征),以及(iii)反演后端(BiLSTM与轻量级基于注意力的序列模型)。我们进一步定义了跨性别迁移(语言内)和跨语言迁移(性别内)的评估协议。结果表明,相对于域内基线,跨性别不匹配导致皮尔逊相关系数适度下降(约0.05至0.10),而跨语言不匹配导致更大的下降(约0.10至0.20)。

English abstract

Acoustic-to-articulatory inversion (AAI) remains challenging under domain shifts where changes in speaker attributes and cross-language conditions often degrade performance. We conduct a systematic evaluation under such shifts and establish baseline benchmarks on FROST-EMA, a Finnish-Russian bilingual EMA corpus. FROST-EMA addresses the English bias and limited speaker diversity of existing resources. We benchmark (i) articulatory targets (raw EMA coordinates vs tract variables), (ii) acoustic front-ends (MFCC vs SSL features), and (iii) inversion back-ends (BiLSTM vs a lightweight attention-based sequence model). We further define evaluation protocols for cross-gender transfer (within language) and cross-language transfer (within gender). The results indicate that cross-gender mismatch introduces moderate Pearson correlation declines (approximately 0.05 to 0.10) relative to the in-domain baseline, whereas cross-language mismatch causes larger drops (approximately 0.10 to 0.20).

2606.20450 2026-06-19 eess.SP New submission

Max-Min Rate Fairness Optimization for Multi-User Pinching-Antenna NOMA Systems

多用户捏合天线NOMA系统的最大最小速率公平性优化

Mahmoud AlaaEldin, Amy Inwood, Xidong Mu, Michail Matthaiou

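AI summary For multi-waveguide pinching-antenna NOMA downlink systems, proposes a two-stage optimization framework that jointly optimizes antenna positions and precoding to maximize the minimum user rate, yielding significant performance gains.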

Details
AI Chinese abstract

捏合天线系统(PAS)通过沿米级波导重新定位介电辐射元件(称为捏合天线,PA)来克服信号阻塞,从而创建视距链路。由于每个波导由单个射频(RF)链驱动,非正交多址(NOMA)非常适合基于PAS的多用户通信。本文研究了一个多波导的PAS使能多用户下行NOMA系统,每个波导配备多个PA。联合优化PA位置和基站发射预编码,以最大化最小用户速率。由于PA间干扰引起的快速振荡相干和,所得问题高度非光滑且非凸。为应对这一挑战,我们提出了一种两阶段结构化优化框架。在第一阶段,使用内点算法进行粗略的PA位置和功率分配优化,同时忽略PA信道相位,从而得到接近真实最优的解。在第二阶段,考虑PA信道相位偏移,对PA位置和发射预编码进行微调。该阶段首先应用相位归零,即局部重新定位每个PA,使相应信道相位归零并促进建设性相干合并。然后使用交替过程,迭代执行前后向PA位置精炼和基于逐次凸近似的复发射预编码优化直至收敛,从而减少残余相位失配。仿真结果表明,所提框架显著优于启发式优化基准,且计算时间更短。结果还展示了相对于可比的多输入多输出下行NOMA系统的巨大增益,并揭示了PA数量、用户数量和发射功率对系统性能的影响。

English abstract

Pinching-antenna systems (PASs) can overcome signal blockage by repositioning dielectric radiating elements, called pinching antennas (PAs), along meter-scale waveguides to create line-of-sight links. Since each waveguide is driven by a single radio-frequency (RF) chain, non-orthogonal multiple access (NOMA) is well suited for PAS-based multi-user communications. This paper studies a PAS-enabled multi-user downlink NOMA system with multiple waveguides, each equipped with multiple PAs. The PA positions and base-station transmit precoding are jointly optimized to maximize the minimum user rate. The resulting problem is highly non-smooth and non-convex because of the rapidly oscillating coherent sums caused by inter-PA interference. To tackle this challenge, we propose a two-stage structured optimization framework. In the first stage, coarse PA-position and power-allocation optimization is performed using an interior-point algorithm while neglecting the PA channel phases, which gives solutions near the true optima. In the second stage, PA positions and transmit precoding are fine-tuned while accounting for the PA channel phase shifts. This stage first applies phase zeroing, where each PA is locally repositioned to align the corresponding channel phase toward zero and promote constructive coherent combining. It then uses an alternating procedure that iteratively performs forward-backward PA position refinement and successive-convex-approximation-based complex transmit precoding optimization until convergence, thereby reducing residual phase mismatch. Simulation results show that the proposed framework significantly outperforms heuristic optimization benchmarks with much lower computational time. They also demonstrate large gains over a comparable multiple-input multiple-output downlink NOMA system and reveal the impact of the number of PAs, users, and transmit power on system performance.

2606.20338 2026-06-19 eess.AS New submission

Stuttering Classification and Segmentation with Attention-Based Multiple Instance Learning

基于注意力多实例学习的口吃分类与分割

Petar Sušac, Sebastian P. Bayerl, Hrvoje Džapo

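AI summary Proposes multiple instance neural networks built on fine-tuned wav2vec 2.0, WavLM, and Whisper encoders that use clip-level data for frame-level stuttering classification and segmentation, improving frame-level F1 by 23%.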

Comments Accepted at Interspeech 2026

Details
AI Chinese abstract

使用深度学习方法进行口吃检测和分类有潜力改善口吃严重程度评估过程。大多数口吃分类数据集提供片段级标签,这使得它们不适用于确定单个口吃不流畅持续时间所需的细粒度帧级分类。为了克服这一挑战,我们提出了一种基于微调wav2vec 2.0、WavLM和Whisper编码器的多实例神经网络架构。我们应用基于实例和基于嵌入的多实例学习方法,在片段级数据集上训练模型,用于片段级和帧级口吃分类任务。我们的结果显示,帧级F1分数提高了23%,片段级F1分数提高了2%至9%,证明了我们的模型能够利用片段级数据进行帧级分割的能力。

English abstract

Stuttering detection and classification using deep learning methods has the potential to improve the process of stuttering severity assessment. Most stuttering classification datasets provide clip-level labels, making them unsuitable for fine-grained frame-level classification needed to determine the duration of individual stuttering dysfluencies. To overcome this challenge, we present a multiple instance neural network architecture based on fine-tuned wav2vec 2.0, WavLM and Whisper encoders. We apply instance- and embedding-based multiple instance learning approaches to train models on a clip-level dataset for both clip-level and frame-level stuttering classification tasks. Our results show a 23% improvement in frame-level F1 score and between 2% and 9% in clip-level F1 score, demonstrating the ability of our models to utilize clip-level data for frame-level segmentation.
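
To make the embedding-based variant concrete, the following is a minimal sketch of generic attention pooling over frame embeddings, in which the attention weights double as frame-level relevance scores. The abstract does not give the paper's architecture, so the scoring form (a tanh projection followed by a softmax) and all parameter values here are assumptions for illustration.

```python
import math


def attention_mil_pool(frames, V, w):
    """Attention pooling over a bag of frame embeddings.

    frames : list of frame embeddings (each a list of floats)
    V, w   : hypothetical attention parameters (matrix and vector)
    Returns (clip_embedding, attention_weights).
    """
    scores = []
    for h in frames:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over frames.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [v / total for v in exps]

    # Clip embedding is the attention-weighted sum of frame embeddings.
    dim = len(frames[0])
    clip = [sum(a * h[d] for a, h in zip(weights, frames)) for d in range(dim)]
    return clip, weights


# Hypothetical 2-D frame embeddings; the third frame stands out from the rest.
frames = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5], [0.0, 0.1]]
V = [[1.0, 0.5], [-0.5, 1.0]]
w = [2.0, 1.0]

clip_embedding, attention = attention_mil_pool(frames, V, w)
print(attention)
```

A clip-level classifier would operate on `clip_embedding`, while the per-frame `attention` values give one possible route from clip-level labels to frame-level localisation.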

2606.20266 2026-06-19 eess.AS New submission

Transcript-Free Flow-Matching Text-to-Speech via Speech Feature Conditioning

基于语音特征调节的无转录流匹配文本转语音

SooHwan Eom, Hee Suk Yoon, Eunseop Yoon, Mark Hasegawa-Johnson, Chang D. Yoo

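AI summary Proposes RTFree-F5, which replaces the reference transcript with self-supervised speech representations mapped into the F5-TTS text-conditioning space by a lightweight adapter, removing the dependence on an external ASR system; on dysarthric speech, WER falls from 24.6% to 10.4%.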

Comments Accepted to Interspeech 2026

Details
AI Chinese abstract

最近的流匹配文本转语音(TTS)模型,如F5-TTS,在推理时依赖于从外部ASR系统获得的参考转录本。这种依赖性使得零样本TTS对于口音或构音障碍的说话者变得脆弱,而这正是最需要它的场景。此外,我们发现即使有真实转录本可用,基于文本的参考条件化也可能将非典型语音中的非典型声学模式传播到合成语音中。为了解决这个问题,我们提出了RTFree-F5,它用连续的自监督语音表示替换参考转录本,通过轻量适配器映射到F5-TTS的文本条件空间,同时重用预训练检查点。在构音障碍语音上,RTFree-F5将WER从24.6%降低到10.4%,甚至超过了真实参考转录本基线,同时提高了自然度,并在标准基准测试中保持竞争力,而无需任何参考转录本。

English abstract

Recent flow-matching text-to-speech (TTS) models, such as F5-TTS, rely on a reference transcript at inference time, obtained from an external ASR system. This dependency makes zero-shot TTS brittle for accented or dysarthric speakers, precisely the scenarios where it is most needed. Moreover, we find that text-based reference conditioning can propagate atypical acoustic patterns from atypical speech into synthesis, even when ground-truth transcripts are available. To address this, we propose RTFree-F5, which replaces the reference transcript with continuous self-supervised speech representations mapped into F5-TTS's text-conditioning space via a lightweight adapter, while reusing the pretrained checkpoint. On dysarthric speech, RTFree-F5 reduces WER from 24.6% to 10.4%, surpassing even the ground-truth reference transcript baselines, while improving naturalness and remaining competitive on standard benchmarks without requiring any reference transcript.

2606.20222 2026-06-19 eess.SP New submission

Reliable ORIS-assisted FSO Communications via HARQ

基于HARQ的可靠ORIS辅助自由空间光通信

Georgios D. Chondrogiannis, Athanasios P. Chrysologou, Vasilis K. Papanikolaou, Alexandros-Apostolos A. Boulogeorgos, Nestor D. Chatzidiamantis, Robert Schober

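AI summary Studies a free-space optical link that combines an optical reconfigurable intelligent surface (ORIS) with hybrid automatic repeat request (HARQ); derives a statistical model of the end-to-end channel, a closed-form outage probability for HARQ-CC, and outage upper bounds for HARQ-IR, and analyzes diversity order and delay.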

Comments 13 pages, 8 Figures, Journal

Details
AI Chinese abstract

本文研究了一种由光学可重构智能表面(ORIS)辅助并通过混合自动重传请求(HARQ)方案增强的自由空间光(FSO)链路。ORIS在障碍物周围创建虚拟视距路径,而HARQ通过重传和合并恢复受湍流、指向抖动和几何损耗损坏的帧。我们首先通过联合考虑大气湍流、ORIS引起的指向误差和几何衰减,推导了端到端发射器-ORIS-接收器(Tx-ORIS-Rx)反射信道的易处理统计模型。基于这些结果,我们获得了采用Chase合并的HARQ(HARQ-CC)的闭式中断概率(OP)表达式,以及采用增量冗余的HARQ(HARQ-IR)的解析中断上界,这些表达式对任意最大传输轮次有效。我们进一步进行了高信噪比(SNR)分析,该分析提供了中断行为的全面表征,并揭示了两种方案的分集阶数。此外,我们通过平均传输轮次和给定成功解码的条件平均轮次来表征截断HARQ过程的延迟行为。最后,数值和蒙特卡洛结果验证了所提出的分析,并表明HARQ显著提高了ORIS辅助FSO的可靠性,即使对于少量重传轮次,HARQ-IR也能实现比HARQ-CC更低的中断和延迟。

English abstract

This paper studies a free-space optical (FSO) link assisted by an optical reconfigurable intelligent surface (ORIS) and enhanced by a hybrid automatic repeat request (HARQ) scheme. The ORIS creates a virtual line-of-sight path around obstacles, while HARQ recovers frames corrupted by turbulence, pointing jitter, and geometric loss through retransmission and combining. We first derive a tractable statistical model for the end-to-end transmitter-ORIS-receiver (Tx-ORIS-Rx) reflected channel by jointly accounting for atmospheric turbulence, ORIS-induced pointing errors, and geometric attenuation. Building on these results, we obtain closed-form outage probability (OP) expressions for HARQ with Chase combining (HARQ-CC) and analytical outage upper bounds for HARQ with incremental redundancy (HARQ-IR), valid for an arbitrary maximum number of transmission rounds. We further conduct a high signal-to-noise ratio (SNR) analysis that provides a thorough characterization of the outage behavior and reveals the diversity order of both schemes. In addition, we characterize the delay behavior of the truncated HARQ process through the mean number of transmission rounds and the conditional mean number of rounds given successful decoding. Finally, numerical and Monte Carlo results validate the proposed analysis and show that HARQ substantially improves ORIS-assisted FSO reliability, with HARQ-IR achieving lower outage and delay than HARQ-CC, even for a small number of retransmission rounds.
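
The qualitative difference between the two HARQ schemes can be seen in a small Monte Carlo sketch. It uses the common information-theoretic outage definitions (SNR accumulation for Chase combining, mutual-information accumulation for incremental redundancy) and, as a loud simplification, replaces the paper's ORIS channel model with independent unit-mean log-normal fading per round; the function name and every parameter value are hypothetical.

```python
import math
import random


def simulate_outage(avg_snr_db, rate, max_rounds, trials=20000, sigma=0.5, seed=0):
    """Estimate outage probability after `max_rounds` HARQ rounds.

    Assumption: per-round fading gains are i.i.d. unit-mean log-normal,
    a stand-in for the turbulence/pointing/geometric-loss model in the paper.
    Returns (outage_cc, outage_ir).
    """
    rng = random.Random(seed)
    avg_snr = 10 ** (avg_snr_db / 10)
    cc_fail = 0
    ir_fail = 0
    for _ in range(trials):
        snrs = [
            avg_snr * rng.lognormvariate(-sigma ** 2 / 2, sigma)
            for _ in range(max_rounds)
        ]
        # Chase combining: SNRs of repeated packets add up before decoding.
        if math.log2(1 + sum(snrs)) < rate:
            cc_fail += 1
        # Incremental redundancy: mutual information of each round adds up.
        if sum(math.log2(1 + s) for s in snrs) < rate:
            ir_fail += 1
    return cc_fail / trials, ir_fail / trials


cc1, ir1 = simulate_outage(5.0, 2.0, max_rounds=1)
cc3, ir3 = simulate_outage(5.0, 2.0, max_rounds=3)
print(cc1, ir1, cc3, ir3)
```

Because the product of the terms (1 + SNR) is never smaller than one plus the sum of the SNRs, the accumulated information under IR is at least that under CC for every channel realisation, so the simulated IR outage cannot exceed the CC outage in this simplified model.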

2606.20011 2026-06-19 eess.SP New submission

Amplitude-Phase-Frequency Block Modulation for OFDM-ISAC with SI-Free PAPR Reduction and Pilotless Sensing

用于OFDM-ISAC的幅度-相位-频率块调制:无旁瓣信息PAPR降低和无导频感知

Bensheng Yang, Min Fan, Haitao Zhao, Haiming Wang

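AI summary Proposes an amplitude-phase-frequency block modulation scheme that, through Stokes-sphere mapping and grouped phase optimization, integrates communication and sensing in OFDM without resource partitioning while reducing PAPR and removing the need for pilots.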

Details
AI Chinese abstract

基于正交频分复用(OFDM)的集成感知与通信系统需要一种统一波形,同时支持可靠数据传输、低峰均功率比(PAPR)和精确信道感知。现有方法在分离的时间或频率资源上复用通信与感知,或依赖专用导频进行信道估计,限制了系统灵活性并增加了开销。本文提出一种用于OFDM的幅度-相位-频率块调制(APFBM)方案,在不进行资源分割的情况下实现通信与感知的波形级集成。信息符号在斯托克斯球上表示,并通过明确规则映射到能量归一化的琼斯矢量,该规则为每个块建立确定性相位参考。这种映射暴露了信号结构中固有的共相自由度。在发射端,分组相位优化算法利用该结构自由度降低PAPR,无需旁瓣信息(SI)。在接收端,相同的确定性相位结构支持基于维特比的最大似然(ML)序列检测算法,该算法联合恢复优化相位并估计块状信道幅度和相位。无需专用感知导频,因为感知观测量直接从通信波形中提取。推导了闭式错误率和感知精度表达式。在软件无线电链路上的数值仿真和空中测量证实了有效的PAPR降低、精确的信道感知、可靠的相位恢复和稳定的信道状态信息重建。所提方案以适度降低频谱效率为代价,实现了统一波形设计,同时提供无SI的PAPR降低和无导频感知。

English abstract

Orthogonal Frequency Division Multiplexing (OFDM)-based integrated sensing and communication systems demand a unified waveform that simultaneously supports reliable data transmission, low peak-to-average power ratio (PAPR), and accurate channel sensing. Existing approaches multiplex communication and sensing across separate time or frequency resources, or rely on dedicated pilots for channel estimation, limiting system flexibility and increasing overhead. This paper proposes an amplitude-phase-frequency block modulation (APFBM) scheme for OFDM that achieves waveform-level integration of communication and sensing without resource partitioning. Information symbols are represented on the Stokes sphere and mapped to energy-normalized Jones vectors through an unambiguous rule that establishes a deterministic phase reference per block. This mapping exposes a common-phase degree of freedom inherent in the signal structure. At the transmitter, a grouped phase optimization algorithm exploits this structural freedom to reduce the PAPR without side information (SI). At the receiver, the same deterministic phase structure enables a Viterbi-based maximum-likelihood (ML) sequence detection algorithm that jointly recovers the optimization phases and estimates the block-wise channel amplitude and phase. No dedicated sensing pilots are required, as the sensing observables are extracted directly from the communication waveform. Closed-form error-rate and sensing-accuracy expressions are derived. Numerical simulations and over-the-air measurements on a software-defined radio link confirm effective PAPR reduction, accurate channel sensing, reliable phase recovery, and stable channel state information reconstruction. The proposed scheme trades a moderate reduction in spectral efficiency for a unified waveform design that simultaneously delivers SI-free PAPR reduction and pilotless sensing.

2606.20001 2026-06-19 eess.AS New submission

Time-Unconditional Generative Speech Enhancement via Autonomous Rectified Flow

基于自主整流流的时间无条件生成式语音增强

Wen Zhang, Wenbin Jiang, Yang Zhang, Xiaofei Zhou

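AI summary Proposes the Autonomous Rectified Flow framework: shows via a linear interpolation path that the target vector field is time-invariant, and designs a time-unconditional network that infers the denoising direction from spatial relationships alone, significantly improving generation quality, robustness, and inference efficiency.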

Details
AI Chinese abstract

大多数生成式语音增强方法依赖显式时间步嵌入进行时间条件化。本文提出自主整流流框架,挑战这种条件化的必要性。通过线性插值路径,我们证明目标向量场本质上是时间不变的。我们进一步引入时间无条件网络,消除显式时间步信息,仅从当前状态与带噪观测之间的空间关系推断去噪方向。预测该目标向量场等价于建模噪声分布。通过避免过拟合时间轨迹,所提出的自主设计显著提升了生成质量、鲁棒性和推理效率。

English abstract

Most generative speech enhancement methods rely on explicit time-step embeddings for temporal conditioning. In this paper, we propose the Autonomous Rectified Flow framework, which challenges the necessity of such conditioning. Using a linear interpolation path, we show that the target vector field is inherently time-invariant. We further introduce a time-unconditional network that eliminates explicit time-step information and infers the denoising direction solely from the spatial relationship between the current state and the noisy observation. Predicting this target vector field is equivalent to modeling the noise distribution. By avoiding overfitting to temporal trajectories, the proposed autonomous design significantly improves generation quality, robustness, and inference efficiency.
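
The time-invariance claim follows directly from differentiating a linear interpolation path. In the short derivation below, x_0 and x_1 are placeholder symbols for the two endpoints of the path; the abstract does not say which endpoint corresponds to clean speech, so that assignment is left open.

```latex
\begin{aligned}
x_t &= (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1], \\
\frac{\mathrm{d}x_t}{\mathrm{d}t} &= x_1 - x_0 .
\end{aligned}
```

For a fixed pair of endpoints the right-hand side contains no t, so the regression target along the path is the same at every time step.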

2606.19974 2026-06-19 eess.AS New submission

Interpreting Content and Speaker Characteristics in Factorised Self-Supervised Subspaces

解释因子化自监督子空间中的内容和说话人特征

Kyle Janse van Rensburg, Herman Kamper

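AI summary Decomposes WavLM features by SVD into a content matrix and speaker transformations; finds that the content space mainly encodes intensity, formants, and voicing, whereas the speaker space is strongly associated with pitch and gender, and that these dimensions can be used for fine-grained control in speech synthesis.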

Comments 7 pages, 4 figures

Details
AI Chinese abstract

自监督语音特征同时编码内容和说话人信息。最近的工作引入了一种基于SVD的因子化方法,将这些特征分解为一个共享的内容矩阵(捕获时间变化)和说话人特定的变换(捕获静态说话人特征)。然而,这些组件内部的信息组织方式仍不清楚。在本文中,我们研究了WavLM因子化的内容和说话人子空间的维度如何与语音特征(如音高、强度和发声)相关。我们发现,内容空间中的前几个维度主要捕获强度、高阶共振峰和发声,而音高编码在较后的维度中。相比之下,方差最大的说话人维度与音高和性别强相关,后面的维度捕获高频变化。干预实验表明,操纵这些维度能够实现对语音合成中语音特征的目标控制。此外,联合修改内容和说话人表示可提供对音高和强度等特征的精细控制。

English abstract

Self-supervised speech features encode both content and speaker information. Recent work introduced an SVD-based factorisation that decomposes these features into a shared content matrix capturing temporal variation and speaker-specific transformations capturing static speaker characteristics. However, how information is organised within these components remains unclear. In this paper, we investigate how the dimensions of WavLM-factorised content and speaker subspaces correlate with speech characteristics such as pitch, intensity, and voicing. We find that leading dimensions in the content space primarily capture intensity, higher-order formants, and voicing, while pitch is encoded in a later dimension. In contrast, the highest-variance speaker dimension is strongly associated with pitch and gender, with later dimensions capturing high-frequency variation. Intervention experiments show that manipulating these dimensions enables targeted control of speech characteristics for speech synthesis. Furthermore, modifying the content and speaker representations jointly provides fine-grained control over characteristics such as pitch and intensity.

2606.19953 2026-06-19 eess.SP New submission

ConsisFormer: Compute-Efficient Transformer for Wireless Foundation Models Based on Channel Consistency

ConsisFormer: 基于信道一致性的无线基础模型高效计算Transformer

Yuwei Wang, Li Sun, Tingting Yang, Liwen Jing, Yuxuan Shi, Maged Elkashlan, Mérouane Debbah

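AI summary Proposes ConsisFormer, which exploits the short-term consistency of wireless channels and reduces Transformer computational complexity through adaptive token aggregation and feature sequence interpolation, cutting computation by over 83% across multiple tasks with negligible performance loss.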

Details
AI Chinese abstract

无线基础模型(WFM)最近成为AI原生6G网络的一种有前景的范式,能够实现适应各种通信和感知任务的通用信道表示。现有的WFM主要基于Transformer架构,该架构提供了优越的性能,但计算复杂度与输入序列长度的平方成正比,这对其在严格推理延迟约束下的部署构成了重大障碍。为了解决这个问题,本文提出ConsisFormer,一种基于无线信道短时一致性的高效计算Transformer设计,作为WFM的骨干网络。利用相邻时间或频率实例共享相似的散射体簇并因此表现出相似信道特性的观察,我们开发了自适应令牌聚合(ATA)模块,动态合并相邻信道状态信息(CSI)令牌,从而减少自注意力计算中涉及的令牌序列长度以降低计算成本。此外,我们提出了一种特征序列插值(FSI)方法,基于Transformer块输出的稀疏特征序列恢复完整的CSI表示,从而在保持性能不受影响的同时确保低复杂度。此外,我们提出了一种用于WFM的聚合自编码器(AAE)预训练范式,通过压缩和恢复从稀疏化CSI令牌中学习鲁棒的信道表示。仿真结果表明,所提出的设计将WFM的计算复杂度降低了83%以上,同时在包括信道预测、视距/非视距分类、波束预测和定位在内的各种任务上性能损失极小。

English abstract

Wireless foundation models (WFMs) have recently emerged as a promising paradigm for AI-native 6G networks, enabling universal channel representations adaptable to diverse communication and sensing tasks. Existing WFMs are predominantly built upon the Transformer architecture, which delivers superior performance but incurs computational complexity proportional to the square of the input sequence length, posing a significant barrier to their deployment under stringent inference latency constraints. To address this issue, we propose ConsisFormer, a compute-efficient Transformer design based on the short-term consistency of wireless channels, as a WFM backbone. Building on the observation that adjacent time or frequency instances share similar clusters of scatterers and thus exhibit similar channel characteristics, we develop an adaptive token aggregation (ATA) module that dynamically merges neighboring channel state information (CSI) tokens, thereby shortening the token sequence involved in self-attention and lowering the computational cost. Furthermore, we propose a feature sequence interpolation (FSI) method that recovers the full CSI representation from the sparse feature sequence output by the Transformer blocks, preserving performance while keeping complexity low. Moreover, we propose an aggregated auto-encoder (AAE) pre-training paradigm for WFMs, enabling robust channel representation learning from sparsified CSI tokens via compression and recovery. Simulation results show that the proposed design reduces the computational complexity of WFMs by over 83% with negligible performance loss on various tasks, including channel prediction, LoS/NLoS classification, beam prediction, and localization.
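
The idea of merging similar neighbouring tokens and later restoring the full sequence length can be sketched in a few lines. This is an illustrative stand-in rather than the paper's ATA or FSI modules: the merge rule (a cosine-similarity threshold between adjacent tokens), the linear interpolation, and all names and values are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def aggregate_tokens(tokens, threshold=0.95):
    """Greedily merge runs of adjacent tokens whose similarity exceeds `threshold`.

    Returns (merged_tokens, centre_positions), where each merged token is the
    mean of its run and each centre is the mean original index of the run.
    """
    groups = [[0]]
    for i in range(1, len(tokens)):
        if cosine(tokens[i], tokens[groups[-1][-1]]) >= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    dim = len(tokens[0])
    merged = [[sum(tokens[i][d] for i in g) / len(g) for d in range(dim)] for g in groups]
    centres = [sum(g) / len(g) for g in groups]
    return merged, centres


def interpolate_features(features, centres, length):
    """Linearly interpolate a sparse feature sequence back to `length` positions."""
    restored = []
    for pos in range(length):
        if pos <= centres[0]:
            restored.append(list(features[0]))
        elif pos >= centres[-1]:
            restored.append(list(features[-1]))
        else:
            k = max(j for j, c in enumerate(centres) if c <= pos)
            frac = (pos - centres[k]) / (centres[k + 1] - centres[k])
            restored.append(
                [(1 - frac) * a + frac * b for a, b in zip(features[k], features[k + 1])]
            )
    return restored


# Hypothetical CSI tokens: two runs of three nearly identical neighbours.
tokens = [[1.0, 0.0], [1.0, 0.05], [0.98, 0.1], [0.0, 1.0], [0.05, 1.0], [0.1, 0.9]]
merged, centres = aggregate_tokens(tokens)
restored = interpolate_features(merged, centres, len(tokens))

# Self-attention cost scales with the square of the sequence length.
attention_cost_ratio = (len(merged) / len(tokens)) ** 2
print(len(merged), centres, attention_cost_ratio)
```

In this toy case six tokens collapse to two, so the quadratic attention term shrinks to one ninth of its original size; the figure is specific to the made-up input and says nothing about the 83% reported in the abstract.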

2606.19940 2026-06-19 eess.AS New submission

Analyzing Language and Geographical Variation in Speech Representations Across 60 Indic Languages

分析60种印度语言语音表征中的语言和地理变异

Pavan Kumar J, Agneedh Basu, Pranav Bhat, Sujith Pulikodan, Visruth Sanka, Nihar Desai, Prasanta Kumar Ghosh

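AI summary Fine-tunes Whisper-base and Wav2Vec2.0 with joint language-district supervision; finds that this improves district discrimination in the embedding space while preserving language classification, and analyzes the embedding structure with normalized conditional mutual information.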

Details
AI Chinese abstract

自监督语音编码器通常使用语言监督进行微调,这可能会忽略地理变异。为了理解在语言和地区联合监督下与仅语言监督下学习到的表征差异,我们微调Whisper-base和Wav2Vec2.0进行联合语言-地区分类(386类)和仅语言分类(60类)任务。语言-地区监督在嵌入空间中改善了条件于语言的地区区分度,同时保持了较强的边缘语言分类能力。我们使用归一化条件互信息(NCMI)分析学习到的嵌入结构,表明语言-地区监督产生了全局语言簇,并在语言内部形成了与地区变异对齐的结构化子簇,从而在不降低语言层面组织的情况下增强了地理可分离性。

English abstract

Self-supervised speech encoders are often fine-tuned with language supervision, which can overlook geographical variation. To understand how representations learned under joint language and district supervision differ from those learned under language-only supervision, we fine-tune Whisper-base and Wav2Vec2.0-base on a joint language-district classification task (386 classes) and a language-only classification task (60 languages). Language-district supervision improves district discrimination conditioned on language in the embedding space while maintaining strong marginal language classification. We analyze the structure of the learned embeddings using Normalized Conditional Mutual Information (NCMI), showing that language-district supervision produces global language clusters with structured within-language subclusters aligned to district variation, enhancing geographical separability without degrading language-level organization.

2606.19724 2026-06-19 eess.SP New submission

Cyclic-Prefix OFDM Probing for Spatial-ISI-Free Distributed Acoustic Sensing via Frequency-Domain Channel Reconstruction

基于频域信道重构的循环前缀OFDM探测实现无空间ISI分布式声学传感

Huan Huang, Zhiyang Xue, Ziang Chen, Zhongxing Tian, Dongdong Zou, Gangxiang Shen, Yi Cai

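AI summary Proposes using a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) waveform as the sensing probe and applying frequency-domain channel reconstruction to eliminate the spatial inter-symbol interference (ISI) of matched-filter pulse compression, achieving spatial-ISI-free distributed acoustic sensing while also recovering communication data and demonstrating shared-waveform integrated sensing and communication (ISAC).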

Comments This manuscript has been submitted for possible publication

Details
AI Chinese abstract

基于匹配滤波的脉冲压缩分布式声学传感(DAS)存在非零压缩旁瓣,导致确定性距离单元间泄漏,即空间符号间干扰(ISI),并在重建的瑞利背向散射迹中产生虚假响应。我们提出一种用于$\phi$-OTDR的循环前缀正交频分复用(CP-OFDM)DAS系统,使用承载数据的CP-OFDM波形作为传感探头。该系统还恢复前向通信数据,初步展示了共享波形集成感知与通信(ISAC)。据我们所知,这是首次将分布式瑞利背向散射建模为有限记忆传感多径信道。基于该模型,我们证明,如果有用OFDM和CP长度覆盖传感多径记忆,则去除CP、单抽头频域均衡和逆离散傅里叶变换可重建每个距离单元系数,且无确定性波形引起的空间ISI,从而实现无空间ISI的相位解调。在模拟的5.2公里链路上,组内间隔5.31–5.83米的十个同时强、弱事件,所提接收机抑制了事件外泄漏,并将相位迹均方误差相比匹配滤波脉冲压缩提升高达29.55 dB。在5.2公里光纤链路的相干外差实验中,占用带宽111.984 MHz,在5 V和1 V驱动下,500 Hz PZT振动分别被盲定位在5.071公里和5.066公里处,其波形恢复的相关系数分别为0.990和0.962。同一承载数据探头还恢复了一幅图像,误码率为零,误差矢量幅度中位数为-23.14 dB。这些结果验证了CP-OFDM辅助的频域信道重构用于无空间ISI的DAS,并展示了其在共享波形光纤ISAC中的潜力。

English abstract

Matched-filter-based pulse-compression distributed acoustic sensing (DAS) suffers from nonzero compression sidelobes that cause deterministic inter-range-bin leakage, i.e., spatial inter-symbol interference (ISI), and false responses in reconstructed Rayleigh-backscatter traces. We propose a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) DAS system for φ-OTDR, using a data-bearing CP-OFDM waveform as the sensing probe. It also recovers forward communication data, providing an initial demonstration of shared-waveform integrated sensing and communication (ISAC). To our knowledge, this is the first formulation of distributed Rayleigh backscattering as a finite-memory sensing multipath channel. Based on this formulation, we prove that, if the useful OFDM and CP lengths cover the sensing multipath memory, CP removal, one-tap frequency-domain equalization, and inverse discrete Fourier transform reconstruct each range-bin coefficient without deterministic waveform-induced spatial ISI, enabling spatial-ISI-free phase demodulation. For a simulated 5.2-km link with ten simultaneous strong and weak events spaced by 5.31–5.83 m within groups, the proposed receiver suppresses off-event leakage and improves phase-trace mean-square error by up to 29.55 dB over matched-filter pulse compression. In a heterodyne coherent experiment over a 5.2-km fiber link with 111.984-MHz occupied bandwidth, 500-Hz PZT vibrations are blindly localized at 5.071 and 5.066 km under 5- and 1-V drives, respectively, and their waveforms are recovered with correlation coefficients of 0.990 and 0.962. The same data-bearing probe also recovers an image with zero measured bit-error rate and a median error vector magnitude of -23.14 dB. These results validate CP-OFDM-aided frequency-domain channel reconstruction for spatial-ISI-free DAS and demonstrate its potential for shared-waveform optical-fiber ISAC.
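
The reconstruction chain named in the abstract (CP removal, per-subcarrier division, inverse DFT) can be checked on a noise-free toy channel. The sketch below is a generic CP-OFDM illustration, not the paper's receiver: the transmitted symbols are treated as known at the receiver, the three-tap channel stands in for the range-bin coefficients, and all names and values are hypothetical.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def reconstruct_channel(h, n_sub=16, cp_len=4, seed=1):
    """Recover channel taps from one CP-OFDM symbol sent through channel `h`."""
    assert len(h) - 1 <= cp_len, "cyclic prefix must cover the channel memory"
    rng = random.Random(seed)
    qpsk = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]
    X = [rng.choice(qpsk) for _ in range(n_sub)]   # data-bearing subcarriers
    x = dft(X, inverse=True)                        # time-domain OFDM symbol
    tx = x[-cp_len:] + x                            # prepend the cyclic prefix

    # Linear convolution with the multipath channel (noise-free).
    rx = [
        sum(h[d] * tx[n - d] for d in range(len(h)) if n - d >= 0)
        for n in range(len(tx))
    ]

    y = rx[cp_len:cp_len + n_sub]                   # CP removal
    Y = dft(y)
    H = [yk / xk for yk, xk in zip(Y, X)]           # one-tap division per subcarrier
    return dft(H, inverse=True)                     # back to per-delay coefficients


# Hypothetical three-tap channel standing in for range-bin coefficients.
h = [1.0 + 0.0j, 0.4 - 0.3j, 0.1j]
h_hat = reconstruct_channel(h)
print([round(abs(v), 6) for v in h_hat])
```

Because the cyclic prefix turns the linear convolution into a circular one, each subcarrier sees a single complex gain, and the inverse DFT returns the individual taps with no leakage between delays in this idealised setting.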