arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.19368 2026-06-19 math.NA cs.LG cs.NA math.OC New submission

Neural Architectures as Functional Priors in Physics-Informed Control Problems

Sonia Rubio Herranz, Fernando Carlos López Hernández, Antonio López Montes

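AI summary: Studies the role of neural architectures as implicit functional priors in ODE-governed control problems and finds that different architectures (MLP and Fourier KAN) produce qualitatively different controls under identical conditions, exhibiting functional specialization.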

Comments 17 pages, 6 figures. Physics-informed neural networks, optimal control, spectral bias, Kolmogorov-Arnold Networks

English abstract

In this work we investigate the role of neural architectures as implicit functional priors in control problems governed by ordinary differential equations. Rather than focusing on highly complex problems, our objective is to investigate architecture-dependent effects in controlled dynamical systems within the simplest physically interpretable settings possible. In particular, we study a controlled linear RLC electrical circuit and a nonlinear Duffing-type dynamical system. Both systems are analyzed first through classical optimal-control formulations and later through PINN-based approaches. We compare different combinations of multilayer perceptrons (MLPs) and Fourier-based KAN-like architectures, and analyze their influence on the resulting controls. The numerical experiments suggest that different architectural choices systematically generate qualitatively distinct controls, even under identical governing equations, loss functionals, initial and target states, training parameters and physical constraints. Significant differences appear in the spectral structure, smoothness, energy distribution, and phase-space behavior of the learned solutions. A central observation of this work is the emergence of a functional specialization phenomenon when the neural architectures are allowed sufficient freedom to shape the structure of the learned controls. More specifically, in the systems considered here, Fourier-based architectures tend to produce trajectories with richer oscillatory content, whereas smoother low-frequency-biased architectures tend to generate more regular and energetically efficient controls. This suggests that different functional components of the control problem may be handled more efficiently by different neural architectures, leading to an implicit specialization between state representation and control generation.

2606.20485 2026-06-19 q-fin.RM cs.AI nlin.AO physics.soc-ph New submission

Optimal Order of Multi-Agent and General Many-Body Systems

Jake J. Xia

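AI summary: Proposes a general framework for analyzing multi-agent systems based on agents' power and response functions, derives macroscopic properties, and introduces a risk-appetite coefficient to study the trade-off between growth and resilience, yielding an optimal degree of order.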

Comments Key Words: Many body systems, multi agent crowd interactions, feedback loops, agent power, response function, utility function, risk appetite, order, optimal order, fragility, mobility, synchronization, useful energy, entropy, concentration, correlation, task dependency, receiver dependency, collective intelligence, AI model scaling law

English abstract

This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents' actions and collective observations. The framework is built on two fundamental agent-level variables: power, which measures agent influence on collective outcomes, and response functions, which determine how agents react to observations. We derive how macroscopic properties, including total power, useful power, entropy, order, fragility, and mobility, emerge from these two variables of heterogeneous agents. To study the trade-off between growth and resilience, we introduce a system-level utility function parameterized by a risk-appetite coefficient and derive an optimal degree of order that balances productivity, stability, and adaptability. The analysis suggests that stronger synchronization can increase collective output but may also increase systemic fragility and reduce mobility. We further argue that order, entropy, information, and useful energy are task-dependent and system-relative concepts whose meanings depend on the objectives of the system. By measuring and designing agent power distributions and response functions, it may be possible to better understand, predict, and optimize collective behavior and identify the conditions under which collective intelligence and optimal order emerge.

2606.20299 2026-06-19 stat.ML cs.LG hep-ph physics.data-an New submission

Statistical Properties of Training & Generalization

Itay Lavie, Noam Levi, Yonatan Kahn

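AI summary: Examines key features and surprises of deep learning from a physics perspective, reviewing neural scaling laws and their interplay with the constraints and inductive biases present in physics problems.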

Comments 32 pages, 3 figures. Part of the VERaiPHY initiative

English abstract

Deep learning has managed to evade numerous intuitions from classical statistics to achieve unprecedented performance on a number of real-world tasks. In this article, we investigate the key features and surprises of deep learning from a physics-informed perspective, taking care to point out and justify where possible the many choices inherent in constructing a deep learning model. In particular, we review the phenomenon of neural scaling laws and discuss their interplay with the constraints and inductive biases which may be present when applying machine learning to problems in physics.

2606.19781 2026-06-19 hep-ex cs.AI New submission

Towards Engineering Scaling Laws with Pretraining Data Composition

Jan-Lucas Uslu, Kevin Greif, Daniel Whiteson, Benjamin Nachman

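AI summary: Studies how engineering the pretraining data composition (greater diversity and better alignment with the downstream task) changes the scaling behavior of neural networks in particle physics, shifting it toward data scaling rather than model scaling.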

English abstract

Neural scaling laws describe how model performance improves as a power law in compute, model size, and dataset size. While well-established for large language models, these relationships are emerging for large models in particle physics. As with language, empirical studies show that the performance scales as a power law. However, unlike natural language or image domains, fundamental physics has high-fidelity simulators that produce synthetic data cheaply. This favors scaling regimes where additional data is cheaper than additional parameters, and allows the pretraining dataset itself to be engineered to influence the scaling. For the task of classifying hadronic jets produced in collisions of high-energy particle beams, we show that the scaling behavior can be engineered towards requiring more data rather than larger models by inclusion of pretraining data which is more diverse and better aligned with the downstream classification task.
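
The power-law form mentioned in this abstract can be illustrated with a small fit. The sketch below is not the authors' procedure: it assumes a pure power law, loss = a * size^(-b), with no irreducible-loss term, it uses synthetic noise-free data, and the names `fit_power_law`, `a`, and `b` are hypothetical.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit losses ~ a * sizes**(-b) by least squares in log-log space.

    Returns (a, b). Assumes a pure power law with no irreducible-loss offset.
    """
    log_x = np.log(np.asarray(sizes, dtype=float))
    log_y = np.log(np.asarray(losses, dtype=float))
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(log_x, log_y, 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free example (values are illustrative only).
sizes = np.array([1e3, 1e4, 1e5, 1e6])
losses = 5.0 * sizes ** -0.3
a, b = fit_power_law(sizes, losses)
```

A straight line in log-log space is what makes the exponent `b` easy to compare across pretraining datasets; real measurements would be noisy and may need an offset term.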

2606.19149 2026-06-19 cs.CR cs.LG New submission

OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing

Nahum Korda, Gadi Evron

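AI summary: Presents OpenAnt, a system that combines static analysis with LLM reasoning in a three-stage pipeline of code decomposition, adversarial verification, and dynamic testing to discover previously unknown vulnerabilities while reducing false positives.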

English abstract

Automated vulnerability discovery in large codebases remains challenging: traditional static analysis produces high false-positive rates, while dynamic approaches such as fuzzing require substantial infrastructure and often target narrow classes of bugs. Recent advances in large language models (LLMs) enable semantic reasoning about program behavior, but applying LLMs to repository-scale security analysis introduces challenges related to context management, cost, and verification. We present OpenAnt, an open-source vulnerability discovery system that integrates static program analysis with LLM-based reasoning in a multi-stage pipeline. OpenAnt introduces three key techniques. First, codebases are decomposed into self-contained analysis units filtered by reachability from external entry points, reducing the analysis surface by up to 97% while preserving attack-relevant code. Second, candidate vulnerabilities undergo adversarial verification through constrained attacker simulation, where the model evaluates exploitability under realistic attacker capabilities. Third, findings are validated through dynamic verification, in which exploit environments are generated automatically, executed in sandboxed containers, and discarded after use. Evaluation on widely used open-source projects including OpenSSL, WordPress, and Flowise shows that this architecture can identify previously unknown vulnerabilities while maintaining manageable analysis cost and substantially reducing false positives. Our results suggest that closed-loop vulnerability discovery pipelines, combining semantic reasoning with exploit validation, provide a practical path toward scalable automated security analysis. OpenAnt is released as open source under the Apache 2.0 license at https://github.com/knostic/OpenAnt.

2503.04507 2026-06-19 q-bio.QM cs.CG cs.LG Cross-list

The Morse Transform for Discrete Shape Analysis

Alexander M. Tanaka, Aras T. Asaad, Richard Cooper, Vidit Nanda

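AI summary: Proposes a topological transform based on directional piecewise-linear Morse theory that quantifies the geometry of embedded objects by recording critical points across multiple height functions; the resulting feature vectors achieve the highest mean AUROC in ligand-based virtual screening.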

Comments 37 pages, 3 main figures, 2 main tables, 12 appendix figures and 4 appendix tables

English abstract

The geometry of an object plays a vital role in modulating its interactions with the physical world. It nevertheless remains difficult to describe geometric information numerically for the purposes of statistical inference or classification tasks. Here, we introduce a new topological transform which leverages directional piecewise-linear Morse theory to quantify the geometry of an embedded object by cataloguing critical points across multiple height-functions. The output of this Morse transform records both the heights and the local topological type (peak, trough or saddle) of the critical points that characterise the underlying shape, retaining finer information than the Euler characteristic transform whilst naturally prioritising a shape's outermost regions. Crucially, this output can be further compressed into a rich but compact feature vector. We benchmark the Morse feature vector as a descriptor for ligand-based virtual screening (LBVS), which intrinsically depends on the shape of molecules. Under a common gradient-boosted tree classification pipeline, Morse descriptors achieve the highest mean AUROC when compared to other topological transform descriptors and to standard shape-based LBVS descriptors.

2606.20435 2026-06-19 econ.EM New submission

Choosing A Headline Estimand from Matching, DID, and Hybrid Designs: A Minimax-Regret Approach

Yechan Park, Yuya Sasaki

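AI summary: Shows that, when estimating causal effects with panel data, the estimand of the hybrid design (DIDM) lies between those of matching (M) and difference-in-differences (DID) under stated conditions and is the minimax-regret choice under a broad class of loss functions; recommends reporting DIDM as the headline estimate with matching and DID as bounds.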

English abstract

Researchers using panel data to estimate causal effects routinely choose among three approaches to using past outcomes: difference-in-differences (DID), conditioning on lagged outcomes (matching, M), and a hybrid that does both (DIDM). The corresponding identifying assumptions are non-nested, leaving little guidance on which to report. We give conditions under which the corresponding estimands are ordered, with DIDM bracketed between matching and DID. This makes DIDM the minimax-regret choice among the three under a broad class of loss functions. We recommend reporting DIDM as the headline estimate, with matching and DID as bounds. We illustrate in applications.
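
The bracketing argument in this abstract can be illustrated with a toy calculation. This is a simplified sketch, not the authors' formal result: it assumes the true effect equals one of the three estimands and uses absolute-error loss, and the function names and numbers are hypothetical.

```python
def max_regret(choice, candidates, loss=abs):
    """Worst-case loss of reporting `choice` when the truth may be any candidate."""
    return max(loss(choice - truth) for truth in candidates)

def minimax_regret_choice(estimates, loss=abs):
    """Return the label whose estimate minimises the worst-case loss."""
    values = list(estimates.values())
    return min(estimates, key=lambda name: max_regret(estimates[name], values, loss))

# Illustrative numbers only: DIDM lies between matching (M) and DID.
estimates = {"M": 0.10, "DID": 0.40, "DIDM": 0.22}
headline = minimax_regret_choice(estimates)
```

With these illustrative numbers the middle estimate has a worst-case loss of about 0.18, against about 0.30 for either extreme, so `headline` is `"DIDM"`. Any loss that increases with distance gives the same ranking in this toy setting.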

2606.20286 2026-06-19 econ.EM New submission

Institutions, Inputs, and Agricultural Growth in China: Revisiting Several Controversies, 1949–1986

Jiyuan Lyu

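AI summary: Uses a single dataset and econometric methods to revisit four controversies about China's agricultural growth: the price scissors, heavy industrial investment, the 1978 reforms, and the effect of decollectivization on irrigation.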

English abstract

Scholarly debates on China's agricultural growth between 1949 and 1986 continue to differ over the extent of the price scissors, the effect of heavy industrial investment, the role of the 1978 reforms, and the impact of decollectivization on irrigation. Using a single dataset and complementary econometric methods, this paper addresses each of these controversies. The results show that 1952–1957 was the only net extraction period across all three channels, after which the state channelled a net inflow of about 168.6 billion yuan into agriculture via fiscal and credit instruments. Heavy industrial investment exerted a significant positive lagged effect on agriculture, while the contemporaneous negative correlation stemmed from the zero-sum nature of the investment share indicator. The input-output elasticity shifted abruptly in 1970, and collective agricultural loans broke in 1971, both pointing to the rectification effects of the North China Agricultural Conference. Disaster prevention capacity fell from 0.70 under the collective era to 0.53 after household contracting, mainly because the collective maintenance system collapsed rather than because state investment declined. After 1979 the price elasticity of agricultural supply approached zero, suggesting that the 1979 procurement price increase acted more like a one-off recalibration than a sustained marginal incentive.

2606.19972 2026-06-19 econ.EM New submission

Biodiversity Media Narratives and Stock Market Performance: Evidence from Europe

Andres Azqueta-Gavaldon, Ben Jabeur Sami, Leila Hedhili

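AI summary: Builds biodiversity media risk indicators for France, Germany, Italy, and Spain over 2015-2025 from the GDELT Global Knowledge Graph; panel Granger causality tests and an augmented inverse probability weighting event study indicate that biodiversity risk significantly reduces stock prices, and that the positive effects of low-risk episodes outweigh the negative effects of high-risk episodes.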

English abstract

This study constructs novel biodiversity-related media risk indicators for France, Germany, Italy, and Spain over 2015-2025, capturing media attention to biodiversity threats using the GDELT Global Knowledge Graph. Using panel Granger causality tests and an augmented inverse probability weighting (AIPW) event-study design, we find highly significant evidence that biodiversity risk reduces stock prices, with effects peaking between 3 and 10 months after a shock. Moreover, we uncover a marked asymmetry whereby the positive effects of low biodiversity risk episodes outweigh the negative effects of high-risk episodes. Results are robust across quantiles of the return distribution and hold when controlling for European equity market volatility and economic policy uncertainty. Our findings provide the first evidence that biodiversity media narratives drive stock market valuations in Europe.

2606.20478 2026-06-19 eess.AS New submission

Beyond Speaker Independence: Evaluating Cross-Lingual Acoustic-to-Articulatory Inversion Across Finnish and Russian

Ruchi Pandey, Tomi Kinnunen

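AI summary: Systematically evaluates acoustic-to-articulatory inversion (AAI) under cross-speaker and cross-language domain shift using FROST-EMA, a newly constructed Finnish-Russian bilingual EMA corpus, comparing articulatory targets, acoustic front-ends, and inversion back-ends; cross-gender degradation is moderate (about 0.05-0.10) and cross-language degradation is larger (about 0.10-0.20).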

English abstract

Acoustic-to-articulatory inversion (AAI) remains challenging under domain shifts where changes in speaker attributes and cross-language conditions often degrade performance. We conduct a systematic evaluation under such shifts and establish baseline benchmarks on FROST-EMA, a Finnish-Russian bilingual EMA corpus. FROST-EMA addresses the English bias and limited speaker diversity of existing resources. We benchmark (i) articulatory targets (raw EMA coordinates vs tract variables), (ii) acoustic front-ends (MFCC vs SSL features), and (iii) inversion back-ends (BiLSTM vs a lightweight attention-based sequence model). We further define evaluation protocols for cross-gender transfer (within language) and cross-language transfer (within gender). The results indicate that cross-gender mismatch introduces moderate Pearson correlation declines (approximately 0.05 to 0.10) relative to the in-domain baseline, whereas cross-language mismatch causes larger drops (approximately 0.10 to 0.20).

2606.20450 2026-06-19 eess.SP New submission

Max-Min Rate Fairness Optimization for Multi-User Pinching-Antenna NOMA Systems

Mahmoud AlaaEldin, Amy Inwood, Xidong Mu, Michail Matthaiou

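AI summary: For multi-waveguide pinching-antenna NOMA downlink systems, proposes a two-stage optimization framework that jointly optimizes antenna positions and precoding to maximize the minimum user rate, significantly improving performance.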

English abstract

Pinching-antenna systems (PASs) can overcome signal blockage by repositioning dielectric radiating elements, called pinching antennas (PAs), along meter-scale waveguides to create line-of-sight links. Since each waveguide is driven by a single radio-frequency (RF) chain, non-orthogonal multiple access (NOMA) is well suited for PAS-based multi-user communications. This paper studies a PAS-enabled multi-user downlink NOMA system with multiple waveguides, each equipped with multiple PAs. The PA positions and base-station transmit precoding are jointly optimized to maximize the minimum user rate. The resulting problem is highly non-smooth and non-convex because of the rapidly oscillating coherent sums caused by inter-PA interference. To tackle this challenge, we propose a two-stage structured optimization framework. In the first stage, coarse PA-position and power-allocation optimization is performed using an interior-point algorithm while neglecting the PA channel phases, which gives solutions near the true optima. In the second stage, PA positions and transmit precoding are fine-tuned while accounting for the PA channel phase shifts. This stage first applies phase zeroing, where each PA is locally repositioned to align the corresponding channel phase toward zero and promote constructive coherent combining. It then uses an alternating procedure that iteratively performs forward-backward PA position refinement and successive-convex-approximation-based complex transmit precoding optimization until convergence, thereby reducing residual phase mismatch. Simulation results show that the proposed framework significantly outperforms heuristic optimization benchmarks with much lower computational time. They also demonstrate large gains over a comparable multiple-input multiple-output downlink NOMA system and reveal the impact of the number of PAs, users, and transmit power on system performance.

2606.20338 2026-06-19 eess.AS New submission

Stuttering Classification and Segmentation with Attention-Based Multiple Instance Learning

Petar Sušac, Sebastian P. Bayerl, Hrvoje Džapo

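AI summary: Proposes multiple instance neural networks built on fine-tuned wav2vec 2.0, WavLM, and Whisper encoders that use clip-level data for frame-level stuttering classification and segmentation, improving frame-level F1 by 23%.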

Comments Accepted at Interspeech 2026

English abstract

Stuttering detection and classification using deep learning methods has the potential to improve the process of stuttering severity assessment. Most stuttering classification datasets provide clip-level labels, making them unsuitable for fine-grained frame-level classification needed to determine the duration of individual stuttering dysfluencies. To overcome this challenge, we present a multiple instance neural network architecture based on fine-tuned wav2vec 2.0, WavLM and Whisper encoders. We apply instance- and embedding-based multiple instance learning approaches to train models on a clip-level dataset for both clip-level and frame-level stuttering classification tasks. Our results show a 23% improvement in frame-level F1 score and between 2% and 9% in clip-level F1 score, demonstrating the ability of our models to utilize clip-level data for frame-level segmentation.
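
How clip-level labels can still yield per-frame information is easiest to see with attention pooling. The abstract does not give the pooling formula, so the sketch below assumes the commonly used attention form, score_t = w^T tanh(V h_t), applied to random stand-in embeddings; `attention_mil_pool`, `V`, and `w` are hypothetical names, and in practice the parameters would be learned.

```python
import numpy as np

def attention_mil_pool(frames, V, w):
    """Pool frame embeddings (T, d) into one clip embedding (d,).

    Scores use the form w^T tanh(V h_t); the softmax weights can be read
    as per-frame relevance for the clip-level decision.
    """
    scores = np.tanh(frames @ V.T) @ w      # shape (T,)
    scores = scores - scores.max()          # numerical stability before exp
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ frames, weights

rng = np.random.default_rng(0)
T, d, k = 50, 16, 8                 # frames, embedding dim, attention dim (illustrative)
frames = rng.normal(size=(T, d))    # stand-in for encoder outputs
V = rng.normal(size=(k, d))         # would be learned in practice
w = rng.normal(size=k)              # would be learned in practice
clip_embedding, frame_weights = attention_mil_pool(frames, V, w)
```

Only `clip_embedding` needs a clip-level label for training, while `frame_weights` gives one value per frame that could be thresholded for segmentation. That corresponds to an embedding-based variant; an instance-based variant would classify each frame and then aggregate the frame scores.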

2606.20266 2026-06-19 eess.AS New submission

Transcript-Free Flow-Matching Text-to-Speech via Speech Feature Conditioning

SooHwan Eom, Hee Suk Yoon, Eunseop Yoon, Mark Hasegawa-Johnson, Chang D. Yoo

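AI summary: Proposes RTFree-F5, which replaces the reference transcript with self-supervised speech representations mapped into the F5-TTS text-conditioning space through a lightweight adapter, removing the dependence on an external ASR system and reducing WER on dysarthric speech from 24.6% to 10.4%.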

Comments Accepted to Interspeech 2026

English abstract

Recent flow-matching text-to-speech (TTS) models, such as F5-TTS, rely on a reference transcript at inference time, obtained from an external ASR system. This dependency makes zero-shot TTS brittle for accented or dysarthric speakers, precisely the scenarios where it is most needed. Moreover, we find that text-based reference conditioning can propagate atypical acoustic patterns from atypical speech into synthesis, even when ground-truth transcripts are available. To address this, we propose RTFree-F5, which replaces the reference transcript with continuous self-supervised speech representations mapped into F5-TTS's text-conditioning space via a lightweight adapter, while reusing the pretrained checkpoint. On dysarthric speech, RTFree-F5 reduces WER from 24.6% to 10.4%, surpassing even the ground-truth reference transcript baselines, while improving naturalness and remaining competitive on standard benchmarks without requiring any reference transcript.

2606.20222 2026-06-19 eess.SP New submission

Reliable ORIS-assisted FSO Communications via HARQ

Georgios D. Chondrogiannis, Athanasios P. Chrysologou, Vasilis K. Papanikolaou, Alexandros-Apostolos A. Boulogeorgos, Nestor D. Chatzidiamantis, Robert Schober

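AI summary: Studies a free-space optical link that combines an optical reconfigurable intelligent surface (ORIS) with hybrid automatic repeat request (HARQ), derives a statistical model of the end-to-end channel, gives closed-form outage probability for HARQ-CC and outage upper bounds for HARQ-IR, and analyzes diversity order and delay.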

Comments 13 pages, 8 Figures, Journal

English abstract

This paper studies a free-space optical (FSO) link assisted by an optical reconfigurable intelligent surface (ORIS) and enhanced by a hybrid automatic repeat request (HARQ) scheme. The ORIS creates a virtual line-of-sight path around obstacles, while HARQ recovers frames corrupted by turbulence, pointing jitter, and geometric loss through retransmission and combining. We first derive a tractable statistical model for the end-to-end transmitter-ORIS-receiver (Tx-ORIS-Rx) reflected channel by jointly accounting for atmospheric turbulence, ORIS-induced pointing errors, and geometric attenuation. Building on these results, we obtain closed-form outage probability (OP) expressions for HARQ with Chase combining (HARQ-CC) and analytical outage upper bounds for HARQ with incremental redundancy (HARQ-IR), valid for an arbitrary maximum number of transmission rounds. We further conduct a high signal-to-noise ratio (SNR) analysis that provides a thorough characterization of the outage behavior and reveals the diversity order of both schemes. In addition, we characterize the delay behavior of the truncated HARQ process through the mean number of transmission rounds and the conditional mean number of rounds given successful decoding. Finally, numerical and Monte Carlo results validate the proposed analysis and show that HARQ substantially improves ORIS-assisted FSO reliability, with HARQ-IR achieving lower outage and delay than HARQ-CC, even for a small number of retransmission rounds.

2606.20011 2026-06-19 eess.SP New submission

Amplitude-Phase-Frequency Block Modulation for OFDM-ISAC with SI-Free PAPR Reduction and Pilotless Sensing

Bensheng Yang, Min Fan, Haitao Zhao, Haiming Wang

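AI summary: Proposes an amplitude-phase-frequency block modulation scheme that, through Stokes-sphere mapping and grouped phase optimization, integrates communication and sensing in OFDM without resource partitioning while reducing PAPR and eliminating the need for pilots.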

English abstract

Orthogonal Frequency Division Multiplexing (OFDM)-based integrated sensing and communication systems demand a unified waveform that simultaneously supports reliable data transmission, low peak-to-average power ratio (PAPR), and accurate channel sensing. Existing approaches multiplex communication and sensing across separate time or frequency resources, or rely on dedicated pilots for channel estimation, limiting system flexibility and increasing overhead. This paper proposes an amplitude-phase-frequency block modulation (APFBM) scheme for OFDM that achieves waveform-level integration of communication and sensing without resource partitioning. Information symbols are represented on the Stokes sphere and mapped to energy-normalized Jones vectors through an unambiguous rule that establishes a deterministic phase reference per block. This mapping exposes a common-phase degree of freedom inherent in the signal structure. At the transmitter, a grouped phase optimization algorithm exploits this structural freedom to reduce the PAPR without side information (SI). At the receiver, the same deterministic phase structure enables a Viterbi-based maximum-likelihood (ML) sequence detection algorithm that jointly recovers the optimization phases and estimates the block-wise channel amplitude and phase. No dedicated sensing pilots are required, as the sensing observables are extracted directly from the communication waveform. Closed-form error-rate and sensing-accuracy expressions are derived. Numerical simulations and over-the-air measurements on a software-defined radio link confirm effective PAPR reduction, accurate channel sensing, reliable phase recovery, and stable channel state information reconstruction. The proposed scheme trades a moderate reduction in spectral efficiency for a unified waveform design that simultaneously delivers SI-free PAPR reduction and pilotless sensing.

2606.20001 2026-06-19 eess.AS New submission

Time-Unconditional Generative Speech Enhancement via Autonomous Rectified Flow

Wen Zhang, Wenbin Jiang, Yang Zhang, Xiaofei Zhou

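AI summary: Proposes the Autonomous Rectified Flow framework, which shows via a linear interpolation path that the target vector field is time-invariant and designs a time-unconditional network that infers the denoising direction from spatial relationships alone, significantly improving generation quality, robustness, and inference efficiency.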

English abstract

Most generative speech enhancement methods rely on explicit time-step embeddings for temporal conditioning. In this paper, we propose the Autonomous Rectified Flow framework, which challenges the necessity of such conditioning. Using a linear interpolation path, we show that the target vector field is inherently time-invariant. We further introduce a time-unconditional network that eliminates explicit time-step information and infers the denoising direction solely from the spatial relationship between the current state and the noisy observation. Predicting this target vector field is equivalent to modeling the noise distribution. By avoiding overfitting to temporal trajectories, the proposed autonomous design significantly improves generation quality, robustness, and inference efficiency.
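
The time-invariance claim has a simple per-sample reading: along a straight-line path between two fixed endpoints, the velocity is the same at every time. The sketch below checks this numerically under assumptions of mine rather than the paper's setup: the path is x_t = (1 - t) * x0 + t * x1, the endpoints are random stand-ins, and the abstract does not say which endpoint is the clean or the noisy signal.

```python
import numpy as np

def linear_path(x0, x1, t):
    """Point on the straight-line path from x0 (t=0) to x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def finite_difference_velocity(x0, x1, t, eps=1e-3):
    """Central-difference estimate of d/dt linear_path at time t."""
    return (linear_path(x0, x1, t + eps) - linear_path(x0, x1, t - eps)) / (2 * eps)

rng = np.random.default_rng(0)
x0 = rng.normal(size=8)   # stand-in for the starting sample
x1 = rng.normal(size=8)   # stand-in for the target sample
velocities = [finite_difference_velocity(x0, x1, t) for t in (0.1, 0.5, 0.9)]
```

Each entry of `velocities` matches `x1 - x0` up to floating-point error, so for a fixed pair the regression target carries no dependence on `t`. This covers only the single-pair case and does not reproduce the paper's argument about the learned field.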

2606.19974 2026-06-19 eess.AS New submission

Interpreting Content and Speaker Characteristics in Factorised Self-Supervised Subspaces

Kyle Janse van Rensburg, Herman Kamper

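AI summary: Decomposes WavLM features via SVD into a content matrix and speaker transformations, finding that the content space mainly encodes intensity, formants, and voicing while the speaker space is strongly associated with pitch and gender, and that both can be used for fine-grained control in speech synthesis.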

Comments 7 pages, 4 figures

English abstract

Self-supervised speech features encode both content and speaker information. Recent work introduced an SVD-based factorisation that decomposes these features into a shared content matrix capturing temporal variation and speaker-specific transformations capturing static speaker characteristics. However, how information is organised within these components remains unclear. In this paper, we investigate how the dimensions of WavLM-factorised content and speaker subspaces correlate with speech characteristics such as pitch, intensity, and voicing. We find that leading dimensions in the content space primarily capture intensity, higher-order formants, and voicing, while pitch is encoded in a later dimension. In contrast, the highest-variance speaker dimension is strongly associated with pitch and gender, with later dimensions capturing high-frequency variation. Intervention experiments show that manipulating these dimensions enables targeted control of speech characteristics for speech synthesis. Furthermore, modifying the content and speaker representations jointly provides fine-grained control over characteristics such as pitch and intensity.

2606.19953 2026-06-19 eess.SP 新提交

ConsisFormer: Compute-Efficient Transformer for Wireless Foundation Models Based on Channel Consistency

ConsisFormer: 基于信道一致性的无线基础模型高效计算Transformer

Yuwei Wang, Li Sun, Tingting Yang, Liwen Jing, Yuxuan Shi, Maged Elkashlan, Mérouane Debbah

AI总结 提出ConsisFormer,利用无线信道短时一致性,通过自适应令牌聚合和特征序列插值降低Transformer计算复杂度,在多种任务上减少83%以上计算量且性能损失极小。

详情
AI中文摘要

无线基础模型(WFM)最近成为AI原生6G网络的一种有前景的范式,能够实现适应各种通信和感知任务的通用信道表示。现有的WFM主要基于Transformer架构,该架构提供了优越的性能,但计算复杂度与输入序列长度的平方成正比,这对其在严格推理延迟约束下的部署构成了重大障碍。为了解决这个问题,本文提出ConsisFormer,一种基于无线信道短时一致性的高效计算Transformer设计,作为WFM的骨干网络。利用相邻时间或频率实例共享相似的散射体簇并因此表现出相似信道特性的观察,我们开发了自适应令牌聚合(ATA)模块,动态合并相邻信道状态信息(CSI)令牌,从而减少自注意力计算中涉及的令牌序列长度以降低计算成本。此外,我们提出了一种特征序列插值(FSI)方法,基于Transformer块输出的稀疏特征序列恢复完整的CSI表示,从而在保持性能不受影响的同时确保低复杂度。此外,我们提出了一种用于WFM的聚合自编码器(AAE)预训练范式,通过压缩和恢复从稀疏化CSI令牌中学习鲁棒的信道表示。仿真结果表明,所提出的设计将WFM的计算复杂度降低了83%以上,同时在包括信道预测、视距/非视距分类、波束预测和定位在内的各种任务上性能损失极小。

英文摘要

Wireless foundation models (WFMs) have recently emerged as a promising paradigm for AI-native 6G networks, enabling universal channel representations adaptable to diverse communication and sensing tasks. Existing WFMs are predominantly built upon the Transformer architecture, which delivers superior performance but incurs computational complexity proportional to the square of the input sequence length, posing a significant barrier to their deployment under stringent inference latency constraints. To address this issue, in this paper, we propose ConsisFormer, a compute-efficient Transformer design based on short-term consistency of wireless channels, as a WFM backbone. By utilizing the observation that adjacent time or frequency instances share similar clusters of scatterers and thus exhibit similar channel characteristics, we develop an adaptive token aggregation (ATA) module to dynamically merge neighboring channel state information (CSI) tokens, thereby reducing the length of the token sequence involved in self-attention calculations to lower the computational cost. Furthermore, we propose a feature sequence interpolation (FSI) method to recover the full CSI representation based on the sparse feature sequence output by the Transformer blocks, thus keeping the performance unaffected while ensuring low complexity. Moreover, we propose an aggregated auto-encoder (AAE) pre-training paradigm for WFMs, enabling robust channel representation learning from sparsified CSI tokens via compression and recovery. Simulation results show that the proposed design reduces the computational complexity of WFM by over $83\%$ with negligible performance loss on various tasks including channel prediction, LoS/NLoS classification, beam prediction, and localization.
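The idea of merging neighboring tokens and interpolating the features back can be sketched as follows. This is a simplified stand-in, not the paper's ATA or FSI modules: it uses fixed-size averaging windows instead of adaptive merging and plain linear interpolation, and the helper names `aggregate` and `interpolate` and the toy "CSI" sequence are hypothetical.

```python
import numpy as np

def aggregate(tokens, r):
    """Merge every r neighboring tokens by averaging (fixed windows; the paper's ATA is adaptive)."""
    n, d = tokens.shape
    assert n % r == 0, "sketch assumes the sequence length is a multiple of r"
    return tokens.reshape(n // r, r, d).mean(axis=1)

def interpolate(sparse, n):
    """Linearly interpolate a sparse feature sequence back to n positions."""
    m, d = sparse.shape
    src = (np.arange(m) + 0.5) * (n / m) - 0.5   # centers of the merged windows
    dst = np.arange(n)
    return np.stack([np.interp(dst, src, sparse[:, j]) for j in range(d)], axis=1)

n, d, r = 64, 4, 4
t = np.linspace(0.0, 1.0, n)[:, None]
# Smooth toy sequence standing in for slowly varying ("consistent") CSI tokens.
tokens = np.hstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t, t**2])
sparse = aggregate(tokens, r)
restored = interpolate(sparse, n)
# The number of pairwise attention scores scales with the squared sequence length.
attention_cost_ratio = (sparse.shape[0] / n) ** 2
```

With a merge factor of 4, the pairwise-score count drops to 1/16 of the original in this toy setting. The overall saving reported in the abstract depends on the full model and is not reproduced here.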

2606.19940 2026-06-19 eess.AS 新提交

Analyzing Language and Geographical Variation in Speech Representations Across 60 Indic Languages

分析60种印度语言语音表征中的语言和地理变异

Pavan Kumar J, Agneedh Basu, Pranav Bhat, Sujith Pulikodan, Visruth Sanka, Nihar Desai, Prasanta Kumar Ghosh

AI总结 研究通过联合语言-地区监督微调Whisper-base和Wav2Vec2.0,发现该方法在保持语言分类能力的同时,提升了嵌入空间中地区区分度,并利用归一化条件互信息分析了嵌入结构。

详情
AI中文摘要

自监督语音编码器通常使用语言监督进行微调,这可能会忽略地理变异。为了理解在语言和地区联合监督下与仅语言监督下学习到的表征差异,我们微调Whisper-base和Wav2Vec2.0进行联合语言-地区分类(386类)和仅语言分类(60类)任务。语言-地区监督在嵌入空间中改善了条件于语言的地区区分度,同时保持了较强的边缘语言分类能力。我们使用归一化条件互信息(NCMI)分析学习到的嵌入结构,表明语言-地区监督产生了全局语言簇,并在语言内部形成了与地区变异对齐的结构化子簇,从而在不降低语言层面组织的情况下增强了地理可分离性。

英文摘要

Self-supervised speech encoders are often fine-tuned with language supervision, which can overlook geographical variation. To understand how representations learned under joint language and district supervision differ from those learned under language-only supervision, we fine-tune Whisper-base and Wav2Vec2.0-base on joint language-district classification (386 classes) and language-only classification (60 languages). Language-district supervision improves district discrimination conditioned on language in the embedding space while maintaining strong marginal language classification. We analyze the structure of the learned embeddings using Normalized Conditional Mutual Information (NCMI), showing that language-district supervision produces global language clusters with structured within-language subclusters aligned to district variation, enhancing geographical separability without degrading language-level organization.

2606.19724 2026-06-19 eess.SP 新提交

Cyclic-Prefix OFDM Probing for Spatial-ISI-Free Distributed Acoustic Sensing via Frequency-Domain Channel Reconstruction

基于频域信道重构的循环前缀OFDM探测实现无空间ISI分布式声学传感

Huan Huang, Zhiyang Xue, Ziang Chen, Zhongxing Tian, Dongdong Zou, Gangxiang Shen, Yi Cai

AI总结 提出使用循环前缀正交频分复用(CP-OFDM)波形作为传感探头,通过频域信道重构消除匹配滤波脉冲压缩中的空间符号间干扰(ISI),实现无空间ISI的分布式声学传感,并同时恢复通信数据,展示共享波形集成感知与通信(ISAC)。

Comments This manuscript has been submitted for possible publication

详情
AI中文摘要

基于匹配滤波的脉冲压缩分布式声学传感(DAS)存在非零压缩旁瓣,导致确定性距离单元间泄漏,即空间符号间干扰(ISI),并在重建的瑞利背向散射迹中产生虚假响应。我们提出一种用于$\phi$-OTDR的循环前缀正交频分复用(CP-OFDM)DAS系统,使用承载数据的CP-OFDM波形作为传感探头。该系统还恢复前向通信数据,初步展示了共享波形集成感知与通信(ISAC)。据我们所知,这是首次将分布式瑞利背向散射建模为有限记忆传感多径信道。基于该模型,我们证明,如果有用OFDM和CP长度覆盖传感多径记忆,则去除CP、单抽头频域均衡和逆离散傅里叶变换可重建每个距离单元系数,且无确定性波形引起的空间ISI,从而实现无空间ISI的相位解调。在模拟的5.2公里链路上,组内间隔5.31–5.83米的十个同时强、弱事件,所提接收机抑制了事件外泄漏,并将相位迹均方误差相比匹配滤波脉冲压缩提升高达29.55 dB。在5.2公里光纤链路的相干外差实验中,占用带宽111.984 MHz,在5 V和1 V驱动下,500 Hz PZT振动分别被盲定位在5.071公里和5.066公里处,其波形恢复的相关系数分别为0.990和0.962。同一承载数据探头还恢复了一幅图像,误码率为零,误差矢量幅度中位数为-23.14 dB。这些结果验证了CP-OFDM辅助的频域信道重构用于无空间ISI的DAS,并展示了其在共享波形光纤ISAC中的潜力。

英文摘要

Matched-filter-based pulse-compression distributed acoustic sensing (DAS) suffers from nonzero compression sidelobes that cause deterministic inter-range-bin leakage, i.e., spatial inter-symbol interference (ISI), and false responses in reconstructed Rayleigh-backscatter traces. We propose a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) DAS system for $\phi$-OTDR, using a data-bearing CP-OFDM waveform as the sensing probe. It also recovers forward communication data, providing an initial demonstration of shared-waveform integrated sensing and communication (ISAC). To our knowledge, this is the first formulation of distributed Rayleigh backscattering as a finite-memory sensing multipath channel. Based on this formulation, we prove that, if the useful OFDM and CP lengths cover the sensing multipath memory, CP removal, one-tap frequency-domain equalization, and inverse discrete Fourier transform reconstruct each range-bin coefficient without deterministic waveform-induced spatial ISI, enabling spatial-ISI-free phase demodulation. For a simulated 5.2-km link with ten simultaneous strong and weak events spaced by 5.31–5.83 m within groups, the proposed receiver suppresses off-event leakage and improves phase-trace mean-square error by up to 29.55 dB over matched-filter pulse compression. In a heterodyne coherent experiment over a 5.2-km fiber link with 111.984-MHz occupied bandwidth, 500-Hz PZT vibrations are blindly localized at 5.071 and 5.066 km under 5- and 1-V drives, respectively, and their waveforms are recovered with correlation coefficients of 0.990 and 0.962. The same data-bearing probe also recovers an image with zero measured bit-error rate and a median error vector magnitude of -23.14 dB. These results validate CP-OFDM-aided frequency-domain channel reconstruction for spatial-ISI-free DAS and demonstrate its potential for shared-waveform optical-fiber ISAC.
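The reconstruction chain named in the abstract (CP removal, one-tap frequency-domain equalization, inverse DFT) can be illustrated in a noise-free toy setting. The sizes, the random channel taps standing in for per-range-bin backscatter coefficients, and the QPSK mapping are all assumptions made for this sketch; the actual system parameters are not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sub, n_cp, mem = 64, 16, 12            # subcarriers, CP length, channel taps (toy values)

# Hypothetical finite-memory "sensing channel": one complex coefficient per range bin.
h = rng.standard_normal(mem) + 1j * rng.standard_normal(mem)

# Data-bearing QPSK symbols on every subcarrier (unit magnitude, so division is safe).
bits = rng.integers(0, 4, n_sub)
X = np.exp(1j * (np.pi / 4 + np.pi / 2 * bits))

x = np.fft.ifft(X)                        # time-domain OFDM symbol
tx = np.concatenate([x[-n_cp:], x])       # prepend the cyclic prefix
rx = np.convolve(tx, h)                   # linear convolution with the channel
y = rx[n_cp:n_cp + n_sub]                 # CP removal leaves a circular convolution

H_hat = np.fft.fft(y) / X                 # one-tap frequency-domain equalization
h_hat = np.fft.ifft(H_hat)                # IDFT back to per-range-bin coefficients
```

Because the CP (16 samples) is at least as long as the channel memory minus one (11 samples), the first `mem` entries of `h_hat` match `h` and the remaining entries are numerically zero, so no coefficient leaks into its neighbors in this idealized case.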

2606.19720 2026-06-19 eess.SP 新提交

An Optimization Framework for Certain Separable Problems using Neural Networks

基于神经网络的特定可分离问题优化框架

Rohit Negi, Soummya Kar

AI总结 针对参数可分离的约束优化问题,提出离线学习与在线处理两阶段策略,利用ADMM和神经网络降低在线计算复杂度。

Comments 15 pages, 5 figures

详情
AI中文摘要

本文研究一类由实时应用驱动的参数约束优化问题。在参数可分离问题结构下,提出基于离线学习和在线处理的两阶段策略,以在资源受限设备上解决这些优化问题。具体地,利用可分离结构,开发了基于交替方向乘子法(ADMM)的迭代求解过程,该过程允许使用基于学习的函数表示(离线学习但在线可快速计算)来降低整体在线设备实现复杂度。通过精心设计ADMM过程,表明即使参数变化,参数优化问题的相应实例也可通过设备上的轻量级在线计算,借助神经网络协处理器求解。

英文摘要

This paper studies a class of parametric constrained optimization problems motivated by real-time applications. Under a parameter-separable problem structure that naturally arises in these applications, the paper proposes a two-phase strategy, based on offline learning and online processing, to address these optimization problems on resource-limited devices. Specifically, by exploiting the separable structure, an iterative solution procedure based on the Alternating Direction Method of Multipliers (ADMM) is developed that enables the use of certain learning-based function representations (learned offline but readily computable online) to reduce the overall online on-device implementation complexity. By carefully crafting the ADMM procedure, it is shown that even as the parameters vary, the corresponding instances of the parametric optimization problem may be solved by lightweight online computations on the device with the assistance of a neural network co-processor.
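For readers unfamiliar with ADMM, a minimal sketch on a generic problem may help. The example below solves nonnegative least squares by splitting the variable into `x` and `z`; it is not the paper's problem class. The `z_step` argument is a hypothetical hook marking where a function learned offline could be substituted, which is an assumption about how such a scheme might be wired rather than a description of the authors' design.

```python
import numpy as np

def admm_nonneg_ls(A, b, rho=1.0, iters=300, z_step=None):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0 via ADMM with the split x = z."""
    n = A.shape[1]
    if z_step is None:
        # Exact projection onto the nonnegative orthant; a learned map could replace it.
        z_step = lambda v: np.maximum(v, 0.0)
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                       # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)       # fixed across iterations, could be factored offline
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))   # smooth quadratic subproblem
        z = z_step(x + u)                               # constraint subproblem
        u = u + x - z                                   # dual update
    return z

solution = admm_nonneg_ls(np.eye(3), np.array([1.0, -2.0, 3.0]))
```

With an identity matrix the constrained minimizer is simply the data vector clipped at zero, which makes the result easy to check by hand.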

2606.19666 2026-06-19 eess.SP 新提交

Degrees of Freedom and Beamforming for Large Intelligent Surfaces

大规模智能表面的自由度与波束赋形

Jiawang Li, Alireza Saberkari, Buon Kiong Lau, Mats Gustafsson

AI总结 通过互阴影面积闭式表达式估计大规模智能表面(LIS)的空间自由度(DoF),并验证其与数值奇异值谱的吻合;基于DoF分析设计采样方案和波束赋形,证明可形成约DoF数量的独立波束,超过此限会导致干扰增加;极化研究表明电场分量对DoF贡献不均,总场DoF为单极化分量的两倍。

详情
AI中文摘要

空间自由度(DoF)、采样和波束赋形是多用户大规模智能表面(LIS)的基础,其中电磁场必须在多个近场位置进行成形、分辨和聚焦。本文利用互阴影面积的闭式表达式,针对代表性LIS配置估计了DoF数量。通过数值奇异值谱验证了所得DoF预测,其谱膝点与理论估计紧密吻合。对于线源配置,通过将源或观测线划分为单位DoF区间,开发了一种解析采样方案,从而能够选择空间样本。使用最大比传输和迫零的波束赋形结果表明,可以形成大约DoF数量的独立波束。试图超过此限制会导致干扰增加和性能下降。对于基于表面的LIS配置,采样点则通过离散经验插值方法数值确定。相应的波束赋形结果进一步证实,目标区域可以支持大约与DoF分析预测数量相同的独立波束。最后,一项极化感知研究表明,电场分量对DoF的贡献不相等,且总场DoF是单极化分量DoF的两倍。

英文摘要

Spatial degrees of freedom (DoF), sampling, and beamforming are fundamental to multi-user large intelligent surfaces (LISs), where electromagnetic fields must be shaped, resolved, and focused at multiple near-field locations. This work estimates the number of DoF using closed-form expressions derived from the mutual shadow area for representative LIS configurations. The resulting DoF predictions are validated through numerical singular-value spectra, whose spectral knee points closely match the theoretical estimates. For line-source configurations, an analytic sampling scheme is developed by partitioning the source or observation line into unit-DoF intervals, enabling the selection of spatial samples. Beamforming results using maximum-ratio transmission and zero-forcing demonstrate that the number of independent beams that can be formed is approximately equal to the number of DoF. Attempting to exceed this limit results in increased interference and degraded performance. For surface-based LIS configurations, sampling points are instead determined numerically using the discrete empirical interpolation method. The corresponding beamforming results further confirm that the target region can support approximately as many independent beams as predicted by the DoF analysis. Finally, a polarization-aware study reveals that the electric-field components contribute unequally to the DoF and that the total-field DoF is twice that of a single polarization component.
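The two beamformers mentioned in the abstract can be compared on a toy channel. The sketch below uses a random complex matrix rather than an LIS near-field model, and the sizes and the `leakage` helper are hypothetical; it only illustrates the textbook definitions of maximum-ratio transmission and zero-forcing.

```python
import numpy as np

rng = np.random.default_rng(2)
n_elem, n_users = 16, 4     # toy sizes; the channel below is random, not an LIS near-field model
H = (rng.standard_normal((n_users, n_elem))
     + 1j * rng.standard_normal((n_users, n_elem))) / np.sqrt(2)

W_mrt = H.conj().T                                   # maximum-ratio transmission: matched filter
W_zf = H.conj().T @ np.linalg.inv(H @ H.conj().T)    # zero-forcing: right pseudo-inverse

def leakage(W):
    """Total off-diagonal power of the effective channel H @ W (inter-beam interference)."""
    G = H @ W
    return float(np.sum(np.abs(G - np.diag(np.diag(G))) ** 2))

leak_mrt, leak_zf = leakage(W_mrt), leakage(W_zf)
```

Zero-forcing removes inter-beam leakage only while `H @ H.conj().T` is invertible, that is, while the number of requested beams does not exceed the rank of the channel. This is one way to read the abstract's observation that exceeding the DoF count increases interference.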

2606.19536 2026-06-19 eess.SP 新提交

Multistatic J-Band Radar TX/RX Chipset in SiGe BiCMOS with Integrated x16 Frequency Multiplier Chain and High EIRP

采用SiGe BiCMOS工艺的集成x16倍频链和高EIRP的多基地J波段雷达收发芯片组

Stephan Hauptmeier, Kennet Braasch, Till Ziegler-Bellenberg, Diana P. Cortes N., Tobias T. Braun, Michael Höft, Nils Pohl

AI总结 本文设计并测量了一种多基地J波段雷达芯片组,包含集成x16倍频链的发射和接收MMIC,实现了高EIRP和远距离探测。

详情
AI中文摘要

本文介绍了一种多基地J波段雷达芯片组的设计与测量,该芯片组包括一个发射机和一个接收机MMIC,两者均集成了$\times$16倍频链,用于低频本振分配和可扩展雷达配置。多基地雷达架构可以同时维持高发射功率和高接收灵敏度,这一优势在本芯片组中得到了充分利用。为此,发射机MMIC上集成的四路功率合成放大器链提供了11.2 dBm的输出功率。在292 GHz下,使用准直PTFE透镜时测得的EIRP为41 dBm,无透镜时为8.8 dBm。尽管倍频因子较高,但片上谐波抑制优于24 dBc,而通过多个滤波器级实现了约50 dBc的辐射带内谐波抑制。接收机MMIC包含三级低噪声放大器,在292 GHz下整体转换增益为43.3 dB。集成的片上贴片天线便于系统集成,并可使用高方向性介质透镜,使该芯片组适用于长达150米的远距离雷达测量。MMIC采用130 nm SiGe BiCMOS工艺实现,其f_T和f_max分别为500 GHz和610 GHz。

英文摘要

This work presents the design and measurement of a multistatic J-band radar chipset comprising a transmitter and a receiver MMIC, both featuring an integrated $\times$16 frequency multiplier chain for low-frequency local-oscillator distribution and scalable radar configurations. Multistatic radar architectures can sustain high transmission power and high receiver sensitivity simultaneously, an advantage that is fully leveraged in the present chipset. To this end, a four-way power-combining amplifier chain integrated on the transmitter MMIC delivers an output power of 11.2 dBm. The resulting measured EIRP is 41 dBm at 292 GHz with a collimating PTFE lens and 8.8 dBm without a lens. Despite the high frequency-multiplication factor, an on-chip harmonic rejection better than 24 dBc was measured, while a radiated in-band harmonic rejection of approximately 50 dBc was achieved through multiple filter stages. The receiver MMIC incorporates a three-stage low-noise amplifier and exhibits an overall conversion gain of 43.3 dB at 292 GHz. Integrated on-chip patch antennas facilitate system integration and the use of highly directive dielectric lenses, making the chipset suitable for long-range radar measurements, which are demonstrated up to 150 m. The MMICs are realized in a 130 nm SiGe BiCMOS technology with an f_T and f_max of 500 GHz and 610 GHz, respectively.
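A few quantities follow from the reported figures by simple arithmetic. The sketch below assumes that the difference between the two EIRP values can be read as the effective gain added by the lens, and that the ×16 chain is driven at the output frequency divided by 16; both readings are inferences, not statements from the abstract.

```python
# Figures quoted in the abstract.
F_RF_GHZ = 292.0
MULTIPLIER = 16
EIRP_LENS_DBM = 41.0
EIRP_NO_LENS_DBM = 8.8

# Input frequency implied by a x16 multiplier chain (inference).
lo_ghz = F_RF_GHZ / MULTIPLIER

# EIRP improvement attributed to the lens, assuming nothing else changes (inference).
lens_gain_db = EIRP_LENS_DBM - EIRP_NO_LENS_DBM

# dBm to watts: P[W] = 10^(P[dBm] / 10) / 1000.
eirp_lens_w = 10 ** (EIRP_LENS_DBM / 10) / 1000.0
```

On these assumptions the chain input sits at 18.25 GHz, the lens contributes roughly 32 dB, and the lens-aided EIRP corresponds to a little over 12 W of equivalent isotropic power.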

2606.19453 2026-06-19 eess.AS 新提交

A Survey of Full-Duplex Spoken Dialogue Systems: Architectural Hierarchy, Interaction Ontology, and Decision State Machine

全双工口语对话系统综述:架构层次、交互本体与决策状态机

Jingyu Lu, Yuhan Wang, Jianming Luo, Yifu Chen, Tianle Liang, Shengpeng Ji, Ziyue Jiang, Xiaoda Yang, Yu Zhang, Xize Cheng, Chenyuhao Wen, Changhao Pan, Haoxiao Wang, Chen Ye, Jian Wu, Xiaoxi Jiang, Guanjun Jiang, Zhou Zhao

AI总结 针对全双工术语歧义,提出L0-L3架构层次、T×I×R交互本体和IDLE/LISTEN/SPEAK/WAIT/DUAL决策状态机三个框架,揭示现有系统在训练与评估中的实现差距。

Comments 34 pages, 5 figures, 7 tables. Project page and interactive demo: https://github.com/DuplexLM/DuplexSurvey

详情
AI中文摘要

近期有十余个口语对话系统声称实现了“全双工”,但该术语被用于描述本质上不同的能力。现有综述将它们归入单一轴(级联/端到端,或工程化/学习型),忽略了构建者最关心的区别。我们认为这种歧义很大程度上源于分类学问题:当前术语未明确双工决策在何处做出、支持哪些交互类型、以及系统如何逐时刻行为。本文引入三个互补框架:(i) L0-L3架构层次,定位双工决策位置;(ii) T×I×R交互本体,指定每次交互的时间关系、用户意图和所需系统响应;(iii) 决策状态机(IDLE/LISTEN/SPEAK/WAIT/DUAL),描述系统如何在状态间转换。通过对已发表系统和基准的审计,我们记录了一个实现差距:尽管许多架构原则上能在全双工状态下运行,但其观察到的行为仍受训练和评估中表示的交互模式约束。我们指出,相对于(大多未公开的)工业语料库,有限的公开训练数据覆盖范围,以及尚未实现的L3表示级建模目标,是全双工对话未来研究的关键前沿。相关材料见:https://github.com/DuplexLM/DuplexSurvey

英文摘要

More than a dozen spoken dialogue systems have recently claimed to be "full-duplex," yet the term has been used to describe substantially different capabilities. Existing surveys collapse them onto a single axis (cascaded/end-to-end, or engineered/learned) and miss the distinctions that matter most for builders. We argue that much of this ambiguity is taxonomical: current terminology does not specify where duplex decisions are made, which interaction types are supported, or how a system behaves moment by moment. This paper introduces three complementary frameworks: (i) an L0-L3 Architectural Hierarchy that locates where duplex decisions are made; (ii) a $T\times I\times R$ Interaction Ontology that specifies the temporal relation, user intent, and required system response for each interaction; and (iii) a Decision State Machine (IDLE/LISTEN/SPEAK/WAIT/DUAL) that describes how systems move between states. Across published systems and benchmarks, our audit documents a realization gap: although many architectures can in principle operate in full-duplex states, their observed behavior remains constrained by the interaction patterns represented in training and evaluation. We point to the limited public training-data coverage relative to the (largely undisclosed) industrial corpora, together with the still-unrealized goal of L3 representation-level modeling, as the key frontiers for future research on full-duplex dialogue. The related material is available at https://github.com/DuplexLM/DuplexSurvey.

2606.20240 2026-06-19 econ.EM stat.AP 新提交

Two-Sample IV: Efficient Two-Step Estimation and Tests for Overidentification and Weak-Instruments

两样本IV:高效两步估计及过度识别与弱工具变量检验

Fatima Kasenally, Ruoxi Guan, Frank Windmeijer

AI总结 针对两样本IV估计,提出异方差和样本异质性下稳健的两步高效估计方法及过度识别检验,仅需线性回归的汇总统计量,并扩展弱工具变量检验。

详情
AI中文摘要

两样本IV是一种流行的估计方法,当结果变量和处理变量在不同样本中可用,而工具变量在两个样本中都可用时。标准估计量是两样本两阶段最小二乘估计量,在同方差和样本同质性下是有效的。我们开发了一个稳健的两步程序,用于在一般异方差和样本异质性下进行有效估计,并提出了相关的两样本Hansen过度识别检验。我们方法的一个关键特征是只需要两个样本中简化形式和第一阶段的线性回归的汇总统计量。这些是估计系数向量的六个对象,以及同方差和异方差稳健的估计方差矩阵。我们进一步表明,在同方差和同质性下,处理样本中的第一阶段F统计量可以按标准方式用作弱工具变量检验,这里的相对偏差是比例偏差。我们提出了Montiel-Olea和Pflueger (2013)的有效F统计量的扩展,用于异方差情况,遵循Windmeijer (2025)的推广。我们在Marshall (2019)研究教育对投票行为影响的应用中说明了估计量和检验,并进行了聚类稳健推断。

英文摘要

Two-sample IV is a popular estimation method when the outcome and treatment variables are available in different samples, whereas instruments are available in both samples. The standard estimator is the two-sample two-stage least squares estimator, which is efficient under homoskedasticity and homogeneity of the samples. We develop a robust two-step procedure for efficient estimation under general heteroskedasticity and heterogeneity of the samples, and propose a related two-sample Hansen overidentification test. A key feature of our approach is that only summary statistics from the linear regressions of the reduced form and first stage in the two samples are needed. These are the six objects of the estimated coefficient vectors, and the homoskedastic and heteroskedasticity-robust estimated variance matrices. We further show that the first-stage F-statistic in the treatment sample can be used as a test for weak instruments in the standard way under homoskedasticity and homogeneity, with the relative bias here a proportional bias. We propose an extension of the effective F-statistic of Montiel-Olea and Pflueger (2013) for the heteroskedastic case, following the generalization in Windmeijer (2025). We illustrate the estimators and tests in an application studying the effect of education on voting behavior from Marshall (2019), with cluster-robust inference.
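To make the "summary statistics only" idea concrete, here is a generic two-step minimum-distance sketch for a single treatment. It assumes independent samples, takes the reduced-form coefficients `gamma`, first-stage coefficients `pi`, and their variance matrices as given, and uses the inverse reduced-form variance as the first-step weight. This is a textbook-style construction under those assumptions and may differ from the estimator and test developed in the paper; the function name `two_sample_iv` is hypothetical.

```python
import numpy as np

def two_sample_iv(gamma, pi, V_gamma, V_pi):
    """Two-step minimum-distance estimate of a scalar effect beta from gamma ~ pi * beta."""
    def md(W):
        # Weighted least squares of gamma on pi with weight matrix W.
        return float(pi @ W @ gamma) / float(pi @ W @ pi)

    beta1 = md(np.linalg.inv(V_gamma))                 # step 1: ignore first-stage noise
    # Step 2: weight by the inverse variance of (gamma_hat - pi_hat * beta),
    # which is V_gamma + beta^2 * V_pi when the two samples are independent.
    W2 = np.linalg.inv(V_gamma + beta1**2 * V_pi)
    beta2 = md(W2)
    resid = gamma - pi * beta2
    j_stat = float(resid @ W2 @ resid)                 # J-type statistic, df = len(pi) - 1
    return beta2, j_stat

pi_hat = np.array([1.0, 2.0, 0.5])          # hypothetical first-stage coefficients
gamma_hat = 0.7 * pi_hat                    # reduced form exactly consistent with beta = 0.7
beta_hat, j_stat = two_sample_iv(gamma_hat, pi_hat,
                                 0.01 * np.eye(3), 0.02 * np.eye(3))
```

When the reduced form is exactly proportional to the first stage, as in this toy input, the estimate recovers the proportionality constant and the J-type statistic is zero.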

2606.20514 2026-06-19 stat.ME 新提交

Hypergraph Variable Selection with False Discovery Rate Control

具有错误发现率控制的超图变量选择

Sarah Organ, Toby Kenney, Hong Gu

AI总结 针对预测变量复杂依赖结构导致变量选择方法功效降低的问题,提出基于超图的选择方法,在控制错误发现率的同时提高选择功效。

Comments 28 pages, 4 figures

详情
AI中文摘要

控制错误发现率的变量选择方法在预测变量呈现复杂依赖结构时往往会失去功效。我们先前表明,选择分层聚类组的预测变量可以缓解这一问题,同时保持错误发现率控制。然而,当相关性结构较不明确时,重叠的预测变量集可能更有效。我们引入了针对预测变量集上定义假设的广义错误发现率,并提出了一种基于超图的选择方法。该方法在各种设置下实现了更高的功效,同时保持了严格的错误发现率控制。

英文摘要

Variable selection methods that control the false discovery rate often lose power when predictors exhibit complex dependence structures. We previously showed that selecting hierarchically clustered groups of predictors can mitigate this issue while maintaining false discovery rate control. When correlations are less structured, however, overlapping predictor sets may be more effective. We introduce a generalized false discovery rate for hypotheses defined on sets of predictors and propose a hypergraph-based selection method. This approach achieves higher power across diverse settings while preserving rigorous false discovery rate control.

2606.20406 2026-06-19 stat.ME stat.CO 新提交

Flexible modeling of bimodal distributions via skewed-$t$ mixtures

双峰分布的灵活建模:基于偏斜-t分布的混合模型

Marco Bee, Flavio Santi

AI总结 提出基于Fernández和Steel (1998)偏斜-t分布的混合模型,通过EM算法进行极大似然估计,并开发似然比检验,用于拟合双峰、偏斜和厚尾数据,在标准普尔500指数中验证了双峰性。

详情
AI中文摘要

我们提出了一种位置-尺度偏斜-t分布的混合模型,用于拟合双峰、偏斜和厚尾数据。特别地,该混合模型基于Fernández和Steel (1998)的偏斜-t分布,因此模型构建过程可以轻松扩展到其他对称分布的混合。在研究了混合模型的性质后,我们通过EM算法开发了极大似然估计方法,并提出了一个似然比检验,用于检验任何给定成分中无偏斜的原假设。与最近提出的g-and-h分布混合的基于模拟的比较表明,所提出模型在良好指定设置下的估计精度和错误指定框架下的建模能力方面均表现出色。将该模型拟合到标准普尔500指数失真数据,证实了其分布的双峰性,这意味着美国股市历史上处于熊市或牛市状态,而非接近其基本面价值。

英文摘要

We propose a mixture of location-scale skewed-$t$ distributions to fit bimodal, skewed and heavy-tailed data. In particular, the mixture is based on the skewed-$t$ distribution by Fernández and Steel (1998), so that the model-building procedure can be easily extended to mixtures of other symmetric distributions. After studying the properties of the mixture, we develop a maximum likelihood estimation approach via the EM algorithm and a likelihood ratio test of the null hypothesis of no skewness in any given component. A simulation-based comparison to a recently proposed mixture of g-and-h distributions suggests that the performance of the proposed model is excellent, in terms of both estimation precision in well-specified setups and modeling capability in mis-specified frameworks. Fitting the model to the Standard & Poor's 500 distortion allows us to confirm the bimodality of its distribution, with the implication that the US stock market has historically been in bearish or bullish conditions, rather than near its fundamental value.
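The Fernández and Steel (1998) construction skews a symmetric density by applying inverse scale factors to the two sides of the mode. A pure-Python sketch of the resulting location-scale skewed-$t$ density and a two-component mixture is given below; the parameter values are made up for illustration and are not estimates from the paper.

```python
import math

def student_t_pdf(x, nu):
    """Standard Student-t density with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def fs_skew_t_pdf(x, nu, gamma, loc=0.0, scale=1.0):
    """Fernandez-Steel skewed-t: scale by gamma right of the mode and by 1/gamma left of it."""
    z = (x - loc) / scale
    base = student_t_pdf(z / gamma, nu) if z >= 0 else student_t_pdf(z * gamma, nu)
    return 2.0 / (gamma + 1.0 / gamma) * base / scale

def mixture_pdf(x, w, comp1, comp2):
    """Two-component mixture with weight w on the first component."""
    return w * fs_skew_t_pdf(x, **comp1) + (1 - w) * fs_skew_t_pdf(x, **comp2)

# Hypothetical parameters chosen only to produce two separated modes.
comp_a = dict(nu=5.0, gamma=0.8, loc=-2.0, scale=1.0)
comp_b = dict(nu=5.0, gamma=1.5, loc=2.0, scale=0.7)

# Trapezoidal check that the mixture integrates to approximately one.
step = 0.01
grid = [-400.0 + i * step for i in range(80001)]
vals = [mixture_pdf(x, 0.4, comp_a, comp_b) for x in grid]
total_mass = sum((vals[i] + vals[i + 1]) / 2 * step for i in range(len(vals) - 1))
```

Setting `gamma=1` recovers the symmetric Student-$t$, which is why a test of "no skewness" in a component can be phrased as a restriction on that single parameter.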

2606.20341 2026-06-19 stat.ME stat.AP 新提交

Anchors Away: Navigating Unanchored Indirect Comparisons with Multilevel Unanchored Meta-Regression (ML-UMR)

锚定之外:使用多层次非锚定元回归(ML-UMR)导航非锚定间接比较

Conor Chandler, Jack Ishak

AI总结 针对随机证据缺失时的非锚定治疗比较,提出多层次非锚定元回归(ML-UMR),通过贝叶斯框架联合建模个体与汇总数据,估计多治疗、多研究及目标人群的边际和条件效应,并明确识别假设与可转移性假设。

Comments 20 pages (excluding supplementary material), 5 figures

详情
AI中文摘要

当随机证据不可用时,使用单臂研究或断开证据的非锚定间接治疗比较越来越多地用于卫生技术评估(HTA)。现有方法,包括匹配调整间接比较(MAIC)和模拟治疗比较(STC),通常局限于成对设置,并且通常估计比较研究人群中的边际效应,这可能与决策相关人群不同。我们提出多层次非锚定元回归(ML-UMR),一种用于综合来自完全断开证据的个体患者数据和汇总数据的贝叶斯回归框架。ML-UMR通过在一个统一似然中联合建模个体水平和汇总水平数据,将多层次网络元回归(ML-NMR)扩展到非锚定设置,从而能够估计跨多个治疗、研究和目标人群的治疗特异性结果以及边际和条件效应。ML-UMR区分了识别治疗效应所需的假设与将结果转移到目标人群所需的假设。与所有非锚定比较一样,有效推断依赖于强且通常不可验证的假设,包括条件可交换性、结果模型的正确设定以及跨治疗假设(例如,共享预后因素假设(SPFA))。ML-UMR并未减轻这些要求,而是在统一框架内使其明确,并促进敏感性分析。在模拟研究中,ML-UMR对比较人群效应产生了低偏差和名义覆盖。向其他人群的可转移性关键取决于识别假设:在强效应修饰下,违反SPFA导致偏差,而纳入亚组信息则恢复了近乎无偏的估计和名义覆盖。

英文摘要

Unanchored indirect treatment comparisons using single-arm studies or disconnected evidence are increasingly used in health technology assessment (HTA) when randomized evidence is unavailable. Existing methods, including matching-adjusted indirect comparison (MAIC) and simulated treatment comparison (STC), are generally limited to pairwise settings and typically estimate marginal effects in the comparator study population, which may differ from the decision-relevant population. We propose multilevel unanchored meta-regression (ML-UMR), a Bayesian regression framework for synthesizing individual patient data and aggregate data from fully disconnected evidence. ML-UMR extends multilevel network meta-regression (ML-NMR) to unanchored settings by jointly modeling individual- and aggregate-level data within a unified likelihood, enabling estimation of treatment-specific outcomes and both marginal and conditional effects across multiple treatments, studies, and target populations. ML-UMR distinguishes assumptions required to identify treatment effects from those required to transport results to target populations. As with all unanchored comparisons, valid inference relies on strong and often unverifiable assumptions, including conditional exchangeability, correct specification of the outcome model, and cross-treatment assumptions (e.g., shared prognostic factor assumption (SPFA)). ML-UMR does not lessen these requirements but makes them explicit within a unified framework and facilitates sensitivity analyses. In simulation studies, ML-UMR produced low bias and nominal coverage for comparator-population effects. Transportability to alternative populations depended critically on identifying assumptions: violations of SPFA led to bias under strong effect modification, whereas incorporating subgroup information restored near-unbiased estimation and nominal coverage.

2606.20226 2026-06-19 stat.ME stat.CO 新提交

Analysis of uncertain fixed-effects model for Latin square designs

拉丁方设计的不确定固定效应模型分析

Yaru Cheng, Zhiming Li

AI总结 针对无频率稳定性的不确定实验数据,建立拉丁方设计的不确定固定效应模型,提出三种估计方法并构建置信区间,进行不确定齐性检验和常见检验,通过数值模拟和实例验证模型有效性。

详情
AI中文摘要

实验设计中常出现无频率稳定性的不确定数据。经典固定效应模型只能分析精确的实验数据。基于不确定测度,本文建立了拉丁方设计的不确定固定效应模型。首先,我们提出了三种不确定方法来估计处理和区组效应,并构建其置信区间。然后,进行不确定齐性检验和常见检验以评估处理效应的显著性。在数值模拟中,基于偏差、均方误差、平均绝对误差、总体标准差、覆盖概率和平均区间长度比较了三种估计方法。给出了几个例子来说明估计和假设检验的过程。最后,将不确定固定效应模型应用于真实教育数据,展示了其实用价值。

英文摘要

Uncertain data without frequency stability often arise in experimental design. Classical fixed-effects models can only analyze precise experimental data. Based on an uncertain measure, this paper establishes uncertain fixed-effects models for Latin square designs. First, we propose three uncertain methods to estimate the treatment and block effects and construct their confidence intervals. Then, uncertain homogeneity and common tests are conducted to assess the significance of treatment effects. In the numerical simulations, the three estimation methods are compared based on bias, mean squared error, mean absolute error, overall standard deviation, coverage probability, and average interval length. Several examples are given to illustrate the process of estimation and hypothesis testing. Finally, the uncertain fixed-effects model is applied to real education data, demonstrating its practical value.

2606.20191 2026-06-19 stat.ML stat.ME 新提交

AK-MCS-C2 : Active Kriging Monte Carlo Simulation method with conformal certification for failure probability estimation

AK-MCS-C2: 具有共形认证的主动克里金蒙特卡洛模拟方法用于失效概率估计

Edgar Jaber, Vincent Chabridon, Mathilde Mougeot

AI总结 提出一种结合主动克里金蒙特卡洛模拟与共形预测的主动学习框架,通过自适应交叉共形策略和J+GP共形估计器,在少量样本下提供无分布假设的预测误差保证,提高极限状态面附近样本分类可靠性,从而提升失效概率估计的准确性和鲁棒性。

详情
AI中文摘要

我们提出了一种新颖的主动学习框架,用于结构可靠性分析中的失效概率估计,该框架将主动克里金蒙特卡洛模拟与共形预测相结合。所提出的方法采用了一种自适应交叉共形策略,专门针对小样本设置和基于J+GP共形估计器的克里金代理模型设计。与标准的AK-MCS方法不同,所提出的框架对预测误差提供了无分布假设的保证,从而对极限状态面附近的样本进行更可靠的分类。这种改进的不确定性量化增强了失效概率估计的准确性和鲁棒性,特别是在这种效率至关重要的罕见事件区域。可重复的数值结果说明了该方法的有效性,并在公认的基准测试上将其与经典方法进行了比较。

英文摘要

We introduce a novel active-learning framework for failure probability estimation in structural reliability analysis that integrates Active Kriging Monte Carlo simulation with conformal prediction. The proposed approach employs an adaptive cross-conformal strategy specifically designed for small-sample settings and kriging surrogate models using the J+GP conformal estimator. Unlike standard AK-MCS methods, the proposed framework provides distribution-free guarantees on prediction errors, leading to more reliable classification of samples near the limit-state surface. This improved uncertainty quantification enhances both the accuracy and robustness of failure probability estimates, especially for rare-event regimes where such efficiency is crucial. Reproducible numerical results illustrate the effectiveness of the method and also compare it to classical approaches on well-established benchmarks.
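For context, the classical AK-MCS loop that this work builds on relies on two simple ingredients: a Monte Carlo estimate of the failure probability from surrogate predictions and the U learning function used to pick the next point to evaluate. The sketch below shows only these standard pieces with made-up surrogate outputs; it does not reproduce the conformal certification or the J+GP estimator, whose details are not given in the abstract.

```python
import numpy as np

def failure_probability(mu):
    """Monte Carlo estimate: fraction of samples whose predicted performance is <= 0 (failure)."""
    return float(np.mean(mu <= 0.0))

def u_function(mu, sigma):
    """U learning function: |mean| / std; a small U means the predicted sign is unreliable."""
    return np.abs(mu) / sigma

mu = np.array([2.0, -0.1, 0.5, -3.0, 1.2])     # hypothetical kriging means
sigma = np.array([0.5, 0.4, 0.1, 0.6, 0.3])    # hypothetical kriging standard deviations

U = u_function(mu, sigma)
next_idx = int(np.argmin(U))       # sample closest to the limit state, relative to its uncertainty
converged = bool(U.min() >= 2.0)   # stopping rule commonly used with the U function
pf = failure_probability(mu)
```

In this toy input the second sample has the smallest U value, so it would be evaluated next, and the loop would not yet stop. The conformal layer described in the abstract aims to replace reliance on the Gaussian-process variance with distribution-free error guarantees.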