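arXivDaily: a daily academic digest synchronized with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.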

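2606.19368 2026-06-19 math.NA cs.LG cs.NA math.OC new submission

Neural Architectures as Functional Priors in Physics-Informed Control Problems

Sonia Rubio Herranz, Fernando Carlos López Hernández, Antonio López Montes

AI summary: Studies the role of neural architectures as implicit functional priors in control problems governed by ordinary differential equations, finding that different architectures (MLP and Fourier KAN) produce qualitatively different controls under identical conditions, exhibiting a functional specialization phenomenon.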

Comments 17 pages, 6 figures. Physics-informed neural networks, optimal control, spectral bias, Kolmogorov-Arnold Networks

Abstract

In this work we investigate the role of neural architectures as implicit functional priors in control problems governed by ordinary differential equations. Rather than focusing on highly complex problems, our objective is to investigate architecture-dependent effects in controlled dynamical systems within the simplest physically interpretable settings possible. In particular, we study a controlled linear RLC electrical circuit and a nonlinear Duffing-type dynamical system. Both systems are analyzed first through classical optimal-control formulations and later through PINN-based approaches. We compare different combinations of multilayer perceptrons (MLPs) and Fourier-based KAN-like architectures, and analyze their influence on the resulting controls. The numerical experiments suggest that different architectural choices systematically generate qualitatively distinct controls, even under identical governing equations, loss functionals, initial and target states, training parameters and physical constraints. Significant differences appear in the spectral structure, smoothness, energy distribution, and phase-space behavior of the learned solutions. A central observation of this work is the emergence of a functional specialization phenomenon when the neural architectures are allowed sufficient freedom to shape the structure of the learned controls. More specifically, in the systems considered here, Fourier-based architectures tend to produce trajectories with richer oscillatory content, whereas smoother low-frequency-biased architectures tend to generate more regular and energetically efficient controls. This suggests that different functional components of the control problem may be handled more efficiently by different neural architectures, leading to an implicit specialization between state representation and control generation.

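2606.20485 2026-06-19 q-fin.RM cs.AI nlin.AO physics.soc-ph new submission

Optimal Order of Multi-Agent and General Many-Body Systems

Jake J. Xia

AI summary: Proposes a general framework for analyzing multi-agent systems based on agents' power and response functions, derives macroscopic properties, and introduces a risk-appetite coefficient to study the trade-off between growth and resilience, yielding an optimal degree of order.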

Comments Key Words: Many body systems, multi agent crowd interactions, feedback loops, agent power, response function, utility function, risk appetite, order, optimal order, fragility, mobility, synchronization, useful energy, entropy, concentration, correlation, task dependency, receiver dependency, collective intelligence, AI model scaling law

Abstract

This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents' actions and collective observations. The framework is built on two fundamental agent-level variables: power, which measures agent influence on collective outcomes, and response functions, which determine how agents react to observations. We derive how macroscopic properties, including total power, useful power, entropy, order, fragility, and mobility, emerge from these two variables of heterogeneous agents. To study the trade-off between growth and resilience, we introduce a system-level utility function parameterized by a risk-appetite coefficient and derive an optimal degree of order that balances productivity, stability, and adaptability. The analysis suggests that stronger synchronization can increase collective output but may also increase systemic fragility and reduce mobility. We further argue that order, entropy, information, and useful energy are task-dependent and system-relative concepts whose meanings depend on the objectives of the system. By measuring and designing agent power distributions and response functions, it may be possible to better understand, predict, and optimize collective behavior and identify the conditions under which collective intelligence and optimal order emerge.

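2606.20299 2026-06-19 stat.ML cs.LG hep-ph physics.data-an new submission

Statistical Properties of Training & Generalization

Itay Lavie, Noam Levi, Yonatan Kahn

AI summary: Studies key features and surprising phenomena of deep learning from a physics perspective, reviewing neural scaling laws and their interplay with constraints and inductive biases in physics problems.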

Comments 32 pages, 3 figures. Part of the VERaiPHY initiative

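2606.19781 2026-06-19 hep-ex cs.AI new submission

Towards Engineering Scaling Laws with Pretraining Data Composition

Jan-Lucas Uslu, Kevin Greif, Daniel Whiteson, Benjamin Nachman

AI summary: Studies how engineering the pretraining data composition (increasing diversity and alignment with downstream tasks) changes the scaling behavior of neural networks in particle physics, shifting it to favor data scaling over model scaling.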

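2606.19149 2026-06-19 cs.CR cs.LG new submission

OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic Testing

Nahum Korda, Gadi Evron

AI summary: Proposes the OpenAnt system, which combines static analysis with LLM reasoning in a three-stage pipeline of code decomposition, adversarial verification, and dynamic testing, discovering previously unknown vulnerabilities while reducing the false-positive rate.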

Abstract

Automated vulnerability discovery in large codebases remains challenging: traditional static analysis produces high false-positive rates, while dynamic approaches such as fuzzing require substantial infrastructure and often target narrow classes of bugs. Recent advances in large language models (LLMs) enable semantic reasoning about program behavior, but applying LLMs to repository-scale security analysis introduces challenges related to context management, cost, and verification. We present OpenAnt, an open-source vulnerability discovery system that integrates static program analysis with LLM-based reasoning in a multi-stage pipeline. OpenAnt introduces three key techniques. First, codebases are decomposed into self-contained analysis units filtered by reachability from external entry points, reducing the analysis surface by up to 97% while preserving attack-relevant code. Second, candidate vulnerabilities undergo adversarial verification through constrained attacker simulation, where the model evaluates exploitability under realistic attacker capabilities. Third, findings are validated through dynamic verification, in which exploit environments are generated automatically, executed in sandboxed containers, and discarded after use. Evaluation on widely used open-source projects including OpenSSL, WordPress, and Flowise shows that this architecture can identify previously unknown vulnerabilities while maintaining manageable analysis cost and substantially reducing false positives. Our results suggest that closed-loop vulnerability discovery pipelines, combining semantic reasoning with exploit validation, provide a practical path toward scalable automated security analysis. OpenAnt is released as open source under the Apache 2.0 license at https://github.com/knostic/OpenAnt.
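
The reachability filtering described in the abstract can be illustrated with a breadth-first search over a call graph. This is a minimal sketch under stated assumptions, not OpenAnt's implementation: the graph, the unit names, and the helper `reachable_units` are all hypothetical.

```python
from collections import deque


def reachable_units(call_graph, entry_points):
    """Return every unit reachable from the given external entry points."""
    seen = set(entry_points)
    queue = deque(seen)
    while queue:
        unit = queue.popleft()
        for callee in call_graph.get(unit, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical toy call graph: unit name -> units it calls.
call_graph = {
    "http_handler": ["parse_request", "render"],
    "parse_request": ["decode_header"],
    "decode_header": [],
    "render": [],
    "internal_selftest": ["debug_dump"],  # never called from an entry point
    "debug_dump": [],
}

kept = reachable_units(call_graph, ["http_handler"])
# Fraction of units dropped from the analysis surface in this toy example.
reduction = 1 - len(kept) / len(call_graph)
```

In this toy graph two of the six units are unreachable from the entry point and would be filtered out before any further analysis.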

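2503.04507 2026-06-19 q-bio.QM cs.CG cs.LG cross-list

The Morse Transform for Discrete Shape Analysis

Alexander M. Tanaka, Aras T. Asaad, Richard Cooper, Vidit Nanda

AI summary: Proposes a topological transform based on directional piecewise-linear Morse theory that quantifies the geometry of an embedded object by recording critical points across multiple height functions; the resulting feature vectors achieve the highest mean AUROC in ligand-based virtual screening.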

Comments 37 pages, 3 main figures, 2 main tables, 12 appendix figures and 4 appendix tables

Abstract

The geometry of an object plays a vital role in modulating its interactions with the physical world. It nevertheless remains difficult to describe geometric information numerically for the purposes of statistical inference or classification tasks. Here, we introduce a new topological transform which leverages directional piecewise-linear Morse theory to quantify the geometry of an embedded object by cataloguing critical points across multiple height-functions. The output of this Morse transform records both the heights and the local topological type (peak, trough or saddle) of the critical points that characterise the underlying shape, retaining finer information than the Euler characteristic transform whilst naturally prioritising a shape's outermost regions. Crucially, this output can be further compressed into a rich but compact feature vector. We benchmark the Morse feature vector as a descriptor for ligand-based virtual screening (LBVS), which intrinsically depends on the shape of molecules. Under a common gradient-boosted tree classification pipeline, Morse descriptors achieve the highest mean AUROC when compared to other topological transform descriptors and to standard shape-based LBVS descriptors.
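
To make the idea of cataloguing critical points across height functions concrete, here is a deliberately simplified sketch for a closed polygon in the plane. It is not the paper's construction: on a curve only peaks and troughs occur (no saddles), the function name `critical_points` and the example polygon are hypothetical, and the code assumes no two adjacent vertices share a height.

```python
def critical_points(polygon, direction):
    """Classify vertices of a closed polygon as peaks or troughs of the
    height function p -> direction . p, returning (height, type) pairs."""
    dx, dy = direction
    heights = [dx * x + dy * y for x, y in polygon]
    n = len(heights)
    found = []
    for i, h in enumerate(heights):
        prev_h, next_h = heights[i - 1], heights[(i + 1) % n]
        if h > prev_h and h > next_h:
            found.append((h, "peak"))
        elif h < prev_h and h < next_h:
            found.append((h, "trough"))
    return sorted(found)


# Hypothetical shape: a rectangle with a bump in the bottom edge
# and a dip in the top edge.
polygon = [(0, 0), (2, 1), (4, 0), (4, 3), (2, 2), (0, 3)]

vertical = critical_points(polygon, (0, 1))
oblique = critical_points(polygon, (1, 0.5))
```

Different directions see different critical points of the same shape, which is what makes a collection of directions informative. For a closed curve, peaks and troughs always come in equal numbers.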

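2606.20435 2026-06-19 econ.EM new submission

Choosing A Headline Estimand from Matching, DID, and Hybrid Designs: A Minimax-Regret Approach

Yechan Park, Yuya Sasaki

AI summary: Argues that, in estimating causal effects from panel data, the estimand of the hybrid design (DIDM) lies between those of matching (M) and difference-in-differences (DID) and is the minimax-regret choice under a broad class of loss functions; recommends DIDM as the headline estimand, with matching and DID as bounds.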

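2606.20286 2026-06-19 econ.EM new submission

Institutions, Inputs, and Agricultural Growth in China: Revisiting Several Controversies, 1949–1986

Jiyuan Lyu

AI summary: Uses a unified dataset and econometric methods to revisit four controversies about China's agricultural growth: the price scissors, heavy industrial investment, the 1978 reforms, and the impact of decollectivization on irrigation.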

Abstract

Scholarly debates on China's agricultural growth between 1949 and 1986 continue to differ over the extent of the price scissors, the effect of heavy industrial investment, the role of the 1978 reforms, and the impact of decollectivization on irrigation. Using a single dataset and complementary econometric methods, this paper addresses each of these controversies. The results show that 1952–1957 was the only net extraction period across all three channels, after which the state channelled a net inflow of about 168.6 billion yuan into agriculture via fiscal and credit instruments. Heavy industrial investment exerted a significant positive lagged effect on agriculture, while the contemporaneous negative correlation stemmed from the zero-sum nature of the investment share indicator. The input-output elasticity shifted abruptly in 1970, and collective agricultural loans broke in 1971, both pointing to the rectification effects of the North China Agricultural Conference. Disaster prevention capacity fell from 0.70 under the collective era to 0.53 after household contracting, mainly because the collective maintenance system collapsed rather than because state investment declined. After 1979 the price elasticity of agricultural supply approached zero, suggesting that the 1979 procurement price increase acted more like a one-off recalibration than a sustained marginal incentive.

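2606.19972 2026-06-19 econ.EM new submission

Biodiversity Media Narratives and Stock Market Performance: Evidence from Europe

Andres Azqueta-Gavaldon, Ben Jabeur Sami, Leila Hedhili

AI summary: Uses the GDELT Global Knowledge Graph to build biodiversity media risk indicators for France, Germany, Italy, and Spain over 2015-2025; panel Granger causality tests and an augmented inverse probability weighting event study find that biodiversity risk significantly lowers stock prices, and that the positive effect in low-risk periods exceeds the negative effect in high-risk periods.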

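2606.20478 2026-06-19 eess.AS new submission

Beyond Speaker Independence: Evaluating Cross-Lingual Acoustic-to-Articulatory Inversion Across Finnish and Russian

Ruchi Pandey, Tomi Kinnunen

AI summary: Systematically evaluates acoustic-to-articulatory inversion (AAI) under cross-speaker and cross-language domain shift, using the newly built Finnish-Russian bilingual EMA corpus FROST-EMA to compare articulatory targets, acoustic front-ends, and inversion back-ends; finds a moderate cross-gender decline in Pearson correlation (about 0.05-0.10) and a larger cross-language decline (about 0.10-0.20).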

Abstract

Acoustic-to-articulatory inversion (AAI) remains challenging under domain shifts where changes in speaker attributes and cross-language conditions often degrade performance. We conduct a systematic evaluation under such shifts and establish baseline benchmarks on FROST-EMA, a Finnish-Russian bilingual EMA corpus. FROST-EMA addresses the English bias and limited speaker diversity of existing resources. We benchmark (i) articulatory targets (raw EMA coordinates vs tract variables), (ii) acoustic front-ends (MFCC vs SSL features), and (iii) inversion back-ends (BiLSTM vs a lightweight attention-based sequence model). We further define evaluation protocols for cross-gender transfer (within language) and cross-language transfer (within gender). The results indicate that cross-gender mismatch introduces moderate Pearson correlation declines (approximately 0.05 to 0.10) relative to the in-domain baseline, whereas cross-language mismatch causes larger drops (approximately 0.10 to 0.20).
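
The declines quoted above are differences in Pearson correlation between predicted and measured articulatory trajectories. As a reminder of what that metric computes, here is a minimal sketch; the trajectory values are made up for illustration and are not data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sensor trajectory (measured) and a model's prediction of it.
measured = [0.10, 0.40, 0.35, 0.80, 0.60]
predicted = [0.15, 0.35, 0.45, 0.70, 0.65]

score = pearson(measured, predicted)
```

A cross-domain drop of 0.10 then means that the same computation, run on a mismatched speaker group or language, yields a score 0.10 lower than the in-domain baseline.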

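2606.20450 2026-06-19 eess.SP new submission

Max-Min Rate Fairness Optimization for Multi-User Pinching-Antenna NOMA Systems

Mahmoud AlaaEldin, Amy Inwood, Xidong Mu, Michail Matthaiou

AI summary: For a multi-waveguide pinching-antenna NOMA downlink system, proposes a two-stage optimization framework that jointly optimizes antenna positions and precoding to maximize the minimum user rate, significantly improving performance.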

Abstract

Pinching-antenna systems (PASs) can overcome signal blockage by repositioning dielectric radiating elements, called pinching antennas (PAs), along meter-scale waveguides to create line-of-sight links. Since each waveguide is driven by a single radio-frequency (RF) chain, non-orthogonal multiple access (NOMA) is well suited for PAS-based multi-user communications. This paper studies a PAS-enabled multi-user downlink NOMA system with multiple waveguides, each equipped with multiple PAs. The PA positions and base-station transmit precoding are jointly optimized to maximize the minimum user rate. The resulting problem is highly non-smooth and non-convex because of the rapidly oscillating coherent sums caused by inter-PA interference. To tackle this challenge, we propose a two-stage structured optimization framework. In the first stage, coarse PA-position and power-allocation optimization is performed using an interior-point algorithm while neglecting the PA channel phases, which gives solutions near the true optima. In the second stage, PA positions and transmit precoding are fine-tuned while accounting for the PA channel phase shifts. This stage first applies phase zeroing, where each PA is locally repositioned to align the corresponding channel phase toward zero and promote constructive coherent combining. It then uses an alternating procedure that iteratively performs forward-backward PA position refinement and successive-convex-approximation-based complex transmit precoding optimization until convergence, thereby reducing residual phase mismatch. Simulation results show that the proposed framework significantly outperforms heuristic optimization benchmarks with much lower computational time. They also demonstrate large gains over a comparable multiple-input multiple-output downlink NOMA system and reveal the impact of the number of PAs, users, and transmit power on system performance.

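2606.20338 2026-06-19 eess.AS new submission

Stuttering Classification and Segmentation with Attention-Based Multiple Instance Learning

Petar Sušac, Sebastian P. Bayerl, Hrvoje Džapo

AI summary: Proposes multiple-instance neural networks built on fine-tuned wav2vec 2.0, WavLM, and Whisper encoders that use clip-level data to perform frame-level stuttering classification and segmentation, improving frame-level F1 by 23%.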

Comments Accepted at Interspeech 2026

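2606.20266 2026-06-19 eess.AS new submission

Transcript-Free Flow-Matching Text-to-Speech via Speech Feature Conditioning

SooHwan Eom, Hee Suk Yoon, Eunseop Yoon, Mark Hasegawa-Johnson, Chang D. Yoo

AI summary: Proposes RTFree-F5, which replaces the reference transcript with self-supervised speech representations mapped into the F5-TTS text-conditioning space via a lightweight adapter, removing the dependence on an external ASR system; on dysarthric speech, WER drops from 24.6% to 10.4%.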

Comments Accepted to Interspeech 2026

Abstract

Recent flow-matching text-to-speech (TTS) models, such as F5-TTS, rely on a reference transcript at inference time, obtained from an external ASR system. This dependency makes zero-shot TTS brittle for accented or dysarthric speakers, precisely the scenarios where it is most needed. Moreover, we find that text-based reference conditioning can propagate atypical acoustic patterns from atypical speech into synthesis, even when ground-truth transcripts are available. To address this, we propose RTFree-F5, which replaces the reference transcript with continuous self-supervised speech representations mapped into F5-TTS's text-conditioning space via a lightweight adapter, while reusing the pretrained checkpoint. On dysarthric speech, RTFree-F5 reduces WER from 24.6% to 10.4%, surpassing even the ground-truth reference transcript baselines, while improving naturalness and remaining competitive on standard benchmarks without requiring any reference transcript.
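
The WER figures above follow the usual definition: the word-level edit distance between a reference text and a transcription of the synthesized speech, divided by the number of reference words. The sketch below shows that standard computation; it is not the paper's evaluation code, and the example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        cur = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cur[j] = min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            )
        prev = cur
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer_example = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.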

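2606.20222 2026-06-19 eess.SP new submission

Reliable ORIS-assisted FSO Communications via HARQ

Georgios D. Chondrogiannis, Athanasios P. Chrysologou, Vasilis K. Papanikolaou, Alexandros-Apostolos A. Boulogeorgos, Nestor D. Chatzidiamantis, Robert Schober

AI summary: Studies free-space optical communication links that combine an optical reconfigurable intelligent surface (ORIS) with hybrid automatic repeat request (HARQ), deriving a statistical model of the end-to-end channel, a closed-form outage probability for HARQ-CC and outage upper bounds for HARQ-IR, and analyzing diversity order and delay behavior.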

Comments 13 pages, 8 Figures, Journal

Abstract

This paper studies a free-space optical (FSO) link assisted by an optical reconfigurable intelligent surface (ORIS) and enhanced by a hybrid automatic repeat request (HARQ) scheme. The ORIS creates a virtual line-of-sight path around obstacles, while HARQ recovers frames corrupted by turbulence, pointing jitter, and geometric loss through retransmission and combining. We first derive a tractable statistical model for the end-to-end transmitter-ORIS-receiver (Tx-ORIS-Rx) reflected channel by jointly accounting for atmospheric turbulence, ORIS-induced pointing errors, and geometric attenuation. Building on these results, we obtain closed-form outage probability (OP) expressions for HARQ with Chase combining (HARQ-CC) and analytical outage upper bounds for HARQ with incremental redundancy (HARQ-IR), valid for an arbitrary maximum number of transmission rounds. We further conduct a high signal-to-noise ratio (SNR) analysis that provides a thorough characterization of the outage behavior and reveals the diversity order of both schemes. In addition, we characterize the delay behavior of the truncated HARQ process through the mean number of transmission rounds and the conditional mean number of rounds given successful decoding. Finally, numerical and Monte Carlo results validate the proposed analysis and show that HARQ substantially improves ORIS-assisted FSO reliability, with HARQ-IR achieving lower outage and delay than HARQ-CC, even for a small number of retransmission rounds.

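2606.20011 2026-06-19 eess.SP new submission

Amplitude-Phase-Frequency Block Modulation for OFDM-ISAC with SI-Free PAPR Reduction and Pilotless Sensing

Bensheng Yang, Min Fan, Haitao Zhao, Haiming Wang

AI summary: Proposes an amplitude-phase-frequency block modulation scheme that, through Stokes-sphere mapping and grouped phase optimization, integrates communication and sensing in OFDM without resource partitioning, while reducing PAPR and eliminating the need for pilots.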

Abstract

Orthogonal Frequency Division Multiplexing (OFDM)-based integrated sensing and communication systems demand a unified waveform that simultaneously supports reliable data transmission, low peak-to-average power ratio (PAPR), and accurate channel sensing. Existing approaches multiplex communication and sensing across separate time or frequency resources, or rely on dedicated pilots for channel estimation, limiting system flexibility and increasing overhead. This paper proposes an amplitude-phase-frequency block modulation (APFBM) scheme for OFDM that achieves waveform-level integration of communication and sensing without resource partitioning. Information symbols are represented on the Stokes sphere and mapped to energy-normalized Jones vectors through an unambiguous rule that establishes a deterministic phase reference per block. This mapping exposes a common-phase degree of freedom inherent in the signal structure. At the transmitter, a grouped phase optimization algorithm exploits this structural freedom to reduce the PAPR without side information (SI). At the receiver, the same deterministic phase structure enables a Viterbi-based maximum-likelihood (ML) sequence detection algorithm that jointly recovers the optimization phases and estimates the block-wise channel amplitude and phase. No dedicated sensing pilots are required, as the sensing observables are extracted directly from the communication waveform. Closed-form error-rate and sensing-accuracy expressions are derived. Numerical simulations and over-the-air measurements on a software-defined radio link confirm effective PAPR reduction, accurate channel sensing, reliable phase recovery, and stable channel state information reconstruction. The proposed scheme trades a moderate reduction in spectral efficiency for a unified waveform design that simultaneously delivers SI-free PAPR reduction and pilotless sensing.
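
The reason subcarrier phases matter for PAPR can be seen in a four-subcarrier toy example. This sketch only computes the discrete-time PAPR of an OFDM symbol at the Nyquist rate (no oversampling) and is not the APFBM phase optimization; the helper names `idft` and `papr_db` are hypothetical.

```python
import cmath
import math


def idft(X):
    """Inverse DFT: frequency-domain symbols -> time-domain samples."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def papr_db(X):
    """Peak-to-average power ratio of one OFDM symbol, in dB."""
    power = [abs(v) ** 2 for v in idft(X)]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))


aligned = papr_db([1, 1, 1, 1])   # all subcarriers in phase: one large peak
rotated = papr_db([1, 1, 1, -1])  # one phase flipped: constant envelope
```

With all four subcarriers in phase the energy piles up in a single sample, giving 10·log10(4) ≈ 6.02 dB. Flipping the sign of one subcarrier spreads it evenly, giving 0 dB. A phase-optimization scheme searches over such rotations, and the receiver then needs some way to undo them.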

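2606.20001 2026-06-19 eess.AS new submission

Time-Unconditional Generative Speech Enhancement via Autonomous Rectified Flow

Wen Zhang, Wenbin Jiang, Yang Zhang, Xiaofei Zhou

AI summary: Proposes an autonomous rectified flow framework: proves that the target vector field is time-invariant along linear interpolation paths, and designs a time-unconditional network that infers the denoising direction from spatial relationships alone, significantly improving generation quality, robustness, and inference efficiency.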

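2606.19974 2026-06-19 eess.AS new submission

Interpreting Content and Speaker Characteristics in Factorised Self-Supervised Subspaces

Kyle Janse van Rensburg, Herman Kamper

AI summary: Uses SVD to factorise WavLM features into a content matrix and speaker transformations, finding that the content space mainly encodes intensity, formants, and voicing, whereas the speaker space is strongly associated with pitch and gender, and that these dimensions can be used for fine-grained control in speech synthesis.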

Comments 7 pages, 4 figures

Abstract

Self-supervised speech features encode both content and speaker information. Recent work introduced an SVD-based factorisation that decomposes these features into a shared content matrix capturing temporal variation and speaker-specific transformations capturing static speaker characteristics. However, how information is organised within these components remains unclear. In this paper, we investigate how the dimensions of WavLM-factorised content and speaker subspaces correlate with speech characteristics such as pitch, intensity, and voicing. We find that leading dimensions in the content space primarily capture intensity, higher-order formants, and voicing, while pitch is encoded in a later dimension. In contrast, the highest-variance speaker dimension is strongly associated with pitch and gender, with later dimensions capturing high-frequency variation. Intervention experiments show that manipulating these dimensions enables targeted control of speech characteristics for speech synthesis. Furthermore, modifying the content and speaker representations jointly provides fine-grained control over characteristics such as pitch and intensity.

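2606.19953 2026-06-19 eess.SP new submission

ConsisFormer: Compute-Efficient Transformer for Wireless Foundation Models Based on Channel Consistency

Yuwei Wang, Li Sun, Tingting Yang, Liwen Jing, Yuxuan Shi, Maged Elkashlan, Mérouane Debbah

AI summary: Proposes ConsisFormer, which exploits the short-term consistency of wireless channels and uses adaptive token aggregation and feature sequence interpolation to reduce Transformer computational complexity, cutting computation by more than 83% across various tasks with negligible performance loss.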

Abstract

Wireless foundation models (WFMs) have recently emerged as a promising paradigm for AI-native 6G networks, enabling universal channel representations adaptable to diverse communication and sensing tasks. Existing WFMs are predominantly built upon the Transformer architecture, which delivers superior performance but incurs computational complexity proportional to the square of the input sequence length, posing a significant barrier to their deployment under stringent inference latency constraints. To address this issue, in this paper, we propose ConsisFormer, a compute-efficient Transformer design based on short-term consistency of wireless channels, as a WFM backbone. By utilizing the observation that adjacent time or frequency instances share similar clusters of scatterers and thus exhibit similar channel characteristics, we develop an adaptive token aggregation (ATA) module to dynamically merge neighboring channel state information (CSI) tokens, thereby reducing the length of the token sequence involved in self-attention calculations to lower the computational cost. Furthermore, we propose a feature sequence interpolation (FSI) method to recover the full CSI representation based on the sparse feature sequence output by the Transformer blocks, thus keeping the performance unaffected while ensuring low complexity. Moreover, we propose an aggregated auto-encoder (AAE) pre-training paradigm for WFMs, enabling robust channel representation learning from sparsified CSI tokens via compression and recovery. Simulation results show that the proposed design reduces the computational complexity of WFM by over 83% with negligible performance loss on various tasks including channel prediction, LoS/NLoS classification, beam prediction, and localization.
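
The idea of merging neighboring tokens that look alike, and the effect this has on the quadratic attention term, can be sketched as follows. This is a hypothetical greedy rule based on cosine similarity, not the paper's ATA module; the threshold, the token values, and the function names are assumptions, and tokens are assumed to be nonzero vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def merge_adjacent(tokens, threshold):
    """Greedily average runs of neighboring tokens whose similarity to the
    running group mean is at least `threshold`."""
    groups = [[tokens[0]]]
    for tok in tokens[1:]:
        group = groups[-1]
        mean = [sum(col) / len(group) for col in zip(*group)]
        if cosine(mean, tok) >= threshold:
            group.append(tok)
        else:
            groups.append([tok])
    return [[sum(col) / len(g) for col in zip(*g)] for g in groups]


# Hypothetical 2-D tokens: two similar neighbors, then three similar neighbors.
tokens = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0], [0.01, 0.98]]
merged = merge_adjacent(tokens, threshold=0.95)

# Self-attention cost scales with the square of the sequence length.
attention_cost_ratio = (len(merged) / len(tokens)) ** 2
```

Here five tokens collapse to two, so the quadratic attention term shrinks to 16% of its original size. The full-length representation then has to be recovered afterwards, which is the role the abstract assigns to feature sequence interpolation.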


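2606.19940 2026-06-19 eess.AS New submission

Analyzing Language and Geographical Variation in Speech Representations Across 60 Indic Languages

Analysis of Language and Geographic Variation in Speech Representations of 60 Indic Languages

Pavan Kumar J, Agneedh Basu, Pranav Bhat, Sujith Pulikodan, Visruth Sanka, Nihar Desai, Prasanta Kumar Ghosh

AI summary: The study fine-tunes Whisper-base and Wav2Vec2.0 with joint language-region supervision and finds that this preserves language classification ability while improving regional separability in the embedding space. It also analyzes the embedding structure using normalized conditional mutual information.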

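2606.19724 2026-06-19 eess.SP New submission

Cyclic-Prefix OFDM Probing for Spatial-ISI-Free Distributed Acoustic Sensing via Frequency-Domain Channel Reconstruction

Cyclic-Prefix OFDM Probing Based on Frequency-Domain Channel Reconstruction for Spatial-ISI-Free Distributed Acoustic Sensing

Huan Huang, Zhiyang Xue, Ziang Chen, Zhongxing Tian, Dongdong Zou, Gangxiang Shen, Yi Cai

AI summary: Proposes using a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) waveform as the sensing probe and applying frequency-domain channel reconstruction to eliminate the spatial inter-symbol interference (ISI) of matched-filter pulse compression. This yields spatial-ISI-free distributed acoustic sensing while also recovering communication data, demonstrating shared-waveform integrated sensing and communication (ISAC).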

Comments This manuscript has been submitted for possible publication

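AI abstract (translated from Chinese)

Distributed acoustic sensing (DAS) based on matched-filter pulse compression has nonzero compression sidelobes, which cause deterministic leakage between range bins, i.e., spatial inter-symbol interference (ISI), and produce false responses in the reconstructed Rayleigh-backscatter traces. We propose a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) DAS system for $\phi$-OTDR that uses a data-bearing CP-OFDM waveform as the sensing probe. The system also recovers the forward communication data, giving an initial demonstration of shared-waveform integrated sensing and communication (ISAC). To our knowledge, this is the first time distributed Rayleigh backscattering has been modeled as a finite-memory sensing multipath channel. Based on this model, we prove that if the useful OFDM and CP lengths cover the sensing multipath memory, then CP removal, one-tap frequency-domain equalization, and an inverse discrete Fourier transform reconstruct each range-bin coefficient free of deterministic waveform-induced spatial ISI, enabling spatial-ISI-free phase demodulation. On a simulated 5.2 km link with ten simultaneous strong and weak events spaced 5.31 to 5.83 m apart within groups, the proposed receiver suppresses off-event leakage and improves the phase-trace mean-square error by up to 29.55 dB relative to matched-filter pulse compression. In a coherent heterodyne experiment over a 5.2 km fiber link with an occupied bandwidth of 111.984 MHz, 500 Hz PZT vibrations under 5 V and 1 V drives are blindly localized at 5.071 km and 5.066 km, respectively, with waveform-recovery correlation coefficients of 0.990 and 0.962. The same data-bearing probe also recovers an image with zero bit-error rate and a median error vector magnitude of -23.14 dB. These results validate CP-OFDM-aided frequency-domain channel reconstruction for spatial-ISI-free DAS and show its potential for shared-waveform optical-fiber ISAC.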

English abstract

Matched-filter-based pulse-compression distributed acoustic sensing (DAS) suffers from nonzero compression sidelobes that cause deterministic inter-range-bin leakage, i.e., spatial inter-symbol interference (ISI), and false responses in reconstructed Rayleigh-backscatter traces. We propose a cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) DAS system for $\phi$-OTDR, using a data-bearing CP-OFDM waveform as the sensing probe. It also recovers forward communication data, providing an initial demonstration of shared-waveform integrated sensing and communication (ISAC). To our knowledge, this is the first formulation of distributed Rayleigh backscattering as a finite-memory sensing multipath channel. Based on this formulation, we prove that, if the useful OFDM and CP lengths cover the sensing multipath memory, CP removal, one-tap frequency-domain equalization, and inverse discrete Fourier transform reconstruct each range-bin coefficient without deterministic waveform-induced spatial ISI, enabling spatial-ISI-free phase demodulation. For a simulated 5.2-km link with ten simultaneous strong and weak events spaced by 5.31–5.83 m within groups, the proposed receiver suppresses off-event leakage and improves phase-trace mean-square error by up to 29.55 dB over matched-filter pulse compression. In a heterodyne coherent experiment over a 5.2-km fiber link with 111.984-MHz occupied bandwidth, 500-Hz PZT vibrations are blindly localized at 5.071 and 5.066 km under 5- and 1-V drives, respectively, and their waveforms are recovered with correlation coefficients of 0.990 and 0.962. The same data-bearing probe also recovers an image with zero measured bit-error rate and a median error vector magnitude of -23.14 dB. These results validate CP-OFDM-aided frequency-domain channel reconstruction for spatial-ISI-free DAS and demonstrate its potential for shared-waveform optical-fiber ISAC.
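
The receiver chain named in the abstract (CP removal, one-tap frequency-domain equalization, inverse DFT) can be illustrated with a noiseless toy model. This is a sketch under assumptions, not the authors' system: the three "range-bin" coefficients in `h`, the QPSK symbols, and the sizes `N` and `CP` are made-up values, and the equalization step is taken to be division by the known transmitted symbols.

```python
import cmath

def dft(x, inverse=False):
    """Plain O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for m in range(n))
        out.append(s / n if inverse else s)
    return out

N, CP = 8, 3                       # subcarriers and cyclic-prefix length (hypothetical)
h = [1.0 + 0j, 0.5j, -0.25 + 0j]   # toy finite-memory channel: one coefficient per range bin
X = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j,
     1 - 1j, -1 - 1j, 1 + 1j, -1 + 1j]   # known data-bearing QPSK probe symbols

x = dft(X, inverse=True)           # time-domain OFDM symbol
tx = x[-CP:] + x                   # prepend the cyclic prefix

# Linear convolution with the channel, as the fiber would apply it.
rx = [sum(h[l] * tx[n - l] for l in range(len(h)) if 0 <= n - l < len(tx))
      for n in range(len(tx) + len(h) - 1)]

y = rx[CP:CP + N]                  # CP removal turns linear into circular convolution
Y = dft(y)
H = [Y[k] / X[k] for k in range(N)]    # one-tap equalization per subcarrier
h_est = dft(H, inverse=True)           # inverse DFT gives per-range-bin coefficients
```

Because the CP length (3) covers the channel memory (2 samples beyond the first tap), `h_est` reproduces `h` in its first three entries and is numerically zero elsewhere, so no coefficient leaks into neighboring bins in this idealized setting.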


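2606.19720 2026-06-19 eess.SP New submission

An Optimization Framework for Certain Separable Problems using Neural Networks

A Neural-Network-Based Optimization Framework for Certain Separable Problems

Rohit Negi, Soummya Kar

AI summary: For constrained optimization problems with separable parameters, proposes a two-stage strategy of offline learning and online processing that uses ADMM and neural networks to reduce online computational complexity.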

Comments 15 pages, 5 figures

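2606.19666 2026-06-19 eess.SP New submission

Degrees of Freedom and Beamforming for Large Intelligent Surfaces

Degrees of Freedom and Beamforming of Large Intelligent Surfaces

Jiawang Li, Alireza Saberkari, Buon Kiong Lau, Mats Gustafsson

AI summary: Estimates the spatial degrees of freedom (DoF) of large intelligent surfaces (LISs) using closed-form expressions for the mutual shadow area and verifies agreement with numerical singular-value spectra. Sampling schemes and beamforming are designed from the DoF analysis, showing that roughly as many independent beams as DoF can be formed and that exceeding this limit increases interference. A polarization study shows that the electric-field components contribute unequally to the DoF and that the total-field DoF is twice that of a single polarization component.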

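AI abstract (translated from Chinese)

Spatial degrees of freedom (DoF), sampling, and beamforming are fundamental to multi-user large intelligent surfaces (LISs), where electromagnetic fields must be shaped, resolved, and focused at multiple near-field locations. This paper estimates the number of DoF for representative LIS configurations using closed-form expressions for the mutual shadow area. The resulting DoF predictions are validated against numerical singular-value spectra, whose knee points closely match the theoretical estimates. For line-source configurations, an analytic sampling scheme is developed by partitioning the source or observation line into unit-DoF intervals, which makes it possible to select spatial samples. Beamforming results with maximum-ratio transmission and zero-forcing show that roughly as many independent beams as DoF can be formed. Attempting to exceed this limit increases interference and degrades performance. For surface-based LIS configurations, the sampling points are instead determined numerically with the discrete empirical interpolation method. The corresponding beamforming results further confirm that the target region can support about as many independent beams as the DoF analysis predicts. Finally, a polarization-aware study shows that the electric-field components contribute unequally to the DoF and that the total-field DoF is twice that of a single polarization component.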

English abstract

Spatial degrees of freedom (DoF), sampling, and beamforming are fundamental to multi-user large intelligent surfaces (LISs), where electromagnetic fields must be shaped, resolved, and focused at multiple near-field locations. This work estimates the number of DoF using closed-form expressions derived from the mutual shadow area for representative LIS configurations. The resulting DoF predictions are validated through numerical singular-value spectra, whose spectral knee points closely match the theoretical estimates. For line-source configurations, an analytic sampling scheme is developed by partitioning the source or observation line into unit-DoF intervals, enabling the selection of spatial samples. Beamforming results using maximum-ratio transmission and zero-forcing demonstrate that the number of independent beams that can be formed is approximately equal to the number of DoF. Attempting to exceed this limit results in increased interference and degraded performance. For surface-based LIS configurations, sampling points are instead determined numerically using the discrete empirical interpolation method. The corresponding beamforming results further confirm that the target region can support approximately as many independent beams as predicted by the DoF analysis. Finally, a polarization-aware study reveals that the electric-field components contribute unequally to the DoF and that the total-field DoF is twice that of a single polarization component.


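2606.19536 2026-06-19 eess.SP New submission

Multistatic J-Band Radar TX/RX Chipset in SiGe BiCMOS with Integrated x16 Frequency Multiplier Chain and High EIRP

A Multistatic J-Band Radar Transceiver Chipset in SiGe BiCMOS with an Integrated x16 Frequency Multiplier Chain and High EIRP

Stephan Hauptmeier, Kennet Braasch, Till Ziegler-Bellenberg, Diana P. Cortes N., Tobias T. Braun, Michael Höft, Nils Pohl

AI summary: Presents the design and measurement of a multistatic J-band radar chipset comprising transmit and receive MMICs with an integrated x16 frequency multiplier chain, achieving high EIRP and long-range detection.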

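AI abstract (translated from Chinese)

This paper presents the design and measurement of a multistatic J-band radar chipset comprising a transmitter MMIC and a receiver MMIC, both of which integrate a $\times$16 frequency multiplier chain for low-frequency local-oscillator distribution and scalable radar configurations. Multistatic radar architectures can maintain high transmit power and high receive sensitivity at the same time, an advantage this chipset fully exploits. To that end, a four-way power-combining amplifier chain integrated on the transmitter MMIC delivers 11.2 dBm of output power. At 292 GHz, the measured EIRP is 41 dBm with a collimating PTFE lens and 8.8 dBm without a lens. Despite the high multiplication factor, the on-chip harmonic rejection is better than 24 dBc, and multiple filter stages achieve a radiated in-band harmonic rejection of about 50 dBc. The receiver MMIC contains a three-stage low-noise amplifier and has an overall conversion gain of 43.3 dB at 292 GHz. Integrated on-chip patch antennas ease system integration and allow highly directive dielectric lenses to be used, making the chipset suitable for long-range radar measurements of up to 150 m. The MMICs are implemented in a 130 nm SiGe BiCMOS process with an $f_T$ of 500 GHz and an $f_{\max}$ of 610 GHz.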

English abstract

This work presents the design and measurement of a multistatic J-band radar chipset comprising a transmitter and a receiver MMIC, both featuring an integrated $\times$16 frequency multiplier chain for low-frequency local-oscillator distribution and scalable radar configurations. Multistatic radar architectures can sustain high transmission power and high receiver sensitivity simultaneously, an advantage that is fully leveraged in the present chipset. To this end, a four-way power-combining amplifier chain integrated on the transmitter MMIC delivers an output power of 11.2 dBm. The resulting measured EIRP is 41 dBm at 292 GHz with a collimating PTFE lens and 8.8 dBm without a lens. Despite the high frequency-multiplication factor, an on-chip harmonic rejection better than 24 dBc was measured, while a radiated in-band harmonic rejection of approximately 50 dBc was achieved through multiple filter stages. The receiver MMIC incorporates a three-stage low-noise amplifier and exhibits an overall conversion gain of 43.3 dB at 292 GHz. Integrated on-chip patch antennas facilitate system integration and the use of highly directive dielectric lenses, making the chipset suitable for long-range radar measurements, which are demonstrated up to 150 m. The MMICs are realized in a 130 nm SiGe BiCMOS technology with an $f_T$ and $f_{\max}$ of 500 GHz and 610 GHz, respectively.
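
Two figures implied by the abstract can be worked out with simple arithmetic. The sketch below assumes that the 292 GHz measurement frequency is the direct output of the $\times$16 chain, and that the difference between the two EIRP values can be attributed to the lens. Neither assumption is stated in the abstract, and the variable names are illustrative.

```python
MULTIPLIER = 16                  # frequency multiplication factor from the abstract
carrier_ghz = 292.0              # frequency at which EIRP was measured

# Assumption: the carrier is the multiplier-chain output, so the distributed
# local-oscillator signal would sit at carrier / 16.
lo_input_ghz = carrier_ghz / MULTIPLIER

eirp_with_lens_dbm = 41.0
eirp_without_lens_dbm = 8.8

# Assumption: the EIRP difference is the effective gain added by the lens.
lens_gain_db = eirp_with_lens_dbm - eirp_without_lens_dbm
lens_gain_linear = 10 ** (lens_gain_db / 10)

print(f"LO input: {lo_input_ghz} GHz")
print(f"Effective lens gain: {lens_gain_db:.1f} dB (x{lens_gain_linear:.0f} in power)")
```

Under these assumptions the local oscillator would be distributed at 18.25 GHz, which is consistent with the abstract's description of low-frequency LO distribution.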


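2606.19453 2026-06-19 eess.AS New submission

A Survey of Full-Duplex Spoken Dialogue Systems: Architectural Hierarchy, Interaction Ontology, and Decision State Machine

Survey of Full-Duplex Spoken Dialogue Systems: Architecture Levels, Interaction Ontology, and Decision State Machine

Jingyu Lu, Yuhan Wang, Jianming Luo, Yifu Chen, Tianle Liang, Shengpeng Ji, Ziyue Jiang, Xiaoda Yang, Yu Zhang, Xize Cheng, Chenyuhao Wen, Changhao Pan, Haoxiao Wang, Chen Ye, Jian Wu, Xiaoxi Jiang, Guanjun Jiang, Zhou Zhao

AI summary: To resolve the ambiguity of the term "full-duplex", proposes three frameworks: an L0-L3 architectural hierarchy, a T×I×R interaction ontology, and an IDLE/LISTEN/SPEAK/WAIT/DUAL decision state machine. The survey also reveals a realization gap in how existing systems are trained and evaluated.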

Comments 34 pages, 5 figures, 7 tables. Project page and interactive demo: https://github.com/DuplexLM/DuplexSurvey

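AI abstract (translated from Chinese)

More than a dozen spoken dialogue systems have recently claimed to achieve "full-duplex" operation, but the term has been used to describe fundamentally different capabilities. Existing surveys place them on a single axis (cascaded/end-to-end, or engineered/learned) and overlook the distinctions builders care about most. We argue that this ambiguity stems largely from a taxonomy problem: current terminology does not specify where duplex decisions are made, which interaction types are supported, or how the system behaves from moment to moment. This paper introduces three complementary frameworks: (i) an L0-L3 architectural hierarchy that locates where duplex decisions are made; (ii) a T×I×R interaction ontology that specifies the temporal relation, user intent, and required system response for each interaction; and (iii) a decision state machine (IDLE/LISTEN/SPEAK/WAIT/DUAL) that describes how the system moves between states. Through an audit of published systems and benchmarks, we document a realization gap: although many architectures can in principle operate in full-duplex states, their observed behavior remains constrained by the interaction patterns represented in training and evaluation. We identify the limited coverage of public training data relative to (mostly undisclosed) industrial corpora, together with the as-yet unrealized goal of L3 representation-level modeling, as key frontiers for future research on full-duplex dialogue. Related material is available at https://github.com/DuplexLM/DuplexSurvey.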

English abstract

More than a dozen spoken dialogue systems have recently claimed to be "full-duplex," yet the term has been used to describe substantially different capabilities. Existing surveys collapse them onto a single axis (cascaded/end-to-end, or engineered/learned) and miss the distinctions that matter most for builders. We argue that much of this ambiguity is taxonomical: current terminology does not specify where duplex decisions are made, which interaction types are supported, or how a system behaves moment by moment. This paper introduces three complementary frameworks: (i) an L0-L3 Architectural Hierarchy that locates where duplex decisions are made; (ii) a $T\times I\times R$ Interaction Ontology that specifies the temporal relation, user intent, and required system response for each interaction; and (iii) a Decision State Machine (IDLE/LISTEN/SPEAK/WAIT/DUAL) that describes how systems move between states. Across published systems and benchmarks, our audit documents a realization gap: although many architectures can in principle operate in full-duplex states, their observed behavior remains constrained by the interaction patterns represented in training and evaluation. We point to the limited public training-data coverage relative to the (largely undisclosed) industrial corpora, together with the still-unrealized goal of L3 representation-level modeling, as the key frontiers for future research on full-duplex dialogue. The related material is available at https://github.com/DuplexLM/DuplexSurvey.
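
The abstract names the five states of the Decision State Machine but not its events or transitions. The sketch below shows one way such a machine could be encoded. Only the state names come from the abstract: the `Event` set, the transition table, and the meanings given to WAIT (user has paused, system is holding) and DUAL (both parties speaking) are assumptions made for illustration and may differ from the survey's definitions.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    LISTEN = auto()
    SPEAK = auto()
    WAIT = auto()
    DUAL = auto()

class Event(Enum):          # hypothetical event set, not from the survey
    USER_START = auto()
    USER_STOP = auto()
    SYSTEM_START = auto()
    SYSTEM_STOP = auto()
    TIMEOUT = auto()

# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, Event.USER_START): State.LISTEN,
    (State.IDLE, Event.SYSTEM_START): State.SPEAK,
    (State.LISTEN, Event.USER_STOP): State.WAIT,
    (State.LISTEN, Event.SYSTEM_START): State.DUAL,   # e.g. a backchannel
    (State.WAIT, Event.USER_START): State.LISTEN,
    (State.WAIT, Event.SYSTEM_START): State.SPEAK,
    (State.WAIT, Event.TIMEOUT): State.IDLE,
    (State.SPEAK, Event.SYSTEM_STOP): State.IDLE,
    (State.SPEAK, Event.USER_START): State.DUAL,      # e.g. a barge-in
    (State.DUAL, Event.USER_STOP): State.SPEAK,
    (State.DUAL, Event.SYSTEM_STOP): State.LISTEN,
}

def step(state, event):
    """Return the next state; unlisted (state, event) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def run(events, state=State.IDLE):
    """Apply a sequence of events and return the list of visited states."""
    trace = []
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace

# A user turn, a system reply, and a user barge-in that the system yields to.
trace = run([Event.USER_START, Event.USER_STOP, Event.SYSTEM_START,
             Event.USER_START, Event.SYSTEM_STOP])
print([s.name for s in trace])
```

A half-duplex system in this encoding would simply lack the two transitions into DUAL, which is one way to make the "where can the system overlap with the user" question explicit.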


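2606.20240 2026-06-19 econ.EM stat.AP New submission

Two-Sample IV: Efficient Two-Step Estimation and Tests for Overidentification and Weak-Instruments

Two-Sample IV: Efficient Two-Step Estimation and Tests for Overidentification and Weak Instruments

Fatima Kasenally, Ruoxi Guan, Frank Windmeijer

AI summary: For two-sample IV estimation, proposes an efficient two-step estimation method and an overidentification test that are robust to heteroskedasticity and sample heterogeneity and require only summary statistics from linear regressions. It also extends weak-instrument tests.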

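AI abstract (translated from Chinese)

Two-sample IV is a popular estimation method when the outcome and treatment variables are available in different samples while the instruments are available in both. The standard estimator is the two-sample two-stage least squares estimator, which is efficient under homoskedasticity and sample homogeneity. We develop a robust two-step procedure for efficient estimation under general heteroskedasticity and sample heterogeneity, and propose an associated two-sample Hansen overidentification test. A key feature of our approach is that it needs only summary statistics from the linear reduced-form and first-stage regressions in the two samples. These are six objects: the estimated coefficient vectors and the homoskedastic and heteroskedasticity-robust estimated variance matrices. We further show that, under homoskedasticity and homogeneity, the first-stage F-statistic in the treatment sample can be used as a weak-instrument test in the standard way, with the relative bias here being a proportional bias. We propose an extension of the effective F-statistic of Montiel-Olea and Pflueger (2013) to the heteroskedastic case, following the generalization of Windmeijer (2025). We illustrate the estimators and tests in an application to Marshall (2019), which studies the effect of education on voting behavior, using cluster-robust inference.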

English abstract

Two-sample IV is a popular estimation method when the outcome and treatment variables are available in different samples, whereas instruments are available in both samples. The standard estimator is the two-sample two-stage least squares estimator, which is efficient under homoskedasticity and homogeneity of the samples. We develop a robust two-step procedure for efficient estimation under general heteroskedasticity and heterogeneity of the samples, and propose a related two-sample Hansen overidentification test. A key feature of our approach is that only summary statistics from the linear regressions of the reduced form and first stage in the two samples are needed. These are the six objects of the estimated coefficient vectors, and the homoskedastic and heteroskedasticity-robust estimated variance matrices. We further show that the first-stage F-statistic in the treatment sample can be used as a test for weak instruments in the standard way under homoskedasticity and homogeneity, with the relative bias here a proportional bias. We propose an extension of the effective F-statistic of Montiel-Olea and Pflueger (2013) for the heteroskedastic case, following the generalization in Windmeijer (2025). We illustrate the estimators and tests in an application studying the effect of education on voting behavior from Marshall (2019), with cluster-robust inference.
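
To make the two-sample setup concrete, the sketch below implements the standard estimator in its simplest form: one instrument and one treatment, where two-sample two-stage least squares reduces to the ratio of the reduced-form slope (from the outcome sample) to the first-stage slope (from the treatment sample). It is not the authors' efficient two-step procedure. The data-generating process, coefficients, and sample sizes are invented for illustration.

```python
import random

def ols_slope(z, y):
    """Slope coefficient from a simple regression of y on z with an intercept."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
    var = sum((a - mean_z) ** 2 for a in z)
    return cov / var

rng = random.Random(0)
TRUE_BETA = 2.0   # hypothetical causal effect of treatment x on outcome y

def draw(n):
    """Simulate (instrument, treatment, outcome) rows with an unobserved confounder."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        u = rng.gauss(0, 0.5)                  # confounder of x and y
        x = 0.8 * z + u + rng.gauss(0, 0.5)
        y = TRUE_BETA * x + u + rng.gauss(0, 0.5)
        rows.append((z, x, y))
    return rows

# Outcome sample observes (z, y) only; treatment sample observes (z, x) only.
outcome_sample = [(z, y) for z, _, y in draw(5000)]
treatment_sample = [(z, x) for z, x, _ in draw(5000)]

# Only these two regression summaries are needed for the point estimate.
gamma_hat = ols_slope([z for z, _ in outcome_sample],
                      [y for _, y in outcome_sample])    # reduced form
pi_hat = ols_slope([z for z, _ in treatment_sample],
                   [x for _, x in treatment_sample])     # first stage

beta_hat = gamma_hat / pi_hat
print(f"two-sample IV estimate: {beta_hat:.3f} (true value {TRUE_BETA})")
```

The point estimate uses nothing beyond the two slope coefficients, which reflects the abstract's emphasis on working from regression summary statistics. Inference, overidentification tests, and robustness to heteroskedasticity would additionally need the variance matrices the abstract mentions.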


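2606.20514 2026-06-19 stat.ME New submission

Hypergraph Variable Selection with False Discovery Rate Control

Hypergraph-Based Variable Selection with False Discovery Rate Control

Sarah Organ, Toby Kenney, Hong Gu

AI summary: To address the loss of power in variable selection methods caused by complex dependence structures among predictors, proposes a hypergraph-based selection method that improves selection power while controlling the false discovery rate.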

Comments 28 pages, 4 figures

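2606.20406 2026-06-19 stat.ME stat.CO New submission

Flexible modeling of bimodal distributions via skewed-$t$ mixtures

Flexible Modeling of Bimodal Distributions: A Mixture Model Based on Skewed-$t$ Distributions

Marco Bee, Flavio Santi

AI summary: Proposes a mixture model based on the skewed-$t$ distribution of Fernández and Steel (1998), with maximum likelihood estimation via the EM algorithm and a likelihood ratio test, for fitting bimodal, skewed, and heavy-tailed data. Bimodality is confirmed on the Standard & Poor's 500 index.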

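AI abstract (translated from Chinese)

We propose a mixture model of location-scale skewed-$t$ distributions for fitting bimodal, skewed, and heavy-tailed data. In particular, the mixture is based on the skewed-$t$ distribution of Fernández and Steel (1998), so the model-building procedure extends easily to mixtures of other symmetric distributions. After studying the properties of the mixture, we develop a maximum likelihood estimation method via the EM algorithm and propose a likelihood ratio test of the null hypothesis of no skewness in any given component. A simulation-based comparison with a recently proposed mixture of g-and-h distributions shows that the proposed model performs very well, both in estimation precision in well-specified settings and in modeling capability in misspecified frameworks. Fitting the model to Standard & Poor's 500 distortion data confirms the bimodality of its distribution, which implies that the US stock market has historically been in bearish or bullish conditions rather than near its fundamental value.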

English abstract

We propose a mixture of location-scale skewed-$t$ distributions to fit bimodal, skewed and heavy-tailed data. In particular, the mixture is based on the skewed-$t$ distribution by Fernández and Steel (1998), so that the model-building procedure can be easily extended to mixtures of other symmetric distributions. After studying the properties of the mixture, we develop a maximum likelihood estimation approach via the EM algorithm and a likelihood ratio test of the null hypothesis of no skewness in any given component. A simulation-based comparison to a recently proposed mixture of g-and-h distributions suggests that the performance of the proposed model is excellent, in terms of both estimation precision in well-specified setups and modeling capability in mis-specified frameworks. Fitting the model to the Standard & Poor's 500 distortion allows us to confirm the bimodality of its distribution, with the implication that the US stock market has historically been in bearish or bullish conditions, rather than near its fundamental value.
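
For readers unfamiliar with the Fernández and Steel (1998) construction, the sketch below evaluates a two-component mixture density built from it. Their approach skews a symmetric density $f$ with a parameter $\gamma > 0$ via $p(x \mid \gamma) = \frac{2}{\gamma + 1/\gamma}\,[\,f(x/\gamma)\,\mathbb{1}\{x \ge 0\} + f(\gamma x)\,\mathbb{1}\{x < 0\}\,]$, here with $f$ a Student-$t$ density. The location-scale wrapper and all parameter values are assumptions for illustration, and the paper's exact parameterization may differ. No EM fitting is shown.

```python
import math

def student_t_pdf(x, nu):
    """Density of the standard Student-t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def skew_t_pdf(x, mu, sigma, gamma, nu):
    """Fernandez-Steel skewed-t density with location mu and scale sigma.

    gamma > 1 stretches the right tail, gamma < 1 the left; gamma = 1 is symmetric.
    """
    z = (x - mu) / sigma
    base = student_t_pdf(z / gamma, nu) if z >= 0 else student_t_pdf(z * gamma, nu)
    return 2.0 / (gamma + 1.0 / gamma) * base / sigma

def mixture_pdf(x, components):
    """Weighted sum of skewed-t densities; each component is (w, mu, sigma, gamma, nu)."""
    return sum(w * skew_t_pdf(x, mu, sigma, gamma, nu)
               for w, mu, sigma, gamma, nu in components)

# Hypothetical two-component mixture: a left-skewed and a right-skewed mode.
components = [(0.4, -2.0, 1.0, 0.8, 5.0),
              (0.6, 2.5, 1.2, 1.5, 5.0)]

# Trapezoidal check that the mixture integrates to (approximately) one.
step = 0.02
grid = [-80 + i * step for i in range(8001)]
values = [mixture_pdf(x, components) for x in grid]
total_mass = step * (sum(values) - 0.5 * (values[0] + values[-1]))
print(f"integral over [-80, 80]: {total_mass:.4f}")
```

With these made-up parameters the density is higher at each component location than at the midpoint between them, giving the bimodal shape that the abstract's application relies on.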


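2606.20341 2026-06-19 stat.ME stat.AP New submission

Anchors Away: Navigating Unanchored Indirect Comparisons with Multilevel Unanchored Meta-Regression (ML-UMR)

Beyond Anchoring: Navigating Unanchored Indirect Comparisons with Multilevel Unanchored Meta-Regression (ML-UMR)

Conor Chandler, Jack Ishak

AI summary: For unanchored treatment comparisons when randomized evidence is unavailable, proposes multilevel unanchored meta-regression (ML-UMR), which jointly models individual and aggregate data in a Bayesian framework and estimates marginal and conditional effects across multiple treatments, studies, and target populations. It also makes the identification and transportability assumptions explicit.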

Comments 20 pages (excluding supplementary material), 5 figures

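AI abstract (translated from Chinese)

When randomized evidence is unavailable, unanchored indirect treatment comparisons using single-arm studies or disconnected evidence are increasingly used in health technology assessment (HTA). Existing methods, including matching-adjusted indirect comparison (MAIC) and simulated treatment comparison (STC), are usually limited to pairwise settings and typically estimate marginal effects in the comparator study population, which may differ from the decision-relevant population. We propose multilevel unanchored meta-regression (ML-UMR), a Bayesian regression framework for synthesizing individual patient data and aggregate data from fully disconnected evidence. ML-UMR extends multilevel network meta-regression (ML-NMR) to the unanchored setting by jointly modeling individual-level and aggregate-level data in a unified likelihood, which allows estimation of treatment-specific outcomes and of marginal and conditional effects across multiple treatments, studies, and target populations. ML-UMR separates the assumptions needed to identify treatment effects from those needed to transport results to a target population. As with all unanchored comparisons, valid inference relies on strong and usually unverifiable assumptions, including conditional exchangeability, correct specification of the outcome model, and cross-treatment assumptions such as the shared prognostic factor assumption (SPFA). ML-UMR does not relax these requirements; it makes them explicit within a unified framework and facilitates sensitivity analysis. In simulation studies, ML-UMR yielded low bias and nominal coverage for comparator-population effects. Transportability to other populations depended critically on the identifying assumptions: under strong effect modification, violating the SPFA led to bias, whereas incorporating subgroup information restored nearly unbiased estimates and nominal coverage.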

English abstract

Unanchored indirect treatment comparisons using single-arm studies or disconnected evidence are increasingly used in health technology assessment (HTA) when randomized evidence is unavailable. Existing methods, including matching-adjusted indirect comparison (MAIC) and simulated treatment comparison (STC), are generally limited to pairwise settings and typically estimate marginal effects in the comparator study population, which may differ from the decision-relevant population. We propose multilevel unanchored meta-regression (ML-UMR), a Bayesian regression framework for synthesizing individual patient data and aggregate data from fully disconnected evidence. ML-UMR extends multilevel network meta-regression (ML-NMR) to unanchored settings by jointly modeling individual- and aggregate-level data within a unified likelihood, enabling estimation of treatment-specific outcomes and both marginal and conditional effects across multiple treatments, studies, and target populations. ML-UMR distinguishes assumptions required to identify treatment effects from those required to transport results to target populations. As with all unanchored comparisons, valid inference relies on strong and often unverifiable assumptions, including conditional exchangeability, correct specification of the outcome model, and cross-treatment assumptions (e.g., shared prognostic factor assumption (SPFA)). ML-UMR does not lessen these requirements but makes them explicit within a unified framework and facilitates sensitivity analyses. In simulation studies, ML-UMR produced low bias and nominal coverage for comparator-population effects. Transportability to alternative populations depended critically on identifying assumptions: violations of SPFA led to bias under strong effect modification, whereas incorporating subgroup information restored near-unbiased estimation and nominal coverage.


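2606.20226 2026-06-19 stat.ME stat.CO New submission

Analysis of uncertain fixed-effects model for Latin square designs

Analysis of an Uncertain Fixed-Effects Model for Latin Square Designs

Yaru Cheng, Zhiming Li

AI summary: For uncertain experimental data that lack frequency stability, establishes an uncertain fixed-effects model for Latin square designs, proposes three estimation methods, and constructs confidence intervals. It also carries out an uncertain homogeneity test and common tests, and validates the model through numerical simulations and real examples.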

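2606.20191 2026-06-19 stat.ML stat.ME New submission

AK-MCS-C2: Active Kriging Monte Carlo Simulation method with conformal certification for failure probability estimation

AK-MCS-C2: An Active Kriging Monte Carlo Simulation Method with Conformal Certification for Failure Probability Estimation

Edgar Jaber, Vincent Chabridon, Mathilde Mougeot

AI summary: Proposes an active learning framework that combines active Kriging Monte Carlo simulation with conformal prediction. Using an adaptive cross-conformal strategy and a J+GP conformal estimator, it provides distribution-free prediction-error guarantees from few samples and improves the reliability of sample classification near the limit-state surface, thereby increasing the accuracy and robustness of failure probability estimates.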
