arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.02989 2026-06-19 cs.IT eess.SP math.IT stat.ML Updated version

Information Theory and Statistical Learning

Abbas El Gamal

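AI summary A preprint of a chapter under consideration for the third edition of Cover and Thomas's Elements of Information Theory. It covers the role of divergence measures in model training, with examples ranging from linear regression to generative diffusion models, and gives a more systematic derivation of the diffusion model than is typical in the literature.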

Abstract

This manuscript contains a preprint of a chapter under consideration for inclusion in the forthcoming third edition of Cover and Thomas's Elements of Information Theory, posted with permission from Wiley. The table of contents of the new edition, EIT-3 ToC, can be found at: https://docs.google.com/document/d/1L-m4oQEJw1PJhoxBeMwrrBD8S_HmvzMEkPbYvS24980/edit?usp=sharing . For feedback, please contact abbas@ee.stanford.edu. Learning and information theory intersect in both model training and the characterization of fundamental performance limits. This manuscript provides a concise and accessible treatment of the first intersection, requiring only basic background in information theory and statistics at the senior undergraduate or first-year graduate level. End-of-chapter exercises make the material well suited for classroom use as well as self-study. The chapter focuses on the role of divergence measures in model training, with examples ranging from linear and logistic regression to autoregressive models, variational autoencoders, diffusion models, generative adversarial networks, and score-based models. It introduces the evidence lower bound (ELBO), f-divergences, and the Fisher divergence. In particular, the treatment of the generative diffusion model provides a more systematic and explicit derivation than is typical in the literature.
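
Illustrative note (not taken from the chapter): one reason divergence measures appear in model training is the identity H(p, q) = H(p) + D(p || q), so minimizing cross-entropy over the model q amounts to minimizing the KL divergence from the data distribution p. The sketch below checks this numerically; the distributions `p` and `q` and all function names are assumptions made for the example.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats; assumes q > 0 wherever p > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical data distribution p and model distribution q.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# Cross-entropy splits into a model-independent term plus the divergence.
gap = cross_entropy(p, q) - entropy(p)
print(f"D(p||q) = {kl_divergence(p, q):.6f}, H(p,q) - H(p) = {gap:.6f}")
```

Because H(p) does not depend on q, the two objectives share the same minimizer.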

2606.05846 2026-06-19 cs.CL eess.AS Updated version

Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

Gio Paik, Hyunseo Shin, Soungmin Lee

Affiliations University of Tokyo

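AI summary Investigates, through model merging and domain generalization methods, whether code-switching ability learned from a limited set of language pairs generalizes to unseen pairs. Experiments show that merged bilingual CS-ASR models generalize to unseen language pairs only modestly.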

Comments ICML 2026 Workshop on Machine Learning for Audio

Abstract

Automatic Speech Recognition (ASR) has become a key technology for human-AI interaction. However, code-switching ASR (CS-ASR) remains particularly challenging due to the severe scarcity of multilingual CS speech resources across diverse language pairs. Existing approaches primarily improve CS-ASR performance through synthetic CS speech generation or pair-specific fine-tuning on limited bilingual datasets. Nevertheless, these approaches face an inherent scalability limitation, as support for CS must be developed separately for language pairs whose number grows combinatorially with the number of supported languages. In this work, we investigate whether CS capabilities learned from a limited set of seen language pairs can generalize to unseen language pairs through model merging and domain generalization methods. Our experiments show that merged bilingual CS-ASR models modestly generalize to unseen language pairs, suggesting limited transfer of bilingual CS capabilities across language pairs.
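
Illustrative note (not taken from the paper): the abstract does not say which merging method is used. As one simple and common form of model merging, the sketch below takes a weighted average of the parameters of fine-tuned checkpoints that share an architecture. The checkpoint names, parameter names, and values are all hypothetical.

```python
def merge_models(state_dicts, weights=None):
    """Weighted average of parameter vectors from same-architecture checkpoints.

    Each state dict maps a parameter name to a flat list of floats.
    Uniform weights are used when none are given.
    """
    n = len(state_dicts)
    if weights is None:
        weights = [1.0 / n] * n
    merged = {}
    for name in state_dicts[0]:
        length = len(state_dicts[0][name])
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(weights, state_dicts))
            for i in range(length)
        ]
    return merged

# Hypothetical bilingual CS-ASR checkpoints for two seen language pairs.
pair_a = {"encoder.w": [1.0, 2.0], "decoder.w": [0.0]}
pair_b = {"encoder.w": [3.0, 4.0], "decoder.w": [2.0]}

merged = merge_models([pair_a, pair_b])
print(merged)
```

The merged parameters would then be evaluated on a language pair that neither checkpoint saw during fine-tuning.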

2605.28654 2026-06-19 cs.RO cs.SY eess.SY math.OC Updated version

Integrated Exploration-Aware UAV Route Optimization and Path Planning

Jimin Choi, Grant Stagg, Cameron K. Peterson, Max Z. Li

Affiliations Department of Aerospace Engineering, University of Michigan; Department of Electrical Engineering, Brigham Young University; Department of Aerospace Engineering, Department of Civil and Environmental Engineering, and Department of Industrial and Operations Engineering, University of Michigan

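AI summary Proposes an integrated exploration-aware UAV route optimization and path planning framework that combines a risk map, uncertain regions of interest, B-spline trajectory optimization, and online replanning to balance visits to reported sites against exploration for new information in hazard monitoring. Online replanning improves average KL reduction by 15.9% over the offline optimized planner.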

Abstract

Uncrewed aerial vehicles (UAVs) are increasingly used for exploration-driven monitoring in hazardous environments such as disaster zones, contaminated sites, wildfire areas, and damaged infrastructure, where limited flight endurance must be allocated between visiting reported locations and gathering new information. In these settings, prior information regarding hazards is often incomplete, spatially imprecise, and subject to change during execution. For example, initial reports may identify a region where a hazard is likely to exist, but the actual hazard may be displaced, partially observed, or entirely unreported. We present an integrated exploration-aware UAV route optimization and path planning framework for hazard monitoring under uncertain and evolving prior information. The environment is represented as a spatial risk map, where each location has an associated belief of hazardous conditions. Reported hazards are modeled as uncertain regions of interest (ROIs) rather than confirmed target locations, requiring the UAV to inspect reported areas while also using its limited flight endurance to explore informative regions. The proposed method solves a vehicle routing problem over reported ROIs, augments the route with auxiliary pseudo-nodes to improve spatial coverage, allocates the remaining flight distance budget across route segments, and optimizes dynamically feasible B-spline trajectories for local exploration. During execution, UAV measurements update a grid-based belief map, and the remaining trajectory is replanned when new information and the remaining budget justify adaptation. Across 48 scenario configurations, online replanning improves average KL reduction by 15.9% over the offline optimized planner and 48.6% over straight-line traversal.
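
Illustrative note (not taken from the paper): the abstract does not define how KL reduction is computed. One plausible reading is that each grid cell holds a Bernoulli belief of hazard, measurements update it by Bayes' rule, and progress is scored as the drop in KL divergence between the ground truth and the belief. The sketch below follows that reading; the sensor rates `p_hit` and `p_false` and all names are assumptions.

```python
import math

def update_belief(prior, detected, p_hit=0.9, p_false=0.1):
    """Bayes update of P(hazard) for one grid cell from a binary detection."""
    like_hazard = p_hit if detected else 1.0 - p_hit
    like_clear = p_false if detected else 1.0 - p_false
    evidence = like_hazard * prior + like_clear * (1.0 - prior)
    return like_hazard * prior / evidence

def bernoulli_kl(p, q, eps=1e-9):
    """D(Bern(p) || Bern(q)) in nats, clamped away from 0 and 1 to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

truth = 1.0   # hypothetical cell that really is hazardous
prior = 0.5   # uninformed belief before the flyover
posterior = update_belief(prior, detected=True)

# Positive when the measurement moved the belief toward the truth.
kl_reduction = bernoulli_kl(truth, prior) - bernoulli_kl(truth, posterior)
print(posterior, kl_reduction)
```

Summing such per-cell reductions over the map would give a single score for comparing planners under this reading.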

2603.19895 2026-06-19 eess.SY cs.SY math.CV math.DG math.DS Updated version

Complex Frequency as Generalized Eigenvalue

Nikolas Sofos, Federico Milano

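AI summary Shows that complex frequency, when applied to the states of linear time-invariant systems, generalizes eigenvalues. Starting from the definition of geometric frequency and its decomposition, the paper obtains complex frequency as the restriction to the two-dimensional Euclidean plane, establishes its equivalence with eigenvalues for diagonalizable linear systems, and notes that the equivalence does not generally hold for nonlinear systems.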

Abstract

This paper shows that the concept of complex frequency, originally introduced to characterize the dynamics of signals with complex values, constitutes a generalization of eigenvalues when applied to the states of linear time-invariant (LTI) systems. The starting point is the definition of geometric frequency, which provides a geometrical interpretation of frequency in electric circuits and admits a natural decomposition into symmetric and antisymmetric components associated with amplitude variation and rotational motion, respectively. We show that complex frequency arises as the restriction of geometric frequency to the two-dimensional Euclidean plane. For LTI systems, it is shown that the complex frequencies computed from the system's states subject to a non-isometric transformation coincide with the original system's eigenvalues. This equivalence is demonstrated for diagonalizable systems of any order. The paper provides a unified geometric interpretation of eigenvalues, bridging classical linear system theory with differential geometry of curves. The paper also highlights that this equivalence does not generally hold for nonlinear systems. On the other hand, the geometric frequency of the system can always be defined, providing a geometrical interpretation of the system flow. A variety of examples based on linear and nonlinear circuits illustrate the proposed framework.
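
Illustrative note (not taken from the paper): the eigenvalue equivalence can be sketched in a few lines, under the assumption that the complex frequency of a complex-valued signal $u(t) = \rho(t) e^{j\theta(t)}$ is its logarithmic derivative $\dot{u}/u$. The symbols $V$, $\Lambda$, $z$, and $\eta_i$ are notation chosen for this sketch.

```latex
% Complex frequency of a signal u = \rho e^{j\theta} (assumed definition):
\[
  \frac{\dot{u}}{u} = \frac{\dot{\rho}}{\rho} + j\,\dot{\theta}
\]
% Diagonalizable LTI system, with modal coordinates z obtained through the
% (generally non-isometric) change of basis V^{-1}:
\[
  \dot{x} = A x, \qquad A = V \Lambda V^{-1}, \qquad z = V^{-1} x
  \;\Longrightarrow\; \dot{z} = \Lambda z
\]
% Each modal coordinate then has a constant complex frequency equal to an eigenvalue:
\[
  \eta_i = \frac{\dot{z}_i}{z_i} = \lambda_i \qquad (z_i \neq 0)
\]
```

The real part of $\eta_i$ is the amplitude rate and the imaginary part is the rotation rate of the $i$-th mode.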

2605.17443 2026-06-19 cs.CL cs.SD eess.AS Updated version

Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

Donghyuk Jung, Youngwon Choi

Affiliations Korea Culture Technology Institute, Republic of Korea; Maum AI Inc., Republic of Korea

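AI summary Analyzes how ASR errors propagate through ASR-LLM cascades in Korean spoken question answering, focusing on downstream semantic failures that conventional ASR metrics cannot fully capture. Relative degradation is consistent across LLMs with different absolute performance, single-character Korean ASR errors form a Korean-specific loss channel, and an auxiliary comparison shows a large audio language model outperforming an ASR-LLM cascade with an approximately matched language backbone on noisy Korean SQA.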

Comments Preprint. Submitted to APSIPA ASC 2026

Abstract

We analyze how automatic speech recognition (ASR) errors propagate through ASR-LLM cascades in Korean spoken question answering (SQA), focusing on downstream semantic failures that conventional ASR metrics cannot fully capture. Our analysis shows that the relative downstream degradation caused by ASR errors is consistent across LLMs with different absolute performance, suggesting that cascade degradation largely tracks ASR-stage information loss. We further identify single-character Korean ASR errors as a Korean-specific loss channel, where even a minimal transcription difference can change the intended question and degrade downstream QA performance. Finally, an auxiliary comparison shows that a large audio language model outperforms an ASR-LLM cascade with an approximately matched language backbone in noisy Korean SQA, indicating the potential of direct audio input to mitigate transcript-induced information loss.
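
Illustrative note (not taken from the paper): a character error rate (CER) computation shows why conventional ASR metrics can understate such failures. In the made-up sentence pair below, the transcript differs from the reference by one syllable character, 사과 (apple) versus 사고 (accident), so the CER is small even though the question has changed. The function names and the example sentences are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    return edit_distance(ref, hyp) / len(ref)

reference = "사과는 어디에 있나요"   # hypothetical spoken question about an apple
transcript = "사고는 어디에 있나요"  # hypothetical ASR output, now about an accident

print(edit_distance(reference, transcript), round(cer(reference, transcript), 3))
```

A downstream LLM answering from the transcript would be answering a different question, which a low CER alone does not reveal.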

2605.08525 2026-06-19 cs.RO cs.SY eess.SY Updated version

Model-Reference Adaptive Flight Control of a 95-mg Insect-Scale Flapping-Wing Aerial Robot

Francisco M. F. R. Gonçalves, Conor K. Trygstad, Néstor O. Pérez-Arancibia

Affiliations Washington State University

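AI summary To handle parameter uncertainty and disturbances in insect-scale flapping-wing aerial robots, proposes a model-reference adaptive control (MRAC) architecture for high-performance position control, combined with a hybrid multiplicative extended Kalman filter, and validates hovering and trajectory tracking in experiments with a 95-mg robot.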

Comments Under review, 8 pages, 7 figures

Abstract

Due to the system's scale and complex fabrication, the model describing the dynamics of a flapping-wing insect-scale aerial robot is subject to parameter uncertainty; for example, in the inertia matrix and the actuator mapping of the flier. Furthermore, due to its low inertia, this type of robot is greatly affected by stochastic and systematic disturbances during flight, including power-wire tension, gusts, and undesired aerodynamic forces produced by wing misalignment. Therefore, the high-performance execution of complex maneuvers at the subdecigram scale requires the robot to adapt its behavior to counteract disturbances and model uncertainty. Toward this objective, we introduce a model-reference adaptive control (MRAC) architecture for high-performance position control of flapping-wing robotic insects that can be modeled as rigid bodies in three-dimensional (3D) space. In addition, we demonstrate how the implementation of a hybrid multiplicative extended Kalman filter for estimating current and desired angular velocities during flight significantly dampens attitude vibrations, especially along the roll and pitch degrees of freedom (DOFs), and also improves flight performance. To show the suitability, functionality, and high performance of the proposed approach, we conducted real-time hovering and trajectory-tracking 6-DOF flight control experiments with a 95-mg insect-scale aerial robot.

2605.00457 2026-06-19 cs.NI cs.LG cs.SY eess.SY Updated version

Utility-Aware DRL-Based TXOP Adaptation for NR-U and Wi-Fi Coexistence Networks

Po-Heng Chou, Yi-Fang Yu, Shou-Yu Chen, Chiapin Wang

Affiliations Research Center for Information Technology Innovation (CITI), Academia Sinica (AS); Department of Electrical Engineering, National Taiwan Normal University (NTNU)

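AI summary To address unbalanced spectrum utilization when NR-U and Wi-Fi coexist in the unlicensed spectrum, proposes a utility-aware deep reinforcement learning framework for adaptive TXOP control whose configurable reward and criterion design gives explicit control over the fairness-efficiency-utility tradeoff.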

Comments 15 pages, 13 figures, 2 tables, submitted to IEEE Open Journal of the Communications Society

Abstract

The coexistence of NR-U and Wi-Fi in the unlicensed spectrum introduces a challenging resource management problem, where heterogeneous channel access mechanisms can lead to unbalanced spectrum utilization and severe Wi-Fi performance degradation. To address this issue, this paper proposes a utility-aware deep reinforcement learning (DRL) framework for adaptive transmission opportunity (TXOP) control in NR-U/Wi-Fi coexistence networks. The coexistence process is formulated as a Markov decision process (MDP), in which the NR-U TXOP duration is treated as a controllable variable for regulating post-access channel occupancy. A deep Q-network (DQN) is then employed to learn adaptive TXOP control policies through online interaction with the coexistence environment. A key feature of the proposed framework is the integration of a configurable reward and criterion design, which enables explicit control of the fairness-efficiency-utility tradeoff. Three operating policies are developed, namely absolute fairness, moderate fairness, and utility-oriented moderate fairness, to characterize different coexistence operating points. Simulation results show that the proposed framework achieves a Jain fairness index above 0.9 under strict fairness control. Compared with the absolute fairness policy, the moderate fairness policy improves aggregate throughput by 68.22%, while the utility-oriented policy achieves a 177.6% improvement under the adopted utility evaluation metric. These results demonstrate that the proposed utility-aware DRL framework provides an effective and flexible solution for adaptive TXOP control and tradeoff management in heterogeneous unlicensed coexistence networks.
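
Illustrative note (not taken from the paper): the Jain fairness index cited in the abstract has the standard closed form J = (sum of x_i)^2 / (n * sum of x_i^2), which equals 1 for a perfectly even split and 1/n when one user takes everything. The sketch below computes it; the throughput values are hypothetical and are not results from the paper.

```python
def jain_index(throughputs):
    """Jain fairness index: (sum x)^2 / (n * sum x^2), in the range [1/n, 1]."""
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one nonzero throughput")
    total = sum(throughputs)
    return total * total / (n * sum_sq)

# Hypothetical per-technology throughputs in Mbit/s: (NR-U, Wi-Fi).
print(jain_index([10.0, 10.0]))  # even split
print(jain_index([10.0, 0.0]))   # one side starved
print(jain_index([30.0, 20.0]))  # mildly uneven split
```

A reward that penalizes a low index is one way a DRL agent could be steered toward fairness, though the abstract does not give the paper's reward function.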

2604.18105 2026-06-19 eess.AS cs.CL cs.SD Updated version

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

Yuan Xie, Jiaqi Song, Guang Qiu, Xianliang Wang, Kai Qiao, Junfeng Yuan, Shengqing Liu, Yi Zhang, Bowen Chen, Ming Lei, Jie Gao, Jie Wu

Affiliations Advanced Intelligent Systems Group, NIO

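AI summary Proposes NIM4-ASR, a framework that redesigns the multi-stage training paradigm (a reformulated pre-training architecture and objective, iterative asynchronous SFT, and ASR-specialized reinforcement learning) and adds production-oriented optimizations (robustness under noisy and silent conditions, streaming inference, and RAG-based hotword customization), reaching state-of-the-art performance on multiple public benchmarks with 2.3B parameters.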

Abstract

Integrating large language models (LLMs) into automatic speech recognition (ASR) has become a mainstream paradigm in recent years. Although existing LLM-based ASR models demonstrate impressive performance on public benchmarks, their training remains predominantly data-driven, leaving key practical challenges insufficiently addressed -- particularly limited downward scalability in resource-constrained deployments and hallucinations under acoustically challenging conditions. To address these issues, we present NIM4-ASR, a production-oriented LLM-based ASR framework optimized for both efficiency and robustness. Grounded in a principled delineation of functional roles between the encoder and the LLM, we redesign the multi-stage training paradigm to align each module with its intended capability boundary. Specifically, we reformulate the pre-training architecture and objective to mitigate the modality gap and improve parameter efficiency; introduce an iterative asynchronous SFT stage to preserve acoustic fidelity and constrain representation drift; and design an ASR-specialized reinforcement learning stage to further enhance recognition quality and robustness. We additionally incorporate a suite of production-oriented optimizations, including robustness under noisy and silent conditions, real-time streaming inference, and hotword customization via retrieval-augmented generation (RAG). Experiments show that NIM4-ASR achieves state-of-the-art performance on multiple public benchmarks with merely 2.3B parameters, while substantially outperforming larger-scale competitors on internal benchmarks -- particularly in entity-intensive real-world scenarios. NIM4-ASR further supports million-scale hotword customization via RAG with sub-millisecond retrieval latency, enabling efficient adaptation to emerging entities and personalized user requirements.

2604.03725 2026-06-19 quant-ph cs.IT eess.SP math.IT Updated version

Quantum Algebraic Diversity: Single-Copy Density Matrix Estimation via Group-Structured Measurements

Mitchell A. Thornton

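AI summary Extends the algebraic diversity framework to quantum measurement and presents the Quantum Algebraic Diversity Theorem: a group-structured POVM applied to a single copy of a quantum state yields a full-rank, group-averaged density matrix estimator, with fidelity above 0.90 in Monte Carlo simulations for d = 2 to 13. Also establishes a Classical-Quantum Duality Map and an Optimality Inheritance Theorem.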

Comments v3: copy-reduction claim corrected; fidelities fixed; 1 figure removed

Abstract

We extend the algebraic diversity (AD) framework from classical signal processing to quantum measurement theory. The Quantum Algebraic Diversity (QAD) Theorem establishes that a group-structured positive operator-valued measure (POVM) applied to a single copy of a quantum state produces a full-rank, group-averaged density matrix estimator whose eigenbasis and eigenvalue ordering track those of the true density matrix, with a bias toward the symmetrized state, analogous to the classical recovery of covariance eigenstructure from a single observation. We establish a Classical-Quantum Duality Map connecting classical covariance estimation to quantum state tomography, and an Optimality Inheritance Theorem showing that classical group optimality transfers to quantum settings via the Born map within the group-averaged family. SIC-POVMs are identified as AD with the Heisenberg-Weyl group and mutually unbiased bases as AD with the Clifford group, revealing the hierarchy $\mathrm{HW}(d) \subseteq \mathcal{C}(d) \subseteq S_d$ that mirrors the classical $\mathbb{Z}_M \subseteq G_{\min} \subseteq S_M$. The double-commutator eigenvalue theorem gives polynomial-time adaptive POVM selection. A worked qubit example shows the group-averaged estimator from a single computational-basis measurement, averaged over a matched $\mathbb{Z}_2$ group, reaching fidelity 0.99 where standard single-basis tomography gives a rank-1 estimate of fidelity 0.80. Monte Carlo simulations for $d = 2$ to $13$ confirm fidelity above 0.90 from a single outcome while standard fidelity degrades as $\sim 1/d$. The growing ratio reflects collapse of the rank-1 standard estimator, not fewer copies per parameter: the biased single-copy estimator reduces the number of distinct measurement settings, not the per-parameter sampling cost, and a genuine copy reduction holds only under exact symmetry.

2604.09795 2026-06-19 eess.SY cs.RO cs.SY Updated version

On Feedback Speed Control for a Planar Tracking

Xincheng Li, Tengyue Liu, Udit Halder

Affiliations Department of Mechanical and Aerospace Engineering, University of South Florida

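AI summary For a leader-follower planar tracking problem, proposes a feedback speed control law paired with a constant bearing steering strategy that maintains an abreast formation, proves asymptotic stability, and extends the law to an N-agent chain network.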

Abstract

This paper investigates a planar tracking problem between a leader and a follower agent. We propose a novel feedback speed control law, paired with a constant bearing steering strategy, to maintain an abreast formation between the two agents. We prove that the proposed control yields asymptotic stability of the closed-loop system when the steering of the leader is known. For the case when the leader's steering is unavailable to the follower, we show that the system is still input-to-state stable with respect to the leader's steering viewed as an input. Furthermore, we demonstrate that if the leader's steering is periodic, the follower will asymptotically converge to a periodic orbit with the same period. We validate these results through numerical simulations and experimental implementations on mobile robots. Finally, we demonstrate the scalability of the proposed approach by extending the two-agent control law to an N-agent chain network, illustrating its implications for directional information propagation in biological and engineered flocks.

2504.09642 2026-06-19 eess.SY cs.SY Updated version

HBS -- Hardware Build System: Characterizing and comparing direct-Tcl and indirect-abstract approaches for hardware build systems

Michał Kruszewski

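AI summary Characterizes and compares two approaches to hardware build systems: the direct-Tcl approach, in which build code is executed directly by the EDA tool, and the indirect-abstract approach, in which the build system generates a Tcl script that the EDA tool then runs. It also presents HBS, a new direct-Tcl build system that closes the functionality gap of existing direct-Tcl systems and serves as their representative in the comparison.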

Abstract

Build systems have become an indispensable part of the software implementation and deployment process. New programming languages, for example Go, Rust, or Zig, are released with the build system integrated into the language tools. However, in the hardware description domain, no official build systems have been released with the predominant Hardware Description Languages (HDL) such as VHDL or SystemVerilog. Moreover, hardware design projects are often multilingual. The paper characterizes and compares two common approaches for hardware build system implementations. The first is the direct-Tcl approach, in which the build system code is executed directly by the EDA tool during the design build flow. The second is the indirect-abstract approach, in which the build system produces a Tcl script that is later run by a proper EDA tool. As none of the existing direct-Tcl build systems came close to the indirect-abstract build systems in terms of supported functionality, the paper also presents a new direct-Tcl hardware build system called HBS. The implemented build system was used as a representative of direct-Tcl build systems in the comparison with indirect-abstract build systems.
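
Illustrative note (not taken from the paper): the indirect-abstract approach can be pictured as a program in a general-purpose language that emits a Tcl script for an EDA tool to run later. The sketch below is not how HBS or any existing build system works. The function name, file paths, top-level name, and part number are hypothetical, and it assumes a Vivado flow, where `read_vhdl` and `synth_design` are Tcl commands.

```python
def generate_tcl(top, vhdl_files, part):
    """Return the text of a minimal Tcl build script (hypothetical flow)."""
    lines = ["# Generated build script; regenerate instead of editing by hand"]
    for path in vhdl_files:
        # One source-reading command per design file.
        lines.append("read_vhdl " + path)
    # Synthesize the chosen top-level entity for the target device.
    lines.append("synth_design -top " + top + " -part " + part)
    return "\n".join(lines) + "\n"

script = generate_tcl(
    top="blinky",
    vhdl_files=["src/blinky.vhd", "src/pkg.vhd"],
    part="xc7a35tcpg236-1",
)
print(script)
```

In the direct-Tcl approach, by contrast, the build description would itself be Tcl code evaluated inside the EDA tool, with no generation step in between.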

2603.16865 2026-06-19 math.OC cs.SY eess.SY Updated version

Prescribed-Time Distributed Generalized Nash Equilibrium Seeking

Liraz Mudrik, Isaac Kaminer, Sean Kragelund, Abram H. Clark

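AI summary For safety-critical multi-agent systems, proposes the first fully distributed algorithm that solves generalized Nash equilibrium problems with shared coupling constraints at a user-prescribed time T, using a multi-rate gain schedule to resolve the coupling among the observer, optimization, and dual-consensus layers.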

Comments 12 pages, 5 figures

Abstract

Safety-critical multi-agent systems, from cooperative guidance to collision avoidance, must often reach a coordinated decision by a hard deadline rather than merely converge to one eventually. This paper proposes the first fully distributed algorithm that solves the generalized Nash equilibrium (GNE) problem, a non-cooperative game with shared coupling constraints and general cost coupling, at a user-prescribed time $T$ independent of initial conditions. The foundation is a centralized, prescribed-time result built on the optimization Lyapunov function framework and implemented via unnormalized Hessian-gradient feedback, chosen because, unlike the Newton and normalized Hessian-gradient realizations, it naturally splits into per-agent computations. Distributing this feedback requires each agent to run three coupled processes simultaneously: a prescribed-time observer of the global state, a local optimization law, and a dual-consensus mechanism that enforces the shared multipliers of the variational GNE. Their simultaneous operation is the core difficulty, as the optimization continually displaces the states the observers track, while estimation errors corrupt the gradients that drive the optimization. We resolve this coupling with a multi-rate gain schedule whose observer and dual-consensus layers contract strictly faster than the optimization layer, so that every error component vanishes exactly at $T$. A Fischer-Burmeister reformulation keeps the design projection-free while enforcing the constraints at the deadline. Numerical results for a Cournot game and a time-critical sensor-coverage problem validate the approach and demonstrate its use as a solver-in-the-loop for time-critical autonomy.
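
Illustrative note (not taken from the paper): the Fischer-Burmeister function phi(a, b) = sqrt(a^2 + b^2) - a - b is zero exactly when a >= 0, b >= 0, and a * b = 0, which is why it can encode complementarity conditions as an equation without a projection step. How the paper pairs it with multipliers and constraints is not spelled out in the abstract; the pairing in the comments below is an assumption.

```python
import math

def fischer_burmeister(a, b):
    """phi(a, b) = sqrt(a^2 + b^2) - a - b; zero iff a >= 0, b >= 0, a * b = 0."""
    return math.hypot(a, b) - a - b

# Assumed pairing: a is a multiplier, b is the constraint slack -g(x).
print(fischer_burmeister(0.0, 3.0))   # inactive constraint, zero multiplier
print(fischer_burmeister(2.0, 0.0))   # active constraint, positive multiplier
print(fischer_burmeister(1.0, 1.0))   # complementarity violated
print(fischer_burmeister(-1.0, 0.0))  # negative multiplier
```

Driving phi to zero therefore enforces feasibility, multiplier sign, and complementarity together.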

2603.16941 2026-06-19 eess.AS cs.CL cs.SD Updated version

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

Shree Harsha Bokkahalli Satish, Christoph Minixhofer, Maria Teleki, James Caverlee, Ondřej Klejch, Peter Bell, Gustav Eje Henter, Éva Székely

Affiliations Department of Speech, Music and Hearing, KTH Royal Institute of Technology, Sweden; Centre for Speech Technology Research, University of Edinburgh, UK; Texas A&M University, USA

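AI summary Using 2,880 controlled interactions, evaluates intersectional accent and gender bias in three SpeechLLMs across six English accents and two gender presentations. Eastern European-accented speech, especially from female-presenting voices, receives lower helpfulness scores, and human evaluators are more sensitive to these disparities than LLM judges.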

Comments 5 pages, 3 figures, 1 table, Accepted to Interspeech 2026

Abstract

Speech Large Language Models (SpeechLLMs) process spoken input directly, retaining cues such as accent and perceived gender that were previously removed in cascaded pipelines. This introduces speaker-identity-dependent variation in responses. We present a large-scale intersectional evaluation of accent and gender bias in three SpeechLLMs using 2,880 controlled interactions across six English accents and two gender presentations, keeping linguistic content constant through voice cloning. Using pointwise LLM-judge ratings, pairwise comparisons, and Best-Worst Scaling with human validation, we detect recurring directional disparities. Eastern European-accented speech receives lower helpfulness scores, particularly for female-presenting voices. Responses remain polite but differ in helpfulness. While LLM judges capture the directional trend of these biases, human evaluators exhibit significantly higher sensitivity, showing stronger accent-level contrasts.
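
Illustrative note (not taken from the paper): a common way to analyze Best-Worst Scaling data is a counting score, (times chosen best - times chosen worst) / times shown, which ranges from -1 to 1. The abstract does not say which scoring the authors used. The trial data and condition labels below are hypothetical.

```python
from collections import Counter

def bws_scores(trials):
    """Counting-based Best-Worst Scaling scores.

    Each trial is (items_shown, best_item, worst_item).
    Score = (best count - worst count) / number of appearances.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, best_item, worst_item in trials:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

# Hypothetical annotator judgments over responses to four speaker conditions.
trials = [
    (("cond_A", "cond_B", "cond_C"), "cond_A", "cond_C"),
    (("cond_A", "cond_B", "cond_D"), "cond_B", "cond_D"),
    (("cond_A", "cond_C", "cond_D"), "cond_A", "cond_C"),
]

scores = bws_scores(trials)
print(scores)
```

Comparing such scores across accent and gender conditions is one way directional disparities could be surfaced.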

2510.06846 2026-06-19 eess.SY cs.SY Updated version

Decentralized CBF-based Safety Filters for Collision Avoidance of Cooperative Missile Systems with Input Constraints

Johannes Autenrieb, Mark Spiller

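AI summary For multi-vehicle interception scenarios, proposes a decentralized safety filter based on robust control barrier functions that achieves collision avoidance through event-triggered activation and slack-variable relaxation, and is computationally efficient and scalable.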

Comments 7 pages, 5 figures, accepted for presentation at the 2026 American Control Conference (ACC 2026)

Abstract

This paper presents a decentralized safety filter for collision avoidance in multi-agent aerospace interception scenarios. The approach leverages robust control barrier functions (RCBFs) to guarantee forward invariance of safe sets under bounded inputs and high-relative-degree dynamics. Each effector executes its nominal cooperative guidance command, while a local quadratic program (QP) modifies the input only when necessary. Event-triggered activation based on range and zero-effort miss (ZEM) criteria ensures scalability by restricting active constraints to relevant neighbors. To ensure feasibility under multiple simultaneously active constraints, a slack-variable relaxation scheme is introduced that prioritizes critical agents in a Pareto-optimal manner. Simulation results in many-on-many interception scenarios demonstrate that the proposed framework maintains collision-free operation with minimal deviation from nominal guidance, providing a computationally efficient and scalable solution for safety-critical multi-agent aerospace systems.
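
Illustrative note (not taken from the paper): the idea of a QP that modifies the input only when necessary can be seen in the simplest case of one affine CBF constraint a . u + b >= 0, where the QP min ||u - u_nom||^2 has a closed-form solution. The sketch below implements that case only; it ignores input bounds, robustness terms, slack variables, and multiple simultaneous constraints, all of which the paper addresses. The names and numbers are assumptions.

```python
def cbf_filter(u_nom, a, b):
    """Closest input to u_nom (Euclidean norm) satisfying a . u + b >= 0.

    For a barrier function h with control-affine dynamics, a would be L_g h(x)
    and b would be L_f h(x) + alpha(h(x)).
    """
    margin = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if margin >= 0.0:
        return list(u_nom)  # nominal command is already safe
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("constraint cannot be satisfied: a is zero and b < 0")
    scale = -margin / norm_sq
    # Project onto the constraint boundary along the direction of a.
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]

print(cbf_filter([3.0, 2.0], [1.0, 0.0], -1.0))  # safe, returned unchanged
print(cbf_filter([0.0, 2.0], [1.0, 0.0], -1.0))  # unsafe, minimally corrected
```

With several active neighbors the constraints stack and a numerical QP solver is needed, which is where the paper's slack-variable relaxation comes in.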

2603.14403 2026-06-19 eess.SY cs.SY Updated version

Robust Safety Filters for Lipschitz-Bounded Adaptive Closed-Loop Systems with Structured Uncertainties

Johannes Autenrieb, Peter A. Fisher, Anuradha Annaswamy

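AI summary To address transient safety in adaptive control systems, proposes a reference-based adaptive safety framework that uses Lipschitz bounds on the tracking error dynamics to derive a robust CBF condition, reformulated as a convex SOCP, reducing conservatism while guaranteeing forward invariance and closed-loop stability.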

Comments 6 pages, 4 figures, accepted for publication in the IEEE Control Systems Letters (L-CSS)

Abstract

Adaptive control provides closed-loop stability and reference tracking for uncertain dynamical systems through online parameter adaptation. These properties alone, however, do not ensure safety in the sense of forward invariance of state constraints, particularly during transient phases of adaptation. Control barrier function (CBF)-based safety filters have been proposed to address this limitation, but existing approaches often rely on conservative constraint tightening or static safety margins within quadratic program formulations. This paper proposes a reference-based adaptive safety framework for systems with structured parametric uncertainty that explicitly accounts for transient plant-reference mismatch. Safety is enforced at the reference level using a barrier-function-based filter, while adaptive control drives the plant to track the safety-certified reference. By exploiting Lipschitz bounds on the closed-loop tracking error dynamics, a tracking-error-dependent robust CBF condition is derived and equivalently reformulated as a convex second-order cone program (SOCP). The proposed safety-filter formulation reduces conservatism relative to fixed-margin CBF formulations by rendering the resulting safety constraints progressively less restrictive as the plant-reference tracking error decreases, while preserving formal guarantees of forward invariance and closed-loop stability.

2603.10791 2026-06-19 eess.IV 版本更新

Semantic Satellite Communications for Synchronized Audiovisual Reconstruction

面向同步视听重建的语义卫星通信

Fangyu Liu, Peiwen Jiang, Wenjin Wang, Xiao Li, Shi Jin

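AI summary: Proposes an adaptive multimodal semantic transmission system that uses a dual-stream generative architecture and a dynamic keyframe update mechanism to achieve high-quality synchronized audiovisual reconstruction in bandwidth-limited satellite scenarios, significantly reducing bandwidth consumption and improving robustness.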

详情
AI中文摘要

卫星通信在支持高保真同步视听服务方面面临严重瓶颈,因为传统方案在信道波动、带宽有限和长传播延迟下难以处理跨模态一致性。为了解决这些问题,本文提出了一种针对卫星场景的自适应多模态语义传输系统,旨在带宽约束下实现高质量同步视听重建。与具有固定模态优先级的静态方案不同,我们的框架采用双流生成架构,可灵活切换视频驱动音频生成和音频驱动视频生成。这使得系统能够动态解耦语义,仅传输最重要的模态,同时利用跨模态生成恢复另一种模态。为了平衡重建质量和传输开销,动态关键帧更新机制根据无线场景和用户需求自适应维护共享知识库。此外,引入基于大语言模型的决策模块以增强系统适应性。通过集成卫星特定知识,该模块联合考虑任务需求和信道因素(如天气引起的衰落),主动调整传输路径和生成工作流。仿真结果表明,所提系统在实现高保真视听同步的同时显著降低带宽消耗,提高了挑战性卫星场景下的传输效率和鲁棒性。

英文摘要

Satellite communications face severe bottlenecks in supporting high-fidelity synchronized audiovisual services, as conventional schemes struggle with cross-modal coherence under fluctuating channel conditions, limited bandwidth, and long propagation delays. To address these limitations, this paper proposes an adaptive multimodal semantic transmission system tailored for satellite scenarios, aiming for high-quality synchronized audiovisual reconstruction under bandwidth constraints. Unlike static schemes with fixed modal priorities, our framework features a dual-stream generative architecture that flexibly switches between video-driven audio generation and audio-driven video generation. This allows the system to dynamically decouple semantics, transmitting only the most important modality while employing cross-modal generation to recover the other. To balance reconstruction quality and transmission overhead, a dynamic keyframe update mechanism adaptively maintains the shared knowledge base according to wireless scenarios and user requirements. Furthermore, a large language model based decision module is introduced to enhance system adaptability. By integrating satellite-specific knowledge, this module jointly considers task requirements and channel factors such as weather-induced fading to proactively adjust transmission paths and generation workflows. Simulation results demonstrate that the proposed system significantly reduces bandwidth consumption while achieving high-fidelity audiovisual synchronization, improving transmission efficiency and robustness in challenging satellite scenarios.

2509.15069 2026-06-19 eess.SP cs.DS cs.NA math.NA 版本更新

Efficient Computation of Time-Index Powered Weighted Sums Using Cascaded Accumulators

使用级联累加器高效计算时间索引加权和

Deijany Rodriguez Linares, Oksana Moryakova, Håkan Johansson

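AI summary: Proposes a method that uses cascaded accumulators to efficiently compute time-index powered weighted sums, reducing the multiplication count from K×N general multiplications to K+1 constant multiplications without storing data blocks; suited to real-time sample-by-sample processing systems.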

Comments This work has been submitted to the IEEE for possible publication

Journal ref IEEE Signal Processing Letters, vol. 33, pp. 893-897, Feb. 2026

详情
AI中文摘要

本文提出了一种新颖的方法,使用级联累加器高效计算形如$\sum_{n=0}^{N-1} n^{K} v[n]$的时间索引加权和。传统的直接计算需要$K{\times}N$次通用乘法,对于大的$N$变得不可行,而基于查找表或信号反转的替代策略需要存储整个数据块。通过利用累加器特性,所提方法消除了此类存储需求,并将乘法成本降低到仅$K{+}1$次常数乘法,实现了高效的实时实现。当需要在逐样本处理系统中高效计算此类和时,该方法特别有用。

英文摘要

This letter presents a novel approach for efficiently computing time-index powered weighted sums of the form $\sum_{n=0}^{N-1} n^{K} v[n]$ using cascaded accumulators. Traditional direct computation requires $K{\times}N$ general multiplications, which become prohibitive for large $N$, while alternative strategies based on lookup tables or signal reversal require storing entire data blocks. By exploiting accumulator properties, the proposed method eliminates the need for such storage and reduces the multiplicative cost to only $K{+}1$ constant multiplications, enabling efficient real-time implementation. The approach is particularly useful when such sums need to be efficiently computed in sample-by-sample processing systems.
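A minimal sketch of the idea in the abstract above, not necessarily the letter's exact construction. It assumes $K{+}1$ cascaded running sums updated once per sample, followed by a final linear combination with constants that depend only on $K$ and $N$. The function names and the closed-form coefficients are illustrative assumptions derived from the standard identity for iterated sums.

```python
from math import comb


def combine_coeffs(K, N):
    """Constants c_0..c_K (assumed form) such that
    sum_k c_k * acc_k == sum_n n**K * v[n] after N samples.

    After N samples, accumulator k holds sum_m C(N-1-m+k, k) * v[m].
    Matching polynomials in m gives
    c_k = sum_{l=0..k} (-1)**l * C(k, l) * (N + l)**K.
    """
    return [
        sum((-1) ** l * comb(k, l) * (N + l) ** K for l in range(k + 1))
        for k in range(K + 1)
    ]


def powered_weighted_sum(samples, K):
    """Sample-by-sample evaluation of sum_n n**K * v[n] (n starts at 0)."""
    acc = [0] * (K + 1)  # K+1 cascaded accumulators, no block storage
    N = 0
    for v in samples:
        carry = v
        for j in range(K + 1):
            acc[j] += carry  # additions only inside the sample loop
            carry = acc[j]   # each stage feeds the next one
        N += 1
    # K+1 constant multiplications, performed once at the end of the block
    return sum(c * a for c, a in zip(combine_coeffs(K, N), acc))


if __name__ == "__main__":
    v = [3, 1, 4, 1, 5]
    print(powered_weighted_sum(v, 2))             # cascaded accumulators
    print(sum(n ** 2 * x for n, x in enumerate(v)))  # direct reference
```

The per-sample loop uses additions only; the multiplications appear solely in the final combination, which is what makes the count independent of $N$ in this sketch.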

2603.04219 2026-06-19 cs.SD cs.AI eess.AS 版本更新

ZeSTA: Zero-Shot TTS Augmentation with Domain-Conditioned Training for Data-Efficient Personalized Speech Synthesis

ZeSTA: 基于领域条件训练的零样本文本转语音增强用于数据高效的个性化语音合成

Youngwon Choi, Jinwoo Oh, Hwayeon Kim, Hyeonyu Kim

Affiliations: Maum AI Inc.; Humelo Inc.

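AI summary: Proposes the ZeSTA framework, which distinguishes real from synthetic speech via a lightweight domain embedding and combines it with real-data oversampling, improving speaker similarity for zero-shot TTS augmentation under extremely low-resource conditions while preserving intelligibility and perceptual quality.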

Comments 6 pages, accepted to INTERSPEECH 2026

详情
AI中文摘要

我们研究了将零样本文本转语音(ZS-TTS)作为低资源个性化语音合成的数据增强源。虽然合成增强可以提供语言丰富且音素多样的语音,但将大量合成语音与有限的真实录音简单混合往往会导致微调过程中说话人相似度下降。为解决这一问题,我们提出了ZeSTA,一个简单的基于领域条件的训练框架,通过轻量领域嵌入区分真实和合成语音,并结合真实数据过采样以在极有限的目标数据下稳定适应,无需修改基础架构。在LibriTTS和一个内部数据集上使用两个ZS-TTS源的实验表明,我们的方法在保持可懂度和感知质量的同时,相比朴素合成增强提高了说话人相似度。音频样本可在我们的网页上获取。

英文摘要

We investigate the use of zero-shot text-to-speech (ZS-TTS) as a data augmentation source for low-resource personalized speech synthesis. While synthetic augmentation can provide linguistically rich and phonetically diverse speech, naively mixing large amounts of synthetic speech with limited real recordings often leads to speaker similarity degradation during fine-tuning. To address this issue, we propose ZeSTA, a simple domain-conditioned training framework that distinguishes real and synthetic speech via a lightweight domain embedding, combined with real-data oversampling to stabilize adaptation under extremely limited target data, without modifying the base architecture. Experiments on LibriTTS and an in-house dataset with two ZS-TTS sources demonstrate that our approach improves speaker similarity over naive synthetic augmentation while preserving intelligibility and perceptual quality. Audio samples are available on our web page.

2510.08275 2026-06-19 eess.SY cs.SY 版本更新

Control Allocation Algorithm for Hypersonic Glide Vehicles with Input Limitations

输入受限的高超声速滑翔飞行器控制分配算法

Johannes Autenrieb, Patrick Gruhn

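AI summary: To handle the strong actuation nonlinearities and physical constraints of hypersonic glide vehicles, the paper proposes an iterative control allocation method that embeds drag-sensitive soft constraints to improve energy efficiency and reduce surface temperatures; its effectiveness is demonstrated on the GHGV-2 model.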

Comments 43 pages, 21 figures, accepted for publication in the AIAA Journal of Guidance, Control, and Dynamics

详情
AI中文摘要

高超声速滑翔飞行器(HGV)在具有执行机构强非线性和严格物理约束的挑战性飞行状态下运行。这些约束包括状态相关的执行器限制、非对称控制边界以及随机动条件变化的热载荷。本文介绍了一种迭代控制分配方法,以实时应对这些挑战。所提出的算法搜索能够实现期望力矩指令的控制输入,同时满足输入幅度和速率的约束。对于细长HGV构型,热载荷和阻力生成密切相关——较低的阻力通常导致表面加热减少。通过嵌入阻力敏感软约束,该方法提高了能量效率并隐含地降低了表面温度,从而降低了飞行器的红外特征。这些特性对于需要低可观测性的远程军事行动尤为有利。该方法利用DLR的通用高超声速滑翔飞行器2(GHGV-2)仿真模型进行了演示。结果证实了该方法在现实约束飞行条件下保持控制权限的有效性。

英文摘要

Hypersonic glide vehicles (HGVs) operate in challenging flight regimes characterized by strong nonlinearities in actuation and stringent physical constraints. These include state-dependent actuator limitations, asymmetric control bounds, and thermal loads that vary with maneuvering conditions. This paper introduces an iterative control allocation method to address these challenges in real time. The proposed algorithm searches for control inputs that achieve the desired moment commands while respecting constraints on input magnitude and rate. For slender HGV configurations, thermal loads and drag generation are strongly correlated: lower drag typically results in reduced surface heating. By embedding drag-sensitive soft constraints, the method improves energy efficiency and implicitly reduces surface temperatures, lowering the vehicle's infrared signature. These features are particularly advantageous for long-range military operations that require low observability. The approach is demonstrated using the DLR's Generic Hypersonic Glide Vehicle 2 (GHGV-2) simulation model. The results confirm the method's effectiveness in maintaining control authority under realistic, constrained flight conditions.

2601.17464 2026-06-19 eess.SY cs.SY 版本更新

Robust Output Regulation of Uncertain Linear Time-Varying Systems

不确定线性时变系统的鲁棒输出调节

Jinmeng Zha, Zhen Zhang

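AI summary: For robust output regulation of linear time-varying systems, the paper proposes a trajectory-matching system immersion framework, reveals the fundamental effect of parametric uncertainty, establishes an exact algebraic boundary termed finite linear parameterization, and designs an approximate robust controller that achieves a bounded tracking error that can be made arbitrarily small.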

详情
AI中文摘要

线性时变系统的鲁棒输出调节几十年来一直是一个开放问题。为了解决这个问题,我们提出了轨迹匹配系统浸入框架,通过将调节方程重新表述为更具洞察力的形式。这一视角表明,找到内模等价于通过构造一个无外力系统来再现给定受迫系统的稳态输出轨迹。这揭示了参数不确定性的根本影响,给出了鲁棒调节的精确代数边界,称为有限线性参数化。由此,我们进一步证明时变系统中的不确定性容易激发无限维函数族,使得有限维调节器无法实现精确鲁棒调节。因此,我们建立了一个全面的近似鲁棒设计,它产生一个可以任意小的有界跟踪误差,并避免显式求解调节方程。此外,当不确定性以某些特定方式影响系统时,它可以确保精确调节。总体而言,这些结果为构建基于内模的设计提供了一个通用的、可执行的框架,并简化了鲁棒控制的实现过程。

英文摘要

Robust output regulation for linear time-varying systems has remained an open problem for decades. By augmenting the classical immersion viewpoint, we propose the trajectory-matching system immersion framework. It reformulates the regulator equation as a forced system, and demonstrates that finding an internal model is equivalent to reproducing the non-decaying output trajectories of this forced system by constructing an unforced one. This perspective yields an exact algebraic boundary for finite-dimensional internal models, termed finite linear parameterization. It further reveals a distinctive obstruction in time-varying systems: even highly structured, finite-dimensional affine parametric uncertainties can generate infinite-dimensional families of non-decaying error-zeroing signals, thereby precluding exact robust regulation via linear finite-dimensional internal models in general. Hence, we develop a comprehensive approximate robust design, which yields a bounded tracking error that can be arbitrarily small, and avoids explicitly solving the regulator equation. Additionally, it recovers exact regulation when the uncertainty influences the system in some specified ways. Overall, these results clarify the intrinsic limitation of exact finite-dimensional robust regulation for uncertain LTV systems, and provide a general, executable framework for constructing an internal model-based design.

2508.02604 2026-06-19 cs.RO cs.SY eess.SY 版本更新

Periodic robust robotic rock chop via virtual model control

基于虚拟模型控制的周期性鲁棒机器人砍切

Yi Zhang, Fumiya Iida, Fulvio Forni

Affiliations: University of Cambridge; University of Tokyo

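AI summary: Proposes a physically structured virtual-model controller that generates robust periodic rock-chop motion through switched virtual mechanisms without pre-planned trajectories, achieving sub-millimeter slice accuracy on several vegetables with a Franka manipulator.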

详情
AI中文摘要

机器人切割是一项具有挑战性的、接触丰富的操作任务,机器人必须同时应对未知的物体力学、大接触力和精确的运动要求。我们的假设是,这种复杂性可以通过设计一个物理结构化的虚拟模型控制器来缓解,该控制器使用切换虚拟机构生成鲁棒的、有节奏的摇切(rock-chop)运动,无需预先规划的轨迹或精确的环境信息。运动是由环境、机器人动力学和切换虚拟机构的虚拟力之间的相互作用产生的,最终通过可用的驱动实现。通过理论分析和实验验证,我们证明了受控的机器人行为会稳定到周期性的运动。使用Franka机械臂进行的实验表明,在五种不同的蔬菜上实现了鲁棒的切割,对于1毫米到6毫米的厚度,以每秒近一次切割的速度实现了亚毫米级的切片精度。尽管刀的形状或砧板的高度发生变化,控制器仍保持高性能,并成功适应了不同的人形机械臂,展示了鲁棒性和平台独立性。

英文摘要

Robotic cutting is a challenging, contact-rich manipulation task where the robot must simultaneously negotiate unknown object mechanics, large contact forces, and precise motion requirements. Our hypothesis is that this complexity can be alleviated through the design of a physically structured virtual-model controller that uses switched virtual mechanisms to generate a robust, rhythmic rock-chop motion for robotic cutting, without requiring pre-planned trajectories or precise environmental information. Motion is generated by the interaction between the environment, the robot's dynamics, and the virtual forces of the switching virtual mechanism, ultimately realized through the available actuation. Through theoretical analysis and experimental validation, we demonstrate that the controlled robot behavior settles into a stable periodic motion. Experiments with a Franka manipulator demonstrate robust cuts across five different vegetables, achieving sub-millimeter slice accuracy for thicknesses from 1 mm to 6 mm at a rate of nearly one cut per second. The controller maintains high performance despite changes in knife shape or cutting board height, and successfully adapts to a different humanoid manipulator, demonstrating robustness and platform independence.

2601.03112 2026-06-19 eess.IV cs.CV 版本更新

DiT-JSCC: Rethinking Deep JSCC with Diffusion Transformers and Semantic Representations

DiT-JSCC:基于扩散变换器与语义表示的深度JSCC再思考

Kailin Tan, Jincheng Dai, Sixian Wang, Guo Lu, Shuo Shao, Kai Niu, Wenjun Zhang, Ping Zhang

Affiliations: Beijing University of Posts and Telecommunications; Shanghai Jiao Tong University; University of Shanghai for Science and Technology

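AI summary: Proposes DiT-JSCC, which jointly learns a semantics-prioritized representation encoder and a diffusion transformer generative decoder; with coarse-to-fine conditional decoding and adaptive bandwidth allocation inspired by Kolmogorov complexity, it improves semantic consistency and transmission efficiency under extreme channel conditions.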

Comments 14 pages, 14 figures, 2 tables

详情
AI中文摘要

生成式联合源信道编码(GJSCC)已成为一种新的深度JSCC范式,用于在极端无线信道条件(如超低带宽和低信噪比)下实现高保真和鲁棒的图像传输。近期研究通常采用扩散模型作为生成解码器,但经常产生视觉上逼真但语义一致性有限的结果。这种局限性源于面向重建的JSCC编码器与生成解码器之间的根本性不匹配,因为前者缺乏显式的语义判别能力,无法提供可靠的条件线索。在本文中,我们提出DiT-JSCC,一种新颖的GJSCC骨干网络,能够联合学习语义优先的表示编码器和基于扩散变换器(DiT)的生成解码器,我们的开源项目旨在促进GJSCC的未来研究。具体来说,我们设计了一个语义-细节双分支编码器,与从粗到细的条件DiT解码器自然对齐,在极端信道条件下优先考虑语义一致性。此外,受Kolmogorov复杂度启发,引入了一种无需训练的自适应带宽分配策略,以进一步提高传输效率,从而真正重新定义生成解码时代的信息价值概念。大量实验表明,DiT-JSCC在语义一致性和视觉质量上始终优于现有JSCC方法,尤其是在极端条件下。

英文摘要

Generative joint source-channel coding (GJSCC) has emerged as a new Deep JSCC paradigm for achieving high-fidelity and robust image transmission under extreme wireless channel conditions, such as ultra-low bandwidth and low signal-to-noise ratio. Recent studies commonly adopt diffusion models as generative decoders, but they frequently produce visually realistic results with limited semantic consistency. This limitation stems from a fundamental mismatch between reconstruction-oriented JSCC encoders and generative decoders, as the former lack explicit semantic discriminability and fail to provide reliable conditional cues. In this paper, we propose DiT-JSCC, a novel GJSCC backbone that can jointly learn a semantics-prioritized representation encoder and a diffusion transformer (DiT) based generative decoder. Our open-source project aims to promote future research in GJSCC. Specifically, we design a semantics-detail dual-branch encoder that aligns naturally with a coarse-to-fine conditional DiT decoder, prioritizing semantic consistency under extreme channel conditions. Moreover, a training-free adaptive bandwidth allocation strategy inspired by Kolmogorov complexity is introduced to further improve the transmission efficiency, thereby redefining the notion of information value in the era of generative decoding. Extensive experiments demonstrate that DiT-JSCC consistently outperforms existing JSCC methods in both semantic consistency and visual quality, particularly in extreme regimes.

2601.00014 2026-06-19 eess.SP cs.AI cs.LG 版本更新

Modeling Day-Long ECG Signals to Predict Heart Failure Risk with Explainable AI

建模全天心电图信号以可解释人工智能预测心力衰竭风险

Eran Zvuloni, Ronit Almog, Michael Glikson, Shany Brimer Biton, Ilan Green, Izhar Laufer, Offer Amir, Joachim A. Behar

Affiliation: Leumit Health Services

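AI summary: Proposes DeepHHF, a deep learning model that predicts five-year heart failure risk from 24-hour single-lead ECG data, reaching an AUC of 0.80 and outperforming a model based on short segments and a clinical score; explainability analysis shows the model focuses on arrhythmias and heart abnormalities.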

详情
AI中文摘要

心力衰竭(HF)影响11.8%的65岁及以上成年人,降低生活质量和寿命。预防HF可降低发病率和死亡率。我们假设将人工智能(AI)应用于24小时单导联心电图(ECG)数据可预测五年内HF风险。为此,使用了Technion-Leumit Holter ECG(TLHE)数据集,包括20年间收集的47,729名患者的69,663条记录。我们的深度学习模型DeepHHF在24小时ECG记录上训练,实现了0.80的受试者工作特征曲线下面积,优于使用30秒片段和临床评分的模型。DeepHHF识别的高风险个体住院或死亡事件概率翻倍。可解释性分析显示DeepHHF关注心律失常和心脏异常。本研究强调了深度学习建模24小时连续ECG数据的可行性,捕捉了对可靠风险预测至关重要的阵发性事件。应用于单导联Holter ECG的人工智能无创、廉价且广泛可及,使其成为HF风险预测的有前景工具。

英文摘要

Heart failure (HF) affects 11.8% of adults aged 65 and older, reducing quality of life and longevity. Preventing HF can reduce morbidity and mortality. We hypothesized that artificial intelligence (AI) applied to 24-hour single-lead electrocardiogram (ECG) data could predict the risk of HF within five years. To test this, we used the Technion-Leumit Holter ECG (TLHE) dataset, which includes 69,663 recordings from 47,729 patients collected over 20 years. Our deep learning model, DeepHHF, trained on 24-hour ECG recordings, achieved an area under the receiver operating characteristic curve of 0.80, outperforming a model using 30-second segments and a clinical score. High-risk individuals identified by DeepHHF had twice the chance of hospitalization or death. Explainability analysis showed that DeepHHF focused on arrhythmias and heart abnormalities. This study highlights the feasibility of using deep learning to model 24-hour continuous ECG data, capturing paroxysmal events essential for reliable risk prediction. Artificial intelligence applied to single-lead Holter ECG is non-invasive, inexpensive, and widely accessible, making it a promising tool for HF risk prediction.

2512.17473 2026-06-19 eess.SP cs.LG math.OC stat.ML 版本更新

Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions

非线性矩阵分解的交替方向乘子法

Atharva Awari, Nicolas Gillis, Arnaud Vandaele

Affiliation: University of Mons

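AI summary: Proposes an algorithm based on the alternating direction method of multipliers (ADMM) for nonlinear matrix decompositions (NMD) that supports multiple nonlinear functions and loss functions, with applicability and efficiency demonstrated on real-world datasets.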

Comments 16 pages, 7 figures. v3: Revised version: added new experiments and comparisons. Code available from https://gitlab.com/Atharva05/admm-for-nmd

详情
AI中文摘要

我们提出了一种基于交替方向乘子法(ADMM)的算法,用于求解非线性矩阵分解(NMD)。给定输入矩阵 $X \in \mathbb{R}^{m \times n}$ 和分解秩 $r \ll \min(m, n)$,NMD 寻求矩阵 $W \in \mathbb{R}^{m \times r}$ 和 $H \in \mathbb{R}^{r \times n}$,使得 $X \approx f(WH)$,其中 $f$ 是逐元素非线性函数。我们在几个代表性非线性模型上评估了我们的方法:适用于非负稀疏数据近似的修正线性单元激活 $f(x) = \max(0, x)$,适用于概率电路表示的逐分量平方 $f(x) = x^2$,以及适用于推荐系统的 MinMax 变换 $f(x) = \min(b, \max(a, x))$。所提出的框架灵活支持多种损失函数,包括最小二乘、$\ell_1$ 范数和 Kullback-Leibler 散度,并且可以轻松扩展到其他非线性和度量。我们在真实世界数据集上展示了该方法的适用性、效率和适应性,突出了其在广泛应用中的潜力。

英文摘要

We present an algorithm based on the alternating direction method of multipliers (ADMM) for solving nonlinear matrix decompositions (NMD). Given an input matrix $X \in \mathbb{R}^{m \times n}$ and a factorization rank $r \ll \min(m, n)$, NMD seeks matrices $W \in \mathbb{R}^{m \times r}$ and $H \in \mathbb{R}^{r \times n}$ such that $X \approx f(WH)$, where $f$ is an element-wise nonlinear function. We evaluate our method on several representative nonlinear models: the rectified linear unit activation $f(x) = \max(0, x)$, suitable for nonnegative sparse data approximation, the component-wise square $f(x) = x^2$, applicable to probabilistic circuit representation, and the MinMax transform $f(x) = \min(b, \max(a, x))$, relevant for recommender systems. The proposed framework flexibly supports diverse loss functions, including least squares, $\ell_1$ norm, and the Kullback-Leibler divergence, and can be readily extended to other nonlinearities and metrics. We illustrate the applicability, efficiency, and adaptability of the approach on real-world datasets, highlighting its potential for a broad range of applications.
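To make the model $X \approx f(WH)$ from the abstract above concrete, here is a small pure-Python sketch that evaluates the three element-wise nonlinearities and the three losses it names. It only evaluates the model and objective; it is not the ADMM solver from the paper. All function names are illustrative, and the generalized Kullback-Leibler form used here is the one common in nonnegative matrix factorization, which is an assumption about the paper's exact definition.

```python
import math


def matmul(W, H):
    """Plain product of an m x r and an r x n matrix (lists of lists)."""
    return [[sum(W[i][k] * H[k][j] for k in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(W))]


def relu(x):
    return max(0.0, x)          # f(x) = max(0, x)


def square(x):
    return x * x                # f(x) = x^2


def make_minmax(a, b):
    return lambda x: min(b, max(a, x))   # f(x) = min(b, max(a, x))


def nmd_approx(W, H, f):
    """Element-wise f applied to the low-rank product WH."""
    return [[f(z) for z in row] for row in matmul(W, H)]


def least_squares(X, Y):
    return sum((x - y) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def l1_loss(X, Y):
    return sum(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def kl_divergence(X, Y):
    """Generalized KL (assumed form): sum x*log(x/y) - x + y.
    Requires y > 0 wherever x > 0; a zero entry of X contributes y."""
    total = 0.0
    for rx, ry in zip(X, Y):
        for x, y in zip(rx, ry):
            total += (x * math.log(x / y) if x > 0 else 0.0) - x + y
    return total


if __name__ == "__main__":
    W = [[1.0], [-1.0]]          # m = 2, r = 1
    H = [[2.0, -3.0]]            # r = 1, n = 2
    X = [[2.0, 0.0], [0.0, 3.0]]
    print(nmd_approx(W, H, relu))                 # matches X exactly
    print(least_squares(X, nmd_approx(W, H, relu)))
```

In this toy case a rank-1 product reproduces a sparse nonnegative matrix of rank 2 once the ReLU is applied, which illustrates why the nonlinearity can help for sparse nonnegative data.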

2511.14280 2026-06-19 eess.SY cs.SY math.OC 版本更新

A graph-informed regret metric for optimal distributed control

面向最优分布式控制的图信息遗憾度量

Daniele Martinelli, Andrea Martin, Giancarlo Ferrari-Trecate, Luca Furieri

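AI summary: Introduces spatial regret, a metric measuring the worst-case performance gap between a distributed controller and an oracle controller with access to additional sensor information, and designs distributed controllers based on it via a convex formulation with a finite-dimensional approximation; in power-system simulations the resulting controllers effectively mitigate localized disturbances.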

详情
AI中文摘要

我们考虑使用分布式控制器对大规模系统进行最优控制,这些控制器的网络拓扑与子系统之间的耦合图相匹配。在这项工作中,我们引入了空间遗憾,这是一种基于图的度量,用于衡量分布式控制器与能够访问额外传感器信息的先知控制器之间的最坏情况性能差距。先知的图是信息图的用户指定扩展,产生一个基准策略,该策略惩罚那些额外传感会改善性能的扰动。最小化空间遗憾可以产生尊重名义信息图的分布式控制器,这些控制器模仿先知对大规模网络特征扰动(如局部扰动)的响应。我们证明,最小化空间遗憾可以转化为一个具有有限维近似的无限规划。为了扩展到大型网络,我们推导了空间遗憾的上界,该上界可以以分布式方式高效最小化。在电力系统模型上的数值实验表明,与基于经典度量的控制器相比,所得控制器能更有效地抑制局部扰动。

英文摘要

We consider the optimal control of large-scale systems using distributed controllers whose network topology mirrors the coupling graph between subsystems. In this work, we introduce spatial regret, a graph-informed metric measuring the worst-case performance gap between a distributed controller and an oracle with access to additional sensor information. The oracle's graph is a user-specified augmentation of the information graph, yielding a benchmark policy that penalizes disturbances for which additional sensing would improve performance. Minimizing spatial regret yields distributed controllers that respect the nominal information graph and emulate the oracle's response to disturbances characteristic of large-scale networks, such as localized perturbations. We show that minimizing spatial regret admits a convex reformulation as an infinite program with a finite-dimensional approximation. To scale to large networks, we derive an upper bound on the spatial regret that can be efficiently minimized in a distributed way. Numerical experiments on power-system models show that the resulting controllers mitigate localized disturbances more effectively than those based on classical metrics.
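One way to write down the metric described in the abstract above, as a hedged sketch: the symbols $J$, $\mathcal{K}(\cdot)$, $\mathcal{G}$, $\mathcal{G}^{+}$ and the unit-norm disturbance set are notation assumed here for illustration and may differ from the paper's definitions.

```latex
% Assumed notation:
%   J(K, w)      closed-loop cost of controller K under disturbance w
%   K(G)         controllers that respect the information graph G
%   G^+ \supseteq G   user-specified augmented graph available to the oracle
\[
  \mathrm{SpReg}(K)
  \;=\;
  \max_{\|w\|_2 \le 1}
  \Bigl( J(K, w) - J(K^{\mathrm{o}}, w) \Bigr),
  \qquad
  K \in \mathcal{K}(\mathcal{G}),\quad
  K^{\mathrm{o}} \in \mathcal{K}(\mathcal{G}^{+}),
\]
\[
  K^{\star} \;\in\; \arg\min_{K \in \mathcal{K}(\mathcal{G})} \mathrm{SpReg}(K).
\]
```

Under this reading, a disturbance $w$ contributes to the regret only to the extent that the extra sensing in $\mathcal{G}^{+}$ would have lowered the cost, which matches the abstract's description of the benchmark policy.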

2510.21546 2026-06-19 eess.SY cs.SY 版本更新

Auction-Based Responsibility Allocation for Scalable Decentralized Safety Filters in Cooperative Multi-Agent Collision Avoidance

基于拍卖的责任分配用于可扩展的去中心化安全滤波器在多智能体协同避碰中

Johannes Autenrieb, Mark Spiller

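AI summary: Proposes a scalable decentralized safety filter based on high-order control barrier functions and auction-based responsibility allocation, which assigns constraints asymmetrically to reduce computational load and achieve cooperative multi-agent collision avoidance.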

Comments 6 pages, 3 figures, accepted for presentation at the IFAC World Congress 2026

详情
AI中文摘要

本文提出了一种基于高阶控制屏障函数(HOCBFs)和拍卖式责任分配的可扩展去中心化多智能体系统安全滤波器。虽然去中心化HOCBF公式在输入约束下保证了成对安全性,但随着智能体数量增加,它们面临可行性和可扩展性挑战。每个智能体必须评估越来越多的成对约束,增加了不可行的风险,并难以满足实时要求。为了解决这个问题,我们引入了一种基于拍卖的分配方案,该方案基于局部控制努力估计,在邻居之间非对称地分配约束执行。由此产生的有向责任图保证了完全的安全覆盖,同时减少了冗余约束和每个智能体的计算负荷。仿真结果证实了在各种网络规模和交互密度下的安全高效协调。

英文摘要

This paper proposes a scalable decentralized safety filter for multi-agent systems based on high-order control barrier functions (HOCBFs) and auction-based responsibility allocation. While decentralized HOCBF formulations ensure pairwise safety under input bounds, they face feasibility and scalability challenges as the number of agents grows. Each agent must evaluate an increasing number of pairwise constraints, raising the risk of infeasibility and making it difficult to meet real-time requirements. To address this, we introduce an auction-based allocation scheme that distributes constraint enforcement asymmetrically among neighbors based on local control effort estimates. The resulting directed responsibility graph guarantees full safety coverage while reducing redundant constraints and per-agent computational load. Simulation results confirm safe and efficient coordination across a range of network sizes and interaction densities.
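The allocation idea in the abstract above can be illustrated with a toy sketch. It assumes, purely for illustration, that each agent in a neighboring pair submits a scalar control-effort estimate as its bid and that the lower bid takes responsibility for that pairwise constraint. The bidding rule, the tie-break, and all names are hypothetical and not taken from the paper.

```python
def allocate_responsibility(edges, effort):
    """Assign each pairwise constraint to exactly one of its two agents.

    edges  : iterable of (i, j) neighbor pairs
    effort : dict mapping (agent, other) -> estimated control effort the
             agent would need to enforce the constraint against `other`
    Returns a dict agent -> list of neighbors it is responsible for,
    i.e. a directed responsibility graph.
    """
    responsible = {}
    for i, j in edges:
        # Assumed rule: the agent with the lower effort estimate wins;
        # ties go to the first agent of the pair.
        winner, other = (i, j) if effort[(i, j)] <= effort[(j, i)] else (j, i)
        responsible.setdefault(winner, []).append(other)
    return responsible


def covers_all_pairs(edges, responsible):
    """Check that every pair is enforced by exactly one of its agents."""
    for i, j in edges:
        count = (j in responsible.get(i, [])) + (i in responsible.get(j, []))
        if count != 1:
            return False
    return True


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (0, 2)]
    effort = {(0, 1): 1.0, (1, 0): 2.0,
              (1, 2): 0.5, (2, 1): 0.7,
              (0, 2): 3.0, (2, 0): 1.5}
    graph = allocate_responsibility(edges, effort)
    print(graph)
    print(covers_all_pairs(edges, graph))
```

In this example each agent ends up enforcing one constraint instead of two, while every pair is still covered by exactly one agent.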

2510.00831 2026-06-19 cs.AI cs.LG eess.SP 版本更新

Controlled Comparison of Machine Learning Models for Fault Classification and Localization in Power System Protection

电力系统保护中故障分类与定位的机器学习模型受控比较

Julian Oelhaf, Georg Kordowich, Changhun Kim, Paula Andrea Pérez-Toro, Christian Bergler, Andreas Maier, Johann Jäger, Siming Bayer

Affiliation: Department of Electrical Engineering, Media and Computer Science, Ostbayerische Technische Hochschule Amberg-Weiden

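AI summary: Compares machine learning models for fault classification and localization on a common electromagnetic transient dataset with decision windows of 10-50 ms, finding that classification reaches F1 > 0.98 at 10 ms and that localization error stabilizes at about 10% of line length.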

Comments Accepted at IEEE PES Innovative Smart Grid Technologies Europe 2026 (ISGT Europe 2026). Pre-camera-ready author version; final proceedings version may differ

详情
AI中文摘要

现代电力系统因逆变器基和分布式能源的集成而日益复杂,挑战了传统保护方案的可靠性,并推动了机器学习在保护任务中的应用。然而,由于不同研究中的数据集、传感假设和决策时域各异,已发表的结果往往难以比较。本文在相同的传感、时序和验证条件下,基于公共电磁暂态数据集,使用10-50ms的决策窗口以反映保护相关时间尺度,对故障分类(FC)和故障定位(FL)的机器学习模型进行了受控比较。对于FC,性能最佳的非线性模型在10ms时F1分数已超过0.98,而低容量模型在较短时域下性能下降,但随窗口延长而改善,表明相关故障类型信息在最早暂态中已存在。对于FL,顶级模型在所有评估时域下达到约10%归一化线路长度的稳定定位误差,而较弱模型形成明显分离的第二性能层级。线路解析分析显示,定位精度随电网段变化,表明存在拓扑依赖的难度而非仅时间上下文不足。这些发现为比较两个信息需求根本不同的保护任务中的机器学习模型提供了受控参考。

英文摘要

The increasing complexity of modern power systems, driven by the integration of inverter-based and distributed energy resources, challenges the reliability of conventional protection schemes and motivates the use of machine learning for protection tasks. However, published results are often difficult to compare because datasets, sensing assumptions, and decision horizons vary across studies. This paper presents a controlled comparison of machine learning models for fault classification (FC) and fault localization (FL) under identical sensing, timing, and validation conditions on a common electromagnetic transient dataset, using decision windows of 10-50 ms to reflect protection-relevant time scales. For FC, the best-performing nonlinear models achieve F1 scores above 0.98 already at 10 ms, while lower-capacity models degrade at shorter horizons but improve with longer windows, indicating that relevant fault-type information is already present in the earliest transient. For FL, the top-performing models reach a stable localization error of about 10 % of normalized line length across all evaluated horizons, while weaker models form a clearly separated second performance tier. Line-resolved analysis shows that localization accuracy varies across grid segments, indicating topology-dependent difficulty rather than insufficient temporal context alone. These findings provide a controlled reference for comparing machine learning models across two protection tasks with fundamentally different information requirements.

2507.14952 2026-06-19 eess.SY cs.SY 版本更新

An approach to the LQG/LTR design problem with specifications for finite-dimensional SISO control systems

有限维SISO控制系统LQG/LTR设计问题的规格化方法

Mahyar Mahinzaeim, Kamyar Mehran

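AI summary: Presents an LQG/LTR design approach based on weighting augmentation that incorporates low- and high-frequency design specifications into the LTR framework, shaping closed-loop performance and robustness through sensitivity functions, illustrated with a torque-control example for a geared DC motor.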

Comments typos corrected; references added; additional computational details added

详情
AI中文摘要

这是一篇说明性论文,讨论了有限维单变量(单输入/单输出,SISO)控制系统的线性二次高斯/回路传递恢复(LQG/LTR)设计问题的一种方法。该方法基于利用加权增广,将设计规格纳入LTR技术框架中用于LQG补偿器设计。LQG补偿器需同时满足给定的分析性低频和高频设计规格,这些规格以期望的灵敏度和控制器噪声灵敏度函数表示。本文面向非专业人士,特别是有限维LQG理论中的实践者,他们关注在实际情况下为SISO控制系统的闭环性能和鲁棒性塑造设计反馈补偿器。通过一个详细的设计实例——带弹性安装输出轴的齿轮直流电机的扭矩控制——说明了所提出的方法。

英文摘要

This is an expository paper which discusses an approach to the linear quadratic Gaussian/loop transfer recovery (LQG/LTR) design problem for finite-dimensional single-variable (single-input/single-output, SISO) control systems. The approach is based on the utilisation of weighting augmentation for incorporating design specifications into the framework of the LTR technique for LQG compensator design. The LQG compensator is to simultaneously meet given analytical low- and high-frequency design specifications expressed in terms of desirable sensitivity and controller noise sensitivity functions. The paper is aimed at non-specialists and, in particular, practitioners in finite-dimensional LQG theory interested in the design of feedback compensators for closed-loop performance and robustness shaping of SISO control systems in realistic situations. The proposed approach is illustrated by a detailed design example: the torque control of a geared DC motor with an elastically mounted output shaft.

2509.03488 2026-06-19 eess.SP 版本更新

Efficient DoA Estimation for Linear and Rectangular Arrays with Hybrid Architectures Using Compact DFT Codebooks

基于紧凑DFT码本的线性和矩形阵列混合架构高效DoA估计

Miguel Rivas-Costa, Carlos Mosquera

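AI summary: To address the loss of degrees of freedom in the spatial covariance matrix caused by dimensionality compression in hybrid architectures, the paper proposes a generalized least squares framework that exploits the Cauchy-like displacement structure arising after DFT beamforming, efficiently recovering the covariance matrix for linear arrays with complexity $\mathcal{O}(N_{\text{RF}}^2 N_x)$; the estimator approaches the CRB and outperforms existing methods.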

详情
AI中文摘要

混合模拟数字(HAD)架构显著降低了硬件开销,但引入了严重的维度压缩,这剥夺了空间协方差矩阵(SCM)进行高分辨率波达方向(DoA)估计所需的自由度。离散傅里叶变换(DFT)模拟波束成形的无源巴特勒矩阵实现避免了有源移相器和放大器,进一步加剧了这一挑战。在本文中,我们提出了一个广义最小二乘(GLS)框架,该框架利用了DFT波束成形后出现的柯西型位移结构。通过利用这种结构,我们开发了一种高效的数值技术来恢复均匀线性阵列的SCM,复杂度为$\mathcal{O}(N_{\text{RF}}^2 N_x)$,其中$N_x$是天线数量,$N_{\text{RF}}$是射频链数量。仿真表明,我们的估计器逼近克拉美-罗界(CRB),同时优于最先进的方法。

英文摘要

Hybrid Analog and Digital (HAD) architectures significantly reduce hardware overhead but introduce severe dimensionality compression, which strips the Spatial Covariance Matrix (SCM) of the degrees of freedom required for high-resolution Direction-of-Arrival (DoA) estimation. This challenge is further compounded by passive Butler-matrix implementations of Discrete Fourier Transform (DFT) analog beamforming, which avoid active phase shifters and amplifiers. In this paper, we propose a Generalized Least Squares (GLS) framework that exploits the Cauchy-like displacement structure that arises after DFT beamforming. By leveraging this structure, we develop a highly efficient numerical technique to recover the SCM for uniform linear arrays with a complexity of $\mathcal{O}(N_{\text{RF}}^2 N_x)$, where $N_x$ is the number of antennas and $N_{\text{RF}}$ the number of RF-chains. Simulations demonstrate that our estimator approaches the Cramér-Rao Bound (CRB) while outperforming state-of-the-art methods.

2508.01819 2026-06-19 eess.IV 版本更新

Decoding the Alzheimer's Continuum: Interpretable Multi-Gate Routing for Diagnosis and Transition Prediction

解码阿尔茨海默病连续谱:可解释的多门路由用于诊断与转换预测

Yufeng Jiang, Hexiao Ding, Hongzhao Chen, Jing Lan, Xinzhi Teng, Gerald W. Y. Cheng, Yunlin Mao, Zongxi Li, Haoran Xie, Jung Sun Yoo, Jing Cai

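AI summary: Proposes M$^3$AD, a unified framework that uses an interpretable multi-gate mixture-of-experts architecture to perform three-class diagnosis and stage-transition prediction from T1-weighted sMRI, reaching 95.13% accuracy.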

Comments Accepted by MICCAI2026

详情
AI中文摘要

阿尔茨海默病(AD)表现为从正常认知(NC)经轻度认知障碍(MCI)到痴呆的连续进展。然而,大多数深度学习方法将此连续谱简化为不连续的分类任务,很大程度上忽略了动态阶段转换。为了解码这一复杂进展,我们提出M$^3$AD,一个统一框架,仅使用T1加权sMRI联合处理三分类诊断和诊断阶段转换预测。M$^3$AD利用可解释的多门专家混合架构,采用专门的路由机制动态捕获诊断特定的病理模式和跨连续谱的共享结构特征。它进一步通过自适应注意力融合整合临床先验(年龄、性别、eTIV)以增强泛化能力。M$^3$AD在原始实验设置下达到95.13%的准确率(MCLNC报告为90.44%),转换预测准确率为94.87%。关键的是,分析多门路由揭示了区分稳定性和进展性MCI的独特专家激活特征,为个体水平的进展风险分层提供了机制基础。代码见:https://github.com/csyfjiang/M3AD

英文摘要

Alzheimer's disease (AD) manifests as a continuous progression from normal cognition (NC) through mild cognitive impairment (MCI) to dementia. However, most deep learning approaches reduce this continuum to disjointed classification tasks, largely ignoring dynamic stage transitions. To decode this complex progression, we propose M$^3$AD, a unified framework that jointly addresses three-class diagnosis classification and diagnosis stage transition prediction using only T1-weighted sMRI. M$^3$AD leverages an interpretable multi-gate mixture of experts architecture, employing specialized routing mechanisms to dynamically capture both diagnosis-specific pathological patterns and shared structural features across the continuum. It further integrates clinical priors (age, sex, eTIV) via adaptive attention fusion to enhance generalization. M$^3$AD achieves 95.13% accuracy, compared to 90.44% reported by MCLNC under its original experimental setting, and 94.87% for transition prediction. Crucially, analyzing the multi-gate routing reveals distinct expert activation signatures distinguishing stable from progressive MCI, providing a mechanistic basis for individual-level progression risk stratification. Code is available at https://github.com/csyfjiang/M3AD.