arXivDaily arXiv daily academic digest, updated Monday through Friday
EESS Electrical Engineering and Systems Science 110
2606.20163 2026-06-19 eess.SY cs.SY new submission

Techno-Economic Analysis of Shared Mobile Storage for Demand Charge Reduction

B Hari Kiran Reddy, Ge Chen, Junjie Qin

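AI summary This paper proposes a high-fidelity fleet management framework that uses a mixed-integer linear programming model and a heuristic algorithm to assess the techno-economic viability of shared electric vehicles for demand charge reduction under practical logistical and operational constraints.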

Comments 22 pages, 26 figures, journal

Abstract

This paper investigates the techno-economic viability of shared electric vehicle (EV) fleets for demand charge reduction under practical logistical and operational constraints. Unlike idealized models that overlook transit overheads, we propose a high-fidelity fleet management framework that explicitly accounts for the spatio-temporal coupling of energy consumption, labor costs for EV drivers, and battery degradation. We formulate the dispatch problem as a mixed-integer linear program (MILP) that jointly minimizes demand charges and total cost of ownership. To address the computational complexity arising from path-dependent constraints, we develop a marginal-value-based heuristic algorithm that achieves near-optimal performance with high computational efficiency. Using real-world data from San Francisco, our analysis reveals that a modest number of EVs can achieve significant demand charge savings, sufficient to recover the ownership and operational expenses. Our results also show how tariff structures, fleet size, and cost components influence overall profitability.
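
The demand-charge mechanism the abstract relies on can be sketched in a few lines. A demand charge bills a customer on the highest interval demand in the billing period, so energy discharged from a parked EV during the peak lowers the bill. The sketch below is illustrative only and is not the paper's MILP or heuristic: the function names, the load profile, the 20 kWh of usable energy, and the 20 $/kW rate are all assumptions, and it ignores transit energy, labor, degradation, and battery power limits.

```python
def demand_charge(load_kw, rate_per_kw):
    """Demand charge: tariff rate times the highest interval demand (kW)."""
    return rate_per_kw * max(load_kw)


def shave_peak(load_kw, energy_kwh, interval_h=0.25):
    """Clip the load at the lowest cap that `energy_kwh` of storage can hold.

    Hypothetical helper: bisects on the cap, where the energy needed to hold
    a cap is the sum of (load - cap) over intervals above it, times interval_h.
    """
    lo, hi = 0.0, max(load_kw)  # hi is always feasible (needs zero energy)
    for _ in range(60):
        cap = (lo + hi) / 2
        needed = sum(max(0.0, p - cap) for p in load_kw) * interval_h
        if needed > energy_kwh:
            lo = cap  # cap too ambitious for the available energy
        else:
            hi = cap  # feasible, try a lower cap
    return [min(p, hi) for p in load_kw]


# Made-up 15-minute load profile (kW) and tariff, for illustration only.
load = [100, 120, 200, 180, 110]
shaved = shave_peak(load, energy_kwh=20.0)
saving = demand_charge(load, 20.0) - demand_charge(shaved, 20.0)
```

With these assumed numbers the cap settles at 150 kW, so the billed peak falls from 200 kW to 150 kW. Whether such a saving covers ownership and operating costs is the question the paper's full model addresses.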

2606.20127 2026-06-19 eess.SY cs.SY new submission

Contraction-based Neural Control for Cooperative Aerial Payload Transportation with Variable-length Cables

Yi Lok Lo, Longhao Qian, Hugh H. T. Liu

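AI summary Proposes a neural nonlinear control framework for multi-drone slung payload systems: the dynamics are decoupled, a neural contraction metric controller and a feedback controller are jointly trained for payload trajectory tracking, and variable-length cables are used for obstacle avoidance.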

Comments Submitted for publication in AIAA Scitech 2027

Abstract

This paper presents a novel neural nonlinear control framework for a multi-drone slung payload system with variable-length cables and a rigid-body payload. The equations of motion are formulated into a decoupled structure, where the payload and cable length dynamics are governed by independent control channels, facilitating modularized controller design on reduced-order subsystems. A neural control contraction metric (CCM) controller and a neural feedback controller are jointly trained to enforce contraction conditions for the payload subsystem. Separately, a cable length control law is derived that exploits the variable-length degree of freedom for obstacle avoidance. Numerical simulations demonstrate trajectory tracking of a rigid-body payload and gate traversal capabilities of the overall system under the proposed control framework.

2606.20112 2026-06-19 cs.CV eess.IV new submission

Pixel-Level Residual Diffusion Transformer: Scalable 3D CT Volume Generation

Zhenkai Zhang, Markus Hiller, Krista A. Ehinger, Tom Drummond

Affiliations * School of Computing and Information Systems, The University of Melbourne

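AI summary Proposes the Pixel-Level Residual Diffusion Transformer (PRDiT), which achieves high-fidelity 3D CT volume generation through two-stage training (a local MLP-based blind estimator that separates low-frequency structures, plus a global residual diffusion transformer that models high-frequency residuals) and outperforms existing methods on the LIDC-IDRI and RAD-ChestCT datasets.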

Comments Accepted at ICLR 2026. Code available at https://github.com/Fredy-Zhang/PRDiT

Abstract

Generating high-resolution 3D CT volumes with fine details remains challenging due to substantial computational demands and optimization difficulties inherent to existing generative models. In this paper, we propose the Pixel-Level Residual Diffusion Transformer (PRDiT), a scalable generative framework that synthesizes high-quality 3D medical volumes directly at the voxel level. PRDiT introduces a two-stage training architecture comprising 1) a local denoiser in the form of an MLP-based blind estimator operating on overlapping 3D patches to separate low-frequency structures efficiently, and 2) a global residual diffusion transformer employing memory-efficient attention to model and refine high-frequency residuals across entire volumes. This coarse-to-fine modeling strategy simplifies optimization, enhances training stability, and effectively preserves subtle structures without the limitations of an autoencoder bottleneck. Extensive experiments conducted on the LIDC-IDRI and RAD-ChestCT datasets demonstrate that PRDiT consistently outperforms state-of-the-art models, such as HA-GAN, 3D LDM, and WDM-3D, achieving significantly lower 3D FID, MMD, and Wasserstein distance scores.

2606.19987 2026-06-19 cs.SD eess.AS new submission

PolSeT: Polish Semantics of Timbre Dataset

Jan Jasiński

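AI summary Introduces the PolSeT dataset, which collects Polish semantic descriptors and timbre ratings through free-verbalization and semantic differential experiments, filling a gap in timbre research data and supporting cross-cultural psychoacoustics and MIR research.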

Comments 8 pages, 7 figures. Data descriptor for the PolSeT dataset (Polish Semantics of Timbre), available at https://doi.org/10.5281/zenodo.17830609 under CC BY 4.0

Abstract

This data report introduces PolSeT (Polish Semantic Timbre), a dataset designed to facilitate research in psychoacoustics and Music Information Retrieval (MIR) in Polish and cross-cultural contexts. The dataset contains data from two sequential experiments. Experiment 1 (N=60) was a free-verbalization task aimed at creating a lexicon of Polish semantic descriptors. Using 11 stimuli, a total of 1901 descriptors (701 unique) were gathered. Experiment 2 (N=105) utilized this lexicon to conduct a semantic differential study, where participants rated 18 instrument sounds on 8 bipolar scales, with repeated trials for reliability analysis. The released dataset includes raw listener responses, comprehensive demographics (experience, gender, age), audio stimuli, and extracted acoustic features with Python extraction code. This dataset addresses a gap in open timbre research data, providing both the qualitative linguistic groundwork and the quantitative ratings necessary for psychoacoustic research and the training of multilingual semantic embedding models.

2606.19910 2026-06-19 cs.CL cs.SD eess.AS new submission

Light-weight Pronunciation Assessment via Discrete Speech Token Surprisal

Syeda Faiza Ahmed Sara, Shammur Absar Chowdhury

Affiliations * Qatar Computing Research Institute, Doha, Qatar

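AI summary Proposes a lightweight pronunciation assessment framework trained only on native speech resources; it computes surprisal from discretized speech tokens with a language model, combines it with text-guided alignment features, and approaches the performance of supervised methods with no supervision or light calibration.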

Comments Accepted to Interspeech 2026

Abstract

Training automated pronunciation assessment often relies on labeled learner errors or non-native corpora that are costly to collect. We propose a lightweight framework trained only on native speech resources, operating unsupervised or lightly calibrated with a small set of scored utterances. At inference, learner speech is discretized with an SSL encoder and a K-means codebook. A token language model trained on native sequences computes surprisal, where higher surprisal indicates phonotactic deviation. We add a transcript-guided Text2DUnit-DTW module that predicts native token sequences from reference text and aligns them to acoustic tokens to derive error-sensitive features. Surprisal and alignment features are fused via simple regression. On SpeechOcean762, PCC improves from 0.60 to 0.66 with transcript guidance, near supervised baselines. Cross-dataset evaluation on L2-ARCTIC shows consistent gains.
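
The surprisal idea can be illustrated with a toy token language model. The sketch below is a minimal example under stated assumptions, not the paper's model: it uses a bigram model with add-one smoothing (the abstract does not say which token language model is used), integer IDs stand in for the K-means units, and the function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab_size):
    """Count contexts and bigrams over native token sequences."""
    context, bigram = Counter(), Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            context[prev] += 1
            bigram[(prev, cur)] += 1
    return context, bigram, vocab_size


def mean_surprisal(seq, model):
    """Average -log2 P(cur | prev) in bits, with add-one smoothing.

    Expects a sequence of at least two tokens.
    """
    context, bigram, vocab_size = model
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p = (bigram[(prev, cur)] + 1) / (context[prev] + vocab_size)
        total += -math.log2(p)
    return total / (len(seq) - 1)


# Toy "native" data: the unit order 0 -> 1 -> 2 -> 3 repeats.
model = train_bigram([[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]], vocab_size=4)
native_like = mean_surprisal([0, 1, 2, 3], model)
deviant = mean_surprisal([3, 2, 1, 0], model)
```

On this toy data the reversed sequence receives the higher mean surprisal, which is the signal the framework reads as phonotactic deviation.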

2606.19711 2026-06-19 cs.RO cs.LG cs.SY eess.SY new submission

A Differentiable Composite Approximation Framework for Autonomous Underwater Vehicle Maneuvering Modeling from Sea-Trial Data

Aobo Wang, Aifei Xia, Zihao Wang, Lizhu Hao

Affiliations * College of Shipbuilding Engineering, Harbin Engineering University; China Academy of Aerospace Aerodynamics; Institute of Artificial Intelligence, Shanghai University; China Ship Scientific Research Center

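AI summary Proposes a differentiable composite approximation framework that jointly calibrates a polynomial basis and a data-adaptive basis and adds turning-motion-based ocean-current estimation and compensation, improving AUV maneuvering prediction accuracy.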

Abstract

Field-based modeling from onboard measurements can produce autonomous underwater vehicle (AUV) maneuvering models that reflect real operating characteristics. From an approximation perspective, conventional maneuvering models use predefined constraint polynomial bases, whereas data-driven models use data-adaptive bases. Motivated by this basis-function view, this paper presents a differentiable composite-approximation formulation, in which the polynomial-basis component and the data-adaptive basis component are treated as differentiable parts of a single predictor and calibrated jointly. A gradient-based co-calibration method is developed for full-scale AUV maneuvering prediction, where a sensitivity-aware mechanism regulates bounded polynomial updates while the neural residual captures remaining nonlinear discrepancies under a shared prediction objective. To account for ocean-current effects in field data, a turning-motion-based current estimation and compensation procedure is incorporated to construct current-compensated learning targets for training and rollout. The framework is evaluated using sea-trial data collected from a 7-meter AUV under multiple maneuvering conditions. Results show that the proposed method improves recursive trajectory and velocity prediction compared with polynomial-only, neural-only, and frozen-prior hybrid baselines, demonstrating its applicability to field-data-based AUV maneuvering modeling.

2606.19699 2026-06-19 cs.RO cs.LG cs.SY eess.SY new submission

Comparative Study on Agility, Efficiency, and Impact Absorption of Bipedal Robots with Active Toes

Joong-Gil Kim, Wontae Ye, Geunwoo Cho, Seong-Ho Yun, Se-Hyoung Cho, Yong-Jae Kim

Affiliations * School of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education; Artificial Intelligence and Robotics Institute, Korea Institute of Science and Technology; Robot Innovation Hub, WIRobotics Inc.

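AI summary Proposes a 14-DOF biped robot that emulates the lightweight, high-torque, robust nature of human toes. Using a high-fidelity simulation training environment to compare configurations with and without active toes, it finds that at 1.33 m/s walking the toe-equipped robot reduces CoT by 17.5% and heel-strike GRF by 5.0%, and reduces average and maximum path deviation by 25.0% and 34.0%, respectively.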

Comments 6 pages, 7 figures

Abstract

Human legs exhibit high efficiency, agility, and impact absorption, with toes playing a crucial role in these capabilities. While many attempts have been made to implement human-like toes in robots, they have not fully replicated human characteristics nor rigorously validated their benefits. We propose a 14-DOF biped robot emulating human toes' lightweight, high-torque, robust nature. To quantitatively analyze the effectiveness of the active toes in terms of agility, efficiency, and impact absorption, we developed a high-fidelity simulation training environment that reflects actual actuators with coupled transmissions and accurate power consumption. To ensure a fair comparison between configurations with and without active toes, we designed a minimal RL reward function and applied an identical training procedure to both. The simulation results indicate that, at 1.33 m/s walking, the toe-equipped robot reduced CoT by 17.5% and heel-strike GRF by 5.0% compared with the toe-ablation configuration. On the agility test, average and maximum path deviation decreased by 25.0% and 34.0%, respectively.
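
The abstract reports CoT (cost of transport) without defining it. A common dimensionless definition is power divided by weight times speed, CoT = P / (m g v). The sketch below assumes that definition, which may differ from the paper's; the 30 kg mass and the two power figures are hypothetical values chosen only to reproduce a 17.5% reduction at the reported 1.33 m/s.

```python
G = 9.81  # gravitational acceleration, m/s^2


def cost_of_transport(power_w, mass_kg, speed_mps):
    """Dimensionless cost of transport: P / (m * g * v)."""
    return power_w / (mass_kg * G * speed_mps)


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical mass and power draws; only the speed comes from the abstract.
cot_without_toes = cost_of_transport(400.0, 30.0, 1.33)
cot_with_toes = cost_of_transport(330.0, 30.0, 1.33)
reduction = relative_reduction(cot_without_toes, cot_with_toes)
```

Because mass and speed are the same in both configurations, under this definition a CoT reduction equals the reduction in average power.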

2606.19688 2026-06-19 cs.SD eess.AS new submission

Latency-Configurable Streaming Speech Enhancement via Asymmetric Temporal Padding

Yunsik Kim, Yoonyoung Chung

Affiliations * Department of Electrical Engineering, Pohang University of Science and Technology (POSTECH); Intus Co. Ltd.

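AI summary Proposes LaCo-SENet, which uses asymmetric temporal padding and a dual-buffer streaming mechanism to trade latency against quality with a single hyperparameter; on VoiceBank+DEMAND, a 1.37M-parameter model covers latencies of 12.5-75.0 ms with PESQ from 3.35 to 3.43.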

Comments 5 pages, 3 figures. Accepted for presentation at Interspeech 2026

Abstract

Streaming speech enhancement requires balancing algorithmic latency against quality, yet existing approaches largely treat this as a binary causal versus non-causal choice. LaCo-SENet addresses this issue with two mechanisms parameterized by a single training-time hyperparameter. First, asymmetric temporal padding redistributes past and future context in convolutions, enabling systematic latency configuration. Second, dual-buffer streaming combines state buffers for past context with lookahead buffers that supply future context at both the input and feature levels. Selective state updates also prevent future-frame leakage into the streaming state, ensuring training-inference consistency. On VoiceBank+DEMAND, a fixed-budget (1.37M parameters) backbone yields a family of models spanning 12.5-75.0 ms, with PESQ rising from 3.35 to 3.43. At just 12.5 ms (fully causal), a PESQ of 3.35 matches or exceeds the prior causal state-of-the-art (3.27 at 46.5 ms).
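
Asymmetric temporal padding can be illustrated on a single 1D convolution. A kernel of size k needs k - 1 padded frames in total; placing `future` of them on the right and the rest on the left lets each output frame see exactly `future` frames of lookahead. The sketch below is a generic illustration, not LaCo-SENet's implementation: the function names are hypothetical, and how the paper maps its hyperparameter to per-layer padding is not stated in the abstract.

```python
def conv1d_asymmetric(x, kernel, future):
    """Same-length 1D cross-correlation with `future` frames of lookahead.

    Output frame t depends on inputs t - past .. t + future,
    where past = len(kernel) - 1 - future.
    """
    k = len(kernel)
    if not 0 <= future <= k - 1:
        raise ValueError("future must be between 0 and len(kernel) - 1")
    past = k - 1 - future
    padded = [0.0] * past + list(x) + [0.0] * future
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(x))
    ]


def total_lookahead_frames(futures):
    """Lookahead of stacked stride-1, undilated layers adds up."""
    return sum(futures)


causal = conv1d_asymmetric([1, 2, 3, 4], [1, 1, 1], future=0)
one_ahead = conv1d_asymmetric([1, 2, 3, 4], [1, 1, 1], future=1)
```

With `future=0` the layer is fully causal: changing a later input frame leaves earlier outputs untouched. Raising `future` trades latency for future context, which is the dial the abstract describes.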

2606.19683 2026-06-19 cs.AI cs.MA cs.SY eess.SY new submission

Exit-and-Join Dynamics for Decentralized Coalition Formation

Quanyan Zhu

Affiliations * New York University Tandon School of Engineering; Department of Electrical and Computer Engineering

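AI summary Studies decentralized coalition formation dynamics driven by unilateral exit-and-join decisions, computes individual payoffs with the Aumann-Dreze value, links cooperative payoff allocation to noncooperative best responses, and analyzes equilibrium characterizations and the effect of costs on local stability.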

Abstract

This paper studies coalition formation as a decentralized dynamical process driven by unilateral exit-and-join decisions. Agents evaluate local moves using the Aumann-Dreze value, so payoffs are computed within the agent's current coalition rather than through a globally negotiated coalition structure. The resulting model links cooperative payoff allocation with noncooperative best-response behavior: a terminal partition is precisely a coalition structure with no admissible, individually profitable exit-and-join deviation. We establish equilibrium characterizations, identify conditions under which the dynamics admit scalar Lyapunov or exact-potential representations, and analyze how switching and acceptance costs shape local stability. Numerical experiments test finite-time stabilization, cost sensitivity, and a special convex-game benchmark.
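
The exit-and-join process can be sketched for small games. The Aumann-Dreze value gives each agent its Shapley value in the game restricted to its own coalition, and an agent moves when leaving for another coalition (or going alone) raises that payoff by more than a switching cost. The sketch below is a minimal illustration under simplifying assumptions, not the paper's model: it omits acceptance decisions by the receiving coalition, enumerates permutations (so it only suits tiny coalitions), and all function names are hypothetical.

```python
from itertools import permutations


def shapley(coalition, v):
    """Shapley value of each member in the game v restricted to `coalition`."""
    members = sorted(coalition)
    phi = {i: 0.0 for i in members}
    orders = list(permutations(members))
    for order in orders:
        seen = frozenset()
        for i in order:
            phi[i] += v(seen | {i}) - v(seen)  # marginal contribution
            seen = seen | {i}
    return {i: phi[i] / len(orders) for i in members}


def payoff(i, partition, v):
    """Aumann-Dreze payoff: Shapley value inside the agent's own coalition."""
    block = next(b for b in partition if i in b)
    return shapley(block, v)[i]


def exit_and_join(partition, v, switch_cost=0.0, max_rounds=100):
    """Apply profitable unilateral moves until none remains or rounds run out."""
    partition = [frozenset(b) for b in partition]
    agents = sorted(i for b in partition for i in b)
    for _ in range(max_rounds):
        moved = False
        for i in agents:
            home = next(b for b in partition if i in b)
            rest = [b for b in partition if b != home]
            left_behind = [home - {i}] if len(home) > 1 else []
            best, best_gain = None, 0.0
            # Targets: every other coalition, plus going alone (empty set).
            for target in rest + [frozenset()]:
                if not target and len(home) == 1:
                    continue  # already alone
                others = [b for b in rest if b != target]
                candidate = others + left_behind + [target | {i}]
                gain = (payoff(i, candidate, v)
                        - payoff(i, partition, v) - switch_cost)
                if gain > best_gain + 1e-12:
                    best, best_gain = candidate, gain
            if best is not None:
                partition, moved = best, True
        if not moved:
            break  # terminal partition: no profitable deviation is left
    return partition


# Toy convex game: a coalition's worth is the square of its size.
final = exit_and_join([{0}, {1}, {2}], lambda s: len(s) ** 2)
frozen = exit_and_join([{0}, {1}, {2}], lambda s: len(s) ** 2, switch_cost=5.0)
```

In this toy convex game the singletons merge into the grand coalition, while a switching cost of 5 exceeds every available gain and leaves the initial partition terminal. General games may cycle, which is why the loop is capped by `max_rounds`.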

2606.19630 2026-06-19 cs.AI cs.DL cs.SY eess.SY new submission

AI4SE and SE4AI Exploration: A Decade Looking Back and Forward

H. Sinan Bank, Daniel R. Herber, Thomas Bradley

Affiliations * Colorado State University

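AI summary Reviews progress in AI and systems engineering across three phases, identifies five critical research gaps through a human-AI agreement literature review, and offers guidance on AI adoption, assurance, and workforce transformation.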

Comments 10 pages, 5 figures

Abstract

The March 2020 INCOSE INSIGHT special issue on AI and Systems Engineering (SE) became the most downloaded issue in the publication's history and launched a research community that now draws over 250 registrants to its annual workshop. In this article, we trace the progress in AI and SE across three phases (labeled here foundational, applied, and LLM inflection) based on the authors' reading of the field's core papers, and describe our opinions of where the community has converged and where critical gaps remain. Separately, a human-AI agreement literature review leveraging both human expertise and six AI models was performed to assess the relevance of 1,712 INCOSE INSIGHT articles and 889 SERC publications. The results identify five critical research gaps and offer guidance for practitioners navigating AI adoption, assurance, and workforce transformation in SE. We share the agreement data and the AI4SE/SE4AI Explorer web application so readers can compare their own relevance judgments with the human and AI raters.

2606.19599 2026-06-19 eess.SY cs.SY econ.EM new submission

Ramping Procurement and Bid-Cost Recovery in Real-Time Market

Cong Chen, Valentina Norambuena, Lang Tong

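AI summary Studies ramping procurement co-optimized with economic dispatch under net-demand uncertainty, analyzes single-interval and multi-interval co-optimization designs, proposes analytical frameworks for evaluating generator profits, consumer payments, bid cost recovery, and operational efficiency, and compares three pricing mechanisms.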

Comments 4 figures

Abstract

We study ramping procurement co-optimized with economic dispatch under net-demand uncertainty. We examine two flexible ramp product designs implemented by grid operators: single-interval and multi-interval co-optimization. Both rely on rolling-window stochastic optimization with binding and advisory interval decisions. We develop analytical frameworks to evaluate generator profits, consumer payments, bid cost recovery (BCR), and operational efficiency. In particular, net-demand uncertainty may lead to generator under-compensation, requiring discriminatory BCR. While operational efficiency is invariant to energy and ramp prices, producer profits and consumer payments depend critically on pricing. We examine locational marginal pricing (LMP) and two uniform pricing schemes: maximum dispatch cost pricing (MDCP) and maximum temporal locational marginal pricing (MTLMP). With out-of-market BCR, LMP yields discriminatory energy prices, whereas MDCP eliminates BCR and MTLMP does so in most cases. This property enables us to establish truthful bidding incentives for price-taking generators under MDCP. Our analysis highlights trade-offs between single- and multi-interval co-optimization and pricing designs: single-interval energy-ramp co-optimization is advantageous under high forecast uncertainty and moderate ramping requirements, whereas multi-interval co-optimization is superior when net-demand forecasts are relatively accurate and ramp needs are challenging. Empirical results on CAISO and ERCOT data show that MDCP and MTLMP increase producer profits with negligible BCR, albeit at the expense of higher consumer payments relative to LMP.
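
Bid cost recovery can be read as a make-whole payment. The sketch below assumes the simplest possible rule, in which the out-of-market payment equals any shortfall of market revenue against the generator's bid cost over the settlement window. Actual CAISO and ERCOT settlement rules are more detailed, and the prices, quantities, and the 28 $/MWh bid here are made up for illustration.

```python
def bid_cost_recovery(dispatch_mwh, prices, bid_cost):
    """Make-whole payment: max(0, total bid cost - total market revenue).

    `bid_cost` is a hypothetical callable mapping energy (MWh) to cost ($).
    """
    revenue = sum(p * q for p, q in zip(prices, dispatch_mwh))
    cost = sum(bid_cost(q) for q in dispatch_mwh)
    return max(0.0, cost - revenue)


# Two intervals: prices below the 28 $/MWh bid in the first, above in the second.
uplift = bid_cost_recovery([50, 80], [20, 30], lambda q: 28 * q)
no_uplift = bid_cost_recovery([50, 80], [30, 30], lambda q: 28 * q)
```

In the first case revenue is 3400 against a bid cost of 3640, so the generator is owed 240; in the second, revenue covers cost and no payment is due. A pricing rule that always covers bid cost, as the abstract says of MDCP, would make this payment zero.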

2606.19590 2026-06-19 cs.RO cs.SY eess.SY new submission

Safe, Real-Time Active Model Discrimination and Fault Diagnosis for Nonlinear Systems via Differentiable Reachability

Xinpei Ni, Melkior Ornik, Glen Chou, Samuel Coogan

Affiliations * Institute of Robotics and Intelligent Machines (IRIM), Georgia Institute of Technology; Department of Aerospace Engineering, University of Illinois Urbana-Champaign

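AI summary For uncertain nonlinear systems, proposes a real-time active fault diagnosis algorithm based on a differentiable reachability approximation; it optimizes control inputs to separate the output sets, achieving fast model discrimination while guaranteeing safety.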

Abstract

We present a safe, real-time algorithm for active fault diagnosis and model discrimination for uncertain continuous-time nonlinear systems with process and measurement disturbances. Given a finite set of candidate models representing nominal and faulty modes, including actuator and sensor faults, we formulate an output-feedback, time-varying policy optimization problem that (i) robustly enforces state-input safety constraints over a finite horizon and (ii) drives the system to produce sampled measurements consistent with at most one model, enabling deterministic diagnosis. To solve this problem in real time, we develop a tractable approximation using interval over-approximations of reachable state and output sets, and encode diagnosability via a differentiable objective that penalizes overlap between the reachable output sets of possible models. The resulting optimization is solved efficiently online with gradient-based methods using JAX and differentiable reachability primitives. We evaluate our method on sensor and actuator fault diagnosis (up to 11 fault modes) in several high-dimensional nonlinear robotic systems, including a simulated quadrotor and fighter-jet model, a hardware differential-drive robot, and quadrupedal navigation. Across these case studies, our approach achieves reliable model discrimination in under 50 ms, outperforming baselines in discrimination success rate and speed while providing formal safety guarantees.
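
The overlap penalty on interval over-approximations can be sketched with axis-aligned boxes. Two boxes are disjoint as soon as they fail to overlap in one dimension, so the intersection volume is zero exactly when a measurement can separate the two models. The sketch below is one plausible penalty of this kind, written in plain Python rather than JAX; the paper's actual objective is not given in the abstract, and the function names are hypothetical.

```python
from itertools import combinations


def overlap_volume(box_a, box_b):
    """Volume of the intersection of two axis-aligned boxes.

    Each box is a list of (low, high) pairs, one per output dimension.
    """
    volume = 1.0
    for (lo_a, hi_a), (lo_b, hi_b) in zip(box_a, box_b):
        volume *= max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    return volume


def separation_penalty(boxes):
    """Sum of pairwise overlaps across all candidate models' output sets."""
    return sum(overlap_volume(a, b) for a, b in combinations(boxes, 2))


nominal = [(0.0, 2.0), (0.0, 2.0)]
fault_a = [(1.0, 3.0), (1.0, 3.0)]   # overlaps the nominal box
fault_b = [(4.0, 5.0), (0.0, 2.0)]   # separated along the first dimension
penalty = separation_penalty([nominal, fault_a, fault_b])
```

Here only the nominal and first fault boxes intersect, contributing a unit of overlap. Driving the penalty to zero by choice of input is the sense in which a control policy makes the models distinguishable.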

2606.19566 2026-06-19 eess.SY cs.AI cs.SY new submission

GDGU: A Gradient Difference-based Graph Unlearning Method for Cyberattack Localization in Electric Vehicle Charging Networks

Nanhong Liu, Mucun Sun, Jie Zhang

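AI summary To handle data deletion requests from electric vehicle charging stations, proposes gradient difference-based graph unlearning (GDGU), which unlearns efficiently through a first-order parameter correction and substantially reduces computational overhead while preserving localization performance.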

Abstract

Electric vehicle charging stations (EVCSs) can expose distribution feeders to cyberattacks. While machine learning methods, including graph neural networks, can localize which bus is compromised, significant challenges remain in data sharing and model training. For example, privacy regulations grant EVCS owners the right to delete their training data from a deployed model, yet retraining from scratch on every request is computationally prohibitive. To address this, we study graph unlearning (GU) for EVCS cyberattack localization, formulated as a feature-level unlearning problem on a graph-level multi-label classification task. Specifically, we propose gradient difference-based graph unlearning (GDGU), which removes the influence of the requested deletion data through a first-order parameter correction. The correction is computed from the gradient difference between the original training data and a modified dataset in which only the charging power features at the requested EVCS buses are unlearned. Then, a batch-normalization recalibration and a brief recovery fine-tuning step are applied to restore localization utility. We benchmark GDGU against two second-order GU baselines on the IEEE 34-bus, 123-bus, and 8500-node distribution networks across three graph neural network backbones and cumulative unlearning scenarios. GDGU matches the strongest baseline on localization utility and reaches forgetting fidelity close to full-retraining, while unlearning 10 to 12 times faster than retraining from scratch and using far less memory than the second-order GU baselines.

2606.19561 2026-06-19 cs.RO cs.SY eess.SY new submission

pdSTL: Probabilistic Differentiable Signal Temporal Logic for Stochastic Systems

Bennett Dogbey, Hemanth Manjunatha

Affiliations * Oklahoma State University

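AI summary Proposes the pdSTL framework, which combines probabilistic semantics with differentiable robustness, achieves linear-time differentiable monitoring through interval-valued probabilistic semantics and an LSTM-style unfolding, and outperforms deterministic differentiable STL in obstacle avoidance, lane-change, and real quadcopter flight experiments.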

Abstract

Autonomous robots operating in uncertain environments must satisfy complex temporal and safety specifications despite stochastic dynamics and sensing noise. While Signal Temporal Logic (STL) offers robustness measures for gradient-based optimization, existing extensions either lack differentiability or ignore belief-space uncertainty. We introduce pdSTL (probabilistic differentiable Signal Temporal Logic), a framework that unifies probabilistic semantics with differentiable robustness over belief trajectories. pdSTL employs interval-valued probabilistic semantics to compute conservative satisfaction bounds, propagated compositionally through the STL syntax tree. We formulate the temporal robustness evaluation as a recurrent, LSTM-style unfolding of STL operators, enabling linear-time, differentiable monitoring suitable for end-to-end trajectory optimization. We validate pdSTL on simulated obstacle avoidance, lane-change maneuvers, and real-world Crazyflie quadcopter flight experiments under aerodynamic disturbances. Results demonstrate that pdSTL achieves efficient optimization with formal probabilistic guarantees, significantly outperforming deterministic differentiable STL in maintaining safety margins under real-world uncertainty.
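
The recurrent, linear-time evaluation of a temporal operator can be sketched for the deterministic case. The robustness of "always phi" at time t is the minimum of phi's robustness from t onward, which a single backward pass computes by carrying one state value, much like a recurrent cell. The sketch below covers only this deterministic operator with an optional smooth minimum; pdSTL's interval-valued probabilistic semantics are not reproduced here, and the function names and the sharpness parameter k are assumptions.

```python
import math


def soft_min(a, b, k=20.0):
    """Smooth lower bound on min(a, b); approaches min as k grows."""
    m = min(a, b)
    return m - math.log(math.exp(-k * (a - m)) + math.exp(-k * (b - m))) / k


def always(robustness, k=None):
    """Robustness of 'always phi' at every time step, in one backward pass.

    `robustness` holds phi's robustness per step; k=None uses the exact min,
    otherwise the smooth surrogate is used so gradients reach every step.
    """
    out, state = [], math.inf
    for r in reversed(robustness):
        state = min(r, state) if k is None else soft_min(r, state, k)
        out.append(state)
    return out[::-1]


# Predicate "x > 1" on a short signal: robustness is x - 1 at each step.
margins = [x - 1.0 for x in [3.0, 2.0, 4.0]]
exact = always(margins)
smooth = always(margins, k=20.0)
```

The exact pass returns the running minimum from each step to the end of the signal, and the smooth variant stays at or just below it. "Eventually" follows the same pattern with max in place of min.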

2606.19520 2026-06-19 eess.SY cs.SY new submission

ev-flow: A Reproducible, NHTS-Grounded Generator of Synthetic Plug-in Electric Vehicle Charging Behavior for Eight U.S. Regions

Bertrand Travacca

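AI summary Presents ev-flow, an open-source Python package that generates synthetic plug-in electric vehicle charging behavior for eight U.S. regions from 2017 National Household Travel Survey data through a nine-stage pipeline, filling the gap for a U.S.-focused, NHTS-driven charging behavior generator.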

Comments 20 pages

Abstract

Electric-vehicle grid-integration studies need large, behaviorally realistic populations of individual charging profiles, but real charging telemetry is scarce and privacy-restricted, and the existing open generators are calibrated to non-U.S. mobility surveys or flatten the regional, seasonal, and equipment heterogeneity that drives aggregate demand. We present ev-flow (import name pev_synth), an open-source, MIT-licensed Python package that generates synthetic plug-in electric vehicle charging behavior for eight U.S. regions, grounded in 2017 National Household Travel Survey (NHTS) microdata and regional sales-mix models. A deterministic nine-stage pipeline (M1-M9) carries each vehicle from survey records to a time-stamped charging profile: it stitches survey person-days into donor-matched 365-day travel calendars with a temperature-dependent winter energy uplift, samples behavioral plug-in start times from the published SPEECh K=16 Gaussian-mixture parameterization, evaluates a three-layer Bernoulli plug-in model, propagates a continuous-time state-of-charge ledger with an explicit PHEV gasoline range-extension term, and rasterizes plug status to 15-minute and hourly grids. The package generates residential and workplace profile types with descriptive EVSE brand and connector enrichment; every output is UTC-stored, timezone-aware, and bit-reproducible from a single master seed. A validation runner compares the generated distributions against published bounds and classifies every divergence with literature provenance: the reference bay_area residential profile rolls up to 11 PASS, 0 unexplained FAIL, 6 explained failures, and 4 explained skips across 21 applicable checks. ev-flow fills a U.S.-focused, NHTS-grounded niche complementary to European generators such as emobpy and VencoPy and to charging simulators such as datafev and ACN-Sim.

2606.19512 2026-06-19 cs.RO cs.SY eess.SY new submission

Proprioceptive Invariant State Estimation for Humanoid Robots on Non-Inertial Ground

Falak Mandali, Zijian He, Yan Gu

Affiliations * Purdue University

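AI summary Proposes a proprioception-only InEKF method that uses foot-mounted IMUs and kinematic constraints for real-time state estimation of humanoid robots on non-inertial ground, with a 96% faster convergence rate and an 80% lower position error.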

Abstract

This paper presents an invariant extended Kalman filtering (InEKF) approach for real-time state estimation of humanoid robots operating on non-inertial ground using only onboard proprioceptive sensing. The proposed approach estimates the robot's base position and velocity relative to the moving ground frame without requiring direct measurements of ground motion or externally mounted sensors. By exploiting kinematic constraints at the stance foot through foot-mounted IMUs, the filter accounts for ground-induced nonlinearities in the process and measurement models while remaining fully proprioceptive. The estimator is formulated to admit a right-invariant measurement model, enabling favorable error dynamics under large initial uncertainties. Observability analysis establishes conditions under which the robot's relative base position and velocity are observable with respect to the non-inertial ground frame. Experiments with the Digit humanoid robot standing and squatting atop a swaying and pitching ground showcase a 96% speedup in convergence rate and an 80% reduction in position estimate errors over existing InEKFs. Walking experiments on a uni-axially rotating ground achieve an average estimation error of less than 9 cm for an initial error of up to 1 m.

2606.19504 2026-06-19 cs.RO cs.SY eess.SY 新提交

Simulating Robotic Locomotion in Sand: Resistive Force Theory in an Open-Source Physics Engine

模拟沙地中的机器人运动:开源物理引擎中的阻力理论

Ryan Walker Brown, Laura K. Treers, Kathryn A. Daltorio

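Affiliations * Case Western Reserve University; University of Vermont

AI summary Integrates 3D granular Resistive Force Theory (3D RFT) into the MuJoCo physics engine to simulate walking on sand, verifies that trends due to foot shape, speed, and loading are preserved, and predicts walking distance and sinkage in hexapod experiments to within 20%.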

Comments 12 pages, 7 figures

详情
AI中文摘要

阻力理论(RFT)的最新进展使得无需模拟单个颗粒相互作用即可近似沙地运动中的地面反作用力,从而降低了计算成本。然而,这些工具在常用于机器人仿真的3D物理引擎中尚不可用。我们探讨了将阻力近似与标准动力学计算相结合,是否能为自由行走的机器人提供稳定的支撑。为此,我们在物理仿真引擎MuJoCo中实现了三维颗粒阻力理论(3D RFT)。我们在多个场景中验证了仿真,证明了由于末端执行器形状、速度和负载引起的关键趋势得以保留。我们的实现预测了12自由度六足机器人在沙地中的行走距离和足部下沉,误差在实验值的20%以内。尽管RFT存在固有近似,但本文描述的开源工具有望帮助开发新的和改进的机器人设计,以穿越颗粒介质基底。

英文摘要

Recent advancements in Resistive Force Theory (RFT) enable approximation of ground reaction forces for locomotion in sand without the computational expense of modeling interactions with individual grains. However, these tools have been absent in 3D physics engines commonly used for robot simulation. We explore whether resistive force approximations are sufficient, when integrated with standard dynamics calculations, to provide a stable substrate for a freely walking robot. To determine this, we implement 3D Granular Resistive Force Theory (3D RFT) in a physics simulation engine, MuJoCo. We verify simulations in multiple scenarios to demonstrate that key trends due to end effector shape, speed, and loading are preserved. Our implementation predicts walking distance and foot sinkage of a 12-degree-of-freedom hexapod robot within 20\% of experiments in sand. While RFT has inherent approximations, the open-source tool described here has the potential to help develop new and improved robot designs to traverse granular media substrates.

2606.19398 2026-06-19 cs.SD eess.AS eess.SP 新提交

S-JEPA: Soft Clustering Anchors for Self-Supervised Speech Representation Learning

S-JEPA:用于自监督语音表示学习的软聚类锚点

Georgios Ioannides, Adrian Kieback, Judah Goldfeder, Linsey Pang, Aman Chadha, Aaron Elkins, Yann LeCun, Ravid Shwartz-Ziv

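Affiliations * Carnegie Mellon University; New York University; James Silberrad Brown Center for AI; Columbia University; Northeastern University; Stanford University; Amazon GenAI

AI summary Proposes S-JEPA, which trains an encoder-predictor pair to match the soft posteriors of a Gaussian mixture model via KL divergence, with no offline re-clustering or teacher distillation; under the SUPERB protocol it attains the lowest WER among evaluated self-supervised methods below 90M parameters and establishes a new Pareto frontier.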

详情
AI中文摘要

自监督语音编码器主要通过预测掩蔽位置处的离散硬聚类ID进行训练,这种方法会坍缩类别边界处的声学模糊性,并需要在迭代之间中断训练以对整个语料库进行重聚类。我们提出S-JEPA,一种JEPA风格的编码器-预测器对,通过KL散度训练以匹配掩蔽位置处高斯混合模型的软后验概率。训练作为连续优化轨迹分两个阶段进行:首先在MFCC特征上使用固定GMM,然后在编码器特征上使用在线GMM,输入层从无标签信号中自适应选择,从而消除了离线重聚类步骤以及手动选择聚类所在Transformer层的问题。在SUPERB协议下,S-JEPA在评估的低于90M参数的自监督方法中实现了最低的词错误率(WER),并在大约一半参数量的情况下在情感识别任务上与HuBERT-Base相当,无需离线重聚类或教师蒸馏即建立了新的帕累托前沿。对预测器在保留语音上的每帧熵的分析揭示了双峰分布,其中相当一部分帧的熵接近完美两聚类平局的熵,这直接经验性地证明了软目标训练目标保留了硬目标会坍缩的声学模糊性。代码可在以下网址获取:https://github.com/gioannides/s-jepa。

英文摘要

Self-supervised speech encoders are predominantly trained by predicting discrete hard cluster IDs at masked positions, a recipe that collapses acoustic ambiguity at category boundaries and requires interrupting training to re-cluster the entire corpus between iterations. We introduce S-JEPA, a JEPA-style encoder-predictor pair trained to match the soft posteriors of a Gaussian Mixture Model at masked positions via KL divergence. Training runs as one continuous optimization trajectory in two phases: a fixed GMM over MFCC features, then an online GMM over encoder features, with the input layer selected adaptively from a label-free signal, removing both the offline re-cluster step and the hand-tuned choice of which transformer layer to cluster on. Under the SUPERB protocol, S-JEPA achieves the lowest WER among evaluated SSL methods below 90M parameters and matches HuBERT-Base on emotion recognition at roughly half its parameter count, establishing a new Pareto frontier without offline re-clustering or teacher distillation. An analysis of the predictor's per-frame entropy on held-out speech reveals a bimodal distribution with a substantial minority of frames near the entropy of a perfect two-cluster tie, providing direct empirical evidence that the soft-target objective preserves the acoustic ambiguity that hard targets would collapse. Code is available at https://github.com/gioannides/s-jepa.
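
The soft-target idea in this abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation: it assumes a hypothetical two-component, one-dimensional Gaussian mixture (real features would be MFCC or encoder vectors), computes the soft posteriors for a frame lying exactly between the two clusters, and evaluates the KL divergence a predictor would incur against that target. All function names and parameter values are illustrative assumptions.

```python
import math

def gmm_posteriors(x, means, variances, weights):
    """Soft posteriors p(k | x) of a 1-D Gaussian mixture at feature value x."""
    log_terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for m, v, w in zip(means, variances, weights)
    ]
    top = max(log_terms)  # subtract the max before exponentiating (log-sum-exp)
    exps = [math.exp(t - top) for t in log_terms]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted): the soft-target loss for one masked frame."""
    return sum(t * math.log(t / max(q, eps))
               for t, q in zip(target, predicted) if t > 0)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Hypothetical frame halfway between two equally weighted clusters.
tie = gmm_posteriors(0.0, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
hard = [1.0, 0.0]  # what a hard cluster ID would assert for the same frame

loss_soft = kl_divergence(tie, [0.5, 0.5])             # predictor keeps the ambiguity
loss_overconfident = kl_divergence(tie, [0.99, 0.01])  # predictor collapses it
```

Here the tied frame has entropy ln 2, the value the abstract refers to as a perfect two-cluster tie, while the one-hot target has entropy zero. A predictor that collapses the tie is penalized under the KL objective, whereas one that reproduces it is not.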

2606.19366 2026-06-19 cs.LG cs.AI eess.SP 新提交

Information Lattice Learning as Probabilistic Graphical Model Structure Learning

信息格学习作为概率图模型结构学习

Haizi Yu, Lav R. Varshney

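Affiliations * Kocree, Inc.; AI Innovation Institute, Stony Brook University

AI summary Interprets information lattice learning (ILL) as structure learning for probabilistic graphical models: interpretable rules are learned by projecting onto a partition lattice, and connections to maximum entropy and factor graphs are established.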

详情
AI中文摘要

信息格学习(ILL)通过将信号交替投影到编码抽象层次结构的分区格上,并将选定的规则提升回信号域,来学习信号的可解释规则。当信号是概率质量函数时,我们证明ILL学习的概率规则具有自然的概率图模型(PGM)解释,并详细发展了这一解释。ILL中的分区诱导出一个确定性的商变量,规则是该商变量的边际分布。因此,规则集是可解释抽象上的边际约束集合。一般提升是满足这些约束的所有联合分布的可行族,而特殊提升则选择最大无知重建,在ILL中通过L2均匀性原理实现,该原理与最大熵密切相关。在香农熵提升下,相同的约束产生一个对数线性因子图,其因子由学习的抽象索引。然而,信息格本身不是贝叶斯网络:其边编码抽象的细化与粗化,而非条件依赖。因此,ILL最好被视为商变量上可解释的基于约束的因子图的结构学习。这一观点阐明了ILL如何与图模型和最大熵模型相关,同时为推理、可识别性和混合符号-概率学习提出了新方向。

英文摘要

Information lattice learning (ILL) learns interpretable rules of a signal by alternately projecting the signal onto a partition lattice that encodes a hierarchy of abstractions and lifting selected rules back to the signal domain. When the signal is a probability mass function, we show the probabilistic rules learned by ILL admit a natural probabilistic graphical model (PGM) interpretation and develop this interpretation in detail. A partition in ILL induces a deterministic quotient variable, and a rule is the marginal law of that quotient variable. A rule set is therefore a collection of marginal constraints over interpretable abstractions. General lifting is the feasible family of all joint distributions satisfying those constraints, while special lifting chooses a maximum-ignorance reconstruction, implemented in ILL by an L2 uniformity principle closely related to maximum entropy. Under a Shannon-entropy lifting, the same constraints yield a log-linear factor graph whose factors are indexed by learned abstractions. The information lattice itself, however, is not a Bayesian network: its edges encode refinement and coarsening of abstractions, not conditional dependence. Thus ILL is best viewed as structure learning for interpretable constraint-based factor graphs over quotient variables. This view clarifies how ILL relates to graphical models and maximum entropy models, while suggesting new directions for inference, identifiability, and hybrid symbolic-probabilistic learning.
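
A small worked example may help make the quotient-variable reading concrete. The sketch below is an illustration under stated assumptions, not code from the paper: it takes a hypothetical pmf over four outcomes and a single parity partition, computes the rule as the marginal law of the quotient variable, and then performs a maximum-ignorance lift by spreading each cell's mass evenly over its members. For a single partition constraint, this uniform-within-cell reconstruction is both the maximum-entropy and the minimum-L2-norm solution; with several overlapping partitions the two can differ. The names `rule` and `uniform_lift` are made up for this example.

```python
from collections import defaultdict

def rule(pmf, partition):
    """Marginal law of the quotient variable: total mass in each partition cell."""
    marginal = defaultdict(float)
    for outcome, mass in pmf.items():
        marginal[partition[outcome]] += mass
    return dict(marginal)

def uniform_lift(marginal, partition):
    """Maximum-ignorance lift for one partition: equal mass within each cell."""
    sizes = defaultdict(int)
    for cell in partition.values():
        sizes[cell] += 1
    return {x: marginal[cell] / sizes[cell] for x, cell in partition.items()}

# Hypothetical signal (a pmf) and abstraction (parity of the outcome).
pmf = {1: 0.1, 2: 0.3, 3: 0.2, 4: 0.4}
parity = {1: "odd", 2: "even", 3: "odd", 4: "even"}

learned_rule = rule(pmf, parity)             # mass of "odd" and "even"
lifted = uniform_lift(learned_rule, parity)  # a joint law consistent with the rule
```

The lifted distribution satisfies the marginal constraint exactly but forgets everything the parity abstraction does not encode, which is the sense in which the rule set acts as a collection of marginal constraints.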

2606.20443 2026-06-19 eess.SY cs.LG cs.SY math.AT 新提交

Topological Data Analysis for High-Dimensional Dynamic Process Monitoring

高维动态过程监测的拓扑数据分析

Angan Mukherjee, Tyler A. Soderstrom, Michael J. Kurtz, Victor M. Zavala

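AI summary Proposes a method combining topological data analysis and machine learning that represents multivariate time series as manifolds, summarizes their structure with topological descriptors, and learns the dynamic evolution of that topological structure with a neural ordinary differential equation for effective event detection.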

详情
AI中文摘要

实时过程监测需要从高维时间序列数据中提取可操作信息的方法。在这项工作中,我们提出了一种新的过程监测方法,结合了拓扑数据分析(TDA)和机器学习工具。在所提出的方法中,我们将多变量时间序列数据表示为流形,并使用拓扑描述符来总结此类数据的结构;然后,我们使用神经常微分方程来学习系统拓扑结构的动态演化。使用来自工业过程的真实数据,我们表明这种基于轨迹的事件检测方法能有效检测多种类型的事件。我们将该方法与基于重构的方法(如主成分分析和自编码器)以及使用Koopman自编码器的基于轨迹的方法进行了对比。

英文摘要

Real-time process monitoring requires methods that extract actionable information from high-dimensional time-series data. In this work, we present a new approach for process monitoring that combines tools of topological data analysis (TDA) and machine learning. In the proposed approach, we represent multivariate time-series data as manifolds and use topological descriptors to summarize the structure of such data; we then use a neural ordinary differential equation to learn the dynamic evolution of the topological structure of the system. Using real data from an industrial process, we show that this trajectory-based event detection approach is effective at detecting diverse types of events. We contrast this approach against reconstruction-based approaches such as principal component analysis and autoencoders and against a trajectory-based approach that uses Koopman autoencoders.

2606.20413 2026-06-19 eess.SP cs.IT math.IT 新提交

Hybrid TRP-UE Sensing for Enhanced Target Localization

混合TRP-UE感知用于增强目标定位

Necati Kagan Erkek, Marco Di Renzo, Arman Shojaeifard, Yasser Mestrah, Remun Koirala, Mohammad Heggo, Kunjan Shah

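AI summary Proposes a hybrid TRP-UE sensing mechanism in which UE-assisted sensing improves network-based sensing performance, markedly improving target localization accuracy in challenging propagation environments such as indoor factory.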

Comments 6 pages

详情
AI中文摘要

集成感知与通信(ISAC)指的是网络在提供通信服务的同时,能够以可扩展的方式感知环境的能力。ISAC的关键功能之一是对无源和移动感知目标的精确定位。本文介绍了一种新颖的混合TRP-UE感知机制,该机制提升了基于网络的感知性能。使用符合3GPP标准的ISAC信道模型提供了评估结果。结果表明,在室内工厂等具有挑战性的传播环境中,用UE辅助感知补充基于TRP的感知具有显著优势。

英文摘要

Integrated Sensing and Communication (ISAC) refers to the capability for the network to provide communications services whilst also being able to sense the environment in a scalable manner. One of the key functions of ISAC is the accurate localization of passive and mobile sensing targets. This paper introduces a novel hybrid TRP-UE sensing mechanism that improves network-based sensing performance. Evaluation results are provided using 3GPP-compliant ISAC channel models. The results demonstrate the significant benefit in complementing TRP-based sensing with UE-assisted sensing in challenging propagation environments such as indoor factory.

2606.19715 2026-06-19 eess.SP cs.IT math.IT 新提交

Generalized Pinching-Antenna Systems: A Radio-Stripe-Based Realization

广义夹捏天线系统:基于无线电条带的实现

Yanqing Xu, Zhiguo Ding, Tsung-Hui Chang

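AI summary Proposes a radio-stripe (RS) based generalized pinching-antenna (RS-GPA) framework that uses active antenna processing units for location-flexible wireless access, and develops sparse activation and beamforming algorithms to reduce total power consumption.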

Comments 13 pages, 7 figures

详情
AI中文摘要

本文研究无线电条带(RS)作为广义夹捏天线的实际实现,并提出基于RS的广义夹捏天线(RS-GPA)框架。与依赖导波到自由空间被动耦合的介质波导基被动夹捏天线不同,RS采用沿共享电缆部署的主动天线处理单元(APU)进行本地传输、接收和信号处理。这种类似电缆的主动架构提供了灵活的安装和广泛的频率适用性,同时允许选定的APU作为离散且可控的辐射或接收点,实现位置灵活的无线接入。基于所提出的RS-GPA框架,我们通过考虑距离相关的APU-用户信道建立了系统和信道模型。对于下行传输,我们提出了一个电路功率感知的稀疏APU激活和波束成形问题,并开发了一种重加权群稀疏波束成形算法。为了揭示激活原理,我们分析了单用户下行情况,并通过平衡发射功率节省和电路功率成本来刻画何时应激活额外的APU。受此启发,提出了一种几何引导的低复杂度多用户算法。对于上行传输,我们提出了一个联合APU激活和用户功率控制问题,并开发了一种几何引导的稀疏激活设计。数值结果表明,与基准方案相比,所提出的RS-GPA框架显著降低了总功耗,而几何引导算法在运行时间显著降低的情况下实现了与群稀疏设计几乎相同的功耗性能。

英文摘要

This paper investigates radio stripes (RSs) as a practical realization of generalized pinching antennas and proposes an RS-based generalized pinching-antenna (RS-GPA) framework. Unlike dielectric-waveguide-based passive pinching antennas that rely on passive coupling from a guided wave into free space, RSs employ active antenna processing units (APUs) deployed along a shared cable for local transmission, reception, and signal processing. This cable-like active architecture offers flexible installation and broad frequency applicability, while allowing selected APUs to act as discrete and controllable radiation or reception points for location-flexible wireless access. Based on the proposed RS-GPA framework, we establish the system and channel models by accounting for the distance-dependent APU-user channels. For downlink transmission, we formulate a circuit-power-aware sparse APU activation and beamforming problem and develop a reweighted group-sparse beamforming algorithm. To reveal the activation principle, we analyze the single-user downlink case and characterize when an additional APU should be activated by balancing transmit-power saving and circuit-power cost. Inspired by this insight, a geometry-guided low-complexity multiuser algorithm is proposed. For uplink transmission, we formulate a joint APU activation and user power control problem and develop a geometry-guided sparse activation design. Numerical results show that the proposed RS-GPA framework substantially reduces the total consumed power compared with benchmark schemes, while the geometry-guided algorithm achieves near-identical consumed-power performance to the group-sparse design with significantly lower runtime.
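
The single-user trade-off between transmit-power saving and circuit-power cost can be sketched numerically. The model below is an assumption made for illustration and is not taken from the paper: it supposes maximum-ratio transmission, so the transmit power needed to meet an SNR target is `snr_target * noise_power / sum(active gains)`, and it charges a fixed `circuit_power` per active APU. Under that assumed model, an extra APU is worth activating only while the transmit power it saves exceeds its circuit power. All names and numbers are hypothetical.

```python
def choose_active_apus(gains, snr_target, noise_power, circuit_power):
    """Greedy activation under an assumed MRT model; returns (indices, total power)."""
    # Consider APUs from strongest to weakest channel power gain.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    active = [order[0]]
    gain_sum = gains[order[0]]
    for i in order[1:]:
        # Transmit power saved by adding APU i at the same SNR target.
        saving = snr_target * noise_power * (1 / gain_sum - 1 / (gain_sum + gains[i]))
        if saving <= circuit_power:
            break  # weaker APUs save even less, so stop here
        active.append(i)
        gain_sum += gains[i]
    total = snr_target * noise_power / gain_sum + circuit_power * len(active)
    return active, total

# Hypothetical linear-scale channel gains, e.g. decaying with APU-user distance.
active, total_power = choose_active_apus(
    gains=[4.0, 1.0, 0.25], snr_target=10.0, noise_power=1.0, circuit_power=0.4
)
```

In this toy setting the second APU saves 0.5 units of transmit power for 0.4 units of circuit power and is activated, while the third would save less than 0.1 and stays off. Stopping at the first unprofitable APU is valid here because, with gains sorted in descending order, the marginal saving can only shrink.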

2606.19695 2026-06-19 eess.SY cs.GT cs.SY math.OC 新提交

A Unified Framework for Joint Sensor Placement and Scheduling for Intrusion Detection

入侵检测中联合传感器放置与调度的统一框架

Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram

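AI summary Proposes a unified framework that jointly optimizes sensor placement and orientation scheduling, using a game-theoretic utility function and its weak submodularity to achieve near-optimal detection performance.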

Comments 27 pages, 4 figures

详情
AI中文摘要

我们考虑一个入侵检测任务,其中防御者必须联合优化传感器放置位置和方向,以最小化入侵者穿越受保护环境时被漏检的概率。我们将此问题分解为一个元问题(称为SensorPlacement)和一个嵌入的子问题(称为OrientationScheduling)。对于固定的传感器放置,OrientationScheduling子问题被建模为防御者和入侵者之间的两人零和博弈,其中防御者寻求已部署传感器的方向策略以最小化漏检概率,而入侵者则寻求路径选择策略以最大化该概率。由于防御者的策略空间随传感器数量和方向组合增长,通过标准线性规划求解博弈变得不可行。为此,我们开发了一种迭代且高效的均衡求解算法,该算法利用博弈收益函数的结构,并建立了收敛到博弈纳什均衡(NE)的理论保证。该NE值随后被用作SensorPlacement元问题中的效用度量。我们证明了这个基于博弈值的效用函数在传感器放置集合上是弱子模的,并提出了一个具有近最优性保证的贪婪放置算法。据我们所知,这是第一个将博弈论效用设计与(弱)子模优化相结合的统一框架,实现了传感器放置和方向调度的原则性联合优化。通过大量仿真,我们证明所提出的方法实现了近最优的检测性能,同时与基线相比显著减少了计算时间。

英文摘要

We consider an intrusion detection task in which a defender must jointly optimize sensor placement locations and orientations to minimize the probability of missed detection of an intruder traversing a protected environment. We decompose this problem into a meta problem, termed SensorPlacement, and an embedded subproblem, termed OrientationScheduling. The OrientationScheduling subproblem, for a fixed sensor placement, is modeled as a 2-player zero-sum game between the defender and the intruder, where the defender seeks an orientation strategy for the deployed sensors to minimize the probability of missed detection, while the intruder seeks a path selection strategy to maximize it. Since the defender's strategy space grows combinatorially with the number of sensors and orientations, solving the game via standard linear programming becomes prohibitive. To this end, we develop an iterative and efficient equilibrium-seeking algorithm that exploits the structure of the game's payoff function and establishes theoretical guarantees for convergence to the Nash equilibrium (NE) of the game. This NE value is then used as a utility measure in the SensorPlacement meta problem. We show that this game-value-based utility function is weakly submodular over the set of sensor placements and propose a greedy placement algorithm with near-optimality guarantees. To our knowledge, this is the first unified framework to integrate game-theoretic utility design with (weak) submodular optimization, enabling principled joint optimization of sensor placement and orientation scheduling. Through extensive simulations, we demonstrate that the proposed approach achieves near-optimal detection performance while significantly reducing computation time compared to baselines.
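
The outer placement loop described above follows the familiar greedy pattern for (weakly) submodular set functions. The sketch below shows only that pattern. It does not solve the zero-sum game: the paper's utility is the Nash-equilibrium value of the OrientationScheduling game, whereas here a hypothetical path-coverage count stands in for it so the example stays self-contained. The site names, the `coverage` table, and `covered_paths` are all invented for illustration.

```python
def greedy_placement(candidates, budget, utility):
    """Pick up to `budget` sites, each time adding the one with the largest marginal gain."""
    chosen = []
    for _ in range(budget):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        base = utility(chosen)
        best = max(remaining, key=lambda c: utility(chosen + [c]) - base)
        chosen.append(best)
    return chosen

# Placeholder utility: number of intruder paths observed by at least one sensor.
# In the paper's framework this would be replaced by the game value for the placement.
coverage = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1}}

def covered_paths(placement):
    return len(set().union(*(coverage[s] for s in placement)))

placement = greedy_placement(["A", "B", "C", "D"], budget=2, utility=covered_paths)
```

With this stand-in utility the greedy rule first picks site C (four paths) and then site A (three new paths). Swapping in a different utility only requires passing another function, which is why the game value can serve as a drop-in utility measure for the meta problem.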

2606.20098 2026-06-19 cs.IT eess.SP math.IT 新提交

Site-Specific MIMO Channel Generation via Diffusion and Flow Matching: Fidelity, Efficiency, and Downstream Utility

基于扩散和流匹配的特定场地MIMO信道生成:保真度、效率与下游效用

Sina Beyraghi, Masoud Sadeghian, Firdous Bin Ismail, Angel Lozano, Paul Almasan, Giovanni Geraci

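AI summary Compares a conditional denoising diffusion implicit model (cDDIM) and a conditional flow matching model (cFMM) for generating site-specific MIMO channel data; cFMM matches cDDIM quality with roughly an order of magnitude less inference time, and the synthetic data substantially improves downstream physical-layer task performance.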

详情
AI中文摘要

本文探索使用生成模型合成高质量的、特定场地的多输入多输出(MIMO)信道数据,以解决为AI原生无线网络获取真实数据所需的大量测量活动的高成本问题。比较了两种位置条件生成范式:条件去噪扩散隐式模型(cDDIM)和条件流匹配模型(cFMM)。这两种模型都根据用户坐标生成MIMO信道矩阵,以保持部署场地的空间结构。从三个维度评估这些方法:统计保真度(包括波束一致性和有效秩)、生成效率以及在下游任务中的效用,例如信道状态信息压缩和波束对齐。在多种传播场景(28 GHz和3.5 GHz,视距和非视距)下的结果表明,即使在训练数据稀缺的情况下,两种模型都能准确捕捉特定场地的特征。值得注意的是,cFMM实现了与cDDIM相当的质量,但推理时间大约少一个数量级。与仅使用稀缺数据或随机信道相比,用这些合成信道扩充稀缺的特定场地数据集在下游物理层任务中带来了显著的性能提升。

英文摘要

This paper explores the use of generative models to synthesize high-quality, site-specific multiple-input multiple-output (MIMO) channel data, addressing the high cost of the extensive measurement campaigns required to acquire real-world data for AI-native wireless networks. Two location-conditioned generative paradigms are compared: a conditional denoising diffusion implicit model (cDDIM), and a conditional flow matching model (cFMM). Both these models generate MIMO channel matrices conditioned on user coordinates, to preserve the spatial structure of the deployment site. The approaches are evaluated across three dimensions: statistical fidelity (including beam consistency and effective rank), generation efficiency, and utility in downstream tasks such as channel-state information compression and beam alignment. Results across diverse propagation scenarios (28 GHz and 3.5 GHz, both line-of-sight and non-line-of-sight) demonstrate that both models accurately capture site-specific characteristics, even when trained on scarce ground-truth data. Notably, cFMM achieves a quality comparable to cDDIM with roughly an order of magnitude less inference time. Augmenting scarce site-specific datasets with these synthetic channels yields hefty performance gains in downstream physical layer tasks compared to using scarce data alone or stochastic channels.

2606.19871 2026-06-19 math.OC cs.MA cs.SY eess.SY 新提交

Semiglobal Input-Delay Tolerance Algorithm for Distributed Nonconvex Optimization of Networked Nonlinear Systems

网络化非线性系统分布式非凸优化的半全局输入延迟容忍算法

Jing-Zhe Xu, Zhi-Wei Liu, Ming-Feng Ge, Yan-Wu Wang, Dinxin He

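AI summary For networked nonlinear systems with input delays and consensus constraints, proposes a semiglobal input-delay tolerant algorithm that, through a hierarchical design and input-to-state stability analysis, solves nonconvex optimization in a distributed manner under the Polyak-Łojasiewicz condition.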

Comments 36 pages, 5 figures

详情
AI中文摘要

本文研究了一类受输入延迟和一致性约束的网络化非线性系统中的分布式优化问题。引入了输入延迟容忍半全局收敛(IDTSC),即对于任意给定的紧致初始集,存在一个可容许的延迟界,在该界下,最优解在一致性约束内被计算,并且所有节点状态收敛到该解。基于分层设计和输入-状态稳定性分析,开发了一种新的半全局输入延迟容忍(SIDT)算法,该算法在实际中实现了输入延迟与非线性动力学耦合下的分布式优化IDTSC。此外,通过Polyak-Łojasiewicz条件放宽严格凸性要求,SIDT算法将其适用性扩展到非凸优化。最后,数值实验验证了该理论在具有输入延迟的网络化非线性系统上的有效性。

英文摘要

This paper studies a class of distributed optimization problems in networked nonlinear systems (NNSs) subject to input delays and consensus constraints. It introduces input-delay tolerant semiglobal convergence (IDTSC), meaning that for any prescribed compact initial set there exists an admissible delay bound under which the optimal solution is computed within consensus constraints and all node states converge to the solution. Building on a hierarchical design and input-to-state stability analysis, a new semiglobal input-delay tolerant (SIDT) algorithm is developed that practically achieves IDTSC for distributed optimization under the coupling between input delays and nonlinear dynamics. Further, by relaxing strict convexity requirements through the Polyak-Łojasiewicz condition, the SIDT algorithm broadens its applicability to nonconvex optimization. Finally, numerical experiments corroborate the theory on NNSs with input delays.

2606.19669 2026-06-19 math.OC cs.SY eess.SY 新提交

Learning Neural Maximal Lyapunov Functions on $\mathsf{SO}(n)$

在 $\mathsf{SO}(n)$ 上学习神经最大李雅普诺夫函数

Adeel Akhtar, Matthieu Barreau

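AI summary Proposes a neural Lyapunov architecture based on the logarithmic map, learns the maximal region of attraction via a Zubov-type characterization, and derives explicit formulas for the derivative of the logarithmic map that enable a two-phase training algorithm.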

Comments Accepted to IEEE Control Systems Letters (L-CSS), 6 pages, 2 figures

详情
AI中文摘要

为李群上的动力系统建立稳定性保证是一个基本挑战,因为为欧几里得空间开发的经典李雅普诺夫方法不能直接转移到弯曲几何上。在本文中,我们提出了一个框架,用于学习在特殊正交群 $\mathsf{SO}(n)$ 上演化的系统的最大李雅普诺夫函数。理论上,我们引入了一种基于对数映射的神经李雅普诺夫架构,具有可证明的逼近能力,并通过最大吸引域的Zubov型表征来形式化学习问题。一个关键的技术贡献是推导了对数映射导数的显式、数值可处理的公式,使得通过一个平衡计算效率和精度的两阶段算法进行训练成为可能。实证上,我们在一个低维非线性系统上验证了该方法。

英文摘要

Establishing stability guarantees for dynamical systems on Lie groups is a fundamental challenge, as classical Lyapunov methods developed for Euclidean spaces do not directly transfer to curved geometries. In this paper, we propose a framework for learning maximal Lyapunov functions for systems evolving on the special orthogonal group $\mathsf{SO}(n)$. Theoretically, we introduce a neural Lyapunov architecture based on the logarithmic map with proven approximation capabilities, and we formulate the learning problem via a Zubov-type characterization of the maximal region of attraction. A key technical contribution is the derivation of explicit, numerically tractable formulas for the derivative of the logarithmic map, enabling training through a two-phase algorithm that balances computational efficiency and accuracy. Empirically, we validate the approach on a low-dimensional nonlinear system.
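
To show what the logarithmic map contributes, the sketch below computes it in closed form for the special case n = 3 and uses it in a hand-written quadratic candidate function. This is an illustration only: the paper works on general SO(n) and feeds the log map into a neural architecture, neither of which is reproduced here. The function names are assumptions, and the formula used is the standard Rodrigues-type expression, which is valid for rotation angles strictly below pi.

```python
import math

def rot_z(angle):
    """Rotation matrix about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def log_so3(R):
    """Rotation vector (axis times angle) of R in SO(3), for angles below pi."""
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding error
    theta = math.acos(cos_theta)
    # theta / (2 sin theta) tends to 1/2 as theta goes to 0.
    scale = 0.5 if theta < 1e-8 else theta / (2.0 * math.sin(theta))
    return [
        scale * (R[2][1] - R[1][2]),
        scale * (R[0][2] - R[2][0]),
        scale * (R[1][0] - R[0][1]),
    ]

def candidate_lyapunov(R):
    """Placeholder candidate: squared geodesic distance from R to the identity."""
    return sum(w * w for w in log_so3(R))

omega = log_so3(rot_z(0.5))
value = candidate_lyapunov(rot_z(0.5))
```

The log map turns a rotation matrix into ordinary coordinates in which a function that vanishes only at the identity is easy to write down. The singular behaviour as the angle approaches pi, where `sin(theta)` vanishes, hints at why explicit and numerically tractable derivative formulas are listed as a key contribution.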

2606.19767 2026-06-19 eess.IV cs.CV physics.med-ph 新提交

Contour-Constrained Deformable Registration with Parameter Characterization for Head and Neck Surgical Guidance

面向头颈外科引导的带参数表征的轮廓约束可变形配准

Qingyun Yang, Jon S. Heiselman, Ayberk Acar, Morgan J. Ringel, Michael I. Miga, Matthieu Chabanas, Michael C. Topf, Jie Ying Wu

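AI summary Proposes a deformable registration framework based on regularized Kelvinlet basis functions that corrects post-resection tissue deformation using surface point clouds, fiducial landmarks, and contour constraints; across nine head and neck specimens it reduces registration error from 11.11 mm with rigid registration to 5.62 mm, a 49.41% reduction.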

详情
AI中文摘要

全球每年新增89万例头颈部鳞状细胞癌,其复发率在实体恶性肿瘤中最高。尽管冰冻切片分析是术中切缘评估的标准方法,但由于切除标本与切除床之间的对准不精确,加上切除后黏膜组织收缩,准确地将检测到的阳性切缘重新定位到切除床上仍然具有挑战性。我们提出了一种生物力学驱动的可变形配准框架,用于校正术后组织变形以提供术中引导。该方法基于正则化Kelvinlet基函数的可变形配准方法,将3D标本网格配准到术中切除床点云。配准匹配表面点云、基准标记和边界轮廓约束,直接惩罚标本与切除床边界之间的垂直距离一致性。在来自皮肤、颊粘膜和舌部位的9个标本上,使用刚性配准的整体平均目标配准误差为$11.11 \pm 4.07$ mm,使用无轮廓约束的可变形配准则降至$8.20 \pm 2.68$ mm(降低26.19%)。所提出的轮廓约束可变形配准进一步将误差降至$5.62 \pm 2.28$ mm,相对于刚性配准降低了49.41%。我们在临床最具挑战性的舌标本中观察到最大降幅。我们还进行了系统的两阶段参数搜索,以表征表面配准、基准对应、轮廓约束和应变能正则化的相对重要性。该搜索表明,对于具有大侧向变形的组织类型,轮廓权重主导配准精度,而算法在广泛的参数组合范围内均可运行。

英文摘要

With 890,000 annual new cases globally, head and neck squamous cell carcinoma has one of the highest recurrence rates among solid malignancies. Although frozen section analysis is the standard of care for intraoperative margin assessment, accurately relocating detected positive margins on the resection bed remains challenging due to imprecise alignment between resected specimens and their resection bed, compounded by post-resection mucosal tissue shrinkage. We present a biomechanics-driven deformable registration framework that corrects post-resection tissue deformation to provide intraoperative guidance. Our approach registers 3D specimen meshes to intraoperative resection bed point clouds using a deformable registration approach based on regularized Kelvinlet basis functions. The registration matches surface point clouds, fiducial landmarks, and boundary contour constraints that directly penalize perpendicular distance-to-agreement between specimen and resection bed boundaries. Across nine specimens from skin, buccal mucosa, and tongue sites, the overall mean target registration error was $11.11 \pm 4.07$ mm using rigid registration, which decreased to $8.20 \pm 2.68$ mm (26.19\% reduction) using deformable registration without contour constraint. The proposed contour-constrained deformable registration further reduced the error to $5.62 \pm 2.28$ mm, a 49.41\% reduction relative to rigid registration. We observed the largest reduction in the most clinically challenging tongue specimens. We also performed a systematic two-stage parameter search to characterize the relative importance of surface alignment, fiducial correspondences, contour constraint, and strain energy regularization. This search revealed that contour weighting dominates registration accuracy for tissue types with large lateral deformation, while the algorithm operates over a broad range of parameter combinations.
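
The reported percentage reductions follow directly from the mean target registration errors quoted above. The short check below recomputes them; the helper name `percent_reduction` is just for this example, and the inputs are the three mean values from the abstract.

```python
def percent_reduction(baseline_mm, improved_mm):
    """Relative reduction of a mean error, in percent of the baseline."""
    return 100.0 * (baseline_mm - improved_mm) / baseline_mm

rigid_mm, deformable_mm, contour_mm = 11.11, 8.20, 5.62  # mean TRE values from the abstract

deformable_vs_rigid = round(percent_reduction(rigid_mm, deformable_mm), 2)
contour_vs_rigid = round(percent_reduction(rigid_mm, contour_mm), 2)
contour_vs_deformable = round(percent_reduction(deformable_mm, contour_mm), 2)
```

The first two values reproduce the 26.19% and 49.41% figures. The third, roughly 31.5%, is not stated in the abstract; it is simply the same arithmetic applied to the step from unconstrained to contour-constrained deformable registration.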

2606.20060 2026-06-19 nlin.AO cs.SY eess.SY 新提交

Nodal Braess's Paradox and Inertia Destabilization with Dynamic Node and Line Failures in Power Grids

电网中动态节点与线路故障的节点Braess悖论与惯性失稳

Nubius Brandner, Frank Hellmann, Hans Würfel, Jürgen Kurths, Anton Plietzsch, Anna Büttner

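AI summary Proposes a new model that integrates node and line failures with synchronizing oscillator dynamics, finds that high inertia and increased nodal robustness can paradoxically enlarge cascades, and identifies a novel type of Braess's paradox.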

详情
AI中文摘要

大规模停电通常由级联故障引起。这些故障通过网络动力学与单个组件故障之间的复杂相互作用动态展开。相比之下,物理学中对级联故障的研究集中在准静态状态下分析线路过载。我们引入了一个新模型,将节点和线路故障的动力学与电网同步的典型振荡器模型相结合。这使我们能够首次研究耦合故障的集体级联行为。我们研究了节点鲁棒性(节点承受瞬态扰动的能力)和惯性(节点抵抗频率偏差的能力)对级联规模的影响。我们发现了驱动系统脆弱性的两种新机制:i) 虽然低惯性被广泛认为是电网的主要挑战,但我们发现高惯性会放大级联规模,除非伴随其他动力学特性的适当调整。ii) 此外,我们发现单个节点鲁棒性的增强可能反常地导致更大的级联。后一种现象构成了一种新型的Braess悖论。理解这种反直觉的集体效应对于实现有弹性的未来电网可能至关重要。

英文摘要

Large-scale power outages are typically caused by cascading failures. These unfold dynamically through complex interactions between network dynamics and individual component failures. In contrast, the study of cascading failures in physics has focused on analyzing line overloads in the quasi-static regime. We introduce a new model that integrates the dynamics of node and line failures with a paradigmatic oscillator model for power grid synchronization. This enables us to investigate the collective cascading behavior of coupled failures for the first time. We study the impact of nodal robustness, the ability of nodes to tolerate transient disturbances, and inertia, the ability of nodes to resist frequency deviations, on cascade sizes. We discover two novel mechanisms driving system fragility: i) While low inertia is widely considered a major challenge for power grids, we find that high inertia can amplify cascade sizes unless accompanied by appropriate adjustments of other dynamical properties. ii) Further, we find that an increase in the robustness of individual nodes can paradoxically lead to larger cascades. This latter phenomenon constitutes a novel type of Braess's paradox. Understanding such counterintuitive collective effects may become central for achieving resilient future power grids.

2606.18485 2026-06-19 cs.SD cs.AI eess.AS 新提交

MagpieTTS-LF: Inference-Time Long-Form Speech Generation Without Training on Long-Form data

MagpieTTS-LF:无需长语音数据训练的推理时长生成长语音生成

Subhankar Ghosh, Jason Li, Paarth Neekhara, Shehzeen Hussain, Ryan Langman, Xuesong Yang, Roy Fejgin

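Affiliations * NVIDIA Corporation

AI summary Proposes MagpieTTS-LF, an inference-time method that uses soft attention priors, stateful inference, and history-aware text encoding to generate coherent long-form speech without retraining the model.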

Journal ref Interspeech 2026

详情
AI中文摘要

神经文本到语音(TTS)系统在短语句上取得了显著质量,但长语音生成表现出韵律漂移、说话人不一致和句子边界伪影。现有方法要么压缩序列、增加上下文长度,要么简单拼接独立合成的片段。我们提出一种称为MagpieTTS-LF的推理时方法,使MagpieTTS能够在不重新训练模型的情况下生成连贯的长语音。我们的方法引入了三个关键创新:(1)软注意力先验,在保留过去和未来上下文的同时引导单调对齐;(2)有状态推理算法,跨句子块维护上下文,确保韵律连续性;(3)历史感知文本编码,利用过去文本进行语篇级韵律规划。在长文本上的实验表明,与其他基线相比,在长距离可懂度、韵律连贯性、说话人一致性和边界自然度方面有显著改进。

英文摘要

Neural Text-to-Speech (TTS) systems achieve remarkable quality on short utterances but long-form speech generation shows prosodic drift, speaker inconsistencies and sentence boundary artifacts. Existing approaches either compress sequences, increase context length or naively concatenate independently synthesized chunks. We present an inference-time approach called MagpieTTS-LF that enables MagpieTTS to produce coherent long-form speech without model retraining. Our method introduces three key innovations: (1) soft attention priors to guide monotonic alignment while preserving past and future context; (2) a stateful inference algorithm that maintains context across sentence chunks, ensuring prosodic continuity; (3) history-aware text encoding that uses past text for discourse-level prosodic planning. Experiments on long texts show significant improvements in long-range intelligibility, prosodic coherence, speaker consistency, and boundary naturalness compared to other baselines.

2606.18272 2026-06-19 cs.NI cs.AI cs.SY eess.SY 新提交

Mitigating Anchoring Bias in LLM-Based Agents for Energy-Efficient 6G Autonomous Networks

缓解基于LLM的智能体在节能6G自主网络中的锚定偏差

Hatim Chergui, Claudia Carballo González, Farhad Rezazadeh, Merouane Debbah

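Affiliations * i2CAT Foundation; Universitat Politècnica de Catalunya; Research Institute for Digital Future

AI summary Proposes a randomized anchoring strategy based on a truncated three-parameter Weibull distribution to mitigate anchoring bias in LLM agents for 6G network slicing, combined with a CVaR-based digital twin that safeguards SLA tail latency, achieving energy savings of up to 25%.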

Comments 7 pages, 4 figures

详情
AI中文摘要

本文提出了一种自主智能体资源协商框架,旨在使用大语言模型(LLM)智能体实现6G架构中的零接触网络切片。虽然LLM提供了强大的推理能力,但我们证明此类智能体固有地遭受锚定偏差,僵化地坚持初始启发式提议,导致严重的网络过度配置。为系统性地缓解这种认知偏差,我们提出了一种新颖的随机锚定策略,通过截断三参数威布尔分布建模。这种数学上有界的方法与采用条件风险价值(CVaR)的突发感知数字孪生(DT)无缝集成,以严格保证严格的服务水平协议(SLA)尾延迟。为验证我们的方法,我们引入并证明了双峰约束避免效用定理,表明虽然可行的协商遵循经典凸界,但高度约束的场景会发生由逆有理衰减包络控制的相变。使用本地托管的1B参数模型(otel-llm-1b-it)生成的实证结果证实了这些双区域界。我们的认知去偏成功瓦解了僵化的协商模式,迫使智能体主动探索以安全地利用SLA边界,并将系统节能提升高达25%。关键的是,轻量级1B LLM实现了亚秒级推理延迟(平均0.95秒),确保我们的多智能体框架与O-RAN非实时RAN智能控制器(non-RT RIC)的操作时间尺度兼容。

英文摘要

This paper presents an autonomous agentic resource negotiation framework designed to enable zero-touch network slicing in 6G architectures using Large Language Model (LLM) agents. While LLMs offer powerful reasoning capabilities, we demonstrate that such agents inherently suffer from anchoring bias, rigidly adhering to initial heuristic proposals and causing severe network over-provisioning. To systematically mitigate this cognitive bias, we propose a novel randomized anchoring strategy modeled via a Truncated 3-Parameter Weibull distribution. This mathematically bounded approach seamlessly integrates with burst-aware Digital Twins (DTs) employing Conditional Value at Risk (CVaR) to rigorously guarantee strict Service Level Agreement (SLA) tail-latencies. To validate our methodology, we introduce and prove the \emph{Bimodal Constraint-Avoidance Utility Theorem}, demonstrating that while feasible negotiations follow classical convex bounds, highly constrained scenarios undergo a phase transition governed by an inverse rational decay envelope. Empirical results generated using a locally hosted 1B-parameter model otel-llm-1b-it confirm these dual-regime bounds. Our cognitive de-biasing successfully dismantles rigid negotiation patterns, forcing agents into active exploration to safely ride SLA boundaries and boost system energy savings up to 25\%. Crucially, the lightweight 1B LLM achieves sub-second inference latencies (0.95s mean), ensuring our multi-agent framework is compatible with the operational timescales of the O-RAN non-Real-Time RAN Intelligent Controller (non-RT RIC). Our source code is available for non-commercial use at https://github.com/HatimChergui.
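
Two ingredients named in this abstract, a truncated three-parameter Weibull draw and Conditional Value at Risk, have simple textbook forms that can be sketched with the standard library. The code below is not the authors' implementation and makes several assumptions: the anchor is taken to be a scalar initial proposal in arbitrary units, the shape, scale, location, and truncation bounds are invented, and CVaR is estimated empirically as the mean of the worst `1 - alpha` fraction of latency samples.

```python
import math
import random

def weibull_cdf(x, shape, scale, loc):
    """CDF of a three-parameter Weibull distribution (shape, scale, location)."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))

def sample_truncated_weibull(rng, shape, scale, loc, low, high):
    """Inverse-CDF draw restricted to [low, high], with loc <= low < high."""
    f_low = weibull_cdf(low, shape, scale, loc)
    f_high = weibull_cdf(high, shape, scale, loc)
    p = f_low + rng.random() * (f_high - f_low)  # uniform on the truncated CDF range
    return loc + scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def cvar(samples, alpha):
    """Empirical CVaR: mean of the samples at or beyond the alpha-quantile."""
    ordered = sorted(samples)
    start = min(math.ceil(alpha * len(ordered)), len(ordered) - 1)
    tail = ordered[start:]
    return sum(tail) / len(tail)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical randomized anchors for an agent's opening proposal.
anchors = [
    sample_truncated_weibull(rng, shape=1.5, scale=20.0, loc=10.0, low=15.0, high=60.0)
    for _ in range(1000)
]

latencies_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy latency samples
tail_risk = cvar(latencies_ms, alpha=0.8)       # mean of the worst 20 percent
```

Truncation keeps every randomized anchor inside a bounded range, which is one way to read the abstract's phrase "mathematically bounded". The CVaR estimate summarizes the latency tail rather than the average, which is the quantity an SLA tail-latency guarantee is concerned with.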