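arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.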

2605.15135 2026-05-15 eess.SP cs.IT math.IT

Deep Mixture of Experts Network for Resource Optimization in Aerial-Terrestrial CF-mMIMO Systems under URLLC

Donggen Li, Chong Huang, Jingfu Li, Pei Xiao, Wenjiang Feng, Dusit Niyato, Zhu Han

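AI summary: This paper studies resource allocation for hybrid aerial-terrestrial cell-free massive MIMO (CF-mMIMO) systems under ultra-reliable low-latency communication (URLLC) constraints. To counter channel aging caused by high mobility, the authors propose a Transformer-based channel prediction network (CP-Net) and design a deep mixture-of-experts network (MoE-Net) for uplink power allocation, with a weighted gating network (WT-Net) that adaptively combines the expert models. The method is reported to improve communication performance and resource efficiency under URLLC constraints.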

Comments: 15 pages, accepted for publication in IEEE Transactions on Wireless Communications

Abstract

As a critical component of sixth-generation (6G) wireless networks, ultra-reliable and low-latency communication (URLLC) is expected to support real-time and reliable information exchange in low-altitude environments. However, achieving URLLC often incurs significant resource overhead, including increased bandwidth consumption, higher transmit power, and denser access point (AP) deployment, which pose significant challenges to both spectral efficiency (SE) and energy efficiency (EE). Besides, existing iterative optimization algorithms are computationally intensive and struggle to meet the latency requirements of URLLC. To address these challenges, we propose a hybrid aerial-terrestrial cell-free massive MIMO (CF-mMIMO) network to support diverse services, along with a channel prediction network and a deep mixture of experts (MoE) network for uplink optimization. First, we design a channel prediction network (CP-Net) to mitigate channel aging caused by high-mobility user equipment (UE). CP-Net employs three Transformer-based sub-networks for aged channel state information (CSI) prediction, while a channel quality-aware loss function is introduced to improve the prediction accuracy of weak links. Based on the predicted CSI, we develop a deep MoE network (MoE-Net) for power allocation comprising three expert models targeting different objectives. Then, we introduce a weighted gating network (WT-Net) to learn an efficient adaptive combination of expert outputs. The proposed framework better captures heterogeneous UE requirements and improves communication performance under URLLC constraints. Numerical results demonstrate the effectiveness of the proposed method.
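
The abstract describes WT-Net as learning an adaptive combination of the three expert outputs. A minimal sketch of that idea, assuming a plain softmax gate over one score per expert; the function names, the example powers, and the gate scores are all hypothetical, and the paper's actual WT-Net architecture is not given in the abstract:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_experts(expert_outputs, gate_logits):
    """Weighted sum of per-expert power vectors.

    expert_outputs: one equal-length list per expert (hypothetical per-UE powers).
    gate_logits: one score per expert, assumed to come from a gating network.
    """
    weights = softmax(gate_logits)
    n = len(expert_outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, expert_outputs))
            for i in range(n)]

# Three hypothetical experts proposing transmit powers for two UEs.
experts = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
# Equal gate scores give each expert weight 1/3 (assumed values, not from the paper).
combined = combine_experts(experts, gate_logits=[0.0, 0.0, 0.0])
```

With equal gate scores the result is the plain average of the experts; a trained gate would shift weight toward whichever expert suits the current UE requirements.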

2605.15122 2026-05-15 cs.RO cs.LG cs.SY eess.SY

CoCo-InEKF: State Estimation with Learned Contact Covariances in Dynamic, Contact-Rich Scenarios

Michael Baumgartner, David Müller, Agon Serifi, Ruben Grandia, Espen Knoop, Markus Gross, Moritz Bächer

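AI summary: This paper proposes CoCo-InEKF, a filtering method for robust state estimation of legged robots in dynamic, contact-rich scenarios. It replaces the conventional binary contact state with learned contact covariances, which capture partial contact and directional slip more accurately. Experiments on a bipedal robot show a better trade-off between velocity-estimation accuracy and efficiency, along with improved filter consistency, supporting stable execution of demanding motions such as dancing and complex ground interactions.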

2605.15086 2026-05-15 eess.IV eess.SP

FaSST: Fast Sparsifying Secondary Transform

Darukeesan Pakiyarajah, Samuel Fernández-Menduiña, Eduardo Pavez, Antonio Ortega, Debargha Mukherjee

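AI summary: This paper proposes FaSST, a fast sparsifying secondary transform framework that aims to improve residual compression efficiency in video coding while lowering computational complexity. The method factorizes a data-driven sparse orthogonal transform into a sequence of Givens rotations and solves for it efficiently with an alternating minimization strategy, improving on the conventional low-frequency non-separable transform (LFNST). Experiments show that FaSST substantially reduces computation at the same rate-distortion performance and achieves higher coding gain in certain modes.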
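
The summary above mentions factorizing an orthogonal transform into a sequence of Givens rotations. A minimal sketch of applying such a cascade to a vector; the rotation schedule below is made up for illustration, whereas in FaSST the index pairs and angles would come from the learned transform:

```python
import math

def apply_givens(vec, i, j, theta):
    """Rotate coordinates i and j of vec by angle theta; returns a new list."""
    c, s = math.cos(theta), math.sin(theta)
    out = list(vec)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out

def apply_rotation_sequence(vec, rotations):
    """Apply a list of (i, j, theta) Givens rotations in order."""
    for i, j, theta in rotations:
        vec = apply_givens(vec, i, j, theta)
    return vec

# Hypothetical rotation schedule (not taken from the paper).
rotations = [(0, 1, math.pi / 4), (2, 3, math.pi / 6), (1, 2, -math.pi / 3)]
x = [1.0, 2.0, 3.0, 4.0]
y = apply_rotation_sequence(x, rotations)
```

Each rotation touches two coordinates and costs four multiplications, so a cascade of K rotations costs about 4K multiplications compared with N² for a dense N×N matrix-vector product. Every rotation is orthogonal, so the cascade preserves the vector norm.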

2605.15049 2026-05-15 cs.RO cs.MA cs.SY eess.SY

A Prototyping Framework for Distributed Control of Multi-Robot Systems

Junaid Ahmed Memon, Allan Andre Do Nascimento, Kostas Margellos, Antonis Papachristodoulou

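AI summary: This paper presents a prototyping framework for distributed control of multi-robot systems, intended to bridge theoretical work on distributed optimization algorithms and practical testing. The framework follows the single program, multiple data (SPMD) paradigm to emulate distributed control on a single computer: each core runs the same algorithm, keeps local state, and communicates with its neighbours. A non-cooperative game-theoretic algorithm applied to a quadrotor position-swapping task validates the framework across different dynamics models, namely a point-mass model, a high-fidelity quadrotor model, and a real hardware testbed, demonstrating low-cost, easy-to-use algorithm validation.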

2605.15044 2026-05-15 cs.SD cs.AI cs.LG cs.MM eess.AS

SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning

KiHyun Nam, Jungwoo Heo, Siu Bae, Ha-Jin Yu, Joon Son Chung

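AI summary: With the rise of physical AI, conversational robots, and screenless wearables, audio large language models need speaker-specific understanding to support user authentication, personalization, and context-aware interaction. This paper proposes SpeakerLLM, a speaker-specialized audio-LLM framework that unifies single-utterance speaker profiling, recording-condition understanding, two-utterance speaker comparison, and evidence-based verification reasoning. At its core is a hierarchical speaker tokenizer that captures multi-granularity information about speaker identity and recording conditions separately, together with structured reasoning traces that improve the accuracy and interpretability of verification reasoning.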

2605.15032 2026-05-15 eess.SP cs.LG

Multi-Block Attention for Efficient Channel Estimation in IRS-Assisted mmWave MIMO

Mehrdad Momen-Tayefeh, Mehrshad Momen-Tayefeh, Maryam Sabbaghian

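AI summary: This paper studies efficient channel estimation in intelligent reflecting surface (IRS)-assisted mmWave MIMO systems and proposes a deep learning-based Multi-Block Attention (MBA) framework that lowers training overhead and improves estimation accuracy. The method selectively deactivates IRS elements and uses a two-stage network, one stage for spatial correlation recovery and one for noise suppression, which mitigates error propagation in channel estimation. Experiments show that MBA substantially reduces pilot overhead and improves channel estimation performance while keeping computational complexity low.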

DOI: 10.1109/TCOMM.2025.3618696
Journal ref: IEEE Transactions on Communications, vol. 73, no. 12, pp. 13891-13903, Dec. 2025

Abstract

Intelligent Reflecting Surfaces (IRSs) are a promising technology for enhancing the spectral and energy efficiency of millimeter-wave (mmWave) multiple-input multiple-output (MIMO) systems. In these systems, accurate channel estimation remains challenging due to the passive nature of IRS elements and the high pilot overhead in large-scale deployments. This paper presents a deep learning-based Multi-Block Attention (MBA) framework for efficient cascaded channel estimation in IRS-assisted mmWave MIMO systems that utilize orthogonal frequency division multiplexing (OFDM). First, we show the optimality of the discrete Fourier transform (DFT) and Hadamard matrices as phase configurations for least squares (LS) estimation. To reduce training overhead, we selectively deactivate IRS elements and compensate for induced feature loss using a two-stage architecture: (i) a Convolutional Attention Network (CAN) for spatial correlation recovery and (ii) a Complex Multi-Convolutional Network (CMN) for noise suppression. The MBA architecture mitigates error propagation through attention-guided feature refinement and denoising. Simulation results indicate that the MBA method reduces pilot overhead by up to 87% compared to the LS estimator. Additionally, at signal-to-noise ratios of 10 dB, our proposed method achieves approximately 51% lower normalized mean squared error (NMSE) than leading methods. It also maintains low computational complexity and adapts effectively to various propagation environments.
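
The abstract names Hadamard matrices as one of the phase configurations shown to be optimal for LS estimation. The sketch below only illustrates the property usually cited for such designs, namely that the rows are mutually orthogonal so that the Gram matrix is a scaled identity. It uses the standard Sylvester construction and does not reproduce the paper's optimality argument; the function names are illustrative.

```python
def sylvester_hadamard(order):
    """Build a 2^order x 2^order Hadamard matrix with +1/-1 entries."""
    h = [[1]]
    for _ in range(order):
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h

def gram(matrix):
    """Return the matrix multiplied by its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]

H = sylvester_hadamard(3)  # 8 x 8 phase pattern with entries +1 / -1
G = gram(H)                # 8 on the diagonal, 0 elsewhere
```

Because the Gram matrix is 8 times the identity, inverting the pilot pattern in an LS estimator reduces to a transpose and a scaling.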

2605.14989 2026-05-15 eess.SP

Map2APS: A Physically Grounded Benchmark for Direct Angle Power Spectrum Prediction from Urban Geometry

Junxi Huang, Xiucheng Wang, Nan Cheng, Kailong Wang, Ruijin Sun, Zhisheng Yin

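AI summary: This paper introduces Map2APS, a physically grounded benchmark dataset for predicting the angle power spectrum (APS) directly from urban geometry. Built from propagation records of intelligent ray tracing, it contains 51 uniform-height urban maps and about 2.55 million transceiver samples, and uses a strict cross-map split to evaluate generalization. The work introduces MS-AReg as a strong baseline, which achieves high cosine similarity and low angular error on the test set, and provides dominant-direction metrics for evaluating the directional properties of predicted spectra.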

2501.17473 2026-05-15 cs.IT cs.SY eess.SY math.IT

Remote State Estimation over a Wearing Channel: Information Freshness vs. Channel Aging

Jiping Luo, George Stamatakis, Osvaldo Simeone, Nikolaos Pappas

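AI summary: This paper studies remote state estimation of a linear Gaussian system over a channel that degrades over time, and examines the trade-off between information freshness and channel aging. In each time slot the sensor can send a new measurement, restore channel quality, or stay silent; frequent transmission accelerates channel aging, while frequent restoration hurts estimation quality. The problem is formulated as a semi-Markov decision process, the optimal policy is shown to have monotonicity properties, and a structure-aware solution method is proposed.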

2605.14949 2026-05-15 cs.CV eess.IV eess.SP

A CUBS-Compatible Ultrasound Morphology and Uncertainty-Aware Baseline for Carotid Intima-Media Segmentation and Preliminary Risk Prediction

Aueaphum Aueawatthanaphisut

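AI summary: This study presents AtheroFlow-XNet, a reproducible baseline model for carotid intima-media segmentation and preliminary risk prediction from ultrasound images, aimed at a more complete assessment of vascular risk in patients with atherosclerosis. The model uses intima-media masks derived from manual annotations for supervised segmentation, incorporates clinical variables in an auxiliary risk-prediction branch, and applies Monte Carlo dropout for uncertainty-aware inference. It reports a Dice coefficient of 0.7930 for segmentation and an AUC of 0.6910 for preliminary risk prediction, offering a baseline for ultrasound-supported automated vascular wall analysis.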

Comments: 13 pages, 5 figures, 2 tables, 20 equations, 3 appendices

Abstract

Carotid atherosclerosis is a major contributor to ischemic stroke and transient ischemic attack. Conventional ultrasound assessment is commonly based on intima-media thickness, plaque appearance, stenosis degree, and peak systolic velocity, but these morphology- and velocity-based indicators may not fully capture patient-specific vascular risk. This study presents AtheroFlow-XNet, a CUBS-compatible ultrasound morphology and uncertainty-aware learning baseline for carotid intima-media segmentation and preliminary risk prediction. Using the Carotid Ultrasound Boundary Study dataset, manual lumen-intima and media-adventitia boundary annotations were converted into dense intima-media masks for supervised segmentation. Clinical variables were incorporated into an auxiliary risk-prediction branch, and Monte Carlo dropout was used for uncertainty-aware inference. The model was evaluated using a patient-level train-validation-test split with 1,522 training images, 326 validation images, and 328 testing images. The proposed model achieved a Dice coefficient of 0.7930 for LI-MA mask segmentation, a segmentation loss of 0.2359, and an area under the receiver operating characteristic curve of 0.6910 for preliminary risk prediction. Qualitative results showed that predicted masks were generally aligned with manual annotations, while uncertainty maps highlighted ambiguous wall-boundary regions. These results suggest that ultrasound-derived carotid morphology can support automated wall analysis and uncertainty-aware interpretation. Since CUBS does not provide Doppler waveforms or CFD-derived hemodynamic biomarkers, this work should be interpreted as a reproducible morphology-driven baseline. Future work will incorporate Doppler-derived flow profiles, patient-specific vascular reconstruction, and CFD-based wall shear biomarkers.
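
The abstract reports segmentation quality as a Dice coefficient. A minimal sketch of that metric on binary masks; the tiny masks are made up, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract:

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A & B| / (|A| + |B|) for binary masks given as nested lists."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    intersection = sum(p * t for p, t in zip(flat_p, flat_t))
    total = sum(flat_p) + sum(flat_t)
    # Convention (assumed): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Made-up 2 x 3 masks: two overlapping pixels, three foreground pixels each.
pred = [[0, 1, 1], [0, 1, 0]]
target = [[0, 1, 0], [0, 1, 1]]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```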

2605.14945 2026-05-15 eess.SY cs.SY

Robust Quadcopter Motion Control Using Output Feedback

Stanislav Kim, Anton Pyrkin, Oleg Borisov

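AI summary: This study addresses output-feedback motion control of quadcopters and proposes a robust control method. A geometric approach transforms the vehicle model into a canonical form with time-varying gain coefficients, which double integration of the control input then renders time-invariant. A robust output-feedback control law is designed using an extended-observer approach, improving control performance under uncertainty.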

2605.14944 2026-05-15 cs.RO cs.SY eess.SY

Behavioral Data-Driven Optimal Trajectory Generation for Rotary Cranes

Iskandar Khemakhem, Manuel Zobel, Johannes Schüle, Oliver Sawodny, Naoki Uchiyama, Abdallah Farrage

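AI summary: As the construction industry grows and skilled labour becomes scarce, automating crane control is increasingly important. This paper proposes a behavioral data-driven, open-loop trajectory generation method for rotary cranes that reduces load sway while lowering operation time and energy consumption. Based on Willems' fundamental lemma and its generalizations, it needs no explicit system model and generates smooth optimal trajectories directly from input-output data. Experiments validate the method, showing marked improvements over conventional model-based approaches in load sway, tracking error, and operation time.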

2605.14942 2026-05-15 physics.app-ph cs.SY eess.SY

Radioactive Source Seeking using Bayesian Optimisation with Movement Penalty

Lysander Miller, Joshua Keene, Jeremy M. C. Brown, Airlie Chapman

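AI summary: This paper studies radioactive source localization with a mobile robot and proposes an efficient source-seeking strategy based on Bayesian optimisation. The method uses a heteroscedastic Gaussian process surrogate to balance exploration and exploitation, and adds a movement cost function to cut unnecessary motion. The strategy is shown to have sublinear regret on the source-localization task and performs well in simulation.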

2605.14941 2026-05-15 eess.SP cs.HC cs.LG

nASR: An End-to-End Trainable Neural Layer for Channel-Level EEG Artifact Subspace Reconstruction in Real-Time BCI

Shantanu Sarkar, Jose L. Contreras-Vidal

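AI summary: This study proposes nASR, an end-to-end trainable neural network layer for channel-level EEG artifact subspace reconstruction in real-time brain-computer interfaces (BCI). Conventional ASR relies on a fixed threshold parameter and can damage useful neural signals; nASR introduces two learnable threshold parameters so that artifact detection and downstream decoding are optimized jointly, improving signal quality and decoding performance. Experiments show that nASR outperforms conventional ASR in classification accuracy and inference speed, making it suitable for real-time BCI applications with strict latency and performance requirements.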

Comments: Preprint. Submitted to IEEE SMC 2026 (under review)

Abstract

Electroencephalogram (EEG) signals are highly susceptible to artifacts, resulting in a low signal-to-noise ratio which makes extraction of meaningful neural information challenging. Artifact Subspace Reconstruction (ASR) is one of the most widely used artifact filtering techniques in EEG-based BCI applications, owing to its real-time applicability. ASR reconstructs artifact-free signals by operating in Principal Component (PC) space within sliding windows. However, ASR performance is critically sensitive to its threshold parameter: an incorrect threshold risks removing task-relevant neural features alongside artifacts. Furthermore, since PCs are linear combinations of all channels, subspace reconstruction in PC space may alter the underlying data structure, potentially discarding essential neural information. To address these limitations, we propose nASR, a novel end-to-end trainable Keras layer that jointly optimizes artifact rejection and downstream decoding. nASR introduces two trainable threshold parameters: K, which governs artifact detection in PC variance space, and L, which quantifies eigen-spread to pinpoint the primary artifact-contributing channels, enabling selective channel-level reconstruction that preserves clean channel information. An ablation study comprising five model variants (m01 to m05), evaluated across two subjects from the BCI Competition IV Dataset 1, confirms that nASR variants consistently outperform traditional ASR on test classification metrics, while achieving a 6-8x reduction in inference time, making nASR a strong candidate for real-time BCI applications demanding both low latency and high decoding performance.

2605.14940 2026-05-15 cs.LG cs.AI eess.SP

Not All Symbols Are Equal: Importance-Aware Constellation Design for Semantic Communication

Albert Shaju, Christo Kurisummoottil Thomas, Mayukh Roy Chowdhury

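AI summary: This paper studies symbol constellation design for semantic communication and proposes an importance-aware joint semantic-physical layer framework. It extracts discrete semantic concepts, scores their semantic criticality, and uses deep reinforcement learning to select the transmitted symbols dynamically, yielding a semantic-aware constellation mapping at the physical layer. The work introduces a semantic symbol vulnerability metric and a semantic protection probability, proves that conventional Gray-coded constellations are suboptimal when semantic importance is non-uniform, and validates the approach at high spectral efficiency on several datasets.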

Comments: Submitted to IEEE GLOBECOM 2026. 6 pages, 8 figures

Abstract

Semantic communication systems for goal-oriented transmission must protect task-relevant information not only through source compression but also via physical layer mapping. Existing approaches decouple constellation design and semantic encoding, exposing critical symbols to channel errors at the same rate as irrelevant ones. Contrary to this, in this paper, a joint semantic-physical layer framework is proposed, which is composed of a vector quantized-variational autoencoder that extracts discrete latent concepts, a semantic criticality indicator (SCI) that scores each concept by task relevance, and a deep reinforcement learning agent that dynamically selects the transmission subset based on instantaneous channel conditions. At the physical layer, a learned semantic-aware M-QAM constellation assigns symbol positions according to joint co-occurrence statistics and SCI scores, departing from the uniform spacing and Gray coding of standard M-QAM which minimizes average BER without regard for semantic content. We introduce a novel semantic symbol vulnerability (SSV) metric and a semantic protection probability (SPP) to quantify the exposure of task-critical symbols to decoding errors, and prove that any Gray-coded constellation is strictly suboptimal in SCI-Weighted SSV whenever the source exhibits non-uniform semantic importance and co-occurrence statistics. Simulation results demonstrate that the proposed constellation achieves near 100% SPP across modulation orders from 4-QAM to 1024-QAM versus 50% for standard constellations at high spectral efficiency, a 21:1 compression ratio with semantic quality above 0.9, generalizing across MNIST, Fashion-MNIST, and FSDD without modification.
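
The abstract contrasts its learned constellation with the Gray coding of standard M-QAM. As background, the sketch below shows the standard binary-reflected Gray code, under which adjacent levels differ in exactly one bit regardless of what the bits mean; it illustrates the baseline only, not the paper's semantic-aware mapping.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Gray labels for 16 consecutive levels (4 bits per label).
codes = [to_gray(i) for i in range(16)]
neighbour_distances = [hamming(codes[i], codes[i + 1]) for i in range(15)]
```

Every entry of `neighbour_distances` is 1, which is why a nearest-neighbour symbol error typically flips a single bit. That uniform treatment is the behaviour the abstract argues against when symbols differ in semantic importance.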

2605.14919 2026-05-15 eess.SP

Transmit Beamforming for High-Rate Underwater Acoustic Communications

Diego A. Cuji, Andrew C. Singer, Milica Stojanovic

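AI summary: This paper studies transmit beamforming for high-rate underwater acoustic communications, with the aim of reducing reliance on full prior channel knowledge. Exploiting the geometrically stable components of the propagation field, it proposes an angle-based beamforming strategy suited to scenarios with a dominant propagation path that is relatively stable over time. Experimental results show strong performance in data-detection mean squared error and bit error rate.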

2605.14883 2026-05-15 eess.SP cs.HC cs.LG

BCI-Based Assessment of Ocular Response Time Using Dynamic Time Warping Leveraging an RDWT-Driven Deep Neural Framework

Shantanu Sarkar, Sai Shashank Gandavarapu, Jeff Feng, Saurabh Prasad, Reza Khanbabaie, Jose L. Contreras-Vidal

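AI summary: This study proposes a brain-computer interface (BCI)-based method for assessing ocular response time to aid early diagnosis of mild traumatic brain injury (mTBI). It combines electroencephalography (EEG) with augmented-reality (AR)-guided Vestibular/Ocular Motor Screening (VOMS) tasks, processes the EEG signals with a deep neural framework driven by the Redundant Discrete Wavelet Transform (RDWT), and computes ocular response times with Dynamic Time Warping (DTW). The results show significant inter-subject differences in oculomotor behaviour, with pursuit tasks proving especially informative for timing differences, pointing to a new route for multimodal mTBI assessment.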

Comments: Submitted to IEEE SMC 2026 (under review)

Abstract

Mild traumatic brain injury (mTBI) is a prevalent condition that remains difficult to diagnose in its early stages. Oculomotor dysfunction is a well-established marker of mTBI, motivating the development of portable tools that capture both eye-movement behavior and underlying neurophysiology. In this work, we present an initial framework that integrates electroencephalogram (EEG) with augmented-reality (AR)-based Vestibular/Ocular Motor Screening (VOMS) tasks to estimate subject-specific ocular response times. Pre-processed EEG signals, obtained through band-pass filtering and average referencing, are analyzed using a Redundant Discrete Wavelet Transform (RDWT)-driven deep neural framework. The RDWT coefficients are subjected to trainable zero-phase convolutional filtering and reconstructed into the time domain via inverse RDWT, followed by channel-wise temporal and spatial filtering using 2D convolution layers and convolutional-LSTM-based decoding. An ablation study demonstrates that wavelet-domain filtering serves as an effective denoising strategy, improving prediction performance. Sliding-window predictions were validated using Pearson correlation (>= 0.5), and Dynamic Time Warping (DTW) was subsequently used to estimate ocular response times. DTW-derived metrics revealed significant inter-subject differences across all VOM tasks, supported by Mann-Whitney U tests. Cross-correlation analysis further revealed task-dependent temporal behaviors: pursuit tasks exhibited reactive tracking, whereas saccades showed anticipatory responses. Overall, the results highlight pursuit tasks as particularly informative for distinguishing timing differences and demonstrate the potential of RDWT-based EEG features combined with DTW metrics for multimodal mTBI assessment.
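
The abstract uses Dynamic Time Warping (DTW) to estimate ocular response times. A minimal sketch of the classic DTW distance between two 1-D sequences; the stimulus and response traces are made up, and the abstract does not say how the paper turns the alignment into a response time, so this shows only the distance computation:

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical traces: the response is the stimulus delayed by one sample.
stimulus = [0, 0, 1, 2, 1, 0, 0]
response = [0, 0, 0, 1, 2, 1, 0]
pointwise = sum(abs(s - r) for s, r in zip(stimulus, response))
warped = dtw_distance(stimulus, response)
```

The pointwise comparison penalizes the delay, while DTW absorbs it by warping the time axis, which is what makes it suitable for comparing signals that differ mainly in timing.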

2605.14855 2026-05-15 cs.LG cs.AI eess.SP

Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers

Lukas Schelenz, Shobha Rajanna, Denis Gosalci, Lucas Heublein, Jonas Pirkl, Jonathan Ott, Felix Ott, Christopher Mutschler, Tobias Feigl

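AI summary: This paper studies how to exploit hidden context in dynamic movement forecasting, tracing the progression from recurrent neural networks to graph neural networks and general-purpose Transformers. It compares several machine learning methods for predicting the dynamic trajectories of NBA players and finds that a hybrid LSTM augmented with contextual information achieves the lowest final displacement error, outperforming graph attention networks, Transformers, and other models. The experiments show that the models trade off prediction accuracy, generalizability, and training efficiency differently, so the model for trajectory prediction in fast-paced dynamic environments should be chosen for the task at hand.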

DOI: 10.1109/PLANS61210.2025.11028353
Journal ref: IEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025
Comments: 12 pages

Abstract

Forecasting within signal processing pipelines is crucial for mitigating delays, particularly in predicting the dynamic movements of objects such as NBA players. This task poses significant challenges due to the inherently interactive and unpredictable nature of sports, where abrupt changes in velocity and direction are prevalent. Traditional approaches, including (S)ARIMA(X), Kalman filters (KF), and Particle filters (PF), often struggle to model the non-linear dynamics present in such scenarios. Machine learning (ML) methods, such as long short-term memory (LSTM) networks, graph neural networks (GNNs), and Transformers, offer greater flexibility and accuracy but frequently fail to explicitly capture the interplay between temporal dependencies and contextual interactions, which are critical in chaotic sports environments. In this paper, we evaluate these models and assess their strengths and weaknesses. Experimental results reveal key performance trade-offs across input history length, generalizability, and the ability to incorporate contextual information. ML-based methods demonstrated substantial improvements over linear models across forecast horizons of up to 2s. Among the tested architectures, our hybrid LSTM augmented with contextual information achieved the lowest final displacement error (FDE) of 1.51m, outperforming temporal convolutional neural network (TCNN), graph attention network (GAT), and Transformers, while also requiring less data and training time compared to GAT and Transformers. Our findings indicate that no single architecture excels across all metrics, emphasizing the need for task-specific considerations in trajectory prediction for fast-paced, dynamic environments such as NBA gameplay.
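
The abstract reports results as final displacement error (FDE). A minimal sketch of that metric for a single 2-D trajectory, assuming positions in metres and the usual definition as the Euclidean distance between the last predicted and last true position; the trajectories are made up:

```python
import math

def final_displacement_error(pred, truth):
    """Euclidean distance between the final predicted and final true position."""
    (px, py), (tx, ty) = pred[-1], truth[-1]
    return math.hypot(px - tx, py - ty)

# Made-up trajectories: the forecast ends 3 m short in x and 4 m short in y.
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (5.0, 6.0)]
fde = final_displacement_error(pred, truth)
```

Only the endpoint matters for FDE, so two forecasts with very different paths can score the same if they end at the same place.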

2605.14839 2026-05-15 cs.LG eess.SP

GenAI for Energy-Efficient and Interference-Aware Compressed Sensing of GNSS Signals on a Google Edge TPU

Thorben Wegner, Lucas Heublein, Tobias Feigl, Felix Ott, Christopher Mutschler, Alexander Rügamer

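AI summary: This paper proposes a generative AI (GenAI) method for efficient compression and interference classification of global navigation satellite system (GNSS) signals on a Google Edge TPU. It uses variational autoencoders (VAEs) to compress signals directly at the receiver while identifying jamming and spoofing attacks in real time, which substantially lowers the energy spent on data transmission and processing. Experiments show a compression ratio above 42x with signal characteristics preserved, and accurate classification of about 72 interference types on the reconstructed signals, offering an efficient and practical solution for GNSS interference mitigation.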

DOI: 10.1109/PLANS61210.2025.11028177
Journal ref: IEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025
Comments: 12 pages

Abstract

Traditional methods for classifying global navigation satellite system (GNSS) jamming signals typically involve post-processing raw or spectral data streams, requiring complex and costly data transmission to cloud-based interference classification systems. In contrast, our proposed approach efficiently compresses GNSS data streams directly at the hardware receiver while simultaneously classifying jamming and spoofing attacks in real time. Given the growing prevalence of GNSS jamming, there is a critical need for real-time solutions suitable for power-constrained environments. This paper introduces a novel method for compressing and classifying GNSS jamming threats using generative artificial intelligence (GenAI), specifically variational autoencoders (VAEs), deployed on Google Edge tensor processing units (TPUs). The study evaluates various autoencoder (AE) architectures to compress and reconstruct GNSS signals, focusing on preserving interference characteristics while minimizing data size near the receiver hardware. The pipeline adapts large-scale AE models for Google Edge TPUs through 8-bit quantization to ensure energy-efficient deployment. Tests on raw in-phase and quadrature-phase (IQ) data, Fast Fourier Transform (FFT) data, and handcrafted features show the system achieves significant compression (>42x) and accurate classification of approximately 72 interference types on reconstructed signals (F2-score 0.915), closely matching the original signals (F2-score 0.923). The hardware-centric GenAI approach also substantially reduces jammer signal transmission costs, offering a practical solution for interference mitigation. Ablation studies on conditional and factorized VAEs (i.e., FactorVAE) explore latent feature disentanglement for data generation, enhancing model interpretability and fostering trust in machine learning (ML) solutions for sensitive interference applications.
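
The abstract reports classification quality as an F2-score. A minimal sketch of the general F-beta formula for a single class, with beta = 2 weighting recall more heavily than precision; the counts are made up, and the abstract does not say how the score is averaged over the roughly 72 classes:

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Made-up counts for one interference class.
f2 = f_beta(tp=80, fp=20, fn=10)            # recall-weighted
f1 = f_beta(tp=80, fp=20, fn=10, beta=1.0)  # balanced, for comparison
```

Here recall (about 0.889) is higher than precision (0.8), so F2 comes out above F1. Weighting recall suits settings where a missed jammer is costlier than a false alarm.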

2605.14837 2026-05-15 eess.SP

Making AFDM Secure Against Eavesdroppers: A Phase Function Design Approach

Hengxuan Liu, Vincent Savaux, Arman Farhang

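AI summary: This paper studies how phase function design can improve the physical-layer security of affine frequency division multiplexing (AFDM) waveforms against eavesdroppers. The authors propose a general phase function design method that strengthens AFDM's resistance to eavesdropping by raising the complexity of brute-force demodulation for the eavesdropper. Results show that the method markedly improves AFDM's physical-layer security performance.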

2605.14788 2026-05-15 eess.SY cs.SY

Hybrid Metaheuristic Optimization of Distributed Control System Hardware Architecture with Model-Based Verification

Ruslan Zakirzyanov

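AI summary: This paper addresses the combinatorial optimization problem of designing an efficient and reliable hardware architecture for a distributed control system when some parameters are uncertain. The author proposes a formal, model-based architecture synthesis method and develops a hybrid ant colony metaheuristic framework that constructs feasible hierarchical architectures. A case study on the control system of a large sulfuric acid plant shows that the method meets structural and dynamic performance requirements and is feasible in engineering practice.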

2605.14766 2026-05-15 cs.CL cs.AI eess.AS

Streaming Speech-to-Text Translation with a SpeechLLM

Titouan Parcollet, Shucong Zhang, Xianrui Zheng, Rogier C. van Dalen

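AI summary: This paper presents a real-time streaming speech-to-text translation system based on a large language model (LLM), targeting the slow response of existing SpeechLLM systems in practical use. The model not only generates the translation but also decides whether it has received enough audio to emit output, enabling more efficient streaming. Experiments show that the system cuts latency to 1-2 seconds while keeping translation quality close to the non-streaming baseline.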

2605.14741 2026-05-15 eess.SY cs.AI cs.SY

Addressing Terminal Constraints in Data-Driven Demand Response Scheduling

Maximilian Bloor, Martha White, Ehecatl Antonio del Rio Chanona, Calvin Tsay

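AI summary: This paper studies how to satisfy terminal constraints in data-driven demand response scheduling. It proposes combining Goal-Space Planning (GSP) with Deep Deterministic Policy Gradient (DDPG): a temporally abstract model over discrete subgoals is learned and used to propagate long-horizon value, which improves scheduling. On a simulated air separation system, the method improves sample efficiency and satisfaction of terminal storage constraints, easing the difficulty conventional methods have with long-horizon constraints.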

2605.14720 2026-05-15 eess.SP

Joint Phase Noise and Channel Estimation for OTFS

Stephen McWade, Arman Farhang

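AI summary: This paper studies the impact of oscillator phase noise on orthogonal time frequency space (OTFS) systems, analyses the interference that phase noise causes in the delay-Doppler domain, and derives signal-to-interference-plus-noise ratio (SINR) expressions for three types of oscillator. It finds that OTFS is highly sensitive to phase noise and that existing estimation methods, which account only for the common phase error (CPE), cannot suppress inter-Doppler interference (IDI). The paper therefore proposes a Wiener filter-based joint channel and phase noise estimation method that exploits the statistics of both the phase noise and the Doppler-spread channel; simulations show bit error rate (BER) gains of up to 8 dB over existing methods.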

2605.14688 2026-05-15 eess.SY cs.SY

Dynamic Event-Triggered Control of Discrete-Time Nonlinear Systems based on Difference-Algebraic Representations

Vitoriano Casas, Gabriela Reis, Pedro Henrique Coutinho, Iury Bessa, Rodrigo Araújo

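AI summary: This paper studies dynamic event-triggered control of discrete-time nonlinear systems based on difference-algebraic representations (DAR) and proposes an approach built on gain-scheduled controllers. The system nonlinearities and the asynchronous terms induced by event-triggered sampling are incorporated into both the control law and the triggering function. This yields less conservative co-design conditions for jointly designing the gain-scheduled control law and the dynamic triggering mechanism while guaranteeing asymptotic stability of the closed loop. Optimization problems are also formulated to reduce the number of triggered events and enlarge the estimated region of attraction, and numerical examples validate the approach.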

2605.14683 2026-05-15 cs.RO cs.SY eess.SY

SeaVis: Modeling and Control of a Remotely Operated Towed Vehicle for Seabed Visualization and Mapping

Abdelhakim Amer, Aske Alstrup, Frederik Rasmussen, Yury Brodskiy, Andriy Sarabakha, Erdal Kayacan

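AI summary: This paper presents a new mathematical model of SeaVis, a remotely operated towed vehicle for seabed visualization and mapping, and designs a gain-scheduled linear quadratic regulator (LQR) for robust depth and attitude control. High-fidelity simulations show that the LQR controller outperforms a conventional PID controller in disturbance rejection, control efficiency, and control-surface actuation, and that it performs well across the full operating speed range. The work offers an effective control method for precise, stable operation of underwater vehicles.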
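
The summary above mentions a gain-scheduled LQR. A minimal sketch of the idea on a deliberately simplified scalar plant x[k+1] = a*x[k] + b*u[k], where the control effectiveness b is assumed to grow with tow speed. The plant, the numbers, and the speed grid are hypothetical and unrelated to the SeaVis model; only the Riccati recursion itself is standard.

```python
def scalar_dlqr(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati equation.

    Returns (k, p) for the feedback law u = -k * x, where p is the cost-to-go weight.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p

# Hypothetical schedule: one gain per tow speed, with b = 0.1 * speed (made-up model).
speeds = (1.0, 2.0, 3.0)
schedule = {v: scalar_dlqr(a=1.0, b=0.1 * v, q=1.0, r=1.0)[0] for v in speeds}

def scheduled_control(x, speed):
    """Pick the gain designed for the nearest scheduled speed and apply u = -k * x."""
    nearest = min(schedule, key=lambda v: abs(v - speed))
    return -schedule[nearest] * x
```

For a = 1, b = 0.1, q = r = 1 the Riccati fixed point satisfies p^2 - p - 100 = 0, so p is about 10.51. Scheduling amounts to a lookup table of gains indexed by operating speed, each designed for the local model at that speed.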

2605.14677 2026-05-15 eess.IV

An Attention-Enhanced Network with Joint Dehazing and Retinex-Based Enhancement for Underwater Images

Sahana Ray, Bibhabasu Debnath, Sanjay Ghosh

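AI summary: To address visibility loss in underwater images caused by light absorption, scattering, and suspended particles, this paper proposes ADR, a three-stage network that combines dehazing with Retinex-based enhancement. It extends the underwater image formation model and refines the result with an attention-enhanced U-Net++, improving underwater image quality. Experiments show that the method outperforms existing state-of-the-art methods on several benchmark datasets.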

2605.14650 2026-05-15 eess.SP

Multimodal Learning for MIMO Beam Prediction Based on Variational Inference

Zijian Zheng, Wenqiang Yi, Hyundong Shin, Arumugam Nallanathan

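AI summary: This paper studies a multimodal learning method based on variational inference for improving beam prediction accuracy in massive MIMO systems. By decoupling feature extraction from cross-modal semantic alignment, it reduces the high cost of acquiring multimodal data, and a two-stage training strategy improves data efficiency and robustness. Experiments show that the framework achieves competitive prediction accuracy and high reliability with only 20% of the multimodal data required by conventional end-to-end methods.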

2605.14642 2026-05-15 eess.SY cs.SY math.OC

Distributionally Robust Model Predictive Control for Virtual Power Plants

Nikolas Recke, Mathias Hudoba de Badyn

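AI summary: This paper proposes a distributionally robust model predictive control (DRMPC) framework for optimal operation of virtual power plants under electricity price uncertainty. It combines data-driven forecasting with quantile-based uncertainty quantification to build time-dependent Wasserstein ambiguity sets that adapt to forecast bias and distribution shift, and embeds them directly in real-time decision-making. Results show that, with a suitably chosen ambiguity radius, DRMPC improves economic performance over conventional forecast-based MPC, with consistent profit gains across seasonal scenarios.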
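
The summary above refers to Wasserstein ambiguity sets with a chosen radius. A minimal sketch of the underlying distance in the simplest setting, two equal-size empirical samples on the real line, where the 1-Wasserstein distance is the mean absolute difference of the sorted samples. The price-error samples and the radius are made up, and the paper's time-dependent construction is not reproduced here.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples in 1-D."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def in_ambiguity_set(candidate, reference, radius):
    """True if the candidate sample lies within the Wasserstein ball around reference."""
    return wasserstein_1d(candidate, reference) <= radius

# Hypothetical forecast errors and a copy shifted by a constant bias of 0.5.
reference = [-2.0, 0.0, 1.0, 3.0]
shifted = [-1.5, 0.5, 1.5, 3.5]
distance = wasserstein_1d(reference, shifted)
inside = in_ambiguity_set(shifted, reference, radius=1.0)
```

A constant shift of 0.5 gives a distance of exactly 0.5, so the shifted sample sits inside a ball of radius 1.0. A larger radius hedges against more distribution shift at the price of more conservative decisions.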

2605.14629 2026-05-15 eess.IV cs.CV

Efficient Dense Matching for Enhanced Gaussian Splatting Using AV1 Motion Vectors

Julien Zouein, Vibhoothi Vibhoothi, François Pitié, Anil Kokaram

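AI summary: This paper proposes an efficient dense matching method based on AV1 motion vectors to improve the quality of the initial point cloud for 3D Gaussian Splatting (3DGS). Using the motion vectors from the AV1 video codec avoids the time-consuming exhaustive matching of conventional structure-from-motion (SfM), which significantly reduces computation and increases point cloud density. Experiments show that the method produces eight times as many points as conventional SfM, improving 3DGS reconstruction accuracy and training efficiency.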

2605.14510 2026-05-15 eess.SP

Antenna Tilt Failure Detection and Estimation via Integrated Sensing and Communications

Samed Kesir, Batuhan Kaplan, Emre Arslan, Ahmet Faruk Coskun

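AI summary: This paper addresses the sensitivity of narrow-beam communication systems to physical alignment errors and proposes a sensorless antenna tilt failure detection and estimation method based on integrated sensing and communications (ISAC). The method uses static environmental clutter as geometric reference points and monitors systematic gain changes in the clutter heatmap to detect and estimate antenna tilt accurately; it is implemented on the standard 5G NR frame structure with two different waveforms. Results show that the framework enables autonomous network maintenance and self-healing without external sensors.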