arXivDaily: arXiv daily academic digest, updated Monday to Friday
2605.15135 2026-05-15 eess.SP cs.IT math.IT

Deep Mixture of Experts Network for Resource Optimization in Aerial-Terrestrial CF-mMIMO Systems under URLLC

Donggen Li, Chong Huang, Jingfu Li, Pei Xiao, Wenjiang Feng, Dusit Niyato, Zhu Han

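AI summary: This paper studies resource allocation for hybrid aerial-terrestrial cell-free massive MIMO (CF-mMIMO) systems under ultra-reliable and low-latency communication (URLLC). To counter channel aging caused by high mobility, the authors propose a Transformer-based channel prediction network (CP-Net) and design a deep mixture of experts (MoE) network (MoE-Net) for uplink power allocation, with a weighted gating network (WT-Net) that adaptively combines the expert models. The method improves communication performance and resource efficiency under URLLC constraints.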

Comments
15 pages, accepted for publication in IEEE Transactions on Wireless Communications
Abstract

As a critical component of sixth-generation (6G) wireless networks, ultra-reliable and low-latency communication (URLLC) is expected to support real-time and reliable information exchange in low-altitude environments. However, achieving URLLC often incurs significant resource overhead, including increased bandwidth consumption, higher transmit power, and denser access point (AP) deployment, which pose significant challenges to both spectral efficiency (SE) and energy efficiency (EE). Besides, existing iterative optimization algorithms are computationally intensive and struggle to meet the latency requirements of URLLC. To address these challenges, we propose a hybrid aerial-terrestrial cell-free massive MIMO (CF-mMIMO) network to support diverse services, along with a channel prediction network and a deep mixture of experts (MoE) network for uplink optimization. First, we design a channel prediction network (CP-Net) to mitigate channel aging caused by high-mobility user equipment (UE). CP-Net employs three Transformer-based sub-networks for aged channel state information (CSI) prediction, while a channel quality-aware loss function is introduced to improve the prediction accuracy of weak links. Based on the predicted CSI, we develop a deep MoE network (MoE-Net) for power allocation comprising three expert models targeting different objectives. Then, we introduce a weighted gating network (WT-Net) to learn an efficient adaptive combination of expert outputs. The proposed framework better captures heterogeneous UE requirements and improves communication performance under URLLC constraints. Numerical results demonstrate the effectiveness of the proposed method.
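
The gated combination of expert outputs described above can be illustrated with a minimal Python sketch. It assumes a softmax gate and a simple per-UE power clip; the function names, the three expert roles, and the power budget `p_max` are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_experts(expert_powers, gate_scores, p_max):
    """Blend per-expert power allocations with softmax gate weights.

    expert_powers: one list of per-UE transmit powers per expert.
    gate_scores:   one raw score per expert (hypothetical gating output).
    p_max:         per-UE power budget used to clip the blended result.
    """
    weights = softmax(gate_scores)
    num_ues = len(expert_powers[0])
    blended = []
    for ue in range(num_ues):
        power = sum(w * expert[ue] for w, expert in zip(weights, expert_powers))
        blended.append(min(max(power, 0.0), p_max))
    return weights, blended

# Three hypothetical experts, each proposing powers for two UEs.
experts = [[0.9, 0.8], [0.2, 0.3], [0.6, 0.6]]
weights, powers = combine_experts(experts, gate_scores=[0.0, 0.0, 0.0], p_max=1.0)
```

With equal gate scores the blend is a plain average of the experts; a trained gate would shift the weights per channel realization.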

2605.15122 2026-05-15 cs.RO cs.LG cs.SY eess.SY

CoCo-InEKF: State Estimation with Learned Contact Covariances in Dynamic, Contact-Rich Scenarios

Michael Baumgartner, David Müller, Agon Serifi, Ruben Grandia, Espen Knoop, Markus Gross, Moritz Bächer

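AI summary: This paper proposes CoCo-InEKF, a filtering method for robust state estimation of legged robots in dynamic, contact-rich scenarios. It replaces conventional binary contact states with learned contact covariances, which capture partial contact and directional slippage more precisely. Experiments on a bipedal robot show a better accuracy-efficiency tradeoff for velocity estimation and improved filter consistency, supporting stable execution of demanding motions such as dancing and complex ground interactions.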

Comments
RSS 2026
Abstract

Robust state estimation for highly dynamic motion of legged robots remains challenging, especially in dynamic, contact-rich scenarios. Traditional approaches often rely on binary contact states that fail to capture the nuances of partial contact or directional slippage. This paper presents CoCo-InEKF, a differentiable invariant extended Kalman filter that utilizes continuous contact velocity covariances instead of binary contact states. These learned covariances allow the method to dynamically modulate contact confidence, accounting for more nuanced conditions ranging from firm contact to directional slippage or no contact. To predict these covariances for a set of predefined contact candidate points, we employ a lightweight neural network trained end-to-end using a state-error loss. This approach eliminates the need for heuristic ground-truth contact labels. In addition, we propose an automated contact candidate selection procedure and demonstrate that our method is insensitive to their exact placement. Experiments on a bipedal robot demonstrate a superior accuracy-efficiency tradeoff for linear velocity estimation, as well as improved filter consistency compared to baseline methods. This enables the robust execution of challenging motions, including dancing and complex ground interactions, both in simulation and in the real world.
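
How a continuous covariance can stand in for a binary contact flag is easiest to see in a scalar Kalman update. The sketch below is a plain scalar Kalman filter, not the invariant EKF of the paper, and the variance values are made up for illustration: a small measurement variance behaves like firm contact, a large one like slipping or no contact.

```python
def kalman_update(x, p, z, r):
    """Scalar Kalman measurement update.

    x, p: prior state estimate and its variance.
    z, r: measurement and its variance (standing in for a learned
          contact velocity covariance; large r means low contact confidence).
    """
    gain = p / (p + r)
    x_new = x + gain * (z - x)
    p_new = (1.0 - gain) * p
    return x_new, p_new, gain

prior_velocity, prior_var = 1.0, 0.5
# Zero-velocity contact measurement, trusted strongly (firm contact).
firm = kalman_update(prior_velocity, prior_var, z=0.0, r=1e-3)
# Same measurement, barely trusted (slipping or airborne foot).
slip = kalman_update(prior_velocity, prior_var, z=0.0, r=1e3)
```

The gain moves smoothly between these two extremes as the variance changes, which is what lets a learned covariance express partial contact.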

2605.15086 2026-05-15 eess.IV eess.SP

FaSST: Fast Sparsifying Secondary Transform

Darukeesan Pakiyarajah, Samuel Fernández-Menduiña, Eduardo Pavez, Antonio Ortega, Debargha Mukherjee

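AI summary: This paper proposes FaSST, a fast sparsifying secondary transform framework that aims to improve residual coding efficiency in video coding while lowering computational complexity. The method factorizes data-driven sparse orthonormal transforms into a sequence of Givens rotations and determines them efficiently with an alternating minimization strategy, improving on the conventional low-frequency non-separable transform (LFNST). Experiments show that FaSST substantially reduces computation at the same rate-distortion performance and achieves higher coding gains in certain modes.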

Comments
6 pages, 5 figures, Accepted in ICIP 2026
Abstract

Data-dependent secondary transforms, which aim to decorrelate coefficients of a separable primary transform, can improve residual coding efficiency; however, their deployment is often constrained by computational complexity. Recent video codecs use variants of the low-frequency non-separable transform (LFNST), which discards some high-frequency secondary transform coefficients, limiting achievable coding gains. Moreover, existing data-dependent secondary transforms lack explicit rate-distortion (RD) optimal design criteria. In this work, we propose a framework for designing low-complexity data-dependent secondary transforms, termed Fast Sparsifying Secondary Transforms (FaSSTs). Our approach approximates data-driven sparse orthonormal transforms (SOTs) by factorizing them into a sequence of Givens rotations. The rotations are efficiently determined using an alternating minimization strategy combined with an approximate Givens factorization procedure. Our method adapts the number of rotations based on the prediction mode, further reducing computational complexity. We design mode-dependent secondary transforms for intra-prediction residuals in AV2 using FaSST. Experimental results show that mode-adaptive FaSST matches the RD performance of LFNST while reducing the number of computations by 83.67%. Moreover, by avoiding fixed-coefficient truncation, FaSST achieves up to 1.80% BD-rate savings relative to LFNST while operating at 66.24% lower complexity.
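
The appeal of a Givens factorization is that each rotation touches only two coefficients. The following Python sketch applies a rotation sequence to a toy coefficient vector; the angle is hand-picked here to zero one coefficient, whereas the paper determines its rotations from data, and the helper names are hypothetical.

```python
import math

def apply_givens(vec, rotations):
    """Apply Givens rotations (i, j, theta) in order to a copy of vec."""
    out = list(vec)
    for i, j, theta in rotations:
        c, s = math.cos(theta), math.sin(theta)
        xi, xj = out[i], out[j]
        out[i] = c * xi - s * xj   # 2 multiplies, 1 add
        out[j] = s * xi + c * xj   # 2 multiplies, 1 add
    return out

def energy(vec):
    """Sum of squares; unchanged by any orthonormal transform."""
    return sum(v * v for v in vec)

coeffs = [3.0, 4.0, 0.0, 0.0]                  # toy primary-transform coefficients
rotations = [(0, 1, -math.atan2(4.0, 3.0))]    # angle chosen to zero coeffs[1]
sparse = apply_givens(coeffs, rotations)
multiplies = 4 * len(rotations)                # versus 16 for a dense 4x4 matrix
```

Because every rotation is orthonormal, the product of any number of them is too, so the energy of the coefficient vector is preserved while it is concentrated into fewer entries.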

2605.15049 2026-05-15 cs.RO cs.MA cs.SY eess.SY

A Prototyping Framework for Distributed Control of Multi-Robot Systems

Junaid Ahmed Memon, Allan Andre Do Nascimento, Kostas Margellos, Antonis Papachristodoulou

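AI summary: This paper presents a prototyping framework for distributed control of multi-robot systems that aims to connect theoretical work on distributed optimization algorithms with practical testing. Built on the Single Program, Multiple Data (SPMD) paradigm, the framework emulates distributed control on a single computer, with each core running the same algorithm using local states and neighbour-to-neighbour communication. A non-cooperative game-theoretic algorithm applied to a quadrotor position-swapping task validates the framework across dynamics levels (a point-mass model, a high-fidelity quadrotor model, and a hardware testbed), showing that it offers low-cost, accessible algorithm validation.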

Comments
Accepted at IFAC World Congress 2026
Abstract

This paper presents a prototyping framework for distributed control of multi-robot systems, aimed at bridging theory and practical testing of distributed optimization algorithms. Using the Single Program, Multiple Data (SPMD) paradigm, the framework emulates distributed control on a single computer, with each core running the same algorithm using local states and neighbour-to-neighbour communication. We demonstrate the framework on a four-quadrotor position-swapping task using a non-cooperative game-theoretic distributed algorithm. Computational time and trajectory data are compared across the supported dynamics levels: a point-mass model, a high-fidelity quadrotor model, and an experimental hardware testbed using Crazyflie quadcopters. The results show that the framework provides a low-cost and accessible approach for validating distributed algorithms.

2605.15044 2026-05-15 cs.SD cs.AI cs.LG cs.MM eess.AS

SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning

KiHyun Nam, Jungwoo Heo, Siu Bae, Ha-Jin Yu, Joon Son Chung

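AI summary: As physical AI, conversational robots, and screenless wearables develop, audio large language models need speaker-specific understanding to support user authorization, personalization, and context-aware interaction. This paper proposes SpeakerLLM, a speaker-specialized audio-LLM framework that unifies single-utterance speaker profiling, recording-condition understanding, utterance-pair speaker comparison, and evidence-based verification reasoning. At its core is a hierarchical speaker tokenizer that captures speaker identity and recording-condition information at multiple granularities, and structured reasoning traces improve the accuracy and interpretability of verification reasoning.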

Abstract

As audio-first agents become increasingly common in physical AI, conversational robots, and screenless wearables, audio large language models (audio-LLMs) must integrate speaker-specific understanding to support user authorization, personalization, and context-aware interaction. This requires modeling who is speaking, how the voice sounds, and how recording conditions affect speaker cues. Conventional speaker verification systems provide strong scalar scores but little linguistic evidence, while current audio-LLMs and speaker-aware language models have limited ability to organize speaker information beyond binary labels or descriptive profiles. We present SpeakerLLM, a speaker-specialized audio-LLM framework that unifies single-utterance speaker profiling, recording-condition understanding, utterance-pair speaker comparison, and evidence-organized verification reasoning within a natural-language interface. We construct verification-reasoning targets and a decision-composition policy that separate profile-level evidence from the final same-or-different decision and organize recording condition, profile evidence, and the decision into a structured trace. At its core, SpeakerLLM uses a hierarchical speaker tokenizer designed to capture multiple granularities of speaker evidence. Utterance-level speaker embeddings summarize identity and profile-level cues, whereas frame-level speaker features preserve fine-grained acoustic descriptors. Experiments show that SpeakerLLM-Base improves speaker-profile and recording-condition understanding over general audio-LLMs, while SpeakerLLM-VR preserves strong generated-verdict accuracy and produces decision traces grounded in the supervised verification reasoning schema. We will release the metadata-enriched supervision dataset and target-construction code for reproducibility.

2605.15032 2026-05-15 eess.SP cs.LG

Multi-Block Attention for Efficient Channel Estimation in IRS-Assisted mmWave MIMO

Mehrdad Momen-Tayefeh, Mehrshad Momen-Tayefeh, Maryam Sabbaghian

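AI summary: This paper studies efficient channel estimation in intelligent reflecting surface (IRS)-assisted mmWave MIMO systems and proposes a deep learning-based Multi-Block Attention (MBA) framework that lowers training overhead and improves estimation accuracy. The method selectively deactivates IRS elements and uses a two-stage network, one stage for spatial correlation recovery and one for noise suppression, which reduces error propagation in channel estimation. Experiments show that MBA substantially reduces pilot overhead and improves channel estimation performance while keeping computational complexity low.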

Journal ref
IEEE Transactions on Communications, vol. 73, no. 12, pp. 13891-13903, Dec. 2025
Abstract

Intelligent Reflecting Surfaces (IRSs) are a promising technology for enhancing the spectral and energy efficiency of millimeter-wave (mmWave) multiple-input multiple-output (MIMO) systems. In these systems, accurate channel estimation remains challenging due to the passive nature of IRS elements and the high pilot overhead in large-scale deployments. This paper presents a deep learning-based Multi-Block Attention (MBA) framework for efficient cascaded channel estimation in IRS-assisted mmWave MIMO systems that utilize orthogonal frequency division multiplexing (OFDM). First, we show the optimality of the discrete Fourier transform (DFT) and Hadamard matrices as phase configurations for least squares (LS) estimation. To reduce training overhead, we selectively deactivate IRS elements and compensate for induced feature loss using a two-stage architecture: (i) a Convolutional Attention Network (CAN) for spatial correlation recovery and (ii) a Complex Multi-Convolutional Network (CMN) for noise suppression. The MBA architecture mitigates error propagation through attention-guided feature refinement and denoising. Simulation results indicate that the MBA method reduces pilot overhead by up to 87% compared to the LS estimator. Additionally, at signal-to-noise ratios of 10 dB, our proposed method achieves approximately 51% lower normalized mean squared error (NMSE) than leading methods. It also maintains low computational complexity and adapts effectively to various propagation environments.
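
Why a DFT matrix is convenient as an IRS phase configuration for LS estimation can be shown in a few lines. The sketch assumes a simplified noiseless model in which pilot t observes the sum of per-element cascaded channel gains weighted by the phases in row t; the model and names are illustrative assumptions, not the paper's system model.

```python
import cmath

def dft_phase_matrix(n):
    """N x N DFT matrix with unit-modulus entries; row t is the phase pattern for pilot t."""
    return [[cmath.exp(-2j * cmath.pi * t * k / n) for k in range(n)] for t in range(n)]

def ls_estimate(phases, received):
    """LS estimate for received = phases @ h, using phases^H phases = N * I."""
    n = len(phases)
    return [
        sum(phases[t][k].conjugate() * received[t] for t in range(n)) / n
        for k in range(n)
    ]

h_true = [1 + 1j, 0.5 - 0.2j, -0.3j, 0.8]       # toy cascaded channel gains
phi = dft_phase_matrix(len(h_true))
# Noiseless pilot observations, one per phase configuration.
y = [sum(phi[t][k] * h_true[k] for k in range(len(h_true))) for t in range(len(h_true))]
h_hat = ls_estimate(phi, y)
```

Since the columns of the DFT matrix are orthogonal with unit-modulus entries, the LS solution reduces to a conjugate-transpose product with no matrix inversion, and every entry is realizable as a pure phase shift.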

2605.14989 2026-05-15 eess.SP

Map2APS: A Physically Grounded Benchmark for Direct Angle Power Spectrum Prediction from Urban Geometry

Junxi Huang, Xiucheng Wang, Nan Cheng, Kailong Wang, Ruijin Sun, Zhisheng Yin

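AI summary: This paper presents Map2APS, a physically grounded benchmark dataset for predicting the angle power spectrum (APS) directly from urban geometry. Built from intelligent ray-tracing propagation records, it contains 51 equal-height urban maps and about 2.55 million transmitter-receiver samples, with a strict cross-map split for evaluating generalization. The study introduces MS-AReg as a strong baseline, which achieves high cosine similarity and low angular error on the test set, and provides dominant-direction metrics for evaluating the directional properties of predicted spectra.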

Comments
Submitted to IEEE GLOBECOM 2026
Abstract

Angle power spectrum (APS) characterizes the directional distribution of received signal power and is directly relevant to beam management and MIMO processing. While environment-aware learning has been widely studied for radio maps and path loss, direct map-to-APS prediction still lacks a standardized large-scale benchmark. This paper presents Map2APS, a physically grounded benchmark constructed from intelligent ray-tracing (IRT) path-level propagation records. Map2APS covers 51 equal-height urban maps and approximately 2.55 million Tx-Rx samples, with a strict cross-map split for evaluating generalization to unseen urban layouts. We benchmark representative model families and introduce MS-AReg as a strong reference baseline. On the full held-out test set of 249,993 samples, MS-AReg achieves a cosine similarity of 0.948, a peak location error of 1.20°, and an inference latency of 0.101 ms/sample. We further report dominant-direction metrics, including Top-1 dominant peak hit rate and dominant peak recall, to evaluate whether predicted spectra preserve decision-relevant arrival directions. The benchmark, code, and evaluation scripts are released at https://github.com/UNIC-Lab/aps-data.
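
The two headline metrics can be sketched in Python as follows. The exact definitions used by the benchmark are not given in the abstract, so this assumes the usual cosine similarity between spectra and a circular distance between argmax bins; the bin width and example spectra are made up.

```python
import math

def cosine_similarity(pred, target):
    """Cosine of the angle between two spectra treated as vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return dot / norm if norm > 0 else 0.0

def peak_location_error_deg(pred, target, bin_width_deg):
    """Circular distance in degrees between the argmax bins of two spectra."""
    n = len(pred)
    diff = abs(pred.index(max(pred)) - target.index(max(target)))
    return min(diff, n - diff) * bin_width_deg

# Toy spectra over 8 angular bins of 45 degrees each.
target_aps = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]
pred_aps = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0]
similarity = cosine_similarity(pred_aps, target_aps)
peak_error = peak_location_error_deg(pred_aps, target_aps, bin_width_deg=45.0)
```

In this toy case the predicted peak is off by one bin, so the shape-based similarity drops well below 1 even though the spectrum has the right form, which is why the two metrics are reported together.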

2501.17473 2026-05-15 cs.IT cs.SY eess.SY math.IT

Remote State Estimation over a Wearing Channel: Information Freshness vs. Channel Aging

Jiping Luo, George Stamatakis, Osvaldo Simeone, Nikolaos Pappas

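AI summary: This paper studies remote state estimation of a linear Gaussian system over a channel that degrades over time, examining the tradeoff between information freshness and channel aging. In each time slot the sensor can transmit a fresh measurement, restore channel quality, or remain silent; frequent transmission accelerates channel wear, while frequent restoration hurts estimation quality. The problem is modeled as a semi-Markov decision process, and the paper establishes monotonicity properties of the optimal policy and proposes structure-aware solution methods.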

Comments
This paper has been accepted for publication in IEEE Transactions on Automatic Control
Abstract

We study the remote estimation of a linear Gaussian system over a channel that wears out over time and with every use. The sensor can either transmit a fresh measurement in the current time slot, restore the channel quality at the cost of downtime, or remain silent. Frequent transmissions yield accurate estimates but incur significant wear on the channel. Renewing the channel too often improves channel conditions but results in poor estimation quality. What is the optimal timing to transmit measurements and restore the channel? This problem is formulated as a semi-Markov decision process (SMDP). We establish monotonicity properties of the optimal policy and propose structure-aware solution methods.

2605.14949 2026-05-15 cs.CV eess.IV eess.SP

A CUBS-Compatible Ultrasound Morphology and Uncertainty-Aware Baseline for Carotid Intima-Media Segmentation and Preliminary Risk Prediction

Aueaphum Aueawatthanaphisut

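AI summary: This study presents AtheroFlow-XNet, a reproducible ultrasound-based baseline for carotid intima-media segmentation and preliminary risk prediction, intended to assess vascular risk in atherosclerosis patients more fully. The model uses manually annotated intima-media masks for supervised segmentation, incorporates clinical variables to support risk prediction, and uses Monte Carlo dropout for uncertainty-aware inference. It reports a Dice coefficient of 0.7930 for segmentation and an area under the ROC curve of 0.6910 for preliminary risk prediction, pointing to a path toward ultrasound-supported automated vascular analysis.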

Comments
13 pages, 5 figures, 2 tables, 20 equations, 3 appendices
Abstract

Carotid atherosclerosis is a major contributor to ischemic stroke and transient ischemic attack. Conventional ultrasound assessment is commonly based on intima-media thickness, plaque appearance, stenosis degree, and peak systolic velocity, but these morphology- and velocity-based indicators may not fully capture patient-specific vascular risk. This study presents AtheroFlow-XNet, a CUBS-compatible ultrasound morphology and uncertainty-aware learning baseline for carotid intima-media segmentation and preliminary risk prediction. Using the Carotid Ultrasound Boundary Study dataset, manual lumen-intima and media-adventitia boundary annotations were converted into dense intima-media masks for supervised segmentation. Clinical variables were incorporated into an auxiliary risk-prediction branch, and Monte Carlo dropout was used for uncertainty-aware inference. The model was evaluated using a patient-level train-validation-test split with 1,522 training images, 326 validation images, and 328 testing images. The proposed model achieved a Dice coefficient of 0.7930 for LI-MA mask segmentation, a segmentation loss of 0.2359, and an area under the receiver operating characteristic curve of 0.6910 for preliminary risk prediction. Qualitative results showed that predicted masks were generally aligned with manual annotations, while uncertainty maps highlighted ambiguous wall-boundary regions. These results suggest that ultrasound-derived carotid morphology can support automated wall analysis and uncertainty-aware interpretation. Since CUBS does not provide Doppler waveforms or CFD-derived hemodynamic biomarkers, this work should be interpreted as a reproducible morphology-driven baseline. Future work will incorporate Doppler-derived flow profiles, patient-specific vascular reconstruction, and CFD-based wall shear biomarkers.
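
For readers unfamiliar with the segmentation metric, here is a minimal Python sketch of the Dice coefficient on binary masks. The smoothing term `eps` and the toy masks are assumptions for illustration; the paper's implementation details are not stated in the abstract.

```python
def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for binary masks given as nested lists."""
    intersection = sum(
        p * t
        for row_p, row_t in zip(pred_mask, true_mask)
        for p, t in zip(row_p, row_t)
    )
    total = sum(sum(row) for row in pred_mask) + sum(sum(row) for row in true_mask)
    return (2.0 * intersection + eps) / (total + eps)

# Toy 2x3 masks: each marks three pixels, two of which overlap.
predicted = [[0, 1, 1], [0, 1, 0]]
annotated = [[0, 1, 0], [0, 1, 1]]
dice = dice_coefficient(predicted, annotated)
```

A Dice value of 1 means the predicted and annotated masks coincide exactly; thin structures such as the intima-media band are penalized heavily for small boundary shifts.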

2605.14945 2026-05-15 eess.SY cs.SY

Robust Quadcopter Motion Control Using Output Feedback

Stanislav Kim, Anton Pyrkin, Oleg Borisov

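AI summary: This study addresses output-feedback motion control of a quadcopter and proposes a robust control method. A geometric approach transforms the quadcopter model into a normal form with a time-varying gain coefficient, which is then made stationary through double integration of the control input. A robust output feedback control law is designed using the extended observer method, with the aim of improving control performance under uncertainty.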

Abstract

The study addresses the problem of quadcopter motion control using output feedback. By applying a geometric approach, the quadcopter model is transformed into a normal form with a time-varying gain coefficient, which is subsequently made stationary through double integration of the control input. A robust output feedback control law is synthesised based on the extended observer method.

2605.14944 2026-05-15 cs.RO cs.SY eess.SY

Behavioral Data-Driven Optimal Trajectory Generation for Rotary Cranes

Iskandar Khemakhem, Manuel Zobel, Johannes Schüle, Oliver Sawodny, Naoki Uchiyama, Abdallah Farrage

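AI summary: As the construction industry grows and skilled labor becomes scarce, automating crane control is increasingly important. This paper proposes a behavioral data-driven method for generating open-loop slewing trajectories for rotary cranes that reduce load sway while lowering operation time and energy consumption. Based on Willems' fundamental lemma and its generalizations, the method needs no explicit system model and generates smooth, optimal trajectories directly from input-output data. Experiments validate it, with improvements over an established model-based approach in load sway, tracking error, and travel time.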

Abstract

With the growth of the construction industry and the global shortage of skilled labor, the automation of crane control has become increasingly important for safe and efficient operations. A central challenge in automatic crane control is the reduction of load oscillations during motion, which is primarily addressed through appropriate slewing trajectories. In this context, classical model-based control methods rely on accurate dynamical models and expert tuning, and often struggle to meet safety and precision requirements, while many learning-based approaches require large data sets and significant computational resources. This paper proposes a behavioral data-driven framework for generating open-loop slewing trajectories for rotary cranes that suppress load sway while reducing operation time and energy consumption. The approach builds on Willems' fundamental lemma and its generalizations, to bypass explicit system modeling and operate directly on measured input-output data. A practical workflow is presented in this paper to reduce the need for expert knowledge. Despite the underactuated nature of the crane dynamics, the method identifies a nonparametric representation of the system behavior and generates smooth, optimal trajectories using limited data and convex optimization. The proposed trajectory generation method is validated on a laboratory crane setup and compared against an established model-based approach, achieving up to 35% reduction in load sway, 43% reduction in tracking error, and 50% reduction in travel time.
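
Methods built on Willems' fundamental lemma replace a parametric model with a Hankel matrix of recorded data, whose columns are overlapping windows of one measured trajectory. The Python sketch below shows only that construction step; the signal values are invented, and the trajectory optimization itself is not reproduced here.

```python
def hankel(signal, depth):
    """Depth-row Hankel matrix: column j holds signal[j : j + depth]."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]

u = [1.0, -1.0, 0.5, 0.0, 2.0, -0.5]    # hypothetical recorded slewing input
y = [0.0, 0.1, 0.0, 0.05, 0.05, 0.2]    # hypothetical recorded sway output
# Stack input rows on top of output rows to form the data matrix.
data_matrix = hankel(u, 3) + hankel(y, 3)
```

Under the lemma's conditions (a linear time-invariant system and a sufficiently exciting input), any length-3 input-output trajectory of the system is a linear combination of the columns of such a matrix, which is what lets a convex program search over trajectories directly.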

2605.14942 2026-05-15 physics.app-ph cs.SY eess.SY

Radioactive Source Seeking using Bayesian Optimisation with Movement Penalty

Lysander Miller, Joshua Keene, Jeremy M. C. Brown, Airlie Chapman

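AI summary: This paper studies radioactive source localization with mobile robots and proposes a sample-efficient source-seeking strategy based on Bayesian optimisation. The method uses a heteroscedastic Gaussian process surrogate to balance exploration and exploitation and adds a movement cost to discourage unnecessary travel. The strategy is shown to have sublinear regret in the source-seeking task and performs well in simulation.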

Abstract

The use of mobile robotics in radioactive source seeking has become an important part of modern radiation-safety practices, supporting timely mitigation of contamination risks and helping protect public health. However, measuring radiation is often time-consuming, rendering traditional gradient-based source-seeking methods less effective due to lower sample efficiency. This paper proposes a sample-efficient Bayesian-Optimisation source-seeking strategy that utilises a heteroscedastic Gaussian process surrogate to balance exploration and exploitation. Excessive inter-sample travel is discouraged through a movement switching cost. The strategy is shown to generate sublinear regret in the source-seeking task, while simulations demonstrate its effectiveness in localising radioactive sources.
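
One way to picture a movement penalty is as a travel term subtracted from a standard acquisition function. The Python sketch below assumes an upper-confidence-bound acquisition and a linear distance penalty; the paper's actual switching-cost formulation is not given in the abstract, and all names and numbers are hypothetical.

```python
import math

def next_sample(candidates, current, beta=2.0, move_cost=0.1):
    """Pick the candidate maximising UCB minus a travel penalty.

    candidates: list of (position, mean, std) taken from a surrogate model.
    current:    the robot's current (x, y) position.
    """
    def score(candidate):
        position, mean, std = candidate
        return mean + beta * std - move_cost * math.dist(position, current)
    return max(candidates, key=score)

# Two candidate measurement sites: one nearby, one far but slightly more promising.
sites = [((0.0, 1.0), 5.0, 1.0), ((10.0, 0.0), 5.5, 1.0)]
with_penalty = next_sample(sites, current=(0.0, 0.0), move_cost=0.1)
without_penalty = next_sample(sites, current=(0.0, 0.0), move_cost=0.0)
```

With the penalty active the nearby site wins; without it the robot would cross the area for a marginally higher predicted reading.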

2605.14941 2026-05-15 eess.SP cs.HC cs.LG

nASR: An End-to-End Trainable Neural Layer for Channel-Level EEG Artifact Subspace Reconstruction in Real-Time BCI

Shantanu Sarkar, Jose L. Contreras-Vidal

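AI summary: This study proposes nASR, an end-to-end trainable neural network layer for channel-level EEG artifact subspace reconstruction in real-time brain-computer interfaces (BCI). Conventional ASR depends on a fixed threshold parameter and can damage useful neural signals; nASR introduces two learnable threshold parameters so that artifact detection and downstream decoding are optimized jointly, improving signal quality and decoding performance. Experiments show that nASR outperforms conventional ASR in classification accuracy and inference speed, making it suitable for real-time BCI applications with demanding latency and performance requirements.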

Comments
Preprint. Submitted to IEEE SMC 2026 (under review)
Abstract

Electroencephalogram (EEG) signals are highly susceptible to artifacts, resulting in a low signal-to-noise ratio which makes extraction of meaningful neural information challenging. Artifact Subspace Reconstruction (ASR) is one of the most widely used artifact filtering techniques in EEG-based BCI applications, owing to its real-time applicability. ASR reconstructs artifact-free signals by operating in Principal Component (PC) space within sliding windows. However, ASR performance is critically sensitive to its threshold parameter: an incorrect threshold risks removing task-relevant neural features alongside artifacts. Furthermore, since PCs are linear combinations of all channels, subspace reconstruction in PC space may alter the underlying data structure, potentially discarding essential neural information. To address these limitations, we propose nASR, a novel end-to-end trainable Keras layer that jointly optimizes artifact rejection and downstream decoding. nASR introduces two trainable threshold parameters: K, which governs artifact detection in PC variance space, and L, which quantifies eigen-spread to pinpoint the primary artifact-contributing channels, enabling selective channel-level reconstruction that preserves clean channel information. An ablation study comprising five model variants (m01 to m05), evaluated across two subjects from the BCI Competition IV Dataset 1, confirms that nASR variants consistently outperform traditional ASR on test classification metrics, while achieving a 6-8x reduction in inference time, making nASR a strong candidate for real-time BCI applications demanding both low latency and high decoding performance.

2605.14940 2026-05-15 cs.LG cs.AI eess.SP

Not All Symbols Are Equal: Importance-Aware Constellation Design for Semantic Communication

Albert Shaju, Christo Kurisummoottil Thomas, Mayukh Roy Chowdhury

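AI summary: This paper studies symbol constellation design for semantic communication and proposes an importance-aware joint semantic-physical layer framework. It extracts discrete semantic concepts, scores their semantic criticality, and uses deep reinforcement learning to dynamically select which symbols to transmit, giving semantic-aware constellation mapping at the physical layer. The method introduces a semantic symbol vulnerability metric and a semantic protection probability, proves that conventional Gray-coded constellations are suboptimal when semantic importance is non-uniform, and validates its advantage at high spectral efficiency on several datasets.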

Comments
Submitted to IEEE GLOBECOM 2026. 6 pages, 8 figures
Abstract

Semantic communication systems for goal-oriented transmission must protect task-relevant information not only through source compression but also via physical layer mapping. Existing approaches decouple constellation design and semantic encoding, exposing critical symbols to channel errors at the same rate as irrelevant ones. In contrast, this paper proposes a joint semantic-physical layer framework composed of a vector quantized-variational autoencoder that extracts discrete latent concepts, a semantic criticality indicator (SCI) that scores each concept by task relevance, and a deep reinforcement learning agent that dynamically selects the transmission subset based on instantaneous channel conditions. At the physical layer, a learned semantic-aware M-QAM constellation assigns symbol positions according to joint co-occurrence statistics and SCI scores, departing from the uniform spacing and Gray coding of standard M-QAM, which minimizes average BER without regard for semantic content. We introduce a novel semantic symbol vulnerability (SSV) metric and a semantic protection probability (SPP) to quantify the exposure of task-critical symbols to decoding errors, and prove that any Gray-coded constellation is strictly suboptimal in SCI-Weighted SSV whenever the source exhibits non-uniform semantic importance and co-occurrence statistics. Simulation results demonstrate that the proposed constellation achieves near 100% SPP across modulation orders from 4-QAM to 1024-QAM, versus 50% for standard constellations at high spectral efficiency, and a 21:1 compression ratio with semantic quality above 0.9, generalizing across MNIST, Fashion-MNIST, and FSDD without modification.

2605.14919 2026-05-15 eess.SP

Transmit Beamforming for High-Rate Underwater Acoustic Communications

Diego A. Cuji, Andrew C. Singer, Milica Stojanovic

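AI summary: This paper studies transmit beamforming for high-rate underwater acoustic communication, aiming to reduce dependence on full prior channel knowledge. By exploiting geometrically stable components of the propagation field, it proposes an angle-based beamforming strategy suited to scenarios with a dominant propagation path that remains relatively stable over time. Experimental results show the method performs well in data-detection mean-squared error and bit error rate.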

Comments
5 pages, 5 figures, conference
Abstract

Transmit beamforming for underwater acoustic communication is challenging because it requires perfect knowledge of the channel to the receiver in advance. In practice, channel estimates must be learned through feedback and are often noisy or outdated because of feedback delay and channel variation. In this paper, we investigate angle-based beamforming strategies for a single-user link that reduce dependence on full channel knowledge by exploiting stable components of the geometric structure in the propagation field. In particular, we focus on scenarios in which there exists a dominant path that remains relatively stable over time, making it a suitable candidate for transmit beamforming. Experimental results using the SPACE and MACE data sets demonstrate the effectiveness of the proposed method in terms of data-detection mean-squared error and bit error rate.

2605.14883 2026-05-15 eess.SP cs.HC cs.LG

BCI-Based Assessment of Ocular Response Time Using Dynamic Time Warping Leveraging an RDWT-Driven Deep Neural Framework

Shantanu Sarkar, Sai Shashank Gandavarapu, Jeff Feng, Saurabh Prasad, Reza Khanbabaie, Jose L. Contreras-Vidal

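AI summary: This study proposes a brain-computer interface (BCI)-based method for assessing ocular response time to support early diagnosis of mild traumatic brain injury (mTBI). It combines electroencephalogram (EEG) recordings with augmented-reality (AR)-guided Vestibular/Ocular Motor Screening (VOMS) tasks, processes the EEG with a Redundant Discrete Wavelet Transform (RDWT)-driven deep neural framework, and computes ocular response times with Dynamic Time Warping (DTW). Results show significant differences in oculomotor behaviour between subjects, with pursuit tasks especially informative for timing differences, offering a new route to multimodal mTBI assessment.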

Comments
Submitted to IEEE SMC 2026 (under review)
Abstract

Mild traumatic brain injury (mTBI) is a prevalent condition that remains difficult to diagnose in its early stages. Oculomotor dysfunction is a well-established marker of mTBI, motivating the development of portable tools that capture both eye-movement behavior and underlying neurophysiology. In this work, we present an initial framework that integrates electroencephalogram (EEG) with augmented-reality (AR)-based Vestibular/Ocular Motor Screening (VOMS) tasks to estimate subject-specific ocular response times. Pre-processed EEG signals, obtained through band-pass filtering and average referencing, are analyzed using a Redundant Discrete Wavelet Transform (RDWT)-driven deep neural framework. The RDWT coefficients are subjected to trainable zero-phase convolutional filtering and reconstructed into the time domain via inverse RDWT, followed by channel-wise temporal and spatial filtering using 2D convolution layers and convolutional-LSTM-based decoding. An ablation study demonstrates that wavelet-domain filtering serves as an effective denoising strategy, improving prediction performance. Sliding-window predictions were validated using Pearson correlation (>= 0.5), and Dynamic Time Warping (DTW) was subsequently used to estimate ocular response times. DTW-derived metrics revealed significant inter-subject differences across all VOM tasks, supported by Mann-Whitney U tests. Cross-correlation analysis further revealed task-dependent temporal behaviors: pursuit tasks exhibited reactive tracking, whereas saccades showed anticipatory responses. Overall, the results highlight pursuit tasks as particularly informative for distinguishing timing differences and demonstrate the potential of RDWT-based EEG features combined with DTW metrics for multimodal mTBI assessment.
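
Dynamic Time Warping aligns two sequences by allowing samples to be stretched in time, and the alignment path exposes how far one signal lags the other. The Python sketch below is the textbook dynamic-programming form with an absolute-difference cost; how the paper turns the alignment into a response-time figure is not described in the abstract, so the lag readout here is an assumption.

```python
def dtw_path(a, b):
    """Return (total cost, alignment path) for classic DTW with |a_i - b_j| cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

stimulus = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # toy target trace
response = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]   # same shape, delayed by two samples
distance, path = dtw_path(stimulus, response)
peak_lag = [j - i for i, j in path if stimulus[i] == 1.0]
```

Multiplying the lag in samples by the sampling period would give a delay in seconds.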

2605.14855 2026-05-15 cs.LG cs.AI eess.SP

Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers

Lukas Schelenz, Shobha Rajanna, Denis Gosalci, Lucas Heublein, Jonas Pirkl, Jonathan Ott, Felix Ott, Christopher Mutschler, Tobias Feigl

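AI summary: This paper studies how to exploit hidden context in dynamic movement forecasting, tracing the progression from recurrent neural networks to graph neural networks and general-purpose Transformers. It compares several machine learning methods on forecasting the movement trajectories of NBA players and finds that a hybrid LSTM augmented with contextual information achieves the lowest final displacement error, outperforming graph attention networks, Transformers, and other models. The experiments show that models differ in prediction accuracy, generalizability, and training efficiency, so trajectory prediction in fast-paced environments calls for task-specific model choice.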

Journal ref
IEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025
Comments
12 pages
Abstract

Forecasting within signal processing pipelines is crucial for mitigating delays, particularly in predicting the dynamic movements of objects such as NBA players. This task poses significant challenges due to the inherently interactive and unpredictable nature of sports, where abrupt changes in velocity and direction are prevalent. Traditional approaches, including (S)ARIMA(X), Kalman filters (KF), and Particle filters (PF), often struggle to model the non-linear dynamics present in such scenarios. Machine learning (ML) methods, such as long short-term memory (LSTM) networks, graph neural networks (GNNs), and Transformers, offer greater flexibility and accuracy but frequently fail to explicitly capture the interplay between temporal dependencies and contextual interactions, which are critical in chaotic sports environments. In this paper, we evaluate these models and assess their strengths and weaknesses. Experimental results reveal key performance trade-offs across input history length, generalizability, and the ability to incorporate contextual information. ML-based methods demonstrated substantial improvements over linear models across forecast horizons of up to 2 s. Among the tested architectures, our hybrid LSTM augmented with contextual information achieved the lowest final displacement error (FDE) of 1.51 m, outperforming temporal convolutional neural network (TCNN), graph attention network (GAT), and Transformers, while also requiring less data and training time compared to GAT and Transformers. Our findings indicate that no single architecture excels across all metrics, emphasizing the need for task-specific considerations in trajectory prediction for fast-paced, dynamic environments such as NBA gameplay.
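
The final displacement error quoted above measures only where a forecast ends up. A minimal Python sketch follows, assuming 2D court positions in metres and averaging over a batch of trajectories; the batch averaging and the example coordinates are illustrative assumptions.

```python
import math

def final_displacement_error(pred, truth):
    """Euclidean distance between the last predicted and last true positions."""
    return math.dist(pred[-1], truth[-1])

def mean_fde(pred_batch, truth_batch):
    """Average FDE over a batch of (prediction, ground truth) trajectory pairs."""
    errors = [final_displacement_error(p, t) for p, t in zip(pred_batch, truth_batch)]
    return sum(errors) / len(errors)

# Two toy player trajectories, each a list of (x, y) positions in metres.
predicted = [[(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]]
observed = [[(0.0, 0.0), (1.0, 1.5), (5.0, 6.0)], [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]]
fde = mean_fde(predicted, observed)
```

Because only the endpoint counts, a forecast can wander mid-horizon and still score well, so FDE is best read alongside the other trade-offs the abstract lists.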

2605.14839 2026-05-15 cs.LG eess.SP

GenAI for Energy-Efficient and Interference-Aware Compressed Sensing of GNSS Signals on a Google Edge TPU

Thorben Wegner, Lucas Heublein, Tobias Feigl, Felix Ott, Christopher Mutschler, Alexander Rügamer

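AI summary: This paper proposes a generative AI (GenAI) method for efficient compression and interference classification of global navigation satellite system (GNSS) signals on a Google Edge TPU. It uses variational autoencoders (VAEs) to compress signals directly at the receiver and identify jamming and spoofing attacks in real time, substantially lowering the energy cost of data transmission and processing. Experiments show a compression ratio above 42x while preserving signal characteristics, with accurate classification of approximately 72 interference types on reconstructed signals, offering an efficient and practical approach to GNSS interference mitigation.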

Journal ref
IEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025
Comments
12 pages
Abstract

Traditional methods for classifying global navigation satellite system (GNSS) jamming signals typically involve post-processing raw or spectral data streams, requiring complex and costly data transmission to cloud-based interference classification systems. In contrast, our proposed approach efficiently compresses GNSS data streams directly at the hardware receiver while simultaneously classifying jamming and spoofing attacks in real time. Given the growing prevalence of GNSS jamming, there is a critical need for real-time solutions suitable for power-constrained environments. This paper introduces a novel method for compressing and classifying GNSS jamming threats using generative artificial intelligence (GenAI), specifically variational autoencoders (VAEs), deployed on Google Edge tensor processing units (TPUs). The study evaluates various autoencoder (AE) architectures to compress and reconstruct GNSS signals, focusing on preserving interference characteristics while minimizing data size near the receiver hardware. The pipeline adapts large-scale AE models for Google Edge TPUs through 8-bit quantization to ensure energy-efficient deployment. Tests on raw in-phase and quadrature-phase (IQ) data, Fast Fourier Transform (FFT) data, and handcrafted features show the system achieves significant compression (>42x) and accurate classification of approximately 72 interference types on reconstructed signals (F2-score 0.915), closely matching the original signals (F2-score 0.923). The hardware-centric GenAI approach also substantially reduces jammer signal transmission costs, offering a practical solution for interference mitigation. Ablation studies on conditional and factorized VAEs (i.e., FactorVAE) explore latent feature disentanglement for data generation, enhancing model interpretability and fostering trust in machine learning (ML) solutions for sensitive interference applications.

2605.14837 2026-05-15 eess.SP

Making AFDM Secure Against Eavesdroppers: A Phase Function Design Approach

Hengxuan Liu, Vincent Savaux, Arman Farhang

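AI summary: This paper studies how designing the phase function can improve the physical-layer security of the affine frequency division multiplexing (AFDM) waveform against eavesdroppers. The authors propose a generic phase function design that strengthens AFDM's resistance to eavesdropping by raising the complexity of brute-force demodulation for the eavesdropper. Results show that the method markedly improves AFDM's physical-layer security performance.

Details
Abstract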

Affine frequency division multiplexing (AFDM) has recently emerged as a promising waveform for high-mobility communications due to its resilience to Doppler effects and its advantages for integrated sensing and communication (ISAC). AFDM modulates transmit data symbols using chirp subcarriers with two adjustable parameters. The first handles the Doppler effect, while the second can be used for physical layer security (PLS). In this paper, we focus on designing the second chirp parameter in the form of a generic phase function to enhance the robustness of the waveform against brute-force demodulation by the eavesdropper. In particular, we first derive a design criterion that reveals that the brute-force demodulation complexity depends on the first derivative of the phase function. Then, we introduce a family of phase functions that can increase the brute-force demodulation complexity in an unbounded and controllable manner, while preserving the chirp structure of AFDM. Our simulation results demonstrate that the proposed phase function design enhances the PLS performance of AFDM by several orders of magnitude compared with the conventional AFDM in terms of brute-force demodulation complexity.
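
To make the role of the second chirp parameter concrete, here is a minimal sketch of conventional AFDM modulation and demodulation in which that parameter is passed as a phase function of the symbol index. The conventional choice is phase(m) = c2 * m^2. The paper's proposed phase-function family is not reproduced here, and the function names, symbol values, and parameter values are hypothetical.

```python
import cmath
import math


def afdm_modulate(symbols, c1, phase):
    """Map symbols onto chirp subcarriers (inverse discrete affine Fourier transform)."""
    n_sub = len(symbols)
    out = []
    for n in range(n_sub):
        acc = 0j
        for m, x in enumerate(symbols):
            acc += x * cmath.exp(2j * math.pi * (c1 * n * n + phase(m) + n * m / n_sub))
        out.append(acc / math.sqrt(n_sub))
    return out


def afdm_demodulate(samples, c1, phase):
    """Recover symbols with the receiver's own assumption about the phase function."""
    n_sub = len(samples)
    out = []
    for m in range(n_sub):
        acc = 0j
        for n, r in enumerate(samples):
            acc += r * cmath.exp(-2j * math.pi * (c1 * n * n + phase(m) + n * m / n_sub))
        out.append(acc / math.sqrt(n_sub))
    return out


# Hypothetical parameters: 8 unit-magnitude QPSK symbols over an ideal channel.
N_SUB = 8
C1 = 3.0 / (2 * N_SUB)
C2 = 0.013        # value shared by the legitimate transmitter and receiver
C2_WRONG = 0.05   # value guessed by a receiver that does not know C2
s = 1 / math.sqrt(2)
symbols = [complex(s, s), complex(s, -s), complex(-s, s), complex(-s, -s)] * 2

tx = afdm_modulate(symbols, C1, lambda m: C2 * m * m)
rx_matched = afdm_demodulate(tx, C1, lambda m: C2 * m * m)
rx_mismatched = afdm_demodulate(tx, C1, lambda m: C2_WRONG * m * m)
```

With matched parameters the symbols come back unchanged. With a wrong guess, each symbol is rotated by a phase that depends on its index m, which is the mismatch a brute-force receiver would have to search over.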

2605.14788 2026-05-15 eess.SY cs.SY

Hybrid Metaheuristic Optimization of Distributed Control System Hardware Architecture with Model-Based Verification

Ruslan Zakirzyanov

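AI summary: This paper addresses the combinatorial optimization problem of designing an efficient and reliable hardware architecture for distributed control systems when some parameters are uncertain. The author proposes a formal model-based approach to architecture synthesis and develops a hybrid ant colony metaheuristic framework to construct feasible hierarchical architectures. Validation on a case study of a large-scale sulfuric acid plant control system shows that the approach meets structural and dynamic performance requirements and is feasible in engineering practice.

Details
Comments
Accepted for IFAC World Congress 2026
Abstract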

Large-scale chemical plants rely on distributed process control systems (PCS) comprising numerous processing units, communication modules, and I/O devices interconnected via industrial networks. The design of a cost-efficient and reliable hardware architecture under partial uncertainty in plant parameters remains a challenging combinatorial optimization problem. This paper proposes a formal model for distributed control system hardware architecture synthesis. A hybrid ant colony-based metaheuristic framework is developed to construct feasible hierarchical architectures. The proposed approach is validated on a large-scale sulfuric acid plant control system case study. Plant parameters are identified from operational data, system stability is analyzed, and a controller synthesis is performed based on the optimized architecture. The results demonstrate the feasibility of the approach and confirm that the obtained architecture satisfies structural and dynamic performance requirements.

2605.14766 2026-05-15 cs.CL cs.AI eess.AS

Streaming Speech-to-Text Translation with a SpeechLLM

Titouan Parcollet, Shucong Zhang, Xianrui Zheng, Rogier C. van Dalen

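AI summary: This paper proposes a real-time streaming speech-to-text translation system based on a large language model (LLM), aiming to address the slow response of existing SpeechLLM systems in practical use. The model learns not only to generate translated text but also to decide whether it has received enough audio to emit output, enabling more efficient streaming. Experiments show that the system keeps translation quality close to the non-streaming baseline while reducing latency to 1-2 seconds.

Details
Comments
9 pages of main text; 24 pages in total
Abstract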

Normally, a system that translates speech into text consists of separate modules for speech recognition and text-to-text translation. Combining those tasks into a SpeechLLM promises to exploit paralinguistic information in the speech and to reduce cascaded errors. But existing SpeechLLM systems are slow since they do not work in a real streaming fashion: they wait for a complete utterance of audio before outputting a translation, or output tokens at fixed intervals, which is not suitable for real applications. This work proposes an LLM-based architecture for real streaming speech-to-text translation. The LLM learns not just to emit output tokens, but also to decide whether it has seen enough audio to do so. The system is trained using automatic alignments of the input speech and the output text. In experiments on different language pairs, the system achieves a translation quality close to the non-streaming baseline, but with a latency of only 1-2 seconds.

2605.14741 2026-05-15 eess.SY cs.AI cs.SY

Addressing Terminal Constraints in Data-Driven Demand Response Scheduling

Maximilian Bloor, Martha White, Ehecatl Antonio del Rio Chanona, Calvin Tsay

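AI summary: This paper studies how to satisfy terminal constraints in data-driven demand response scheduling. It proposes combining Goal-Space Planning (GSP) with Deep Deterministic Policy Gradient (DDPG), learning temporally abstract models over discrete subgoals to propagate long-horizon value and improve scheduling. On a simulated air separation system, the method improves sample efficiency and satisfies terminal storage constraints, mitigating the weakness of conventional methods in handling long-horizon constraints.

Details
Comments
Accepted to IFAC World Congress 2026
Abstract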

Electrified chemical processes are incentivized by exposure to time-varying electricity markets to operate flexibly, but participating in demand response schemes can require satisfying terminal constraints over long horizons. Specifically, terminal constraints may be required when computing optimal schedules in order to preserve dynamic stability. Model-based optimization methods are computationally costly, and data-driven scheduling via reinforcement learning (RL) faces severe credit-assignment challenges. We integrate Goal-Space Planning (GSP) with Deep Deterministic Policy Gradient (DDPG), using learned temporally abstract models over discrete subgoals to propagate value across extended horizons. Using a simulated air separation benchmark, we demonstrate the proposed approach improves sample efficiency over standard DDPG while satisfying terminal storage constraints, mitigating myopic control behavior.

2605.14720 2026-05-15 eess.SP

Joint Phase Noise and Channel Estimation for OTFS

Stephen McWade, Arman Farhang

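AI summary: This paper studies the effect of oscillator phase noise in orthogonal time frequency space (OTFS) systems, analyzes the interference that phase noise causes in the delay-Doppler domain, and derives signal-to-interference-plus-noise ratio (SINR) expressions for three oscillator types. It finds that OTFS is highly sensitive to phase noise and that existing estimation methods that consider only the common phase error (CPE) cannot effectively suppress inter-Doppler interference (IDI). The paper therefore proposes a Wiener-filter-based joint channel and phase noise estimation method that exploits the statistics of both the phase noise and the Doppler-spread channel; simulations show gains of up to 8 dB in bit error rate (BER) performance over existing methods.

Details
Comments
13 pages, 17 figures. Accepted for publication in IEEE Transactions on Vehicular Technology. arXiv admin note: text overlap with arXiv:2602.12804
Abstract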

This paper investigates the effect of oscillator phase noise in orthogonal time frequency space (OTFS) systems. The paper provides an in-depth analysis of the interference due to phase noise in the delay-Doppler domain and derives expressions for the signal-to-interference-plus-noise ratio (SINR) for three different oscillator types, namely free-running oscillators, continuous-time phase-locked loops (PLLs), and discrete-time PLLs. The analysis demonstrates that OTFS is sensitive to phase noise and requires appropriate estimation and compensation. In particular, the analysis shows that the inter-Doppler interference (IDI) imposed by phase noise is severe and that existing phase noise estimation techniques, which consider only the common phase error (CPE), cannot compensate for this IDI effectively. Additionally, the existing methods in the OTFS literature on phase noise assume the channel to be a known single-tap channel. Hence, in this paper, we propose a method for joint channel and phase noise estimation using a Wiener filtering approach. Our proposed method exploits the statistical nature of both the phase noise and the Doppler-spread channel. Our numerical results demonstrate the superior performance of our proposed technique, with gains of up to 8 dB in terms of bit error rate (BER) over existing methods in the literature.

2605.14688 2026-05-15 eess.SY cs.SY

Dynamic Event-Triggered Control of Discrete-Time Nonlinear Systems based on Difference-Algebraic Representations

Vitoriano Casas, Gabriela Reis, Pedro Henrique Coutinho, Iury Bessa, Rodrigo Araújo

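AI summary: This paper studies dynamic event-triggered control of discrete-time nonlinear systems described by a difference-algebraic representation (DAR) and proposes an approach based on a gain-scheduled controller. The approach incorporates the system nonlinearities and the asynchronous terms induced by event-triggered sampling into the design of the control law and the trigger function, yielding a less conservative co-design condition for simultaneously designing the gain-scheduled control law and the dynamic triggering mechanism so that the closed-loop system is asymptotically stable. An optimization problem is also formulated to reduce the number of events and enlarge the estimated region of attraction of the closed-loop system, and a numerical example illustrates the effectiveness of the approach.

Details
Comments
Accepted to the IFAC World Congress 2026
Abstract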

This paper addresses the dynamic event-triggered control for a class of discrete-time nonlinear systems described by a difference-algebraic representation (DAR), using a gain-scheduled controller. An outstanding aspect of the proposed method is the incorporation of information about the system's nonlinearities into the control law and the trigger function. The proposed event-triggered mechanism also incorporates information on the asynchronous terms induced by the event-based sampling. All these ingredients enable the derivation of a less conservative co-design condition for the simultaneous design of the gain-scheduled control law and the dynamic triggering mechanism to ensure the asymptotic stability of the closed-loop system. An estimate of the region of attraction of the origin of the closed-loop system is obtained to guarantee the closed-loop system's operation within the domain of validity of the DAR. Then, an optimization problem is formulated to reduce the number of events and enlarge the estimated region of attraction. Finally, the effectiveness of the proposed condition is illustrated by a numerical example.

2605.14683 2026-05-15 cs.RO cs.SY eess.SY

SeaVis: Modeling and Control of a Remotely Operated Towed Vehicle for Seabed Visualization and Mapping

Abdelhakim Amer, Aske Alstrup, Frederik Rasmussen, Yury Brodskiy, Andriy Sarabakha, Erdal Kayacan

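AI summary: This paper presents a novel mathematical model of SeaVis, a remotely operated towed vehicle for seabed visualization and mapping, and designs a gain-scheduled linear-quadratic regulator (LQR) for robust control of its depth and attitude. High-fidelity simulation shows that the LQR controller outperforms a conventional PID controller in disturbance rejection, control efficiency, and flap actuation, and that it performs well across the full operational velocity range. The work offers an effective control method for precise, stable operation of underwater robots.

Details
Comments
Accepted at IEEE/ASME AIM 2026
Abstract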

High-resolution seafloor mapping necessitates stable and precise positioning for underwater robots. This paper introduces a novel mathematical model for SeaVis remotely operated towed vehicles (ROTVs) and develops a gain-scheduled linear-quadratic regulator (LQR) for robust depth and attitude control. We validate the approach in a high-fidelity simulation, benchmarking the LQR against a conventional PID controller over a challenging seabed profile. The presented results demonstrate the LQR's superior performance, with significantly enhanced robustness to disturbances, greater control efficiency, and substantially reduced flap actuation. The gain scheduling also confirms the controller's effectiveness across the full operational velocity range. The complete simulation environment and controller are open-sourced.
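
The idea of gain scheduling an LQR over towing speed can be sketched on a deliberately simple scalar model. Everything here is an assumption for illustration: the one-state dynamics, the flap effectiveness growing with the square of speed, the cost weights, and all function names are hypothetical and are not the SeaVis model or the authors' controller.

```python
def scalar_dlqr(a, b, q, r, iters=500):
    """LQR gain for x[k+1] = a*x[k] + b*u[k], found by iterating the scalar Riccati recursion."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def build_schedule(speeds, a, b0, q, r):
    """Precompute one gain per operating speed; input effectiveness is assumed to be b0 * v^2."""
    return [(v, scalar_dlqr(a, b0 * v * v, q, r)) for v in speeds]


def scheduled_gain(schedule, v):
    """Linearly interpolate the gain between scheduled speeds, clamping outside the range."""
    if v <= schedule[0][0]:
        return schedule[0][1]
    for (v0, k0), (v1, k1) in zip(schedule, schedule[1:]):
        if v <= v1:
            w = (v - v0) / (v1 - v0)
            return (1 - w) * k0 + w * k1
    return schedule[-1][1]


# Hypothetical numbers: a slightly unstable open-loop mode and three towing speeds.
A, B0, Q, R = 1.05, 0.02, 1.0, 0.1
schedule = build_schedule([1.0, 2.0, 3.0], A, B0, Q, R)
gain_at_1_5 = scheduled_gain(schedule, 1.5)
```

In this toy model the scheduled gain shrinks as speed rises, because the same flap deflection produces more force; the control law at run time would be u = -scheduled_gain(schedule, v) * x.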

2605.14677 2026-05-15 eess.IV

An Attention-Enhanced Network with Joint Dehazing and Retinex-Based Enhancement for Underwater Images

Sahana Ray, Bibhabasu Debnath, Sanjay Ghosh

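AI summary: To address the loss of visibility in underwater images caused by light absorption, scattering, and suspended particles, this paper proposes ADR, a three-stage network comprising dehazing and Retinex-based enhancement. By extending the underwater image formation model and refining the result with an attention-enabled U-Net++, the network improves underwater image quality. Experiments on benchmark datasets show performance competitive with state-of-the-art methods.

Details
Comments
6 pages, 3 figures, 4 tables; accepted for the IEEE ICIP 2026 conference
Abstract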

Underwater images suffer from severe wavelength-dependent light absorption and scattering, and turbidity due to suspended particles, degrading visual quality for applications in autonomous underwater vehicles (AUVs), marine biology, archaeology, and offshore infrastructure inspection. Classical image formation models (IFMs) inadequately capture nonlinear underwater light behavior, while purely data-driven methods lack physical interpretability. This paper proposes a three-stage network named ADR that extends the underwater image formation model with additional terms to perform underwater dehazing, followed by Retinex-based enhancement and attention-enabled U-Net++ refinement. Experiments on the UIEB and UFO-120 benchmark datasets demonstrate competitive performance with state-of-the-art methods.
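
For reference, the classical image formation model that such methods start from writes each observed pixel as I = J * t + B * (1 - t), where J is the scene radiance, t the transmission, and B the background light. The sketch below applies and inverts that classical model on a single channel. It does not include the additional terms the paper introduces, and the function names and values are hypothetical.

```python
def degrade(scene, transmission, backlight):
    """Classical model: observed = scene * t + backlight * (1 - t), per pixel."""
    return [j * t + backlight * (1.0 - t) for j, t in zip(scene, transmission)]


def restore(observed, transmission, backlight, t_min=0.1):
    """Invert the classical model; t is floored at t_min to avoid amplifying noise."""
    return [(i - backlight) / max(t, t_min) + backlight
            for i, t in zip(observed, transmission)]


# Hypothetical single-channel pixel values in [0, 1].
scene = [0.2, 0.5, 0.9]
transmission = [0.8, 0.5, 0.3]
backlight = 0.6

observed = degrade(scene, transmission, backlight)
recovered = restore(observed, transmission, backlight)
```

When the transmission and background light are known exactly, the inversion returns the original scene; in practice both must be estimated, which is where learned components come in.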

2605.14650 2026-05-15 eess.SP

Multimodal Learning for MIMO Beam Prediction Based on Variational Inference

Zijian Zheng, Wenqiang Yi, Hyundong Shin, Arumugam Nallanathan

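AI summary: This paper studies a variational-inference-based multimodal learning method for improving beam prediction accuracy in massive multi-input multi-output systems. By decoupling feature extraction from cross-modal semantic alignment, the method reduces reliance on costly multimodal data acquisition, and a two-stage training strategy improves data efficiency and robustness. Experiments show that the framework achieves competitive prediction accuracy and high reliability while using only 20% of the multimodal data required by conventional end-to-end methods.

Details
Comments
13 pages, 4 figures
Abstract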

Accurate beam prediction is essential for mitigating signalling overhead and latency in integrated sensing and communication-enabled massive multi-input multi-output systems. With the aid of multimodal learning, the prediction accuracy can be enhanced by leveraging the complementary information from other existing sensors, but the practical deployment is often constrained by the high cost of acquiring semantically aligned multimodal datasets. This paper proposes a variational-inference-based multimodal framework that decouples the optimization problem into modular feature extraction and cross-modal semantic alignment. Specifically, we develop a two-stage training strategy where the model utilises abundant unimodal data for representation learning before performing refined alignment on limited multimodal samples. This design enhances data efficiency and ensures robust feature fusion under sensing uncertainties. Experimental results on the DeepSense6G dataset demonstrate that the proposed framework achieves competitive beam prediction accuracy and maintains high reliability, while only requiring 20% of the multimodal training data compared to conventional end-to-end benchmarks.

2605.14642 2026-05-15 eess.SY cs.SY math.OC

Distributionally Robust Model Predictive Control for Virtual Power Plants

Nikolas Recke, Mathias Hudoba de Badyn

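AI summary: This paper proposes a distributionally robust model predictive control (DR-MPC) framework for optimal operation of virtual power plants under electricity price uncertainty. The method combines data-driven forecasting with quantile-based uncertainty quantification to construct time-varying Wasserstein ambiguity sets that adapt to forecast dispersion and distributional shifts, and incorporates them directly into real-time decision making. Results show that, with an appropriately chosen ambiguity radius, DR-MPC improves economic performance over conventional forecast-based MPC, with consistent gains across seasonal scenarios.

Details
Comments
7 pages, 5 figures, submitted to IFAC World Congress 2026
Abstract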

This paper presents a distributionally robust model predictive control (DR-MPC) framework for optimal virtual power plant (VPP) operation under electricity price uncertainty. A unified VPP model is formulated that captures the interaction between buildings, battery storage, and renewable generation, all influenced by exogenous weather and market signals. The proposed approach integrates data-driven forecasting with quantile-based uncertainty quantification to construct time-varying Wasserstein ambiguity sets that adapt to forecast dispersion and distributional shifts. This yields a tractable DR-MPC formulation that incorporates predictive distribution information directly into real-time decision making. The method is evaluated using real weather and market data from a Nordic case study across two seasonal scenarios. The results show that DR-MPC improves economic performance relative to standard forecast-based MPC when the ambiguity radius is chosen appropriately, with consistent gains of up to 0.8% for small radii across both seasonal scenarios. Larger radii become overly conservative and reduce revenue, underscoring the importance of proper radius selection. These findings demonstrate the practical value of distributionally robust optimization for uncertainty-aware VPP operation.
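
One way to see why the ambiguity radius trades robustness against conservatism: for a cost that is linear in the uncertain prices, the worst-case expected cost over a type-1 Wasserstein ball around the empirical price distribution has a closed form, the empirical mean cost plus the radius times the dual norm of the schedule. The sketch below assumes an l1 ground metric on price vectors (so the dual norm is the l-infinity norm), unbounded price support, and a single radius. The paper's time-varying sets and VPP model are not reproduced, and all names and numbers are hypothetical.

```python
def worst_case_cost(price_scenarios, schedule, radius):
    """Worst-case expected energy cost over a type-1 Wasserstein ball (l1 ground metric).

    price_scenarios: list of price vectors (one per sample)
    schedule: energy bought (+) or sold (-) in each period
    radius: Wasserstein radius of the ambiguity set
    """
    n = len(price_scenarios)
    mean_cost = sum(
        sum(p * x for p, x in zip(prices, schedule)) for prices in price_scenarios
    ) / n
    # The dual of the l1 norm is the l-infinity norm of the schedule.
    return mean_cost + radius * max(abs(x) for x in schedule)


# Hypothetical data: two price scenarios over three periods.
scenarios = [[30.0, 50.0, 40.0], [34.0, 46.0, 44.0]]
schedule = [1.0, -2.0, 0.5]

nominal = worst_case_cost(scenarios, schedule, 0.0)
robust = worst_case_cost(scenarios, schedule, 2.0)
```

With radius zero the expression reduces to the sample-average cost. As the radius grows, the penalty term increasingly dominates, which pushes an optimizer toward schedules with smaller peak trades and is consistent with the abstract's remark that large radii become overly conservative.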

2605.14629 2026-05-15 eess.IV cs.CV

Efficient Dense Matching for Enhanced Gaussian Splatting Using AV1 Motion Vectors

Julien Zouein, Vibhoothi Vibhoothi, François Pitié, Anil Kokaram

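AI summary: This paper proposes an efficient dense matching method based on AV1 motion vectors to improve the quality of the initial point cloud for 3D Gaussian Splatting (3DGS). Using the motion vectors of the AV1 video codec, the method avoids the time-consuming exhaustive matching of conventional SfM, significantly reducing computational overhead and increasing point cloud density. Experiments show that the method produces up to eight times as many points as conventional SfM, improving 3DGS reconstruction quality and training efficiency.

Details
Abstract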

3D Gaussian Splatting (3DGS) has emerged as a prominent framework for real-time, photorealistic scene reconstruction, offering significant speed-ups over Neural Radiance Fields (NeRF). However, the fidelity of 3DGS representations remains heavily dependent on the quality of the initial point cloud. While standard Structure-from-Motion (SfM) pipelines using COLMAP provide adequate initialisation, they often suffer from high computational costs and sparsity in textureless regions, which degrades subsequent reconstruction accuracy and convergence speed. In this work, we introduce an AV1-based feature detection and matching pipeline that significantly reduces SfM processing overhead. By leveraging motion vectors inherent to the AV1 video codec, we bypass computationally expensive exhaustive matching while maintaining geometric robustness. Our pipeline produces substantially denser point clouds, with up to eight times as many points as classical SfM. We demonstrate that this enhanced initialisation directly improves 3DGS performance, yielding a 9-point increase in VMAF and a 63% average reduction in training time required to reach baseline quality. The project page: https://sigmedia.tv/AV1-3DGS.github.io/
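
The core idea of reusing codec motion vectors as correspondences can be sketched as follows. This is not the authors' pipeline: the block tuple layout, the function name, the motion vector direction convention (current block towards its reference), and the default of 8 motion-vector units per pixel are all assumptions made for illustration.

```python
def blocks_to_matches(blocks, units_per_pixel=8):
    """Turn per-block motion vectors into point correspondences between two frames.

    blocks: iterable of (x, y, width, height, mv_x, mv_y), with the block's top-left
            corner in pixels and the motion vector in sub-pixel units.
    Returns a list of ((x_cur, y_cur), (x_ref, y_ref)) pairs anchored at block centres.
    """
    matches = []
    for x, y, w, h, mv_x, mv_y in blocks:
        cx, cy = x + w / 2.0, y + h / 2.0
        matches.append(
            ((cx, cy), (cx + mv_x / units_per_pixel, cy + mv_y / units_per_pixel))
        )
    return matches


# Hypothetical decoded blocks: two 16x16 blocks with different motion.
example_blocks = [(0, 0, 16, 16, 8, -16), (16, 0, 16, 16, 0, 0)]
example_matches = blocks_to_matches(example_blocks)
```

Because every inter-coded block already carries a motion vector, correspondences of this kind come for free from decoding, with no descriptor extraction or exhaustive pairwise matching.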

2605.14510 2026-05-15 eess.SP

Antenna Tilt Failure Detection and Estimation via Integrated Sensing and Communications

Samed Kesir, Batuhan Kaplan, Emre Arslan, Ahmet Faruk Coskun

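AI summary: This paper addresses the sensitivity of narrow-beam communication systems to physical misalignment and proposes a sensor-free antenna tilt failure detection and estimation method based on integrated sensing and communications (ISAC). The method uses environmental static clutter as geometric reference points, detecting and estimating antenna tilt by monitoring systematic gain shifts in clutter heat maps, and is implemented with the standard 5G NR frame structure and two different waveforms. Results show that the framework can enable autonomous network maintenance and self-healing without external sensors.

Details
Comments
Accepted at the 2026 EuCNC & 6G Summit
Abstract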

This paper addresses the critical sensitivity of narrow-beam communication systems to physical misalignments and exploits the potential of Integrated Sensing and Communications (ISAC) technology to propose a sensor-free antenna tilt failure detection and estimation framework. The proposed methods utilize environmental static clutter as geometric anchors to monitor systematic gain shifts in clutter heat maps. They provide precise antenna tilt detection and estimation using the standard 5G NR frame structure and two different waveforms. Numerical results show the potential of the proposed framework to enable autonomous, self-healing network maintenance without the need for external sensors.