arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.03801 2026-02-04 eess.SP

Digital-Twin Empowered Deep Reinforcement Learning For Site-Specific Radio Resource Management in NextG Wireless Aerial Corridor

Pulok Tarafder, Zoheb Hassan, Imtiaz Ahmed, Danda B. Rawat, Kamrul Hasan, Cong Pu

Comments Submitted for possible publication to IEEE. Paper currently under review. The contents of this paper may change at any time without notice

Joint base station (BS) association and beam selection in multi-UAV aerial corridors constitutes a challenging radio resource management (RRM) problem. The difficulty stems from high-dimensional action spaces, the substantial overhead needed to acquire global channel state information (CSI), rapidly varying propagation channels, and stringent latency requirements. Conventional combinatorial optimization methods, while near-optimal, are computationally prohibitive for real-time operation in such dynamic environments. While learning-based approaches can mitigate computational complexity and CSI overhead, the need for extensive site-specific (SS) datasets for model training remains a key challenge. To address these challenges, we develop a Digital Twin (DT)-enabled two-stage optimization framework that couples physics-based beam gain modeling with deep reinforcement learning (DRL) for scalable online decision-making. In the first stage, a channel twin (CT) is constructed using a high-fidelity ray-tracing solver with geo-spatial context and network information to capture SS propagation characteristics, and a dual annealing algorithm is employed to precompute optimal transmission beam directions. In the second stage, a Multi-Head Proximal Policy Optimization (MH-PPO) agent, equipped with a scalable multi-head actor-critic architecture, is trained on the DT-generated channel dataset to map complex channel and beam states directly to joint UAV-BS-beam association decisions. The proposed PPO agent achieves a 44%-121% improvement over DQN and a 249%-807% gain over traditional heuristic-based optimization schemes in a dense UAV scenario, while reducing inference latency by several orders of magnitude. These results demonstrate that DT-driven training pipelines can deliver high-performance, low-latency RRM policies tailored to SS deployments and suitable for real-time resource management in next-generation aerial corridor networks.

2602.03757 2026-02-04 eess.SY cs.OS cs.SY

Mitigating Timing-Based Attacks in Real-Time Cyber-Physical Systems

Arkaprava Sain, Sunandan Adhikary, Soumyajit Dey

Comments 12 pages, 10 figures

Real-time cyber-physical systems depend on deterministic task execution to guarantee safety and correctness. Unfortunately, this determinism can unintentionally expose timing information that enables adversaries to infer task execution patterns and carry out timing-based attacks targeting safety-critical control tasks. While prior defenses aim to obscure schedules through randomization or isolation, they typically neglect the implications of such modifications on closed-loop control behavior and real-time feasibility. This work studies the problem of securing real-time control workloads against timing inference attacks while explicitly accounting for both schedulability constraints and control performance requirements. We present a scheduling-based mitigation approach that introduces bounded timing perturbations to control task executions in a structured manner, reducing adversarial opportunities without violating real-time guarantees. The framework jointly considers worst-case execution behavior and the impact of execution delays on control performance, enabling the system to operate within predefined safety and performance limits. Through experimental evaluation on representative task sets and control scenarios, the proposed approach demonstrates that exposure to timing-based attacks can be significantly reduced while preserving predictable execution and acceptable control quality.

2602.03711 2026-02-04 eess.SP cs.LG

VR-VFL: Joint Rate and Client Selection for Vehicular Federated Learning Under Imperfect CSI

Metehan Karatas, Subhrakanti Dey, Christian Rohner, Jose Mairton Barros da Silva

Comments This paper has been accepted for presentation at IEEE ICC 2026

Federated learning in vehicular edge networks faces major challenges in efficient resource allocation, largely due to high vehicle mobility and the presence of imperfect channel state information. Many existing methods oversimplify these realities, often assuming fixed communication rounds or ideal channel conditions, which limits their effectiveness in real-world scenarios. To address this, we propose variable rate vehicular federated learning (VR-VFL), a novel federated learning method designed specifically for vehicular networks under imperfect channel state information. VR-VFL combines dynamic client selection with adaptive transmission rate selection, while also allowing round times to flex in response to changing wireless conditions. At its core, VR-VFL is built on a bi-objective optimization framework that strikes a balance between improving learning convergence and minimizing the time required to complete each round. By accounting for both the challenges of mobility and realistic wireless constraints, VR-VFL offers a more practical and efficient approach to federated learning in vehicular edge networks. Simulation results show that the proposed VR-VFL scheme achieves convergence approximately 40% faster than other methods in the literature.

2602.03691 2026-02-04 eess.SY cs.RO cs.SY math.OC

Input-to-State Safe Backstepping: Robust Safety-Critical Control with Unmatched Uncertainties

Max H. Cohen, Pio Ong, Aaron D. Ames

Comments To appear at the 2026 American Control Conference

Guaranteeing safety in the presence of unmatched disturbances -- uncertainties that cannot be directly canceled by the control input -- remains a key challenge in nonlinear control. This paper presents a constructive approach to safety-critical control of nonlinear systems with unmatched disturbances. We first present a generalization of the input-to-state safety (ISSf) framework for systems with these uncertainties using the recently developed notion of an Optimal Decay CBF, which provides more flexibility for satisfying the associated Lyapunov-like conditions for safety. From there, we outline a procedure for constructing ISSf-CBFs for two relevant classes of systems with unmatched uncertainties: i) strict-feedback systems; ii) dual-relative-degree systems, which are similar to differentially flat systems. Our theoretical results are illustrated via numerical simulations of an inverted pendulum and planar quadrotor.

2602.03669 2026-02-04 cs.CV cs.AI cs.LG eess.IV

Efficient Sequential Neural Network with Spatial-Temporal Attention and Linear LSTM for Robust Lane Detection Using Multi-Frame Images

Sandeep Patil, Yongqi Dong, Haneen Farah, Hans Hellendoorn

Comments 14 pages, 9 figures, under review by IEEE T-ITS

Lane detection is a crucial perception task for all levels of automated vehicles (AVs) and Advanced Driver Assistance Systems, particularly in mixed-traffic environments where AVs must interact with human-driven vehicles (HDVs) and challenging traffic scenarios. Current methods lack versatility in delivering accurate, robust, and real-time compatible lane detection. In particular, vision-based methods often neglect critical regions of the image and their spatial-temporal (ST) salience, leading to poor performance in difficult circumstances such as severe occlusion and dazzling light. This study introduces a novel sequential neural network model with a spatial-temporal attention mechanism to focus on key features of lane lines and exploit salient ST correlations among continuous image frames. The proposed model, built on a standard encoder-decoder structure and common neural network backbones, is trained and evaluated on three large-scale open-source datasets. Extensive experiments demonstrate the strength and robustness of the proposed model, which outperforms state-of-the-art methods in various testing scenarios. Furthermore, with the ST attention mechanism, the developed sequential neural network models exhibit fewer parameters and reduced Multiply-Accumulate Operations (MACs) compared to baseline sequential models, highlighting their computational efficiency. Relevant data, code, and models are released at https://doi.org/10.4121/4619cab6-ae4a-40d5-af77-582a77f3d821.

2602.03646 2026-02-04 eess.SY cs.SY

A Comparison of Set-Based Observers for Nonlinear Systems

Nico Holzinger, Matthias Althoff

Comments 13 pages

Set-based state estimation computes sets of states consistent with a system model given bounded sets of disturbances and noise. Bounding the set of states is crucial for safety-critical applications so that one can ensure that all specifications are met. While numerous approaches have been proposed for nonlinear discrete-time systems, a unified evaluation under comparable conditions is lacking. This paper reviews and implements a representative selection of set-based observers within the CORA framework. To provide an objective comparison, the methods are evaluated on common benchmarks, and we examine computational effort, scalability, and the conservatism of the resulting state bounds. This study highlights characteristic trade-offs between observer categories and set representations, as well as practical considerations arising in their implementation. All implementations are made publicly available to support reproducibility and future development. This paper thereby offers the first broad, tool-supported comparison of guaranteed state estimators for nonlinear discrete-time systems.

2602.03624 2026-02-04 eess.SP cs.SD

A Multi-decoder Neural Tracking Method for Accurately Predicting Speech Intelligibility

Rien Sonck, Bernd Accou, Tom Francart, Jonas Vanthornhout

Objective: EEG-based methods can predict speech intelligibility, but their accuracy and robustness lag behind behavioral tests, which typically show test-retest differences under 1 dB. We introduce the multi-decoder method to predict speech reception thresholds (SRTs) from EEG recordings, enabling objective assessment for populations unable to perform behavioral tests, such as those with disorders of consciousness, or during hearing aid fitting. Approach: The method aggregates data from hundreds of decoders, each trained on different speech features and EEG preprocessing setups to quantify neural tracking (NT) of speech signals. Using data from 39 participants (ages 18-24), we recorded 29 minutes of EEG per person while they listened to speech at six signal-to-noise ratios (SNRs) and a quiet story. NT values were combined into a high-dimensional feature vector per subject, and a support vector regression model was trained to predict SRTs from these vectors. Main Result: Predictions correlated significantly with behavioral SRTs (r = 0.647, p < 0.001; NRMSE = 0.19), with all differences under 1 dB. SHAP analysis showed theta/delta bands and early lags had slightly greater influence. Using pretrained subject-independent decoders reduced required EEG data collection to 15 minutes (3 minutes of story, 12 minutes across six SNR conditions) without losing accuracy.

2602.03581 2026-02-04 eess.SP cs.IT math.IT

Low-Complexity Distributed Combining Design for Near-Field Cell-Free XL-MIMO Systems

Zhe Wang, Jiayi Zhang, Bokai Xu, Dusit Niyato, Bo Ai, Shiwen Mao, Zhu Han

Comments 15 pages, 10 figures, to appear in IEEE Transactions on Wireless Communications

In this paper, we investigate the low-complexity distributed combining scheme design for near-field cell-free extremely large-scale multiple-input multiple-output (CF XL-MIMO) systems. Firstly, we construct the uplink spectral efficiency (SE) performance analysis framework for CF XL-MIMO systems over centralized and distributed processing schemes. Notably, we derive the centralized minimum mean-square error (CMMSE) and local minimum mean-square error (LMMSE) combining schemes over arbitrary channel estimators. Then, focusing on the CMMSE and LMMSE combining schemes, we propose five low-complexity distributed combining schemes based on the matrix approximation methodology or the symmetric successive over-relaxation (SSOR) algorithm. More specifically, we propose two matrix approximation methodology-aided combining schemes: Global Statistics & Local Instantaneous information-based MMSE (GSLI-MMSE) and Statistics matrix Inversion-based LMMSE (SI-LMMSE). These two schemes are derived by approximating the global instantaneous information in the CMMSE combining and the local instantaneous information in the LMMSE combining with the global and local statistics information by asymptotic analysis and matrix expectation approximation, respectively. Moreover, by applying the low-complexity SSOR algorithm to iteratively solve the matrix inversion in the LMMSE combining, we derive three distributed SSOR-based LMMSE combining schemes, distinguished by the information used and the initial values.
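
The abstract above mentions using SSOR to avoid an explicit matrix inversion. As a rough illustration only, the following sketch shows a generic SSOR iteration (one forward and one backward SOR sweep per iteration) for a small real symmetric positive-definite system. The matrix, right-hand side, relaxation factor `omega`, and iteration count are hypothetical; the paper's complex-valued matrices, initial values, and stopping rule are not given in the abstract.

```python
def ssor_solve(A, b, omega=1.2, iterations=50, x0=None):
    """Approximate the solution of A x = b with SSOR sweeps.

    A is a symmetric positive-definite matrix given as a list of rows;
    0 < omega < 2 is the relaxation factor (hypothetical default).
    """
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    for _ in range(iterations):
        # Forward SOR sweep: update unknowns from first to last.
        for i in range(n):
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / A[i][i]
        # Backward SOR sweep: update unknowns from last to first.
        for i in reversed(range(n)):
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / A[i][i]
    return x


# Toy system (not from the paper): symmetric and diagonally dominant.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = ssor_solve(A, b)

# Largest absolute entry of the residual A x - b.
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
```

Each sweep costs one pass over the matrix entries, which is why such iterations are attractive when a full inversion is too expensive.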

2602.03521 2026-02-04 eess.SY cs.SY

Real-world energy data of 200 feeders from low-voltage grids with metadata in Germany over two years

Manuel Treutlein, Pascal Bothe, Marc Schmidt, Roman Hahn, Oliver Neumann, Ralf Mikut, Veit Hagenmeyer

Comments 20 pages, 6 Figures, 6 Tables. Data is available on Zenodo: https://zenodo.org/records/17831177

The last mile of the distribution grid is crucial for a successful energy transition, as more low-carbon technologies such as photovoltaic systems, heat pumps, and electric vehicle chargers connect to the low-voltage grid. Despite considerable challenges in operation and planning, researchers often lack access to suitable low-voltage grid data. To address this, we present the FeederBW dataset with data recorded by the German distribution system operator Netze BW. It offers real-world energy data from 200 low-voltage feeders over two years (2023-2025) with weather information and detailed metadata, including changes in low-carbon technology installations. The dataset includes feeder-specific details such as the number of housing units, installed power of low-carbon technology, and aggregated industrial energy data. Furthermore, high photovoltaic feed-in and one-minute temporal resolution make the dataset unique. FeederBW supports various applications, including machine learning for load forecasting, non-intrusive load monitoring, synthetic data generation, and analysis of the interplay between weather, feeder measurements, and metadata. The dataset reveals insightful patterns and clearly reflects the growing impact of low-carbon technology on low-voltage grids.

2601.08338 2026-02-04 eess.SY cs.SY math.OC

Minimal Actuator Selection

Luca Ballotta, Geethu Joseph

Comments This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Selecting a few available actuators to ensure the controllability of a linear system is a fundamental problem in control theory. Previous works either focus on optimal performance, simplifying the controllability issue, or make the system controllable under structural assumptions, such as in graphs or when the input matrix is a design parameter. We generalize these approaches to offer a precise characterization of the general minimal actuator selection problem, where a set of actuators is given, described by a fixed input matrix, and the goal is to choose the fewest actuators that make the system controllable. We show that this problem can be equivalently cast as an integer linear program and, if actuation channels are sufficiently independent, as a set multicover problem under multiplicity constraints. The latter equivalence is always true if the state matrix has all distinct eigenvalues, in which case it simplifies to the set cover problem. Such characterizations hold even when a robust selection that tolerates a given number of faulty actuators is desired. The connection we establish allows a designer to draw on algorithms from the rich literature on the set multicover problem to select the smallest subset of actuators, including exact solutions that do not require brute-force search.
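
To make the set cover view concrete for the distinct-eigenvalue case, here is a minimal sketch, not the authors' algorithm. It relies on the standard PBH eigenvector test: when all eigenvalues of the state matrix are distinct, the system is controllable exactly when every left eigenvector has a nonzero product with at least one selected input column. The function names, the tolerance, and the toy matrices are hypothetical, and the exhaustive search over subsets merely stands in for a proper set cover solver on this tiny instance.

```python
import itertools

import numpy as np


def coverage_matrix(A, B, tol=1e-9):
    """cover[i, j] is True when actuator j excites eigenmode i.

    Assumes A has distinct eigenvalues. Eigenvectors of A.T are left
    eigenvectors of A, so mode i is reached by column b_j iff w_i^T b_j != 0.
    """
    _, left_vecs = np.linalg.eig(A.T)
    return np.abs(left_vecs.T @ B) > tol


def minimal_actuators(A, B):
    """Return a smallest list of column indices of B covering every mode."""
    cover = coverage_matrix(A, B)
    n_actuators = cover.shape[1]
    for size in range(1, n_actuators + 1):
        for subset in itertools.combinations(range(n_actuators), size):
            # Every mode (row) must be hit by at least one chosen actuator.
            if cover[:, list(subset)].any(axis=1).all():
                return list(subset)
    return None  # no subset makes the system controllable


# Toy example (not from the paper): three modes, three candidate actuators.
A = np.diag([1.0, 2.0, 3.0])
B = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0]])
chosen = minimal_actuators(A, B)
```

In this toy instance no single actuator reaches all three modes, so the smallest selection has two actuators.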

2512.12755 2026-02-04 eess.SY cs.LG cs.SY

An End-to-End Approach for Microgrid Probabilistic Forecasting and Robust Operation via Decision-focused Learning

Tingwei Cao, Yan Xu

Comments 10 pages

High penetration of renewable energy sources (RES) introduces significant uncertainty and intermittency into microgrid operations, posing challenges to economic and reliable scheduling. To address this, this paper proposes an end-to-end decision-focused framework that jointly optimizes probabilistic forecasting and robust operation for microgrids. A multilayer encoder-decoder (MED) probabilistic forecasting model is integrated with a two-stage robust optimization (TSRO) model involving direct load control (DLC) through a differentiable decision pathway, enabling gradient-based feedback from operational outcomes to improve forecasting performance. Unlike conventional sequential approaches, the proposed method aligns forecasting accuracy with operational objectives by directly minimizing decision regret via a surrogate smart predict-then-optimize (SPO) loss function. This integration ensures that probabilistic forecasts are optimized for downstream decisions, enhancing both economic efficiency and robustness. Case studies on modified IEEE 33-bus and 69-bus systems demonstrate that the proposed framework achieves superior forecasting accuracy and operational performance, reducing total and net operation costs by up to 18% compared with conventional forecasting and optimization combinations. The results verify the effectiveness and scalability of the end-to-end decision-focused approach for resilient and cost-efficient microgrid management under uncertainty.

2510.18190 2026-02-04 eess.AS cs.LG cs.SD

Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network

Zhanhong He, Hanyu Meng, David Huang, Roberto Togneri

Comments Accepted to ICASSP2026 conference

Estimating piano dynamics from audio recordings is a fundamental challenge in computational music analysis. In this paper, we propose an efficient multi-task network that jointly predicts dynamic levels, change points, beats, and downbeats from a shared latent representation. These four targets form the metrical structure of dynamics in the music score. Inspired by recent vocal dynamics research, we use a multi-scale network as the backbone, which takes Bark-scale specific loudness as the input feature. Compared to log-Mel as input, this reduces model size from 14.7 M to 0.5 M, enabling long sequential input. We segment the audio into 60-second excerpts, double the length commonly used in beat tracking. Evaluated on the public MazurkaBL dataset, our model achieves state-of-the-art results across all tasks. This work sets a new benchmark for piano dynamics estimation and delivers a powerful and compact tool, paving the way for large-scale, resource-efficient analysis of musical expression.

2508.02483 2026-02-04 eess.AS

Revisiting the Privacy of Low-Frequency Speech Signals: Exploring Resampling Methods, Evaluation Scenarios, and Speaker Characteristics

Jule Pohlhausen, Jörg Bitzer

Comments Accepted at SPSC 2025 - 5th Symposium on Security and Privacy in Speech Communication

While audio recordings in real life provide insights into social dynamics and conversational behavior, they also raise concerns about the privacy of personal, sensitive data. This article explores the effectiveness of restricting recordings to low-frequency audio to protect spoken content. For resampling the audio signals to different sampling rates, we compare the effect of employing anti-aliasing filtering. Privacy enhancement is measured by an increased word error rate of automatic speech recognition models. The impact on utility performance is measured with voice activity detection models. Our experimental results show that for clean recordings, models trained with a sampling rate of up to 800 Hz transcribe the majority of words correctly. For both models, we analyzed the impact of the speaker's sex and pitch, and we demonstrated that missing anti-aliasing filters more strongly compromise speech privacy.

2503.19943 2026-02-04 eess.IV cs.AI cs.LG

A Spatiotemporal Radar-Based Precipitation Model for Water Level Prediction and Flood Forecasting

Sakshi Dhankhar, Stefan Wittek, Hamidreza Eivazi, Andreas Rausch

Comments 28 pages, 11 figures, 6 tables

Journal ref Journal of Hydrology: Regional Studies, Volume 61, October 2025, 102571

Study Region: Goslar and Göttingen, Lower Saxony, Germany. Study Focus: In July 2017, the cities of Goslar and Göttingen experienced severe flood events characterized by a short warning time of only 20 minutes, resulting in extensive regional flooding and significant damage. This highlights the critical need for a more reliable and timely flood forecasting system. This paper presents a comprehensive study on the impact of radar-based precipitation data on forecasting river water levels in Goslar. Additionally, the study examines how precipitation influences water level forecasts in Göttingen. The analysis integrates radar-derived spatiotemporal precipitation patterns with hydrological sensor data obtained from ground stations to evaluate the effectiveness of this approach in improving flood prediction capabilities. New Hydrological Insights for the Region: A key innovation in this paper is the use of residual-based modeling to address the non-linearity between precipitation images and water levels, leading to a Spatiotemporal Radar-based Precipitation Model with residuals (STRPMr). Unlike traditional hydrological models, our approach does not rely on upstream data, making it independent of additional hydrological inputs. This independence enhances its adaptability and allows for broader applicability in other regions with RADOLAN precipitation data. The deep learning architecture integrates (2+1)D convolutional neural networks for spatial and temporal feature extraction with LSTM for time-series forecasting. The results demonstrate the potential of the STRPMr for capturing extreme events and for more accurate flood forecasting.

2503.19859 2026-02-04 cs.LG eess.SP math.OC stat.CO stat.ML

An Overview of Low-Rank Structures in the Training and Adaptation of Large Models

Laura Balzano, Tianjiao Ding, Benjamin D. Haeffele, Soo Min Kwon, Qing Qu, Peng Wang, Zhangyang Wang, Can Yaras

Comments Authors are listed alphabetically; 37 pages, 15 figures; minor revision at IEEE Signal Processing Magazine

The substantial computational demands of modern large-scale deep learning present significant challenges for efficient training and deployment. Recent research has revealed a widespread phenomenon wherein deep networks inherently learn low-rank structures in their weights and representations during training. This tutorial paper provides a comprehensive review of advances in identifying and exploiting these low-rank structures, bridging mathematical foundations with practical applications. We present two complementary theoretical perspectives on the emergence of low-rankness: viewing it through the optimization dynamics of gradient descent throughout training, and understanding it as a result of implicit regularization effects at convergence. Practically, these theoretical perspectives provide a foundation for understanding the success of techniques such as Low-Rank Adaptation (LoRA) in fine-tuning, inspire new parameter-efficient low-rank training strategies, and explain the effectiveness of masked training approaches like dropout and masked self-supervised learning.
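
For readers unfamiliar with Low-Rank Adaptation, the following is a minimal NumPy sketch of the usual LoRA parameterization: a frozen weight matrix plus a scaled product of two thin trainable matrices, with the up-projection initialized to zero. The layer sizes, rank, scaling `alpha`, and variable names are hypothetical and are not taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 64, 128, 4, 8.0  # hypothetical sizes

W = rng.standard_normal((d_out, d_in))         # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, rank))                    # trainable up-projection, zero at start


def lora_forward(x):
    """Apply the adapted layer: x W^T + (alpha / rank) * x A^T B^T."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


x = rng.standard_normal((2, d_in))
full_params = W.size            # parameters touched by full fine-tuning
lora_params = A.size + B.size   # parameters trained by the adapter
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and the weight update `B @ A` can never exceed rank `rank`, which is where the parameter savings come from.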

2411.03168 2026-02-04 eess.AS

Reference Microphone Selection for the Weighted Prediction Error Algorithm using the Normalized L-p Norm

Anselm Lohmann, Toon van Waterschoot, Joerg Bitzer, Simon Doclo

Reverberation may severely degrade the quality of speech signals recorded using microphones in a room. For compact microphone arrays, the choice of the reference microphone for multi-microphone dereverberation typically does not have a large influence on the dereverberation performance. In contrast, when the microphones are spatially distributed, the choice of the reference microphone may significantly contribute to the dereverberation performance. In this paper, we propose to perform reference microphone selection for the weighted prediction error (WPE) dereverberation algorithm based on the normalized $\ell_p$-norm of the dereverberated output signal. Experimental results for different source positions in a reverberant laboratory show that the proposed method yields a better dereverberation performance than reference microphone selection based on the early-to-late reverberation ratio or signal power.

2408.00382 2026-02-04 eess.AS

Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation

Jule Pohlhausen, Francesco Nespoli, Joerg Bitzer

Comments Accepted for publication at IWAENC 2024

Recordings in everyday life require privacy preservation of the speech content and speaker identity. This contribution explores the influence of noise and reverberation on the trade-off between privacy and utility for low-cost privacy-preserving methods feasible for edge computing. These methods comprise spectral and temporal smoothing, speaker anonymization using the McAdams coefficient, sampling with a very low sampling rate, and combinations thereof. Privacy is assessed by automatic speech and speaker recognition, while our utility considers voice activity detection and speaker diarization. Overall, our evaluation shows that additional noise degrades the performance of all models more than reverberation. This degradation corresponds to enhanced speech privacy, while utility is less deteriorated for some methods.

2401.08486 2026-02-04 eess.AS

Microphone Subset Selection for the Weighted Prediction Error Algorithm using a Group Sparsity Penalty

Anselm Lohmann, Toon van Waterschoot, Joerg Bitzer, Simon Doclo

Reverberation can severely degrade the quality of speech signals recorded using microphones in an enclosure. In acoustic sensor networks with spatially distributed microphones, a similar dereverberation performance may be achieved using only a subset of all available microphones. Using the popular convex relaxation method, in this paper we propose to perform microphone subset selection for the weighted prediction error (WPE) multi-channel dereverberation algorithm by introducing a group sparsity penalty on the prediction filter coefficients. The resulting problem is shown to be solved efficiently using the accelerated proximal gradient algorithm. Experimental evaluation using measured impulse responses shows that the performance of the proposed method is close to the optimal performance obtained by exhaustive search, for both frequency-dependent and frequency-independent microphone subset selection. Furthermore, the performance using only a few microphones for frequency-independent microphone subset selection is only marginally worse than using all available microphones.
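
The core building block of a proximal gradient method with a group sparsity penalty is block soft-thresholding, which can switch off an entire group of coefficients at once. The sketch below assumes a plain sum-of-Euclidean-norms penalty with one group per microphone. The exact penalty, weighting, and grouping used in the paper are not stated in the abstract, and the function name and toy numbers are hypothetical.

```python
import math


def group_soft_threshold(groups, threshold):
    """Proximal operator of threshold * sum_g ||x_g||_2.

    Each group is shrunk toward zero as a block; a group whose Euclidean
    norm does not exceed the threshold is set to zero entirely.
    """
    result = []
    for group in groups:
        norm = math.sqrt(sum(abs(c) ** 2 for c in group))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0.0 else 0.0
        result.append([scale * c for c in group])
    return result


# Toy prediction-filter coefficients, one group per microphone.
filters = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(filters, threshold=1.0)

# Microphones whose coefficient group survives the thresholding step.
active = [m for m, group in enumerate(shrunk) if any(c != 0 for c in group)]
```

Here only the first microphone keeps nonzero coefficients, which is how a group penalty yields a subset selection rather than scattered zeros.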

2301.07649 2026-02-04 eess.AS

Dereverberation in Acoustic Sensor Networks Using Weighted Prediction Error With Microphone-dependent Prediction Delays

Anselm Lohmann, Toon van Waterschoot, Joerg Bitzer, Simon Doclo

In recent decades, several multi-microphone speech dereverberation algorithms have been proposed, including the weighted prediction error (WPE) algorithm. In the WPE algorithm, a prediction delay is required to reduce the correlation between the prediction signals and the direct component in the reference microphone signal. In compact arrays with closely spaced microphones, the prediction delay is often chosen to be microphone-independent. In acoustic sensor networks with spatially distributed microphones, large time-differences-of-arrival (TDOAs) of the speech source between the reference microphone and other microphones typically occur. Hence, when using a microphone-independent prediction delay, the reference and prediction signals may still be significantly correlated, leading to distortion in the dereverberated output signal. In order to decorrelate the signals, in this paper we propose to apply TDOA compensation with respect to the reference microphone, resulting in microphone-dependent prediction delays for the WPE algorithm. We consider both optimal TDOA compensation using crossband filtering in the short-time Fourier transform domain as well as band-to-band and integer delay approximations. Simulation results for different reverberation times using oracle as well as estimated TDOAs clearly show the benefit of using microphone-dependent prediction delays.

2602.03508 2026-02-04 math.OC cs.SY eess.SY

A necessary and sufficient condition for discrete-time consensus on star boundaries

Galina Sidorenko, Johan Thunberg

Comments 14 pages, 8 figures

It is intuitive and well known that if agents in a multi-agent system iteratively update their states in the Euclidean space as convex combinations of neighbors' states, all states eventually converge to the same value (consensus), provided the interaction graph is sufficiently connected. However, in practice this also seems to hold if the convex combinations of states are mapped or radially projected onto any unit $l_p$-sphere or even boundaries of star-convex sets, herein referred to as star boundaries. In this paper, we present insight into this matter by providing a necessary and sufficient condition for asymptotic consensus of the normalized states (directions) for strongly connected directed graphs, which is equivalent to asymptotic consensus of states when the star boundaries are the same for all agents. Furthermore, we show that when asymptotic consensus occurs, the states converge linearly and the point of convergence is continuous in the initial states. Assuming a directed strongly connected graph provides a more general setting than that considered, for example, in gradient-based consensus protocols, where symmetric graphs are assumed. Illustrative examples and a vast number of numerical simulations showcase the theoretical results.
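
The update described in this abstract (a convex combination of neighbors' states followed by radial projection onto a unit $l_p$-sphere) can be sketched in a few lines. The weights, the value of `p`, the initial states, and the iteration count below are hypothetical; the example only illustrates the update rule on one benign instance and says nothing about the paper's necessary and sufficient condition.

```python
def lp_norm(x, p):
    """l_p norm of a vector given as a list of floats."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def project_to_lp_sphere(x, p):
    """Radially project a nonzero vector onto the unit l_p sphere."""
    norm = lp_norm(x, p)
    if norm == 0.0:
        raise ValueError("cannot project the zero vector")
    return [v / norm for v in x]


def consensus_step(states, weights, p):
    """One update: mix neighbours' states convexly, then project.

    weights[i][j] >= 0 is the weight agent i places on agent j,
    and every row of weights sums to 1.
    """
    dim = len(states[0])
    new_states = []
    for row in weights:
        mix = [sum(w * s[d] for w, s in zip(row, states)) for d in range(dim)]
        new_states.append(project_to_lp_sphere(mix, p))
    return new_states


# Directed 3-cycle with self-loops (strongly connected), hypothetical weights.
W = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
p = 3
states = [project_to_lp_sphere(v, p) for v in ([1.0, 0.2], [0.3, 1.0], [0.8, 0.9])]
for _ in range(200):
    states = consensus_step(states, W, p)

# Largest coordinate-wise gap between any agent and agent 0.
spread = max(abs(a - b) for s in states for a, b in zip(s, states[0]))
```

All initial states here lie in the positive quadrant, so no convex combination can hit the origin and the projection is always defined.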

2602.03460 2026-02-04 math.OC cs.SY eess.SY

Cholesky factorisation, and intrinsically sparse linear quadratic regulation

Julia Adlercreutz, Richard Pates

Comments 15 pages, 7 figures, under review

We classify a family of matrices of shift operators that can be factorised in a computationally tractable manner with the Cholesky algorithm. Such matrices arise in the linear quadratic regulator problem, and related areas. We use the factorisation to uncover intrinsic sparsity properties in the control laws for transportation problems with an underlying tree structure. This reveals that the optimal control can be applied in a distributed manner that is obscured by standard solution methods.

2602.03398 2026-02-04 eess.AS

A Unified SVD-Modal Solution for Sparse Sound Field Reconstruction with Hybrid Spherical-Linear Microphone Arrays

Shunxi Xu, Thushara Abhayapala, Craig T. Jin

Comments Accepted by ICASSP 2026

We propose a data-driven sparse recovery framework for hybrid spherical-linear microphone arrays using singular value decomposition (SVD) of the transfer operator. The SVD yields orthogonal microphone and field modes, reducing to spherical harmonics (SH) in the case of a spherical microphone array (SMA) alone, while incorporating linear microphone arrays (LMAs) introduces complementary modes beyond SH. Modal analysis reveals consistent divergence from SH across frequency, confirming the improved spatial selectivity. Experiments in reverberant conditions show reduced energy-map mismatch and angular error across frequency, distance, and source count, outperforming SMA-only and direct concatenation. The results demonstrate that SVD-modal processing provides a principled and unified treatment of hybrid arrays for robust sparse sound-field reconstruction.

2602.03346 2026-02-04 eess.SY cs.SY

Dynamics of Implicit Time-Invariant Max-Min-Plus-Scaling Discrete-Event Systems

Sreeshma Markkassery, Ton van den Boom, Bart De Schutter

Comments 12 pages, Under review at Automatica

English abstract

Max-min-plus-scaling (MMPS) systems generalize max-plus, min-plus and max-min-plus models with more flexibility in modelling discrete-event dynamics. In particular, implicit MMPS models capture a wide range of real-world discrete-event applications. This article analyzes the dynamics of an autonomous, time-invariant implicit MMPS system in a discrete-event framework. First, we provide sufficient conditions under which an implicit MMPS system admits at least one solution to its state-space representation. Then, we analyze its global behavior by determining the key parameters: the growth rates and fixed points. For a solvable MMPS system, we assess the local behavior of the system around its set of fixed points via a normalization procedure. Further, we present the notion of stability for the normalized system. A case study of an urban railway network substantiates the theoretical results.

2602.03294 2026-02-04 cs.CV cs.RO eess.IV

LEVIO: Lightweight Embedded Visual Inertial Odometry for Resource-Constrained Devices

Jonas Kühne, Christian Vogt, Michele Magno, Luca Benini

Comments This article has been accepted for publication in the IEEE Sensors Journal (JSEN)

Journal ref IEEE Sensors Journal (Volume: 26, Issue: 3, 01 February 2026)

English abstract

Accurate, infrastructure-less sensor systems for motion tracking are essential for mobile robotics and augmented reality (AR) applications. The most popular state-of-the-art visual-inertial odometry (VIO) systems, however, are too computationally demanding for resource-constrained hardware, such as micro-drones and smart glasses. This work presents LEVIO, a fully featured VIO pipeline optimized for ultra-low-power compute platforms, allowing six-degrees-of-freedom (DoF) real-time sensing. LEVIO incorporates established VIO components such as Oriented FAST and Rotated BRIEF (ORB) feature tracking and bundle adjustment, while emphasizing a computationally efficient architecture with parallelization and low memory usage to suit embedded microcontrollers and low-power systems-on-chip (SoCs). The paper proposes and details the algorithmic design choices and the hardware-software co-optimization approach, and presents real-time performance on resource-constrained hardware. LEVIO is validated on a parallel-processing ultra-low-power RISC-V SoC, achieving 20 FPS while consuming less than 100 mW, and benchmarked against public VIO datasets, offering a compelling balance between efficiency and accuracy. To facilitate reproducibility and adoption, the complete implementation is released as open-source.

2602.03264 2026-02-04 cs.CV cs.LG eess.IV

HypCBC: Domain-Invariant Hyperbolic Cross-Branch Consistency for Generalizable Medical Image Analysis

Francesco Di Salvo, Sebastian Doerrich, Jonas Alle, Christian Ledig

Comments Accepted to Transactions on Machine Learning Research (TMLR)

English abstract

Robust generalization beyond training distributions remains a critical challenge for deep neural networks. This is especially pronounced in medical image analysis, where data is often scarce and covariate shifts arise from different hardware devices, imaging protocols, and heterogeneous patient populations. These factors collectively hinder reliable performance and slow down clinical adoption. Despite recent progress, existing learning paradigms primarily rely on the Euclidean manifold, whose flat geometry fails to capture the complex, hierarchical structures present in clinical data. In this work, we exploit the advantages of hyperbolic manifolds to model complex data characteristics. We present the first comprehensive validation of hyperbolic representation learning for medical image analysis and demonstrate statistically significant gains across eleven in-distribution datasets and three ViT models. We further propose an unsupervised, domain-invariant hyperbolic cross-branch consistency constraint. Extensive experiments confirm that our proposed method promotes domain-invariant features and outperforms state-of-the-art Euclidean methods by an average of $+2.1\%$ AUC on three domain generalization benchmarks: Fitzpatrick17k, Camelyon17-WILDS, and a cross-dataset setup for retinal imaging. These datasets span different imaging modalities, data sizes, and label granularities, confirming generalization capabilities across substantially different conditions. The code is available at https://github.com/francescodisalvo05/hyperbolic-cross-branch-consistency .

2602.03246 2026-02-04 cs.DC cs.NI cs.SY eess.SY math.OC

Joint Network-and-Server Congestion in Multi-Source Traffic Allocation: A Convex Formulation and Price-Based Decentralization

Tamoghna Sarkar, Bhaskar Krishnamachari

Comments 10 pages, 7 figures; a version has been submitted to a conference

English abstract

This paper studies an important rate allocation problem that arises in many networked and distributed systems: steady-state traffic rate allocation from multiple sources to multiple service nodes when both (i) the access-path delay on each source-node route is rate-dependent (capacity-constrained) and convex, and (ii) each service node (also capacity-constrained) experiences a load-dependent queueing delay driven by aggregate load from all sources. We show that the resulting flow-weighted end-to-end delay minimization is a convex program, yielding a global system-optimal solution characterized by KKT conditions that equalize total marginal costs (a path marginal access term plus a node congestion price) across all utilized routes. This condition admits a Wardrop-type interpretation: for each source, all utilized options equalize total marginal cost, while any option with strictly larger total marginal cost receives no flow. Building on this structure, we develop a lightweight distributed pricing-based algorithm in which each service node locally computes and broadcasts a scalar congestion price from its observed aggregate load, while each source updates its traffic split by solving a small separable convex allocation problem under the advertised prices. Numerical illustrations demonstrate convergence of the distributed iteration to the centralized optimum and highlight the trade-offs induced by jointly modeling access and service congestion.
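The marginal-cost condition described above can be written out under assumed notation; none of these symbols appear in the abstract. Here $x_{ij}$ is the rate from source $i$ to service node $j$, $r_i$ is the total rate of source $i$, $y_j$ is the aggregate load at node $j$, $d_{ij}$ is the access-path delay, $D_j$ is the node queueing delay, and $\lambda_i$ is the multiplier on the demand constraint of source $i$. As a simplification, capacity limits are assumed to be enforced implicitly by delay functions that grow without bound near capacity.

```latex
\begin{aligned}
\min_{x \ge 0}\quad & \sum_{i}\sum_{j} x_{ij}\,\bigl[d_{ij}(x_{ij}) + D_j(y_j)\bigr],
\qquad y_j = \sum_{i} x_{ij}, \qquad \sum_{j} x_{ij} = r_i \\[4pt]
\text{KKT:}\quad &
\underbrace{d_{ij}(x_{ij}) + x_{ij}\,d_{ij}'(x_{ij})}_{\text{path marginal access term}}
\;+\;
\underbrace{D_j(y_j) + y_j\,D_j'(y_j)}_{\text{node congestion price } p_j}
\;\begin{cases} = \lambda_i, & x_{ij} > 0 \\ \ge \lambda_i, & x_{ij} = 0 \end{cases}
\end{aligned}
```

In this form the price $p_j$ depends only on the aggregate load $y_j$, which is consistent with each service node computing it locally, and the two cases give the Wardrop-type reading: used routes share the same total marginal cost, and costlier routes carry no flow.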

2602.03245 2026-02-04 eess.AS cs.CL

Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect

Nikola Ljubešić, Peter Rupnik, Tea Perinčić

Comments 2 figures, 14 pages, accepted and presented at JTDH 2024

English abstract

This paper documents our efforts in releasing the printed and audio book of the translation of the famous novel The Little Prince into the Chakavian dialect as a computer-readable, AI-ready dataset, with the textual and audio components of the two releases now aligned at the level of each written and spoken word. We have several motivations for this release. The first is our wish to preserve this highly valuable and specific content beyond the small editions of the printed and audio book. With the dataset published in the CLARIN.SI repository, this content is now at the fingertips of any interested individual. The second motivation is to make the data available for various artificial-intelligence-related usage scenarios, such as the one we already pursue in this paper: adapting the Whisper-large-v3 open automatic speech recognition model, which performs decently on standard Croatian, to Chakavian dialectal speech. We can happily report that by adapting the model, the word error rate on the selected test data has been reduced by half, while we managed to remove up to two thirds of the error at the character level. We envision many more uses of this dataset beyond the experiments we have already performed, both in artificial intelligence research and applications and in dialectal research. The third motivation for this release is our hope that this now highly structured dataset will be transformed into a digital online edition of this work, allowing individuals beyond the research and technology communities to enjoy the beauty of the message of the little boy in the desert, told through the spectacular prism of the Chakavian dialect.

2602.03119 2026-02-04 cs.LG eess.SP

Function-Space Empirical Bayes Regularisation with Large Vision-Language Model Priors

Pengcheng Hao, Huaze Tang, Ercan Engin Kuruoglu, Wenbo Ding

English abstract

Bayesian deep learning (BDL) provides a principled framework for reliable uncertainty quantification by combining deep neural networks with Bayesian inference. A central challenge in BDL lies in the design of informative prior distributions that scale effectively to high-dimensional data. Recent functional variational inference (VI) approaches address this issue by imposing priors directly in function space; however, most existing methods rely on Gaussian process (GP) priors, whose expressiveness and generalisation capabilities become limited in high-dimensional regimes. In this work, we propose VLM-FS-EB, a novel function-space empirical Bayes regularisation framework that leverages large vision-language models (VLMs) to generate semantically meaningful context points. These synthetic samples are then embedded using VLMs to construct expressive functional priors. The proposed method is evaluated against various baselines, and experimental results demonstrate that it consistently improves predictive performance and yields more reliable uncertainty estimates, particularly in out-of-distribution (OOD) detection tasks and data-scarce regimes.

2602.03082 2026-02-04 cs.LG cs.SY eess.SY math.OC

Geometry-Preserving Neural Architectures on Manifolds with Boundary

Karthik Elamvazhuthi, Shiba Biswal, Kian Rosenblum, Arushi Katyal, Tianli Qu, Grady Ma, Rishi Sonthalia

English abstract

Preserving geometric structure is important in learning. We propose a unified class of geometry-aware architectures that interleave geometric updates between layers, where both projection layers and intrinsic exponential map updates arise as discretizations of projected dynamical systems on manifolds (with or without boundary). Within this framework, we establish universal approximation results for constrained neural ODEs. We also analyze architectures that enforce geometry only at the output, proving a separate universal approximation property that enables direct comparison to interleaved designs. When the constraint set is unknown, we learn projections via small-time heat-kernel limits, showing diffusion/flow-matching can be used as data-based projections. Experiments on dynamics over $S^2$ and $SO(3)$, and diffusion on $S^{d-1}$-valued features, demonstrate exact feasibility for analytic updates and strong performance for learned projections.
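The two kinds of geometric update mentioned above, a projection step and an intrinsic exponential-map step, can be sketched on the unit sphere $S^2$. This is a minimal illustration using the standard closed-form sphere formulas; the function names and the sample vectors are hypothetical, and the abstract does not specify the paper's actual layers.

```python
import numpy as np

def tangent_project(x, v):
    """Remove the component of v along x, leaving a tangent vector at x on the unit sphere."""
    return v - np.dot(x, v) * x

def exp_map_step(x, v):
    """Intrinsic update: follow the geodesic from x in the tangent direction of v."""
    v = tangent_project(x, v)
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:  # no tangent motion: stay at x
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)

def projection_step(x, v):
    """Extrinsic update: take a Euclidean step, then project radially back onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

x = np.array([0.0, 0.0, 1.0])   # point on S^2
v = np.array([0.3, -0.2, 0.5])  # unconstrained update, e.g. a layer output (assumed values)

x_exp = exp_map_step(x, v)
x_proj = projection_step(x, v)
print(np.linalg.norm(x_exp), np.linalg.norm(x_proj))
```

Both updates return points on the sphere up to floating-point error, which is the sense in which analytic updates give exact feasibility.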

2602.03055 2026-02-04 eess.SP stat.ML

Stationarity and Spectral Characterization of Random Signals on Simplicial Complexes

Madeline Navarro, Andrei Buciulea, Santiago Segarra, Antonio Marques

English abstract

It is increasingly common for data to possess intricate structure, necessitating new models and analytical tools. Graphs, a prominent type of structure, can encode the relationships between any two entities (nodes). However, graphs neither allow connections that are not dyadic nor permit relationships between sets of nodes. We thus turn to simplicial complexes for connecting more than two nodes as well as modeling relationships between simplices, such as edges and triangles. Our data then consist of signals lying on topological spaces, represented by simplicial complexes. Much recent work explores these topological signals, albeit primarily through deterministic formulations. We propose a probabilistic framework for random signals defined on simplicial complexes. Specifically, we generalize the classical notion of stationarity. By spectral dualities of Hodge and Dirac theory, we define stationary topological signals as the outputs of topological filters given white noise. This definition naturally extends desirable properties of stationarity that hold for both time-series and graph signals. Crucially, we properly define topological power spectral density (PSD) through a clear spectral characterization. We then discuss the advantages of topological stationarity due to spectral properties via the PSD. In addition, we empirically demonstrate the practicality of these benefits through multiple synthetic and real-world simulations.
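The idea of a stationary topological signal as filtered white noise can be sketched for edge signals on a small simplicial complex. This is an illustration under stated assumptions, not the paper's definition: the filter is taken to be a polynomial in the Hodge 1-Laplacian, the complex and the coefficients `h` are made up for the example, and the paper may use separate lower and upper Laplacians or the Dirac operator instead.

```python
import numpy as np

# Toy complex (assumed): nodes 0..3, edges (0,1),(0,2),(1,2),(2,3), one filled triangle (0,1,2).
# B1: node-to-edge incidence matrix; each edge (i,j) with i<j is oriented from i to j.
B1 = np.array([
    [-1, -1,  0,  0],
    [ 1,  0, -1,  0],
    [ 0,  1,  1, -1],
    [ 0,  0,  0,  1],
], dtype=float)
# B2: edge-to-triangle incidence; boundary of (0,1,2) is (0,1) - (0,2) + (1,2).
B2 = np.array([[1], [-1], [1], [0]], dtype=float)

# Hodge 1-Laplacian acting on edge signals.
L1 = B1.T @ B1 + B2 @ B2.T

# Polynomial topological filter H = h0*I + h1*L1 + h2*L1^2 (coefficients are arbitrary).
h = [1.0, -0.3, 0.05]
I = np.eye(L1.shape[0])
H = h[0] * I + h[1] * L1 + h[2] * (L1 @ L1)

# Covariance of x = H w for white noise w with identity covariance.
C = H @ H.T

# In the Laplacian eigenbasis the covariance is diagonal; its diagonal plays the role of a PSD.
lam, U = np.linalg.eigh(L1)
psd = (h[0] + h[1] * lam + h[2] * lam**2) ** 2
C_spec = U.T @ C @ U
print(np.round(np.diag(C_spec), 4))
```

The covariance is diagonalized by the eigenvectors of the Laplacian, mirroring how stationarity of time series and graph signals is characterized in their respective spectral domains.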