arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.06631 2026-05-08 eess.AS

Task-Aware Answer Preservation under Audio Compression for Large Audio Language Models

Amir Ivry

Comments Preprint

Abstract

Large audio language models (LALMs) are increasingly used to reason over long audio clips, yet deployment often compresses audio before inference to reduce memory and latency. The risk is that compression can leave aggregate accuracy acceptable while sharply degrading answers for a deployment-critical query family. We study answer-preserving audio compression, judging a compressor by the excess answer-error it induces, especially for the worst-affected family. We formulate this theoretically as a compressor acceptance-rejection criterion, derive a practical sign-off protocol that returns compression budgets satisfying worst-family checks with statistical confidence, and evaluate it on five multiple-choice audio question-answering benchmarks with two Qwen-based backbones. The protocol exposes hidden family-level damage, shows that the chosen query-family partition can change the approved budget, and identifies regimes where query-conditioned compression helps maintain answer preservation.
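The worst-family check described in this abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the function names, the use of a Hoeffding bound, and the union bound over families are all assumptions made for illustration. Each query contributes a paired difference (wrong after compression minus wrong before), and a compression budget is approved only if the upper confidence bound on the excess error of the worst family stays within a tolerance.

```python
import math


def hoeffding_radius(n, delta):
    """One-sided Hoeffding radius for the mean of n values lying in [-1, 1]."""
    # Each paired difference has range 2, so P(mean - E >= t) <= exp(-n t^2 / 2).
    return math.sqrt(2.0 * math.log(1.0 / delta) / n)


def sign_off(families, tolerance, delta=0.05):
    """Approve a compressor only if every query family passes (hypothetical helper).

    families: dict mapping family name -> list of (wrong_before, wrong_after)
              booleans, one pair per query.
    Returns (approved, worst_family_name, worst_upper_bound).
    """
    k = len(families)
    worst_name, worst_ucb = None, -float("inf")
    for name, pairs in families.items():
        # Excess answer-error per query: +1 newly wrong, -1 newly right, 0 unchanged.
        diffs = [int(after) - int(before) for before, after in pairs]
        mean_excess = sum(diffs) / len(diffs)
        # Union bound: split the failure probability evenly across the k families.
        ucb = mean_excess + hoeffding_radius(len(diffs), delta / k)
        if ucb > worst_ucb:
            worst_name, worst_ucb = name, ucb
    return worst_ucb <= tolerance, worst_name, worst_ucb
```

Because the decision depends on the maximum over families, a regrouping of the same queries into different families can change which budget is approved, which is consistent with the partition sensitivity the abstract reports.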

2605.06630 2026-05-08 eess.SY cs.SY

Quantifying Trade-Offs Between Stability and Goal-Obfuscation

Yixuan Wang, Dan Guralnik, Warren Dixon

Comments 11 pages

Abstract

Safety-critical autonomy in adversarial settings demands more than Lyapunov stability of tracking error signals. An agent executing a goal-directed trajectory is intrinsically legible to a passive observer running online Bayesian inference, because the contractive dynamics of any Lyapunov basin of attraction concentrates posterior belief over the latent intent parameters. We initiate the study of intent privacy over a continuous state space as a joint control problem on the physical state combined with the latent belief state of a putative observer. With the main challenges concentrated around the analysis of the belief-state dynamics, the agent dynamics is assumed to be simple, modeled by the differential inclusion $\dot{x}\in u+\bar{d}\mathbb{B}$. That is, the agent is fully actuated with a bounded unknown disturbance to the control input. The observer's intent inference process is modeled as a discrete-time stochastic dynamical system evolving over the belief state space of a Rao-Blackwellized particle filter (RBPF) reasoning over large random samples of possible agent goals. The agent's control input is modeled as a piecewise constant signal, with jumps matching the RBPF update times. Building on a prior intent-inference framework and its KL-based information leakage measurement, a privacy constraint is imposed, which amounts to maintaining information leakage above a prescribed threshold with high probability, using probabilistic discrete-time control barrier functions (PCBFs). A key technical contribution is the derivation of separate PCBF results for the Bayesian update step and the resampling step of the RBPF, enabling a PCBF result for the full update as well as integration of the privacy constraint with the agent's task-side tracking requirement. Finally, a joint feasibility analysis is carried out by examining the interplay between the privacy constraint and the tracking envelope.

2605.06628 2026-05-08 eess.IV cs.LG cs.MM eess.AS eess.SP

LiVeAction: a Lightweight, Versatile, and Asymmetric Neural Codec Design for Real-time Operation

Dan Jacobellis, Neeraja J. Yadwadkar

Comments DCC 2026

Abstract

Modern sensors generate rich, high-fidelity data, yet applications operating on wearable or remote sensing devices remain constrained by bandwidth and power budgets. Standardized codecs such as JPEG and MPEG achieve efficient trade-offs between bitrate and perceptual quality but are designed for human perception, limiting their applicability to machine-perception tasks and non-traditional modalities such as spatial audio arrays, hyperspectral images, and 3D medical images. General-purpose compression schemes based on scalar quantization or resolution reduction are broadly applicable but fail to exploit inherent signal redundancies, resulting in suboptimal rate-distortion performance. Recent generative neural codecs, or tokenizers, model complex signal dependencies but are often over-parameterized, data-hungry, and modality-specific, making them impractical for resource-constrained environments. We introduce a Lightweight, Versatile, and Asymmetric neural codec architecture (LiVeAction) that addresses these limitations through two key ideas. (1) To reduce the complexity of the encoder to meet the resource constraints of the execution environments, we impose an FFT-like structure and reduce the overall size and depth of the neural-network-based analysis transform. (2) To allow arbitrary signal modalities and simplify training, we replace adversarial and perceptual losses with a variance-based rate penalty. Our design produces codecs that deliver superior rate-distortion performance compared to state-of-the-art generative tokenizers, while remaining practical for deployment on low-power sensors. We release our code, experiments, and Python library at https://github.com/UT-SysML/liveaction.

2605.06599 2026-05-08 cs.LG eess.AS

Weight-Decay Turns Transformer Loss Landscapes Villani: Functional-Analytic Foundations for Optimization and Generalization

Abhijit Das, Sayantan Dutta

Comments 17 pages, 10 figures

Abstract

Weight decay is widely used as a regularizer in large language models, yet its precise role in shaping Transformer loss landscapes remains theoretically underexplored. This paper provides the first rigorous functional-analytic characterization of the standard Transformer objective (cross-entropy loss with $L^2$ regularization) by proving it satisfies Villani's criteria for coercive energy functions. Specifically, we show that the regularized loss $\mathcal{F}$ is infinitely differentiable, grows at least quadratically, has Gaussian-integrable tails, and satisfies the differential growth condition $-\Delta\mathcal{F} + \tfrac{1}{s}\|\nabla\mathcal{F}\|^{2} \to \infty$ as $\|\theta\| \to \infty$ for all $s>0$. From this structure, we derive explicit log-Sobolev and Poincaré constants $C_{\mathrm{LS}} \leq \lambda^{-1} + d/\lambda^{2}$, linking the regularization strength $\lambda$ and model dimension $d$ to finite-time convergence guarantees for noisy stochastic gradient descent and PAC-Bayesian generalization bounds that tighten with increasing $\lambda$. To validate our theory, we introduce a scalable Villani diagnostic $\Psi_s(\theta) = -\Delta\mathcal{F} + s^{-1}\|\nabla \mathcal{F}\|^2$ and estimate it efficiently using Hutchinson trace probes in models with over 100M parameters. Experiments on GPT-Neo-125M across Penn Treebank and WikiText-103 confirm the predicted quadratic growth of $\Psi_s$, spectral inflation of the Hessian, and exponential convergence behavior consistent with our log-Sobolev analysis. These results demonstrate that weight decay not only improves generalization empirically but also establishes the mathematical conditions required for fast Langevin mixing and theoretically grounded curvature-aware optimization in deep learning.
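As a rough illustration of how the Villani diagnostic could be estimated with Hutchinson trace probes, the sketch below uses a toy quadratic objective F(theta) = 0.5 * theta^T A theta, whose gradient is A theta and whose Hessian is A. The toy objective, the function names, and the explicit matrix-vector product are assumptions for illustration; a real model would use automatic-differentiation Hessian-vector products instead, and this is not the authors' implementation.

```python
import random


def matvec(a, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def hutchinson_trace(hvp, dim, num_probes, rng):
    """Estimate tr(H) as the average of z^T H z over Rademacher probes z."""
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)  # Hessian-vector product supplied by the caller
        total += sum(z_i * h_i for z_i, h_i in zip(z, hz))
    return total / num_probes


def villani_diagnostic(a, theta, s, num_probes=64, seed=0):
    """Psi_s(theta) = -Laplacian(F) + ||grad F||^2 / s for F = 0.5 theta^T A theta."""
    rng = random.Random(seed)
    grad = matvec(a, theta)
    laplacian = hutchinson_trace(lambda v: matvec(a, v), len(theta), num_probes, rng)
    grad_sq = sum(g * g for g in grad)
    return -laplacian + grad_sq / s
```

For this toy objective the Laplacian is constant while the squared gradient norm scales with the squared norm of theta, so the diagnostic grows quadratically as theta moves away from the origin.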

2605.06578 2026-05-08 eess.SP

Resource-Efficient CSI Prediction: A Gated Fusion and Factorized Projection Approach

Mohammad Hussain, Maedeh Adibag, Dilara Gurer, Gokhan Kalem, Kerim Serin, Sinem Coleri

Comments Accepted for publication in IEEE Communications Letters. 5 pages, 2 figures

Abstract

Accurate Channel State Information (CSI) prediction is essential for dynamic multiple-input multiple-output (MIMO) systems but remains computationally demanding. This letter proposes a resource-efficient predictor that combines a gated recurrent unit (GRU) encoder with Luong attention, a bottleneck gated fusion module, and a Dimension-wise Separable Linear Head (DSLH). The gated fusion module integrates local recurrent features with global attention context, while the DSLH reduces the cost of the output mapping. Evaluated on 3GPP TR 38.901-compliant channels, the proposed model achieves an average NMSE of -13.84 dB with 26% fewer parameters and approximately 2.3x higher inference throughput than a dimension-matched LinFormer baseline. The proposed model is best suited to LOS and mixed-condition scenarios, offering a practical accuracy-efficiency trade-off for short-horizon CSI prediction at moderate sequence lengths.
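For readers unfamiliar with the metric, the sketch below computes a normalized mean squared error (NMSE) in dB for complex channel coefficients. The definition used here (total squared error divided by total reference energy) is a common convention and is an assumption; the letter may average over samples or antennas differently.

```python
import math


def nmse_db(true_csi, predicted_csi):
    """NMSE in dB: 10 * log10(sum |h - h_hat|^2 / sum |h|^2)."""
    err = sum(abs(h - p) ** 2 for h, p in zip(true_csi, predicted_csi))
    ref = sum(abs(h) ** 2 for h in true_csi)
    return 10.0 * math.log10(err / ref)
```

Under this definition, an NMSE of -13.84 dB corresponds to a prediction error energy of roughly 0.04 times the channel energy.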

2605.06532 2026-05-08 eess.IV

Histogramless Time-Domain Sketched Fluorescence Lifetime Imaging

Zhenya Zang, Istvan Gyongy, Mike Davies

Abstract

We present a statistics-aware compression strategy that processes photon timestamps directly from time-correlated single-photon counting (TCSPC) modules for time-domain fluorescence lifetime imaging (FLIM). Rather than storing or transmitting the full histogram per pixel, timestamps are projected onto sparse, non-uniform one-dimensional spline sketches, with knot positions optimally allocated based on Fisher information. This knot allocation concentrates sketch channels where the decay signal exhibits the greatest statistical discriminability, rather than using a uniform allocation. The proposed approach is extensively validated on synthetic mono- and bi-exponential decay data and on experimental fluorescent dye data, demonstrating comparable accuracy to full-histogram non-linear least-squares fitting (NLSF) and Poisson maximum-likelihood estimation (MLE) at compression ratios of up to 256x. We further validate the feasibility of integrating the timestamp-to-sketch projection directly into firmware via fixed-point (FXP) lookup-table (LUT) simulation, targeting high-spatial-resolution single-photon avalanche diode (SPAD) arrays subject to significant data-throughput constraints.

2605.06495 2026-05-08 math.OC cs.SY eess.SY

Global self-optimizing control of batch processes

Chenchen Zhou, Hongxin Su, Xinhui Tang, Yi Cao, Shuang-hua Yang

Journal ref
Journal of Process Control, Volume 135, March 2024, 103163
Abstract

This work aims to achieve near-optimal operation for a class of batch processes by employing self-optimizing control (SOC). Compared with a continuous process, a batch process exhibits stronger nonlinearity and dynamics because of its non-steady operating condition. This necessitates a global version of SOC to achieve satisfactory performance. Meanwhile, it also makes the existing global SOC (gSOC) not directly applicable to batch processes due to the causality amongst variables. Therefore, it is necessary to extend the original gSOC to batch processes. In addition to the nonconvexity challenge of the original gSOC problem, the new extension for batch processes has to face even more challenges. In particular, the causality due to the dynamics of batch processes imposes structural constraints on controlled variables (CVs), making the CV selection problem even more difficult. To address these challenges, the gSOC problem is recast in a vectorized formulation, and it is proved that the structural constraints considered are linear in the vectorized formulation. Moreover, a novel shortcut method is proposed to efficiently find sub-optimal but more transparent solutions for this problem. The effectiveness of the new approach is validated through a case study of a fed-batch reactor, where CVs are constructed through a combination matrix with a repetitive structure, resulting in a simple SOC scheme. This simplicity facilitates the implementation of the SOC approach and enhances its practical applicability and robustness.

2605.06469 2026-05-08 math.OC cs.LG cs.SY eess.SY

Dynamic Controlled Variables Based Dynamic Self-Optimizing Control

Chenchen Zhou, Shaoqi Wang, Hongxin Su, Xinhui Tang, Yi Cao, Shuang-Hua Yang

Journal ref
Journal of Process Control, 2024, 138: 103228
Abstract

Self-optimizing control is a strategy for selecting controlled variables, where the economic objective guides the selection and design of controlled variables, with the expectation that maintaining the controlled variables at constant values can achieve optimization effects, translating the process optimization problem into a process control problem. Currently, self-optimizing control is widely applied to steady-state optimization problems. However, the development of process systems exhibits a trend towards refinement, highlighting the importance of optimizing dynamic processes such as batch processes and grade transitions. This paper formally introduces the self-optimizing control problem for dynamic optimization, termed the dynamic self-optimizing control problem, extending the original definition of self-optimizing control. A novel concept, "dynamic controlled variables" (DCVs), is proposed, and an implicit control policy is presented based on this concept. The paper theoretically analyzes the advantages and generality of DCVs compared to explicit control strategies and elucidates the relationship between DCVs and traditional controllers. Moreover, this paper puts forth a data-driven approach to designing self-optimizing DCVs, which considers DCV design as a mapping identification problem and employs deep neural networks to parameterize the variables. Three case studies validate the efficacy and superiority of DCVs in approximating multi-valued and discontinuous functions, as well as their application to dynamic optimization problems with non-fixed horizons, which traditional self-optimizing control methods are unable to address.

2605.06448 2026-05-08 math.OC cs.SY eess.SY

Performance guaranteed MPC Policy Approximation via Cost Guided Learning

Chenchen Zhou, Yi Cao, Shuang-hua Yang

Journal ref
IEEE Control Systems Letters, 2024, 8: 346-351
Abstract

Model predictive control (MPC) is widely used in industry, but implementing it poses challenges due to hardware or time constraints. A promising solution is to approximate the MPC policy using function approximators such as neural networks. Existing methods focus on minimizing the error between the approximator's outputs and the MPC optimal control actions on training data, which is called the error-guided learning approach in this paper. However, the goal of control law design is not to minimize the fitting error but to minimize the operation cost. This paper proposes a novel cost-guided learning approach that utilizes the cost sensitivity information from the MPC problem to directly minimize the loss in closed-loop performance. A theoretical analysis shows that cost-guided learning provides tighter guarantees on optimality loss compared to traditional error-guided learning. Experiments on a continuous stirred tank reactor (CSTR) benchmark demonstrate that the proposed technique results in approximate MPC policies that achieve substantially better closed-loop performance. This work makes an important contribution by connecting the fitting errors with operational objectives, overcoming key limitations of existing approximation methods. The core idea could be applied more broadly for data-driven control.

2605.06442 2026-05-08 eess.SY cs.SY

Probabilistic Assessment of Rare Transient Instability Events via Kriging-based Active Learning Framework

Jingyu Liu, Xiaoting Wang, Xiaozhe Wang

Comments Accepted by International Journal of Electrical Power and Energy Systems for future publication

Abstract

The increasing uncertainty in modern power systems, driven by the integration of intermittent energy sources and variable loads, underscores the need for probabilistic transient stability assessment. However, existing assessment methods primarily focus on average system stability behavior and may struggle or incur high computational cost when identifying rare transient instability events, which in turn are critical for ensuring system resilience. To address this, the paper proposes a Kriging-based active learning framework to accurately characterize rare instability regions within the input uncertainty space and estimate the associated small instability probability, while requiring only a limited number of expensive time-domain simulations. The proposed active learning (AL) framework is tested on a modified IEEE 59-bus system with simulated load and wind uncertainties, and a WECC 240-bus system incorporating real-world wind and solar generation data. Comparative studies with the existing random forest-based active learning method and three non-AL methods demonstrate that the proposed AL framework achieves superior accuracy and computational efficiency.

2605.06437 2026-05-08 eess.SY cs.SY

Distributed Online Learning for Time-Critical Communication in 6G Industrial Subnetworks

Samira Abdelrahman, Hossam Farag, Gilberto Berardinelli

Abstract

6G industrial in-X subnetworks are expected to support highly time-critical alarm reporting in large-scale environments characterized by mobility, bursty event-driven traffic, and limited radio resources. In such settings, conventional medium access solutions are ill-suited to guarantee reliable delivery of critical traffic, e.g., emergency alarms, within strict deadlines, especially when multiple subnetworks become simultaneously active after a common alarm event, a scenario widely referred to as medium access with a shared message. This paper proposes a distributed deep reinforcement learning (DRL)-based medium access control protocol for timely alarm transmission in time-critical industrial subnetworks. The proposed method enables each local access point (LAP) to learn, in an online manner, to infer contention conditions from a broadcast contention-signature signal and to autonomously select a transmission pattern over the available channels using a lightweight deep neural network and an $\epsilon$-greedy policy. Simulation results demonstrate that the proposed approach consistently achieves a higher probability of in-time alarm delivery than benchmark random-access schemes, while exhibiting better scalability with increasing network density. For instance, the proposed method improves the probability of in-time alarm delivery by at least 7% with a network size of 40 subnetworks, while the gain increases to 21% when the number of subnetworks increases to 60.
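The exploration rule named in this abstract can be sketched generically. The function below is a textbook epsilon-greedy selector over a list of action-value estimates. Treating each candidate transmission pattern as one action index, and the names used here, are assumptions for illustration rather than details taken from the paper.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick an action index: explore with probability epsilon, else exploit.

    q_values: estimated value of each candidate action (for example, the
              outputs of a small neural network, one per transmission pattern).
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With epsilon set to 0 the rule is purely greedy, and with epsilon set to 1 it is uniform random access, so intermediate values trade off learning about untried patterns against using the best known one.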

2605.06419 2026-05-08 eess.SY cs.SY

Residual-Corrected Equivalent-Circuit Model with Universal Differential Equations for Robust Battery Voltage Prediction under Operating-Condition Shift

Alexandre Barbosa de Lima, Roberta Vieira Raggi

Abstract

Accurate terminal-voltage prediction underpins model-based battery management, yet low-order equivalent-circuit models (ECM) lack expressiveness under transient conditions, whereas purely data-driven predictors sacrifice interpretability and may degrade under operating-condition shift. This paper introduces a residual-corrected hybrid formulation in which a first-order Thevenin ECM (ECM-RC) provides the dominant voltage structure, and a compact neural network embedded as a universal differential equation (UDE) corrects only the latent polarization mismatch. The ECM-RC parameters identified by nonlinear least squares warm-start the hybrid model so that the learned component operates in a low-residual regime. Experiments on a public Panasonic 18650PF dataset compare the proposed ECM-UDE with standalone ECM-RC and Long Short-Term Memory (LSTM) baselines across four axes: matched-condition prediction on UDDS at 25 °C, inference-time perturbation of the supplied state-of-charge (SOC, denoted $z$) input, zero-shot temperature transfer (25 °C to -20 °C), and zero-shot drive-cycle transfer to US06, LA92, and HWFET. The proposed ECM-UDE achieves the lowest voltage error in every setting, reducing mean absolute error (MAE) by 48% relative to the LSTM under matched conditions and showing an order-of-magnitude lower inter-seed variability (coefficient of variation: 0.44% vs. 6.20%). Substantial gains persist under challenging distribution shifts, indicating that the physical model anchors prediction where a purely learned model is most vulnerable. These results position residual-corrected ECM-UDE as a lightweight and interpretable enhancement of low-order circuit models for voltage prediction in battery management systems (BMS).
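To make the physical backbone concrete, the sketch below simulates a textbook first-order Thevenin equivalent-circuit model: terminal voltage equals the open-circuit voltage minus the ohmic drop minus the polarization voltage across one RC pair. All parameter values, the linear open-circuit-voltage curve in the usage note, and the function name are hypothetical, and the learned residual correction described in the abstract is deliberately left out.

```python
import math


def simulate_thevenin(currents, dt, r0, r1, c1, capacity_as, z0, ocv):
    """Simulate terminal voltage for a first-order Thevenin model.

    currents:    applied current per step in amperes (discharge is positive)
    dt:          step length in seconds
    r0, r1, c1:  series resistance, RC-pair resistance, RC-pair capacitance
    capacity_as: cell capacity in ampere-seconds
    z0:          initial state of charge in [0, 1]
    ocv:         function mapping state of charge to open-circuit voltage
    """
    # Exact discretization of dv1/dt = -v1 / (r1 * c1) + i / c1 for constant i per step.
    a = math.exp(-dt / (r1 * c1))
    z, v1 = z0, 0.0
    voltages = []
    for i in currents:
        voltages.append(ocv(z) - r0 * i - v1)
        v1 = a * v1 + r1 * (1.0 - a) * i  # polarization voltage update
        z -= i * dt / capacity_as         # coulomb counting
    return voltages
```

For example, with a hypothetical linear curve `ocv = lambda z: 3.0 + 1.2 * z`, a constant 1 A discharge produces an immediate ohmic drop followed by a slower decline as the polarization voltage builds up and the state of charge falls.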

2605.06407 2026-05-08 eess.AS cs.AI cs.CL

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling

Guanrou Yang, Tian Tan, Qian Chen, Zhikang Niu, Yakun Song, Ziyang Ma, Yushen Chen, Zeyu Xie, Tianrui Wang, Yifan Yang, Wenxi Chen, Qi Chen, Wenrui Liu, Shan Yang, Xie Chen

Abstract

Integrating speech understanding and generation is a pivotal step toward building unified speech models. However, the different representations required for these two tasks currently pose significant compatibility challenges. Typically, semantics-oriented features are learned from self-supervised learning (SSL), and acoustic-oriented features from reconstruction. Such fragmented representations hinder the realization of truly unified speech systems. We present WavCube, a compact continuous latent derived from an SSL speech encoder that simultaneously supports speech understanding, reconstruction, and generation. WavCube employs a two-stage training scheme. Stage 1 trains a semantic bottleneck to filter off-manifold redundancy that makes raw SSL features intractable for diffusion. Stage 2 injects fine-grained acoustic details via end-to-end reconstruction, while a semantic anchoring loss ensures the representation remains grounded within its original semantic manifold. Comprehensive experiments show that WavCube closely approaches WavLM performance on SUPERB despite an 8x dimensional compression, attains reconstruction quality on par with existing acoustic representations, delivers state-of-the-art zero-shot TTS performance with markedly faster training convergence, and excels in speech enhancement, separation, and voice conversion tasks on the SUPERB-SG benchmark. Systematic ablations reveal that WavCube's two-stage recipe resolves two intrinsic flaws of SSL features for generative modeling, paving the way for future unified speech systems. Code and checkpoints are available at https://github.com/yanghaha0908/WavCube.

2605.06391 2026-05-08 math.OC cs.SY eess.SY

Unbalanced Optimal Transport and Density Control for Discrete-Time Linear Systems

Haruto Nakashima, Siddhartha Ganguly, Kenji Kashima

Comments To appear in the Proceedings of MTNS 2026 (extended abstracts). Submitted on February 15, 2026; accepted on April 20, 2026. A significantly expanded version containing additional theoretical results, complete proofs, and numerical experiments, is available at: arXiv:2605.04246v1

Abstract

This article studies unbalanced optimal transport (UOT) and its dynamical extension, unbalanced density control (UDC), for a class of constrained discrete-time linear systems. UOT compares measures with unequal total mass by balancing transport cost and fidelity to reference measures, while UDC incorporates system dynamics and constraints into this framework. Focusing on Gaussian references and discrete-time linear systems, we show that both problems admit globally optimal convex formulations, analogous to covariance steering. A numerical experiment is provided to illustrate our approach.

2605.06359 2026-05-08 eess.SP cs.CV

The frame-level leakage trap: rethinking evaluation protocols for intrinsic image decomposition, with source-separable uncertainty as a case study

Jihwan Woo

Comments Submitted to Journal of Electronic Imaging. 25 pages, 10 figures. Addresses evaluation protocol issues in intrinsic image decomposition and proposes source-separable uncertainty estimation

Abstract

Evaluation protocols for learned intrinsic image decomposition on MPI Sintel have been inconsistent. Several prior works split the dataset by frames, which allows spatially similar frames of the same scene to appear in both train and test partitions. We quantify this leakage effect for the first time, across three architectures: a frame-level split inflates test R_PSNR by 1.6 to 2.0 dB (p < 0.01 for all three, paired t-test across 3 seeds) relative to a scene-level split, confirming an architecture-independent protocol effect. A three-point gradient (random/temporal/scene) shows the gap is continuous, and under extended training the frame-level inflation exceeds 10 dB. We advocate scene-level splits as the community standard and provide reference numbers for six representative models under this protocol. As a case study within the corrected protocol, we present a physics-informed decomposition I = R composed with S + N with a source-separable three-way heteroscedastic uncertainty head. We empirically verify channel specialization: the non-Lambertian uncertainty channel shows r = 0.67 cross-correlation with non-Lambertian residual error, more than 4 times the texture channel's correlation. We further demonstrate downstream utility: filtering out the 75% highest-uncertainty pixels reduces reconstruction MSE by 77% on retained pixels, whereas random filtering produces no improvement. The specialization also holds on out-of-distribution real photographs. We report negative results for a more elaborate variant combining frequency decomposition, cross-task supervision, evidential learning, contrastive loss, and test-time adaptation. Our method reaches 15.98 ± 0.41 dB R_PSNR, within 0.8 dB of a 5-member Deep Ensemble at one-fifth the cost, with the unique capability of source-separated uncertainty.
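The difference between a frame-level and a scene-level split comes down to what unit is shuffled. The sketch below shows one way a scene-level split could be implemented so that no scene contributes frames to both partitions. The function name, the `(scene_id, frame_id)` tuple layout, and the rounding rule for the number of test scenes are assumptions for illustration, not the paper's released protocol.

```python
import random


def scene_level_split(frames, test_fraction, seed=0):
    """Split frames so that every scene lands entirely in train or in test.

    frames: list of (scene_id, frame_id) tuples.
    Returns (train_frames, test_frames).
    """
    # Shuffle scene identifiers, not individual frames, to avoid leakage.
    scenes = sorted({scene for scene, _ in frames})
    rng = random.Random(seed)
    rng.shuffle(scenes)
    n_test = max(1, round(test_fraction * len(scenes)))
    test_scenes = set(scenes[:n_test])
    train = [f for f in frames if f[0] not in test_scenes]
    test = [f for f in frames if f[0] in test_scenes]
    return train, test
```

A frame-level split would instead shuffle the `frames` list directly, which is what lets near-duplicate neighboring frames of one scene appear on both sides of the split.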

2605.06256 2026-05-08 eess.SP

Cooperative Multi-Static Target Localization for ISAC in Cluttered Industrial IoT Networks

Mostafa Nozari, Israel Leyva-Mayorga, Gilberto Berardinelli

Abstract

In this paper, we propose a novel integrated sensing and communications (ISAC) framework for collaborative multi-static target localization in dense Industrial Internet-of-Things (IIoT) environments in the presence of environmental clutter. We first develop a lightweight temporal clutter-suppression learning method to mitigate persistent reflections. Building on this, we propose an iterative localization algorithm that integrates two key components introduced in this work: a sampling-based field-of-view-aware initialization (SFI) scheme and an empirical position error bound (PEB) scheme, which together adaptively identify the most informative subset of sensing nodes. A reliability-aware weighted least-squares estimator is then employed to fuse range and angle-of-arrival measurements from the selected sensing receivers for target localization. Numerical results demonstrate rapid convergence of the proposed method, reducing the localization RMSE by nearly two orders of magnitude within six sensing iterations to about 45 cm, while significantly outperforming all considered benchmarks under the same sensing-resource budget.

2605.06189 2026-05-08 eess.AS cs.LG

Predictive-Generative Drift Decomposition for Speech Enhancement and Separation

Julius Richter, Yoshiki Masuyama, Christoph Boeddeker, Takahiro Edo, Gordon Wichern, Jonathan Le Roux

Comments Submitted to NeurIPS 2026

Abstract

We propose a plug-and-play framework for speech enhancement and separation that augments predictive methods with a generative speech prior. Our approach, termed Stochastic Interpolant Prior for Speech (SIPS), builds on stochastic interpolants and leverages their flexibility to bridge predictive and generative modeling. Specifically, we decompose the interpolation dynamics into a task-specific drift and a stochastic denoising component, allowing a predictive estimate to be integrated directly into the generative sampling process. This results in a mathematically grounded framework for combining strong pretrained predictors with the expressive power of generative models. To this end, we train a score model using only clean speech, yielding a degradation-agnostic prior that can be reused across tasks. During inference, the predictor provides a deterministic drift that steers the sampling process toward a task-consistent estimate, while the score model preserves perceptual naturalness. Unlike prior hybrid approaches, which typically rely on architecture-specific conditioning and are tied to particular predictors or degradation settings, SIPS provides a unified framework that generalizes across predictors and additive degradation tasks. We demonstrate its effectiveness for both speech enhancement and speech separation using recent predictors such as SEMamba and FlexIO. The proposed method consistently improves perceptual quality, achieving gains of up to +1.0 NISQA for speech separation.

2605.06181 2026-05-08 eess.SY cs.SY

Synthesis of Limit Cycles and Reference Tracking via Switching Affine Systems

Nils Hanke, Zonglin Liu, Olaf Stursberg

Abstract

This paper introduces a novel method to approximate limit cycles of nonlinear ODEs by use of switching affine dynamics in order to ease data-based modeling and analysis. Previous approaches to approximating limit cycles by switching systems have been largely confined to simple partitions into two regions or low-dimensional (often planar) settings. In contrast, this study utilizes more general partitions in higher-dimensional state spaces, augmented by external signals, to develop a synthesis scheme that guarantees a globally stable limit cycle. The synthesis task is formulated and solved based on constrained numerical optimization. Starting from sampled data of the nonlinear dynamics, the method minimizes the error between the data and the limit cycle generated by the switching affine model, while employing stability constraints to ensure global stability. Based on the obtained model, the paper tackles the problem of reference tracking for switching affine systems with periodic behavior. While the approximation scheme is based on a common Lyapunov function, the reference tracking approach uses multiple Lyapunov functions to achieve less conservative convergence results. The principle and effectiveness of the proposed methods are illustrated through a set of examples.

2605.06145 2026-05-08 cs.LG cs.AI cs.SY eess.SY

Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization

Alireza Modirshanechi, Benjamin Eysenbach, Peter Dayan, Eric Schulz

Abstract

Unsupervised pretraining has driven empirical advances in goal-conditioned reinforcement learning (GCRL), but its theoretical foundations remain poorly understood. In particular, an influential class of methods, mutual information skill learning (MISL), discovers behaviorally diverse skills that can later be used for downstream goal-reaching. However, it remains a theoretical mystery why skills learned through MISL should support goal-reaching. A subtle challenge is that both GCRL and MISL are umbrella terms: different GCRL tasks use distinct criteria for measuring goal-reaching performance, while different MISL methods optimize distinct notions of behavioral diversity. We address this challenge and unify GCRL and MISL as instances of control maximization. We identify three canonical GCRL formulations and prove that they are fundamentally inequivalent: they can induce incompatible optimal policies even in the same environment. Nevertheless, they all share a common interpretation: a well-performing goal-conditioned policy is one whose future trajectory is highly sensitive to the commanded goal, with the precise notion of sensitivity determined by the GCRL formulation. Noting that MISL objectives can be understood as measures of skill-sensitivity akin to goal-sensitivity, we show that MISL objectives are bounded by formulation-specific downstream goal-sensitivities. These bounds establish a precise correspondence between MISL methods and downstream GCRL tasks: for every GCRL formulation, there exists a matching MISL objective for which more diverse skills afford greater downstream goal sensitivity. Our results thus lay a theoretical foundation for RL pretraining and have important practical implications, such as suggesting which pretraining objectives to use when a user cares about a specific class of downstream tasks.

2605.06108 2026-05-08 eess.AS

NDF+: Joint Neural Directional Filtering and Diffuse Sound Extraction

Weilong Huang, Le Nhat Tam Huynh, Oliver Thiergart, Emanuël A. P. Habets

Abstract

Recently, neural directional filtering (NDF) has been introduced as a flexible approach for reconstructing a virtual directional microphone (VDM) with a desired directivity pattern for spatial sound capture. Building on this idea, we propose NDF+, which enables joint neural directional filtering and diffuse sound extraction. NDF+ reformulates VDM estimation into two coupled subtasks: dereverberated VDM reconstruction and diffuse sound extraction. This reformulation enables NDF+ to manipulate diffuse components in the final reconstructed VDM output. We evaluated NDF+ under reverberant conditions and compared it with representative conventional baselines. Results show that NDF+ consistently outperforms the baselines on both subtasks, while maintaining VDM reconstruction quality comparable to that of the original single-task NDF model. These findings indicate that NDF+ introduces an additional degree of freedom for diffuse sound control in the VDM reconstruction. In a stereo recording application, NDF+ provides controllable inter-channel level differences between left and right channels by adjusting the estimated diffuse component.

2605.06107 2026-05-08 eess.SP

A Family of Hybrid Beyond-Diagonal RIS Architectures: Design and Performance Analysis

Konstantinos Ntougias, Ioannis Krikidis

Abstract

Beyond-diagonal reconfigurable intelligent surfaces (BD-RISs) extend conventional diagonal RISs by allowing inter-element coupling, thereby enlarging the set of attainable scattering matrices and improving the achievable signal-to-noise ratio (SNR). On the other hand, hybrid active/passive RISs use reflect-type power amplifiers in a fraction of the elements to alleviate the multiplicative path loss. In this paper, we bring these two ideas together and introduce a family of hybrid BD-RIS architectures, in which the surface is partitioned into two reflecting subsurfaces (RSs), each adopting either a passive or an active group-connected BD-RIS design. We derive a closed-form SNR-maximizing solution that combines, for every BD-RIS group, Takagi's factorization of a certain complex symmetric matrix with an optimal per-group amplification factor that satisfies the reflect-power budget. Three architectures within the proposed family (active/passive, fully-connected-active/sub-connected-active, and sub-connected-active/sub-connected-active hybrid BD-RIS) are studied. Numerical results in a single-input single-output (SISO) link with blocked direct path show that the proposed hybrid BD-RIS architectures attain the same or higher receive SNR than their diagonal counterparts while using significantly fewer reflect-type amplifiers.

2605.06097 2026-05-08 eess.SY cs.SY math.OC

Absolute Stability of Nonlinear Negative Imaginary Systems with Application to Potential Energy Shaping

Kanghong Shi, Ian R. Manchester

Comments 8 pages, 7 figures

Abstract

This paper establishes absolute stability conditions for nonlinear negative imaginary (NI) systems interconnected with static nonlinear feedback. We first show that the NI property is preserved when the feedback nonlinearity can be expressed as the gradient of a continuously differentiable function, and the composite storage of the resulting system remains positive definite. This condition provides a direct connection between nonlinear static feedback and storage-function shaping along the measured output channels. Building on this result, conditions are derived for absolute stability of the closed-loop system under mild assumptions. The linear specialization of the results strictly generalizes prior absolute stability results for linear NI systems, allowing coupled nonlinearities not covered by existing slope-restricted or sector-bounded frameworks. Finally, the proposed theory is illustrated through a linear example highlighting this generalization and a nonlinear example that shows the utility of the proposed results in potential energy shaping.

2605.06087 2026-05-08 cs.AI cs.SY eess.SY

Safety Certification is Classification

Oliver Schön, Licio Romao, Sadegh Soudjani

Comments 32 pages, 18 figures

Abstract

The goal of this paper is certifying safety of dynamical systems subject to uncertainty. Existing approaches use trajectory data to estimate transition probabilities, and compute safety probabilities recursively via dynamic programming (DP). This recursion may lead to compounding errors in the certified safety probability, thus collapsing to a vacuous lower bound for growing horizons $T$. We propose a kernel embedding framework that treats safety certification as a classification problem on trajectory data, directly estimating the $T$-step safety probability without recursion. We show that the framework subsumes well-established approaches from the literature (e.g., barrier certificates, robust Markov models) as special cases, and allows us to go beyond their limitations. As the main consequence, it bypasses compounding error across the horizon and enables certification for systems with non-Markovian dynamics. We demonstrate that direct estimators remain stable independent of the certification horizon and in the non-Markovian setting, whilst DP-based certificates silently go unsound -- confirmed in simulation on a neural-controlled quadrotor.
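To make the idea of estimating the $T$-step safety probability directly from trajectory data concrete, the following is a minimal sketch. It is not the kernel embedding framework of the paper: it swaps in a plain empirical frequency with a one-sided Hoeffding bound, and it assumes the trajectories are independent and identically distributed. The names `direct_safety_lower_bound`, `is_safe`, and `delta` are hypothetical and chosen for illustration only.

```python
import math
import random


def direct_safety_lower_bound(trajectories, is_safe, delta=0.05):
    """Estimate the T-step safety probability without any recursion.

    ASSUMPTION (not from the paper): trajectories are i.i.d. samples, and a
    one-sided Hoeffding bound is used in place of a kernel-based estimator.
    Each trajectory is labeled 1 if every state in it is safe, else 0, so the
    problem reduces to estimating the mean of a binary label.
    Returns (empirical estimate, lower bound holding with prob. >= 1 - delta).
    """
    n = len(trajectories)
    if n == 0:
        raise ValueError("need at least one trajectory")
    safe_count = sum(1 for traj in trajectories if all(is_safe(x) for x in traj))
    p_hat = safe_count / n
    # Hoeffding: P(p_hat - p >= t) <= exp(-2 n t^2); solve for t at level delta.
    margin = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return p_hat, max(0.0, p_hat - margin)


if __name__ == "__main__":
    # Hypothetical toy system: a noisy contraction with safe set |x| <= 1.
    rng = random.Random(0)
    horizon, n_traj = 20, 2000
    data = []
    for _ in range(n_traj):
        x, traj = 0.0, []
        for _ in range(horizon):
            x = 0.9 * x + rng.gauss(0.0, 0.2)
            traj.append(x)
        data.append(traj)
    print(direct_safety_lower_bound(data, lambda x: abs(x) <= 1.0))
```

The margin depends only on the number of trajectories and on `delta`, not on the horizon, which is one simple way to see why a direct estimate does not accumulate per-step error.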

2605.06062 2026-05-08 cs.RO cs.SY eess.SY

Monitoring autonomous persistent surveillance missions using invariance

Vladislav Nenchev, Prodromos Sotiriadis

Comments Accepted at IEEE ICRA 2026

Abstract

This paper studies runtime monitoring for persistent surveillance by autonomous robots when the autonomy stack is a black box. The environment is partitioned into finitely many parts, each carrying an uncertainty state that decreases when observed and increases otherwise. We model the closed loop as a state-dependent hybrid system with linear parameter varying dynamics and design a monitor based on an invariant computed offline. As this invariant is typically hard to obtain for large to-be-surveyed spaces, we propose a compositional monitor obtained by decentralized computation of low-dimensional invariant sets for each uncertainty region, and checking their conjunction online. Under common independence assumptions, the compositional monitor is sound and complete with respect to the full-system invariant. The approach is applied in a case study with a real robot persistently monitoring a labyrinth, emphasizing its applicability in practice.

2605.06060 2026-05-08 cs.CE cs.SY eess.SY

Arbitrage and the Stability of AMM Price Tracking

Peihao Li, Nadia Dahmani, Wenqi Cai

Abstract

Automated market makers (AMMs) quote prices from pool state rather than from a limit order book. AMM pools often stay close to a reference price because arbitrageurs correct profitable mispricing. A large part of decentralized finance therefore relies on a simple economic premise: once the AMM price drifts away from the reference price, arbitrage incentives push it back. This paper studies when that premise is strong enough to guarantee block-scale stability. We model the gap between the reference price and the AMM price as a stochastic tracking error, treat arbitrage as the corrective input, and place blockchain execution inside the loop through fees, discrete blocks, transaction ordering, delays, and transaction failure. The detailed execution layer is reduced to the total successful correction confirmed in each block. Under a block-level correction condition, we prove geometric ergodicity of the tracking error and obtain explicit one-step bounds that connect tracking quality to liquidity and execution quality. We also show in a constant-product example how fees, fixed execution costs, and local liquidity map into the no-trade band and the optimal corrective trade. Finally, we build empirical proxies for the theorem quantities from realized block data and use them to organize reduced and mechanism-focused simulations whose comparative statics are consistent with the theory. The contribution is to turn a basic economic intuition behind decentralized finance into a quantitative stability statement together with a tractable calibration interface.
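The constant-product example mentioned in the abstract can be illustrated with a short sketch. This is not the paper's model: it assumes a textbook pool with reserves `x` and `y` satisfying x*y = k, a proportional fee charged on the input amount, a risk-free external market at the reference price `p_ref` (in units of Y per X), and a fixed cost per trade denominated in Y. The function names and parameters are hypothetical.

```python
import math


def no_trade_band(x, y, fee):
    """Reference prices for which no arbitrage trade is profitable (fee only).

    ASSUMPTION: constant-product pool with marginal price y / x and a
    proportional fee on the input amount, so only a fraction gamma = 1 - fee
    of the input is actually swapped.
    """
    gamma = 1.0 - fee
    price = y / x
    return gamma * price, price / gamma


def optimal_arbitrage(x, y, fee, p_ref, fixed_cost=0.0):
    """Return (side, amount_in, profit_in_Y) for the profit-maximizing trade.

    The fixed cost does not change the optimal size, but it widens the
    effective no-trade region: the trade is skipped unless it nets a profit.
    """
    gamma = 1.0 - fee
    low, high = no_trade_band(x, y, fee)
    if p_ref > high:
        # X is cheap in the pool: pay dy of Y, receive dx of X, sell at p_ref.
        dy = (math.sqrt(p_ref * gamma * x * y) - y) / gamma
        dx = x * gamma * dy / (y + gamma * dy)
        side, amount, profit = "buy_x", dy, p_ref * dx - dy - fixed_cost
    elif p_ref < low:
        # X is expensive in the pool: pay dx of X (bought at p_ref), receive dy.
        dx = (math.sqrt(gamma * x * y / p_ref) - x) / gamma
        dy = y * gamma * dx / (x + gamma * dx)
        side, amount, profit = "sell_x", dx, dy - p_ref * dx - fixed_cost
    else:
        return ("none", 0.0, 0.0)
    if profit <= 0.0:
        return ("none", 0.0, 0.0)
    return (side, amount, profit)


if __name__ == "__main__":
    print(no_trade_band(100.0, 100.0, 0.003))
    print(optimal_arbitrage(100.0, 100.0, 0.003, 1.05))
```

Under these assumptions, a larger fee or fixed cost widens the region in which the pool price is left uncorrected, while deeper reserves increase the size of the corrective trade for the same price gap.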

2605.06039 2026-05-08 eess.SP

Bayesian Learning-Aided Near-Field Channel Estimation for mmWave Hybrid MIMO systems employing Uniform Circular Array

Abhisha Garg, Priya Gupta, Suraj Srivastava, Aditya Jagannatham

Abstract

This work conceives a Ring-Bayes channel learning framework that unifies Bayesian learning with near-field channel estimation in millimeter-wave (mmWave) hybrid MIMO systems. As the number of antennas scales up, users increasingly fall within the near-field region, rendering the conventional planar-wave assumption invalid. Moreover, the widely studied uniform linear arrays (ULAs) at the base station are impractical for large-scale deployment, whereas uniform circular arrays (UCAs) achieve superior beamforming gain and spatial directivity with the same antenna aperture. To exploit these advantages, we design a near-field concentric-ring codebook that captures channel features jointly in angular and distance domains. Leveraging this structure, the proposed Ring-Bayes framework enables highly accurate recovery of UCA near-field channels. Extensive simulations confirm that our approach delivers substantial improvements over existing methods, establishing Ring-Bayes as a powerful and scalable solution for next-generation mmWave communications.

2605.05992 2026-05-08 eess.SY cs.SY

SOPF-Based Adaptive Droop Control for Hybrid AC-HVDC Grids Under Offshore Wind Uncertainty

Hongjin Du, Aleksandra Lekić

Abstract

The integration of massive offshore wind into hybrid AC-HVDC grids demands robust DC voltage regulation, yet conventional fixed-gain droop controllers struggle under severe stochastic volatility. This paper bridges the gap between system-level economic dispatch and converter-level control by proposing a novel Stochastic Optimal Power Flow (SOPF)-based adaptive droop framework. Rather than relying on heuristic or reactive tuning, wind forecast uncertainty is modeled using a zone-wise Beta distribution that accurately captures the heteroscedastic nature of wind errors across low, mid, and high power regimes. By leveraging Polynomial Chaos Expansion (PCE) within a chance-constrained SOPF, the system's stochastic states are formulated analytically. Crucially, the optimal adaptive droop gain is extracted directly from the first-order PCE coefficients via a Jacobian-free sensitivity analysis, embedding statistical voltage-security guarantees directly into the local converter control. Validation on a 4-terminal AC-HVDC system demonstrates that scenario-adaptive gains significantly outperform standard fixed-coefficient approaches, effectively minimizing active-power tracking errors during extreme wind disturbances.

2605.05952 2026-05-08 eess.SY cs.SY

Foundation Twins: A New Generation of Power Systems Digital Twins using Foundation AI Models

Pedro P. Vergara

Comments 6 pages

Abstract

Power systems are inherently multi-timescale systems, with different physical phenomena and decision-making processes spanning multiple timescales, time horizons, and geographic scopes. I envision power systems digital twins (DTs) as powerful modeling and simulation tools that can accelerate and improve decision-making across different timescales and geographic scopes. However, until now, research has not delivered such a vision, and power systems DTs remain a concept distant from implementation. This is not a regular research paper. This is a position paper that outlines my vision for developing a new generation of power systems DTs that leverage recent advances in artificial intelligence (AI) and machine learning (ML). I call these Foundation Twins. Foundation Twins combine the generalization features of foundation models with the decision-making capabilities of reinforcement learning (RL) architectures to deliver the envisioned power systems DTs.

2605.01558 2026-05-08 math.OC cs.SY eess.SY

A Measure-Theoretic Formulation of Behavioral Systems

Victor M. Preciado

Comments 29 pages, 2 figures. Corrected proofs from previous version

Abstract

In Willems' behavioral systems theory, a dynamical system is identified with the set of all trajectories compatible with its laws of motion. In the linear time-invariant setting this trajectory set is a linear subspace, and its algebraic structure underpins the Fundamental Lemma: a single persistently exciting data trajectory generates the entire finite-horizon behavior. For nonlinear or stochastic systems, however, the admissible trajectory set is generally nonconvex, obstructing direct optimization over the behavior. In this paper, we lift the behavioral viewpoint from trajectories to probability measures on trajectories by representing a finite-horizon dynamical system with the set of all Borel probability measures supported on its admissible trajectories. For deterministic systems, this behavioral-measure set is convex and weakly closed even when the dynamics are nonlinear, because convex combinations of trajectory distributions remain dynamically admissible even when convex combinations of trajectories do not. Its extreme points are precisely the Dirac masses on individual admissible trajectories, so the classical deterministic theory is embedded as the extremal skeleton of the richer measure-valued object. On this foundation we establish two core deterministic results and outline a stochastic extension based on history-conditional kernel consistency.

2602.17352 2026-05-08 eess.SY cs.SY

Herd Behavior in Decentralized Balancing Models: A Case Study in Belgium

Max Bruninx, Seyed Soroush Karimi Madahi, Timothy Verstraeten, Jan Decuyper, Chris Develder, Jan Helsen

Abstract

In a decentralized balancing model, Balance Responsible Parties (BRPs) are encouraged by the Transmission System Operator (TSO) to deviate from their schedule to help the system restore balance, also referred to as implicit balancing. This could reduce balancing costs for the grid operator and lower the entry barrier for flexible assets compared to explicit balancing services. However, these implicit reactions may overshoot when their total capacity is high, potentially requiring more explicit activations. This study analyses the effect of increased participation in the decentralized balancing model in Belgium. To this end, we develop a market simulator that produces minute-level price signals and simulate the implicit reactions of battery assets with different risk profiles. Besides the current price formula, we also study two near-term candidate formulas presented by the TSO. A simulation study is conducted using Belgian market data for the year 2023. The findings indicate that, while increased participation initially has a significant positive effect on the balancing costs, the risk of overshoots can outweigh the potential benefits when the total capacity of the implicit reactions becomes too large. Furthermore, even when the balancing costs start to increase for the TSO, BRPs are still found to benefit from implicit balancing.