arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
EESS Electrical Engineering and Systems Science 152
2604.04923 2026-04-07 cs.LG cs.LO cs.SY eess.SY math.AT

Stratifying Reinforcement Learning with Signal Temporal Logic

Justin Curry, Alberto Speranzon

Comments 8 pages, 13 figures

Abstract

In this paper, we develop a stratification-based semantics for Signal Temporal Logic (STL) in which each atomic predicate is interpreted as a membership test in a stratified space. This perspective reveals a novel correspondence principle between stratification theory and STL, showing that most STL formulas can be viewed as inducing a stratification of space-time. The significance of this interpretation is twofold. First, it offers a fresh theoretical framework for analyzing the structure of the embedding space generated by deep reinforcement learning (DRL) and relates it to the geometry of the ambient decision space. Second, it provides a principled framework that both enables the reuse of existing high-dimensional analysis tools and motivates the creation of novel computational techniques. To ground the theory, we (1) illustrate the role of stratification theory in Minigrid games and (2) apply numerical techniques to the latent embeddings of a DRL agent playing such a game where the robustness of STL formulas is used as the reward. In the process, we propose computationally efficient signatures that, based on preliminary evidence, appear promising for uncovering the stratification structure of such embedding spaces.
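As a rough illustration of the STL robustness that the abstract above says is used as the reward, the sketch below evaluates the standard quantitative semantics on a discrete-time signal. It is not the authors' code: the function names, the sample signal, and the threshold are all hypothetical, and only the predicate `x > c` with the bounded "always" and "eventually" operators is covered.

```python
from typing import List, Sequence


def rob_pred_gt(signal: Sequence[float], threshold: float) -> List[float]:
    """Pointwise robustness of the atomic predicate x > threshold."""
    return [x - threshold for x in signal]


def rob_always(rho: Sequence[float], lo: int, hi: int, t: int = 0) -> float:
    """Robustness of G_[lo,hi] phi at time t: the worst value in the window."""
    return min(rho[t + lo : t + hi + 1])


def rob_eventually(rho: Sequence[float], lo: int, hi: int, t: int = 0) -> float:
    """Robustness of F_[lo,hi] phi at time t: the best value in the window."""
    return max(rho[t + lo : t + hi + 1])


# Hypothetical signal and threshold, chosen only for illustration.
signal = [0.5, 1.2, 2.0, 0.8, -0.3]
rho = rob_pred_gt(signal, 0.0)

print(rob_always(rho, 0, 3))      # positive: x > 0 holds on steps 0..3
print(rob_always(rho, 0, 4))      # negative: violated at step 4
print(rob_eventually(rho, 0, 4))  # positive: satisfied somewhere in 0..4
```

A positive value means the formula is satisfied with that margin and a negative value means it is violated, which is what makes robustness usable as a scalar reward.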

2604.04847 2026-04-07 eess.AS cs.CL

Full-Duplex-Bench-v3: Benchmarking Tool Use for Full-Duplex Voice Agents Under Real-World Disfluency

Guan-Ting Lin, Chen Chen, Zhehuai Chen, Hung-yi Lee

Comments Work in progress. Demo at https://daniellin94144.github.io/FDB-v3-demo

Abstract

We introduce Full-Duplex-Bench-v3 (FDB-v3), a benchmark for evaluating spoken language models under naturalistic speech conditions and multi-step tool use. Unlike prior work, our dataset consists entirely of real human audio annotated for five disfluency categories, paired with scenarios requiring chained API calls across four task domains. We evaluate six model configurations, namely GPT-Realtime, Gemini Live 2.5, Gemini Live 3.1, Grok, Ultravox v0.7, and a traditional Cascaded pipeline (Whisper → GPT-4o → TTS), across accuracy, latency, and turn-taking dimensions. GPT-Realtime leads on Pass@1 (0.600) and interruption avoidance (13.5%); Gemini Live 3.1 achieves the fastest latency (4.25 s) but the lowest turn-take rate (78.0%); and the Cascaded baseline, despite a perfect turn-take rate, incurs the highest latency (10.12 s). Across all systems, self-correction handling and multi-step reasoning under hard scenarios remain the most consistent failure modes.

2604.04841 2026-04-07 cs.SD eess.AS eess.SP

Joint Fullband-Subband Modeling for High-Resolution SingFake Detection

Xuanjun Chen, Chia-Yu Hu, Sung-Feng Huang, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang

Comments Submitted to INTERSPEECH 2026

Abstract

Rapid advances in singing voice synthesis have increased unauthorized imitation risks, creating an urgent need for better Singing Voice Deepfake (SingFake) Detection, also known as SVDD. Unlike speech, singing contains complex pitch, wide dynamic range, and timbral variations. Conventional 16 kHz-sampled detectors prove inadequate, as they discard vital high-frequency information. This study presents the first systematic analysis of high-resolution (44.1 kHz sampling rate) audio for SVDD. We propose a joint fullband-subband modeling framework: the fullband captures global context, while subband-specific experts isolate fine-grained synthesis artifacts unevenly distributed across the spectrum. Experiments on the WildSVDD dataset demonstrate that high-frequency subbands provide essential complementary cues. Our framework significantly outperforms 16 kHz-sampled models, proving that high-resolution audio and strategic subband integration are critical for robust in-the-wild detection.

2604.04834 2026-04-07 cs.CV cs.MM cs.RO eess.IV

E-VLA: Event-Augmented Vision-Language-Action Model for Dark and Blurred Scenes

Jiajun Zhai, Hao Shi, Shangwei Guo, Kailun Yang, Kaiwei Wang

Comments Code and dataset will be available at https://github.com/JJayzee/E-VLA

Abstract

Robotic Vision-Language-Action (VLA) models generalize well for open-ended manipulation, but their perception is fragile under sensing-stage degradations such as extreme low light, motion blur, and black clipping. We present E-VLA, an event-augmented VLA framework that improves manipulation robustness when conventional frame-based vision becomes unreliable. Instead of reconstructing images from events, E-VLA directly leverages motion and structural cues in event streams to preserve semantic perception and perception-action consistency under adverse conditions. We build an open-source teleoperation platform with a DAVIS346 event camera and collect a real-world synchronized RGB-event-action manipulation dataset across diverse tasks and illumination settings. We also propose lightweight, pretrained-compatible event integration strategies and study event windowing and fusion for stable deployment. Experiments show that even a simple parameter-free fusion, i.e., overlaying accumulated event maps onto RGB images, could substantially improve robustness in dark and blur-heavy scenes: on Pick-Place at 20 lux, success increases from 0% (image-only) to 60% with overlay fusion and to 90% with our event adapter; under severe motion blur (1000 ms exposure), Pick-Place improves from 0% to 20-25%, and Sorting from 5% to 32.5%. Overall, E-VLA provides systematic evidence that event-driven perception can be effectively integrated into VLA models, pointing toward robust embodied intelligence beyond conventional frame-based imaging. Code and dataset will be available at https://github.com/JJayzee/E-VLA.

2604.04822 2026-04-07 eess.SY cs.SY

Bridging Data-Driven Reachability Analysis and Statistical Estimation via Constrained Matrix Convex Generators

Peng Xie, Zhen Zhang, Rolf Findeisen, Amr Alanwar

Abstract

Data-driven reachability analysis enables safety verification when first-principles models are unavailable. This requires constructing sets of system models consistent with measured trajectories and noise assumptions. Existing approaches rely on zonotopic or box-based approximations, which do not fit the geometry of common noise distributions such as Gaussian disturbances and can lead to significant conservatism, especially in high-dimensional settings. This paper builds on ellipsotope-based representations to introduce mixed-norm uncertainty sets for data-driven reachability. The highest-density region defines the exact minimum-volume noise confidence set, while Constrained Convex Generators (CCG) and their matrix counterpart (CMCG) provide compatible geometric representations at the noise and parameter level. We show that the resulting CMCG coincides with the maximum-likelihood confidence ellipsoid for Gaussian disturbances, while remaining strictly tighter than constrained matrix zonotopes for mixed bounded-Gaussian noise. For non-convex noise distributions such as Gaussian mixtures, a minimum-volume enclosing ellipsoid provides a tractable convex surrogate. We further prove containment of the CMCG times CCG product and bound the conservatism of the Gaussian-Gaussian interaction. Numerical examples demonstrate substantially tighter reachable sets compared to box-based approximations of Gaussian disturbances. These results enable less conservative safety verification and improve the accuracy of uncertainty-aware control design.

2604.04802 2026-04-07 cs.IT cs.LG eess.SP math.IT math.PR stat.ML

Partially deterministic sampling for compressed sensing with denoising guarantees

Yaniv Plan, Matthew S. Scott, Ozgur Yilmaz

Abstract

We study compressed sensing when the sampling vectors are chosen from the rows of a unitary matrix. In the literature, these sampling vectors are typically chosen randomly; the use of randomness has enabled major empirical and theoretical advances in the field. However, in practice there are often certain crucial sampling vectors, in which case practitioners will depart from the theory and sample such rows deterministically. In this work, we derive an optimized sampling scheme for Bernoulli selectors which naturally combines random and deterministic selection of rows, thus rigorously deciding which rows should be sampled deterministically. This sampling scheme provides measurable improvements in image compressed sensing for both generative and sparse priors when compared to with-replacement and without-replacement sampling schemes, as we show with theoretical results and numerical experiments. Additionally, our theoretical guarantees feature improved sample complexity bounds compared to previous works, and novel denoising guarantees in this setting.
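To make the idea of Bernoulli selectors in the abstract above concrete, here is a minimal sketch in which each row of the unitary matrix is kept independently with its own inclusion probability. A probability of exactly 1 makes that row deterministic. The probabilities below are hypothetical and are not the optimized scheme derived in the paper. The `1/sqrt(p_i)` rescaling is a common normalization that keeps the sampled Gram matrix unbiased, and it is included here as an assumption.

```python
import math
import random
from typing import List


def bernoulli_select(probs: List[float], rng: random.Random) -> List[int]:
    """Keep row i independently with probability probs[i].

    rng.random() lies in [0, 1), so probs[i] == 1.0 always keeps row i
    (deterministic sampling) and probs[i] == 0.0 never keeps it.
    """
    return [i for i, p in enumerate(probs) if rng.random() < p]


def row_weights(probs: List[float], rows: List[int]) -> List[float]:
    """Rescale each kept row by 1/sqrt(p_i) (assumed normalization)."""
    return [1.0 / math.sqrt(probs[i]) for i in rows]


# Hypothetical inclusion probabilities: rows 0 and 1 are "crucial".
probs = [1.0, 1.0, 0.5, 0.25, 0.25]
rng = random.Random(0)

rows = bernoulli_select(probs, rng)
print("kept rows:", rows)
print("weights:", row_weights(probs, rows))
print("expected number of samples:", sum(probs))
```

The expected number of measurements is the sum of the inclusion probabilities, so a sampling budget can be expressed as a constraint on that sum.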

2604.04801 2026-04-07 math.OC cs.SY eess.SY

Feasibility-Aware Imitation Learning for Benders Decomposition

Bernard T. Agyeman, Zhe Li, Ilias Mitrai, Prodromos Daoutidis

Abstract

Mixed-integer optimization problems arise in a wide range of control applications. Benders decomposition is a widely used algorithm for solving such problems by decomposing them into a mixed-integer master problem and a continuous subproblem. A key computational bottleneck is the repeated solution of increasingly complex master problems across iterations. In this paper, we propose a feasibility-aware imitation learning framework that predicts the values of the integer variables of the master problem at each iteration while accounting for feasibility with respect to constraints governing admissible integer assignments and the accumulated Benders feasibility cuts. The agent is trained using a two-stage procedure that combines behavioral cloning with a feasibility-based logit adjustment to bias predictions toward assignments that satisfy the evolving cut set. The agent is deployed within an agent-based Benders decomposition framework that combines explicit feasibility checks with a time-limited solver computation of a valid lower bound. The proposed approach retains finite convergence properties, as the lower bound is certified at each iteration. Application to a prototypical case study shows that the proposed method improves solution time relative to existing imitation learning approaches for accelerating Benders decomposition, while preserving solution accuracy.

2604.04792 2026-04-07 eess.SP

Multi-Scaled Unscented Kalman Filter

Amit Levy, Itzik Klein

Comments 11 pages, 6 figures

Abstract

The unscented Kalman filter (UKF) is a commonly used algorithm capable of estimating the states of nonlinear dynamic systems. It carefully chooses a set of sample points, called sigma points, that capture the posterior mean and covariance of the nonlinear system states. The filter is based on the scaled unscented transform, where the scaling parameters control the spread of the sigma points, which in turn determines how well the estimated model is captured. In its current form, the UKF employs a single set of scaling parameters shared by all sigma points. Because states in multi-dimensional models often exhibit substantially different behaviors, this imposes a critical limitation: the standard UKF parameters cannot be tuned to extend the spread for one dimension while reducing it for another. To bridge this gap, we propose the multi-scaled UKF to enable spreading differently per state, while maintaining the key properties of the sigma points and UKF. A rigorous mathematical foundation is provided, introducing a novel theoretical approach to multi-scaling. The benefits of this approach are demonstrated through two distinct nonlinear dynamic systems. Consequently, our multi-scaled UKF captures the nonlinear behavior of multi-dimensional states more effectively, leading to improved estimation accuracy.
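For context, the sketch below generates sigma points with the standard scaled unscented transform, that is, the single-scaling baseline the abstract above describes, not the proposed multi-scaled variant. The parameter names `alpha`, `beta`, and `kappa` follow the usual textbook convention. The example mean, covariance, and helper names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Lower-triangular L with L @ L.T == a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sigma_points(
    mean: List[float], cov: Matrix, alpha: float, beta: float, kappa: float
) -> Tuple[Matrix, List[float], List[float]]:
    """Return the 2n+1 sigma points with mean weights and covariance weights."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    scale = n + lam  # one scaling shared by every state dimension
    root = cholesky([[scale * c for c in row] for row in cov])
    cols = [[root[i][j] for i in range(n)] for j in range(n)]
    points = [list(mean)]
    points += [[m + c for m, c in zip(mean, col)] for col in cols]
    points += [[m - c for m, c in zip(mean, col)] for col in cols]
    wm = [lam / scale] + [1.0 / (2.0 * scale)] * (2 * n)
    wc = list(wm)
    wc[0] += 1.0 - alpha**2 + beta
    return points, wm, wc


def weighted_moments(points: Matrix, wm: List[float], wc: List[float]):
    """Recover the mean and covariance encoded by the sigma points."""
    n = len(points[0])
    mean = [sum(w * p[i] for w, p in zip(wm, points)) for i in range(n)]
    cov = [
        [
            sum(w * (p[i] - mean[i]) * (p[j] - mean[j]) for w, p in zip(wc, points))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov


# Hypothetical two-state example.
pts, wm, wc = sigma_points([1.0, 2.0], [[4.0, 1.0], [1.0, 2.0]], 1.0, 2.0, 1.0)
print(weighted_moments(pts, wm, wc))
```

Because `scale` multiplies the whole covariance before the square root is taken, every state dimension is spread by the same factor, which is the limitation the abstract points to.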

2604.04772 2026-04-07 math.OC cs.SY eess.SY

Collaborative Altruistic Safety in Coupled Multi-Agent Systems

Brooks A. Butler, Xiao Tan, Aaron D. Ames, Magnus Egerstedt

Comments This work is to appear at the 2026 American Control Conference

Abstract

This paper presents a novel framework for ensuring safety in dynamically coupled multi-agent systems through collaborative control. Drawing inspiration from ecological models of altruism, we develop collaborative control barrier functions that allow agents to cooperatively enforce individual safety constraints under coupling dynamics. We introduce an altruistic safety condition based on the so-called Hamilton's rule, enabling agents to trade off their own safety to support higher-priority neighbors. By incorporating these conditions into a distributed optimization framework, we demonstrate increased feasibility and robustness in maintaining system-wide safety. The effectiveness of the proposed approach is illustrated through simulation in a simplified formation control scenario.

2604.04758 2026-04-07 eess.SY cs.SY

Data-Driven Reachability Analysis with Optimal Input Design

Peng Xie, Davide M. Raimondo, Rolf Findeisen, Amr Alanwar

Abstract

This paper addresses the conservatism in data-driven reachability analysis for discrete-time linear systems subject to bounded process noise, where the system matrices are unknown and only input-state trajectory data are available. Building on the constrained matrix zonotope (CMZ) framework, two complementary strategies are proposed to reduce conservatism in reachable-set over-approximations. First, the standard Moore-Penrose pseudoinverse is replaced with a row-norm-minimizing right inverse computed via a second-order cone program (SOCP), which directly reduces the size of the resulting model set, yielding tighter generators and less conservative reachable sets. Second, an online A-optimal input design strategy is introduced to improve the informativeness of the collected data and the conditioning of the resulting model set, thereby reducing uncertainty. The proposed framework extends naturally to piecewise affine systems through mode-dependent data partitioning. Numerical results on a five-dimensional stable LTI system and a two-dimensional piecewise affine system demonstrate that combining designed inputs with the row-norm right inverse significantly reduces conservatism compared to a baseline using random inputs and the pseudoinverse, leading to tighter reachable sets for safety verification.

2604.04743 2026-04-07 cs.CL cs.AI cs.SY eess.SY

Hallucination Basins: A Dynamic Framework for Understanding and Controlling LLM Hallucinations

Kalyan Cherukuri, Lav R. Varshney

Abstract

Large language models (LLMs) hallucinate: they produce fluent outputs that are factually incorrect. We present a geometric dynamical systems framework in which hallucinations arise from task-dependent basin structure in latent space. Using autoregressive hidden-state trajectories across multiple open-source models and benchmarks, we find that separability is strongly task-dependent rather than universal: factoid settings can show clearer basin separation, whereas summarization and misconception-heavy settings are typically less stable and often overlap. We formalize this behavior with task-complexity and multi-basin theorems, characterize basin emergence in L-layer transformers, and show that geometry-aware steering can reduce hallucination probability without retraining.

2604.04742 2026-04-07 eess.SP cs.NI

ACHEM: A Real-Time Digital Twin Framework with Channel and Radio Emulation

Anil Gurses, Mihail L. Sichitiu

Comments Submitted to the IEEE Journal

Abstract

Digital twins are becoming an important tool for designing, developing, testing, and optimizing next-generation wireless communication systems. Over the past decade, system softwarization has become a reality, and wireless communication systems are no exception. Software-Defined Radios (SDRs), in general, and Universal Software Radio Peripherals (USRPs), in particular, are often used for prototyping and testing advanced wireless systems. Unfortunately, there is currently no end-to-end, software-based, general-purpose testing environment for SDR-based systems: developers often rely on benchtop setups or even small testbeds, but those are costly and cumbersome to build. At the other end of the spectrum, simulations often rely on simplified channel/radio models and typically do not execute full-stack production code, which can increase development effort and reduce fidelity. In this paper, we propose ACHEM (A Channel Emulator), the first software-based, end-to-end wireless channel emulation environment and toolset for communication systems based on SDRs, specifically USRPs. With the proposed emulator and toolkit, any USRP-based system can be fully emulated at the I/Q level in a pure digital environment without requiring specialized hardware (e.g., vehicles, USRPs, FPGAs, or GPUs). The proposed emulator supports multiple transmitters and receivers, MIMO communications, multiple frequencies, heterogeneous sampling rates, real-time node mobility through vehicle emulation, antenna radiation patterns, and various channel models. ACHEM facilitates wireless digital twin development and deployment. ACHEM is validated with several popular open-source USRP-based wireless communication applications, including GNU Radio, srsRAN 4G/5G, and OpenAirInterface.

2604.04726 2026-04-07 stat.ML cs.LG eess.SP

A Muon-Accelerated Algorithm for Low Separation Rank Tensor Generalized Linear Models

Xiao Liang, Shuang Li

Abstract

Tensor-valued data arise naturally in multidimensional signal and imaging problems, such as biomedical imaging. When incorporated into generalized linear models (GLMs), naive vectorization can destroy their multi-way structure and lead to high-dimensional, ill-posed estimation. To address this challenge, Low Separation Rank (LSR) decompositions reduce model complexity by imposing low-rank multilinear structure on the coefficient tensor. A representative approach for estimating LSR-based tensor GLMs (LSR-TGLMs) is the Low Separation Rank Tensor Regression (LSRTR) algorithm, which adopts block coordinate descent and enforces orthogonality of the factor matrices through repeated QR-based projections. However, the repeated projection steps can be computationally demanding and slow convergence. Motivated by the need for scalable estimation and classification from such data, we propose LSRTR-M, which incorporates Muon (MomentUm Orthogonalized by Newton-Schulz) updates into the LSRTR framework. Specifically, LSRTR-M preserves the original block coordinate scheme while replacing the projection-based factor updates with Muon steps. Across synthetic linear, logistic, and Poisson LSR-TGLMs, LSRTR-M converges faster in both iteration count and wall-clock time, while achieving lower normalized estimation and prediction errors. On the Vessel MNIST 3D task, it further improves computational efficiency while maintaining competitive classification performance.

2604.04711 2026-04-07 math.DS cs.SY eess.SY

Global Linearization of Parameterized Nonlinear Systems with Stable Equilibrium Point Using the Koopman Operator

Natsuki Katayama, Alexandre Mauroy, Yoshihiko Susuki

Comments 10 pages, 0 figure

Abstract

The Koopman operator framework enables global analysis of nonlinear systems through its inherent linearity. This study aims to clarify spectral properties of the Koopman operators for nonlinear systems with control inputs. To this end, we treat the inputs as parameters throughout this paper. We then introduce the Koopman operator for a parameterized dynamical system with a globally exponentially stable equilibrium point and analyze how eigenfunctions of the operator depend on the parameter. As a main result, we obtain a global linearization, which enables one to transform the nonlinear system into a finite-dimensional linear system, and we show that it depends continuously on the parameter. Subsequently, for a control-affine system, we investigate a condition under which the transformation providing a global bilinearization does not depend on the parameter. This provides the condition under which the global bilinearization for the control-affine system is independent of the parameter.

2604.04702 2026-04-07 eess.SP

Performance Analysis of STAR-RIS-Assisted NOMA Wireless Systems with Realistic Indoor Outdoor THz Channel Models

Ngoc Phuc Le, Mohamed-Slim Alouini

Abstract

In this paper, a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided downlink non-orthogonal multiple access (NOMA) Terahertz (THz) wireless system is proposed for indoor and outdoor transmissions. We consider a near-field communication scenario where an access point (AP) is deployed near a STAR-RIS panel. For links from the STAR-RIS to users, the $\alpha$-$\mu$ distribution is adopted for the indoor small-scale fading channels, whereas the outdoor channels are based on a Gaussian mixture or a mixture of gamma distributions, following recent practical measurement reports. To facilitate performance analysis, we derive exact expressions of the probability density function (PDF) and the cumulative distribution function (CDF) of a weighted sum of $\alpha$-$\mu$ variates. Approximate PDF and CDF expressions of a weighted sum of Gaussian mixture variates are derived as well. Based on these results, closed-form expressions of the outage probability and the ergodic capacity, together with their asymptotic formulas at high signal-to-noise ratio (SNR), are obtained. Moreover, we analyze the capacity of the THz system in the low-SNR regime. Impacts of hardware impairments and STAR-RIS protocols (i.e., energy splitting and mode-switching) on the system performance are evaluated. All developed analytical results are validated and demonstrated via numerical simulations.

2604.04684 2026-04-07 eess.SP

Simultaneous Unicast and Multicast Transmissions in Stacked Intelligent Metasurfaces-assisted HAPS Wireless Networks: Performance Analysis and Optimization

Ngoc Phuc Le, Mohamed-Slim Alouini

Abstract

In this paper, we investigate high-altitude platform station (HAPS) wireless networks for simultaneous non-orthogonal unicast and multicast transmissions. Specifically, stacked intelligent metasurface (SIM)-based wave-domain beamforming is proposed to enable efficient HAPS-to-ground communications. Also, the system performance is investigated from an energy-efficiency (EE) perspective, which is crucial for HAPS operations. For performance analysis, we derive approximate closed-form expressions for the outage probability over Rician fading channels. For EE optimization, we jointly optimize the transmit power and the SIM phase shifts to maximize the EE. Two methods are proposed to solve this non-convex optimization problem. The first method develops an efficient alternating optimization (AO) framework based on golden-section search and projected gradient ascent (PGA) for transmit power and phase-shift optimization, respectively. The second method uses an unsupervised deep neural network (DNN) that does not require labeling. Performance comparisons between the two methods, as well as with other benchmark schemes, are examined. Additionally, the impacts of the number of SIM elements per layer, the number of SIM layers, and the maximum transmit power on the EE performance are evaluated. Simulation results are provided to demonstrate the performance of the proposed systems.
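The golden-section search named in the abstract above for the transmit-power step can be sketched as follows. This is a generic one-dimensional maximizer for a unimodal function. The toy energy-efficiency function `ee`, its channel gain, its circuit power, and the power range are hypothetical placeholders and are not the paper's system model.

```python
import math
from typing import Callable

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618


def golden_section_max(
    f: Callable[[float], float], lo: float, hi: float, tol: float = 1e-6
) -> float:
    """Maximize a unimodal f on [lo, hi], reusing one evaluation per step."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # The maximizer lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # The maximizer lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def ee(p: float) -> float:
    """Toy energy efficiency: rate divided by total consumed power.

    The gain of 10 and the circuit power of 1 are assumed values.
    """
    return math.log2(1.0 + 10.0 * p) / (p + 1.0)


p_star = golden_section_max(ee, 0.0, 10.0)
print("power:", p_star, "efficiency:", ee(p_star))
```

The method needs only function evaluations, which makes it convenient inside an alternating loop where the phase shifts are held fixed while the power is updated.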

2604.04670 2026-04-07 eess.IV cs.AI cs.CY eess.SP

An AI Teaching Assistant for Motion Picture Engineering

Deirdre O'Regan, Anil C. Kokaram

Comments Accepted for publication in IEEE Signal Processing Magazine

Abstract

The rapid rise of LLMs over the last few years has promoted growing experimentation with LLM-driven AI tutors. However, the details of implementation, as well as the benefit in a teaching environment, are still in the early days of exploration. This article addresses these issues in the context of implementation of an AI Teaching Assistant (AI-TA) using Retrieval Augmented Generation (RAG) for Trinity College Dublin's Master's Motion Picture Engineering (MPE) course. We provide details of our implementation (including the prompt to the LLM, and code), and highlight how we designed and tuned our RAG pipeline to meet course needs. We describe our survey instrument and report on the impact of the AI-TA through a number of quantitative metrics. The scale of our experiment (43 students, 296 sessions, 1,889 queries over 7 weeks) was sufficient to have confidence in our findings. Unlike previous studies, we experimented with allowing the use of the AI-TA in open-book examinations. Statistical analysis across three exams showed no performance differences regardless of AI-TA access (p > 0.05), demonstrating that thoughtfully designed assessments can maintain academic validity. Student feedback revealed that the AI-TA was beneficial (mean = 4.22/5), while students had mixed feelings about preferring it over human tutoring (mean = 2.78/5).

2604.04625 2026-04-07 eess.SY cs.SY

Compact Reconfigurable Intelligent Surface with Phase-Gradient Coded Beam Steering and Controlled Substrate Loss

Mahendra Kheti, Debapratim Ghosh, Soumya P. Dash

Comments 10 pages, 16 figures

Abstract

This paper presents a 1-bit reconfigurable intelligent surface (RIS) fabricated using a three-layer structure. It employs a manual layer stackup incorporating an optimal air gap to reduce the effective dielectric losses while using a low-cost FR4 substrate. The new design of the unit cells of the proposed RIS is outlined, with each unit cell featuring a PIN-diode-based, compact biasing network that simplifies the control circuit while maintaining distinct $0^\circ/180^\circ \pm 20^\circ$ phase states between ON/OFF conditions. The designed RIS is in the form of a $10\times10$ array with a compact size of $2.9\lambda_g \times 2.9\lambda_g$. Additionally, a phase-gradient coding scheme is presented and utilized that achieves measured beam steering up to $\pm30^\circ$ in both anechoic and noisy environments. Controlled and driven by an Arduino-cum-digital interface, the proposed RIS exhibits measured reflected wave gain enhancement of about 9 dB over an incident wave angular range of $\pm 30^\circ$. Furthermore, the design is also experimentally validated by transmitting quadrature phase-shift keying-modulated symbols via the RIS-assisted wireless channel. The proposed RIS works for the range 3.38–3.67 GHz (8.3%), and is suitable for deployment for the 5G n78 band (3.5 GHz).
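One generic way to derive a 1-bit phase-gradient code, shown here only as an illustration and not as the coding scheme from the paper, is to compute the ideal linear phase ramp for a target steering angle and snap each element to whichever of the 0° and 180° states is closer. The sketch assumes normal incidence, a uniform linear row of elements, and a hypothetical spacing of half a free-space wavelength. None of these values come from the abstract above.

```python
import math
from typing import List


def one_bit_codes(n_elements: int, spacing_wl: float, steer_deg: float) -> List[int]:
    """Quantize the ideal steering phase ramp to two states.

    Returns 0 for the 0-degree state and 1 for the 180-degree state.
    spacing_wl is the element spacing in wavelengths (an assumed value).
    """
    codes = []
    for n in range(n_elements):
        # Ideal phase that steers a normally incident wave to steer_deg.
        phase = -2.0 * math.pi * spacing_wl * n * math.sin(math.radians(steer_deg))
        # cos(phase) < 0 means the phase is closer to 180 degrees than to 0.
        codes.append(1 if math.cos(phase) < 0.0 else 0)
    return codes


# One row of a hypothetical 10-element array steered to 20 degrees.
codes = one_bit_codes(10, 0.5, 20.0)
print(codes)
```

Larger steering angles shorten the period of the resulting ON/OFF pattern, because the ideal phase ramp becomes steeper.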

2604.04621 2026-04-07 eess.SP

Flexible Beamforming Design with Hierarchical Rotational 6DMA Systems

Weijia Wang, Changsheng You, Xiaodan Shao, Rui Zhang

Abstract

Reconfigurable antenna technology, such as movable antennas (MAs) and rotatable antennas (RAs), has emerged as a promising solution to enhance the communication performance of wireless systems by exploiting the new degree of freedom (DoF) in antenna reconfiguration. However, existing RA designs mostly consider array-wise or antenna-wise rotation only, which limits their potential for wide-range radiation pattern control. To overcome this limitation, we propose a new hierarchical rotational six-dimensional MA (HR-6DMA) architecture to improve downlink coverage, which exploits array-wise rotation for global orientation adjustment and individual antenna rotation for fine-grained radiation refinement. Based on this array architecture, we then formulate an optimization problem to maximize the minimum beamforming gain over a target region by jointly optimizing the two-level rotations and transmit beamforming. To solve this non-convex problem, an efficient algorithm is proposed, where the transmit beamforming and per-antenna rotation are optimized via alternating optimization under any feasible array rotation, followed by a low-complexity linear search to determine the optimal array rotation. Finally, numerical results show that the proposed HR-6DMA significantly improves the minimum beamforming gain over fixed and single-level rotatable arrays.

2604.04602 2026-04-07 eess.SY cs.SY math.OC

Stochastic Model Predictive Control with Online Risk Allocation and Feedback Gain Selection

Filipe Marques Barbosa, Johan Löfberg

Comments Updated preprint with a revised title, typographical corrections, and mathematical refinements made after its initial submission for publication

Abstract

Stochastic Model Predictive Control addresses uncertainties by incorporating chance constraints that provide probabilistic guarantees of constraint satisfaction. However, simultaneously optimizing over the risk allocation and the feedback policies leads to intractable nonconvex problems. This is due to (i) products of functions involving the feedback law and risk allocation in the deterministic counterpart of the chance constraints, and (ii) the presence of the nonconvex Gaussian quantile (probit) function. Existing methods rely on two-stage optimization, which is nonconvex. To address this, we derive disjunctive convex chance constraints and select the feedback law from a set of precomputed candidates. The inherited compositions of the probit function are replaced with power- and exponential-cone representable approximations. The main advantage is that the problem can be formulated as a mixed-integer conic optimization problem and efficiently solved with off-the-shelf software. Moreover, the proposed formulations apply to general chance constraints with products of exclusive disjunctive and Gaussian variables. The proposed approaches are validated with a path-planning application.
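To illustrate why the probit function and the risk allocation interact in the abstract above, the sketch below converts a scalar Gaussian chance constraint into its deterministic counterpart. For a Gaussian quantity with mean `m` and standard deviation `sigma`, requiring `Pr(value <= b) >= 1 - eps` is equivalent to `m <= b - probit(1 - eps) * sigma`. The bound, standard deviation, and risk values are hypothetical. Splitting a total risk budget across constraints via Boole's inequality is a standard device and is assumed here rather than taken from the paper.

```python
from statistics import NormalDist
from typing import List


def tightened_bound(b: float, sigma: float, eps: float) -> float:
    """Largest admissible mean so that Pr(value <= b) >= 1 - eps."""
    return b - NormalDist().inv_cdf(1.0 - eps) * sigma


def bounds_for_allocation(b: float, sigma: float, risks: List[float]) -> List[float]:
    """Tightened bounds when the constraints get individual risks."""
    return [tightened_bound(b, sigma, eps) for eps in risks]


# Hypothetical setup: two constraints share a total risk budget of 0.10.
b, sigma = 1.0, 0.1
uniform = bounds_for_allocation(b, sigma, [0.05, 0.05])
skewed = bounds_for_allocation(b, sigma, [0.09, 0.01])

print("uniform allocation:", uniform)
print("skewed allocation: ", skewed)
```

Giving a constraint more of the risk budget loosens its bound while tightening the other, and the dependence goes through the nonlinear quantile, which is why optimizing the allocation together with the feedback law is difficult.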

2604.04545 2026-04-07 eess.SY cs.LG cs.SY

Safe and Near-Optimal Gate Control: A Case Study from the Danish West Coast

Martin Kristjansen, Kim Guldstrand Larsen, Marius Mikučionis, Christian Schilling

Comments In Proceedings MARS 2026, arXiv:2604.03053

Journal ref EPTCS 443, 2026, pp. 85-103
Abstract

Ringkoebing Fjord is an inland water basin on the Danish west coast separated from the North Sea by a set of gates used to control the amount of water entering and leaving the fjord. Currently, human operators decide when and how many gates to open or close for controlling the fjord's water level, with the goal of satisfying a range of conflicting safety and performance requirements such as keeping the water level in a target range, allowing maritime traffic, and enabling fish migration. We first model the fjord as a digital twin in Uppaal Stratego. We then use this digital twin along with forecasts of the sea level and the wind speed to learn a gate controller in an online fashion. We evaluate the learned controllers under different sea-level scenarios, representing normal tidal behavior, high waters, and low waters. Our evaluation demonstrates that, unlike a baseline controller, the learned controllers satisfy the safety requirements, while performing similarly regarding the other requirements.

2604.04544 2026-04-07 eess.SY cs.FL cs.MA cs.SY

Modelling and Analysis of Supply Chains using Product Time Petri Nets

Eric Lubat, Pierre-Emmanuel Hladik, Yoann Mateu, Rémi Sauvère

Comments In Proceedings MARS 2026, arXiv:2604.03053

Journal ref EPTCS 443, 2026, pp. 23-39
Abstract

Supply chains involve geographically distributed manufacturing and assembly sites that must be coordinated under strict timing and resource constraints. While many existing approaches rely on Colored Petri Nets to model material flows, this work focuses on the temporal feasibility of supply chain processes. We propose a modular modelling approach based on Product Time Petri Nets (PTPNs), where each subsystem is represented independently and the global behaviour emerges through synchronised transition labels. A key feature of the model is the explicit representation of the supply chain manager as a critical shared and mobile resource, whose availability directly impacts system feasibility. We analyse how timing constraints and managerial capacity influence the system behaviour, identifying configurations that lead to successful executions, timeouts, or timelocks induced by incompatible timing constraints. This approach enables systematic what-if analysis of supply chain coordination policies and demonstrates the relevance of PTPNs for modelling and analysing synchronised timed systems.

2604.04540 2026-04-07 eess.SP

Activity Recognition Using mm-Wave Radar and Deep Learning: Prayer Tracker Case Study

Karim Saifullin, Sajid Ahmed, Mohamed-Slim Alouini

Details
English abstract

The issue of privacy has gained significant attention in recent times. Many real-world applications increasingly require the use of sensitive data, such as in surveillance or tracking and assistance systems. To address these concerns, we propose a framework based on mm-wave radar technology that not only meets privacy requirements but also provides the necessary capabilities for these systems, including reliable current position tracking, sequence tracking, and feedback to the user. While the use of radar technology for surveillance purposes is gaining momentum, there has been no research to date on its application for prayer tracking and assistance systems. Furthermore, there is a lack of comprehensive research that covers all aspects of implementing such a system. The proposed approach offers a versatile solution that can be applied to a broad range of scenarios. Instead of utilizing raw I-Q data, we address the challenge of classification based on point cloud information generated by the conventional processing chain of the frequency-modulated continuous wave radar. This information contains the corresponding range, reflection amplitude, Doppler, and angular values. We have developed and compared different machine-learning classification algorithms to identify the most effective one. Our findings reveal that the convolutional neural network ResNet achieves the best results, with accuracy rates reaching up to 95.4 percent when applied to unknown data. The demonstration video of the developed system can be viewed at the following link: https://youtu.be/PnpGQZWqCr4.

2604.04537 2026-04-07 eess.SY cs.SY

PCT-Based Trajectory Tracking for Underactuated Marine Vessels

Ji-Hong Li

Details
English abstract

This paper investigates the trajectory tracking problem of underactuated marine vessels within a polar coordinate framework. By introducing two polar coordinate transformations (PCTs), the original two-input-three-output second-order tracking model expressed in the Cartesian frame is reduced to a two-input-two-output feedback system. However, the resulting model does not necessarily satisfy the strict-feedback condition required by conventional backstepping approaches. To circumvent potential singularities arising in the controller design, a novel concept termed exponential modification of orientation (EMO) is proposed. While the PCTs yield substantial structural simplification, they also introduce inherent limitations, most notably singularities associated with angular coordinates. Addressing these singularities constitutes another key focus of this paper. Numerical simulation results are presented to demonstrate the effectiveness of the proposed control strategy.
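
To illustrate why angular coordinates bring singularities, here is a minimal sketch of a generic Cartesian-to-polar change of variables for a planar position error. It is not the paper's pair of PCTs, which the abstract does not spell out. The function name `to_polar`, the tolerance `eps`, and the convention of returning `None` for an undefined bearing are assumptions made for this example.

```python
import math

def to_polar(ex, ey, eps=1e-9):
    """Map a planar position error (ex, ey) to (distance, bearing).

    The bearing is undefined when the error vanishes, so this sketch
    returns None there instead of an arbitrary angle (hypothetical
    convention, not taken from the paper).
    """
    rho = math.hypot(ex, ey)
    if rho < eps:
        # Singular point of the transformation: no meaningful angle exists.
        return rho, None
    return rho, math.atan2(ey, ex)

# Regular point: distance and bearing are both well defined.
print(to_polar(3.0, 4.0))
# Singular point: the angle has to be handled separately by the controller.
print(to_polar(0.0, 0.0))
```

A controller built on such coordinates needs some strategy near the singular point. The abstract names handling these singularities as a key focus of the paper.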

2604.04499 2026-04-07 eess.SY cs.SY math.OC

Distributed Covariance Steering via Non-Convex ADMM for Large-Scale Multi-Agent Systems

Augustinos D. Saravanos, Isin M. Balci, Arshiya Taj Abdul, Efstathios Bakolas, Evangelos A. Theodorou

Details
English abstract

This paper studies the problem of steering large-scale multi-agent stochastic linear systems between Gaussian distributions under probabilistic collision avoidance constraints. We introduce a family of distributed covariance steering (DCS) methods based on the Alternating Direction Method of Multipliers (ADMM), each offering different trade-offs between conservatism and computational efficiency. The first method, Full-Covariance-Consensus (FCC)-DCS, enforces consensus over both the means and covariances of neighboring agents, yielding the least conservative safe solutions. The second approach, Partial-Covariance-Consensus (PCC)-DCS, leverages the insight that safety can be maintained by exchanging only partial covariance information, reducing computational demands. The third method, Mean-Consensus (MC)-DCS, provides the most scalable alternative by requiring consensus only on mean states. Furthermore, we establish novel convergence guarantees for distributed ADMM with iteratively linearized non-convex constraints, covering a broad class of consensus optimization problems. This analysis proves convergence to stationary points for PCC-DCS and MC-DCS, while the convergence of FCC-DCS follows from standard ADMM theory. Simulations in 2D and 3D multi-agent environments verify safety, illustrate the trade-offs between methods, and demonstrate scalability to thousands of agents.
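
For readers unfamiliar with consensus ADMM, the sketch below shows the standard scaled-form iteration on a toy convex problem: each agent i holds a scalar target a_i and all agents must agree on a single value z. It is only meant to show the local, consensus, and dual steps. It does not implement covariance steering, collision avoidance, or the linearized non-convex constraints studied in the paper. The function name `consensus_admm` and its parameters are assumptions for this example.

```python
def consensus_admm(targets, rho=1.0, iters=100):
    """Consensus ADMM for: minimize sum_i 0.5 * (x_i - a_i)^2  s.t.  x_i = z.

    Toy convex stand-in (assumption of this sketch), using scaled duals u_i.
    """
    n = len(targets)
    x = [0.0] * n   # local copies, one per agent
    u = [0.0] * n   # scaled dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iters):
        # Local step: each agent minimizes its own cost plus the
        # quadratic penalty (rho / 2) * (x_i - z + u_i)^2 in closed form.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus step: average the local copies plus their scaled duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x

z, x = consensus_admm([1.0, 2.0, 6.0])
print(z, x)  # all values approach the average of the targets
```

In this toy problem the agreed value is the average of the targets. The local step uses only each agent's own data, which is the property that makes this family of methods attractive for large numbers of agents.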

2604.04490 2026-04-07 eess.SP cs.AI eess.IV

RAVEN: Radar Adaptive Vision Encoders for Efficient Chirp-wise Object Detection and Segmentation

Anuvab Sen, Mir Sayeed Mohammad, Saibal Mukhopadhyay

Comments CVPR submission / conference paper

Details
Journal ref
Computer Vision and Pattern Recognition Conference 2026
English abstract

This paper presents RAVEN, a computationally efficient deep learning architecture for FMCW radar perception. The method processes raw ADC data in a chirp-wise streaming manner, preserves MIMO structure through independent receiver state-space encoders, and uses a learnable cross-antenna mixing module to recover compact virtual-array features. It also introduces an early-exit mechanism so the model can make decisions using only a subset of chirps when the latent state has stabilized. Across automotive radar benchmarks, the approach reports strong object detection and BEV free-space segmentation performance while substantially reducing computation and end-to-end latency compared with conventional frame-based radar pipelines.
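
The early-exit idea can be sketched in a few lines under strong simplifying assumptions: a scalar latent state, an exponential moving average standing in for the learned state-space encoder, and a fixed tolerance as the stabilization test. None of these choices come from the paper. The names `stream_with_early_exit`, `alpha`, `tol`, and `min_chirps` are hypothetical.

```python
def stream_with_early_exit(chirps, alpha=0.5, tol=1e-3, min_chirps=2):
    """Fold per-chirp features into a running state and stop early
    once the state changes by less than tol between chirps.

    The moving-average update is a stand-in for a learned encoder
    (assumption of this sketch).
    """
    state = 0.0
    used = 0
    for feature in chirps:
        new_state = (1.0 - alpha) * state + alpha * feature
        used += 1
        change = abs(new_state - state)
        state = new_state
        # Exit as soon as the latent state has stabilized.
        if used >= min_chirps and change < tol:
            break
    return state, used

# A frame of 64 identical chirp features: only a subset is consumed.
state, used = stream_with_early_exit([1.0] * 64)
print(state, used)
```

The compute saving comes from the chirps that are never processed. The trade-off is that a loose tolerance may stop before the state is informative enough.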

2604.04484 2026-04-07 eess.IV cs.CV

TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Image Denoising

Junyoung Park, Youngjin Oh, Nam Ik Cho

Comments Accepted to CVPR 2026

Details
English abstract

Blind-spot networks (BSNs) enable self-supervised image denoising by preventing access to the target pixel, allowing clean signal estimation without ground-truth supervision. However, this approach assumes pixel-wise noise independence, which is violated in real-world sRGB images due to spatially correlated noise from the camera's image signal processing (ISP) pipeline. While several methods employ downsampling to decorrelate noise, they alter noise statistics and limit the network's ability to utilize full contextual information. In this paper, we propose the Triangular-Masked Blind-Spot Network (TM-BSN), a novel blind-spot architecture that accurately models the spatial correlation of real sRGB noise. This correlation originates from demosaicing, where each pixel is reconstructed from neighboring samples with spatially decaying weights, resulting in a diamond-shaped pattern. To align the receptive field with this geometry, we introduce a triangular-masked convolution that restricts the kernel to its upper-triangular region, creating a diamond-shaped blind spot at the original resolution. This design excludes correlated pixels while fully leveraging uncorrelated context, eliminating the need for downsampling or post-processing. Furthermore, we use knowledge distillation to transfer complementary knowledge from multiple blind-spot predictions into a lightweight U-Net, improving both accuracy and efficiency. Extensive experiments on real-world benchmarks demonstrate that our method achieves state-of-the-art performance, significantly outperforming existing self-supervised approaches. Our code is available at https://github.com/parkjun210/TM-BSN.
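
One literal reading of "restricts the kernel to its upper-triangular region" is an element-wise mask applied to a square kernel, sketched below on plain Python lists. Whether the diagonal is kept, and how masked convolutions are combined to produce the diamond-shaped blind spot, are not stated in the abstract. The `include_diagonal` flag and both function names are assumptions for this example, and the authors' repository remains the reference for the actual layer.

```python
def upper_triangular_mask(k, include_diagonal=True):
    """Return a k x k 0/1 mask that keeps the upper-triangular region."""
    offset = 0 if include_diagonal else 1
    return [[1 if col >= row + offset else 0 for col in range(k)]
            for row in range(k)]

def apply_mask(kernel, mask):
    """Zero out kernel weights wherever the mask is 0."""
    return [[w * m for w, m in zip(kernel_row, mask_row)]
            for kernel_row, mask_row in zip(kernel, mask)]

kernel = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]
for row in apply_mask(kernel, upper_triangular_mask(3)):
    print(row)
```

In a trainable layer the mask would typically be applied to the weights on every forward pass so that masked positions never contribute to the output.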

2604.04470 2026-04-07 eess.IV cs.AI

MC-GenRef: Annotation-free mammography microcalcification segmentation with generative posterior refinement

Hyunwoo Cho, Yeeun Kwon, Min Jung Kim, Yangmo Yoo

Details
English abstract

Microcalcification (MC) analysis is clinically important in screening mammography because clustered puncta can be an early sign of malignancy, yet dense MC segmentation remains challenging: targets are extremely small and sparse, dense pixel-level labels are expensive and ambiguous, and cross-site shift often induces texture-driven false positives and missed puncta in dense tissue. We propose MC-GenRef, a framework that requires no real dense labels and combines high-fidelity synthetic supervision with test-time generative posterior refinement (TT-GPR). During training, real negative mammogram patches are used as backgrounds, and physically plausible MC patterns are injected through a lightweight image formation model with local contrast modulation and blur, yielding exact image-mask pairs without real dense annotation. Using only these synthetic labeled pairs, MC-GenRef trains a base segmentor and a seed-conditioned rectified-flow (RF) generator that serves as a controllable generative prior. During inference, TT-GPR treats segmentation as approximate posterior inference: it derives a sparse seed from the current prediction, forms seed-consistent RF projections, converts them into case-specific surrogate targets through the frozen segmentor, and iteratively refines the logits with overlap-consistent and edge-aware regularization. On INbreast, the synthetic-only initializer achieved the best Dice without real dense annotations, while TT-GPR improved miss-sensitive performance in terms of Recall and FNR, with strong class-balanced behavior (Bal.Acc., G-Mean). On an external private Yonsei cohort (n=50), TT-GPR consistently improved the synthetic-only initializer under cross-site shift, increasing Dice and Recall while reducing FNR. These results suggest that test-time generative posterior refinement is a practical route to reduce MC misses and improve robustness without additional real dense labeling.

2604.04413 2026-04-07 eess.SP

A Survey on Robust Deep Joint Source-Channel Coding for Semantic Communications

Eunhye Hong, Taewoo Park, Yongjune Kim

Details
English abstract

Semantic communications (SCs) aim to transmit only the essential information required to perform given tasks, thereby improving communication efficiency. Deep learning-based joint source-channel coding (deep JSCC) has emerged as a promising approach for SC systems; however, its performance often degrades when the deployment channels differ from the training channel conditions, making robustness a critical requirement. This paper presents a structured overview of recent methodologies for enhancing the robustness of deep JSCC. Specifically, existing approaches are categorized into two classes: robust training approaches and adaptive approaches, with the latter further divided into adaptive semantic feature selection, physical-layer adaptation, and semantic feature adaptation. Finally, we discuss promising directions, including multi-task generalization and explainability in robust SC systems.

2604.04407 2026-04-07 eess.IV cs.CV cs.LG cs.MM

NAIMA: Semantics Aware RGB Guided Depth Super-Resolution

Tayyab Nasir, Daochang Liu, Ajmal Mian

Details
English abstract

Guided depth super-resolution (GDSR) is a multi-modal approach for depth map super-resolution that relies on a low-resolution depth map and a high-resolution RGB image to restore finer structural details. However, the misleading color and texture cues indicating depth discontinuities in RGB images often lead to artifacts and blurred depth boundaries in the generated depth map. We propose a solution that introduces global contextual semantic priors, generated from pretrained vision transformer token embeddings. Our approach to distilling semantic knowledge from pretrained token embeddings is motivated by their demonstrated effectiveness in related monocular depth estimation tasks. We introduce a Guided Token Attention (GTA) module, which iteratively aligns encoded RGB spatial features with depth encodings, using cross-attention to selectively inject global semantic context extracted from different layers of a pretrained vision transformer. Additionally, we present an architecture called Neural Attention for Implicit Multi-token Alignment (NAIMA), which integrates DINOv2 with GTA blocks for semantics-aware GDSR. Our proposed architecture, with its ability to distill semantic knowledge, achieves significant improvements over existing methods across multiple scaling factors and datasets.
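
As background for the cross-attention step, here is a minimal single-head scaled dot-product cross-attention on plain Python lists. It is the generic mechanism, not the GTA module itself. The assignment of roles (queries from spatial features, keys and values from semantic tokens) is an assumption for this sketch, as are the function names, and learned projection matrices are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries: list of d-dim vectors (e.g. spatial features, assumed role)
    keys, values: lists of token vectors (e.g. semantic tokens, assumed role)
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]
print(cross_attention(queries, keys, values))
```

Each query pulls in more of the values whose keys it matches, which is the sense in which context is injected selectively.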