arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
2602.18428 2026-02-23 cs.LG cs.CV eess.IV

The Geometry of Noise: Why Diffusion Models Don't Need Noise Conditioning

Mojtaba Sahraee-Ardakan, Mauricio Delbracio, Peyman Milanfar

Autonomous (noise-agnostic) generative models, such as Equilibrium Matching and blind diffusion, challenge the standard paradigm by learning a single, time-invariant vector field that operates without explicit noise-level conditioning. While recent work suggests that high-dimensional concentration allows these models to implicitly estimate noise levels from corrupted observations, a fundamental paradox remains: what is the underlying landscape being optimized when the noise level is treated as a random variable, and how can a bounded, noise-agnostic network remain stable near the data manifold where gradients typically diverge? We resolve this paradox by formalizing Marginal Energy, $E_{\text{marg}}(\mathbf{u}) = -\log p(\mathbf{u})$, where $p(\mathbf{u}) = \int p(\mathbf{u}|t)p(t)dt$ is the marginal density of the noisy data integrated over a prior distribution of unknown noise levels. We prove that generation using autonomous models is not merely blind denoising, but a specific form of Riemannian gradient flow on this Marginal Energy. Through a novel relative energy decomposition, we demonstrate that while the raw Marginal Energy landscape possesses a $1/t^p$ singularity normal to the data manifold, the learned time-invariant field implicitly incorporates a local conformal metric that perfectly counteracts the geometric singularity, converting an infinitely deep potential well into a stable attractor. We also establish the structural stability conditions for sampling with autonomous models. We identify a ``Jensen Gap'' in noise-prediction parameterizations that acts as a high-gain amplifier for estimation errors, explaining the catastrophic failure observed in deterministic blind models. Conversely, we prove that velocity-based parameterizations are inherently stable because they satisfy a bounded-gain condition that absorbs posterior uncertainty into a smooth geometric drift.
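
As a reading aid, the gradient of the Marginal Energy defined above can be written as a posterior average of the conditional scores. This is a standard identity rather than a derivation taken from the paper, and it assumes that differentiation under the integral sign is permitted and that $p(\mathbf{u}) > 0$; the posterior $p(t|\mathbf{u})$ is notation introduced here.

```latex
\begin{align*}
\nabla_{\mathbf{u}} E_{\text{marg}}(\mathbf{u})
  &= -\nabla_{\mathbf{u}} \log p(\mathbf{u})
   = -\frac{1}{p(\mathbf{u})} \int \nabla_{\mathbf{u}}\, p(\mathbf{u}|t)\, p(t)\, dt \\
  &= -\int \frac{p(\mathbf{u}|t)\, p(t)}{p(\mathbf{u})}\,
       \nabla_{\mathbf{u}} \log p(\mathbf{u}|t)\, dt
   = -\,\mathbb{E}_{t \sim p(t|\mathbf{u})}\!\left[ \nabla_{\mathbf{u}} \log p(\mathbf{u}|t) \right],
\end{align*}
% with the posterior over noise levels given by Bayes' rule:
% p(t|u) = p(u|t) p(t) / p(u).
```

Under these assumptions, the marginal score weights each noise level's conditional score by how plausible that noise level is given the observation $\mathbf{u}$.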

2602.18416 2026-02-23 eess.SY cs.SY math.OC

Convex Block-Cholesky Approach to Risk-Constrained Low-thrust Trajectory Design under Operational Uncertainty

Kenshiro Oguri, Gregory Lantoine

Designing robust trajectories under uncertainties is an emerging technology that may represent a key paradigm shift in space mission design. As we pursue more ambitious scientific goals (e.g., multi-moon tours, missions with extensive components of autonomy), it becomes more crucial that missions are designed with navigation (Nav) processes in mind. The effect of Nav processes is statistical by nature, as they consist of orbit determination (OD) and flight-path control (FPC). Thus, this mission design paradigm calls for techniques that appropriately quantify statistical effects of Nav, evaluate associated risks, and design missions that ensure sufficiently low risk while minimizing a statistical performance metric; a common metric is Delta-V99: worst-case (99%-quantile) Delta-V expenditure including statistical FPC efforts. In response to the need, this paper develops an algorithm for risk-constrained trajectory optimization under operational uncertainties due to initial state dispersion, navigation error, maneuver execution error, and imperfect dynamics modeling. We formulate it as a nonlinear stochastic optimal control problem and develop a computationally tractable algorithm that combines optimal covariance steering and sequential convex programming (SCP). Specifically, the proposed algorithm takes a block-Cholesky approach for convex formulation of optimal covariance steering, and leverages a recent SCP algorithm, SCvx*, for reliable numerical convergence. We apply the developed algorithm to risk-constrained, statistical trajectory optimization for exploration of dwarf planet Ceres with a Mars gravity assist, and demonstrate the robustness of the statistically-optimal trajectory and FPC policies via nonlinear Monte Carlo simulation.

2602.18408 2026-02-23 eess.SP

Modeling UAV-aided Roadside Cell-Free Networks with Matérn Hard-Core Point Processes

Chenrui Qiu, Yongxu Zhu, Bo Tan, George K. Karagiannidis, Tasos Dagiuklas

Comments Accepted for presentation at IEEE International Conference on Communications 2026

This paper investigates an uncrewed aerial vehicle (UAV)-assisted cell-free architecture for vehicular networks in road-constrained environments. Roads are modeled using a Poisson Line Process (PLP), with multi-layer roadside access points (APs) deployed via a 1-D Poisson Point Process (PPP). Each user forms a localized cell-free cluster by associating with the nearest AP in each layer along its corresponding road, yielding a road-constrained cell-free architecture. To enhance coverage, UAVs act as an aerial tier, extending access from 1-D road-constrained layouts (embedded in 2-D) to 3-D. We employ a Matérn Hard-Core (MHC) point process to model the spatial distribution of UAV base stations, ensuring a minimum safety distance between them. To enable tractable analysis of the aggregate signal from multiple APs, a distance-based power control scheme is introduced. Leveraging tools from stochastic geometry, we study the coverage probability. Furthermore, we analyze the impact of key system parameters on coverage performance, providing useful insights into the deployment and optimization of UAV-assisted cell-free vehicular networks.

2602.18396 2026-02-23 cs.LG eess.SP math.PR stat.AP stat.ML

PRISM-FCP: Byzantine-Resilient Federated Conformal Prediction via Partial Sharing

Ehsan Lari, Reza Arablouei, Stefan Werner

Comments 13 pages, 5 figures, 2 tables, Submitted to IEEE Transactions on Signal Processing (TSP)

We propose PRISM-FCP (Partial shaRing and robust calIbration with Statistical Margins for Federated Conformal Prediction), a Byzantine-resilient federated conformal prediction framework that utilizes partial model sharing to improve robustness against Byzantine attacks during both model training and conformal calibration. Existing approaches address adversarial behavior only in the calibration stage, leaving the learned model susceptible to poisoned updates. In contrast, PRISM-FCP mitigates attacks end-to-end. During training, clients partially share updates by transmitting only $M$ of $D$ parameters per round. This attenuates the expected energy of an adversary's perturbation in the aggregated update by a factor of $M/D$, yielding lower mean-square error (MSE) and tighter prediction intervals. During calibration, clients convert nonconformity scores into characterization vectors, compute distance-based maliciousness scores, and downweight or filter suspected Byzantine contributions before estimating the conformal quantile. Extensive experiments on both synthetic data and the UCI Superconductivity dataset demonstrate that PRISM-FCP maintains nominal coverage guarantees under Byzantine attacks while avoiding the interval inflation observed in standard FCP, and does so with reduced communication, providing a robust and communication-efficient approach to federated uncertainty quantification.
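
The $M/D$ attenuation factor mentioned in the abstract can be checked on a toy example. The sketch below is only an illustration under assumptions that are not stated in the abstract: the shared coordinates are chosen uniformly at random among all size-$M$ subsets, independently of the perturbation, and the unshared coordinates of the perturbation are simply dropped. The function name `expected_masked_energy` and the vector `delta` are hypothetical.

```python
from itertools import combinations


def expected_masked_energy(delta, m):
    """Exact average of the squared norm of `delta` restricted to a
    uniformly random subset of m coordinates (assumed sharing rule)."""
    d = len(delta)
    subsets = list(combinations(range(d), m))
    # Sum the energy that survives the mask, over every possible mask.
    total = sum(sum(delta[i] ** 2 for i in s) for s in subsets)
    return total / len(subsets)


# Hypothetical adversarial perturbation with D = 6 coordinates.
delta = [3.0, -1.0, 2.0, 0.5, -4.0, 1.5]
full_energy = sum(x * x for x in delta)

m = 2  # number of coordinates shared per round (M)
ratio = expected_masked_energy(delta, m) / full_energy
print(f"attenuation = {ratio:.4f}, M/D = {m / len(delta):.4f}")
```

Each coordinate appears in the same fraction $M/D$ of all masks, which is why the averaged energy scales by exactly that factor in this simplified setting; the paper's actual sharing schedule and aggregation rule may differ.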

2602.18386 2026-02-23 cs.RO cs.AI cs.LG cs.SY eess.SY

Learning to Tune Pure Pursuit in Autonomous Racing: Joint Lookahead and Steering-Gain Control with PPO

Mohamed Elgouhary, Amr S. El-Wakeel

Pure Pursuit (PP) is widely used in autonomous racing for real-time path tracking due to its efficiency and geometric clarity, yet its performance is highly sensitive to how two key parameters (lookahead distance and steering gain) are chosen. Standard velocity-based schedules adjust these only approximately and often fail to transfer across tracks and speed profiles. We propose a reinforcement-learning (RL) approach that jointly chooses the lookahead Ld and a steering gain g online using Proximal Policy Optimization (PPO). The policy observes compact state features (speed and curvature taps) and outputs (Ld, g) at each control step. Trained in F1TENTH Gym and deployed in a ROS 2 stack, the policy drives PP directly (with light smoothing) and requires no per-map retuning. Across simulation and real-car tests, the proposed RL-PP controller that jointly selects (Ld, g) consistently outperforms fixed-lookahead PP, velocity-scheduled adaptive PP, and an RL lookahead-only variant. It also exceeds a kinematic MPC raceline tracker under our evaluated settings in lap time, path-tracking accuracy, and steering smoothness, demonstrating that policy-guided parameter tuning can reliably improve classical geometry-based control.
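
For readers unfamiliar with Pure Pursuit, the sketch below shows the textbook bicycle-model steering law and where the two tuned quantities could enter. It is a minimal illustration, not the authors' implementation: the function name, the argument names, and in particular the assumption that the gain g multiplies the geometric steering angle are all hypothetical.

```python
import math


def pure_pursuit_steering(alpha, lookahead, wheelbase, gain=1.0):
    """Steering angle (rad) from the classic pure-pursuit law.

    alpha     -- angle (rad) between the vehicle heading and the lookahead point
    lookahead -- lookahead distance Ld (m), must be positive
    wheelbase -- distance between axles (m)
    gain      -- steering gain g, assumed here to scale the output angle
    """
    if lookahead <= 0.0:
        raise ValueError("lookahead distance must be positive")
    # Textbook law: delta = atan(2 * L * sin(alpha) / Ld).
    delta = math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
    return gain * delta


# Hypothetical values: a longer lookahead gives a gentler steering command.
short = pure_pursuit_steering(alpha=0.3, lookahead=0.8, wheelbase=0.33)
long_ = pure_pursuit_steering(alpha=0.3, lookahead=2.0, wheelbase=0.33)
print(f"Ld=0.8 m: {short:.3f} rad, Ld=2.0 m: {long_:.3f} rad")
```

In a learned-tuning setup such as the one described, a policy would supply `lookahead` and `gain` at each control step instead of the fixed values used here.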

2602.18382 2026-02-23 eess.SY cs.SY math.OC

Incremental Input-to-State Stability and Equilibrium Tracking for Stochastic Contracting Dynamics

Yu Kawano, Simone Betteti, Alexander Davydov, Francesco Bullo

In this paper, we study the contractivity of nonlinear stochastic differential equations (SDEs) driven by deterministic inputs and Brownian motions. Given a weighted $\ell_2$-norm for the state space, we show that an SDE is incrementally noise- and input-to-state stable if its vector field is uniformly contracting in the state and uniformly Lipschitz in the input. This result is applied to error estimation for time-varying equilibrium tracking in the presence of noise affecting both the system dynamics and the input signals. We consider both Ornstein-Uhlenbeck processes modeling unbounded noise and Jacobi diffusion processes modeling bounded noise. Finally, we turn our attention to the associated Fokker-Planck equation of an SDE. For this context, we prove incremental input-to-state stability with respect to an arbitrary $p$-Wasserstein metric when the drift vector field is uniformly contracting in the state and uniformly Lipschitz in the input with respect to an arbitrary norm.

2602.18376 2026-02-23 eess.SY cs.SY

Parameter Update Laws for Adaptive Control with Affine Equality Parameter Constraints

Ashwin P. Dani

In this paper, two constrained parameter update laws for adaptive control with a convex equality constraint on the parameters are developed: one based on a gradient-only update and the other incorporating a concurrent learning (CL) update. The update laws are derived by solving a constrained optimization problem with affine equality constraints. This constrained problem is reformulated as an equivalent unconstrained problem in a new variable, thereby eliminating the equality constraints. The resulting update law is integrated with an adaptive trajectory tracking controller, enabling online learning of the unknown system parameters. Lyapunov stability of the closed-loop system with the equality-constrained parameter update law is established. The effectiveness of the proposed equality-constrained adaptive control law is demonstrated through simulations, which validate its ability to maintain the constraints on the parameter estimates, to converge to the true parameters under the CL-based update law, and to achieve asymptotic and exponential tracking performance for the constrained gradient and constrained CL-based update laws, respectively.

2602.18365 2026-02-23 eess.SY cs.SY

A Marginal Reliability Impact Based Accreditation Framework for Capacity Markets

Feng Zhao, Tongxin Zheng, Dane Schiro, Xiaochu Wang

This paper presents a Marginal Reliability Impact (MRI) based resource accreditation framework for capacity market design. Under this framework, a resource is accredited based on its marginal impact on system reliability, thus aligning the resource accreditation value with its reliability contribution. A key feature of the MRI based accreditation is that the accredited capacities supplied by different resources to the capacity market are substitutable in reliability contribution, a desired feature of homogeneous products. Moreover, with MRI based capacity demand, substitutability between supply and demand for capacity is also achieved. As a result, a capacity market with the MRI based capacity product can better characterize the underlying resource adequacy problem and lead to more efficient market outcomes.

2602.18355 2026-02-23 eess.AS

Rethinking Flow and Diffusion Bridge Models for Speech Enhancement

Dahan Wang, Jun Gao, Tong Lei, Yuxiang Hu, Changbao Zhu, Kai Chen, Jing Lu

Comments Accepted by the 40th AAAI Conference on Artificial Intelligence (AAAI-26)

Flow matching and diffusion bridge models have emerged as leading paradigms in generative speech enhancement, modeling stochastic processes between paired noisy and clean speech signals based on principles such as flow matching, score matching, and Schrödinger bridge. In this paper, we present a framework that unifies existing flow and diffusion bridge models by interpreting them as constructions of Gaussian probability paths with varying means and variances between paired data. Furthermore, we investigate the underlying consistency between the training/inference procedures of these generative models and conventional predictive models. Our analysis reveals that each sampling step of a well-trained flow or diffusion bridge model optimized with a data prediction loss is theoretically analogous to executing predictive speech enhancement. Motivated by this insight, we introduce an enhanced bridge model that integrates an effective probability path design with key elements from predictive paradigms, including improved network architecture, tailored loss functions, and optimized training strategies. Experiments on denoising and dereverberation tasks demonstrate that the proposed method outperforms existing flow and diffusion baselines with fewer parameters and reduced computational complexity. The results also highlight that the inherently predictive nature of this generative framework imposes limitations on its achievable upper-bound performance.

2602.18332 2026-02-23 eess.SP

MD-AirComp+: Adaptive Quantization for Blind Massive Digital Over-the-Air Computation

Li Qiao, Yueqing Wang, Hanjun Jiang, Xinhua Liu, Yixuan Xing, Yongpeng Wu, Zhen Gao

Comments Accepted for publication in Chinese Journal of Electronics

Recent research has shown that unsourced massive access (UMA) is naturally well-suited for over-the-air computation (AirComp), as it does not require knowledge of each individual signal, as demonstrated by the massive digital AirComp (MD-AirComp) scheme proposed in prior work. The MD-AirComp scheme has proven effective in federated edge learning and is highly compatible with current digital wireless networks. However, it depends on channel pre-equalization, which may amplify computation errors in the presence of channel estimation inaccuracies, thus limiting its practical use. In this paper, we propose a blind MD-AirComp+ scheme, which takes advantage of the channel hardening effect in massive multiple-input multiple-output (MIMO) systems. We provide an upper bound on the computation mean square error, analyze the trade-off between computation accuracy and communication overhead, and determine the optimal quantization level. Additionally, we introduce a deep unfolding algorithm to reduce the computational complexity of solving the underdetermined detection problem formulated as a least absolute shrinkage and selection operator optimization problem. Simulation results confirm the effectiveness of the proposed MD-AirComp+ framework, the optimal quantization selection strategy, and the low-complexity detection algorithm.

2602.18331 2026-02-23 eess.SY cs.SY

Koopman-BoxQP: Solving Large-Scale NMPC at kHz Rates

Liang Wu, Wallace Gian Yion Tan, Richard D. Braatz, Ján Drgoňa

Comments Accepted by the 8th Annual Learning for Dynamics and Control Conference (L4DC 2026). arXiv admin note: text overlap with arXiv:2602.15596

Solving large-scale nonlinear model predictive control (NMPC) problems at kilohertz (kHz) rates on standard processors remains a formidable challenge. This paper proposes a Koopman-BoxQP framework that i) learns a linear Koopman high-dimensional model, ii) eliminates the high-dimensional observables to construct a multi-step prediction model of the states and control inputs, iii) penalizes the multi-step prediction model into the objective, which results in a structured box-constrained quadratic program (BoxQP) whose decision variables include both the system states and control inputs, iv) develops a structure-exploited and warm-starting-supported variant of the feasible Mehrotra's interior-point algorithm for BoxQP. Numerical results demonstrate that Koopman-BoxQP can solve a large-scale NMPC problem with $1040$ variables and $2080$ inequalities at a kHz rate.

2602.18263 2026-02-23 eess.SP cs.IT math.IT

Channel Estimation for Double-BD-RIS-Assisted Multi-User MIMO Communication

Junyuan Gao, Shuowen Zhang, Liang Liu

Deploying multiple beyond diagonal reconfigurable intelligent surfaces (BD-RISs) can potentially improve communication performance thanks to the inter-element connections of each BD-RIS and the inter-surface cooperative beamforming gain among BD-RISs. However, a major issue for multi-BD-RIS-assisted communication lies in the channel estimation overhead: the channel coefficients associated with the off-diagonal elements in each BD-RIS's scattering matrix, as well as those associated with the reflection links among BD-RISs, have to be estimated. In this paper, we propose an efficient channel estimation framework for double-BD-RIS-assisted multi-user multiple-input multiple-output (MIMO) systems. Specifically, we reveal that the high-dimensional cascaded channels are characterized by five low-dimensional matrices by exploiting channel correlation properties. Based on this novel observation, in the ideal noiseless case, we develop a channel estimation scheme to recover these matrices sequentially and characterize in closed form the overhead required for perfect estimation as a function of the number of users and each BD-RIS's number of elements and channel ranks, which is of the same order as that in double-diagonal-RIS-aided communication systems. This exciting result implies the superiority of cooperative BD-RIS-aided communication over its diagonal-RIS counterpart even when channel estimation overhead is considered. We further extend the proposed scheme to practical noisy scenarios and provide extensive numerical simulations to validate its effectiveness.

2602.18261 2026-02-23 eess.SY cs.SY nlin.CD

Accurate Data-Based State Estimation from Power Loads Inference in Electric Power Grids

Philippe Jacquod, Laurent Pagnier, Daniel J. Gauthier

Comments 10 pages, 10 figures

Accurate state estimation is a crucial requirement for the reliable operation and control of electric power systems. Here, we construct a data-driven, numerical method to infer missing power load values in large-scale power grids. Given partial observations of power demands, the method estimates the operational state using a linear regression algorithm, exploiting statistical correlations within synthetic training datasets. We evaluate the performance of the method on three synthetic transmission grid test systems. Numerical experiments demonstrate the high accuracy achieved by the method in reconstructing missing demand values under various operating conditions. We further apply the method to real data for the transmission power grid of Switzerland. Despite the restricted number of observations in this dataset, the method infers missing power loads rather accurately. Furthermore, Newton-Raphson power flow solutions show that deviations between true and inferred values for power loads result in smaller deviations between true and inferred values for flows on power lines. This ensures that the estimated operational state correctly captures potential line contingencies. Overall, our results indicate that simple data-based regression techniques can provide an efficient and reliable alternative for state estimation in modern power grids.

2602.18254 2026-02-23 eess.SP cs.IT math.IT

m^3TrackFormer: Transformer-based mmWave Multi-Target Tracking with Lost Target Re-Acquisition Capability

Tongkai Li, Weifeng Zhu, Shuowen Zhang, Jiannong Cao, Shuguang Cui, Liang Liu

This paper considers a millimeter wave (mmWave) integrated sensing and communication (ISAC) system, where a base station (BS) equipped with a large number of antennas but a small number of radio-frequency (RF) chains emits pencil-like narrow beams for persistent tracking of multiple moving targets. Under this model, tracking loss arising from misalignment between the pencil-like beams and the true target positions is inevitable, especially when the target trajectories are complex, and the conventional Kalman filter-based scheme does not work well. To deal with this issue, we propose a Transformer-based mmWave multi-target tracking framework, namely m^3TrackFormer, with a novel re-acquisition mechanism, such that even if the echo signals from some targets are too weak to extract sensing information, we are able to re-acquire their locations quickly with small beam sweeping overhead. Specifically, the proposed framework operates in one of two modes, normal tracking or target re-acquisition, depending on whether tracking loss has occurred. When all targets are hit by the swept beams, the framework works in the Normal Tracking Mode (N-Mode) with a Transformer encoder-based Normal Tracking Network (N-Net) to accurately estimate the positions of these targets and predict the swept beams in the next time block. When tracking loss occurs, the framework switches to the Re-Acquisition Mode (R-Mode) with a Transformer decoder-based Re-Acquisition Network (R-Net) to adjust the beam sweeping strategy, recovering the lost targets while maintaining the tracking of the remaining targets. Thanks to its ability to extract global trajectory features, m^3TrackFormer achieves high beam prediction accuracy and quickly re-acquires lost targets compared with other tracking methods.

2602.18247 2026-02-23 eess.SY cs.SY math.OC

Hybrid Control of ADT Switched Linear Systems subject to Actuator Saturation

Fen Wu, Chengzhi Yuan

This paper develops a hybrid output-feedback control framework for average dwell-time (ADT) switched linear systems subject to actuator saturation. The considered subsystems may be exponentially unstable, and the saturation nonlinearity is explicitly handled through a deadzone-based representation. The proposed hybrid controller combines mode-dependent full-order dynamic output-feedback controllers with a supervisory reset mechanism that updates controller states at switching instants. By incorporating the reset rule directly into the synthesis conditions, switching boundary constraints and performance requirements are addressed in a unified convex formulation. Sufficient conditions are derived in terms of linear matrix inequalities (LMIs) to guarantee exponential stability under ADT switching and a prescribed weighted ${\cal L}_2$-gain disturbance attenuation level for energy-bounded disturbances. An explicit controller construction algorithm is provided based on feasible LMI solutions. Simulation results demonstrate the effectiveness and computational tractability of the proposed approach and highlight its advantages over existing output-feedback saturation control methods.

2511.18554 2026-02-23 cs.DS cs.LG cs.SY eess.SY

Online Smoothed Demand Management

Adam Lechowicz, Nicolas Christianson, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

Comments Accepted to SIGMETRICS '26. 65 pages, 11 figures

We introduce and study a class of online problems called online smoothed demand management $(\texttt{OSDM})$, motivated by paradigm shifts in grid integration and energy storage for large energy consumers such as data centers. In $\texttt{OSDM}$, an operator makes two decisions at each time step: an amount of energy to be purchased, and an amount of energy to be delivered (i.e., used for computation). The difference between these decisions charges (or discharges) the operator's energy storage (e.g., a battery). Two types of demand arrive online: base demand, which must be covered at the current time, and flexible demand, which can be satisfied at any time before a demand-specific deadline $\Delta_t$. The operator's goal is to minimize a cost (subject to the above constraints) that combines a cost of purchasing energy, a cost for delivering energy (if applicable), and smoothness penalties on the purchasing and delivery rates to discourage fluctuations and encourage "grid healthy" decisions. $\texttt{OSDM}$ generalizes several problems in the online algorithms literature while being the first to fully model applications of interest. We propose a competitive algorithm for $\texttt{OSDM}$ called $\texttt{PAAD}$ (partitioned accounting & aggregated decisions) and show it achieves the optimal competitive ratio. To overcome the pessimism typical of worst-case analysis, we also propose a novel learning framework that provides guarantees on the worst-case competitive ratio (i.e., to provide robustness against nonstationarity) while allowing end-to-end differentiable learning of the best algorithm on historical instances of the problem. We evaluate our algorithms in a case study of a grid-integrated data center with battery storage, showing that $\texttt{PAAD}$ effectively solves the problem and end-to-end learning achieves substantial performance improvements compared to $\texttt{PAAD}$.

2510.13887 2026-02-23 eess.IV cs.AI cs.LG stat.ML

Incomplete Multi-view Clustering via Hierarchical Semantic Alignment and Cooperative Completion

Xiaojian Ding, Lin Zhao, Xian Li, Xiaoying Zhu

Comments 13 pages, conference paper. Accepted to the Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS 2025)

Incomplete multi-view data, where certain views are entirely missing for some samples, poses significant challenges for traditional multi-view clustering methods. Existing deep incomplete multi-view clustering approaches often rely on static fusion strategies or two-stage pipelines, leading to suboptimal fusion results and error propagation issues. To address these limitations, this paper proposes a novel incomplete multi-view clustering framework based on Hierarchical Semantic Alignment and Cooperative Completion (HSACC). HSACC achieves robust cross-view fusion through a dual-level semantic space design. In the low-level semantic space, consistency alignment is ensured by maximizing mutual information across views. In the high-level semantic space, adaptive view weights are dynamically assigned based on the distributional affinity between individual views and an initial fused representation, followed by weighted fusion to generate a unified global representation. Additionally, HSACC implicitly recovers missing views by projecting aligned latent representations into high-dimensional semantic spaces and jointly optimizes reconstruction and clustering objectives, enabling cooperative learning of completion and clustering. Experimental results demonstrate that HSACC significantly outperforms state-of-the-art methods on five benchmark datasets. Ablation studies validate the effectiveness of the hierarchical alignment and dynamic weighting mechanisms, while parameter analysis confirms the model's robustness to hyperparameter variations. The code is available at https://github.com/XiaojianDing/2025-NeurIPS-HSACC.

2510.06170 2026-02-23 eess.IV cs.AI cs.CV

Smartphone-based iris recognition through high-quality visible-spectrum iris image capture.V2

Naveenkumar G Venkataswamy, Yu Liu, Soumyabrata Dey, Stephanie Schuckers, Masudul H Imtiaz

Comments The new version is available at arXiv:2512.15548

Smartphone-based iris recognition in the visible spectrum (VIS) remains difficult due to illumination variability, pigmentation differences, and the absence of standardized capture controls. This work presents a compact end-to-end pipeline that enforces ISO/IEC 29794-6 quality compliance at acquisition and demonstrates that accurate VIS iris recognition is feasible on commodity devices. Using a custom Android application performing real-time framing, sharpness evaluation, and feedback, we introduce the CUVIRIS dataset of 752 compliant images from 47 subjects. A lightweight MobileNetV3-based multi-task segmentation network (LightIrisNet) is developed for efficient on-device processing, and a transformer matcher (IrisFormer) is adapted to the VIS domain. Under a standardized protocol and comparative benchmarking against prior CNN baselines, OSIRIS attains a TAR of 97.9% at FAR=0.01 (EER=0.76%), while IrisFormer, trained only on UBIRIS.v2, achieves an EER of 0.057% on CUVIRIS. The acquisition app, trained models, and a public subset of the dataset are released to support reproducibility. These results confirm that standardized capture and VIS-adapted lightweight models enable accurate and practical iris recognition on smartphones.

2510.01675 2026-02-23 cs.RO cs.SY eess.SY

Geometric Backstepping Control of Omnidirectional Tiltrotors Incorporating Servo-Rotor Dynamics for Robustness against Sudden Disturbances

Jaewoo Lee, Dongjae Lee, Jinwoo Lee, Hyungyu Lee, Yeonjoon Kim, H. Jin Kim

Comments Accepted to ICRA 2026

This work presents a geometric backstepping controller for a variable-tilt omnidirectional multirotor that explicitly accounts for both servo and rotor dynamics. Considering actuator dynamics is essential for more effective and reliable operation, particularly during aggressive flight maneuvers or recovery from sudden disturbances. While prior studies have investigated actuator-aware control for conventional and fixed-tilt multirotors, these approaches rely on linear relationships between actuator input and wrench, which cannot capture the nonlinearities induced by variable tilt angles. In this work, we exploit the cascade structure between the rigid-body dynamics of the multirotor and its nonlinear actuator dynamics to design the proposed backstepping controller and establish exponential stability of the overall system. Furthermore, we reveal parametric uncertainty in the actuator model through experiments, and we demonstrate that the proposed controller remains robust against such uncertainty. The controller was compared against a baseline that does not account for actuator dynamics across three experimental scenarios: fast translational tracking, rapid rotational tracking, and recovery from sudden disturbance. The proposed method consistently achieved better tracking performance, and notably, while the baseline diverged and crashed during the fastest translational trajectory tracking and the recovery experiment, the proposed controller maintained stability and successfully completed the tasks, thereby demonstrating its effectiveness.

2509.12253 2026-02-23 eess.IV cs.AI cs.LG

Physics-Informed Neural Networks vs. Physics Models for Non-Invasive Glucose Monitoring: A Comparative Study Under Noise-Stressed Synthetic Conditions

Riyaadh Gani

Non-invasive glucose monitoring outside controlled settings is dominated by low signal-to-noise ratio (SNR): hardware drift, environmental variation, and physiology suppress the glucose signature in near-infrared (NIR) signals. We present a noise-stressed NIR simulator that injects 12-bit ADC quantisation, LED drift, photodiode dark noise, temperature/humidity variation, contact-pressure noise, Fitzpatrick I-VI melanin, and glucose variability to create a low-correlation regime ($\rho_{\text{glucose-NIR}} = 0.21$). Using this platform, we benchmark six methods: Enhanced Beer-Lambert (physics-engineered ridge regression), Original PINN, Optimised PINN, RTE-inspired PINN, Selective RTE PINN, and a shallow DNN (SDNN). The physics-engineered Beer-Lambert model achieves the lowest error (13.6 mg/dL RMSE) with only 56 parameters and 0.01 ms inference, outperforming deeper PINNs and the SDNN baseline under low-SNR conditions. The study reframes the task as noise suppression under weak signal and shows that carefully engineered physics features can outperform higher-capacity models in this regime.

2509.04055 2026-02-23 eess.SP

Constellation Shaping for OFDM-ISAC Systems: From Theoretical Bounds to Practical Implementation

Benedikt Geiger, Fan Liu, Shihang Lu, Andrej Rode, Daniel Gil Gaviria, Charlotte Muth, Laurent Schmalen

Comments 16 pages, 15 figures, Accepted at IEEE Transactions on Communications (TCOM)

English abstract

Integrated sensing and communications (ISAC) promises new use cases for mobile communication systems by reusing the communication signal for radar-like sensing. However, sensing and communications (S&C) impose conflicting requirements on the modulation format, resulting in a trade-off between their respective performance. This paper investigates constellation shaping as a means to simultaneously improve S&C performance in orthogonal frequency division multiplexing (OFDM)-based ISAC systems. We first derive how the transmit symbols affect detection performance and then establish theoretical lower and upper bounds on the maximum achievable information rate under a given sensing constraint. Using an autoencoder-based optimization, we investigate geometric, probabilistic, and joint constellation shaping (the last combining the first two) under both optimal maximum a posteriori decoding and practical bit-metric decoding. Our results show that constellation shaping enables a flexible trade-off between S&C, can approach the derived upper bound, and significantly outperforms conventional modulation formats. Motivated by its practical implementation feasibility, we review probabilistic amplitude shaping (PAS) and propose a generalization tailored to ISAC. For this generalization, we propose a low-complexity log-likelihood ratio computation with negligible rate loss. We demonstrate that combining conventional and generalized PAS enables a flexible and low-complexity trade-off between S&C, closely approaching the performance of joint constellation shaping.

2508.16179 2026-02-23 cs.LG cs.AI eess.SP

Motor Imagery EEG Signal Classification Using Minimally Random Convolutional Kernel Transform and Hybrid Deep Learning

Jamal Hwaidi, Mohamed Chahine Ghanem

English abstract

The brain-computer interface (BCI) establishes a non-muscle channel that enables direct communication between the human body and an external device. Electroencephalography (EEG) is a popular non-invasive technique for recording brain signals. It is critical to process and comprehend the hidden patterns linked to a specific cognitive or motor task, for instance, those measured through the motor imagery brain-computer interface (MI-BCI). Classifying motor imagery-based electroencephalogram (MI-EEG) tasks is a significant challenge, given that EEG signals exhibit nonstationarity, time-variance, and individual diversity. Obtaining good classification accuracy is also very difficult due to the growing number of classes and the natural variability among individuals. To overcome these issues, this paper proposes a novel method for classifying EEG motor imagery signals that extracts features efficiently with the Minimally Random Convolutional Kernel Transform (MiniRocket); a linear classifier then uses the extracted features for activity recognition. Furthermore, a novel deep learning architecture based on a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) is proposed as a baseline, and we demonstrate that classification via MiniRocket's features achieves higher performance than the best deep learning models at lower computational cost. The PhysioNet dataset was used to evaluate the performance of the proposed approaches. The proposed models achieved mean accuracy values of 98.63% and 98.06% for the MiniRocket and CNN-LSTM, respectively. The findings demonstrate that the proposed approach can significantly enhance motor imagery EEG accuracy and provide new insights into the feature extraction and classification of MI-EEG.

2507.11551 2026-02-23 eess.IV cs.AI cs.CV

Landmark Detection for Medical Images using a General-purpose Segmentation Model

Ekaterina Stansfield, Jennifer A. Mitterer, Abdulrahman Altahhan

Comments 13 pages, 8 figures, 2 tables. Submitted to ICONIP 2025

English abstract

Radiographic images are a cornerstone of medical diagnostics in orthopaedics, with anatomical landmark detection serving as a crucial intermediate step for information extraction. General-purpose foundational segmentation models, such as SAM (Segment Anything Model), do not support landmark segmentation out of the box and require prompts to function. However, in medical imaging, the prompts for landmarks are highly specific. Since SAM has not been trained to recognize such landmarks, it cannot generate accurate landmark segmentations for diagnostic purposes. Even MedSAM, a medically adapted variant of SAM, has been trained to identify larger anatomical structures, such as organs and their parts, and lacks the fine-grained precision required for orthopaedic pelvic landmarks. To address this limitation, we propose leveraging another general-purpose, non-foundational model: YOLO. YOLO excels in object detection and can provide bounding boxes that serve as input prompts for SAM. While YOLO is efficient at detection, it is significantly outperformed by SAM in segmenting complex structures. In combination, these two models form a reliable pipeline capable of segmenting not only a small pilot set of eight anatomical landmarks but also an expanded set of 72 landmarks and 16 regions with complex outlines, such as the femoral cortical bone and the pelvic inlet. By using YOLO-generated bounding boxes to guide SAM, we trained the hybrid model to accurately segment orthopaedic pelvic radiographs. Our results show that the proposed combination of YOLO and SAM yields excellent performance in detecting anatomical landmarks and intricate outlines in orthopaedic pelvic radiographs.

2501.06945 2026-02-23 eess.SP cs.NA math.NA

OpenGERT: Open Source Automated Geometry Extraction with Geometric and Electromagnetic Sensitivity Analyses for Ray-Tracing Propagation Models

Serhat Tadik, Rajib Bhattacharjea, Johnathan Corgan, David Johnson, Jacobus Van der Merwe, Gregory D. Durgin

Comments This work is accepted for publication at the IEEE DySPAN 2025 conference and the copyright has been transferred to IEEE. Due to a code bug, all results and analysis reported as 'mean excess delay' are actually the 'mean delay' and should be interpreted accordingly.

English abstract

Accurate RF propagation modeling in urban environments is critical for developing digital spectrum twins and optimizing wireless communication systems. We introduce OpenGERT, an open-source automated Geometry Extraction tool for Ray Tracing, which collects and processes terrain and building data from OpenStreetMap, Microsoft Global ML Building Footprints, and USGS elevation data. Using the Blender Python API, it creates detailed urban models for high-fidelity simulations with NVIDIA Sionna RT. We perform sensitivity analyses to examine how variations in building height, position, and electromagnetic material properties affect ray-tracing accuracy. Specifically, we present pairwise dispersion plots of channel statistics (path gain, mean excess delay, delay spread, link outage, and Rician K-factor) and investigate how their sensitivities change with distance from transmitters. We also visualize the variance of these statistics for selected transmitter locations to gain deeper insights. Our study covers Munich and Etoile scenes, each with 10 transmitter locations. For each location, we apply five types of perturbations: material, position, height, height-position, and all combined, with 50 perturbations each. Results show that small changes in permittivity and conductivity minimally affect channel statistics, whereas variations in building height and position significantly alter all statistics, even with noise standard deviations of 1 meter in height and 0.4 meters in position. These findings highlight the importance of precise environmental modeling for accurate propagation predictions, essential for digital spectrum twins and advanced communication networks. The code for geometry extraction and sensitivity analyses is available at github.com/serhatadik/OpenGERT/.
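
Several of the channel statistics named in this abstract follow directly from a power delay profile. The sketch below computes three of them under common textbook conventions, with excess delay measured relative to the first arriving path. The function name and input format are assumptions and are not taken from the OpenGERT code.

```python
import math


def delay_statistics(delays_s, powers_lin):
    """Return (mean_delay, mean_excess_delay, rms_delay_spread) in seconds.

    delays_s: path delays in seconds; powers_lin: matching linear path powers.
    """
    if len(delays_s) != len(powers_lin) or not delays_s:
        raise ValueError("need equally sized, non-empty delay and power lists")
    total = sum(powers_lin)
    if total <= 0:
        raise ValueError("total power must be positive")
    # Power-weighted first moment of the delay profile.
    mean_delay = sum(p * t for p, t in zip(powers_lin, delays_s)) / total
    # Same moment, shifted so that the earliest path sits at zero delay.
    mean_excess_delay = mean_delay - min(delays_s)
    # Power-weighted standard deviation of the delays.
    variance = sum(p * (t - mean_delay) ** 2
                   for p, t in zip(powers_lin, delays_s)) / total
    return mean_delay, mean_excess_delay, math.sqrt(variance)
```

Under these conventions the mean delay and the mean excess delay differ only by the first-arrival offset, which is the distinction the authors' comment draws. The delay spread does not depend on that offset.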

2602.18165 2026-02-23 cs.IT cs.CR eess.SP math.IT

Uncertainty-Aware Jamming Mitigation with Active RIS: A Robust Stackelberg Game Approach

Xiao Tang, Zhen Ma, Limeng Dong, Yichen Wang, Qinghe Du, Dusit Niyato, Zhu Han

Comments Accepted @ IEEE TIFS

English abstract

Malicious jamming poses a pervasive threat to secure communications, and the challenge grows more severe as increasingly capable jammers adapt to legitimate transmissions. This paper investigates jamming mitigation using an active reconfigurable intelligent surface (ARIS), explicitly addressing channel uncertainties for robust anti-jamming design. To this end, we adopt a Stackelberg game formulation to model the strategic interaction between the legitimate side and the adversary, which act as the leader and follower, respectively. We prove the existence of the game equilibrium and adopt the backward induction method for equilibrium analysis. We first derive the optimal jamming policy as the follower's best response, which is then incorporated into the legitimate-side optimization for robust anti-jamming design. We address the uncertainty and reformulate the legitimate-side problem by exploiting the error bounds to combat worst-case jamming attacks. The problem is decomposed within a block successive upper bound minimization (BSUM) framework into power allocation, transceiving beamforming, and active reflection subproblems, which are solved iteratively to obtain the robust jamming mitigation scheme. Simulation results demonstrate the effectiveness of the proposed scheme in protecting legitimate transmissions under uncertainty, and its superior jamming mitigation performance compared with the baselines.

2602.18119 2026-02-23 eess.IV cs.AI cs.CV cs.LG

RamanSeg: Interpretability-driven Deep Learning on Raman Spectra for Cancer Diagnosis

Chris Tomy, Mo Vali, David Pertzborn, Tammam Alamatouri, Anna Mühlig, Orlando Guntinas-Lichius, Anna Xylander, Eric Michele Fantuzzi, Matteo Negro, Francesco Crisafi, Pietro Lio, Tiago Azevedo

Comments 12 pages, 8 figures

English abstract

Histopathology, the current gold standard for cancer diagnosis, involves the manual examination of tissue samples after chemical staining, a time-consuming process requiring expert analysis. Raman spectroscopy is an alternative, stain-free method of extracting information from samples. Using nnU-Net, we trained a segmentation model on a novel dataset of spatial Raman spectra aligned with tumour annotations, achieving a mean foreground Dice score of 80.9%, surpassing previous work. Furthermore, we propose a novel, interpretable, prototype-based architecture called RamanSeg. RamanSeg classifies pixels based on discovered regions of the training set, generating a segmentation mask. Two variants of RamanSeg allow a trade-off between interpretability and performance: one with prototype projection and another projection-free version. The projection-free RamanSeg outperformed a U-Net baseline with a mean foreground Dice score of 67.3%, offering a meaningful improvement over a black-box training approach.
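
For readers unfamiliar with the metric, the sketch below shows one common way to compute a mean foreground Dice score from flattened label maps. The abstract does not state how the authors average (per class, per image, or both), so the per-class averaging and the function names here are assumptions.

```python
def dice_score(pred_mask, target_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred_mask) != len(target_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in target_mask if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_foreground_dice(pred_labels, target_labels, foreground_classes):
    """Average the per-class Dice over the given foreground class ids."""
    scores = []
    for cls in foreground_classes:
        pred_mask = [label == cls for label in pred_labels]
        target_mask = [label == cls for label in target_labels]
        scores.append(dice_score(pred_mask, target_mask))
    return sum(scores) / len(scores)
```

Background pixels (class 0 in a typical labelling) are left out of the average, so a model cannot score well simply by predicting background everywhere.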

2602.18104 2026-02-23 cs.SD cs.AI cs.LG eess.AS

MeanVoiceFlow: One-step Nonparallel Voice Conversion with Mean Flows

Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo

Comments Accepted to ICASSP 2026. Project page: https://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/meanvoiceflow/

English abstract

In voice conversion (VC) applications, diffusion and flow-matching models have exhibited exceptional speech quality and speaker similarity performances. However, they are limited by slow conversion owing to their iterative inference. Consequently, we propose MeanVoiceFlow, a novel one-step nonparallel VC model based on mean flows, which can be trained from scratch without requiring pretraining or distillation. Unlike conventional flow matching that uses instantaneous velocity, mean flows employ average velocity to more accurately compute the time integral along the inference path in a single step. However, training the average velocity requires its derivative to compute the target velocity, which can cause instability. Therefore, we introduce a structural margin reconstruction loss as a zero-input constraint, which moderately regularizes the input-output behavior of the model without harmful statistical averaging. Furthermore, we propose conditional diffused-input training in which a mixture of noise and source data is used as input to the model during both training and inference. This enables the model to effectively leverage source information while maintaining consistency between training and inference. Experimental results validate the effectiveness of these techniques and demonstrate that MeanVoiceFlow achieves performance comparable to that of previous multi-step and distillation-based models, even when trained from scratch. Audio samples are available at https://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/meanvoiceflow/.
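
The relationship between instantaneous and average velocity described in this abstract can be written out as follows. This is a sketch in notation commonly used for mean flows, with $z_t$ the state at time $t$, $v$ the instantaneous velocity, and $u$ the average velocity over the interval $[r, t]$. The symbols are assumptions, since the abstract does not fix any notation.

```latex
\begin{align}
u(z_t, r, t) &= \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\, d\tau, \\
z_r &= z_t - (t - r)\, u(z_t, r, t), \\
u(z_t, r, t) &= v(z_t, t) - (t - r)\, \frac{d}{dt} u(z_t, r, t).
\end{align}
```

The second line is the single-step update that replaces iterative integration. The third follows from differentiating $(t - r)\,u$ with respect to $t$, and the total derivative on its right-hand side is the term the abstract identifies as a potential source of training instability.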

2602.18086 2026-02-23 eess.SP cs.ET cs.IT math.IT

Non-Contiguous Wi-Fi Spectrum for ISAC: Impact on Multipath Delay Estimation

Ana Jeknić, Aleš Švigelj, Tomaž Javornik, Andrej Hrovat

Comments 12 pages, 7 figures (4 figures contain 2 pictures each, so total 11 pictures in form of 7 figures)

English abstract

Leveraging channel state information (CSI) from multiple Wi-Fi bands can improve delay resolution for ranging and sensing when a wide contiguous spectrum is unavailable. However, frequency gaps shape the delay response, introducing sidelobes and secondary peaks that can obscure closely spaced multipath components. This paper examines multipath delay estimation for Wi-Fi-compliant multiband configurations using CSI. For a two-path model with unknown complex gains and delays, the Cramér-Rao lower bound (CRLB) for delay separation is derived and analyzed, confirming the benefit of a larger frequency aperture while revealing pronounced, separation-dependent oscillations driven by gap geometry and inter-path coupling. Because the CRLB is a local bound, the delay response is analyzed next. In the single-path case, the combined subband responses determine how delay-domain sidelobe levels are distributed, and the dominant peak spacing is set primarily by the separation between subband center frequencies. In the two-path case, increased aperture sharpens the mainlobe but also intensifies sidelobes and leakage, yielding competing peaks and, in some regimes, a dominant peak shifted from the true delay. Finally, a normalized leakage metric is introduced to predict problematic separations and to identify regimes where local CRLB analysis does not capture practical peak-leakage behavior in delay estimation.
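
The single-path observation about peak spacing can be reproduced with a small numerical sketch. It evaluates the normalised matched-filter delay response of two equal subbands separated by a gap. The subcarrier layout, the 20 MHz widths, and the 100 MHz centre separation are hypothetical values chosen for illustration, not configurations from the paper.

```python
import cmath
import math


def subband(center_hz, bandwidth_hz, n_tones):
    """Evenly spaced tone frequencies placed symmetrically around center_hz."""
    step = bandwidth_hz / n_tones
    return [center_hz - bandwidth_hz / 2 + (k + 0.5) * step for k in range(n_tones)]


def delay_response(freqs_hz, tau_s):
    """Normalised magnitude of the sum of unit phasors exp(j*2*pi*f*tau)."""
    acc = sum(cmath.exp(2j * math.pi * f * tau_s) for f in freqs_hz)
    return abs(acc) / len(freqs_hz)


if __name__ == "__main__":
    # Two hypothetical 20 MHz subbands whose centres are 100 MHz apart
    # (frequencies are written as offsets from the lower band centre).
    tones = subband(0.0, 20e6, 64) + subband(100e6, 20e6, 64)
    for tau_ns in (0.0, 5.0, 10.0, 20.0):
        print(tau_ns, round(delay_response(tones, tau_ns * 1e-9), 3))
```

For two identical subbands the response factors into a single-subband envelope multiplied by $|\cos(\pi \Delta f_c \tau)|$, where $\Delta f_c$ is the centre separation. Secondary peaks therefore recur every $1/\Delta f_c$ (10 ns in this example) and decay only as fast as the much wider single-subband envelope allows.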

2602.18076 2026-02-23 eess.SP

Extremely Large Antenna Spacing Method for Enhanced Wideband Near-Field Sensing

Tommaso Bacchielli, Lorenzo Pucci, Andrea Giorgetti

Comments 14 pages, 8 figures

English abstract

This paper proposes a monostatic wideband system for integrated sensing and communication (ISAC) at millimeter-wave frequencies, based on multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM). The system operates in a hybrid near-/far-field regime. The transmitter (Tx) operates in the far field (FF) and uses low-complexity beam steering. The receiver (Rx), on the other hand, operates in a pervasive near field (NF), enabled by a very large effective array aperture. To enable a fully digital implementation, we introduce an extremely large antenna spacing (ELAS) design. This design attains the required aperture with only a few widely spaced antenna elements while avoiding grating lobes in the composite Tx-Rx response. We analytically characterize the NF range-angle response of this architecture and study the interplay between NF effects and waveform bandwidth. This leads to the definition of a super-resolution region, where NF propagation at the Rx dominates the achievable range resolution and surpasses the classical, bandwidth-limited resolution. As a case study, we consider an extended target modeled as a collection of scatterers and assess localization performance via maximum-likelihood estimation. Numerical results evaluated in terms of root mean square error (RMSE) and generalized optimal sub-pattern assignment (GOSPA) show that operating in NF conditions with the ELAS-based design yields significant gains compared to a conventional FF baseline at both the Tx and Rx.
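
Two textbook quantities frame the regimes discussed in this abstract: the Fraunhofer distance $2D^2/\lambda$, often used as the boundary between near field and far field, and the classical monostatic range resolution $c/(2B)$. The sketch below evaluates both. The function names and the example values are assumptions, not parameters from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fraunhofer_distance(aperture_m, carrier_hz):
    """Conventional near-field/far-field boundary 2 * D^2 / lambda, in metres."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return 2.0 * aperture_m ** 2 / wavelength


def range_resolution(bandwidth_hz):
    """Classical bandwidth-limited monostatic range resolution c / (2 * B), in metres."""
    return SPEED_OF_LIGHT / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical millimetre-wave setup: 28 GHz carrier, 1 m aperture, 400 MHz bandwidth.
    print(fraunhofer_distance(1.0, 28e9))
    print(range_resolution(400e6))
```

Because the Fraunhofer distance grows with the square of the aperture, widening the element spacing extends the near-field region quickly without adding elements. This is consistent with the motivation the abstract gives for the ELAS design.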

2602.18059 2026-02-23 eess.SY cs.SY

Iterative McCormick Relaxation for Joint Impedance Control and Network Topology Optimization

Junseon Park, Hyeongon Park, Rahul K. Gupta

English abstract

Power system operators are increasingly deploying Variable Impedance Devices (VIDs), e.g., Smart Wires, and Network Topology Optimization (NTO) schemes to mitigate operational challenges such as line and transformer congestion and voltage violations. This work aims to optimize and coordinate the operation of distributed VIDs under both fixed and optimized topologies. The problem is inherently nonlinear because of the power flow equations and the bilinear terms introduced by the variable line impedance of VIDs. Furthermore, topology optimization turns it into a mixed-integer nonlinear problem. To tackle this, we introduce a McCormick relaxation scheme, which converts the bilinear constraints into a set of linear constraints alongside the DC power flow equations. We propose an iterative correction of the McCormick relaxation to enhance its accuracy. The proposed framework is validated on standard IEEE benchmark test systems, and we present a performance comparison of the iterative McCormick method against the nonlinear formulation, the SOS2 piecewise-linear approximation, and the original McCormick relaxation.
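
The McCormick relaxation replaces a bilinear term $w = xy$ with four linear inequalities built from the variable bounds. The sketch below evaluates the resulting lower and upper envelopes at a point and shows that they tighten as the bounds shrink, which is the general idea behind iterative bound correction. It is a generic illustration with hypothetical names and numbers, not the authors' algorithm.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Return (under, over): McCormick envelope values for w = x * y at (x, y).

    Valid whenever x_lo <= x <= x_hi and y_lo <= y <= y_hi.
    """
    # Under-estimators from (x - x_lo)(y - y_lo) >= 0 and (x_hi - x)(y_hi - y) >= 0.
    under = max(x_lo * y + x * y_lo - x_lo * y_lo,
                x_hi * y + x * y_hi - x_hi * y_hi)
    # Over-estimators from (x_hi - x)(y - y_lo) >= 0 and (x - x_lo)(y_hi - y) >= 0.
    over = min(x_hi * y + x * y_lo - x_hi * y_lo,
               x_lo * y + x * y_hi - x_lo * y_hi)
    return under, over


if __name__ == "__main__":
    x, y = 1.0, 2.0
    # Hypothetical wide box, then a tighter box centred on the same point.
    for box in ((0.0, 2.0, 0.0, 4.0), (0.5, 1.5, 1.5, 2.5)):
        under, over = mccormick_bounds(x, y, *box)
        print(box, under, x * y, over)
```

The true product always lies between the two envelopes, and the gap between them closes as the box narrows around the operating point. Re-solving with tightened bounds is one way a relaxation of this kind can be made more accurate.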