arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.24268 2026-03-02 eess.SY cs.SY math.OC

Virtual Constraint for a Quadrotor UAV Enforcing a Body-Axis Pointing Direction

Alexandre Anahory Simoes, Leonardo Colombo, Juan Giribet, Efstratios Stratoglou

Abstract

We propose a geometric control framework on $SE(3)$ for quadrotors that enforces pointing-driven missions without completing a full attitude reference. The mission is encoded through virtual constraints defining a task manifold and an associated set of admissible velocities, and invariance is achieved by a feedback law obtained from a linear system in selected inputs. Under a transversality condition with the effective actuation distribution, the invariance-enforcing input is uniquely defined, yielding a constructive control law and, for relevant tasks, closed-form expressions. We further derive a local off-manifold stabilization extension. As a case study, we lock a body axis to a prescribed line-of-sight direction while maintaining fixed altitude.

2602.24260 2026-03-02 eess.SY cs.SY math.OC

Observer-Based Estimation and Hydrostatic Inertia Modeling for Cooperative Transport of Variable-Inertia Loads with Quadrotors

Jacob Goodman, Leonardo Colombo, Juan Giribet

Abstract

We address load-parameter estimation in cooperative aerial transport with time-varying mass and inertia, as in fluid-carrying payloads. Using an intrinsic manifold model of the multi-quadrotor-load dynamics, we combine a geometric tracking controller with an observer for parameter identification. We estimate mass from measurable kinematics and commanded forces, and handle variable inertia via an inertia surrogate that reproduces the load's rotational dynamics for control and state propagation. Instead of real-time identification of the true inertia tensor, driven by high-dimensional internal fluid motion, we leverage known tank geometry and fluid-mechanical structure to pre-compute inertia tensors and update them through a lookup table indexed by fill level and attitude. The surrogate is justified via the incompressible Navier-Stokes equations in the translating/rotating load frame: when effective forcing is gravity-dominated (i.e., translational/rotational accelerations and especially jerk are limited), the fluid approaches hydrostatic equilibrium and the free surface is well approximated by a plane orthogonal to the body-frame gravity direction.

2602.24259 2026-03-02 eess.SY cs.SY

Curriculum-Based Soft Actor-Critic for Multi-Section R2R Tension Control

Shihao Li, Jiachen Li, Christopher Martin, Zijun Chen, Dongmei Chen, Wei Li

Abstract

Precise tension control in roll-to-roll (R2R) manufacturing is difficult under varying operating conditions and process uncertainty. This paper presents a curriculum-based Soft Actor-Critic (SAC) controller for multi-section R2R tension control. The policy is trained in three phases with progressively wider reference ranges, from 27 to 33 N to the full operating envelope of 20 to 40 N, so it can generalize across nominal and disturbed conditions. On a three-section R2R benchmark, the learned controller achieves accurate tracking in nominal operation and handles large disturbances, including 20 N to 40 N step changes, with a single policy and no scenario-specific retuning. These results indicate that curriculum-trained SAC is a practical alternative to model-based control when system parameters vary and process uncertainty is significant.

2602.24254 2026-03-02 eess.SY cs.AI cs.LG cs.SY

FaultXformer: A Transformer-Encoder Based Fault Classification and Location Identification model in PMU-Integrated Active Electrical Distribution System

Kriti Thakur, Alivelu Manga Parimi, Mayukha Pal

Abstract

Accurate fault detection and localization in electrical distribution systems is crucial, especially with the increasing integration of distributed energy resources (DERs), which inject greater variability and complexity into grid operations. This study proposes FaultXformer, a Transformer encoder-based architecture for automatic fault analysis using real-time current data obtained from phasor measurement units (PMUs). In Stage 1, the approach extracts rich temporal information from time-series current data, which is crucial for identifying the fault type and precisely determining its location across multiple nodes. In Stage 2, these extracted features are processed to differentiate among distinct fault types and identify the respective fault location within the distribution system. This dual-stage Transformer encoder pipeline enables high-fidelity representation learning, considerably boosting performance. The model was validated on a dataset generated from the IEEE 13-node test feeder, simulated with 20 separate fault locations and several DER integration scenarios, using current measurements from four strategically located PMUs. For a robust performance evaluation, stratified 10-fold cross-validation is performed. FaultXformer achieved average accuracies of 98.76% in fault type classification and 98.92% in fault location identification across cross-validation, consistently surpassing conventional deep learning baselines, namely a convolutional neural network (CNN), a recurrent neural network (RNN), and long short-term memory (LSTM), by 1.70%, 34.95%, and 2.04% in classification accuracy and by 10.82%, 40.89%, and 6.27% in location accuracy, respectively. These results demonstrate the efficacy of the proposed model under significant DER penetration.

2602.24247 2026-03-02 eess.SY cs.SY

Data-Driven Linearization based Arc Fault Prediction in Medium Voltage Electrical Distribution System

Mihir Sinha, Kriti Thakur, Prasanta K. Panigrahi, Alivelu Manga Parimi, Mayukha Pal

Abstract

High-impedance arc faults (HIAFs) in medium-voltage electrical distribution systems are difficult to detect due to their low fault current levels and nonlinear transient behavior. Traditional detection algorithms generally struggle to make predictions under dynamic waveform scenarios. This research presents a unique data-driven linearization (DDL) framework for early prediction of HIAFs that offers both interpretability and scalability. The proposed method translates nonlinear current waveforms into a linearized space using coordinate embeddings and polynomial transformation, enabling precise modelling of fault precursors. The total duration of the test waveform is 0.5 seconds, within which the arc fault occurs between 0.2 and 0.3 seconds. Trained solely on the pre-fault healthy region (0.10 to 0.18 seconds), the DDL approach captures certain otherwise invisible fault precursors and predicts the onset of the fault at 0.189 seconds, approximately 0.011 seconds (i.e., 11 milliseconds) earlier than the actual fault occurrence at 0.200 seconds, demonstrating substantial early warning capability. Performance evaluation comprises eigenvalue analysis, prediction error measures, error growth rate, and waveform regeneration fidelity. Such early prediction shows that the model can correctly foresee faults, which is especially helpful in preventing real-world faults and accidents. It confirms that the proposed approach reliably predicts arc faults in medium-voltage power distribution systems.

2602.24158 2026-03-02 eess.SP

Joint Subcarrier Phase Recovery for Nonlinearity Mitigation

Marco Secondini, Stella Civelli

Comments Submitted to OFC 2026

Abstract

We propose a low-complexity phase recovery scheme that simultaneously mitigates laser phase noise and fiber nonlinearity across several subcarriers. In a long single-span link with Raman amplification, the scheme achieves 0.9 dB gain with 99 real multiplications per complex symbol.

2602.24150 2026-03-02 eess.SP

Channel Estimation for Beyond Diagonal RIS Exploiting Core Tensor Sparsity

Daniel Costa Araújo, André L. F. de Almeida

Abstract

Beyond-diagonal reconfigurable intelligent surfaces (BD-RISs) enhance wave manipulation through inter-element couplings but pose significant channel estimation challenges due to cascaded channels and block-Kronecker structures. This paper proposes a compressive sensing framework exploiting sparse Tucker decomposition of the measurement tensor and the Kronecker rank-one structure of channel components. Two algorithms are developed: Sparse Tensor Orthogonal Recovery Method (STORM), which uses orthogonal matching pursuit (OMP) for greedy support recovery, and Sparse Tensor subspace-Aided Recovery (STAR), which leverages subspace-based projection for enhanced noise robustness. Both perform joint sparse support identification, followed by a Kronecker rank-one factorization via singular value decomposition (SVD) to recover the channel parameters. Simulations show that STAR achieves oracle-assisted least squares (LS) performance at moderate-to-high signal-to-noise ratio (SNR) with significantly fewer measurements than baseline methods, enabling practical BD-RIS deployment in next-generation millimeter wave (mmWave)/sub-terahertz (sub-THz) networks.

2602.23990 2026-03-02 eess.SP

Formation Control for CRLB-Optimal Cooperative Sensing in Low-Altitude Wireless Networks

Jun Wu, Haijia Jin, Nanchi Su, Jinna Li, Haoyuan Pan, Tse-Tin Chan

Abstract

Cooperative sensing with uncrewed aerial vehicles (UAVs) is a key enabler for low-altitude wireless networks (LAWNs), where sensing accuracy critically depends on the spatial configuration of the UAV formation. In this paper, we study formation design and control for Cramer-Rao lower bound (CRLB)-optimal cooperative target sensing. We first establish a sensing performance model based on range measurements and derive the Fisher information matrix (FIM) of the target location. By adopting the A-optimality criterion, we analytically characterize the formation geometry that minimizes the CRLB of the estimation error. The optimal formation is shown to exhibit isotropic Fisher information in the horizontal plane, leading to a regular polygon geometry with an elevation angle determined by the tradeoff between path loss and geometric diversity. Building on this result, we further develop a distributed formation control strategy that steers UAVs from arbitrary initial deployments toward the sensing-optimal configuration while maintaining formation motion and obstacle avoidance. Numerical results demonstrate that the proposed scheme consistently outperforms benchmark formations in terms of CRLB and achieves reliable convergence under practical constraints.
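The link between isotropic Fisher information and a regular polygon can be illustrated with a simplified sketch. It assumes a 2D horizontal-plane model with range-only measurements and identical Gaussian noise at every UAV, so it ignores the path loss and elevation angle that the abstract's model includes. The function names and the example angles are hypothetical and are not taken from the paper.

```python
import math

def range_fim_2d(bearing_angles, sigma=1.0):
    """2x2 FIM of a target position from range measurements.

    Assumption: i.i.d. Gaussian range noise with standard deviation sigma,
    so each sensor contributes u u^T / sigma^2 with u the unit bearing vector.
    """
    jxx = jxy = jyy = 0.0
    for angle in bearing_angles:
        ux, uy = math.cos(angle), math.sin(angle)
        jxx += ux * ux
        jxy += ux * uy
        jyy += uy * uy
    s2 = sigma ** 2
    return [[jxx / s2, jxy / s2], [jxy / s2, jyy / s2]]

def a_optimality_cost(fim):
    """Trace of the inverse FIM, i.e. the CRLB on the summed position variances."""
    (a, b), (c, d) = fim
    det = a * d - b * c
    return (a + d) / det  # trace of the inverse of a 2x2 matrix

def regular_polygon_angles(n):
    """Bearings of n sensors spaced evenly around the target."""
    return [2.0 * math.pi * k / n for k in range(n)]

n = 4
polygon_cost = a_optimality_cost(range_fim_2d(regular_polygon_angles(n)))
clustered_cost = a_optimality_cost(range_fim_2d([0.0, 0.2, 0.4, 0.6]))
print(polygon_cost, clustered_cost)
```

Under these assumptions the trace of the FIM is fixed by the number of sensors, so the trace of its inverse is smallest when both eigenvalues are equal, which is what the evenly spaced formation achieves; the clustered bearings give a visibly larger bound.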

2602.23977 2026-03-02 eess.SP

From Signals to Causes: A Causal Signal Processing Framework for Robust and Interpretable Clinical Risk Prediction

Surajit Das, Maxine Tan

Abstract

Learning-based signal processing systems increasingly support high-stakes medical decisions using heterogeneous biomedical signals, including medical images, physiological time series, and clinical records. Despite strong predictive performance, many models rely on statistical correlations that are unstable across acquisition settings, patient populations, and institutional practices, limiting robustness, interpretability, and clinical trust. We advocate a causal signal processing perspective in which biomedical signals are treated as effects of latent generative mechanisms rather than as isolated predictive inputs. Using clinical risk prediction as a motivating example, we show how disease-related factors generate observable biomarkers, while acquisition processes act as confounders influencing signal appearance. In clinical disease risk prediction from chest CT scans and patient risk factors, correlational models may fail under scanner changes, whereas causal abstractions remain invariant. Building on this view, we propose a unifying conceptual framework integrating causal modeling with learning-based signal processing and neuro-symbolic reasoning. Statistical models extract multimodal representations that are mapped to interpretable causal abstractions and combined with symbolic knowledge encoding clinical risk factors and guidelines. This structure enables clinically grounded explanations, counterfactual reasoning about hypothetical interventions, and improved robustness to distribution shifts arising from changes in acquisition conditions or screening policies. Rather than introducing a specific algorithm, this article presents schematic causal structures and a comparative analysis of correlation-based, causal, and neuro-symbolic approaches to guide the design of robust and interpretable medical decision-support systems.

2602.23962 2026-03-02 eess.IV cs.CV

Extending 2D foundational DINOv3 representations to 3D segmentation of neonatal brain MR images

Annayah Usman, Behraj Khan, Tahir Qasim Syed

Abstract

Precise volumetric delineation of hippocampal structures is essential for quantifying neurodevelopmental trajectories in pre-term and term infants, where subtle morphological variations may carry prognostic significance. While foundation encoders trained on large-scale visual data offer discriminative representations, their 2D formulation is a limitation with respect to the 3D organization of brain anatomy. We propose a volumetric segmentation strategy that reconciles this tension through a structured window-based disassembly-reassembly mechanism: the global MRI volume is decomposed into non-overlapping 3D windows or sub-cubes, each processed via a separate decoding arm built upon frozen high-fidelity features, and subsequently reassembled prior to a ground-truth correspondence using a dense-prediction head. This architecture preserves a constant decoder memory footprint while forcing predictions to lie within an anatomically consistent geometry. Evaluated on the ALBERT dataset for hippocampal segmentation, the proposed approach achieves a Dice score of 0.65 for a single 3D window. The method demonstrates that volumetric anatomical structure can be recovered from frozen 2D foundation representations through structured compositional decoding, and offers a principled and generalizable extension of foundation models to 3D medical applications.
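The disassembly-reassembly idea can be sketched in a few lines. This is only an illustration of tiling a volume into non-overlapping 3D windows and putting the pieces back; the per-window decoding arm is replaced by an identity placeholder, the volume is a small nested list rather than an MRI scan, and all function names are hypothetical.

```python
from itertools import product

def window_origins(shape, window):
    """Corner coordinates of non-overlapping windows that tile the volume.

    Assumption: every dimension is an exact multiple of the window size.
    """
    for dim, win in zip(shape, window):
        if dim % win != 0:
            raise ValueError("volume shape must be divisible by window shape")
    axes = (range(0, dim, win) for dim, win in zip(shape, window))
    return list(product(*axes))

def extract_window(volume, origin, window):
    """Copy one sub-cube out of a volume stored as volume[z][y][x]."""
    z0, y0, x0 = origin
    dz, dy, dx = window
    return [[row[x0:x0 + dx] for row in plane[y0:y0 + dy]]
            for plane in volume[z0:z0 + dz]]

def reassemble(windows, origins, shape, window):
    """Write each processed sub-cube back to its original location."""
    dz, dy, dx = window
    out = [[[None] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    for block, (z0, y0, x0) in zip(windows, origins):
        for z in range(dz):
            for y in range(dy):
                out[z0 + z][y0 + y][x0:x0 + dx] = block[z][y]
    return out

shape, window = (4, 4, 4), (2, 2, 2)
volume = [[[100 * z + 10 * y + x for x in range(shape[2])]
           for y in range(shape[1])] for z in range(shape[0])]
origins = window_origins(shape, window)
# Identity stands in for the per-window decoder described in the abstract.
decoded = [extract_window(volume, origin, window) for origin in origins]
restored = reassemble(decoded, origins, shape, window)
print(len(origins), restored == volume)
```

Because each window is handled independently, the working set per decoding step depends on the window size rather than on the full volume, which is consistent with the constant decoder memory footprint the abstract mentions.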

2602.23961 2026-03-02 eess.IV cs.CV

Clinically-aligned ischemic stroke segmentation and ASPECTS scoring on NCCT imaging using a slice-gated loss on foundation representations

Hiba Azeem, Behraj Khan, Tahir Qasim Syed

Abstract

Rapid infarct assessment on non-contrast CT (NCCT) is essential for acute ischemic stroke management. Most deep learning methods perform pixel-wise segmentation without modeling the structured anatomical reasoning underlying ASPECTS scoring, where basal ganglia (BG) and supraganglionic (SG) levels are clinically interpreted in a coupled manner. We propose a clinically aligned framework that combines a frozen DINOv3 backbone with a lightweight decoder and introduce a Territory-Aware Gated Loss (TAGL) to enforce BG-SG consistency during training. This anatomically informed supervision adds no inference-time complexity. Our method achieves a Dice score of 0.6385 on AISD, outperforming prior CNN and foundation-model baselines. On a proprietary ASPECTS dataset, TAGL improves mean Dice from 0.698 to 0.767. These results demonstrate that integrating foundation representations with structured clinical priors improves NCCT stroke segmentation and ASPECTS delineation.

2602.23946 2026-03-02 eess.SP cs.IT eess.IV math.IT

Hypercomplex Phase Retrieval

Kumar Vijay Mishra, Henry Arguello, Brian M. Sadler

Comments 21 pages, 4 figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:2310.17660

Abstract

Hypercomplex signal processing (HSP) offers powerful tools for analyzing and processing multidimensional signals by explicitly exploiting inter-dimensional correlations through Clifford algebra. In recent years, hypercomplex formulations of the phase retrieval (PR) problem, wherein a complex-valued signal is recovered from intensity-only measurements, have attracted growing interest. Hypercomplex phase retrieval (HPR) naturally arises in a range of optical imaging and computational sensing applications, where signals are often modeled using quaternion- or octonion-valued representations. Similar to classical PR, HPR problems may involve measurements obtained via complex, hypercomplex, Fourier, or other structured sensing operators. These formulations open new avenues for the development of advanced HSP-based algorithms and theoretical frameworks. This chapter surveys emerging methodologies and applications of HPR, with particular emphasis on optical imaging systems.

2602.23889 2026-03-02 eess.SP

Optimization-Based Behavioral Modeling of Mixers for Frequency Comb OFDM Radar Processing

Umut Utku Erdem, Henning Poensgen, Taewon Jeong, Lucas Giroto, Benjamin Nuss, Ibrahim Kagan Aksoyak, Ahmet Cagri Ulusoy, Thomas Zwick

Abstract

This paper presents an optimization-based behavioral model for mixers driven by multi-tone local oscillator (LO) signals, considered specifically for frequency comb orthogonal frequency-division multiplexing radar applications. Unlike traditional models, the proposed approach is designed and tested for multi-tone LO excitations. The model uses polynomial nonlinearities for both intermediate frequency and LO ports, supported by spectrum-domain fitting that selectively emphasizes strong intermodulation products. In addition, a polynomial block is introduced to capture input power-dependent phase nonlinearity. The approach is validated using circuit-level simulations and supported by measurements. Radar processing results show the model replicates distortive effects in simulations. The proposed model enables rapid system-level performance estimations and waveform optimization, replacing computationally expensive circuit-level simulations.

2602.23861 2026-03-02 eess.SP

Secure OFDM Waveform Design for ISAC: Artificial Phase-Doppler Shifts Against Passive Sensing

Umut Utku Erdem, Lucas Giroto, Tobias Chaloun, Tom Schipper, Taewon Jeong, Christian Karle, Benjamin Nuss, Thomas Zwick

Abstract

This paper proposes a novel low probability of intercept (LPI) waveform design approach for orthogonal frequency-division multiplexing (OFDM)-based integrated sensing and communication systems by introducing artificial phase and Doppler shifts. These controlled impairments, unknown to eavesdroppers, effectively disrupt passive radar processing and intercept attempts. At legitimate receivers, they can be fully compensated, so that standard OFDM communication and sensing performance are preserved. To support the effectiveness of the proposed LPI waveform design for OFDM-based ISAC, measurement results with 1 GHz bandwidth at 27 GHz are presented considering different impairment introduction approaches, all with no impact on cooperative system performance, and compensation capabilities at the eavesdropper.

2602.23856 2026-03-02 eess.SP cs.AR

Quantized Precoding for Maximizing Sum Rate in MU-MIMO Systems with Constrained Fronthaul

Yasaman Khorsandmanesh, Alva Kosasih, Emil Björnson, Joakim Jaldén

Comments arXiv admin note: text overlap with arXiv:2406.19183

Abstract

This paper studies a downlink multi-user multiple-input multiple-output (MU-MIMO) system, where the precoding matrix is computed at a baseband unit (BBU) and then transmitted to the remote antenna array over a limited-capacity digital fronthaul. The limited bit resolution of the fronthaul introduces quantization effects that are explicitly modeled. We propose a novel sum rate maximization framework that directly incorporates the quantizer's constraints into the precoding design. The resulting maximization problem, a non-convex mixed-integer program, is addressed using a new iterative algorithm inspired by the weighted minimum mean square error (WMMSE) methodology. The precoding optimization subproblem is reformulated as an integer least-squares problem and solved using a novel sphere decoding (SD) algorithm. Additionally, a low-complexity expectation propagation (EP)-based method is introduced to enable the practical implementation of quantized precoding in MU-massive MIMO (MU-mMIMO) systems. Furthermore, numerical evaluations demonstrate that the proposed precoding schemes outperform conventional approaches that optimize infinite-resolution precoding followed by element-wise quantization. We also propose a heuristic quantization-aware precoding method with comparable complexity to the baseline but superior performance. In particular, the EP-based approach offers near-optimal performance with substantial complexity reduction, making it well-suited for real-time MU-mMIMO applications.

2602.23852 2026-03-02 cs.LG eess.SP

ULW-SleepNet: An Ultra-Lightweight Network for Multimodal Sleep Stage Scoring

Zhaowen Wang, Dongdong Zhou, Qi Xu, Fengyu Cong, Mohammad Al-Sa'd, Jenni Raitoharju

Comments Accepted to ICASSP 2026

Abstract

Automatic sleep stage scoring is crucial for the diagnosis and treatment of sleep disorders. Although deep learning models have advanced the field, many existing models are computationally demanding and designed for single-channel electroencephalography (EEG), limiting their practicality for multimodal polysomnography (PSG) data. To overcome this, we propose ULW-SleepNet, an ultra-lightweight multimodal sleep stage scoring framework that efficiently integrates information from multiple physiological signals. ULW-SleepNet incorporates a novel Dual-Stream Separable Convolution (DSSC) Block, depthwise separable convolutions, channel-wise parameter sharing, and global average pooling to reduce computational overhead while maintaining competitive accuracy. Evaluated on the Sleep-EDF-20 and Sleep-EDF-78 datasets, ULW-SleepNet achieves accuracies of 86.9% and 81.4%, respectively, with only 13.3K parameters and 7.89M FLOPs. Compared to state-of-the-art methods, our model reduces parameters by up to 98.6% with only marginal performance loss, demonstrating its strong potential for real-time sleep monitoring on wearable and IoT devices. The source code for this study is publicly available at https://github.com/wzw999/ULW-SLEEPNET.
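The parameter savings that depthwise separable convolutions bring can be checked with a short count. The sketch below compares a standard 1D convolution with its depthwise separable counterpart, ignoring bias terms; the layer sizes are hypothetical and it does not reproduce the DSSC block or the 13.3K-parameter total reported in the abstract.

```python
def standard_conv1d_params(c_in, c_out, kernel):
    """Weights in a standard 1D convolution (bias omitted)."""
    return kernel * c_in * c_out

def separable_conv1d_params(c_in, c_out, kernel):
    """Weights in a depthwise separable 1D convolution (bias omitted)."""
    depthwise = kernel * c_in   # one length-`kernel` filter per input channel
    pointwise = c_in * c_out    # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, kernel = 32, 64, 7
std = standard_conv1d_params(c_in, c_out, kernel)
sep = separable_conv1d_params(c_in, c_out, kernel)
reduction = 1 - sep / std
print(std, sep, round(reduction, 3))
```

For this example layer the separable form needs roughly 16% of the weights of the standard convolution, which shows why stacking such layers keeps the total parameter count small.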

2602.23847 2026-03-02 eess.IV cs.CV

Polarization Uncertainty-Guided Diffusion Model for Color Polarization Image Demosaicking

Chenggong Li, Yidong Luo, Junchao Zhang, Degui Yang

Comments Accepted to AAAI2026

Abstract

Color polarization demosaicking (CPDM) aims to reconstruct full-resolution polarization images of four directions from the color-polarization filter array (CPFA) raw image. Due to the challenge of predicting numerous missing pixels and the scarcity of high-quality training data, existing network-based methods, despite effectively recovering scene intensity information, still exhibit significant errors in reconstructing polarization characteristics (degree of polarization, DOP, and angle of polarization, AOP). To address this problem, we introduce the image diffusion prior from text-to-image (T2I) models to overcome the performance bottleneck of network-based methods, with the additional diffusion prior compensating for limited representational capacity caused by restricted data distribution. To effectively leverage the diffusion prior, we explicitly model the polarization uncertainty during reconstruction and use uncertainty to guide the diffusion model in recovering high error regions. Extensive experiments demonstrate that the proposed method accurately recovers scene polarization characteristics with both high fidelity and strong visual perception.

2602.23831 2026-03-02 cs.IT eess.SP math.IT

Antenna Coding Optimization for Pixel Antenna Empowered Wireless Communication Using Deep Learning with Heterogeneous Multi-Head Selection

Binzhou Zuo, Shanpu Shen, Hongyu Li

Comments 6 pages, 5 figures, accepted by IEEE conferences

Abstract

The pixel antenna is a promising antenna technology that enables flexible adjustment of radiation characteristics and enhancement of wireless systems through antenna coding. This work proposes a novel deep learning-based antenna coding optimization algorithm. Specifically, the proposed algorithm is supported by a heterogeneous multi-head selection mechanism, whose main idea is to train multiple neural networks based on various coding schemes and select the one that leads to the best system performance. Unlike traditional heuristic searching-based algorithms that require high computational complexity to achieve satisfactory performance, the proposed data-driven deep learning approach can achieve 98% of the performance achieved by the searching-based algorithms with significantly reduced computational complexity. Results demonstrate that in pixel antenna empowered single-input single-output systems, the proposed algorithm achieves a computational speed 81 times faster than the searching-based algorithm. For more complex pixel antenna empowered multiple-input multiple-output systems, the computational speed is 297 times faster than the existing searching-based algorithm. Benefiting from the high performance and low computational complexity, this algorithm demonstrates the significant potential of pixel antennas as a novel and practical technology to enhance wireless systems.

2602.23803 2026-03-02 eess.IV cs.CV

BiM-GeoAttn-Net: Linear-Time Depth Modeling with Geometry-Aware Attention for 3D Aortic Dissection CTA Segmentation

Yuan Zhang, Lei Liu, Jialin Zhang, Ya-Nan Zhang, Ling Wang, Nan Mu

Abstract

Accurate segmentation of aortic dissection (AD) lumens in CT angiography (CTA) is essential for quantitative morphological assessment and clinical decision-making. However, reliable 3D delineation remains challenging due to limited long-range context modeling, which compromises inter-slice coherence, and insufficient structural discrimination under low-contrast conditions. To address these limitations, we propose BiM-GeoAttn-Net, a lightweight framework that integrates linear-time depth-wise state-space modeling with geometry-aware vessel refinement. Our approach features a Bidirectional Depth Mamba (BiM) module that efficiently captures cross-slice dependencies and a Geometry-Aware Vessel Attention (GeoAttn) module that employs orientation-sensitive anisotropic filtering to refine tubular structures and sharpen ambiguous boundaries. Extensive experiments on a multi-source AD CTA dataset demonstrate that BiM-GeoAttn-Net achieves a Dice score of 93.35% and an HD95 of 12.36 mm, outperforming representative CNN-, Transformer-, and SSM-based baselines in overlap metrics while maintaining competitive boundary accuracy. These results suggest that coupling linear-time depth modeling with geometry-aware refinement provides an effective, computationally efficient solution for robust 3D AD segmentation.

2602.23782 2026-03-02 eess.IV cs.CV

Breaking the Data Barrier: Robust Few-Shot 3D Vessel Segmentation using Foundation Models

Kirato Yoshihara, Yohei Sugawara, Yuta Tokuoka, Lihang Hong

Comments 10 pages, 3 figures, 2 tables

Abstract

State-of-the-art vessel segmentation methods typically require large-scale annotated datasets and suffer from severe performance degradation under domain shifts. In clinical practice, however, acquiring extensive annotations for every new scanner or protocol is unfeasible. To address this, we propose a novel framework leveraging a pre-trained Vision Foundation Model (DINOv3) adapted for volumetric vessel segmentation. We introduce a lightweight 3D Adapter for volumetric consistency, a multi-scale 3D Aggregator for hierarchical feature fusion, and Z-channel embedding to effectively bridge the gap between 2D pre-training and 3D medical modalities, enabling the model to capture continuous vascular structures from limited data. We validated our method on the TopCoW (in-domain) and Lausanne (out-of-distribution) datasets. In the extreme few-shot regime with 5 training samples, our method achieved a Dice score of 43.42%, marking a 30% relative improvement over the state-of-the-art nnU-Net (33.41%) and outperforming other Transformer-based baselines, such as SwinUNETR and UNETR, by up to 45%. Furthermore, in the out-of-distribution setting, our model demonstrated superior robustness, achieving a 50% relative improvement over nnU-Net (21.37% vs. 14.22%), which suffered from severe domain overfitting. Ablation studies confirmed that our 3D adaptation mechanism and multi-scale aggregation strategy are critical for vascular continuity and robustness. Our results suggest foundation models offer a viable cold-start solution, improving clinical reliability under data scarcity or domain shifts.
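The relative improvements quoted against nnU-Net follow directly from the Dice scores given in the abstract. The short check below recomputes them; the helper name is an illustrative choice.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline

# Dice scores (in percent) taken from the abstract.
in_domain = relative_improvement(43.42, 33.41)            # 5-shot, vs. nnU-Net
out_of_distribution = relative_improvement(21.37, 14.22)  # OOD, vs. nnU-Net
print(f"{in_domain:.1%}", f"{out_of_distribution:.1%}")
```

Both values round to the figures stated in the abstract (about 30% and about 50%), which also makes clear that these are relative gains rather than absolute differences in Dice points.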

2602.23771 2026-03-02 eess.IV cs.CV

VideoPulse: Neonatal heart rate and peripheral capillary oxygen saturation (SpO2) estimation from contact free video

Deependra Dewagiri, Kamesh Anuradha, Pabadhi Liyanage, Helitha Kulatunga, Pamuditha Somarathne, Udaya S. K. P. Miriya Thanthrige, Nishani Lucas, Anusha Withana, Joshua P. Kulasingham

Comments 11 pages, 3 figures, 5 tables. Preprint. Intended for submission to an IEEE Journal

Abstract

Remote photoplethysmography (rPPG) enables contact-free monitoring of vital signs and is especially valuable for neonates, since conventional methods often require sustained skin contact with adhesive probes that can irritate fragile skin and increase infection control burden. We present VideoPulse, a neonatal dataset and an end-to-end pipeline that estimates neonatal heart rate and peripheral capillary oxygen saturation (SpO2) from facial video. VideoPulse contains 157 recordings totaling 2.6 hours from 52 neonates with diverse face orientations. Our pipeline performs face alignment and artifact-aware supervision using denoised pulse oximeter signals, then applies 3D CNN backbones for heart rate and SpO2 regression with label distribution smoothing and weighted regression for SpO2. Predictions are produced in 2-second windows. On the NBHR neonatal dataset, we obtain a heart rate MAE of 2.97 bpm using 2-second windows (2.80 bpm with 6-second windows) and an SpO2 MAE of 1.69 percent. Under cross-dataset evaluation, the NBHR-trained heart rate model attains 5.34 bpm MAE on VideoPulse, and fine-tuning an NBHR-pretrained SpO2 model on VideoPulse yields an MAE of 1.68 percent. These results indicate that short unaligned neonatal video segments can support accurate heart rate and SpO2 estimation, enabling low-cost non-invasive monitoring in neonatal intensive care.

2602.23769 2026-03-02 eess.SP

Joint Optimization of Flexible Antenna Array Shape and Beamforming for Secure Communication

Zhen Xu, Gaojie Chen, Jing Zhu, Weiwei Zhao, Yonghui Li, Rahim Tafazolli, Wei Huang

Comments 15 pages, 7 figures

Abstract

Flexible antenna arrays (FAAs) can physically reshape their geometry to add new spatial degrees of freedom, whereas transmit beamforming adjusts the complex element weights to electronically steer and shape the array's radiation pattern, thereby significantly improving communication performance. This paper is the first to explore the integration of FAA geometry control and beamforming for physical layer security enhancement, where a base station equipped with an FAA communicates with a legitimate user in the presence of passive eavesdroppers. To safeguard confidential transmissions, we formulate a new secrecy rate maximization problem that jointly optimizes the transmit beamforming vector and a continuous FAA shape control parameter. Due to the non-convex nature of the problem, an alternating optimization algorithm is developed to decompose the joint design into tractable subproblems, which are solved iteratively to refine both the FAA geometry and beamforming strategy. Simulation results confirm that the proposed joint optimization framework significantly outperforms conventional fixed-shape or beamforming-only schemes, demonstrating the potential of FAA-enabled reconfigurability for secure wireless communications.

2602.23752 2026-03-02 eess.IV cs.CV

Unsupervised Causal Prototypical Networks for De-biased Interpretable Dermoscopy Diagnosis

Junhao Jia, Yueyi Wu, Huangwei Chen, Haodong Jing, Haishuai Wang, Jiajun Bu, Lei Wu

Details
English abstract

Despite the success of deep learning in dermoscopy image analysis, its inherent black-box nature hinders clinical trust, motivating the use of prototypical networks for case-based visual transparency. However, inevitable selection bias in clinical data often drives these models toward shortcut learning, where environmental confounders are erroneously encoded as predictive prototypes, generating spurious visual evidence that misleads medical decision-making. To mitigate these confounding effects, we propose CausalProto, an Unsupervised Causal Prototypical Network that fundamentally purifies the visual evidence chain. Framed within a Structural Causal Model, we employ an Information Bottleneck-constrained encoder to enforce strict unsupervised orthogonal disentanglement between pathological features and environmental confounders. By mapping these decoupled representations into independent prototypical spaces, we leverage the learned spurious dictionary to perform backdoor adjustment via do-calculus, transforming complex causal interventions into efficient expectation pooling to marginalize environmental noise. Extensive experiments on multiple dermoscopy datasets demonstrate that CausalProto achieves superior diagnostic performance and consistently outperforms standard black-box models, while simultaneously providing transparent, high-purity visual interpretability without the traditional accuracy compromise.

2602.23738 2026-03-02 eess.SP cs.HC

From Continuous sEMG Signals to Discrete Muscle State Tokens: A Robust and Interpretable Representation Framework

Yuepeng Chen, Kaili Zheng, Ji Wu, Zhuangzhuang Li, Ye Ma, Dongwei Liu, Chenyi Guo, Xiangling Fu

Details
English abstract

Surface electromyography (sEMG) signals exhibit substantial inter-subject variability and are highly susceptible to noise, posing challenges for robust and interpretable decoding. To address these limitations, we propose a discrete representation of sEMG signals based on a physiology-informed tokenization framework. The method employs a sliding window aligned with the minimal muscle contraction cycle to isolate individual muscle activation events. From each window, ten time-frequency features, including root mean square (RMS) and median frequency (MDF), are extracted, and K-means clustering is applied to group segments into representative muscle-state tokens. We also introduce a large-scale benchmark dataset, ActionEMG-43, comprising 43 diverse actions and sEMG recordings from 16 major muscle groups across the body. Based on this dataset, we conduct extensive evaluations to assess the inter-subject consistency, representation capacity, and interpretability of the proposed sEMG tokens. Our results show that the token representation exhibits high inter-subject consistency (Cohen's Kappa = 0.82 ± 0.09), indicating that the learned tokens capture consistent and subject-independent muscle activation patterns. In action recognition tasks, models using sEMG tokens achieve Top-1 accuracies of 75.5% with ViT and 67.9% with SVM, outperforming raw-signal baselines (72.8% and 64.4%, respectively), despite a 96% reduction in input dimensionality. In movement quality assessment, the tokens intuitively reveal patterns of muscle underactivation and compensatory activation, offering interpretable insights into neuromuscular control. Together, these findings highlight the effectiveness of tokenized sEMG representations as a compact, generalizable, and physiologically meaningful feature space for applications in rehabilitation, human-machine interaction, and motor function analysis.
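
The tokenization steps named in the abstract (sliding window, per-window features, K-means) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it uses only two of the ten features (RMS and MDF), a plain NumPy Lloyd's-algorithm K-means, and hypothetical function names, window length, and hop size chosen for the example.

```python
import numpy as np

def window_features(signal, fs, win_len, hop):
    """Return an (n_windows, 2) array of [RMS, median frequency] per window."""
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    feats = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = signal[start:start + win_len]
        rms = np.sqrt(np.mean(seg ** 2))
        power = np.abs(np.fft.rfft(seg)) ** 2
        cum = np.cumsum(power)
        # Median frequency: first bin where cumulative power reaches half the total.
        mdf = freqs[np.searchsorted(cum, cum[-1] / 2.0)]
        feats.append((rms, mdf))
    return np.asarray(feats)

def kmeans_tokens(features, k, n_iter=50, seed=0):
    """Assign each window a token id in [0, k) with plain Lloyd's K-means."""
    rng = np.random.default_rng(seed)
    # Standardize so RMS and MDF contribute on comparable scales.
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-12)
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = z[labels == j].mean(axis=0)
    return labels

# Synthetic stand-in for one sEMG channel: a weak 20 Hz burst, then a strong 100 Hz burst.
fs = 1000
t = np.arange(2000) / fs
sig = np.where(t < 1.0, 0.1 * np.sin(2 * np.pi * 20 * t), np.sin(2 * np.pi * 100 * t))
feats = window_features(sig, fs, win_len=200, hop=200)
tokens = kmeans_tokens(feats, k=2)
```

On real recordings the window length would follow the contraction-cycle criterion described in the abstract, and the feature vector would hold all ten features rather than two.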

2602.23733 2026-03-02 eess.SP

Massive MIMO Channel-aware Decision Fusion Aided by Reconfigurable Intelligent Surfaces

Domenico Ciuonzo, Alessio Zappone, Marco Di Renzo, Linlong Wu

Comments IEEE ICASSP 2025

Details
English abstract

This paper investigates channel-aware decision fusion empowered by massive MIMO systems and reconfigurable intelligent surfaces (RIS). By integrating both, we aim to improve goal-oriented (fusion) performance despite the unique propagation challenges introduced. Specifically, we investigate traditional favorable propagation properties in the context of RIS-aided Massive MIMO decision fusion. The above analysis is then leveraged (i) to design three sub-optimal simple fusion rules suited for the large-array regime and (ii) to devise an optimization criterion for RIS reflection coefficients based on long-term channel statistics. Simulation results confirm the appeal of the presented design.

2602.23670 2026-03-02 cs.RO cs.SY eess.SY

Physics-Embedded Neural ODEs for Learning Antagonistic Pneumatic Artificial Muscle Dynamics

Xinyao Wang, Jonathan Realmuto

Details
English abstract

Pneumatic artificial muscles (PAMs) enable compliant actuation for soft wearable, assistive, and interactive robots. When arranged antagonistically, PAMs can provide variable impedance through co-contraction but exhibit coupled, nonlinear, and hysteretic dynamics that challenge modeling and control. This paper presents a hybrid neural ordinary differential equation (Neural ODE) framework that embeds physical structure into a learned model of antagonistic PAM dynamics. The formulation combines parametric joint mechanics and pneumatic state dynamics with a neural network force component that captures antagonistic coupling and rate-dependent hysteresis. The forward model predicts joint motion and chamber pressures with a mean $R^2$ of 0.88 across 225 co-contraction conditions. An inverse formulation, derived from the learned dynamics, computes pressure commands offline for desired motion and stiffness profiles, which are tracked in closed loop during execution. Experimental validation demonstrates reliable stiffness control across 126-176 N/mm and consistent impedance behavior across operating velocities, in contrast to a static model, which shows degraded stiffness consistency at higher velocities.

2602.23668 2026-03-02 cs.AI cs.SY eess.SY

PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents

Yihan Wen, Xin Chen

Details
English abstract

Large language model (LLM) agents typically rely on reactive decision-making paradigms such as ReAct, selecting actions conditioned on growing execution histories. While effective for short tasks, these approaches often lead to redundant tool usage, unstable reasoning, and high token consumption in complex long-horizon tasks involving branching, iteration, or multi-tool coordination. To address these limitations, this paper introduces PseudoAct, a novel framework for flexible planning and action control in LLM agents through pseudocode synthesis. Leveraging the ability of LLMs to express task-solving strategies as code, PseudoAct synthesizes a structured pseudocode plan that decomposes a task into subtasks and explicitly encodes control flow, including sequencing, conditionals, loops, parallel composition, and combinations of these logic primitives. Actions are then executed by following this global plan, making the decision logic explicit and temporally coherent. This design reduces redundant actions, prevents infinite loops, and avoids uninformative alternative exploration, enabling consistent and efficient long-horizon decision-making. Experiments on benchmark datasets show that our method significantly outperforms existing reactive agent approaches, achieving a 20.93% absolute gain in success rate on FEVER and setting a new state-of-the-art on HotpotQA.

2602.23607 2026-03-02 cs.RO cs.SY eess.SY

MicroPush: A Simulator and Benchmark for Contact-Rich Cell Pushing and Assembly with a Magnetic Rolling Microrobot

Yanda Yang, Sambeeta Das

Comments 13 pages, 8 figures

Details
English abstract

Magnetic rolling microrobots enable gentle manipulation in confined microfluidic environments, yet autonomy for contact-rich behaviors such as cell pushing and multi-target assembly remains difficult to develop and evaluate reproducibly. We present MicroPush, an open-source simulator and benchmark suite for magnetic rolling microrobots in cluttered 2D scenes. MicroPush combines an overdamped interaction model with contact-aware stick-slip effects, lightweight near-field damping, optional Poiseuille background flow, and a calibrated mapping from actuation frequency to free-space rolling speed. On top of the simulator core, we provide a modular planning-control stack with a two-phase strategy for contact establishment and goal-directed pushing, together with a deterministic benchmark protocol with fixed tasks, staged execution, and unified CSV logging for single-object transport and hexagonal assembly. We report success, time, and tracking metrics, and an actuation-variation measure $E_{\Delta\omega}$. Results show that controller stability dominates performance under flow disturbances, while planner choice can influence command smoothness over long-horizon sequences via waypoint progression. MicroPush enables reproducible comparison and ablation of planning, control, and learning methods for microscale contact-rich micromanipulation.

2602.21429 2026-03-02 cs.LG cs.AI cs.SY eess.SY math.OC

Provably Safe Generative Sampling with Constricting Barrier Functions

Darshan Gadginmath, Ahmed Allibhoy, Fabio Pasqualetti

Comments 21 pages, 7 figures

Details
English abstract

Flow-based generative models, such as diffusion models and flow matching models, have achieved remarkable success in learning complex data distributions. However, a critical gap remains for their deployment in safety-critical domains: the lack of formal guarantees that generated samples will satisfy hard constraints. We address this by proposing a safety filtering framework that acts as an online shield for any pre-trained generative model. Our key insight is to cooperate with the generative process rather than override it. We define a constricting safety tube that is relaxed at the initial noise distribution and progressively tightens to the target safe set at the final data distribution, mirroring the coarse-to-fine structure of the generative process itself. By characterizing this tube via Control Barrier Functions (CBFs), we synthesize a feedback control input through a convex Quadratic Program (QP) at each sampling step. As the tube is loosest when noise is high and intervention is cheapest in terms of control energy, most constraint enforcement occurs when it least disrupts the model's learned structure. We prove that this mechanism guarantees safe sampling while minimizing the distributional shift from the original model at each sampling step, as quantified by the KL divergence. Our framework applies to any pre-trained flow-based generative scheme requiring no retraining or architectural modifications. We validate the approach across constrained image generation, physically-consistent trajectory sampling, and safe robotic manipulation policies, achieving 100% constraint satisfaction while preserving semantic fidelity.
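
To make the tightening-tube idea concrete, the sketch below applies a CBF filter to a toy sampler. Everything specific here is an assumption for illustration and not the paper's formulation: the safe set is a ball, the tube radius shrinks linearly in time, the barrier is h(x, t) = r(t)^2 - ||x||^2, the "model" is a hypothetical constant velocity field, and sampling uses explicit Euler steps. With a single affine constraint, the minimum-energy QP has the closed-form solution used in `safe_step`, so no QP solver is needed.

```python
import numpy as np

def radius(t, r_start=5.0, r_end=1.0):
    """Hypothetical tube radius: loose at t=0 (noise), tight at t=1 (data)."""
    return r_start + (r_end - r_start) * t

def safe_step(x, t, dt, velocity, alpha=10.0, r_start=5.0, r_end=1.0):
    """One Euler step of dx/dt = v(x, t) + u with a minimum-norm CBF correction u."""
    r = radius(t, r_start, r_end)
    h = r ** 2 - x @ x                   # barrier: h >= 0 inside the tube
    grad_h = -2.0 * x
    dh_dt = 2.0 * r * (r_end - r_start)  # the tube shrinks over time
    v = velocity(x, t)
    # CBF condition: grad_h . (v + u) + dh_dt + alpha * h >= 0
    residual = grad_h @ v + dh_dt + alpha * h
    u = np.zeros_like(x)
    norm_sq = grad_h @ grad_h
    if residual < 0.0 and norm_sq > 0.0:
        # Closed-form solution of min ||u||^2 subject to the single constraint.
        u = -residual * grad_h / norm_sq
    return x + dt * (v + u)

def sample(x0, velocity, n_steps=1000):
    """Integrate from t=0 to t=1 with the filter applied at every step."""
    x, dt = np.asarray(x0, dtype=float), 1.0 / n_steps
    for k in range(n_steps):
        x = safe_step(x, k * dt, dt, velocity)
    return x

# Toy velocity field that would carry the sample to (3, 0), outside the unit ball.
x_final = sample([0.0, 0.0], lambda x, t: np.array([3.0, 0.0]))
```

In this toy run the correction stays at zero for roughly the first 60% of the trajectory and only intervenes once the shrinking tube approaches the sample. Note that the barrier guarantee is a continuous-time property; with Euler steps it holds only for sufficiently small step sizes.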

2602.17705 2026-03-02 eess.SP cs.IR cs.IT cs.SY eess.SY math.IT

Wavenumber-domain signal processing for holographic MIMO: Foundations, methods, and future directions

Zijian Zhang, Linglong Dai

Comments Accepted by IEEE Communications Standards Magazine. 6 pages, 5 figures

Details
English abstract

Holographic multiple-input multiple-output (H-MIMO) systems represent a paradigm shift in wireless communications by enabling quasi-continuous apertures. Unlike conventional MIMO systems, H-MIMO with subwavelength antenna spacing operates in both far-field and near-field regimes, where classical discrete Fourier transform (DFT) representations fail to sufficiently capture the channel characteristics. To address this challenge, this article provides an overview of the emerging wavenumber-domain signal processing framework. Specifically, by leveraging spatial Fourier plane-wave decomposition to model H-MIMO channels, the wavenumber domain offers a unified and physically consistent basis for characterizing subwavelength-level spatial correlation and spherical wave propagation. This article first introduces the concept of H-MIMO and the wavenumber representation of H-MIMO channels. Next, it elaborates on wavenumber-domain signal processing technologies reported in the literature, including multiplexing, channel estimation, and waveform designs. Finally, it highlights open challenges and outlines future research directions in wavenumber-domain signal processing for next-generation wireless systems.