arXivDaily: arXiv daily academic digest, updated Monday through Friday
EESS Electrical Engineering and Systems Science 173
2602.02452 2026-02-03 eess.SY cs.SY

Robust Safety-Critical Control of Networked SIR Dynamics

Saba Samadi, Brooks A. Butler, Philip E. Paré

Comments 8 pages, 7 figures, accepted to the 2026 American Control Conference (ACC)

Abstract

We present a robust safety-critical control framework tailored for networked susceptible-infected-recovered (SIR) epidemic dynamics, leveraging control barrier functions (CBFs) and robust control barrier functions to address the challenges of epidemic spread and mitigation. In our networked SIR model, each node must keep its infection level below a critical threshold, despite dynamic interactions with neighboring nodes and inherent uncertainties in the epidemic parameters and measurement errors, to ensure public health safety. We first derive a CBF-based controller that guarantees infection thresholds are not exceeded in the nominal case. We enhance the framework to handle realistic epidemic scenarios under uncertainties by incorporating compensation terms that reinforce safety against uncertainties: an independent method with constant bounds for uniform uncertainty, and a novel approach that scales with the state to capture increased relative noise in early or suppressed outbreak stages. Simulation results on a networked SIR system illustrate that the nominal CBF controller maintains safety under low uncertainty, while the robust approaches provide formal safety guarantees under higher uncertainties; in particular, the novel method employs more conservative control efforts to provide larger safety margins, whereas the independent approach optimizes resource allocation by allowing infection levels to approach the boundaries in steady epidemic regimes.
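To make the nominal CBF idea in this abstract concrete, here is a minimal sketch. It is not the authors' controller. It assumes a networked SIR model in which a hypothetical control input u_i adds to node i's recovery rate, a barrier h_i = i_max - I_i, and a linear gain alpha. The function names, the model form, and all parameters are assumptions made for illustration.

```python
def infection_rate(i, S, I, beta, gamma, u):
    """dI_i/dt under the assumed model (not taken from the paper):
    dI_i/dt = S_i * sum_j beta[i][j] * I_j - (gamma[i] + u) * I_i
    """
    inflow = S[i] * sum(beta[i][j] * I[j] for j in range(len(I)))
    return inflow - (gamma[i] + u) * I[i]


def min_safe_control(i, S, I, beta, gamma, i_max, alpha):
    """Smallest u_i >= 0 that satisfies dh_i/dt >= -alpha * h_i,
    where h_i = i_max - I_i and therefore dh_i/dt = -dI_i/dt.
    """
    if I[i] <= 0.0:
        # The control multiplies I_i, so it has no effect when I_i = 0.
        return 0.0
    inflow = S[i] * sum(beta[i][j] * I[j] for j in range(len(I)))
    h = i_max - I[i]
    # Rearranged CBF condition: u_i >= (inflow - alpha * h) / I_i - gamma_i
    u = (inflow - alpha * h) / I[i] - gamma[i]
    return max(0.0, u)
```

Far from the threshold the condition is slack and the sketch returns zero control. Near the threshold it returns just enough extra recovery to hold the growth of I_i at alpha * h_i. The robust variants described in the abstract would add compensation terms for uncertainty, which this sketch omits.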

2602.02435 2026-02-03 cs.IT cs.NI cs.SY eess.SY math.IT

Preemptive Scheduling for Age of Job Minimization in Task-Specific Machine Networks

Subhankar Banerjee, Sennur Ulukus

Abstract

We consider a time-slotted job-assignment system consisting of a central server, $N$ task-specific networks of machines, and multiple users. Each network specializes in executing a distinct type of task. Users stochastically generate jobs of various types and forward them to the central server, which routes each job to the appropriate network of machines. Due to resource constraints, the server cannot serve all users' jobs simultaneously, which motivates the design of scheduling policies with possible preemption. To evaluate scheduling performance, we introduce a novel timeliness metric, the age of job, inspired by the well-known metric, the age of information. We study the problem of minimizing the long-term weighted average age of job. We first propose a max-weight policy by minimizing the one-step Lyapunov drift and then derive the Whittle index (WI) policy when the job completion times of the networks of machines follow geometric distributions. For general job completion time distributions, we introduce a Whittle index with max-weight fallback (WIMWF) policy. We also investigate the Net-gain maximization (NGM) policy. Numerically, we show that the proposed WIMWF policy achieves the best performance in the general job completion time setting. We also observe a scaling trend: two different max-weight policies can outperform the NGM policy in small systems, whereas the NGM policy improves as we scale the system size and becomes asymptotically better than max-weight policies. For geometric service times, the WI policy yields the lowest age across all considered system sizes.

2602.02376 2026-02-03 eess.SY cs.SY

An Efficient Power Management Unit With Continuous MPPT and Energy Recycling for Wireless Millimetric Biomedical Implants

Yiwei Zou, Huan-Cheng Liao, Wei Wang, Wonjune Kim, Yumin Su, Jacob T. Robinson, Kaiyuan Yang

Journal ref IEEE Journal of Solid-State Circuits, 2025

Abstract

Biomedical implants offer transformative tools to improve medical outcomes. To realize minimally invasive implants with miniaturized volume and weight, wireless power transfer (WPT) has been extensively studied to replace the bulky batteries that dominate the volume of traditional implants and require surgical replacement. Ultrasonic and magnetoelectric (ME) WPT modalities, which leverage low-frequency acoustic-electrical coupling for energy transduction, have become viable solutions for mm-scale receivers. This work presents a fully integrated power management unit (PMU) for ME WPT in millimetric implants. The PMU achieves load-independent maximum power extraction and usage by continuously matching the impedance of the transducer, dynamically optimizing the power stage across varying input/load conditions, and reusing the stored energy to sustain the system when input power drops. Its architecture of parallel-input regulation and storing stages prevents cascading power loss. With the skewed-duty-cycle MPPT technique and regulation efficiency optimizer, the PMU achieves a peak MPPT efficiency of 98.5% and a peak overall system efficiency of 73.33%. Additionally, the PMU includes an adaptive high-voltage charging stage that charges the stimulation capacitor up to 12 V with an improved efficiency of 37.88%.

2602.02365 2026-02-03 eess.SP

A Track-Before-Detect Trajectory Multi-Bernoulli Filter for Generalised Superpositional Measurements

Sion Lynch, Ángel F. García-Fernández, Lee Devlin

Comments Submitted to IEEE Transactions on Signal Processing

Abstract

This paper proposes the Trajectory-Information Exchange Multi-Bernoulli (T-IEMB) filter to estimate sets of alive and all trajectories in track-before-detect applications with generalised superpositional measurements. This measurement model has superpositional hidden variables which are mapped to the conditional mean and covariance of the measurement, enabling it to describe a broad range of measurement models. This paper also presents a Gaussian implementation of the T-IEMB filter, which performs the update by approximating the conditional moments of the measurement model, and admits a computationally light filtering solution. Simulation results for a non-Gaussian radar-based tracking scenario demonstrate the performance of two Gaussian T-IEMB implementations, which provide improved tracking performance compared to a state-of-the-art particle filter based solution for track-before-detect, at a reduced computational cost.

2602.02312 2026-02-03 eess.SP

Flexible laboratory setup for DAC experimentation

Alfredo Pérez Vega-Leal, Manuel G. Satué

Abstract

Analog multiplexing appears to be a promising solution for modern transmitters, where speed is the primary limitation. The objective is to develop a low-cost solution for comparing different digital-to-analog converter (DAC) schemes. In particular, analog multiplexing techniques, high-speed single-DAC designs, sigma-delta modulation, and dynamic element matching are considered. The work reviews these techniques and presents a prototype of a time-interleaved sigma-delta-modulation DAC built on a commercially available field-programmable gate array (FPGA) system.

2602.02269 2026-02-03 cs.RO cs.AI cs.SE cs.SY eess.SY

Bridging the Sim-to-Real Gap with multipanda_ros2: A Real-Time ROS2 Framework for Multimanual Systems

Jon Škerlj, Seongjin Bien, Abdeldjallil Naceri, Sami Haddadin

Comments This work has been submitted to the IEEE for possible publication

Abstract

We present $multipanda\_ros2$, a novel open-source ROS2 architecture for multi-robot control of Franka Robotics robots. Leveraging ros2_control, this framework provides native ROS2 interfaces for controlling any number of robots from a single process. Our core contributions address key challenges in real-time torque control, including interaction control and robot-environment modeling. A central focus of this work is sustaining a 1 kHz control frequency, a necessity for real-time control and a minimum frequency required by safety standards. Moreover, we introduce a controllet-feature design pattern that enables controller-switching delays of $\le 2$ ms, facilitating reproducible benchmarking and complex multi-robot interaction scenarios. To bridge the simulation-to-reality (sim2real) gap, we integrate a high-fidelity MuJoCo simulation with quantitative metrics for both kinematic accuracy and dynamic consistency (torques, forces, and control errors). Furthermore, we demonstrate that real-world inertial parameter identification can significantly improve force and torque accuracy, providing a methodology for iterative physics refinement. Our work extends approaches from soft robotics to rigid dual-arm, contact-rich tasks, showcasing a promising method to reduce the sim2real gap and providing a robust, reproducible platform for advanced robotics research.

2602.02248 2026-02-03 eess.SP cs.IT math.IT

A Novel ISAC Waveform Based on Orthogonal Delay-Doppler Division Multiplexing with FMCW

Kehan Huang, Akram Shafie, Min Qiu, Elias Aboutanios, Jinhong Yuan

Comments 17 pages, 18 figures

Abstract

In this work, we propose the orthogonal delay-Doppler (DD) division multiplexing (ODDM) modulation with frequency modulated continuous wave (FMCW) (ODDM-FMCW) waveform to enable integrated sensing and communication (ISAC) with a low peak-to-average power ratio (PAPR). We first propose a square-root-Nyquist-filtered FMCW (SRN-FMCW) waveform to address limitations of conventional linear FMCW waveforms in ISAC systems. To better integrate with ODDM, we generate SRN-FMCW by embedding symbols in the DD domain, referred to as a DD-SRN-FMCW frame. A DD chirp compression receiver is designed to obtain the channel response efficiently. Next, we construct the proposed ODDM-FMCW waveform for ISAC by superimposing a DD-SRN-FMCW frame onto an ODDM data frame. A comprehensive performance analysis of the ODDM-FMCW waveform is presented, covering PAPR, spectrum, ambiguity function, and Cramér-Rao bound for delay and Doppler estimation. Numerical results show that the proposed ODDM-FMCW waveform delivers excellent ISAC performance in terms of root mean square error for sensing and bit error rate for communications.

2602.02167 2026-02-03 eess.SP cs.CV cs.LG cs.RO

Real-Time 2D LiDAR Object Detection Using Three-Frame RGB Scan Encoding

Soheil Behnam Roudsari, Alexandre S. Brandão, Felipe N. Martins

Comments 6 pages, 6 figures, submitted to IEEE SAS 2026

Abstract

Indoor service robots need perception that is robust, more privacy-friendly than RGB video, and feasible on embedded hardware. We present a camera-free 2D LiDAR object detection pipeline that encodes short-term temporal context by stacking three consecutive scans as RGB channels, yielding a compact YOLOv8n input without occupancy-grid construction while preserving angular structure and motion cues. Evaluated in Webots across 160 randomized indoor scenarios with strict scenario-level holdout, the method achieves 98.4% mAP@0.5 (0.778 mAP@0.5:0.95) with 94.9% precision and 94.7% recall on four object classes. On a Raspberry Pi 5, it runs in real time with a mean post-warm-up end-to-end latency of 47.8 ms per frame, including scan encoding and postprocessing. Relative to a closely related occupancy-grid LiDAR-YOLO pipeline reported on the same platform, the proposed representation is associated with substantially lower reported end-to-end latency. Although results are simulation-based, they suggest that lightweight temporal encoding can enable accurate and real-time LiDAR-only detection for embedded indoor robotics without capturing RGB appearance.
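The three-frame encoding described above can be pictured with a small sketch. The abstract does not give the exact image layout, so the one-pixel-per-beam layout, the 8-bit quantisation, the `max_range` clipping value, and the function name below are all assumptions for illustration, not the authors' implementation.

```python
def encode_three_scans(scan_prev2, scan_prev1, scan_now, max_range=10.0):
    """Pack three consecutive 2D LiDAR scans into one row of RGB pixels.

    Each scan is a list of ranges in metres, one entry per beam angle.
    Pixel k holds (R, G, B) = quantised range of beam k at t-2, t-1 and t.
    The quantisation and clipping choices are assumptions.
    """
    if not (len(scan_prev2) == len(scan_prev1) == len(scan_now)):
        raise ValueError("all three scans must have the same number of beams")

    def to_byte(r):
        r = min(max(r, 0.0), max_range)    # clip to [0, max_range]
        return round(255 * r / max_range)  # quantise to 0..255

    return [
        (to_byte(a), to_byte(b), to_byte(c))
        for a, b, c in zip(scan_prev2, scan_prev1, scan_now)
    ]
```

Under this layout, a beam that hits a static surface produces equal channel values (a grey pixel), while a beam whose range changes across the three scans produces unequal channels, which is one way a detector could pick up motion cues without an occupancy grid.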

2602.02161 2026-02-03 cs.LG cs.SY eess.SY

Generating Causal Temporal Interaction Graphs for Counterfactual Validation of Temporal Link Prediction

Aniq Ur Rahman, Justin P. Coon

Abstract

Temporal link prediction (TLP) models are commonly evaluated based on predictive accuracy, yet such evaluations do not assess whether these models capture the causal mechanisms that govern temporal interactions. In this work, we propose a framework for counterfactual validation of TLP models by generating causal temporal interaction graphs (CTIGs) with known ground-truth causal structure. We first introduce a structural equation model for continuous-time event sequences that supports both excitatory and inhibitory effects, and then extend this mechanism to temporal interaction graphs. To compare causal models, we propose a distance metric based on cross-model predictive error, and empirically validate the hypothesis that predictors trained on one causal model degrade when evaluated on sufficiently distant models. Finally, we instantiate counterfactual evaluation under (i) controlled causal shifts between generating models and (ii) timestamp shuffling as a stochastic distortion with measurable causal distance. Our framework provides a foundation for causality-aware benchmarking.

2602.02148 2026-02-03 eess.SP

RIS-Aided Wireless Amodal Sensing for Single-View 3D Reconstruction

Yuhan Wang, Haobo Zhang, Qingyu Liu, Hongliang Zhang, Lingyang Song

Abstract

Amodal sensing is critical for various real-world sensing applications because it can recover the complete shapes of partially occluded objects in complex environments. Among various amodal sensing paradigms, wireless amodal sensing is a potential solution due to its advantages of environmental robustness, privacy preservation, and low cost. However, the sensing data obtained by a wireless system is sparse for shape reconstruction because of the low spatial resolution, and this issue is further intensified in complex environments with occlusion. To address this issue, we propose a Reconfigurable Intelligent Surface (RIS)-aided wireless amodal sensing scheme that leverages a large-scale RIS to enhance the spatial resolution and create reflection paths that can bypass the obstacles. A generative learning model is also employed to reconstruct the complete shape based on the sensing data captured from the viewpoint of the RIS. In such a system, it is challenging to optimize the RIS phase shifts because the relationship between RIS phase shifts and amodal sensing accuracy is complex and the closed-form expression is unknown. To tackle this challenge, we develop an error prediction model that learns the mapping from RIS phase shifts to amodal sensing accuracy, and we optimize the RIS phase shifts based on this mapping. Experimental results on the benchmark dataset show that our method achieves at least a 56.73% reduction in reconstruction error compared to conventional schemes under the same number of RIS configurations.

2602.02086 2026-02-03 eess.SP cs.HC

Neurophysiological effects of museum modalities on emotional engagement with real artworks

Chen Feng, Sébastien Lugan, Karine Lasaracina, Midori Sugaya, Benoît Macq

Comments 7 pages, 4 figures - © IEEE EmotionSense 2026/PerCom 2026

Abstract

Museums increasingly rely on digital content to support visitors' understanding of artworks, yet little is known about how these formats shape the emotional engagement that underlies meaningful art experiences. This research presents an in-situ EEG study on how digital interpretive content modulates engagement during art viewing. Participants experienced three modalities: direct viewing of a Bruegel painting, a 180° immersive interpretive projection, and a regular, display-based interpretive video. Frontal EEG markers of motivational orientation, internal involvement, perceptual drive, and arousal were extracted using eyes-open baselines and Z-normalized contrasts. Results show modality-specific engagement profiles: display-based interpretive video induced high arousal and fast-band activity, immersive projections promoted calm, presence-oriented absorption, and original artworks reflected internally regulated engagement. These findings, relying on lightweight EEG sensing in an operational cultural environment, suggest that digital interpretive content affects engagement style rather than quantity. This paves the way for new multimodal sensing approaches and enables museums to optimize the modalities and content of their interpretive media.

2602.02031 2026-02-03 eess.IV

Edge-Aligned Initialization of Kernels for Steered Mixture-of-Experts

Martin Determann, Elvira Fleig

Abstract

Steered Mixture-of-Experts (SMoE) has recently emerged as a powerful framework for spatial-domain image modeling, enabling high-fidelity image representation using a remarkably small number of parameters. Its ability to steer kernel-based experts toward structural image features has led to successful applications in image compression, denoising, super-resolution, and light field processing. However, practical adoption is hindered by the reliance on gradient-based optimization to estimate model parameters on a per-image basis, a process that is computationally intensive and difficult to scale. Initialization strategies for SMoE are an essential component that directly affects convergence and reconstruction quality. In this paper, we propose a novel, edge-based initialization scheme that achieves good reconstruction quality while significantly reducing the need for stochastic optimization. Through a method that leverages Canny edge detection to extract a sparse set of image contours, kernel positions and orientations are deterministically inferred. A separate approach enables the direct estimation of initial expert coefficients. This initialization reduces both memory consumption and computational cost.

2602.02005 2026-02-03 cs.AR cs.LG cs.SY eess.SY hep-ex quant-ph

Position: The Need for Ultrafast Training

Duc Hoang

Comments Position paper at the 2nd Workshop on Domain-Specialized FPGAs (WDSFPGA 2026)

Abstract

Domain-specialized FPGAs have delivered unprecedented performance for low-latency inference across scientific and industrial workloads, yet nearly all existing accelerators assume static models trained offline, relegating learning and adaptation to slower CPUs or GPUs. This separation fundamentally limits systems that must operate in non-stationary, high-frequency environments, where model updates must occur at the timescale of the underlying physics. In this paper, I argue for a shift from inference-only accelerators to ultrafast on-chip learning, in which both inference and training execute directly within the FPGA fabric under deterministic, sub-microsecond latency constraints. Bringing learning into the same real-time datapath as inference would enable closed-loop systems that adapt as fast as the physical processes they control, with applications spanning quantum error correction, cryogenic qubit calibration, plasma and fusion control, accelerator tuning, and autonomous scientific experiments. Enabling such regimes requires rethinking algorithms, architectures, and toolflows jointly, but promises to transform FPGAs from static inference engines into real-time learning machines.

2602.01974 2026-02-03 eess.SP

Obstacle Detection at Level Crossings under Adverse Weather Conditions -- A Survey

Chenyang Yan, Mats Bengtsson

Abstract

Level crossing accidents remain a significant safety concern in modern railway systems, particularly under adverse weather conditions that degrade sensor performance. This review surveys state-of-the-art sensor technologies and fusion strategies for obstacle detection at railway level crossings, with a focus on robustness, detection accuracy, and environmental resilience. Individual sensors such as inductive loops, cameras, radar, and LiDAR offer complementary strengths but involve trade-offs, including material dependence, reduced visibility, and limited resolution in harsh environments. We analyze each modality's working principles, weather-induced vulnerabilities, and mitigation strategies, including signal enhancement and machine-learning-based denoising. We further review multi-sensor fusion approaches, categorized as data-level, feature-level, and decision-level architectures, that integrate complementary information to improve reliability and fault tolerance. The survey concludes with future research directions, including adaptive fusion algorithms, real-time processing pipelines, and weather-resilient datasets to support the deployment of intelligent, fail-safe detection systems for railway safety.

2602.01961 2026-02-03 eess.SP

Uncertainty-Weighted Multi-Task CNN for Joint DoA and Rain-Rate Estimation Under Rain-Induced Array Distortions

Chenyang Yan, Ruonan Yang, Shunqiao Sun, Mats Bengtsson

Abstract

We investigate joint direction-of-arrival (DoA) and rain-rate estimation for a uniform linear array operating under rain-induced multiplicative distortions. Building on a wavefront fluctuation model whose spatial correlation is governed by the rain-rate, we derive an angle-dependent covariance formulation and use it to synthesize training data. DoA estimation is cast as a multi-label classification problem on a discretized angular grid, while rain-rate estimation is formulated as a multi-class classification task. We then propose a multi-task deep CNN with a shared feature extractor and two task-specific heads, trained using an uncertainty-weighted objective to automatically balance the two losses. Numerical results in a two-source scenario show that the proposed network achieves lower DoA RMSE than classical baselines and provides accurate rain-rate classification at moderate-to-high SNRs.
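The abstract does not state the exact form of its uncertainty-weighted objective. One widely used form of homoscedastic uncertainty weighting for multi-task learning gives each task a learnable log-variance s_i and minimises the sum of exp(-s_i) * L_i + s_i. The sketch below shows only that combination step in plain Python. The function name and the choice of this particular form are assumptions, not the authors' training code.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using learnable log-variances s_i.

    total = sum_i exp(-s_i) * L_i + s_i
    A larger s_i down-weights task i, while the "+ s_i" term keeps s_i
    from growing without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))
```

For a fixed loss L_i > 0, the term exp(-s_i) * L_i + s_i is minimised at s_i = ln(L_i), so tasks with larger losses are automatically given smaller weights. In this paper's setting the two entries would be the DoA multi-label loss and the rain-rate classification loss.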

2602.01947 2026-02-03 eess.SP

Resolution-Aliasing Trade-off in Near-Field Localisation

Baptiste Sambon, Gilles Monnoyer, Luc Vandendorpe, Claude Oestges

Comments Submitted to IEEE Open Journal of Signal Processing

Abstract

Extremely Large-scale MIMO (XL-MIMO) systems operating in Near-Field (NF) introduce new degrees of freedom for accurate source localisation, but make dense arrays impractical. Sparse or distributed arrays can reduce hardware complexity while maintaining high resolution, yet sub-Nyquist spatial sampling introduces aliasing artefacts in the localisation ambiguity function. This paper presents a unified framework to jointly characterise resolution and aliasing in NF localisation and study the trade-off between the two. Leveraging the concept of local chirp spatial frequency, we derive analytical expressions linking array geometry and sampling density to the spatial bandwidth of the received field. We introduce two geometric tools, Critical Antenna Elements (CAEs) and the Non-Contributive Zone (NCZ), to intuitively identify how individual antennas contribute to resolution and/or aliasing. Our analysis reveals that resolution and aliasing are not always strictly coupled, e.g., increasing the array aperture can improve resolution without necessarily aggravating aliasing. These results provide practical guidelines for designing NF arrays that optimally balance resolution and aliasing, supporting efficient XL-MIMO deployment.

2602.01892 2026-02-03 cs.RO cs.SY eess.SY

Path Tracking with Dynamic Control Point Blending for Autonomous Vehicles: An Experimental Study

Alexandre Lombard, Florent Perronnet, Nicolas Gaud, Abdeljalil Abbas-Turki

Abstract

This paper presents an experimental study of a path-tracking framework for autonomous vehicles in which the lateral control command is applied to a dynamic control point along the wheelbase. Instead of enforcing a fixed reference at either the front or rear axle, the proposed method continuously interpolates between both, enabling smooth adaptation across driving contexts, including low-speed maneuvers and reverse motion. The lateral steering command is obtained by barycentric blending of two complementary controllers: a front-axle Stanley formulation and a rear-axle curvature-based geometric controller, yielding continuous transitions in steering behavior and improved tracking stability. In addition, we introduce a curvature-aware longitudinal control strategy based on virtual track borders and ray-tracing, which converts upcoming geometric constraints into a virtual obstacle distance and regulates speed accordingly. The complete approach is implemented in a unified control stack and validated in simulation and on a real autonomous vehicle equipped with GPS-RTK, radar, odometry, and IMU. The results in closed-loop tracking and backward maneuvers show improved trajectory accuracy, smoother steering profiles, and increased adaptability compared to fixed control-point baselines.
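The barycentric blending of the two lateral controllers can be sketched as follows. The textbook Stanley law and a curvature-following rear-axle law are used here as stand-ins. The sign conventions, the gain, the small `eps` term, and the way the blending weight `lam` is chosen are assumptions, since the abstract does not specify them.

```python
import math


def stanley_front(heading_err, cross_track_err, speed, gain=1.0, eps=1e-3):
    """Front-axle Stanley law: heading error plus a cross-track correction."""
    return heading_err + math.atan2(gain * cross_track_err, abs(speed) + eps)


def curvature_rear(path_curvature, wheelbase):
    """Rear-axle geometric law: steering angle that follows a given curvature."""
    return math.atan(wheelbase * path_curvature)


def blended_steering(lam, heading_err, cross_track_err, speed,
                     path_curvature, wheelbase):
    """Barycentric blend: lam = 0 gives the rear-axle law, lam = 1 the front-axle law."""
    lam = min(max(lam, 0.0), 1.0)  # keep the weight inside [0, 1]
    front = stanley_front(heading_err, cross_track_err, speed)
    rear = curvature_rear(path_curvature, wheelbase)
    return (1.0 - lam) * rear + lam * front
```

Because the output is a convex combination, moving `lam` continuously (for example with speed or driving direction, which is left to the caller here) changes the steering command continuously rather than switching abruptly between the two controllers.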

2602.01857 2026-02-03 eess.SY cs.SY math.OC

Super-twisting over networks: A Lyapunov approach for distributed differentiation

Rodrigo Aldana-López, Irene Perez Salesa, David Gomez Gutierrez, Rosario Aragues, Carlos Sagues

Comments Preprint. Submitted for possible publication

Abstract

We study distributed differentiation, where agents in a networked system estimate the average of local time-varying signals and their derivatives under mild assumptions on the agents' signals and their first and second derivatives. Existing sliding-mode methods provide only local stability guarantees and lack systematic gain selection. By isolating the structural features shared with the super-twisting algorithm and encoding them into an abstract model, we construct a Lyapunov function enabling systematic gain design and proving global finite-time convergence to consensus for the distributed differentiator. Building on this framework, we develop an event-triggered hybrid system implementation using time-varying and state-dependent threshold rules and derive minimum inter-event time guarantees and accuracy bounds that quantify the trade-off between estimation accuracy and communication effort.

2602.01804 2026-02-03 eess.SY cs.GT cs.SY

Fostering Data Collaboration in Digital Transportation Marketplaces: The Role of Privacy-Preserving Mechanisms

Qiqing Wang, Haokun Yu, Kaidi Yang

Abstract

Data collaboration between municipal authorities (MAs) and mobility providers (MPs) has brought tremendous benefits to transportation systems in the era of big data. Engaging in collaboration can improve the service operations (e.g., reduced delay) of these data owners; however, it can also raise privacy concerns and discourage data-sharing willingness. Specifically, data owners may be concerned that the shared data may leak sensitive information about their customers' mobility patterns or business secrets, resulting in the failure of collaboration. This paper investigates how privacy-preserving mechanisms can foster data collaboration in such settings. We propose a game-theoretic framework to investigate data-sharing among transportation stakeholders, especially considering perturbation-based privacy-preserving mechanisms. Numerical studies demonstrate that lower data quality expectations can incentivize voluntary data sharing, improving transport-related welfare for both MAs and MPs. Our findings provide actionable insights for policymakers and system designers on how privacy-preserving technologies can help bridge data silos and promote collaborative, privacy-aware transportation systems.

2602.01758 2026-02-03 eess.AS physics.bio-ph

Short-wave admittance correction for a time-domain cochlear transmission line model

François Deloche, Morgan Thienpont, Sarah Verhulst

Comments 22 pages, 7 figures

Abstract

Transmission line (TL) models implemented in the time domain can efficiently simulate basilar-membrane (BM) displacement in response to transient or non-stationary sounds. By design, a TL model is well-suited for a one-dimensional (1-D) characterization of the traveling wave, but the real configuration of the cochlea also introduces higher-dimensional effects. Such effects include the focusing of the pressure around the BM and transverse viscous damping, both of which are magnified in the short-wave region. The two effects depend on the wavelength and are more readily expressed in the frequency domain. In this paper, we introduce a numerical correction for the BM admittance to account for 2-D effects in the time domain using autoregressive filtering and regression techniques. The correction was required for the implementation of a TL model tailored to the gerbil cochlear physiology. The model, which includes instantaneous nonlinearities in the form of variable damping, initially presented insufficient compression with increasing sound levels. This limitation was explained by the strong coupling between gain and frequency selectivity assumed in the 1-D nonlinear TL model, whereas cochlear frequency selectivity shows only a moderate dependence on sound level in small mammals. The correction factor was implemented in the gerbil model and made level-dependent using a feedback loop. The updated model achieved some decoupling between frequency selectivity and gain, providing 5 dB of additional gain and extending the range of sound levels of the compressive regime by 10 dB. We discuss the relevance of this work through two key features: the integration of both analytical and regression methods for characterizing BM admittance, and the combination of instantaneous and non-instantaneous nonlinearities.

2602.01722 2026-02-03 eess.AS

Joint Optimization of ASV and CM tasks: BTUEF Team's Submission for WildSpoof Challenge

Oguzhan Kurnaz, Jagabandhu Mishra, Tomi Kinnunen, Cemal Hanilci

Abstract

Spoofing-aware speaker verification (SASV) jointly addresses automatic speaker verification and spoofing countermeasures to improve robustness against adversarial attacks. In this paper, we investigate our recently proposed modular SASV framework that enables effective reuse of publicly available ASV and CM systems through non-linear fusion, explicitly modeling their interaction, and optimization with an operating-condition-dependent trainable a-DCF loss. The framework is evaluated using ECAPA-TDNN and ReDimNet as ASV embedding extractors and SSL-AASIST as the CM model, with experiments conducted both with and without fine-tuning on the WildSpoof SASV training data. Results show that the best performance is achieved by combining ReDimNet-based ASV embeddings with fine-tuned SSL-AASIST representations, yielding an a-DCF of 0.0515 on the progress evaluation set and 0.2163 on the final evaluation set.

2602.01681 2026-02-03 eess.IV cs.CV cs.MM

Hyperspectral Image Fusion with Spectral-Band and Fusion-Scale Agnosticism

Yu-Jie Liang, Zihan Cao, Liang-Jian Deng, Yang Yang, Malu Zhang

Abstract

Current deep learning models for Multispectral and Hyperspectral Image Fusion (MS/HS fusion) are typically designed for fixed spectral bands and spatial scales, which limits their transferability across diverse sensors. To address this, we propose SSA, a universal framework for MS/HS fusion with spectral-band and fusion-scale agnosticism. Specifically, we introduce Matryoshka Kernel (MK), a novel operator that enables a single model to adapt to arbitrary numbers of spectral channels. Meanwhile, we build SSA upon an Implicit Neural Representation (INR) backbone that models the HS signal as a continuous function, enabling reconstruction at arbitrary spatial resolutions. Together, these two forms of agnosticism enable a single MS/HS fusion model that generalizes effectively to unseen sensors and spatial scales. Extensive experiments demonstrate that our single model achieves state-of-the-art performance while generalizing well to unseen sensors and scales, paving the way toward future HS foundation models.

2602.01653 2026-02-03 cs.IT eess.SP math.IT

Low-Complexity Multi-Agent Continual Learning for Stacked Intelligent Metasurface-Assisted Secure Communications

Enyu Shi, Yiyang Zhu, Jiayi Zhang, Ziheng Liu, Jiakang Zheng, Jiancheng An, Derrick Wing Kwan Ng, Bo Ai, Chau Yuen

Comments Enyu Shi and Yiyang Zhu contributed equally to this work

Abstract

Stacked intelligent metasurfaces (SIMs), composed of multiple layers of reconfigurable transmissive metasurfaces, are gaining prominence as a transformative technology for future wireless communication security. This paper investigates the integration of SIM into multi-user multiple-input multiple-output (MIMO) systems to enhance physical layer security. A novel system architecture is proposed, wherein each base station (BS) antenna transmits a dedicated single-user stream, while a multi-layer SIM executes wave-based beamforming in the electromagnetic domain, thereby avoiding the need for complex baseband digital precoding and significantly reducing hardware overhead. To maximize the weighted sum secrecy rate (WSSR), we formulate a joint precoding optimization problem over BS power allocation and SIM phase shifts, which is high-dimensional and non-convex due to the complexity of the objective function and the coupling among optimization variables. To address this, we propose a manifold-enhanced heterogeneous multi-agent continual learning (MHACL) framework that incorporates gradient representation and dual-scale policy optimization to achieve robust performance in dynamic environments with high demands for secure communication. Furthermore, we develop SIM-MHACL (SIMHACL), a low-complexity learning template that embeds phase coordination into a product manifold structure, reducing the exponential search space to linear complexity while maintaining physical feasibility. Simulation results validate that the proposed framework achieves millisecond-level per-iteration training in SIM-assisted systems, significantly outperforming various baseline schemes, with SIMHACL achieving comparable WSSR to MHACL while reducing computation time by 30%.

2602.01646 2026-02-03 eess.SP cs.SY eess.SY

Synthesized-Isotropic Narrowband Channel Parameter Extraction from Angle-Resolved Wideband Channel Measurements

Minseok Kim, Masato Yomoda

Details
English abstract

Angle-resolved channel sounding using antenna arrays or mechanically steered high-gain antennas is widely employed at millimeter-wave and terahertz bands. To extract antenna-independent large-scale channel parameters such as path loss, delay spread, and angular spread, the radiation-pattern effects embedded in the measured responses must be properly compensated. This paper revisits the technical challenges of path-gain calculation from angle-resolved wideband measurements, with emphasis on angular-domain power integration where the scan beams are inherently non-orthogonal and simple power summation leads to biased omni-equivalent power estimates. We first formulate the synthesized-isotropic narrowband power in a unified matrix form and introduce a beam-accumulation correction factor, including an offset-averaged variant to mitigate scalloping due to off-grid angles. The proposed framework is validated through simulations using channel models and 154 GHz corridor measurements.

2602.01634 2026-02-03 eess.AS cs.AI

HuPER: A Human-Inspired Framework for Phonetic Perception

Chenxu Guo, Jiachen Lian, Yisi Liu, Baihe Huang, Shriyaa Narayanan, Cheol Jun Cho, Gopala Anumanchipalli

Details
English abstract

We propose HuPER, a human-inspired framework that models phonetic perception as adaptive inference over acoustic-phonetics evidence and linguistic knowledge. With only 100 hours of training data, HuPER achieves state-of-the-art phonetic error rates on five English benchmarks and strong zero-shot transfer to 95 unseen languages. HuPER is also the first framework to enable adaptive, multi-path phonetic perception under diverse acoustic conditions. All training data, models, and code are open-sourced. Code and demo are available at https://github.com/HuPER29/HuPER.

2602.01559 2026-02-03 cs.CV eess.IV

Combined Flicker-banding and Moire Removal for Screen-Captured Images

Libo Zhu, Zihan Zhou, Zhiyi Zhou, Yiyang Qu, Weihang Zhang, Keyu Shi, Yifan Fu, Yulun Zhang

Details
English abstract

Capturing display screens with mobile devices has become increasingly common, yet the resulting images often suffer from severe degradations caused by the coexistence of moiré patterns and flicker-banding, leading to significant visual quality degradation. Due to the strong coupling of these two artifacts in real imaging processes, existing methods designed for single degradations fail to generalize to such compound scenarios. In this paper, we present the first systematic study on joint removal of moiré patterns and flicker-banding in screen-captured images, and propose a unified restoration framework, named CLEAR. To support this task, we construct a large-scale dataset containing both moiré patterns and flicker-banding, and introduce an ISP-based flicker simulation pipeline to stabilize model training and expand the degradation distribution. Furthermore, we design a frequency-domain decomposition and re-composition module together with a trajectory alignment loss to enhance the modeling of compound artifacts. Extensive experiments demonstrate that the proposed method consistently outperforms existing image restoration approaches across multiple evaluation metrics, validating its effectiveness in complex real-world scenarios.

2602.01547 2026-02-03 cs.SD eess.AS

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Qingran Yang, Botao Zhao, Zuheng Kang, Xue Li, Yayun He, Chuhang Liu, Xulong Zhang, Xiaoyang Qu, Junqing Peng, Jianzong Wang

Comments Accepted to 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)

Details
English abstract

The emergence of Large Audio-Language Models (LALMs) has advanced Speech Emotion Recognition (SER), but their size limits deployment in resource-constrained environments. While Knowledge Distillation (KD) is effective for LALM compression, distillation of the cross-modal projection module (Projector) remains underexplored, and existing methods often struggle with alignment due to differences in feature dimensions. We propose PL-Distill, a KD framework that combines Projector-Level Distillation (PDist) to align audio embeddings and Logits-Level Distillation (LDist) to align output logits. PDist introduces Attention-weighted Centered Kernel Alignment, a novel approach that highlights important time steps and addresses dimension mismatches. Meanwhile, LDist minimizes the Kullback-Leibler divergence between teacher and student logits from audio and text modalities. On IEMOCAP, RAVDESS, and SAVEE, PL-Distill compresses an 8.4B-parameter teacher to a compact 1.1B-parameter student, consistently outperforming the teacher, state-of-the-art pretrained models, and other KD baselines across all metrics.
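As background on why Centered Kernel Alignment sidesteps the dimension mismatch, the sketch below implements standard (unweighted) linear CKA. It compares the two feature sets through their time-step-by-time-step Gram matrices, so teacher and student widths need not match. The attention weighting is the paper's contribution and is not reproduced here; the feature values and function names are made up for illustration.

```python
import math

def center_columns(m):
    """Subtract each column's mean (rows = time steps, columns = features)."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]

def gram(m):
    """Gram matrix of row vectors: n x n regardless of feature width."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]

def frob_inner(a, b):
    """Frobenius inner product of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def linear_cka(x, y):
    """Standard linear CKA between two feature sets with the same row count."""
    k = gram(center_columns(x))
    l = gram(center_columns(y))
    return frob_inner(k, l) / math.sqrt(frob_inner(k, k) * frob_inner(l, l))

# Hypothetical features: 4 time steps, teacher width 3, student width 2.
teacher = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 1.0, 0.0], [1.0, 3.0, 1.0]]
student = [[0.5, 1.0], [1.5, 0.0], [0.0, 2.0], [1.0, 1.0]]

similarity = linear_cka(teacher, student)
self_similarity = linear_cka(teacher, teacher)
print(similarity, self_similarity)
```

A distillation loss could then be taken as one minus this similarity, although whether the paper uses that exact form is not stated in the abstract.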

2602.01516 2026-02-03 cs.LG cs.AI cs.SY eess.SY

White-Box Neural Ensemble for Vehicular Plasticity: Quantifying the Efficiency Cost of Symbolic Auditability in Adaptive NMPC

Enzo Nicolas Spotorno, Matheus Wagner, Antonio Augusto Medeiros Frohlich

Comments 5 pages, 1 table, 1 figure, submitted to IEEE VTC 2026 Recent Results Track

Details
English abstract

We present a white-box adaptive NMPC architecture that resolves vehicular plasticity (adaptation to varying operating regimes without retraining) by arbitrating among frozen, regime-specific neural specialists using a Modular Sovereignty paradigm. The ensemble dynamics are maintained as a fully traversable symbolic graph in CasADi, enabling maximal runtime auditability. Synchronous simulation validates rapid adaptation (~7.3 ms) and near-ideal tracking fidelity under compound regime shifts (friction, mass, drag) where non-adaptive baselines fail. Empirical benchmarking quantifies the transparency cost: symbolic graph maintenance increases solver latency by a factor of 72 to 102 versus compiled parametric physics models, establishing the efficiency price of strict white-box implementation.

2602.01513 2026-02-03 eess.IV cs.AI cs.CR cs.CV

MarkCleaner: High-Fidelity Watermark Removal via Imperceptible Micro-Geometric Perturbation

Xiaoxi Kong, Jieyu Yuan, Pengdi Chen, Yuanlin Zhang, Chongyi Li, Bin Li

Details
English abstract

Semantic watermarks exhibit strong robustness against conventional image-space attacks. In this work, we show that such robustness does not survive under micro-geometric perturbations: spatial displacements can remove watermarks by breaking the phase alignment. Motivated by this observation, we introduce MarkCleaner, a watermark removal framework that avoids semantic drift caused by regeneration-based watermark removal. Specifically, MarkCleaner is trained with micro-geometry-perturbed supervision, which encourages the model to separate semantic content from strict spatial alignment and enables robust reconstruction under subtle geometric displacements. The framework adopts a mask-guided encoder that learns explicit spatial representations and a 2D Gaussian Splatting-based decoder that explicitly parameterizes geometric perturbations while preserving semantic content. Extensive experiments demonstrate that MarkCleaner achieves superior performance in both watermark removal effectiveness and visual fidelity, while enabling efficient real-time inference. Our code will be made available upon acceptance.

2602.01508 2026-02-03 eess.SY cs.AI cs.CE cs.SY

Harnessing Flexible Spatial and Temporal Data Center Workloads for Grid Regulation Services

Yingrui Fan, Junbo Zhao

Details
English abstract

Data centers (DCs) are increasingly recognized as flexible loads that can support grid frequency regulation. Yet, most existing methods treat workload scheduling and regulation capacity bidding separately, overlooking how queueing dynamics and spatial-temporal dispatch decisions affect the ability to sustain real-time regulation. As a result, the committed regulation may become infeasible or short-lived. To address this issue, we propose a unified day-ahead co-optimization framework that jointly decides workload distribution across geographically distributed DCs and regulation capacity commitments. We construct a space-time network model to capture workload migration costs, latency requirements, and heterogeneous resource limits. To ensure that the committed regulation remains deliverable, we introduce chance constraints on instantaneous power flexibility based on interactive load forecasts, and apply Value-at-Risk queue-state constraints to maintain sustainable response under cumulative regulation signals. Case studies on a modified IEEE 68-bus system using real data center traces show that the proposed framework lowers system operating costs, enables more viable regulation capacity, and achieves better revenue-risk trade-offs compared to strategies that optimize scheduling and regulation independently.
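To illustrate what a chance constraint on power flexibility does to a regulation commitment, the sketch below uses the standard deterministic reformulation for a single Gaussian forecast: the committed capacity R must satisfy P(flexibility >= R) >= 1 - epsilon, which caps R at the epsilon-quantile of the flexibility distribution. The Gaussian assumption, the function name, and the numbers are illustrative only; the abstract does not specify the forecast distribution or the constraint form the authors use.

```python
from statistics import NormalDist

def max_deliverable_capacity(mean_flex_mw, std_flex_mw, epsilon):
    """Largest commitment R with P(flexibility >= R) >= 1 - epsilon.

    Assumes the available flexibility is Gaussian, so the bound is the
    epsilon-quantile of the forecast distribution, floored at zero.
    """
    quantile = NormalDist(mean_flex_mw, std_flex_mw).inv_cdf(epsilon)
    return max(0.0, quantile)

# Hypothetical forecast: 10 MW mean flexibility, 2 MW standard deviation,
# and a 5% allowed probability of failing to deliver.
capacity = max_deliverable_capacity(10.0, 2.0, 0.05)
delivery_prob = 1.0 - NormalDist(10.0, 2.0).cdf(capacity)
print(capacity, delivery_prob)
```

A tighter epsilon or a noisier forecast lowers the commitment that can be guaranteed, which is the trade-off between regulation revenue and deliverability risk that the abstract refers to.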