arXivDaily arXiv每日学术速递 周一至周五更新
2601.16950 2026-01-26 cs.NI cs.MM eess.IV

Evaluating Wi-Fi Performance for VR Streaming: A Study on Realistic HEVC Video Traffic

Ferran Maura, Francesc Wilhelmi, Boris Bellalta

详情
英文摘要

Cloud-based Virtual Reality (VR) streaming presents significant challenges for 802.11 networks due to its high throughput and low latency requirements. When multiple VR users share a Wi-Fi network, the resulting uplink and downlink traffic can quickly saturate the channel. This paper investigates the capacity of 802.11 networks for supporting realistic VR streaming workloads across varying frame rates, bitrates, codec settings, and numbers of users. We develop an emulation framework that reproduces Air Light VR (ALVR) operation, where real HEVC video traffic is fed into an 802.11 simulation model. Our findings explore Wi-Fi's performance anomaly and demonstrate that Intra-refresh (IR) coding effectively reduces latency variability and improves QoS, supporting up to 4 concurrent VR users with Constant Bitrate (CBR) 100 Mbps before the channel is saturated.

2601.16915 2026-01-26 eess.SP

IRS Compensation of Hyper-Rayleigh Fading: How Many Elements Are Needed?

Aleksey S. Gvozdarev

详情
英文摘要

The letter introduces and studies the problem of defining the minimum number of Intelligent Reflecting Surface (IRS) elements needed to compensate for heavy fading conditions in multipath fading channels. The fading severity is quantified in terms of Hyper-Rayleigh Regimes (HRRs) (i.e., full-HRR (worst-case conditions), strong-, weak-, and no-HRR), and the channel model used (Inverse Power Lomax (IPL)) was chosen since it can account for all HRRs. The research presents the derived closed-form channel coefficient envelope statistics for the single IRS-element channel with IPL statistics in both subchannels and total IRS-assisted channel, as well as tight approximations for the channel coefficient and instantaneous signal-to-noise ratio (SNR) statistics for the latter. The derived expressions helped estimate channel parameters corresponding to the specific HRRs of the total channel and demonstrate that while both single links (i.e., ''source-IRS'' and ''IRS-destination'') are in full-HRR, the minimum number of IRS elements needed to bring the total IRS-assisted link (''source-IRS-destination'') out of full-HRR is no less than $6$ (for the whole range on the IPL scale parameter corresponding full-HRR). Furthermore, the minimum number of IRS elements required to bring the total IRS-assisted link into no-HRR is $14$ (under the same conditions).

2601.16904 2026-01-26 physics.optics eess.IV physics.bio-ph

Clinical Feasibility of Label-Free Digital Staining Using Mid-Infrared Microscopy at Subcellular Resolution

L. Duraffourg, H. Borges, M. Fernandes, M. Beurrier-Bousquet, J. Baraillon, B. Taurel, J. Le Galudec, K. Vianey, C. Maisin, L. Samaison, F. Staroz, M. Dupoy

Comments 33 pages, 15 figures

详情
英文摘要

We present a rapid, large-field bimodal imaging platform that integrates conventional brightfield microscopy with a lensless IR imaging scanner, enabling whole-slide IR image stack acquisition in minutes. Using a dedicated deep learning model, we implement an optical HE staining strategy based on subcellular morpho-spectral fingerprinting.

2601.16876 2026-01-26 eess.SY cs.SY

Performance of Differential Protection Applied to Collector Cables of Offshore Wind Farms with MMC-HVDC Transmission

Moisés J. B. B. Davi, Felipe V. Lopes, Vinícius A. Lacerda, Mário Oleskovicz, Oriol Gomis-Bellmunt

Comments Submitted to DPSP Global 2026

详情
英文摘要

The ongoing global transition towards low-carbon energy has propelled the integration of offshore wind farms, which, when combined with Modular Multilevel Converter-based High-Voltage Direct Current (MMC-HVDC) transmission, present unique challenges for power system protection. In collector cables connecting wind turbines to offshore MMC, both ends are supplied by Inverter-Based Resources (IBRs), which modify the magnitude and characteristics of fault currents. In this context, this paper investigates the limitations of conventional differential protection schemes under such conditions and compares them with enhanced strategies that account for sequence components. Using electromagnetic transient simulations of a representative offshore wind farm modeled in PSCAD/EMTDC software, internal and external fault scenarios are assessed, varying fault types and resistances. The comparative evaluation provides insights into the sensitivity and selectivity of differential protection and guides a deeper conceptual understanding of the evolving protection challenges inherent to future converter-dominated grids.

2601.16863 2026-01-26 cs.AI cs.LG cs.MA cs.SY eess.SY

Mixture-of-Models: Unifying Heterogeneous Agents via N-Way Self-Evaluating Deliberation

Tims Pecerskis, Aivars Smirnovs

详情
英文摘要

This paper introduces the N-Way Self-Evaluating Deliberation (NSED) protocol, a Runtime Mixture-of-Models (MoM) architecture that constructs emergent composite models from a plurality of distinct expert agents. Unlike traditional Mixture-of-Experts (MoE) which rely on static gating networks, NSED employs a Dynamic Expertise Broker - a runtime optimization engine that treats model selection as a variation of the Knapsack Problem, binding heterogeneous checkpoints to functional roles based on live telemetry and cost constraints. At the execution layer, we formalize deliberation as a Macro-Scale Recurrent Neural Network (RNN), where the consensus state loops back through a semantic forget gate to enable iterative refinement without proportional VRAM scaling. Key components include an orchestration fabric for trustless N-to-N peer review, a Quadratic Voting activation function for non-linear consensus, and a feedback-driven state update. Empirical validation on challenging benchmarks (AIME 2025, LiveCodeBench) demonstrates that this topology allows ensembles of small (less than 20B) consumer-grade models to match or exceed the performance of state-of-the-art 100B+ parameter models, establishing a new hardware arbitrage efficiency frontier. Furthermore, testing on the DarkBench safety suite reveals intrinsic alignment properties, with peer-mediated correction reducing sycophancy scores below that of any individual agent.

2601.16847 2026-01-26 eess.SP

Hierarchical Distribution Matcher Design for Probabilistic Constellation Shaping Based on a Novel Semi-Analytical Optimization Approach

Pantea Nadimi Goki, Luca Potì

Comments The manuscript has been submitted for publication in the Journal of Lightwave Technologies (JLT)

详情
英文摘要

A novel design procedure for practical hierarchical distribution matchers (HiDMs) in probabilistically shaped constellation systems is presented. The proposed approach enables the determination of optimal parameters for any target distribution matcher rate. Specifically, lower bounds on energy loss, rate loss, and memory requirements are analytically estimated for HiDM architectures approximating the Maxwell Boltzmann (MB) distribution. A semi analytical optimization framework is employed to jointly optimize rate and energy loss, allowing the selection of the number of hierarchical layers, memory size, and block length required to optimize channel capacity. The accuracy of the proposed model is validated through probabilistic amplitude shaping of 16QAM (PAS 16QAM), showing good agreement between analytical predictions and simulated results. The proposed analytical tool facilitates the design of HiDM structures that are compatible with practical hardware and implementation constraints, such as those imposed by state-of-the-art application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs). Furthermore, the performance of the optimized HiDM structure, incorporating layer selection based on lower-bound energy loss, is evaluated over the AWGN channel in terms of normalized generalized mutual information (NGMI) as a function of the optical signal-to-noise ratio (OSNR). At a net data rate of 200 Gbps with 25% forward error correction (FEC) overhead, the proposed scheme achieves a shaping gain improvement of 2.8% compared to previously reported solutions.

2601.16827 2026-01-26 eess.SY cs.SY

Identification of Port-Hamiltonian Differential-Algebraic Equations from Input-Output Data

N. Hagelaars, G. J. E. van Otterdijk, S. Moradi, R. Tóth, N. O. Jaensson, M. Schoukens

Comments Preprint submitted to IFAC World Congress 2026

详情
英文摘要

Many models of physical systems, such as mechanical and electrical networks, exhibit algebraic constraints that arise from subsystem interconnections and underlying physical laws. Such systems are commonly formulated as differential-algebraic equations (DAEs), which describe both the dynamic evolution of system states and the algebraic relations that must hold among them. Within this class, port-Hamiltonian differential-algebraic equations (pH-DAEs) offer a structured, energy-based representation that preserves interconnection and passivity properties. This work introduces a data-driven identification method that combines port-Hamiltonian neural networks (pHNNs) with a differential-algebraic solver to model such constrained systems directly from noisy input-output data. The approach preserves the passivity and interconnection structure of port-Hamiltonian systems while employing a backward Euler discretization with Newton's method to solve the coupled differential and algebraic equations consistently. The performance of the proposed approach is demonstrated on a DC power network, where the identified model accurately captures system behaviour and maintains errors proportional to the noise amplitude, while providing reliable parameter estimates.

2601.16812 2026-01-26 cs.LG eess.IV math.OC

Sample-wise Constrained Learning via a Sequential Penalty Approach with Applications in Image Processing

Francesca Lanzillotta, Chiara Albisani, Davide Pucci, Daniele Baracchi, Alessandro Piva, Matteo Lapucci

详情
英文摘要

In many learning tasks, certain requirements on the processing of individual data samples should arguably be formalized as strict constraints in the underlying optimization problem, rather than by means of arbitrary penalties. We show that, in these scenarios, learning can be carried out exploiting a sequential penalty method that allows to properly deal with constraints. The proposed algorithm is shown to possess convergence guarantees under assumptions that are reasonable in deep learning scenarios. Moreover, the results of experiments on image processing tasks show that the method is indeed viable to be used in practice.

2601.16793 2026-01-26 cs.SD cs.NE eess.AS

A Novel Transfer Learning Approach for Mental Stability Classification from Voice Signal

Rafiul Islam, Md. Taimur Ahad

详情
英文摘要

This study presents a novel transfer learning approach and data augmentation technique for mental stability classification using human voice signals and addresses the challenges associated with limited data availability. Convolutional neural networks (CNNs) have been employed to analyse spectrogram images generated from voice recordings. Three CNN architectures, VGG16, InceptionV3, and DenseNet121, were evaluated across three experimental phases: training on non-augmented data, augmented data, and transfer learning. This proposed transfer learning approach involves pre-training models on the augmented dataset and fine-tuning them on the non-augmented dataset while ensuring strict data separation to prevent data leakage. The results demonstrate significant improvements in classification performance compared to the baseline approach. Among three CNN architectures, DenseNet121 achieved the highest accuracy of 94% and an AUC score of 99% using the proposed transfer learning approach. This finding highlights the effectiveness of combining data augmentation and transfer learning to enhance CNN-based classification of mental stability using voice spectrograms, offering a promising non-invasive tool for mental health diagnostics.

2601.16792 2026-01-26 eess.SP

A Dynamic Parametric Simulator for Fetal Heart Sounds

Yingtong Zhou, Yiang Zhou, Zhengxian Qu, Kang Liu, Ting Tan

详情
英文摘要

Research on fetal phonocardiogram (fPCG) is challenged by the limited number of abdominal recordings, substantial maternal interference, and marked transmissioninduced signal attenuation that complicate reproducible benchmarking. We present a reproducible dynamic parametric simulator that generates long abdominal fPCG sequences by combining cycle-level fetal S1/S2 event synthesis with a convolutional transmission module and configurable interference and background noise. Model parameters are calibrated cyclewise from real abdominal recordings to capture beat-to-beat variability and to define data-driven admissible ranges for controllable synthesis. The generated signals are validated against real recordings in terms of envelope-based temporal structure and frequency-domain characteristics. The simulator is released as open software to support rapid, reproducible evaluation of fPCG processing methods under controlled acquisition conditions.

2601.16780 2026-01-26 eess.IV cs.CV

PocketDVDNet: Realtime Video Denoising for Real Camera Noise

Crispian Morris, Imogen Dexter, Fan Zhang, David R. Bull, Nantheera Anantrasirichai

详情
英文摘要

Live video denoising under realistic, multi-component sensor noise remains challenging for applications such as autofocus, autonomous driving, and surveillance. We propose PocketDVDNet, a lightweight video denoiser developed using our model compression framework that combines sparsity-guided structured pruning, a physics-informed noise model, and knowledge distillation to achieve high-quality restoration with reduced resource demands. Starting from a reference model, we induce sparsity, apply targeted channel pruning, and retrain a teacher on realistic multi-component noise. The student network learns implicit noise handling, eliminating the need for explicit noise-map inputs. PocketDVDNet reduces the original model size by 74% while improving denoising quality and processing 5-frame patches in real-time. These results demonstrate that aggressive compression, combined with domain-adapted distillation, can reconcile performance and efficiency for practical, real-time video denoising.

2601.16774 2026-01-26 cs.SD eess.AS

E2E-AEC: Implementing an end-to-end neural network learning approach for acoustic echo cancellation

Yiheng Jiang, Biao Tian, Haoxu Wang, Shengkui Zhao, Bin Ma, Daren Chen, Xiangang Li

Comments This paper has been accepted by ICASSP2026

详情
英文摘要

We propose a novel neural network-based end-to-end acoustic echo cancellation (E2E-AEC) method capable of streaming inference, which operates effectively without reliance on traditional linear AEC (LAEC) techniques and time delay estimation. Our approach includes several key strategies: First, we introduce and refine progressive learning to gradually enhance echo suppression. Second, our model employs knowledge transfer by initializing with a pre-trained LAECbased model, harnessing the insights gained from LAEC training. Third, we optimize the attention mechanism with a loss function applied on attention weights to achieve precise time alignment between the reference and microphone signals. Lastly, we incorporate voice activity detection to enhance speech quality and improve echo removal by masking the network output when near-end speech is absent. The effectiveness of our approach is validated through experiments conducted on public datasets.

2601.16764 2026-01-26 eess.SY cs.LG cs.SY math.OC

ReLU Networks for Model Predictive Control: Network Complexity and Performance Guarantees

Xingchen Li, Keyou You

详情
英文摘要

Recent years have witnessed a resurgence in using ReLU neural networks (NNs) to represent model predictive control (MPC) policies. However, determining the required network complexity to ensure closed-loop performance remains a fundamental open problem. This involves a critical precision-complexity trade-off: undersized networks may fail to capture the MPC policy, while oversized ones may outweigh the benefits of ReLU network approximation. In this work, we propose a projection-based method to enforce hard constraints and establish a state-dependent Lipschitz continuity property for the optimal MPC cost function, which enables sharp convergence analysis of the closed-loop system. For the first time, we derive explicit bounds on ReLU network width and depth for approximating MPC policies with guaranteed closed-loop performance. To further reduce network complexity and enhance closed-loop performance, we propose a non-uniform error framework with a state-aware scaling function to adaptively adjust both the input and output of the ReLU network. Our contributions provide a foundational step toward certifiable ReLU NN-based MPC.

2601.16758 2026-01-26 quant-ph cs.SY eess.SY math.OC

Noise Resilience and Robust Convergence Guarantees for the Variational Quantum Eigensolver

Mirko Legnini, Julian Berberich

详情
英文摘要

Variational Quantum Algorithms (VQAs) are a class of hybrid quantum-classical algorithms that leverage on classical optimization tools to find the optimal parameters for a parameterized quantum circuit. One relevant application of VQAs is the Variational Quantum Eigensolver (VQE), which aims at steering the output of the quantum circuit to the ground state of a certain Hamiltonian. Recent works have provided global convergence guarantees for VQEs under suitable local surjectivity and smoothness hypotheses, but little has been done in characterizing convergence of these algorithms when the underlying quantum circuit is affected by noise. In this work, we characterize the effect of different coherent and incoherent noise processes on the optimal parameters and the optimal cost of the VQE, and we study their influence on the convergence guarantees of the algorithm. Our work provides novel theoretical insight into the behavior of parameterized quantum circuits. Furthermore, we accompany our results with numerical simulations implemented via Pennylane.

2601.16733 2026-01-26 cs.CV eess.SP

Using Shadows in Circular Synthetic Aperture Sonar Imaging for Target Analysis

Yann Le Gall, Nicolas Burlet, Mathieu Simon, Fabien Novella, Samantha Dugelay, Jean-Philippe Malkasse

Journal ref Synthetic Aperture in Sonar and Radar 2023

详情
英文摘要

Circular Synthetic Aperture Sonar (CSAS) provides a 360° azimuth view of the seabed, surpassing the limited aperture and mono-view image of conventional side-scan SAS. This makes CSAS a valuable tool for target recognition in mine warfare where the diversity of point of view is essential for reducing false alarms. CSAS processing typically produces a very high-resolution two-dimensional image. However, the parallax introduced by the circular displacement of the illuminator fill-in the shadow regions, and the shadow cast by an object on the seafloor is lost in favor of azimuth coverage and resolution. Yet the shadows provide complementary information on target shape useful for target recognition. In this paper, we explore a way to retrieve shadow information from CSAS data to improve target analysis and carry 3D reconstruction. Sub-aperture filtering is used to get a collection of images at various points of view along the circular trajectory and fixed focus shadow enhancement (FFSE) is applied to obtain sharp shadows. An interactive interface is also proposed to allow human operators to visualize these shadows along the circular trajectory. A space-carving reconstruction method is applied to infer the 3D shape of the object from the segmented shadows. The results demonstrate the potential of shadows in circular SAS for improving target analysis and 3D reconstruction.

2601.16727 2026-01-26 eess.SP

Precise Low-Current Measurement Techniques for IoT Devices: A Case Study on MoleNet

Julian Block, Andreas Könsgen, Jens Dede, Anna Förster

详情
英文摘要

Power consumption is a crucial aspect of IoT devices which often have to run on a battery for an extended period of time. Therefore, supply current measurements are crucial before deploying a device in the field. Multimeters and oscilloscopes are not well suited when it comes to measuring very small currents which occur e.g. when an IoT device is in sleep mode. In this report, we compare dedicated source measurement units (SMUs) which allow to measure very small currents with high precision. As an application example, we demonstrate current measurements on our MoleNet IoT sensor board.

2601.16722 2026-01-26 eess.SY cs.SY

Model Predictive Control for Coupled Adoption-Opinion Dynamics

Martina Alutto, Qiulin Xu, Fabrizio Dabbene, Hideaki Ishii, Chiara Ravazzi

Comments 6 pages, 4 figures

详情
英文摘要

This paper investigates an optimal control problem for an adoption-opinion model that couples opinion dynamics with a compartmental adoption framework on a multilayer network to study the diffusion of sustainable behaviors. Adoption evolves through social contagion and perceived benefits, while opinions are shaped by social interactions and feedback from adoption levels. Individuals may also stop adopting virtuous behavior due to external constraints or shifting perceptions, affecting overall diffusion. After the stability analysis of equilibria, both in the presence and absence of adopters, we introduce a Model Predictive Control (MPC) framework that optimizes interventions by shaping opinions rather than directly enforcing adoption. This nudge-based control strategy allows policymakers to influence diffusion indirectly, making interventions more effective and scalable. Numerical simulations demonstrate that, in the absence of control, adoption stagnates, whereas MPC-driven interventions sustain and enhance adoption across communities.

2601.16719 2026-01-26 eess.SY cs.SY

On a Coupled Adoption-Opinion Framework for Competing Innovations

Martina Alutto, Fabrizio Dabbene, Angela Fontan, Karl H. Johansson, Chiara Ravazzi

Comments 7 pages, 3 figures

详情
英文摘要

In this paper, we propose a two-layer adoption-opinion model to study the diffusion of two competing technologies within a population whose opinions evolve under social influence and adoption-driven feedback. After adopting one technology, individuals may become dissatisfied and switch to the alternative. We prove the existence and uniqueness of the adoption-diffused equilibrium, showing that both technologies coexist and that neither partial-adoption nor monopoly can arise. Numerical simulations show that while opinions shape the equilibrium adoption levels, the relative market share between the two technologies depends solely on their user-experience. As a consequence, interventions that symmetrically boost opinions or adoption can disproportionately favor the higher-quality technology, illustrating how symmetric control actions may generate asymmetric outcomes.

2601.16682 2026-01-26 eess.SY cs.SY

Computation-Accuracy Trade-Off in Service-Oriented Model-Based Control

Hazem Ibrahim, Julius Beerwerth, Lorenz Dörschel, Bassam Alrifaee

Comments 9 pages, 9 figures

详情
英文摘要

Representing a control system as a Service-Oriented Architecture (SOA)-referred to as Service-Oriented Model-Based Control (SOMC)-enables runtime-flexible composition of control loop elements. This paper presents a framework that optimizes the computation-accuracy trade-off by formulating service orchestration as an A$^\star$search problem, complemented by Contextual Bayesian Optimization (BO) to tune the multi-objective cost weights. A vehicle longitudinal-velocity control case study demonstrates online, performancedriven reconfiguration of the control architecture. We show that our framework not only combines control and software structure but also considers the real-time requirements of the control system during performance optimization.

2601.16675 2026-01-26 cs.SD cs.LG eess.AS

I Guess That's Why They Call it the Blues: Causal Analysis for Audio Classifiers

David A. Kelly, Hana Chockler

详情
英文摘要

It is well-known that audio classifiers often rely on non-musically relevant features and spurious correlations to classify audio. Hence audio classifiers are easy to manipulate or confuse, resulting in wrong classifications. While inducing a misclassification is not hard, until now the set of features that the classifiers rely on was not well understood. In this paper we introduce a new method that uses causal reasoning to discover features of the frequency space that are sufficient and necessary for a given classification. We describe an implementation of this algorithm in the tool FreqReX and provide experimental results on a number of standard benchmark datasets. Our experiments show that causally sufficient and necessary subsets allow us to manipulate the outputs of the models in a variety of ways by changing the input very slightly. Namely, a change to one out of 240,000 frequencies results in a change in classification 58% of the time, and the change can be so small that it is practically inaudible. These results show that causal analysis is useful for understanding the reasoning process of audio classifiers and can be used to successfully manipulate their outputs.

2601.16664 2026-01-26 eess.SP eess.IV

OFDM-Based ISAC Imaging of Extended Targets via Inverse Virtual Aperture Processing

Michael Negosanti, Lorenzo Pucci, Andrea Giorgetti

Comments 6 pages; This paper was presented at the IEEE JC&S Symposium 2026

详情
英文摘要

This work investigates the performance of an integrated sensing and communication (ISAC) system exploiting inverse virtual aperture (IVA) for imaging moving extended targets in vehicular scenarios. A base station (BS) operates as a monostatic sensor using MIMO-OFDM waveforms. Echoes reflected by the target are processed through motion-compensation techniques to form an IVA range-Doppler (cross-range) image. A case study considers a 5G NR waveform in the upper mid-band, with the target model defined in 3GPP Release 19, representing a vehicle as a set of spatially distributed scatterers. Performance is evaluated in terms of image contrast (IC) and the root mean squared error (RMSE) of the estimated target-centroid range. Finally, the trade-off between sensing accuracy and communication efficiency is examined by varying the subcarrier allocation for IVA imaging. The results provide insights for designing effective sensing strategies in next-generation radio networks.

2601.16663 2026-01-26 cs.DB cs.SY eess.SY

A Categorical Approach to Semantic Interoperability across Building Lifecycle

Zoltan Nagy, Ryan Wisnesky, Kevin Carlson, Eswaran Subrahmanian, Gioele Zardini

详情
英文摘要

Buildings generate heterogeneous data across their lifecycle, yet integrating these data remains a critical unsolved challenge. Despite three decades of standardization efforts, over 40 metadata schemas now span the building lifecycle, with fragmentation accelerating rather than resolving. Current approaches rely on point-to-point mappings that scale quadratically with the number of schemas, or universal ontologies that become unwieldy monoliths. The fundamental gap is the absence of mathematical foundations for structure-preserving transformations across heterogeneous building data. Here we show that category theory provides these foundations, enabling systematic data integration with $O(n)$ specification complexity for $n$ ontologies. We formalize building ontologies as first-order theories and demonstrate two proof-of-concept implementations in Categorical Query Language (CQL): 1) generating BRICK models from IFC design data at commissioning, and 2) three-way integration of IFC, BRICK, and RealEstateCore where only two explicit mappings yield the third automatically through categorical composition. Our correct-by-construction approach treats property sets as first-class schema entities and provides automated bidirectional migrations, and enables cross-ontology queries. These results establish feasibility of categorical methods for building data integration and suggest a path toward an app ecosystem for buildings, where mathematical foundations enable reliable component integration analogous to smartphone platforms.

2601.16662 2026-01-26 eess.SP

Low-Power On-Device Gesture Recognition with Einsum Networks

Sahar Golipoor, Lingyun Yao, Martin Andraud, Stephan Sigg

详情
英文摘要

We design a gesture-recognition pipeline for networks of distributed, resource constrained devices utilising Einsum Networks. Einsum Networks are probabilistic circuits that feature a tractable inference, explainability, and energy efficiency. The system is validated in a scenario of low-power, body-worn, passive Radio Frequency Identification-based gesture recognition. Each constrained device includes task-specific processing units responsible for Received Signal Strength (RSS) and phase processing or Angle of Arrival (AoA) estimation, along with feature extraction, as well as dedicated Einsum hardware that processes the extracted features. The output of all constrained devices is then fused in a decision aggregation module to predict gestures. Experimental results demonstrate that the method outperforms the benchmark models.

2601.16660 2026-01-26 eess.IV cs.CV cs.LG

Fast, faithful and photorealistic diffusion-based image super-resolution with enhanced Flow Map models

Maxence Noble, Gonzalo Iñaki Quintana, Benjamin Aubin, Clément Chadebec

Comments Technical report

详情
英文摘要

Diffusion-based image super-resolution (SR) has recently attracted significant attention by leveraging the expressive power of large pre-trained text-to-image diffusion models (DMs). A central practical challenge is resolving the trade-off between reconstruction faithfulness and photorealism. To address inference efficiency, many recent works have explored knowledge distillation strategies specifically tailored to SR, enabling one-step diffusion-based approaches. However, these teacher-student formulations are inherently constrained by information compression, which can degrade perceptual cues such as lifelike textures and depth of field, even with high overall perceptual quality. In parallel, self-distillation DMs, known as Flow Map models, have emerged as a promising alternative for image generation tasks, enabling fast inference while preserving the expressivity and training stability of standard DMs. Building on these developments, we propose FlowMapSR, a novel diffusion-based framework for image super-resolution explicitly designed for efficient inference. Beyond adapting Flow Map models to SR, we introduce two complementary enhancements: (i) positive-negative prompting guidance, based on a generalization of classifier free-guidance paradigm to Flow Map models, and (ii) adversarial fine-tuning using Low-Rank Adaptation (LoRA). Among the considered Flow Map formulations (Eulerian, Lagrangian, and Shortcut), we find that the Shortcut variant consistently achieves the best performance when combined with these enhancements. Extensive experiments show that FlowMapSR achieves a better balance between reconstruction faithfulness and photorealism than recent state-of-the-art methods for both x4 and x8 upscaling, while maintaining competitive inference time. Notably, a single model is used for both upscaling factors, without any scale-specific conditioning or degradation-guided mechanisms.

2601.16648 2026-01-26 eess.SY cs.SY

A Cognitive Framework for Autonomous Agents: Toward Human-Inspired Design

Francesco Guidi, Jingfeng Shan, Mehrdad Saeidi, Enrico Testi, Elia Favarelli, Andrea Giorgetti, Davide Dardari, Alberto Zanella, Giorgio Li Pira, Francesca Starita, Anna Guerra

详情
英文摘要

This work introduces a human-inspired reinforcement learning (RL) architecture that integrates Pavlovian and instrumental processes to enhance decision-making in autonomous systems. While existing engineering solutions rely almost exclusively on instrumental learning, neuroscience shows that humans use Pavlovian associations to leverage predictive cues to bias behavior before outcomes occur. We translate this dual-system mechanism into a cue-guided RL framework in which radio-frequency (RF) stimuli act as conditioned (Pavlovian) cues that modulate action selection. The proposed architecture combines Pavlovian values with instrumental policy optimization, improving navigation efficiency and cooperative behavior in unknown, partially observable environments. Simulation results demonstrate that cue-driven agents adapt faster, achieving superior performance compared to traditional instrumental-solo agents. This work highlights the potential of human learning principles to reshape digital agents intelligence.

2601.16631 2026-01-26 eess.IV cs.CV eess.SP stat.AP

PanopMamba: Vision State Space Modeling for Nuclei Panoptic Segmentation

Ming Kang, Fung Fung Ting, Raphaël C. -W. Phan, Zongyuan Ge, Chee-Ming Ting

Comments 10 pages, 3 figures

详情
英文摘要

Nuclei panoptic segmentation supports cancer diagnostics by integrating both semantic and instance segmentation of different cell types to analyze overall tissue structure and individual nuclei in histopathology images. Major challenges include detecting small objects, handling ambiguous boundaries, and addressing class imbalance. To address these issues, we propose PanopMamba, a novel hybrid encoder-decoder architecture that integrates Mamba and Transformer with additional feature-enhanced fusion via state space modeling. We design a multiscale Mamba backbone and a State Space Model (SSM)-based fusion network to enable efficient long-range perception in pyramid features, thereby extending the pure encoder-decoder framework while facilitating information sharing across multiscale features of nuclei. The proposed SSM-based feature-enhanced fusion integrates pyramid feature networks and dynamic feature enhancement across different spatial scales, enhancing the feature representation of densely overlapping nuclei in both semantic and spatial dimensions. To the best of our knowledge, this is the first Mamba-based approach for panoptic segmentation. Additionally, we introduce alternative evaluation metrics, including image-level Panoptic Quality ($i$PQ), boundary-weighted PQ ($w$PQ), and frequency-weighted PQ ($fw$PQ), which are specifically designed to address the unique challenges of nuclei segmentation and thereby mitigate the potential bias inherent in vanilla PQ. Experimental evaluations on two multiclass nuclei segmentation benchmark datasets, MoNuSAC2020 and NuInsSeg, demonstrate the superiority of PanopMamba for nuclei panoptic segmentation over state-of-the-art methods. Consequently, the robustness of PanopMamba is validated across various metrics, while the distinctiveness of PQ variants is also demonstrated. Code is available at https://github.com/mkang315/PanopMamba.

2601.16606 2026-01-26 eess.SP

Assessment of Errors of Fundamental Frequency Estimation Methods in the Presence of Voltage Fluctuations and Distortions

Antonio Bracale, Pasquale De Falco, Piotr Kuwałek, Grzegorz Wiczyński

Comments 5 pages, 7 figures, submitted to IEEE conferences

详情
英文摘要

The fundamental frequency is one of the parameters that define power quality. Correctly determining this parameter under the conditions that prevail in modern power grids is crucial. Diagnostic purposes often require an efficient estimation of this parameter within short time windows. Therefore, this article presents the results of numerical simulation studies that allow the assessment of errors in various fundamental frequency estimation methods, including the standard IEC 61000-4-30 method, when the analyzed signal has a form similar to that found in modern power grids. For the purposes of this study, a test signal was adopted recreating the states of the power grid, including the simultaneous occurrence of voltage fluctuations and distortions. Conclusions are presented based on conducted research.

2601.16603 2026-01-26 cs.SD eess.AS

Omni-directional attention mechanism based on Mamba for speech separation

Ke Xue, Chang Sun, Rongfei Fan, Jing Wang, Han Hu

详情
英文摘要

Mamba, a selective state-space model (SSM), has emerged as an efficient alternative to Transformers for speech modeling, enabling long-sequence processing with linear complexity. While effective in speech separation, existing approaches, whether in the time or time-frequency domain, typically decompose the input along a single dimension into short one-dimensional sequences before processing them with Mamba, which restricts it to local 1D modeling and limits its ability to capture global dependencies across the 2D spectrogram. In this work, we propose an efficient omni-directional attention (OA) mechanism built upon unidirectional Mamba, which models global dependencies from ten different directions on the spectrogram. We expand the proposed mechanism into two baseline separation models and evaluate on three public datasets. Experimental results show that our approach consistently achieves significant performance gains over the baselines while preserving linear complexity, outperforming existing state-of-the-art (SOTA) systems.

2601.16586 2026-01-26 eess.SP cs.IT cs.LG math.IT

Learning Successive Interference Cancellation for Low-Complexity Soft-Output MIMO Detection

Benedikt Fesl, Fatih Capar

详情
英文摘要

Low-complexity multiple-input multiple-output (MIMO) detection remains a key challenge in modern wireless systems, particularly for 5G reduced capability (RedCap) and internet-of-things (IoT) devices. In this context, the growing interest in deploying machine learning on edge devices must be balanced against stringent constraints on computational complexity and memory while supporting high-order modulation. Beyond accurate hard detection, reliable soft information is equally critical, as modern receivers rely on soft-input channel decoding, imposing additional requirements on the detector design. In this work, we propose recurSIC, a lightweight learning-based MIMO detection framework that is structurally inspired by successive interference cancellation (SIC) and incorporates learned processing stages. It generates reliable soft information via multi-path hypothesis tracking with a tunable complexity parameter while requiring only a single forward pass and a minimal parameter count. Numerical results in realistic wireless scenarios show that recurSIC achieves strong hard- and soft-detection performance at very low complexity, making it well suited for edge-constrained MIMO receivers.

2601.16577 2026-01-26 eess.SP

Real-Time Evaluation of an Ultra-Tight GNSS/INS Integration Based on Adaptive PLL Bandwidth

Gaël Pages, Priot Benoît, Guillaume Beaugendre

Journal ref European Test \& Telemetry Conference (ETTC 2023), Jun 2023, Toulouse, France

详情
英文摘要

In this contribution, we propose a GNSS/INS ultra-tight coupling in which the GNSS receiver architecture is based on a vector tracking loop type architecture. In the proposed approach, the phase lock loop bandwidth is adapted according to the inertial navigation system information. The latter has the advantage to be easily implementable on a System-on-Chip component such as an FPGA (Field-Programmable Gate Arrays), and can be implemented with minor modifications on an existing GNSS receiver platform. Moreover, compared to classical vector-based solutions, the proposed architecture decodes the navigation message in the loop, without the need to run scalar loops in parallel or having to store pre-downloaded ephemeris data. This architecture therefore does not increase the area occupied on the FPGA and does not use additional resources for storage. The proposed GNSS receiver architecture uses GPS L1/C and Galileo E1 signals and is composed of one acquisition module and 16 tracking channels (8 GPS and 8 Galileo) which are implemented within a FPGA (Zynq-Ultrascale).