arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.06944 2026-02-09 eess.SY cs.LG cs.SY

Optimal Derivative Feedback Control for an Active Magnetic Levitation System: An Experimental Study on Data-Driven Approaches

Saber Omidi, Rene Akupan Ebunle, Se Young Yoon

Comments 10 pages, 9 figures. Preprint; manuscript under journal review

Abstract

This paper presents the design and implementation of data-driven optimal derivative feedback controllers for an active magnetic levitation system. A direct, model-free control design method based on the reinforcement learning framework is compared with an indirect optimal control design derived from a numerically identified mathematical model of the system. For the direct model-free approach, a policy iteration procedure is proposed, which adds an iteration layer called the epoch loop to gather multiple sets of process data, providing a more diverse dataset and helping reduce learning biases. This direct control design method is evaluated against a comparable optimal control solution designed from a plant model obtained through the combined Dynamic Mode Decomposition with Control (DMDc) and Prediction Error Minimization (PEM) system identification. Results show that while both controllers can stabilize and improve the performance of the magnetic levitation system when compared to controllers designed from a nominal model, the direct model-free approach consistently outperforms the indirect solution when multiple epochs are allowed. The iterative refinement of the optimal control law over the epoch loop provides the direct approach a clear advantage over the indirect method, which relies on a single set of system data to determine the identified model and control.

2602.06937 2026-02-09 cs.SD cs.LG eess.AS

Reciprocal Latent Fields for Precomputed Sound Propagation

Hugo Seuté, Pranai Vasudev, Etienne Richan, Louis-Xavier Buffoni

Comments Temporary pre-print, will be updated. In review at a conference

Abstract

Realistic sound propagation is essential for immersion in a virtual scene, yet physically accurate wave-based simulations remain computationally prohibitive for real-time applications. Wave coding methods address this limitation by precomputing and compressing impulse responses of a given scene into a set of scalar acoustic parameters, which can reach unmanageable sizes in large environments with many source-receiver pairs. We introduce Reciprocal Latent Fields (RLF), a memory-efficient framework for encoding and predicting these acoustic parameters. The RLF framework employs a volumetric grid of trainable latent embeddings decoded with a symmetric function, ensuring acoustic reciprocity. We study a variety of decoders and show that leveraging Riemannian metric learning leads to a better reproduction of acoustic phenomena in complex scenes. Experimental validation demonstrates that RLF maintains replication quality while reducing the memory footprint by several orders of magnitude. Furthermore, a MUSHRA-like subjective listening test indicates that sound rendered via RLF is perceptually indistinguishable from ground-truth simulations.

2602.06917 2026-02-09 eess.AS cs.LG

Automatic Detection and Analysis of Singing Mistakes for Music Pedagogy

Sumit Kumar, Suraj Jaiswal, Parampreet Singh, Vipul Arora

Comments Under Review at Transactions of Audio Speech and Language Processing

Abstract

The advancement of machine learning in audio analysis has opened new possibilities for technology-enhanced music education. This paper introduces a framework for automatic singing mistake detection in the context of music pedagogy, supported by a newly curated dataset. The dataset comprises synchronized teacher-learner vocal recordings, with annotations marking different types of mistakes made by learners. Using this dataset, we develop different deep learning models for mistake detection and benchmark them. To compare the efficacy of mistake detection systems, a new evaluation methodology is proposed. Experiments indicate that the proposed learning-based methods are superior to rule-based methods. A systematic study of errors and a cross-teacher study reveal insights into music pedagogy that can be utilised for various music applications. This work sets out new directions of research in music pedagogy. The code and dataset are publicly available.

2602.06819 2026-02-09 eess.SP cs.AI

Bridging 6G IoT and AI: LLM-Based Efficient Approach for Physical Layer's Optimization Tasks

Ahsan Mehmood, Naveed Ul Hassan, Ghassan M. Kraidy

Comments This paper is submitted to IEEE IoT Journal and is currently under review

Abstract

This paper investigates the role of large language models (LLMs) in sixth-generation (6G) Internet of Things (IoT) networks and proposes a prompt-engineering-based real-time feedback and verification (PE-RTFV) framework that performs physical-layer optimization tasks through an iterative process. By leveraging the naturally available closed-loop feedback inherent in wireless communication systems, PE-RTFV enables real-time physical-layer optimization without requiring model retraining. The proposed framework employs an optimization LLM (O-LLM) to generate task-specific structured prompts, which are provided to an agent LLM (A-LLM) to produce task-specific solutions. Utilizing real-time system feedback, the O-LLM iteratively refines the prompts to guide the A-LLM toward improved solutions in a gradient-descent-like optimization process. We test the PE-RTFV approach in a wireless-powered IoT testbed case study on user-goal-driven constellation design by semantically solving a rate-energy (RE)-region optimization problem. The results demonstrate that PE-RTFV achieves near-genetic-algorithm performance within only a few iterations, validating its effectiveness for complex physical-layer optimization tasks in resource-constrained IoT networks.

2602.06816 2026-02-09 eess.SP

On the Design of an Optimal Multi-Tone Jammer Against the Wiener Interpolation Filter

Corentin Fonteneau

Abstract

In the context of civilian and military communications, anti-jamming techniques are essential to ensure information integrity in the presence of malicious interference. A conventional time-domain approach relies on computing the Wiener interpolation filter to estimate and suppress the jamming waveform from the received samples. It is widely acknowledged that this method is effective for protecting wideband systems against narrowband interference. In this work, this paradigm is questioned through the design of a $K$-tone jamming waveform that is intrinsically difficult to estimate assuming an $L$-tap Wiener interpolation filter. This design relies on an optimization procedure that maximizes the analytical Bayesian mean squared error associated with the jamming waveform estimate. Additionally, an analytical proof is provided showing that a multi-tone jamming waveform composed of $L/2+1$ tones is sufficient to render the Wiener-filter-based anti-jamming module completely ineffective. The analytical results are validated through Monte Carlo simulations assuming both perfect knowledge and practical estimates of the correlation functions of the received signal.

2602.06780 2026-02-09 eess.SY cs.SY

UnifSrv: AP Selection for Achieving Uniformly Good Performance of CF-MIMO in Realistic Urban Networks

Yunlu Xiao, Marina Petrova, Ljiljana Simić

Abstract

Under the ideal assumption of uniform propagation, cell-free massive MIMO (CF-mMIMO) provides uniformly high throughput over the network by effectively surrounding each user with its serving access point (AP) set. However, in realistic non-uniform urban propagation environments, it is difficult to consistently select good limited serving AP sets, resulting in significantly degraded throughput and reintroducing the "edge effect" for the worst-served users. To restore the uniformly good performance of scalable CF-mMIMO in realistic urban networks, we formulate a novel multi-objective optimization problem to jointly achieve high throughput by maximizing the sum data rate, uniform throughput by maximizing Jain's fairness index of the throughput per user, and scalability by minimizing the serving AP set size. We then propose the UnifSrv AP selection algorithms to solve this optimization problem, consisting of a deep reinforcement learning (DRL)-based algorithm UnifSrv-DRL and a heuristic algorithm UnifSrv-heu. We conduct a comprehensive performance evaluation of scalable CF-mMIMO under realistic urban network distributions, propagation, and mobility patterns, showing that the prior benchmark AP selection schemes fail to provide uniformly high throughput in practice. By contrast, UnifSrv at least doubles the throughput compared to prior benchmarks, or achieves comparable throughput but with half of the serving AP set size. Importantly, our heuristic algorithm achieves equivalent throughput to our DRL one, but with orders of magnitude lower complexity. We thus for the first time propose an AP selection algorithm that achieves uniformly good CF-mMIMO performance in realistic urban networks with low complexity.
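
The fairness objective named in this abstract, Jain's fairness index, has a standard closed form: (sum of x)^2 / (n * sum of x^2), which equals 1 when every user gets the same throughput and 1/n when a single user gets everything. The sketch below only illustrates that textbook formula; the function name and the sample throughput values are hypothetical and are not taken from the paper.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(throughputs)
    if n == 0:
        raise ValueError("need at least one user")
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if sum_sq == 0:
        # Convention chosen for this sketch: an all-zero allocation counts as even.
        return 1.0
    return total * total / (n * sum_sq)


# Hypothetical per-user throughputs (arbitrary units).
equal = jain_fairness([10.0, 10.0, 10.0, 10.0])   # every user served equally
skewed = jain_fairness([40.0, 0.0, 0.0, 0.0])     # one user takes everything
print(equal, skewed)
```

Both allocations above carry the same sum rate, which is why a sum-rate objective alone cannot distinguish them and a fairness term is needed.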

2602.06761 2026-02-09 eess.IV cs.CV

Orientation-Robust Latent Motion Trajectory Learning for Annotation-free Cardiac Phase Detection in Fetal Echocardiography

Yingyu Yang, Qianye Yang, Can Peng, Elena D'Alberti, Olga Patey, Aris T. Papageorghiou, J. Alison Noble

Comments Preprint, Submitted to a journal

Abstract

Fetal echocardiography is essential for detecting congenital heart disease (CHD), facilitating pregnancy management, optimized delivery planning, and timely postnatal interventions. Among standard imaging planes, the four-chamber (4CH) view provides comprehensive information for CHD diagnosis, where clinicians carefully inspect the end-diastolic (ED) and end-systolic (ES) phases to evaluate cardiac structure and motion. Automated detection of these cardiac phases is thus a critical component toward fully automated CHD analysis. Yet, in the absence of fetal electrocardiography (ECG), manual identification of ED and ES frames remains a labor-intensive bottleneck. We present ORBIT (Orientation-Robust Beat Inference from Trajectories), a self-supervised framework that identifies cardiac phases without manual annotations under various fetal heart orientations. ORBIT employs registration as a self-supervision task and learns a latent motion trajectory of cardiac deformation, whose turning points capture transitions between cardiac relaxation and contraction, enabling accurate and orientation-robust localization of ED and ES frames across diverse fetal positions. Trained exclusively on normal fetal echocardiography videos, ORBIT achieves consistent performance on both normal (MAE = 1.9 frames for ED and 1.6 for ES) and CHD cases (MAE = 2.4 frames for ED and 2.1 for ES), outperforming existing annotation-free approaches constrained by fixed orientation assumptions. These results highlight the potential of ORBIT to facilitate robust cardiac phase detection directly from 4CH fetal echocardiography.

2602.06755 2026-02-09 eess.SP

Multi-Functional RIS-enabled Radar and Communication Coexistence: Channel Modeling and a Sub-6 GHz Indoor Measurement Campaign

Anton Tishchenko, Demos Serghiou, Hamidreza Taghvaee, Arman Shojaeifard, Ahmed Elzanaty, Gabriele Gradoni, Mohsen Khalily, Rahim Tafazolli

Abstract

In this work, we analyze a multi-functional reconfigurable intelligent surface (MF-RIS)-enabled radar and communication coexistence (RCC) system, detailing the key aspects of its phase synthesis codebook generation and the implemented localization algorithm for real-time user tracking based on density-based spatial clustering of applications with noise (DBSCAN), which features a Kalman filter for the prediction of user mobility. We derived a 3GPP-compatible radar cross-section (RCS) and re-radiation pattern-based channel model for the described MF-RIS system, supplementing it with channel measurements. We obtained large- and small-scale characteristics, including path loss, shadow fading, Rician K-factor, cluster powers, and RMS delay spread. The study finds that Sub-6 GHz indoor propagation is largely free of blind spots, even with a blocked line-of-sight (LoS) path. Therefore, the proposed channel model includes non-line-of-sight (NLoS) paths, including the ones created by the MF-RIS. We also performed an experimental evaluation of the channel throughput in a fifth generation (5G) new radio (NR) single user multiple-input-multiple-output (SU-MIMO) system, reporting a 74% reduction in throughput variance and a 12.5% sum-rate improvement within the MF-RIS near-field compared to the no-RIS setup. This result shows that the MF-RIS can minimize delay spread and increase the coherence bandwidth by creating a virtual-LoS (vLoS) path for the moving user, thereby effectively hardening wireless MIMO channels.

2602.06647 2026-02-09 cs.CL eess.AS

Reading Between the Waves: Robust Topic Segmentation Using Inter-Sentence Audio Features

Steffen Freisinger, Philipp Seeberger, Tobias Bocklet, Korbinian Riedhammer

Comments Accepted to IEEE ICASSP 2026

Abstract

Spoken content, such as online videos and podcasts, often spans multiple topics, which makes automatic topic segmentation essential for user navigation and downstream applications. However, current methods do not fully leverage acoustic features, leaving room for improvement. We propose a multi-modal approach that fine-tunes both a text encoder and a Siamese audio encoder, capturing acoustic cues around sentence boundaries. Experiments on a large-scale dataset of YouTube videos show substantial gains over text-only and multi-modal baselines. Our model also proves more resilient to ASR noise and outperforms a larger text-only baseline on three additional datasets in Portuguese, German, and English, underscoring the value of learned acoustic features for robust topic segmentation.

2602.06639 2026-02-09 eess.SY cs.RO cs.SY

Efficient and Robust Modeling of Nonlinear Mechanical Systems

Davide Tebaldi, Roberto Zanasi

Abstract

The development of efficient and robust dynamic models is fundamental in the field of systems and control engineering. In this paper, a new formulation for the dynamic model of nonlinear mechanical systems that can be applied to different automotive and robotic case studies is proposed, together with a modeling procedure that obtains the model formulation automatically. Compared with the Euler-Lagrange formulation, the proposed model is shown to give superior performance in terms of robustness against measurement noise for systems exhibiting dependence on some external variables, as well as in terms of execution time when computing the inverse dynamics of the system.

2602.06618 2026-02-09 eess.SY cs.SY

Structured Learning for Electromagnetic Field Modeling and Real-Time Inversion

Antonio Bernardes, Jasan Zughaibi, Michael Muehlebach, Bradley J. Nelson

Abstract

Precise magnetic field modeling is fundamental to the closed-loop control of electromagnetic navigation systems (eMNS) and the analytical Multipole Expansion Model (MPEM) is the current standard. However, the MPEM relies on strict physical assumptions regarding source symmetry and isolation, and requires optimization-based calibration that is highly sensitive to initialization. These constraints limit its applicability to systems with complex or irregular coil geometries. This work introduces an alternative modeling paradigm based on multi-layer perceptrons that learns nonlinear magnetic mappings while strictly preserving the linear dependence on currents. As a result, the field models enable fast, closed-form minimum-norm inversion with evaluation times of approximately 1 ms, which is critical for high-bandwidth magnetic control. For model training and evaluation we use large-scale, high-density datasets collected from the research-grade OctoMag and clinical-grade Navion systems. Our results demonstrate that data-driven models achieve predictive fidelity equivalent to the MPEM while maintaining comparable data efficiency. Furthermore, we demonstrate that straightforward design choices effectively eliminate spurious workspace ill-conditioning frequently reported in MPEM-based calibration. To facilitate future research, we release the complete codebase and datasets open source.
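
To make the "closed-form minimum-norm inversion" concrete: if a field model is linear in the coil currents, it can be written as b = A(p) i at a given position p, and the smallest-norm current vector that produces a target field is i = A^T (A A^T)^{-1} b. The sketch below shows only this generic linear-algebra step, assuming a 3 x n actuation matrix is already available; the matrix entries, the target field, and the function names are hypothetical, and nothing here reproduces the paper's network or data.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[r]) + [rhs[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("actuation matrix is rank deficient at this position")
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def min_norm_currents(A, b_target):
    """Return i = A^T (A A^T)^{-1} b, the minimum-norm solution of A i = b."""
    m, n = len(A), len(A[0])
    gram = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]
    y = solve(gram, b_target)
    return [sum(A[i][k] * y[i] for i in range(m)) for k in range(n)]


# Hypothetical actuation matrix for 4 coils at one position, and a target field.
A = [[1.0, 0.0, 1.0, 0.5],
     [0.0, 1.0, 1.0, 0.5],
     [0.5, 0.5, 0.0, 1.0]]
b_target = [0.02, -0.01, 0.03]

currents = min_norm_currents(A, b_target)
achieved = [sum(A[r][k] * currents[k] for k in range(4)) for r in range(3)]
print(currents, achieved)
```

Because the inversion is a single small linear solve per position, its cost is fixed and independent of any iterative optimizer, which is consistent with the fast evaluation the abstract emphasizes.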

2602.06602 2026-02-09 cs.SD cs.AI cs.LG eess.AS

Scaling Speech Tokenizers with Diffusion Autoencoders

Yuancheng Wang, Zhenyu Tang, Yun Wang, Arthur Hinsvark, Yingru Liu, Yinghao Li, Kainan Peng, Junyi Ao, Mingbo Ma, Mike Seltzer, Qing He, Xubo Liu

Comments ICLR 2026

Abstract

Speech tokenizers are foundational to speech language models, yet existing approaches face two major challenges: (1) balancing trade-offs between encoding semantics for understanding and acoustics for reconstruction, and (2) achieving low bit rates and low token rates. We propose Speech Diffusion Tokenizer (SiTok), a diffusion autoencoder that jointly learns semantic-rich representations through supervised learning and enables high-fidelity audio reconstruction with diffusion. We scale SiTok to 1.6B parameters and train it on 2 million hours of speech. Experiments show that SiTok outperforms strong baselines on understanding, reconstruction and generation tasks, at an extremely low token rate of $12.5$ Hz and a bit-rate of 200 bits-per-second.
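
The two rates quoted in this abstract fix the information budget per token position: 200 bits per second divided by 12.5 tokens per second is 16 bits per position. The short calculation below works that out, along with the footprint of a clip; the 10-second clip length is a hypothetical example, and the abstract does not say how the 16 bits are split across codebooks.

```python
token_rate_hz = 12.5    # token positions per second, from the abstract
bit_rate_bps = 200.0    # bits per second, from the abstract

# Bits carried by each token position.
bits_per_token = bit_rate_bps / token_rate_hz

# Footprint of a hypothetical 10-second utterance.
clip_seconds = 10.0
tokens_in_clip = token_rate_hz * clip_seconds
bytes_in_clip = bit_rate_bps * clip_seconds / 8.0

print(bits_per_token, tokens_in_clip, bytes_in_clip)
```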

2602.06569 2026-02-09 eess.SY cs.SY

Safety Controller Synthesis for Stochastic Polynomial Time-Delayed Systems

Omid Akbarzadeh, MohammadHossein Ashoori, Amy Nejati, Abolfazl Lavaei

Abstract

This work develops a theoretical framework for safety controller synthesis in discrete-time stochastic nonlinear polynomial systems subject to time-invariant delays (dt-SNPS-td). While safety analysis of stochastic systems using control barrier certificates (CBC) has been widely studied, developing safety controllers for stochastic systems with time delays remains largely unexplored. The main challenge arises from the need to account for the influence of delayed components when formulating and enforcing safety conditions. To address this, we employ Krasovskii control barrier certificates, which extend the conventional CBC framework by augmenting it with an additional summation term that captures the influence of delayed states. This formulation integrates both the current and delayed components into a unified barrier structure, enabling safety synthesis for stochastic systems with time delays. The proposed approach synthesizes safety controllers under input constraints, offering probabilistic safety guarantees robust to such delays: it ensures that all trajectories of the dt-SNPS-td remain within the prescribed safe region while fulfilling a quantified probabilistic bound. To achieve this, our method reformulates the safety constraints as a sum-of-squares optimization program, enabling the systematic construction of Krasovskii CBC together with their associated safety controllers. We validate the proposed framework through three case studies, including two physical systems, demonstrating its effectiveness and practical applicability.

2602.06380 2026-02-09 cs.RO cs.SY eess.SY

A Consistency-Improved LiDAR-Inertial Bundle Adjustment

Xinran Li, Shuaikang Zheng, Pengcheng Zheng, Xinyang Wang, Jiacheng Li, Zhitian Li, Xudong Zou

Abstract

Simultaneous Localization and Mapping (SLAM) using 3D LiDAR has emerged as a cornerstone for autonomous navigation in robotics. While feature-based SLAM systems have achieved impressive results by leveraging edge and planar structures, they often suffer from an inconsistent estimator associated with feature parameterization and estimated covariance. In this work, we present a consistency-improved LiDAR-inertial bundle adjustment (BA) with tailored parameterization and estimator. First, we propose a stereographic-projection representation parameterizing the planar and edge features, and conduct a comprehensive observability analysis to support its integrability with a consistent estimator. Second, we implement a LiDAR-inertial BA with a Maximum a Posteriori (MAP) formulation and First-Estimate Jacobians (FEJ) to preserve the accurate estimated covariance and observability properties of the system. Last, we apply our proposed BA method to LiDAR-inertial odometry.

2602.06376 2026-02-09 eess.SP

Xona Pulsar Single-Satellite Positioning: System Perspective and Experimental Validation

Thyagaraja Marathe, Tyler G. R. Reid, Srinivas Tantry, Michael O'Meara

Abstract

Xona is deploying Pulsar, a low Earth orbit (LEO) commercial navigation system designed to deliver resilient positioning, navigation, and timing (PNT) where traditional solutions fall short. Pulsar satellites broadcast dedicated signals optimized for commercial users. This brings rapid geometry change, strong Doppler observability, and robust timing, enabling new approaches to positioning even when only one satellite is visible. Internet of Things (IoT) applications often prioritize availability over sub-meter accuracy in urban canyons, semi-indoor spaces, and other constrained environments. Many platforms are battery-powered, have strict size, weight, and power (SWaP) limits, and cannot support complex multi-sensor architectures. Leveraging LEO dynamics and signal strength, Pulsar can maintain navigation capability under these conditions without specialized user hardware. Here we present a single-satellite positioning (SSP) concept that uses available Pulsar measurements to estimate user position and receiver clock states without external aiding. Early in Pulsar deployment, only one or two satellites may be in view, yet this still benefits stationary or near-stationary users, including in semi-indoor and indoor settings. We discuss algorithmic details and system implications: SSP enables positioning with minimal satellite visibility, reduces reliance on dense constellations, and supports integration into resource-constrained platforms. We present simulation and live sky results. High-fidelity constellation simulations configured for Pulsar provide controlled performance assessment. We also present early findings from a Pulsar-enabled receiver using observations from the Pulsar-0 satellite on orbit. Preliminary tests demonstrate meter-level accuracy outdoors and indoors, highlighting potential under varied reception conditions.

2602.06365 2026-02-09 eess.SY cs.LG cs.SY

Advances in Battery Energy Storage Management: Control and Economic Synergies

Venkata Rajesh Chundru, Shreshta Rajakumar Deshpande, Stanislav A Gankov

Comments Preprint

Abstract

The existing literature on Battery Energy Storage Systems (BESS) predominantly focuses on two main areas: control system design aimed at achieving grid stability and the techno-economic analysis of BESS dispatch on the power grid. However, with the increasing incorporation of ancillary services into power grids, a more comprehensive approach to energy management systems is required. Such an approach should not only optimize revenue generation from BESS but also ensure the safe, efficient, and reliable operation of lithium-ion batteries. This research seeks to bridge this gap by exploring literature that addresses both the economic and operational dimensions of BESS. Specifically, it examines how economic aspects of grid duty cycles can align with control schemes deployed in BESS systems. This alignment, or synergy, could be instrumental in creating robust digital twins (virtual representations of BESS systems) that enhance both grid stability and revenue potential. The literature review is organized into five key categories: (1) ancillary services for BESS, exploring support functions that BESS can provide to power grids; (2) control systems developed for real-time BESS power flow management, ensuring smooth operations under dynamic grid conditions; (3) optimization algorithms for BESS dispatch, focusing on efficient energy allocation strategies; (4) techno-economic analyses of BESS and battery systems to assess their financial viability; and (5) digital twin technologies for real-world BESS deployments, enabling advanced predictive maintenance and performance optimization. This review will identify potential synergies, research gaps, and emerging trends, paving the way for future innovations in BESS management and deployment strategies.

2602.06356 2026-02-09 cs.RO cs.SY eess.SY

Nipping the Drift in the Bud: Retrospective Rectification for Robust Vision-Language Navigation

Gang He, Zhenyang Liu, Kepeng Xu, Li Xu, Tong Qiao, Wenxin Yu, Chang Wu, Weiying Xie

Abstract

Vision-Language Navigation (VLN) requires embodied agents to interpret natural language instructions and navigate through complex continuous 3D environments. However, the dominant imitation learning paradigm suffers from exposure bias, where minor deviations during inference lead to compounding errors. While DAgger-style approaches attempt to mitigate this by correcting error states, we identify a critical limitation: Instruction-State Misalignment. Forcing an agent to learn recovery actions from off-track states often creates supervision signals that semantically conflict with the original instruction. In response to these challenges, we introduce BudVLN, an online framework that learns from on-policy rollouts by constructing supervision to match the current state distribution. BudVLN performs retrospective rectification via counterfactual re-anchoring and decision-conditioned supervision synthesis, using a geodesic oracle to synthesize corrective trajectories that originate from valid historical states, ensuring semantic consistency. Experiments on the standard R2R-CE and RxR-CE benchmarks demonstrate that BudVLN consistently mitigates distribution shift and achieves state-of-the-art performance in both Success Rate and SPL.

2602.06350 2026-02-09 eess.IV cs.CV

AS-Mamba: Asymmetric Self-Guided Mamba Decoupled Iterative Network for Metal Artifact Reduction

Bowen Ning, Zekun Zhou, Xinyi Zhong, Zhongzhen Wang, HongXin Wu, HaiTao Wang, Liu Shi, Qiegen Liu

Comments 10 pages, 10 figures

Abstract

Metal artifacts significantly degrade Computed Tomography (CT) image quality, impeding accurate clinical diagnosis. However, existing deep learning approaches, such as CNN and Transformer, often fail to explicitly capture the directional geometric features of artifacts, leading to compromised structural restoration. To address these limitations, we propose the Asymmetric Self-Guided Mamba (AS-Mamba) for metal artifact reduction. Specifically, the linear propagation of metal-induced streak artifacts aligns well with the sequential modeling capability of State Space Models (SSMs). Consequently, the Mamba architecture is leveraged to explicitly capture and suppress these directional artifacts. Simultaneously, a frequency domain correction mechanism is incorporated to rectify the global amplitude spectrum, thereby mitigating intensity inhomogeneity caused by beam hardening. Furthermore, to bridge the distribution gap across diverse clinical scenarios, we introduce a self-guided contrastive regularization strategy. Extensive experiments on public and clinical dental CBCT datasets demonstrate that AS-Mamba achieves superior performance in suppressing directional streaks and preserving structural details, validating the effectiveness of integrating physical geometric priors into deep network design.

2602.06313 2026-02-09 eess.SP cs.IT math.IT

Hybrid-Field Joint Channel and Visible Region Estimation for RIS-Assisted Communications

Xiaokun Tuo, Ming-Min Zhao, Xiang Wang, Changsheng You, Min-Jian Zhao

Comments 13 pages, 8 figures

Abstract

In reconfigurable intelligent surface (RIS)-assisted millimeter-wave (mmWave) communication systems, the large-scale RIS introduces pronounced geometric effects that lead to the coexistence of far-field and near-field propagation. Furthermore, random blockages induce spatial non-stationarity across the RIS array, causing signals from different scatterers to illuminate only partial regions, referred to as visible regions (VRs). This renders conventional far-field and fully visible array-based channel models inadequate and makes channel estimation particularly challenging. In this paper, we investigate the non-stationary cascaded channel estimation problem in a hybrid-field propagation environment, where the RIS-base station (BS) link operates in the far-field, while the user-RIS link exhibits near-field characteristics with partial visibility. To address the resulting high-dimensional and coupled estimation problem, a reduced-dimensional sparse bilinear representation is developed by exploiting the structural characteristics of the cascaded channel. In particular, a dictionary compression technique is proposed to represent the high-dimensional coupled dictionary using a low-dimensional polar-domain dictionary weighted by a visibility matrix, thereby significantly reducing the problem scale. Based on this representation, a turbo-structured joint Bayesian estimation (TS-JBE) approach is proposed to simultaneously estimate the channel gains, VRs, and off-grid parameters, thereby avoiding error propagation inherent in existing sequential methods. Simulation results demonstrate that the proposed method significantly improves the estimation accuracy compared with existing approaches.

2602.03868 2026-02-09 eess.AS cs.AI cs.CL cs.SD

Benchmarking Automatic Speech Recognition for Indian Languages in Agricultural Contexts

Chandrashekar M S, Vineet Singh, Lakshmi Pedapudi

Comments 9 pages, 6 figures

Abstract

The digitization of agricultural advisory services in India requires robust Automatic Speech Recognition (ASR) systems capable of accurately transcribing domain-specific terminology in multiple Indian languages. This paper presents a benchmarking framework for evaluating ASR performance in agricultural contexts across Hindi, Telugu, and Odia languages. We introduce evaluation metrics including Agriculture Weighted Word Error Rate (AWWER) and domain-specific utility scoring to complement traditional metrics. Our evaluation of 10,934 audio recordings, each transcribed by up to 10 ASR models, reveals performance variations across languages and models, with Hindi achieving the best overall performance (WER: 16.2%) while Odia presents the greatest challenges (best WER: 35.1%, achieved only with speaker diarization). We characterize audio quality challenges inherent to real-world agricultural field recordings and demonstrate that speaker diarization with best-speaker selection can substantially reduce WER for multi-speaker recordings (up to 66% depending on the proportion of multi-speaker audio). We identify recurring error patterns in agricultural terminology and provide practical recommendations for improving ASR systems in low-resource agricultural domains. The study establishes baseline benchmarks for future agricultural ASR development.
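
For readers unfamiliar with the baseline metric behind the reported numbers, plain word error rate (WER) is the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the number of reference words. The sketch below implements only that standard definition; it does not implement the paper's AWWER weighting, and the example sentences and function name are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one substitution ("urea" -> "area") and one
# deletion ("the") against a six-word reference.
wer = word_error_rate("apply urea after the first irrigation",
                      "apply area after first irrigation")
print(wer)
```

In this example the substituted word is the domain term, which plain WER counts the same as the dropped article; that equal treatment is the kind of limitation a domain-weighted metric is meant to address.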

2602.01420 2026-02-09 math.OC cs.SY eess.SY

Regret of $H_\infty$ Preview Controllers

Jietian Liu, Peter Seiler

Abstract

This paper studies preview control in both the $H_\infty$ and regret-optimal settings. The plant is modeled as a discrete-time, linear time-invariant system subject to external disturbances. The performance baseline is the optimal non-causal controller that has full knowledge of the disturbance sequence. We first review the construction of the $H_\infty$ preview controller with $p$-steps of disturbance preview. We then show that the closed-loop $H_\infty$ performance of this preview controller converges as $p\to \infty$ to the performance of the optimal non-causal controller. Furthermore, we prove that the optimal regret of the preview controller converges to zero. These results demonstrate that increasing preview length allows controllers to asymptotically achieve non-causal performance in both the $H_\infty$ and regret frameworks. A numerical example illustrates the theoretical results.

2601.21299 2026-02-09 cs.CE eess.SP stat.AP

Collective Noise Filtering in Complex Networks

Tingyu Zhao, István A. Kovács

Details
English abstract

Complex networks are powerful representations of complex systems across scales and domains, and the field is experiencing unprecedented growth in data availability. However, real-world network data often suffer from noise, biases, and missing data in edge weights, which undermine the reliability of downstream network analyses. Standard noise filtering approaches, whether treating individual edges one-by-one or assuming a uniform global noise level, are suboptimal, because in reality both signal and noise can be heterogeneous and correlated across multiple edges. As a solution, we introduce the Network Wiener Filter, a principled method for collective edge-level noise filtering that leverages both network structure and noise characteristics, to reduce error in the observed edge weights and to infer missing edge weights. We demonstrate the broad practical efficacy of the Network Wiener Filter in two distinct settings, the genetic interaction network of the budding yeast S. cerevisiae and the Enron Corpus email network, noting compelling evidence of successful noise suppression in both applications. With the Network Wiener Filter, we advocate for a shift toward error-aware network science, one that embraces data imperfection as an inherent feature and learns to navigate it effectively.

2601.02885 2026-02-09 eess.SY cs.SY q-bio.NC

A Mathematical Formalization of Self-Determining Agency

Yoshiyuki Ohmura, Earnest Kota Carr, Yasuo Kuniyoshi

Details
English abstract

Defining agency is an extremely important challenge for cognitive science and artificial intelligence. Physics generally describes mechanical happenings, but there remains an unbridgeable gap between these and the acts of agents. To discuss the morality and responsibility of agents, it is necessary to model acts; whether such responsible acts can be fully explained by physical determinism remains an ongoing debate. Although we have already proposed a physical agent determinism model that appears to go beyond mere mechanical happenings, we have not yet established a strict mathematical formalism to eliminate ambiguity. Here, we explain why a physical system can follow coarse-graining agent-level determination without violating physical laws by formulating supervenient causation. Generally, supervenience including coarse graining does not change without a change in its lower base; therefore, a single supervenience alone cannot define supervenient causation. We define supervenient causation as the causal efficacy from the supervenience level to its lower base level. Although an algebraic expression composed of the multiple supervenient functions does supervene on the base, an index sequence that determines the algebraic expression does not supervene on the base. Therefore, the sequence can possess unique dynamical laws that are independent of the lower base level. These independent dynamics create the possibility for temporally preceding changes at the supervenience level to cause changes at the lower base level. Such a dual-laws system is considered useful for modeling self-determining agents such as humans.

2601.02278 2026-02-09 eess.SY cs.SY

Multi-mode Fault Diagnosis Datasets of Three-phase Asynchronous Motor Under Variable Working Conditions

Shijin Chen, Zeyi Liu, Chenyang Li, Dongliang Zou, Xiao He, Donghua Zhou

Comments 13 pages, 9 figures

Details
English abstract

Three-phase asynchronous motors are fundamental components in industrial systems, and their failure can lead to significant operational downtime and economic losses. Vibration and current signals are effective indicators for monitoring motor health and diagnosing faults. However, motors in real applications often operate under variable conditions such as fluctuating speeds and loads, which complicate the fault diagnosis process. This paper presents a comprehensive dataset collected from a three-phase asynchronous motor under various fault types and severities, operating under diverse speed and load conditions. The dataset includes both single faults and mechanical-electrical compound faults, such as rotor unbalance, stator winding short circuits, bearing faults, and their combinations. Data were acquired under both steady and transitional conditions, with signals including triaxial vibration, three-phase currents, torque, and key-phase signals. This dataset supports the development and validation of robust fault diagnosis methods for electric motors under realistic operating conditions.

2512.22840 2026-02-09 eess.SP

Generalizable Learning for Massive MIMO CSI Feedback in Unseen Environments

Haoyu Wang, Zhi Sun, Shuangfeng Han, Xiaoyun Wang, Zhaocheng Wang

Details
English abstract

Deep learning promises to enhance the accuracy and reduce the overhead of channel state information (CSI) feedback, which can boost the capacity of frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems. Nevertheless, the generalizability of current deep learning-based CSI feedback algorithms cannot be guaranteed in unseen environments, which induces a high deployment cost. In this paper, the generalizability of deep learning-based CSI feedback is promoted with physics interpretation. Firstly, the distribution shift of the cluster-based channel is modeled, which comprises the multi-cluster structure and single-cluster response. Secondly, the physics-based distribution alignment is proposed to effectively address the distribution shift of the cluster-based channel, which comprises multi-cluster decoupling and fine-grained alignment. Thirdly, the efficiency and robustness of physics-based distribution alignment are enhanced. Specifically, an efficient multi-cluster decoupling algorithm is proposed based on the Eckart-Young-Mirsky (EYM) theorem to support real-time CSI feedback. Meanwhile, a hybrid criterion to estimate the number of decoupled clusters is designed, which enhances the robustness against channel estimation error. Fourthly, an environment-generalizable neural network for CSI feedback (EG-CsiNet) is proposed as a novel learning framework with physics-based distribution alignment. Based on extensive simulations and sim-to-real experiments in various conditions, the proposed EG-CsiNet can robustly reduce the generalization error by more than 3 dB compared to state-of-the-art methods.
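
As background on the EYM theorem mentioned above: it states that truncating the singular value decomposition gives the best rank-k approximation of a matrix in the Frobenius (and spectral) norm. The sketch below illustrates only that general fact with NumPy; it is not the paper's multi-cluster decoupling algorithm, and the function name `best_rank_k` and the random test matrix are assumptions made for illustration.

```python
import numpy as np


def best_rank_k(matrix: np.ndarray, k: int) -> np.ndarray:
    """Best rank-k approximation of `matrix` via truncated SVD (Eckart-Young-Mirsky)."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k largest singular values and their singular vectors.
    return (u[:, :k] * s[:k]) @ vh[:k, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.standard_normal((8, 6))  # stand-in for a channel matrix (assumption)
    k = 2
    approx = best_rank_k(h, k)

    # The Frobenius error equals the energy in the discarded singular values.
    singular_values = np.linalg.svd(h, compute_uv=False)
    expected_error = np.sqrt(np.sum(singular_values[k:] ** 2))
    print(np.linalg.norm(h - approx), expected_error)
```

The printed values agree up to floating-point error, which is the property that makes truncated SVD a natural building block for low-rank decoupling.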

2512.20113 2026-02-09 cs.CV eess.IV

Multi-Sensor Attention Networks for Automated Subsurface Delamination Detection in Concrete Bridge Decks

Alireza Moayedikia, Amirhossein Moayedikia

Details
English abstract

Subsurface delaminations in concrete bridge decks remain undetectable through conventional visual inspection, necessitating automated non-destructive evaluation methods. This work introduces a deep learning framework that integrates Ground Penetrating Radar (GPR) and Infrared Thermography (IRT) through hierarchical attention mechanisms. Our architecture employs temporal self-attention to process GPR electromagnetic signals, spatial attention to analyze thermal imagery, and cross-modal attention with learnable embeddings to model inter-sensor correspondences. We integrate Monte Carlo dropout-based uncertainty quantification, decomposing prediction confidence into model uncertainty and data-driven uncertainty components. Testing across five real-world bridge datasets from the SDNET2021 benchmark reveals that our approach delivers substantial performance gains over single-sensor and concatenation-based baselines when applied to balanced or moderately imbalanced data distributions. Comprehensive ablation analysis confirms that cross-modal attention mechanisms contribute meaningful improvements beyond unimodal attention alone. Critically, we identify and characterize specific failure modes: under extreme class imbalance, attention-based architectures demonstrate susceptibility to majority class bias, indicating scenarios where simpler architectural choices may prove more robust. Our findings equip practitioners with empirically grounded criteria for selecting appropriate fusion strategies based on dataset characteristics, rather than promoting universal architectural superiority.

2512.04298 2026-02-09 quant-ph cs.SY eess.SY

Experimental Sensitivity Enhancement of a Quantum Rydberg Atom-Based RF Receiver with a Metamaterial GRIN Lens

Anton Tishchenko, Demos Serghiou, Ashwin Thelappilly Joy, Paul Marsh, Paul Martin, Tim Brown, Gabriele Gradoni, Mohsen Khalily, Rahim Tafazolli

Details
English abstract

We experimentally demonstrate enhanced sensitivity of an atom-based Rydberg radio frequency (RF) receiver integrated with a gradient refractive index (GRIN) Luneburg-type metamaterial lens. By analyzing the electromagnetically induced transparency (EIT) effect in cesium vapor, we compare receiver performance with and without the GRIN lens under 2.2 GHz and 3.6 GHz far-field excitation. Our measurements reveal a significant amplification of the EIT window when the lens is introduced, consistent with the theoretical prediction that the local E-field enhancement at the vapor cell reduces the minimum detectable electric field and improves the microwave electric field measurement sensitivity of the Rydberg atom-based RF receiver over the ultrawide bandwidth of the lens. This experimental validation demonstrates the potential of metamaterial-enhanced quantum RF sensing for a wide range of applications, such as electromagnetic compatibility (EMC) testing, quantum radar, and wireless communication.

2512.01249 2026-02-09 cs.NE cs.AI cs.SY eess.SY

Pascal-Weighted Genetic Algorithms: A Binomially-Structured Recombination Framework

Otman A. Basir

Comments 23 pages, 8 figures

Details
English abstract

This paper introduces a new family of multi-parent recombination operators for Genetic Algorithms (GAs), based on normalized Pascal (binomial) coefficients. Unlike classical two-parent crossover operators, Pascal-Weighted Recombination (PWR) forms offspring as a structured convex combination of multiple parents, using binomially shaped weights that emphasize central inheritance while suppressing disruptive variance. We develop a mathematical framework for PWR, derive variance-transfer properties, and analyze its effect on schema survival. The operator is extended to real-valued, binary/logit, and permutation representations. We evaluate the proposed method on four representative benchmarks: (i) PID controller tuning evaluated using the ITAE metric, (ii) FIR low-pass filter design under magnitude-response constraints, (iii) wireless power-modulation optimization under SINR coupling, and (iv) the Traveling Salesman Problem (TSP). We demonstrate that, across these benchmarks, PWR consistently yields smoother convergence and reduced variance, and achieves 9-22% performance gains over standard recombination operators. The approach is simple, algorithm-agnostic, and readily integrable into diverse GA architectures.
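
To make the idea of binomially shaped weights concrete, here is a minimal sketch for the real-valued case. It assumes the weights for n parents are row n-1 of Pascal's triangle divided by 2^(n-1), and that the parents arrive in some meaningful order so that the central ones receive the largest weights; both the ordering convention and the function names are assumptions for illustration, not details confirmed by the abstract.

```python
from math import comb


def pascal_weights(n_parents: int) -> list[float]:
    """Normalized binomial coefficients C(n-1, i) / 2**(n-1); they sum to 1."""
    if n_parents < 1:
        raise ValueError("need at least one parent")
    total = 2 ** (n_parents - 1)
    return [comb(n_parents - 1, i) / total for i in range(n_parents)]


def pascal_weighted_offspring(parents: list[list[float]]) -> list[float]:
    """Convex combination of real-valued parents using Pascal weights.

    Hypothetical operator sketch: every parent must have the same length.
    """
    weights = pascal_weights(len(parents))
    dimension = len(parents[0])
    return [
        sum(w * parent[d] for w, parent in zip(weights, parents))
        for d in range(dimension)
    ]


if __name__ == "__main__":
    print(pascal_weights(5))  # central parent gets the largest weight
    print(pascal_weighted_offspring([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]))
```

Because the weights are non-negative and sum to one, the offspring always lies inside the convex hull of its parents, which is consistent with the variance-suppressing behavior the abstract describes.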

2510.17290 2026-02-09 eess.SY cs.SY

Enhanced Ground-Satellite Direct Access via Onboard Rydberg Atomic Quantum Receivers

Qihao Peng, Tierui Gong, Zihang Song, Qu Luo, Zihuai Lin, Pei Xiao, Chau Yuen

Comments Accepted by IEEE Wireless Communications

Details
English abstract

Ground-satellite links for 6G networks face critical challenges, including severe path loss, tight size-weight-power limits, and congested spectrum, all of which significantly hinder the performance of traditional radio frequency (RF) front ends. This article introduces the Rydberg Atomic Quantum Receiver (RAQR) for onboard satellite systems, a millimeter-scale front end that converts radio fields to optical signals through atomic electromagnetically induced transparency. RAQR's high sensitivity and high frequency selectivity address link budget, payload, and interference challenges while fitting within space constraints. A hybrid atomic-electronic design and supporting signal model demonstrate enhanced data rate, coverage, and sensing accuracy relative to conventional RF receivers. The article concludes with integration strategies, distributed-satellite concepts, and open research problems for bringing RAQR-enabled satellite payloads into service.

2510.02514 2026-02-09 eess.IV cs.CV cs.IT eess.SP math.IT stat.ML

Learning a distance measure from the information-estimation geometry of data

Guy Ohayon, Pierre-Etienne H. Fiquet, Florentin Guth, Jona Ballé, Eero P. Simoncelli

Comments ICLR 2026. Code is available at https://github.com/ohayonguy/information-estimation-metric

Details
English abstract

We introduce the Information-Estimation Metric (IEM), a novel form of distance function derived from an underlying continuous probability density over a domain of signals. The IEM is rooted in a fundamental relationship between information theory and estimation theory, which links the log-probability of a signal with the errors of an optimal denoiser, applied to noisy observations of the signal. In particular, the IEM between a pair of signals is obtained by comparing their denoising error vectors over a range of noise amplitudes. Geometrically, this amounts to comparing the score vector fields of the blurred density around the signals over a range of blur levels. We prove that the IEM is a valid global distance metric and derive a closed-form expression for its local second-order approximation, which yields a Riemannian metric. For Gaussian-distributed signals, the IEM coincides with the Mahalanobis distance. But for more complex distributions, it adapts, both locally and globally, to the geometry of the distribution. In practice, the IEM can be computed using a learned denoiser (analogous to generative diffusion models) and solving a one-dimensional integral. To demonstrate the value of our framework, we learn an IEM on the ImageNet database. Experiments show that this IEM is competitive with or outperforms state-of-the-art supervised image quality metrics in predicting human perceptual judgments.
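
For reference, the Gaussian special case mentioned above reduces to the Mahalanobis distance, d(x, y) = sqrt((x - y)^T Σ^{-1} (x - y)), where Σ is the covariance of the signal distribution. The sketch below computes only that classical quantity; it does not implement the IEM itself, and the function name and example covariance are illustrative assumptions.

```python
import numpy as np


def mahalanobis(x, y, cov) -> float:
    """Mahalanobis distance between x and y under covariance matrix `cov`."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov @ z = diff rather than forming the inverse explicitly.
    z = np.linalg.solve(np.asarray(cov, dtype=float), diff)
    return float(np.sqrt(diff @ z))


if __name__ == "__main__":
    cov = np.array([[4.0, 0.0], [0.0, 1.0]])  # assumed example covariance
    # A displacement of 2 along the high-variance axis counts as one unit.
    print(mahalanobis([2.0, 0.0], [0.0, 0.0], cov))
```

With an identity covariance the function returns the ordinary Euclidean distance, which shows how the metric adapts to the spread of the distribution along each direction.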