arXivDaily: arXiv daily academic digest, updated Monday through Friday
EESS Electrical Engineering and Systems Science 158
2603.02197 2026-03-03 cs.IT cs.NI cs.SI cs.SY eess.SP eess.SY math.IT

Characterizing Information Accuracy in Timeliness-Based Gossip Networks

Emirhan Tekez, Melih Bastopcu, Sinan Gezici

We investigate information accuracy in timeliness-based gossip networks where the source evolves according to a continuous-time Markov chain (CTMC) with $M$ states and disseminates status updates to a network of $n$ nodes. In addition to direct source updates, nodes exchange their locally stored packets via gossip and accept incoming packets solely based on whether the incoming packet is fresher than their local copy. As a result, a node can possess the freshest packet in the network while still not having the current source state. To quantify the amount of accurate information flowing in the network under such a gossiping scheme, we introduce two accuracy metrics: average accuracy, defined as the expected fraction of nodes carrying accurate information in any given subset, and freshness-based accuracy, defined as the accuracy of the freshest node in any given subset. Using a stochastic hybrid systems (SHS) framework, we first derive steady-state balance equations and obtain matrix-valued recursions that characterize these metrics in fully connected gossip networks under binary CTMCs. We then extend our analysis to the general multi-state information source using a joint CTMC approach. Finally, we quantify the fraction of nodes whose information is accurate due to direct source pushes versus gossip exchanges. We verify our findings with numerical analyses and provide asymptotic insights.
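
The acceptance rule in this abstract (keep an incoming packet only if it is fresher than the local copy) can be illustrated with a small Monte Carlo simulation. This is a minimal sketch and not the authors' SHS analysis: it assumes a symmetric binary source, a fully connected network, and exponential event times, and the function name `simulate_gossip` and all rate parameters are hypothetical.

```python
import random


def simulate_gossip(n=10, flip_rate=0.5, push_rate=1.0, gossip_rate=1.0,
                    horizon=2000.0, seed=0):
    """Estimate the time-averaged fraction of nodes holding the current source state.

    Assumptions (illustrative only): the source is a symmetric binary CTMC that
    flips at `flip_rate`; it pushes its state to one uniformly chosen node at
    total rate `push_rate`; each node gossips at `gossip_rate` to a uniformly
    chosen other node.
    """
    if n < 2:
        raise ValueError("need at least two nodes to gossip")
    rng = random.Random(seed)
    source_state = 0
    # Each node stores (generation time, source state at generation time).
    packets = [(0.0, 0) for _ in range(n)]
    t = 0.0
    weighted_accuracy = 0.0
    total_rate = flip_rate + push_rate + n * gossip_rate
    while True:
        dt = rng.expovariate(total_rate)
        accurate = sum(1 for _, state in packets if state == source_state) / n
        if t + dt >= horizon:
            weighted_accuracy += (horizon - t) * accurate
            break
        weighted_accuracy += dt * accurate
        t += dt
        u = rng.random() * total_rate
        if u < flip_rate:
            source_state = 1 - source_state          # source changes state
        elif u < flip_rate + push_rate:
            packets[rng.randrange(n)] = (t, source_state)  # direct source update
        else:
            sender = rng.randrange(n)
            receiver = rng.randrange(n - 1)
            if receiver >= sender:
                receiver += 1                        # pick a node other than the sender
            # Timeliness-based rule: accept only a strictly fresher packet.
            if packets[sender][0] > packets[receiver][0]:
                packets[receiver] = packets[sender]
    return weighted_accuracy / horizon
```

Because acceptance depends only on timestamps, a node can hold the freshest packet and still be inaccurate once the source flips, which is why the estimate falls below 1 whenever `flip_rate` is positive.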

2603.02149 2026-03-03 cs.CV eess.SP

3D Field of Junctions: A Noise-Robust, Training-Free Structural Prior for Volumetric Inverse Problems

Namhoon Kim, Narges Moeini, Justin Romberg, Sara Fridovich-Keil

Comments Code will be released soon

Volume denoising is a foundational problem in computational imaging, as many 3D imaging inverse problems face high levels of measurement noise. Inspired by the strong 2D image denoising properties of Field of Junctions (ICCV 2021), we propose a novel, fully volumetric 3D Field of Junctions (3D FoJ) representation that optimizes a junction of 3D wedges that best explain each 3D patch of a full volume, while encouraging consistency between overlapping patches. In addition to direct volume denoising, we leverage our 3D FoJ representation as a structural prior that: (i) requires no training data, and thus precludes the risk of hallucination, (ii) preserves and enhances sharp edge and corner structures in 3D, even under low signal to noise ratio (SNR), and (iii) can be used as a drop-in denoising representation via projected or proximal gradient descent for any volumetric inverse problem with low SNR. We demonstrate successful volume reconstruction and denoising with 3D FoJ across three diverse 3D imaging tasks with low-SNR measurements: low-dose X-ray computed tomography (CT), cryogenic electron tomography (cryo-ET), and denoising point clouds such as those from lidar in adverse weather. Across these challenging low-SNR volumetric imaging problems, 3D FoJ outperforms a mixture of classical and neural methods.

2603.02109 2026-03-03 eess.SP cs.LG

Orchestrating Multimodal DNN Workloads in Wireless Neural Processing

Sai Xu, Kai-Kit Wong, Yanan Du, Hyundong Shin

In edge inference, wireless resource allocation and accelerator-level deep neural network (DNN) scheduling have yet to be co-optimized in an end-to-end manner. The lack of coordination between wireless transmission and accelerator-level DNN execution prevents efficient overlap, leading to higher end-to-end inference latency. To address this issue, this paper investigates multimodal DNN workload orchestration in wireless neural processing (WNP), a paradigm that integrates wireless transmission and multi-core accelerator execution into a unified end-to-end pipeline. First, we develop a unified communication-computation model for multimodal DNN execution and formulate the corresponding optimization problem. Second, we propose O-WiN, a framework that orchestrates DNN workloads in WNP through two tightly coupled stages: simulation-based optimization and runtime execution. Third, we develop two algorithms, RTFS and PACS. RTFS schedules communication and computation sequentially, whereas PACS interleaves them to enable pipeline parallelism by overlapping wireless data transfer with accelerator-level DNN execution. Simulation results demonstrate that PACS significantly outperforms RTFS under high modality heterogeneity by better masking wireless latency through communication-computation overlap, thereby highlighting the effectiveness of communication-computation pipelining in accelerating multimodal DNN execution in WNP.
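
The difference between sequential and interleaved scheduling described in this abstract can be seen in a toy two-stage latency model. The sketch below is not RTFS or PACS; it assumes one wireless link and one accelerator, with per-task transfer and compute times given as plain lists, and both function names are hypothetical.

```python
def sequential_latency(comm, comp):
    """All transfers finish before any computation starts (no overlap)."""
    return sum(comm) + sum(comp)


def pipelined_latency(comm, comp):
    """Task i computes as soon as its data has arrived and the accelerator is free."""
    comm_done = 0.0   # time at which the link finishes the current transfer
    comp_done = 0.0   # time at which the accelerator finishes the current task
    for transfer, compute in zip(comm, comp):
        comm_done += transfer
        comp_done = max(comp_done, comm_done) + compute
    return comp_done


comm_times = [2.0, 3.0, 1.0]   # illustrative transfer times per modality
comp_times = [4.0, 2.0, 3.0]   # illustrative compute times per modality
print(sequential_latency(comm_times, comp_times))  # 15.0
print(pipelined_latency(comm_times, comp_times))   # 11.0
```

In this model the pipelined schedule hides every transfer after the first behind ongoing computation, which is the masking effect the abstract attributes to communication-computation overlap.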

2603.01997 2026-03-03 cs.CV cs.RO eess.IV

Event-Only Drone Trajectory Forecasting with RPM-Modulated Kalman Filtering

Hari Prasanth S. M., Pejman Habibiroudkenar, Eerik Alamikkotervo, Dimitrios Bouzoulas, Risto Ojala

Comments Submitted to ICUAS 2026 conference

Event cameras provide high-temporal-resolution visual sensing that is well suited for observing fast-moving aerial objects; however, their use for drone trajectory prediction remains limited. This work introduces an event-only drone forecasting method that exploits propeller-induced motion cues. Propeller rotational speeds are extracted directly from raw event data and fused within an RPM-aware Kalman filtering framework. Evaluations on the FRED dataset show that the proposed method outperforms learning-based approaches and a vanilla Kalman filter in terms of average distance error and final distance error at 0.4 s and 0.8 s forecasting horizons. The results demonstrate robust and accurate short- and medium-horizon trajectory forecasting without reliance on RGB imagery or training data.
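
The two evaluation metrics named in this abstract are not defined there. The sketch below uses the definitions that are common in trajectory forecasting (mean Euclidean error over the horizon and Euclidean error at the last step), which is an assumption about the paper; the function name `displacement_errors` is hypothetical.

```python
import math


def displacement_errors(predicted, actual):
    """Return (average distance error, final distance error) for two trajectories.

    Both inputs are equal-length sequences of coordinate tuples sampled at the
    same time steps over the forecasting horizon.
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances), distances[-1]


ade, fde = displacement_errors([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
print(ade, fde)  # 2.5 5.0
```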

2603.01970 2026-03-03 eess.SP

The Chebyshev Polynomial Series Frequency Modulation Model for Waveform Design and Analysis

Stephen P. Blackstock, Amaro Tuninetti, Dieter Vanderelst, Laura N. Kloepper, Michael R. Haberman

Comments Submitted to JASA Jan. 2026

Polynomial phase signals (PPS) are a staple of waveform design and analysis in sonar, radar, and communications fields. They also find application in the modeling of bioacoustic emissions, especially those of echolocating animals such as bats and odontocetes. This work presents a novel PPS waveform formulation that exploits some special properties of Chebyshev polynomials, such as orthogonality, recurrence relations, and equivalence to trigonometric functions. The result is the Chebyshev Polynomial Series Frequency Modulation (CPSFM) family of waveforms, which prove useful in the modeling of bioacoustic signals and the approximation of non-polynomial-phase signals such as hyperbolic chirps. We demonstrate that the CPSFM model admits compact analytic expressions for fundamental continuous-time signal processing functions such as the Fourier transform, the convolution and correlation operations, and the ambiguity function. Derivations for these expressions using CPSFM are presented, along with their application to the analysis of biosonar emissions of Mexican free-tailed bats.
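
For reference, the three Chebyshev properties this abstract lists are standard results for Chebyshev polynomials of the first kind, $T_n$. They are stated here as general background; how the CPSFM model uses them is not given in the abstract.

```latex
% Recurrence relation
T_0(x) = 1, \qquad T_1(x) = x, \qquad T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x)

% Equivalence to trigonometric functions
T_n(\cos\theta) = \cos(n\theta)

% Orthogonality on [-1, 1] with weight (1 - x^2)^{-1/2}
\int_{-1}^{1} \frac{T_m(x)\,T_n(x)}{\sqrt{1 - x^{2}}}\,dx =
\begin{cases}
0, & m \neq n \\
\pi, & m = n = 0 \\
\pi/2, & m = n \neq 0
\end{cases}
```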

2603.01931 2026-03-03 eess.SY cs.SY

A Hetero-functional Graph State Estimator for Watershed Systems: Application to the Chesapeake Bay

Megan S. Harris, John C. Little, Amro M. Farid

Regional watersheds are complex systems of systems encompassing hydrology, land-use decision-making, estuarine ecological feedbacks, and overlapping governance jurisdictions. Their effective management underlies many modern societal challenges and therefore requires models that capture interdependencies between natural and institutional systems. Region-specific models such as the Chesapeake Assessment Scenario Tool, used in this paper's case study, provide valuable nutrient estimates but rely on structurally opaque watershed routing that limits integration into broader systems-level analyses. This paper introduces a modeling framework for watershed systems. First, a region-independent reference architecture is developed. Second, the Weighted Least Squares Error Hetero-functional Graph State Estimator, an extension of Hetero-functional Graph Theory (HFGT), is adapted to estimate nutrient flows from uncertain data. The framework is demonstrated through instantiation in the Chesapeake Bay Watershed. By establishing a shared ontology grounded in Systems Modeling Language and HFGT, the approach enables integration of economic and governance systems to support sustainable watershed management.

2603.01918 2026-03-03 eess.SY cs.SY

PAC Finite-Time Safety Guarantees for Stochastic Systems with Unknown Disturbance Distributions

Taoran Wu, Dominik Wagner, C. -H. Luke Ong, Bai Xue

Comments To appear in HSCC 2026

We investigate the problem of establishing finite-time probabilistic safety guarantees for discrete-time stochastic dynamical systems subject to unknown disturbance distributions, using barrier certificate methods. Our approach develops a data-driven safety certification framework that relies only on a finite collection of independent and identically distributed (i.i.d.) disturbance samples. Within this framework, we propose a certification procedure such that, with confidence at least $1-δ$ over the sampled disturbances, if the output of the certification procedure is accepted, the probability that the system remains within a prescribed safe set over a finite horizon is at least $1-ε$. A key challenge lies in formally characterizing the probably approximately correct (PAC) generalization behavior induced by finite samples. To address this, we derive PAC generalization bounds using tools from VC dimension, scenario optimization, and Rademacher complexity. These results illuminate the fundamental trade-offs between sample size, model complexity, and safety tolerance, providing both theoretical insight and practical guidance for designing reliable, data-driven safety certificates in discrete-time stochastic systems.
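
The $(1-δ, 1-ε)$ form of the guarantee in this abstract can be illustrated with a much simpler bound than the ones the authors derive. The sketch below uses Hoeffding's inequality for a single fixed certificate, not the VC-dimension, scenario-optimization, or Rademacher bounds of the paper. The indicator variables $Z_i$ and the assumption of independent run-level samples are introduced here for illustration only.

```latex
% Assumed setup (not from the abstract): Z_i = 1 if the i-th of N independent
% sampled runs leaves the safe set within the horizon, else Z_i = 0; p = E[Z_i].
\hat{p} = \frac{1}{N}\sum_{i=1}^{N} Z_i, \qquad
\Pr\bigl(p - \hat{p} \ge t\bigr) \le \exp\!\left(-2Nt^{2}\right)
\;\Longrightarrow\;
\Pr\!\left(p \le \hat{p} + \sqrt{\frac{\ln(1/\delta)}{2N}}\right) \ge 1-\delta .

% If no violations are observed (\hat{p} = 0), then p \le \varepsilon holds
% with confidence at least 1 - \delta once
N \ge \frac{\ln(1/\delta)}{2\varepsilon^{2}} .
```

For example, $ε = 0.05$ and $δ = 0.01$ give $N \ge 922$ under this simplified bound. It already shows the trade-off between sample size and safety tolerance, while ignoring the model-complexity term that arises when the certificate itself is chosen from data.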

2603.01902 2026-03-03 eess.SY cs.SY

Dynamic Connectivity and Local Frequency Strength under Stochastic Variations

Bruno Pinheiro, Daniel Dotta

This paper introduces a novel metric, termed the Generalized Fiedler Vector (GFV), to evaluate the dynamic connectivity in power systems. The proposed metric leverages the network connectivity, represented by the system Laplacian matrix, together with the nodal inertia distribution, following a formulation previously developed by the first author. By capturing the interplay between system topology and dynamic properties, the GFV provides valuable insights for the optimal siting of stochastic generation to mitigate its impact on local and system-wide frequency variability. The effectiveness of the proposed approach is demonstrated through Monte Carlo simulations performed on the IEEE 68-bus test system.

2603.01872 2026-03-03 eess.IV

Guaranteed Image Classification via Goal-oriented Joint Semantic Source and Channel Coding

Wenchao Wu, Min Qiu, Yansha Deng, Jinhong Yuan

Comments 13 pages, submitted to IEEE TWC

To enable critical applications such as remote diagnostics, image classification must be guaranteed under bandwidth constraints and unreliable wireless channels through joint source and channel coding (JSCC) design. However, most existing JSCC methods focus on minimizing image distortion, implicitly assuming that all image regions contribute equally to classification performance, thereby overlooking their varying importance for the task. In this paper, we propose a goal-oriented joint semantic source and channel coding (G-JSSCC) framework that applies various levels of source coding compression and channel coding protection across image regions based on their semantic importance. Specifically, we design a semantic information extraction method that identifies and ranks various image regions based on their contributions to classification, where the contribution is measured by the Shapley value from explainable artificial intelligence (AI). Based on that, we design a semantic source coding and a semantic channel coding method, which allocates higher-quality compression and stronger error protection to image regions of great semantic importance. In addition, we define a new metric, termed coding efficiency, to evaluate the effectiveness of the source and channel coding in the classification task. Simulations show that our proposed G-JSSCC framework improves classification probability by 2.70 times, reduces transmission cost by 38%, and enhances coding efficiency by 5.91 times, compared to the benchmark scheme using uniform compression and an idealized channel code to uniformly protect the whole image.

2603.01855 2026-03-03 eess.SP

Quantum-PROBE: Rydberg Atomic Receiver-Based Multi-AoA Estimation with RF Lens

Hong-Bae Jeon, Kaibin Huang, Chan-Byoung Chae

Comments 13 pages, 12 figures

This paper presents the Quantum-Power pROfile Based Estimation (PROBE) framework, a Rydberg Atomic Receiver (RARE)-based multi-user angle-of-arrival (AoA) estimation approach equipped with a radio-frequency (RF) lens front end. We establish a physics-consistent analytical model showing that magnitude-only RARE measurements, processed via the beam-propagation method (BPM) and snapshot-wise power accumulation, can be rigorously characterized as a nonnegative superposition of AoA-dependent, lens-induced spatial power profiles. This formulation reveals a structured and interpretable power-domain dictionary that enables multi-user AoA recovery without explicit phase reconstruction. Building on this foundation, we develop two complementary recovery strategies: (i) a principled non-negative least absolute shrinkage and selection operator (NN-LASSO)-based solver that estimates a sparse nonnegative angular representation via an accelerated proximal-gradient method followed by cluster-based AoA decoding, and (ii) a low-complexity successive interference cancellation (SIC) algorithm that iteratively identifies and removes dominant power-profile components through cosine-similarity matching. Simulation results demonstrate that the proposed Quantum-PROBE framework consistently outperforms representative RARE- and RF-based benchmarks across diverse system configurations, while offering a clear accuracy-complexity tradeoff between the NN-LASSO and SIC variants for practical quantum sensing deployments.
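
The nonnegative sparse recovery step in this abstract can be sketched with a plain proximal-gradient loop. This is a simplified illustration, not the authors' solver: it omits the acceleration and the cluster-based AoA decoding, and the dictionary `A`, the penalty `lam`, the step size, and the function name `nn_lasso` are all hypothetical. It minimizes $\tfrac12\|Ax-y\|^2 + λ\sum_j x_j$ subject to $x \ge 0$.

```python
def nn_lasso(A, y, lam, step, iters=1000):
    """Non-accelerated proximal gradient for nonnegative LASSO.

    A is a list of rows (m x n), y has length m. `step` should not exceed
    1 / (largest eigenvalue of A^T A) for the iteration to converge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y and gradient g = A^T r of the least-squares term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Prox of lam * sum(x) plus the constraint x >= 0: shift, then clip at zero.
        x = [max(0.0, x[j] - step * (g[j] + lam)) for j in range(n)]
    return x


# Toy power-profile dictionary: each column overlaps with its neighbour.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]
y = [2.0, 1.5, 1.0]   # generated from weights [2, 0, 1] without noise
print(nn_lasso(A, y, lam=0.01, step=0.3))
```

On this toy input the estimate stays close to the generating weights, with a small bias of order `lam` introduced by the penalty.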

2603.01850 2026-03-03 cs.RO cs.CV cs.SY eess.SY

Tiny-DroNeRF: Tiny Neural Radiance Fields aboard Federated Learning-enabled Nano-drones

Ilenia Carboni, Elia Cereda, Lorenzo Lamberti, Daniele Malpetti, Francesco Conti, Daniele Palossi

Comments This paper has been accepted for publication in the IEEE ICRA 2026 conference. ©2026 IEEE

Sub-30g nano-sized aerial robots can leverage their agility and form factor to autonomously explore cluttered and narrow environments, such as in industrial inspection and search and rescue missions. However, the price for their tiny size is a strong limit in their resources, i.e., sub-100 mW microcontroller units (MCUs) delivering $\sim$100 GOps/s at best, and memory budgets well below 100 MB. Despite these strict constraints, we aim to enable complex vision-based tasks aboard nano-drones, such as dense 3D scene reconstruction: a key robotic task underlying fundamental capabilities like spatial awareness and motion planning. Top-performing 3D reconstruction methods leverage neural radiance fields (NeRF) models, which require GBs of memory and massive computation, usually delivered by high-end GPUs consuming hundreds of watts. Our work introduces Tiny-DroNeRF, a lightweight NeRF model, based on Instant-NGP, and optimized for running on a GAP9 ultra-low-power (ULP) MCU aboard our nano-drones. Then, we further empower our Tiny-DroNeRF by leveraging a collaborative federated learning scheme, which distributes the model training among multiple nano-drones. Our experimental results show a 96% reduction in Tiny-DroNeRF's memory footprint compared to Instant-NGP, with only a 5.7 dB drop in reconstruction accuracy. Finally, our federated learning scheme allows Tiny-DroNeRF to train with an amount of data otherwise impossible to keep in a single drone's memory, increasing the overall reconstruction accuracy. Ultimately, our work combines, for the first time, NeRF training on a ULP MCU with federated learning on nano-drones.

2602.02734 2026-03-03 eess.AS cs.AI cs.CL

WAXAL: A Large-Scale Multilingual African Language Speech Corpus

Abdoulaye Diack, Perry Nelson, Kwaku Agbesi, Angela Nakalembe, MohamedElfatih MohamedKhair, Vusumuzi Dube, Tavonga Siyavora, Subhashini Venugopalan, Jason Hickey, Uche Okonkwo, Abhishek Bapna, Isaac Wiafe, Raynard Dodzi Helegah, Elikem Doe Atsakpo, Charles Nutrokpor, Fiifi Baffoe Payin Winful, Kafui Kwashie Solaga, Jamal-Deen Abdulai, Akon Obu Ekpezu, Audace Niyonkuru, Samuel Rutunda, Boris Ishimwe, Michael Melese, Engineer Bainomugisha, Joyce Nakatumba-Nabende, Andrew Katumba, Claire Babirye, Jonathan Mukiibi, Vincent Kimani, Samuel Kibacia, James Maina, Fridah Emmah, Ahmed Ibrahim Shekarau, Ibrahim Shehu Adamu, Yusuf Abdullahi, Howard Lakougna, Bob MacDonald, Hadar Shemtov, Aisha Walcott-Bryant, Moustapha Cisse, Avinatan Hassidim, Jeff Dean, Yossi Matias

Comments Initial dataset release with added TTS, some more to come

The advancement of speech technology has predominantly favored high-resource languages, creating a significant digital divide for speakers of most Sub-Saharan African languages. To address this gap, we introduce WAXAL, a large-scale, openly accessible speech dataset for 24 languages representing over 100 million speakers. The collection consists of two main components: an Automated Speech Recognition (ASR) dataset containing approximately 1,250 hours of transcribed, natural speech from a diverse range of speakers, and a Text-to-Speech (TTS) dataset with around 235 hours of high-quality, single-speaker recordings reading phonetically balanced scripts. This paper details our methodology for data collection, annotation, and quality control, which involved partnerships with four African academic and community organizations. We provide a detailed statistical overview of the dataset and discuss its potential limitations and ethical considerations. The WAXAL datasets are released at https://huggingface.co/datasets/google/WaxalNLP under the permissive CC-BY-4.0 license to catalyze research, enable the development of inclusive technologies, and serve as a vital resource for the digital preservation of these languages.

2512.14450 2026-03-03 eess.SY cs.RO cs.SY

Nonlinear System Identification Nano-drone Benchmark

Riccardo Busetto, Elia Cereda, Marco Forgione, Gabriele Maroni, Dario Piga, Daniele Palossi

We introduce a benchmark for system identification based on 75k real-world samples from the Crazyflie 2.1 Brushless nano-quadrotor, a sub-50g aerial vehicle widely adopted in robotics research. The platform presents a challenging testbed due to its multi-input, multi-output nature, open-loop instability, and nonlinear dynamics under agile maneuvers. The dataset comprises four aggressive trajectories with synchronized 4-dimensional motor inputs and 13-dimensional output measurements. To enable fair comparison of identification methods, the benchmark includes a suite of multi-horizon prediction metrics for evaluating both one-step and multi-step error propagation. In addition to the data, we provide a detailed description of the platform and experimental setup, as well as baseline models highlighting the challenge of accurate prediction under real-world noise and actuation nonlinearities. All data, scripts, and reference implementations are released as open-source at https://github.com/idsia-robotics/nanodrone-sysid-benchmark to facilitate transparent comparison of algorithms and support research on agile, miniaturized aerial robotics.

2512.12046 2026-03-03 cs.LG cs.RO cs.SY eess.SY stat.ML

Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning

Vittorio Giammarino, Ahmed H. Qureshi

Goal-Conditioned Reinforcement Learning (GCRL) mitigates the difficulty of reward design by framing tasks as goal reaching rather than maximizing hand-crafted reward signals. In this setting, the optimal goal-conditioned value function naturally forms a quasimetric, motivating Quasimetric RL (QRL), which constrains value learning to quasimetric mappings and enforces local consistency through discrete, trajectory-based constraints. We propose Eikonal-Constrained Quasimetric RL (Eik-QRL), a continuous-time reformulation of QRL based on the Eikonal Partial Differential Equation (PDE). This PDE-based structure makes Eik-QRL trajectory-free, requiring only sampled states and goals, while improving out-of-distribution generalization. We provide theoretical guarantees for Eik-QRL and identify limitations that arise under complex dynamics. To address these challenges, we introduce Eik-Hierarchical QRL (Eik-HiQRL), which integrates Eik-QRL into a hierarchical decomposition. Empirically, Eik-HiQRL achieves state-of-the-art performance in offline goal-conditioned navigation and yields consistent gains over QRL in manipulation tasks, matching temporal-difference methods.

2511.13690 2026-03-03 eess.SY cs.SY

Novel Stability Criteria for Discrete and Hybrid Systems via Ramanujan Inner Products

Shyam Kamal, Sunidhi Pandey, Thach Ngoc Dinh, Cao Thanh Tinh

Comments 14 pages, 2 figures

This paper introduces a Ramanujan inner product and its corresponding norm, establishing a novel framework for the stability analysis of hybrid and discrete-time systems as an alternative to traditional Euclidean metrics. We establish new $ε$-$δ$ stability conditions that utilize the unique properties of Ramanujan summations and their relationship with number-theoretic concepts. The proposed approach provides enhanced robustness guarantees and reveals fundamental connections between system stability and arithmetic properties of the system dynamics. Theoretical results are rigorously proven, and simulation results on numerical examples are presented to validate the efficacy of the proposed approach.

2511.13171 2026-03-03 eess.SP

Autonomous Sensing UAV for Accurate Multi-User Identification and Localization in LAWN

Niccolò Paglierani, Francesco Linsalata, Vineeth Teeda, Davide Scazzoli, Maurizio Magarini

This paper presents an autonomous sensing framework for identifying and localizing multiple users in Fifth Generation (5G) cooperative networks using an Unmanned Aerial Vehicle (UAV) that is not part of the serving access network. Unlike conventional aerial serving nodes, the proposed UAV operates passively and is dedicated solely to sensing. Passively receiving Uplink (UL) Sounding Reference Signals (SRS), the UAV requires only minimal initial coordination with the network infrastructure during the mission. A complete signal processing chain is proposed and developed, encompassing synchronization, user identification, and localization, all executed onboard the UAV during flight. The system autonomously plans and adapts its mission workflow to estimate multiple user positions within a single deployment, integrating flight control with real-time sensing. The approach is validated through extensive simulations and a full-scale low-altitude experimental campaign. Urban simulation scenarios show localization errors below 8 m, while rural field tests achieve errors below 3 m, with reliable synchronization and user identification ensured in both cases. The results confirm the feasibility of infrastructure-independent sensing UAVs as a core element of the emerging Low Altitude Economy (LAE), supporting situational awareness and rapid deployment in emergency or connectivity-limited environments.

2508.19739 2026-03-03 eess.SP cs.ET

Molecular Communication for Gastroretentive Drug Delivery

Sebastian Lotter, Marco Seiter, Maryam Pirmoradi, Lukas Brand, Dagmar Fischer, Robert Schober

Comments 6 pages, 2 figures, This paper has been accepted as Transactions Letter at IEEE Transactions on Molecular, Biological, and Multi-Scale Communications

Recently, bacterial nanocellulose (BNC), a biological material produced by non-pathogenic bacteria that possesses excellent material properties for various medical applications, has received increased interest as a carrier system for drug delivery. However, the vast majority of existing studies on drug release from BNC are feasibility studies with modeling and design aspects remaining largely unexplored. To narrow this research gap, this paper proposes a novel model for the drug release from BNC. Specifically, the drug delivery system considered in this paper consists of a BNC fleece coated with a polymer. The polymer coating is used as an additional diffusion barrier, enabling the controlled release of an active pharmaceutical ingredient. The proposed physics-based model reflects the geometry of the BNC and incorporates the impact of the polymer coating on the drug release. Hence, it can be useful for designing BNC-based drug delivery systems in the future. The accuracy of the model is validated with experimental data obtained in wet lab experiments.

2508.07757 2026-03-03 eess.AS cs.SD

Score-Informed Transformer for Refining MIDI Velocity in Automatic Music Transcription

Zhanhong He, Roberto Togneri, David Huang

Comments Submitted to SMC2026 Conference

详情
英文摘要

MIDI velocity is crucial for capturing expressive dynamics in human performances. In practical scenarios, a music score with inaccurate velocities may be available alongside the performance audio (e.g., music education and free online archives), enabling the task of score-informed MIDI velocity estimation. In this work, we propose a modular, lightweight score-informed Transformer correction module that refines the velocity estimates of Automatic Music Transcription (AMT) systems. We integrate the proposed module into multiple AMT systems (HPT, HPPNet, and DynEst). Trained exclusively on the MAESTRO training split, our method consistently reduces velocity estimation errors on MAESTRO and improves cross-dataset generalization to the SMD and MAPS datasets. Under this training protocol, integrating our score-informed module with HPT (named Score-HPT) establishes new state-of-the-art performance, outperforming existing score-informed methods and velocity-enabled AMT systems while adding only 1 M parameters.

2501.15849 2026-03-03 eess.SY cs.LG cs.SY

Data-Driven Prediction and Control of Hammerstein-Wiener Systems with Implicit Gaussian Processes

Mingzhou Yin, Matthias A. Müller

This work investigates data-driven prediction and control of Hammerstein-Wiener systems using physics-informed Gaussian process (GP) models that encode the block-oriented model structure. Data-driven prediction algorithms have been developed for structured nonlinear systems based on Willems' fundamental lemma. However, existing frameworks do not apply to output nonlinearities in Wiener systems and rely on a finite-dimensional dictionary of basis functions for Hammerstein systems. In this work, an implicit predictor structure is considered, leveraging the linearity for the dynamical part of the model. This implicit function is learned by GP regression, utilizing carefully designed structured kernel functions from linear model parameters and GP priors for the nonlinearities. Virtual derivative points are added to the regression by expectation propagation to encode monotonicity information of the nonlinearities. The linear model parameters are estimated as hyperparameters by assuming a stable spline hyperprior. The implicit GP model provides explicit output prediction by optimizing selected optimality criteria. The implicit model is also applied to receding horizon control with the expected control cost and chance constraint satisfaction guarantee. Numerical results demonstrate that the proposed prediction and control algorithms are superior to black-box GP models without model structure knowledge.

2501.15246 2026-03-03 eess.IV

CryoLithe: Rapid Cryo-ET Reconstruction via Transform-Localized Deep Learning

Vinith Kishore, Valentin Debarnot, AmirEhsan Khorashadizadeh, Ricardo D. Righetto, Benjamin D. Engel, Ivan Dokmanić

Cryo-electron tomography (cryo-ET) enables 3D visualization of cellular structures. Accurate reconstruction of high-resolution volumes is complicated by the very low signal-to-noise ratio and a restricted range of sample tilts. Recent self-supervised deep learning approaches, which post-process initial reconstructions by filtered backprojection (FBP), have significantly improved reconstruction quality with respect to signal processing iterative algorithms, but they are slow, taking dozens of hours for an expert to reconstruct a tomogram, and demand large amounts of memory. We present CryoLithe, an end-to-end network that directly estimates the volume from an aligned tilt series. CryoLithe achieves denoising and missing wedge correction comparable to or better than state-of-the-art self-supervised deep learning approaches such as Icecream, Cryo-CARE, IsoNet or DeepDeWedge, while being two orders of magnitude faster. To achieve this, we implement a local, memory-efficient reconstruction network. We demonstrate that leveraging transform-domain locality makes our network robust to distribution shifts, enabling effective supervised training and giving excellent results on real data without retraining or fine-tuning. CryoLithe reconstructions facilitate downstream cryo-ET analysis, including segmentation and subtomogram averaging, and the method is openly available: https://github.com/swing-research/CryoLithe.

2307.14025 2026-03-03 cs.LG cs.CV eess.IV q-bio.QM stat.ML

Topological Inductive Bias fosters Multiple Instance Learning in Data-Scarce Scenarios

Salome Kazeminia, Carsten Marr, Bastian Rieck

Journal ref: Transactions on Machine Learning Research, 2026

Multiple instance learning (MIL) is a framework for weakly supervised classification, where labels are assigned to sets of instances, i.e., bags, rather than to individual data points. This paradigm has proven effective in tasks where fine-grained annotations are unavailable or costly to obtain. However, the effectiveness of MIL drops sharply when training data are scarce, such as for rare disease classification. To address this challenge, we propose incorporating topological inductive biases into the data representation space within the MIL framework. This bias introduces a topology-preserving constraint that encourages the instance encoder to maintain the topological structure of the instance distribution within each bag when mapping them to MIL latent space. As a result, our Topology Guided MIL (TG-MIL) method enhances the performance and generalizability of MIL classifiers across different aggregation functions, especially under scarce-data regimes. Our evaluations show average performance improvements of 15.3% for synthetic MIL datasets, 2.8% for MIL benchmarks, and 5.5% for rare anemia classification compared to current state-of-the-art MIL models, where only 17-120 samples per class are available. We make our code publicly available.

2603.01840 2026-03-03 cs.CV eess.IV

FireRed-OCR Technical Report

Hao Wu, Haoran Lou, Xinyue Li, Zuodong Zhong, Zhaojun Sun, Phellon Chen, Xuanhe Zhou, Kai Zuo, Yibo Chen, Xu Tang, Yao Hu, Boxiang Zhou, Jian Wu, Yongji Wu, Wenxin Yu, Yingmiao Liu, Yuhao Huang, Manjie Xu, Gang Liu, Yidong Ma, Zhichao Sun, Changhao Qiao

We present FireRed-OCR, a systematic framework to specialize general VLMs into high-performance OCR models. Large Vision-Language Models (VLMs) have demonstrated impressive general capabilities but frequently suffer from "structural hallucination" when processing complex documents, limiting their utility in industrial OCR applications. In this paper, we introduce FireRed-OCR, a novel framework designed to transform general-purpose VLMs (based on Qwen3-VL) into pixel-precise structural document parsing experts. To address the scarcity of high-quality structured data, we construct a "Geometry + Semantics" Data Factory. Unlike traditional random sampling, our pipeline leverages geometric feature clustering and multi-dimensional tagging to synthesize and curate a highly balanced dataset, effectively handling long-tail layouts and rare document types. Furthermore, we propose a Three-Stage Progressive Training strategy that guides the model from pixel-level perception to logical structure generation. This curriculum includes: (1) Multi-task Pre-alignment to ground the model's understanding of document structure; (2) Specialized SFT for standardizing full-image Markdown output; and (3) Format-Constrained Group Relative Policy Optimization (GRPO), which utilizes reinforcement learning to enforce strict syntactic validity and structural integrity (e.g., table closure, formula syntax). Extensive evaluations on OmniDocBench v1.5 demonstrate that FireRed-OCR achieves state-of-the-art performance with an overall score of 92.94%, significantly outperforming strong baselines such as DeepSeek-OCR 2 and OCRVerse across text, formula, table, and reading order metrics. We open-source our code and model weights to facilitate the "General VLM to Specialized Structural Expert" paradigm.

2603.01831 2026-03-03 eess.SY cs.SY

Critical Clearing Time Enhancement of Droop-Controlled Grid-Forming Inverters with Adaptive Function-Based Parameters

Dewan Mahnaaz Mahmud, Vinu Thomas, Bogdan Marinescu, Mickaël Hilairet

Comments 5 pages, 7 figures

Abstract

With the increasing penetration of renewable energy sources, grid-forming (GFM) inverters are becoming essential for voltage and frequency regulation. However, the transient stability of GFM inverters is critically affected by the current limiters that are embedded in the standard control schemes. This paper proposes a novel adaptive function to enhance the transient stability of droop-controlled GFM inverters. The proposed method autonomously adjusts the active power reference and the droop gain based on the terminal voltage of the inverter. It also prevents acceleration of the phase angle, which maximizes the critical clearing time (CCT). The proposed method is benchmarked against two state-of-the-art GFM inverter CCT enhancement methods. The effectiveness of the proposed method is validated through electromagnetic transient (EMT) simulations in MATLAB/Simulink®.
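
For orientation, the standard active-power droop law can be written with the two adapted quantities made explicit functions of the terminal voltage. The symbols below ($\omega_0$, $\omega_g$, $m_p$, $P^{*}$, $V$, $\delta$) are assumed notation, and the abstract does not give the form of the adaptive functions $m_p(V)$ and $P^{*}(V)$, so they are left unspecified.

```latex
\begin{align}
  \omega       &= \omega_0 + m_p(V)\,\bigl(P^{*}(V) - P\bigr), \\
  \dot{\delta} &= \omega - \omega_g ,
\end{align}
```

Here $P$ is the measured active power, $\omega_g$ the grid frequency, and $\delta$ the phase angle of the inverter relative to the grid. During a fault $P$ collapses, so with a fixed $P^{*}$ and $m_p$ the term $P^{*} - P$ is positive and $\delta$ accelerates; shrinking $P^{*}(V)$ or $m_p(V)$ as $V$ sags reduces that acceleration.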

2603.01810 2026-03-03 eess.IV

Near-Field Focusing Operators for Planar Multi-Static Microwave Imaging Using Back-Projection in the Spatial Domain

Matthias M. Saurer, Marius Brinkmann, Han Na, Quanfeng Wang, Thomas Eibert

Comments This article has been accepted for publication in IEEE. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.23919/EuCAP63536.2025.10999865. Copyright © 2025 IEEE

Abstract

Based on a plane-wave expansion of the observation data in quasi-planar multi-static scattering scenarios, an improved formalism for image creation utilizing back-projection in the spatial domain is derived. The underlying integral expressions for different focusing operators are derived analytically, leading to magnitude correction factors, which are mostly relevant for reconstructing microwave images when the distance from the scattering object to the aperture plane is small. It is shown that the derived imaging procedure is superior to traditional back-projection, which only compensates the phase delay of the measurement signals, and the findings are validated with simulated as well as measured data. Since the derived focusing operators correspond to a low-pass filtering of the spatial images, the resulting modified multi-static back-projection algorithms effectively suppress imaging artifacts as well.
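
As a point of reference, the traditional phase-only back-projection that the abstract uses as its baseline can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (ideal point scatterer, unit amplitude, no spreading loss, a small hypothetical linear array), and it does not include the magnitude correction factors derived in the paper. All names (`back_project`, `simulate_point_target`) and the geometry are made up for the example.

```python
import cmath
import math

C0 = 299_792_458.0  # speed of light in m/s


def back_project(measurements, pixel):
    """Coherently sum all measurements after compensating the phase delay
    of the path transmitter -> pixel -> receiver (phase-only focusing)."""
    total = 0j
    for tx, rx, freq, value in measurements:
        k = 2.0 * math.pi * freq / C0
        path = math.dist(tx, pixel) + math.dist(pixel, rx)
        total += value * cmath.exp(1j * k * path)
    return total


def simulate_point_target(antennas, freqs, target):
    """Idealized multi-static data for one point scatterer (unit amplitude)."""
    data = []
    for tx in antennas:
        for rx in antennas:
            for f in freqs:
                k = 2.0 * math.pi * f / C0
                path = math.dist(tx, target) + math.dist(target, rx)
                data.append((tx, rx, f, cmath.exp(-1j * k * path)))
    return data


# Hypothetical setup: 9 antennas on the x axis in the plane z = 0.
antennas = [(i * 0.05, 0.0, 0.0) for i in range(-4, 5)]
freqs = [8e9 + i * 0.5e9 for i in range(5)]
target = (0.05, 0.0, 0.2)
data = simulate_point_target(antennas, freqs, target)

# Image one line of pixels at the target depth and locate the brightest one.
grid = [(ix * 0.025, 0.0, 0.2) for ix in range(-8, 9)]
image = [abs(back_project(data, p)) for p in grid]
peak = grid[max(range(len(grid)), key=image.__getitem__)]
```

At the true scatterer position every term adds in phase, so the magnitude reaches the number of measurements; elsewhere the terms partially cancel.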

2603.01790 2026-03-03 eess.SP

Control Plane for Reconfigurable Intelligent Surfaces

Fabio Saggese, Victor Croisfelt, Kyriakos Stylianopoulos, George C. Alexandropoulos, Petar Popovski

Comments Published in Communication Standards Magazine

Abstract

Research on reconfigurable intelligent surfaces (RISs) has predominantly focused on purely physical (PHY)-layer aspects, particularly on how signals are dynamically shaped by a controllable wireless propagation environment. However, integrating RISs as system-level network elements requires the development of an RIS-compatible control plane. In this article, we explore design options for such a control plane across two key dimensions: i) the allocation of spectral resources for the control plane (in- or out-of-band), and ii) the rate selection for the data plane (multiplexing or diversity). While our analysis is necessarily simplified, it reveals the fundamental trade-offs inherent in these design choices, which are crucial for integrating RIS technology into future networks.

2603.01767 2026-03-03 cs.CV eess.IV

Downstream Task Inspired Underwater Image Enhancement: A Perception-Aware Study from Dataset Construction to Network Design

Bosen Lin, Feng Gao, Yanwei Yu, Junyu Dong, Qian Du

Comments Accepted for publication in IEEE TIP 2026

Abstract

In real underwater environments, downstream image recognition tasks such as semantic segmentation and object detection often face challenges posed by problems like blurring and color inconsistencies. Underwater image enhancement (UIE) has emerged as a promising preprocessing approach, aiming to improve the recognizability of targets in underwater images. However, most existing UIE methods mainly focus on enhancing images for human visual perception, frequently failing to reconstruct high-frequency details that are critical for task-specific recognition. To address this issue, we propose a Downstream Task-Inspired Underwater Image Enhancement (DTI-UIE) framework, which leverages a human visual perception model to enhance images effectively for underwater vision tasks. Specifically, we design an efficient two-branch network with a task-aware attention module for feature mixing. The network benefits from a multi-stage training framework and a task-driven perceptual loss. Additionally, inspired by human perception, we automatically construct a Task-Inspired UIE Dataset (TI-UIED) using various task-specific networks. Experimental results demonstrate that DTI-UIE significantly improves task performance by generating preprocessed images that are beneficial for downstream tasks such as semantic segmentation, object detection, and instance segmentation. The code is publicly available at https://github.com/oucailab/DTIUIE.

2603.01751 2026-03-03 cs.RO cs.AI cs.LG cs.SY eess.SY

Shape-Interpretable Visual Self-Modeling Enables Geometry-Aware Continuum Robot Control

Peng Yu, Xin Wang, Ning Tan

Abstract

Continuum robots possess high flexibility and redundancy, making them well suited for safe interaction in complex environments, yet their continuous deformation and nonlinear dynamics pose fundamental challenges to perception, modeling, and control. Existing vision-based control approaches often rely on end-to-end learning, achieving shape regulation without explicit awareness of robot geometry or its interaction with the environment. Here, we introduce a shape-interpretable visual self-modeling framework for continuum robots that enables geometry-aware control. Robot shapes are encoded from multi-view planar images using a Bezier-curve representation, transforming visual observations into a compact and physically meaningful shape space that uniquely characterizes the robot's three-dimensional configuration. Based on this representation, neural ordinary differential equations are employed to self-model both shape and end-effector dynamics directly from data, enabling hybrid shape-position control without analytical models or dense body markers. The explicit geometric structure of the learned shape space allows the robot to reason about its body and surroundings, supporting environment-aware behaviors such as obstacle avoidance and self-motion while maintaining end-effector objectives. Experiments on a cable-driven continuum robot demonstrate accurate shape-position regulation and tracking, with shape errors within 1.56% of image resolution and end-effector errors within 2% of robot length, as well as robust performance in constrained environments. By elevating visual shape representations from two-dimensional observations to an interpretable three-dimensional self-model, this work establishes a principled alternative to vision-based end-to-end control and advances autonomous, geometry-aware manipulation for continuum robots.
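
To illustrate why a Bézier curve gives a compact shape space, the sketch below evaluates a curve from its control points in Python. The curve degree, the number of control points, and the example coordinates are assumptions for illustration; the abstract does not state which parameterization the authors use or how the control points are extracted from the images.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a Bezier curve of degree len(control_points) - 1 at t in [0, 1]
    using the Bernstein-polynomial form."""
    n = len(control_points) - 1
    dim = len(control_points[0])
    point = [0.0] * dim
    for i, cp in enumerate(control_points):
        weight = comb(n, i) * (1 - t) ** (n - i) * t ** i
        for d in range(dim):
            point[d] += weight * cp[d]
    return tuple(point)


# Hypothetical cubic backbone: four 3D control points (12 numbers in total)
# stand in for the whole continuous robot shape.
shape = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 2.0)]
base = bezier_point(shape, 0.0)
middle = bezier_point(shape, 0.5)
tip = bezier_point(shape, 1.0)
```

The curve starts at the first control point and ends at the last, so in such a representation the tip of the backbone is itself one of the shape parameters, which is convenient when shape and end-effector objectives are combined.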

2603.01737 2026-03-03 eess.SP math.ST stat.TH

Detection of weak signals under arbitrary noise distributions

J. Zschetzsche, M. Weimar, O. Lang, S. Schuster, A. Haberl, S. Schertler, B. Lehner, J. Reisinger, M. Huemer, S. Rotter

Comments 24 pages, 8 figures, Code available at https://github.com/jonaslindenberger/LRao-detector

Abstract

Detecting weak signals buried in complex, non-Gaussian noise is a fundamental challenge in science and engineering, with applications ranging from radar systems and communications to industrial monitoring and gravitational wave detection. The Rao detector, a key concept in this domain, achieves asymptotically optimal performance as the number of measurements increases, but requires precise knowledge of the data's statistical properties, often relying on simplified noise models. We propose a hybrid framework that combines a lightweight neural network with the Rao detection framework to address this limitation. The neural network, trained on noise-only data, learns the optimal multivariate nonlinearity, transforming noisy data to enhance signal detectability. The newly introduced LRao detector then fully extracts the signal information, achieving asymptotically optimal performance even under challenging noise conditions. Validated on both simulated and real-world magnetic sensor data, our method significantly outperforms conventional approaches. By bridging data-driven techniques with model-based signal processing, it offers a robust and interpretable solution for signal detection across diverse applications.
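
For readers unfamiliar with the classical Rao detector, the Python sketch below shows the textbook statistic for a known signal shape `s` of unknown amplitude in i.i.d. noise with a known density. Laplacian noise is assumed here only because its score function and Fisher information have closed forms; this is not the LRao detector of the paper, which learns a multivariate nonlinearity from noise-only data instead of assuming the density.

```python
import math


def rao_statistic_laplace(x, s, b):
    """Rao test statistic for x[n] = A * s[n] + w[n], where w[n] is i.i.d.
    Laplacian noise with scale b and A is the unknown amplitude (A = 0 under H0)."""

    def score(w):
        # g(w) = -p'(w) / p(w) = sign(w) / b for the Laplacian density.
        return math.copysign(1.0, w) / b if w != 0 else 0.0

    fisher_info = 1.0 / b ** 2  # Fisher information of the Laplacian location
    numerator = sum(sn * score(xn) for sn, xn in zip(s, x)) ** 2
    denominator = fisher_info * sum(sn * sn for sn in s)
    return numerator / denominator


# Tiny deterministic example: three samples, constant signal shape.
x = [1.0, -2.0, 3.0]
s = [1.0, 1.0, 1.0]
statistic = rao_statistic_laplace(x, s, b=2.0)
```

The data enter only through the nonlinearity `score`, which is what must be known (or, in a learned variant, estimated) for the detector to be matched to the noise. For the Laplacian case the scale `b` cancels, leaving a sign-correlation detector.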

2603.01660 2026-03-03 eess.SP

Cramer-Rao Bounds for Target Parameter Estimation in a Bi-Static IRS-Assisted Radar Configuration

Sanjeeva Reddy S, Vinod Veera Reddy

Abstract

Non-Line-of-Sight (NLoS) sensing and detection of low-observable (stealth) targets are challenging for conventional radar due to blockage and severe propagation loss. Intelligent Reflective Surface (IRS)-assisted radar can extend the field-of-view (FOV), but common architectures rely on the four-hop radar-IRS-target-IRS-radar link, whose attenuation limits estimation performance. This paper proposes an alternative architecture that exploits the target-scattered component received at a spatially separated IRS and redirected back to a mono-static radar receiver. The geometry provides bi-static/multi-static-like diversity using a passive panel, while retaining a mono-static front-end and avoiding inter-node time synchronization concerns. We develop a signal model for the proposed configuration and recast it into a compact, parameterized form that is suitable for angle estimation. Using this reformulation, we derive the Fisher Information Matrix and the associated Cramér-Rao Lower Bounds (CRLBs) for target azimuth and elevation angles with respect to the IRS. Numerical evaluations quantify the impact of various signal-model parameters on the achievable bounds. These results provide insight into the parameter-estimation limits within the FOV as functions of SNR, the number of snapshots, and the number of IRS elements.
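
As a reminder of how such bounds are typically obtained, the generic Fisher information for a deterministic signal in white circular complex Gaussian noise is given below. The notation is assumed for illustration: $\boldsymbol{\mu}_l(\boldsymbol{\theta})$ is the noise-free received vector in snapshot $l$, $\sigma^2$ the noise variance, $L$ the number of snapshots, and $\boldsymbol{\theta} = [\phi, \vartheta]^{\mathsf T}$ the azimuth and elevation angles, with all other parameters treated as known. The paper's specific signal model is not given in the abstract.

```latex
\begin{align}
  [\mathbf{J}(\boldsymbol{\theta})]_{ij}
    &= \frac{2}{\sigma^{2}} \sum_{l=1}^{L}
       \operatorname{Re}\!\left\{
         \left(\frac{\partial \boldsymbol{\mu}_l}{\partial \theta_i}\right)^{\!\mathsf H}
         \frac{\partial \boldsymbol{\mu}_l}{\partial \theta_j}
       \right\}, \\
  \operatorname{var}\bigl(\hat{\theta}_i\bigr)
    &\ge \bigl[\mathbf{J}^{-1}(\boldsymbol{\theta})\bigr]_{ii} .
\end{align}
```

In this generic form the information grows linearly with $1/\sigma^2$ and, when the noise-free signal is the same in every snapshot, with $L$. This matches the abstract's statement that the bounds are evaluated against SNR and the number of snapshots.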

2603.01611 2026-03-03 eess.SY cs.SY

Predictive Lane-Change and Routing Coordination in Bus-Priority Mixed Traffic Corridors

Tanlu Liang, Ting Bai, Andreas A. Malikopoulos

Abstract

In this paper, we investigate the coordination of vehicle maneuvers in mixed-traffic corridors where connected and automated vehicles, human-driven vehicles, and buses interact under dedicated bus lane operations. We develop a segment-based network coordination framework that jointly optimizes lane-change and routing decisions of connected and automated vehicles to improve dedicated lane utilization while preserving bus priority. The proposed framework incorporates a predictive bus-protection mechanism that restricts vehicle access to protected lane segments within a monitoring horizon, together with a utility-driven lane-change strategy that accounts for anticipated travel time gains, downstream routing feasibility, and lane-change stability. By explicitly coupling network-level routing decisions with lane-level interaction control, the method proactively mitigates conflicts on dedicated lanes before congestion effects materialize. The proposed approach is evaluated through microscopic traffic simulations in SUMO using a realistic urban corridor. Simulation results demonstrate that the framework enhances bus schedule adherence and reduces average travel times for both automated and human-driven vehicles, while maintaining stable lane-change behavior without increasing maneuver frequency.