Robust Localization in OFDM-Based Massive MIMO through Phase Offset Calibration
Comments Accepted to IEEE International Symposium on Joint Communications & Sensing (JC&S) 2026; recipient of the Best Student Paper Award
Qing Zhang, Adham Sakhnini, Robbert Beerten, Haoqiu Xiong, Zhuangzhuang Cui, Yang Miao, Sofie Pollin
Comments Accepted to IEEE International Symposium on Joint Communications & Sensing (JC&S) 2026; recipient of the Best Student Paper Award
Accurate localization in Orthogonal Frequency Division Multiplexing (OFDM)-based massive Multiple-Input Multiple-Output (MIMO) systems depends critically on phase coherence across subcarriers and antennas. However, practical systems suffer from frequency-dependent and (spatial) antenna-dependent phase offsets, degrading localization accuracy. This paper analytically studies the impact of phase incoherence on localization performance under a static User Equipment (UE) and Line-of-Sight (LoS) scenario. We use two complementary tools. First, we derive the Cramér-Rao Lower Bound (CRLB) to quantify the theoretical limits under phase offsets. Then, we develop a Spatial Ambiguity Function (SAF)-based model to characterize ambiguity patterns. Simulation results reveal that spatial phase offsets severely degrade localization performance, while frequency phase offsets have a minor effect in the considered system configuration. To address this, we propose a robust Channel State Information (CSI) calibration framework and validate it using real-world measurements from a practical massive MIMO testbed. The experimental results confirm that the proposed calibration framework significantly improves the localization Root Mean Squared Error (RMSE) from 5 m to 1.2 cm, aligning well with the theoretical predictions.
Marc Windsheimer, Simon Deniffel, André Kaup
Comments 5 pages, 5 figures, 1 table
Local rate control is a key enabler to generalize image and video compression for dedicated challenges, such as video coding for machines. While traditional hybrid video coding can easily adapt the local rate-distortion trade-off by changing the local quantization parameter, no such approach is currently available for learning-based video compression. In this paper, we propose LRC-DHVC, a hierarchical video compression network, which allows continuous local rate control on a pixel level to vary the spatial quality distribution within individual video frames. This is achieved by concatenating a quality map to the input frame and applying a weighted MSE loss which matches the pixelwise trade-off factors in the quality map. During training, the model sees a variety of quality maps due to a constrained-random generation. Our model is the first neural video compression network, which can continuously and spatially adapt to varying quality constraints. Due to the wide quality and bit rate range, a single set of network parameters is sufficient. Compared to single rate point networks, which scale linearly with the number of rate points, the memory requirements for our network parameters remain constant. The code and model are available at link-updated-upon-acceptance.
Wenyi Yan, Zeyuan Li, Lu Gan, Honqing Liu, Guoquan Li
Two-channel modulo analog-to-digital converters (ADCs) enable high-dynamic-range signal sensing at the Nyquist rate per channel, but existing designs quantise both channel outputs independently, incurring redundant bitrate costs. This paper proposes a bit-efficient quantisation scheme that exploits the integer-valued structure of inter-channel differences, transmitting one quantised channel output together with a compact difference index. We prove that this approach requires only 1-2 bits per signal sample overhead relative to conventional ADCs, despite operating with a much smaller per-channel dynamic range. Simulations confirm the theoretical error bounds and bitrate analysis, while hardware experiments demonstrate substantial bitrate savings compared with existing modulo sampling schemes, while maintaining comparable reconstruction accuracy. These results highlight a practical path towards high-resolution, bandwidth-efficient modulo ADCs for bitrate-constrained systems.
Mohamed Nomeir, Sennur Ulukus
We consider the storage problem in an asymmetric $X$-secure private information retrieval (A-XPIR) setting. The A-XPIR setting considers the $X$-secure PIR problem (XPIR) when a given arbitrary set of servers is communicating. We focus on the trade-off region between the average storage at the servers and the average download cost. In the case of $N=4$ servers and two non-overlapping sets of communicating servers with $K=2$ messages, we characterize the achievable region and show that the three main inequalities compared to the no-security case collapse to two inequalities in the asymmetric security case. In the general case, we derive bounds that need to be satisfied for the general achievable region for an arbitrary number of servers and messages. In addition, we provide the storage and retrieval scheme for the case of $N=4$ servers with $K=2$ messages and two non-overlapping sets of communicating servers, such that the messages are not replicated (in the sense of a coded version of each symbol) and at the same time achieve the optimal achievable rate for the case of replication. Finally, we derive the exact capacity for the case of asymmetric security and asymmetric collusion for $N=4$ servers, with the communication links $\{1,2\}$ and $\{3,4\}$, which splits the servers into two groups, i.e., $g=2$, and with the collusion links $\{1,3\}$, $\{2,4\}$, as $C=\frac{1}{3}$. More generally, we derive a capacity result for a certain family of asymmetric collusion and asymmetric security cases.
Yichen Guo, Tao Peng, Yujie Zhao, Yijing Niu, Wenbo Wang
Comments This work has been submitted to the IEEE for possible publication
Interference significantly impacts the performance of industrial wireless networks, particularly n severe interference environments with dense networks reusing spectrum resources intensively. Although delicate interference information is often unavailable in conventional networks, emerging interference cognition techniques can compensate this critical problem with possibly different precision. This paper investigates the relationship between precision of interference cognition and system performance. We propose a novel performance analysis framework that quantifies the impact of varying interference information precision on achievable rate. Specifically, leveraging the Nakagami-$\mathbf{m}$ fading channel model, we analytically and asymptotically analyze the average achievable rate in the finite blocklength regime for different precision levels of signal and interference information. Our findings reveal the critical importance of identifying per-link interference information for achieving optimal performance. Additionally, obtaining instantaneous information is more beneficial for signal links.
Cristian Sestito, Panagiota Kontou, Pratibha Verma, Atish Dixit, Alexandros D. Keros, Michael O'Boyle, Christos-Savvas Bouganis, Themis Prodromakis
Comments 17 pages, 5 figures, 1 Supplementary (12 pages, 13 figures, 1 table)
Large language models (LLMs) are transforming electronic design automation (EDA) by enhancing design stages such as schematic design, simulation, netlist synthesis, and place-and-route. Existing methods primarily focus these optimisations within isolated open-source EDA tools and often lack the flexibility to handle multiple domains, such as analogue, digital, and radio-frequency design. In contrast, modern systems require to interface with commercial EDA environments, adhere to tool-specific operation rules, and incorporate feedback from design outcomes while supporting diverse design flows. We propose a versatile framework that uses LLMs to generate files compatible with commercial EDA tools and optimise designs using power-performance-area reports. This is accomplished by guiding the LLMs with tool constraints and feedback from design outputs to meet tool requirements and user specifications. Case studies on operational transconductance amplifiers, microstrip patch antennas, and FPGA circuits show that the framework is effective as an EDA-aware assistant, handling diverse design challenges reliably.
Alexander Ihlow, Marius Schmidt, Carsten Andrich, Reiner S. Thomä
Comments 20th European Conference on Antennas and Propagation (EuCAP 2026)
Fundamental research on bistatic radar reflectivity is highly relevant, e.g., to the upcoming mobile communication standard 6G, which includes integrated sensing and communication (ISAC). We introduce a model for correcting instrumentation drift during bistatic radar measurements in anechoic chambers. Usually, background subtraction is applied with the goal to yield the target reflection signal as best as possible while coherently subtracting all signals which were present in both the foreground and background measurement. However, even slight incoherences between the foreground and background measurement process deteriorate the result. We analyze these effects in real measurements in the frequency range 2-18 GHz, taken with the Bistatic Radar (BIRA) measurement facility at TU Ilmenau. Applying our proposed drift correction model, we demonstrate up to 40 dB improvement for the removal of direct line-of-sight antenna crosstalk over the state of the art.
Youngmoon Jung, Myunghun Jung, Joon-Young Yang, Yong-Hyeok Lee, Jaeyoung Roh, Hoon-Young Cho
Comments 5 pages, 1 figure, Accepted at ICASSP 2026
Open-vocabulary keyword spotting (KWS) with text-based enrollment has emerged as a flexible alternative to fixed-phrase triggers. Prior utterance-level matching methods, from an embedding-learning standpoint, learn embeddings at a single fixed dimensionality. We depart from this design and propose Matryoshka Audio-Text Embeddings (MATE), a dual-encoder framework that encodes multiple embedding granularities within a single vector via nested sub-embeddings ("prefixes"). Specifically, we introduce a PCA-guided prefix alignment: PCA-compressed versions of the full text embedding for each prefix size serve as teacher targets to align both audio and text prefixes. This alignment concentrates salient keyword cues in lower-dimensional prefixes, while higher dimensions add detail. MATE is trained with standard deep metric learning objectives for audio-text KWS, and is loss-agnostic. To our knowledge, this is the first application of matryoshka-style embeddings to KWS, achieving state-of-the-art results on WSJ and LibriPhrase without any inference overhead.
Youngmoon Jung, Joon-Young Yang, Ju-ho Kim, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho
Comments 5 pages, 2 figures, Accepted at ICASSP 2026
Short-utterance speaker verification remains challenging due to limited speaker-discriminative cues in short speech segments. While existing methods focus on enhancing speaker encoders, the embedding learning strategy still forces a single fixed-dimensional representation reused for utterances of any length, leaving capacity misaligned with the information available at different durations. We propose Duration-Aware Matryoshka Embedding (DAME), a model-agnostic framework that builds a nested hierarchy of sub-embeddings aligned to utterance durations: lower-dimensional representations capture compact speaker traits from short utterances, while higher dimensions encode richer details from longer speech. DAME supports both training from scratch and fine-tuning, and serves as a direct alternative to conventional large-margin fine-tuning, consistently improving performance across durations. On the VoxCeleb1-O/E/H and VOiCES evaluation sets, DAME consistently reduces the equal error rate on 1-s and other short-duration trials, while maintaining full-length performance with no additional inference cost. These gains generalize across various speaker encoder architectures under both general training and fine-tuning setups.
Xuehan Wang, Jinhong Yuan, Jintao Wang, Kehan Huang
Comments 10 pages, 5 figures. This paper has been accepted for publication in IEEE TSP
Diversity is an essential concept associated with communication reliability in multipath channels since it determines the slope of bit error rate performance in the medium to high signal-to-noise ratio regions. However, most of the existing analytical frameworks were developed for specific modulation schemes while the efficient validation of full multipath diversity for general modulation schemes remains an open problem. To fill this research gap, we propose to utilize random constellation rotation to ease the conditions for full-diversity modulation designs. For linearly precoded cyclic-prefix orthogonal frequency division multiplexing (OFDM) systems, we prove that maximum multipath diversity can be attained as long as the spread matrix does not have zero entries, which is a sufficient but easily satisfied condition. Furthermore, we derive the sufficient and necessary condition for general modulation schemes, whose verification can be divided into validation tasks for each column of the modulation matrix. Based on the proposed conditions, maximum diversity order can be attained with the probability of 1 by enabling a randomly generated rotation pattern for both time and doubly dispersive channels. The theoretical analysis in this paper also demonstrates that the diversity evaluation can be concentrated on the pairwise error probability when the number of error symbols is one, which reduces the complexity of diversity-driven design and performance analysis for novel modulation schemes significantly in both time and doubly dispersive channels. Finally, numerical results for various modulation schemes confirm that the theoretical analysis holds in both time and doubly dispersive channels. Furthermore, when employing practical detectors, the random constellation rotation technique consistently enhance the transmission reliability for both coded and uncoded systems.
Jiangwei Xie, Zhang Wen, Mike Davies, Dongdong Chen
Comments Technical report
Hyperspectral image (HSI) restoration is a fundamental challenge in computational imaging and computer vision. It involves ill-posed inverse problems, such as inpainting and super-resolution. Although deep learning methods have transformed the field through data-driven learning, their effectiveness hinges on access to meticulously curated ground-truth datasets. This fundamentally restricts their applicability in real-world scenarios where such data is unavailable. This paper presents SHARE (Single Hyperspectral Image Restoration with Equivariance), a fully unsupervised framework that unifies geometric equivariance principles with low-rank spectral modelling to eliminate the need for ground truth. SHARE's core concept is to exploit the intrinsic invariance of hyperspectral structures under differentiable geometric transformations (e.g. rotations and scaling) to derive self-supervision signals through equivariance consistency constraints. Our novel Dynamic Adaptive Spectral Attention (DASA) module further enhances this paradigm shift by explicitly encoding the global low-rank property of HSI and adaptively refining local spectral-spatial correlations through learnable attention mechanisms. Extensive experiments on HSI inpainting and super-resolution tasks demonstrate the effectiveness of SHARE. Our method outperforms many state-of-the-art unsupervised approaches and achieves performance comparable to that of supervised methods. We hope that our approach will shed new light on HSI restoration and broader scientific imaging scenarios. The code will be released at https://github.com/xuwayyy/SHARE.
Zhang Wen, Jiangwei Xie, Dongdong Chen
Comments Technical report
Image Dehazing (ID) aims to produce a clear image from an observation contaminated by haze. Current ID methods typically rely on carefully crafted priors or extensive haze-free ground truth, both of which are expensive or impractical to acquire, particularly in the context of scientific imaging. We propose a new unsupervised learning framework called Equivariant Image Dehazing (EID) that exploits the symmetry of image signals to restore clarity to hazy observations. By enforcing haze consistency and systematic equivariance, EID can recover clear patterns directly from raw, hazy images. Additionally, we propose an adversarial learning strategy to model unknown haze physics and facilitate EID learning. Experiments on two scientific image dehazing benchmarks (including cell microscopy and medical endoscopy) and on natural image dehazing have demonstrated that EID significantly outperforms state-of-the-art approaches. By unifying equivariant learning with modelling haze physics, we hope that EID will enable more versatile and effective haze removal in scientific imaging. Code and datasets will be published.
Eike Osmers, Dorothea Kolossa
Accurate, low-latency estimates of the instantaneous phase of oscillations are essential for closed-loop sensing and actuation, including (but not limited to) phase-locked neurostimulation and other real-time applications. The endpoint-corrected Hilbert transform (ecHT) reduces boundary artefacts of the Hilbert transform by applying a causal narrow-band filter to the analytic spectrum. This improves the phase estimate at the most recent sample. Despite its widespread empirical use, the systematic endpoint distortions of ecHT have lacked a principled, closed-form analysis. In this study, we derive the ecHT endpoint operator analytically and demonstrate that its output can be decomposed into a desired positive-frequency term (a deterministic complex gain that induces a calibratable amplitude/phase bias) and a residual leakage term setting an irreducible variance floor. This yields (i) an explicit characterisation and bounds for endpoint phase/amplitude error, (ii) a mean-squared-error-optimal scalar calibration (c-ecHT), and (iii) practical design rules relating window length, bandwidth/order, and centre-frequency mismatch to residual bias via an endpoint group delay. The resulting calibrated ecHT achieves near-zero mean phase error and remains computationally compatible with real-time pipelines. Code and analyses are provided at https://github.com/eosmers/cecHT.
Sander Doodeman, Paula Chanfreut Palacio, Elena Torta, Duarte Antunes
This paper studies the impact of rigidly attached heavy payload placement - where the payload mass significantly influences the UAV's dynamics - on the stability and control performance of a multirotor unmanned aerial vehicle (UAV). In particular, we focus on how the position of such a payload relative to the vehicle's Center of Gravity (CoG) affects the stability and control performance at an arbitrary point of interest on the UAV, such as the payload position, and on how this position can be optimized. Our conclusions are based on two key contributions. First, we analyze the stability of the zero-dynamics of a complete nonlinear model of the UAV with payload. We demonstrate that the stability of the zero dynamics depends on the vertical signed distance in the body-fixed frame between the controlled output position and the combined CoG of the UAV with payload. Specifically, positioning the output below the CoG yields unstable zero dynamics, while the linearized zero dynamics are marginally stable when placing it above, indicating reduced sensitivity to input disturbances. Second, we analyze the performance of the linearized UAV model with payload by providing an analytical expression for the H2-norm, from which we can quantify the system's attenuation to white noise input disturbances. We conclude that less control authority leads to a higher optimal position of the controlled output with respect to the CoG for closed-loop white-noise disturbance rejection capabilities, also when the heavy payload is the controlled output. The results are illustrated through numerical examples.
Yousef Sadegheih, Dorit Merhof, Pratibha Kumari
Comments Submitted to MIDL 2026
Brain lesion segmentation from multi-modal MRI often assumes fixed modality sets or predefined pathologies, making existing models difficult to adapt across cohorts and imaging protocols. Continual learning (CL) offers a natural solution but current approaches either impose a maximum modality configuration or suffer from severe forgetting in buffer-free settings. We introduce CLMU-Net, a replay-based CL framework for 3D brain lesion segmentation that supports arbitrary and variable modality combinations without requiring prior knowledge of the maximum set. A conceptually simple yet effective channel-inflation strategy maps any modality subset into a unified multi-channel representation, enabling a single model to operate across diverse datasets. To enrich inherently local 3D patch features, we incorporate lightweight domain-conditioned textual embeddings that provide global modality-disease context for each training case. Forgetting is further reduced through principled replay using a compact buffer composed of both prototypical and challenging samples. Experiments on five heterogeneous MRI brain datasets demonstrate that CLMU-Net consistently outperforms popular CL baselines. Notably, our method yields an average Dice score improvement of $\geq$ 18\% while remaining robust under heterogeneous-modality conditions. These findings underscore the value of flexible modality handling, targeted replay, and global contextual cues for continual medical image segmentation. Our implementation is available at https://github.com/xmindflow/CLMU-Net.
Peter Golenderov, Yaroslav Matushenko, Anastasia Tushina, Michal Barodkin
Aortic valve opening (AO) events are crucial for detecting frequency and rhythm disorders, especially in real-world settings where seismocardiography (SCG) signals collected via consumer smartphones are subject to noise, motion artifacts, and variability caused by device heterogeneity. In this work, we present a robust deep-learning framework for SCG segmentation and rhythm analysis using accelerometer recordings obtained with consumer smartphones. We develop an enhanced U-Net v3 architecture that integrates multi-scale convolutions, residual connections, and attention gates, enabling reliable segmentation of noisy SCG signals. A dedicated post-processing pipeline converts probability masks into precise AO timestamps, whereas a novel adaptive 3D-to-1D projection method ensures robustness to arbitrary smartphone orientation. Experimental results demonstrate that the proposed method achieves consistently high accuracy and robustness across various device types and unsupervised data-collection conditions. Our approach enables practical, low-cost, and automated cardiac-rhythm monitoring using everyday mobile devices, paving the way for scalable, field-deployable cardiovascular assessment and future multimodal diagnostic systems.
Ignacio Santamaria, Mohammad Soleymani, Eduard Jorswieck, Jesus Gutierrez, Carlos Beltran
Comments 5 pages, 2 figures
In this paper, we rigorously characterize for the first time the manifold of unitary and symmetric matrices, deriving its tangent space and its geodesics. The resulting parameterization of the geodesics (through a real and symmetric matrix) allows us to derive a new Riemannian manifold optimization (MO) algorithm whose most remarkable feature is that it does not need to set any adaptation parameter. We apply the proposed MO algorithm to maximize the achievable rate in a multiple-input multiple-output (MIMO) system assisted by a beyond-diagonal reconfigurable intelligent surface (BD-RIS), illustrating the method's performance through simulations. The MO algorithm achieves a significant reduction in computational cost compared to previous alternatives based on Takagi decomposition, while retaining global convergence to a stationary point of the cost function.
Ziyi Yang, Li Rao, Zhengding Luo, Dongyuan Shi, Qirui Huang, Woon-Seng Gan
Active noise control (ANC) must adapt quickly when the acoustic environment changes, yet early performance is largely dictated by initialization. We address this with a Model-Agnostic Meta-Learning (MAML) co-initialization that jointly sets the control filter and the secondary-path model for FxLMS-based ANC while keeping the runtime algorithm unchanged. The initializer is pre-trained on a small set of measured paths using short two-phase inner loops that mimic identification followed by residual-noise reduction, and is applied by simply setting the learned initial coefficients. In an online secondary path modeling FxLMS testbed, it yields lower early-stage error, shorter time-to-target, reduced auxiliary-noise energy, and faster recovery after path changes than a baseline without re-initialization. The method provides a simple fast start for feedforward ANC under environment changes, requiring a small set of paths to pre-train.
Yongqiang Zhang, Mustafa A. Kishk, Mohamed-Slim Alouini
Comments 7 pages, 4 figures, 2 tables, accepted by IEEE Communications Magazine
Large Language Models (LLMs) such as ChatGPT promise revolutionary capabilities for Sixth-Generation (6G) wireless networks but their massive computational requirements and tendency to generate technically incorrect information create deployment barriers. In this work, we introduce MAINTAINED: autonomous artificial intelligence agent for wireless network deployment. Instead of encoding domain knowledge within model parameters, our approach orchestrates specialized computational tools for geographic analysis, signal propagation modeling, and network optimization. In a real-world case study, MAINTAINED outperforms state-of-the-art LLMs including ChatGPT-4o, Claude Sonnet 4, and DeepSeek-R1 by up to 100-fold in verified performance metrics while requiring less computational resources. This paradigm shift, moving from relying on parametric knowledge towards externalizing domain knowledge into verifiable computational tools, eliminates hallucination in technical specifications and enables edge-deployable Artificial Intelligence (AI) for wireless communications.
Jiunn-Tsair Chen
Comments 44 pages, 21 figures
Wi-Fi networks increasingly suffer from performance degradation caused by contention-based channel access, dense deployments, and largely self-managed operation among mutually interfering access points (APs). In this paper, we propose a Digital Twin (DT) framework that captures the essential spatial and temporal characteristics of wireless channels and traffic patterns, enabling the prediction of likely future network scenarios while respecting physical constraints. Leveraging this predictive capability, we introduce two analytically derived performance upper bounds-one based on Shannon capacity and the other on latency behavior under CSMA-CA (Carrier Sense Multiple Access with Collision Avoidance)-that can be evaluated efficiently without time-consuming network simulations. By applying importance sampling to DT-generated scenarios, potentially risky network conditions can be identified within large stochastic scenario spaces. These same performance bounds are then used to proactively guide a gradient-based search for improved network configurations, with the objective of avoiding imminent performance degradation rather than pursuing globally optimal but fragile solutions. Simulation results demonstrate that the proposed approach can successfully predict time-dependent network congestion and mitigate it in advance, highlighting its potential for predictive and preventive Wi-Fi network management.
Ruixing Ren, Shan Chen, Xuehan Bao, Pingzheng Ge, Dongming Wang, Junhui Zhao
Comments 11 pages,6 figures
To address the issues of high operational costs and low energy efficiency (EE) caused by the dense deployment of small base stations (s-BSs) in 5G ultra-dense networks (UDNs), this paper first constructs a multi-objective mathematical optimization model targeting maximizing EE and minimizing the number of active BSs. The model incorporates key constraints including BS operational state, user equipment (UE)-BS connection relationship, and load threshold, laying a theoretical foundation for the coordinated optimization of energy conservation and quality of service. Based on this model, an integrated solution combining UE-BS initial connection optimization and load-sharing based BS sleeping is proposed. In the initial connection phase, with communication quality and BS load as dual constraints, efficient matching between UEs and optimal BSs is achieved through three sequential steps: communication feasibility screening, redundant connection removal, and overload load redistribution. This resolves the problems of load imbalance and difficult identification of redundant BSs in UDNs arising from unordered initial connections. In the BS sleeping phase, a BS sleeping index, comprehensively considering UE transferability and backup BS resources, is innovatively introduced to quantify BS dormancy priority. Through a closed-loop process involving low-load BS screening, adjacent BS load evaluation, and load sharing by two takeover BSs based on their capacity, accurate dormancy of redundant BSs and collaborative load migration are realized. Simulation results in a typical UDNs scenario demonstrate that, compared with the traditional baseline scheme, the proposed solution exhibits significant advantages in convergence speed, optimization of the number of active BSs, and EE improvement.
Yongqiang Zhang, Qurrat-Ul-Ain Nadeem
Comments 6 pages, 3 figures, accepted by IEEE International Conference on Communications (ICC) 2026
Multiple-input multiple-output (MIMO) systems require efficient and accurate channel estimation with low pilot overhead to unlock their full potential for high spectral and energy efficiency. While deep generative models have emerged as a powerful foundation for the channel estimation task, the existing approaches using diffusion-based and score-based models suffer from high computational runtime due to their stochastic and many-step iterative sampling. In this paper, we introduce a flow matching-based channel estimator to overcome this limitation. The proposed channel estimator is based on a deep neural network trained to learn the velocity field of wireless channels, which we then integrate into a plug-and-play proximal gradient descent (PnP-PGD) framework. Simulation results reveal that our formulated approach consistently outperforms existing state-of-the-art (SOTA) generative model-based estimators, achieves up to 49 times faster inference at test time, and reduces up to 20 times peak graphics processing unit (GPU) memory usage. Our code and models are publicly available to support reproducible research.
Matteo Vaccargiu, Azmat Ullah, Pierluigi Gallo
Comments 2026 IEEE International Conference on Software Analysis, Evolution and Reengineering - Companion (SANER-C) 9th International Workshop on Blockchain Oriented Software Engineering March 17-20, 2026 Limassol, Cyprus
Carbon credit systems have emerged as a policy tool to incentivize emission reductions and support the transition to clean energy. Reliable carbon-credit certification depends on mechanisms that connect actual, measured renewable-energy production to verifiable emission-reduction records. Although blockchain and IoT technologies have been applied to emission monitoring and trading, existing work offers limited support for certification processes, particularly for small and medium-scale renewable installations. This paper introduces a blockchain-based carbon-credit certification architecture, demonstrated through a 100 kWp photovoltaic case study, that integrates real-time IoT data collection, edge-level aggregation, and secure on-chain storage on a permissioned blockchain with smart contracts. Unlike approaches focused on trading mechanisms, the proposed system aligns with European legislation and voluntary carbon-market standards, clarifying the practical requirements and constraints that apply to photovoltaic operators. The resulting architecture provides a structured pathway for generating verifiable carbon-credit records and supporting third-party verification.
Joon Lee, Jeongyoon Han, Doyoung Kim, Seokhwan Jeong
Comments Soft Robotics
This paper presents the flexible RIM Hand, a biomimetic robotic hand that precisely replicates the carpometacarpal (CMC) joints and employs superelastic Nitinol wires throughout its skeletal framework. By modeling the full carpal-to-metacarpal anatomy, the design enables realistic palm deformation through tendon-driven fingers while enhancing joint restoration and supports skeletal structure with Nitinol-based dorsal extensors. A flexible silicone skin further increases contact friction and contact area, enabling stable grasps for diverse objects. Experiments show that the palm can deform up to 28%, matching human hand flexibility, while achieving more than twice the payload capacity and three times the contact area compared to a rigid palm design. The RIM Hand thus offers improved dexterity, compliance, and anthropomorphism, making it promising for prosthetic and service-robot applications.
Xun Feng, Chao Zhai
The inherent non-convexity of poriferous surfaces typically entraps agents in local minima and complicates workload distribution. To resolve this, we propose a distributed diffeomorphic coverage control framework for the multi-agent system (MAS) in such surfaces. First, we establish a distributed poly-annulus conformal mapping that transforms arbitrary poriferous surfaces into a multi-hole disk. Leveraging this topological equivalence, a collision-free sectorial partition mechanism is designed in the multi-hole disk, which rigorously induces strictly connected subregions and workload balance on the poriferous surfaces. This mechanism utilizes a buffer-based sequence mechanism to ensure strict topological safety when bypassing obstacles. Furthermore, a pull-back Riemannian metric is constructed to define the length metric that encodes safety constraints. Based on this metric, a distributed gradient-based control law is synthesized to drive agents toward optimal configurations, ensuring simultaneous obstacle avoidance and coverage optimization. Theoretical analyses guarantee the Input-to-State Stability (ISS) of the partition dynamics and the asymptotic convergence of the closed-loop system. Numerical simulations confirm the reachability and robustness of the proposed coverage algorithm, offering a scalable solution for distributed coverage in poriferous surfaces.
Shama Siddiqui, Anwar Ahmed Khan, Nicola Marchetti
With the growing applications of the Internet of Things (IoT), a major challenge is to ensure continuous connectivity while providing prioritized access. In dense IoT scenarios, synchronization may be disrupted either by the movement of nodes away from base stations or by the unavailability of reliable Global Navigation Satellite System (GNSS) signals, which can be affected by physical obstructions, multipath fading, or environmental interference, such as such as walls, buildings, moving objects, or electromagnetic noise from surrounding devices. In such contexts, distributed synchronization through Non-Orthogonal Multiple Access (NOMA) offers a promising solution, as it enables simultaneous transmission to multiple users with different power levels, supporting efficient synchronization while minimizing the signaling overhead. Moreover, NOMA also plays a vital role for dynamic priority management in dense and heterogeneous IoT environments. In this article, we proposed a Two-Stage NOMA-Enabled Framework "TSN-IoT" that integrates the mechanisms of conventional Precision Time Protocol (PTP) based synchronization, distributed synchronization and data transmission. The framework is designed as a four-tier architecture that facilitates prioritized data delivery from sensor nodes to the central base station. We demonstrated the performance of "TSN-IoT" through a healthcare use case, where intermittent connectivity and varying data priority levels present key challenges for reliable communication. Synchronization speed and end-to-end delay were evaluated through a series of simulations implemented in Python. Results show that, compared to priority-based Orthogonal Frequency Division Multiple Access (OFDMA), TSN-IoT achieves significantly better performance by offering improved synchronization opportunities and enabling parallel transmissions over the same sub-carrier.
Shuvayan Banerjee, Radhendushka Srivastava, James Saunderson, Ajit Rajwade
Compressed sensing, which involves the reconstruction of sparse signals from an under-determined linear system, has been recently used to solve problems in group testing. In a public health context, group testing aims to determine the health status values of p subjects from n<<p pooled tests, where a pool is defined as a mixture of small, equal-volume portions of the samples of a subset of subjects. This approach saves on the number of tests administered in pandemics or other resource-constrained scenarios. In practical group testing in time-constrained situations, a technician can inadvertently make a small number of errors during pool preparation, which leads to errors in the pooling matrix, which we term `model mismatch errors' (MMEs). This poses difficulties while determining health status values of the participating subjects from the results on n<<p pooled tests. In this paper, we present an algorithm to correct the MMEs in the pooled tests directly from the pooled results and the available (inaccurate) pooling matrix. Our approach then reconstructs the signal vector from the corrected pooling matrix, in order to determine the health status of the subjects. We further provide theoretical guarantees for the correction of the MMEs and the reconstruction error from the corrected pooling matrix. We also provide several supporting numerical results.
Emin Akpinar, Emir Aslandogan, Burak Ahmet Ozden, Haci Ilhan, Erdogan Aydin
Comments This work has been submitted to the IEEE for possible publication
Orthogonal time frequency space (OTFS) modulation stands out as a promising waveform for sixth generation (6G) and beyond wireless communication systems, offering superior performance over conventional methods, particularly in high-mobility scenarios and dispersive channel conditions. Recent research has demonstrated that the reduced computational complexity of deep learning (DL)-based signal detection (SD) methods constitutes a compelling alternative to conventional techniques. In this study, low-complexity DL-based SD methods are proposed for a multiple-input multiple-output (MIMO)-OTFS system and examined under Nakagami-$m$ channel conditions. The symbols obtained from the receiver antennas are combined using maximum ratio combining (MRC) and detected with the help of a DL-based detector implemented with multi-layer perceptron (MLP), convolutional neural network (CNN), and residual network (ResNet). Complexity analysis reveals that the MLP architecture offers significantly lower computational complexity compared to CNN, ResNet, and classical methods such as maximum likelihood detection (MLD). Furthermore, numerical analyses have shown that the proposed DL-based detectors, despite their low complexity, achieve comparable bit error rate (BER) performance to that of a high-performance MLD under various system conditions.
Ziqian Wang, Xianjun Xia, Chuanzeng Huang, Lei Xie
Comments accepted to ICASSP 2026
We present S$^2$Voice, the winning system of the Singing Voice Conversion Challenge (SVCC) 2025 for both the in-domain and zero-shot singing style conversion tracks. Built on the strong two-stage Vevo baseline, S$^2$Voice advances style control and robustness through several contributions. First, we integrate style embeddings into the autoregressive large language model (AR LLM) via a FiLM-style layer-norm conditioning and a style-aware cross-attention for enhanced fine-grained style modeling. Second, we introduce a global speaker embedding into the flow-matching transformer to improve timbre similarity. Third, we curate a large, high-quality singing corpus via an automated pipeline for web harvesting, vocal separation, and transcript refinement. Finally, we employ a multi-stage training strategy combining supervised fine-tuning (SFT) and direct preference optimization (DPO). Subjective listening tests confirm our system's superior performance: leading in style similarity and singer similarity for Task 1, and across naturalness, style similarity, and singer similarity for Task 2. Ablation studies demonstrate the effectiveness of our contributions in enhancing style fidelity, timbre preservation, and generalization. Audio samples are available~\footnote{https://honee-w.github.io/SVC-Challenge-Demo/}.
Lifu Ding, Chunhui Hou, Yutong Li, Qinmin Yang
Hybrid microgrids integrating Grid-Following (GFL) and Grid-Forming (GFM) inverters present complex control challenges arising from the decoupling between long-term economic dispatch and real-time dynamic regulation, as well as the distinct physical limitations of heterogeneous inverters under cyber uncertainties. This paper proposes a Resilient Hierarchical Power Control (RHPC) strategy to unify these conflicting requirements within a cohesive framework. A standardized power increment mechanism is developed to bridge the tertiary and secondary layers, ensuring that real-time load fluctuations are compensated strictly according to the optimal economic ratios derived from the tertiary layer. To address the strict active power saturation constraints of GFL units, a dynamic activation scheme coupled with projection operators is introduced, which actively isolates saturated nodes from the consensus loop to prevent integrator wind-up and preserve the stability of the GFM backbone. Furthermore, the proposed framework incorporates a multi-scale attention mechanism and LSTM-based predictors into the secondary control protocol, endowing the system with robustness against unbounded False Data Injection (FDI) attacks and packet losses. Rigorous theoretical analysis confirms that the system achieves Uniformly Ultimately Bounded (UUB) convergence, and simulations on a modified IEEE 33-bus system demonstrate that the proposed strategy significantly improves power sharing accuracy and operational resilience in both grid-connected and islanded modes compared to conventional methods.
扫码添加微信好友,提出您的宝贵建议 👇
💡 备注请填写:网站反馈