arXivDaily: arXiv daily academic digest, updated Monday through Friday
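2606.07304 2026-06-08 cs.RO New submission

CAPE: Contrastive Action-conditioned Parallel Encoding for Embodied Planning

Cong Chen, Haowen Wang, Zhixiang Zhang, Pei Ren, Zhengping Che

AI summary: Proposes CAPE, a framework that learns visual dynamics efficiently by using contrastive learning to distinguish the future outcomes of different action sequences; it substantially improves planning performance and lowers inference cost on real-world and zero-shot transfer tasks.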

Comments 19 pages, 7 figures

Abstract

Embodied agents need to predict the future consequences of candidate actions in order to plan effectively before execution. Existing visual dynamics models learn by reconstructing future visual states or rolling out dense latent representations, which spreads learning capacity across visually salient but planning-irrelevant content rather than the action-conditioned changes that drive manipulation outcomes. We propose CAPE, a Contrastive Action-conditioned Parallel Encoding framework that learns visual dynamics by distinguishing the future outcomes induced by different action sequences. Given an initial observation and a candidate action sequence, CAPE decodes the full future latent trajectory in a single forward pass and is trained with a Goal-Convergent Contrastive Objective that aligns predictions corresponding to the same future outcome while separating those corresponding to different outcomes. On real-world DROID and zero-shot transfer to RoboCasa, CAPE substantially outperforms prior baselines on future-state retrieval, offline action matching, and closed-loop planning, while notably reducing planning-time inference cost at long prediction horizons.
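
The abstract does not spell out the Goal-Convergent Contrastive Objective, so the following is only a generic InfoNCE-style sketch of the idea it describes: a predicted future embedding is pulled toward the embedding of its own outcome and pushed away from the outcomes of other action sequences. The function names, the cosine similarity, and the `temperature` value are assumptions for illustration, not the authors' formulation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(predictions, outcomes, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    predictions[i] is the predicted future embedding for action sequence i;
    outcomes[i] is the embedding of the outcome it should match (the positive).
    All other outcomes[j] in the batch act as negatives.
    """
    preds = [normalize(p) for p in predictions]
    outs = [normalize(o) for o in outcomes]
    total = 0.0
    for i, p in enumerate(preds):
        logits = [dot(p, o) / temperature for o in outs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(preds)
```

The loss is small when each prediction is closest to its own outcome and large when predictions line up with the wrong outcomes, which is the separation behaviour the abstract attributes to the training objective.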

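2606.07067 2026-06-08 cs.RO New submission

Extending Responsibility-Sensitive Safety for the Assessment of Offloaded Autonomous Driving Services

Robin Dehler, Aryan Thakur, Michael Buchholz

AI summary: Addresses the safety challenge of variable response times caused by V2X communication when autonomous driving functions are offloaded; extends the Responsibility-Sensitive Safety definitions, proposes offloading decisions and a fallback mechanism based on safety constraints, and introduces a warm-standby phase to make fallback safer.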

Comments 8 pages; accepted for 2026 IEEE 29th International Conference on Intelligent Transportation Systems (ITSC), Naples, Italy, September 15-18, 2026 - DOI will be added after publication

Abstract

Safety is a fundamental requirement in the development of autonomous driving (AD) systems. While function offloading has demonstrated significant benefits in terms of computational efficiency and energy consumption, its application to safety-critical AD functionality introduces new challenges. In particular, offloaded service compositions incur increased and variable response times due to wireless vehicle-to-everything (V2X) communication, which directly affects the vehicle's reaction time and thus its safety guarantees. In this paper, we address this challenge by extending the definitions of Responsibility-Sensitive Safety (RSS) to explicitly account for different response times of local and offloaded AD service compositions. Based on this extension, we propose an integration into function offloading, using the RSS safety constraints for offloading decision-making and fallback mechanisms. Offloaded service compositions are only permitted if the current traffic situation remains safe under the corresponding end-to-end response time. If this condition is violated, the system performs a controlled fallback to local execution. Furthermore, we introduce an enhanced fallback strategy that includes a warm-standby phase for offloaded services, enabling faster and safer transitions from offloaded to local services. The proposed approach is integrated into our AD stack and evaluated in both simulation and the real world. Experimental results demonstrate that the proposed method improves safety compared to state-of-the-art function offloading and safety frameworks, while preserving the benefits of distributed computation when safety conditions allow.
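
To make the role of response time concrete, here is a minimal sketch based on the standard RSS longitudinal safe-distance formula (same-direction following), evaluated once with a local response time and once with a longer offloaded one. The paper's extended definitions are not given in the abstract, so the decision rule, the function names, and the default acceleration and braking values below are illustrative assumptions.

```python
def rss_min_gap(v_rear, v_front, rho,
                a_max_accel=3.0, b_min_brake=4.0, b_max_brake=8.0):
    """Standard RSS minimum longitudinal gap in metres.

    v_rear, v_front: speeds of the following and leading vehicle (m/s)
    rho: response time of the following vehicle (s)
    The acceleration and braking defaults (m/s^2) are assumed values.
    """
    v_rear_after = v_rear + rho * a_max_accel  # worst case after the response time
    gap = (v_rear * rho
           + 0.5 * a_max_accel * rho ** 2
           + v_rear_after ** 2 / (2 * b_min_brake)
           - v_front ** 2 / (2 * b_max_brake))
    return max(0.0, gap)

def choose_execution(gap, v_rear, v_front, rho_local, rho_offloaded):
    """Hypothetical decision rule: offload only if the gap stays RSS-safe
    under the offloaded end-to-end response time, otherwise fall back."""
    if gap >= rss_min_gap(v_rear, v_front, rho_offloaded):
        return "offloaded"
    if gap >= rss_min_gap(v_rear, v_front, rho_local):
        return "local"
    return "unsafe"
```

Because the required gap grows with `rho`, a situation can be safe for local execution yet too tight for the offloaded composition, which is the case where the fallback described in the abstract would apply.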

2606.06960 2026-06-08 cs.CL New submission

Tree-of-Experience: A Structured Experience-Management Solution for Self-Evolving Agents under Low-Repetition and Implicit-Reward Environments

Zihao Deng, Yining Zhu, Leiming Wang, Jingfei Lu, Junbo Wang, Chuncheng Ran, Yu Yang, Dixuan Yang, Jikun Shen

AI summary: For low-repetition tasks in implicit-reward environments, proposes ToE, a structured experience-management method that organizes, retrieves, validates, and updates experience; it outperforms no-experience baselines on a financial sentiment prediction benchmark.

Abstract

Experience-based self-evolution is crucial for LLM agents, but existing benchmarks often assume explicit goals, stable task patterns, and clear feedback. We study a more challenging setting: low-repetition tasks with implicit rewards, where past experience is difficult to reuse and feedback is delayed, noisy, and outcome-level. We introduce FinEvolveBench, a temporally controlled benchmark for financial sentiment prediction that links daily news-driven predictions to future excess returns. We further propose Tree-of-Experience (ToE), a structured experience-management method that organizes, retrieves, validates, and updates agent experience. Experiments show that general-purpose experience mechanisms do not consistently outperform no-experience baselines, while ToE achieves stronger overall performance. These results highlight the importance of structured experience management for self-evolving agents in implicit-reward environments.

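2606.06819 2026-06-08 cs.CV New submission

VideoSEG-O3: A Multi-turn Reinforcement Learning Framework for Reasoning Video Object Segmentation

Ming Dai, Sen Yang, Boqiang Duan, Boyuan Tong, Jiedong Zhuang, Wankou Yang, Jingdong Wang

AI summary: Proposes VideoSEG-O3, the first multi-turn reinforcement learning framework for reasoning video object segmentation; it performs coarse-to-fine reasoning through a multi-turn temporal-spatial chain-of-thought and SEG-aware logit calibration, targeting precise pixel-level localization in complex videos.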

Comments ICML2026

Abstract

Reasoning Video Object Segmentation (RVOS) demands a sophisticated integration of temporal dynamics, spatial details, and linguistic reasoning to achieve precise pixel-level localization. Existing methods are limited to reasoning over fixed initial inputs and lack the capacity to actively acquire further visual evidence, which is often essential for resolving complex references in long or intricate videos. To address this, we propose VideoSEG-O3, the first multi-turn reinforcement learning framework for RVOS that emulates the human "coarse-to-fine" cognitive process. It employs a multi-turn temporal-spatial chain-of-thought to capture fine-grained details by iteratively pinpointing critical intervals and keyframes. Additionally, to enable the policy to perceive segmentation quality beyond mere text probability of [SEG] during the RL stage, we introduce SEG-aware logit calibration, which integrates pixel-wise segmentation feedback directly into the token-level logits. Furthermore, we design a decoupled thinking trace to hierarchically decompose the reasoning process into temporal, spatial, and linguistic dimensions, and construct VTS-CoT, a specialized cold-start dataset featuring comprehensive reasoning trajectories. The code and models will be released at https://github.com/Dmmm1997/VideoSEG-O3.

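2606.06748 2026-06-08 cs.CL cs.AI cs.LG New submission

Evidence Graph Consistency in Retrieval-Augmented Generation: A Model-Dependent Analysis of Hallucination Detection

Jianru Shen

AI summary: Proposes the Evidence Graph Consistency (EGC) framework, which detects hallucinations by building a local evidence graph and computing five structural consistency measures; the consistency features point in opposite directions across model families, so embedding-based graph consistency cannot serve as a model-independent detection signal.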

Comments Accepted at the International Conference on Advanced Machine Learning and Data Science; to appear in the IEEE Xplore proceedings

Abstract

Retrieval-Augmented Generation (RAG) reduces but does not eliminate hallucination in large language models. Existing detection methods rely on flat similarity between generated answers and retrieved passages, ignoring structural relationships among evidence pieces and answer claims. We propose Evidence Graph Consistency (EGC), a framework that constructs a local evidence graph per response and computes five structural consistency measures as hallucination indicators. Evaluated on the full question answering split of RAGTruth across six LLMs (5,767 responses), EGC reveals a consistent model-family split: graph consistency features show the expected diagnostic direction for hallucinations in Llama-2 models but exhibit systematic reversal in GPT-4, GPT-3.5, and Mistral-7B. This reversal suggests qualitatively different hallucination patterns across model families and indicates that embedding-based graph consistency cannot serve as a model-independent hallucination detection signal.

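2606.06682 2026-06-08 cs.LG New submission

Spatiotemporal Imputation with Graph-Informed Flow Matching

Zepeng Zhang, Aref Einizade, Jhony H. Giraldo, Olga Fink

AI summary: Proposes GiFlow, a framework that uses a graph-informed prior and a hybrid vector field model for spatiotemporal imputation and outperforms existing methods.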

Comments Accepted at ICML 2026

Abstract

Missing data is a common challenge in spatiotemporal systems, arising in applications such as air quality monitoring and urban traffic management. Traditional machine learning approaches, like recurrent and graph neural networks, rely on iterative propagation, which tends to accumulate errors over time and space. Recent diffusion-based methods mitigate error propagation but require iterative sampling and often depend on problem-agnostic Gaussian priors, limiting both efficiency and effectiveness. To address these limitations, we propose GiFlow, a Graph-Informed Flow Matching framework for spatiotemporal imputation. GiFlow replaces the typical Gaussian prior with a graph-informed prior constructed via spatiotemporal filtering of observable signals, which better aligns the source distribution to the target and thereby simplifies the generation trajectory. The flow field is parameterized by a hybrid vector field model that integrates spatial attention, temporal attention, and spatiotemporal propagation, enabling joint modeling of spatial and temporal dependencies. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed GiFlow outperforms the state-of-the-art approaches in spatiotemporal imputation. The code is available at https://github.com/zepengzhang/GiFlow.
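
The idea of starting the flow from an informed source rather than Gaussian noise can be sketched as follows. This is not GiFlow's implementation: the neighbour-mean fill is a deliberately crude stand-in for the spatiotemporal filtering the abstract mentions, and the straight-line interpolation path is the common conditional flow matching choice, assumed here. All function names are hypothetical.

```python
def neighbor_mean_prior(values, observed, adjacency):
    """Build a source sample x0: keep observed entries and fill each missing
    node with the mean of its observed neighbours (global observed mean as a
    fallback). Assumes at least one entry is observed."""
    obs_vals = [v for v, o in zip(values, observed) if o]
    global_mean = sum(obs_vals) / len(obs_vals)
    prior = []
    for i, (v, o) in enumerate(zip(values, observed)):
        if o:
            prior.append(v)
        else:
            nb = [values[j] for j in adjacency[i] if observed[j]]
            prior.append(sum(nb) / len(nb) if nb else global_mean)
    return prior

def flow_matching_pair(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 and its constant target
    velocity x1 - x0, which a vector field model would be trained to regress."""
    x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity
```

When the source sample already sits close to the complete signal, the target velocity is small and nearly zero on observed nodes, which illustrates why an aligned prior can simplify the generation trajectory.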

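2606.06663 2026-06-08 cs.LG New submission

Explainable Runtime Dependency Tracking for AI-RAN Conflict Monitoring

Christie Djidjev, Nicholas Kaminski

AI summary: Because parameter-KPI dependencies in AI-RAN may stop being valid, proposes a Boolean-matrix sliding-window inference method that checks consistency against event streams for lightweight, explainable dependency tracking.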

Abstract

Future AI-integrated Radio Access Networks (AI-RAN) will combine open programmability with learning-enabled xApps, rApps, and control functions that act on shared parameters and key performance indicators (KPIs). For conflict monitoring, it is not enough to know which applications are deployed; the system must also know whether the parameter-KPI dependencies assumed by runtime diagnosis remain valid under the current operating regime. This paper studies a lightweight monitoring primitive for that purpose: tracking an interpretable dependency representation from streaming telemetry events. We represent active dependencies by a Boolean matrix and use Boolean matrix multiplication to check whether recent parameter-activity and KPI-response events are consistent with the current estimate. We propose a sliding-window inference procedure that reuses the estimate when it remains consistent and recomputes it when recent observations indicate structural change. The tracker is intended as an explainable signal for conflict diagnosis and slow-loop model refresh, not as an autonomous mitigation mechanism. Experiments on controlled Boolean event streams show efficient and accurate tracking under dependency changes and Boolean observation noise.
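
The consistency check described in the abstract can be sketched with a Boolean matrix-vector product. The row/column layout (KPIs as rows, parameters as columns), the event format, and the `reestimate` callback are assumptions made for illustration; the abstract does not describe the paper's actual inference procedure at this level of detail.

```python
def bool_matvec(dep, active_params):
    """Boolean product: KPI i is expected to respond if any active parameter j
    has dep[i][j] set (OR over j of dep[i][j] AND active_params[j])."""
    return [any(d and a for d, a in zip(row, active_params)) for row in dep]

def is_consistent(dep, window):
    """True if every (parameter-activity, KPI-response) event in the window
    matches what the current dependency estimate predicts."""
    return all(bool_matvec(dep, params) == list(kpis) for params, kpis in window)

def track(dep, window, reestimate):
    """Sliding-window step: reuse the estimate while it stays consistent,
    otherwise recompute it from the recent window via the supplied callback."""
    return dep if is_consistent(dep, window) else reestimate(window)
```

A caller would slide `window` over the telemetry stream and pass its own estimator as `reestimate`; under observation noise, a tolerance on the number of mismatching events would likely replace the strict `all`.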

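2606.06572 2026-06-08 cs.LG cs.AI cs.CY econ.GN q-fin.EC New submission

Generative Models Erode Human Temporal Learning Through Market Selection

Wenjun Cao

AI summary: Argues that modern generative models, at sub-AGI capability levels, erode human temporal learning through market selection; proposes a value-collapse pathway, formalizes it with a costly-inspection framework, and maps cross-domain evidence onto four stages of verification erosion.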

Comments Accepted at ICML 2026

Journal ref Forty-third International Conference on Machine Learning Position Paper Track (2026)

Abstract

We argue that modern generative models create structural risks for knowledge and cultural production at current, sub-AGI capability levels. We define Human Temporal Learning (HTL) as path-dependent knowledge accumulation through sustained engagement with problems over time. Generative outputs increasingly resemble HTL-intensive work in surface features, so verifying whether a given output reflects genuine human learning grows costly relative to its expected benefit. Once verification loses economic justification, evaluators reward outputs regardless of production mode, and producers who invested years of learning compete on price against outputs that cost almost nothing to generate. We call this pathway value collapse and formalize it through a costly-inspection framework. Cross-domain evidence from academic publishing, legal practice, content platforms, and software security maps onto four stages of verification erosion. Alignment success is orthogonal. Better-aligned models narrow observable gaps between human and AI outputs, making source verification harder and intensifying competitive pressure against HTL-intensive work even when individual AI outputs improve.

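2606.06567 2026-06-08 cs.LG New submission

Are you sure? A Comprehensive and Comprehensible Survey of Uncertainty Quantification in Symbolic Regression

Julia Reuter, Fabricio Olivetti de Franca

AI summary: Surveys uncertainty quantification methods in symbolic regression across three research directions (frequentist, Bayesian, and model selection) and notes that the area remains underexplored.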

Abstract

Symbolic regression (SR) is a class of methods that systematically explore the space of mathematical functions to discover models that accurately capture the underlying relationships in a dataset. Despite recent advances in the field, a lack of support for uncertainty quantification (UQ) limits its adoption in real-world decision processes. In regression analysis, UQ provides important information about the model reliability, which can both help to avoid overfitting by accounting for uncertainty in the data, and provide insights for decision-making. This survey is the first to clearly address this issue, with the objective of introducing essential UQ concepts and reviewing the current literature on UQ in SR, which can be broadly organized into three research directions: frequentist, Bayesian, and model selection. Despite its importance, UQ in SR is still underexplored, which motivates further research into reliable UQ methods for SR.

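2606.06531 2026-06-08 cs.AI quant-ph New submission

CARVE-Q: Quantum-Proposed, Classically Certified Interactive Driving Repair

Yifan Wang

AI summary: For vetoed driving maneuvers, proposes the CARVE-Q architecture, which speeds up search over the repair lattice with quantum minimum finding while keeping safety certification classical, yielding auditable interactive repair.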

Comments 9 pages, 3 figures

Abstract

The critical question after a correct driving veto is not only whether a maneuver is unsafe, but whether the blocked interaction admits a lawful, auditable, and responsibility-bounded repair. Prediction and game-theoretic planners can suggest plausible cooperation, yet they do not return a proof that the repair respects hard rules, right-of-way, cost allocation, and ego fallback. We introduce CARVE, Certified Affordable Repair of Vetoed maneuvers via Envelopes, a certificate architecture for prediction-free interactive repair. Given a vetoed maneuver, CARVE constructs a finite repair lattice and emits a structured certificate recording the binding rule, selected joint repair, right-of-way-scaled cooperation envelope, responsibility-weighted cost split, and ego-only fallback. This certificate view reveals the algorithmic bottleneck: multi-owner repair induces a product lattice $M = \prod_j |\mathcal{A}_j|$. We therefore introduce CARVE-Q, a verifier-shielded quantum-AI search layer that applies quantum minimum finding only to this black-box lattice while leaving all safety authority classical. In the conservative verifier-oracle model, exact classical minimum finding requires $\Theta(M)$ queries in the worst case, whereas Dürr-Høyer/Grover minimum finding uses $O(\sqrt{M})$ oracle queries with high probability. We prove verifier-shielded certificate soundness, priority non-elicitation, black-box query separation, and finite-precision reversible-oracle constructibility. We then demonstrate state-vector minimum finding on CARVE repair oracles up to 65,536 assignments and validate certificate preservation on Lanelet2-grounded INTERACTION replay with 100% right-of-way respect, 100% blame consistency, and zero priority false positives. The result is a trust-bounded quantum-AI pattern for certified autonomy: quantum proposes; CARVE certifies.
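
A short arithmetic sketch of the scaling quoted in the abstract. The factorisation used in the usage note (eight owners with four candidate repairs each) is a hypothetical example chosen to reach the 65,536 total; the abstract gives only the total. Constant factors and the success probability behind the $O(\sqrt{M})$ bound are ignored.

```python
import math

def lattice_size(action_counts):
    """Size of the product lattice M = prod_j |A_j| for per-owner action sets."""
    return math.prod(action_counts)

def query_orders(action_counts):
    """Order-of-magnitude oracle query counts, ignoring constant factors:
    classical exact minimum finding scales as M, quantum minimum finding
    as sqrt(M)."""
    m = lattice_size(action_counts)
    return {"assignments": m, "classical": m, "quantum": math.isqrt(m)}
```

For instance, `query_orders([4] * 8)` describes a lattice of 4^8 = 65,536 joint assignments, where the square-root scaling brings the order of queries down to 256.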

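2604.20123 2026-06-08 cs.CV Updated version

Topology-Aware Skeleton Detection via Lighthouse-Guided Structured Inference

Daoyong Fu, Xiang Zhang, Zhaohuan Zhan, Fan Yang, Ke Yang

AI summary: Proposes Lighthouse-Skel, which uses dual-branch collaborative detection of a skeleton confidence field and structural anchors, plus a lighthouse-guided strategy that reconnects discontinuous skeletons, to improve skeleton continuity and structural integrity.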

Comments This submission is withdrawn by the authors because we identified substantive issues in the current version that may affect the reliability and interpretation of the results. We are conducting a thorough revision and validation before making the work publicly available again

Abstract

In natural images, object skeletons are used to represent geometric shapes. However, even slight variations in pose or movement can cause noticeable changes in skeleton structure, increasing the difficulty of detecting the skeleton and often resulting in discontinuous skeletons. Existing methods primarily focus on point-level skeleton point detection and overlook the importance of structural continuity in recovering complete skeletons. To address this issue, we propose Lighthouse-Skel, a topology-aware skeleton detection method via lighthouse-guided structured inference. Specifically, we introduce a dual-branch collaborative detection framework that jointly learns skeleton confidence field and structural anchors, including endpoints and junction points. The spatial distributions learned by the point branch guide the network to focus on topologically vulnerable regions, which improves the accuracy of skeleton detection. Based on the learned skeleton confidence field, we further propose a lighthouse-guided topology completion strategy, which uses detected junction points and breakpoints as lighthouses to reconnect discontinuous skeleton segments along low-cost paths, thereby improving skeleton continuity and structural integrity. Experimental results on four public datasets demonstrate that the proposed method achieves competitive detection accuracy while substantially improving skeleton connectivity and structural integrity.

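2603.21510 2026-06-08 eess.IV cs.CV Updated version

Unregistered Spectral Image Fusion: Unmixing, Adversarial Learning, and Recoverability

Jiahui Song, Sagar Shrestha, Xiao Fu

AI summary: Proposes an unsupervised framework that simultaneously super-resolves unregistered hyperspectral and multispectral images by coupling spectral unmixing with latent-space adversarial learning, and establishes the first theoretical recoverability guarantees.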

Abstract

This paper addresses the fusion of a spatially unregistered pair of images, one hyperspectral (HSI) and one multispectral (MSI), covering roughly overlapping regions. HSIs offer high spectral but low spatial resolution, while MSIs provide the opposite. The goal is to integrate their complementary information to enhance both HSI spatial resolution and MSI spectral resolution. While hyperspectral-multispectral fusion (HMF) has been widely studied, the unregistered setting remains challenging. Many existing methods focus solely on MSI super-resolution, leaving the HSI unchanged. Supervised deep learning approaches have been proposed for HSI super-resolution, but they rely on accurate training data, which is often unavailable. Moreover, theoretical analyses largely address the co-registered case, leaving unregistered HMF poorly understood. In this work, an unsupervised framework is proposed to simultaneously super-resolve both MSI and HSI. The method integrates coupled spectral unmixing for MSI super-resolution with latent-space adversarial learning for HSI super-resolution. Theoretical guarantees on the recoverability of the super-resolved MSI and HSI are established under reasonable generative models, providing, to the best of our knowledge, the first such insights for unregistered HMF. The approach is validated on semi-real and real HSI-MSI pairs across diverse conditions.

2303.11949 2026-06-08 cs.NE cs.LG

A fuzzy adaptive evolutionary-based feature selection and machine learning framework for single and multi-objective body fat prediction

Farshid Keivanian, Raymond Chiong, Zongwen Fan

AI summary: Proposes a feature selection and machine learning framework that combines fuzzy set theory with evolutionary algorithms to improve the accuracy and stability of body fat prediction while handling conflicts in multi-objective optimization.

Comments Due to unforeseen challenges in coordination and supervision, including unavoidable delays, this study requires further review and refinement. To ensure it meets necessary academic and methodological standards, we have decided to withdraw the paper. We appreciate the understanding of the research community

Journal ref Neurocomputing, Article 132974, 2026

Abstract

Predicting body fat can provide medical practitioners and users with essential information for preventing and diagnosing heart diseases. Hybrid machine learning models offer better performance than simple regression analysis methods by selecting relevant body measurements and capturing complex nonlinear relationships among selected features in modelling body fat prediction problems. There are, however, some disadvantages to them. Modelling body fat prediction as a combinatorial single- and multi-objective optimisation problem often gets stuck in local optima. When multiple feature subsets produce similar or close predictions, avoiding local optima becomes more complex. Evolutionary feature selection has been used to solve several machine-learning-based optimisation problems. Fuzzy set theory determines appropriate levels of exploration and exploitation while managing parameterisation and computational costs. A weighted-sum body fat prediction approach was explored using evolutionary feature selection, fuzzy set theory, and machine learning algorithms, integrating contradictory metrics into a single composite goal optimised by fuzzy adaptive evolutionary feature selection. Hybrid fuzzy adaptive global learning local search universal diversity-based feature selection is applied to this single-objective feature selection-machine learning framework (FAGLSUD-based FS-ML). While using fewer features, this model achieved a more accurate and stable estimate of body fat percentage than other hybrid and state-of-the-art machine learning models. A multi-objective FAGLSUD-based FS-MLP is also proposed to analyse accuracy, stability, and dimensionality conflicts simultaneously. To make informed decisions about fat deposits in the most vital body parts and blood lipid levels, medical practitioners and users can use a well-distributed Pareto set of trade-off solutions.

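2606.07469 2026-06-08 econ.EM cs.NA econ.TH math.NA math.PR New submission

Statistical and Numerical Convergence in Stochastic Equilibrium

David Staines

AI summary: Building on the rigorous stochastic equilibrium theory of SELCKE, finds that the system converges geometrically to long-run equilibrium at a rate given by the greater of (i) the eigenvalue or inverse eigenvalue closest to the unit circle and (ii) the maximum shock persistence, and develops a simulation procedure to test whether stochastic equilibrium exists.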

Comments 91 Pages: 63 Main Text, 28 Supplementary Materials

Abstract

This paper sets out the most general computational and econometric implications of the rigorous stochastic equilibrium theory from SELCKE (Staines (2024a)) arXiv:2312.16214. The analytical backbone is the discovery that the system converges geometrically to long-run equilibrium, at a rate given by the greater of the eigenvalue or inverse eigenvalue (from outside) closest to the unit circle and the maximum shock persistence. High-order shocks converge faster. I develop a simulation procedure to test, with asymptotic power, whether stochastic equilibrium exists for a particular model. The fundamental approximation result asserts that, whatever the order of expansion or loss function, the stochastic steady state delivers the most accurate perturbation solution. I also show that super-consistent parameter estimators $O(1/T)$ arise whenever second-order terms vanish. Besides Calvo, I study stochastic equilibrium in two alternative pricing models. Dynamics simplify considerably. I bound the time the impulse response peaks, by the maximum lag in the errors. This lends empirical support to Taylor contracts, although there are issues surrounding unit roots and the strong cost-channel. For menu costs, I demonstrate that the initial price distribution decays away super-exponentially, producing a system equivalent to Calvo with an endogenous reset probability. The impact of idiosyncratic disturbances appears as an additional wedge between actual and efficient output. Blow-up of the objective function at the boundary is proven, with the help of new distributional arguments, so the model meets existing eigenvalue existence conditions for the recursive equilibrium. Along the way, new light is shone on existing theoretical models and statistical procedures.

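2606.07049 2026-06-08 econ.EM New submission

CausalAlpha: A Real-Time Geopolitical Risk Index from OSINT Channels for Causal Discovery in Financial Markets

Andres Azqueta-Gavaldon, Borja Ureta

AI summary: Proposes the CausalAlpha framework, which builds a high-frequency geopolitical risk index from Telegram OSINT channels, uses the PC algorithm to discover the directed causal structure between geopolitical uncertainty and financial variables, and identifies political instability and energy media coverage as causal antecedents of conflict coverage.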

Abstract

We introduce CausalAlpha, an open-source framework that constructs a high-frequency Geopolitical Risk (GPR) index from Telegram OSINT channels using natural language processing, and applies causal discovery methods to identify the directed causal structure between geopolitical uncertainty and financial market variables. Unlike standard sentiment indices or Granger-causality approaches, CausalAlpha employs the Peter-Clark (PC) algorithm to recover the directed acyclic graph (DAG) of causal dependencies between five category-specific GPR indicators and a set of financial variables spanning commodity prices, equity indices, and credit instruments, estimated across four DAG specifications and three significance levels with 500 block-bootstrap resamples. Two findings emerge as globally robust across all DAG specifications at alpha = 0.10: political instability and energy media coverage independently and causally precede conflict coverage, establishing conflict as the primary causal sink of geopolitical narrative escalation in real-time OSINT channels. At the strictest significance level (alpha = 0.05), conflict coverage causally precedes energy sector equity returns (delta XLE), consistent with geopolitical escalation transmitting to energy markets. A Structural VAR on the core macro panel confirms that dynamic transmission from geopolitical NLP signals to financial market prices is statistically weak at daily frequency, suggesting that geopolitical news signals operate primarily within the media narrative system. The framework is deployed as a production application on Google Cloud Run with automated data collection and index construction, representing a step toward real-time macrofinancial risk monitoring using OSINT.

2606.06638 2026-06-08 econ.EM 新提交

Consistent estimation in logit models using historical choices as practical consideration set

使用历史选择作为实际考虑集的Logit模型中的一致估计

C. Angelo Guevara

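AI summary This paper proves that, under a Logit data-generating process, using historical choices as a practical consideration set yields consistent parameter estimates; the proof rests on a reinterpretation of the sampling-of-alternatives theorem and is supported by Monte Carlo evidence.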

详情
AI中文摘要

选择建模中的一个关键挑战在于指定考虑集,即个体在做选择时实际评估的备选方案子集,这对研究者来说是未观察到的(潜在的)。经典的经济人假设认为个体评估全部备选方案,这是一个行为上不可信的假设。实际选项包括直接询问个体,这引入行为偏差;将考虑集视为潜在构念,需要完全枚举和强识别假设;或依赖试图复制个体如何形成这些集的临时启发式方法或非参数方法。最近,一些研究者使用历史选择作为实际考虑集,随着智能卡、手机记录和扫描仪数据等被动数据源的可用性,这种方法变得越来越可行。本文正式证明了一个充分条件,并提供了蒙特卡洛证据,表明在具有跨实例同质选择概率的Logit数据生成过程下,基于历史选择定义实际考虑集可得到参数的一致估计。该证明基于对备选方案抽样定理的重新解释,将历史选择视为来自真实考虑集的抽样,并表明在所述假设下,均匀条件性质成立。文章最后讨论了这一结果的实际意义以及向其他建模框架和假设的潜在扩展。

英文摘要

A key challenge in choice modeling lies in specifying the consideration set, the subset of alternatives that individuals actually evaluate when making choices, which is unobserved (latent) to the researcher. The classical homo economicus assumption posits that individuals assess the full universal set of alternatives, a behaviorally implausible premise. Practical options include directly asking individuals, which introduces behavioral biases; treating the consideration set as a latent construct, requiring full enumeration and strong identification assumptions; or relying on ad hoc heuristics that attempt to replicate how individuals form these sets or on non-parametric methods. Recently, some researchers have used historical choices as a practical consideration set, an approach made increasingly feasible by the availability of passive data sources such as smartcards, mobile phone records, and scanner data. This article provides a formal demonstration of a sufficient condition, along with Monte Carlo evidence, showing that, under a Logit data-generating process with homogeneous choice probabilities across instances, defining a practical consideration set based on historical choices yields consistent parameter estimates. The demonstration is based on a reinterpretation of the sampling-of-alternatives theorem, viewing historical choices as draws from the true consideration set, and showing that under the stated assumptions, the uniform conditioning property holds. The article concludes by discussing the practical implications of this result and potential extensions to other modeling frameworks and assumptions.
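The role of the uniform conditioning property may be easier to see from the standard conditional-likelihood form of the sampling-of-alternatives result for Logit models. The notation below ($V_{in}$ for systematic utility, $C_n$ for the true consideration set, $D_n$ for the observed subset, $\pi$ for the subset-drawing probability) is assumed here for illustration and is not taken from the article.

```latex
% Logit probability that individual n chooses i from the true consideration set C_n
P_n(i \mid C_n) = \frac{\exp(V_{in})}{\sum_{j \in C_n} \exp(V_{jn})}

% Conditional probability given a subset D_n \subseteq C_n drawn with probability \pi(D_n \mid j)
P_n(i \mid D_n) = \frac{\exp\!\big(V_{in} + \ln \pi(D_n \mid i)\big)}
                       {\sum_{j \in D_n} \exp\!\big(V_{jn} + \ln \pi(D_n \mid j)\big)}

% Uniform conditioning: \pi(D_n \mid i) = \pi(D_n \mid j) for all i, j \in D_n,
% so the correction terms cancel and a plain Logit on D_n remains
P_n(i \mid D_n) = \frac{\exp(V_{in})}{\sum_{j \in D_n} \exp(V_{jn})}
```

Read this way, the abstract's claim is that a set built from historical choices satisfies the cancellation condition in the third line under the stated assumptions, so no correction term is needed.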

2606.07347 2026-06-08 eess.SP cs.ET 新提交

CSI Phase Averaging for High-Sensitivity Wi-Fi Sensing in Low-Multipath Environments

低多径环境下的高灵敏度Wi-Fi感知的CSI相位平均

Toshinori Suzuki, Shin-ichiro Ogura, Yu Morishima, Hiroshi Matsuura

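AI summary Proposes a model-driven, low-complexity motion detection method that exploits the structural characteristics of CSI phase to suppress phase offset errors and uses phase averaging to reduce noise; experiments show it can detect flying birds several meters away in a low-multipath outdoor environment.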

Comments 13 pages, 11 figures, 3 tables

详情
AI中文摘要

本文提出一种基于模型驱动的低复杂度运动检测方法,用于户外Wi-Fi感知。该方法利用低多径传播环境下信道状态信息(CSI)相位分量的结构特性(通常被认为不利于Wi-Fi感知),以减轻源自无线设备的相位偏移误差。此外,相位平均提供了处理增益,降低了包括量化噪声和热噪声在内的随机噪声分量。描述了该方法的理论基础,并使用从商用IEEE 802.11ac设备获取的压缩波束成形帧进行了实验评估。实验主要关注户外果园环境中飞行的野生乌鸦。实验结果表明,即使鸟类在距离发射和接收天线之间的直接视距路径数米外飞行,该方法也能检测到它们。此外,结果表明当风速低于3 m/s时,植被运动引起的波动可忽略不计。所提出的方法预计不仅适用于果园监测,也适用于低多径环境下的其他户外Wi-Fi感知应用。

英文摘要

This paper presents a low-complexity motion detection method for outdoor Wi-Fi sensing based on a model-driven approach. The method exploits the structural characteristics of the phase components in channel state information (CSI) for low-multipath propagation environments, which are generally considered disadvantageous for Wi-Fi sensing, to mitigate the phase offset errors originating from wireless devices. In addition, phase averaging provides a processing gain that reduces the random noise components, including quantization and thermal noise. The theoretical basis of the method is described and its effectiveness is experimentally evaluated using Compressed Beamforming frames obtained from commercial IEEE 802.11ac devices. The experiments primarily focus on wild crows flying in an outdoor orchard environment. The experimental results demonstrate that the method can detect birds even when they fly several meters away from the direct line-of-sight path between the transmitter and receiver antennas. Furthermore, the results indicate that fluctuations caused by vegetation movement were negligible when the wind speed was less than 3 m/s. The proposed approach is expected to be applicable not only to orchard monitoring but also to other outdoor Wi-Fi sensing applications in low-multipath environments.
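The processing gain from phase averaging can be illustrated with a generic toy simulation. This is not the paper's algorithm: the abstract does not say which samples are averaged or how, so the circular mean, the Gaussian noise model, and all parameter values below are assumptions.

```python
import cmath
import random
import statistics


def average_phase(phases):
    """Circular mean: sum the unit phasors, then take the angle of the sum."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return cmath.phase(total)


rng = random.Random(1)
true_phase = 0.7   # radians (hypothetical)
noise_std = 0.2    # per-sample phase noise in radians (hypothetical)
n_avg = 64         # number of samples averaged per estimate (hypothetical)
trials = 500

# Phase estimates from a single noisy sample
single = [true_phase + rng.gauss(0.0, noise_std) for _ in range(trials)]

# Phase estimates from the circular mean of n_avg noisy samples
averaged = [
    average_phase([true_phase + rng.gauss(0.0, noise_std) for _ in range(n_avg)])
    for _ in range(trials)
]

std_single = statistics.pstdev(single)
std_avg = statistics.pstdev(averaged)
print(f"std without averaging: {std_single:.4f} rad")
print(f"std with averaging:    {std_avg:.4f} rad")
```

For small, independent noise the standard deviation of the averaged estimate shrinks roughly as 1/sqrt(n_avg), which is the usual sense of "processing gain" in this context.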

2606.07328 2026-06-08 eess.SP 新提交

Implementation and Calibration of 3GPP-Compliant ISAC Channel Simulator

符合3GPP标准的ISAC信道模拟器的实现与校准

Chien-Han Wu, Ming-Chun Lee, Ta-Sung Lee

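AI summary Implements a simulator for the ISAC channel model specified in 3GPP TR 38.901 and calibrates it against the reference results reported by companies in 3GPP, providing key details on simulator implementation and calibration.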

Comments 6 pages, Codes and other source files are open on GitHub

详情
AI中文摘要

集成感知与通信(ISAC)已成为6G系统的关键技术。为了支持ISAC系统的开发,用于性能评估的精确信道建模与仿真至关重要。最近,3GPP为此引入了标准化的ISAC信道模型及其相关的校准程序。然而,由于建模方法的复杂性以及3GPP报告中缺乏完全明确的实现细节,不同的实现可能导致不一致或不同步的仿真结果。为了解决这个问题,在本工作中,我们实现了TR 38.901中指定的3GPP ISAC信道模型模拟器,并进行了全面的校准分析。我们将仿真结果与3GPP中公司报告的参考结果进行比较,并讨论了几个关键的实现细节,以提供对模拟器实现和校准的见解。为了促进可重复性和进一步研究,所开发的模拟器以及相关数据集和校准结果已作为开源项目在GitHub上发布。

英文摘要

Integrated sensing and communication (ISAC) has emerged as a key technology for 6G systems. To support the development of ISAC systems, accurate channel modeling and simulation for performance evaluation is essential. Recently, 3GPP introduced a standardized ISAC channel model and its associated calibration procedure for this purpose. However, due to the complexity of the modeling methodology and the lack of fully explicit implementation details in the 3GPP reports, different implementations may lead to inconsistent or unsynchronized simulation results. To address this issue, in this work, we implement the 3GPP ISAC channel model simulator specified in TR 38.901 and conduct a comprehensive calibration analysis. We compare the simulation results with the reference results reported by companies in 3GPP and discuss several key implementation details to provide insights into the implementation and calibration of the simulator. To facilitate reproducibility and further research, the developed simulator, together with the relevant datasets and calibration results, has been released as an open-source project on GitHub.

2606.07284 2026-06-08 eess.SP 新提交

RSMA Enabled Hierarchical UAV Networks with Non Linear Energy Harvesting: Outage Probability Analysis and UAV Placement Optimization

具有非线性能量收集的RSMA赋能分层无人机网络:中断概率分析与无人机部署优化

Faicel Khennoufa, Khelil Abdellatif, Metin Ozturk, Halim Yanikomeroglu, Safwan Alfattani

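AI summary To address energy constraints and hardware impairments in hierarchical UAV networks, proposes a scheme combining non-linear energy harvesting with rate-splitting multiple access, derives outage probability expressions, and optimizes UAV placement, significantly improving reliability.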

Comments Accepted in IEEE Transactions on Vehicular Technology

详情
AI中文摘要

无人机有望增强第六代蜂窝网络的连接性、扩展网络覆盖并支持高级通信服务,特别是在公共和民用应用中。尽管多无人机系统比单无人机部署具有更高的效率和成本效益,但其实现仍面临若干基本挑战,限制了其可靠性、可持续性和可扩展性。有限的机载能量限制了任务持续时间和通信连续性。因此,无线能量收集成为克服这一限制的有前景的解决方案。然而,地面能源存在路径损耗,使得从周围无人机收集能量更具可持续性。此外,在硬件损伤和不完美信道状态信息下,速率分割多址接入在分层无人机网络中尚未得到充分探索。本文提出一种具有非线性能量收集和RSMA的分层自组织无人机网络,以提高能量和成本效率,其中无人机从周围无人机收集能量。针对实际场景,我们在所提系统中考虑了HWI和ICSI的影响。据作者所知,本研究是文献中首次对此类场景进行探讨。推导了地面物联网设备、每个CMU以及所提系统总中断概率的表达式,基于Nakagami-$m$衰落信道,同时考虑了HWI、ICSI和非线性EH等实际约束。此外,还推导了高发射功率区域下的近似中断概率表达式。随后,我们制定了两个优化问题以提高可靠性和性能。结果表明,所提系统在中断概率方面优于所有基准方案。

英文摘要

Uncrewed aerial vehicles (UAVs) are expected to enhance connectivity, extend network coverage, and support advanced communication services in sixth-generation (6G) cellular networks, particularly in public and civil applications. Although multi-UAV systems offer greater efficiency and cost-effectiveness than single-UAV deployments, their implementation still faces several fundamental challenges that limit their reliability, sustainability, and scalability. The limited onboard energy restricts mission duration and communication continuity. Therefore, wireless energy harvesting (EH) emerges as a promising solution to overcome this limitation. However, terrestrial energy sources experience path loss, making EH from surrounding UAVs more sustainable. Moreover, rate-splitting multiple access (RSMA) remains insufficiently explored in hierarchical UAV networks under hardware impairments (HWI) and imperfect channel state information (ICSI). This paper proposes a hierarchical ad hoc UAV network with non-linear EH and RSMA to enhance both energy and cost efficiency, where UAVs harvest energy from surrounding UAVs. For a practical scenario, we consider the effect of HWI and ICSI in our proposed system. To the best of the authors' knowledge, this study is the first to investigate such a scenario in the literature. The outage probability expressions for ground Internet of Things (IoT) devices, each CMU, and the overall outage probability of the proposed system are derived over Nakagami-$m$ fading channels while considering practical constraints such as HWI, ICSI, and non-linear EH. Additionally, approximate outage probability expressions are derived for high transmit power regimes. Subsequently, we formulate two optimization problems to enhance reliability and performance. Our findings indicate that the proposed system outperforms all benchmarks in terms of outage probability.

2606.07264 2026-06-08 eess.AS 新提交

VISA: A Visual Information Strengthened Audio-Reasoning System for the Interspeech 2026 ARC Agent Track

VISA:面向Interspeech 2026 ARC智能体赛道的视觉信息增强音频推理系统

Wenming Tu, Jian Gao, Yanru Huo, Yixuan Wang, Jing Peng, Bohan Li, Ziyang Ma, Tao Liu, Shuai Fan, Kai Yu, Xie Chen, Zilong Zheng

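AI summary Proposes VISA, a system that strengthens the audio reasoning of large audio language models through multi-modal feature extraction, model-voting inference, and fine-grained category-aware routing, achieving a 66.23% Rubrics score and 77.40% accuracy on the ARC Agent Track.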

Comments Submitted to INTERSPEECH 2026

详情
AI中文摘要

音频推理需要对时变动态和声学混合信号进行多步骤、基于证据的推理,超越了传统感知任务如ASR或字幕生成。我们提出VISA,作为提交至Interspeech 2026音频推理挑战赛(智能体赛道)的系统,通过MMAR评分标准评估正确性和推理质量。在“LALM作为工具”范式下,VISA利用辅助多模态证据增强大音频语言模型,同时避免繁重的编排。该系统集成三个组件:多模态特征提取以获取互补的音频和声学-视觉线索,带一致性检查的模型投票推理以获得稳定预测,以及细粒度类别感知路由以解决分歧并选择符合评分标准的推理链。在官方智能体赛道排行榜上,VISA以66.23%的评分排名第二。它还达到了77.40%的准确率,是单模型和智能体赛道所有系统中最高的。

英文摘要

Audio reasoning requires multi-step, evidence-grounded inference over temporally dynamic and acoustically mixed signals, exceeding conventional perception tasks such as ASR or captioning. We present VISA, our submission to the Interspeech 2026 Audio Reasoning Challenge (Agent Track), evaluated via the MMAR Rubrics for correctness and reasoning quality. Under a "LALM as a Tool" paradigm, VISA strengthens large audio language models with auxiliary multi-modal evidence while avoiding heavy orchestration. The system integrates three components: multi-modal feature extraction for complementary audio and acoustic-visual clues, model-voting inference with consistency checking for stable predictions, and fine-grained category-aware routing to resolve disagreements and select rubric-aligned reasoning chains. On the official Agent Track leaderboard, VISA ranks 2nd overall with a 66.23% Rubrics score. It also achieves 77.40% Accuracy, the highest among all systems listed across both the Single Model and Agent tracks.
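The "model-voting inference with consistency checking" component is described only at a high level. One simple reading, given here as a hedged sketch rather than the VISA implementation, is a majority vote that returns no answer when agreement is too low, so that a separate routing step can take over. The function name `vote` and the 0.5 threshold are assumptions.

```python
from collections import Counter


def vote(answers, min_agreement=0.5):
    """Majority vote over model answers with a simple consistency check.

    Returns (answer, agreement). The answer is None when the share of models
    backing the top answer does not exceed `min_agreement`, signalling that
    the disagreement should be resolved elsewhere (e.g. by a routing step).
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    agreement = top_count / len(answers)
    if agreement > min_agreement:
        return top_answer, agreement
    return None, agreement


stable = vote(["B", "B", "C"])  # two of three models agree
split = vote(["A", "B", "C"])   # no majority, so the answer is None
```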

2606.07182 2026-06-08 eess.AS 新提交

Audio Imitator: Controlling Timbre and Tempo in Video2Audio Synthesis with Audio Reference

Audio Imitator: 通过音频参考控制视频到音频合成中的音色和节奏

Jiahui Zhao, Tianrui Wang, Chunyu Qiang, Cheng Gong, Xijuan Zeng, Feng Deng, Longbiao Wang

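AI summary Proposes the AudioIM framework, which models timbre and tempo separately with dual encoders to enable fine-grained style control, improving style similarity while preserving semantic consistency.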

详情
AI中文摘要

视频到音频生成在实现无声视频的语义一致性和时间对齐方面取得了显著进展。然而,音频包含丰富的风格属性,如音色和节奏,这些很难仅从视觉和文本输入中推断出来。虽然参考音频可以作为额外的条件,但它通常被视为整体信号,限制了细粒度的风格控制。我们提出AudioIM,一个属性感知框架,明确将音色和节奏建模为独立的控制因素,而不是依赖整体提示条件。双编码器提取互补的音色相关和节奏相关表示,并通过全局条件注入。基于掩码的训练策略使得在推理时能够进行有效的潜在提示条件。在VGGSound上的实验表明,在保持语义对齐和同步的同时,风格相似度得到了提升。音频样本可在以下网址获取:https://anonymousdemo757.github.io/。

英文摘要

Video-to-audio generation has made significant progress in achieving semantic consistency and temporal alignment from silent videos. However, audio contains rich stylistic attributes such as timbre and tempo that are difficult to infer from visual and textual inputs alone. While reference audio can serve as additional conditioning, it is typically treated as a holistic signal, limiting fine-grained style control. We propose AudioIM, an attribute-aware framework that explicitly models timbre and tempo as separate control factors rather than relying on holistic prompt conditioning. Dual encoders extract complementary timbre-related and tempo-related representations, which are injected through global conditioning. A masking-based training strategy enables effective latent prompt conditioning at inference. Experiments on VGGSound show improved style similarity while preserving semantic alignment and synchronization. Audio samples are available at: https://anonymousdemo757.github.io/.

2606.07104 2026-06-08 eess.SP 新提交

Robust Secure Beamforming for Movable Antenna Enhanced Integrated Sensing and Communications

可移动天线增强集成感知与通信的鲁棒安全波束赋形

Yuan Chen, Ning Wei, Ahmad Bazzi, Xiangyu Dong, Ran Yang, You Li, Yue Xiu

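AI summary For imperfect eavesdropping channel state information, proposes a robust beamforming design that jointly optimizes transmit beamforming and antenna positions to maximize the radar SINR while guaranteeing communication security, using a block coordinate descent-based algorithm combined with successive convex approximation and fractional programming.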

详情
AI中文摘要

在这封信中,我们研究了可移动天线增强的安全集成感知与通信系统中,在存在不完美窃听信道状态信息情况下的鲁棒波束赋形设计。为了提升雷达感知性能,我们通过联合优化发射波束赋形和天线位置,同时确保通信数据安全,提出了一个雷达信干噪比最大化问题。然而,由于天线位置到信道系数的非线性映射以及窃听者信道的不确定性,所得到的优化问题本质上是难以处理的。为了应对这些挑战,我们提出了一种基于块坐标下降的算法,结合了逐次凸近似和分数规划技术。仿真结果表明,我们提出的算法具有快速收敛性,并在保证通信安全的同时显著提升了雷达信干噪比。

英文摘要

In this letter, we investigate robust beamforming design for a movable antenna (MA)-enhanced secure integrated sensing and communications (ISAC) system with imperfect eavesdropping channel state information (CSI). To improve radar sensing performance, we formulate a radar signal-to-interference-plus-noise ratio (SINR) maximization problem by jointly optimizing the transmit beamforming and antenna placement while ensuring communication data security. However, the resulting optimization problem is inherently intractable due to the nonlinear mapping from antenna positions to channel coefficients, as well as the eavesdropper (Eve) channel uncertainty. To handle these challenges, we propose a block coordinate descent (BCD)-based algorithm incorporating successive convex approximation (SCA) and fractional programming (FP) techniques. Simulation results show that our proposed algorithm exhibits fast convergence and achieves a significant improvement in the radar SINR while guaranteeing communication security.

2606.07091 2026-06-08 eess.SP 新提交

Rate-Splitting--Inspired Uplink Near-Field ISAC

速率分裂启发的上行近场ISAC

Anup Mishra, Israel Leyva-Mayorga, Petar Popovski

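AI summary Proposes a rate-splitting (RS)-inspired uplink near-field ISAC framework that splits the communication message across the sensing operation, derives closed-form expressions for the communication rate and sensing rate, characterizes the achievable rate region, and shows that the RS-inspired boundary contains the NOMA-inspired time-sharing region.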

详情
AI中文摘要

集成感知与通信(ISAC)使感知和通信(S&C)功能共享频谱、硬件和信号处理资源,但由此产生的功能间干扰带来了基本的接收机设计挑战,特别是在上行链路操作中。本文开发了一个速率分裂(RS)启发的上行近场ISAC框架。该框架通过将通信消息分裂到感知操作中,推广了非正交多址(NOMA)启发ISAC的感知中心(S-C)和通信中心(C-C)端点顺序。推导了通信速率(CR)和感知速率(SR)的闭式表达式,考虑了来自目标响应估计不确定性的残余感知干扰。在感知匹配照明下表征了可达CR-SR速率区域,其中所提出的单帧RS启发边界包含NOMA启发的时间共享区域。与经典高斯上行多址信道(其中RS恢复时间共享主导面)不同,上行ISAC中的分裂因子也重塑了感知阶段的干扰,使得RS启发边界匹配或严格扩大S&C折衷。高信噪比分析表明,对于非对齐的S&C信道,残余感知干扰改变速率偏移但不改变主导S&C斜率,而在完全对齐的情况下,它变得斜率受限。使用孔径感知的近场信道模型,推导了大阵列极限,表明随着阵列增长,可达速率保持有限。数值结果验证了分析,并展示了RS启发方案的优势、残余感知干扰的影响以及由物理一致近场建模引起的有限大阵列行为。

英文摘要

Integrated sensing and communication (ISAC) enables sensing and communication (S&C) functionalities to share spectrum, hardware, and signal-processing resources, but the resulting inter-functionality interference creates a fundamental receiver-design challenge, particularly in uplink operation. This paper develops a rate-splitting (RS)-inspired framework for uplink near-field ISAC. The framework generalizes the sensing-centric (S-C) and communication-centric (C-C) endpoint orders of non-orthogonal multiple access (NOMA)-inspired ISAC by splitting the communication message across the sensing operation. Closed-form expressions are derived for the communication-rate (CR) and sensing-rate (SR), accounting for residual sensing interference from target-response estimation uncertainty. The achievable CR-SR rate region is characterized under sensing-matched illumination, where the proposed single-frame RS-inspired boundary contains the NOMA-inspired time-sharing region. Unlike the classical Gaussian uplink multiple access channel, where RS recovers the time-sharing dominant face, the split factor in uplink ISAC also reshapes the sensing-stage interference, allowing the RS-inspired boundary to match or strictly enlarge the S&C tradeoff. High-SNR analysis shows that, for non-aligned S&C channels, residual sensing interference changes the rate offsets but not the leading S&C slopes, whereas in the fully-aligned case it becomes slope-limiting. Using an aperture-aware near-field channel model, large-array limits are derived, showing that achievable rates remain finite as the array grows. Numerical results validate the analysis and demonstrate the benefits of the RS-inspired scheme, the impact of residual sensing interference, and the bounded large-array behaviour induced by physically consistent near-field modelling.

2606.07050 2026-06-08 eess.SP 新提交

Optimized Sampling of Angle-Resolved Scatterometry Data Using End-to-End Compressed Learning Model for Nanograss Deficiency Detection

使用端到端压缩学习模型优化角度分辨散射测量数据采样用于纳米草缺陷检测

Mehdi Abdollahpour, Carsten Bockelmann, Armin Dekorsy

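AI summary Proposes an end-to-end compressed learning framework that integrates a learnable latitude-based sampling layer with a CNN to jointly optimize sampling and classification, matching full-image performance (94.2% five-level deficiency classification accuracy) with up to 90% fewer sampling points.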

Comments Preprint. 13 pages, 11 figures

详情
AI中文摘要

纳米表面的可靠检测对于确保纳米结构制造质量至关重要。角度分辨散射测量提供了一种非侵入式检测方法,可在线使用,但由于密集的角度采样,通常采集时间较长。本文针对数据采集挑战,提出了一种端到端压缩学习框架,用于使用ARS图像检测氧化锌纳米草中的5级空位缺陷。该框架将可学习的基于纬度的采样层与卷积神经网络集成,使得采样和分类可以在训练过程中联合优化。采样层利用ARS模式的物理结构,学习信息丰富的纬度区域,从而减少采样搜索空间并提高收敛性。评估结果表明,所提方法在不同噪声条件下实现了高且稳定的缺陷级分类性能。使用完整ARS图像,模型在五级缺陷分类中达到94.2%的准确率,在区分缺陷与非缺陷纳米表面时达到98.6%的准确率。所提采样模型在使用多达90%更少的角度采样点时,性能与全图像相当。即使采样点减少99.7%,分类准确率下降不到10个百分点。为了进一步改善有限数据下的训练,我们还研究了基于GAN的增强方法,并使用GAN生成的数据进行模型预训练。增强数据使得仅需少量微调轮次即可快速收敛。

英文摘要

Reliable inspection of nanosurfaces is essential to ensure the quality of nanostructure manufacturing. Angle-resolved scatterometry provides a non-invasive inspection method that can be used in-line but often suffers from long acquisition times due to dense angular sampling. This paper addresses the data acquisition challenge by proposing an end-to-end compressed learning framework for 5-level vacancy deficiency detection in zinc oxide nanograss using ARS images. The proposed framework integrates a learnable latitude-based sampling layer with a convolutional neural network, allowing sampling and classification to be jointly optimized during training. The sampling layer exploits the physical structure of ARS patterns and learns informative latitudinal regions, which reduces the sampling search space and improves convergence. Evaluation results show that the proposed approach achieves high and stable deficiency-level classification performance under different noise conditions. Using full ARS images, the model achieves 94.2% accuracy for five-level deficiency classification and 98.6% accuracy for separating deficient from non-deficient nanosurfaces. The proposed sampling model matches full-image performance while using up to 90% fewer angular sampling points. Even when sampling points are reduced by 99.7%, the classification accuracy decreases by less than 10 percentage points. To further improve training with limited data, we also studied a GAN-based augmentation approach and used GAN-generated data for model pretraining. Augmented data resulted in fast convergence within only a few fine-tuning epochs.

2606.07026 2026-06-08 eess.SP 新提交

A Novel Stripe-based RIS Optimization for UAV Communications and Sensing in Low-Altitude Wireless Networks

基于条带的可重构智能表面优化用于低空无线网络中的无人机通信与感知

Burak Ahmet Celebi, Sefa Kayraklik, Onur Salan, Ibrahim Hokelek, Ali Emre Pusane, Ali Gorcin

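AI summary Proposes a low-complexity stripe-based RIS phase optimization framework that exploits the structural phase gradient of adjacent elements to shrink the search space, enhancing communication reliability and providing passive sensing capability under 3D mobility; simulations and experiments confirm its fast convergence and robustness.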

Comments 13 Pages, 14 figures

详情
AI中文摘要

低空无线网络(LAWN)设想了一种可重构的3D网络,能够支持关键任务的空中操作。本文提出了一种可重构智能表面(RIS)辅助的LAWN,以在变化的无线信道条件和信号阻塞下与无人机(UAV)建立可靠通信。提出了一种低复杂度的条带式RIS相移优化框架,以同时增强通信可靠性并为3D移动下的UAV跟踪提供被动感知能力。与高复杂度的优化方法不同,所提方法利用RIS相邻元素固有的结构相位梯度,显著减少了随UAV移动计算和更新RIS配置的搜索空间。分析和仿真结果表明,所提框架在收敛速度和计算效率上优于传统基准,即使在存在相位估计误差和低信噪比(SNR)的情况下,也能保持稳健的高SNR连接。此外,在室外校园环境中使用真实RIS原型进行了测量实验,以证明所提方法的实际可行性。

英文摘要

Low-altitude wireless networks (LAWN) envision a reconfigurable 3D network capable of supporting mission-critical aerial operations. This paper presents a reconfigurable intelligent surface (RIS)-assisted LAWN to establish reliable communication with an unmanned aerial vehicle (UAV) across varying wireless channel conditions and signal blockages. A low-complexity stripe-based RIS phase shift optimization framework is proposed to simultaneously enhance communication reliability and provide passive sensing capability for UAV tracking under 3D mobility. Unlike high-complexity optimization approaches, the proposed method leverages the inherent structural phase gradient of adjacent RIS elements to significantly reduce the search space for calculating and updating the RIS configuration as the UAV moves. The analysis and simulation results demonstrate that the proposed framework outperforms conventional benchmarks in convergence speed and computational efficiency, while maintaining robust, high signal-to-noise-ratio (SNR) connectivity even in the presence of phase estimation errors and low SNR regimes. In addition, measurement experiments using a real RIS prototype in an outdoor campus environment are performed to demonstrate the practical viability of the proposed approach.

2606.06962 2026-06-08 eess.AS 新提交

FSC-Net: Integrating Fast Fourier Convolutions and Progressive Learning for Speech Bandwidth Extension

FSC-Net:融合快速傅里叶卷积与渐进学习的语音带宽扩展

Xinan Chen, Xiaobin Rong, Qinwen Hu, Kai Chen, Jing Lu

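AI summary Proposes FSC-Net, which integrates Fast Fourier Convolutions and frequency-progressive learning to efficiently model cross-band harmonic dependencies for high-fidelity narrowband-to-wideband speech reconstruction, achieving leading LSD and PESQ scores on the VCTK 4 kHz-to-48 kHz task with 1.54 M parameters.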

Comments 5 pages, 2 figures

详情
AI中文摘要

语音带宽扩展(BWE)旨在从窄带输入重建高保真宽带音频。尽管近期方法取得了显著进展,但它们通常难以重建真实的高频相位和谐波结构,导致感知伪影。本文提出FSC-Net(全频谱上下文网络),一种参数高效的架构,旨在显式建模跨频段谐波依赖。通过将快速傅里叶卷积(FFCs)集成到复频谱映射框架中,FSC-Net将其感受野扩展到整个频谱,有效捕获长程频率交互。为解决高频生成的不适定性,我们新颖的频率渐进学习课程引导网络从粗到细地重建频谱细节。在VCTK和未见过的EARS数据集上的实验结果表明,FSC-Net提供了持续强劲的重建质量和泛化能力,尤其在具有挑战性的VCTK 4 kHz至48 kHz任务中。与规模更大的基线相比,我们的模型在保持高度紧凑的参数规模(1.54 M)的同时,取得了领先的LSD和PESQ分数。

英文摘要

Speech bandwidth extension (BWE) aims to reconstruct high-fidelity wideband audio from narrowband inputs. While recent approaches have made significant progress, they often struggle to reconstruct realistic high-frequency phase and harmonic structures, leading to perceptual artifacts. In this paper, we propose FSC-Net (Full-Spectrum Context Network), a parameter-efficient architecture designed to explicitly model cross-band harmonic dependencies. By integrating Fast Fourier Convolutions (FFCs) into a complex spectral mapping framework, FSC-Net expands its receptive field to the entire spectrum, capturing long-range frequency interactions effectively. To address the ill-posed nature of high-frequency generation, our novel frequency-progressive learning curriculum guides the network to reconstruct spectral details from coarse to fine. Experimental results on the VCTK and unseen EARS datasets demonstrate that FSC-Net delivers consistently strong reconstruction quality and generalization, particularly in the challenging VCTK 4 kHz-to-48 kHz task. Compared to scaled-up baselines, our model attains leading LSD and PESQ scores while maintaining a highly compact parameter footprint (1.54 M).

2606.06954 2026-06-08 eess.SP 新提交

Learn to Access and Backhaul the Sky: Multi-Scale Radio Map Guided Multi-UAV Cooperation

学会接入和回传天空:多尺度无线电地图引导的多无人机协作

Yifeng Yuan, Shijian Gao

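AI summary To address the end-to-end bottlenecks that user mobility and building blockages cause for UAV swarms in 3D scenarios, proposes a Multi-Scale Radio Map-Guided (MRMG) framework that combines global, local, and link-level map information and uses multi-agent reinforcement learning to jointly optimize UAV movement, next-hop selection, and power control, significantly improving network throughput and cell-edge user rates.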

Comments 6 pages, 4 figures

详情
AI中文摘要

受新兴低空经济的驱动,无人机群提供了灵活的集成空地接入和回传。然而,由于用户移动和建筑遮挡在这些三维场景中的相互依赖动态性,提供无缝连接是困难的。这些因素在端到端路径中造成快速变化的瓶颈。此外,联合控制的多维性质限制了传统启发式方法的有效性。为了应对这些挑战,提出了一个多尺度无线电地图引导(MRMG)框架。MRMG框架通过整合三个不同层次的无线电信息来处理异构动态:全局地图提供区域覆盖洞察,局部地图捕获邻域尺度服务条件,链路级地图表征高分辨率信道特征。这种设计有效地解耦了宏观移动和微观链路自适应。为了实现长期性能提升,一个多智能体强化学习(MARL)控制器学习无人机移动、下一跳选择和发射功率控制的协作策略。仿真结果表明,MRMG框架不仅提高了网络吞吐量,还显著增强了小区边缘服务,几乎将第5百分位用户速率翻倍。

英文摘要

Driven by the emerging low-altitude economy, uncrewed aerial vehicle (UAV) swarms offer flexible integrated air-ground access and backhaul. However, providing seamless connectivity is difficult due to the interdependent dynamics of user mobility and building blockages in these 3D scenarios. These factors create rapidly shifting bottlenecks in end-to-end paths. Furthermore, the multi-dimensional nature of joint control limits the effectiveness of traditional heuristics. To address these challenges, a Multi-Scale Radio Map-Guided (MRMG) framework is proposed. The MRMG framework handles heterogeneous dynamics by integrating three distinct levels of radio information: global-level maps provide regional coverage insights, local-level maps capture neighborhood-scale service conditions, and link-level maps characterize high-resolution channel features. This design effectively decouples macro-movement from micro-link adaptation. To yield long-term performance improvements, a multi-agent reinforcement learning (MARL) controller learns cooperative policies for UAV movement, next-hop selection, and transmit-power control. Simulation results show that the MRMG framework not only improves network throughput but also significantly bolsters cell-edge service, nearly doubling the 5th-percentile user rate.

2606.06933 2026-06-08 eess.IV 新提交

A 3D Formulation of the Extended Phaseless Rytov Approximation

扩展无相位Rytov近似的三维公式

Wanqin Ma, Zan Li, Amartansh Dubey, Alikhan Umirbayev, Yijun Chen, Junhui Rao, Ross Murch

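AI summary Proposes the extended three-dimensional phaseless Rytov approximation (x3DPRA), which extends a 2D phaseless RF imaging method to 3D while keeping the implementation simple, enables volumetric imaging, and is validated in simulation for localization, shape reconstruction, and material attenuation estimation.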

Comments 12 pages, 6 figures, In processing for IEEE Trans

详情
AI中文摘要

扩展无相位Rytov近似(xPRA)是一种最近提出的无设备射频成像技术,仅使用无相位测量(如接收信号强度RSS)即可提供成像区域的高分辨率重建。由于其无相位公式,可以利用现有无线通信基础设施直接实现。它也优于著名的无设备无相位RF成像方法,如无线电断层成像(RTI)。xPRA(和RTI)中使用的线性无相位公式使得这些方法可能对下一代无线网络中的集成感知与通信(ISAC)系统有用,因为它们不需要宽带宽。然而,到目前为止,xPRA和RTI主要是在二维(2D)中提出的。本文介绍了xPRA的三维扩展,我们称之为扩展三维无相位Rytov近似(x3DPRA)。我们方法的新颖之处在于,它保留了RTI和xPRA的直接实现优势,同时实现了体积(3D)成像。仿真结果表明,x3DPRA提供了良好的位置和形状估计,并且还可以重建物体材料衰减。我们提出了三维公式,通过与二维模型比较进行验证,并报告了展示其性能的仿真结果。

英文摘要

The extended Phaseless Rytov Approximation (xPRA) is a recently proposed device-free RF imaging technique that provides high-resolution reconstructions of the imaging region using only phaseless measurements, such as received signal strength (RSS). Because of its phaseless formulation, it can be implemented straightforwardly using existing wireless communication infrastructure. It also outperforms well-known device-free phaseless RF imaging methods such as Radio Tomographic Imaging (RTI). The linear phaseless formulation used in xPRA (and RTI) makes these methods potentially useful for integrated sensing and communication (ISAC) systems in next-generation wireless networks since they do not require wide bandwidths. However, so far, both xPRA and RTI have primarily been formulated in two dimensions (2D). This paper introduces a 3D extension of xPRA, which we call the extended three-dimensional phaseless Rytov approximation (x3DPRA). The novelty of our approach is that it preserves the straightforward implementation advantages of RTI and xPRA while enabling volumetric (3D) imaging. Simulation results show that x3DPRA provides good estimates of location and shape and can also reconstruct object material attenuation. We present the 3D formulation, validate it with a 2D model comparison, and report simulation results demonstrating its performance.

2606.06846 2026-06-08 eess.SP 新提交

Variable-Length Finite-Rate CSI Feedback With Generative Priors

变长有限速率CSI反馈与生成先验

Yangxuan Cheng, Fanyang Meng, Jian Zou, Jiacheng Xie, Zhongqiang Zhang, Ye Wang, Yongsheng Liang

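AI summary Proposes CsiCoGen, a variable-length CSI feedback structure based on a generative diffusion model that uses a transferable codebook to allow flexible sequence length and quantization precision without joint training, reaching about -31 dB indoor and -20 dB outdoor NMSE on COST2100 in the high-rate regime.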

详情
AI中文摘要

本文从结构角度研究了变长有限速率CSI反馈,并提出了CsiCoGen,一种新颖的生成式反馈结构,具有无需联合训练的可迁移码本机制。UE将$H_0$映射为有序的码本索引序列,而BS利用共享的去噪先验从接收到的任意部分反馈索引序列中递归恢复CSI。这通过码本大小实现了反馈序列长度和每步量化精度的灵活控制。CsiCoGen不需要联合训练特定任务的反馈编码器或码本与重构器,且相同的在线结构可以搭配不同的预训练去噪器。在本文中,我们使用生成扩散模型实例化解码器。在COST2100上的仿真结果表明,与代表性基线相比,CsiCoGen在速率-NMSE和速率-$\rho$权衡上表现优异,在高码率下达到约-31 dB室内NMSE和-20 dB室外NMSE,同时展示了可扩展的解码复杂度和可调节的每步量化精度。

英文摘要

This letter studies variable-length finite-rate CSI feedback from a structural perspective and proposes CsiCoGen, a novel generative feedback structure with a transferable codebook mechanism without joint training. The UE maps $H_0$ into an ordered sequence of codebook indices, while the BS recursively recovers CSI from any received partial sequence of feedback indices using a shared denoising prior. This enables flexible control of feedback sequence length and per-step quantization precision through codebook size. CsiCoGen does not require jointly training a task-specific feedback encoder or codebook with the reconstructor, and the same online structure can be paired with different pretrained denoisers. In this work, we instantiate the decoder with a generative diffusion model. Simulation results on COST2100 show favorable rate-NMSE and rate-$ρ$ tradeoffs against representative baselines, with CsiCoGen reaching about -31 dB indoor NMSE and -20 dB outdoor NMSE in the high-rate regime while demonstrating scalable decoding complexity and adjustable per-step quantization precision.
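To put the reported NMSE figures in context, the sketch below computes NMSE in dB under a common definition (squared reconstruction error divided by the squared norm of the true channel). The abstract does not state the exact normalization or averaging used in the letter, so treat the definition, the function name `nmse_db`, and the toy channel values as assumptions.

```python
import math


def nmse_db(h_true, h_est):
    """NMSE in dB: 10*log10(sum|h - h_hat|^2 / sum|h|^2) over all entries.

    Undefined for a perfect reconstruction (zero error), since log10(0) diverges.
    """
    err = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    power = sum(abs(a) ** 2 for a in h_true)
    return 10.0 * math.log10(err / power)


# Toy channel: every coefficient is reconstructed with a 10% amplitude error
h_true = [1 + 0j, 0 + 1j]
h_est = [0.9 + 0j, 0 + 0.9j]
example_db = nmse_db(h_true, h_est)

# Linear error-to-signal power ratios corresponding to the quoted dB values
ratio_indoor = 10 ** (-31 / 10)
ratio_outdoor = 10 ** (-20 / 10)
```

On this scale, -20 dB means the error power is 1% of the channel power, and -31 dB means it is a little under 0.1%.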

2606.06792 2026-06-08 eess.SP 新提交

Copula Function Parameter Regions in Analyzing Wireless Communications Performances

无线通信性能分析中的Copula函数参数区域

Mona Mohsenzadeh, Saeid Pakravan, Ghosheh Abed Hodtani

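AI summary Introduces the concept of copula dependence parameter regions and, using an FGM copula in a two-user MAC as an example, derives parameter regions from communication-theoretic and probabilistic perspectives, showing that practical requirements can significantly shrink the classical admissible interval.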

详情
AI中文摘要

Copula函数已广泛应用于无线通信分析中,用于建模依赖结构和评估系统性能。然而,现有研究通常用Copula依赖参数表达性能指标,而未明确表征其可容许区域。本文介绍了Copula依赖参数区域的概念,并研究了其在无线通信中的重要性。考虑一个由双变量Farlie--Gumbel--Morgenstern (FGM) Copula建模的相关瑞利衰落的两用户无线多址接入信道 (MAC),从中断概率和皮尔逊相关系数 (PCC) 约束出发,从通信理论和概率角度推导出显式参数区域。结果表明,实际通信和统计要求可以显著缩小经典的Copula可容许区间,使得一些理论上可容许的依赖结构变得不可行。数值示例说明了所提出的概念及其实际意义。

英文摘要

Copula functions have been widely employed in wireless communication analysis to model dependence structures and evaluate system performance. However, existing studies generally express performance metrics in terms of copula dependence parameters without explicitly characterizing their admissible regions. This letter introduces the concept of copula dependence parameter regions and investigates its significance in wireless communications. Considering a two-user wireless multiple access channel (MAC) with correlated Rayleigh fading modeled by the bivariate Farlie--Gumbel--Morgenstern (FGM) copula, explicit parameter regions are derived from communication-theoretic and probabilistic perspectives using outage probability and Pearson correlation coefficient (PCC) constraints. The results show that practical communication and statistical requirements can significantly shrink the classical copula admissible interval, rendering some theoretically admissible dependence structures infeasible. Numerical examples illustrate the proposed concept and its practical implications.
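For orientation, the bivariate FGM copula and its classical admissible interval can be written down in a few lines. The sketch below uses the standard FGM form C(u, v) = uv[1 + θ(1 − u)(1 − v)] with θ in [−1, 1]; it does not reproduce the letter's outage-probability or PCC-based regions, and the function names are illustrative. Note that it computes Spearman's rho on the uniform scale, which is not the same quantity as the Pearson correlation of the Rayleigh-faded variables considered in the letter.

```python
def fgm_cdf(u, v, theta):
    """FGM copula C(u, v) = u*v*(1 + theta*(1 - u)*(1 - v))."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def fgm_density(u, v, theta):
    """FGM copula density c(u, v) = 1 + theta*(1 - 2u)*(1 - 2v)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def is_valid_theta(theta):
    """The density is bilinear in (u, v), so its minimum sits at a corner of
    the unit square; non-negativity there is equivalent to |theta| <= 1."""
    corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    return all(fgm_density(u, v, theta) >= 0.0 for u, v in corners)


def spearman_rho(theta, grid=100):
    """Spearman's rho = 12 * integral of C over the unit square - 3,
    approximated with a midpoint rule; the closed form for FGM is theta / 3."""
    step = 1.0 / grid
    total = 0.0
    for i in range(grid):
        u = (i + 0.5) * step
        for j in range(grid):
            v = (j + 0.5) * step
            total += fgm_cdf(u, v, theta)
    return 12.0 * total * step * step - 3.0


rho = spearman_rho(0.6)
```

Because rho is limited to [−1/3, 1/3] even over the full classical interval, FGM only models weak dependence; additional performance or correlation constraints can then narrow the usable range of θ further, which is the kind of shrinkage the abstract describes.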