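arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems.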

2606.17367 2026-06-17 cs.CY New submission

Towards Auditing AI Systems in the Wild

Aditya T. Vadlamani, Anutam Srinivasan, Srinivasan Parthasarathy

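AI summary: This paper proposes treating AI system auditing as a statistical problem of monitoring constraint violations under uncertainty, and argues for lifecycle-wide auditing frameworks that continuously evaluate risk-controlled constraints such as fairness and safety.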

Comments Accepted to KDD 2026 (Blue Sky Ideas Track)

Abstract

AI systems are increasingly deployed in real-world settings where their behavior is shaped by dynamic environments, evolving data distributions, and complex interactions with users and infrastructure. Traditional machine learning evaluation focuses on benchmarks and operates within sandboxed environments, providing only a limited view of the true system behavior in the wild. We argue for the development of principled auditing frameworks that monitor deployed AI systems throughout their lifecycle. We further propose framing auditing as a statistical problem of monitoring constraint violations under uncertainty, where desired properties (e.g., fairness and safety) are treated as risk-controlled constraints that must be continuously evaluated as systems evolve through iterative feedback. This perspective highlights the need for uncertainty-aware monitoring methods, socio-technical specifications of audit criteria, and auditing infrastructures that enable ongoing oversight of AI systems in the wild.
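
As a minimal sketch of the "monitoring constraint violations under uncertainty" framing, the Python below raises a flag only when a Hoeffding lower confidence bound on the observed violation rate exceeds a tolerance. The function name `violation_flag`, the tolerance `tau`, and the confidence level `delta` are illustrative assumptions rather than notation from the paper, and the bound assumes independent, identically distributed outcomes at a fixed sample size, which a deployed system with shifting data may not satisfy.

```python
import math


def violation_flag(outcomes, tau, delta=0.05):
    """Return True when the violation rate exceeds tau with confidence 1 - delta.

    outcomes: sequence of 0/1 indicators (1 = the audited constraint was violated).
    tau:      tolerated violation rate (hypothetical audit criterion).
    delta:    allowed false-alarm probability for this fixed sample size.
    """
    n = len(outcomes)
    if n == 0:
        return False  # no evidence yet, so no flag
    rate = sum(outcomes) / n
    # One-sided Hoeffding radius for the mean of n samples bounded in [0, 1].
    radius = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return rate - radius > tau
```

With 300 violations in 1000 outcomes and `tau=0.1`, the lower bound is well above the tolerance and the flag is raised; with 3 violations in 10 outcomes the same observed rate is not yet enough evidence.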

2606.17360 2026-06-17 cs.CY New submission

Narratives That Limit the Possible: Interrupting Narrative Closure in Computing Practice

Samuel Mann, Ruth Myers, Dave Guruge, Lucky Hawkins, Kylie McKee, Rex Alexander, Jamie Vaughan, Tim Lynch, Danny Fridberg

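AI summary: Using collective autoethnography, this paper identifies the mechanisms by which concepts in computing are weaponised (simplification, individualisation, binary framing, and others) and proposes a "Reframing" toolkit that restores complexity, surfaces assumptions, and redirects attention to structural conditions, supporting practice shifts toward care, sufficiency, and justice.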

Abstract

Computing's dominant concepts - innovation, efficiency, resilience, professionalism - often migrate from reflective ideals to instruments that limit behaviour, redirect responsibility, and foreclose critique. We call this drift weaponisation: the discursive repurposing of professional concepts so that they stabilise business-as-usual while making structural alternatives appear unreasonable, illegible, or out of scope. Using collective autoethnography across education, justice, public administration, research management, and computing, we identify recurring mechanisms (simplification, individualisation, binary framing, metric substitution, hero/resilience scripts, organised ignorance). From this synthesis we propose Reframing - peaceful, practice-ready shifts (e.g., From Simplified Slogans to Structural Literacy; From Performative Compliance to Meaningful Outcomes; From Coping to Justice) - each paired with a 'do-now' prompt. The levers restore complexity, surface assumptions, and redirect attention to structural conditions without requiring formal authority. We contribute: (1) a cross-field account of weaponisation as a patterned phenomenon; (2) a portable reframing toolkit grounded in narrative and systems thinking; and (3) implications for computing within limits, including day-to-day practice shifts toward care, sufficiency, and justice.

2606.17358 2026-06-17 cs.CR New submission

OTRO: Oblivious Tokenization Path with Square-Root ORAM

Jonghyun Lee, Yongqin Wang, Rachit Rajat, Daniel Wong, Mengyuan Li, Murali Annavaram

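AI summary: To address tokenizer access-pattern leakage in confidential LLM computing, this paper proposes OTRO, which uses square-root ORAM for efficient oblivious lookups and reduces overhead through an instance pool, epoch-based rotation with padding, and chunked KV-cache-aware tokenization, limiting TTFT overhead to at most 4.5% in a TDX environment.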

Abstract

The CPU-side large language model (LLM) tokenizer is a critical security gap in LLM serving through a confidential computing stack with CPU and GPU trusted execution environments (TEEs). Tokenizers convert prompts through table-driven lookups, and the resulting memory access patterns are a powerful source of side-channel leakage. Recent work demonstrates end-to-end recovery of user prompts from tokenizer access patterns on production Intel TDX. However, a drop-in use of the popular tree-based Oblivious RAMs (e.g., PathORAM) to prevent access-pattern leakage introduces a $\sim$13$\times$ tokenizer slowdown, resulting in 10-58% higher time-to-first-token (TTFT). In this paper, we present OTRO, an efficient, oblivious tokenization path tailored to latency-critical LLM serving. OTRO relies on square-root ORAM for fast single-access lookups, but avoids its prohibitive $O(N\log^2 N)$ rebuild cost every $\sqrt{N}$ accesses through three key innovations. First, OTRO provides a pool of replicated square-root ORAM instances that exploit the read-only nature of the tokenizer table. Second, an epoch-based rotation policy decouples accesses from rebuilds and pads each epoch with dummy accesses up to its boundary, minimizing observable information. Lastly, chunked KV-cache-aware tokenization further overlaps rebuilds with GPU prefill and minimizes the instance count. Implemented as modules in HuggingFace Tokenizers and nano-vLLM, running within a TDX-enabled CVM with an NVIDIA H100 GPU, OTRO limits TTFT overhead to at most 4.5%, keeps tokenizer-induced latency under 10% of total TTFT, and adds less than 0.5 GB of memory overhead while reducing the tokenizer's observable leakage across various model families and sizes.
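
For readers unfamiliar with the building block, the following is a toy sketch of the classic read-only square-root ORAM idea, not OTRO's implementation. It assumes a Python `random.Random` shuffle as a stand-in for a secret permutation, a dict as the stash, and a plain re-shuffle as the rebuild; the class name `ToySqrtORAM` and the `trace` attribute are hypothetical. The point it illustrates is that within one epoch of about $\sqrt{N}$ accesses, every physical slot is touched at most once, even when the same token is looked up repeatedly.

```python
import math
import random


class ToySqrtORAM:
    """Toy read-only square-root ORAM over a lookup table (not secure as written)."""

    def __init__(self, table, seed=0):
        self.table = list(table)
        self.n = len(self.table)
        self.period = max(1, math.isqrt(self.n))  # accesses per epoch
        self.rng = random.Random(seed)            # stand-in for a secret permutation
        self.trace = []                           # physical slots an observer would see
        self._rebuild()

    def _rebuild(self):
        # Re-permute the n real entries plus `period` dummy entries.
        total = self.n + self.period
        self.pos = list(range(total))             # logical id -> physical slot
        self.rng.shuffle(self.pos)
        self.store = [None] * total
        for logical in range(self.n):
            self.store[self.pos[logical]] = self.table[logical]
        self.stash = {}
        self.used = 0

    def read(self, index):
        if self.used == self.period:
            self._rebuild()
        # A real implementation scans the whole stash obliviously;
        # the dict lookup here is only for brevity.
        hit = index in self.stash
        # On a stash hit, touch a fresh dummy so the access looks like any other.
        logical = self.n + self.used if hit else index
        slot = self.pos[logical]
        self.trace.append(slot)
        if not hit:
            self.stash[index] = self.store[slot]
        self.used += 1
        return self.stash[index]
```

The rebuild in this toy is a plain shuffle; the $O(N\log^2 N)$ cost quoted in the abstract refers to rebuilding obliviously, which the sketch does not attempt.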

2606.17347 2026-06-17 eess.SY cs.SY New submission

Classifying Transient Regimes in Dynamic Systems through Properties of Spatial Curves and Stochastic Processes: A Data-Driven Approach

Cristian Puerto-Santana, Javier Diaz-Rozo, Carlos Puerto-Santana, Carlos Ocampo-Martinez

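AI summary: Proposes a method for classifying transient and stationary regimes based on a spatial curve representation built from sample moments; classifiers are designed from the curve's arc length and curvatures, and the arc-length classifier outperforms existing techniques on simulated multivariate linear, non-linear, and discontinuous systems.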

Abstract

This article proposes a novel methodology for the classification of transient and stationary regimes in dynamic systems. Several sensor-based solutions for regime classification in the literature require the setting of several parameters, or are not suitable for scenarios involving multivariate systems that may contain periodic signals. The proposed method introduces a spatial curve representation of the considered system based on its sample mathematical moments. Then, by connecting concepts of stability theory, geometrical properties of spatial curves and stationary stochastic processes, two regime classifiers are designed using the arc length and the curvatures of the proposed curve. Both classifiers are capable of describing and detecting transient regimes, considering behaviors such as multivariate asymptotic stability, marginal stability, and cyclostationarity. Furthermore, a quantitative comparison in performance and computational resources of the proposed classifiers against existing classifiers in the literature illustrates that the proposed regime classifier based on the arc length outperforms other techniques in classifying transient regimes for simulated linear, non-linear, and discontinuous multivariate systems under the specified studied conditions.
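
The intuition behind an arc-length criterion can be sketched as follows. This is an illustration under strong assumptions, not the paper's construction: the curve here is simply the (mean, standard deviation) pair over sliding windows of a scalar signal, and the helper names `moment_curve` and `arc_length` are hypothetical. While the signal is in a transient, the moment curve keeps moving and accumulates length; once the signal is stationary, the curve stalls.

```python
import math
import statistics


def moment_curve(signal, window):
    """Curve of (mean, standard deviation) over consecutive sliding windows."""
    points = []
    for start in range(len(signal) - window + 1):
        chunk = signal[start:start + window]
        points.append((statistics.fmean(chunk), statistics.pstdev(chunk)))
    return points


def arc_length(points):
    """Length of the polyline through the given points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Exponential decay (transient) followed by a constant level (stationary).
signal = [math.exp(-t / 10) for t in range(100)] + [0.0] * 100
curve = moment_curve(signal, window=20)
early = arc_length(curve[:50])   # windows inside the transient
late = arc_length(curve[-50:])   # windows inside the stationary part
```

A classifier would compare such a length against a threshold; how that threshold is set, and how curvature enters, is left to the paper.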

2606.17322 2026-06-17 cs.CY New submission

Federated Fair Trade Energy: Speculative Fabulation for a Planet with Limits

Dawn Nafus, Laura Watts

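AI summary: Examines the social justice and infrastructure-limit questions that permacomputing faces as electricity grids shift to decentralised green energy; uses empirically grounded speculative fabulation to identify research opportunities in energy systems federated across communities, and introduces the notion of "fair trade energy".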

Comments Paper in Proceedings of LIMITS 2026: 12th Workshop on Computing within Limits, 2026-06-23-25, Online

Abstract

What happens to permacomputing when electricity grids shift to decentralised green energy, and local communities and municipalities have increased governance over this vital public service? Electricity and computational networks are more than just separate systems that plug together. Shifts to renewable energy generation in the grid are impacting computational systems, and computational demands on electrical power are impacting the electricity grid; one infrastructure limits the other. Permacomputing research tends to focus on 'off-grid' or 'behind-the-meter' energy. This sacrifices some of the social justice benefits that the public electricity grid, managed and regulated for universal service, was designed to provide. Our paper uses empirically-grounded 'speculative fabulation' to identify research opportunities in permacomputing that open up when energy systems are federated across communities. The speculative fabulation takes the form of an interview between the energy manager of a future energy island in the North Sea and a permacomputing podcaster. This allows us to conceive of computing-grid integration in tractable social and ecological terms, and introduce a notion of 'fair trade energy'.

2606.17316 2026-06-17 cs.DS New submission

Approximation Preserving Coresets

Milind Prabhu, Chris Schwiegelshohn, Sudarshan Shyam

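AI summary: To explain why coresets much smaller than the theoretical guarantees suffice in big-data clustering, this paper proposes approximation-preserving coresets, which preserve only the costs of good solutions and sit between strong and weak coresets in the guarantee they give; it also shows that coresets of this size are not attainable under even a very small distortion of the approximation factor.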

Abstract

Clustering in a big data setting is an intensively studied problem, with coresets emerging as one of the important paradigms in this line of work. Given a cost function $\text{cost}(P,S)$ mapping input points $P$ and a solution $S$ to an objective value, a coreset is a typically weighted sketch $\Omega\subseteq P$ such that $\text{cost}(\Omega,S)\approx \text{cost}(P,S)$. In practice, coreset sizes much smaller than those suggested by theoretical guarantees are often found to be sufficient. In this paper, we offer an explanation for this phenomenon. Smaller coreset sizes suffice if we only wish to preserve the costs of \emph{good} solutions, i.e., solutions with low cost. We define and devise \emph{approximation-preserving coresets}, which provide a weaker guarantee than strong coresets, which apply to all solutions, while providing stronger guarantees than weak coresets, which apply only to the optimum solution. We complement this result by showing that even a very small distortion in the approximation factor cannot admit coresets of this size.
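
To make the contrast between the guarantees concrete, one possible formalization is sketched below. The strong-coreset condition is the standard one; the second condition is only an assumed reading of "good solutions" as those within a factor $\alpha$ of the optimum. The symbols $\varepsilon$, $\alpha$, and $\text{OPT}(P)$ are introduced here for illustration and may differ from the paper's definitions.

```latex
% Strong coreset: the guarantee holds for every candidate solution S.
\forall S:\quad
\lvert \text{cost}(\Omega,S) - \text{cost}(P,S) \rvert \le \varepsilon\,\text{cost}(P,S)

% Approximation-preserving coreset (assumed form): the guarantee is required
% only for good solutions, here those within a factor \alpha of the optimum.
\forall S \text{ with } \text{cost}(P,S) \le \alpha\cdot\text{OPT}(P):\quad
\lvert \text{cost}(\Omega,S) - \text{cost}(P,S) \rvert \le \varepsilon\,\text{cost}(P,S)
```

Restricting the quantifier to fewer solutions is what leaves room for a smaller $\Omega$.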

2606.17315 2026-06-17 cs.DC cs.DS New submission

Space-Efficient Lock-Free Linear-Probing Hash Table

Hagit Attiya, Rotem Oshman, Noa Schiller

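AI summary: Proposes a lock-free linear-probing hash table with wait-free lookups that stays space-efficient while handling concurrency gracefully, achieving linearizable, lock-free operations with a small amount of per-entry metadata.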

Abstract

Linear probing is one of the simplest and most space-efficient approaches to hash table design, and is widely used in sequential settings due to its compact memory layout. However, designing a concurrent linear-probing hash table with strong liveness guarantees has proved difficult, and only a handful of such algorithms have been proposed, all of which either restrict concurrency or rely on large per-entry metadata, thereby compromising space efficiency. We present a lock-free linear-probing hash table with wait-free lookups that retains the core advantages of sequential linear probing while handling contention gracefully. Our design uses only a small amount of metadata per table entry: a constant number of additional bits when using LL/SC, or a logarithmic number of bits when using CAS. The algorithm is linearizable and lock-free, supports insert, delete, and wait-free lookup operations, and is able to safely reclaim space used by deleted elements without rebuilding the table. We analyze the amortized step complexity of our hash table assuming no concurrent insertions of the same key, and show that each operation has expected amortized step complexity matching that of sequential linear probing, up to the point contention per key.
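
For reference, the sequential baseline that the abstract compares against can be sketched as below. This is ordinary single-threaded linear probing with tombstones for deletion, a textbook technique and not the paper's concurrent algorithm or its space-reclamation scheme; the class name `LinearProbingTable` and the fixed capacity are assumptions made for the example.

```python
EMPTY = object()      # slot never used
TOMBSTONE = object()  # slot whose entry was deleted


class LinearProbingTable:
    """Sequential open-addressing table with linear probing (fixed capacity)."""

    def __init__(self, capacity=8):
        self.slots = [EMPTY] * capacity

    def _probe(self, key):
        # Visit every slot once, starting at the key's home position.
        n = len(self.slots)
        start = hash(key) % n
        for step in range(n):
            yield (start + step) % n

    def lookup(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None  # an empty slot ends the probe sequence
            if slot is not TOMBSTONE and slot[0] == key:
                return slot[1]
        return None

    def insert(self, key, value):
        free = None  # first reusable slot seen along the probe sequence
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                if free is None:
                    free = idx
                break
            if slot is TOMBSTONE:
                if free is None:
                    free = idx
            elif slot[0] == key:
                self.slots[idx] = (key, value)  # overwrite existing key
                return True
        if free is None:
            return False  # table is full
        self.slots[free] = (key, value)
        return True

    def delete(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return False
            if slot is not TOMBSTONE and slot[0] == key:
                self.slots[idx] = TOMBSTONE  # keep later entries reachable
                return True
        return False
```

Tombstones keep probe sequences intact after a delete but only free the slot for later inserts, which is one reason safe space reclamation is a design question in the concurrent setting.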

2606.17314 2026-06-17 eess.SY cs.SY New submission

Line Outage Impact Factor (LOIF): A New Sensitivity Factor for Enhanced Transmission Observability

Daniel Flores, Yuanrui Sang, Michael P. McGarry

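AI summary: Proposes the line outage impact factor (LOIF), a new sensitivity factor for transmission line outage detection that selects monitored lines more effectively than the line outage distribution factor (LODF) and improves detection accuracy.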

Abstract

Transmission failures can lead to cascading failures and system blackouts affecting millions of customers if not handled in time, and choosing the best locations to monitor the condition of the transmission system is crucial for power system reliability. In this paper, we propose a new sensitivity factor, the line outage impact factor (LOIF), which is especially useful for power system monitoring and can reveal the impacts of a transmission outage on the power flow of other lines more effectively than existing sensitivity factors, such as the line outage distribution factor (LODF). In this study, we apply the LOIF to transmission line outage detection in three test systems and compare it with the LODF using a number of observed transmission line (OTL) selection methods based on these two sensitivity factors. Then we apply a machine learning algorithm to detect the outages of other lines by monitoring the selected OTLs, and the detection accuracy is evaluated using the F1-score. The results show that, in general, with the same number of OTLs, detection using the OTLs selected using LOIF achieved higher F1-scores. The pattern was especially consistent in large-scale systems, showing its potential in real-world applications.
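
As background for the comparison, the LODF that the LOIF is measured against has a standard definition under the DC power flow approximation, recalled below. The abstract does not define the LOIF itself, so no formula for it is given here; the symbols $f_\ell$, $\hat f_\ell$, and $\mathrm{PTDF}$ are the conventional ones and are not taken from the paper.

```latex
% Post-outage flow on monitored line \ell after line k (from bus i to bus j) trips:
\hat f_\ell = f_\ell + \mathrm{LODF}_{\ell,k}\, f_k

% LODF expressed through power transfer distribution factors for a transfer i -> j:
\mathrm{LODF}_{\ell,k} = \frac{\mathrm{PTDF}_{\ell,\,ij}}{1 - \mathrm{PTDF}_{k,\,ij}}
```

Here $f_k$ is the pre-outage flow on the tripped line, so the LODF states what fraction of that flow is redistributed onto line $\ell$.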

2606.17297 2026-06-17 cs.DS New submission

Scalable K-clique Estimation with Differential Privacy

Dung Nguyen, Ritwick Mishra, Anil Vullikanti

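AI summary: To address the high global sensitivity of k-clique counting under differential privacy, this paper proposes a noise-calibration algorithm based on a ladder-function upper bound on local sensitivity and the approximation sensitivity framework; it substantially improves running time and is the first to scale to graphs with millions of edges.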

Abstract

Counts of $k$-cliques are commonly used metrics in subgraph mining. Since graphs often have sensitive data, there has also been a lot of work on $k$-clique counts with differential privacy. However, these metrics have very high global sensitivity, and so more sophisticated techniques have been developed for counting $k$-cliques with privacy. Smooth sensitivity and ladder functions were developed for reducing the noise magnitude for private estimates of these metrics. However, these are computationally very inefficient to estimate. No polynomial time algorithms are known for smooth sensitivity of $k$-cliques for $k>3$, while the time complexity of ladder functions is lower bounded by the time for exact counts, which does not scale very well. In this paper, we develop a new highly scalable algorithm for estimating $k$-clique counts with differential privacy. Our algorithm adapts the ladder function to serve as a smooth upper bound on its local sensitivity, and utilizes the approximation sensitivity framework to calibrate noise with magnitude proportional to an approximation of the bound. This gives us a significant improvement in the running time. Experiments show that our method is several orders of magnitude faster than the ladder function based estimates of $k$-clique counts, while the accuracy is similar. Our algorithm is the first to scale to graphs with millions of edges and to larger $k$, for which the ladder function algorithm does not complete.
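
To see why global sensitivity is the obstacle, the baseline below applies the plain Laplace mechanism to an exact $k$-clique count under edge-level differential privacy. It is a sketch of the naive approach the abstract argues against, not the paper's algorithm; the function names are hypothetical, the clique count is brute force, and vertices are assumed to be labeled `0..n-1`.

```python
import math
import random
from itertools import combinations


def count_k_cliques(n, edges, k):
    """Exact k-clique count by brute force (exponential in k; illustration only)."""
    adj = {frozenset(e) for e in edges}
    return sum(
        all(frozenset(pair) in adj for pair in combinations(nodes, 2))
        for nodes in combinations(range(n), k)
    )


def global_sensitivity(n, k):
    # Adding or removing one edge changes the count by at most C(n-2, k-2):
    # the two endpoints are fixed and the other k-2 vertices are chosen freely.
    return math.comb(n - 2, k - 2)


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_k_clique_count(n, edges, k, epsilon, rng=None):
    """Laplace mechanism calibrated to global sensitivity (edge-level privacy)."""
    rng = rng or random.Random()
    scale = global_sensitivity(n, k) / epsilon
    return count_k_cliques(n, edges, k) + laplace_noise(scale, rng)
```

For a 1000-vertex graph and $k=4$, `global_sensitivity` is 497503, so the noise would swamp the count on most sparse graphs; this is the gap that local-sensitivity-based techniques aim to close.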

2606.17292 2026-06-17 eess.SY cs.SY New submission

Robust Direct Data-Driven Hamiltonian for Safe Set Computation under Measurement Noise and Disturbances

Mohammad Bajelani, Christopher A. Strong, Claire J. Tomlin, Jason J. Choi, Klaske van Heusden

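AI summary: To handle measurement noise and disturbances, this paper proposes the Robust Data-Driven Hamiltonian (R-DDH), derives an inner approximation of the safe set from noisy data, and proves that the gap to the exact Hamiltonian converges to zero with more data in a noise-free setting with additive disturbances.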

Abstract

Safe set computation is a fundamental challenge in safety-critical control systems, especially in direct data-driven settings where safety analysis is performed directly from noise-affected measurements, without explicit modeling. A recently proposed method, Data-Driven Hamiltonian (DDH), enables reachability analysis directly from measurements, without relying on prior knowledge of the underlying system dynamics. This paper extends the DDH framework to a robust setting that accounts for measurement noise, exogenous disturbances, and sampling-induced state-velocity estimation error. A Robust Data-Driven Hamiltonian (R-DDH) is derived from noisy measurements and shown to yield a certified lower bound on the exact Hamiltonian. This results in a provable under-approximation of the value function and an inner approximation of the associated safe set. The gap between the data-driven and exact Hamiltonians is quantified, and it is shown to converge to zero with more data in a noise-free setting with additive disturbances. The effectiveness of the approach is shown through two case studies: a constrained double integrator and an aircraft taxiing system with a nonlinear closed-loop controller operating under perceptual uncertainty.

2606.17291 2026-06-17 cs.CE New submission

STORX: An Open-Source Object-Oriented Framework for Shape and Topology Optimization in MATLAB

Amir M. Mirzendehdel, Krishnan Suresh

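AI summary: Presents STORX, an open-source MATLAB framework implementing parametric and level-set shape optimization and density, level-set, and topological-sensitivity topology optimization, with an object-oriented structure that supports modularity and extensibility for teaching and research.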

Abstract

This paper presents STORX: Shape and Topology Optimization for Research and Experimentation, an open-source MATLAB-based educational framework for learning and teaching computational design optimization. STORX provides a platform for parametric and level-set shape optimization, as well as topology optimization methods including density, level-set, and topological sensitivity approaches such as evolutionary and Pareto-tracing methods. All modules follow a consistent object-oriented structure and integrate visualization, sensitivity analysis, and finite element routines, enabling users to explore the continuum between shape and topology optimization in a transparent and reproducible manner. The code is designed to complement graduate-level coursework and independent research by emphasizing modularity and extensibility through a clear separation of intent. Core software interfaces are defined via abstract base classes, enabling new objective functionals and design/manufacturing constraints to be implemented by adding derived classes without modifying the core code. The paper also describes the software architecture and demonstrates how the framework maps mathematical formulations directly to executable code through a series of illustrative problems.
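
The extension pattern described in the abstract (abstract base classes, with new objectives added as derived classes) can be illustrated with the sketch below. It is written in Python rather than MATLAB and does not reproduce STORX's actual classes; `Objective`, `QuadraticObjective`, and `gradient_descent` are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class Objective(ABC):
    """Hypothetical interface that the core optimizer depends on."""

    @abstractmethod
    def value(self, x):
        """Objective value at design variable x."""

    @abstractmethod
    def gradient(self, x):
        """Sensitivity of the objective with respect to x."""


class QuadraticObjective(Objective):
    """Derived class added without touching the optimizer: f(x) = (x - target)^2."""

    def __init__(self, target):
        self.target = target

    def value(self, x):
        return (x - self.target) ** 2

    def gradient(self, x):
        return 2.0 * (x - self.target)


def gradient_descent(objective, x0, step=0.1, iterations=200):
    """Core routine: it knows only the Objective interface."""
    x = x0
    for _ in range(iterations):
        x -= step * objective.gradient(x)
    return x
```

Calling `gradient_descent(QuadraticObjective(3.0), 0.0)` drives `x` toward 3.0, and a second objective could be swapped in by writing another subclass, with no change to `gradient_descent`.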

2606.17275 2026-06-17 cs.LO cs.CR New submission

Syntactic Systems Cannot See Semantic Invariants

Fabio F. G. Buono

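AI summary: By resolving an open question, this paper proves that two theories, open induction and clause set cycles, are incomparable, distils a Syntactic Invariance Principle from the proof, and draws an analogy to the known barriers in the P versus NP problem.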

DOI: 10.5281/zenodo.20618697

Abstract

We start from a small open question, where Hetzl and Vierling asked whether two theories of induction, open induction and clause set cycles, are incomparable. They proved one direction and left the other open. Here we close it, and the proof is almost embarrassingly short: the rules for addition can only fire when the first argument is $0$ or a successor, and a Skolem constant is neither, so the terms $a{+}b$ and $b{+}a$ can never be touched, and a machine that can never touch them can never prove they are equal. The thing that separates the two theories is the order of two constants, and that order is a fact about numbers, not about symbols. We extract from this proof a small general principle, the Syntactic Invariance Principle, that names the shape of such arguments. We then close with a few speculative remarks on how this same shape appears, informally, in the known barriers to settling $\mathsf{P}$ versus $\mathsf{NP}$, where each barrier seems to point to a level of description that the techniques in the barrier cannot reach. We raise this as a suggestion rather than a theorem, since the analogy is real but we do not push it past the point where we can defend it. Along the way we raise an open question that the analogy suggests but does not settle, on whether a fast algorithm for $\mathsf{SAT}$, were it to exist, would always be exhibitable as a machine you can write down or whether it could be found, in some cases, only as a function on the numbers.
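
The observation about addition rules can be made concrete with a small term rewriter. The sketch below assumes the usual recursive definition of addition on its first argument ($0 + y \to y$ and $s(x) + y \to s(x + y)$) and a tuple encoding of terms chosen for this example; it shows only that the rules get stuck on Skolem constants, and is not a rendering of the paper's proof about the two theories.

```python
ZERO = ("0",)


def succ(t):
    return ("s", t)


def add(a, b):
    return ("+", a, b)


def const(name):
    """An uninterpreted (Skolem) constant: neither 0 nor a successor."""
    return ("c", name)


def normalize(t):
    """Rewrite with 0 + y -> y and s(x) + y -> s(x + y) until no rule applies."""
    if t[0] == "s":
        return succ(normalize(t[1]))
    if t[0] == "+":
        left, right = normalize(t[1]), normalize(t[2])
        if left == ZERO:
            return right
        if left[0] == "s":
            return succ(normalize(add(left[1], right)))
        return add(left, right)  # stuck: the first argument is a constant
    return t  # 0 or a constant


a, b = const("a"), const("b")
one = succ(ZERO)
```

Here `normalize(add(one, one))` reduces to `succ(succ(ZERO))`, while `normalize(add(a, b))` and `normalize(add(b, a))` are returned unchanged and remain syntactically different terms.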

2606.17261 2026-06-17 cs.PF cs.SE stat.AP New submission

The Right Call for Software Benchmarking: Consistent Decisions in Stateful Environments

Gábor Melis

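AI summary: To address benchmarking bias in stateful environments, this paper proposes experiment designs based on contrast estimators that cancel program-specific biases and yield asymptotically correct decisions.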

Abstract

In the perpetual pursuit of performance, modern computing systems rely ever more on stateful mechanisms to accommodate the dynamics of workloads and physical environments, bolstering efficiency but confounding benchmarking and thereby the optimization of software. Indeed, by their nature, adaptive mechanisms introduce temporal dependencies between measurements and render naive estimators of individual program performance biased. Observing that rectifying such biases necessitates speculative assumptions about system dynamics, we call for prioritizing performance differentials over absolute measures and formalize software benchmarking as the decision problem of identifying the fastest program, for which relative knowledge suffices. To this end, we propose simple experiment designs admitting consistent estimators of contrasts, whereby program-specific biases cancel under tenable assumptions. These designs asymptotically yield the correct decision and afford a robust methodology for finite-budget benchmarking in stateful environments, bearing broad implications for the development of performance-sensitive software.
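
A small simulation shows why the run order matters when the environment has state. The sketch assumes a hypothetical environment in which every run is slowed by a linear drift `drift * t`, and compares a blocked schedule with a counterbalanced ABBA schedule; ABBA is a classic design from experimental practice and is used here only as an illustration, not as one of the paper's designs. All names and numbers are invented for the example.

```python
def measure(true_time, t, drift=0.01):
    """Hypothetical stateful environment: run number t is slowed by drift * t."""
    return true_time + drift * t


def estimate_difference(schedule, times):
    """Mean measured time of B minus mean measured time of A for a given run order."""
    runs = {"A": [], "B": []}
    for t, program in enumerate(schedule):
        runs[program].append(measure(times[program], t))
    return sum(runs["B"]) / len(runs["B"]) - sum(runs["A"]) / len(runs["A"])


times = {"A": 1.0, "B": 1.1}          # assumed true running times
blocked = ["A"] * 8 + ["B"] * 8        # all A runs first, then all B runs
balanced = ["A", "B", "B", "A"] * 4    # ABBA blocks cancel linear drift
```

Under this drift model the blocked schedule estimates the difference as 0.18 instead of the true 0.10, because B absorbs the later, slower runs, while the balanced schedule recovers 0.10. Each program's absolute time is still biased in both schedules, which is consistent with the abstract's argument for estimating contrasts rather than absolute measures.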

2606.17253 2026-06-17 cs.AR New submission

PDAGENT-BENCH: Characterizing, Grounding, and Architecting LLM Agents for VLSI Physical Design

Qiufeng Li, Rongqian Chen, Quan Cheng, Chengxuan Wang, Sizhe Tang, Wuxi Li, Duo Ding, Chia-Tung Ho, Haoxing Ren, David Z. Pan, Tian Lan, Weidong Cao

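AI summary: Proposes the PDAGENT-BENCH benchmark for evaluating LLM/VLM agents in VLSI physical design, covering task-level and workflow-level evaluation; it reveals model limitations in tool execution and long-horizon reasoning and confirms the effectiveness of human-skill-enhanced workflows.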

Abstract

Large Language Models and vision-language models have shown remarkable success in the front-end design of Very Large-Scale Integrated Circuits, yet their capabilities for VLSI physical design remain significantly underexplored. The primary cause is the lack of standardized benchmarks for evaluating agentic physical design workflows that require high-dimensional, multi-stage optimization under strict design constraints, coordinated interaction with diverse Electronic Design Automation tools, and iterative refinement. This work introduces PDAGENT-BENCH, a comprehensive and multi-dimensional benchmark for evaluating LLM/VLM-based agents across the physical design stack. PDAGENT-BENCH integrates both task-level assessment and workflow-level execution. The benchmark suite contains 353 curated problems that combine conceptual questions with real-world industrial artifacts, with expert-validated references and executable solutions. These tasks cover five key capability dimensions: foundational knowledge, report comprehension, root-cause analysis, script generation, and full-flow implementation. In addition, the benchmark provides a unified, human-aligned agentic physical design workflow framework that enables closed-loop evaluation of holistic physical design in realistic EDA environments. Experiments on 11 state-of-the-art models reveal that while modern LLMs/VLMs perform competitively on conceptual tasks, they remain substantially limited in tool-centric execution (e.g., 42.2% on Innovus script generation) and long-horizon, multi-stage reasoning. Our studies further show that human-skill-enhanced agentic workflows significantly improve end-to-end physical design performance. PDAGENT-BENCH establishes a standardized, reproducible, and realistic evaluation framework for advancing LLM/VLM-driven holistic physical design automation. We will open source the benchmark and framework soon.

2606.17245 2026-06-17 cs.CR cs.NI New submission

Cache to the Future: A Distributed Webpage Archive for Internet Blackouts

Ross Evans, Diogo Barradas

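AI summary: Proposes Cache to the Future (CttF), a system that uses distributed community scoring and cryptographic mechanisms to cache and deliver static web content during internet blackouts; simulations validate its effectiveness at city scale.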

Comments 20 pages, 8 figures

2606.17228 2026-06-17 cs.LO cs.PL New submission

A Stone-Cech Collecting Semantics for Residual Process Behaviour

Mike Stannett

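AI summary: For the residual behaviour left by nonterminating computation, this paper develops a compact collecting semantics that uses tail-cluster sets in the Stone-Cech compactification as a common semantics, distinguishes stable divergence, finite recurrent divergence, mixed recurrence with escape, and related behaviours, and validates residual-tail laws in CCS.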

Comments 36 pages. Created using AI assistance

Abstract

This paper develops a compact collecting semantics for the residual behaviour left by nonterminating computation. For sequential time this is the tail-cluster set of the stream in the Stone-Cech compactification of the observation space. It gives a common semantics to ordinary recurrence, mixed recurrent behaviour, and escape through noncompact parts of the observation space. The basic theory establishes tail invariance, functoriality under continuous observations, and a temporal reading for clopen observations: containment in the corresponding clopen region of beta-X is eventual truth, while nonempty intersection is recurrence. Progress and fairness assumptions are represented by strengthening the time filter. Relational meanings are obtained by compactifying products, so correlations between observations made along the same asymptotic view of time are retained. The main application is to residual behaviour in CCS. Infinite executions are read as streams of residual processes modulo structural congruence. The resulting semantics distinguishes stable divergence, finite recurrent divergence, mixed recurrence with escape, and escape through unbounded residual growth. It validates residual-tail laws for prefixing, guarded unfolding, finite choice, and finite prefix-choice forms, while also identifying the boundary of those laws under parallel composition and synchronisation. Finite observational quotients provide the computational interface to the compact semantics: abstract meanings become recurrent states and strongly connected component calculations, and resource observations detect unbounded escape without requiring individual points of the Stone-Cech remainder to be inspected.

2606.17223 2026-06-17 cs.CR New submission

Safety, Security, and Cognitive Risks in Neuro-Symbolic AI

Manoj Parmar

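AI summary: This paper systematically analyses the attack surface of neuro-symbolic AI across a five-layer architecture, proposes a unified threat model, a symbolic-layer threat catalogue, and a cognitive risk analysis, and uses three empirical benchmarks to demonstrate attack effectiveness and detection challenges.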

Comments 28 pages, 1 figure, 10 tables

详情

AI中文摘要

神经符号AI（NeSy）将神经感知与符号推理相结合，使其在需要可解释性和结构化推理的高风险领域具有吸引力。然而，这种混合架构引入了跨越五个层次的扩大攻击面：神经感知、符号知识库、推理引擎、智能体编排和数据存储——每个层次都可能以纯神经系统中不存在的方式被利用。本文做出六项贡献：（1）正式定义了NeSy攻击面、符号完整性违反（SIV）和跨层放大比$\mathcal{X}$，分解为神经引起的和自主符号敏感性分量；（2）一个统一的威胁模型，扩展了MITRE ATLAS，包含11个NeSy特定策略扩展和五类攻击者分类；（3）一个符号层威胁目录，涵盖知识图谱（KG）投毒、本体合并和推理引擎颠覆；（4）认知风险分析——自动化偏差、权威偏差和谄媚强化——这些风险因NeSy显式的逻辑解释相对于黑箱神经输出而被结构性放大；（5）跨学科缓解措施，具有与NIST AI 600-1和欧盟AI法案一致的可衡量接受标准；（6）三个实证基准：（E1）针对205实体医学KG的目标KG投毒在注入预算$B=5$时达到盈亏平衡SIV，并存在KG特定的隐蔽性/SIV权衡；（E2）在DistilBERT+ProbLog流水线上，$\varepsilon=0.01$的PGD-10产生$\mathcal{X}=5.884$（95%置信区间$[4.64, 8.00]$，$p<0.0001$），通过匹配随机基线（$E^{R}_{\mathrm{rand}}=0$）确认了对抗特异性；（E3）单公理OWL编辑实现93.3%的SIV成功率，100%的Pellet一致性隐蔽性，但留出STIX检测在50%（随机猜测水平）失败，这是一个开放问题。

English abstract

Neuro-symbolic AI (NeSy) pairs neural perception with symbolic reasoning, making it attractive for high-stakes domains where explainability and structured inference are required. However, this hybrid architecture introduces an enlarged attack surface spanning five layers: neural perception, symbolic knowledge bases, reasoning engines, agentic orchestration, and data stores -- each exploitable in ways absent from purely neural systems. This paper makes six contributions: (1) formal definitions of NeSy Attack Surface, Symbolic Integrity Violation (SIV), and Cross-Layer Amplification Ratio $\mathcal{X}$, decomposed into neural-caused and autonomous symbolic sensitivity components; (2) a unified threat model extending MITRE ATLAS with 11 NeSy-specific tactic extensions and a five-profile attacker taxonomy; (3) a symbolic-layer threat catalogue covering knowledge graph (KG) poisoning, ontology-merging, and inference-engine subversion; (4) analysis of cognitive risks -- automation bias, authority bias, and sycophantic reinforcement -- structurally amplified by NeSy's explicit logical explanations relative to black-box neural outputs; (5) interdisciplinary mitigations with measurable acceptance criteria aligned to NIST AI 600-1 and the EU AI Act; (6) three empirical benchmarks: (E1) targeted KG poisoning achieves break-even SIV at injection budget $B=5$ on a 205-entity medical KG, with a KG-specific stealth/SIV trade-off; (E2) PGD-10 at $\varepsilon=0.01$ yields $\mathcal{X}=5.884$ (95% CI $[4.64,\, 8.00]$, $p<0.0001$), confirmed adversarially specific by a matched-random baseline ($E^{R}_{\mathrm{rand}}=0$), on a DistilBERT+ProbLog pipeline; (E3) single-axiom OWL edits achieve 93.3% SIV success with 100% Pellet-consistency stealth, but held-out STIX detection fails at 50% (random-guessing level), an open problem.


2606.17217 2026-06-17 eess.SY cs.SY New submission

A Stateful Stochastic Allocation Mechanism with Fairness Guarantees for Networked Electricity Systems

一种具有公平性保障的有状态随机分配机制用于网络化电力系统

Shaun Sweeney

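AI summary: Proposes the FP-AMM mechanism, which uses a two-stage stochastic clearing rule and a shortage-memory state to allocate electricity fairly; convergence and performance gains are validated on IEEE standard test systems.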


AI Chinese abstract

本文开发并分析了公平博弈自动做市商（FP-AMM），一种可编程的电力分配机制，其中稀缺性分配被视为受控、有状态且可审计的信息物理过程。现有机制如节点边际定价是无记忆的，无法考虑历史服务结果，从而无法保证跨市场区间的公平待遇。FP-AMM采用两阶段随机清算规则，包括服务优先级采样和逆公平加权，结合DC-OPF可行域和通过饱和积分器更新的有界短缺记忆。建立了四个主要结果。第一，短缺记忆状态在$[0,1]^N$中不变，且更新映射是收缩率为$1-\beta$的压缩映射。第二，区间内清算算子线性收敛到唯一不动点，收缩因子$q\in(0,1)$。第三，在公平博弈优先级规则下，每节点交付比率几乎必然收敛到合同目标$F^\star$，通过赤字递归的Lyapunov分析获得有限时间$O(1/\sqrt{T})$界。第四，事件触发执行保证了分配跟踪误差的实际最终有界性，并量化了计算-保真度权衡。该机制在IEEE 14、57和118节点系统上经过$T=5000$个市场区间验证。在所有基准测试中实现了向$F^\star$的公平收敛，在IEEE-57网络上峰值弱节点公平误差降低了54%，在稀缺时期相对于等权重基线降低了高达55%，并且始终维持DC可行性。

English abstract

This paper develops and analyses the Fair Play Automatic Market Maker (FP-AMM), a programmable electricity allocation mechanism in which scarcity allocation is treated as a controlled, stateful, and auditable cyber-physical process. Existing mechanisms such as locational marginal pricing are memoryless and cannot account for historical service outcomes, preventing guarantees of equitable treatment across market intervals. The FP-AMM employs a two-stage stochastic clearing rule comprising service-priority sampling and inverse-fairness weighting, coupled with a DC-OPF feasibility set and bounded shortage memory updated through a saturated integrator. Four main results are established. First, the shortage-memory state is invariant in $[0,1]^N$ and the update map is a contraction with rate $1-\beta$. Second, the intra-interval clearing operator converges linearly to a unique fixed point with contraction factor $q\in(0,1)$. Third, under the Fair Play priority rule, the per-node delivery ratio converges almost surely to the contracted target $F^\star$, with a finite-time $O(1/\sqrt{T})$ bound obtained via Lyapunov analysis of the deficit recursion. Fourth, event-triggered execution guarantees practical ultimate boundedness of the allocation tracking error and quantifies the computation-fidelity trade-off. The mechanism is validated on the IEEE 14-, 57-, and 118-bus systems over $T=5000$ market intervals. Fairness convergence to $F^\star$ is achieved on all benchmarks, peak weak-bus fairness error is reduced by 54% on the IEEE-57 network and by up to 55% relative to an equal-weight baseline during scarcity periods, and DC feasibility is maintained throughout.
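The first result (invariance in $[0,1]^N$ and contraction with rate $1-\beta$) can be illustrated with a minimal sketch. The abstract does not give the update rule, so the leaky-integrator form below, the function name `update_shortage_memory`, and the `shortfall` input are assumptions made for illustration only, not the paper's definition.

```python
import numpy as np

def update_shortage_memory(memory, shortfall, beta):
    """One step of an ASSUMED saturated integrator on [0, 1]^N.

    memory    : current per-node shortage memory, values in [0, 1]
    shortfall : per-node unserved fraction in this interval (hypothetical input)
    beta      : update gain in (0, 1]
    """
    memory = np.asarray(memory, dtype=float)
    shortfall = np.asarray(shortfall, dtype=float)
    # Leaky integration followed by saturation; clipping is 1-Lipschitz,
    # so the map is a contraction in `memory` with rate (1 - beta).
    return np.clip((1.0 - beta) * memory + beta * shortfall, 0.0, 1.0)

# Two different memory states driven by the same shortfall move closer together.
rng = np.random.default_rng(0)
m_a, m_b, s = rng.random(5), rng.random(5), rng.random(5)
gap_before = np.max(np.abs(m_a - m_b))
gap_after = np.max(np.abs(update_shortage_memory(m_a, s, 0.2)
                          - update_shortage_memory(m_b, s, 0.2)))
print(gap_after <= 0.8 * gap_before + 1e-12)
```

Under this assumed form, the state can never leave the unit cube and the influence of the initial memory decays geometrically, which is the property the abstract relies on for the later convergence results.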


2606.17212 2026-06-17 cs.GR cs.NI New submission

Renderable Partial Representations for Dynamic Gaussian Splatting under Incomplete Delivery

不完整交付下动态高斯溅射的可渲染部分表示

Faruk Alpay, Levent Sarioglu, Yaser Hadri

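AI summary: To address the degradation that partial delivery causes for dynamic Gaussian representations in interactive rendering, the paper organizes primitives into independently addressable spatiotemporal clusters and trains on sampled partial dependency graphs while minimizing expected distortion, tail distortion, and related terms, so that incomplete states remain directly renderable; in experiments, render-utility ordering outperforms nominal layer order.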

Comments 19 pages, 8 figures, 3 tables. Code, tests, configurations, pinned environment, and measurement records (including the partial-state oracle atlas) are provided as ancillary files


AI Chinese abstract

动态高斯压缩通常针对完整文件或完整渐进前缀进行优化，但交互式渲染会遇到部分表示：某些时空区域存在，其他缺失，且后期细化无法影响已显示帧。我们研究动态高斯表示，其不完整交付状态仍可直接渲染，且其退化在图像空间中得到优化。高斯基元被组织为独立可寻址的时空簇，包含一个基础层和三个细化层；训练部分依赖图，在一个GPU批次中渲染许多反事实状态，并最小化期望失真、尾部失真、时间不一致性、码率和前缀回归。反事实效用层测量每个完成组在有效接收方上下文中的边际渲染贡献。同一图支持具体的交付实现，包括MTU限制的熵编码块、截止时间感知调度和接收方依赖闭合。在保留视图上，最细细化层在3/32个D-NeRF弹跳球、49/64个HyperNeRF扫帚2和28/64个HyperNeRF鸡簇中具有负平均边际效用；其下尾效用分别在21/32、61/64和42/64个簇中为负。在扫帚2上，渲染效用排序消除了在匹配字节预算下名义层序产生的两个PSNR回归；在鸡上，在不相交训练摄像机上测量的效用将最低匹配预算下的保留PSNR提高了3.03 dB。这些范围性结果表明，名义细化顺序不能替代渲染条件效用：该公式将网络交付视为可渲染场景状态的分布，而不是图形编解码器的外部包装。

English abstract

Dynamic Gaussian compression is normally optimized for complete files or complete progressive prefixes, but interactive rendering encounters partial representations: some spatiotemporal regions are present, others missing, and late refinements cannot affect the displayed frame. We study dynamic Gaussian representations whose incomplete delivery states remain directly renderable and whose degradation is optimized in image space. Gaussian primitives are organized into independently addressable spatiotemporal clusters with a base level and three refinements; training samples partial dependency graphs, renders many counterfactual states in one GPU batch, and minimizes expected distortion, tail distortion, temporal inconsistency, rate, and prefix regressions. A counterfactual utility layer measures the marginal render contribution of each completion group across valid receiver contexts. The same graph admits a concrete delivery realization with MTU-bounded entropy-coded chunks, deadline-aware scheduling, and receiver-side dependency closure. On held-out views, the finest refinement has negative mean marginal utility in 3/32 D-NeRF bouncingballs, 49/64 HyperNeRF broom2, and 28/64 HyperNeRF chicken clusters; its lower-tail utility is negative in 21/32, 61/64, and 42/64 clusters, respectively. On broom2, render-utility ordering removes both PSNR regressions produced by nominal layer order at matched byte budgets; on chicken, utilities measured on disjoint training cameras improve held-out PSNR by 3.03 dB at the lowest matched budget. These scoped results show why nominal refinement order cannot substitute for render-conditioned utility: the formulation treats network delivery as a distribution over renderable scene states rather than as an external wrapper around a graphics codec.
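The receiver-side dependency closure mentioned in the abstract can be sketched as follows. This is a simplified illustration, assuming each cluster's refinement level k depends only on levels 0 to k-1 of the same cluster; the paper's dependency graphs may be more general, and the name `renderable_levels` is hypothetical.

```python
def renderable_levels(received, num_levels=4):
    """Return {cluster_id: highest usable level} for a set of received chunks.

    received   : iterable of (cluster_id, level) pairs; level 0 is the base
    num_levels : base level plus three refinements, as in the abstract

    ASSUMPTION: a level is usable only if every lower level of the same
    cluster has also arrived (a linear dependency chain per cluster).
    """
    by_cluster = {}
    for cluster_id, level in received:
        by_cluster.setdefault(cluster_id, set()).add(level)

    usable = {}
    for cluster_id, levels in by_cluster.items():
        depth = 0
        while depth < num_levels and depth in levels:
            depth += 1
        if depth > 0:  # clusters without their base level are not renderable
            usable[cluster_id] = depth - 1
    return usable

# Cluster 0 is missing level 2, cluster 1 is missing its base, cluster 2 has only the base.
print(renderable_levels({(0, 0), (0, 1), (0, 3), (1, 1), (2, 0)}))
```

A refinement that arrives without its prerequisites is simply ignored for the current frame, which matches the abstract's point that late or orphaned refinements cannot affect what is displayed.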


2606.17128 2026-06-17 cs.AR New submission

Shift-Left High-Level Synthesis Verification via Knowledge-Augmented LLM Agent

通过知识增强的LLM智能体实现左移高层次综合验证

Zhihan Xiao, Zhe Zhao, Luke Ztz Hu, Songping Mai

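AI summary: Proposes a knowledge-augmented, agent-driven shift-left verification framework that uses dual-tier consistency checking, symbolic execution, and an HLS verification knowledge graph to automatically check functional consistency between golden C and HLS-C before synthesis, reaching 98.26% average coverage.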


AI Chinese abstract

高层次综合（HLS）通过将C/C++程序转换为硬件实现，实现了快速硬件开发。在HLS设计流程中，黄金C规范与面向HLS的C实现之间的功能一致性验证是一项关键但劳动密集型的任务。尽管大型语言模型（LLMs）最近在自动化测试平台生成方面显示出潜力，但其随机性常常导致覆盖率不足、验证环境不一致以及等价性检查结果不可靠。为了解决这些限制，我们提出了一种知识增强的、智能体驱动的左移验证框架，用于在综合前自动检查黄金C与HLS-C之间的功能一致性。该框架引入了一种双层级一致性检查机制，该机制共同强制配对测试平台之间的静态结构对齐和动态行为等价性，同时集成符号执行和覆盖率驱动的细化以提高验证完整性。此外，我们构建了一个异构的HLS验证知识图谱，为测试平台生成提供拓扑感知推理先验，并设计了一个自主验证智能体来协调跨异构工具链的迭代细化和故障诊断。在107个HLS基准对上的实验结果表明，所提出的框架实现了98.26%的平均覆盖率和95.33%的动态一致性，优于代表性的基于AST、检索增强和迭代智能体的基线。此 https URL

English abstract

High-Level Synthesis (HLS) enables rapid hardware development by translating C/C++ programs into hardware implementations. Functional consistency verification between golden C specifications and HLS-oriented C implementations is a critical yet labor-intensive task in HLS design flows. While Large Language Models (LLMs) have recently shown promise in automated testbench generation, their stochastic nature often leads to insufficient coverage, inconsistent verification environments, and unreliable equivalence checking results. To address these limitations, we propose a knowledge-augmented, agent-driven shift-left verification framework for automated functional consistency checking between golden C and HLS-C implementations before synthesis. The framework introduces a Dual-Tier Consistency Checking mechanism that jointly enforces static structural alignment and dynamic behavioral equivalence between paired testbenches, while integrating symbolic execution and coverage-driven refinement to improve verification completeness. Furthermore, we construct a heterogeneous HLS Verification Knowledge Graph to provide topology-aware reasoning priors for testbench generation, and design an autonomous verification agent to orchestrate iterative refinement and failure diagnosis across heterogeneous toolchains. Experimental results on 107 HLS benchmark pairs demonstrate that the proposed framework achieves 98.26% average coverage and 95.33% dynamic consistency, outperforming representative AST-based, retrieval-augmented, and iterative agent-based baselines. https://github.com/cz-5f/HLS-LeVeri.git


2606.17116 2026-06-17 cs.CR New submission

Quantifying quantum risk: a measure of crypto agility

量化量子风险：加密敏捷性的一种度量

Coryan Wilson-Shah

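AI summary: This paper proposes rotation time as a measure of crypto agility and derives an approximate relationship between rotation time tolerance and security risk tolerance. Illustrative values computed from historical CVE data put rotation time tolerance on the order of hours to days, indicating that crypto agility combined with hybrid encryption is an effective approach to designing quantum-resilient systems.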


AI Chinese abstract

由于量子计算机能够实现新的密码分析形式，它们对广泛用于保护当代计算机系统的加密算法构成威胁。实用量子计算机可能在未来十年左右出现，但由于理论上的“先收获，后解密”式攻击者行为，今天就需要采取缓解措施。密码学和安全架构的最新进展显示出支持设计能够抵御量子密码分析的系统的潜力，但在文献中关于推导此类系统的容限方面存在关键空白。在本文中，我们引入了旋转时间的概念作为加密敏捷性的一种度量，并推导出将旋转时间容限与安全风险容限联系起来的近似值。使用历史CVE数据计算旋转时间容限的示例值，发现其量级为数小时到数天。这表明，将加密敏捷性与混合加密结合使用是设计量子弹性系统的有效方法，但可能需要具有挑战性的技术和操作容限以满足组织的风险容限。

English abstract

Because of their ability to enable new forms of cryptanalysis, quantum computers pose a threat to the cryptographic algorithms that are widely used to secure contemporary computer systems. A practical quantum computer may emerge within the next ten years or so, but due to theorised "harvest now, decrypt later" style attacker behaviour, mitigations are necessary today. Recent advances in cryptography and security architecture show promise in supporting the design of systems that exhibit resilience against quantum-enabled cryptanalysis; however, there is a key gap in the literature on deriving tolerances for such systems. In this paper, we introduce the concept of rotation time as a measure of crypto agility, and derive an approximation that links rotation time tolerance to security risk tolerance. Historical CVE data is used to calculate illustrative values for rotation time tolerance, which is found to be of the order of hours to days. This demonstrates that using crypto agility in conjunction with hybrid encryption is an effective approach for designing quantum-resilient systems, but may necessitate challenging technical and operational tolerances in order to meet organisational risk tolerances.


2606.17111 2026-06-17 cs.CR cs.DC cs.PF New submission

Fractional Verkle Trees: A Hypertree Decomposition and Verified Proof Serialization Architecture for High-Performance Blockchain State Accumulators

分数Verkle树：面向高性能区块链状态累加器的超树分解与验证证明序列化架构

Ekleen Kaur, Everton Fraga

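AI summary: To address four inefficiencies in Verkle tree implementations, the paper proposes Fractional Verkle Trees (FVT), which use a hypertree decomposition to partition global state into independent sub-accumulators; combined with existence checks, 32-byte SHA256 node references, and other optimizations, FVT achieves parallel insertion, a 57% reduction in heap allocation, and a network-wide saving of 4.85 PB of storage per year.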

Comments This work was presented at the Ethereum Community Conference at Cannes, France, 2026, on behalf of Amazon Web Services. https://youtu.be/FHA5mfUOl5o?si=sFA6izcab3cQX4KM


AI Chinese abstract

现代区块链状态管理面临关键的可扩展性瓶颈：维护数亿条目的密码学承诺在计算上变得难以承受。以太坊向Verkle树的过渡——通过常数大小的IPA向量承诺将证明大小从O(宽度*深度)减少到O(深度)的多项式承诺累加器——是迈向无状态操作的关键一步。然而，当前的实现表现出给家庭验证者带来负担的病态特征。我们识别了参考实现go-verkle \cite{kaur2025goverkle, kaur2025goethereum}中的四个低效问题：(1) 在删除不存在的账户时创建幻影节点；(2) 64字节数据库键导致过多的LSM树压缩；(3) 证明反序列化中的冗余内存复制；(4) 不存在的证明线格式不兼容导致非确定性序列化。我们提出了分数Verkle树（FVT），一种超树分解，将全局状态划分为N个独立的子累加器，由Merkle承诺树协调，实现了改进的缓存局部性、无锁竞争的goroutine并行承诺计算和更快的根重新计算（91 μs对比约500 ms）。我们通过存在性检查、32字节SHA256节点引用、零拷贝引用计数缓冲区和基于哈希表的字典序去重来解决每个低效问题。在Apple M1 Pro上的基准测试显示，堆分配减少57%（每10K证明从566,760字节降至242,004字节），并行插入速度为2,433 ns/op，全网6,000个全节点每年消除4.85 PB存储，推进了以太坊无状态路线图。

English abstract

Modern blockchain state management faces a critical scalability bottleneck: maintaining cryptographic commitments over hundreds of millions of entries becomes computationally prohibitive. Ethereum's transition to Verkle Trees (polynomial commitment accumulators that reduce proof sizes from O(width * depth) to O(depth) via constant-size IPA vector commitments) is a critical step toward stateless operation. Yet current implementations exhibit pathological characteristics that burden home validators. We identify four inefficiencies in the reference go-verkle implementation \cite{kaur2025goverkle, kaur2025goethereum}: (1) phantom node creation during non-existent account deletion; (2) 64-byte database keys triggering excessive LSM-tree compaction; (3) redundant memory copying in proof deserialization; (4) a Proof of Absence wire format incompatibility causing non-deterministic serialization. We present Fractional Verkle Trees (FVT), a hypertree decomposition partitioning global state into N independent sub-accumulators coordinated by a Merkle commitment tree, achieving improved cache locality, zero-lock-contention goroutine-parallel commitment computation, and faster root recomputation (91 μs vs ~500 ms). We address each inefficiency via existence checks, 32-byte SHA256 node references, zero-copy reference-counted buffers, and HashMap-based lexicographic deduplication. Benchmarks on Apple M1 Pro show 57% heap allocation reduction (566,760 to 242,004 bytes per 10K proofs), parallel insertion at 2,433 ns/op, and network-wide elimination of 4.85 PB/year across 6,000 full nodes, advancing the Ethereum stateless roadmap.
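The decomposition idea (N independent sub-accumulators under a Merkle commitment tree, with an existence check before deletion) can be sketched in a few lines. This is a toy model, not the go-verkle or FVT code: plain SHA-256 digests stand in for Verkle/IPA commitments, and the class `ShardedAccumulator` and all of its methods are hypothetical names.

```python
import hashlib

def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts (stand-in for a real commitment)."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()

class ShardedAccumulator:
    """Toy state accumulator split into independent shards (assumes num_shards >= 1)."""

    def __init__(self, num_shards: int):
        self.shards = [{} for _ in range(num_shards)]
        self.shard_roots = [self._shard_root(s) for s in self.shards]

    def _index(self, key: bytes) -> int:
        return int.from_bytes(h(key)[:4], "big") % len(self.shards)

    @staticmethod
    def _shard_root(shard: dict) -> bytes:
        return h(*[h(k, shard[k]) for k in sorted(shard)])

    def insert(self, key: bytes, value: bytes) -> None:
        i = self._index(key)
        self.shards[i][key] = value
        # Only the touched shard is recommitted; the others are untouched.
        self.shard_roots[i] = self._shard_root(self.shards[i])

    def delete(self, key: bytes) -> bool:
        i = self._index(key)
        if key not in self.shards[i]:  # existence check: no phantom entries
            return False
        del self.shards[i][key]
        self.shard_roots[i] = self._shard_root(self.shards[i])
        return True

    def root(self) -> bytes:
        """Merkle root over the shard roots (odd levels duplicate the last node)."""
        level = list(self.shard_roots)
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            level = [h(level[j], level[j + 1]) for j in range(0, len(level), 2)]
        return level[0]

acc = ShardedAccumulator(4)
acc.insert(b"alice", b"1")
print(acc.delete(b"bob"), acc.root().hex()[:16])
```

Because each shard is recommitted independently, updates to different shards share no mutable state, which is the property that would allow the parallel commitment computation described in the abstract.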


2606.17101 2026-06-17 cs.HC New submission

The Bias Paradox: How AI Personas Can Overcome Human Limitations in UX Research

偏见悖论：AI角色如何克服用户体验研究中的人类局限性

Ozgur Taylan Celik

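AI summary: This paper discusses the bias paradox in UX research, in which real human participants, because of situational biases, provide less authentic insights than AI personas; it argues that AI personas can mitigate human limitations and calls for a framework for identifying bias in traditional research.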

Comments Paper accepted for ACM CHI workshop on Responsible AI Personas

2606.17094 2026-06-17 cs.SE New submission

LogCopilot: Automating Log Aggregation Analysis through Large Language Models

LogCopilot: 通过大型语言模型自动化日志聚合分析

Senyu Xie, Chenxi Zhang, Tong Zhou, Jiacheng Liu, Xiaoyu Hong, Qingshan Li, Xin Peng

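AI summary: Proposes LogCopilot, a framework that uses large language models to take natural-language instructions and, through knowledge retrieval and tool calling, automatically generate LogQL queries for log aggregation analysis, with an average accuracy of 76.8%.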


AI Chinese abstract

日志记录了软件的运行时行为，广泛应用于调试、测试和故障诊断等任务。随着系统规模和复杂性的增加，日志分析逐渐成为一项具有挑战性的任务。当前的工业系统通常使用日志聚合系统（如Grafana Loki和ELK）来简化日志收集和分析过程。工程师使用这些系统提供的DSL查询语言编写查询，可以完成各种日志分析任务。然而，编写这些查询通常耗时且费力，因为工程师需要深入了解DSL语法以及日志中包含的详细信息。为了解决这些挑战，本文提出了LogCopilot，一种基于大型语言模型（LLMs）的自动化日志聚合分析框架。LogCopilot接受自然语言的日志分析指令，并通过知识检索和工具调用实现自动化日志分析。LogCopilot构建了一个层次化的知识库来表示和提供日志中的关键知识。它通过生成和执行LogQL查询来实现自动化的日志聚合分析。基于四个日志数据集的评估证实了LogCopilot的有效性，其平均准确率达到76.8%，优于基线方法。此外，实验结果表明LogCopilot在LogQL查询生成方面是有效的。

English abstract

Logs record the runtime behavior of software and are widely used in tasks such as debugging, testing, and fault diagnosis. As systems grow in size and complexity, log analysis has become increasingly challenging. Current industrial systems typically use log aggregation systems such as Grafana Loki and ELK to simplify log collection and analysis. Engineers who write queries in the DSL provided by these systems can complete a variety of log analysis tasks. However, writing these queries is often time-consuming and labor-intensive, as it requires engineers to have a thorough understanding of the DSL syntax and the detailed information contained in the logs. To address these challenges, this paper proposes LogCopilot, an automated log aggregation analysis framework based on large language models (LLMs). LogCopilot accepts natural language log analysis instructions and accomplishes automated log analysis through knowledge retrieval and tool calling. LogCopilot constructs a hierarchical knowledge base to represent and provide key knowledge in logs. It achieves automated log aggregation analysis by generating and executing LogQL queries. The evaluation on four log datasets confirms the effectiveness of LogCopilot, which achieves an average accuracy of 76.8% and outperforms baseline approaches. Moreover, experimental results show that LogCopilot is effective in LogQL query generation.
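To make the target of the generation step concrete, here is a small sketch of the kind of LogQL metric query an engineer (or a system like this one) might produce for a request such as "count lines containing 'timeout' in the checkout app over 5 minutes, grouped by level". The helper `build_error_count_query`, the label names, and the request itself are illustrative assumptions and are not taken from LogCopilot.

```python
def build_error_count_query(labels, needle, window="5m", group_by="level"):
    """Build a LogQL metric query that counts matching lines per group.

    labels   : stream selector labels, e.g. {"app": "checkout"} (hypothetical)
    needle   : substring for the |= line filter (no escaping is done here)
    window   : range-vector duration, e.g. "5m"
    group_by : label to aggregate by; assumes that label exists on the stream
    """
    selector = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return (
        f"sum by ({group_by}) "
        f'(count_over_time({{{selector}}} |= "{needle}" [{window}]))'
    )

print(build_error_count_query({"app": "checkout"}, "timeout"))
```

Even this short query requires knowing the selector syntax, the line-filter operator, the range-vector duration, and which labels exist in the logs, which is the knowledge burden the abstract describes.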


2606.17089 2026-06-17 cs.CR cs.HC New submission

Security and Human-Centered Assessment of BACnet-Controlled DALI Infrastructure in an Educational Building Automation Testbed

基于BACnet控制的DALI基础设施的教育楼宇自动化测试床的安全与人本评估

Ariton Verush

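AI summary: Combines network enumeration, object-level inspection, physical rack analysis, and reflective HCI analysis to assess the security of DALI lighting in a BACnet/IP building automation testbed, emphasizing that BACS assessment involves not only technical protocols but also usable tools, physical observability, interpretable naming, and safe mental models for command priorities.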

Comments 7 pages, 9 figures, 1 table; technical case study


AI Chinese abstract

楼宇自动化与控制系统通过专用通信协议集成供暖、通风、空调、照明、传感和管理功能。虽然这种集成实现了灵活的楼宇运行，但也创造了复杂的网络物理环境，难以检查、保护并向新分析师解释。本文介绍了在2026年4月于瑞士图恩举行的面向家庭自动化的网络安全黑客马拉松期间，对具有DALI照明基础设施的BACnet/IP楼宇自动化测试床进行的实用安全与人本案例研究。该研究结合了网络导向枚举、对象级检查、物理机架分析以及工具支持学习的反思性HCI分析。使用Yabe和BACteria，本文记录了可观察的BACnet服务，重构了结构化对象层次结构，识别了房间级照明控制路径，并将BACnet对象映射到DALI组级基础设施。分析强调，BACS评估不仅是一项技术协议任务：它还需要可用的工具接口、物理可观测性、可解释的命名约定以及安全的命令优先级心智模型。本文贡献了一个在教育测试床中探索BACnet/DALI的紧凑案例研究，并讨论了对网络安全教育、人本安全工具以及网络物理楼宇环境中负责任实验的影响。

English abstract

Building automation and control systems integrate heating, ventilation, air conditioning, lighting, sensing, and management functions through specialized communication protocols. While this integration enables flexible building operation, it also creates complex cyber-physical environments that are difficult to inspect, secure, and explain to new analysts. This paper presents a practical security and human-centered case study of a BACnet/IP building automation testbed with DALI lighting infrastructure, investigated during a domotics-oriented cybersecurity hackathon in Thun, Switzerland in April 2026. The study combines network-oriented enumeration, object-level inspection, physical rack analysis, and reflective HCI analysis of tool-supported learning. Using Yabe and BACteria, the work documents observable BACnet services, reconstructs structured object hierarchies, identifies room-level lighting-control paths, and maps BACnet objects to DALI group-level infrastructure. The analysis emphasizes that BACS assessment is not only a technical protocol task: it also requires usable tool interfaces, physical observability, interpretable naming conventions, and safe mental models for command priorities. The paper contributes a compact case study of BACnet/DALI exploration in an educational testbed and discusses implications for cybersecurity education, human-centered security tooling, and responsible experimentation in cyber-physical building environments.


2606.17058 2026-06-17 cs.DC New submission

Evaluating LLM Coding Agents on SZ-Family Lossy Compression Across Architectures

评估LLM编码代理在不同架构上的SZ系列有损压缩

Changqing Li, Shouwei Gao, Kai Zhao, Sheng Di, Wenqian Dong

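AI summary: Evaluates LLM coding agents on SZ-family lossy compression kernels, finding that on GPUs stronger models achieve higher performance but are sensitive to prompts, that on Cerebras the main challenge is producing runnable programs, and that agents are more effective on modular kernels.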

Comments 5 pages, 4 figures. Accepted to IPDPS 2026 HPAI4S Workshop


AI Chinese abstract

大型语言模型（LLM）编码代理越来越多地应用于代码翻译和优化，但它们在性能关键的高性能计算（HPC）环境中的有效性仍缺乏充分表征。本文评估了基于LLM的编码工作流在SZ系列误差有界有损压缩内核上的表现，这些内核结合了数值约束与内存密集型和控制流密集型实现。我们研究了两个代表性的CUDA工作负载（SZp和SZx），并针对两个异构执行平台：NVIDIA GPU和Cerebras晶圆级加速器。聚焦于单代理迭代生成，我们不仅分析了最终吞吐量，还分析了代理运行时行为，包括迭代模式、对提示规范的敏感性以及特征性失败模式。我们的结果揭示了显著的跨架构差异。在GPU上，更强的模型可以实现更高的吞吐量，但对提示精度和优化指导表现出更高的敏感性，而在Cerebras上，主要挑战在于在PE中心的空间执行模型下生成可运行程序。我们进一步观察到，LLM代理在模块化内核（SZx）上比在紧密耦合的位级流水线（SZp）上更有效，后者中的结构依赖关系阻碍了优化进展。这些发现表明，评估用于HPC的LLM编码代理需要考虑性能结果和架构特定的鲁棒性，并且在基于线程的平台上的成功不能直接迁移到空间加速器。

English abstract

Large language model (LLM) coding agents are increasingly applied to code translation and optimization, yet their effectiveness in performance-critical high-performance computing (HPC) settings remains poorly characterized. This paper evaluates LLM-based coding workflows on SZ-family error-bounded lossy compression kernels, which combine numerical constraints with memory-intensive and control-flow-heavy implementations. We study two representative CUDA workloads (SZp and SZx) and target two heterogeneous execution platforms: NVIDIA GPUs and Cerebras wafer-scale accelerators. Focusing on single-agent iterative generation, we analyze not only final throughput but also agent runtime behavior, including iteration patterns, sensitivity to prompt specification, and characteristic failure modes. Our results reveal a pronounced cross-architecture divergence. On GPUs, stronger models can achieve substantially higher throughput but exhibit increased sensitivity to prompt precision and optimization guidance, whereas on Cerebras the dominant challenge lies in producing runnable programs under a PE-centric spatial execution model. We further observe that LLM agents are more effective on modular kernels (SZx) than on tightly coupled bit-level pipelines (SZp), where structural dependencies hinder optimization progress. These findings suggest that evaluating LLM coding agents for HPC requires accounting for both performance outcomes and architecture-specific robustness, and that success on thread-based platforms does not directly transfer to spatial accelerators.


2606.18194 2026-06-17 cs.GT math.DS math.OC New submission

Ergodic Deviation-Robust Equilibrium under Mirror Descent Learning in Finite Games

有限博弈中镜像下降学习下的遍历偏差鲁棒均衡

Joshua Steier

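AI summary: Proposes Ergodic Deviation-Robust Equilibrium (EDRE), a dynamics-relative equilibrium concept for entropic mirror descent learning that requires the limit profile to be an ε-Nash equilibrium, the deviation gain along the whole trajectory to be of order √T, and the limit to be a fixed point of the EMD map; the paper proves existence in potential games and PPAD-hardness in general polymatrix games.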

Comments Under Review


AI Chinese abstract

我们引入了遍历偏差鲁棒均衡（EDRE），这是一种针对重复有限博弈的动态相关均衡概念，其中智能体通过熵镜像下降（EMD）进行学习。EDRE要求同一配置和学习运行同时满足三个性质：（E1）极限配置是乘积分布下的ε-纳什均衡；（E2）在整个学习轨迹上，每个固定联盟的累积（单边）偏差增益以高概率为~O(√T)；（E3）极限配置是EMD映射的不动点，因此它是由动力学选择而非仅仅被认证为均衡。我们证明了√T的偏差遗憾率是阶紧的，建立了在精确势博弈中的存在性（通过纳什定理，并在凹性下给出构造性近端路径），同时证明了EMD的Lyapunov单调性（当不动点集为单点集时逐点收敛），并通过变分不等式将选择性质扩展到单调多矩阵博弈。尽管静态EDRE等同于ε-纳什均衡，但其内容是动态的：EMD下的鲁棒（正测度）选择排除了线性不稳定均衡，因此EDRE充当了带有动态证书而非静态精炼的纳什均衡。在复杂性方面，我们证明了一般多矩阵博弈中计算EDRE是PPAD难的，而在势博弈中属于promise-PPAD。一个2×2协调博弈的实例说明了该框架的所有组成部分。附录中包含了额外结果，包括赌博反馈扩展、大步长下双策略EMD映射通向Li-Yorke混沌的倍周期路径、最小成本转向的线性规划公式以及支持性模拟。

English abstract

We introduce Ergodic Deviation-Robust Equilibrium (EDRE), a dynamics-relative equilibrium concept for repeated finite games in which agents learn via entropic mirror descent (EMD). EDRE requires three properties to hold simultaneously for the same profile and learning run: (E1) the limit profile is an $\varepsilon$-Nash equilibrium at a product distribution; (E2) along the entire learning trajectory, every fixed coalition's cumulative aggregate (summed-unilateral) deviation gain is $\tilde{\mathcal{O}}(\sqrt{T})$ with high probability; and (E3) the limit profile is a fixed point of the EMD map, so that it is selected by the dynamics rather than merely certified as an equilibrium. We prove that the $\sqrt{T}$ deviation-regret rate is order-tight, establish existence in exact-potential games (via Nash's theorem, with a constructive proximal route under concavity) together with Lyapunov monotonicity of EMD (and pointwise convergence when the fixed-point set is a singleton), and extend the selection property to monotone polymatrix games through variational inequalities. Although a static EDRE coincides with an $\varepsilon$-Nash equilibrium, its content is dynamic: robust (positive-measure) selection under EMD excludes linearly unstable equilibria, so EDRE acts as a Nash equilibrium equipped with a dynamic certificate rather than a static refinement. On the complexity side, we show that computing EDRE is PPAD-hard in general polymatrix games and belongs to promise-PPAD for potential games. A worked $2\times 2$ coordination-game example illustrates all components of the framework. Additional results, including a bandit-feedback extension, a period-doubling route to Li-Yorke chaos for the two-strategy EMD map at large step size, a linear-program formulation for minimum-cost steering, and supporting simulations, appear in the appendices.
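The abstract's 2×2 coordination-game setting can be explored with a minimal sketch of entropic mirror descent, which on the simplex reduces to the multiplicative-weights update. The payoff matrix, step size, starting point, and function names below are assumptions chosen for illustration, and the update uses exact expected payoffs rather than the sampled feedback a learning run in the paper would likely use.

```python
import numpy as np

# Hypothetical common-payoff coordination game: both players get PAYOFF[i, j].
PAYOFF = np.array([[2.0, 0.0],
                   [0.0, 1.0]])

def emd_step(strategy, action_payoffs, eta):
    """Entropic mirror descent step: reweight by exp(eta * payoff), renormalise."""
    weights = strategy * np.exp(eta * action_payoffs)
    return weights / weights.sum()

def run_emd(x0, y0, eta=0.1, steps=500):
    """Simultaneous EMD updates for the row player (x) and column player (y)."""
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    for _ in range(steps):
        payoff_x = PAYOFF @ y      # expected payoff of each row action
        payoff_y = PAYOFF.T @ x    # expected payoff of each column action
        x, y = emd_step(x, payoff_x, eta), emd_step(y, payoff_y, eta)
    return x, y

x, y = run_emd([0.6, 0.4], [0.6, 0.4])
print(np.round(x, 3), np.round(y, 3))
```

From this starting point both players concentrate on the first action, a pure Nash equilibrium that is also a fixed point of `emd_step`, which loosely mirrors conditions (E1) and (E3); the deviation-regret condition (E2) is not checked here.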


2606.18151 2026-06-17 eess.SP cs.IT math.IT New submission

Channel Charting for Position and Orientation

面向位置和朝向的信道图表

Daniel Richner, Reinhard Wiesmayr, Frederik Zumegen, Christoph Studer

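AI summary: Proposes a self-supervised method that uses channel state information to estimate user-equipment position and orientation simultaneously, via a novel orientation triplet loss and an alignment loss, approaching the accuracy of supervised learning in 5G NR measurements.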

Comments This work has been submitted to the IEEE Conference on Integrated Sensing and Communications 2026 (ISAC)

2606.18150 2026-06-17 eess.SP cs.IT math.IT New submission

Spatial and Temporal Generalization of CSI-based Neural Positioning

基于CSI的神经定位的空间与时间泛化

Till-Yannic Müller, Frederik Zumegen, Reinhard Wiesmayr, Christoph Studer

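AI summary: Studies the spatial and temporal generalization of CSI-based neural positioning, evaluating MLP and transformer architectures on three real-world datasets and finding that the transformer outperforms the MLP in positioning accuracy with fewer parameters.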

Comments This work has been submitted to the IEEE Conference on Integrated Sensing and Communications 2026 (ISAC)


AI Chinese abstract

基于信道状态信息（CSI）的神经定位利用神经网络学习从CSI测量到用户设备（UE）位置的映射。然而，现有的大多数性能评估使用随机划分的训练/测试CSI数据集分割，这未能反映实际部署的泛化要求，并呈现了乐观的结果。在本文中，我们研究了在室内和室外环境中获取的三个真实世界CSI数据集上，使用符合标准的Wi-Fi和5G NR系统的神经定位的空间和时间泛化。我们使用两种不同的架构——传统的多层感知器（MLP）和一种新颖的Transformer架构——评估对未见过的空间区域、未见过的UE轨迹以及间隔一周的CSI测量活动的泛化能力。我们的实验表明，两种架构在空间和时间上都能很好地泛化，并且所提出的Transformer在定位精度上始终优于MLP，同时需要的模型参数更少。

English abstract

Channel state information (CSI)-based neural positioning learns a mapping from CSI measurements to user equipment (UE) positions using neural networks. However, most existing performance evaluations utilize randomly partitioned train/test CSI-dataset splits, which fail to reflect the generalization requirements of practical deployments and present optimistic results. In this paper, we study the spatial and temporal generalization of neural positioning with standard-compliant Wi-Fi and 5G NR systems for three real-world CSI datasets acquired in indoor and outdoor environments. We assess generalization with two different architectures, a conventional multilayer perceptron (MLP) and a novel transformer architecture, to unseen spatial regions, unseen UE trajectories, and CSI measurement campaigns separated by one week. Our experiments show that both architectures generalize well in space and time, and the proposed transformer consistently outperforms the MLP in positioning accuracy while requiring fewer model parameters.
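The contrast between a random split and a spatial hold-out can be sketched as follows. Both helpers and the rectangular hold-out region are illustrative assumptions; the abstract does not describe how the paper's unseen regions are defined.

```python
import numpy as np

def random_split(num_samples, test_fraction, seed=0):
    """Random train/test split: test points are interleaved with training points."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_samples)
    n_test = int(round(test_fraction * num_samples))
    return perm[n_test:], perm[:n_test]

def spatial_split(positions, x_min, x_max):
    """Hold out every sample whose x-coordinate lies in [x_min, x_max].

    positions : (num_samples, 2) array of ground-truth UE positions
    Returns (train_indices, test_indices); the test region is never seen in training.
    """
    positions = np.asarray(positions, dtype=float)
    in_region = (positions[:, 0] >= x_min) & (positions[:, 0] <= x_max)
    return np.flatnonzero(~in_region), np.flatnonzero(in_region)

# Hypothetical 10 x 3 grid of measurement positions.
grid = np.array([[x, y] for x in range(10) for y in range(3)], dtype=float)
train_idx, test_idx = spatial_split(grid, 7.0, 9.0)
print(len(train_idx), len(test_idx))
```

With a random split, nearly every test position has a close training neighbour, so a network can score well by interpolating; the spatial hold-out removes those neighbours and is closer to the deployment condition the abstract argues for.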


2606.18121 2026-06-17 cs.MA cs.IT math.IT New submission

On the Reliability of Networks of AI Agents: Density Evolution, Stopping Sets, and Architecture Optimization

AI代理网络的可靠性：密度演化、停止集与架构优化

Ehsan Aghazadeh, Hossein Pishro-Nik

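AI summary: Models multi-agent AI systems as message passing on sparse graphs, extends the density-evolution theory of LDPC codes, analyzes three erasure failure modes, and proves a density-evolution theorem that predicts the asymptotic fraction of unresolved subclaims.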


AI Chinese abstract

现代AI系统越来越多地通过多个不完美代理协作完成任务：一些代理提出解决方案的片段，其他代理验证它们，结果被组合。这些系统通常优于任何单一模型，但很少清楚它们为何成功或何时失败。我们将此类系统建模为稀疏图上的消息传递，这是低密度奇偶校验（LDPC）码的基础结构，并将编码理论的密度演化机制扩展到这一更丰富的场景。在我们的模型中，任务是一组耦合的二元子声明，代理架构是一个稀疏的、角色类型化的因子图，其校验节点是带噪的布尔验证器节点，每个节点计算其接触的子声明的局部布尔函数。三种不同的失效模式——均建模为擦除（代理弃权、验证器返回无可用输出、两代理间消息丢失）——随着代理交换集合值消息而传播。校验代理通过单一的逻辑强制规则组合这些消息，该规则特化为XOR、AND、OR、蕴含和Horn约束。这不仅仅是LDPC理论的重新标记：验证器函数是非线性和值不对称的，三种失效模式不能简化为单一有效信道，因此需要新的阈值、有限长和逆结果，而非直接重用奇偶校验密度演化。我们证明了一个密度演化定理，该定理预测随机角色类型化架构上未解决子声明的渐近比例，并扩展到确定性的、局部树状图序列。XOR情况恢复了二元擦除信道（BEC）上的经典LDPC递归；AND情况揭示了正负验证器证书之间的不对称性。

English abstract

Modern AI systems increasingly solve a task not with a single model call but with several imperfect agents working together: some propose pieces of a solution, others verify them, and the results are combined. These systems often outperform any single model, yet it is rarely clear why they succeed or when they will fail. We model such a system as message passing on a sparse graph, the structure that underlies low-density parity-check (LDPC) codes, and extend the density-evolution machinery of coding theory to this richer setting. In our model a task is a set of coupled binary subclaims, and an agent architecture is a sparse, role-typed factor graph whose check nodes are noisy Boolean verifier nodes, each computing a local Boolean function of the subclaims it touches. Three distinct failure modes, all modeled as erasures (an agent abstaining, a verifier returning no usable output, and a message lost between two agents), propagate as the agents exchange set-valued messages. The check agents combine these messages by a single logical-forcing rule that specializes to XOR, AND, OR, implication, and Horn constraints. This is more than a relabeling of LDPC theory: the verifier functions are nonlinear and value-asymmetric, and the three failure modes do not reduce to a single effective channel, so they require new threshold, finite-length, and converse results rather than a direct reuse of parity-check density evolution. We prove a density-evolution theorem that predicts the asymptotic fraction of unresolved subclaims on random role-typed architectures, with an extension to deterministic, locally tree-like graph sequences. The XOR case recovers the classical LDPC recursion on the binary erasure channel (BEC); the AND case exposes an asymmetry between positive and negative verifier certificates.
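For reference, the classical recursion that the XOR case is said to recover is the standard density evolution for a regular $(d_v, d_c)$ LDPC ensemble on the BEC, $x_{\ell+1} = \varepsilon\,(1 - (1 - x_\ell)^{d_c-1})^{d_v-1}$. The sketch below implements only this textbook recursion, not the paper's role-typed generalisation; the function names and the bisection tolerance are illustrative choices.

```python
def bec_density_evolution(eps, dv, dc, iters=2000):
    """Iterate the BEC density-evolution recursion for a regular (dv, dc) ensemble.

    eps : channel erasure probability
    Returns the erasure fraction of variable-to-check messages after `iters` rounds.
    """
    x = eps
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
    return x

def bec_threshold(dv, dc, tol=1e-4):
    """Bisect for the largest eps at which the erasure fraction is driven to ~0."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bec_density_evolution(mid, dv, dc) < 1e-6:
            lo = mid   # decoding succeeds, threshold is higher
        else:
            hi = mid   # decoding stalls at a nonzero fixed point
    return lo

print(round(bec_threshold(3, 6), 3))
```

For the (3, 6) ensemble the threshold is known to be approximately 0.429: below it the unresolved fraction collapses to zero, above it the recursion stalls at a nonzero fixed point. The paper's theorem predicts the analogous unresolved-subclaim fraction for non-XOR verifiers.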
