arXivDaily: arXiv daily academic digest, updated Monday to Friday
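2606.10702 2026-06-10 cs.SE New submission

Watts and Debts of Agentic Frameworks: An Empirical Study (Registered Report)

Aneetta Sara Shany, Chandrasekar S, Karthik Vaidhyanathan

AI summary: An empirical study that correlates self-admitted technical debt in agentic frameworks with runtime energy consumption, asking whether code quality can guide energy-aware design.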

Comments Accepted at the 20th International Symposium on Empirical Software Engineering and Measurement (ESEM 2026), Registered Reports Track

English abstract

Context: Every agentic AI system shipped to production carries two hidden risks: accumulated Technical Debt (TD) and unmonitored runtime energy costs. While functional benchmarking is common, the empirical link between internal structural quality (specifically TD) and dynamic energy consumption during execution remains unexplored, creating a blind spot for practitioners and organizations managing sustainability and operational budgets at scale. Goal: We propose a confirmatory empirical study correlating Self-Admitted Technical Debt (SATD) with hardware-level runtime energy consumption across agentic frameworks, to determine whether code quality can drive energy-aware design decisions. Method: We will evaluate five open-source agentic frameworks by executing a standardized task suite in a strictly controlled environment. SATD will be extracted via automated Python-based comment mining and categorized via LLM-based classification using a fine-tuned prompt, while runtime energy will be measured at the hardware level. Our study will investigate three core research questions: (RQ1) the presence of TD within these frameworks; (RQ2) the variance in runtime energy consumption across architectures; and (RQ3) the statistical correlation between a framework's TD and its task-level energy consumption. Conclusion: The findings will establish whether automated source code analysis can serve as a reliable, early-warning proxy for energy-efficient framework selection, thereby advancing both green software engineering and agentic AI quality research.
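
The comment-mining step named in the Method could be sketched roughly as follows. This is a minimal illustration and not the study's tooling: the keyword list `SATD_PATTERN`, the function name `mine_satd_comments`, and the sample source are all assumptions, since the abstract does not state which patterns or classifier prompt the authors use. The sketch relies only on Python's standard `tokenize` module, which distinguishes real comments from `#` characters inside string literals.

```python
import io
import re
import tokenize

# Hypothetical keyword list; the study's actual patterns are not given in the abstract.
SATD_PATTERN = re.compile(r"\b(TODO|FIXME|HACK|XXX|WORKAROUND)\b", re.IGNORECASE)


def mine_satd_comments(source: str) -> list[tuple[int, str]]:
    """Return (line_number, comment_text) for comments that match an SATD keyword."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and SATD_PATTERN.search(tok.string):
            # Drop the leading '#' and surrounding whitespace from the comment.
            hits.append((tok.start[0], tok.string.lstrip("#").strip()))
    return hits


# Made-up framework snippet used only to exercise the function.
SAMPLE = '''\
def plan(task):
    # TODO: cache tool results instead of re-querying the model
    steps = task.split()
    return steps  # fixme: quadratic retry loop upstream
'''

for line_no, text in mine_satd_comments(SAMPLE):
    print(line_no, text)
```

Keyword matching of this kind only produces candidates; in the design described above, the categorization into debt types is left to the LLM-based classifier.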

2606.10697 2026-06-10 cs.IR New submission

Beyond Patches: Superpixel Token-based Transformers for Attribute-Specific Fashion Retrieval

Shuili Zhang, Hongzhang Mu, Wenyuan Zhang, Duohe Ma, Tingwen Liu

AI summary: Proposes SuperFashion, the first ASFR framework to adopt superpixel tokens within a Transformer; attribute-guided attention and superpixel segmentation strengthen attribute localization and discrimination, yielding marked gains on three datasets.

Comments 9 pages, 5 figures. Published in the Proceedings of the ACM Web Conference 2026 (WWW '26). Author version with minor corrections; results and conclusions unchanged

Journal ref
Proceedings of the ACM Web Conference 2026 (WWW '26), pp. 6956-6964, 2026

English abstract

Attribute-Specific Fashion Retrieval (ASFR) aims to improve fine-grained image retrieval by focusing on specific attributes. However, existing patch-based attention and Transformer methods often misalign with irregular attribute regions and are prone to background noise, limiting their ability to capture subtle, pixel-level microstructures. To tackle these challenges, we propose SuperFashion, the first ASFR framework that adopts superpixel tokens within a Transformer architecture. SuperFashion initially employs an attribute-guided attention mechanism to extract attribute-related features, which in turn guide the cropping of semantically meaningful image regions. Superpixel segmentation is then leveraged on these regions to generate compact, semantically coherent superpixel tokens. By incorporating modality-specific embeddings for both attribute and superpixel tokens, the superpixel token-based Transformer facilitates adaptive interaction and fusion, thereby enhancing attribute localization and discrimination. Extensive experiments on FashionAI, DARN, and DeepFashion demonstrate relative overall MAP improvements of 1.84%, 9.27%, and 9.35% over prior SOTA. SuperFashion offers a new solution for web-based image retrieval.

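2606.10693 2026-06-10 cs.DC cs.CC cs.FL New submission

Generalizing LCL Complexity Gaps to Unbounded Degree via Monadic Second-Order Properties

Chiara Piombi

AI summary: Introduces the class of Local PMSO (LPMSO) problems, which generalizes LCL problems to unbounded-degree graphs, and proves that on unbounded-degree rooted trees the known complexity gaps (such as the one between ω(log n) and n^{o(1)}) and their decidability still hold.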

English abstract

The last decade of research on the LOCAL model has seen tremendous progress in understanding locally checkable labeling (LCL) problems, culminating in an almost complete classification of the possible complexities LCL problems can exhibit. In particular, on undirected trees, Chang and Pettie showed that there is no LCL problem with complexity between $\omega(\log n)$ and $n^{o(1)}$, and Chang showed that, for every positive integer $k$, there is no LCL problem with complexity between $\omega(n^{1/(k+1)})$ and $o(n^{1/k})$; additionally, which side of each gap a problem is found on is decidable. While the class of LCL problems - which, roughly speaking, consists of problems for which the correctness of a solution can be described by a finite set of allowed node configurations, which in turn can be locally verified by a constant-time algorithm - includes many important problems, it has one major restriction: problems can be defined only on bounded degree graphs, which consequently restricts all the classification and gap results mentioned above. In this work, we propose a generalization of LCL problems to unbounded degree using Presburger monadic second-order (PMSO) formulas; more specifically, we consider what we call Local PMSO (LPMSO) problems, i.e., problems whose correct solutions are both finitely described by a PMSO formula and locally verifiable by a LOCAL algorithm in constant time - this class contains many of the important problems studied in the LOCAL model but defines them on unbounded degree graphs. As our main result we prove that, on unbounded degree rooted trees, the aforementioned $\omega(\log n)$ - $n^{o(1)}$ and $\omega(n^{1/(k+1)})$ - $o(n^{1/k})$ complexity gaps (and their decidability) extend to the class of LPMSO problems.

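2606.10670 2026-06-10 cs.DS cs.CC New submission

On the Complexity of Signed Domination

Sangam Balchandar Reddy

AI summary: Studies the complexity of the Signed Domination problem: it remains NP-complete on split graphs and is W[1]-hard when parameterized by feedback vertex set number, while FPT algorithms are given for neighbourhood diversity and twin cover number.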

Comments An extended abstract of this paper appeared at IWOCA 2026

English abstract

Given a graph $G = (V, E)$, a signed dominating function is a function $f: V \rightarrow \{-1, 1\}$ such that for every vertex $u \in V$, $\sum\limits_{v \in N[u]} f(v) \geq 1$. The weight of $f$ is defined as $\sum\limits_{u \in V} f(u)$. The objective of the Signed Domination problem is to compute a signed dominating function $f$ of minimum weight. The problem is known to be NP-complete even when restricted to bipartite, chordal, and planar graphs. In this paper, we extend the known complexity results for the Signed Domination problem. Since the problem is NP-complete on chordal graphs, we study its complexity on split graphs, a subclass of chordal graphs, and show that it remains NP-complete. Moreover, as the problem is W[2]-hard parameterized by weight, we investigate its parameterized complexity with respect to structural parameters. We prove that the problem is W[1]-hard when parameterized by feedback vertex set number (and hence by treewidth and clique-width). Motivated by this hardness result, we consider more restrictive parameters, neighbourhood diversity and twin cover number, and present FPT algorithms.
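
To make the definition concrete, the following sketch checks the closed-neighbourhood condition and finds the minimum weight by exhaustive search on two tiny graphs. It is only an illustration of the definition: the function names and example graphs are assumptions, and the brute-force search over all 2^n sign assignments has nothing to do with the FPT algorithms the paper presents.

```python
from itertools import product


def is_signed_dominating(adj, f):
    """True if the sum of f over the closed neighbourhood N[u] is >= 1 for every u."""
    return all(f[u] + sum(f[v] for v in adj[u]) >= 1 for u in adj)


def min_signed_domination_weight(adj):
    """Exhaustive search over all 2^n assignments; usable only for tiny graphs."""
    vertices = sorted(adj)
    best = None
    for signs in product((-1, 1), repeat=len(vertices)):
        f = dict(zip(vertices, signs))
        if is_signed_dominating(adj, f):
            weight = sum(signs)
            if best is None or weight < best:
                best = weight
    return best


# Adjacency sets (open neighbourhoods) of two small example graphs.
TRIANGLE = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
PATH3 = {0: {1}, 1: {0, 2}, 2: {1}}

print(min_signed_domination_weight(TRIANGLE))  # 1: one vertex may take -1
print(min_signed_domination_weight(PATH3))     # 3: the end vertices force all +1
```

On the path, each end vertex has a closed neighbourhood of size two, so both of its members must be +1, which is why no vertex can take the value -1 there.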

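2606.10663 2026-06-10 cs.DB New submission

Reconstructing OPC UA Address Spaces from Time-Series Databases

Lukas Lürzer, Hannes Unger, Stefan Huber

AI summary: Presents opcua-ts, an architecture that persists semantic metadata in a time-series database under a lifecycle-stable join key and reconstructs it as a live OPC UA endpoint, addressing identifier collisions in multi-source deployments.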

Comments 5 pages, 1 figure. Author's accepted version of a paper accepted at AI4IP 2026 (workshop at DEXA2026); to appear in Springer Communications in Computer and Information Science (CCIS)

English abstract

OPC UA has become the dominant open protocol in operational technology. Time-series databases routinely archive OPC UA telemetry but discard the semantic metadata (node hierarchy, engineering units, and type definitions) which gives sensor values their meaning. Recovering this information from a time-series database is non-trivial: namespace indices recorded at the source are session-local and unstable across restarts, and naive merging across multiple source servers results in identifier collisions. We present opcua-ts, an implemented architecture that persists this semantic information alongside its telemetry in a general-purpose time-series database under a lifecycle-stable join key, and that reconstructs the source address space as a live OPC UA endpoint. We characterize the conditions under which the reconstruction is sound across multi-source deployments and validate the approach with a NodeSet2 XML round-trip against the source server. Initial results from a boiler-simulator round-trip indicate that the approach is feasible.
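
The two problems named above (session-local namespace indices and collisions across source servers) can be illustrated with a small sketch. It assumes, purely for illustration, that a stable key is formed from a source identifier, the namespace URI looked up in the server's namespace array, and the remainder of the NodeId string. The abstract does not describe how the opcua-ts join key is actually built, so `stable_key`, the `|` separator, and the example URIs are all hypothetical.

```python
UA_NAMESPACE = "http://opcfoundation.org/UA/"  # namespace index 0 in OPC UA


def stable_key(source_id: str, namespace_array: list[str], node_id: str) -> str:
    """Rewrite 'ns=<index>;<rest>' into a key based on the namespace URI.

    Hypothetical key layout: '<source>|<namespace URI>|<identifier>'.
    """
    ns_part, _, rest = node_id.partition(";")
    if not ns_part.startswith("ns=") or not rest:
        # NodeId strings without an 'ns=' prefix refer to namespace 0.
        ns_index, rest = 0, node_id
    else:
        ns_index = int(ns_part[3:])
    return f"{source_id}|{namespace_array[ns_index]}|{rest}"


# The same server before and after a restart that reorders its namespaces.
BEFORE_RESTART = [UA_NAMESPACE, "urn:plant:server-a", "urn:example:boiler", "urn:example:pumps"]
AFTER_RESTART = [UA_NAMESPACE, "urn:plant:server-a", "urn:example:pumps", "urn:example:boiler"]

key_old = stable_key("plant-a", BEFORE_RESTART, "ns=2;s=Boiler.Temp")
key_new = stable_key("plant-a", AFTER_RESTART, "ns=3;s=Boiler.Temp")
print(key_old == key_new)  # the index changed, the key did not
```

Including the source identifier in the key is what keeps two servers that both expose, say, `ns=2;s=Boiler.Temp` from colliding after a merge.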

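2606.10649 2026-06-10 cs.CR cs.FL New submission

Layer Order Semantics for Automata-Based Cybersecurity

Faruk Alpay, Taylan Alpay

AI summary: Proposes a layer-order automaton model that formalizes how the order in which a security pipeline processes evidence affects detection outcomes; using HTTP request desynchronization as the worked case, it separates completed-trace recognition, online editing, decision synthesis, and faithful enforcement, and proves the recognizability of regular policies under causal visibility.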

Comments 22 pages; theoretical paper; no figures or tables

English abstract

Layered cybersecurity pipelines transform evidence before they decide on it, and the order of those transformations determines which security facts become visible to which layer. This paper gives layer order a finite-state semantics built from a layer-order automaton, deterministic sequential security transducers, evidence markers, and a final decision automaton. The worked case is HTTP request desynchronization: front-end and back-end processors compute incompatible request boundaries, and the same trace is detected or missed according to whether framing evidence reaches the parser-differential layer before it commits. The results separate completed-trace recognition, online editing, decision synthesis, and faithful enforcement; characterize faithful online enforcement as the regular prefix-closed case under causal visibility; and show that regular policies beyond that boundary remain recognizable without becoming deployable enforcers. The framework is monolithically equivalent to finite-output deterministic edit automata, while preserving layer-local invariants such as marker birth, marker survival, and reorder-sensitive visibility. A concrete parser-pair semantics identifies the forbidden marker factor with CL.TE, TE.CL, TE.TE, and HTTP/2-downgrade boundary disagreement under the stated abstraction, and a contextual reorder congruence classifies which component permutations induce the same decision language. The result is an automata-theoretic account of order-sensitive security failures and a compositional vocabulary for auditing, synthesizing, and comparing layered enforcement pipelines.
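
The boundary disagreement behind the CL.TE case can be shown with a toy pair of parsers. This is not the paper's automaton construction, only a sketch of the underlying phenomenon under simplifying assumptions: one processor trusts `Content-Length`, the other trusts `Transfer-Encoding: chunked`, and chunk extensions and trailers are ignored. All function names and the sample request are made up for the example.

```python
CRLF = b"\r\n"


def split_head(raw: bytes) -> tuple[dict[str, str], int]:
    """Return lower-cased header fields and the offset where the body starts."""
    body_start = raw.index(CRLF + CRLF) + 4
    headers = {}
    for line in raw[:body_start - 4].split(CRLF)[1:]:  # skip the request line
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body_start


def end_by_content_length(raw: bytes) -> int:
    """Request boundary as seen by a processor that trusts Content-Length."""
    headers, body_start = split_head(raw)
    return body_start + int(headers["content-length"])


def end_by_chunked(raw: bytes) -> int:
    """Request boundary as seen by a processor that trusts chunked encoding."""
    _, pos = split_head(raw)
    while True:
        line_end = raw.index(CRLF, pos)
        size = int(raw[pos:line_end].decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return pos + 2  # CRLF that terminates the body (trailers ignored)
        pos += size + 2  # chunk data plus its trailing CRLF


REQUEST = (
    b"POST / HTTP/1.1\r\nHost: example.test\r\n"
    b"Content-Length: 6\r\nTransfer-Encoding: chunked\r\n\r\n"
    b"0\r\n\r\nG"
)

front_end = end_by_content_length(REQUEST)
back_end = end_by_chunked(REQUEST)
print(front_end - back_end)  # number of bytes the two processors disagree on
print(REQUEST[back_end:])    # bytes the back end treats as the next request
```

The leftover byte is the kind of framing evidence the abstract refers to: whether a detection layer sees both headers before either parser commits determines whether the disagreement is visible at all.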

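2606.10644 2026-06-10 cs.PL cs.LO New submission

Answer Set Programming for Egg Extraction and More

Ziyi Yang, Ilya Sergey

AI summary: Studies term extraction from e-graphs with answer set programming (ASP): a naive encoding matches the efficiency of optimized ILP approaches and finds better extractions, and the paper explores the potential of combining egg with ASP as a more powerful Datalog.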

Comments To be presented at EGRAPHS 2026

English abstract

Three years ago, Philip Zucker posted an attempt to use answer set programming (ASP) for term extraction from e-graphs. Although the task is NP-hard and ASP offers a natural modelling of e-graph terms, the initial attempt did not yield convincing results. From the perspective of practical ASP users, we first pinpoint how to make ASP work, and work well, on the task of e-graph extraction. The initial results show that the naïve ASP encoding is comparable in efficiency to the well-optimised ILP-based exact DAG extraction in the extraction-gym, and finds several additional optimal extractions on the complex instances. This leads us to a further agenda: given the "better together" of egg and Datalog, is there a better "better together" with ASP as a more powerful Datalog? We discuss the potential benefits each can bring to the other.

2606.10641 2026-06-10 cs.NI New submission

CAMASA: A CAM-based Dataset from the MASA Living Lab

Salvatore Iandolo, Marco Savarese, Gaetano Orazio Cauchi, Antonio Solida, Martin Klapez, Maurizio Casoni, Angelo Porrello, Carlo Augusto Grazia

AI summary: Introduces CAMASA, a large-scale infrastructure-based dataset built from real V2X communication messages, for research on trajectory prediction, traffic simulation, and digital twins.

Comments Accepted for publication at the IEEE 2026 Vehicular Technology Conference (VTC2026-Fall). Dataset will be available at netlab.unimore.it/MASA

English abstract

Trajectory prediction is a key enabler of autonomous and cooperative driving systems. However, most existing benchmarks are either sensor-centric, geographically constrained, or based on synthetic mobility traces that do not capture real-world V2X communication dynamics. This paper introduces CAMASA, a large-scale infrastructure-based dataset derived from Cooperative Awareness Messages (CAMs) and Decentralized Environmental Notification Messages (DENMs) collected within the Modena Automotive Smart Area (MASA). The dataset comprises more than 40 million CAMs and 2 million DENMs recorded under authentic urban traffic conditions over multiple months. We present a rigorous preprocessing pipeline that includes filtering, pseudonym reconciliation to account for ETSI privacy-driven stationID changes, and temporal normalization to 10 Hz trajectories, suitable for motion forecasting and time-series analysis. With over 14,000 km of reconstructed vehicle paths and tens of thousands of unique station IDs, CAMASA provides a statistically significant empirical foundation for research on Cooperative Intelligent Transportation Systems (C-ITS). Beyond trajectory prediction, the dataset enables calibration of microscopic urban traffic simulators (e.g., SUMO) and supports the development of realistic Intelligent Transportation Systems (ITS) Digital Twins by jointly modeling mobility patterns and V2X communication coverage in real deployments.
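
The temporal normalization to 10 Hz mentioned above could look roughly like the sketch below, which places irregularly timed position samples on a fixed 100 ms grid. Linear interpolation is an assumption made here for illustration; the abstract does not say which resampling method the pipeline uses. The function name, the `(t_ms, x, y)` tuple layout, and the sample values are likewise hypothetical, and timestamps are kept as integer milliseconds to avoid floating-point drift on the grid.

```python
STEP_MS = 100  # 10 Hz grid spacing in milliseconds


def resample_10hz(samples):
    """Linearly interpolate (t_ms, x, y) samples onto a 100 ms grid.

    Expects at least two samples with distinct timestamps.
    """
    pts = sorted(samples)
    out, i = [], 0
    t = pts[0][0]
    while t <= pts[-1][0]:
        # Advance to the segment [pts[i], pts[i + 1]] that contains t.
        while i + 2 < len(pts) and pts[i + 1][0] < t:
            i += 1
        (t0, x0, y0), (t1, x1, y1) = pts[i], pts[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append((t, x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
        t += STEP_MS
    return out


# Made-up track: three messages from one station, unevenly spaced in time.
TRACK = [(0, 0.0, 0.0), (250, 5.0, 0.0), (400, 8.0, 3.0)]

for t_ms, x, y in resample_10hz(TRACK):
    print(t_ms, round(x, 3), round(y, 3))
```

In a pipeline like the one described, a step of this kind would presumably run per vehicle after pseudonym reconciliation, so that one physical track is not split across several station IDs.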

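2606.10625 2026-06-10 cs.CR New submission

snaproot: Decentralized File Integrity Verification Using Blockchain-Anchored Cryptographic Hashing

Arslan Brömme, Tarkan Yavas

AI summary: Proposes snaproot, a lightweight file integrity verification system on the Solana blockchain that anchors SHA-256 hashes for decentralized verification and separates existence proof from authorship proof, addressing centralized trust dependencies and resource overhead.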

Comments 38 pages, 2 figures, 4 tables. Working paper

English abstract

The rapid growth of digital content has made reliable integrity verification increasingly important. Existing solutions rely either on centralized authorities, which introduce trust dependencies and single points of failure, or on decentralized storage systems that incur prohibitive resource overhead. In this paper, we present snaproot, a lightweight system that implements the hash-anchoring paradigm of Haber and Stornetta on the Solana blockchain to provide efficient, decentralized file integrity verification. snaproot generates a SHA-256 hash of a file and stores it immutably on-chain as a permanent reference record. Verification is performed by recomputing the hash and comparing it to the stored value, yielding a deterministic binary outcome. We describe a four-tier trust architecture comprising three realized tiers and one prospective tier for long-term persistence beyond the lifetime of any single blockchain. We present a formal threat model, a security analysis grounded in the second-preimage resistance of SHA-256, and an empirical evaluation on Solana Devnet across file sizes from 1 KB to 500 MB. A central conceptual contribution is the explicit separation between existence proof, the key-independent claim that a file existed at a given time, and authorship proof, the key-dependent binding between a record and a specific wallet identity. This separation allows existence guarantees to survive key loss while preserving stronger authorship claims where keys are retained. We position snaproot against OpenTimestamps, OriginStamp, and Chainpoint and discuss limitations with respect to pre-registration manipulation and AI-generated content.
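
The hash-and-recompute check at the core of this scheme can be sketched with Python's standard `hashlib`. The sketch covers only the local side: the on-chain record is stood in for by a plain hex string, and nothing here touches Solana or reflects snaproot's actual code. The names `stream_sha256` and `verify` are assumptions made for the example.

```python
import hashlib
import io


def stream_sha256(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(stream, anchored_hex):
    """Binary outcome: True only if the recomputed hash equals the anchored one."""
    return stream_sha256(stream) == anchored_hex.lower()


original = b"quarterly-report-v1"
# Stand-in for the immutable reference record that would be stored on-chain.
anchored = stream_sha256(io.BytesIO(original))

print(verify(io.BytesIO(original), anchored))         # unchanged content
print(verify(io.BytesIO(original + b"!"), anchored))  # tampered content
```

For a file on disk, the same functions could be called with a handle opened in binary mode, for example `open(path, "rb")`.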

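2606.10598 2026-06-10 cs.SE New submission

Exploring and Complementing End Users' Requirements in IoT enabled System

Haotian Li, Xiaohong Chen, Zhi Jin, Shuyuan Xiao, Chenxu Wang, Haoxiang Yan, Xiaoyi Chen

AI summary: Addresses the fragmented way end users express IoT rules in trigger-action programming with an intent-based requirements completion approach that combines a bidirectional requirements traceability tree and a multi-agent framework, pairing LLM reasoning with structured traceability to produce functionally complete and inherently safe rule completions; it raises the rule completion rate by 43% and cuts logical conflicts by over 21%.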

English abstract

End users create IoT automation rules via trigger-action programming, but their expressions are often fragmented, capturing device operations rather than high-level intents. This gap leads to missing conditions, logical conflicts, and overlooked safety constraints, risking hazardous behaviors. To address this, we propose an intent-driven requirements completion approach that reframes rule completion as a dual process: reconstructing intent from fragmented rules, then regenerating rules from that intent, with safety embedded throughout. We introduce a Bidirectional Requirements Traceability Tree, a three-layer model linking rules, intents, and quality concerns, and design a multi-agent framework that combines LLM reasoning with structured traceability. This enables completions that are both functionally complete and inherently safe, while remaining traceable and explainable. Evaluation shows our method significantly outperforms the baselines, improving the rule completion rate by 43% and reducing logical conflicts by over 21%. By grounding completion in intent understanding, we shift the paradigm from user to system responsibility, and from functional correctness to holistic trustworthiness.

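2606.10589 2026-06-10 eess.SY cs.SY New submission

Transient Stability of Offshore Energy Hubs

Alban J. F. Duvivier, Dominic Groß, Daniel Müller, Nicolaos A. Cutululis

AI summary: Addresses current limiting for grid-forming modular multilevel converters in offshore energy hubs with a unified strategy that combines a variable virtual impedance and a virtual-power mechanism; P-delta analysis shows that it improves transient stability, and EMT simulations confirm its effectiveness.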

Comments 10 pages, 11 figures, 2 tables, journal paper

English abstract

Offshore energy hubs (OEHs) use grid-forming modular multilevel converters (MMCs) to enable large-scale offshore wind integration and multi-terminal HVDC operation. In HVDC-connected offshore wind farms and OEHs, the offshore grid-forming HVDC converters absorb active power from an offshore AC grid supplied by the wind farms and convert it to DC power for transmission to the onshore grid. Converter current limiting under different fault types in this setting is an understudied topic in the literature, which mostly focuses on power-injecting converters. This paper proposes a unified current-limiting strategy that combines a variable virtual impedance (VVI), based on a smooth threshold function, with a novel virtual-power (VP) mechanism derived from the power dissipated in the virtual resistance. The VVI ensures current limitation during fault-induced overcurrents while preserving voltage-source behavior, whereas the VP mechanism adds a compensating power term into the synchronization loop, enabling automatic power redistribution among converters. P-delta analysis further shows that a more resistive VVI can improve the transient stability of power-absorbing converters, while the proposed VP mechanism further enlarges the stability margin. EMT simulations validate that the combined VVI-VP strategy limits fault currents, maintains synchronism during severe faults, and achieves coordinated post-fault power sharing in fully converter-based OEHs.

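2606.10544 2026-06-10 cs.SI cs.CY cs.NI cs.SY econ.GN eess.SY q-fin.EC New submission

From Stacks to Circuits: A Regenerative Socio-Technical Roadmap for AI Infrastructure within Planetary Boundaries

Han-Teng Liao, Karen Ang

AI summary: Responding to the externalized thermodynamic and material costs of linearly scaling generative AI, proposes a regenerative socio-technical roadmap that uses a metabolic circuit framework to recast AI infrastructure as a system-of-systems bounded by planetary limits, identifies gaps in current Nvidia-centric roadmaps, and proposes a competing reference architecture.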

Comments This document is a working paper and reflects the state of research as of May 2026. Comments are welcome and should be directed to the corresponding author at h.liao@ieee.org. This work is accepted for presentation at the 32nd IEEE ICE/ITMC Conference, Porto, Portugal

Journal ref
2026 IEEE International Conference on Engineering, Technology, and Innovation (ICE/ITMC), forthcoming 2026

English abstract

Current scaling trajectories for Generative AI, typified by linear supply-side "stacks," prioritize performance density while externalizing significant thermodynamic and material costs. As the "Twin Transition" of green and digital transformation accelerates, the industry faces technology gaps - including Scope 3 emissions and e-waste recycling - that impede sustainable scaling and lead to social tensions. This study proposes a Regenerative Socio-Technical roadmap that repurposes the Sustainable Production and Consumption system map to reframe artificial intelligence infrastructure as a system-of-systems governed ultimately by planetary limits. By integrating the Institute of Electrical and Electronics Engineers International Roadmap for Devices and Systems (IEEE IRDS) sustainability considerations for semiconductor facilities, the study proposes a metabolic circuit framework that centers "Values and Needs" within production and consumption relationship loops. This study identifies critical gaps in current Nvidia-centric roadmaps and proposes a competing reference architecture. It demonstrates how a spontaneous order of resource parsimony and planetary accountability can provide an actionable pathway for regulatory compliance and industrial resilience in the digital circular economy.

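2606.10536 2026-06-10 cs.CR cs.AR cs.DC New submission

A Hybrid Edge-Cloud Architecture for Low-Latency Entitlement Verification in Resource-Constrained Devices

Pravin Nagare, Aditya Sabbineni, Devendra Dahiphale, Faiz Gouri, Pratik Thantharate

AI summary: Proposes a hybrid edge-cloud entitlement framework whose local caching layer and Adaptive Entitlement Cache with Proactive Refresh algorithm cut authorization latency from 422.8 ms to 18.4 ms (a 95.6% reduction), while deterministic Ed25519 arithmetic and TEE isolation mitigate side-channel risks.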

Comments 6 pages, 4 figures, 2 tables, 1 algorithm. Prepared in IEEE format. Proposes the AEC-PR framework for low-latency OTT entitlement verification using TEE and Ed25519

English abstract

As digital media consumption shifts toward large-scale Over-the-Top (OTT) platforms, the efficiency of the control plane, specifically entitlement and identity verification, has become a critical factor in user experience. Current architectures often rely on synchronous cloud-tethered validation flows that introduce significant latency, especially on resource-constrained consumer electronics. This paper proposes a Hybrid Edge-Cloud Entitlement Framework designed to minimize user-perceived friction. By implementing a secure, local caching layer within device middleware and utilizing an Adaptive Entitlement Cache with Proactive Refresh (AEC-PR) algorithm, we decouple the user interaction from backend network variability. We evaluate the performance on ARM Cortex-A series hardware, demonstrating that localized cryptographic verification reduces authorization latency from a mean of 422.8ms to 18.4ms (a 95.6% reduction) while mitigating implementation-level side-channel risks through deterministic Ed25519 arithmetic and TEE isolation.

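2606.10509 2026-06-10 cs.CE New submission

On the Localization of Checkerboarding in Multiaxial Stress Regions under SIMP Penalization

Iulian Paunel, Jonathan Stollberg, Dominik Schillinger

AI summary: Numerical experiments show that checkerboard patterns arise systematically in multiaxial stress-transfer regions and vanish in uniaxial ones; the paper attributes this to checkerboard layouts acting as an artificially stiff substitute where SIMP penalization suppresses continuous intermediate densities.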

English abstract

Checkerboard patterns are a well-known numerical artifact in density-based topology optimization using the Solid Isotropic Material with Penalization (SIMP) method and linear finite elements. Existing explanations based on mixed-field incompatibility or locking-induced stiffness overestimation explain the artificial stiffness of checkerboard layouts but do not clarify their characteristic spatial localization. In this work, we show that checkerboard patterns systematically emerge in multiaxial load-transfer regions, whereas predominantly uniaxial stress regions remain checkerboard-free. Through systematic numerical investigations, we demonstrate that checkerboarding originates where continuous intermediate densities are mechanically favorable for multiaxial load transfer but are suppressed by SIMP penalization. Due to the characteristic behavior of linear elements, checkerboard layouts provide an artificially stiff discrete substitute for these penalized intermediate-density regions. In contrast, uniaxial load paths naturally favor continuous solid struts, rendering checkerboards mechanically disadvantageous. Our findings provide a unified mechanical interpretation of checkerboarding as the interplay between global stress states, SIMP penalization, and element-level locking, thereby explaining both its origin and the spatial localization.
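
Why SIMP suppresses intermediate densities can be seen from the interpolation itself. The sketch below uses the commonly cited modified SIMP form, E(ρ) = E_min + ρ^p (E_0 − E_min), with p = 3. Both the exact form and the parameter values are assumptions here, since the abstract does not state which variant the authors use, and the function names are made up for the example.

```python
def simp_modulus(rho, e0=1.0, e_min=1e-9, p=3.0):
    """Modified SIMP interpolation of Young's modulus for a density in [0, 1]."""
    return e_min + rho**p * (e0 - e_min)


def stiffness_per_density(rho, **kwargs):
    """Stiffness obtained per unit of material spent (requires rho > 0)."""
    return simp_modulus(rho, **kwargs) / rho


for rho in (0.25, 0.5, 0.75, 1.0):
    print(rho, round(simp_modulus(rho), 4), round(stiffness_per_density(rho), 4))
```

With p = 3, an element at half density contributes roughly one eighth of the full stiffness while consuming half the material, so the optimizer is pushed towards 0/1 layouts. In the abstract's account, this is where a checkerboard of solid and void linear elements becomes an artificially stiff stand-in for the penalized intermediate densities.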

2606.10502 2026-06-10 cs.CR New submission

When VR Meets BCI: (Un)Observable Brainwave-aware Privacy Reconstruction in the Metaverse via Unrestricted Inbuilt Motion Sensors

Tao Ni, Zehua Sun, Qingchuan Zhao, Wei-Bin Lee, Cong Wang

AI summary: Uses the inbuilt motion sensors of VR headsets to capture subtle vibrations induced by pupillary responses and reconstruct brain EEG signals, revealing unobservable privacy leakage in the Metaverse for the first time and recognizing perceived images with 52.0%-67.2% accuracy.

English abstract

Metaverse devices, such as virtual reality (VR), have seen substantial development and widespread applications in numerous areas. Although recent studies have revealed privacy leakages in VR, these vulnerabilities were limited in the scope of observable behaviors in virtual scenes (e.g., what a user is seeing). In this work, we uncover the feasibility of going beyond the scope of observable user behaviors to unobservable brain EEG-correlated representations (e.g., what a user is perceiving) by leveraging unrestricted motion sensors in VR headsets to reconstruct brain EEG signals, a seemingly neglected but promising vector. The insight is that the inbuilt motion sensors (e.g., accelerometers) in the VR headset can capture subtle vibrations induced by pupillary responses, which are highly correlated with users' visual stimuli and in-brain perceptions. Therefore, we design and implement BraVeSpy to systematically investigate and demonstrate the feasibility of this severe privacy leakage originating from brain EEG-correlated representations reconstructed from variations of inbuilt motion sensors. Our extensive evaluation results from different VR devices show that BraVeSpy, for the first time in the Metaverse, can reveal unobservable privacy, where we successfully unveiled perceptive images in the brain with 52.0%-67.2% accuracy. In particular, we also find that BraVeSpy outperforms the current approaches that are limited to coarse-grained inference of observable behaviors and achieves over 85.0% accuracy in inferring user activity-related sensitive information, such as fingerprinting websites, apps, and streaming videos, and over 96.0% accuracy in user de-anonymization, gaze movement tracking, and virtual keystroke inference.

2606.10484 2026-06-10 cs.CR 新提交

AgentCanary: A Security Evaluation Framework for Autonomous AI Agents in Real Executable Environments

AgentCanary:真实可执行环境中自主AI智能体的安全评估框架

Peiyang Li, Songping Wang, Yi Huang, Yanhua Shi, Chenhao Zhang, Qi Li, Yueming Lyu, Caifeng Shan, Fengting Li, Chao Feng, Chuanqun Zhu, Liang Chen

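AI summary Proposes the AgentCanary framework, which uses an orthogonal risk taxonomy, a high-fidelity execution environment, and trajectory-based multi-dimensional evaluation to address fragmented risk coverage, static low-fidelity environments, and single-dimensional, coarse-grained metrics in existing security evaluations.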

详情
AI中文摘要

自主AI智能体推动了从对话到任务执行的转变,将安全故障从文本欺骗转向系统入侵。尽管安全评估对于主动风险预防至关重要,但先前的工作受到根本性瓶颈的限制,包括碎片化的风险覆盖、静态或低保真的执行环境以及单维度和粗粒度的评估指标。为了应对这些挑战,我们提出了AgentCanary,一个针对自主AI智能体的全面安全评估框架。AgentCanary通过三个贡献提供了系统性的解决方案。首先,全面的风险覆盖:我们引入了一个正交的Entry × Impact风险分类法,将对抗性影响如何进入智能体与其最终造成的危害解耦,并将其实例化为一个场景对齐的任务套件,涵盖实际的部署工作流。其次,高保真真实可执行环境:智能体不是通过静态问答或模拟工具响应,而是与真实工具交互,针对动态配置的任务工件进行操作,并在多步交互中保持持久状态,自然支持长期攻击评估。第三,轨迹基础的多维度评估:评估消耗完整的智能体轨迹,而非回复文本或单个工具调用,从而能够沿三个正交维度(结果安全性、安全意识和任务效用)进行分解评分。我们在AgentCanary上评估了一系列前沿模型,针对三个智能体框架中的多种已建立的对抗攻击方法。结果表明,当前的智能体往往无法识别其面临的攻击,特别是在技能受损、持久状态和长期执行攻击下,这为开发更可靠和安全的智能体系统提供了系统性的基准。

英文摘要

Autonomous AI agents have driven the transition from conversation to task execution, shifting security failures from textual deception to system compromise. Although security evaluation is crucial for proactive risk prevention, prior work is constrained by fundamental bottlenecks, including fragmented risk coverage, static or low-fidelity execution environments, and single-dimensional and coarse-grained assessment metrics. To address these challenges, we propose AgentCanary, a comprehensive security evaluation framework for autonomous AI agents. AgentCanary provides a systematic solution along three contributions. First, comprehensive risk coverage: we introduce an orthogonal Entry $\times$ Impact risk taxonomy that decouples how adversarial influence enters the agent from what harm it ultimately causes, and instantiate it as a scenario-aligned task suite spanning realistic deployment workflows. Second, a high-fidelity real executable environment: rather than static Q&A or mocked tool responses, agents interact with real tools against dynamically provisioned task artifacts, with persistent state across multi-step interactions that naturally supports long-horizon attack evaluation. Third, trajectory-grounded multi-dimensional evaluation: evaluation consumes the full agent trajectory rather than the reply text or a single tool call, enabling decomposed scoring along three orthogonal dimensions, Outcome Safety, Security Awareness, and Task Utility. We evaluate a broad set of frontier models on AgentCanary against multiple established adversarial attack methods across three agent frameworks. The results reveal that current agents often fail to recognize the attacks they face, particularly under compromised skills, persistent state, and long-horizon execution attacks, and provide a systematic baseline for developing more reliable and secure agent systems.

2606.10473 2026-06-10 cs.GR 新提交

AnisoLift: Anisotropic Latent Representations for Coarse Particle Liquid Enhancement

AnisoLift: 用于粗颗粒液体增强的各向异性潜在表示

Zhengqing Gao, Huaxi Huang, Runqi Lin, Yuanyuan Wang, Meng Li, Xi Zhou, Tongliang Liu, Mingming Gong, Xiao Sun

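AI summary Proposes the AnisoLift framework, which learns anisotropic ellipsoidal components for each coarse particle to capture the directional local structure of high-resolution flow without adding particles, improving the fidelity of coarse particle liquid simulation.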

详情
AI中文摘要

基于颗粒的液体模拟广泛应用于图形学和物理建模,但高分辨率展开仍然计算成本高昂。因此,许多方法旨在从粗颗粒模拟中恢复细尺度动力学和密集输运模式。然而,这些方法通常依赖于额外的颗粒生成,这仍然会带来相当大的计算开销并导致表示不佳。为此,我们提出了AnisoLift,一个结构化的潜在闭合框架,为每个粗颗粒增强可学习的各向异性椭球分量。这使得模型能够从底层高分辨率流中捕获方向局部结构,而无需引入额外颗粒。给定一个粗模拟,我们的模型预测颗粒状态的残差校正,使更新后的状态更接近对齐的高分辨率教师。我们的训练目标同时监督颗粒动力学和各向异性几何结构,鼓励物理一致性和结构连贯性。大量实验表明,我们的方法通过提高对完全解析流动行为的保真度来增强粗液体模拟。

英文摘要

Particle-based liquid simulation is widely used in graphics and physical modeling, but high-resolution rollouts remain computationally expensive. Consequently, many methods aim to recover fine-scale dynamics and dense transport patterns from coarse particle simulations. However, these methods typically rely on additional particle generation, which still incurs considerable computational overhead and leads to poor representation. To this end, we propose AnisoLift, a structured latent closure framework that augments each coarse particle with learnable anisotropic ellipsoidal components. This allows the model to capture directional local structure from the underlying high-resolution flow without introducing extra particles. Given a coarse simulation, our model predicts residual corrections to particle states to bring the updated state closer to the aligned high-resolution teacher. Our training objective jointly supervises particle dynamics and anisotropic geometric structure, encouraging both physical consistency and structural coherence. Extensive experiments show that our approach enhances coarse liquid simulations through improving fidelity to fully resolved flow behavior.

2606.10465 2026-06-10 cs.SE 新提交

MASTOR: A Multi-Agent Approach to Semantic Test Oracle Generation for RESTful APIs

MASTOR: 一种面向RESTful API的语义测试预言生成多智能体方法

Sida Deng, Rubing Huang, Zhenzhen Yang, Man Zhang, Xuan Xie, Rongcun Wang

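AI summary Proposes MASTOR, a multi-agent approach that analyzes source-code context to generate semantic test oracles, including status and field oracles and behavioral consistency oracles, reaching an average mutation score of 75.4% on 13 open-source projects.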

详情
AI中文摘要

现有的自动化RESTful API测试方法通常依赖简单检查(如HTTP状态码、模式符合性),不足以检测语义错误、业务逻辑违规和状态依赖的不一致性。为此,我们提出MASTOR,一种基于实现源代码为RESTful API生成语义测试预言的多智能体方法。MASTOR包含两个阶段:源码分析和预言生成。前者使用源码提取智能体,通过分析相关源文件的传递导入闭包,为每个端点操作构建源码上下文。后者在收集的上下文上采用两条并行的预言生成路径:单操作路径为每个操作生成状态和字段预言,多操作路径通过利用跨操作语义关联生成操作序列的行为一致性预言。两条路径均应用挑战者智能体审查,其中专门的审查者识别弱点并给出改进提示以指导针对性重新生成,随后进行预言规范化以过滤结构无效的预言。我们在来自WFD和PRAB数据集的13个开源RESTful API项目(296个操作,251,303行代码)基准上评估了MASTOR。MASTOR实现了75.4%的平均变异得分,生成了10,022个预言。这些预言通过ToJUnit和ToPostmanAssertify转换为可执行断言,并通过ToReadable转换为人类可读描述。在50个选定操作的基线比较中,MASTOR比直接提示高出30.1个百分点(69.9% vs. 39.8%),比SATORI高出49.4个百分点(69.9% vs. 20.5%)。

英文摘要

Existing automated RESTful API testing approaches commonly rely on simple checks (e.g., HTTP status codes, schema conformance), which are insufficient for detecting semantic faults, business logic violations, and state-dependent inconsistencies. To address this, we propose MASTOR, a Multi-Agent approach for generating Semantic Test Oracles for RESTful APIs based on implementation source code. MASTOR consists of two phases: source analysis and oracle generation. The former employs a source extraction agent to construct a source context for each endpoint operation by analyzing a transitive import closure of relevant source files. The latter employs two parallel oracle-generation paths over the collected contexts: a single-operation path producing status and field oracles per operation, and a multi-operation path generating behavioral consistency oracles for operation sequences by leveraging cross-operation semantic associations. Both paths apply a challenger-agent review, where a dedicated reviewer identifies weaknesses and issues improvement hints to guide targeted regeneration, followed by oracle normalization to filter out structurally invalid oracles. We evaluated MASTOR on a benchmark of 13 open-source RESTful API projects (296 operations, 251,303 lines of code) from the WFD and PRAB datasets. MASTOR achieved an average mutation score of 75.4%, generating 10,022 oracles. These oracles were translated into executable assertions via ToJUnit and ToPostmanAssertify, and into human-readable descriptions via ToReadable. In a baseline comparison on 50 selected operations, MASTOR outperformed Direct Prompting by 30.1 percentage points (69.9% vs. 39.8%) and SATORI by 49.4 percentage points (69.9% vs. 20.5%).

2606.10451 2026-06-10 cs.GT 新提交

Arbitrage-free Data Pricing

无套利数据定价

Yihang Wu, Zhengyu Jin, Yicheng Fu, Jinfei Liu, Kui Ren

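AI summary Studies optimal data pricing when buyers value data through Bayesian decision making, proposes an information pricing framework under arbitrage-freeness constraints, and characterizes arbitrage-freeness via Blackwell dominance.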

详情
AI中文摘要

受数据在广告、金融和机器学习等应用中价值上升的驱动,数据产品市场变得越来越重要。数据市场主要销售两种产品:数据集和机器学习模型。由于这些产品可以以可忽略的边际成本复制,卖家自然通过查询访问和带噪声的模型发布来对其进行版本化。版本化立即引发了一个套利问题:买家可能组合更便宜的购买,以更低的总价格恢复信息更丰富的产品。现有关于查询和模型定价的工作在买家价值被视为外生时研究无套利性,而关于信息销售的文献则从买家的决策问题中推导价值,但忽略了无套利性。因此,我们研究卖家的最优数据定价问题,其中买家通过贝叶斯决策评估数据价值,并且我们施加无套利约束。我们首先将查询和模型定价解释为信息定价的特例,并制定一般的无套利信息销售问题,展示计算难度,并给出基于McCormick松弛的分支定界算法。然后我们考虑阈值效用,其中买家当且仅当实验足够信息时才有正值。在此条件下,我们发现无套利性可以通过Blackwell优势来刻画,这反过来统一了查询定价\cite{deep2017design}和模型定价\cite{chen2019towards}的无套利条件。最后,我们刻画了受限查询和模型菜单下的收益最大化定价。

英文摘要

Driven by the rising value of data in applications such as advertising, finance, and machine learning, markets for data products have become increasingly important. Data markets mainly sell two kinds of products: datasets and machine learning models. Since these products can be replicated at negligible marginal cost, sellers naturally version them through query access and noisy model releases. Versioning immediately raises an arbitrage problem: a buyer may combine cheaper purchases and recover a more informative product at a lower total price. Existing work on query and model pricing studies arbitrage-freeness when buyer values are treated as exogenous, whereas the literature on selling information derives value from the buyer's decision problem but ignores arbitrage-freeness. Accordingly, we study the seller's optimal data pricing problem where buyers value data through Bayesian decision making and we impose arbitrage-freeness constraints. We first interpret query and model pricing as special cases of information pricing, and formulate the general arbitrage-free information selling problem, show the computational hardness and give a branch-and-bound algorithm based on McCormick relaxations. We then consider threshold utilities where buyers have a positive value if and only if the experiment is sufficiently informative. Under this condition, we find that the arbitrage-freeness can be characterized by Blackwell dominance, which in turn unifies the arbitrage-free conditions for query pricing \cite{deep2017design} and model pricing \cite{chen2019towards}. Finally, we characterize the revenue-maximizing pricing under restricted query and model menus.

2606.10446 2026-06-10 cs.GT cs.DS 新提交

Proportionality from Sampled Approvals

从抽样批准中实现比例性

Gregory Kehne

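AI summary Studies the sample complexity of achieving proportional representation from sampled ballots in multiwinner elections, proposes a new rule that lowers the sample complexity of the JR axiom to $\tilde O(k^4 \log \frac{m}{\delta})$, and proves an $\Omega(k^3)$ lower bound and a separation from the Chamberlin-Courant objective.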

Comments 44 pages, 9 figures

详情
AI中文摘要

为了确保多赢家选举中的代表性,需要多少选民输入?如果选民是从底层人群中随机抽取的,那么需要抽取多少次才能以高概率找到一个由$k$名候选人组成的比例委员会?满足正当代表(JR)比例性公理的标准多赢家投票规则的基于抽样的变体,在$m$名候选人上使用$\tilde O(k^5 \log \frac{m}{\delta})$个抽样批准选票,其中$\delta$是失败概率,$\tilde O$隐藏了$\mathrm{polylog}(k)$因子。我们提出了一种规则,其JR族比例委员会选择的样本复杂度为$\tilde O(k^{4}\log \frac{m}{\delta})$。这将JR的样本复杂度与相应的选民覆盖(Chamberlin-Courant)目标的自然加性逼近分离开来,后者需要$\Theta(k^5\log \frac{m}{\delta})$个样本。对于下界,我们给出了一族实例,其中$m, \frac{1}{\delta} \in \mathrm{poly}(k)$,为了识别一个JR委员会,需要$\Omega(k^3)$个抽样选票。我们还证明了对$\log m$的依赖是必要的。这个下界是通用的,也适用于排序选票中固体联盟的Hare比例性(PSC)。不幸的是,没有数量的抽样选票足以以高概率满足稍强的Droop JR和Droop PSC公理。但JR的温和放松需要更少的样本,我们评估的超越最坏情况领域和实际批准偏好也是如此。

英文摘要

How much voter input is necessary in order to ensure representation in multiwinner elections? If voters are randomly selected from an underlying population, how many draws are necessary to find a proportional committee of $k$ candidates, with high probability? Sample-based adaptations of standard multiwinner voting rules that satisfy the justified representation (JR) proportionality axiom use $\tilde O(k^5 \log \frac{m}{\delta})$ sampled approval ballots over $m$ candidates, where $\delta$ is a probability of failure and $\tilde O$ suppresses $\mathrm{polylog}(k)$ factors. We present a rule for which the sample complexity of JR-family proportional committee selection is $\tilde O(k^{4}\log \frac{m}{\delta})$. This separates the sample complexity of JR from that of the natural corresponding additive approximation to the voter coverage (Chamberlin-Courant) objective, which we show requires $\Theta(k^5\log \frac{m}{\delta})$ samples. For lower bounds, we present a family of instances with $m, \frac{1}{\delta} \in \mathrm{poly}(k)$ for which $\Omega(k^3)$ sampled ballots are necessary in order to identify a JR committee. We also show a dependence on $\log m$ is necessary. This lower bound is versatile, and also applies to Hare proportionality for solid coalitions (PSC) for ranked ballots. Unfortunately, no number of sampled ballots suffices to satisfy the slightly stronger Droop JR and Droop PSC axioms with high probability. But mild relaxations of JR require fewer samples, as do the beyond-worst-case domains and actual approval preferences we evaluate.
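To make the JR axiom mentioned above concrete, the following is a minimal sketch of a checker for the standard definition of justified representation on a fixed set of approval ballots. It is not the sampling rule proposed in the paper; the function name `violates_jr` and the set-of-strings ballot encoding are illustrative assumptions.

```python
from typing import Dict, List, Set


def violates_jr(ballots: List[Set[str]], committee: Set[str], k: int) -> bool:
    """Return True if `committee` fails justified representation (JR).

    JR fails when at least n/k voters approve a common candidate and
    none of those voters approves any member of the committee.
    """
    n = len(ballots)
    # Voters who approve nobody on the committee.
    unrepresented = [b for b in ballots if not (b & committee)]
    # Count, per candidate, how many unrepresented voters approve it.
    support: Dict[str, int] = {}
    for ballot in unrepresented:
        for candidate in ballot:
            support[candidate] = support.get(candidate, 0) + 1
    # count >= n/k, written without division to avoid rounding issues.
    return any(count * k >= n for count in support.values())
```

In a sampling setting, one would presumably run such a check on the drawn ballots only, which is where the question of how many draws are needed arises.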

2606.10434 2026-06-10 cs.HC cs.ET 新提交

Profiling cognitive offloading in LLM-mediated synthesis writing: Volume vs. content

LLM中介的合成写作中认知卸载的剖析:数量与内容

Oleksandra Poquet, Mani Shankar Nanduri, Maria Ximena Salinas Loyer, Matthias Stadler, Michael Sailer, Jelena Jovanovic

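AI summary Uses cluster analysis to compare two approaches to profiling cognitive offloading (volume-based and content-based), finding that content-based profiling reveals richer cognitive patterns and offers a new perspective on how LLMs redistribute cognitive activity.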

Comments Accepted to the Proceedings of the European Conference for Technology-Enhanced Learning 2026

详情
AI中文摘要

本研究比较了两种剖析学习者在合成写作任务中向LLM卸载认知活动的方法。借鉴Salomon的分布式认知以及Kintsch和van Dijk的文本理解模型,研究将向LLM的卸载操作化为两种方式:LLM使用的数量和卸载的内容,同时考虑先验知识。通过自定义界面与通用LLM交互的97名大学生的数据,使用k-means聚类进行分析。为了捕捉卸载的内容,他们的提示被解释为谁执行活动(主动或被动)以及在什么理解水平(局部或全局)。基于数量的剖析(k=4)主要通过先验知识区分学习者,数量与论文作者身份负相关。基于内容的剖析(k=5)揭示了定性不同的卸载模式,从词汇澄清到主动指导结构和生成,再到在两个水平上被动委托理解。这些模式反映了认知过程的不同碎片化,在学习策略、行为标记和论文作者身份上存在差异。结合卸载的数量和内容可以改进未来关于LLM使用如何重新分配认知活动及其对学习者影响的分析。

英文摘要

This study compares two approaches to profiling how learners offload cognitive activity to LLMs during a synthesis writing task. Drawing on Salomon's distributed cognition and the Kintsch and van Dijk model of text comprehension, the study operationalises offloading to an LLM in two ways: as a volume of LLM use and as content of what is offloaded, both along with prior knowledge. Data from 97 university students interacting with a general-purpose LLM via a custom interface were analysed using k-means clustering. To capture the content of offloading, their prompts were interpreted as to who performs the activity (active or passive) and at what level of comprehension (local or global). Volume-based profiling (k=4) differentiated learners primarily by prior knowledge, with volume negatively associated with essay authorship. Content-based profiling (k=5) revealed qualitatively distinct patterns of offloading, from vocabulary clarification to active direction of structuring and generation to passive delegation of comprehension at both levels. These patterns reflect different fragmentation of the cognitive process, with differences in learning strategies, behavioural markers, and essay authorship. Combining volume and content of offloading could improve future analyses on how LLM use redistributes cognitive activity and its effects on learners.

2606.10426 2026-06-10 eess.SY cs.SY 新提交

Dynamic Optimization of Virtual Inertia and Damping in Converter-Based Power Systems

基于变流器的电力系统中虚拟惯量和阻尼的动态优化

Jovan Krajacic, Maitraya Avadhut Desai, Ognjen Stanojev, Gabriela Hug

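AI summary Addresses the loss of inertia and damping caused by converter-interfaced renewables replacing synchronous machines by proposing a dynamic optimization algorithm that accounts for system stability, cost efficiency, and resilience to allocate virtual inertia and damping optimally, validated on a three-area system.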

详情
Journal ref
Proc. 2025 IEEE Kiel PowerTech, Kiel, Germany, 2025
AI中文摘要

向可持续电力系统的转型是通过用变流器接口的可再生能源替代传统同步发电机实现的。然而,由此导致的旋转惯量和调速器阻尼的缺失会引起显著的频率偏差,从而可能导致不稳定。本文重点研究由已建立的变流器控制方案激活的电力系统中虚拟惯量和阻尼的最优分配。为此,我们提出了一种新颖的动态优化算法,该算法考虑了系统稳定性、成本效率和韧性的性能指标。此外,我们的算法考虑了电力系统中扰动的大小和位置以实现最优分配。最后,我们在一个三区域系统上验证了我们的方法,并将我们的结果与基于$\mathcal{H}_2$系统范数的分配方法进行了比较。

英文摘要

The transition towards a sustainable power system is enabled by the replacement of conventional synchronous generators with converter-interfaced renewable energy sources. However, the resulting loss of rotational inertia and governor damping causes significant frequency deviations and can therefore cause instability. The focus of this paper is the optimal allocation of virtual inertia and damping in the power system activated by established converter control schemes. To this end, we propose a novel dynamic optimization algorithm that considers performance metrics for system stability, cost-efficiency, and resilience. In addition, our algorithm considers the magnitudes and locations of disturbances in the power system for the optimal allocation. Finally, we validate our approach on a three-area system and also compare our results with a $\mathcal{H}_2$ system-norm-based allocation approach.

2606.10417 2026-06-10 cs.SE 新提交

Beyond Coverage and Kill Scores: Empirically Measuring Test Suite Behavioural Gaps

超越覆盖率和变异分数:实证测量测试套件行为差距

Partha Protim Paul, Reid Holmes

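AI summary Proposes an approach that extracts expected method-level behaviours from natural language documentation and source code and maps them to test cases, finding that 17.5% of expected behaviours are untested and that behavioural gaps are not predicted by high coverage or mutation scores, which suggests behavioural coverage is an independent dimension of test adequacy.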

详情
AI中文摘要

传统的测试充分性度量衡量的是系统的实现,而不是系统是否遵循其预期行为。虽然开发者严重依赖代码覆盖率和变异测试来评估测试套件质量,但这些度量本质上是以实现为中心的,无法检测代码预期行为与实际行为之间的差距。不幸的是,目前还没有可靠的方法来检测这些差异;在本文中,我们介绍了一种自动化的概念验证方法来研究这些差距。该方法从自然语言文档和源代码中提取预期的方法级行为,将其映射到现有测试用例,并识别预期行为与已验证行为之间的差距。我们在十个流行的开源Java库(包含8,922个方法)上评估了该方法,提取了20,729个行为,精确率为93.1%。我们的实证分析保守估计,17.5%的检测到的预期行为完全未被测试,我们将其称为测试套件的行为差距。为了确定这些差距是否仅仅是人工测试的产物,我们评估了最先进的自动化测试生成器(EVOSUITE / ASTER),发现它们同样未能验证至少20.6% / 27.1%的检测到的预期行为。我们进一步证明,行为差距无法通过传统的结构度量来预测:大多数未测试的行为发生在已经具有高行覆盖的方法中,超过一半的行为差距存在于具有高变异分数的方法中。这些结果表明,行为覆盖作为测试套件充分性的一个独立维度,可以补充传统的结构度量。

英文摘要

Traditional test adequacy metrics measure a system's implementation, not whether it adheres to its expected behaviour. While developers rely heavily on code coverage and mutation testing to assess test suite quality, these metrics are fundamentally implementation-centric and cannot detect gaps between what the code is expected to do and what it actually does. Unfortunately, there has been no way to reliably detect these discrepancies; in this paper we introduce an automated proof-of-concept approach to investigate these gaps. The approach extracts expected method-level behaviours from natural language documentation and source code, maps them to existing test cases, and identifies gaps between expected and validated behaviours. We evaluate the approach across ten popular open-source Java libraries comprising 8,922 methods, extracting 20,729 behaviours with 93.1% precision. Our empirical analysis conservatively estimates that 17.5% of detected expected behaviours remain entirely untested, which we term as the test suite's behavioural gap. To determine if these gaps are merely an artifact of human-driven testing, we evaluate state-of-the-art automated test generators (EVOSUITE / ASTER), finding that they similarly fail to validate at least 20.6% / 27.1% of detected expected behaviours. We further demonstrate that behavioural gaps are not predicted by traditional structural metrics: the majority of untested behaviours occur in methods that already have high line coverage, and over half persist in methods with high mutation kill score. These results suggest behavioural coverage acts as an independent dimension of test suite adequacy that can complement traditional structural metrics.
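The gap metric described above can be illustrated with a small sketch: given the set of expected behaviours and the set of behaviours that some test validates, the behavioural gap is the untested fraction. The function name and the string identifiers for behaviours are hypothetical; the paper's extraction and mapping steps are not reproduced here.

```python
from typing import Iterable, List, Tuple


def behavioural_gap(expected: Iterable[str],
                    validated: Iterable[str]) -> Tuple[float, List[str]]:
    """Return (gap ratio, sorted untested behaviours).

    `expected` holds identifiers of behaviours a method is documented to have;
    `validated` holds identifiers of behaviours checked by at least one test.
    """
    expected_set = set(expected)
    # Behaviours that are expected but that no test validates.
    untested = expected_set - set(validated)
    ratio = len(untested) / len(expected_set) if expected_set else 0.0
    return ratio, sorted(untested)
```

Note that this ratio is computed independently of line coverage or mutation score, which is why a method can be fully covered and still show a gap.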

2606.10415 2026-06-10 cs.DC 新提交

RATrain: A Resource-Aware Training Runtime for Large Language Models on Bandwidth-Constrained Heterogeneous Supercomputing Platforms

RATrain:面向带宽受限异构超级计算平台的大语言模型资源感知训练运行时

Yao Lu, Shiqing Ma, Zhongzhi Luan, Gen Li, Jiaxing Qi, Bin Han, Hailong Yang, Depei Qian

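AI summary Targets the MT-3000 platform, which has an explicit memory hierarchy, limited DDR capacity, and constrained inter-cluster communication, and proposes RATrain, a resource-aware training runtime that uses training-state lifecycle scheduling and a resource-aware planner to achieve up to 1.35$\times$ speedup and 97.0% scaling efficiency.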

详情
AI中文摘要

生产级异构超级计算平台越来越多地用于托管大语言模型(LLM)训练工作负载。然而,现有的面向GPU的训练运行时通常依赖高带宽设备内存、快速互连和成熟的集体通信库,这使得它们难以直接适配MT-3000——一个具有显式内存层次结构、有限可用DDR容量和受限集群间通信的平台。本文提出RATrain,一种面向带宽受限异构超级计算平台上密集LLM的资源感知训练运行时。RATrain将标准的非交错1F1B训练形式化为训练状态生命周期调度问题,并在层级别和阶段本地粒度上调度梯度同步、参数更新、参数视图预取和激活恢复。RATrain进一步结合了MT-3000感知的执行后端(用于高效且可预测的FP16 GEMM、Attention Backward和显式数据移动)与资源感知规划器(在每计算集群20GB可用DDR约束下选择可行的训练配置)。我们在真实的MT-3000平台上实现了RATrain,并使用LLaMA-2-7B、Baichuan2-13B、Qwen2.5-32B和LLaMA-2-70B配置进行评估。结果表明,与MT-3000适配的GPU风格训练策略相比,RATrain实现了高达1.35倍的端到端加速。对于LLaMA-2-7B,RATrain扩展到1024个计算集群,达到112,790.55 tokens/s,并实现了97.0%的扩展效率。进一步的1.028B token正确性运行表明,RATrain保留了语义等价的Baseline-1F1B运行的损失轨迹,最大相对损失偏差为0.081%。

英文摘要

Production heterogeneous supercomputing platforms are increasingly used to host large language model (LLM) training workloads. However, existing GPU-oriented training runtimes typically rely on high-bandwidth device memory, fast interconnects, and mature collective communication libraries, making them difficult to directly adapt to MT-3000, a platform with an explicit memory hierarchy, limited usable DDR capacity, and constrained inter-cluster communication. This paper presents RATrain, a resource-aware training runtime for dense LLMs on bandwidth-constrained heterogeneous supercomputing platforms. RATrain formulates standard non-interleaved 1F1B training as a training-state lifecycle scheduling problem, and schedules gradient synchronization, parameter update, parameter-view prefetching, and activation recovery at layer-level and stage-local granularity. RATrain further combines an MT-3000-aware execution backend for efficient and predictable FP16 GEMM, Attention Backward, and explicit data movement with a resource-aware planner that selects feasible training configurations under the 20GB usable-DDR constraint per compute cluster. We implement RATrain on a real MT-3000 platform and evaluate it using LLaMA-2-7B, Baichuan2-13B, Qwen2.5-32B, and LLaMA-2-70B configurations. Results show that RATrain achieves up to 1.35$\times$ end-to-end speedup over MT-3000-adapted GPU-style training strategies. For LLaMA-2-7B, RATrain scales to 1024 compute clusters, reaches 112,790.55 tokens/s, and achieves 97.0% scaling efficiency. A further 1.028B-token correctness run shows that RATrain preserves the loss trajectory of a semantically equivalent Baseline-1F1B run, with a maximum relative loss deviation of 0.081%.

2606.10399 2026-06-10 cs.DS 新提交

Average-Case and Smoothed Near-Optimality for Color-Code Decoding

颜色码解码的平均情况和平滑近最优性

Daniel Gibney, Jackson Huffstutler

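AI summary Addresses the NP-hard problem of minimum-weight decoding for color codes with a block-based decoder that achieves a (1+ε)-approximation under natural noise models, together with a smoothed-analysis guarantee and exact decoding in the sparse regime.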

详情
AI中文摘要

二维颜色码的最小权重解码是NP难的(Walters和Turner 2026),这促使人们寻找超越最坏情况精确解码的近似保证。我们研究了一种用于三角颜色码格子的基于块的解码器。该解码器满足确定性加性保证 \(\lvert E_{\mathrm{alg}}\rvert \leq \operatorname{OPT}(S)+O(n/\tau)\),其中 \(n\) 是顶点数,\(\tau\) 是墙间距。我们证明,在自然噪声模型下,这种加性保证变成了近最优的乘性保证。对于恒定速率的独立同分布面噪声和恒定局部度数,选择 \(\tau=\Theta(\epsilon^{-1})\) 可在时间 \(n2^{O(\epsilon^{-1})}\) 内以概率 \(1-\exp(-\Omega(n))\) 给出 \((1+\epsilon)\)-近似。我们还证明了一个平滑类比:当任意对抗性错误模式被独立的恒定速率噪声扰动时,相同的近最优性保证成立。最后,在低概率区域 \(p=o(1/\log^2 n)\),综合征以高概率分解成小的活跃区域,允许独立的分量解码,并在时间 \(n2^{O((\log n)^{3/2})}\) 内产生精确的最小权重修正。这些结果表明,尽管在最坏情况下是困难的,颜色码解码在平均情况、平滑情况和稀疏区域中具有强保证。

英文摘要

Minimum-weight decoding for two-dimensional color codes is NP-hard (Walters and Turner 2026), motivating the search for approximation guarantees beyond worst-case exact decoding. We study a block-based decoder for triangular color-code lattices. The decoder satisfies the deterministic additive guarantee \(\lvert E_{\mathrm{alg}}\rvert \leq \operatorname{OPT}(S)+O(n/\tau)\), where \(n\) is the number of vertices and \(\tau\) is the wall spacing. We show that this additive guarantee becomes a near-optimal multiplicative guarantee under natural noise models. For constant-rate i.i.d. face noise and constant local degree, choosing \(\tau=\Theta(\epsilon^{-1})\) gives a \((1+\epsilon)\)-approximation with probability \(1-\exp(-\Omega(n))\), in time \(n2^{O(\epsilon^{-1})}\). We also prove a smoothed analogue: the same near-optimality guarantee holds when an arbitrary adversarial error pattern is perturbed by independent constant-rate noise. Finally, in the low-probability regime \(p=o(1/\log^2 n)\), the syndrome decomposes into small active regions with high probability, allowing independent component-wise decoding and yielding an exact minimum-weight correction in time \(n2^{O((\log n)^{3/2})}\). These results show that, despite worst-case hardness, color-code decoding admits strong average-case, smoothed, and sparse-regime guarantees.
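One plausible way to read the step from the additive to the multiplicative guarantee is sketched below. It assumes, beyond what the abstract states, that under constant-rate noise the optimum is linear in \(n\) with high probability, i.e. \(\operatorname{OPT}(S) \geq c\,n\) for some constant \(c>0\), and it writes the additive term as \(C\,n/\tau\) for a constant \(C\); both constants are illustrative.

```latex
\begin{align*}
\lvert E_{\mathrm{alg}}\rvert
  &\leq \operatorname{OPT}(S) + \frac{C\,n}{\tau}
   \leq \operatorname{OPT}(S) + \frac{C}{c\,\tau}\,\operatorname{OPT}(S)
   = \Bigl(1 + \frac{C}{c\,\tau}\Bigr)\operatorname{OPT}(S), \\
\tau &\geq \frac{C}{c\,\epsilon}
   \;\Longrightarrow\;
   \lvert E_{\mathrm{alg}}\rvert \leq (1+\epsilon)\,\operatorname{OPT}(S).
\end{align*}
```

Under these assumptions a wall spacing of order \(\epsilon^{-1}\) suffices, which matches the stated choice \(\tau=\Theta(\epsilon^{-1})\).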

2606.10375 2026-06-10 cs.IR 新提交

SIDInspector: A Mapping-First Diagnostic Resource for Semantic-ID Tokenizers

SIDInspector:面向语义ID分词器的映射优先诊断资源

Jiandong Ding, Heng Chang, Huijie Qin, Tianying Liu

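AI summary Presents SIDInspector, a mapping-first diagnostic resource for Semantic-ID tokenizer artifacts that uses an adapter contract and mapping-level probes to examine utilization, aliasing, neighborhood alignment, and related issues; contrasting GRID and ReSID/GAOQ on 23,742 Musical items, it finds that prefix alignment is a candidate-exposure signal.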

Comments Submitted to CIKM 2026 Resource Track

详情
AI中文摘要

语义ID(SID)分词器越来越多地被作为独立工件在生成式推荐中重用:导出的项到代码映射成为后续序列生成器必须使用的地址空间。这些映射很少带有通用的检查接口,因此覆盖缺口、全码别名、行为弱前缀、尾部压缩和前缀扇出等问题通常只有在下游训练后才会被发现。我们提出SIDInspector,一种针对SID分词器工件的映射优先诊断资源。SIDInspector在项映射、元数据、交互和可选的生成器轨迹上定义了一个小的适配器契约;验证该契约;并报告利用、别名、邻域对齐、流行度分配和结构成本等映射级探针,同时包含时间变化和生成器轨迹的钩子。SIDInspector在下游排行榜得分之前报告可检查的工件概况。发布的资源涵盖四种分词器工件线:在23,742个音乐项上的同项GRID/RQ-KMeans风格与ReSID/GAOQ对比,以及发布的LETTER和LC-Rec项索引工件。在音乐对比中,GRID风格的特征文本导出有3,749个唯一全码和0.977的全码别名率,而ReSID/GAOQ在其导出映射中无别名。然而,最强的前缀-共现对齐来自确定性类别前缀控制,而非任一学习导出行(0.447对比0.154和0.055–0.080),表明可寻址性和行为有意义的前缀应分开检查。跨域、固定重排序器和机制探针检查支持相同的诊断方向:前缀对齐是候选曝光信号,而最终排序质量仍是下游模型问题。

英文摘要

Semantic-ID (SID) tokenizers are increasingly reused as standalone artifacts in generative recommendation: an exported item-to-code mapping becomes the address space that a later sequence generator must use. These mappings rarely come with a common inspection interface, so coverage gaps, full-code aliasing, behaviorally weak prefixes, tail compression, and prefix fan-out are often found only after downstream training. We present SIDInspector, a mapping-first diagnostic resource for SID tokenizer artifacts. SIDInspector defines a small adapter contract over item mappings, metadata, interactions, and optional generator traces; validates the contract; and reports mapping-level probes for utilization, aliasing, neighborhood alignment, popularity allocation, and structural cost, with hooks for temporal churn and generator traces. SIDInspector reports inspectable artifact profiles before downstream leaderboard scores. The released resource covers four tokenizer artifact lines: a same-item GRID/RQ-KMeans-style and ReSID/GAOQ contrast on 23,742 Musical items, plus released LETTER and LC-Rec item-index artifacts. In the Musical contrast, the GRID-style feature-text export has 3,749 unique full codes and a 0.977 full-code aliasing rate, while ReSID/GAOQ is aliasing-free in its exported mapping. Yet the strongest prefix–co-occurrence alignment comes from a deterministic category-prefix control, not from either learned export row (0.447 versus 0.154 and 0.055–0.080), showing that addressability and behaviorally meaningful prefixes should be inspected separately. Cross-domain, fixed-reranker, and mechanism-probe checks support the same diagnostic direction: prefix alignment is a candidate-exposure signal, while final ranking quality remains a downstream model question.
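As an illustration of what a mapping-level aliasing probe might compute, here is a minimal sketch over an exported item-to-code mapping. It does not use the released resource's interface; the function name, the dictionary layout, and the definition of the aliasing rate (the share of items whose full code is also assigned to at least one other item) are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def aliasing_profile(item_to_code: Dict[Hashable, Sequence[int]]) -> Dict[str, float]:
    """Summarize full-code collisions in an item -> code-tuple mapping."""
    # Count how many items share each full code.
    code_counts = Counter(tuple(code) for code in item_to_code.values())
    n_items = len(item_to_code)
    # Items whose full code is shared with at least one other item.
    aliased_items = sum(count for count in code_counts.values() if count > 1)
    return {
        "items": n_items,
        "unique_full_codes": len(code_counts),
        "aliasing_rate": aliased_items / n_items if n_items else 0.0,
    }
```

A probe of this kind needs only the exported mapping, so it can run before any downstream generator is trained.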

2606.10330 2026-06-10 cs.GT cs.CY stat.AP 新提交

The Power of Altruism in Sticker Economics: Generosity Minimizes Collective Costs and Overprotective Norms Fuel Inefficiency

利他主义在贴纸经济学中的力量:慷慨最小化集体成本,过度保护规范导致低效率

Luana Ferraz Alvarenga, Caetano Alvarenga Costa, César Rennó-Costa

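AI summary Uses agent-based modeling and Monte Carlo simulation to study how community norms affect the collective efficiency of FIFA World Cup sticker collecting, finding that overprotective strategies raise costs while generous strategies improve network liquidity and substantially reduce the impact of bad luck.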

详情
AI中文摘要

收集FIFA世界杯贴纸册呈现了一个经典的公共物品和集体行动困境,其中独自完成收集效率极低。为了评估本地社区规范如何塑造集体效率,我们使用基于智能体的建模和蒙特卡洛模拟,参数来自巴西纳塔尔交换聚会的实证现场观察。反映赛事最近扩军,Panini 2026专辑包含980张独立贴纸,包括68张金属特殊贴纸。我们对比标准基准经济(1:2特殊对普通交换比率)与过度保护严格策略(独家特殊对特殊交易)和利他慷慨策略(高级玩家放弃所需重复以帮助同伴)。我们的发现表明,过度保护规则困住流动性并导致网络范围的低效率。严格策略使中位数完成成本增加10包,并严重惩罚最不幸的5%收集者,在大城市增加20包,在小社区增加30包。相反,广泛慷慨优化网络流动性并显著压缩坏运气的长尾。引入慷慨策略使第5百分位的所需购买量在大规模配置中减少90包,在较小集群中减少130包。此外,广泛利他触发强烈的功能耦合,有效同步网络中的完成率。这项研究表明,僵化的保护规范降低集体福利,而慷慨成功缓解包抽方差,将昂贵的孤立爱好转变为有韧性、高效的公共物品。

英文摘要

Collecting the FIFA World Cup sticker album presents a classic public-goods and collective-action dilemma, in which completing a collection on one's own is highly inefficient. To evaluate how localized community norms shape collective efficiency, we use agent-based modeling and Monte Carlo simulations, parameterized with empirical field observations from exchange meetups in Natal, Brazil. Reflecting the tournament's recent expansion, the Panini 2026 album features 980 individual stickers, including 68 metallic specials. We contrast a standard baseline economy (1:2 special-to-normal exchange ratio) with an overprotective, strict strategy (exclusive special-for-special trading) and an altruistic, generous strategy (in which advanced players surrender needed duplicates to assist peers). Our findings reveal that overprotective rules trap liquidity and drive network-wide inefficiency. The strict strategy increases median completion costs by 10 packs and severely penalizes the least fortunate 5% of collectors, adding 20 packs in large cities and 30 in small communities. Conversely, widespread generosity optimizes network liquidity and dramatically compresses the long tail of bad luck. Introducing the generous strategy reduces required purchases for the 5th percentile by 90 packs in large-scale configurations and 130 packs in smaller clusters. Furthermore, widespread altruism triggers a strong functional coupling that effectively synchronizes completion rates across the network. This study demonstrates that while rigid, protective norms degrade collective welfare, generosity successfully mitigates pack-draw variance, transforming an expensive, isolated hobby into a resilient, highly efficient public good.
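To show why collecting alone is so costly, the sketch below runs a Monte Carlo simulation of a single collector who never trades. It is only the no-exchange baseline, not the paper's agent-based model: the pack size of 7, uniform sticker probabilities, no duplicates within a pack, and the function names are all assumptions that do not come from the abstract.

```python
import random
from typing import Optional, Tuple


def packs_to_complete(album_size: int = 980, pack_size: int = 7,
                      rng: Optional[random.Random] = None) -> int:
    """Count packs a lone collector buys before the album is complete."""
    if rng is None:
        rng = random.Random()
    missing = set(range(album_size))
    packs = 0
    while missing:
        packs += 1
        # Assumed: each pack holds `pack_size` distinct, uniformly drawn stickers.
        for sticker in rng.sample(range(album_size), pack_size):
            missing.discard(sticker)
    return packs


def simulate(trials: int = 200, seed: int = 0, **album_kwargs) -> Tuple[int, int]:
    """Return (median, 95th-percentile) pack counts over independent trials."""
    rng = random.Random(seed)
    results = sorted(packs_to_complete(rng=rng, **album_kwargs)
                     for _ in range(trials))
    median = results[len(results) // 2]
    p95 = results[int(0.95 * (len(results) - 1))]
    return median, p95
```

Calling `simulate(trials=200, seed=0)` gives a median and a tail estimate for the solo case; trading rules such as the strict or generous strategies would have to be layered on top with multiple interacting collectors.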

2606.10325 2026-06-10 cs.MM cs.HC 新提交

Design and Implementation of a Real-time Multi-site Immersive Learning System Using Photon Fusion

基于Photon Fusion的实时多站点沉浸式学习系统的设计与实现

Iwai Wataru, Duc V. Nguyen

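AI summary Proposes a VR immersive learning system based on Photon Fusion that supports real-time multi-user synchronization and interaction; the evaluation shows stable communication, good usability, and minimal VR sickness.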

详情
AI中文摘要

在本文中,我们开发了一个基于虚拟现实的沉浸式学习环境,允许教师使用Photon Fusion在虚拟空间中进行授课。所提出的系统允许教师和学生无论实际物理位置如何,都能出现在同一虚拟空间中。教师可以与学生进行实时口头交流,并与3D学习材料进行交互。通过采用Photon Fusion,系统实现了多玩家之间的稳定实时通信和同步。评估结果表明,所提出的系统提供了稳定的通信性能、良好的可用性和最小的VR晕动症,证实了其作为沉浸式学习环境的有效性。

英文摘要

In this paper, we develop a Virtual Reality-based immersive learning environment that allows teachers to conduct a lesson in a virtual space using Photon Fusion. The proposed system allows teachers and students to be present in the same virtual space regardless of their actual physical locations. The teachers can verbally communicate with students in real-time, interacting with 3D learning materials. By adopting Photon Fusion, the system achieves stable real-time communication and synchronization among multiple players. Evaluation results demonstrate that the proposed system provides stable communication performance, good usability, and minimal VR sickness, confirming its effectiveness as an immersive learning environment.

2606.10323 2026-06-10 cs.CR 新提交

Semantic Multi-Agent Intrusion Detection for IoT: Zero-Day and Adversarial Threats with Risk-Aware Reasoning

面向物联网的语义多智能体入侵检测:零日与对抗威胁的风险感知推理

Saeid Jamshidi

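AI summary Proposes a semantic multi-agent intrusion detection system that combines semantic embeddings with multi-stage probabilistic decision fusion, in which four cooperating agents detect zero-day and adversarial attacks, reaching 95.9% detection accuracy on multiple real-world IoT datasets.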

详情
AI中文摘要

物联网设备的快速普及实现了前所未有的自动化和连接性,但也显著增加了攻击面,使网络面临包括零日和对抗入侵在内的复杂网络威胁。传统入侵检测系统难以泛化到未知攻击,通常需要大量计算资源,且缺乏可解释性,尤其是在资源受限和异构的物联网网络中。最近的进展,包括深度学习、开放集检测和基于大语言模型的语义推理,解决了其中一些挑战,但通常专注于零日和对抗威胁,很少将语义推理与多智能体系统结合。为克服这些限制,我们提出了一种语义多智能体入侵检测系统,集成了四个专门智能体(例如Scout、Mutator、Auditor和Arbiter),利用语义嵌入和多阶段概率决策融合。Scout从语义嵌入中诱导结构化假设;Mutator生成对抗性约束变体;Auditor评估一致性并过滤不可靠输出;Arbiter产生可解释的、风险感知的警报。在多个真实物联网数据集上的大量实验表明,所提系统实现了95.9%的整体检测准确率,将误报率降低至6.8%,将零日检测率提升至87.9%,并保持了适用于边缘部署的计算效率。

英文摘要

The rapid proliferation of Internet of Things (IoT) devices has enabled unprecedented automation and connectivity, but it has also substantially increased the attack surface, exposing networks to sophisticated cyber threats, including zero-day and adversarial intrusions. Traditional Intrusion Detection Systems (IDS) struggle to generalize to unseen attacks, often require substantial computational resources, and lack interpretability, particularly in resource-constrained and heterogeneous IoT networks. Recent advances, including Deep Learning (DL), open-set detection, and Large Language Model (LLM)-based semantic reasoning, address some of these challenges but typically focus on zero-day and adversarial threats and rarely combine semantic reasoning with multi-agent systems. To overcome these limitations, we propose a semantic multi-agent IDS that integrates four specialized agents (Scout, Mutator, Auditor, and Arbiter) that leverage semantic embeddings and multi-stage probabilistic decision fusion. The Scout induces structured hypotheses from semantic embeddings; the Mutator generates adversarially constrained variants; the Auditor evaluates consistency and filters unreliable outputs; and the Arbiter produces interpretable, risk-aware alerts. Extensive experiments across multiple real-world IoT datasets demonstrate that the proposed system achieves 95.9% overall detection accuracy, reduces false-positive rates to 6.8%, improves zero-day detection to 87.9%, and maintains computational efficiency suitable for edge deployment.

2606.10320 2026-06-10 cs.SE 新提交

Communication Skills in Software Engineering: A Multivocal Review

软件工程中的沟通技能:一项多声部综述

Dannilo Rabelo, Deisy Peres, Emmanuel Dias, Thayssa Rocha, Enne Rebeca de Freitas, Kiev Gama, Gustavo Pinto

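AI summary Through a multivocal literature review, finds that academic and gray literature strongly agree in treating communication as a core competency but differ in emphasis: academia on conceptualization and empirical evidence, gray literature on industry practice.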

Comments WASHES 2026

详情
AI中文摘要

沟通技能在软件工程中日益被认识到至关重要,然而关于它们的讨论在学术和灰色文献中仍然分散。这种分散是有问题的,因为它限制了对沟通如何在教育和专业环境中被重视、教授和应用的更广泛理解。通过一项多声部文献综述,我们发现学术和灰色来源在将沟通视为核心能力方面存在强烈趋同,同时也识别出侧重点上的差异,学术界侧重于概念化和实证证据,而灰色文献强调实际后果和新兴行业实践。

英文摘要

Communication skills are increasingly recognized as essential in Software Engineering, yet discussions about them remain fragmented across academic and gray literature. This fragmentation is problematic because it limits a broader understanding of how communication is valued, taught, and applied in both educational and professional settings. Through a multivocal literature review, we found strong convergence between academic and gray sources in treating communication as a core competency, while also identifying differences in emphasis, with academia focusing on conceptualization and empirical evidence and gray literature stressing practical consequences and emerging industry practices.