arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.17470 2026-06-17 cs.CY cs.HC New submission

Self-Efficacy and Favorability Shape Learning from Tutoring Systems and Paper Practice

Xinfei Cen, Vincent Aleven, Kenneth R. Koedinger, Conrad Borchers, Paulo F. Carvalho

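AI summary: Using a counterbalanced within-subject design, the study finds that students with lower baseline self-efficacy achieved greater learning gains regardless of practice format. Among these students, greater favorability toward the tutor was associated with greater gains during tutor practice, while the pattern differed for paper practice. ITS-based practice did not significantly raise post-training self-efficacy relative to paper.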

Comments Full research paper accepted at EC-TEL 2026

Abstract

Motivational factors such as self-efficacy and how favorably students feel toward practice play a crucial role in shaping learning, particularly in technology-supported environments. Yet, educational interventions often overlook how these factors interact with practice format. This paper examines the influence of self-efficacy and favorability on learning outcomes across two common practice formats: paper-based and system-based tutoring practice. Using a counterbalanced within-subject design with matched problem sets, we isolate the effect of practice format while modeling motivational differences. Results indicate that students with lower baseline self-efficacy achieved greater learning gains regardless of practice format. Among students with lower baseline self-efficacy, greater favorability toward the tutor was associated with greater learning gains during tutor practice, whereas the pattern differed in paper-based practice. Intelligent Tutoring System (ITS)-based practice did not significantly improve post-training self-efficacy relative to paper-based methods. These findings underscore the potential value of tailoring practice format to students' motivational profiles, as the benefits of tutor- and paper-based practice varied with baseline self-efficacy and favorability. They lay the groundwork for future research on how instructional formats can be aligned more effectively with learners' motivational needs.

2606.17468 2026-06-17 cs.IR New submission

RSRank: Learning Relevance from Representational Shifts

Archit Gupta, Sai Sundaresan, Debabrata Mahapatra

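AI summary: Rerankers in RAG systems depend on heuristic thresholds and on language-model logit signals. The paper proposes a relevance signal based on representational shift (RS) and a lightweight training framework that learns to map RS to calibrated relevance scores, filters irrelevant content at a zero threshold, and improves on existing rerankers.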

Comments Under Peer Review

Abstract

As enterprises deploy RAG-based systems to provide grounded responses to user queries, reranking has become a critical component for the final filtering step that separates relevant from distracting or irrelevant documents. Existing rerankers often rely on heuristic thresholds to achieve optimal filtering. Moreover, for relevance scoring, state-of-the-art methods use a language model's logit signals, which are designed for next-token prediction, not for assessing relevance. To address these limitations, we identify a principled signal for relevance: the representational shift (RS) induced in a query's internal state when conditioned on a document. We observe that the alignment between (a) RS induced by a candidate document and (b) RS induced by an oracle document-set provides a robust indicator of relevance. Building on this insight, we introduce a lightweight training framework that learns projections mapping RS to calibrated relevance scores. Our training objectives naturally filter irrelevant content at a zero threshold, reducing dependence on heuristic tuning. Across diverse retrieval datasets, our method delivers gains over SOTA rerankers.
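The alignment idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the hidden states are made-up 3-dimensional vectors, the shift is taken as a plain vector difference, and cosine similarity stands in for the learned projections that the paper trains. The oracle document set would normally be available only at training time.

```python
import math

def representational_shift(h_query, h_query_given_doc):
    """RS: change in the query's internal state when conditioned on a document."""
    return [b - a for a, b in zip(h_query, h_query_given_doc)]

def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def filter_by_alignment(h_query, h_oracle, candidates):
    """Keep candidates whose RS aligns positively with the oracle RS.

    The cut-off is fixed at zero, so no threshold has to be tuned.
    """
    rs_oracle = representational_shift(h_query, h_oracle)
    scored = []
    for name, h_doc in candidates.items():
        score = cosine(representational_shift(h_query, h_doc), rs_oracle)
        if score > 0.0:
            scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical hidden states (not taken from the paper).
h_query = [0.0, 0.0, 0.0]
h_oracle = [1.0, 1.0, 0.0]
candidates = {
    "relevant": [0.9, 1.1, 0.1],
    "distractor": [-1.0, -0.5, 0.2],
}
ranked = filter_by_alignment(h_query, h_oracle, candidates)
print(ranked)
```

In this toy setup the distractor shifts the query state away from the oracle direction, so its score is negative and it is dropped without any tuned cut-off.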

2606.17458 2026-06-17 cs.CE New submission

ICBCBench: An Industry Consortium Benchmark for Financial Deep Research

Weiya Li, Zhiwei Tang, Yizhou He, Chenghao Wang, Liang Feng, Xiao Sun, Dongrui Liu, Zichen Wen, Hu Wei, Jinghang Wang, Yi Luo, Li Guo, Linfeng Zhang

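AI summary: To address the lack of evaluation standards for deep research agents in finance, the paper proposes the ICBCBench benchmark, which pairs objective tasks with subjective report evaluation in a dual-track paradigm and reveals substantial gaps in current models' complex reasoning, factual grounding, and report quality.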

Comments 33 pages, 14 figures. Preprint. Under review

Abstract

With the rapid advancement of Deep Research Agents in knowledge-intensive domains such as finance, establishing reliable and domain-aligned evaluation standards remains a critical challenge. Existing benchmarks focus on either closed-ended question answering or open-ended report evaluation, failing to jointly capture retrieval-reasoning accuracy and end-to-end research quality required in real-world workflows. We introduce ICBCBench, a consortium-driven benchmark for financial deep research, developed in collaboration with domain experts from a broad range of financial institutions and academia, involving over 50 experts across more than 40 organizations. It adopts a dual-track paradigm integrating objective tasks with verifiable answers and subjective long-form report evaluation, enabling complementary assessment of retrieval-reasoning accuracy and end-to-end report quality in terms of expert alignment, citation consistency, and source quality. Experiments on state-of-the-art DRAs and large language models reveal substantial gaps in complex reasoning, factual grounding, and report quality, highlighting the challenges of achieving industry-level performance. Our dataset and evaluation framework are available at https://github.com/DeepFin-Intelligence/ICBCBench.

2606.17421 2026-06-17 cs.CR New submission

Bifrost: Hybrid TEE-FHE Inference for Privacy-Preserving Transformer and LLM Serving

Chenghao Chen, Kailun Qin, Xiaolin Zhang, Chi Zhang, Dawu Gu

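AI summary: Proposes Bifrost, a hybrid architecture in which a CPU TEE handles non-linear operators and state refresh while FHE-encrypted linear layers are delegated to the accelerator, with reported latency reductions between about 9x and 53x across the evaluated settings.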

Abstract

Cloud-hosted transformer and large language model (LLM) inference creates a direct confidentiality problem: user prompts may contain sensitive code, business data, personal information, or regulated documents, yet remote serving exposes intermediate state to the cloud software stack and accelerator runtime. Fully homomorphic encryption (FHE) keeps accelerator-side execution ciphertext-only, but end-to-end LLM inference remains expensive because linear layers are interleaved with non-linear, cache-state, and refresh-sensitive operators. CPU trusted execution environments (TEEs) can execute those operators natively, but a CPU TEE alone does not define how an untrusted accelerator should participate. We present Bifrost, a hybrid TEE-FHE serving architecture in which secrets are provisioned only to an attested CPU TEE, while the accelerator, device memory, driver/runtime stack, and host software remain outside the trusted computing base. Bifrost uses FHE as a secure delegation mechanism for projection and feed-forward linear layers on accelerator-backed CKKS, while non-linear operators, attention-side control logic, KV-state transitions, and decrypt-then-encrypt refresh execute inside the CPU TEE. Bifrost+ further applies a prefill/decode split: prompt-side KV state is built inside the CPU TEE, and only decode-side state enters the hybrid ciphertext path. In an estimator-style comparison matching Euston's methodology, Bifrost reduces projected latency by 9.25x on GPT-2 (1.5B) and 9.91x on LLaMA 3 (8B). In direct CKKS/FHE deployments, Bifrost+ reduces TTFT by 14.6-45.8x on GPT-2 (124M) and 15.3-53.4x on Qwen3 (0.6B). The systems lesson is selective encrypted execution: use FHE only where ciphertext-only accelerator delegation is required, and keep non-linear, refresh, and prompt-side work inside the CPU TEE.

2606.17415 2026-06-17 cs.GT New submission

Pure or Unstable: A Generic Dichotomy for Strong Stackelberg Commitments

Kamil Bulinski, Lang White, Hung Nguyen

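AI summary: Studies the stability of the strong Stackelberg equilibrium in finite leader-follower games and proves that, when both players' utilities are sampled generically, the unique optimal commitment is with probability one either pure and stable or mixed and unstable.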

Comments 19 pages

Abstract

We study the robustness of the Strong Stackelberg Equilibrium (SSE) in finite leader--follower games when the follower's best-response correspondence is set-valued. While optimistic tie-breaking (in the leader's favor) is commonly adopted, it can hinge on knife-edge indifferences. We formalize a stability notion: an SSE is unstable if, at the leader's committed strategy, the follower has an alternative best response that strictly reduces the leader's payoff. Our main results establish a sharp generic dichotomy. Fixing the follower's utility and sampling the leader's utility from any continuous distribution, with probability one the optimal Stackelberg commitment is unique and is either (i) pure, or (ii) mixed and unstable. When both players' utilities are sampled generically, this strengthens to: with probability one, the unique optimal commitment is either pure and stable or mixed and unstable. These theorems complement the classic generic-value result of von Stengel and Zamir by showing that even when optimistic and pessimistic leader values coincide generically, the strategy-level SSE prediction is generically fragile whenever optimality requires genuine randomization. We further apply this perspective to Stackelberg satisfaction games, disproving a conjecture from prior work via counterexamples and identifying conditions under which it nonetheless holds.
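The stability notion in this abstract can be checked mechanically for a given commitment. The sketch below is a minimal illustration, not the paper's construction: the 2x2 payoff matrices are a made-up example, the function names are hypothetical, and the code only tests a commitment that is handed to it rather than computing the optimal one. Exact rational arithmetic is used so that follower ties are detected reliably.

```python
from fractions import Fraction

def expected(utilities, x, j):
    """Expected utility when the leader mixes with x and the follower plays column j."""
    return sum(x[i] * utilities[i][j] for i in range(len(x)))

def best_responses(follower_u, x):
    """All follower actions that maximize the follower's expected utility against x."""
    n_cols = len(follower_u[0])
    values = [expected(follower_u, x, j) for j in range(n_cols)]
    top = max(values)
    return [j for j in range(n_cols) if values[j] == top]

def is_unstable(leader_u, follower_u, x):
    """True if some follower best response gives the leader strictly less
    than the optimistic (leader-favoring) tie-break does."""
    payoffs = [expected(leader_u, x, j) for j in best_responses(follower_u, x)]
    return min(payoffs) < max(payoffs)

# Hypothetical game: rows are leader actions, columns are follower actions.
leader_u = [[2, 4], [1, 3]]
follower_u = [[1, 0], [0, 1]]

even_mix = (Fraction(1, 2), Fraction(1, 2))
print(is_unstable(leader_u, follower_u, even_mix))  # mixed commitment, follower is indifferent
print(is_unstable(leader_u, follower_u, (1, 0)))    # pure commitment, unique best response
```

In this toy game the even mix is the leader's best commitment under optimistic tie-breaking (payoff 7/2), but the follower is exactly indifferent there and the other best response leaves the leader with 3/2, so the commitment is mixed and unstable in the abstract's sense.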

2606.17411 2026-06-17 cs.SI New submission

Sender-Receiver Community Detection in Directed Networks via Node-Role-Constrained Edge Clustering

Duy Hieu Do

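AI summary: Proposes the TT-SR framework, which assigns each node a sender role and a receiver role and selects among refined role assignments with a two-tier rule, improving directed edge-community recovery while remaining interpretable.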

Comments Preprint, 25 pages

Abstract

Directed community detection is challenging because edge directions encode asymmetric source-target relations. Most directed modularity and random-walk methods assign one label to each vertex, whereas recent bimodularity-based methods cluster directed edges more freely. We propose TT-SR, a Two-Tier Sender-Receiver framework that lies between these two viewpoints. Each vertex is assigned a sender role and a receiver role, and each directed edge receives the type induced by the sender role of its source and the receiver role of its target. Thus, TT-SR is more expressive than one-label vertex clustering while remaining more interpretable than unrestricted edge clustering. The method generates candidate sender-receiver assignments from count-residual, stationary-flow, degree-corrected, and order-score views. The candidates are refined by local role updates and selected by a two-tier rule: a degree-corrected profile score provides the primary structural criterion, while Bernoulli density and order-flow scores are used only as secondary ranking signals. We justify the main spectral views through sender-receiver modularity relaxations and interpret the degree-corrected score as a likelihood-based residual comparison. Experiments on pathway-type, co-block, and ordered-flow synthetic benchmarks show that TT-SR achieves the strongest or essentially tied strongest edge-community recovery across three scale settings. The gains are most pronounced on degree-corrected co-block and ordered-flow graphs. Real-network diagnostics further indicate that TT-SR aligns well with Email-Eu-core metadata and extracts strong sender-receiver bicommunity summaries on unlabeled directed networks.
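The way node roles induce edge types can be sketched in a few lines. This covers only the final labelling step described in the abstract, assuming the sender and receiver roles are already given; the candidate generation, refinement, and two-tier selection are not shown, and the graph and role values are made up.

```python
from collections import defaultdict

def edge_types(edges, sender_role, receiver_role):
    """Group directed edges by (sender role of source, receiver role of target)."""
    groups = defaultdict(list)
    for source, target in edges:
        groups[(sender_role[source], receiver_role[target])].append((source, target))
    return dict(groups)

# Hypothetical directed graph and role assignment.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
sender_role = {"a": 0, "b": 0, "c": 1}
receiver_role = {"a": 1, "b": 0, "c": 0}

types = edge_types(edges, sender_role, receiver_role)
for edge_type, members in sorted(types.items()):
    print(edge_type, members)
```

Because every edge type is a pair of node roles, the number of edge communities is bounded by the product of the two role counts, which is what keeps the clustering more constrained than free edge clustering.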

2606.17390 2026-06-17 cs.CE New submission

A Differentiable GPU-Accelerated Finite Element Framework for Inverse Characterization of Finite-Strain Anisotropic Plasticity

Deepak Sharma, Itzel Salgado, Lu Huang, Hui-Ping Wang, Jian Cao

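AI summary: Proposes a JAX-based, differentiable, GPU-accelerated finite element framework that parallelizes the three main bottlenecks of nonlinear FEM for efficient forward simulation and uses automatic differentiation for inverse characterization, accurately recovering anisotropic yield and hardening parameters.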

Abstract

We present a fully differentiable, GPU-accelerated finite element framework for forward simulation and inverse characterization of finite-strain anisotropic elastoplastic materials. Built on JAX, the framework exploits modern accelerator architectures by parallelizing the three major computational bottlenecks in nonlinear FEM: elemental weak-form and tangent-stiffness evaluation, global sparse matrix assembly, and sparse linear solution. For a large-scale forward problem with 3 million degrees of freedom, JAX-FEM on a single NVIDIA H100 GPU achieves up to 9.4$\times$ speed-up over a 24-core CPU Abaqus baseline. Automatic differentiation is applied through the constitutive update and solver workflow, providing consistent Jacobians for complex constitutive models without manual derivation and accurate gradients for PDE-constrained inverse analysis. Compared with finite differences, the JAX-AD gradients avoid step-size sensitivity and provide the required sensitivities at substantially lower computational cost. For inverse characterization, we combine information-rich, topology-optimized heterogeneous specimen geometries with full-field displacement data to identify complex constitutive models with many parameters that would otherwise require many conventional experiments to characterize. We demonstrate accurate recovery of anisotropic yield and hardening parameters in progressively challenging settings, including uniform and spatially varying material properties. The resulting AD-based formulation enables efficient optimization in high-dimensional parameter spaces where finite-difference approaches are computationally infeasible. These results establish differentiable, GPU-accelerated FEM as a practical high-throughput engine for simulation, characterization, and optimization workflows in advanced manufacturing.

2606.17387 2026-06-17 cs.SE New submission

Supporting the Adoption of Privacy-Enhancing Technologies through Requirements Engineering

Oleksandr Kosenkov, Vadym Honcharenko, Abhinava Singh, Volodymyr Spirin, Danica Vranjanin

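AI summary: From a requirements engineering perspective, the paper analyzes the cross-stakeholder and cross-disciplinary challenges facing the adoption of privacy-enhancing technologies (PETs) in software engineering, and argues that systematically addressing the engineering, business, and legal viewpoints can facilitate adoption.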

Comments Accepted to the 34th International Requirements Engineering Conference (RE 2026), Montreal, Canada, from 17 to 21 August 2026

Abstract

In recent decades, privacy-enhancing technologies (PETs) have been recognized as a means of meeting regulatory and user privacy requirements in software systems that process personal data. Despite substantial research efforts, support from regulators, contributions by large technology companies such as Google and Microsoft, and growing interest among software practitioners, the practical adoption of PETs remains limited. Existing research consistently identifies recurring challenges to PETs adoption in SE, such as technical complexity and insufficient training. Despite ongoing research efforts, these challenges largely remain unresolved in practice. In this industrial challenge paper, we apply a practical, requirements engineering (RE)-driven perspective to examine challenges to PET adoption across multiple stakeholder groups (PET developers, integrators, and adopters) as well as across different disciplinary perspectives (engineering, law, and business). We argue that RE can facilitate the adoption of PETs by systematically addressing each of the complementary engineering, business, and legal viewpoints on privacy. Neglecting challenges in any of these viewpoints (e.g., the impact of PETs on software architecture, their business implications, and their contribution to regulatory compliance) can increase the impediments or even lead to implementation failure. In practice, explicit specification of these viewpoints within RE can enable meaningful coordination among stakeholders to more effectively realize the benefits of PETs in software engineering.

2606.17378 2026-06-17 cs.DC New submission

RISE: Relay Inference and Online Scheduling for Efficient Edge-Device Collaborative Diffusion Model Services

Zilan Huang, Zhiqing Tang, Hanshuai Cui, Tian Wang, Yuan Wu, Weijia Jia, Wei Zhao

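AI summary: Proposes RISE, which uses a training-free relay mechanism that exploits the latent space shared within a model family to combine a large edge-side model with a small device-side model, and a contextual bandit scheduler that balances quality against latency.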

Comments to be published in IEEE ICWS 2026

Abstract

Text-to-image diffusion models are increasingly deployed at the network edge to serve heterogeneous workloads with diverse quality and latency requirements. However, existing deployment strategies choose either large edge-side models with high fidelity but high latency or lightweight device-side models that offer speed at the cost of semantic coherence. Moreover, these approaches rarely split the denoising workload between models of different sizes across edge servers and user devices. To bridge this gap, we propose RISE, a method for edge-device diffusion model services that combines relay inference with online scheduling. Driven by the finding that the latent intensity exhibits minimal deviation after a model handoff, RISE uses a training-free relay mechanism that exploits the shared latent space within a model family: the large model on the edge handles the early denoising steps that shape semantic structure, then passes the intermediate latent to a small device-side model for detail refinement. To deploy this mechanism as a practical service, a contextual bandit scheduler selects the best relay configuration based on prompt complexity, user preferences, network quality and real-time node loads. Experiments on two benchmarks show that RISE's relay mechanism achieves up to 2.1$\times$ speedup while preserving full-model quality, and its context-aware scheduler effectively balances quality and latency under mixed workloads.

2606.17374 2026-06-17 cs.LO cs.PL cs.SE New submission

Verifying the Rust Standard Library

Byron Cook, Remi Delmas, Zyad Hassan, Bart Jacobs, Ranjit Jhala, Rahul Kumar, Felipe R. Monteiro, Thanh Nguyen, Rebecca Rumbul, Michael Tautschnig, Celina Val, Carolyn Zech

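AI summary: An open, crowdsourced campaign integrates complementary verification tools to statically verify unsafe code in the Rust standard library, targeting a subset of undefined behaviors, and reports on the effectiveness of large-scale verification and the challenges that remain.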

Comments Published at 18th NASA Formal Methods Symposium (NFM 2026)

Journal ref: In: Deshmukh, J., Havelund, K., Pinto, A. (eds) NASA Formal Methods. NFM 2026. Lecture Notes in Computer Science, vol 16622. Springer, Cham, pp. 415-435

Abstract

Rust's type system prevents many classes of memory errors, yet its standard library relies heavily on unsafe code whose correctness is validated through testing, including dynamic checks under Miri, but lacks static verification. We present what is, to the best of our knowledge, the largest verification campaign reported for a software library: an open, crowdsourced effort that integrates complementary verification tools into the continuous integration of a verification repository forked from the Rust standard library. We analyze the campaign's effectiveness, discuss the practical value of machine-checked proofs for a subset of undefined behaviors (e.g., out-of-bounds access, null and dangling pointer dereferences, and use of uninitialized memory), and frame the remaining obstacles as open challenges for the formal-methods community.

2606.17367 2026-06-17 cs.CY New submission

Towards Auditing AI Systems in the Wild

Aditya T. Vadlamani, Anutam Srinivasan, Srinivasan Parthasarathy

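AI summary: The paper proposes framing the auditing of AI systems as a statistical problem of monitoring constraint violations under uncertainty, and argues for auditing frameworks that span the system lifecycle and continuously evaluate risk-controlled constraints such as fairness and safety.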

Comments Accepted to KDD 2026 (Blue Sky Ideas Track)

Abstract

AI systems are increasingly deployed in real-world settings where their behavior is shaped by dynamic environments, evolving data distributions, and complex interactions with users and infrastructure. Traditional machine learning evaluation focuses on benchmarks and operates within sandboxed environments, providing only a limited view of the true system behavior in the wild. We argue for the development of principled auditing frameworks that monitor deployed AI systems throughout their lifecycle. We further propose framing auditing as a statistical problem of monitoring constraint violations under uncertainty, where desired properties (e.g., fairness and safety) are treated as risk-controlled constraints that must be continuously evaluated as systems evolve through iterative feedback. This perspective highlights the need for uncertainty-aware monitoring methods, socio-technical specifications of audit criteria, and auditing infrastructures that enable ongoing oversight of AI systems in the wild.

2606.17360 2026-06-17 cs.CY New submission

Narratives That Limit the Possible: Interrupting Narrative Closure in Computing Practice

Samuel Mann, Ruth Myers, Dave Guruge, Lucky Hawkins, Kylie McKee, Rex Alexander, Jamie Vaughan, Tim Lynch, Danny Fridberg

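AI summary: Using collective autoethnography, the paper identifies mechanisms by which concepts in computing are weaponised (such as simplification, individualisation, and binary framing) and proposes a "Reframing" toolkit that restores complexity, surfaces assumptions, and redirects attention to structural conditions, supporting practice shifts toward care, sufficiency, and justice.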

Abstract

Computing's dominant concepts - innovation, efficiency, resilience, professionalism - often migrate from reflective ideals to instruments that limit behaviour, redirect responsibility, and foreclose critique. We call this drift weaponisation: the discursive repurposing of professional concepts so that they stabilise business-as-usual while making structural alternatives appear unreasonable, illegible, or out of scope. Using collective autoethnography across education, justice, public administration, research management, and computing, we identify recurring mechanisms (simplification, individualisation, binary framing, metric substitution, hero/resilience scripts, organised ignorance). From this synthesis we propose Reframing - peaceful, practice-ready shifts (e.g., From Simplified Slogans to Structural Literacy; From Performative Compliance to Meaningful Outcomes; From Coping to Justice) - each paired with a 'do-now' prompt. The levers restore complexity, surface assumptions, and redirect attention to structural conditions without requiring formal authority. We contribute: (1) a cross-field account of weaponisation as a patterned phenomenon; (2) a portable reframing toolkit grounded in narrative and systems thinking; and (3) implications for computing within limits, including day-to-day practice shifts toward care, sufficiency, and justice.

2606.17358 2026-06-17 cs.CR New submission

OTRO: Oblivious Tokenization Path with Square-Root ORAM

Jonghyun Lee, Yongqin Wang, Rachit Rajat, Daniel Wong, Mengyuan Li, Murali Annavaram

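AI summary: To address leakage through tokenizer access patterns in confidential LLM serving, the paper proposes OTRO, which uses square-root ORAM for efficient oblivious lookups and reduces overhead through an instance pool, epoch-based rotation with padding, and chunked KV-cache-aware tokenization, limiting TTFT overhead to at most 4.5% in a TDX environment.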

Abstract

The CPU-side large language model (LLM) tokenizer is a critical security gap in LLM serving through a confidential computing stack with CPU and GPU trusted execution environments (TEEs). Tokenizers convert prompts through table-driven lookups, and the resulting memory access patterns are a powerful source of side-channel leakage. Recent work demonstrates end-to-end recovery of user prompts from tokenizer access patterns on production Intel TDX. However, a drop-in use of the popular tree-based Oblivious RAMs (e.g., PathORAM) to prevent access-pattern leakage introduces $\sim$13$\times$ tokenizer slowdown, resulting in 10-58% higher time-to-first-token (TTFT). In this paper, we present OTRO, an efficient, oblivious tokenization path tailored to latency-critical LLM serving. OTRO relies on square-root ORAM for fast single-access lookups, but avoids its prohibitive $O(N\log^2 N)$ rebuild cost every $\sqrt{N}$ accesses through three key innovations. First, OTRO provides a pool of replicated square-root ORAM instances that exploit the read-only nature of the tokenizer table. Second, an epoch-based rotation policy decouples accesses from rebuilds and pads each epoch with dummy accesses up to its boundary, minimizing observable information. Lastly, chunked KV-cache-aware tokenization further overlaps rebuilds with GPU prefill and minimizes the instance count. Implemented as modules in HuggingFace Tokenizers and nano-vLLM, running within a TDX-enabled CVM with an NVIDIA H100 GPU, OTRO limits TTFT overhead to at most 4.5%, keeps tokenizer-induced latency under 10% of total TTFT, and adds less than 0.5 GB of memory overhead while reducing the tokenizer's observable leakage across various model families and sizes.

2606.17347 2026-06-17 eess.SY cs.SY New submission

Classifying Transient Regimes in Dynamic Systems through Properties of Spatial Curves and Stochastic Processes: A Data-Driven Approach

Cristian Puerto-Santana, Javier Diaz-Rozo, Carlos Puerto-Santana, Carlos Ocampo-Martinez

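AI summary: Proposes a method for classifying transient and stationary regimes based on a spatial curve representation built from sample moments. Classifiers are designed from the curve's arc length and curvatures, and the arc-length classifier outperforms existing techniques on simulated linear, non-linear, and discontinuous multivariate systems.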

Abstract

This article proposes a novel methodology for the classification of transient and stationary regimes in dynamic systems. Several sensor-based solutions for regime classification in the literature require several parameters to be set, or are not suitable for scenarios involving multivariate systems that may contain periodic signals. The proposed method introduces a spatial curve representation of the considered system based on its sample mathematical moments. Then, by connecting concepts of stability theory, geometrical properties of spatial curves and stationary stochastic processes, two regime classifiers are designed using the arc length and the curvatures of the proposed curve. Both classifiers are capable of describing and detecting transient regimes, considering behaviors such as multivariate asymptotic stability, marginal stability, and cyclostationarity. Furthermore, a quantitative comparison in performance and computation resources of the proposed classifiers against existing classifiers in the literature illustrates that the proposed regime classifier based on the arc length outperforms other techniques in classifying transient regimes for simulated linear, non-linear, and discontinuous multivariate systems under the studied conditions.

2606.17322 2026-06-17 cs.CY New submission

Federated Fair Trade Energy: Speculative Fabulation for a Planet with Limits

Dawn Nafus, Laura Watts

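AI summary: Examines the social-justice and infrastructural-limit questions that permacomputing faces as electricity grids shift to decentralised green energy, uses empirically grounded speculative fabulation to identify research opportunities that arise when energy systems are federated across communities, and introduces the notion of "fair trade energy".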

Comments Paper in Proceedings of LIMITS 2026: 12th Workshop on Computing within Limits, 2026-06-23-25, Online

Abstract

What happens to permacomputing when electricity grids shift to decentralised green energy, and local communities and municipalities have increased governance over this vital public service? Electricity and computational networks are more than just separate systems that plug together. Shifts to renewable energy generation in the grid are impacting computational systems, and computational demands on electrical power are impacting the electricity grid; one infrastructure limits the other. Permacomputing research tends to focus on 'off-grid' or 'behind-the-meter' energy. This sacrifices some of the social justice benefits that the public electricity grid, managed and regulated for universal service, was designed to provide. Our paper uses empirically-grounded 'speculative fabulation' to identify research opportunities in permacomputing that open up when energy systems are federated across communities. The speculative fabulation takes the form of an interview between the energy manager of a future energy island in the North Sea and a permacomputing podcaster. This allows us to conceive of computing-grid integration in tractable social and ecological terms, and introduce a notion of 'fair trade energy'.

2606.17316 2026-06-17 cs.DS New submission

Approximation Preserving Coresets

近似保持的核集

Milind Prabhu, Chris Schwiegelshohn, Sudarshan Shyam

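AI summary To explain why coresets much smaller than theoretical guarantees suggest often suffice in big-data clustering, proposes approximation-preserving coresets, which preserve only the costs of good solutions and offer a guarantee between those of strong and weak coresets; also shows that coresets of this size are not attainable under even a very small distortion in the approximation factor.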

AI Chinese abstract

大数据环境下的聚类是一个被深入研究的问题,核集作为该领域的重要范式之一而出现。给定一个成本函数 $\text{cost}(P,S)$,将输入点 $P$ 和解 $S$ 映射到一个目标值,核集是一个通常带权重的概要 $\Omega\subseteq P$,使得 $\text{cost}(\Omega,S)\approx \text{cost}(P,S)$。在实践中,经常发现核集尺寸远小于理论保证所建议的尺寸就足够了。在本文中,我们为这一现象提供了一种解释。如果我们只希望保留\emph{好}解(即成本低的解)的成本,那么较小的核集尺寸就足够了。我们定义并设计了\emph{近似保持的核集},它提供的保证弱于适用于所有解的强核集,但强于仅适用于最优解的弱核集。我们通过证明即使近似因子有非常小的失真也无法达到这种尺寸的核集来补充这一结果。

English abstract

Clustering in a big data setting is an intensively studied problem, with coresets emerging as one of the important paradigms in this line of work. Given a cost function $\text{cost}(P,S)$ mapping input points $P$ and a solution $S$ to an objective value, a coreset is a typically weighted sketch $\Omega\subseteq P$ such that $\text{cost}(\Omega,S)\approx \text{cost}(P,S)$. In practice, coreset sizes much smaller than those suggested by theoretical guarantees are often found to be sufficient. In this paper, we offer an explanation for this phenomenon. Smaller coreset sizes suffice if we only wish to preserve the costs of \emph{good} solutions, i.e., solutions with low cost. We define and devise \emph{approximation-preserving coresets}, which provide a weaker guarantee than strong coresets, which apply to all solutions, while providing stronger guarantees than weak coresets, which apply only to the optimum solution. We complement this result by showing that even a very small distortion in the approximation factor cannot admit coresets of this size.
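
The coreset condition $\text{cost}(\Omega,S)\approx \text{cost}(P,S)$ can be made concrete with a minimal Python sketch. It assumes a k-means cost (weighted squared distance to the nearest center) and a uniform-sampling sketch in which every sampled point gets weight n/m. Uniform sampling is only an illustrative stand-in, not the construction from the paper, and the names `cost` and `uniform_sketch` are hypothetical.

```python
import random

def cost(points, weights, centers):
    """Weighted k-means cost: sum of w * squared distance to the nearest center."""
    total = 0.0
    for p, w in zip(points, weights):
        total += w * min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total

def uniform_sketch(points, m, rng):
    """Sample m points uniformly without replacement; each gets weight n/m."""
    sample = rng.sample(points, m)
    return sample, [len(points) / m] * m

rng = random.Random(0)
P = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
S = [(0.0, 0.0), (1.0, 1.0)]           # one candidate solution with two centers
omega, w = uniform_sketch(P, 200, rng)  # weighted sketch, 10% of the input
full = cost(P, [1.0] * len(P), S)
approx = cost(omega, w, S)
relative_error = abs(approx - full) / full
```

Here `relative_error` measures how well the sketch preserves the cost of the single solution `S`. A strong coreset would have to keep this error small for every `S`, whereas the abstract's notion only asks for it on low-cost solutions.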

2606.17315 2026-06-17 cs.DC cs.DS New submission

Space-Efficient Lock-Free Linear-Probing Hash Table

空间高效的免锁线性探测哈希表

Hagit Attiya, Rotem Oshman, Noa Schiller

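AI summary Proposes a lock-free linear-probing hash table with wait-free lookups that stays space-efficient while handling contention gracefully, using only a small amount of per-entry metadata to provide linearizable, lock-free operations.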

AI Chinese abstract

线性探测是哈希表设计中最简单且空间效率最高的方法之一,由于其紧凑的内存布局,在顺序设置中被广泛使用。然而,设计具有强活性保证的并发线性探测哈希表已被证明是困难的,并且只有少数此类算法被提出,所有这些算法要么限制并发性,要么依赖每个条目的大量元数据,从而损害了空间效率。我们提出了一种具有无等待查找的免锁线性探测哈希表,它保留了顺序线性探测的核心优势,同时优雅地处理争用。我们的设计每个表条目仅使用少量元数据:使用LL/SC时使用恒定数量的额外位,或使用CAS时使用对数数量的位。该算法是可线性化的且免锁的,支持插入、删除和无等待查找操作,并且能够安全地回收已删除元素使用的空间而无需重建表。我们分析了哈希表的均摊步骤复杂度,假设没有相同键的并发插入,并表明每个操作具有与顺序线性探测相匹配的期望均摊步骤复杂度,直到每个键的争用点。

English abstract

Linear probing is one of the simplest and most space-efficient approaches to hash table design, and is widely used in sequential settings due to its compact memory layout. However, designing a concurrent linear-probing hash table with strong liveness guarantees has proved difficult, and only a handful of such algorithms have been proposed, all of which either restrict concurrency or rely on large per-entry metadata, thereby compromising space efficiency. We present a lock-free linear-probing hash table with wait-free lookups that retains the core advantages of sequential linear probing while handling contention gracefully. Our design uses only a small amount of metadata per table entry: a constant number of additional bits when using LL/SC, or a logarithmic number of bits when using CAS. The algorithm is linearizable and lock-free, supports insert, delete, and wait-free lookup operations, and is able to safely reclaim space used by deleted elements without rebuilding the table. We analyze the amortized step complexity of our hash table assuming no concurrent insertions of the same key, and show that each operation has expected amortized step complexity matching that of sequential linear probing, up to the point contention per key.
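
For readers unfamiliar with the sequential baseline that the abstract refers to, here is a minimal Python sketch of a fixed-capacity linear-probing table. It is single-threaded and marks deletions with tombstones, which is the textbook approach and not the paper's lock-free algorithm or its space-reclamation scheme. The class name `LinearProbingTable` is hypothetical.

```python
EMPTY, TOMBSTONE = object(), object()

class LinearProbingTable:
    """Sequential open-addressing table with linear probing (fixed capacity)."""

    def __init__(self, capacity=8):
        self.slots = [EMPTY] * capacity

    def _probe(self, key):
        # Visit every slot once, starting at the key's home position.
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def insert(self, key, value):
        free = None  # first reusable slot seen along the probe sequence
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                if free is None:
                    free = idx
                break  # key cannot appear past an empty slot
            if slot is TOMBSTONE:
                if free is None:
                    free = idx
            elif slot[0] == key:
                self.slots[idx] = (key, value)  # overwrite existing key
                return True
        if free is None:
            return False  # table is full
        self.slots[free] = (key, value)
        return True

    def lookup(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None
            if slot is not TOMBSTONE and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return False
            if slot is not TOMBSTONE and slot[0] == key:
                self.slots[idx] = TOMBSTONE  # keep probe chains intact
                return True
        return False
```

Each entry carries no metadata beyond the key/value pair and the two sentinel states, which is the compactness the abstract wants to preserve under concurrency.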

2606.17314 2026-06-17 eess.SY cs.SY New submission

Line Outage Impact Factor (LOIF): A New Sensitivity Factor for Enhanced Transmission Observability

线路停运影响因子 (LOIF):一种用于增强输电可观性的新灵敏度因子

Daniel Flores, Yuanrui Sang, Michael P. McGarry

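AI summary Proposes the line outage impact factor (LOIF), a new sensitivity factor for transmission line outage detection that selects monitored lines more effectively than the line outage distribution factor (LODF) and improves detection accuracy.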

AI Chinese abstract

输电故障若未及时处理可能导致连锁故障和系统停电,影响数百万用户,因此选择最佳位置监测输电系统状态对电力系统可靠性至关重要。本文提出一种新的灵敏度因子——线路停运影响因子 (LOIF),它特别适用于电力系统监测,能比现有灵敏度因子(如线路停运分布因子 LODF)更有效地揭示输电停运对其他线路潮流的影响。在本研究中,我们将 LOIF 应用于三个测试系统的输电线路停运检测,并基于这两种灵敏度因子使用多种观测输电线路 (OTL) 选择方法将其与 LODF 进行比较。然后,我们应用机器学习算法通过监测选定的 OTL 来检测其他线路的停运,并使用 F1 分数评估检测精度。结果表明,通常在使用相同数量的 OTL 时,基于 LOIF 选择的 OTL 进行检测获得了更高的 F1 分数。这种模式在大规模系统中尤为一致,显示了其在实际应用中的潜力。

English abstract

Transmission failures can lead to cascading failures and system blackouts affecting millions of customers if not handled in time, and choosing the best locations to monitor the condition of the transmission system is crucial for power system reliability. In this paper, we propose a new sensitivity factor, the line outage impact factor (LOIF), which is especially useful for power system monitoring and can reveal the impacts of a transmission outage on the power flow of other lines more effectively than existing sensitivity factors, such as the line outage distribution factor (LODF). In this study, we apply the LOIF to transmission line outage detection in three test systems and compare it with the LODF using a number of observed transmission line (OTL) selection methods based on these two sensitivity factors. Then we apply a machine learning algorithm to detect the outages of other lines by monitoring the selected OTLs, and the detection accuracy is evaluated using the F1-score. The results show that, in general, with the same number of OTLs, detection using the OTLs selected using LOIF achieved higher F1-scores. The pattern was especially consistent in large-scale systems, showing its potential in real-world applications.

2606.17297 2026-06-17 cs.DS New submission

Scalable K-clique Estimation with Differential Privacy

可扩展的差分隐私k-团估计

Dung Nguyen, Ritwick Mishra, Anil Vullikanti

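AI summary To address the high global sensitivity of k-clique counts under differential privacy, proposes an algorithm that calibrates noise using a ladder-function upper bound on local sensitivity together with the approximation sensitivity framework, substantially reducing running time and scaling to graphs with millions of edges for the first time.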

AI Chinese abstract

k-团计数是子图挖掘中常用的度量。由于图通常包含敏感数据,关于差分隐私下的k-团计数已有大量工作。然而,这些度量具有非常高的全局灵敏度,因此需要更复杂的技术来隐私地计数k-团。平滑灵敏度和阶梯函数被开发用于减少这些度量的私有估计的噪声幅度。然而,这些方法在计算上非常低效。对于k>3的k-团,没有已知的多项式时间算法来计算平滑灵敏度,而阶梯函数的时间复杂度受限于精确计数的时间,这无法很好地扩展。在本文中,我们开发了一种新的高度可扩展的算法,用于差分隐私下的k-团计数估计。我们的算法将阶梯函数调整为局部灵敏度的平滑上界,并利用近似灵敏度框架来校准噪声,其幅度与上界的近似值成比例。这显著提高了运行时间。实验表明,我们的方法比基于阶梯函数的k-团计数估计快几个数量级,同时精度相似。我们的算法是第一个能够扩展到具有数百万条边的图,并且对于较大的k,阶梯函数算法无法完成。

English abstract

Counts of $k$-cliques are commonly used metrics in subgraph mining. Since graphs often have sensitive data, there has also been a lot of work on $k$-clique counts with differential privacy. However, these metrics have very high global sensitivity, and so more sophisticated techniques have been developed for counting $k$-cliques with privacy. Smooth sensitivity and ladder functions were developed for reducing the noise magnitude for private estimates of these metrics. However, these are computationally very inefficient to estimate. No polynomial time algorithms are known for smooth sensitivity of $k$-cliques for $k>3$, while the time complexity of ladder functions is lower bounded by the time for exact counts, which does not scale very well. In this paper, we develop a new highly scalable algorithm for estimating $k$-clique counts with differential privacy. Our algorithm adapts the ladder function to serve as a smooth upper bound on its local sensitivity, and utilizes the approximation sensitivity framework to calibrate noise with magnitude proportional to an approximation of the bound. This gives us a significant improvement in the running time. Experiments show that our method is several orders of magnitude faster than the ladder-function-based estimates of $k$-clique counts, while the accuracy is similar. Our algorithm is the first to scale to graphs with millions of edges and to larger $k$, for which the ladder function algorithm does not complete.
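
To see why global sensitivity is a problem here, consider the plain Laplace mechanism as a baseline. The sketch below assumes edge-level differential privacy, under which adding or removing one edge changes the $k$-clique count by at most $\binom{n-2}{k-2}$. It is not the paper's algorithm (no ladder function or approximation sensitivity appears), and all function names are hypothetical.

```python
import math
import random
from itertools import combinations

def count_k_cliques(n, edges, k):
    """Brute-force k-clique count on vertices 0..n-1 (exponential; tiny graphs only)."""
    adj = {frozenset(e) for e in edges}
    return sum(
        all(frozenset(pair) in adj for pair in combinations(nodes, 2))
        for nodes in combinations(range(n), k)
    )

def global_sensitivity(n, k):
    """Max change in the k-clique count when one edge is added or removed."""
    return math.comb(n - 2, k - 2)

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def private_k_clique_count(n, edges, k, epsilon, rng):
    """Baseline edge-DP estimate: true count plus Laplace(GS / epsilon) noise."""
    scale = global_sensitivity(n, k) / epsilon
    return count_k_cliques(n, edges, k) + laplace_noise(scale, rng)
```

For a graph with 1000 vertices and $k=4$, `global_sensitivity` already returns 497503, so the baseline noise can swamp the true count on sparse graphs. That gap is what instance-dependent bounds such as smooth sensitivity and ladder functions aim to close.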

2606.17292 2026-06-17 eess.SY cs.SY New submission

Robust Direct Data-Driven Hamiltonian for Safe Set Computation under Measurement Noise and Disturbances

鲁棒直接数据驱动哈密顿量:测量噪声和扰动下的安全集计算

Mohammad Bajelani, Christopher A. Strong, Claire J. Tomlin, Jason J. Choi, Klaske van Heusden

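AI summary To handle measurement noise and disturbances, proposes the Robust Data-Driven Hamiltonian (R-DDH), which yields an inner approximation of the safe set from noisy data, and proves that its gap to the exact Hamiltonian converges to zero with more data in the noise-free setting with additive disturbances.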

AI Chinese abstract

安全集计算是安全关键控制系统中的一个基本挑战,特别是在直接数据驱动设置中,安全分析直接从受噪声影响的测量值进行,无需显式建模。最近提出的一种方法,数据驱动哈密顿量(DDH),能够直接从测量值进行可达性分析,而无需依赖底层系统动力学的先验知识。本文将DDH框架扩展到鲁棒设置,考虑了测量噪声、外部扰动以及采样引起的状态-速度估计误差。从噪声测量中推导出鲁棒数据驱动哈密顿量(R-DDH),并证明其能给出精确哈密顿量的认证下界。这导致值函数的可证明欠近似和相关安全集的内近似。量化了数据驱动哈密顿量与精确哈密顿量之间的差距,并证明在无噪声但有加性扰动的设置中,随着数据增多,该差距收敛到零。通过两个案例研究展示了该方法的有效性:一个受约束的双积分器和一个在感知不确定性下运行的非线性闭环控制的飞机滑行系统。

English abstract

Safe set computation is a fundamental challenge in safety-critical control systems, especially in direct data-driven settings where safety analysis is performed directly from noise-affected measurements, without explicit modeling. A recently proposed method, Data-Driven Hamiltonian (DDH), enables reachability analysis directly from measurements, without relying on prior knowledge of the underlying system dynamics. This paper extends the DDH framework to a robust setting that accounts for measurement noise, exogenous disturbances, and sampling-induced state-velocity estimation error. A Robust Data-Driven Hamiltonian (R-DDH) is derived from noisy measurements and shown to yield a certified lower bound on the exact Hamiltonian. This results in a provable under-approximation of the value function and an inner approximation of the associated safe set. The gap between the data-driven and exact Hamiltonians is quantified, and it is shown to converge to zero with more data in a noise-free setting with additive disturbances. The effectiveness of the approach is shown through two case studies: a constrained double integrator and an aircraft taxiing system with a nonlinear closed-loop controller operating under perceptual uncertainty.

2606.17291 2026-06-17 cs.CE New submission

STORX: An Open-Source Object-Oriented Framework for Shape and Topology Optimization in MATLAB

STORX: 一个用于MATLAB形状与拓扑优化的开源面向对象框架

Amir M. Mirzendehdel, Krishnan Suresh

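AI summary Presents STORX, an open-source MATLAB framework implementing parametric and level-set shape optimization as well as density, level-set, and topological-sensitivity topology optimization methods, with an object-oriented structure that supports modularity and extensibility for teaching and research.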

AI Chinese abstract

本文介绍了STORX:用于研究与实验的形状与拓扑优化,这是一个基于MATLAB的开源教育框架,用于学习和教授计算设计优化。STORX提供了参数化和水平集形状优化平台,以及拓扑优化方法,包括密度法、水平集法和拓扑灵敏度方法(如进化法和帕累托追踪法)。所有模块遵循一致的面向对象结构,并集成了可视化、灵敏度分析和有限元程序,使用户能够以透明且可重复的方式探索形状与拓扑优化之间的连续体。该代码旨在通过强调模块化和可扩展性(通过清晰的意图分离)来补充研究生课程和独立研究。核心软件接口通过抽象基类定义,使得可以通过添加派生类来实现新的目标函数和设计/制造约束,而无需修改核心代码。本文还描述了软件架构,并通过一系列示例问题展示了该框架如何将数学公式直接映射到可执行代码。

English abstract

This paper presents STORX: Shape and Topology Optimization for Research and Experimentation, an open-source MATLAB-based educational framework for learning and teaching computational design optimization. STORX provides a platform for parametric and level-set shape optimization, as well as topology optimization methods including density, level-set, and topological sensitivity approaches such as evolutionary and Pareto-tracing methods. All modules follow a consistent object-oriented structure and integrate visualization, sensitivity analysis, and finite element routines, enabling users to explore the continuum between shape and topology optimization in a transparent and reproducible manner. The code is designed to complement graduate-level coursework and independent research by emphasizing modularity and extensibility through a clear separation of intent. Core software interfaces are defined via abstract base classes, enabling new objective functionals and design/manufacturing constraints to be implemented by adding derived classes without modifying the core code. The paper also describes the software architecture and demonstrates how the framework maps mathematical formulations directly to executable code through a series of illustrative problems.

2606.17275 2026-06-17 cs.LO cs.CR New submission

Syntactic Systems Cannot See Semantic Invariants

句法系统无法看到语义不变量

Fabio F. G. Buono

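AI summary Resolves an open question by proving that two theories of induction, open induction and clause set cycles, are incomparable; distils the Syntactic Invariance Principle from the proof and draws an informal analogy to the known barriers in the P versus NP problem.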

AI Chinese abstract

我们从一个小开放问题开始,Hetzl和Vierling询问两种归纳理论——开放归纳和子句集循环——是否不可比较。他们证明了一个方向,并留下了另一个方向。这里我们解决了它,证明几乎令人尴尬地简短,因为加法的规则只有在第一个参数是$0$或后继时才能触发,而Skolem常量既不是,因此项$a{+}b$和$b{+}a$永远无法被触及,而一个永远无法触及它们的机器也永远无法证明它们相等。区分这两种理论的是两个常量的顺序,而这个顺序是关于数字的事实,而不是关于符号的。我们从这一证明中提取出一个小的通用原则,即句法不变性原理,它命名了这类论证的形式。然后,我们以一些推测性评论结束,讨论这种相同形式如何非正式地出现在解决$\mathsf{P}$与$\mathsf{NP}$问题的已知障碍中,其中每个障碍似乎都指向了该障碍中技术无法达到的描述层次。我们将其作为一个建议而非定理提出,因为类比是真实的,但我们不会将其推至无法辩护的程度。在此过程中,我们提出了一个类比暗示但未解决的开放问题:是否存在一个快速的$\SAT$算法,如果存在,它是否总是可以作为一台可以写下的机器来展示,或者在某些情况下,它只能作为数字上的函数被发现。

English abstract

We start from a small open question: Hetzl and Vierling asked whether two theories of induction, open induction and clause set cycles, are incomparable. They proved one direction and left the other open. Here we close it, and the proof is almost embarrassingly short: the rules for addition can only fire when the first argument is $0$ or a successor, and a Skolem constant is neither, so the terms $a{+}b$ and $b{+}a$ can never be touched, and a machine that can never touch them can never prove they are equal. The thing that separates the two theories is the order of two constants, and that order is a fact about numbers, not about symbols. We extract from this proof a small general principle, the Syntactic Invariance Principle, that names the shape of such arguments. We then close with a few speculative remarks on how this same shape appears, informally, in the known barriers to settling $\mathsf{P}$ versus $\mathsf{NP}$, where each barrier seems to point to a level of description that the techniques in the barrier cannot reach. We raise this as a suggestion rather than a theorem, since the analogy is real but we do not push it past the point where we can defend it. Along the way we raise an open question that the analogy suggests but does not settle, on whether a fast algorithm for $\mathsf{SAT}$, were it to exist, would always be exhibitable as a machine you can write down or whether it could be found, in some cases, only as a function on the numbers.
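
The remark that the addition rules "can only fire when the first argument is $0$ or a successor" can be illustrated with a tiny term rewriter. This is a sketch under assumptions: terms are encoded as Python strings and tuples, the rule set is the usual recursion on the first argument, and the encoding and function names are hypothetical. It shows only that rewriting gets stuck on Skolem constants, not the paper's provability argument.

```python
def step(term):
    """Apply one rewrite step of Peano addition, or return None if stuck.

    Rules (recursion on the first argument, as in the abstract):
        0 + y    -> y
        S(x) + y -> S(x + y)
    Terms: "0", constants such as "a"/"b", ("S", t), ("+", t1, t2).
    """
    if isinstance(term, str):              # "0" or a Skolem constant
        return None
    if term[0] == "S":
        inner = step(term[1])
        return None if inner is None else ("S", inner)
    _, left, right = term                  # term is ("+", left, right)
    if left == "0":
        return right
    if not isinstance(left, str) and left[0] == "S":
        return ("S", ("+", left[1], right))
    for i, sub in ((1, left), (2, right)):  # otherwise rewrite inside an argument
        new = step(sub)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    return None

def normalize(term):
    """Rewrite until no rule applies."""
    while (nxt := step(term)) is not None:
        term = nxt
    return term
```

With this encoding, `normalize(("+", ("S", "0"), "a"))` reduces to `("S", "a")`, while `("+", "a", "b")` and `("+", "b", "a")` are both already normal forms and remain syntactically distinct.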

2606.17261 2026-06-17 cs.PF cs.SE stat.AP New submission

The Right Call for Software Benchmarking: Consistent Decisions in Stateful Environments

软件基准测试的正确调用:有状态环境下的一致决策

Gábor Melis

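AI summary To address benchmarking bias in stateful environments, proposes experiment designs built on contrast estimators in which program-specific biases cancel, yielding asymptotically correct decisions.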

AI Chinese abstract

在对性能的不懈追求中,现代计算系统越来越依赖有状态机制来适应工作负载和物理环境的动态变化,这提高了效率,但使基准测试以及软件优化变得困难。事实上,自适应机制本质上会在测量之间引入时间依赖性,并导致对单个程序性能的朴素估计产生偏差。注意到纠正此类偏差需要对系统动态进行推测性假设,我们呼吁优先考虑性能差异而非绝对度量,并将软件基准测试形式化为识别最快程序的决策问题,对此相对知识就足够了。为此,我们提出了简单的实验设计,允许对比的一致估计,从而使程序特定偏差在可接受的假设下抵消。这些设计渐近地产生正确的决策,并为有状态环境下的有限预算基准测试提供了一种稳健的方法,对性能敏感软件的开发具有广泛的影响。

English abstract

In the perpetual pursuit of performance, modern computing systems rely ever more on stateful mechanisms to accommodate the dynamics of workloads and physical environments, bolstering efficiency but confounding benchmarking and thereby the optimization of software. Indeed, by their nature, adaptive mechanisms introduce temporal dependencies between measurements and render naive estimators of individual program performance biased. Observing that rectifying such biases necessitates speculative assumptions about system dynamics, we call for prioritizing performance differentials over absolute measures and formalize software benchmarking as the decision problem of identifying the fastest program, for which relative knowledge suffices. To this end, we propose simple experiment designs admitting consistent estimators of contrasts, whereby program-specific biases cancel under tenable assumptions. These designs asymptotically yield the correct decision and afford a robust methodology for finite-budget benchmarking in stateful environments, bearing broad implications for the development of performance-sensitive software.
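
The abstract does not spell out its experiment designs, so the following Python sketch uses a classic counterbalanced ABBA ordering as an assumed stand-in. It also assumes a deliberately simple state model in which every measurement is inflated by a term that grows linearly with time. Under that model the within-block contrast cancels the drift exactly, whereas running one program after the other does not. All names and the drift model are hypothetical.

```python
def measure(base_time, t, drift=0.01):
    """Hypothetical runtime model: program cost plus a state term growing with time t."""
    return base_time + drift * t

def sequential_contrast(base_a, base_b, n):
    """Run A n times, then B n times; estimate mean(B) - mean(A)."""
    a = [measure(base_a, t) for t in range(n)]
    b = [measure(base_b, t) for t in range(n, 2 * n)]
    return sum(b) / n - sum(a) / n

def abba_contrast(base_a, base_b, blocks):
    """Run A, B, B, A in each block; average the within-block differences."""
    diffs = []
    for k in range(blocks):
        t = 4 * k
        a = (measure(base_a, t) + measure(base_a, t + 3)) / 2
        b = (measure(base_b, t + 1) + measure(base_b, t + 2)) / 2
        diffs.append(b - a)
    return sum(diffs) / blocks

# B is truly faster than A by 0.2 time units.
naive = sequential_contrast(1.2, 1.0, 100)
paired = abba_contrast(1.2, 1.0, 50)
```

In this toy setting `naive` comes out positive, wrongly suggesting that B is slower, while `paired` recovers the true difference of -0.2. The decision about which program is faster depends only on the sign of the contrast, which is why relative knowledge suffices.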

2606.17253 2026-06-17 cs.AR New submission

PDAGENT-BENCH: Characterizing, Grounding, and Architecting LLM Agents for VLSI Physical Design

PDAGENT-BENCH: 用于VLSI物理设计的LLM代理的特征化、基础化与架构化

Qiufeng Li, Rongqian Chen, Quan Cheng, Chengxuan Wang, Sizhe Tang, Wuxi Li, Duo Ding, Chia-Tung Ho, Haoxing Ren, David Z. Pan, Tian Lan, Weidong Cao

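AI summary Introduces PDAGENT-BENCH, a benchmark for evaluating LLM/VLM agents in VLSI physical design at both task and workflow level; it reveals model limitations in tool-centric execution and long-horizon reasoning and confirms the benefit of human-skill-enhanced workflows.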

AI Chinese abstract

大型语言模型和视觉语言模型在超大规模集成电路前端设计中取得了显著成功,但它们在VLSI物理设计中的能力仍远未得到充分探索。主要原因是缺乏标准化的基准来评估代理物理设计工作流,这些工作流需要在严格设计约束下进行高维、多阶段优化,与多种电子设计自动化工具协调交互,并进行迭代优化。本文介绍了PDAGENT-BENCH,一个全面且多维度的基准,用于评估基于LLM/VLM的代理在物理设计堆栈中的表现。PDAGENT-BENCH集成了任务级评估和工作流级执行。该基准套件包含353个精心设计的问题,结合了概念性问题与真实世界的工业制品,并配有专家验证的参考和可执行解决方案。这些任务涵盖五个关键能力维度:基础知识、报告理解、根本原因分析、脚本生成和全流程实现。此外,该基准提供了一个统一、与人类对齐的代理物理设计工作流框架,能够在真实EDA环境中实现整体物理设计的闭环评估。对11个最先进模型的实验表明,虽然现代LLM/VLM在概念性任务上表现有竞争力,但在以工具执行为中心的任务(例如,Innovus脚本生成为42.2%)和长程多阶段推理方面仍存在显著局限。我们的研究进一步表明,人类技能增强的代理工作流显著提升了端到端物理设计性能。PDAGENT-BENCH为推进LLM/VLM驱动的整体物理设计自动化建立了一个标准化、可重复且真实的评估框架。我们将很快开源该基准和框架。

English abstract

Large language models (LLMs) and vision-language models (VLMs) have shown remarkable success in the front-end design of very-large-scale integration (VLSI) circuits, yet their capabilities for VLSI physical design remain significantly underexplored. The primary cause is the lack of standardized benchmarks for evaluating agentic physical design workflows that require high-dimensional, multi-stage optimization under strict design constraints, coordinated interaction with diverse electronic design automation (EDA) tools, and iterative refinement. This work introduces PDAGENT-BENCH, a comprehensive and multi-dimensional benchmark for evaluating LLM/VLM-based agents across the physical design stack. PDAGENT-BENCH integrates both task-level assessment and workflow-level execution. The benchmark suite contains 353 curated problems that combine conceptual questions with real-world industrial artifacts, with expert-validated references and executable solutions. These tasks cover five key capability dimensions: foundational knowledge, report comprehension, root-cause analysis, script generation, and full-flow implementation. In addition, the benchmark provides a unified, human-aligned agentic physical design workflow framework that enables closed-loop evaluation of holistic physical design in realistic EDA environments. Experiments on 11 state-of-the-art models reveal that while modern LLMs/VLMs perform competitively on conceptual tasks, they remain substantially limited in tool-centric execution (e.g., 42.2% on Innovus script generation) and long-horizon, multi-stage reasoning. Our studies further show that human-skill-enhanced agentic workflows significantly improve end-to-end physical design performance. PDAGENT-BENCH establishes a standardized, reproducible, and realistic evaluation framework for advancing LLM/VLM-driven holistic physical design automation. We will open-source the benchmark and framework soon.

2606.17245 2026-06-17 cs.CR cs.NI New submission

Cache to the Future: A Distributed Webpage Archive for Internet Blackouts

缓存至未来:面向互联网断网的分布式网页存档

Ross Evans, Diogo Barradas

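AI summary Proposes Cache to the Future (CttF), a system that uses distributed community ratings and cryptographic mechanisms to cache and deliver static web content during internet blackouts; simulations demonstrate its effectiveness at city scale.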

Comments 20 pages, 8 figures

AI Chinese abstract

互联网断网,无论是由于技术故障还是政府有意为之,都会阻止公民访问互联网。在互联网断网常见地区的公民已使用抗断网技术来维持通信。此类技术通常依赖移动网状网络提供有限的消息服务。然而,目前尚无技术能在断网期间持续提供对网络知识源的访问。我们提出Cache to the Future (CttF):一个在断网期间缓存和传递网络上托管静态内容的系统。CttF的分布式社区评分实现了大规模众包缓存,同时密码学构造(数字签名、工作量证明)减轻了对抗性干扰。我们的真实仿真表明,CttF能在城市规模下,在多种良性和对抗场景中传递内容。

English abstract

Internet blackouts, occurring due to technological mishaps or intentional governmental action, prevent citizens from accessing the internet. Citizens in regions where internet blackouts are common have utilized blackout-resistant technologies to maintain communication. Such technologies often rely on mobile mesh networks to provide limited messaging services. However, no technology currently exists that can provide continued access to knowledge sources on the web during a blackout. We present Cache to the Future (CttF): a system to cache and deliver static content hosted on the web during a blackout. CttF's distributed community ratings crowdsource caching at scale while cryptographic constructs (digital signatures, proofs-of-work) mitigate adversarial interference. Our realistic simulations demonstrate CttF delivering content at city-scale across a wide range of benign and adversarial scenarios.

2606.17228 2026-06-17 cs.LO cs.PL New submission

A Stone-Cech Collecting Semantics for Residual Process Behaviour

残差过程行为的 Stone-Cech 收集语义

Mike Stannett

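AI summary Develops a compact collecting semantics for the residual behaviour left by nonterminating computation, taking tail-cluster sets in the Stone-Cech compactification as a common semantics; it distinguishes stable divergence, finite recurrent divergence, and mixed recurrence with escape, and validates residual-tail laws in CCS.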

Comments 36 pages. Created using AI assistance

AI Chinese abstract

本文为非终止计算留下的残差行为开发了一种紧凑收集语义。对于顺序时间,这是观察空间 Stone-Cech 紧化中流的尾簇集。它为普通循环、混合循环行为以及通过观察空间的非紧部分的逃逸提供了公共语义。基本理论建立了尾不变性、连续观察下的函子性以及开闭观察的时间解读:包含在 beta-X 的相应开闭区域中是最终真值,而非空交集是循环。进展和公平性假设通过加强时间过滤器来表示。通过紧化乘积获得关系含义,因此沿着时间的相同渐近视图进行的观察之间的相关性得以保留。主要应用是 CCS 中的残差行为。无限执行被读作模结构同余的残差过程流。该语义区分了稳定发散、有限循环发散、带有逃逸的混合循环以及通过无界残差增长的逃逸。它验证了前缀、受控展开、有限选择和有限前缀选择形式的残差尾定律,同时识别了这些定律在并行组合和同步下的边界。有限观察商提供了计算接口:抽象含义变为循环状态和强连通分量计算,资源观察检测无界逃逸,而无需检查 Stone-Cech 余集中的单个点。

English abstract

This paper develops a compact collecting semantics for the residual behaviour left by nonterminating computation. For sequential time this is the tail-cluster set of the stream in the Stone-Cech compactification of the observation space. It gives a common semantics to ordinary recurrence, mixed recurrent behaviour, and escape through noncompact parts of the observation space. The basic theory establishes tail invariance, functoriality under continuous observations, and a temporal reading for clopen observations: containment in the corresponding clopen region of beta-X is eventual truth, while nonempty intersection is recurrence. Progress and fairness assumptions are represented by strengthening the time filter. Relational meanings are obtained by compactifying products, so correlations between observations made along the same asymptotic view of time are retained. The main application is to residual behaviour in CCS. Infinite executions are read as streams of residual processes modulo structural congruence. The resulting semantics distinguishes stable divergence, finite recurrent divergence, mixed recurrence with escape, and escape through unbounded residual growth. It validates residual-tail laws for prefixing, guarded unfolding, finite choice, and finite prefix-choice forms, while also identifying the boundary of those laws under parallel composition and synchronisation. Finite observational quotients provide the computational interface to the compact semantics: abstract meanings become recurrent states and strongly connected component calculations, and resource observations detect unbounded escape without requiring individual points of the Stone-Cech remainder to be inspected.

2606.17223 2026-06-17 cs.CR New submission

Safety, Security, and Cognitive Risks in Neuro-Symbolic AI

神经符号AI中的安全性、安全性和认知风险

Manoj Parmar

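AI summary Systematically analyses the attack surface of neuro-symbolic AI across a five-layer architecture, presenting a unified threat model, a symbolic-layer threat catalogue, and a cognitive-risk analysis, and uses three empirical benchmarks to demonstrate attack effectiveness and detection challenges.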

Comments 28 pages, 1 figure, 10 tables

AI Chinese abstract

神经符号AI(NeSy)将神经感知与符号推理相结合,使其在需要可解释性和结构化推理的高风险领域具有吸引力。然而,这种混合架构引入了跨越五个层次的扩大攻击面:神经感知、符号知识库、推理引擎、智能体编排和数据存储——每个层次都可能以纯神经系统中不存在的方式被利用。本文做出六项贡献:(1)正式定义了NeSy攻击面、符号完整性违反(SIV)和跨层放大比$\mathcal{X}$,分解为神经引起的和自主符号敏感性分量;(2)一个统一的威胁模型,扩展了MITRE ATLAS,包含11个NeSy特定策略扩展和五类攻击者分类;(3)一个符号层威胁目录,涵盖知识图谱(KG)投毒、本体合并和推理引擎颠覆;(4)认知风险分析——自动化偏差、权威偏差和谄媚强化——这些风险因NeSy显式的逻辑解释相对于黑箱神经输出而被结构性放大;(5)跨学科缓解措施,具有与NIST AI 600-1和欧盟AI法案一致的可衡量接受标准;(6)三个实证基准:(E1)针对205实体医学KG的目标KG投毒在注入预算$B=5$时达到盈亏平衡SIV,并存在KG特定的隐蔽性/SIV权衡;(E2)在DistilBERT+ProbLog流水线上,$\varepsilon=0.01$的PGD-10产生$\mathcal{X}=5.884$(95%置信区间$[4.64, 8.00]$,$p<0.0001$),通过匹配随机基线($E^{R}_{\mathrm{rand}}=0$)确认了对抗特异性;(E3)单公理OWL编辑实现93.3%的SIV成功率,100%的Pellet一致性隐蔽性,但留出STIX检测在50%(随机猜测水平)失败,这是一个开放问题。

English abstract

Neuro-symbolic AI (NeSy) pairs neural perception with symbolic reasoning, making it attractive for high-stakes domains where explainability and structured inference are required. However, this hybrid architecture introduces an enlarged attack surface spanning five layers: neural perception, symbolic knowledge bases, reasoning engines, agentic orchestration, and data stores -- each exploitable in ways absent from purely neural systems. This paper makes six contributions: (1) formal definitions of NeSy Attack Surface, Symbolic Integrity Violation (SIV), and Cross-Layer Amplification Ratio $\mathcal{X}$, decomposed into neural-caused and autonomous symbolic sensitivity components; (2) a unified threat model extending MITRE ATLAS with 11 NeSy-specific tactic extensions and a five-profile attacker taxonomy; (3) a symbolic-layer threat catalogue covering knowledge graph (KG) poisoning, ontology-merging, and inference-engine subversion; (4) analysis of cognitive risks -- automation bias, authority bias, and sycophantic reinforcement -- structurally amplified by NeSy's explicit logical explanations relative to black-box neural outputs; (5) interdisciplinary mitigations with measurable acceptance criteria aligned to NIST AI 600-1 and the EU AI Act; (6) three empirical benchmarks: (E1) targeted KG poisoning achieves break-even SIV at injection budget $B=5$ on a 205-entity medical KG, with a KG-specific stealth/SIV trade-off; (E2) PGD-10 at $\varepsilon=0.01$ yields $\mathcal{X}=5.884$ (95% CI $[4.64,\, 8.00]$, $p<0.0001$), confirmed adversarially specific by a matched-random baseline ($E^{R}_{\mathrm{rand}}=0$), on a DistilBERT+ProbLog pipeline; (E3) single-axiom OWL edits achieve 93.3% SIV success with 100% Pellet-consistency stealth, but held-out STIX detection fails at 50% (random-guessing level), an open problem.

2606.17217 2026-06-17 eess.SY cs.SY New submission

A Stateful Stochastic Allocation Mechanism with Fairness Guarantees for Networked Electricity Systems

一种具有公平性保障的有状态随机分配机制用于网络化电力系统

Shaun Sweeney

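AI summary Proposes the FP-AMM mechanism, which achieves fair electricity allocation through a two-stage stochastic clearing rule and a shortage-memory state, and validates convergence and performance gains on standard IEEE test systems.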

AI Chinese abstract

本文开发并分析了公平博弈自动做市商(FP-AMM),一种可编程的电力分配机制,其中稀缺性分配被视为受控、有状态且可审计的信息物理过程。现有机制如节点边际定价是无记忆的,无法考虑历史服务结果,从而无法保证跨市场区间的公平待遇。FP-AMM采用两阶段随机清算规则,包括服务优先级采样和逆公平加权,结合DC-OPF可行域和通过饱和积分器更新的有界短缺记忆。建立了四个主要结果。第一,短缺记忆状态在$[0,1]^N$中不变,且更新映射是收缩率为$1-\beta$的压缩映射。第二,区间内清算算子线性收敛到唯一不动点,收缩因子$q\in(0,1)$。第三,在公平博弈优先级规则下,每节点交付比率几乎必然收敛到合同目标$F^\star$,通过赤字递归的Lyapunov分析获得有限时间$O(1/\sqrt{T})$界。第四,事件触发执行保证了分配跟踪误差的实际最终有界性,并量化了计算-保真度权衡。该机制在IEEE 14、57和118节点系统上经过$T=5000$个市场区间验证。在所有基准测试中实现了向$F^\star$的公平收敛,在IEEE-57网络上峰值弱节点公平误差降低了54%,在稀缺时期相对于等权重基线降低了高达55%,并且始终维持DC可行性。

English abstract

This paper develops and analyses the Fair Play Automatic Market Maker (FP-AMM), a programmable electricity allocation mechanism in which scarcity allocation is treated as a controlled, stateful, and auditable cyber-physical process. Existing mechanisms such as locational marginal pricing are memoryless and cannot account for historical service outcomes, preventing guarantees of equitable treatment across market intervals. The FP-AMM employs a two-stage stochastic clearing rule comprising service-priority sampling and inverse-fairness weighting, coupled with a DC-OPF feasibility set and bounded shortage memory updated through a saturated integrator. Four main results are established. First, the shortage-memory state is invariant in $[0,1]^N$ and the update map is a contraction with rate $1-\beta$. Second, the intra-interval clearing operator converges linearly to a unique fixed point with contraction factor $q\in(0,1)$. Third, under the Fair Play priority rule, the per-node delivery ratio converges almost surely to the contracted target $F^\star$, with a finite-time $O(1/\sqrt{T})$ bound obtained via Lyapunov analysis of the deficit recursion. Fourth, event-triggered execution guarantees practical ultimate boundedness of the allocation tracking error and quantifies the computation-fidelity trade-off. The mechanism is validated on the IEEE 14-, 57-, and 118-bus systems over $T=5000$ market intervals. Fairness convergence to $F^\star$ is achieved on all benchmarks, peak weak-bus fairness error is reduced by 54% on the IEEE-57 network and by up to 55% relative to an equal-weight baseline during scarcity periods, and DC feasibility is maintained throughout.
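
The first result (invariance in $[0,1]^N$ and contraction with rate $1-\beta$) is what one would expect from a leaky integrator followed by saturation. The abstract does not give the update rule, so the form below, $m \leftarrow \mathrm{clip}((1-\beta)m + \beta s,\,0,\,1)$ with a per-node shortfall signal $s\in[0,1]$, is an assumption made for illustration, and `update_memory` is a hypothetical name.

```python
def update_memory(memory, shortfall, beta):
    """One assumed shortage-memory step: leaky integration, then saturation to [0, 1]."""
    return [
        min(1.0, max(0.0, (1.0 - beta) * m + beta * s))
        for m, s in zip(memory, shortfall)
    ]

def sup_distance(x, y):
    """Largest per-node gap between two memory states."""
    return max(abs(a - b) for a, b in zip(x, y))

beta = 0.25
shortfall = [0.3, 0.9, 0.0]   # assumed per-node shortfall signal in [0, 1]
m1, m2 = [0.0, 0.5, 1.0], [1.0, 0.2, 0.3]
n1, n2 = update_memory(m1, shortfall, beta), update_memory(m2, shortfall, beta)
gap_before, gap_after = sup_distance(m1, m2), sup_distance(n1, n2)
```

Under this assumed rule, two memory states driven by the same shortfall signal move closer by a factor of at most $1-\beta$ per step (`gap_after <= (1 - beta) * gap_before`), and no component ever leaves $[0,1]$.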

2606.17212 2026-06-17 cs.GR cs.NI New submission

Renderable Partial Representations for Dynamic Gaussian Splatting under Incomplete Delivery

不完整交付下动态高斯溅射的可渲染部分表示

Faruk Alpay, Levent Sarioglu, Yaser Hadri

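AI summary To address the degradation of dynamic Gaussian representations under partial delivery in interactive rendering, organizes primitives into independently addressable spatiotemporal clusters and trains by sampling partial dependency graphs while minimizing expected distortion, tail distortion, and related terms, so that incomplete states remain directly renderable; experiments show gains over nominal layer order.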

Comments 19 pages, 8 figures, 3 tables. Code, tests, configurations, pinned environment, and measurement records (including the partial-state oracle atlas) are provided as ancillary files

AI Chinese abstract

动态高斯压缩通常针对完整文件或完整渐进前缀进行优化,但交互式渲染会遇到部分表示:某些时空区域存在,其他缺失,且后期细化无法影响已显示帧。我们研究动态高斯表示,其不完整交付状态仍可直接渲染,且其退化在图像空间中得到优化。高斯基元被组织为独立可寻址的时空簇,包含一个基础层和三个细化层;训练部分依赖图,在一个GPU批次中渲染许多反事实状态,并最小化期望失真、尾部失真、时间不一致性、码率和前缀回归。反事实效用层测量每个完成组在有效接收方上下文中的边际渲染贡献。同一图支持具体的交付实现,包括MTU限制的熵编码块、截止时间感知调度和接收方依赖闭合。在保留视图上,最细细化层在3/32个D-NeRF弹跳球、49/64个HyperNeRF扫帚2和28/64个HyperNeRF鸡簇中具有负平均边际效用;其下尾效用分别在21/32、61/64和42/64个簇中为负。在扫帚2上,渲染效用排序消除了在匹配字节预算下名义层序产生的两个PSNR回归;在鸡上,在不相交训练摄像机上测量的效用将最低匹配预算下的保留PSNR提高了3.03 dB。这些范围性结果表明,名义细化顺序不能替代渲染条件效用:该公式将网络交付视为可渲染场景状态的分布,而不是图形编解码器的外部包装。

English abstract

Dynamic Gaussian compression is normally optimized for complete files or complete progressive prefixes, but interactive rendering encounters partial representations: some spatiotemporal regions are present, others missing, and late refinements cannot affect the displayed frame. We study dynamic Gaussian representations whose incomplete delivery states remain directly renderable and whose degradation is optimized in image space. Gaussian primitives are organized into independently addressable spatiotemporal clusters with a base level and three refinements; training samples partial dependency graphs, renders many counterfactual states in one GPU batch, and minimizes expected distortion, tail distortion, temporal inconsistency, rate, and prefix regressions. A counterfactual utility layer measures the marginal render contribution of each completion group across valid receiver contexts. The same graph admits a concrete delivery realization with MTU-bounded entropy-coded chunks, deadline-aware scheduling, and receiver-side dependency closure. On held-out views, the finest refinement has negative mean marginal utility in 3/32 D-NeRF bouncingballs, 49/64 HyperNeRF broom2, and 28/64 HyperNeRF chicken clusters; its lower-tail utility is negative in 21/32, 61/64, and 42/64 clusters, respectively. On broom2, render-utility ordering removes both PSNR regressions produced by nominal layer order at matched byte budgets; on chicken, utilities measured on disjoint training cameras improve held-out PSNR by 3.03 dB at the lowest matched budget. These scoped results show why nominal refinement order cannot substitute for render-conditioned utility: the formulation treats network delivery as a distribution over renderable scene states rather than as an external wrapper around a graphics codec.

2606.17128 2026-06-17 cs.AR New submission

Shift-Left High-Level Synthesis Verification via Knowledge-Augmented LLM Agent

通过知识增强的LLM智能体实现左移高层次综合验证

Zhihan Xiao, Zhe Zhao, Luke Ztz Hu, Songping Mai

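AI summary Proposes a knowledge-augmented, agent-driven shift-left verification framework that combines Dual-Tier Consistency Checking, symbolic execution, and an HLS Verification Knowledge Graph to automatically verify functional consistency between golden C and HLS-C before synthesis, reaching 98.26% average coverage.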

AI Chinese abstract

高层次综合(HLS)通过将C/C++程序转换为硬件实现,实现了快速硬件开发。在HLS设计流程中,黄金C规范与面向HLS的C实现之间的功能一致性验证是一项关键但劳动密集型的任务。尽管大型语言模型(LLMs)最近在自动化测试平台生成方面显示出潜力,但其随机性常常导致覆盖率不足、验证环境不一致以及等价性检查结果不可靠。为了解决这些限制,我们提出了一种知识增强的、智能体驱动的左移验证框架,用于在综合前自动检查黄金C与HLS-C之间的功能一致性。该框架引入了一种双层级一致性检查机制,该机制共同强制配对测试平台之间的静态结构对齐和动态行为等价性,同时集成符号执行和覆盖率驱动的细化以提高验证完整性。此外,我们构建了一个异构的HLS验证知识图谱,为测试平台生成提供拓扑感知推理先验,并设计了一个自主验证智能体来协调跨异构工具链的迭代细化和故障诊断。在107个HLS基准对上的实验结果表明,所提出的框架实现了98.26%的平均覆盖率和95.33%的动态一致性,优于代表性的基于AST、检索增强和迭代智能体的基线。此 https URL

English abstract

High-Level Synthesis (HLS) enables rapid hardware development by translating C/C++ programs into hardware implementations. Functional consistency verification between golden C specifications and HLS-oriented C implementations is a critical yet labor-intensive task in HLS design flows. While Large Language Models (LLMs) have recently shown promise in automated testbench generation, their stochastic nature often leads to insufficient coverage, inconsistent verification environments, and unreliable equivalence checking results. To address these limitations, we propose a knowledge-augmented, agent-driven shift-left verification framework for automated functional consistency checking between golden C and HLS-C implementations before synthesis. The framework introduces a Dual-Tier Consistency Checking mechanism that jointly enforces static structural alignment and dynamic behavioral equivalence between paired testbenches, while integrating symbolic execution and coverage-driven refinement to improve verification completeness. Furthermore, we construct a heterogeneous HLS Verification Knowledge Graph to provide topology-aware reasoning priors for testbench generation, and design an autonomous verification agent to orchestrate iterative refinement and failure diagnosis across heterogeneous toolchains. Experimental results on 107 HLS benchmark pairs demonstrate that the proposed framework achieves 98.26% average coverage and 95.33% dynamic consistency, outperforming representative AST-based, retrieval-augmented, and iterative agent-based baselines. https://github.com/cz-5f/HLS-LeVeri.git