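arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.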

2606.19866 2026-06-19 cs.CR New submission

Low-Cost Multi-Precision Systolic Arrays for Accelerating FHE NTTs on AI ASICs

George Alexakis, Dimitrios Schoinianakis, Giorgos Dimitrakopoulos

AI summary: To address the performance bottleneck that FHE faces on AI hardware because of a precision mismatch, the paper proposes a minimally modified multi-precision systolic array that natively performs full-precision output reconstruction under a uniform dataflow, achieving at least a 1.33x speedup.

Abstract

Fully Homomorphic Encryption (FHE) ensures robust data privacy but suffers from prohibitive computational overhead. Accelerating FHE on AI hardware like Tensor Processing Units (TPUs) is promising, yet fundamentally limited by a precision mismatch: TPUs are optimized for 8-bit arithmetic, whereas FHE and its critical parts, such as the Number Theoretic Transform (NTT), demand high precision. Current approaches bridge this gap using matrix decomposition to execute NTT computations on low-precision matrix engines. However, reconstructing the full-precision results requires shift-and-add accumulation that does not match the dataflow of matrix multiplication. This forces full-precision reconstruction to be offloaded from matrix engines to vector processors, which disrupts the matrix multiplication dataflow and creates a significant performance bottleneck. To resolve this limitation, we propose a minimally modified multi-precision systolic array that performs full-precision output reconstruction natively within the array in sync with low-precision matrix multiplication under a uniform dataflow. Synthesized at 7nm with OpenRoad, our design incurs negligible hardware overhead. Cycle-accurate simulations using SCALE-Sim demonstrate that natively executing NTTs on the proposed architecture achieves at least a 1.33x speedup for transform sizes 2^12 to 2^16 on 128x128 matrix engines, successfully enabling standard AI hardware to support high-precision FHE acceleration.
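
The shift-and-add reconstruction mentioned in the abstract can be illustrated with a small software sketch. This is not the paper's systolic-array design; it only shows how a high-precision modular matrix-vector product (a naive NTT written as a matrix multiply) might be split into 8-bit limb products and rebuilt afterwards. The function names, the little-endian limb layout, and the toy modulus 257 with root 16 are assumptions chosen for illustration.

```python
LIMB_BITS = 8  # assumed limb width, matching the 8-bit arithmetic named in the abstract


def to_limbs(x, n_limbs):
    """Split a non-negative integer into n_limbs little-endian 8-bit limbs."""
    mask = (1 << LIMB_BITS) - 1
    return [(x >> (LIMB_BITS * i)) & mask for i in range(n_limbs)]


def matvec_low_precision(matrix, vector, n_limbs, modulus):
    """Compute (matrix @ vector) mod modulus using only 8-bit by 8-bit products."""
    m_limbs = [[to_limbs(a, n_limbs) for a in row] for row in matrix]
    v_limbs = [to_limbs(b, n_limbs) for b in vector]
    result = []
    for row in m_limbs:
        acc = 0
        for i in range(n_limbs):
            for j in range(n_limbs):
                # Low-precision step: dot product of limb plane i with limb plane j.
                partial = sum(a[i] * b[j] for a, b in zip(row, v_limbs))
                # Reconstruction step: weight the partial sum by its limb
                # position (shift) and accumulate (add).
                acc += partial << (LIMB_BITS * (i + j))
        result.append(acc % modulus)
    return result


def naive_ntt_matrix(n, root, modulus):
    """Build the n x n matrix whose (i, j) entry is root**(i*j) mod modulus."""
    return [[pow(root, i * j, modulus) for j in range(n)] for i in range(n)]
```

For example, `matvec_low_precision(naive_ntt_matrix(4, 16, 257), [200, 13, 256, 99], 2, 257)` should agree with a direct full-precision computation of the same product. The inner `partial << ...` line is the step that, per the abstract, does not fit the plain matrix-multiplication dataflow.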

2606.19861 2026-06-19 cs.NE New submission

Weight Adaptation for Improving Parallel Performance of Adaptive Stochastic Natural Gradient

Yutaro Yamada, Kento Uchida, Shinichi Shirakawa

AI summary: Proposes WA-ASNG, which adaptively updates its weight parameters by gradient ascent to maximize the natural-gradient signal; it outperforms PBIL and ASNG on binary optimization problems and handles strong noise effectively.

Comments: Accepted at EvoCOP 2026 (Part of EvoStar 2026)

DOI: 10.1007/978-3-032-20537-7_10

Abstract

Probabilistic model-based evolutionary algorithms are promising for black-box optimization. Specifically, the adaptive stochastic natural gradient (ASNG) adaptively updates its learning rate, a typical hyperparameter in probabilistic model-based evolutionary algorithms, thereby realizing efficient and robust optimization. Although weight parameters are common hyperparameters, with the increasing demand for parallel evaluation of time-consuming tasks, it remains unclear how to set suitable weights for larger population sizes. In this paper, we propose Weight Adaptation ASNG (WA-ASNG), which incorporates a weight adaptation mechanism into ASNG. We calculate the estimated signal of the update direction from the accumulations of the natural gradient. Then, to maximize the signal, WA-ASNG adaptively updates its weight parameters by gradient ascent over the course of the optimization. While the learning rate adaptation plays a role in satisfying a sufficient condition for monotonic improvement of the expected objective value, the mechanism of weight adaptation is intended to maximize this improvement. The experimental results demonstrate that WA-ASNG outperforms PBIL and ASNG across various settings with population sizes ranging from 25 to 100 for binary optimization problems. Furthermore, WA-ASNG can perform efficiently in the presence of strong noise. Our code is available at https://github.com/shiralab/WA-ASNG.

2606.19826 2026-06-19 cs.CR cs.MA New submission

Heterogeneous LLM Debate Under Adversarial Peers: Honest Gains, Replacement Costs, and Resilience

Prashanti Nilayam, Kiran Kumar Ramanna, Prashil Tumbade, Sankalp Nayak

AI summary: Studies how honest and adversarial peers affect revision behavior in heterogeneous LLM debate; finds that an honest peer lowers the harmful-revision rate while an adversarial peer reverses the effect, and that heterogeneity can also serve as a defense when an adversary is already present.

Abstract

Heterogeneous LLM debate is motivated by the promise that diverse peers correct one another, but the same exchange that carries correction also carries adversarial influence. We measure which dominates by tracking how a heterogeneous peer changes the honest agents' revision behavior: how often they change their answer, and whether the change is corrective or harmful. We compare matched panels (homogeneous baseline, honest-mixed, and adversarial-mixed) and contaminated panels in which a malicious same-family peer is already present, spanning four model families and three reasoning benchmarks. An honest heterogeneous peer sharply lowers harmful revision, and an adversarial one reverses it. For Llama-3.1-70B defenders on MATH-hard, the honest-slot harmful-revision rate falls from 89% in the homogeneous panel to 35% with an honest peer, and an adversarial peer returns it to 90%. The conditional rate hides this damage on weak defenders, but the end-of-debate flip rate exposes it. The pattern keeps its sign across families and benchmarks while its magnitude varies with the defender-benchmark regime. We also measure the effects when an adversarial same-family peer is already present: an honest heterogeneous peer lowers both harmful revision and the rate at which initially-correct answers are lost. On the same Llama-3.1-70B setting, the added honest peer cuts the flip rate on initially-correct items from 31% under a same-family adversary to 6%. Heterogeneity is therefore not only an attack surface but, when an adversary is already present, also a defense.

2606.19822 2026-06-19 cs.FL New submission

Learning Alternating Real-Time Automata

Kazuki Kinoshita, Masaki Waga

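AI summary: Proposes the AL*RTA algorithm, which combines AL* and NL*RTA to learn alternating real-time automata via membership and equivalence queries; experiments show that it learns smaller automata than NL*RTA but requires more queries.

Comments: Accepted to QEST+FORMATS 2026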

2606.19816 2026-06-19 cs.CY New submission

Challenges to Grassroots Organization Engagement with AI Policy

Carter Buckner, Jennifer Mickel, Nandhini Swaminathan, William Agnew, Jacob Hobbs, Sarthak Arora, Michelle Lin, Yanan Long, B. V. Alaka

AI summary: Through a case study, this paper examines the challenges that grassroots organizations and marginalized communities face when engaging in AI policymaking, and offers recommendations grounded in participatory design.

Comments: To appear at ACM FAccT 2026

Abstract

Public policies are being developed around the world to address privacy, economic, intellectual property, energy, and other risks that AI technologies pose. Involvement from the general public is essential to governance as an accountability and alignment mechanism. However, participating in and impacting policymaking can be challenging for sections of the public that lack extensive networks, lobbying capabilities, and other forms of power. This challenge is especially acute for marginalized communities. In this paper, we present a case study of our organization's efforts to bring participatory design (PD) principles to AI policymaking in the US. We describe our engagements with several US policy bodies, and our participatory development of AI policy for queer people. We highlight challenges with PD practice with marginalized communities, and offer suggestions to alleviate them. We conclude with actionable recommendations for policymakers and other organizers working in marginalized communities.

2606.19814 2026-06-19 cs.SE New submission

CoRaCommit: A VS Code Extension for Commit Message Generation with Exemplar Retrieval

Chaoran Cai, Bo Xiong, Chong Wang, Lulu He, Peng Liang

AI summary: Proposes CoRaCommit, a VS Code extension that retrieves similar commit exemplars as prompt context, invokes multiple LLMs in parallel to generate candidate messages, and dynamically recommends LLMs based on user feedback; it outperforms existing extensions on the ApacheCM dataset.

Comments: 17 pages, 6 images, 3 tables, Manuscript submitted to a Journal (2026)

Abstract

Commit messages are essential textual artifacts that describe the intent behind code changes, and play a critical role in version control, code review, and historical tracking. However, in practice, commit messages are primarily authored manually, which is time-consuming and often results in inconsistent quality and non-uniform expression. Existing VS Code extensions for commit message generation typically directly invoke large language models based on the code diff, without leveraging similar commit exemplars as references, and rarely support user feedback-driven LLM recommendation. To address these limitations, this paper presents CoRaCommit, a VS Code extension that enhances commit message generation by retrieving similar commit exemplars as prompt context, invoking multiple LLMs in parallel for candidate commit message comparison, and dynamically recommending LLMs based on user feedback. Experimental results on 945 commits from the ApacheCM dataset show that CoRaCommit outperforms existing VS Code extensions across BLEU, CIDEr, METEOR, and ROUGE-L metrics, demonstrating the effectiveness of retrieval-augmented context for commit message generation.

2606.19807 2026-06-19 cs.CR New submission

DISARM: Target Electronic Device Informed Mitigation of Software Runtime Side-Channel Vulnerabilities

Tasneem Suha, Tanzim Mahfuz, Rima Asmar Awad, Prabuddha Chakraborty

AI summary: Proposes DISARM, which uses timing values from real embedded devices to generate targeted software fixes that mitigate runtime side-channel vulnerabilities; it outperforms existing solutions on five different devices.

Abstract

Program runtime or timing attacks exploit variations in a program's execution times to extract sensitive information from the program (e.g., encryption keys, sensitive variable data, intellectual property). State-of-the-art solutions to runtime side-channel attacks attempt to balance the execution time of the sensitive code for different control flow paths to eliminate the timing leakage. However, during the mitigation process, most techniques do not consider the underlying hardware or device on which the target program is supposed to run. This can lead to over-fixing (unnecessary extra operations), under-fixing (not solving the imbalance properly), and even failures. We propose DISARM, a joint hardware-software methodology (unlike any existing solution) for mitigating runtime side-channel vulnerabilities that utilizes timing values from real embedded devices to generate targeted software fixes. We implement DISARM to support C, C++, and Java source code and validate it across 22 standard benchmarks. DISARM outperforms state-of-the-art solutions such as PENDULUM and DifFuzzAR in terms of execution time overhead, code size overhead, and correctness on five different embedded or edge devices.

2606.19790 2026-06-19 cs.CE New submission

The Orchestration Gap: Why Process Automation Stalls in Operationally Complex Industries

Jiechao Gao, Yuandong Pan, Yuangang Li, Jie Wang, Kincho Law, Michael Lepech

AI summary: Introduces the concept of the "orchestration gap", analyzes why multi-agent systems fail to automate complex industries such as logistics and healthcare, and proposes a staged automation path built on constraint enforcement and explainability.

Abstract

Agentic systems have advanced quickly on digitally native tasks, yet they have barely touched the industries where coordinated automation could matter most: logistics, healthcare operations, construction, and the many sectors whose work is spread across incompatible tools and many hands. We argue that the reason is a missing abstraction. The value in these settings does not come from a single capable model invocation; it comes from orchestration, the runtime that coordinates multi-step workflows, enforces hard domain constraints, manages human approval, and bridges legacy systems. We develop this idea into a usable conceptual frame. We give an operational test for which workflows are orchestration-bound, a decomposition that separates how tangled a workflow is from how much of its effort is coordination and what that coordination is worth, and a feature-level account of why today's multi-agent frameworks leave a specific gap. We then advance our central claim: the right automation path is staged, and which architectural guarantee carries the most weight depends on a sector's dominant source of friction. Constraint enforcement is load-bearing under regulatory friction; explainability is load-bearing under liability friction. We close with the research program this view implies.

2606.19775 2026-06-19 cs.SI stat.AP stat.OT New submission

Rethinking Sampling Strategy in Link Prediction

Yilin Bi, Zhenyu Deng, Xinshan Jiao, Tao Zhou

AI summary: Proposes the β-sampling scheme to study how two-stage sampling affects link prediction performance; finds that the structural characteristics of missing links substantially affect prediction accuracy and that the second-stage sampling strategy is crucial.

Comments: 19 pages, 5 figures, 3 tables

Abstract

Many real-world networks are incomplete, making link prediction a fundamental challenge in network science. To train parameters and evaluate algorithms, observed links are usually divided into three subsets, namely training, validation, and probe sets. This division implicitly involves two sampling processes: first-stage sampling yields the probe set and second-stage sampling obtains the validation set. To date, our understanding of how these two sampling processes affect algorithm performance remains quite limited. To address this issue, we propose a sampling scheme called β-sampling, where the sampling probability of a link is proportional to the product of the degrees of its two endpoints raised to the power of β. Experiments on 45 real-world networks reveal that the structural characteristics of missing links, as simulated via varying probe sets, substantially impact prediction accuracy. When missing links tend to connect high-degree nodes, such links can be predicted accurately with ease. Furthermore, even with a fixed probe set, second-stage sampling still exerts a significant influence on prediction accuracy. Notably, the optimal second-stage sampling strategy differs from random sampling (which randomly selects links to form the validation set) and consistent sampling (which guarantees that links in the validation and probe sets share identical structural characteristics).
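
The sampling rule in the abstract, where a link (u, v) is drawn with probability proportional to (k_u k_v)^β, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the sequential sampling without replacement, and the use of degrees from the full observed graph (never recomputed as links are removed) are all assumptions.

```python
import random
from collections import Counter


def beta_sample_links(edges, beta, n_samples, seed=0):
    """Draw n_samples distinct links with P(link) proportional to (k_u * k_v) ** beta.

    Degrees are taken from the full edge list and are not updated while
    sampling (an assumption; the abstract does not specify this detail).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    rng = random.Random(seed)
    remaining = list(edges)
    weights = [(degree[u] * degree[v]) ** beta for u, v in remaining]

    chosen = []
    for _ in range(min(n_samples, len(remaining))):
        # Pick one index according to the current weights, then remove it
        # so the same link cannot be drawn twice.
        idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(idx))
        weights.pop(idx)
    return chosen
```

With `beta = 0` every link has weight 1, so the draw reduces to uniform random sampling; a positive `beta` favors links between high-degree nodes and a negative `beta` favors links between low-degree nodes.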

2606.19758 2026-06-19 cs.MA New submission

SIGMA: Skill-Incidence Graphs for Compositional Multi-Agent Design

Kun Zeng, Yu Huo, Siyu Zhang, Yuecheng Zhuo, Yuquan Lu, Haoyue Liu, Siyue Chen, Xiaoying Tang

AI summary: Proposes the SIGMA framework, which uses a skill-agent incidence graph to construct agents as task-conditioned bundles of reusable skills and decodes a communication topology over them; it outperforms baselines on six benchmarks and is robust to unseen skill libraries.

Comments: EMNLP2026

Abstract

Existing graph-based multi-agent system (MAS) designers mainly improve collaboration by optimizing communication topologies over predefined agents, roles, or groups. However, because each node remains a closed-set entity, these methods struggle to generalize to tasks that require unseen combinations of capabilities. We propose SIGMA, a skill-incidence graph framework that constructs agents as task-conditioned bundles of reusable skills. Given a task and a skill library, SIGMA predicts a skill-agent incidence matrix, composes agent node embeddings from selected skills, and decodes a communication topology over the constructed agents. During execution, skill-specific mailboxes route messages to the relevant assigned capabilities, making the incidence structure directly operational. Across six reasoning and coding benchmarks with three base LLMs, SIGMA achieves the best average performance and improves over CARD, the strongest non-compositional topology-based baseline, by 2.06, 2.36, and 1.75 points, respectively. It also shows stronger robustness to unseen skill libraries, with an average performance drop of only 0.96 points. These results suggest that compositional node construction is a complementary and important axis for multi-agent design beyond communication topology optimization. Code is available at https://anonymous.4open.science/r/SIGMA-2338/.

2606.19746 2026-06-19 cs.DC New submission

SAC: Disaggregated KV Cache System for Sparse Attention LLMs with CXL

Ruiyang Ma, Teng Ma, Junru Li, Hantian Zha, Xuchun Shang, Qingda Hu, Zheng Liu, Xinjun Yang, Tao Ma, Guojie Luo

AI summary: To address the bottleneck caused by transferring the full KV cache during long-context inference with sparse attention models, proposes SAC, a disaggregated cache system that uses CXL to fetch only the top-k KV entries on demand; compared with RDMA-based baselines it achieves 2.1x higher throughput and 9.7x lower TTFT.

Abstract

The scaling of LLMs toward long-context inference has shifted the primary serving system bottleneck from computation to memory capacity. Traditional solutions for dense attention models rely on RDMA-based disaggregated memory pools, which perform coarse-grained fetching of the entire prefix KV cache from remote storage to local memory before decoding. However, this approach is fundamentally inefficient for emerging sparse attention models. While only a small fraction of KV entries are active during decoding, these systems still fetch the full KV cache locally, leading to severe transmission bottlenecks and local memory wastage. To address this, we propose SAC, the first efficient disaggregated KV cache system optimized for sparse attention models. By leveraging the low-latency, cache-line granularity load/store semantics of Compute Express Link (CXL), SAC fetches only the required top-k KV entries on demand during inference. Evaluations on DeepSeek-V3.2 using SGLang show that SAC achieves 2.1x higher throughput, 9.7x lower TTFT, and 1.8x lower TBT compared to RDMA-based baselines, establishing CXL-based disaggregation as the superior infrastructure for emerging sparse attention models.
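
The contrast between fetching the whole prefix KV cache and fetching only the top-k entries can be made concrete with a toy calculation. The sketch below does not model CXL, RDMA, or SAC itself; it only selects the k highest-scoring entries and compares how many bytes each strategy would move. The function names, the per-entry score list, and the fixed entry size are hypothetical.

```python
import heapq


def top_k_indices(scores, k):
    """Return the indices of the k highest-scoring KV entries, in ascending order."""
    best = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return sorted(best)


def compare_fetch(scores, k, bytes_per_entry):
    """Compare bytes moved by a full-prefix fetch and by a top-k fetch.

    Returns (selected_indices, full_bytes, sparse_bytes).
    """
    selected = top_k_indices(scores, k)
    full_bytes = len(scores) * bytes_per_entry       # coarse-grained: every entry
    sparse_bytes = len(selected) * bytes_per_entry   # on demand: only selected entries
    return selected, full_bytes, sparse_bytes
```

Calling `compare_fetch(scores, k, bytes_per_entry)` on a long score list shows the transferred volume shrinking by a factor of roughly `len(scores) / k`, which is the kind of saving the abstract attributes to on-demand fetching.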

2606.19745 2026-06-19 cs.HC New submission

Designing for Interconnected Islamic Learning: A Qualitative Study of Muslim Women's Experiences with Qur'an, Hadith, and Seerah Apps

Ishrat Jahan Easha, Nabil Mosharraf Hossain, Araf Mohammad Mahbub, Fairoze Bint Abu Hassan, Zunaid Aslam, Yemin Sajid, Riasat Islam

AI summary: Through interviews with Muslim women, finds a tension around the separation of context when reading the Qur'an, Hadith, and Seerah in digital tools, and introduces the concept of layered contextuality, which stresses providing cross-text context only when it is trustworthy, optional, and does not interrupt reading.

Comments: 27 pages, 1 figure, 3 tables. Submitted to the International Journal of Human-Computer Interaction

Abstract

Islamic learning often depends on reading the Qur'an, Hadith, and Seerah together, yet digital tools typically separate these sources across apps, screens, and search pathways. We examine this as a human-computer interaction problem through five semi-structured interviews with Muslim women recruited from an online Islamic learning community. Participants described a recurring tension: they wanted Qur'an-Hadith-Seerah context at the point of reading, but only when contextual expansion remained trustworthy, optional, and did not interrupt reading. Interpreting the interviews through gendered digital religion, epistemic trust, and seamless learning, we identify five themes concerning contextual understanding, authenticity, interface clutter, study modes, and guidance features. We introduce layered contextuality as an HCI account of this domain: contextual expansion must be balanced with interpretive accountability, devotional flow, and continuity across devices and study intensities.

2606.19703 2026-06-19 cs.HC New submission

Vibe Coding for Visualization Implementation: An Empirical Study of Practices and Challenges

Zhengyu Sun, Xiaolin Wen, Fengjie Wang, Can Liu, Yi Lai, Christophe Hurter, Yong Wang

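AI summary: Through an empirical study with 16 participants, examines the practices (prompting, evaluation, iteration) and challenges of users who generate visualizations with AI-driven natural-language interaction tools.

Comments: 5 pages, 2 figures. Short paper under review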

2606.19692 2026-06-19 cs.CR cs.DB cs.IR New submission

When Global Gating Is Enough: Admission-Time Hubness Control in Anisotropic Vector Retrieval Systems

Prashant Kumar Pathak, Tarun Kumar Sharma

AI summary: To counter the poisoning risk that vector hubness creates in retrieval-augmented generation, proposes admission-time control that scores candidates against sentinel queries and quarantines hub-like documents; a global gate achieves high recall and a low false-positive rate across multiple datasets.

Abstract

Vector hubness, where a few points become nearest neighbors of many queries, creates a poisoning risk in retrieval-augmented generation (RAG): one injected document can influence unrelated requests. Existing defenses use periodic reverse-kNN scans, leaving an exposure window and repeated corpus-wide work. We study admission-time control, scoring each candidate against sentinel queries and quarantining hub-like documents before insertion. Across two 100,000-document corpora, five encoders, and disjoint attacker and defender query sets, a global gate achieves recall 1.0 at the decisive embedding-space point (>=0.92 across the effective range) and 0.91 +/- 0.07 on HotFlip attacks, with 1% false positives on general documents. A per-topic gate provides no reliable benefit, consistent with anisotropy coupling local and global visibility. Thresholds are maintained incrementally, with corpus-size-independent insertion cost and amortized deletion cost. On HNSW, admission adds about 3.1% to ingestion latency, scoring remains flat to 10^6 vectors, and 1.2% of decisions flip under approximate indexing, none involving attacks. Provenance complements the gate for natural or tight-domain hubs.
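
The idea of scoring a candidate against sentinel queries before insertion can be sketched as below. The abstract does not give the scoring function, so everything here is an assumption for illustration: cosine similarity, a hub score defined as the fraction of sentinel queries for which the candidate would enter the current top-k, and the 0.5 quarantine threshold. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def kth_best_similarity(query, corpus, k):
    """Similarity of the k-th nearest existing document to the query."""
    sims = sorted((cosine(query, doc) for doc in corpus), reverse=True)
    return sims[k - 1] if len(sims) >= k else -1.0


def hub_score(candidate, sentinels, kth_sims):
    """Fraction of sentinel queries whose top-k the candidate would enter."""
    hits = sum(
        1 for query, threshold in zip(sentinels, kth_sims)
        if cosine(query, candidate) > threshold
    )
    return hits / len(sentinels)


def admit(candidate, sentinels, kth_sims, max_hub_score=0.5):
    """Return True to insert the candidate, False to quarantine it."""
    return hub_score(candidate, sentinels, kth_sims) <= max_hub_score
```

In this sketch the per-sentinel thresholds `kth_sims` would be computed once with `kth_best_similarity` and reused for every candidate, so the admission check costs one similarity per sentinel query and does not scan the corpus. How the thresholds are kept current as documents are added or deleted is not covered here.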

2606.19689 2026-06-19 cs.HC New submission

Syndesmoscope: The Power of Invariant Plots Linked to Traditional Network Views

Matt Oddo, Indira Sowy, Stephen Kobourov, Tamara Munzner

AI summary: Presents the Syndesmoscope system, which combines invariant plots (such as kSnakes) with traditional network views and uses leapfrogging and hopscotching interactions to reveal network patterns that no single view can show.

Abstract

Traditional network representations, such as node-link views and adjacency matrices, can show dramatically different visual patterns, depending on the underlying layout or seriation algorithm. In contrast, invariant plots consistently surface the same visual pattern for the same input topology; yet researchers have underexplored them and have not integrated them into visualization systems. We present Syndesmoscope, an interactive system for network exploration that juxtaposes multiple views of the same network. Panes show a familiar force-directed view alongside three panes with interpretable geometric layouts based on graph-theoretic properties: dense-sparse gradient, geodesic eccentricity, and spectral bisection. As a secondary contribution, we introduce kSnakes, a new invariant plot based on density decomposition. Syndesmoscope supports two key interactions: leapfrogging, or linked highlighting between different and interpretable visual patterns; and hopscotching, or hop-based traversal that extends data selections through the underlying topology. Through usage scenarios across a corpus of 72 diverse networks, we demonstrate how these interactions reveal network patterns inaccessible through any single view alone. Live demo available at https://syndesmoscope.vercel.app/.
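
Of the graph-theoretic properties named in the abstract, geodesic eccentricity is the simplest to compute: a node's eccentricity is its largest shortest-path distance to any other node. The sketch below computes it with breadth-first search. It assumes a connected, undirected, unweighted graph given as an adjacency dictionary, and it says nothing about how Syndesmoscope turns these values into a layout; the function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def eccentricities(adj):
    """Map each node to its greatest hop distance to any other node.

    Assumes the graph is connected; in a disconnected graph this only
    considers nodes reachable from each source.
    """
    return {node: max(bfs_distances(adj, node).values()) for node in adj}
```

Because eccentricity depends only on the topology, relabeling or repositioning the nodes leaves the values unchanged, which is the sense in which a plot built on such a property is invariant.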

2606.19686 2026-06-19 cs.PL New submission

Effect Systems as Abstract Interpretations

Colin S. Gordon

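AI summary: By embedding effect quantales into abstract domains and recovering the general form of effect quantales in terms of event occurrences, this paper establishes a formal relationship between abstract interpretation and general effect systems.

Comments: Draft short paper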

2606.19680 2026-06-19 cs.CE New submission

ImProNCDE: Impulse-Corrected Neural Controlled Differential Equations with Prototype Learning for Longitudinal Prognosis Prediction

Hao Wang, Yupeng Xu, Jinghao Lin, Shuchang Ye, Yige Peng, Jinman Kim, Kun Liu, Lei Bi

AI summary: Proposes the ImProNCDE framework, which captures abrupt pathological changes through Residual Impulse Calibration and reduces long-horizon error accumulation with a Prototype-guided Trajectory Stabilizer; it outperforms existing methods on longitudinal ophthalmic prognosis prediction.

Comments: 12 pages, 5 figures

Abstract

Longitudinal ophthalmic imaging analysis is an essential step for prognosis prediction in ophthalmic diseases. However, AI-assisted prognosis models are challenged by follow-up sequences, which tend to be sparse, irregularly sampled, and incomplete. Advanced prognosis modeling methods, especially those based on neural controlled differential equations (NCDEs), provide a principled continuous-time framework for sparse and irregular longitudinal data, but two major concerns remain unsolved in clinical follow-up modeling. First, the smooth latent dynamics of standard NCDEs are poorly matched to abrupt pathological changes induced by therapeutic intervention, lesion recurrence, or long follow-up gaps. Second, numerical integration over long horizons can accumulate errors, producing unstable latent trajectories and weakened class discrimination. To address these challenges, we propose ImProNCDE, an impulse-corrected NCDE framework with prototype learning for longitudinal ophthalmic prognosis prediction. To capture abrupt pathological changes beyond smooth latent dynamics, ImProNCDE introduces Residual Impulse Calibration (RIC), which injects residual-based impulse corrections at visit times and then recalibrates the latent state when observations deviate from continuous predictions. To further mitigate error accumulation over long horizons, we introduce a Prototype-guided Trajectory Stabilizer (PTS), which aims to attract latent trajectories toward learnable prognosis prototypes to reduce class overlap and ultimately improve long-horizon stability. Experiments on multiple private and public longitudinal ophthalmic datasets (totalling over 1206 samples) show that ImProNCDE outperforms existing SOTA methods focusing on sequence modeling.


2606.19654 2026-06-19 cs.CR cs.SE new submission

PUFFERDOS: Efficient and Effective Attack String Generation for Regular Expression Denial of Service Vulnerabilities

Shangzhi Xu, Ziqi Ding, Xiao Cheng, Yuekang Li, Nan Sun, Benjamin Turnbull, Shuangxiang Kan, Siqi Ma

AI summary: Proposes PUFFERDOS, which defines three vulnerable patterns and combines a synthesis technique with compositional concolic execution to generate ReDoS attack strings that fit realistic length budgets and are validated at the program level.

Comments Accepted by S&P'26

Abstract

ReDoS attacks constitute a critical class of resource-exhaustion vulnerabilities. In such attacks, adversaries exploit the pathological worst-case execution behavior of regular expression (regex) engines to induce highly asymmetric computational workloads, ultimately exhausting system resources and degrading service availability. To protect systems against ReDoS attacks, numerous detection techniques have been proposed that simulate the attack process by generating attack strings to proactively exploit ReDoS vulnerabilities at the early development stage and facilitate remediation. Existing techniques broadly fall into two classes: static analyses that search for pathological regex structures, and dynamic exploration methods that synthesize candidate attack strings. However, the generated attack strings are often impractical for real-world exploitation because they usually assume unrealistic input-length budgets and do not validate the effectiveness and efficiency of the attack at the program level. As a result, many generated strings fail to trigger vulnerable regexes when applied to real-world programs, further limiting their practical utility. To address these shortcomings, we introduce an effective and efficient attack string generator, PUFFERDOS, designed to synthesize attack inputs that are both feasible within realistic length budgets and validated at the program level, enabling effective exploitation of ReDoS vulnerabilities in real-world programs. Specifically, we first define three vulnerable patterns based on our observations and formal verification. Following these patterns, PUFFERDOS applies a synthesis technique to generate attack strings, and then refines and validates the strings with ReDoS-specific compositional concolic execution to guarantee real-world exploitability.
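To make the "highly asymmetric workload" concrete, here is a minimal sketch of why a short input can be expensive for a backtracking engine. It is not PUFFERDOS and not any real regex engine: it hand-simulates a naive backtracking matcher for the textbook pattern `^(a+)+$` (chosen here as an assumption, not taken from the paper) and counts recursive calls. `backtracking_steps` is a hypothetical helper.

```python
def backtracking_steps(text):
    """Simulate naive backtracking for ^(a+)+$ and count recursive calls.

    Returns (matched, steps). Illustrative model only: real engines differ
    in details, but nested quantifiers show the same exponential blow-up.
    """
    steps = 0

    def rest(i):
        # Try to match zero or more further a+ groups, then end of input.
        nonlocal steps
        steps += 1
        if i == len(text):
            return True
        # Find the longest run of 'a' starting at i.
        j = i
        while j < len(text) and text[j] == "a":
            j += 1
        # Greedy first, then backtrack to shorter groups.
        for end in range(j, i, -1):
            if rest(end):
                return True
        return False

    # (a+)+ needs at least one group, so the input must start with 'a'.
    matched = len(text) > 0 and text[0] == "a" and rest(0)
    return matched, steps


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, backtracking_steps("a" * n), backtracking_steps("a" * n + "!"))
```

In this model an input of n `a` characters followed by `!` costs 2^n calls, while the matching input of n `a` characters costs 2, which is why the length budget of an attack string matters.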


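2606.19644 2026-06-19 cs.SE new submission

Prompt Quality and Pull Request Outcomes: A Stage-Based Empirical Study of LLM-Assisted Development

Richard Sserunjogi, Daniel Ogenrwot, John Businge

AI summary: Analyzes 265 developer-ChatGPT interactions to study how prompt structure (Context, Specificity, Verification) affects code generation, adoption, and integration depth in LLM-assisted development, finding that each dimension matters at a different stage.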

Comments 48 pages, 2 figures

Abstract

Large language model (LLM)-powered tools such as ChatGPT are increasingly used in collaborative software engineering workflows, yet little is known about how prompt structure influences downstream pull request (PR) outcomes. Prior studies primarily examine conversational helpfulness, productivity, or coarse-grained adoption metrics, leaving the role of prompt structure in collaborative integration behavior insufficiently understood. We analyze 265 manually validated developer-ChatGPT interactions derived from self-admitted ChatGPT usage in open-source pull requests. Building on prior research on developer-facing artifacts and prompt engineering, we operationalize prompt structure using three dimensions: Context, Specificity, and Verification. We first evaluate whether LLM-assisted annotation can reliably reproduce human judgments of prompt structure, finding substantial variation across dimensions and workflow contexts. Specificity shows the most stable agreement with human judgments; Context is systematically under-scored by the LLM; and Verification remains difficult to assess consistently, motivating a hybrid human-LLM annotation strategy. Using this validated framework, we then examine how prompt structure influences actionable code generation, code adoption, and integration depth across AI-assisted PR workflows. Specificity and Context are most strongly associated with actionable code generation; Verification emerges as the primary predictor of code adoption; and integration depth is most strongly associated with Context. Overall, our findings show that prompt characteristics exert distinct, stage-dependent effects across AI-assisted software engineering workflows, influencing downstream adoption and integration through contextual grounding, task specificity, and evaluability cues.


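2606.19620 2026-06-19 cs.CR new submission

G-Lox: Group-Adaptive, Privacy-Preserving Bridge Distribution with Two-Party Computation

Baigang Chen, Nicholas Hopper

AI summary: Proposes G-Lox, a bridge-distribution system that uses secure two-party computation to provide hidden group-level adaptive assignment while preserving distributor blindness, supporting blockage reporting, transport-aware reassignment, and privacy-preserving group splitting.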

Abstract

We present G-Lox (group-adaptive Lox), a bridge-distribution system that preserves Lox-style distributor blindness while enabling hidden, stateful group-level adaptation. G-Lox places adaptive assignment logic behind a two-server privacy wall, so no single server learns group identifiers or group-to-bridge assignments. Private state access and state-dependent updates use two-server DPF/FSS protocols and secure two-party computation, supporting blockage reporting, transport-aware reassignment, and privacy-preserving group splitting. We evaluate G-Lox through system measurements and policy simulation. In our C++/EMP implementation over real TCP sockets, private state access has low client-visible overhead: across state sizes up to 2^16, communication remains in the low-KiB range per iteration. At M=1024, the client sends 1,968 bytes, receives 1,280 bytes, and completes an iteration in about 0.25 s. Simulations with group-specific blocking and Sybil enumeration show that G-Lox improves robustness over Lox- and rBridge-like baselines among systems that maintain broad issuance.


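2606.19618 2026-06-19 cs.GT new submission

Joint-task truthfulness of the DMI mechanism

Rafael Frongillo

AI summary: Studies the truthfulness of the DMI mechanism under joint-task strategies, proving that truthful reporting remains a Bayes-Nash equilibrium when other agents use consistent strategies, but that both dominant truthfulness and informed truthfulness fail without restrictions.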

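2606.19609 2026-06-19 cs.HC cs.GR new submission

Building Drift: Documenting On-Site Construction Adaptations Across Material Lifecycles

Ritik Batra, Martin Tamke, Tom Svilans, Jan Hüls, Amritansh Kwatra, Steven J. Jackson, Thijs Roumen, Mette Ramsgaard Thomsen

AI summary: Introduces the concept of "building drift," builds a taxonomy through a case study, and develops Pentimento, a tool that uses video and 3D Gaussian Splatting to document on-site adaptations and support the circular reuse of reclaimed materials.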

Comments In submission

Abstract

In a circular economy for construction, reclaimed materials carry prior lives of use and go on to have post-lives in future buildings. Yet working with such materials introduces unpredictability that requires on-site improvisation, making their reuse challenging to document and scale across building lifetimes. Without documentation, the on-site adaptations that make construction with reclaimed materials possible leave collaborators, evaluators, and inheritors without the information they need to continue, assess, and reuse materials. We call the collective deviation of the physical state from the digital model through these adaptations "building drift." Through a case study, ReShelter, a reclaimed timber pavilion constructed in the forest, we develop a taxonomy for building drift that characterizes the collective deviation across building lifetimes: Tending the Site, Foraging for Fit, Interpreting the Material, Marking Measurements, and Coordinating Across Communities. To put our taxonomy for building drift into practice, we present Pentimento, a documentation tool that leverages video documentation and 3D Gaussian Splatting to spatially, temporally, and semantically represent on-site adaptations in relation to the designed model. Pentimento enables each stakeholder to navigate material histories in ways that reduce barriers to material reuse. Together, these contributions open pathways towards computational tools that support the on-site improvisation essential to construction with reclaimed materials, enabling more sustainable cycles of recovery, repair, and reuse.


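2606.19599 2026-06-19 eess.SY cs.SY econ.EM new submission

Ramping Procurement and Bid-Cost Recovery in Real-Time Market

Cong Chen, Valentina Norambuena, Lang Tong

AI summary: Studies ramping procurement co-optimized with economic dispatch under net-demand uncertainty, analyzes single-interval and multi-interval co-optimization designs, proposes analytical frameworks for evaluating generator profits, consumer payments, bid-cost recovery, and operational efficiency, and compares three pricing mechanisms.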

Comments 4 figures

Abstract

We study ramping procurement co-optimized with economic dispatch under net-demand uncertainty. We examine two flexible ramp product designs implemented by grid operators: single-interval and multi-interval co-optimization. Both rely on rolling-window stochastic optimization with binding and advisory interval decisions. We develop analytical frameworks to evaluate generator profits, consumer payments, bid cost recovery (BCR), and operational efficiency. In particular, net-demand uncertainty may lead to generator under-compensation, requiring discriminatory BCR. While operational efficiency is invariant to energy and ramp prices, producer profits and consumer payments depend critically on pricing. We examine locational marginal pricing (LMP) and two uniform pricing schemes: maximum dispatch cost pricing (MDCP) and maximum temporal locational marginal pricing (MTLMP). With out-of-market BCR, LMP yields discriminatory energy prices, whereas MDCP eliminates BCR and MTLMP does so in most cases. This property enables us to establish truthful bidding incentives for price-taking generators under MDCP. Our analysis highlights trade-offs between single- and multi-interval co-optimization and pricing designs: single-interval energy-ramp co-optimization is advantageous under high forecast uncertainty and moderate ramping requirements, whereas multi-interval co-optimization is superior when net-demand forecasts are relatively accurate and ramp needs are challenging. Empirical results on CAISO and ERCOT data show that MDCP and MTLMP increase producer profits with negligible BCR, albeit at the expense of higher consumer payments relative to LMP.


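2606.19576 2026-06-19 cs.DB cs.DC new submission

REMOP: REmote-Memory-aware OPerator Optimization

Shiquan Zhang, Yunhao Mao, Yuqiu Zhang, Gengrui Zhang, Jeyhun Karimov, Hans-Arno Jacobsen

AI summary: To address excessive data-transfer rounds in query processing over remote memory, proposes the REMOP framework, which optimizes out-of-memory execution with transfer-round-aware intra-operator memory policies; implemented for three operators in DuckDB, it reduces transfer rounds by up to 97% and operator runtime by up to 48%.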

Comments 14 pages, 13 figures, 9 tables. Preprint, under review

Abstract

Remote and disaggregated memory tiers expand the effective memory capacity of analytical database engines, but they also reshape the cost structure of out-of-memory query processing. When an operator spills beyond local DRAM, moving pages to remote memory incurs both data-transfer time and a fixed round-trip latency per transfer. Classical operator analyses and buffer-allocation heuristics primarily target disk spilling by minimizing total I/O volume. Under remote memory, these strategies can be suboptimal because they may trigger excessive transfer rounds. We present REMOP, a remote-memory-aware operator optimization framework that uses transfer-round-aware intra-operator memory policies to improve out-of-memory execution under tight memory budgets. REMOP introduces the number of transfer rounds into the latency cost model and derives operator-specific buffer-partitioning strategies, instantiating the approach for blocked nested-loop join, external merge sort, and external hash join in DuckDB. Our evaluation on a two-node compute-memory testbed shows that REMOP reduces transfer rounds by up to 97% and operator runtime by up to 48% on spill-heavy microbenchmarks, and lowers the average runtime of spilling TPC-H and TPC-DS queries by 22.7% and 26.4% end-to-end.
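The trade-off described above, total I/O volume versus number of transfer rounds, can be sketched with a toy cost model for a blocked nested-loop join. This is an illustration under stated assumptions, not REMOP's actual model or its DuckDB code: latency is taken as `rounds * rtt_us + pages * page_us`, each buffer fill counts as one transfer round, and all names and numbers (`bnlj_cost`, `best_split`, a 5 µs round trip, 0.4 µs per page) are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def bnlj_cost(outer_pages, inner_pages, outer_buf, inner_buf, rtt_us, page_us):
    """Toy cost of a blocked nested-loop join that spills to remote memory.

    Assumption: every buffer fill is one transfer round costing a fixed
    round trip (rtt_us) plus page_us per page moved.
    Returns (rounds, pages_moved, latency_us).
    """
    blocks = ceil_div(outer_pages, outer_buf)       # outer-block loads
    inner_fills = ceil_div(inner_pages, inner_buf)  # inner fills per block
    rounds = blocks + blocks * inner_fills
    pages_moved = outer_pages + blocks * inner_pages
    latency_us = rounds * rtt_us + pages_moved * page_us
    return rounds, pages_moved, latency_us


def best_split(outer_pages, inner_pages, budget, rtt_us, page_us):
    """Pick the outer-buffer size (in pages) that minimizes modeled latency."""
    return min(
        range(1, budget),
        key=lambda outer_buf: bnlj_cost(
            outer_pages, inner_pages, outer_buf, budget - outer_buf,
            rtt_us, page_us,
        )[2],
    )


if __name__ == "__main__":
    # Volume-minimizing split: almost all of the budget goes to the outer side.
    print(bnlj_cost(1000, 1000, 99, 1, rtt_us=5.0, page_us=0.4))
    # Round-aware split under the same 100-page budget.
    b = best_split(1000, 1000, 100, rtt_us=5.0, page_us=0.4)
    print(b, bnlj_cost(1000, 1000, b, 100 - b, rtt_us=5.0, page_us=0.4))
```

With these numbers, giving almost all buffer pages to the outer side minimizes volume (12,000 pages) but needs 11,011 rounds. Once the per-round latency is nonzero, a more balanced split moves more pages in far fewer rounds and has lower modeled latency.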


2606.19570 2026-06-19 cs.HC new submission

Code as Anchor, Memory and Metaphor as Support: Learner Experiences with Multi-View Visualizations

Naaz Sibia, Jessica Wen, Amber Richardson, Yashika Jain, Khushi Malik, Bogdan Simion, Carolina Nobre, Angela Zavaleta Bernuy, Andrew Petersen, Michael Liut

AI summary: Uses gaze tracking and interviews to study how 19 CS1/CS2 students behave in a multi-view visualization tool, finding that students focus mainly on code and neglect metaphor views, shaped by agency, representational fit, and legitimacy.

Comments Pre-Print of a paper to be published at the International Computing Education Research (ICER) conference 2026

DOI: 10.1145/3765964.3811662

Abstract

Program visualizations are widely used to support novice programmers, yet students often ignore or resist well-designed visual scaffolds. Research on multiple external representations (MERs) offers cognitive design principles for coordinating views, but less is known about what shapes learners' engagement with available representations. We conducted a within-subjects study with 19 undergraduates who had completed CS1 and CS2. Students completed think-aloud tasks, reflective interviews, and webcam-based gaze tracking while using a multi-representational probe with synchronized code, memory, and metaphor views, and Python Tutor, across scope, while loops, and linked lists. Gaze analysis showed that students spent nearly half their time focused on code despite available visual scaffolds. Students without prior experience anchored even more heavily in code and engaged minimally with metaphor views. Interviews identified three factors shaping selective engagement: agency, as students sought control over cognitive effort rather than simply having it reduced; representational fit, as identical designs differed in whether they felt helpful or overwhelming; and legitimacy, as some students avoided metaphorical scaffolds they perceived as childish or insufficiently rigorous for university-level work. These findings suggest that multi-representational tools in computing education require attention to affective and social factors alongside cognitive design. Practical considerations include positioning visualizations as verification instruments, offering toggleable abstraction levels, and framing tools to signal disciplinary legitimacy. More broadly, the themes help explain why cognitively sound visualization tools may fail to engage the students they are designed to help.


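2606.19556 2026-06-19 cs.CE new submission

A hybrid sharp-diffuse interface approach to accurately model melt pool dynamics with rapid evaporation in laser-based processing of metals

Nils Much, Andreas Koch, Christoph Meier, Magdalena Schreter-Fleischhacker

AI summary: Proposes a hybrid sharp-diffuse interface approach that combines a sharp-interface heat transfer model with a diffuse-interface multi-phase flow model to accurately simulate evaporation-driven melt pool thermo-hydrodynamics in laser-based processing, with accuracy one order of magnitude higher than a purely diffuse-interface model.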

Journal ref Computer Methods in Applied Mechanics and Engineering 457, 119023, 2026

DOI: 10.1016/j.cma.2026.119023

Abstract

Predictive simulation of melt pool dynamics in laser-based processing of metals, e.g., laser beam welding or laser powder bed fusion additive manufacturing, requires accurate resolution of thermo-hydrodynamic interactions at the melt-gas interface. Here, evaporation-induced recoil pressure and temperature-dependent surface tension govern the flow. Because these mechanisms depend sensitively, often exponentially, on the interface temperature, reliable predictions demand highly accurate heat transfer models. Popular diffuse-interface formulations smear the extreme thermal gradients typical of laser-metal interactions, leading to interface temperature errors that critically degrade the accuracy of interface force predictions and melt pool dynamics. We present a hybrid sharp-diffuse interface approach for high-fidelity modelling of melt pool thermo-hydrodynamics with rapid evaporation. The heat transfer problem is represented using a sharp-interface unfitted finite element (CutFEM) formulation, enabling accurate prediction of the temperature field. The multi-phase flow problem, characterized by large density ratios and complex interface dynamics, is accurately captured using a robust level-set-based one-fluid diffuse-interface finite element formulation. Consistent coupling is achieved by extending the sharp-interface temperature into a narrow interface region to evaluate temperature-dependent interface forces within the diffuse-interface flow framework. In practically relevant benchmarks, the sharp-interface thermal model exhibits second-order spatial convergence, enabling finite element sizes two orders of magnitude larger than the diffuse-interface approach for 1% accuracy. In a novel coupled thermo-hydrodynamic benchmark representative of laser-metal interactions, the hybrid approach is one order of magnitude more accurate than a purely diffuse-interface model on the same mesh. Robu


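2606.19537 2026-06-19 cs.MA cs.DC new submission

Mesh Inference: A Formal Model of Collective Intelligence Without a Center

Hongwei Xu

AI summary: Proposes a formal model of mesh inference in which center-free multi-agent collaborative inference is realized through a coupled free energy, proves unique convergence, identification completeness, and the observation-only property, and analyzes the latency cost in the linear-Gaussian case.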

Comments 21 pages, 2 figures

Abstract

We present a formal model of mesh inference: how a population of independent agents, each holding private state and exchanging only admitted, typed observations, derives a conclusion none of them holds alone, with no central coordinator and no agent exposed. No agent shares weights, gradients, or hidden state, and the agents may span different teams, networks, and organizations. Motivated by the observation that asking a model is energy-minimizing inference, we model the mesh as a coupled free energy that each agent relaxes locally. We show that a single admission/emission policy governs three properties. First, mesh inference converges to a unique answer for any admission, symmetric or not, because the coupling is always an M-matrix. Second, it is identification-complete: it derives the centralized optimum exactly when the contributing views are carrier-connected. Third, it is observation-only: no node transmits its internals, and confidentiality is the dual of identification. Content-addressed lineage is the only global side-channel. In the linear-Gaussian regime every derived answer is determined, hence equal to the centralized optimum, at O(diam^2) latency, the measured price of removing the center. One such derivation is one turn of a center-free learning loop, which we formalize as architecture rather than prove. The open problem we state is when asking improves the collective rather than corrupting it: whether the non-linear closure derives an upgraded answer or a confident error. To our knowledge, this is the first formal model of mesh inference.


2606.19532 2026-06-19 cs.LO new submission

Vancomycert: A Certified Neuro-Symbolic Drug Delivery System (Case Study)

Alistair Sirman, Fleur Conway, Jessica Ciupa, Gusts Gustavs Grīnbergs, Ekaterina Komendantskaya, Thai Son Hoang, Michael Rawson, Alessandro Bruni, Vaishak Belle, Michael John Williams

AI summary: Addresses the formal verification of a neural network controller for antibiotic dosing, proposing an approach that combines supervised learning with theorem proving to ensure that automated dosing never exceeds the supratherapeutic bound over an infinite horizon.

Abstract

Neural network controllers for autonomous decision-making are well-established in cyber-physical systems, yet their deployment in safety-critical healthcare settings remains largely unverified. This paper presents a methodology and case study for the formal verification of a neural network controller for antibiotic dosing, motivated by the challenge of systems that must be simultaneously adaptive and provably safe across unbounded time horizons. We construct a simplified yet clinically-interpretable model that tracks drug concentration, body temperature, and white blood cell count. Vancomycin is selected as a representative antibiotic, widely prescribed for severe infections yet carrying a narrow therapeutic window, where supratherapeutic concentrations risk nephrotoxicity and subtherapeutic dosing risks treatment failure. A supervised neural network controller is trained on synthetic clinician-style dosing data. We establish formal verification of input-output safety properties, specifically verifying a property of a neural network that implies an infinite-horizon proof that automated dosing never exceeds the supratherapeutic boundary. This system property is proven in Rocq using the Vehicle interactive theorem prover back-end to integrate the different proof systems. The end result is a verification pipeline that allows for a wide variety of treatment approaches whilst maintaining safety for each specific patient.


2606.19529 2026-06-19 cs.DC new submission

The Sheaf Laplacian: A Topological Framework for Data Fusion and Consensus in Distributed Sensing Networks

Manuel Hernández, Eduardo Sánchez-Soto

AI summary: Proposes sheaf theory as an alternative to traditional graph models, using the sheaf Laplacian for data fusion and consensus in heterogeneous distributed sensing networks.

2606.19526 2026-06-19 cs.AR new submission

SPINE: A Fault Injection Profiler for Quantized Neural Networks under Accumulated Faults

Nathan Guimarães, Ian Kersz, Leonardo R. Gobatto, Fabio Benevenuti, Michael G. Jordan, Antonio Carlos S. Beck, Fernanda L. Kastensmidt, Jose Rodrigo Azambuja

AI summary: Proposes SPINE, a GDB-driven profiling framework that injects accumulated weight bit flips into target binaries on edge CPUs to produce per-layer fault signatures, requiring no retraining or code modification and guiding selective hardening strategies.

Comments ACM/IEEE/SBC/SBMICRO Symposium on Integrated Circuits and Systems Design 2026
