arXivDaily: arXiv daily academic digest, updated Monday to Friday
All subject categories: 1971
2606.18633 2026-06-18 cs.MA New submission

PersonalPlan: Planning Multi-Agent Systems for Personalized Programming Learning

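Zhiyuan Wen, Jiannong Cao, Peng Gao, Haochen Shi, Wengpan Kuan, Bo Yuan, Xiuxiu Qi

AI summary: Proposes PersonalPlan, a two-stage multi-agent planner that uses hierarchical SFT and Reward-Adaptive GRPO to generate executable, personalized, and pedagogically scaffolded plans, outperforming existing methods on the MAP-PPL dataset.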

Abstract

Effective programming education requires personalized instruction adapted to diverse learner backgrounds. However, while LLM-based multi-agent systems (MAS) excel at complex planning, existing planners often lack profile-grounding and pedagogical scaffolding, thereby undermining personalized programming learning. To fill in the gap, we first introduce MAP-PPL (Multi-Agent Plans for Personalized Programming Learning), a profile-conditioned multi-agent planning dataset with 3,043 query-profile-plan instances from 1,730 Stack Overflow question groups and 2,738 learner profiles. Each plan specifies agents, subtasks, executable steps, and prerequisite dependencies. Then, we propose PersonalPlan, a two-stage MAS planner that first performs hierarchical SFT with separate LoRA adapters for profile-aware task decomposition and step dependency planning, then applies a Reward-Adaptive GRPO to encourage the model to generate executable, personalized, and pedagogically scaffolded plans. Extensive experiments on MAP-PPL comparing PersonalPlan against frontier LLMs, generic MAS frameworks, and agentic planners demonstrate its superiority. With only 8B and 32B variants, PersonalPlan achieves state-of-the-art plan executability, personalization, and pedagogical quality, effectively orchestrating MAS for agent-student interactions.
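The abstract says each plan specifies agents, subtasks, executable steps, and prerequisite dependencies. A minimal sketch of what such a plan record might look like, and of one way to check that its dependencies admit a valid execution order, is given below. The field names, the example agents, and the `execution_order` helper are all hypothetical illustrations; they are not the MAP-PPL schema or the PersonalPlan method.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class Step:
    """One executable step of a plan. All field names are hypothetical."""
    step_id: str
    agent: str
    subtask: str
    action: str
    depends_on: list[str] = field(default_factory=list)


def execution_order(steps: list[Step]) -> list[str]:
    """Return step ids in an order that respects prerequisites.

    Raises ValueError if a prerequisite is unknown or the dependencies are cyclic,
    which is one simple notion of a plan that cannot be executed.
    """
    known = {s.step_id for s in steps}
    for s in steps:
        missing = [d for d in s.depends_on if d not in known]
        if missing:
            raise ValueError(f"{s.step_id} depends on unknown steps: {missing}")
    # TopologicalSorter expects a mapping from node to its predecessors.
    graph = {s.step_id: set(s.depends_on) for s in steps}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError("cyclic prerequisites") from err


# Illustrative plan for a learner asking about an off-by-one error.
plan = [
    Step("s1", "diagnoser", "locate the bug", "reproduce the IndexError"),
    Step("s2", "explainer", "teach the concept", "explain zero-based indexing", ["s1"]),
    Step("s3", "coach", "guided practice", "have the learner fix the loop bound", ["s2"]),
]
order = execution_order(plan)
print(order)
```

A dependency check of this kind only covers structural executability; personalization and pedagogical quality, which the paper also evaluates, are outside what this sketch models.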

2606.18600 2026-06-18 cs.DC New submission

ShuntServe: Cost-Efficient LLM Serving on Heterogeneous Spot GPU Clusters

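Seungwoo Jeong, Moohyun Song, Juhyun Park, Kyungyong Lee

AI summary: Proposes ShuntServe, a system that estimates performance with a roofline model and optimizes model placement with dynamic programming to maximize throughput on heterogeneous spot GPU clusters, and that combines output-preserving migration with a shared tensor store for fault tolerance; it achieves up to 1.42x higher throughput than baselines and up to 31.9% better cost efficiency than on-demand instances.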

Comments 18 pages, 16 figures, 5 tables

Abstract

As large language model (LLM) services become widely adopted, the cost of GPU resources for serving these models in cloud environments has emerged as a critical concern. Spot instances offer up to 90% cost savings over on-demand instances, but their frequent interruptions and limited availability pose significant challenges for continuous LLM serving. GPU spot instances, in particular, exhibit lower and more volatile availability than CPU-based instances, making homogeneous clusters that depend on a single GPU type vulnerable to correlated failures. Heterogeneous clusters spanning multiple GPU types can address this by leveraging complementary availability patterns across diverse spot pools, yet existing LLM serving systems are designed for homogeneous environments and suffer from load imbalance when deployed on heterogeneous GPUs. This paper presents ShuntServe, a cost-efficient LLM serving system for heterogeneous spot GPU clusters. ShuntServe employs a roofline model-based analytical serving performance estimator and a dynamic programming-based model placement optimizer that jointly determines node configuration, parallelization strategy, and layer assignment to maximize throughput across heterogeneous GPUs. To enhance fault tolerance when using spot instances, ShuntServe combines output-preserving request migration with concurrent initialization via a shared tensor store, minimizing migration downtime by overlapping replacement node preparation with ongoing serving. Evaluation on Llama-3.1-70B and Qwen3-32B with a heterogeneous AWS cluster of L4, A10G, and L40S GPUs shows that ShuntServe achieves 1.42x and 1.35x higher throughput than state-of-the-art baselines and attains 31.9% and 31.2% cost efficiency improvements over on-demand instances for offline and online serving, respectively.
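The abstract mentions a roofline model-based performance estimator. As a rough illustration of the general roofline idea only (not ShuntServe's estimator), attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity. The device names and numbers below are made-up placeholders, not vendor specifications for any real GPU.

```python
def roofline_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Attainable FLOP/s under the basic roofline model.

    peak_flops:    peak compute throughput in FLOP/s
    mem_bandwidth: memory bandwidth in bytes/s
    intensity:     arithmetic intensity in FLOP per byte moved
    """
    return min(peak_flops, mem_bandwidth * intensity)


# Hypothetical devices; the figures are placeholders chosen for round numbers.
devices = {
    "gpu_small": {"peak_flops": 100e12, "mem_bandwidth": 300e9},
    "gpu_large": {"peak_flops": 300e12, "mem_bandwidth": 900e9},
}

# A low-intensity workload and a high-intensity workload (values are illustrative).
for name, spec in devices.items():
    for intensity in (2.0, 1000.0):
        attainable = roofline_flops(spec["peak_flops"], spec["mem_bandwidth"], intensity)
        bound = "memory" if attainable < spec["peak_flops"] else "compute"
        print(f"{name}: intensity={intensity:g} FLOP/B -> {attainable:.3g} FLOP/s ({bound}-bound)")
```

In this toy setup both devices are memory-bound at low intensity and compute-bound at high intensity, which is the kind of per-device difference a placement optimizer would need to account for when assigning layers across heterogeneous GPUs.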

2606.18593 2026-06-18 cs.HC cs.CY New submission

"The New Era of Tech-Enabled Traceability": Tensions between the FDA's Data Governance Vision and the Lived Realities of Food Producers

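Soonho Kwon, Catherine Wieczorek, Heidi Biggs, Shellye Suttles, Tammi S. Etheridge, Annabel Rothschild, Shaowen Bardzell

AI summary: Examines how the U.S. FDA's Food Traceability Rule turns agri-food stakeholders into data laborers, analyzing 1,198 public comments to reveal three key tensions around data collection, infrastructure, and cultural practices.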

Abstract

The U.S. Food and Drug Administration (FDA)'s Food Traceability Rule requires agri-food supply chain stakeholders (hereafter "stakeholders"), including farmers, fishers, retail workers, and others, to maintain detailed tracking records beginning in January 2026. Through this Rule, the FDA envisions a "New Era of Tech-Enabled Traceability," in which standardized, harmonized tracking data serve as a foundational public health infrastructure, enabling more rapid identification and removal of potentially contaminated food and ultimately reducing the risk of foodborne illness. Despite this promising vision, we observe that the Rule reconfigures agri-food stakeholders into data laborers by mandating stringent data collection, formatting, and reporting requirements. In this paper, we examine the tensions and burdens that arise from such reconfiguration. Leveraging Data Feminism as an orientation to attend to how data-driven policy implementation disproportionately burdens smaller, under-resourced stakeholders who lack the infrastructural and financial capacity to comply, we analyze 1,198 public comments submitted to Regulations.gov in response to the proposed Rule. Our qualitative document analysis reveals three key tensions: (1) the individual labor, financial, and educational burdens stakeholders experience as they are reconfigured into data workers; (2) moments where data tracking becomes infeasible due to infrastructural limitations, cultural contexts, and situated production practices; and (3) instances where the Rule's intended flexibility instead introduces confusion and burden due to its ambiguity.

2606.18569 2026-06-18 cs.CG New submission

Tangent Spheres and Integer Distances

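David Eppstein

AI summary: Generalizes the Erdős-Anning theorem to hyperbolic space of any dimension and gives the first quantitative bound on the size of integer-distance point sets in Euclidean spaces of dimension greater than two; the proofs rest on a biclique lemma for tangency graphs of spheres.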

Comments 6 pages, 4 figures. To appear at the Canadian Conference on Computational Geometry (CCCG 2026)

Abstract

The Erdős-Anning theorem states that any point set for which all distances are integers, in a Euclidean space of any dimension, must be either finite or collinear. We prove the same result in hyperbolic space of any dimension. A quantitative form of our result also extends for the first time to Euclidean spaces of dimension greater than two: if a set of points with integer distances in $\mathbb{E}^D$ or $\mathbb{H}^D$ has a subset of $D+1$ points in general position whose diameter is $d$, then the whole set has size $O(D(d+1)^D)$. To prove these results we formulate a lemma that, if the graph of external tangencies of a system of spheres in Euclidean or hyperbolic space contains a $K_{a,b}$ subgraph for $a,b\ge 3$, then the sets of spheres on each side of this biclique have centers that lie on a hyperplane. This lemma also implies that, in multilateration (determining a position from differences of distances to known landmarks), $D+1$ non-coplanar landmarks always suffice to limit the position to two possibilities.
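To make the notion of an integer-distance point set concrete, the sketch below checks whether every pairwise Euclidean distance in a small set of integer-coordinate points is an integer. It only illustrates the definition used in the Erdős-Anning theorem; the function name and the example point sets are illustrative choices, and nothing here reproduces the paper's proofs or bounds.

```python
from itertools import combinations
from math import isqrt


def all_distances_integer(points):
    """True if every pairwise Euclidean distance is an integer.

    Assumes integer coordinates, so a distance is an integer exactly when
    the squared distance is a perfect square.
    """
    for p, q in combinations(points, 2):
        squared = sum((a - b) ** 2 for a, b in zip(p, q))
        if isqrt(squared) ** 2 != squared:
            return False
    return True


rectangle = [(0, 0), (3, 0), (0, 4), (3, 4)]    # sides 3 and 4, diagonals 5
unit_square = [(0, 0), (1, 0), (0, 1), (1, 1)]  # diagonals have length sqrt(2)
collinear = [(0, 0), (1, 0), (2, 0), (5, 0)]    # points on a line

print(all_distances_integer(rectangle))
print(all_distances_integer(unit_square))
print(all_distances_integer(collinear))
```

The collinear example can be extended indefinitely along the line, whereas the theorem says a non-collinear set such as the 3-by-4 rectangle can only ever be extended to a finite integer-distance set.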

2606.18559 2026-06-18 cs.GT New submission

Principal Component Analysis and Power Indices

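Xavier Molinero, Enric Monsó, Daniel Samaniego

AI summary: Proposes a new power index based on winning coalitions, shows that it coincides with eigenvalues obtained from principal component analysis, and characterizes the index through four properties.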

Abstract

Measuring the influence of a player in a simple game is a widely studied topic. The Shapley-Shubik power index is perhaps the most prominent measure in terms of relevance. Other power indices have also been proposed over time. In this paper, we propose yet another power index, defined in terms of winning coalitions. We show that this index coincides with eigenvalues obtained with the Principal Component Analysis method, a technique widely used in data science to determine the influence of different features in a given dataset. Furthermore, we provide a characterization of the proposed index in terms of four properties.
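For readers unfamiliar with power indices, the sketch below computes the classic Shapley-Shubik index of a weighted voting game by brute-force enumeration of player orderings. This is the well-known baseline index mentioned in the abstract, not the new PCA-related index the paper proposes, and the example game (weights 2, 1, 1 with quota 3) is an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley_shubik(weights, quota):
    """Shapley-Shubik index of a weighted voting game.

    A player is pivotal in an ordering if adding its weight first brings the
    running total up to the quota. The index is the fraction of orderings in
    which the player is pivotal. Brute force, so only suitable for small games.
    """
    n = len(weights)
    pivots = [0] * n
    for order in permutations(range(n)):
        total = 0
        for player in order:
            total += weights[player]
            if total >= quota:
                pivots[player] += 1
                break
    return [Fraction(count, factorial(n)) for count in pivots]


index = shapley_shubik([2, 1, 1], quota=3)
print(index)
```

In this game the player with weight 2 is pivotal in four of the six orderings, so its index is 2/3, while each weight-1 player gets 1/6, even though the weight ratio is only 2:1.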

2606.18556 2026-06-18 eess.SY cs.SY eess.SP New submission

Wind-Resilient Trajectory Optimization for UAV-BS Networks: TD3 for Continuous Service Availability

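Azim Akhtarshenas, German Svistunov, Kuangyu Zheng, David Lopez-Perez

AI summary: To address the positional drift and degraded link quality that wind disturbances cause for UAV base stations, proposes a TD3-based wind-resilient trajectory adjustment framework that learns adaptive control policies through stochastic kinematic modeling of wind and maintains optimal coverage under turbulence; simulations show that it outperforms baselines such as PPO.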

Abstract

Unmanned aerial vehicle (UAV)-mounted base stations are highly susceptible to wind disturbances such as gusts and turbulence, which induce positional drift and degrade communication link quality, particularly in emergency scenarios. To address this challenge, we propose a deep reinforcement learning (DRL) framework for wind-resilient trajectory adjustment and positioning based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. The method models wind as a stochastic kinematic perturbation, avoiding complex aerodynamic modeling, thereby enabling the TD3 agent to learn adaptive control policies that maintain optimal coverage footprints. By prioritizing user-centric performance metrics under turbulent conditions, the proposed architecture ensures continuous service availability despite external disruptions. Simulation results demonstrate that the TD3-based approach effectively compensates for wind-induced displacements and outperforms benchmark methods, including Proximal Policy Optimization (PPO), in terms of throughput stability and robustness in windy environments.

2606.18550 2026-06-18 cs.CR New submission

The Gate Is Only as Honest as Its Contracts: ContractGuard for the Contract Layer of Risk-Aware Causal Gating

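Laxmipriya Ganesh Iyer, Rahul Suresh Babu

AI summary: Against indirect prompt injection on tool-augmented LLM agents, proposes ContractGuard, which defends by verifying contract integrity (rather than relying on risk labels) and reduces injection success to zero on a controlled benchmark.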

Abstract

Risk-Aware Causal Gating (RACG) defends tool-augmented LLM agents against indirect prompt injection by removing dangerous tools from the agent's visible action space, so that even a fully injection-compliant agent cannot call a tool it cannot see. We make three points. First, this structural guarantee does not eliminate the trust assumption behind safe tool use; it relocates it into the integrity of the tool contracts -- declared preconditions, effects, risk, and authorization -- that the gate reads, so an attacker who corrupts a contract can make the gate mis-decide without ever persuading the agent. Second, forging a tool's effects is strictly more dangerous than tampering with its risk label, because RACG applies a causal gate before its admissibility gate: an off-path tool is never exposed, so risk-relabeling alone fails, whereas effect forgery routes the dangerous tool onto the causal path and succeeds. Effect integrity, not the risk label, is the load-bearing assumption. Third, we introduce ContractGuard, a verifier between the registry and the gate that layers signed provenance, typed contract attestation, and runtime effect verification; on a controlled benchmark it restores injection success to zero against every modeled attack -- including an exhaustive white-box adaptive attacker -- without over-rejecting honest contracts, and the structural prediction is confirmed on six current-generation hosted models (Claude Opus 4.8, Sonnet 4.6, Haiku 4.5; Amazon Nova Premier and Nova 2 Lite; GPT-OSS-120B).

2606.18549 2026-06-18 cs.SI New submission

Co-evolution of the global research collaboration network and the performance of nations in science and technology

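Travis A. Whetsell, Jeongyoon Yang

AI summary: Using a longitudinal co-evolution model on 30 years of global network and national performance data, this study finds support for reciprocal causality between international research collaboration and national S&T performance, with geographic distance moderating the interaction between performance and network dynamics.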

Abstract

Researchers have long suspected that international research collaboration (IRC) and scientific and technological (S&T) performance are subject to reciprocal causality, yet the endogenous co-evolution of these twin phenomena has yet to be tested by large-scale empirical analysis. This study simultaneously tests IRC network effects on national research performance and vice versa, using a longitudinal co-evolution model on three decades of global network and national performance data. Stochastic actor-oriented models (SAOMs) are used to analyze data on 166 countries from 1993 to 2022. Yearly IRC networks are constructed from Web of Science's XML database, and performance data are gathered from Elsevier's fractional field-weighted citation index (FWCI). The models also account for geographic, economic, demographic, and political factors, as well as endogenous network processes. The results provide support for reciprocal co-evolution. Notably, however, geographic distance appears to moderate the interaction between research performance and network dynamics, suggesting researchers may rely more on visible performance metrics when selecting geographically distant collaborators. This finding points to the role of citation-based performance metrics as a signaling mechanism for collaborator selection.

2606.18541 2026-06-18 cs.CR cs.CY cs.HC New submission

Confident yet Concerned: Inconsistencies in Computing Students' Attitudes on Cybersecurity

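Victor Adama, Robert Biddle, Nalin Arachchilage, Danielle Lottridge

AI summary: A survey of 236 university computing students, combining thematic analysis with quantitative data, finds that students have reasonable cybersecurity awareness but show inconsistencies in practices, perceptions of responsibility, and support structures; it identifies four key thematic tensions and two attitude clusters.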

Abstract

Today's young adults are the most immersed in technology, report the strongest feelings of powerlessness in managing online privacy across many platforms, and are particularly susceptible to phishing attacks. This raises questions about their general, wide-ranging attitudes towards and management of cybersecurity. How do young, tech-savvy adults approach cybersecurity? We seek a better understanding of their cybersecurity knowledge, attitudes and experiences, in particular in addressing deceptive online communications. We surveyed a group of 'lead users': computing university students (n = 236). By combining thematic analysis of open-ended responses with quantitative data, we provide insights into their experiences and perceptions. While students demonstrate reasonable cybersecurity awareness, their cybersecurity experiences vary, and inconsistencies exist around their practices, perceptions of responsibility, and support structures. Findings also reveal four key thematic tensions: 1) Computing students are knowledgeable yet have persistent incorrect beliefs, 2) They learn more about keeping safe from sources outside the classroom, 3) They have limited assistance and have fallen victim to cybercrime, and 4) Many are confident, yet others are concerned about their own safety and responsibility. Through cluster analysis of attitudes, we identify two groups, with one feeling less prepared and less confident, yet expressing a desire to learn more. Established measures of intentions and objective knowledge were correlated with preparedness. Self-efficacy was correlated with confidence and predicted cluster membership.

2606.18540 2026-06-18 cs.CC New submission

Depth Lower Bounds for ReLU Networks with Binary Inputs

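Neil Krishnan, Elchanan Mossel

AI summary: For ReLU networks with binary inputs and real-valued outputs, constructs a family of functions computable at depth n+1 and constant width, and proves that any depth-d, width-w network computing them exactly must satisfy w^d = Ω(2^n); in particular, the width cannot be polynomial in n when d = o(n/log n).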

Comments The authors explicitly reserves all rights in this work. No permission is granted for the reproduction, storage, or use of this document for the purpose of training artificial intelligence systems or for text and data mining (TDM), including but not limited to the generation of embeddings, summaries, or synthetic derivatives

Abstract

We study the role of depth in ReLU networks with discrete (Boolean) inputs and real-valued outputs, complementing two established lines of work. For Boolean inputs, striking depth separation results were proven for $\mathsf{AC}^0$, but with threshold ($\mathsf{TC}^0$) or ReLU gates, depth separation is only established for depth two vs. three. On the other hand, for real-valued functions and ReLU networks, Telgarsky (2016) constructed a simple one-variable class of functions that establishes separation at higher depths. In this paper, we aim to establish an all-depths depth separation for ReLU networks on $\{0,1\}^n$. We do so by exhibiting an explicit family of functions computable exactly by a ReLU network of depth $n+1$ and constant width, such that any ReLU network of depth $d$ and width $w$ computing the function exactly must satisfy $w^d = \Omega(2^n)$; in particular, no network of depth $d = o(n/\log n)$ can compute it with width polynomial in $n$. We note that our lower bound relies on exact, infinite-accuracy computation, as an exponential precision truncation of the output is computable by a polynomial-size $\mathsf{TC}^0$ circuit.

2606.18511 2026-06-18 cs.HC New submission

Stitching the Divide: Investigating Mixed Reality as a Bridge Between Paper-Based and Digital Artifacts in UI/UX Design

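Abidullah Khan, Jinghui Cheng

AI summary: Through interviews and a conceptual-probe study, finds that mixed reality can enable continuous hybrid design workflows, reduce manual reconstruction, support spatially anchored workspaces, and facilitate real-time cross-medium collaboration, and derives four design dimensions for future MR systems.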

Comments Accepted to the ACM Graphics Interface Conference, 2026

Abstract

UI/UX designers work with both paper-based and digital artifacts but lack tools that seamlessly integrate the two. Mixed Reality (MR) offers under-explored opportunities to combine the strengths of both design environments. To examine these opportunities, we first conducted interviews with 19 professional UI/UX designers to understand their current experiences using paper and digital artifacts. Motivated and informed by the interview insights, we organized nine conceptual-probe user study sessions in which designers engaged with an MR probe that combined paper and digital prototyping processes and brainstormed MR's potential in UI/UX design. We found that participants valued MR for enabling continuous hybrid design workflows, reducing manual reconstruction, supporting spatially anchored workspaces, and facilitating real-time cross-medium collaboration. They also envisioned future MR tools with AI assistance, richer interactive and dynamic content, and the ability to manage diverse design artifacts within a unified environment. From these findings, we derive four design dimensions for future MR systems that could enable more fluid, creative, and collaborative design practices.

2606.18497 2026-06-18 cs.CR New submission

Ghost Vectors: Soft-Deleted Embeddings Remain Reconstructible in HNSW Vector Databases

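Chandranil Chakraborttii, Jackeline García Alvarado, Sitora Abdulofizova, Shivanshu Dwivedi

AI summary: Shows that soft deletion in HNSW vector databases is a security weakness, since vectors marked as deleted remain recoverable at the storage layer, and proposes a mitigation based on cryptographic key rotation.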

Comments 13 pages, 5 figures, 12 tables. Prepared for submission

Abstract

Retrieval-augmented generation (RAG) allows large language models to access external and private corpora for factual, domain-specific responses. Modern RAG pipelines use hierarchical navigable small world (HNSW) vector databases for efficient similarity search. When a user requests data deletion, these systems typically only mark the record as deleted, leaving the embedding on disk physically unchanged. This soft-delete operation raises compliance concerns under data-erasure and retention requirements such as GDPR Article 17 and HIPAA. Analysis of three HNSW implementations confirms that deleted vectors remain physically recoverable by accessing the raw index files at the storage layer, bypassing API access. Using the Vec2Text inversion model without domain-specific fine-tuning, we show this vulnerability on multiple real-world datasets and data modalities. On the Wikipedia biographical living persons (BLP) dataset, we successfully recover 25.5% of exact person names and 46.4% of geographic locations (ROUGE-L 0.185). On highly structured, sensitive data (NIH Synthea dataset), recovery reaches 100% for both patient age and gender markers (ROUGE-L 0.290). On soft-deleted image embeddings, we show 100% tissue classification on histopathology patches (p=1.02e-07), and top-1 identity recovery reaches 99% on facial embeddings (p<0.01). This work introduces Epoch Key Rotation, which encrypts vectors and discards the key upon deletion. Epoch Key Rotation reduces observed PII recovery to 0% and completes in 2.5 ms for 500 deleted vectors (approximately 0.005 ms/record). Additionally, it generates an ECDSA-signed cryptographic proof as an auditable record of the deletion event.
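The soft-delete behavior described above can be illustrated with a toy store that hides deleted ids from queries but never touches the underlying bytes. The `TinyVectorStore` class and its byte layout are hypothetical and do not mirror the on-disk format of any real HNSW library; the sketch also does not implement the paper's Epoch Key Rotation defense.

```python
import struct


class TinyVectorStore:
    """Toy vector store with tombstone-based (soft) deletion. Hypothetical layout."""

    def __init__(self, dim):
        self.dim = dim
        self.raw = bytearray()  # stands in for the raw index file on disk
        self.ids = []           # slot i of the buffer holds the vector for ids[i]
        self.deleted = set()    # tombstones: ids hidden from the query API

    def add(self, vec_id, vector):
        self.ids.append(vec_id)
        self.raw += struct.pack(f"<{self.dim}f", *vector)

    def soft_delete(self, vec_id):
        # Only marks the record; the stored bytes are left unchanged.
        self.deleted.add(vec_id)

    def visible_ids(self):
        # What an API-level query would be allowed to return.
        return [i for i in self.ids if i not in self.deleted]

    def read_raw(self, vec_id):
        # Storage-layer access that ignores tombstones.
        slot = self.ids.index(vec_id)
        size = 4 * self.dim  # 4 bytes per float32
        chunk = self.raw[slot * size:(slot + 1) * size]
        return struct.unpack(f"<{self.dim}f", chunk)


store = TinyVectorStore(dim=2)
store.add("alice", [0.5, 0.25])
store.add("bob", [1.0, 2.0])
store.soft_delete("alice")

print(store.visible_ids())
print(store.read_raw("alice"))
```

After `soft_delete`, the id no longer appears through the query path, yet the embedding can still be read back from the raw buffer. That gap between API-level and storage-level views is what makes an embedding-inversion attack on "deleted" records possible in principle.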

2606.18483 2026-06-18 cs.DC New submission

Flexible Distributed Particle Filtering for the Internet of Things via Aggregate Computing

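Angela Cortecchia, Davide Domini, Giovanni Ciatto, Roberto Casadei, Danilo Pianini, Mirko Viroli

AI summary: Proposes a field-based distributed particle filtering approach grounded in aggregate computing; by decoupling the filtering logic from coordination strategies, it allows flexible customization of dimensions such as fusion-center placement, measurement functions, and information propagation, trading off accuracy, communication cost, and robustness within a unified framework.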

Abstract

State estimation from uncertain, distributed observations is central in many cyber-physical applications. While Distributed Particle Filtering (DPF) algorithms address nonlinear and non-Gaussian estimations in distributed settings, most solutions remain tied to specific architectures and communication assumptions, limiting adaptability in open, heterogeneous deployments, most notably the Internet of Things (IoT). In this paper, we propose a field-based formulation of Distributed Particle Filtering grounded in Aggregate Computing (AC). By expressing estimation and information dissemination as computational fields, our approach decouples the core filtering logic from coordination and data-flow strategies. This enables systematic customisation of key design dimensions, including fusion-center placement and resilience, aggregated measurement functions, as well as the type and scope of information propagation. Through a set of in-silico experiments, we show how diverse DPF configurations can be derived within a unified framework, highlighting trade-offs among accuracy, communication cost, and robustness. Overall, the proposed approach positions AC as an effective abstraction layer for engineering adaptable DPF solutions in open IoT environments.

2606.18481 2026-06-18 cs.SE cs.CY cs.HC New submission

Designing L5: A Permacomputing Approach to Creative Coding

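Lee Tusman, Kit Kuksenok

AI summary: Introduces L5, a Lua creative coding library built on the LOVE framework, and uses five case studies to explore how to balance sustainability and usability, bringing permacomputing principles to the creative coding community.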

Comments 10 pages, 1 figure, In LIMITS 26: Workshop on Computing within Limits, June 23 - 25, 2026

Abstract

Creative coding libraries provide high-level tools that make computational and algorithmic art accessible to artists and learners. Processing/p5 is one such family of libraries, known for its beginner-friendly approach and wide reach across artistic and technical communities. L5 is a new member of this family, implemented in Lua using the LOVE framework. It applies the principles of permacomputing, a movement inspired by permaculture that addresses sustainability in computing, bringing these values to a community of practice not historically centered on them. This paper explores L5's design decisions and the tensions between sustainability and usability through five case studies: (1) balancing perceived simplicity versus exposing the seams, (2) designing for lower resource consumption, (3) ensuring long-term stability, (4) constraining functionality, and (5) designing documentation for resource-constrained access. Rather than optimizing for a single metric, sustainable creative tools require navigating competing values transparently.

2606.18427 2026-06-18 cs.CR cs.CY cs.NI New submission

Understanding the "Airport" Censorship Circumvention Ecosystem in China

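Rumaisa Habib, Mingshi Wu, Shiva Shahandeh, Min Ni, Eric Wustrow, Zakir Durumeric

AI summary: Through user surveys, social media analysis, and active network measurements, presents the first systematic study of China's underground "airport" proxy market, finding that airports are the most popular censorship circumvention tool and that they have a multi-hop architecture and distinctive challenges.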

Comments The first two authors contributed equally

Abstract

In China, a burgeoning underground market sells citizens subscription-based censorship circumvention proxies known as "airports". We present the first systematic study of this ecosystem, combining user surveys, social media analysis, and active network measurements. We find that airports are by far the most popular off-the-shelf censorship circumvention tool in China, used by over half of our 1,667 survey respondents, who cite their ease of use, performance, and access to geo-restricted services like ChatGPT and Netflix. By scanning the Internet and scraping Telegram announcement channels, we identify 3,431 active airports built on a handful of open-source toolkits. We subscribe to 35 airports and characterize their performance, which often surpasses direct connections through the Great Firewall due to a distinctive multi-hop architecture. However, airports also pose new challenges and security risks: they accept payment through commercial services like Alipay, suffer frequent government takedowns, and are difficult for clients to configure optimally. Many airports also deploy their own distinct censorship policies. Airports are far more widely used than other circumvention tools from the academic literature, but introduce new forms of fragility and control, offering both lessons and opportunities for future circumvention research.

2606.18423 2026-06-18 cs.SE New submission

A Critical Discourse Analysis of Gender Representation in Software Engineering Education Videos on YouTube

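Isabella Graßl, Alexander Serebrenik, Giuseppe Destefanis

AI summary: A critical discourse analysis of 200 software engineering tutorials on YouTube finds that male characters and masculine linguistic defaults dominate; it identifies an agency gap in which technical and decision-making roles are assigned almost exclusively to male actors while female actors are absent or passive, suggesting that online education may reproduce gendered norms.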

Comments 20 pages, CSE&ET 2026

详情
AI中文摘要

教育资源可能会影响学生对谁属于软件工程的看法,鉴于该领域持续的性别差距,这一点具有相关性。然而,我们对在线学习空间中关于性别的隐性课程知之甚少。本研究对 YouTube 上 200 个手动分析的英语和德语软件工程教程进行了批判性话语分析,通过语境领域和语言身份标记检查性别表征。我们的结果表明,男性角色和男性语言默认值在教程中占主导地位。我们识别出一个能动性差距,其中技术和决策角色几乎完全分配给男性角色,而女性角色要么缺失,要么倾向于被动的、低能动性的角色。研究结果表明,YouTube 上的软件工程教育可能再现性别规范,其中语言和表征上的把关可能成为软件工程的象征性障碍。

英文摘要

Educational resources may frame students' perceptions of who belongs in software engineering, which is relevant given the field's ongoing gender gap. However, we know little about the hidden curriculum regarding gender in online learning spaces. This study presents a critical discourse analysis of 200 manually analysed English and German software engineering tutorials on YouTube, examining gender representation through contextual domains and linguistic identity markers. Our results show that male characters and masculine linguistic defaults dominate the tutorials. We identified an agency gap, in which technical and decision-making roles are almost exclusively assigned to male actors, while female actors are either absent or tend to passive, low-agency roles. The findings indicate that software engineering education on YouTube may reproduce gendered norms, in which linguistic and representational gatekeeping may serve as a symbolic barrier to software engineering.

2606.18421 2026-06-18 cs.SE 新提交

Finding Compiler-Platform Interaction Bugs in Deep Learning Pipelines via Cross-Layer Constraints

通过跨层约束发现深度学习流水线中的编译器-平台交互错误

Yuxin Qiu, Jiyuan Wang, Ronak Badhe, Ben Limpanukorn, Miryung Kim, Qian Zhang

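AI summary: Proposes XCheck, an automated framework that extracts full-stack constraints to generate test models and expose bugs caused by interactions between compilers and hardware platforms; it finds 2,034 bug-revealing cases across three compilers.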

详情
AI中文摘要

人工智能的日益部署需要鲁棒的深度学习编译器,如TVM和ONNX-MLIR。这些编译器以高级AI模型为输入,通过多层变换降低它们,并将其专门化到不同的硬件。测试此类编译器具有独特的挑战性,因为正确性取决于嵌入在整个编译栈中的隐式约束。现有的测试方法主要采用类型约束来限制输入模型生成,因此强调类型验证并监控编译崩溃或覆盖率增益。这种关注忽略了由编译和执行环境之间的交错效应引起的编译器-平台交互错误。在这项工作中,我们提出了一个可扩展的自动化DL编译器测试框架,用于同时(1)发现编译器-平台交互错误和(2)实现行为等价划分。我们的关键见解是,这些错误是由跨编译通道和硬件平台的交互引起的违反假设导致的。因此,我们超越了约束输入生成,并推导出全栈约束。我们的方法分为三步。首先,我们设计了一种自动化方法来提取全栈约束,这些约束共同指导模型生成并表征编译行为。其次,我们优先考虑暴露交互敏感行为的约束,以便我们生成的模型能够执行深度编译逻辑。第三,我们通过自动插入断言来监控覆盖率或通过/失败信号遗漏的不同编译症状,从而实现行为等价划分。我们在三个广泛使用的DL编译器上评估了我们的工具XCheck,发现了2034个揭示错误的案例,包括内存溢出、整数溢出以及根源于编译器-平台交互的静默意外编译。

英文摘要

The growing deployment of artificial intelligence (AI) necessitates robust deep learning (DL) compilers, such as TVM and ONNX-MLIR. These compilers take as input high-level AI models, lower them through multi-layer transformations, and specialize them to diverse hardware. Testing such compilers is uniquely challenging as correctness depends on implicit constraints embedded throughout the compilation stack. Existing testing approaches largely take type constraints to restrict input model generation and therefore emphasize type validation and monitor compilation crashes or coverage gains. This focus overlooks compiler-platform interaction bugs that arise from interleaved effects across compilation and execution environments. In this work, we propose a scalable, automated DL compiler testing framework for, in tandem, (1) finding compiler-platform interaction bugs and (2) enabling behavior equivalence partitioning. Our key insight is that these bugs are caused by violated assumptions arising from interactions across compilation passes and hardware platforms. Therefore, we move beyond constraining input generation and derive full-stack constraints. Our approach is three-fold. First, we design an automated approach to extract full-stack constraints that jointly guide model generation and characterize compilation behaviors. Second, we prioritize constraints that expose interaction-sensitive behaviors, so our generated models are capable of exercising deep compilation logic. Third, we enable behavior equivalence partitioning by automatically inserting assertions to monitor distinct compilation symptoms that coverage or pass/fail signals miss. We evaluated our tool, XCheck, on three widely-used DL compilers and found 2,034 bug-revealing cases, including memory overflows, integer overflows, and silent unexpected compilations that were rooted in compiler-platform interactions.

2606.18417 2026-06-18 cs.CE 新提交

Enhancing neural network extrapolation in thermo-fluid systems using steady-state solutions

利用稳态解增强热流体系统中的神经网络外推能力

Sanjeeb Poudel, Teeratorn Kadeethum, Sanghyun Lee

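AI summary: For dissipative PDE systems, proposes a steady-state-informed neural network representation that decomposes the solution into a steady-state component and a transient correction, embedding the asymptotic behavior directly in the architecture without an extra penalty term and substantially improving temporal extrapolation.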

详情
AI中文摘要

时间相关偏微分方程(PDE)出现在许多工程系统中,包括热流体应用。对此类系统的经典数值模拟在长时间动力学中可能变得计算昂贵,因为它们通常需要受稳定性、精度或非线性求解器约束的时间步长进行顺序时间积分。尽管科学机器学习为逼近PDE解提供了替代方案,但标准神经网络近似在训练时间区间外进行外推时通常会退化。在这项工作中,我们针对解松弛到平稳平衡的耗散PDE系统提出了一种稳态信息神经网络表示。所提出的ansatz将解分解为稳态分量和由时间相关衰减曲线调制的瞬态修正。当衰减曲线在长时间消失且瞬态修正保持有界时,该表示将收敛到指定稳态直接嵌入到架构中,而不是通过额外的惩罚项来强制执行。这使得网络能够学习瞬态动力学,同时保持正确的渐近行为。我们在物理信息神经网络(PINN)框架内实现了该方法,并使用SOAP优化器训练所得模型。该方法在一系列物理和几何复杂度递增的问题上进行了评估,范围从一维热方程到方腔顶盖驱动不可压缩Navier-Stokes流、方腔自然对流以及全三维共轭传热问题。数值结果表明,与未明确强制执行渐近条件的架构相比,稳态信息架构显著改善了训练区间之外的时间外推。

英文摘要

Time-dependent partial differential equations (PDEs) arise in many engineering systems, including thermo-fluid applications. Classical numerical simulations of such systems can become computationally expensive for long-time dynamics because they typically require sequential time integration with time steps constrained by stability, accuracy, or nonlinear solvers. Although scientific machine learning provides an alternative for approximating PDE solutions, standard neural network approximations often degrade when extrapolated beyond the training time interval. In this work, we propose a steady-state-informed neural network representation for dissipative PDE systems whose solutions relax toward a stationary equilibrium. The proposed ansatz decomposes the solution into a steady-state component and a transient correction modulated by a time-dependent decay profile. When the decay profile vanishes at long time and the transient correction remains bounded, the representation embeds convergence to the prescribed steady state directly into the architecture, rather than enforcing it through an additional penalty term. This allows the network to learn the transient dynamics while preserving the correct asymptotic behavior. We implement the approach within a physics-informed neural network (PINN) framework and train the resulting model using the SOAP optimizer. The method is evaluated on a sequence of problems of increasing physical and geometric complexity, ranging from the one-dimensional heat equation to incompressible Navier-Stokes flow in a lid-driven cavity, natural convection in a square cavity, and a full three-dimensional conjugate heat transfer problem. The numerical results show that the steady-state-informed architecture substantially improves temporal extrapolation beyond the training interval compared with architectures that do not explicitly enforce the asymptotic condition.
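One way to write down the decomposition described in this abstract is sketched below. The notation is illustrative rather than the paper's: $u_{\mathrm{ss}}$ stands for the prescribed steady state, $N_\theta$ for the network output acting as the transient correction, $\phi$ for the decay profile, and $M$ for an assumed bound on the correction. The exponential profile in the last line is only an example of a profile that vanishes at long time; the abstract does not state which profile is used.

```latex
% Illustrative notation, not taken from the paper.
\[
  u_\theta(x,t) = u_{\mathrm{ss}}(x) + \phi(t)\,N_\theta(x,t),
  \qquad \lim_{t\to\infty}\phi(t) = 0,
  \qquad \sup_{x,t}\,\lvert N_\theta(x,t)\rvert \le M < \infty .
\]
% The distance to the steady state is then controlled by the decay profile alone:
\[
  \lvert u_\theta(x,t) - u_{\mathrm{ss}}(x)\rvert \le M\,\lvert\phi(t)\rvert
  \;\longrightarrow\; 0 \quad (t\to\infty).
\]
% Example profile (an assumption): \phi(t) = e^{-\lambda t} with \lambda > 0.
```

Under these two conditions the convergence holds for any network weights, which is why no additional penalty term is needed to enforce the long-time limit.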

2606.18416 2026-06-18 eess.SY cs.SY 新提交

Constellation-Level Power Allocation for LEO Space-Based Solar Power

LEO天基太阳能的星座级功率分配

Mustafa Alhassan, Amjad Iqbal, Peng Hu

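AI summary: Introduces a LEO SBSP system model and evaluates power allocation for a Walker 4×5 constellation in a 24-hour simulation, finding a peak delivered power of 1.986 MW, a mean of 40-75 kW per served station, and power densities below the ICNIRP limit.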

详情
AI中文摘要

天基太阳能(SBSP)近期作为利用天基基础设施提供持续清洁能源的有吸引力的技术进步重新受到关注。然而,低地球轨道(LEO)卫星星座用于SBSP的潜力在很大程度上仍未探索,缺乏详细的基于仿真的研究。在本文中,我们引入了一个新颖的LEO SBSP系统模型,并对高度450 km的Walker $4\times 5$ LEO SBSP星座进行了24小时系统级仿真,在贪婪分配策略下将2.45 GHz微波功率波束传输到八个地面站(GS)。该模型包括轨道传播、日食周期、卫星功率链、Goubau-Brown波束耦合、ITU-R P.618大气衰减和星载电池动力学。结果证实,传输的峰值直流功率达到1.986 MW,而服务站的每站点平均传输功率在40到75 kW之间。八个地面站中有两个在运行期间未获得服务,因为在贪婪策略下,它们的过境排名始终低于同一时刻的竞争链路。整流天线处的入射峰值功率密度(PD)保持在3.35-5.72 W/m²范围内,低于国际非电离辐射防护委员会(ICNIRP)的公众暴露限值。对于此高度的20颗卫星Walker LEO星座,每站实际传输功率为50-100 kW,整流天线应按照约5 W/m²的运行入射功率密度设计,而不是按照地球静止轨道(GEO)时代的100 W/m²额定值设计。

英文摘要

Space-based solar power (SBSP) has recently gained renewed attention as an appealing technological advancement for providing continuous clean energy using space-based infrastructure. However, the potential of low-Earth orbit (LEO) satellite constellations for SBSP remains largely unexplored and lacks detailed simulation-based studies. In this paper, we introduce a novel LEO SBSP system model and conduct a 24-hour system-level simulation of a Walker $4\times 5$ LEO SBSP constellation at an altitude of 450\,km, beaming 2.45\,GHz microwave power to eight ground stations (GSs) under a greedy allocation policy. The model includes orbital propagation, eclipse cycles, the satellite power chain, Goubau--Brown beam coupling, ITU-R P.618 atmospheric attenuation, and onboard battery dynamics. The results confirm that the peak DC power delivered reaches 1.986\,MW, while the mean per-site delivery at the served GS ranged from 40 to 75\,kW. Two of the eight GSs received no service during the run, as their passes were consistently ranked lower under the greedy policy than competing links at the same step. The incident peak power density (PD) at the rectenna remained within the 3.35--5.72\,W/m\textsuperscript{2} range, below the International Commission on Non-Ionizing Radiation Protection (ICNIRP) general-public exposure limit. For a 20-satellite Walker LEO at this altitude, realistic per-site delivery is 50--100\,kW, and the rectenna should be sized to the operational incident PD of order 5\,W/m\textsuperscript{2} rather than to a Geostationary Earth Orbit (GEO)-era 100\,W/m\textsuperscript{2} rating.

2606.18405 2026-06-18 cs.CR 新提交

Evaluating the Effectiveness of LLMs in Aiding Compliance Testing of PKCS#1-v1.5

评估LLM在辅助PKCS#1-v1.5合规性测试中的有效性

Polina Kozyreva, Endadul Hoque

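AI summary: Studies grammar-level mutation combined with LLM-based code synthesis for PKCS#1 v1.5 signature verification across 48 cryptographic library implementations, reproducing 10 of 13 non-trivial violation categories (including all 5 signature forgery categories) and finding 1 new discrepancy; LLM hallucination, present in 82.5% of generated scripts, is the main limiting factor.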

详情
AI中文摘要

测试二进制协议的实现是否符合规范需要满足结构和语义约束的输入。纯随机生成和原始变异通常不足以探索依赖于类型-长度-值(TLV)编码的协议中的语义有意义行为,而领域特定的合规性测试工具需要深入的协议专业知识和大量手动工作来构建。本研究调查了语法级变异与基于LLM的代码合成相结合是否可以作为规范合规性测试的一种可行且更通用的方法。我们在PKCS#1 v1.5签名验证——一个广泛部署的TLV编码标准,具有形式验证的测试预言(Morpheus)——上评估了该方法,涉及48个加密库实现。我们重现了Morpheus先前识别的13个非平凡规范违规类别中的10个,包括所有5个签名伪造类别,并发现了1个先前未报告的差异。我们发现LLM幻觉(发生在82.5%的生成脚本中)是限制有效性的主要因素,而非变异策略。我们识别出五种不同的幻觉类型,并显示它们的分布在变异类别中系统性地变化:结构变异以13.3%的保真度实现,而约束变异达到30.3%的正确性,但遭受最高比例的完全忽略变异(8.1%)。这些发现揭示了操作可靠性(99.8%)与语义保真度(17.5%)之间的显著差距,为在规范驱动的测试流水线中何时可以信任基于LLM的代码合成提供了可操作的指导。

英文摘要

Testing implementations of binary protocols for specification compliance requires inputs that satisfy both structural and semantic constraints. Purely random generation and primitive mutations are often insufficient for exploring semantically meaningful behaviors in protocols that rely on Type-Length-Value (TLV) encoding, yet domain-specific compliance testing tools require deep protocol expertise and significant manual effort to construct. This work investigates whether grammar-level mutation combined with LLM-based code synthesis can serve as a viable, more generalizable approach to specification compliance testing. We evaluate the approach on PKCS#1 v1.5 signature verification -- a widely deployed TLV-encoded standard with a formally verified testing oracle (Morpheus) -- across 48 cryptographic library implementations. We reproduced 10 of 13 non-trivial specification violation categories previously identified by Morpheus, including all 5 signature forgery categories, and discovered 1 previously unreported discrepancy. We found that LLM hallucination -- occurring in 82.5% of generated scripts -- is the primary factor limiting effectiveness, not the mutation strategies. We identify five distinct hallucination types and show that their distribution varies systematically across mutation categories: structural mutations are implemented with 13.3% fidelity while constraint mutations achieve 30.3% correctness but suffer the highest rate of mutations being fully ignored (8.1%). These findings reveal a striking gap between operational reliability (99.8%) and semantic fidelity (17.5%), providing actionable guidance on when LLM-based code synthesis can be trusted in specification-driven testing pipelines.
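For readers unfamiliar with the structure under test, the following sketch builds the EMSA-PKCS1-v1_5 encoded message for SHA-256 as laid out in RFC 8017 (0x00, 0x01, padding of 0xFF bytes, 0x00, then the DER-encoded DigestInfo, which is itself a TLV structure). It is not the paper's tool. The function names, the encode-then-compare check, and the `mutate_short_padding` mutation are hypothetical illustrations of the kind of structural deviation a compliance test might feed to a verifier.

```python
import hashlib

# DER prefix of DigestInfo for SHA-256 (RFC 8017, Section 9.2, note 1).
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")


def emsa_pkcs1_v15_encode(message: bytes, em_len: int) -> bytes:
    """Build EM = 0x00 || 0x01 || PS || 0x00 || DigestInfo for SHA-256."""
    t = SHA256_PREFIX + hashlib.sha256(message).digest()
    if em_len < len(t) + 11:  # PS must be at least 8 bytes long
        raise ValueError("intended encoded message length too short")
    ps = b"\xff" * (em_len - len(t) - 3)
    return b"\x00\x01" + ps + b"\x00" + t


def strict_check(em: bytes, message: bytes) -> bool:
    """Accept only the byte-exact expected encoding (encode, then compare)."""
    try:
        return em == emsa_pkcs1_v15_encode(message, len(em))
    except ValueError:
        return False


def mutate_short_padding(em: bytes, drop: int) -> bytes:
    """Hypothetical mutation: drop padding bytes, append zero bytes to keep the length."""
    return em[:2] + em[2 + drop:] + b"\x00" * drop


if __name__ == "__main__":
    em = emsa_pkcs1_v15_encode(b"hello", 256)  # 256 bytes = 2048-bit modulus
    print(strict_check(em, b"hello"))
    print(strict_check(mutate_short_padding(em, 8), b"hello"))
```

A verifier that parses the encoded message leniently instead of comparing it byte for byte may accept the mutated variant, which is the kind of discrepancy a compliance test is meant to surface.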

2606.18400 2026-06-18 cs.OS cs.CR 新提交

CloakLM: Obfuscating GPU Memory Layout to Mitigate Model Exfiltration for Serving

CloakLM:混淆GPU内存布局以缓解服务中的模型窃取

Kunal Jain, Seokjin Go, Divya Mahajan

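AI summary: To counter model exfiltration on third-party and shared accelerators, proposes CloakLM, a software-only framework that obfuscates memory layout through three mechanisms (PCIe traffic shaping, weight shuffling, and HBM page remapping) and increases resistance to PCIe snooping and HBM dump attacks without hardware changes.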

Comments 15 pages, 9 figures, 2 tables

详情
AI中文摘要

部署在第三方和共享加速器基础设施上的大型基础模型面临模型窃取的实际风险,现有防御措施未能完全解决。在常见的服务部署中,模型提供者控制虚拟机或裸金属服务栈,但不控制周围的硬件底层。主机到GPU互连、加速器网络和邻近基础设施组件仍处于租户信任边界之外,并且已被证明是可利用的。Hermes展示了通过被动PCIe观察进行无损DNN重建,而TunnelS通过驱动级访问以高吞吐量窃取HBM内容而不干扰推理。共租户虚拟机还可以访问内存映射接口或配置错误的RDMA区域,而无需物理共置。这些攻击利用了ML系统的一个共同特性:模型权重存储在大块、连续且重复访问的内存区域中,使得截获的PCIe传输和HBM转储足以揭示模型结构和参数。我们提出CloakLM,一个纯软件的内存混淆框架,它消除了这种结构规律性,而不改变推理栈对内存的逻辑视图。CloakLM结合了三种机制:PCIe流量整形、层间和层内权重混洗以及物理HBM页面重映射。授权执行保留有效的虚拟内存布局且开销可忽略,而未经授权的观察者看到的是碎片化和语义上不连贯的状态。CloakLM与vLLM和PyTorch集成,无需硬件更改,并补充了机密计算。在使用LLaMA和Qwen模型的分布式推理工作负载上的评估显示,性能接近原生,同时显著增强了对PCIe嗅探和HBM转储攻击的抵抗力,使得推理时的模型窃取变得不太可行。

英文摘要

Large foundation models deployed on third-party and shared accelerator infrastructure face a practical risk of model exfiltration that existing defenses do not fully address. In common serving deployments, model providers control the VM or bare-metal serving stack but not the surrounding hardware substrate. The host to GPU interconnect, accelerator fabric, and neighboring infrastructure components remain outside the tenant's trust boundary and have been shown to be exploitable. Hermes demonstrates lossless DNN reconstruction from passive PCIe observation, while TunnelS exfiltrates HBM contents at high throughput via driver-level access without disrupting inference. Co-tenant VMs can further access memory-mapped interfaces or misconfigured RDMA regions without physical co-location. These attacks exploit a common property of ML systems: model weights are stored in large, contiguous, and repeatedly accessed memory regions, making intercepted PCIe transfers and HBM dumps rich enough to reveal model structure and parameters. We present CloakLM, a software-only memory-obfuscation framework that removes this structural regularity without changing the inference stack's logical view of memory. CloakLM combines three mechanisms: PCIe traffic shaping, inter- and intra-layer weight shuffling, and physical HBM page remapping. Authorized execution retains a valid virtual memory layout with negligible overhead, while unauthorized observers see fragmented and semantically incoherent state. CloakLM integrates with vLLM and PyTorch, requires no hardware changes, and complements confidential computing. Evaluation on distributed inference workloads using LLaMA and Qwen models shows near-native performance while significantly increasing resistance to PCIe snooping and HBM dump attacks, making inference-time model exfiltration substantially less practical.

2606.18392 2026-06-18 cs.GT 新提交

When Mobile Crowdsourcing Meets Queueing Systems: Human-in-the-Loop Learning

当移动众包遇到排队系统:人在回路学习

Hongbo Li, Lingjie Duan, Ness B. Shroff

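AI summary: In service systems where congestion information goes stale, selfish customers overexplore likely congested servers and cause efficiency losses; the paper proposes a dynamic side-payment mechanism that coordinates congestion management with information acquisition and keeps the price of anarchy below 2.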

Comments This paper has been accepted by IEEE Transactions on Networking

详情
AI中文摘要

在服务系统中,顾客现在依赖拥堵信息来决定加入哪个队列或服务器,从餐厅和主题公园景点到道路网络。我们将此场景研究为人在回路学习(HILL),其中顾客通过众包平台既消费又生成时间敏感的拥堵信息。由于拥堵报告会过时,高效的系统运行需要持续探索当前状态不确定的服务器。然而,自私的顾客会避免这种探索,因为这会降低他们即时的服务效用,尽管他们的观察会使未来的顾客受益。我们分析了内源性拥堵排队系统中个体激励与系统学习之间的这种张力。我们首先表明,短视的服务器选择可能导致无限的无政府状态价格(PoA):分散的顾客可能通过过度探索可能拥堵的服务器造成任意大的效率损失。在单服务器情况下,我们证明PoA的下界随着缓冲区大小的增加而减小,而在多服务器情况下,上界随着服务器数量的增加而减小。我们进一步表明,现有的用于外生信息探索-利用的信息性非货币机制在我们的设置中失败,因为顾客的选择直接重塑了队列状态,仍然导致无限的PoA。为了解决这一挑战,我们设计了一种动态侧支付机制,定期向一些顾客收费并奖励其他顾客,在保持事后预算平衡的同时阻止过度探索。该机制协调了异构服务器之间的拥堵管理和信息获取,并保证PoA低于2。除了最坏情况分析,使用真实数据集的实验表明,所提出的机制也实现了强大的平均情况性能。

英文摘要

In service systems, customers now rely on congestion information before deciding which queue or server to join, from restaurants and theme-park attractions to road networks. We study this setting as human-in-the-loop learning (HILL), where customers both consume and generate time-sensitive congestion information through crowdsourcing platforms. Because congestion reports become stale, efficient system operation requires continued exploration of servers whose current states are uncertain. Yet selfish customers avoid such exploration when it reduces their immediate service utility, even though their observations would benefit future customers. We analyze this tension between individual incentives and system-wide learning in queueing systems with endogenous congestion. We first show that myopic server choices can induce an infinite price of anarchy (PoA): decentralized customers may cause arbitrarily large efficiency losses by overexploring servers that are likely congested. In the single-server case, we prove that the lower bound on PoA decreases as buffer size grows, while in the multi-server case the upper bound decreases as the number of servers increases. We further show that existing informational, non-monetary mechanisms for exploration-exploitation with exogenous information fail in our setting, as customers' choices directly reshape the queue states and still lead to infinite PoA. To address this challenge, we design a dynamic side-payment mechanism that periodically charges some customers and rewards others, discouraging excessive exploration while maintaining ex-post budget balance. The mechanism coordinates congestion management and information acquisition across heterogeneous servers, and guarantees PoA below 2. Beyond worst-case analysis, experiments using real datasets demonstrate that the proposed mechanism also achieves strong average-case performance.

2606.18377 2026-06-18 cs.SE 新提交

Exploring Statistical Change Point Detection Techniques for Performance Anomaly Detection at Mozilla

探索统计变点检测技术在 Mozilla 性能异常检测中的应用

Mohamed Bilel Besbes, Gregory Mierzwinski, Suhaib Mujahid, Philipp Leitner, Alexander Serebrenik, Dave Hunt, Diego Elias Costa

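AI summary: To address false positives and missed regressions in Mozilla's performance anomaly detection, the paper evaluates 25 change-point detection methods and 15 ensemble approaches on a manually annotated ground-truth dataset, finds that ensemble voting improves the F1-score by 11%, and reports lessons from integrating the best methods into Mozilla's system.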

详情
AI中文摘要

软件性能回归可能带来严重的业务后果,因此自动检测成为现代持续集成流水线的关键组成部分。在 Mozilla,性能异常检测由 Perfherder 处理,这是 Mozilla 的性能工程管理系统,它基于 Student's T 检验方法在每天数百次代码变更中标记回归。然而,我们对 Mozilla 一年性能数据的初步分析显示,12.5% 生成的警报组是误报,而约 6.8% 的警报组包含自动系统遗漏的回归。本文提出了一项实证研究,评估了 25 种变点检测(CPD)方法和 15 种集成方法作为 Mozilla 当前方法的替代方案。我们构建了一个包含 174 个性能时间序列的真实数据集,由 11 位 Mozilla 性能工程师手动标注,代表了性能工程领域首批从业者标注的 CPD 基准之一。我们的结果表明,虽然离线和混合 CPD 方法比 Mozilla 方法提高了召回率,但代价是精度大幅降低。集成投票策略缓解了这种权衡,并提供了更一致的性能,使 F1 分数提高了 11%。我们通过从业者调查验证了实验结果,并报告了将最佳方法集成到 Mozilla 性能工程系统中的经验教训。

英文摘要

Software performance regressions can have significant business consequences, making automated detection a critical component of modern continuous integration pipelines. At Mozilla, performance anomaly detection is handled by Perfherder, Mozilla's performance engineering management system that relies on a Student's T-test-based approach to flag regressions across hundreds of daily code changes. However, our preliminary analysis of one year of Mozilla performance data reveals that 12.5% of generated alert groups are false positives, while approximately 6.8% of them contain regressions missed by the automated system. This paper presents an empirical study evaluating 25 change-point detection (CPD) methods and 15 ensemble approaches as alternatives to Mozilla's current method. We construct a ground-truth dataset of 174 performance time series manually annotated by eleven Mozilla performance engineers, representing one of the first practitioner-annotated CPD benchmarks for performance engineering. Our results show that while offline and hybrid CPD methods improve recall over Mozilla's method, they do so at a high cost to precision. Ensemble voting strategies alleviate this trade-off and offer more consistent performance, resulting in 11% improvement in the F1-score. We validate the experimental results through a practitioner survey and report on lessons learned from integrating the best methods into Mozilla's performance engineering system.
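The abstract does not say how the ensemble voting strategies are defined, so the following is only a minimal sketch of one plausible scheme: a change point is kept when at least `min_votes` detectors flag an index within `tolerance` samples of it. The function name, both parameters, and the merging of nearby detections are assumptions made for illustration.

```python
def vote_change_points(detections, min_votes, tolerance=0):
    """Keep indices flagged by at least `min_votes` detectors.

    detections: one list of change-point indices per detector.
    Detections within `tolerance` samples count as the same change point,
    and only the earliest index of such a cluster is reported.
    """
    candidates = sorted({i for d in detections for i in d})
    kept = []
    for c in candidates:
        # A detector votes for c if any of its detections lies close enough.
        votes = sum(any(abs(c - i) <= tolerance for i in d) for d in detections)
        if votes >= min_votes and (not kept or c - kept[-1] > tolerance):
            kept.append(c)
    return kept


if __name__ == "__main__":
    per_detector = [[10, 50], [11, 80], [10, 50, 81]]
    print(vote_change_points(per_detector, min_votes=2, tolerance=1))
```

Raising `min_votes` trades recall for precision, which mirrors the trade-off the abstract reports between individual methods and ensembles.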

2606.18320 2026-06-18 cs.CR 新提交

TopVenues: A Reproducible Corpus and Tooling Substrate for Cybersecurity Literature Reviews

TopVenues:一个可复现的网络安全文献综述语料库与工具基础

Sidnei Barbieri, Ágney Lopes Roth Ferraz, Lourenço Alves Pereira Júnior

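AI summary: Presents TopVenues, an open-source system that builds a versioned corpus from a DBLP metadata spine and open scholarly APIs, giving cybersecurity literature reviews a reproducible basis with fast search and repeatable measurement.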

详情
AI中文摘要

网络安全文献综述需要一个可复现的分母:协议在筛选和综合开始前包含的论文集合。如今,该分母通常从出版商门户、书目索引和学术应用程序接口(API)重建,而这些接口的覆盖范围、格式和查询语义随时间变化。本文提出TopVenues,一个开源系统,将语料库构建实现为版本化的研究工件。TopVenues声明一个会议和年份范围,使用DBLP计算机科学书目(DBLP)作为元数据主干,通过开放的学术API和特定出版商的提取器丰富记录的摘要和BibTeX条目,并将结果存储在单调的SQLite快照中,可通过命令行界面(CLI)、Web界面以及用于综述工作流的导出路径访问。2026年5月的快照包含来自2017年至2026年11个网络安全来源的9,925篇论文,摘要覆盖率达99.86%,BibTeX覆盖率达99.99%;全文语料库的关键词搜索在31毫秒内完成,一个250个测试的套件验证了数据完整性不变量。固定的分母还实现了可重复测量:在我们的范围内,四个顶级安全会议2024年至2025年的论文中有29.2%以arXiv预印本形式出现,中位发表前时间为五个月,而先前作者记录过滤器在90%召回率下对后续出现在同一会议集中的预印本进行筛选时,实现了16.5倍的精度提升。TopVenues通过使语料库本身可执行、可检查和可引用,将语料库构建与可审计的网络安全测量联系起来。该工件可在以下网址获取:https://github.com/sidneibarbieri/topVenues。

英文摘要

Cybersecurity literature reviews require a reproducible denominator: the set of papers that a protocol includes before screening and synthesis begin. Today, that denominator is often reconstructed from publisher portals, bibliographic indices, and scholarly application programming interfaces (APIs) whose coverage, formats, and query semantics change over time. This paper presents TopVenues, an open-source system that materializes corpus construction as a versioned research artifact. TopVenues declares a venue and year scope, uses DBLP Computer Science Bibliography (DBLP) as the metadata spine, enriches records with abstracts and BibTeX entries via open scholarly APIs and publisher-specific extractors, and stores the results in a monotonic SQLite snapshot, accessible via a command-line interface (CLI), a web interface, and export paths for review workflows. The May 2026 snapshot contains 9,925 papers from 11 cybersecurity sources over 2017 to 2026, with 99.86% abstract coverage and 99.99% BibTeX coverage; keyword search over the full corpus completes in under 31 ms, and a 250-test suite validates the data-integrity invariants. The fixed denominator also enables repeatable measurement: 29.2% of 2024 to 2025 papers from the four top-ranked security conferences in our scope appear as arXiv preprints, with a median of five months before publication, and a prior-author-track-record filter yields a 16.5x precision gain at 90% recall for triaging preprints that later appear in the same venue set. TopVenues links corpus construction to auditable cybersecurity measurement by making the corpus itself executable, inspectable, and citable. The artifact is available at https://github.com/sidneibarbieri/topVenues.

2606.18314 2026-06-18 cs.CG 新提交

Repair Entropy in Dynamic Geometric Nearest-Neighbour Structures

动态几何最近邻结构中的修复熵

Faruk Alpay, Bugra Kilictas

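AI summary: For exact nearest-neighbour maintenance under small motions, proposes an adaptive strategy based on repair-frontier entropy that repairs failed certificates in O(|F_t| log N) time, evaluated on 2,400 labelled transitions.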

Comments 10 pages, 2 figures, 2 tables; code and dataset provided as ancillary files

详情
AI中文摘要

我们研究小运动下精确最近邻维护的动态几何数据结构。对每个点,我们存储一个由最近邻和两个最小邻近距离组成的证书,间隙为$c_i=d^i_2-d^i_1$。三角不等式给出一个尖锐的有效性半径:在最大位移为$\varepsilon$的一步后,每个满足$c_i>4\varepsilon$的证书仍然有效,因此所有可能的失效被限制在修复前沿$F_t$内。我们引入修复前沿熵$H(F_t)$,即失效证书在索引单元上的归一化香农熵,作为选择事件驱动修复、批量修复或完全重建的工作负载描述符。由此产生的维护规则在单元占用有界的情况下,仅以$O(|F_t|\log N)$时间修复前沿,而完全重建代价为$\Theta(N)$;此外,熵为事件驱动修复所触及的前沿单元数量提供下界,并改变了经验上的修复-重建交叉点。我们在$d\in\{2,3\}$中评估了十种运动族,$N$高达16,000,使用精确的平铺GPU预言机和GPU网格重建作为真实值和竞争者。在2400个标记的转换中,有效性规则没有遗漏任何无效证书,低压前沿通常通过增量修复更便宜,而相同大小的扩散前沿对于事件驱动修复更昂贵,但对于批量修复则不然。发布的数据集记录了前沿几何、证书审计、每种策略的时间以及最佳策略标签。

英文摘要

We study dynamic geometric data structures for exact nearest-neighbour maintenance under small motions. For each point we store a certificate consisting of its nearest neighbour and the two smallest neighbour distances, with clearance $c_i=d^i_2-d^i_1$. A triangle-inequality argument gives a sharp validity radius: after a step of maximum displacement $\varepsilon$, every certificate with $c_i>4\varepsilon$ remains valid, so all possible failures are confined to a repair frontier $F_t$. We introduce repair-frontier entropy $H(F_t)$, the normalized Shannon entropy of failed certificates over index cells, as a workload descriptor for choosing between event-driven repair, batched repair, and full rebuild. The resulting maintenance rule repairs only the frontier in $O(|F_t|\log N)$ time under bounded cell occupancy, while a full rebuild costs $\Theta(N)$; moreover, entropy lower-bounds the number of frontier cells touched by event-driven repair and shifts the empirical repair-rebuild crossover. We evaluate ten motion families in $d\in\{2,3\}$, with $N$ up to $16{,}000$, using an exact tiled GPU oracle and a GPU grid rebuild as ground truth and competitor. Across $2400$ labelled transitions, the validity rule misses no invalid certificate, low-pressure frontiers are usually cheaper to repair incrementally, and diffuse frontiers of the same size are more expensive for event-driven repair but not for batched repair. The released dataset records frontier geometry, certificate audits, per-strategy times, and best-strategy labels.
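The two quantities defined in this abstract, the clearance test $c_i>4\varepsilon$ and the entropy of the repair frontier over index cells, can be sketched as follows. Several details are assumptions rather than the paper's method: distances are computed by brute force instead of through an index, cells come from a uniform grid with a hypothetical side length `cell`, and the entropy is normalized by the logarithm of the number of occupied frontier cells, since the abstract does not state the normalization.

```python
import math
from collections import Counter


def clearances(points):
    """Return c_i = d2_i - d1_i for each point (brute force; needs >= 3 points)."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[1] - d[0])
    return out


def repair_frontier(points, eps):
    """Indices whose certificate is not guaranteed valid, i.e. c_i <= 4 * eps."""
    return [i for i, c in enumerate(clearances(points)) if c <= 4 * eps]


def frontier_entropy(points, frontier, cell):
    """Shannon entropy of frontier points over grid cells, scaled to [0, 1].

    Normalizing by log(number of occupied cells) is an assumption.
    """
    counts = Counter(
        tuple(math.floor(x / cell) for x in points[i]) for i in frontier
    )
    if len(counts) <= 1:
        return 0.0  # empty or fully concentrated frontier
    total = sum(counts.values())
    h = -sum((n / total) * math.log(n / total) for n in counts.values())
    return h / math.log(len(counts))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 3.0)]
    frontier = repair_frontier(pts, eps=0.6)
    print(frontier, frontier_entropy(pts, frontier, cell=1.0))
```

A frontier concentrated in one cell scores 0 and an evenly spread one scores 1, which is the distinction between concentrated and diffuse frontiers that the abstract uses when choosing a repair strategy.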

2606.18297 2026-06-18 cs.DB 新提交

From Embedded Properties to Trait Nodes: A Design Method for Identifying Reusable Metadata in Property Graph Schemas

从嵌入属性到特征节点:一种在属性图模式中识别可复用元数据的设计方法

Yahya Sa'd, Renzo Angles, Vojtech Merunka, Roberto Garcia, Karel Klima, Pavel Beranek

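AI summary: Addresses whether descriptive properties in property-graph schemas should be treated as reusable metadata, proposing a five-criterion method for identifying metadata candidates that classifies properties through a rule-based decision workflow, illustrated with a library-domain example and a participant-based classification validation.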

详情
AI中文摘要

属性图模式通常包含跨异构节点和边重复出现的描述性属性,但模式设计者缺乏明确的方法来决定这些属性应保持嵌入状态还是作为可复用的元数据结构处理。本文在面向5GNF的建模视角下解决这一设计阶段问题,提出一种基于五个准则(跨元素出现、概念独立性、无损外化、复用潜力和治理相关性)的元数据候选识别方法。该方法通过基于规则的决策工作流将属性分类为特征候选、嵌入属性和边界情况。使用图书馆领域的运行示例说明该方法,并通过涉及两个模式上下文中基于参与者的分类任务的说明性验证进行检验。结果表明,仅凭重复出现不足以作为外化的基础,元数据候选识别需要超越频率的语义解释。本文的主要贡献是方法论的:它为决定何时将描述性属性建模为属性图模式中的可复用元数据提供了更明确和系统的基础。

英文摘要

Property-graph schemas often contain descriptive properties that recur across heterogeneous nodes and edges, yet schema designers lack a clear method for deciding whether such properties should remain embedded or be treated as reusable metadata structures. This paper addresses this design-stage problem within a 5GNF-oriented modeling perspective by proposing a method for identifying metadata candidates based on five criteria: cross-element occurrence, conceptual independence, lossless externalization, reuse potential, and governance relevance. The method classifies properties into trait candidates, embedded properties, and borderline cases using a rule-based decision workflow. The approach is illustrated using a running example from a library domain and examined through an illustrative validation involving participant-based classification tasks in two schema contexts. The results show that recurrence alone is not a sufficient basis for externalization and that metadata-candidate identification requires semantic interpretation beyond frequency. The main contribution of the paper is methodological: it provides a more explicit and systematic basis for deciding when descriptive properties should be modeled as reusable metadata in property-graph schemas.

2606.18289 2026-06-18 cs.HC cs.CY 新提交

Beyond the Algorithm: Professional Experiences and Perceptions of AI Bias

超越算法:人工智能偏见的专业经验与认知

Micarah Malone-Gawu

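AI summary: A qualitative multi-case study of how AI practitioners perceive and mitigate algorithmic bias finds that bias arises from historical inequities, exclusionary design assumptions, and organizational pressures, and that fairness requires structural accountability, diverse participation, and sustained cognitive awareness.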

Comments PhD thesis

详情
AI中文摘要

这项质性多案例研究的目的是考察社会偏见如何在人工智能和机器学习系统中出现、被感知以及如何被直接参与其设计、开发和治理的从业者所缓解。尽管使用了医疗、刑事司法、就业和教育领域的例子来说明自动化系统塑造日常生活的领域,但本研究聚焦于AI从业者的生活经验和专业见解,而非特定部门的人群。在交叉性理论和认知科学的指导下,本研究采用解释主义方法,对九名从业者进行了半结构化访谈,并辅以文档分析和三角验证的案例材料以丰富情境理解。研究结果表明,算法偏见源于历史不公、排他性设计假设以及优先考虑速度和效率而非伦理反思的组织压力。参与者强调,仅靠技术修正无法确保公平;相反,公平的AI需要结构性问责、多元参与以及在开发周期中持续的认知意识。许多人描述了伦理标准执行不力以及组织文化对负责任实践支持不一致的情况。研究得出结论,以人为中心且具有社会基础的AI发展依赖于在早期设计过程中嵌入伦理、加强治理框架以及培养鼓励反思性决策的制度环境。这些见解有助于当前关于负责任AI的讨论,并为寻求设计透明、负责且与其影响的社区相一致的系统的组织提供实践指导。

英文摘要

The purpose of this qualitative multi-case study was to examine how social bias emerges, is perceived, and can be mitigated within artificial intelligence and machine learning systems by practitioners directly involved in their design, development, and governance. Although examples from healthcare, criminal justice, employment, and education were used to illustrate domains where automated systems shape everyday life, the study focused on the lived experiences and professional insights of AI practitioners rather than sector-specific populations. Guided by Intersectionality Theory and Cognitive Science, the study employed an interpretivist approach, utilizing semi-structured interviews with nine practitioners, supplemented by document analysis and triangulated case material to enrich contextual understanding. Findings showed that algorithmic bias arises from historical inequities, exclusionary design assumptions, and organizational pressures that prioritize speed and efficiency over ethical reflection. Participants emphasized that technical corrections alone cannot ensure fairness; instead, equitable AI requires structural accountability, diverse participation, and sustained cognitive awareness during the development lifecycle. Many described limited enforcement of ethical standards and organizational cultures that inconsistently support responsible practice. The study concludes that human-centered and socially grounded AI development depends on embedding ethics early in the design process, strengthening governance frameworks, and cultivating institutional environments that encourage reflective decision-making. These insights contribute to ongoing conversations on responsible AI and offer practical guidance for organizations seeking to design systems that are transparent, accountable, and aligned with the communities they affect.

2606.18285 2026-06-18 cs.SI cs.CY 新提交

RELIANCE: Curating and Evaluating Reproductive Health Information on Social Media

RELIANCE: 策展与评估社交媒体上的生殖健康信息

Vaibhav Balloli, Laura Peyton Ellis, Vishala Mishra, Alice Chi, Alex Peahl, Elizabeth Bondi-Kelly

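AI summary: Builds RELIANCE, an expert-annotated dataset of pregnancy and postpartum health information on TikTok, and evaluates LLM fact-checking on it; nearly 60% of the sampled information is accurate, and there is a 15% gap between evaluating specific claims and evaluating entire content.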

Comments Accepted at Datasets and Benchmarks Track, ACM Knowledge Discovery and Data Mining (KDD) 2026. Project page: https://realize-lab.github.io/RELIANCE/

详情
AI中文摘要

像TikTok这样的社交媒体平台已成为健康信息的关键来源,研究报告称帖子中存在不准确信息。随着大型语言模型(LLM)提供商越来越多地将LLM集成到数字平台中以进行事实核查(例如,X上的Grok和WhatsApp上的Perplexity),并且人们正在使用它们来核查信息,在生殖健康等关键领域部署这些系统而不进行严格评估可能会造成严重伤害。我们介绍了RELIANCE,一个关于TikTok上围绕孕期和产后查询的健康信息的专家标注数据集,既作为生殖健康信息格局的分析,也作为LLM在事实核查这些内容方面的能力评估。我们的数据集包含来自56个经临床医生审核的查询的336个视频中的409个标注句子,由三位产科、妇科和内科专家临床医生进行标注。我们的发现显示,我们采样的视频中近60%的健康信息是准确的。此外,LLM评估揭示了评估具体声明与评估整个内容之间的差距(15%)。我们相信,我们的方法、数据集和工具将支持机器学习社区使用真实世界数据改进LLM在重要领域的应用,扩展到其他平台和语言,并帮助健康社区进一步了解社交媒体上的信息格局。我们的数据集和代码可在以下网址获取:https://realize-lab.github.io/RELIANCE/。

英文摘要

Social media platforms like TikTok have become a key source of health information, with studies reporting inaccuracies in posts. As Large Language Model (LLM) providers increasingly integrate LLMs into digital platforms to fact-check content (e.g., Grok on X and Perplexity on WhatsApp), and as people increasingly use them to fact-check information, deploying these systems in critical areas such as reproductive health without rigorous evaluation can cause serious harm. We introduce RELIANCE, an expert-annotated dataset of health information on TikTok surrounding pregnancy and postpartum queries, serving as both an analysis of the reproductive health information landscape and an evaluation of LLMs' capabilities in fact-checking this content. Our dataset comprises 409 annotated sentences from 336 videos across 56 clinician-reviewed queries, annotated by three expert clinicians in Obstetrics, Gynecology, and Internal Medicine. Our findings reveal that nearly 60% of the health information in the videos we sampled is accurate. Furthermore, LLM evaluations reveal a 15% gap between evaluating specific claims and evaluating the entire content. We believe that our methodology, dataset, and tool will support the machine learning community in improving LLMs for important domains with real-world data, extending to other platforms and languages, and helping the health community further understand the information landscape on social media. Our dataset and code are made available at https://realize-lab.github.io/RELIANCE/.

2606.18279 2026-06-18 cs.SI 新提交

Joint Discovery of Graph Structure and Dynamics in Stochastic Interacting Particle Systems

随机相互作用粒子系统中图结构与动力学的联合发现

Demao Liu, Ting Gao, Jinqiao Duan

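AI summary: Proposes alternating least-squares estimators that jointly identify the network structure and dynamics of stochastic interacting particle systems, with accuracy and robustness validated on synthetic and real data.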

Comments Preprint. 8 figures

Details
AI Chinese abstract

我们研究了随机相互作用粒子系统中网络结构和控制动力学的联合识别问题,该系统包含一个未知的有向加权交互图,以及未知的局部和非局部相互作用分量。我们将该问题表述为图和相关基系数的耦合逆问题,并开发了两种交替最小二乘型估计器:三块方案(TALS)和集成对角增强方案(IALS)。IALS公式将局部和相互作用系数的更新合并为一个最小二乘子问题,特别适用于节点局部动力学共享共同函数模板(仅节点依赖缩放)的场景。我们进一步在秩2联合强制条件及适当归一化约定下建立了可辨识性结果。合成实验表明,所提出的估计器能够准确恢复交互图和动力学分量,并在随机强迫、观测噪声和基失配下保持鲁棒性。我们还提供了关于发作期SEEG记录的说明性真实数据应用,其中学习到的模型在多种基配置下产生稳定且可解释的动力学摘要。这项工作为学习随机相互作用粒子系统提供了一个有理论保证的可扩展框架,在计算生物学、神经科学等领域具有广泛的数据驱动识别潜力。

English abstract

We study the joint identification of network structure and governing dynamics in stochastic interacting particle systems, which consist of an unknown directed weighted interaction graph with unknown local and non-local interaction components. We formulate the problem as a coupled inverse problem for the graph and the associated basis coefficients, and develop two alternating least-squares-type estimators: a three-block scheme (TALS) and an integrated diagonal-augmented scheme (IALS). The IALS formulation combines the updates of the local and interaction coefficients into a single least-squares subproblem, and is particularly well suited to settings in which the nodewise local dynamics share a common functional template up to node-dependent scaling. We further establish an identifiability result under a rank-2 joint coercivity condition together with an appropriate normalization convention. Synthetic experiments show that the proposed estimators accurately recover both the interaction graph and the dynamical components, and remain robust under stochastic forcing, observation noise, and basis mismatch. We also provide an illustrative real-data application on ictal SEEG recordings, where the learned models produce stable and interpretable dynamical summaries across multiple basis configurations. This work advances a theoretically guaranteed scalable framework for learning stochastic interacting particle systems, with broad potential for data-driven identification in computational biology, neuroscience, and beyond.
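
To make the alternating least-squares idea concrete, here is a minimal sketch of a generic two-block scheme on a toy bilinear model. It is not the TALS or IALS estimator from the abstract: it omits the local-dynamics block and any stochastic forcing. It assumes observations of the form `Y[t, i] = sum_j A[i, j] * sum_k c[k] * Phi[t, i, j, k]`. The names `als_bilinear`, `Phi`, `Y`, `A`, and `c` are hypothetical and chosen only for illustration. The unit-norm constraint on `c` is one simple way to remove the scale ambiguity between the graph weights and the basis coefficients, and it stands in for the normalization convention the abstract mentions without claiming to match it.

```python
import numpy as np


def als_bilinear(Phi, Y, n_iter=50, seed=0):
    """Two-block ALS for Y[t,i] = sum_j A[i,j] * sum_k c[k] * Phi[t,i,j,k].

    Phi: (T, N, N, K) pairwise basis features (assumed given).
    Y:   (T, N) observed responses.
    Returns the estimated weights A (N, N), coefficients c (K,),
    and the mean squared residual after each sweep.
    """
    T, N, _, K = Phi.shape
    rng = np.random.default_rng(seed)
    c = rng.standard_normal(K)
    c /= np.linalg.norm(c)
    A = np.zeros((N, N))
    history = []
    for _ in range(n_iter):
        # Block 1: with c fixed, each row of A is an ordinary least-squares fit.
        G = Phi @ c  # shape (T, N, N)
        for i in range(N):
            A[i], *_ = np.linalg.lstsq(G[:, i, :], Y[:, i], rcond=None)
        # Block 2: with A fixed, c is a single stacked least-squares fit.
        H = np.einsum("ij,tijk->tik", A, Phi)  # shape (T, N, K)
        c, *_ = np.linalg.lstsq(H.reshape(T * N, K), Y.reshape(T * N), rcond=None)
        # Normalization: move the scale of c into A so the product is unchanged.
        scale = np.linalg.norm(c)
        if scale > 0:
            c, A = c / scale, A * scale
        pred = np.einsum("ij,tijk,k->ti", A, Phi, c)
        history.append(float(np.mean((pred - Y) ** 2)))
    return A, c, history


# Synthetic, noise-free usage example with made-up sizes.
rng = np.random.default_rng(1)
T, N, K = 200, 4, 3
Phi = rng.standard_normal((T, N, N, K))
A_true = rng.standard_normal((N, N))
c_true = rng.standard_normal(K)
Y = np.einsum("ij,tijk,k->ti", A_true, Phi, c_true)

A_hat, c_hat, history = als_bilinear(Phi, Y)
print("first and last residual:", history[0], history[-1])
```

Each block update exactly minimizes the same squared residual over its own variables, so the residual cannot increase from one sweep to the next (up to floating-point error). Convergence to the true parameters is not guaranteed in general.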

2606.18274 2026-06-18 cs.SI New submission

HyDRA: Lossless Hypergraph Summarization via Co-Clustering

HyDRA: 通过共聚类实现无损超图摘要

Giulia Preti, Aris Anagnostopoulos, Francesco Bonchi

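AI summary: Proposes HyDRA, the first framework for lossless summarization of weighted hypergraphs. It designs a greedy algorithm based on co-clustering, combined with an incremental update strategy, achieving 80-93% storage reduction while supporting direct queries and faster downstream tasks.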

Details
AI Chinese abstract

超图是表示高阶交互的强大表示形式,但其规模和复杂性带来了显著的数据管理和分析挑战。虽然摘要技术广泛用于简化简单图,但超图的无损摘要仍未得到探索。我们引入了HyDRA,这是首个用于加权超图无损摘要的正式框架。在我们的框架中,摘要是一个由超节点(节点组)和超超边(超边组)组成的新加权超图,并配有一个用于精确重建的校正表。通过建立与共聚类的概念联系,我们设计了一种高效、无参数的贪心算法,该算法迭代地合并节点和超边聚类,以最小化一种新颖的存储感知代价函数。HyDRA采用增量更新策略,以避免每一步中校正表的昂贵重新计算。大量实验表明,我们的方法在存储成本上实现了显著降低(在某些设置中,根据超图特征,降低80-93%)。由于生成的摘要本身是超图,可以直接查询,为各种连通性和中心性查询提供快速且准确的近似答案,并加速诸如影响力最大化等下游任务。

English abstract

Hypergraphs are a powerful representation for higher-order interactions, but their scale and complexity pose significant data management and analysis challenges. While summarization techniques are widely used to distill simple graphs, lossless summarization for hypergraphs remains unexplored. We introduce HyDRA, the first formal framework for lossless summarization of weighted hypergraphs. In our framework, a summary is a new weighted hypergraph composed of supernodes (groups of nodes) and superhyperedges (groups of hyperedges), paired with a correction table for exact reconstruction. By establishing a conceptual link to co-clustering, we design an efficient, parameter-free greedy algorithm that iteratively merges node and hyperedge clusters to minimize a novel storage-aware cost function. HyDRA employs an incremental update strategy to prevent the costly recomputation of the correction table at each step. Extensive experiments demonstrate that HyDRA achieves a substantial reduction in storage cost (80-93% in some settings, depending on the hypergraph characteristics). Because the resulting summaries are themselves hypergraphs, they can be queried directly, providing fast and accurate approximate answers for various connectivity and centrality queries, and accelerating downstream tasks such as influence maximization.
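
The role of a correction table in lossless summarization can be illustrated with a small toy sketch. This is not HyDRA's algorithm. It assumes an unweighted hypergraph and a node grouping that is already given, and it groups only nodes, so there is no hyperedge grouping, no cost function, and no greedy merging. The names `summarize`, `reconstruct`, and `supernodes` are hypothetical. Each hyperedge is stored as the set of supernodes it touches, and the correction table lists the nodes that this coarse encoding adds by mistake, so that the original hyperedge can be rebuilt exactly.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Hyperedge = FrozenSet[str]


def expand(touched: FrozenSet[str], supernodes: Dict[str, Set[str]]) -> Set[str]:
    """All original nodes covered by a set of supernode ids."""
    out: Set[str] = set()
    for s in touched:
        out |= supernodes[s]
    return out


def summarize(
    edges: List[Hyperedge], supernodes: Dict[str, Set[str]]
) -> Tuple[List[FrozenSet[str]], List[FrozenSet[str]]]:
    """Encode each hyperedge by the supernodes it touches, plus corrections."""
    node_to_super = {v: s for s, members in supernodes.items() for v in members}
    summary, corrections = [], []
    for e in edges:
        touched = frozenset(node_to_super[v] for v in e)
        # The expansion is a superset of e; record the surplus nodes to remove.
        corrections.append(frozenset(expand(touched, supernodes) - e))
        summary.append(touched)
    return summary, corrections


def reconstruct(
    summary: List[FrozenSet[str]],
    corrections: List[FrozenSet[str]],
    supernodes: Dict[str, Set[str]],
) -> List[Hyperedge]:
    """Rebuild the original hyperedges exactly from summary and corrections."""
    return [
        frozenset(expand(touched, supernodes) - minus)
        for touched, minus in zip(summary, corrections)
    ]


# Made-up example data.
supernodes = {"S1": {"a", "b"}, "S2": {"c", "d"}}
edges = [frozenset({"a", "b"}), frozenset({"a", "b", "c"}), frozenset({"c"})]

summary, corrections = summarize(edges, supernodes)
assert reconstruct(summary, corrections, supernodes) == edges
```

In this toy version the first hyperedge needs no correction because it coincides with a supernode. This shows why a good grouping keeps the correction table, and therefore the total storage, small.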