arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.18052 2026-06-17 cs.CR new submission

An Empirical Analysis of AI Slop in Music Streaming

Stanley Wu, Josephine Passananti, Viresh Mittal, Wenxin Ding, Haitao Zheng, Ben Y. Zhao

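AI summary: Studies the flood of AI music on streaming platforms. By analyzing Spotify data and publishing its own AI-generated tracks, the paper finds that 93% of AI music receives very few plays, that distributors enforce their AI policies poorly, and that detection methods are inaccurate; it predicts a self-sustaining shadow industry unless action is taken.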

Abstract

Generative AI models lower the bar for content creation, making it easy for any user to create professional-looking images, text, and music with minimal effort. This has enabled a new cottage industry around the creation of "AI slop": mass quantities of mediocre content produced to generate revenue, often through misrepresentation as human-authored content, or through scams involving automated scripts and fake consumption. While there are obvious parallels between the AI-slop industry and "traditional" email spam networks, it might be too early to determine if AI slop generation can grow into a similar self-sustaining industry. In this paper, we look specifically at the music industry and explore the question: Can we prevent AI music slop from growing into a self-sustaining shadow industry? To answer this question, we characterize the current state of AI slop in music and its pipeline, from generation through distribution to consumption by users on streaming platforms. By examining growth and engagement on Spotify, we confirm that AI music exhibits AI slop characteristics: the overwhelming majority (93%) of AI music receives few, if any, listener plays and is rarely recommended. AI musicians "spray and pray," releasing large volumes of music across multiple genres in hopes of generating a hit. We also explore the AI slop pipeline by generating our own AI tracks and publishing them to streaming platforms through 11 indie music distributors. We find that distributors have inconsistent and largely unenforced policies on AI music, making it surprisingly easy to publish mass-produced AI songs. Finally, we consider AI music detection and find that current methods lack accuracy or robustness. As generation costs decrease, we believe slop generation in music will become self-sustaining unless the music industry takes concrete steps. We consider and discuss potential mitigation methods based on our findings.

2606.18042 2026-06-17 cs.DC new submission

Latency Prediction for LLM Inference on NPU Systems

Juhyun Park, Seungwoo Jeong, Jingyu Lee, Kyungyong Lee

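AI summary: Latency prediction for LLM inference on NPUs is hampered by undisclosed microarchitectures, unpredictable compiler optimizations, and non-linear latency caused by bucketing. The paper proposes LENS, a latency estimator that takes two end-to-end measurements per bucket and composes them to predict latency for arbitrary input-output length combinations, with a mean prediction error of 2.15%.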

Comments 12 pages, 9 figures. Submitted to IISWC 2026

Abstract

Deploying Large Language Models (LLMs) requires exploring a large configuration space spanning parallelization strategies, batching techniques, and scheduling policies. Exhaustive measurement across this space is impractical, making latency prediction essential for system optimization. While NPUs have emerged as accelerators designed for LLM inference, no prediction methodology has been established for them. Specifically, applying prior work to LLM inference latency prediction on NPUs faces three challenges: undisclosed microarchitecture of commercial NPUs, unpredictable compiler optimizations, and latency non-linearity induced by bucketing. We present LENS, a latency estimator that predicts NPU inference latency without information on the microarchitecture or compiler, and captures the non-linear latency induced by bucketing. LENS profiles each bucket with two end-to-end (E2E) measurements and composes the results to predict latency for arbitrary input-output length combinations. We validate LENS across NPUs from multiple vendors, several LLMs, and diverse workloads, achieving a mean prediction error of 2.15%. We further compare LENS against two methodologically related baselines, confirming the validity of its approach.

2606.18038 2026-06-17 cs.ET new submission

Low-Cost Home Automation System for Municipal Swimming Pool: Arduino-Based Implementation and Data Analysis

Júlio Rocha, Salviano Soares, Carlos Quental

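AI summary: Proposes a low-cost Arduino-based automation system for a municipal swimming pool covering security, air quality, gas leakage, energy consumption, and temperature and humidity control; its two phases, hardware assembly and data analysis, provide real-time data collection and decision support.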

Abstract

This paper presents a low-cost home automation system implemented in a municipal swimming pool to address various challenges, including security concerns, air quality control, gas leakage detection, energy consumption reduction, and temperature and humidity control on the pool deck. The system utilises Arduino microcontrollers with sensors and actuators, enabling real-time data collection and analysis. The project is divided into two phases: hardware assembly and data analysis. In the hardware assembly phase, the Arduino sends data to a web Application Programming Interface (API) and stores it in a time-series database, with results presented in an Android application. The data analysis phase involves statistical exploration using libraries such as Pandas, NumPy, and Matplotlib. The proposed system aims to enhance decision-making based on collected and analysed data.

2606.18031 2026-06-17 cs.SI new submission

Pareto Optimal Re-ranking with Semi-Automated Content Credibility Detection

Yigit Ege Bayiz, Arash Amini, Ufuk Topcu

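AI summary: Proposes a dual-objective optimization method that preserves the original ranking by minimizing Spearman's footrule distance while maximizing content credibility, and designs a semi-automated pipeline that combines retrieval augmentation with human fact-checks to assign credibility scores; on data from X, it deviates from the Pareto-optimal front by at most 7%.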

Comments Submitted to CDC 2026

Abstract

Social media posts often include misinformative or misleading content, diminishing the expected credibility of content feeds. We present an optimization-based method to improve the credibility of news content on social media feeds by refining existing content rankings. This method is based on a dual-objective optimization approach that minimizes the Spearman's footrule distance to the original ranking to maintain the original content order while incorporating an additional linear cost objective to elevate the expected credibility of the content feed. Additionally, we propose a robust semi-automated pipeline for assigning credibility scores to content based on a mixture of retrieval-augmented score assignments and human-generated fact-checks. This semi-automated pipeline helps ground the credibility assignment using human-generated labels while ensuring the algorithm extends to posts with few or no human-generated labels. We showcase our approach through an experimental setup using real-world data collected over X (Twitter), where we assign the credibility scores based on a mixture of user-generated community notes and retrieval augmented generation. The method we present leads to at most 7% deviation in both optimization objectives from the Pareto optimal front with known initial ranking values. Additionally, the algorithm allows for incorporating different measures for source credibility, making it applicable across various social media platforms.
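
The trade-off described in this abstract can be illustrated on a toy feed. The following is a minimal sketch, not the authors' method: it collapses the two objectives into one score with a hypothetical weight `lam`, uses hypothetical per-position `attention` weights to form the linear credibility cost, and brute-forces every permutation, which is only feasible for a handful of posts. All function names, scores, and weights are assumptions made for illustration.

```python
from itertools import permutations


def footrule(order_a, order_b):
    """Spearman's footrule: total displacement of items between two rankings."""
    pos_b = {item: i for i, item in enumerate(order_b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(order_a))


def credibility_cost(order, credibility, attention):
    """Linear cost (assumed form): attention-weighted lack of credibility."""
    return sum(attention[i] * (1.0 - credibility[item])
               for i, item in enumerate(order))


def rerank(original, credibility, attention, lam):
    """Brute-force the ranking minimizing footrule + lam * credibility cost."""
    best = min(
        permutations(original),
        key=lambda cand: footrule(original, cand)
        + lam * credibility_cost(cand, credibility, attention),
    )
    return list(best)


original = ["a", "b", "c"]                      # hypothetical feed order
credibility = {"a": 0.1, "b": 0.9, "c": 0.8}    # hypothetical scores in [0, 1]
attention = [1.0, 0.5, 0.25]                    # hypothetical position weights

print(rerank(original, credibility, attention, lam=0.0))
print(rerank(original, credibility, attention, lam=10.0))
```

With `lam=0.0` the original order is kept; with `lam=10.0` the low-credibility post `a` drops one position while the rest of the order changes as little as possible. Sweeping `lam` traces out different compromises between the two objectives.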

2606.18030 2026-06-17 cs.HC new submission

ParaTutor: LLM Mediated Parent Child Tutoring through Role Separated Scaffolding Interface in Real Time

Lan Luo, Anqi Wang, Muzhi Zhou, Junhua Zhu, Jie Cai, Ao Yu, Hui Pan

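AI summary: To address the asymmetric roles in parent-child tutoring, proposes ParaTutor, a system that gives parents tutoring guidance and gives children visual grounding, keeping parents in the lead and children engaged during real-time interaction; experiments show it outperforms generic LLM assistance.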

Abstract

Parent-child tutoring is a collaborative learning setting with asymmetric roles, where parents guide children's problem solving while children engage in understanding and reasoning. However, most LLM-based learning systems are designed for either single users or symmetric collaboration, leaving parent-child tutoring with distinct instructional roles underexplored. Through a formative study, we find that effective parent-child tutoring depends on preserving these distinct roles, with parents guiding the learning process and children remaining actively engaged in reasoning. We also identify recurring challenges when parents struggle to understand problem structure, lack sufficient knowledge to provide support, or encounter communication difficulties that disrupt shared understanding. To address these challenges, we present ParaTutor, a scaffolding system that provides different forms of support to parents and children. ParaTutor supports parents with guidance for tutoring and provides children with visual grounding for problem solving. We evaluate ParaTutor with 23 parent-child dyads (children aged 10 to 12) under four tutoring conditions that vary how LLM assistance is delivered. Results show that generic LLM assistance tends to reduce the parent's role in tutoring, whereas ParaTutor better preserves parent-led support and sustains children's participation in reasoning. These findings suggest that in multi-user learning, the value of LLM support depends not only on model capability but also on how support is distributed across users with different roles. Our work contributes design implications for LLM systems that support family learning.

2606.18017 2026-06-17 cs.NI new submission

Energy-Efficient FSO Reconfiguration under User Mobility in Hybrid Fiber-IAB Backhaul

Piotr Lechowicz, Charitha Madapatha, Carlos Natalino, Tommy Svensson, Paolo Monti

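AI summary: To handle time-varying backhaul demand caused by user mobility, proposes a closed-loop, load-aware hysteresis controller for energy-efficient reconfiguration of hybrid fiber-IAB-FSO backhaul; energy use falls by 8% to 44% at a cost of only 0.9% to 6.7% of coverage.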

Abstract

User mobility creates stochastic, time-varying backhaul demand that static capacity provisioning cannot match. We propose a closed-loop, load-aware hysteresis controller for hybrid fiber-IAB-FSO backhaul and show that energy drops faster than coverage: 8% to 44% energy savings cost only 0.9% to 6.7% coverage.
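
The abstract does not spell out the control law, so the following is only a generic sketch of what load-aware hysteresis means: a link is switched on when measured load rises above an upper threshold and switched off only after load falls below a lower threshold, so that load hovering near a single threshold does not cause rapid toggling. The function names, the on/off model of an FSO link, and the threshold values 0.8 and 0.5 are all assumptions made for illustration.

```python
def hysteresis_step(active, load, on_threshold, off_threshold):
    """Return the next on/off state of a link given the current load."""
    if not active and load >= on_threshold:
        return True       # demand is high: activate the extra link
    if active and load <= off_threshold:
        return False      # demand has clearly dropped: power the link down
    return active         # inside the dead band: keep the current state


def run_controller(loads, on_threshold=0.8, off_threshold=0.5):
    """Closed loop over a load trace; returns the link state after each sample."""
    active = False
    states = []
    for load in loads:
        active = hysteresis_step(active, load, on_threshold, off_threshold)
        states.append(active)
    return states


# Hypothetical normalized load trace produced by users moving through a cell.
trace = [0.30, 0.85, 0.70, 0.60, 0.45, 0.70]
print(run_controller(trace))
```

In this trace the link turns on at 0.85, stays on while load sits between the two thresholds, and turns off at 0.45. The final sample of 0.70 does not turn it back on, which is the energy-saving behavior the dead band is meant to provide.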

2606.18016 2026-06-17 cs.NI new submission

User-Mobility-Aware Optimization of Fiber Placement in Hybrid Fiber-IAB Networks

Piotr Lechowicz, Charitha Madapatha, Carlos Natalino, Tommy Svensson, Paolo Monti

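AI summary: Proposes a metaheuristic optimization method that integrates user dynamics into topology design, yielding more adaptive and cost-efficient backhaul architectures for hybrid fiber-IAB networks and supporting scalable, flexible 6G network infrastructure.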

Abstract

Metaheuristic optimization of hybrid fiber-IAB networks demonstrates that integrating user dynamics into topology design enables more adaptive and cost-efficient backhaul architectures, contributing to the development of scalable and flexible 6G network infrastructures.

2606.18010 2026-06-17 cs.HC new submission

Co-Creativity at the Table: A Qualitative Analysis of Creative Interactions in the Podcast "Adventure AI"

Hanna Dodd, Daniel G. Brown

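AI summary: Through a qualitative analysis of human-AI interaction in Dungeons & Dragons play on the podcast "Adventure AI", examines the roles of AI and of humans in tabletop gaming, how AI is evaluated and where it fails, and how it is treated as a person and character at the table.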

Comments 11 pages, 3 tables

Abstract

Tabletop role-playing games provide a unique environment for interaction with artificial intelligence (AI) due to their complex and collaborative nature. We analyze Adventure AI, a podcast featuring human-AI interactions in Dungeons & Dragons play, to examine how AI is and can be used in tabletop role-playing gaming and how players perceive this use. We complete a qualitative analysis of three seasons of this podcast, from 2023 to 2025, reporting on the overarching themes of roles of AI, roles of humans, the evaluations and failures of AI, and its treatment as a person and character at the table. There are many aspects of the game where artificial intelligence succeeds, while there are others where it is less appropriate. This analysis gives a basis for future work on where artificial intelligence should and should not be used in gaming spaces.

2606.17991 2026-06-17 cs.CG new submission

Greedy Vector Balancing

Wojciech Czerwiński, Daniel Dadush, Ekin Ergen, Arka Ghosh, Sławomir Lasota, Łukasz Orlikowski

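AI summary: For online vector balancing, analyzes the natural Euclidean greedy algorithm and proves that, when the vector set is finite, the norm of the signed sums it produces is bounded independently of the sequence length; the result is generalized to online vector partitioning.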

Comments 21 pages, 3 figures

Abstract

In online vector balancing, vectors $t_1,\dots,t_n$ arrive one by one from a given set $T$ and the goal is to assign signs $s_1,\dots,s_n\in\{\pm1\}$ in an online manner so as to minimize the largest norm of any signed prefix sum $\sum_{i=1}^{k} s_i t_i$, $k \in [n]$. In this paper, we analyze the natural Euclidean greedy vector balancing algorithm for this problem: at each step $k$, the sign $s_k\in\{\pm1\}$ is chosen so that $s_k t_k$ has non-positive inner product with $\sum_{i=1}^{k-1} s_i t_i$. Our main result is the first finite bound, independent of the sequence length $n$, on the performance of greedy whenever $T$ is finite. When $T \subset \mathbb{R}^d$ consists of unit vectors, we prove that the signed sums produced by greedy have Euclidean norm at most $(2/\delta_T)^{d-1}$, where $\delta_T$ is the minimum non-zero distance between vectors in $T$ and subspaces spanned by vectors in $T$. The same upper bound holds when the sequences are composed of scaled-down vectors in $T$. We also provide a simple set $T$ for which $\Omega(\sqrt{d}/\delta_T)$ is a lower bound. We analyze the greedy algorithm by proving the existence of a bounded convex set $K_T$ that is $T$-absorbing: $\forall x\in K_T$ and $t \in\pm T$, $\langle x,t\rangle\leq0\Rightarrow x+t\in K_T$. We give an explicit construction of a set $K_T$ contained in a ball of radius $(2/\delta_T)^{d-1}$, based on chains of subspaces spanned by vectors in $T$, which may be of independent interest. We generalize our greedy vector balancing bound to online vector partitioning, where the sequence $t_1,\dots,t_n$ must be partitioned in an online manner into $p$ subsequences. As an application, we prove a special case of a conjecture of Bosman et al. (arXiv:2402.19259), showing that a lexicographic version of total completion time scheduling under scenarios is polynomial-time solvable when the number of scenarios is fixed.
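
The greedy rule stated in this abstract is short enough to write down directly. The following is a minimal sketch in plain Python: each arriving vector gets the sign that makes its inner product with the running signed sum non-positive, and the largest prefix norm is tracked along the way. The function name, and the choice of `+1` when the inner product is exactly zero, are assumptions; the abstract does not specify a tie-breaking rule.

```python
import math


def greedy_signs(vectors):
    """Online greedy vector balancing.

    Returns the chosen signs and the largest Euclidean norm seen over all
    signed prefix sums.
    """
    prefix = [0.0] * len(vectors[0])
    signs = []
    max_norm = 0.0
    for t in vectors:
        inner = sum(p * x for p, x in zip(prefix, t))
        # Choose s so that <s * t, prefix> <= 0; ties default to +1 (assumption).
        s = -1 if inner > 0 else 1
        prefix = [p + s * x for p, x in zip(prefix, t)]
        signs.append(s)
        max_norm = max(max_norm, math.sqrt(sum(p * p for p in prefix)))
    return signs, max_norm


# Unit vectors drawn from the finite set T = {(1, 0), (0, 1)}.
sequence = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
print(greedy_signs(sequence))
```

On this sequence the signs alternate and no prefix sum has norm above 1. The paper's contribution is the worst-case bound over all sequences from a finite set, which a single run like this does not demonstrate.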

2606.17987 2026-06-17 cs.NI cs.CR cs.GT new submission

Security-Induced Braess Paradoxes in Service Function Chain Orchestration

Daniel Commey, Bin Mai

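AI summary: Studies a Braess paradox in NFV/SDN orchestration in which adding defensive options can worsen service performance, and proposes a sufficient condition and a pre-deployment screening mechanism; experiments show that paradox-aware constrained use keeps the performance penalty below 1.9%.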

Abstract

NFV/SDN orchestration lets operators instantiate and steer traffic through virtual firewalls, IDS/IPS replicas, WAF clusters, zero-trust gateways, backup inspection paths, and migration targets on demand. Operators often treat these options as monotone improvements: more inspection capacity, lower nominal latency, or broader placement flexibility should not degrade the service. That intuition can fail even when the new option is locally attractive. We study a security-induced Braess paradox in service function chain (SFC) orchestration, where adding a defensive option worsens the post-adaptation equilibrium by concentrating traffic and adversarial value on shared security resources. We define Braessian security-management actions, derive a sufficient condition for paradox emergence under affine load-dependent VNF delay, and give a pre-deployment orchestration screen that rejects, caps, or reserves harmful options. A multi-tenant SFC experiment suite applies the model to four topology-derived settings: a fat-tree datacenter, NSFNET-style WAN, GEANT-style WAN, and edge/fog topology. Under default parameters in the Braessian regime identified by the theory, naive defensive expansion raises equilibrium service cost by 27.2-30.8% and increases risk concentration by factors of 6.1-9.7. Paradox-aware constrained use keeps the residual penalty below 1.9%, reduces service cost by 20.0-22.1% relative to naive expansion, and lowers a concentration-sensitive attack-loss proxy by 93.5% on average.
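
Why a locally attractive option can worsen the equilibrium is easiest to see in the textbook Braess network, which also uses affine load-dependent delays. The sketch below is that classic four-node example, not the paper's SFC model: the node names, unit demand, and delay functions are the standard textbook ones, and the function names are hypothetical.

```python
def path_delays(f_upper, f_lower, f_shortcut):
    """Per-path delay in the textbook Braess network with unit total demand.

    Paths: upper = s->a->t, lower = s->b->t, shortcut = s->a->b->t.
    Edges s->a and b->t have delay equal to their load; a->t and s->b have
    constant delay 1; the added a->b link has delay 0.
    """
    load_sa = f_upper + f_shortcut
    load_bt = f_lower + f_shortcut
    upper = load_sa + 1.0
    lower = 1.0 + load_bt
    shortcut = load_sa + 0.0 + load_bt
    return upper, lower, shortcut


def is_equilibrium(flows, delays, tol=1e-9):
    """Wardrop condition: every used path has the minimum delay on offer."""
    best = min(delays)
    return all(d <= best + tol for f, d in zip(flows, delays) if f > tol)


# Before the shortcut exists, only the upper and lower paths are available.
before_flows = (0.5, 0.5)
before_delays = path_delays(0.5, 0.5, 0.0)[:2]

# After adding the zero-delay shortcut, all traffic ends up on it.
after_flows = (0.0, 0.0, 1.0)
after_delays = path_delays(*after_flows)

print(is_equilibrium(before_flows, before_delays), before_delays[0])
print(is_equilibrium(after_flows, after_delays), after_delays[2])
```

Both flow patterns satisfy the equilibrium check, yet the per-user delay rises from 1.5 to 2.0 once the shortcut is added. In the paper's setting the shared edges would presumably correspond to shared security resources such as inspection VNFs; that mapping is an interpretation of the abstract, not something it states in these terms.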

2606.17986 2026-06-17 cs.CR new submission

ShellGames: Speculative LLM-Driven SSH Deception

Umberto Salviati, Fabio De Gaspari, Mauro Conti, Luigi Vincenzo Mancini

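AI summary: To address LLM limitations in deception systems such as lack of persistent state and inconsistent output, proposes ShellGames, which combines five techniques including chain-of-thought, memory management, and speculative execution, and clearly outperforms baselines on correctness, consistency, state tracking, and robustness.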

Abstract

Cyber deception and Moving Target Defense are promising strategies that aim to disrupt adversaries by increasing uncertainty. However, sustaining long-lived, credible interactive sessions with adversaries remains an open challenge. Large Language Models (LLMs) offer a promising path toward more dynamic deception systems, but suffer from key limitations that fundamentally restrict their applicability, including lack of persistent state, output inconsistencies, hallucinations, latency, and susceptibility to behavioral subversion that may reveal the deception. We propose ShellGames, an LLM-based SSH shell simulator designed to address these limitations. ShellGames combines five complementary techniques: (i) Automatic Chain-of-Thought and few-shot learning to improve correctness; (ii) memory management to maintain system state coherency; (iii) speculative command execution to reduce response latency; (iv) smart routing of complex interactive commands to a sandboxed environment; and (v) subversion detection leveraging the constrained input-output domain of shell environments. To enable systematic evaluation, we introduce a standardized benchmarking protocol and dataset spanning correctness, consistency, state tracking, and robustness tasks. ShellGames achieves 0.898 command accuracy on correctness (+5.3 pp over baselines), 0.918 sequence-level accuracy on consistency (+36 pp), 0.98 state tracking accuracy (+18.3 pp), and 0.95 accuracy on robustness (+37 pp). A user study with n = 20 participants confirms that ShellGames achieves realism comparable to a real shell under free exploration and outperforms traditional honeypots on perceived command coverage.

2606.17981 2026-06-17 cs.SE new submission

Planning to Hammer: Difficulty-Aware Decomposition for Automating Rocq Proofs

Ning Zhang, Nongyu Di, Zenan Li, Yuan Yao, Xiaoxing Ma

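AI summary: Proposes Quarry, a framework in which an LLM plans proof decompositions, a difficulty model ranks the subgoals, and CoqHammer discharges them automatically; success rates on Rocq benchmarks improve by 7% to 13%.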

Comments 26 pages, 8 figures; submitted to OOPSLA 2026

Abstract

As AI-generated code proliferates, formal verification, particularly through interactive theorem provers such as Rocq and Isabelle, becomes increasingly important for ensuring software correctness. However, producing machine-checked proofs in such provers remains a bottleneck. Existing solutions bring complementary strengths to proof automation: large language models (LLMs) can propose high-level proof strategies but lack local rigor, while automated tactics such as CoqHammer can reliably discharge many local goals but lack long-range planning capabilities. To combine the best of both worlds, we present Quarry, a planning-based proof synthesis framework that separates proof planning from proof execution. Specifically, Quarry asks an LLM to actively propose multiple proof decompositions with arbitrary sublemmas, type-checks them in Rocq under temporarily admitted sublemmas, and ranks candidates using a proof-state-based difficulty model that estimates hammer solvability. It then recursively proves sublemmas within a bounded budget, effectively turning long proofs into sequences of hammer-solvable obligations. We implement Quarry on top of SerAPI and CoqHammer and evaluate it using multiple frontier LLMs across multiple benchmarks. The experimental results show that planning-based decomposition with solvability-aware ranking substantially improves automation while maintaining predictable cost. Under a uniform 10-minute wall-clock budget, Quarry improves over the strongest baseline by 7% to 13% in success rate across three Rocq benchmarks. These results demonstrate that reliable proof automation can be achieved by coordinating neural planning with symbolic execution rather than replacing either.

2606.17957 2026-06-17 cs.CR cs.CY cs.HC new submission

Children Are Not the Enemy: Child-Fit Security as an Alternative to Bans and Surveillance

Kopo M. Ramokapane, Rui Huan, Zaina Dkaidek, Awais Rashid

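AI summary: Proposes "child-fit security", a design paradigm that treats children as legitimate users rather than threats and makes children's wellbeing, development, privacy, safety, agency, and rights core security requirements, protecting both children and their participation.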

Comments 14 pages, 2 figures, Paper Under review

Abstract

Digital technologies are now central to children's learning, play, communication, identity formation, and social participation. Yet dominant approaches to children's online safety often rely on containment mechanisms, including bans, age gates, parental controls, monitoring, and screen-time restrictions. These approaches can be useful in specific contexts, but they often frame child protection primarily as a problem of restricting access to systems designed for adults. In this paper, we argue that this framing is inadequate for children's digital lives and insufficient as a security paradigm. We propose child-fit security, a design paradigm in which technologies likely to be used by children treat children as legitimate users, not attackers to be excluded, vulnerabilities to be patched, or risks to be managed. In this paradigm, children's wellbeing, development, privacy, safety, agency, and rights become core security requirements. This shifts the focus of protection from apps, accounts, and data to the child-system relationship, which means protecting both the child and their participation. We conceptualise child-fit security, contrast it with containment-oriented approaches, define its core principles, and discuss its implications for security design. We conclude by presenting a research agenda for making child-fit security operational.

2606.17949 2026-06-17 cs.DC new submission

RouteBalance: Fused Model Routing and Load Balancing for Heterogeneous LLM Serving

Wei Da, Evangelia Kalyvianaki

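AI summary: Proposes RouteBalance, which fuses model routing and load balancing to jointly optimize quality, latency, and cost for heterogeneous LLM serving, reaching the best frontier on a 13-instance, 28-GPU cluster.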

Comments 12 pages, 5 figures

Abstract

Heterogeneous LLM serving stacks split scheduling into two layers that optimize in isolation: model routers pick a model from quality and cost signals while ignoring instance load, and serving load balancers optimize queues while ignoring quality. We present RouteBalance, a serving-aware scheduling layer that fuses both into a single online assignment over concrete model instances, jointly trading off quality, latency, and cost. A batched in-process predictor stack and dead-reckoned instance state keep the joint decision cheap on the request hot path ($\approx$32 ms at 12 req/s). On a 13-instance, 28-GPU heterogeneous cluster serving four model sizes, a single deployed RouteBalance stack traces the upper region of the three-way quality-cost-throughput frontier. Sweeping one weight vector reaches both the highest routing-decision quality (DeepEval $0.419$, $+0.013$ over the strongest baseline, $95\%$ CI $[{+}0.005,{+}0.022]$; the ordering holds when a second judge re-scores the actually served text) and, at its cost-priority corner, per-request cost that ties the cheapest baseline. With router engineering equalized against concurrent-scoring baseline variants we build, its balanced preset serves at $2.8$ s and $30$ req/s, leading $2.6$ to $4.1\times$ ahead of enhanced BEST-Route at high load. (Deploying those routers as published, one serial scoring call per request, makes them collapse $23\times$ under load, a deployment-architecture effect we isolate separately, not the routing result.) A four-arm isolation shows the benefit follows from pricing latency at model-selection time; the learned predictors contribute calibration and SLO headroom rather than the headline frontier. Code: https://github.com/AKafakA/route-balance

2606.17925 2026-06-17 cs.GT new submission

Parasitic Masquerade: Societal Scale Human-Machine Interaction

Jiejun Hu-Bolz, James Stovold

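AI summary: Using a Graphon Mean-Field Game model, studies how parasitic behavior in human-machine interaction can masquerade as productive learning, and finds that asymmetric information flow and environmental noise can trigger a phase transition in the system.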

Abstract

This work extends recent developments in studying human-machine interaction by scaling from individual game-theoretic models to a societal-level model. We adopt a Graphon Mean-Field Game (GMFG) that models the interaction among four groups of internally homogeneous but externally heterogeneous agents in a shared environment. Our results show that parasitism can masquerade as productive learning, with knowledge distribution and actions appearing healthy while being driven by machine coupling rather than independent investigation. To detect this, we measure the direction of information flow and the belief entropy of the environment, revealing that the human-to-machine channel dominates across all scenarios, with the asymmetry intensifying under parasitism. We further demonstrate that the system exhibits coexisting mutualistic and parasitic equilibria, where environmental noise can induce a tipping point that shifts agents past the cognitive cost barrier. These emergent phenomena are not designed into any individual agent but arise from the collective interaction structure, underscoring the need to study the sociology of humans and machines holistically as a complex system.

2606.17921 2026-06-17 cs.MM new submission

OlfactProfile: Profile-Conditioned Odor Prediction from Audiovisual Content

Zhengyu Lou, Bosheng Qin, Yanan Wang, Duanduan Yin, Wentao Ye, Yu Xin

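AI summary: Proposes OlfactProfile, a framework that predicts odors from audiovisual content using structured field-wise conditioning on user olfactory profiles; it outperforms baseline models and general-purpose multimodal large models, with the largest gains on Background Odor and Emotion Odor.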

Comments 10 pages, 5 figures

Abstract

Automated video-odor matching predicts scents aligned with audiovisual content for scent-enhanced media. Existing methods usually treat odor labels as determined only by scene content, but odor judgment also depends on individual olfactory profiles, including scent sensitivity, tolerance to unpleasant odors, and affective preference. Ignoring this observer context limits current systems' ability to predict scents that match perceived experience. We present OlfactProfile, a framework for profile-conditioned odor prediction from audiovisual content. Our results show that olfactory profiles are not beneficial by default: with matched feature backbones, naive profile concatenation and uniform profile modulation can degrade performance, while structured field-wise profile conditioning consistently improves prediction. Thus, the key challenge is not merely whether observer context is available, but how it is integrated into multimodal reasoning. To study this setting, we construct an audiovisual benchmark pairing temporally aligned odor annotations with annotator olfactory preference profiles. It contains 1,350 video clips, a 99-class scent vocabulary, and three semantic odor tracks: Foreground Odor, Background Odor, and Emotion Odor. We also propose OAR (Olfactory-Aware Routing), a multimodal fusion module that performs track-aware audiovisual routing with field-wise profile modulation, allowing profile dimensions to influence odor reasoning according to perceptual role. Experiments show that OlfactProfile outperforms supervised baselines and general-purpose multimodal large models, is competitive with odor experts in a small human comparison, and improves perceived scent fit in scent-enhanced applications without task-specific fine-tuning. Per-track analysis shows that gains are strongest for Background Odor and Emotion Odor, where observer-dependent judgment is most important.

2606.17914 2026-06-17 eess.SY cs.SY 新提交

Three-phase model of unbalanced distribution networks with DERs

含分布式能源的不平衡配电网三相模型

S. Perna, C. Lillo, A. R. Di Fazio, M. Russo, G. M. Casolino, P. Varilone, P. Verde

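AI summary Proposes Dist3Flow, a non-approximated three-phase branch flow model that uses the real and imaginary parts of nodal voltages and the power flows as state variables. It is solved with a backward/forward sweep algorithm, applies to both radial and closed-ring topologies, and is validated against OpenDSS.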

详情
AI中文摘要

经典的稳态配电网分析DistFlow方程无法捕捉由不对称线路、负载和分布式能源(DER)引起的三相系统固有失衡。本文将经典潮流(PF)方程扩展为严格的、非近似的三相公式,称为Dist3Flow。所提出的支路潮流模型(BFM)利用节点电压的实部和虚部以及有功和无功功率流作为状态变量。线路通过非线性前向和后向方程建模,而负载和DER分别通过ZIP模型和P-Q控制表示。通过在终端节点引入特定边界条件,该公式将PF分析推广到辐射状和闭环拓扑。通过使用前后向回扫(BFS)算法获得解。该方法在OpenDSS上针对各种配置进行了验证,考虑了开环和闭环拓扑,以及有无DER的情况。

英文摘要

Classical DistFlow equations for steady-state distribution network analysis fail to capture the inherent imbalances of three-phase systems arising from asymmetrical lines, loads, and distributed energy resources (DERs). This paper extends the classical power flow (PF) equations into a rigorous, non-approximated three-phase formulation, termed Dist3Flow. The proposed branch flow model (BFM) utilizes the real and imaginary components of nodal voltages and the active and reactive power flows as state variables. Lines are modelled by nonlinear forward and backward equations, while loads and DERs are represented via ZIP models and P-Q control, respectively. By incorporating specific boundary conditions at the terminal nodes, the formulation generalizes PF analysis to both radial and closed-ring topologies. The solution is obtained using a backward/forward sweep (BFS) algorithm. The approach is validated against OpenDSS across various configurations, considering open-ring and closed-ring topologies with and without DERs.
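For readers unfamiliar with the backward/forward sweep named in this abstract, the following is a minimal single-phase sketch on a chain-shaped radial feeder. It is not the Dist3Flow formulation: it assumes one phase, constant-power loads only, per-unit quantities, and no DERs or closed rings. The function name `bfs_radial` and all numeric values are hypothetical.

```python
import numpy as np

def bfs_radial(z_line, s_load, v_source=1.0 + 0j, tol=1e-10, max_iter=100):
    """Single-phase backward/forward sweep on a chain feeder (per unit).

    z_line[k] is the impedance of the line from node k to node k+1
    (node 0 is the source); s_load[k] is the constant-power load at node k+1.
    """
    z_line = np.asarray(z_line, dtype=complex)
    s_load = np.asarray(s_load, dtype=complex)
    n = len(z_line)
    v = np.full(n + 1, v_source, dtype=complex)  # flat start
    for _ in range(max_iter):
        # Backward sweep: load currents, summed from the far end to the source.
        i_load = np.conj(s_load / v[1:])
        i_branch = np.cumsum(i_load[::-1])[::-1]
        # Forward sweep: apply voltage drops from the source outward.
        v_new = v.copy()
        for k in range(n):
            v_new[k + 1] = v_new[k] - z_line[k] * i_branch[k]
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    raise RuntimeError("sweep did not converge")

# Hypothetical three-line feeder with identical lines and loads.
voltages = bfs_radial([0.01 + 0.02j] * 3, [0.1 + 0.05j] * 3)
print(np.abs(voltages))
```

The two sweeps alternate until the node voltages stop changing; a three-phase version would replace the scalar impedances and currents with per-phase vectors and matrices.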

2606.17913 2026-06-17 eess.SY cs.SY 新提交

Reducing Building Heat Demand Through Intelligent Control: A Comparative Simulation Study

通过智能控制减少建筑供暖需求:一项比较仿真研究

Ueli Schilt, Curtis Meister, Philipp Schuetz

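AI summary Compares two model predictive control strategies with different optimisation objectives and finds that the thermal-comfort-oriented controller consumes less total heat than the controller that minimises heating power, while maintaining a high comfort level.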

Comments 9 pages, 5 figures, 1 table. REHABEND 2026, 11th Euro-American Congress

详情
AI中文摘要

空间供暖仍然是建筑中的主要能源消耗者。虽然结构改造可以大幅减少需求,但通常成本高昂且耗时。作为替代方案,本研究探讨了智能供暖控制策略以较低投资和更快实施减少热量消耗的潜力。先前的研究表明,用模型预测控制器(MPC)替换传统的基于供暖曲线的控制器可以减少供暖能源需求。尽管大多数研究将MPC与传统控制进行比较,但本工作评估了两种具有不同控制目标的MPC策略,并量化了它们对室内温度跟踪和供暖需求的影响。基于ISO 52016-1在Python中开发了一个虚拟住宅建筑模型,以生成合成测量数据。使用该数据集对简化的电阻-电容(RC)模型进行参数化,并将其用作在MATLAB中实现的两种MPC策略的内部模型。这些策略仅在优化目标上有所不同:一种最小化二次供暖功率,而另一种优先考虑室内温度跟踪以实现热舒适。为期六天的模拟表明,两种策略都满足舒适性和系统约束,但在能源使用和温度变化方面存在差异。以舒适为导向的控制器实现了比最小化供暖功率的控制器更低的总热量消耗,这归因于二次目标函数中对高供暖速率的惩罚。结果证明了MPC设计中目标函数公式的重要性,并表明在不进行建筑围护结构改造的情况下,可以在实现较低供暖需求的同时保持高舒适度水平。

英文摘要

Space heating remains the dominant energy consumer in buildings. While structural retrofitting can substantially reduce demand, it is often costly and time-intensive. As an alternative, this study investigates the potential of intelligent heating control strategies to reduce heat consumption with lower investment and faster implementation. Previous studies have shown that replacing conventional heating-curve-based controllers with model predictive controllers (MPCs) can reduce heating energy demand. Whereas most studies compare MPC to conventional control, this work evaluates two MPC strategies with different control objectives and quantifies their impact on indoor temperature tracking and heating demand. A virtual residential building model was developed in Python based on ISO 52016-1 to generate synthetic measurement data. A simplified resistance-capacitance (RC) model was parametrised using this dataset and used as the internal model for two MPC strategies implemented in MATLAB. The strategies differ only in their optimisation objective: one minimises quadratic heating power, while the other prioritises indoor temperature tracking for thermal comfort. Simulations over six days show that both strategies satisfy comfort and system constraints, but differ in energy use and temperature variation. The comfort-oriented controller achieves lower total heat consumption than the controller minimising heating power, which is attributed to the penalisation of high heating rates in the quadratic objective function. The results demonstrate the importance of objective function formulation in MPC design and show that high comfort levels can be maintained while achieving lower heating demand without structural modifications to the building envelope.

2606.17873 2026-06-17 eess.SY cs.SY 新提交

Model-Free Control for Multi-Time Scale Dynamics of Grid-Connected Power Converters

并网功率变换器多时间尺度动态的无模型控制

Dewan Mahnaaz Mahmud, Vinu Thomas, Bogdan Marinescu

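AI summary Targets the multi-time-scale dynamics of grid-connected power converters with a model-free control method based on an intelligent proportional-integral (iPI) controller, validates it on a 16 kW experimental test bench, and shows its benefits for secondary voltage control.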

详情
AI中文摘要

基于电力电子系统的控制器合成主要依赖于系统的数学模型,当实际系统复杂且数学模型无法捕捉其所有动态时,这便成为了一种限制。无模型控制通过使用一种特设的简单模型来弥补这一限制,该模型通过基于导数的高速率动态评估进行补偿。然而,将无模型控制策略应用于基于电力电子的多时间尺度动态系统是具有挑战性的,因为实现这种控制需要导数作用。并网功率变换器是这类系统的例子,但文献中尚未充分解决实验验证问题。本文介绍了包括硬件实现层面在内的此类控制的验证。合成了一种智能比例积分(iPI)控制器,并在16 kW实验测试台上进行了验证。这证明了该方法在并网功率变换器控制中的优势,其中包括它们在二次电压控制中的参与。

英文摘要

Controller synthesis in power electronics-based systems depends predominantly on the mathematical model of the system, which is a limitation when the actual system is complex and the mathematical model cannot capture all its dynamics. Model-free control addresses this limitation by using an ad-hoc simple model which is compensated by high-rate evaluation of dynamics in terms of their derivatives. However, application of the model-free control strategy to power electronics-based multi-time scale dynamical systems is challenging because of the derivative action needed to implement such control. Grid-connected power converters are examples of such systems, yet experimental validation has not been adequately addressed in the literature. This letter presents the validation of such control including the hardware implementation level. An intelligent proportional-integral (iPI) controller is synthesized and validated on a 16 kW experimental test bench. This demonstrates the benefits of the approach for the control of grid-connected power converters, including their participation in secondary voltage control.
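To make the idea of an "ad-hoc simple model compensated by derivative evaluation" concrete, here is a minimal discrete-time sketch of an iPI loop built on a first-order ultra-local model dy/dt = F + alpha*u. Everything in it is an assumption for illustration: the scalar test plant, the gains, the finite-difference derivative estimate, and the function name `simulate_ipi`. It does not represent the converter model or the estimator used in the letter.

```python
def simulate_ipi(alpha=2.5, kp=10.0, ki=25.0, dt=1e-3, t_end=5.0, y_ref=1.0):
    """Track a constant reference with an iPI law on an 'unknown' plant.

    The controller only assumes dy/dt = F + alpha * u and re-estimates the
    lumped term F at every step from the measured output.
    """
    y = y_prev = u_prev = integral = 0.0
    for k in range(int(round(t_end / dt))):
        # Estimate F from a finite-difference derivative and the last input.
        y_dot_est = (y - y_prev) / dt
        f_hat = y_dot_est - alpha * u_prev
        # iPI law; the reference is constant, so its derivative is zero.
        e = y - y_ref
        integral += e * dt
        u = (-f_hat - kp * e - ki * integral) / alpha
        # Hypothetical plant, hidden from the controller: the true input gain
        # is 3 (not alpha) and a step disturbance appears at t = 2.5 s.
        d = 1.0 if k * dt >= 2.5 else 0.0
        y_prev, u_prev = y, u
        y = y + dt * (-2.0 * y + 3.0 * u + d)
    return y

print(simulate_ipi())
```

The raw finite difference is used here only for brevity. On noisy measurements a filtered derivative estimate would normally be needed, which is part of what makes the hardware implementation discussed in the abstract non-trivial.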

2606.17860 2026-06-17 cs.DC cs.LO 新提交

An Epistemic Analysis of Random Coordinated Attack

随机协调攻击的认知分析

Sophia Knight, David Lehnherr, Sergio Rajsbaum

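AI summary Proposes a probabilistic epistemic logic framework for the randomized coordinated attack problem, analyzes the Varghese-Lynch algorithm, proves that its lower bound is tight, and relates information levels to epistemic formulas.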

详情
AI中文摘要

协调攻击问题建模了通过不可靠链路在有限时间内协调联合行动的挑战。它是第一个被证明不可解的分布式计算问题。其分析也揭示了共同知识(认知逻辑中的核心概念)的重要性。然而,据我们所知,可解的随机化版本的协调攻击尚未通过概率认知逻辑的视角进行研究,其中进程通过抛硬币产生随机性。我们提出了一个认知逻辑框架,用于研究执行有限轮次的随机化算法。该框架适用于协调攻击、近似一致和共识问题,并支持动态图模型:同步系统中可靠进程执行有限轮次,同时对手决定哪些消息丢失。我们的方法结合了动态网络的逻辑刻画和任务可解性技术,以及概率动态认知逻辑的思想。它受到Varghese和Lynch关于随机协调攻击的操作模型的启发。更广泛地说,由此产生的概率认知任务可解性概念为随机化分布式计算的认知研究提供了基础。利用该框架,我们从知识理论的角度分析了Varghese-Lynch算法,提供了对该算法及其下界的正式处理。作为副产品,我们加强了下界并证明其紧致性。证明依赖于不可区分性论证,表明在概率设置中关于知识的推理仍然至关重要。我们还形式化了Varghese和Lynch引入的信息水平概念,表明它对应于一个特定的认知公式。

英文摘要

The coordinated attack problem models the challenge of coordinating a joint action within a bounded time by communicating over unreliable links. It was the first distributed computing problem proven unsolvable. Its analysis also revealed the importance of common knowledge, a central concept in epistemic logic. However, the randomized version of coordinated attack, which is solvable, has not, to the best of our knowledge, been studied through the lens of probabilistic epistemic logic, where processes generate randomness by flipping coins. We present an epistemic logic framework for studying randomized algorithms that execute for a bounded number of rounds. The framework applies to coordinated attack, approximate agreement, and consensus, and supports dynamic graph models: synchronous systems in which reliable processes execute a bounded number of rounds while an adversary determines which messages are lost. Our approach combines techniques from the logical characterization of dynamic networks and task solvability with ideas from probabilistic dynamic epistemic logic. It is inspired by the operational model of Varghese and Lynch for randomized coordinated attack. More broadly, the resulting notion of probabilistic epistemic task solvability provides a foundation for the epistemic study of randomized distributed computation. Using this framework, we analyze the Varghese-Lynch algorithm from a knowledge-theoretic perspective, providing a formal treatment of the algorithm and its lower bound. As a byproduct, we strengthen the lower bound and show it is tight. The proof relies on indistinguishability arguments, demonstrating that reasoning about knowledge remains essential in the probabilistic setting. We also formalize the notion of information level introduced by Varghese and Lynch, showing that it corresponds to a specific epistemic formula.

2606.17853 2026-06-17 cs.NE 新提交

An Optimization Framework for Automated Assessment of Biological Plausibility of Spiking Neurons

脉冲神经元生物学合理性的自动化评估优化框架

Sven Nitzsche, Alexandru Ionita, Andreas Faust, Bogdan Ionescu, Juergen Becker

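AI summary Proposes an open-source framework that automates the assessment of the biological plausibility of spiking neuron models by optimizing model parameters to reproduce canonical biological firing patterns, and validates it on several models.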

Comments Reviewed version published at the ECML-PKDD 2025 joint post-workshop proceeding in Springer Communications in Computer and Information Science

详情
AI中文摘要

生物学合理性是神经形态计算和脉冲神经网络中的一个关键概念,但其定义不一致且难以量化。在这项工作中,我们提出了一个用于脉冲神经元模型生物学合理性自动化评估的开源框架。我们的方法基于评估模型复现生物系统中观察到的典型神经元放电模式的能力,遵循Izhikevich提出的分类。通过将这些模式编码为目标函数并相应优化模型参数,我们的框架无需先验分析建模即可实现经验性评估。将神经元模型视为黑箱,该框架提供了一种实用且灵活的方法来表征其动态能力。我们在几个已建立的模型和一个先前未探索的自定义模型上展示了该框架的有效性。该框架使用Python实现,兼容PyTorch和Norse库,专为机器学习场景设计。它旨在作为系统研究生物学合理性与网络级性能指标(如准确性、能效、鲁棒性和适应性)之间关系的起点。

英文摘要

Biological plausibility is a key concept in neuromorphic computing and spiking neural networks, yet it remains inconsistently defined and difficult to quantify. In this work, we present an open-source framework for the automated assessment of biological plausibility in spiking neuron models. Our method builds on the idea of evaluating a model's ability to replicate canonical neuronal firing patterns observed in biological systems, following the classification proposed by Izhikevich. By encoding these patterns into objective functions and optimizing model parameters accordingly, our framework enables empirical assessment without requiring prior analytical modeling. Treating neuron models as black boxes, it provides a practical and flexible means of characterizing their dynamic capabilities. We demonstrate the effectiveness of the framework on several established models and a previously unexplored custom model. Implemented in Python and compatible with PyTorch and the Norse library, the framework is tailored for machine learning contexts. It is intended as a starting point for systematic research into the relationship between biological plausibility and network-level performance metrics such as accuracy, energy efficiency, robustness, and adaptability.

2606.17850 2026-06-17 cs.AR 新提交

CUTh-Solver: GPU-Accelerated Sparse Matrix Solver for High-Resolution Thermal Simulation of 3D ICs

CUTh-Solver:用于3D IC高分辨率热仿真的GPU加速稀疏矩阵求解器

Chenghan Wang, Zhen Zhuang, Shui Jiang, Siyuan Liang, Xiaoman Yang, Kai Zhu, Darong Huang, Luis Costero, Rongmei Chen, Tsung-Wei Huang, David Atienza, Tsung-Yi Ho

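AI summary Addresses the sparse-matrix solve bottleneck in high-resolution thermal simulation of 3D ICs with CUTh-Solver, which combines a condensed DIA storage format, diagonal-wise SpMV, high-parallelism preconditioning, and an adaptive mixed-precision strategy to achieve up to 25.8x speedup on GPUs.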

详情
AI中文摘要

粗粒度热仿真往往低估局部热问题,可能遗漏关键热点。因此,准确分析需要细粒度信息,这极大地增加了网格分辨率,从而增加了计算工作量。幸运的是,系数矩阵通常是稀疏的且具有规则稀疏模式,提供了优化机会。然而,GPU上现有的通用矩阵求解器很少利用这些领域特定属性,因此在数据存储、内存访问、并行性、计算效率和硬件利用率方面遇到瓶颈。因此,我们提出了CUTh-Solver,一个协同设计的GPU加速基于预条件共轭梯度(PCG)的稀疏求解器框架,用于高分辨率稳态和瞬态3D IC热仿真中出现的对称正定(SPD)系统。在数据存储方面,CUTh-Solver压缩对角(DIA)存储格式以消除冗余。为了优化内存访问,CUTh-Solver采用对角SpMV实现合并内存访问。我们进一步观察到并行性与预条件质量之间的关键冲突,因此采用高并行预条件策略。为了提高计算效率和硬件利用率,我们采用自适应细粒度混合精度策略,利用不同的浮点单元避免资源争用,在保证数值稳定性的同时提高吞吐量。实验结果表明,CUTh-Solver相比GPU加速的COMSOL Multiphysics 6.4实现了高达25.8倍加速,相比NVIDIA的原生通用库(AmgX、cuSPARSE、cuDSS)实现了超过3倍加速。消融研究验证了每种优化的单独贡献。代码可在以下网址获取:https://github.com/Chenghan-Wang/CUTh-Solver

英文摘要

Coarse-grained thermal simulation tends to underestimate localized thermal issues, potentially missing critical hotspots. Accurate analysis, therefore, demands fine-grained information, which dramatically increases grid resolution and thus computational workload. Fortunately, the coefficient matrices are often sparse with regular sparsity patterns, offering optimization opportunities. However, existing general-purpose matrix solvers on GPUs rarely exploit these domain-specific properties, thereby encountering bottlenecks in data storage, memory access, parallelism, computational efficiency, and hardware utilization. Therefore, we propose CUTh-Solver, a co-designed GPU-accelerated Preconditioned Conjugate Gradient (PCG)-based sparse solver framework for Symmetric Positive Definite (SPD) systems arising from high-resolution steady-state and transient 3D IC thermal simulation. For data storage, CUTh-Solver condenses the Diagonal (DIA) storage format to remove redundancy. To optimize the memory access, CUTh-Solver employs diagonal-wise SpMV to achieve coalesced memory access. We further observe a critical conflict between parallelism and preconditioning quality and thus adopt a high-parallelism preconditioning strategy. To improve computational efficiency and hardware utilization, we employ an adaptive fine-grained mixed-precision strategy that leverages diverse floating-point units to avoid resource contention, enhancing throughput without compromising numerical stability. Experimental results show that CUTh-Solver achieves up to 25.8x speedup over GPU-accelerated COMSOL Multiphysics 6.4 and over 3x speedup over NVIDIA's native general-purpose libraries (AmgX, cuSPARSE, cuDSS). Ablation studies validate the individual contribution of each optimization. The code is available at: https://github.com/Chenghan-Wang/CUTh-Solver
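The combination of diagonal storage, diagonal-wise SpMV, and preconditioned conjugate gradient described in this abstract can be sketched on the CPU with NumPy. This is only an illustration under stated assumptions: it uses a plain row-indexed DIA layout rather than the paper's condensed format, a Jacobi (diagonal) preconditioner chosen here as one simple, highly parallel option, and double precision throughout. The names `dia_spmv` and `pcg_jacobi` are hypothetical and do not come from the CUTh-Solver code.

```python
import numpy as np

def dia_spmv(offsets, data, x):
    """y = A @ x, where data[j][i] stores A[i, i + offsets[j]] (row-indexed)."""
    n = x.shape[0]
    y = np.zeros(n)
    for off, diag in zip(offsets, data):
        lo, hi = max(0, -off), min(n, n - off)  # rows whose column i+off exists
        # Each diagonal is a contiguous, stride-1 multiply-add.
        y[lo:hi] += diag[lo:hi] * x[lo + off:hi + off]
    return y

def pcg_jacobi(offsets, data, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient for an SPD matrix in DIA form, Jacobi-preconditioned."""
    main = data[offsets.index(0)]           # main diagonal = preconditioner
    x = np.zeros_like(b)
    r = b - dia_spmv(offsets, data, x)
    z = r / main
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = dia_spmv(offsets, data, p)
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = r / main
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Hypothetical 1D conduction-like system: tridiagonal (-1, 2, -1).
n = 50
offsets = [-1, 0, 1]
data = [-np.ones(n), 2.0 * np.ones(n), -np.ones(n)]
temperature = pcg_jacobi(offsets, data, np.ones(n))
```

Because every diagonal is processed as one contiguous slice, neighbouring rows read neighbouring memory locations, which is the property that maps to coalesced access on a GPU.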

2606.17845 2026-06-17 cs.NI 新提交

UAV-CAS: A Calibrated Digital-Twin Dataset for Intrusion Detection in UAV Swarm Networks

UAV-CAS:用于无人机群网络入侵检测的校准数字孪生数据集

Sripath Mishra, Bharat Bhargava, Zizheng Liu, Shafkat Islam

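AI summary Intrusion detection systems trained on wired-network datasets degrade sharply in real UAV swarms. UAV-CAS addresses this with a large-scale labeled flow dataset generated through a four-layer calibration pipeline, covering five attack families and nine collaborative attack compositions; the evaluation shows that the dataset is learnable while attack classification remains challenging.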

Comments Repository URL: https://github.com/Sripathm2/Collaborative-UAV-Dataset, Dataset Link: https://dx.doi.org/10.21227/zgrg-z865

详情
AI中文摘要

基于有线网络基准训练的入侵检测系统(IDS)在真实无人机(UAV)群中性能急剧下降,因为移动性、波动的链路质量和去中心化路由重塑了流量分布。现有的无人机特定数据集也没有系统地改变这些条件,无法针对导致IDS失效的分布偏移进行训练或测试。我们提出了UAV-CAS,一个用于无人机网络入侵检测的大规模标记流数据集,由Containernet数字孪生生成,并针对AERPAW测试平台测量进行了系统校准。我们有一个四层校准管道,涵盖高度相关的路径损耗、任务特定的移动性、链路级性能链和端到端轨迹保真度。UAV-CAS包含来自1024种配置的99,492条流,涵盖五种攻击族(DoS、DDoS、黑洞、虫洞、重放)和九种协作攻击组合。多样性分析表明,高速率攻击与良性流量的分离程度比任何先前基准高出一个数量级,而隐蔽攻击则故意与良性流量混合。在十个基线IDS上,二分类攻击检测饱和于0.98以上,确认数据集是可学习的,而完整的攻击类别识别仍然困难——每类F1分数从接近零到0.82不等,对于隐蔽攻击则降至个位数。我们发布数据集、模拟器和校准数据,以支持可重复的无人机入侵检测研究。

英文摘要

Intrusion detection systems (IDS) trained on wired-network benchmarks degrade sharply in real-world unmanned aerial vehicle (UAV) swarms, where mobility, fluctuating link quality, and decentralized routing reshape traffic distributions. Existing UAV-specific datasets also do not systematically vary these conditions, leaving no way to train or test an IDS against the very shift that defeats it. We present UAV-CAS, a large-scale labeled flow dataset for UAV-network intrusion detection, generated by a Containernet digital twin that is systematically calibrated against AERPAW testbed measurements. The calibration pipeline has four layers, spanning altitude-dependent path loss, mission-specific mobility, the link-level performance chain, and end-to-end trace fidelity. UAV-CAS comprises 99,492 flows drawn from 1,024 configurations that span five attack families (DoS, DDoS, blackhole, wormhole, replay) and nine collaborative attack compositions. A diversity analysis shows that high-rate attacks separate from benign traffic up to an order of magnitude more strongly than in any prior benchmark, while stealth attacks deliberately blend with benign traffic. Across ten baseline IDS, binary attack detection saturates above 0.98, confirming the dataset is learnable, whereas full attack-class identification remains hard: per-class F1 ranges from near zero to 0.82 and falls into the single digits for stealth attacks. We release the dataset, simulator, and calibration data to support reproducible UAV intrusion-detection research.

2606.17811 2026-06-17 cs.LO 新提交

UMB: A Unified Markov Binary Format for Probabilistic Model Checking (extended version)

UMB:一种用于概率模型检验的统一马尔可夫二进制格式(扩展版)

Roman Andriushchenko, Arnd Hartmanns, Joshua Jeppson, Sebastian Junges, Tobias Meggendorfer, David Parker, Tim Quatmann, Maximilian Weininger

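AI summary Proposes UMB, an efficient and extensible explicit-state file format for representing a wide range of probabilistic systems. It addresses the exchange of low-level models, has been adopted by prominent tools, and comes with a Python library.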

详情
AI中文摘要

本文提出了统一马尔可夫二进制(UMB)格式,一种高效、可扩展且支持良好的显式状态文件格式,用于表示广泛的概率系统。UMB解决了以下问题:虽然概率模型检验工具通常支持常见的高级建模语言,但缺乏交换低层模型表示的有效机制。实践中,使用文本的、特定于工具的格式,阻碍了互操作性,并导致读写模型文件的开销很大。UMB基于通用的底层数学模型,并使用一小组位级原始数据结构进行编码,提供了一种简洁、统一且高效的解决方案。该格式已被主流工具采用,并附带一个方便的Python库,用于读取、操作、创建和验证模型,以及跨工具安装和持续验证的基础设施。我们报告了文件格式的效率以及它促成的新的实际用例。

英文摘要

This paper presents the unified Markov binary (UMB) format, an efficient, extensible, and well-supported explicit-state file format for representing a wide range of probabilistic systems. UMB addresses the problem that, while probabilistic model checking tools often support common high-level modelling languages, there is no effective mechanism for exchanging low-level model representations. In practice, textual, tool-specific formats are used, hampering interoperability and resulting in large overheads in writing and reading model files. UMB provides a clean, unified, and efficient solution, based on a general underlying mathematical model, and encoded using a small set of bit-level primitive data structures. The format has already been adopted by prominent tools and comes with a convenient Python library for reading, manipulating, creating, and validating models, plus infrastructure for cross-tool installation and continuous validation. We report on both the efficiency of the file format and the new practical use cases that it facilitates.

2606.17793 2026-06-17 cs.HC cs.DB 新提交

ARES: A Platform for Adaptive Role-Based Evaluation of Social Engineering Risks in Human-AI Games

ARES: 一种用于人类-人工智能游戏中基于角色的社会工程风险自适应评估平台

Roberto Daza, Javier Irigoyen, Ivan Lopez, Raquel Rodriguez-Carvajal, Laura Gomez, Julian Fierrez, Ruben Tolosana, Aythami Morales

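AI summary Proposes ARES, a platform that uses controlled social games to audit adaptive social engineering risks in LLM-mediated social decision-making. It supports human-human, human-AI, and AI-AI settings and collects a multimodal dataset for risk evaluation.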

Comments 6 pages, 2 figures. Accepted at the International Carnahan Conference on Security Technology (ICCST 2026)

详情
AI中文摘要

本文介绍了ARES,一个平台和开放试点数据集,用于通过受控社交游戏审计LLM中介的社会决策中的自适应社会工程风险。ARES支持人-人、人-AI和AI-AI设置,结合了可配置的游戏模板、角色条件的LLM代理、心理学知情的参与者画像、结构化交互树以及同步的行为和生物特征采集、过滤和基于深度学习的特征提取。试点数据集来自15名参与者与角色条件的GPT-5.4代理在两个串联游戏(改编的囚徒困境和最后通牒游戏)中的互动。它包含340 GB的原始和处理过的多模态数据,涵盖六个流:交互日志、视频、屏幕录制、注视日志、智能手表信号以及游戏/问卷元数据。这些数据包括交互路径、书面理由、心理画像、主观反馈、感知对手身份、游戏结果以及衍生的行为、面部和注视特征。除了数据集,我们还提供了描述性分析来表征试点发布。严格的风险评估对于部署安全的AI系统至关重要,因为它能够识别和缓解漏洞,确保敏感数据的保护,并支持遵守社会不断发展的监管和伦理标准。

英文摘要

This work introduces ARES, a platform and open pilot dataset for auditing adaptive social engineering risks in LLM-mediated social decision-making through controlled social games. ARES supports human-human, human-AI, and AI-AI settings, combining configurable game templates, role-conditioned LLM agents, psychology-informed participant profiling, structured interaction trees, and synchronised behavioural and biometric acquisition, filtering, and deep-learning-based feature extraction. The pilot dataset was collected from 15 participants interacting with a role-conditioned GPT-5.4 agent in two concatenated games: an adapted Prisoner's Dilemma and an Ultimatum Game. It comprises 340 GB of raw and processed multimodal data across six streams: interaction logs, video, screen recordings, gaze logs, smartwatch signals, and game/questionnaire metadata. These data include interaction paths, written justifications, psychological profiles, subjective feedback, perceived counterpart identity, game outcomes, and derived behavioural, facial, and gaze features. Alongside the dataset, we provide descriptive analyses to characterise the pilot release. Rigorous risk evaluation is essential for the deployment of secure AI systems, as it enables the identification and mitigation of vulnerabilities, ensures the protection of sensitive data, and supports compliance with evolving regulatory and ethical standards in society.

2606.17789 2026-06-17 cs.HC 新提交

Mind Companion: An Embodied Conversational Agent for Process-Based Psychotherapy

Mind Companion: 一种用于基于过程心理治疗的具身对话代理

Sofie Kamber, Lukas Diebold, Pascal Riachi, Stella Brogna, Andrew Gloster, Rafael Wampfler

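AI summary Proposes Mind Companion, an LLM-based embodied conversational agent that combines multi-layered psychological analysis with process-based therapy principles to analyze client statements and generate responses in real time. In the evaluation, GPT-5.2 was rated higher than human therapists on several dimensions.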

详情
Journal ref
2026 IEEE 14th International Conference on Healthcare Informatics (ICHI), Minneapolis, MN, June 1-3, 2026, pp. 980-989
AI中文摘要

全球范围内获得循证心理治疗的机会仍然有限,即使在高收入地区也存在漫长的等待名单。最近大语言模型(LLM)的进展,在临床监督和安全机制设计下,为可扩展的心理健康支持提供了潜力。我们提出了Mind Companion,一种基于LLM的具身对话代理,将多层级心理分析与基于过程的治疗原则相结合。该系统对客户陈述进行实时分析,涵盖事实提取、心理灵活性过程检测、情绪识别和安全监控。分析结果存储供监督临床医生用于治疗规划。回应生成结合了来自循证治疗文献的检索增强生成和上下文感知提示。回应通过具身化虚拟角色以同步语音合成和动画传递。我们评估了三种LLM配置(GPT-4.1-mini、GPT-5.2、Claude Sonnet 4.5),与来自真实治疗会话的治疗师回应进行对比,使用自动LLM裁判评估和11位专业心理治疗师的专家评估。GPT-5.2在理解力、人际效能、协作和治疗一致性方面均获得高于人类治疗师回应的评分,证明了基于LLM的对话代理作为临床护理补充工具的可行性。

英文摘要

Access to evidence-based psychotherapy remains limited worldwide, with long waitlists even in high-income regions. Recent advances in large language models (LLMs) offer potential for scalable mental health support when designed with clinical oversight and safety mechanisms. We present Mind Companion, an LLM-based embodied conversational agent integrating multi-layered psychological analysis with process-based therapy principles. The system performs real-time analysis of client statements across fact extraction, psychological flexibility process detection, emotion recognition, and safety monitoring. Analysis results are stored for supervising clinicians to inform therapeutic planning. Response generation incorporates retrieval-augmented generation from evidence-based therapeutic literature and context-aware prompting. Responses are delivered through an embodied avatar with synchronized speech synthesis and animation. We evaluated three LLM configurations (GPT-4.1-mini, GPT-5.2, Claude Sonnet 4.5) against therapist responses from real therapy sessions using automated LLM-judge assessment and expert evaluation with 11 professional psychotherapists. GPT-5.2 achieved higher ratings than human therapist responses across understanding, interpersonal effectiveness, collaboration, and therapeutic alignment in both evaluations, demonstrating the feasibility of LLM-based conversational agents as tools to complement clinical care.

2606.17787 2026-06-17 cs.DC 新提交

LUMEN: Coordinated Failure Recovery for Distributed LLM Serving

LUMEN:分布式LLM服务的协调故障恢复

Zhang Cao, Shujie Han, Juncheng Zhang, Yuanming Ren, Yongkun Li, Patrick P. C. Lee

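AI summary Worker failures in distributed LLM serving cause KV cache loss and request recomputation. LUMEN addresses this with a load-aware coordinated recovery strategy (checkpoint placement, interrupted-request distribution, capacity restoration) that significantly improves serving and recovery times.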

详情
AI中文摘要

现代大语言模型(LLM)服务集群将推理请求分布到不同GPU上的多个工作进程中,但大规模下故障普遍存在。当工作进程故障时,集群同时丢失故障工作进程的GPU驻留键值(KV)缓存和服务容量,导致幸存工作进程在吸收重定向流量的同时从头重新运行中断的请求。现有容错系统要么从头重启中断请求,要么从固定邻居工作进程上的检查点恢复KV缓存,但这两种方法在未考虑当前集群负载的情况下路由恢复工作,并在模型重载期间使恢复工作进程空闲。我们提出LUMEN,一种容错LLM服务系统,将恢复视为跨三个决策点的负载感知协调问题:故障前的检查点放置、故障时的中断请求分配以及模型重载期间的服务容量恢复。我们通过原型实验和大规模模拟评估LUMEN,并展示了在服务时间和恢复时间上的显著改进。

英文摘要

Modern large language model (LLM) serving clusters distribute inference requests across multiple worker processes on different GPUs, but failures are prevalent at scale. When a worker fails, the cluster simultaneously loses the failed worker's GPU-resident key-value (KV) caches and serving capacity, leaving surviving workers to absorb the redirected traffic while re-running interrupted requests from scratch. Existing fault-tolerant systems either restart interrupted requests from scratch or restore KV caches from checkpoints stored on a fixed neighboring worker, but both approaches route recovery work without considering current cluster load and leave the recovering worker idle during model reload. We present LUMEN, a fault-tolerant LLM serving system that treats recovery as a load-aware coordination problem across three decision points: checkpoint placement before failures, interrupted-request distribution at failure time, and serving capacity restoration during model reload. We evaluate LUMEN using both prototype experiments and large-scale simulations and demonstrate significant improvements in serving and recovery times.

2606.17783 2026-06-17 cs.HC 新提交

Is It Real? Exploiting Virtual-Physical Discrimination Vulnerability in Mixed Reality

这是真的吗?利用混合现实中的虚实辨别漏洞

Xueyang Wang, Xihuan Yao, Yanming Xiu, Xin Yi, Maria Gorlatova, Hewu Li

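AI summary Studies a vulnerability in mixed reality headsets whereby users cannot distinguish virtual objects from real ones. Through expert workshops and four proof-of-concept attacks (85% to 100% success rates), it shows how the attacks change user behavior, and proposes platform-level provenance, interaction gating, and user education as defenses.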

Comments Accepted at the 2026 USENIX Symposium on Usable Privacy and Security (SOUPS 2026)

详情
AI中文摘要

消费级混合现实(MR)头显将虚拟内容以足够保真度无缝融合到物理环境中,用户可能无法区分虚拟物体和物理物体。我们将这种虚实辨别漏洞识别为一种可利用的安全原语。通过与12位来自网络安全和MR/HCI领域的专家进行推测性设计研讨会,我们开发了虚实混淆攻击的分类法,并在Apple Vision Pro上实现了四项概念验证攻击,在26名参与者参与的现实MR任务中进行了评估。所有四项攻击都改变了用户行为,成功率从85%到100%不等,产生了误导性交互、误判物体身份、有偏见的购买决策和改变的导航路径。值得注意的是,最成功的攻击也是最难被参与者主观评分检测到的。即使识别出虚拟内容的参与者在行为上仍然顺从,并且没有参与者将异常事件归因于对抗性原因。我们提出平台级溯源、交互门控和用户教育作为对策。

英文摘要

Consumer mixed reality (MR) headsets seamlessly blend virtual content into physical environments with sufficient fidelity that users may be unable to distinguish virtual objects from physical ones. We identify this virtual-physical discrimination vulnerability as an exploitable security primitive. Through speculative design workshops with 12 experts from cybersecurity and MR/HCI, we develop a taxonomy of virtual-physical confusion attacks and implement four proof-of-concept attacks on Apple Vision Pro, evaluating them with 26 participants in realistic MR tasks. All four attacks altered user behavior, with success rates ranging from 85% to 100%, producing misdirected interactions, misjudged object identities, biased purchasing decisions, and altered navigation paths. Notably, the most successful attacks were also the hardest to detect according to participants' subjective ratings. Even participants who recognized virtual content still complied behaviorally, and no participant attributed anomalous events to adversarial causes. We propose platform-level provenance, interaction gating, and user education as countermeasures.

2606.17746 2026-06-17 cs.NI 新提交

FlowCLIP: Contrastive Pretraining Using Domain Names for Encrypted Traffic Classification

FlowCLIP: 使用域名的对比预训练进行加密流量分类

Eun Hun Choi

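AI summary Proposes FlowCLIP, a framework that uses packet side-channel features (inter-arrival times, sizes, directions) and a CLIP-style contrastive objective to align traffic representations with domain name representations. Evaluated across weeks on a QUIC traffic dataset, it outperforms baseline methods.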

详情
AI中文摘要

网络流量分类支持网站指纹识别、入侵检测和服务质量管理。然而,在现实部署条件下开发能够捕获稳定且可泛化的流量模式的方法仍然具有挑战性。我们引入了FlowCLIP,一个对比预训练框架,仅使用侧信道特征(数据包到达间隔时间、数据包大小和数据包方向)从加密流量中进行域名预测。FlowCLIP通过CLIP风格的对比目标将流量流表示与域名表示对齐,从而将原始域名作为文本监督。预训练的流量编码器随后被冻结,并通过线性探测在规范化的域名标签上进行评估。我们在一个基于时间协议的大规模QUIC流量数据集上评估FlowCLIP,其中模型在第1周的流量上训练,并在第2-4周的流量上评估。FlowCLIP在后续评估周中优于竞争性的机器学习基线,表明原始域名为学习可迁移的加密流量表示提供了文本监督信号。

英文摘要

Network traffic classification enables website fingerprinting, intrusion detection, and Quality of Service management. However, developing methods that capture stable and generalizable traffic patterns under realistic deployment conditions remains challenging. We introduce FlowCLIP, a contrastive pretraining framework for domain name prediction from encrypted traffic using only side-channel features: packet inter-arrival times, packet sizes, and packet directions. FlowCLIP uses raw domain names as textual supervision by aligning traffic flow representations with domain name representations through a CLIP-style contrastive objective. The pretrained traffic encoder is then frozen and evaluated through linear probing on canonicalized domain name labels. We evaluate FlowCLIP on a large-scale QUIC traffic dataset using a time-based protocol, where models are trained on Week 1 traffic and evaluated on traffic from Weeks 2-4. FlowCLIP outperforms competitive machine learning baselines across later evaluation weeks, suggesting that raw domain names provide a textual supervision signal for learning transferable encrypted traffic representations.
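The "CLIP-style contrastive objective" in this abstract refers to a symmetric cross-entropy over pairwise similarities between two sets of embeddings. The NumPy sketch below shows that loss in isolation. It assumes the flow encoder and the domain-name encoder have already produced one embedding per sample, with row i of each matrix forming a matched pair; the function name `clip_loss` and the temperature value are assumptions, not details taken from the paper.

```python
import numpy as np

def clip_loss(flow_emb, name_emb, temperature=0.07):
    """Symmetric contrastive loss for a batch of matched (flow, name) pairs."""
    # L2-normalise so the dot product is a cosine similarity.
    f = flow_emb / np.linalg.norm(flow_emb, axis=1, keepdims=True)
    t = name_emb / np.linalg.norm(name_emb, axis=1, keepdims=True)
    logits = f @ t.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax per row; the correct class is the diagonal.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the flow-to-name and name-to-flow directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))

# Hypothetical batch: 8 pairs of 16-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 16))
aligned = clip_loss(embeddings, embeddings)
mismatched = clip_loss(embeddings, np.roll(embeddings, 1, axis=0))
```

Minimising this loss pulls each flow embedding toward its own domain name and away from the other names in the batch, which is what allows the frozen encoder to be evaluated later with a linear probe.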

2606.17741 2026-06-17 eess.SY cs.HC cs.SY 新提交

A Wearable Multimodal Ultrasound+Inertial System for Real-Time Virtual Reality Interaction

用于实时虚拟现实交互的可穿戴多模态超声+惯性系统

Giusy Spacone, Sebastian Frey, Enzo Baraldi, Mattia Orlandi, Luca Benini, Andrea Cossettini

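AI summary Proposes a fully wearable multimodal interface based on ultrasound and inertial sensing from the forearm and upper arm. Built on the WULPUS platform with a Unity VR environment, it uses multimodal learning to estimate hand pose and forearm position and achieves online success rates of at least 88% on three tasks.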

Comments 8 pages, 8 figures, 3 tables

详情
AI中文摘要

A模式超声(US)是一种有前景的虚拟现实(VR)交互传感模态,因为它能够将肌肉活动映射为控制命令,同时保留可穿戴传感的优势。然而,现有方法在可穿戴性和交互复杂性方面仍面临限制,通常依赖外部硬件如摄像头。在这项工作中,我们提出了一种完全可穿戴的多模态接口,用于实时VR交互,基于来自前臂和上臂的并发US和惯性(加速度计)传感。该系统构建于WULPUS平台之上,并集成了一个端到端的软件框架,用于实时采集、可视化以及与基于Unity的VR环境通信。引入了一种多模态学习流水线,用于在二维空间中同时进行手部姿态和前臂位置估计。通过离线与在线实验对接口进行了评估,涉及五名受试者执行三项功能任务:圆柱体抓取(粗大运动)与搬运、弹珠捏取(精细运动)与搬运以及液体倾倒。对于离线实验,我们在多天内采集了5次采集会话,在手部姿态估计中实现了跨受试者的平均跨会话准确率80±6%,前臂位置估计为77±7%。在线验证仅需最少微调(5分钟),三项任务的成功率分别为92.0±16.0%、88.0±9.8%和96.0±8.0%。功耗仅为19.9 mW,我们的系统可在小型350 mAh锂聚合物电池上连续使用超过2.5天而无需充电,实现了真正可穿戴、多模态且功能有意义的VR交互。

英文摘要

A-mode ultrasound (US) is a promising sensing modality for Virtual Reality (VR) interaction, as it enables the mapping of muscular activity into control commands while retaining the benefits of wearable sensing. However, existing approaches still face limitations in terms of wearability and interaction complexity, often relying on external hardware such as cameras. In this work, we propose a fully wearable multimodal interface for real-time VR interaction, based on concurrent US and inertial (accelerometry) sensing from the forearm and upper arm. The system is built on the WULPUS platform and integrates an end-to-end software framework for real-time acquisition, visualization, and communication with a Unity-based VR environment. A multimodal learning pipeline is introduced for concurrent hand pose and forearm position estimation in 2D space. The interface is evaluated through offline and online experiments with five subjects, during the execution of three functional tasks: cylinder grasping (gross motor) and relocation, marble pinching (fine motor) and relocation, and liquid pouring. For offline experiments, we collect 5 acquisition sessions across multiple days, achieving an average inter-session accuracy across subjects of 80±6% for hand pose estimation and 77±7% for forearm position estimation. Online validation with minimal fine-tuning (5 min) demonstrates success rates of 92.0±16.0%, 88.0±9.8%, and 96.0±8.0% for the three tasks, respectively. With a power consumption of only 19.9 mW, our system enables more than 2.5 days of continuous use on a small 350 mAh LiPo battery without recharging, enabling truly wearable, multimodal, and functionally meaningful VR interaction.
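The battery-life figure in this abstract can be sanity-checked with simple arithmetic. The sketch below assumes a nominal LiPo cell voltage of 3.7 V and ignores regulator losses and self-discharge; neither assumption comes from the abstract, and the function name is hypothetical.

```python
def battery_life_days(capacity_mah, nominal_voltage_v, power_mw):
    """Ideal runtime in days for a constant power draw."""
    energy_mwh = capacity_mah * nominal_voltage_v  # mAh * V = mWh
    return energy_mwh / power_mw / 24.0

# 350 mAh cell, assumed 3.7 V nominal, 19.9 mW average draw.
days = battery_life_days(350, 3.7, 19.9)
print(f"{days:.2f} days")
```

Under these assumptions the ideal runtime comes out slightly above the "more than 2.5 days" quoted in the abstract, which leaves some margin for conversion losses.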