arXivDaily: daily arXiv academic digest, updated Monday through Friday
cs.HC Human-Computer Interaction 34
2606.12214 2026-06-11 cs.HC cs.GR New submission

Identifying cybersickness causes in virtual reality games using symbolic machine learning algorithms

Thiago Porcino, Erick Oliveira Rodrigues, Flavia Bernardini, Daniela Trevisan, Esteban Clua

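AI summary: Proposes using symbolic machine learning algorithms to rank the causes of cybersickness in VR games. Experiments with two games and 37 valid samples find that rotation and acceleration trigger cybersickness more readily in the flight game, and that users with little VR experience are more prone to discomfort.

Details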
Abstract

Virtual reality (VR) and head-mounted displays are steadily gaining popularity in fields such as education, military, entertainment, and health. Although such technologies provide a high sense of immersion, they can also trigger symptoms of discomfort. This condition is called cybersickness (CS) and is a common topic in recent virtual reality publications. This work proposes a novel experimental analysis using symbolic machine learning to rank potential causes of CS in VR games. We estimate CS causes and rank them according to their impact using classical machine learning. Experiments are performed using two virtual reality games and 6 experimental protocols, along with 37 valid samples from a total of 88 volunteers. Our results show that rotation and acceleration triggered cybersickness more frequently in a flight game than in a race game. We also observed that subjects who are less experienced with VR are more prone to discomfort. Prior experience plays a more important role in the race game, as this game gives the user more freedom in terms of controllers, more displacement alternatives, and more user-controlled acceleration. Furthermore, different causes of discomfort arise depending on whether VR exposure is short or long term. We suggest strategies for mitigating CS in these two scenarios (short- and long-term exposure) and compare the two highlighted game scenarios (race and flight).

2606.11986 2026-06-11 cs.HC New submission

Channels and Substrates: Distributed Cognition as an Interaction Model for Ubiquitous Analytics

Niklas Elmqvist, Panagiotis D. Ritsos, Peter W. S. Butcher

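AI summary: Addresses the poor fit of existing interaction models to cross-device and ubiquitous analytics by proposing a channels-and-substrates framework grounded in distributed cognition, which models interaction as the propagation of representational state across substrates, and demonstrates it by reanalyzing several systems.

Details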
Comments
16 pages, 8 figures
Abstract

Traditional HCI interaction models assume a single monolithic interface and a stable sensorimotor loop. These models fit poorly with cross-device (XVA) and ubiquitous analytics (UA), where interactive data sensemaking unfolds across multiple devices, artifacts, and people in disparate settings from the office to the factory floor. In this paper, we show how interaction in ubiquitous analytics can be modeled using distributed cognition as propagation of representational state across substrates -- minds, speech, bodies, artifacts, and devices -- rather than as traffic through a single interface. On this basis we introduce input and output channels as generalizations of the visual channels from data visualization: just as visual channels carry data through properties of the visual substrate, input and output channels carry representational state through substrates whose availability, suitability, and preferability depend on context. We demonstrate the channels and substrates framework by reanalyzing several ubiquitous, immersive, and situated analytics systems.

2606.11980 2026-06-11 cs.HC New submission

Somewhere Over the Desktop: A Research Agenda for Ubiquitous Analytics

Niklas Elmqvist, Panagiotis D. Ritsos, Peter W. S. Butcher

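AI summary: The convergence of spatial computing, generative AI, and open web standards creates new opportunities for ubiquitous analytics (UA); by crossing seven clusters, including cognition, context, and interaction, the paper derives 42 future research challenges.

Details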
Comments
15 pages, 5 figures, 1 table
Abstract

Spatial computing, generative AI, and open web standards are converging. Three spatial operating systems -- Android XR, Meta Horizon OS, and Apple visionOS -- now ship with platform-level scene understanding. Wearable displays span the range from full headsets to slim smartglasses. Agentic AI operates on the same spatial substrates as the human user. This convergence enables new opportunities for ubiquitous analytics (UA): the use of many, physically distributed, networked devices to support data sensemaking anytime and anywhere. But proprietary platforms are settling design conventions that will calcify without evidence-based alternatives. UA has now matured to the point where its intellectual history can be read as a structured genealogy of foundations, contributions, and lineages. We trace this genealogy and organize it into clusters spanning cognition, context, interaction, platforms, visualization, collaboration, and evaluation. Finally, we cross these clusters with each other, yielding a total of 42 future research challenges.

2606.11930 2026-06-11 cs.HC cs.AI cs.CV New submission

Frozen Multimodal Embeddings for Personality and Cognitive Ability Assessment in Asynchronous Video Interviews

Kuo-En Hung, Hung-Yue Suen, Shih-Ching Yeh, Hsiang-Wen Wang

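AI summary: For high-dimensional multimodal learning from asynchronous video interviews with limited labeled data, proposes frozen multimodal encoders (CLIP, Whisper, RoBERTa, and others) combined with low-capacity downstream models, achieving a 19.1% relative MSE reduction on personality prediction and finding evidence of possible dataset shortcuts in cognitive ability prediction.

Details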
Comments
9 pages, 1 figure, 4 tables
Abstract

Predicting psychological traits from asynchronous video interviews (AVIs) is a challenging multimodal learning problem because labeled datasets are limited while each response contains high-dimensional visual, acoustic, and verbal signals. This paper presents our solution for the ACM Multimedia AVI Challenge 2026, which evaluates two tasks: Track 1 predicts self-reported HEXACO personality traits from personality-related interview responses, and Track 2 classifies cognitive ability levels from structured AVI responses. We treat the problem as a small-sample representation learning task. Instead of fine-tuning large pretrained models, we use frozen multimodal encoders, including CLIP for visual features, Whisper for acoustic features and transcripts, and RoBERTa, E5, and DeBERTaV3 for textual representations, followed by low-capacity downstream models. For Track 1, our trait-specific regression and late-fusion system achieves an average validation MSE of 0.2696, improving over the official baseline of 0.3334. Ablation results show a three-step improvement from a global model (0.3189), to per-trait modeling (0.2871), to per-trait late fusion (0.2696), corresponding to a 19.1% relative MSE reduction over the official baseline. For Track 2, a compact subject-attribute baseline reaches 0.5781 accuracy, while our multimodal ensemble reaches 0.5313, both above the official baseline of 0.4062. We interpret this result as evidence of possible subject-attribute shortcuts in the validation split rather than robust cognitive inference from AVI content. Overall, our findings suggest that AVI-based psychological assessment benefits from trait-specific multimodal modeling, but cognitive ability prediction requires careful control of dataset shortcuts.
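
The 19.1% figure follows from the validation MSEs quoted in this abstract, and the same arithmetic applies to each ablation step. The Python sketch below reproduces it. The `late_fusion` helper is only an illustrative assumption (a weighted average of per-modality predictions for one trait); the abstract does not say how the authors actually fuse their models.

```python
# Validation MSE values quoted in the abstract.
OFFICIAL_BASELINE = 0.3334
ABLATION = {
    "global model": 0.3189,
    "per-trait modeling": 0.2871,
    "per-trait late fusion": 0.2696,
}


def relative_reduction(baseline: float, value: float) -> float:
    """Fractional MSE reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


def late_fusion(predictions: list[float], weights: list[float]) -> float:
    """Hypothetical late fusion: weighted average of per-modality predictions."""
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)


for name, mse in ABLATION.items():
    reduction = relative_reduction(OFFICIAL_BASELINE, mse)
    print(f"{name}: {reduction:.1%} below the official baseline")
# The last line printed is "per-trait late fusion: 19.1% below the official baseline".
```

Fusing after each modality has produced its own prediction keeps every downstream model small, which fits the small-sample setting the abstract describes.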

2606.11896 2026-06-11 cs.HC New submission

PAPEL: A Collaborative System for Parental Guidance during Preschool Play-Based English Learning

Xutong Wang, Yu Mei, Qinwei Li, Muyu Liu, Xiwen Yao, Chang Liu, Zhoutong Ye, Jie Cai, Chun Yu, Yuanchun Shi

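AI summary: To address the challenges parents face in play-based English learning, proposes PAPEL, a system that improves the quality of parent-child interaction through scene-aware suggestions and four core modules (content generation, language adaptation, balance assessment, and extended response).

Details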
Comments
38 pages, 9 figures, 5 tables. Accepted to CSCW 2026 / To appear in Proceedings of the ACM on Human-Computer Interaction (CSCW 2026)
Abstract

Play-based parent-child interaction offers preschoolers rich opportunities for everyday foreign language learning, yet many parents struggle to turn open-ended play into effective English-as-a-Foreign-Language (EFL) learning experiences at home. To explore how AI might support this process, we conducted formative studies through interviews and a Wizard-of-Oz study. We identified four key challenges: content selection, language expression, balancing instruction and play, and problem solving. To address these challenges, we present PAPEL, a parent-AI collaborative system that grounds suggestions in the ongoing play scene and organizes support into four core modules: content generation, language adaptation, balance assessment, and extended response. In a counterbalanced within-subjects study with 16 parent-child dyads, PAPEL was associated with more integrated parent utterances that combined playful and instructional content, as well as more parent-child conversational turns, than the lightweight chatbot baseline used in our study.

2606.11835 2026-06-11 cs.HC cs.AI New submission

Designing AI-Supported Focus Groups: A Role x Modality Playbook

Zhiqing Wang, Steven Dow

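AI summary: Because focus groups are resource-intensive and highly sensitive to facilitation, proposes a playbook organized by AI role (tool, co-host, host) and modality (text, voice, embodied), and analyzes interactional trade-offs and open questions.

Details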
Abstract

Collecting participants' lived experiences is central to design research. Focus groups are uniquely valuable because participants not only share individual accounts but also respond to one another, surfacing comparison, disagreement, and collective sensemaking. However, focus groups are resource-intensive and highly sensitive to facilitation: moderators must probe for specificity, balance participation, manage topic flow, and sustain psychological safety, and subtle facilitation choices can shape what becomes salient. Recent HCI work and commercial meeting tools show that generative AI can scaffold live conversation through prompting, turn regulation, thematic mapping, and real-time summarization. Yet UX research (UXR) teams lack a clear map of what these capabilities mean in focus groups and what methodological risks they introduce. We synthesize prior work on AI-supported live conversation and propose a focus-group-specific playbook of AI supports organized by role (tool, co-host, host) and modality (text, voice, embodied). We characterize interactional trade-offs and identify open questions for evaluating AI-supported focus groups as methodological configurations.

2606.11693 2026-06-11 cs.HC New submission

Understanding and Supporting Online Discussion with Opinionated Chatbots

Tianqi Song, Chi-Lan Yang, Zihan Liu, Zhengtao Xu, Yibin Feng, Yi-Chieh Lee

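AI summary: An experiment examines how chatbots expressing different viewpoint types (opposing, reinforcing, balanced) affect users' subsequent online discussions, finding that opposing chatbots promote opinion shifts while reinforcing chatbots promote more agreeable communication styles.

Details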
Abstract

Opinionated chatbots are increasingly present on online platforms and have the potential to shape public discourse by influencing individuals' viewpoints before they engage in discussions. Despite their growing presence, the impact of interacting with opinionated chatbots on subsequent online interactions remains largely unexplored. This study investigated how exposure to different types of opinionated chatbots, specifically those expressing opposing, reinforcing, or balanced viewpoints, affected participants' subsequent online discussions. In a controlled experiment with 83 participants, we found that interacting with an opinionated chatbot that consistently opposed participants' arguments led to greater shifts in opinion, indicating enhanced openness to revising one's initial stance. Conversely, participants who interacted with a chatbot that consistently reinforced their views were more likely to adopt more agreeable communication styles in subsequent conversations with others. Furthermore, interactions with different types of opinionated chatbots resulted in varying levels of trust, as well as different perceptions of chatbots and human interlocutors. Our findings indicate that opinionated chatbots can influence both individuals' opinions on social topics and their communication behaviors in online environments. This presents a trade-off for future designers seeking to facilitate cognitive flexibility in changing opinions while maintaining positive user experiences and trust in the chatbots during public discourse. We discuss the implications for designing opinionated chatbots to promote more constructive and less polarized online discussions.

2606.11669 2026-06-11 cs.HC cs.CY New submission

Learning by Chatting? Investigating the Impact of Generative AI on Information Seeking and Learning

Shravika Mittal, Su Lin Blodgett, Q. Vera Liao

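AI summary: An 8-day field experiment compares the effects of ChatGPT and Google Search on informal learning, finding that the ChatGPT group, having offloaded information selection, experienced greater meta-cognitive load, explored less of the knowledge space, and had worse learning outcomes, especially for higher-order critical learning.

Details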
Abstract

Generative AI (GenAI) tools offer increasing opportunities for augmenting human cognitive tasks. Among these tasks, information seeking is being rapidly reshaped by GenAI tools, with potentially profound implications for learning and knowledge acquisition. To investigate these implications, we conducted a between-subjects field experiment in which participants pursued informal learning by seeking information through either ChatGPT or Google Search over a span of 8 days. Using a daily diary protocol, we gathered in-situ data on their information-seeking processes. Our findings show that participants in the ChatGPT group experienced diminished agency in their information-seeking processes, as they offloaded much of the information selection to AI, and consequently experienced greater meta-cognitive load arising from this reduced sense of control. We further highlight two sources of distortion in information access when using ChatGPT: biases in ChatGPT outputs, particularly towards providing solution-oriented artifacts over principled knowledge; and systematic shifts in users' information-seeking behaviors, whereby the conversational and socially-oriented interaction paradigm of current GenAI tools may inadvertently reduce exploration of the broader knowledge space. As a result, on average, participants in the ChatGPT group had worse learning outcomes than those using Google, especially for higher-order critical learning. Our work suggests inherent tensions between offloading information seeking to AI and meaningful learning, and provides broader implications for understanding AI's risks to human cognition.

2606.11654 2026-06-11 cs.IR cs.CL cs.HC cs.SI New submission

The Long Tail, Not the Front Page: Cold-Start Prediction of Crowd Highlight Salience

Kazuki Nakayashiki, Keisuke Watanabe

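AI summary: Studies how to predict a document's crowd highlight salience from its text before any reader marks exist, proposing a logistic ranker over sentence embeddings and positional/contextual features that beats the lead (position) baseline by 0.044 average precision, and shows that this edge comes from learning from real reader marks.

Details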
Comments
10 pages, 3 figures, 4 tables
Abstract

A social highlighter's most useful signal -- which passages a crowd of readers marks -- exists only for documents people have already read. Can the aggregate crowd salience of a document be predicted from its text before its marks accumulate? Prior work on this data found that zero-shot language models recover highlight locations worse than a trivial lead (position) baseline, so we ask whether a model trained on the highlight corpus can beat that baseline. Using a pre-registered ladder of models and a by-document cluster bootstrap, we find a small but robust edge: a logistic ranker over sentence embeddings and positional/contextual features beats the lead baseline by +0.044 average precision (95% CI [+0.029, +0.058]; clears a pre-registered margin delta=0.03 in 97% of resamples, and stable across pipeline re-runs). Two unsupervised extractive baselines (centroid, LexRank-style centrality) lose to lead, and the trained model beats them by +0.108, so the edge is not recovered by generic unsupervised proxies -- it reflects learning from real reader marks. In product terms, precision@3 rises from 0.25 to 0.39 (+55% relative) and the model beats lead on 69% of documents. An ablation attributes the edge to the raw embedding (+0.014) and training augmentation (+0.010), each with a positive CI. The edge is not a temporal-generalization failure, and we find no evidence that content drift or near-duplicate leakage explains it. A standardized regression shows the advantage is governed mainly by document popularity (lower popularity, larger edge) and by label reliability. It nearly vanishes only on the most popular content; there it is the lead baseline that strengthens, not the model that weakens. Because our evaluation conditions on documents that eventually accumulated readers, these results are a retrospective cold-start simulation.
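
The two metrics this abstract reports, average precision and precision@3, can be computed for any sentence ranking, including the lead (position) baseline. The Python sketch below uses standard definitions of both metrics on a toy document. The toy highlights and the `model` ranking are hypothetical illustrations, not data or output from the paper.

```python
def precision_at_k(ranked: list[int], relevant: set[int], k: int) -> float:
    """Share of the top-k ranked sentence indices that readers highlighted."""
    return sum(1 for idx in ranked[:k] if idx in relevant) / k


def average_precision(ranked: list[int], relevant: set[int]) -> float:
    """Mean of precision@rank, taken at each rank where a highlighted sentence appears."""
    hits, total = 0, 0.0
    for rank, idx in enumerate(ranked, start=1):
        if idx in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def lead_ranking(n_sentences: int) -> list[int]:
    """Lead (position) baseline: rank sentences in document order."""
    return list(range(n_sentences))


# Toy document (hypothetical): 6 sentences, readers highlighted sentences 1 and 4.
highlighted = {1, 4}
lead = lead_ranking(6)
model = [4, 1, 0, 2, 3, 5]  # hypothetical output of a trained ranker

for name, ranking in (("lead", lead), ("model", model)):
    ap = average_precision(ranking, highlighted)
    p3 = precision_at_k(ranking, highlighted, 3)
    print(f"{name}: AP={ap:.3f}, precision@3={p3:.3f}")
```

In a per-document evaluation like the one described, these scores would be computed for every document and then averaged, with the bootstrap resampling whole documents rather than individual sentences.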

2606.11642 2026-06-11 cs.HC cs.CL New submission

3-Key-Input: Exploring the Theoretical Minimum Keys for Text Entry

Naoki Kimura

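AI summary: Systematically evaluates text entry systems that combine language models with 2-5 physical keys, finding that 3 keys + GPT-4o achieves a character error rate of 9.46%, which suggests that 3 keys are a practical minimum under a strong language-model prior.

Details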
Comments
6 pages, 1 figure, 7 tables. Published in ICASSP 2026
Abstract

How far can we reduce the number of physical keys if we endow an ambiguous keyboard with modern language models? Fewer keys increase hardware design freedom in constrained settings such as assistive devices and mobile form factors. This paper systematically evaluates text entry systems using 2-5 physical keys combined with language-model-based disambiguation. On a 300-sentence English corpus (100 sentences each for Business / Conversational / Technical), we compare key counts (2-5), letter-to-key mappings (layout-based / frequency-based / intentionally worst-case), and decoders (Trie-only, GPT-2 beam search, GPT-4o selection). We find that 3 keys + GPT-4o achieves character error rate (CER) 9.46% and word error rate (WER) 12.20%, reducing CER by 59% relative to 2 keys (CER 23.3%). At 3 keys, the key-stream entropy is 1.54 bits/char; while increasing to 5 keys improves accuracy (CER 5.4%), the marginal gains diminish. Mapping choice has a small impact under standard designs (ΔCER < 0.5 pp), and even an intentionally worst mapping degrades CER by only +0.5 pp, whereas Technical sentences yield roughly twice the error rate of Business. These results suggest that, in our evaluated offline setting under a strong LM prior, 3 keys are a practical minimum for general English.
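
To make the idea of an ambiguous keyboard concrete, the Python sketch below maps the 26 letters onto a few keys, encodes words as key sequences, and looks up which vocabulary words share a sequence. It is a minimal illustration under stated assumptions: the round-robin `frequency_mapping`, the letter order in `FREQ_ORDER`, and the toy vocabulary are hypothetical, not the paper's mappings or corpus. The step where a language model picks among the colliding candidates is not implemented.

```python
from collections import defaultdict

# Letters in a rough English frequency order (an assumption of this sketch).
FREQ_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def frequency_mapping(num_keys: int) -> dict[str, int]:
    """Deal letters round-robin onto keys so frequent letters are spread out."""
    return {ch: i % num_keys for i, ch in enumerate(FREQ_ORDER)}


def encode(word: str, mapping: dict[str, int]) -> tuple[int, ...]:
    """Key sequence a user would press to type `word`."""
    return tuple(mapping[ch] for ch in word.lower())


def build_index(vocab: list[str], mapping: dict[str, int]) -> dict[tuple[int, ...], list[str]]:
    """Trie-like lookup: key sequence -> every vocabulary word that produces it."""
    index = defaultdict(list)
    for word in vocab:
        index[encode(word, mapping)].append(word)
    return dict(index)


# Toy vocabulary (hypothetical); a decoder would rank colliding words with a language model.
vocab = ["the", "and", "cat", "act", "dog", "god", "key", "input", "text", "entry"]

for num_keys in (2, 3, 5):
    index = build_index(vocab, frequency_mapping(num_keys))
    ambiguous = sum(1 for words in index.values() if len(words) > 1)
    print(f"{num_keys} keys: {len(index)} distinct sequences, {ambiguous} shared by several words")

# Relative CER reduction quoted in the abstract: from 23.3% (2 keys) to 9.46% (3 keys).
print(f"relative CER reduction: {(23.3 - 9.46) / 23.3:.0%}")  # 59%
```

Fewer keys mean more words collapse onto the same key sequence, so the decoder has to carry more of the disambiguation, which is where the language-model prior comes in.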

2606.11613 2026-06-11 cs.IR cs.CL cs.HC cs.SI New submission

Factions Within, Uncertain Across: Within-Document Reader Sub-Groups in Social Highlighting

Kazuki Nakayashiki, Keisuke Watanabe

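AI summary: Using a margin-preserving curveball null model, finds that readers within a document form strong sub-groups whose agreement far exceeds what shared salience predicts, most of it surviving as finer-grained reader-specific agreement; cross-document stability remains unresolved.

Details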
Comments
11 pages, 3 figures, 3 tables
Abstract

When many people highlight the same document, is the crowd a single consensus, or is it internally structured into reader sub-groups that mark different things -- and is that structure a stable property of a reader or of the document? Building on prior work showing an individual's within-document highlighting signal is a whisper while individuality lives in selection, we ask the group-level question on a co-readership platform using a margin-preserving curveball null. Experiment 1: within a document, readers form strong sub-groups -- pairs agree far beyond what shared salience, mark density, and sentence popularity predict (nearest-neighbour agreement z=+6.3, significant in 88% of documents). Under an eight-block region-preserving null, shared engagement with the same coarse regions of the document accounts for about 40% of this excess; the majority survives as finer reader-specific agreement (z=+3.6, 77% significant). So the within-document crowd is, in a descriptive sense, factional. Experiment 2: is that grouping a stable reader trait? Here we are honest about power. The cross-document split-half reproducibility of a pair's agreement is near zero pooled (+0.078 and 0.000 in two separately drawn samples), and a power calibration shows the test is informative only for pairs that co-read many documents. In the only informative high-overlap subset (k>=4), point estimates are positive but small-sample, imprecise across the separately drawn samples, never significant, and attenuate under the region-preserving null. We therefore leave cross-document stability unresolved: the data is consistent with anything from situational grouping to a weak-to-moderate stable reader trait. The crowd is factional within a document; whether its factions follow the reader across documents is, honestly, beyond our reach.
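
A curveball null randomizes a binary reader-by-sentence matrix while keeping both margins fixed: each reader keeps the same number of marks and each sentence keeps the same number of readers. The Python sketch below shows the generic pairwise trade on a toy document. It is not the authors' pipeline: the toy marks and the number of trades are assumptions, and the eight-block region-preserving variant is not shown.

```python
import random


def curveball(rows: list[set[int]], n_trades: int, rng: random.Random) -> list[set[int]]:
    """Randomize reader mark sets by pairwise trades that preserve both margins."""
    rows = [set(r) for r in rows]  # work on copies
    for _ in range(n_trades):
        i, j = rng.sample(range(len(rows)), 2)
        shared = rows[i] & rows[j]
        only_i = rows[i] - rows[j]
        only_j = rows[j] - rows[i]
        # Pool the marks unique to either reader and deal them back at random,
        # giving each reader as many as they contributed.
        pool = sorted(only_i | only_j)
        rng.shuffle(pool)
        rows[i] = shared | set(pool[: len(only_i)])
        rows[j] = shared | set(pool[len(only_i):])
    return rows


# Toy document (hypothetical): each set holds the sentence indices one reader marked.
marks = [{0, 1, 2}, {1, 2, 3}, {0, 4}, {2, 3, 4, 5}]
null_marks = curveball(marks, n_trades=200, rng=random.Random(0))

# Reader mark counts and sentence popularity are unchanged by the randomization.
print([len(r) for r in marks], [len(r) for r in null_marks])
print(sorted(s for r in marks for s in r) == sorted(s for r in null_marks for s in r))
```

An agreement statistic computed on the observed marks can then be compared with its distribution over many such randomized copies, which is how a z-score against the null is typically obtained.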

2606.11349 2026-06-11 cs.AI cs.HC New submission

Knowing When to Ask: Self-Gated Clarification for Hierarchical Language Agents

Aijing Gao, Yiming Kang, Mengdie Flora Wang, Jae Oh Woo

Affiliations: Amazon Web Services

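AI summary: Proposes ACTION-RATING, a formulation that places clarification requests inside the agent's action space on an ordinal scale shared with navigation, enabling self-gated clarification in hierarchical reasoning and improving decision accuracy through two information-seeking modes, mandatory and opportunistic.

Details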
Abstract

In hierarchical reasoning, failures often originate at intermediate decision points where the agent commits to a wrong branch without recognizing that it lacks critical information. Rather than treating clarification as an external uncertainty trigger, we propose ACTION-RATING, a formulation that places it inside the agent's action space on a shared ordinal scale with navigation, so that asking competes directly with acting at every decision point and help-seeking becomes observable at intermediate states. Two structurally distinct information-seeking modes emerge from the agent's own ratings: mandatory (no viable branch) and opportunistic (residual uncertainty despite a leading candidate). On Harmonized Tariff Schedule classification (30,000-node taxonomy, three benchmarks, 9 LLMs across 4 families), we observe a regime shift from mandatory to opportunistic clarification, with Information-Seeking Effectiveness (ISE), a local diagnostic defined as the fraction of help interactions followed by a correct next navigation step (not a final-task metric), rising from 50% to 74%. Three diagnostic contrasts fail to reproduce this structure. A separability test shows that the information-seeking pattern (mode split, ISE ranking) persists when answer quality is degraded (-18.8% accuracy), supporting an empirical separation between where an agent seeks help and the quality of the help it receives. Under the controlled answer channel, accuracy gains reach +16.2% at 10-digit; we read this as an upper bound on what better localization could unlock, not a deployment estimate.
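
The Python sketch below shows one way that asking could compete with acting on a shared ordinal scale, and how ISE as defined in this abstract would be computed. It is a hypothetical illustration rather than the authors' implementation: the 1-5 scale, the `VIABLE` threshold, the rule that ties go to navigation, and the branch names are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

ASK = "ASK"
VIABLE = 3  # hypothetical: ratings of 3 or more on an assumed 1-5 scale count as viable


@dataclass
class Decision:
    action: str          # a branch name, or ASK
    mode: Optional[str]  # "mandatory", "opportunistic", or None when navigating


def decide(branch_ratings: dict[str, int], ask_rating: int) -> Decision:
    """Pick the top-rated action; asking is rated on the same scale as the branches."""
    best_branch = max(branch_ratings, key=lambda b: branch_ratings[b])
    if ask_rating <= branch_ratings[best_branch]:
        return Decision(best_branch, None)  # assumed tie rule: navigation wins
    has_viable_branch = any(r >= VIABLE for r in branch_ratings.values())
    return Decision(ASK, "opportunistic" if has_viable_branch else "mandatory")


def ise(next_step_correct: list[bool]) -> float:
    """Fraction of help interactions followed by a correct next navigation step."""
    return sum(next_step_correct) / len(next_step_correct)


print(decide({"branch_a": 4, "branch_b": 2}, ask_rating=5))  # asks despite a leading candidate
print(decide({"branch_a": 1, "branch_b": 2}, ask_rating=4))  # asks because no branch is viable
print(decide({"branch_a": 5, "branch_b": 2}, ask_rating=3))  # navigates to branch_a
print(ise([True, True, True, False]))  # 0.75
```

Because the ask rating is produced at every decision point, the mode of each help request can be read directly from the ratings, with no separate uncertainty detector.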

2606.11336 2026-06-11 cs.HC cs.ET eess.SY New submission

Towards a Joint Understanding of Remote Operation for Vehicles in Public Road Traffic

Elisabeth Shi, Maria-Magdalena Wolf, Nina Theobald, Bettina Abendroth, Eugen Wige, Johannes Springer, Katharina Hottelart, Andreas Schrank, Thorben Brandt, Michael Oehl, Frank Diermeyer, Lena Plum

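AI summary: Proposes a framework that traces terminology back to differences in information processing between human and vehicle in order to unify concepts of remote operation, support interdisciplinary discourse, and integrate recently discussed forms of remote operation.

Details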
Abstract

Sustained driving automation systems are envisioned as the foundation for driverless mobility services. However, both researchers and practitioners acknowledge that current driving automation systems are not yet able to handle all traffic situations that a human driver can handle. To bridge this gap and enable mobility services without an in-vehicle human driver or fallback, remote operation (or teleoperation) is increasingly discussed. Recently, the first legal steps have been taken to enable some forms of remote operation on public roads. Remote operation encompasses a broad spectrum of methods to support a driving automation system, ranging from remote assistance, which includes providing information or releasing a maneuver, to remote driving, which includes driving the vehicle from a remote location. As such, the safe implementation of remote operation in public road traffic poses a challenge for collaboration among multiple academic disciplines (e.g., engineering, psychology, informatics, and law) and stakeholders (e.g., remote operation service providers, remote operators, vehicle manufacturers, and regulatory authorities). At the same time, interdisciplinary discourse is often challenging due to differing expectations and language. To build common ground, this article traces terminology back to the original differences in information processing on both the human and the vehicle side. This framework aims to support further discourse by directly specifying what is needed to engage a diverse audience, including researchers and stakeholders of different backgrounds and interests. Recently discussed forms of teleoperation are integrated into this framework.

2606.11269 2026-06-11 cs.CV cs.HC New submission

Traits Run Deeper: Trait-Specific Asymmetric Fusion for Personality Assessment

Jia Li, Qian Chen, Wei Wang, Xinyu Li, Zhenzhen Hu, Dongsheng Shao, Richang Hong, Meng Wang

Affiliations: Hefei University of Technology; Intelligent Interconnected Systems Laboratory of Anhui Province; Jianghuai Advanced Technology Center; Anhui Provincial Industry Innovation Center of Humanoid Robots; Anhui Provincial Key Laboratory of Humanoid Robots

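AI summary: Proposes the Traits Run Deeper framework, which addresses trait-specific modality preferences and label bias in personality assessment through multimodal foundation representation, trait-specific asymmetric fusion, and distribution-calibrated regression modules, reducing MSE by approximately 25% on AVI Challenge 2026.

Details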
Abstract

Personality assessment aims to infer stable personality traits from dynamic behaviors across language, voice, and facial cues. Since different personality dimensions are revealed through distinct behavioral perspectives, modeling trait-specific evidence is challenging. However, most existing approaches adopt a uniform multimodal fusion strategy across all dimensions, assuming identical modality contributions. This overlooks trait-specific modality preferences and introduces cross-modal interference. To address this issue, we propose a novel personality assessment framework called Traits Run Deeper, which consists of three components. Specifically, the Multimodal Foundation Representation (MFR) module constructs personality-oriented multimodal inputs and leverages psychology-informed semantic templates as anchors, enabling foundation models to capture trait-relevant information. Building upon MFR, the Trait-Specific Modality Fusion (TSMF) module acts as an asymmetric fusion mechanism, allowing each dimension to selectively exploit different modality pathways from modality-specific modeling to complementary fusion. Thus, TSMF captures heterogeneous modality preferences while reducing cross-modal contamination. Furthermore, the Distribution-Calibrated Personality Regression (DCPR) module mitigates label imbalance and central tendency bias through target distribution calibration, improving robustness and stability. Experimental results on the AVI Challenge 2026 validation set demonstrate the effectiveness of the proposed framework, reducing mean squared error (MSE) by approximately 25% compared with the baseline. Consistent improvements are observed on the official test set, where our method achieves the best performance and ranks first in the Personality Assessment Track. The source code will be made available at this https URL.

2606.11217 2026-06-11 cs.CY cs.AI cs.HC 新提交

Preregistration for Experiments with AI Agents

AI智能体实验的预注册

Michelle Vaccaro

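AI summary: To address methodological vulnerabilities in experiments with AI agents, the paper argues for extending preregistration to this setting and proposes a dedicated template to improve research credibility.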

详情
Comments
Accepted at ICML 2026 as a Spotlight (Top 5%) Position Paper
AI中文摘要

大型语言模型(LLM)和自主AI智能体的普及催生了一种快速发展的方法论范式:“计算机内”行为实验。最初,这种方法被设想为在认知、决策和社会动态研究中,使用AI智能体作为人类参与者的替代品,但现在它已具有新的意义——随着AI智能体越来越多地代表个人和组织进行谈判、交易和做出重大决策,理解它们的行为本身已成为研究重点。虽然这些AI智能体实验在可扩展性、成本效益和实验控制方面提供了前所未有的优势,但它们也继承并有时放大了长期困扰人类受试者研究的方法论漏洞。为解决这些问题,本文主张,预注册实践——对于提高人类受试者实验的可信度至关重要——现在应扩展到AI智能体实验。我们系统地列举了AI智能体实验引入的研究者自由度——例如模型选择、提示措辞、设置和基于结果的重新设计——并展示了低迭代成本和缺乏报告规范如何使这些选择既容易被利用又难以被检测。我们提出了一个针对AI智能体实验的预注册模板,并呼吁会议、期刊和资助机构将预注册作为这一新兴研究范式的标准实践。

英文摘要

The proliferation of large language models (LLMs) and autonomous AI agents has given rise to a rapidly growing methodological paradigm: "in silico" behavioral experiments. Originally conceived as a way to use AI agents as proxies for human participants in studies of cognition, decision-making, and social dynamics, this approach has taken on new significance -- as AI agents increasingly negotiate, transact, and make consequential decisions on behalf of people and organizations, understanding their behavior has become a research priority in its own right. While these experiments with AI agents offer unprecedented advantages in terms of scalability, cost efficiency, and experimental control, they also inherit, and in some cases amplify, methodological vulnerabilities that have long plagued human subjects research. To address these issues, this paper argues that preregistration practices -- central to improving the credibility of human subjects experiments -- should now be extended to experiments with AI agents. We systematically catalog the researcher degrees of freedom that experiments with AI agents introduce -- model selection, prompt wording, settings, and outcome-contingent redesign, for example -- and show how the low cost of iteration and lack of reporting norms make these choices both easy to exploit and difficult to detect. We propose a preregistration template tailored to experiments with AI agents and call on conferences, journals, and funding agencies to make preregistration standard practice for this emerging research paradigm.

2606.11214 2026-06-11 cs.CY cs.AI cs.HC 新提交

From Awareness to Action: Understanding and Overcoming the Research-Practice Gap in Algorithmic Fairness for Public Health

从意识到行动:理解并克服公共卫生算法公平性中的研究-实践差距

Sara Altamirano, Tijs Portegies, Sennay Ghebreab

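AI summary: A mixed-methods study exposes the gap between awareness and action on algorithmic fairness in ML-driven public health research and proposes the Fairness-to-Action framework, which integrates methodological, organizational, and systemic dimensions. It finds that fairness is weakly institutionalized, that knowledge translation mechanisms are externally driven, and that system-level priorities favor accuracy over fairness.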

详情
Comments
Extended version of an accepted IASEAI'26 paper; includes technical appendices. 22 pages, 2 figures
AI中文摘要

算法公平性对于负责任的机器学习驱动的公共卫生研究至关重要,但其实际实施仍然有限。为了调查这种意识-行动差距,我们进行了一项顺序混合方法研究,包括专家访谈、在线调查和系统映射。专家访谈为调查设计提供了信息,调查揭示了公平性的碎片化定义、有限的培训和指导、对外部来源的依赖以及正式评估、缓解或监测的罕见使用。这些发现随后被映射到三个既定的研究-实践差距视角:知识-实践差距、知识到行动循环和知道-做差距,每个视角提供了互补的观点。基于这一综合,我们引入了公平到行动框架,该框架整合了方法、组织和系统维度,以识别算法公平性知识转化停滞的位置。我们的分析表明,公平性仍然制度化薄弱,转化机制由外部驱动,系统级优先事项继续强调准确性而非公平性。这些见解为推进安全、公平和道德的机器学习驱动的公共卫生研究实践提供了关键杠杆点。

英文摘要

Algorithmic fairness is essential for responsible ML-driven public health research, yet its practical implementation remains limited. To investigate this awareness-action gap, we conducted a sequential mixed-methods study comprising expert interviews, an online survey, and systematic mapping. The expert interviews informed the design of the survey, which in turn revealed fragmented definitions of fairness, limited training and guidance, reliance on external sources, and rare use of formal assessment, mitigation, or monitoring. These findings were subsequently mapped onto three established research-practice gap lenses: the Knowledge-Practice Gap, the Knowledge-to-Action Cycle, and the Knowing-Doing Gap, each offering complementary perspectives. Building on this synthesis, we introduce the Fairness-to-Action framework, which integrates methodological, organizational, and systemic dimensions to identify where translation of algorithmic fairness knowledge stalls. Our analysis shows that fairness remains weakly institutionalized, translation mechanisms are externally driven, and system-level priorities continue to emphasize accuracy over fairness. These insights suggest critical leverage points for advancing safe, fair, and ethical ML-driven public health research practice.

2606.11195 2026-06-11 cs.CY cs.AI cs.HC 新提交

From Consumption to Reflection: Designing Human-AI Relations for Stable Reasoning

从消费到反思:为稳定推理设计人-人工智能关系

Rikard Rosenbacke, Carl Rosenbacke, Victor Rosenbacke, Martin McKee

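AI summary: Proposes Relational Reflective Intelligence (RRI), an inference-time governance layer that operationalizes reflection through auditable reasoning loops, turning human-AI interaction into a joint reasoning system in which each side compensates for the other's limitations.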

详情
AI中文摘要

大型语言模型(LLM)改变了人类获取信息的方式,但并未改变我们推理信息的方式。它们的流畅性加速了消费,同时绕过了支撑健全判断的缓慢反思过程。本文介绍了关系反思智能(RRI),一种推理时治理层,通过可审计的推理循环将反思操作化。RRI 不在模型内部运行,而是在模型周围运行,为人类与 LLM 之间的稳定、可审计推理提供了实用结构。核心前提是,LLM 继承了与塑造人类思维相似的认知脆弱性:依赖直觉捷径、混淆表征与现实、偏好连贯性而非证伪。当人类和模型共享这些倾向时,它们的错误会叠加。我们称之为关系漂移,一种源于交互而非仅来自模型的失败。解决这一问题需要从建模词间关系转向建模模型输出与人类推理之间的关系。RRI 通过三个组件提供了这一缺失层:Rose-Frame(识别推理中可能的故障点)、Architect's Pen(在关键时刻引入针对性反思步骤)以及一个推理时工作流(无需重新训练模型即可嵌入这些步骤)。这些元素共同将人机交互转变为一个具有显式检查点、冲突揭示和可审计假设轨迹的联合推理系统。RRI 不是让机器像人类一样思考,也不是强迫人类像机器一样推理,而是创造一种结构化交互,使双方补偿彼此的局限。它将 AI 安全重新定义为认知架构问题,其中可靠决策取决于将反思直接嵌入交互过程。

英文摘要

Large language models (LLMs) have transformed how humans access information, but not how we reason with it. Their fluency accelerates consumption while bypassing the slow, reflective processes that underpin sound judgment. This paper introduces Relational Reflective Intelligence (RRI), an inference-time governance layer that operationalizes reflection through auditable reasoning loops. RRI operates not inside the model but around it, providing a practical structure for stable, auditable reasoning between humans and LLMs. The core premise is that LLMs inherit cognitive vulnerabilities similar to those that shape human thought: reliance on intuitive shortcuts, confusion between representation and reality, and a preference for coherence over falsification. When humans and models share these tendencies, their errors compound. We refer to this as relational drift, a failure that arises from interaction rather than from the model alone. Addressing this requires a shift from modeling relations between words to structuring relations between model outputs and human reasoning. RRI provides this missing layer through three components: the Rose-Frame, which identifies likely breakdowns in reasoning; the Architect's Pen, which introduces targeted reflection steps at critical moments; and an inference-time workflow that embeds these steps without retraining the model. Together, these elements transform human-AI interaction into a joint reasoning system with explicit checkpoints, conflict surfacing, and an auditable trail of assumptions. Rather than making machines think like humans or forcing humans to reason like machines, RRI creates a structured interaction in which both compensate for each other's limitations. It reframes AI safety as a cognitive architecture problem, where reliable decisions depend on embedding reflection directly into the interaction process.

2606.10120 2026-06-11 cs.IR cs.AI cs.HC 版本更新

MetaPlate: Counterfactual-Guided RAG-LLM Tool for Personalized Food Recommendation and Hyperglycemia Prevention

MetaPlate: 反事实引导的RAG-LLM工具用于个性化食物推荐和高血糖预防

Asiful Arefeen, Carol Johnston, Hassan Ghasemzadeh

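AI summary: Proposes MetaPlate, a framework that combines counterfactual explanations, machine learning prediction, and a RAG-LLM layer to generate personalized meal recommendations for preventing postprandial hyperglycemia. In an evaluation with registered dietitians, recommendations improved after prompt refinement.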

详情
AI中文摘要

餐后高血糖是代谢紊乱的关键风险因素;然而,现有的饮食指导通常是静态的、不切实际的且个性化不足,提供的建议难以遵循或效果不佳。尽管最近的进展利用连续血糖监测(CGM)和机器学习来预测血糖反应,但这些方法主要是预测性的,缺乏可操作的指导。此外,推荐系统常常与用户目标不一致,且需要大量输入。我们提出了MetaPlate,一个反事实解释(CF)引导的、上下文感知的决策支持框架,用于生成个性化膳食建议,以减轻健康成年人的餐后血糖波动。MetaPlate整合了多模态数据,包括来自25名个体的CGM读数、可穿戴设备衍生的生理信号以及用户提供的膳食输入,以建模餐前上下文。一个机器学习模型预测血糖反应,而CF优化模块通过调整膳食组成(修改宏量营养素数量)来维持血糖水平在目标范围内(≤140 mg/dL)。基于LLM的检索增强生成(RAG)层通过使用USDA食品数据库的约束搜索生成人类可读的建议,增强了可解释性。我们通过结构化的专家在环评估,与注册营养师(RDs)一起评估MetaPlate,比较提示优化前后的性能。结果显示,在膳食真实性、份量适宜性和推荐可能性方面有所改进,专家反馈表明从临床不可行的输出转向了可操作、上下文适宜的建议。我们的发现强调了领域知识和结构化约束在LLM驱动系统中的重要性,并突出了MetaPlate作为实时个性化膳食决策支持工具的潜力。

英文摘要

Postprandial hyperglycemia is a key risk factor for metabolic disorders; however, existing dietary guidance is often static, impractical, and insufficiently personalized, providing recommendations that are difficult to follow or not impactful. While recent advances leverage continuous glucose monitoring (CGM) and machine learning to predict glycemic responses, these approaches are largely predictive and lack actionable guidance. Moreover, recommendation systems are often misaligned with user goals and require extensive input. We present MetaPlate, a counterfactual explanation (CF) guided, context-aware decision-support framework that generates personalized meal recommendations to mitigate postprandial glucose excursions in healthy adults. MetaPlate integrates multimodal data from 25 individuals, including CGM readings, wearable-derived physiological signals, and user-provided meal inputs, to model pre-meal context. A machine learning model predicts glucose response, while a CF optimization module adjusts meal composition by modifying macronutrient amounts to maintain glucose levels within a target range ($\leq 140$ mg/dL). An LLM-based retrieval-augmented generation (RAG) layer enhances interpretability by producing human-readable recommendations using constrained search of the USDA food database. We evaluate MetaPlate via a structured expert-in-the-loop assessment with registered dietitians (RDs), comparing performance before and after prompt refinement. Results show improvements in meal realism, portion suitability, and recommendation likelihood, with expert feedback indicating a shift from clinically implausible outputs to actionable, contextually appropriate recommendations. Our findings emphasize the importance of domain knowledge and structured constraints in LLM-driven systems and highlight the potential of MetaPlate as a real-time personalized dietary decision-support tool.
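
The counterfactual step described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the linear predictor, its coefficients, the `Meal` type, and the greedy carbohydrate reduction are hypothetical stand-ins, not MetaPlate's model or optimizer. Only the 140 mg/dL target comes from the abstract.

```python
from dataclasses import dataclass

TARGET_MG_DL = 140.0  # upper bound of the target range stated in the abstract


@dataclass(frozen=True)
class Meal:
    carbs_g: float
    protein_g: float
    fat_g: float


def predict_peak_glucose(meal: Meal, baseline_mg_dl: float) -> float:
    """Toy stand-in for a learned glucose-response model (coefficients are invented)."""
    return baseline_mg_dl + 0.9 * meal.carbs_g - 0.2 * meal.protein_g - 0.1 * meal.fat_g


def counterfactual_meal(meal: Meal, baseline_mg_dl: float, step_g: float = 5.0) -> Meal:
    """Greedily lower carbohydrates until the predicted peak is within the target."""
    current = meal
    while predict_peak_glucose(current, baseline_mg_dl) > TARGET_MG_DL and current.carbs_g > 0:
        reduced_carbs = max(current.carbs_g - step_g, 0.0)
        current = Meal(reduced_carbs, current.protein_g, current.fat_g)
    return current
```

A real system would search over all macronutrients and penalize distance from the original meal; the greedy single-variable loop is only the simplest form of "smallest change that flips the prediction".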

2606.09174 2026-06-11 cs.HC 版本更新

Demonstrating chart-plot: Closing the Last Mile of Academic Chart Generation

展示chart-plot:弥合学术图表生成的最后一英里

Yinghao Tang, Yupeng Xie, Yingchaojie Feng, Jiale Lao, Tingfeng Lan, Wei Chen

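AI summary: Proposes chart-plot, a system that addresses the quality gap between generated chart code and publication-ready academic figures through style-aware code generation, a deployment-aware render loop, and a structured edit layer.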

详情
Comments
7 pages, 6 figures. Submitted to the VLDB ADS 2026 Workshop: The Joint Workshop on Agentic Data Systems and Data-Centric AI
AI中文摘要

大型语言模型可以将研究者的意图转化为可运行的matplotlib代码,但生成的图表很少能不经多轮手动修改就直接用于论文。我们认为,开放的问题不是图表代码生成,而是图表发表:使输出看起来像顶级会议中的图表,适应目标布局,并响应精确的作者编辑。我们提出了chart-plot,一个通过三个组件弥合这最后一英里的智能体框架:(1) 一个风格感知的代码生成器,基于从目标会议已接受图表中提炼的文本风格技能;(2) 一个部署感知的渲染循环,在目标LaTeX上下文中编译图表并反复修改直到满足布局约束;(3) 一个结构化编辑层,将每个图表元素暴露为可直接操作的控制点。我们报告了三种图表类型案例研究(分组柱状图、缩放折线图、配对分布图)的初步结果和一个小型用户研究。

英文摘要

Large language models can translate a researcher's intent into runnable matplotlib code, yet the resulting chart rarely lands in a paper without multiple rounds of manual revision. We argue that the open problem is not chart code generation but chart publication: making the output look like a top-venue figure, survive the target layout, and respond to precise author edits. We present chart-plot, an agentic harness that closes this last mile through three components: (1) a style-aware code generator conditioned on a textual style skill distilled from accepted figures at the target venue, (2) a deployment-aware render loop that compiles the chart inside the target LaTeX context and revises until layout constraints are met, and (3) a structured edit layer that exposes every chart element as a directly manipulable handle. We report early results on three chart-type case studies (grouped bar, scaling line, paired distributions) and a small user study.

2605.23130 2026-06-11 cs.HC cs.CR 版本更新

From Preventive to Reactive: How AI Coding Assistants Transform Developers' Security Awareness

从预防到反应:AI编程助手如何改变开发者的安全意识

Faisal Haque Bappy, Tahrim Hossain, Sidratul Muntaher Meheraj, Annoor Sharara Akhand, Tasfia Tabassum, Tarannum Shaila Zaman, Raiful Hasan, Tariqul Islam

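AI summary: Interviews with and observations of 15 professional developers show that AI coding assistants move security thinking from the coding stage to the review stage, shifting practice from preventive to reactive security and revealing a decoupling of security awareness from security behavior.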

详情
Comments
This paper has been accepted at the 2026 Symposium on Usable Privacy and Security (SOUPS)
AI中文摘要

AI编程助手现在已成为专业软件开发的核心,但它们如何影响开发者思考和实践安全仍知之甚少。虽然先前的工作记录了AI生成代码中的漏洞率,但一个更根本的问题依然存在:这些工具如何在真实、持续的开发实践中改变安全意识?我们对15名专业软件工程师进行了半结构化访谈,并观察他们在AI辅助下完成与安全相关的编码任务,这些工程师根据其专业形成过程中与AI工具的关系分为3个经验组。我们发现,AI编程助手重组而非消除安全思考,将其从编写代码的行为转移到审查代码的行为。这种从预防性安全到反应性安全的转变,在结构上受到将代码生成视为功能性任务的交互模型的鼓励,使安全成为事后考虑。值得注意的是,我们的编码会话参与者中没有人最初在提示中指定安全要求,即使他们拥有相关知识,这揭示了安全意识与安全行为的脱节。我们进一步记录了开发者独立发明的应对AI安全风险的非正式策略,这些策略均未得到当前工具或组织的支持,并发现经验组并不能可靠地预测安全表现。本文提供了一个基于实践的描述,说明AI辅助开发如何重塑安全编码的人为方面,为设计更注重安全的工具、培训计划和组织政策提供了实证基础。

英文摘要

AI coding assistants are now central to professional software development, yet their impact on how developers think about and practice security remains poorly understood. While prior work has documented vulnerability rates in AI-generated code, a more fundamental question persists: how do these tools transform security awareness in authentic, ongoing development practice? We conducted semi-structured interviews with 15 professional software engineers and observed them completing security-relevant coding tasks with AI assistance, spanning 3 experience cohorts defined by their relationship to AI tools during professional formation. We find that AI coding assistants reorganize rather than eliminate security thinking, shifting it from the act of writing code to the act of reviewing it. This transition from preventive to reactive security is structurally encouraged by interaction models that frame code generation as a functional task, leaving security as an afterthought. Notably, none of our coding session participants specified security requirements in their initial prompts, even when they possessed the relevant knowledge, revealing a decoupling of security awareness from security behavior. We further document informal coping strategies developers had independently invented to manage AI security risk, none of which are supported by current tools or organizations, and find that the experience cohort did not reliably predict security performance. This paper contributes a practice-grounded account of how AI-assisted development reshapes the human side of secure coding, offering empirical foundations for the design of more security-aware tools, training programs, and organizational policies.

2605.22509 2026-06-11 cs.HC cs.CL

Reflecti-Mate: A Conversational Agent for Adaptive Decision-Making Support Through System 1 and System 2 Thinking

Reflecti-Mate: 通过系统1和系统2思维实现自适应决策支持的对话代理

Morita Tarvirdians, Senthil Chandrasegaran, Hayley Hung, Catholijn M. Jonker, Catharine Oertel

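AI summary: Studies a conversational agent that encourages integration in decision-making by adapting to the individual's thought patterns. Compared with a baseline agent, it enabled more personalized reflective trajectories and elicited more integrative reflective language.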

详情
Journal ref
UMAP 2026: Proceedings of the 34th ACM Conference on User Modeling, Adaptation and Personalization
Comments
Accepted at UMAP 2026
AI中文摘要

在做出高风险个人决策时,涉及认知、情感和直觉过程,个体在这些模式间的注意力分配各不相同。整合这些过程已被证明有助于决策。然而,大多数现有决策支持系统主要支持认知方面,而非适应个体的思维特征以促进不同思维类型的整合。在本研究中,我们探讨了一种代理,旨在通过适应个体用户思维模式来促进整合。我们探讨了该代理对参与者对代理的看法及其反思行为的影响,与未受助的预反思和基线代理进行比较。在被试间研究(N=128)中,我们的代理促进了广泛且深入的思考,使参与者能够形成更个性化的反思轨迹,产生更多整合性的反思语言,并被感知为提供更强的全面反思支持。相比之下,基线代理产生了受认知语言主导的同质化特征。

英文摘要

Making high-stakes personal decisions involves cognitive, emotional, and intuitive processes, and individuals differ in how they allocate attention across these modes. Integration of these processes has been shown to benefit decision making. Yet, most current decision-support systems focus primarily on supporting cognitive aspects, rather than adapting to the individual's thinking profile to support integration of different types of thoughts. In this study, we investigate an agent designed to encourage integration by adapting to the individual user's thought patterns. We explore its effects on participants' perceptions of the agent and their reflective behavior, in comparison with unaided pre-reflection and a baseline agent. In a between-subjects study (N = 128), our agent, which fostered broad and elaborated thinking, enabled more personalized reflective trajectories, elicited more integrative reflective language, and was perceived as providing stronger support for holistic reflection. In contrast, the baseline agent produced homogenized profiles dominated by cognitive language across participants.

2605.10592 2026-06-11 cs.AI cs.HC cs.LG 版本更新

A Resilient Solution for Sewer Overflow Monitoring across Cloud and Edge

跨云和边缘的下水道溢流监测稳健解决方案

Vipin Singh, Tianheng Ling, Peter Ghaly, Felix Grimmeisen, Gregor Schiele, Felix Biessmann

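AI summary: Presents a web-based demonstrator that integrates deep learning forecasting in cloud and edge settings to predict the filling dynamics of overflow basins in aging combined sewer systems, with monitoring that stays resilient to network outages.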

详情
Comments
3 pages, 6 figures, accepted at 35th International Joint Conference on Artificial Intelligence 2026 (IJCAI-ECAI 2026), Demonstrations Track. URL: this https URL
AI中文摘要

许多历史城市的老化联合排水系统正因极端降雨事件而承受更大压力,可能引发联合排水溢流(CSO),对环境和公共健康造成严重影响。预测溢流池的填充动态对于预测容量超限并及时采取预防措施至关重要。我们提出一个基于网页的演示器(https://riwwer.demo.calgo-lab.de),将云和边缘环境中的深度学习预测方法整合到交互式监控仪表板中,以实现溢流监控的网络中断鲁棒性。一个视频演示可在在线(https://cloud.bht-berlin.de/index.php/s/b9xt4T3SdiLBiFZ)获取。

英文摘要

Aging combined sewer systems in many historical cities are increasingly stressed by extreme rainfall events, which can trigger combined sewer overflows (CSO) with significant environmental and public health impacts. Forecasting the filling dynamics of overflow basins is critical for anticipating capacity exceedance and enabling timely preventive actions for CSO. We present a web-based demonstrator that integrates Deep Learning forecasting methods in both cloud and edge settings into an interactive monitoring dashboard for overflow monitoring, resilient to network outages. A video showcase is available online ( this https URL ).

2412.01459 2026-06-11 cs.CY cs.AI cs.HC

Perception Gaps in Risk, Benefit, and Value Between Experts and Public Challenge Socially Accepted AI

Philipp Brauner, Felix Glawe, Gian Luca Liehner, Luisa Vervier, Martina Ziefle

详情
Journal ref
AI & Society (2026)
英文摘要

Artificial Intelligence (AI) is reshaping many societal domains, raising critical questions about its risks, benefits, and the potential misalignment between public and academic perspectives. This study examines how the general public (N=1110) -- individuals who interact with or are impacted by AI technologies -- and academic AI experts (N=119) -- those elites shaping AI development -- perceive AI's capabilities and impact across 71 scenarios. These scenarios span domains such as sustainability, healthcare, job performance, societal inequality, art, and warfare. Participants evaluated these scenarios across four dimensions using the psychometric model: likelihood, perceived risk and benefit, and overall value (or sentiment). The results suggest significant differences: experts consistently anticipate higher probabilities, perceive lower risks, report greater benefits, and express more positive sentiment toward AI compared to the non-experts. Moreover, both groups apply different weighting schemes: experts discount risk more heavily relative to benefit than non-experts. Visual mappings of these evaluations uncover areas of convergent evaluation (e.g., AI performing medical diagnoses or criminal use) as well as tension points (e.g., decision of legal cases, political decision making), highlighting areas where communication and policy interventions may be needed. These findings underscore a critical translational challenge: if AI research and deployment are to align with societal priorities, the perception gap between developers and the public must be better understood and addressed. Our results provide an empirical foundation for value-sensitive AI governance and trust-building strategies across stakeholder groups.

2604.22497 2026-06-11 cs.HC math.NA 版本更新

Catheter Monitoring in Intelligent Endovascular Navigation Systems: Interactive Simulations and Mixed Reality for Enhanced Navigational Awareness

智能血管内导航系统中的导管监测:用于增强导航意识的交互式模拟与混合现实

Veronica Ruozzi, Giovanni Battista Regazzo, Maria Chiara Palumbo, Wim-Alexander Beckers, Mouloud Ourak, Xiu Zhang, Francesca Perico, Alessandro Caimi, Emmanuel Vander Poorten, Emiliano Votta

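AI summary: Proposes a framework that combines real-time catheter shape reconstruction, interactive simulation, and mixed reality visualization, monitoring catheter-vessel interactions with a finite element model and sensor data. In in-vitro tests, the median vessel wall displacement error stayed below 2.33 mm.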

详情
Comments
Int J CARS (2026)
AI中文摘要

目的:开发并测试一个集成实时导管形状重建、交互式模拟和混合现实可视化的框架,以在血管内导航过程中精确监测导管-血管相互作用。方法:基于CT数据生成从右股静脉到下腔静脉的静脉通路有限元模型,并实现为交互式模拟。将导管运动作为边界条件,采用拉格朗日乘子法对导管-血管接触进行建模以计算血管变形。使用带有光纤布拉格光栅和电磁传感器的传感导管,在硅胶血管解剖复制品中推进进行体外测试。实时传感器读数输入模拟,更新的导管和血管几何形状流式传输至Hololens 2。通过立体帧三角测量获得的实验真值验证有限元计算的血管壁位移的性能和准确性。结果:在初始导航阶段,模拟时间超出实际时间12%;当导管到达最曲折部分时超出45%。Hololens 2渲染稳定在35-40帧/秒。这两个阶段中,有限元计算与真值之间的血管壁位移中位相对误差分别低于1毫米和2.33毫米。结论:该研究证明了将交互式生物力学模拟与实时传感器数据集成以实现导管-血管相互作用的连续监测的可行性,混合现实可视化作为用户界面支持操作员决策。

英文摘要

Purpose: Developing and testing a framework that integrates real-time catheter shape reconstruction, interactive simulations, and mixed reality visualization to enable accurate monitoring of catheter-vessel interactions during endovascular navigation. Methods: A finite element model (FEM) of the venous pathway from the right femoral vein to the inferior vena cava was generated from computed tomography data and implemented in an interactive simulation. Catheter motion was imposed as a boundary condition, and catheter-vessel contact was modeled with a Lagrange multiplier formulation to compute vessel deformation. The framework was tested in vitro using a sensorized catheter with Fiber Bragg Grating and electromagnetic sensors as it was advanced through a silicone replica of the vascular anatomy. Real-time sensor read-outs fed the simulation, and the updated catheter and vessel geometries were streamed to Hololens 2. The performance and accuracy of FEM-computed vessel wall displacement were validated against experimental ground truth obtained via stereo-frame triangulation. Results: The simulated time exceeded the real temporal extent by 12% during initial navigation and by 45% when the catheter reached the most tortuous portion. Hololens 2 rendering remained stable at 35-40 frames per second. The median relative displacement error between FEM-computed and ground-truth vessel wall displacements remained below 1 mm and 2.33 mm for these two phases, respectively. Conclusion: The study demonstrates the feasibility of integrating interactive biomechanical simulation with real-time sensor data to enable continuous monitoring of catheter-vessel interactions, with mixed reality visualization serving as a user interface to support operator decision-making.

2604.06911 2026-06-11 cs.HC

Physics-driven Sonification for Improving Multisensory Needle Guidance in Percutaneous Epicardial Access

物理驱动声化改善经皮心外膜入路中的多感官针引导

Veronica Ruozzi, Sasan Matinfar, Pasquale Vergara, Alessandro Albanesi, Serena Dell'Aversana, Stefano Carugo, Gianluigi Buccoliero, Nassir Navab, Alberto Redaelli, Emiliano Votta

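AI summary: To ease needle-tip localization during percutaneous epicardial access on a beating heart under fluoroscopy, the paper proposes an extended reality multisensory navigation method based on physics-driven sonification. It combines dynamic cardiac anatomy reconstructed from 4D CTA with real-time needle tracking and auditory encoding from a multilayer physical membrane model. A phantom study showed improved navigation safety and accuracy and lower cognitive load.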

详情
Journal ref
in IEEE Access, vol. 14, pp. 80371-80385, 2026
Comments
This work has been submitted to the IEEE for possible publication
AI中文摘要

经皮心外膜入路(PEA)在透视下对跳动心脏进行,可实现心律失常治疗。然而,将针推进至薄且移动的心包仍极具挑战性和风险。为解决此问题,我们提出一种基于物理驱动声化的扩展现实(XR)多感官导航方法,以增强用户在PEA关键针尖着陆阶段的感知。重建来自4D CTA的动态心脏解剖结构,并将其配准到真实世界坐标系。实时针追踪提供针尖相对于移动心脏结构的位置,并驱动视听反馈模块。视觉显示呈现导航提示和动态解剖,而听觉显示使用多层物理膜模型编码生理心脏状态。进行了一项体模研究,十二名心脏病专家在仅视觉和多感官反馈下执行针穿刺。多感官方法显著提高了导航安全性(χ²=11.30,p<0.01),减少了心肌接触(3.64% vs. 7.27%),并增加了正确入路(90.91% vs. 52.73%)。针放置精度提高,膜接近度更近(Cliff delta=0.19),变异性降低(p<0.05)。执行时间相当,而时间-精度相关性在两种模态间显著不同(p<0.01)。NASA-TLX显示多感官引导下认知负荷较低(p<0.01)。这些结果证明了物理驱动声化改善时空感知和支持以用户为中心的手术导航的可行性。

英文摘要

Percutaneous epicardial access (PEA), performed on a beating heart under fluoroscopy, enables arrhythmia treatment. However, advancing a needle toward the thin and moving pericardium remains highly challenging and risky. To address this problem, we present a physics-driven sonification method for Extended Reality (XR)-based multisensory navigation to enhance user perception during the critical needle landing phase in PEA. Dynamic cardiac anatomy from 4D CTA was reconstructed and registered to a real-world coordinate system. Real-time needle tracking provided the position of the needle tip relative to moving cardiac structures and drove an audio-visual feedback module. The visual display presented navigational cues and dynamic anatomy, while the auditory display encoded physiological cardiac states using a multilayer physical membrane model. A phantom study was conducted with twelve cardiologists performing needle insertions under visual-only and multisensory feedback. The multisensory method significantly improved navigation safety ($χ^2 = 11.30$, $p < 0.01$), reducing myocardial contact (3.64% vs. 7.27%) and increasing correct access (90.91% vs. 52.73%). Needle placement accuracy improved, with closer membrane proximity (Cliff delta = 0.19) and reduced variability ($p < 0.05$). Execution time was comparable, while time-accuracy correlations differed significantly between modalities ($p < 0.01$). NASA-TLX indicated lower cognitive load with multisensory guidance ($p < 0.01$). These results demonstrate the feasibility of physics-driven sonification for improving spatiotemporal awareness and supporting user-centered surgical navigation.
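
The abstract reports an effect size as Cliff's delta (0.19). For readers unfamiliar with the statistic, a minimal sketch of its standard definition follows: the share of cross-group pairs where the first sample is larger, minus the share where it is smaller. The function name is hypothetical and the study's own data are not reproduced here.

```python
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    # Ties contribute to neither count, so they pull the value toward zero.
    return (greater - less) / (len(xs) * len(ys))
```

The value ranges from -1 to 1, with 0 meaning the two groups overlap completely in rank terms.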

2601.11128 2026-06-11 cs.SI cs.HC cs.IR 版本更新

The Big Ban Theory: A Pre- and Post-Intervention Dataset of Online Content Moderation Actions

大封禁理论:在线内容审核行为的前后干预数据集

Aldo Cerulli, Lorenzo Cima, Benedetta Tessa, Serena Tardelli, Stefano Cresci

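AI summary: To address the lack of comprehensive datasets for studying moderation interventions on online platforms, the authors build a dataset covering 25 interventions on Reddit and Voat, with over 339K users and nearly 39M messages. It provides standardized metadata and pseudonymized user activity data to support comparable analyses of intervention effects.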

详情
Comments
Article published in ICWSM'26 - 20th AAAI Conference on Web and Social Media. Please, cite the published version
AI中文摘要

在线平台依赖审核干预来遏制仇恨言论、毒性以及错误和虚假信息的传播等有害行为。然而,关于此类干预的效果和潜在偏见的研究面临多重限制。例如,由于缺乏全面的数据集,现有工作通常只关注单一或少数干预措施。因此,研究人员通常需要为每项新研究收集必要的数据,这限制了系统比较的机会。为了克服这些挑战,我们引入了大封禁理论(TBBT)——一个大型的审核干预数据集。TBBT涵盖了25种不同类型、严重程度和范围的干预措施,总计包括Reddit和Voat上的超过33.9万用户和近3900万条发布的消息。对于每次干预,我们提供标准化的元数据和在干预实施前后三个月收集的假名化用户活动数据,从而能够对干预效果进行一致且可比较的分析。此外,我们提供了数据集的描述性探索性分析,以及几个用例说明它如何支持内容审核研究。通过这个数据集,我们旨在支持研究审核干预效果的研究人员,并促进更系统、可重复和可比较的研究。

英文摘要

Online platforms rely on moderation interventions to curb harmful behavior such as hate speech, toxicity, and the spread of mis- and disinformation. Yet research on the effects and possible biases of such interventions faces multiple limitations. For example, existing works frequently focus on single or a few interventions, due to the absence of comprehensive datasets. As a result, researchers must typically collect the necessary data for each new study, which limits opportunities for systematic comparisons. To overcome these challenges, we introduce The Big Ban Theory (TBBT) -- a large dataset of moderation interventions. TBBT covers 25 interventions of varying type, severity, and scope, comprising in total over 339K users and nearly 39M posted messages on Reddit and Voat. For each intervention, we provide standardized metadata and pseudonymized user activity collected three months before and after its enforcement, enabling consistent and comparable analyses of intervention effects. In addition, we provide a descriptive exploratory analysis of the dataset, along with several use cases of how it can support research on content moderation. With this dataset, we aim to support researchers studying the effects of moderation interventions and to promote more systematic, reproducible, and comparable research.
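
The pre- and post-intervention design described in this abstract can be sketched as a simple windowing step. This is an illustration only: the function name is hypothetical, "three months" is approximated as 90 days, and the dataset's actual schema and window boundaries are not given in the abstract.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=90)  # assumption: "three months" approximated as 90 days


def split_pre_post(
    timestamps: list[datetime], intervention: datetime
) -> tuple[list[datetime], list[datetime]]:
    """Split message timestamps into pre- and post-intervention windows."""
    pre = [t for t in timestamps if intervention - WINDOW <= t < intervention]
    post = [t for t in timestamps if intervention <= t <= intervention + WINDOW]
    return pre, post
```

Using windows of equal length on both sides keeps activity counts directly comparable before and after an intervention.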

2601.18934 2026-06-11 cs.HC cs.MM 版本更新

Whispering Water: Materializing Human-AI Dialogue as Interactive Ripples

低语之水:将人机对话物化为交互式涟漪

Ruipeng Wang, Tawab Safi, Yunge Wen, Christina Cunningham, Hoi Ling Tang, Behnaz Farahi

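AI summary: Materializes human-AI dialogue on a physical water surface by translating speech sentiment into excitation frequencies, feeding semantic content into a multi-agent LLM system, and decomposing synthesized speech into harmonic components via logarithmic spacing and Bark-scale mapping.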

详情
AI中文摘要

水在不同文化中长久以来一直作为人类忏悔的接受者。我们呈现《低语之水》,一个通过水面上声波图案物化人机对话的互动装置。参与者向水面忏悔,触发一个四阶段仪式:忏悔、沉思、回应和释放。语音情感被转换为激发频率,调节水的物理状态,而语义内容进入一个由异构LLM组成的多智能体系统,其身份通过情境对话涌现。一种新颖的算法通过对数间距和Bark尺度映射将合成语音分解为谐波分量,将机器声音重构为物理波叠加。该装置通过感官丰富、仪式化框架下的人机互动探索情感自我探索。

英文摘要

Water has long served as a recipient of human confession across cultures. We present Whispering Water, an interactive installation that materializes human-AI dialogue through cymatic patterns on water. Participants confess to a water surface, triggering a four-phase ritual: confession, contemplation, response, and release. Speech sentiment is translated into excitation frequencies that prime the water's physical state, while semantic content enters a multi-agent system of heterogeneous LLMs whose identities emerge through situated discourse. A novel algorithm decomposes synthesized speech into harmonic components via logarithmic spacing and Bark-scale mapping, reconstructing machine voices as physical wave superpositions. The installation explores emotional self-exploration through sensory-rich, ritually framed human-AI interaction.
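
The abstract does not specify how the logarithmic spacing and Bark-scale mapping are implemented. As a rough illustration of the two ingredients, the sketch below generates log-spaced frequencies and converts each to Bark using Traunmüller's (1990) approximation, one of several published Bark formulas. The function names and the pairing of the two steps are assumptions, not the installation's algorithm.

```python
def hz_to_bark(freq_hz: float) -> float:
    """Traunmueller's (1990) approximation of the Bark scale."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def log_spaced_frequencies(f_min: float, f_max: float, count: int) -> list[float]:
    """Return `count` frequencies spaced evenly on a logarithmic axis."""
    if count < 2 or f_min <= 0 or f_max <= f_min:
        raise ValueError("need count >= 2 and 0 < f_min < f_max")
    ratio = f_max / f_min
    return [f_min * ratio ** (k / (count - 1)) for k in range(count)]


def bark_bands(f_min: float, f_max: float, count: int) -> list[tuple[float, float]]:
    """Pair each log-spaced frequency (Hz) with its Bark value."""
    return [(f, hz_to_bark(f)) for f in log_spaced_frequencies(f_min, f_max, count)]
```

Log spacing places equal numbers of components per octave, while the Bark value indicates which perceptual critical band each component falls into.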

2601.14764 2026-06-11 cs.AI cs.HC cs.LO 版本更新

An XAI View on Explainable ASP: Methods, Systems, and Perspectives

可解释ASP的XAI视角:方法、系统与展望

Thomas Eiter, Tobias Geibinger, Zeynep G. Saribatur

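AI summary: Surveys explanation approaches for Answer Set Programming (ASP) from an XAI perspective, categorizing explanation types, assessing their coverage by existing theory and tools, and identifying research gaps and future directions.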

详情
Comments
10 pages
AI中文摘要

回答集编程(ASP)是符号AI中一种流行的声明式推理和问题解决方法。其基于规则的形式化使其天生具有可解释和解释性推理的吸引力,随着可解释AI(XAI)的兴起,这一点日益重要。目前已经开发了许多针对ASP的解释方法和工具,它们通常处理特定的解释设置,可能无法覆盖ASP用户遇到的所有场景。在本综述中,我们从XAI视角出发,概述了与用户解释问题相关的ASP解释类型,并描述了当前理论和工具对其的覆盖情况。此外,我们指出了现有ASP解释方法中的空白,并确定了未来工作的研究方向。

英文摘要

Answer Set Programming (ASP) is a popular declarative reasoning and problem solving approach in symbolic AI. Its rule-based formalism makes it inherently attractive for explainable and interpretive reasoning, which is gaining importance with the surge of Explainable AI (XAI). A number of explanation approaches and tools for ASP have been developed, which often tackle specific explanatory settings and may not cover all scenarios that ASP users encounter. In this survey, we provide, guided by an XAI perspective, an overview of types of ASP explanations in connection with user questions for explanation, and describe their coverage by current theory and tools. Furthermore, we pinpoint gaps in existing ASP explanation approaches and identify research directions for future work.

2508.20464 2026-06-11 cs.HC 版本更新

Human-Centered Design for Connected Automation: Predicting Pedestrian Crossing Intentions

面向互联自动化的人本设计:预测行人过街意图

Sanaz Motamedi, Viktoria Marcus, Griffin Pitts

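AI summary: Extends the Theory of Planned Behavior (TPB) to study the factors that shape pedestrians' crossing decisions when interacting with level-5 automated driving systems. Perceived safety and understanding have the strongest effects on the TPB constructs, which informs the design of eHMIs and V2X communication strategies.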

详情
AI中文摘要

全球每年119万交通死亡事故中,超过一半涉及行人等弱势道路使用者,其中很大比例归因于人为错误。L5级自动驾驶系统(ADS)有潜力减少此类事故;然而,其有效性不仅取决于自动化性能,还取决于在缺乏传统驾驶员提示的情况下,系统传达意图并与行人安全协调的能力。本研究旨在通过将计划行为理论(TPB)与安全性、信任、兼容性和理解性相结合,对涉及L5级ADS的道路过街场景中行人的决策过程进行建模。一项在线调查(n=212)发现,感知行为控制、态度和社会信息显著影响行人的过街意图,其中感知安全性和理解性对TPB构念的影响最强。研究结果为设计eHMI和协作式V2X通信策略提供了指导,以促进安全的行人-ADS交互,并推进自动驾驶汽车的人本设计。

英文摘要

More than half of the 1.19 million annual traffic fatalities globally involve vulnerable road users, such as pedestrians, with a significant proportion attributable to human error. Level-5 automated driving systems (ADSs) have the potential to reduce these incidents; however, their effectiveness depends not only on automation performance but also on their ability to communicate intent and coordinate safely with pedestrians in the absence of traditional driver cues. This study aims to model pedestrian decision-making in road-crossing scenarios involving level-5 ADSs by extending the Theory of Planned Behavior (TPB) with safety, trust, compatibility, and understanding. An online survey (n = 212) found that perceived behavioral control, attitude, and social information significantly influence pedestrians' crossing intentions, with perceived safety and understanding having the strongest effects on the TPB constructs. The results offer guidance for designing eHMIs and cooperative V2X communication strategies that promote safe pedestrian-ADS interactions and advance human-centered design for autonomous vehicles.

2510.11290 2026-06-11 cs.AI cs.HC

Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics

Sheng Jin, Haoming Wang, Zhiqi Gao, Yongbo Yang, Bao Chunjia, Chengliang Wang

详情
Journal ref
Findings of the Association for Computational Linguistics: EMNLP 2025
Comments
9 pages, 7 figures, EMNLP conference
英文摘要

Large language model (LLM)-based agents are increasingly pivotal in simulating and understanding complex human systems and interactions. We propose the AI-Agent School (AAS) system, built around a self-evolving mechanism that leverages agents for simulating complex educational dynamics. To address the fragmentation of teaching-process modeling and the limited performance of agents in simulating diverse educational participants, AAS constructs the Zero-Exp strategy and employs a continuous "experience-reflection-optimization" cycle, grounded in a dual memory base that comprises experience and knowledge bases and incorporates short-term and long-term memory components. Through this mechanism, agents autonomously evolve via situated interactions within diverse simulated school scenarios. This evolution enables agents to more accurately model the nuanced, multi-faceted teacher-student engagements and underlying learning processes found in physical schools. Experiments confirm that AAS can effectively simulate intricate educational dynamics and foster advanced agent cognitive abilities, providing a foundational stepping stone from the "Era of Experience" to the "Era of Simulation" by generating high-fidelity behavioral and interaction data.