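arXivDaily: arXiv daily academic digest, updated Monday through Friday

AI large models

Large language models / LLM

Large language models, pre-training, instruction fine-tuning, post-training, and language model applications.

Included for today / current date: 19. Signal sources: cs.CL, cs.AI, cs.LG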
2512.06899 2026-06-19 cs.CR Updated version 85%

Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models

Tianhang Zhao, Haodong Zhao, Wei Du, Pengzhou Cheng, Junxian Li, Sufeng Duan, Haojin Zhu, Gongshen Liu

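Topic match (Other LLM): A defense framework against backdoor attacks on pre-trained language models, relevant to LLM security.

AI summary: Targets the security threat that transferable backdoors pose to the pre-trained language model supply chain. Proposes the Patronus defense framework, which uses input-side invariance detection and a dual-stage mitigation strategy to reach ≥98.3% backdoor detection recall across 15 models and 9 tasks.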

Comments Work in progress

Details
Abstract

The "Pre-train, then fine-tune" paradigm has revolutionized Natural Language Processing (NLP). In this context, transferable backdoors pose a severe threat to the supply chain of Pre-trained Language Models (PLMs), yet defensive research remains nascent, relying primarily on detecting anomalies in the output feature space. We identify a critical flaw: fine-tuning on downstream tasks inevitably modifies model parameters, shifting the output distribution and rendering pre-computed defenses ineffective. To address this, we propose Patronus, a novel defense framework that shifts the defensive focus from output features to input-side invariance, exploiting the fact that adversarial triggers remain constant even as model weights change. To overcome the convergence challenges of discrete text optimization, Patronus introduces a multi-trigger contrastive search algorithm that effectively bridges gradient-based optimization with contrastive learning objectives. Furthermore, we employ a dual-stage mitigation strategy combining real-time input monitoring with model purification via adversarial training. Extensive experiments across 15 PLMs and nine tasks demonstrate that Patronus achieves $\geq 98.3\%$ backdoor detection recall and reduces attack success rates to the level of clean settings, significantly outperforming all state-of-the-art baselines in all settings. Code is available at https://github.com/zth855/Patronus.

2603.25702 2026-06-19 cs.CL Updated version 80%

S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

Ligong Han, Hao Wang, Han Gao, Kai Xu, Akash Srivastava

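Affiliations: Red Hat AI Innovation; MIT-IBM Watson AI Lab; Iowa State University; Core AI, IBM

Topic match (Other LLM): Decoding acceleration for diffusion LLMs, a language-model method.

AI summary: Proposes S2D2, a training-free self-speculative decoding framework. Because a block-diffusion model becomes autoregressive when the block size is 1, the same pretrained model can serve as both drafter and verifier, improving the accuracy-speed tradeoff without additional training.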

Comments Code is available at https://github.com/phymhan/S2D2

Details
Abstract

Block-diffusion language models offer a promising path toward faster-than-autoregressive generation by combining block-wise autoregressive decoding with within-block parallel denoising. However, in the few-step regime needed for practical acceleration, standard confidence-thresholded decoding is often brittle: aggressive thresholds hurt quality, while conservative thresholds require unnecessary denoising steps. Existing approaches that address this issue either require additional training or incur extra test-time compute. We present S2D2, a training-free self-speculative decoding framework for block-diffusion language models. Our key observation is that a block-diffusion model becomes autoregressive when the block size is reduced to one, allowing the same pretrained model to act as both drafter and verifier. S2D2 inserts a speculative verification step into standard block-diffusion decoding and uses lightweight routing policies to decide when verification is worth its cost. This yields a hybrid decoding trajectory in which diffusion proposes tokens in parallel, while the autoregressive mode acts as a local sequence-level critic. Across three mainstream block-diffusion families, S2D2 consistently improves the accuracy-speed tradeoff over strong confidence-thresholding baselines. On SDAR, we observe up to $4.7\times$ speedup over autoregressive decoding, and up to $1.57\times$ over a tuned dynamic decoding baseline while improving accuracy by up to $4.5$ points. On LLaDA2.1-Mini, S2D2 remains complementary to built-in self-correction, including a conservative setting where it is $4.4\times$ faster than the static baseline with slightly higher accuracy.
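The drafter/verifier interplay described in this abstract can be illustrated with toy stand-ins. The sketch below assumes a greedy acceptance rule (keep the longest prefix of the drafted block on which the autoregressive check agrees, then substitute the verifier's token); the abstract does not state S2D2's actual acceptance rule or routing policies, and `verify_block` and `toy_next_token` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Sequence


def verify_block(
    prefix: Sequence[int],
    draft: Sequence[int],
    next_token: Callable[[Sequence[int]], int],
) -> List[int]:
    """Greedy speculative verification (an assumed rule, not taken from the paper).

    Walk the drafted block left to right and accept tokens while the
    autoregressive check agrees. At the first disagreement, keep the
    verifier's token instead and stop.
    """
    accepted: List[int] = []
    context = list(prefix)
    for token in draft:
        expected = next_token(context)
        if token != expected:
            accepted.append(expected)
            break
        accepted.append(token)
        context.append(token)
    return accepted


def toy_next_token(context: Sequence[int]) -> int:
    """Hypothetical stand-in for the model run in block-size-one (autoregressive) mode."""
    return (sum(context) + 1) % 7


# The first two drafted tokens agree with the verifier; the third is replaced.
accepted = verify_block(prefix=[1, 2], draft=[4, 1, 5], next_token=toy_next_token)
print(accepted)
```

The loop calls the verifier once per token only to keep the toy readable; a practical implementation would presumably score the whole drafted block in a single forward pass.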

2603.16606 2026-06-19 cs.CL Updated version 80%

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Omnilingual SONAR Team, João Maria Janeiro, Pere-Lluís Huguet Cabot, Ioannis Tsiamas, Yen Meng, Vivek Iyer, Guillem Ramírez, Loic Barrault, Belen Alastruey, Xiang "Tony" Cao, Yu-An Chung, Marta R. Costa-Jussa, David Dale, Kevin Heffernan, Jaehyeong Jo, Artyom Kozhevnikov, Alexandre Mourachko, Christophe Ropers, Holger Schwenk, Paul-Ambroise Duquenne

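Affiliations: FAIR at Meta

Topic match (Other LLM): Cross-lingual, cross-modal sentence embedding models.

AI summary: Proposes OmniSONAR, which uses progressive training and teacher-student distillation to embed text, speech, code, and mathematical expressions in a single semantic space across thousands of languages, substantially reducing error on cross-lingual retrieval and translation tasks and supporting zero-shot speech translation.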

Details
Abstract

Cross-lingual sentence encoders typically cover only a few hundred languages and often trade downstream quality for stronger alignment, limiting their adoption. We introduce OmniSONAR, a new family of omnilingual, cross-lingual and cross-modal sentence embedding models that natively embed text, speech, code, and mathematical expressions in a single semantic space, while delivering state-of-the-art downstream performance at the scale of thousands of languages, from high-resource to extremely low-resource varieties. To reach this scale without representation collapse, we use progressive training. We first learn a strong foundational space for 200 languages with an LLM-initialized encoder-decoder, combining token-level decoding with a novel split-softmax contrastive loss and synthetic hard negatives. Building on this foundation, we expand to several thousand language varieties via a two-stage teacher-student encoder distillation framework. Finally, we demonstrate the cross-modal extensibility of this space by seamlessly mapping 177 spoken languages into it. OmniSONAR halves cross-lingual similarity search error on the 200-language FLORES dataset and reduces error by a factor of 15 on the 1,560-language BIBLE benchmark. It also enables strong translation, outperforming NLLB-3B on multilingual benchmarks and exceeding prior models (including much larger LLMs) by 15 chrF++ points on BIBLE translation from 1,560 languages into English. OmniSONAR also performs strongly on MTEB and XLCoST. For speech, OmniSONAR achieves a 43% lower similarity-search error and reaches 97% of SeamlessM4T speech-to-text quality, despite being zero-shot for translation (trained only on ASR data). Finally, by training an encoder-decoder LM, Spectrum, exclusively on English text processing OmniSONAR embedding sequences, we unlock high-performance transfer to thousands of languages and speech for complex downstream tasks.
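For readers unfamiliar with the "similarity search error" reported here, the following is a minimal sketch of one common way such a number can be computed. It assumes plain cosine similarity and one-to-one aligned rows (row i of `src` translates row i of `tgt`); the abstract does not specify the exact scoring used in the paper, and the toy vectors are made up for illustration.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def search_error_rate(src: List[Sequence[float]], tgt: List[Sequence[float]]) -> float:
    """Fraction of source embeddings whose nearest target is not the aligned one."""
    errors = 0
    for i, s in enumerate(src):
        nearest = max(range(len(tgt)), key=lambda j: cosine(s, tgt[j]))
        if nearest != i:
            errors += 1
    return errors / len(src)


# Toy embeddings (assumed data): the third source sentence retrieves the wrong target.
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[1.0, 0.1], [0.1, 1.0], [1.0, -0.5]]
error = search_error_rate(src, tgt)
print(f"similarity-search error: {error:.3f}")
```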

2512.03818 2026-06-19 cs.CL Updated version 80%

Improving Alignment Between Human and Machine Codes: An Empirical Assessment of Prompt Engineering for Construct Identification in Psychology

Kylie L. Anglin, Stephanie Milan, Brittney Hernandez, Claudia Ventura

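Affiliations: Department of Educational Psychology, Neag School of Education, University of Connecticut; Department of Psychological Sciences, College of Liberal Arts and Sciences, University of Connecticut

Topic match (Other LLM): Prompt engineering to optimize LLM identification of constructs in psychological texts.

AI summary: Presents an empirical framework that uses prompt engineering to optimize how well large language models identify constructs in psychological texts. Experiments on five prompting strategies find that the construct definition and task framing matter most, and that a few-shot approach combining codebook-guided prompt selection with automatic prompt engineering comes closest to expert judgment.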

Comments 22 pages, 2 figures

Details
Abstract

Due to their architecture and vast pre-training data, large language models (LLMs) demonstrate strong text classification performance. However, LLM output (here, the category assigned to a text) depends heavily on the wording of the prompt. While literature on prompt engineering is expanding, few studies focus on classification tasks, and even fewer address domains like psychology, where constructs have precise, theory-driven definitions that may not be well represented in pre-training data. We present an empirical framework for optimizing LLM performance for identifying constructs in texts via prompt engineering. We experimentally evaluate five prompting strategies (codebook-guided empirical prompt selection, automatic prompt engineering, persona prompting, chain-of-thought reasoning, and explanatory prompting) with zero-shot and few-shot classification. We find that persona, chain-of-thought, and explanations do not fully address performance loss accompanying a badly worded prompt. Instead, the most influential features of a prompt are the construct definition, task framing, and, to a lesser extent, the examples provided. Across three constructs and two models, the classifications most aligned with expert judgments resulted from a few-shot prompt combining codebook-guided empirical prompt selection with automatic prompt engineering. Based on our findings, we recommend that researchers generate and evaluate as many prompt variants as feasible, whether human-crafted, automatically generated, or ideally both, and select prompts and examples based on empirical performance in a training dataset, validating the final approach in a holdout set. This procedure offers a practical, systematic, and theory-driven method for optimizing LLM prompts in settings where alignment with expert judgment is critical.
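The recommended procedure (score many prompt variants on a training set, pick the best, then validate once on a holdout set) can be sketched as follows. This is a minimal illustration under stated assumptions: agreement is measured as simple accuracy against expert codes (the paper may use a different metric), `toy_classify` is a hypothetical stand-in for an LLM call, and the prompts and texts are invented.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]  # (text, expert-assigned code)
Classifier = Callable[[str, str], int]  # (prompt, text) -> predicted code


def agreement(classify: Classifier, prompt: str, data: List[Example]) -> float:
    """Share of texts for which the model's code equals the expert's code."""
    hits = sum(1 for text, label in data if classify(prompt, text) == label)
    return hits / len(data)


def select_prompt(
    classify: Classifier,
    prompts: List[str],
    train: List[Example],
    holdout: List[Example],
) -> Dict[str, object]:
    """Pick the prompt with the best training agreement, then score it once on holdout."""
    best = max(prompts, key=lambda p: agreement(classify, p, train))
    return {
        "prompt": best,
        "train": agreement(classify, best, train),
        "holdout": agreement(classify, best, holdout),
    }


def toy_classify(prompt: str, text: str) -> int:
    """Hypothetical stand-in for an LLM: only a prompt with a definition reads the text."""
    if "definition" in prompt:
        return int("worry" in text)
    return 1


prompts = ["Code this text.", "Using this definition of anxiety, code this text."]
train = [("I worry a lot", 1), ("I slept well", 0), ("constant worry", 1), ("went hiking", 0)]
holdout = [("worry keeps me up", 1), ("ate lunch", 0)]
result = select_prompt(toy_classify, prompts, train, holdout)
print(result)
```

Keeping the holdout score out of the selection step is the point of the design: the holdout number then estimates how the chosen prompt behaves on unseen texts.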

2606.06971 2026-06-19 cs.MA cs.SI Updated version 70%

Modeling U.S. Attitudes Toward China via an Event-Steered Multi-Agent Simulator

Chenxu Zhu, Hantao Yao, Wu Liu, Junbo Guo, Yongdong Zhang

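Topic match (Other LLM): LLM-based multi-agent simulation that drives opinion evolution.

AI summary: Proposes an Event-Steered Multi-Agent Simulator (ES-MAS) that uses the CURE dataset, a Dual-Stream Data Integration Engine (DSDIE), and a News-Driven Dynamic Interaction (NDDI) module to simulate the dynamic evolution of U.S. public opinion toward China. Experiments show that it outperforms existing simulators.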

Details
Abstract

Understanding the dynamic evolution of opinions, such as U.S. public attitudes toward China, is essential for assessing geopolitical risks. However, existing LLM-based multiagent simulators predominantly rely on static rules and fixed datasets, limiting their ability to capture the dynamic, event-driven nature of macro-level opinion shifts in real-world settings. To address this limitation, we propose an Event-Steered Multi-Agent Simulator (ES-MAS), in which significant events and daily news continuously drive opinion evolution through dynamic interactions among agents. We first construct the China-U.S. Relation Evolution (CURE) dataset, covering 20 quarters from 2021 to 2025, including 258 major events and over 14,000 daily news articles, and providing a comprehensive temporal foundation for modeling opinion dynamics. Building upon the CURE dataset, we propose a Dual-Stream Data Integration Engine (DSDIE) that aligns simulations with historical timelines via macro-level events while enabling personalized information exposure based on individual agent profiles and contextual signals. Furthermore, we design a News-Driven Dynamic Interaction (NDDI) module, which adaptively groups agents with shared news interests into localized interaction contexts, facilitating bottom-up consensus formation while mitigating the risk of isolated information cocoons. Experimental results on the CURE dataset demonstrate that ES-MAS substantially outperforms existing simulators in reproducing real-world historical trends, offering a scalable and effective framework for modeling dynamic opinion evolution.

2604.07593 2026-06-19 cs.AI Updated version 70%

Too long; didn't solve

Lucía M. Cabrera, Isaac Saxton-Knight, Jocelyn D'Arcy

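Affiliations: Instituto Balseiro; Poindexter Labs

Topic match (Other LLM): A study of how prompt length relates to mathematical reasoning performance.

AI summary: Studies how prompt length and solution length relate to large language model performance on mathematics problems, finding that both correlate positively with model failure rate.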

Details
Abstract

Mathematical benchmarks consisting of a range of mathematics problems are widely used to evaluate the reasoning abilities of large language models, yet little is known about how their structural properties influence model behaviour. In this work, we investigate two structural length variables, prompt length and solution length, and analyse how they relate to model performance on a newly constructed adversarial dataset of expert-authored mathematics problems. We find that both prompt and solution lengths correlate positively with increased model failure across models. We also include a secondary, exploratory analysis of cross-model disagreement. Under a difficulty-adjusted normalised analysis, both variables retain weak negative associations with realised model separation, slightly stronger for prompt length. Overall, our main robust finding is that structural length is linked to empirical difficulty in this dataset.

2604.01955 2026-06-19 cs.CY Updated version 70%

Teaching Students to Question the Machine: An AI Literacy Intervention Improves Students' Regulation of LLM Use in a Science Task

O. Clerc, R. Abdelghani, C. Desvaux, E. Poisson, P. Y. Oudeyer, H. Sauzéon

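Topic match (Other LLM): An AI literacy intervention that improves how students use LLMs.

AI summary: Through a two-hour AI literacy workshop, this study trains middle-school students (grades 8-9) to use large language models more effectively in science problem solving, reducing uncritical reliance and improving answer quality.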

Comments Workshop paper accepted at ALIT4ALL 2026: 2nd International Workshop on AI Literacy Education For All, co-located with AIED 2026

Details
Abstract

The rapid adoption of generative artificial intelligence (GenAI) in schools raises concerns about students' uncritical reliance on its outputs. Effective use of large language models (LLMs) requires not only technical knowledge but also the ability to monitor, evaluate, and regulate one's interaction with the system, processes closely tied to metacognitive regulation. These skills are still developing in middle school, making students particularly vulnerable to over-trust and premature acceptance of AI outputs. Because classroom time and teacher training resources are constrained, there is a pressing need to develop and evaluate AI literacy interventions that can be implemented under realistic school conditions. We report a controlled classroom study examining whether a two-hour AI literacy workshop improves students' interaction strategies and quality of final answers in LLM-supported science problem solving. A total of 116 students (grades 8-9; ages 13-15) completed six science investigation tasks using a generative AI system. Two days prior, the intervention group attended the workshop, which combined information about how LLMs work and fail with practical guidance on prompting and response evaluation; the control group received no training. Trained students showed less uncritical reliance on the system: they more often reformulated queries, asked follow-up questions, and more accurately judged response correctness, leading to better performance. In contrast, GenAI and metacognitive self-report scores did not predict performance, suggesting that effective use of generative AI depends less on self-reported measures and more on explicit training in interaction regulation. Overall, the results show that brief, scalable AI literacy instruction can meaningfully improve how middle-school students use generative AI in school-like learning activities.

2603.16941 2026-06-19 eess.AS cs.CL cs.SD Updated version 70%

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

Shree Harsha Bokkahalli Satish, Christoph Minixhofer, Maria Teleki, James Caverlee, Ondřej Klejch, Peter Bell, Gustav Eje Henter, Éva Székely

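Affiliations: (1) Department of Speech, Music and Hearing, KTH Royal Institute of Technology, Sweden; (2) Centre for Speech Technology Research, University of Edinburgh, UK; (3) Texas A&M University, USA

Topic match (Other LLM): Quantifying intersectional bias in speech large language models.

AI summary: Using 2,880 controlled interactions, this study evaluates intersectional accent and gender bias in three SpeechLLMs across six English accents and two gender presentations. Eastern European-accented speech (especially female-presenting voices) receives lower helpfulness scores, and human evaluators are more sensitive to the disparities than LLM judges.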

Comments 5 pages, 3 figures, 1 table, Accepted to Interspeech 2026

Details
Abstract

Speech Large Language Models (SpeechLLMs) process spoken input directly, retaining cues such as accent and perceived gender that were previously removed in cascaded pipelines. This introduces speaker identity dependent variation in responses. We present a large-scale intersectional evaluation of accent and gender bias in three SpeechLLMs using 2,880 controlled interactions across six English accents and two gender presentations, keeping linguistic content constant through voice cloning. Using pointwise LLM-judge ratings, pairwise comparisons, and Best-Worst Scaling with human validation, we detect recurring directional disparities. Eastern European-accented speech receives lower helpfulness scores, particularly for female-presenting voices. Responses remain polite but differ in helpfulness. While LLM judges capture the directional trend of these biases, human evaluators exhibit significantly higher sensitivity, showing stronger accent-level contrasts.

2603.16357 2026-06-19 cs.CY cs.SE Updated version 70%

Beyond Grading Accuracy: Exploring Alignment of TAs and LLMs

Matthijs Jansen op de Haar, Nacir Bouali, Faizan Ahmed

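Topic match (Other LLM): Evaluating open-source LLMs for grading UML class diagrams.

AI summary: Proposes an evaluation pipeline and, in a quantitative study of 92 UML class diagrams, compares teaching assistants with six open-source LLMs on individual grading criteria. The open-source LLMs approach TA-level grading accuracy, which opens a possible path toward mixed-initiative grading systems.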

Comments 7 pages, 3 figures

Details
Abstract

In this paper, we investigate the potential of open-source Large Language Models (LLMs) for grading Unified Modeling Language (UML) class diagrams. In contrast to existing work, which primarily evaluates proprietary LLMs, we focus on non-proprietary models, making our approach suitable for universities where transparency and cost are critical. Additionally, existing studies assess performance over complete diagrams rather than individual criteria, offering limited insight into how automated grading aligns with human evaluation. To address these gaps, we propose a grading pipeline in which student-generated UML class diagrams are independently evaluated by both teaching assistants (TAs) and LLMs. Grades are then compared at the level of individual criteria. We evaluate this pipeline through a quantitative study of 92 UML class diagrams from a software design course, comparing TA grades against assessments produced by six open-source LLMs. Performance is measured across individual criteria, highlighting areas where LLMs diverge from human graders. Our results show per-criterion accuracy of up to 88.56% and a Pearson correlation coefficient of up to 0.78, representing a substantial improvement over previous work while using only open-source models. The models achieve performance close to that of a TA, suggesting a possible path toward a mixed-initiative grading system, where TAs are aided in their grading. Our findings demonstrate that open-source LLMs can effectively support UML class diagram grading by explicitly identifying alignment with grading criteria. The proposed pipeline provides a practical approach to managing increasing workloads with growing student counts.
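The two reported metrics, per-criterion accuracy and the Pearson correlation coefficient, can be computed as in the sketch below. The grades, criterion names, and the choice to correlate per-diagram totals are all assumptions made for illustration; the abstract does not say which quantities the paper correlates.

```python
import math
from typing import Dict, List


def accuracy(ta: List[int], llm: List[int]) -> float:
    """Share of diagrams on which the LLM's decision matches the TA's."""
    return sum(a == b for a, b in zip(ta, llm)) / len(ta)


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical grades: one list per rubric criterion, one entry per diagram (1 = criterion met).
ta_grades: Dict[str, List[int]] = {"classes": [1, 1, 0, 1], "associations": [1, 0, 0, 1]}
llm_grades: Dict[str, List[int]] = {"classes": [1, 1, 0, 0], "associations": [1, 0, 0, 1]}

per_criterion = {c: accuracy(ta_grades[c], llm_grades[c]) for c in ta_grades}

# Total score per diagram, summed over criteria, for the correlation.
n_diagrams = 4
ta_totals = [sum(ta_grades[c][i] for c in ta_grades) for i in range(n_diagrams)]
llm_totals = [sum(llm_grades[c][i] for c in llm_grades) for i in range(n_diagrams)]
r = pearson(ta_totals, llm_totals)
print(per_criterion, round(r, 3))
```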

2502.19193 2026-06-19 cs.SI cs.AI cs.NE Updated version 70%

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

Jinyu Cai, Yusei Ishimizu, Mingyue Zhang, Munan Li, Jialong Li, Kenji Tei

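Topic match (Other LLM): Simulating language evolution with LLMs combined with a genetic algorithm.

AI summary: Proposes an LLM-based multi-agent framework combined with a genetic algorithm to simulate the iterative evolution of users' language strategies under regulation. Experiments show that more dialogue rounds improve the accuracy of information transmission and the continuity of dialogue.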

Comments The manuscript has been accepted to IEEE Transactions on Computational Social Systems

Details
Abstract

Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints. In this framework, participant agents, as social media users, continuously evolve their language expression, while supervisory agents emulate platform-level regulation by assessing policy violations. To achieve a more faithful simulation, we employ a dual design of language strategies (constraint and expression) to differentiate conflicting goals and utilize an LLM-driven GA (Genetic Algorithm) for the selection, mutation, and crossover of language strategies. The framework is evaluated using two distinct scenarios: an abstract password game and a realistic simulated illegal pet trade scenario. Experimental results demonstrate that as the number of dialogue rounds increases, both the number of uninterrupted dialogue turns and the accuracy of information transmission improve significantly. Furthermore, a user study with 40 participants validates the real-world relevance of the generated dialogues and strategies. Moreover, ablation studies validate the importance of the GA, emphasizing its contribution to long-term adaptability and improved overall results.
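The selection, mutation, and crossover cycle mentioned in this abstract follows the general shape of a genetic algorithm, sketched below. In the paper these operators are LLM-driven; here they are replaced by toy string operations, so `toy_fitness`, `toy_mutate`, and `toy_crossover` are hypothetical placeholders rather than the authors' operators, and the keep-the-fitter-half selection rule is likewise an assumption.

```python
import random
from typing import Callable, List


def evolve(
    population: List[str],
    fitness: Callable[[str], float],
    mutate: Callable[[str, random.Random], str],
    crossover: Callable[[str, str, random.Random], str],
    generations: int,
    rng: random.Random,
) -> List[str]:
    """Generic GA loop: keep the fitter half, refill with mutated crossovers."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, len(ranked) // 2)]  # selection
        children: List[str] = []
        while len(parents) + len(children) < len(population):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))  # crossover, then mutation
        population = parents + children
    return sorted(population, key=fitness, reverse=True)


def toy_fitness(strategy: str) -> float:
    """Placeholder score; the paper would instead evaluate dialogue outcomes."""
    return strategy.count("a")


def toy_crossover(a: str, b: str, rng: random.Random) -> str:
    """Placeholder for an LLM merging two strategies: single-point splice."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def toy_mutate(strategy: str, rng: random.Random) -> str:
    """Placeholder for an LLM rewriting a strategy: replace one character."""
    i = rng.randrange(len(strategy))
    return strategy[:i] + rng.choice("ab") + strategy[i + 1:]


start = ["bbbb", "abbb", "bbab", "bbbb"]
final = evolve(start, toy_fitness, toy_mutate, toy_crossover, generations=20, rng=random.Random(0))
print(final)
```

Because the selected parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.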

2605.05481 2026-06-19 cs.LG Updated version 60%

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

Dillon Sandhu, Ronald Parr

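Topic match (Other LLM): Proposes approximate next policy sampling; this is reinforcement learning, not core LLM content.

AI summary: Proposes Approximate Next Policy Sampling (ANPS), which addresses the "chicken-and-egg" problem in reinforcement learning by modifying the training distribution rather than constraining the policy update. Building on it, the Stable Value Approximate Policy Iteration (SV-API) algorithm executes larger target policy updates while matching or improving performance on Atari and continuous control tasks.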

Details
Abstract

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to safely improve a policy, the value function must be accurate on the state-visitation distribution of the updated policy. That distribution over states is unknown and cannot be sampled for the purposes of training the value function. Conservative updates solve this problem, but at the cost of shrinking the policy update. This paper explores an alternative solution, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS is satisfied if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and efficacy of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop to hold the target policy fixed while an iteratively updated behavioral policy gathers relevant experience. It only commits to a new policy once a convergence criterion has been met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise, it remains no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while executing substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

2604.07328 2026-06-19 cs.LG Updated version 60%

How to sketch a learning algorithm

Sam Gunn

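Affiliations: UC Berkeley

Topic match (Other LLM): Proposes a data deletion scheme for deep learning models.

AI summary: Proposes a data deletion scheme that, under a stability assumption, locally sketches arithmetic circuits via higher-order derivatives in random complex directions. It predicts deep learning model outputs with vanishing error and failure probability, while precomputation and prediction are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference.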

Comments Improved presentation and simplified Algorithm 4

Details
Abstract

How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.
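As background for the closing remark on forward-mode automatic differentiation, the sketch below shows the basic mechanism with dual numbers on a small arithmetic circuit (additions and multiplications only). It is deliberately limited: it computes a first-order derivative along a real direction, whereas the paper relies on higher-order derivatives in random complex directions. The `Dual` class, `directional_derivative`, and `circuit` are illustrative names, not taken from the paper's code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Number = Union[int, float]


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative along one fixed direction."""
    value: float
    tangent: float

    def __add__(self, other: Union["Dual", Number]) -> "Dual":
        other = _lift(other)
        return Dual(self.value + other.value, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other: Union["Dual", Number]) -> "Dual":
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(
            self.value * other.value,
            self.value * other.tangent + self.tangent * other.value,
        )

    __rmul__ = __mul__


def _lift(x: Union[Dual, Number]) -> Dual:
    """Treat plain numbers as constants with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def directional_derivative(
    f: Callable[..., Dual], point: Sequence[float], direction: Sequence[float]
) -> float:
    """Derivative of f at `point` along `direction`, from one forward pass."""
    inputs = [Dual(p, d) for p, d in zip(point, direction)]
    return f(*inputs).tangent


def circuit(x: Dual, y: Dual) -> Dual:
    """A small arithmetic circuit: x^2 * y + 3y."""
    return x * x * y + 3 * y


# Gradient at (2, 5) is (2xy, x^2 + 3) = (20, 7); along (1, 0.5) that gives 23.5.
d = directional_derivative(circuit, [2.0, 5.0], [1.0, 0.5])
print(d)
```

A single forward pass yields the derivative along one direction at roughly the cost of evaluating the circuit itself, which is why forward mode suits directional derivatives.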

2604.06464 2026-06-19 cs.LG physics.app-ph stat.ML Updated version 60%

Weighted Bayesian Conformal Prediction

Xiayin Lou, Peng Luo

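Affiliations: Technical University of Munich; Massachusetts Institute of Technology

Topic match (Other LLM): A weighted Bayesian conformal prediction method.

AI summary: Proposes Weighted Bayesian Conformal Prediction (WBCP), which generalizes Bayesian conformal prediction to importance-weighted settings through a weighted Dirichlet distribution, proves that the effective sample size governs the posterior variance, and provides richer uncertainty information about conditional coverage.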

Details
Abstract

Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell & Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose Weighted Bayesian Conformal Prediction (WBCP), which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\mathrm{Dir}(1,\ldots,1)$ with a weighted Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$, where $n_{\mathrm{eff}}$ is Kish's effective sample size. We prove four theoretical results: (1) $n_{\mathrm{eff}}$ is the unique concentration parameter matching frequentist and Bayesian variances; (2) posterior standard deviation decays as $O(1/\sqrt{n_{\mathrm{eff}}})$; (3) BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4) the HPD threshold provides $O(1/\sqrt{n_{\mathrm{eff}}})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as Geographical BQ-CP, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
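The weighted Dirichlet in this abstract can be made concrete with a short sketch. It uses the standard definition of Kish's effective sample size, $n_{\mathrm{eff}} = (\sum_i w_i)^2 / \sum_i w_i^2$, and assumes $\tilde{w}_i$ denotes the weights normalized to sum to one; how the paper turns the resulting posterior into a prediction threshold is not shown, and the function names are illustrative.

```python
import random
from typing import List


def kish_ess(weights: List[float]) -> float:
    """Kish's effective sample size: (sum w)^2 / sum w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def dirichlet_concentration(weights: List[float]) -> List[float]:
    """Concentration n_eff * w~_i, with w~ the weights normalized to sum to one (assumed)."""
    total = sum(weights)
    n_eff = kish_ess(weights)
    return [n_eff * w / total for w in weights]


def sample_dirichlet(alpha: List[float], rng: random.Random) -> List[float]:
    """Draw from Dir(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# Equal weights recover the uniform Dirichlet Dir(1, ..., 1) of the unweighted case.
uniform = dirichlet_concentration([2.0, 2.0, 2.0, 2.0])

# Unequal weights shrink the effective sample size below n = 5.
skewed_weights = [4.0, 1.0, 1.0, 1.0, 1.0]
n_eff = kish_ess(skewed_weights)
skewed = dirichlet_concentration(skewed_weights)
posterior_draw = sample_dirichlet(skewed, random.Random(0))
print(uniform, n_eff, skewed)
```

The concentration parameters always sum to $n_{\mathrm{eff}}$, so more uneven weights give a more diffuse posterior, in line with the $O(1/\sqrt{n_{\mathrm{eff}}})$ decay stated in the abstract.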

2603.10184 2026-06-19 stat.ML cs.LG 版本更新 60%

Stabilizing Bandits using Regularization: Precise Regret and A Quantitative Central Limit Theorem

使用正则化稳定赌博机:精确遗憾与定量中心极限定理

Budhaditya Halder, Ishan Sengupta, Koustav Chowdhury, Samya Praharaj, Koulik Khamaru

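Affiliations Department of Statistics, Rutgers University; Indian Statistical Institute, Kolkata

Topic match Other LLM: studies the stability of bandit algorithms; only weakly related to LLMs.

AI summary Proposes a refined stability condition and proves that a large class of regularized stochastic mirror descent algorithms satisfies it. From this it derives a non-asymptotic Berry-Esseen bound for empirical reward estimates under adaptive sampling, matching upper and lower regret bounds, and asymptotic normality under a prescribed level of adversarial corruption. It also shows that regularization is necessary: a controlled polylogarithmic inflation in regret is the price of valid inference.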

Comments Updated rate of convergence and precise regret in version 2

AI中文摘要

由于自适应采样违反了经典渐近理论中的独立性假设,使用赌博机数据进行统计推断面临根本性挑战。近期工作将稳定性(Lai 和 Wei,1982)确定为自适应下有效推断的充分条件。本文首先提出一个精细的稳定性条件,以在线算法的迭代点表述,并证明一大类正则化随机镜像下降型算法满足该条件。这一精细条件使我们能够在多个方面加强 Lai 和 Wei(1982)的渐近结果。首先,我们推导出自适应采样下经验奖励估计的非渐近 Berry-Esseen 界。其次,我们推导出所提算法遗憾的匹配非渐近上下界,从而精确刻画其遗憾。第三,我们证明这些正则化算法在给定水平的对抗性污染下保持渐近正态性和有效推断。最后,我们表明正则化是必要的而非偶然的:Lai-Wei 稳定性与最优的 $O(\sqrt{T})$ 遗憾率(如 EXP3 等非正则化算法所达到的)不相容,因此受控的多对数级遗憾膨胀是有效推断的代价。

英文摘要

Statistical inference with bandit data presents fundamental challenges owing to adaptive sampling, which violates the independence assumptions underlying classical asymptotic theory. Recent work has identified stability (Lai and Wei, 1982) as a sufficient condition for valid inference under adaptivity. This paper first provides a refined stability condition, stated in terms of the iterates of an online algorithm, and shows that a large class of regularized stochastic-mirror-descent-style algorithms satisfy it. This refined condition allows us to strengthen the asymptotic results of Lai and Wei (1982) in several ways. First, we derive a non-asymptotic Berry-Esseen bound for the empirical reward estimates under adaptive sampling. Second, we derive matching non-asymptotic upper and lower bounds on the regret of the proposed algorithm, yielding a precise characterization of its regret. Third, we show that these regularized algorithms preserve asymptotic normality and valid inference under a prescribed level of adversarial corruption. Finally, we show that regularization is necessary rather than incidental: Lai-Wei stability is incompatible with the optimal $O(\sqrt{T})$ regret rate (the rate attained by unregularized algorithms such as EXP3), so that a controlled, polylogarithmic inflation in regret is the price of valid inference.

2511.22283 2026-06-19 cs.LG 版本更新 60%

The Hidden Cost of Approximation in Online Mirror Descent

在线镜像下降中近似的隐藏代价

Ofir Schlisselberg, Uri Sherman, Tomer Koren, Yishay Mansour

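Affiliations Tel Aviv University; Google Research

Topic match Other LLM: studies the robustness of online mirror descent to approximation errors; related to optimization.

AI summary Studies the robustness of online mirror descent (OMD) to approximation errors and finds that error tolerance is closely tied to the smoothness of the regularizer. Uniformly smooth regularizers admit a tight bound on excess regret; negative entropy over the simplex and its subsets requires exponentially small errors to avoid linear regret; log-barrier and Tsallis regularizers stay robust even with only polynomially small errors.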

AI中文摘要

在线镜像下降(OMD)是一个基本的算法范式,支撑着优化、机器学习和序列决策中的许多算法。OMD迭代被定义为优化子问题的解,而这些子问题通常只能近似求解,导致算法的不精确版本。然而,现有的OMD分析通常假设理想的无误差环境,从而限制了我们对实践中应期望的性能保证的理解。在这项工作中,我们启动了对不精确OMD的系统研究,并揭示了正则子光滑性与对近似误差鲁棒性之间的复杂关系。当正则子一致光滑时,我们建立了由误差引起的超额遗憾的紧界。然后,对于单纯形及其子集上的障碍正则子,我们识别出一个尖锐的分离:负熵需要指数小的误差以避免线性遗憾,而对数障碍和Tsallis正则子即使在误差仅为多项式大小时也能保持鲁棒。最后,我们表明当损失是随机的且域是单纯形时,负熵重新获得鲁棒性——但这种性质并不扩展到所有子集,在那里指数小的误差再次是避免次优遗憾所必需的。

英文摘要

Online mirror descent (OMD) is a fundamental algorithmic paradigm that underlies many algorithms in optimization, machine learning and sequential decision-making. The OMD iterates are defined as solutions to optimization subproblems which often can be solved only approximately, leading to an inexact version of the algorithm. Nonetheless, existing OMD analyses typically assume an idealized error-free setting, thereby limiting our understanding of the performance guarantees that should be expected in practice. In this work we initiate a systematic study of inexact OMD, and uncover an intricate relation between regularizer smoothness and robustness to approximation errors. When the regularizer is uniformly smooth, we establish a tight bound on the excess regret due to errors. Then, for barrier regularizers over the simplex and its subsets, we identify a sharp separation: negative entropy requires exponentially small errors to avoid linear regret, whereas log-barrier and Tsallis regularizers remain robust even when the errors are only polynomially small. Finally, we show that when the losses are stochastic and the domain is the simplex, negative entropy regains robustness, but this property does not extend to all subsets, where exponentially small errors are again necessary to avoid suboptimal regret.
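
For readers unfamiliar with the setting, the sketch below shows one OMD step on the simplex with the negative-entropy regularizer, which has the well-known closed form of a multiplicative-weights update. The `error` perturbation (mixing the iterate with a random simplex point) is a hypothetical stand-in for an approximately solved subproblem. It is an assumption made here for illustration, not the error model analyzed in the paper.

```python
import numpy as np


def omd_entropy_step(x, grad, eta):
    """Exact OMD step with negative entropy: x_new is proportional to x * exp(-eta * grad)."""
    logits = np.log(x) - eta * grad
    logits -= logits.max()              # stabilize the exponentials
    y = np.exp(logits)
    return y / y.sum()


def regret(losses, eta, error=0.0, seed=0):
    """Regret of (possibly perturbed) OMD against the best fixed coordinate."""
    rng = np.random.default_rng(seed)
    n_rounds, dim = losses.shape
    x = np.full(dim, 1.0 / dim)         # start at the uniform distribution
    total = 0.0
    for g in losses:
        total += float(x @ g)           # loss suffered with the current iterate
        x = omd_entropy_step(x, g, eta)
        if error > 0.0:
            # hypothetical inexactness: move a fraction `error` toward a random point
            x = (1.0 - error) * x + error * rng.dirichlet(np.ones(dim))
    return total - float(losses.sum(axis=0).min())
```

Calling `regret(losses, eta)` and `regret(losses, eta, error=0.1)` on the same loss matrix gives a quick way to compare exact and perturbed iterates under this toy error model.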

2509.23806 2026-06-19 cs.SE cs.LG 版本更新 60%

Influence-Guided Concolic Testing of Transformer Robustness

影响力引导的Transformer鲁棒性混合执行(concolic)测试

Chih-Duo Hong, Chih-Cheng Yang, Yu Wang, Fang Yu

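Affiliations Department of Management Information Systems

Topic match Other LLM: tests Transformer robustness, but the main focus is software testing.

AI summary Proposes a concolic testing method that ranks path predicates by SHAP influence, implements multi-head attention semantics in pure Python, and makes the softmax boundary explicit. Across 500 model-input pairs on CIFAR-10 it reaches a 60% attack success rate, 45 percentage points above the differential evolution baseline (15%). In the two-layer Transformer study, predicate prioritization cuts median attack time by 51%.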

Comments Accepted at the 26th International Conference on Software Quality, Reliability, and Security

AI中文摘要

神经网络的混合执行(concolic)测试交替进行具体执行和约束求解,以搜索能翻转模型决策的输入。我们提出一种针对Transformer分类器的混合执行测试器,使用SHAP估计对待处理的路径谓词按其对当前预测的影响进行排序。为了在由SMT求解支撑的执行中支持多头自注意力机制,我们用纯Python实现与求解器兼容的注意力语义,并通过具体化指数运算的参数使softmax边界显式化。我们在CIFAR-10上对三个紧凑Transformer分类器、ResNet18和VGG16在单像素预算和900秒时限下评估了该方法。在这一匹配比较的500个模型-输入对中,我们的方法实现了60%的成功率,而将模型视为黑盒的差分进化基线仅为15%。在主要的两层Transformer分支排序研究中,基于SHAP的谓词优先级排序将成功率从56%提升至60%,并将中位攻击时间降低51%。这些结果表明,影响力引导的路径探索可以使混合执行测试成为在Transformer模型中寻找对抗样本的实用方法。

英文摘要

Concolic testing for neural networks alternates concrete execution with constraint solving to search for inputs that flip model decisions. We present a concolic tester for Transformer classifiers that uses SHAP estimates to rank pending path predicates by their impact on the current prediction. To support multi-head self-attention in SMT-backed execution, we implement solver-compatible attention semantics in pure Python and make the softmax boundary explicit by concretizing exponentiation arguments. We evaluate our method on CIFAR-10 across three compact Transformer classifiers, ResNet18, and VGG16 under a one-pixel budget and a 900-second time limit. Across the 500 model-input pairs in this matched comparison, our method achieves 60% success, compared with 15% for a differential evolution baseline that treats the model as a black box. In the primary two-layer Transformer branch-ordering study, SHAP-based predicate prioritization raises success from 56% to 60% and reduces median attack time by 51%. These results show that influence-guided path exploration can make concolic testing a practical way to find adversarial examples in Transformer models.

2507.05169 2026-06-19 cs.LG cs.AI cs.CL cs.CV cs.RO 版本更新 60%

Critique of World Model

世界模型批判

Eric Xing, Mingkai Deng, Jinyu Hou

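Topic match Other LLM: surveys world-model architectures and involves generative prediction; related to LLMs.

AI summary Starting from the notion of "hypothetical thinking" in psychology, argues that the primary goal of a world model is to simulate all actionable possibilities of the real world, and proposes a Generative Latent Prediction (GLP) architecture built on stateful, hierarchical, multi-level, mixed continuous/discrete representations.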

AI中文摘要

世界模型,即生物智能体所经历并对其采取行动的真实世界环境的算法模拟器,近年来因开发具有人工(通用)智能的虚拟智能体的需求日益增长而成为一个新兴课题。关于世界模型究竟是什么、如何构建、如何使用以及如何评估,已有许多讨论。本文从著名科幻经典《沙丘》中的想象出发,并借鉴心理学文献中“假设性思维”的概念,论证世界模型的主要目标是模拟真实世界中所有可行动的可能性,以进行有目的的推理和行动。我们审视了世界建模的关键设计维度:数据、表示、架构、学习目标和使用,调查了现有方法并分析了它们的权衡。在此基础上,我们提出了一种新的通用世界模型生成式潜在预测(GLP)架构,基于有状态的、分层的、多层次的、混合连续/离散表示,以及生成式和自监督学习框架,并展望了由这种模型支持的物理、智能体和嵌套(PAN)AGI系统。

英文摘要

The world model, an algorithmic simulator of the real-world environment that biological agents experience and act upon, has been an emerging topic in recent years due to the rising need to develop virtual agents with artificial (general) intelligence. There has been much discussion on what a world model really is, how to build it, how to use it, and how to evaluate it. In this essay, starting from the imagination in the famed sci-fi classic Dune, and drawing inspiration from the concept of "hypothetical thinking" in the psychology literature, we argue that the primary goal of a world model is simulating all actionable possibilities of the real world for purposeful reasoning and acting. We examine the key design dimensions of world modeling: data, representation, architecture, learning objective, and usage, surveying existing approaches and analyzing their tradeoffs. Building on this examination, we propose a new Generative Latent Prediction (GLP) architecture for a general-purpose world model, based on stateful, hierarchical, multi-level, and mixed continuous/discrete representations, and a generative and self-supervised learning framework, with an outlook of a Physical, Agentic, and Nested (PAN) AGI system enabled by such a model.

2502.03227 2026-06-19 cs.LG cs.CV 版本更新 60%

Adversarial Dependence Minimization

对抗性依赖最小化

Pierre-François De Plaen, Tinne Tuytelaars, Marc Proesmans, Luc Van Gool

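Affiliations CVL, ETH Zürich, Switzerland; INSAIT, Sofia University, Bulgaria

Topic match Other LLM: the algorithm can be applied to prevent dimensional collapse in self-supervised learning.

AI summary Proposes ADM, an algorithm that minimizes statistical dependence between feature dimensions through an adversarial game, proves that mutual independence is achieved at the global optimum, and applies it to nonlinear decorrelation, improved generalization in image classification, and preventing dimensional collapse in self-supervised learning.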

AI中文摘要

最小冗余表示通常通过最小化特征协方差来学习。然而,基于协方差的方法无法消除所有依赖/冗余,因为线性不相关的变量仍可能表现出非线性关系。为了解决这个问题,我们引入了ADM,一种可微分的算法,通过对抗博弈最小化特征维度之间的统计依赖性:辅助网络识别依赖关系,而编码器去除它们。我们证明了在全局最优时实现了相互独立,经验验证了收敛性,并研究了三个潜在应用:将PCA扩展到非线性去相关、提高图像分类的泛化能力以及防止自监督学习中的维度坍塌。通过促进统计独立的表示,ADM为在多种应用中学习更鲁棒、更压缩和更泛化的表示铺平了道路。

英文摘要

Minimally redundant representations are typically learned by minimizing feature covariance. However, covariance-based methods fail to eliminate all dependencies/redundancies, as linearly uncorrelated variables can still exhibit nonlinear relationships. To address this, we introduce ADM, a differentiable algorithm that minimizes statistical dependence between feature dimensions through an adversarial game: auxiliary networks identify dependencies, while the encoder removes them. We prove that mutual independence is achieved at the global optimum, empirically verify convergence, and study three potential applications: extending PCA to nonlinear decorrelation, improving generalization in image classification, and preventing dimensional collapse in self-supervised learning. By promoting statistically independent representations, ADM paves the way for learning more robust, compressed, and generalizable representations across diverse applications.
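
The claim that linearly uncorrelated variables can still be dependent is easy to check numerically. The toy example below is an illustration added here, not taken from the paper: it builds a variable `y` that is a deterministic function of `x` yet has zero covariance with it.

```python
import numpy as np

# Symmetric values of x, and y fully determined by x.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

# Covariance only measures linear association, so it misses the dependence.
cov_xy = np.cov(x, y)[0, 1]

# After a nonlinear feature map of x the relationship becomes perfectly linear.
corr_after_map = np.corrcoef(x ** 2, y)[0, 1]

print(f"cov(x, y)      = {cov_xy:.3f}")
print(f"corr(x**2, y)  = {corr_after_map:.3f}")
```

A covariance penalty would leave this pair untouched, which is the gap that a dependence-based objective is meant to close.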

2602.05533 2026-06-19 cs.AI 版本更新 55%

Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach

硬约束下的条件扩散引导:一种随机分析方法

Zhengyi Guo, Wenpin Tang, Renyuan Xu

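Affiliations Department of Industrial Engineering and Operations Research, Columbia University; Department of Management Science and Engineering, Stanford University

Topic match Other LLM: conditional generation with diffusion models; only weakly related to LLMs.

AI summary Proposes a conditional diffusion guidance framework based on Doob's h-transform and martingale representation. It estimates the conditioning function and its gradient with a martingale loss and a martingale-covariation loss, enforces hard constraints, and gives non-asymptotic guarantees in total variation and Wasserstein distances.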

AI中文摘要

我们研究了扩散模型中在硬约束下的条件生成,其中生成的样本必须以概率1满足预设事件。这类约束在安全关键应用和稀有事件模拟中自然出现,而软或基于奖励的引导方法无法保证约束满足。基于扩散模型的概率解释,我们利用Doob h-变换、鞅表示和二次变差过程,开发了一个原则性的条件扩散引导框架。具体地,得到的引导动力学通过涉及条件函数对数梯度的显式漂移校正来增强预训练扩散,而不修改预训练得分网络。利用鞅和二次变差恒等式,我们提出了两种新的离策略学习算法,基于鞅损失和鞅协变差损失,仅使用预训练模型的轨迹来估计h及其梯度。我们为得到的条件采样器在总变差和Wasserstein距离下提供了非渐近保证,明确刻画了得分近似和引导估计误差的影响。数值实验证明了所提方法在强制硬约束和生成稀有事件样本方面的有效性。数值实验的代码见 https://github.com/ZhengyiGuo2002/CDG_Finance。

英文摘要

We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. Such constraints arise naturally in safety-critical applications and in rare-event simulation, where soft or reward-based guidance methods offer no guarantee of constraint satisfaction. Building on a probabilistic interpretation of diffusion models, we develop a principled conditional diffusion guidance framework based on Doob's h-transform, martingale representation and quadratic variation process. Specifically, the resulting guided dynamics augment a pretrained diffusion with an explicit drift correction involving the logarithmic gradient of a conditioning function, without modifying the pretrained score network. Leveraging martingale and quadratic-variation identities, we propose two novel off-policy learning algorithms based on a martingale loss and a martingale-covariation loss to estimate h and its gradient using only trajectories from the pretrained model. We provide non-asymptotic guarantees for the resulting conditional sampler in both total variation and Wasserstein distances, explicitly characterizing the impact of score approximation and guidance estimation errors. Numerical experiments demonstrate the effectiveness of the proposed methods in enforcing hard constraints and generating rare-event samples. The code of the numerical experiments can be found at https://github.com/ZhengyiGuo2002/CDG_Finance.
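
The drift correction mentioned in the abstract has a standard textbook form, written out below for orientation. The notation is assumed here for illustration and may differ from the paper's: $b$ and $\sigma$ are the drift and diffusion coefficient of the pretrained dynamics, and $A$ is the constraint event at terminal time $T$.

```latex
% Assumed notation: b = drift, \sigma = diffusion coefficient of the pretrained
% dynamics; A = constraint event at terminal time T; X^h = conditioned process.
\begin{align*}
  dX_t &= b(t, X_t)\,dt + \sigma(t)\,dW_t, \\
  h(t, x) &= \mathbb{P}\left(X_T \in A \mid X_t = x\right), \\
  dX_t^{h} &= \left[\, b(t, X_t^{h}) + \sigma(t)\sigma(t)^{\top}
      \nabla_x \log h(t, X_t^{h}) \,\right] dt + \sigma(t)\,dW_t .
\end{align*}
```

Under Doob's h-transform the process $X^h$ has the law of $X$ conditioned on $\{X_T \in A\}$, so only $h$ and its gradient need to be estimated while the pretrained drift is left unchanged.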