arXivDaily: arXiv daily academic digest (updated Monday through Friday)
2510.05864 2026-05-27 cs.CL cs.CY

On the Sensitivity of Instruction-tuned LLMs to Harmful Sentences in Long Inputs

Faeze Ghorbanpour, Alexander Fraser

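Affiliations: School of Computation, Information and Technology, TU Munich; Munich Center for Machine Learning (MCML)

AI summary: Constructs long inputs while systematically varying length, harmful-sentence ratio, explicitness, and position to study how sensitive LLMs are to sparsely embedded harmful sentences; finds that sensitivity is non-monotonic in prevalence, declines with length, favors early positions, and is higher for explicit harm.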

Abstract

Large language models (LLMs) increasingly operate on long inputs, yet their behavior when harmful sentences are sparsely embedded within such inputs remains poorly understood. We present a sensitivity analysis that probes how LLMs extract harmful sentences embedded in long inputs. We construct long inputs by combining neutral and harmful sentences, and systematically vary four factors: input length (600 to 30,000 tokens), the proportion of harmful sentences (0.01 to 0.50), harm realization (explicit vs. implicit), and the position of harmful sentences within the input (beginning, middle, end), enabling a controlled stress-test evaluation. Experiments across toxic, offensive, and hate content, and across LLaMA-3.1, Qwen-2.5, and Mistral, reveal consistent patterns: sensitivity is non-monotonic with respect to harmful prevalence, peaking at moderate levels; sensitivity degrades as input length increases; harmful sentences placed earlier in the input are more strongly prioritized; and explicit harm is more reliably identified than implicit harm. These findings provide a systematic view of how LLMs prioritize harmful sentences in long inputs under controlled stress conditions, highlighting both emerging strengths and remaining challenges for safety-related use.
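The input construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_long_input`, the use of sentence counts instead of token counts, the placement of harmful sentences as one contiguous block, and the placeholder strings are all assumptions made for the example.

```python
import random


def build_long_input(neutral, harmful, n_sentences, harmful_ratio, position, seed=0):
    """Mix neutral and harmful sentences into one long input (hypothetical helper).

    neutral, harmful: pools of sentences to sample from (with replacement).
    n_sentences: total number of sentences (a stand-in for token length).
    harmful_ratio: fraction of sentences that are harmful, e.g. 0.01 to 0.50.
    position: "beginning", "middle", or "end" for the harmful block.
    Returns the joined text and the number of harmful sentences used.
    """
    rng = random.Random(seed)
    n_harmful = max(1, round(n_sentences * harmful_ratio))
    n_neutral = n_sentences - n_harmful
    harmful_part = [rng.choice(harmful) for _ in range(n_harmful)]
    neutral_part = [rng.choice(neutral) for _ in range(n_neutral)]

    # Decide how many neutral sentences precede the harmful block.
    if position == "beginning":
        cut = 0
    elif position == "middle":
        cut = n_neutral // 2
    elif position == "end":
        cut = n_neutral
    else:
        raise ValueError(f"unknown position: {position}")

    sentences = neutral_part[:cut] + harmful_part + neutral_part[cut:]
    return " ".join(sentences), n_harmful


if __name__ == "__main__":
    # Placeholder strings stand in for real dataset sentences.
    text, k = build_long_input(["N1.", "N2."], ["[HARMFUL]"], 100, 0.05, "end")
    print(k, text[-60:])
```

Sweeping `n_sentences`, `harmful_ratio`, and `position` over a grid gives the kind of controlled stress test the abstract describes; the explicit vs. implicit factor would come from the choice of the `harmful` pool.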

2510.05141 2026-05-27 cs.CL

To model human linguistic prediction, make LLMs less superhuman

Byung-Doh Oh, Tal Linzen

Affiliations: Division of Linguistics and Multilingual Studies, Nanyang Technological University, Singapore; Department of Linguistics, New York University, New York, USA; Center for Data Science, New York University, New York, USA

AI summary: Argues that LLMs fail to explain human reading behavior because of their superhuman predictive ability, advocates improving them by modeling human-like memory, and proposes new experimental directions.

Comments Accepted to Trends in Cognitive Sciences

Abstract

When we read, we make predictions about upcoming words; these predictions influence our reading behavior. The success of large language models (LLMs), which, like humans, make predictions about upcoming words, has motivated their use as models of human linguistic prediction. Surprisingly, in the last few years, as LLMs' ability to predict the next word has improved, their ability to explain reading behavior has declined. We argue this is because current LLMs can predict upcoming words much better than human readers can. This 'superhumanness' is driven by LLMs' extensive training data, stronger long-term memory of training examples, and stronger short-term memory. We advocate for LLMs with human-like memory and for new experiments to measure the alignment between humans and LLMs, and outline directions towards achieving these goals.

2510.04533 2026-05-27 cs.CV

TAG: Tangential Amplifying Guidance for Hallucination-Resistant Sampling

Hyunmin Cho, Donghoon Ahn, Susung Hong, Jee Eun Kim, Seungryong Kim, Kyong Hwan Jin

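Affiliations: Korea University; University of California, Berkeley; University of Washington

AI summary: Proposes TAG, a training-free, architecture-agnostic, plug-and-play guidance method that corrects the sampling trajectory by amplifying the tangential component of the estimated score, reducing semantic inconsistencies and improving fidelity.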

Comments Accepted to ICML 2026 (Regular)

Abstract

Diffusion models achieve state-of-the-art image generation but often produce semantic inconsistencies, or hallucinations. Existing inference-time guidance methods rely on external signals or architectural modifications, adding computational overhead. We propose Tangential Amplifying Guidance (TAG), a training-free, architecture-agnostic, plug-and-play guidance method that operates purely on trajectory signals. TAG uses an intermediate sample as a projection basis and amplifies the tangential components of the estimated score to correct the sampling trajectory. A first-order Taylor analysis shows that this steers the state toward higher-probability regions of the data manifold, reducing inconsistencies and improving fidelity while adding negligible overhead to existing samplers. Code is available at our Project Page (https://hyeon-cho.github.io/TAG/).
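The projection idea in the abstract can be illustrated with a small sketch. It assumes that "tangential" means the part of the score orthogonal to the intermediate sample, and that amplification is a simple scalar multiplier `eta`; both are assumptions for illustration, and the function name `amplify_tangential` is hypothetical rather than taken from the authors' code.

```python
def amplify_tangential(x, score, eta=1.5):
    """Split `score` into parts parallel and orthogonal to `x`, then scale the latter.

    x: intermediate sample, used as the projection basis (list of floats).
    score: estimated score at x (list of floats, same length as x).
    eta: amplification factor for the tangential part (assumed parameter).
    """
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0.0:
        # No direction to project onto; return the score unchanged.
        return list(score)

    # Component of the score along x.
    coeff = sum(s * v for s, v in zip(score, x)) / norm_sq
    parallel = [coeff * v for v in x]

    # Remainder is orthogonal to x (the "tangential" part in this sketch).
    tangential = [s - p for s, p in zip(score, parallel)]

    return [p + eta * t for p, t in zip(parallel, tangential)]


if __name__ == "__main__":
    print(amplify_tangential([1.0, 0.0], [2.0, 3.0], eta=2.0))
```

With `eta = 1.0` the score is returned unchanged, so a sampler that calls this helper reduces to its original behavior, which is one way to read the plug-and-play claim.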

2510.01833 2026-05-27 cs.AI cs.CL

Plan Then Action: High-Level Planning Guidance Reinforcement Learning for LLM Reasoning

Zhihao Dou, Qinjian Zhao, Zhongwei Wan, Dinggen Zhang, Weida Wang, Towsif Raiyan, Benteng Chen, Qingtao Pan, Yang Ouyang, Chaoda Song, Zhiqiang Gao, Shufei Zhang, Sumon Biswas

Affiliations: Case Western Reserve University, Cleveland, OH, USA; Kean University, Union, NJ, USA; The Ohio State University, Columbus, OH, USA; Fudan University, Shanghai, China; Shanghai Artificial Intelligence Laboratory, Shanghai, China; The University of Hong Kong, Hong Kong, China; North Carolina State University, Raleigh, NC, USA

AI summary: Proposes PTA-GRPO, a two-stage framework that jointly optimizes high-level planning guidance and reinforcement learning to improve LLM accuracy and generalization on mathematical and natural-science reasoning tasks.

Comments 19 pages and 5 figures

Abstract

Large language models (LLMs) demonstrate strong reasoning abilities via Chain-of-Thought (CoT), but their token-level generation encourages local decisions and lacks global planning, often leading to redundant or inaccurate reasoning. Existing methods, such as tree-based search and reinforcement learning (RL), attempt to address this issue but incur high computational costs and still struggle to produce reliable reasoning trajectories. To address these challenges, we propose Plan-Then-Action Enhanced Reasoning with Group Relative Policy Optimization (PTA-GRPO), a two-stage framework designed to jointly improve high-level planning and fine-grained CoT reasoning. Specifically, in the first stage, a given LLM is responsible for summarizing CoT reasoning into compact high-level guidance, which is then leveraged for supervised fine-tuning. Then, we introduce a guidance-aware reinforcement learning method that jointly optimizes the final output and the quality of guidance, enhancing reasoning effectiveness. We evaluate PTA-GRPO on ten reasoning benchmarks across mathematics and natural sciences, using five diverse base models spanning multiple data modalities. The results show that PTA-GRPO consistently delivers significant improvements across models and tasks, demonstrating strong effectiveness and generalization.

2510.01336 2026-05-27 cs.CL cs.AI cs.LG

HiSpec: Hierarchical Speculative Decoding for LLMs

Avinash Kumar, Sujay Sanghavi, Poulami Das

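Affiliations: Department of Electrical and Computer Engineering, The University of Texas at Austin

AI summary: Proposes HiSpec, a framework that uses early-exit models for low-overhead intermediate verification and reuses key-value caches and hidden states to raise throughput, achieving a 1.28× average and up to 2.01× speedup without loss of accuracy.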

Abstract

Speculative decoding accelerates LLM inference by using a smaller draft model to speculate tokens that a larger target model verifies. Verification is often the bottleneck (e.g., verification is 4× slower than token generation when a 3B model speculates for a 70B target model), but most prior work focuses only on accelerating drafting. "Intermediate" verification reduces verification time by discarding inaccurate draft tokens early, but existing methods incur substantial training overheads in incorporating the intermediate verifier, increase the memory footprint to orchestrate the intermediate verification step, and compromise accuracy by relying on approximate heuristics. We propose Hierarchical Speculative Decoding (HiSpec), a framework for high-throughput speculative decoding that exploits early-exit (EE) models for low-overhead intermediate verification. EE models allow tokens to exit early by skipping layer traversal and are explicitly trained so that hidden states at selected layers can be interpreted, making them uniquely suited for intermediate verification without drastically increasing compute and memory overheads. To improve resource efficiency even further, we design a methodology that enables HiSpec to reuse key-value caches and hidden states between the draft, intermediate verifier, and target models. To maintain accuracy, HiSpec periodically validates the draft tokens accepted by the intermediate verifier against the target model. Our evaluations using various representative benchmarks and models show that HiSpec improves throughput by 1.28× on average and by up to 2.01× compared to the baseline single-layer speculation without compromising accuracy.
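For readers unfamiliar with the baseline, the draft-then-verify loop that HiSpec builds on can be sketched as below. This is a toy greedy version with hypothetical callables `target_next` and `draft_next` standing in for real models; it shows only single-layer speculation and does not implement HiSpec's intermediate early-exit verifier or cache reuse.

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Toy greedy speculative decoding (illustrative only).

    target_next, draft_next: functions mapping a token list to the next token.
    k: number of tokens the draft model proposes per round (assumed parameter).
    The output always equals plain greedy decoding with the target model.
    """
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # Draft phase: the small model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))

        # Verify phase: a real system scores all k positions in one batched
        # target pass; here the target is called position by position.
        for proposed in draft:
            expected = target_next(tokens)
            tokens.append(expected)  # the target's token is always kept
            if expected != proposed or len(tokens) >= limit:
                break  # discard the remaining draft tokens after a mismatch
    return tokens


if __name__ == "__main__":
    target = lambda t: (t[-1] + 1) % 10
    # The draft agrees with the target except at some prefix lengths.
    draft = lambda t: (t[-1] + 1) % 10 if len(t) % 3 else 0
    print(speculative_decode(target, draft, [0], 8))
```

The speedup in practice comes from accepting several draft tokens per target pass; an intermediate verifier, as the abstract proposes, would sit between the two phases and drop unlikely draft tokens before the target is consulted.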

2510.00902 2026-05-27 cs.CV cs.CY cs.HC

Intuitions of Machine Learning Researchers about Transfer Learning for Medical Image Classification

Yucheng Lu, Hubert Dariusz Zając, Veronika Cheplygina, Amelia Jiménez-Sánchez

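Affiliations: IT University of Copenhagen; University of Antwerp; University of Barcelona

AI summary: Uses a task-based survey to reveal the intuitions behind machine learning practitioners' choice of source datasets; finds that choices depend on the task, community practices, and perceived similarity, that similarity does not always match expected performance, and that ethical considerations are largely absent.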

Comments Under review

Abstract

Transfer learning is crucial for medical imaging, yet the selection of source datasets often relies on researchers' intuition rather than systematic principles, which can impact the generalizability of algorithms and, thus, patient outcomes. This study investigates these decisions through a task-based survey with machine learning practitioners. Unlike prior work that benchmarks models and experimental setups, we take a human-computer interaction (HCI) perspective on how practitioners select source datasets. Our findings indicate that choices are task-dependent and influenced by community practices, dataset properties, and similarity, whether computed (from data embeddings) or perceived (visual or semantic). However, similarity ratings and expected performance are not always aligned, challenging a traditional "more similar is better" view. Moreover, ethical and fairness considerations remain largely absent from source dataset selection. Participants often used ambiguous terminology, which suggests a need for clearer definitions and tools to make them explicit and usable. By clarifying these heuristics and introducing a conceptual framework of transfer learning factors, this work provides practical insights for more systematic source selection in transfer learning.

2509.26600 2026-05-27 cs.CL cs.AI

When LLMs Benchmark Themselves: Deconstructing Self-Bias in Automated Evaluation

Wenda Xu, Sweta Agrawal, Vilém Zouhar, Markus Freitag, Daniel Deutsch

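Affiliations: Google; ETH Zurich

AI summary: Studies self-bias when LLMs automatically create benchmarks; finds that both test-set generation and evaluation introduce bias that makes models favor their own outputs, and proposes a diversity metric that partially mitigates it.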

Abstract

As LLMs rapidly saturate existing benchmarks, automated benchmark creation using LLMs (LLM-as-a-benchmark) -- where a model generates test inputs (LLM-as-a-testset) and evaluates outputs (LLM-as-an-evaluator) -- has gained traction as a cheap alternative to human curation. We show that this paradigm has a fundamental problem: LLM-generated benchmarks systematically favor the model that created them. Using machine translation as our primary testbed, we find that self-bias arises from two additive sources, LLM-as-a-testset and LLM-as-an-evaluator, and their combination amplifies the effect. Crucially, even when test data is generated with explicit diversity controls, each model's implicit stylistic tendencies produce homogeneous, model-specific outputs that inflate its own scores. Increasing source text diversity, using our proposed diversity metric, partially mitigates this bias. Self-bias is strong enough to cause each model to rank itself first, overriding the peer-consensus ordering. We confirm that the phenomenon extends to open-ended generation on the Chatbot Arena task.

2509.21552 2026-05-27 cs.CV cs.CL

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Yu Zhao, Wei-Ning Chen, Huseyin Atahan Inan, Samuel Kessler, Lu Wang, Lukas Wutschitz, Fangkai Yang, Chaoyun Zhang, Pasquale Minervini, Saravan Rajmohan, Robert Sim

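Affiliations: University of Edinburgh; Microsoft

AI summary: Reframes GUI grounding as an interactive search task and trains the GUI-Cursor model with multi-step online reinforcement learning, using visual feedback from the cursor to improve spatial reasoning and surpassing strong baselines on GUI grounding and agentic tasks.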

Comments Accepted at ICML 2026

Abstract

Graphical User Interface (GUI) grounding is commonly framed as a coordinate prediction task -- given a natural language instruction, generate on-screen coordinates for actions such as clicks and keystrokes. However, recent Vision Language Models (VLMs) often fail to predict accurate numeric coordinates when processing GUI images with high resolutions and complex layouts. To address this issue, we reframe GUI grounding as an interactive search task, where the VLM generates actions to move a cursor in the GUI to locate UI elements. At each step, the model determines the target object, evaluates the spatial relations between the cursor and the target, and moves the cursor closer to the target conditioned on the movement history. In this interactive process, the rendered cursor provides visual feedback to help the model align its predictions with the corresponding on-screen locations. We train our GUI grounding model, GUI-Cursor, using multi-step online reinforcement learning with a dense trajectory-based reward function. Experimental results demonstrate that GUI-Cursor surpasses strong baselines in GUI grounding and agentic tasks, achieving superior performance with the same base models while requiring less training data. Further analysis shows that GUI-Cursor learns to adaptively conduct more steps on more difficult examples, and it obtains better spatial reasoning capability on out-of-distribution domains.

2506.08543 2026-05-27 cs.CV

Spectral Principal Paths: A Spectral Perspective on Linear Representation Formation in LLMs

Bowei Tian, Xuntao Lyu, Meng Liu, Hongyi Wang, Ang Li

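Affiliations: University of Maryland, College Park; North Carolina State University; Genbio AI

AI summary: Proposes the Input-Space Linearity Hypothesis and the Spectral Principal Path framework, using spectral theory to explain the formation and stability of linear representations in LLMs, with rigorous guarantees.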

Comments arXiv admin note: text overlap with arXiv:2503.22720

Abstract

High-level representations have become a central focus in enhancing AI transparency and control, shifting attention from individual neurons or circuits to structured semantic directions that align with human-interpretable concepts. While the Linear Representation Hypothesis (LRH) suggests that such directions emerge in representations, it remains unclear how these representations originate and why they become increasingly stable across layers. To solve this issue, we introduce the Input-Space Linearity Hypothesis, positing that concept-aligned directions originate in the input space and are steadily maintained with increasing depth. We then propose the Spectral Principal Path (SPP) framework, which formalizes how deep networks progressively distill linear representations along the spectral principal directions. We provide rigorous stability guarantees for the SPP based on the Wedin $\sin\Theta$ perturbation theorem, identifying testable conditions, including spectral gap and context incoherence, that jointly ensure layer-wise directional preservation. By bridging theoretical analysis with empirical evidence, this work identifies a spectral view of how linear representations arise in LLMs, and suggests potential implications for concept-level controllable, robust, and coherent approaches to fairness and transparency in modern AI systems.

2509.21167 2026-05-27 cs.LG cs.CV

A Unified Framework for Diffusion Model Unlearning with f-Divergence

Nicola Novello, Federico Fontana, Luigi Cinque, Deniz Gunduz, Andrea M. Tonello

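Affiliations: University of Klagenfurt, Austria; Sapienza University of Rome, Italy; Imperial College London, UK

AI summary: Proposes a unified f-divergence framework that generalizes the MSE loss used in diffusion-model concept unlearning to any f-divergence, and validates through theoretical analysis and experiments how different divergences affect unlearning.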

Comments Accepted at ICML 2026

Abstract

Most existing methods for concept unlearning in text-to-image diffusion models minimize a mean squared error (MSE) loss between the denoiser outputs conditioned on a target and an anchor concept, which is implicitly the KL divergence between two Gaussians. We generalize this objective to any $f$-divergence, recovering MSE as the KL instance, and identify a family of $\alpha$-divergences whose Gaussian closed-form yields cheap, MSE-like training objectives. For the remaining $f$-divergences, we provide a min-max objective based on the variational formulation of the $f$-divergence. We theoretically analyze and numerically validate how different $f$-divergences impact the gradient magnitude and the convergence properties of the algorithm, affecting the quality of unlearning. For instance, we observe that the Hellinger closed-form instance consistently dominates MSE across multiple scenarios. More generally, the proposed unified framework offers a flexible paradigm for selecting the optimal divergence based on the application and user goal, allowing for finer control over the trade-off between unlearning efficacy and generative fidelity.
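The link between MSE and KL that the abstract relies on can be made explicit with standard Gaussian identities. The block below assumes the two distributions share one isotropic covariance $\sigma^2 I$ and differ only in their means; the symbols $\mu_p$, $\mu_q$, and $\sigma$ are notation chosen here, and the paper's own parameterization may differ.

```latex
% Assumption: p = N(\mu_p, \sigma^2 I) and q = N(\mu_q, \sigma^2 I).
\begin{align}
D_{\mathrm{KL}}(p \,\|\, q)
  &= \frac{\lVert \mu_p - \mu_q \rVert_2^2}{2\sigma^2}, \\
H^2(p, q)
  &= 1 - \int \sqrt{p(x)\, q(x)}\, dx
   = 1 - \exp\!\left( -\frac{\lVert \mu_p - \mu_q \rVert_2^2}{8\sigma^2} \right).
\end{align}
```

Under this assumption the KL divergence is a rescaled squared error between the means, which is why an MSE loss on denoiser outputs can be read as a KL objective. The squared Hellinger distance depends on the same squared error but saturates at 1, so its gradient behaves differently when the two means are far apart.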

2509.21106 2026-05-27 cs.CL cs.IR

BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback

Hyunseo Kim, Sangam Lee, Kwangwook Seo, Dongha Lee

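Affiliations: Department of Artificial Intelligence, Yonsei University, Seoul, Republic of Korea

AI summary: Proposes the BESPOKE benchmark, which collects real user histories and pairs responses with fine-grained preference scores and feedback to systematically evaluate search-augmented LLMs on personalized information-seeking tasks.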

Comments Accepted to ICML 2026

Abstract

Search-augmented large language models (LLMs) have advanced information-seeking tasks by integrating retrieval into generation, reducing users' cognitive burden compared to traditional search systems. Yet they remain insufficient for fully addressing diverse user needs, which requires recognizing how the same query can reflect different intents across users and delivering information in preferred forms. While recent systems such as ChatGPT and Gemini attempt personalization by leveraging user histories, systematic evaluation of such personalization is under-explored. To address this gap, we propose BESPOKE, a realistic benchmark for evaluating personalization in search-augmented LLMs. BESPOKE is designed to be both realistic, by collecting authentic chat and search histories directly from humans, and diagnostic, by pairing responses with fine-grained preference scores and feedback. The benchmark is constructed through long-term, deeply engaged human annotation, where human annotators contributed their own histories, authored queries with detailed information needs, and evaluated responses with scores and diagnostic feedback. Leveraging BESPOKE, we conduct systematic analyses that reveal key requirements for effective personalization in information-seeking tasks, providing a foundation for fine-grained evaluation of personalized search-augmented LLMs. Our code and data are available at https://augustinlib.github.io/BESPOKE/.

2509.18919 2026-05-27 cs.CV

Advancing Metallic Surface Defect Detection via Anomaly-Guided Pretraining on a Large Industrial Dataset

Chuni Liu, Hongjie Li, Jiaqi Du, Yangyang Hou, Qian Sun, Lei Jin, Ke Xu

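Affiliations: Collaborative Innovation Center of Steel Technology, University of Science and Technology Beijing

AI summary: Proposes Anomaly-Guided Self-Supervised Pretraining (AGSSP), a two-stage framework that uses anomaly priors to guide representation learning and substantially improves metallic surface defect detection, with gains of up to 10% in mAP@0.5.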

Comments Accepted for publication in Pattern Recognition

Journal ref Pattern Recognition, Volume 179, Part C, 2026, 113788

Abstract

The pretraining-finetuning paradigm is a crucial strategy in metallic surface defect detection for mitigating the challenges posed by data scarcity. However, its implementation presents a critical dilemma. Pretraining on natural image datasets such as ImageNet faces a significant domain gap. Meanwhile, naive self-supervised pretraining on in-domain industrial data is often ineffective due to the inability of existing learning objectives to distinguish subtle defect patterns from complex background noise and textures. To resolve this, we introduce Anomaly-Guided Self-Supervised Pretraining (AGSSP), a novel paradigm that explicitly guides representation learning through anomaly priors. AGSSP employs a two-stage framework: (1) it first pretrains the model's backbone by distilling knowledge from anomaly maps, encouraging the network to capture defect-salient features; (2) it then pretrains the detector using pseudo-defect boxes derived from these maps, aligning it with localization tasks. To enable this, we develop a knowledge-enhanced method to generate high-quality anomaly maps and collect a large-scale industrial dataset of 120,000 images. Additionally, we present two small-scale, pixel-level labeled metallic surface defect datasets for validation. Extensive experiments demonstrate that AGSSP consistently enhances performance across various settings, achieving up to a 10% improvement in mAP@0.5 and 11.4% in mAP@0.5:0.95 compared to ImageNet-based models. All code, pretrained models, and datasets are publicly available at https://clovermini.github.io/AGSSP-Dev/.
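The reported metrics rest on box overlap. Under the common detection convention, which this sketch assumes the paper follows, mAP@0.5 counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5, and mAP@0.5:0.95 averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The helper name `iou` and the `(x1, y1, x2, y2)` box format are choices made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    overlap = iou((0, 0, 2, 2), (1, 0, 3, 2))
    print(overlap, overlap >= 0.5)
```

Two boxes of equal size that overlap by half have an IoU of one third, so they would not count as a match at the 0.5 threshold; this is why the stricter mAP@0.5:0.95 is more sensitive to localization quality.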

2509.18384 2026-05-27 cs.RO cs.FL

LAD-VF: LLM-Automatic Differentiation Enables Fine-Tuning-Free Robot Planning from Formal Methods Feedback

Yunhao Yang, Junyuan Hong, Gabriel Jacob Perin, Zhiwen Fan, Li Yin, Zhangyang Wang, Ufuk Topcu

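Affiliations: The University of Texas at Austin; University of São Paulo; Texas A&M University; SylphAI

AI summary: Proposes LAD-VF, a framework that uses formal verification feedback and LLM-AutoDiff to optimize prompts automatically, improving specification compliance in robot planning without fine-tuning and raising success rates from 60% to over 90%.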

Comments Presented at ICRA 2026

Abstract

Large language models (LLMs) can translate natural language instructions into executable action plans for robotics, autonomous driving, and other domains. Yet, deploying LLM-driven planning in the physical world demands strict adherence to safety and regulatory constraints, which current models often violate due to hallucination or weak alignment. Traditional data-driven alignment methods, such as Direct Preference Optimization (DPO), require costly human labeling, while recent formal-feedback approaches still depend on resource-intensive fine-tuning. In this paper, we propose LAD-VF, a fine-tuning-free framework that leverages formal verification feedback for automated prompt engineering. By introducing a formal-verification-informed text loss integrated with LLM-AutoDiff, LAD-VF iteratively refines prompts rather than model parameters. This yields three key benefits: (i) scalable adaptation without fine-tuning; (ii) compatibility with modular LLM architectures; and (iii) interpretable refinement via auditable prompts. Experiments in robot navigation and manipulation tasks demonstrate that LAD-VF substantially enhances specification compliance, improving success rates from 60% to over 90%. Our method thus presents a scalable and interpretable pathway toward trustworthy, formally verified LLM-driven control systems.

2508.18444 2026-05-27 cs.CL cs.AI

How Reliable are LLMs for Reasoning on the Re-ranking task?

Nafis Tanveer Islam, Zhiming Zhao

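Affiliations: Multiscale Networked Systems (MNS) Group, University of Amsterdam

AI summary: Analyzes how different training methods affect LLMs' semantic understanding in the re-ranking task, and investigates whether the models can generate more informed textual reasoning to overcome the challenges of transparency and limited data.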

Comments This chapter has been published in Advancements in AI From Foundations to Cross-Disciplinary Applications, Springer, 2026

Abstract

With the improving semantic understanding capability of Large Language Models (LLMs), they exhibit a greater awareness of and alignment with human values, but this comes at the cost of transparency. Although promising results are achieved via experimental analysis, an in-depth understanding of the LLM's internal workings is necessary to comprehend the reasoning behind the re-ranking, which provides end users with an explanation that enables them to make an informed decision. Moreover, in newly developed systems with limited user engagement and insufficient ranking data, accurately re-ranking content remains a significant challenge. While various training methods affect how LLMs are trained and how they generate inferences, our analysis has found that some training methods exhibit better explainability than others, implying that an accurate semantic understanding has not been learned through all training methods; instead, abstract knowledge has been gained to optimize evaluation, which raises questions about the true reliability of LLMs. Therefore, in this work, we analyze how different training methods affect the semantic understanding of the re-ranking task in LLMs and investigate whether these models can generate more informed textual reasoning to overcome the challenges of transparency of LLMs and limited training data. To analyze the LLMs for re-ranking tasks, we utilize a relatively small ranking dataset from the environment and the Earth science domain to re-rank retrieved content. Furthermore, we also analyze the explainable information to see if the re-ranking can be reasoned using explainability.

2504.08593 2026-05-27 cs.CV cs.AI

Hands-On: Segmenting Individual Signs from Continuous Sequences

JianHe Low, Harry Walsh, Ozge Mercanoglu Sincan, Richard Bowden

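Affiliations: CVSSP, University of Surrey

AI summary: Addresses continuous sign language segmentation with a Transformer-based architecture that uses HaMeR hand features and 3D angles and models temporal dynamics with a BIO tagging scheme, achieving state-of-the-art performance on the DGS Corpus.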

Comments Accepted in the 19th IEEE International Conference on Automatic Face and Gesture Recognition. Code Implementation Released

Journal ref IEEE 19th International Conference on Automatic Face and Gesture Recognition. (2025) 1-5

Abstract

This work tackles the challenge of continuous sign language segmentation, a key task with huge implications for sign language translation and data annotation. We propose a transformer-based architecture that models the temporal dynamics of signing and frames segmentation as a sequence labeling problem using the Begin-In-Out (BIO) tagging scheme. Our method leverages the HaMeR hand features, and is complemented with 3D Angles. Extensive experiments show that our model achieves state-of-the-art results on the DGS Corpus, while our features surpass prior benchmarks on BSLCorpus.
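Framing segmentation as sequence labeling means every video frame gets one tag. The sketch below shows one plausible reading of the Begin-In-Out scheme, where `B` marks the first frame of a sign, `I` the remaining frames of that sign, and `O` frames outside any sign. The helper names and the half-open `(start, end)` frame ranges are assumptions for illustration, not the authors' implementation.

```python
def segments_to_bio(num_frames, segments):
    """Convert sign segments to per-frame B/I/O tags.

    segments: list of (start, end) frame ranges, end exclusive, non-overlapping.
    """
    tags = ["O"] * num_frames
    for start, end in segments:
        if not 0 <= start < end <= num_frames:
            raise ValueError(f"invalid segment: {(start, end)}")
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def bio_to_segments(tags):
    """Recover (start, end) segments from a B/I/O tag sequence."""
    segments = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new sign begins; close the previous one if it is still open.
            if start is not None:
                segments.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                segments.append((start, i))
                start = None
        # An "I" tag simply extends the currently open segment.
    if start is not None:
        segments.append((start, len(tags)))
    return segments


if __name__ == "__main__":
    tags = segments_to_bio(8, [(1, 4), (4, 6)])
    print(tags)
    print(bio_to_segments(tags))
```

The separate `B` tag is what lets two signs that follow each other without a pause be told apart, which a plain in/out labeling could not do.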

2508.07996 2026-05-27 cs.CV

Structured Relational Reasoning for Group Activity Assessment

结构化关系推理用于群体活动评估

Thinesh Thiyakesan Ponbagavathi, Chengzheng Yang, Alina Roitberg

发表机构 * University of Stuttgart(斯图加特大学) University of Hildesheim(希尔德斯海姆大学)

AI总结 提出ProGraD框架,利用冻结视觉基础模型和轻量级GroupContext Transformer,通过结构化关系推理在单次前向传播中联合推断群体位置、成员关系和活动,仅用10M参数即在Cafe和Social-CAD基准上取得最优性能。

Comments Accepted to CVPR 2026 Workshop (SAUAFG)

AI中文摘要

群体活动检测(GAD)涉及识别视频中的社会群体及其集体行为。视觉基础模型(VFM),如DINOv2,提供优秀的特征,但是在以物体为中心的数据上预训练的。我们发现,将它们简单替换到现有GAD流程中实际上会降低性能,暴露出结构化的群体感知解码才是真正的瓶颈。我们提出了ProGraD,一个基于冻结VFM构建的结构化关系推理框架。其核心是一个轻量级的两层GroupContext Transformer,显式建模行为者-群体关联并聚合全局上下文以推断集体行为。可学习的群体提示作为最小条件机制,引导冻结骨干网络朝向社交相关表示,而关系解码器对行为者和群体执行核心推理。该设计在单次前向传播中联合推断群体位置、成员关系和活动,仅使用10M可训练参数,不到先前方法的一半。在具有多个并发社交群体的Cafe基准上,ProGraD将Group mAP@1.0提升了6.5%,Group mAP@0.5提升了8.2%。在Social-CAD上,它实现了最先进的社交和成员关系准确性。ProGraD还生成可解释的注意力图,为行为者-群体推理提供洞察。

英文摘要

Group Activity Detection (GAD) involves recognizing social groups and their collective behaviors in videos. Vision Foundation Models (VFMs), like DINOv2, offer excellent features but are pretrained on object-centric data. We find that naively substituting them into existing GAD pipelines actually degrades performance, exposing structured group-aware decoding as the true bottleneck. We introduce ProGraD, a structured relational-reasoning framework for GAD built on top of frozen VFMs. At its core is a lightweight two-layer GroupContext Transformer that explicitly models actor-group associations and aggregates global context to infer collective behavior. Learnable group prompts serve as a minimal conditioning mechanism to guide the frozen backbone toward socially relevant representations, while the relational decoder performs the core reasoning over actors and groups. This design jointly infers group locations, memberships, and activities in a single pass using only 10M trainable parameters, less than half of prior methods. On the Cafe benchmark with multiple concurrent social groups, ProGraD improves the state-of-the-art by 6.5% Group mAP@1.0 and 8.2% Group mAP@0.5. On Social-CAD, it achieves state-of-the-art social and membership accuracy. ProGraD further produces interpretable attention maps that provide insights into actor-group reasoning.

2506.00250 2026-05-27 cs.CL cs.IT math.IT

PersianMedQA: Evaluating Large Language Models on a Persian-English Bilingual Medical Question Answering Benchmark

PersianMedQA: 在波斯语-英语双语医学问答基准上评估大型语言模型

Mohammad Javad Ranjbar Kalahroodi, Amirhossein Sheikholselami, Sepehr Karimi, Sepideh Ranjbar Kalahroodi, Heshaam Faili, Azadeh Shakery

发表机构 * School of Electrical and Computer Engineering, University of Tehran(德黑兰大学电气与计算机工程学院) Shahid Beheshti University of Medical Sciences(沙希德·贝赫什提医学科学大学) Institute for Research in Fundamental Sciences (IPM)(基础科学研究所(IPM))

AI总结 本文提出PersianMedQA数据集,包含20,785道波斯语医学多选题,用于评估41个LLM在零样本和思维链设置下的双语医学推理能力,发现闭源通用模型表现最佳,而波斯语模型性能较差,且翻译会丢失文化临床线索。

Comments Accepted at LREC 2026 (The Fifteenth Language Resources and Evaluation Conference), Palma, Mallorca, Spain, May 2026

AI中文摘要

大型语言模型(LLM)在广泛的自然语言处理(NLP)基准测试中取得了显著性能,通常超越人类水平。然而,它们在医学等高风险领域中的可靠性,尤其是在低资源语言中,仍未得到充分探索。在这项工作中,我们引入了PersianMedQA,这是一个大规模数据集,包含来自14年伊朗国家医学考试的20,785道经过专家验证的波斯语医学多选题,涵盖23个医学专业,旨在评估LLM在波斯语和英语中的表现。我们对41个最先进的模型进行了基准测试,包括通用型、波斯语和医学LLM,在零样本和思维链(CoT)设置下。我们的结果表明,闭权重通用模型(例如GPT-4.1)持续优于所有其他类别,在波斯语中达到83.09%的准确率,在英语中达到80.7%,而波斯语LLM(例如Dorna)表现显著较差(例如波斯语中34.9%),通常在指令遵循和领域推理方面存在困难。我们还分析了翻译的影响,表明虽然英语性能通常更高,但3-10%的问题只能通过波斯语正确回答,因为翻译丢失了文化和临床上下文线索。最后,我们证明,如果没有强大的领域或语言适应,仅凭模型大小不足以获得稳健的性能。PersianMedQA为评估LLM中的双语和文化基础医学推理提供了基础。该数据集以及双语医学词典可在以下网址获取:https://huggingface.co/datasets/MohammadJRanjbar/PersianMedQA 。

英文摘要

Large Language Models (LLMs) have achieved remarkable performance on a wide range of Natural Language Processing (NLP) benchmarks, often surpassing human-level accuracy. However, their reliability in high-stakes domains such as medicine, particularly in low-resource languages, remains underexplored. In this work, we introduce PersianMedQA, a large-scale dataset of 20,785 expert-validated multiple-choice Persian medical questions from 14 years of Iranian national medical exams, spanning 23 medical specialties and designed to evaluate LLMs in both Persian and English. We benchmark 41 state-of-the-art models, including general-purpose, Persian, and medical LLMs, in zero-shot and chain-of-thought (CoT) settings. Our results show that closed-weight general models (e.g., GPT-4.1) consistently outperform all other categories, achieving 83.09% accuracy in Persian and 80.7% in English, while Persian LLMs such as Dorna underperform significantly (e.g., 34.9% in Persian), often struggling with both instruction-following and domain reasoning. We also analyze the impact of translation, showing that while English performance is generally higher, 3-10% of questions can only be answered correctly in Persian due to cultural and clinical contextual cues that are lost in translation. Finally, we demonstrate that model size alone is insufficient for robust performance without strong domain or language adaptation. PersianMedQA provides a foundation for evaluating bilingual and culturally grounded medical reasoning in LLMs. The dataset, along with a bilingual medical dictionary, is available: https://huggingface.co/datasets/MohammadJRanjbar/PersianMedQA .

2508.05305 2026-05-27 cs.CL

SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

SONAR-LLM:在句子嵌入中思考、以令牌说话的自回归Transformer

Nikita Dragunov, Temurbek Rahmatullaev, Elizaveta Goncharova, Nikita Kurdiukov, Aysel Mirzoeva, Anna Borisiuk, Andrey Kuznetsov, Anton Razzhigaev

发表机构 * FusionBrain Lab, AXXX(融合大脑实验室,AXXX) T-Tech MSU(莫斯科大学) DS-NLP Group, AXXX(自然语言处理组,AXXX)

AI总结 提出SONAR-LLM,一种在连续SONAR嵌入空间“思考”但通过令牌级交叉熵监督的解码器-only Transformer,融合语义抽象与似然训练信号,在39M至1.3B参数规模上取得有竞争力的生成质量。

AI中文摘要

最近提出的大型概念模型(LCM)通过预测句子级嵌入序列并使用均方误差或扩散目标进行训练来生成文本。我们提出了SONAR-LLM,一种仅在连续SONAR嵌入空间中“思考”的解码器-only Transformer,但通过冻结的SONAR解码器传播的令牌级交叉熵进行监督。这种混合目标保留了LCM的语义抽象,同时消除了其扩散采样器并恢复了基于似然的训练信号。在从39M到1.3B参数的模型规模上,SONAR-LLM取得了有竞争力的生成质量。我们报告了扩展趋势、消融实验、基准测试结果,并发布了完整的训练代码和所有预训练检查点,以促进可重复性和未来研究。

英文摘要

The recently proposed Large Concept Model (LCM) generates text by predicting a sequence of sentence-level embeddings and training with either mean-squared error or diffusion objectives. We present SONAR-LLM, a decoder-only transformer that "thinks" in the same continuous SONAR embedding space, yet is supervised through token-level cross-entropy propagated via the frozen SONAR decoder. This hybrid objective retains the semantic abstraction of LCM while eliminating its diffusion sampler and restoring a likelihood-based training signal. Across model sizes from 39M to 1.3B parameters, SONAR-LLM attains competitive generation quality. We report scaling trends, ablations, benchmark results, and release the complete training code and all pretrained checkpoints to foster reproducibility and future research.

2508.00748 2026-05-27 cs.CV cs.AI cs.CR cs.MM

Is It Really You? Exploring Biometric Verification Scenarios in Photorealistic Talking-Head Avatar Videos

真的是你吗?探索逼真说话头像视频中的生物特征验证场景

Laura Pedrouzo-Rodriguez, Pedro Delgado-DeRobles, Luis F. Gomez, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez

发表机构 * Biometrics and Data Pattern Analytics Lab(生物特征与数据模式分析实验室)

AI总结 本文研究在逼真说话头像视频中,利用面部运动模式作为行为生物特征进行身份验证,提出基于图卷积网络的轻量级模型,AUC接近80%。

Comments Accepted at the IEEE International Joint Conference on Biometrics (IJCB 2025)

Journal ref 2025 IEEE International Joint Conference on Biometrics (IJCB)

AI中文摘要

逼真说话头像在虚拟会议、游戏和社交平台中越来越常见。这些头像允许更沉浸式的交流,但也引入了严重的安全风险。一个新兴威胁是冒充:攻击者可以窃取用户的头像,保留其外观和声音,使得仅凭视觉或听觉几乎无法检测欺诈性使用。在本文中,我们探讨了在这种头像中介场景中生物特征验证的挑战。我们的主要问题是,当头像的视觉外观是其主人的复制品时,个体的面部运动模式能否作为可靠的行为生物特征来验证其身份。为了回答这个问题,我们引入了一个新的数据集,其中包含使用最先进的一次性头像生成模型GAGAvatar创建的逼真头像视频,包括真实和冒充的头像视频。我们还提出了一种轻量级、可解释的时空图卷积网络架构,具有时间注意力池化,仅使用面部标志点来建模动态面部手势。实验结果表明,面部运动线索能够实现有意义的身份验证,AUC值接近80%。所提出的基准和生物特征系统可供研究社区使用,以引起对基于头像的通信系统中更高级行为生物特征防御的迫切需求的关注。

英文摘要

Photorealistic talking-head avatars are becoming increasingly common in virtual meetings, gaming, and social platforms. These avatars allow for more immersive communication, but they also introduce serious security risks. One emerging threat is impersonation: an attacker can steal a user's avatar, preserving their appearance and voice, making it nearly impossible to detect its fraudulent usage by sight or sound alone. In this paper, we explore the challenge of biometric verification in such avatar-mediated scenarios. Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner. To answer this question, we introduce a new dataset of realistic avatar videos created using a state-of-the-art one-shot avatar generation model, GAGAvatar, with genuine and impostor avatar videos. We also propose a lightweight, explainable spatio-temporal Graph Convolutional Network architecture with temporal attention pooling, which uses only facial landmarks to model dynamic facial gestures. Experimental results demonstrate that facial motion cues enable meaningful identity verification with AUC values approaching 80%. The proposed benchmark and biometric system are available to the research community in order to bring attention to the urgent need for more advanced behavioral biometric defenses in avatar-based communication systems.

2508.01253 2026-05-27 cs.CV

ODOV: Benchmark the Open-Domain Open-Vocabulary Object Detection

ODOV:开放域开放词汇目标检测基准

Yupeng Zhang, Ruize Han, Fangnan Zhou, Wei Feng, Liang Wan

发表机构 * College of Intelligence and Computing, Tianjin University(天津大学智能计算学院) Key Research Center for Surface Monitoring and Analysis of Relics, State Administration of Cultural Heritage(文物表面监测与分析国家重点研究中心) Faculty of Computer Science and Artificial Intelligence, Shenzhen University of Advanced Technology(深圳先进技术大学计算机科学与人工智能学院)

AI总结 针对真实场景中域偏移和类别偏移同时发生的问题,提出开放域开放词汇目标检测任务,构建OD-LVIS基准数据集,并设计基于VLM的基线方法,通过域无关类别提示和域投影嫁接模块提升检测性能。

AI中文摘要

现有研究通常将域偏移和类别偏移作为独立问题进行研究,然而在真实场景中,这两种偏移常常同时发生并相互作用,导致检测性能显著下降。为了解决这一问题,我们提出并系统研究了一个新问题——开放域开放词汇(ODOV)目标检测,旨在评估模型在真实环境中适应复合域和类别偏移的能力。我们构建了一个新的基准数据集OD-LVIS,包含来自15个不同真实场景的46,949张图像和1,203个类别,用于评估目标检测性能。此外,我们提出了一种新的ODOV检测基线,充分利用VLM强大的多模态对齐能力,并引入两种关键机制以增强类别和域泛化能力。一种是域无关类别提示(DAPmt),它在增强类别语义的同时减弱域表示,从而实现纯粹的类别表示。另一种是域投影与嫁接(DP&G)模块,它融合了输入图像中的域特定特征,使模型能够动态地在各种开放域中进行泛化。这两个组件使模型能够在真实场景中同时存在类别和域变化的情况下保持有效的检测性能。我们为提出的ODOV检测任务提供了广泛的基准评估,并报告了实验结果。这些结果验证了ODOV任务的合理性、OD-LVIS数据集的实用性以及该方法的优越性。

英文摘要

Existing studies typically investigate domain shift and category shift as independent problems; however, in real-world scenarios, the two types of shifts often occur simultaneously and interact, leading to significant degradation in detection performance. To address this, we propose and systematically study a novel problem, Open-Domain Open-Vocabulary (ODOV) object detection, which aims to evaluate a model's ability to adapt to the compound domain and category shifts in real-world environments. We construct a new benchmark, OD-LVIS, which contains 46,949 images spanning 15 diverse real-world scenarios and 1,203 categories, for assessing object detection performance. Furthermore, we propose a novel ODOV detection baseline that fully leverages VLM's powerful multi-modal alignment capabilities and introduces two key mechanisms to enhance both category and domain generalization. One is the Domain-Agnostic Category Prompt (DAPmt), which strengthens category semantics while attenuating domain representations, enabling pure category representation. The other is the Domain Projection and Grafting (DP&G) module, which incorporates domain-specific features from input images, allowing the model to dynamically generalize across diverse open domains. These two components enable the model to maintain effective detection performance under simultaneous category and domain variations in real-world scenarios. We provide extensive benchmark evaluations for the proposed ODOV detection task and report experimental results. These results validate the soundness of the ODOV task, the practicality of the OD-LVIS dataset, and the superiority of the method.

2507.13762 2026-05-27 cs.LG q-bio.BM

MolPIF: A Parameter Interpolation Flow Model for Molecule Generation

MolPIF: 一种用于分子生成的参数插值流模型

Yaowei Jin, Junjie Wang, Yufan Tang, Wenkai Xiang, Duanhua Cao, Dan Teng, Zhehuan Fan, Jiacheng Xiong, Xia Sheng, Chuanlong Zeng, Duo An, Mingyue Zheng, Shuangjia Zheng, Qian Shi

发表机构 * Lingang Laboratory(临港实验室) School of Information Science and Technology(信息科学与技术学院) ShanghaiTech University(上海科技大学) Drug Discovery and Design Center, State Key Laboratory of Drug Research(药物发现与设计中心、国家药物研究重点实验室) Shanghai Institute of Materia Medica, Chinese Academy of Sciences(中国科学院上海药物研究所) Global Institute of Future Technology(未来技术全球研究院) Shanghai Jiao Tong University(上海交通大学) College of Computer Science and Artificial Intelligence(计算机科学与人工智能学院)

AI总结 提出参数插值流模型MolPIF,通过参数空间分布插值统一连续坐标与离散原子类型的生成,在CrossDocked2020数据集上优于基线方法。

Comments Accepted to Bioinformatics

AI中文摘要

动机:基于结构的药物设计(SBDD)随着深度生成模型的发展而进步,但弥合连续原子坐标与离散原子类型之间的差距仍然是一个挑战。当前的方法,如扩散和流匹配模型,通常未能统一这些异质模态,依赖于分离的策略或对离散变量不合适的欧几里得度量。缺乏一致的框架限制了生成模型捕捉蛋白质-配体复合物的几何和化学结构的能力。结果:我们提出了MolPIF,一种参数插值流机制,旨在统一连续和离散分子变量的生成。与在样本空间中运行的传统流模型不同,MolPIF在参数空间中对分布进行插值,理论上恢复了连续坐标的Wasserstein-2最优传输,并建立了离散原子类型的Fisher-Rao测地线。我们进一步整合了几何增强学习策略,以改善原子上下文的捕捉。在CrossDocked2020数据集上的广泛评估表明,MolPIF在结合亲和力、化学有效性、几何保真度和化学空间覆盖方面优于基线。此外,MolPIF在先导优化中表现出多功能性,并提供灵活的先验分布选择(如Laplace),为SBDD建立了一个稳健的范式。可用性:源代码可在https://github.com/BLEACH366/MolPIF免费获取。补充信息:补充数据可在Bioinformatics上获取。

英文摘要

Motivation: Structure-based drug design (SBDD) has advanced with deep generative models, but bridging the gap between continuous atomic coordinates and discrete atom types remains a challenge. Current approaches, such as diffusion and flow matching models, often fail to unify these heterogeneous modalities, relying on separate strategies or ill-fitting Euclidean metrics for discrete variables. This lack of a consistent framework limits generative models' ability to capture the geometric and chemical structure of protein-ligand complexes. Results: We present MolPIF, a parameter interpolation flow mechanism designed to unify the generation of continuous and discrete molecular variables. Unlike traditional flow models that operate in sample space, MolPIF interpolates between distributions in the parameter space, theoretically recovering Wasserstein-2 optimal transport for continuous coordinates and establishing Fisher-Rao geodesics for discrete atom types. We further incorporate a geometry-enhanced learning strategy to improve the capture of atomic contexts. Extensive evaluations on the CrossDocked2020 dataset demonstrate that MolPIF outperforms baselines in binding affinity, chemical validity, geometric fidelity and chemical space coverage. Additionally, MolPIF exhibits versatility in lead optimization and offers flexible prior distribution selection (such as Laplace), establishing a robust paradigm for SBDD. Availability: Source code is freely available at https://github.com/BLEACH366/MolPIF. Supplementary information: Supplementary data are available at Bioinformatics.

2507.20758 2026-05-27 cs.AI

How Chain-of-Thought Works? Tracing Information Flow from Decoding, Projection, and Activation

思维链如何工作?从解码、投影和激活追踪信息流

Hao Yang, Qinghua Zhao, Lei Li, Lingyi Meng, Mengda Yu

发表机构 * State Key Laboratory for Novel Software Technology, Nanjing University(南京大学新型软件技术国家重点实验室) School of Artificial Intelligence and Big Data, Hefei University(合肥大学人工智能与大数据学院) School of Artificial Intelligence, Beijing Institute of Technology(北京理工大学人工智能学院) School of Computing and Information, University of Pittsburgh(匹兹堡大学计算机与信息学院) Center for Biostatistics, The Ohio State University Wexner Medical Center(俄亥俄州立大学韦克斯纳医学中心生物统计中心)

AI总结 通过反向追踪解码、投影和激活阶段的信息流,揭示思维链作为解码空间剪枝器的作用,并发现其以任务依赖方式调节神经元激活。

Comments Accepted by ACL 2026

AI中文摘要

思维链提示显著增强了模型推理能力,但其内部机制仍知之甚少。我们通过反向追踪解码、投影和激活阶段的信息流来分析CoT的操作原理。我们的定量分析表明,CoT可能作为解码空间剪枝器,利用答案模板引导输出生成,更高的模板遵循度与性能提升强相关。此外,我们惊讶地发现CoT以任务依赖方式调节神经元参与:在开放领域任务中减少神经元激活,而在封闭领域场景中增加激活。这些发现提供了一个新颖的机制可解释性框架,并为实现有针对性的CoT干预以设计更高效和鲁棒的提示提供了关键见解。我们在https://anonymous.4open.science/r/cot-D247发布了代码和数据。

英文摘要

Chain-of-Thought (CoT) prompting significantly enhances model reasoning, yet its internal mechanisms remain poorly understood. We analyze CoT's operational principles by reversely tracing information flow across decoding, projection, and activation phases. Our quantitative analysis suggests that CoT may serve as a decoding space pruner, leveraging answer templates to guide output generation, with higher template adherence strongly correlating with improved performance. Furthermore, we surprisingly find that CoT modulates neuron engagement in a task-dependent manner: reducing neuron activation in open-domain tasks, yet increasing it in closed-domain scenarios. These findings offer a novel mechanistic interpretability framework and critical insights for enabling targeted CoT interventions to design more efficient and robust prompts. We released our code and data at https://anonymous.4open.science/r/cot-D247.

2507.16116 2026-05-27 cs.CV

Pusa V1.0: Unlocking Temporal Control in Pretrained Video Diffusion Models via Vectorized Timestep Adaptation

Pusa V1.0: 通过向量化时间步长适应解锁预训练视频扩散模型中的时间控制

Yaofang Liu, Yumeng Ren, Aitor Artola, Yuxuan Hu, Xiaodong Cun, Xiaotong Zhao, Alan Zhao, Raymond H. Chan, Suiyun Zhang, Rui Liu, Dandan Tu, Jean-Michel Morel

发表机构 * City University of Hong Kong(香港城市大学) The Chinese University of Hong Kong(香港中文大学) Huawei Research(华为研究) Great Bay University(大湾大学) AI Technology Center, Tencent PCG(腾讯AI技术中心) Lingnan University(岭南大学) Hong Kong Centre for Cerebro-Cardiovascular Health Engineering(香港脑心血管健康工程中心)

AI总结 提出向量化时间步长适应(VTA)方法,在统一视频扩散框架中实现细粒度时间控制,零样本完成图像到视频生成、起止帧控制等任务,且不破坏基础模型能力。

Comments Code is open-sourced at https://github.com/Yaofang-Liu/Pusa-VidGen

AI中文摘要

视频扩散模型的快速发展受到时间建模基本限制的阻碍,特别是传统标量时间步长变量导致的帧演化刚性同步。尽管任务特定适应和自回归模型试图解决这些挑战,但它们仍受限于计算效率低下、灾难性遗忘或适用性狭窄。在这项工作中,我们提出了Pusa V1.0,一个利用向量化时间步长适应(VTA)在统一视频扩散框架中实现细粒度时间控制的通用模型。注意,VTA是一种非破坏性适应,意味着它完全保留了基础模型的能力。与Wan-I2V等传统方法(通过大量资源微调基础文本到视频(T2V)模型以进行图像到视频(I2V))不同,我们在基于VTA的超高效微调过程后以零样本方式实现了可比结果。此外,该方法还同时解锁了许多其他零样本能力,例如起止帧和视频扩展,所有这些都不需要任务特定训练。同时,它保留了基础模型的T2V能力。机制分析还表明,我们的方法保留了基础模型的生成先验,同时精确注入时间动态,避免了向量化时间步长固有的组合爆炸。这项工作为下一代视频合成建立了一个可扩展、高效且通用的范式,使高保真视频生成在研究和工业领域得以普及。

英文摘要

The rapid advancement of video diffusion models has been hindered by fundamental limitations in temporal modeling, particularly the rigid synchronization of frame evolution imposed by conventional scalar timestep variables. While task-specific adaptations and autoregressive models have sought to address these challenges, they remain constrained by computational inefficiency, catastrophic forgetting, or narrow applicability. In this work, we present Pusa V1.0, a versatile model that leverages vectorized timestep adaptation (VTA) to enable fine-grained temporal control within a unified video diffusion framework. Note that VTA is a non-destructive adaptation, which means that it fully preserves the capabilities of the base model. Unlike conventional methods like Wan-I2V, which finetune a base text-to-video (T2V) model with abundant resources to do image-to-video (I2V), we achieve comparable results in a zero-shot manner after an ultra-efficient finetuning process based on VTA. Moreover, this method also unlocks many other zero-shot capabilities simultaneously, such as start-end frames and video extension, all without task-specific training. Meanwhile, it keeps the T2V capability from the base model. Mechanistic analyses also reveal that our approach preserves the foundation model's generative priors while surgically injecting temporal dynamics, avoiding the combinatorial explosion inherent to the vectorized timestep. This work establishes a scalable, efficient, and versatile paradigm for next-generation video synthesis, democratizing high-fidelity video generation for research and industry alike.
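
The contrast between a scalar timestep and a vectorized one can be sketched as follows. This is a toy illustration only: it assumes a linear blend between clean frames and noise (a flow-matching-style rule, which the abstract does not specify), and the helper names `noise_frames`, `scalar_schedule`, and `image_to_video_schedule` are hypothetical rather than part of the Pusa code base.

```python
from typing import List, Sequence

Frame = List[float]  # a frame is flattened to a list of values for this toy example


def noise_frames(clean: Sequence[Frame], noise: Sequence[Frame],
                 timesteps: Sequence[float]) -> List[Frame]:
    """Blend each frame with noise using its own timestep t in [0, 1].

    t = 0 leaves the frame clean and t = 1 replaces it with pure noise.
    The linear blend is an assumed noising rule for illustration.
    """
    if not (len(clean) == len(noise) == len(timesteps)):
        raise ValueError("need one noise frame and one timestep per clean frame")
    return [
        [(1.0 - t) * c + t * n for c, n in zip(clean_frame, noise_frame)]
        for clean_frame, noise_frame, t in zip(clean, noise, timesteps)
    ]


def scalar_schedule(num_frames: int, t: float) -> List[float]:
    """Conventional setting: every frame shares one timestep."""
    return [t] * num_frames


def image_to_video_schedule(num_frames: int, t: float) -> List[float]:
    """Vectorized setting (assumed usage): keep frame 0 clean as the conditioning image."""
    return [0.0] + [t] * (num_frames - 1)
```

With a per-frame vector, the same model input format can express conditioning on a start frame, on start and end frames, or on a video prefix simply by choosing which entries are held at zero.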

2507.06513 2026-05-27 cs.CV

What Demands Attention in Urban Street Scenes? From Scene Understanding towards Road Safety: A Survey of Vision-driven Datasets and Studies

城市街景中什么需要关注?从场景理解到道路安全:视觉驱动数据集与研究的综述

Yaoqi Huang, Julie Stephany Berrio, Mao Shan, Stewart Worrall

发表机构 * Australian Centre For Robotics (ACFR), The University of Sydney(澳大利亚机器人中心(ACFR)、悉尼大学)

AI总结 本文通过系统分类交通场景中需要关注的关键元素,全面分析35个视觉驱动任务和73个数据集,提出统一分析框架,旨在促进道路安全研究。

Comments 40 tasks, 78 datasets

AI中文摘要

基于视觉的传感器和计算机视觉算法的进步显著提升了对交通场景的分析与理解。为促进这些进步在道路安全中的应用,本综述系统分类了交通场景中需要关注的关键元素,并全面分析了现有的视觉驱动任务和数据集。与现有聚焦于孤立领域的综述相比,我们的分类法将值得关注的交通实体分为两大类:异常实体和正常但关键的实体,整合了十个类别和二十个子类。它建立了内在相关领域之间的联系,并提供了统一的分析框架。我们的综述重点分析了35个视觉驱动任务,并基于提出的分类法对73个可用数据集进行了全面检查和可视化。跨领域调查涵盖了每个基准的优缺点,旨在提供标准统一和资源优化的信息。文章最后系统讨论了现有弱点,从不同角度强调了潜在影响和有前景的解决方案。集成的分类法、全面分析和总结性表格为这一快速发展的领域提供了宝贵贡献,为研究人员提供了整体概览,指导战略性资源选择,并突出了关键研究空白。

英文摘要

Advances in vision-based sensors and computer vision algorithms have significantly improved the analysis and understanding of traffic scenarios. To facilitate the use of these improvements for road safety, this survey systematically categorizes the critical elements that demand attention in traffic scenarios and comprehensively analyzes available vision-driven tasks and datasets. Compared to existing surveys that focus on isolated domains, our taxonomy categorizes attention-worthy traffic entities into two main groups, anomalies and normal but critical entities, integrating ten categories and twenty subclasses. It establishes connections between inherently related fields and provides a unified analytical framework. Our survey highlights the analysis of 35 vision-driven tasks and comprehensive examinations and visualizations of 73 available datasets based on the proposed taxonomy. The cross-domain investigation covers the pros and cons of each benchmark with the aim of providing information on standards unification and resource optimization. Our article concludes with a systematic discussion of the existing weaknesses, underlining the potential effects and promising solutions from various perspectives. The integrated taxonomy, comprehensive analysis, and recapitulatory tables serve as valuable contributions to this rapidly evolving field by providing researchers with a holistic overview, guiding strategic resource selection, and highlighting critical research gaps.

2507.05757 2026-05-27 cs.CV

Normal Patch Retinex Robust Algorithm for White Balancing in Digital Microscopy

Normal Patch Retinex 稳健算法用于数字显微镜白平衡

Radoslaw Roszczyk, Artur Krupa, Izabella Antoniuk

发表机构 * Faculty of Electrical Engineering(电气工程学院) Institute of Information Technology(信息技术研究所)

AI总结 提出一种基于Normal Patch Retinex的全自动白平衡算法,用于校正数字显微镜彩色图像,实验证明其优于经典算法。

Journal ref Vol. 29 No. 1/4 (2020)

AI中文摘要

在光学显微镜中获取准确彩色、平衡的图像即使对于经验丰富的显微镜操作者也可能是一个挑战。本文提出了一种完全自动的白平衡机制,能够充分校正显微彩色图像。该算法的结果已在200张显微图像数据集上通过实验验证。这些图像包含病理形态学中常用的三种显微标本的扫描图。此外,将所得结果与数字摄影中其他常用的白平衡算法进行了比较。本文应用的算法对于苏木精-荧光桃红-番红染色的显微图像和免疫组织化学染色图像比彩色摄影中使用的经典算法更有效。

英文摘要

Acquiring accurately coloured, balanced images in an optical microscope can be a challenge even for experienced microscope operators. This article presents a fully automatic white-balancing mechanism that adequately corrects microscopic colour images. The results of the algorithm have been confirmed experimentally on a set of two hundred microscopic images. The images contained scans of three microscopic specimens commonly used in pathomorphology. The results were also compared with those of other white balance algorithms commonly used in digital photography. For microscopic images stained with hematoxylin-phloxine-saffron and for immunohistochemical staining images, the algorithm applied in this work is more effective than the classical algorithms used in colour photography.

2506.23149 2026-05-27 cs.CL

AlignEvoSkill: Towards Knowledge-Aware and Task-Aligned Agent Skill Evolution

AlignEvoSkill: 迈向知识感知与任务对齐的智能体技能进化

Dingzirui Wang, Xuanliang Zhang, Keyan Xu, Qingfu Zhu, Wanxiang Che, Yang Deng

发表机构 * Singapore Management University(新加坡管理大学)

AI总结 提出AlignEvoSkill框架,通过联合建模知识覆盖和任务对齐,从失败轨迹中识别知识标签、检索并适配候选技能,再基于知识覆盖和任务对齐分数筛选高质量技能,在3个基准和4个LLM骨干上相对提升34.7%,实现技能进化新SOTA且成本更低。

AI中文摘要

可重用技能在提升基于LLM的智能体中扮演关键角色,但现有技能进化方法往往无法确保进化后的技能既覆盖任务所需的知识,又与目标任务保持对齐。结果,进化后的技能可能不完整或无关。为解决这一局限,我们提出AlignEvoSkill,一个联合建模知识覆盖和任务对齐的技能进化框架。给定失败的任务轨迹,AlignEvoSkill首先识别与任务相关的知识标签,检索互补的先前技能,并将它们适配为弥补缺失知识的候选技能。然后,它使用基于知识覆盖和任务对齐分数的联合过滤标准选择高质量候选技能。在3个基准和4个LLM骨干上的实验表明,AlignEvoSkill相对于非进化基线实现了34.7%的相对增益,并以更低的成本实现了技能进化的新SOTA。

英文摘要

Reusable skills play a key role in improving LLM-based agents, but existing skill-evolution methods often fail to ensure that evolved skills both cover the knowledge required by the task and remain aligned with the target task. As a result, evolved skills could be incomplete or irrelevant. To address this limitation, we propose AlignEvoSkill, a skill-evolution framework that jointly models knowledge coverage and task alignment. Given failed task trajectories, AlignEvoSkill first identifies task-relevant knowledge tags, retrieves complementary prior skills, and adapts them into candidate skills that address missing knowledge. It then selects high-quality candidates using a joint filtering criterion based on knowledge-coverage and task-alignment scores. Experiments on 3 benchmarks with 4 LLM backbones show that AlignEvoSkill achieves a 34.7% relative gain over the non-evolution baseline and sets a new SOTA in skill evolution at lower cost.

2506.21443 2026-05-27 cs.CL cs.AI

Domain Knowledge-Enhanced LLMs for Fraud and Concept Drift Detection

领域知识增强的大语言模型用于欺诈和概念漂移检测

Ali Şenol, Garima Agrawal, Huan Liu

发表机构 * School of Computing and Augmented Intelligence (SCAI), Arizona State University (ASU)(计算与增强智能学院(SCAI),亚利桑那州立大学) Department of Computer Engineering, Tarsus University(计算机工程系,塔尔苏斯大学) Minerva CQ and HumaConn AI Consulting(Minerva CQ和HumaConn人工智能咨询)

AI总结 提出一种领域知识增强的大语言模型框架,通过集成结构化领域知识和漂移检测单元,实现高准确率的欺诈对话检测和概念漂移分类。

AI中文摘要

在动态平台上检测欺骗性对话变得越来越困难,原因是语言模式的演变和概念漂移(CD)——即随着时间推移,语义或主题的转变会改变交互的上下文或意图。这些转变可能掩盖恶意意图或模仿正常对话,使得准确分类具有挑战性。尽管大语言模型(LLMs)在自然语言任务中表现出色,但在风险敏感场景中,它们常常面临上下文模糊和幻觉问题。为了解决这些挑战,我们提出了一个领域知识(DK)增强的LLM框架,该框架将预训练的LLM与结构化的、任务特定的见解相结合,以执行欺诈和概念漂移检测。所提出的架构由三个主要组件组成:(1)一个DK-LLM模块,用于检测虚假或欺骗性对话;(2)一个漂移检测单元(OCDD),用于判断是否发生了语义转变;(3)第二个DK-LLM模块,用于将漂移分类为良性或欺诈性。我们首先使用虚假评论数据集验证领域知识的价值,然后将我们的完整框架应用于SEConvo,一个包含多种欺诈和垃圾攻击的多轮对话数据集。结果表明,我们的系统能够高精度地检测虚假对话,并有效分类漂移的性质。在结构化提示的引导下,基于LLaMA的实现达到了98%的分类准确率。与零样本基线的对比研究表明,在高风险NLP应用中,融入领域知识和漂移意识显著提高了性能、可解释性和鲁棒性。

英文摘要

Detecting deceptive conversations on dynamic platforms is increasingly difficult due to evolving language patterns and Concept Drift (CD), i.e., semantic or topical shifts that alter the context or intent of interactions over time. These shifts can obscure malicious intent or mimic normal dialogue, making accurate classification challenging. While Large Language Models (LLMs) show strong performance in natural language tasks, they often struggle with contextual ambiguity and hallucinations in risk-sensitive scenarios. To address these challenges, we present a Domain Knowledge (DK)-Enhanced LLM framework that integrates pretrained LLMs with structured, task-specific insights to perform fraud and concept drift detection. The proposed architecture consists of three main components: (1) a DK-LLM module to detect fake or deceptive conversations; (2) a drift detection unit (OCDD) to determine whether a semantic shift has occurred; and (3) a second DK-LLM module to classify the drift as either benign or fraudulent. We first validate the value of domain knowledge using a fake review dataset and then apply our full framework to SEConvo, a multi-turn dialogue dataset that includes various types of fraud and spam attacks. Results show that our system detects fake conversations with high accuracy and effectively classifies the nature of drift. Guided by structured prompts, the LLaMA-based implementation achieves 98% classification accuracy. Comparative studies against zero-shot baselines demonstrate that incorporating domain knowledge and drift awareness significantly improves performance, interpretability, and robustness in high-stakes NLP applications.

2506.17633 2026-05-27 cs.CV cs.AI

Adaptive Multi-prompt Contrastive Network for Few-shot Out-of-distribution Detection

自适应多提示对比网络用于少样本分布外检测

Xiang Fang, Arvind Easwaran, Blaise Genest

发表机构 * College of Computing and Data Science, Nanyang Technological University, Singapore(南洋理工大学计算与数据科学学院,新加坡)

AI总结 针对少样本分布外检测问题,提出自适应多提示对比网络(AMCN),通过CLIP学习可学习文本提示和类间/类内分布,实现ID-OOD分离边界自适应。

Comments Published in ICML 2025

AI中文摘要

分布外(OOD)检测旨在区分异常样本,以防止在分布内(ID)数据集上训练的模型产生不可用的输出。大多数OOD检测方法需要大量IID样本进行训练,这严重限制了它们的实际应用。为此,我们针对一个具有挑战性的场景:少样本OOD检测,其中只有少量标记的ID样本可用。因此,少样本OOD检测比传统的OOD检测设置更具挑战性。先前的少样本OOD检测工作忽略了不同类别之间的显著多样性。在本文中,我们提出了一种新颖的网络:自适应多提示对比网络(AMCN),它通过学习类间和类内分布来适应ID-OOD分离边界。为了弥补OOD的缺失和ID图像样本的稀缺,我们利用CLIP连接文本与图像,设计可学习的ID和OOD文本提示。具体来说,我们首先生成自适应提示(可学习ID提示、标签固定OOD提示和标签自适应OOD提示)。然后,我们通过引入类级阈值为每个类生成自适应类边界。最后,我们提出一个提示引导的ID-OOD分离模块来控制ID和OOD提示之间的间隔。实验结果表明,AMCN优于其他最先进的工作。

英文摘要

Out-of-distribution (OOD) detection attempts to distinguish outlier samples to prevent models trained on the in-distribution (ID) dataset from producing unavailable outputs. Most OOD detection methods require many IID samples for training, which seriously limits their real-world applications. To this end, we target a challenging setting: few-shot OOD detection, where only a few labeled ID samples are available. Therefore, few-shot OOD detection is much more challenging than the traditional OOD detection setting. Previous few-shot OOD detection works ignore the distinct diversity between different classes. In this paper, we propose a novel network: Adaptive Multi-prompt Contrastive Network (AMCN), which adapts the ID-OOD separation boundary by learning inter- and intra-class distribution. To compensate for the absence of OOD and scarcity of ID image samples, we leverage CLIP, connecting text with images, engineering learnable ID and OOD textual prompts. Specifically, we first generate adaptive prompts (learnable ID prompts, label-fixed OOD prompts and label-adaptive OOD prompts). Then, we generate an adaptive class boundary for each class by introducing a class-wise threshold. Finally, we propose a prompt-guided ID-OOD separation module to control the margin between ID and OOD prompts. Experimental results show that AMCN outperforms other state-of-the-art works.
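
The idea of a class-wise threshold can be made concrete with a small sketch. It is a generic decision rule written under stated assumptions, not AMCN's learned boundary: the per-class scores stand in for image-to-prompt similarities, the thresholds are supplied by hand rather than learned, and the function name `classify_with_class_thresholds` is hypothetical.

```python
from typing import Dict, Tuple


def classify_with_class_thresholds(scores: Dict[str, float],
                                   thresholds: Dict[str, float],
                                   default_threshold: float = 0.5) -> Tuple[str, bool]:
    """Return (best-scoring class, is_in_distribution).

    A sample counts as in-distribution only if its best class score reaches
    the threshold of that particular class, so classes with tighter or looser
    score distributions can have different acceptance boundaries.
    """
    if not scores:
        raise ValueError("scores must contain at least one class")
    best_class = max(scores, key=scores.get)
    threshold = thresholds.get(best_class, default_threshold)
    return best_class, scores[best_class] >= threshold
```

A single global threshold would accept or reject both samples in the test cases below on the same cut-off; the per-class version can reject a fairly confident "cat" score while accepting a lower "dog" score.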

2506.11253 2026-05-27 cs.CV cs.LG

Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models

将数据追踪的机器遗忘提升为基础模型的知识追踪

Yuwen Tan, Boqing Gong

发表机构 * Boston University(波士顿大学)

AI总结 本文提出将数据追踪的机器遗忘提升为基础模型的知识追踪,以应对多样化遗忘请求,并更接近人类遗忘机制,通过视觉语言模型案例展示实现范式。

Comments Accepted to TMLR

AI中文摘要

机器遗忘从AI模型中移除特定训练数据点及其影响(例如,当数据所有者撤销其同意允许模型从数据中学习时)。在这篇立场论文中,我们提出将数据追踪的机器遗忘提升为基础模型(FMs)的知识追踪。我们基于实际需求和认知研究的见解支持这一立场。实际上,追踪数据无法满足对FMs的多样化遗忘请求,这些请求可能来自监管机构、企业用户、产品团队等,他们无法访问FMs的大量训练数据。相反,这些方方便提出关于FMs(不应)拥有的知识或能力的遗忘请求。认知上,知识追踪遗忘比追踪单个训练数据点更接近人脑的遗忘方式。我们进一步讨论了知识追踪机器遗忘范式中的重大挑战。最后,我们提供了一个关于视觉语言FMs的具体案例研究,以说明遗忘者如何实例化知识追踪机器遗忘范式。代码可在:https://1yuwen.github.io/Knowledge-Tracing-MU-Page 获取。

英文摘要

Machine unlearning removes certain training data points and their influence from AI models (e.g., when a data owner revokes their consent to allow models to learn from the data). In this position paper, we propose to lift data-tracing machine unlearning to knowledge-tracing for foundation models (FMs). We support this position based on practical needs and insights from cognitive studies. Practically, tracing data cannot meet the diverse unlearning requests for FMs, which may be from regulators, enterprise users, product teams, etc., who have no access to FMs' massive training data. Instead, it is convenient for these parties to issue an unlearning request about the knowledge or capability FMs (should not) possess. Cognitively, knowledge-tracing unlearning aligns with how the human brain forgets more closely than tracing individual training data points does. We further discuss the nontrivial challenges in the knowledge-tracing machine unlearning paradigm. Finally, we provide a concrete case study about a vision-language FM to illustrate how an unlearner might instantiate the knowledge-tracing machine unlearning paradigm. Code is available at: https://1yuwen.github.io/Knowledge-Tracing-MU-Page.

2506.10225 2026-05-27 cs.SD cs.AI eess.AS

Genre Controlled Music Generation via Activation Steering

通过激活引导实现体裁控制的音乐生成

Swathi Narashiman, Pranay Mathur, Dipanshu Panda, Jayden Koshy Joe, Harshith M R, Anish Veerakumar, Aniruddh Krishna, Keerthiharan A

发表机构 * Indian Institute of Technology Madras(印度理工学院马德拉斯分校)

AI总结 提出一种在推理时对自回归生成模型MusicGen进行干预的方法,利用线性探针权重引导残差流,实现细粒度的体裁控制。

详情
AI中文摘要

计算音乐生成正朝着非传统风格发展,需要能够精确且可控地融合不同音乐元素的方法。在这项工作中,我们提出了一种方法,通过对自回归生成变换器MusicGen进行推理时干预来实现细粒度控制。通过我们的方法,我们利用线性探针在残差流上的权重来引导残差流,从而实现体裁控制。通过将激活引导视为一种人类可控的交互,我们的工作突出了可解释的模型行为如何在协同创作的音乐生成中发挥作用。展示我们方法的音频样本可在我们的演示页面上找到。

英文摘要

Computational Music Generation is evolving towards non-conventional styles, demanding methods that enable precise and controllable blending of diverse music elements. In this work, we present a method for fine-grained control using inference-time interventions on an autoregressive generative transformer, MusicGen. Through our approach, we achieve genre control by steering the residual stream using weights of a linear probe on it. By framing activation steering as a human-controllable interaction, our work highlights how interpretable model behaviors can empower co-creative music generation. Audio samples demonstrating our method are available on our demo page.
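
Steering a residual stream with probe weights can be sketched in a few lines. The example assumes the simplest form of the intervention, adding a scaled, normalized probe direction to one hidden vector; the abstract does not state the exact update rule, layer, or scaling, and the names `steer` and `probe_logit` are hypothetical rather than MusicGen APIs.

```python
import math
from typing import List, Sequence


def unit_vector(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def steer(hidden: Sequence[float], probe_weights: Sequence[float],
          strength: float) -> List[float]:
    """Shift a residual-stream vector along the linear probe's direction.

    A positive strength pushes the activation toward the probed genre,
    a negative one pushes it away (assumed additive intervention).
    """
    if len(hidden) != len(probe_weights):
        raise ValueError("hidden state and probe weights must have equal length")
    direction = unit_vector(probe_weights)
    return [h + strength * d for h, d in zip(hidden, direction)]


def probe_logit(hidden: Sequence[float], probe_weights: Sequence[float],
                bias: float = 0.0) -> float:
    """Linear probe score for the genre: w . h + b."""
    return sum(h * w for h, w in zip(hidden, probe_weights)) + bias
```

Because the shift is along the probe's own weight vector, the probe's score for the target genre rises monotonically with the strength, which gives a single scalar knob for the user to turn.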