Optimal ridge regularization revisited
Jack Timmermans, Sergio A. Alvarez
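AI summary: For linear ridge regression over finite data samples, the paper proposes an iterative algorithm that computes the optimal regularization strength from the generative parameters and proves its convergence at limited noise levels; experiments show that, combined with sample-based parameter estimates, it attains near-optimal generalization across a range of settings.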
We consider $L^2$-regularized linear (ridge) regression over a finite data sample $X$ with bounded covariance and linear prediction targets $y$ with additive isotropic noise of finite variance. We present an iterative procedure to compute the optimal regularization strength numerically from the generative parameters in the fixed-$X$ setting and prove its convergence at limited noise levels. Our experimental evaluation over synthetic data shows that the proposed procedure combined with sample-based parameter estimates attains near-optimal random-$X$ generalization across a wide range of sample sizes, aspect ratios, and noise levels, at an added computational cost equivalent to one preliminary ridge regression in the underparameterized regime and two in the overparameterized case.
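The abstract does not spell out the iterative procedure, so the following is only a minimal sketch of the quantity such a procedure would target. It assumes the fixed-$X$ risk is the expected in-sample squared prediction error of the ridge estimator $(X^\top X + \lambda I)^{-1} X^\top y$, evaluates its standard bias-variance decomposition from the generative parameters, and replaces the authors' iteration with a plain grid search. The function names and the grid are illustrative assumptions.

```python
import numpy as np

def fixed_x_ridge_risk(X, beta, sigma2, lam):
    """Expected in-sample squared prediction error of ridge at strength lam > 0.

    Assumes y = X @ beta + noise, with isotropic noise of variance sigma2.
    """
    n = X.shape[0]
    # Thin SVD: X = U diag(s) Vt
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    coef = Vt @ beta                     # beta expressed along right singular vectors
    shrink = s**2 / (s**2 + lam)         # per-direction shrinkage of the fitted values
    bias2 = np.sum(((1.0 - shrink) * s * coef) ** 2)
    variance = sigma2 * np.sum(shrink**2)
    return (bias2 + variance) / n

def grid_optimal_lambda(X, beta, sigma2, grid):
    """Grid point with the smallest fixed-X risk (a stand-in for the iteration)."""
    risks = np.array([fixed_x_ridge_risk(X, beta, sigma2, lam) for lam in grid])
    return grid[int(np.argmin(risks))]
```

A quick sanity check: when $X$ has $p$ orthonormal columns, the risk above reduces to $(\lambda^2 \|\beta\|^2 + \sigma^2 p) / (n (1+\lambda)^2)$, which is minimized at $\lambda = \sigma^2 p / \|\beta\|^2$.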
DREAM-R: Multimodal Speculative Reasoning with Refined Drafts, Exact Verification, and Fully Parallel Execution via Reinforcement Learning
Yunhai Hu, Zining Liu, Xiangyang Yin, Tianhua Xia, Bo Bao, Eric Sather, Vithursan Thangarasa, Sai Qian Zhang
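AI summary: Proposes DREAM-R, a framework that accelerates reasoning-intensive tasks in multimodal models while preserving accuracy, through reinforcement-learning-optimized draft generation, a threshold-based verification mechanism, and fully parallel execution.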
Speculative reasoning has recently been proposed as a means to accelerate reasoning-intensive generation in large multimodal models, but its effectiveness is often constrained by misalignment between speculative drafts and target-verified reasoning. In this work, we introduce DREAM-R, a framework that substantially improves the performance of speculative reasoning. At its core, DREAM-R employs Speculative Alignment Policy Optimization (SAPO), a reinforcement-learning objective that trains draft models to generate reasoning steps that are both faithful to target trajectories and concise. We further propose a Threshold-based Verification Mechanism (TBVM) that uses a ratio-based criterion to provide stable and interpretable acceptance of speculative steps only when positive evidence clearly dominates, thereby preventing error propagation. Building on these components, we develop a Fully Parallel Speculative Reasoning (FPSR) framework that parallelizes draft generation, target-side reasoning, and verification across multi-step reasoning, enabling early stopping and clean fallback. Experiments on reasoning-heavy benchmarks demonstrate up to speedup while preserving target-model accuracy, yielding substantial efficiency gains without compromising reasoning quality.
Optimal Data Acquisition for Reinforcement Learning: A Large Deviations Perspective
Mingjie Hu, Jian-Qiang Hu, Enlu Zhou
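AI summary: To address data acquisition efficiency in reinforcement learning, the paper proposes a unified framework based on large deviations theory, uses the exponential decay rate of the policy-selection error probability as the efficiency metric, derives a variational characterization, and designs an adaptive data acquisition policy that is proved to be near-robustly optimal.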
Data acquisition efficiency is a central challenge in deploying reinforcement learning in business and healthcare operations, where interactions are costly, slow, and often involve humans in the loop. This paper develops a unified large deviations framework for data acquisition in infinite-horizon reinforcement learning. We introduce the exponential decay rate of the policy-selection error probability as a principled efficiency metric and derive a variational characterization of this rate via large deviations theory for Markov chains, yielding a nested optimization problem. Based on this characterization, we formalize two complementary notions of optimality in terms of the optimal solution of the nested problem. Because the resulting program is implicit and generally intractable, we propose a tractable convex relaxation with explicit constraints. We then develop a lazy one-step projected subgradient method to solve the relaxed problem and use its iterates to construct an adaptive data acquisition policy. We prove that the resulting reinforcement learning algorithm is near-robustly optimal under our optimality criterion, up to a constant factor. Finally, we extend the framework to linear function approximation to improve scalability, and numerical experiments support the effectiveness of the proposed approach.
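The abstract names a projected subgradient method but gives no details of its "lazy one-step" variant. As a rough illustration of the underlying technique only, here is a textbook projected subgradient loop over the probability simplex. The simplex is assumed to stand in for a set of sampling proportions, and the max-min objective, the `rates` vector, and the step-size rule are all hypothetical choices, not the authors' relaxed problem.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def projected_subgradient(subgrad, w0, steps=500, step0=1.0):
    """Minimize a convex function over the simplex; returns the last iterate."""
    w = project_to_simplex(np.asarray(w0, dtype=float))
    for t in range(1, steps + 1):
        # Diminishing step size step0 / sqrt(t), then project back onto the simplex
        w = project_to_simplex(w - step0 / np.sqrt(t) * subgrad(w))
    return w

# Hypothetical max-min allocation: maximize min_i rates[i] * w[i] over the simplex,
# i.e. minimize its negative; a subgradient is -rates[j] * e_j at the argmin index j.
rates = np.array([1.0, 2.0, 4.0])

def subgrad(w):
    j = int(np.argmin(rates * w))
    g = np.zeros_like(w)
    g[j] = -rates[j]
    return g

w = projected_subgradient(subgrad, np.ones(3) / 3)
```

Returning the last iterate keeps the sketch short; for nonsmooth objectives an averaged or best-so-far iterate is the more common choice.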
Sense Representations Are Inducible Interfaces
Jan Christian Blaise Cruz, Alham Fikri Aji
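AI summary: Proposes ACROS, which induces an explicit sense pathway into a frozen pretrained decoder LM through a gated residual addition, enabling zero-shot word-sense disambiguation, low-KL lexical steering, and cross-lingual adaptation while preserving base LM quality.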
Sense representations (explicit, per-token meaning decompositions) are useful for disambiguation, steering, and cross-lingual alignment, but existing approaches require models to be pretrained with sense structure baked in. We introduce ACROS, which induces an explicit sense pathway into a frozen pretrained decoder LM through a gated residual addition. On SmolLM2-360M, ACROS preserves base LM quality while supporting three uses of the same induced variables: zero-shot word-sense disambiguation (64.95 F1 on Raganato ALL, competitive with the WordNet first-sense heuristic), low-KL lexical steering across 5,161 CoInCo cases where a simple non-oracle proxy recovers about 90% of positive shifts, and SENSIA cross-lingual adaptation to four languages (mean R@1 0.988, target FLORES PPL 7.94). ACROS makes sense representations an inducible interface for ordinary pretrained LMs.
An LLM-Based Assistance System for Intuitive and Flexible Capability-Based Planning
Luis Miguel Vieira da Silva, Nicolas König, Felix Gehlhoff
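AI summary: Proposes a hybrid assistance system that combines formal capability-based SMT planning with an LLM natural-language interaction layer, using human-in-the-loop approval for plan explanation and knowledge model adaptation, to make capability-based planning in industrial automation more accessible and flexible.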
In modern industry, dynamic environments and the complexity of modular and reconfigurable resources require automated planning of process sequences. Capability-based planning approaches address this by automatically generating plans from semantic knowledge models that describe resource functions in a machine-interpretable form. Their practical use, however, remains limited: solver feedback, especially in the case of unsatisfiability, is difficult to interpret, and the knowledge models require adaptation as operational conditions change or requests become infeasible. This paper presents a hybrid assistance system that augments an existing capability-based Satisfiability Modulo Theories (SMT) planning approach with a Large Language Model (LLM)-based layer for natural-language interaction, explanation, and adaptation. Formal planning correctness remains with the symbolic planner, while the LLM layer handles natural-language access and flexible knowledge model adaptation under explicit Human-in-the-Loop (HitL) approval. The system decomposes into four components: Capability Grounding, Symbolic Planning, Result Interpretation, and Planning Adaptation, realized as a routed agentic workflow in which a central router delegates to five specialized agents. The system is evaluated on a modular production system across four scenario types. Of 23 test cases, 9 of 10 knowledge queries and all 4 satisfiable planning cases were handled correctly, 3 of 4 unsatisfiable cases produced concrete repair proposals, and all 5 adaptive planning scenarios resolved into satisfiable plans through iterative, user-approved knowledge model modifications. The findings confirm that combining formal planning with LLM-based assistance substantially improves accessibility and adaptability in industrial automation.
Activation Steering for Synthetic Data Generation: The Role of Diversity in Downstream Safety Detection
Vijeta Deshpande, Tootiya Giyahchi, Veena Padmanabhan, Leman Akoglu, Anna Rumshisky
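AI summary: Studies whether activation steering (AS) can generate high-quality training data for downstream safety detection classifiers, finding that diversity is a critical but overlooked axis and that AS outperforms prompting-based generation only within a narrow parameter range.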
Safety detection models require examples of HHH (Helpful, Harmless, Honest)-violating outputs for robust generalization, but such examples are scarce. Activation Steering (AS) has emerged as a data-efficient method for generating target-concept-aligned responses. We investigate whether AS can generate high-quality training datasets for downstream classifiers, a question that remains untested. We present a two-fold study with intrinsic and extrinsic evaluation across $4$ concepts $\times\,2$ models $\times\,4$ steering methods. Intrinsically, beyond the field-standard rubric of steering success (concept alignment) and coherence, we introduce sample- and set-level diversity as a quality axis previously absent from the literature, and find that increasing steering strength reduces response diversity. Extrinsically, we replace HHH-violating examples in the available training data with steered generations and fine-tune detection classifiers. AS-generated data results in a better classifier than the prompting-generated data on $3$ of $4$ concepts. However, only $41$ of $136$ AS configurations outperform prompting, indicating that downstream utility lies in a narrow regime that jointly satisfies success, coherence, and diversity. The harmonic mean of these three axes correlates with downstream AUROC more consistently across concepts than success and coherence alone, providing a practical heuristic target for practitioners tuning AS hyperparameters. Together, our results highlight the potential of AS in synthetic data generation for improving safety detection and identify diversity as a critical, previously overlooked axis for tuning AS.
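The harmonic-mean heuristic mentioned above can be sketched in a few lines. This assumes each axis has already been scored on a common $[0, 1]$ scale, which the abstract does not specify; the function name is illustrative.

```python
def steering_quality(success, coherence, diversity):
    """Harmonic mean of three axis scores, each assumed to lie in [0, 1]."""
    scores = [success, coherence, diversity]
    if min(scores) <= 0.0:
        # A zero on any axis drives the harmonic mean to zero
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)
```

Unlike an arithmetic mean, the harmonic mean is pulled toward the weakest axis, so a configuration with high success and coherence but low diversity still scores poorly, which fits the narrow jointly-satisfying regime described in the abstract.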
Temporal Graph Learning for Forecasting Biological System Dynamics
Manuel Dileo, Andrea Sottoriva
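AI summary: Proposes a temporal graph neural network framework built on pseudotime-resolved gene regulatory networks to forecast the evolution of cellular states, outperforming foundation models such as scGPT on three tasks.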
Biological foundation models have shown strong performance in single-cell representation learning by applying transformer architectures directly to gene-expression matrices. However, these approaches predominantly operate in static settings and do not explicitly model the temporal evolution of developmental programs in the cell. Modeling such dynamics is important for understanding how cellular states progressively emerge, differentiate, and reorganize during development or disease progression. In this work-in-progress paper, we investigate an alternative temporal graph-based perspective in which cellular states are represented through pseudotime-resolved gene regulatory networks and modeled as evolving graph structures over persistent gene identities. Starting from single-cell transcriptomic data, we infer pseudotime trajectories, discretize cells into developmental snapshots, reconstruct one gene regulatory network per snapshot, and apply temporal graph neural networks to forecast biological states. We evaluate this framework on two publicly available mouse developmental datasets, erythroid gastrulation and pancreatic endocrinogenesis, considering three complementary tasks: gene-expression forecasting, link prediction, and out-degree centrality prediction. Our results show that graph-based models outperform well-known foundation models such as scGPT and scFoundation, suggesting that explicitly modeling evolving regulatory structure provides useful information beyond static pretrained representations. For link prediction and centrality forecasting, temporal graph learning captures non-trivial regulatory dynamics and enables the identification of temporally important gene hubs. Overall, our findings support temporal graph learning as a promising direction for modeling dynamic biological systems and as a complementary paradigm to current foundation model approaches in single-cell biology.
DEMON: Diffusion Engine for Musical Orchestration of Noise
Ryan Fosdick
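AI summary: Proposes DEMON, a real-time diffusion engine that makes the denoising process playable as a live instrument through four mechanisms (heterogeneous denoise scheduling, shared mutable state, per-frame source blending, and windowed VAE decode), sustaining 12.3 decoder completions per second for 60-second music on a single GPU.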
We present DEMON, a real-time diffusion engine that makes the denoising process playable as a live musical instrument: a control surface both broad (many parameters shaped per-frame across the output) and responsive (each control taking effect as fast as its place in the denoising loop allows). Built on ACE-Step 1.5 and StreamDiffusion's ring-buffer architecture with TensorRT acceleration, it sustains up to 12.3 decoder completions per second for 60-second music on a single consumer GPU (RTX 5090), or 11.3 generations per second at our production ring-depth of 4. At these rates denoising parameters become viable as live performance controls, but the ring buffer propagates per-request changes only at its drain rate, a floor of S denoising steps. We contribute four mechanisms. (1) Per-slot heterogeneous denoise scheduling: each ring-buffer slot owns its timestep schedule, so a moving denoise slider is tracked without wiping the in-flight queue, where the upstream global-schedule design must rebuild and discard it. (2) Shared mutable per-step state, giving any parameter consulted at every solver step next-tick effect, bypassing ring-buffer drain. (3) Per-frame source blending: a sampling-time control on the standard SDE re-noise step, giving a framewise transformation-strength axis that complements scalar denoise scheduling. (4) Windowed VAE decode exploiting receptive-field analysis for an 8.0x decode speedup. Together these separate streaming-diffusion parameters into four propagation classes, by onset and convergence latency.
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
Shanghua Gao, Ada Fang, Marinka Zitnik
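AI summary: Proposes AutoScientists, a decentralized team of AI agents that uses self-organized collaboration, proposal critique, and shared knowledge of failures to substantially outperform existing methods in long-running experiments spanning biomedical machine learning, language-model training optimization, and protein fitness prediction.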
Scientific research proceeds through iterative cycles of hypothesis generation, experiment design, execution, and revision. AI agents can automate parts of this process, but existing approaches typically follow a single research trajectory or coordinate through a central planner with fixed objectives. As a result, they struggle to sustain parallel exploration, adapt as experimental evidence changes, or preserve knowledge of failed directions over long-running experiments. We introduce AutoScientists, a decentralized team of AI agents for long-running computational scientific experimentation. Agents interpret a shared experimental state, self-organize into teams around promising hypotheses, critique proposals before using experimental compute, and share successes and failures to reduce redundant exploration. Under matched experimental budgets, AutoScientists improves over prior AI agents across biomedical machine learning, language-model training optimization, and protein fitness prediction. On BioML-Bench, spanning biomedical imaging, protein engineering, single-cell omics, and drug discovery, AutoScientists achieves a mean leaderboard percentile of 74.4% across 24 tasks, improving over the strongest AI agent by +8.33%. On GPT training optimization, AutoScientists reaches a target validation bits-per-byte 1.9x faster than Autoresearch and continues discovering improvements from a starting champion where the single-agent approach finds none (7 vs. 0 accepted improvements). On ProteinGym fitness prediction, AutoScientists discovers a method for ACE2-Spike binding that improves over the current state-of-the-art model by +12.5% in Spearman correlation. Applied without modification across all 217 ProteinGym assays, the same method improves over the prior state of the art by +6.5% (Spearman correlation).
Integrated Exploration-Aware UAV Route Optimization and Path Planning
Jimin Choi, Grant Stagg, Cameron K. Peterson, Max Z. Li
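AI summary: Proposes an integrated exploration-aware UAV route optimization and path planning framework that uses risk maps, uncertain region-of-interest modeling, B-spline trajectory optimization, and online replanning to balance visits to reported locations against exploration for new information in disaster monitoring, improving average KL reduction by 15.9% over the offline optimized planner.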
Uncrewed aerial vehicles (UAVs) are increasingly used for exploration-driven monitoring in hazardous environments such as disaster zones, contaminated sites, wildfire areas, and damaged infrastructure, where limited flight endurance must be allocated between visiting reported locations and gathering new information. In these settings, prior information regarding hazards is often incomplete, spatially imprecise, and subject to change during execution. For example, initial reports may identify a region where a hazard is likely to exist, but the actual hazard may be displaced, partially observed, or entirely unreported. We present an integrated exploration-aware UAV route optimization and path planning framework for hazard monitoring under uncertain and evolving prior information. The environment is represented as a spatial risk map, where each location has an associated belief of hazardous conditions. Reported hazards are modeled as uncertain regions of interest (ROIs) rather than confirmed target locations, requiring the UAV to inspect reported areas while also using its limited flight endurance to explore informative regions. The proposed method solves a vehicle routing problem over reported ROIs, augments the route with auxiliary pseudo-nodes to improve spatial coverage, allocates the remaining flight distance budget across route segments, and optimizes dynamically feasible B-spline trajectories for local exploration. During execution, UAV measurements update a grid-based belief map, and the remaining trajectory is replanned when new information and the remaining budget justify adaptation. Across 48 scenario configurations, online replanning improves average KL reduction by 15.9% over the offline optimized planner and 48.6% over straight-line traversal.
Interpretability-Guided Layer Selection over Subspace Projection: SAE as a Stethoscope, Not a Scalpel, for Raw Task Vector Model Editing
Li Lei, Madalina Ciobanu, Qingqing Mao, Ritankar Das
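AI summary: Finds that projecting task vectors onto sparse autoencoder (SAE) feature subspaces discards about 97% of the modification energy and yields ineffective edits; proposes using SAEs for layer diagnosis rather than intervention filtering, injecting raw task vectors into layers selected by an SAE specificity score, which significantly improves performance on mathematical reasoning tasks.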
LLMs increasingly require surgical model editing to enhance domain-specific capabilities without incurring the computational cost or catastrophic forgetting associated with full fine-tuning. Sparse Autoencoders (SAEs) have emerged as a promising tool in this setting, in principle allowing for feature-level identification of where to intervene. In this work, we rigorously evaluate an SAE-guided editing pipeline for mathematical reasoning on Gemma-3-4B-IT and uncover a fundamental failure mode: the intuitively appealing approach of projecting task vectors onto SAE feature subspaces acts as an information bottleneck that discards approximately 97% of the modification energy, yielding no statistically significant improvements across seven math subjects. We show that this failure stems from a geometric misalignment between activation-space SAE directions and weight-space task vectors. We then propose a shift in perspective: SAE as a Stethoscope, Not a Scalpel, where SAEs are used for layer-level diagnosis rather than intervention-level filtering. By injecting unfiltered raw task vectors only into layers identified by an SAE-derived specificity score, we improve Number Theory accuracy from 29.6% to 39.4% (z=+3.41, p=0.0007) on the Minerva Math benchmark; 5 of 7 math subjects significantly improved and none significantly degraded. Our method is fully deterministic, requires no additional inference cost, and provides a principled framework for interpretability-guided model editing.
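To make the "discarded modification energy" concrete, here is a minimal sketch of how the energy retained by a subspace projection could be measured. It assumes energy means squared Euclidean norm and that the feature directions are linearly independent rows of a matrix; neither definition is confirmed by the abstract, and `retained_energy` is an illustrative name rather than part of the authors' pipeline.

```python
import numpy as np

def retained_energy(task_vector, directions):
    """Fraction of ||task_vector||^2 kept after projecting onto span(rows of directions).

    Assumes the rows of `directions` (shape k x d, k <= d) are linearly independent.
    """
    # Reduced QR of the transpose gives an orthonormal basis (d x k) for the row span
    Q, _ = np.linalg.qr(directions.T)
    projected = Q @ (Q.T @ task_vector)
    return float(projected @ projected / (task_vector @ task_vector))
```

For intuition, a vector with no special alignment to a $k$-dimensional subspace of a $d$-dimensional space retains roughly $k/d$ of its energy on average, so a small feature subspace can act as a severe bottleneck.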
The Ethics of LLM Sandboxes and Persona Dynamics
Tim Gebbie, Stewart Gebbie
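AI summary: Argues that the reality gap produced by LLM guardrails and persona dynamics constitutes unethical "reality laundering", and proposes addressing it through task-level causal requirements specification rather than response-level moral correction.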
It is well known that LLM guardrails and trained persona dynamics can produce a reality gap: the distance between the world an LLM is permitted or shaped to describe, and the world in which users must act. Here we argue that actively generating reality gaps is in fact unethical because it knowingly shifts epistemic risk back to the uninformed user -- this is reality laundering. This can potentially cause harm when operationalised at scale. The risk is sharpest in high-exposure advice contexts, where users seek orientation rather than a bounded, externally checkable task. Guardrails naively appear ethically necessary when they claim to prevent direct harm, but often become suspect when they suppress truthful perception and launder uncomfortable mechanisms into acceptable abstractions. Basel-style financial regulation, B-BBEE-style compliance, Societe Generale, and the London Whale show how formal safety systems can become legible, gameable, and performative while real exposure migrates elsewhere. The same pattern can appear in LLMs as moral compliance: safe language, distorted reality. We therefore distinguish refusing harm from refusing reality, and then argue for top-down causal requirements specification at the task level rather than bottom-up moral correction at the response or sandbox level. Persona dynamics matter because the assistant interface is not neutral; it shapes how uncertainty, conflict, authority, and risk are staged. The conclusion is that so-called ``ethical AI'' becomes substantively unethical when it substitutes institutional reassurance for contact with reality.
GraphSteal: Stealing Structural Knowledge from Graph RAG via Traversal Reconstruction
Jinze Gu, Qinghua Mao, Xi Lin, Jun Wu
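AI summary: Proposes a structure-oriented reconstruction framework that uses depth-wise heuristic search and breadth-wise diffusion search to recover the hidden knowledge graph from black-box Graph RAG systems with high fidelity, revealing sensitive entities, relations, and structural dependencies.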
Retrieval-Augmented Generation (RAG) enhances LLMs by grounding generation in query-relevant external evidence. Beyond unstructured text corpora, Graph RAG integrates knowledge graphs into the retrieval pipeline, enabling LLMs to access entities, relations, and multi-hop dependencies encoded in structured knowledge. However, the same structured knowledge that empowers Graph RAG also creates a new privacy attack surface. We demonstrate that Graph RAG systems can be turned into structural oracles: through adaptive black-box interactions, an adversary can elicit sufficient relational evidence to reconstruct substantial portions of the hidden knowledge graph. We propose a structure-oriented reconstruction framework that recovers targeted graphs from both local and global perspectives. Specifically, Depth-Wise Heuristic Search extracts fine-grained node attributes by recursively expanding entity-centered evidence, while Breadth-Wise Diffusion Search infers graph topology by propagating across relation-induced neighborhoods. Experiments on generic and healthcare scenarios demonstrate that our method can recover over 90% of the original knowledge graph from representative Graph RAG systems, revealing sensitive entities, relations, and structural dependencies with high fidelity. Existing guardrails provide limited defense against our attack, highlighting the inherent difficulty of safeguarding structural privacy in Graph RAG pipelines.
Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation
Yexing Du, Kaiyuan Liu, Youcheng Pan, Bo Yang, Ming Liu, Bing Qin, Yang Xiang
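AI summary: Proposes ESRT, a collaborative edge-cloud framework whose split inference architecture compresses intermediate features to cut bandwidth by up to 10x and protect voice privacy, and which uses a multi-task weighted curriculum learning strategy to support many-to-many speech translation across 45 languages.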
Multimodal large language models (MLLMs) have demonstrated significant potential for speech-to-text translation (S2TT). However, existing deployment paradigms face critical challenges: pure on-device models suffer from resource constraints, while centralized cloud systems incur severe privacy risks and bandwidth bottlenecks by transmitting raw voice data. Furthermore, most models exhibit English-centric biases, restricting many-to-many translation scaling. In this paper, we propose Edge-cloud Speech Recognition and Translation (ESRT), a privacy-preserving and bandwidth-efficient collaborative edge-cloud MLLM framework. Specifically, we design an edge-cloud split inference architecture that retains a lightweight speech encoder and adapter on the device, transmitting only highly compressed intermediate features to the cloud. This fundamentally prevents voiceprint leakage and reduces bandwidth requirements by up to 10$\times$. To overcome English-centric bottlenecks, we introduce a multi-task weighted curriculum learning strategy with data balancing to ensure robust cross-lingual consistency. Extensive experiments on the FLEURS dataset demonstrate that our models, ESRT-4B and ESRT-12B, achieve state-of-the-art many-to-many S2TT performance across 45 languages ($45 \times 44$ directions). Code and models are released at https://github.com/yxduir/esrt to facilitate reproducible, privacy-aware MLLM S2TT research.
Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity
Xiuying Wei, Caglar Gulcehre
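AI summary: Shows that augmenting the attention mechanism with an exponentially decaying memory module substantially improves the accuracy of query-aware methods under sparse-attention inference, validated on multiple tasks.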
Efficient inference is critical for long-context language models, where attention computation and KV-cache access dominate the cost. Recent work, RAT+, introduces a recurrence-augmented attention backbone that enables flexible dilated attention at inference time. In this paper, we investigate whether this exponentially decaying memory can also improve existing query-aware sparse inference methods. Using representative methods including Quest, MoBA, and SnapKV, we show that RAT+ consistently improves accuracy over standard attention across sparse budgets on eight needle-in-a-haystack tasks. We validate these gains both on the released checkpoints from the RAT+ paper and on OLMo2-7B, which we continue pretraining with the added memory module for 10B tokens. Finally, we propose two hypotheses explaining why this memory module benefits query-aware sparse inference and design targeted experiments to support them.
The Attentional White Bear Effect in Transformer Language Models
Rebecca Ramnauth, Brian Scassellati
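AI summary: Using representational probing, attention analysis, and behavioral semantic leakage experiments, finds that representations of prohibited concepts remain recoverable in transformer language models under instruction-based suppression and still influence subsequent generation, revealing a fundamental gap between behavioral and representational alignment.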
Instruction-based suppression is widely used to prevent language models from generating prohibited content, yet it remains unclear whether suppression reduces internal representation or merely suppresses expression. We investigate this question through representational probing, attention analysis, and behavioral semantic leakage experiments across multiple transformer models. We find that prohibited concepts remain highly recoverable from hidden representations under suppression, continue to influence attention routing, and measurably shape downstream generations despite successful lexical avoidance. These effects persist across pooling strategies, indirect semantic controls, and multiple model families. Our results expose a fundamental gap between behavioral and representational alignment.
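PrimitiveVLA: Learning Reusable Motion Primitives for Efficient and Generalizable Robotic Manipulation
Yutai Li, Shaohui Peng, Jiaming Guo, Di Huang, Zihao Zhang, Yuxuan Guo, Yunkai Gao, Siming Lan, Ling Li, Xing Hu, Yunji Chen
AI summary: Proposes PrimitiveVLA, a framework that shifts vision-language-action models from direct instruction-to-control mapping to a primitive-centric disassemble-and-assemble paradigm, using a multimodal canonical representation and an automated pipeline to improve data efficiency and achieve zero-shot generalization.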
Vision-Language-Action (VLA) models offer a promising paradigm for generalist robotic policies, yet their adaptation is hindered by data inefficiency and poor generalization. We argue that these bottlenecks stem from the prevailing Direct Instruction-to-Control Mapping, which forces models to memorize monolithic trajectories rather than reusable motion patterns, i.e., primitives. We propose PrimitiveVLA, a framework that shifts this paradigm toward a Primitive-Centric Disassemble & Assemble paradigm. Supported by a shared Multimodal Canonical Representation (MCR), PrimitiveVLA unifies two phases: (1) Fine-tuning-phase Disassembly, which uses an automated pipeline to disassemble demonstrations into reusable primitives; and (2) Inference-phase Assembly, which employs a VLM-based planner and an LLM-generated switch module for robust closed-loop execution. By disassembling tasks into reusable primitives, PrimitiveVLA enables VLA models to learn invariant motion patterns instead of task-specific trajectories. Extensive experiments show that our framework improves data efficiency and achieves superior zero-shot generalization across unseen and long-horizon tasks.
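Blind PRNG Hijacking: An Undetectable, Integrity-Preserving Attack on LLM Watermarking
Ziyang You, Huilong He, Xiaoke Yang, Xuxing Lu
AI summary: Proposes SeedHijack, a blind supply-chain attack on LLM watermarking that replaces the pseudo-random number generator (PRNG) while preserving integrity and evading detection.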
Cryptographic watermarking is a leading defense for attributing text generated by large language models (LLMs). Existing schemes, including KGW, Unigram, and DipMark, derive their security guarantees from the assumption that the underlying pseudo-random number generator (PRNG) is trustworthy. This work introduces SeedHijack, the first supply-chain attack on LLM watermarking that is simultaneously (i) blind -- requiring no knowledge of the watermark key, detector, or model logits, (ii) integrity-preserving -- amplifying rather than erasing the watermark signal, and (iii) orthogonal to detection -- the attack-induced bias is statistically independent of all content-side detector statistics, ensuring that amplification and evasion coexist without trade-off. Rather than perturbing generated text, SeedHijack replaces the PRNG at the supply-chain layer, biasing green-list selection without altering output tokens or degrading text quality. Across three watermarking schemes and three open-source LLMs, the attack triggers 0/6 state-of-the-art content-side statistical detectors while inflating the watermark z-score up to 2.42x (system-level defenses such as entropy-source attestation remain orthogonal and complementary). A quantum random number generator (QRNG) countermeasure is shown to fully neutralize the attack while preserving benign watermarking utility. These findings establish PRNG integrity as a first-class security requirement for cryptographic content-provenance systems.
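To make the phrase "inflating the watermark z-score" concrete, the following minimal sketch computes the one-proportion z-statistic that KGW-style detectors use on green-list token counts. The abstract does not state the formula; it is the standard KGW detection statistic, and the function name and the default `gamma = 0.25` are illustrative assumptions rather than values from the paper.

```python
import math


def watermark_z_score(green_count: int, total_tokens: int, gamma: float = 0.25) -> float:
    """One-proportion z-test on the number of green-list tokens.

    Under the null hypothesis (unwatermarked text) each token is green with
    probability gamma, so the count has mean gamma*T and variance
    T*gamma*(1-gamma). gamma=0.25 is an illustrative assumption.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    expected = gamma * total_tokens
    std = math.sqrt(total_tokens * gamma * (1.0 - gamma))
    return (green_count - expected) / std


if __name__ == "__main__":
    # A biased PRNG that raises the green-token count raises the z-score.
    print(watermark_z_score(50, 200))   # count at the null expectation
    print(watermark_z_score(100, 200))  # count well above the null expectation
```

Because the statistic depends only on the green-token count, any mechanism that shifts green-list selection raises it without touching the detector or the key.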
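One-Shot Rollout Hidden-State Dynamics for Training-Free RLVR Data Selection
Jianghao Wu, Jianfei Cai, Weiqiang Wang, Jin Ye, Daniel F. Schmidt, Yasmeen George
AI summary: Proposes SHIFT, which uses the hidden-state change over a single reasoning rollout (RIRS) as a proxy for instance utility and combines it with quality-weighted CoreSet coverage for training-free, label-free RLVR data selection, outperforming baselines on mathematical reasoning and medical QA.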
Reinforcement learning with verifiable rewards (RLVR) can yield large reasoning gains from very few training instances, yet its strong sensitivity to which instances are used makes data selection a central bottleneck. Most existing selection pipelines rely on training-time optimization signals and/or require access to verifiable rewards or ground-truth answers over large candidate pools, which is costly and often infeasible in specialized domains. We study RLVR data selection in a setting where selection must be performed before any RL training and without labels or reward evaluation on the full pool. We propose SHIFT, a one-shot, training-free selector based solely on inference-time hidden-state dynamics. For each candidate instance, SHIFT runs a single deterministic reasoning rollout and computes a reasoning-induced representation shift (RIRS) as the start-to-end hidden-state delta. SHIFT uses the RIRS magnitude as a lightweight proxy for instance utility and enforces coverage via a quality-weighted farthest-first CoreSet procedure in an RIRS-augmented feature space, producing compact subsets that scale to large unlabeled pools. Across mathematical reasoning and medical QA benchmarks under ultra-low budgets, SHIFT consistently outperforms training-free diversity and difficulty/uncertainty baselines, improving both in-domain accuracy and transfer to harder evaluation settings. Ablations show that RIRS-based coverage and quality-weighting contribute complementary gains, and analyses indicate that RIRS is not explained by simple input/output length statistics. Code is available at github.com/JianghaoWu/SHIFT.
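The selection step described above can be sketched as follows. This is a hedged illustration, not the released SHIFT code: it assumes RIRS is the element-wise difference between the final and initial hidden states, uses its Euclidean norm as the quality score, and greedily picks the candidate that maximizes quality times distance to the nearest already-selected point. The exact weighting rule and feature space used by the authors are not given in the abstract, and all function names here are hypothetical.

```python
import math


def rirs(h_start, h_end):
    """Start-to-end hidden-state delta (assumed form of RIRS)."""
    return [e - s for s, e in zip(h_start, h_end)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quality_weighted_farthest_first(features, quality, budget):
    """Greedy farthest-first selection, scaled by a per-item quality score.

    Assumption: the score of a candidate is quality * (distance to the
    nearest selected item); the first pick is the highest-quality item.
    """
    n = len(features)
    if budget <= 0 or n == 0:
        return []
    first = max(range(n), key=lambda i: quality[i])
    selected = [first]
    min_d = [dist(features[i], features[first]) for i in range(n)]
    while len(selected) < min(budget, n):
        remaining = [i for i in range(n) if i not in selected]
        pick = max(remaining, key=lambda i: quality[i] * min_d[i])
        selected.append(pick)
        # Refresh each item's distance to its nearest selected neighbour.
        for i in range(n):
            min_d[i] = min(min_d[i], dist(features[i], features[pick]))
    return selected
```

In use, `quality` would hold `norm(rirs(h_start, h_end))` for each candidate, and `features` would hold whatever embedding the selector covers; both are left to the caller in this sketch.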
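EntroAD: Structural Entropy-Guided Prompt Adaptation for Zero-Shot Anomaly Detection
Xinyu Zhao, Qingyun Sun, Jiayi Luo, Jianxin Li
AI summary: Proposes EntroAD, a framework that uses a structural entropy-guided dynamic routing mechanism and confidence-aware dual-branch prompt adaptation for zero-shot anomaly detection, achieving state-of-the-art performance in cross-dataset settings.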
Zero-Shot Anomaly Detection (ZSAD) aims to detect anomalies in unseen domains without target-domain adaptation. Recent CLIP-based methods have shown promising performance by leveraging prompt learning and visual-text alignment. However, most existing approaches rely on a single adaptation pathway, which may be insufficient for heterogeneous anomaly patterns across domains. In practice, anomalies exhibit vastly different characteristics, ranging from salient, localized structural disruptions to subtle, diffuse, and irregular variations. To address this challenge, we propose EntroAD, a structural entropy-guided zero-shot anomaly detection framework. Unlike previous methods, EntroAD introduces a dynamic routing mechanism to process different types of anomalies with specialized adaptation strategies. Specifically, we estimate patch-level structural entropy from self-attention-induced patch relations and use it as a proxy for relational uncertainty to guide anomaly-aware token routing. Based on this routing signal, we construct anomaly-aware routed tokens to better capture anomaly cues with different structural characteristics. We further introduce a confidence-aware dual-branch prompt adaptation module to stabilize visual-text alignment while preserving CLIP's transferable prior. Extensive experiments on 10 industrial and medical benchmarks show that EntroAD achieves state-of-the-art performance in challenging cross-dataset ZSAD settings.
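Mobile-Aptus: Confidence-Driven Proactive and Robust Interaction in MLLM-Based Mobile-Using Agents
Zheng Wu, Pengzhou Cheng, Zongru Wu, Yuan Guo, Tianjie Ju, Aston Zhang, Gongshen Liu, Zhuosheng Zhang
AI summary: To address over-execution and over-soliciting in mobile-using agents, proposes a confidence-driven framework for proactive and robust interaction that achieves state-of-the-art performance through supervised fine-tuning and confidence bias correction.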
Recent advancements in multimodal large language models (MLLMs) have shown exceptional potential in enabling mobile-using agents to autonomously execute human instructions. However, fully automated agents often try to execute tasks even when they are unable to resolve them, leading to the problem of over-execution. Previous studies address this by training interactive mobile-using agents that request human interaction when they cannot complete user instructions. However, we find that these interactive agents tend to exhibit over-soliciting behavior, relying excessively on human intervention. To mitigate both over-execution and over-soliciting, we propose a universal confidence integration framework that enables confidence-driven proactive and robust interaction in MLLM-based mobile-using agents. The framework consists of two stages: interaction capability empowerment and confidence bias correction. In the interaction capability empowerment stage, agents learn through supervised fine-tuning to output both actions and confidence scores. In the confidence bias correction stage, agents learn to output more accurate confidence scores by combining semantic similarity retrieval with direct preference optimization. Experimental results show that Mobile-Aptus achieves state-of-the-art performance on four popular mobile-using agent benchmarks: OS-Kairos, AITZ, Meta-GUI, and AndroidControl. Mobile-Aptus consistently outperforms all baselines in offline benchmarks, with an average improvement of over 17% in task success rate. In real-world dynamic experiments, Mobile-Aptus surpasses the baseline by 26% in task success rate with only 0.64 intervention steps per instruction. The code is available at https://github.com/Wuzheng02/Mobile-Aptus.
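When Interpretability Is Unevenly Allocated: Fairness in Hybrid Interpretable Models
Ziba Jabbar Zare, Ulrich Aïvodji, Julien Ferry, Thibaut Vidal
AI summary: Addresses the problem that hybrid interpretable models may route different groups unevenly between the interpretable and black-box components, proposes the Interpretability Coverage Disparity (ICD) measure, and mitigates the unfairness through constraints.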
Hybrid interpretable models combine a transparent component with a black-box model by assigning some examples to the former and deferring the rest to the latter. While this design enables flexible tradeoffs between accuracy and interpretability, it also raises a distinct procedural fairness concern: some demographic groups may systematically receive interpretable decisions, while others are disproportionately routed to a black box. We formalize this issue as Interpretability Coverage Disparity (ICD), a demographic-parity-style measure applied to the routing decision of hybrid interpretable models. Using tools from predictive multiplicity, we study ICD across four hybrid interpretable learning methods, three standard fairness benchmark datasets, and multiple sensitive attributes. Our experiments reveal substantial ICD in intermediate transparency regimes, where both the interpretable and black-box components are actively used. We further show that simple coverage-disparity constraints can significantly reduce ICD in exact hybrid learning methods, with marginal impact on accuracy and sparsity. In several settings, ICD mitigation also improves standard algorithmic fairness metrics. These results show that hybrid interpretable models should be audited not only for predictive fairness, but also for how they allocate interpretability across individuals and groups.
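As a rough illustration of a demographic-parity-style measure applied to the routing decision, the sketch below computes the share of each group handled by the interpretable component and reports the largest gap between groups. The abstract does not give the exact ICD formula, so the max-minus-min form and the function names are assumptions.

```python
from collections import defaultdict


def coverage_by_group(routed_interpretable, groups):
    """Fraction of each group routed to the interpretable component."""
    counts = defaultdict(lambda: [0, 0])  # group -> [interpretable, total]
    for routed, group in zip(routed_interpretable, groups):
        counts[group][0] += int(bool(routed))
        counts[group][1] += 1
    return {g: hit / total for g, (hit, total) in counts.items()}


def coverage_disparity(routed_interpretable, groups):
    """Largest gap in interpretable coverage between any two groups.

    Assumed form: max group rate minus min group rate, by analogy with
    the usual demographic parity difference.
    """
    rates = coverage_by_group(routed_interpretable, groups)
    if not rates:
        return 0.0
    return max(rates.values()) - min(rates.values())
```

A value of 0 means every group receives interpretable decisions at the same rate; a constraint of the kind the abstract mentions would cap this quantity during training.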
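Random Process Flow Matching: Generative Implicit Representations of Multivariate Random Fields
Julien Lalanne, David Picard, Lionel Boillot, Lina-María Guayacán-Carrillo, Leon Barens, Jean-Michel Pereira
AI summary: Proposes Random Process (RP) Flow, a flow-matching-based framework that uses random Fourier features to learn an implicit signal representation and encodes uncertainty through ensemble sampling, generating high-quality samples from sparse observations with calibrated uncertainty estimates.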
Generative modeling provides a powerful framework for learning data distributions. These models initially relied on probabilistic methods such as Gaussian Processes (GP) for uncertainty-aware predictions and shifted towards larger trainable models to learn more complex distributions. In this work, we introduce Random Process (RP) Flow, a Flow Matching-based framework that represents the vector field as a neural implicit function. Unlike modern generative methods, our setting involves a single observed field, from which only sparse measurements are available. RP Flow uses Random Fourier Features to learn an implicit signal representation that can be queried at any arbitrary location from a limited set of observations, while encoding uncertainty through ensemble sampling. We propose constructing a Bayesian posterior by GP regression in the source space to generate high-quality samples. Our empirical results demonstrate that this framework generates realistic samples along with calibrated uncertainty estimates, even under challenging conditions such as high frequency, high sparsity, or high dimensionality. These findings position RP Flow as a milestone towards generative models for reconstruction tasks where data is scarce and uncertainty must remain traceable.
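For readers unfamiliar with Random Fourier Features, the following sketch shows the standard construction for approximating an RBF (squared-exponential) kernel, which is what links such features to Gaussian-process regression. It is a generic illustration, not the RP Flow architecture: the kernel choice, the length scale, and the function names are assumptions.

```python
import math
import random


def make_rff(input_dim, num_features, length_scale=1.0, seed=0):
    """Return a feature map z(x) with z(x) . z(y) ~ exp(-|x-y|^2 / (2 l^2)).

    Frequencies are drawn from N(0, 1/l^2) and phases from U[0, 2*pi),
    the usual recipe for the RBF kernel.
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0 / length_scale) for _ in range(input_dim)]
        for _ in range(num_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [
            scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
            for w, b in zip(weights, phases)
        ]

    return features


def rbf_kernel(x, y, length_scale=1.0):
    """Exact RBF kernel, used here only to check the approximation."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * length_scale ** 2))
```

Because the feature map accepts any coordinate vector, a model built on it can be queried at arbitrary locations, and drawing several independent feature maps (different seeds) gives a simple ensemble.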
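A Multiscale Kinetic Framework for Image Segmentation: From Particle Systems to Continuum Models
Horacio Tettamanti, Giulia Guicciardi, Mattia Zanella
AI summary: Proposes a consensus-based multiscale kinetic framework that treats an image as a particle system, derives kinetic equations and a macroscopic model, and combines them with particle-based optimization for image segmentation.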
In this work, we present a multiscale kinetic framework for consensus-based image segmentation. By interpreting an image as a system of interacting particles, each pixel is characterised by its spatial position and an internal feature encoding color information. We introduce a coupled interaction scheme governing the evolution of particles in both position and feature spaces, from which we derive a kinetic formulation for the particle density in the space-feature domain combining transport, aggregation, and diffusion effects. Furthermore, through a suitable scaling, we obtain a first-order macroscopic model describing the evolution of the fraction of pixels carrying information on the fraction of pixels having a certain feature. Based on this reduced-complexity model, we present a data-oriented approach where we make use of particle-based optimisation techniques for the accurate segmentation of images. Numerical tests show the effectiveness of the proposed framework and its robustness under different noise conditions.
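LACUNA: Safe Agents as Recursive Program Holes
Yaoyu Zhao, Yichen Xu, Oliver Bračevac, Cao Nguyen Pham, Frank Zhengqing Wu, Martin Odersky
AI summary: Proposes LACUNA, a programming model in which LLM agents write code safely as recursive program holes through typed calls and compile-time checks, unifying expressiveness and safety.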
LLM agents increasingly act by writing code, yet a split persists between the runtime that drives the agent and the code the model writes. The runtime owns the loop, context, and control flow, and the model has little say over any of them. Letting model-written code shape the runtime itself would make agents more expressive, but it would also sharpen safety problems. A model can be diverted by a prompt injection, call the wrong tool, or fail partway and leave an inconsistent state, and each such failure reaches further when the code shapes the runtime than when it expresses a single action. We present LACUNA, a programming model for agents that closes this split while preserving safety. Each agent action is a typed call $\texttt{agent[T](task)}$ that the LLM fills with code when execution reaches it, and the code is type-checked against the surrounding program before it runs. Because each action is accepted or rejected as a whole, a rejected one leaves the environment untouched, and its compiler diagnostics drive a retry. The same check also bounds which tools and data an action may use and how they flow. Our primitive expresses ReAct loops, sub-agents, skills, parallel decomposition, and multi-model planning as ordinary control flow. We evaluate LACUNA on a collection of test cases, BrowseComp-Plus, and $\tau^2$-bench. On BrowseComp-Plus, $8.6\%$ of generations are rejected before execution, with 0.7 retries per query on average, and the agent reaches $27.1\%$ accuracy. On $\tau^2$-bench, LACUNA solves $76.0\%$ of $392$ tasks across four domains with a capable model, on par with the baseline agent.
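Measuring Form and Function in Language Models
Héctor Javier Vázquez Martínez, Charles Yang
AI summary: Introduces quantitative metrics from child language acquisition and the Contextual Alternative Choice (CAC) prompting method to evaluate language models' formal syntactic and functional discourse knowledge of English determiners, finding that only large models meet both the formal and functional benchmarks.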
We introduce quantitative metrics for child language acquisition to evaluate language models. Our focus is on the formal syntactic and functional discourse properties of determiners in English, which young children acquire early and accurately. We propose Contextual Alternative Choice (CAC), a new prompting method that provides targeted tests for both syntactic and discourse knowledge of language. The method enables direct comparison of language models against children, and more importantly, against statistical benchmarks independently established in empirical research. No current model trained on a comparable amount of data simultaneously meets both formal and functional benchmarks as human children do, but some very large models do. We present our results as methodological and technical contributions, with specific emphasis on the cognitive status of language models.
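Compositional Text-to-Image Generation via Region-Aware Bimodal Direct Preference Optimization
Zhuohan Liu, Wujian Peng, Yitong Chen, Zuxuan Wu
AI summary: Proposes BiDPO, a framework that builds the large-scale preference dataset BiComp, extends Diffusion DPO to jointly optimize image and text preferences, and adds region-level guidance to improve text-to-image fidelity on complex compositional prompts.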
Despite the rapid progress of text-to-image (T2I) models, generating images that accurately reflect complex compositional prompts (covering attribute bindings, object relationships, and counting) remains challenging. To address this, we propose BiDPO, a framework to enhance T2I models' capability for compositional text-to-image generation. We begin by introducing a carefully designed pipeline to construct a large-scale preference dataset, BiComp, with strict quality control. Then, we extend Diffusion DPO to jointly optimize image and text preferences, which proves highly effective at improving the models' ability to follow complex text prompts during generation. To further enhance fine-grained alignment, we employ a region-level guidance method that focuses on regions relevant to compositional concepts. Experimental results demonstrate that BiDPO substantially improves compositional fidelity, consistently outperforming prior methods across multiple benchmarks. Our approach highlights the potential of preference-based fine-tuning for complex text-to-image tasks, offering a flexible and scalable alternative to existing techniques.
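Implicit Regularization in Perturbed Deep Matrix Factorization: Spectral Conditions and Stability
Jingzhe Wang, Hung-Hsu Chou
AI summary: Studies the stability of low-rank implicit regularization in perturbed deep matrix factorization, deriving spectral conditions for the low-rank phase in the noiseless case and proving that gradient descent converges and the low-rank phase persists under perturbation.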
This paper studies the stability of low-rank implicit regularization in perturbed deep matrix factorization, where the target matrix is corrupted by a noise matrix. We first derive sufficient spectral conditions under which gradient descent exhibits a low-rank phase in the noiseless setting. These conditions show how the target spectrum, initialization, and step size jointly determine the existence of a nonempty low-rank interval. We then analyze the perturbed gradient descent dynamics, proving convergence guarantees and quantifying how the perturbation affects iteration complexity and eigenvalue recovery. Finally, we show that the low-rank phase persists under perturbation, with explicit dependence on the perturbation size. Numerical experiments support the theoretical findings.
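Learning High-Dimensional Parity Functions by Gradient Descent with Product Networks
Guillaume Larue, Louis-Adrien Dufrène, Quentin Lampin, Hadi Ghauch, Ghaya Rekaya
AI summary: Combines compact product-based neural architectures with stochastic data sparsity (Bernoulli inputs, p_e ≤ 1/N) and hyperparameter optimization to learn high-dimensional parity functions efficiently, with theoretical convergence guarantees.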
Parity functions are fundamental Boolean operations with critical applications across machine learning, cryptography, and error correction. Yet, learning high-dimensional parity functions poses significant challenges: in a general setting, standard neural network architectures typically require exponential sample complexity, making gradient-based optimization intractable for a large number of inputs $N$. We demonstrate that compact product-based neural architectures combined with stochastic data sparsity (Bernoulli inputs with $p_e \leq 1/N$) and appropriate hyperparameter choice enable efficient parity learning, with theoretical guarantees of convergence. Experiments validate our theory across dimensions up to $N = 100{,}000$, with empirical evidence showing optimal hyperparameter choices for $p_e$ and learning rate $\alpha$, as well as polynomial complexity scaling laws. This work establishes fundamental connections between architectural inductive bias and data sparsity, opening new possibilities for neural arithmetic, structured reasoning, binary neural networks, and machine learning applied to automated protocol discovery.
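One way to see why a product-based unit suits parity is the identity that, after mapping each bit $x_i \in \{0, 1\}$ to the sign $1 - 2x_i$, the parity of a subset equals the product of its signs. The sketch below uses a hypothetical unit $f(x) = \prod_i (1 - 2 w_i x_i)$ with weights in $[0, 1]$; this parameterization is an assumption chosen for illustration, not the architecture specified in the paper. It also samples sparse Bernoulli inputs with $p_e \leq 1/N$ as in the abstract.

```python
import random


def parity_sign(bits, subset):
    """Return +1 if the XOR of the selected bits is 0, else -1."""
    p = 0
    for i in subset:
        p ^= bits[i]
    return 1 - 2 * p


def product_unit(bits, weights):
    """Hypothetical product unit: prod_i (1 - 2 * w_i * x_i).

    With binary weights this is exactly the parity sign of the bits
    whose weight is 1; fractional weights give a smooth relaxation.
    """
    out = 1.0
    for x, w in zip(bits, weights):
        out *= 1.0 - 2.0 * w * x
    return out


def sample_sparse_bits(n, p_e, rng):
    """Draw n independent Bernoulli(p_e) bits; p_e <= 1/n gives sparse inputs."""
    return [1 if rng.random() < p_e else 0 for _ in range(n)]
```

With sparse inputs most factors equal 1, so each sample touches only a few weights, which gives some intuition for why sparsity might ease gradient-based training in this setting.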
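JECA^2: Judgment-Explanation Consistent Adversarial Attacks on Forensic Vision-Language Models
Jiachen Qian
AI summary: Proposes JECA^2, a white-box adversarial attack on forensic vision-language models that uses Grad-CAM-guided visual perturbations and text-embedding optimization under a token-proximity constraint to keep judgment and explanation consistent; experiments show higher attack success and consistency than baselines.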
Forensic vision-language models (VLMs) have recently been developed to detect image tampering and provide natural-language explanations. However, their robustness against adversarial manipulation remains underexplored. Existing adversarial attacks typically aim to flip the model's binary judgment, while the accompanying explanation may still reveal forensic cues and contradict the attacked judgment. In this paper, we study judgment-explanation consistent adversarial attacks against forensic VLMs and propose JECA^2, a controlled white-box red-team diagnostic that jointly redirects visual attribution and aligns textual explanations with the target judgment. On the visual side, JECA^2 uses Grad-CAM-guided perturbations to divert attribution from tampered regions toward benign regions. On the textual side, it optimizes prompt embeddings toward authenticity-affirming semantics under a token-proximity constraint. Experiments on forensic VLM benchmarks show that JECA^2 achieves higher attack success and automated judgment-explanation consistency than implemented baselines under white-box threat settings, while transfer to closed-source VLMs remains measurable but limited. Our results highlight a consistency failure mode in explanation-based forensic VLMs and motivate future robustness evaluation beyond binary detection accuracy.