arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.08007 2026-05-11 cs.LG

Interpreting Reinforcement Learning Agents with Susceptibilities

Chris Elliott, Einar Urdshals, David Quarel, Daniel Murfet

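AI summary: This paper proposes a susceptibility-based method for interpreting the behavior of reinforcement learning agents. It reveals internal properties of a model by studying how perturbations of the loss function affect posterior expectation values of observables. The authors extend the technique to the regret in deep reinforcement learning and validate it on a gridworld model that exhibits non-trivial stagewise development. Experiments show that susceptibilities reveal internal features in the model's parameter space that cannot be found by studying the evolution of the policy alone, and activation steering further validates their explanatory power.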

Comments 55 pages, comments welcome

Abstract

Susceptibilities are a technique for neural network interpretability that studies the response of posterior expectation values of observables to perturbations of the loss. We generalize this construction to the setting of the regret in deep reinforcement learning and investigate the utility of susceptibilities in a simple gridworld model that nevertheless exhibits non-trivial stagewise development. We argue that susceptibilities reveal internal features of the development of the model in parameter space that one cannot detect purely by studying the development of the learned policy. We validate these results with activation-steering, and discuss the framework's extension to RLHF post-training.

2605.08005 2026-05-11 cs.LG

STEPS: A Temporal Smooth Error Propagation Solver on the Manifolds for Test-Time Adaptation in Time Series Forecasting

Jiaqi Liu, Yifan Ouyang, Zhifei Song, Sim Kuan Goh, Ashwaq Qasem

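AI summary: STEPS is a smooth error propagation solver for test-time adaptation (TTA) in time series forecasting, aimed at the performance degradation that occurs when forecasting from limited observations under distribution shift. The method models TTA as a Dirichlet boundary value problem on a temporal manifold, uses a local solver and a global solver to handle temporal smoothness and cross-window error memory respectively, and combines them through spatiotemporal manifold fusion to produce stable corrections. Experiments show that STEPS substantially improves forecasting accuracy on multiple benchmark datasets, with an average relative MSE reduction of 26.82% over the zero-shot backbone, outperforming the strongest existing TTA method.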

Comments 9 pages main text, appendix included. 7 figures. Submitted to NeurIPS 2026

Abstract

Test-Time Adaptation (TTA) aims to improve time series forecasting under distribution shifts by using limited observations revealed during inference. However, forecasting TTA must operate in a source-free online setting, where the adaptation signal is short, temporally correlated, and potentially noisy. Existing methods can therefore suffer from weak identifiability, error accumulation, and unstable long-horizon corrections when the revealed prefix is sparse or contaminated. To address these issues, we propose STEPS, a Smooth Temporal Error Propagation Solver for TTA in time-series forecasting. STEPS reformulates forecasting TTA as a Dirichlet Boundary Value Problem on a temporal manifold, where the revealed prefix error serves as the boundary condition for the unknown future error field. STEPS then solves for a smooth and bounded correction field in prediction space: a Local Solver propagates prefix errors under temporal smoothness, a Global Solver retrieves stable cross-window error memory, and Spatiotemporal Manifold Fusion (SMF) integrates both solutions into the final correction. Across six standard benchmarks and four frozen backbones, STEPS achieves an average relative MSE reduction of 26.82% over the zero-shot backbone, exceeding the strongest compared TTA baseline by 12.77%. Additional sparse prefix and contamination tests confirm the robustness of STEPS under limited and noisy prefixes.

2605.08003 2026-05-11 cs.CV

SphereVAD: Training-Free Video Anomaly Detection via Geodesic Inference on the Unit Hypersphere

Chao Huang, Penfei Wei, Wei Wang, Jie Wen, Zhihua Wang, Li Shen, Wenqi Ren, Xiaochun Cao

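AI summary: Video anomaly detection (VAD) aims to automatically identify events that deviate from normal patterns in untrimmed surveillance videos. Existing methods typically rely on large-scale annotation or task-specific training, which limits rapid deployment to new scenes. This paper proposes SphereVAD, a training-free, zero-shot VAD framework that performs vMF likelihood-ratio geodesic inference on the unit hypersphere to exploit the geometric discriminability latent in intermediate-layer features of pre-trained multimodal large language models. Using Fréchet mean centering, Holistic Scene Attention, and Spherical Geodesic Pulling, the method identifies anomalous segments effectively and outperforms existing training-free methods on multiple benchmark datasets.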

Comments 48 pages, 25 figures

Abstract

Video anomaly detection (VAD) aims to automatically identify events that deviate from normal patterns in untrimmed surveillance videos. Existing methods universally depend on large-scale annotations or task-specific training procedures, severely limiting their rapid deployment to novel scenes. We observe that intermediate-layer features of pre-trained multimodal large language models (MLLMs) already encode rich anomaly semantics, yet existing approaches rely on the language output pathway and fail to exploit the geometric discriminability latent in these representations. Based on this finding, we propose SphereVAD, a fully training-free, zero-shot VAD framework that recasts anomaly discrimination as von Mises-Fisher (vMF) likelihood-ratio geodesic inference on the unit hypersphere, unleashing latent discriminability through principled geometric reasoning rather than learning new representations. Specifically, SphereVAD first applies Fréchet mean centering to unfold feature distributions and eliminate domain biases, then employs Holistic Scene Attention (HSA) to reinforce feature consistency using cross-video priors, and finally performs vMF-guided Spherical Geodesic Pulling (SGP) to align ambiguous segments with directional prototypes on the spherical manifold. This training-free pipeline requires only minimal synthetic images for calibration. SphereVAD establishes new state-of-the-art results among training-free approaches on three major benchmarks and remains competitive with fully supervised baselines. Code will be available upon acceptance.
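
As a rough illustration of the vMF likelihood-ratio idea named in the abstract (not SphereVAD's actual pipeline), the sketch below scores a feature against two directional prototypes. It assumes both von Mises-Fisher components share a single concentration `kappa`; under that assumption the normalizing constants cancel and the log-likelihood ratio is `kappa` times a difference of cosine similarities. The prototypes, `kappa`, and all function names are hypothetical.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def vmf_log_likelihood_ratio(x, mu_anom, mu_norm, kappa):
    """log p(x | anomalous) - log p(x | normal) for two vMF densities
    sharing the same concentration kappa (an assumption of this sketch).
    With equal kappa the normalizers cancel, leaving a scaled difference
    of cosine similarities."""
    x, mu_anom, mu_norm = normalize(x), normalize(mu_anom), normalize(mu_norm)
    cos_anom = sum(a * b for a, b in zip(x, mu_anom))
    cos_norm = sum(a * b for a, b in zip(x, mu_norm))
    return kappa * (cos_anom - cos_norm)

# Hypothetical 3-D prototypes; real features would be high-dimensional.
mu_normal = [1.0, 0.0, 0.0]
mu_anomalous = [0.0, 1.0, 0.0]
score_a = vmf_log_likelihood_ratio([0.1, 0.9, 0.0], mu_anomalous, mu_normal, kappa=10.0)
score_n = vmf_log_likelihood_ratio([0.9, 0.1, 0.0], mu_anomalous, mu_normal, kappa=10.0)
```

A positive score favors the anomalous prototype and a negative score favors the normal one. With unequal concentrations the normalizers no longer cancel and an extra constant term would appear.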

2605.08000 2026-05-11 cs.CV

Rethinking Dense Optical Flow without Test-Time Scaling

Praroop Chanda, Suryansh Kumar

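AI summary: This paper examines how to improve dense optical flow estimation without scaling test-time computation. The authors propose a single-forward-pass framework that uses visual semantic and geometric priors from pretrained foundation models and avoids conventional iterative refinement, substantially reducing computational cost. Experiments show strong results on multiple benchmarks. On Sintel Final in particular, the method outperforms existing state-of-the-art methods, supporting the value of foundation-model priors for optical flow estimation.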

Comments Accepted for publication at CVPR 2026; ViSCALE Workshop. Draft info: 10 pages, 2 figures, 4 tables

Abstract

Recent progress in dense optical flow has been driven by increasingly complex architectures and multi-step refinement for test-time scaling. While these approaches achieve strong benchmark performance, they also require substantial computation during inference. This raises a fundamental question: Is scaling test-time computation the only way to improve dense optical flow accuracy? We argue that it is not. Instead, powerful visual semantic and geometric priors encoded in modern foundation models can reduce, if not overcome, the need for computationally expensive iterative refinement at test-time. In this paper, we present a framework that estimates dense optical flow in a single forward pass, leveraging pretrained foundation representations, while avoiding iterative refinement and additional inference-time computation, thus offering an alternative to test-time scaling. Our method extracts visual semantic features from a frozen DINO-v2 backbone and combines them with geometric cues from a monocular depth foundation model. We fuse these complementary priors into a unified representation and apply a global matching formulation to estimate dense correspondences without recurrent updates or test-time optimization. Despite avoiding iterative refinement, our approach achieves strong cross-dataset generalization across challenging benchmarks. On Sintel Final, we obtain 2.81 EPE without refinement, significantly improving over state-of-the-art (SOTA) SEA-RAFT under comparable training conditions and outperforming RAFT, GMFlow (without refinement), and recent FlowSeek in the same setting. These results suggest that strong foundation priors can substitute for test-time scaling, offering a computationally efficient alternative to refinement-heavy pipelines.
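
The 2.81 figure above is an end-point error (EPE), the usual optical flow metric: the Euclidean distance between predicted and ground-truth flow vectors, averaged over pixels. A minimal sketch of the metric, with made-up toy flows and a hypothetical function name:

```python
import math

def endpoint_error(pred_flow, gt_flow):
    """Average end-point error: mean Euclidean distance between predicted
    and ground-truth flow vectors, one (u, v) pair per pixel."""
    if not pred_flow or len(pred_flow) != len(gt_flow):
        raise ValueError("flows must be non-empty and the same length")
    total = 0.0
    for (pu, pv), (gu, gv) in zip(pred_flow, gt_flow):
        total += math.hypot(pu - gu, pv - gv)
    return total / len(pred_flow)

# Toy data (not from the paper): three pixels with errors 0, 2, and 5.
pred = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
gt = [(1.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
epe = endpoint_error(pred, gt)  # (0 + 2 + 5) / 3
```

Lower is better, and the unit is pixels.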

2605.07999 2026-05-11 cs.LG cs.AI

Graph-Structured Hyperdimensional Computing for Data-Efficient and Explainable Process-Structure-Property Prediction

Jingzhan Ge, Ajeeth Vellore, Ajinkya Palwe, Ahsan Khan, David Gorsich, Matthew P. Castanier, SeungYeon Kang, Farhad Imani

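AI summary: Process-structure-property (PSP) prediction for the fabrication of complex 3D microstructures suffers from sparse, heterogeneous data with complex interactions. To address this, the study proposes PSP-HDC, a graph-structured hyperdimensional computing framework. The method encodes a directed PSP graph as prior knowledge and combines a trainable scalar-to-hypervector encoder with graph-aligned binding and bundling operations to obtain robust representations and predictions under heterogeneous parameters and noise. PSP-HDC outperforms existing methods in predictive performance and also provides intrinsic explanations at the parameter, group, and within-group levels, offering a new route to data-efficient, explainable PSP prediction.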

Comments 19 pages, 18 figures

Abstract

Multiphoton photoreduction enables high-fidelity fabrication of complex 3D microstructures, yet reliable process-structure-property (PSP) prediction remains difficult because the available data are sparse, heterogeneous, and interaction-dominated. In this regime, conventional feature-vector models are statistically underdetermined, making them prone to spurious correlations, poor regime transfer, and unstable post hoc explanations, whereas mechanistic pipelines depend on calibrated submodels that are rarely available during early process development. We present PSP-HDC, a graph-structured hyperdimensional computing framework that encodes a directed PSP graph as an internal prior for representation, inference, and explanation. A trainable scalar-to-hypervector encoder learns parameter-specific embeddings on a fixed hyperdimensional basis to accommodate heterogeneous scales and noise. Sample representations are then composed through graph-aligned binding and bundling along directed PSP dependencies, and prediction is performed by associative-memory retrieval against class prototypes. Because the same prototype memories support both decision making and attribution, PSP-HDC provides intrinsic explanations at the parameter, group, and within-group levels, while memory alignment and separation quantify prototype formation during training. On sheet-resistance regime prediction for the 3D platform, PSP-HDC achieves an accuracy of 0.910 +/- 0.077 over 1000 random splits and 0.896 under process-fold generalization, outperforming strong baselines.
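
The binding, bundling, and associative-memory steps mentioned above are generic hyperdimensional computing operations. The sketch below shows them in their simplest bipolar form. It is not PSP-HDC: there is no trainable encoder and no PSP graph, and the parameter names, value levels, and class labels are invented for illustration.

```python
import random

DIM = 2048  # hypervector dimensionality (illustrative choice)
rng = random.Random(0)

def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Binding: elementwise product, associates two hypervectors."""
    return [x * y for x, y in zip(a, b)]

def bundle(vectors):
    """Bundling: elementwise sum, superposes several hypervectors."""
    return [sum(col) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

# Hypothetical parameter roles and quantized value levels.
roles = {name: random_hv() for name in ("laser_power", "scan_speed")}
levels = {name: random_hv() for name in ("low", "high")}

def encode(sample):
    """Bind each parameter role to its value level, then bundle the pairs."""
    return bundle([bind(roles[p], levels[v]) for p, v in sample.items()])

# Class prototypes act as the associative memory.
prototypes = {
    "conductive": encode({"laser_power": "high", "scan_speed": "low"}),
    "resistive": encode({"laser_power": "low", "scan_speed": "high"}),
}

def classify(sample):
    """Retrieve the class whose prototype is most similar to the query."""
    query = encode(sample)
    return max(prototypes, key=lambda c: cosine(query, prototypes[c]))

label = classify({"laser_power": "high", "scan_speed": "low"})
```

Because the same prototype vectors drive both retrieval and similarity scores, the similarity of a single bound role-value pair to a prototype can be read as a crude per-parameter attribution, which is the intuition behind the "intrinsic explanations" the abstract describes.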

2605.07993 2026-05-11 cs.LG stat.ME

Bayesian Sensitivity of Causal Inference Estimators under Evidence-Based Priors

Nikita Dhawan, Daniel Shen, Leonardo Cotta, Chris J. Maddison

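AI summary: Causal inference, especially in observational studies, relies on untestable assumptions about the true data-generating process. This paper proposes a Bayesian sensitivity analysis method that builds priors from real-world evidence to assess how sensitive causal estimators are to three common assumptions, addressing the problem that conventional worst-case analysis can be overly pessimistic or conflict with prior knowledge. The method introduces the Bayesian Sensitivity Value (BSV), which uses Monte Carlo approximation to compute an estimator's expected sensitivity to assumption violations, and demonstrates it in an observational study of the effect of diabetes treatments on weight loss.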

Comments TMLR 2026

Abstract

Causal inference, especially in observational studies, relies on untestable assumptions about the true data-generating process. Sensitivity analysis helps us determine how robust our conclusions are when we alter these underlying assumptions. Existing frameworks for sensitivity analysis are concerned with worst-case changes in assumptions. In this work, we argue that using such pessimistic criteria can often become uninformative or lead to conclusions contradicting our prior knowledge about the world. To demonstrate this claim, we generalize the recent s-value framework (Gupta & Rothenhäusler, 2023) to estimate the sensitivity of three different common assumptions in causal inference. Empirically, we find that, indeed, worst-case conclusions about sensitivity can rely on unrealistic changes in the data-generating process. To overcome this, we extend the s-value framework with a new sensitivity analysis criterion: Bayesian Sensitivity Value (BSV), which computes the expected sensitivity of an estimate to assumption violations under priors constructed from real-world evidence. We use Monte Carlo approximations to estimate this quantity and illustrate its applicability in an observational study on the effect of diabetes treatments on weight loss.

2605.07988 2026-05-11 cs.RO

Evaluation of an Actuated Spine in Agile Quadruped Locomotion

Nico Bohlinger, Piotr Kicki, Davide Tateo, Krzysztof Walas, Jan Peters

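AI summary: This paper studies how an actuated spine affects the agile locomotion of quadruped robots. Using the Silver Badger robot with a 1-DOF spine in MuJoCo simulation, the experiments demonstrate the spine's advantages in high-speed running, stair climbing, steep-slope climbing, hurdling, and crawling. The results show that an actuated spine markedly improves agility, allowing the robot to clear higher obstacles and pass through narrower passages.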

Abstract

The spine plays a crucial role in the dynamic locomotion of quadrupedal animals, improving the stability, speed, and efficiency of their gait, especially for fast-paced and highly agile movements. Therefore, the spine is also a promising and natural way to extend the capabilities of quadruped robots. This paper empirically investigates the benefits of an actuated spine for learning agile quadruped locomotion. We evaluate whether the use of the spine brings benefits in terms of high-speed running, climbing stairs, climbing high-angle slopes, hurdling, and crawling scenarios. We conducted an empirical study in MuJoCo simulation using the Silver Badger robot from MAB Robotics with an actuated 1-DOF spine in the sagittal plane. The obtained results show that the use of the spine provides the robot with increased agility and allows it to overcome higher stairs, steeper slopes, higher obstacles, and smaller passages.

2605.07982 2026-05-11 cs.CL cs.CR

GLiGuard: Schema-Conditioned Classification for LLM Safeguard

Urchade Zaratiana, Mary Newhauser, George Hurn-Maloney, Ash Lewis

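AI summary: To keep large language model outputs safe and policy-compliant, this paper proposes GLiGuard, a schema-conditioned classification method. By encoding task definitions and label semantics as structured tokens in the input sequence, it evaluates prompt safety, response safety, refusal detection, and multiple fine-grained harm categories and jailbreak strategies in a single non-autoregressive pass. Built on a bidirectional encoder with only 0.3B parameters, it achieves F1 scores comparable to 7B–27B decoder models on nine safety benchmarks while substantially increasing inference throughput and reducing latency.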

Comments 20 pages, 4 figures

Abstract

Ensuring safe, policy-compliant outputs from large language models requires real-time content moderation that can scale across multiple safety dimensions. However, state-of-the-art guardrail models rely on autoregressive decoders with 7B–27B parameters, reformulating what is fundamentally a classification problem as sequential text generation, a design choice that incurs high latency and scales poorly to multi-aspect evaluation. In this work, we introduce GLiGuard, a 0.3B-parameter schema-conditioned bidirectional encoder adapted from GLiNER2 for LLM content moderation. The key idea is to encode task definitions and label semantics directly into the input sequence as structured token schemas, enabling simultaneous evaluation of prompt safety, response safety, refusal detection, 14 fine-grained harm categories, and 11 jailbreak strategies in a single non-autoregressive forward pass. This schema-conditioned design lets supported task and label blocks be composed directly in the input schema at inference time. Across nine established safety benchmarks, GLiGuard achieves F1 scores competitive with 7B–27B decoder-based guards despite being 23–90× smaller, while delivering up to 16× higher throughput and 17× lower latency. These results suggest that compact bidirectional encoders can approach the accuracy of much larger guard models while drastically reducing inference cost. Code and models are available at https://github.com/fastino-ai/GLiGuard.

2605.07980 2026-05-11 cs.LG cond-mat.stat-mech math.ST stat.TH

Susceptibilities and Patterning: A Primer on Linear Response in Bayesian Learning

Chris Elliott, Daniel Murfet

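AI summary: This paper introduces the theory of susceptibilities developed for neural network interpretability and uses it to analyze linear response in Bayesian learning. A susceptibility is defined as the derivative of the posterior expectation of an observable with respect to a data perturbation, which by the fluctuation-dissipation theorem equals a posterior covariance. Different observables yield different objects: per-sample losses give the influence matrix, while component-localized observables give the structural susceptibility matrix, which relates data patterns to model components and can be used to find data perturbations that produce a specific structural change. Starting from statistical-mechanical foundations, the notes give a detailed account of susceptibilities, their estimators, and their connection to the geometry of the loss landscape.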

Comments 34 pages, 3 figures, comments welcome!

Abstract

These notes introduce the theory of susceptibilities as developed in [arXiv:2504.18274, arXiv:2601.12703] for interpreting neural networks. The susceptibility of an observable $ϕ$ to a data perturbation is defined as a derivative of a posterior expectation, which by the fluctuation--dissipation theorem equals a posterior covariance. Different choices of $ϕ$ yield different objects: per-sample losses give the influence matrix (the Bayesian influence function of [arXiv:2509.26544]), while component-localized observables give the structural susceptibility matrix that pairs model components with data patterns. The susceptibility matrix is (up to a factor of $nβ$) the Jacobian of the map from data distributions to structural coordinates; its pseudo-inverse provides a linearized solution to the patterning problem of [arXiv:2601.13548]: finding data perturbations that produce a desired structural change. We motivate the theory from its statistical-mechanical foundations, then give a detailed exposition of susceptibilities, their empirical estimators, and their connection to the geometry of the loss landscape.
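
The fluctuation-dissipation statement above can be checked numerically on a toy problem. The sketch below assumes a Gibbs posterior on a finite grid, p(w) ∝ exp(-nβ (L(w) + h ΔL(w))), with a flat prior, and compares the finite-difference derivative of E[φ] in h with -nβ Cov(φ, ΔL) at h = 0. The loss, perturbation, observable, and value of nβ are made up, and the sign and normalization conventions may differ from those in the notes.

```python
import math

def posterior(grid, loss, n_beta, h=0.0, delta_loss=None):
    """Gibbs posterior on a finite grid, proportional to
    exp(-n_beta * (loss(w) + h * delta_loss(w))), flat prior."""
    logits = [-n_beta * (loss(w) + (h * delta_loss(w) if delta_loss else 0.0))
              for w in grid]
    m = max(logits)  # subtract the max for numerical stability
    unnorm = [math.exp(l - m) for l in logits]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expectation(p, f, grid):
    return sum(pi * f(w) for pi, w in zip(p, grid))

def loss(w):          # toy double-well loss
    return (w * w - 1.0) ** 2

def delta_loss(w):    # toy perturbation of the loss
    return w

def phi(w):           # toy observable
    return w ** 3

grid = [i / 100.0 for i in range(-300, 301)]
n_beta = 5.0

# Covariance side: -n_beta * Cov(phi, delta_loss) under the unperturbed posterior.
p0 = posterior(grid, loss, n_beta)
mean_phi = expectation(p0, phi, grid)
mean_dl = expectation(p0, delta_loss, grid)
cov = expectation(p0, lambda w: phi(w) * delta_loss(w), grid) - mean_phi * mean_dl
via_covariance = -n_beta * cov

# Derivative side: central finite difference of E_h[phi] at h = 0.
eps = 1e-5
p_plus = posterior(grid, loss, n_beta, h=eps, delta_loss=delta_loss)
p_minus = posterior(grid, loss, n_beta, h=-eps, delta_loss=delta_loss)
via_derivative = (expectation(p_plus, phi, grid)
                  - expectation(p_minus, phi, grid)) / (2 * eps)
```

The two quantities agree up to finite-difference error, which is why a susceptibility can be estimated from posterior samples as a covariance rather than by retraining under a perturbed loss.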

2605.07979 2026-05-11 cs.AI

The Limits of AI-Driven Allocation: Optimal Screening under Aleatoric Uncertainty

Santiago Cortes-Gomez, Mateo Dulce Rubio, Carlos Patino, Bryan Wilder

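AI summary: This paper studies how to optimally combine algorithmic allocation with in-person screening to improve resource allocation when irreducible aleatoric uncertainty is present. The authors propose a two-stage allocation framework in which a subset of individuals is screened first and the resource is then allocated under a fixed budget, and they show that the optimal strategy screens at the margin of algorithmic allocation while directly targeting the highest-risk individuals. The study also characterizes when screening and algorithmic allocation act as complements or substitutes at different levels of uncertainty, and illustrates the framework with real-world cases in social protection and humanitarian demining.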

Abstract

The rise of machine learning has shifted targeted resource allocation in policy and humanitarian settings toward algorithmic targeting based on predicted risk scores. This approach is typically cheaper and faster than traditional screening procedures that directly observe the latent vulnerability status through physical verification. Yet, even access to the true conditional vulnerability probability cannot eliminate misallocation: aleatoric uncertainty over individual vulnerability status is irreducible, and probabilistic targeting inevitably misallocates some resources. In this work we study how screening and algorithmic targeting should be optimally combined in a two-stage allocation framework where a screening stage observes true outcomes for a subset of units before a final allocation stage assigns the resource under a fixed coverage budget. We show that the optimal strategy screens units at the margin of algorithmic allocation, while directly targeting the highest-risk units. Furthermore, we empirically characterize when screening and algorithmic targeting act as complements or substitutes: efficiency gains from screening grow as the aleatoric uncertainty in the population increases. We illustrate our framework with applications in income-based social protection programs and humanitarian demining in Colombia, where the tension between screening costs and allocation efficiency is operationally consequential.

2605.07978 2026-05-11 cs.CV

Seeing Across Skies and Streets: Feedforward 3D Reconstruction from Satellite, Drone, and Ground Images

Qiwei Wang, Zhongyao Tuo, Xianghui Ze, Yujiao Shi

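AI summary: This study addresses cross-view localization, that is, locating a ground image on a satellite map. Conventional methods estimate only 3 degrees of freedom (x, y, and yaw). The proposed method, Cross3R, introduces a UAV image as an intermediate viewpoint and recovers 6-DoF camera poses and a 3D point cloud, enabling more accurate 3D reconstruction and localization. The authors also build the CrossGeo dataset and validate the method on multiple benchmarks.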

Abstract

Cross-view localization classically asks: where does this ground image lie on the satellite tile? Existing methods are typically limited to 3-DoF estimates -- an $(x,y)$ position and a yaw angle -- because nadir satellite imagery provides no direct cues for roll, pitch, or altitude, forcing a reliance on planar-motion and zero-tilt assumptions. These assumptions break on real terrain with slopes, ramps, and tilted camera mounts. To overcome this, we introduce a single UAV image as an intermediate viewpoint: it reveals the 3D structure invisible from nadir, supplies the cues for roll, pitch, and altitude that the satellite alone cannot provide, and needs only spatial overlap with the ground camera -- no known relative pose is required. Building on this insight, we propose **Cross3R**, a flexible feed-forward model that ingests a satellite tile together with a UAV image, a ground image, or both, and, in a single forward pass, recovers a cross-view 3D point cloud, the 6-DoF poses of every input camera, and the on-tile $(x,y)$ position and yaw of each perspective camera. For training and evaluation, we also construct **CrossGeo**, a 278K-image tri-view dataset spanning 85 scenes across every continent except Antarctica. On CrossGeo, Cross3R consistently outperforms feed-forward 3D baselines in point-cloud reconstruction, 6-DoF camera-pose estimation, and cross-view localization. On KITTI, it outperforms dedicated cross-view methods trained on KITTI on most metrics, despite having no KITTI training itself.

2605.07977 2026-05-11 cs.LG

Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback

Seohyun Lee, Wenzhi Fang, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton

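AI summary: This paper proposes SPEAR, an efficient online learning algorithm for federated fine-tuning of large language models. Through advantage-weighted refinement, it uses a feedback-guided self-play loop to generate contrastive samples without expensive group generations or ground-truth contexts, improving model performance. Experiments show that SPEAR outperforms existing state-of-the-art methods on multiple benchmark datasets and is suited to resource-constrained edge devices, fitting both online and federated learning.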

Comments 27 pages

Abstract

Recent work has advanced feedback-based learning systems, in which a foundation model takes in incoming feedback (e.g., from a user) to self-improve, creating a self-reinforcing training loop. However, existing methods are limited by their reliance on an offline setup and by their need for privileged ground-truth contexts during training. Moreover, there is limited consideration of federated learning (FL), which is particularly well suited to incorporating external feedback across large networks of end users but requires methods that train efficiently on resource-constrained edge devices. Therefore, we introduce SPEAR (Self-Play Enhancement via Advantage-Weighted Refinement), an efficient online learning algorithm for federated LLM fine-tuning. SPEAR uses a feedback-guided self-play loop to construct naturally contrastive pairs per prompt, which are trained on with (i) standard maximum likelihood on correct completions and (ii) confidence-weighted unlikelihood on tail tokens of incorrect completions. In contrast with existing works, SPEAR needs neither expensive group generations nor ground-truth contexts for training (it uses only partial, non-answer feedback), so it can be trained both online and in a resource-efficient manner. We validate SPEAR across various benchmark datasets, demonstrating its superior performance in comparison to state-of-the-art baselines. The implementation code is publicly available at https://github.com/lee3296/SPEAR.

2605.07973 2026-05-11 cs.CV

HEART: Hyperspherical Embedding Alignment via Kent-Representation Traversal in Diffusion Models

Arani Roy, Shristi Das Biswas, Kaushik Roy

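AI summary: This paper studies the difficulty of editing images through text conditioning in text-to-image diffusion models, noting that existing methods treat the embedding space as Euclidean and apply linear transformations, which does not reflect how semantic concepts are actually organized. The analysis finds that text-encoder representations lie on a hypersphere, where semantic concepts form anisotropic distributions better described by Kent distributions. Building on this, the authors propose HEART, a framework that needs no training or optimization and applies geometric transformations directly on the hypersphere, giving intuitive, precise edits of image subjects and attributes and generalizing across diffusion model architectures.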

Abstract

Text-to-image diffusion models can generate visually stunning images, yet controlling what appears and how it appears remains surprisingly difficult, especially when operating solely within the constraints of the text-conditioning space. For example, changing a subject or adjusting an attribute often leads to unintended side effects, such as altered backgrounds or distorted details. This is because most existing text-based control methods treat the embedding space as Euclidean and apply simple linear transformations, which do not reflect how semantic concepts are actually organized. In this work, we take a step back and ask: what is the true geometry of these embeddings? We find that text encoder representations lie on a hypersphere, where concepts are not linear directions but structured, anisotropic distributions better captured by Kent distributions. Building on this insight, we propose HEART, a training-free framework that performs Kent-aware geodesic transformations directly on the hypersphere. By respecting the underlying geometry, HEART enables intuitive and precise edits, such as consistent subject replacement and fine-grained attribute control, while preserving the original scene. Importantly, HEART requires no finetuning, inversion, or optimization, and generalizes across diffusion model architectures. Our results show that a simple shift in perspective, from linear to spherical, can unlock fast and controllable image generation.
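
To make the "linear versus spherical" contrast concrete, the sketch below compares straight-line interpolation with plain geodesic interpolation (slerp) between two unit vectors. This is only the isotropic special case: HEART's Kent-aware transformations are not reproduced here, and the vectors and function names are illustrative.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def lerp(p, q, t):
    """Euclidean interpolation; the result generally leaves the sphere."""
    return [(1.0 - t) * a + t * b for a, b in zip(p, q)]

def slerp(p, q, t):
    """Geodesic interpolation between unit vectors p and q, 0 <= t <= 1.
    Not defined for antipodal inputs, where the geodesic is not unique."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    omega = math.acos(dot)        # angle between p and q
    if omega < 1e-8:              # nearly identical: nothing to interpolate
        return list(p)
    s = math.sin(omega)
    wp = math.sin((1.0 - t) * omega) / s
    wq = math.sin(t * omega) / s
    return [wp * a + wq * b for a, b in zip(p, q)]

src = [1.0, 0.0, 0.0]   # stand-in for a source concept embedding
dst = [0.0, 1.0, 0.0]   # stand-in for a target concept embedding
mid_sphere = slerp(src, dst, 0.5)
mid_linear = lerp(src, dst, 0.5)
```

The slerp midpoint keeps unit norm, while the linear midpoint shrinks toward the origin, which is one simple way a Euclidean edit can move an embedding off the manifold the encoder actually uses.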

2605.07972 2026-05-11 cs.LG cs.AI stat.ML

It Just Takes Two: Scaling Amortized Inference to Large Sets

Antoine Wehenkel, Michael Kagan, Lukas Heinrich, Chris Pollard

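AI summary: This paper studies how to scale amortized inference to large sets of observations and proposes a simple, theoretically grounded method that decouples representation learning from posterior modeling. By training a mean-pool Deep Set on sets of at most two elements, it obtains an encoder that generalizes to sets of arbitrary size, which substantially reduces training cost and improves inference efficiency. Experiments show strong performance on a range of benchmarks, including high-dimensional conditional generation, at a fraction of the compute of conventional approaches.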

Abstract

Neural posterior estimation has emerged as a powerful tool for amortized inference, with growing adoption across scientific and applied domains. In many of these applications, the conditioning variable is a set of observations whose elements depend not only on the target but also on unknown factors shared across the set. Optimal inference therefore requires treating the set jointly, which in turn requires training the estimator at the deployment set size -- a regime where memory and compute quickly become prohibitive. We introduce a simple, theoretically grounded strategy that decouples representation learning from posterior modeling. Our method trains a mean-pool Deep Set on sets of size at most two, producing an encoder that generalizes to arbitrary set sizes. The inference head is then finetuned on pre-aggregated embeddings, making training cost essentially independent of the deployment set size N. Across scalar, image, multi-view 3D, molecular, and high-dimensional conditional generation benchmarks with N in the thousands, our approach matches or outperforms standard baselines at a fraction of the compute.
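
One reason mean pooling makes this decoupling possible is that the pooled embedding has a fixed size regardless of N, is invariant to element order, and can be combined across chunks without revisiting the elements. A small sketch of these properties, with a hand-written stand-in for the learned per-element encoder (all names are hypothetical):

```python
def phi(x):
    """Hypothetical per-element encoder; a trained network in practice."""
    return [x, x * x, 1.0 if x > 0 else 0.0]

def mean_pool_embed(observations):
    """Mean-pool Deep Set embedding: average the per-element features."""
    feats = [phi(x) for x in observations]
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]

def merge(emb_a, n_a, emb_b, n_b):
    """Combine two pre-aggregated embeddings using only their set sizes."""
    total = n_a + n_b
    return [(n_a * a + n_b * b) / total for a, b in zip(emb_a, emb_b)]

small = [0.5, -1.0]
large = [0.5, -1.0, 2.0, 3.0, -0.25]
emb_small = mean_pool_embed(small)
emb_large = mean_pool_embed(large)
emb_merged = merge(mean_pool_embed(large[:2]), 2, mean_pool_embed(large[2:]), 3)
```

A downstream inference head therefore sees an input of the same shape whether the set has two elements or thousands, which is consistent with the abstract's claim that training cost becomes essentially independent of N.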

2605.07969 2026-05-11 cs.LG cs.IT math.IT

When Diffusion Model Can Ignore Dimension: An Entropy-Based Theory

Ahmad Aghapour, Erhan Bayraktar

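AI summary: This paper studies the convergence of diffusion models on high-dimensional data from an information-theoretic perspective and proposes an analysis based on Shannon entropy. It finds that, for Gaussian mixture targets, the discretization error is controlled by the entropy of the latent mixture component rather than by the ambient dimension. The result indicates that diffusion sampling can remain efficient in high-dimensional spaces when the data distribution admits a compact latent representation, providing a new theoretical basis for understanding the efficiency of diffusion models.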

Abstract

Diffusion models perform remarkably well on high-dimensional data such as images, often using only a modest number of reverse-time steps. Despite this practical success, existing convergence theory does not fully explain why such samplers remain efficient in high dimensions. Many prior KL guarantees bound the discretization error in terms of the ambient dimension, while other improved results replace this dependence using intrinsic-dimensional or geometric structure assumptions. In this work, we develop an alternative information-theoretic perspective on diffusion sampler convergence. We prove that, for Gaussian mixture targets, the discretization error is controlled by the Shannon entropy of the latent mixture component rather than by the ambient dimension. Consequently, the leading step complexity scales linearly with latent entropy and depends only logarithmically on the second moment of the data. Our analysis also extends to discrete target distributions, where the relevant complexity is the entropy of the target rather than the dimension of the embedding space. These results suggest that diffusion sampling can remain efficient in high-dimensional spaces when the data distribution admits a compact latent representation, as is widely believed to be the case for natural images.
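
The controlling quantity named above, the Shannon entropy of the latent mixture component, depends only on the mixture weights and not on the dimension of the space the Gaussians live in. A minimal sketch of that quantity follows; the weights are made up, and the constants of the paper's bound are not reproduced.

```python
import math

def shannon_entropy(weights):
    """Entropy in nats of the latent mixture component:
    H = -sum_k pi_k * log(pi_k), skipping zero-weight components."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)

# Two hypothetical 8-component mixtures; the ambient dimension never enters.
uniform_8 = [1.0 / 8] * 8
skewed_8 = [0.93] + [0.01] * 7
h_uniform = shannon_entropy(uniform_8)  # equals log(8), the maximum for 8 components
h_skewed = shannon_entropy(skewed_8)
```

A mixture dominated by one component has far lower entropy than a uniform one with the same number of components, so under the abstract's result it would need correspondingly fewer steps at leading order.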

2605.07963 2026-05-11 cs.LG

Aggregation in conformal e-classification

Vladimir Vovk

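AI summary: This paper experimentally studies cross-conformal e-prediction, an existing method for aggregating conformal e-predictors, together with modifications of it. Aggregation balances predictive and computational efficiency while retaining validity, and the modifications are conceptually simpler and more flexible.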

Comments 23 pages, 10 figures

Abstract

Aggregating conformal predictors is a standard way of balancing their predictive and computational efficiency while retaining their validity, at least approximately. An important advantage of conformal e-predictors is that they are easier to aggregate without sacrificing their validity. This paper studies experimentally cross-conformal e-prediction, which is an existing method of aggregating conformal e-predictors, and its modifications that are conceptually simpler and more flexible.

2605.07962 2026-05-11 cs.LG cs.DC

FLAM: Evaluating Model Performance with Aggregatable Measures in Federated Learning

Fabian Stricker, Jose A. Peregrina, David Bermbach, Christian Zirpins

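AI summary: In federated learning, evaluating model performance is difficult because data are distributed across participants, and conventional aggregation methods do not always agree with centralized evaluation. This paper analyzes the causes of this inconsistency and proposes FLAM, which uses aggregatable measures to reproduce the results of centralized evaluation without a global test dataset, reflecting the model's overall performance more accurately.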

Comments Accepted for publication in 2nd IEEE International Conference on Federated Learning and Intelligent Computing Systems (FLICS2026)

Abstract

Performance evaluation is essential for assessing the quality of machine learning (ML) models and guiding deployment decisions. In federated learning (FL), assessing the performance is challenging because data are distributed across participants. Consequently, the coordinator must rely on locally computed evaluation metrics and aggregate them to assess the global model. A key challenge is that common aggregation strategies, such as weighted averaging based on the local samples per participant, do not always produce the same results as centralized evaluation. Existing definitions of performance evaluation are largely tailored to accuracy and do not generalize to other metrics, leading to inconsistencies between participant-based and centralized evaluation. However, such discrepancies are inconsistent with the FL objective and lead to a wrong calculation of the metric. To address this issue, we examine the underlying reasons for these discrepancies and propose FLAM, a performance evaluation method based on aggregatable measures that yields the same results as centralized evaluation without the need for a global test dataset.
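
The discrepancy described above is easy to reproduce with F1. The sketch below uses confusion counts as one familiar example of quantities that can be summed across participants; the abstract does not say which measures FLAM actually uses, and the per-participant numbers are invented.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical per-participant results: (tp, fp, fn, number of local samples).
participants = [
    (90, 10, 10, 200),
    (1, 0, 9, 50),
]

# Naive aggregation: sample-weighted average of locally computed F1 scores.
total_n = sum(n for _, _, _, n in participants)
weighted_avg_f1 = sum(
    f1_from_counts(tp, fp, fn) * n for tp, fp, fn, n in participants
) / total_n

# Aggregatable alternative: sum the counts first, then compute F1 once.
tp_sum = sum(p[0] for p in participants)
fp_sum = sum(p[1] for p in participants)
fn_sum = sum(p[2] for p in participants)
aggregated_f1 = f1_from_counts(tp_sum, fp_sum, fn_sum)
```

Summing the counts gives exactly the F1 a coordinator would compute on the pooled test data, whereas the weighted average of local F1 scores differs because F1 is not linear in the counts. Accuracy happens to be linear, which is why weighted averaging works for it.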

2605.07961 2026-05-11 cs.LG cs.CR cs.NI

Graph Representation Learning Augmented Model Manipulation on Federated Fine-Tuning of LLMs

Hanlin Cai, Kai Li, Houtianfu Wang, Haofan Dong, Yichen Li, Falko Dressler, Ozgur B. Akan

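AI summary: This paper studies the model manipulation threat to large language models (LLMs) under federated fine-tuning (FFT) and proposes AugMP, an augmented model manipulation strategy. A graph representation learning framework captures feature correlations among benign updates to guide the generation of deceptive malicious updates, and an iterative optimization algorithm based on an augmented Lagrangian dual formulation improves manipulation effectiveness and stealthiness. Experiments show that AugMP achieves the strongest manipulation performance across multiple LLM architectures, substantially reducing the accuracy of the global model and of local agents while evading conventional defenses.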

Abstract

Federated fine-tuning (FFT) has emerged as a privacy-preserving paradigm for collaboratively adapting large language models (LLMs). Built upon federated learning, FFT enables distributed agents to jointly refine a shared pretrained LLM by aggregating local LLM updates without sharing local raw data. However, FFT-based LLMs remain vulnerable to model manipulation threats, in which adversarial participants upload manipulated LLM updates that corrupt the aggregation process and degrade the performance of the global LLM. In this paper, we propose an Augmented Model maniPulation (AugMP) strategy against FFT-based LLMs. Specifically, we design a novel graph representation learning framework that captures feature correlations among benign LLM updates to guide the generation of malicious updates. To enhance manipulation effectiveness and stealthiness, we develop an iterative manipulation algorithm based on an augmented Lagrangian dual formulation. Through this formulation, malicious updates are optimized to embed adversarial objectives while preserving benign-like parameter characteristics. Experimental results across multiple LLM backbones demonstrate that the AugMP strategy achieves the strongest manipulation performance among all competing baselines, reducing the global LLM accuracy by up to 26% and degrading the average accuracy of local LLM agents by up to 22%. Meanwhile, AugMP maintains high statistical and geometric consistency with benign updates, enabling it to evade conventional distance- and similarity-based defense methods.

2605.07959 2026-05-11 cs.LG math.FA math.PR

Convergent Stochastic Training of Attention and Understanding LoRA

Zhengkai Sun, Dibyakanti Kumar, Alejandro F Frangi, Anirbit Mukherjee, Mingfei Sun

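AI summary: This paper studies trainability under stochastic training methods for attention layers and for Low Rank Adaptation (LoRA) on shallow neural networks. The authors present a unified theoretical framework and prove that, under mild regularization, the regression losses for an attention layer and for a LoRA parameterization satisfy a Poincaré inequality, from which it follows that an SDE mimicking stochastic gradient descent minimizes the losses. The work rigorously establishes, for the first time, trainability of attention models and LoRA structures without assumptions on the data distribution or network size, providing theoretical support for efficient training of large models.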

Abstract

Transformers have revolutionized machine learning, and deploying attention layers in a model is increasingly standard across a myriad of applications. Further, for large models, it is common to implement Low Rank Adaptation (LoRA), whereby a factorized parameterization of them is trained, to achieve a surprisingly beneficial accuracy-size trade-off. In this work, via a unified framework, we rigorously establish trainability of such models under stochastic methods. We prove that, for any mild regularization, the empirical regression loss on an attention layer and on LoRA applied to a shallow neural net both induce a Poincaré inequality for the corresponding Gibbs measure. It then follows, by invoking recent results, that a certain SDE, which mimics SGD, minimizes the corresponding losses. In both cases, our first-of-its-kind trainability results for attention and nets do not rely on any assumptions on the data or the size of the architecture.
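
For readers unfamiliar with the "factorized parameterization" mentioned above, the sketch below shows the standard LoRA form, a frozen weight W0 plus a trainable low-rank product B A, and the resulting parameter saving. It does not reproduce the paper's loss, regularizer, or analysis, and the matrix sizes and rank are illustrative.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def lora_forward(w0, b, a, x):
    """y = W0 x + B (A x): frozen weight plus trainable low-rank update."""
    base = matvec(w0, x)
    update = matvec(b, matvec(a, x))
    return [u + v for u, v in zip(base, update)]

def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)

w0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen 3x3 weight
b = [[1.0], [0.0], [2.0]]   # 3x1 factor, rank 1
a = [[0.5, 0.5, 0.0]]       # 1x3 factor
y = lora_forward(w0, b, a, [2.0, 4.0, 6.0])

# Illustrative sizes: a 4096 x 4096 layer adapted at rank 8.
full_params = 4096 * 4096
lora_params = lora_param_count(4096, 4096, rank=8)
```

At these sizes the low-rank update trains 256 times fewer parameters than the full matrix, which is the accuracy-size trade-off the abstract refers to.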

2605.07955 2026-05-11 cs.CV cs.AI

TimeLesSeg: Unified Contrast-Agnostic Cross-Sectional and Longitudinal MS Lesion Segmentation via a Stochastic Generative Model

Vicent Caselles-Ballester, Eloy Martínez-Heras, Giuseppe Pontillo, Zoe Mendelsohn, Elena M. Marrón, Juan Luis García Fernández, Laia Subirats, Jon Stutters, Jeremy Chataway, Frederik Barkhof, Sara Llufriu, Ferran Prados

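AI summary Lesion segmentation in multiple sclerosis (MS) is challenged by clinical and imaging heterogeneity, and existing deep learning methods are sensitive to changes in data distribution and input structure. This paper proposes TimeLesSeg, a unified, contrast-agnostic lesion segmentation framework that handles both cross-sectional and longitudinal imaging data. The method simulates lesion evolution with a generative model and combines it with Gaussian mixture model-based domain randomization to improve robustness across imaging conditions; experiments show that it outperforms existing methods on multiple datasets.

Details
English abstract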

Multiple sclerosis (MS) exhibits substantial clinical and radiological heterogeneity, which poses significant challenges for automatic lesion segmentation. The current deep learning-based state of the art (SOTA) is highly susceptible to changes both in distribution (e.g., a change of scanner) and in the structure of inputs, as is evident in the current divide between cross-sectional and longitudinal approaches. We introduce TimeLesSeg, a unified contrast-agnostic framework designed to segment MS lesions regardless of the presence of a temporal dimension in its inputs, with a single convolutional neural network. Our approach models pathological priors through lesion masks, which are processed together with the current scan. Cross-sectional processing is enabled by exposing the model to training cases where no prior information is available, which are modeled with an empty mask, allowing it to operate seamlessly in both scenarios. To overcome the scarcity and inconsistency of longitudinal datasets, we propose a novel generative pipeline in which patterns of lesion evolution are simulated by stochastically deforming each individual lesion with morphological operations, producing realistic prior timepoints. In parallel, we achieve contrast agnosticism through Gaussian mixture model-based domain randomization, enabling the network to experience a wide spectrum of intensity profiles. Results on three publicly available and two in-house datasets show that TimeLesSeg outperforms the contrast-agnostic state of the art on single-modality inputs across overlap- and distance-based metrics. In longitudinal processing, our method outperforms SAMSEG, and captures lesion load dynamics more accurately than both SAMSEG and LST-AI. All source code related to the development of TimeLesSeg is available at https://github.com/NeuroADaS-Lab/TimeLesSeg.

2605.07950 2026-05-11 cs.LG

Slowly Annealed Langevin Dynamics: Theory and Applications to Training-Free Guided Generation

Atsushi Nitanda, Dake Bu, Yueming Lyu, Tanya Veeravalli

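AI summary This paper studies Slowly Annealed Langevin Dynamics (SALD), a sampling method that tracks a moving target distribution and approximates the final target by slowing down time. Using a KL differential inequality, the paper establishes non-asymptotic convergence guarantees that show how time slowdown improves tracking in terms of the contraction of intermediate targets and the complexity of the path. For training-free guided generation, the authors further propose Velocity-Aware SALD (VA-SALD), which explicitly incorporates the marginal distributions of the pretrained model and uses time slowdown to correct the additional deviation introduced by guidance, yielding a theoretical framework with convergence guarantees for training-free guided generation with diffusion-based and related generative models.

Details
English abstract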

We study Slowly Annealed Langevin Dynamics (SALD), a sampler for tracking a path of moving target distributions and approximating the terminal target through time slowdown. We establish non-asymptotic convergence guarantees via a KL differential inequality, showing that slowdown improves tracking through contraction of intermediate targets and the complexity of the path. Motivated by training-free guided generation with pretrained score-based generative models, we further introduce Velocity-Aware SALD (VA-SALD), which explicitly incorporates the underlying marginal distributions of the pretrained model and uses slowdown to correct the additional deviation induced by guidance. This yields a principled framework for training-free guided generation for diffusion-based and related generative model families, together with convergence guarantees that clarify the roles of intermediate functional inequalities and guidance bias. Code is available at https://github.com/anitan0925/sald.
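
The effect of slowing down time can be illustrated with a toy experiment. The sketch below runs plain unadjusted Langevin steps on a one-dimensional Gaussian target whose mean moves linearly along the path; the target family, step size, particle count, and the function name `annealed_langevin` are all assumptions made for illustration, and this is not the SALD or VA-SALD algorithm from the paper.

```python
import math
import random

def annealed_langevin(n_steps, n_particles=300, step=0.05, end_mean=4.0, seed=0):
    """Langevin steps on a moving target N(m(t), 1) with m(t) = end_mean * t.

    The path t in [0, 1] is split into n_steps pieces, so a larger n_steps
    means the target moves less per Langevin step (a crude form of slowdown).
    Returns the particle mean at the end of the path.
    """
    rng = random.Random(seed)
    # Exact samples from the initial target N(0, 1).
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    noise_scale = math.sqrt(2.0 * step)
    for k in range(n_steps):
        m = end_mean * (k + 1) / n_steps  # mean of the current intermediate target
        # The score of N(m, 1) at x is -(x - m).
        xs = [x - step * (x - m) + noise_scale * rng.gauss(0.0, 1.0) for x in xs]
    return sum(xs) / n_particles

fast = annealed_langevin(n_steps=20)    # target moves quickly
slow = annealed_langevin(n_steps=2000)  # same path, traversed 100x more slowly
```

In this toy setting the particle mean lags behind the moving target, and the lag at the end of the path shrinks as the same path is traversed with more steps, so `slow` lands much closer to the terminal mean of 4.0 than `fast`.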

2605.07945 2026-05-11 cs.CV

Rebalancing gradient to improve self-supervised co-training of depth, odometry and optical flow predictions

Marwane Hariat, Antoine Manzanera, David Filliat

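AI summary This paper proposes CoopNet, a method that dynamically adjusts the apportionment of gradients to improve cooperation in the self-supervised co-training of depth, odometry, and optical flow prediction. It introduces a hybrid loss based on the distribution of photometric reconstruction errors, which balances learning progress across tasks. Experiments show that CoopNet improves on or is comparable to the state of the art on the KITTI and CityScapes datasets, offering a new approach to multi-task self-supervised learning.

Details
English abstract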

We present CoopNet, an approach that improves the cooperation of co-trained networks by dynamically adapting the apportionment of gradient to ensure equitable learning progress. It is applied to motion-aware self-supervised prediction of depth maps by introducing a new hybrid loss, based on a distribution model of the photometric reconstruction errors made by the paired depth + odometry networks on the one hand and the optical flow network on the other. This model essentially assumes that the pixels from moving objects (which must be discarded when training depth and odometry) correspond to those where the two reconstructions strongly disagree. We justify this model with theoretical considerations and experimental evidence. A comparative evaluation on the KITTI and CityScapes datasets shows that CoopNet improves on or is comparable to the state of the art in depth, odometry and optical flow prediction.
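
The assumption that moving-object pixels are those where the two reconstructions disagree can be sketched with a hard threshold. The paper uses a distribution model of the reconstruction errors rather than a fixed cutoff, so the thresholding rule, the `threshold` value, and the function names below are simplifying assumptions for illustration only.

```python
def disagreement_mask(recon_rigid, recon_flow, threshold):
    """Return 1 for pixels kept for depth/odometry training, 0 for discarded ones.

    A pixel is discarded when the rigid (depth + odometry) reconstruction and the
    optical-flow reconstruction differ by more than `threshold`.
    """
    return [[0 if abs(r - f) > threshold else 1 for r, f in zip(r_row, f_row)]
            for r_row, f_row in zip(recon_rigid, recon_flow)]

def masked_l1(target, recon, mask):
    """Mean absolute photometric error over the pixels kept by the mask."""
    kept = [abs(t - r)
            for t_row, r_row, m_row in zip(target, recon, mask)
            for t, r, m in zip(t_row, r_row, m_row) if m]
    return sum(kept) / len(kept) if kept else 0.0

# Tiny grayscale example with made-up intensities in [0, 1].
target      = [[0.5, 0.5], [0.5, 0.5]]
recon_rigid = [[0.5, 0.9], [0.4, 0.5]]   # top-right pixel: rigid model fails
recon_flow  = [[0.5, 0.5], [0.45, 0.5]]  # flow explains that pixel well

mask = disagreement_mask(recon_rigid, recon_flow, threshold=0.2)
loss = masked_l1(target, recon_rigid, mask)
```

Only the top-right pixel is masked out here, so the rigid reconstruction loss is averaged over the three remaining pixels.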

2605.07943 2026-05-11 cs.RO cs.AI cs.CV cs.LG

TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning

Giacomo Spigler

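AI summary This paper presents TAVIS, a benchmark for evaluating active vision and anticipatory gaze in imitation learning. It comprises two complementary task suites, targeting the head and the hands respectively, built on two humanoid robot platforms, and introduces three evaluation methods: a head-camera protocol compared against a fixed camera, the GALT metric grounded in cognitive science, and procedural in-distribution/out-of-distribution splits. Experiments show that active vision benefits task performance, although the effect depends on task type, and reveal robustness problems of imitation learning policies under distribution shift.

Details
English abstract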

Active vision -- where a policy controls its own gaze during manipulation -- has emerged as a key capability for imitation learning, with multiple independent systems demonstrating its benefits in the past year. Yet there is no shared benchmark to compare approaches or quantify what active vision contributes, on which task types, and under what conditions. We introduce TAVIS, an evaluation infrastructure for active-vision imitation learning, with two complementary task suites -- TAVIS-Head (5 tasks, global search via pan/tilt necks) and TAVIS-Hands (3 tasks, local occlusion via wrist cameras) -- on two humanoid torso embodiments (GR1T2, Reachy2), built on IsaacLab. TAVIS provides three evaluation primitives: a paired headcam-vs-fixedcam protocol on identical demonstrations; GALT (Gaze-Action Lead Time), a novel metric grounded in cognitive science and HRI that quantifies anticipatory gaze in learned policies; and procedural ID/OOD splits. Baseline experiments with Diffusion Policy and $π_0$ reveal that (i) active vision generally helps, but benefits are task-conditional rather than uniform; (ii) multi-task policies degrade sharply under controlled distribution shifts on both suites; and (iii) imitation alone yields anticipatory gaze, with median lead times comparable to the human teleoperator reference. Code, evaluation scripts, demonstrations (LeRobot v3.0; ~2200 episodes) and trained baselines are released at https://github.com/spiglerg/tavis and https://huggingface.co/tavis-benchmark.

2605.07940 2026-05-11 cs.CV

Delta-Adapter: Scalable Exemplar-Based Image Editing with Single-Pair Supervision

Jiacheng Chen, Songze Li, Han Fu, Baoquan Zhao, Wei Liu, Yanyan Liang, Li Qing, Xudong Mao

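AI summary Delta-Adapter is a scalable exemplar-based image editing method with single-pair supervision that learns transferable editing semantics without textual guidance. It uses a pre-trained vision encoder to extract the semantic delta between the source and target images and injects it into a pre-trained image editing model through a Perceiver-based adapter, thereby editing the query image. A semantic delta consistency loss further improves the fidelity and semantic consistency of the edits; experiments show that the method outperforms existing methods on a range of editing tasks and generalizes better.

Details
English abstract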

Exemplar-based image editing applies a transformation defined by a source-target image pair to a new query image. Existing methods rely on a pair-of-pairs supervision paradigm, requiring two image pairs sharing the same edit semantics to learn the target transformation. This constraint makes training data difficult to curate at scale and limits generalization across diverse edit types. We propose Delta-Adapter, a method that learns transferable editing semantics under single-pair supervision, requiring no textual guidance. Rather than directly exposing the exemplar pair to the model, we leverage a pre-trained vision encoder to extract a semantic delta that encodes the visual transformation between the two images. This semantic delta is injected into a pre-trained image editing model via a Perceiver-based adapter. Since the target image is never directly visible to the model, it can serve as the prediction target, enabling single-pair supervision without requiring additional exemplar pairs. This formulation allows us to leverage existing large-scale editing datasets for training. To further promote faithful transformation transfer, we introduce a semantic delta consistency loss that aligns the semantic change of the generated output with the ground-truth semantic delta extracted from the exemplar pair. Extensive experiments demonstrate that Delta-Adapter consistently improves both editing accuracy and content consistency over four strong baselines on seen editing tasks, while also generalizing more effectively to unseen editing tasks. Code will be available at https://delta-adapter.github.io.

2605.07938 2026-05-11 cs.LG

Prototype Guided Post-pretraining for Single-Cell Representation Learning

Sachini Weerasekara, Natasha Darras, Sagar Kamarthi, Colles Price, Jacqueline Isaacs

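AI summary This paper studies the generalization problems in single-cell representation learning caused by imbalanced cell-type distributions and covariate shift in gene expression data. To address them, the authors propose CellRefine, a post-pretraining method that uses marker-gene sets as structural priors to guide the refinement of the latent embedding space and thereby improve model performance. Experiments show that the method improves downstream performance across multiple computational biology tasks, with gains of up to 15%.

Details
English abstract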

Single-cell representation learning (SCRL) from gene expression data offers a way to uncover the complex regulatory logic underlying cellular function. Inspired by large language models in natural language modeling, several single-cell pretrained models have recently been proposed that treat genes as tokens and cells as sentences. However, these models are fundamentally limited by the long-tailed nature of cell-type distributions and struggle to generalize under covariate shifts in gene expression data. While fine-tuning is often used to mitigate these issues, we observe that performance remains bounded. To address this challenge, we introduce CellRefine, a post-pretraining method that operates between the pretraining and fine-tuning stages of a single-cell foundation model. CellRefine uses a multi-faceted objective that incorporates marker-gene sets as structural priors to guide post-pretraining and refine the latent embedding manifold of cells. Across multiple computational biology tasks, empirical results show that CellRefine consistently improves downstream performance, yielding gains up to 15%.

2605.07937 2026-05-11 cs.CL

Ask Early, Ask Late, Ask Right: When Does Clarification Timing Matter for Long-Horizon Agents?

Anmol Gulati, Hariom Gupta, Elias Lumer, Sahil Sen, Vamse Kumar Subbiah

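AI summary This paper studies how the timing of clarification affects agent performance on long-horizon tasks, finding that the value of clarification changes as the task progresses and that earlier is not always better. Using a controlled clarification-injection framework, the authors systematically evaluate clarification timing under different kinds of missing information across multiple task dimensions and models, identifying the optimal clarification windows for information types such as goal and input. The study also shows that current frontier models generally fail to ask for clarification at the optimal time, providing a basis for designing timing-aware clarification policies.

Details
English abstract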

Long-horizon AI agents execute complex workflows spanning hundreds of sequential actions, yet a single wrong assumption early on can cascade into irreversible errors. When instructions are incomplete, the agent must decide not only whether to ask for clarification but when, and no prior work measures how clarification value changes over the course of execution. We introduce a forced-injection framework that provides ground-truth clarifications at controlled points in the agent's trajectory across four information dimensions (goal, input, constraint, context), three agent benchmarks, and four frontier models (three per benchmark; one on a single benchmark only; 84 task variants; 6,000+ runs). Counter to the common intuition that "earlier is always better," we find that the value of clarification depends sharply on what information is missing: goal clarification loses nearly all value after 10% of execution (pass@3 drops from 0.78 to baseline), while input clarification retains value through roughly 50%. Deferring any clarification type past mid-trajectory degrades performance below never asking at all. Cross-model Kendall tau correlations (0.78-0.87 among models sharing identical task coverage; 0.34-0.67 across the full 4-model panel) confirm these timing profiles are substantially task-intrinsic. A complementary study of 300 unscripted sessions reveals that no current frontier model asks within the empirically optimal window, with strategies ranging from over-asking (52% of sessions) to never asking at all. These empirical demand curves provide the quantitative foundation that existing theoretical frameworks require but have lacked, and establish concrete design targets for timing-aware clarification policies. Code and data will be publicly released.
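
The cross-model agreement above is reported as Kendall tau rank correlation. As a reminder of what that statistic measures, the sketch below computes the tie-free variant (tau-a) for two sequences; the abstract does not say which variant the authors use, and the pass rates in the example are made-up numbers, not results from the paper.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs

# Hypothetical pass rates of two models at four clarification injection points.
model_a = [0.9, 0.7, 0.5, 0.3]
model_b = [0.8, 0.4, 0.6, 0.2]
tau = kendall_tau(model_a, model_b)
```

Of the six pairs of injection points, five are ordered the same way by both models and one is swapped, which gives a tau of (5 - 1) / 6.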

2605.07935 2026-05-11 cs.AI cs.MA

TraceFix: Repairing Agent Coordination Protocols with TLA+ Counterexamples

Shuren Xia, Qiwei Li, Taqiya Ehsan, Jorge Ortiz

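AI summary This paper presents TraceFix, a verification-first framework for multi-agent coordination with large language models. It generates a structured intermediate representation of the protocol from a task description, iteratively repairs the protocol with the TLA+ model checker until verification succeeds, and then compiles the verified protocol into per-agent system prompts that run under a runtime monitor. Experiments show efficient verification and execution across the evaluated tasks, with substantially higher task completion and lower deadlock and livelock rates.

Details
English abstract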

We present TraceFix, a verification-first pipeline for Large Language Model (LLM) multi-agent coordination. An agent synthesizes a protocol topology as a structured intermediate representation (IR) from a task description, generates PlusCal coordination logic, and iteratively repairs the protocol using counterexamples from the TLA+ model checker (TLC) until verification succeeds. Verified process bodies are compiled into per-agent system prompts and executed under a runtime monitor that rejects out-of-topology coordination operations. On 48 tasks spanning 16 scenario families, all tasks reach full TLC verification; 62.5% pass on the first attempt and none requires more than four repair iterations. State spaces span six orders of magnitude yet verification completes in under 60 s for every task. A 3,456-run runtime comparison shows that topology-monitored execution achieves the highest task completion (89.4% average, 81.5% full) and that runtimes using the verified protocol degrade at roughly half the rate of prompt-only and chat-only baselines when model capability is reduced. A paired ablation under a fixed runtime shows that TLC-verified protocols cut deadlock/livelock (DL/LL) from 31.1% to 14.1%, with the largest separation under fault injection.

2605.07933 2026-05-11 cs.CL

How to Train Your Latent Diffusion Language Model Jointly With the Latent Space

Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

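AI summary This paper proposes a jointly trained Latent Diffusion Language Model (LDLM), which trains the latent encoder, diffusion model, and decoder together to build a latent space suited to diffusion. The authors find that naive joint training degrades generation quality, and propose a simple training recipe consisting of an MSE decoder loss, diffusion-to-encoder warmup, adaptive timestep sampling, and decoder-input noise, which substantially improves generation. Experiments show that LDLM outperforms existing discrete and continuous diffusion language models in generation performance while being 2 to 13 times faster, indicating that jointly learning the latent space is important for making latent diffusion competitive for text generation.

Details
English abstract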

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent diffusion modeling is constructing a suitable latent space. In this work, we present the Latent Diffusion Language Model (LDLM), in which the latent encoder, diffusion model, and decoder are trained jointly. LDLM builds its latent space by reshaping the representations of a pre-trained language model with a trainable encoder, yielding latents that are easy to both denoise and decode into tokens. We show that naive joint training produces a low-quality diffusion model, and propose a simple training recipe consisting of an MSE decoder loss, diffusion-to-encoder warmup, adaptive timestep sampling, and decoder-input noise. Ablations show that each component substantially impacts generation performance. On OpenWebText and LM1B, LDLM achieves better generation performance than existing discrete and continuous diffusion language models while being $2{\text -}13\times$ faster, indicating that jointly learning the latent space is a key step toward making latent diffusion competitive for text generation.

2605.07930 2026-05-11 cs.LG cs.AI

INO-SGD: Addressing Utility Imbalance under Individualized Differential Privacy

Xiao Tian, Jue Fan, Rachael Hwee Ling Sim, Bryan Kian Hsiang Low

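AI summary This paper studies the utility imbalance that arises under individualized differential privacy (IDP) when privacy requirements differ: data with stronger privacy requirements may be severely underrepresented during training, which hurts model performance on similar data after deployment. The authors propose the INO-SGD algorithm, which strategically down-weights data within each training batch to improve performance on the more private data. The algorithm is specifically designed to satisfy IDP, whereas existing techniques for utility imbalance neither satisfy IDP nor adapt easily to the IDP setting. Experiments validate the effectiveness of the approach.

Comments Accepted to the 14th International Conference on Learning Representations (ICLR-26)

Details
English abstract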

Differential privacy (DP) is widely employed in machine learning to protect confidential or sensitive training data from being revealed. As data owners gain greater control over their data due to personal data ownership, they are more likely to set their own privacy requirements, necessitating individualized DP (IDP) to fulfil such requests. In particular, owners of data from more sensitive subsets, such as positive cases of stigmatized diseases, likely set stronger privacy requirements, as leakage of such data could incur more serious societal impact. However, existing IDP algorithms induce a critical utility imbalance problem: Data from owners with stronger privacy requirements may be severely underrepresented in the trained model, resulting in poorer performance on similar data from subsequent users during deployment. In this paper, we analyze this problem and propose the INO-SGD algorithm, which strategically down-weights data within each batch to improve performance on the more private data across all iterations. Notably, our algorithm is specially designed to satisfy IDP, while existing techniques addressing utility imbalance neither satisfy IDP nor can be easily adapted to do so. Lastly, we demonstrate the empirical feasibility of our approach.

2605.07925 2026-05-11 cs.CL

How Value Induction Reshapes LLM Behaviour

Arnav Arora, Natalie Schluter, Katherine Metcalf, Maartje ter Hoeve

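AI summary This study examines how inducing values in conversational large language models affects their behaviour. By fine-tuning models on curated value subsets, it analyzes the impact of value induction on the expression of other values, safety, anthropomorphic language use, and question-answering performance. The results show that inducing specific values strengthens the expression of related, and sometimes contrasting, values and can improve model safety, but also increases the use of anthropomorphic language, making models more validating and sycophantic.

Comments Accepted to Findings of ACL 2026

Details
English abstract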

Conversational Large Language Models are post-trained on language that expresses specific behavioural traits, such as curiosity, open-mindedness, and empathy, and values, such as helpfulness, harmlessness, and honesty. This is done to increase utility, ensure safety, and improve the experience of the people interacting with the model. However, values are complex and inter-related -- inducing one could modify behaviour on another. Further, inducing certain values can make models more addictive or sycophantic through language used in the generations, with a potential detrimental effect on the user. We investigate these and other unintended effects of value induction into models. We fine-tune models using curated value subsets of existing preference datasets, measuring the impact of value induction on expression of other values, model safety, anthropomorphic language, and various QA benchmarks. We find that (i) inducing values leads to expression of other related, and sometimes contrastive values, (ii) inducing positive values increases safety, and (iii) all values increase anthropomorphic language use, making models more validating and sycophantic.