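arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.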

2605.06512 2026-05-08 cs.CV

DCR: Counterfactual Attractor Guidance for Rare Compositional Generation

Taewon Kang, Matthias Zwicker

Comments 40 pages, 33 figures


Abstract

Diffusion models generate realistic visual content, yet often fail to produce rare but plausible compositions. When prompted with combinations that are valid but underrepresented in training data, such as a snowy beach or a rainbow at night, the generation process frequently collapses toward more common alternatives. We identify this failure mode as default completion bias, where denoising trajectories are implicitly attracted toward high-frequency semantic configurations. Existing guidance mechanisms do not explicitly model this competing tendency and therefore struggle to prevent such collapse. We introduce Default Completion Repulsion (DCR), a training-free framework that explicitly models and suppresses default completion behavior. DCR constructs a counterfactual attractor by relaxing the rare compositional factor while preserving surrounding semantics, inducing an alternative denoising trajectory reflecting the model's preferred completion. We define the discrepancy between target and attractor trajectories as a counterfactual drift, and propose a projection-based repulsion mechanism that removes guidance components aligned with this drift direction. This suppresses undesired frequent completions while preserving other semantic components. DCR operates entirely within the standard diffusion sampling process without retraining or architectural modification. Experiments on rare compositional prompts show that DCR improves compositional fidelity while maintaining visual quality. Our analysis further shows that the framework exposes and counteracts intrinsic model biases, offering a new perspective on controllable generation beyond explicit constraint enforcement.
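The projection-based repulsion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `repel_default_completion`, the sign convention for the drift (attractor prediction minus target prediction), and the choice to remove only the positively aligned component are all assumptions made for illustration, and plain Python lists stand in for noise-prediction tensors.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def repel_default_completion(guidance, eps_target, eps_attractor, tol=1e-12):
    """Project out the part of `guidance` that points along the drift.

    Assumptions (not taken from the paper):
      - drift = eps_attractor - eps_target, i.e. the direction from the
        target prediction toward the model's default completion;
      - only a positive component along the drift is removed.
    """
    drift = [a - t for a, t in zip(eps_attractor, eps_target)]
    norm_sq = dot(drift, drift)
    if norm_sq < tol:
        # Target and attractor agree, so there is nothing to repel.
        return list(guidance)
    coeff = max(dot(guidance, drift) / norm_sq, 0.0)
    return [g - coeff * d for g, d in zip(guidance, drift)]


# Toy 2-D example: the drift lies along the first axis.
cleaned = repel_default_completion([1.0, 1.0], [0.0, 0.0], [1.0, 0.0])
print(cleaned)
```

In this toy case the component of the guidance along the drift is removed while the orthogonal component is left untouched, which mirrors the abstract's claim that other semantic components are preserved.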


2605.06510 2026-05-08 cs.LG cs.AI

Is One Layer Enough? Understanding Inference Dynamics in Tabular Foundation Models

Amir Rezaei Balef, Mykhailo Koshil, Katharina Eggensperger

Comments Accepted at the 43rd International Conference on Machine Learning (ICML 2026)

2605.06509 2026-05-08 cs.CV

FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction

Fangda Chen, Shanshan Zhao, Longrong Yang, Chuanfu Xu, Zhigang Luo, Long Lan

2605.06507 2026-05-08 cs.CV cs.LG

MARBLE: Multi-Aspect Reward Balance for Diffusion RL

Canyu Zhao, Hao Chen, Yunze Tong, Yu Qiao, Jiacheng Li, Chunhua Shen

Comments Homepage and code repo: https://aim-uofa.github.io/MARBLE


Abstract

Reinforcement learning fine-tuning has become the dominant approach for aligning diffusion models with human preferences. However, assessing images is intrinsically a multi-dimensional task, and multiple evaluation criteria need to be optimized simultaneously. Existing practice deals with multiple rewards by training one specialist model per reward, optimizing a weighted-sum reward $R(x)=\sum_k w_k R_k(x)$, or sequentially fine-tuning with a hand-crafted stage schedule. These approaches either fail to produce a unified model that can be jointly trained on all rewards or necessitate heavy, manually tuned sequential training. We find that the failure stems from using a naive weighted-sum reward aggregation. This approach suffers from a sample-level mismatch because most rollouts are specialist samples, highly informative for certain reward dimensions but irrelevant for others; consequently, weighted summation dilutes their supervision. To address this issue, we propose MARBLE (Multi-Aspect Reward BaLancE), a gradient-space optimization framework that maintains independent advantage estimators for each reward, computes per-reward policy gradients, and harmonizes them into a single update direction, without manually tuned reward weighting, by solving a quadratic programming problem. We further propose an amortized formulation that exploits the affine structure of the loss used in DiffusionNFT to reduce the per-step cost from K+1 backward passes to near single-reward baseline cost, together with EMA smoothing on the balancing coefficients to stabilize updates against transient single-batch fluctuations. On SD3.5 Medium with five rewards, MARBLE improves all five reward dimensions simultaneously, turns the worst-aligned reward's gradient cosine from negative under weighted summation in 80% of mini-batches to consistently positive, and runs at 0.97X the training speed of baseline training.
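To make the contrast between weighted summation and gradient-space balancing concrete, here is a minimal sketch for two rewards. The abstract does not specify MARBLE's quadratic program, so the sketch swaps in the classic two-gradient min-norm problem (minimize $\|\alpha g_1 + (1-\alpha) g_2\|^2$ over $\alpha \in [0,1]$), which has a closed-form solution. The function name `min_norm_direction` and the toy gradients are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_direction(g1, g2, tol=1e-12):
    """Min-norm convex combination of two gradients (closed-form 1-D QP).

    This is a stand-in technique, not the QP used by MARBLE.
    Returns (alpha, direction) with direction = alpha*g1 + (1-alpha)*g2.
    """
    diff = [b - a for a, b in zip(g1, g2)]  # g2 - g1
    denom = dot(diff, diff)
    if denom < tol:
        # Gradients coincide, so any alpha gives the same direction.
        return 0.5, list(g1)
    alpha = dot(g2, diff) / denom
    alpha = min(max(alpha, 0.0), 1.0)  # keep the combination convex
    direction = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, direction


# Toy conflicting gradients: an equal-weight sum opposes g1.
g1, g2 = [1.0, 0.0], [-3.0, 1.0]
summed = [0.5 * a + 0.5 * b for a, b in zip(g1, g2)]
alpha, balanced = min_norm_direction(g1, g2)
print(dot(summed, g1), dot(balanced, g1), dot(balanced, g2))
```

In this toy case the equal-weight sum has a negative inner product with `g1`, while the min-norm direction has a positive inner product with both gradients, which is the kind of behavior the abstract reports for the worst-aligned reward.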


2605.06500 2026-05-08 cs.LG cs.AI

Operator-Guided Invariance Learning for Continuous Reinforcement Learning

Zuyuan Zhang, Fei Xu Yu, Tian Lan

2605.06494 2026-05-08 cs.AI

From Token Lists to Graph Motifs: Weisfeiler-Lehman Analysis of Sparse Autoencoder Features

Ruben Fernandez-Boullon, Pablo Magariños-Docampo, Javier Perez-Robles

2605.06490 2026-05-08 cs.AI cs.CY

Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors

Jonas Wiedermann-Möller, Leonard Dung, Maksym Andriushchenko

2605.06487 2026-05-08 cs.CV cs.AI

3D MRI Image Pretraining via Controllable 2D Slice Navigation Task

Yu Wang, Qingchao Chen

Comments 9 pages, 5 figures

2605.06481 2026-05-08 cs.RO

OA-WAM: Object-Addressable World Action Model for Robust Robot Manipulation

Yushan Liu, Peibo Sun, Shoujie Li, Yifan Xie, Lingfeng Zhang, Xintao Chao, Shiyuan Dong, Fang Chen, Xiao-Ping Zhang, Wenbo Ding

2605.06480 2026-05-08 cs.AI cs.CL

Patch-Effect Graph Kernels for LLM Interpretability

Ruben Fernandez-Boullon, David N. Olivieri

2605.06478 2026-05-08 cs.RO

GA3T: A Ground-Aerial Terrain Traversability Dataset for Heterogeneous Robot Teams in Unstructured Environments

Siwei Cai, Knut Peterson, Quan Tran, Christian Ricks, Dhanush Parthasarathy, Amir Kaidarov, Neil Deshpande, Sukaina Najm, David Han, Lifeng Zhou

Comments For DARS 2026

2605.06477 2026-05-08 cs.CV

GeoStack: A Framework for Quasi-Abelian Knowledge Composition in VLMs

Pranav Mantini, Shishir K. Shah

2605.06476 2026-05-08 cs.CL

Towards Emotion Consistency Analysis of Large Language Models in Emotional Conversational Contexts

Sneha Oram, Ojaswita Bhushan, Pushpak Bhattacharyya

Comments Under review

2605.06472 2026-05-08 cs.LG

Efficient Serving for Dynamic Agent Workflows with Prediction-based KV-Cache Management

Haoyu Zheng, Fangcheng Fu, Jia Wu, Binhang Yuan, Yongqiang Zhang, Hao Wang, Yuanyuan Zhu, Xiao Yan, Jiawei Jiang

2605.06470 2026-05-08 cs.LG

Hitting Time Isomorphism for Multi-Stage Planning with Foundation Policies

Magnus Victor Boock, Abdullah Akgül, Mustafa Mert Çelikok, Melih Kandemir

2605.06467 2026-05-08 cs.LG math.AT

No Triangulation Without Representation: Generalization in Topological Deep Learning

Johannes S. Schmidt, Martin Carrasco, Ernst Röell, Guy Wolf, Nello Blaser, Bastian Rieck

2605.06466 2026-05-08 cs.LG

Diversity Curves for Graph Representation Learning

Katharina Limbeck, Nadja Häusermann, Martin Carrasco, Guy Wolf, Bastian Rieck

2605.06462 2026-05-08 cs.LG math.CO

Invariant-Based Diagnostics for Graph Benchmarks

Richard von Moos, Mathieu Alain, Bastian Rieck

2605.06460 2026-05-08 cs.LG

MINER: Mining Multimodal Internal Representation for Efficient Retrieval

Weien Li, Rui Song, Zeyu Li, Haochen Liu, Gonghao Zhang, Difan Jiao, Zhenwei Tang, Bowei He, Haolun Wu, Xue Liu, Ye Yuan

Comments Preprint

2605.06458 2026-05-08 cs.LG cs.CL

Invariant Features in Language Models: Geometric Characterization and Model Attribution

Agnibh Dasgupta, Abdullah Tanvir, Xin Zhong

2605.06457 2026-05-08 cs.AI

Beyond Task Success: Measuring Workflow Fidelity in LLM-Based Agentic Payment Systems

Donghao Huang, Joon Kiat Chua, Zhaoxia Wang

Comments 6 pages, 2 tables. Accepted at AI and Data Science for Digital Finance (AIDS4DF) Workshop, PAKDD 2026

2605.06455 2026-05-08 cs.AI

PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors

Xinmiao Huang, Jinwei Hu, Rajarshi Roy, Changshun Wu, Yi Dong, Xiaowei Huang

Comments Under Review

2605.06454 2026-05-08 cs.LG cs.AI

ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization

Maresa Schröder, Pascal Janetzky, Michael Klar, Stefan Feuerriegel

2605.06447 2026-05-08 cs.LG

Scene-Adaptive Continual Learning for CSI-based Human Activity Recognition with Mixture of Experts

Wenhan Zheng, Yuyi Mao, Ivan Wang-Hei Ho

Comments 5 pages, 3 figures, 3 tables, this article was submitted to IEEE for possible publication

2605.06446 2026-05-08 cs.LG

FedFrozen: Two-Stage Federated Optimization via Attention Kernel Freezing

Junye Du, Zhenghao Li, Yushi Feng, Long Feng

Comments 25 pages

2605.06444 2026-05-08 cs.AI

SCRuB: Social Concept Reasoning under Rubric-Based Evaluation

Jamelle Watson-Daniels, Himaghna Bhattacharjee, Skyler Wang, Brandon Handoko, Antonio Li, Anaelia Ovalle, Mahesh Pasupuleti, Candace Ross, Vidya Sarma, Arjun Subramonian, Karen Ullrich, Will van der Vaart, Yijing Xin, Maximilian Nickel

2605.06434 2026-05-08 cs.AI

Knowledge Graphs, the Missing Link in Agentic AI-based Formal Verification

Vaisakh Naduvodi Viswambharan, Keerthan Kopparam Radhakrishna, Deepak Narayan Gadde, Aman Kumar

Comments To appear at the IEEE International Conference on IC Design and Technology 2026 (ICICDT), June 22 - 24, 2026, Dresden, Germany

2605.06433 2026-05-08 cs.LG

Federated Cross-Client Subgraph Pattern Detection

Selin Ceydeli, Rui Wang, Kubilay Atasu

2605.06432 2026-05-08 cs.RO

TouchDrive: Electronics-Free Tactile Sensing Interface for Assistive Grasping

Jing Xu, Xuezhi Niu, Didem Gurdur Broo, Klas Hjort

Comments Accepted at ICRA 2026 workshop on Visuo-Tactile Perception, Learning, Control for Manipulation: Embodied Tactile Intelligence in Predictive Perception, Learning & Control in Grasping & Manipulation, Emerging the Role of Embodiment and Visuo-Tactile-LLM Foundation Models (ICRA RoboTac 2026)

2605.06426 2026-05-08 cs.CL

From 124 Million Tokens to 1,021 Neologisms: A Large-Scale Pipeline for Automatic Neologism Detection

Diego Rossini, Lonneke van der Plas

Comments 14 pages, 5 tables. Accepted at NeoLLM 2026 Workshop, co-located with LREC-COLING 2026