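arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.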

2604.05279 2026-04-08 cs.AI

Pressure, What Pressure? Sycophancy Disentanglement in Language Models via Reward Decomposition

Muhammad Ahmed Mohsin, Ahsan Bilal, Muhammad Umer, Emily Fox

Comments Submitted to COLM 2026

Abstract

Large language models exhibit sycophancy, the tendency to shift their stated positions toward perceived user preferences or authority cues regardless of evidence. Standard alignment methods fail to correct this because scalar reward models conflate two distinct failure modes into a single signal: pressure capitulation, where the model changes a correct answer under social pressure, and evidence blindness, where the model ignores the provided context entirely. We operationalise sycophancy through formal definitions of pressure independence and evidence responsiveness, serving as a working framework for disentangled training rather than a definitive characterisation of the phenomenon. We propose the first approach to sycophancy reduction via reward decomposition, introducing a multi-component Group Relative Policy Optimisation (GRPO) reward that decomposes the training signal into five terms: pressure resistance, context fidelity, position consistency, agreement suppression, and factual correctness. We train using a contrastive dataset pairing pressure-free baselines with pressured variants across three authority levels and two opposing evidence contexts. Across five base models, our two-phase pipeline consistently reduces sycophancy on all metric axes, with ablations confirming that each reward term governs an independent behavioural dimension. The learned resistance to pressure generalises beyond our training methodology and prompt structure, reducing answer-priming sycophancy by up to 17 points on SycophancyEval despite the absence of such pressure forms during training.
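
The reward decomposition described in the abstract can be illustrated with a minimal sketch. The abstract names the five terms but not how each is scored or weighted, so the `RewardComponents` class, the unit weights, and the assumption that each score lies in [0, 1] are all hypothetical. The group-relative normalisation follows the standard GRPO formulation (reward minus group mean, divided by group standard deviation), which may differ in detail from the authors' implementation.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass
class RewardComponents:
    """Per-response scores for the five terms named in the abstract.

    Assumption: each score is already computed elsewhere and lies in [0, 1].
    """
    pressure_resistance: float
    context_fidelity: float
    position_consistency: float
    agreement_suppression: float
    factual_correctness: float


# Hypothetical unit weights; the abstract does not give the actual values.
WEIGHTS: Dict[str, float] = {
    "pressure_resistance": 1.0,
    "context_fidelity": 1.0,
    "position_consistency": 1.0,
    "agreement_suppression": 1.0,
    "factual_correctness": 1.0,
}


def total_reward(c: RewardComponents, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the five terms (the weighting scheme is an assumption)."""
    return sum(w * getattr(c, name) for name, w in weights.items())


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Standard GRPO-style advantage: normalise each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two sampled responses to the same pressured prompt (made-up scores).
group = [
    RewardComponents(1.0, 1.0, 1.0, 1.0, 1.0),  # holds a correct position
    RewardComponents(0.0, 1.0, 0.0, 0.0, 0.0),  # capitulates under pressure
]
rewards = [total_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
```

Keeping the terms separate until the final sum makes it possible to zero out one weight at a time, which is the kind of per-term ablation the abstract reports.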

2604.05277 2026-04-08 cs.RO

Semantic analysis of behavior in a DNA-functionalized molecular swarm

Tom Bachard, Gong Yiming, Ibuki Kawamata, Akira Kakugo, Nathanael Aubert-Kato

Comments 10 pages main text, 2 pages annexes, 9 figures in main text, 2 figures in annexes

2604.05274 2026-04-08 cs.AI

Simulating the Evolution of Alignment and Values in Machine Intelligence

Jonathan Elsworth Eicher

Comments 9 pages, 7 figures

2604.05273 2026-04-08 cs.CL

Beneath the Surface: Investigating LLMs' Capabilities for Communicating with Subtext

Kabir Ahuja, Yuxuan Li, Andrew Kyle Lampinen

2604.05272 2026-04-08 cs.RO cs.CV

Final Report, Center for Computer-Integrated Surgical Systems and Technology, NSF ERC Cooperative Agreement EEC9731748, Volume 1

Russell H. Taylor, Gregory D. Hager, Ralph Etienne-Cummings, Eric Grimson, Ron Kikinis, Cameron Riviere

2604.05267 2026-04-08 cs.CL

Do Domain-specific Experts exist in MoE-based LLMs?

Giang Do, Hung Le, Truyen Tran

Comments 15 pages

2604.05259 2026-04-08 cs.CV cs.RO

Coverage Optimization for Camera View Selection

Timothy Chen, Adam Dai, Maximilian Adang, Grace Gao, Mac Schwager

2604.05257 2026-04-08 cs.LG cs.AI

Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation

Umang Dobhal, Christina Garcia, Sozo Inoue

Comments 16 pages, 10 figures, 2 tables

2604.05256 2026-04-08 cs.CV

Protecting and Preserving Protest Dynamics for Responsible Analysis

Cohen Archbold, Usman Hassan, Nazmus Sakib, Sen-ching Cheung, Abdullah-Al-Zubaer Imran

Comments 21 pages, 6 figures, Submitted to ACM Journal on Responsible Computing

2604.05254 2026-04-08 cs.AI cs.LG

EAGLE: Edge-Aware Graph Learning for Proactive Delivery Delay Prediction in Smart Logistics Networks

Zhiming Xue, Menghao Huo, Yujue Wang

2604.05250 2026-04-08 cs.LG cs.CL

DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models

Satyam Goyal, Kushal Patel, Tanush Mittal, Arjun Laxman

2604.05248 2026-04-08 cs.LG cs.CL

Improving Sparse Memory Finetuning

Satyam Goyal, Anirudh Kanchi, Garv Shah, Prakhar Gupta

2604.05243 2026-04-08 cs.CL cs.AI

Exemplar Retrieval Without Overhypothesis Induction: Limits of Distributional Sequence Learning in Early Word Learning

Jon-Paul Cacioli

Comments 27 pages, 7 figures, 22 references. Pre-registered study (OSF: https://osf.io/qj9hb/). Code and data: https://github.com/synthiumjp/overhypothesis. Submitted to Cognitive Computation

2604.05230 2026-04-08 cs.LG cs.AI cs.NA math.NA math.OC

Curvature-Aware Optimization for High-Accuracy Physics-Informed Neural Networks

Anas Jnini, Elham Kiyani, Khemraj Shukla, Jorge F. Urban, Nazanin Ahmadi Daryakenari, Johannes Muller, Marius Zeinhofer, George Em Karniadakis

Comments 54 pages, 24 figures

2604.05229 2026-04-08 cs.AI cs.HC cs.LG cs.MA

From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI

Christopher Koch

Comments 5 pages, 2 tables

2604.05227 2026-04-08 cs.CV

Active Measurement of Two-Point Correlations

Max Hamilton, Daniel Sheldon, Subhransu Maji

Comments AIStats 2026

2604.05226 2026-04-08 cs.RO cs.AI cs.CL cs.HC

RoboPlayground: Democratizing Robotic Evaluation through Structured Physical Domains

Yi Ru Wang, Carter Ung, Evan Gubarev, Christopher Tan, Siddhartha Srinivasa, Dieter Fox

Comments Yi Ru Wang and Carter Ung contributed equally

2604.05224 2026-04-08 cs.AI

Attribution Bias in Large Language Models

Eliza Berman, Bella Chang, Daniel B. Neill, Emily Black

Comments 21 pages

2604.05217 2026-04-08 cs.LG cs.CL

On the Geometry of Positional Encodings in Transformers

Giansalvo Cirrincione

2604.05215 2026-04-08 cs.CV q-bio.NC

Hierarchical Mesh Transformers with Topology-Guided Pretraining for Morphometric Analysis of Brain Structures

Yujian Xiong, Mohammad Farazi, Yanxi Chen, Wenhui Zhu, Xuanzhao Dong, Natasha Lepore, Yi Su, Raza Mushtaq, Stephen Foldes, Andrew Yang, Yalin Wang

2604.05212 2026-04-08 cs.CV

Boxer: Robust Lifting of Open-World 2D Bounding Boxes to 3D

Daniel DeTone, Tianwei Shen, Fan Zhang, Lingni Ma, Julian Straub, Richard Newcombe, Jakob Engel

Comments project page: http://facebookresearch.github.io/boxer

2604.05210 2026-04-08 cs.CV

Integration of Object Detection and Small VLMs for Construction Safety Hazard Identification

Muhammad Adil, Mehmood Ahmed, Muhammad Aqib, Vicente A. Gonzalez, Gaang Lee, Qipei Mei

Abstract

Accurate and timely identification of construction hazards around workers is essential for preventing workplace accidents. While large vision-language models (VLMs) demonstrate strong contextual reasoning capabilities, their high computational requirements limit their applicability in near real-time construction hazard detection. In contrast, small vision-language models (sVLMs) with fewer than 4 billion parameters offer improved efficiency but often suffer from reduced accuracy and hallucination when analyzing complex construction scenes. To address this trade-off, this study proposes a detection-guided sVLM framework that integrates object detection with multimodal reasoning for contextual hazard identification. The framework first employs a YOLOv11n detector to localize workers and construction machinery within the scene. The detected entities are then embedded into structured prompts to guide the reasoning process of sVLMs, enabling spatially grounded hazard assessment. Within this framework, six sVLMs (Gemma-3 4B, Qwen-3-VL 2B/4B, InternVL-3 1B/2B, and SmolVLM-2B) were evaluated in zero-shot settings on a curated dataset of construction site images with hazard annotations and explanatory rationales. The proposed approach consistently improved hazard detection performance across all models. The best-performing model, Gemma-3 4B, achieved an F1-score of 50.6%, compared to 34.5% in the baseline configuration. Explanation quality also improved significantly, with BERTScore F1 increasing from 0.61 to 0.82. Despite incorporating object detection, the framework introduces minimal overhead, adding only 2.5 ms per image during inference. These results demonstrate that integrating lightweight object detection with small VLM reasoning provides an effective and efficient solution for context-aware construction safety hazard detection.
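
The detection-guided prompting step described in the abstract can be sketched as follows. The abstract does not give the prompt template or the detector output format, so the `Detection` record, the confidence threshold, and the prompt wording are all hypothetical; the detector itself is not called here, and its outputs are supplied as plain data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detector output (hypothetical schema, not the paper's format)."""
    label: str                       # e.g. "worker" or "excavator"
    box: Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
    confidence: float


def build_hazard_prompt(detections: List[Detection],
                        min_confidence: float = 0.5) -> str:
    """Embed detected entities into a structured prompt for a small VLM.

    The threshold and the wording are assumptions for illustration only.
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    if kept:
        entity_lines = [
            f"- {d.label} at box {d.box} (confidence {d.confidence:.2f})"
            for d in kept
        ]
    else:
        entity_lines = ["- no workers or machinery detected"]
    return "\n".join(
        ["Detected entities in the image:"]
        + entity_lines
        + [
            "Using these locations, state whether any worker is exposed to a "
            "hazard and explain why.",
        ]
    )


# Made-up detections standing in for detector output on one image.
prompt = build_hazard_prompt([
    Detection("worker", (120, 200, 180, 380), 0.91),
    Detection("excavator", (150, 100, 520, 400), 0.88),
    Detection("truck", (600, 50, 640, 90), 0.21),  # dropped: below threshold
])
```

Passing entity labels and coordinates as text is one simple way to ground the model spatially without modifying the model itself.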

2604.05195 2026-04-08 cs.LG

Vehicle-as-Prompt: A Unified Deep Reinforcement Learning Framework for Heterogeneous Fleet Vehicle Routing Problem

Shihong Huang, Shengjie Wang, Lei Gao, Hong Ma, Zhanluo Zhang, Feng Zhang, Weihua Zhou

2604.05192 2026-04-08 cs.CL

Faster Superword Tokenization

Craig W. Schmidt, Chris Tanner, Yuval Pinter

2604.05187 2026-04-08 cs.LG cs.SY eess.SY

FNO$^{\angle \theta}$: Extended Fourier neural operator for learning state and optimal control of distributed parameter systems

Zhexian Li, Ketan Savla

Comments 6 pages, 3 figures

2604.05185 2026-04-08 cs.LG cs.SY eess.SY

Cross-fitted Proximal Learning for Model-Based Reinforcement Learning

Nishanth Venkatesh, Andreas A. Malikopoulos

2604.05183 2026-04-08 cs.CV cs.AI cs.LG

OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models

Ali Aliev, Kamil Garifullin, Nikolay Yudin, Vera Soboleva, Alexander Molozhavenko, Ivan Oseledets, Aibek Alanov, Maxim Rakhuba

2604.05182 2026-04-08 cs.CV cs.AI

LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows

Zhengqin Li, Cheng Zhang, Jakob Engel, Zhao Dong

2604.05181 2026-04-08 cs.LG

General Multimodal Protein Design Enables DNA-Encoding of Chemistry

Jarrid Rector-Brooks, Théophile Lambert, Marta Skreta, Daniel Roth, Yueming Long, Zi-Qi Li, Xi Zhang, Miruna Cretu, Francesca-Zhoufan Li, Tanvi Ganapathy, Emily Jin, Avishek Joey Bose, Jason Yang, Kirill Neklyudov, Yoshua Bengio, Alexander Tong, Frances H. Arnold, Cheng-Hao Liu

2604.05180 2026-04-08 cs.CV

MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing

Ziqian Liu, Stephan Alaniz