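arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.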

2604.15574 2026-04-20 cs.CL cs.AI cs.LG cs.NE

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Guy Kaplan, Zorik Gekhman, Zhen Zhu, Lotem Rozner, Yuval Reif, Swabha Swayamdipta, Derek Hoiem, Roy Schwartz

Abstract

Large language models are prone to hallucinating factually incorrect statements. A key source of these errors is exposure to new factual information through supervised fine-tuning (SFT), which can increase hallucinations w.r.t. knowledge acquired during pre-training. In this work, we explore whether SFT-induced hallucinations can be mitigated using established tools from the continual learning literature, since they arise as a by-product of knowledge degradation during training. We propose a self-distillation-based SFT method that facilitates effective factual learning while minimizing hallucinations w.r.t. pre-existing knowledge by regularizing output-distribution drift. We also show that, in settings where new knowledge acquisition is unnecessary, suppressing factual plasticity by freezing parameter groups can preserve task performance while reducing hallucinations. Lastly, we investigate the mechanism behind SFT-induced hallucinations through three hypotheses: capacity limitations, behavior cloning, and localized interference. Our experiments show that a main driver is interference among overlapping semantic representations, and that self-distillation succeeds by mitigating this interference.
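The abstract above describes regularizing output-distribution drift during SFT. A minimal sketch of that general idea follows, assuming a loss of the form "cross-entropy on the new target plus a weighted KL term against a frozen copy of the model". The function names, the `drift_weight` parameter, and the direction of the KL term are illustrative assumptions, not the authors' stated objective.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target token under the logits."""
    return -math.log(softmax(logits)[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regularized_sft_loss(student_logits, teacher_logits, target_index,
                         drift_weight=1.0):
    """Hypothetical objective: fit the new target while penalizing drift
    of the fine-tuned (student) distribution away from the frozen
    pre-fine-tuning (teacher) distribution."""
    task_loss = cross_entropy(student_logits, target_index)
    drift = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    return task_loss + drift_weight * drift


# When the student has not moved, the penalty vanishes and only the
# task loss remains; any drift adds a strictly positive term.
unchanged = regularized_sft_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 2)
drifted = regularized_sft_loss([3.0, 0.0, 0.0], [0.0, 3.0, 0.0], 0)
```

In this sketch a larger `drift_weight` trades plasticity for stability; how the paper balances the two is not specified in the abstract.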

2604.15569 2026-04-20 cs.RO

ShapeGen: Robotic Data Generation for Category-Level Manipulation

Yirui Wang, Xiuwei Xu, Angyuan Ma, Bingyao Yu, Jie Zhou, Jiwen Lu

Comments 15 pages, 11 figures. Under review

2604.15559 2026-04-20 cs.AI

Subliminal Transfer of Unsafe Behaviors in AI Agent Distillation

Jacob Dang, Brian Y. Xie, Omar G. Younis

2604.15558 2026-04-20 cs.AI cs.CL cs.LO cs.MA

Preregistered Belief Revision Contracts

Saad Alqithami

Abstract

Deliberative multi-agent systems allow agents to exchange messages and revise beliefs over time. While this interaction is meant to improve performance, it can also create dangerous conformity effects: agreement, confidence, prestige, or majority size may be treated as if they were evidence, producing high-confidence convergence to false conclusions. To address this, we introduce PBRC (Preregistered Belief Revision Contracts), a protocol-level mechanism that strictly separates open communication from admissible epistemic change. A PBRC contract publicly fixes first-order evidence triggers, admissible revision operators, a priority rule, and a fallback policy. A non-fallback step is accepted only when it cites a preregistered trigger and provides a nonempty witness set of externally validated evidence tokens. This ensures that every substantive belief change is both enforceable by a router and auditable after the fact. In this paper, (a) we prove that under evidential contracts with conservative fallback, social-only rounds cannot increase confidence and cannot generate purely conformity-driven wrong-but-sure cascades. (b) We show that auditable trigger protocols admit evidential PBRC normal forms that preserve belief trajectories and canonicalized audit traces. (c) We demonstrate that sound enforcement yields epistemic accountability: any change of top hypothesis is attributable to a concrete validated witness set. For token-invariant contracts, (d) we prove that enforced trajectories depend only on token-exposure traces; under flooding dissemination, these traces are characterized exactly by truncated reachability, giving tight diameter bounds for universal evidence closure. Finally, we introduce a companion contractual dynamic doxastic logic to specify trace invariants, and provide simulations illustrating cascade suppression, auditability, and robustness-liveness trade-offs.
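The acceptance rule stated in the abstract (a non-fallback step must cite a preregistered trigger and supply a nonempty witness set of externally validated evidence tokens) can be sketched as a small router check. All class and function names here are hypothetical, and treating fallback steps as always admissible is an assumption made for illustration; the abstract does not spell out the fallback policy.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Contract:
    """Hypothetical contract: the publicly fixed set of evidence triggers."""
    triggers: FrozenSet[str]


@dataclass(frozen=True)
class RevisionStep:
    """Hypothetical belief-revision step proposed by an agent."""
    trigger: Optional[str]      # the trigger the step cites, if any
    witnesses: FrozenSet[str]   # evidence tokens offered in support
    is_fallback: bool = False   # True if the step is the fallback policy


def router_accepts(contract, step, validated_tokens):
    """Return True if the step passes the acceptance rule sketched above."""
    if step.is_fallback:
        return True  # assumption: the fallback policy is always admissible
    if step.trigger not in contract.triggers:
        return False  # the cited trigger was never preregistered
    if not step.witnesses:
        return False  # the witness set must be nonempty
    # Every witness must be an externally validated evidence token.
    return step.witnesses <= validated_tokens


contract = Contract(triggers=frozenset({"new_measurement"}))
validated = frozenset({"tok1", "tok2"})
evidential = RevisionStep("new_measurement", frozenset({"tok1"}))
social_only = RevisionStep("majority_vote", frozenset({"tok1"}))
```

Under this check a step justified only by agreement or majority size, such as `social_only`, is rejected because its trigger is not in the contract.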

2604.15557 2026-04-20 cs.LG cs.CL

Predicting Where Steering Vectors Succeed

Jayadev Billa

Comments 19 pages, incl. 10 appendix pages, 4 figures, 20 tables

2604.15556 2026-04-20 cs.LG cs.CV

Learning Affine-Equivariant Proximal Operators

Oriel Savir, Zhenghan Fang, Jeremias Sulam

Comments 9 pages, 4 figures, Accepted at ICASSP 2026

2604.15555 2026-04-20 cs.CV

CXR-LT 2026 Challenge: Multi-Center Long-Tailed and Zero Shot Chest X-ray Classification

Hexin Dong, Yi Lin, Pengyu Zhou, Fengnian Zhao, Alan Clint Legasto, Juno Cho, Dohui Kim, Justin Namuk Kim, Mingeon Kim, Sunwoo Kwak, Gabriel Moyà-Alcover, Ky Trung Nguyen, Thanh-Huy Nguyen, Ha-Hieu Pham, Huy-Hieu Pham, Huy Le Pham, Nikhileswara Rao Sulake, Aina Tur-Serrano, Ruichi Zhang, Ang Zu, Adam E. Flanders, Zhiyong Lu, Ronald M. Summers, Mingquan Lin, Hao Chen, Yuzhe Yang, George Shih, Yifan Peng

Comments 25 pages, 6 figures

2604.15554 2026-04-20 cs.LG cs.AI cs.NA math.NA math.OC

Natural gradient descent with momentum

Anthony Nouy, Agustín Somacal

2604.15549 2026-04-20 cs.LG cs.DC math.OC

Optimizing Stochastic Gradient Push under Broadcast Communications

Tuan Nguyen, Ting He

2604.15547 2026-04-20 cs.CL cs.AI

Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS)

Sharookh Daruwalla, Nitin Mayande, Shreeya Verma Kathuria, Nitin Joglekar, Charles Weber

Comments 27 pages, 2 figures. arXiv admin note: text overlap with arXiv:2604.12049

2604.15542 2026-04-20 cs.CV cs.LG

UA-Net: Uncertainty-Aware Network for TRISO Image Semantic Segmentation

Kyle Lucke, Zuzanna Krajewska-Travar, Shoukun Sun, Lu Cai, John D. Stempien, Min Xian

2604.15521 2026-04-20 cs.CV

Frequency-Aware Flow Matching for High-Quality Image Generation

Sucheng Ren, Qihang Yu, Ju He, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

Comments Accepted by CVPR 2026

2604.15514 2026-04-20 cs.AI cs.CY cs.HC

Bureaucratic Silences: What the Canadian AI Register Reveals, Omits, and Obscures

Dipto Das, Christelle Tessono, Syed Ishtiaque Ahmed, Shion Guha

Comments Accepted at FAccT 2026

2604.15507 2026-04-20 cs.RO

Trajectory Planning for Safe Dual Control with Active Exploration

Kaleb Ben Naveed, Manveer Singh, Devansh R. Agrawal, Dimitra Panagou

2604.15505 2026-04-20 cs.CL cs.AI

PolicyBank: Evolving Policy Understanding for LLM Agents

Jihye Choi, Jinsung Yoon, Long T. Le, Somesh Jha, Tomas Pfister

2604.15503 2026-04-20 cs.CL

Brain Score Tracks Shared Properties of Languages: Evidence from Many Natural Languages and Structured Sequences

Jingnong Qu, Ashvin Ranjan, Shane Steinert-Threlkeld

2604.15495 2026-04-20 cs.AI cs.CV cs.HC cs.RO

GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology

Shivendra Agrawal, Bradley Hayes

2604.15494 2026-04-20 cs.LG cs.CV

ProtoTTA: Prototype-Guided Test-Time Adaptation

Mohammad Mahdi Abootorabi, Parvin Mousavi, Purang Abolmaesumi, Evan Shelhamer

Comments ICLR 2026 Test-Time Updates (TTU) Workshop

2604.15490 2026-04-20 cs.CL

Think Multilingual, Not Harder: A Data-Efficient Framework for Teaching Reasoning Models to Code-Switch

Eleanor M. Lin, David Jurgens

2604.15488 2026-04-20 cs.LG cs.AI cs.CL

FineSteer: A Unified Framework for Fine-Grained Inference-Time Steering in Large Language Models

Zixuan Weng, Jinghuai Zhang, Kunlin Cai, Ying Li, Peiran Wang, Yuan Tian

Comments Accepted by ACL 2026 (Main)

2604.15482 2026-04-20 cs.LG cs.AI

Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation

Yisheng Zhong, Sijia Liu, Zhuangdi Zhu

2604.15475 2026-04-20 cs.RO cs.MA

NeuroMesh: A Unified Neural Inference Framework for Decentralized Multi-Robot Collaboration

Yang Zhou, Yash Shetye, Long Quang, Devon Super, Jesse Milzman, Manohari Goarin, Aditya Azad, Devang Sunil Dhake, Jeffery Mao, Carlos Nieto-Granda, Giuseppe Loianno

Comments 8 pages, 8 figures. Accepted at IEEE Robotics and Automation Letters (RA-L)

2604.15461 2026-04-20 cs.LG cs.CL cs.CR

Evaluating LLM Simulators as Differentially Private Data Generators

Nassima M. Bouzid, Dehao Yuan, Nam H. Nguyen, Mayana Pereira

Comments Submitted to ICLR 2026. 6 pages + appendix

2604.15456 2026-04-20 cs.AI

DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

Zhizheng Wang, Chih-Hsuan Wei, Joey Chan, Robert Leaman, Chi-Ping Day, Chuan Wu, Mark A Knepper, Antolin Serrano Farias, Jordina Rincon-Torroella, Hasan Slika, Betty Tyler, Ryan Huu-Tuan Nguyen, Asmita Indurkar, Mélanie Hébert, Shubo Tian, Lauren He, Noor Naffakh, Aseem Aseem, Nicholas Wan, Emily Y Chew, Tiarnan D L Keenan, Zhiyong Lu

Comments 37 pages, 6 figures, 5 tables

2604.15455 2026-04-20 cs.RO

One-Shot Cross-Geometry Skill Transfer through Part Decomposition

Skye Thompson, Ondrej Biza, George Konidaris

Comments ICRA 2026

2604.15453 2026-04-20 cs.CV cs.AI cs.LG

(1D) Ordered Tokens Enable Efficient Test-Time Search

Zhitong Gao, Parham Rezaei, Ali Cy, Mingqiao Ye, Nataša Jovanović, Jesse Allardice, Afshin Dehghan, Amir Zamir, Roman Bachmann, Oğuzhan Fatih Kar

Comments Project page: https://soto.epfl.ch/

2604.15449 2026-04-20 cs.RO

Iterated Invariant EKF for Quadruped Robot Odometry

Hilton Marques Souza Santana, João Carlos Virgolino Soares, Sven Goffin, Ylenia Nisticò, Silvère Bonnabel, Claudio Semini, Marco Antonio Meggiolaro

2604.15448 2026-04-20 cs.LG cs.AI cs.LO

Transfer Learning from Foundational Optimization Embeddings to Unsupervised SAT Representations

Koyena Pal, Serdar Kadioglu

2604.15416 2026-04-20 cs.LG cs.AI math.OC

StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models

Dingzhi Yu, Rui Pan, Yuxing Liu, Tong Zhang

Abstract

Sign-based optimization algorithms, such as SignSGD, have garnered significant attention for their remarkable performance in distributed learning and training large foundation models. Despite their empirical superiority, SignSGD is known to diverge on non-smooth objectives, which are ubiquitous in modern machine learning due to ReLUs, max-pools, and mixture-of-experts. To overcome this fundamental limitation, we propose StoSignSGD, an algorithm that injects structural stochasticity into the sign operator while maintaining an unbiased update step. In the regime of (online) convex optimization, our theoretical analysis shows that StoSignSGD rigorously resolves the non-convergence issues of SignSGD, achieving a sharp convergence rate matching the lower bound. For the more challenging non-convex non-smooth optimization, we introduce generalized stationary measures that encompass prior definitions, proving that StoSignSGD improves upon the best-known complexity bounds by dimensional factors. Empirically, StoSignSGD exhibits robust stability and superior efficiency across diverse large language model (LLM) training regimes. Notably, in low-precision FP8 pretraining, a setting where AdamW fails catastrophically, StoSignSGD remains highly stable and yields a remarkable 1.44× to 2.14× speedup relative to established baselines. Furthermore, when fine-tuning 7B LLMs on mathematical reasoning tasks, StoSignSGD delivers substantial performance gains over both AdamW and SignSGD. Finally, to dissect the mechanisms driving its success, we develop a sign conversion framework capable of transforming any general optimizer into its unbiased, sign-based counterpart. Utilizing this framework, we deconstruct the core components of StoSignSGD and present a comprehensive ablation study to empirically validate our algorithmic design choices.
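To illustrate what an unbiased randomized sign operator can look like, here is a minimal sketch of a well-known generic construction: for a gradient coordinate g with |g| ≤ B, emit +B with probability (1 + g/B)/2 and −B otherwise, so the expected output equals g. This is not claimed to be the operator used in StoSignSGD, whose details the abstract does not give; the names `stochastic_sign` and `bound` are assumptions made for this example.

```python
import random


def stochastic_sign(g, bound, rng):
    """Return +bound or -bound such that the expectation equals g.

    Generic construction for illustration only. Unbiasedness holds when
    |g| <= bound; values outside that range are clipped first, which
    makes the output biased for those inputs.
    """
    if bound <= 0:
        raise ValueError("bound must be positive")
    g = max(-bound, min(bound, g))
    p_plus = 0.5 * (1.0 + g / bound)  # probability of emitting +bound
    return bound if rng.random() < p_plus else -bound


# Averaging many draws recovers the underlying value in expectation,
# while every individual draw carries only a sign (scaled by bound).
rng = random.Random(0)
samples = [stochastic_sign(0.3, 1.0, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
```

By contrast, the deterministic sign of 0.3 is always +1, whose mean is 1 rather than 0.3; the randomization is what removes that bias.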

2604.15411 2026-04-20 cs.LG cs.AI physics.data-an

PRL-Bench: A Comprehensive Benchmark Evaluating LLMs' Capabilities in Frontier Physics Research

Tingjia Miao, Wenkai Jin, Muhua Zhang, Jinxin Tan, Yuelin Hu, Tu Guo, Jiejun Zhang, Yuhan Wang, Wenbo Li, Yinuo Gao, Shuo Chen, Weiqi Jiang, Yayun Hu, Zixing Lei, Xianghe Pang, Zexi Liu, Yuzhi Zhang, Linfeng Zhang, Kun Chen, Wei Wang, Weinan E, Siheng Chen

Comments 15 pages, 5 figures