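arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.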

2603.22829 2026-03-25 cs.AI

Improving Safety Alignment via Balanced Direct Preference Optimization

Shiji Zhao, Mengyang Wang, Shukun Xiong, Fangzhou Chen, Qihui Zhu, Shouwei Ruan, Yisong Xiao, Ranjie Duan, Xun Chen, XingXing Wei

Abstract

With the rapid development and widespread application of Large Language Models (LLMs), their potential safety risks have attracted widespread attention. Reinforcement Learning from Human Feedback (RLHF) has been adopted to enhance the safety performance of LLMs. As a simple and effective alternative to RLHF, Direct Preference Optimization (DPO) is widely used for safety alignment. However, safety alignment still suffers from severe overfitting, which limits its actual performance. This paper revisits the overfitting phenomenon from the perspective of the model's comprehension of the training data. We find that the Imbalanced Preference Comprehension phenomenon exists between responses in preference pairs, which compromises the model's safety performance. To address this, we propose Balanced Direct Preference Optimization (B-DPO), which adaptively modulates optimization strength between preferred and dispreferred responses based on mutual information. A series of experimental results show that B-DPO can enhance the safety capability while maintaining the competitive general capabilities of LLMs on various mainstream benchmarks compared to state-of-the-art methods. Warning: This paper contains examples of harmful texts, and reader discretion is recommended.

2603.22826 2026-03-25 cs.CV

MVRD-Bench: Multi-View Learning and Benchmarking for Dynamic Remote Photoplethysmography under Occlusion

Zuxian He, Xu Cheng, Zhaodong Sun, Haoyu Chen, Jingang Shi, Xiaobai Li, Guoying Zhao

2603.22824 2026-03-25 cs.LG math.OC stat.ML

Towards The Implicit Bias on Multiclass Separable Data Under Norm Constraints

Shengping Xie, Zekun Wu, Quan Chen, Kaixu Tang

2603.22821 2026-03-25 cs.CV

Cross-Slice Knowledge Transfer via Masked Multi-Modal Heterogeneous Graph Contrastive Learning for Spatial Gene Expression Inference

Zhiceng Shi, Changmiao Wang, Jun Wan, Wenwen Min

Comments Accepted by CVPR-2026

2603.22820 2026-03-25 cs.CL

RadTimeline: Timeline Summarization for Longitudinal Radiological Lung Findings

Sitong Zhou, Meliha Yetisgen, Mari Ostendorf

Comments Accepted at Language Resources and Evaluation Conference (LREC) 2026

2603.22819 2026-03-25 cs.CV cs.AI

TDATR: Improving End-to-End Table Recognition via Table Detail-Aware Learning and Cell-Level Visual Alignment

Chunxia Qin, Chenyu Liu, Pengcheng Xia, Jun Du, Baocai Yin, Bing Yin, Cong Liu

Comments Accepted by CVPR 2026. Project Page: https://github.com/Chunchunwumu/TDATR.git

2603.22815 2026-03-25 cs.CV cs.AI

Focus, Don't Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding

Mincheol Kwon, Minseung Lee, Seonga Choi, Miso Choi, Kyeong-Jin Oh, Hyunyoung Lee, Cheonyoung Park, Yongho Song, Seunghyun Park, Jinkyu Kim

Comments CVPR 2026

2603.22813 2026-03-25 cs.AI

Learning What Matters Now: Dynamic Preference Inference under Contextual Shifts

Xianwei Cao, Dou Quan, Zhenliang Zhang, Shuang Wang

Comments 10 pages, ICLR 2026 poster paper

2603.22812 2026-03-25 cs.CL

Efficient Hallucination Detection: Adaptive Bayesian Estimation of Semantic Entropy with Guided Semantic Exploration

Qiyao Sun, Xingming Li, Xixiang He, Ao Cheng, Xuanyu Ji, Hailun Lu, Runke Huang, Qingyong Hu

Comments Accepted to AAAI 2026 (Oral Presentation, <5% acceptance rate). Project page: https://qingyonghu.github.io/Efficient-Hallucination-Detection/

2603.22810 2026-03-25 cs.LG

Universal and efficient graph neural networks with dynamic attention for machine learning interatomic potentials

Shuyu Bi, Zhede Zhao, Qiangchao Sun, Tao Hu, Xionggang Lu, Hongwei Cheng

Comments 10 pages, 6 figures, 6 tables

2603.22801 2026-03-25 cs.LG

Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models

Chenyang Zhang, Qingyue Zhao, Quanquan Gu, Yuan Cao

Comments 64 pages, 9 figures

2603.22800 2026-03-25 cs.RO

CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot Navigation

Aditya Potnis, Francisco Affonso, Shreya Gummadi, Naveen Kumar Uppalapati, Girish Chowdhary

Comments 8 pages, 6 figures

2603.22799 2026-03-25 cs.CL

Span Modeling for Idiomaticity and Figurative Language Detection with Span Contrastive Loss

Blake Matheny, Phuong Minh Nguyen, Minh Le Nguyen

2603.22796 2026-03-25 cs.CV cs.AI cs.RO

PhotoAgent: A Robotic Photographer with Spatial and Aesthetic Understanding

Lirong Che, Zhenfeng Gan, Yanbo Chen, Junbo Tan, Xueqian Wang

Comments Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2026

2603.22794 2026-03-25 cs.CV

It Takes Two: A Duet of Periodicity and Directionality for Burst Flicker Removal

Lishen Qu, Shihao Zhou, Jie Liang, Hui Zeng, Lei Zhang, Jufeng Yang

Comments Accepted by CVPR 2026

2603.22791 2026-03-25 cs.AI

ABSTRAL: Automatic Design of Multi-Agent Systems Through Iterative Refinement and Topology Optimization

Weijia Song, Jiashu Yue, Zhe Pang

2603.22786 2026-03-25 cs.CV

Predictive Photometric Uncertainty in Gaussian Splatting for Novel View Synthesis

Chamuditha Jayanga Galappaththige, Thomas Gottwald, Peter Stehr, Edgar Heinert, Niko Suenderhauf, Dimity Miller, Matthias Rottmann

Comments Project Page: https://chumsy0725.github.io/GS-U/

2603.22785 2026-03-25 cs.CV cs.AI cs.LG

Exposure-Normalized Bed and Chair Fall Rates via Continuous AI Monitoring

Paolo Gabriel, Peter Rehani, Zack Drumm, Tyler Troy, Tiffany Wyatt, Narinder Singh

Comments 23 pages, 6 figures

2603.22784 2026-03-25 cs.LG

Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models

Amir Azarmehr, Soheil Behnezhad, Alma Ghafari

2603.22782 2026-03-25 cs.CV

Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models

Wenyue Chen, Wenjue Chen, Peng Li, Qinghe Wang, Xu Jia, Heliang Zheng, Rongfei Jia, Yuan Liu, Ronggang Wang

Comments page: https://xishuxishu.github.io/Know3D.github.io/

2603.22781 2026-03-25 cs.CV

Typography-Based Monocular Distance Estimation Framework for Vehicle Safety Systems

Manognya Lokesh Reddy, Zheng Liu

Comments 25 pages, 11 figures

2603.22777 2026-03-25 cs.AI

AgriPestDatabase-v1.0: A Structured Insect Dataset for Training Agricultural Large Language Model

Yagizhan Bilal Durak, Ahsan Ul Islam, Shahidul Islam, Ashley Morgan-Olvera, Iftekhar Ibne Basith, Syed Hasib Akhter Faruqui

Comments Accepted in Artificial Super Intelligence Conference 2026 (Sponsored by KSU PLOT & IEEE CIS)

Abstract

Agricultural pest management increasingly relies on timely and accurate access to expert knowledge, yet high-quality labeled data and continuous expert support remain limited, particularly for farmers operating in rural regions with unstable/no internet connectivity. At the same time, the rapid growth of AI and LLMs has created new opportunities to deliver practical decision support tools directly to end users in agriculture through compact and deployable systems. This work addresses (i) generating a structured insect information dataset, and (ii) adapting a lightweight LLM (≤ 7B) by fine-tuning it for edge-device use in agricultural pest management. The textual data collection was done by reviewing and collecting information from available pest databases and published manuscripts on nine selected pest species. These structured reports were then reviewed and validated by a domain expert. From these reports, we constructed Q/A pairs to support model training and evaluation. A LoRA-based fine-tuning approach was applied to multiple lightweight LLMs and evaluated. Initial evaluation shows that Mistral 7B achieves an 88.9% pass rate on the domain-specific Q/A task, substantially outperforming Qwen 2.5 7B (63.9%) and LLaMA 3.1 8B (58.7%). Notably, Mistral demonstrates higher semantic alignment (embedding similarity: 0.865) despite lower lexical overlap (BLEU: 0.097), indicating that semantic understanding and robust reasoning are more predictive of task success than surface-level conformity in specialized domains. By combining expert-organized data, well-structured Q/A pairs, semantic quality control, and efficient model adaptation, this work contributes towards providing support for farmer-facing agricultural decision support tools and demonstrates the feasibility of deploying compact, high-performing language models for practical field-level pest management guidance.

2603.22770 2026-03-25 cs.LG cs.AI

From Arithmetic to Logic: The Resilience of Logic and Lookup-Based Neural Networks Under Parameter Bit-Flips

Alan T. L. Bacellar, Sathvik Chemudupati, Shashank Nag, Allison Seigler, Priscila M. V. Lima, Felipe M. G. França, Lizy K. John

2603.22767 2026-03-25 cs.AI cs.CL

Can LLM Agents Generate Real-World Evidence? Evaluating Observational Studies in Medical Databases

Dubai Li, Yuxiang He, Yan Hu, Yu Tian, Jingsong Li

2603.22765 2026-03-25 cs.CL cs.AI cs.IR

DALDALL: Data Augmentation for Lexical and Semantic Diverse in Legal Domain by leveraging LLM-Persona

Janghyeok Choi, Jaewon Lee, Sungzoon Cho

2603.22763 2026-03-25 cs.CV

ENC-Bench: A Benchmark for Evaluating Multimodal Large Language Models in Electronic Navigational Chart Understanding

Ao Cheng, Xingming Li, Xuanyu Ji, Xixiang He, Qiyao Sun, Chunping Qiu, Runke Huang, Qingyong Hu

Comments Accepted to CVPR 2026, Project page: https://qingyonghu.github.io/ENC-Bench/

2603.22760 2026-03-25 cs.RO

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation

Ruisen Tu, Arth Shukla, Sohyun Yoo, Xuanlin Li, Junxi Li, Jianwen Xie, Hao Su, Zhuowen Tu

2603.22758 2026-03-25 cs.CV cs.LG

Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning

WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

Comments CVPR 2026 paper. Our code is available at github.com/wjun0830/SlotCurri

2603.22757 2026-03-25 cs.CV

Multimodal Industrial Anomaly Detection via Geometric Prior

Min Li, Jinghui He, Gang Li, Jiachen Li, Jin Wan, Delong Han

Comments Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

2603.22756 2026-03-25 cs.CV

MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding

Purui Bai, Tao Wu, Jiayang Sun, Xinyue Liu, Huaibo Huang, Ran He

Comments 15 pages, 7 figures, accepted by IJCNN 2026, code and dataset available at https://github.com/MVPBench/MVPBench