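arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and more.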

2603.06264 2026-03-24 cs.CL cs.CY

Mind the Gap: Pitfalls of LLM Alignment with Asian Public Opinion

Hari Shankar, Vedanta S P, Sriharini Margapuri, Debjani Mazumder, Ponnurangam Kumaraguru, Abhijnan Chakraborty

Comments 13 pages, including AAAI Paper Checklist. Accepted in Proceedings of the 20th International AAAI Conference on Web and Social Media (ICWSM 2026)

Abstract

Large Language Models (LLMs) are increasingly being deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present a comprehensive, multilingual audit of the cultural alignment of contemporary LLMs including GPT-4o-Mini, Gemini-2.5-Flash, Llama 3.2, Mistral and Gemma 3 across India, East Asia and Southeast Asia. Our study specifically focuses on the sensitive domain of religion as the prism for broader alignment. To facilitate this, we conduct a multi-faceted analysis of every LLM's internal representations, using log-probs/logits, to compare the model's opinion distributions against ground-truth public attitudes. We find that while the popular models generally align with public opinion on broad social issues, they consistently fail to accurately represent religious viewpoints, especially those of minority groups, often amplifying negative stereotypes. Lightweight interventions, such as demographic priming and native language prompting, partially mitigate but do not eliminate these cultural gaps. We further show that downstream evaluations on bias benchmarks (such as CrowS-Pairs, IndiBias, ThaiCLI, KoBBQ) reveal persistent harms and under-representation in sensitive contexts. Our findings underscore the urgent need for systematic, regionally grounded audits to ensure equitable global deployment of LLMs.
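
The comparison this abstract describes, between a model's opinion distribution derived from log-probs and survey ground truth, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not name the divergence measure, so Jensen-Shannon divergence is swapped in, and the log-probabilities and survey shares are made-up numbers, not results from the paper.

```python
import math

def softmax(logits):
    """Turn per-option log-probs/logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# Hypothetical log-probs a model assigns to four answer options of one
# survey question, and hypothetical survey shares for the same options.
option_logprobs = [-0.4, -1.6, -2.3, -3.0]
survey_shares = [0.25, 0.35, 0.25, 0.15]

model_dist = softmax(option_logprobs)
gap = js_divergence(model_dist, survey_shares)
print(f"model distribution: {[round(p, 3) for p in model_dist]}")
print(f"JS divergence from survey: {gap:.3f}")
```

A lower value means the model's answer distribution sits closer to the surveyed population; repeating the calculation per demographic group would expose the group-level gaps the abstract refers to.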

2603.05027 2026-03-24 cs.AI

S5-SHB Agent: Society 5.0 enabled Multi-model Agentic Blockchain Framework for Smart Home

Janani Rangila, Akila Siriweera, Incheon Paik, Keitaro Naruse, Isuru Jayanada, Vishmika Devindi

Comments 15 pages, 15 figures, preprint

2603.04846 2026-03-24 cs.CV

Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models

Yuanbo Li, Tianyang Xu, Cong Hu, Tao Zhou, Xiao-Jun Wu, Josef Kittler

Comments Accepted by CVPR2026

2603.04839 2026-03-24 cs.CV

Towards Highly Transferable Vision-Language Attack via Semantic-Augmented Dynamic Contrastive Interaction

Yuanbo Li, Tianyang Xu, Cong Hu, Tao Zhou, Xiao-Jun Wu, Josef Kittler

Comments Accepted by CVPR2026

2603.04831 2026-03-24 cs.LG

Missingness Bias Calibration in Feature Attribution Explanations

Shailesh Sridhar, Anton Xue, Eric Wong

2603.02960 2026-03-24 cs.AI

Architecting Trust in Artificial Epistemic Agents

Nahema Marchal, Stephanie Chan, Matija Franklin, Manon Revel, Geoff Keeling, Roberta Fischli, Bilva Chandra, Iason Gabriel

2602.23956 2026-03-24 cs.CV

SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls

Qianxun Xu, Chenxi Song, Yujun Cai, Chi Zhang

Comments CVPR 2026

2602.21233 2026-03-24 cs.LG cs.AI

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

Rui Cen, QiangQiang Hu, Hong Huang, Hong Liu, Song Liu, Xin Luo, Lin Niu, Yifan Tan, Decheng Wu, Linchuan Xie, Rubing Yang, Guanghua Yu, Jianchen Zhu

2602.20880 2026-03-24 cs.CV

When Safety Collides: Resolving Multi-Category Harmful Conflicts in Text-to-Image Diffusion via Adaptive Safety Guidance

Yongli Xiang, Ziming Hong, Zhaoqing Wang, Xiangyu Zhao, Bo Han, Tongliang Liu

Comments CVPR 2026; Code is released at https://github.com/tmllab/2026_CVPR_CASG

2602.19961 2026-03-24 cs.CL cs.IR

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval

Yibo Yan, Jiahao Huo, Guanbo Feng, Mingdong Ou, Yi Cao, Xin Zou, Shuliang Liu, Yuanhuiyi Lyu, Yu Huang, Jungang Li, Kening Zheng, Xu Zheng, Philip S. Yu, James Kwok, Xuming Hu

Comments Under review. This version updates the relevant works released before 15 March, 2026

2602.19248 2026-03-24 cs.CV cs.AI

No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection

Zunkai Dai, Ke Li, Jiajia Liu, Jie Yang, Yuanyuan Qiao

Comments Accepted by CVPR 2026

2602.19107 2026-03-24 cs.RO cs.HC

A User-driven Design Framework for Robotaxi

Yue Deng, Changyang He

2602.15510 2026-03-24 cs.LG cs.DC cs.NI

On the Geometric Coherence of Global Aggregation in Federated Graph Neural Networks

Chethana Prasad Kabgere, Shylaja SS

Comments This is a developing preprint of an 18-page journal manuscript (6 figures), currently being prepared for formal peer-review submission

2602.14408 2026-03-24 cs.CV cs.AI

Feature Recalibration Based Olfactory-Visual Multimodal Model for Enhanced Rice Deterioration Detection

Rongqiang Zhao, Hengrui Hu, Yijing Wang, Mingchun Sun, Jie Liu

2602.12275 2026-03-24 cs.CL

On-Policy Context Distillation for Language Models

Tianzhu Ye, Li Dong, Xun Wu, Shaohan Huang, Furu Wei

2602.11549 2026-03-24 cs.LG cs.AI

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Yuanfu Wang, Zhixuan Liu, Xiangtian Li, Chaochao Lu, Chao Yang

Comments Accepted at ICLR 2026. Code available at https://github.com/sharkwyf/native-reasoning-models

Abstract

The prevailing paradigm for training large reasoning models--combining Supervised Fine-Tuning (SFT) with Reinforcement Learning with Verifiable Rewards (RLVR)--is fundamentally constrained by its reliance on high-quality, human-annotated reasoning data and external verifiers. This dependency incurs significant data-collection costs, risks embedding human cognitive biases, and confines the reinforcement learning stage to objectively assessable domains like mathematics and coding, leaving a wide range of unverifiable tasks beyond its scope. To overcome these limitations, we introduce NRT (Native Reasoning Training), a novel framework that cultivates complex reasoning by having the model generate its own reasoning traces using only standard question-answer pairs, thereby obviating the need for expert-written demonstrations. NRT reframes the training problem by treating the reasoning process as a latent variable. It employs a unified training objective that models reasoning as an optimization problem, intrinsically rewarding paths that increase the model's likelihood of producing the ground-truth answer. This unified perspective allows us to analyze intrinsic failure modes of prior methods, such as policy collapse, and systematically design more robust reward aggregation functions, creating a self-reinforcing feedback loop where the model learns to think in ways that resolve its own uncertainty. Empirical evaluation on Llama and Mistral model families demonstrates that NRT achieves state-of-the-art performance among verifier-free methods, significantly outperforming standard SFT baselines and prior verifier-free RL methods. Our approach yields particularly strong performance gains in complex reasoning domains and exhibits high robustness to policy collapse, offering a general, scalable path toward building more powerful and broadly applicable reasoning systems.
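
The idea of intrinsically rewarding reasoning paths that raise the likelihood of the ground-truth answer can be sketched as follows. This is an assumed simplification, not the paper's objective: the reward is taken to be the gain in answer log-likelihood over a no-reasoning baseline, the function names are hypothetical, and the per-token probabilities are invented. The abstract's reward aggregation functions are not specified and are not reproduced here.

```python
import math

def answer_logprob(token_probs):
    """Log-likelihood of the ground-truth answer from per-token probabilities."""
    return sum(math.log(p) for p in token_probs)

def trace_reward(probs_with_trace, probs_without_trace):
    """Assumed reward: how much a reasoning trace raises answer log-likelihood."""
    return answer_logprob(probs_with_trace) - answer_logprob(probs_without_trace)

# Hypothetical probabilities the model assigns to each ground-truth answer
# token: without any reasoning, and after two sampled reasoning traces.
baseline = [0.20, 0.50, 0.40]
traces = {
    "trace_a": [0.60, 0.80, 0.70],  # makes the answer more likely
    "trace_b": [0.10, 0.50, 0.30],  # makes the answer less likely
}

rewards = {name: trace_reward(p, baseline) for name, p in traces.items()}
best = max(rewards, key=rewards.get)
for name, r in rewards.items():
    print(f"{name}: reward = {r:+.3f}")
print(f"trace to reinforce: {best}")
```

Under this assumption only question-answer pairs are needed: the model's own likelihood of the known answer supplies the training signal, with no external verifier.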


2602.05549 2026-03-24 cs.LG

Logical Guidance for the Exact Composition of Diffusion Models

Francesco Alesiani, Jonathan Warrell, Tanja Bien, Henrik Christiansen, Matheus Ferraz, Mathias Niepert

2602.05146 2026-03-24 cs.LG cs.AI

Cross-talk based multi-task learning for fault classification of machine system influenced by multiple variables

Wonjun Yi, Rismaya Kumar Mishra, Yong-Hwa Park

Comments Submitted to 32nd International Congress on Sound and Vibration (ICSV32)

2602.01834 2026-03-24 cs.RO

Concept-Based Dictionary Learning for Inference-Time Safety in Vision Language Action Models

Siqi Wen, Shu Yang, Shaopeng Fu, Jingfeng Zhang, Lijie Hu, Di Wang

2602.01082 2026-03-24 cs.AI

EvoOpt-LLM: Evolving industrial optimization models with large language models

Yiliu He, Tianle Li, Binghao Ji, Zhiyuan Liu, Di Huang

2602.00319 2026-03-24 cs.CL cs.AI cs.LG cs.SI

Detecting AI-Generated Content in Academic Peer Reviews

Siyuan Shen, Kai Wang

2601.20193 2026-03-24 cs.LG cs.AI

Meta-Cognitive Reinforcement Learning with Self-Doubt and Recovery

Zhipeng Zhang, Xiongfei Su, Kai Li

2601.17657 2026-03-24 cs.CV

SPACE-CLIP: Spatial Perception via Adaptive CLIP Embeddings for Monocular Depth Estimation

Taewan Cho, Taeryang Kim, Andrew Jaeyong Choi

2601.16296 2026-03-24 cs.CV cs.AI cs.LG

Memory-V2V: Memory-Augmented Video-to-Video Diffusion for Consistent Multi-Turn Editing

Dohun Lee, Chun-Hao Paul Huang, Xuelin Chen, Jong Chul Ye, Duygu Ceylan, Hyeonho Jeong

Comments Project page: https://dohunlee1.github.io/MemoryV2V

2601.12494 2026-03-24 cs.SD cs.AI cs.CL eess.AS

Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

Hunzalah Hassan Bhatti, Firoj Alam, Shammur Absar Chowdhury

Comments Foundation Models, Large Language Models, Native, Speech Models, Arabic

2601.09444 2026-03-24 cs.RO

Data Scaling for Navigation in Unknown Environments

Lauri Suomela, Naoki Takahata, Sasanka Kuruppu Arachchige, Harry Edelman, Joni-Kristian Kämäräinen

Comments Robotics and Automation Letters (RA-L) 2026

2601.07148 2026-03-24 cs.CL cs.AI

Measuring Iterative Temporal Reasoning with Time Puzzles

Zhengxiang Wang, Zeyu Dong

Comments 11 pages, 4 tables, 3 figures

2601.05911 2026-03-24 cs.CL

Pantagruel: Unified Self-Supervised Encoders for French Text and Speech

Phuong-Hang Le, Valentin Pelloin, Arnault Chatelain, Maryem Bouziane, Mohammed Ghennai, Qianwen Guan, Kirill Milintsevich, Salima Mdhaffar, Aidan Mannion, Nils Defauw, Shuyue Gu, Alexandre Audibert, Marco Dinarelli, Yannick Estève, Lorraine Goeuriot, Steffen Lalande, Nicolas Hervé, Maximin Coavoux, François Portet, Étienne Ollion, Marie Candito, Maxime Peyrard, Solange Rossato, Benjamin Lecouteux, Aurélie Nardy, Gilles Sérasset, Vincent Segonne, Solène Evain, Diandra Fabre, Didier Schwab

Comments Accepted to LREC 2026

2601.05848 2026-03-24 cs.CV cs.AI cs.RO

Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals

Nate Gillman, Yinghua Zhou, Zitian Tang, Evan Luo, Arjan Chakravarthy, Daksh Aggarwal, Michael Freeman, Charles Herrmann, Chen Sun

Comments Camera ready version (CVPR 2026). Code and interactive demos at https://goal-force.github.io/

2601.04453 2026-03-24 cs.CV

UniDrive-WM: Unified Understanding, Planning and Generation World Model For Autonomous Driving

Zhexiao Xiong, Xin Ye, Burhan Yaman, Sheng Cheng, Yiren Lu, Jingru Luo, Nathan Jacobs, Liu Ren

Comments Project Page: https://unidrive-wm.github.io/UniDrive-WM