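arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.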

2410.10862 2026-03-16 cs.CL cs.AI cs.CR cs.CY cs.LG

Superficial Safety Alignment Hypothesis

Jianwei Li, Jung-Eun Kim

Comments ICLR 2026

Abstract

As large language models (LLMs) are increasingly integrated into various applications, ensuring they generate safe responses is a pressing need. Previous studies on alignment have largely focused on general instruction-following but have often overlooked the distinct properties of safety alignment, such as the brittleness of safety mechanisms. To bridge the gap, we propose the Superficial Safety Alignment Hypothesis (SSAH), which posits that safety alignment teaches an otherwise unsafe model to choose the correct reasoning direction (fulfill or refuse users' requests), interpreted as an implicit binary classification task. Through SSAH, we hypothesize that only a few essential components can establish safety guardrails in LLMs. We successfully identify four types of attribute-critical components: Safety Critical Unit (SCU), Utility Critical Unit (UCU), Complex Unit (CU), and Redundant Unit (RU). Our findings show that freezing certain safety-critical components during fine-tuning allows the model to retain its safety attributes while adapting to new tasks. Similarly, we show that leveraging redundant units in the pre-trained model as an "alignment budget" can effectively minimize the alignment tax while achieving the alignment goal. All considered, this paper concludes that the atomic functional unit for safety in LLMs is at the neuron level and underscores that safety alignment should not be complicated. Code and other information are available on the project website: https://ssa-h.github.io/.
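
The abstract does not say how safety-critical units are identified or how freezing is implemented. As a rough illustration of the general idea only, the sketch below applies one gradient step to a toy weight matrix while leaving selected rows untouched. Treating each row as a "unit", the function name `sgd_step_with_frozen_units`, and the choice of frozen indices are all assumptions made for this example, not the authors' method.

```python
from typing import List, Set


def sgd_step_with_frozen_units(
    weights: List[List[float]],
    grads: List[List[float]],
    frozen_units: Set[int],
    lr: float = 0.1,
) -> List[List[float]]:
    """Apply one plain SGD step, skipping rows whose index is in frozen_units.

    Each row stands in for one unit (hypothetical simplification).
    """
    updated = []
    for unit, (w_row, g_row) in enumerate(zip(weights, grads)):
        if unit in frozen_units:
            # Frozen unit: copy the weights through unchanged.
            updated.append(list(w_row))
        else:
            # Trainable unit: ordinary gradient descent update.
            updated.append([w - lr * g for w, g in zip(w_row, g_row)])
    return updated


# Toy values: three units, and unit 1 is assumed to be safety-critical.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
new_weights = sgd_step_with_frozen_units(weights, grads, frozen_units={1}, lr=0.5)
print(new_weights)
```

In this toy run only units 0 and 2 move; unit 1 keeps its original weights regardless of its gradient.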

2409.03658 2026-03-16 cs.LG math-ph math.MP

A DNN Biophysics Model with Topological and Electrostatic Features

Elyssa Sliheet, Md Abu Talha, Weihua Geng

2409.03424 2026-03-16 cs.CV

Weight Conditioning for Smooth Optimization of Neural Networks

Hemanth Saratchandran, Thomas X. Wang, Simon Lucey

Comments ECCV 2024

2409.02108 2026-03-16 cs.CV cs.GR cs.MM

Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era

Xiaowei Hu, Zhenghao Xing, Tianyu Wang, Chi-Wing Fu, Pheng-Ann Heng

Comments Accepted by International Journal of Computer Vision (IJCV). Publicly available results, trained models, and evaluation metrics at https://github.com/xw-hu/Unveiling-Deep-Shadows

2405.05723 2026-03-16 cs.CL cs.AI cs.IR

Computational lexical analysis of Flamenco genres

Pablo Rosillo-Rodes, Maxi San Miguel, David Sanchez

Comments 25 pages, 20 figures

2403.19205 2026-03-16 cs.CV cs.LG

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields

Hemanth Saratchandran, Sameera Ramasinghe, Simon Lucey

Comments CVPR 2024

2402.02017 2026-03-16 cs.LG

Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning

Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung

Comments Accepted to NeurIPS 2024 (reduced file-size version). The project page is available at https://beanie00.com/publications/qcs

2310.11685 2026-03-16 cs.CL cs.LG

Why Softmax Attention Outperforms Linear Attention

Yichuan Deng, Zhao Song, Kaijun Yuan, Tianyi Zhou

2205.08987 2026-03-16 cs.CV

Trading Positional Complexity vs. Deepness in Coordinate Networks

Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey

Comments arXiv admin note: substantial text overlap with arXiv:2107.02561

2603.12887 2026-03-16 cs.CV cs.LG

Forecasting Epileptic Seizures from Contactless Camera via Cross-Species Transfer Learning

Mingkai Zhai, Wei Wang, Zongsheng Li, Quanying Liu

2603.12886 2026-03-16 cs.CV

A protocol for evaluating robustness to H&E staining variation in computational pathology models

Lydia A. Schönpflug, Nikki van den Berg, Sonali Andani, Nanda Horeweg, Jurriaan Barkey Wolf, Tjalling Bosse, Viktor H. Koelzer, Maxime W. Lafarge

Abstract

Sensitivity to staining variation remains a major barrier to deploying computational pathology (CPath) models, as hematoxylin and eosin (H&E) staining varies across laboratories, requiring systematic assessment of how this variability affects model prediction. In this work, we developed a three-step protocol for evaluating robustness to H&E staining variation in CPath models: (1) select reference staining conditions, (2) characterize test set staining properties, and (3) apply CPath model(s) under simulated reference staining conditions. Here, we first created a new reference staining library based on the PLISM dataset. As an exemplary use case, we applied the protocol to assess the robustness properties of 306 microsatellite instability (MSI) classification models on the unseen SurGen colorectal cancer dataset (n=738), including 300 attention-based multiple instance learning models trained on the TCGA-COAD/READ datasets across three feature extractors (UNI2-h, H-Optimus-1, Virchow2), alongside six public MSI classification models. Classification performance was measured as AUC, and robustness as the min-max AUC range across four simulated staining conditions (low/high H&E intensity, low/high H&E color similarity). Across models and staining conditions, classification performance ranged from AUC 0.769-0.911 ($Δ$ = 0.142). Robustness ranged from 0.007-0.079 ($Δ$ = 0.072), and showed a weak inverse correlation with classification performance (Pearson r=-0.22, 95% CI [-0.34, -0.11]). Thus, we show that the proposed evaluation protocol enables robustness-informed CPath model selection and provides insight into performance shifts across H&E staining conditions, supporting the identification of operational ranges for reliable model deployment. Code is available at https://github.com/CTPLab/staining-robustness-evaluation.
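
The robustness measure described in the abstract, the min-max AUC range across staining conditions, can be sketched in a few lines. The example below is an illustration under stated assumptions, not the code from the linked repository: the function names `auc` and `auc_range`, the condition keys, and all labels and scores are made up, and the staining simulation itself is not modeled.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_range(labels: Sequence[int], scores_by_condition: Dict[str, Sequence[float]]) -> float:
    """Robustness as max AUC minus min AUC over all conditions (smaller = more robust)."""
    aucs = [auc(labels, scores) for scores in scores_by_condition.values()]
    return max(aucs) - min(aucs)


# Toy labels and scores for one hypothetical model under four conditions.
labels = [1, 1, 0, 0]
scores_by_condition = {
    "low_intensity": [0.9, 0.8, 0.3, 0.1],
    "high_intensity": [0.9, 0.4, 0.5, 0.1],
    "low_color_similarity": [0.7, 0.6, 0.2, 0.3],
    "high_color_similarity": [0.6, 0.6, 0.6, 0.2],
}
robustness = auc_range(labels, scores_by_condition)
print(robustness)
```

The pairwise AUC used here is quadratic in the number of samples, which is fine for a toy example; a rank-based implementation would be the usual choice for a test set of several hundred slides.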

2603.12885 2026-03-16 cs.LG

Enhanced Drug-drug Interaction Prediction Using Adaptive Knowledge Integration

Pengfei Liu, Jun Tao, Zhixiang Ren

2603.12875 2026-03-16 cs.LG

Test-time RL alignment exposes task familiarity artifacts in LLM benchmarks

Kun Wang, Reinhard Heckel

2603.12873 2026-03-16 cs.CV

TRACE: Structure-Aware Character Encoding for Robust and Generalizable Document Watermarking

Jiale Meng, Jie Zhang, Runyi Hu, Zhe-Ming Lu, Tianwei Zhang, Yiming Li

2603.12872 2026-03-16 cs.CL

CLARIN-PT-LDB: An Open LLM Leaderboard for Portuguese to assess Language, Culture and Civility

João Silva, Luís Gomes, António Branco

Comments Accepted at PROPOR 2026

2603.12868 2026-03-16 cs.RO

Beyond Imitation: Reinforcement Learning Fine-Tuning for Adaptive Diffusion Navigation Policies

Junhe Sheng, Ruofei Bai, Kuan Xu, Ruimeng Liu, Jie Chen, Shenghai Yuan, Wei-Yun Yau, Lihua Xie

2603.12864 2026-03-16 cs.CV

Composing Driving Worlds through Disentangled Control for Adversarial Scenario Generation

Yifan Zhan, Zhengqing Chen, Qingjie Wang, Zhuo He, Muyao Niu, Xiaoyang Guo, Wei Yin, Weiqiang Ren, Qian Zhang, Yinqiang Zheng

2603.12854 2026-03-16 cs.SD

Perpetual Dialogues: A Computational Analysis of Voice-Guitar Interaction in Carlos Paredes's Discography

Gilberto Bernardes, Nádia Moura, António Sá Pinto

Comments 8 pages, 8 figures, to be published in ICMC 2026

2603.12852 2026-03-16 cs.CV cs.LG

Wear Classification of Abrasive Flap Wheels using a Hierarchical Deep Learning Approach

Falko Kähler, Maxim Wille, Ole Schmedemann, Thorsten Schüppstuhl

Comments 14 pages, 11 figures, 8 tables

2603.12850 2026-03-16 cs.LG

On Linear Separability of the MNIST Handwritten Digits Dataset

Ákos Hajnal

Comments 8 pages, 1 figure

2603.12848 2026-03-16 cs.CV cs.AI

Team LEYA in 10th ABAW Competition: Multimodal Ambivalence/Hesitancy Recognition Approach

Elena Ryumina, Alexandr Axyonov, Dmitry Sysoev, Timur Abdulkadirov, Kirill Almetov, Yulia Morozova, Dmitry Ryumin

Comments 8 pages, 2 figures

2603.12847 2026-03-16 cs.LG cs.AI

Hierarchical Reference Sets for Robust Unsupervised Detection of Scattered and Clustered Outliers

Yiqun Zhang, Zexi Tan, Xiaopeng Luo, Yunlin Liu

Comments 15 pages, 9 figures

2603.12842 2026-03-16 cs.RO

SmoothTurn: Learning to Turn Smoothly for Agile Navigation with Quadrupedal Robots

Zunzhi You, Haolan Guo, Yunke Wang, Chang Xu

Abstract

Quadrupedal robots show great potential for valuable real-world applications such as fire rescue and industrial inspection. Such applications often require urgency and the ability to navigate agilely, which in turn demands the capability to change directions smoothly while running at high speed. Existing approaches for agile navigation typically learn a single-goal reaching policy by encouraging the robot to stay at the target position after reaching it. As a result, when the policy is used to reach sequential goals that require changing directions, it cannot anticipate upcoming maneuvers or maintain momentum across the switch of goals, thereby preventing the robot from fully exploiting its agility potential. In this work, we formulate the task as sequential local navigation, extending the single-goal-conditioned local navigation formulation in prior work. We then introduce SmoothTurn, a learning-based control framework that learns to turn smoothly while running rapidly for agile sequential local navigation. The framework adopts a novel sequential goal-reaching reward, an expanded observation space with a lookahead window for future goals, and an automatic goal curriculum that progressively increases the difficulty of sampled goal sequences based on the goal-reaching performance. The trained policy can be directly deployed on real quadrupedal robots with onboard sensors and computation. Both simulation and real-world empirical results show that SmoothTurn learns an agile locomotion policy that performs smooth turning across goals, with emergent behaviors such as controlling momentum when switching goals, facing towards the future goal in advance, and planning efficient paths. We have provided video demos of the learned motions in the supplementary materials. The source code and trained policies will be made available upon acceptance.
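
The abstract mentions an observation space expanded with a lookahead window over future goals but gives no implementation details. The sketch below shows one plausible way such a window could be built: the next few goals are expressed as offsets from the robot's position. The 2D point representation, the function names, and the choice to pad with the final goal when the sequence runs out are assumptions for illustration, not the SmoothTurn implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def lookahead_goals(goals: Sequence[Point], current: int, window: int) -> List[Point]:
    """Return `window` goals starting at index `current`.

    When fewer goals remain than the window size, the last goal is repeated
    (an assumed padding rule).
    """
    if not goals:
        raise ValueError("goal sequence must not be empty")
    if window < 1:
        raise ValueError("window must be at least 1")
    last = len(goals) - 1
    return [goals[min(current + k, last)] for k in range(window)]


def goal_observation(
    position: Point, goals: Sequence[Point], current: int, window: int
) -> List[float]:
    """Flatten the lookahead goals into (dx, dy) offsets from the robot position."""
    obs: List[float] = []
    for gx, gy in lookahead_goals(goals, current, window):
        obs.extend([gx - position[0], gy - position[1]])
    return obs


# Toy goal sequence: the robot at (1, 0) is heading to the second goal.
goals = [(2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
obs = goal_observation(position=(1.0, 0.0), goals=goals, current=1, window=3)
print(obs)
```

With a window larger than one, a policy receiving this vector can see where the following goal lies before reaching the current one, which is the information it would need to anticipate a turn.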

2603.12837 2026-03-16 cs.SD cs.AI

Mask2Flow-TSE: Two-Stage Target Speaker Extraction with Masking and Flow Matching

Junwon Moon, Hyunjin Choi, Hansol Park, Heeseung Kim, Kyuhong Shim

Comments Submitted to Interspeech 2026

2603.12832 2026-03-16 cs.CV cs.AI

Hierarchical Dual-Change Collaborative Learning for UAV Scene Change Captioning

Fuhai Chen, Pengpeng Huang, Junwen Wu, Hehong Zhang, Shiping Wang, Xiaoguang Ma, Xuri Ge

Comments 20 pages, 10 figures

2603.12829 2026-03-16 cs.CV

coDrawAgents: A Multi-Agent Dialogue Framework for Compositional Image Generation

Chunhan Li, Qifeng Wu, Jia-Hui Pan, Ka-Hei Hui, Jingyu Hu, Yuming Jiang, Bin Sheng, Xihui Liu, Wenjuan Gong, Zhengzhe Liu

Comments Accepted to CVPR 2026 Findings

2603.12826 2026-03-16 cs.CL

Rethinking Multiple-Choice Questions for RLVR: Unlocking Potential via Distractor Design

Xu Guo, Qiming Ge, Jian Tong, Kedi Chen, Jin Zhang, Xiaogui Yang, Xuan Gao, Haijun Lv, Zhihui Lu, Yicheng Zou, Qipeng Guo

2603.12823 2026-03-16 cs.CL cs.CV

Adaptive Vision-Language Model Routing for Computer Use Agents

Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen

2603.12816 2026-03-16 cs.LG cs.AI cs.CV

Residual SODAP: Residual Self-Organizing Domain-Adaptive Prompting with Structural Knowledge Preservation for Continual Learning

Gyutae Oh, Jungwoo Bae, Jitae Shin

Comments 29 pages, 10 figures

2603.12813 2026-03-16 cs.AI

Context is all you need: Towards autonomous model-based process design using agentic AI in flowsheet simulations

Pascal Schäfer, Lukas J. Krinke, Martin Wlotzka, Norbert Asprion