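arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.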

2602.18652 2026-03-17 cs.CL

PolyFrame at MWE-2026 AdMIRe 2: When Words Are Not Enough: Multimodal Idiom Disambiguation

Nina Hosseini-Kivanani

Comments Accepted at AdMIRe 2 shared task (Advancing Multimodal Idiomaticity Representation) colocated with 22nd Workshop on Multiword Expressions (MWE 2026) @EACL2026

Abstract

Multimodal models struggle with idiomatic expressions due to their non-compositional meanings, a challenge amplified in multilingual settings. We introduce PolyFrame, our system for the MWE-2026 AdMIRe 2 shared task on multimodal idiom disambiguation, featuring a unified pipeline for both image+text ranking (Subtask A) and text-only caption ranking (Subtask B). All model variants retain frozen CLIP-style vision-language encoders and the multilingual BGE M3 encoder, training only lightweight modules: a logistic regression and LLM-based sentence-type predictor, idiom synonym substitution, distractor-aware scoring, and Borda rank fusion. Starting from a CLIP baseline (26.7% Top-1 on English dev, 6.7% on English test), adding idiom-aware paraphrasing and explicit sentence-type classification increased performance to 60.0% Top-1 on English and 60.0% Top-1 (0.822 NDCG@5) in zero-shot transfer to Portuguese. On the multilingual blind test, our systems achieved average Top-1/NDCG scores of 0.35/0.73 for Subtask A and 0.32/0.71 for Subtask B across 15 languages. Ablation results highlight idiom-aware rewriting as the main contributor to performance, while sentence-type prediction and multimodal fusion enhance robustness. These findings suggest that effective idiom disambiguation is feasible without fine-tuning large multimodal encoders.
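
The abstract above names Borda rank fusion as one of the lightweight modules but does not describe how it is configured. As a rough illustration only, the following sketch shows a textbook Borda count over several rankings of the same candidates. The function name `borda_fuse`, the candidate labels, and the tie-breaking rule are assumptions made for this example and are not taken from the PolyFrame system.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse rankings (each ordered best first) with a plain Borda count.

    Hypothetical helper for illustration; not the PolyFrame implementation.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top candidate earns n - 1 points, the last one earns 0.
            scores[candidate] += n - 1 - position
    # Highest total first; ties are broken alphabetically (an assumption).
    return sorted(scores, key=lambda c: (-scores[c], c))


# Three hypothetical scorers ranking the same three candidate images.
rankings = [
    ["img_a", "img_b", "img_c"],
    ["img_b", "img_a", "img_c"],
    ["img_a", "img_c", "img_b"],
]
fused = borda_fuse(rankings)
print(fused)  # ['img_a', 'img_b', 'img_c']
```

Here `img_a` collects 5 points, `img_b` 3, and `img_c` 1, so the fused order follows the majority preference even though the individual scorers disagree.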

2602.16931 2026-03-17 cs.AI

Narrow Fine-Tuning Erodes Safety Alignment in Vision-Language Agents

Idhant Gulati, Shivam Raval

Comments 25 pages, 14 figures, Published at the Lifelong Agent Workshop at ICLR 2026

2602.14540 2026-03-17 cs.RO

Multimodal Belief-Space Covariance Steering with Active Probing and Influence for Interactive Driving

Devodita Chakravarty, John Dolan, Yiwei Lyu

Comments Accepted to IEEE International Conference on Robotics and Automation (ICRA 2026)

2602.12529 2026-03-17 cs.LG cs.CV

Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models

Bowen Ping, Chengyou Jia, Minnan Luo, Hangwei Qian, Ivor Tsang

2602.12117 2026-03-17 cs.LG cs.AI

KAN-FIF: Spline-Parameterized Lightweight Physics-based Tropical Cyclone Estimation on Meteorological Satellite

Jiakang Shen, Qinghui Chen, Runtong Wang, Chenrui Xu, Jinglin Zhang, Cong Bai, Feng Zhang

2602.11656 2026-03-17 cs.CV cs.AI cs.RO

SToRM: Supervised Token Reduction for Multi-modal LLMs toward efficient end-to-end autonomous driving

Seo Hyun Kim, Jin Bok Park, Do Yeon Koo, Hogun Park, Il Yong Chun

Comments Accepted to ICRA 2026

2602.09586 2026-03-17 cs.CV

Delving into Spectral Clustering with Vision-Language Representations

Bo Peng, Yuanwei Hu, Bo Liu, Ling Chen, Jie Lu, Zhen Fang

Comments ICLR26

2602.08751 2026-03-17 cs.LG q-bio.QM

Central Dogma Transformer II: An AI Microscope for Understanding Cellular Regulatory Mechanisms

Nobuyuki Ota

Comments 23 pages, 9 figures, 1 table, 37 references. v3: added gradient attribution analysis (Fig 8), TFRC Jacobian regulatory map (Fig 9, Table 1), PPMX-T003 clinical validation, corrected references

2602.08362 2026-03-17 cs.AI cs.LG cs.LO

Circuit Representations of Random Forests with Applications to XAI

Chunxi Ji, Adnan Darwiche

Comments Will appear in proceedings of the 4th World Conference on eXplainable Artificial Intelligence, XAI 2026

2602.05234 2026-03-17 cs.LG cs.CL

Faithful Bi-Directional Model Steering via Distribution Matching and Distributed Interchange Interventions

Yuntai Bao, Xuhong Zhang, Jintao Chen, Ge Su, Yuxiang Cai, Hao Peng, Bing Sun, Haiqin Weng, Liu Yan, Jianwei Yin

Comments camera ready version; 55 pages, 25 figures; accepted for ICLR 2026

Abstract

Intervention-based model steering offers a lightweight and interpretable alternative to prompting and fine-tuning. However, by adapting strong optimization objectives from fine-tuning, current methods are susceptible to overfitting and often underperform, sometimes generating unnatural outputs. We hypothesize that this is because effective steering requires the faithful identification of internal model mechanisms, not the enforcement of external preferences. To this end, we build on the principles of distributed alignment search (DAS), the standard for causal variable localization, to propose a new steering method: Concept DAS (CDAS). While we adopt the core mechanism of DAS, distributed interchange intervention (DII), we introduce a novel distribution matching objective tailored for the steering task by aligning intervened output distributions with counterfactual distributions. CDAS differs from prior work in two main ways: first, it learns interventions via weak-supervised distribution matching rather than probability maximization; second, it uses DIIs that naturally enable bi-directional steering and allow steering factors to be derived from data, reducing the effort required for hyperparameter tuning and resulting in more faithful and stable control. On AxBench, a large-scale model steering benchmark, we show that CDAS does not always outperform preference-optimization methods but may benefit more from increased model scale. In two safety-related case studies, overriding refusal behaviors of safety-aligned models and neutralizing a chain-of-thought backdoor, CDAS achieves systematic steering while maintaining general model utility. These results indicate that CDAS is complementary to preference-optimization approaches and conditionally constitutes a robust approach to intervention-based model steering. Our code is available at https://github.com/colored-dye/concept_das.

2602.04412 2026-03-17 cs.RO cs.LG

HoRD: Robust Humanoid Control via History-Conditioned Reinforcement Learning and Online Distillation

Puyue Wang, Jiawei Hu, Yan Gao, Junyan Wang, Yu Zhang, Gillian Dobbie, Tao Gu, Wafa Johal, Ting Dang, Hong Jia

2602.01960 2026-03-17 cs.LG

Grounding Generated Videos in Feasible Plans via World Models

Christos Ziakas, Amir Bar, Alessandra Russo

2602.01348 2026-03-17 cs.CL cs.LG

CRAFT: Calibrated Reasoning with Answer-Faithful Traces via Reinforcement Learning for Multi-Hop Question Answering

Yu Liu, Wenxiao Zhang, Diandian Guo, Cong Cao, Fangfang Yuan, Qiang Sun, Yanbing Liu, Jin B. Hong, Zhiyuan Ma

2601.21641 2026-03-17 cs.LG cs.AI

Seg-MoE: Multi-Resolution Segment-wise Mixture-of-Experts for Time Series Forecasting Transformers

Evandro S. Ortigossa, Eran Segal

Comments Under review

2601.20432 2026-03-17 cs.SD cs.AI

Self Voice Conversion as an Attack against Neural Audio Watermarking

Yigitcan Özer, Wanying Ge, Zhe Zhang, Xin Wang, Junichi Yamagishi

Comments 7 pages; 2 figures; 2 tables; accepted at IEICE, SP/SLP 2026

2601.18933 2026-03-17 cs.CL cs.AI

BabyReasoningBench: Generating Developmentally-Inspired Reasoning Tasks for Evaluating Baby Language Models

Kaustubh D. Dhole

2601.18260 2026-03-17 cs.CV

Depth to Anatomy: Organ Localization from Depth Images for Automated Patient Table Positioning in Radiology Workflow

Eytan Kats, Kai Geissler, Daniel Mensing, Julien Senegas, Jochen G. Hirsch, Stefan Heldman, Mattias P. Heinrich

Comments preprint

2601.15871 2026-03-17 cs.LG cs.AI

Why Inference in Large Models Becomes Decomposable After Training

Jidong Jin

Comments 42 pages, 6 figures

2601.14954 2026-03-17 cs.LG

Multimodal Rumor Detection Enhanced by External Evidence and Forgery Features

Han Li, Hua Sun

Comments 19 pages, 10 figures

2601.11578 2026-03-17 cs.CL cs.AI

Multi-Agent LLMs for Generating Research Limitations

Ibrahim Al Azher, Zhishuai Guo, Hamed Alhoori

Comments 18 Pages, 9 figures

2601.10453 2026-03-17 cs.SD cs.LG eess.AS physics.comp-ph

Stable Differentiable Modal Synthesis for Learning Nonlinear Dynamics

Victor Zheleznov, Stefan Bilbao, Alec Wright, Simon King

Comments Accepted for publication in Journal of the Audio Engineering Society (special issue on New Frontiers in Digital Audio Effects)

2601.10282 2026-03-17 cs.LG cs.AI cs.CE cs.NA math.DS math.NA

SPIKE: Sparse Koopman Regularization for Physics-Informed Neural Networks

Jose Marie Antonio Miñoza

2601.07773 2026-03-17 cs.CV

Self-transcendence: Is External Feature Guidance Indispensable for Accelerating Diffusion Transformer Training?

Lingchen Sun, Rongyuan Wu, Zhengqiang Zhang, Ruibin Li, Yujing Sun, Shuaizheng Liu, Lei Zhang

2601.07038 2026-03-17 cs.CL

Task Arithmetic with Support Languages for Low-Resource ASR

Emma Rafkin, Dan DeGenaro, Xiulin Yang

2601.06117 2026-03-17 cs.LG

The Active Discoverer Framework: Towards Autonomous Physics Reasoning through Neuro-Symbolic LaTeX Synthesis

Hyunjun Jeon

Comments V4 Coming S00N :)

2601.02716 2026-03-17 cs.CV

MorphGS: Morphology-Adaptive Articulated 3D Motion Transfer from Videos

Taeyeon Kim, Youngju Na, Jumin Lee, Sebin Lee, Minhyuk Sung, Sung-Eui Yoon

2601.02702 2026-03-17 cs.AI

MultiSessionCollab: Learning User Preferences with Memory to Improve Long-Term Collaboration

Shuhaib Mehri, Priyanka Kargupta, Tal August, Dilek Hakkani-Tür

2601.02320 2026-03-17 cs.CL

Estimating Text Temperature with Language Models

Nikolay Mikhaylovskiy

2601.01804 2026-03-17 cs.CV

V-CORE: Temporally Consistent Video Understanding for Video-LLM

Zhengjian Kang, Qi Chen, Rui Liu, Kangtong Mo, Xingyu Zhang, Xiaoyu Deng, Ye Zhang

Comments 7 pages, 4 figures

2601.00275 2026-03-17 cs.RO

Pure Inertial Navigation in Challenging Environments with Wheeled and Chassis Mounted Inertial Sensors

Dusan Nemec, Gal Versano, Itai Savin, Vojtech Simak, Juraj Kekelak, Itzik Klein