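arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and more.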

2604.18133 2026-04-21 cs.AI

Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures

Zixiang Wang, Mengjia Gong, Qiyu Sun, Jing Xu, Shuai Mao, Xin Jin, Qing-Long Han, Yang Tang

Comments Accepted by IEEE/CAA Journal of Automatica Sinica

2604.18131 2026-04-21 cs.AI

Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration

Qifan Zhang, Dongyang Ma, Tianqing Fang, Jia Li, Jing Tang, Nuo Chen, Haitao Mi, Yan Wang

2604.18126 2026-04-21 cs.RO cs.CV

Chatting about Conditional Trajectory Prediction

Yuxiang Zhao, Wei Huang, Haipeng Zeng, Huan Zhao, Yujie Song

2604.18124 2026-04-21 cs.CL cs.AI

TLoRA: Task-aware Low Rank Adaptation of Large Language Models

Weicheng Lin, Yi Zhang, Jiawei Dang, Liang-Jie Zhang

Comments Accepted to ACL 2026

2604.18122 2026-04-21 cs.CL

Decisive: Guiding User Decisions with Optimal Preference Elicitation from Unstructured Documents

Akriti Jain, Anish Mulay, Divyansh Verma, Aishani Pandey, Pritika Ramu, Aparna Garimella

Comments Accepted to ACL 2026 Main Conference

2604.18117 2026-04-21 cs.LG

LoRaQ: Optimized Low Rank Approximation for 4-bit Quantization

Yann Bouquet, Alireza Khodamoradi, Sophie Yáng Shen, Kristof Denolf, Mathieu Salzmann

2604.18109 2026-04-21 cs.CL cs.SD

FLiP: Towards understanding and interpreting multimodal multilingual sentence embeddings

Santosh Kesiraju, Bolaji Yusuf, Šimon Sedláček, Oldřich Plchot, Petr Schwarz

Comments Under review

2604.18107 2026-04-21 cs.CV

Test-Time Perturbation Learning with Delayed Feedback for Vision-Language-Action Models

Zehua Zang, Xi Wang, Fuchun Sun, Xiao Xu, Lixiang Lium, Jiahuan Zhou, Jiangmeng Li

Comments 12 pages, 7 figures, 5 tables

2604.18106 2026-04-21 cs.CL

Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion

Chen Zhang, Jiuheng Lin, Zhiyuan Liao, Yansong Feng

Comments ACL 2026

2604.18095 2026-04-21 cs.AI

DSAINet: An Efficient Dual-Scale Attentive Interaction Network for General EEG Decoding

Zhiyuan Ma, Zeyuan Li, Zihao Qiu, Jinhao Li, Lingqin Meng, Xinche Zhang, Yixuan Liu, Xinke Shen, Sen Song

2604.18094 2026-04-21 cs.CV

Decision-Aware Attention Propagation for Vision Transformer Explainability

Sehyeong Jo, Gangjae Jang, Haesol Park

Comments 16 pages, 4 figures

2604.18092 2026-04-21 cs.LG

Generalization Boundaries of Fine-Tuned Small Language Models for Graph Structural Inference

Michal Podstawski

2604.18091 2026-04-21 cs.CL cs.CV

Culture-Aware Humorous Captioning: Multimodal Humor Generation across Cultural Contexts

Run Xu, Lu Li, Rongzhao Zhang, Jie Xu

2604.18090 2026-04-21 cs.RO cond-mat.mtrl-sci cond-mat.soft physics.app-ph

Muscle-inspired magnetic actuators that push, pull, crawl, and grasp

Muhammad Bilal Khan, Florian Hofmann, Kilian Schäfer, Matthias Lutzi, Oliver Gutfleisch

2604.18089 2026-04-21 cs.LG stat.ML

Towards E-Value Based Stopping Rules for Bayesian Deep Ensembles

Emanuel Sommer, Rickmer Schulte, Sarah Deubner, Julius Kobialka, David Rügamer

Comments Accepted for presentation at the OPTIMAL Workshop at AISTATS 2026, Tangier, Morocco

2604.18088 2026-04-21 cs.CV cs.AI stat.AP

Autonomous Unmanned Aircraft Systems for Enhanced Search and Rescue of Drowning Swimmers: Image-Based Localization and Mission Simulation

Sascha Emanuel Zell, Toni Schneidereit, Armin Fügenschuh, Michael Breuß

Comments Submitted to "Applied Intelligence"

2604.18087 2026-04-21 cs.CL cs.AI cs.CY

Mix and Match: Context Pairing for Scalable Topic-Controlled Educational Summarisation

Nathikan Yodthapa, Thanapong Intharah, Sahan Bulathwela

Comments To be published at the International Conference on Artificial Intelligence in Education (AIED'26)

2604.18083 2026-04-21 cs.LG cs.AI

Implicit neural representations as a coordinate-based framework for continuous environmental field reconstruction from sparse ecological observations

Agnieszka Pregowska, Hazem M. Kalaji

2604.18076 2026-04-21 cs.CV cs.AI

Class-specific diffusion models improve military object detection in a low-data domain

Ella P. Fokkinga, Jan Erik van Woerden, Thijs A. Eker, Sebastiaan P. Snel, Elfi I. S. Hofmeijer, Klamer Schutte, Friso G. Heslinga

Comments Submitted to SPIE Defense + Security

2604.18075 2026-04-21 cs.CV

Enhancing Continual Learning of Vision-Language Models via Dynamic Prefix Weighting

Hyeonseo Jang, Hyuk Kwon, Kibok Lee

Comments CVPR 2026; revised text and figures for improved readability

2604.18071 2026-04-21 cs.AI

Architectural Design Decisions in AI Agent Harnesses

Hu Wei

Comments 35 pages, 13 tables

2604.18069 2026-04-21 cs.CL

Modeling Human Perspectives with Socio-Demographic Representations

Leixin Zhang, Cagri Coltekin

2604.18064 2026-04-21 cs.AI

Understanding Human Actions through the Lens of Executable Models

Rimvydas Rubavicius, Manisha Dubey, N. Siddharth, Subramanian Ramamoorthy

Comments 16 pages, 3 figures, 2 tables

2604.18062 2026-04-21 cs.LG physics.flu-dyn

Towards a Foundation-Model Paradigm for Aerodynamic Prediction in Three-dimensional Design

Yunjia Yang, Babak Gholami, Caglar Gurbuz, Mohammad Rashed, Nils Thuerey

2604.18051 2026-04-21 cs.CV

INTENT: Invariance and Discrimination-aware Noise Mitigation for Robust Composed Image Retrieval

Zhiwei Chen, Yupeng Hu, Zhiheng Fu, Zixu Li, Jiale Huang, Qinlei Huang, Yinwei Wei

Comments Accepted by AAAI 2026

Abstract

Composed Image Retrieval (CIR) is a challenging image retrieval paradigm that retrieves target images based on multimodal queries consisting of reference images and modification texts. Although substantial progress has been made in recent years, existing methods assume that all samples are correctly matched. However, in real-world scenarios, high triplet annotation costs mean that CIR datasets inevitably contain annotation errors, resulting in incorrectly matched triplets. To address this issue, the problem of Noisy Triplet Correspondence (NTC) has attracted growing attention. We argue that noise in CIR can be categorized into two types: cross-modal correspondence noise and modality-inherent noise. The former arises from mismatches across modalities, whereas the latter originates from intra-modal background interference or visual factors irrelevant to the coarse-grained modification annotations. However, modality-inherent noise is often overlooked, and research on cross-modal correspondence noise remains nascent. To tackle these issues, we propose the Invariance and discrimiNaTion-awarE Noise neTwork (INTENT), comprising two components, Visual Invariant Composition and Bi-Objective Discriminative Learning, designed to handle these two types of noise. The former applies causal intervention on the visual side via the Fast Fourier Transform (FFT) to generate intervened composed features, enforcing visual invariance and enabling the model to ignore modality-inherent noise during composition. The latter adopts collaborative optimization with both positive and negative samples, and constructs a scalable decision boundary that dynamically adjusts decisions based on the loyalty degree, enabling robust correspondence discrimination. Extensive experiments on two widely used benchmark datasets demonstrate the superiority and robustness of INTENT.


2604.18047 2026-04-21 cs.CV

GS-STVSR: Ultra-Efficient Continuous Spatio-Temporal Video Super-Resolution via 2D Gaussian Splatting

Mingyu Shi, Xin Di, Long Peng, Boxiang Cao, Anran Wu, Zhanfeng Feng, Jiaming Guo, Renjing Pei, Xueyang Fu, Yang Cao, Zhengjun Zha

2604.18041 2026-04-21 cs.CL cs.CY

JudgeMeNot: Personalizing Large Language Models to Emulate Judicial Reasoning in Hebrew

Itay Razumenko, Arnon Sturm, Nir Grinberg

Comments To appear in Findings of the ACL 2026

2604.18037 2026-04-21 cs.CV

HABIT: Chrono-Synergia Robust Progressive Learning Framework for Composed Image Retrieval

Zixu Li, Yupeng Hu, Zhiwei Chen, Shiqi Zhang, Qinlei Huang, Zhiheng Fu, Yinwei Wei

Comments Accepted by AAAI 2026

2604.18035 2026-04-21 cs.LG

Variational Autoencoder Domain Adaptation for Cross-System Generalization in ML-Based SOP Monitoring

Leyla Sadighi, Stefan Karlsson, Carlos Natalino, Mojtaba Eshghie, Fehmida Usmani, Eoin Kenny, Lena Wosinska, Paolo Monti, Marija Furdek, Marco Ruffini

2604.18034 2026-04-21 cs.CL cs.CV

SignDPO: Multi-level Direct Preference Optimisation for Skeleton-based Gloss-free Sign Language Translation

Muxin Pu, Xiao-Ming Wu, Mei Kuan Lim, Chun Yong Chong, Wei Li, Chen Change Loy