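arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.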

2603.21988 2026-03-24 cs.LG cs.AI

TREX: Trajectory Explanations for Multi-Objective Reinforcement Learning

Dilina Rajapakse, Juan C. Rosero, Ivana Dusparic

Comments Accepted by 4th World Conference on eXplainable Artificial Intelligence

English abstract

Reinforcement Learning (RL) has demonstrated its ability to solve complex decision-making problems in a variety of domains by optimizing reward signals obtained through interaction with an environment. However, many real-world scenarios involve multiple, potentially conflicting objectives that cannot be easily represented by a single scalar reward. Multi-Objective Reinforcement Learning (MORL) addresses this limitation by enabling agents to optimize several objectives simultaneously, explicitly reasoning about trade-offs between them. However, the "black box" nature of RL models makes the decision process behind chosen objective trade-offs unclear. Current Explainable Reinforcement Learning (XRL) methods are typically designed for single scalar rewards and do not account for explanations with respect to distinct objectives or user preferences. To address this gap, in this paper we propose TREX, a Trajectory-based Explainability framework to explain Multi-Objective Reinforcement Learning policies, based on trajectory attribution. TREX generates trajectories directly from the learned expert policy across different user preferences, and clusters them into semantically meaningful temporal segments. We quantify the influence of these behavioural segments on the Pareto trade-off by training complementary policies that exclude specific clusters, measuring the resulting relative deviation in the observed rewards and actions compared to the original expert policy. Experiments on multi-objective MuJoCo environments (HalfCheetah, Ant, and Swimmer) demonstrate the framework's ability to isolate and quantify the specific behavioural patterns.

2603.21987 2026-03-24 cs.CV cs.AI

LRC-WeatherNet: LiDAR, RADAR, and Camera Fusion Network for Real-time Weather-type Classification in Autonomous Driving

Nour Alhuda Albashir, Lars Pernickel, Danial Hamoud, Idriss Gouigah, Eren Erdal Aksoy

Comments Accepted for publication at IEEE Intelligent Vehicles Symposium - IVS 2026

2603.21986 2026-03-24 cs.CV

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

SII-GAIR, Sand.ai: Ethan Chern, Hansi Teng, Hanwen Sun, Hao Wang, Hong Pan, Hongyu Jia, Jiadi Su, Jin Li, Junjie Yu, Lijie Liu, Lingzhi Li, Lyumanshan Ye, Min Hu, Qiangang Wang, Quanwei Qi, Steffi Chern, Tao Bu, Taoran Wang, Teren Xu, Tianning Zhang, Tiantian Mi, Weixian Xu, Wenqiang Zhang, Wentai Zhang, Xianping Yi, Xiaojie Cai, Xiaoyang Kang, Yan Ma, Yixiu Liu, Yunbo Zhang, Yunpeng Huang, Yutong Lin, Zewei Tao, Zhaoliang Liu, Zheng Zhang, Zhiyao Cen, Zhixuan Yu, Zhongshu Wang, Zhulin Hu, Zijin Zhou, Zinan Guo, Yue Cao, Pengfei Liu

2603.21978 2026-03-24 cs.CV cs.GR

GeoFusion-CAD: Structure-Aware Diffusion with Geometric State Space for Parametric 3D Design

Xiaolei Zhou, Chuangjie Fang, Jie Wu, Jingyi Yang, Boyi Lin, Jianwei Zheng

Comments Accepted to CVPR 2026 (Findings). Includes supplementary material

2603.21977 2026-03-24 cs.LG cs.SY eess.SY

BOOST-RPF: Boosted Sequential Trees for Radial Power Flow

Ehimare Okoyomon, Christoph Goebel

2603.21972 2026-03-24 cs.LG cs.CL

Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe

Xixi Wu, Qianguo Sun, Ruiyang Zhang, Chao Song, Junlong Wu, Yiyan Qi, Hong Cheng

Comments Codes are available at https://github.com/WxxShirley/Agent-STAR

2603.21966 2026-03-24 cs.CV cs.CL

BHDD: A Burmese Handwritten Digit Dataset

Swan Htet Aung, Hein Htet, Htoo Say Wah Khaing, Thuya Myo Nyunt

Comments 4 pages, 9 figures, 1 table. Dataset available at https://github.com/baseresearch/BHDD

2603.21957 2026-03-24 cs.CV

Unified Spatiotemporal Token Compression for Video-LLMs at Ultra-Low Retention

Junhao Du, Jialong Xue, Anqi Li, Jincheng Dai, Guo Lu

Comments Accepted by CVPR 2026

2603.21944 2026-03-24 cs.CV

Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection

Youbin Kim, Jinho Park, Hogun Park, Eunbyung Park

Comments 24 pages, 7 figures, Project page: https://ubin108.github.io/Group3D/

2603.21943 2026-03-24 cs.CV

GeoFlow: Real-Time Fine-Grained Cross-View Geolocalization via Iterative Flow Prediction

Ayesh Abu Lehyeh, Xiaohan Zhang, Ahmad Arrabi, Waqas Sultani, Chen Chen, Safwan Wshah

Comments Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

2603.21940 2026-03-24 cs.CL

SLURP-TN: Resource for Tunisian Dialect Spoken Language Understanding

Haroun Elleuch, Salima Mdhaffar, Yannick Estève, Fethi Bougares

Comments Accepted at LREC 2026

2603.21939 2026-03-24 cs.CV cs.MM

FeatDistill: A Feature Distillation Enhanced Multi-Expert Ensemble Framework for Robust AI-generated Image Detection

Zhilin Tu, Kemou Li, Fengpeng Li, Jianwei Fei, Jiamin Zhang, Haiwei Wu

Comments 6th place (6/507) technical report at the NTIRE 2026: Robust AI-Generated Image Detection in the Wild Challenge

English abstract

The rapid iteration and widespread dissemination of deepfake technology have posed severe challenges to information security, making robust and generalizable detection of AI-generated forged images increasingly important. In this paper, we propose FeatDistill, an AI-generated image detection framework that integrates feature distillation with a multi-expert ensemble, developed for the NTIRE Challenge on Robust AI-Generated Image Detection in the Wild. The framework explicitly targets three practical bottlenecks in real-world forensics: degradation interference, insufficient feature representation, and limited generalization. Concretely, we build a four-backbone Vision Transformer (ViT) ensemble composed of CLIP and SigLIP variants to capture complementary forensic cues. To improve data coverage, we expand the training set and introduce comprehensive degradation modeling, which exposes the detector to diverse quality variations and synthesis artifacts commonly encountered in unconstrained scenarios. We further adopt a two-stage training paradigm: the model is first optimized with a standard binary classification objective, then refined by dense feature-level self-distillation for representation alignment. This design effectively mitigates overfitting and enhances semantic consistency of learned features. At inference time, the final prediction is obtained by averaging the probabilities from four independently trained experts, yielding stable and reliable decisions across unseen generators and complex degradations. Despite the ensemble design, the framework remains efficient, requiring only about 10 GB peak GPU memory. Extensive evaluations in the NTIRE challenge setting demonstrate that FeatDistill achieves strong robustness and generalization under diverse "in-the-wild" conditions, offering an effective and practical solution for real-world deepfake image detection.

2603.21937 2026-03-24 cs.CV

MultiBind: A Benchmark for Attribute Misbinding in Multi-Subject Generation

Wenqing Tian, Hanyi Mao, Zhaocheng Liu, Lihua Zhang, Qiang Liu, Jian Wu, Liang Wang

2603.21933 2026-03-24 cs.CV cs.AI cs.LG

Camera-Agnostic Pruning of 3D Gaussian Splats via Descriptor-Based Beta Evidence

Peter Fasogbon, Ugurcan Budak, Patrice Rondao Alface, Hamed Rezazadegan Tavakoli

Comments 14 pages, 3 figures, 2 tables

2603.21931 2026-03-24 cs.CV

SatGeo-NeRF: Geometrically Regularized NeRF for Satellite Imagery

Valentin Wagner, Sebastian Bullinger, Michael Arens, Rainer Stiefelhagen

Comments Accepted at the ISPRS Congress 2026

2603.21928 2026-03-24 cs.CV cs.LG

The Golden Subspace: Where Efficiency Meets Generalization in Continual Test-Time Adaptation

Guannan Lai, Da-Wei Zhou, Zhenguo Li, Han-Jia Ye

Comments Accepted to CVPR 2026

2603.21926 2026-03-24 cs.RO

Disengagement Analysis and Field Tests of a Prototypical Open-Source Level 4 Autonomous Driving System

Marvin Seegert, Christian Oefinger, Korbinian Moller, Christoph Bank, Johannes Betz

Comments 8 pages, submitted to IEEE for possible publication

2603.21925 2026-03-24 cs.AI

Guideline-grounded retrieval-augmented generation for ophthalmic clinical decision support

Shuying Chen, Sen Cui, Zhong Cao

2603.21921 2026-03-24 cs.LG cs.AI

Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors

Juan Sebastian Rojas, Chi-Guhn Lee

2603.21913 2026-03-24 cs.RO cs.SY eess.SY math.OC

Collision-Free Velocity Scheduling for Multi-Agent Systems on Predefined Routes via Inexact-Projection ADMM

Seungyeop Lee, Jong-Han Kim

2603.21911 2026-03-24 cs.CV cs.LG eess.IV

A Latent Representation Learning Framework for Hyperspectral Image Emulation in Remote Sensing

Chedly Ben Azizi, Claire Guilloteau, Gilles Roussel, Matthieu Puigt

2603.21908 2026-03-24 cs.LG

SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference

Ziyang Zhang, Zheshun Wu, Jie Liu, Luca Mottola

Comments 14 pages, 19 figures, 3 tables

2603.21904 2026-03-24 cs.CV cs.AI

SHAPE: Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation for Medical Image Segmentation

Linkuan Zhou, Yinghao Xia, Yufei Shen, Xiangyu Li, Wenjie Du, Cong Cong, Leyi Wei, Ran Su, Qiangguo Jin

2603.21900 2026-03-24 cs.CL

Ara-Best-RQ: Multi Dialectal Arabic SSL

Haroun Elleuch, Ryan Whetten, Salima Mdhaffar, Yannick Estève, Fethi Bougares

Comments Accepted at ICASSP 2026

2603.21884 2026-03-24 cs.CV cs.AI cs.LG

Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation

Donald Shenaj, Federico Errica, Antonio Carta

Comments Project page: https://donaldssh.github.io/NotAllLayersAreCreatedEqual/

2603.21882 2026-03-24 cs.CV

Deep S2P: Integrating Learning Based Stereo Matching Into the Satellite Stereo Pipeline

Elías Masquil, Thibaud Ehret, Pablo Musé, Gabriele Facciolo

Comments Accepted at IGARSS 2026

2603.21872 2026-03-24 cs.CV cs.AI

Manifold-Aware Exploration for Reinforcement Learning in Video Generation

Mingzhe Zheng, Weijie Kong, Yue Wu, Dengyang Jiang, Yue Ma, Xuanhua He, Bin Lin, Kaixiong Gong, Zhao Zhong, Liefeng Bo, Qifeng Chen, Harry Yang

Comments 17 pages, 12 figures

2603.21867 2026-03-24 cs.CV cs.AI

Adversarial Camouflage

Paweł Borsukiewicz, Daniele Lunghi, Melissa Tessa, Jacques Klein, Tegawendé F. Bissyandé

Comments 18 pages, 4 figures, 5 tables

2603.21866 2026-03-24 cs.AI

Tacit Knowledge Management with Generative AI: Proposal of the GenAI SECI Model

Naoshi Uchihira

Comments This paper is intended to be submitted to AHFE2026

2603.21864 2026-03-24 cs.CV cs.AI

Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation

Yuyang You, Yongzhi Li, Jiahui Li, Yadong Mu, Quan Chen, Peng Jiang