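arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.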

2603.10578 2026-03-12 cs.CV cs.DB

R4-CGQA: Retrieval-based Vision Language Models for Computer Graphics Image Quality Assessment

Zhuangzi Li, Jian Jin, Shilv Cai, Weisi Lin

English abstract

Immersive Computer Graphics (CG) rendering has become ubiquitous in modern daily life. However, comprehensively evaluating CG quality remains challenging for two reasons: first, existing CG datasets lack systematic descriptions of rendering quality; second, existing CG quality assessment methods cannot provide reasonable text-based explanations. To address these issues, we first identify six key perceptual dimensions of CG quality from the user perspective and construct a dataset of 3500 CG images with corresponding quality descriptions. Each description covers CG style, content, and perceived quality along the selected dimensions. Furthermore, we use a subset of the dataset to build several question-answer benchmarks based on the descriptions in order to evaluate the responses of existing Vision Language Models (VLMs). We find that current VLMs are not sufficiently accurate in judging fine-grained CG quality, but that descriptions of visually similar images can significantly improve a VLM's understanding of a given CG image. Motivated by this observation, we adopt retrieval-augmented generation and propose a two-stream retrieval framework that effectively enhances the CG quality assessment capabilities of VLMs. Experiments on several representative VLMs demonstrate that our method substantially improves their performance on CG quality assessment.

2603.10572 2026-03-12 cs.RO

Safety-critical Control Under Partial Observability: Reach-Avoid POMDP meets Belief Space Control

Matti Vahs, Joris Verhagen, Jana Tumova

2603.10570 2026-03-12 cs.CL

End-to-End Chatbot Evaluation with Adaptive Reasoning and Uncertainty Filtering

Nhi Dang, Tung Le, Huy Tien Nguyen

2603.10565 2026-03-12 cs.RO

TacLoc: Global Tactile Localization on Objects from a Registration Perspective

Zirui Zhang, Boyang Zhang, Fumin Zhang, Huan Yin

Comments 8 pages, 12 figures

2603.10564 2026-03-12 cs.AI cs.NI

Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents

Yuanhao Li, Haozhe Wang, Geyong Min, Nektarios Georgalas, Wang Miao

2603.10563 2026-03-12 cs.LG

Riemannian Geometry-Preserving Variational Autoencoder for MI-BCI Data Augmentation

Viktorija Poļaka, Ivo Pascal de Jong, Andreea Ioana Sburlea

Comments 6 pages, 4 figures, 2 tables

2603.10560 2026-03-12 cs.CV

PET-F2I: A Comprehensive Benchmark and Parameter-Efficient Fine-Tuning of LLMs for PET/CT Report Impression Generation

Yuchen Liu, Wenbo Zhang, Liling Peng, Yichi Zhang, Yu Fu, Xin Guo, Chao Qu, Yuan Qi, Le Xue

2603.10551 2026-03-12 cs.CV cs.MM

P-GSVC: Layered Progressive 2D Gaussian Splatting for Scalable Image and Video

Longan Wang, Yuang Shi, Wei Tsang Ooi

Comments MMSys 2026; Project Website: see https://longanwang-cs.github.io/PGSVC-webpage/

2603.10549 2026-03-12 cs.CV cs.AI eess.SP

Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues

Mohammed Salah, Eman Ouda, Giuseppe Dell'Avvocato, Fabrizio Sarasini, Ester D'Accardi, Jorge Dias, Davor Svetinovic, Stefano Sfarra, Yusra Abdulrahman

English abstract

Active infrared thermography (AIRT) is currently witnessing a surge of artificial intelligence (AI) methodologies being deployed for automated subsurface defect analysis of high-performance carbon fiber-reinforced polymers (CFRPs). Deploying AI-based AIRT methodologies for inspecting CFRPs requires creating time-consuming and expensive datasets of CFRP inspection sequences to train neural networks. To address this challenge, this work introduces a novel language-guided framework for cognitive defect analysis in CFRPs using AIRT and vision-language models (VLMs). Unlike conventional learning-based approaches, the proposed framework does not require developing training datasets for extensive training of defect detectors. Instead, it relies solely on pretrained multimodal VLM encoders coupled with a lightweight adapter to enable generative zero-shot understanding of thermographic patterns and automatic detection and localization of subsurface defects. Given the domain gap between thermographic data and the natural images used to train VLMs, an AIRT-VLM Adapter is proposed to enhance the visibility of defects while aligning the thermographic domain with the learned representations of VLMs. The proposed framework is validated using three representative VLMs: GroundingDINO, Qwen-VL-Chat, and CogVLM. Validation is performed on 25 CFRP inspection sequences with impacts introduced at different energy levels, reflecting realistic defects encountered in industrial scenarios. Experimental results demonstrate that the AIRT-VLM Adapter achieves signal-to-noise ratio (SNR) gains exceeding 10 dB compared with conventional thermographic dimensionality-reduction methods, while enabling zero-shot defect detection with intersection-over-union values reaching 70%.

2603.10547 2026-03-12 cs.CL

Automatic End-to-End Data Integration using Large Language Models

Aaron Steiner, Christian Bizer

Comments 8 pages, 9 tables. Accepted at the Beyond SQL Workshop at ICDE 2026

2603.10545 2026-03-12 cs.LG

Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

Martin Asenov, Qiwen Deng, Gingfung Yeung, Adam Barker

2603.10544 2026-03-12 cs.LG cs.AI

SCORE: Replacing Layer Stacking with Contractive Recurrent Depth

Guillaume Godin

Comments 32 pages, 21 figures, 12 tables

2603.10541 2026-03-12 cs.CV cs.AI

Prompting with the human-touch: evaluating model-sensitivity of foundation models for musculoskeletal CT segmentation

Caroline Magg, Maaike A. ter Wee, Johannes G. G. Dobbe, Geert J. Streekstra, Leendert Blankevoort, Clara I. Sánchez, Hoel Kervadec

2603.10538 2026-03-12 cs.CV

DSFlash: Comprehensive Panoptic Scene Graph Generation in Realtime

Julian Lorenz, Vladyslav Kovganko, Elias Kohout, Mrunmai Phatak, Daniel Kienzle, Rainer Lienhart

Comments Accepted at CVPR 2026

2603.10535 2026-03-12 cs.LG cs.CL

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

Zichao Li, Jie Lou, Fangchen Dong, Zhiyuan Fan, Mengjie Ren, Hongyu Lin, Xianpei Han, Debing Zhang, Le Sun, Yaojie Lu, Xing Yu

2603.10529 2026-03-12 cs.RO

BinWalker: Development and Field Evaluation of a Quadruped Manipulator Platform for Sustainable Litter Collection

Giulio Turrisi, Angelo Bratta, Giovanni Minelli, Gabriel Fischer Abati, Amir H. Rad, João Carlos Virgolino Soares, Claudio Semini

2603.10528 2026-03-12 cs.LG cs.AI

UAV-MARL: Multi-Agent Reinforcement Learning for Time-Critical and Dynamic Medical Supply Delivery

Islam Guven, Mehmet Parlak

Comments 7 pages, 4 figures, 2 tables, conference

2603.10527 2026-03-12 cs.LG cs.SY eess.SY

World Model for Battery Degradation Prediction Under Non-Stationary Aging

Kai Chin Lim, Khay Wai See

Comments 18 pages, 3 figures

2603.10526 2026-03-12 cs.CV

Sparse Task Vector Mixup with Hypernetworks for Efficient Knowledge Transfer in Whole-Slide Image Prognosis

Pei Liu, Xiangxiang Zeng, Tengfei Ma, Yucheng Xing, Xuanbai Ren, Yiping Liu

Comments Accepted to CVPR 2026

2603.10524 2026-03-12 cs.CL

AILS-NTUA at SemEval-2026 Task 8: Evaluating Multi-Turn RAG Conversations

Dimosthenis Athanasiou, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou

2603.10521 2026-03-12 cs.AI cs.CL cs.CR cs.LG

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu, Christopher A. Choquette-Choo, Steph Lin, Nikhil Kandpal, Milad Nasr, Rai, Sam Toyer, Miles Wang, Yaodong Yu, Alex Beutel, Kai Xiao

2603.10519 2026-03-12 cs.CV

Visually-Guided Controllable Medical Image Generation via Fine-Grained Semantic Disentanglement

Xin Huang, Junjie Liang, Qingshan Hou, Peng Cao, Jinzhu Yang, Xiaoli Liu, Osmar R. Zaiane

Comments 10 pages, 7 figures. Currently under review

2603.10517 2026-03-12 cs.CV

UHD Image Deblurring via Autoregressive Flow with Ill-conditioned Constraints

Yucheng Xin, Dawei Zhao, Xiang Chen, Chen Wu, Pu Wang, Dianjie Lu, Guijuan Zhang, Xiuyi Jia, Zhuoran Zheng

Comments Submitted to ECCV 2026

2603.10505 2026-03-12 cs.CL

Safe and Scalable Web Agent Learning via Recreated Websites

Hyungjoo Chae, Jungsoo Park, Alan Ritter

2603.10494 2026-03-12 cs.CL cs.LG

VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

Weixin Liu, Congning Ni, Qingyuan Song, Susannah L. Rose, Christopher Symons, Murat Kantarcioglu, Bradley A. Malin, Zhijun Yin

Comments Paper submitted to AMIA 2026 Annual Symposium

2603.10493 2026-03-12 cs.LG

A Universal Nearest-Neighbor Estimator for Intrinsic Dimensionality

Eng-Jon Ong, Omer Bobrowski, Gesine Reinert, Primoz Skraba

2603.10487 2026-03-12 cs.CV

Spatial self-supervised Peak Learning and correlation-based Evaluation of peak picking in Mass Spectrometry Imaging

Philipp Weigand, Nikolas Ebert, Shad A. Mohammed, Denis Abu Sammour, Carsten Hopf, Oliver Wasenmüller

2603.10484 2026-03-12 cs.CV

StructDamage: A Large Scale Unified Crack and Surface Defect Dataset for Robust Structural Damage Detection

Misbah Ijaz, Saif Ur Rehman Khan, Abd Ur Rehman, Sebastian Vollmer, Andreas Dengel, Muhammad Nabeel Asim

2603.10473 2026-03-12 cs.CL cs.AI

Aligning Large Language Models with Searcher Preferences

Wei Wu, Peilun Zhou, Liyi Chen, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, Hui Xiong

2603.10470 2026-03-12 cs.CV

Fighting Hallucinations with Counterfactuals: Diffusion-Guided Perturbations for LVLM Hallucination Suppression

Hamidreza Dastmalchi, Aijun An, Ali Cheraghian, Hamed Barzamini

Comments CVPR 2026