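arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.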

2603.09989 2026-03-12 cs.CL cs.AI

The System Hallucination Scale (SHS): A Minimal yet Effective Human-Centered Instrument for Evaluating Hallucination-Related Behavior in Large Language Models

Heimo Müller, Dominik Steiger, Markus Plass, Andreas Holzinger

Details

English abstract

We introduce the System Hallucination Scale (SHS), a lightweight and human-centered measurement instrument for assessing hallucination-related behavior in large language models (LLMs). Inspired by established psychometric tools such as the System Usability Scale (SUS) and the System Causability Scale (SCS), SHS enables rapid, interpretable, and domain-agnostic evaluation of factual unreliability, incoherence, misleading presentation, and responsiveness to user guidance in model-generated text. SHS is explicitly not an automatic hallucination detector or benchmark metric; instead, it captures how hallucination phenomena manifest from a user perspective under realistic interaction conditions. A real-world evaluation with 210 participants demonstrates high clarity, coherent response behavior, and construct validity, supported by statistical analysis including internal consistency (Cronbach's alpha = 0.87) and significant inter-dimension correlations (p < 0.001). Comparative analysis with SUS and SCS reveals complementary measurement properties, supporting SHS as a practical tool for comparative analysis, iterative system development, and deployment monitoring.
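
The abstract above reports internal consistency as Cronbach's alpha. As a reading aid, here is a minimal sketch of how that statistic is usually computed from a respondents-by-items score table. The function name `cronbach_alpha` and the toy ratings are illustrative assumptions; they are not the authors' code or data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of questionnaire scores.

    scores: list of respondents, each a list of k item scores.
    Uses the standard formula
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    with sample variances throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*scores)]
    # Variance of each respondent's summed score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 3 respondents answering 2 items (not from the paper).
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

Values near 1 mean the items move together across respondents. Perfectly parallel items, such as every respondent giving the same score to all items, yield exactly 1.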

2603.09988 2026-03-12 cs.CL cs.AI

Causally Grounded Mechanistic Interpretability for LLMs with Faithful Natural-Language Explanations

Ajay Pravin Mahale

Comments 8 pages, 7 figures, 4 tables. MSc thesis work conducted at Hochschule Trier (2026). Code will be released upon publication

2603.09987 2026-03-12 cs.CL cs.AI cs.LG

Evolving Demonstration Optimization for Chain-of-Thought Feature Transformation

Xinyuan Wang, Kunpeng Liu, Arun Vignesh Malarkkan, Yanjie Fu

2603.09985 2026-03-12 cs.CL cs.AI

The Dunning-Kruger Effect in Large Language Models: An Empirical Study of Confidence Calibration

Sudipta Ghosh, Mrityunjoy Panday

2603.09984 2026-03-12 cs.CL

An Efficient Hybrid Deep Learning Approach for Detecting Online Abusive Language

Vuong M. Ngo, Cach N. Dang, Kien V. Nguyen, Mark Roantree

Comments 10 pages, 7 figures

2603.09983 2026-03-12 cs.LG cs.AI cs.CL

MoE-SpAc: Efficient MoE Inference Based on Speculative Activation Utility in Heterogeneous Edge Scenarios

Shuhuai Li, Jianghao Lin, Dongdong Ge, Yinyu Ye

2603.09981 2026-03-12 cs.CL

Large Language Models and Book Summarization: Reading or Remembering, Which Is Better?

Tairan Fu, Javier Conde, Pedro Reviriego, Javier Coronado-Blázquez, Nina Melero, Elena Merino-Gómez

2603.09980 2026-03-12 cs.LG cs.AI cs.CL

Explainable LLM Unlearning Through Reasoning

Junfeng Liao, Qizhou Wang, Shanshan Ye, Xin Yu, Ling Chen, Zhen Fang

2603.09888 2026-03-12 cs.AI

LCA: Local Classifier Alignment for Continual Learning

Tung Tran, Danilo Vasconcellos Vargas, Khoat Than

Comments Accepted to the International Conference on Learning Representations (ICLR 2026)

2603.09827 2026-03-12 cs.CV cs.AI

MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

Kangsan Kim, Yanlai Yang, Suji Kim, Woongyeong Yeo, Youngwan Lee, Mengye Ren, Sung Ju Hwang

Comments Under review

2603.09771 2026-03-12 cs.CV cs.AI

Ego: Embedding-Guided Personalization of Vision-Language Models

Soroush Seifi, Simon Gardier, Vaggelis Dorovatas, Daniel Olmeda Reino, Rahaf Aljundi

Comments Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

2603.09741 2026-03-12 cs.CV

ENIGMA-360: An Ego-Exo Dataset for Human Behavior Understanding in Industrial Scenarios

Francesco Ragusa, Rosario Leonardi, Michele Mazzamuto, Daniele Di Mauro, Camillo Quattrocchi, Alessandro Passanisi, Irene D'Ambra, Antonino Furnari, Giovanni Maria Farinella

2603.09704 2026-03-12 cs.CL

Evaluation of LLMs in retrieving food and nutritional context for RAG systems

Maks Požarnik Vavken, Matevž Ogrinc, Tome Eftimov, Barbara Koroušić Seljak

Comments This is the preprint for our conference paper for IEEE International Conference on Big Data

2603.09689 2026-03-12 cs.CV cs.AI

AutoViVQA: A Large-Scale Automatically Constructed Dataset for Vietnamese Visual Question Answering

Nguyen Anh Tuong, Phan Ba Duc, Nguyen Trung Quoc, Tran Dac Thinh, Dang Duy Lan, Nguyen Quoc Thinh, Tung Le

2603.09688 2026-03-12 cs.CL

Fusing Semantic, Lexical, and Domain Perspectives for Recipe Similarity Estimation

Denica Kjorvezir, Danilo Najkov, Eva Valencič, Erika Jesenko, Barbara Koroušić Seljak, Tome Eftimov, Riste Stojanov

Comments Preprint version submitted to IEEE Big Data 2025

2603.09638 2026-03-12 cs.CL

Tracking Cancer Through Text: Longitudinal Extraction From Radiology Reports Using Open-Source Large Language Models

Luc Builtjes, Alessa Hering

Comments 6 pages, 2 figures

2603.09613 2026-03-12 cs.CV

A Saccade-inspired Approach to Image Classification using Vision Transformer Attention Maps

Matthis Dallain, Laurent Rodriguez, Laurent Udo Perrinet, Benoît Miramond

Comments 16 pages, 11 figures in the main paper + 3 pages, 6 in the appendix

2603.09488 2026-03-12 cs.CV

Streaming Autoregressive Video Generation via Diagonal Distillation

Jinxiu Liu, Xuanming Liu, Kangfu Mei, Yandong Wen, Ming-Hsuan Yang, Weiyang Liu

Comments ICLR 2026 (31 pages, 10 figures, project page: https://spherelab.ai/diagdistill/)

Details

English abstract

Large pretrained diffusion models have significantly enhanced the quality of generated videos, and yet their use in real-time streaming remains limited. Autoregressive models offer a natural framework for sequential frame synthesis but require heavy computation to achieve high fidelity. Diffusion distillation can compress these models into efficient few-step variants, but existing video distillation approaches largely adapt image-specific methods that neglect temporal dependencies. These techniques often excel in image generation but underperform in video synthesis, exhibiting reduced motion coherence, error accumulation over long sequences, and a latency-quality trade-off. We identify two factors that result in these limitations: insufficient utilization of temporal context during step reduction and implicit prediction of subsequent noise levels in next-chunk prediction (i.e., exposure bias). To address these issues, we propose Diagonal Distillation, which operates orthogonally to existing approaches and better exploits temporal information across both video chunks and denoising steps. Central to our approach is an asymmetric generation strategy: more steps early, fewer steps later. This design allows later chunks to inherit rich appearance information from thoroughly processed early chunks, while using partially denoised chunks as conditional inputs for subsequent synthesis. By aligning the implicit prediction of subsequent noise levels during chunk generation with the actual inference conditions, our approach mitigates error propagation and reduces oversaturation in long-range sequences. We further incorporate implicit optical flow modeling to preserve motion quality under strict step constraints. Our method generates a 5-second video in 2.61 seconds (up to 31 FPS), achieving a 277.3x speedup over the undistilled model.

2603.09412 2026-03-12 cs.LG

Reconstructing Movement from Sparse Samples: Enhanced Spatio-Temporal Matching Strategies for Low-Frequency Data

Ali Yousefian, Arianna Burzacchi, Simone Vantini

Comments 22 pages, 14 figures, 3 tables

2603.09378 2026-03-12 cs.LG cs.AI cs.RO

SPAARS: Safer RL Policy Alignment through Abstract Exploration and Refined Exploitation of Action Space

Swaminathan S K, Aritra Hazra

Comments 9 pages

2603.09141 2026-03-12 cs.CV

Agentic AI as a Network Control-Plane Intelligence Layer for Federated Learning over 6G

Loc X. Nguyen, Ji Su Yoon, Huy Q. Le, Yu Qiao, Avi Deb Raha, Eui-Nam Huh, Nguyen H. Tran, Zhu Han, Choong Seon Hong

2603.09137 2026-03-12 cs.CV

Transformer-Based Multi-Region Segmentation and Radiomic Analysis of HR-pQCT Imaging for Osteoporosis Classification

Mohseu Rashid Subah, Mohammed Abdul Gani Zilani, Thomas L. Nickolas, Matthew R. Allen, Stuart J. Warden, Rachel K. Surowiec

Details

English abstract

Osteoporosis is a skeletal disease typically diagnosed using dual-energy X-ray absorptiometry (DXA), which quantifies areal bone mineral density but overlooks bone microarchitecture and surrounding soft tissues. High-resolution peripheral quantitative computed tomography (HR-pQCT) enables three-dimensional microstructural imaging with minimal radiation. However, current analysis pipelines largely focus on mineralized bone compartments, leaving much of the acquired image data underutilized. We introduce a fully automated framework for binary osteoporosis classification using radiomics features extracted from anatomically segmented HR-pQCT images. To our knowledge, this work is the first to leverage a transformer-based segmentation architecture, i.e., the SegFormer, for fully automated multi-region HR-pQCT analysis. The SegFormer model simultaneously delineated the cortical and trabecular bone of the tibia and fibula along with surrounding soft tissues and achieved a mean F1 score of 95.36%. Soft tissues were further subdivided into skin, myotendinous, and adipose regions through post-processing. From each region, 939 radiomic features were extracted and dimensionally reduced to train six machine learning classifiers on an independent dataset comprising 20,496 images from 122 HR-pQCT scans. The best image level performance was achieved using myotendinous tissue features, yielding an accuracy of 80.08% and an area under the receiver operating characteristic curve (AUROC) of 0.85, outperforming bone-based models. At the patient level, replacing standard biological, DXA, and HR-pQCT parameters with soft tissue radiomics improved AUROC from 0.792 to 0.875. These findings demonstrate that automated, multi-region HR-pQCT segmentation enables the extraction of clinically informative signals beyond bone alone, highlighting the importance of integrated tissue assessment for osteoporosis detection.
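
The abstract above reports classifier quality as AUROC. As a reading aid, here is a minimal sketch of that metric computed directly from its pairwise definition: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half. The function name `auroc` and the toy labels and scores are illustrative assumptions, not the authors' pipeline or data.

```python
def auroc(labels, scores):
    """Area under the ROC curve from its pairwise-ranking definition.

    labels: list of 0/1 ground-truth classes.
    scores: list of classifier scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 2 negatives and 2 positives (not from the paper).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The quadratic pair loop is fine for small samples, though a rank-based implementation would be preferable at scale.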

2603.09109 2026-03-12 cs.CV cs.AI

VIVID-Med: LLM-Supervised Structured Pretraining for Deployable Medical ViTs

Xiyao Wang, Xiaoyu Tan, Yang Dai, Yuxuan Fu, Shuo Li, Xihe Qiu

Comments 10 pages, 4 figures

2603.08900 2026-03-12 cs.LG cs.AI

A New Modeling to Feature Selection Based on the Fuzzy Rough Set Theory in Normal and Optimistic States on Hybrid Information Systems

Mohammad Hossein Safarpour, Seyed Majid Alavi, Mohammad Izadikhah, Hossein Dibachi

Comments 18 pages, 14 figures, 9 tables. Published version available at International Journal of Engineering. This preprint is distributed under CC BY 4.0 license

Details

DOI: 10.5829/ije.2025.38.11b.15
Journal ref: International Journal of Engineering, Transactions B: Applications, Vol. 38, No. 11, pp. 2657-2674, November 2025

English abstract

Considering the high volume, wide variety, and rapid speed of data generation, investigating feature selection methods for big data presents various applications and advantages. By removing irrelevant and redundant features, feature selection reduces data dimensions, thereby facilitating optimal decision-making within decision systems. One of the key tools for feature selection in hybrid information systems is fuzzy rough set theory. However, this theory faces two significant challenges. First, obtaining fuzzy equivalence relations through intersection operations in high-dimensional spaces can be both time-consuming and memory-intensive. Second, this method may produce noisy data, complicating the feature selection process. The purpose and innovation of this paper are to address these issues. We propose a new feature selection model that calculates the combined distance between objects and then uses this information to derive the fuzzy equivalence relation. Rather than directly solving the feature selection problem, this approach reformulates it into an optimization problem that can be tackled using appropriate meta-heuristic algorithms. We have named this new approach FSbuHD. The FSbuHD model operates in two modes, normal and optimistic, based on the selection of one of the two introduced fuzzy equivalence relations. The model is then tested on standard datasets from the UCI repository and compared with other algorithms. The results of this research demonstrate that FSbuHD is one of the most efficient and effective methods for feature selection when compared to previous methods and algorithms.

2603.08823 2026-03-12 cs.SD cs.AI cs.CL

Fish Audio S2 Technical Report

Shijia Liao, Yuxuan Wang, Songting Liu, Yifan Cheng, Ruoyi Zhang, Tianyu Li, Shidong Li, Yisheng Zheng, Xingwei Liu, Qingzheng Wang, Zhizhuo Zhou, Jiahua Liu, Xin Chen, Dawei Han

2603.08717 2026-03-12 cs.LG cs.NI

Equitable Multi-Task Learning for AI-RANs

Panayiotis Raptis, Fatih Aslan, George Iosifidis

Comments 6 pages, 3 figures

2603.08391 2026-03-12 cs.CL

Adaptive Loops and Memory in Transformers: Think Harder or Know More?

Markus Frey, Behzad Shomali, Ali Hamza Bashir, David Berghaus, Joachim Koehler, Mehdi Ali

Comments Published at Latent & Implicit Thinking Workshop @ ICLR 2026

2603.08359 2026-03-12 cs.CL cs.AI eess.AS

Computational modeling of early language learning from acoustic speech and audiovisual input without linguistic priors

Okko Räsänen

2603.08224 2026-03-12 cs.CV

SAVE: Speech-Aware Video Representation Learning for Video-Text Retrieval

Ruixiang Zhao, Zhihao Xu, Bangxiang Lan, Zijie Xin, Jingyu Liu, Xirong Li

Comments Accepted to CVPR2026

2603.07571 2026-03-12 cs.CV cs.AI cs.LG

A Systematic Comparison of Training Objectives for Out-of-Distribution Detection in Image Classification

Furkan Genç, Onat Özdemir, Emre Akbaş