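arXivDaily daily academic digest: synchronized with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.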

2604.07102 2026-04-09 cs.CL cs.AI

The Impact of Steering Large Language Models with Persona Vectors in Educational Applications

Yongchao Wu, Aron Henriksson

Abstract

Activation-based steering can personalize large language models at inference time, but its effects in educational settings remain unclear. We study persona vectors for seven character traits in short-answer generation and automated scoring on the ASAP-SAS benchmark across three models spanning two architectures. Persona steering lowers answer quality overall, with much larger effects on open-ended English Language Arts (ELA) prompts than on factual science prompts; interpretive and argumentative tasks are up to 11x more sensitive. On the scoring side, we observe predictable valence-aligned calibration shifts: evil and impolite scorers grade more harshly, while good and optimistic scorers grade more leniently. ELA tasks are 2.5-3x more susceptible to scorer personalization than science tasks, and the Mixture-of-Experts model shows roughly 6x larger calibration shifts than the dense models. To our knowledge, this is the first study to systematically examine the effects of activation-steered persona traits in educational generation and scoring, and the results highlight the need for task-aware and architecture-aware calibration when deploying steered models in educational settings.

2604.07101 2026-04-09 cs.CV cs.AI cs.MM eess.IV

SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation

Qizhou Wang, Guansong Pang, Christopher Leckie

2604.07097 2026-04-09 cs.CV

Novel Anomaly Detection Scenarios and Evaluation Metrics to Address the Ambiguity in the Definition of Normal Samples

Reiji Saito, Satoshi Kamiya, Kazuhiro Hotta

Comments Accepted by CVPR 2026 Workshop

2604.07095 2026-04-09 cs.CL

Multilingual Embedding Probes Fail to Generalize Across Learner Corpora

Laurits Lyngbaek, Ross Deans Kristensen-McLachlan

2604.07084 2026-04-09 cs.RO cs.AI

Flow Motion Policy: Manipulator Motion Planning with Flow Matching Models

Davood Soleymanzadeh, Xiao Liang, Minghui Zheng

2604.07072 2026-04-09 cs.LG

Epistemic Robust Offline Reinforcement Learning

Abhilash Reddy Chenreddy, Erick Delage

2604.07067 2026-04-09 cs.CL

Is Cross-Lingual Transfer in Bilingual Models Human-Like? A Study with Overlapping Word Forms in Dutch and English

Iza Škrjanec, Irene Elisabeth Winther, Vera Demberg, Stefan L. Frank

2604.07066 2026-04-09 cs.CL

SemEval-2026 Task 3: Dimensional Aspect-Based Sentiment Analysis (DimABSA)

Liang-Chih Yu, Jonas Becker, Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Lung-Hao Lee, Ying-Lung Lin, Jin Wang, Jan Philip Wahle, Terry Ruas, Natalia Loukachevitch, Alexander Panchenko, Ilseyar Alimova, Lilian Wanzare, Nelson Odhiambo, Bela Gipp, Kai-Wei Chang, Saif M. Mohammad

2604.07059 2026-04-09 cs.LG

Production-Ready Automated ECU Calibration using Residual Reinforcement Learning

Andreas Kampmeier, Kevin Badalian, Lucas Koch, Sung-Yong Lee, Jakob Andert

Comments This manuscript has been submitted to SAE as a conference paper for the 2026 Stuttgart International Symposium on Automotive and Powertrain Technology

2604.07057 2026-04-09 cs.CL

IndoBERT-Sentiment: Context-Conditioned Sentiment Classification for Indonesian Text

Muhammad Apriandito Arya Saputra, Andry Alamsyah, Dian Puteri Ramadhani, Thomhert Suprapto Siadari, Hanif Fakhrurroja

Comments 8 pages, 5 tables, and 2 figures

2604.07038 2026-04-09 cs.RO q-bio.NC

Exploring the proprioceptive potential of joint receptors using a biomimetic robotic joint

Akihiro Miki, Shun Hasegawa, Sota Yuzaki, Yuta Sahara, Yoshimoto Ribayashi, Kento Kawaharazuka, Kei Okada

Comments 26 pages including supplementary materials (17 pages main text), 6 main figures and 7 supplementary figures. Published in Scientific Reports

2604.07036 2026-04-09 cs.CL cs.LG cs.MA

ReDAct: Uncertainty-Aware Deferral for LLM Agents

Dzianis Piatrashyn, Nikita Kotelevskii, Kirill Grishchenkov, Nikita Glazkov, Ivan Nasonov, Ilya Makarov, Timothy Baldwin, Preslav Nakov, Roman Vashurin, Maxim Panov

2604.07034 2026-04-09 cs.RO cs.AI cs.CV

KITE: Keyframe-Indexed Tokenized Evidence for VLM-Based Robot Failure Analysis

Mehdi Hosseinzadeh, King Hang Wong, Feras Dayoub

Comments ICRA 2026; Project page: https://m80hz.github.io/kite/

2604.07030 2026-04-09 cs.LG

MoE Routing Testbed: Studying Expert Specialization and Routing Behavior at Small Scale

Tobias Falke, Nicolas Anastassacos, Samson Tan, Chankrisna Richy Meas, Chandana Satya Prakash, Nitesh Sekhar, M Saiful Bari, Krishna Kompella, Gamaleldin F. Elsayed

2604.07027 2026-04-09 cs.LG

Learning to Query History: Nonstationary Classification via Learned Retrieval

Jimmy Gammell, Bishal Thapaliya, Yoon Jung, Riyasat Ohib, Bilel Fehri, Deepayan Chakrabarti

Comments Accepted to ICLR 2026 Workshop on Time Series in the Age of Large Models (TSALM). 12 pages, 6 figures

2604.07026 2026-04-09 cs.CV

Not all tokens contribute equally to diffusion learning

Guoqing Zhang, Lu Shi, Wanru Xu, Linna Zhang, Sen Wang, Fangfang Wang, Yigang Cen

Abstract

With the rapid development of conditional diffusion models, significant progress has been made in text-to-video generation. However, we observe that these models often neglect semantically important tokens during inference, leading to biased or incomplete generations under classifier-free guidance. We attribute this issue to two key factors: distributional bias caused by the long-tailed token frequency in training data, and spatial misalignment in cross-attention where semantically important tokens are overshadowed by less informative ones. To address these issues, we propose Distribution-Aware Rectification and Spatial Ensemble (DARE), a unified framework that improves semantic guidance in diffusion models from the perspectives of distributional debiasing and spatial consistency. First, we introduce Distribution-Rectified Classifier-Free Guidance (DR-CFG), which regularizes the training process by dynamically suppressing dominant tokens with low semantic density, encouraging the model to better capture underrepresented semantic cues and learn a more balanced conditional distribution. This design mitigates the risk of the model distribution overfitting to tokens with low semantic density. Second, we propose Spatial Representation Alignment (SRA), which adaptively reweights cross-attention maps according to token importance and enforces representation consistency, enabling semantically important tokens to exert stronger spatial guidance during generation. This mechanism effectively prevents low semantic-density tokens from dominating the attention allocation, thereby avoiding the dilution of the spatial and distributional guidance provided by high semantic-density tokens. Extensive experiments on multiple benchmark datasets demonstrate that DARE consistently improves generation fidelity and semantic alignment, achieving significant gains over existing approaches.

2604.07023 2026-04-09 cs.CL

MARS: Enabling Autoregressive Models Multi-Token Generation

Ziqi Jin, Lei Wang, Ziwei Luo, Aixin Sun

Comments 15 pages, 4 figures

2604.07019 2026-04-09 cs.LG cs.AI

ConceptTracer: Interactive Analysis of Concept Saliency and Selectivity in Neural Representations

Ricardo Knauer, Andre Beinrucker, Erik Rodner

Comments XAI 2026 Late-Breaking Work Track

2604.07017 2026-04-09 cs.AI

A-MBER: Affective Memory Benchmark for Emotion Recognition

Deliang Wen, Ke Sun, Yu Wang

Abstract

AI assistants that interact with users over time need to interpret the user's current emotional state in order to respond appropriately and personally. However, this capability remains insufficiently evaluated. Existing emotion datasets mainly assess local or instantaneous affect, while long-term memory benchmarks focus largely on factual recall, temporal consistency, or knowledge updating. As a result, current resources provide limited support for testing whether a model can use remembered interaction history to interpret a user's present affective state. We introduce A-MBER, an Affective Memory Benchmark for Emotion Recognition, to evaluate this capability. A-MBER focuses on present affective interpretation grounded in remembered multi-session interaction history. Given an interaction trajectory and a designated anchor turn, a model must infer the user's current affective state, identify historically relevant evidence, and justify its interpretation in a grounded way. The benchmark is constructed through a staged pipeline with explicit intermediate representations, including long-horizon planning, conversation generation, annotation, question construction, and final packaging. It supports judgment, retrieval, and explanation tasks, together with robustness settings such as modality degradation and insufficient-evidence conditions. Experiments compare local-context, long-context, retrieved-memory, structured-memory, and gold-evidence conditions within a unified framework. Results show that A-MBER is especially discriminative on the subsets it is designed to stress, including long-range implicit affect, high-dependency memory levels, trajectory-based reasoning, and adversarial settings. These findings suggest that memory supports affective interpretation not simply by providing more history, but by enabling more selective, grounded, and context-sensitive use of past interaction.

2604.07016 2026-04-09 cs.LG

Predictive Representations for Skill Transfer in Reinforcement Learning

Ruben Vereecken, Luke Dickens, Alessandra Russo

Comments Research conducted: September 2018 to June 2021. This manuscript represents the work as of June 2021

2604.07015 2026-04-09 cs.CL

Corpora deduplication or duplication in Natural Language Processing of few resourced languages ? A case of study: The Mexico's Nahuatl

Juan-José Guzman-Landa, Juan-Manuel Torres-Moreno, Graham Ranger, Miguel Figueroa-Saavedra, Martha-Lorena Avendaño-Garrido, Elvys Linhares-Pontes, Luis-Gil Moreno-Jiménez

Comments 8 pages, 1 figure, 1 table

2604.07012 2026-04-09 cs.CL

DTCRS: Dynamic Tree Construction for Recursive Summarization

Guanran Luo, Zhongquan Jian, Wentao Qiu, Meihong Wang, Qingqiang Wu

2604.07010 2026-04-09 cs.CV

Synthetic Dataset Generation for Partially Observed Indoor Objects

Jelle Vermandere, Maarten Bassier, Maarten Vergauwen

2604.07009 2026-04-09 cs.AI cs.LG

CAFP: A Post-Processing Framework for Group Fairness via Counterfactual Model Averaging

Irina Arévalo, Marcos Oliva

2604.07006 2026-04-09 cs.CL

Continuous Interpretive Steering for Scalar Diversity

Ye-eun Cho

2604.07000 2026-04-09 cs.CV

IQ-LUT: interpolated and quantized LUT for efficient image super-resolution

Yuxuan Zhang, Zhikai Dong, Xinning Chai, Xiangyun Zhou, Yi Xu, Zhengxue Cheng, Li Song

2604.06997 2026-04-09 cs.CL

ChunQiuTR: Time-Keyed Temporal Retrieval in Classical Chinese Annals

Yihao Wang, Zijian He, Jie Ren, Keze Wang

Comments 24 pages, 11 figures. To appear in Findings of ACL 2026

2604.06996 2026-04-09 cs.CL cs.AI

Self-Preference Bias in Rubric-Based Evaluation of Large Language Models

José Pombal, Ricardo Rei, André F. T. Martins

2604.06990 2026-04-09 cs.LG cs.AI

Stress Estimation in Elderly Oncology Patients Using Visual Wearable Representations and Multi-Instance Learning

Ioannis Kyprakis, Vasileios Skaramagkas, Georgia Karanasiou, Vasilis Bouratzis, Andri Papakonstantinou, Dimitar Stefanovski, Kalliopi Keramida, Aristofania Simatou, Ketti Mazzocco, Anastasia Constantinidou, Konstantinos Marias, Dimitrios I. Fotiadis, Manolis Tsiknakis

Comments 7 pages, 2 figures, under review for IEEE EMBC 2026

2604.06989 2026-04-09 cs.CV cs.AI

Generative Photomosaic with Structure-Aligned and Personalized Diffusion

Jaeyoung Chung, Hyunjin Son, Kyoung Mu Lee

Comments Project page: https://robot0321.github.io/GenerativePhotomosaic/index.html