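arXivDaily, a daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.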

2602.20558 2026-03-20 cs.AI cs.IR

From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation at Industry Scale

Yucheng Shi, Ying Li, Yu Wang, Yesu Feng, Arjun Rao, Rein Houthooft, Shradha Sehgal, Jin Wang, Hao Zhen, Ninghao Liu, Linas Baltrunas

Comments Work in progress

Abstract

Large language models (LLMs) are promising backbones for generative recommender systems, yet a key challenge remains underexplored: verbalization, i.e., converting structured user interaction logs into effective natural language inputs. Existing methods rely on rigid templates that simply concatenate fields, yielding suboptimal representations for recommendation. We propose a data-centric framework that learns verbalization for LLM-based recommendation. Using reinforcement learning, a verbalization agent transforms raw interaction histories into optimized textual contexts, with recommendation accuracy as the training signal. This agent learns to filter noise, incorporate relevant metadata, and reorganize information to improve downstream predictions. Experiments on a large-scale industrial streaming dataset from Netflix show that learned verbalization delivers up to 93% relative improvement in discovery item recommendation accuracy over template-based baselines. Further analysis reveals emergent strategies such as user interest summarization, noise removal, and syntax normalization, offering insights into effective context construction for LLM-based recommender systems.
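
The template-based baseline that the abstract contrasts against can be sketched roughly as follows. This is only an illustration: the field names (`title`, `genre`, `minutes_watched`) and the separator format are hypothetical, not taken from the paper, and `relative_improvement` simply encodes the usual definition of a relative gain over a baseline.

```python
def verbalize_template(history):
    """Rigid template: concatenate the same fields for every logged event.

    The field names used here are hypothetical placeholders.
    """
    lines = []
    for event in history:
        lines.append(
            f"title: {event['title']} | genre: {event['genre']} "
            f"| minutes_watched: {event['minutes_watched']}"
        )
    return "\n".join(lines)


def relative_improvement(baseline, new):
    """Relative gain of `new` over `baseline`, e.g. 0.5 means +50%."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Illustrative log entries (made up for this sketch).
history = [
    {"title": "Show A", "genre": "drama", "minutes_watched": 42},
    {"title": "Show B", "genre": "comedy", "minutes_watched": 3},
]
prompt_context = verbalize_template(history)
```

A learned verbalizer, as described in the abstract, would replace `verbalize_template` with a model that decides what to keep, drop, or summarize, with downstream recommendation accuracy as its reward.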

2602.16698 2026-03-20 cs.LG

Causality is Key for Interpretability Claims to Generalise

Shruti Joshi, Aaron Mueller, David Klindt, Wieland Brendel, Patrik Reizinger, Dhanya Sridhar

2602.11322 2026-03-20 cs.LG cs.AI cs.IR cs.NE

Predictive Associative Memory: Retrieval Beyond Similarity Through Temporal Co-occurrence

Jason Dury

Comments 20 pages, 6 figures, for associated Git: https://github.com/EridosAI/PAM-Benchmark

DOI: 10.5281/zenodo.18595537

Abstract

Current approaches to memory in neural systems rely on similarity-based retrieval: given a query, find the most representationally similar stored state. This assumption -- that useful memories are similar memories -- fails to capture a fundamental property of biological memory: association through temporal co-occurrence. We propose Predictive Associative Memory (PAM), an architecture in which a JEPA-style predictor, trained on temporal co-occurrence within a continuous experience stream, learns to navigate the associative structure of an embedding space. We introduce an Inward JEPA that operates over stored experience (predicting associatively reachable past states) as the complement to the standard Outward JEPA that operates over incoming sensory data (predicting future states). We evaluate PAM as an associative recall system -- testing faithfulness of recall for experienced associations -- rather than as a retrieval system evaluated on generalisation to unseen associations. On a synthetic benchmark, the predictor's top retrieval is a true temporal associate 97% of the time (Association Precision@1 = 0.970); it achieves cross-boundary Recall@20 = 0.421 where cosine similarity scores zero; and it separates experienced-together from never-experienced-together states with a discrimination AUC of 0.916 (cosine: 0.789). Even restricted to cross-room pairs where embedding similarity is uninformative, the predictor achieves AUC = 0.849 (cosine: 0.503, chance). A temporal shuffle control confirms the signal is genuine temporal co-occurrence structure, not embedding geometry: shuffling collapses cross-boundary recall by 90%, replicated across training seeds. All results are stable across seeds (SD < 0.006) and query selections (SD $\leq$ 0.012).
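
The retrieval metrics quoted in this abstract follow a familiar pattern, which can be sketched as below. These are the standard textbook definitions of Precision@1 and Recall@k over a ranked list; the paper's exact "Association Precision@1" and "cross-boundary Recall@20" may be defined with additional constraints, so the function names and the toy state identifiers here are assumptions.

```python
def precision_at_1(ranked, associates):
    """1.0 if the top-ranked item is a true associate, else 0.0."""
    return 1.0 if ranked and ranked[0] in associates else 0.0


def recall_at_k(ranked, associates, k):
    """Fraction of the true associates that appear in the top-k results."""
    if not associates:
        raise ValueError("associates must be non-empty")
    found = set(ranked[:k]) & set(associates)
    return len(found) / len(associates)


# Toy query: the ranked retrieval and the states that truly co-occurred with it.
ranked = ["s3", "s7", "s1", "s9"]
associates = {"s3", "s9"}

p1 = precision_at_1(ranked, associates)
r2 = recall_at_k(ranked, associates, k=2)
r4 = recall_at_k(ranked, associates, k=4)
```

Averaging these per-query values over all queries gives figures of the kind reported above. A ranking based purely on cosine similarity would be scored with the same functions, which is what makes the comparison in the abstract possible.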

2602.06450 2026-03-20 cs.CV

What Is Wrong with Synthetic Data for Scene Text Recognition? A Strong Synthetic Engine with Diverse Simulations and Self-Evolution

Xingsong Ye, Yongkun Du, JiaXin Zhang, Chen Li, Jing Lyu, Zhineng Chen

Comments Accepted by CVPR 2026

2602.06023 2026-03-20 cs.AI cs.RO

Developing a Discrete-Event Simulator of School Shooter Behavior from VR Data

Christopher A. McClurg, Alan R. Wagner

Comments Accepted for presentation at ANNSIM 2026. Camera-ready version. 13 pages, 4 figures, 4 tables

2601.21690 2026-03-20 cs.LG

A Unified Generalization Framework for Model Merging: Trade-offs, Non-Linearity, and Scaling Laws

Qinglun Li, Anke Tang, Miao Zhang, Mengzhu Wang, Quanjun Yin, Li Shen

2601.19529 2026-03-20 cs.RO

RhoMorph: Rhombus-shaped Deformable Modular Robots for Stable, Medium-Independent Reconfiguration Motion

Jie Gu, Yirui Sun, Zhihao Xia, Tin Lun Lam, Chunxu Tian, Dan Zhang

2601.18032 2026-03-20 cs.LG cond-mat.mtrl-sci

Multimodal Machine Learning for Soft High-k Elastomers under Data Scarcity

Brijesh FNU, Viet Thanh Duy Nguyen, Ashima Sharma, Md Harun Rashid Molla, Chengyi Xu, Truong-Son Hy

2601.15644 2026-03-20 cs.CV

SuperOcc: Toward Cohesive Temporal Modeling for Superquadric-based 3D Occupancy Prediction

Zichen Yu, Quanli Liu, Wei Wang, Liyong Zhang, Xiaoguang Zhao

Comments This work has been submitted to the IEEE for possible publication

2601.13751 2026-03-20 cs.CV cs.LG

Towards Onboard Continuous Change Detection for Floods

Daniel Kyselica, Jonáš Herec, Oliver Kutis, Rado Pitoňák

Comments 19 pages, 9 figures, accepted at GISTAM 2026

2601.13590 2026-03-20 cs.CL cs.AI

Vulnerability of LLMs' Stated Beliefs? LLMs Belief Resistance Check Through Strategic Persuasive Conversation Interventions

Fan Huang, Haewoon Kwak, Jisun An

Comments Updated new models and minor revisions

2601.09734 2026-03-20 cs.CL cs.AI

From Detection to Diagnosis: Advancing Hallucination Analysis with Automated Data Synthesis

Yanyi Liu, Qingwen Yang, Tiezheng Guo, Feiyu Qu, Jun Liu, Yingyou Wen

Comments Accepted at The 40th Annual AAAI Conference on Artificial Intelligence

DOI: 10.1609/aaai.v40i38.40495

Abstract

Hallucinations in Large Language Models (LLMs), defined as the generation of content inconsistent with facts or context, represent a core obstacle to their reliable deployment in critical domains. Current research primarily focuses on binary "detection" approaches that, while capable of identifying hallucinations, fail to provide interpretable and actionable feedback for model improvement, thus limiting practical utility. To address this limitation, a new research paradigm is proposed, shifting from "detection" to "diagnosis". The Hallucination Diagnosis Task is introduced, a task which requires models to not only detect hallucinations, but also perform error localization, causal explanation, and content correction. We develop the Hallucination Diagnosis Generator (HDG), an automated pipeline that systematically generates high-quality training samples with rich diagnostic metadata from raw corpora through multi-dimensional augmentation strategies including controlled fact fabrication and reasoning chain perturbation. Using HDG-generated data, we train HDM-4B-RL, a 4-billion-parameter hallucination diagnosis model, employing Group Relative Policy Optimization (GRPO) with a comprehensive reward function incorporating structural, accuracy, and localization signals. Experimental results demonstrate that our model surpasses previous state-of-the-art detection models on the HaluEval benchmark while achieving comparable performance to advanced general-purpose models. In comprehensive diagnosis tasks, HDM-4B-RL matches the capabilities of larger general models while maintaining a smaller size. This work validates the feasibility and value of hallucination diagnosis, providing an effective methodology for building more trustworthy and reliable generative AI systems.
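
The core normalization step in Group Relative Policy Optimization can be sketched as follows: rewards for several responses sampled from the same prompt are standardized within that group. The three-part reward mirrors the structural, accuracy, and localization signals named in the abstract, but the weighted-sum form and the weights are assumptions made for illustration, as is the use of the population standard deviation.

```python
from statistics import mean, pstdev


def combined_reward(structural, accuracy, localization, weights=(0.2, 0.5, 0.3)):
    """Weighted sum of three reward signals (weights are hypothetical)."""
    w_s, w_a, w_l = weights
    return w_s * structural + w_a * accuracy + w_l * localization


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three sampled responses to one prompt, scored on made-up signal values.
rewards = [
    combined_reward(1.0, 1.0, 1.0),
    combined_reward(1.0, 0.5, 0.0),
    combined_reward(0.0, 0.0, 0.0),
]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive positive advantages and those below receive negative ones, so no separate value network is needed to provide a baseline.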

2601.09658 2026-03-20 cs.CV

Image2Garment: Simulation-ready Garment Generation from a Single Image

Selim Emir Can, Jan Ackermann, Kiyohiro Nakayama, Ruofan Liu, Tong Wu, Yang Zheng, Hugo Bertiche, Menglei Chai, Thabo Beeler, Gordon Wetzstein

Comments Project Page: https://image2garment.github.io/

2601.06134 2026-03-20 cs.LG eess.SP q-bio.NC

DeeperBrain: A Neuro-Grounded EEG Foundation Model Towards Universal BCI

Jiquan Wang, Sha Zhao, Yangxuan Zhou, Yiming Kang, Shijian Li, Gang Pan

Comments Preprint

2512.24338 2026-03-20 cs.CV

The Mechanics of CNN Filtering with Rectification

Liam Frija-Altarac, Matthew Toews

2512.21276 2026-03-20 cs.CV

GriDiT: Factorized Grid-Based Diffusion for Efficient Long Image Sequence Generation

Snehal Singh Tomar, Alexandros Graikos, Arjun Krishna, Dimitris Samaras, Klaus Mueller

Comments Transactions on ML Research (TMLR) 2026

Abstract

Modern deep learning methods typically treat image sequences as large tensors of sequentially stacked frames. However, is this straightforward representation ideal given the current state-of-the-art (SoTA)? In this work, we address this question in the context of generative models and aim to devise a more effective way of modeling image sequence data. Observing the inefficiencies and bottlenecks of current SoTA image sequence generation methods, we showcase that rather than working with large tensors, we can improve the generation process by factorizing it into first generating the coarse sequence at low resolution and then refining the individual frames at high resolution. We train a generative model solely on grid images comprising subsampled frames. Yet, we learn to generate image sequences, using the strong self-attention mechanism of the Diffusion Transformer (DiT) to capture correlations between frames. In effect, our formulation extends a 2D image generator to operate as a low-resolution 3D image-sequence generator without introducing any architectural modifications. Subsequently, we super-resolve each frame individually to add the sequence-independent high-resolution details. This approach offers several advantages and can overcome key limitations of the SoTA in this domain. Compared to existing image sequence generation models, our method achieves superior synthesis quality and improved coherence across sequences. It also delivers high-fidelity generation of arbitrary-length sequences and increased efficiency in inference time and training data usage. Furthermore, our straightforward formulation enables our method to generalize effectively across diverse data domains, which typically require additional priors and supervision to model in a generative context. Our method consistently outperforms SoTA in quality and inference speed (at least twice-as-fast) across datasets.
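
One way to picture a "grid image comprising subsampled frames" is sketched below. It assumes that subsampling means keeping every `stride`-th pixel of each frame and that frames are tiled row-major into the grid; the abstract does not spell out either detail, so both choices and the function names are assumptions rather than the authors' procedure.

```python
def subsample(frame, stride):
    """Keep every `stride`-th pixel along both axes of a 2D frame (list of rows)."""
    return [row[::stride] for row in frame[::stride]]


def tile_grid(frames, cols):
    """Tile equally sized 2D frames row-major into a single grid image."""
    if not frames or len(frames) % cols != 0:
        raise ValueError("frame count must be a positive multiple of cols")
    height = len(frames[0])
    grid = []
    for start in range(0, len(frames), cols):
        band = frames[start:start + cols]  # frames sharing one grid row
        for y in range(height):
            grid.append([pixel for frame in band for pixel in frame[y]])
    return grid


# Four constant 4x4 frames; frame i is filled with the value i.
frames = [[[i] * 4 for _ in range(4)] for i in range(4)]
grid = tile_grid([subsample(f, 2) for f in frames], cols=2)
```

The resulting `grid` is an ordinary 2D image, which is why a 2D image generator can be trained on it without architectural changes; each tile would then be cut back out and super-resolved separately.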

2512.20651 2026-03-20 cs.AI

Memory Bear AI A Breakthrough from Memory to Cognition Toward Artificial General Intelligence

Deliang Wen, Ke Sun

2512.07400 2026-03-20 cs.LG cs.AI

Heads collapse, features stay: Why Replay needs big buffers

Giulia Lanzillotta, Damiano Meier, Thomas Hofmann

2512.06679 2026-03-20 cs.CL

CMV-Fuse: Cross Modal-View Fusion of AMR, Syntax, and Knowledge Representations for Aspect Based Sentiment Analysis

Smitha Muthya Sudheendra, Mani Deep Cherukuri, Jaideep Srivastava

2512.06179 2026-03-20 cs.CV

Cast and Attached Shadow Detection via Iterative Light and Geometry Reasoning

Shilin Hu, Jingyi Xu, Sagnik Das, Dimitris Samaras, Hieu Le

Comments Project page: https://shilin21.github.io/attached_detection/

2512.06174 2026-03-20 cs.CV

Embedding Physical Reasoning into Diffusion-Based Shadow Generation

Shilin Hu, Jingyi Xu, Akshat Dave, Dimitris Samaras, Hieu Le

Comments Project page: https://shilin21.github.io/physical_generation/

2512.00960 2026-03-20 cs.CV

Efficient and Scalable Monocular Human-Object Interaction Motion Reconstruction

Boran Wen, Ye Lu, Sirui Wang, Keyan Wan, Jiahong Zhou, Junxuan Liang, Xinpeng Liu, Bang Xiao, Ruiyang Liu, Yong-Lu Li

2511.22184 2026-03-20 cs.CV

Shoe Style-Invariant and Ground-Aware Learning for Dense Foot Contact Estimation

Daniel Sungho Jung, Kyoung Mu Lee

Comments Accepted at CVPR 2026. Project page: https://feco-release.github.io/

2511.20909 2026-03-20 cs.LG cs.AI cs.NE

Evolved Sample Weights for Bias Mitigation: Effectiveness Depends on the Fairness Objective

Anil K. Saini, Jose Guadalupe Hernandez, Emily F. Wong, Debanshi Misra, Tiffani J. Bright, Jason H. Moore

2511.20830 2026-03-20 cs.LG

Autoregressive Surrogate Modeling of the Solar Wind with Spherical Fourier Neural Operator

Reza Mansouri, Dustin Kempton, Pete Riley, Rafal Angryk

Comments IEEE Conference on Data Mining (ICDM 2025)

2511.20649 2026-03-20 cs.CV

Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

Hidir Yesiltepe, Tuna Han Salih Meral, Adil Kaan Akan, Kaan Oktay, Pinar Yanardag

Comments CVPR 2026 | Project Page: https://infinity-rope.github.io/

2511.20623 2026-03-20 cs.AI

Copyright Detection in Large Language Models: An Ethical Approach to Generative AI Development

David Szczecina, Senan Gaffori, Edmond Li

Comments 4 pages, 3 figures

2511.11052 2026-03-20 cs.RO

AdaptPNP: Integrating Prehensile and Non-Prehensile Skills for Adaptive Robotic Manipulation

Jinxuan Zhu, Chenrui Tie, Xinyi Cao, Yuran Wang, Jingxiang Guo, Zixuan Chen, Haonan Chen, Junting Chen, Yangyu Xiao, Ruihai Wu, Lin Shao

2511.10045 2026-03-20 cs.CL

Do Language Models Associate Sound with Meaning? A Multimodal Study of Sound Symbolism

Jinhong Jeong, Sunghyun Lee, Jaeyoung Lee, Seonah Han, Youngjae Yu

Comments 33 pages, 27 tables, 10 figures, accepted to AAAI 2026 (Oral)

2511.06678 2026-03-20 cs.CV cs.LG

Flexible Concept Bottleneck Model

Xingbo Du, Qiantong Dou, Lei Fan, Rui Zhang

Comments To appear in AAAI 2026