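arXivDaily: a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.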

2601.04666 2026-04-10 cs.AI cs.CR

Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

Zhiyuan Chang, Mingyang Li, Yuekai Huang, Ziyou Jiang, Xiaojun Jia, Qian Xiong, Junjie Wang, Zhaoyang Li, Qing Wang

Comments 19 pages, 6 figures; accepted by ACL 2026 Findings

2601.03703 2026-04-10 cs.LG cs.AI

TreeAdv: Tree-Structured Advantage Redistribution for Group-Based RL

Lang Cao, Hui Ruan, Yongqian Li, Peng Chao, Wu Ning, Haonan Song, Renhong Chen, Yitong Li

2601.02670 2026-04-10 cs.CL

Break Me If You Can: Self-Jailbreaking of Aligned LLMs via Lexical Insertion Prompting

Devang Kulshreshtha, Hang Su, Haibo Jin, Chinmay Hegde, Haohan Wang

2512.24517 2026-04-10 cs.CL

Paragraph Segmentation Revisited: Towards a Standard Task for Structuring Speech

Fabian Retkowski, Alexander Waibel

Comments Accepted at LREC 2026

2512.23365 2026-04-10 cs.CV

SpatialMosaic: A Multiview VLM Dataset for Partial Visibility

Kanghee Lee, Injae Lee, Minseok Kwak, Jungi Hong, Kwonyoung Ryu, Jaesik Park

2512.22416 2026-04-10 cs.CL cs.IR

Hallucination Detection and Evaluation of Large Language Model

Chenggong Zhang, Haopeng Wang, Hexi Meng

2512.18662 2026-04-10 cs.RO cs.CV

Pseudo-Expert Regularized Offline RL for End-to-End Autonomous Driving in Photorealistic Closed-Loop Environments

Chihiro Noguchi, Takaki Yamamoto

Comments Accepted to CVPR Findings 2026

2512.17445 2026-04-10 cs.CV

LangDriveCTRL: Natural Language Controllable Driving Scene Editing with Multi-modal Agents

Yun He, Francesco Pittaluga, Ziyu Jiang, Matthias Zwicker, Manmohan Chandraker, Zaid Tasneem

Comments Project Page: https://yunhe24.github.io/langdrivectrl/

2512.15920 2026-04-10 cs.LG astro-ph.IM cs.NE physics.comp-ph physics.data-an

Introduction to Symbolic Regression in the Physical Sciences

Deaglan J. Bartlett, Harry Desmond, Pedro G. Ferreira, Gabriel Kronberger

Comments 8 pages, no figures; accepted in Royal Society Philosophical Transactions A special issue "Symbolic regression in the physical sciences"

2512.13749 2026-04-10 cs.LG cs.AI cs.CE cs.CY cs.SE

Comparative Evaluation of Embedding Representations for Financial News Sentiment Analysis

Joyjit Roy, Samaresh Kumar Singh

Comments 6 pages, 2 figures. Published in the 4th IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI 2026), IEEE

2512.13040 2026-04-10 cs.LG cs.CL

Understanding Structured Financial Data with LLMs: A Case Study on Fraud Detection

Xuwei Tan, Yao Ma, Xueru Zhang

Comments Accepted to ACL 2026 Main Conference

2512.10723 2026-04-10 cs.LG

Generalized Spherical Neural Operators: Green's Function Formulation

Hao Tang, Hao Chen, Chao Li

Comments ICLR 2026 (International Conference on Learning Representations)

2512.08410 2026-04-10 cs.CV

Towards Effective Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval

Tao Chen, Shaobo Ju, Qiong Wu, Chenxin Fang, Kun Zhang, Jun Peng, Hui Li, Yiyi Zhou, Rongrong Ji

2512.08296 2026-04-10 cs.AI

Towards a Science of Scaling Agent Systems

Yubin Kim, Ken Gu, Chanwoo Park, Chunjong Park, Samuel Schmidgall, A. Ali Heydari, Yao Yan, Zhihan Zhang, Yuchen Zhuang, Yun Liu, Mark Malhotra, Paul Pu Liang, Hae Won Park, Yuzhe Yang, Xuhai Xu, Yilun Du, Shwetak Patel, Tim Althoff, Daniel McDuff, Xin Liu

2512.06774 2026-04-10 cs.CV cs.AI

RDSplat: Robust Watermarking for 3D Gaussian Splatting Against 2D and 3D Diffusion Editing

Longjie Zhao, Ziming Hong, Zhenyang Ren, Runnan Chen, Mingming Gong, Tongliang Liu

2512.03745 2026-04-10 cs.CV

Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification

Jiaze Li, Yan Lu, Bin Liu, Guojun Yin, Mang Ye

2512.01236 2026-04-10 cs.CV

PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards

Shulei Wang, Longhui Wei, Xin He, Jianbo Ouyang, Hui Lu, Zhou Zhao, Qi Tian

Comments Accepted by CVPR 2026

2512.00170 2026-04-10 cs.LG stat.ML

We Still Don't Understand High-Dimensional Bayesian Optimization

Colin Doumont, Donney Fan, Natalie Maus, Jacob R. Gardner, Henry Moss, Geoff Pleiss

2511.18787 2026-04-10 cs.CV cs.LG

Understanding Task Transfer in Vision-Language Models

Bhuvan Sachdeva, Karan Uppal, Abhinav Java, Vineeth N. Balasubramanian

Comments CVPR 2026 (Oral)

2511.18244 2026-04-10 cs.AI cond-mat.mtrl-sci physics.ed-ph

Developing an AI Course for Synthetic Chemistry Students

Zhiling Zheng

Comments 17 pages, 3 figures

2511.15496 2026-04-10 cs.CV cs.AI

Evaluating Low-Light Image Enhancement Across Multiple Intensity Levels

Maria Pilligua, David Serrano-Lozano, Pai Peng, Ramon Baldrich, Michael S. Brown, Javier Vazquez-Corral

2511.12222 2026-04-10 cs.LG eess.SP

On the Interaction Between Chicken Swarm Rejuvenation and KLD-Adaptive Sampling in Particle Filters

Hangshuo Tian

2511.11435 2026-04-10 cs.CV cs.AI

The Persistence of Cultural Memory: Investigating Multimodal Iconicity in Diffusion Models

Maria-Teresa De Rosa Palmini, Eva Cetinic

Abstract

The ambiguity between generalization and memorization in TTI diffusion models becomes pronounced when prompts invoke culturally shared visual references, a phenomenon we term multimodal iconicity. These are instances in which images and texts reflect established cultural associations, such as when a title recalls a familiar artwork or film scene. Such cases challenge existing approaches to evaluating memorization, as they define a setting in which instance-level memorization and culturally grounded generalization are structurally intertwined. To address this challenge, we propose an evaluation framework to assess a model's ability to remain culturally grounded without relying on visual replication. Specifically, we introduce the Cultural Reference Transformation (CRT) metric, which separates two dimensions of model behavior: Recognition, whether a model evokes a reference, from Realization, how it depicts it through replication or reinterpretation. We evaluate five diffusion models on 767 Wikidata-derived cultural references, covering both still and moving imagery, and find differences in how they respond to multimodal iconicity: some show weaker recognition, while others rely more heavily on replication. To assess linguistic sensitivity, we conduct prompt perturbation experiments using synonym substitutions and literal image descriptions, finding that models often reproduce iconic visual structures even when textual cues are altered. Finally, we find that cultural reference recognition correlates not only with training data frequency, but also textual uniqueness, reference popularity, and creation date. Our findings show that the behavior of diffusion models in culturally iconic settings cannot be reduced to simple reproduction, but depends on how references are recognized and realized, advancing evaluation beyond simple text-image matching toward richer contextual understanding.

2511.09170 2026-04-10 cs.CV cs.RO

HOTFLoc++: End-to-End Hierarchical LiDAR Place Recognition, Re-Ranking, and 6-DoF Metric Localisation in Forests

Ethan Griffiths, Maryam Haghighat, Simon Denman, Clinton Fookes, Milad Ramezani

Comments 8 pages, 2 figures, Accepted for publication in IEEE RA-L (2026)

2511.07797 2026-04-10 cs.RO

Characterizing the Resilience and Sensitivity of Polyurethane Vision-Based Tactile Sensors

Benjamin Davis, Hannah Stuart

2511.03092 2026-04-10 cs.AI cs.AR cs.DC

SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators

Jonathan Li, Nasim Farahini, Evgenii Iuliugin, Magnus Vesterlund, Christian Häggström, Guangtao Wang, Shubhangi Upasani, Ayush Sachdeva, Rui Li, Faline Fu, Chen Wu, Ayesha Siddiqua, John Long, Tuowen Zhao, Matheen Musaddiq, Håkan Zeffer, Yun Du, Mingran Wang, Qinghua Li, Bo Li, Urmish Thakker, Raghu Prabhakar

2510.26241 2026-04-10 cs.CV cs.CL

Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision-Language Models

Shiho Matta, Lis Kanashiro Pereira, Peitao Han, Fei Cheng, Shigeru Kitazawa

Comments 12 pages

2510.20759 2026-04-10 cs.SD

Controllable Embedding Transformation for Mood-Guided Music Retrieval

Julia Wilkins, Jaehun Kim, Matthew E. P. Davies, Juan Pablo Bello, Matthew C. McCallum

Comments Preprint; under review

2510.17162 2026-04-10 cs.LG

ALPINE: Closed-Loop Adaptive Privacy Budget Allocation for Mobile Edge Crowdsensing

Guanjie Cheng, Siyang Liu, Xinkui Zhao, Yishan Chen, Junqin Huang, Linghe Kong, Shiguang Deng

Comments 12 pages, 12 figures, 6 tables. Submitted to The International Conference on Web Services (ICWS)

2510.15770 2026-04-10 cs.CV cs.LG

Mitigating Spurious Background Bias in Multimedia Recognition with Disentangled Concept Bottlenecks

Gaoxiang Huang, Songning Lai, Yutao Yue