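arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.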

2601.23223 2026-02-04 cs.CL

Are you going to finish that? A Practical Study of the Partial Token Problem

Hao Xu, Alisa Liu, Jonathan Hayase, Yejin Choi, Noah A. Smith

2601.22975 2026-02-04 cs.AI

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Ximing Lu, David Acuna, Jaehun Jung, Jian Hu, Di Zhang, Shizhe Diao, Yunheng Zou, Shaokun Zhang, Brandon Cui, Mingjie Liu, Hyunwoo Kim, Prithviraj Ammanabrolu, Jan Kautz, Yi Dong, Yejin Choi

2601.22875 2026-02-04 cs.CL

From Labels to Facets: Building a Taxonomically Enriched Turkish Learner Corpus

Elif Sayar, Tolgahan Türker, Anna Golynskaia Knezhevich, Bihter Dereli, Ayşe Demirhas, Lionel Nicolas, Gülşen Eryiğit

Comments An error was identified in the analyses presented in Section 5.3, impacting the conclusions of the paper. The authors have therefore withdrawn the submission

2601.22522 2026-02-04 cs.CV

Can 3D point cloud data improve automated body condition score prediction in dairy cattle?

Zhou Tang, Jin Wang, Angelo De Castro, Yuxi Zhang, Victoria Bastos Primo, Ana Beatriz Montevecchio Bernardino, Gota Morota, Xu Wang, Ricardo C Chebel, Haipeng Yu

2601.22513 2026-02-04 cs.AI

Why Self-Rewarding Works: Theoretical Guarantees for Iterative Alignment of Language Models

Shi Fu, Yingjie Wang, Shengchao Hu, Peng Wang, Dacheng Tao

2601.22125 2026-02-04 cs.CV

Creative Image Generation with Diffusion Models

Kunpeng Song, Ahmed Elgammal

Comments Project page: https://creative-t2i.github.io

2601.21835 2026-02-04 cs.LG

Scalable Linearized Laplace Approximation via Surrogate Neural Kernel

Luis A. Ortega, Simón Rodríguez-Santana, Daniel Hernández-Lobato

Comments 6 pages, 1 table. Accepted at European Symposium on Artificial Neural Networks (ESANN 2026) as oral presentation

2601.21712 2026-02-04 cs.RO

CoFreeVLA: Collision-Free Dual-Arm Manipulation via Vision-Language-Action Model and Risk Estimation

Xuanran Zhai, Binkai Ou, Qiaojun Yu, Ce Hao, Yaohua Liu

2601.21602 2026-02-04 cs.RO

AIR-VLA: Vision-Language-Action Systems for Aerial Manipulation

Jianli Sun, Bin Tian, Qiyao Zhang, Chengxiang Li, Zihan Song, Zhiyong Cui, Yisheng Lv, Yonglin Tian

2601.21123 2026-02-04 cs.AI

CUA-Skill: Develop Skills for Computer Using Agent

Tianyi Chen, Yinheng Li, Michael Solodko, Sen Wang, Nan Jiang, Tingyuan Cui, Junheng Hao, Jongwoo Ko, Sara Abdali, Leon Xu, Suzhen Zheng, Hao Fan, Pashmina Cameron, Justin Wagle, Kazuhito Koishida

2601.20834 2026-02-04 cs.CL cs.LG

Linear representations in language models can change dramatically over a conversation

Andrew Kyle Lampinen, Yuxuan Li, Eghbal Hosseini, Sangnie Bhardwaj, Murray Shanahan

2601.20753 2026-02-04 cs.LG

GraphAllocBench: A Flexible Benchmark for Preference-Conditioned Multi-Objective Policy Learning

Zhiheng Jiang, Yunzhe Wang, Ryan Marr, Ellen Novoseller, Benjamin T. Files, Volkan Ustun

2601.20041 2026-02-04 cs.LG cs.AI

CiMRAG: CiM-Aware Domain-Adaptive and Noise-Resilient Retrieval-Augmented Generation for Edge-Based LLMs

Shih-Hsuan Chiu, Ming-Syan Chen

Comments Accepted by ICASSP 2026

2601.19411 2026-02-04 cs.RO cs.LG

Task-Centric Policy Optimization from Misaligned Motion Priors

Ziang Zheng, Kai Feng, Yi Nie, Shentao Qin

Comments Work requires further details and not complete yet

2601.19402 2026-02-04 cs.AI

PROTEUS: SLA-Aware Routing via Lagrangian RL for Multi-LLM Serving Systems

Amit Singh Bhatti, Vishal Vaddina, Dagnachew Birru

Comments Submitted to EuroMLSys26

2601.19395 2026-02-04 cs.LG

SEAFormer: A Spatial Proximity and Edge-Aware Transformer for Real-World Vehicle Routing Problems

Saeed Nasehi Basharzad, Farhana Choudhury, Egemen Tanin

Comments 26 pages

2601.19136 2026-02-04 cs.CV

TFFM: Topology-Aware Feature Fusion Module via Latent Graph Reasoning for Retinal Vessel Segmentation

Iftekhar Ahmed, Shakib Absar, Aftar Ahmad Sami, Shadman Sakib, Debojyoti Biswas, Seraj Al Mahmud Mostafa

Comments Accepted in WACV 2026 @ P2P-workshop as a full paper and selected for oral presentation

2601.18123 2026-02-04 cs.AI

Deadline-Aware, Energy-Efficient Control of Domestic Immersion Hot Water Heater

Muhammad Ibrahim Khan, Bivin Pradeep, James Brusey

Comments Accepted at AAAI 2026

2601.16540 2026-02-04 cs.SD cs.AI eess.AS

Do Models Hear Like Us? Probing the Representational Alignment of Audio LLMs and Naturalistic EEG

Haoyun Yang, Xin Xiao, Jiang Zhong, Yu Tian, Dong Xiaohua, Yu Mao, Hao Wu, Kaiwen Wei

2601.15540 2026-02-04 cs.LG cs.AI cs.CL physics.data-an

PRISM: Deriving a White-Box Transformer as a Signal-Noise Decomposition Operator via Maximum Coding Rate Reduction

Dongchen Huang

Comments 12 pages, 6 figures. Derives Transformer as a signal-noise decomposition operator via Maximizing Coding Rate Reduction. Identifies 'Attention Sink' as spectral resonance (Arnold Tongues) and proposes $π$-RoPE for dynamical stability

2601.15468 2026-02-04 cs.LG cs.DS stat.ML

Learning from Synthetic Data: Limitations of ERM

Kareem Amin, Alex Bie, Weiwei Kong, Umar Syed, Sergei Vassilvitskii

2601.14096 2026-02-04 cs.AI

Remapping and navigation of an embedding space via error minimization: a fundamental organizational principle of cognition in natural and artificial systems

Benedikt Hartl, Léo Pio-Lopez, Chris Fields, Michael Levin

Comments 41 pages, 5 figures

2601.11641 2026-02-04 cs.CV cs.LG

Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers

Yuxi Liu, Yipeng Hu, Zekun Zhang, Kunze Jiang, Kun Yuan

2601.10554 2026-02-04 cs.CV

DeepUrban: Interaction-Aware Trajectory Prediction and Planning for Automated Driving by Aerial Imagery

Constantin Selzer, Fabian B. Flohr

Journal ref 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), Edmonton, AB, Canada, 2024, pp. 221-227

2601.09241 2026-02-04 cs.CL

When to Trust: A Causality-Aware Calibration Framework for Accurate Knowledge Graph Retrieval-Augmented Generation

Jing Ren, Bowen Li, Ziqi Xu, Xikun Zhang, Haytham Fayek, Xiaodong Li

Comments Accepted by WWW 2026

2601.08662 2026-02-04 cs.AI quant-ph

From Classical to Quantum Reinforcement Learning and Its Applications in Quantum Control: A Beginner's Tutorial

Abhijit Sen, Sonali Panda, Mahima Arya, Subhajit Patra, Zizhan Zheng, Denys I. Bondar

2601.08248 2026-02-04 cs.RO

Spiking Neural-Invariant Kalman Fusion for Accurate Localization Using Low-Cost IMUs

Yaohua Liu, Qiao Xu, Binkai Ou

2601.07182 2026-02-04 cs.LG cs.AI

PRPO: Aligning Process Reward with Outcome Reward in Policy Optimization

Ruiyi Ding, Yongxuan Lv, Xianhui Meng, Jiahe Song, Chao Wang, Chen Jiang, Yuan Cheng

Comments 8 pages, 2 figures. Code is available at: https://github.com/SchumiDing/srpocode

2601.02754 2026-02-04 cs.LG cs.AI cs.IR

Q-Regularized Generative Auto-Bidding: From Suboptimal Trajectories to Optimal Policies

Mingming Zhang, Na Li, Zhuang Feiqing, Hongyang Zheng, Jiangbing Zhou, Wang Wuyin, Sheng-jie Sun, XiaoWei Chen, Junxiong Zhu, Lixin Zou, Chenliang Li

Comments Due to the company's compliance requirements, we would like to wait until the paper is officially published before making it publicly available on arXiv

2512.21956 2026-02-04 cs.CL

Self-attention vector output similarities reveal how machines pay attention

Tal Halevi, Yarden Tzach, Ronit D. Gross, Shalom Rosner, Ido Kanter

Comments 23 pages, 14 figures

Details

Abstract

The self-attention mechanism has significantly advanced the field of natural language processing, facilitating the development of advanced language-learning machines. Although its utility is widely acknowledged, the precise mechanisms of self-attention underlying its advanced learning and the quantitative characterization of this learning process remain open research questions. This study introduces a new approach for quantifying information processing within the self-attention mechanism. The analysis conducted on the BERT-12 architecture reveals that, in the final layers, the attention map focuses on sentence separator tokens, suggesting a practical approach to text segmentation based on semantic features. Based on the vector space emerging from the self-attention heads, a context similarity matrix measuring the scalar product between two token vectors was derived, revealing distinct similarities between different token vector pairs within each head and layer. The findings demonstrated that different attention heads within an attention block focused on different linguistic characteristics, such as identifying token repetitions in a given text or recognizing a token of common appearance in the text and its surrounding context. This specialization is also reflected in the distribution of distances between token vectors with high similarity as the architecture progresses. The initial attention layers exhibit substantially long-range similarities; however, as the layers progress, a more short-range similarity develops, culminating in a preference for attention heads to create strong similarities within the same sentence. Finally, the behavior of individual heads was analyzed by examining the uniqueness of their most common tokens in their high-similarity elements. Each head tends to focus on a unique token from the text and builds similarity pairs centered around it.
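To make the "context similarity matrix" idea concrete, here is a minimal sketch in plain Python. It assumes each token has one output vector per attention head and that similarity is the raw scalar product, as the abstract states. The toy vectors, the threshold, and the helper names `similarity_matrix` and `high_similarity_distances` are hypothetical illustrations; the abstract does not specify the paper's normalization, thresholding, or extraction procedure.

```python
# Hypothetical sketch of a per-head context similarity matrix.
# The vectors and threshold are made-up toy values, not data from the paper.


def similarity_matrix(vectors):
    """Return S where S[i][j] is the scalar product of token vectors i and j."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def high_similarity_distances(sim, threshold):
    """Position distances j - i for token pairs (i < j) with similarity above threshold."""
    n = len(sim)
    return [j - i for i in range(n) for j in range(i + 1, n) if sim[i][j] > threshold]


# One toy "head output": four tokens, each a 2-dimensional vector.
head_output = [
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, 0.0],
    [0.0, 3.0],
]

S = similarity_matrix(head_output)
distances = high_similarity_distances(S, threshold=1.5)
print(S)
print(distances)
```

Collecting such distances per head and per layer is one way to look at the long-range versus short-range behavior the abstract describes, under the assumptions stated above.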
