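arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.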

2602.19041 2026-02-24 cs.LG

Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning

Jiahao Zhang, Lujing Zhang, Keltin Grimes, Zhuohao Yu, Gokul Swamy, Zhiwei Steven Wu

Comments 21 pages, 5 figures

2602.19038 2026-02-24 cs.RO cs.HC

A Checklist for Deploying Robots in Public: Articulating Tacit Knowledge in the HRI Community

Claire Liang, Franziska Babel, Hannah Pelikan, Sydney Thompson, Xiang Zhi Tan

2602.19033 2026-02-24 cs.LG cs.AI cs.CV

A Markovian View of Iterative-Feedback Loops in Image Generative Models: Neural Resonance and Model Collapse

Vibhas Kumar Vats, David J. Crandall, Samuel Goree

Comments A preprint -- Under review

2602.19027 2026-02-24 cs.LG cs.AI

Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning

Haoyu Yang, Haoxing Ren

Comments 7 pages, 4 figures, accepted by the 63rd Design Automation Conference

2602.19024 2026-02-24 cs.CV

Towards Calibrating Prompt Tuning of Vision-Language Models

Ashshak Sharifdeen, Fahad Shamshad, Muhammad Akhtar Munir, Abhishek Basu, Mohamed Insaf Ismithdeen, Jeyapriyan Jeyamohan, Chathurika Sewwandi Silva, Karthik Nandakumar, Muhammad Haris Khan

Comments Accepted to CVPR 2026

2602.19022 2026-02-24 cs.CV cs.AI

An interpretable framework using foundation models for fish sex identification

Zheng Miao, Tien-Chieh Hung

2602.19020 2026-02-24 cs.LG cs.AI cs.CL

Learning to Detect Language Model Training Data via Active Reconstruction

Junjie Oscar Yin, John X. Morris, Vitaly Shmatikov, Sewon Min, Hannaneh Hajishirzi

2602.19017 2026-02-24 cs.LG

Why ReLU? A Bit-Model Dichotomy for Deep Network Training

Ilan Doron-Arad, Elchanan Mossel

2602.19008 2026-02-24 cs.CL cs.LG

Capable but Unreliable: Canonical Path Deviation as a Causal Mechanism of Agent Failure in Long-Horizon Tasks

Wilson Y. Lee

Abstract

Why do language agents fail on tasks they are capable of solving? We argue that many such failures are reliability failures caused by stochastic drift from a task's latent solution structure, not capability failures. Every well-defined tool-use task imposes a canonical solution path (i.e., a convergent set of tool invocations shared across successful runs) and agent success depends critically on whether a trajectory stays within this path's operating envelope. We establish this causally using a natural experiment that holds model capability and task difficulty fixed by construction. We analyze trajectories from the Toolathlon benchmark: 22 frontier models each attempt 108 real-world tool-use tasks across 3 independent runs, yielding 515 model$\times$task units where the same model succeeds on some runs and fails on others due to LLM sampling stochasticity alone. Within these units, successful runs adhere significantly more closely to the canonical solution path than failed runs ($+$0.060 Jaccard, $p<0.0001$, $n=488$ units, 95% CI [+0.043, +0.077]). This result survives six robustness checks including cross-model-family leave-one-out validation. Critically, the causal mechanism is gradual and self-reinforcing: the adherence gap is statistically indistinguishable from zero through the first 50% of the trajectory, ruling out early-branching selection bias, and each off-canonical tool call raises the probability that the next call is also off-canonical by 22.7 percentage points ($\hat{\beta}=+0.227$, $p<0.0001$), more than doubling the baseline rate. These findings imply that agent reliability cannot be improved by capability scaling alone, but offer a highly actionable intervention: a simple monitor that restarts the bottom tercile of runs based on mid-trajectory canonical adherence lifts success rates by $+$8.8 percentage points among intervened runs.
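The adherence measure and the restart monitor mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's code. The function names are hypothetical, and two things are assumptions made here for illustration: the canonical set is taken to be the tools that appear in every successful run, and a run's adherence is taken to be the Jaccard similarity between the tools it has called so far and that set.

```python
from typing import Iterable, Sequence


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def canonical_tools(successful_runs: Sequence[Sequence[str]]) -> set[str]:
    """Assumed definition: the tools that appear in every successful run."""
    if not successful_runs:
        return set()
    return set.intersection(*(set(run) for run in successful_runs))


def flag_bottom_tercile(prefixes: dict[str, Sequence[str]],
                        canonical: set[str]) -> list[str]:
    """Return ids of runs whose adherence on the given prefix is in the lowest third."""
    scores = {run_id: jaccard(calls, canonical) for run_id, calls in prefixes.items()}
    ranked = sorted(scores, key=scores.get)  # lowest adherence first
    return ranked[: len(ranked) // 3]


if __name__ == "__main__":
    # Hypothetical tool names, purely for illustration.
    canonical = canonical_tools([
        ["search", "read", "write"],
        ["search", "write", "email"],
    ])
    mid_trajectory = {
        "a": ["search", "write"],
        "b": ["search", "browse"],
        "c": ["browse", "shell"],
    }
    print(sorted(canonical))
    print(flag_bottom_tercile(mid_trajectory, canonical))
```

In this sketch the monitor only ranks runs against each other. How the trajectory prefix is cut at the midpoint and how flagged runs are restarted are left out, since the abstract does not specify them.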

2602.19006 2026-02-24 cs.AI quant-ph

Evaluating Large Language Models on Quantum Mechanics: A Comparative Study Across Diverse Models and Tasks

S. K. Rithvik

2602.19005 2026-02-24 cs.CV cs.LG

GUIDE-US: Grade-Informed Unpaired Distillation of Encoder Knowledge from Histopathology to Micro-UltraSound

Emma Willis, Tarek Elghareb, Paul F. R. Wilson, Minh Nguyen Nhat To, Mohammad Mahdi Abootorabi, Amoon Jamzad, Brian Wodlinger, Parvin Mousavi, Purang Abolmaesumi

Comments Accepted to IPCAI 2026

2602.19004 2026-02-24 cs.CV

MoBind: Motion Binding for Fine-Grained IMU-Video Pose Alignment

Duc Duy Nguyen, Tat-Jun Chin, Minh Hoai

Comments 8 pages, 6 tables, 7 figures, accepted to CVPR26

2602.19001 2026-02-24 cs.CV

A Benchmark and Knowledge-Grounded Framework for Advanced Multimodal Personalization Study

Xia Hu, Honglei Zhuang, Brian Potetz, Alireza Fathi, Bo Hu, Babak Samari, Howard Zhou

2602.18998 2026-02-24 cs.AI cs.CL

Benchmark Test-Time Scaling of General LLM Agents

Xiaochuan Li, Ryan Ming, Pranav Setlur, Abhijay Paladugu, Andy Tang, Hao Kang, Shuai Shao, Rong Jin, Chenyan Xiong

2602.18991 2026-02-24 cs.RO

FruitTouch: A Perceptive Gripper for Gentle and Scalable Fruit Harvesting

Ruohan Zhang, Mohammad Amin Mirzaee, Wenzhen Yuan

Comments 8 pages, 7 figures

2602.18986 2026-02-24 cs.AI

Quantifying Automation Risk in High-Automation AI Systems: A Bayesian Framework for Failure Propagation and Optimal Oversight

Vishal Srivastava, Tanmay Sah

2602.18985 2026-02-24 cs.AI

InfEngine: A Self-Verifying and Self-Optimizing Intelligent Engine for Infrared Radiation Computing

Kun Ding, Jian Xu, Ying Wang, Peipei Yang, Shiming Xiang

Comments 40 pages

2602.18981 2026-02-24 cs.AI

How Far Can We Go with Pixels Alone? A Pilot Study on Screen-Only Navigation in Commercial 3D ARPGs

Kaijie Xu, Mustafa Bugti, Clark Verbrugge

2602.18977 2026-02-24 cs.CV

Frame2Freq: Spectral Adapters for Fine-Grained Video Understanding

Thinesh Thiyakesan Ponbagavathi, Constantin Seibold, Alina Roitberg

Comments Accepted to CVPR 2026 (Main Track)

2602.18976 2026-02-24 cs.RO

Bumper Drone: Elastic Morphology Design for Aerial Physical Interaction

Pongporn Supa, Alex Dunnett, Feng Xiao, Rui Wu, Mirko Kovac, Basaran Bahadir Kocer

Comments Accepted to the 9th IEEE-RAS International Conference on Soft Robotics (RoboSoft) 2026

2602.18971 2026-02-24 cs.AI

When Do LLM Preferences Predict Downstream Behavior?

Katarina Slama, Alexandra Souly, Dishank Bansal, Henry Davidson, Christopher Summerfield, Lennart Luettgau

Comments 31 pages, 16 figures

2602.18967 2026-02-24 cs.RO

TactEx: An Explainable Multimodal Robotic Interaction Framework for Human-Like Touch and Hardness Estimation

Felix Verstraete, Lan Wei, Wen Fan, Dandan Zhang

Comments Accepted by 2026 ICRA

2602.18966 2026-02-24 cs.CL

Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation

Yonathan Ron, Shiri Gilboa, Tammuz Dubnov

2602.18965 2026-02-24 cs.CV eess.IV

Face Presentation Attack Detection via Content-Adaptive Spatial Operators

Shujaat Khan

Comments 14 Pages, 8 Figures

2602.18964 2026-02-24 cs.CL

Yor-Sarc: A gold-standard dataset for sarcasm detection in a low-resource African language

Toheeb Aduramomi Jimoh, Tabea De Wille, Nikola S. Nikolov

2602.18961 2026-02-24 cs.CV cs.SY eess.IV eess.SY

Depth-Enhanced YOLO-SAM2 Detection for Reliable Ballast Insufficiency Identification

Shiyu Liu, Dylan Lester, Husnu Narman, Ammar Alzarrad, Pingping Zhu

Comments Submitted to the IEEE International Symposium on Robotic and Sensors Environments (ROSE) 2026

2602.18960 2026-02-24 cs.AI cs.NE q-bio.NC

Modularity is the Bedrock of Natural and Artificial Intelligence

Alessandro Salatiello

Journal ref ICLR 2025 - Second Workshop on Representational Alignment (Re-Align) https://iclr.cc/virtual/2025/36838

2602.18959 2026-02-24 cs.CV

YOLOv10-Based Multi-Task Framework for Hand Localization and Laterality Classification in Surgical Videos

Kedi Sun, Le Zhang

2602.18951 2026-02-24 cs.RO

Temporal-Logic-Aware Frontier-Based Exploration

Azizollah Taheri, Derya Aksaray