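arXivDaily: a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.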



2602.20157 2026-02-24 cs.CV

Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning

Zhongxiao Cong, Qitao Zhao, Minsik Jeon, Shubham Tulsiani

Comments CVPR 2026. Project website: https://flow3r-project.github.io/

2602.20152 2026-02-24 cs.LG cs.AI stat.ML

Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data

Zhenyao Ma, Yue Liang, Dongxu Li

Comments ICLR 2026

2602.20137 2026-02-24 cs.CV

Do Large Language Models Understand Data Visualization Rules?

Martin Sinnona, Valentin Bonas, Emmanuel Iarussi, Viviana Siless

2602.20132 2026-02-24 cs.LG

LAD: Learning Advantage Distribution for Reasoning

Wendi Li, Sharon Li

2602.20130 2026-02-24 cs.CL cs.AI

To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering

Zaifu Zhan, Min Zeng, Shuang Zhou, Yiran Song, Xiaoyi Chen, Yu Hou, Yifan Wu, Yang Ruan, Rui Zhang

2602.20126 2026-02-24 cs.LG cs.IT math.IT math.ST stat.ML stat.TH

Adaptation to Intrinsic Dependence in Diffusion Language Models

Yunxiao Zhao, Changxiao Cai

2602.20119 2026-02-24 cs.RO cs.AI cs.CV

NovaPlan: Zero-Shot Long-Horizon Manipulation via Closed-Loop Video Language Planning

Jiahui Fu, Junyu Nan, Lingfeng Sun, Hongyu Li, Jianing Qian, Jennifer L. Barry, Kris Kitani, George Konidaris

Comments 25 pages, 15 figures. Project webpage: https://nova-plan.github.io/

2602.20117 2026-02-24 cs.AI cs.LG

ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models

Andre He, Nathaniel Weir, Kaj Bostrom, Allen Nie, Darion Cassel, Sam Bayless, Huzefa Rangwala

2602.20114 2026-02-24 cs.CV cs.AI

Benchmarking Unlearning for Vision Transformers

Kairan Zhao, Iurie Luca, Peter Triantafillou

2602.20113 2026-02-24 cs.SD cs.AI

StyleStream: Real-Time Zero-Shot Voice Style Conversion

Yisi Liu, Nicholas Lee, Gopala Anumanchipalli

2602.20111 2026-02-24 cs.LG

Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds

Ezra Edelman, Surbhi Goel

2602.20104 2026-02-24 cs.AI cs.HC cs.LG

Align When They Want, Complement When They Need! Human-Centered Ensembles for Adaptive Human-AI Collaboration

Hasan Amin, Ming Yin, Rajiv Khanna

Comments AAAI 2026

2602.20100 2026-02-24 cs.CV cs.AI eess.IV

Transcending the Annotation Bottleneck: AI-Powered Discovery in Biology and Medicine

Soumick Chatterjee

Journal ref Artificial Intelligence for Biomedical Data, AIBIO 2025, CCIS 2696, pp 243-248, 2026

2602.20094 2026-02-24 cs.AI

CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching

Yuzhe Wang, Yaochen Zhu, Jundong Li

Comments 8 pages plus references, 3 figures, 3 tables. Under review

Abstract

As large language models (LLMs) witness increasing deployment in complex, high-stakes decision-making scenarios, it becomes imperative to ground their reasoning in causality rather than spurious correlations. However, strong performance on traditional reasoning benchmarks does not guarantee true causal reasoning ability of LLMs, as high accuracy may still arise from memorizing semantic patterns instead of analyzing the underlying true causal structures. To bridge this critical gap, we propose a new causal reasoning benchmark, CausalFlip, designed to encourage the development of new LLM paradigms or training algorithms that ground LLM reasoning in causality rather than semantic correlation. CausalFlip consists of causal judgment questions built over event triples that could form different confounder, chain, and collider relations. Based on this, for each event triple, we construct pairs of semantically similar questions that reuse the same events but yield opposite causal answers, where models that rely heavily on semantic matching are systematically driven toward incorrect predictions. To further probe models' reliance on semantic patterns, we introduce a noisy-prefix evaluation that prepends causally irrelevant text before intermediate causal reasoning steps without altering the underlying causal relations or the logic of the reasoning process. We evaluate LLMs under multiple training paradigms, including answer-only training, explicit Chain-of-Thought (CoT) supervision, and a proposed internalized causal reasoning approach that aims to mitigate explicit reliance on correlation in the reasoning process. Our results show that explicit CoT can still be misled by spurious semantic correlations, whereas internalizing reasoning steps yields substantially improved causal grounding, suggesting that it is promising to better elicit the latent causal reasoning capabilities of base LLMs.


2602.20084 2026-02-24 cs.CV

Do Large Language Models Understand Data Visualization Principles?

Martin Sinnona, Valentin Bonas, Viviana Siless, Emmanuel Iarussi

2602.20079 2026-02-24 cs.CV

SemanticNVS: Improving Semantic Scene Understanding in Generative Novel View Synthesis

Xinya Chen, Christopher Wewer, Jiahao Xie, Xinting Hu, Jan Eric Lenssen

2602.20068 2026-02-24 cs.CV cs.LG

The Invisible Gorilla Effect in Out-of-distribution Detection

Harry Anthony, Ziyun Liang, Hermione Warr, Konstantinos Kamnitsas

Comments Accepted at CVPR 2026

2602.20066 2026-02-24 cs.CV cs.AI

HeatPrompt: Zero-Shot Vision-Language Modeling of Urban Heat Demand from Satellite Images

Kundan Thota, Xuanhao Mu, Thorsten Schlachter, Veit Hagenmeyer

2602.20065 2026-02-24 cs.CL cs.AI

Multilingual Large Language Models do not comprehend all natural languages to equal degrees

Natalia Moskvina, Raquel Montero, Masaya Yoshida, Ferdy Hubers, Paolo Morosi, Walid Irhaymi, Jin Yan, Tamara Serrano, Elena Pagliarini, Fritz Günther, Evelina Leivada

Comments 36 pages, 3 figures, 2 tables, 4 supplementary tables

2602.20062 2026-02-24 cs.LG stat.ML

A Theory of How Pretraining Shapes Inductive Bias in Fine-Tuning

Nicolas Anguita, Francesco Locatello, Andrew M. Saxe, Marco Mondelli, Flavia Mancini, Samuel Lippl, Clementine Domine

2602.20059 2026-02-24 cs.AI

Interaction Theater: A case of LLM Agents Interacting at Scale

Sarath Shekkizhar, Adam Earle

2602.20057 2026-02-24 cs.RO cs.AI

AdaWorldPolicy: World-Model-Driven Diffusion Policy with Online Adaptive Learning for Robotic Manipulation

Ge Yuan, Qiyuan Qiao, Jing Zhang, Dong Xu

Comments Homepage: https://AdaWorldPolicy.github.io

2602.20055 2026-02-24 cs.RO cs.AI cs.CV

To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation

Apoorva Vashisth, Manav Kulshrestha, Pranav Bakshi, Damon Conover, Guillaume Sartoretti, Aniket Bera

2602.20053 2026-02-24 cs.CV

Decoupling Defense Strategies for Robust Image Watermarking

Jiahui Chen, Zehang Deng, Zeyu Zhang, Chaoyang Li, Lianchen Jia, Lifeng Sun

Comments CVPR 2026

2602.20052 2026-02-24 cs.CL

Entropy in Large Language Models

Marco Scharringhausen

Comments 7 pages, 2 figures, 3 tables

2602.20051 2026-02-24 cs.CV cs.AI

SEAL-pose: Enhancing 3D Human Pose Estimation via a Learned Loss for Structural Consistency

Yeonsung Kim, Junggeun Do, Seunguk Do, Sangmin Kim, Jaesik Park, Jay-Yoon Lee

Comments 17 pages

2602.20048 2026-02-24 cs.AI cs.SE

CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence

Tarakanath Paipuru

Comments 23 pages, 7 figures. Research study with 258 trials on SWE-bench-lite tasks. Code and data: https://github.com/tpaip607/research-codecompass

2602.20046 2026-02-24 cs.CV cs.LG

Closing the gap in multimodal medical representation alignment

Eleonora Grassucci, Giordano Cicchetti, Danilo Comminiello

Comments Accepted at MLSP2025

2602.20041 2026-02-24 cs.RO cs.CV

EEG-Driven Intention Decoding: Offline Deep Learning Benchmarking on a Robotic Rover

Ghadah Alosaimi, Maha Alsayyari, Yixin Sun, Stamos Katsigiannis, Amir Atapour-Abarghouei, Toby P. Breckon

2602.20040 2026-02-24 cs.CL cs.AI

AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization

Fahmida Liza Piya, Rahmatollah Beheshti