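arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.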

2602.06699 2026-02-09 quant-ph cs.CL cs.LG

Quantum Attention by Overlap Interference: Predicting Sequences from Classical and Many-Body Quantum Data

Alessio Pecilli, Matteo Rosati

Comments 4 + 1 pages, 2 figures

2602.06693 2026-02-09 cs.NI cs.CC cs.LG

Makespan Minimization in Split Learning: From Theory to Practice

Robert Ganian, Fionn Mc Inerney, Dimitra Tsigkari

Comments This paper will appear at IEEE INFOCOM 2026

2602.06654 2026-02-09 cs.IR cs.AI

Multimodal Generative Retrieval Model with Staged Pretraining for Food Delivery on Meituan

Boyu Chen, Tai Guo, Weiyu Cui, Yuqing Li, Xingxing Wang, Chuan Shi, Cheng Yang

2602.06639 2026-02-09 eess.SY cs.RO cs.SY

Efficient and Robust Modeling of Nonlinear Mechanical Systems

Davide Tebaldi, Roberto Zanasi

2602.06621 2026-02-09 stat.ML cs.LG

Infinite-dimensional generative diffusions via Doob's h-transform

Thorben Pieper-Sethmacher, Daniel Paulin

2602.06616 2026-02-09 cs.CR cs.LG

Confundo: Learning to Generate Robust Poison for Practical RAG Systems

Haoyang Hu, Zhejun Jiang, Yueming Lyu, Junyuan Zhang, Yi Liu, Ka-Ho Chow

2602.06599 2026-02-09 cs.MA cs.AI cs.LG

Sample-Efficient Policy Space Response Oracles with Joint Experience Best Response

Ariyan Bighashdel, Thiago D. Simão, Frans A. Oliehoek

Comments Accepted at the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)

2602.06593 2026-02-09 cs.SE cs.AI

AgentStepper: Interactive Debugging of Software Development Agents

Robert Hutter, Michael Pradel

Abstract

Software development agents powered by large language models (LLMs) have shown great promise in automating tasks like environment setup, issue solving, and program repair. Unfortunately, understanding and debugging such agents remain challenging due to their complex and dynamic nature. Developers must reason about trajectories of LLM queries, tool calls, and code modifications, but current techniques reveal little of this intermediate process in a comprehensible format. The key insight of this paper is that debugging software development agents shares many similarities with conventional debugging of software programs, yet requires a higher level of abstraction that raises the level from low-level implementation details to high-level agent actions. Drawing on this insight, we introduce AgentStepper, the first interactive debugger for LLM-based software engineering agents. AgentStepper enables developers to inspect, control, and interactively manipulate agent trajectories. AgentStepper represents trajectories as structured conversations among an LLM, the agent program, and tools. It supports breakpoints, stepwise execution, and live editing of prompts and tool invocations, while capturing and displaying intermediate repository-level code changes. Our evaluation applies AgentStepper to three state-of-the-art software development agents, ExecutionAgent, SWE-Agent, and RepairAgent, showing that integrating the approach into existing agents requires minor code changes (39-42 edited lines). Moreover, we report on a user study with twelve participants, indicating that AgentStepper improves the ability of participants to interpret trajectories (64% vs. 67% mean performance) and identify bugs in the agent's implementation (17% vs. 60% success rate), while reducing perceived workload (e.g., frustration reduced from 5.4/7.0 to 2.4/7.0) compared to conventional tools.

2602.06555 2026-02-09 cs.DC cs.LG

Reinforcement Learning-Based Dynamic Management of Structured Parallel Farm Skeletons on Serverless Platforms

Lanpei Li, Massimo Coppola, Malio Li, Valerio Besozzi, Jack Bell, Vincenzo Lomonaco

Comments Accepted at AHPC3 workshop, PDP 2026

2602.06553 2026-02-09 math.AG cs.LG

Evolving Ranking Functions for Canonical Blow-Ups in Positive Characteristic

Gergely Bérczi

Comments 41 pages

2602.06545 2026-02-09 stat.ML cs.LG

Operationalizing Stein's Method for Online Linear Optimization: CLT-Based Optimal Tradeoffs

Zhiyu Zhang, Aaditya Ramdas

2602.06534 2026-02-09 cs.CR cs.LG

AlertBERT: A noise-robust alert grouping framework for simultaneous cyber attacks

Lukas Karner, Max Landauer, Markus Wurzenberger, Florian Skopik

2602.06506 2026-02-09 cs.HC cs.CL cs.CY

Designing Computational Tools for Exploring Causal Relationships in Qualitative Data

Han Meng, Qiuyuan Lyu, Peinuan Qin, Yitian Yang, Renwen Zhang, Wen-Chieh Lin, Yi-Chieh Lee

Comments 19 pages, 5 figures, conditionally accepted by CHI26

2602.06476 2026-02-09 cs.MA cs.AI cs.LG

Prism: Spectral Parameter Sharing for Multi-Agent Reinforcement Learning

Kyungbeom Kim, Seungwon Oh, Kyung-Joong Kim

2602.06443 2026-02-09 cs.CR cs.AI

TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents

Yibing Liu, Chong Zhang, Zhongyi Han, Hansong Liu, Yong Wang, Yang Yu, Xiaoyan Wang, Yilong Yin

Comments 9 pages, 5 figures, 1 table

2602.06431 2026-02-09 cs.SI cs.AI cs.IR

A methodology for analyzing financial needs hierarchy from social discussions using LLM

Abhishek Jangra, Sachin Thukral, Arnab Chatterjee, Jayasree Raveendran

Comments 15 pages, 5 figures, 4 tables

2602.06395 2026-02-09 cs.CR cs.AI cs.LG

Empirical Analysis of Adversarial Robustness and Explainability Drift in Cybersecurity Classifiers

Mona Rajhans, Vishal Khawarey

Comments Accepted for publication in 18th ACM International Conference on Agents and Artificial Intelligence (ICAART 2026), Marbella, Spain

2602.06365 2026-02-09 eess.SY cs.LG cs.SY

Advances in Battery Energy Storage Management: Control and Economic Synergies

Venkata Rajesh Chundru, Shreshta Rajakumar Deshpande, Stanislav A Gankov

Comments Pre Print

2602.06350 2026-02-09 eess.IV cs.CV

AS-Mamba: Asymmetric Self-Guided Mamba Decoupled Iterative Network for Metal Artifact Reduction

Bowen Ning, Zekun Zhou, Xinyi Zhong, Zhongzhen Wang, HongXin Wu, HaiTao Wang, Liu Shi, Qiegen Liu

Comments 10 pages, 10 figures

2602.06345 2026-02-09 cs.CR cs.AI

Zero-Trust Runtime Verification for Agentic Payment Protocols: Mitigating Replay and Context-Binding Failures in AP2

Qianlong Lan, Anuj Kaul, Shaun Jones, Stephanie Westrum

2602.06336 2026-02-09 cs.CR cs.DC cs.LG

AdFL: In-Browser Federated Learning for Online Advertisement

Ahmad Alemari, Pritam Sen, Cristian Borcea

2602.06297 2026-02-09 stat.ML cs.LG

Time-uniform conformal and PAC prediction

Kayla E. Scharfstein, Arun Kumar Kuchibhotla

2602.05975 2026-02-09 cs.IR cs.CL

SAGE: Benchmarking and Improving Retrieval for Deep Research Agents

Tiansheng Hu, Yilun Zhao, Canyu Zhang, Arman Cohan, Chen Zhao

2602.05817 2026-02-09 cs.CR cs.LG cs.NI

Interpreting Manifolds and Graph Neural Embeddings from Internet of Things Traffic Flows

Enrique Feito-Casares, Francisco M. Melgarejo-Meseguer, Elena Casiraghi, Giorgio Valentini, José-Luis Rojo-Álvarez

2602.05754 2026-02-09 cs.DC cs.AI

TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism

Seonghye Cho, Jaemin Han, Hyunjin Kim, Euisoo Jung, Jae-Gil Lee

2602.05386 2026-02-09 cs.CR cs.AI

Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Zhenxiong Yu, Zhi Yang, Zhiheng Jin, Shuhe Wang, Heng Zhang, Yanlin Fei, Lingfeng Zeng, Fangqi Lou, Shuo Zhang, Tu Hu, Jingping Liu, Rongze Chen, Xingyu Zhu, Kunyi Wang, Chaofa Yuan, Xin Guo, Zhaowei Liu, Feipeng Zhang, Jie Huang, Huacan Wang, Ronghao Chen, Liwen Zhang

2602.05227 2026-02-09 stat.ML cs.LG cs.NA math.AP math.NA stat.ME

Radon–Wasserstein Gradient Flows for Interacting-Particle Sampling in High Dimensions

Elias Hess-Childs, Dejan Slepčev, Lantian Xu

Comments 49 pages, 7 figures; corrected Figure 4.4

2602.03868 2026-02-09 eess.AS cs.AI cs.CL cs.SD

Benchmarking Automatic Speech Recognition for Indian Languages in Agricultural Contexts

Chandrashekar M S, Vineet Singh, Lakshmi Pedapudi

Comments 9 pages, 6 figures

2602.02614 2026-02-09 cs.SE cs.AI cs.CR

Testing Storage-System Correctness: Challenges, Fuzzing Limitations, and AI-Augmented Opportunities

Ying Wang, Jiahui Chen, Dejun Jiang

2602.00169 2026-02-09 cond-mat.mtrl-sci cs.AI

Towards Agentic Intelligence for Materials Science

Huan Zhang, Yizhan Li, Wenhao Huang, Ziyu Hou, Yu Song, Xuye Liu, Farshid Effaty, Jinya Jiang, Sifan Wu, Qianggang Ding, Izumi Takahara, Leonard R. MacGillivray, Teruyasu Mizoguchi, Tianshu Yu, Lizi Liao, Yuyu Luo, Yu Rong, Jia Li, Ying Diao, Heng Ji, Bang Liu

Comments 81 pages

Abstract

The convergence of artificial intelligence and materials science presents a transformative opportunity, but achieving true acceleration in discovery requires moving beyond task-isolated, fine-tuned models toward agentic systems that plan, act, and learn across the full discovery loop. This survey advances a unique pipeline-centric view that spans from corpus curation and pretraining, through domain adaptation and instruction tuning, to goal-conditioned agents interfacing with simulation and experimental platforms. Unlike prior reviews, we treat the entire process as an end-to-end system to be optimized for tangible discovery outcomes rather than proxy benchmarks. This perspective allows us to trace how upstream design choices, such as data curation and training objectives, can be aligned with downstream experimental success through effective credit assignment. To bridge communities and establish a shared frame of reference, we first present an integrated lens that aligns terminology, evaluation, and workflow stages across AI and materials science. We then analyze the field through two focused lenses: from the AI perspective, the survey details LLM strengths in pattern recognition, predictive analytics, and natural language processing for literature mining, materials characterization, and property prediction; from the materials science perspective, it highlights applications in materials design, process optimization, and the acceleration of computational workflows via integration with external tools (e.g., DFT, robotic labs). Finally, we contrast passive, reactive approaches with agentic design, cataloging current contributions while motivating systems that pursue long-horizon goals with autonomy, memory, and tool use. This survey charts a practical roadmap towards autonomous, safety-aware LLM agents aimed at discovering novel and useful materials.
