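arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.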



2602.10295 2026-02-12 cs.HC cs.AI cs.IR

ECHO: An Open Research Platform for Evaluation of Chat, Human Behavior, and Outcomes

Jiqun Liu, Nischal Dinesh, Ran Yu

2602.10253 2026-02-12 cs.DS cs.AI

The Complexity of Bayesian Network Learning: Revisiting the Superstructure

Robert Ganian, Viktoriia Korchemna

Comments A preliminary version of this article appeared in the proceedings of NeurIPS 2021

2602.10233 2026-02-12 cs.NE cs.AI math.CA math.MG math.OC

ImprovEvolve: Ask AlphaEvolve to Improve the Input Solution and Then Improvise

Alexey Kravatskiy, Valentin Khrulkov, Ivan Oseledets

Comments 18 pages, 23 figures, submitted to KDD '26

2602.10225 2026-02-12 quant-ph cs.AI cs.IT eess.SP math.IT

Quantum Integrated Sensing and Computation with Indefinite Causal Order

Ivana Nikoloska

Comments Submitted for publication

2602.10176 2026-02-12 stat.ML cs.LG

Dissecting Performative Prediction: A Comprehensive Survey

Thomas Kehrenberg, Javier Sanguino, Jose A. Lozano, Novi Quadrianto

2602.10171 2026-02-12 cs.SE cs.AI

EvoCodeBench: A Human-Performance Benchmark for Self-Evolving LLM-Driven Coding Systems

Wentao Zhang, Jianfeng Wang, Liheng Liang, Yilei Zhao, HaiBin Wen, Zhe Zhao

Abstract

As large language models (LLMs) continue to advance in programming tasks, LLM-driven coding systems have evolved from one-shot code generation into complex systems capable of iterative improvement during inference. However, existing code benchmarks primarily emphasize static correctness and implicitly assume fixed model capability during inference. As a result, they do not capture inference-time self-evolution, such as whether accuracy and efficiency improve as an agent iteratively refines its solutions. They also provide limited accounting of resource costs and rarely calibrate model performance against that of human programmers. Moreover, many benchmarks are dominated by high-resource languages, leaving cross-language robustness and long-tail language stability underexplored. Therefore, we present EvoCodeBench, a benchmark for evaluating self-evolving LLM-driven coding systems across programming languages with direct comparison to human performance. EvoCodeBench tracks performance dynamics, measuring solution correctness alongside efficiency metrics such as solving time, memory consumption, and improvements in algorithmic design over repeated problem-solving attempts. To ground evaluation in a human-centered reference frame, we directly compare model performance with that of human programmers on the same tasks, enabling relative performance assessment within the human ability distribution. Furthermore, EvoCodeBench supports multiple programming languages, enabling systematic cross-language and long-tail stability analyses under a unified protocol. Our results demonstrate that self-evolving systems exhibit measurable gains in efficiency over time, and that human-relative and multi-language analyses provide insights unavailable through accuracy alone. EvoCodeBench establishes a foundation for evaluating coding intelligence in evolving LLM-driven systems.
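One way to picture "relative performance assessment within the human ability distribution" is a percentile computed per attempt. The sketch below is only an illustration under stated assumptions: the function name `percentile_beaten`, the choice of solving time as the metric, and all numbers are hypothetical and are not taken from EvoCodeBench.

```python
from bisect import bisect_right


def percentile_beaten(model_value: float, human_values: list[float]) -> float:
    """Percentage of human results strictly worse (larger) than the model's.

    Assumes a lower-is-better metric such as solving time or memory use.
    This is an illustrative definition, not the benchmark's actual metric.
    """
    if not human_values:
        raise ValueError("need at least one human reference value")
    ordered = sorted(human_values)
    # bisect_right counts the human values that are <= model_value,
    # so the remainder are the humans the model strictly beats.
    beaten = len(ordered) - bisect_right(ordered, model_value)
    return 100.0 * beaten / len(ordered)


# Made-up data: human solving times (seconds) for one task, and the
# model's solving time across three successive refinement attempts.
human_times = [10.0, 20.0, 30.0, 40.0]
attempt_times = [45.0, 25.0, 5.0]

# One percentile per attempt shows how the model moves through the
# human distribution as it iterates.
trajectory = [percentile_beaten(t, human_times) for t in attempt_times]
```

Plotting such a trajectory per task and per language would give a human-relative view of improvement over attempts, which is the kind of dynamic the abstract says accuracy alone does not capture.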


2602.10167 2026-02-12 eess.IV cs.AI

Anatomy-Preserving Latent Diffusion for Generation of Brain Segmentation Masks with Ischemic Infarct

Lucia Borrego, Vajira Thambawita, Marco Ciuffreda, Ines del Val, Alejandro Dominguez, Josep Munuera

2602.10166 2026-02-12 cs.CR cs.SD eess.AS

MerkleSpeech: Public-Key Verifiable, Chunk-Localised Speech Provenance via Perceptual Fingerprints and Merkle Commitments

Tatsunori Ono

Comments 16 pages, 4 figures, 3 tables

Abstract

Speech provenance goes beyond detecting whether a watermark is present. Real workflows involve splicing, quoting, trimming, and platform-level transforms that may preserve some regions while altering others. Neural watermarking systems have made strides in robustness and localised detection, but most deployments produce outputs with no third-party verifiable cryptographic proof tying a time segment to an issuer-signed original. Provenance standards like C2PA adopt signed manifests and Merkle-based fragment validation, yet their bindings target encoded assets and break under re-encoding or routine processing. We propose MerkleSpeech, a system for public-key verifiable, chunk-localised speech provenance offering two tiers of assurance. The first, a robust watermark attribution layer (WM-only), survives common distribution transforms and answers "was this chunk issued by a known party?". The second, a strict cryptographic integrity layer (MSv1), verifies Merkle inclusion of the chunk's fingerprint under an issuer signature. The system computes perceptual fingerprints over short speech chunks, commits them in a Merkle tree whose root is signed with an issuer key, and embeds a compact in-band watermark payload carrying a random content identifier and chunk metadata sufficient to retrieve Merkle inclusion proofs from a repository. Once the payload is extracted, all subsequent verification steps (signature check, fingerprint recomputation, Merkle inclusion) use only public information. The result is a splice-aware timeline indicating which regions pass each tier and why any given region fails. We describe the protocol, provide pseudocode, and present experiments targeting very low false positive rates under resampling, bandpass filtering, and additive noise, informed by recent audits identifying neural codecs as a major stressor for post-hoc audio watermarks.
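The Merkle inclusion step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's protocol: the fingerprints are stand-in byte strings rather than perceptual fingerprints, the issuer signature over the root is omitted, and the helper names, the SHA-256 choice, the prefix bytes, and the duplicate-last-node padding rule are all assumptions made for the example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(fingerprint: bytes) -> bytes:
    # Distinct prefixes keep leaf hashes and interior hashes from colliding.
    return _h(b"\x00" + fingerprint)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _padded(level: list[bytes]) -> list[bytes]:
    # Assumed padding rule: duplicate the last node of an odd-sized level.
    return level + [level[-1]] if len(level) % 2 == 1 else level


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from the leaf hashes up to the single root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = _padded(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged if the sibling is on the left."""
    proof = []
    for level in levels[:-1]:
        cur = _padded(level)
        sibling = index ^ 1  # the other node in the same pair
        proof.append((cur[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(fingerprint: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a chunk fingerprint and its proof, then compare."""
    acc = leaf_hash(fingerprint)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


# Stand-in fingerprints for five speech chunks (hypothetical data).
fingerprints = [f"chunk-{i}".encode() for i in range(5)]
levels = build_levels([leaf_hash(fp) for fp in fingerprints])
root = levels[-1][0]  # in the described system, the issuer would sign this value
```

A verifier holding only the root, a chunk's recomputed fingerprint, and the proof fetched from the repository can call `verify_inclusion` without any secret material, which matches the abstract's point that verification after payload extraction uses only public information.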


2602.10161 2026-02-12 cs.CR cs.AI cs.CL

Omni-Safety under Cross-Modality Conflict: Vulnerabilities, Dynamics Mechanisms and Efficient Alignment

Kun Wang, Zherui Li, Zhenhong Zhou, Yitong Zhang, Yan Mi, Kun Yang, Yiming Zhang, Junhao Dong, Zhongxiang Sun, Qiankun Li, Yang Liu

2602.10158 2026-02-12 physics.chem-ph cs.AI

NMRTrans: Structure Elucidation from Experimental NMR Spectra via Set Transformers

Liujia Yang, Zhuo Yang, Jiaqing Xie, Yubin Wang, Ben Gao, Tianfan Fu, Xingjian Wei, Jiaxing Sun, Jiang Wu, Conghui He, Yuqiang Li, Qinying Gu

2602.10157 2026-02-12 cs.CR cs.AI cs.NI

MalMoE: Mixture-of-Experts Enhanced Encrypted Malicious Traffic Detection Under Graph Drift

Yunpeng Tan, Qingyang Li, Mingxin Yang, Yannan Hu, Lei Zhang, Xinggong Zhang

Comments 10 pages, 9 figures, accepted by IEEE INFOCOM 2026

2602.10156 2026-02-12 q-bio.GN cs.LG q-bio.CB

STRAND: Sequence-Conditioned Transport for Single-Cell Perturbations

Boyang Fu, George Dasoulas, Sameer Gabbita, Xiang Lin, Shanghua Gao, Xiaorui Su, Soumya Ghosh, Marinka Zitnik

Comments 8 pages for main draft, 6 main figures

2602.10155 2026-02-12 eess.IV cs.CV

A Systematic Review on Data-Driven Brain Deformation Modeling for Image-Guided Neurosurgery

Tiago Assis, Colin P. Galvin, Joshua P. Castillo, Nazim Haouchine, Marta Kersten-Oertel, Zeyu Gao, Mireia Crispin-Ortuzar, Stephen J. Price, Thomas Santarius, Yangming Ou, Sarah Frisken, Nuno C. Garcia, Alexandra J. Golby, Reuben Dorent, Ines P. Machado

Comments 31 pages, 7 figures, 3 tables. Submitted to Medical Image Analysis

Abstract

Accurate compensation of brain deformation is a critical challenge for reliable image-guided neurosurgery, as surgical manipulation and tumor resection induce tissue motion that misaligns preoperative planning images with intraoperative anatomy and longitudinal studies. In this systematic review, we synthesize recent AI-driven approaches developed between January 2020 and April 2025 for modeling and correcting brain deformation. A comprehensive literature search was conducted in PubMed, IEEE Xplore, Scopus, and Web of Science, with predefined inclusion and exclusion criteria focused on computational methods applied to brain deformation compensation for neurosurgical imaging, resulting in 41 studies meeting these criteria. We provide a unified analysis of methodological strategies, including deep learning-based image registration, direct deformation field regression, synthesis-driven multimodal alignment, resection-aware architectures addressing missing correspondences, and hybrid models that integrate biomechanical priors. We also examine dataset utilization, reported evaluation metrics, validation protocols, and how uncertainty and generalization have been assessed across studies. While AI-based deformation models demonstrate promising performance and computational efficiency, current approaches exhibit limitations in out-of-distribution robustness, standardized benchmarking, interpretability, and readiness for clinical deployment. Our review highlights these gaps and outlines opportunities for future research aimed at achieving more robust, generalizable, and clinically translatable deformation compensation solutions for neurosurgical guidance. By organizing recent advances and critically evaluating evaluation practices, this work provides a comprehensive foundation for researchers and clinicians engaged in developing and applying AI-based brain deformation methods.


2602.10153 2026-02-12 cs.CR cs.LG cs.SE

Basic Legibility Protocols Improve Trusted Monitoring

Ashwin Sreevatsa, Sebastian Prasanna, Cody Rushing

2602.10150 2026-02-12 physics.flu-dyn cs.AI math.AP

PEST: Physics-Enhanced Swin Transformer for 3D Turbulence Simulation

Yilong Dai, Shengyu Chen, Xiaowei Jia, Peyman Givi, Runlong Yu

2602.10148 2026-02-12 cs.CR cs.AI

Red-teaming the Multimodal Reasoning: Jailbreaking Vision-Language Models via Cross-modal Entanglement Attacks

Yu Yan, Sheng Sun, Shengjia Cheng, Teli Liu, Mingfeng Li, Min Liu

2602.10147 2026-02-12 cs.SE cs.AI

On the Use of a Large Language Model to Support the Conduction of a Systematic Mapping Study: A Brief Report from a Practitioner's View

Cauã Ferreira Barros, Marcos Kalinowski, Mohamad Kassab, Valdemar Vicente Graciano Neto

Comments 6 pages, includes 2 tables. Submitted and Accepted to the WSESE 2026 ICSE Workshop

2602.10145 2026-02-12 physics.soc-ph cs.AI cs.IR

Silence Routing: When Not Speaking Improves Collective Judgment

Itsuki Fujisaki, Kunhao Yang

Comments 7 pages, 2 figures

2602.10144 2026-02-12 stat.ML cs.AI cs.LG

When LLMs get significantly worse: A statistical approach to detect model degradations

Jonas Kübler, Kailash Budhathoki, Matthäus Kleindessner, Xiong Zhou, Junming Yin, Ashish Khetan, George Karypis

Comments https://openreview.net/forum?id=cM3gsqEI4K

Journal ref ICLR 2026

2602.10133 2026-02-12 cs.SE cs.AI

AgentTrace: A Structured Logging Framework for Agent System Observability

Adam AlSayyad, Kelvin Yuxiang Huang, Richik Pal

Comments AAAI 2026 Workshop LaMAS

2602.10131 2026-02-12 cs.SI cs.AI cs.CY

The Anatomy of the Moltbook Social Graph

David Holtz

Comments 20 pages, 7 figures

2602.10127 2026-02-12 cs.SI cs.AI cs.CR

"Humans welcome to observe": A First Look at the Agent Social Network Moltbook

Yukun Jiang, Yage Zhang, Xinyue Shen, Michael Backes, Yang Zhang

Comments 16 pages

2602.10124 2026-02-12 physics.soc-ph cs.CV cs.CY

URBAN-SPIN: A street-level bikeability index to inform design implementations in historical city centres

Haining Ding, Chenxi Wang, Michal Gath-Morad

Comments 32 pages, 10 figures

2602.10122 2026-02-12 cs.CY cs.AI

A Practical Guide to Agentic AI Transition in Organizations

Eranga Bandara, Ross Gore, Sachin Shetty, Sachini Rajapakse, Isurunima Kularathna, Pramoda Karunarathna, Ravi Mukkamala, Peter Foytik, Safdar H. Bouk, Abdul Rahman, Xueping Liang, Amin Hass, Tharaka Hewa, Ng Wee Keong, Kasun De Zoysa, Aruna Withanage, Nilaan Loganathan

Abstract

Agentic AI represents a significant shift in how intelligence is applied within organizations, moving beyond AI-assisted tools toward autonomous systems capable of reasoning, decision-making, and coordinated action across workflows. As these systems mature, they have the potential to automate a substantial share of manual organizational processes, fundamentally reshaping how work is designed, executed, and governed. Although many organizations have adopted AI to improve productivity, most implementations remain limited to isolated use cases and human-centered, tool-driven workflows. Despite increasing awareness of agentic AI's strategic importance, engineering teams and organizational leaders often lack clear guidance on how to operationalize it effectively. Key challenges include an overreliance on traditional software engineering practices, limited integration of business-domain knowledge, unclear ownership of AI-driven workflows, and the absence of sustainable human-AI collaboration models. Consequently, organizations struggle to move beyond experimentation, scale agentic systems, and align them with tangible business value. Drawing on practical experience in designing and deploying agentic AI workflows across multiple organizations and business domains, this paper proposes a pragmatic framework for transitioning organizational functions from manual processes to automated agentic AI systems. The framework emphasizes domain-driven use case identification, systematic delegation of tasks to AI agents, AI-assisted construction of agentic workflows, and small, AI-augmented teams working closely with business stakeholders. Central to the approach is a human-in-the-loop operating model in which individuals act as orchestrators of multiple AI agents, enabling scalable automation while maintaining oversight, adaptability, and organizational control.


2602.09447 2026-02-12 cs.SE cs.AI cs.CL

SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents

Zhirui Zhang, Hongbo Zhang, Haoxiang Fei, Zhiyuan Bao, Yubin Chen, Zhengyu Lei, Ziyue Liu, Yixuan Sun, Mingkun Xiao, Zihang Ye, Yu Zhang, Hongcheng Zhu, Yuxiang Wen, Heung-Yeung Shum

Comments 20 pages, 3 figures

2602.09067 2026-02-12 q-bio.GN cs.AI cs.CE cs.CL

AntigenLM: Structure-Aware DNA Language Modeling for Influenza

Yue Pei, Xuebin Chi, Yu Kang

Comments Accepted by ICLR 2026

2602.08515 2026-02-12 math.NA cs.LG cs.NA cs.NE math.OC

Do physics-informed neural networks (PINNs) need to be deep? Shallow PINNs using the Levenberg-Marquardt algorithm

Muhammad Luthfi Shahab, Imam Mukhlash, Hadi Susanto

2602.05016 2026-02-12 cs.HC cs.AI cs.CY cs.ET

From Fragmentation to Integration: Exploring the Design Space of AI Agents for Human-as-the-Unit Privacy Management

Eryue Xu, Tianshi Li

Journal ref CHI 2026

2602.00402 2026-02-12 cs.HC cs.AI cs.CY

A Conditional Companion: Lived Experiences of People with Mental Health Disorders Using LLMs

Aditya Kumar Purohit, Hendrik Heuer

Comments Accepted for presentation at CHI 2026 in Barcelona (ACM CHI Conference on Human Factors in Computing Systems)

2601.22860 2026-02-12 math.NA cs.AI cs.NA

Bayesian Interpolating Neural Network (B-INN): a scalable and reliable Bayesian model for large-scale physical systems

Chanwook Park, Brian Kim, Jiachen Guo, Wing Kam Liu

Comments 8 pages, 6 figures, ICML conference full paper submitted
