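arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.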


2602.16111 2026-02-19 stat.AP cs.AI

Surrogate-Based Prevalence Measurement for Large-Scale A/B Testing

Zehao Xu, Tony Paek, Kevin O'Sullivan, Attila Dobi

2602.16109 2026-02-19 cs.CR cs.AI cs.CE

Federated Graph AGI for Cross-Border Insider Threat Intelligence in Government Financial Schemes

Srikumar Nayak, James Walmesley

Comments 35 pages, 8 figures

2602.16098 2026-02-19 cs.CR cs.LG

Collaborative Zone-Adaptive Zero-Day Intrusion Detection for IoBT

Amirmohammad Pasdar, Shabnam Kasra Kermanshahi, Nour Moustafa, Van-Thuan Pham

2602.16063 2026-02-19 eess.SY cs.CE cs.ET cs.LG cs.SY stat.CO

MARLEM: A Multi-Agent Reinforcement Learning Simulation Framework for Implicit Cooperation in Decentralized Local Energy Markets

Nelson Salazar-Pena, Alejandra Tabares, Andres Gonzalez-Mancera

Comments 32 pages, 7 figures, 1 table, 1 algorithm

2602.16062 2026-02-19 eess.SY cs.CE cs.LG cs.MA cs.SY stat.AP

Harnessing Implicit Cooperation: A Multi-Agent Reinforcement Learning Approach Towards Decentralized Local Energy Markets

Nelson Salazar-Pena, Alejandra Tabares, Andres Gonzalez-Mancera

Comments 42 pages, 7 figures, 10 tables

2602.16038 2026-02-19 cs.NE cs.LG

Heuristic Search as Language-Guided Program Optimization

Mingxin Yu, Ruixiao Yang, Chuchu Fan

Comments 8 pages, 3 figures, under review

2602.16033 2026-02-19 cs.HC cs.AI

Transforming GenAI Policy to Prompting Instruction: An RCT of Scalable Prompting Interventions in a CS1 Course

Ruiwei Xiao, Runlong Ye, Xinying Hou, Jessica Wen, Harsh Kumar, Michael Liut, John Stamper

Comments 11 pages, 3 figures

2602.16018 2026-02-19 quant-ph cs.ET cs.LG

Edge-Local and Qubit-Efficient Quantum Graph Learning for the NISQ Era

Armin Ahmadkhaniha, Jake Doliskani

2602.15996 2026-02-19 math.OC cs.LG

Exploring New Frontiers in Vertical Federated Learning: the Role of Saddle Point Reformulation

Aleksandr Beznosikov, Georgiy Kormakov, Alexander Grigorievskiy, Mikhail Rudakov, Ruslan Nazykov, Alexander Rogozin, Anton Vakhrushev, Andrey Savchenko, Martin Takáč, Alexander Gasnikov

Comments 104 pages, 1 table, 9 figures, 10 theorems, 12 algorithms

2602.15988 2026-02-19 eess.IV cs.CV cs.HC

Automated Assessment of Kidney Ureteroscopy Exploration for Training

Fangjie Li, Nicholas Kavoussi, Charan Mohan, Matthieu Chabanas, Jie Ying Wu

2602.15968 2026-02-19 cs.SE cs.AI cs.CY cs.HC

From Reflection to Repair: A Scoping Review of Dataset Documentation Tools

Pedro Reynolds-Cuéllar, Marisol Wong-Villacres, Adriana Alvarado Garcia, Heila Precel

Comments to be published at the CHI conference on Human Factors in Computing Systems

2602.15951 2026-02-19 astro-ph.CO cs.LG

MadEvolve: Evolutionary Optimization of Cosmological Algorithms with Large Language Models

Tianyi Li, Shihui Zang, Moritz Münchmeyer

2602.15945 2026-02-19 cs.CR cs.AI

From Tool Orchestration to Code Execution: A Study of MCP Design Choices

Yuval Felendler, Parth A. Gandhi, Idan Habler, Yuval Elovici, Asaf Shabtai

Abstract

Model Context Protocols (MCPs) provide a unified platform for agent systems to discover, select, and orchestrate tools across heterogeneous execution environments. As MCP-based systems scale to incorporate larger tool catalogs and multiple concurrently connected MCP servers, traditional tool-by-tool invocation increases coordination overhead, fragments state management, and limits support for wide-context operations. To address these scalability challenges, recent MCP designs have incorporated code execution as a first-class capability, an approach called Code Execution MCP (CE-MCP). This enables agents to consolidate complex workflows, such as SQL querying, file analysis, and multi-step data transformations, into a single program that executes within an isolated runtime environment. In this work, we formalize the architectural distinction between context-coupled (traditional) and context-decoupled (CE-MCP) models, analyzing their fundamental scalability trade-offs. Using the MCP-Bench framework across 10 representative servers, we empirically evaluate task behavior, tool utilization patterns, execution latency, and protocol efficiency as the scale of connected MCP servers and available tools increases, demonstrating that while CE-MCP significantly reduces token usage and execution latency, it introduces a vastly expanded attack surface. We address this security gap by applying the MAESTRO framework, identifying sixteen attack classes across five execution phases, including specific code execution threats such as exception-mediated code injection and unsafe capability synthesis. We validate these vulnerabilities through adversarial scenarios across multiple LLMs and propose a layered defense architecture comprising containerized sandboxing and semantic gating. Our findings provide a rigorous roadmap for balancing scalability and security in production-ready executable agent workflows.


2602.15925 2026-02-19 stat.ML cs.LG

Robust Stochastic Gradient Posterior Sampling with Lattice Based Discretisation

Zier Mensch, Lars Holdijk, Samuel Duffield, Maxwell Aifer, Patrick J. Coles, Max Welling, Miranda C. N. Cheng

2602.15923 2026-02-19 cond-mat.mtrl-sci cs.AI cs.LG

A fully differentiable framework for training proxy Exchange Correlation Functionals for periodic systems

Rakshit Kumar Singh, Aryan Amit Barsainyan, Bharath Ramsundar

2602.15920 2026-02-19 stat.ML cs.LG eess.SP

Including Node Textual Metadata in Laplacian-constrained Gaussian Graphical Models

Jianhua Wang, Killian Cressant, Pedro Braconnot Velloso, Arnaud Breloy

Comments Submitted to EUSIPCO 2026

2602.15917 2026-02-19 eess.IV cs.CV cs.DC cs.IT math.IT

ROIX-Comp: Optimizing X-ray Computed Tomography Imaging Strategy for Data Reduction and Reconstruction

Amarjit Singh, Kento Sato, Kohei Yoshida, Kentaro Uesugi, Yasumasa Joti, Takaki Hatsui, Andrès Rubio Proaño

Comments 11 pages, SCA/HPCAsia2026

2602.15914 2026-02-19 cond-mat.stat-mech cs.LG

Steering Dynamical Regimes of Diffusion Models by Breaking Detailed Balance

Haiqi Lu, Ying Tang

2602.15913 2026-02-19 eess.IV cs.AI cs.CV

Foundation Models for Medical Imaging: Status, Challenges, and Directions

Chuang Niu, Pengwei Wu, Bruno De Man, Ge Wang

2602.15890 2026-02-19 physics.comp-ph cs.AI cs.LG

Surrogate Modeling for Neutron Transport: A Neural Operator Approach

Md Hossain Sahadath, Qiyun Cheng, Shaowu Pan, Wei Ji

2602.15865 2026-02-19 cs.HC cs.AI cs.CL

AI as Teammate or Tool? A Review of Human-AI Interaction in Decision Support

Most. Sharmin Sultana Samu, Nafisa Khan, Kazi Toufique Elahi, Tasnuva Binte Rahman, Md. Rakibul Islam, Farig Sadeque

Comments Preprint

2602.15835 2026-02-19 cs.HC cs.CL cs.SE

A Methodology for Identifying Evaluation Items for Practical Dialogue Systems Based on Business-Dialogue System Alignment Models

Mikio Nakano, Hironori Takeuchi, Kazunori Komatani

Comments This paper has been accepted for presentation at International Workshop on Spoken Dialogue Systems Technology 2025 (IWSDS 2025)

2602.15485 2026-02-19 cs.CR cs.AI cs.SE

SecCodeBench-V2 Technical Report

Longfei Chen, Ji Zhao, Lanxiao Cui, Tong Su, Xingbo Pan, Ziyang Li, Yongxing Wu, Qijiang Cao, Qiyao Cai, Jing Zhang, Yuandong Ni, Junyao He, Zeyu Zhang, Chao Ge, Xuhuai Lu, Zeyu Gao, Yuxin Cui, Weisen Chen, Yuxuan Peng, Shengping Wang, Qi Li, Yukai Huang, Yukun Liu, Tuo Zhou, Terry Yue Zhuo, Junyang Lin, Chao Zhang

Abstract

We introduce SecCodeBench-V2, a publicly released benchmark for evaluating Large Language Model (LLM) copilots' capabilities of generating secure code. SecCodeBench-V2 comprises 98 generation and fix scenarios derived from Alibaba Group's industrial productions, where the underlying security issues span 22 common CWE (Common Weakness Enumeration) categories across five programming languages: Java, C, Python, Go, and JavaScript. SecCodeBench-V2 adopts a function-level task formulation: each scenario provides a complete project scaffold and requires the model to implement or patch a designated target function under fixed interfaces and dependencies. For each scenario, SecCodeBench-V2 provides executable proof-of-concept (PoC) test cases for both functional validation and security verification. All test cases are authored and double-reviewed by security experts, ensuring high fidelity, broad coverage, and reliable ground truth. Beyond the benchmark itself, we build a unified evaluation pipeline that assesses models primarily via dynamic execution. For most scenarios, we compile and run model-generated artifacts in isolated environments and execute PoC test cases to validate both functional correctness and security properties. For scenarios where security issues cannot be adjudicated with deterministic test cases, we additionally employ an LLM-as-a-judge oracle. To summarize performance across heterogeneous scenarios and difficulty levels, we design a Pass@K-based scoring protocol with principled aggregation over scenarios and severity, enabling holistic and comparable evaluation across models. Overall, SecCodeBench-V2 provides a rigorous and reproducible foundation for assessing the security posture of AI coding assistants, with results and artifacts released at https://alibaba.github.io/sec-code-bench. The benchmark is publicly available at https://github.com/alibaba/sec-code-bench.
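The abstract does not spell out the Pass@K-based scoring protocol, so the following is only a rough sketch of what such a score could look like. It assumes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), and a simple severity-weighted mean across scenarios. The function names, the severity labels, and the weights are hypothetical and are not taken from SecCodeBench-V2.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples drawn, c of them passed all tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def severity_weighted_score(results, k, weights):
    """Weighted mean of per-scenario pass@k.

    results: list of (n, c, severity) tuples, one per scenario.
    weights: hypothetical mapping from severity label to weight.
    """
    total_weight = sum(weights[sev] for _, _, sev in results)
    weighted_sum = sum(
        weights[sev] * pass_at_k(n, c, k) for n, c, sev in results
    )
    return weighted_sum / total_weight


# Illustrative data only: two scenarios with assumed severity weights.
example_results = [(10, 5, "high"), (10, 10, "low")]
example_weights = {"high": 2.0, "low": 1.0}
example_score = severity_weighted_score(example_results, 1, example_weights)
```

Weighting by severity makes a failure on a high-severity scenario cost more than one on a low-severity scenario. The benchmark's actual aggregation may differ from this sketch.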


2602.15286 2026-02-19 cs.NI cs.AI

AI-Paging: Lease-Based Execution Anchoring for Network-Exposed AI-as-a-Service

Mohaned Chraiti, Merve Saimler

2602.15281 2026-02-19 cs.NI cs.AI

High-Fidelity Network Management for Federated AI-as-a-Service: Cross-Domain Orchestration

Mohaned Chraiti, Ozgur Ercetin, Merve Saimler

2602.15146 2026-02-19 quant-ph cs.LG

Beyond Reinforcement Learning: Fast and Scalable Quantum Circuit Synthesis

Lukas Theißinger, Thore Gerlach, David Berghaus, Christian Bauckhage

2602.13521 2026-02-19 cs.DB cs.AI

Arming Data Agents with Tribal Knowledge

Shubham Agarwal, Asim Biswal, Sepanta Zeighami, Alvin Cheung, Joseph Gonzalez, Aditya G. Parameswaran

2602.10531 2026-02-19 stat.ML cs.LG math.ST stat.TH

From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

Soham Bakshi, Sunrit Chakraborty

2602.05023 2026-02-19 cs.CR cs.AI

Do Vision-Language Models Respect Contextual Integrity in Location Disclosure?

Ruixin Yang, Ethan Mendes, Arthur Wang, James Hays, Sauvik Das, Wei Xu, Alan Ritter

Comments Accepted by ICLR 2026. Code and data can be downloaded via https://github.com/99starman/VLM-GeoPrivacyBench

2601.21093 2026-02-19 stat.ML cs.LG math.OC math.PR math.ST stat.TH

High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

Zhou Fan, Leda Wang
