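arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.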



2602.10381 2026-02-12 cs.LG

Deep learning outperforms traditional machine learning methods in predicting childhood malnutrition: evidence from survey data

Deepak Bastola, Yang Li

Comments 21 pages, 10 figures

2602.10380 2026-02-12 cs.CL cs.AI

The Alignment Bottleneck in Decomposition-Based Claim Verification

Mahmud Elahi Akhter, Federico Ruggeri, Iman Munire Bilal, Rob Procter, Maria Liakata

2602.10377 2026-02-12 cs.LG cs.CL

Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs

Luoyang Sun, Jiwen Jiang, Yifeng Ding, Fengfa Li, Yan Song, Haifeng Zhang, Jian Ying, Lei Ren, Kun Zhan, Wei Chen, Yan Xie, Cheng Deng

2602.10371 2026-02-12 cs.LG

Simple LLM Baselines are Competitive for Model Diffing

Elias Kempf, Simon Schrodi, Bartosz Cywiński, Thomas Brox, Neel Nanda, Arthur Conmy

2602.10367 2026-02-12 cs.AI

LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation

Zhiling Yan, Dingjie Song, Zhe Fang, Yisheng Ji, Xiang Li, Quanzheng Li, Lichao Sun


Abstract

The deployment of Large Language Models (LLMs) in high-stakes clinical settings demands rigorous and reliable evaluation. However, existing medical benchmarks remain static, suffering from two critical limitations: (1) data contamination, where test sets inadvertently leak into training corpora, leading to inflated performance estimates; and (2) temporal misalignment, failing to capture the rapid evolution of medical knowledge. Furthermore, current evaluation metrics for open-ended clinical reasoning often rely on either shallow lexical overlap (e.g., ROUGE) or subjective LLM-as-a-Judge scoring, both inadequate for verifying clinical correctness. To bridge these gaps, we introduce LiveMedBench, a continuously updated, contamination-free, and rubric-based benchmark that weekly harvests real-world clinical cases from online medical communities, ensuring strict temporal separation from model training data. We propose a Multi-Agent Clinical Curation Framework that filters raw data noise and validates clinical integrity against evidence-based medical principles. For evaluation, we develop an Automated Rubric-based Evaluation Framework that decomposes physician responses into granular, case-specific criteria, achieving substantially stronger alignment with expert physicians than LLM-as-a-Judge. To date, LiveMedBench comprises 2,756 real-world cases spanning 38 medical specialties and multiple languages, paired with 16,702 unique evaluation criteria. Extensive evaluation of 38 LLMs reveals that even the best-performing model achieves only 39.2%, and 84% of models exhibit performance degradation on post-cutoff cases, confirming pervasive data contamination risks. Error analysis further identifies contextual application, not factual knowledge, as the dominant bottleneck, with 35-48% of failures stemming from the inability to tailor medical knowledge to patient-specific constraints.


2602.10365 2026-02-12 cs.RO math.OC

Solving Geodesic Equations with Composite Bernstein Polynomials for Trajectory Planning

Nick Gorman, Gage MacLin, Maxwell Hammond, Venanzio Cichella

Comments Accepted for the 2026 IEEE Aerospace Conference


Abstract

This work presents a trajectory planning method based on composite Bernstein polynomials for autonomous systems navigating complex environments. The method is implemented in a symbolic optimization framework that enables continuous paths and precise control over trajectory shape. Trajectories are planned over a cost surface that encodes obstacles as continuous fields rather than discrete boundaries. Regions near obstacles are assigned higher costs, naturally encouraging the trajectory to maintain a safe distance while still allowing efficient routing through constrained spaces. The use of composite Bernstein polynomials preserves continuity while enabling fine control over local curvature to satisfy geodesic constraints. The symbolic representation supports exact derivatives, improving optimization efficiency. The method applies to both two- and three-dimensional environments and is suitable for ground, aerial, underwater, and space systems. In spacecraft trajectory planning, for example, it enables the generation of continuous, dynamically feasible trajectories with high numerical efficiency, making it well suited for orbital maneuvers, rendezvous and proximity operations, cluttered gravitational environments, and planetary exploration missions with limited onboard computational resources. Demonstrations show that the approach efficiently generates smooth, collision-free paths in scenarios with multiple obstacles, maintaining clearance without extensive sampling or post-processing. The optimization incorporates three constraint types: (1) a Gaussian surface inequality enforcing minimum obstacle clearance; (2) geodesic equations guiding the path along locally efficient directions on the cost surface; and (3) boundary constraints enforcing fixed start and end conditions. The method can serve as a standalone planner or as an initializer for more complex motion planning problems.
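The two building blocks named in this abstract, a composite Bernstein polynomial path and a Gaussian obstacle cost field, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the equal-length segment parameterization, and the `sigma` width are assumptions made for illustration, and the sketch does not solve the geodesic equations or run any optimization.

```python
from math import comb, exp


def bernstein_point(control_points, t):
    """Evaluate a Bernstein polynomial curve at t in [0, 1]."""
    n = len(control_points) - 1
    dim = len(control_points[0])
    point = [0.0] * dim
    for i, cp in enumerate(control_points):
        # Bernstein basis polynomial b_{i,n}(t)
        basis = comb(n, i) * t**i * (1 - t) ** (n - i)
        for d in range(dim):
            point[d] += basis * cp[d]
    return tuple(point)


def composite_point(segments, t):
    """Evaluate a composite curve whose segments split [0, 1] equally.

    Assumption: consecutive segments share an endpoint, so the path is
    continuous where one segment ends and the next begins.
    """
    k = len(segments)
    idx = min(int(t * k), k - 1)  # which segment t falls in
    local_t = t * k - idx         # rescale t to that segment's [0, 1]
    return bernstein_point(segments[idx], local_t)


def gaussian_cost(point, obstacles, sigma=1.0):
    """Hypothetical cost field: one Gaussian bump per obstacle center."""
    total = 0.0
    for obs in obstacles:
        sq_dist = sum((p - o) ** 2 for p, o in zip(point, obs))
        total += exp(-sq_dist / (2 * sigma**2))
    return total


# Two quadratic segments joined at (2, 0).
segments = [
    [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)],
    [(2.0, 0.0), (3.0, -2.0), (4.0, 0.0)],
]
obstacles = [(1.0, 0.0)]

# Sample the path and report the cost at each sample.
for step in range(5):
    t = step / 4
    p = composite_point(segments, t)
    print(t, p, round(gaussian_cost(p, obstacles), 4))
```

In a planner of the kind the abstract describes, the control points would be the decision variables and a cost like this would enter the clearance constraint; here they are fixed by hand purely to show the evaluation step.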


2602.10364 2026-02-12 cs.CV

Comp2Comp: Open-Source Software with FDA-Cleared Artificial Intelligence Algorithms for Computed Tomography Image Analysis

Adrit Rao, Malte Jensen, Andrea T. Fisher, Louis Blankemeier, Pauline Berens, Arash Fereydooni, Seth Lirette, Eren Alkan, Felipe C. Kitamura, Juan M. Zambrano Chaves, Eduardo Reis, Arjun Desai, Marc H. Willis, Jason Hom, Andrew Johnston, Leon Lenchik, Robert D. Boutin, Eduardo M. J. M. Farina, Augusto S. Serpa, Marcelo S. Takahashi, Jordan Perchik, Steven A. Rothenberg, Jamie L. Schroeder, Ross Filice, Leonardo K. Bittencourt, Hari Trivedi, Marly van Assen, John Mongan, Kimberly Kallianos, Oliver Aalami, Akshay S. Chaudhari

Comments Adrit Rao, Malte Jensen, Andrea T. Fisher, Louis Blankemeier: Co-first authors. Oliver Aalami, Akshay S. Chaudhari: Co-senior authors

2602.10357 2026-02-12 cs.LG

Theoretical Analysis of Contrastive Learning under Imbalanced Data: From Training Dynamics to a Pruning Solution

Haixu Liao, Yating Zhou, Songyang Zhang, Meng Wang, Shuai Zhang

2602.10354 2026-02-12 cs.CL cs.LG

Physically Interpretable AlphaEarth Foundation Model Embeddings Enable LLM-Based Land Surface Intelligence

Mashrekur Rahman

2602.10350 2026-02-12 cs.CL

When Less Is More? Diagnosing ASR Predictions in Sardinian via Layer-Wise Decoding

Domenico De Cristofaro, Alessandro Vietti, Marianne Pouplier, Aleese Block

2602.10345 2026-02-12 cs.LG

Identifying Evidence-Based Nudges in Biomedical Literature with Large Language Models

Jaydeep Chauhan, Mark Seidman, Pezhman Raeisian Parvari, Zhi Zheng, Zina Ben-Miled, Cristina Barboi, Andrew Gonzalez, Malaz Boustani

2602.10344 2026-02-12 cs.CV

Monte Carlo Maximum Likelihood Reconstruction for Digital Holography with Speckle

Xi Chen, Arian Maleki, Shirin Jalali

2602.10343 2026-02-12 cs.CV cs.LG

Conditional Uncertainty-Aware Political Deepfake Detection with Stochastic Convolutional Neural Networks

Rafael-Petruţ Gardoş

Comments 21 pages, 12 figures, 18 tables

2602.10329 2026-02-12 cs.CL cs.AI cs.LG

Are More Tokens Rational? Inference-Time Scaling in Language Models as Adaptive Resource Rationality

Zhimin Hu, Riya Roshan, Sashank Varma

2602.10319 2026-02-12 cs.CV

A Low-Rank Defense Method for Adversarial Attack on Diffusion Models

Jiaxuan Zhu, Siyu Huang

Comments Accepted by ICME2025

2602.10305 2026-02-12 cs.LG cs.AI cs.RO

Confounding Robust Continuous Control via Automatic Reward Shaping

Mateo Juliani, Mingxuan Li, Elias Bareinboim

Comments Mateo Juliani and Mingxuan Li contributed equally to this work; accepted in AAMAS 2026

2602.10303 2026-02-12 cs.LG q-bio.QM stat.ML

ICODEN: Ordinary Differential Equation Neural Networks for Interval-Censored Data

Haoling Wang, Lang Zeng, Tao Sun, Youngjoo Cho, Ying Ding

2602.10300 2026-02-12 cs.LG

Configuration-to-Performance Scaling Law with Neural Ansatz

Huaqing Zhang, Kaiyue Wen, Tengyu Ma

2602.10282 2026-02-12 cs.LG

Linear-LLM-SCM: Benchmarking LLMs for Coefficient Elicitation in Linear-Gaussian Causal Models

Kanta Yamaoka, Sumantrak Mukherjee, Thomas Gärtner, David Antony Selby, Stefan Konigorski, Eyke Hüllermeier, Viktor Bengs, Sebastian Josef Vollmer

Comments 16 pages, 4 figures, preprint

2602.10278 2026-02-12 cs.CV cs.AI

ERGO: Excess-Risk-Guided Optimization for High-Fidelity Monocular 3D Gaussian Splatting

Zehua Ma, Hanhui Li, Zhenyu Xie, Xiaonan Luo, Michael Kampffmeyer, Feng Gao, Xiaodan Liang

2602.10266 2026-02-12 cs.LG cs.AI

From Classical to Topological Neural Networks Under Uncertainty

Sarah Harkins Dayton, Layal Bou Hamdan, Ioannis D. Schizas, David L. Boothe, Vasileios Maroulas

2602.10265 2026-02-12 cs.CV

Colorimeter-Supervised Skin Tone Estimation from Dermatoscopic Images for Fairness Auditing

Marin Benčević, Krešimir Romić, Ivana Hartmann Tolić, Irena Galić

Comments Preprint submitted to Computer Methods and Programs in Biomedicine

2602.10261 2026-02-12 cs.LG stat.AP stat.ML

Kernel-Based Learning of Chest X-ray Images for Predicting ICU Escalation among COVID-19 Patients

Qiyuan Shi, Jian Kang, Yi Li

2602.10249 2026-02-12 cs.LG

Modeling Programming Skills with Source Code Embeddings for Context-aware Exercise Recommendation

Carlos Eduardo P. Silva, João Pedro M. Sena, Julio C. S. Reis, André G. Santos, Lucas N. Ferreira

Comments 10 pages, 4 figures, to be published in LAK26: 16th International Learning Analytics and Knowledge Conference (LAK 2026)

2602.10239 2026-02-12 cs.CV

XSPLAIN: XAI-enabling Splat-based Prototype Learning for Attribute-aware INterpretability

Dominik Galus, Julia Farganus, Tymoteusz Zapala, Mikołaj Czachorowski, Piotr Borycki, Przemysław Spurek, Piotr Syga

2602.10238 2026-02-12 cs.CL cs.LG

Learning to Evict from Key-Value Cache

Luca Moschella, Laura Manduchi, Ozan Sener

Comments 23 pages, 15 figures

2602.10232 2026-02-12 cs.LG

Risk-Equalized Differentially Private Synthetic Data: Protecting Outliers by Controlling Record-Level Influence

Amir Asiaee, Chao Yan, Zachary B. Abrams, Bradley A. Malin

2602.10231 2026-02-12 cs.LG cs.AI cs.CL

Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards

Kirill Pavlenko, Alexander Golubev, Simon Karasik, Boris Yangel

2602.10230 2026-02-12 cs.LG cs.SD eess.AS

Frame-Level Internal Tool Use for Temporal Grounding in Audio LMs

Joesph An, Phillip Keung, Jiaqi Wang, Orevaoghene Ahia, Noah A. Smith

Comments Under review. See https://github.com/inkitori/taudio/

2602.10229 2026-02-12 cs.CL

Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

Weihao Liu, Dehai Min, Lu Cheng

Comments version 1.0
