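arXivDaily, a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations. It covers artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.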



2601.19447 2026-01-28 cs.CL cs.AI

KG-CRAFT: Knowledge Graph-based Contrastive Reasoning with LLMs for Enhancing Automated Fact-checking

Vítor N. Lourenço, Aline Paes, Tillman Weyde, Audrey Depeige, Mohnish Dubey

Comments Accepted for publication at the 19th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2026

2601.19446 2026-01-28 cs.CV

DSTCS: Dual-Student Teacher Framework with Segment Anything Model for Semi-Supervised Pubic Symphysis Fetal Head Segmentation

Yalin Luo, Shun Long, Huijin Wang, Jieyun Bai

2601.19439 2026-01-28 cs.LG

OSIRIS: Bridging Analog Circuit Design and Machine Learning with Scalable Dataset Generation

Giuseppe Chiari, Michele Piccoli, Davide Zoni

2601.19433 2026-01-28 cs.CV

RoamScene3D: Immersive Text-to-3D Scene Generation via Adaptive Object-aware Roaming

Jisheng Chu, Wenrui Li, Rui Zhao, Wangmeng Zuo, Shifeng Chen, Xiaopeng Fan

2601.19430 2026-01-28 cs.CV

Unveiling Perceptual Artifacts: A Fine-Grained Benchmark for Interpretable AI-Generated Image Detection

Yao Xiao, Weiyan Chen, Jiahao Chen, Zijie Cao, Weijian Deng, Binbin Yang, Ziyi Dong, Xiangyang Ji, Wei Ke, Pengxu Wei, Liang Lin

2601.19406 2026-01-28 cs.RO cs.AI

Sim-and-Human Co-training for Data-Efficient and Generalizable Robotic Manipulation

Kaipeng Fang, Weiqing Liang, Yuyang Li, Ji Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen

2601.19399 2026-01-28 cs.SD cs.AI

Residual Tokens Enhance Masked Autoencoders for Speech Modeling

Samir Sadok, Stéphane Lathuilière, Xavier Alameda-Pineda

Comments Submitted to ICASSP 2026 (accepted)

2601.19394 2026-01-28 cs.LG

DSP-Reg: Domain-Sensitive Parameter Regularization for Robust Domain Generalization

Xudong Han, Senkang Hu, Yihang Tao, Yu Guo, Philip Birch, Sam Tak Wu Kwong, Yuguang Fang

2601.19375 2026-01-28 cs.LG cs.AI

Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection

Quy-Anh Dang, Chris Ngo

2601.19360 2026-01-28 cs.CL

Binary Token-Level Classification with DeBERTa for All-Type MWE Identification: A Lightweight Approach with Linguistic Enhancement

Diego Rossini, Lonneke van der Plas

Comments Accepted at Findings of EACL 2026

2601.19354 2026-01-28 cs.RO cs.SY eess.SY

Self-Supervised Path Planning in Unstructured Environments via Global-Guided Differentiable Hard Constraint Projection

Ziqian Wang, Chenxi Fang, Zhen Zhang

2601.19352 2026-01-28 cs.LG

GraphSB: Boosting Imbalanced Node Classification on Graphs through Structural Balance

Zhixiao Wang, Chaofan Zhu, Qihan Feng, Jian Zhang, Xiaobin Rui, Philip S Yu

2601.19350 2026-01-28 cs.CL

Cross-Examination Framework: A Task-Agnostic Diagnostic for Information Fidelity in Text-to-Text Generation

Tathagata Raha, Clement Christophe, Nada Saadi, Hamza A Javed, Marco AF Pimentel, Ronnie Rajan, Praveenkumar Kanithi

2601.19341 2026-01-28 cs.LG cs.AI

Robust Uncertainty Estimation under Distribution Shift via Difference Reconstruction

Xinran Xu, Li Rong Wang, Xiuyi Fan

2601.19337 2026-01-28 cs.AI cs.LG cs.SE

SETA: Statistical Fault Attribution for Compound AI Systems

Sayak Chowdhury, Meenakshi D'Souza

Comments Accepted to CAIN 2026 co-hosted with ICSE 2026

2601.19336 2026-01-28 cs.LG cs.AI

From Observations to Events: Event-Aware World Model for Reinforcement Learning

Zhao-Han Peng, Shaohui Li, Zhi Li, Shulan Ruan, Yu Liu, You He

Comments 43 pages, accepted by ICLR 2026

2601.19334 2026-01-28 cs.CL cs.AI

When Benchmarks Leak: Inference-Time Decontamination for LLMs

Jianzhe Chai, Yu Zhe, Jun Sakuma

2601.19333 2026-01-28 cs.LG cs.DS

Metric $k$-clustering using only Weak Comparison Oracles

Rahul Raychaudhury, Aryan Esmailpour, Sainyam Galhotra, Stavros Sintos

Journal ref ICLR 2026

2601.19325 2026-01-28 cs.CV cs.AI

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Zichen Wen, Boxue Yang, Shuang Chen, Yaojie Zhang, Yuhang Han, Junlong Ke, Cong Wang, Yicheng Fu, Jiawang Zhao, Jiangchao Yao, Xi Fang, Zhen Wang, Henxing Cai, Lin Yao, Zhifeng Gao, Yanhui Hong, Nang Yuan, Yixuan Li, Guojiang Zhao, Haoyi Tao, Nan Wang, Han Lyu, Guolin Ke, Ning Liao, Xiaoxing Wang, Kai Chen, Zhiyu Li, Feiyu Xiong, Sihan Hu, Kun Chen, Yanfeng Wang, Weinan E, Linfeng Zhang, Linfeng Zhang

Comments Innovator-VL tech report

2601.19315 2026-01-28 cs.LG

Generalizable IoT Traffic Representations for Cross-Network Device Identification

Arunan Sivanathan, David Warren, Deepak Mishra, Sushmita Ruj, Natasha Fernandes, Quan Z. Sheng, Minh Tran, Ben Luo, Daniel Coscia, Gustavo Batista, Hassan Habibi Gharakheili

Comments 15 pages, 15 figures

2601.19314 2026-01-28 cs.CV cs.AI

Instance-Guided Radar Depth Estimation for 3D Object Detection

Chen-Chou Lo, Patrick Vandewalle

Comments Accepted to IPMV 2026

2601.19312 2026-01-28 cs.LG cs.SY eess.SY stat.CO stat.ML

LightSBB-M: Bridging Schrödinger and Bass for Generative Diffusion Modeling

Alexandre Alouadi, Pierre Henry-Labordère, Grégoire Loeper, Othmane Mazhar, Huyên Pham, Nizar Touzi

2601.19309 2026-01-28 cs.CV

Beyond Shadows: A Large-Scale Benchmark and Multi-Stage Framework for High-Fidelity Facial Shadow Removal

Tailong Luo, Jiesong Bai, Jinyang Huang, Junyu Xia, Wangyu Wu, Xuhang Chen

Comments Accepted by ICASSP 2026

2601.19306 2026-01-28 cs.AI

Curiosity Driven Knowledge Retrieval for Mobile Agents

Sijia Li, Xiaoyu Tan, Shahir Ali, Niels Schmidt, Gengchen Ma, Xihe Qiu

2601.19297 2026-01-28 cs.SD eess.AS

Phase-Retrieval-Based Physics-Informed Neural Networks For Acoustic Magnitude Field Reconstruction

Karl Schrader, Shoichi Koyama, Tomohiko Nakamura, Mirco Pezzoli

Comments Accepted to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2026

2601.19296 2026-01-28 cs.LG

Process-Aware Procurement Lead Time Prediction for Shipyard Delay Mitigation

Yongjae Lee, Eunhee Park, Daesan Park, Dongho Kim, Jongho Choi, Hyerim Bae

2601.19295 2026-01-28 cs.CV

ProMist-5K: A Comprehensive Dataset for Digital Emulation of Cinematic Pro-Mist Filter Effects

Yingtie Lei, Zimeng Li, Chi-Man Pun, Wangyu Wu, Junke Yang, Xuhang Chen

Comments Accepted by ICASSP 2026

2601.19290 2026-01-28 cs.CL

MetaGen: Self-Evolving Roles and Topologies for Multi-Agent LLM Reasoning

Yimeng Wang, Jiaxing Zhao, Hongbin Xie, Hexing Ma, Yuzhen Lei, Shuangxue Liu, Xuan Song, Zichen Zhang, Haoran Zhang

2601.19286 2026-01-28 cs.CL

ReToP: Learning to Rewrite Electronic Health Records for Clinical Prediction

Jesus Lovon-Melgarejo, Jose G. Moreno, Christine Damase-Michel, Lynda Tamine

Comments Accepted by WSDM 2026

Journal ref WSDM 2026, Feb 2026, Boise, Idaho, United States

2601.19280 2026-01-28 cs.LG cs.AI cs.CL

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Kishan Panaganti, Zhenwen Liang, Wenhao Yu, Haitao Mi, Dong Yu

Comments Keywords: Large Language Models, Reasoning Models, Reinforcement Learning, Distributionally Robust Optimization, GRPO


English abstract

Recent progress in Large Language Model (LLM) reasoning is increasingly driven by the refinement of post-training loss functions and alignment strategies. However, standard Reinforcement Learning (RL) paradigms like Group Relative Policy Optimization (GRPO) remain constrained by static uniformity: uniform prompt sampling and a fixed number of rollouts per prompt. For heterogeneous, heavy-tailed reasoning data, this creates structural inefficiencies that waste compute on already-solved patterns while under-training the long tail of hard problems. To address this, we propose Multi-Adversary Group Distributionally Robust Optimization (GDRO), an optimization-first framework that moves beyond uniform reasoning models by dynamically adapting the training distribution. We introduce an Online Difficulty Classifier that partitions prompts into dynamic pass@k difficulty groups. We then propose two independent GDRO games for post-training: (1) Prompt-GDRO, which employs an EMA-debiased multiplicative-weights bandit sampler to target the intensive difficulty margin and upweight persistently hard groups without frequency bias; and (2) Rollout-GDRO, which uses a shadow-price controller to reallocate rollouts across groups, maximizing gradient variance reduction on hard tasks under a fixed mean budget (compute-neutral). We provide no-regret guarantees for both controllers and additionally a variance-proxy analysis motivating a square-root optimal rollout allocation for Rollout-GDRO. We validate our framework on the DAPO 14.1k dataset using Qwen3-Base models. Prompt-GDRO and Rollout-GDRO achieve average relative gains of +10.6% and +10.1%, respectively, in pass@8 accuracy across 1.7B, 4B, and 8B scales compared to the GRPO baseline. Qualitative analysis shows an emergent curriculum: the adversaries shift resources to the evolving reasoning frontier, enhancing the reasoning model's performance.
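The two controllers named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the step size `eta`, the use of per-group loss as the adversary's signal, and the variance-proxy objective (minimize the sum of v_g / n_g over groups of equal size) are all assumptions made for illustration. The EMA debiasing and the shadow-price controller mentioned in the abstract are omitted. Under that assumed objective, a standard Lagrangian argument gives an allocation proportional to the square root of each group's variance proxy, which is one setting in which a square-root rule arises.

```python
import math


def multiplicative_weights_update(weights, losses, eta=0.5):
    """One Hedge-style step (hypothetical helper).

    Groups with a higher loss get a larger sampling weight;
    the result is renormalized to sum to 1.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def sqrt_rollout_allocation(variance_proxies, mean_budget):
    """Real-valued rollouts per group, proportional to sqrt(variance proxy).

    Assumes equally sized groups, so the average over groups stays at
    mean_budget (a compute-neutral reallocation in this toy setting).
    """
    roots = [math.sqrt(v) for v in variance_proxies]
    total_rollouts = mean_budget * len(variance_proxies)
    return [total_rollouts * r / sum(roots) for r in roots]


# Three difficulty groups, easiest first (illustrative numbers only).
weights = multiplicative_weights_update([1 / 3, 1 / 3, 1 / 3], [0.1, 0.5, 0.9])
rollouts = sqrt_rollout_allocation([1.0, 4.0, 9.0], mean_budget=8)
print(weights)
print(rollouts)
```

In this toy setting the hardest group receives both the largest sampling weight and the most rollouts, while the average number of rollouts per group is unchanged. A practical version would also need to round the allocation to integers and handle groups of unequal size.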
