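arXivDaily daily academic digest: syncs the full arXiv dataset and provides AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.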

2511.01107 2026-03-03 cs.RO cs.LG

SLAP: Shortcut Learning for Abstract Planning

Y. Isabel Liu, Bowen Li, Benjamin Eysenbach, Tom Silver

Comments Published at the International Conference on Learning Representations (ICLR) 2026. Code available at https://github.com/isabelliu0/SLAP

2511.00177 2026-03-03 cs.LG cs.CL

Can SAEs reveal and mitigate racial biases of LLMs in healthcare?

Hiba Ahsan, Byron C. Wallace

Comments camera-ready ICLR 2026

2510.27492 2026-03-03 cs.CV

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Jiawei Gu, Yunzhuo Hao, Huichen Will Wang, Linjie Li, Michael Qizhe Shieh, Yejin Choi, Ranjay Krishna, Yu Cheng

Comments project page: https://thinkmorph.github.io/

2510.26144 2026-03-03 cs.AI

The FM Agent

Annan Li, Chufan Wu, Zengle Ge, Yee Hin Chong, Zhinan Hou, Lizhe Cao, Cheng Ju, Jianmin Wu, Huaiming Li, Haobo Zhang, Shenghao Feng, Mo Zhao, Fengzhi Qiu, Rui Yang, Mengmeng Zhang, Wenyi Zhu, Yingying Sun, Quan Sun, Shunhao Yan, Danyu Liu, Dawei Yin, Dou Shen

2510.25883 2026-03-03 cs.AI cs.IT math.IT

The Information-Theoretic Imperative: Compression and the Epistemic Foundations of Intelligence

Christian Dittrich, Jennifer Flygare Kinne

Comments 61 pages, 3 tables, 3 figures, 2 appendices. Submitted to arXiv for open access

2510.24711 2026-03-03 cs.CV

Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance

Yujie Wei, Shiwei Zhang, Hangjie Yuan, Yujin Han, Zhekai Chen, Jiayu Wang, Difan Zou, Xihui Liu, Yingya Zhang, Yu Liu, Hongming Shan

Comments Accepted to ICLR 2026

2510.24302 2026-03-03 cs.CL

Lookahead Tree-Based Rollouts for Enhanced Trajectory-Level Exploration in Reinforcement Learning with Verifiable Rewards

Shangyu Xing, Siyuan Wang, Chenyuan Yang, Xinyu Dai, Xiang Ren

2510.23607 2026-03-03 cs.CV

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Yujia Zhang, Xiaoyang Wu, Yixing Lao, Chengyao Wang, Zhuotao Tian, Naiyan Wang, Hengshuang Zhao

Comments NeurIPS 2025, produced by Pointcept, project page: https://pointcept.github.io/Concerto

Journal ref Neural Information Processing Systems 2025

2510.21314 2026-03-03 cs.LG cs.AI stat.ML

A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization

Xuan Tang, Jichu Li, Difan Zou

Comments 68 pages, 13 figures, ICLR 2026

2510.19807 2026-03-03 cs.CL cs.AI cs.LG

Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning

Xichen Zhang, Sitong Wu, Yinghao Zhu, Haoru Tan, Shaozuo Yu, Ziyi He, Jiaya Jia

Comments Accepted by ICLR 2026. Code: https://github.com/JIA-Lab-research/Scaf-GRPO

2510.19208 2026-03-03 cs.CL

DiSRouter: Distributed Self-Routing for LLM Selections

Hang Zheng, Hongshen Xu, Yongkai Lin, Shuai Fan, Lu Chen, Kai Yu

2510.18866 2026-03-03 cs.CL cs.AI cs.CV cs.LG cs.MA

LightMem: Lightweight and Efficient Memory-Augmented Generation

Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, Huajun Chen, Ningyu Zhang

Comments ICLR 2026

2510.18808 2026-03-03 cs.LG q-bio.NC

Does Feedback Alignment Work at Biological Timescales?

Marc Gong Bacvanski, Liu Ziyin, Tomaso Poggio

2510.18299 2026-03-03 cs.LG

Physics-Informed Parametric Bandits for Beam Alignment in mmWave Communications

Hao Qin, Thang Duong, Ming F. Li, Chicheng Zhang

2510.16234 2026-03-03 cs.AI cs.CL cs.LG

ScholarEval: Research Idea Evaluation Grounded in Literature

Hanane Nour Moussa, Patrick Queiroz Da Silva, Daniel Adu-Ampratwum, Alyson East, Zitong Lu, Nikki Puccetti, Mingyi Xue, Huan Sun, Bodhisattwa Prasad Majumder, Sachin Kumar

2510.12586 2026-03-03 cs.CV

There is No VAE: End-to-End Pixel-Space Generative Modeling via Self-Supervised Pre-training

Jiachen Lei, Keli Liu, Julius Berner, Haiming Yu, Hongkai Zheng, Jiahong Wu, Xiangxiang Chu

2510.12563 2026-03-03 cs.AI

HardcoreLogic: Challenging Large Reasoning Models with Long-tail Logic Puzzle Games

Jingcong Liang, Shijun Wan, Xuehai Wu, Yitong Li, Qianglong Chen, Duyu Tang, Siyuan Wang, Zhongyu Wei

2510.12462 2026-03-03 cs.AI cs.CR

Evaluating and Mitigating LLM-as-a-judge Bias in Communication Systems

Jiaxin Gao, Chen Chen, Yanwen Jia, Xueluan Gong, Kwok-Yan Lam, Qian Wang

2510.11769 2026-03-03 cs.LG cs.AI

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Ruida Wang, Jiarui Yao, Rui Pan, Shizhe Diao, Tong Zhang

2510.10575 2026-03-03 cs.CV

UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation

Zhengrong Yue, Haiyu Zhang, Xiangyu Zeng, Boyu Chen, Chenting Wang, Shaobin Zhuang, Lu Dong, Yi Wang, Limin Wang, Yali Wang

Comments ICLR 2026

2510.10125 2026-03-03 cs.RO cs.AI

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Yanjiang Guo, Lucy Xiaoyang Shi, Jianyu Chen, Chelsea Finn

Comments 17 pages

2510.07959 2026-03-03 cs.LG cs.AI

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

Alexander Rubinstein, Benjamin Raible, Martin Gubri, Seong Joon Oh

Comments ICLR'26; arXiv v2: add camera-ready

2510.07940 2026-03-03 cs.CV cs.AI cs.CL cs.LG cs.MM

TTOM: Test-Time Optimization and Memorization for Compositional Video Generation

Leigang Qu, Ziyang Wang, Na Zheng, Wenjie Wang, Liqiang Nie, Tat-Seng Chua

Comments ICLR 2026 Camera-ready. Project page: https://ttom-t2v.github.io/

2510.07233 2026-03-03 cs.CL

LAD-RAG: Layout-aware Dynamic RAG for Visually-Rich Document Understanding

Zhivar Sourati, Zheng Wang, Marianne Menglin Liu, Yazhe Hu, Mengqing Guo, Sujeeth Bharadwaj, Kyu Han, Tao Sheng, Sujith Ravi, Morteza Dehghani, Dan Roth

2510.06292 2026-03-03 cs.CV cs.AI

ChainMPQ: Interleaved Text-Image Reasoning Chains for Mitigating Relation Hallucinations

Yike Wu, Yiwei Wang, Yujun Cai

Comments Accepted by ICLR2026

2510.05064 2026-03-03 cs.LG

Boomerang Distillation Enables Zero-Shot Model Size Interpolation

Sara Kangaslahti, Nihal V. Nayak, Jonathan Geuter, Marco Fumero, Francesco Locatello, David Alvarez-Melis

Comments ICLR 2026

2510.04727 2026-03-03 cs.LG

Directional Sheaf Hypergraph Networks: Unifying Learning on Directed and Undirected Hypergraphs

Emanuele Mule, Stefano Fiorini, Antonio Purificato, Federico Siciliano, Stefano Coniglio, Fabrizio Silvestri

Comments Camera ready revision: accepted to ICLR 2026

2510.04284 2026-03-03 cs.AI

Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning

Yunghwei Lai, Kaiming Liu, Ziyue Wang, Weizhi Ma, Yang Liu

2510.04157 2026-03-03 cs.SD eess.AS

GDiffuSE: Diffusion-based speech enhancement with noise model guidance

Efrayim Yanir, David Burshtein, Sharon Gannot

2510.04080 2026-03-03 cs.CL

PoLi-RL: A Point-to-List Reinforcement Learning Framework for Conditional Semantic Textual Similarity

Zixin Song, Bowen Zhang, Qian-Wen Zhang, Di Yin, Xing Sun, Chunping Li

Details

English abstract

Conditional Semantic Textual Similarity (C-STS) measures the semantic proximity between text segments under a specific condition, thereby overcoming the ambiguity inherent in traditional STS. However, existing methods are largely confined to discriminative models, failing to fully leverage recent breakthroughs in the NLP community involving Large Language Models (LLMs) and Reinforcement Learning (RL). RL is a particularly well-suited paradigm for this task, as it can directly optimize the non-differentiable Spearman ranking metric and guide the reasoning process required by C-STS. Nevertheless, we find that naively applying listwise RL fails to produce meaningful improvements, as the model struggles with complex, coarse-grained reward signals, leading to optimization difficulties. To address this challenge, we introduce PoLi-RL, a novel Point-to-List Reinforcement Learning framework. PoLi-RL employs a two-stage curriculum: it first trains the model with a simple pointwise reward to establish fundamental scoring capabilities, then transitions to a hybrid reward that combines pointwise, pairwise, and listwise objectives to refine the model's ability to discern subtle semantic distinctions. Crucially, we propose an innovative Parallel Slice Ranking Reward (PSRR) mechanism that computes ranking rewards in parallel slices, where each slice consists of completions with the same index from different samples. This provides a precise, differentiated learning signal for each individual completion, enabling granular credit assignment and effective optimization. On the official C-STS benchmark, PoLi-RL achieves a Spearman correlation coefficient of 48.18, establishing a new SOTA for the cross-encoder architecture. As the first work to successfully apply RL to C-STS, our study introduces a powerful paradigm for aligning LLMs for complex, ranking-based conditional judgment tasks.
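The slice idea in the abstract above can be illustrated with a minimal sketch. This is one possible reading of "parallel slices", not the authors' implementation: the abstract does not give the reward formula, so the use of Spearman correlation as the per-slice reward, and the names `average_ranks`, `spearman`, and `slice_ranking_rewards`, are assumptions made for illustration. Here `pred[i][k]` is the score predicted by the k-th completion for sample i, and `gold[i]` is the reference label for sample i.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # Assumption: a constant slice earns a neutral reward.
    return cov / (var_x * var_y) ** 0.5


def slice_ranking_rewards(
    pred: Sequence[Sequence[float]], gold: Sequence[float]
) -> List[float]:
    """Hypothetical slice reward: slice k gathers the k-th completion of
    every sample and is scored by its rank agreement with the gold labels."""
    num_completions = len(pred[0])
    rewards = []
    for k in range(num_completions):
        slice_scores = [row[k] for row in pred]
        rewards.append(spearman(slice_scores, gold))
    return rewards


if __name__ == "__main__":
    pred = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]  # 3 samples, 2 completions each
    gold = [1.0, 2.0, 3.0]
    print(slice_ranking_rewards(pred, gold))
```

In this toy input, the first completion of each sample orders the three samples the same way as the gold labels, while the second completion reverses that order, so the two slices receive different rewards. Under this reading, completions of the same sample fall into different slices and therefore receive distinct signals, which is consistent with the "differentiated learning signal" the abstract describes.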
