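arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.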

2602.05323 2026-02-06 cs.LG cs.AI

GAS: Enhancing Reward-Cost Balance of Generative Model-assisted Offline Safe RL

Zifan Liu, Xinran Li, Shibo Chen, Jun Zhang

Abstract

Offline Safe Reinforcement Learning (OSRL) aims to learn a policy to achieve high performance in sequential decision-making while satisfying constraints, using only pre-collected datasets. Recent works, inspired by the strong capabilities of Generative Models (GMs), reformulate decision-making in OSRL as a conditional generative process, where GMs generate desirable actions conditioned on predefined reward and cost values. However, GM-assisted methods face two major challenges in OSRL: (1) lacking the ability to "stitch" optimal transitions from suboptimal trajectories within the dataset, and (2) struggling to balance reward targets with cost targets, particularly when they conflict. To address these issues, we propose Goal-Assisted Stitching (GAS), a novel algorithm designed to enhance stitching capabilities while effectively balancing reward maximization and constraint satisfaction. To enhance the stitching ability, GAS first augments and relabels the dataset at the transition level, enabling the construction of high-quality trajectories from suboptimal ones. GAS also introduces novel goal functions, which estimate the optimal achievable reward and cost goals from the dataset. These goal functions, trained using expectile regression on the relabeled and augmented dataset, allow GAS to accommodate a broader range of reward-cost return pairs and achieve a better tradeoff between reward maximization and constraint satisfaction compared to human-specified values. The estimated goals then guide policy training, ensuring robust performance under constrained settings. Furthermore, to improve training stability and efficiency, we reshape the dataset to achieve a more uniform reward-cost return distribution. Empirical results validate the effectiveness of GAS, demonstrating superior performance in balancing reward maximization and constraint satisfaction compared to existing methods.

2602.05311 2026-02-06 cs.LG cs.AI cs.RO cs.SY eess.SY

Formal Synthesis of Certifiably Robust Neural Lyapunov-Barrier Certificates

Chengxiao Wang, Haoze Wu, Gagandeep Singh

2602.05310 2026-02-06 cs.RO

Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework

Jipeng Kong, Xinzhe Liu, Yuhang Lin, Jinrui Han, Sören Schwertfeger, Chenjia Bai, Xuelong Li

Comments 13 pages, 9 figures, conference

2602.05307 2026-02-06 cs.CL

MentorCollab: Selective Large-to-Small Inference-Time Guidance for Efficient Reasoning

Haojin Wang, Yike Wang, Shangbin Feng, Hannaneh Hajishirzi, Yulia Tsvetkov

2602.05297 2026-02-06 cs.AI

Aspect-Aware MOOC Recommendation in a Heterogeneous Network

Seongyeub Chu, Jongwoo Kim, Mun Yong Yi

2602.05289 2026-02-06 cs.CL cs.AI cs.MA

Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science

Jingru Fan, Dewen Liu, Yufan Dang, Huatao Li, Yuheng Wang, Wei Liu, Feiyu Duan, Xuanwen Ding, Shu Yao, Lin Wu, Ruijie Shi, Wai-Shing Leung, Yuan Cheng, Zhongyu Wei, Cheng Yang, Chen Qian, Zhiyuan Liu, Maosong Sun

2602.05279 2026-02-06 cs.AI cs.CR

Hallucination-Resistant Security Planning with a Large Language Model

Kim Hammar, Tansu Alpcan, Emil Lupu

Comments Accepted to IEEE/IFIP Network Operations and Management Symposium 2026. To appear in the conference proceedings

2602.05275 2026-02-06 cs.CV

Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs

Qi Li, Yanzhe Zhao, Yongxin Zhou, Yameng Wang, Yandong Yang, Yuanjia Zhou, Jue Wang, Zuojian Wang, Jinxiang Liu

2602.05273 2026-02-06 cs.RO

Affordance-Aware Interactive Decision-Making and Execution for Ambiguous Instructions

Hengxuan Xu, Fengbo Lan, Zhixin Zhao, Shengjie Wang, Mengqiao Liu, Jieqian Sun, Yu Cheng, Tao Zhang

Comments 14 pages, 10 figures, 8 tables

2602.05271 2026-02-06 cs.CV

Unlocking Prototype Potential: An Efficient Tuning Framework for Few-Shot Class-Incremental Learning

Shengqin Jiang, Xiaoran Feng, Yuankai Qi, Haokui Zhang, Renlong Hang, Qingshan Liu, Lina Yao, Quan Z. Sheng, Ming-Hsuan Yang

Comments under review

2602.05269 2026-02-06 cs.LG cs.AI cs.CL

Hybrid Gated Flow (HGF): Stabilizing 1.58-bit LLMs via Selective Low-Rank Correction

David Alejandro Trejo Pizzo

Comments 21 pages, 4 figures, 6 tables. Code and models will be released at opencores.ai

2602.05266 2026-02-06 cs.AI

Beyond Cosine Similarity

Xinbo Ai

Comments 18 pages, 2 figures, 1 theorem, 3 corollaries

2602.05265 2026-02-06 cs.RO

Low-Cost Underwater In-Pipe Centering and Inspection Using a Minimal-Sensing Robot

Kalvik Jakkala, Jason O'Kane

2602.05262 2026-02-06 cs.CV

ReGLA: Efficient Receptive-Field Modeling with Gated Linear Attention Network

Junzhou Li, Manqi Zhao, Yilin Gao, Zhiheng Yu, Yin Li, Dongsheng Jiang, Li Xiao

Comments 11 pages, 4 figures

2602.05261 2026-02-06 cs.CL

Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

Fanfan Liu, Youyang Yin, Peng Shi, Siqi Yang, Zhixiong Zeng, Haibo Qiu

2602.05258 2026-02-06 cs.CL cs.AI cs.LG

CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs

Haoran Li, Sucheng Ren, Alan Yuille, Feng Wang

2602.05257 2026-02-06 cs.CV cs.RO

RFM-Pose: Reinforcement-Guided Flow Matching for Fast Category-Level 6D Pose Estimation

Diya He, Qingchen Liu, Cong Zhang, Jiahu Qin

Comments This work has been submitted to the IEEE for possible publication

2602.05251 2026-02-06 cs.LG

TADS: Task-Aware Data Selection for Multi-Task Multimodal Pre-Training

Guanjie Cheng, Boyi Li, Lingyu Sun, Mengying Zhu, Yangyang Wu, Xinkui Zhao, Shuiguang Deng

2602.05250 2026-02-06 cs.CV

Active Label Cleaning for Reliable Detection of Electron Dense Deposits in Transmission Electron Microscopy Images

Jieyun Tan, Shuo Liu, Guibin Zhang, Ziqi Li, Jian Geng, Lei Zhang, Lei Cao

Comments 10 pages, 6 figures

2602.05249 2026-02-06 cs.AI

Automatic Cognitive Task Generation for In-Situ Evaluation of Embodied Agents

Xinyi He, Ying Yang, Chuanjian Fu, Sihan Guo, Songchun Zhu, Lifeng Fan, Zhenliang Zhang, Yujia Peng

2602.05240 2026-02-06 cs.AI

Explainable AI: A Combined XAI Framework for Explaining Brain Tumour Detection Models

Patrick McGonagle, William Farrelly, Kevin Curran

2602.05238 2026-02-06 cs.CV cs.LG

PatchFlow: Leveraging a Flow-Based Model with Patch Features

Boxiang Zhang, Baijian Yang, Xiaoming Wang, Corey Vian

2602.05235 2026-02-06 cs.CL

FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters

Zhilin Liang, Yuxiang Wang, Zimu Zhou, Hainan Zhang, Boyi Liu, Yongxin Tong

Comments 11 pages

2602.05233 2026-02-06 cs.RO

MobileManiBench: Simplifying Model Verification for Mobile Manipulation

Wenbo Wang, Fangyun Wei, QiXiu Li, Xi Chen, Yaobo Liang, Chang Xu, Jiaolong Yang, Baining Guo

2602.05232 2026-02-06 cs.LG cs.AI

Balanced Anomaly-guided Ego-graph Diffusion Model for Inductive Graph Anomaly Detection

Chunyu Wei, Siyuan He, Yu Wang, Yueguo Chen, Yunhai Wang, Bing Bai, Yidong Zhang, Yong Xie, Shunming Zhang, Fei Wang

Comments 12 pages, 6 figures, Accepted by ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '26)

2602.05230 2026-02-06 cs.LG cs.AI stat.ML

ZeroS: Zero-Sum Linear Attention for Efficient Transformers

Jiecheng Lu, Xu Han, Yan Sun, Viresh Pati, Yubin Kim, Siddhartha Somani, Shihao Yang

Comments Camera-ready version. Accepted at NeurIPS 2025

Journal ref Proceedings of the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

2602.05219 2026-02-06 cs.LG

Private Prediction via Shrinkage

Chao Yan

2602.05218 2026-02-06 cs.CV

Boosting SAM for Cross-Domain Few-Shot Segmentation via Conditional Point Sparsification

Jiahao Nie, Yun Xing, Wenbin An, Qingsong Zhao, Jiawei Shao, Yap-Peng Tan, Alex C. Kot, Shijian Lu, Xuelong Li

2602.05215 2026-02-06 cs.CV

E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching

Jiahao Nie, Wenbin An, Gongjie Zhang, Yicheng Xu, Yap-Peng Tan, Alex C. Kot, Shijian Lu

2602.05213 2026-02-06 cs.CV

Dual-Representation Image Compression at Ultra-Low Bitrates via Explicit Semantics and Implicit Textures

Chuqin Zhou, Xiaoyue Ling, Yunuo Chen, Jincheng Dai, Guo Lu, Wenjun Zhang
