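arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.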

2604.13175 2026-04-17 cs.LG cs.AI q-bio.BM q-bio.QM

Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization

Aadyot Bhatnagar, Peter Mørch Groth, Ali Madani

Abstract

Large language models can be aligned with human preferences through offline reinforcement learning (RL) on small labeled datasets. While single-objective alignment is well-studied, many real-world applications demand the simultaneous optimization of multiple conflicting rewards, e.g. optimizing both catalytic activity and specificity in protein engineering, or helpfulness and harmlessness for chatbots. Prior work has largely relied on linear reward scalarization, but this approach provably fails to recover non-convex regions of the Pareto front. In this paper, instead of scalarizing the rewards directly, we frame multi-objective RL itself as an optimization problem to be scalarized via smooth Tchebysheff scalarization, a recent technique that overcomes the shortcomings of linear scalarization. We use this formulation to derive Smooth Tchebysheff Optimization of Multi-Objective Preferences (STOMP), a novel offline RL algorithm that extends direct preference optimization to the multi-objective setting in a principled way by standardizing the individual rewards based on their observed distributions. We empirically validate STOMP on a range of protein engineering tasks by aligning three autoregressive protein language models on three laboratory datasets of protein fitness. Compared to state-of-the-art baselines, STOMP achieves the highest hypervolumes in eight of nine settings according to both offline off-policy and generative evaluations. We thus demonstrate that STOMP is a powerful, robust multi-objective alignment algorithm that can meaningfully improve post-trained models for multi-attribute protein optimization and beyond.
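
The abstract's claim that linear scalarization cannot reach non-convex regions of the Pareto front can be illustrated with a small toy example. The sketch below is not the STOMP algorithm and not the authors' code: the three candidate points, the ideal point, the weight grid, and the smoothing parameter `mu` are all hypothetical, and the smooth Tchebysheff loss is written in its commonly used log-sum-exp form as an assumption about the general technique.

```python
import math

# Hypothetical candidates scored on two rewards (higher is better).
# C is Pareto-optimal but sits in a non-convex (dented) part of the front.
CANDIDATES = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (0.4, 0.4)}
IDEAL = (1.0, 1.0)  # assumed best value of each reward taken separately


def linear_score(rewards, weights):
    """Weighted sum of rewards; larger is better."""
    return sum(w * r for w, r in zip(weights, rewards))


def smooth_tchebysheff_loss(rewards, weights, mu=0.1):
    """Log-sum-exp smoothing of max_i w_i * (ideal_i - r_i); smaller is better."""
    gaps = [w * (z - r) / mu for w, z, r in zip(weights, IDEAL, rewards)]
    top = max(gaps)  # subtract the max for numerical stability
    return mu * (top + math.log(sum(math.exp(g - top) for g in gaps)))


def best_linear(weights):
    """Candidate that maximizes the weighted sum."""
    return max(CANDIDATES, key=lambda k: linear_score(CANDIDATES[k], weights))


def best_smooth_tchebysheff(weights, mu=0.1):
    """Candidate that minimizes the smooth Tchebysheff loss."""
    return min(
        CANDIDATES,
        key=lambda k: smooth_tchebysheff_loss(CANDIDATES[k], weights, mu),
    )


# Sweep weight vectors on the simplex and record which candidate wins.
grid = [(i / 20, 1 - i / 20) for i in range(21)]
linear_winners = {best_linear(w) for w in grid}
stch_winner = best_smooth_tchebysheff((0.5, 0.5))

print("linear winners over all weights:", sorted(linear_winners))
print("smooth Tchebysheff winner at equal weights:", stch_winner)
```

In this toy setup the weighted sum only ever selects the extreme points A or B, because C's score stays at 0.4 while the better of A and B always scores at least 0.5. The smooth Tchebysheff loss at equal weights selects the balanced point C.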

2604.13001 2026-04-17 cs.RO

XRZero-G0: Pushing the Frontier of Dexterous Robotic Manipulation with Interfaces, Quality and Ratios

James Wang, Primo Pu, Zephyr Fung, Alex Wang, Sam Wang, Bender Deng, Kevin Wang, Zivid Liu, Chris Pan, Panda Yang, Andy Zhai, Lucy Liang, Shalfun Li, Johnny Sun, Jacky Xu, Will Tian, Kai Yan, Kohler Ye, Scott Li, Qian Wang, Roy Gan, Hao Wang

Comments Technical Report

2604.11529 2026-04-17 cs.LG

TempusBench: An Evaluation Framework for Time-Series Forecasting

Denizalp Goktas, Gerardo Riaño-Briceño, Alif Abdullah, Aryan Nair, Chenkai Shen, Beatriz de Lucio, Alexandra Magnusson, Farhan Mashrur, Ahmed Abdulla, Shawrna Sen, Mahitha Thippireddy, Gregory Schwartz, Amy Greenwald

2604.11502 2026-04-17 cs.CL cs.AI

METER: Evaluating Multi-Level Contextual Causal Reasoning in Large Language Models

Pengfeng Li, Chen Huang, Chaoqun Hao, Hongyao Chen, Xiao-Yong Wei, Wenqiang Lei, See-Kiong Ng

Comments ACL 2026. Our code and dataset are available at https://github.com/SCUNLP/METER

2604.11026 2026-04-17 cs.LG cs.AI

Optimal Stability of KL Divergence under Gaussian Perturbations

Jialu Pan, Yufeng Zhang, Nan Hu, Zhenbang Chen, Ji Wang, Keqin Li

2604.10866 2026-04-17 cs.CL

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation

Xiaomeng Hu, Yinger Zhang, Fei Huang, Jianhong Tu, Yang Su, Lianghao Deng, Yuxuan Liu, Yantao Liu, Dayiheng Liu, Tsung-Yi Ho

Comments 23 pages, 8 figures, 2 tables. Project page: https://gregxmhu.github.io/OccuBench-website/

2604.05158 2026-04-17 cs.CL

Just Pass Twice: Efficient Token Classification with LLMs for Zero-Shot NER

Ahmed Ewais, Ahmed Hashish, Amr Ali

Comments 16 pages, 9 figures, 12 tables

2603.27844 2026-04-17 cs.CL

Model Capability Dominates: Inference-Time Optimization Lessons from AIMO 3

Natapong Nitarach

Comments 18 pages, 6 figures, 10 tables. Kaggle AIMO 3 competition entry. Code and notebooks: https://github.com/nat-nischw/model-capability-dominates-lessons-aimo3

2603.05493 2026-04-17 cs.RO

cuRoboV2: Dynamics-Aware Motion Generation with Depth-Fused Distance Fields for High-DoF Robots

Balakumar Sundaralingam, Adithyavairavan Murali, Stan Birchfield

Comments cuRoboV2 Technical Report with code url

Abstract

Effective robot autonomy requires motion generation that is safe, feasible, and reactive. Current methods are fragmented: fast planners output physically unexecutable trajectories, reactive controllers struggle with high-fidelity perception, and existing solvers fail on high-DoF systems. We present cuRoboV2, a unified framework with three key innovations: (1) B-spline trajectory optimization that enforces smoothness and torque limits; (2) a GPU-native TSDF/ESDF perception pipeline that generates dense signed distance fields covering the full workspace, unlike existing methods that only provide distances within sparsely allocated blocks, up to 10x faster and in 8x less memory than the state-of-the-art at manipulation scale, with up to 99% collision recall; and (3) scalable GPU-native whole-body computation, namely topology-aware kinematics, differentiable inverse dynamics, and map-reduce self-collision, that achieves up to 61x speedup while also extending to high-DoF humanoids (where previous GPU implementations fail). On benchmarks, cuRoboV2 achieves 99.7% success under 3kg payload (where baselines achieve only 72-77%), 99.6% collision-free IK on a 48-DoF humanoid (where prior methods fail entirely), and 89.5% retargeting constraint satisfaction (vs. 61% for PyRoki); these collision-free motions yield locomotion policies with 21% lower tracking error than PyRoki and 12x lower cross-seed variance than GMR. A ground-up codebase redesign for discoverability enabled LLM coding assistants to author up to 73% of new modules, including hand-optimized CUDA kernels, demonstrating that well-structured robotics code can unlock productive human-LLM collaboration. Together, these advances provide a unified, dynamics-aware motion generation stack that scales from single-arm manipulators to full humanoids. Code is available at https://github.com/NVlabs/curobo.

2603.03897 2026-04-17 cs.RO cs.AI cs.CL cs.HC cs.LG

IROSA: Interactive Robot Skill Adaptation using Natural Language

Markus Knauer, Samuel Bustamante, Thomas Eiband, Alin Albu-Schäffer, Freek Stulp, João Silvério

Comments Accepted IEEE Robotics and Automation Letters (RA-L) journal, 8 pages, 5 figures, 3 tables, 1 listing. Code available: https://github.com/DLR-RM/IROSA

2603.02196 2026-04-17 cs.AI cs.LG math.ST stat.ML stat.TH

Conformal Policy Control

Drew Prinster, Clara Fannjiang, Ji Won Park, Kyunghyun Cho, Anqi Liu, Suchi Saria, Samuel Stanton

2602.22175 2026-04-17 cs.CL

DySCO: Dynamic Attention-Scaling Decoding for Long-Context Language Models

Xi Ye, Wuwei Zhang, Fangcong Yin, Howard Yen, Danqi Chen

2602.20370 2026-04-17 cs.LG cs.NA math.NA

Quantitative Approximation Rates for Group Equivariant Learning

Jonathan W. Siegel, Snir Hordan, Hannah Lawrence, Ali Syed, Nadav Dym

2602.12389 2026-04-17 cs.AI cs.CL

Evolving Beyond Snapshots: Harmonizing Structure and Sequence via Entity State Tuning for Temporal Knowledge Graph Forecasting

Siyuan Li, Yunjia Wu, Yiyong Xiao, Pingyang Huang, Peize Li, Ruitong Liu, Yan Wen, Te Sun

2602.06930 2026-04-17 cs.LG math.OC math.ST stat.ML stat.TH

Continuous-time reinforcement learning: ellipticity enables model-free value function approximation

Wenlong Mou

Comments update from previous version: removed unnecessarily strong requirement on discount rate

2601.20868 2026-04-17 cs.LG cs.AI cs.NE

Rethinking LLM-Driven Heuristic Design: Generating Efficient and Specialized Solvers via Dynamics-Aware Optimization

Rongzheng Wang, Yihong Huang, Muquan Li, Jiakai Li, Di Liang, Bob Simons, Pei Ke, Shuang Liang, Ke Qin

2601.18675 2026-04-17 cs.LG cs.AI

Learning temporal embeddings from electronic health records of chronic kidney disease patients

Aditya Kumar, Mario A. Cypko, Oliver Amft

2601.16870 2026-04-17 cs.RO

A Multimodal Data Collection Framework for Dialogue-Driven Assistive Robotics to Clarify Ambiguities: A Wizard-of-Oz Pilot Study

Guangping Liu, Nicholas Hawkins, Billy Madden, Tipu Sultan, Flavio Esposito, Madi Babaiasl

Comments Accepted to IEEE RAS/EMBS 11th International Conference on Biomedical Robotics and Biomechatronics (BioRob) 2026

2601.14053 2026-04-17 cs.LG cs.AI cs.CV cs.MA eess.IV

LLMOrbit: A Circular Taxonomy of Large Language Models -From Scaling Walls to Agentic AI Systems

Badri N. Patro, Vijay S. Agneeswaran

2601.12145 2026-04-17 cs.LG

Threshold Differential Attention for Sink-Free, Ultra-Sparse, and Non-Dispersive Language Modeling

Xingyue Huang, Xueying Ding, Mingxuan Ju, Yozen Liu, Neil Shah, Tong Zhao

2601.10237 2026-04-17 cs.LG cs.CR

Fundamental Limitations of Favorable Privacy-Utility Guarantees for DP-SGD

Murat Bilgehan Ertan, Marten van Dijk

Comments Accepted at ACM CCS 2026

2601.08831 2026-04-17 cs.CV

3AM: 3egment Anything with Geometric Consistency in Videos

Yang-Che Sun, Cheng Sun, Chin-Yang Lin, Fu-En Yang, Min-Hung Chen, Yen-Yu Lin, Yu-Lun Liu

Comments Project page: https://jayisaking.github.io/3AM-Page/

2601.08310 2026-04-17 cs.LG cs.AI

ORBIT: On-policy Exploration-Exploitation for Controllable Multi-Budget Reasoning

Kun Liang, Clive Bai, Xin Xu, Chenming Tang, Sanwoo Lee, Weijie Liu, Saiyong Yang, Yunfang Wu

Comments Preprint

2601.07667 2026-04-17 cs.CL cs.AI cs.LG

Adaptive Layer Selection for Layer-Wise Token Pruning in LLM Inference

Rei Taniguchi, Yuyang Dong, Makoto Onizuka, Chuan Xiao

Comments ACL 2026 Findings. Source code available at https://github.com/TANIGUCHIREI/ASL

2601.06559 2026-04-17 cs.CV

ArrowGEV: Grounding Events in Video via Learning the Arrow of Time

Fangxu Yu, Ziyao Lu, Liqiang Niu, Fandong Meng, Jie Zhou

Comments Accepted to Findings of ACL 2026

2512.22185 2026-04-17 cs.CV

Beyond Augmentation: Cross-Modal Transformer Fusion with Bi-directional Attention for Low-Data Aneurysm Screening

Antara Titikhsha, Divyanshu Tak

Comments We had major improvements in this second draft. Please refer to this version only

2512.04884 2026-04-17 cs.RO

Hoi! - A Multimodal Dataset for Force-Grounded, Cross-View Articulated Manipulation

Tim Engelbracht, René Zurbrügg, Matteo Wohlrapp, Martin Büchner, Abhinav Valada, Marc Pollefeys, Hermann Blum, Zuria Bauer

2511.22112 2026-04-17 cs.LG

Toward Data-Driven Surrogates of the Solar Wind with Spherical Fourier Neural Operator

Reza Mansouri, Dustin Kempton, Pete Riley, Rafal Angryk

Comments International Conference on Machine Learning and Applications (ICMLA 2025)

2511.20645 2026-04-17 cs.CV

PixelDiT: Pixel Diffusion Transformers for Image Generation

Yongsheng Yu, Wei Xiong, Weili Nie, Yichen Sheng, Shiqiu Liu, Jiebo Luo

Comments Accepted to CVPR 2026

2511.19204 2026-04-17 cs.RO cs.SY eess.SY

Reference-Free Sampling-Based Model Predictive Control

Fabian Schramm, Pierre Fabre, Nicolas Perrin-Gilbert, Justin Carpentier

Comments Accepted to the 2026 IEEE International Conference on Robotics and Automation (ICRA), Vienna, Austria