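arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.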

2603.20230 2026-03-24 cs.RO cs.AI cs.LG

Beyond Scalar Rewards: Distributional Reinforcement Learning with Preordered Objectives for Safe and Reliable Autonomous Driving

Ahmed Abouelazm, Jonas Michel, Daniel Bogdoll, Philip Schörner, J. Marius Zöllner

Comments First and Second authors contributed equally; Accepted to the 2026 IEEE International Conference on Robotics and Automation (ICRA 2026)

Details

Abstract

Autonomous driving involves multiple, often conflicting objectives such as safety, efficiency, and comfort. In reinforcement learning (RL), these objectives are typically combined through weighted summation, which collapses their relative priorities and often yields policies that violate safety-critical constraints. To overcome this limitation, we introduce the Preordered Multi-Objective MDP (Pr-MOMDP), which augments standard MOMDPs with a preorder over reward components. This structure enables reasoning about actions with respect to a hierarchy of objectives rather than a scalar signal. To make this structure actionable, we extend distributional RL with a novel pairwise comparison metric, Quantile Dominance (QD), that evaluates action return distributions without reducing them to a single statistic. Building on QD, we propose an algorithm for extracting optimal subsets, the subsets of actions that remain non-dominated under each objective, which allows precedence information to shape both decision-making and training targets. Our framework is instantiated with Implicit Quantile Networks (IQN), establishing a concrete implementation while preserving compatibility with a broad class of distributional RL methods. Experiments in CARLA show improved success rates, fewer collisions and off-road events, and statistically more robust policies than IQN and ensemble-IQN baselines. By ensuring that policies respect the reward preorder, our work advances safer, more reliable autonomous driving systems.

2603.20224 2026-03-24 cs.CL

Beyond Test-Time Compute Strategies: Advocating Energy-per-Token in LLM Inference

Patrick Wilhelm, Thorsten Wittkopp, Odej Kao

2603.20222 2026-03-24 cs.CL

Linguistic Signatures for Enhanced Emotion Detection

Florian Lecourt, Madalina Croitoru, Konstantin Todorov

2603.20219 2026-03-24 cs.CL cs.LG

Thinking into the Future: Latent Lookahead Training for Transformers

Lorenzo Noci, Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Moin Nabi

2603.20218 2026-03-24 cs.CL cs.LG

An experimental study of KV cache reuse strategies in chunk-level caching systems

Samuel Cestola, Tianxiang Xia, Zheng Weiyan, Zheng Pengfei, Diego Didona

2603.20217 2026-03-24 cs.CL cs.LG

Expected Reward Prediction, with Applications to Model Routing

Kenan Hasanaliyev, Silas Alberti, Jenny Hamer, Dheeraj Rajagopal, Kevin Robinson, Jasper Snoek, Victor Veitch, Alexander Nicholas D'Amour

Comments ICML 2025 Workshop on Models of Human Feedback for AI Alignment

2603.20215 2026-03-24 cs.CL cs.LG

Multi-Agent Debate with Memory Masking

Hongduan Tian, Xiao Feng, Ziyuan Zhao, Xiangyu Zhu, Rolan Yan, Bo Han

Comments ICLR 2026

2603.20213 2026-03-24 cs.AI cs.CL cs.LG cs.NE

AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization

Jiaqi Yuan, Jialu Wang, Zihan Wang, Qingyun Sun, Ruijie Wang, Jianxin Li

2603.20212 2026-03-24 cs.CL cs.LG

Fast-Slow Thinking RM: Efficient Integration of Scalar and Generative Reward Models

Jiayun Wu, Peixu Hou, Shan Qu, Peng Zhang, Ning Gu, Tun Lu

2603.20208 2026-03-24 cs.CL cs.AI cs.CR

RedacBench: Can AI Erase Your Secrets?

Hyunjun Jeon, Kyuyoung Kim, Jinwoo Shin

2603.20206 2026-03-24 cs.CL cs.AI

Enhancing Safety of Large Language Models via Embedding Space Separation

Xu Zhao, Xiting Wang, Weiran Shen

2603.20200 2026-03-24 cs.RO cs.AI cs.CV

Your Robot Will Feel You Now: Empathy in Robots and Embodied Agents

Angelica Lim, Ö. Nilay Yalçin

Comments Accepted manuscript. Chapter in "Empathy and Artificial Intelligence: Challenges, Advances and Ethical Considerations" edited by Anat Perry; C. Daryl Cameron

2603.19078 2026-03-24 cs.RO

Articulated-Body Dynamics Network: Dynamics-Grounded Prior for Robot Learning

Sangwoo Shin, Kunzhao Ren, Xiaobin Xiong, Josiah P. Hanna

Comments Arxiv_r2

2603.17016 2026-03-24 cs.RO

Efficient and Reliable Teleoperation through Real-to-Sim-to-Real Shared Autonomy

Shuo Sha, Yixuan Wang, Binghao Huang, Antonio Loquercio, Yunzhu Li

Comments Project Page: https://residual-copilot.github.io/

2603.14672 2026-03-24 cs.CL cs.AI

Seamless Deception: Larger Language Models Are Better Knowledge Concealers

Dhananjay Ashok, Ruth-Ann Armstrong, Jonathan May

2603.13239 2026-03-24 cs.AI

Benchmarking Zero-Shot Reasoning Approaches for Error Detection in Solidity Smart Contracts

Eduardo Sardenberg, Antonio José Grandson Busson, Daniel de Sousa Moraes, Julio Cesar Duarte, Sérgio Colcher

2603.08964 2026-03-24 cs.AI cs.SY eess.SY

The FABRIC Strategy for Verifying Neural Feedback Systems

Samuel I. Akinwande, Sydney M. Katz, Mykel J. Kochenderfer, Clark Barrett

2603.06767 2026-03-24 cs.LG cs.AI

Failure Detection in Chemical Processes Using Symbolic Machine Learning: A Case Study on Ethylene Oxidation

Julien Amblard, Niklas Groll, Matthew Tait, Mark Law, Gürkan Sin, Alessandra Russo

Comments Accepted at AAAI-MAKE 2026

2602.12683 2026-03-24 cs.LG stat.ML

Flow Matching from Viewpoint of Proximal Operators

Kenji Fukumizu, Wei Huang, Han Bao, Shuntuo Xu, Nisha Chandramoorthy

Comments 38 pages, 6 figures

2601.20480 2026-03-24 cs.LG q-bio.NC

An explainable framework for the relationship between dementia and glucose metabolism patterns

C. Vázquez-García, F. J. Martínez-Murcia, F. Segovia Román, A. Forte, J. Ramírez, I. Illán, A. Hernández-Segura, C. Jiménez-Mesa, Juan M. Górriz

2511.17805 2026-03-24 cs.CV cs.AI

A Stitch in Time: Learning Procedural Workflow via Self-Supervised Plackett-Luce Ranking

Chengan Che, Chao Wang, Xinyue Chen, Sophia Tsoka, Luis C. Garcia-Peraza-Herrera

Comments Accepted at CVPR 2026 main conference

2511.17487 2026-03-24 cs.CV

Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models

Mark Endo, Serena Yeung-Levy

Comments CVPR 2026, website at https://web.stanford.edu/~markendo/projects/downscaling_intelligence

2510.17017 2026-03-24 cs.CL

SafeSearch: Do Not Trade Safety for Utility in LLM Search Agents

Qiusi Zhan, Angeline Budiman-Chan, Abdelrahman Zayed, Xingzhi Guo, Daniel Kang, Joo-Kyung Kim

Comments EACL 2026 Findings. Code available at https://github.com/amazon-science/SafeSearch

2510.07735 2026-03-24 cs.LG

GeoGen: A Two-stage Coarse-to-Fine Framework for Fine-grained Synthetic Location-based Social Network Trajectory Generation

Rongchao Xu, Kunlin Cai, Lin Jiang, Zhiqing Hong, Yuan Tian, Guang Wang

Details

DOI: 10.1609/aaai.v40i2.37111
Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 40(2), 1373-1381 (2026)

Abstract

Location-Based Social Network (LBSN) check-in trajectory data are important for many practical applications, such as POI recommendation, advertising, and pandemic intervention. However, high collection costs and ever-increasing privacy concerns prevent access to large-scale LBSN trajectory data. Recent advances in synthetic data generation offer a new way to obtain such data, using generative AI to produce synthetic data that preserves the characteristics of real data while ensuring privacy protection. However, generating synthetic LBSN check-in trajectories remains challenging due to their spatially discrete, temporally irregular nature and the complex spatio-temporal patterns caused by sparse activities and uncertain human mobility. To address this challenge, we propose GeoGen, a two-stage coarse-to-fine framework for large-scale LBSN check-in trajectory generation. In the first stage, we reconstruct spatially continuous, temporally regular latent movement sequences from the original LBSN check-in trajectories and then design a Sparsity-aware Spatio-temporal Diffusion model (S$^2$TDiff) with an efficient denoising network to learn their underlying behavioral patterns. In the second stage, we design Coarse2FineNet, a Transformer-based Seq2Seq architecture equipped with a dynamic context fusion mechanism in the encoder and a multi-task hybrid-head decoder, which generates fine-grained LBSN trajectories from coarse-grained latent movement sequences by modeling semantic relevance and behavioral uncertainty. Extensive experiments on four real-world datasets show that GeoGen outperforms state-of-the-art models in both fidelity and utility evaluations; for example, it improves the distance and radius metrics on the FS-TKY dataset by over 69% and 55%, respectively.

2510.02284 2026-03-24 cs.CV cs.AI cs.LG

Learning to Generate Rigid Body Interactions with Video Diffusion Models

David Romero, Ariana Bermudez, Viacheslav Iablochnikov, Hao Li, Fabio Pizzati, Ivan Laptev

2509.21861 2026-03-24 cs.LG

SpecMol: A Spectroscopy-Grounded Foundation Model for Multi-Task Molecular Learning

Shuaike Shen, Jiaqing Xie, Zhuo Yang, Antong Zhang, Shuzhou Sun, Ben Gao, Tianfan Fu, Biqing Qi, Yuqiang Li

2509.08482 2026-03-24 cs.LG

SHAining on Process Mining: Explaining Event Log Characteristics Impact on Algorithms

Andrea Maldonado, Christian M. M. Frey, Sai Anirudh Aryasomayajula, Ludwig Zellner, Stephan A. Fahrenkrog-Petersen, Thomas Seidl

2509.08157 2026-03-24 cs.RO cs.AI cs.MA

Risk-Bounded Multi-Agent Visual Navigation via Iterative Risk Allocation

Viraj Parimi, Brian C. Williams

Comments Published at ICAPS '26

2508.16753 2026-03-24 cs.CL

GAICo: A Deployed and Extensible Framework for Evaluating Diverse and Multimodal Generative AI Outputs

Nitin Gupta, Pallav Koppisetti, Kausik Lakkaraju, Biplav Srivastava

Comments 11 pages, 7 figures; accepted at IAAI/AAAI 2026; (updated) extended version

2508.14828 2026-03-24 cs.CL cs.AI cs.LG

Long Chain-of-Thought Reasoning Across Languages

Josh Barua, Seun Eisape, Kayo Yin, Alane Suhr

Comments Accepted to ICLR 2026. v1 is a workshop version accepted to SCALR @ COLM 2025