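arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.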

2510.05486 2026-04-17 cs.CL

Language Model as Planner and Formalizer under Constraints

Cassie Huang, Stuti Mohan, Ziyi Yang, Stefanie Tellex, Li Zhang

Comments In ACL 2026 main conference

Abstract

LLMs have been widely used in planning, either as planners to generate action sequences end-to-end, or as formalizers to represent the planning domain and problem in a formal language that can derive plans deterministically. However, both lines of work rely on standard benchmarks that include only generic and simplistic environmental specifications, leading to potential overestimation of the planning ability of LLMs and safety concerns in downstream tasks. We bridge this gap by augmenting widely used planning benchmarks with manually annotated, fine-grained, and rich natural language constraints spanning four formally defined categories. Over 4 state-of-the-art reasoning LLMs, 4 formal languages, and 4 datasets, we show that the introduction of one-sentence constraints consistently halves performance, indicating current LLMs' lack of robustness and an avenue for future research.

2510.03851 2026-04-17 cs.AI

MetaMuse: Algorithm Generation via Creative Ideation

Ruiying Ma, Chieh-Jan Mike Liang, Yanjie Gao, Francis Y. Yan

Comments ICLR 2026

2510.02738 2026-04-17 cs.RO cs.LG

Flow with the Force Field: Learning 3D Compliant Flow Matching Policies from Force and Demonstration-Guided Simulation Data

Tianyu Li, Yihan Li, Zizhe Zhang, Nadia Figueroa

Comments Accepted to ICRA 2026

2510.02539 2026-04-17 cs.CL cs.IR

Hierarchical Semantic Retrieval with Cobweb

Anant Gupta, Karthik Singaravadivelan, Zekun Wang

Comments 20 pages, 7 tables, 4 figures

2510.01433 2026-04-17 cs.RO cs.AI

AFFORD2ACT: Affordance-Guided Automatic Keypoint Selection for Generalizable and Lightweight Robotic Manipulation

Anukriti Singh, Kasra Torshizi, Khuzema Habib, Kelin Yu, Ruohan Gao, Pratap Tokekar

2509.23638 2026-04-17 cs.LG

LayerScope: Predictive Cross-Layer Scheduling for Efficient Multi-Batch MoE Inference on Legacy Servers

Enda Yu, Dezun Dong, Zhaoning Zhang, Zhe Bai, Weiling Yang, Haojie Wang, Dongsheng Li, Yongwei Wu, Xiangke Liao

Comments To be published in ICS 2026

2509.23468 2026-04-17 cs.RO cs.AI cs.LG

Multi-Modal Manipulation via Multi-Modal Policy Consensus

Haonan Chen, Jiaming Xu, Hongyu Chen, Kaiwen Hong, Binghao Huang, Chaoqi Liu, Jiayuan Mao, Yunzhu Li, Yilun Du, Katherine Driggs-Campbell

Comments 8 pages, 7 figures. Project website: https://policyconsensus.github.io

2509.23249 2026-04-17 cs.LG cs.NA math.NA

Deep Learning for Subspace Regression

Vladimir Fanaskov, Vladislav Trifonov, Alexander Rudikov, Ekaterina Muravleva, Ivan Oseledets

Comments Accepted to ICLR 2026, reviewed at https://openreview.net/forum?id=HF60Lu1Maj

2509.22378 2026-04-17 cs.SD cs.AI cs.MM eess.AS

Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach

Zijian Zhao, Dian Jin, Zijing Zhou

2509.20869 2026-04-17 cs.LG cs.AI

Model-Based Reinforcement Learning under Random Observation Delays

Armin Karamzade, Kyungmin Kim, JB Lanier, Davide Corsi, Roy Fox

2509.15602 2026-04-17 cs.CV

TennisTV: Do Multimodal Large Language Models Understand Tennis Rallies?

Zhongyuan Bao, Lejun Zhang

2509.14255 2026-04-17 cs.CL cs.AI

Cosine-Similarity Routing with Semantic Anchors for Interpretable Mixture-of-Experts Language Models

Ivan Ternovtsii, Yurii Bilak

Comments 23 pages, 6 figures. Code available at https://github.com/ITernovtsii/semantic-resonance. Preprint

2509.14003 2026-04-17 cs.SD cs.AI

RFM-Editing: Rectified Flow Matching for Text-guided Audio Editing

Liting Gao, Yi Yuan, Yaru Chen, Yuelan Cheng, Zhenbo Li, Juan Wen, Shubin Zhang, Wenwu Wang

Comments Accepted to ICASSP 2026

2509.12833 2026-04-17 cs.LG

Safe Reinforcement Learning using Action Projection: Safeguard the Policy or the Environment?

Hannah Markgraf, Shambhuraj Sawant, Hanna Krasowski, Lukas Schäfer, Sebastien Gros, Matthias Althoff

2509.12712 2026-04-17 cs.SD cs.IR

A Lightweight Two-Branch Architecture for Multi-Instrument Transcription via Note-Level Contrastive Clustering

Ruigang Li, Yongxu Zhu

Comments Published in TISMIR, Vol. 9, No. 1, pp. 119-130, 2026

2509.06593 2026-04-17 cs.RO

A Robust Approach for LiDAR-Inertial Odometry Without Sensor-Specific Modeling

Meher V. R. Malladi, Tiziano Guadagnino, Luca Lobefaro, Cyrill Stachniss

2509.06591 2026-04-17 cs.CV

Hybrid Swin Attention Networks for Simultaneously Low-Dose PET and CT Denoising

Yichao Liu, Hengzhi Xue, YueYang Teng, Junwen Guo

2509.03472 2026-04-17 cs.LG cs.AI cs.DC

DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling

Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar

2508.20705 2026-04-17 cs.LG cs.AI

EEGDM: Learning EEG Representation with Latent Diffusion Model

Shaocong Wang, Tong Liu, Yihan Li, Ming Li, Kairui Wen, Pei Yang, Wenqi Ji, Minjing Yu, Yong-Jin Liu

2508.10962 2026-04-17 cs.CV

CSNR and JMIM Based Spectral Band Selection for Reducing Metamerism in Urban Driving

Jiarong Li, Imad Ali Shah, Diarmaid Geever, Fiachra Collins, Enda Ward, Martin Glavin, Edward Jones, Brian Deegan

Comments Under Review at IEEE OJITS, July, 2025

DOI: 10.1109/OJITS.2026.3680774
Journal ref: IEEE Open Journal of Intelligent Transportation Systems, vol. 7, pp. 1021-1033, 2026

Abstract

Protecting Vulnerable Road Users (VRU) is a critical safety challenge for automotive perception systems, particularly under visual ambiguity caused by metamerism, a phenomenon where distinct materials appear similar in RGB imagery. This work investigates hyperspectral imaging (HSI) to overcome this limitation by capturing unique material signatures beyond the visible spectrum, especially in the Near-Infrared (NIR). To manage the inherent high-dimensionality of HSI data, we propose a band selection strategy that integrates information theory techniques (joint mutual information maximization, correlation analysis) with a novel application of an image quality metric (contrast signal-to-noise ratio) to identify the most spectrally informative bands. Using the Hyperspectral City V2 (H-City) dataset, we identify three informative bands (497 nm, 607 nm, and 895 nm, $\pm$27 nm) and reconstruct pseudo-color images for comparison with co-registered RGB. Quantitative results demonstrate increased dissimilarity and perceptual separability of VRU from the background. The selected HSI bands yield improvements of 70.24%, 528.46%, 1206.83%, and 246.62% for dissimilarity (Euclidean, SAM, $T^2$) and perception (CIE $\Delta E$) metrics, consistently outperforming RGB and confirming a marked reduction in metameric confusion. By providing a spectrally optimized input, our method enhances VRU separability, establishing a robust foundation for downstream perception tasks in Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD), ultimately contributing to improved road safety.
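The dissimilarity figures quoted in this abstract rest on standard spectral distance measures. As a minimal sketch (not the authors' code), the Euclidean distance and the spectral angle mapper (SAM) between two pixel spectra, together with a relative improvement in percent, might be computed as below. The function names and the three-band sample values are hypothetical and chosen only for illustration; they are not taken from the paper or the H-City dataset.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spectral_angle(a, b):
    """Spectral angle (SAM) in radians between two spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectral angle is undefined for a zero spectrum")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


def relative_improvement(baseline, new):
    """Relative change of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Hypothetical three-band values for a VRU pixel and a background pixel.
vru = [0.20, 0.35, 0.60]
background = [0.22, 0.30, 0.25]

print("Euclidean:", euclidean_distance(vru, background))
print("SAM (rad):", spectral_angle(vru, background))
```

Under this definition of relative improvement, a reported gain of 528.46% corresponds to a metric value roughly 6.28 times the baseline.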

2508.05015 2026-04-17 cs.LG cs.AI

SPaCe: Unlocking Sample-Efficient Large Language Models Training With Self-Pace Curriculum Learning

Dai Do, Manh Nguyen, Svetha Venkatesh, Hung Le

2508.03341 2026-04-17 cs.AI

What Deserves Memory: Adaptive Memory Distillation for LLM Agents

Wenquan Ma, Jiayan Nan, Wenlong Wu, Yize Chen

2507.15351 2026-04-17 cs.AI cs.ET cs.MA

One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

Zijian Zhao, Sen Li

2507.11812 2026-04-17 cs.SD eess.AS eess.SP

A Multimodal Data Fusion Generative Adversarial Network for Real Time Underwater Sound Speed Field Construction

Wei Huang, Yuqiang Huang, Jixuan Zhou, Fang Ji, Hao Zhang, Tianhe Xu

2506.14121 2026-04-17 cs.CV

FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution

Siyu Xu, Wenjie Li, Guangwei Gao, Jian Yang, Guo-Jun Qi, Chia-Wen Lin

Comments 12 pages, 10 figures, 8 tables

2506.13763 2026-04-17 cs.LG cs.AI cs.CV stat.ML

Diagnosing and Improving Diffusion Models by Estimating the Optimal Loss Value

Yixian Xu, Shengjie Luo, Liwei Wang, Di He, Chang Liu

Comments 33 pages, 12 figures, 9 tables. ICLR 2026 Camera Ready version

2506.09457 2026-04-17 cs.CL cs.LG

Towards Bridging the Reward-Generation Gap in Direct Alignment Algorithms

Zeguan Xiao, Yun Chen, Guanhua Chen, Ke Tang

Comments Findings of ACL 2026

2506.00433 2026-04-17 cs.CV cs.LG eess.IV

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis

Luigi Sigillo, Shengfeng He, Danilo Comminiello

Comments Accepted at ICLR 2026

2505.20122 2026-04-17 cs.CV

MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models

Anh Thai, Stefan Stojanov, Zixuan Huang, Bikram Boote, James M. Rehg

2505.18129 2026-04-17 cs.CV cs.CL

One RL to See Them All: Visual Triple Unified Reinforcement Learning

Yan Ma, Linge Du, Xuyang Shen, Shaoxiang Chen, Pengfei Li, Qibing Ren, Lizhuang Ma, Yuchao Dai, Pengfei Liu, Junjie Yan

Comments Technical Report