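arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.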

2603.10225 2026-04-30 cs.LG cs.AI

Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

Maxwell Miller-Golub, Collin Coil, Kamil Faber, Marcin Pietron, Panpan Zheng, Pasquale Minervini, Roberto Corizzo

Abstract

Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is a distance-based alternative grounded in Euclidean geometry that improves interpretability and mitigates phenomena such as grokking, or delayed generalization on the test set. However, the study of harmonic loss remains narrow: only Euclidean distance is explored, and no systematic evaluation of computational efficiency or sustainability was conducted. We extend harmonic loss by systematically investigating a broad spectrum of distance metrics as replacements for the Euclidean distance. We comprehensively evaluate distance-tailored harmonic losses on both vision backbones and large language models. Our analysis is framed around a three-way evaluation of model performance, interpretability, and sustainability. On vision tasks, cosine distances provide the most favorable trade-off, consistently improving accuracy while lowering carbon emissions, whereas Bray-Curtis and Mahalanobis further enhance interpretability at varying efficiency costs. On language models, cosine-based harmonic losses improve gradient and learning stability, strengthen representation structure, and reduce emissions relative to cross-entropy and Euclidean heads. Our code is available at: https://anonymous.4open.science/r/rethinking-harmonic-loss-5BAB/.

2603.09145 2026-04-30 cs.LG cs.AI

Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

Zhen Zhang, Jielei Chu, Jiangtao Hu, Bin Liu, Jie Wang, Ya Liu, Tianrui Li

2603.08182 2026-04-30 cs.CL cs.AI

TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation

Toms Bergmanis, Martins Kronis, Ingus Jānis Pretkalniņš, Dāvis Nicmanis, Jeļizaveta Jelinska, Roberts Rozis, Rinalds Vīksna, Mārcis Pinnis

Comments LREC 2026

2603.07080 2026-04-30 cs.RO cs.LG

VLN-Cache: Enabling Token Caching for VLN Models with Visual/Semantic Dynamics Awareness

Zihao Zheng, Zhihao Mao, Xingyue Zhou, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Zhaobo Zhang, Xuanzhe Liu, Donggang Cao, Hong Mei, Xiang Chen

2603.06752 2026-04-30 cs.LG cs.NA math.NA stat.ME stat.ML

Latent Autoencoder Ensemble Kalman Filter for Nonlinear Data assimilation

Xin T. Tong, Yanyan Wang, Liang Yan

2603.06635 2026-04-30 cs.LG

Graph Property Inference in Small Language Models: Effects of Representation and Reasoning Strategy

Michal Podstawski

2603.06198 2026-04-30 cs.CL

LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation

Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi, Masaki Otsuki

Comments Published as a conference paper at LREC 2026

2603.05959 2026-04-30 cs.CV

OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer

Si-Yu Lu, Po-Ting Chen, Hui-Che Hsu, Sin-Ye Jhong, Wen-Huang Cheng, Yung-Yao Chen

Comments Project page: https://vaisr.github.io/OVGGT/ Code: https://github.com/VAISR/OVGGT

2603.05811 2026-04-30 cs.CV

Video Compression Meets Video Generation: Latent Inter-Frame Pruning with Attention Recovery

Dennis Menn, Yuedong Yang, Bokun Wang, Xiwen Wei, Mustafa Munir, Feng Liang, Radu Marculescu, Chenfeng Xu, Diana Marculescu

2603.04337 2026-04-30 cs.CV cs.CL

Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection

Dacheng Qi, Chenyu Wang, Jingwei Xu, Tianzhe Chu, Zibo Zhao, Wen Liu, Wenrui Ding, Yi Ma, Shenghua Gao

Comments Accepted by CVPR2026

Abstract

Constructing computer-aided design (CAD) models is labor-intensive but essential for engineering and manufacturing. Recent advances in Large Language Models (LLMs) have inspired the LLM-based CAD generation by representing CAD as command sequences. But these methods struggle in practical scenarios because command sequence representation does not support entity selection (e.g. faces or edges), limiting its ability to support complex editing operations such as chamfer or fillet. Further, the discretization of a continuous variable during sketch and extrude operations may result in topological errors. To address these limitations, we present Pointer-CAD, a novel LLM-based CAD generation framework that leverages a pointer-based command sequence representation to explicitly incorporate the geometric information of B-rep models into sequential modeling. In particular, Pointer-CAD decomposes CAD model generation into steps, conditioning the generation of each subsequent step on both the textual description and the B-rep generated from previous steps. Whenever an operation requires the selection of a specific geometric entity, the LLM predicts a Pointer that selects the most feature-consistent candidate from the available set. Such a selection operation also reduces the quantization error in the command sequence-based representation. To support the training of Pointer-CAD, we develop a data annotation pipeline that produces expert-level natural language descriptions and apply it to build a dataset of approximately 575K CAD models. Extensive experimental results demonstrate that Pointer-CAD effectively supports the generation of complex geometric structures and reduces segmentation error to an extremely low level, achieving a significant improvement over prior command sequence methods, thereby significantly mitigating the topological inaccuracies introduced by quantization error.

2603.02854 2026-04-30 cs.RO cs.AI

CoFL: Continuous Flow Fields for Language-Conditioned Navigation

Haokun Liu, Zhaoqi Ma, Yicheng Chen, Masaki Kitagawa, Wentao Zhang, Zicen Xiong, Jinjie Li, Moju Zhao

Comments 18 pages, 13 figures

2603.01999 2026-04-30 cs.RO cs.CV cs.LG

Learning Vision-Based Omnidirectional Navigation: A Teacher-Student Approach Using Monocular Depth Estimation

Jan Finke, Wayne Paul Martis, Adrian Schmelter, Lars Erbach, Christian Jestel, Marvin Wiedemann

2602.23024 2026-04-30 cs.RO

InCoM: Intent-Driven Perception and Structured Coordination for Mobile Manipulation

Jiahao Liu, Cui Wenbo, Zhongpu Xia, Haoran Li, Dongbin Zhao

2602.21720 2026-04-30 cs.CL cs.AI

Evaluating the relationship between regularity and learnability in recursive numeral systems using Reinforcement Learning

Andrea Silvi, Ponrawee Prasertsom, Jennifer Culbertson, Devdatt Dubhashi, Moa Johansson, Kenny Smith

2602.20426 2026-04-30 cs.AI

Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use

Ruocheng Guo, Kaiwen Dong, Xiang Gao, Kamalika Das

Comments Preprint

2602.19179 2026-04-30 cs.RO cs.SY eess.SY

Distributional Stability of Tangent-Linearized Gaussian Inference on Smooth Manifolds

Junghoon Seo, Hakjin Lee, Jaehoon Sim

Comments To appear in IEEE Robotics and Automation Letters (IEEE RA-L)

2602.17166 2026-04-30 cs.RO cs.SY eess.SY

Geometric Inverse Flight Dynamics on SO(3) and Application to Tethered Fixed-Wing Aircraft

Antonio Franchi, Chiara Gabellieri

Comments ACCEPTED ICUAS 2026

2602.13780 2026-04-30 cs.CV

Foundation Model-Driven Semantic Change Detection in Remote Sensing Imagery

Hengtong Shen, Li Yan, Hong Xie, Yaxuan Wei, Xinhao Li, Wenfei Shen, Peixian Lv, Fei Tan

2602.11731 2026-04-30 cs.CL

Thinking with Drafting: Optical Decompression via Logical Reconstruction

Jingxuan Wei, Honghao He, Caijun Jia, Siyuan Li, Zheng Sun, Yuhang Xu, Yuanyuan Lin, Linzhuang Sun, Yuchen Wu, Bihui Yu, Xiangxiang Zhang, Cheng Tan

2602.08826 2026-04-30 cs.CL cs.AI

Affective Flow Language Model for Emotional Support Conversation

Chenghui Zou, Ning Wang, Tiesunlong Shen, Luwei Xiao, Chuan Ma, Xiangpeng Li, Rui Mao, Erik Cambria

Comments 19 pages, 7 figures

2602.08373 2026-04-30 cs.AI cs.LG

Grounding Generative Planners in Verifiable Logic: A Hybrid Architecture for Trustworthy Embodied AI

Feiyu Wu, Xu Zheng, Yue Qu, Zhuocheng Wang, Zicheng Feng, Hui Li

Comments Accepted to ICLR 2026. Project page: https://openreview.net/forum?id=wb05ver1k8&noteId=v1Ax8CwI71

2602.06603 2026-04-30 cs.LG

The hidden risks of temporal resampling in clinical reinforcement learning

Thomas Frost, Hrisheekesh Vaidya, Steve Harris

Comments 12 pages, 6 figures, 3 tables. v3 updates with lit rev table

2602.03558 2026-04-30 cs.CV cs.AI cs.MM

ELIQ: A Label-Free Framework for Quality Assessment of Evolving AI-Generated Images

Xinyue Li, Zhiming Xu, Min Tang, Zhaolin Cai, Sijing Wu, Xiongkuo Min, Yitong Chen, Guangtao Zhai

2602.03467 2026-04-30 cs.AI cs.HC

The Dual Role of Abstracting over the Irrelevant in Symbolic Explanations: Cognitive Effort vs. Understanding

Zeynep G. Saribatur, Johannes Langer, Ute Schmid

Comments To appear in the Proceedings of the 48th Annual Meeting of the Cognitive Science Society (CogSci 2026)

2602.03412 2026-04-30 cs.CL

Verified Critical Step Optimization for LLM Agents

Mukai Li, Qingcheng Zeng, Tianqing Fang, Zhenwen Liang, Linfeng Song, Qi Liu, Haitao Mi, Dong Yu

Comments ACL 2026 Findings

2602.01297 2026-04-30 cs.AI

RE-MCDF: Closed-Loop Multi-Expert LLM Reasoning for Knowledge-Grounded Clinical Diagnosis

Shaowei Shen, Xiaohong Yang, Jie Yang, Lianfen Huang, Yongcai Zhang, Yang Zou, Seyyedali Hosseinalipour

Comments Accepted by International Joint Conference on Neural Networks (IJCNN 2026); 9 pages, 4 figures

Abstract

Electronic medical records (EMRs), particularly in neurology, are inherently heterogeneous, sparse, and noisy, which poses significant challenges for large language models (LLMs) in clinical diagnosis. In such settings, single-agent systems are vulnerable to self-reinforcing errors, as their predictions lack independent validation and can drift toward spurious conclusions. Although recent multi-agent frameworks attempt to mitigate this issue through collaborative reasoning, their interactions are often shallow and loosely structured, failing to reflect the rigorous, evidence-driven processes used by clinical experts. More fundamentally, existing approaches largely ignore the rich logical dependencies among diseases, such as mutual exclusivity, pathological compatibility, and diagnostic confusion. This limitation prevents them from ruling out clinically implausible hypotheses, even when sufficient evidence is available. To overcome these, we propose RE-MCDF, a relation-enhanced multi-expert clinical diagnosis framework. RE-MCDF introduces a generation--verification--revision closed-loop architecture that integrates three complementary components: (i) a primary expert that generates candidate diagnoses and supporting evidence, (ii) a laboratory expert that dynamically prioritizes heterogeneous clinical indicators, and (iii) a multi-relation awareness and evaluation expert group that explicitly enforces inter-disease logical constraints. Guided by a medical knowledge graph (MKG), the first two experts adaptively reweight EMR evidence, while the expert group validates and corrects candidate diagnoses to ensure logical consistency. Extensive experiments on the neurology subset of CMEMR (NEEMRs) and on our curated dataset (XMEMRs) demonstrate that RE-MCDF consistently outperforms state-of-the-art baselines in complex diagnostic scenarios (https://github.com/shenshaowei/RE-MCDF).

2601.21459 2026-04-30 cs.LG cs.AI

HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing

Chengyu Du, Xintao Wang, Aili Chen, Weiyuan Li, Rui Xu, Junteng Liu, Zishan Huang, Rong Tian, Zijun Sun, Yuhao Li, Liheng Feng, Deming Ding, Pengyu Zhao, Yanghua Xiao

Comments Findings of ACL, 2026

2601.18339 2026-04-30 cs.SD cs.LG

A Dataset for Automatic Vocal Mode Classification

Reemt Hinrichs, Sonja Stephan, Alexander Lange, Jörn Ostermann

Comments Extended manuscript of our article in the proceedings of EvoMUSART 2026: 15th International Conference on Artificial Intelligence in Music, Sound, Art and Design. Tiny corrigendum to v1, where the pitch distribution showed an incorrect F1; the true lowest note of the dataset is a B1.

2601.15808 2026-04-30 cs.AI

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Yuxuan Wan, Tianqing Fang, Zaitang Li, Yintong Huo, Wenxuan Wang, Haitao Mi, Dong Yu, Michael R. Lyu

Comments ACL'2026-Findings

2601.13969 2026-04-30 cs.AI cs.IR cs.LG

Autonomous Knowledge Graph Exploration with Adaptive Breadth-Depth Retrieval

Joaquín Polonuer, Lucas Vittor, Iñaki Arango, Ayush Noori, David A. Clifton, Luciano Del Corro, Marinka Zitnik

Comments Accepted at ACL 2026 Main Conference