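arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.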

2507.09264 2026-03-06 cs.LG cs.AI eess.IV

Overtone: Cyclic Patch Modulation for Clean, Efficient, and Flexible Physics Emulators

Payel Mukhopadhyay, Michael McCabe, Ruben Ohana, Miles Cranmer

Comments 48 pages, 24 Figures. For code, see https://github.com/payelmuk150/patch-modulator

Details

Journal ref: Published as a conference paper at ICLR 2026

Abstract

Transformer-based PDE surrogates achieve remarkable performance but face two key challenges: fixed patch sizes cause systematic error accumulation at harmonic frequencies, and computational costs remain inflexible regardless of problem complexity or available resources. We introduce Overtone, a unified solution through dynamic patch size control at inference. Overtone's key insight is that cyclically modulating patch sizes during autoregressive rollouts distributes errors across the frequency spectrum, mitigating the systematic harmonic artifact accumulation that plagues fixed-patch models. We implement this through two architecture-agnostic modules--CSM (using dynamic stride modulation) and CKM (using dynamic kernel resizing)--that together provide both harmonic mitigation and compute-adaptive deployment. This flexible tokenization lets users trade accuracy for speed dynamically based on computational constraints, and the cyclic rollout strategy yields up to 40% lower long rollout error in variance-normalised RMSE (VRMSE) compared to conventional, static-patch surrogates. Across challenging 2D and 3D PDE benchmarks, one Overtone model matches or exceeds fixed-patch baselines across inference compute budgets, when trained under a fixed total training budget setting.


2507.07999 2026-03-06 cs.CV cs.AI cs.CL

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Haochen Wang, Xiangtai Li, Zilong Huang, Anran Wang, Jiacong Wang, Tao Zhang, Jiani Zheng, Sule Bai, Zijian Kang, Jiashi Feng, Zhuochen Wang, Zhaoxiang Zhang

Comments ICLR 2026 Camera Ready Version

2507.01853 2026-03-06 cs.CL

Eka-Eval: An Evaluation Framework for Low-Resource Multilingual Large Language Models

Samridhi Raj Sinha, Rajvee Sheth, Abhishek Upperwal, Mayank Singh

2507.01785 2026-03-06 cs.CL cs.AI cs.LG

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

Zhixun Chen, Ping Guo, Wenhan Han, Yifan Zhang, Binbin Liu, Haobin Lin, Fengze Liu, Yan Zhao, Bingni Zhang, Taifeng Wang, Yin Zheng, Trevor Cohn, Meng Fang

Comments NeurIPS 2025 poster

2507.00999 2026-03-06 cs.CL

La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America

María Grandury, Javier Aula-Blasco, Júlia Falcão, Clémentine Fourrier, Miguel González, Gonzalo Martínez, Gonzalo Santamaría, Rodrigo Agerri, Nuria Aldama, Luis Chiruzzo, Javier Conde, Helena Gómez, Marta Guerrero, Guido Ivetta, Natalia López, Flor Miriam Plaza-del-Arco, María Teresa Martín-Valdivia, Helena Montoro, Carmen Muñoz, Pedro Reviriego, Leire Rosado, Alejandro Vaca, María Estrella Vallecillo-Rodríguez, Jorge Vallego, Irune Zubiaga

Comments Accepted at ACL 2025 Main

2507.00677 2026-03-06 cs.RO

Walk Like Dogs: Learning Steerable Imitation Controllers for Legged Robots from Unlabeled Motion Data

Dongho Kang, Jin Cheng, Fatemeh Zargarbashi, Taerim Yoon, Sungjoon Choi, Stelian Coros

Comments The supplementary video is available at https://youtu.be/DukyUGNYf5A

2506.23508 2026-03-06 cs.CL cs.AI

Why Reinforcement Fine-Tuning Enables MLLMs Preserve Prior Knowledge Better: A Data Perspective

Zhihao Zhang, Qiaole Dong, Qi Zhang, Jun Zhao, Enyu Zhou, Zhiheng Xi, Senjie Jin, Xiaoran Fan, Yuhao Zhou, Mingqi Wu, Yanwei Fu, Tao Ji, Tao Gui, Xuanjing Huang, Kai Chen

Comments Accepted by ICLR 2026

2506.23036 2026-03-06 cs.LG cs.SY eess.SY

Parameter Stress Analysis in Reinforcement Learning: Applying Synaptic Filtering to Policy Networks

Zain ul Abdeen, Ming Jin

2506.18812 2026-03-06 cs.RO cs.LG

Learning Physical Systems: Symplectification via Gauge Fixing in Dirac Structures

Aristotelis Papatheodorou, Pranav Vaidhyanathan, Natalia Ares, Ioannis Havoutis

Comments Presented at Equivariant Systems: Theory and Applications in State Estimation, Artificial Intelligence and Control, Robotics: Science and Systems (RSS) 2025 Workshop, 6 Pages, 3 Figures

2506.18339 2026-03-06 cs.LG cs.AI cs.SC nlin.CD physics.data-an

Structured Kolmogorov-Arnold Neural ODEs for Interpretable Learning and Symbolic Discovery of Nonlinear Dynamics

Wei Liu, Kiran Bacsa, Loon Ching Tang, Eleni Chatzi

2506.16112 2026-03-06 cs.CV

AutoV: Loss-Oriented Ranking for Visual Prompt Retrieval in LVLMs

Yuan Zhang, Chun-Kai Fan, Sicheng Yu, Junwen Pan, Tao Huang, Ming Lu, Kuan Cheng, Qi She, Shanghang Zhang

2506.14067 2026-03-06 cs.LG

From Bandit Regret to FDR Control: Online Selective Generation with Adversarial Feedback Unlocking

Minjae Lee, Yoonjae Jung, Sangdon Park

Comments 8 pages, 2 columns

2506.14020 2026-03-06 cs.LG cs.AI stat.ML

Bures-Wasserstein Flow Matching for Graph Generation

Keyue Jiang, Jiahao Cui, Xiaowen Dong, Laura Toni

2506.09984 2026-03-06 cs.CV cs.AI cs.SD

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

Zhenzhi Wang, Jiaqi Yang, Jianwen Jiang, Chao Liang, Gaojie Lin, Zerong Zheng, Ceyuan Yang, Yuan Zhang, Mingyuan Gao, Dahua Lin

Comments ICLR 2026 Camera Ready Version. TL;DR: The first multi-person dialogue video generation method from pairs of reference image and audio via explicit layout-aligned condition injection. Project page https://zhenzhiwang.github.io/interacthuman/

2506.09016 2026-03-06 cs.LG

SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning

Ruiqi Zhang, Daman Arora, Song Mei, Andrea Zanette

Comments There are some bugs in the experiments, and we cannot fix them to make it satisfactory to us

2506.07915 2026-03-06 cs.AI cs.CL cs.SY eess.SY

A Signal Contract for Online Language Grounding and Discovery in Decision-Making

Dimitris Panagopoulos, Adolfo Perrusquia, Weisi Guo

Comments 10 pages, 4 Figures, 4 Tables, submitted to the IEEE for possible publication

2506.04764 2026-03-06 cs.CV

HypeVPR: Exploring Hyperbolic Space for Perspective to Equirectangular Visual Place Recognition

Suhan Woo, Seongwon Lee, Jinwoo Jang, Euntai Kim

Comments CVPR 2026

2506.03938 2026-03-06 cs.LG cs.AR

FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review

Cédric Léonard, Dirk Stober, Martin Schulz

Comments 35 pages, 5 figures, 4 tables. Accepted at ACM Computing Surveys (ACM CSUR). Cite as: Cédric Léonard, Dirk Stober, and Martin Schulz. 2026. FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review. ACM Comput. Surv. 1, 1 (January 2026), 35 pages. https://doi.org/10.1145/3800686

2506.03067 2026-03-06 cs.CV

EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models

Mingzhe Li, Kejing Xia, Gehao Zhang, Zhenting Wang, Guanhong Tao, Siqi Pan, Juan Zhai, Shiqing Ma

2505.23648 2026-03-06 cs.LG

Continuous Chain of Thought Enables Parallel Exploration and Reasoning

Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang, Hrayr Harutyunyan, Ankit Singh Rawat, Samet Oymak

Comments ICLR 2026

2505.21430 2026-03-06 cs.LG

Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

Shiwei Zeng, Jie Shen

Comments v2 fixes a technical flaw in previous version, removing the dependence of sample complexity on the margin parameter

2505.19255 2026-03-06 cs.LG cs.AI

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

Mingyuan Wu, Jingcheng Yang, Jize Jiang, Meitang Li, Kaizhuo Yan, Hanchao Yu, Minjia Zhang, Chengxiang Zhai, Klara Nahrstedt

Comments ICLR 2026

2505.18374 2026-03-06 cs.CL cs.AI cs.LG

ShIOEnv: A Command Evaluation Environment for Grammar-Constrained Synthesis and Execution Behavior Modeling

Jarrod Ragsdale, Rajendra Boppana

Comments 15 pages, 7 figures, conference preprint

2505.10117 2026-03-06 cs.LG cs.CL

Learning Virtual Machine Scheduling in Cloud Computing through Language Agents

JieHao Wu, Ziwei Wang, Junjie Sheng, Wenhao Li, Xiangfeng Wang, Jun Luo

2505.08426 2026-03-06 cs.CV

DHECA-SuperGaze: Dual Head-Eye Cross-Attention and Super-Resolution for Unconstrained Gaze Estimation

Franko Šikić, Donik Vršnak, Sven Lončarić

Details

DOI: 10.1007/s00521-026-11849-y
Journal ref: Šikić, F., Vršnak, D. & Lončarić, S. DHECA-SuperGaze: dual head-eye cross-attention and super-resolution for unconstrained gaze estimation. Neural Comput & Applic 38, 116 (2026)

Abstract

Unconstrained gaze estimation is the process of determining where a subject is directing their visual attention in uncontrolled environments. Gaze estimation systems are important for a myriad of tasks such as driver distraction monitoring, exam proctoring, accessibility features in modern software, etc. However, these systems face challenges in real-world scenarios, partially due to the low resolution of in-the-wild images and partially due to insufficient modeling of head-eye interactions in current state-of-the-art (SOTA) methods. This paper introduces DHECA-SuperGaze, a deep learning-based method that advances gaze prediction through super-resolution (SR) and a dual head-eye cross-attention (DHECA) module. Our dual-branch convolutional backbone processes eye and multiscale SR head images, while the proposed DHECA module enables bidirectional feature refinement between the extracted visual features through cross-attention mechanisms. Furthermore, we identified critical annotation errors in one of the most diverse and widely used gaze estimation datasets, Gaze360, and rectified the mislabeled data. Performance evaluation on Gaze360 and GFIE datasets demonstrates superior within-dataset performance of the proposed method, reducing angular error (AE) by 0.48° (Gaze360) and 2.95° (GFIE) in static configurations, and 0.59° (Gaze360) and 3.00° (GFIE) in temporal settings compared to prior SOTA methods. Cross-dataset testing shows improvements in AE of more than 1.53° (Gaze360) and 3.99° (GFIE) in both static and temporal settings, validating the robust generalization properties of our approach.


2505.08264 2026-03-06 cs.RO cs.AI

Automatic Curriculum Learning for Driving Scenarios: Towards Robust and Efficient Reinforcement Learning

Ahmed Abouelazm, Tim Weinstein, Tim Joseph, Philip Schörner, J. Marius Zöllner

Comments Accepted in the 36th IEEE Intelligent Vehicles Symposium (IV 2025)

2505.07409 2026-03-06 cs.CL

Computational Fact-Checking of Online Discourse: Scoring scientific accuracy in climate change related news articles

Tim Wittenborg, Constantin Sebastian Tremel, Markus Stocker, Sören Auer

Comments 8 pages, 7 figures, accepted at ICKG 2025

Details

DOI: 10.1109/ICKG66886.2025.00055
Journal ref: 2025 IEEE International Conference on Knowledge Graph (ICKG), Limassol, Cyprus, 2025, pp. 371-378

Abstract

Democratic societies need reliable information. Misinformation in popular media, such as news articles or videos, threatens to impair civic discourse. Citizens are, unfortunately, not equipped to verify the flood of content consumed daily at increasing rates. This work aims to quantify the scientific accuracy of online media semi-automatically. We investigate the state of the art of climate-related ground truth knowledge representation. By semantifying media content of unknown veracity, their statements can be compared against these ground truth knowledge graphs. We implemented a workflow using LLM-based statement extraction and knowledge graph analysis. Our implementation can streamline content processing towards state-of-the-art knowledge representation and veracity quantification. Developed and evaluated with the help of 27 experts and detailed interviews with 10, the tool evidently provides a beneficial veracity indication. These findings are supported by 43 anonymous participants from a parallel user survey. This initial step, however, is unable to annotate public media at the required granularity and scale. Additionally, the identified state of climate change knowledge graphs is vastly insufficient to support this neurosymbolic fact-checking approach. Further work towards a FAIR (Findable, Accessible, Interoperable, Reusable) ground truth and complementary metrics is required to support civic discourse scientifically.


2505.06740 2026-03-06 cs.RO cs.AI

Boundary-Guided Trajectory Prediction for Road Aware and Physically Feasible Autonomous Driving

Ahmed Abouelazm, Mianzhi Liu, Christian Hubschneider, Yin Wu, Daniel Slieter, J. Marius Zöllner

Comments Accepted in the 36th IEEE Intelligent Vehicles Symposium (IV 2025)

2505.06737 2026-03-06 cs.RO cs.AI

Balancing Progress and Safety: A Novel Risk-Aware Objective for RL in Autonomous Driving

Ahmed Abouelazm, Jonas Michel, Helen Gremmelmaier, Tim Joseph, Philip Schörner, J. Marius Zöllner

Comments Accepted in the 36th IEEE Intelligent Vehicles Symposium (IV 2025)

2505.06515 2026-03-06 cs.CV

RESAR-BEV: An Explainable Progressive Residual Autoregressive Approach for Camera-Radar Fusion in BEV Segmentation

Zhiwen Zeng, Yunfei Yin, Zheng Yuan, Argho Dey, Xianjian Bao

Comments This work was submitted to IEEE Transactions on Intelligent Transportation Systems (T-ITS) on 09-May-2025; revised 5 October 2025 and 26 January 2026; accepted 1 March 2026