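arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.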

2603.17319 2026-03-19 cs.AI cs.LG cs.RO

Physics-informed offline reinforcement learning eliminates catastrophic fuel waste in maritime routing

Aniruddha Bora, Julie Chalfant, Chryssostomos Chryssostomidis

Abstract

International shipping produces approximately 3% of global greenhouse gas emissions, yet voyage routing remains dominated by heuristic methods. We present PIER (Physics-Informed, Energy-efficient, Risk-aware routing), an offline reinforcement learning framework that learns fuel-efficient, safety-aware routing policies from physics-calibrated environments grounded in historical vessel tracking data and ocean reanalysis products, requiring no online simulator. Validated on one full year (2023) of AIS data across seven Gulf of Mexico routes (840 episodes per method), PIER reduces mean CO2 emissions by 10% relative to great-circle routing. However, PIER's primary contribution is eliminating catastrophic fuel waste: great-circle routing incurs extreme fuel consumption (>1.5x median) in 4.8% of voyages; PIER reduces this to 0.5%, a 9-fold reduction. Per-voyage fuel variance is 3.5x lower (p<0.001), with bootstrap 95% CI for mean savings [2.9%, 15.7%]. Partial validation against observed AIS vessel behavior confirms consistency with the fastest real transits while exhibiting 23.1x lower variance. Crucially, PIER is forecast-independent: unlike A* path optimization whose wave protection degrades 4.5x under realistic forecast uncertainty, PIER maintains constant performance using only local observations. The framework combines physics-informed state construction, demonstration-augmented offline data, and a decoupled post-hoc safety shield, an architecture that transfers to wildfire evacuation, aircraft trajectory optimization, and autonomous navigation in unmapped terrain.
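
The tail-risk and confidence-interval figures in this abstract can be illustrated with a minimal sketch. It assumes the "extreme fuel consumption" rate is the share of voyages above 1.5x a reference median, and it uses a plain percentile bootstrap for the mean. The function names, the choice of reference median, and all numbers below are hypothetical and are not taken from the paper. For scale, 4.8% / 0.5% is about 9.6, consistent with the reported 9-fold reduction.

```python
import random
from statistics import median


def extreme_rate(fuel, factor=1.5, reference=None):
    """Fraction of voyages whose fuel use exceeds factor x a reference median.

    Assumption: when no reference is given, the median of `fuel` itself is used.
    """
    ref = median(fuel) if reference is None else reference
    return sum(f > factor * ref for f in fuel) / len(fuel)


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and collect the sorted resample means.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical per-voyage fuel use (arbitrary units); not data from the paper.
baseline = [10, 11, 9, 10, 12, 10, 30, 11, 10, 9]
print(extreme_rate(baseline))

# Hypothetical per-voyage savings (%), used only to exercise the function.
savings = [12.0, 8.5, -1.0, 15.2, 9.9, 4.3, 11.1, 7.6]
print(bootstrap_ci(savings))
```

Passing a shared `reference` median to `extreme_rate` would compare two routing methods against the same threshold; whether the paper does this is not stated in the abstract.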

2603.17312 2026-03-19 cs.CV cs.AI

Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress

Yuelin Zhang, Sijie Cheng, Chen Li, Zongzhao Li, Yuxin Huang, Yang Liu, Wenbing Huang

Comments: CVPR 2026

2603.17311 2026-03-19 cs.CL

Ruyi2.5 Technical Report

Huan Song, Shuyu Tian, Qingfei Zhao, Wenhao Hong, Jiang Liu, Ting Long, Jiawei Shao, Xuelong Li

2603.17307 2026-03-19 cs.CV cs.AI

Symphony: A Cognitively-Inspired Multi-Agent System for Long-Video Understanding

Haiyang Yan, Hongyun Zhou, Peng Xu, Xiaoxue Feng, Mengyi Liu

Comments: Accepted by CVPR 2026

2603.17304 2026-03-19 cs.CV

3D MRI-Based Alzheimer's Disease Classification Using Multi-Modal 3D CNN with Leakage-Aware Subject-Level Evaluation

Md Sifat, Sania Akter, Akif Islam, Md. Ekramul Hamid, Abu Saleh Musa Miah, Najmul Hassan, Md Abdur Rahim, Jungpil Shin

Comments: 5 tables, 6 figures. Submitted to International Conference on Power, Electronics, Communications, Computing, and Intelligent Infrastructure 2026

2603.17303 2026-03-19 cs.CL cs.AI

From Words to Worlds: Benchmarking Cross-Cultural Cultural Understanding in Machine Translation

Bangju Han, Yingqi Wang, Huang Qing, Tiyuan Li, Fengyi Yang, Ahtamjan Ahmat, Abibulla Atawulla, Yating Yang, Xi Zhou

2603.17301 2026-03-19 cs.LG

WINFlowNets: Warm-up Integrated Networks Training of Generative Flow Networks for Robotics and Machine Fault Adaptation

Zahin Sufiyan, Shadan Golestan, Yoshihiro Mitsuka, Shotaro Miwa, Osmar Zaiane

2603.17300 2026-03-19 cs.RO

ReSteer: Quantifying and Refining the Steerability of Multitask Robot Policies

Zhenyang Chen, Alan Tian, Liquan Wang, Benjamin Joffe, Yingyan Celine Lin, Yuxiao Chen, Siddharth Karamcheti, Danfei Xu

Comments: Project website: https://resteer-vla.github.io/

2603.17295 2026-03-19 cs.CV cs.AI

Directing the Narrative: A Finetuning Method for Controlling Coherence and Style in Story Generation

Jianzhang Zhang, Yijing Tian, Jiwang Qu, Chuang Liu

2603.17278 2026-03-19 cs.LG stat.ME

Classifier Pooling for Modern Ordinal Classification

Noam H. Rotenberg, Andreia V. Faria, Brian Caffo

2603.17275 2026-03-19 cs.CV cs.AI

DANCE: Dynamic 3D CNN Pruning: Joint Frame, Channel, and Feature Adaptation for Energy Efficiency on the Edge

Mohamed Mejri, Ashiqur Rasul, Abhijit Chatterjee

2603.17267 2026-03-19 cs.CV

ConfusionBench: An Expert-Validated Benchmark for Confusion Recognition and Localization in Educational Videos

Lu Dong, Xiao Wang, Mark Frank, Srirangaraj Setlur, Venu Govindaraju, Ifeoma Nwogu

2603.17265 2026-03-19 cs.CV cs.CL

LED: A Benchmark for Evaluating Layout Error Detection in Document Analysis

Inbum Heo, Taewook Hwang, Jeesu Jung, Sangkeun Jung

Comments: 8 pages

2603.17255 2026-03-19 cs.LG

Variational Rectification Inference for Learning with Noisy Labels

Haoliang Sun, Qi Wei, Lei Feng, Yupeng Hu, Fan Liu, Hehe Fan, Yilong Yin

DOI: 10.1007/s11263-024-02205-5
Journal ref: International Journal of Computer Vision, 2025

Abstract

Label noise has been broadly observed in real-world datasets. To mitigate the negative impact of overfitting to label noise for deep models, effective strategies (e.g., re-weighting or loss rectification) have been broadly applied in prevailing approaches, which have been generally learned under the meta-learning scenario. Despite the robustness of noise achieved by the probabilistic meta-learning models, they usually suffer from model collapse that degenerates generalization performance. In this paper, we propose variational rectification inference (VRI) to formulate the adaptive rectification for loss functions as an amortized variational inference problem and derive the evidence lower bound under the meta-learning framework. Specifically, VRI is constructed as a hierarchical Bayes by treating the rectifying vector as a latent variable, which can rectify the loss of the noisy sample with the extra randomness regularization and is, therefore, more robust to label noise. To achieve the inference of the rectifying vector, we approximate its conditional posterior with an amortization meta-network. By introducing the variational term in VRI, the conditional posterior is estimated accurately and avoids collapsing to a Dirac delta function, which can significantly improve the generalization performance. The elaborated meta-network and prior network adhere to the smoothness assumption, enabling the generation of reliable rectification vectors. Given a set of clean meta-data, VRI can be efficiently meta-learned within the bi-level optimization programming. Besides, theoretical analysis guarantees that the meta-network can be efficiently learned with our algorithm. Comprehensive comparison experiments and analyses validate its effectiveness for robust learning with noisy labels, particularly in the presence of open-set noise.
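
For orientation, a generic conditional evidence lower bound with a latent variable has the form below. This is the textbook bound, not necessarily the exact objective derived in the paper. The notation is an assumption: v stands for the latent rectifying vector, q_phi for the amortized posterior produced by the meta-network, and p(v | x) for the prior.

```latex
\log p(y \mid x) \;\ge\;
\mathbb{E}_{q_\phi(v \mid x, y)}\big[\log p(y \mid x, v)\big]
\;-\; \mathrm{KL}\big(q_\phi(v \mid x, y)\,\|\,p(v \mid x)\big)
```

In a bound of this shape, the KL term penalizes a posterior that concentrates far from the prior, which is one way to read the abstract's remark that the variational term keeps the posterior from collapsing to a Dirac delta.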

2603.17248 2026-03-19 cs.LG cs.AI

Pathology-Aware Multi-View Contrastive Learning for Patient-Independent ECG Reconstruction

Youssef Youssef, Jitin Singla

2603.17247 2026-03-19 cs.LG q-bio.QM

Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization

Truong-Son Hy

2603.17244 2026-03-19 cs.AI cs.IR cs.LO

Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures

Young Bin Park

Comments: 56 pages, 1 figure

Abstract

While individual components for AI agent memory exist in prior systems, their architectural synthesis and formal grounding remain underexplored. We present Kumiho, a graph-native cognitive memory architecture grounded in formal belief revision semantics. The structural primitives required for cognitive memory (immutable revisions, mutable tag pointers, typed dependency edges, URI-based addressing) are identical to those required for managing agent-produced work as versionable assets, enabling a unified graph-native architecture that serves both purposes. The central formal contribution is a correspondence between the AGM belief revision framework and the operational semantics of a property graph memory system, proving satisfaction of the basic AGM postulates (K*2 through K*6) and Hansson's belief base postulates (Relevance, Core-Retainment). The architecture implements a dual-store model (Redis working memory, Neo4j long-term graph) with hybrid fulltext and vector retrieval. On LoCoMo (token-level F1), Kumiho achieves 0.565 overall F1 (n=1,986) including 97.5% adversarial refusal accuracy. On LoCoMo-Plus, a Level-2 cognitive memory benchmark testing implicit constraint recall, Kumiho achieves 93.3% judge accuracy (n=401); independent reproduction by the benchmark authors yielded results in the mid-80% range, still substantially outperforming all published baselines (best: Gemini 2.5 Pro, 45.7%). Three architectural innovations drive the results: prospective indexing (LLM-generated future-scenario implications indexed at write time), event extraction (structured causal events preserved in summaries), and client-side LLM reranking. The architecture is model-decoupled: switching the answer model from GPT-4o-mini (~88%) to GPT-4o (93.3%) improves end-to-end accuracy without pipeline changes, at a total evaluation cost of ~$14 for 401 entries.
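
The token-level F1 metric mentioned in this abstract is commonly computed as the harmonic mean of precision and recall over overlapping tokens. The sketch below assumes a simplified normalization (lowercasing and whitespace splitting); the benchmark's actual normalization rules are not given here and may differ, and `token_f1` is a hypothetical helper name.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    # Simplified normalization (an assumption): lowercase, split on whitespace.
    pred = prediction.lower().split()
    gold = reference.lower().split()
    if not pred or not gold:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == gold)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical answer pair: 2 shared tokens, 3 predicted, 2 in the reference.
print(token_f1("the cat sat", "the cat"))
```

An overall score such as the reported 0.565 would then be an average of per-question values, under whatever aggregation the benchmark defines.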

2603.17236 2026-03-19 cs.RO cs.CV

Neural Radiance Maps for Extraterrestrial Navigation and Path Planning

Adam Dai, Shubh Gupta, Grace Gao

Comments: Published in the Proceedings of the ION GNSS+ 2023 Conference

2603.17232 2026-03-19 cs.RO

Full Stack Navigation, Mapping, and Planning for the Lunar Autonomy Challenge

Adam Dai, Asta Wu, Keidai Iiyama, Guillem Casadesus Vila, Kaila Coimbra, Thomas Deng, Grace Gao

Comments: Published in the Proceedings of the ION GNSS+ 2025 conference

2603.17231 2026-03-19 cs.CL eess.AS

Neuron-Level Emotion Control in Speech-Generative Large Audio-Language Models

Xiutian Zhao, Ismail Rasim Ulgen, Philipp Koehn, Björn Schuller, Berrak Sisman

Comments: 11 pages, 10 figures

2603.17229 2026-03-19 cs.RO cs.CV

Visual SLAM with DEM Anchoring for Lunar Surface Navigation

Adam Dai, Guillem Casadesus Vila, Grace Gao

Comments: Accepted to IEEE Aerospace Conference 2026

2603.17228 2026-03-19 cs.CV cs.AI cs.LG

From Drop-off to Recovery: A Mechanistic Analysis of Segmentation in MLLMs

Boyong Wu, Sanghwan Kim, Zeynep Akata

2603.17220 2026-03-19 cs.CL cs.AI cs.LG

TharuChat: Bootstrapping Large Language Models for a Low-Resource Language via Synthetic Data and Human Validation

Prajwal Panth, Agniva Maiti

Comments: 6 pages, 1 figure, 2 tables. Preprint. Code and dataset available on Hugging Face

2603.17217 2026-03-19 cs.CL cs.AI cs.LG

Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text

Federico Albanese, Pablo Ronco, Nicolás D'Ippolito

Abstract

Responsible use of AI demands that we protect sensitive information without undermining the usefulness of data, an imperative that has become acute in the age of large language models. We address this challenge with an on-premise, LLM-driven substitution pipeline that anonymizes text by replacing personally identifiable information (PII) with realistic, type-consistent surrogates. Executed entirely within organizational boundaries using local LLMs, the approach prevents data egress while preserving fluency and task-relevant semantics. We conduct a systematic, multi-metric, cross-technique evaluation on the Action-Based Conversation Dataset, benchmarking against industry standards (Microsoft Presidio and Google DLP) and a state-of-the-art approach (ZSTS, in redaction-only and redaction-plus-substitution variants). Our protocol jointly measures privacy, semantic utility, and trainability under privacy via a lifecycle-ready criterion obtained by fine-tuning a compact encoder (BERT+LoRA) on sanitized text. In addition, we assess agentic Q&A performance by inserting an on-premise anonymization layer before the answering LLM and evaluating the quality of its responses. This intermediate, type-preserving substitution stage ensures that no sensitive content is exposed to third-party APIs, enabling responsible deployment of Q&A agents without compromising confidentiality. Our method attains state-of-the-art privacy, minimal topical drift, strong factual utility, and low trainability loss, outperforming rule-based approaches and named-entity recognition (NER) baselines and ZSTS variants on the combined privacy-utility-trainability frontier. These results show that local LLM substitution yields anonymized corpora that are both responsible to use and operationally valuable: safe for agentic pipelines and suitable for downstream fine-tuning with limited degradation.
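
The idea of type-consistent substitution can be illustrated with a deliberately simple stand-in. The paper's pipeline uses a local LLM to detect and replace PII; the sketch below swaps in two regular expressions instead, purely to show the "same entity, same surrogate, same type" behavior. The patterns, surrogate formats, and function name are all hypothetical and cover only emails and one phone format.

```python
import re

# Hypothetical, intentionally narrow patterns; a real system needs far more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def substitute_pii(text):
    """Replace emails and phone numbers with surrogates of the same type.

    Returns the sanitized text and the original-to-surrogate mapping.
    """
    mapping = {}
    counters = {"EMAIL": 0, "PHONE": 0}

    def surrogate(kind, value):
        # Reuse the surrogate for a repeated value so references stay linked.
        if value not in mapping:
            counters[kind] += 1
            n = counters[kind]
            if kind == "EMAIL":
                mapping[value] = f"user{n}@example.com"
            else:
                mapping[value] = f"555-010-{n:04d}"
        return mapping[value]

    text = EMAIL.sub(lambda m: surrogate("EMAIL", m.group()), text)
    text = PHONE.sub(lambda m: surrogate("PHONE", m.group()), text)
    return text, mapping


# Hypothetical input sentence.
clean, table = substitute_pii(
    "Mail ann@corp.io or call 212-555-7788; ann@corp.io again."
)
print(clean)
```

In an agentic Q&A setting like the one the abstract describes, a layer of this kind would sit before the answering model so that only surrogates leave the organization; keeping the mapping local is a design choice assumed here, not a detail confirmed by the abstract.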

2603.17216 2026-03-19 cs.AI

AI Scientist via Synthetic Task Scaling

Ziyang Cai, Harkirat Behl

2603.17208 2026-03-19 cs.CL cs.PL

SYMDIREC: A Neuro-Symbolic Divide-Retrieve-Conquer Framework for Enhanced RTL Synthesis and Summarization

Prashanth Vijayaraghavan, Apoorva Nitsure, Luyao Shi, Charles Mackin, Ashutosh Jadhav, David Beymer, Ehsan Degan, Vandana Mukherjee

2603.17204 2026-03-19 cs.CL cs.AR cs.PL

CODMAS: A Dialectic Multi-Agent Collaborative Framework for Structured RTL Optimization

Che-Ming Chang, Prashanth Vijayaraghavan, Ashutosh Jadhav, Charles Mackin, Vandana Mukherjee, Hsinyu Tsai, Ehsan Degan

2603.17201 2026-03-19 cs.RO

FastLoop: Parallel Loop Closing with GPU-Acceleration in Visual SLAM

Soudabeh Mohammadhashemi, Shishir Gopinath, Kimia Khabiri, Parsa Hosseininejad, Karthik Dantu, Steven Y. Ko

2603.17199 2026-03-19 cs.LG cs.AI cs.CL

Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing

Parsa Mirtaheri, Mikhail Belkin

2603.17196 2026-03-19 cs.LG

Self-Conditioned Denoising for Atomistic Representation Learning

Tynan Perez, Rafael Gomez-Bombarelli