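arXivDaily: a daily digest of academic papers, synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.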

2511.00053 2026-03-20 cs.LG cs.AI stat.ML

Quadratic Direct Forecast for Training Multi-Step Time-Series Forecast Models

Hao Wang, Licheng Pan, Yuan Lu, Zhichao Chen, Tianqiao Liu, Shuting He, Zhixuan Chu, Qingsong Wen, Haoxuan Li, Zhouchen Lin

Details

Journal ref: ICLR 2026

English abstract

The design of the training objective is central to training time-series forecasting models. Existing training objectives such as mean squared error mostly treat each future step as an independent, equally weighted task, which we find leads to two issues: (1) they overlook the label autocorrelation effect among future steps, leading to a biased training objective; (2) they fail to set heterogeneous task weights for the forecasting tasks corresponding to different future steps, limiting forecasting performance. To fill this gap, we propose a novel quadratic-form weighted training objective that addresses both issues simultaneously. Specifically, the off-diagonal elements of the weighting matrix account for the label autocorrelation effect, whereas the non-uniform diagonal elements are expected to match the most preferable weights of the forecasting tasks at varying future steps. To achieve this, we propose the Quadratic Direct Forecast (QDF) learning algorithm, which trains the forecast model using an adaptively updated quadratic-form weighting matrix. Experiments show that QDF effectively improves the performance of various forecast models, achieving state-of-the-art results. Code is available at https://anonymous.4open.science/r/QDF-8937.
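As a rough illustration of the quadratic-form objective described in this abstract, the sketch below compares plain mean squared error with a loss of the form e^T W e over the forecast-error vector e. The function names and the fixed example matrix are assumptions made for illustration only; the QDF algorithm updates the weighting matrix adaptively, and that update is not shown here.

```python
def mse_loss(y_true, y_pred):
    """Mean squared error: each future step is weighted equally and independently."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return sum(e * e for e in errors) / len(errors)


def quadratic_form_loss(y_true, y_pred, weight):
    """Quadratic-form loss (e^T W e) / H over a forecast horizon of H steps.

    `weight` is an H x H matrix given as a list of lists. Its diagonal sets
    per-step weights; its off-diagonal entries couple errors at different
    steps. The matrix is fixed here, which is an illustrative assumption.
    """
    errors = [p - t for t, p in zip(y_true, y_pred)]
    horizon = len(errors)
    total = 0.0
    for i in range(horizon):
        for j in range(horizon):
            total += errors[i] * weight[i][j] * errors[j]
    return total / horizon


# Toy three-step forecast (hypothetical values).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.5, 2.0]

# With the identity matrix the quadratic form reduces to plain MSE.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical matrix with non-uniform diagonal and non-zero off-diagonal terms.
example_weight = [[1.0, 0.2, 0.0], [0.2, 2.0, 0.2], [0.0, 0.2, 3.0]]

loss_mse = mse_loss(y_true, y_pred)
loss_identity = quadratic_form_loss(y_true, y_pred, identity)
loss_weighted = quadratic_form_loss(y_true, y_pred, example_weight)
```

Choosing the identity matrix recovers MSE, which shows that the quadratic form generalizes the usual objective rather than replacing it.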


2510.22689 2026-03-20 cs.CL

Rule-Based Explanations for Retrieval-Augmented LLM Systems

Joel Rorseth, Parke Godfrey, Lukasz Golab, Divesh Srivastava, Jarek Szlichta

2510.20579 2026-03-20 cs.CV cs.AI cs.MM

Open-o3-Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Jiahao Meng, Xiangtai Li, Haochen Wang, Yue Tan, Tao Zhang, Lingdong Kong, Yunhai Tong, Anran Wang, Zhiyang Teng, Yujing Wang, Zhuochen Wang

2510.16344 2026-03-20 cs.RO cs.AI

Manual2Skill++: Connector-Aware General Robotic Assembly from Instruction Manuals via Vision-Language Models

Chenrui Tie, Shengxiang Sun, Yudi Lin, Yanbo Wang, Zhongrui Li, Zhouhan Zhong, Jinxuan Zhu, Yiman Pang, Haonan Chen, Junting Chen, Ruihai Wu, Lin Shao

2510.14369 2026-03-20 cs.CL cs.AI cs.CY cs.HC

From Binary to Bilingual: How the National Weather Service is Using Artificial Intelligence to Develop a Comprehensive Translation Program

Joseph E. Trujillo-Falcon, Monica L. Bozeman, Liam E. Llewellyn, Samuel T. Halvorson, Meryl Mizell, Stuti Deshpande, Bob Manning, Chris Rohrbach, Ian Blaylock, Angel Montanez, Todd Fagin

2510.11618 2026-03-20 cs.CL cs.MA

StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models

Zehao Chen, Rong Pan, Haoran Li

Comments Accepted by AAAI 2026. Project: https://storyboxproject.github.io

2510.10846 2026-03-20 cs.CL

DUAL-Bench: Measuring Over-Refusal and Robustness in Vision-Language Models

Kaixuan Ren, Preslav Nakov, Usman Naseem

Comments 26 pages, 13 figures, Preprint

2510.10053 2026-03-20 cs.CV

DREAM: A Benchmark Study for Deepfake photoREalism AssessMent

Bo Peng, Zichuan Wang, Sheng Yu, Xiaochuan Jin, Wei Wang, Jing Dong

Comments Accepted by IEEE T-PAMI

Details

DOI: 10.1109/TPAMI.2026.3663547

English abstract

Deep learning-based face-swap videos, widely known as deepfakes, have drawn wide attention due to their threat to information credibility. Recent works mainly focus on the problem of deepfake detection, which aims to reliably tell deepfakes apart from real videos in an objective way. On the other hand, the subjective perception of deepfakes, especially its computational modeling and imitation, is also a significant problem but lacks adequate study. In this paper, we focus on the photorealism assessment of deepfakes, defined as the automatic assessment of deepfake photorealism that approximates human perception of deepfakes. It is important for evaluating the quality and deceptiveness of deepfakes, which can be used to predict the influence of deepfakes on the Internet, and it also has potential for improving the deepfake generation process by serving as a critic. This paper promotes this new direction by presenting a comprehensive benchmark called DREAM, which stands for Deepfake photoREalism AssessMent. It comprises a deepfake video dataset of diverse quality, a large-scale annotation that includes 140,000 photorealism scores and textual descriptions obtained from 3,500 human annotators, and a comprehensive evaluation and analysis of 18 representative photorealism assessment methods, including recent methods based on large vision-language models and a newly proposed description-aligned CLIP method. The benchmark and insights included in this study can lay the foundation for future research in this direction and other related areas. We make the dataset available to the research community at https://github.com/bomb2peng/DREAM-A-Benchmark-Study-for-Deepfake-photoREalism-AssessMent.


2510.08882 2026-03-20 cs.LG

An Improved Model-Free Decision-Estimation Coefficient with Applications in Adversarial MDPs

Haolin Liu, Chen-Yu Wei, Julian Zimmert

Comments ICLR 2026

2510.08581 2026-03-20 cs.SD cs.AI eess.AS

Evaluating Hallucinations in Audio-Visual Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions

Hansol Park, Hoseong Ahn, Junwon Moon, Yejin Lee, Kyuhong Shim

Comments Submitted to Interspeech 2026

2510.08316 2026-03-20 cs.CV

Unlocking 3D Affordance Segmentation with 2D Semantic Knowledge

Yu Huang, Zelin Peng, Changsong Wen, Xiaokang Yang, Wei Shen

Comments Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2510.07842 2026-03-20 cs.CL cs.AI

AdaSwitch: Balancing Exploration and Guidance in Knowledge Distillation via Adaptive Switching

Jingyu Peng, Maolin Wang, Hengyi Cai, Yuchen Li, Kai Zhang, Shuaiqiang Wang, Dawei Yin, Xiangyu Zhao

2510.06265 2026-03-20 cs.CL

Large Language Models Hallucination: A Comprehensive Survey

Aisha Alansari, Hamzah Luqman

2510.04714 2026-03-20 cs.CV

Object-Centric Representation Learning for Enhanced 3D Semantic Scene Graph Prediction

KunHo Heo, GiHyun Kim, SuYeon Kim, MyeongAh Cho

Comments Accepted by NeurIPS 2025. Code: https://github.com/VisualScienceLab-KHU/OCRL-3DSSG-Codes

2510.03182 2026-03-20 cs.RO cs.AI cs.CL cs.SC

Simulation to Rules: A Dual-VLM Framework for Formal Visual Planning

Yilun Hao, Yongchao Chen, Chuchu Fan, Yang Zhang

Comments 40 pages, 6 figures, 13 tables

2510.01643 2026-03-20 cs.LG

Support Basis: Fast Attention Beyond Bounded Entries

Maryam Aliakbarpour, Vladimir Braverman, Junze Yin, Haochen Zhang

Comments AISTATS 2026 (Spotlight). Our code can be found at: https://github.com/yinj66/support_basis

2510.01242 2026-03-20 cs.CL cs.AI cs.IT cs.LG math.IT

Redundancy-as-Masking: Formalizing the Artificial Age Score (AAS) to Model Memory Aging in Generative AI

Seyma Yaman Kayadibi

Comments 37 pages, 17 figures. Includes theoretical development and mathematical proofs of the Artificial Age Score (AAS), with empirical illustrations via ChatGPT-based memory recall experiments

Details

DOI: 10.3389/frai.2026.1732691
Journal ref: Frontiers in Artificial Intelligence 9 (2026), 1732691

English abstract

Artificial intelligence is observed to age not through chronological time but through structural asymmetries in memory performance. In large language models, semantic cues such as the name of the day often remain stable across sessions, while episodic details like the sequential progression of experiment numbers tend to collapse when conversational context is reset. To capture this phenomenon, the Artificial Age Score (AAS) is introduced as a log-scaled, entropy-informed metric of memory aging derived from observable recall behavior. The score is formally proven to be well-defined, bounded, and monotonic under mild and model-agnostic assumptions, making it applicable across various tasks and domains. In its Redundancy-as-Masking formulation, the score interprets redundancy as overlapping information that reduces the penalized mass. However, in the present study, redundancy is not explicitly estimated; all reported values assume a redundancy-neutral setting (R = 0), yielding conservative upper bounds. The AAS framework was tested over a 25-day bilingual study involving ChatGPT-5, structured into stateless and persistent interaction phases. During persistent sessions, the model consistently recalled both semantic and episodic details, driving the AAS toward its theoretical minimum, indicative of structural youth. In contrast, when sessions were reset, the model preserved semantic consistency but failed to maintain episodic continuity, causing a sharp increase in the AAS and signaling structural memory aging. These findings support the utility of AAS as a theoretically grounded, task-independent diagnostic tool for evaluating memory degradation in artificial systems. The study builds on foundational concepts from von Neumann's work on automata, Shannon's theories of information and redundancy, and Turing's behavioral approach to intelligence.


2509.26642 2026-03-20 cs.RO

MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation

Zhuoyang Liu, Jiaming Liu, Jiadong Xu, Nuowei Han, Chenyang Gu, Hao Chen, Kaichen Zhou, Renrui Zhang, Kai Chin Hsieh, Kun Wu, Zhengping Che, Jian Tang, Shanghang Zhang

Comments Project page: https://robotic-mla.github.io/

2509.22592 2026-03-20 cs.LG

OT-MeanFlow3D: Bridging Optimal Transport and Meanflow for Efficient 3D Point Cloud Generation

Elaheh Akbari, Shansita Sharma, Ping He, Ahmadreza Moradipari, Kyungtae Han, Hamed Pirsiavash, Yikun Bai, Soheil Kolouri

2509.13093 2026-03-20 cs.SD

GLAD: Global-Local Aware Dynamic Mixture-of-Experts for Multi-Talker ASR

Yujie Guo, Jiaming Zhou, Yuhang Jia, Shiwan Zhao, Yong Qin

Comments This paper has been submitted to Interspeech 2026 for review

2509.11085 2026-03-20 cs.LG stat.AP

DemandLens: Enhancing Forecast Accuracy Through Product-Specific Hyperparameter Optimization

Srijesh Pillai, M. I. Jawid Nazir

Comments 10 pages, 12 figures, 3 tables. Accepted for publication in the proceedings of the 2025 Advances in Science and Engineering Technology International Conferences (ASET)

2509.11075 2026-03-20 cs.LG

Machine Learning Framework for Audio-Based Equipment Condition Monitoring: A Comparative Study of Classification Algorithms

Srijesh Pillai, Yodhin Agarwal, Zaheeruddin Ahmed

Comments 10 pages, 7 figures. Accepted for publication in the proceedings of the 2025 Advances in Science and Engineering Technology International Conferences (ASET)

2509.08759 2026-03-20 cs.LG math.OC

Fourier Learning Machines: Nonharmonic Fourier-Based Neural Networks for Scientific Machine Learning

Mominul Rubel, Adam Meyers, Gabriel Nicolosi

Comments Please cite the peer-reviewed, published version available on Transactions on Machine Learning Research at https://openreview.net/forum?id=LPKt5vd7yz

2509.04050 2026-03-20 cs.CV

A Re-ranking Method using K-nearest Weighted Fusion for Person Re-identification

Huy Che, Le-Chuong Nguyen, Gia-Nghia Tran, Dinh-Duy Phan, Vinh-Tiep Nguyen

Comments Published in ICPRAM 2025, ISBN 978-989-758-730-6, ISSN 2184-4313

Details

DOI: 10.5220/0013176100003905
Journal ref: Proceedings of the 14th International Conference on Pattern Recognition Applications and Methods - ICPRAM (2025) 79-90

English abstract

In person re-identification, re-ranking is a crucial step to enhance overall accuracy by refining the initial ranking of retrieved results. Previous studies have mainly focused on features from single-view images, which can cause view bias and issues like pose variation, viewpoint changes, and occlusions. Using multi-view features to represent a person can help reduce view bias. In this work, we present an efficient re-ranking method that generates multi-view features by aggregating neighbors' features using the K-nearest Weighted Fusion (KWF) method. Specifically, we hypothesize that features extracted from re-identification models are highly similar when representing the same identity. Thus, we select K neighboring features in an unsupervised manner to generate multi-view features. Additionally, this study explores weight selection strategies during feature aggregation, allowing us to identify an effective strategy. Our re-ranking approach does not require model fine-tuning or extra annotations, making it applicable to large-scale datasets. We evaluate our method on the person re-identification datasets Market1501, MSMT17, and Occluded-DukeMTMC. The results show that our method significantly improves Rank@1 and mAP when re-ranking the top M candidates from the initial ranking results. Specifically, compared to the initial results, our re-ranking method achieves improvements of 9.8% and 22.0% in Rank@1 on the challenging datasets MSMT17 and Occluded-DukeMTMC, respectively. Furthermore, our approach demonstrates substantial enhancements in computational efficiency compared to other re-ranking methods. Code is available at https://github.com/chequanghuy/Enhancing-Person-Re-Identification-via-UFFM-and-AMC.
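The neighbor-aggregation idea in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, each feature counts itself among its K nearest neighbors, and the weights are simply the clipped cosine similarities, which is only one of many possible weight selection strategies.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def k_nearest_weighted_fusion(features, k):
    """Replace each feature by a weighted average of its k most similar features.

    Assumptions for this sketch: neighbors are chosen by cosine similarity,
    the feature itself is always among them, and negative similarities are
    clipped to zero before being used as weights.
    """
    unit = [normalize(f) for f in features]
    fused = []
    for query in unit:
        sims = [cosine(query, other) for other in unit]
        top = sorted(range(len(unit)), key=lambda i: sims[i], reverse=True)[:k]
        weights = {i: max(sims[i], 0.0) for i in top}
        total = sum(weights.values())
        merged = [
            sum(weights[i] * unit[i][d] for i in top) / total
            for d in range(len(query))
        ]
        fused.append(normalize(merged))
    return fused


# Toy gallery (hypothetical values): two similar features and one different.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fused = k_nearest_weighted_fusion(features, k=2)
```

In this toy gallery the first two features are pulled toward each other after fusion, which is the intended effect when neighbors share an identity. No model fine-tuning or labels are involved.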


2509.02460 2026-03-20 cs.CV

GenCompositor: Generative Video Compositing with Diffusion Transformer

Shuzhou Yang, Xiaoyu Li, Xiaodong Cun, Guangzhi Wang, Lingen Li, Ying Shan, Jian Zhang

Comments Accepted by ICLR 2026

2509.02437 2026-03-20 cs.RO

U-ARM: Ultra low-cost general teleoperation interface for robot manipulation

Yanwen Zou, Zhaoye Zhou, Chenyang Shi, Zewei Ye, Junda Huang, Yan Ding, Bo Zhao

2509.01019 2026-03-20 cs.CV cs.LG cs.RO

AI-driven Dispensing of Coral Reseeding Devices for Broad-scale Restoration of the Great Barrier Reef

Scarlett Raine, Emilio Olivastri, Benjamin Moshirian, Tobias Fischer

Comments 8 pages, 5 figures

2508.21475 2026-03-20 cs.AI

MMSearch-Plus: Benchmarking Provenance-Aware Search for Multimodal Browsing Agents

Xijia Tao, Yihua Teng, Xinxing Su, Xinyu Fu, Jihao Wu, Chaofan Tao, Ziru Liu, Haoli Bai, Rui Liu, Lingpeng Kong

Comments Project Page: https://mmsearch-plus.github.io

2508.20784 2026-03-20 cs.AI

Single Agent Robust Deep Reinforcement Learning for Bus Fleet Control

Yifan Zhang

Details

DOI: 10.1093/tse/tdag005

English abstract

Bus bunching remains a challenge for urban transit due to stochastic traffic and passenger demand. Traditional solutions rely on multi-agent reinforcement learning (MARL) in loop-line settings, which overlook realistic operations characterized by heterogeneous routes, timetables, fluctuating demand, and varying fleet sizes. We propose a novel single-agent reinforcement learning (RL) framework for bus holding control that avoids the data imbalance and convergence issues of MARL under near-realistic simulation. A bidirectional timetabled network with dynamic passenger demand is constructed. The key innovation is reformulating the multi-agent problem into a single-agent one by augmenting the state space with categorical identifiers (vehicle ID, station ID, time period) in addition to numerical features (headway, occupancy, velocity). This high-dimensional encoding enables single-agent policies to capture inter-agent dependencies, analogous to projecting non-separable inputs into a higher-dimensional space. We further design a structured reward function aligned with operational goals: instead of exponential penalties on headway deviations, a ridge-shaped reward balances uniform headways and schedule adherence. Experiments show that our modified soft actor-critic (SAC) achieves more stable and superior performance than benchmarks, including MADDPG (e.g., -430k vs. -530k under stochastic conditions). These results demonstrate that single-agent deep RL, when enhanced with categorical structuring and schedule-aware rewards, can effectively manage bus holding in non-loop, real-world contexts. This paradigm offers a robust, scalable alternative to MARL frameworks, particularly where agent-specific experiences are imbalanced.
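The state augmentation described in this abstract can be sketched as below. One-hot encoding of the categorical identifiers is an assumption made for illustration (the abstract does not say how they are encoded), and `one_hot`, `build_state`, and all sizes and values are hypothetical.

```python
def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def build_state(vehicle_id, station_id, time_period,
                headway, occupancy, velocity,
                num_vehicles, num_stations, num_periods):
    """Concatenate categorical identifiers with numerical features.

    A single policy sees which bus, station, and time period an observation
    belongs to, so one agent can act for every bus in the fleet.
    """
    return (
        one_hot(vehicle_id, num_vehicles)
        + one_hot(station_id, num_stations)
        + one_hot(time_period, num_periods)
        + [headway, occupancy, velocity]
    )


# Hypothetical observation: bus 2 at station 0 during time period 1.
state = build_state(
    vehicle_id=2, station_id=0, time_period=1,
    headway=300.0, occupancy=0.6, velocity=8.5,
    num_vehicles=4, num_stations=3, num_periods=2,
)
```

The resulting vector has 4 + 3 + 2 + 3 = 12 entries, so its length grows with the number of vehicles, stations, and time periods.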


2507.19530 2026-03-20 cs.LG cs.AI

When Validation Fails: Cross-Institutional Blood Pressure Prediction and the Limits of Electronic Health Record-Based Models

Md Basit Azam, Sarangthem Ibotombi Singh

English abstract

External validation remains rare in healthcare machine learning despite being critical for establishing real-world feasibility. We developed an ensemble framework to predict blood pressure from electronic health records, incorporating rigorous data leakage prevention. Internal validation on the MIMIC-III dataset yielded moderate performance for systolic (R^2 = 0.248, RMSE = 14.84 mmHg) and diastolic (R^2 = 0.297, RMSE = 8.27 mmHg) blood pressure. However, external validation on the eICU dataset revealed substantial generalization challenges. Baseline systolic performance dropped significantly from R^2 = 0.248 to -0.024, with RMSE increasing from 14.84 to 18.69 mmHg. To address potential confounding from feature imputation, we conducted an intersection-only experiment using 16 universally available features; this yielded worse external performance (R^2 = -0.115, RMSE = 17.32 mmHg), proving imputation artifacts were not the primary cause. Attempts at post-hoc correction, including linear and isotonic recalibration (R^2 ranging from -0.170 to 0.024) and domain adaptation via covariate shift reweighting (R^2 = -0.141), showed limited gains. This highlights fundamental cross-institutional barriers. Our root-cause analysis identified three primary obstacles to generalizability: (1) site-specific feature distributions, even among standard physiological variables; (2) underlying patient population differences with unique pathophysiologies; and (3) institutional variations in measurement protocols creating non-transferable learned patterns. These findings demonstrate that strong internal performance cannot guarantee cross-institutional deployment success. Transparent reporting of validation failures is essential for setting realistic expectations for predictive models. Code is available at https://github.com/mdbasit897/ehr-bp-ensemble.
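The negative R^2 values reported in this abstract mean that the model does worse on the external data than a constant prediction of the mean label. The sketch below shows how R^2 and RMSE behave in that situation. The blood pressure values are toy numbers chosen for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between labels and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    It equals 0 for a constant prediction of the label mean and becomes
    negative when predictions are worse than that baseline.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy systolic values in mmHg (hypothetical, not from the paper).
y_true = [110.0, 120.0, 130.0, 140.0]
mean_baseline = [125.0] * 4                # always predict the label mean
shifted_model = [130.0, 140.0, 150.0, 160.0]  # right trend, constant +20 bias

r2_baseline = r_squared(y_true, mean_baseline)
r2_shifted = r_squared(y_true, shifted_model)
rmse_shifted = rmse(y_true, shifted_model)
```

Here the shifted model tracks the trend perfectly, yet its systematic offset alone is enough to push R^2 below zero.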
