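arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.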

2603.04656 2026-03-06 cs.CL cs.IR cs.LG cs.MA

iAgentBench: Benchmarking Sensemaking Capabilities of Information-Seeking Agents on High-Traffic Topics

Preetam Prabhu Srikar Dammu, Arnav Palkhiwala, Tanya Roosta, Chirag Shah

Abstract

With the emergence of search-enabled generative QA systems, users are increasingly turning to tools that browse, aggregate, and reconcile evidence across multiple sources on their behalf. Yet many widely used QA benchmarks remain answerable by retrieving a single relevant passage, making them poorly suited for measuring cross-source sensemaking, such as integrating evidence, tracking causal links, and resolving dependencies across facets of a topic. We present iAgentBench, a dynamic ODQA benchmark that targets these higher-level information needs while keeping questions natural and grounded in realistic information-seeking behavior. iAgentBench draws seed topics from real-world attention signals and uses common user intent patterns to construct user-like questions whose answers require combining evidence from multiple sources, not just extracting a single snippet. Each instance is released with traceable evidence and auditable intermediate artifacts that support contamination checks and enable fine-grained diagnosis of failures in retrieval versus synthesis. Experiments across multiple LLMs show that retrieval improves accuracy, but retrieval alone does not reliably resolve these questions, underscoring the need to evaluate evidence use, not just evidence access.

2603.04647 2026-03-06 cs.CL

Coordinated Semantic Alignment and Evidence Constraints for Retrieval-Augmented Generation with Large Language Models

Xin Chen, Saili Uday Gadgil, Jiarong Qiu

2603.04642 2026-03-06 cs.RO

Autonomous Aerial Non-Destructive Testing: Ultrasound Inspection with a Commercial Quadrotor in an Unstructured Environment

Ruben Veenstra, Barbara Bazzana, Sander Smits, Antonio Franchi

2603.04638 2026-03-06 cs.CV cs.LG q-bio.QM

Spinverse: Differentiable Physics for Permeability-Aware Microstructure Reconstruction from Diffusion MRI

Prathamesh Pradeep Khole, Mario M. Brenes, Zahra Kais Petiwala, Ehsan Mirafzali, Utkarsh Gupta, Jing-Rebecca Li, Andrada Ianus, Razvan Marinescu

Comments 10 Pages, 5 Figures, 2 Tables

2603.04625 2026-03-06 cs.LG math.ST stat.ML stat.TH

K-Means as a Radial Basis function Network: a Variational and Gradient-based Equivalence

Felipe de Jesus Felix Arredondo, Alejandro Ucan-Puc, Carlos Astengo Noguez

Comments 21 pages, 2 figures, 1 appendix

2603.04614 2026-03-06 cs.CV

SGR3 Model: Scene Graph Retrieval-Reasoning Model in 3D

Zirui Wang, Ruiping Liu, Yufan Chen, Junwei Zheng, Weijia Fan, Kunyu Peng, Di Wen, Jiale Wei, Jiaming Zhang, Rainer Stiefelhagen

2603.04606 2026-03-06 cs.LG physics.plasm-ph

PDE foundation model-accelerated inverse estimation of system parameters in inertial confinement fusion

Mahindra Rautela, Alexander Scheinker, Bradley Love, Diane Oyen, Nathan DeBardeleben, Earl Lawrence, Ayan Biswas

2603.04598 2026-03-06 cs.CV

PinPoint: Evaluation of Composed Image Retrieval with Explicit Negatives, Multi-Image Queries, and Paraphrase Testing

Rohan Mahadev, Joyce Yuan, Patrick Poirson, David Xue, Hao-Yu Wu, Dmitry Kislyuk

Comments Accepted for CVPR 2026

2603.04597 2026-03-06 cs.CL cs.AI

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Lei Huang, Xiang Cheng, Chenxiao Zhao, Guobin Shen, Junjie Yang, Xiaocheng Feng, Yuxuan Gu, Xing Yu, Bing Qin

2603.04589 2026-03-06 cs.AI

ECG-MoE: Mixture-of-Expert Electrocardiogram Foundation Model

Yuhao Xu, Xiaoda Wang, Yi Wu, Wei Jin, Xiao Hu, Carl Yang

2603.04585 2026-03-06 cs.RO

ELLIPSE: Evidential Learning for Robust Waypoints and Uncertainties

Zihao Dong, Chanyoung Chung, Dong-Ki Kim, Mukhtar Maulimov, Xiangyun Meng, Harmish Khambhaita, Ali-akbar Agha-mohammadi, Amirreza Shaban

Comments 8 pages, 5 figures

2603.04582 2026-03-06 cs.AI cs.LG

Self-Attribution Bias: When AI Monitors Go Easy on Themselves

Dipika Khullar, Jack Hopkins, Rowan Wang, Fabien Roger

2603.04580 2026-03-06 cs.LG cs.AI

Why Do Neural Networks Forget: A Study of Collapse in Continual Learning

Yunqin Zhu, Jun Jin

2603.04579 2026-03-06 cs.RO

Risk-Aware Reinforcement Learning for Mobile Manipulation

Michael Groom, James Wilson, Nick Hawes, Lars Kunze

2603.04571 2026-03-06 cs.RO

Distributed State Estimation for Vision-Based Cooperative Slung Load Transportation in GPS-Denied Environments

Jack R. Pence, Jackson Fezell, Jack W. Langelaan, Junyi Geng

Comments In proceedings of the 2026 AIAA SciTech Forum, Session: Intelligent Systems-27

2603.04568 2026-03-06 cs.CV

Mask-aware inference with State-Space Models

Ignasi Mas, Ramon Morros, Javier-Ruiz Hidalgo, Ivan Huerta

2603.04565 2026-03-06 cs.CV

Structure-Guided Histopathology Synthesis via Dual-LoRA Diffusion

Xuan Xu, Prateek Prasanna

2603.04562 2026-03-06 cs.CV cs.LG

Fusion and Grouping Strategies in Deep Learning for Local Climate Zone Classification of Multimodal Remote Sensing Data

Ancymol Thomas, Jaya Sreevalsan-Nair

Comments 25 pages, 12 figures

2603.04560 2026-03-06 cs.RO

From Local Corrections to Generalized Skills: Improving Neuro-Symbolic Policies with MEMO

Benjamin A. Christie, Yinlong Dai, Mohammad Bararjanianbahnamiri, Simon Stepputtis, Dylan P. Losey

2603.04553 2026-03-06 cs.LG

Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling

Tal Daniel, Carl Qi, Dan Haramati, Amir Zadeh, Chuan Li, Aviv Tamar, Deepak Pathak, David Held

Comments ICLR 2026 Oral. Project webpage: https://taldatech.github.io/lpwm-web

2603.04549 2026-03-06 cs.AI cs.CL cs.MA

Adaptive Memory Admission Control for LLM Agents

Guilin Zhang, Wei Jiang, Xiejiashan Wang, Aisha Behr, Kai Zhao, Jeffrey Friedman, Xu Chu, Amine Anoun

2603.04547 2026-03-06 cs.RO

Many-RRT*: Robust Joint-Space Trajectory Planning for Serial Manipulators

Theodore M. Belmont, Benjamin A. Christie, Anton Netchaev

2603.04546 2026-03-06 cs.LG stat.ML

Oracle-efficient Hybrid Learning with Constrained Adversaries

Princewill Okoroafor, Robert Kleinberg, Michael P. Kim

2603.04538 2026-03-06 cs.CV

InverseNet: Benchmarking Operator Mismatch and Calibration Across Compressive Imaging Modalities

Chengshuai Yang, Xin Yuan

Comments Benchmarking Operator Mismatch and Calibration Across Compressive Imaging Modalities

2603.04534 2026-03-06 cs.LG cs.AI cs.CY

Invariant Causal Routing for Governing Social Norms in Online Market Economies

Xiangning Yu, Qirui Mi, Xiao Xue, Haoxuan Li, Yiwei Shi, Xiaowei Liu, Mengyue Yang

2603.04516 2026-03-06 cs.LG astro-ph.IM cs.AI

Augmenting representations with scientific papers

Nicolò Oreste Pinciroli Vago, Rocco Di Tella, Carolina Cuesta-Lázaro, Michael J. Smith, Cecilia Garraffo, Rafael Martínez-Galarza

Comments Accepted at the 2nd Workshop on Foundation Models for Science (ICLR 2026)

2603.04514 2026-03-06 cs.AI

Progressive Refinement Regulation for Accelerating Diffusion Language Model Decoding

Lipeng Wan, Jianhui Gu, Junjie Ma, Jianguo Huang, Shiguang Sun, Siyuan Li, Xuguang Lan

Comments 19 pages, 10 figures, Code available upon publication

2603.04509 2026-03-06 cs.CV

Recognition of Daily Activities through Multi-Modal Deep Learning: A Video, Pose, and Object-Aware Approach for Ambient Assisted Living

Kooshan Hashemifard, Pau Climent-Pérez, Francisco Florez-Revuelta

2603.04478 2026-03-06 cs.LG

Standing on the Shoulders of Giants: Rethinking EEG Foundation Model Pretraining via Multi-Teacher Distillation

Chenqi Li, Yu Liu, Shuo Zhang, Timothy Denison, Tingting Zhu

2603.04477 2026-03-06 cs.LG cs.AI

Activity Recognition from Smart Insole Sensor Data Using a Circular Dilated CNN

Yanhua Zhao

Comments 4 pages, 5 figures