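arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.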

2604.03240 2026-04-07 cs.LG cs.AI cs.CL

Scaling DPPs for RAG: Density Meets Diversity

Xun Sun, Baiheng Xie, Li Huang, Qiang Gao

Abstract

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevant responses that are aligned with factual evidence and evolving corpora. Standard RAG pipelines construct context through relevance ranking, performing point-wise scoring between the user query and each corpus chunk. This formulation, however, ignores interactions among retrieved candidates, leading to redundant contexts that dilute density and fail to surface complementary evidence. We argue that effective retrieval should optimize jointly for both density and diversity, ensuring grounding evidence that is dense in information yet diverse in coverage. In this study, we propose ScalDPP, a diversity-aware retrieval mechanism for RAG that incorporates Determinantal Point Processes (DPPs) through a lightweight P-Adapter, enabling scalable modeling of inter-chunk dependencies and complementary context selection. In addition, we develop a novel set-level objective, Diverse Margin Loss (DML), that enforces ground-truth complementary evidence chains to dominate any equally sized redundant alternatives under DPP geometry. Experimental results demonstrate the superiority of ScalDPP, substantiating our core statement in practice.
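
The contrast between point-wise relevance ranking and DPP-style set selection can be illustrated with a minimal sketch. This is not the ScalDPP method: the P-Adapter and Diverse Margin Loss are not reproduced here. The kernel L = diag(q) S diag(q), with q the relevance scores and S the cosine similarity matrix, is a common quality-diversity construction and an assumption on our part, as are the function name `greedy_dpp_select` and the toy data.

```python
import numpy as np

def greedy_dpp_select(embeddings, relevance, k):
    """Greedily pick up to k chunks that approximately maximize det(L_S).

    Assumed kernel (not taken from the paper): L = diag(q) @ S @ diag(q),
    where q holds relevance scores and S is cosine similarity.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    similarity = unit @ unit.T
    kernel = relevance[:, None] * similarity * relevance[None, :]

    selected = []
    for _ in range(min(k, len(relevance))):
        best, best_logdet = None, -np.inf
        for i in range(len(relevance)):
            if i in selected:
                continue
            idx = selected + [i]
            # log-determinant of the kernel restricted to the candidate set
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:  # every remaining candidate is linearly dependent
            break
        selected.append(best)
    return selected

# Toy data: chunks 0 and 1 are near-duplicates, chunk 2 is complementary.
emb = np.array([[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]])
rel = np.array([1.0, 0.95, 0.6])
print(greedy_dpp_select(emb, rel, 2))  # → [0, 2]
```

Ranking by relevance alone would return chunks 0 and 1 here. The determinant penalizes their near-collinearity, so the less relevant but complementary chunk 2 is chosen second.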

2604.03239 2026-04-07 cs.AI

To Throw a Stone with Six Birds: On Agents and Agenthood

Ioannis Tsiokos

2604.03234 2026-04-07 cs.AI math.OC

Structural Segmentation of the Minimum Set Cover Problem: Exploiting Universe Decomposability for Metaheuristic Optimization

Isidora Hernández, Héctor Ferrada, Cristóbal A. Navarro

Comments: Submitted to journal

2604.03233 2026-04-07 cs.LG cs.NA math.NA

Integrating Artificial Intelligence, Physics, and Internet of Things: A Framework for Cultural Heritage Conservation

Carmine Valentino, Federico Pichi, Francesco Colace, Dajana Conte, Gianluigi Rozza

Abstract

The conservation of cultural heritage increasingly relies on integrating technological innovation with domain expertise to ensure effective monitoring and predictive maintenance. This paper presents a novel framework to support the preservation of cultural assets, combining Internet of Things (IoT) and Artificial Intelligence (AI) technologies, enhanced with the physical knowledge of phenomena. The framework is structured into four functional layers that permit the analysis of 3D models of cultural assets and elaborate simulations based on the knowledge acquired from data and physics. A central component of the proposed framework is Scientific Machine Learning, particularly Physics-Informed Neural Networks (PINNs), which incorporate physical laws into deep learning models. To enhance computational efficiency, the framework integrates Reduced Order Methods (ROMs), specifically Proper Orthogonal Decomposition (POD), and is compatible with classical Finite Element (FE) methods. Additionally, it includes tools to automatically manage and process 3D digital replicas, enabling their direct use in simulations. The proposed approach offers three main contributions: a methodology for processing 3D models of cultural assets for reliable simulation; the application of PINNs to combine data-driven and physics-based approaches in cultural heritage conservation; and the integration of PINNs with ROMs to efficiently model degradation processes influenced by environmental and material parameters. The reproducible and open-access experimental phase exploits simulated scenarios on complex and real-life geometries to test the efficacy of the proposed framework in each of its key components, making it possible to address both direct and inverse problems. Code availability: https://github.com/valc89/PhysicsInformedCulturalHeritage
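
As a rough illustration of the Proper Orthogonal Decomposition step mentioned in the abstract, the sketch below extracts a reduced basis from a snapshot matrix with a truncated SVD. It is not the authors' implementation (the repository linked above is the place to look for that). The function name `pod_basis`, the energy threshold, and the synthetic two-mode field are assumptions made for the example.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return the leading POD modes of a snapshot matrix.

    snapshots: array of shape (n_dofs, n_snapshots), one solution per column.
    energy: fraction of squared singular-value energy to retain (assumed
            truncation criterion, chosen here for illustration).
    """
    modes, sigma, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(sigma**2) / np.sum(sigma**2)
    # smallest number of modes whose cumulative energy reaches the threshold
    rank = min(int(np.searchsorted(cumulative, energy)) + 1, len(sigma))
    return modes[:, :rank], sigma[:rank]

# Synthetic parametric field built from two spatial modes (hypothetical data).
x = np.linspace(0.0, 1.0, 50)
mus = np.linspace(0.5, 2.0, 20)
snapshots = np.stack(
    [mu * np.sin(np.pi * x) + 0.5 * mu**2 * np.sin(2 * np.pi * x) for mu in mus],
    axis=1,
)

basis, sigma = pod_basis(snapshots)
reduced = basis.T @ snapshots        # coordinates in the reduced space
reconstructed = basis @ reduced      # projection back to the full space
print(basis.shape, np.max(np.abs(reconstructed - snapshots)))
```

Because the synthetic field has only two independent spatial modes, the retained basis has two columns and the projection error sits at rounding level. Real degradation simulations would typically need more modes and a problem-specific threshold.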

2604.03232 2026-04-07 cs.AI

IC3-Evolve: Proof-/Witness-Gated Offline LLM-Driven Heuristic Evolution for IC3 Hardware Model Checking

Mingkai Miao, Guangyu Hu, Ziyi Yang, Hongce Zhang

2604.02007 2026-04-07 cs.LG

Apriel-1.5-OpenReasoner: RL Post-Training for General-Purpose and Efficient Reasoning

Rafael Pardinas, Ehsan Kamalloo, David Vazquez, Alexandre Drouin

Comments: 20 pages, 4 tables, 6 figures, appendix included

2604.01702 2026-04-07 cs.CL

On the Role of Reasoning Patterns in the Generalization Discrepancy of Long Chain-of-Thought Supervised Fine-Tuning

Zhaoyi Li, Xiangyu Xi, Zhengyu Chen, Wei Wang, Gangwei Jiang, Ranran Shen, Linqi Song, Ying Wei, Defu Lian

Comments: Under Review. version2: correct typos in Table 4 and add an ablation study (Table 5)

2604.01676 2026-04-07 cs.CV cs.AI cs.SE

GPA: Learning GUI Process Automation from Demonstrations

Zirui Zhao, Jun Hao Liew, Yan Yang, Wenzhuo Yang, Ziyang Luo, Doyen Sahoo, Silvio Savarese, Junnan Li

2604.01438 2026-04-07 cs.AI

ClawSafety: "Safe" LLMs, Unsafe Agents

Bowen Wei, Yunbei Zhang, Jinhao Pan, Kai Mei, Xiao Wang, Jihun Hamm, Ziwei Zhu, Yingqiang Ge

2604.01168 2026-04-07 cs.CL cs.LG

S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models

Jack Young

Comments: 15 pages (10 main + 5 appendix), 3 figures, code at https://github.com/jackyoung27/s0-tuning

2604.00909 2026-04-07 cs.CV

JAMMEval: A Refined Collection of Japanese Benchmarks for Reliable VLM Evaluation

Issa Sugiura, Koki Maeda, Shuhei Kurita, Yusuke Oda, Daisuke Kawahara, Naoaki Okazaki

Comments: 16 pages, 11 figures

2604.00904 2026-04-07 cs.LG

Fatigue-Aware Learning to Defer via Constrained Optimisation

Zheng Zhang, Cuong C. Nguyen, David Rosewarne, Kevin Wells, Gustavo Carneiro

2604.00733 2026-04-07 cs.LG cs.AI

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

Björn Roman Kohlberger

Comments: 8 pages, 3 figures, 4 tables. Patent pending: Irish Application PTIE20260000000219. Code at https://github.com/EctoSpace/SCT

2604.00672 2026-04-07 cs.CL cs.IR math.ST stat.TH

Common TF-IDF variants arise as key components in the test statistic of a penalized likelihood-ratio test for word burstiness

Zeyad Ahmed, Paul Sheridan, Michael McIsaac, Aitazaz A. Farooque

Comments: 27 pages, 3 tables, 7 figures, accepted in Discover Computing 2026

2604.00449 2026-04-07 cs.LG cs.MA cs.SY eess.SY

Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout

Amirhossein Dezhboro, Fateme Maleki, Arman Adibi, Erfan Amini, Jose E. Ramirez-Marquez

2603.29773 2026-04-07 cs.CV

Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration

Fengyang Xiao, Peng Hu, Lei Xu, XingE Guo, Guanyi Qin, Yuqi Shen, Chengyu Fang, Rihan Zhang, Chunming He, Sina Farsiu

Comments: Accepted by CVPR 2026

2603.29270 2026-04-07 cs.CV

Unbiased Model Prediction Without Using Protected Attribute Information

Puspita Majumdar, Surbhi Mittal, Saheb Chhabra, Mayank Vatsa, Richa Singh

2603.29087 2026-04-07 cs.SD eess.AS

IQRA 2026: Interspeech Challenge on Automatic Pronunciation Assessment for Modern Standard Arabic (MSA)

Yassine El Kheir, Amit Meghanani, Mostafa Shahin, Omnia Ibrahim, Shammur Absar Chowdhury, Nada AlMarwani, Youssef Elshahawy, Ahmed Ali

Comments: 5 pages paper

2603.29086 2026-04-07 cs.LG cs.CE

Realistic Market Impact Modeling for Reinforcement Learning Trading Environments

Lucas Riera Abbade, Anna Helena Reali Costa

2603.28858 2026-04-07 cs.CL cs.AI cs.LG

OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

Haiyue Song, Masao Utiyama

Comments: Preprint, 20 pages, 10 tables, 12 figures

2603.28743 2026-04-07 cs.LG

Rethinking Language Model Scaling under Transferable Hypersphere Optimization

Liliang Ren, Yang Liu, Yelong Shen, Weizhu Chen

2603.28733 2026-04-07 cs.LG

See it to Place it: Evolving Macro Placements with Vision-Language Models

Ikechukwu Uchendu, Swati Goel, Karly Hou, Ebrahim Songhori, Kuang-Huei Lee, Joe Wenjie Jiang, Vijay Janapa Reddi, Vincent Zhuang

Comments: 31 pages, 12 figures, 14 tables

2603.28533 2026-04-07 cs.CL

GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum

Shuwen Xu, Yao Xu, Jiaxiang Liu, Chenhao Yuan, Wenshuo Peng, Jun Zhao, Kang Liu

2603.27139 2026-04-07 cs.CV

The Geometry of Robustness: Optimizing Loss Landscape Curvature and Feature Manifold Alignment for Robust Finetuning of Vision-Language Models

Shivang Chopra, Shaunak Halbe, Chengyue Huang, Brisa Maneechotesuwan, Zsolt Kira

2603.26357 2026-04-07 cs.CV

MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model

Quan Dao, Dimitris Metaxas

Comments: Accepted at CVPR 2026

2603.25029 2026-04-07 cs.LG

Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback

Haishan Ye

2603.22620 2026-04-07 cs.LG cs.AI

Causal Discovery in Action: Learning Chain-Reaction Mechanisms from Interventions

Panayiotis Panayiotou, Özgür Şimşek

Comments: Accepted to the 5th Conference on Causal Learning and Reasoning (CLeaR 2026)

2603.21236 2026-04-07 cs.LG

Posterior-Calibrated Causal Circuits in Variational Autoencoders: Why Image-Domain Interpretability Fails on Tabular Data

Dip Roy, Rajiv Misra, Sanjay Kumar Singh, Anisha Roy

2603.20910 2026-04-07 cs.LG

LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models

Amirmohammad Ziaei Bideh, Jonathan Gryak

2603.19186 2026-04-07 cs.LG

Improving RCT-Based CATE Estimation Under Covariate Mismatch via Calibrated Alignment

Amir Asiaee, Samhita Pal