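arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.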

2605.01664 2026-05-05 cs.IR

A Hybrid Retrieval and Reranking Framework for Evidence-Grounded Retrieval-Augmented Generation

Fariba Afrin Irany, Sampson Akwafuo

Abstract

Retrieval-augmented generation (RAG) improves large language model reliability by grounding generated responses in external evidence. However, RAG performance depends on the relevance of retrieved passages, the quality of evidence ranking, and the ability to verify whether generated claims are supported by source documents. This study presents a hybrid retrieval and reranking framework for citation-aware RAG in biomedical and healthcare-related document question answering. The framework uses Amazon Bedrock Knowledge Bases for document ingestion, parsing, chunking, embedding generation, and evidence retrieval. Source PDF documents are stored in Amazon S3, embedded using Amazon Titan Text Embeddings V2, and indexed with Amazon OpenSearch Serverless. Hybrid retrieval first retrieves candidate evidence chunks, and Cohere reranking then prioritizes the most relevant passages before answer generation. The answer-generation stage uses top-ranked evidence chunks to produce controlled, evidence-grounded responses, while a separate judge model evaluates each generated factual claim against the retrieved evidence. The framework was evaluated using 25 biomedical NLP and healthcare transformer queries as a pilot-scale proof-of-concept study. Across the evaluation set, the system retrieved and reranked 500 evidence chunks and generated answers from top-ranked evidence. Claim-level grounding evaluation extracted 200 factual claims, all of which were judged to be supported by retrieved evidence, resulting in 100.0% grounding accuracy. The results suggest that hybrid retrieval, reranking, conservative prompting, and claim-level evaluation can support reliable evidence-grounded RAG responses when sufficient source evidence is available.
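
The claim-level grounding metric reported above can be illustrated with a minimal sketch. It assumes grounding accuracy is simply the percentage of extracted claims that the judge model marks as supported, which matches the reported figures (200 of 200 claims, 100.0%). The `ClaimJudgment` type, the `grounding_accuracy` function, and the even split of 8 claims per query are hypothetical and chosen only for illustration; the abstract gives totals, not a per-query breakdown.

```python
from dataclasses import dataclass


@dataclass
class ClaimJudgment:
    """One factual claim plus the judge model's verdict (hypothetical schema)."""
    query_id: int
    claim: str
    supported: bool


def grounding_accuracy(judgments):
    """Return the percentage of claims judged as supported by retrieved evidence."""
    if not judgments:
        raise ValueError("no claims to evaluate")
    supported = sum(1 for j in judgments if j.supported)
    return 100.0 * supported / len(judgments)


# Toy data: 25 queries with 8 claims each (an assumed even split), all supported.
judgments = [
    ClaimJudgment(query_id=q, claim=f"claim {i}", supported=True)
    for q in range(25)
    for i in range(8)
]

accuracy = grounding_accuracy(judgments)
print(f"{len(judgments)} claims, {accuracy:.1f}% grounded")
```

With totals like these, a single unsupported claim would lower the score by 0.5 percentage points, which gives a sense of the metric's resolution at pilot scale.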

2605.01662 2026-05-05 cs.CV

Video Active Perception: Effective Inference-Time Long-Form Video Understanding with Vision-Language Models

Martin Q. Ma, Willis Guo, Aditya Agrawal, Ankit Gupta, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Comments ICCV 2025 workshop

2605.01659 2026-05-05 cs.CV cs.AI

TRIMMER: A New Paradigm for Video Summarization through Self-Supervised Reinforcement Learning

Pritam Mishra, Coloma Ballester, Dimosthenis Karatzas

2605.01657 2026-05-05 cs.CV

Act2See: Emergent Active Visual Perception for Video Reasoning

Martin Q. Ma, Yuxiao Qu, Aditya Agrawal, Willis Guo, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Comments CVPR 2026

2605.01656 2026-05-05 q-bio.NC cs.AI cs.LG

From Cortical Synchronous Rhythm to Brain Inspired Learning Mechanism: An Oscillatory Spiking Neural Network with Time-Delayed Coordination

Tingting Dan, Guorong Wu

Comments 19 pages, 6 figures

2605.01655 2026-05-05 math.CA cs.LG

Exact Loop Controllers for ReLU Realization of Homogeneous Curve Refinements

Boldsaikhan Bolorkhuu, Tsogtgerel Gantumur

Comments 39 pages, 6 figures

2605.01654 2026-05-05 cs.CR math.FA

Limit Properties at Critical Indices of Linear Canonical Riesz Potentials and Their Applications to Security of Multi-Image Encryption

Zunwei Fu, Dachun Yang, Shuhui Yang

Comments 39 pages

2605.01653 2026-05-05 cs.CV

SteeringDiffusion: A Bottlenecked Activation Control Interface for Diffusion Models

Fangzheng Wu, Brian Summa

2605.01650 2026-05-05 cs.LG

Geospatial foundation-model embeddings improve population estimation unevenly across space and scale

Wenbin Zhang, Eimear Cleary, Francisco Rowe, Somnath Chaudhuri, Maksym Bondarenko, Shengjie Lai, Andrew J. Tatem

2605.01647 2026-05-05 cs.CL

Beyond Perplexity: Character Distribution Signatures and the MDTA Benchmark for AI Text Detection

Priyadarshan Narayanasamy, Swastik Agrawal, Klint Faber, Fardina Fathmiul Alam

Comments 11 figures, 10 tables, 24 pages, Under Review at COLM 2026

2605.01644 2026-05-05 cs.CR

Toward a Principled Framework for Agent Safety Measurement

Shuyi Lin, Anshuman Suri, Alina Oprea, Cheng Tan

2605.01640 2026-05-05 cs.LG cs.CL

Prescriptive Scaling Laws for Data Constrained Training

Justin Lovelace, Christian Belardi, Srivatsa Kundurthy, Shriya Sudhakar, Kilian Q. Weinberger

2605.01638 2026-05-05 cs.CV

Omni-Fake: Benchmarking Unified Multimodal Social Media Deepfake Detection

Tianxiao Li, Zhenglin Huang, Haiquan Wen, Yiwei He, Xinze Li, Bingyu Zhu, Wuhui Duan, Congang Chen, Zeyu Fu, Yi Dong, Baoyuan Wu, Jason Li, Guangliang Cheng

Comments Accepted to CVPR 2026

2605.01637 2026-05-05 cs.LG cs.CC cs.DM math.CO

The Banach-Butterfly Invariant: Influence-Adaptive Walsh Geometry for Ternary Polynomial Threshold Functions

Gorgi Pavlov

Comments 21 pages, 3 figures. Theory paper; LLM-application companion in preparation. Code, certificates, and 616,126 NPN-canonical n=5 representatives in supplementary repository

2605.01636 2026-05-05 math.LO cs.LO

Inexpressibility in Exp-Minus-Log

Mark Carney

Comments 5 pages

2605.01634 2026-05-05 cs.LG

Chebyshev-Augmented One-Shot Transfer Learning for PINNs on Nonlinear Differential Equations

Yiqi Rao, Pavlos Protopapas

Comments 18 pages, 4 figures, 9 tables, accepted to ICLR 2026 Workshop on Artificial Intelligence and Partial Differential Equations

2605.01632 2026-05-05 cs.LG

Perturb and Correct: Post-Hoc Ensembles using Affine Redundancy

Eleanor Quint

2605.01630 2026-05-05 cs.CL cs.AI

Prosa: Rubric-Based Evaluation of LLMs on Real User Chats in Brazilian Portuguese

Roseval Malaquias Junior, Giovana Kerche Bonás, Thales Sales Almeida, Hugo Abonizio, Thiago Laitz, Ramon Pires, Marcos Piau, Celio Larcher, Rodrigo Nogueira

2605.01628 2026-05-05 stat.ML cs.LG math.ST stat.TH

Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression

Fan Chen, Jian Qian, Alexander Rakhlin, Nikita Zhivotovskiy

Abstract

Self-normalized martingale inequalities lie at the heart of confidence ellipsoids for online least squares and, more broadly, many bandit and reinforcement-learning results. Yet existing vector and scalar results typically rely on bounded covariates and an explicit regularization matrix, producing bounds that are \emph{not scale-invariant}: although the self-normalized quantity is scale-invariant by definition, its standard upper bounds are not. We characterize when scale-invariant upper bounds on self-normalized martingales are possible. Without further assumptions, we prove that nontrivial scale-invariant bounds exist only in dimension $d=1$; moreover, in $d=1$ we obtain $O(\log T)$ scale-invariant self-normalized bounds without any assumptions on the covariates. In contrast, for $d>1$ we show that no nontrivial scale-invariant bound can hold in full generality. We then connect this dichotomy to \emph{doubly-uniform} regret in online linear regression (i.e., regret bounds that are simultaneously independent of the covariate scale and the comparator norm) and use it to resolve the open question of Gaillard, Gerchinovitz, Huard, and Stoltz, \emph{``Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss''} (ALT 2019): in $d=1$ we give an explicit algorithm with $O(\log T)$ doubly-uniform regret, whereas for $d>1$ sublinear doubly-uniform regret is impossible. Finally, under a natural \emph{smoothness} condition (bounded Radon--Nikodym derivatives of the conditional covariate laws with respect to a fixed base measure), we recover sublinear regret for $d>1$ without bounded covariates and derive a self-normalized concentration inequality free of the usual regularization penalties, yielding arguably a first natural scale-invariant bound for adaptive, non-i.i.d. vector martingales.

2605.01617 2026-05-05 math.NA cs.NA math.AP

Discontinuity Analysis and Semi-Analytic Spectral Approximation for the Nonlocal Poisson Equation

Thinh Dang, Bacim Alali, Nathan Albin

2605.01614 2026-05-05 cs.DC cs.OS

CvxCluster: Solving Large, Complex, Granular Resource Allocation Problems 100-1000x Faster

Obi Nnorom, Stephen Boyd, Philip Levis

Comments 13 pages, 5 figures, 2 tables. Submitted to SOSP 2026

2605.01611 2026-05-05 cs.CY cs.AI cs.LG

The Case for ESM3 as a General-Purpose AI Model with Systemic Risk Under the EU AI Act

Taro Qureshi, Jacob Griffith, Koen Holtman, Marcel Mir Teijeiro, Ze Shen Chin, Rokas Gipiškis

Comments 8 pages, 1 figure, Technical AI Safety Conference

2605.01610 2026-05-05 cs.HC cs.AI

Less Interaction But More Explanation: A Communication Perspective on Agentic AI Interfaces

Eunchae Jang, S. Shyam Sundar

2605.01609 2026-05-05 cs.LG cs.AI

Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations

Pratyush Acharya, Nuraj Rimal, Habish Dhakal

Comments 25 pages, 16 figures, 13 tables

2605.01605 2026-05-05 cs.CL cs.AI

Where Do Prompt Perturbations Break Generation? A Segment-Level View of Robustness in LoRA-Tuned Language Models

Zhuoyun Li, Boxuan Wang, Jinwei Hu, Zhenglin Huang, Qisong He, Xinmiao Huang, Guangliang Cheng, Xiaowei Huang, Yi Dong

Comments Under review

2605.01604 2026-05-05 cs.AI

Evaluating Agentic AI in the Wild: Failure Modes, Drift Patterns, and a Production Evaluation Framework

Mukund Pandey

Comments 11 pages, 6 tables, 1 figure. Reference implementation: https://github.com/mukund1985/llm-eval-toolkit

2605.01600 2026-05-05 cs.SE

A Lightweight Scrum Sprint Simulation to Help Learners Traverse the Empirical Process Control Threshold Concept

Eduardo Miranda, Torgeir Dingsøyr, Pritam Chita

Comments 10 pages

2605.01596 2026-05-05 cs.CL

Fine-Tuning Pre-Trained Code Models for AI-Generated Code Detection

Jany-Gabriel Ispas, Sergiu Nisioi

Comments Archaeology at SemEval-2026 Task 13

2605.01592 2026-05-05 cs.CG

Witness Set: A Visibility Problem in $NP\cap XP$

Satyabrata Jana, Debabrata Pal, Bodhayan Roy, Sasanka Roy

Comments 24 pages, 17 figures

Abstract

We study the Witness Set problem, a natural dual to the classical Art Gallery problem. In the Witness Set problem, we are given a polygon $P$ and an integer $k$ as input, and the objective is to determine whether $P$ has a witness set of size at least $k$. A point set $X$ in $P$ is called a witness set if every point in $P$ is visible from at most one point in $X$. For simple polygons, we show that Witness Set lies in both $NP$ and $XP$. This stands in sharp contrast to its dual, the Art Gallery problem, which was recently shown to be $\exists \mathbb{R}$-complete by Abrahamsen et al. and is therefore neither in $NP$ nor admits a polynomial-size discretization unless $NP=\exists \mathbb{R}$. In contrast, we prove that Witness Set for simple polygons admits a finite discretization of size $n^{f(k)}$ for some function $f$. For comparison, even for simple polygons, Efrat and Har-Peled gave an algorithm for Art Gallery running in time $n^{O(k)}$ using tools from real algebraic geometry, and it appears difficult to obtain such algorithms without this machinery. On the other hand, our approach for Witness Set is purely combinatorial and relies on discretization, leading to an $n^{f(k)}$-time algorithm. Although Amit et al. claimed more than fifteen years ago that Witness Set is $NP$-hard, no proof or reference was provided. We show that the discrete version of the Witness Set problem - where the witness set must be chosen from a given finite point set $Q$ (instead of allowing witnesses to be chosen anywhere in the polygon), referred to as Discrete Witness Set - is $NP$-complete, even when the input is restricted to rectilinear polygons with holes. However, for simple polygons, Discrete Witness Set admits a polynomial-time algorithm by Das et al. Thus, it remains an open question whether the Witness Set problem is $NP$-hard.
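
The witness-set condition stated in the abstract can be written out formally. The sketch below assumes the standard notion of visibility in a polygon (two points see each other when the segment joining them lies inside $P$), which the abstract does not spell out; the symbols $\mathrm{Vis}(x)$ and $\overline{xp}$ are notation introduced here for illustration.

```latex
% Visibility region of a point x in the polygon P (assumed standard definition)
\[
  \mathrm{Vis}(x) \;=\; \{\, p \in P \;:\; \overline{xp} \subseteq P \,\}
\]
% X is a witness set if every point of P is seen by at most one point of X
\[
  X \subseteq P \text{ is a witness set}
  \;\iff\;
  \forall p \in P:\ \bigl|\{\, x \in X : p \in \mathrm{Vis}(x) \,\}\bigr| \le 1
  \;\iff\;
  \mathrm{Vis}(x) \cap \mathrm{Vis}(y) = \emptyset \ \text{ for all distinct } x, y \in X
\]
```

Under this reading, the decision problem asks whether $P$ contains at least $k$ points with pairwise disjoint visibility regions.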

2605.01591 2026-05-05 cs.IR cs.CL

Led to Mislead: Adversarial Content Injection for Attacks on Neural Ranking Models

Amin Bigdeli, Amir Khosrojerdi, Radin Hamidi Rad, Morteza Zihayat, Charles L. A. Clarke, Ebrahim Bagheri