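arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.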

2604.28118 2026-05-01 cs.SE cs.AI cs.LG

DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures

Sigma Jahan, Saurabh Singh Rajput, Tushar Sharma, Mohammad Masudur Rahman

Comments 71 pages, 15 figures, 22 tables. Preprint; under preparation for journal submission. Standalone version of Chapter 7 of the lead author's PhD thesis (Dalhousie University, 2026). Replication package: https://github.com/SigmaJahan/DEFaultplusplus-Transformer-Debugging

2604.28065 2026-05-01 physics.soc-ph cs.LG

Assessing the Role of Intersection Proximity in Pedestrian Crashes: Insights from Data Mining Approach

Ahmed Hossain, Xiaoduan Sun, Subasish Das

Comments 59 pages, 14 figures

2604.28061 2026-05-01 cs.DL cs.CL

Measuring research data reuse in scholarly publications using generative artificial intelligence: Open Science Indicator development and preliminary results

Lauren Cadwallader, Iain Hrynaszkiewicz, parth sarin, Tim Vines

Comments 12 pages. Submitted to 30th Annual International Conference on Science and Technology Indicators

2604.28053 2026-05-01 cs.CY cs.AI

To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems

Shreya Chappidi, Jatinder Singh

Comments Accepted to ACM FAccT 2026

2604.28021 2026-05-01 physics.soc-ph cs.CL

Universal statistical laws governing culinary design

Ganesh Bagler, Gopal Krishna Tewari, Aditya Raj Yadav, Akshat Singh, Pranay Bansal, Ujjval Dargar, Mansi Goel, Madhvi Kumari Sinha

Comments 48 Pages (28 Pages of Main Manuscript + Supplementary Information), 4 Main Figures, 6 Extended Data Figures

2604.28018 2026-05-01 cs.CE cs.AI

Design Structure Matrix Modularization with Large Language Models

Shuo Jiang, Jianxi Luo

2604.27969 2026-05-01 cs.SE cs.AI

From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

Guang Yang, Xing Hu, Xiang Chen, Xin Xi

2604.27952 2026-05-01 eess.IV cs.IT cs.LG math.IT

Diffusion-OAMP for Joint Image Compression and Wireless Transmission

Wentao Hou, Yimin Bai, Zelei Luo, Jiadong Hong, Lei Liu

Comments 6 pages, 5 figures, 2 tables, submitted for a possible publication

2604.27892 2026-05-01 stat.ML cs.LG stat.AP

Prediction-powered Inference by Mixture of Experts

Yanwu Gu, Linglong Kong, Dong Xia

2604.27883 2026-05-01 math.ST cs.IT cs.LG math.IT stat.ML stat.TH

Decoupled Descent: Exact Test Error Tracking Via Approximate Message Passing

Max Lovig

Comments 43 Pages, 7 Figures

2604.27866 2026-05-01 eess.AS cs.MM cs.SD

LRS-VoxMM: A benchmark for in-the-wild audio-visual speech recognition

Doyeop Kwak, Jeongsoo Choi, Suyeon Lee, Joon Son Chung

Comments Technical report for the LRS-VoxMM dataset release. Project page: https://mm.kaist.ac.kr/projects/voxmm

2604.27861 2026-05-01 cs.CR cs.CL cs.LG

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

Bowen Sun, Chaozhuo Li, Yaodong Yang, Yiwei Wang, Chaowei Xiao

2604.27852 2026-05-01 cs.IR cs.AI

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Shiyao Peng, Qianhe Zheng, Zhuodi Hao, Zichen Tang, Rongjin Li, Qing Huang, Jiayu Huang, Jiacheng Liu, Yifan Zhu, Haihong E

Comments Accepted to WWW 2026

Details

DOI: 10.1145/3774904.3792093
Journal ref: Proc. ACM Web Conf. 2026, pages 1899-1910

English abstract

Although precise recall is a core objective in Retrieval-Augmented Generation (RAG), a critical oversight persists in the field: improvements in retrieval performance do not consistently translate to commensurate gains in downstream reasoning. To diagnose this gap, we propose the Recall Conversion Rate (RCR), a novel evaluation metric to quantify the contribution of retrieval to reasoning accuracy. Our quantitative analysis of mainstream RAG methods reveals that as Recall@5 improves, the RCR exhibits a near-linear decay. We identify the neglect of retrieval quality in these methods as the underlying cause. In contrast, approaches that focus solely on quality optimization often suffer from inferior recall performance. Both categories lack a comprehensive understanding of retrieval quality optimization, resulting in a trade-off dilemma. To address these challenges, we propose comprehensive retrieval quality optimization criteria and introduce the NeocorRAG framework. This framework achieves holistic retrieval quality optimization by systematically mining and utilizing Evidence Chains. Specifically, NeocorRAG first employs an innovative activated search algorithm to obtain a refined candidate space. Then it ensures precise evidence chain generation through constrained decoding. Finally, the retrieved set of evidence chains guides the retrieval optimization process. Evaluated on benchmarks including HotpotQA, 2WikiMultiHopQA, MuSiQue, and NQ, NeocorRAG achieves SOTA performance on both 3B and 70B parameter models, while consuming less than 20% of tokens used by comparable methods. This study presents an efficient, training-free paradigm for RAG enhancement that effectively optimizes retrieval quality while maintaining high recall. Our code is released at https://github.com/BUPT-Reasoning-Lab/NeocorRAG.
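
The abstract refers to Recall@5 but does not give the formula for the Recall Conversion Rate, so the sketch below covers only Recall@k in its standard form: the fraction of relevant documents that appear among the top k retrieved results. The function name `recall_at_k` and the document IDs are illustrative assumptions, not taken from the NeocorRAG code.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Standard Recall@k: share of relevant items found in the top-k results.

    `retrieved` is a ranked list of document IDs (best first);
    `relevant` is a collection of gold document IDs.
    Both names are hypothetical, chosen for this illustration only.
    """
    gold = set(relevant)
    if not gold:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & gold) / len(gold)


# Made-up example: two gold documents, only one of them ranked in the top 5.
ranked = ["d3", "d7", "d1", "d9", "d4", "d2"]
gold_docs = {"d1", "d2"}
score = recall_at_k(ranked, gold_docs, k=5)
```

A multi-hop question typically has several gold passages, so a high Recall@5 only says that the passages were retrieved, not that the reader model used them, which is the gap the abstract describes.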


2604.27844 2026-05-01 cs.DC cs.CL

ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training

Wenxiang Lin, Xinglin Pan, Ruibo Fan, Shaohuai Shi, Xiaowen Chu

2604.27790 2026-05-01 cs.IR cs.AI cs.CL cs.CY cs.HC

How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews

Riley Grossman, Songjiang Liu, Michael K. Chen, Mike Smith, Cristian Borcea, Yi Chen

Comments Paper Accepted to ACM SIGIR 2026 (49th International ACM SIGIR Conference on Research and Development in Information Retrieval)

2604.27789 2026-05-01 cs.SE cs.AI

Test Before You Deploy: Governing Updates in the LLM Supply Chain

Mohd Sameen Chishti, Damilare Peter Oyinloye, Jingyue Li

Comments 4 pages, 1 figure, accepted to The 2nd International Workshop on Large Language Model Supply Chain Analysis (LLMSC2026) co-located with FSE 2026

2604.27780 2026-05-01 cs.AR cs.AI

RuC: HDL-Agnostic Rule Completion Benchmark Generation

Arnau Ayguadé Domingo, Miquel Alberti-Binimelis, Cristian Gutierrez-Gomez, Emanuele Parisi, Razine Moundir Ghorab, Miquel Moreto, Gokcen Kestor, Dario Garcia-Gasulla

Comments 7 pages, 6 figures

Details

English abstract

Large Language Models (LLMs) have rapidly improved in performance across code-related tasks, making their integration into Register Transfer Level (RTL) development increasingly attractive. Mimicking the behavior of inline code assistants, many benchmarks evaluate LLMs' capabilities in code completion, either assessing the generation of entire hardware modules or the completion of a single line within a module. However, both of these approaches lack the ability to control the granularity of the code-completion sample size and the syntactic range of completions. To overcome these limitations, we present a framework for language-agnostic rule completion (RuC), a grammar-driven, rule-selectable benchmark generator that automatically produces RTL code-completion tasks from a set of input hardware description sources. RuC uses the target Hardware Description Language (HDL) grammar to mask syntactically defined code regions and prompts a model to regenerate them using the surrounding unmasked code as context, enabling a controlled and scalable evaluation of the domain-specific model's code-understanding capabilities, ranging from assignments to the reconstruction of entire logic blocks. We use RuC to generate two SystemVerilog rule-completion benchmarks from the Tiny Tapeout shuttle TT07 and the CVE2 RISC-V core to demonstrate RuC's applicability to a broad range of designs, and conduct a comparative study of the code completion capabilities of modern open-source LLMs across diverse settings. Results indicate that completion performance strongly depends on the model type, the grammatical structure of the masked region, and the prompting strategy. Specifically, the highest scores are obtained with Fill-in-the-Middle (FIM) prompting. These findings highlight the value of grammar-driven, arbitrarily granular benchmarks for meaningful evaluation of LLM capabilities in RTL development workflows.
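
The masking-and-regeneration idea in the abstract can be illustrated with a minimal sketch. It is not the RuC implementation: a regular expression stands in for the grammar-driven rule selection, and the sentinel strings, the function name `make_fim_task`, and the tiny SystemVerilog module are all assumptions made for this example. Real Fill-in-the-Middle tokens differ between model families.

```python
import re

# Hypothetical sentinel strings; actual FIM tokens depend on the model.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

# Stand-in for a grammar rule: a regex matching one continuous assignment.
# A real generator would select regions from a parse tree instead.
ASSIGN_RULE = re.compile(r"assign\s+\w+\s*=\s*[^;]+;")


def make_fim_task(source, rule=ASSIGN_RULE):
    """Mask the first region matching `rule`; return (prompt, reference)."""
    match = rule.search(source)
    if match is None:
        raise ValueError("no region matches the selected rule")
    prefix = source[: match.start()]   # unmasked code before the region
    suffix = source[match.end():]      # unmasked code after the region
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return prompt, match.group(0)


# Made-up module used only to exercise the function.
RTL = """module and_gate(input logic a, input logic b, output logic y);
  assign y = a & b;
endmodule
"""

prompt, reference = make_fim_task(RTL)
```

The model's completion for `prompt` would then be compared against `reference`. Swapping `ASSIGN_RULE` for a pattern that matches a larger construct changes the granularity of the task, which is the kind of control the abstract attributes to rule selection.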


2604.27775 2026-05-01 cond-mat.mtrl-sci cs.LG

Data-Efficient Indentation Size Effect Correction in Steels Using Machine Learning and Physics-Guided Augmentation

Radmir Karamov, Tagir Karamov

Comments Preprint, 19 pages, 8 figures, 4 tables

2604.27747 2026-05-01 cs.IR cs.AI

Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation

Jiaju Chen, Chongming Gao, Chenxiao Fan, Haoyan Liu, Qingpeng Cai, Peng Jiang, Xiangnan He

2604.27743 2026-05-01 cs.IT cs.AI cs.LG math.IT

Why Self-Supervised Encoders Want to Be Normal

Yuval Domb

2604.27738 2026-05-01 cond-mat.dis-nn cond-mat.stat-mech cs.LG hep-lat

Sampling two-dimensional spin systems with transformers

Piotr Białas, Piotr Korcyl, Tomasz Stebel, Adam Stefański, Dawid Zapolski

Comments 15 pages, 7 figures

2604.27004 2026-05-01 cs.NE cs.LG eess.SP

EdgeSpike: Spiking Neural Networks for Low-Power Autonomous Sensing in Edge IoT Architectures

Gustav Olaf Yunus Laitinen-Fredriksson Lundstrom-Imanov, Taner Yilmaz

Comments 9 pages, 6 figures, 10 tables. Submitted to IEEE Internet of Things Journal

2604.25326 2026-05-01 cs.AR cs.AI

AHASD: Asynchronous Heterogeneous Architecture for LLM Adaptive Drafting Speculative Decoding on Mobile Devices

Ma Zirui, Fan Zhihua, Li Wenxing, Wu Haibin, Zhang Fulin, Ye Xiaochun, Li Wenming

Comments 7 pages, 9 figures, accepted by DAC 2026, repo: https://github.com/MAdrid1011/AHASD

2604.17460 2026-05-01 cs.CY cs.AI cs.HC cs.SE

Agentic Education: Using Claude Code to Teach Claude Code

Zain Naboulsi

Comments 27 pages, 5 figures, 7 tables. v2: added discussion of the GenAI adoption gap (MIT NANDA 2025) and a future-work direction on affect-aware adaptation; no changes to the system, evaluation, or core contributions. Code: https://github.com/zainnab-sparq/cc-self-train

2604.16399 2026-05-01 cs.SE cs.AI

IACDM: Interactive Adversarial Convergence Development Methodology -- A Structured Framework for AI-Assisted Software Development

Jasmine Moreira

Comments 14 pages, 6 tables. Technical Foundation Document. Repository: https://github.com/jasminemoreira/Versus . VSCode extensions available at VS Marketplace (JasmineMoreira.versus-claude, JasmineMoreira.versus-copilot)

2604.13721 2026-05-01 cs.IR cs.AI

FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History

Santiago Paramés-Estévez, Nicolás Filloy-Montesino, Jorge Fernández-Fabeiro, José Carlos Mouriño-Gallego

Comments 6 pages, 2 figures, a Spanish version of this paper has been accepted at Jornadas SARTECO 2026. Code available at https://github.com/s-parames/fragata

2603.20965 2026-05-01 q-fin.TR cs.AI cs.MA q-fin.CP q-fin.ST

Learning to Aggregate Zero-Shot LLM Agents for Corporate Disclosure Classification

Kemal Kirtac

2603.18239 2026-05-01 q-bio.QM cs.CL cs.LG

Impact of automatic speech recognition quality on Alzheimer's disease detection from spontaneous speech: a reproducible benchmark study with lexical modeling and statistical validation

Himadri S Samanta

Comments 22 pages, 7 figures

2603.17765 2026-05-01 q-bio.QM cs.AI cs.CV

Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search

Himadri S Samanta

Comments 15 pages, 4 figures, 3 tables

2602.11897 2026-05-01 cs.CR cs.AI

Agentic AI for Cybersecurity: A Meta-Cognitive Architecture for Governable Autonomy

Andrei Kojukhov, Arkady Bovshover