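arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.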

2604.11101 2026-05-12 math.CO cs.LG

Generating Hadamard matrices with transformers

Geordie Williamson, Oded Yacobi, Paul Zinn-Justin

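AI summary: This paper proposes a new method that combines transformer neural networks with local search to construct Hadamard matrices, aimed in particular at sparse combinatorial search problems of Goethals–Seidel type. The method generates large numbers of inequivalent Hadamard matrices of orders between 100 and 200, outperforms conventional local search at higher orders, and succeeds in constructing a Hadamard matrix of order 252. The experiments also indicate that the transformer can discover and exploit hidden symmetries in the search space, suggesting a new approach to combinatorial optimization.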
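For reference, a Hadamard matrix of order n is an n×n matrix with entries ±1 whose rows are pairwise orthogonal, so that H Hᵀ = nI. The sketch below checks that property on matrices built with the classical Sylvester doubling construction. It is only an illustration of the definition: it is not the paper's transformer or Goethals–Seidel search, and the function names are hypothetical.

```python
def is_hadamard(h):
    """Return True if h (a list of rows) is a square +/-1 matrix with H H^T = n I."""
    n = len(h)
    if any(len(row) != n for row in h):
        return False
    if any(x not in (1, -1) for row in h for x in row):
        return False
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(h[i], h[j]))
            # Diagonal entries of H H^T must equal n, off-diagonal entries 0.
            if dot != (n if i == j else 0):
                return False
    return True


def sylvester(k):
    """Order-2**k Hadamard matrix via Sylvester doubling (illustrative only)."""
    h = [[1]]
    for _ in range(k):
        # New matrix is the block matrix [[H, H], [H, -H]].
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


print(is_hadamard(sylvester(3)))
```

Sylvester doubling only reaches orders that are powers of two, which is why orders such as 252 require other constructions and search.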

2604.07780 2026-05-12 eess.IV cs.CV

MonoUNet: A Robust Tiny Neural Network for Automated Knee Cartilage Segmentation on Point-of-Care Ultrasound Devices

Alvin Kimbowa, Arjun Parmar, Ibrahim Mujtaba, Will Wei, Maziar Badii, Matthew Harkey, David Liu, Ilker Hacihaliloglu

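AI summary: This study proposes MonoUNet, a lightweight deep learning model for automated knee cartilage segmentation on portable ultrasound devices. The model introduces a trainable monogenic block that extracts multi-scale local phase features and a gating mechanism that improves robustness to variations in ultrasound image appearance, while substantially reducing parameter count and computational cost. On a multi-site, multi-device dataset, MonoUNet achieves average Dice scores of 92.62% to 94.82% and shows strong agreement and reliability against manual measurements.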

Comments 17 pages, 4 figures. Published in Ultrasound in Medicine & Biology (2026)

Details

DOI: 10.1016/j.ultrasmedbio.2026.04.011
Journal ref: Ultrasound in Medicine & Biology, 2026, ISSN 0301-5629

Abstract

Objective: To develop a robust and compact deep learning model for automated knee cartilage segmentation on point-of-care ultrasound (POCUS) devices. Methods: We propose MonoUNet, a novel, highly compact segmentation model consisting of (i) an aggressively reduced U-Net backbone, (ii) a trainable monogenic block that extracts multi-scale local phase features from the input, and (iii) a gating mechanism that injects these features into the encoder stages to reduce sensitivity to variations in ultrasound image appearance. MonoUNet segmentation performance was evaluated on a multi-site, multi-device knee cartilage ultrasound dataset using Dice score and mean average surface distance (MASD). Agreement between MonoUNet and manual cartilage outcomes (thickness and echo intensity) was assessed using Bland-Altman analysis with 95% limits of agreement, and reliability was assessed using intraclass correlation coefficient (ICC$_{2,k}$). Results: Overall, MonoUNet outperformed existing lightweight segmentation models, with average Dice scores ranging from 92.62% to 94.82% and MASD values between 0.133 mm and 0.254 mm. MonoUNet reduces the number of parameters by 10x--700x and computational cost by 14x--2000x relative to existing lightweight models. MonoUNet cartilage outcomes showed excellent reliability and agreement with the manual outcomes: ICC$_{2,k}$=0.96 and bias=2.00% (0.047 mm) for average thickness, and ICC$_{2,k}$=0.99 and bias=0.80% (0.328 a.u.) for echo intensity. Conclusion: Incorporating trainable local phase features improves the robustness of highly compact neural networks for knee cartilage segmentation across varying acquisition settings and could support scalable ultrasound-based assessment and monitoring of knee osteoarthritis using POCUS devices. The code is publicly available at https://github.com/alvinkimbowa/monounet.
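The Dice score used above measures overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions made here, not taken from the MonoUNet code.

```python
def dice_score(pred, target):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 4-pixel masks: one shared foreground pixel out of 2 + 1.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```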

2604.01527 2026-05-12 cs.SE cs.AI cs.LG

REAP: Automatic Curation of Coding Agent Benchmarks from Interactive Production Usage

Smriti Jha, Matteo Paltenghi, Chandra Maddila, Vijayaraghavan Murali, Shubham Ugare, Satish Chandra

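AI summary: This paper presents REAP (Relevance and Execution-Audited Pipeline), an automated pipeline that builds production-derived benchmarks for coding agents from real developer-agent sessions, without manual labeling. REAP uses LLM-based task classification, agentic test-relevance validation, and multi-run stability checks to address untestable prompts, misaligned tests, and flaky tests. The resulting benchmark, Harvest, spans several programming languages, and evaluation shows solve rates from 42.9% to 58.2% across five frontier models, differences that can inform deployment decisions.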

Details

Abstract

Production deployment of AI coding agents requires fast, reproducible evaluation signals. Existing industrial practices trade off speed and fidelity: online A/B testing takes weeks and risks user experience, shadow deployment yields signals that are not reproducible across runs, and public benchmarks diverge from production workloads in language distribution, prompt style, and codebase structure. This paper presents REAP (Relevance and Execution-Audited Pipeline), an automated curation pipeline that constructs production-derived benchmarks from real developer-agent sessions without manual labeling. Such curation, while in-distribution to production usage, runs into several challenges. Untestable prompts, misaligned tests, and test flakiness all compromise evaluation reliability. While tasks can be manually audited to ensure only high-quality tasks remain in the benchmark, this approach is infeasible in the monorepo setting: the build infrastructure state is often ephemeral in large monorepos and requires the benchmark to be continuously re-curated against the current codebase. As manual verification cannot be sustained at this cadence, REAP adds an automated verification layer using LLM-based task classification, agentic test-relevance validation, and multi-run stability checks to ensure the executable benchmark yields trustworthy signals. We use REAP to curate Harvest, a benchmark where each task feeds the coding agent a real developer prompt and verifies the resulting code change against fail-to-pass tests retrieved from production. Harvest's distribution spans more than four programming languages with a majority of tasks drawn from Hack. Model and harness evaluations reveal that solve rates range from 42.9% to 58.2% across five frontier models, surfacing capability differences that inform concrete deployment decisions.
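One way to picture the combination of fail-to-pass tests and multi-run stability checks is sketched below: a test is kept only if it fails on every run before the code change and passes on every run after it. The abstract does not give REAP's actual criteria or run counts, so the function name, the callables, and the default of five runs are all assumptions.

```python
from typing import Callable


def is_stable_fail_to_pass(
    run_before: Callable[[], bool],
    run_after: Callable[[], bool],
    runs: int = 5,  # assumed number of repetitions, not from the paper
) -> bool:
    """Keep a test only if it always fails before the change and always passes after."""
    before = [run_before() for _ in range(runs)]
    after = [run_after() for _ in range(runs)]
    # Any pass before the change, or any failure after it, marks the test
    # as either not fail-to-pass or flaky.
    return not any(before) and all(after)


# Hypothetical test runners: a deterministic one and a flaky one.
flaky_results = iter([True, False, True, True, True])
print(is_stable_fail_to_pass(lambda: False, lambda: True))
print(is_stable_fail_to_pass(lambda: False, lambda: next(flaky_results)))
```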

2603.14889 2026-05-12 eess.AS cs.CL cs.LG

SDiaReward: Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness

Jingyu Lu, Yuhan Wang, Fan Zhuo, Xize Cheng, Changhao Pan, Xueyi Pu, Yifu Chen, Chenyuhao Wen, Tianle Liang, Zhou Zhao

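AI summary: As end-to-end spoken dialogue systems advance, accurately modeling paralinguistic features and colloquial expression in dialogue has become a key problem. The authors propose SDiaReward, an end-to-end multi-turn reward model trained on the newly constructed SDiaReward-Dataset, which jointly assesses modality-specific features of speech and the degree of colloquialness. Under a unified evaluation framework, the model judges multi-turn dialogue quality efficiently and, on the newly established ESDR-Bench benchmark, clearly outperforms general-purpose audio LLMs.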

Comments Accepted to ACL 2026 Main Conference

2603.12800 2026-05-12 eess.IV cs.CV

GLEAM: A Multimodal Imaging Dataset and HAMM for Glaucoma Classification

Jiao Wang, Chi Liu, Yiying Zhang, Hongchen Luo, Zhifen Guo, Ying Hu, Ke Xu, Jing Zhou, Hongyan Xu, Ruiting Zhou, Man Tang

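AI summary: This paper introduces GLEAM, a public glaucoma dataset with three imaging modalities: scanning laser fundus images, circumpapillary OCT images, and visual field pattern deviation maps, annotated with four disease stages to support diagnosis that draws on all modalities. To integrate cross-modal information, the authors propose hierarchical attentive masked modeling (HAMM), which uses a hierarchical attention encoder and a lightweight decoder to focus on cross-modal representation learning and improve glaucoma classification accuracy.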

2603.03759 2026-05-12 cs.MA cs.AI cs.LG cs.SY eess.SY math.OC

Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling

Emile Anand, Ishani Karmarkar

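AI summary: This paper studies a cooperative Markov game under communication constraints, in which a global agent coordinates with a large number of local agents but can observe the states of only a subset of them at each time step. The authors propose an alternating learning framework, ALTERNATING-MARL, that updates the global and local policies jointly through subsampled mean-field Q-learning and optimization over an induced MDP. The analysis shows convergence to an approximate Nash equilibrium with a sample complexity that separates the dependence on the state-action space; experiments on multi-robot control tasks support the results.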

Comments 57 pages, 10 figures, 4 tables

2602.22551 2026-05-12 math.OC cs.LG

Identifying Multi-Hit Cancer Drivers Without Massive Parallelization: A CP, MIP, and Column Generation Framework

Rick S. H. Willemsen, Tenindra Abeywickrama, Ramu Anandakrishnan

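AI summary: This study aims to identify combinations of multiple gene mutations that drive cancer, formalized as the Multi-Hit Cancer Driver Set Cover Problem (MHCDSCP): maximize coverage of tumor samples while minimizing misclassification of normal samples. Rather than relying on massive parallelization as prior approaches do, the authors propose fast heuristics based on constraint programming and mixed-integer programming, combined with column generation, that run quickly on an ordinary CPU. On real cancer genomics data the framework performs well and solves some instances to proven optimality, indicating that the problem is computationally far less demanding than previously believed and providing a workable baseline for further research on multi-hit models.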

2602.01861 2026-05-12 eess.AS cs.LG

RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses

Shaoheng Xu, Chunyi Sun, Jihui Zhang, Prasanga N. Samarasinghe, Thushara D. Abhayapala

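AI summary: This paper proposes RIR-Former, a model for reconstructing room impulse responses (RIRs) from sparse measurements. It is built on the Transformer architecture and adds a sinusoidal encoding module to make use of microphone position information, enabling interpolation at arbitrary array positions. A segmented multi-branch decoder handles early reflections and late reverberation separately, improving overall reconstruction. In experiments, RIR-Former outperforms existing methods across a range of simulated acoustic environments.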

Comments Published in ICASSP 2026. Code: https://github.com/ShaoHenry/RIR-Former . Equal contribution: Shaoheng Xu and Chunyi Sun

2601.23252 2026-05-12 stat.CO cs.LG stat.ML

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

David Yallup, Namu Kroupa, Will Handley

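AI summary: This paper proposes Nested Slice Sampling (NSS), an algorithm intended to improve the scalability and computational efficiency of nested sampling on GPUs. It uses hit-and-run slice sampling to vectorize the constrained updates and gives a simple, near-optimal rule for setting the slice width, improving performance in high dimensions and making parallel execution more predictable. In experiments on challenging model comparison, high-dimensional Bayesian inference, and Gaussian process hyperparameter marginalization, NSS maintains accurate evidence estimates and high-quality posterior samples, and is more robust than existing methods on multimodal problems in particular.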

Comments 58 pages, 13 figures, Accepted to Transactions on Machine Learning Research

2601.19585 2026-05-12 cs.IR cs.AI

LLM-Enhanced Reinforcement Learning for Long-Term User Satisfaction in Interactive Recommendation

Chongjun Xia, Yanchun Peng, Xianzhi Wang

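AI summary: Interactive recommender systems adapt dynamically to user feedback, but they often over-cater to short-term preferences, which leads to content homogeneity and filter bubbles. To improve long-term user satisfaction, this paper proposes LERL, a hierarchical recommendation framework that combines the semantic planning ability of large language models (LLMs) with the fine-grained adaptability of reinforcement learning (RL). A high-level LLM selects semantically diverse categories and a low-level RL policy recommends personalized items within the selected semantic space, which narrows the action space, improves planning efficiency, and reduces redundant content. Experiments on real-world datasets show that LERL significantly outperforms state-of-the-art methods on long-term user satisfaction.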

Details

DOI: 10.1007/978-981-92-0363-5_28
Journal ref: In: Jung, H., Wang, T., Toyoda, M., Kwon, HY., Lee, Jw. (eds) Database Systems for Advanced Applications. DASFAA 2026. Lecture Notes in Computer Science, vol 16535. Springer, Singapore

Abstract

Interactive recommender systems can dynamically adapt to user feedback, but often suffer from content homogeneity and filter bubble effects due to overfitting short-term user preferences. While recent efforts aim to improve content diversity, they predominantly operate in static or one-shot settings, neglecting the long-term evolution of user interests. Reinforcement learning provides a principled framework for optimizing long-term user satisfaction by modeling sequential decision-making processes. However, its application in recommendation is hindered by sparse, long-tailed user-item interactions and limited semantic planning capabilities. In this work, we propose LLM-Enhanced Reinforcement Learning (LERL), a novel hierarchical recommendation framework that integrates the semantic planning power of LLM with the fine-grained adaptability of RL. LERL consists of a high-level LLM-based planner that selects semantically diverse content categories, and a low-level RL policy that recommends personalized items within the selected semantic space. This hierarchical design narrows the action space, enhances planning efficiency, and mitigates overexposure to redundant content. Extensive experiments on real-world datasets demonstrate that LERL significantly improves long-term user satisfaction when compared with state-of-the-art baselines. The implementation of LERL is available at https://github.com/1163710212/LERL.

2601.12248 2026-05-12 eess.AS cs.AI cs.CL cs.LG cs.SD

AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

Chun-Yi Kuan, Hung-yi Lee

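AI summary: AQUA-Bench is a new benchmark for evaluating how well models recognize unanswerable questions in audio question answering, a case that existing benchmarks largely overlook. It systematically evaluates three scenarios: the correct answer is missing, the answers do not match the question's category, and the question is unrelated to the audio content, giving a fuller picture of model reliability and robustness. Experiments show that current models perform well on answerable tasks but still struggle considerably with unanswerable ones, revealing a blind spot in current audio-language understanding.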

Comments Accepted to ICASSP 2026 (Oral). Project Website: https://github.com/kuan2jiu99/aqua-bench

2601.02366 2026-05-12 cs.IR cs.AI

TextBridgeGNN: Pre-training Graph Neural Network for Cross-Domain Recommendation via Text-Guided Transfer

Yiwen Chen, Yiqing Wu, Huishi Luo, Fuzhen Zhuang, Deqing Wang, Zhao Zhang

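AI summary: This paper proposes TextBridgeGNN, a pre-training framework for graph neural networks that addresses the difficulty of transferring conventional ID-embedding-based graph recommendation models across domains. Using text as a semantic bridge, it connects heterogeneous interaction graphs from different domains through multi-level graph propagation, learning both domain-specific and general knowledge during pre-training, and transfers ID embeddings across domains during fine-tuning through a similarity transfer mechanism. Experiments show that TextBridgeGNN outperforms existing methods in cross-domain, multi-domain, and training-free settings, combining the semantics of pre-trained language models with graph-based collaborative filtering.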

2512.21587 2026-05-12 physics.optics cond-mat.dis-nn cs.LG math-ph math.MP physics.app-ph

Incorporating rank-free coupling and external field via an incoherent modulated spatial photonic Ising machine

Ze Zheng, Yuegang Li, Hang Xu, Jingzheng Huang, Tailong Xiao, Guihua Zeng

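AI summary: This study proposes a rank-free-coupling spatial photonic Ising machine based on amplitude modulation that can directly program fully connected Ising models with external fields, avoiding the efficiency or scale limits of conventional diffraction-based schemes that need auxiliary spins or multiplexing. The device encodes an arbitrary Ising Hamiltonian through the Hadamard product of an amplitude modulator and a binary spatial modulator and reads out the result with a single-pixel intensity measurement. It achieves high-precision, efficient optimization and is applied to sparse problems and to max-cut on complex networks, showing the potential of photonic Ising machines to programmably simulate physical models beyond low-rank coupling.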
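For orientation, the textbook form of a fully connected Ising Hamiltonian with an external field is written below. The sign and normalization conventions are a common choice assumed here and may differ from those used in the paper.

$$
H(\mathbf{s}) = -\sum_{i<j} J_{ij}\, s_i s_j \;-\; \sum_{i} h_i\, s_i, \qquad s_i \in \{-1, +1\}
$$

Here $J_{ij}$ is the coupling between spins $i$ and $j$ and $h_i$ is the external field on spin $i$. "Rank-free" coupling means the matrix $J$ is not restricted to a low-rank form.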

Comments 15 pages, 6 figures

2512.02066 2026-05-12 quant-ph cs.AI cs.LG eess.IV

Parallel Multi-Circuit Quantum Feature Fusion in Hybrid Quantum-Classical Convolutional Neural Networks for Breast Tumor Classification

Ece Yurtseven

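AI summary: This paper presents a hybrid quantum-classical convolutional neural network (QCNN) architecture for binary classification of breast tumor images on the standard BreastMNIST dataset. It combines classical convolutional feature extraction with two distinct quantum circuits, one using amplitude encoding and one using angle encoding with circular entanglement, both on four qubits. The quantum feature embeddings are fused with the classical features into a joint feature space that a fully connected classifier then processes. Across five runs, the hybrid QCNN is reported to achieve statistically significantly higher classification accuracy than a parameter-matched classical CNN, with a large effect size.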

Comments Accepted to QCNC 2026

Details

DOI: 10.1109/QCNC69040.2026.00172

Abstract

Quantum machine learning has emerged as a promising approach to improve feature extraction and classification tasks in high-dimensional data domains such as medical imaging. In this work, we present a hybrid Quantum-Classical Convolutional Neural Network (QCNN) architecture designed for the binary classification of the BreastMNIST dataset, a standardized benchmark for distinguishing between benign and malignant breast tumors. Our architecture integrates classical convolutional feature extraction with two distinct quantum circuits: an amplitude-encoding variational quantum circuit (VQC) and an angle-encoding VQC circuit with circular entanglement, both implemented on four qubits. These circuits generate quantum feature embeddings that are fused with classical features to form a joint feature space, which is subsequently processed by a fully connected classifier. To ensure fairness, the hybrid QCNN is parameter-matched against a baseline classical CNN, allowing us to isolate the contribution of quantum layers. Both models are trained under identical conditions using the Adam optimizer and binary cross-entropy loss. Experimental evaluation in five independent runs demonstrates that the hybrid QCNN achieves statistically significant improvements in classification accuracy compared to the classical CNN, as validated by a one-sided Wilcoxon signed rank test (p = 0.03125) and supported by large effect size of Cohen's d = 2.14. Our results indicate that hybrid QCNN architectures can leverage entanglement and quantum feature fusion to enhance medical image classification tasks. This work establishes a statistical validation framework for assessing hybrid quantum models in biomedical applications and highlights pathways for scaling to larger datasets and deployment on near-term quantum hardware.
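The reported p = 0.03125 equals 1/2^5, the smallest value an exact one-sided Wilcoxon signed-rank test can produce with five untied, nonzero paired differences; it occurs when all five differences share the same sign. The sketch below enumerates the null distribution to show this. The accuracy differences are made-up placeholders, not values from the paper, and the assumption that the test was computed exactly on five pairs is ours.

```python
from itertools import product


def exact_one_sided_p(diffs):
    """Exact P(W+ >= observed) under H0, assuming no zero or tied |differences|."""
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    observed = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 each difference is positive or negative with probability 1/2,
    # so every sign pattern is equally likely.
    count = 0
    for signs in product((0, 1), repeat=len(diffs)):
        w_plus = sum(r for r, s in zip(ranks, signs) if s)
        if w_plus >= observed:
            count += 1
    return count / 2 ** len(diffs)


# Hypothetical paired accuracy differences (hybrid minus classical), all positive.
print(exact_one_sided_p([0.01, 0.02, 0.03, 0.04, 0.05]))
```

With five runs, the test therefore cannot yield a one-sided p-value below 0.03125, however large the differences are.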

2512.00175 2026-05-12 stat.ME cs.LG stat.ML

Comparing Two Proxy Methods for Causal Identification

Helen Guo, Elizabeth L. Ogburn, Ilya Shpitser

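AI summary: This paper compares two proxy-variable methods for causal identification: the bridge equation method and the array decomposition method. The former recovers the causal target by solving integral equations; the latter identifies latent factors through an eigendecomposition task in order to estimate counterfactual effects. The authors analyze the model restrictions and assumptions of each method and delineate where each applies, offering guidance for identifying causal effects.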

Comments 10 pages; 5 figures

2510.23074 2026-05-12 cs.CR cs.CL

Fast-MIA: Efficient and Scalable Membership Inference for LLMs

Hiromu Takahashi, Shotaro Ishihara

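AI summary: This paper presents Fast-MIA, a Python library for efficiently evaluating membership inference attacks (MIAs) on large language models (LLMs). To address the high computational cost of existing methods and their repeated recomputation of intermediate results, Fast-MIA uses batched inference and cross-method caching, which substantially speeds up evaluation. The library integrates several representative MIA methods, supports flexible configuration and mainstream benchmarks, and aims to promote scalable, reproducible research on privacy risks.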

Comments ACL 2026 System Demonstrations

2510.20129 2026-05-12 cs.CR cs.AI

SAID: Safety-Aware Intent Defense via Prefix Probing for Large Language Models

Yulong Chen, Qi Zhang, Jiawen Zhang, Yadong Liu, Mu Li, Jie Wen, Yong Xu

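AI summary: This paper proposes SAID, a safety-aware intent defense framework that strengthens large language models against jailbreak attacks. It probes the safety of the user's input at the intent level before decoding, requires no changes to model parameters or the decoding process, and is therefore compatible with black-box models. Experiments show that SAID performs well against a variety of jailbreak attacks, reducing harmful responses while preserving utility on benign tasks.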

Comments 12 pages, 5 figures. V2: Updated title, author list, and extensive experiments; expanded background on LLM security applications

2510.19414 2026-05-12 eess.AS cs.AI cs.SD

EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection

Tong Zhang, Yihuan Huang, Yanzhen Ren

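AI summary: With the spread of speech deepfake technology, security problems such as phone fraud and identity theft are becoming more serious in real-world settings. Existing anti-spoofing systems perform well on lab-generated synthetic speech, but their performance drops markedly under physical replay attacks. This paper introduces EchoFake, a dataset of more than 120 hours of speech from over 13,000 speakers, covering speech from advanced zero-shot text-to-speech systems and physical replay recordings made with a variety of devices in real environments; it improves the generalization and practical performance of speech deepfake detection models.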

Comments ICASSP 2026

2510.19407 2026-05-12 math.OC cs.RO

A Radius of Robust Feasibility Approach to Directional Sensors in Uncertain Terrain

Vanshika Datta, C. Nahak

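AI summary: This paper studies the radius of robust feasibility for directional sensor networks in uncertain terrain and proposes a new approach that combines this radius with a distributed greedy algorithm to improve network coverage. It gives an exact formula for the radius of robust feasibility in directional sensor networks and strengthens robustness under uncertainty by strategically adjusting sensor orientations. Experimental results support the approach's effectiveness in maximizing coverage and optimizing sensor orientation.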

2510.07136 2026-05-12 cs.IT cs.CR cs.LG cs.SI math.IT

Differentially Private Spectral Graph Clustering: Balancing Privacy, Accuracy, and Efficiency

Antti Koskela, Mohamed Seif, H. Vincent Poor, Andrea J. Goldsmith

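AI summary: This paper studies spectral graph clustering under edge differential privacy and proposes a matrix shuffling mechanism that combines randomized edge flipping with a random permutation of the adjacency matrix to strengthen the privacy guarantee. Using a unified error analysis framework, it derives misclassification rates for the different mechanisms in terms of the privacy budget, the eigenvalue gap, and the number of communities, and proves that the error rate of the proposed method decays as $\tilde{O}(1/n)$ in the number of nodes, better than conventional private PCA approaches. It also proposes a private spectral-gap detection algorithm for estimating the number of communities, and experiments corroborate the theoretical results.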
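Randomized edge flipping is commonly implemented as randomized response: each potential edge is flipped independently with probability 1/(1 + e^ε), which gives ε-edge differential privacy. The sketch below shows that standard mechanism on an undirected adjacency matrix. Whether the paper uses exactly this flip probability is an assumption, the shuffling step is not shown, and the function names are hypothetical.

```python
import math
import random


def flip_probability(eps):
    """Randomized-response flip probability 1 / (1 + e^eps)."""
    return 1.0 / (1.0 + math.exp(eps))


def randomized_edge_flip(adj, eps, rng=None):
    """Flip each undirected edge indicator independently; the diagonal is untouched."""
    rng = rng or random.Random()
    n = len(adj)
    p = flip_probability(eps)
    out = [row[:] for row in adj]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                # Mirror the flip so the released matrix stays symmetric.
                out[i][j] = out[j][i] = 1 - adj[i][j]
    return out


# Hypothetical 3-node path graph released with privacy budget eps = 1.0.
graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(randomized_edge_flip(graph, 1.0, random.Random(0)))
```

A smaller ε pushes the flip probability toward 1/2, which strengthens privacy and degrades the spectrum that clustering relies on.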

2509.20799 2026-05-12 cs.HC cs.SD

AuthGlass: Benchmarking Voice Liveness Detection and Authentication on Smart Glasses via Comprehensive Acoustic Features

Weiye Xu, Zhang Jiang, Siqi Zheng, Xiyuxing Zhang, Changhao Zhang, Jian Liu, Weiqiang Wang, Yuntao Wang

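AI summary: With the rapid development of smart glasses, voice interaction is widely used for its naturalness and convenience, but it is exposed to spoofing attacks in practice, and no public dataset exists for voice liveness detection and authentication in the smart-glasses setting. The authors collected a multimodal acoustic dataset containing 16-channel audio from 42 subjects along with two types of attack samples, and propose AuthG-Live, a sound-field-based liveness detection method, and AuthG-Net, a multimodal authentication model. Experiments show state-of-the-art results on four benchmark tasks, ablation studies examine generalization to real-world scenarios, and the dataset is released under the name AuthGlass.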

Comments Submitted to IMWUT 2026

2509.18484 2026-05-12 stat.ML cs.LG

Estimating Heterogeneous Causal Effect on Networks via Orthogonal Learning

Yuanchen Wu, Yubai Yuan

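AI summary: This paper studies the estimation of heterogeneous causal effects on network data, where a treatment affects not only the treated node but may also spill over to its neighbors, and where effects may differ across nodes and edges. The authors propose a two-stage orthogonal learning framework: the first stage uses graph neural networks to estimate nuisance components that depend on covariates and network structure, and the second stage uses an interpretable attention-based model to estimate direct and spillover effects, yielding edge-level, node-level, and population-level estimates. Orthogonalization and cross-fitting reduce sensitivity to first-stage estimation error, and the bootstrap is used for uncertainty quantification; experiments show advantages in heterogeneous effect estimation and downstream interpretable analysis.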

2505.18184 2026-05-12 eess.SP cs.CV

AI-Enhanced Stethoscope in Remote Diagnostics for Cardiopulmonary Diseases

Hania Ghouse, Juveria Tanveen, Abdul Muqtadir Ahmed, Uma N. Dulhare

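AI summary: Addressing the growing global difficulty of diagnosing cardiovascular and pulmonary diseases, this paper proposes a low-cost, AI-enhanced stethoscope system for remote diagnosis of cardiopulmonary conditions. It extracts MFCC features from auscultation sounds and classifies them with a hybrid CNN and GRU model covering six pulmonary and five cardiovascular conditions. The system can be deployed on low-cost embedded devices in resource-limited regions to provide real-time diagnostic support.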

2505.07349 2026-05-12 eess.IV cs.CV

Multi-Plane Vision Transformer for Hemorrhage Classification Using Axial and Sagittal MRI Data

Badhan Kumar Das, Gengyan Zhao, Boris Mailhe, Thomas J. Re, Dorin Comaniciu, Eli Gibson, Andreas Maier

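AI summary: This paper proposes a Multi-Plane Vision Transformer (MP-ViT) for brain hemorrhage classification, aimed at the information loss that arises when MRI data with different orientations (such as axial and sagittal) are used for hemorrhage detection. It uses two separate Transformer encoders for the different orientations, fuses them through cross-attention, and adds a modality indication vector to compensate for missing contrast information. On a clinical dataset with 10,084 training samples, MP-ViT improves AUC by 5.5% over a conventional ViT and by 1.8% over a CNN model.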

Comments 10 pages

2504.21015 2026-05-12 cs.IR cs.CL

Don't Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval

Aarush Sinha

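AI summary: This paper investigates whether large language models (LLMs) can directly generate synthetic hard negatives to replace the full-corpus mining that dense retrieval training usually relies on. The author generates synthetic negatives with four state-of-the-art LLMs of different sizes, fine-tunes DistilBERT on them, and evaluates on 10 BEIR benchmark datasets. Compared with conventional corpus-based negative mining (such as BM25 and Cross-Encoder), the generated negatives perform worse, and scaling up the generator does not necessarily improve retrieval: the 14B-parameter model outperforms the 30B-parameter model.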

2504.19451 2026-05-12 math.NT cs.AI

Artificial Intelligence in Number Theory: LLMs for Algorithm Generation and Ensemble Methods for Conjecture Verification

Ali Saraeb

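AI summary: This paper explores two concrete applications of artificial intelligence in number theory. The first part evaluates the open-source large language model Qwen2.5-Math-7B-Instruct on algorithmic number theory tasks; given non-spoiling hints, it reaches an accuracy of 0.95 or higher on thirty algorithmic problems and thirty computational problems. The second part builds a LightGBM classifier on statistical features to empirically test a number-theoretic conjecture relating zeros of Dirichlet L-functions to their moduli, reaching test accuracy above 93.9% for small moduli.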

2504.16093 2026-05-12 q-fin.PM cs.AI math.PR

Efficient Portfolio Selection through Preference Aggregation with Quicksort and the Bradley--Terry Model

Yurun Ge, Lucas Böttcher, Tom Chou, Maria R. D'Orsogna

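AI summary: This paper studies how to efficiently select the best portfolio of projects under uncertainty and proposes preference aggregation methods based on Quicksort and the Bradley-Terry model. The uncertain long-term benefits of projects are converted into pairwise "win" probabilities, which are aggregated across the evaluations of multiple agents and used to rank the projects. Several of the proposed methods outperform the most effective existing aggregation methods, and sampling can substantially reduce the number of pairwise comparisons required.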

Comments 15pp, 4 figs

Details

DOI: 10.1016/j.jocs.2025.102728
Journal ref: J. Comput. Sci. 92, 102728 (2025)

Abstract

How to allocate limited resources to projects that will yield the greatest long-term benefits is a problem that often arises in decision-making under uncertainty. For example, organizations may need to evaluate and select innovation projects with risky returns. Similarly, when allocating resources to research projects, funding agencies are tasked with identifying the most promising proposals based on idiosyncratic criteria. Finally, in participatory budgeting, a local community may need to select a subset of public projects to fund. Regardless of context, agents must estimate the uncertain values of a potentially large number of projects. Developing parsimonious methods to compare these projects, and aggregating agent evaluations so that the overall benefit is maximized, are critical in assembling the best project portfolio. Unlike in standard sorting algorithms, evaluating projects on the basis of uncertain long-term benefits introduces additional complexities. We propose comparison rules based on Quicksort and the Bradley--Terry model, which connects rankings to pairwise "win" probabilities. In our model, each agent determines win probabilities of a pair of projects based on his or her specific evaluation of the projects' long-term benefit. The win probabilities are then appropriately aggregated and used to rank projects. Several of the methods we propose perform better than the two most effective aggregation methods currently available. Additionally, our methods can be combined with sampling techniques to significantly reduce the number of pairwise comparisons. We also discuss how the Bradley--Terry portfolio selection approach can be implemented in practice.
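In the Bradley-Terry model, an item with positive strength w_i beats an item with strength w_j with probability w_i / (w_i + w_j). The sketch below turns each agent's project valuations into such win probabilities and ranks projects by their mean win probability across agents and opponents. That averaging rule is only one simple aggregation chosen for illustration; the abstract does not specify the paper's aggregation rules, and all names and numbers here are hypothetical.

```python
def bt_win_probability(value_i, value_j):
    """Bradley-Terry probability that item i beats item j (positive strengths)."""
    return value_i / (value_i + value_j)


def rank_by_mean_win_probability(agent_values):
    """Rank items, best first, by win probability averaged over agents and opponents.

    agent_values[a][i] is agent a's positive valuation of item i (at least 2 items).
    The averaging rule is an illustrative assumption, not the paper's method.
    """
    n_items = len(agent_values[0])
    scores = []
    for i in range(n_items):
        total = 0.0
        for values in agent_values:
            for j in range(n_items):
                if j != i:
                    total += bt_win_probability(values[i], values[j])
        scores.append(total / (len(agent_values) * (n_items - 1)))
    return sorted(range(n_items), key=lambda i: scores[i], reverse=True)


# Two hypothetical agents scoring three projects.
print(rank_by_mean_win_probability([[1.0, 2.0, 4.0], [1.0, 3.0, 5.0]]))
```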

2503.13558 2026-05-12 eess.SP cs.AI cs.LG

Survival Analysis with Machine Learning for Predicting Li-ion Battery Remaining Useful Life

Jingyuan Xue, Xiaozhen Zhao, Dongjing Jiang, Qingchong Jiao, Redouane EL Bouchtaoui, Jianfei Zhang

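AI summary: This paper studies machine learning methods for predicting the remaining useful life (RUL) of lithium-ion batteries. To address the limitations of conventional methods in handling nonlinear degradation patterns and quantifying uncertainty, it proposes a hybrid framework that incorporates survival data analysis. Battery voltage time series are converted into time-to-failure data using path signatures, and several survival models, including Cox-based models, DeepHit, and MTLR, are used to predict failure probabilities. On multiple public datasets, the approach achieves high time-dependent AUC and concordance index while keeping the integrated Brier score low.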

2502.09891 2026-05-12 cs.IR cs.AI

ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, Yuchi Ma

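AI summary: This paper proposes ArchRAG, a hierarchical retrieval-augmented generation method based on attributed communities, to address the limited retrieval accuracy and computational efficiency of existing graph-based RAG methods. It introduces attributed communities and an LLM-based hierarchical clustering technique to build a hierarchical index and streamline online retrieval, so that relevant information is retrieved from graph data more efficiently. Experiments show that ArchRAG outperforms existing methods in both accuracy and token consumption.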

Comments Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2026

2501.05614 2026-05-12 cs.CR cs.AI

Watermarking Graph Neural Networks via Explanations for Ownership Protection

Jane Downer, Yingdan Shi, Ziyan Liu, Ren Wang, Binghui Wang

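AI summary: This paper studies how to watermark graph neural networks (GNNs) through their explanations in order to protect intellectual property. The proposed explanation-based watermarking does not tamper with the training data, which avoids the ownership ambiguity and data poisoning problems of existing methods. The watermark makes the GNN's explanations statistically distinguishable, so an ownership claim must pass a statistical test; the authors prove that locating the watermark is NP-hard even with full knowledge of the watermarking method, and experiments show robustness to fine-tuning and pruning attacks.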