arXivDaily: arXiv daily academic digest, updated Monday to Friday
2604.11101 2026-05-12 math.CO cs.LG

Generating Hadamard matrices with transformers

Geordie Williamson, Oded Yacobi, Paul Zinn-Justin

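AI summary: This paper presents a new method for constructing Hadamard matrices that combines transformer neural networks with local search. It is designed for extremely sparse combinatorial search problems and is particularly effective for Hadamard matrices of Goethals–Seidel type. For orders between 100 and 200 it produces large numbers of inequivalent Hadamard matrices, and for larger orders it succeeds where local search from random initialisation fails; the largest matrix constructed has order 252. The experiments also show that the transformer can discover and exploit hidden symmetry in the search space, suggesting a new approach to combinatorial optimisation.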

Abstract

We present a new method for constructing Hadamard matrices that combines transformer neural networks with local search in the PatternBoost framework. Our approach is designed for extremely sparse combinatorial search problems and is particularly effective for Hadamard matrices of Goethals--Seidel type, where Fourier methods permit fast scoring and optimisation. For orders between 100 and 200, it produces large numbers of inequivalent Hadamard matrices, and for larger orders, it succeeds where local search from random initialisation fails. The largest example found by our method has order 252. In addition to these new constructions, our experiments reveal that the transformer can discover and exploit useful hidden symmetry in the search space.
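
As a reminder of what is being searched for, the sketch below checks the defining property of a Hadamard matrix: entries are ±1 and H Hᵀ = n I. The test matrix comes from the classical Sylvester doubling construction, which is used here only as a convenient source of valid examples; it is not the Goethals–Seidel construction or the PatternBoost search described in the abstract, and the function names are illustrative.

```python
def sylvester_hadamard(k):
    """Return the 2**k x 2**k Sylvester Hadamard matrix as a list of rows."""
    h = [[1]]
    for _ in range(k):
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def is_hadamard(h):
    """Check that h is square with +/-1 entries and that H H^T = n I."""
    n = len(h)
    if any(len(row) != n or any(x not in (1, -1) for x in row) for row in h):
        return False
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(h[i], h[j]))
            # Diagonal entries of H H^T must be n, off-diagonal entries 0.
            if dot != (n if i == j else 0):
                return False
    return True


if __name__ == "__main__":
    print(is_hadamard(sylvester_hadamard(3)))  # order-8 example
    print(is_hadamard([[1, 1], [1, 1]]))       # rows are not orthogonal
```

The pairwise row check costs O(n³) operations, which is fine for verifying a finished matrix. The abstract notes that the search itself relies on faster Fourier-based scoring.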

2604.07780 2026-05-12 eess.IV cs.CV

MonoUNet: A Robust Tiny Neural Network for Automated Knee Cartilage Segmentation on Point-of-Care Ultrasound Devices

Alvin Kimbowa, Arjun Parmar, Ibrahim Mujtaba, Will Wei, Maziar Badii, Matthew Harkey, David Liu, Ilker Hacihaliloglu

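AI summary: This study proposes MonoUNet, a lightweight deep learning model for automated knee cartilage segmentation on portable (point-of-care) ultrasound devices. The model adds a trainable monogenic block that extracts multi-scale local phase features and a gating mechanism that improves robustness to variations in ultrasound image appearance, while greatly reducing parameter count and computational cost. On a multi-site, multi-device dataset, MonoUNet achieves average Dice scores of 92.62% to 94.82% and shows high agreement and reliability against manual measurements.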

Comments 17 pages, 4 figures. Published in Ultrasound in Medicine & Biology (2026)

Journal ref
Ultrasound in Medicine & Biology, 2026, ISSN 0301-5629
Abstract

Objective: To develop a robust and compact deep learning model for automated knee cartilage segmentation on point-of-care ultrasound (POCUS) devices. Methods: We propose MonoUNet, a novel, highly compact segmentation model consisting of (i) an aggressively reduced U-Net backbone, (ii) a trainable monogenic block that extracts multi-scale local phase features from the input, and (iii) a gating mechanism that injects these features into the encoder stages to reduce sensitivity to variations in ultrasound image appearance. MonoUNet segmentation performance was evaluated on a multi-site, multi-device knee cartilage ultrasound dataset using Dice score and mean average surface distance (MASD). Agreement between MonoUNet and manual cartilage outcomes (thickness and echo intensity) was assessed using Bland-Altman analysis with 95% limits of agreement, and reliability was assessed using intraclass correlation coefficient (ICC$_{2,k}$). Results: Overall, MonoUNet outperformed existing lightweight segmentation models, with average Dice scores ranging from 92.62% to 94.82% and MASD values between 0.133 mm and 0.254 mm. MonoUNet reduces the number of parameters by 10x to 700x and computational cost by 14x to 2000x relative to existing lightweight models. MonoUNet cartilage outcomes showed excellent reliability and agreement with the manual outcomes: ICC$_{2,k}$ = 0.96 and bias = 2.00% (0.047 mm) for average thickness, and ICC$_{2,k}$ = 0.99 and bias = 0.80% (0.328 a.u.) for echo intensity. Conclusion: Incorporating trainable local phase features improves the robustness of highly compact neural networks for knee cartilage segmentation across varying acquisition settings and could support scalable ultrasound-based assessment and monitoring of knee osteoarthritis using POCUS devices. The code is publicly available at https://github.com/alvinkimbowa/monounet.
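
For readers unfamiliar with the headline metric, the following minimal sketch computes the Dice score between two binary masks, using the standard definition 2|A ∩ B| / (|A| + |B|). The masks and the function name are made up for illustration and are unrelated to the MonoUNet code base linked above.

```python
def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks given as nested lists of 0/1."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in truth for v in row]
    if len(flat_p) != len(flat_t):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for p, t in zip(flat_p, flat_t) if p and t)
    total = sum(1 for p in flat_p if p) + sum(1 for t in flat_t if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    manual = [[0, 1, 1],
              [0, 0, 0]]
    # 2 overlapping pixels, 3 predicted + 2 manual foreground pixels: 2*2/5
    print(dice_score(predicted, manual))
```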

2604.01527 2026-05-12 cs.SE cs.AI cs.LG

REAP: Automatic Curation of Coding Agent Benchmarks from Interactive Production Usage

Smriti Jha, Matteo Paltenghi, Chandra Maddila, Vijayaraghavan Murali, Shubham Ugare, Satish Chandra

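AI summary: This paper presents REAP (Relevance and Execution-Audited Pipeline), an automated pipeline that builds production-derived coding agent benchmarks from real developer-agent sessions without manual labeling. REAP uses LLM-based task classification, agentic test-relevance validation, and multi-run stability checks to address untestable prompts, misaligned tests, and test flakiness. The resulting Harvest benchmark covers tasks in several programming languages, and evaluation shows clear differences in solve rates among frontier models, which can inform deployment decisions.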

Abstract

Production deployment of AI coding agents requires fast, reproducible evaluation signals. Existing industrial practices trade off speed and fidelity: online A/B testing takes weeks and risks user experience, shadow deployment yields signals that are not reproducible across runs, and public benchmarks diverge from production workloads in language distribution, prompt style, and codebase structure. This paper presents REAP (Relevance and Execution-Audited Pipeline), an automated curation pipeline that constructs production-derived benchmarks from real developer-agent sessions without manual labeling. Such curation, while in-distribution to production usage, runs into several challenges. Untestable prompts, misaligned tests, and test flakiness all compromise evaluation reliability. While tasks can be manually audited to ensure only high-quality tasks remain in the benchmark, this approach is infeasible in the monorepo setting: the build infrastructure state is often ephemeral in large monorepos and requires the benchmark to be continuously re-curated against the current codebase. As manual verification cannot be sustained at this cadence, REAP adds an automated verification layer using LLM-based task classification, agentic test-relevance validation, and multi-run stability checks to ensure the executable benchmark yields trustworthy signals. We use REAP to curate Harvest, a benchmark where each task feeds the coding agent a real developer prompt and verifies the resulting code change against fail-to-pass tests retrieved from production. Harvest's distribution spans more than four programming languages with a majority of tasks drawn from Hack. Model and harness evaluations reveal that solve rates range from 42.9% to 58.2% across five frontier models, surfacing capability differences that inform concrete deployment decisions.

2603.14889 2026-05-12 eess.AS cs.CL cs.LG

SDiaReward: Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness

Jingyu Lu, Yuhan Wang, Fan Zhuo, Xize Cheng, Changhao Pan, Xueyi Pu, Yifu Chen, Chenyuhao Wen, Tianle Liang, Zhou Zhao

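AI summary: As end-to-end spoken dialogue systems advance, accurately modeling paralinguistic features and colloquial expression in conversation has become a key problem. The study proposes SDiaReward, an end-to-end multi-turn dialogue reward model trained on the newly built SDiaReward-Dataset, which jointly assesses modality-related cues and colloquialness in speech. The model judges multi-turn dialogue quality with a single evaluator and, on the newly established ESDR-Bench benchmark, significantly outperforms general-purpose audio LLMs.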

Comments Accepted to ACL 2026 Main Conference

Abstract

The rapid evolution of end-to-end spoken dialogue systems demands transcending mere textual semantics to incorporate paralinguistic nuances and the spontaneous nature of human conversation. However, current methods struggle with two critical gaps: the modality gap, involving prosody and emotion, and the colloquialness gap, distinguishing written scripts from natural speech. To address these challenges, we introduce SDiaReward, an end-to-end multi-turn reward model trained on SDiaReward-Dataset, a novel collection of episode-level preference pairs explicitly targeting these gaps. It operates directly on full multi-turn speech episodes and is optimized with pairwise preference supervision, enabling joint assessment of modality and colloquialness in a single evaluator. We further establish ESDR-Bench, a stratified benchmark for robust episode-level evaluation. Experiments demonstrate that SDiaReward achieves state-of-the-art pairwise preference accuracy, significantly outperforming general-purpose audio LLMs. Further analysis suggests that SDiaReward captures relative conversational expressiveness beyond superficial synthesis cues, improving generalization across domains and recording conditions. Code, data, and demos are available at https://github.com/MM-Speech/SDiaReward/.

2603.12800 2026-05-12 eess.IV cs.CV

GLEAM: A Multimodal Imaging Dataset and HAMM for Glaucoma Classification

Jiao Wang, Chi Liu, Yiying Zhang, Hongchen Luo, Zhifen Guo, Ying Hu, Ke Xu, Jing Zhou, Hongyan Xu, Ruiting Zhou, Man Tang

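AI summary: This paper presents GLEAM, a publicly available glaucoma dataset with three imaging modalities: scanning laser ophthalmoscopy fundus images, circumpapillary OCT images, and visual field pattern deviation maps. It is annotated with four disease stages, supporting diagnosis that draws on complementary multimodal information. To integrate cross-modal information, the authors propose hierarchical attentive masked modeling (HAMM), which uses hierarchical attentive encoders and light decoders to focus cross-modal representation learning on the encoder, with the aim of improving glaucoma classification accuracy. The work offers a new approach and tool for multimodal medical image analysis.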

Abstract

We propose glaucoma lesion evaluation and analysis with multimodal imaging (GLEAM), the first publicly available tri-modal glaucoma dataset comprising scanning laser ophthalmoscopy fundus images, circumpapillary OCT images, and visual field pattern deviation maps, annotated with four disease stages, enabling effective exploitation of multimodal complementary information and facilitating accurate diagnosis and treatment across disease stages. To effectively integrate cross-modal information, we propose hierarchical attentive masked modeling (HAMM) for multimodal glaucoma classification. Our framework employs hierarchical attentive encoders and light decoders to focus cross-modal representation learning on the encoder.

2603.03759 2026-05-12 cs.MA cs.AI cs.LG cs.SY eess.SY math.OC

Learning Approximate Nash Equilibria in Cooperative Multi-Agent Reinforcement Learning via Mean-Field Subsampling

Emile Anand, Ishani Karmarkar

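AI summary: This paper studies a cooperative Markov game in which a global agent works with a large number of local agents under communication constraints, with the global agent observing the states of only a subset of local agents at each time step. The authors propose an alternating learning framework (ALTERNATING-MARL) that updates the global and local policies through subsampled mean-field Q-learning and optimization in an induced MDP. The analysis shows that the method converges to an approximate Nash equilibrium and separates the sample complexity between the joint state and action spaces; numerical simulations on multi-robot control validate the results.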

Comments 57 pages, 10 figures, 4 tables

Abstract

Many large-scale platforms and networked control systems have a centralized decision maker interacting with a massive population of agents under strict observability constraints. Motivated by such applications, we study a cooperative Markov game with a global agent and $n$ homogeneous local agents in a communication-constrained regime, where the global agent only observes a subset of $k$ local agent states per time step. We propose an alternating learning framework $(\texttt{ALTERNATING-MARL})$, where the global agent performs subsampled mean-field $Q$-learning against a fixed local policy, and local agents update by optimizing in an induced MDP. We prove that these approximate best-response dynamics converge to an $\widetilde{O}(1/\sqrt{k})$-approximate Nash Equilibrium, while separating the sample complexities between the joint state and action spaces. Finally, we validate our results in numerical simulations for multi-robot control.
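
The subsampling idea can be illustrated with a toy sketch: instead of the empirical state distribution of all $n$ local agents, the global agent only sees the empirical distribution of $k$ sampled agents. The code below shows only that observation step; it does not implement the paper's Q-learning or best-response updates, and the function names and the three-state setup are assumptions made for illustration.

```python
import random
from collections import Counter


def empirical_distribution(states, num_states):
    """Fraction of agents in each state 0..num_states-1."""
    counts = Counter(states)
    return [counts.get(s, 0) / len(states) for s in range(num_states)]


def subsampled_mean_field(local_states, k, num_states, rng):
    """Empirical state distribution of k local agents sampled without replacement."""
    sample = rng.sample(local_states, k)
    return empirical_distribution(sample, num_states)


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    states = [rng.randrange(3) for _ in range(10_000)]  # n = 10,000 local agents
    full = empirical_distribution(states, 3)
    for k in (10, 100, 1000):
        approx = subsampled_mean_field(states, k, 3, rng)
        print(k, round(total_variation(full, approx), 4))
```

On typical runs the distance to the full distribution shrinks as $k$ grows, which is the intuition behind an approximation error that depends on $k$ rather than $n$.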

2602.22551 2026-05-12 math.OC cs.LG

Identifying Multi-Hit Cancer Drivers Without Massive Parallelization: A CP, MIP, and Column Generation Framework

Rick S. H. Willemsen, Tenindra Abeywickrama, Ramu Anandakrishnan

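AI summary: This study aims to identify the combinations of gene mutations that drive cancer, formalized as the Multi-Hit Cancer Driver Set Cover Problem (MHCDSCP), whose goal is to maximize tumor sample coverage while minimizing misclassification of normal samples. Unlike earlier methods that rely on massive parallelization, the authors propose fast heuristics based on constraint programming and mixed integer programming, together with column generation, that run quickly on a commodity CPU. On real cancer genomics data the framework performs well and obtains provably optimal solutions for over half of the benchmark instances, indicating that the problem is far less computationally demanding than previously believed and providing a practical baseline for further work on multi-hit models.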

Abstract

Cancer is often driven by specific combinations of an estimated two to nine gene mutations, known as multi-hit combinations. Identifying these multi-hit combinations of gene mutations that drive cancer is critical for understanding carcinogenesis and designing targeted therapies. We formalize this challenge as the Multi-Hit Cancer Driver Set Cover Problem (MHCDSCP), optimizing the selection of gene combinations to maximize tumor coverage while strictly minimizing normal sample misclassification. While existing approaches rely on exhaustive enumeration and massive parallelization, we introduce fast heuristics based on constraint programming and mixed integer programming formulations. Evaluated on real-world cancer genomics data, our framework matches state-of-the-art supercomputing methods using a single commodity CPU in under a minute. We also propose a price-and-branch heuristic which, by solving the root node to optimality, provides the first provably optimal solutions for over half of the benchmark instances, thereby verifying the near-optimality of our fast heuristics. These findings demonstrate that on real-world problem instances, the MHCDSCP is far less computationally demanding than previously believed, providing an accessible baseline that enables the exploration of previously intractable multi-hit modeling assumptions.

2602.01861 2026-05-12 eess.AS cs.LG

RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses

Shaoheng Xu, Chunyi Sun, Jihui Zhang, Prasanga N. Samarasinghe, Thushara D. Abhayapala

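AI summary: This paper proposes RIR-Former, a model for reconstructing room impulse responses (RIRs) from sparse measurements. It is built on a transformer backbone with a sinusoidal encoding module that incorporates microphone position information, enabling interpolation at arbitrary array locations. A segmented multi-branch decoder handles early reflections and late reverberation separately, improving reconstruction across the whole RIR. Experiments show that RIR-Former outperforms existing methods across a range of simulated acoustic environments.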

Comments Published in ICASSP 2026. Code: https://github.com/ShaoHenry/RIR-Former . Equal contribution: Shaoheng Xu and Chunyi Sun

Journal ref
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 15312--15316, 2026
Abstract

Room impulse responses (RIRs) are essential for many acoustic signal processing tasks, yet measuring them densely across space is often impractical. In this work, we propose RIR-Former, a grid-free, one-step feed-forward model for RIR reconstruction. By introducing a sinusoidal encoding module into a transformer backbone, our method effectively incorporates microphone position information, enabling interpolation at arbitrary array locations. Furthermore, a segmented multi-branch decoder is designed to separately handle early reflections and late reverberation, improving reconstruction across the entire RIR. Experiments on diverse simulated acoustic environments demonstrate that RIR-Former consistently outperforms state-of-the-art baselines in terms of normalized mean square error (NMSE) and cosine distance (CD), under varying missing rates and array configurations. These results highlight the potential of our approach for practical deployment and motivate future work on scaling from randomly spaced linear arrays to complex array geometries, dynamic acoustic scenes, and real-world environments.
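
The two evaluation metrics named in the abstract can be written down in a few lines. The sketch below uses the common textbook definitions (NMSE as squared error normalized by reference energy, cosine distance as one minus cosine similarity). The abstract does not state the paper's exact normalization, so treat these formulas, the function names, and the toy signal as assumptions.

```python
import math


def nmse(estimate, reference):
    """Normalized mean square error: ||estimate - reference||^2 / ||reference||^2."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    ref = sum(r ** 2 for r in reference)
    return err / ref


def nmse_db(estimate, reference):
    """NMSE expressed in decibels."""
    return 10.0 * math.log10(nmse(estimate, reference))


def cosine_distance(estimate, reference):
    """One minus the cosine similarity between the two signals."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    norm_e = math.sqrt(sum(e * e for e in estimate))
    norm_r = math.sqrt(sum(r * r for r in reference))
    return 1.0 - dot / (norm_e * norm_r)


if __name__ == "__main__":
    reference = [1.0, 0.5, 0.25, 0.125]       # toy decaying "impulse response"
    estimate = [0.9 * x for x in reference]   # same shape, 10% amplitude error
    print(nmse_db(estimate, reference))       # about -20 dB
    print(cosine_distance(estimate, reference))  # about 0: shape is preserved
```

The example shows why the two metrics are complementary: a pure amplitude error is penalized by NMSE but leaves the cosine distance essentially at zero.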

2601.23252 2026-05-12 stat.CO cs.LG stat.ML

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

David Yallup, Namu Kroupa, Will Handley

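AI summary: This paper proposes Nested Slice Sampling (NSS), an algorithm intended to improve the scalability and computational efficiency of nested sampling on GPUs. It uses Hit-and-Run Slice Sampling to vectorize the constrained updates and gives a simple, near-optimal rule for setting the slice width, which improves high-dimensional behavior and makes per-step compute more predictable for parallel execution. Experiments on challenging synthetic targets, high-dimensional Bayesian inference, and Gaussian process hyperparameter marginalization show that NSS maintains accurate evidence estimates and high-quality posterior samples, and is more robust than existing methods on multimodal problems.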

Comments 58 pages, 13 figures, Accepted to Transactions on Machine Learning Research

Abstract

Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference can be challenging for complex, multimodal targets. Nested Sampling is a robust alternative to standard MCMC, yet its typically sequential structure and hard constraints make efficient accelerator implementations difficult. This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling that uses Hit-and-Run Slice Sampling for constrained updates. A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable for parallel execution. Experiments on challenging synthetic targets, high dimensional Bayesian inference, and Gaussian process hyperparameter marginalization show that NSS maintains accurate evidence estimates and high-quality posterior samples, and is particularly robust on difficult multimodal problems where current state-of-the-art methods such as tempered SMC baselines can struggle. An open-source implementation is released to facilitate adoption and reproducibility.

2601.19585 2026-05-12 cs.IR cs.AI

LLM-Enhanced Reinforcement Learning for Long-Term User Satisfaction in Interactive Recommendation

Chongjun Xia, Yanchun Peng, Xianzhi Wang

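AI summary: Interactive recommender systems adapt dynamically to user feedback, but by overfitting short-term preferences they often suffer from content homogeneity and filter bubbles. To improve long-term user satisfaction, this paper proposes LERL, a hierarchical recommendation framework that combines the semantic planning ability of large language models (LLMs) with the fine-grained adaptability of reinforcement learning (RL). A high-level LLM planner selects semantically diverse content categories, and a low-level RL policy recommends personalized items within the selected semantic space, which narrows the action space, improves planning efficiency, and mitigates content redundancy. Experiments on real-world datasets show that LERL significantly improves long-term user satisfaction compared with state-of-the-art baselines.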

Journal ref
In: Jung, H., Wang, T., Toyoda, M., Kwon, HY., Lee, Jw. (eds) Database Systems for Advanced Applications. DASFAA 2026. Lecture Notes in Computer Science, vol 16535. Springer, Singapore
Abstract

Interactive recommender systems can dynamically adapt to user feedback, but often suffer from content homogeneity and filter bubble effects due to overfitting short-term user preferences. While recent efforts aim to improve content diversity, they predominantly operate in static or one-shot settings, neglecting the long-term evolution of user interests. Reinforcement learning provides a principled framework for optimizing long-term user satisfaction by modeling sequential decision-making processes. However, its application in recommendation is hindered by sparse, long-tailed user-item interactions and limited semantic planning capabilities. In this work, we propose LLM-Enhanced Reinforcement Learning (LERL), a novel hierarchical recommendation framework that integrates the semantic planning power of LLM with the fine-grained adaptability of RL. LERL consists of a high-level LLM-based planner that selects semantically diverse content categories, and a low-level RL policy that recommends personalized items within the selected semantic space. This hierarchical design narrows the action space, enhances planning efficiency, and mitigates overexposure to redundant content. Extensive experiments on real-world datasets demonstrate that LERL significantly improves long-term user satisfaction when compared with state-of-the-art baselines. The implementation of LERL is available at https://github.com/1163710212/LERL.

2601.12248 2026-05-12 eess.AS cs.AI cs.CL cs.LG cs.SD

AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

Chun-Yi Kuan, Hung-yi Lee

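AI summary: AQUA-Bench is a new benchmark for evaluating how well models recognize unanswerable questions in audio question answering, addressing the limited attention existing benchmarks give to such questions. It systematically evaluates three scenarios: the correct answer is missing from the options, the answer choices are categorically mismatched with the question, and the question is irrelevant to the audio content, giving a fuller measure of model reliability and robustness. Experiments suggest that although existing models perform well on answerable tasks, they still face notable challenges on unanswerable ones, revealing a blind spot in current audio-language understanding.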

Comments Accepted to ICASSP 2026 (Oral). Project Website: https://github.com/kuan2jiu99/aqua-bench

Abstract

Recent advances in audio-aware large language models have shown strong performance on audio question answering. However, existing benchmarks mainly cover answerable questions and overlook the challenge of unanswerable ones, where no reliable answer can be inferred from the audio. Such cases are common in real-world settings, where questions may be misleading, ill-posed, or incompatible with the information. To address this gap, we present AQUA-Bench, a benchmark for Audio Question Unanswerability Assessment. It systematically evaluates three scenarios: Absent Answer Detection (the correct option is missing), Incompatible Answer Set Detection (choices are categorically mismatched with the question), and Incompatible Audio Question Detection (the question is irrelevant or lacks sufficient grounding in the audio). By assessing these cases, AQUA-Bench offers a rigorous measure of model reliability and promotes the development of audio-language systems that are more robust and trustworthy. Our experiments suggest that while models excel on standard answerable tasks, they often face notable challenges with unanswerable ones, pointing to a blind spot in current audio-language understanding.

2601.02366 2026-05-12 cs.IR cs.AI

TextBridgeGNN: Pre-training Graph Neural Network for Cross-Domain Recommendation via Text-Guided Transfer

Yiwen Chen, Yiqing Wu, Huishi Luo, Fuzhen Zhuang, Deqing Wang, Zhao Zhang

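AI summary: This paper proposes TextBridgeGNN, a pre-training framework for graph neural networks that addresses the difficulty of transferring conventional ID-embedding-based graph recommendation models across domains. It uses text as a semantic bridge, connecting the heterogeneous interaction graphs of different domains through multi-level graph propagation, so that pre-training learns both domain-specific and domain-global knowledge, and fine-tuning transfers ID embeddings across domains through a similarity transfer mechanism. Experiments show that TextBridgeGNN outperforms existing methods in cross-domain, multi-domain, and training-free settings, combining the semantics of pre-trained language models with graph-based collaborative filtering.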

Abstract

Graph-based recommendation has achieved great success in recent years. The classical graph recommendation model utilizes ID embedding to store essential collaborative information. However, this ID-based paradigm faces challenges in transferring to a new domain, making it hard to build a pre-trained graph recommendation model. This phenomenon primarily stems from two inherent challenges: (1) the non-transferability of ID embeddings due to isolated domain-specific ID spaces, and (2) structural incompatibility between heterogeneous interaction graphs across domains. To address these issues, we propose TextBridgeGNN, a pre-training and fine-tuning framework that can effectively transfer knowledge from a pre-trained GNN to downstream tasks. We believe the key lies in how to build the relationship between domains. Specifically, TextBridgeGNN uses text as a semantic bridge to connect domains through multi-level graph propagation. During the pre-training stage, textual information is utilized to break the data islands formed by multiple domains, and hierarchical GNNs are designed to learn both domain-specific and domain-global knowledge with text features, ensuring the retention of collaborative signals and the enhancement of semantics. During the fine-tuning stage, a similarity transfer mechanism is proposed. This mechanism initializes ID embeddings in the target domain by transferring from semantically related nodes, successfully transferring the ID embeddings and graph pattern. Experiments demonstrate that TextBridgeGNN outperforms existing methods in cross-domain, multi-domain, and training-free settings, highlighting its ability to integrate Pre-trained Language Model (PLM)-driven semantics with graph-based collaborative filtering without costly language model fine-tuning or real-time inference overhead.

2512.21587 2026-05-12 physics.optics cond-mat.dis-nn cs.LG math-ph math.MP physics.app-ph

Incorporating rank-free coupling and external field via an incoherent modulated spatial photonic Ising machine

Ze Zheng, Yuegang Li, Hang Xu, Jingzheng Huang, Tailong Xiao, Guihua Zeng

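AI summary: This study presents an amplitude-only, rank-free spatial photonic Ising machine that can directly program fully connected Ising models with external fields, removing the speed or spin-count limits that diffraction-based schemes incur by relying on auxiliary spins or multiplexing. The device encodes arbitrary Ising Hamiltonians as Hadamard products on amplitude and binary spatial modulators and reads out the result with a single-pixel intensity measurement, achieving high-precision, efficient optimization. It is applied to sparse problems, including a Max-Cut instance on a large Mobius ladder graph, and demonstrates programmable optical simulation of physical models beyond low-rank couplings.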

Comments 15 pages, 6 figures

Abstract

Spatial photonic Ising machines offer a novel optical platform for optimization and spin-model simulation, but existing diffraction-based schemes rely on auxiliary spins or multiplexing to encode high-rank couplings and external fields, reducing either speed or spin count. We demonstrate an amplitude-only, rank-free spatial photonic Ising machine in which arbitrary Ising Hamiltonians are encoded as Hadamard products on aligned amplitude and binary spatial modulators and read out by a single-pixel intensity measurement. The machine directly programs fully connected 797-spin Ising models with external fields at nearly 9-bit precision and operates at a constant iteration rate of ~200 Hz. By removing zero-valued product terms, the same architecture scales to sparse problems and experimentally solves a Max-Cut instance on a 424,108-vertex Mobius ladder graph. We also observe the phase transition of the Sherrington-Kirkpatrick model, demonstrating programmable optical simulation beyond low-rank couplings. These results establish amplitude modulation as a scalable route to programmable photonic Ising machines.
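
To make the Max-Cut experiment concrete, the sketch below builds a small Mobius ladder graph (an $n$-cycle plus chords joining opposite vertices), solves Max-Cut by brute force, and checks the standard link to the Ising energy: with unit antiferromagnetic couplings and no external field, $E = \sum_{(i,j)} s_i s_j = |E_{\text{edges}}| - 2\,\text{cut}$, so minimizing the energy maximizes the cut. This is a software illustration only; the sign convention is an assumption, and nothing here models the optical encoding.

```python
from itertools import product


def mobius_ladder_edges(n):
    """Edges of the Mobius ladder on n vertices: an n-cycle plus chords i -- i + n/2."""
    if n % 2 or n < 4:
        raise ValueError("n must be even and at least 4")
    cycle = [(i, (i + 1) % n) for i in range(n)]
    chords = [(i, i + n // 2) for i in range(n // 2)]
    return cycle + chords


def cut_size(edges, spins):
    """Number of edges whose endpoints carry opposite spins."""
    return sum(1 for i, j in edges if spins[i] != spins[j])


def ising_energy(edges, spins):
    """E = sum over edges of s_i * s_j (unit couplings, no external field)."""
    return sum(spins[i] * spins[j] for i, j in edges)


def brute_force_max_cut(n):
    """Exhaustive search over all 2**n spin assignments; only viable for small n."""
    edges = mobius_ladder_edges(n)
    best = max(product((-1, 1), repeat=n), key=lambda s: cut_size(edges, s))
    return cut_size(edges, best), best


if __name__ == "__main__":
    for n in (6, 8):
        cut, spins = brute_force_max_cut(n)
        edges = mobius_ladder_edges(n)
        print(n, len(edges), cut, ising_energy(edges, spins))
```

Exhaustive search is limited to a handful of vertices, which is the motivation for heuristic hardware solvers on instances with hundreds of thousands of vertices.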

2512.02066 2026-05-12 quant-ph cs.AI cs.LG eess.IV

Parallel Multi-Circuit Quantum Feature Fusion in Hybrid Quantum-Classical Convolutional Neural Networks for Breast Tumor Classification

Ece Yurtseven

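AI summary: This paper proposes a hybrid quantum-classical convolutional neural network (QCNN) architecture for binary classification of breast tumor images on the standard BreastMNIST dataset. The architecture combines classical convolutional feature extraction with two distinct quantum circuits, one using amplitude encoding and one using angle encoding with circular entanglement, both implemented on four qubits. The quantum feature embeddings are fused with classical features to form a joint feature space, which a fully connected classifier then processes. Experiments report that the hybrid QCNN achieves statistically significant gains in classification accuracy over a classical CNN, with a large effect size.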

Comments Accepted to QCNC 2026

Abstract

Quantum machine learning has emerged as a promising approach to improve feature extraction and classification tasks in high-dimensional data domains such as medical imaging. In this work, we present a hybrid Quantum-Classical Convolutional Neural Network (QCNN) architecture designed for the binary classification of the BreastMNIST dataset, a standardized benchmark for distinguishing between benign and malignant breast tumors. Our architecture integrates classical convolutional feature extraction with two distinct quantum circuits: an amplitude-encoding variational quantum circuit (VQC) and an angle-encoding VQC with circular entanglement, both implemented on four qubits. These circuits generate quantum feature embeddings that are fused with classical features to form a joint feature space, which is subsequently processed by a fully connected classifier. To ensure fairness, the hybrid QCNN is parameter-matched against a baseline classical CNN, allowing us to isolate the contribution of quantum layers. Both models are trained under identical conditions using the Adam optimizer and binary cross-entropy loss. Experimental evaluation over five independent runs demonstrates that the hybrid QCNN achieves statistically significant improvements in classification accuracy compared to the classical CNN, as validated by a one-sided Wilcoxon signed rank test (p = 0.03125) and supported by a large effect size (Cohen's d = 2.14). Our results indicate that hybrid QCNN architectures can leverage entanglement and quantum feature fusion to enhance medical image classification tasks. This work establishes a statistical validation framework for assessing hybrid quantum models in biomedical applications and highlights pathways for scaling to larger datasets and deployment on near-term quantum hardware.
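
A note on the reported p-value: 0.03125 equals 1/32 = 1/2^5, which is the smallest one-sided p-value the exact Wilcoxon signed-rank test can return for five pairs, and it occurs when all five paired differences favour the same model. The sketch below enumerates the exact null distribution to show this. It assumes the test was paired across the five runs with no ties or zero differences; the accuracy differences in the example are invented for illustration and are not the paper's data.

```python
from itertools import product


def wilcoxon_one_sided_exact(diffs):
    """Exact one-sided p-value P(W+ >= observed) for the Wilcoxon signed-rank test.

    Assumes no zero differences and no ties among the absolute differences.
    """
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis every sign pattern is equally likely.
    hits = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return hits / 2 ** n


if __name__ == "__main__":
    # Hypothetical accuracy gains (hybrid minus classical) in five runs, all positive.
    gains = [0.020, 0.035, 0.010, 0.050, 0.040]
    print(wilcoxon_one_sided_exact(gains))  # 0.03125, i.e. 1/32
```

With only five runs, the test therefore cannot distinguish a small consistent gain from a large one; the effect size carries that information.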

2512.00175 2026-05-12 stat.ME cs.LG stat.ML

Comparing Two Proxy Methods for Causal Identification

Helen Guo, Elizabeth L. Ogburn, Ilya Shpitser

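AI summary: This paper compares two proxy variable methods for causal identification: bridge equation methods and array decomposition methods. The former recover causal targets by solving integral equations, while the latter identify latent factors through eigendecomposition tasks and use them to identify counterfactual quantities. The study compares the model restrictions and assumptions underlying the two approaches and clarifies the scope of applicability of each.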

Comments 10 pages; 5 figures

Abstract

Identifying causal effects in the presence of unmeasured variables is a fundamental challenge in causal inference, for which proxy variable methods have emerged as a powerful solution. We contrast two major approaches in this framework: (1) bridge equation methods, which leverage solutions to integral equations to recover causal targets, and (2) array decomposition methods, which recover latent factors used to identify counterfactual quantities via eigendecomposition tasks. We compare the model restrictions underlying these two approaches and provide insight into implications of the underlying assumptions, clarifying the scope of applicability for each method.

2510.23074 2026-05-12 cs.CR cs.CL

Fast-MIA: Efficient and Scalable Membership Inference for LLMs

Hiromu Takahashi, Shotaro Ishihara

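AI summary: This paper proposes Fast-MIA, a Python library for efficiently evaluating membership inference attacks (MIA) against large language models (LLMs). To address the high computational cost of existing methods and their redundant computation of intermediate results, Fast-MIA uses batch inference optimization and a cross-method caching mechanism, substantially improving evaluation efficiency. The library brings together representative MIA methods, supports flexible configuration and established benchmarks, and aims to support scalable and reproducible research on privacy risks.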

Comments ACL 2026 System Demonstrations

Abstract

We propose Fast-MIA (https://github.com/Nikkei/fast-mia), a Python library for efficiently evaluating membership inference attacks (MIA) against large language models (LLMs). MIA has emerged as a crucial technique for auditing privacy risks and copyright infringement in LLMs. However, computational demands have grown substantially: recent methods rely on repeated inference, while practical auditing requires large-scale evaluation. Progress is further hindered by existing implementations that execute methods independently, redundantly computing shared intermediate results such as log-probabilities. To address these challenges, Fast-MIA combines two strategies: (1) high-throughput batch inference via vLLM, achieving approximately 5$\times$ speedup, and (2) a cross-method caching architecture that computes intermediate results once and shares them across methods. The library includes representative MIA methods under a unified framework, integrates with established benchmarks, and supports flexible YAML configuration. We release Fast-MIA under the Apache License 2.0 to support scalable and reproducible MIA research.
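
The cross-method caching idea can be sketched in plain Python: several attack scores need the same per-token log-probabilities, so the expensive model call is made once and its result reused. Everything below is hypothetical and is not Fast-MIA's API; `token_logprobs` is a stand-in that fabricates numbers instead of calling a model, and the two scoring functions are simplified loss-style and Min-K%-style scores used only to show the sharing pattern.

```python
from functools import lru_cache

# Counts how many times the (pretend) expensive model call actually runs.
CALLS = {"count": 0}


@lru_cache(maxsize=None)
def token_logprobs(text):
    """Stand-in for an LLM forward pass: one fake log-probability per token."""
    CALLS["count"] += 1
    return tuple(-0.1 * (len(tok) + 1) for tok in text.split())


def loss_score(text):
    """Loss-style score: mean negative log-probability over tokens."""
    lp = token_logprobs(text)
    return -sum(lp) / len(lp)


def min_k_score(text, k=0.2):
    """Min-K%-style score: mean of the lowest k fraction of token log-probs."""
    lp = sorted(token_logprobs(text))
    m = max(1, int(len(lp) * k))
    return sum(lp[:m]) / m


if __name__ == "__main__":
    sample = "the quick brown fox jumps"
    print(loss_score(sample), min_k_score(sample))
    print(CALLS["count"])  # 1: both scores shared a single forward pass
```

In a real library the cache key would also need to include the model and tokenizer, and batching would sit underneath the cached call; those details are omitted here.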

2510.20129 2026-05-12 cs.CR cs.AI

SAID: Safety-Aware Intent Defense via Prefix Probing for Large Language Models

Yulong Chen, Qi Zhang, Jiawen Zhang, Yadong Liu, Mu Li, Jie Wen, Yong Xu

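AI summary: This paper proposes SAID (Safety-Aware Intent Defense), a framework for strengthening large language models against jailbreak attacks. It performs intent-level safety probing on user input before a response is generated, requiring no changes to model parameters or the decoding process, which makes the defense black-box compatible. Experiments show that SAID performs well under a range of jailbreak attacks, reducing harmful responses while preserving utility on benign tasks, and offers a practical balance between safety and utility.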

Comments 12 pages, 5 figures. V2: Updated title, author list, and extensive experiments; expanded background on LLM security applications

Abstract

Large Language Models (LLMs) remain vulnerable to jailbreak attacks, where adversarially crafted prompts induce policy-violating responses despite safety alignment. Existing defenses typically improve safety through external filtering, auxiliary guardrails, or decoding-time control. However, these interventions often reduce practical deployability because they may require additional model access, introduce extra inference cost, or affect benign-task utility. In this paper, we propose Safety-Aware Intent Defense (SAID), a training-free jailbreak defense framework based on intent-level safety probing. SAID first distills potentially obfuscated user inputs into concise core intents using the target model itself. It then applies a validated safety prefix to probe each distilled intent and elicit the model's safety-aware response. Finally, a conservative aggregation rule rejects the original request if any distilled intent is identified as unsafe. This design enables black-box-compatible defense without updating model parameters or modifying the decoding process. Experiments on four open-source LLMs under six representative jailbreak attacks show that SAID achieves state-of-the-art defense performance in reducing harmful responses while maintaining competitive utility on benign tasks. Further analyses on prefix variants, hierarchical distillation, and inference efficiency demonstrate that SAID provides a practical safety-utility trade-off for securing LLMs against jailbreak threats.

2510.19414 2026-05-12 eess.AS cs.AI cs.SD

EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection

Tong Zhang, Yihuan Huang, Yanzhen Ren

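AI summary: With the growing prevalence of speech deepfakes, security problems in real-world scenarios such as telephone fraud and identity theft are becoming more serious. Existing anti-spoofing systems perform well on lab-generated synthetic speech, but their performance drops sharply under physical replay attacks. This paper presents EchoFake, a dataset of more than 120 hours of audio from over 13,000 speakers, covering state-of-the-art zero-shot text-to-speech (TTS) speech and physical replay recordings collected with varied devices in real-world environments; models trained on it generalize better across datasets.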

Comments ICASSP 2026

Abstract

The growing prevalence of speech deepfakes has raised serious concerns, particularly in real-world scenarios such as telephone fraud and identity theft. While many anti-spoofing systems have demonstrated promising performance on lab-generated synthetic speech, they often fail when confronted with physical replay attacks, a common and low-cost form of attack used in practical settings. Our experiments show that models trained on existing datasets exhibit severe performance degradation, with average accuracy dropping to 59.6% when evaluated on replayed audio. To bridge this gap, we present EchoFake, a comprehensive dataset comprising more than 120 hours of audio from over 13,000 speakers, featuring both cutting-edge zero-shot text-to-speech (TTS) speech and physical replay recordings collected under varied devices and real-world environmental settings. Additionally, we evaluate three baseline detection models and show that models trained on EchoFake achieve lower average EERs across datasets, indicating better generalization. By introducing more practical challenges relevant to real-world deployment, EchoFake offers a more realistic foundation for advancing spoofing detection methods.

2510.19407 2026-05-12 math.OC cs.RO

A Radius of Robust Feasibility Approach to Directional Sensors in Uncertain Terrain

Vanshika Datta, C. Nahak

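AI summary: This paper studies the radius of robust feasibility for directional sensor networks in uncertain terrain, and proposes a new approach that combines this radius with a distributed greedy algorithm to improve network coverage. It gives an exact formula for the radius of robust feasibility in a directional sensor network and strategically orients the sensors to strengthen robustness under uncertainty. Experimental results validate the approach's effectiveness in maximizing coverage and optimizing sensor orientations.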

Details
English abstract

A sensor has the ability to probe its surroundings. However, uncertainties in its exact location can significantly compromise its sensing performance. The radius of robust feasibility defines the maximum range within which robust feasibility is ensured. This work introduces a novel approach that integrates it with directional sensor networks to enhance coverage using a distributed greedy algorithm. In particular, we provide an exact formula for the radius of robust feasibility of sensors in a directional sensor network. The proposed model strategically orients the sensors in regions with high coverage potential, accounting for robustness in the face of uncertainty. We analyze the algorithm's adaptability in dynamic environments, demonstrating its ability to enhance efficiency and robustness. Experimental results validate its efficacy in maximizing coverage and optimizing sensor orientations, highlighting its practical advantages for real-world scenarios.

2510.07136 2026-05-12 cs.IT cs.CR cs.LG cs.SI math.IT

Differentially Private Spectral Graph Clustering: Balancing Privacy, Accuracy, and Efficiency

Antti Koskela, Mohamed Seif, H. Vincent Poor, Andrea J. Goldsmith

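AI summary This paper studies spectral graph clustering under edge differential privacy and proposes a matrix shuffling mechanism that combines randomized edge flipping with a random permutation of the adjacency matrix, strengthening the privacy guarantee. Using a unified error analysis framework, the paper gives misclassification rates for the different mechanisms as a function of the privacy budget, eigengap, and number of communities, and shows that the error rate of the proposed method decays as $\tilde{O}(1/n)$ in the number of nodes, improving on conventional private PCA methods. It also proposes a private spectral gap detection algorithm for estimating the number of communities, and experiments validate the theoretical results.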

Details
English abstract

We study spectral graph clustering under edge differential privacy. We propose a matrix shuffling mechanism that combines randomized edge flipping with a random permutation of the adjacency matrix. While edge flipping alone provides only a constant $\varepsilon$ guarantee as the graph grows, shuffling amplifies privacy so that the effective $\varepsilon$ tends to zero with the number of nodes. We develop a unified error analysis framework -- based on Davis--Kahan perturbation theory and a classification-margin bound -- that gives explicit misclassification rates for all the mechanisms considered as a function of the privacy budget, eigengap, and number of communities. Applying this framework, we show that the matrix shuffling mechanism achieves an error rate scaling of $\tilde{O}(1/n)$, a clear improvement over two canonical DP baselines from the private PCA literature: the Gaussian mechanism applied directly to the adjacency matrix (Analyze Gauss) and the noisy power method, both of which scale as $\tilde{O}(1)$ in $n$. We further propose a private spectral gap detection algorithm for estimating the number of communities. Experiments on synthetic and real-world networks validate our theoretical findings.
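
The two ingredients named in the abstract, randomized edge flipping and a random permutation of the adjacency matrix, can be sketched as below. This is a simplified illustration under stated assumptions: edge flipping is implemented as standard randomized response (each potential edge is kept with probability $e^{\varepsilon}/(1+e^{\varepsilon})$), and the permutation is assumed to relabel nodes, so rows and columns are permuted together. The function name `flip_and_shuffle` is hypothetical, and the paper's exact mechanism and privacy accounting may differ.

```python
import math
import random
from typing import List, Tuple


def flip_and_shuffle(
    adj: List[List[int]], epsilon: float, rng: random.Random
) -> Tuple[List[List[int]], List[int]]:
    """Randomized-response edge flipping followed by a random node relabeling."""
    n = len(adj)
    # Probability of reporting the true bit; equals e^eps / (1 + e^eps).
    keep_prob = 1.0 / (1.0 + math.exp(-epsilon))

    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, to keep the graph undirected
            bit = adj[i][j]
            if rng.random() >= keep_prob:
                bit = 1 - bit  # flip this potential edge
            noisy[i][j] = noisy[j][i] = bit

    # Random permutation applied to rows and columns together (node relabeling).
    perm = list(range(n))
    rng.shuffle(perm)
    shuffled = [[noisy[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
    return shuffled, perm
```

The permutation is returned here only so the sketch can be checked; a real release would not publish it, since hiding the node labels is what the shuffling step relies on.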

2509.20799 2026-05-12 cs.HC cs.SD

AuthGlass: Benchmarking Voice Liveness Detection and Authentication on Smart Glasses via Comprehensive Acoustic Features

Weiye Xu, Zhang Jiang, Siqi Zheng, Xiyuxing Zhang, Changhao Zhang, Jian Liu, Weiqiang Wang, Yuntao Wang

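AI summary With the rapid development of smart glasses, voice interaction has been widely adopted for its naturalness and convenience, but practical deployment is often threatened by spoofing attacks, and no public dataset exists for voice liveness detection and authentication in smart-glasses scenarios. To address this, the researchers collected a multi-acoustic-modal dataset containing 16-channel audio from 42 subjects together with attack samples from two attack categories, and proposed AuthG-Live, a sound-field-based liveness detection method, and AuthG-Net, a multi-acoustic-modal authentication model. Experiments show that the approach achieves state-of-the-art performance on four benchmark tasks, and ablation studies validate its generalizability in real-world scenarios. The dataset is released as AuthGlass to support further research in this area.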

Comments Submitted to IMWUT 2026

Details
English abstract

With the rapid advancement of smart glasses, voice interaction has been widely adopted due to its naturalness and convenience. However, its practical deployment is often undermined by vulnerability to spoofing attacks, while no public dataset currently exists for voice liveness detection and authentication in smart-glasses scenarios. To address this challenge, we first collect a multi-acoustic-modal dataset comprising 16-channel audio data from 42 subjects, along with corresponding attack samples covering two attack categories. Based on insights derived from this collected data, we propose AuthG-Live, a sound-field-based voice liveness detection method, and AuthG-Net, a multi-acoustic-modal authentication model. We further benchmark seven voice liveness detection methods and four authentication methods across diverse acoustic modalities. The results demonstrate that our proposed approach achieves state-of-the-art performance on four benchmark tasks, and extensive ablation studies validate the generalizability of our methods under real-world constraints. Finally, we release this dataset, termed AuthGlass, to facilitate future research on voice liveness detection and authentication for smart glasses.

2509.18484 2026-05-12 stat.ML cs.LG

Estimating Heterogeneous Causal Effect on Networks via Orthogonal Learning

Yuanchen Wu, Yubai Yuan

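AI summary This paper studies the estimation of heterogeneous causal effects on network data, where a treatment affects not only the treated node but may also spill over to neighboring nodes, and where causal effects may differ across nodes and edges. The authors propose a two-stage orthogonal learning framework: the first stage uses graph neural networks to estimate nuisance components related to covariates and network structure, and the second stage estimates direct and spillover effects through an interpretable attention-based model, providing edge-level, node-level, and population-level causal effect estimates. The method uses orthogonalization and cross-fitting to reduce sensitivity to first-stage estimation error and adds bootstrap-based uncertainty quantification. Experiments show advantages in heterogeneous effect estimation and in downstream interpretable analyses.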

Details
English abstract

Estimating causal effects on networks is challenging because treatments may affect both treated units and their neighbors, while network homophily induces dependence and confounding. These challenges are amplified when causal effects are heterogeneous across units and edges. We propose a two-stage orthogonal learning framework for estimating heterogeneous direct and spillover effects on networks. The first stage uses graph neural networks to estimate nuisance components that capture complex dependence on covariates and network structure. The second stage residualizes these nuisance components and estimates causal effects through an interpretable attention-based interference model, yielding edge-level spillover estimates as well as node- and population-level summaries. Neyman orthogonalization and cross-fitting reduce sensitivity to first-stage estimation error, so nuisance errors enter only at higher order. We further develop a bootstrap-based uncertainty quantification procedure for the estimated spillover matrix, enabling pointwise and simultaneous inference for heterogeneous edge- and node-level effects. Experiments show that our method improves heterogeneous effect estimation while supporting interpretable downstream analyses such as influential-neighbor detection and spillover-sign recovery.

2505.18184 2026-05-12 eess.SP cs.CV

AI-Enhanced Stethoscope in Remote Diagnostics for Cardiopulmonary Diseases

Hania Ghouse, Juveria Tanveen, Abdul Muqtadir Ahmed, Uma N. Dulhare

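AI summary To address the growing worldwide difficulty of diagnosing cardiovascular and pulmonary diseases, this paper proposes a low-cost AI-assisted stethoscope system for remote diagnosis of cardiopulmonary diseases. The method extracts and processes MFCC features from auscultation sounds and uses a hybrid CNN and GRU model to automatically classify six pulmonary and five cardiovascular diseases. It can be deployed on low-cost embedded devices in under-resourced regions to provide real-time diagnostic support, offering an innovative route toward standardized healthcare.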

Details
English abstract

The increase in cardiac and pulmonary diseases presents an alarming and pervasive health challenge on a global scale, responsible for unexpected and premature mortalities. Despite the seriousness of these conditions, existing methods of detection and treatment encounter challenges, particularly in achieving timely diagnosis for effective medical intervention. Manual screening processes commonly used for primary detection of cardiac and respiratory problems face inherent limitations, compounded by a scarcity of skilled medical practitioners in remote or under-resourced areas. To address this, our study introduces an innovative yet efficient model that integrates AI for diagnosing lung and heart conditions concurrently using auscultation sounds. Unlike already high-priced digital stethoscopes, our proposed model is designed to be deployed on low-cost embedded devices, ensuring applicability in under-developed regions that face difficulty accessing medical care. The model incorporates MFCC feature extraction and engineering techniques so that the signal is well analyzed for accurate diagnostics, using a hybrid model that combines a Gated Recurrent Unit with a CNN to process audio signals recorded from the low-cost stethoscope. Beyond its diagnostic capabilities, the model generates digital audio records that facilitate classifying six pulmonary and five cardiovascular diseases. Hence, the integration of a cost-effective stethoscope with an efficient AI-empowered model deployed on a web app providing real-time analysis represents a transformative step towards standardized healthcare.

2505.07349 2026-05-12 eess.IV cs.CV

Multi-Plane Vision Transformer for Hemorrhage Classification Using Axial and Sagittal MRI Data

Badhan Kumar Das, Gengyan Zhao, Boris Mailhe, Thomas J. Re, Dorin Comaniciu, Eli Gibson, Andreas Maier

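AI summary This paper proposes a multi-plane vision transformer (MP-ViT) for brain hemorrhage classification, aiming to avoid the information loss that arises when hemorrhage detection uses MRI data of different orientations (such as axial and sagittal). The method uses two separate transformer encoders to process the different orientations, fuses information across orientations with cross-attention, and introduces a modality indication vector to supply missing contrast information. Experiments show that MP-ViT performs well on a clinical dataset with 10,084 training samples, improving AUC by 5.5% over the conventional ViT and by 1.8% over CNN models, demonstrating its advantage for multi-orientation MRI hemorrhage detection.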

Comments 10 pages

Details
English abstract

Identifying brain hemorrhages from magnetic resonance imaging (MRI) is a critical task for healthcare professionals. The diverse nature of MRI acquisitions, with varying contrasts and orientations, introduces complexity in identifying hemorrhage using neural networks. For acquisitions with varying orientations, traditional methods often involve resampling images to a fixed plane, which can lead to information loss. To address this, we propose a 3D multi-plane vision transformer (MP-ViT) for hemorrhage classification with varying orientation data. It employs two separate transformer encoders for axial and sagittal contrasts, using cross-attention to integrate information across orientations. MP-ViT also includes a modality indication vector to provide missing contrast information to the model. The effectiveness of the proposed model is demonstrated with extensive experiments on a real-world clinical dataset consisting of 10,084 training, 1,289 validation, and 1,496 test subjects. MP-ViT achieved substantial improvement in area under the curve (AUC), outperforming the vision transformer (ViT) by 5.5% and CNN-based architectures by 1.8%. These results highlight the potential of MP-ViT in improving performance for hemorrhage detection when different orientation contrasts are needed.

2504.21015 2026-05-12 cs.IR cs.CL

Don't Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval

Aarush Sinha

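AI summary This paper studies whether large language models (LLMs) can directly generate synthetic hard negatives to replace the full-corpus mining that dense retrieval training conventionally relies on. The author generates synthetic negatives with four state-of-the-art LLMs of different sizes, fine-tunes DistilBERT on them, and evaluates on 10 BEIR benchmark datasets. The study finds that the generated negatives perform worse than conventional corpus-based mining (BM25 and Cross-Encoder), and that increasing the generator's parameter count does not necessarily improve retrieval performance: the 14B-parameter model outperforms the 30B-parameter model.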

Details
English abstract

Training effective dense retrieval models typically relies on hard negative (HN) examples mined from large document corpora using methods such as BM25 or cross-encoders, which require full corpus access and expensive index construction. We propose generating synthetic hard negatives directly from a provided query and positive passage, using Large Language Models (LLMs). We fine-tune DistilBERT using synthetic negatives generated by four state-of-the-art LLMs ranging from 4B to 30B parameters (Qwen3, LLaMA3, Phi4) and evaluate performance across 10 BEIR benchmark datasets. Contrary to the prevailing assumption that stronger generative models yield better synthetic data, we find that our generative pipeline consistently underperforms traditional corpus-based mining strategies (BM25 and Cross-Encoder). Furthermore, we observe that scaling the generator model does not monotonically improve retrieval performance and find that the 14B parameter model outperforms the 30B model and in some settings it is the worst performing.

2504.19451 2026-05-12 math.NT cs.AI

Artificial Intelligence in Number Theory: LLMs for Algorithm Generation and Ensemble Methods for Conjecture Verification

Ali Saraeb

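AI summary This paper explores two concrete applications of artificial intelligence in number theory. The first part evaluates the state-of-the-art open-source large language model Qwen2.5-Math-7B-Instruct on algorithmic number theory tasks: given non-spoiling hints, it reaches at least 0.95 accuracy on thirty algorithmic problems and thirty computational questions. The second part builds a LightGBM classifier on statistical features to empirically verify a number-theoretic conjecture relating the zeros of Dirichlet L-functions to the modulus, achieving at least 93.9% test accuracy for small moduli.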

Details
English abstract

This paper presents two concrete applications of Artificial Intelligence to algorithmic and analytic number theory. Recent benchmarks of large language models have mainly focused on general mathematics problems and the currently infeasible objective of automated theorem proving. In the first part of this paper, we relax our ambition and focus on a more specialized domain: we evaluate the performance of the state-of-the-art open-source large language model Qwen2.5-Math-7B-Instruct on algorithmic and computational tasks in algorithmic number theory. On a benchmark of thirty algorithmic problems and thirty computational questions taken from classical number-theoretic textbooks and Math StackExchange, the model achieves at least 0.95 accuracy (relative to the true answer) on every problem or question when given an optimal non-spoiling hint. The second part of the paper empirically verifies a folklore conjecture in analytic number theory stating that the modulus \(q\) of a Dirichlet character \(χ\) is uniquely determined by the initial nontrivial zeros \(\{ρ_1,\dots,ρ_k\}\) (for some \(k\in\mathbb{N}\)) of the corresponding Dirichlet \(L\)-function \(L(s,χ)\). We train a LightGBM multiclass classifier to predict the conductor \(q\) for 214 randomly chosen Dirichlet \(L\)-functions from a vector of statistical features of their initial zeros (moments, finite-difference statistics, FFT magnitudes, etc.). The model empirically verifies the conjecture for small \(q\), achieving at least 93.9\% test accuracy when sufficient statistical properties of the zeros are incorporated. For the second part of the paper, code and dataset are available.

2504.16093 2026-05-12 q-fin.PM cs.AI math.PR

Efficient Portfolio Selection through Preference Aggregation with Quicksort and the Bradley--Terry Model

Yurun Ge, Lucas Böttcher, Tom Chou, Maria R. D'Orsogna

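AI summary This paper studies how to efficiently select the best project portfolio under uncertainty and proposes preference aggregation methods based on Quicksort and the Bradley-Terry model. The approach converts the uncertain long-term benefits of projects into pairwise "win" probabilities and aggregates the evaluations of multiple agents to rank the projects, thereby optimizing portfolio selection. Experiments show that the proposed methods outperform the leading existing methods and that sampling techniques can substantially reduce the number of pairwise comparisons, making the approach practical.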

Comments 15pp, 4 figs

Details
Journal ref
J. Comput. Sci. 92, 102728 (2025)
English abstract

How to allocate limited resources to projects that will yield the greatest long-term benefits is a problem that often arises in decision-making under uncertainty. For example, organizations may need to evaluate and select innovation projects with risky returns. Similarly, when allocating resources to research projects, funding agencies are tasked with identifying the most promising proposals based on idiosyncratic criteria. Finally, in participatory budgeting, a local community may need to select a subset of public projects to fund. Regardless of context, agents must estimate the uncertain values of a potentially large number of projects. Developing parsimonious methods to compare these projects, and aggregating agent evaluations so that the overall benefit is maximized, are critical in assembling the best project portfolio. Unlike in standard sorting algorithms, evaluating projects on the basis of uncertain long-term benefits introduces additional complexities. We propose comparison rules based on Quicksort and the Bradley--Terry model, which connects rankings to pairwise "win" probabilities. In our model, each agent determines win probabilities of a pair of projects based on his or her specific evaluation of the projects' long-term benefit. The win probabilities are then appropriately aggregated and used to rank projects. Several of the methods we propose perform better than the two most effective aggregation methods currently available. Additionally, our methods can be combined with sampling techniques to significantly reduce the number of pairwise comparisons. We also discuss how the Bradley--Terry portfolio selection approach can be implemented in practice.
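
To make the ranking idea concrete, here is a minimal sketch that pairs Bradley-Terry win probabilities with a Quicksort-style partition. It rests on several assumptions not stated in the abstract: each agent holds a positive value estimate per project, the per-agent win probability takes the standard Bradley-Terry form $v_i/(v_i+v_j)$, and agents are aggregated by a plain arithmetic mean. The function names are hypothetical, and the paper's own comparison and aggregation rules may differ.

```python
from typing import List, Sequence


def bt_win_prob(value_i: float, value_j: float) -> float:
    """Bradley-Terry probability that project i 'beats' project j."""
    return value_i / (value_i + value_j)


def aggregated_win_prob(agent_values: Sequence[Sequence[float]], i: int, j: int) -> float:
    """Average the per-agent win probabilities (assumed aggregation rule)."""
    probs = [bt_win_prob(values[i], values[j]) for values in agent_values]
    return sum(probs) / len(probs)


def quicksort_rank(items: List[int], agent_values: Sequence[Sequence[float]]) -> List[int]:
    """Rank projects best-first, comparing each one only against the pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    better, worse = [], []
    for k in rest:
        # One aggregated pairwise comparison per non-pivot project.
        if aggregated_win_prob(agent_values, k, pivot) > 0.5:
            better.append(k)
        else:
            worse.append(k)
    return quicksort_rank(better, agent_values) + [pivot] + quicksort_rank(worse, agent_values)


# Two agents, three projects; a portfolio is the top-k of the resulting ranking.
agent_values = [[1.0, 3.0, 2.0], [1.5, 2.5, 2.0]]
ranking = quicksort_rank([0, 1, 2], agent_values)
portfolio = ranking[:2]
```

Quicksort is attractive in this setting because it needs on the order of $n \log n$ pairwise comparisons on average instead of all $n(n-1)/2$ pairs, which matters when each comparison requires polling agents.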

2503.13558 2026-05-12 eess.SP cs.AI cs.LG

Survival Analysis with Machine Learning for Predicting Li-ion Battery Remaining Useful Life

Jingyuan Xue, Xiaozhen Zhao, Dongjing Jiang, Qingchong Jiao, Redouane EL Bouchtaoui, Jianfei Zhang

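AI summary This paper studies machine-learning prediction of the remaining useful life (RUL) of lithium-ion batteries. To address the weaknesses of traditional methods in handling nonlinear degradation patterns and quantifying uncertainty, it proposes a hybrid framework built on survival analysis. The method uses path signatures to convert battery voltage time series into time-to-failure data and predicts failure probabilities with several survival analysis methods, including Cox-based models, DeepHit, and MTLR. Experiments on several public datasets show high time-dependent AUC and concordance index together with a low integrated Brier score, indicating good predictive performance.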

Details
English abstract

Battery degradation significantly impacts the reliability and efficiency of energy storage systems, particularly in electric vehicles and industrial applications. Predicting the remaining useful life (RUL) of lithium-ion batteries is crucial for optimizing maintenance schedules, reducing costs, and improving safety. Traditional RUL prediction methods often struggle with nonlinear degradation patterns and uncertainty quantification. To address these challenges, we propose a hybrid survival analysis framework integrating survival data reconstruction, survival model learning, and survival probability estimation. Our approach transforms battery voltage time series into time-to-failure data using path signatures. Multiple Cox-based survival models and machine-learning-based methods, such as DeepHit and MTLR, are trained to predict battery failure-free probabilities over time. Experiments conducted on the Toyota battery and NASA battery datasets demonstrate the effectiveness of our approach, achieving high time-dependent AUC and concordance index (C-Index) while maintaining a low integrated Brier score. The data and source codes are available to the public at https://github.com/okic-ca/rul
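
As a reference point for the concordance index (C-Index) mentioned above, the following sketch computes Harrell's C-index for right-censored data in plain Python. It is one common definition, shown under the assumption that a higher risk score should correspond to earlier failure; the evaluation code in the linked repository may use a different variant or library.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's C-index: fraction of comparable pairs ordered correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when unit i is observed to fail strictly before
            # time j; censored units (events[i] == 0) cannot be the earlier one.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier failure correctly given higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair of batteries is ordered correctly, and 0.5 corresponds to random ordering.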

2502.09891 2026-05-12 cs.IR cs.AI

ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, Yuchi Ma

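AI summary This paper proposes ArchRAG, an attributed community-based hierarchical retrieval-augmented generation method, to address the shortcomings of existing graph-based RAG methods in retrieval accuracy and computational efficiency. By introducing attributed communities and an LLM-based hierarchical clustering technique, it builds a hierarchical index structure and optimizes online retrieval, so that relevant information is retrieved from graph data more efficiently. Experiments show that ArchRAG outperforms existing methods in both accuracy and token cost.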

Comments Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2026

Details
Journal ref
Proceedings of the AAAI Conference on Artificial Intelligence, 40(19), 15868-15876, 2026
English abstract

Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs) for solving question-answer (QA) tasks. State-of-the-art RAG approaches often use graph data as the external data since it captures rich semantic information and link relationships between entities. However, existing graph-based RAG approaches cannot accurately identify the relevant information from the graph and also consume large numbers of tokens in the online retrieval process. To address these issues, we introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG), by augmenting the question using attributed communities, and also introducing a novel LLM-based hierarchical clustering method. To retrieve the most relevant information from the graph for the question, we build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method. Experimental results demonstrate that ArchRAG outperforms existing methods in both accuracy and token cost.

2501.05614 2026-05-12 cs.CR cs.AI

Watermarking Graph Neural Networks via Explanations for Ownership Protection

Jane Downer, Yingdan Shi, Ziyan Liu, Ren Wang, Binghui Wang

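AI summary This paper studies how to watermark graph neural networks (GNNs) through their explanations in order to protect their intellectual property. The authors propose an explanation-based watermarking method that does not tamper with training data, avoiding the ownership ambiguity and data poisoning vulnerabilities of existing methods. The method makes the GNN's explanations statistically distinct, so that ownership claims must pass statistical verification. It is proven that locating the watermark is NP-hard even with full knowledge of the method, and experiments confirm robustness to fine-tuning and pruning attacks.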

Details
English abstract

Graph Neural Networks (GNNs) are widely deployed in industry, making their intellectual property valuable. However, protecting GNNs from unauthorized use remains a challenge. Watermarking offers a solution by embedding ownership information into models. Existing watermarking methods have two limitations: First, they rarely focus on graph data or GNNs. Second, the de facto backdoor-based method relies on manipulating training data, which can introduce ownership ambiguity through misclassification and vulnerability to data poisoning attacks that can interrupt the backdoor mechanism. Our explanation-based watermarking inherits the strengths of backdoor-based methods (e.g., black-box verification) without data manipulation, eliminating ownership ambiguity and data dependencies. In particular, we watermark GNN explanations such that these explanations are statistically distinct from others, so ownership claims must be verified through statistical significance. We theoretically prove that, even with full knowledge of our method, locating the watermark is NP-hard. Empirically, our method demonstrates robustness to fine-tuning and pruning attacks. By addressing these challenges, our approach significantly advances GNN intellectual property protection.