arXivDaily: arXiv daily academic digest, updated Monday through Friday
2511.16964 2026-05-15 cs.MA cs.AI cs.DC

Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems

Kirill Nagaitsev, Luka Grbcic, Samuel Williams, Costin Iancu

Affiliations: NVIDIA Corporation; Microsoft Corporation

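AI summary: This paper studies how LLM-based multi-agent systems can be used to optimize PyTorch inference performance. By building a logical framework for comparing multi-agent optimization systems, the authors find that exploit-heavy strategies combined with error-fixing agents give the best results, and that the granularity of optimization steps has a marked effect on performance. In experiments on an H100 GPU, the best implementation achieves an average 2.88x speedup over PyTorch Eager (1.85x over torch.compile).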

Abstract

Maximizing performance on available GPU hardware is an ongoing challenge for modern AI inference systems. Traditional approaches include writing custom GPU kernels and using specialized model compilers to tune high-level code for specific GPU targets. Recent work shows that LLM-based multi-agent systems can effectively perform such tuning, often outperforming existing compilers and eliminating the need for manual kernel development. However, the dynamics of multi-agent systems for this task remain unexplored. In this work, we present a logical framework for comparing multi-agent PyTorch optimization systems. Our evaluation shows that exploit-heavy strategies perform best when paired with error-fixing agents, and that performance correlates with the granularity of optimization steps. The best implementation achieves an average 2.88x speedup over PyTorch Eager (1.85x over torch.compile) on an H100 GPU across diverse tasks in KernelBench, a benchmark suite covering a range of machine learning architectures in PyTorch. Code is publicly available at: https://github.com/pike-project/pike

2511.05820 2026-05-15 cs.SE cs.AI

From Ranking to Reasoning: Explainable Web API Recommendation via Semantic Reasoning

Zishuo Xu, Dezhong Yao, Yao Wan

Affiliations: School of Software Engineering; School of Computer Science and Technology; Huazhong University of Science and Technology

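AI summary: With the rapid growth in the number of Web APIs, automated API recommendation has become essential for building mashups efficiently. Existing methods rely on fixed recommendation strategies that cannot adapt to complex requirements, and they lack explainability. The paper proposes WAR-R1, a framework that combines semantic reasoning with variable-size recommendation: a lightweight large language model generates the recommended APIs together with natural-language explanations, and special start and stop tokens let the number of recommendations adapt. Experiments show that WAR-R1 outperforms existing methods in both recommendation accuracy and explanation quality.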

Abstract

The rapid growth of Web APIs has made automated Web API recommendation essential for efficient mashup development. However, existing approaches suffer from two major limitations: 1) they rely on fixed top-N recommendation strategies that cannot adapt to mashup complexity, and 2) they provide little or no explanation for recommended APIs, limiting transparency and user trust. To address these challenges, we propose WAR-R1, an explainable Web API recommendation framework that integrates semantic reasoning with adaptive, variable-cardinality recommendation. Built on a lightweight large language model (LLM), WAR-R1 generates both a set of relevant APIs and a natural-language justification for each recommendation. To support adaptive recommendation size, we introduce special start and stop tokens that allow the model to learn when to begin and terminate API generation. WAR-R1 is trained in two stages: supervised fine-tuning on an annotated mashup-API corpus, followed by reinforcement learning using Group Relative Policy Optimization (GRPO) with low-rank adaptation to jointly optimize recommendation accuracy and reasoning quality. Experiments on the ProgrammableWeb dataset show that WAR-R1 outperforms state-of-the-art baselines by up to 10.89% in recommendation accuracy while consistently producing high-quality, semantically grounded explanations. Extensive ablation studies validate the effectiveness of reinforcement learning, special token design, and integrated reasoning.
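
The group-relative step that gives GRPO its name can be sketched in a few lines. This is a minimal illustration, not WAR-R1's training code: the reward values are hypothetical scalar scores, the function name is made up for this example, and the use of the population standard deviation and the `eps` constant are assumptions. The idea is that several responses are sampled for the same prompt and each reward is normalized against the mean and spread of its own group, so no separate value network is needed.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sampled response's reward against its own group.

    rewards: scalar rewards for several responses to the same prompt.
    eps: small constant (an assumption) that avoids division by zero
         when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is also plausible
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled recommendations for one mashup.
rewards = [0.2, 0.8, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group mean get a positive advantage and are reinforced; those below the mean get a negative one.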

2511.05159 2026-05-15 stat.ML cs.LG

A New Framework for Convex Clustering in Kernel Spaces: Finite Sample Bounds, Consistency and Performance Insights

Shubhayan Pan, Kushal Bose, Debolina Paul, Saptarshi Chakraborty, Swagatam Das

Affiliations: Indian Statistical Institute, Kolkata; Electronics and Communication Sciences Unit, Indian Statistical Institute; Department of Statistics, University of Oxford; Department of Statistics, University of Michigan

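AI summary: This paper proposes a new framework for convex clustering in kernel spaces, aimed at data with linearly non-separable or non-convex structure. The method maps the data into a reproducing kernel Hilbert space (RKHS) and performs convex clustering in the transformed space, which improves handling of complex data distributions and yields an embedding in a finite-dimensional space. The authors provide theoretical guarantees, including algorithmic convergence and finite-sample error bounds, and report superior performance in experiments on synthetic and real datasets.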

Abstract

Convex clustering is a well-regarded clustering method, resembling the similar centroid-based approach of Lloyd's $k$-means, without requiring a predefined cluster count. It starts with each data point as its centroid and iteratively merges them. Despite its advantages, this method can fail when dealing with data exhibiting linearly non-separable or non-convex structures. To mitigate the limitations, we propose a kernelized extension of the convex clustering method. This approach projects the data points into a Reproducing Kernel Hilbert Space (RKHS) using a feature map, enabling convex clustering in this transformed space. This kernelization not only allows for better handling of complex data distributions but also produces an embedding in a finite-dimensional vector space. We provide a comprehensive theoretical underpinning for our kernelized approach, proving algorithmic convergence and establishing finite sample bounds for our estimates. The effectiveness of our method is demonstrated through extensive experiments on both synthetic and real-world datasets, showing superior performance compared to state-of-the-art clustering techniques. This work marks a significant advancement in the field, offering an effective solution for clustering in non-linear and non-convex data scenarios.
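
For orientation, the standard convex clustering objective and one natural kernelized analogue can be written as follows. This is a sketch under assumptions: the abstract does not state the paper's exact formulation, so the weights $w_{ij}$, the penalty parameter $\lambda$, and the choice of norm in the fusion term are illustrative, and $\phi$ denotes the feature map of a kernel $k$ with RKHS $\mathcal{H}$.

```latex
% Standard convex clustering: one centroid u_i per data point x_i,
% with weights w_{ij} >= 0 and penalty parameter lambda > 0.
\min_{u_1,\dots,u_n}\; \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  + \lambda \sum_{i<j} w_{ij}\, \lVert u_i - u_j \rVert_2

% A kernelized analogue: replace x_i by \phi(x_i) and take norms in the RKHS.
\min_{u_1,\dots,u_n \in \mathcal{H}}\; \frac{1}{2}\sum_{i=1}^{n}
  \lVert \phi(x_i) - u_i \rVert_{\mathcal{H}}^2
  + \lambda \sum_{i<j} w_{ij}\, \lVert u_i - u_j \rVert_{\mathcal{H}}

% Kernel trick: distances in the RKHS need only kernel evaluations.
\lVert \phi(x_i) - \phi(x_j) \rVert_{\mathcal{H}}^2
  = k(x_i, x_i) - 2\,k(x_i, x_j) + k(x_j, x_j)
```

As $\lambda$ grows, the fusion term pulls centroids together until they coincide, which is how the number of clusters emerges without being fixed in advance.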

2510.25240 2026-05-15 stat.ML cs.LG

Generative Bayesian Optimization: Generative Models as Acquisition Functions

Rafael Oliveira, Daniel M. Steinberg, Edwin V. Bonilla

Affiliations: CSIRO's Data61

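AI summary: This paper presents a general strategy for using generative models in batch Bayesian optimization (BO), where the generative model acts as a sampler of candidate solutions. This enables large-batch optimization, optimization over non-continuous design spaces, and high-dimensional and combinatorial design. Inspired by the success of direct preference optimization (DPO), the authors train the generative model on simple utility values computed from observations so that its density is proportional to the expected utility, i.e., the BO acquisition function value, which avoids building the surrogate models used in earlier methods. Theoretical analysis shows that the sequence of distributions formed during BO approximates an optimal target under certain conditions, and the method is evaluated in experiments on high-dimensional, large-batch optimization problems.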

Comments Published at ICLR 2026. Compared with the proceedings version on OpenReview, this version includes a minor revision to Section 3

Journal ref The Fourteenth International Conference on Learning Representations (ICLR 2026)

Abstract

We present a general strategy for turning generative models into candidate solution samplers for batch Bayesian optimization (BO). The use of generative models for BO enables large batch scaling as generative sampling, optimization of non-continuous design spaces, and high-dimensional and combinatorial design. Inspired by the success of direct preference optimization (DPO), we show that one can train a generative model with noisy, simple utility values directly computed from observations to then form proposal distributions whose densities are proportional to the expected utility, i.e., BO's acquisition function values. Furthermore, this approach is generalizable beyond preference-based feedback to general types of reward signals and loss functions. This perspective avoids the construction of surrogate (regression or classification) models, common in previous methods that have used generative models for black-box optimization. Theoretically, we show that the generative models within the BO process follow a sequence of distributions which asymptotically approximate an optimal target under certain conditions. We also evaluate the performance through experiments on challenging optimization problems involving large batches in high dimensions.

2510.19973 2026-05-15 cs.NI cs.AI

A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks

Hatim Chergui, Farhad Rezazadeh, Merouane Debbah, Christos Verikoukis

Affiliations: i2CAT Foundation; Hostelworld Group; Technical University of Catalonia (UPC); Khalifa University of Science and Technology; ISI/ATH University of Patras

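AI summary: This tutorial surveys the cognitive biases that commonly arise in agentic AI-driven 6G autonomous networks, covering their taxonomy, mathematical formulation, and emergence in telecom systems, and proposes tailored mitigation strategies. Two 6G network management use cases show how a locally hosted large language model and an improved memory mechanism reduce anchoring bias and temporal and confirmation biases, improving resource allocation with substantial energy savings and lower latency.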

Comments 26 pages, 18 figures, 4 tables, link to source code available. Accepted at IEEE OJCOMS

Abstract

The path to higher network autonomy in 6G lies beyond the mere optimization of key performance indicators (KPIs), requiring systems that perceive and reason over the network environment as it is. This can be achieved through agentic AI, where large language model (LLM)-powered agents utilize multimodal telemetry, memory, and cross-domain negotiation to achieve multi-objective goals. However, deploying such agents introduces cognitive biases inherited from human design, which can severely distort reasoning and actuation. This paper provides a comprehensive tutorial on well-known cognitive biases, detailing their taxonomy, mathematical formulation, emergence in telecom systems, and tailored mitigation strategies. We validate these concepts through two distinct use-cases in 6G management. First, we tackle anchoring bias in inter-slice resource negotiation. To overcome the prohibitive execution delays of cloud-based LLMs, this use-case deploys a locally hosted 1B-parameter model on an RTX A4000 GPU, successfully achieving sub-second inference latencies compatible with near-real-time operations. By replacing fixed heuristic anchors with a Truncated Weibull randomized anchor strategy, the agents dismantle rigid biases, intelligently consume SLA slack, and dynamically double the system-wide energy savings (peaking at 25%) without violating strict latency limits. Second, we mitigate temporal and confirmation biases in RAN-Edge cross-domain negotiation by designing an unbiased collective memory. By integrating semantic/temporal decay and an inflection bonus that actively highlights past negotiation failures, agents are prevented from over-relying on recent data or repeating past mistakes. Grounding decisions in this richer, debiased historical context yields highly robust agreements, achieving a $\times 5$ latency reduction and roughly 40% higher energy savings compared to memoryless baselines.
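
To make the randomized-anchor idea concrete, here is a minimal sketch of drawing a value from a Weibull distribution truncated to an interval, using inverse-CDF sampling. It is an illustration only: the function names, the shape and scale values, and the interval are hypothetical, and the abstract does not say how the paper parameterizes the distribution or maps samples to negotiation anchors.

```python
import math
import random


def weibull_cdf(x, shape, scale):
    """CDF of a Weibull(shape, scale) distribution for x >= 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def sample_truncated_weibull(shape, scale, low, high, rng=random):
    """Draw one sample from Weibull(shape, scale) restricted to [low, high].

    Inverse-CDF sampling: pick u uniformly between F(low) and F(high),
    then invert the Weibull CDF.
    """
    if not (0.0 <= low < high):
        raise ValueError("require 0 <= low < high")
    f_low = weibull_cdf(low, shape, scale)
    f_high = weibull_cdf(high, shape, scale)
    # Keep u strictly below 1 so the logarithm stays finite.
    u = min(rng.uniform(f_low, f_high), 1.0 - 1e-12)
    x = scale * (-math.log1p(-u)) ** (1.0 / shape)
    return min(max(x, low), high)  # guard against floating-point drift


# Hypothetical use: a random opening anchor as a fraction of a resource budget.
rng = random.Random(0)
anchors = [sample_truncated_weibull(1.5, 0.3, 0.1, 0.6, rng) for _ in range(5)]
print(anchors)
```

Because each negotiation starts from a different anchor inside the allowed range, no single fixed value can dominate the agents' subsequent offers.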

2510.15141 2026-05-15 stat.ML cs.LG stat.AP

Manifold Dimension Estimation via Local Graph Structure

Zelong Bi, Pierre Lafaye de Micheaux

Affiliations: School of Mathematics and Statistics, University of New South Wales

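AI summary: This paper proposes a manifold dimension estimation method based on local graph structure, capturing the manifold's local structure through regression on local PCA coordinates. It introduces two representative estimators, quadratic embedding (QE) and total least squares (TLS); experiments show that they are competitive on synthetic and real-world data and in many cases outperform existing state-of-the-art methods.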

Abstract

Most existing manifold dimension estimators rely on the assumption that the underlying manifold is locally flat within the neighborhoods under consideration. More recently, curvature-adjusted principal component analysis (CA-PCA) has emerged as a powerful alternative by explicitly accounting for the manifold's curvature. Motivated by these ideas, we propose a manifold dimension estimation framework that captures the local graph structure of the manifold through regression on local PCA coordinates. Within this framework, we introduce two representative estimators: quadratic embedding (QE) and total least squares (TLS). Experiments on both synthetic and real-world datasets demonstrate that these methods perform competitively with, and often outperform, state-of-the-art approaches.

2510.13583 2026-05-15 stat.ML cs.LG

On the Identifiability of Causal Graphs with the Invariance Principle

Francesco Montagna

Affiliations: Institute of Science and Technology Austria; Chan Zuckerberg Initiative

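AI summary: This paper studies the identifiability of causal graphs from i.i.d. observational data. It shows that, given the distribution induced by a structural causal model plus data from a small number of environments (in the best case only two) with sufficiently different noise statistics, the causal graph is uniquely identifiable. This is the first result to guarantee recovery of the entire causal graph with a constant number of environments and arbitrary nonlinear mechanisms; the only requirement is Gaussian noise, and the author discusses possible ways to relax it. The work also extends the duality between independent component analysis (ICA) and causal discovery, showing that causal discovery can match what nonlinear ICA achieves while using much less auxiliary information.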

Comments Published as ICLR 2026 conference paper

Abstract

Causal discovery from i.i.d. observational data is known to be generally ill-posed. We demonstrate that if we have access to the distribution induced by a structural causal model, and additional data from (in the best case) only two environments that sufficiently differ in the noise statistics, the unique causal graph is identifiable. Notably, this is the first result in the literature that guarantees the entire causal graph recovery with a constant number of environments and arbitrary nonlinear mechanisms. Our only constraint is the Gaussianity of the noise terms; however, we propose potential ways to relax this requirement. Of interest on its own, we expand on the well-known duality between independent component analysis (ICA) and causal discovery; recent advancements have shown that nonlinear ICA can be solved from multiple environments, at least as many as the number of sources: we show that the same can be achieved for causal discovery while having access to much less auxiliary information.

2508.14950 2026-05-15 eess.IV cs.LG

Potential and challenges of generative adversarial networks for super-resolution in 4D Flow MRI

Oliver Welin Odeback, Arivazhagan Geetha Balasubramanian, Jonas Schollenberger, Edward Ferdian, Alistair A. Young, C. Alberto Figueroa, Susanne Schnell, Outi Tammisola, Ricardo Vinuesa, Tobias Granberg, Alexander Fyrdahl, David Marlevi

Affiliations: Surgery, Karolinska Institutet, Karolinska Universitetssjukhuset Solna (L1:00), 171 76 Stockholm, Sweden; FLOW, Engineering Mechanics, KTH Royal Institute of Technology, Osquars Backe 18, 100 44 Stockholm, Sweden; Department of Radiology Biomedical Imaging, University of California San Francisco, 505 Parnassus Avenue, San Francisco, CA 94143, USA; Faculty of Informatics, Telkom University, Jl. Telekomunikasi No. 1, Terusan Buahbatu, Bandung 40257, West Java, Indonesia; Auckland Bioengineering Institute, University of Auckland, Bioengineering House, 70 Symonds St, Grafton 1010, New Zealand; School of Biomedical Engineering & Imaging Sciences, King's College London, 1 Lambeth Palace Rd, South Bank, London SE1 7EU, UK; Department of Biomedical Engineering, University of Michigan, 1107 Carl A. Gerstacker Bldg 2200 Bonisteel Blvd., Ann Arbor, MI 48109-2099, USA; Department of Physics, University of Greifswald, Felix-Hausdorff-Str. 6, 174 89 Greifswald, Germany; Department of Aerospace Engineering, University of Michigan, 1320 Beal Avenue, Ann Arbor, MI 48109-2140, USA; Department of Neuroradiology, Karolinska University Hospital, Hälsovägen 13, O42, 141 86 Stockholm, Sweden; Department of Clinical Physiology, Karolinska University Hospital, Eugeniavägen 3, A8:01, 171 64 Solna, Sweden; Institute for Medical Engineering Science, Massachusetts Institute of Technology, 45 Carleton St, Cambridge, MA 02142, USA

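AI summary: This paper investigates the potential and challenges of generative adversarial networks (GANs) for super-resolution in 4D Flow MRI. To address the low resolution and noise that affect near-wall velocity measurements, the authors implement a dedicated GAN architecture and evaluate it with three adversarial loss functions. The Wasserstein GAN performs best in near-wall velocity recovery and training stability, indicating the promise of GANs for improving 4D Flow MRI image quality.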

Comments 26 pages, 10 figures

Journal ref Computers in Biology and Medicine 211 (2026) 111745

Abstract

4D Flow Magnetic Resonance Imaging (4D Flow MRI) enables non-invasive quantification of blood flow and hemodynamic parameters. However, its clinical application is limited by low spatial resolution and noise, particularly affecting near-wall velocity measurements. Machine learning-based super-resolution has shown promise in addressing these limitations, but challenges remain, not least in recovering near-wall velocities. Generative adversarial networks (GANs) offer a compelling solution, having demonstrated strong capabilities in restoring sharp boundaries in non-medical super-resolution tasks. Yet, their application in 4D Flow MRI remains unexplored, with implementation challenged by known issues such as training instability and non-convergence. In this study, we investigate GAN-based super-resolution in 4D Flow MRI. Training and validation were conducted using patient-specific cerebrovascular in-silico models, converted into synthetic images via an MR-true reconstruction pipeline. A dedicated GAN architecture was implemented and evaluated across three adversarial loss functions: Vanilla, Relativistic, and Wasserstein. Our results demonstrate that the proposed GAN improved near-wall velocity recovery compared to a non-adversarial reference (vNRMSE: 6.9% vs. 9.6%); however, that implementation specifics are critical for stable network training. While Vanilla and Relativistic GANs proved unstable compared to generator-only training (vNRMSE: 8.1% and 7.8% vs. 7.2%), a Wasserstein GAN demonstrated optimal stability and incremental improvement (vNRMSE: 6.9% vs. 7.2%). The Wasserstein GAN further outperformed the generator-only baseline at low SNR (vNRMSE: 8.7% vs. 10.7%). These findings highlight the potential of GAN-based super-resolution in enhancing 4D Flow MRI, particularly in challenging cerebrovascular regions, while emphasizing the need for careful selection of adversarial strategies.

2508.07876 2026-05-15 stat.ML cs.LG math.DS math.ST stat.TH

Stochastic dynamics learning with state-space systems

Juan-Pablo Ortega, Florian Rossmannek

Affiliations: Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University

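AI summary: This paper studies state-space systems for learning stochastic dynamics, with the aim of strengthening the theoretical foundations of reservoir computing (RC). Through a unified treatment of fading memory and the echo state property (ESP) in deterministic and stochastic settings, the authors show that fading memory and solution stability hold generically even without the ESP, which supports the broad applicability of RC models. In the stochastic case, they introduce a new perspective based on attractor dynamics on the space of probability distributions, extending earlier work on non-autonomous dynamical systems and offering further insight into causality, stability, and memory in RC models.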

Journal ref Mathematical Models and Methods in Applied Sciences, 2026

Abstract

This work advances the theoretical foundations of reservoir computing (RC) by providing a unified treatment of fading memory and the echo state property (ESP) in both deterministic and stochastic settings. We investigate state-space systems, a central model class in time series learning, and establish that fading memory and solution stability hold generically -- even in the absence of the ESP -- offering a robust explanation for the empirical success of RC models without strict contractivity conditions. In the stochastic case, we critically assess stochastic echo states, proposing a novel distributional perspective rooted in attractor dynamics on the space of probability distributions, which leads to a rich and coherent theory. Our results extend and generalize previous work on non-autonomous dynamical systems, offering new insights into causality, stability, and memory in RC models. This lays the groundwork for reliable generative modeling of temporal data in both deterministic and stochastic regimes.

2508.03941 2026-05-15 cs.IR cs.LG

Measuring the stability and plasticity of recommender systems

Maria João Lavoura, Robert Jungnickel, João Vinagre

Affiliations: Independent researcher; Joint Research Centre - European Commission

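AI summary: This paper examines the stability and plasticity of recommender systems over long-term operation and proposes an offline evaluation method for analyzing how recommendation models behave when retrained. The method profiles algorithms by their ability to retain past patterns (stability) and to adapt to new changes (plasticity), giving a long-term performance analysis framework that is agnostic to datasets, algorithms, and metrics. Experimental results suggest that different types of recommendation algorithms differ in stability and plasticity, with a possible trade-off between the two.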

Comments Final version published in the proceedings of ACM UMAP 2026: https://doi.org/10.1145/3774935.3812707

Abstract

The typical offline protocol to evaluate recommendation algorithms is to collect a dataset of user-item interactions and then use a part of this dataset to train a model, and the remaining data to measure how closely the model recommendations match the observed user interactions. This protocol is straightforward, useful and practical, but it only provides snapshot performance. We know, however, that online systems evolve over time. In general, it is a good idea that models are frequently retrained with recent data. But if this is the case, to what extent can we trust previous evaluations? How will a model perform when a different pattern (re)emerges? In this paper we propose a methodology to study how recommendation models behave when they are retrained. The idea is to profile algorithms according to their ability to, on the one hand, retain past patterns - stability - and, on the other hand, (quickly) adapt to changes - plasticity. We devise an offline evaluation protocol that provides detail on the long-term behavior of models, and that is agnostic to datasets, algorithms and metrics. To illustrate the potential of this framework, we present preliminary results of three different types of algorithms on the GoodReads dataset that suggest different stability and plasticity profiles depending on the algorithmic technique, and a possible trade-off between stability and plasticity. We further discuss the potential and limitations of the proposal and advance some possible improvements.

2507.13941 2026-05-15 q-bio.NC cs.AI cs.CV eess.IV

Shared representations in brains and models reveal a two-route cortical organization during scene perception

Pablo Marcos-Manchón, Lluís Fuentemilla

Affiliations: Department of Cognition, Development and Education Psychology, Faculty of Psychology, University of Barcelona; Institute of Neurosciences, University of Barcelona; Bellvitge Institute for Biomedical Research

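AI summary: Using 7T fMRI data, this study examines how information is organized and routed in the human brain during scene perception. Representational similarity analysis is used to compare representational structure shared across individuals with hierarchical features from vision and language neural networks. The analysis reveals two separate processing routes: one for scene layout and environmental context, and another specialized for animate content. The finding refines classical models of visual processing, characterizing scene perception as a distributed cortical network with separable representational routes.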

Comments For associated code, see https://github.com/memory-formation/convergent-transformations

Abstract

The brain transforms visual inputs into high-dimensional cortical representations that support diverse cognitive and behavioral goals. Characterizing how this information is organized and routed across the human brain is essential for understanding how we process complex visual scenes. Here, we applied representational similarity analysis to 7T fMRI data collected during natural scene viewing. We quantified representational geometry shared across individuals and compared it to hierarchical features from vision and language neural networks. This analysis revealed two distinct processing routes: a ventromedial pathway specialized for scene layout and environmental context, and a lateral occipitotemporal pathway selective for animate content. Vision models aligned with shared structure in both routes, whereas language models corresponded primarily with the lateral pathway. These findings refine classical visual-stream models by characterizing scene perception as a distributed cortical network with separable representational routes for context and animate content.

2507.05193 2026-05-15 eess.IV cs.CV

RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

Songxiao Yang, Haolin Wang, Yao Fu, Ye Tian, Tamotsu Kamishima, Masayuki Ikebe, Yafei Ou, Masatoshi Okutomi

Affiliations: Institute of Science Tokyo; Hokkaido University; The University of Tokyo

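AI summary: This work presents RAM-W600, a multi-task wrist radiograph dataset for computer-aided diagnosis and disease monitoring of rheumatoid arthritis (RA). The dataset contains 1048 conventional wrist radiographs from 388 patients at six medical centers, with pixel-level wrist bone instance segmentation annotations and SvdH bone erosion scores, and is the first public resource for wrist bone instance segmentation. It can support RA research such as joint space narrowing quantification, bone erosion detection, and bone deformity evaluation, may apply to tasks such as carpal bone fracture localization, and is intended to lower the barrier to wrist RA research and advance computer-aided diagnosis.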

Comments Published in NeurIPS 2025

Abstract

Rheumatoid arthritis (RA) is a common autoimmune disease that has been the focus of research in computer-aided diagnosis (CAD) and disease monitoring. In clinical settings, conventional radiography (CR) is widely used for the screening and evaluation of RA due to its low cost and accessibility. The wrist is a critical region for the diagnosis of RA. However, CAD research in this area remains limited, primarily due to the challenges in acquiring high-quality instance-level annotations. (i) The wrist comprises numerous small bones with narrow joint spaces, complex structures, and frequent overlaps, requiring detailed anatomical knowledge for accurate annotation. (ii) Disease progression in RA often leads to osteophyte, bone erosion (BE), and even bony ankylosis, which alter bone morphology and increase annotation difficulty, necessitating expertise in rheumatology. This work presents a multi-task dataset for wrist bone in CR, including two tasks: (i) wrist bone instance segmentation and (ii) Sharp/van der Heijde (SvdH) BE scoring, which is the first public resource for wrist bone instance segmentation. This dataset comprises 1048 wrist conventional radiographs of 388 patients from six medical centers, with pixel-level instance segmentation annotations for 618 images and SvdH BE scores for 800 images. This dataset can potentially support a wide range of research tasks related to RA, including joint space narrowing (JSN) progression quantification, BE detection, bone deformity evaluation, and osteophyte detection. It may also be applied to other wrist-related tasks, such as carpal bone fracture localization. We hope this dataset will significantly lower the barrier to research on wrist RA and accelerate progress in CAD research within the RA-related domain.

2506.20425 2026-05-15 stat.ML cs.LG stat.CO stat.ME

Scalable Subset Selection in Linear Mixed Models

Ryan Thompson, Matt P. Wand, Joanna J. J. Wang

Affiliations: School of Mathematical and Physical Sciences, University of Technology Sydney

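AI summary: This paper addresses scalable subset selection in linear mixed models, which contain both fixed and random effects. To overcome the poor computational scaling of existing methods with many predictors, the authors propose a new $\ell_0$ regularized subset selection method, pairing a coordinate descent algorithm that has a convergence guarantee with a local search algorithm that helps traverse the nonconvex optimization surface. The method comes with a finite-sample bound on the Kullback-Leibler divergence and performs well in experiments on synthetic and real data.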

Abstract

Linear mixed models (LMMs), which incorporate fixed and random effects, are key tools for analyzing heterogeneous data, such as in personalized medicine. Nowadays, this type of data is increasingly wide, sometimes containing thousands of candidate predictors, necessitating sparsity for prediction and interpretation. However, existing sparse learning methods for LMMs do not scale well beyond tens or hundreds of predictors, leaving a large gap compared with sparse methods for linear models, which ignore random effects. This paper closes the gap with a new $\ell_0$ regularized method for LMM subset selection that can run on datasets containing thousands of predictors in seconds to minutes. On the computational front, we develop a coordinate descent algorithm as our main workhorse and provide a guarantee of its convergence. We also develop a local search algorithm to help traverse the nonconvex optimization surface. Both algorithms readily extend to subset selection in generalized LMMs via a penalized quasi-likelihood approximation. On the statistical front, we provide a finite-sample bound on the Kullback-Leibler divergence of the new method. We then demonstrate its excellent performance in experiments involving synthetic and real datasets.

2505.16714 2026-05-15 quant-ph cs.LG

Experimental robustness benchmarking of quantum neural networks on a superconducting quantum processor

Hai-Feng Zhang, Zhao-Yun Chen, Peng Wang, Liang-Liang Guo, Tian-Le Wang, Xiao-Yan Yang, Ren-Ze Zhao, Ze-An Zhao, Sheng Zhang, Lei Du, Hao-Ran Tao, Zhi-Long Jia, Wei-Cheng Kong, Huan-Yu Liu, Athanasios V. Vasilakos, Yang Yang, Yu-Chun Wu, Ji Guan, Peng Duan, Guo-Ping Guo

Affiliations: Laboratory of Quantum Information, School of Physics, University of Science; CAS Center For Excellence in Quantum Information; Quantum Physics, University of Science; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, Anhui, 230088, China; Department of ICT; Center for AI Research, University of Agder (UiA), Jon Lilletuns vei 9, 4879 Grimstad, Norway; Anhui University, Hefei, Anhui, 230039, China; Key Laboratory of System Software (Chinese Academy of Sciences); State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

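AI summary: This study reports the first systematic experimental robustness evaluation of 20-qubit quantum neural network classifiers on a superconducting quantum processor, exposing the vulnerability of quantum machine learning models to adversarial attacks. It proposes an efficient adversarial attack algorithm for quantitatively assessing quantum neural network robustness and verifies that adversarial training significantly improves robustness by regularizing input gradients. The experiments also indicate that quantum neural networks are more adversarially robust than classical neural networks, an advantage attributed to inherent quantum noise, and that the empirical results agree closely with the theoretical lower bound, confirming the effectiveness of the attack and the tightness of the robustness bounds.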

Comments There are 8 pages with 5 figures in the main text

Journal ref SCIENCE CHINA Physics, Mechanics & Astronomy Volume 69, Issue 6: 260315 (2026)

Abstract

Quantum machine learning (QML) models, like their classical counterparts, are vulnerable to adversarial attacks, hindering their secure deployment. Here, we report the first systematic experimental robustness benchmark for 20-qubit quantum neural network (QNN) classifiers executed on a superconducting processor. Our benchmarking framework features an efficient adversarial attack algorithm designed for QNNs, enabling quantitative characterization of adversarial robustness and robustness bounds. From our analysis, we verify that adversarial training reduces sensitivity to targeted perturbations by regularizing input gradients, significantly enhancing QNN's robustness. Additionally, our analysis reveals that QNNs exhibit superior adversarial robustness compared to classical neural networks, an advantage attributed to inherent quantum noise. Furthermore, the empirical upper bound extracted from our attack experiments shows a minimal deviation ($3 \times 10^{-3}$) from the theoretical lower bound, providing strong experimental confirmation of the attack's effectiveness and the tightness of fidelity-based robustness bounds. This work establishes a critical experimental framework for assessing and improving quantum adversarial robustness, paving the way for secure and reliable QML applications.

2505.09552 2026-05-15 stat.ME cs.LG stat.ML

Scalable Krylov Subspace Methods for Generalized Mixed-Effects Models with Crossed Random Effects

Pascal Kündig, Fabio Sigrist

Affiliations: Lucerne University of Applied Sciences and Arts; Seminar for Statistics, ETH Zurich; University of Basel

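AI summary: This paper targets the computational bottlenecks of generalized mixed-effects models with crossed random effects and proposes Krylov subspace-based methods that improve computational efficiency for high-dimensional data. Through theoretical analysis and experiments, it establishes convergence and accuracy results for the preconditioned stochastic Lanczos quadrature and conjugate gradient methods and develops scalable methods for computing predictive variances. In the experiments, the new methods are substantially faster and numerically more stable than traditional Cholesky-based computations.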

Abstract

Mixed-effects models are widely used to model data with hierarchical grouping structures and high-cardinality categorical predictor variables. However, for high-dimensional crossed random effects, current standard computations relying on Cholesky decompositions can become prohibitively slow. In this work, we present Krylov subspace-based methods that address existing computational bottlenecks, and we analyze them both theoretically and empirically. In particular, we derive new results on the convergence and accuracy of the preconditioned stochastic Lanczos quadrature and conjugate gradient methods for mixed-effects models, and we develop scalable methods for calculating predictive variances. In experiments with simulated and real-world data, the proposed methods yield speedups by factors of up to about 10,000 and are numerically more stable than Cholesky-based computations.
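
As background for the Krylov subspace idea, the following is a minimal sketch of the plain (unpreconditioned) conjugate gradient method for a symmetric positive definite system. It is a textbook illustration, not the authors' implementation: the paper's preconditioners and its stochastic Lanczos quadrature are not reproduced, and the 2x2 matrix is a toy example. The point it shows is that the solver touches the matrix only through matrix-vector products, so no Cholesky factor is ever formed.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    matvec: function returning A @ v for a vector v (list of floats).
    b: right-hand side as a list of floats.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with the initial guess x = 0
    p = list(r)  # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # residual norm small enough
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy symmetric positive definite system; only products A @ v are needed.
A = [[4.0, 1.0], [1.0, 3.0]]


def toy_matvec(v):
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


solution = conjugate_gradient(toy_matvec, [1.0, 2.0])
print(solution)  # close to [1/11, 7/11]
```

For large sparse systems, each iteration costs one matrix-vector product, which is the source of the scalability advantage over factorization-based solvers.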

2505.09246 2026-05-15 cs.IR cs.AI cs.CL

Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

Derian Boer, Stephen Roth, Stefan Kramer

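Affiliations * Institute of Computer Science; Johannes Gutenberg University Mainz

AI summary This paper proposes Autofocus-Retriever (AF-Retriever), a multi-hop question-answering framework built on semi-structured knowledge bases that combines structured and unstructured information. It uses exchangeable large language models to extract entity attributes and relational constraints, together with vector similarity search and an incremental scope expansion strategy, and it outperforms existing methods in zero- and one-shot settings on several benchmarks. Its core contribution is a pipeline of four constraint-driven retrieval steps followed by four supplementary ranking steps, which improves the accuracy and robustness of answer retrieval.

Journal ref Transactions on Machine Learning Research 2026

English abstract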

In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet, most rely on either. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract entity attributes and relational constraints for both parsing and reranking the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that prepares for the reranking on a configurable amount of suitable candidates that fulfill the given constraints the most, and (4) a hybrid retrieval strategy that reduces error susceptibility. In summary, while constantly adjusting the focus like an optical autofocus, AF-Retriever delivers a configurable amount of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked through four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever .

2504.11703 2026-05-15 cs.CR cs.AI

Progent: Securing AI Agents with Privilege Control

Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, Dawn Song

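Affiliations * UC Berkeley; UC Santa Barbara; National University of Singapore

AI summary AI agents interact with external environments through tool calls, which exposes them to attacks such as indirect prompt injection that can trigger unauthorized actions. This paper proposes Progent, a framework that secures AI agents through privilege control. Progent expresses privileges as symbolic security policies over tool names and arguments and checks every tool call through a deterministic procedure, enforcing the principle of least privilege. An LLM automatically generates and dynamically updates the policy, and an SMT solver guarantees that updates are monotonic, which prevents privilege escalation while preserving utility. Experiments show that it significantly reduces attack success rates on several benchmarks.

English abstract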

AI agents interact with external environments through tool calls, exposing them to attacks like indirect prompt injection that can trigger unauthorized actions. Securing these agents is challenging: they behave autonomously and probabilistically, security requirements evolve depending on the user's task and execution state, and there is an inherent tradeoff between security and utility. In this work, we introduce Progent, a novel framework that secures AI agents via privilege control. Progent represents privilege as a security policy consisting of symbolic rules over tool names and arguments. These rules specify which tool calls are allowed for task completion and which unnecessary ones are blocked for security. Every tool call is checked against such a policy through a deterministic procedure, enforcing the principle of least privilege. To handle diverse user tasks and evolving execution contexts, an LLM automatically generates the initial policy from the user's task and updates it during execution as new information arrives. Each proposed update is determined by an SMT solver to be either a narrowing (applied automatically) or an expansion (requiring explicit approval), ensuring that the agent's effective action space can only shrink without approval (monotonic confinement). This deterministic update mechanism preserves utility and prevents silent privilege escalation, even when adversarial inputs are present. Our evaluation on popular benchmarks (i.e., AgentDojo and ASB) shows that Progent significantly reduces attack success rates while maintaining high utility. We further validate Progent's practicality by showcasing its effectiveness in real-world agent frameworks such as LangChain and OpenAI Agents SDK.
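
The idea of checking each tool call against rules over tool names and arguments can be sketched as follows. The abstract does not show Progent's policy language, so the `Policy` shape, the tool names, and the example rules are hypothetical; the update check also swaps in a plain subset test over finite sets of tool names where the paper uses an SMT solver.

```python
from typing import Any, Callable, Dict, FrozenSet

# Hypothetical policy shape: each allowed tool maps to a predicate over its arguments.
ArgCheck = Callable[[Dict[str, Any]], bool]
Policy = Dict[str, ArgCheck]


def is_allowed(policy: Policy, tool: str, args: Dict[str, Any]) -> bool:
    """Deterministic check: deny by default, allow only if the tool's rule accepts the args."""
    check = policy.get(tool)
    return check is not None and check(args)


def classify_update(old_allowed: FrozenSet[str], new_allowed: FrozenSet[str]) -> str:
    """Stand-in for the solver check: a subset means the action space can only shrink."""
    return "narrowing" if new_allowed <= old_allowed else "expansion"


# Illustrative rules only (not from the paper).
policy: Policy = {
    "read_file": lambda a: str(a.get("path", "")).startswith("/workspace/"),
    "send_email": lambda a: a.get("to") in {"alice@example.com"},
}
```

Under these assumptions, a narrowing could be applied automatically while an expansion would be held for explicit approval, mirroring the distinction drawn in the abstract.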

2504.01571 2026-05-15 cs.GR cs.AI cs.CV cs.LG

Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation

Aleksander Plocharski, Jan Swidzinski, Przemyslaw Musialski

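Affiliations * Warsaw University of Technology; Akces NCBR; Imperial College London; New Jersey Institute of Technology

AI summary This paper proposes Pro-DG (Procedural Diffusion Guidance), a method for generating architectural facades that uses hierarchical procedural rules to produce control maps within the stable diffusion framework, yielding photo-realistic facade images. Starting from a single input image and its segmentation, it applies an inverse procedural module to identify the facade's hierarchical layout and, using structural features, designs a new ControlNet pipeline that generates facade images guided by procedural transformations. The method allows precise control over local appearance together with large-scale structural edits, and experiments show that it outperforms existing methods in preserving architectural style and enabling controllable edits.

Comments 17 pages, 15 figures, Computer Graphics Forum 2026 Journal Paper

English abstract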

We use hierarchical procedural rules for the generation of control maps within the stable diffusion framework to produce photo-realistic architectural facade images. Starting from a single input image and its segmentation, we apply an inverse procedural module to identify the facade's hierarchical layout. Leveraging this hierarchy and structural features, we introduce a novel ControlNet pipeline that generates new facade imagery guided by procedural transformations. Our method enables various structural edits, including floor duplication and window rearrangement, by integrating hierarchical alignment directly into control maps. This precisely guides the diffusion-based generative process, ensuring local appearance fidelity alongside extensive structural modifications. Comprehensive evaluations, including comparisons with inpainting-based approaches and synthetic benchmarks, confirm our approach's superior capability in preserving architectural identity and achieving accurate, controllable edits. Quantitative results and user feedback validate our method's effectiveness.

2501.18756 2026-05-15 stat.ML cs.LG math.OC

A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization

Nuojin Cheng, Leonard Papenmeier, Stephen Becker, Luigi Nardi

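Affiliations * Department of Applied Mathematics, University of Colorado Boulder; Department of Computer Science, Lund University

AI summary This paper proposes a unified theoretical framework, Variational Entropy Search, that reveals a deep connection between Expected Improvement (EI) and information-theoretic acquisition functions, challenging the conventional view that they are fundamentally different. By interpreting EI as a variational approximation of Max-value Entropy Search (MES), the authors propose a new acquisition function, VES-Gamma, which performs well on low- and high-dimensional synthetic and real-world benchmarks and in many cases outperforms EI and MES.

Journal ref Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:10106-10120, 2025

English abstract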

Bayesian optimization is a widely used method for optimizing expensive black-box functions, with Expected Improvement being one of the most commonly used acquisition functions. In contrast, information-theoretic acquisition functions aim to reduce uncertainty about the function's optimum and are often considered fundamentally distinct from EI. In this work, we challenge this prevailing perspective by introducing a unified theoretical framework, Variational Entropy Search, which reveals that EI and information-theoretic acquisition functions are more closely related than previously recognized. We demonstrate that EI can be interpreted as a variational inference approximation of the popular information-theoretic acquisition function, named Max-value Entropy Search. Building on this insight, we propose VES-Gamma, a novel acquisition function that balances the strengths of EI and MES. Extensive empirical evaluations across both low- and high-dimensional synthetic and real-world benchmarks demonstrate that VES-Gamma is competitive with state-of-the-art acquisition functions and in many cases outperforms EI and MES.
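
As background for the acquisition function named in the abstract, the sketch below evaluates the standard closed-form Expected Improvement for a maximization problem under a Gaussian posterior. It is the textbook EI formula, not VES-Gamma; the function names and the maximization convention are assumptions made here.

```python
import math


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """EI over the incumbent `best` when the posterior at a point is N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)
```

The first term rewards points whose posterior mean already exceeds the incumbent, and the second rewards posterior uncertainty, which is the exploitation and exploration balance usually attributed to EI.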

2410.03280 2026-05-15 eess.AS cs.AI cs.LG eess.SP

Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope

Yasaman Torabi, Shahram Shirani, James P. Reilly

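Affiliations * Electrical and Computer Engineering Department, McMaster University

AI summary This study presents a dataset of cardiopulmonary sounds recorded with a digital stethoscope, covering normal sounds and a range of abnormal heart and lung sounds such as murmurs, arrhythmias, and abnormal breath sounds. The recordings were collected from a clinical manikin, include individual and mixed sounds at different body locations, and were frequency-filtered to enhance specific sound types. The dataset is a resource for AI research on automated cardiopulmonary disease detection, sound classification, and deep learning.

Journal ref IEEE Data Descriptions, vol. 2, pp. 133-140, 2025

English abstract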

Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespiratory sounds. The recordings were collected from a clinical manikin, a patient simulator designed to replicate human physiological conditions, generating clean heart and lung sounds at different body locations. This dataset includes both normal sounds and various abnormalities (i.e., murmur, atrial fibrillation, tachycardia, atrioventricular block, third and fourth heart sound, wheezing, crackles, rhonchi, pleural rub, and gurgling sounds). The dataset includes audio recordings of chest examinations performed at different anatomical locations, as determined by specialist nurses. Each recording has been enhanced using frequency filters to highlight specific sound types. This dataset is useful for applications in artificial intelligence, such as automated cardiopulmonary disease detection, sound classification, unsupervised separation techniques, and deep learning algorithms related to audio signal processing.

2410.02091 2026-05-15 cs.SE cs.AI cs.HC econ.GN q-fin.EC

The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Fangchen Song, Ashish Agarwal, Wen Wen

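Affiliations * University of Texas at Austin

AI summary This study examines the impact of generative AI on collaborative open-source software (OSS) development, focusing on the effect of GitHub Copilot, an AI pair programmer, in open-source projects on GitHub. It finds that Copilot use increases project-level code contributions by 5.9%, driven by higher developer participation and individual productivity, but also increases coordination time by 8%. It further finds that the effects differ between core and peripheral developers, which informs how AI may shape open-source communities in the long run.

English abstract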

Generative artificial intelligence (AI) facilitates content production and enhances ideation capabilities, which can significantly influence developer productivity and participation in software development. To explore its impact on collaborative open-source software (OSS) development, we investigate the role of GitHub Copilot, a generative AI pair programmer, in OSS development where multiple distributed developers voluntarily collaborate. Using GitHub's proprietary Copilot usage data, combined with public OSS project data obtained from GitHub, we find that Copilot use increases project-level code contributions by 5.9%. This gain is driven by a 3.4% rise in developer coding participation and a 2.1% increase in individual productivity. However, Copilot use also leads to an increase in coordination time by 8% due to more code discussions. This reveals an important tradeoff: While AI expands who can contribute and how much they contribute, it slows coordination in collective development efforts. Despite this tension, the combined effect of these two competing forces remains positive, indicating a net gain in overall project-level timely merge of code contributions from using AI pair programmers. Interestingly, we also find the effects differ across developer roles. Peripheral developers show relatively smaller increases in project-level code contributions and experience larger increases in coordination time than core developers. In summary, our study underscores the dual role of AI pair programmers in affecting project-level code contributions and coordination time in OSS development. Our findings on the differential effects between core and peripheral developers also provide important implications for the structure of OSS communities in the long run.

2404.13649 2026-05-15 stat.ML cs.LG stat.ME

Distributional Principal Autoencoders

Xinwei Shen, Nicolai Meinshausen

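Affiliations * Department of Statistics, University of Washington

AI summary This paper proposes Distributional Principal Autoencoder (DPA), a dimension reduction method that aims to preserve the distribution of the original data in the reconstruction. It learns the conditional distribution of the data given its low-dimensional latent variables so that reconstructed data are distributed identically to the original data. Experiments on climate, single-cell, and image data show that DPA preserves the original distribution and important structural features of the data.

English abstract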

Dimension reduction techniques usually lose information in the sense that reconstructed data are not identical to the original data. However, we argue that it is possible to have reconstructed data identically distributed as the original data, irrespective of the retained dimension or the specific mapping. This can be achieved by learning a distributional model that matches the conditional distribution of data given its low-dimensional latent variables. Motivated by this, we propose Distributional Principal Autoencoder (DPA) that consists of an encoder that maps high-dimensional data to low-dimensional latent variables and a decoder that maps the latent variables back to the data space. For reducing the dimension, the DPA encoder aims to minimise the unexplained variability of the data with an adaptive choice of the latent dimension. For reconstructing data, the DPA decoder aims to match the conditional distribution of all data that are mapped to a certain latent value, thus ensuring that the reconstructed data retains the original data distribution. Our numerical results on climate data, single-cell data, and image benchmarks demonstrate the practical feasibility and success of the approach in reconstructing the original distribution of the data. DPA embeddings are shown to preserve meaningful structures of data such as the seasonal cycle for precipitations and cell types for gene expression.

2303.14511 2026-05-15 hep-ex cs.AI cs.LG hep-ph physics.data-an

Improving robustness of jet tagging algorithms with adversarial training: exploring the loss surface

Annika Stein

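Affiliations * Center for Theoretical Physics, Sloane Physics Laboratory, Yale University; III. Physics Institute A, RWTH Aachen University

AI summary This paper studies how adversarial training can improve the robustness of jet tagging algorithms in high-energy physics, focusing on the effect of small perturbations of input features on model performance. By exploring the geometry of the loss surface, the author gives a geometric interpretation of model robustness under systematic uncertainties and shows that adversarial training improves robustness while maintaining high performance.

Comments 5 pages, 2 figures; submitted to ACAT 2022 proceedings

Journal ref 2026 J. Phys.: Conf. Ser. 3206 012085

English abstract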

In the field of high-energy physics, deep learning algorithms continue to gain in relevance and provide performance improvements over traditional methods, for example when identifying rare signals or finding complex patterns. From an analyst's perspective, obtaining highest possible performance is desirable, but recently, some attention has been shifted towards studying robustness of models to investigate how well these perform under slight distortions of input features. Especially for tasks that involve many (low-level) inputs, the application of deep neural networks brings new challenges. In the context of jet flavor tagging, adversarial attacks are used to probe a typical classifier's vulnerability and can be understood as a model for systematic uncertainties. A corresponding defense strategy, adversarial training, improves robustness, while maintaining high performance. Investigating the loss surface corresponding to the inputs and models in question reveals geometric interpretations of robustness, taking correlations into account.

2211.16113 2026-05-15 cs.NE cs.LG

Timing-Based Backpropagation in Spiking Neural Networks Without Single-Spike Restrictions

Kakei Yamamoto, Yusuke Sakemi, Kazuyuki Aihara

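Affiliations * University of Tokyo; Research Center for Mathematical Engineering; University of Tokyo Institutes for Advanced Study

AI summary This paper proposes a backpropagation algorithm without single-spike restrictions for training spiking neural networks (SNNs), in which information is encoded in the relative timing of multiple spikes of individual neurons. Unlike conventional methods, it allows each neuron to fire multiple times, which increases the network's computational capacity, and it reaches accuracy comparable to non-convolutional artificial neural networks on several tasks. The study also finds that the spike count property of the networks depends on the time constants of the postsynaptic current and the membrane potential, and that there is an optimal time constant that maximizes test accuracy, a phenomenon not observed in conventional single-spike temporal coding methods.

Comments 10 pages, 5 figures

Journal ref 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan, 2024, pp. 1-9

English abstract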

We propose a novel backpropagation algorithm for training spiking neural networks (SNNs) that encodes information in the relative multiple spike timing of individual neurons without single-spike restrictions. The proposed algorithm inherits the advantages of conventional timing-based methods in that it computes accurate gradients with respect to spike timing, which promotes ideal temporal coding. Unlike conventional methods where each neuron fires at most once, the proposed algorithm allows each neuron to fire multiple times. This extension naturally improves the computational capacity of SNNs. Our SNN model outperformed comparable SNN models and achieved accuracy as high as that of non-convolutional artificial neural networks. The spike count property of our networks changed depending on the time constants of the postsynaptic current and the membrane potential. Moreover, we found that there existed an optimal time constant with the maximum test accuracy. This was not seen in conventional SNNs with single-spike restrictions on time-to-first-spike (TTFS) coding. This result demonstrates the computational properties of SNNs that biologically encode information into the multi-spike timing of individual neurons. Our code will be publicly available.

2202.05568 2026-05-15 stat.ML cs.IT cs.LG math.IT math.PR math.ST stat.TH

Change of measure through the Legendre transform

Antoine Picard-Weibel, Benjamin Guedj

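Affiliations * Suez, CIRSEE, France; University College London, United Kingdom; Inria, France

AI summary This paper studies change of measure through the Legendre transform as a route to PAC-Bayes generalization bounds. Combining the Legendre transform with the Fenchel-Young inequality, the authors build change-of-measure inequalities based on $f$-divergences, going beyond the conditions required by the classical Donsker-Varadhan theorem. The approach gives learning theory a more flexible analytical tool and establishes PAC-Bayes guarantees under a wider range of assumptions.

Comments 27 pages

English abstract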

PAC-Bayes generalisation bounds are derived via change-of-measure inequalities that transfer concentration properties from a reference measure to all posterior measures. The specific choice of change of measure determines the assumptions required on the empirical risk; in particular, the classical Donsker--Varadhan theorem leads to bounds relying on bounded exponential moments. We study change-of-measure inequalities based on \(f\)-divergences, obtained by combining the Legendre transform of \(f\) with the Fenchel--Young inequality. Beyond their intrinsic interest in probability theory, we show how these inequalities are helpful in learning theory and yield PAC-Bayes bounds under tailored assumptions on the empirical risk, thereby extending the range of conditions under which PAC-Bayesian guarantees can be established.
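
As background, the two standard inequalities the abstract refers to can be written as follows. The notation is assumed here rather than taken from the paper: \(P\) is the reference measure, \(Q\) a posterior with \(Q \ll P\), \(h\) a measurable function, \(f\) a convex function with \(f(1) = 0\), and \(f^{*}\) its Legendre transform.

```latex
% Donsker--Varadhan: change of measure via the Kullback--Leibler divergence
\mathbb{E}_{Q}[h] \;\le\; \mathrm{KL}(Q \,\|\, P) + \log \mathbb{E}_{P}\!\left[e^{h}\right]

% Fenchel--Young, xy <= f(x) + f^*(y), applied pointwise with x = dQ/dP and y = h
\mathbb{E}_{Q}[h] = \mathbb{E}_{P}\!\left[\tfrac{dQ}{dP}\, h\right]
  \;\le\; \mathbb{E}_{P}\!\left[f\!\left(\tfrac{dQ}{dP}\right)\right] + \mathbb{E}_{P}\!\left[f^{*}(h)\right]
  \;=\; D_{f}(Q \,\|\, P) + \mathbb{E}_{P}\!\left[f^{*}(h)\right],
\qquad f^{*}(y) = \sup_{x}\,\{\, xy - f(x) \,\}
```

In the second inequality the exponential-moment term of Donsker--Varadhan is replaced by \(\mathbb{E}_{P}[f^{*}(h)]\), so the moment condition needed on \(h\) depends on the choice of \(f\).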

2605.14188 2026-05-15 quant-ph cs.CL cs.DL physics.atom-ph

QOuLiPo: What a quantum computer sees when it reads a book

Christophe Jurczak

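Affiliations * Quantonation

AI summary This paper studies how a quantum computer "reads" a book: eight classic works of the Renaissance and its late-antique inheritance are run through a neutral-atom quantum processor by converting text structure into graphs. It introduces the rigidity rho metric, which measures how unique a book's structure is, and inverts the pipeline by designing text structures to match graphs native to the hardware, producing a new collection of texts called QOuLiPo that serves as a benchmark for evaluating quantum processors. The work offers the digital humanities a new way to engage with quantum computing and shows the potential of quantum processors for handling complex text structures.

English abstract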

What does a book look like to a quantum computer? This paper takes eight classical works of the Renaissance and its late-antique inheritance -- from Augustine to Galileo -- and runs each through a neutral-atom quantum processor. The bridge is graphs: each textual unit becomes an atom, and graph edges are physical blockade constraints for engineered exact unit-disk designs, or a 2D approximation to the semantic graph for natural texts. Three contributions follow. First, we introduce rigidity rho, a metric for how unique a book's structural backbone is -- distinguishing Marguerite de Navarre's Heptameron (rigid, twelve-nouvelle hard core) from Boethius (fully fungible, every chapter substitutable). Second, we invert the pipeline: rather than extracting a graph from existing prose, we pick a target graph the hardware encodes natively, and write a book whose structure matches it. The twenty-nine texts written this way, collected under the name QOuLiPo, extend the OuLiPo tradition to graph-topological constraints and, together with the eight natural texts, form a benchmark distribution against which neutral-atom hardware can be tracked as it scales. Third, we run both natural and engineered texts on Pasqal's FRESNEL processor up to one hundred atoms; engineered texts reach high approximation ratios, the cleanest instances returning the exact backbone. A cloud-accessible quantum machine plus an agentic coding environment now lets a single investigator run this pipeline end-to-end. What is reported is an application layer, not a speedup -- humanistic instances ready to load onto neutral-atom processors as they scale, already complementing classical text analysis. The Digital Humanities community has a stake in building familiarity with this hardware now: the engineered-corpus design choices made today fix the benchmark distribution future hardware will be measured against.

2605.14177 2026-05-15 cs.IR cs.AI cs.CL

Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models

Harshita Chopra, Krishna Kant Chintalapudi, Suman Nath, Ryen W. White, Chirag Shah

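Affiliations * University of Washington; Microsoft Research

AI summary This paper studies how prospection can guide language models in retrieving user-specific facts from long conversation histories to improve personalized dialogue systems. Because conventional retrieval relies on semantic similarity, it misses relevant facts that lie far from the query; the authors therefore propose Prospection-Guided Retrieval (PGR), which builds plausible future steps and uses them as retrieval probes to surface memories that similarity-based methods tend to miss. Experiments show that the method substantially improves retrieval and response quality on several benchmarks.

Comments Preprint

English abstract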

Long-horizon personalization requires dialogue assistants to retrieve user-specific facts from extended interaction histories. In practice, many relevant facts often have low semantic similarity to the query under dense retrieval. Standard Retrieval-Augmented Generation (RAG) and GraphRAG systems are still largely retrospective: they rely on embedding similarity to the query or on fixed graph traversals, so they often miss facts that matter for the user's needs but lie far from the query in embedding space. Inspired by prospection, the human ability to use imagined futures as cues for recall, we introduce Prospection-Guided Retrieval (PGR), which decouples retrieval from how memories are stored. Given a user query, PGR first expands the goal into a short Tree-of-Thought (ToT) or linear chain of plausible next steps, and uses these steps as retrieval probes rather than relying on the original query alone. The facts retrieved by these probes are then used to personalize the next round of prospection, enabling PGR to uncover additional memories that become relevant only after the simulation is grounded in the user's history. We also introduce MemoryQuest, a challenging multi-session benchmark in which each query is annotated with 3--5 dated reference facts subject to a low query-reference similarity constraint. Across 1,625 queries spanning 185 user profiles from 3 publicly available datasets, PGR-TOT substantially improves retrieval, including nearly 3x recall on MemoryQuest over the strongest baseline. In pairwise LLM-as-judge comparisons against baselines, PGR-generated responses are preferred on 89--98% of queries, with blinded human annotations on held-out subsets showing the same trend. Overall, the results demonstrate that explicit prospection yields large gains in long-horizon retrieval and response quality relative to similarity-only baselines.
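
The core idea of using imagined next steps as retrieval probes can be sketched with a toy example. Everything here is an illustrative assumption: the bag-of-words `embed` stands in for a dense encoder, the `steps` list is hard-coded where the paper has a language model generate it, and the second, personalized round of prospection is omitted.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (a stand-in for a dense encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[t] * b[t] for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def retrieve(probe: str, memory: List[str], k: int = 1) -> List[str]:
    """Return up to k memories with positive similarity to the probe, best first."""
    scored = [(cosine(embed(probe), embed(m)), m) for m in memory]
    scored = [sm for sm in scored if sm[0] > 0.0]
    scored.sort(key=lambda sm: sm[0], reverse=True)
    return [m for _, m in scored[:k]]


def prospective_retrieve(query: str, steps: List[str], memory: List[str], k: int = 1) -> List[str]:
    """Probe with the query and each imagined step, merging hits without duplicates."""
    hits: List[str] = []
    for probe in [query] + steps:
        for fact in retrieve(probe, memory, k):
            if fact not in hits:
                hits.append(fact)
    return hits


memory = [
    "user is allergic to peanuts",
    "user owns a small tent",
    "user works night shifts",
]
query = "plan weekend camping trip"
# Hypothetical prospection output for the query above.
steps = ["pack a tent and sleeping bag", "check snacks for allergic reactions"]
hits = prospective_retrieve(query, steps, memory)
```

In this toy setup the query shares no words with any stored fact, so similarity to the query alone retrieves nothing, while the two imagined steps surface the tent and the allergy.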

2605.14153 2026-05-15 cs.CR cs.AI

ExploitBench: A Capability Ladder Benchmark for LLM Cybersecurity Agents

Seunghyun Lee, David Brumley

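Affiliations * Carnegie Mellon University; Bugcrowd

AI summary This paper proposes ExploitBench, a capability-graded benchmark for evaluating large language models (LLMs) in cybersecurity that decomposes exploitation into 16 measurable stages, from a crash to full control of the target. Deterministic verification is used to assess model performance at each stage. Experiments on 41 V8 bugs show that publicly deployed frontier models reliably reach vulnerable code and trigger crashes but fall well short on advanced capabilities such as arbitrary code execution, while a private model shows stronger exploitation capability.

English abstract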

Exploitation is not a binary event. It is a ladder of acquiring progressive capabilities, from executing a single buggy line of code to taking full control of the target. However, existing LLM security benchmarks treat a crash as exploitation success. That single binary outcome collapses the hard parts of exploitation: the transition from triggering a bug to constructing reusable primitives and control. We present ExploitBench, a capability-graded benchmark that decomposes exploitation into 16 measurable flags, from coverage and crash through sandbox primitives, arbitrary read/write, control-flow hijack, and arbitrary code execution. Each capability is verified by a deterministic oracle that uses a per-run randomized challenge-response for primitives, differential execution against ground-truth binaries to measure progress, and a signal-handler proof for code execution. We instantiate ExploitBench on 41 V8 bugs because V8 is both widely deployed and exploitation-hardened. We report three arms: <model,env> as the primary measurement of model-environment capability, <model,env, adaptive coaching> as a secondary arm that adds adaptive coaching to test whether targeted feedback shifts outcomes, and <model,env,harness> as an ablation that swaps in the model's native CLI to check whether vendor-side optimizations increase exploitation capabilities. Our results show a sharp capability split between publicly deployed frontier models and the private frontier. Across the 8 publicly deployed models tested, reaching the vulnerable code and triggering a crash is routine, but arbitrary code execution is not. The private model shows arbitrary code execution on approximately half. Overall, results suggest that exploit construction against hardened targets is an emerging frontier capability.

2605.14142 2026-05-15 stat.ML cs.LG stat.CO

To discretize continually: Mean shift interacting particle systems for Bayesian inference

Ayoub Belhadji, Daniel Sharp, Youssef M. Marzouk

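Affiliations * Center for Computational Science and Engineering; Laboratory for Information and Decision Systems; Massachusetts Institute of Technology

AI summary This paper proposes an interacting particle system based on minimizing maximum mean discrepancy (MMD) for approximating integrals against a probability distribution known only through its unnormalized density. The method extends the classical mean shift algorithm and optimal quantization algorithms for empirical distributions to continuous distributions, is unaffected by the unknown normalizing constant, and supports both gradient-free and gradient-informed implementations. Experiments on sampling tasks including multi-modal mixtures, Bayesian hierarchical models, and PDE-constrained inverse problems show fast convergence, good coverage of multiple modes, and scaling to high dimensions.

English abstract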

Integration against a probability distribution given its unnormalized density is a central task in Bayesian inference and other fields. We introduce new methods for approximating such expectations with a small set of weighted samples -- i.e., a quadrature rule -- constructed via an interacting particle system that minimizes maximum mean discrepancy (MMD) to the target distribution. These methods extend the classical mean shift algorithm, as well as recent algorithms for optimal quantization of empirical distributions, to the case of continuous distributions. Crucially, our approach creates dynamics for MMD minimization that are invariant to the unknown normalizing constant; they also admit both gradient-free and gradient-informed implementations. The resulting mean shift interacting particle systems converge quickly, capture anisotropy and multi-modality, avoid mode collapse, and scale to high dimensions. We demonstrate their performance on a wide range of benchmark sampling problems, including multi-modal mixtures, Bayesian hierarchical models, PDE-constrained inverse problems, and beyond.
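
For context, the classical mean shift algorithm that the abstract says is being extended can be sketched in one dimension as below. This is the textbook fixed-point iteration on an empirical sample with a Gaussian kernel, not the paper's interacting particle system for unnormalized densities; the sample values and bandwidth are made up for illustration.

```python
import math
from typing import List


def mean_shift_step(x: float, samples: List[float], bandwidth: float) -> float:
    """Move x to the Gaussian-kernel-weighted mean of the samples."""
    weights = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples]
    return sum(w * s for w, s in zip(weights, samples)) / sum(weights)


def mean_shift(x: float, samples: List[float], bandwidth: float = 0.5,
               tol: float = 1e-9, max_iter: int = 500) -> float:
    """Iterate the step until the point stops moving (a mode of the kernel density estimate)."""
    for _ in range(max_iter):
        x_new = mean_shift_step(x, samples, bandwidth)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Two well-separated clusters (hypothetical data).
samples = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
left_mode = mean_shift(-1.0, samples)
right_mode = mean_shift(1.0, samples)
```

Each starting point climbs to the mode of its nearest cluster, which is the behavior that makes mean shift a natural building block for placing a small number of representative points.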

2605.14123 2026-05-15 eess.IV cs.CV

Keyed Nonlinear Transform: Lightweight Privacy-Enhancing Feature Sharing for Medical Image Analysis

Haebom Lee, Gyeongjung Kim

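Affiliations * OOLU Soft Co., Ltd.

AI summary This paper proposes Keyed Nonlinear Transform (KNT), a lightweight feature transformation that enhances privacy in medical image analysis by addressing the leakage of patient identity information during feature sharing. It obfuscates intermediate features with a key-conditioned nonlinear transform, lowering their re-identifiability while preserving classification performance and computational efficiency. Experiments show that KNT improves privacy protection without retraining the model and applies to several medical imaging tasks.

English abstract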

Feature sharing via split inference offers a lightweight alternative to federated learning for resource-constrained hospitals, but transmitted features still leak patient identity information and lack practical mechanisms for controlled feature sharing. We propose Keyed Nonlinear Transform (KNT), a drop-in feature transformation that applies key-conditioned obfuscation to intermediate representations. KNT reduces re-identification AUC from 0.635 to 0.586, corresponding to a 36% reduction in above-chance identity signal, while introducing only 0.15 ms CPU overhead, without backbone retraining, and preserving classification performance within 1.0 pp. Our analysis shows that KNT's nonlinear transform prevents closed-form inversion and shifts recovery to iterative gradient-based optimization under full key compromise, substantially increasing inversion difficulty. The same transform generalizes to dense prediction tasks, incurring only a 4.4 pp Dice reduction on skin-lesion segmentation without retraining. These results position KNT as a practical and efficient privacy layer for split inference deployments.
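
The 36% figure can be reproduced from the two AUC values with a short calculation, assuming it is measured relative to a chance-level AUC of 0.5 (the abstract implies this but does not state it). The function name below is hypothetical.

```python
def above_chance_reduction(auc_before: float, auc_after: float, chance: float = 0.5) -> float:
    """Fractional drop in the part of the AUC that exceeds chance level."""
    return (auc_before - auc_after) / (auc_before - chance)


reduction = above_chance_reduction(0.635, 0.586)
print(f"{reduction:.1%}")  # prints 36.3%
```

In other words, the raw AUC falls by 0.049, which is about 36% of the 0.135 margin that the untransformed features had over chance.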