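arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.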

2508.14950 2026-05-15 eess.IV cs.LG

Potential and challenges of generative adversarial networks for super-resolution in 4D Flow MRI

Oliver Welin Odeback, Arivazhagan Geetha Balasubramanian, Jonas Schollenberger, Edward Ferdiand, Alistair A. Young, C. Alberto Figueroa, Susanne Schnell, Outi Tammisola, Ricardo Vinuesa, Tobias Granberg, Alexander Fyrdahl, David Marlevi

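AI summary: This paper investigates the potential and challenges of generative adversarial networks (GANs) for super-resolution in 4D Flow MRI. To address the low spatial resolution and noise that degrade near-wall velocity measurements, the authors implement a dedicated GAN architecture and evaluate it with three adversarial loss functions (Vanilla, Relativistic, and Wasserstein). The Wasserstein GAN gave the most stable training and the best near-wall velocity recovery, indicating that GANs can improve 4D Flow MRI image quality.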

Comments 26 pages, 10 figures

Details

DOI: 10.1016/j.compbiomed.2026.111745
Journal ref: Computers in Biology and Medicine 211 (2026) 111745

Abstract

4D Flow Magnetic Resonance Imaging (4D Flow MRI) enables non-invasive quantification of blood flow and hemodynamic parameters. However, its clinical application is limited by low spatial resolution and noise, particularly affecting near-wall velocity measurements. Machine learning-based super-resolution has shown promise in addressing these limitations, but challenges remain, not least in recovering near-wall velocities. Generative adversarial networks (GANs) offer a compelling solution, having demonstrated strong capabilities in restoring sharp boundaries in non-medical super-resolution tasks. Yet, their application in 4D Flow MRI remains unexplored, with implementation challenged by known issues such as training instability and non-convergence. In this study, we investigate GAN-based super-resolution in 4D Flow MRI. Training and validation were conducted using patient-specific cerebrovascular in-silico models, converted into synthetic images via an MR-true reconstruction pipeline. A dedicated GAN architecture was implemented and evaluated across three adversarial loss functions: Vanilla, Relativistic, and Wasserstein. Our results demonstrate that the proposed GAN improved near-wall velocity recovery compared to a non-adversarial reference (vNRMSE: 6.9% vs. 9.6%); however, implementation specifics are critical for stable network training. While Vanilla and Relativistic GANs proved unstable compared to generator-only training (vNRMSE: 8.1% and 7.8% vs. 7.2%), a Wasserstein GAN demonstrated optimal stability and incremental improvement (vNRMSE: 6.9% vs. 7.2%). The Wasserstein GAN further outperformed the generator-only baseline at low SNR (vNRMSE: 8.7% vs. 10.7%). These findings highlight the potential of GAN-based super-resolution in enhancing 4D Flow MRI, particularly in challenging cerebrovascular regions, while emphasizing the need for careful selection of adversarial strategies.
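As a rough illustration of two quantities named in the abstract above, the following sketch shows the textbook Wasserstein adversarial losses and a relative velocity RMSE. It is not the authors' implementation: the function names are hypothetical, and the normalization by the largest reference speed is an assumption, since the abstract does not define vNRMSE.

```python
import math

def critic_loss(real_scores, fake_scores):
    """Textbook Wasserstein critic loss: mean score on generated samples
    minus mean score on real samples (the critic minimizes this)."""
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

def generator_adversarial_loss(fake_scores):
    """Textbook Wasserstein generator loss: negative mean critic score
    on generated samples."""
    return -sum(fake_scores) / len(fake_scores)

def relative_rmse(predicted, reference):
    """RMSE over velocity vectors, divided by the largest reference speed.
    ASSUMPTION: this normalization is illustrative, not the paper's definition."""
    squared = [
        sum((p - r) ** 2 for p, r in zip(pv, rv))
        for pv, rv in zip(predicted, reference)
    ]
    rmse = math.sqrt(sum(squared) / len(squared))
    vmax = max(math.sqrt(sum(c * c for c in rv)) for rv in reference)
    return rmse / vmax

if __name__ == "__main__":
    # Toy critic scores and two 3-component velocity vectors.
    print(critic_loss([1.0, 3.0], [0.0, 1.0]))
    print(generator_adversarial_loss([0.0, 1.0]))
    print(relative_rmse([(1, 0, 0), (0, 2, 0)], [(1, 0, 0), (0, 1, 0)]))
```

In a real training loop the scores would come from a critic network, and a Lipschitz constraint (for example weight clipping or a gradient penalty) would be needed; both are omitted here.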

2508.07876 2026-05-15 stat.ML cs.LG math.DS math.ST stat.TH

Stochastic dynamics learning with state-space systems

Juan-Pablo Ortega, Florian Rossmannek

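AI summary: This paper studies state-space systems for learning stochastic dynamics, with the aim of deepening the theoretical foundations of reservoir computing (RC). Treating fading memory and the echo state property (ESP) in a unified way across deterministic and stochastic settings, the authors show that fading memory and solution stability hold generically, even in the absence of the ESP, which gives theoretical support for the broad use of RC models. In the stochastic case, the paper introduces a new perspective based on attractor dynamics of probability distributions, extending work on non-autonomous dynamical systems and offering further insight into causality, stability, and memory in RC models.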

2508.03941 2026-05-15 cs.IR cs.LG

Measuring the stability and plasticity of recommender systems

Maria João Lavoura, Robert Jungnickel, João Vinagre

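AI summary: This paper studies the stability and plasticity of recommender systems over long-term operation and proposes an offline evaluation methodology for analyzing how recommendation models behave when retrained. It evaluates algorithms on two axes, retaining past patterns (stability) and adapting to new changes (plasticity), and yields a long-term performance analysis framework that is agnostic to dataset, algorithm, and metric. Experiments show that different types of recommendation algorithms differ in stability and plasticity, and suggest a possible trade-off between the two.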

Comments Final version published in the proceedings of ACM UMAP 2026: https://doi.org/10.1145/3774935.3812707

2507.13941 2026-05-15 q-bio.NC cs.AI cs.CV eess.IV

Shared representations in brains and models reveal a two-route cortical organization during scene perception

Pablo Marcos-Manchón, Lluís Fuentemilla

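AI summary: Using 7T fMRI data, this study examines how information is organized and routed in the human brain during scene perception. With representational similarity analysis, it compares representational structure shared across individuals with hierarchical features from vision and language neural networks and finds two separate processing routes: one for scene layout and environmental context, the other specialized for biological (animate) content. The finding refines classical models of visual processing and shows scene perception to be a distributed brain network made up of several distinguishable representational routes.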

Comments For associated code, see https://github.com/memory-formation/convergent-transformations

2507.05193 2026-05-15 eess.IV cs.CV

RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis

Songxiao Yang, Haolin Wang, Yao Fu, Ye Tian, Tamotsu Kamishima, Masayuki Ikebe, Yafei Ou, Masatoshi Okutomi

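AI summary: This study presents RAM-W600, a multi-task wrist radiograph dataset for computer-aided diagnosis and monitoring of rheumatoid arthritis (RA). The dataset contains 1048 conventional wrist radiographs from 388 patients across six medical centers, with pixel-level instance segmentation annotations of the wrist bones and SvdH bone erosion scores, and is the first public resource for wrist bone instance segmentation. It can support RA-related research such as joint space narrowing quantification, bone erosion detection, and bone deformity assessment, may also apply to tasks such as wrist fracture localization, and is expected to lower the barrier to wrist RA research and advance computer-aided diagnosis.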

Comments Published in NeurIPS 2025

2506.20425 2026-05-15 stat.ML cs.LG stat.CO stat.ME

Scalable Subset Selection in Linear Mixed Models

Ryan Thompson, Matt P. Wand, Joanna J. J. Wang

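AI summary: This paper studies scalable subset selection in linear mixed models, which contain both fixed and random effects. Because existing methods become computationally inefficient with large numbers of predictors, the authors propose a new subset selection method based on $\ell_0$ regularization, combined with coordinate descent and local search algorithms for fast convergence and efficient handling of the non-convex optimization problem. The method comes with a finite-sample bound on the KL divergence and performs well in experiments on synthetic and real data.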

2505.16714 2026-05-15 quant-ph cs.LG

Experimental robustness benchmarking of quantum neural networks on a superconducting quantum processor

Hai-Feng Zhang, Zhao-Yun Chen, Peng Wang, Liang-Liang Guo, Tian-Le Wang, Xiao-Yan Yang, Ren-Ze Zhao, Ze-An Zhao, Sheng Zhang, Lei Du, Hao-Ran Tao, Zhi-Long Jia, Wei-Cheng Kong, Huan-Yu Liu, Athanasios V. Vasilakos, Yang Yang, Yu-Chun Wu, Ji Guan, Peng Duan, Guo-Ping Guo

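AI summary: This study reports the first systematic experimental robustness evaluation of a 20-qubit quantum neural network classifier on a superconducting quantum processor, exposing the security of quantum machine learning models under adversarial attack. It proposes an efficient adversarial attack algorithm for quantifying the robustness of quantum neural networks and verifies that adversarial training markedly improves robustness by regularizing input gradients. The experiments also indicate that quantum neural networks are more adversarially robust than classical neural networks, which the authors attribute to inherent quantum noise, and the experimental results closely match the theoretical lower bound, confirming both the effectiveness of the attack and the tightness of the robustness bound.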

Comments There are 8 pages with 5 figures in the main text

2505.09552 2026-05-15 stat.ME cs.LG stat.ML

Scalable Krylov Subspace Methods for Generalized Mixed-Effects Models with Crossed Random Effects

Pascal Kündig, Fabio Sigrist

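AI summary: This paper addresses the computational bottleneck in generalized mixed-effects models with crossed random effects and proposes Krylov subspace methods that improve computational efficiency on high-dimensional data. Through theoretical analysis and experiments, it demonstrates the advantages of preconditioned stochastic Lanczos quadrature and conjugate gradient methods in convergence and numerical stability, and develops a scalable method for computing predictive variances. Experiments show that the new methods are substantially faster and more stable than the traditional Cholesky decomposition approach.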

2505.09246 2026-05-15 cs.IR cs.AI cs.CL

Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

Derian Boer, Stephen Roth, Stefan Kramer

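AI summary: This paper presents Autofocus-Retriever (AF-Retriever), a multi-hop question answering framework built on semi-structured knowledge bases that combines structured and unstructured information. It uses exchangeable large language models to extract entity attributes and relational constraints, together with vector similarity search and an incremental scope expansion strategy, and achieves zero- and one-shot performance above existing methods on several benchmarks. Its core contribution is a pipeline of four constraint-driven retrieval steps followed by four further steps that supplement and rank the candidates, which improves the accuracy and robustness of answer retrieval.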

Details

Journal ref: Transactions on Machine Learning Research 2026

Abstract

In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet most rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract entity attributes and relational constraints for both parsing and reranking the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that prepares a configurable number of suitable candidates, those that best fulfill the given constraints, for reranking, and (4) a hybrid retrieval strategy that reduces error susceptibility. In summary, while constantly adjusting the focus like an optical autofocus, AF-Retriever delivers a configurable number of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked through four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever .

2504.11703 2026-05-15 cs.CR cs.AI

Progent: Securing AI Agents with Privilege Control

Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, Dawn Song

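AI summary: AI agents interact with external environments by calling tools, which exposes them to attacks such as indirect prompt injection that lead to unauthorized actions. This paper proposes Progent, a framework that secures AI agents through privilege control. Progent expresses privileges as symbolic security policies over tool names and arguments and checks every tool call with a deterministic procedure, enforcing the principle of least privilege. The framework uses large language models to generate and dynamically update policies automatically, and uses an SMT solver to guarantee that policy updates are monotonic, which prevents privilege escalation while preserving utility. Experiments show that it substantially lowers attack success rates across multiple benchmarks.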

2504.01571 2026-05-15 cs.GR cs.AI cs.CV cs.LG

Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation

Aleksander Plocharski, Jan Swidzinski, Przemyslaw Musialski

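AI summary: This paper proposes Procedural Diffusion Guidance (Pro-DG), a method for architectural facade generation that introduces hierarchical procedural rules into a Stable Diffusion framework to produce control maps, from which realistic facade images are generated. Starting from a single input image and its segmentation, the method uses an inverse procedural module to recover the facade's hierarchical layout and, together with structural features, a newly designed ControlNet pipeline to generate facade images guided by procedural transformations. The method gives precise control over local appearance and supports large-scale structural edits, and experiments show that it outperforms existing methods at preserving architectural style and enabling controllable editing.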

Comments 17 pages, 15 figures, Computer Graphics Forum 2026 Journal Paper

2501.18756 2026-05-15 stat.ML cs.LG math.OC

A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization

Nuojin Cheng, Leonard Papenmeier, Stephen Becker, Luigi Nardi

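AI summary: This paper proposes a unified theoretical framework, Variational Entropy Search, that reveals a deep connection between Expected Improvement (EI) and information-theoretic acquisition functions, challenging the traditional view that they are fundamentally different. By interpreting EI as a variational approximation of Max-value Entropy Search (MES), the authors derive a new acquisition function, VES-Gamma, which performs well on synthetic and real-world benchmarks in both low and high dimensions and outperforms existing EI and MES methods.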

2410.03280 2026-05-15 eess.AS cs.AI cs.LG eess.SP

Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope

Yasaman Torabi, Shahram Shirani, James P. Reilly

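AI summary: This study presents a dataset of cardiopulmonary sounds recorded with a digital stethoscope, covering normal sounds and a range of abnormal heart and lung sounds such as murmurs, arrhythmias, and breath sounds. The recordings were collected on a clinical manikin, include individual and mixed sounds from different body locations, and were frequency-filtered to enhance specific sound types. The dataset is a resource for AI research on automated cardiopulmonary disease detection, sound classification, and deep learning.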

2410.02091 2026-05-15 cs.SE cs.AI cs.HC econ.GN q-fin.EC

The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Fangchen Song, Ashish Agarwal, Wen Wen

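AI summary: This study examines the impact of generative AI on collaborative open-source software (OSS) development, focusing on the effect of the AI programming assistant GitHub Copilot on open-source projects on GitHub. It finds that Copilot use raises project-level code contributions by 5.9%, driven mainly by higher developer participation and individual productivity, but also increases coordination time by 8%. The study further finds that the effect of AI differs between core and peripheral developers, which informs the understanding of AI's long-term impact on open-source communities.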

2404.13649 2026-05-15 stat.ML cs.LG stat.ME

Distributional Principal Autoencoders

Xinwei Shen, Nicolai Meinshausen

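AI summary: This paper proposes the Distributional Principal Autoencoder (DPA), a dimensionality reduction method designed to preserve the distribution of the original data in its reconstructions. The method learns the conditional distribution of the data given the low-dimensional latent variables, so that the reconstructed data are distributed like the original data. Experiments show that DPA retains the original distribution and important structural features on climate data, single-cell data, and image data.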

2303.14511 2026-05-15 hep-ex cs.AI cs.LG hep-ph physics.data-an

Improving robustness of jet tagging algorithms with adversarial training: exploring the loss surface

Annika Stein

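AI summary: This paper studies how adversarial training can improve the robustness of jet tagging algorithms in high-energy physics, focusing on how small perturbations of the input features affect model performance. By exploring the geometry of the loss surface, the author sheds light on how the model stays robust under systematic uncertainties and proposes an adversarial training approach that increases robustness while maintaining high performance.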

Comments 5 pages, 2 figures; submitted to ACAT 2022 proceedings

2211.16113 2026-05-15 cs.NE cs.LG

Timing-Based Backpropagation in Spiking Neural Networks Without Single-Spike Restrictions

Kakei Yamamoto, Yusuke Sakemi, Kazuyuki Aihara

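AI summary: This paper proposes a new backpropagation algorithm for training spiking neural networks (SNNs) without the single-spike restriction, in which information is encoded in the relative timing of multiple spikes from individual neurons. Unlike conventional methods, it allows each neuron to fire multiple times, which increases the network's computational capacity and reaches accuracy comparable to non-convolutional artificial neural networks on several tasks. The study also finds that the network's spike-count behavior depends on the time constants of the postsynaptic current and the membrane potential, and that there is an optimal time constant that maximizes test accuracy, a phenomenon not observed in conventional single-spike timing-based methods.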

Comments 10 pages, 5 figures

2202.05568 2026-05-15 stat.ML cs.IT cs.LG math.IT math.PR math.ST stat.TH

Change of measure through the Legendre transform

Antoine Picard-Weibel, Benjamin Guedj

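AI summary: This paper studies change of measure through the Legendre transform as a route to PAC-Bayes generalization bounds. Combining the Legendre transform with the Fenchel-Young inequality, the authors construct change-of-measure inequalities based on $f$-divergences, relaxing the conditions of the classical Donsker-Varadhan theorem. The approach gives learning theory a more flexible analytical tool and establishes PAC-Bayes guarantees under broader assumptions.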

Comments 27 pages
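For orientation, the two textbook facts that the summary above combines can be written out as follows. This is standard background under the usual assumptions ($Q \ll P$, integrability of the terms involved, $\varphi$ convex with $\varphi(1)=0$ and convex conjugate $\varphi^{*}$), not the paper's own, more general result.

```latex
% Donsker-Varadhan variational formula and the resulting change of measure
\mathrm{KL}(Q \,\|\, P) = \sup_{f} \Big\{ \mathbb{E}_{Q}[f] - \log \mathbb{E}_{P}\big[e^{f}\big] \Big\}
\quad\Longrightarrow\quad
\mathbb{E}_{Q}[f] \le \mathrm{KL}(Q \,\|\, P) + \log \mathbb{E}_{P}\big[e^{f}\big].

% Fenchel-Young inequality, applied pointwise with x = dQ/dP and y = g
x\,y \le \varphi(x) + \varphi^{*}(y)
\quad\Longrightarrow\quad
\mathbb{E}_{Q}[g] = \mathbb{E}_{P}\!\Big[\tfrac{dQ}{dP}\, g\Big]
\le \mathbb{E}_{P}\!\Big[\varphi\Big(\tfrac{dQ}{dP}\Big)\Big] + \mathbb{E}_{P}\big[\varphi^{*}(g)\big]
= D_{\varphi}(Q \,\|\, P) + \mathbb{E}_{P}\big[\varphi^{*}(g)\big].
```

In a PAC-Bayes argument, $P$ plays the role of the prior, $Q$ the posterior, and $f$ or $g$ a scaled gap between population and empirical risk.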

2605.14188 2026-05-15 quant-ph cs.CL cs.DL physics.atom-ph

QOuLiPo: What a quantum computer sees when it reads a book

Christophe Jurczak

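AI summary: This paper asks how a quantum computer "reads" a book. It runs eight classical works of the Renaissance and its late-antique inheritance through a neutral-atom quantum processor, converting textual structure into graphs to explore how quantum hardware handles text. The study introduces the metric "rigidity rho", which measures how unique a book's structural backbone is, and inverts the pipeline by writing texts whose structure matches graphs the hardware encodes natively; the resulting collection, named QOuLiPo, serves as a benchmark for tracking quantum processor performance. The work offers the digital humanities a new way to engage with quantum computing and illustrates the potential of quantum processors on complex textual structures.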

Abstract

What does a book look like to a quantum computer? This paper takes eight classical works of the Renaissance and its late-antique inheritance -- from Augustine to Galileo -- and runs each through a neutral-atom quantum processor. The bridge is graphs: each textual unit becomes an atom, and graph edges are physical blockade constraints for engineered exact unit-disk designs, or a 2D approximation to the semantic graph for natural texts. Three contributions follow. First, we introduce rigidity rho, a metric for how unique a book's structural backbone is -- distinguishing Marguerite de Navarre's Heptameron (rigid, twelve-nouvelle hard core) from Boethius (fully fungible, every chapter substitutable). Second, we invert the pipeline: rather than extracting a graph from existing prose, we pick a target graph the hardware encodes natively, and write a book whose structure matches it. The twenty-nine texts written this way, collected under the name QOuLiPo, extend the OuLiPo tradition to graph-topological constraints and, together with the eight natural texts, form a benchmark distribution against which neutral-atom hardware can be tracked as it scales. Third, we run both natural and engineered texts on Pasqal's FRESNEL processor up to one hundred atoms; engineered texts reach high approximation ratios, the cleanest instances returning the exact backbone. A cloud-accessible quantum machine plus an agentic coding environment now lets a single investigator run this pipeline end-to-end. What is reported is an application layer, not a speedup -- humanistic instances ready to load onto neutral-atom processors as they scale, already complementing classical text analysis. The Digital Humanities community has a stake in building familiarity with this hardware now: the engineered-corpus design choices made today fix the benchmark distribution future hardware will be measured against.

2605.14177 2026-05-15 cs.IR cs.AI cs.CL

Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models

Harshita Chopra, Krishna Kant Chintalapudi, Suman Nath, Ryen W. White, Chirag Shah

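AI summary: This paper studies how prospection can guide language models in retrieving user-specific facts from long conversation histories, with the goal of improving personalized dialogue systems. Because conventional retrieval relies on semantic similarity and struggles to find relevant facts that lie far from the query, the authors propose Prospection-Guided Retrieval (PGR), which builds plausible future steps and uses them as retrieval probes, surfacing memories in the user's history that are relevant but hard for conventional methods to find. Experiments show that the method substantially improves retrieval and response quality on multiple benchmarks.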

Comments Preprint

Abstract

Long-horizon personalization requires dialogue assistants to retrieve user-specific facts from extended interaction histories. In practice, many relevant facts often have low semantic similarity to the query under dense retrieval. Standard Retrieval-Augmented Generation (RAG) and GraphRAG systems are still largely retrospective: they rely on embedding similarity to the query or on fixed graph traversals, so they often miss facts that matter for the user's needs but lie far from the query in embedding space. Inspired by prospection, the human ability to use imagined futures as cues for recall, we introduce Prospection-Guided Retrieval (PGR), which decouples retrieval from how memories are stored. Given a user query, PGR first expands the goal into a short Tree-of-Thought (ToT) or linear chain of plausible next steps, and uses these steps as retrieval probes rather than relying on the original query alone. The facts retrieved by these probes are then used to personalize the next round of prospection, enabling PGR to uncover additional memories that become relevant only after the simulation is grounded in the user's history. We also introduce MemoryQuest, a challenging multi-session benchmark in which each query is annotated with 3--5 dated reference facts subject to a low query-reference similarity constraint. Across 1,625 queries spanning 185 user profiles from 3 publicly available datasets, PGR-TOT substantially improves retrieval, including nearly 3x recall on MemoryQuest over the strongest baseline. In pairwise LLM-as-judge comparisons against baselines, PGR-generated responses are preferred on 89--98% of queries, with blinded human annotations on held-out subsets showing the same trend. Overall, the results demonstrate that explicit prospection yields large gains in long-horizon retrieval and response quality relative to similarity-only baselines.
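The probe idea in the abstract above can be sketched in a few lines. This is a toy, single-round illustration, not the authors' system: a bag-of-words cosine similarity stands in for a dense retriever, the "next steps" are supplied by hand where PGR would generate them with a language model, and all names (`embed`, `top_k`, `prospective_retrieve`) are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (ASSUMPTION: stands in for a dense encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(probe, facts, k=1):
    """Return up to k facts with nonzero similarity to the probe, best first."""
    scored = [(cosine(embed(probe), embed(f)), f) for f in facts]
    scored = [(s, f) for s, f in scored if s > 0.0]
    scored.sort(key=lambda sf: sf[0], reverse=True)
    return [f for _, f in scored[:k]]

def prospective_retrieve(query, next_steps, facts, k=1):
    """Retrieve with the query AND each imagined next step as a probe."""
    found = []
    for probe in [query] + next_steps:
        for fact in top_k(probe, facts, k):
            if fact not in found:
                found.append(fact)
    return found

if __name__ == "__main__":
    facts = [
        "user is allergic to peanuts",
        "user owns a red bicycle",
        "user booked a trip to japan",
    ]
    query = "plan friday dinner menu"
    steps = ["check whether any guest is allergic to peanuts"]  # hand-written probe
    print(top_k(query, facts))                        # query alone finds nothing
    print(prospective_retrieve(query, steps, facts))  # the probe surfaces the allergy
```

The abstract also describes feeding retrieved facts back into a further round of prospection; that loop is left out here to keep the sketch short.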

2605.14153 2026-05-15 cs.CR cs.AI

ExploitBench: A Capability Ladder Benchmark for LLM Cybersecurity Agents

Seunghyun Lee, David Brumley

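AI summary: This paper presents ExploitBench, a capability-ladder benchmark for evaluating large language models (LLMs) in cybersecurity that breaks exploitation down into 16 measurable stages, from crashing the code to taking full control of the target system. The benchmark uses deterministic verification to assess model performance at each stage accurately. The experiments use 41 V8 vulnerabilities and show that currently publicly deployed frontier models do well at triggering bugs and crashes but fall clearly short on advanced capabilities such as arbitrary code execution, while private models show stronger exploitation ability.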

2605.14142 2026-05-15 stat.ML cs.LG stat.CO

To discretize continually: Mean shift interacting particle systems for Bayesian inference

Ayoub Belhadji, Daniel Sharp, Youssef M. Marzouk

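AI summary: This paper proposes an interacting particle system based on minimizing maximum mean discrepancy (MMD) for approximating integrals against a probability distribution when only its unnormalized density is known. The method extends the classical mean shift algorithm and algorithms for optimal quantization of empirical distributions to continuous distributions, is unaffected by the unknown normalizing constant, and supports both gradient-free and gradient-based implementations. Experiments show good convergence, mode capture, and scaling to high dimensions on sampling tasks that include multimodal mixtures, Bayesian hierarchical models, and PDE-constrained inverse problems.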
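For readers unfamiliar with the classical algorithm that the summary says is being extended, here is a minimal one-dimensional mean shift sketch with a Gaussian kernel. It illustrates only that classical baseline on a finite data set, not the paper's MMD-based particle system; the function names and the bandwidth value are illustrative assumptions.

```python
import math

def mean_shift_step(x, data, bandwidth):
    """Move x to the Gaussian-kernel-weighted mean of the data points."""
    weights = [math.exp(-((x - d) ** 2) / (2 * bandwidth ** 2)) for d in data]
    return sum(w * d for w, d in zip(weights, data)) / sum(weights)

def mean_shift(x, data, bandwidth, iters=100):
    """Iterate the update a fixed number of times; x drifts toward a nearby mode."""
    for _ in range(iters):
        x = mean_shift_step(x, data, bandwidth)
    return x

if __name__ == "__main__":
    data = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]  # two well-separated clusters
    print(mean_shift(0.4, data, bandwidth=0.5))  # ends near the cluster around 0
    print(mean_shift(4.6, data, bandwidth=0.5))  # ends near the cluster around 5
```

Each starting point climbs to the mode of the kernel density estimate nearest to it, which is why different initializations can recover different modes.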

2605.14123 2026-05-15 eess.IV cs.CV

Keyed Nonlinear Transform: Lightweight Privacy-Enhancing Feature Sharing for Medical Image Analysis

Haebom Lee, Gyeongjung Kim

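AI summary: This paper proposes the Keyed Nonlinear Transform (KNT), a lightweight feature transformation that strengthens privacy in medical image analysis by addressing the leakage of patient identity when features are shared. The method obfuscates intermediate features with a key-conditioned nonlinear transform, lowering their re-identifiability while preserving classification performance and computational efficiency. Experiments show that KNT substantially improves privacy protection without retraining the model and applies to a range of medical imaging tasks.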

2605.14098 2026-05-15 stat.ML cs.CL cs.LG

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

Yu Gu, Zijun Yu, Vahid Partovi Nia, Masoud Asgharian

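AI summary: This study addresses uncertainty in aggregating the results of multiple reasoning paths in chain-of-thought (CoT) reasoning and proposes a conformal aggregation method that makes the system's abstention decisions more reliable. Instead of conventional majority voting, the method aggregates weighted scores and calibrates the abstention rule with conformal risk control, which controls the confident-error rate with finite-sample guarantees. Experiments show that the method achieves high selective accuracy on multiple benchmarks without retraining the model.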

Comments 9 pages, 4 figures, submitted
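The two ingredients named in the summary above, weighted score aggregation and a conformally calibrated abstention threshold, can be sketched as follows. This is a generic conformal risk control recipe under stated assumptions, not the paper's procedure: the per-path weights, the confidence definition, the 0/1 "confident error" loss, and the function names are all illustrative.

```python
from collections import defaultdict

def aggregate(paths):
    """Sum the weights of reasoning paths per answer; return the top answer
    and its share of the total weight as a confidence score."""
    totals = defaultdict(float)
    for answer, weight in paths:
        totals[answer] += weight
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())

def calibrate_threshold(confidences, correct, alpha, grid):
    """Smallest threshold lam on the grid with (errors + 1) / (n + 1) <= alpha,
    where errors counts calibration items answered (confidence >= lam) but wrong.
    This is the conformal risk control bound for a loss bounded by 1."""
    n = len(confidences)
    for lam in sorted(grid):
        errors = sum(1 for c, ok in zip(confidences, correct) if c >= lam and not ok)
        if (errors + 1) / (n + 1) <= alpha:
            return lam
    return None  # no threshold qualifies: abstain on everything

if __name__ == "__main__":
    print(aggregate([("42", 0.5), ("41", 0.2), ("42", 0.3)]))
    conf = [0.9, 0.8, 0.6, 0.55, 0.4, 0.95, 0.7, 0.85, 0.5]
    ok = [True, True, False, True, False, True, True, True, False]
    grid = [i / 10 for i in range(11)]
    print(calibrate_threshold(conf, ok, alpha=0.25, grid=grid))
```

At test time the system would answer only when the aggregated confidence reaches the calibrated threshold and abstain otherwise. The guarantee relies on the calibration and test items being exchangeable and on the loss being non-increasing in the threshold, which holds here because raising the threshold can only turn answers into abstentions.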

2605.14090 2026-05-15 cs.CY cs.GR cs.LG

Synthetic Sociality: How Generative Models Privatize the Social Fabric

Ana Dodik, Moira Weigel

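AI summary: This paper proposes a critical theoretical framework for analyzing the impact of generative models at both the descriptive and the normative level. It argues that generative models do not only automate intellectual labor but also reproduce and reshape a broader human social capacity, namely social behavior. Tracing how sociality has been commodified in the digital economy, the paper examines the dependence of generative models on social data, introduces the concept of "synthetic sociality" to describe a social reality shaped by privately owned generative models that lack democratic governance, and closes with a normative analysis and directions for future design.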

2605.14066 2026-05-15 eess.AS cs.AI cs.CL cs.SD

A Benchmark for Early-stage Parkinson's Disease Detection from Speech

Terry Yi Zhong, Cristian Tejedor-Garcia, Khiet P. Truong, Janna Maas, Louis ten Bosch, Bastiaan R. Bloem

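AI summary: This study presents the first benchmark for speech-based detection of early-stage Parkinson's disease, aimed at the problem that existing results are hard to compare because studies differ in datasets, languages, tasks, and evaluation protocols. The benchmark uses speaker-independent splits, supports fair and reproducible cross-method evaluation on public datasets, covers three common speech tasks, and tests methods under different training resource conditions. The study also provides multi-dimensional evaluation analyses that support fine-grained comparison and clinical use, offering a reusable reference for robust and clinically meaningful early Parkinson's disease detection.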

Comments Submitted to Interspeech 2026

2605.14041 2026-05-15 stat.ME cs.LG

Wahkon: A Statistically Principled Deep RKHS Superposition Network

Yongkai Chen, Wenxuan Zhong, Ping Ma

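AI summary: This paper proposes Wahkon, a deep reproducing kernel Hilbert space (RKHS) superposition network that aims to combine the predictive power of deep learning with the statistical guarantees of RKHS methods. Building on the Kolmogorov superposition principle and the RKHS regularization idea behind Wahba's splines, it establishes a finite-dimensional deep representer theorem that yields a trainable model structure with layer-wise complexity control. The theoretical analysis shows that the method is equivalent to maximum a posteriori estimation under a hierarchical Gaussian process prior and attains optimal convergence rates in the regularization trade-off between depth and width; experiments show that it outperforms conventional deep models on several benchmark tasks and in single-cell data analysis.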

2605.14025 2026-05-15 q-bio.NC cs.AI

Do Language Models Align with Brains? Prediction Scores Are Not Enough

Xiao Jia

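AI summary: This paper asks whether language models align with brains in language processing and questions whether prediction scores alone are enough to show that language models capture brain-relevant language computation. Using the L-PACT framework, the study applies a strict evaluation along predictive, relational, mechanism-stripping, and reliability dimensions and finds that existing language models fail the control tests on several key criteria, so that their alignment with the brain is not yet adequately supported. The study stresses the need to interpret model-brain relationships more cautiously and to avoid mistaking superficially positive results for structural alignment.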

Comments 39 pages, 4 main figures, 6 supplementary figures

Abstract

Brain-language model comparisons often interpret neural prediction scores as evidence that model representations capture brain-relevant language computation. We asked whether language models align with brains, and whether prediction scores are enough to support that claim, using L-PACT, a source-audited framework that evaluates predictive, relational, mechanism-stripping, and reliability-bounded evidence. Across primary naturalistic language neural datasets and derived language-model representations, L-PACT compared real model features with nuisance baselines and severe controls, tested whether model-to-brain profiles reproduced brain-to-brain patterns, recomputed held-out scores after mechanism stripping, and normalized evidence against brain-brain ceilings. The locked analysis set contains 414 predictive-control rows, 2304 relational profile rows, 4320 mechanism-stripping rows, 420 brain-brain ceiling rows, and 146 integrated decision rows. Assay-sensitivity checks showed that brain-brain reliability, brain-as-model run-to-run relational profiles, independent low-level neural and WAV-derived acoustic-envelope gates, and a deterministic implanted-signal simulation can produce positive evidence when expected. Nevertheless, no real model row passed the predictive, relational, mechanism-stripping, or operational Turing-bounded reliability gates; all 146 integrated rows were control-explained. Less stringent single-criterion rules would have counted raw positive predictive, relational, stripping-delta, and ceiling-normalized effects, but L-PACT downgraded them because controls explained the apparent evidence. In the analyzed derived artifact set, the tested language-model representations do not satisfy L-PACT alignment gates; apparent positives are converted into an auditable control-explained taxonomy rather than treated as structural alignment.

2605.14021 2026-05-15 cs.CY cs.AI

Measuring Google AI Overviews: Activation, Source Quality, Claim Fidelity, and Publisher Impact

Haofei Xu, Umar Iqbal, Jacob M. Montgomery

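AI summary: This study presents a large-scale longitudinal measurement of Google AI Overviews (AIOs), analyzing their activation rate, the quality of the sources they cite, the fidelity of their claims, and their impact on publishers. It finds that AIOs activate for as many as 64.7% of question-style queries but markedly less often for politically sensitive topics; that the sources they cite are more credible than traditional search results, although some cited sources do not appear in the search results at all, indicating a selection mechanism different from Google's ranking algorithm. In addition, about 11% of the claims in AIO answers lack support from the cited sources, and more than half of the cited pages carry advertising, which may affect publisher revenue. The study shows the far-reaching effect of generative AI on the online information ecosystem.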

Comments Under Review

2605.14019 2026-05-15 econ.EM cs.LG math.ST stat.CO stat.TH

Regret Equals Covariance: A Closed-Form Characterization for Stochastic Optimization

Irene Aldridge

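AI summary: This paper studies how to measure regret in stochastic optimization and derives an exact covariance decomposition that expresses expected regret as the covariance between the uncertain parameters and the optimal decision, plus an estimable residual term. For linear programs and unconstrained quadratic programs the residual is zero, so regret can be computed directly from the covariance, avoiding the high computational cost of the conventional sample average approximation. In practice the covariance can be estimated efficiently from historical data, which substantially improves computational efficiency, and the approach is validated through theoretical analysis and experiments.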

Comments 33 pages