arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.05876 2026-05-14 cs.GR cs.CV

3DSS: 3D Surface Splatting for Inverse Rendering

Mae Younes, Adnane Boukhayma

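Affiliations: INRIA, University of Rennes

AI summary: This paper presents 3D Surface Splatting (3DSS), a differentiable surface splatting renderer for physically-based inverse rendering from multi-view images. Its central idea is to formulate the surface separation problem directly in terms of the reconstruction kernels, which leads to a coverage-based compositing model that produces anti-aliased silhouettes and informative visibility gradients at sparsely covered edges. Combined with co-optimized HDR environment lighting and density-aware adaptive refinement, 3DSS jointly recovers shape, spatially-varying materials, and illumination, and it connects natively to mesh-based workflows through surface reconstruction from oriented point clouds.

Abstract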

We present 3D Surface Splatting (3DSS), the first differentiable surface splatting renderer for physically-based inverse rendering from multi-view images. Our central insight is that the surface separation problem at the heart of surface splatting admits a direct formulation in terms of the reconstruction kernels themselves. From this foundation we derive a coverage-based compositing model whose per-layer opacity arises directly from the accumulated Elliptical Weighted Average reconstruction weight, yielding anti-aliased silhouettes and informative visibility gradients at sparsely covered edges. Combined with forward microfacet shading under co-optimized HDR environment lighting and density-aware adaptive refinement, 3DSS jointly recovers shape, spatially-varying BRDF materials, and illumination. Because the optimized representation is a set of oriented surface samples, it bridges natively to mesh-based workflows via surface reconstruction from oriented point cloud methods. We evaluate 3DSS against mesh-based, implicit, and Gaussian-splatting baselines across geometry reconstruction, novel-view synthesis, and novel-illumination relighting.

2604.22966 2026-05-14 cs.CY cs.AI

Institutions for the Post-Scarcity of Judgment

Lauri Lovén

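Affiliations: Future Computing Group, University of Oulu

AI summary: This Opinion piece argues that the scarcity of judgment has inverted in the AI revolution: competent-looking judgment is now produced at a marginal cost approaching zero, while four complements become scarce (verified signal, legitimacy, authentic provenance, and integration capacity). Institutions built to manufacture legitimate judgment, such as courts, journals, licensing bodies, and legislatures, now compete with the technology for the same role. The piece closes with a three-move agenda: reframe AI policy as institutional redesign, build provenance and verification as commons, and develop the formal apparatus for institutional composition under strategic agents.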

Comments 5 pages, 9 references. Submitted to Communications of the ACM (Opinion section). Comments welcome

Abstract

Each major technological revolution inverts a particular scarcity and rebuilds institutions around the shift. The near-consensus diagnosis of the AI revolution holds that AI collapses the cost of prediction while judgment remains scarce. This Opinion argues the inversion has now flipped: competent-looking judgment (selecting, ranking, attributing, certifying) is produced at scale and at marginal cost approaching zero, and four complements become scarce: verified signal, legitimacy, authentic provenance, and integration capacity (the community's tolerance for delegated cognition). Because judgment is the substance of institutions, the institutions built to manufacture legitimate judgment (courts, journals, licensing bodies, legislatures) now compete with the technology for the same functional role. The piece traces the pattern across scientific institutions, professional licensing, intellectual property, democratic legitimacy, and foundation-model concentration, and closes with a three-move agenda: reframe AI policy as institutional redesign, build provenance and verification as commons, and develop the formal apparatus for institutional composition under strategic agents.

2604.21789 2026-05-14 cs.GT cs.LG

Mechanism Design for Decentralized Risk Detection: Strict Propriety, Network Coalitions, and the Backfiring Mandate

Jian Ni, Lecheng Zheng, John R Birge

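Affiliations: Pamplin College of Business, Virginia Tech; Booth School of Business, University of Chicago

AI summary: This paper studies mechanism design for decentralized risk detection among competing firms that share a population of risky customers. Each firm holds fragmentary information, but private incentives impede truthful sharing. The authors develop a dynamic mechanism design framework and introduce a temporal value assignment (TVA) mechanism that uses a strictly proper scoring rule to incentivize truthful posterior reporting. A network Shapley analysis of marginal contributions yields a coalition design principle that prioritizes inter-firm interaction volume over firm size. The paper also shows that information-sharing mandates without compatible incentive design can backfire and reduce welfare below autarky.

Abstract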

Competing firms that share a population of risky customers face a decentralized risk detection problem in which each firm holds fragmentary information whose aggregation would generate social value, but private incentives impede truthful sharing. We develop a dynamic mechanism design framework for this setting and identify three strategic frictions that distinguish it from classical mechanism design with decentralized information: compliance moral hazard, adversarial adaptation, and information destruction through intervention. A temporal value assignment (TVA) mechanism credits firms using a strictly proper scoring rule applied to discounted verified outcomes; under stated assumptions, TVA implements truthful posterior reporting as a Bayes--Nash equilibrium (uniquely optimal at each edge in large federations, with $O(1/m)$ shading in finite systems). A network Shapley characterization shows that under edge-additive coalition value, each firm's marginal contribution is proportional to its weighted cross-firm interaction degree, yielding a sharp prescription for coalition design that prioritizes inter-firm volume over firm size. Embedding TVA in a model of competition among firms, we establish a welfare ordering across four regulatory regimes (autarky, voluntary federation, mandated full sharing, TVA) and identify conditions under which information-sharing mandates without compatible incentive design reduce welfare below autarky: a ``backfiring mandate.'' We illustrate the framework on a 1.4M-transaction synthetic anti-money-laundering benchmark; the same machinery extends to platform fraud, cybersecurity threat intelligence, and supply chain risk detection.
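
The network Shapley claim can be checked numerically on a toy example. The sketch below assumes the simplest edge-additive coalition value (the sum of interaction weights between member firms); the firm names and weights are hypothetical and are not taken from the paper. Under that assumption, brute-force Shapley values equal half of each firm's weighted degree, so a firm's credit depends only on its cross-firm interaction volume.

```python
from itertools import permutations

def coalition_value(members, weights):
    """Edge-additive value: total weight of edges with both endpoints in the coalition."""
    return sum(w for (i, j), w in weights.items() if i in members and j in members)

def shapley_values(firms, weights):
    """Exact Shapley values: average marginal contribution over all join orders."""
    totals = {f: 0.0 for f in firms}
    orderings = list(permutations(firms))
    for order in orderings:
        members = set()
        for f in order:
            before = coalition_value(members, weights)
            members.add(f)
            totals[f] += coalition_value(members, weights) - before
    return {f: t / len(orderings) for f, t in totals.items()}

def weighted_degree(firm, weights):
    """Sum of interaction weights on edges touching the firm."""
    return sum(w for (i, j), w in weights.items() if firm in (i, j))

# Hypothetical four-firm network; weights stand for cross-firm interaction volume.
firms = ["A", "B", "C", "D"]
weights = {("A", "B"): 4.0, ("A", "C"): 1.0, ("B", "C"): 2.0, ("C", "D"): 3.0}

phi = shapley_values(firms, weights)
for f in firms:
    print(f, phi[f], weighted_degree(f, weights) / 2)
```

Each edge's value is split equally between its two endpoints, which is why the two printed columns agree. The brute-force enumeration is only practical for a handful of firms.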

2603.07770 2026-05-14 cs.DC cs.CL

ArcLight: A Lightweight LLM Inference Architecture for Many-Core CPUs

Yuzhuang Xu, Xu Han, Yuxuan Li, Wanxiang Che

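Affiliations: Harbin Institute of Technology; Tsinghua University

AI summary: Existing frameworks for LLM inference on CPUs are mature but fail to fully exploit many-core CPU platforms. ArcLight is a lightweight LLM inference architecture designed for many-core CPUs; it combines efficient memory management and thread scheduling with finely controlled tensor parallelism to reduce the overhead of cross-NUMA memory access. Experiments show up to 46% higher inference throughput than mainstream frameworks, with compatibility across arbitrary CPU devices.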

Comments Accepted by ACL 2026 Demo

Abstract

Although existing frameworks for large language model (LLM) inference on CPUs are mature, they fail to fully exploit the computation potential of many-core CPU platforms. Many-core CPUs are widely deployed in web servers and high-end networking devices, and are typically organized into multiple NUMA nodes that group cores and memory. Current frameworks largely overlook the substantial overhead of cross-NUMA memory access, limiting inference scalability and intelligence enabling on such platforms. To address this limitation, we build ArcLight, a lightweight LLM inference architecture designed from the ground up for many-core CPUs. ArcLight integrates efficient memory management and thread scheduling, and introduces finely controlled tensor parallelism to mitigate the cross-node memory access wall. Experimental results show that ArcLight significantly surpasses the performance ceiling of mainstream frameworks, achieving up to 46% higher inference throughput. Moreover, ArcLight maintains compatibility with arbitrary CPU devices. ArcLight is publicly available at https://github.com/OpenBMB/ArcLight.

2603.02245 2026-05-14 eess.AS cs.LG cs.SD

LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification

Niloofar Jazaeri, Hilmi R. Dajani, Marco Janeczek, Martin Bouchard

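Affiliations: University of Ottawa; Crynostics Inc.

AI summary: This paper addresses cross-domain infant cry classification, where signals are short and nonstationary, annotations are limited, and domain shift is strong. It proposes a compact acoustic framework that fuses MFCC, STFT, and fundamental-frequency (F0) features and models temporal dynamics with an enhanced Legendre Memory Unit (LMU). Calibrated posterior ensemble fusion improves generalization across datasets; experiments report improved macro-F1 under cross-domain evaluation and real-time feasibility for on-device monitoring.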

Comments 7 pages, to appear in Proc. Int. Conf. IEEE Engineering in Medicine and Biology Society (EMBC 2026), Toronto, Canada, July 26-30 2026

Abstract

Decoding infant cry causes remains challenging for healthcare monitoring due to short nonstationary signals, limited annotations, and strong domain shifts across infants and datasets. We propose a compact acoustic framework that fuses mel-frequency cepstral coefficients (MFCCs), short-time Fourier transform (STFT) features, and fundamental-frequency (F0) contours within a multi-branch convolutional neural network (CNN) encoder, and models temporal dynamics using an enhanced Legendre Memory Unit (LMU). Compared to LSTMs, the LMU backbone provides stable sequence modeling with substantially fewer recurrent parameters, supporting efficient deployment. To improve cross-dataset generalization, we introduce calibrated posterior ensemble fusion with entropy-gated weighting to preserve domain-specific expertise while mitigating dataset bias. Experiments on Baby2020 and Baby Crying demonstrate improved macro-F1 under cross-domain evaluation, along with leakage-aware splits and real-time feasibility for on-device monitoring.

2602.17346 2026-05-14 cs.DM cs.DS cs.LG

Partial Optimality in the Preordering Problem

David Stein, Jannik Irmai, Bjoern Andres

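Affiliations: Machine Learning for Computer Vision, TU Dresden

AI summary: This paper studies the preordering problem: given a real value for every ordered pair of elements, find a preorder that maximizes the sum of the values of the pairs it relates. The problem is NP-hard. The authors contribute new partial optimality conditions and efficient algorithms for deciding them, which can establish that certain pairs are not related in an optimal preorder. In experiments, these conditions increase the fraction of pairs that can be decided efficiently.

Abstract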

Preordering is a generalization of clustering and partial ordering with applications in bioinformatics and social network analysis. Given a finite set $V$ and a value $c_{ab} \in \mathbb{R}$ for every ordered pair $ab$ of elements of $V$, the preordering problem asks for a preorder $\lesssim$ on $V$ that maximizes the sum of the values of those pairs $ab$ for which $a \lesssim b$. Building on the state of the art in solving this NP-hard problem partially, we contribute new partial optimality conditions and efficient algorithms for deciding these conditions. In experiments with real and synthetic data, these new conditions increase, in particular, the fraction of pairs $ab$ for which it is decided efficiently that $a \not\lesssim b$ in an optimal preorder.
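
To make the objective concrete, the following sketch solves a tiny instance by exhaustive search. It is not the paper's method (which concerns partial optimality conditions); it only illustrates the problem definition. The element names and values are hypothetical, and the sketch assumes values are given for distinct pairs only, with reflexive pairs contributing nothing.

```python
from itertools import product

def is_preorder(elements, relation):
    """A preorder is reflexive and transitive; `relation` is a set of ordered pairs."""
    reflexive = all((a, a) in relation for a in elements)
    transitive = all(
        (a, c) in relation
        for (a, b) in relation
        for (b2, c) in relation
        if b == b2
    )
    return reflexive and transitive

def objective(relation, values):
    """Sum of c_ab over the pairs ab that the relation contains."""
    return sum(c for pair, c in values.items() if pair in relation)

def best_preorder(elements, values):
    """Enumerate every subset of off-diagonal pairs and keep the best valid preorder."""
    pairs = [(a, b) for a in elements for b in elements if a != b]
    diagonal = {(a, a) for a in elements}
    best, best_value = None, float("-inf")
    for mask in product([False, True], repeat=len(pairs)):
        relation = diagonal | {p for p, keep in zip(pairs, mask) if keep}
        if is_preorder(elements, relation):
            value = objective(relation, values)
            if value > best_value:
                best, best_value = relation, value
    return best, best_value

# Hypothetical instance: ab and bc are rewarded, every other pair is penalized.
elements = ["a", "b", "c"]
values = {
    ("a", "b"): 2.0, ("b", "a"): -1.0,
    ("b", "c"): 2.0, ("c", "b"): -3.0,
    ("a", "c"): -1.5, ("c", "a"): -1.0,
}
relation, value = best_preorder(elements, values)
print(sorted(p for p in relation if p[0] != p[1]), value)
```

In this instance the optimal preorder includes the pair ac even though its value is negative, because transitivity forces it once ab and bc are both chosen. The search space grows as 2^(n(n-1)), which is why conditions that fix pairs in advance are useful.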

2602.16253 2026-05-14 eess.AS cs.SD

How Much Does Machine Identity Matter in Anomalous Sound Detection at Test Time?

Kevin Wilkinghoff, Keisuke Imoto, Zheng-Hua Tan

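Affiliations: Aalborg University; Pioneer Centre for Artificial Intelligence; Kyoto University

AI summary: This paper studies how anomalous sound detection (ASD) performance is affected when machine identity is unavailable at test time. The authors modify the evaluation protocol so that test recordings from multiple machines are merged and processed without machine identity at inference, with identity labels used only for post hoc evaluation. Experiments reveal performance degradations and method-specific differences in robustness that standard evaluation hides, and show that these degradations are strongly related to implicit machine identification accuracy.

Abstract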

Anomalous sound detection (ASD) benchmarks typically assume that the identity of the monitored machine is known at test time and that recordings are evaluated in a machine-wise manner. However, in realistic monitoring scenarios with multiple known machines operating concurrently, test recordings may not be reliably attributable to a specific machine, and requiring machine identity imposes deployment constraints such as dedicated sensors per machine. To reveal performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, we consider a minimal modification of the ASD evaluation protocol in which test recordings from multiple machines are merged and evaluated jointly without access to machine identity at inference time. Training data and evaluation metrics remain unchanged, and machine identity labels are used only for post hoc evaluation. Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.

2602.07029 2026-05-14 eess.IV cs.CV

Guidestar-Free Adaptive Optics with Asymmetric Apertures

Weiyun Jiang, Haiyun Guo, Christopher A. Metzler, Ashok Veeraraghavan

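Affiliations: Rice University; University of Maryland, College Park

AI summary: This paper presents a closed-loop adaptive optics system that corrects optical aberrations in real time without a guidestar or a wavefront sensor. The method builds on asymmetric apertures and machine learning, combining wavefront sensing, point-spread-function estimation, and optical correction. In experiments on dense natural scenes, it outperforms state-of-the-art guidestar-free wavefront shaping methods while using an order of magnitude fewer measurements and three orders of magnitude less computation.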

Comments Accepted to ACM Transactions on Graphics (TOG)

Abstract

This work introduces the first closed-loop adaptive optics (AO) system capable of optically correcting aberrations in real-time without a guidestar or a wavefront sensor. Nearly 40 years ago, Cederquist et al. demonstrated that asymmetric apertures enable phase retrieval (PR) algorithms to perform fully computational wavefront sensing, albeit at a high computational cost. More recently, Chimitt et al. extended this approach with machine learning and demonstrated real-time wavefront sensing using only a single (guidestar-based) point-spread-function (PSF) measurement. Inspired by these works, we introduce a guidestar-free AO framework built around asymmetric apertures and machine learning. Our approach combines three key elements: (1) an asymmetric aperture placed at the system's pupil plane that enables PR-based wavefront sensing, (2) a pair of machine learning algorithms that estimate the PSF from natural scene measurements and reconstruct phase aberrations, and (3) a spatial light modulator that performs optical correction. We experimentally validate this framework on dense natural scenes imaged through unknown obscurants. Our method outperforms state-of-the-art guidestar-free wavefront shaping methods, using an order of magnitude fewer measurements and three orders of magnitude less computation.

2602.06021 2026-05-14 stat.ML cs.LG cs.NA math.NA math.PR

Diffusion Model's Generalization Can Be Characterized by Inductive Biases toward a Data-Dependent Ridge Manifold

Ye He, Yitong Qiu, Molei Tao

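Affiliations: Georgia Institute of Technology; University of Science and Technology of China

AI summary: This paper studies where a diffusion model's generated samples go when the model does not memorize its training data, and characterizes generalization from a data-dependent geometric perspective. The authors introduce a time-dependent family of log-density ridge manifolds to characterize reverse-time inference and show that generated samples follow a reach-align-slide mechanism. They connect this geometry to training dynamics and, for random feature models, separate architectural bias from optimization error quantitatively. Experiments on synthetic data and MNIST support the predictions.

Abstract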

We study a data-dependent notion of diffusion-model generalization: when a model does not memorize the training set, where do its generated samples go relative to the geometry induced by the data? To answer this, we introduce a time-dependent family of log-density ridge manifolds constructed from the smoothed empirical distribution, and use it to characterize reverse-time inference. Our main result shows that generated samples evolve by a reach-align-slide mechanism: they first enter a neighborhood of the ridge, then their distance to the ridge is controlled by the normal component of training error, and finally their motion along the ridge is controlled by the tangential component. We further connect this geometric picture to training dynamics through directional decompositions of the learned error, and make this link explicit for random feature models, where architectural bias and optimization error can be separated quantitatively. Experiments on synthetic multimodal data and MNIST latent diffusion support the predicted geometric behavior in both low and high dimensions.

2602.02791 2026-05-14 stat.ML cs.LG math.ST stat.TH

Plug-In Classification of Drift Functions in Diffusion Processes Using Neural Networks

Yuzhen Zhao, Jiarong Fan, Yating Liu

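Affiliations: Université Paris-Dauphine, PSL Chaire DIALog, Fondation du Risque Institut Louis Bachelier; LaMME, University of Paris-Saclay; CEREMADE, CNRS Université Paris-Dauphine, PSL

AI summary: This paper studies supervised multiclass classification for diffusion processes, where each class is characterized by a distinct drift function and trajectories are observed at discrete times. The authors propose a neural-network-based plug-in classifier that estimates the class-specific drifts and, under standard regularity assumptions, establish convergence rates for the excess misclassification risk that make explicit the roles of drift estimation, time discretization, and dimension. The analysis indicates that exploiting the diffusion structure yields sharper guarantees than direct trajectory-based neural classifiers, and numerical experiments support the method across dimensions.

Abstract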

We study supervised multiclass classification for diffusion processes, where each class is characterized by a distinct drift function and trajectories are observed at discrete times. We first derive a multidimensional Bayes rule and then construct a plug-in classifier by estimating the class-specific drifts with neural networks. Under standard regularity assumptions, we establish convergence rates for the excess misclassification risk, making explicit the contributions of drift estimation, time discretization, and dimension. Our analysis also highlights the benefit of exploiting the diffusion structure: the drift is learned from all observed increments, leading to sharper guarantees than direct trajectory-based neural classifiers in the considered setting. Numerical experiments support the theory: the proposed method achieves better classification performance than Denis et al. (2024) in dimension one, remains effective in higher dimensions when the drift functions admit a compositional structure, and outperforms end-to-end neural classifiers trained directly on trajectories, as in Bos & Schmidt-Hieber (2022).
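
The plug-in idea can be sketched in one dimension. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes a unit diffusion coefficient and equal class priors, and it uses two hand-written drift functions as stand-ins for neural-network estimates. Each class is scored with the Euler-discretized log-likelihood of the observed increments, and the trajectory is assigned to the highest-scoring class.

```python
import math
import random

def simulate(drift, x0, dt, n_steps, noise_scale=1.0, seed=0):
    """Euler-Maruyama path of dX = drift(X) dt + noise_scale dW (hypothetical test data)."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        step = drift(x) * dt + noise_scale * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + step)
    return path

def drift_score(path, dt, drift):
    """Discretized log-likelihood up to a class-independent term, for unit diffusion:
    sum_i b(X_i) (X_{i+1} - X_i) - 0.5 * sum_i b(X_i)^2 dt."""
    score = 0.0
    for x_now, x_next in zip(path, path[1:]):
        b = drift(x_now)
        score += b * (x_next - x_now) - 0.5 * b * b * dt
    return score

def classify(path, dt, drifts):
    """Plug-in rule with equal priors: pick the drift with the largest score."""
    scores = [drift_score(path, dt, b) for b in drifts]
    return max(range(len(drifts)), key=lambda k: scores[k])

# Stand-ins for estimated drifts: mean reversion toward 0 (class 0) or 1 (class 1).
drifts = [lambda x: -x, lambda x: 1.0 - x]

path = simulate(drifts[1], x0=0.0, dt=0.01, n_steps=500)
print("predicted class:", classify(path, 0.01, drifts))
```

Every increment of the trajectory contributes to the score, which reflects the abstract's point that the drift is learned from all observed increments. With a constant diffusion coefficient other than one, all scores are rescaled by the same positive factor, so the predicted class under equal priors is unchanged.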

2602.00586 2026-05-14 q-bio.MN cs.AI cs.LG

RAG-GNN: Integrating Retrieved Knowledge with Graph Neural Networks for Precision Medicine

Hasi Hays, William J. Richardson

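Affiliations: Department of Chemical Engineering, University of Arkansas

AI summary: This work presents RAG-GNN, an end-to-end trainable framework that combines graph neural network (GNN) representations with dynamically retrieved knowledge from the biomedical literature to improve functional clustering for precision medicine. Using a jointly optimized retrieval projection, a gated fusion mechanism, and contrastive alignment, RAG-GNN improves functional clustering in a cancer signaling case study, with retrieval improving both intra-cluster cohesion and cluster agreement. On functional clustering, the method outperforms the GNN-only ablation that relies on graph structure alone, which suggests that topology-only and retrieval-augmented approaches serve complementary purposes.

Abstract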

Network topology excels at structural predictions but fails to capture functional semantics encoded in biomedical literature. We present RAG-GNN, an end-to-end trainable retrieval-augmented graph neural network framework that integrates GNN representations with dynamically retrieved literature-derived knowledge through a jointly optimized retrieval projection, gated fusion mechanism, and contrastive alignment. In a cancer signaling case study (379 proteins, 3,498 interactions, 14 functional categories), RAG-GNN improves functional clustering from silhouette $= -0.237 \pm 0.065$ (GNN-only) to $-0.144 \pm 0.066$, a consistent improvement of $+0.093 \pm 0.022$ across 10 random seeds, while the learned retrieval achieves mean precision@10 $= 0.242$, a 152\% improvement over the random baseline ($0.096$). Heuristic information decomposition with bootstrap confidence intervals reveals that topology and retrieval encode overwhelmingly shared information (95.6\%), with retrieval improving both intra-cluster cohesion (silhouette) and cluster agreement (ARI $+0.021 \pm 0.015$). Counterfactual experiments confirm that adversarial, absent, and random retrieval all degrade performance, validating that the gated fusion mechanism depends on document content. Benchmarking against eight established embedding methods demonstrates task-specific complementarity: topology-focused methods achieve strong link prediction, while retrieval augmentation consistently improves functional clustering within the controlled GNN-only ablation. DDR1 subnetwork analysis provides confirmatory validation consistent with established synthetic lethality relationships. These results establish that topology-only and retrieval-augmented approaches serve complementary purposes for precision medicine applications.
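
The headline numbers in the abstract can be reproduced with two lines of arithmetic. The snippet below uses only the mean values quoted above; the variable names are illustrative.

```python
# Mean silhouette scores quoted in the abstract.
gnn_only = -0.237
rag_gnn = -0.144
silhouette_gain = rag_gnn - gnn_only

# Mean precision@10 for the learned retrieval and the random baseline.
precision_learned = 0.242
precision_random = 0.096
relative_improvement = (precision_learned - precision_random) / precision_random

print(f"silhouette gain: {silhouette_gain:+.3f}")                  # +0.093
print(f"precision@10 improvement: {relative_improvement:.0%}")     # 152%
```

Both silhouette scores remain negative, so the gain is a relative improvement in cluster cohesion; negative values still indicate that proteins sit closer, on average, to a neighboring cluster than to their own.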

2601.17187 2026-05-14 cs.IT cs.AI math.IT

High-Rate Quantized Matrix Multiplication I

Or Ordentlich, Yury Polyanskiy

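Affiliations: Hebrew University of Jerusalem; MIT; MIT-IBM Watson AI Lab

AI summary: This paper studies quantized matrix multiplication (MatMul), which is crucial for the efficient deployment of large language models. It considers a generic MatMul setting in which both weights and activations are quantized without prior statistical information, reviews the fundamental information-theoretic tradeoff between quantization rate and distortion, and compares it with the performance of popular quantization schemes, for which it also derives accurate heuristic approximations. The weight-only quantization setting is treated in Part II.

Abstract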

This paper investigates the problem of quantized matrix multiplication (MatMul), which has become crucial for the efficient deployment of large language models (LLMs). We consider a Generic MatMul setting, where both matrices must be quantized (weight+activation quantization) without specific a priori (calibration) statistical information about the factors. We review the fundamental information-theoretic tradeoff between quantization rate and distortion (high-rate theory), and contrast those with the performance of popular quantization schemes (absmax INT and floating-point (FP)), for which we also derive accurate heuristic approximations. Part II of this paper studies the weight-only quantization setup where second-order statistics of the activation matrices are available at the encoder.
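
For readers unfamiliar with the absmax INT scheme mentioned above, here is a minimal sketch of the common textbook form: each vector is scaled so that its largest magnitude maps to the top integer level, products are accumulated in integers, and the two scales are applied at the end. The per-row and per-column grouping and the function names are assumptions made for illustration; the paper may analyze a different variant.

```python
def absmax_quantize(vector, bits=8):
    """Symmetric absmax quantization: the largest magnitude maps to the top integer level."""
    levels = 2 ** (bits - 1) - 1           # 127 for INT8
    peak = max(abs(v) for v in vector)
    scale = peak / levels if peak > 0 else 1.0
    codes = [round(v / scale) for v in vector]
    return codes, scale

def quantized_dot(a, b, bits=8):
    """Quantize both operands, accumulate in integers, then rescale once."""
    qa, sa = absmax_quantize(a, bits)
    qb, sb = absmax_quantize(b, bits)
    integer_acc = sum(x * y for x, y in zip(qa, qb))
    return integer_acc * sa * sb

def quantized_matmul(A, B, bits=8):
    """Rows of A and columns of B each get their own absmax scale (an assumed grouping)."""
    columns = list(zip(*B))
    return [[quantized_dot(row, col, bits) for col in columns] for row in A]

A = [[0.5, -1.0, 0.25], [1.5, 0.2, -0.3]]
B = [[2.0, 0.4], [0.1, -1.2], [-0.7, 0.9]]
exact = [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]
approx = quantized_matmul(A, B, bits=8)
print(exact)
print(approx)
```

Lowering `bits` coarsens the integer grid and generally increases the gap between the two printed matrices, which is the rate-distortion tradeoff the paper quantifies.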

2601.15280 2026-05-14 cs.HC cs.AI

LLM-based Multimodal Feedback Produces Equivalent Learning and Better Student Perceptions than Educator Feedback

Chloe Qianhui Zhao, Jie Cao, Jionghao Lin, Kenneth R. Koedinger

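Affiliations: Carnegie Mellon University; The University of North Carolina at Chapel Hill; The University of Hong Kong

AI summary: This study introduces a real-time multimodal feedback system based on large language models that combines structured textual explanations with dynamic multimedia resources. In the experiment, the system achieved learning gains equivalent to educator feedback while performing better on perceived clarity, specificity, conciseness, motivation, and cognitive load. Engagement patterns also differed by question type, which highlights the potential of AI feedback for teaching at scale.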

Comments 11 pages, to be published at the 16th International Learning Analytics & Knowledge Conference (LAK '26)

Abstract

Providing timely, targeted, and multimodal feedback helps students quickly correct errors, build deep understanding and stay motivated, yet making it at scale remains a challenge. This study introduces a real-time AI-facilitated multimodal feedback system that integrates structured textual explanations with dynamic multimedia resources, including the retrieved most relevant slide page references and streaming AI audio narration. In an online crowdsourcing experiment, we compared this system against fixed business-as-usual feedback by educators across three dimensions: (1) learning effectiveness, (2) learner engagement, (3) perceived feedback quality and value. Results showed that AI multimodal feedback achieved learning gains equivalent to original educator feedback while significantly outperforming it on perceived clarity, specificity, conciseness, motivation, satisfaction, and reducing cognitive load, with comparable correctness, trust, and acceptance. Process logs revealed distinct engagement patterns: for multiple-choice questions, educator feedback encouraged more submissions; for open-ended questions, AI-facilitated targeted suggestions lowered revision barriers and promoted iterative improvement. These findings highlight the potential of AI multimodal feedback to provide scalable, real-time, and context-aware support that both reduces instructor workload and enhances student experience.

2512.06109 2026-05-14 math.OC cs.LG cs.RO cs.SY eess.SY

Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions

Ajinkya Bhole, Mohammad Mahmoudi Filabadi, Guillaume Crevecoeur, Tom Lefebvre

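Affiliations: Department of Electromechanical, Systems and Metal Engineering, Ghent University, Ghent, Belgium; Core lab MIRO, Flanders Make, Belgium

AI summary: This paper unifies several optimal control formulations through the lens of Kullback-Leibler (KL) regularization. It proposes a central problem that separates the KL penalties on policies and transitions and gives them independent weights, generalizing the trajectory-level KL regularization common in probabilistic optimal control. The framework covers classical Stochastic Optimal Control (SOC), Risk-Sensitive Stochastic Optimal Control (RSOC), and their soft-policy counterparts, and shows that iterating the soft-policy solutions recovers the original objectives. It also identifies a synchronized case in which the policy and transition KL weights coincide, yielding a linear Bellman operator and a path-integral solution and extending these computational advantages to a broader class of control problems.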

Comments refurbished introduction, added a few remarks, reduced size

Abstract

This paper develops a unified perspective on several optimal control formulations through the lens of Kullback-Leibler (KL) regularization. We propose a central problem that separates the KL penalties on policies and transitions with independent weights, thus generalizing the standard trajectory-level KL-regularization used in probabilistic optimal control. This umbrella formulation recovers various control problems: the classical Stochastic Optimal Control (SOC), Risk-Sensitive Stochastic Optimal Control (RSOC), and their policy-based KL-regularized counterparts, termed soft-policy SOC and RSOC, which yield tractable surrogates. Beyond being regularized variants, these soft-policy formulations majorize the original SOC and RSOC, thus, iterating their solutions recovers the original objectives. We further identify a synchronized case of soft-policy RSOC where the policy and transition KL weights coincide, yielding a linear Bellman operator, path-integral solution, and compositionality -- extending these computationally favourable properties to a broad class of control problems.

2511.10709 2026-05-14 quant-ph cs.LG

Limitations of Quantum Advantage in Unsupervised Machine Learning

Apoorva D. Patel

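Affiliations: Centre for High Energy Physics, Indian Institute of Science, Bangalore; International Centre for Theoretical Sciences, Bangalore

AI summary: This paper discusses the possible advantage of quantum computing in unsupervised machine learning and its limitations. Quantum models fit data with density matrices in place of classical probability distributions, but an advantage arises only for particular input data and targeted observables. Explicit examples bring out the constraints that limit quantum advantage and show how its extent depends on the problem.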

Comments 4 pages, 1 figure. Invited talk at the 2025 IEEE International Conference on Quantum Control, Computing and Learning (IEEE qCCL2025), Hong Kong, June 2025. Published in the proceedings, pp. 39-42 (v2) Published version

Journal ref Proceedings of IEEE qCCL2025, June 2025, pp. 39-42

Abstract

Machine learning models are used for pattern recognition analysis of big data, without direct human intervention. The task of unsupervised learning is to find the probability distribution that would best describe the available data, and then use it to make predictions for observables of interest. Classical models generally fit the data to Boltzmann distribution of Hamiltonians with a large number of tunable parameters. Quantum extensions of these models replace classical probability distributions with quantum density matrices. An advantage can be obtained only when features of density matrices that are absent in classical probability distributions are exploited. Such situations depend on the input data as well as the targeted observables. Explicit examples are discussed that bring out the constraints limiting possible quantum advantage. The problem-dependent extent of quantum advantage has implications for both data analysis and sensing applications.

2510.04698 2026-05-14 q-bio.NC cs.AI econ.TH

The Bayesian Origin of the Probability Weighting Function in Human Representation of Probabilities

Xin Tong, Thi Thu Uyen Hoang, Xue-Xin Wei, Michael Hahn

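Affiliations: Saarland University; The University of Texas at Austin

AI summary: Humans systematically distort probabilities in a stereotyped inverse-S pattern whose origin has long been unexplained. This paper proposes a Bayesian encoding-decoding account in which probabilities are encoded by noisy internal signals and decoded by Bayes-risk minimization. The distortion decomposes into boundary regression, likelihood repulsion, and prior attraction, which predicts that the inverse-S weighting pattern arises from a U-shaped allocation of encoding precision, with greater sensitivity near 0 and 1. The U-shaped encoding is recovered from data without imposing a functional form, and the framework outperforms deterministic weighting functions and other models across several tasks.

Abstract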

Humans systematically misrepresent probability in a stereotyped inverse-S pattern. It has been documented for decades, but its origin remains unexplained. We propose a Bayesian encoding-decoding account in which probabilities are represented by noisy internal signals and decoded by Bayes-risk minimization. For bounded probability stimuli, we show that distortion decomposes into boundary regression, likelihood repulsion, and prior attraction, yielding a key prediction: the classic inverse-S-shaped weighting pattern implies a U-shaped allocation of encoding precision with greater sensitivity near 0 and 1. Across judgment of relative frequency, lottery pricing, and risky choice, this U-shape is recovered from data without imposing any functional form on the encoding, and our framework outperforms deterministic weighting functions, bounded log-odds models, uniform-encoding Bayesian accounts, and matched efficient-coding models on held-out data. In a new dot probability estimation experiment with bimodal stimulus statistics, the recovered prior tracks the new distribution while the recovered encoding remains U-shaped. Together, these results identify the inverse-S-shaped probability weighting function as the joint product of a stable U-shaped encoding and a flexible prior, integrated by optimal Bayesian decoding.

2510.03992 2026-05-14 cs.CR cs.AI

Quantitative Certification of Agentic Tool Selection

Jehyeok Yeon, Isha Chaudhary, Gagandeep Singh

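Affiliations: University of Illinois Urbana-Champaign

AI summary: As large language models (LLMs) are increasingly deployed in agentic systems, mapping user intents to the right external tools becomes a critical problem. This paper introduces LLMCert-T, a statistical framework that quantitatively certifies the safety of a tool-selection pipeline under a realistic tool distribution by returning high-confidence upper bounds on the probability that a safety specification is satisfied. It models tool-selection evaluation as a Bernoulli estimation problem and simulates deployment conditions with a conditional, round-by-round generative process. Under the Distractor Selection and Top-N Saturation specifications, current LLM agents show a large drop in certified correctness.

Abstract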

Large language models (LLMs) are increasingly deployed in agentic systems, where a fundamental task is mapping user intents to relevant external tools. Errors in tool selection can have severe outcomes, such as unauthorized data access, even without modifying the agent's underlying model. Existing evaluations measure performance on curated, benign benchmarks. However, a pipeline's behavior in deployment depends on the tool pool the agent actually encounters, which in open registries is shaped by third parties. We introduce LLMCert-T, the first statistical framework that returns high-confidence upper bounds on the probability that a tool-selection pipeline satisfies a declared safety specification under a realistic tool distribution. LLMCert-T models tool-selection evaluation as a Bernoulli estimation problem, drawing inserted-tool sequences from a distribution that the safety specification fixes. To evaluate robustness against realistic deployment conditions, we instantiate this distribution as a stochastic process that generates inserted-tool sequences round by round, conditioning each round on the agent's selection in the previous round. LLMCert-T aggregates the per-trial outcomes into a one-sided Clopper-Pearson upper bound on the probability that the specification is satisfied. By returning this bound as a certificate with statistical guarantees over the inserted-tool sequence distribution, LLMCert-T makes safety claims intuitive, actionable, and comparable across models, retrievers, mitigations, and registry policies. Across popular BFCL and OpenAPI tool pools, LLMCert-T shows that current LLM agents remain fragile under Distractor Selection and Top-N Saturation specifications: their certified correctness upper bounds drop to approximately 20%, far below their clean-pool lower bounds.
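
The one-sided Clopper-Pearson bound at the core of the certificate is a standard construction and can be sketched with the standard library alone. The code below is a generic illustration, not the LLMCert-T implementation: it treats each trial as a Bernoulli outcome (specification satisfied or not) and finds, by bisection, the largest success probability still consistent with the observed count at confidence 1 - alpha. The trial counts in the usage line are hypothetical.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson_upper(successes, trials, alpha=0.05):
    """One-sided (1 - alpha) upper confidence bound on a Bernoulli success probability."""
    if successes >= trials:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        # For successes < trials the CDF decreases in p; the bound is where it equals alpha.
        if binomial_cdf(successes, trials, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical run: the specification held in 12 of 100 sampled tool sequences.
print(clopper_pearson_upper(12, 100, alpha=0.05))
```

When zero successes are observed, the bound has the closed form 1 - alpha^(1/n), which is a convenient sanity check for the bisection. Libraries with a beta quantile function can compute the same bound directly.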

2510.00417 2026-05-14 math.OC cs.LG stat.ML

Progressively Sampled Equality-Constrained Optimization

Frank E. Curtis, Lingjun Guo, Daniel P. Robinson

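Affiliations: Department of Industrial and Systems Engineering, Lehigh University

AI summary: This paper proposes an algorithm for continuous nonlinear-equality-constrained optimization problems whose objective and constraint functions are defined by expectations or averages over large numbers of terms. The algorithm solves a sequence of related problems over progressively growing sample sets, which yields a better worst-case sample complexity bound than solving a single problem with the full sample sets. Numerical experiments indicate that the approach can be effective in practice.

Abstract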

An algorithm is proposed, analyzed, and tested for solving continuous nonlinear-equality-constrained optimization problems where the objective and constraint functions are defined by expectations or averages over large, finite numbers of terms. The main idea of the algorithm is to solve a sequence of related problems, each involving finite samples of objective- and constraint-function terms, over which the sample sets grow progressively. Under assumptions about the problem functions and their first- and second-order derivatives that are reasonable in real-world settings of interest, it is shown that -- with sufficiently large initial sample sizes -- solving a sequence of problems defined through progressive sampling yields a better worst-case sample complexity bound compared to solving a single problem with the full sets of samples. The results of numerical experiments with a set of test problems demonstrate that the proposed approach can be effective in practice.

2509.23800 2026-05-14 stat.ML cs.LG

Sample-Efficient Optimisation over the Outputs of Generative Models

Samuel Willis, Paul Duckworth, Jack Simons, Aleksandra Kalisz, Krisztina Sinkovics, Noam Ghenassia, Shikha Surana, Henry T. Oldroyd, Alexandru I. Stere, Dragos D Margineantu, Carl Henrik Ek, Henry Moss, Erik Bodin

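Affiliations: University of Cambridge; Lancaster University; Karolinska Institutet; Boeing Commercial Airplanes; Boeing AI; InstaDeep; Monumo

AI summary: This paper proposes O3, a method for sample-efficient black-box optimisation over the outputs of generative models, targeting continuous-variable diffusion and flow-matching models. It is built around surrogate latent spaces: low-dimensional Euclidean embeddings extracted from the generative model without additional training, with controllable dimensionality and direct support for standard optimisation algorithms. On image and protein design tasks, surrogate-space optimisation finds substantially higher-scoring samples than standard sampling or optimisation in the original latent space. The method is model- and optimiser-agnostic, adds negligible cost, and requires no retraining or fine-tuning of the generative model.

Abstract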

Modern generative AI models, such as diffusion and flow matching models, can sample from rich data distributions. However, many applications, especially in science and engineering, require more than drawing samples from the model distribution: they require searching within this distribution for samples that optimise task-specific criteria. In this work, we propose O3 (Optimisation Over the Outputs of Generative Models), a method for sample-efficient black-box optimisation over continuous-variable diffusion and flow-matching models. O3 is built around surrogate latent spaces: low-dimensional Euclidean embeddings that can be extracted from a generative model without additional training. The resulting representations have controllable dimensionality and support the direct application of standard optimisation algorithms. We show, on image and protein design tasks, that surrogate-space optimisation finds substantially higher-scoring samples than standard sampling or optimisation in the original latent space. Our method is model- and optimiser-agnostic, incurs negligible additional cost over standard generation, and requires no retraining or fine-tuning of the generative model.

2509.19929 2026-05-14 stat.ML cs.LG physics.comp-ph physics.data-an

Geometric Autoencoder Priors for Bayesian Inversion: Learn First Observe Later

Arnaud Vadeboncoeur, Gregory Duthé, Mark Girolami, Eleni Chatzi

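Affiliations * Department of Engineering, University of Cambridge; Institute of Structural Engineering, ETH Zürich

AI summary This paper proposes Geometric Autoencoders for Bayesian Inversion (GABI), a prior-learning framework for the highly ill-posed problem of recovering full-field information about a physical system from a small number of noisy observations. GABI learns generative models of the physical responses of systems with differing geometries and uses them as strong geometry-conditioned priors, improving the accuracy and robustness of uncertainty quantification (UQ) during inversion. The method requires no knowledge of governing equations or boundary conditions, uses Approximate Bayesian Computation (ABC) sampling for an efficient implementation, and is validated on several scenarios with complex geometries.

Details
English abstract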

Uncertainty Quantification (UQ) is paramount for inference in engineering. A common inference task is to recover full-field information of physical systems from a small number of noisy observations, a usually highly ill-posed problem. Sharing information from multiple distinct yet related physical systems can alleviate this ill-posedness. Critically, engineering systems often have complicated variable geometries prohibiting the use of standard multi-system Bayesian UQ. In this work, we introduce Geometric Autoencoders for Bayesian Inversion (GABI), a framework for learning geometry-aware generative models of physical responses that serve as highly informative geometry-conditioned priors for Bayesian inversion. Following a "learn first, observe later" paradigm, GABI distills information from large datasets of systems with varying geometries, without requiring knowledge of governing PDEs, boundary conditions, or observation processes, into a rich latent prior. At inference time, this prior is seamlessly combined with the likelihood of a specific observation process, yielding a geometry-adapted posterior distribution. Our proposed framework is architecture-agnostic. A creative use of Approximate Bayesian Computation (ABC) sampling yields an efficient implementation that utilizes modern GPU hardware. We test our method on: steady-state heat over rectangular domains; Reynolds-Averaged Navier-Stokes (RANS) flow around airfoils; Helmholtz resonance and source localization on 3D car bodies; RANS airflow over terrain. We find: the predictive accuracy to be comparable to deterministic supervised learning approaches in the restricted setting where supervised learning is applicable; UQ to be well calibrated and robust on challenging problems with complex geometries.
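
For readers unfamiliar with ABC, the following is a minimal sketch of generic ABC rejection sampling on a one-dimensional toy problem. It is not GABI's implementation: the Gaussian prior stands in for a learned latent prior, and the simulator, tolerance `eps`, and the name `abc_rejection` are all assumptions chosen for illustration.

```python
import random

def abc_rejection(observed, prior_sample, simulate, distance, eps, n_draws, rng):
    """Keep the prior draws whose simulated observation lands within eps of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)      # draw a candidate from the prior
        y_sim = simulate(theta, rng)   # push it through the observation process
        if distance(y_sim, observed) <= eps:
            accepted.append(theta)     # accepted draws approximate the posterior
    return accepted

rng = random.Random(0)
posterior = abc_rejection(
    observed=2.0,                                        # one noisy observation (toy value)
    prior_sample=lambda r: r.gauss(0.0, 3.0),            # stand-in for a learned prior
    simulate=lambda theta, r: theta + r.gauss(0.0, 0.5), # toy observation process
    distance=lambda y, y_obs: abs(y - y_obs),
    eps=0.2,
    n_draws=20000,
    rng=rng,
)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), posterior_mean)
```

Each draw is independent of the others, so the loop can be evaluated as one large batch, which is presumably what makes this family of samplers a good fit for GPU hardware.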

2506.12075 2026-05-14 cs.IR cs.AI

T-TExTS (Teaching Text Expansion for Teacher Scaffolding): Enhancing Text Selection in High School Literature through Knowledge Graph-Based Recommendation

Nirmal Gelal, Chloe Snow, Ambyr Rios, Kathleen M. Jagodnik, Hande Küçük McGinty

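Affiliations * Department of Computer Science; Department of Curriculum and Instruction; Kansas State University

AI summary This paper presents T-TExTS, a knowledge-graph-based recommendation system that helps high school English literature teachers select thematically aligned, diverse teaching texts more efficiently. The system builds a domain ontology for education and evaluates several graph embedding methods; experiments report strong recommendation performance across dataset sizes. The study shows that a hybrid model combining structured knowledge with pedagogical-merit signals retains interpretability while keeping high recommendation quality, offering a new approach for educational recommendation systems.

Comments Under Review

Details
English abstract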

High school English Literature teachers often encounter barriers to assembling diverse, thematically aligned text sets due to limited planning time and pedagogical resources. To address this need, we present T-TExTS (Teaching Text Expansion for Teacher Scaffolding), a knowledge graph (KG)-based recommendation system that suggests literature texts based on pedagogical merit rather than surface-level metadata. We construct a domain-specific ontology using the Knowledge Acquisition and Representation Methodology (KNARM), instantiate it as a knowledge graph with separate Terminological Box (TBox) and Assertional Box (ABox) components, and evaluate four graph embedding strategies (DeepWalk, biased random walk, hybrid embedding, and Node2Vec) across three dataset configurations (98, 196, and 351 texts) and two relation-weighting schemes. The experimental results reveal that traversal-level expert weighting alone does not outperform algorithmic structural tuning: Node2Vec achieves the highest Area Under the Curve (AUC) at every dataset size (0.9642-0.9750) and the strongest ranking metrics (Hits@K, MRR, nDCG) at larger scales. Combining structural and pedagogical signals through embedding concatenation, however, preserves both interpretability and competitive ranking quality, with the hybrid model maintaining a high AUC across all scales (0.9122-0.9350) and remaining within a few percentage points of Node2Vec on every ranking metric. These findings highlight the value of ontology-driven knowledge graph embeddings for educational recommendation systems and demonstrate that T-TExTS can meaningfully ease the burden of English Literature text selection for secondary educators, supporting more informed and inclusive curricular decisions. The source code for T-TExTS is available at https://github.com/koncordantlab/TTExTS.
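
The ranking metrics named in this abstract (Hits@K, MRR, nDCG) have standard definitions, sketched below. The abstract does not state the paper's exact evaluation protocol, so the function names, the 1-based rank convention, and the example inputs are assumptions for illustration only.

```python
import math

def hits_at_k(ranks, k):
    """Fraction of queries whose correct item is ranked in the top k (ranks are 1-based)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank of the correct item over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def ndcg_at_k(relevances, k):
    """nDCG@k for one ranked list of relevance grades (position 0 is the top result)."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical ranks of the correct text for four queries.
ranks = [1, 3, 2, 10]
print(hits_at_k(ranks, 3), mrr(ranks), ndcg_at_k([0, 1, 0], 3))
```

Hits@K only asks whether the correct text appears in the top K, while MRR and nDCG also reward placing it closer to the top, which is why a model can do well on one and less well on the others.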

2505.01012 2026-05-14 quant-ph cs.CR cs.LG

Quantum Support Vector Regression for Robust Anomaly Detection

Kilian Tscharke, Maximilian Wendlinger, Sebastian Issel, Pascal Debus

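Affiliations * Fraunhofer Institute for Applied and Integrated Security (AISEC)

AI summary This paper studies Quantum Support Vector Regression (QSVR) for robust anomaly detection, focusing on its robustness to noise and adversarial attacks. A benchmark on IBM quantum hardware over 11 datasets shows that QSVR maintains good classification performance under noise and even outperforms the noiseless simulation on some datasets. The study also finds that QSVR is fairly robust to several kinds of quantum noise but more sensitive to amplitude damping and miscalibration noise, and that it is highly vulnerable to adversarial attacks.

Comments Accepted to International Conference on Agents and Artificial Intelligence (ICAART) 2026

Details
English abstract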

Anomaly Detection (AD) is critical in data analysis, particularly within the domain of IT security. In this study, we explore the potential of Quantum Machine Learning for application to AD with special focus on the robustness to noise and adversarial attacks. We build upon previous work on Quantum Support Vector Regression (QSVR) for semi-supervised AD by conducting a comprehensive benchmark on IBM quantum hardware using eleven datasets. Our results demonstrate that QSVR achieves strong classification performance and even outperforms the noiseless simulation on two of these datasets. Moreover, we investigate the influence of quantum noise, which is inevitable in the NISQ era, on the performance of the QSVR. Our findings reveal that the model exhibits robustness to depolarizing, phase damping, phase flip, and bit flip noise, while amplitude damping and miscalibration noise prove to be more disruptive. Finally, we explore the domain of Quantum Adversarial Machine Learning by demonstrating that QSVR is highly vulnerable to adversarial attacks, with neither quantum noise nor adversarial training improving the model's robustness against such attacks.

2502.20427 2026-05-14 cs.CR cs.AI cs.SD eess.AS

DeePen: Penetration Testing for Audio Deepfake Detection

Nicolas Müller, Piotr Kawa, Adriana Stan, Thien-Phuc Doan, Souhwan Jung, Wei Herng Choong, Philip Sperl, Konstantin Böttinger

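Affiliations * Technical University of Cluj-Napoca; AISRC, Soongsil University

AI summary This paper presents DeePen, a systematic penetration testing methodology for assessing the robustness of machine-learning-based audio deepfake detection classifiers. The method requires no knowledge of or access to the target detection model; instead it probes for vulnerabilities using a set of carefully designed signal-processing attacks. The study finds that both deployed production systems and public academic models can be fooled by simple manipulations such as time-stretching or adding echo, indicating that current deepfake detection still faces serious challenges.

Details
English abstract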

Deepfakes - manipulated or forged audio and video media - pose significant security risks to individuals, organizations, and society at large. To address these challenges, machine learning-based classifiers are commonly employed to detect deepfake content. In this paper, we assess the robustness of such classifiers through a systematic penetration testing methodology, which we introduce as DeePen. Our approach operates without prior knowledge of or access to the target deepfake detection models. Instead, it leverages a set of carefully selected signal processing modifications - referred to as attacks - to evaluate model vulnerabilities. Using DeePen, we analyze both real-world production systems and publicly available academic model checkpoints, demonstrating that all tested systems exhibit weaknesses and can be reliably deceived by simple manipulations such as time-stretching or echo addition. Furthermore, our findings reveal that while some attacks can be mitigated by retraining detection systems with knowledge of the specific attack, others remain persistently effective.
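
As a rough illustration of what a signal-processing attack of this kind looks like, the sketch below adds a single echo to a clip and measures how often a black-box detector is evaded. The names `add_echo`, `evasion_rate`, and `toy_detector` are hypothetical; the energy-threshold detector is a deliberately trivial stand-in, not one of the systems tested in the paper, and the delay and decay values are arbitrary.

```python
def add_echo(samples, delay, decay):
    """Add one delayed, attenuated copy: y[n] = x[n] + decay * x[n - delay]."""
    out = list(samples)
    for n in range(delay, len(samples)):
        out[n] += decay * samples[n - delay]
    return out

def evasion_rate(detector, fake_clips, attack):
    """Among fake clips the detector flags, the fraction it misses after the attack.

    The detector is only called as a black box (clip in, True/False out).
    """
    flagged = [clip for clip in fake_clips if detector(clip)]
    if not flagged:
        return 0.0
    evaded = sum(1 for clip in flagged if not detector(attack(clip)))
    return evaded / len(flagged)

def toy_detector(clip):
    """Toy stand-in: flags a clip as fake when its total absolute amplitude is low."""
    return sum(abs(s) for s in clip) < 1.5

clips = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0]]
rate = evasion_rate(toy_detector, clips, lambda c: add_echo(c, delay=2, decay=0.5))
print(rate)
```

The harness never inspects the detector's internals, which mirrors the no-access setting described in the abstract. A real evaluation would use actual audio and a genuine time-stretching routine rather than this toy.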

2502.11583 2026-05-14 stat.ML cs.LG

Distributional Autoencoders Know the Score

Andrej Leban

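Affiliations * Department of Statistics, University of Michigan

AI summary This paper studies the Distributional Principal Autoencoder (DPA), which aims to unify distributionally correct reconstruction with interpretable encodings. Through theoretical analysis, the author establishes an exact relation between the optimal level-set geometry and the score of the data distribution, which explains how DPA disentangles factors of variation in the data and allows the score function to be recovered directly from samples. When the data follow a Boltzmann distribution, this relation can be used to approximate the minimum free-energy path in a single fit. The paper also proves that when the data lie on a manifold that the encoder can approximate, latent variables beyond the manifold dimension are conditionally independent of the data distribution, thereby revealing the intrinsic dimension of the data. Together, these results show that a single model can learn both the data distribution and its intrinsic dimension with guarantees, unifying two longstanding goals of unsupervised learning.

Comments NeurIPS 2025 - camera-ready version

Journal ref Advances in Neural Information Processing Systems 38 (NeurIPS 2025), 2025

Details
English abstract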

The Distributional Principal Autoencoder (DPA) combines distributionally correct reconstruction with principal-component-like interpretability of the encodings. In this work, we provide exact theoretical guarantees on both fronts. First, we derive a closed-form relation linking each optimal level-set geometry to the data-distribution score. This result explains DPA's empirical ability to disentangle factors of variation of the data and allows the score to be recovered directly from samples. When the data follows the Boltzmann distribution, we demonstrate that this relation yields an approximation of the minimum free-energy path for the Mueller-Brown potential in a single fit. Second, we prove that if the data lies on a manifold that can be approximated by the encoder, latent components beyond the manifold dimension are conditionally independent of the data distribution, carrying no additional information, and thus reveal the intrinsic dimension. Together, these results show that a single model can learn the data distribution and its intrinsic dimension with exact guarantees simultaneously, unifying two longstanding goals of unsupervised learning.

1804.01050 2026-05-14 stat.ML cs.CV cs.LG

Training VAEs Under Structured Residuals

Gara Dorta, Sara Vicente, Lourdes Agapito, Neill D. F. Campbell, Ivor Simpson

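Affiliations * University of Bath; Anthropics Technology Ltd.; University College London

AI summary This paper studies how to better model structured correlations in image reconstruction residuals within variational autoencoders (VAEs). Conventional VAEs assume that per-pixel uncertainty is independent, but actual reconstruction residuals are often highly structured. The authors therefore introduce a structured Gaussian likelihood prediction network into the VAE to model the residual correlations, improving the VAE's uncertainty modeling on color images and its generation quality with only a small increase in model complexity.

Comments Simplified training methodology, added more results

Details
English abstract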

Variational auto-encoders (VAEs) are a popular and powerful deep generative model. Previous works on VAEs have assumed a factorized likelihood model, whereby the output uncertainty of each pixel is assumed to be independent. This approximation is clearly limited as demonstrated by observing a residual image from a VAE reconstruction, which often possesses a high level of structure. This paper demonstrates a novel scheme to incorporate a structured Gaussian likelihood prediction network within the VAE that allows the residual correlations to be modeled. Our novel architecture, with minimal increase in complexity, incorporates the covariance matrix prediction within the VAE. We also propose a new mechanism for allowing structured uncertainty on color images. Furthermore, we provide a scheme for effectively training this model, and include some suggestions for improving performance in terms of efficiency or modeling longer range correlations.

2605.13337 2026-05-14 cs.CR cs.LG

Context-Aware Web Attack Detection in Open-Source SIEM Systems via MITRE ATT&CK-Enriched Behavioral Profiling

Badr Alboushy, Assef Jafar, Mohamad Aljnidi, Mohamad Bashar Disoki, Aref Shaheed

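Affiliations * Higher Institute for Applied Sciences and Technology; Syrian Private University; Arab International University; Latakia University

AI summary This study proposes Smart-SIEM, an intelligent SIEM system based on behavioural analysis, for detecting web attacks on open-source platforms. Its core method combines behavioural features from the MITRE ATT&CK framework with machine learning models: it builds context-aware feature vectors and applies a two-stage hybrid model (LightGBM and XGBoost) for attack detection and classification. Experiments show that the method substantially outperforms a traditional rule engine in detection accuracy and classification performance, and that it includes a self-adaptive retraining mechanism for handling concept drift.

Comments 38 pages, 13 figures, 13 tables

Details
English abstract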

Security Information and Event Management (SIEM) systems aggregate log data from heterogeneous sources to detect coordinated attacks. Traditional rule-based correlation engines struggle to classify multi-step web application attacks because they examine each event without reference to the behavioural history of the originating host. We present Smart-SIEM, an AI module for the open-source Wazuh SIEM platform with two contributions: (1) a per-source-IP behavioural context vector encoding HTTP response-status distributions, peak rule activation counts, and MITRE ATT&CK technique frequencies from the N most recent prior events; (2) a two-stage hybrid cascade combining LightGBM for binary attack detection and XGBoost for six-class attack categorisation. Evaluated on 46,454 purpose-built Wazuh security events, context features improve all tested gradient boosting algorithms from ~0.705 macro F1 to 0.947-0.967 (Stage 1) and 0.876-0.914 (Stage 2), an average gain of +0.254 and +0.324 respectively. The hybrid cascade achieves F1 of 0.967 (binary) and 0.914 (six-class). Wazuh's native rule engine detects 0% of Brute Force and Broken Authentication events; the AI module detects 100% and 98.3% respectively. A self-adaptive retraining mechanism recovers from concept drift: F1 drops from 0.905 to 0.465 when unseen attack types emerge, recovering to 0.814 after retraining on the combined corpus.
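
A per-source-IP context vector of the kind described in contribution (1) might be assembled roughly as follows. This is a sketch under stated assumptions, not the Smart-SIEM code: the event field names (`src_ip`, `status`, `rule_id`, `techniques`), the two-technique vocabulary, the window size, and the reading of "peak rule activation count" as the largest count of any single rule in the window are all hypothetical choices.

```python
from collections import Counter, defaultdict, deque

STATUS_CLASSES = ("2xx", "3xx", "4xx", "5xx")
TECHNIQUES = ("T1110", "T1190")  # illustrative subset of MITRE ATT&CK technique IDs

def context_vector(history, techniques=TECHNIQUES):
    """Summarise a host's prior events as status shares, peak rule count, technique rates."""
    n = len(history)
    status = Counter(f"{e['status'] // 100}xx" for e in history)
    rules = Counter(e["rule_id"] for e in history)
    tech = Counter(t for e in history for t in e["techniques"])
    vec = [status[c] / n if n else 0.0 for c in STATUS_CLASSES]
    vec.append(max(rules.values()) if rules else 0)  # assumed meaning of "peak" count
    vec.extend(tech[t] / n if n else 0.0 for t in techniques)
    return vec

def featurize(events, n_recent=3):
    """Return one context vector per event, built only from that IP's prior events."""
    histories = defaultdict(lambda: deque(maxlen=n_recent))
    rows = []
    for e in events:
        h = histories[e["src_ip"]]
        rows.append(context_vector(h))  # computed before the current event is added
        h.append(e)
    return rows

# Hypothetical events: two failed logins from one host, then a success, then a new host.
events = [
    {"src_ip": "10.0.0.1", "status": 401, "rule_id": 5710, "techniques": ["T1110"]},
    {"src_ip": "10.0.0.1", "status": 401, "rule_id": 5710, "techniques": ["T1110"]},
    {"src_ip": "10.0.0.1", "status": 200, "rule_id": 31100, "techniques": []},
    {"src_ip": "10.0.0.2", "status": 200, "rule_id": 31100, "techniques": []},
]
rows = featurize(events)
print(rows[2])
```

The third event looks benign in isolation (HTTP 200), but its context vector records that the same host just produced two 4xx responses tagged with a brute-force technique, which is the kind of history a per-event rule engine does not see.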

2605.13315 2026-05-14 cs.ET cs.LG cs.NE cs.SY eess.SY q-bio.NC

Embodied Neurocomputation: A Framework for Interfacing Biological Neural Cultures with Scaled Task-Driven Validation

Johnson Zhou, Daniel Tanneberg, Forough Habibollahi, Alon Loeffler, Kiaran Lawson, Valentina Baccetti, Kwaku Dad Abu-Bonsrah, Candice Desouza, Finn Doensen, Bradley Watmuff, Daria Kornienko, Azin Azadi, Justin Leigh Bourke, Bernhard Sendhoff, Brett J. Kagan

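Affiliations * Cortical Labs, Australia; Honda Research Institute Europe, Germany

AI summary This study proposes an "Embodied Neurocomputation" framework to address the problem of optimal encoding and decoding at the interface between biological neural networks and conventional silicon computing. By optimizing parameters for a biological neural network agent performing a closed-loop navigation task in a simulated environment, the study identifies 12 configurations that learn consistently and that outperform silicon-based deep Q-network agents under the same interaction budget. The work lays a foundation for goal-directed learning with biological neural networks and advances task-driven neurocomputing and the establishment of field-wide benchmarks.

Details
English abstract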

Biological neural networks (BNNs) have been established as a powerful and adaptive substrate that offers the potential for highly energy- and data-efficient information processing with distinct learning mechanisms. Yet a core challenge to utilizing BNNs for neurocomputation is determining the optimal encoding and decoding mechanisms between the traditional silicon computing interface and the living biology. Here, we propose an Embodied Neurocomputation framework as a systems-level approach to this multi-variable optimization encoding/decoding problem. We operationalize this approach through the first large-scale parameter optimization of encoding configurations for a BNN agent performing closed-loop navigation along an odor-style gradient in a simulated grid-world. Despite the relative simplicity of the task, the biological interactions gave rise to a massive multi-combinatorial search space for optimal parameters. By considering how the components of the system are interconnected and parameterized, we evaluated approximately 1,300 parameter combinations, over 4,000 hours of real-time agent-environment interactions, to identify 12 configurations that consistently demonstrated learning across multiple episodes. These configurations achieved significantly higher task performances than optimized silicon-based DQN agents under the same interaction budget. These findings represent an initial step toward robust and scalable goal-oriented learning using BNNs. Our framework establishes a foundation for applying task-driven neurocomputing and supports the development of field-wide benchmarks. In the long term, this work supports the development of hybrid bio-silicon architectures capable of efficient, adaptive and real-time computation, including the potential for robotic control applications.

2605.13284 2026-05-14 stat.ML cs.LG math.ST stat.TH

Learning Perturbations to Extrapolate Your LLM

Zetai Cen, Chenfei Gu, Jin Zhu, Ting Li, Yunxiao Chen, Chengchun Shi

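Affiliations * School of Mathematics, University of Bristol; School of Statistics and Data Science, Shanghai University of Finance and Economics; School of Mathematics, University of Birmingham; Department of Statistics, London School of Economics and Political Science

AI summary This study aims to improve the generalization of large language models to unseen domains, proposing a method that perturbs token prefixes through a learnable transformation of a continuous latent vector. The method overcomes the limitations of conventional discrete, fixed perturbations; the authors derive unbiased estimating equations, optimize them with stochastic gradient descent, and establish the statistical properties of the estimator in over-parameterized regimes. Experiments show that the method significantly outperforms existing state-of-the-art methods on both synthetic and real-world datasets.

Comments 35 pages

Details
English abstract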

Recent advancements in large language models demonstrate that injecting perturbations can substantially enhance extrapolation performance. However, current approaches often rely on discrete perturbations with fixed designs, which limits their flexibility. In this work, we propose a framework where token prefixes are perturbed by a learnable transformation of a continuous latent vector within an embedding space. To overcome the challenge of an intractable marginal likelihood, we derive unbiased estimating equations for model parameters and optimize them via stochastic gradient descent. We establish the statistical properties of the resulting estimator in over-parameterized regimes. Empirical evaluations on both synthetic and real-world datasets demonstrate that our proposal yields significant gains in out-of-domain settings over a range of state-of-the-art baseline methods.

2605.13280 2026-05-14 cs.SE cs.AI

The Readability Spectrum: Patterns, Issues, and Prompt Effects in LLM-Generated Code

Hengzhi Ye, Fengyuan Ran, Weiwei Xu, Minghui Zhou

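Affiliations * Peking University; Wuhan University

AI summary As large language models (LLMs) become widely used in software development, the readability of generated code, a key non-functional attribute, remains understudied. This paper builds a comprehensive readability model combining textual, structural, program, and visual features, and systematically evaluates the readability of code generated by mainstream LLMs across thousands of scenarios. It finds that overall readability is comparable to that of human-written code but that LLM-generated code shows distinct patterns of readability issues. The study also shows that prompt design has some influence on readability but that its overall effect is limited, indicating that LLM-generated code still needs improvement with respect to long-term maintainability.

Details
English abstract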

As Large Language Models (LLMs) are transforming software development, the functional quality of generated code has become a central focus, leaving readability, one of the critical non-functional attributes, understudied. Given that LLM-generated code still needs human review before adoption, it is important to understand its readability, especially compared with human-written code, and the role of prompt design in shaping it. We therefore set out to conduct a systematic investigation into the code readability of LLM-generated code. To systematically quantify code readability, we establish a comprehensive readability model that synthesizes textual, structural, program, and visual features of code. Based on the model, we evaluate the readability of code generated by mainstream LLMs under 5,869 scenarios extracted from large code bases including World of Code (WoC) and LeetCode. We find that current LLMs produce code with overall readability comparable to human-written code, but with distinct readability issue patterns. We further examine how different prompt dimensions affect the readability of LLM-generated code, and find that function signatures, constraints and style descriptions emerge as the most influential factors, while the overall impact of prompt design remains limited. Our findings indicate that, on one hand, LLM-generated code is at least comparable to human-written code in readability, validating its potential for systematic integration into software workflows from a non-functional perspective; on the other hand, distinct readability issue patterns and limited effectiveness of prompt engineering reveal a latent technical debt, highlighting the need for future research to improve the readability of LLM-generated code and thus ensure long-term maintainability.

2605.13261 2026-05-14 cs.HC cs.AI

"It became a self-fulfilling prophecy": How Lived Experiences are Entangled with AI Predictions in Menstrual Cycle Tracking Apps

Wendy Zhou, Pelin Karaturhan, Alexandra Weilenmann, Jichen Zhu

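Affiliations * IT University of Copenhagen; Department of Applied Information Technology, University of Gothenburg

AI summary This paper studies how AI predictions in menstrual cycle tracking apps become entangled with users' lived experiences and shape their understanding of their bodily and mental states. Through semi-structured interviews and a group autoethnography, the study finds that users often interpret their own experiences in light of AI predictions, even though prediction accuracy is limited by imperfect logging, and that interface design offers little support for critical reflection. It also notes that non-normative users often feel isolated when interacting with the AI, and proposes design improvements for predictive AI features.

Details
English abstract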

In menstrual cycle tracking apps (MCTAs), AI-based predictions and insights have become increasingly popular. These features enable users to receive personalized information about their bodies and mental states. However, there is currently little research on how these predictive AI features and explanations affect users' lived experiences. This paper examines human-AI entanglement in MCTAs through 14 semi-structured user interviews and a group autoethnography. These methods uncover the processes leading to this phenomenon. Our results reveal that: (1) users understand their lived experiences in light of AI predictions, although these predictions can be faulty due to imperfect logging practices, (2) the user interface features and AI explanations do not support awareness or critical engagement with this entanglement and meaning-making, and (3) non-normative MCTA users report a sense of isolation in this entangled interaction. Based on our findings, we propose design implications for predictive AI features and explanations.