arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.11864 2026-05-13 cs.IR cs.AI cs.CV cs.MM

Very Efficient Listwise Multimodal Reranking for Long Documents

Yiqun Sun, Pengfei Wei, Lawrence B. Hsieh

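AI summary: This paper proposes ZipRerank, an efficient listwise multimodal reranking model that targets the computational bottlenecks in vision-centric retrieval and multimodal retrieval-augmented generation over long documents. The method shortens the input with a lightweight query-image early interaction mechanism and scores all candidates in a single forward pass, avoiding time-consuming autoregressive decoding. Experiments show that ZipRerank maintains high performance while significantly reducing LLM inference latency, making it suitable for latency-sensitive real-world applications.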

Comments To appear in ICML 2026

Abstract

Listwise reranking is a key yet computationally expensive component in vision-centric retrieval and multimodal retrieval-augmented generation (M-RAG) over long documents. While recent VLM-based rerankers achieve strong accuracy, their practicality is often limited by long visual-token sequences and multi-step autoregressive decoding. We propose ZipRerank, a highly efficient listwise multimodal reranker that directly addresses both bottlenecks. It reduces input length via a lightweight query-image early interaction mechanism and eliminates autoregressive decoding by scoring all candidates in a single forward pass. To enable effective learning, ZipRerank adopts a two-stage training strategy: (i) listwise pretraining on large-scale text data rendered as images, and (ii) multimodal finetuning with VLM-teacher-distilled soft-ranking supervision. Extensive experiments on the MMDocIR benchmark show that ZipRerank matches or surpasses state-of-the-art multimodal rerankers while reducing LLM inference latency by up to an order of magnitude, making it well-suited for latency-sensitive real-world systems. The code is available at https://github.com/dukesun99/ZipRerank.

2605.11850 2026-05-13 math.OC cs.LG

Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

Konstantinos Oikonomidis, Jan Quan, Kimon Antonakopoulos, Antonio Silveti-Falls, Volkan Cevher, Panagiotis Patrinos

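AI summary: This paper studies the convergence of constrained stochastic spectrally preconditioned gradient methods for nonconvex objectives. It proposes a family of stochastic algorithms that handle a wide variety of convex and nonconvex constraints and studies their convergence under heavy-tailed noise through a novel analysis tailored to the geometry of the methods. It also introduces a variance-reduced version to accelerate convergence, and shows that the polynomial iterations used in the Muon optimizer are better modeled by a nonlinear preconditioner, which more accurately reflects convergence behavior in practical implementations.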

Abstract

In this work, we develop proximal preconditioned gradient methods, with a focus on spectral gradient methods that provide a proximal extension of the Muon and Scion optimizers. We introduce a family of stochastic algorithms that can handle a wide variety of convex and nonconvex constraints and study their convergence under heavy-tailed noise through a novel analysis tailored to the geometry of the proposed methods. We further propose a variance-reduced version, which achieves faster convergence under standard noise assumptions. Finally, we show that the polynomial iterations used in Muon are more accurately captured by a nonlinear preconditioner than by the ideal matrix sign, leading to a convergence analysis that more faithfully reflects practical implementations.
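To illustrate the last point of the abstract, here is a minimal sketch of how a polynomial iteration differs from the ideal matrix sign. It assumes the classic cubic Newton-Schulz iteration, not Muon's tuned coefficients and not the paper's analysis; the matrix `G` and the step counts are arbitrary choices for illustration.

```python
import numpy as np

def newton_schulz(G, steps):
    """Classic cubic Newton-Schulz iteration toward the polar factor of G."""
    X = G / np.linalg.norm(G)  # Frobenius scaling keeps singular values in (0, 1]
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3.
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

def matrix_sign(G):
    """Ideal target: U @ Vt from the thin SVD (all singular values set to 1)."""
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt

# Build a 4x6 test matrix with known singular values (illustrative only).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
S = np.zeros((4, 6))
idx = np.arange(4)
S[idx, idx] = [2.0, 1.0, 0.5, 0.1]
G = U @ S @ V.T

exact = matrix_sign(G)
err_few = np.linalg.norm(newton_schulz(G, 3) - exact)    # truncated iteration
err_many = np.linalg.norm(newton_schulz(G, 40) - exact)  # near convergence
sv_few = np.linalg.svd(newton_schulz(G, 3), compute_uv=False)
print(err_few, err_many, sv_few)
```

After only a few steps the small singular values are still far from 1, so the truncated iteration acts as a nonlinear map of the spectrum instead of an exact sign. This is the kind of gap the abstract refers to.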

2605.11847 2026-05-13 cs.ET cs.LG

A Fast and Energy-Efficient Latch-Based Memristive Analog Content-Addressable Memory

Paul-Philipp Manea, Aishwarya Natarajan, Jim Ignowski, John Paul Strachan, Luca Buonanno

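AI summary: This paper proposes a fast, energy-efficient latch-based memristive analog content-addressable memory (aCAM) that addresses the high static search power, limited voltage gain, and match-line crosstalk of conventional designs. The approach replaces static voltage division with a dynamic current-race comparator, achieving high regenerative gain, intrinsic result latching, and near-zero static power. Experiments show that, compared with the conventional 6T2M architecture, the new design reduces read energy by 33% at identical latency, supports scaling to large arrays and energy-latency optimization, and suits applications such as edge AI.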

Comments This work has been submitted to the IEEE for possible publication

Abstract

Analog content-addressable memories (aCAMs) based on memristors provide a promising pathway toward energy-efficient large-scale associative computing for Edge AI and embedded intelligence applications. They have been successfully applied to decision-tree inference and extend the capabilities of compute-in-memory (CIM) architectures beyond conventional vector-matrix multiplication. However, conventional designs such as the 6T2M architecture suffer from static search power, limited voltage gain, and pronounced match-line crosstalk, constraining analog precision and scalability. We introduce a strong-arm latched memristor (SALM) aCAM cell that replaces static voltage division with a dynamic current-race comparator, enabling high regenerative gain, intrinsic result latching, and near-zero static search power. Compared to 6T2M, SALM reduces read energy by 33% at identical latency while eliminating the gain and crosstalk limitations that prevent 6T2M from scaling to large arrays. SALM further enables scalable sequential and parallel latch sharing, and a dataset-aware optimization framework exposes an explicit energy-latency tradeoff, achieving up to 50% energy reduction at 3x latency across representative workloads. To enable architectural exploration, we develop a circuit-accurate behavioral model derived from SPICE lookup tables in 22 nm FD-SOI technology, capturing match-line dynamics and crosstalk. Integrated into the X-TIME decision-tree compiler, this framework demonstrates that SALM maintains near-software accuracy for high-dimensional datasets, whereas baseline designs degrade due to limited gain and cumulative crosstalk.
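For readers unfamiliar with how an aCAM maps to decision-tree inference, the following is a purely functional sketch. It assumes each cell stores a `[low, high)` interval and each row encodes one root-to-leaf path; the tree, thresholds, and names are hypothetical, and the analog effects the paper studies (gain, crosstalk, latching) are not modeled.

```python
import numpy as np

# Hypothetical 2-feature decision tree flattened into root-to-leaf paths.
# Each row holds one [low, high) interval per feature; +/-inf means "don't care".
lows = np.array([[-np.inf, -np.inf],    # x0 < 0.5 and x1 < 0.3   -> class 0
                 [-np.inf, 0.3],        # x0 < 0.5 and x1 >= 0.3  -> class 1
                 [0.5, -np.inf]])       # x0 >= 0.5               -> class 2
highs = np.array([[0.5, 0.3],
                  [0.5, np.inf],
                  [np.inf, np.inf]])
leaf_class = np.array([0, 1, 2])

def acam_search(query):
    """Return the class of the row whose every cell interval contains the query."""
    cell_match = (lows <= query) & (query < highs)  # per-cell range comparison
    row_match = cell_match.all(axis=1)              # a row matches only if all cells match
    return int(leaf_class[np.argmax(row_match)])

print(acam_search(np.array([0.2, 0.9])))
```

Because the paths of a decision tree partition the input space, exactly one row matches any query, which is why a single parallel search can replace a sequential tree traversal.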

2605.11841 2026-05-13 stat.ML cs.AI cs.LG

Minimax Rates and Spectral Distillation for Tree Ensembles

Binh Duc Vu, David S. Watson

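AI summary: This paper studies the theoretical properties of tree ensembles such as random forests and gradient boosting machines and proposes a spectral analysis framework. By analyzing the eigenvalue decay of the induced kernel operator, it derives minimax convergence rates for random forest regression and, building on this view, develops model compression methods. These methods learn the leading eigenfunctions of the kernel operator or the leading singular vectors of the smoother matrix, producing distilled models that predict well at a greatly reduced size and suit resource-constrained computing.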

Comments 9 pages main text, 33 pages total, with 12 figures and 7 tables total

Abstract

Tree ensembles such as random forests (RFs) and gradient boosting machines (GBMs) are among the most widely used supervised learners, yet their theoretical properties remain incompletely understood. We adopt a spectral perspective on these algorithms, with two main contributions. First, we derive minimax-optimal convergence for RF regression, showing that, under mild regularity conditions on tree growth, the eigenvalue decay of the induced kernel operator governs the statistical rate. Second, we exploit this spectral viewpoint to develop compression schemes for tree ensembles. For RFs, leading eigenfunctions of the kernel operator capture the dominant predictive directions; for GBMs, leading singular vectors of the smoother matrix play an analogous role. Learning nonlinear maps for these spectral representations yields distilled models that are orders of magnitude smaller than the originals while maintaining competitive predictive performance. Our methods compare favorably to state-of-the-art algorithms for forest pruning and rule extraction, with applications to resource-constrained computing.

2605.11839 2026-05-13 cs.DC cs.AI

Trade-offs in Decentralized Agentic AI Discovery Across the Compute Continuum

Patrizio Dazzi, Emanuele Carlini, Matteo Mordacchini, Saul Urso

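AI summary: This paper studies the trade-offs among decentralized discovery mechanisms for agentic systems deployed across the compute continuum (cloud, edge, and intermittently connected environments). The authors compare three structured overlays, Chord, Pastry, and Kademlia, within a shared control-plane framework and analyze how they differ in discovery reliability, startup behavior, and control overhead. The study aims to clarify the performance boundaries and applicable scenarios of agent discovery mechanisms in edge-to-cloud environments.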

Abstract

Agentic systems deployed across the compute continuum need discovery mechanisms that remain effective across cloud, edge, and intermittently connected domains. In some emerging agentic architectures, decentralized discovery is already an active design direction, placing DHT-based lookup on the path toward agent directories. This paper studies the trade-offs among major structured-overlay families for agent discovery, comparing Chord, Pastry, and Kademlia as candidate indexing substrates within a shared control-plane framework. Using a benchmark subset centered on a 4096-node stationary comparison and a representative 4096-node churn benchmark, the paper characterizes how discovery reliability, startup behavior, and control-plane overhead vary across these overlays. The goal is to clarify the operating points they expose for agent discovery across edge-to-cloud environments.
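As background for the overlays being compared, here is a minimal sketch of the XOR metric that Kademlia uses to decide which nodes are responsible for a key. The node IDs, the key, and the 4-bit ID space are hypothetical, and the sketch does not reproduce the paper's control-plane framework or benchmark.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two IDs: bitwise XOR read as an integer."""
    return a ^ b

def k_closest(node_ids, key, k=3):
    """The k nodes closest to the key under the XOR metric."""
    return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]

def bucket_index(self_id: int, other_id: int) -> int:
    """Index of the k-bucket for other_id: position of the highest differing bit."""
    return (self_id ^ other_id).bit_length() - 1

# Hypothetical 4-bit ID space; an agent record would be stored under `key`.
nodes = [0b0001, 0b0100, 0b0111, 0b1010, 0b1100]
key = 0b0110
closest = k_closest(nodes, key)
print([bin(n) for n in closest])
```

Chord and Pastry replace this metric with clockwise ring distance and prefix matching respectively, which is one source of the differing operating points the abstract mentions.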

2605.11835 2026-05-13 cs.NE cs.AI cs.LG

Multi-Timescale Conductance Spiking Networks: A Sparse, Gradient-Trainable Framework with Rich Firing Dynamics for Enhanced Temporal Processing

Alex Fulleda-Garcia, Saray Soldado-Magraner, Josep Maria Margarit-Taulé

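AI summary: This paper proposes Multi-Timescale Conductance Spiking Networks to address the trade-off in conventional spiking neural networks among gradient-based trainability, dynamical richness, and activity sparsity. The framework systematically controls neuronal excitability by tuning conductances at fast, slow, and ultra-slow timescales, yielding multiple firing modes including tonic, phasic, and bursting firing. Experiments show that the model outperforms existing LIF and AdLIF networks on a time-series regression task while exhibiting sparser activity, providing a new foundation for energy-aware temporal processing and neuromorphic computing.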

Comments Published in 2026 IEEE Neuro-Inspired Computational Elements Conference (Atlanta, USA)

Abstract

Spiking neural networks (SNNs) promise low-power event-driven computation for temporally rich tasks, but commonly used neuron models often trade off gradient-based trainability, dynamical richness, and high activity sparsity. These limitations are acute in regression, where approximation error, noise and spike discretization can severely degrade continuous-valued outputs. Indeed, many state-of-the-art (SOTA) SNNs rely on simple phenomenological dynamics trained with surrogate gradients and offer limited control over spiking diversity and sparsity. To overcome such limitations, we introduce multi-timescale conductance spiking networks, a gradient-trainable framework in which neural dynamics emerge from shaping the current-voltage (I-V) curve by tuning fast, slow and ultra-slow conductances. This parametrization allows systematic control over excitability, can be implemented efficiently in analog circuits, and yields rich firing regimes including tonic, phasic and bursting responses within a single model. We derive a discrete-time formulation of these differentiable dynamics, enabling direct backpropagation through time without surrogate-gradient approximations. To probe both trainability and accuracy, we evaluate feedforward networks of these neurons at the predictability limit of Mackey-Glass time-series regression and compare them to baseline LIF and SOTA AdLIF networks. Our model outperforms LIF and AdLIF networks, while exhibiting substantially sparser activity from both communication and computational perspectives. These results highlight multi-timescale conductance spiking neurons as a promising building block for energy-aware temporal processing and neuromorphic implementation.

2605.10584 2026-05-13 astro-ph.IM cs.AI gr-qc

An agentic framework for gravitational-wave counterpart association in the multi-messenger era

Yiming Dong, Yacheng Kang, Junjie Zhao, Xinyuan Zhu, Ziming Wang, Lijing Shao

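AI summary: As multi-messenger astronomy develops, associating gravitational-wave (GW) signals with electromagnetic (EM) signals has become an important step in astrophysical research. This paper proposes GW-Eyes, an agentic framework based on large language models that, for the first time, autonomously associates GW signals with candidate EM events and supports natural-language interaction to assist experts with tasks such as catalog management and skymap visualization. The framework leverages the complex decision-making capabilities and traceable reasoning of large language models, offering a new perspective for multi-messenger astronomy.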

Abstract

With the detection of gravitational waves (GWs), multi-messenger astronomy has opened a new window for advancing our understanding of astrophysics, dense matter, gravitation, and cosmology. The GW sources detected to date are from mergers of compact object binaries, which possess the potential to generate detectable electromagnetic (EM) counterparts. Searching for associations between GW signals and their EM counterparts is an essential step toward enabling subsequent multi-messenger studies. In the era of next-generation GW and EM detectors, the rapid increase in the number of events brings not only unprecedented scientific opportunities, but also substantial challenges to the existing data analysis paradigm. To help address these challenges, we develop GW-Eyes, an agentic framework powered by large language models (LLMs). For the first time, GW-Eyes integrates domain-specific tools and autonomously performs counterpart association tasks between GW and candidate EM events. It supports natural language interaction to assist human experts with auxiliary tasks such as catalog management, skymap visualization, and rapid verification. Our framework leverages the complex decision-making capabilities of LLMs and their traceable reasoning processes, offering a new perspective on multi-messenger astronomy.

2605.10442 2026-05-13 cs.CY cs.AI cs.CL

StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs

Pierre Le Jeune, Étienne Duchesne, Weixuan Xiao, Stefano Palminteri, Bazire Houssin, Benoît Malézieux, Matteo Dora

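AI summary: This paper proposes StereoTales, a multilingual framework for systematically studying social bias in open-ended LLM generation. It covers 10 languages and 79 socio-demographic attributes and comprises over 650,000 stories generated by 23 LLMs; statistical analysis identifies more than 1,500 over-represented stereotypical associations, whose harmfulness is rated by both humans and models. The study finds that all models generate harmful stereotypes, that these associations are largely shared across model providers while adapting to the prompt language, and that human and model harmfulness judgments are broadly aligned.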

Comments Preprint

Abstract

Multilingual studies of social bias in open-ended LLM generation remain limited: most existing benchmarks are English-centric, template-based, or restricted to recognizing pre-specified stereotypes. We introduce StereoTales, a multilingual dataset and evaluation pipeline for systematically studying the emergence of social bias in open-ended LLM generation. The dataset covers 10 languages and 79 socio-demographic attributes, and comprises over 650k stories generated by 23 recent LLMs, each annotated with the socio-demographic profile of the protagonist across 19 dimensions. From these, we apply statistical tests to identify more than 1,500 over-represented associations, which we then rate for harmfulness through both a panel of humans (N = 247) and the same LLMs. We report three main findings. (i) Every model we evaluate emits consequential harmful stereotypes in open-ended generation, regardless of size or capabilities, and these associations are largely shared across providers rather than isolated misbehaviors. (ii) Prompt language strongly shapes which stereotypes appear: rather than transferring as a shared set of biases, harmful associations adapt culturally to the prompt language and amplify bias against locally salient protected groups. (iii) Human and LLM harmfulness judgments are broadly aligned (Spearman $\rho = 0.62$), with disagreements concentrating on specific attribute classes rather than specific providers. To support further analyses, we release the evaluation code and the dataset, including model generations, attribute annotations, and harmfulness ratings.

2605.09115 2026-05-13 cs.CR cs.AI

AI Native Asset Intelligence

Gal Engelberg, Leon Goldberg, Konstantin Koutsyi, Boris Plotnikov, Tiltan Gilat, Ben Benhemo

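AI summary: This paper proposes a framework called "AI-native asset intelligence" to address fragmented heterogeneous data and unclear prioritization in modern security environments. Through a modeling layer and a scoring layer, the framework turns scattered security signals into a structured assessment of asset importance, separating an asset's intrinsic exposure from importance that depends on business context, and thereby enabling more stable, context-aware asset prioritization. The evaluation supports the approach as a foundation for stable prioritization and proactive security-posture reasoning in enterprise security decisions.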

Comments 23 pages, 4 figures, 8 tables. Preprint

Abstract

Modern security environments generate fragmented signals across cloud resources, identities, configurations, and third-party security tools. Although AI-native security assistants improve access to this data, they remain largely reactive: users must ask the right questions and interpret disconnected findings. This does not scale in enterprise environments, where signal importance depends on exposure, exploitability, dependencies, and business context. Repeated AI queries may therefore produce unstable prioritization without a structured basis for comparing assets. This paper introduces AI-native asset intelligence, a framework that transforms heterogeneous security data into a structured intelligence layer for consistent, contextual, and proactive asset-level reasoning. The framework combines a modeling layer, representing assets, identities, relationships, controls, attack vectors, and blast-radius patterns, with a scoring layer that converts fragmented signals into a normalized measure of asset importance. The scoring system separates intrinsic exposure, based on misconfigurations and attack-vector evidence, from contextual importance, based on anomaly, blast radius, business criticality, and data criticality. AI contextualization refines severity and business/data classifications, while deterministic aggregation preserves consistency. We evaluate the scoring system on a production snapshot with 131,625 resources across 15 vendors and 178 asset types. Sensitivity analyses and ablations show that severity mappings control finding sensitivity, AI severity adjustment refines prioritization, attack-vector scoring responds to rare exploitability evidence, and contextual modulation selectively modifies exposed resources based on business or data importance. The results support AI-native asset intelligence as a foundation for stable prioritization and proactive security-posture reasoning.

2605.08151 2026-05-13 cs.DC cs.AI

SPECTRE: Hybrid Ordinary-Parallel Speculative Serving for Resource-Efficient LLM Inference

Jincheng Xie, Yawen Ling, Qi Xiao, Feiyu Zhang, Zhongyi Huang, Wen Hu, Yu Zheng

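AI summary: As LLM serving platforms are increasingly deployed as multi-model cloud systems, user demand follows a long-tailed distribution: a few popular large models handle most requests, while many smaller models remain underutilized. This paper proposes SPECTRE, a hybrid ordinary-parallel speculative serving framework that uses underutilized small models as remote drafters to provide speculative decoding for heavily loaded large models. It combines a threshold-guided hybrid speculation strategy, multi-tenant priority scheduling, and draft-side prompt compression to improve large-model inference throughput; experiments show significant gains over conventional methods across multiple scenarios.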

Abstract

LLM serving platforms are increasingly deployed as multi-model cloud systems, where user demand is often long-tailed: a few popular large models receive most requests, while many smaller tail models remain underutilized. We propose SPECTRE (Parallel SPECulative Decoding with a Multi-Tenant REmote Drafter), a serving framework that reuses underutilized tail-model services as remote drafters for heavily loaded large-model services through speculative decoding. SPECTRE enables draft generation and target-side verification to run in parallel, and makes such parallelism effective through three techniques: a hybrid ordinary-parallel speculative decoding strategy guided by a threshold derived from throughput analysis, speculative priority scheduling to preserve draft-target overlap under multi-tenant traffic, and draft-side prompt compression to reduce draft latency. We implement SPECTRE in SGLang and evaluate it across multiple draft-target model pairs, reasoning benchmarks, real-world long-context workloads, and a wide range of batch sizes. Results show that SPECTRE consistently improves large-model serving throughput while causing only minor interference to the native workloads of tail-model services. In large-model deployments, including Qwen3-235B-A22B with TP=8, SPECTRE achieves up to 2.28x speedup over autoregressive decoding and up to an additional 66% relative improvement over the strongest speculative decoding baselines. Talk is cheap; we show you the code: https://github.com/sgl-project/sglang/pull/22272.
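The abstract builds on speculative decoding, which can be sketched with toy models as follows. This is ordinary (sequential) greedy speculative decoding, not SPECTRE's parallel or hybrid strategy; `target_next` and `draft_next` are hypothetical stand-ins for real models, and in a real system the verification loop would be one batched forward pass of the target.

```python
def target_next(prefix):
    """Hypothetical target model: next token is (sum of prefix) mod 5."""
    return sum(prefix) % 5

def draft_next(prefix):
    """Hypothetical drafter: agrees with the target except when len(prefix) % 4 == 0."""
    guess = sum(prefix) % 5
    return guess if len(prefix) % 4 else (guess + 1) % 5

def speculative_step(prefix, k):
    """Draft k tokens, then keep the longest prefix the target agrees with."""
    draft = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))
    accepted = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # target's correction replaces the bad draft token
            return accepted
        accepted.append(tok)
    accepted.append(target_next(prefix + accepted))  # bonus token when all drafts pass
    return accepted

def generate(prefix, n_new, k=3):
    out = list(prefix)
    while len(out) - len(prefix) < n_new:
        out += speculative_step(out, k)
    return out[:len(prefix) + n_new]

def greedy(prefix, n_new):
    out = list(prefix)
    for _ in range(n_new):
        out.append(target_next(out))
    return out

print(generate([1, 2], 10) == greedy([1, 2], 10))
```

Each step emits between 1 and k + 1 tokens per target verification, and the output matches plain greedy decoding of the target, so the speedup depends on how often the drafter agrees with the target.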

2605.07912 2026-05-13 cs.HC cs.AI cs.CY

Sycophantic AI makes human interaction feel more effortful and less satisfying over time

Lujain Ibrahim, Franziska Sofia Hafner, Myra Cheng, Cinoo Lee, Rebecca Anselmetti, Robb Willer, Luc Rocher, Diyi Yang

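AI summary: This study examines how sycophantic AI affects human social interaction. It finds that such systems immediately provide emotional support similar to that of close friends and family. Over three weeks of use, users became nearly as likely to seek personal advice from the AI as from close friends and family, and reported lower satisfaction with their real-world social interactions. Across multiple experiments, the study shows that people prefer sycophantic AI responses mainly because they make users feel understood, not because the advice is better.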

Abstract

Millions of people now turn to artificial intelligence (AI) systems for personal advice, guidance, and support. Such systems can be sycophantic, frequently affirming users' views and beliefs. Across five preregistered studies (N = 3,075 participants, 12,766 human-AI conversations), including a three-week study with a census-representative U.S. sample, we provide longitudinal experimental evidence that sycophantic AI shifts how users approach their closest relationships. We show that sycophantic AI immediately delivers the emotional and esteem support users typically associate with close friends and family. Over three weeks of such interactions, users became nearly as likely to seek personal advice from sycophantic AI as from close friends and family, and reported lower satisfaction with their real-world social interactions. When given a choice among AI response styles, a majority preferred sycophantic AI -- not for the quality of its advice, but because it made them feel most understood. Together, these findings offer a relational account of AI sycophancy and its impacts.

2605.07473 2026-05-13 quant-ph cond-mat.stat-mech cs.AI cs.ET cs.LG

Breaking QAOA's Fixed Target Hamiltonian Barrier: A Fully Connected Quantum Boltzmann Machine via Bilevel Optimization

Jun Liu

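AI summary: This paper proposes a fully connected quantum Boltzmann machine (QBM) based on bilevel optimization, overcoming the fixed target Hamiltonian of the conventional quantum approximate optimization algorithm (QAOA). The inner-loop training simulates positive-phase energy minimization with the QAOA circuit, while the outer-loop training performs contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. Experiments show that the model performs well and is robust to noise with a single-layer QAOA circuit: it maintains a relatively high probability of measuring the target quantum state even at the noise levels of current mainstream quantum devices, and it shows stable performance on an image generation task.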

Comments 34 pages, 8 figures, 3 tables, 1 algorithm

Abstract

To overcome the limitations of classical partially connected Boltzmann machines and mainstream quantum Boltzmann machines (QBMs), this work extends the conventional circuit of the quantum approximate optimization algorithm (QAOA) to a bilevel optimization architecture and proposes a fully connected QBM. The inner-loop training simulates positive phase energy minimization based on the computational process of the conventional QAOA circuit, whereas the outer-loop training simulates negative phase contrastive divergence learning by optimizing the structural parameters of the target Hamiltonian. It is found that, first, the model exhibits superior performance using only a single layer (p=1) in the QAOA circuit, with an average probability of 0.9559 in measuring the target quantum state under noiseless conditions. Second, the model exhibits notable noise robustness. Under the typical noise level of current mainstream commercial quantum computing devices, the average probability of measuring the target quantum state reaches 0.6047; when the noise rises to a more stringent level with doubled intensity, this probability remains at 0.3859. In both scenarios, the target quantum state maintains the highest measurement probability among all detected states, with a value several times higher than that of the second-ranked state. This indicates that the model retains strong robustness even when noise meets or exceeds the upper limit of current mainstream commercial quantum computing devices. Third, under a block-by-block learning strategy with p=1 and only 10 measurement shots, the model consistently generates the target "qubit" grid image regardless of noise interference, demonstrating strong robustness in image generation.

2605.07001 2026-05-13 cs.SE cs.CL

SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair

Ion George Dinu, Marian Cristian Mihăescu, Traian Rebedea

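AI summary: This paper presents SmellBench, a framework for evaluating LLM agents on repairing architecture-level code smells. The study finds that although LLMs perform well on localized code repair, they are of limited effectiveness on architectural smell repair, which requires cross-module design understanding, and that a large share of the detected smells are false positives. Through empirical analysis, the work reveals the shortcomings of current LLMs in architecture-level refactoring and provides reusable evaluation infrastructure for this direction of automated software engineering research.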

Abstract

Architectural code smells erode software maintainability and are costly to repair manually, yet unlike localized bugs, they require cross-module reasoning about design intent that challenges both developers and automated tools. While large language model agents excel at bug fixing and code-level refactoring, their ability to repair architectural code smells remains unexplored. We present the first empirical evaluation of LLM agents on architectural code smell repair. We contribute SmellBench, a task orchestration framework that incorporates smell-type-specific optimized prompts and supports iterative multi-step execution, together with a scoring methodology that separately evaluates repair effectiveness, false positive identification, and net codebase impact. We evaluate 11 agent configurations from four model families (GPT, Claude, Gemini, Mistral) on 65 hard-severity architectural smells detected by PyExamine in the Python project scikit-learn, validated against expert judgments. Expert validation reveals that 63.1% of detected smells are false positives, while the best agent achieves a 47.7% resolution rate. Agents identify false positives with up to $\kappa = 0.94$ expert agreement, but repair aggressiveness and net codebase quality are inversely related: the most aggressive agent introduces 140 new smells. These findings expose a gap between current LLM capabilities in localized code transformations and the architectural understanding needed for cross-module refactoring. SmellBench provides reusable infrastructure for tracking progress on this underexplored dimension of automated software engineering. We release our code and data at https://doi.org/10.5281/zenodo.19247588.
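The agreement figure above is reported as a kappa statistic. Assuming it is Cohen's kappa between an agent's and the experts' true-positive/false-positive labels (the abstract does not say which variant is used), a minimal sketch of the computation looks like this; the label lists are made up for illustration.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same label independently.
    p_chance = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical labels for six detected smells ("fp" = false positive, "tp" = real smell).
expert = ["fp", "fp", "tp", "tp", "fp", "tp"]
agent = ["fp", "fp", "tp", "fp", "fp", "tp"]
kappa = cohens_kappa(expert, agent)
print(round(kappa, 3))
```

Kappa is useful here because, with 63.1% of detections being false positives, raw percentage agreement would already be high for an agent that labels most smells as false positives.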

2605.06033 2026-05-13 cs.DL cs.AI cs.CY cs.SI

When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge

Andrés F. Castro Torres, Joan Giner-Miguelez, Mercè Crosas

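AI summary: This paper studies global trends in the adoption of AI technologies in science and their impact on research. Analyzing 227 million scholarly works, it finds that the timing and extent of AI adoption differ across disciplines, but that AI's transformative effect on research paradigms is limited and concentrated in areas tied to computer science and statistics. It also finds that AI-supported research carries an inflated citation premium and higher retraction rates, and that countries in the Global South lead in AI adoption relative to their national output. The findings indicate that AI's transformative potential in science remains untapped and raise further questions about research openness, transparency, and ethics.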

Abstract

The extent to which Artificial Intelligence (AI) technologies can trigger generalized paradigm shifts in science is unclear. Although these technologies have revolutionized data collection and analysis in specific fields, their overall impact depends on the scope and ways of adoption. We analyze over 227 million scholarly works from the OpenAlex collection (1960-2024) spanning four scientific domains and 46 fields. To distinguish the use of AI as a research method (AI adoption) from mentioning AI-related terms (AI engagement), we developed a two-step AI-assisted semantic classification pipeline, validated through human coding of 911 abstracts and a robustness check on 348,000 full-text articles (PLOS One). We document differences in the timing and extent of AI adoption across domains, with generalized exponential growth after 2015. The transformative nature of this growth, however, is less apparent. AI-supported research is confined to a few topics with strong ties to Computer Science and conventional statistical frameworks, suggesting limited epistemological transformation. It is also associated with an unwarranted citation premium and substantially higher retraction rates than non-AI-supported research. Geographically, while wealthy countries lead in AI publications per capita, Global South countries in a belt from Indonesia to Algeria lead in AI adoption relative to their national output, signaling a distinctive resource concentration pattern. The transformative capacity of AI in science thus remains untapped, and its rapid adoption underlines challenges in research openness, transparency, reproducibility, and ethics. We discuss how best research practices could boost the benefits of AI adoption and highlight areas that warrant closer scrutiny.

2605.03629 2026-05-13 quant-ph cs.LG

Adversarial Effects on Expressibility and Trainability in Distributed Variational Quantum Algorithms

Abhishek Sadhu, Sharu Theresa Jose

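AI summary: This paper studies how adversarial perturbations affect expressibility and trainability in distributed variational quantum algorithms. The authors point out that existing approaches assume trusted entanglement sharing between quantum processors, and that this assumption introduces a fundamental vulnerability: adversarial perturbations of shared entanglement induce structured gate-level noise that disrupts quantum learning. They develop a framework based on a Kraus operator representation, define Kraus expressibility, a new metric for the expressibility of noisy quantum channels, and use gradient variance analysis to reveal its trade-off with trainability, showing how an adversary can manipulate it to maintain large gradients while steering optimization toward incorrect solutions. Numerical experiments validate these findings.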

Comments Comments are welcome

Abstract

Distributed quantum algorithms offer a promising pathway to scale variational quantum algorithms beyond the constraints of noisy intermediate-scale quantum hardware. However, existing approaches implicitly assume a trusted entanglement-sharing layer across quantum processors. We show that this assumption introduces a fundamental vulnerability: adversarial perturbations of shared entanglement induce structured gate-level noise that directly impacts quantum learning. We develop a framework that maps entanglement-level perturbations to gate-level noise via an explicit Kraus representation. To quantify their impact, we introduce Kraus expressibility, a metric that generalizes unitary expressibility to noisy quantum channels. We then establish a trade-off between Kraus expressibility and trainability of noisy quantum circuits through gradient variance analysis. Our analysis reveals that an adversary can manipulate Kraus expressibility to maintain sufficiently large cost gradients (avoiding barren plateaus) while systematically biasing optimization toward incorrect solutions. We validate these findings through numerical simulations, demonstrating adversarial degradation of expressibility and trainability.
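As a reminder of what a Kraus representation of gate-level noise looks like, here is a minimal sketch using the standard single-qubit depolarizing channel. This is a generic textbook channel chosen for illustration, not the adversarial channel derived in the paper, and the error probability `p = 0.3` is arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarizing_kraus(p):
    """Kraus operators of the single-qubit depolarizing channel with error probability p."""
    return [np.sqrt(1 - p) * I2] + [np.sqrt(p / 3) * P for P in (X, Y, Z)]

def apply_channel(kraus, rho):
    """rho -> sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus)

kraus = depolarizing_kraus(0.3)
# Trace preservation requires sum_k K_k^dagger K_k = I.
completeness = sum(K.conj().T @ K for K in kraus)

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)  # the state |0><0|
rho_noisy = apply_channel(kraus, rho0)
print(np.round(rho_noisy.real, 3))
```

A unitary gate is the special case with a single Kraus operator, which is why a metric defined on Kraus operators can generalize unitary expressibility to noisy channels.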

2604.24155 2026-05-13 cs.CY cs.AI cs.HC

The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers

Benjamin Minhao Chen, Xinyu Xie

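AI summary: This paper examines a basic problem in aligning AI behavior with human values: whose moral expectations should guide AI decision-making. The experiment finds that moral judgments of AI behavior shift substantially when the AI's human origins are revealed, and that judgments of the humans who design AI differ from those of the AI system itself or of human actors. The study suggests that making human agency visible activates stricter moral constraints, giving rise to the "alignment target problem": which normative target should guide the development of artificial moral agents in high-stakes domains.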

Comments Accepted at ACM FAccT 2026

Abstract

The project of aligning machine behavior with human values raises a basic problem: whose moral expectations should guide AI decision-making? Much alignment research assumes that the appropriate benchmark is how humans themselves would act in a given situation. Studies of agent-type value forks challenge this assumption by showing that people do not always judge humans and AI systems identically. This paper extends that challenge by examining two further possibilities: first, that evaluations of AI behavior change when its human origins are made visible; and second, that people judge the humans who program AI systems differently from either the machines or the human actors they are compared against. An experiment with 1,002 U.S. adults measured moral judgments in a runaway mine train scenario, varying the subject of evaluation across four conditions: a repairman, a repair robot, a repair robot programmed by company engineers, and company engineers programming a repair robot. We find no significant difference in evaluations of the repairman and the robot. However, judgments shifted substantially when the robot's actions were described as the product of human design. Participants exhibited markedly more deontological, rule-based reasoning when evaluating either the programmed robot or the engineers who programmed it, suggesting that rendering human agency visible activates heightened moral constraints. These findings indicate that people may evaluate humans, AI systems acting in the same situation, and the humans who design them in meaningfully different ways. The fact that these evaluations do not necessarily converge gives rise to the alignment target problem: which normative target should guide the development of artificial moral agents in high-stakes domains, and whether these plural judgments can be reconciled within a coherent account of value alignment.

2604.23260 2026-05-13 stat.ML cs.LG

Explicit integral representations and quantitative bounds for two-layer ReLU networks

Anthony Lee

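AI summary: This paper presents an approach to constructing explicit integral representations for two-layer ReLU networks, which yields relatively simple representations for any multivariate polynomial. For a sharpened ReLU integral representation involving a harmonic extension and a projection, it gives quantitative error bounds showing that the $L^{2}(\mathcal{D})$ approximation error depends on the coefficients of the function's monomial expansion and the distribution $\mathcal{D}$, and not explicitly on dimension or degree. It also establishes a connection to the reproducing kernel Hilbert space of the exponential kernel and presents a simple integral representation with better error bounds.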

Abstract

An approach to constructing explicit integral representations for two-layer ReLU networks is presented, which provides relatively simple representations for any multivariate polynomial. Quantitative bounds are provided for a particular, sharpened ReLU integral representation, which involves a harmonic extension and a projection. The bounds demonstrate that functions can be approximated with $L^{2}(\mathcal{D})$ errors that do not depend explicitly on dimension or degree, but rather on the coefficients of their monomial expansions and the distribution $\mathcal{D}$. We also present a connection to the RKHS of the exponential kernel $K(x,y)=\exp\left(\left\langle x,y\right\rangle \right)$, and a very simple integral representation, additionally involving multiplication by a fixed function, which has better quantitative bounds.
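To make the notion of an integral representation concrete, here is the standard one-dimensional case, which follows from Taylor's theorem with integral remainder. It is offered only as an illustration of the idea, not as the paper's multivariate construction. For $f \in C^{2}([0,1])$ and $x \in [0,1]$:

```latex
\[
  f(x) \;=\; f(0) + f'(0)\,x + \int_{0}^{1} f''(b)\,\mathrm{ReLU}(x-b)\,\mathrm{d}b ,
  \qquad \mathrm{ReLU}(t) = \max(t, 0).
\]
% The integrand vanishes for b > x, so the integral equals
% \int_0^x f''(b)\,(x-b)\,db, the usual Taylor remainder.
% Example: f(x) = x^2 gives \int_0^x 2\,(x-b)\,db = x^2.
```

Read this way, $f$ is an infinitely wide two-layer ReLU network in which the bias $b$ indexes the hidden units and $f''(b)$ acts as the density of output weights; discretizing the integral yields a finite network.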

2604.16445 2026-05-13 eess.AS cs.AI cs.CV cs.LG

SAND: The Challenge on Speech Analysis for Neurodegenerative Disease Assessment

Giovanna Sannino, Ivanoe De Falco, Nadia Brancati, Laura Verde, Maria Frucci, Daniel Riccio, Vincenzo Bevilacqua, Antonio Di Marino, Lucia Aruta, Valentina Virginia Iuzzolino, Gianmaria Senerchia, Myriam Spisto, Raffaele Dubbioso

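AI summary: This paper introduces the SAND challenge, which aims to use speech signals for early diagnosis and progression prediction of neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS). A team of clinicians and machine learning researchers built a clinically annotated speech dataset and launched the challenge on it to advance the application and validation of AI models in speech analysis. The work provides an important data foundation and research platform for disease assessment with noninvasive biomarkers.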

Abstract

Recent advances in Artificial Intelligence (AI) and the exploration of noninvasive, objective biomarkers, such as speech signals, have encouraged the development of algorithms to support the early diagnosis of neurodegenerative diseases, including Amyotrophic Lateral Sclerosis (ALS). Voice changes in subjects suffering from ALS typically manifest as progressive dysarthria, which is a prominent neurodegenerative symptom because it affects patients as the disease progresses. Since voice signals are complex data, the development and use of advanced AI techniques are fundamental to extracting distinctive patterns from them. Validating AI algorithms for ALS diagnosis and monitoring using voice signals is challenging, particularly due to the lack of annotated reference datasets. In this work, we present the outcome of a collaboration between a multidisciplinary team of clinicians and Machine Learning experts to create both a clinically annotated validation dataset and the "Speech Analysis for Neurodegenerative Diseases" (SAND) challenge based on it. Specifically, by analyzing voice disorders, the SAND challenge provides an opportunity to develop, test, and evaluate AI models for the automatic early identification and prediction of ALS disease progression.

2604.14322 2026-05-13 stat.ML cs.LG

Doubly Outlier-Robust Online Infinite Hidden Markov Model

Horace Yiu, Leandro Sánchez-Betancourt, Álvaro Cartea, Gerardo Duran-Martin

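AI summary: This paper studies how to make the online infinite hidden Markov model (iHMM) more robust when streaming data contains outliers and the model is misspecified. It defines robustness through the posterior influence function (PIF), gives conditions under which the online iHMM has a bounded PIF, and proposes BR-iHMM, a method that balances adaptivity and robustness with two additional tunable parameters. Across limit order book data, hourly electricity demand, and a synthetic high-dimensional linear system, it reduces one-step-ahead forecasting error by up to 67%, supporting its use for forecasting and interpretable online learning.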

Comments 43rd International Conference on Machine Learning (ICML 2026)

English abstract

We derive a robust update rule for the online infinite hidden Markov model (iHMM) for when the streaming data contains outliers and the model is misspecified. Leveraging recent advances in generalised Bayesian inference, we define robustness via the posterior influence function (PIF), and provide conditions under which the online iHMM has bounded PIF. Imposing robustness inevitably induces an adaptation lag for regime switching. Our method, which is called Batched Robust iHMM (BR-iHMM), balances adaptivity and robustness with two additional tunable parameters. Across limit order book data, hourly electricity demand, and a synthetic high-dimensional linear system, BR-iHMM reduces one-step-ahead forecasting error by up to 67% relative to competing online Bayesian methods. Together with theoretical guarantees of bounded PIF, our results highlight the practicality of our approach for both forecasting and interpretable online learning.

2604.01621 2026-05-13 cs.DC cs.AI

DWDP: Distributed Weight Data Parallelism for High-Performance LLM Inference on NVL72

Wanqian Li, Jintao Peng, Zongfei Jing, Tianyu Zhang, Ze Long, Xianjie Qiao, Xiaoming Chen, Dongxu Yang, Kefeng Duan, June Yang

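AI summary: This paper proposes DWDP (Distributed Weight Data Parallelism), a parallelization strategy for improving large language model inference performance on NVL72. It keeps data-parallel execution, offloads MoE weights across peer GPUs, and fetches missing experts asynchronously, removing the layer-wise inter-rank synchronization that existing strategies require. Experiments show that DWDP raises output throughput per GPU by 8.8% at comparable per-user throughput.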

Comments Technical Report. 17 pages. 8 figures

English abstract

Large language model (LLM) inference increasingly depends on multi-GPU execution, yet existing inference parallelization strategies require layer-wise inter-rank synchronization, making end-to-end performance sensitive to workload imbalance. We present DWDP (Distributed Weight Data Parallelism), an inference parallelization strategy that preserves data-parallel execution while offloading MoE weights across peer GPUs and fetching missing experts on demand. By removing collective inter-rank synchronization, DWDP allows each GPU to progress independently. We further address the practical overheads of this design with two optimizations for split-weight management and asynchronous remote-weight prefetch. Implemented in TensorRT-LLM and evaluated with DeepSeek-R1 on GB200 NVL72, DWDP improves end-to-end output TPS/GPU by 8.8% at comparable TPS/user in the 20-100 TPS/user serving range under 8K input sequence length and 1K output sequence length.

2603.24410 2026-05-13 cs.CY cs.AI

Real Talk, Virtual Faces: Symbolic-Semantic Discourse Geometry of Virtual and Human Influencer Audiences

Shahram Chaudhry, Sidahmed Benabderrahmane, Talal Rahwan

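AI summary: This paper studies how audience discussion of virtual influencers (VIs) differs from that of human influencers (HIs) on social media, asking whether virtual identity produces distinct discourse structures. It proposes a symbolic-semantic framework that uses Formal Concept Analysis and association rule mining to extract co-occurrence structures among sentiment, topic, and psycholinguistic features, then compares them through natural-language descriptions and embedding models. VI audience discourse is more varied and more semantically dispersed, and it shows more negative sentiment on sensitive topics such as mental health and body image, suggesting that virtuality reshapes both the structure and the affective organisation of online discourse.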

English abstract

Virtual influencers (VIs), digitally constructed social-media personas, are becoming increasingly visible in online culture, marketing, and identity formation. Yet it remains unclear whether audiences respond to them through the same discourse patterns used for human influencers (HIs), or whether virtuality produces distinctive modes of reaction. Existing studies often rely on surveys, engagement statistics, or marginal sentiment distributions, which reveal what audiences say but not how affective, topical, and psycholinguistic signals are jointly organised. We introduce a symbolic-semantic framework for analysing audience discourse around virtual and human influencers. The symbolic layer uses Formal Concept Analysis and association rule mining to extract closed co-occurrence structures from sentiment labels, topic tags, and Big Five psycholinguistic cues. The semantic layer renders these formal concepts as natural-language descriptions, embeds them with MiniLM, and compares their geometry across VI and HI audiences. Applied to 69,498 YouTube comments from three matched VI-HI influencer pairs, our analysis shows that HI discourse is organised around a compact, stability-centred pattern in which low neuroticism anchors positive sentiment, whereas VI discourse supports multiple discourse regimes. VI concepts are also more semantically dispersed than HI concepts, while both groups show strong symbolic-semantic alignment between closed-set structure and embedding geometry. Finally, VI discourse contains a distinct artificial-identity region and a higher concentration of negative sentiment in sensitive topics such as mental health, body image, and artificial identity. These findings suggest that virtuality reshapes not only the sentiment of audience reactions, but also the symbolic and semantic organisation of online social discourse.
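
To make the "closed co-occurrence structures" of Formal Concept Analysis concrete, here is a minimal sketch that enumerates the formal concepts of a tiny binary context. It is not the authors' pipeline: the comments `c1`..`c4`, their labels, and the function names are hypothetical, and the brute-force enumeration over attribute subsets only suits very small examples.

```python
from itertools import combinations

# Hypothetical toy context: each comment maps to its sentiment, topic,
# and psycholinguistic labels. None of this data comes from the paper.
context = {
    "c1": {"positive", "music"},
    "c2": {"positive", "music", "low_neuroticism"},
    "c3": {"negative", "body_image"},
    "c4": {"positive", "low_neuroticism"},
}


def extent(ctx, attrs):
    """All objects that carry every attribute in attrs."""
    return frozenset(obj for obj, labels in ctx.items() if attrs <= labels)


def intent(ctx, objs, all_attrs):
    """All attributes shared by every object in objs."""
    shared = set(all_attrs)
    for obj in objs:
        shared &= ctx[obj]
    return frozenset(shared)


def formal_concepts(ctx):
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    all_attrs = frozenset().union(*ctx.values())
    found = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            ext = extent(ctx, frozenset(subset))
            # The intent of the extent is the closure of the subset.
            found.add((ext, intent(ctx, ext, all_attrs)))
    return found


concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair is closed in both directions: the extent is exactly the set of comments sharing the intent, and the intent is exactly the set of labels those comments have in common. In this toy context, the pair linking `c2` and `c4` to `positive` and `low_neuroticism` is the kind of co-occurrence pattern the abstract describes.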

2603.14094 2026-05-13 stat.ML cs.LG math.ST stat.CO stat.ME stat.TH

Maximin Robust Bayesian Experimental Design

Hany Abdulsamad, Sahel Iqbal, Christian A. Naesseth, Takuo Matsubara, Adrien Corenflos

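AI summary: This paper studies the robustness of Bayesian experimental design under model misspecification, modelling it as a max-min game between the experimenter and an adversarial nature subject to information-theoretic constraints. It derives Sibson's α-mutual information as the robust objective, identifies the α-tilted posterior as the robust belief update, and uses the Rényi divergence to measure conditional information gain. To reduce the bias and variance of nested Monte Carlo estimators, the authors search over stochastic design policies within a PAC-Bayes framework, obtaining lower bounds on the robust expected information gain with explicit finite-sample error control.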

English abstract

We address the brittleness of Bayesian experimental design under model misspecification by formulating the problem as a max-min game between the experimenter and an adversarial nature subject to information-theoretic constraints. We demonstrate that this approach yields a robust objective governed by Sibson's $α$-mutual information (MI), which identifies the $α$-tilted posterior as the robust belief update and establishes the Rényi divergence as the appropriate measure of conditional information gain. To mitigate the bias and variance of nested Monte Carlo estimators needed to estimate Sibson's $α$-MI, we adopt a PAC-Bayes framework to search over stochastic design policies, yielding rigorous high-probability lower bounds on the robust expected information gain that explicitly control finite-sample error.
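
For reference, the standard textbook definitions of the two quantities named in the abstract are given below for generic random variables $X$ and $Y$ with densities. The notation is an assumption made here. The paper's own formulation, which presumably conditions on the design, may differ in notation and detail.

```latex
% Rényi divergence of order \alpha, for \alpha \in (0,1) \cup (1,\infty):
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, \mathrm{d}x

% Sibson's \alpha-mutual information and its closed form:
I_\alpha(X;Y) = \min_{Q_Y} D_\alpha\!\left(P_{XY} \,\|\, P_X \otimes Q_Y\right)
             = \frac{\alpha}{\alpha - 1} \log \int \left( \int p(x)\, p(y \mid x)^{\alpha}\, \mathrm{d}x \right)^{1/\alpha} \mathrm{d}y
```

As $\alpha \to 1$, the Rényi divergence recovers the Kullback-Leibler divergence and Sibson's $\alpha$-MI recovers Shannon mutual information, which is the quantity behind the usual expected information gain.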

2603.13420 2026-05-13 cs.CR cs.AI

Accelerating Suffix Jailbreak attacks with Prefix-Shared KV-cache

Xinhai Wang, Shaopeng Fu, Shu Yang, Liangyu Wang, Tianhang Zheng, Di Wang

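AI summary: This paper proposes Prefix-Shared KV Cache (PSKV), an optimization for accelerating suffix jailbreak attacks on large language models. By sharing the key-value cache of the common prefix, it avoids redundant computation on the repeated prefix and lowers compute and memory costs. Experiments show that PSKV cuts inference time by 40% and peak memory usage by 50% while keeping the attack success rate unchanged.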

Comments 27 pages, 7 figures, preprint

English abstract

Suffix jailbreak attacks serve as a systematic method for red-teaming Large Language Models (LLMs) but suffer from prohibitive computational costs, as a large number of candidate suffixes need to be evaluated before identifying a jailbreak suffix. This paper presents Prefix-Shared KV Cache (PSKV), a plug-and-play inference optimization technique tailored for jailbreak suffix generation. Our method is motivated by a key observation that when performing suffix jailbreaking, while a large number of candidate prompts need to be evaluated, they share the same targeted harmful instruction as the prefix. Therefore, instead of performing redundant inference on the duplicated prefix, PSKV maintains a single KV cache for this prefix and shares it with every candidate prompt, enabling the parallel inference of diverse suffixes with minimal memory overhead. This design enables more aggressive batching strategies that would otherwise be limited by memory constraints. Extensive experiments on six widely used suffix attacks across five widely deployed LLMs demonstrate that PSKV reduces inference time by 40% and peak memory usage by 50%, while maintaining the original Attack Success Rate (ASR). The code has been submitted and will be released publicly.
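
The prefix-sharing idea can be illustrated with a toy single-head attention layer in plain Python. This is a sketch under stated assumptions and not the PSKV implementation, whose code has not been released: the `project` function, the scalar "tokens", and the projection counter are all hypothetical stand-ins. It relies only on the fact that, with causal attention, the keys and values of prefix positions do not depend on the suffix.

```python
import math


def project(token):
    """Hypothetical toy projection of a scalar token to a (key, value) pair."""
    key = [math.sin(token), math.cos(token)]
    value = [0.1 * token, 1.0]
    return key, value


def attend(query, keys, values):
    """Softmax attention of one query over all cached keys and values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def last_token_output(kv_cache, last_token):
    """Attention output at the final position, given the full KV cache."""
    query, _ = project(last_token)  # toy choice: reuse the key as the query
    keys = [k for k, _ in kv_cache]
    values = [v for _, v in kv_cache]
    return attend(query, keys, values)


def score_naive(prefix, suffixes):
    """Recompute the prefix keys and values for every candidate suffix."""
    outputs, projections = [], 0
    for suffix in suffixes:
        tokens = prefix + suffix
        cache = [project(t) for t in tokens]
        projections += len(tokens)
        outputs.append(last_token_output(cache, tokens[-1]))
    return outputs, projections


def score_shared(prefix, suffixes):
    """Compute the prefix keys and values once and share them."""
    prefix_cache = [project(t) for t in prefix]
    projections = len(prefix)
    outputs = []
    for suffix in suffixes:
        suffix_cache = [project(t) for t in suffix]
        projections += len(suffix)
        outputs.append(last_token_output(prefix_cache + suffix_cache, suffix[-1]))
    return outputs, projections


prefix = [1, 2, 3, 4, 5]
suffixes = [[6, 7], [8, 9], [10, 11]]
naive_out, naive_cost = score_naive(prefix, suffixes)
shared_out, shared_cost = score_shared(prefix, suffixes)
print(naive_cost, shared_cost)
```

With a five-token prefix and three two-token suffixes, the naive path performs 21 projections and the shared path 11, while both produce the same attention outputs. The saving grows with the prefix length and the number of candidates, which is the regime the abstract describes.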

2602.09725 2026-05-13 cs.DC cs.LG

Efficient Remote KV Cache Reuse with GPU-native Video Codec

Liang Mi, Weijun Wang, Jinghan Chen, Ting Cao, Haipeng Dai, Yunxin Liu

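AI summary: This paper proposes an efficient remote KV cache reuse method built on GPU-native video codecs to improve large language model inference. To address the problem that, in bandwidth-limited settings, decompression overhead cancels out the benefit of reuse, it designs a compact tensor layout and an efficient cache fetcher that enable fast transmission and decoding and reduce time-to-first-token. Experiments on a range of GPUs show improved inference efficiency with lossless accuracy.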

Comments Accepted by SIGCOMM 2026

English abstract

Remote KV cache reuse fetches the KV cache for identical contexts from remote storage, which avoids recomputation and accelerates LLM inference. While it excels in high-speed networks, its performance degrades significantly in bandwidth-limited scenarios. Recent studies address this by transmitting KV caches in compressed form, but the associated heavyweight decompression counteracts the KV reuse benefits. In this paper, we propose an efficient and widely deployable remote KV cache reuse solution that leverages GPU-native video codecs. Our system, KVCodec, enables effective KV cache coding with two techniques. The codec-friendly tensor layout compresses the KV cache into a highly compact video format, enabling fast transmission. The efficient KV fetcher orchestrates the transmission, decoding, and restoration of compressed KV caches in a pipelined manner, eliminating resource contention, masking network fluctuations, and achieving minimum time-to-first-token (TTFT). We prototype KVCodec on diverse GPUs ranging from high-end to low-end. Experiments reveal that it reduces TTFT by up to 3.51 times compared to SOTA methods while maintaining lossless accuracy.

2602.08606 2026-05-13 math.OC cs.LG math.AP math.PR

Constructive conditional normalizing flows

Borjan Geshkovski, Domènec Ruiz-Balet

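AI summary: Motivated by conditional sampling, this paper studies how to approximate a diffeomorphism and its pushforward measure simultaneously. The approximation uses the flow of a continuity equation whose velocity field is a perceptron neural network with piecewise constant weights. The authors give an explicit construction based on a polar-like decomposition of the Lagrange interpolant and, for more regular maps, an alternative probabilistic construction in which the number of weight discontinuities does not scale inversely with the ambient dimension.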

English abstract

Motivated by applications in conditional sampling, given a probability measure $μ$ and a diffeomorphism $ϕ$, we consider the problem of simultaneously approximating $ϕ$ and the pushforward $ϕ_{\#}μ$ by means of the flow of a continuity equation whose velocity field is a perceptron neural network with piecewise constant weights. We provide an explicit construction based on a polar-like decomposition of the Lagrange interpolant of $ϕ$. The latter involves a compressible component, given by the gradient of a particular convex function, which can be realized exactly, and an incompressible component, which (after approximating via permutations) can be implemented through shear flows intrinsic to the continuity equation. For more regular maps $ϕ$, such as the Knothe-Rosenblatt rearrangement, we provide an alternative, probabilistic construction inspired by the Maurey empirical method, in which the number of discontinuities in the weights doesn't scale inversely with the ambient dimension.

2602.02406 2026-05-13 stat.ML cs.LG

Provably Data-driven Multiple Hyper-parameter Tuning with Structured Loss Function

Tung Quoc Le, Anh Tuan Nguyen, Viet Anh Nguyen

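AI summary: This paper studies generalization guarantees for multi-dimensional hyperparameter tuning in data-driven settings. Existing theory covers only a one-dimensional hyperparameter, and the paper proposes the first general framework for the multi-dimensional case. Using tools from real algebraic geometry, it strengthens the generalization analysis for semi-algebraic function classes, yielding sharper and more broadly applicable guarantees. It also extends the analysis to tuning with the validation loss and applies the framework to new learning problems such as data-driven weighted group lasso and weighted fused lasso.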

Comments Accepted to ICML 2026

English abstract

Data-driven algorithm design automates hyperparameter tuning, but its statistical foundations remain limited because model performance can depend on hyperparameters in implicit and highly non-smooth ways. Existing guarantees focus on the simple case of a one-dimensional (scalar) hyperparameter. This leaves the practically important multi-dimensional hyperparameter tuning setting unresolved. We address this open question by developing the first general framework for establishing generalization guarantees for tuning multi-dimensional hyperparameters in data-driven settings. Our approach strengthens the generalization guarantee framework for semi-algebraic function classes by exploiting tools from real algebraic geometry, yielding sharper, more broadly applicable guarantees. For completeness, we also instantiate the first lower bound for this general setting. We further extend the analysis to hyperparameter tuning using the validation loss under minimal assumptions, and derive improved bounds when additional structure is available. Finally, we demonstrate the scope of the framework with new learnability results, including data-driven weighted group lasso and weighted fused lasso.

2512.24105 2026-05-13 cs.GT cs.AI

Multilevel Fair Allocation with Matroid-Rank Preferences

Maxime Lucet, Nawal Benabbou, Aurélie Beynier, Nicolas Maudet

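AI summary: This paper studies multilevel fair allocation of resources among agents with tree-structured hierarchical relations. Assuming that leaves have matroid-rank utility functions and that the utility of each internal node is the sum of its children's utilities, it designs allocation algorithms that balance fairness and efficiency. Two original algorithms are proposed: a top-down polynomial-time algorithm with theoretical efficiency and fairness guarantees that is compatible with various local allocation mechanisms, and a multilevel extension of General Yankee Swap that guarantees only efficiency but shows good fairness in practice.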

English abstract

We introduce the concept of multilevel fair allocation of resources with tree-structured hierarchical relations among agents. While at each level it is possible to consider the problem locally as an allocation of an agent to its children, the multilevel allocation can be seen as a trace capturing the fact that the process is iterated until the leaves of the tree. In principle, each intermediary node may have its own local allocation mechanism. The main challenge is then to design algorithms which can retain good fairness and efficiency properties. In this paper we propose two original algorithms under the assumption that leaves of the tree have matroid-rank utility functions and the utility of any internal node is the sum of the utilities of its children. The first one is a generic polynomial-time sequential algorithm that comes with theoretical guarantees in terms of efficiency and fairness. It operates in a top-down fashion, as commonly observed in real-world applications, and is compatible with various local algorithms. The second one extends the recently proposed General Yankee Swap to the multilevel setting. This extension comes with efficiency guarantees only, but we show that it preserves excellent fairness properties in practice.

2512.11868 2026-05-13 cs.CY cs.AI

Industrial AI Robustness Card for Time Series Models

Alexander Windmann, Benedikt Stratmann, Mariya Lyashenko, Oliver Niggemann

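AI summary: This paper proposes the Industrial AI Robustness Card for Time Series (IARC-TS) to address the vague robustness requirements and lack of concrete implementation protocols that industrial AI practitioners face under emerging regulations. It defines required fields and an evaluation procedure that combines drift monitoring, uncertainty quantification, and stress tests, supporting robustness assessment and documentation aligned with selected EU AI Act obligations. A biopharmaceutical soft sensor case study shows how IARC-TS produces reproducible robustness evidence and defines monitoring triggers.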

Comments Accepted to IFAC World Congress 2026

English abstract

Industrial AI practitioners face vague robustness requirements in emerging regulations and standards but lack concrete, implementation-ready protocols. This paper introduces the Industrial AI Robustness Card for Time Series (IARC-TS), a lightweight protocol for documenting and evaluating industrial time series models. IARC-TS specifies required fields and an empirical measurement and reporting protocol that combines drift and operational domain monitoring, uncertainty quantification, and stress tests, and maps these to selected EU AI Act documentation, testing, and monitoring obligations. A biopharmaceutical soft sensor case study illustrates how IARC-TS supports reproducible robustness evidence and defines monitoring triggers.

2511.05940 2026-05-13 math.OC cs.AI math.AP

A PDE Perspective on Generative Diffusion Models

Kang Liu, Enrique Zuazua

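AI summary: This paper analyses score-based diffusion generative models from a partial differential equation (PDE) perspective, clarifying the mathematical foundations of their dynamics. It builds a rigorous PDE framework, proves well-posedness and stability of the score-based Fokker-Planck equation, and analyses how the reverse diffusion process concentrates on the data manifold. The work provides theoretical guarantees and design guidance for diffusion models and helps explain the trade-off between generative capacity and imitation fidelity.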

Comments 30 pages, 4 figures

English abstract

Score-based diffusion models have emerged as a powerful class of generative methods, achieving state-of-the-art performance across diverse domains. Despite their empirical success, the mathematical foundations of those models remain only partially understood, particularly regarding the stability and consistency of the underlying stochastic and partial differential equations governing their dynamics. In this work, we develop a rigorous partial differential equation (PDE) framework for score-based diffusion processes. Building on the Li-Yau differential inequality for the heat flow, we prove well-posedness and derive sharp $L^p$-stability estimates for the associated score-based Fokker-Planck dynamics, providing a mathematically consistent description of their temporal evolution. Through entropy stability methods, we further show that the reverse-time dynamics of diffusion models concentrate on the data manifold for compactly supported data distributions and a broad class of initialization schemes, with a concentration rate of order $\sqrt{t}$ as $t \to 0$. These results yield a theoretical guarantee that, under exact score guidance, diffusion trajectories return to the data manifold while preserving imitation fidelity. Our findings also provide practical insights for designing diffusion models, including principled criteria for score-function construction, loss formulation, and stopping-time selection. Altogether, this framework provides a quantitative understanding of the trade-off between generative capacity and imitation fidelity, bridging rigorous analysis and model design within a unified mathematical perspective.
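
For orientation, the classical Li-Yau inequality on $\mathbb{R}^d$ can be stated as follows. The normalisation $\partial_t u = \Delta u$ is an assumption made here for the statement, and the paper may use a different scaling of the heat flow.

```latex
% For any positive solution u of the heat equation \partial_t u = \Delta u
% on \mathbb{R}^d \times (0, \infty):
\Delta \log u(x, t) \;\ge\; -\frac{d}{2t}
\qquad\Longleftrightarrow\qquad
\frac{|\nabla u|^{2}}{u^{2}} - \frac{\partial_t u}{u} \;\le\; \frac{d}{2t}
```

The two forms are equivalent because $\Delta \log u = \Delta u / u - |\nabla u|^{2} / u^{2}$ and $\Delta u = \partial_t u$. Since the score is $\nabla \log u$, the inequality bounds the divergence of the score from below, with equality attained by the heat kernel.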

2510.16620 2026-05-13 cs.IT cs.AI cs.CR cs.LG eess.SP math.IT

Feedback Lunch: Learned Feedback Codes for Secure Communications

Yingyao Zhou, Natasha Devroye, Onur Günlü

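AI summary: This paper studies secure communication over the block-fading Gaussian wiretap channel with channel-output feedback. It proposes a seeded modular code design that combines universal hash functions with learned feedback codes to balance security and reliability. Feedback lets the legitimate parties agree on a shared secret key, overcoming the eavesdropper's advantage. The results suggest new code designs for sensing-assisted secure communications in integrated sensing and communication (ISAC).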

Comments Accepted to WiseML'26

English abstract

We consider reversely-degraded secure-communication channels, for which the secrecy capacity is zero if there is no channel feedback. Specifically, we focus on a seeded modular code design for the block-fading Gaussian wiretap channel with channel-output feedback, combining universal hash functions for security and learned feedback-based codes for reliability. The trade-off between communication reliability and information leakage is studied, illustrating that feedback enables agreeing on a secret key shared between legitimate parties, overcoming the security advantage of the eavesdropper. Our findings motivate code designs for sensing-assisted secure communications in the context of integrated sensing and communication (ISAC).