arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.12011 2026-06-11 cs.CR New submission

InjectV: Modeling Fault Injection Attacks in RISC-V Simulation Environment

Niccolò Lentini, Giorgio Fardo, Stefano Di Carlo, Alessandro Savino

AI summary: Proposes InjectV, a gem5-based framework for precise, guided fault injection on RISC-V platforms. It supports transient fault attacks on registers and memory, and experiments show a 95.8% time saving over traditional approaches.

Abstract

Fault Injection Attacks (FIAs) are a significant threat to hardware security, capable of compromising systems by inducing malicious faults in computation or storage. Evaluating resilience against such attacks is challenging due to the high cost, complexity, and limited availability of physical fault experiments, particularly during pre-silicon development. Architectural-level simulation offers a developer-oriented, white-box perspective for systematic vulnerability assessment. This paper introduces InjectV, a fault injection attack framework for RISC-V platforms built on the gem5 simulator. InjectV enables precise, guided fault injection at security-critical execution points, such as control-flow decisions, counters, and comparisons, allowing systematic exploration of attack vectors. It currently supports transient fault attacks in registers and memory, broadening its ability to simulate diverse attack scenarios. Experimental results on security benchmarks from the FISSC suite, including hardened variants of the VerifyPIN application, demonstrate InjectV's ability to effectively identify fault-injection points, achieving a 95.8% time-saving advantage over traditional fault injection approaches.
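
To make the idea of a transient fault at a comparison concrete, here is a toy sketch in plain Python. It is not InjectV, gem5, or the FISSC VerifyPIN code: `flip_bit`, `verify_pin`, and `find_injection_points` are hypothetical names, and the fault model (a single bit flip in one accumulator variable) is an assumption chosen for illustration.

```python
def flip_bit(value, bit, width=32):
    """Return `value` with one bit inverted, truncated to `width` bits."""
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def verify_pin(user_pin, card_pin, fault=None):
    """Toy PIN check. `fault` = (digit index, bit) flips one bit of `diff` once."""
    diff = 0
    for i, (u, c) in enumerate(zip(user_pin, card_pin)):
        diff |= u ^ c  # accumulate any mismatch between the two digits
        if fault is not None and fault[0] == i:
            diff = flip_bit(diff, fault[1])  # simulated transient register fault
    return diff == 0


def find_injection_points(user_pin, card_pin, width=32):
    """Exhaustively list (digit index, bit) faults that make the check pass."""
    return [
        (i, bit)
        for i in range(len(user_pin))
        for bit in range(width)
        if verify_pin(user_pin, card_pin, fault=(i, bit))
    ]
```

This sketch searches every (digit, bit) pair. A guided approach of the kind the abstract describes would presumably restrict the candidates to security-critical points such as the final comparison, which is where a time saving would come from.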

2606.12008 2026-06-11 cs.CG cs.GR New submission

Automated Responsive Thematic Mapping with Layout Guides

Arjen Simons, Sarah Schöttler, Wouter Meulemans, Kevin Verbeek, Bettina Speckmann

AI summary: Proposes the first algorithmic framework that uses layout guides to efficiently compute responsive thematic maps whose elements adapt smoothly to different display sizes, balancing the readability of statistical information against cartographic context.

Abstract

Thematic maps visually communicate statistical information about spatial units such as countries or states. They must balance the individual readability of those map elements that carry the statistical information and the overall cartographic context. Nowadays, most maps are not static images, but must flexibly respond to a range of device types and display sizes. Current approaches to responsive thematic mapping are limited: they are labor-intensive for practitioners and often rely on combining disjointed visual encodings to cover different device types. In this paper we introduce the first algorithmic framework to efficiently compute responsive thematic maps that smoothly adapt to different display sizes. A key component of our framework is the layout guide: a combinatorial structure which encodes the two essential aspects of a thematic map. The first aspect is the set of visual requirements of each statistical map element (at least its desired width and height); the second aspect is the cartographic context in the form of relative positions of map elements. Our main algorithmic contribution is the map arranger, which takes a visual container as input and returns a suitable layout guide. The map arranger does so in a stable and consistent manner: if the container changes only a little, then so does the layout guide, and the same input container always results in the same layout guide. To use our framework, one needs three ingredients: (1) a reference layout, which corresponds to the "ideal" thematic map, (2) a total vertical and horizontal order for all map elements (the desired layouts for containers with extreme aspect ratios), and (3) a thematic mapping algorithm that can construct a thematic map from a layout guide. We demonstrate our framework on two types of thematic maps, namely rectangular and Demers cartograms.

2606.12006 2026-06-11 cs.LG cs.AI New submission

Tabular Foundation Models for Clinical Survival Analysis via Survival-Aware Adaptation

Minh-Khoi Pham, Luca Cotugno, Alina Sirbu, Tai Tan Mai, Martin Crane, Marija Bezbradica

Affiliations: ADAPT Centre, Dublin City University; School of Computing, Dublin City University; Department of Computer Science and Engineering, University of Bologna

AI summary: Proposes a lightweight adaptation method that pairs tabular foundation models (TabPFN, TabDPT, TabICL) with a multi-task logistic regression head for clinical survival analysis, reaching competitive or superior performance on multiple benchmarks and ICU cohorts.

Comments
Accepted for publication at International Conference on AI in Healthcare 2026

Abstract

Predicting time-to-event outcomes such as mortality is a fundamental task in clinical decision-making, commonly addressed through survival analysis. While classical statistical and deep learning approaches have been widely studied, they typically require task-specific training and sufficient labeled data. Recent advances in tabular foundation models offer a new paradigm by learning general-purpose representations for structured data. However, their applicability to censored time-to-event prediction in clinical settings remains underexplored, as typical applications are restricted to discrete classification rather than survival analysis tasks. In this work, we propose a lightweight adaptation approach for applying tabular foundation models to clinical survival analysis by directly training a survival-aware head on top of the pretrained representations. We study representative architectures, including TabPFN, TabDPT, and TabICL, and adapt them using a multi-task logistic regression (MTLR) head to model right-censored time-to-event outcomes. We evaluate this approach on a diverse set of public survival benchmarks and two large-scale ICU cohorts, MIMIC-IV and eICU. Our results show that this transfer learning approach achieves competitive or superior performance compared to strong baselines. On MIMIC-IV, TabDPT-FT-MTLR reaches a C-index of 0.856, corresponding to a relative improvement of +1.4% over the best non-FM baseline (DeepSurv, 0.844) and +6.7% over the best zero-shot model (0.802). On eICU, TabICL-FT-MTLR achieves 0.797, yielding gains of +1.7% (DeepSurv, 0.784) and +6.4% (0.749), respectively. These findings highlight the importance of combining pretrained tabular representations with survival-aware objectives and suggest that tabular foundation models provide a practical and effective alternative for clinical survival prediction.
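
The C-index and the relative gains quoted above can be checked with a short sketch. The function below is a minimal version of Harrell's concordance index for right-censored data, written for illustration; it is an assumption that the paper uses this variant, and `concordance_index` and `relative_gain` are names introduced here, not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : observed time (event or censoring) per subject
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher = expected to fail sooner)
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event rather than a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    return concordant / comparable


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old
```

With the figures from the abstract, `relative_gain(0.856, 0.844)` is about 1.4 and `relative_gain(0.856, 0.802)` about 6.7 for MIMIC-IV, while `relative_gain(0.797, 0.784)` is about 1.7 and `relative_gain(0.797, 0.749)` about 6.4 for eICU, consistent with the reported gains.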

2606.12005 2026-06-11 cs.GT cs.IT New submission

Game-Theoretic Latent Space Alignment for Multi-user Semantic MIMO Communications

Giuseppe Di Poce, Mattia Merluzzi, Emilio Calvanese Strinati, Paolo Di Lorenzo

AI summary: Addresses semantic mismatch in multi-user semantic MIMO interference networks with a non-cooperative game framework: linear semantic MIMO transceivers are jointly optimized in closed form, and an iterative semantic water-filling algorithm achieves latent-space alignment and interference management.

Abstract

Semantic communications enable AI-native wireless systems by mapping raw data into compressed task-oriented latent representations. However, independently trained agents often rely on heterogeneous latent spaces and background knowledge, leading to semantic mismatch that degrades mutual understanding and downstream task execution, especially in interference-limited multi-user wireless networks. This paper investigates distributed latent-space alignment in multi-user semantic MIMO interference networks with cognitive radio constraints. We consider primary users and semantic-aware secondary users sharing the same wireless resources, where secondary agents must simultaneously mitigate interference and align heterogeneous semantic representations. To address this problem, we formulate semantic alignment as a non-cooperative game and derive a closed-form solution for the joint optimization of linear semantic MIMO transceivers under power and interference constraints. Exploiting the structure of the problem, we recast the original matrix-valued optimization into a lower-dimensional power-allocation game, leading to an iterative semantic water-filling algorithm. We establish sufficient conditions for existence, uniqueness, and global convergence to a Nash equilibrium, explicitly relating semantic alignment properties and physical-channel interactions. Numerical results assess the performance of the proposed framework, revealing key trade-offs among semantic compression, task performance, and hierarchical spectrum access.
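
For readers unfamiliar with water-filling, the sketch below shows the classical single-user version: it maximizes the sum of log(1 + g·p) over parallel channels under a total power budget. This is the textbook building block only, not the paper's semantic or game-theoretic variant; `water_filling` and its arguments are names chosen here for illustration.

```python
def water_filling(gains, total_power):
    """Classical water-filling over parallel channels.

    Maximizes sum(log(1 + g * p)) subject to sum(p) == total_power, p >= 0.
    The optimum is p_i = max(0, level - 1/g_i) for a common water level.
    """
    floors = sorted(1.0 / g for g in gains)  # inverse gains, best channel first
    level = 0.0
    # Try using the best k channels, from all of them down to one, and stop
    # at the first k whose water level lies above the worst active floor.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            break
    return [max(0.0, level - 1.0 / g) for g in gains]
```

In an iterative multi-user scheme, each user would typically repeat such a step while treating the other users' interference as part of the channel floor; whether the paper's algorithm follows exactly that pattern is not stated in the abstract.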

2606.12003 2026-06-11 cs.CL New submission

Agreement in Representation Space for Open-Ended Self-Consistency

Paula Ontalvilla, Gorka Azkune, Aitor Ormazabal

Affiliations: HiTZ Center - Ixa, University of the Basque Country (UPV/EHU)

AI summary: For open-ended generation tasks, proposes Embedding-Based Agreement (EBA), which estimates self-consistency by clustering the embeddings of sampled generations and robustly selects more reliable outputs without any training.

Abstract

Self-consistency improves LLM reasoning by sampling multiple outputs and selecting the most consistent answer, but existing formulations largely rely on exact matching and therefore remain limited to tasks with categorical outputs. In this work, we study self-consistency in open-ended generation tasks such as code synthesis and text summarization. We hypothesize that consistency can be understood as a geometric property of the generation space, where semantically compatible generations concentrate in similar regions of representation space. To study this hypothesis, we introduce Embedding-Based Agreement (EBA), a simple training-free operationalization that estimates agreement by clustering sampled generations in embedding space. Through experiments on mathematical reasoning, code generation, and summarization, we show that agreement in representation space provides a robust and scalable signal of self-consistency for open-ended tasks. In particular, EBA consistently outperforms random selection and exhibits more stable scaling behavior than recent selection approaches based on LLM evaluation or uncertainty estimation. We further show that these agreement signals remain stable across model families and embedding spaces, even with native hidden representations. Finally, our analysis shows that the geometric location occupied by sampled generations is strongly correlated with generation quality: generations concentrated near central regions of representation space tend to correspond to more reliable outputs, whereas peripheral generations are substantially less accurate. Overall, our findings support viewing self-consistency as a property of the geometric organization of sampled generations rather than exact symbolic overlap.
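
The geometric intuition (central generations tend to be more reliable) can be illustrated with a minimal stand-in: pick the sample whose embedding has the highest mean cosine similarity to all the others. This is a simplification assumed for illustration, not the EBA clustering procedure itself, and the embeddings would come from whatever encoder is available.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_central(embeddings):
    """Index of the embedding with the highest mean similarity to the rest."""
    n = len(embeddings)
    scores = [
        sum(cosine(embeddings[i], embeddings[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    return max(range(n), key=scores.__getitem__)
```

Given embeddings for several sampled generations, `most_central` returns the index of the one to keep; a peripheral outlier scores low because it is dissimilar to most other samples.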

2606.11998 2026-06-11 cs.LG New submission

Bootstrapped Monitoring: Leveraging Transparent Reasoning to Oversee Stronger AI Agents

Frank Xiao, Mary Phuong

Affiliations: California Institute of Technology

AI summary: Proposes the bootstrapped monitoring protocol, which oversees a stronger agent by inserting an untrusted intermediate model with transparent chain-of-thought; on software engineering tasks it substantially improves catch rates, even when the untrusted monitor colludes with the agent.

Abstract

Trusted monitoring is a cornerstone of AI control. However, as frontier models grow more capable, the increasing capabilities gap between trusted and untrusted models may render trusted models unreliable monitors. We introduce bootstrapped monitoring, a protocol that addresses this by inserting a stronger, intermediate untrusted model with transparent chain-of-thought reasoning into the oversight chain. The untrusted monitor ($U_m$) evaluates the agent's actions, while a weaker trusted model ($T$) oversees $U_m$'s reasoning to detect collusion. We evaluate bootstrapped monitoring on multi-turn software engineering tasks (BashArena) across multiple agents and monitors. Bootstrapped monitoring substantially improves catch rates over trusted-only monitoring, even when the untrusted monitor actively colludes with the agent, provided we have access to its raw chain-of-thought. Our results suggest that bootstrapped monitoring can extend the useful lifetime of trusted models in control as AI capabilities advance.

2606.11995 2026-06-11 cs.CE New submission

A Computational Model for Measuring Adaptability Among U.S. Farmers: Evidence from 1997-2022

Hossein Sabzian

AI summary: Using data from 1997 to 2022, builds a framework for studying the cultural-evolutionary mechanisms behind crop selection in U.S. counties; finds that environmental payoff-biased selection drives counties toward maximal adaptability, and identifies a long-term trend of convergence toward combinatorial traits.

Comments
17 pages, 7 figures

Abstract

Agricultural crops are a type of cultural trait, and the way farmers in US counties select them can itself give rise to county-level cultural traits. Using real-world data from 1997 to 2022, we have developed a systematic framework to study the selective mechanisms behind these traits. Our findings indicate that environmental payoff-biased selection has driven counties to adopt traits that maximize their adaptability and yield within their specific environments. These empirical results align with existing theoretical literature [3,16]. Additionally, a clear long-term selective trend is evident, showing that US counties are gradually developing a specific set of more complex combinatorial traits, which provide greater payoffs by enhancing the farmers' environmental adaptability. This study serves as a strong case for empirically modeling the cultural evolutionary processes among US farmers.

2606.11993 2026-06-11 cs.LO math.LO New submission

A Rank-Preserving Gaifman Normal Form

Martin Grohe, Nicole Schweikardt

AI summary: Introduces a rank measure for first-order logic and proves a rank-preserving version of Gaifman's theorem, which simplifies earlier results and is used to prove that first-order properties of nowhere-dense structures can be decided in almost linear time.

Abstract

We introduce a rank measure for first-order logic and prove a "rank-preserving" version of Gaifman's theorem. Compared to earlier "rank-preserving locality theorems" (in particular, [Grohe, Kreutzer, Siebertz, JACM 2017]), our theorem is not only much simpler, but also yields formulas in exactly the same normal form as Gaifman's original theorem. As an application of this theorem, we give a simplified proof of the main result of [Grohe, Kreutzer, Siebertz, JACM 2017] that first-order properties of nowhere-dense structures can be decided in almost linear time.
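
For reference, the normal form in Gaifman's original theorem can be written out as follows. This is the standard textbook statement, given here as background; the rank measure introduced in the paper is not specified in the abstract and is not shown. Every first-order sentence is equivalent to a Boolean combination of basic local sentences of the form

```latex
\[
  \exists x_1 \cdots \exists x_k
  \Bigl(
    \bigwedge_{1 \le i < j \le k} \operatorname{dist}(x_i, x_j) > 2r
    \;\wedge\;
    \bigwedge_{1 \le i \le k} \psi^{(r)}(x_i)
  \Bigr),
\]
```

where $\psi^{(r)}(x)$ is $r$-local, meaning that its quantifiers range only over elements at distance at most $r$ from $x$ in the Gaifman graph of the structure.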

2606.11990 2026-06-11 cs.LG cs.AI New submission

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

Amir El-Ghoussani, Michele De Vita, Ronald Naumann, Valiseios Belagiannis

Affiliations: University of Erlangen-Nuremberg; Siemens AG

AI summary: Proposes using the pretrained time-series foundation model Chronos-2 as a frozen backbone, combined with a lightweight regression head, for remaining-useful-life prediction; it outperforms several baselines on industrial sensor data.

Comments
Accepted to EUSIPCO 2026, 4 pages, 2 figures

Abstract

Remaining Useful Life (RUL) prediction is essential for industrial predictive maintenance, yet many learning-based approaches rely on extensive feature engineering or large labeled datasets to train task-specific sequence models. In this work, we introduce a lightweight learning approach in which we leverage a frozen pretrained time-series foundation model (TSFM) and combine it with a small regression head for RUL estimation from multivariate sensor streams. More specifically, we use Chronos-2 as a frozen backbone to extract context window features and train a lightweight regression neural network for RUL prediction. Experiments on real-world industrial sensor data from two device types show that Chronos-2 features consistently improve over recurrent, convolutional, Transformer-based, and gradient-boosting baselines under the same preprocessing and evaluation protocol. We further analyze the impact of context length and find that performance improves significantly with longer histories, indicating that TSFM representations offer a practical and data-efficient alternative for RUL estimation in industrial settings.

2606.11989 2026-06-11 cs.CV New submission

From Nominal Intensity to Equivalent Rainfall: A Path-Based Credibility Evaluation Framework for Simulated Rainfall in Autonomous-Driving Perception Tests

Tian Xia, Xin Zhao, Shaolingfeng Ye, Junyi Chen

Affiliations: College of Automotive and Energy Engineering, Tongji University; Tsinghua University

AI summary: Proposes a path-based credibility evaluation method built on path-equivalent rainfall intensity, an uncertainty band, and a raindrop-distribution realism score, with a perception-consistency correction based on lidar point-cloud count and mean reflectivity, to align simulated with real rainfall and map test results to real-world conditions.

Comments
17 pages, preprint

Abstract

Credible simulated-rainfall conditions are essential for identifying perception-system boundaries and supporting SOTIF-oriented risk assessment in automated driving. However, closed-field tests are often described only by nominal rainfall intensity or single-point measurements, making it difficult to align simulated rain fields with real rainfall and map test results to real-world scenarios. This paper proposes a path-based credibility evaluation method for simulated rainfall in autonomous-driving perception tests. Using the drop size and velocity joint distribution of real rainfall as the reference, each candidate path is represented by path-equivalent rainfall intensity, an uncertainty band, and a path-averaged Realism of Raindrop Distribution (RRD) score. Lidar target point-cloud count and mean reflectivity are further used for perception-consistency correction, quantifying the proxy capability of each simulated-rainfall path for real-rainfall perception effects. Experiments are conducted using about 10,000 real-rainfall raindrop-spectrum samples, 728 RainSense perception samples, and 45 spatial sampling points in a 2.4 m x 7.2 m simulated-rainfall area. Results show that spatial non-uniformity remains under the same nominal condition, confirming the need for path-based evaluation. The method identifies Path IV and Path VI as preferable candidates, with results of 11.54 +/- 0.31 mm/h, RRD = 0.43, and 8.28 +/- 0.34 mm/h, RRD = 0.46, respectively. These paths show more balanced performance in rainfall-intensity stability, raindrop-spectrum realism, and perception consistency. The proposed method supports path selection, condition description, and credible interpretation of autonomous-driving perception tests under rainfall.

2606.11988 2026-06-11 cs.LG stat.ML New submission

What Uncertainties Do We Need for Dynamical Systems?

Yusuf Sale, Christopher Bülte, Felix Czaja, Joshua Stiller, Eyke Hüllermeier

Affiliations: Institute of Computer Science, LMU Munich; Munich Center for Machine Learning (MCML); Department of Mathematics, LMU Munich; German Research Center for Artificial Intelligence (DFKI, DSA)

AI summary: Discusses uncertainty in dynamical systems from a machine learning perspective, distinguishing aleatoric from epistemic uncertainty and examining how the objectives of representing and quantifying uncertainty differ across tasks.

Comments
EIML@ICML

Abstract

The distinction between aleatoric and epistemic uncertainty has received considerable attention in machine learning research, mainly in the context of supervised learning but also in other settings such as generative modeling. In this paper, we offer a machine learning perspective on uncertainty modeling for dynamical systems, which has been studied much less so far. In particular, we ask: what uncertainties do we need for dynamical systems? We discuss sources of uncertainty, clarify their nature (aleatoric or epistemic), and consider how the objectives of representing and quantifying uncertainty vary across different tasks.

2606.11987 2026-06-11 cs.IT math.CO New submission

Graphical Analysis of Lifted Product Code Constructions

Ragnar Freij-Hollanti, Kirsten D. Morris, Patricija Šapokaitė

AI summary: Proves that the Tanner graphs of the X and Z parity-check matrices of lifted product codes are isomorphic, analyzes their graph-theoretic structure, establishes connectivity conditions, and bounds their minimal absorbing sets, revealing combinatorial structures that influence decoding performance.

Abstract

Lifted product codes are an important family of quantum low-density parity-check (QLDPC) codes, as they were the first QLDPC code family shown to be asymptotically good. Understanding the structure of their parity-check matrices $H_{\mathsf{X}}$ and $H_{\mathsf{Z}}$, as well as the associated Tanner graphs, is essential for analyzing their decoding behavior and error-floor performance. In this work, we show that the Tanner graphs of $H_{\mathsf{X}}$ and $H_{\mathsf{Z}}$ are indeed isomorphic, and investigate their graph-theoretical structure. We establish conditions ensuring the connectivity of these graphs and provide bounds on their minimal absorbing sets, providing new insight into the combinatorial structures influencing decoding performance.
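
As background for the connectivity question, the sketch below builds the Tanner graph of a binary parity-check matrix and tests whether it is connected. It works on a plain 0/1 matrix given as a list of rows; it does not construct lifted product codes, and the function names and node labels are illustrative assumptions.

```python
from collections import deque


def tanner_graph(H):
    """Adjacency lists of the Tanner graph of a binary parity-check matrix H.

    Check node i is labeled ("c", i); variable node j is labeled ("v", j).
    An edge joins ("c", i) and ("v", j) exactly when H[i][j] is nonzero.
    """
    adj = {("c", i): [] for i in range(len(H))}
    adj.update({("v", j): [] for j in range(len(H[0]))})
    for i, row in enumerate(H):
        for j, bit in enumerate(row):
            if bit:
                adj[("c", i)].append(("v", j))
                adj[("v", j)].append(("c", i))
    return adj


def is_connected(adj):
    """Breadth-first search from an arbitrary node; True if it reaches all nodes."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(adj)
```

For example, the matrix with rows `[1, 1, 0]` and `[0, 1, 1]` gives a connected graph because the two checks share the middle variable, whereas a block-diagonal matrix splits into separate components.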

2606.11986 2026-06-11 cs.HC New submission

Channels and Substrates: Distributed Cognition as an Interaction Model for Ubiquitous Analytics

Niklas Elmqvist, Panagiotis D. Ritsos, Peter W. S. Butcher

AI summary: To address the mismatch of existing interaction models with cross-device and ubiquitous analytics, proposes a distributed-cognition-based channels and substrates framework that models interaction as the propagation of representational state across substrates, and demonstrates it by reanalyzing several systems.

Comments
16 pages, 8 figures

Abstract

Traditional HCI interaction models assume a single monolithic interface and a stable sensorimotor loop. These models fit poorly with cross-device (XVA) and ubiquitous analytics (UA), where interactive data sensemaking unfolds across multiple devices, artifacts, and people in disparate settings from the office to the factory floor. In this paper, we show how interaction in ubiquitous analytics can be modeled using distributed cognition as propagation of representational state across substrates -- minds, speech, bodies, artifacts, and devices -- rather than as traffic through a single interface. On this basis we introduce input and output channels as generalizations of the visual channels from data visualization: just as visual channels carry data through properties of the visual substrate, input and output channels carry representational state through substrates whose availability, suitability, and preferability depend on context. We demonstrate the channels and substrates framework by reanalyzing several ubiquitous, immersive, and situated analytics systems.

2606.11982 2026-06-11 cs.LG New submission

PAWS: Preference Learning with Advantage-Weighted Segments

Aleksandar Taranovic, Onur Celik, Niklas Freymuth, Ge Li, Serge Thilges, Huy Le, Tai Hoang, Rania Rayyes, Gerhard Neumann

AI summary: Addresses the degraded temporal credit assignment caused by the training-inference distribution mismatch in preference-based reinforcement learning; proposes PAWS, which performs policy updates directly with segment-level advantage functions and outperforms existing methods on robotic manipulation and locomotion tasks.

Comments
Published as a conference paper at ICML 2026

Abstract

Preference-based reinforcement learning (PbRL) learns policies from human trajectory-level comparisons, avoiding explicit reward design and expert demonstrations. Existing methods typically train utility functions on trajectory or segment-level preferences while relying on per-step utility estimates during policy optimization. This training and inference mismatch induces a distribution shift that severely degrades temporal credit assignment and limits policy learning. We analyze this issue and propose PAWS, a segment-based preference learning method that performs policy updates directly using segment-level advantage functions. By aligning utility training with policy optimization, PAWS preserves trajectory-level preference information and avoids unreliable per-step learning signals. Experiments on simulated robotic manipulation and locomotion tasks demonstrate that PAWS consistently outperforms existing PbRL approaches, highlighting the importance of distribution-consistent preference learning.
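
The segment-level training signal that existing methods rely on is commonly modeled with a Bradley-Terry likelihood over summed per-step utilities. The sketch below shows that common setup as background; it is an assumption that the methods discussed use exactly this form, and it does not implement the PAWS policy update.

```python
import math


def segment_preference_prob(utils_a, utils_b):
    """Bradley-Terry probability that segment A is preferred over segment B.

    Each argument is the list of per-step utility estimates of one segment;
    only the segment sums enter the preference model.
    """
    diff = sum(utils_a) - sum(utils_b)
    return 1.0 / (1.0 + math.exp(-diff))


def preference_loss(utils_a, utils_b, label):
    """Cross-entropy loss; label is 1.0 if A was preferred, 0.0 if B was."""
    p = segment_preference_prob(utils_a, utils_b)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
```

Because only the sums are constrained, many different per-step utility assignments fit the same preference data equally well, which is one way to see why per-step estimates used later in policy optimization can be unreliable.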

2606.11980 2026-06-11 cs.HC New submission

Somewhere Over the Desktop: A Research Agenda for Ubiquitous Analytics

Niklas Elmqvist, Panagiotis D. Ritsos, Peter W. S. Butcher

AI summary: The convergence of spatial computing, generative AI, and open web standards creates new opportunities for ubiquitous analytics (UA); by surveying seven clusters, including cognition, context, and interaction, and crossing them with each other, the paper derives 42 future research challenges.

Comments
15 pages, 5 figures, 1 table

Abstract

Spatial computing, generative AI, and open web standards are converging. Three spatial operating systems -- Android XR, Meta Horizon OS, and Apple visionOS -- now ship with platform-level scene understanding. Wearable displays span the range from full headsets to slim smartglasses. Agentic AI operates on the same spatial substrates as the human user. This convergence enables new opportunities for ubiquitous analytics (UA): the use of many, physically distributed, networked devices to support data sensemaking anytime and anywhere. But proprietary platforms are settling design conventions that will calcify without evidence-based alternatives. UA has now matured to the point where its intellectual history can be read as a structured genealogy of foundations, contributions, and lineages. We trace this genealogy and organize it into clusters spanning cognition, context, interaction, platforms, visualization, collaboration, and evaluation. Finally, we cross these clusters with each other, yielding a total of 42 future research challenges.
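
One reading of the count is that each of the seven clusters is crossed with each of the six others in order, giving 7 × 6 = 42 combinations. That interpretation is an assumption; the abstract does not say how the crossing is done (unordered pairs would give only 21).

```python
from itertools import permutations

clusters = [
    "cognition", "context", "interaction", "platforms",
    "visualization", "collaboration", "evaluation",
]

# Ordered pairs of distinct clusters: 7 choices for the first, 6 for the second.
ordered_pairs = list(permutations(clusters, 2))
```

Here `len(ordered_pairs)` equals 42, matching the number of research challenges stated in the abstract.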

2606.11977 2026-06-11 cs.CV New submission

ParseFixer: An Agentic Framework for Document Parsing via Selective Multimodal Correction

LeKai Yu, Hao Liu, Kun Wang, Zhiran Li, Ruping Cao, Fan Liu, Yupeng Hu

Affiliations: Shandong University; Southeast University

AI summary: Proposes the ParseFixer framework, which combines full-page backbone parsing with agentic selective correction and repairs high-value parsing errors through a verify-and-rollback mechanism; it placed third in the document parsing track of the DataMFM Challenge.

Abstract

In this report, we present our third-place solution for the DataMFM Challenge Track 1: Document Parsing. This track requires models to recover structured Markdown documents from document page images while preserving textual content and document structure. To address the complementary requirements of accurate content recovery and faithful structure reconstruction, we propose ParseFixer, an agentic framework for backbone parsing and selective correction. ParseFixer consists of two key modules: Full-Page Backbone Parsing (FBP) and Agentic Selective Correction (ASC). FBP produces stable initial Markdown outputs with MinerU2.5 Pro, while ASC detects high-value parsing failures and repairs them through a verify-and-rollback correction process. By placing selective multimodal correction after open-source backbone parsing, ParseFixer improves the recovery of key document elements without rewriting reliable backbone predictions. On the test set, our final system achieves an overall score of 61.78 and ranks third in Track 1, demonstrating its effectiveness for accurate document parsing. Our code will be released at: this https URL.

2606.11976 2026-06-11 cs.SE cs.AI New submission

Exploration Structure in LLM Agents for Multi-File Change Localization

Akeela Darryl Fattha, Kia Ying Chua, Lingxiao Jiang, Laura Wynter

AI summary: For changes spanning multiple subsystems, proposes a non-linear, domain-scoped parallel agentic exploration structure; on the SWE-bench Pro benchmark, parallel spawning of domain agents with a small Haiku-class model achieves a high micro F1, outperforming linear sequential exploration.

Abstract

Software engineering tools increasingly rely on LLM-based agents to localize the files that must change to resolve a software issue. Most AI agents explore repositories linearly, that is, visiting one directory or file per step. We postulate that this is a structural mismatch for changes that span several subsystems. We compare linear sequential exploration against non-linear, domain-scoped parallel agentic exploration. Using SWE Bench Pro as an initial benchmark, we focus on ansible as an exemplar. We construct an approach for persistent-session evaluation of GitHub issues anchored at a single base commit. We compare our non-linear domain-agent file traversal system against a base LLM without direct repository access, a single-agent Recursive Language Model (RLM) baseline with a persistent Python REPL, and an external CLI baseline using Codex 5.5 High. Domain-scoped parallel agent spawning with a small Haiku-class model achieves the highest micro F1 among Haiku-class models by a large margin. On our own expanded benchmark, which includes more recent PRs from 2025 and 2026, the domain-agent system ranks second, behind only the much larger Codex 5.5 High. On the original, curated, 2020 SWE-bench Pro benchmark, a larger Sonnet plain-LLM baseline attains higher micro F1 by predicting few files, leading to higher precision, but at significantly lower all-gold recall. We also present three additional findings. First, documentation evolution is a latent dependency unresolved by any approach. Second, naive file system access can degrade localization through test-file over-prediction. Lastly, forced multi-agent consultation does not measurably help and raises token cost substantially.
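
The precision/recall trade-off described in this abstract can be illustrated with a small micro-averaged F1 computation. This is only a sketch: the function name `micro_prf`, the issue IDs, and the file names are hypothetical, and the paper's exact scoring script is not given in the abstract.

```python
from typing import Dict, Set, Tuple


def micro_prf(gold: Dict[str, Set[str]],
              pred: Dict[str, Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-issue file sets."""
    tp = fp = fn = 0
    for issue, gold_files in gold.items():
        predicted = pred.get(issue, set())
        tp += len(gold_files & predicted)   # correctly localized files
        fp += len(predicted - gold_files)   # predicted but not changed
        fn += len(gold_files - predicted)   # changed but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold sets and two hypothetical prediction styles.
gold = {"issue-1": {"a.py", "b.py", "docs/a.rst"}, "issue-2": {"c.py"}}
few = {"issue-1": {"a.py"}, "issue-2": {"c.py"}}
many = {"issue-1": {"a.py", "b.py", "test_a.py", "test_b.py"},
        "issue-2": {"c.py", "test_c.py"}}

print(micro_prf(gold, few))
print(micro_prf(gold, many))
```

In this toy data, predicting few files gives perfect precision but misses half of the gold files, while adding test files raises recall and lowers precision, which mirrors the two effects the abstract reports.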

2606.11974 2026-06-11 cs.DS cs.DC 新提交

Near-Optimal Distributed 2-Ruling Sets on Graphs with Low Arboricity

低树度图上的近最优分布式2-统治集

Malte Baumecker, Rustam Latypov, Yannic Maus, Jara Uitto

AI总结 针对低树度图,提出在LOCAL模型中几乎最优的随机算法,在O(log log n)轮内高概率计算2-统治集,改进指数级并匹配下界。

AI中文摘要

给定图$G=(V,E)$,一个$\beta$-统治集是节点子集$S\subseteq V$,满足$S$是独立集,且$V$中每个节点到$S$中某节点的距离至多为$\beta$。本文在经典LOCAL模型中提出了几乎最优的分布式算法来寻找$2$-统治集。我们的主要贡献是一个随机算法,它在具有有界树度的任意$n$节点图上,以高概率在$O(\log \log n)$轮内计算出$2$-统治集。事实上,该算法适用于树度高达$O(\log\log n)$的图,比结合[Barenboim, Elkin, Pettie, Schneider; JACM'16]、[Ghaffari; SODA'16]和[Bisht, Kothapalli and Pemmaraju; PODC'14]所能达到的先前最优结果指数级改进,并且几乎匹配$\Omega(\log \log n / \log \log \log n)$的下界[Balliu, Brandt, Kuhn, Olivetti; FOCS'20]。统治参数$\beta=2$对于运行时间为$\log^{o(1)}n$的算法是最优的:在树度为2的图上,MIS(即$\beta = 1$)存在$\Omega(\sqrt{\log n})$轮的下界[Khoury, Schild; FOCS'25]。此外,对于更大的树度,我们获得了改进的算法。对于树度为$\alpha$的一般图,我们提出了一个随机算法,在$\widetilde{O}(\log^{5/8} \alpha +\log^{5/3} \log n)$轮内计算出$2$-统治集。对于一大类非常数树度,这比先前最优结果指数级改进。我们的技术超越了分布式计算。在低空间大规模并行计算(MPC)模型中,我们提出了一个$O(\log \log \log n)$轮的算法,该算法以高概率在树度高达$2^{\mathrm{poly}(\log \log n)}$的任意图上计算出$2$-统治集,比[Kothapalli, Pai, Pemmaraju; FSTTCS'20]结合[Fischer, Giliberti, Grunau; SPAA'23]的先前最优结果指数级改进。

英文摘要

Given a graph $G=(V,E)$, a $\beta$-ruling set is a subset of nodes $S\subseteq V$ that is independent and such that each node in $V$ is at distance at most $\beta$ from some node in $S$. In this paper, we present almost optimal distributed algorithms for finding $2$-ruling sets in the classical LOCAL model. Our main contribution is a randomized algorithm that w.h.p. computes a $2$-ruling set on any $n$-node graph with bounded arboricity in $O(\log \log n)$ rounds. In fact, the algorithm works up to arboricity $O(\log\log n)$, improves exponentially over the prior state of the art that can be achieved by combining [Barenboim, Elkin, Pettie, Schneider; JACM'16], [Ghaffari; SODA'16], and [Bisht, Kothapalli and Pemmaraju; PODC'14], and nearly matches the lower bound of $\Omega(\log \log n / \log \log \log n)$ [Balliu, Brandt, Kuhn, Olivetti; FOCS'20]. The domination parameter $\beta=2$ is optimal for algorithms with runtime $\log^{o(1)}n$: on graphs with arboricity $2$, there is a lower bound of $\Omega(\sqrt{\log n})$ rounds for MIS (i.e., $\beta = 1$) [Khoury, Schild; FOCS'25]. Additionally, we obtain improved algorithms for larger arboricity. For general graphs with arboricity $\alpha$, we present a randomized algorithm that computes a $2$-ruling set in $\widetilde{O}(\log^{5/8} \alpha +\log^{5/3} \log n)$ rounds. This improves exponentially over the state of the art for a large range of non-constant arboricity. Our techniques extend beyond distributed computing. We present an $O(\log \log \log n)$-round algorithm in the low-space Massively Parallel Computation (MPC) model that w.h.p. computes a $2$-ruling set on any graph with arboricity up to $2^{\mathrm{poly}(\log \log n)}$, improving exponentially over the state of the art from [Kothapalli, Pai, Pemmaraju; FSTTCS'20] combined with [Fischer, Giliberti, Grunau; SPAA'23].
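
The definition of a $\beta$-ruling set at the start of this abstract can be checked mechanically. The sketch below is a centralized verifier, not the distributed algorithm from the paper; the function name `is_ruling_set` and the adjacency-list representation are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def is_ruling_set(adj: Dict[int, List[int]], s: Set[int], beta: int) -> bool:
    """Check that s is independent and every node is within beta hops of s."""
    # Independence: no edge may join two members of s.
    for u in s:
        if any(v in s for v in adj[u]):
            return False
    # Domination: multi-source BFS from s, truncated at depth beta.
    dist = {u: 0 for u in s}
    queue = deque(s)
    while queue:
        u = queue.popleft()
        if dist[u] == beta:
            continue  # do not expand beyond distance beta
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist) == len(adj)


# Path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(is_ruling_set(path, {2}, beta=2))     # the middle node alone rules the path
print(is_ruling_set(path, {0, 3}, beta=1))  # a maximal independent set (beta = 1)
```

With $\beta = 1$ the check coincides with the definition of a maximal independent set (MIS), which is why the abstract contrasts the $2$-ruling set bound with the MIS lower bound.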

2606.11969 2026-06-11 cs.CV 新提交

SpecLoR: Spectral Lookahead Rectification for Motion-Coherent Text-to-Video Generation

SpecLoR: 面向运动连贯文本到视频生成的频谱前瞻矫正

Xu Zhang, Yu Lu, Ruijie Quan, Zhaozheng Chen, Bohan Wang, Yi Yang

发表机构 * ReLER, College of Artificial Intelligence, Zhejiang University(浙江大学人工智能学院ReLER实验室) Huawei Central Research Institute(华为中央研究院)

AI总结 提出SpecLoR,一种即插即用的推理方法,通过前瞻预测和频域矫正减少文本到视频生成中的时空不一致性,在Wan2.2上显著提升运动连贯性且仅增加4次NFE。

AI中文摘要

流匹配通过潜在ODE采样实现了鲁棒的文本到视频生成。然而,速度逼近和数值离散误差不可避免地累积,导致采样轨迹漂移。因此,生成的视频常常遭受严重的时空不一致性。尽管如此,直接矫正这些漂移的噪声潜在变量具有挑战性:(i) 时间步相关的噪声掩盖了可靠的结构线索;(ii) 空间干预可能破坏复杂的局部几何结构,同时带来高昂的计算成本。为了解决这个问题,我们提出了频谱前瞻矫正(SpecLoR),一种即插即用的推理方法,通过前瞻预测绕过噪声,并通过将矫正转移到频域来规避时空纠缠,在频域中自然视频的通用统计先验易于获取。首先,在早期采样阶段,SpecLoR前瞻估计干净潜在变量 $z_{t,0}$ 并计算其3D时空频谱。接着,SpecLoR矫正幅度谱以匹配先验,保持相位不变。最后,将矫正后的状态重新加噪以恢复ODE积分。在Wan2.2上的实验表明,SpecLoR在多个基准上显著减少了物理伪影并增强了运动连贯性,且计算开销极小(仅增加4次NFE)。

英文摘要

Flow Matching has enabled robust text-to-video generation via latent ODE sampling. However, velocity approximation and numerical discretization errors inevitably accumulate, causing sampling trajectories to drift. Consequently, generated videos often suffer from severe spatiotemporal inconsistencies. Nevertheless, directly correcting these drifted, noisy latents is challenging: (i) timestep-dependent noise obscures reliable structural cues; (ii) spatial interventions risk disrupting intricate local geometry while incurring heavy computational costs. To address this, we propose Spectral Lookahead Rectification (SpecLoR), a plug-and-play inference method that bypasses noise via lookahead prediction, and circumvents spatiotemporal entanglement by shifting corrections to the frequency domain, where universal statistical priors of natural videos are readily available. First, during early sampling stages, SpecLoR looks ahead to estimate the clean latent $z_{t,0}$ and computes its 3D spatiotemporal spectrum. Next, SpecLoR rectifies the amplitude spectrum to match the prior, leaving the phase intact. Finally, the corrected state is re-noised to resume ODE integration. Experiments on Wan2.2 demonstrate that SpecLoR significantly reduces physical artifacts and enhances motion coherence across multiple benchmarks with minimal computational overhead (4 additional NFEs).
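
The step "rectify the amplitude spectrum to match the prior, leaving the phase intact" can be sketched on a one-dimensional toy signal. Everything here is an assumption made for illustration: the paper operates on 3D spatiotemporal spectra of video latents, whereas this sketch uses a naive 1D DFT, a made-up signal, and a hypothetical decaying prior.

```python
import cmath
import math
from typing import List


def dft(x: List[complex]) -> List[complex]:
    """Naive discrete Fourier transform (O(n^2), fine for a toy example)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec: List[complex]) -> List[complex]:
    """Inverse of dft above."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def rectify_amplitude(signal: List[float], prior_amp: List[float]) -> List[float]:
    """Replace each frequency's amplitude with the prior, keeping its phase."""
    spec = dft([complex(v) for v in signal])
    corrected = [cmath.rect(a, cmath.phase(c)) for a, c in zip(prior_amp, spec)]
    # For a real signal and a symmetric prior the result is real up to rounding.
    return [v.real for v in idft(corrected)]


signal = [0.2, 1.0, 0.4, -0.9, 0.5, 0.1, -0.6, 0.7]
n = len(signal)
# Hypothetical prior: amplitude decays with the frequency index min(k, n - k).
prior = [4.0 / (1 + min(k, n - k)) for k in range(n)]
rectified = rectify_amplitude(signal, prior)
print([round(v, 3) for v in rectified])
```

Because only magnitudes are changed, the locations of features (carried by the phase) are preserved while the energy distribution across frequencies is pulled toward the prior.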

2606.11968 2026-06-11 cs.LG stat.ML 新提交

Efficient Multinomial Logistic Bandit via Frequent Directions

基于频繁方向的高效多项式逻辑斯蒂老虎机

Linzhe He, Yu-Jie Zhang, Sifan Yang, Lijun Zhang

发表机构 * State Key Laboratory of Novel Software Technology, Nanjing University(南京大学计算机软件新技术国家重点实验室) School of Artificial Intelligence, Nanjing University(南京大学人工智能学院) Paul G. Allen School of Computer Science & Engineering, University of Washington(华盛顿大学保罗·G·艾伦计算机科学与工程学院)

AI总结 针对多项式逻辑斯蒂老虎机的高维计算瓶颈,提出集成频繁方向矩阵素描的EOFD-MLogB算法,将每轮复杂度降至O(Kd(m+K)^2)时间和O(Kd(m+K))空间,并证明其遗憾界接近原算法。

AI中文摘要

本文研究多项式逻辑斯蒂老虎机(MLogB)的高效在线算法,其中$K+1$个结果的反馈分布遵循$d$维动作向量的多项式逻辑斯蒂模型。代表性的UCB型算法OFUL-MLogB实现了$\tilde{\mathcal{O}}(Kd\sqrt{T})$的遗憾界,但由于参数估计和乐观奖励构造,每轮仍需$\mathcal{O}(K^3d^3)$时间和$\mathcal{O}(K^2d^2)$空间,在高维场景下不可行。为解决此限制,我们提出EOFD-MLogB,将频繁方向矩阵素描集成到OFUL-MLogB中。通过维护累积Hessian的低秩SVD素描,参数估计中的约束在线牛顿更新和奖励加成(reward bonus)中的$Kd \times K$谱范数计算分别简化为一维求根任务和$K \times K$特征值计算。这导致每轮主要时间复杂度为$\mathcal{O}(Kd(m+K)^2)$,空间复杂度为$\mathcal{O}(Kd(m+K))$,其中$m \ll d$为素描大小。我们进一步证明了$\tilde{\mathcal{O}}(\Delta_T(Kd\ln\Delta_T+m)\sqrt{T})$的遗憾界,其中素描误差因子$\Delta_T$由Hessian的$m$截断谱尾控制。因此,当Hessian近似低秩时,遗憾接近OFUL-MLogB。实验验证了计算效率和竞争性能。

英文摘要

This paper studies efficient online algorithms for multinomial logistic bandits (MLogB), where the feedback distribution over $K+1$ outcomes follows a multinomial logistic model of $d$-dimensional action vectors. A representative UCB-type algorithm, OFUL-MLogB, achieves a regret bound of $\tilde{\mathcal{O}}(Kd\sqrt{T})$, but still requires $\mathcal{O}(K^3d^3)$ time and $\mathcal{O}(K^2d^2)$ space per round due to parameter estimation and optimistic reward construction, which is prohibitive in high-dimensional settings. To address this limitation, we propose EOFD-MLogB, which integrates frequent directions matrix sketching into OFUL-MLogB. By maintaining a low-rank SVD sketch of the accumulated Hessian, constrained online Newton updates in parameter estimation and $Kd \times K$ spectral-norm computations in the reward bonus are reduced to one-dimensional root-finding tasks and $K \times K$ eigenvalue computations, respectively. This yields dominant per-round time complexity $\mathcal{O}(Kd(m+K)^2)$ and space complexity $\mathcal{O}(Kd(m+K))$, where $m \ll d$ is the sketch size. We further prove a regret bound of $\tilde{\mathcal{O}}(\Delta_T(Kd\ln\Delta_T+m)\sqrt{T})$, where the sketching error factor $\Delta_T$ is controlled by the $m$-truncated spectral tail of the Hessian. Thus, when the Hessian is approximately low-rank, the regret is close to that of OFUL-MLogB. Experiments validate the computational efficiency and competitive performance.
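
To get a feel for the stated complexity reduction, the leading terms can be compared for concrete sizes. This is back-of-the-envelope arithmetic only: big-O constants are ignored, and the values of $K$, $d$, and $m$ below are hypothetical, not taken from the paper's experiments.

```python
def oful_cost(K: int, d: int) -> tuple:
    """Leading per-round (time, space) terms for OFUL-MLogB: K^3 d^3 and K^2 d^2."""
    return (K * d) ** 3, (K * d) ** 2


def eofd_cost(K: int, d: int, m: int) -> tuple:
    """Leading per-round (time, space) terms for EOFD-MLogB with sketch size m."""
    return K * d * (m + K) ** 2, K * d * (m + K)


K, d, m = 5, 1000, 20  # hypothetical sizes with m << d
full_time, full_space = oful_cost(K, d)
sketch_time, sketch_space = eofd_cost(K, d, m)
print(f"time ratio:  {full_time / sketch_time:.0f}x")
print(f"space ratio: {full_space / sketch_space:.0f}x")
```

The ratios grow with $d$ for fixed $m$ and $K$, which is the regime the abstract calls high-dimensional.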

2606.11967 2026-06-11 cs.CR cs.IT math.CO 新提交

Quadratic APN Functions in Dimension 8 via Gröbner Basis Search in a Self-Equivalence Subspace

通过自等价子空间中的Gröbner基搜索发现8维二次APN函数

Oleksandr Kuznetsov

AI总结 本文在8维自等价子空间中通过Gröbner基搜索发现566个二次APN函数,其中4个新CCZ等价类(500个函数)未被现有数据库收录,并验证了搜索管道的正确性。

AI中文摘要

我们描述了一种在结构化自等价子空间内对8维二次APN(几乎完美非线性)函数的计算搜索。搜索空间是一个40维二元线性子空间,由所有与5阶线性自同构(Beierle、Brinkmann和Leander 2021年分类中的第22类)交换的函数组成,此前报道该子空间不含APN函数。我们的方法结合了通过显式RREF参数化的随机采样(每核心小时约600次新的APN阳性评估)和Magma中的Gröbner基计算,以枚举每个中心点24维超平面中的所有APN函数(每个超平面约10分钟)。在覆盖全部65,536个超平面中0.65%的428次超平面计算中,我们获得了566个二次APN函数,它们在正交导数不变量下形成6个CCZ等价类。其中4个类(包含500个函数)与2025年数据库中的3,775,599个二次APN函数或2020年前的12,921个实例汇编中的任何条目均不匹配。两个类(66个函数)与Gold函数x^3和x^9 CCZ等价,证实了搜索管道的正确性。成员分析表明,三个新类(B、C、D)完全位于原始子空间之外,且仅出现在以Gold函数为中心的切片中,展示了Gröbner基阶段的关键作用。在532次以数据库函数为切片中心的实验和20次以随机中心进行的实验中,未发现APN邻居,表明网关现象是搜索空间自等价结构特有的。由于正交导数不变量是二次APN函数的完全CCZ不变量,缺失匹配签名提供了CCZ不等价的严格证明。

英文摘要

We describe a computational search for quadratic APN (Almost Perfect Nonlinear) functions in dimension 8 within a structured self-equivalence subspace. The search space is a 40-dimensional binary linear subspace consisting of all functions commuting with a linear automorphism of order 5 (class 22 in the taxonomy of Beierle, Brinkmann, and Leander, 2021), previously reported to contain no APN functions. Our approach combines random sampling via an explicit RREF parameterization (approximately 600 fresh APN-positive evaluations per core-hour) with Gröbner basis computation in Magma to enumerate all APN functions in a 24-dimensional hyperplane through each center (approximately 10 minutes per hyperplane). From 428 hyperplane computations, covering 0.65% of all 65,536 hyperplanes, we obtained 566 quadratic APN functions forming six CCZ-equivalence classes under the ortho-derivative invariant. Four classes, comprising 500 functions, match no entry in the 2025 database of 3,775,599 quadratic APN functions or in the pre-2020 compilation of 12,921 instances. Two classes (66 functions) are CCZ-equivalent to the Gold functions $x^3$ and $x^9$, confirming the correctness of the search pipeline. A membership analysis shows that the three new classes (B, C, D) lie entirely outside the original subspace and occur only in Gold-centered slices, demonstrating the essential role of the Gröbner basis stage. In 532 experiments using database functions as slice centers and 20 experiments with random centers, no APN neighbors were found, indicating that the gateway phenomenon is specific to the self-equivalence structure of the search space. Since the ortho-derivative invariant is a complete CCZ-invariant for quadratic APN functions, the absence of matching signatures provides a rigorous proof of CCZ-inequivalence.
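
For readers unfamiliar with the APN property, the sketch below checks it by brute force for power functions over GF(2^8): a function $F$ is APN when, for every nonzero $a$ and every $b$, the equation $F(x+a)+F(x)=b$ has at most two solutions. This is a plain exhaustive test, not the Gröbner basis pipeline of the paper. The field representation (the irreducible polynomial 0x11B) and all function names are choices made here for illustration.

```python
N = 8            # field dimension
MODULUS = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8) (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N):
            a ^= MODULUS
    return result


def power_table(exponent: int) -> list:
    """Lookup table of x -> x^exponent over GF(2^8)."""
    table = []
    for x in range(1 << N):
        y = 1
        for _ in range(exponent):
            y = gf_mul(y, x)
        table.append(y)
    return table


def differential_uniformity(table: list) -> int:
    """Max number of solutions x of F(x ^ a) ^ F(x) = b over all a != 0 and b."""
    worst = 0
    for a in range(1, len(table)):
        counts = [0] * len(table)
        for x in range(len(table)):
            counts[table[x ^ a] ^ table[x]] += 1
        worst = max(worst, max(counts))
    return worst


def is_apn(table: list) -> bool:
    return differential_uniformity(table) == 2


for e in (3, 9, 5):
    print(f"x^{e}: differential uniformity {differential_uniformity(power_table(e))}")
```

The Gold functions $x^3$ and $x^9$ mentioned in the abstract pass the test, while $x^5$ does not, since $\gcd(2, 8) \neq 1$.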

2606.11966 2026-06-11 cs.CV 新提交

Feature extraction for plant growth estimation

用于植物生长估计的特征提取

Simbarashe Aldrin Ngorima, Albert Helberg, Marelie H. Davel

发表机构 * Faculty of Engineering, North-West University(西北大学工程学院) Centre for Artificial Intelligence Research(人工智能研究中心) National Institute for Theoretical and Computational Sciences(国家理论与计算科学研究所)

AI总结 针对精准农业中实时估计植物生长阶段的需求,提出两种特征提取方法(Gabor滤波器与形态学操作、预训练CNN与迁移学习),在公开数据集上测试,CNN方法在速度和精度上均优于手工特征,最佳系统(VGG-19特征+RBF SVM)达到98.4%准确率,每图处理0.08秒。

Comments
13 pages
AI中文摘要

精准农业需要实时估计植物生长阶段。当植物生长阶段已知时,可以减少栽培中资源(如养分和水)的浪费,因为只需供应所需的资源。然而,不同生长阶段的植物具有相似的形态特征,这可能使自主生长阶段估计变得困难。本文提出了两种用于生长阶段估计的特征提取方法:一种使用Gabor滤波器组和形态学操作,另一种使用预训练卷积神经网络(CNN)和迁移学习。我们在公开的植物生长阶段数据集(“bccr-segset”)上测试了这些方法,该数据集包含两种在室内条件下生长和捕获的物种:油菜和小萝卜。使用支持向量机和提升树作为分类器,比较了两种提出的特征提取方法。我们发现两种方法都适用于实时应用,并且CNN特征在速度和准确性方面均优于手工特征。最佳系统(VGG-19特征,使用径向基函数支持向量机分类)对两个物种均获得了98.4%的准确率,处理一张图像仅需0.08秒。

英文摘要

Precision agriculture requires the estimation of plant growth stages in real-time. When the plant growth stage is known, the wastage of resources in cultivation, such as nutrients and water, is reduced as only the required resources need to be supplied. Plants at different growth stages, however, have similar morphological features, which can make autonomous growth stage estimation difficult. This paper presents two feature extraction methods for growth stage estimation: one that uses a bank of Gabor filters and morphological operations, and the other that uses pre-trained convolutional neural networks (CNNs) and transfer learning. We test these methods on a publicly available plant growth stage dataset ("bccr-segset") for two species, canola and radish, grown and captured under indoor conditions. The two proposed feature extraction methods are compared, using support vector machines and boosted trees as classifiers. We find that both methods are suitable for real-time applications, and that CNN features outperform the hand-crafted features, both with regard to speed and accuracy. The best system (VGG-19 features, classified with a radial basis function support vector machine) obtained an accuracy of 98.4% for both species, processing an image in 0.08 seconds.

2606.11963 2026-06-11 cs.LG physics.comp-ph 新提交

HAMNO: A Hierarchical Adaptive Multi-scale Neural Operator with Physics-Informed Learning for Dynamical Systems

HAMNO: 一种用于动力系统的分层自适应多尺度神经算子与物理信息学习

Mostafa Bamdad, Mohammad Sadegh Eshaghi, Timon Rabczuk

发表机构 * Bauhaus-Universität Weimar(魏玛包豪斯大学) Leibniz University Hannover(莱布尼茨汉诺威大学)

AI总结 提出HAMNO神经算子架构,通过自适应门控机制平衡局部与全局信息,结合物理信息扩展PI-HAMNO,在非周期Allen-Cahn等方程上提升长期预测精度与物理一致性。

AI中文摘要

神经算子为直接在函数空间学习偏微分方程解映射提供了强大框架。然而,许多现有架构仍难以表示涉及多尺度结构、长程相互作用和稳定长时间演化的非线性时变系统。本文引入分层自适应多尺度神经算子(HAMNO),一种结合局部卷积表示、全局谱算子和分层编码器-解码器处理的神经算子架构。HAMNO的核心是一个数据相关的门控机制,可在每个空间位置自适应平衡局部和全局信息,使模型能够解析细尺度特征同时保持长程依赖。我们进一步基于多目标损失策略开发了物理信息扩展PI-HAMNO,该策略将数据拟合与强形式和弱形式物理约束相结合。强形式项惩罚物理坐标中域积分平方PDE残差,而弱形式项通过将控制残差乘以有限元测试函数并使用基于质心的四面体求积法评估所得单元积分来构建。该框架在定义于立方域上的非周期Allen-Cahn(AC)、Cahn-Hilliard(CH)和Swift-Hohenberg(SH)方程上进行了评估。在长时程展开、数据有限训练、分布外初始条件偏移和随机种子变化下,HAMNO提高了相对于标准神经算子基线的预测精度,而PI-HAMNO进一步增强了稳定性、物理一致性和数据效率。实现代码公开于https://github.com/HAMNO/HAMNO。

英文摘要

Neural operators provide a powerful framework for learning solution mappings of partial differential equations directly in function space. However, many existing architectures still struggle to represent nonlinear time-dependent systems that involve multi-scale structures, long-range interactions, and stable long-time evolution. In this work, we introduce the Hierarchical Adaptive Multi-scale Neural Operator (HAMNO), a neural-operator architecture that combines local convolutional representations, global spectral operators, and hierarchical encoder-decoder processing. The central component of HAMNO is a data-dependent gating mechanism that adaptively balances local and global information at each spatial location, allowing the model to resolve fine-scale features while preserving long-range dependencies. We further develop a physics-informed extension, PI-HAMNO, based on a multi-objective loss strategy that combines data fitting with strong- and weak-form physics constraints. The strong-form term penalizes the domain-integrated squared PDE residual in physical coordinates, while the weak-form term is constructed by multiplying the governing residual by finite-element test functions and evaluating the resulting element integrals using centroid-based tetrahedral quadrature. The framework is evaluated on non-periodic Allen-Cahn (AC), Cahn-Hilliard (CH), and Swift-Hohenberg (SH) equations defined on cubic domains. Across long-horizon rollout, data-limited training, out-of-distribution initial-condition shifts, and random-seed variations, HAMNO improves predictive accuracy over standard neural-operator baselines, while PI-HAMNO further enhances stability, physical consistency, and data efficiency. The implementation is publicly available at this https URL.

2606.11961 2026-06-11 cs.LG cs.AI 新提交

Categorical Prior Lock-in: Why In-Context Learning Fails for Structured Data

类别先验锁定:为何上下文学习在结构化数据上失败

Antonio Pelusi, Stefano Braghin, Alberto Trombetta

发表机构 * University of Insubria(因苏布里亚大学) IBM Research Ireland(IBM 爱尔兰研究院)

AI总结 研究大语言模型在结构化数据生成中上下文学习的局限性,发现其无法更新预训练中的类别先验分布,导致罕见类完全无法生成;参数高效微调可解决但带来记忆化风险。

Comments
9 pages, 5 figures. Empirical study of in-context learning and LoRA fine-tuning for synthetic tabular data generation, introducing the phenomenon of categorical prior lock-in. Under review
AI中文摘要

大型语言模型(LLM)越来越多地被用作结构化数据的条件生成器,依赖上下文学习(ICL)来适应新分布而无需更新参数。我们以高基数表格数据作为受控测试案例,研究分布不匹配下ICL在结构化生成中的局限性,并识别出一种结构性失败模式,我们称之为“类别先验锁定”:ICL无法更新模型从预训练中继承的令牌分布先验。在两个70亿参数开源模型中,ICL随着示例增加提高了数值保真度,但在类别分布上表现出明显的天花板效应,完全无法复现罕见类。参数高效微调(LoRA)克服了这些限制,但引入了可测量的记忆化风险,并在某些情况下破坏了结构化输出生成的稳定性,凸显了适应性与隐私之间的基本权衡。

英文摘要

Large language models (LLMs) are increasingly used as conditional generators for structured data, relying on in-context learning (ICL) to adapt to new distributions without parameter updates. We investigate the limits of ICL for structured generation under distribution mismatch, using high-cardinality tabular data as a controlled test case, and identify a structural failure mode we term "categorical prior lock-in": the inability of ICL to update the model's prior over token distributions inherited from pre-training. Across two 7B-parameter open-weight models, ICL improves numerical fidelity with additional examples but exhibits a sharp ceiling on categorical distributions, failing to reproduce rare classes entirely. Parameter-efficient fine-tuning (LoRA) overcomes these limitations but introduces measurable memorization risk and, in some cases, destabilizes structured output generation, highlighting a fundamental trade-off between adaptability and privacy.

2606.11953 2026-06-11 cs.CL 新提交

Decoding Multimodal Cues: Unveiling the Implicit Meaning Behind Hateful Videos

解码多模态线索:揭示仇恨视频背后的隐含意义

Junyu Lu, Deyi Ji, Liqun Liu, Xiaokun Zhang, Youlin Wu, Roy Ka-Wei Lee, Peng Shu, Huan Yu, Jie Jiang, Bo Xu, Liang Yang, Hongfei Lin

发表机构 * Dalian University of Technology(大连理工大学) Tencent(腾讯) City University of Hong Kong(香港城市大学) Singapore University of Technology and Design(新加坡科技设计大学)

AI总结 提出IARE框架,通过信息增强和推理优化实现可解释的仇恨视频检测,在Ex-HateMM和Ex-ImpliHateVid数据集上达到最优性能。

AI中文摘要

仇恨视频在在线平台上日益普遍,凸显了有效检测的迫切需求。然而,现有研究主要关注二元分类,未能提供揭示这些判断背后隐含意义的上下文理由,严重削弱了模型的可解释性。为填补这一空白,我们旨在实现可解释的仇恨视频检测,使模型能够提供整合相关证据和逻辑推理的上下文理由,同时做出决策。这种方法可以全面增强对视频内容的理解以及决策过程的可解释性。我们首先引入了两个用于可解释仇恨视频检测的数据集Ex-HateMM和Ex-ImpliHateVid。每个数据集提供了多模态有害元素的细粒度标注以及上下文理由。然后,我们提出了一个用于可解释检测的信息增强与推理优化(IARE)框架。该框架采用信息增强阶段,利用多模态思维链整合有害元素,从而丰富理由证据。此外,IARE包含一个推理优化阶段,其中直接偏好优化引导模型走向正确的推理路径并远离错误的路径,从而提高其理由的逻辑连贯性。我们在两个数据集上进行了大量实验,将多个基线与我们提出的IARE框架进行比较。结果表明,IARE在生成准确理由的同时实现了最先进的性能。

英文摘要

Hateful videos have become prevalent on online platforms, highlighting an urgent need for effective detection. However, existing studies primarily focus on binary classification and fail to provide contextual rationales that reveal the implicit meanings behind these judgments, significantly undermining model explainability. To fill this gap, we aim to achieve explainable hateful video detection, enabling models to provide contextual rationales that integrate relevant evidence and logical reasoning alongside decisions. This approach can comprehensively enhance the understanding of video content and the explainability of the decision-making process. We first introduce two datasets, Ex-HateMM and Ex-ImpliHateVid, for explainable hateful video detection. Each dataset provides fine-grained annotations of multimodal harmful elements, along with contextual rationales. We then propose an Information Augmentation and Reasoning Enhancement (IARE) framework designed for explainable detection. The framework employs an information augmentation phase that leverages the multimodal chain-of-thought to integrate harmful elements, thereby enriching rationale evidence. Additionally, IARE incorporates a reasoning enhancement phase, in which Direct Preference Optimization guides the model toward correct reasoning paths and away from incorrect ones, thereby improving the logical coherence of its justifications. We conduct extensive experiments on the two datasets, comparing multiple baselines with our proposed IARE framework. The results demonstrate that IARE achieves state-of-the-art performance while also generating accurate rationales.
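
The reasoning enhancement phase relies on Direct Preference Optimization. As a rough illustration, the standard per-pair DPO loss is $-\log\sigma\big(\beta[(\log\pi(y_w)-\log\pi_{\text{ref}}(y_w))-(\log\pi(y_l)-\log\pi_{\text{ref}}(y_l))]\big)$. The function name, argument names, and the value of $\beta$ below are illustrative, and the abstract does not say how IARE constructs its preferred and rejected reasoning paths.

```python
import math


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss from sequence log-probabilities.

    logp_* come from the policy being trained, ref_logp_* from a frozen
    reference model; "chosen" is the preferred (correct) reasoning path.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Policy identical to the reference: loss equals log 2.
print(dpo_loss(-1.0, -2.0, -1.0, -2.0))
# Policy raises the chosen path and lowers the rejected one: loss decreases.
print(dpo_loss(-0.5, -3.0, -1.0, -2.0))
```

The loss falls as the policy assigns relatively more probability to the preferred rationale than the reference does, which is the sense in which DPO steers the model toward correct reasoning paths and away from incorrect ones.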

2606.11952 2026-06-11 cs.RO 新提交

Deformable In-Hand Slip-Aware Tactile Sensor with Integrated Velocity, Force/Torque, and Pressure Map Sensing

可变形手内滑移感知触觉传感器,集成速度、力/力矩和压力图传感

Gabriel Arslan Waltersson, Yiannis Karayiannidis

发表机构 * Chalmers University of Technology(查尔姆斯理工大学) Lund University(隆德大学)

AI总结 提出一种新型触觉传感器,通过可变形接触垫集成速度、力/力矩和压力图传感,实现手内操作的滑移感知控制,并支持快速低成本制造。

AI中文摘要

本文介绍了一种用于手内操作的新型触觉传感器,具有滑移感知控制功能,将速度、力/力矩和压力图传感集成到一个带有可变形接触垫的单一设备中。据我们所知,这是首个将这些传感模态结合在单一柔性结构中的传感器。该传感器具有可变形接触表面,能够稳健地跟踪各种材料上的平面和曲面。通过一系列全面的实验评估了其性能,突出了其能力和局限性。该传感器设计用于快速低成本制造,结合了标准PCB制造和快速原型制作技术。

英文摘要

This paper introduces a novel tactile sensor for in-hand manipulation with slip-aware control that integrates velocity, force/torque, and pressure map sensing into a single device with a deformable contact pad. To the best of our knowledge, this is the first sensor to combine these sensing modalities within a single compliant structure. The sensor features a deformable contact surface and can robustly track both flat and curved surfaces across a wide range of materials. Its performance is evaluated through a comprehensive set of experiments that highlight both its capabilities and limitations. The sensor is designed for rapid and low-cost fabrication using a combination of standard PCB manufacturing and rapid prototyping techniques.

2606.11949 2026-06-11 cs.LG cs.CR stat.ML 新提交

Online Shift Detection and Conformal Adaptation for Deployed Safety Classifiers

已部署安全分类器的在线漂移检测与共形自适应

Jun Wen Leong

AI总结 提出在线监测系统,使用校准序列统计检测分布漂移,并通过共形弃权层自适应阈值恢复目标错误率,在800个实验单元中实现86.6%有效检测。

Comments
16 pages, 4 figures, 7 tables. Code and data at this https URL
AI中文摘要

我们提出了一种在线监测系统,用于检测已部署安全分类器中的分布漂移,使用校准的序列统计量来检测分类器何时移出分布。一旦检测到,共形弃权层会自适应调整决策阈值,以恢复目标错误率ε=0.1。在一项预注册的析因评估(4个分类器×5种漂移条件×20个种子×2个窗口大小,共800个单元)中,该系统实现了86.6%的有效检测(693/800,95% CI [84.1%, 88.8%]),平均延迟为39.5步。检测在三种真实标签机制下均有效:合成发作(86.6%)、真实时间越狱(85%,17/20)和GCG对抗攻击。加权共形预测为DeBERTa恢复了高达39个百分点的丢失覆盖率(ESS=46/300),但所有其他分类器均崩溃(ESS≈300):逻辑密度比估计在高维嵌入空间中实现了完美的源/目标可分离性,将所有重要性权重裁剪至下限。DeBERTa显示出从有效校正(释义,ESS=46)到几乎完全崩溃(对抗后缀,ESS=206)的梯度。PCA降至32维打破了崩溃,为Llama Guard恢复了33个百分点,为ShieldGemma恢复了21个百分点。方差分解显示分类器(η²=0.243)、漂移类型(η²=0.237)及其交互作用(η²=0.185)均对检测延迟方差有显著贡献(所有p<0.001),表明需要针对每个分类器的监测配置文件。

英文摘要

We present an online monitoring system for distributional shift in deployed safety classifiers, using calibrated sequential statistics to detect when a classifier has moved out of distribution. Upon detection, a conformal abstention layer adapts decision thresholds to recover a target error rate ε=0.1. In a pre-registered factorial evaluation (4 classifiers × 5 shift conditions × 20 seeds × 2 window sizes, 800 cells), the system achieves 86.6% valid detection (693/800, 95% CI [84.1%, 88.8%]) with mean latency of 39.5 steps. Detection holds across three ground-truth regimes: synthetic onset (86.6%), real temporal jailbreaks (85%, 17/20), and GCG adversarial attacks. Weighted conformal prediction recovers up to 39 pp of lost coverage for DeBERTa (ESS=46/300) but collapses for all other classifiers (ESS≈300): logistic density ratio estimation achieves perfect source/target separability in high-dimensional embedding spaces, clipping all importance weights to the floor. DeBERTa shows a gradient from effective correction (paraphrase, ESS=46) to near-total collapse (adversarial suffix, ESS=206). PCA to 32 dimensions breaks the collapse, recovering 33 pp for Llama Guard and 21 pp for ShieldGemma. Variance decomposition reveals classifier (η²=0.243), shift type (η²=0.237), and their interaction (η²=0.185) all contribute substantially to detection latency variance (all p<0.001), indicating per-classifier monitoring profiles are necessary.
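
Two of the reported quantities can be sanity-checked with a few lines of arithmetic. The abstract does not state which interval method or which effective-sample-size (ESS) definition was used, so both choices below are assumptions: a Wilson score interval for the detection rate and Kish's formula $(\sum w)^2 / \sum w^2$ for ESS.

```python
import math
from typing import List, Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def kish_ess(weights: List[float]) -> float:
    """Kish effective sample size of a set of importance weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


low, high = wilson_interval(693, 800)
print(f"95% CI: [{low:.1%}, {high:.1%}]")

# If every importance weight is clipped to the same floor value, the weights
# are uniform and the ESS equals the number of calibration points.
print(kish_ess([0.01] * 300))
```

Under these assumptions the Wilson interval agrees with the reported bounds, and uniform (fully clipped) weights give an ESS equal to the sample count. That would be consistent with the abstract's reading of ESS≈300 as a collapse, because uniform weights apply no correction for the shift.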

2606.11946 2026-06-11 cs.DB cs.CC cs.LG cs.LO 新提交

Neuro-Relational Programs: Unifying Queries and Neural Computation over Structured Data

神经关系程序:统一结构化数据上的查询与神经计算

Arie Soeteman, Balder ten Cate, Maurice Funk, Benny Kimelfeld, Carsten Lutz, Moritz Schönherr

AI总结 提出神经关系程序(NRP),一种扩展Datalog规则的声明式查询语言,通过嵌入操作融合关系推理与可学习神经组件,实现关系数据上的通用神经计算。

Comments
37 pages
AI中文摘要

在关系数据库上进行深度学习的传统方法是将图神经网络(GNN)等神经模型应用于数据库的图表示。最近的方法则直接操作数据库,将元组与嵌入关联,并扩展查询机制以联合处理嵌入和关系内容。受这些发展的启发,我们引入了神经关系程序(NRP),这是一种针对关系数据库的声明式查询语言,其事实携带数值向量嵌入。NRP扩展了Datalog风格的规则,增加了组合、聚合和转换嵌入的操作,从而在单一形式主义中交错关系推理和可学习神经组件。这产生了一种对关系数据进行神经计算的通用方法:NRP既可以看作带有可训练组件的查询计划,也可以看作内置关系结构的神经架构。NRP的自然语法片段恢复了现有架构和查询形式主义。零元NRP对应于非自适应查询算法;一元NRP推广了GNN风格的消息传递,并精确捕捉了深度同态网络,我们将这一联系扩展到带有行ID的数据库上的前沿保护NRP。我们通过FOCQ(一阶逻辑在实权重结构上的计数扩展)刻画了带有ReLU-FFN变换的无限制NRP的表达能力,从而建立了与有序数据库上的均匀TC$^0$的精确联系。这些结果共同确立了NRP作为关系数据上查询和神经计算的广泛声明式框架。

英文摘要

The conventional approach to deep learning over relational databases applies neural models, such as Graph Neural Networks (GNNs), to a graph representation of the database. Recent approaches instead operate on databases directly, associating tuples with embeddings and extending query mechanisms to jointly process embeddings and relational content. Inspired by these developments, we introduce Neuro-Relational Programs (NRPs), a declarative query language for relational databases whose facts carry numeric vector embeddings. NRPs extend Datalog-style rules with operations that combine, aggregate, and transform embeddings, thereby interleaving relational reasoning and learnable neural components within a single formalism. This yields a general approach to neural computation over relational data: an NRP can be read both as a query plan with trainable components and as a neural architecture with relational structure built in. Natural syntactic fragments of NRPs recover existing architectures and query formalisms. Zero-ary NRPs correspond to non-adaptive query algorithms; monadic NRPs generalize GNN-style message passing and precisely capture Deep Homomorphism Networks, a connection that we extend to frontier-guarded NRPs over databases with row-ids. We characterize the expressive power of unrestricted NRPs with ReLU-FFN transformations by FOCQ, an extension of first-order logic with counting interpreted over real-weighted structures, yielding a precise connection with uniform TC$^0$ over ordered databases. Together, these results establish NRPs as a broad declarative framework for querying and neural computation over relational data.

2606.11945 2026-06-11 cs.CL cs.IR 新提交

uva-irlab-conv at SemEval-2026 Task 8: Multi-Turn RAG with Learned Sparse Retrieval and Listwise Reranking

uva-irlab-conv 在 SemEval-2026 任务 8:基于学习型稀疏检索和列表式重排序的多轮 RAG

Simon Lupart, Kidist Amde Mekonnen, Zahra Abbasiantaeb, Mohammad Aliannejadi

发表机构 * University of Amsterdam(阿姆斯特丹大学)

AI总结 提出结合学习型稀疏检索与基于 LLM 的重排序和生成的多轮检索增强生成流水线,用于跨四个领域的对话系统,有效处理不可回答查询。

Comments
SemEval-2026, The 20th International Workshop on Semantic Evaluation, collocated with ACL 2026, 9 pages, 5 figures, 6 tables
AI中文摘要

本报告描述了我们在 SemEval-2026 任务 8(多轮检索与问答)中的参与情况。该任务评估跨四个领域(金融、云文档、政府、维基百科)的对话系统,并包括不可回答的查询,即可用集合中没有足够证据来生成完整回答。我们提出了一种多轮检索增强生成流水线,将学习型稀疏检索与基于 LLM 的重排序和生成相结合。使用稀疏检索作为主要检索方法,我们利用了其跨领域的强泛化能力。此外,我们利用 LLM 的长上下文能力进行对话查询重写、逐点和列表式重排序以及生成最终回答,每一步都基于完整的对话历史。这种多步骤设计使得在整个检索和生成过程中有效整合对话上下文,提高了跨领域的鲁棒性。

英文摘要

This report describes our participation in SemEval-2026 Task 8 on multi-turn retrieval and question answering. The task evaluates conversational systems across four domains (finance, cloud documentation, government, Wikipedia), and includes unanswerable queries where the available collection does not contain sufficient evidence to produce a complete response. We propose a multi-turn retrieval-augmented generation pipeline that combines learned sparse retrieval with LLM-based reranking and generation. Using sparse retrieval as the primary retrieval method, we leverage its strong generalization across domains. In addition, we make use of the long-context capabilities of LLMs for conversational query rewriting, pointwise and listwise reranking, and generating the final response, each conditioned on the full conversational history. This multi-step design enables effective integration of conversational context throughout retrieval and generation, improving robustness across domains.

2606.11937 2026-06-11 cs.DC cs.PF 新提交

From Fork-Join to Asynchronous Tasks: Parallelizing Tiled Cholesky Decomposition with OpenMP and HPX

从Fork-Join到异步任务:使用OpenMP和HPX并行化瓦片Cholesky分解

Alexander Strack, Alexander Van Craen, Dirk Pflüger

AI总结 本文通过Cholesky-Bench基准,比较了OpenMP和HPX运行时下四种瓦片Cholesky分解并行变体,发现HPX在最优瓦片大小下性能优于OpenMP 15%-30%,异步任务开销降低约3.8倍。

Comments
15 pages, 8 figures, accepted paper at AMTE held in conjunction with PPAM 2026
AI中文摘要

由OpenMP推广的Fork-Join并行性仍然是共享内存并行编程的主导模型,但其隐式同步屏障会惩罚工作负载不均匀的算法。异步多任务(AMT)运行时通过将工作表示为细粒度任务的依赖图来绕过这些屏障。然而,与精心编写的fork-join基线相比,实际的性能优势很少被量化。在这项工作中,我们引入了Cholesky-Bench,并利用它重新审视了瓦片Cholesky分解(一个典型的不规则内核),比较了两种运行时(GCC和LLVM附带的OpenMP实现,以及HPX AMT运行时)中右视算法的四种并行化变体。这些变体包括经典的fork-join、暴露额外内循环并行性的折叠fork-join、同步任务以及具有显式数据依赖的异步任务。我们在双插槽128核AMD Zen 2节点上,针对多种瓦片大小和问题大小,对所有八种组合进行了基准测试。我们的结果表明,在所有变体中,HPX在最优瓦片大小下比OpenMP快15%-30%。具体来说,异步HPX任务比对应的OpenMP任务快高达26%,并且任务开销大约小3.8倍。此外,折叠fork-join变体缩小了与同步任务的大部分差距。消除冗余同步屏障带来了额外的改进,OpenMP为7%,HPX为14%。GCC与LLVM的比较进一步揭示了fork-join调度和任务创建开销中编译器特定的差异。

英文摘要

Fork-join parallelism, popularized by OpenMP, remains the dominant model for shared-memory parallel programming, but its implicit synchronization barriers can penalize algorithms with inhomogeneous workloads. Asynchronous many-task (AMT) runtimes sidestep these barriers by expressing work as a dependency graph of fine-grained tasks. Yet, the actual performance benefit over a carefully written fork-join baseline is rarely quantified. In this work, we introduce Cholesky-Bench and use it to revisit the tiled Cholesky decomposition, a canonical irregular kernel, comparing four parallelization variants of the right-looking algorithm across two runtimes: the OpenMP implementations shipped with GCC and LLVM, and the HPX AMT runtime. The variants span classical fork-join, a collapsed fork-join that exposes additional inner-loop parallelism, synchronous tasking, and asynchronous tasking with explicit data dependencies. We benchmark all eight combinations on a dual-socket 128-core AMD Zen 2 node across multiple tile sizes and problem sizes. Our results show that across all variants, HPX outperforms OpenMP at the optimal tile size by 15%-30%. Specifically, asynchronous HPX tasks are up to 26% faster than their OpenMP counterparts, and exhibit roughly 3.8x smaller task overhead. Furthermore, the collapsed fork-join variants close most of the gap to synchronous tasking. Removing redundant synchronization barriers yields an additional improvement of 7% (OpenMP) to 14% (HPX). A GCC-versus-LLVM comparison further reveals compiler-specific differences in fork-join scheduling and task-creation overheads.
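
The irregularity of the tiled right-looking Cholesky decomposition is easier to see when its tile-level tasks are enumerated. The sketch below only lists tasks and does no numerical work. The kernel names POTRF, TRSM, SYRK, and GEMM follow the usual BLAS/LAPACK naming for these tile operations and are an assumption here, since the abstract does not name the kernels; `right_looking_tasks` is a hypothetical helper.

```python
from collections import Counter
from typing import List, Tuple


def right_looking_tasks(num_tiles: int) -> List[Tuple[str, int, int, int]]:
    """Tile tasks (kernel, row, col, step) of a right-looking tiled Cholesky."""
    tasks = []
    for k in range(num_tiles):
        tasks.append(("POTRF", k, k, k))          # factor the diagonal tile
        for i in range(k + 1, num_tiles):
            tasks.append(("TRSM", i, k, k))       # solve the panel below it
        for i in range(k + 1, num_tiles):
            tasks.append(("SYRK", i, i, k))       # update trailing diagonal tiles
            for j in range(k + 1, i):
                tasks.append(("GEMM", i, j, k))   # update trailing off-diagonal tiles
    return tasks


tasks = right_looking_tasks(4)
per_kernel = Counter(kernel for kernel, _, _, _ in tasks)
per_step = Counter(step for _, _, _, step in tasks)
print(per_kernel)
print(per_step)
```

The amount of work per outer step shrinks as the factorization proceeds, so a barrier after each step leaves cores idle in the later steps. This is the kind of inhomogeneous workload the abstract says fork-join penalizes and that task graphs with explicit dependencies aim to address.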