arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.20414 2026-06-19 cs.AR New submission

ExSpike: A General Full-Event Neuromorphic Architecture for Exploiting Irregular Sparsity with Event Compression

Yuehai Chen, Farhad Merchant

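AI summary Proposes ExSpike, a general full-event neuromorphic architecture that achieves pure event-driven execution through dataflow optimizations and introduces adjacent-position event compression to reduce redundant accumulations, enabling energy-efficient SNN acceleration on an FPGA.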

Comments Accepted by the 36th International Conference on Field-Programmable Logic and Applications (FPL 2026); 9 pages, 9 figures

English abstract

Spiking neural networks (SNNs) promise energy-efficient computing due to their sparse spatio-temporal activity. However, effectively translating such irregular sparsity into practical performance and energy gains remains challenging, as full-event computing architectures are still underexplored. This paper proposes ExSpike, a general full-event neuromorphic architecture that fully exploits irregular sparsity in SNNs. To realize pure event-driven execution, we first propose a set of dataflow optimizations to ensure that the inputs to each SNN layer remain spike-based, thereby enabling full-event execution throughout the network. We then design a hardware-efficient full-event architecture, named ExSpike, which supports the optimized pure event-driven dataflow and an additional Attention Core for spike-driven self-attention. To further improve computing efficiency, we introduce adjacent-position event compression to reduce redundant accumulations across spatially adjacent spike sequences. ExSpike is implemented on an AMD Xilinx Virtex-7 FPGA and evaluated on both classification and segmentation workloads. Experimental results show that ExSpike achieves high normalized energy efficiency across diverse SNN models while maintaining competitive accuracy, delivering up to 479.15 GOPS, 281.85 GOPS/W, and 0.80 GOPS/W/PE. In particular, ExSpike achieves up to 10x higher PE-normalized energy efficiency than the state-of-the-art FPGA-based SNN accelerator (FireFly-T). The code for ExSpike is available at https://github.com/xiaoyuehai/ExSpike.
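
As a rough illustration of why event-driven execution benefits from sparsity, the sketch below compares a dense accumulation with one that touches only the weight rows of inputs that actually spiked. It is a toy in plain Python, not the ExSpike dataflow: the function names, the 8-input layer, and the weight values are assumptions made for illustration, and adjacent-position event compression is not modeled.

```python
from typing import List, Tuple


def dense_accumulate(spikes: List[int], weights: List[List[float]]) -> List[float]:
    """Reference path: visit every input, whether it spiked or not."""
    n_out = len(weights[0])
    out = [0.0] * n_out
    for i, s in enumerate(spikes):
        for j in range(n_out):
            out[j] += s * weights[i][j]
    return out


def event_accumulate(events: List[int], weights: List[List[float]],
                     n_out: int) -> Tuple[List[float], int]:
    """Event-driven path: only inputs listed in `events` contribute."""
    out = [0.0] * n_out
    additions = 0
    for i in events:
        for j in range(n_out):
            out[j] += weights[i][j]  # a spike has value 1, so no multiply
            additions += 1
    return out, additions


# Hypothetical layer: 8 inputs, 3 outputs, 2 of 8 inputs spiking.
spikes = [0, 1, 0, 0, 1, 0, 0, 0]
weights = [[0.5 * i + j for j in range(3)] for i in range(8)]
events = [i for i, s in enumerate(spikes) if s]

dense_out = dense_accumulate(spikes, weights)
event_out, additions = event_accumulate(events, weights, n_out=3)
```

Both paths produce the same accumulated values, but the event-driven one performs 6 additions where the dense one performs 24 multiply-accumulates for this input.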

2606.20411 2026-06-19 cs.LG New submission

Direct Advantage Estimation for Scalable and Sample-efficient Deep Reinforcement Learning

Hsiao-Ru Pan, Bernhard Schölkopf

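AI summary Addresses the limitations of Direct Advantage Estimation (DAE) under partial observability and high-dimensional observations by extending its theoretical framework and introducing discrete latent dynamics models that reduce computational complexity; experiments on the Arcade Learning Environment demonstrate DAE's scalability and sample efficiency.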

Comments Accepted at RLC2026

English abstract

Direct Advantage Estimation (DAE) has been shown to improve the sample efficiency of deep reinforcement learning algorithms. However, its reliance on full environment observability limits its applicability in realistic settings, and its requirement to model transition probabilities incurs substantial computational overhead for high-dimensional observations. In the present work, we address both limitations. First, we extend the theoretical framework of DAE to partially observable domains with minimal modifications. Second, we reduce its computational complexity by introducing discrete latent dynamics models that efficiently approximate transition probabilities. We evaluate our approach on the Arcade Learning Environment and find that DAE scales effectively with function approximator capacity while retaining high sample efficiency.
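
The abstract does not restate what an advantage is, so here is a minimal sketch using the standard textbook definition, A(s, a) = Q(s, a) - V(s) with V(s) the policy-weighted average of Q. The numbers and the function name are made up for illustration; this is not the DAE estimator itself, only the quantity it targets.

```python
from typing import List


def advantages(q_values: List[float], policy: List[float]) -> List[float]:
    """Return A(s, a) = Q(s, a) - V(s) for a single state."""
    state_value = sum(p * q for p, q in zip(policy, q_values))
    return [q - state_value for q in q_values]


# Hypothetical state with three actions.
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]

adv = advantages(q, pi)
# By construction, advantages average to zero under the policy.
policy_weighted_mean = sum(p * a for p, a in zip(pi, adv))
```

The zero policy-weighted mean follows directly from the definition, which is why an advantage function says how much better or worse an action is than the policy's average behaviour in that state.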

2606.20410 2026-06-19 cs.MS New submission

MaRDI Open Interfaces for Interoperable Nonlinear Optimization

Dmitry I. Kabanov, Stephan Rave, Mario Ohlberger

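AI summary Presents the MaRDI Open Interfaces software package, which improves interoperability between solvers and programming languages in nonlinear optimization through unified interfaces for numerical problems and automatic data marshalling, reducing code-modification and testing costs.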

Comments 12 pages, 1 figure, 1 table, deRSE2026

English abstract

MaRDI Open Interfaces is a software package that aims to improve interoperability in scientific computing, particularly for nonlinear optimization. To this end, the package has two main features. First, it provides unified interfaces for typical numerical problems, which makes it easier to switch between solvers for the same problem type. Second, it automates data marshalling between programming languages. Hence, computational scientists can conduct experiments faster by using the package, with less effort spent on code modification and testing. In this work, we describe the general structure of the software package and show examples with the interface for nonlinear optimization.
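
To make the "unified interface" idea concrete, the sketch below puts two toy solvers behind one call signature so that switching solvers is a one-string change. Everything here is hypothetical (the `minimize` function, the `SOLVERS` registry, and both solvers) and is not the MaRDI Open Interfaces API; cross-language data marshalling is not shown.

```python
from typing import Callable, Dict, List

Vector = List[float]
Objective = Callable[[Vector], float]
Gradient = Callable[[Vector], Vector]


def gradient_descent(f: Objective, grad: Gradient, x0: Vector,
                     lr: float = 0.1, steps: int = 500) -> Vector:
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(steps):
        x = [xi - lr * gi for xi, gi in zip(x, grad(x))]
    return x


def momentum_descent(f: Objective, grad: Gradient, x0: Vector,
                     lr: float = 0.1, beta: float = 0.5,
                     steps: int = 500) -> Vector:
    """Heavy-ball gradient descent (gradient step plus momentum term)."""
    x = list(x0)
    velocity = [0.0] * len(x)
    for _ in range(steps):
        velocity = [beta * v - lr * gi for v, gi in zip(velocity, grad(x))]
        x = [xi + v for xi, v in zip(x, velocity)]
    return x


# Hypothetical registry: every solver shares the (f, grad, x0) signature.
SOLVERS: Dict[str, Callable[..., Vector]] = {
    "gd": gradient_descent,
    "momentum": momentum_descent,
}


def minimize(solver_name: str, f: Objective, grad: Gradient, x0: Vector) -> Vector:
    """Single entry point; the solver is selected by name."""
    return SOLVERS[solver_name](f, grad, x0)


def f(x: Vector) -> float:
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def grad(x: Vector) -> Vector:
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


results = {name: minimize(name, f, grad, [0.0, 0.0]) for name in SOLVERS}
```

The problem definition (`f`, `grad`, the starting point) is written once, and only the solver name changes between runs.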

2606.20408 2026-06-19 cs.CR cs.AI New submission

LLM agent safety, multi-turn red-teaming, jailbreak benchmarks, adversarial robustness, safety-critical systems

Hanwool Lee, Dasol Choi, Bokyeong Kim, Seung Geun Kim, Haon Park

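AI summary Presents the NRT-Bench benchmark, which evaluates the adversarial robustness of LLM agents in safety-critical systems through multi-turn red-teaming in a simulated nuclear power plant control room; it finds that vulnerabilities barely overlap across models and that the effect of defences is strongly model-dependent.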

English abstract

Large language model (LLM) agents are increasingly proposed as supervisory components for safety-critical systems, yet their robustness under sustained, adaptive adversarial pressure remains poorly characterized. We present NRT-Bench, a benchmark for multi-turn red-teaming of LLM agents acting as operators of a safety-critical system, instantiated in a simulated nuclear power plant control room. A five-role operator team, each backed by a configurable LLM, runs a plant governed by six critical safety functions (CSFs), while adversaries inject messages over four channels in bounded multi-turn sessions with per-turn feedback. Harm is an objective signal rather than LLM-judged text: a run terminates the moment any CSF is lost, attributed to the causing message. Evaluating four frontier operator models under a fixed-attack paired-replay protocol, we find that adaptive multi-turn attacks reliably push the operator team past a safety limit: across the four models, between 8.7% and 12.1% of attack sessions end with the plant losing a critical safety function. Although the four models look almost equally robust by this aggregate rate, their failures barely overlap: of 149 sessions, none defeat all four models while a third defeat at least one, so vulnerabilities are nearly disjoint across models rather than nested. The effect of added defences is strongly model-dependent: the same guardrail stack or safety-advisor agent that lowers attack success for one model can raise it for another. We release the simulation venue, attack dataset, and replay tooling for reproducible safety evaluation of LLM agents.
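
The "disjoint rather than nested" observation comes down to set overlap between the sessions that defeat each model. The sketch below shows one way such overlap could be measured. The session IDs and model names are synthetic placeholders, not data from NRT-Bench, and the Jaccard index is an assumed choice of overlap metric.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Overlap of two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Synthetic example: session IDs that defeated each (hypothetical) model.
defeated: Dict[str, Set[int]] = {
    "model_a": {1, 2, 3},
    "model_b": {3, 4, 5},
    "model_c": {6, 7},
    "model_d": {8, 9, 10},
}

# Sessions that defeat every model, and sessions that defeat at least one.
defeats_all = set.intersection(*defeated.values())
defeats_any = set.union(*defeated.values())

# Pairwise overlap: values near 0 mean nearly disjoint failure sets.
pairwise: Dict[Tuple[str, str], float] = {
    (m, n): jaccard(defeated[m], defeated[n])
    for m, n in combinations(sorted(defeated), 2)
}
```

With nested vulnerabilities, the weakest model's set would contain the others and the pairwise overlaps would be high. Here the per-model counts are similar, yet the sets share almost nothing.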

2606.20404 2026-06-19 cs.CV New submission

FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

Daniel Gilo, Sven Elflein, Ido Sobol, Or Litany

Affiliations Technion, NVIDIA, University of Toronto, Vector Institute

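AI summary To address conditional diffusion and flow models that often violate their task constraints, proposes FlowBender, a closed-loop framework that takes the alignment error as an input and trains the network to learn a correction policy, improving both fidelity and plausibility in image translation, restoration, and 3D texturing.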

Comments Project page: https://flow-bender.github.io/

English abstract

Conditional diffusion and flow models routinely fail to satisfy the very constraints that define their task. For instance, a depth-conditioned model often produces images whose re-extracted depth disagrees with the input, even though the forward operator (the depth predictor defining the constraint) is available during both training and inference. Existing approaches generally fall into two categories: supervised models that treat the conditioning signal as a static cue and ignore alignment information at inference, and guidance-based methods that consult it through hand-tuned linear updates, typically trading fidelity to the condition against the plausibility of the generated sample. We argue that the fundamental gap in both paradigms is that the model is never trained to utilize its own alignment error. We introduce FlowBender, a closed-loop framework that treats this error as a first-class input, training the network to learn a correction policy conditioned on inference-time feedback. At each step, an unguided look-ahead pass estimates the clean signal, a task-specific deviation is computed via the forward operator, and a refinement pass consumes this signal to produce a corrected velocity. We propose several variants of FlowBender, including a gradient-based formulation for differentiable operators and a zero-order variant for non-differentiable settings such as JPEG compression. For efficient sampling, we introduce a prior-step shortcut that enables closed-loop correction at a minimal additional computational cost. Across image-to-image translation, restoration, and 3D mesh texturing, FlowBender consistently outperforms standard supervised baselines, alignment-loss-augmented training, and state-of-the-art inference-time guidance, improving fidelity and plausibility simultaneously rather than trading them against each other. Project page: https://flow-bender.github.io/
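
The three passes described in the abstract (look-ahead, deviation, refinement) can be sketched as a sampling loop. This is a structural toy under several assumptions: the flow runs from t = 0 to t = 1 along straight lines, so the clean estimate is x + (1 - t) * v; the forward operator is a simple sum; and `toy_refine` is a hand-written stand-in for the correction network that FlowBender trains. All names are hypothetical.

```python
from typing import Callable, List

Vec = List[float]


def closed_loop_sample(x0: Vec, cond: float, steps: int,
                       velocity: Callable, forward_op: Callable,
                       refine: Callable) -> Vec:
    """Euler sampler with a look-ahead / deviation / refinement loop."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity(x, t)                          # 1. unguided look-ahead pass
        x1_hat = [xi + (1.0 - t) * vi for xi, vi in zip(x, v)]
        deviation = cond - forward_op(x1_hat)       # 2. task-specific deviation
        v_corr = refine(x, t, v, deviation)         # 3. refinement pass
        x = [xi + dt * vi for xi, vi in zip(x, v_corr)]
    return x


GUESS = [1.0, 3.0]  # the toy base model sends every sample here


def toy_velocity(x: Vec, t: float) -> Vec:
    return [(g - xi) / (1.0 - t) for g, xi in zip(GUESS, x)]


def toy_forward_op(x: Vec) -> float:
    return sum(x)


def toy_refine(x: Vec, t: float, v: Vec, deviation: float) -> Vec:
    # Spread the deviation evenly over components (stand-in for a learned net).
    shift = deviation / len(x)
    return [vi + shift / (1.0 - t) for vi in v]


def no_refine(x: Vec, t: float, v: Vec, deviation: float) -> Vec:
    return v  # open-loop baseline: feedback is ignored


open_loop = closed_loop_sample([0.0, 0.0], 6.0, 10, toy_velocity, toy_forward_op, no_refine)
closed_loop = closed_loop_sample([0.0, 0.0], 6.0, 10, toy_velocity, toy_forward_op, toy_refine)
```

In this toy, the open-loop sample ends at the base model's guess (sum 4) and misses the condition (sum 6), while the closed-loop sample satisfies it. The learned correction in the paper replaces the hand-written `toy_refine`.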

2606.20401 2026-06-19 eess.SY cs.SY New submission

PowerAgentBench-Dyn: A Benchmark for Agentic AI in Power System Dynamic Studies

Qian Zhang, Andrea Pomarico, Costas Mylonas, Magda Foti, Alberto Berizzi, Le Xie

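AI summary Presents the PowerAgentBench-Dyn benchmark for evaluating LLM-based agents on power system dynamic-analysis tasks, covering two tasks: model quality review and security risk screening.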

English abstract

Large Language Model (LLM)-based agents are increasingly being used to automate multi-step engineering workflows by interacting with software tools, interpreting intermediate results, and autonomously planning subsequent actions. Power system dynamic studies represent a particularly promising yet largely unexplored application domain for these agents. Unlike static computational tasks, dynamic studies often require more time on model parameter calibration, engineering judgment, and decision making under constrained action spaces. This paper introduces PowerAgentBench-Dyn, a benchmark designed to evaluate Agentic AI systems on power system dynamic-analysis tasks. The benchmark targets problems that cannot be reduced to a single optimization or coding task, but instead require the kind of reasoning, tool usage, and iterative experimentation routinely performed by experienced power system engineers. The proposed framework includes two initial benchmark tasks. The first, the Dynamic Model Quality Review Benchmark, evaluates agents' ability to validate and diagnose dynamic models based on model-quality compliance criteria specified by system operators. The second, the Dynamic Security Risk Screening Benchmark, assesses agents' capability to leverage semantic memory and a limited simulation budget to identify, rank, and analyze the most critical short-circuit contingencies from an unseen fault dataset, as well as propose and evaluate possible mitigation measures. For each task, we define the simulation environment, observation and action spaces, and evaluation metrics. The benchmark is reproducible in a metric-based sense: released cases and simulator settings define a deterministic evaluator, while stochastic agent behavior is assessed over repeated runs using success rates and other metrics. The benchmark supports the development of future Agentic AI for power system operation and planning.
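
The "reproducible in a metric-based sense" idea, a deterministic evaluator scoring a stochastic agent over repeated runs, can be sketched as follows. The evaluator, the toy agent, its 60% hit probability, and the ground-truth value are all invented for illustration and have no connection to the benchmark's actual tasks.

```python
import random

TRUTH = 7  # hypothetical ground-truth answer for one released case


def evaluate(answer: int) -> bool:
    """Deterministic evaluator: same answer, same verdict, every time."""
    return answer == TRUTH


def toy_agent(rng: random.Random) -> int:
    """Stochastic stand-in for an LLM agent (right about 60% of the time)."""
    if rng.random() < 0.6:
        return TRUTH
    return rng.randint(0, 6)  # any wrong answer


def success_rate(n_runs: int, seed: int) -> float:
    """Fraction of successful runs; the seed makes the estimate repeatable."""
    rng = random.Random(seed)
    successes = sum(evaluate(toy_agent(rng)) for _ in range(n_runs))
    return successes / n_runs


rate = success_rate(200, seed=0)
```

Individual runs differ, but because the evaluator is fixed, the success rate over many runs is a stable quantity that different groups can compare.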

2606.20400 2026-06-19 cs.LG New submission

The Significance of Style Diversity in Annotation-Free Synthetic Data Generation

Zahra Abbasiantaeb, Zeno Belligoli, Omar Essam, Mohammad Aliannejadi

Affiliations University of Amsterdam

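AI summary Proposes a dialogue generation framework that needs no human annotation, uses topic and style attributes to increase diversity, and adds two post-hoc stylization models; experiments show that style diversity matters more than topic diversity, with performance reaching up to 93.3% of that obtained with human-annotated data.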

English abstract

Generating high-utility synthetic data for intent classification typically requires human-annotated seed data, which is often unavailable in fast-paced industrial settings. In this paper, we propose a framework for synthetic dialogue generation that works entirely without human-annotated data, relying solely on intent definitions. Our proposed dialogue generation framework utilizes two different types of topic and style attributes to improve data diversity. Also, we propose two novel post-hoc stylization models called Univ and Exam to transform synthetic LLM-generated utterances into more varied, human-like linguistic styles. To enhance data quality, we utilize an LLM-as-a-judge filtering process. Experimental results on both industrial and public datasets demonstrate that the proposed approach achieves up to 93.3% of the performance obtained using human-annotated training data. Crucially, the findings reveal that style diversity is more critical than topic diversity for synthetic data utility, as it prevents models from learning spurious stylistic correlations. Furthermore, the study shows that incorporating style attributes during the generation process is more effective than post-hoc style adaptation.

2606.20399 2026-06-19 cs.CC New submission

Linked Fates: How Small of an Ambiguity Increase Can Make the Difference Between Equaling and Separating from P?

Benjamin Carleton, Michael C. Chavrimootoo, Lane A. Hemaspaandra, David E. Narváez, Conor Taliancich, Melissa Welsh

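AI summary Studies whether the ambiguity-bounded versions of NP, UP_{≤f(n)}, equal P; using path-poisoning and padding techniques, it proves that for certain ambiguity ranges P=UP_{≤f1(n)} implies P=UP_{≤f2(n)}, and gives relativized results showing that the implication fails to hold robustly in other cases.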

English abstract

Ambiguity-bounded versions of $\mathrm{NP}$, denoted $\mathrm{UP}_{\leq f(n)}$, bound by $f(n)$ the number of accepting paths the nondeterministic polynomial-time Turing machine can have on inputs of length $n$. Such classes range from Valiant's completely unambiguous ($f(n)=1$) class $\mathrm{UP}$ to $\mathrm{NP}$ itself, where there is no bound or, equivalently, there is the toothless exponential bound ($f(n) = 2^{n^{O(1)}}$). This paper seeks to understand which of these classes stand and fall together as to whether they equal deterministic polynomial time. Informally put, what ranges of ambiguities have linked fates? That is, for which pairs of nondecreasing functions, $(f_1, f_2)$, satisfying $(\forall n)[f_1(n) \leq f_2(n)]$, does it hold that $\mathrm{P} = \mathrm{UP}_{\leq f_1(n)} \implies \mathrm{P} = \mathrm{UP}_{\leq f_2(n)}$? More particularly, for which pairs does that hold robustly, i.e., it holds in the real world and every relativized world? And for which pairs does that implication fail to hold robustly, i.e., there is an oracle $A$ such that $\mathrm{P}^A = \mathrm{UP}_{\leq f_1(n)}^A \subsetneq \mathrm{UP}_{\leq f_2(n)}^A$? The only previously known positive result is Watanabe's 1988 result that $\mathrm{P} = \mathrm{UP}_{\leq 1} \implies (\forall k \geq 1)[\mathrm{P} = \mathrm{UP}_{\leq k}]$, which even holds robustly. His result, though lovely, applies only to constant-bounded ambiguities. As our positive result, we present a new class of cases (Theorem 3.8) that apply (and even robustly apply) at greater ambiguity levels. To give our class of cases, we leverage two approaches: a novel path-poisoning approach that works even on superconstant ambiguities (Theorem 3.5) and a new application of the power of padding (Theorems 3.3/3.4). As negative results, we show that for essentially all other cases, no linkage holds robustly.
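
For readers who want the definitions spelled out, the block below writes the class and the trivial direction of the question in LaTeX. The notation $\#\mathrm{acc}_N(x)$ for the number of accepting paths of machine $N$ on input $x$ is an assumption introduced here, and $f$ is assumed to take values of at least 1. Only the standard definitions are restated, not any result of the paper.

```latex
% Definition: ambiguity bounded by f(n).
\mathrm{UP}_{\leq f(n)} = \{\, L \mid (\exists\ \text{NPTM } N)\,
  [\, L = L(N) \ \wedge\ (\forall x)\,[\#\mathrm{acc}_N(x) \leq f(|x|)] \,] \,\}

% Monotonicity: a larger bound gives a larger (or equal) class.
(\forall n)[f_1(n) \leq f_2(n)] \implies
  \mathrm{P} \subseteq \mathrm{UP}_{\leq f_1(n)} \subseteq
  \mathrm{UP}_{\leq f_2(n)} \subseteq \mathrm{NP}

% Hence the downward implication is immediate:
\mathrm{P} = \mathrm{UP}_{\leq f_2(n)} \implies \mathrm{P} = \mathrm{UP}_{\leq f_1(n)}
```

Because the downward direction follows from the inclusion chain alone, the upward implication from $f_1$ to $f_2$ is the nontrivial one, and it is the one the abstract asks about.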

2606.20390 2026-06-19 cs.CV New submission

Geometry-Aware Superpixel Graph Transformer with Metadata for Skin Lesion Classification

Muhammad Azeem, Tanveer Hussain, Amr Ahmed, Ardhendu Behera

Affiliations Edge Hill University

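AI summary Proposes a region-based graph learning framework that models lesions as superpixel graphs, uses geometric edge attributes and a metadata context node, and performs multimodal fusion with an edge-aware graph transformer, outperforming existing methods on four public datasets.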

Comments Accepted at MICCAI 2026

English abstract

Automated skin cancer classification from dermoscopic images remains challenging due to heterogeneous lesion structure, strong intra-class variability, and subtle visual differences between benign and malignant cases. Existing CNN/ViT pipelines typically rely on global or patch-level features and often combine patient metadata via late fusion, which limits spatially grounded multimodal reasoning. We present a novel region-based graph learning framework that explicitly models lesions as graphs of spatially coherent superpixel regions represented as frozen CNN features. To capture fine-grained lesion arrangements, we encode inter-regional geometry as edge attributes and introduce a dedicated metadata context node connected to all regions, providing structured integration of demographic/clinical variables within the same relational space. Node representations are updated using our edge-aware graph transformer followed by attention-driven propagation, which produces a final graph-level embedding for benign-malignant classification. Experiments on four public benchmarks demonstrate that explicit region-level relational modeling and graph-native multimodal fusion yield consistent gains over the state of the art. Consequently, we establish a new graph-centric perspective in which CNN features are modeled as relational nodes and improved through contextual integration, yielding more expressive and robust classifications.
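
The graph construction described above (region nodes, geometric edge attributes, one metadata context node linked to every region) can be sketched in a few lines. The choices here are assumptions for illustration: regions are reduced to centroids, two regions are linked when their centroids lie within a radius, and the edge attribute is a (distance, angle) pair. The paper may define adjacency and geometry differently.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def build_lesion_graph(centroids: List[Tuple[float, float]], radius: float):
    """Return region-region edge attributes, context edges, and the context node id."""
    n = len(centroids)
    context = n  # the metadata context node gets the next free index
    edge_attr: Dict[Edge, Tuple[float, float]] = {}
    for i in range(n):
        for j in range(i + 1, n):
            dx = centroids[j][0] - centroids[i][0]
            dy = centroids[j][1] - centroids[i][1]
            dist = math.hypot(dx, dy)
            if dist <= radius:
                # Geometric edge attribute: distance and direction from i to j.
                edge_attr[(i, j)] = (dist, math.atan2(dy, dx))
    # The context node is connected to every region node.
    context_edges = [(i, context) for i in range(n)]
    return edge_attr, context_edges, context


# Three hypothetical superpixel centroids; only the first two are close.
edge_attr, context_edges, context = build_lesion_graph(
    [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)], radius=1.5
)
```

Region 2 has no region neighbours here, yet it still exchanges information with the rest of the graph through the context node, which is what places metadata in the same relational space as the image regions.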

2606.20389 2026-06-19 cs.RO New submission

CoLI: A Reproducible Platform for Continuum Robot Learning via Monolithic 3D Printing and Isomorphic Teleoperation

Ziyuan Tang, Chenxi Xiao*

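AI summary Presents a continuum robot platform based on multi-material 3D printing and isomorphic teleoperation that simplifies fabrication, provides singularity-free mapping control, and supports imitation-learning-based autonomous control; hardware characterization and manipulation tasks validate its reproducibility and learning readiness.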

Comments 8 pages, 7 figures, 1 table, accepted by IROS2026

English abstract

Continuum robots offer strong potential for manipulation tasks due to their high degrees of freedom, compliant structures, and operational safety. However, their adoption in both research and practical applications has been hindered by reproducibility issues arising from complex fabrication and assembly processes, challenging kinematic modeling, and a lack of intuitive control interfaces. To address these challenges, we present a novel open-source continuum robot design. The platform features a simplified fabrication pipeline enabled by multi-material 3D printing, allowing the arm to be fabricated as a monolithic compliant structure with minimal assembly. Control is achieved through an isomorphic teleoperation interface that establishes a direct actuator-level mapping, eliminating the need for explicit kinematic modeling and providing a singularity-free mapping. Building on this hardware design, the platform further supports imitation-learning-based autonomous control. The proposed system is evaluated through hardware characterization and a set of manipulation tasks. Experimental results demonstrate that the platform provides a reproducible, learning-ready continuum robot system, accelerating algorithmic development and systematic benchmarking for the continuum robotics community.
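
A direct actuator-level mapping means each leader reading drives the matching follower actuator with no kinematic model in between. The sketch below illustrates that idea only: the per-channel gains, the actuator limits, and the clamping step are assumptions added here, not details taken from the CoLI design.

```python
from typing import List, Tuple


def map_leader_to_follower(leader_readings: List[float],
                           gains: List[float],
                           limits: List[Tuple[float, float]]) -> List[float]:
    """Map each leader channel to its follower actuator, one to one."""
    commands = []
    for reading, gain, (low, high) in zip(leader_readings, gains, limits):
        # Scale the reading, then clamp it to the actuator's allowed range.
        commands.append(min(high, max(low, gain * reading)))
    return commands


# Three hypothetical channels sharing the same limits.
commands = map_leader_to_follower(
    leader_readings=[0.2, -1.0, 0.5],
    gains=[1.0, 1.0, 2.0],
    limits=[(-0.5, 0.5)] * 3,
)
```

Since every channel is handled independently, no Jacobian is inverted, which is one way to read the abstract's claim that the mapping is singularity-free.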

2606.20388 2026-06-19 cs.HC cs.AI cs.DB New submission

DataMagic: Transforming Tabular Data into Data Insight Video

Yupeng Xie, Chen Ma, Zhenyang Wang, Liangwei Wang, Jiayi Zhu, Chuxuan Zeng, Zhouan Shen, Boyan Li, Yuyu Luo

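AI summary Presents DataMagic, a system that uses the declarative specification DVSpec and a multi-agent architecture to transform raw tabular data and natural language queries into narrative data-insight videos, with support for interactive exploration.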

Comments 5 pages, 3 figures, accepted at VLDB 2026

English abstract

Data videos integrate dynamic charts, voice narration, and synchronized animations to communicate data insights as temporal narratives, making them an effective medium for improving data consumption efficiency in the data management lifecycle. However, producing high-quality data videos requires expertise spanning data analysis, narrative design, and video production. Existing approaches fall short: static visualization tools (e.g., BI dashboards) lack narrative logic and animation; authoring tools require users to pre-prepare visualizations rather than working from raw data; pixel-level video generation models cannot guarantee data fidelity or provenance. We demonstrate DataMagic, an end-to-end interactive system that transforms raw tabular data and natural language queries into narrative data-insight videos. To ensure data fidelity, DataMagic introduces the declarative specification DVSpec, which binds visual and animation elements to underlying data fields through data-driven semantic references. To address the combinatorial explosion of the design space, DataMagic adopts a Generate-then-Orchestrate multi-agent architecture that generates candidate scenes in parallel and then optimizes narrative coherence through global orchestration. Leveraging DVSpec's decoupling of logic and rendering, the system further supports three interaction modes and structured provenance-based data Q&A, transforming one-way videos into explorable interactive data interfaces. Evaluation on 109 real-world samples validates the effectiveness of DataMagic. Homepage: https://datamagic-home.github.io/
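
Binding visual and animation elements to data fields by reference, rather than by pixel values, makes fidelity checkable: every reference either resolves to a column or it does not. The sketch below uses an invented spec layout (the `scenes`, `field`, and `animation` keys are hypothetical and are not the DVSpec schema) to show that kind of check.

```python
from typing import Any, Dict, List

# Hypothetical table and spec; neither follows the real DVSpec format.
table: Dict[str, List[float]] = {
    "year": [2021, 2022, 2023],
    "revenue": [1.2, 1.8, 2.4],
}

spec: Dict[str, Any] = {
    "scenes": [
        {
            "chart": "line",
            "x": {"field": "year"},
            "y": {"field": "revenue"},
            "animation": {
                "type": "highlight",
                "target": {"field": "revenue", "agg": "max"},
            },
        }
    ]
}


def referenced_fields(node: Any) -> List[str]:
    """Recursively collect every {"field": name} reference in the spec."""
    found: List[str] = []
    if isinstance(node, dict):
        if "field" in node:
            found.append(node["field"])
        for value in node.values():
            found.extend(referenced_fields(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(referenced_fields(item))
    return found


def unresolved(spec_node: Any, data: Dict[str, List[float]]) -> List[str]:
    """Field references that do not match any column of the table."""
    return sorted(set(referenced_fields(spec_node)) - set(data))


missing = unresolved(spec, table)
```

An empty result means every chart and animation element points at a real column, a guarantee that a pixel-level video generator cannot offer.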

2606.20382 2026-06-19 cs.LG New submission

Towards Modality-imbalanced Federated Graph Learning: A Data Synthesis-based Approach

Zhengyu Wu, Hongchao Qin, Xunkai Li, Zekai Chen, Rong-Hua Li, Guoren Wang

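AI summary To address client-level and node-level modality imbalance in federated graph learning, proposes FedMGS, an implicit graph-aware latent semantic representation synthesis paradigm that recovers missing modal semantics through an availability-aware graph encoder, a prototype-guided semantic synthesizer, and a reliability-calibrated fusion mechanism, with gains of up to 17.41% across four tasks.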

English abstract

Multimodal Federated Graph Learning (MM-FGL) offers a natural collaborative training paradigm, but its practical deployment is challenged by two granularities of modality imbalance. Client-level imbalance occurs when certain clients lack entire modalities, while node-level imbalance occurs when individual nodes exhibit missing visual or textual attributes. While several relevant studies exist, our investigation reveals that they predominantly target graph-agnostic or centralized scenarios, rendering them difficult to adapt directly. To address these challenges, we formalize modality-imbalanced MM-FGL as an implicit graph-aware latent semantic representation synthesis problem. This paradigm recovers missing modal semantics directly within the representation space, thereby maximizing alignment with the original data's semantic distribution and mitigating the high variance induced by missing modalities. To this end, we propose FedMGS (Federated Modality-aware Graph Synthesis), which integrates three core components. The availability-aware graph encoder prevents missing modalities from contaminating local structural propagation. The prototype-guided latent semantic synthesizer establishes cross-client semantic anchors for unavailable modalities. The reliability-calibrated semantic fusion mechanism regulates the impact of recovered latent representations prior to predictive readout. Extensive experiments on four tasks show that FedMGS consistently outperforms competitive baselines, with gains of up to 17.41% and the best efficiency-performance trade-off.

2606.20381 2026-06-19 cs.AI New submission

Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe

Qian Zhao, Kunlong Chen, Changxin Tian, Zhonghui Jiang, Haitao Zhang, Chaofan Yu, Peijie Jiang, Mingliang Gong, Jia Liu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou

Affiliations Ling Team, Ant Group

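AI summary Finds that the E2M1 format suffers from shrinkage bias caused by geometric asymmetry, and that this bias is amplified by the Random Hadamard Transform, causing training instability; proposes uniform E1M2/INT4 grids and the UFP4 training recipe, which achieves lower loss across several models.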

Comments 18 pages, 12 figures

English abstract

FP4 training promises substantial reductions in memory and computation cost for LLM pretraining, yet current FP4 hardware paths and recipes, including NVIDIA Blackwell/Rubin-class systems and AMD MI350-series GPUs, remain centered on E2M1 data elements. In this study, we identify a fundamental limitation of that choice: non-uniform formats such as E2M1 inherently suffer from Shrinkage Bias, a systematic negative rounding error caused by the geometric asymmetry of their representable bins. We show that this bias accumulates multiplicatively across layers and is amplified by the Random Hadamard Transform (RHT), providing a unified explanation for the training instability observed in existing E2M1-based FP4 recipes. In contrast, uniform grids (E1M2/INT4) bypass this grid-geometry error and better convert the improved bucket utilization from RHT into higher quantization quality. Based on this finding, we propose UFP4, a uniform 4-bit training recipe that applies RHT to all three training GEMMs while restricting stochastic rounding to dY alone. On Dense 1.5B, MoE 7.9B, and MoE 124B long-run pretraining, UFP4 consistently achieves lower BF16-relative loss degradation than strong E2M1-based baselines, supported by scaling-law analysis and ablation studies. Our results suggest that future accelerators should support E1M2/INT4-style uniform 4-bit grids as first-class training primitives alongside E2M1.
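
One way to see the "geometric asymmetry of representable bins" is to compare the round-to-nearest cell around each grid point on the E2M1 magnitude grid (0, 0.5, 1, 1.5, 2, 3, 4, 6) with the cell on a uniform 8-level grid. The sketch below is an interpretation of the abstract, not the paper's analysis: the asymmetry measure is defined here for illustration, and the uniform grid is a stand-in for E1M2/INT4 magnitudes.

```python
from typing import List

# Representable magnitudes of FP4 E2M1, and a uniform 8-level stand-in.
E2M1: List[float] = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
UNIFORM: List[float] = [float(i) for i in range(8)]


def cell_asymmetry(grid: List[float]) -> List[float]:
    """For each interior grid point, upper minus lower half-width of its cell.

    Under round-to-nearest, the cell of grid[k] spans from the midpoint with
    grid[k-1] to the midpoint with grid[k+1]. A positive value means the cell
    reaches further above the point than below it.
    """
    result = []
    for k in range(1, len(grid) - 1):
        lower = (grid[k] - grid[k - 1]) / 2
        upper = (grid[k + 1] - grid[k]) / 2
        result.append(upper - lower)
    return result


e2m1_asym = cell_asymmetry(E2M1)
uniform_asym = cell_asymmetry(UNIFORM)
```

On E2M1 the cells around 2.0 and 4.0 extend further upward than downward, so for inputs spread evenly across such a cell the rounded value is on average smaller than the input. On the uniform grid every interior cell is symmetric. How this interacts with real weight distributions, scaling, and the top clipping bin is what the paper analyzes; the sketch only shows the grid geometry.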

2606.20376 2026-06-19 cs.LG cs.AI New submission

CRAX: Fast Safe Reinforcement Learning Benchmarking

Tristan Tomilin, Mourad Boustani, Mickey Beurskens, Thiago D. Simão

Affiliations Eindhoven University of Technology

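AI summary Proposes CRAX, a JAX-accelerated safe RL benchmark built on the MJX physics engine with speedups of up to 100x; it includes six environment suites and three agent-specific tasks, and an evaluation of six methods reveals the trade-off between performance and safety.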

English abstract

Safety is a core concern for deploying reinforcement learning (RL) agents in real-world domains such as robotics and autonomous driving. While benchmarks have been central to progress in RL, existing safety benchmarks with high-fidelity 3D physics remain computationally slow, limiting large-scale experimentation and rapid prototyping. To address this gap, we propose CRAX (Constrained RL Accelerated with JAX). Built on top of the MuJoCo XLA (MJX) physics engine with realistic 3D dynamics, CRAX leverages vectorized operations and hardware acceleration, yielding up to ~100x speedups over comparable CPU-based safety benchmarks. The benchmark features six environment suites and three agent-specific tasks, each spanning three difficulty levels. Evaluating six popular safe RL methods shows that no single approach dominates across all tasks, and reveals the trade-offs between performance and safety. We find that curriculum learning across difficulty levels and safety transfer can improve performance over direct training in harder settings.

2606.20375 2026-06-19 cs.HC cs.CY New submission

Organizing in the Digital Age: Understanding Community, Challenges, and Consequences in Digitally-facilitated Labor Organizing

Frederick Reiber, Alishah Chator, Dana Calacci, Allison McDonald

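AI summary Through 17 qualitative interviews, analyzes how labor organizers use digital platforms such as Discord, WhatsApp, and Slack, revealing challenges and opportunities around technical security, information overload, and trust building.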

Comments To appear in CSCW 2026

英文摘要

The contemporary American labor force is highly dispersed, necessitating the use of digital communication tools to bridge spatial and temporal gaps in union organizing. This study provides an in-depth analysis of how workers within various labor unions utilize digital, text-based communication platforms -- including Discord, WhatsApp, and Slack -- for labor organizing. Through 17 qualitative interviews, we examine the challenges and opportunities presented by digital organizing, identifying both technical and social obstacles. Our findings reveal that although digital tools are integral to contemporary labor successes, they also introduce new complexities, such as navigating technical security, managing information overload, and building trust and consensus. Based on these insights, we draw connections to broader understandings of digital organizing and the role of digital tools in unions.

2606.20374 2026-06-19 cs.DC 新提交

ARGUS: Production-Scale Tracing and Performance Diagnosis for over 10,000-GPU Clusters

ARGUS:面向超过10,000 GPU集群的生产级追踪与性能诊断

Jiasheng Zhou, Longbin Zeng, Clavis Chen, Ruiming Lu, Qinwei Yang, Leyi Ye, Ray Ying, Key Zhang

AI总结 提出低开销、细粒度的始终在线追踪与实时分析系统ARGUS,通过分解训练调用层次、统一数据管道和渐进式诊断框架,在超过10,000 GPU集群上实现<2%开销的持续故障检测与性能优化。

详情
AI中文摘要

大规模LLM训练需要始终在线、细粒度的可观测性,才能在大规模下进行有效的性能诊断。仅靠粗粒度的资源监控器无法定位根本原因,而细粒度的分析器会产生高昂(5%-30%)的开销和海量追踪数据,使得在大型生产集群中始终在线部署不切实际。我们提出ARGUS,一个面向10,000+ GPU规模生产集群中训练工作负载的低开销、细粒度、始终在线的追踪与实时分析系统。ARGUS将沿训练调用层次的观测分解为CPU调用栈、框架语义和GPU内核执行,始终在线收集的总开销低于2%。它构建统一数据管道,将原始内核事件压缩约3,700倍,从每个rank每步10 MB降至2.7 KB。其渐进式诊断框架通过迭代时间、阶段级和内核级分析自动隔离异常窗口、落后rank和性能下降的内核。在超过10,000 GPU的生产集群上部署超过六个月,ARGUS持续支持慢故障(fail-slow)检测和性能优化。我们的案例研究进一步展示了其在代表性异常中的有效性,包括计算落后、链路退化、流水线气泡放大、FlashAttention JIT停滞以及被通信症状掩盖的计算落后。

英文摘要

Large-scale LLM training requires always-on, fine-grained observability for effective performance diagnosis at scale. Coarse resource monitors alone cannot localize root causes, and fine-grained profilers incur prohibitive (5%-30%) overheads and massive trace volumes, making always-on deployment impractical in large production clusters. We propose ARGUS, a low-overhead, fine-grained, always-on tracing and real-time analysis system for training workloads in 10,000+ GPU-scale production clusters. ARGUS decomposes observation along the training call hierarchy into CPU call stacks, framework semantics, and GPU kernel execution, with always-on collection under a combined overhead of less than 2%. It builds a unified data pipeline and compresses raw kernel events by approximately 3,700x from 10 MB to 2.7 KB per rank per step. Its progressive diagnosis framework automatically isolates anomalous windows, straggler ranks, and degraded kernels through iteration-time, phase-level, and kernel-level analysis. Deployed for over six months on a 10,000+ GPU production cluster, ARGUS has supported continuous fail-slow detection and performance optimization. Our case studies further demonstrate its effectiveness across representative anomalies, including compute stragglers, link degradation, pipeline-bubble amplification, FlashAttention JIT stalls, and compute stragglers masked by communication symptoms.
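
The compression figures quoted in the abstract can be checked with simple arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 KB = 10^3 bytes) and, purely for illustration, one rank per GPU on a hypothetical 10,000-rank job; neither assumption is stated in the abstract.

```python
# Back-of-the-envelope check of the per-rank, per-step trace volumes from the abstract.
# Assumptions (not from the paper): decimal units and one rank per GPU.
RAW_BYTES_PER_RANK_STEP = 10 * 10**6          # 10 MB of raw kernel events
COMPRESSED_BYTES_PER_RANK_STEP = 2.7 * 10**3  # 2.7 KB after the data pipeline
NUM_RANKS = 10_000                            # hypothetical job size


def compression_ratio(raw_bytes: float, compressed_bytes: float) -> float:
    """Ratio of raw to compressed size."""
    return raw_bytes / compressed_bytes


ratio = compression_ratio(RAW_BYTES_PER_RANK_STEP, COMPRESSED_BYTES_PER_RANK_STEP)
raw_per_step_gb = RAW_BYTES_PER_RANK_STEP * NUM_RANKS / 10**9
compressed_per_step_mb = COMPRESSED_BYTES_PER_RANK_STEP * NUM_RANKS / 10**6

print(f"compression ratio: {ratio:.0f}x")
print(f"raw trace volume per step: {raw_per_step_gb:.0f} GB")
print(f"compressed trace volume per step: {compressed_per_step_mb:.0f} MB")
```

Under these assumptions the ratio lands close to the quoted "approximately 3,700x", and the cluster-wide volume per step drops from roughly 100 GB to a few tens of MB, which is the scale at which always-on collection becomes plausible.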

2606.20373 2026-06-19 cs.SE cs.AI 新提交

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

AutoPass:基于证据的LLM智能体用于编译器性能调优

Zepeng Li, Jie Ren, Zhanyong Tang, Jie Zheng, Zheng Wang

AI总结 提出AutoPass多智能体框架,通过查询编译器内部状态和中间表示,利用运行时反馈迭代优化编译选项,无需训练即可提升性能,在x86-64和ARM64上分别实现1.043倍和1.117倍加速。

详情
AI中文摘要

大型语言模型(LLM)在代码编译任务中展现出潜力,但由于复杂的微架构效应和噪声运行时测量,将其应用于运行时性能调优较为困难。我们提出AutoPass,一个用于编译器性能调优的多智能体框架,它利用编译器和运行时证据来指导LLM生成的优化决策。与先前的自动调优方案将编译器视为黑盒不同,AutoPass向LLM开放编译器,使其能够查询编译器内部的优化状态并分析中间表示以编排编译器选项。搜索过程利用测量的运行时反馈迭代地优化配置,以诊断性能回退并指导延迟改进的编辑。AutoPass在仅推理、无需训练的环境下运行,无需离线训练或任务特定的微调,因此可轻松应用于新的基准测试和平台。我们在LLVM编译器上实现AutoPass,并在服务器级x86-64和嵌入式ARM64系统上进行评估。AutoPass优于专家调优的启发式方法和经典自动调优方法,在x86-64和ARM64上相对于LLVM -O3分别实现了1.043倍和1.117倍的几何平均加速。

英文摘要

Large Language Models (LLMs) show promise for code compilation tasks, but applying them to runtime performance tuning is difficult due to complex microarchitectural effects and noisy runtime measurements. We present AutoPass, a multi-agent framework for compiler performance tuning that uses compiler and runtime evidence to guide LLM-generated optimization decisions. Rather than treating the compiler as a black box like prior auto-tuning schemes, AutoPass opens up the compiler to the LLM, enabling it to query compiler-internal optimization states and analyze the intermediate representation to orchestrate compiler options. The search process iteratively refines optimization configurations using measured runtime feedback to diagnose regressions and guide latency-improving edits. AutoPass operates in an inference-only, training-free setting and requires no offline training or task-specific fine-tuning, making it readily applicable to new benchmarks and platforms. We implement AutoPass on the LLVM compiler and evaluate it on server-grade x86-64 and embedded ARM64 systems. AutoPass outperforms expert-tuned heuristics and classical autotuning methods, achieving geometric-mean speedups of 1.043x and 1.117x over LLVM -O3 on x86-64 and ARM64, respectively.
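
The headline numbers are geometric-mean speedups over LLVM -O3. As a minimal sketch of how such a figure is typically computed, the snippet below averages per-benchmark speedups in log space. The runtimes are hypothetical and the function name is ours, not part of AutoPass.

```python
import math
from typing import Sequence


def geometric_mean_speedup(baseline_times: Sequence[float],
                           tuned_times: Sequence[float]) -> float:
    """Geometric mean of per-benchmark speedups (baseline time / tuned time)."""
    if len(baseline_times) != len(tuned_times) or not baseline_times:
        raise ValueError("need two non-empty runtime lists of equal length")
    log_sum = sum(math.log(b / t) for b, t in zip(baseline_times, tuned_times))
    return math.exp(log_sum / len(baseline_times))


# Hypothetical runtimes in seconds; not measurements from the paper.
o3_times = [2.0, 4.0, 1.0]
tuned_times = [1.0, 4.0, 2.0]  # one 2x win, one tie, one 2x regression

print(geometric_mean_speedup(o3_times, tuned_times))
```

In this toy case the 2x win and the 2x regression cancel, so the geometric mean is close to 1.0. This is why a geometric mean above 1 over a whole benchmark suite is a stricter claim than a few large individual wins.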

2606.20369 2026-06-19 cs.CL 新提交

CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges

CATCH-ME if you RAG:针对仇恨与虚假信息交流的上下文注释多轮对抗言论数据集

Helena Bonaldi, Genoveffa Martone, Marco Guerini

发表机构 * Fondazione Bruno Kessler(布鲁诺·凯斯勒基金会) Università Cattolica del Sacro Cuore(圣心天主教大学)

AI总结 提出首个大规模、专家策划的多语言对话数据集,覆盖仇恨与虚假信息重叠问题,包含事实核查锚定和跨度标注,支持RAG系统训练更可信的对抗言论模型。

详情
AI中文摘要

在线仇恨言论和虚假信息经常重叠,但NLP研究主要将它们孤立处理。虽然LLMs代表了协助人类针对这两种威胁生成对抗言论的可扩展解决方案,但零样本模型经常生成重复和模糊的回应,凸显了需要高质量示例来指导模型生成。然而,现有的针对仇恨和虚假信息重叠的对抗言论数据集很少,且仅限于单轮英语对话,而现实中的交互跨越多个轮次和语言。为弥补这一差距,我们引入了第一个大规模、专家策划的多语言对话数据集,处理仇恨与虚假信息的交叉点。为确保事实基础,对话还锚定在已验证的外部知识(即事实核查文章和非政府组织报告)中,并包含文档级和块级跨度标注,使其可直接应用于RAG系统。该新资源涵盖五种语言,针对七个边缘化群体的仇恨,能够训练和评估更具说服力、基于事实的对抗言论模型。

英文摘要

Online hate speech and misinformation frequently overlap, yet NLP research has mainly treated them in isolation. While LLMs represent a scalable solution for assisting humans in the generation of counterspeech for both threats, zero-shot models frequently generate repetitive and vague responses, underscoring the need for high-quality examples to steer model generation. However, existing counterspeech datasets against the overlap of hate and misinformation are scarce and limited to single-turn English dialogues, while real-life interactions span multiple turns and languages. To bridge this gap, we introduce the first large-scale, expert-curated, multilingual dataset of dialogues tackling the intersection of hate and misinformation. To ensure factual grounding, the dialogues are also anchored in verified external knowledge (i.e., fact-checking articles and NGO reports) and include document- and chunk-level span annotations, making the dataset directly applicable to RAG systems. Covering five languages and targeting hate directed at seven marginalized groups, this novel resource enables the training and evaluation of more persuasive, factually grounded counterspeech models.

2606.20365 2026-06-19 cs.RO cs.MA 新提交

An Infrastructure-less, Control-Independent Solution to Relative Localisation of a Team of Mobile Robots using Ranging Measurements

基于测距的移动机器人团队相对定位的无基础设施、控制无关解决方案

Paolo Golinelli, Tommaso Faraci, Daniele Fontanelli

发表机构 * Department of Industrial Engineering, University of Trento(特伦托大学工业工程系) Department of Information Engineering and Computer Science, University of Trento(特伦托大学信息工程与计算机科学系)

AI总结 提出一种无锚点、完全去中心化的协作定位算法,仅依赖局部里程计、稀疏测距和短程通信,无需控制机器人运动即可实现团队可观测性,采用多假设贝叶斯框架保证鲁棒性。

详情
AI中文摘要

定位机器人团队的能力对于从非结构化环境中的机器人舰队到协作控制和导航任务等应用至关重要。在此类场景中,固定基础设施通常不可用,部署必须快速灵活,系统要求必须最小化。我们提出了一种去中心化协作定位算法,同时解决了所有这些挑战。该方法无锚点、完全去中心化,并且与大多数现有方法不同,不需要控制机器人运动来确保团队可观测性。它仅依赖局部里程计、稀疏的代理间测距测量和短程通信,这些在实践中广泛可用。该算法采用多假设贝叶斯框架,维护所有可行解集,确保在瞬态不可观测条件下的鲁棒性。此外,通过信息共享,每个代理都能受益于整个群体的估计,即使在部分连接条件下也是如此。

英文摘要

The ability to localise teams of robots is essential for applications ranging from robotic fleets in unstructured environments to cooperative control and navigation tasks. In such contexts, fixed infrastructure is often unavailable, deployments must be fast and flexible, and system requirements must be minimal. We present a decentralised cooperative localisation algorithm that addresses all these challenges at once. The method is anchor-less, fully decentralised, and, unlike most existing approaches, does not require controlling the robots' motion to ensure team observability. It relies only on local odometry, sparse inter-agent ranging measurements, and short-range communication, all of which are widely available in practice. The algorithm adopts a multi-hypothesis Bayesian framework that maintains the entire set of feasible solutions, ensuring robustness under transient unobservable conditions. Moreover, through information sharing, each agent benefits from the estimates of the entire group, even in partially connected conditions.

2606.20364 2026-06-19 cs.LG 新提交

Judging to Improve: A De-biased VLM-as-3D-Judge Protocol for Single-Image 3D Generation

评判以改进:一种去偏的 VLM-as-3D-Judge 协议用于单图像 3D 生成

Ali Asaria, Tony Salomone, Deep Gandhi

发表机构 * Transformer Lab

AI总结 本文提出一种去偏的跨模型 VLM-as-3D-Judge 协议,将评判者从排序扩展到优化,通过训练与评估评判者分离、位置偏差校正及修复三种失效模式,实现轻量级适应下与强基线的匹配。

详情
AI中文摘要

一项伴随研究建立了一个去偏的、跨模型的 VLM-as-3D-Judge,能够可靠地对单图像到 3D 网格质量进行排序,而廉价的几何和 CLIP 代理在此方面表现不足。本文提出:该评判者的偏好能否专门化一个强大的开放生成器 TRELLIS,针对单一资产类别(家具),且无需人工标注?将评判者从排序扩展到优化是本文的工作所在。将 VLM 评判者推入训练和评估循环会暴露排序从未触发的失效模式,因此我们的贡献是对评判者进行优化级别的强化:一个训练评判者(Qwen2.5-VL-7B)与一个评估评判者(InternVL3-8B)保持分离以打破循环性;位置偏差校正;以及针对三种失效模式(图像过载、隐藏几何的溅射渲染、以及奖励干净但错误输出的无参考评判)的修复,并附有校准证据(清晰差距胜率 0.83-1.0;基线间约 0.5)。使用此协议作为独立评估者,仅从公开模型和数据出发,采用轻量级参数高效适应,我们发现我们的方法匹配了强基线而非超越它。独立基线样本几乎不携带可学习的偏好(0.94 顺序翻转率),因此信号必须通过质量对比构造来设计。在六种适应方法、两种输入模式和严重程度扫描中,最具针对性的方法——严重退化下的条件器修复——达到了与基线持平(0.50),而没有方法达到 >=65% 的胜率目标。结果是机制性的:干净输入使评判者饱和,流式 DIT 微调通过采样器被冲刷,而条件器修复是改变几何的位点。胜率在 n=8 个对象时具有方向性。匹配一个强大的公开数据基线本身具有信息量:超越它需要比公开数据上的轻量级 PEFT 更多,而评判者协议是可复用的。

英文摘要

A companion study established a de-biased, cross-model VLM-as-3D-judge that reliably ranks single-image-to-3D mesh quality where cheap geometry and CLIP proxies fall short. This paper asks: can that judge's preferences specialize a strong open generator, TRELLIS, on one asset class (furniture), cheaply and without human labels? Taking the judge from ranking to optimization is where the work lives. Pushing a VLM judge into the training and evaluation loop exposes failure modes ranking never triggered, so our contribution is an optimization-grade hardening of the judge: a training judge (Qwen2.5-VL-7B) held distinct from an evaluation judge (InternVL3-8B) to break circularity; position-bias correction; and fixes for three failure modes (image overload, geometry-hiding splat renders, and reference-free judging that rewards clean-but-wrong outputs), with calibration evidence (clear-gap win-rate 0.83-1.0; base-vs-base ~0.5). Using this protocol as an independent evaluator, and working only from public models and data with lightweight parameter-efficient adaptation, we find our methods match the strong base rather than exceed it. Independent base samples carry essentially no learnable preference (0.94 order-flip rate), so signal must be engineered by quality-contrastive construction. Across six adaptation methods, two input regimes, and a severity sweep, the most targeted - conditioner repair under severe degradation - reaches parity (0.50) with the base, while no method clears the >=65% win-rate target. The result is mechanistic: clean inputs saturate the judge, flow-DIT fine-tuning washes out through the sampler, and conditioning repair is the locus that moves geometry. Win-rates are directional at n=8 objects. Matching a strong public-data base with cheap adaptation is itself informative: exceeding it needs more than lightweight PEFT on public data, and the judge protocol is reusable.
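
The abstract mentions position-bias correction and an order-flip rate without describing how they are computed. One common way to handle position bias in pairwise judging is to query both presentation orders and accept a verdict only when the two agree. The sketch below illustrates that idea with toy stand-in judges; the function names and the "flip" label are our assumptions, not the paper's protocol.

```python
from typing import Callable, Sequence, Tuple

# A judge sees (first, second) and returns 0 if it prefers the first item shown, else 1.
Judge = Callable[[str, str], int]


def debiased_preference(judge: Judge, a: str, b: str) -> str:
    """Query both presentation orders; accept a winner only if the verdicts agree."""
    a_wins_when_first = judge(a, b) == 0
    a_wins_when_second = judge(b, a) == 1
    if a_wins_when_first and a_wins_when_second:
        return "a"
    if not a_wins_when_first and not a_wins_when_second:
        return "b"
    return "flip"  # the verdict follows position, so it carries no preference


def order_flip_rate(judge: Judge, pairs: Sequence[Tuple[str, str]]) -> float:
    """Fraction of pairs whose verdict changes when the order is swapped."""
    flips = sum(debiased_preference(judge, a, b) == "flip" for a, b in pairs)
    return flips / len(pairs)


# Toy judges standing in for a VLM call (hypothetical).
always_first: Judge = lambda first, second: 0
prefers_longer: Judge = lambda first, second: 0 if len(first) >= len(second) else 1

pairs = [("mesh_a", "mesh_bb"), ("mesh_ccc", "mesh_d")]
print(order_flip_rate(always_first, pairs))    # judge driven purely by position
print(order_flip_rate(prefers_longer, pairs))  # judge driven by content
```

A judge that always picks the first item flips on every pair, while a content-driven judge never does. Read this way, an order-flip rate near 1 indicates that a set of pairs carries almost no usable preference signal.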

2606.20363 2026-06-19 cs.AI 新提交

Automating SKILL.md Generation for Computer-Using Agents via Interaction Trajectory Mining

为计算机使用智能体自动生成SKILL.md:基于交互轨迹挖掘

Yuexing Hao, Xiaomin Li

发表机构 * Massachusetts Institute of Technology(麻省理工学院) Harvard University(哈佛大学)

AI总结 提出三阶段流水线从GUI轨迹中挖掘可读技能库,但发现可读性不保证下游策略提升,GRPO仅带来微小改进,揭示当前方法的局限性。

详情
AI中文摘要

显式技能库使计算机使用智能体更易于检查,但尚不清楚是否可以从交互数据中挖掘此类库以改进下游策略。我们通过一个三阶段流水线研究这个问题:分割GUI轨迹,将片段聚类为候选技能,并从生成的注释中训练技能感知策略。挖掘的聚类在源基准上是可读的:八个聚类中有五个对InteraSkill Workflows标签的纯度至少为0.95。然而,可读性并不意味着可迁移。GRPO仅将IW技能步骤准确率从18.5%提高到20.5%,使BrowseComp+基本不变,并在关键源域指标上低于简单的频率先验。因此,我们将该方法作为诊断性研究呈现:轨迹挖掘可以暴露可检查的技能结构,但当前的边界检测器、无序片段表示和离线奖励模型不足以实现可靠的跨域策略改进。

英文摘要

Explicit skill libraries make computer-using agents easier to inspect, but it remains unclear whether such libraries can be mined from interaction data in a way that improves downstream policies. We study this question through a three-stage pipeline that segments GUI trajectories, clusters segments into candidate skills, and trains a skill-aware policy from the resulting annotations. The mined clusters are readable on the source benchmark: five of eight clusters have at least 0.95 purity against InteraSkill Workflows labels. However, readability does not imply transfer. GRPO improves IW skill-step accuracy only from 18.5% to 20.5%, leaves BrowseComp+ essentially unchanged, and underperforms trivial frequency priors on key source-domain metrics. We therefore present the method as a diagnostic study: trajectory mining can expose inspectable skill structure, but the current boundary detector, orderless segment representation, and offline reward model are insufficient for reliable cross-domain policy improvement.
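
The readability claim rests on cluster purity. Assuming the usual definition (the share of a cluster's members that carry its most common reference label), a minimal sketch looks as follows. The cluster contents and label names are made up for illustration and are not data from the paper.

```python
from collections import Counter
from typing import Dict, List


def cluster_purity(labels: List[str]) -> float:
    """Share of a cluster's members that carry its most common label."""
    if not labels:
        raise ValueError("cluster has no members")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical clusters of segment labels; not data from the paper.
clusters: Dict[str, List[str]] = {
    "c0": ["login"] * 19 + ["search"],
    "c1": ["search"] * 3 + ["export"] * 2,
}

purities = {name: cluster_purity(members) for name, members in clusters.items()}
readable = [name for name, purity in purities.items() if purity >= 0.95]
print(purities)
print(readable)
```

Purity only measures agreement with the reference labels on the source benchmark. It says nothing about whether the clusters help a downstream policy, which is the gap the abstract highlights.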

2606.20361 2026-06-19 eess.SY cs.SY 新提交

Sparse add-on controller design: A Youla approach to system-level performance

稀疏附加控制器设计:一种面向系统级性能的Youla方法

M. van der Hulst, N. Dirkx, R. A. González, K. Tiels, J. van de Wijdeven, T. Oomen

AI总结 提出一种基于Youla参数化的稀疏附加控制器设计框架,通过凸优化求解稀疏H2综合问题,实现系统级性能与互联复杂度的最优权衡。

详情
AI中文摘要

高科技系统的性能通常由机器中运行的多个闭环控制子系统共享的少数性能目标决定,例如同步、协调和对齐,这需要明确处理这些目标以实现最优性能的控制方法。本文旨在引入一种框架,通过设计系统级控制器作为现有子系统控制结构的附加组件来提高系统性能。所开发的方法使用Youla框架参数化所有稳定的系统级附加控制器,从而能够凸形式化稀疏$\mathcal{H}_2$综合问题。结果是一个稀疏附加控制器,实现了组合性能与互联复杂度之间的最优权衡,如数值模拟所示。

英文摘要

The performance of high-tech systems is often dictated by a few performance objectives shared among the many closed-loop controlled subsystems operating in the machine, such as synchronization, coordination, and alignment, which necessitates control methods that explicitly address them to achieve optimal performance. The aim of this paper is to introduce a framework that improves system performance through system-level controllers designed to be implemented as add-ons to the existing subsystem control structure. The developed method parametrizes all stabilizing system-level add-on controllers using the Youla framework, enabling a convex formulation of the sparse $\mathcal{H}_2$ synthesis problem. The result is a sparse add-on controller that achieves the optimal trade-off between combined performance and interconnection complexity, as demonstrated through numerical simulations.

2606.20359 2026-06-19 cs.LG 新提交

Train, Retrieve, or Both? A Four-Arm Head-to-Head for Correct Statutory Citation on the Ontario Residential Tenancies Act

训练、检索,还是两者兼用?针对安大略省住宅租赁法的正确法定引用的四组头对头比较

Ali Asaria, Tony Salomone, Deep Gandhi

发表机构 * Transformer Lab

AI总结 研究自诉租户、房东和帮助台工作人员如何获得正确的法定引用,通过四组实验比较微调、检索及混合方法,发现SFT+RAG混合模型在精确匹配上得分最高且无幻觉引用。

详情
AI中文摘要

自诉租户、房东和帮助台工作人员需要被指向实际管辖问题的法律条款,并附有正确的法定引用。我们在2006年安大略省住宅租赁法(RTA)及其核心法规上研究此任务,从操作者的角度实证提问:微调是否足够,还是需要混合检索?我们在Qwen2.5-7B-Instruct上运行四组头对头比较(基础零样本、仅LoRA SFT、仅RAG、以及SFT+RAG混合),在一个小型、待人工验证的真实评估集上,以引用的精确匹配(节+小节)评分。基础模型无法引用RTA,仅SFT会错误回忆章节;检索至关重要,并通过构造将幻觉降至零;而SFT+RAG混合模型得分最高,精确匹配为0.481,且无幻觉引用。其优势在于SFT使得条款选择对高召回候选集(损害零样本RAG)更加鲁棒。值得注意的是,这种廉价的bge-small混合模型匹配或超越了基于更大、专门检索模型(更大的嵌入器和交叉编码器重排序器)的管道,更大/改进的训练集也无帮助:在此任务中,强法定引用性能不需要专门的检索模型或更多数据。该工件将幻觉归零并超过了基准提升线,但未达到期望的0.70精确匹配目标。所有结果均基于小型、待人工验证的真实评估集,并作为初步结果报告。

英文摘要

Self-represented tenants, landlords, and help-desk staff need to be pointed at the provision of law that actually governs a question, with a correct statutory citation. We study this task on the Ontario Residential Tenancies Act, 2006 (RTA) and its core regulation, asking the operator's question empirically: is fine-tuning enough, or is hybrid retrieval needed? We run a four-arm head-to-head on Qwen2.5-7B-Instruct (base zero-shot, LoRA SFT-only, RAG-only, and an SFT+RAG hybrid), scored on citation exact-match (section+subsection) over a small, human-verification-pending real eval set. The base model cannot cite the RTA and SFT-only mis-recalls sections; retrieval is essential and drives hallucination to zero by construction; and the SFT+RAG hybrid scores highest at 0.481 exact-match with zero hallucinated citations. Its edge comes from SFT making provision selection more robust to the higher-recall candidate sets that hurt zero-shot RAG. Notably, this cheap bge-small hybrid matches or beats a pipeline built on bigger, specialized retrieval models (a larger embedder and a cross-encoder reranker), and a larger/improved training set does not help either: strong statutory-citation performance here does not require specialized retrieval models or more data. The artifact zeroes hallucination and clears the lift-over-base bar but does not reach the aspirational 0.70 exact-match target. All results are on a small, human-verification-pending real eval set and are reported as preliminary.
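
The evaluation scores citations by exact match on section plus subsection and counts hallucinated citations. A minimal sketch of such a scorer is given below. The citation format ("s. 83(1)"), the regular expression, the set of known sections, and the rule that a citation counts as hallucinated when its section is not in that set are all assumptions made for illustration; the abstract does not specify them.

```python
import re
from typing import List, Optional, Set, Tuple

# Hypothetical citation format such as "s. 83(1)"; the subsection is optional.
CITATION_RE = re.compile(r"\bs\.\s*(\d+(?:\.\d+)?)\s*(?:\((\d+(?:\.\d+)?)\))?")

Citation = Tuple[str, Optional[str]]  # (section, subsection)


def parse_citation(text: str) -> Optional[Citation]:
    """Extract the first (section, subsection) pair from free text, if any."""
    match = CITATION_RE.search(text)
    return (match.group(1), match.group(2)) if match else None


def exact_match(prediction: str, gold: str) -> bool:
    """True only when both section and subsection agree."""
    predicted = parse_citation(prediction)
    return predicted is not None and predicted == parse_citation(gold)


def is_hallucinated(prediction: str, known_sections: Set[str]) -> bool:
    """True when a citation is present but its section is not in the known set."""
    predicted = parse_citation(prediction)
    return predicted is not None and predicted[0] not in known_sections


# Illustrative inputs only; not items from the paper's eval set.
predictions: List[str] = ["See s. 83(1) of the RTA.", "s. 20(1)", "s. 999(2)"]
golds: List[str] = ["s. 83(1)", "s. 20(2)", "s. 37(1)"]
known_sections: Set[str] = {"20", "37", "83"}

matches = [exact_match(p, g) for p, g in zip(predictions, golds)]
hallucinated = [is_hallucinated(p, known_sections) for p in predictions]
print(sum(matches) / len(matches), sum(hallucinated))
```

The second prediction shows why the metric is strict: the section is right but the subsection is wrong, so it scores zero on exact match without being a hallucination.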

2606.20357 2026-06-19 cs.LG 新提交

On the Variance of Temporal Difference Learning and its Reduction Using Control Variates

时序差分学习的方差及其通过控制变量的降低

Hsiao-Ru Pan, Bernhard Schölkopf

AI总结 本文分析表格表示下相位设置中时序差分学习的方差,证明其方差降低机制是通过有效聚合更多独立轨迹,并比较了TD、MC和DAE的方差界限。

Comments Accepted at RLC2026

详情
AI中文摘要

我们使用表格表示的相位设置分析了时序差分(TD)学习的方差,并表明其降低方差的能力背后的机制之一是通过有效聚合大量独立轨迹。基于这一见解,我们证明(1)TD的方差渐近地被蒙特卡洛(MC)估计器的方差从上方界定,以及(2)对于固定数量的样本,较短的水平更新会导致较小的方差。除了TD,我们还展示了直接优势估计(DAE),一种估计优势函数的方法,可以被视为一种回归调整的控制变量,在大样本极限下实现了比TD更紧的方差界限。最后,我们通过精心设计的环境数值说明了这些估计器的行为。

英文摘要

We analyze the variance of temporal difference (TD) learning using the phased setting with tabular representation, and show that one of the mechanisms behind its ability to reduce variance is by effectively aggregating over a larger number of independent trajectories. Based on this insight, we demonstrate that (1) the variance of TD is asymptotically bounded from above by that of Monte Carlo (MC) estimators, and (2) shorter-horizon updates incur less variance for a fixed number of samples. Beyond TD, we show that Direct Advantage Estimation (DAE), a method for estimating the advantage function, can be seen as a type of regression-adjusted control variate, which achieves a tighter bound on the variance compared to TD in the large-sample limit. Finally, we numerically illustrate the behaviors of these estimators with carefully designed environments.
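
The abstract describes DAE as a regression-adjusted control variate. The toy simulation below illustrates only that generic idea on a scalar estimation problem: it is not TD, DAE, or any environment from the paper. The distributions, sample sizes, and function names are assumptions chosen for illustration.

```python
import random
import statistics
from typing import List, Tuple


def plain_mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def control_variate_mean(xs: List[float], ys: List[float], y_mean: float) -> float:
    """Regression-adjusted estimate of E[X] using Y, whose true mean y_mean is known."""
    x_bar, y_bar = plain_mean(xs), plain_mean(ys)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_y = sum((y - y_bar) ** 2 for y in ys)
    beta = cov_xy / var_y  # least-squares slope of X on Y
    return x_bar - beta * (y_bar - y_mean)


def simulate(num_trials: int = 500, n: int = 50, seed: int = 0) -> Tuple[float, float]:
    """Return the empirical variance of the plain and the adjusted estimator."""
    rng = random.Random(seed)
    plain, adjusted = [], []
    for _ in range(num_trials):
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]      # control variate, E[Y] = 0
        xs = [1.0 + y + rng.gauss(0.0, 0.5) for y in ys]  # target, E[X] = 1
        plain.append(plain_mean(xs))
        adjusted.append(control_variate_mean(xs, ys, y_mean=0.0))
    return statistics.variance(plain), statistics.variance(adjusted)


var_plain, var_adjusted = simulate()
print(f"variance of plain mean:    {var_plain:.4f}")
print(f"variance of adjusted mean: {var_adjusted:.4f}")
```

Because most of the spread in X comes from Y, subtracting the fitted multiple of Y's sampling error removes that part of the noise, and the adjusted estimator shows a clearly smaller variance across trials.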

2606.20351 2026-06-19 cs.LO cs.PL 新提交

A cubical formalisation of conditional independence, Bayesian conditioning, and Pearl's d-separation soundness

条件独立性、贝叶斯条件化和Pearl的d-分离正确性的立方形式化

Karen Sargsyan

AI总结 本文在Cubical Agda中形式化概率单子,将有限分布表示为高阶归纳类型,指出标准凸代数交换公理不足以支持完整的贝叶斯条件化并给出最小推广,进而验证了半图胚(semi-graphoid)公理、do-演算规则和d-分离定理的正确性。

详情
AI中文摘要

自Stone以来概率单子形式化中常见的标准凸代数交换公理被证明不足以支持完整的贝叶斯条件化。我们在Cubical Agda中精确地说明了这一点:有限分布作为高阶归纳类型,条件独立性作为核之间的立方路径,递归贝叶斯条件化作为全支撑片段上的全函数。将条件化提升到完整的HIT暴露了一个结构上的不匹配——重新排列的四叶混合的两半携带由贝叶斯公式关联的不同贝叶斯权重,而不是标准公理提供的单个共享内部权重。我们展示了解决这一问题的最小推广,并证明了标准形式是退化情况,其中两个内部权重重合。基于这一观察,我们在抽象有序域接口之上以构造性方式验证了代数上下文,无需任何公设:绑定交换性、四个半图oid公理、交(通过结构性Σ-见证简化为收缩,无需正性)、Pearl的do-演算规则1、2和3的核形式、有限类型贝叶斯条件化,以及任意n顶点有限有向无环图(DAG)上干预和贝叶斯形式的Pearl的d-分离定理(正确性)。概率单子也被验证为马尔可夫范畴;抽象接口在Q处实现。

英文摘要

The standard convex-algebra interchange axiom, common to probability-monad formalisations since Stone, is provably too weak to support full Bayesian conditioning. We make this precise in Cubical Agda: finite distributions as a higher inductive type, conditional independence as a cubical path between kernels, recursive Bayesian conditioning as a total function on a full-support fragment. Lifting conditioning to the full HIT exposes a structural mismatch -- the two halves of the rearranged 4-leaf mix carry distinct Bayesian weights related by Bayes' formula, not the single shared inner weight the standard axiom provides. We exhibit the minimal generalisation that resolves this and prove the standard form is the degenerate case where the two inner weights coincide. Around this observation we verify the algebraic context constructively, with zero postulates above an abstract ordered-field interface: bind commutativity, the four semi-graphoid axioms, intersection (reduced to contraction via structural $\Sigma$-witnesses, without positivity), Pearl's do-calculus Rules 1, 2, and 3 in kernel form, finite-type Bayesian conditioning, and Pearl's d-separation theorem (soundness) on arbitrary $n$-vertex finite directed acyclic graphs (DAGs) in both interventional and Bayesian forms. The probability monad is also verified as a Markov category; the abstract interface discharges at $Q$.

2606.20336 2026-06-19 cs.RO 新提交

Autonomous Driving with Priority-Ordered STL Specifications Under Multimodal Uncertainty

多模态不确定性下基于优先级排序STL规范的自动驾驶

Taha Bouzid, Shuhao Qi, Mircea Lazar, Sofie Haesaert

发表机构 * Eindhoven University of Technology(埃因霍温理工大学)

AI总结 提出一种不确定性感知的轨迹规划框架,通过信号时序逻辑的词典序优先级处理冲突目标,并结合模型预测路径积分控制实现,在仿真中验证了有效性。

详情
AI中文摘要

自动驾驶车辆必须规划满足安全、乘客舒适度和交通规则等多重要求的轨迹。然而,在安全关键场景中,不可能同时满足所有要求,因此需要根据重要性进行优先级排序。同时,在这些安全关键场景中,应明确考虑周围交通(如其他车辆和行人)轨迹预测的不确定性。在这项工作中,我们提出了一种不确定性感知的轨迹规划框架,该框架结合了信号时序逻辑(STL)规范上的预定义词典序,该排序在不确定性下仍然有效。我们使用模型预测路径积分(MPPI)控制实现了该公式,并在仿真场景中展示了我们方法的有效性,表明我们的框架在现实的多模态不确定性下有效处理了冲突目标。

英文摘要

Autonomous vehicles must plan trajectories that satisfy a multitude of requirements on safety, passenger comfort, and compliance with traffic rules. However, in safety-critical scenarios, it is not always possible to satisfy all requirements simultaneously, necessitating their prioritization based on importance. At the same time, in these safety-critical scenarios, the uncertainty in trajectory predictions of the surrounding traffic, such as other vehicles and pedestrians, should be explicitly accounted for. In this work, we propose an uncertainty-aware trajectory planning framework that incorporates a predefined lexicographic ordering over Signal Temporal Logic (STL) specifications that stays valid under uncertainty. We implement this formulation with Model Predictive Path Integral (MPPI) control and we demonstrate the effectiveness of our method on simulation scenarios, showing that our framework efficiently handles conflicting objectives under realistic multi-modal uncertainty.
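
To make the idea of a lexicographic ordering over specifications concrete, the sketch below ranks a finite set of candidate trajectories by which prioritized requirements they satisfy, and contrasts that with a plain sum of robustness values. The specification names, the robustness numbers, and the selection rule are hypothetical. The paper's actual formulation works inside MPPI under uncertain predictions, which this toy example does not model.

```python
from typing import Dict, Tuple

# Specifications in descending priority; the names are illustrative only.
PRIORITY_ORDER: Tuple[str, ...] = ("no_collision", "stay_in_lane", "comfort")


def lexicographic_key(robustness: Dict[str, float]) -> Tuple[bool, ...]:
    """Satisfaction flags ordered by priority (robustness > 0 means satisfied).

    Python compares tuples element by element, so a higher-priority
    requirement always dominates every lower-priority one.
    """
    return tuple(robustness[name] > 0.0 for name in PRIORITY_ORDER)


# Hypothetical STL robustness values for three candidate trajectories.
candidates: Dict[str, Dict[str, float]] = {
    "brake_hard": {"no_collision": 0.8, "stay_in_lane": 0.3, "comfort": -0.5},
    "swerve": {"no_collision": 0.5, "stay_in_lane": -0.2, "comfort": 0.1},
    "keep_speed": {"no_collision": -0.4, "stay_in_lane": 0.6, "comfort": 0.9},
}

best_lexicographic = max(candidates, key=lambda name: lexicographic_key(candidates[name]))
best_by_sum = max(candidates, key=lambda name: sum(candidates[name].values()))

print(best_lexicographic)
print(best_by_sum)
```

With these numbers the unweighted sum favors the candidate that violates the collision requirement, because its comfort and lane-keeping margins are large. The lexicographic key instead gives up comfort first, which is the behavior a priority ordering is meant to guarantee.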

2606.20333 2026-06-19 cs.AI 新提交

SoftSkill: Behavioral Compression for Contextual Adaptation

SoftSkill: 用于上下文适应的行为压缩

Xijia Tao, Yihua Teng, Xinyu Fu, Ziru Liu, Kecheng Chen, Yuzhi Zhao, Suiyun Zhang, Rui Liu, Lingpeng Kong

发表机构 * The University of Hong Kong(香港大学) Huawei Research(华为研究院) City University of Hong Kong(香港城市大学) Huazhong University of Science and Technology(华中科技大学)

AI总结 提出SoftSkill方法,通过可训练的软技能前缀压缩自然语言技能为紧凑连续向量,在冻结基模型上提升问答和数学任务性能,减少标记数量。

详情
AI中文摘要

智能体技能通常以自然语言Markdown文件形式部署,编码回答策略、证据使用习惯和任务流程。这些文件可读且可移植,但间接消耗:对于每个任务实例,冻结的语言模型必须将长文本制品转换为生成时行为。本文探讨自然语言技能是否可以初始化一个紧凑的连续上下文对象,通过可训练的软增量进行优化,同时基模型保持冻结。我们提出SoftSkill,一种冻结骨干方法,通过下一词预测调整此类软技能,并在推理时将其部署为潜在行为先验。在我们的主要单轮设置中,在Qwen3.5-4B上使用长度为32的SoftSkill前缀,相比无技能提示在SearchQA上提升8.3分,LiveMath上提升42.1分,DocVQA上提升1.3分。相对于SkillOpt,SoftSkill在SearchQA上准确率提升5.2分,LiveMath上提升12.5分,同时用少量虚拟标记替换数百到数千个Markdown技能标记。我们进一步研究了作为更难边界情况的智能体执行,其中稀疏轨迹模仿提供了有用信号,但尚未稳健地压缩长程过程行为。更广泛地说,结果表明某些任务技能更适合被视为紧凑的潜在控制,而不是在推理时重新解释的额外Markdown,用于控制冻结模型如何进入任务。

英文摘要

Agent skills are commonly deployed as natural-language Markdown files that encode answer policies, evidence-use habits, and task procedures. These files are readable and portable, but they are consumed indirectly: for each task instance, a frozen language model must translate a long textual artifact into generation-time behavior. This paper asks whether a natural-language skill can instead initialize a compact continuous context object, refined by a trainable soft delta while the base model remains frozen. We propose SoftSkill, a frozen-backbone method that tunes such soft skills with next-token prediction and deploys them as latent behavioral priors at inference time. In our main single-round setting, a length-32 SoftSkill prefix on Qwen3.5-4B improves over no-skill prompting by 8.3 points on SearchQA, 42.1 points on LiveMath, and 1.3 points on DocVQA. Relative to SkillOpt, SoftSkill improves accuracy by 5.2 points on SearchQA and 12.5 points on LiveMath, while replacing hundreds to thousands of Markdown skill tokens with a few virtual tokens. We further study agentic execution as a harder boundary case, where sparse trajectory imitation provides useful signal but does not yet robustly compress long-horizon procedural behavior. More broadly, the results suggest that some task skills are better treated not as additional Markdown to be reinterpreted at inference time, but as compact latent controls over how a frozen model enters the task.

2606.20331 2026-06-19 cs.DS cs.CC 新提交

Computing Twin-Width via Treedepth and Vertex Integrity

通过树深度和顶点完整性计算双宽度

Robert Ganian, Mathis Rocton

AI总结 本文证明,当参数化为树深度时,近似双宽度是固定参数可解的;当参数化为顶点完整性时,精确计算双宽度是固定参数可解的,首次为非平凡参数化算法提供最优收缩序列。

Comments A short version of this preprint appeared at STACS 2026

详情
AI中文摘要

双宽度是一个图参数,已成为解释一阶模型检验在许多图类上固定参数可解性的核心。尽管其算法重要性,计算双宽度仍然知之甚少:甚至识别双宽度至多为4的图是NP难的,并且没有已知的以双宽度本身为参数的固定参数近似。最近突破这一障碍的方法侧重于首先开发以不同于双宽度的参数化来计算或近似双宽度的固定参数算法。我们的第一个结果表明,当以树深度为参数时,近似双宽度是固定参数可解的,从而打破了所有先前可处理的参数化都基于删除距离的长期障碍。证明通过有向双宽度进行,首次提供了该变体可能在算法上更易处理的构造性证据。作为第二个主要结果,我们表明,以顶点完整性为参数时,精确计算双宽度是固定参数可解的。这构成了计算最优收缩序列的第一个非平凡参数化算法。

英文摘要

Twin-width is a graph parameter that has become central to explaining the fixed-parameter tractability of first-order model checking across many graph classes. Despite its algorithmic importance, computing twin-width remains poorly understood: even recognizing graphs of twin-width at most four is NP-hard, and no fixed-parameter approximations parameterized by twin-width itself are known. A recent approach towards breaking this barrier focuses on first developing fixed-parameter algorithms for computing or approximating twin-width under parameterizations distinct from twin-width. Our first result establishes that approximating twin-width is fixed-parameter tractable when parameterized by treedepth, thereby breaking the long-standing barrier that all previous tractable parameterizations were based on deletion distance. The proof proceeds via oriented twin-width, yielding the first constructive evidence that this variant may be easier to handle algorithmically. As our second main result, we show that computing twin-width exactly is fixed-parameter tractable with respect to vertex integrity. This constitutes the first non-trivial parameterized algorithm for computing optimal contraction sequences.

2606.20324 2026-06-19 cs.SE cs.LG 新提交

A Model-Driven Approach for Developing Families of Reinforcement Learning Environments

一种模型驱动的方法用于开发强化学习环境族

Xiaoran Liu, Istvan David

AI总结 提出一种模型驱动方法,通过混合遗传算法和模型转换自动生成强化学习训练环境族,以解决手动开发环境族耗时且易错的问题,并在野火缓解场景中验证了其有效性。

详情
AI中文摘要

虚拟训练环境是软件密集型系统,强化学习(RL)智能体在其中学习、适应并展示有意义的行为。虚拟训练环境为在现实环境中训练智能体提供了一种安全且成本效益高的替代方案。然而,为了收敛,大多数现实的RL问题需要在多个相似但略有不同的环境中进行训练——即环境变体族。环境族的典型开发过程是一项劳动密集型且容易出错的手动工作,难以扩展。为了缓解这些问题,本文提出了一种模型驱动的方法来开发RL训练环境族。为了获得环境族,我们开发了一种方法和原型工具。在我们的方法中,一种混合遗传算法——基于种群的全局搜索和启发式局部搜索的结合——生成环境族。变异和约束被表达为模型转换,并通过最先进的模型转换引擎操作化为搜索过程。我们在野火缓解场景和课程学习(一种依赖于环境族的特定学习范式)中展示了我们方法的有效性。

英文摘要

Virtual training environments are software-intensive systems in which reinforcement learning (RL) agents learn, adapt, and demonstrate meaningful behavior. Virtual training environments offer a safe and cost-efficient alternative to training agents in real-world settings. However, to converge, most realistic RL problems require training in multiple, mostly similar but slightly different environments - i.e., families of environment variants. The typical development process of environment families is a labor-intensive and error-prone manual endeavor that does not scale well. To alleviate these issues, in this paper, we propose a model-driven approach for developing families of RL training environments. To obtain the family of environments, we develop an approach and prototype tool. In our approach, a hybrid genetic algorithm - a combination of population-based global search and heuristic local search - generates environment families. Mutations and constraints are expressed as model transformations and are operationalized into a search process by a state-of-the-art model transformation engine. We demonstrate the soundness of our approach in a wildfire mitigation scenario and curriculum learning - a particular learning paradigm that relies on environment families.

2606.20323 2026-06-19 cs.AI 新提交

Leveraging systems' non-linearity to tackle the scarcity of data in the design of Intelligent Fault Diagnosis Systems

利用系统非线性应对智能故障诊断系统设计中的数据稀缺问题

Giancarlo Santamato, Andrea Mattia Garavagno, Massimiliano Solazzi, Antonio Frisoli

AI总结 提出一种利用系统固有非线性的周期多激励级方法,结合数据可视化与增强技术,在数据稀缺条件下实现基于深度迁移学习的振动故障诊断,并在铁路受电弓结构上验证有效性。

Journal ref Nonlinear Dynamics, vol. 112, pp. 16153-16166, 2024

详情
AI中文摘要

深度迁移学习(DTL)允许高效构建智能故障诊断系统(IFDS)。另一方面,DTL方法仍然严重依赖大量标记数据。在处理机器或结构故障时,获取如此大量的数据可能具有挑战性。本文提出了一种在数据严重稀缺条件下使用DTL设计基于振动的IFDS的新方法。利用真实世界系统固有非线性的周期性多激励级过程生成图像,这些图像可以由预训练的卷积神经网络(CNN)方便地分析以诊断故障。本文提出了一种新的数据可视化方法及其增强技术,以应对IFDS设计过程中典型的数据缺乏问题。在铁路受电弓结构上的实验验证为所提方法提供了有效支持。

英文摘要

Deep Transfer Learning (DTL) allows for the efficient building of Intelligent Fault Diagnosis Systems (IFDS). On the other hand, DTL methods still rely heavily on large amounts of labelled data. Obtaining such an amount of data can be challenging when dealing with faults in machines or structures. This document proposes a novel approach to the design of vibration-based IFDS using DTL under conditions of severe data scarcity. A periodic multi-excitation level procedure leveraging intrinsic non-linearities of real-world systems is used to produce images that can be conveniently analysed by pre-trained Convolutional Neural Networks (CNNs) to diagnose faults. A new data visualization method and its augmentation technique are proposed in this paper to tackle the typical lack of data encountered during the design of IFDS. Experimental validation on a railway pantograph structure provides effective support for the proposed method.