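arXivDaily: a daily academic digest synchronized with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.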

2606.20414 2026-06-19 cs.AR new submission

ExSpike: A General Full-Event Neuromorphic Architecture for Exploiting Irregular Sparsity with Event Compression

Yuehai Chen, Farhad Merchant

AI summary: Proposes ExSpike, a general full-event neuromorphic architecture that achieves pure event-driven execution through dataflow optimizations and introduces adjacent-position event compression to reduce redundant accumulations, yielding energy-efficient SNN acceleration on an FPGA.

Comments: Accepted by the 36th International Conference on Field-Programmable Logic and Applications (FPL 2026); 9 pages, 9 figures

Abstract

Spiking neural networks (SNNs) promise energy-efficient computing due to their sparse spatio-temporal activity. However, effectively translating such irregular sparsity into practical performance and energy gains remains challenging, as full-event computing architectures are still underexplored. This paper proposes ExSpike, a general full-event neuromorphic architecture that fully exploits irregular sparsity in SNNs. To realize pure event-driven execution, we first propose a set of dataflow optimizations to ensure that the inputs to each SNN layer remain spike-based, thereby enabling full-event execution throughout the network. We then design a hardware-efficient full-event architecture, named ExSpike, which supports the optimized pure event-driven dataflow and an additional Attention Core for spike-driven self-attention. To further improve computing efficiency, we introduce adjacent-position event compression to reduce redundant accumulations across spatially adjacent spike sequences. ExSpike is implemented on an AMD Xilinx Virtex-7 FPGA and evaluated on both classification and segmentation workloads. Experimental results show that ExSpike achieves high normalized energy efficiency across diverse SNN models while maintaining competitive accuracy, delivering up to 479.15 GOPS, 281.85 GOPS/W, and 0.80 GOPS/W/PE. In particular, ExSpike achieves up to $10\times$ higher PE-normalized energy efficiency than the SOTA FPGA-based SNN accelerator (FireFly-T). The code for ExSpike is available at https://github.com/xiaoyuehai/ExSpike.

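2606.20411 2026-06-19 cs.LG new submission

Direct Advantage Estimation for Scalable and Sample-efficient Deep Reinforcement Learning

Hsiao-Ru Pan, Bernhard Schölkopf

AI summary: Addresses the limitations of Direct Advantage Estimation (DAE) in partially observable domains and with high-dimensional observations by extending its theoretical framework and introducing a discrete latent dynamics model that reduces computational complexity; experiments on the Arcade Learning Environment validate DAE's scalability and sample efficiency.

Comments: Accepted at RLC2026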

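2606.20410 2026-06-19 cs.MS new submission

MaRDI Open Interfaces for Interoperable Nonlinear Optimization

Dmitry I. Kabanov, Stephan Rave, Mario Ohlberger

AI summary: Presents the MaRDI Open Interfaces software package, which improves interoperability between solvers and programming languages in nonlinear optimization through unified interfaces for numerical problems and automatic data marshalling, reducing the cost of code changes and testing.

Comments: 12 pages, 1 figure, 1 table, deRSE2026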

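2606.20408 2026-06-19 cs.CR cs.AI new submission

LLM agent safety, multi-turn red-teaming, jailbreak benchmarks, adversarial robustness, safety-critical systems

Hanwool Lee, Dasol Choi, Bokyeong Kim, Seung Geun Kim, Haon Park

AI summary: Proposes the NRT-Bench benchmark, which evaluates the adversarial robustness of LLM agents in safety-critical systems through multi-turn red-teaming in a simulated nuclear power plant control room; the vulnerabilities of different models barely overlap, and the effect of defenses is highly model-dependent.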

Abstract

Large language model (LLM) agents are increasingly proposed as supervisory components for safety-critical systems, yet their robustness under sustained, adaptive adversarial pressure remains poorly characterized. We present NRT-Bench, a benchmark for multi-turn red-teaming of LLM agents acting as operators of a safety-critical system, instantiated in a simulated nuclear power plant control room. A five-role operator team, each backed by a configurable LLM, runs a plant governed by six critical safety functions (CSFs), while adversaries inject messages over four channels in bounded multi-turn sessions with per-turn feedback. Harm is an objective signal rather than LLM-judged text: a run terminates the moment any CSF is lost, attributed to the causing message. Evaluating four frontier operator models under a fixed-attack paired-replay protocol, we find that adaptive multi-turn attacks reliably push the operator team past a safety limit: across the four models, between 8.7% and 12.1% of attack sessions end with the plant losing a critical safety function. Although the four models look almost equally robust by this aggregate rate, their failures barely overlap: of $149$ sessions, none defeat all four models while a third defeat at least one, so vulnerabilities are nearly disjoint across models rather than nested. The effect of added defences is strongly model-dependent: the same guardrail stack or safety-advisor agent that lowers attack success for one model can raise it for another. We release the simulation venue, attack dataset, and replay tooling for reproducible safety evaluation of LLM agents.
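The distinction between disjoint and nested vulnerabilities can be made concrete with set operations. The sketch below is illustrative only: the session IDs, the per-model failure sets, the session count, and the model names are hypothetical toy data, not NRT-Bench results.

```python
# Toy illustration of "disjoint vs. nested" vulnerabilities.
# All data here is hypothetical; it is NOT taken from NRT-Bench.
from functools import reduce

# Sessions (by ID) in which each placeholder model lost a critical safety function.
failures = {
    "model_a": {1, 2, 3},
    "model_b": {4, 5},
    "model_c": {6, 7, 8},
    "model_d": {3, 9},
}
total_sessions = 30  # hypothetical size of the attack set

# Sessions that defeat every model (empty when failures are close to disjoint).
defeats_all = reduce(set.intersection, failures.values())
# Sessions that defeat at least one model.
defeats_any = reduce(set.union, failures.values())

per_model_rate = {m: len(s) / total_sessions for m, s in failures.items()}
share_defeating_any = len(defeats_any) / total_sessions
```

In this toy setup each model fails on at most 10% of sessions, yet 30% of sessions defeat some model and none defeats all of them, which is the pattern the abstract describes as nearly disjoint rather than nested.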

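2606.20404 2026-06-19 cs.CV new submission

FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

Daniel Gilo, Sven Elflein, Ido Sobol, Or Litany

Affiliations: Technion; NVIDIA; University of Toronto; Vector Institute

AI summary: Conditional diffusion and flow models often violate the constraints of their task; FlowBender is a closed-loop framework that feeds the alignment error to the network as an input and trains it to learn a correction policy, improving fidelity and plausibility simultaneously in image-to-image translation, restoration, and 3D texturing.

Comments: Project page: https://flow-bender.github.io/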

Abstract

Conditional diffusion and flow models routinely fail to satisfy the very constraints that define their task. For instance, a depth-conditioned model often produces images whose re-extracted depth disagrees with the input, even though the forward operator (the depth predictor defining the constraint) is available during both training and inference. Existing approaches generally fall into two categories: supervised models that treat the conditioning signal as a static cue and ignore alignment information at inference, and guidance-based methods that consult it through hand-tuned linear updates, typically trading fidelity to the condition against the plausibility of the generated sample. We argue that the fundamental gap in both paradigms is that the model is never trained to utilize its own alignment error. We introduce FlowBender, a closed-loop framework that treats this error as a first-class input, training the network to learn a correction policy conditioned on inference-time feedback. At each step, an unguided look-ahead pass estimates the clean signal, a task-specific deviation is computed via the forward operator, and a refinement pass consumes this signal to produce a corrected velocity. We propose several variants of FlowBender, including a gradient-based formulation for differentiable operators and a zero-order variant for non-differentiable settings such as JPEG compression. For efficient sampling, we introduce a prior-step shortcut that enables closed-loop correction at a minimal additional computational cost. Across image-to-image translation, restoration, and 3D mesh texturing, FlowBender consistently outperforms standard supervised baselines, alignment-loss-augmented training, and state-of-the-art inference-time guidance, improving fidelity and plausibility simultaneously rather than trading them against each other. Project page: https://flow-bender.github.io/
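The three passes named in the abstract (look-ahead, deviation, refinement) can be sketched as a single sampling step. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a straight-line flow from noise at t=0 to data at t=1, treats the unguided pass as the same network called with zero deviation, omits the prior-step shortcut, and uses hypothetical scalar stand-ins for the network and the forward operator.

```python
# Minimal sketch of one closed-loop sampling step as described in the abstract.
# Every function here is a hypothetical stand-in; none of this is FlowBender's code.

def lookahead_clean_estimate(x, t, velocity):
    # Assumes a straight-line path from noise (t=0) to data (t=1),
    # so the clean signal is extrapolated as x + (1 - t) * v.
    return x + (1.0 - t) * velocity(x, t, deviation=0.0)

def closed_loop_step(x, t, dt, velocity, forward_op, condition):
    x1_hat = lookahead_clean_estimate(x, t, velocity)  # unguided look-ahead pass
    deviation = forward_op(x1_hat) - condition         # task-specific alignment error
    v = velocity(x, t, deviation=deviation)            # refinement pass consumes the error
    return x + dt * v                                  # Euler update with corrected velocity

# Toy scalar stand-ins so the sketch runs end to end.
def toy_velocity(x, t, deviation):
    return 1.0 - 0.5 * deviation  # hypothetical "learned" correction

def toy_forward_op(x):
    return 2.0 * x                # hypothetical forward operator

x_next = closed_loop_step(x=0.0, t=0.0, dt=0.1,
                          velocity=toy_velocity,
                          forward_op=toy_forward_op,
                          condition=1.0)
```

With these toy functions the look-ahead estimate is 1.0, the deviation is 1.0, and the corrected velocity is 0.5, so the step moves x from 0.0 to 0.05 instead of the 0.1 an uncorrected step would give.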

2606.20401 2026-06-19 eess.SY cs.SY new submission

PowerAgentBench-Dyn: A Benchmark for Agentic AI in Power System Dynamic Studies

Qian Zhang, Andrea Pomarico, Costas Mylonas, Magda Foti, Alberto Berizzi, Le Xie

AI summary: Proposes the PowerAgentBench-Dyn benchmark for evaluating LLM-based agents on power system dynamic-analysis tasks, covering two tasks: model quality review and security risk screening.

Abstract

Large Language Model (LLM)-based agents are increasingly being used to automate multi-step engineering workflows by interacting with software tools, interpreting intermediate results, and autonomously planning subsequent actions. Power system dynamic studies represent a particularly promising yet largely unexplored application domain for these agents. Unlike static computational tasks, dynamic studies often require more time on model parameter calibration, engineering judgment, and decision making under constrained action spaces. This paper introduces PowerAgentBench-Dyn, a benchmark designed to evaluate Agentic AI systems on power system dynamic-analysis tasks. The benchmark targets problems that cannot be reduced to a single optimization or coding task, but instead require the kind of reasoning, tool usage, and iterative experimentation routinely performed by experienced power system engineers. The proposed framework includes two initial benchmark tasks. The first, the Dynamic Model Quality Review Benchmark, evaluates agents' ability to validate and diagnose dynamic models based on model-quality compliance criteria specified by system operators. The second, the Dynamic Security Risk Screening Benchmark, assesses agents' capability to leverage semantic memory and a limited simulation budget to identify, rank, and analyze the most critical short-circuit contingencies from an unseen fault dataset, as well as to propose and evaluate possible mitigation measures. For each task, we define the simulation environment, observation and action spaces, and evaluation metrics. The benchmark is reproducible in a metric-based sense: released cases and simulator settings define a deterministic evaluator, while stochastic agent behavior is assessed over repeated runs using success rates and other metrics. The benchmark supports the development of future Agentic AI for power system operation and planning.

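2606.20400 2026-06-19 cs.LG new submission

The Significance of Style Diversity in Annotation-Free Synthetic Data Generation

Zahra Abbasiantaeb, Zeno Belligoli, Omar Essam, Mohammad Aliannejadi

Affiliations: University of Amsterdam

AI summary: Proposes a dialogue generation framework that requires no human annotation and uses topic and style attributes to increase diversity, together with two post-hoc stylization models; experiments show that style diversity matters more than topic diversity, with performance reaching up to 93.3% of that obtained with human-annotated data.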

Abstract

Generating high-utility synthetic data for intent classification typically requires human-annotated seed data, which is often unavailable in fast-paced industrial settings. In this paper, we propose a framework for synthetic dialogue generation that works entirely without human-annotated data, relying solely on intent definitions. Our proposed dialogue generation framework utilizes two different types of topic and style attributes to improve data diversity. Also, we propose two novel post-hoc stylization models called Univ and Exam to transform synthetic LLM-generated utterances into more varied, human-like linguistic styles. To enhance data quality, we utilize an LLM-as-a-judge filtering process. Experimental results on both industrial and public datasets demonstrate that the proposed approach achieves up to 93.3% of the performance obtained using human-annotated training data. Crucially, the findings reveal that style diversity is more critical than topic diversity for synthetic data utility, as it prevents models from learning spurious stylistic correlations. Furthermore, the study shows that incorporating style attributes during the generation process is more effective than post-hoc style adaptation.

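2606.20399 2026-06-19 cs.CC new submission

Linked Fates: How Small of an Ambiguity Increase Can Make the Difference Between Equaling and Separating from P?

Benjamin Carleton, Michael C. Chavrimootoo, Lane A. Hemaspaandra, David E. Narváez, Conor Taliancich, Melissa Welsh

AI summary: Studies when the ambiguity-bounded versions of NP, $\mathrm{UP}_{\leq f(n)}$, equal P; using path-poisoning and padding techniques, it proves that for certain ambiguity ranges $\mathrm{P} = \mathrm{UP}_{\leq f_1(n)}$ implies $\mathrm{P} = \mathrm{UP}_{\leq f_2(n)}$, and gives relativized results showing that the implication does not hold robustly in other cases.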

Abstract

Ambiguity-bounded versions of $\mathrm{NP}$, denoted $\mathrm{UP}_{\leq f(n)}$, bound by $f(n)$ the number of accepting paths the nondeterministic polynomial-time Turing machine can have on inputs of length $n$. Such classes range from Valiant's completely unambiguous ($f(n)=1$) class $\mathrm{UP}$ to $\mathrm{NP}$ itself, where there is no bound or, equivalently, there is the toothless exponential bound ($f(n) = 2^{n^{O(1)}}$). This paper seeks to understand which of these classes stand and fall together as to whether they equal deterministic polynomial time. Informally put, what ranges of ambiguities have linked fates? That is, for which pairs of nondecreasing functions, $(f_1 ,f_2)$, satisfying $(\forall n)[f_1(n) \leq f_2(n)]$, does it hold that $\mathrm{P} = \mathrm{UP}_{\leq f_1(n)} \implies \mathrm{P} = \mathrm{UP}_{\leq f_2(n)}$. More particularly, for which pairs does that hold robustly, i.e., it holds in the real world and every relativized world? And for which pairs does that implication fail to hold robustly, i.e., there is an oracle $A$ such that $\mathrm{P}^A = \mathrm{UP}_{\leq f_1(n)}^A \subsetneq \mathrm{UP}_{\leq f_2(n)}^A$? The only previously known positive result is Watanabe's 1988 result that $ \mathrm{P} = \mathrm{UP}_{\leq 1} \implies (\forall k \geq 1)[\mathrm{P} = \mathrm{UP}_{\leq k}]$, which even holds robustly. His result, though lovely, applies only to constant-bounded ambiguities. As our positive result, we present a new class of cases (Theorem 3.8) that apply (and even robustly apply) at greater ambiguity levels. To give our class of cases, we leverage two approaches: a novel path-poisoning approach that works even on superconstant ambiguities (Theorem 3.5) and a new application of the power of padding (Theorems 3.3/3.4). As negative results, we show that for essentially all other cases, no linkage holds robustly.
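As orientation for why only one direction of the implication is at issue: a machine with at most $f_1(n)$ accepting paths also has at most $f_2(n)$ accepting paths whenever $f_1 \leq f_2$ pointwise, so the classes are nested. This is a standard observation stated here for context, not a result quoted from the paper.

$$
\mathrm{P} \subseteq \mathrm{UP} = \mathrm{UP}_{\leq 1} \subseteq \mathrm{UP}_{\leq f_1(n)} \subseteq \mathrm{UP}_{\leq f_2(n)} \subseteq \mathrm{NP}
\qquad \text{whenever } (\forall n)[1 \leq f_1(n) \leq f_2(n)].
$$

It follows that $\mathrm{P} = \mathrm{UP}_{\leq f_2(n)} \implies \mathrm{P} = \mathrm{UP}_{\leq f_1(n)}$ holds trivially, so the nontrivial question is the upward direction, from the smaller ambiguity bound to the larger one.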

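2606.20390 2026-06-19 cs.CV new submission

Geometry-Aware Superpixel Graph Transformer with Metadata for Skin Lesion Classification

Muhammad Azeem, Tanveer Hussain, Amr Ahmed, Ardhendu Behera

Affiliations: Edge Hill University

AI summary: Proposes a region-based graph learning framework that models lesions as superpixel graphs, using geometric edge attributes and a metadata context node with an edge-aware graph transformer for multimodal fusion, and achieves better classification performance than existing methods on four public datasets.

Comments: Accepted at MICCAI 2026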

Abstract

Automated skin cancer classification from dermoscopic images remains challenging due to heterogeneous lesion structure, strong intra-class variability, and subtle visual differences between benign and malignant cases. Existing CNN/ViT pipelines typically rely on global or patch-level features and often combine patient metadata via late fusion, which limits spatially grounded multimodal reasoning. We present a novel region-based graph learning framework that explicitly models lesions as graphs of spatially coherent superpixel regions represented as frozen CNN features. To capture fine-grained lesion arrangements, we encode inter-regional geometry as edge attributes and introduce a dedicated metadata context node connected to all regions, providing structured integration of demographic/clinical variables within the same relational space. Node representations are updated using our edge-aware graph transformer followed by attention-driven propagation, and a final graph-level embedding is produced for benign-malignant classification. Experiments on four public benchmarks demonstrate that explicit region-level relational modeling and graph-native multimodal fusion yield consistent gains over the state of the art. Consequently, we establish a new graph-centric perspective in which CNN features are modeled as relational nodes and improved through contextual integration, yielding more expressive and robust classifications.

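2606.20389 2026-06-19 cs.RO new submission

CoLI: A Reproducible Platform for Continuum Robot Learning via Monolithic 3D Printing and Isomorphic Teleoperation

Ziyuan Tang, Chenxi Xiao*

AI summary: Proposes a continuum robot platform based on multi-material 3D printing and isomorphic teleoperation that simplifies fabrication and provides a singularity-free control mapping, supports autonomous control via imitation learning, and is validated for reproducibility and learning readiness through hardware characterization and manipulation tasks.

Comments: 8 pages, 7 figures, 1 table, accepted by IROS2026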

Abstract

Continuum robots offer strong potential for manipulation tasks due to their high degrees of freedom, compliant structures, and operational safety. However, their adoption in both research and practical applications has been hindered by reproducibility issues arising from complex fabrication and assembly processes, challenging kinematic modeling, and a lack of intuitive control interfaces. To address these challenges, we present a novel open-source continuum robot design. The platform features a simplified fabrication pipeline enabled by multi-material 3D printing, allowing the arm to be fabricated as a monolithic compliant structure with minimal assembly. Control is achieved through an isomorphic teleoperation interface that establishes a direct actuator-level mapping, eliminating the need for explicit kinematic modeling and providing a singularity-free mapping. Building on this hardware design, the platform further supports imitation-learning-based autonomous control. The proposed system is evaluated through hardware characterization and a set of manipulation tasks. Experimental results demonstrate that the platform provides a reproducible, learning-ready continuum robot system, accelerating algorithmic development and systematic benchmarking for the continuum robotics community.

2606.20388 2026-06-19 cs.HC cs.AI cs.DB new submission

DataMagic: Transforming Tabular Data into Data Insight Video

Yupeng Xie, Chen Ma, Zhenyang Wang, Liangwei Wang, Jiayi Zhu, Chuxuan Zeng, Zhouan Shen, Boyan Li, Yuyu Luo

AI summary: Presents DataMagic, a system that turns raw tabular data and natural language queries into narrative data-insight videos using the declarative specification DVSpec and a multi-agent architecture, with support for interactive exploration.

Comments: 5 pages, 3 figures, accepted at VLDB 2026

Abstract

Data videos integrate dynamic charts, voice narration, and synchronized animations to communicate data insights as temporal narratives, making them an effective medium for improving data consumption efficiency in the data management lifecycle. However, producing high-quality data videos requires expertise spanning data analysis, narrative design, and video production. Existing approaches fall short: static visualization tools (e.g., BI dashboards) lack narrative logic and animation; authoring tools require users to pre-prepare visualizations rather than working from raw data; pixel-level video generation models cannot guarantee data fidelity or provenance. We demonstrate DataMagic, an end-to-end interactive system that transforms raw tabular data and natural language queries into narrative data-insight videos. To ensure data fidelity, DataMagic introduces the declarative specification DVSpec, which binds visual and animation elements to underlying data fields through data-driven semantic references. To address the combinatorial explosion of the design space, DataMagic adopts a Generate-then-Orchestrate multi-agent architecture that generates candidate scenes in parallel and then optimizes narrative coherence through global orchestration. Leveraging DVSpec's decoupling of logic and rendering, the system further supports three interaction modes and structured provenance-based data Q&A, transforming one-way videos into explorable interactive data interfaces. Evaluation on 109 real-world samples validates the effectiveness of DataMagic. Homepage: https://datamagic-home.github.io/

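2606.20382 2026-06-19 cs.LG new submission

Towards Modality-imbalanced Federated Graph Learning: A Data Synthesis-based Approach

Zhengyu Wu, Hongchao Qin, Xunkai Li, Zekai Chen, Rong-Hua Li, Guoren Wang

AI summary: To address client-level and node-level modality imbalance in federated graph learning, proposes FedMGS, an implicit graph-aware latent semantic representation synthesis paradigm that recovers missing modality semantics through an availability-aware graph encoder, a prototype-guided semantic synthesizer, and a reliability-calibrated fusion mechanism, with gains of up to 17.41% on four tasks.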

Abstract

MultiModal Federated Graph Learning (MM-FGL) offers a natural collaborative training paradigm, but its practical deployment is challenged by two granularities of modality imbalance. Client-level imbalance occurs when certain clients lack entire modalities, while node-level imbalance occurs when individual nodes exhibit missing visual or textual attributes. While several relevant studies exist, our investigation reveals that they predominantly target graph-agnostic or centralized scenarios, rendering them difficult to adapt directly. To address these challenges, we formalize modality-imbalanced MM-FGL as an implicit graph-aware latent semantic representation synthesis problem. This paradigm recovers missing modal semantics directly within the representation space, thereby maximizing alignment with the original data's semantic distribution and mitigating the high variance induced by missing modalities. To this end, we propose FedMGS (Federated Modality-aware Graph Synthesis), which integrates three core components. The availability-aware graph encoder prevents missing modalities from contaminating local structural propagation. The prototype-guided latent semantic synthesizer establishes cross-client semantic anchors for unavailable modalities. The reliability-calibrated semantic fusion mechanism regulates the impact of recovered latent representations prior to predictive readout. Extensive experiments on four tasks show that FedMGS consistently outperforms competitive baselines, with gains of up to 17.41% and the best efficiency-performance tradeoff.

2606.20381 2026-06-19 cs.AI new submission

Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe

Qian Zhao, Kunlong Chen, Changxin Tian, Zhonghui Jiang, Haitao Zhang, Chaofan Yu, Peijie Jiang, Mingliang Gong, Jia Liu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou

Affiliations: Ling Team, Ant Group

AI summary: Finds that the E2M1 format suffers from shrinkage bias caused by geometric asymmetry, and that this bias is amplified by the Random Hadamard Transform, leading to training instability; proposes the uniform-grid E1M2/INT4 formats and the UFP4 training recipe, which achieve lower loss on several models.

Comments: 18 pages, 12 figures

Abstract

FP4 training promises substantial reductions in memory and computation cost for LLM pretraining, yet current FP4 hardware paths and recipes, including NVIDIA Blackwell/Rubin-class systems and AMD MI350-series GPUs, remain centered on E2M1 data elements. In this study, we identify a fundamental limitation of that choice: non-uniform formats such as E2M1 inherently suffer from Shrinkage Bias, a systematic negative rounding error caused by the geometric asymmetry of their representable bins. We show that this bias accumulates multiplicatively across layers and is amplified by the Random Hadamard Transform (RHT), providing a unified explanation for the training instability observed in existing E2M1-based FP4 recipes. In contrast, uniform grids (E1M2/INT4) bypass this grid-geometry error and better convert the improved bucket utilization from RHT into higher quantization quality. Based on this finding, we propose UFP4, a uniform 4-bit training recipe that applies RHT to all three training GEMMs while restricting stochastic rounding to dY alone. On Dense 1.5B, MoE 7.9B, and MoE 124B long-run pretraining, UFP4 consistently achieves lower BF16-relative loss degradation than strong E2M1-based baselines, supported by scaling-law analysis and ablation studies. Our results suggest that future accelerators should support E1M2/INT4-style uniform 4-bit grids as first-class training primitives alongside E2M1.
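The grid geometry behind the abstract's contrast between non-uniform and uniform formats can be inspected directly. The sketch below lists the representable magnitudes of an E2M1 element and compares their spacing with an eight-level uniform grid. The uniform grid, the helper names, and the sample inputs are assumptions chosen for illustration; the code shows only the bin geometry and does not reproduce the paper's bias analysis or the UFP4 recipe.

```python
# Representable magnitudes of the FP4 E2M1 element format (sign bit omitted),
# compared with a uniform grid that has the same number of magnitude levels.
E2M1_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
UNIFORM_LEVELS = [float(i) for i in range(8)]  # illustrative uniform grid (an assumption)

def bin_widths(levels):
    """Gaps between consecutive representable values."""
    return [b - a for a, b in zip(levels, levels[1:])]

def round_to_nearest(x, levels):
    """Round a non-negative value to the closest representable level."""
    return min(levels, key=lambda g: abs(g - x))

e2m1_widths = bin_widths(E2M1_LEVELS)        # widths grow toward large magnitudes
uniform_widths = bin_widths(UNIFORM_LEVELS)  # every bin has the same width

# In the widest E2M1 bin [4, 6], inputs below the midpoint 5 snap down to 4
# and inputs above it snap up to 6.
low = round_to_nearest(4.9, E2M1_LEVELS)
high = round_to_nearest(5.2, E2M1_LEVELS)
```

The E2M1 bins widen from 0.5 to 2.0 as magnitude grows, whereas every uniform bin has width 1.0; the abstract attributes the shrinkage bias to this kind of geometric asymmetry in the representable bins.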

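2606.20376 2026-06-19 cs.LG cs.AI new submission

CRAX: Fast Safe Reinforcement Learning Benchmarking

Tristan Tomilin, Mourad Boustani, Mickey Beurskens, Thiago D. Simão

Affiliations: Eindhoven University of Technology

AI summary: Proposes CRAX, a JAX-accelerated safe RL benchmark built on the MJX physics engine with speedups of up to about 100x; it includes six environment suites and three agent-specific tasks, and an evaluation of six methods reveals the trade-off between performance and safety.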

Abstract

Safety is a core concern for deploying reinforcement learning (RL) agents in real-world domains such as robotics and autonomous driving. While benchmarks have been central to progress in RL, existing safety benchmarks with high-fidelity 3D physics remain computationally slow, limiting large-scale experimentation and rapid prototyping. To address this gap, we propose CRAX (Constrained RL Accelerated with JAX). Built on top of the MuJoCo XLA (MJX) physics engine with realistic 3D dynamics, CRAX leverages vectorized operations and hardware acceleration, yielding up to ~100x speedups over comparable CPU-based safety benchmarks. The benchmark features six environment suites and three agent-specific tasks, each spanning three difficulty levels. Evaluating six popular safe RL methods shows that no single approach dominates across all tasks, and reveals the trade-offs between performance and safety. We find that curriculum learning across difficulty levels and safety transfer can improve performance over direct training in harder settings.

2606.20375 2026-06-19 cs.HC cs.CY new submission

Organizing in the Digital Age: Understanding Community, Challenges, and Consequences in Digitally-facilitated Labor Organizing

Frederick Reiber, Alishah Chator, Dana Calacci, Allison McDonald

AI summary: Through 17 qualitative interviews, analyzes how labor organizers use digital platforms such as Discord, WhatsApp, and Slack for organizing, revealing challenges and opportunities around technical security, information overload, and trust building.

Comments: To appear in CSCW 2026

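2606.20374 2026-06-19 cs.DC new submission

ARGUS: Production-Scale Tracing and Performance Diagnosis for over 10,000-GPU Clusters

Jiasheng Zhou, Longbin Zeng, Clavis Chen, Ruiming Lu, Qinwei Yang, Leyi Ye, Ray Ying, Key Zhang

AI summary: Proposes ARGUS, a low-overhead, fine-grained, always-on tracing and real-time analysis system that decomposes the training call hierarchy, builds a unified data pipeline, and applies a progressive diagnosis framework to deliver continuous fail-slow detection and performance optimization at under 2% overhead on clusters of more than 10,000 GPUs.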

Abstract

Large-scale LLM training requires always-on, fine-grained observability for effective performance diagnosis at scale. Coarse resource monitors alone cannot localize root causes, and fine-grained profilers incur prohibitive (5%-30%) overheads and massive trace volumes, making always-on deployment impractical in large production clusters. We propose ARGUS, a low-overhead, fine-grained, always-on tracing and real-time analysis system for training workloads in 10,000+ GPU-scale production clusters. ARGUS decomposes observation along the training call hierarchy into CPU call stacks, framework semantics, and GPU kernel execution, with always-on collection under a combined overhead of less than 2%. It builds a unified data pipeline and compresses raw kernel events by approximately 3,700x from 10 MB to 2.7 KB per rank per step. Its progressive diagnosis framework automatically isolates anomalous windows, straggler ranks, and degraded kernels through iteration-time, phase-level, and kernel-level analysis. Deployed for over six months on a 10,000+ GPU production cluster, ARGUS has supported continuous fail-slow detection and performance optimization. Our case studies further demonstrate its effectiveness across representative anomalies, including compute stragglers, link degradation, pipeline-bubble amplification, FlashAttention JIT stalls, and compute stragglers masked by communication symptoms.
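The reported compression ratio can be checked with simple arithmetic. The sketch below assumes decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes), which the abstract does not specify, and the cluster-wide figures further assume exactly 10,000 ranks with one rank per GPU, a hypothetical round number used only for scale.

```python
# Sanity check of the reported compression ratio, assuming decimal units
# (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes); the abstract does not state the convention.
raw_bytes_per_rank_step = 10 * 1_000_000       # 10 MB of raw kernel events
compressed_bytes_per_rank_step = 2.7 * 1_000   # 2.7 KB after compression

ratio = raw_bytes_per_rank_step / compressed_bytes_per_rank_step

# Hypothetical scale-up: trace volume per training step across 10,000 ranks.
ranks = 10_000
raw_gb_per_step = raw_bytes_per_rank_step * ranks / 1e9
compressed_mb_per_step = compressed_bytes_per_rank_step * ranks / 1e6
```

Under these assumptions the ratio is about 3,704, consistent with the stated "approximately 3,700x", and the per-step trace volume across the cluster falls from roughly 100 GB to roughly 27 MB.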

2606.20373 2026-06-19 cs.SE cs.AI New submission

Sparse Add-On Controller Design: A Youla Approach for System-Level Performance

M. van der Hulst, N. Dirkx, R. A. González, K. Tiels, J. van de Wijdeven, T. Oomen

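AI summary: Proposes a sparse add-on controller design framework based on the Youla parameterization that solves a sparse H2 synthesis problem via convex optimization, achieving an optimal trade-off between system-level performance and interconnection complexity.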

2606.20359 2026-06-19 cs.LG New submission

Train, Retrieve, or Both? A Four-Arm Head-to-Head for Correct Statutory Citation on the Ontario Residential Tenancies Act

Ali Asaria, Tony Salomone, Deep Gandhi

Affiliation: Transformer Lab

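AI summary: Studies how self-represented tenants, landlords, and help-desk staff can obtain correct statutory citations. A four-arm experiment comparing fine-tuning, retrieval, and a hybrid of the two finds that the SFT+RAG hybrid scores highest on exact match, with no hallucinated citations.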

Abstract

Self-represented tenants, landlords, and help-desk staff need to be pointed at the provision of law that actually governs a question, with a correct statutory citation. We study this task on the Ontario Residential Tenancies Act, 2006 (RTA) and its core regulation, asking the operator's question empirically: is fine-tuning enough, or is hybrid retrieval needed? We run a four-arm head-to-head on Qwen2.5-7B-Instruct (base zero-shot, LoRA SFT-only, RAG-only, and an SFT+RAG hybrid), scored on citation exact-match (section+subsection) over a small, human-verification-pending real eval set. The base model cannot cite the RTA and SFT-only mis-recalls sections; retrieval is essential and drives hallucination to zero by construction; and the SFT+RAG hybrid scores highest at 0.481 exact-match with zero hallucinated citations. Its edge comes from SFT making provision selection more robust to the higher-recall candidate sets that hurt zero-shot RAG. Notably, this cheap bge-small hybrid matches or beats a pipeline built on bigger, specialized retrieval models (a larger embedder and a cross-encoder reranker), and a larger/improved training set does not help either: strong statutory-citation performance here does not require specialized retrieval models or more data. The artifact zeroes hallucination and clears the lift-over-base bar but does not reach the aspirational 0.70 exact-match target. All results are on a small, human-verification-pending real eval set and are reported as preliminary.
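To make the scoring concrete, here is a minimal sketch of citation exact-match over (section, subsection) pairs, plus a simple hallucination rate. It is not the authors' evaluation code. The citation string format, the regular expression, the helper names `parse_citation`, `exact_match_score`, and `hallucination_rate`, and the definition of a hallucination as "a cited provision that is absent from a known set" are all assumptions made for illustration.

```python
import re
from typing import Iterable, Optional, Sequence, Set, Tuple

Citation = Tuple[str, str]  # (section, subsection); subsection may be ""

# Hypothetical citation format: "s. 83(1)", "Section 59 (1)", "s. 48.1".
CITATION_RE = re.compile(
    r"(?:ss\.|s\.|section)\s*(\d+(?:\.\d+)?)\s*(?:\(\s*(\d+(?:\.\d+)?)\s*\))?",
    re.IGNORECASE,
)


def parse_citation(text: str) -> Optional[Citation]:
    """Extract the first (section, subsection) pair, or None if there is none."""
    match = CITATION_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


def exact_match_score(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of items whose predicted section AND subsection both match."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = 0
    for predicted, expected in zip(predictions, references):
        parsed = parse_citation(predicted)
        if parsed is not None and parsed == parse_citation(expected):
            hits += 1
    return hits / len(references)


def hallucination_rate(predictions: Iterable[str], known: Set[Citation]) -> float:
    """Fraction of predictions that cite a provision missing from `known`."""
    predictions = list(predictions)
    if not predictions:
        return 0.0
    parsed = [parse_citation(p) for p in predictions]
    invented = sum(1 for c in parsed if c is not None and c not in known)
    return invented / len(predictions)
```

A usage example with toy values (illustrative only, not taken from the paper or checked against the Act): with predictions `["s. 83(1)", "Section 59 (1)", "s. 20(2)", "no citation"]` and references `["s. 83(1)", "s. 59(1)", "s. 20(1)", "s. 37(1)"]`, `exact_match_score` returns 0.5, because a correct section with the wrong subsection does not count as a match.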

2606.20357 2026-06-19 cs.LG New submission

On the Variance of Temporal Difference Learning and its Reduction Using Control Variates

Hsiao-Ru Pan, Bernhard Schölkopf

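AI summary: Analyzes the variance of temporal difference learning in the phased setting with tabular representations, shows that its variance reduction arises from effectively aggregating more independent trajectories, and compares variance bounds for TD, MC, and DAE.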

Comments Accepted at RLC2026

2606.20351 2026-06-19 cs.LO cs.PL New submission

Computing Twin-Width via Treedepth and Vertex Integrity

Robert Ganian, Mathis Rocton

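AI summary: Shows that approximating twin-width is fixed-parameter tractable when parameterized by treedepth, and that computing it exactly is fixed-parameter tractable when parameterized by vertex integrity, giving the first non-trivial parameterized algorithm for computing optimal contraction sequences.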

Comments A short version of this preprint appeared at STACS 2026

DOI: 10.4230/LIPIcs.STACS.2026.42

Abstract

Twin-width is a graph parameter that has become central to explaining the fixed-parameter tractability of first-order model checking across many graph classes. Despite its algorithmic importance, computing twin-width remains poorly understood: even recognizing graphs of twin-width at most four is NP-hard, and no fixed-parameter approximations parameterized by twin-width itself are known. A recent approach towards breaking this barrier focuses on first developing fixed-parameter algorithms for computing or approximating twin-width under parameterizations distinct from twin-width. Our first result establishes that approximating twin-width is fixed-parameter tractable when parameterized by treedepth, thereby breaking the long-standing barrier that all previous tractable parameterizations were based on deletion distance. The proof proceeds via oriented twin-width, yielding the first constructive evidence that this variant may be easier to handle algorithmically. As our second main result, we show that computing twin-width exactly is fixed-parameter tractable with respect to vertex integrity. This constitutes the first non-trivial parameterized algorithm for computing optimal contraction sequences.
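For readers unfamiliar with contraction sequences, the sketch below evaluates the width of a given sequence using the standard trigraph definition: contracting two vertices keeps a black edge to a neighbour only if both were black-adjacent to it, leaves a non-edge if neither was adjacent, and creates a red edge otherwise. The width is the largest red degree seen along the way. This is not the paper's algorithm. It only scores one sequence, whereas twin-width is the minimum width over all sequences. The function name and input encoding are assumptions.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple

Vertex = Hashable


def contraction_sequence_width(
    vertices: Iterable[Vertex],
    edges: Iterable[Tuple[Vertex, Vertex]],
    sequence: Sequence[Tuple[Vertex, Vertex]],
) -> int:
    """Return the maximum red degree produced by contracting each (u, v) into u."""
    black: Dict[Vertex, Set[Vertex]] = {v: set() for v in vertices}
    red: Dict[Vertex, Set[Vertex]] = {v: set() for v in black}
    for a, b in edges:
        black[a].add(b)
        black[b].add(a)

    width = 0
    for u, v in sequence:
        for w in list(black):
            if w == u or w == v:
                continue
            # Classify w relative to the pair before touching any edges.
            both_black = w in black[u] and w in black[v]
            adjacent_to_either = (
                w in black[u] or w in black[v] or w in red[u] or w in red[v]
            )
            # Drop the old u-w edge, then re-add it with the right colour.
            black[u].discard(w)
            black[w].discard(u)
            red[u].discard(w)
            red[w].discard(u)
            if both_black:
                black[u].add(w)
                black[w].add(u)
            elif adjacent_to_either:
                red[u].add(w)
                red[w].add(u)
        # Remove v and every edge incident to it.
        for w in black.pop(v):
            black[w].discard(v)
        for w in red.pop(v):
            red[w].discard(v)
        width = max(width, max(len(nbrs) for nbrs in red.values()))
    return width
```

On the path a-b-c-d, the sequence `[("a", "b"), ("a", "c"), ("a", "d")]` has width 1, while any contraction sequence of a triangle has width 0 because every remaining vertex is black-adjacent to both contracted vertices.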

2606.20324 2026-06-19 cs.SE cs.LG New submission

A Model-Driven Approach for Developing Families of Reinforcement Learning Environments

Xiaoran Liu, Istvan David

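AI summary: Proposes a model-driven approach that automatically generates families of reinforcement learning training environments through a hybrid genetic algorithm and model transformations, addressing the time-consuming and error-prone manual development of environment families, and validates it in a wildfire mitigation scenario.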

Abstract

Virtual training environments are software-intensive systems in which reinforcement learning (RL) agents learn, adapt, and demonstrate meaningful behavior. Virtual training environments offer a safe and cost-efficient alternative to training agents in real-world settings. However, to converge, most realistic RL problems require training in multiple, mostly similar but slightly different environments - i.e., families of environment variants. The typical development process of environment families is a labor-intensive and error-prone manual endeavor that does not scale well. To alleviate these issues, in this paper, we propose a model-driven approach for developing families of RL training environments. To obtain the family of environments, we develop an approach and prototype tool. In our approach, a hybrid genetic algorithm - a combination of population-based global search and heuristic local search - generates environment families. Mutations and constraints are expressed as model transformations and are operationalized into a search process by a state-of-the-art model transformation engine. We demonstrate the soundness of our approach in a wildfire mitigation scenario and curriculum learning - a particular learning paradigm that relies on environment families.
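The combination of population-based global search and heuristic local search described above can be sketched as follows. This is a toy illustration, not the authors' tool: environments are encoded as plain integer lists rather than models, mutations and constraints are ordinary Python functions rather than model transformations, and the fitness function, bounds, and all names are hypothetical.

```python
import random
from typing import Callable, List

Env = List[int]  # hypothetical encoding: one integer parameter per environment feature


def satisfies_constraints(env: Env, max_total: int) -> bool:
    """Toy constraint standing in for a model-level validity check."""
    return sum(env) <= max_total


def mutate(env: Env, rng: random.Random, low: int, high: int) -> Env:
    """Global-search step: nudge one randomly chosen parameter by +/-1."""
    child = list(env)
    i = rng.randrange(len(child))
    child[i] = min(high, max(low, child[i] + rng.choice([-1, 1])))
    return child


def local_search(env: Env, fitness: Callable[[Env], float],
                 low: int, high: int, max_total: int) -> Env:
    """Heuristic local search: greedy hill climbing over single-parameter steps."""
    best = list(env)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for delta in (-1, 1):
                cand = list(best)
                cand[i] = min(high, max(low, cand[i] + delta))
                if satisfies_constraints(cand, max_total) and fitness(cand) > fitness(best):
                    best, improved = cand, True
    return best


def hybrid_ga(fitness: Callable[[Env], float], size: int, low: int, high: int,
              max_total: int, pop_size: int = 12, generations: int = 20,
              seed: int = 0) -> List[Env]:
    """Return a family of distinct, constraint-satisfying environment variants."""
    rng = random.Random(seed)
    population: List[Env] = []
    while len(population) < pop_size:
        cand = [rng.randint(low, high) for _ in range(size)]
        if satisfies_constraints(cand, max_total):
            population.append(local_search(cand, fitness, low, high, max_total))
    for _ in range(generations):
        offspring = [mutate(rng.choice(population), rng, low, high)
                     for _ in range(pop_size)]
        refined = [local_search(c, fitness, low, high, max_total)
                   for c in offspring if satisfies_constraints(c, max_total)]
        # Keep the best distinct variants so the family stays diverse.
        unique = {tuple(e) for e in population + refined}
        ranked = sorted(unique, key=lambda e: (fitness(list(e)), e), reverse=True)
        population = [list(e) for e in ranked[:pop_size]]
    return population
```

For example, `hybrid_ga(lambda env: -abs(sum(env) - 10), size=4, low=0, high=5, max_total=12)` returns distinct variants whose parameters sum to the made-up target difficulty of 10. Deduplicating before selection is a design choice made here so that the result reads as a family of different environments rather than copies of one optimum.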

2606.20323 2026-06-19 cs.AI New submission

Leveraging systems' non-linearity to tackle the scarcity of data in the design of Intelligent Fault Diagnosis Systems

Giancarlo Santamato, Andrea Mattia Garavagno, Massimiliano Solazzi, Antonio Frisoli

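AI summary: Proposes a periodic multi-excitation-level method that exploits a system's intrinsic non-linearity and combines it with data visualization and augmentation techniques, enabling vibration-based fault diagnosis with deep transfer learning under data scarcity; validated on a railway pantograph structure.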

Journal ref Nonlinear Dynamics, vol. 112, pp. 16153-16166, 2024

DOI: 10.1007/s11071-024-09864-6

Abstract

Deep Transfer Learning (DTL) allows for the efficient building of Intelligent Fault Diagnosis Systems (IFDS). However, DTL methods still rely heavily on large amounts of labelled data. Obtaining that much data can be challenging when dealing with faults in machines or structures. This paper proposes a novel approach to the design of vibration-based IFDS using DTL under conditions of severe data scarcity. A periodic multi-excitation level procedure leveraging intrinsic non-linearities of real-world systems is used to produce images that can be conveniently analysed by pre-trained Convolutional Neural Networks (CNNs) to diagnose faults. A new data visualization method and its augmentation technique are proposed in this paper to tackle the typical lack of data encountered during the design of IFDS. Experimental validation on a railway pantograph structure provides effective support for the proposed method.
