arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.05871 2026-06-05 cs.IT cs.AI math.IT stat.ME

Compositional Boundaries for Density Fusion

Ratan Bahadur Thapa, Ali Darijani, Jürgen Beyerer, Steffen Staab

Affiliations: University of Stuttgart, Department of Computer Science, Germany; KIT, Department of Computer Science, Germany; Fraunhofer IOSB, Fraunhofer-Gesellschaft, Germany; University of Southampton, Department of Computer Science, United Kingdom

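AI summary: Studies the order invariance of hierarchical fusion of weighted probability densities in distributed uncertainty-management systems, proves that under continuous binary rules order-invariant hierarchical fusion is equivalent to normalized weighted linear pooling, and reveals a local geometric obstruction in endpoint-to-candidate f-divergence balancing.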

Abstract

Distributed uncertainty-management systems often combine local probabilistic models along aggregation trees chosen by communication, privacy, or scheduling constraints. The final density should depend on the weighted sources, not on the particular order in which intermediate nodes combine them. We study this requirement as an algebraic compositionality problem for binary fusion of weighted probability densities. The central question is when a local fusion rule can be executed hierarchically while remaining order-invariant. We establish a compositional boundary for local segment-valued fusion rules. Within the class of continuous binary rules with additive output weights and weight-only coefficients, order-invariant hierarchical execution characterizes normalized weighted linear pooling; norm-induced segment balancing realizes the corresponding coefficient. Smooth endpoint-to-candidate $f$-divergence balancing has a different local geometry: its quadratic expansion induces square-root effective weights, showing why pairwise solvability alone is insufficient for schedule-independent fusion. We show that this obstruction is local to endpoint-to-candidate binary balancing, whereas global divergence barycenters retain additive-weight local limits. Finally, Gaussian mixtures show how the same issue appears in finite model classes: exact fusion is compositional, whereas stepwise compression is compositional only under a congruence condition on unnormalized component measures. These results distinguish exact schedule-independent fusion from global aggregation objectives and local approximation heuristics.
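
The order-invariance property of normalized weighted linear pooling can be checked on a toy case. The sketch below is illustrative only and is not taken from the paper: it assumes densities discretized on a shared three-point grid, and the names `fuse`, `s1`, `s2`, `s3` are hypothetical. Exact rational arithmetic is used so that equality between aggregation orders is not blurred by rounding.

```python
from fractions import Fraction as F


def fuse(a, b):
    """Binary normalized weighted linear pooling.

    Each argument is a (weight, density) pair; the density is a list of
    probabilities on a shared finite grid (an assumption made for this sketch).
    """
    (wa, pa), (wb, pb) = a, b
    w = wa + wb  # additive output weight
    p = [(wa * x + wb * y) / w for x, y in zip(pa, pb)]
    return w, p


# Three weighted sources on a three-point grid (made-up numbers).
s1 = (F(1), [F(1, 2), F(1, 2), F(0)])
s2 = (F(2), [F(0), F(1, 2), F(1, 2)])
s3 = (F(3), [F(1, 3), F(1, 3), F(1, 3)])

# Three different aggregation trees over the same weighted sources.
left = fuse(fuse(s1, s2), s3)
right = fuse(s1, fuse(s2, s3))
other = fuse(fuse(s3, s1), s2)

print(left == right == other)  # the schedule does not change the result
```

Carrying the accumulated weight alongside the density is what lets an intermediate node be fused again later as if it were a single source.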

2606.05870 2026-06-05 q-bio.NC cs.LG q-bio.QM

Cross-scale spatially-aware generative modeling of transcriptomic programs underlying neurodegenerative brain organization

Krishnakumar Vaithianathan

Affiliations: Department of Computer Engineering, Karaikal Polytechnic College, Karaikal, Puducherry, India

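AI summary: Proposes a cross-scale spatially-aware generative framework that uses a variational generative architecture with graph-based spatial smoothness regularization to learn latent biological programs linking regional gene expression to cortical degeneration, achieving high-accuracy prediction (explained variance 0.8604, spatial correlation r = 0.9439).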

Comments 26 pages, 5 figures

Abstract

Neurodegenerative disorders such as Alzheimer's disease exhibit highly organized patterns of regional brain vulnerability, yet the biological mechanisms underlying this spatial selectivity remain incompletely understood. Existing imaging-transcriptomic studies have largely relied on correlation-based analyses between gene expression and neuroimaging phenotypes, limiting their ability to model how molecular organization gives rise to neurodegeneration. Here, we introduce a cross-scale spatially-aware generative framework for modeling transcriptomic programs underlying cortical neurodegeneration. Regional transcriptomic profiles were derived from the Allen Human Brain Atlas using 910 landmark genes across 68 cortical regions. Neurodegenerative vulnerability maps were constructed from ADNI FreeSurfer cortical thickness measurements by computing regional cortical thinning differences between cognitively normal controls (NC = 926) and Alzheimer's disease subjects (AD = 426). A variational generative architecture was used to learn latent biological programs linking regional gene-expression organization to cortical degeneration while incorporating graph-based spatial smoothness regularization to preserve cortical organization. The proposed framework achieved strong prediction of regional neurodegenerative vulnerability, yielding an explained variance of 0.8604 and a significant spatial correlation between predicted and observed cortical degeneration profiles (r = 0.9439, p < 0.001). The learned latent representations revealed structured transcriptomic organization associated with distributed disease susceptibility. These findings demonstrate that biologically constrained generative modeling can bridge microscale molecular organization with macroscale neurodegeneration, providing a foundation for spatially-aware generative neurobiology and computational neuroscience.

2606.05855 2026-06-05 cs.HC cs.AI

EEGDancer: Dynamic Emotion Latent Space Masked Modeling with Reinforcement Learning for EEG Continuous Emotion Prediction

Zhihao Zhou, Weishan Ye, Li Zhang, Gan Huang, Zhen Liang

Affiliations: National University of Singapore; Agency for Science, Technology and Research

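AI summary: Proposes the EEGDancer framework, which combines vector-quantized representation learning, masked temporal modeling, and reinforcement-learning trajectory optimization to address long-range dependencies and noise in continuous EEG emotion prediction.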

Comments 51 pages, 9 figures, 13 tables

Abstract

Continuous electroencephalography (EEG) emotion prediction aims to model the temporal evolution of human emotional states from EEG signals. Unlike conventional discrete emotion recognition, continuous prediction requires capturing long-range temporal dependencies and coherent emotional dynamics. However, existing methods mainly rely on point-wise regression and directly model noisy high-dimensional EEG features, limiting their ability to characterize continuous emotional evolution. To address these challenges, we propose EEGDancer, a dynamic emotional latent space learning framework for continuous EEG emotion prediction. The framework integrates vector-quantized representation learning, masked temporal modeling, and reinforcement learning-based trajectory optimization into a unified architecture. Specifically, a causal spatiotemporal Vector-Quantization Variational Autoencoder (VQ-VAE) is designed to learn structured emotional prototypes and construct a discrete-continuous emotional latent space from EEG signals. Based on the learned latent representations, a Transformer-based masked dynamic modeling strategy captures long-range emotional dependencies and temporal evolution patterns. Furthermore, continuous emotion prediction is formulated as a sequential decision-making problem, and a Soft Actor-Critic (SAC) framework is introduced to optimize emotional prediction trajectories at the sequence level instead of frame-wise local fitting. Extensive experiments on the SEED, SEED-IV, and Long-Term Naturalistic Emotion datasets demonstrate that EEGDancer consistently outperforms existing machine learning and deep learning methods. Ablation studies further verify the effectiveness of the proposed latent space and reinforcement learning-based trajectory optimization for modeling continuous EEG emotional dynamics.

2606.05849 2026-06-05 physics.optics cs.CV

Inverse Design of Realizable Metasurface based Absorbers using Improved Conditioning and Diversity Enhanced Progressively Growing GANs

Vineetha Joy, Mohammad Abdullah, Pramit Pal, Anshuman Kumar, Amit Sethi, Hema Singh

Affiliations: Centre for Electromagnetics, CSIR-National Aerospace Laboratories; Birla Institute of Technology and Science, Pilani, Rajasthan; Indian Institute of Technology, Bombay, Maharashtra

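AI summary: Proposes a generative inverse design framework based on a progressively growing WGAN-GP with feature-wise linear modulation conditioning, combined with a surrogate-assisted spectral alignment loss and determinantal point process diversity regularization, to produce physically consistent and diverse metasurface absorber designs under continuous spectral constraints.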

Abstract

Metasurfaces enable precise manipulation of electromagnetic (EM) waves for applications such as beam steering, sensing, and stealth technology. However, inverse design of metasurfaces with targeted EM responses remains challenging due to the computational expense of iterative optimization driven by full-wave simulation and the limited conditioning fidelity and diversity of existing generative approaches. To address these challenges, this paper presents a generative inverse design framework for controllable and physically consistent metasurface synthesis under continuous spectral constraints. The proposed approach employs a progressively growing Wasserstein generative adversarial network with gradient penalty, integrated with conditioning based on feature-wise linear modulation, for stable propagation of continuous spectral and fabrication constraints. EM consistency is embedded directly into the generative learning process through a surrogate-assisted spectral alignment loss, enabling physics-constrained generation during training. Further, a diversity regularization strategy based on determinantal point processes is incorporated to generate geometrically diverse yet spectrally consistent realizations for the same target response. The effectiveness of the proposed framework is demonstrated through the generation of practically realizable metasurface absorbers exhibiting diverse reflection characteristics in the frequency range of 2 to 18 GHz. EM simulations validate that the generated designs meet the target specifications with high accuracy. The final framework achieved an average mean squared error of 0.0052, a diversity score of 0.8730, a band alignment accuracy of 0.8533, and a valid EM design generation percentage of 89.57, demonstrating its capability to generate accurate, diverse, electromagnetically consistent, and fabrication-realizable metasurface configurations.
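
Feature-wise linear modulation scales and shifts each feature channel using values computed from a conditioning vector. The sketch below shows that generic operation in plain Python; it is not the paper's network. The affine maps, the two-channel feature map, and the names `linear`, `film`, `gamma_params`, `beta_params` are all hypothetical, and the conditioning vector `c` stands in for whatever spectral or fabrication targets a real model would encode.

```python
def linear(weights, bias, c):
    """Affine map from a conditioning vector c to one value per channel."""
    return [sum(w * x for w, x in zip(row, c)) + b for row, b in zip(weights, bias)]


def film(features, c, gamma_params, beta_params):
    """Apply gamma(c) * h + beta(c) independently to every channel of h."""
    gamma = linear(*gamma_params, c)
    beta = linear(*beta_params, c)
    return [[g * v + b for v in channel] for channel, g, b in zip(features, gamma, beta)]


# Two channels with two spatial positions each (made-up values).
features = [[1.0, 2.0], [3.0, 4.0]]
c = [0.5, 1.0]  # hypothetical conditioning vector

gamma_params = ([[2.0, 0.0], [0.0, 3.0]], [0.0, 0.0])  # gives gamma = [1.0, 3.0]
beta_params = ([[0.0, 0.0], [1.0, 0.0]], [0.5, 0.0])   # gives beta = [0.5, 0.5]

out = film(features, c, gamma_params, beta_params)
print(out)
```

Because the conditioning enters only through per-channel scale and shift, the same modulation can be applied at every resolution of a progressively grown generator without changing the feature-map shape.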

2606.05844 2026-06-05 cs.CR cs.AI

GenTI: Benchmarking LLMs for Autonomous IDPS Rule Generation for Unseen Attacks

Hassan Jalil Hadi, Rehana Yasmin, Ali Shoker

Affiliations: Cyber Security and Resilience Technology (CyberSaR), King Abdullah University of Science and Technology (KAUST)

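AI summary: Proposes the GenTI framework, which builds the GTI dataset of 150k detection and prevention rules and designs an LLM-based pipeline (structured prompt engineering, chain-of-thought reasoning, and a verification loop) to automatically generate IDPS rules for unseen attacks, raising the unseen-attack detection rate from 45% to 87.4% and lowering the false-positive rate from 8.5% to 2.3%.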

Abstract

Rule-based Intrusion Detection and Prevention Systems (IDPS) offer precise attack detection and mitigation; however, their manually crafted, signature-driven rules limit adaptability to emerging and zero-day threats. Additionally, existing public datasets (e.g., CICIDS2017, UNSW-NB15) focus on traffic classification and provide little structured information to support automatic rule synthesis or prevention logic. To address this gap, we propose Generative Threat Intelligence (GenTI), an LLM-driven benchmark for automatic generation of IDPS rules targeting unseen attacks (GenTI refers to the proposed framework, and GTI refers to the dataset). The dataset (GTI) aggregates over 150k detection and prevention rules from Snort, Suricata, and Emerging Threats, as well as 50k YARA rules, each annotated with protocol behavior, payload signatures, contextual relationships, mappings to Cyber Threat Intelligence (CTI), and actionable response types (alert, drop, reject). On top of this corpus, we design an LLM-based pipeline that transforms analyst prompts and representative payloads into deployable rules via structured prompt engineering, Chain-of-Thought (CoT) reasoning, and a Chain-of-Verification (CoVe) loop for syntactic, semantic, and security validation. The generated rules are executed in real time on Snort/Suricata and evaluated by syntax accuracy, semantic similarity, CTI coverage, security effectiveness, and unseen-attack detection. Our GenTI instantiation achieves a composite rule-quality score of 89.4%, with 94.8% CTI coverage, improving unseen-attack detection from 45% to 87.4% and reducing the false-positive rate from 8.5% to 2.3%. Overall, GenTI establishes the first large-scale benchmark that tightly couples rule-level CTI with LLM-based automation, enabling adaptive, self-evolving IDPS.

2606.05840 2026-06-05 eess.SY cs.RO cs.SY

Amortized Nonlinear Model Predictive Control

Francesco Pillitteri, Alberto Bemporad

Affiliations: IMT School for Advanced Studies

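AI summary: For input-affine nonlinear systems, proposes a single-network residual-corrector architecture built on a state-dependent quadratic program, with a differentiable interior-point layer that guarantees constraint satisfaction, enabling real-time nonlinear model predictive control and achieving orders-of-magnitude speedup on a robotic-arm tracking task.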

Comments 6 pages

Abstract

Nonlinear Model Predictive Control requires solving a constrained nonlinear program (NLP) in real time at every sampling instant, a computational bottleneck that limits deployment on resource-constrained hardware or at high sampling rates. We address this challenge for the broad class of input-affine nonlinear systems, showing that the optimal control move can be approximated by a state-dependent quadratic program (QP) whose cost parameters depend on the current state and reference. We propose a single-network residual-corrector architecture: a state-dependent analytic baseline provides initial QP parameters, and the network learns only the corrections needed to match the full NLP solution; the QP is solved by a differentiable interior-point layer, guaranteeing constraint satisfaction for the first control action. The network is trained offline on data generated by an NLP solver using a hybrid loss that combines supervised imitation and KKT-residual penalties. We validate the approach on a three-link planar robotic arm with Cartesian end-effector tracking, demonstrating orders-of-magnitude speedup over the NLP solver while maintaining comparable tracking performance.

2606.05818 2026-06-05 math.HO cs.AI math.AG math.CO math.RT

Benchmarks in Leipzig

Andrei Balakin, Miklós Bóna, Marie-Charlotte Brandenburg, Clara Briand, Veronica Calvo Cortes, Shelby Cox, Jesus A. De Loera, Danai Deligeorgaki, Hannah Friedman, Tim Gehrunger, Chiara Giardino, Stephen Griffeth, Baran Hashemi, Elena Hoster, Alexander Ivanov, Nupur Jain, Aryaman Jal, Leonie Kayser, Joris Koefler, Kevin Kühn, Mario Kummer, Felix Lotter, René Marczinzik, Victor S. Miller, Alejandro Morales, Greta Panova, Gianni Petrella, Nathan Pflueger, Lakshmi Ramesh, Nikolas Rieke, Carlos Rodriguez, Andrea Rosana, Flavio Salizzoni, Otto T. P. Schmidt, Sven Ulf Schmitz, Lina Maria Simbaqueba Marin, Luca Sodomaco, Christian Stump, Bernd Sturmfels, Alexander Taveira Blomenhofer, Simon Telen, Philipp Tuchel, Emil Verkama, Carl Felix Waller, Julian Weigert, Annette Werner, Nathan Williams, Claudius Zibrowius

Affiliations: University of California, Berkeley

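AI summary: 49 mathematicians compiled a dataset of 100 research-level mathematics questions between April and May 2026 and evaluated the mathematical reasoning of large language models in multiple stages; in the end only 2 questions remained unsolved.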

Comments 8 pages including 8 benchmark statistics tables + 20 pages appendix containing the 100 Leipzig Benchmark questions

Abstract

Between April 1 and May 15, 2026, a group of 49 mathematicians compiled a dataset of research-level mathematics questions with known answers. Most of the work was done during the 3-day workshop *Benchmarks in Leipzig* with 35 participants at the Max Planck Institute for Mathematics in the Sciences in Leipzig, Germany. We present the resulting collection of 100 questions. We evaluated these questions in three stages: a single attempt by five state-of-the-art LLMs, followed by a 20-runs-per-model evaluation with three of these models, and finally a 3-run attempt with two heavy-thinking models. After Stage 1, 41 questions remained completely unsolved; after Stage 2, this count dropped to 16; and we concluded Stage 3 with only 2 unsolved questions. This demonstrates that the mathematical reasoning capabilities of LLMs are becoming impressive.

2606.05779 2026-06-05 cs.CR cs.AI stat.ML

TinyML-Driven Cybersecurity for Autonomous Spacecraft: Latency-Accuracy Analysis for SPARTA RF and Cyber Threat Detection

Van Le, Trevor Tran, Tan Le

Affiliations: Virginia Tech; Hampton University

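AI summary: For autonomous spacecraft, analyzes the latency-accuracy trade-offs of TinyML-compatible classical models (Random Forest, Logistic Regression, SVM, MLP) in detecting several cyber-RF threats under the SPARTA attack model, finding that Logistic Regression reaches microsecond-level inference with only 1% lower accuracy than Random Forest, which makes it a suitable baseline for onboard autonomy.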

Comments Twenty Fifth International Conference on Security & Management (SAM'26)

Abstract

Autonomous spacecraft require rapid, lightweight, and reliable onboard detection of cyber-RF threats. Using the SPARTA attack model, we analyze the latency-accuracy trade-offs of TinyML-compatible classical models (Random Forest, Logistic Regression, SVM, and MLP) for detecting uplink jamming, Fake-NR spoofing, payload manipulation, ground-segment compromise, and unauthorized command injection. We present a physics-informed theoretical analysis of each model's computational complexity, VC dimension, Lipschitz continuity, and latency scaling, supported by empirical measurements on adversarial RF spectrograms generated via BandErasure, FakeNR, and NoiseBurst corruption modes. Results show that Logistic Regression achieves microsecond-level inference with only a 1% accuracy drop relative to Random Forest, making it an effective TinyML baseline for onboard autonomy. The study also identifies opportunities for advancing spacecraft cybersecurity through richer feature encoders and multi-timescale learning architectures, building on recent progress in edge intelligence and trustworthy AI.

2606.05776 2026-06-05 cs.CR cs.AI cs.LG

An Improved CNN-LSTM Based Intrusion Detection System for IoT Networks

Mohammad Tariq Ikhlas, Pohanyar Khowaja Khil, Malik Muhammad Mueed Aslam, Muhammad Khuram Shahzad

Affiliations: University of Engineering and Technology, Lahore

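AI summary: Proposes an improved CNN-LSTM intrusion detection model that combines multi-class classification, dataset integration, and temporal feature learning, reaching approximately 97% accuracy on IoT networks.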

Comments 8 pages, 8 figures

Abstract

With the rapid proliferation of IoT devices, security concerns have dramatically escalated and intrusion detection systems have become critical for protecting networked environments. This paper presents an improved CNN-LSTM based intrusion detection model that combines multi-class classification, dataset integration, and temporal feature learning to enhance detection performance in IoT networks. Using network traffic data, the proposed approach is evaluated on intrusion detection tasks and achieves an accuracy of approximately 97%. Experimental results demonstrate that the model effectively detects multiple attack categories while maintaining stable training and validation performance. The integration of convolutional and recurrent neural network components enables the framework to capture both spatial and temporal characteristics of network traffic, improving overall intrusion detection capability in IoT environments.

2606.05770 2026-06-05 cs.SE cs.AI

Human Oversight and Overload: Two Hidden and Costly Burdens of AI-Assisted Software Engineering

Vahid Garousi

Affiliations: Queen's University Belfast; Azerbaijan Technical University

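AI summary: Drawing on practitioner opinions, this paper identifies two hidden burdens of AI-assisted software engineering, continuous human oversight of AI-generated artifacts and cognitive overload, and discusses how teams can cope with them.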

Abstract

AI is changing how software engineers work, but it often comes with hidden burdens and costs. In this paper, we characterize two such often-overlooked burdens: (1) the constant need for human oversight and inspection of AI-generated artifacts; and (2) the growing cognitive overload on software engineers from receiving large amounts of suggestions from AI tools. The need for human oversight is not optional: engineers must review, validate, and sometimes rework what AI produces. At the same time, the flood of AI suggestions, prompts, and possible solutions can leave developers mentally stretched. Drawing on evidence from recent practitioner opinions, we highlight these challenges and open a conversation about how teams can handle them in day-to-day AI-assisted software engineering.

2606.05748 2026-06-05 cs.MM cs.AI cs.CL

UNIVID: Unified Vision-Language Model for Video Moderation

Kejuan Yang, Yizhuo Zhang, Mingyuan Du, Yue Zhang, Dixin Zheng, Kaili Zhao, Yang Xiao, Hanzhong Liang, Kenan Xiao

Affiliations: Bytedance

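AI summary: Proposes UNIVID, a unified vision-language model that generates interpretable policy-aware captions to enable end-to-end video moderation, reducing violation leakage by 42.7% and overkill rate by 37.0%.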

Comments 7 pages, 3 figures. Accepted to ACL 2026 Industry Track

Abstract

Global-scale video moderation faces a dual challenge: the need for fine-grained multi-modal reasoning and the demand for interpretable outputs to support downstream enforcement. Traditional moderation systems often rely on fragmented black-box classifiers that are difficult to maintain and lack transparency. In this paper, we present UNIVID, a UNIfied VIsion-language model for video moDeration. Unlike standard classification models, UNIVID generates policy-aware captions that serve as an interpretable intermediate representation, enabling human-verifiable decisions and multi-task reusability. Because existing open-source and commercial VLMs often suffer from safety-guardrail refusals and lack fine-grained policy alignment, we develop a specialized training data recipe that combines expert human-refined labels with synthetic data to align the model with our safety guidelines. By integrating UNIVID as the core captioner, we design a novel end-to-end video moderation system that achieves relative reductions of 42.7% in violation leakage and 37.0% in overkill rate. Meanwhile, by replacing over 1,000 policy-specific models with a single UNIVID backbone, we reclaimed substantial computation resources while reducing engineering maintenance overhead. To our knowledge, this is one of the first reports of a high-efficiency captioning VLM successfully supporting industrial-scale moderation and cross-functional business.

2606.05743 2026-06-05 cs.CR cs.CL

Membrane: A Self-Evolving Contrastive Safety Memory for LLM Agent Defense

Minseok Choi, Seungbin Yang, Dongjin Kim, Subin Kim, Jungmin Son, Yunseung Lee, Jaegul Choo, Youngjun Kwak

Affiliations: KAIST AI; Financial Tech Lab, KakaoBank Corp

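AI summary: Proposes Membrane, a self-evolving guardrail based on Contrastive Safety Memory (CSM) that defends against evolving jailbreak attacks by distilling harmful interactions and their benign counterparts into contrastive cells, achieving high F1 and low benign refusal rates without retraining.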

Abstract

Despite advances in safety alignment, large language models remain vulnerable to continuously evolving jailbreaks. Existing fine-tuned safety classifiers cannot adapt to these evolving attacks, while adaptive memory-based guardrails tend to over-refuse benign queries that resemble stored attacks. We propose Membrane, a self-evolving guardrail built on Contrastive Safety Memory (CSM): each cell pairs the conditions for blocking a harmful query with those for permitting a superficially similar benign request. Without retraining, Membrane evolves CSM by distilling each harmful interaction and its benign counterpart into a contrastive cell indexed by the underlying attack strategy, so that one cell generalizes across topical variants of the same mechanism. At inference, retrieved cells serve as grounding context for precise safety decisions. Across model-level safety on HarmBench and agent-level safety on AgentHarm, Membrane achieves the highest F1 on all six jailbreak attacks. Notably, benign refusal on AgentHarm stays at 7-14%, well below the 28-85% range of prior guards. Memory cells also retain 87-88% F1 under cross-attack transfer and remain stable under memory poisoning.

2606.05729 2026-06-05 cs.IT cs.LG math.IT

Automated Proving of Shannon-Type Entropy Inequalities via Fine-Tuned Language Models and Guided Tree Search

Shing Yin Wong, Shaocheng Liu, Linqi Song, Amin Gohari, Cheuk Ting Li

Affiliations: University of California, Berkeley

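AI summary: Automates the proving of Shannon-type entropy inequalities by fine-tuning small language models and combining them with guided beam search, reaching an 85% proof success rate on a test set with 10 to 15 variables.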

Abstract

Proving Shannon-type entropy inequalities is a fundamental task in information theory that often requires constructing non-trivial linear combinations of known constraints, a combinatorial search problem that scales poorly with the number of random variables. We investigate whether small-scale large language models (0.6B to 1.7B parameters), fine-tuned on atomic proof steps and combined with guided beam search, can automate this process. On a held-out test set of 60 inequalities spanning n=10 to 15 variables, our 0.6B fine-tuned model achieves an 85% proof success rate with tree search. GPT-5.5 solves 1.7% of the samples under zero-shot prompting, while Psitip solves 33.3%. A systematic ablation study across training context length (4096 vs. 8192 tokens) and data distribution (n=9-skewed vs. non-skewed) reveals that a 4096-token non-skewed training distribution yields the best performance, with extended context and skewed data providing no marginal benefit. We further identify two dominant failure modes, format failures and step quality degradation, and verify that the beam-scoring heuristic is essential via a controlled ablation (random scoring reduces success from 83% to 23%).
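
To make "linear combinations of known constraints" concrete, the sketch below checks a proof certificate for one small Shannon-type inequality, H(X) + H(Y) + H(Z) >= H(X,Y,Z). It is a hand-made illustration and not the paper's pipeline or proof-step format: entropy expressions are stored as coefficient maps over variable sets, and the helper names `combine` and `cond_mutual_info` are hypothetical. The certificate writes the slack as a sum of three (conditional) mutual informations, each of which is nonnegative.

```python
from collections import defaultdict


def combine(*terms):
    """Sum (coefficient, variables) pairs into a linear expression in joint entropies."""
    expr = defaultdict(int)
    for coeff, variables in terms:
        if variables:  # H of the empty set is 0, so it contributes nothing
            expr[frozenset(variables)] += coeff
    return {k: v for k, v in expr.items() if v != 0}


def cond_mutual_info(a, b, given=""):
    """Terms of I(a; b | given) = H(a,given) + H(b,given) - H(a,b,given) - H(given)."""
    return [(1, a + given), (1, b + given), (-1, a + b + given), (-1, given)]


# Slack of the target inequality: H(X) + H(Y) + H(Z) - H(X,Y,Z).
target = combine((1, "X"), (1, "Y"), (1, "Z"), (-1, "XYZ"))

# Certificate: I(X;Y) + I(X;Z) + I(Y;Z|X), a sum of nonnegative quantities.
certificate = combine(
    *cond_mutual_info("X", "Y"),
    *cond_mutual_info("X", "Z"),
    *cond_mutual_info("Y", "Z", given="X"),
)

print(target == certificate)  # the slack equals a nonnegative combination
```

Checking a certificate like this is cheap; the hard part that the abstract describes is finding which terms to combine as the number of variables grows.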

2606.05725 2026-06-05 cs.CR cs.CL

An Embarrassingly Simple Detector for Model Extraction Attacks in Large Language Model API Traffic

Shuze Liu, Qianwen Guo, Yushun Dong

Affiliations: Santa Clara University; Florida State University

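AI summary: Proposes a simple detection method based on maximum mean discrepancy (MMD) that embeds queries into a semantic space and compares their distribution with historical benign traffic to detect model extraction attacks on LLM APIs.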

Comments Preprint. Code available at https://github.com/LabRAI/mmd-llm-mea-detection

Abstract

Large language models (LLMs) are increasingly deployed through hosted APIs, making model extraction a practical threat to model ownership and service security. However, individual extraction queries often resemble benign requests, and existing evaluations often focus on single-query anomaly scoring or pure benign-versus-attacker user settings. We formulate model extraction monitoring as benign-calibrated traffic-window distribution testing and show that an embarrassingly simple detector is effective: embed incoming queries into a semantic space and test whether their aggregate distribution deviates from historical benign traffic. We instantiate the detector with maximum mean discrepancy (MMD), using only benign-vs-benign comparisons to set the decision threshold. We evaluate on fourteen attacker-normal query pairs from four extraction scenarios and compare with adapted PRADA, SEAT, CAP, DATE, and marginal Mahalanobis baselines. Across three random seeds, MMD achieves 0.3% benign FPR, 100.0% pure-attacker TPR, 90.5% average TPR over attacker fractions, and 95.1% balanced accuracy. These results show that benign-calibrated distribution testing is a strong empirical baseline for model extraction detection in both user-level and mixed multi-user LLM API traffic. Code is released at: https://github.com/LabRAI/mmd-llm-mea-detection.
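
The detection recipe described above (embed a window of queries, compare its distribution with benign traffic using MMD, set the threshold from benign-vs-benign comparisons) can be sketched in a few lines. This is an illustration under stated assumptions and not the released code: synthetic 2-D Gaussian points stand in for query embeddings, and the RBF kernel, bandwidth, window size, trial count, and quantile are arbitrary choices made for the example.

```python
import math
import random


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two embedding vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * bandwidth ** 2))


def mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples."""
    def mean_k(us, vs):
        return sum(rbf(u, v, bandwidth) for u in us for v in vs) / (len(us) * len(vs))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2 * mean_k(xs, ys)


def calibrate_threshold(benign, window, trials, quantile, rng):
    """Pick the threshold from benign-vs-benign window comparisons only."""
    scores = []
    for _ in range(trials):
        sample = rng.sample(benign, 2 * window)
        scores.append(mmd2(sample[:window], sample[window:]))
    scores.sort()
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]


rng = random.Random(0)
# Stand-ins for embeddings: benign traffic around the origin, attacker traffic shifted.
benign = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
attack_window = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(20)]

threshold = calibrate_threshold(benign, window=20, trials=50, quantile=0.99, rng=rng)
reference = rng.sample(benign, 20)
attack_score = mmd2(reference, attack_window)
print(attack_score > threshold)  # flag the window when it exceeds the benign-calibrated threshold
```

Calibrating on benign data alone means no attacker samples are needed to deploy the detector; the quantile directly controls the false-positive rate on benign windows.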

2606.05720 2026-06-05 cs.SE cs.AI

Microskill Architecture: A Modular Skill-Driven Framework for AI-Native Code Generation

微技能架构:一种面向AI原生代码生成的模块化技能驱动框架

Mohammad Zare, Omid Abdolrahmani

发表机构 * Artificial Intelligence Laboratory at AriooBarzan(AriooBarzan人工智能实验室) Engineering Team, Shiraz, Iran(伊朗设拉子工程团队)

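AI summary: This paper proposes the MicroSkill Architecture, which encapsulates knowledge in atomic skill capsules and dynamically selects the relevant ones to address context-window management in AI code generation, substantially reducing token consumption, raising compilation success rates, and eliminating architectural violations.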

详情
AI中文摘要

大型语言模型和AI编码代理已经重塑了软件开发,但完全AI原生系统的路径面临结构性挑战。其中最主要的是在保持准确性和效率的同时管理上下文窗口。当开发者将完整的项目文档和代码注入模型内存时,模型会丢失序列中间的信息,token成本激增,架构发生漂移。本文提出微技能架构:一种受微服务启发的模块化设计范式,应用于知识封装而非服务分解。该架构不是将整个代码库提供给代理,而是将知识划分为原子化、范围明确的技能胶囊,并由动态路由器仅选择语义相关的胶囊来执行任务。我们将上下文分配形式化为在token预算约束下基于语义相关性的约束优化。一个针对具有十五个复杂特性的企业内容管理系统的实证案例研究表明,微技能将token消耗降低了90%以上,首次尝试编译成功率几乎翻倍,完全消除了架构违规,并通过自学习机制实现了七个新技能胶囊的自主提取和注册。这些发现表明,微技能架构为构建更高效、更可靠且能够随时间演进的AI原生开发系统提供了可扩展的基础。

英文摘要

Large language models and AI coding agents have reshaped software development, but the path to fully AI-native systems faces structural challenges. Chief among them is managing context windows without losing accuracy or efficiency. When developers inject full project documentation and code into a model's memory, the model loses mid-sequence information, token costs spiral, and architecture drifts. This paper presents MicroSkill Architecture: a modular design paradigm inspired by microservices, applied to knowledge encapsulation instead of service decomposition. Instead of feeding an agent the entire codebase, the architecture partitions knowledge into atomic, sharply scoped skill capsules, and a dynamic router selects only semantically relevant capsules for the task. We formally model context allocation as constrained optimization over semantic relevance subject to a token budget. An empirical case study on an enterprise content management system with fifteen complex features shows that MicroSkill cuts token consumption by over 90%, nearly doubles first-try compilation success rates, eliminates architectural violations entirely, and enables autonomous extraction and registration of seven new skill capsules via a self-learning mechanism. These findings suggest MicroSkill Architecture offers a scalable foundation for building AI-native development systems that are more efficient, more reliable, and capable of evolving over time.
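The context-allocation idea in this abstract (choose capsules by semantic relevance subject to a token budget) resembles a knapsack problem. The sketch below uses a greedy relevance-per-token heuristic as one possible stand-in. The abstract does not say how the router solves the optimization, so the `Capsule` type, the `select_capsules` function, and the scoring scheme are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capsule:
    name: str
    relevance: float  # hypothetical semantic relevance score for the current task
    tokens: int       # token cost of injecting this capsule (assumed > 0)

def select_capsules(capsules, budget):
    """Greedy knapsack heuristic: take capsules by relevance per token until the budget is hit."""
    chosen, used = [], 0
    ranked = sorted(capsules, key=lambda c: c.relevance / c.tokens, reverse=True)
    for c in ranked:
        # Skip irrelevant capsules and any capsule that would exceed the budget.
        if c.relevance > 0 and used + c.tokens <= budget:
            chosen.append(c)
            used += c.tokens
    return chosen, used
```

A greedy ratio rule is not guaranteed to be optimal for a knapsack-style objective; it is used here only because it is short and easy to follow.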

2606.05714 2026-06-05 cs.CR cs.LG

Hybrid CNN-LSTM Framework for Intelligent Cyber Attack Detection and Prevention in U.S. Critical Digital Infrastructure: A Comparative Machine Learning Evaluation on CSE-CIC-IDS2018

混合CNN-LSTM框架用于美国关键数字基础设施的智能网络攻击检测与防御:基于CSE-CIC-IDS2018的机器学习比较评估

Md. Iqbal Hossan, Md. Serajul Kabir Chowdhury Rubel, Md. Arifur Rahman, B. M. Taslimul Haque

发表机构 * Department of Computer Science, Maharishi International University(马哈拉吉国际大学计算机科学系) Department of Information Studies, Trine University(特林大学信息学系) Department of Business Information Systems, Central Michigan University(中央密歇根大学商业信息系统系)

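AI summary: Proposes a hybrid deep learning framework combining CNN and LSTM for cyber attack detection and prevention on the CSE-CIC-IDS2018 dataset, comparing multiple machine learning models to achieve high-accuracy intrusion detection and automated defense.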

Comments 25 pages, 9 figures, CSE CIC IDS2018 dataset, Hybrid CNN LSTM, cyber attack detection

详情
Journal ref
Journal of Ai ML DL, 1(1), 2025
AI中文摘要

美国数字基础设施正在快速增长,因此,关键领域(包括医疗、金融、交通、能源和政府系统)面临的先进网络威胁也在增加。传统的网络安全方法,包括基于签名的入侵检测系统,已无法有效应对当今的网络攻击,因为它们无法实时检测未知和变化的攻击。为了克服这些限制,本研究提出了一种智能网络防御系统,利用人工智能(AI)和机器学习(ML)算法来检测和预防美国数字基础设施中的网络攻击。本研究使用CSE-CIC-IDS2018数据集,这是一个真实的网络流量数据集,包含各种网络攻击场景,包括分布式拒绝服务(DDoS)、暴力攻击、僵尸网络、渗透攻击和基于Web的攻击。实施并评估了多种机器学习和深度学习模型,如随机森林、XGBoost、卷积神经网络(CNN)和长短期记忆(LSTM)网络,用于识别恶意网络行为并提高入侵检测的准确性。所提出的框架结合了数据预处理、特征工程、实时流量监控、智能威胁分类和自动防御机制,以增强网络安全弹性。

英文摘要

Digital infrastructure is growing rapidly in the United States, and as a result, the exposure of critical sectors, including healthcare, finance, transportation, energy, and government systems, to advanced cyber threats is growing. Traditional cybersecurity approaches, including signature-based intrusion detection systems, have become less effective against today's cyber attacks because they cannot detect unknown and changing attacks in real time. To overcome these constraints, this research proposes a smart cyber-defense system that uses Artificial Intelligence (AI) and Machine Learning (ML) algorithms to detect and prevent cyber attacks on U.S. digital infrastructure. This study uses the CSE-CIC-IDS2018 dataset, a realistic network traffic dataset covering various cyber attack scenarios, including Distributed Denial of Service (DDoS), brute force attacks, botnets, infiltration attacks, and web-based attacks. Several machine learning and deep learning models, such as Random Forest, XGBoost, Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks, are implemented and evaluated for identifying malicious network behavior and improving intrusion detection accuracy. The proposed framework combines data preprocessing, feature engineering, real-time traffic monitoring, intelligent threat classification, and automated prevention mechanisms to build cybersecurity resilience. E

2606.05713 2026-06-05 cs.MM cs.SD eess.AS

Beyond Generative Decoding: Discriminative Hidden-State Readout from a Native Omni-Modal LLM for Multimodal Sentiment Analysis

超越生成式解码:来自原生全模态大语言模型的判别性隐藏状态读出用于多模态情感分析

Bin Wen, Tien-Ping Tan

发表机构 * School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia(马来西亚理科大学计算机科学学院,槟城)

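AI summary: To address the accuracy and efficiency losses that arise when generative readout ties continuous regression to discrete autoregressive decoding in multimodal sentiment analysis, this work proposes a discriminative readout built on the Thinker module of the native omni-modal LLM Qwen2.5-Omni-7B, mapping final-layer hidden states directly to a score through a lightweight regression head and reaching state-of-the-art performance on a single consumer GPU.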

Comments 18 pages, 4 figures, 6 tables

详情
AI中文摘要

多模态情感分析(MSA)从语言、声学和视觉信号推断人类情感。最近的方法越来越多地通过生成式读出适应大型多模态模型(LMM):提示模型将情感分数作为文本字符串输出。虽然方便,但这将连续回归与离散自回归解码绑定,带来了未测量的成本。我们重新审视这种读出机制,并提出一种基于原生全模态大语言模型(Qwen2.5-Omni-7B)的Thinker模块构建的判别性公式。我们不是进行文本解码,而是通过轻量回归头在单次前向传播中将最后一个非填充标记的最终层隐藏状态映射到连续分数。使用4位量化和低秩适应(QLoRA),整个7B管道——包括视频和音频处理——在单个消费级GPU(RTX 5090,32 GB)上训练,峰值内存10-21 GB,可训练参数仅1.14%。通过固定骨干网络、数据和LoRA配置的受控比较,我们隔离了读出的影响。在CMU-MOSI和CMU-MOSEI上,我们的判别性读出无需任务特定特征工程即可达到最先进的准确率(MOSI:MAE 0.551,Corr 0.888;MOSEI:MAE 0.506,Corr 0.790),并表现出强大的多种子稳定性。相比之下,生成式读出——即使经过等效的监督训练——平均绝对误差增加了一倍以上,产生无法解析或超出范围的输出(零样本下2.8%),并且延迟更高。模态消融实验揭示了CMU-MOSI上的文本主导模式。我们的发现表明,LMM的读出方式与其训练方式同样重要,证明判别性读出为连续MSA提供了更准确、高效和可靠的替代方案。

英文摘要

Multimodal sentiment analysis (MSA) infers human affect from language, acoustic, and visual signals. Recent methods increasingly adapt large multimodal models (LMMs) via generative readout: prompting the model to emit a sentiment score as a text string. While convenient, this ties continuous regression to discrete autoregressive decoding, incurring unmeasured costs. We revisit this readout mechanism and propose a discriminative formulation built on the Thinker module of a native omni-modal LLM (Qwen2.5-Omni-7B). Instead of text decoding, we map the final-layer hidden state of the last non-padding token to a continuous score via a lightweight regression head in a single forward pass. Using 4-bit quantization and low-rank adaptation (QLoRA), the entire 7B pipeline -- including video and audio processing -- trains on a single consumer GPU (RTX 5090, 32 GB) with 10-21 GB peak memory and 1.14% trainable parameters. Through a controlled comparison fixing the backbone, data, and LoRA configuration, we isolate the impact of the readout. On CMU-MOSI and CMU-MOSEI, our discriminative readout reaches state-of-the-art accuracy without task-specific feature engineering (MOSI: MAE 0.551, Corr 0.888; MOSEI: MAE 0.506, Corr 0.790) and exhibits strong multi-seed stability. In contrast, the generative readout -- even after equivalent supervised training -- more than doubles the mean absolute error, yields unparsable or out-of-range outputs (2.8% zero-shot), and suffers from higher latency. Modality ablations reveal a text-dominant regime on CMU-MOSI. Our findings indicate that how an LMM is read out is as consequential as how it is trained, demonstrating that a discriminative readout offers a more accurate, efficient, and reliable alternative for continuous MSA.
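The readout described in this abstract (take the final-layer hidden state of the last non-padding token and map it to a continuous score with a lightweight head) can be illustrated without any deep learning framework. The sketch below assumes a single linear head and plain Python lists. The function names are hypothetical, and the real system operates on Qwen2.5-Omni-7B hidden states rather than toy vectors.

```python
def last_non_padding_index(attention_mask):
    """Return the position of the last token whose mask value is 1."""
    for i in range(len(attention_mask) - 1, -1, -1):
        if attention_mask[i] == 1:
            return i
    raise ValueError("sequence contains only padding")

def regression_readout(hidden_states, attention_mask, weights, bias=0.0):
    """Map the last non-padding token's hidden state to a scalar score (assumed linear head)."""
    idx = last_non_padding_index(attention_mask)
    h = hidden_states[idx]
    return sum(w * x for w, x in zip(weights, h)) + bias
```

Scanning the mask from the right works for both left-padded and right-padded batches, which is why it is used here instead of `sum(mask) - 1`.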

2606.05710 2026-06-05 cs.CR cs.AI

Explainable AI-Driven Cyber Risk Analytics and Model Reliability Assessment for Intelligent Governance of U.S. Critical Infrastructure: An XGBoost and SHAP-Based Intrusion Detection Framework

面向美国关键基础设施智能治理的可解释AI驱动的网络风险分析与模型可靠性评估:基于XGBoost和SHAP的入侵检测框架

B. M. Taslimul Haque, Md. Arifur Rahman, Md. Serajul Kabir Chowdhury Rubel, Md. Iqbal Hossan

发表机构 * Department of Business Information Systems, Central Michigan University(中央密歇根大学商业信息系统系) Department of Information Studies, Trine University(特林大学信息学系) Department of Computer Science, Maharishi International University(Maharishi国际大学计算机科学系)

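AI summary: Targeting cyber threats to U.S. critical infrastructure, proposes an intrusion detection and cyber risk prediction framework that combines machine learning classifiers such as XGBoost and Random Forest with explainable AI (XAI) techniques, and validates model performance and reliability on the CICIDS2017 dataset.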

Comments 20 pages, 8 figures, empirical research article, CICIDS2017 dataset, XGBoost, Random Forest, Decision Tree, Logistic Regression, SHAP explainability analysis, cyber risk analytics, intrusion detection, critical infrastructure cybersecurity, model reliability assessment

详情
Journal ref
Applied IT & Engineering, 2(1), 1-20, 2024
AI中文摘要

美国关键基础设施领域智能数字技术的日益渗透极大地增加了面对高级网络对手和运营漏洞的风险。AI驱动的治理和自动化决策系统正成为关键基础设施系统(包括能源、医疗、交通、金融服务和通信基础设施)运行的关键部分,以提高效率和战略管理。不断增长的网络威胁环境,如分布式拒绝服务(DDoS)攻击、僵尸网络、勒索软件和高级持续性威胁(APT),对基础设施韧性、网络安全可靠性和治理可信度构成了重大挑战。在不断变化的攻击态势和动态网络环境中,传统的网络安全机制往往无法满足不断变化的需求和保护关键系统。本研究将开发一个弹性网络风险分析和模型可靠性评估框架,以支持美国关键基础设施环境中网络风险暴露的智能治理和决策支持。本研究基于CICIDS2017数据集,用于开发和测试基于机器学习的入侵检测系统模型和网络风险预测模型。使用XGBoost、随机森林和决策树等多种分类器来检测网络上的恶意活动并确定网络风险水平。此外,集成了可解释人工智能(XAI)技术,以增强网络安全决策过程的透明度、可解释性和信任度。所提出的框架通过多种性能指标(如准确率、精确率、召回率、F1分数、ROC-AUC和假阳性率)展示了模型的可靠性和韧性。

英文摘要

The increasing penetration of intelligent digital technologies into the critical infrastructure sector in the United States has greatly increased exposure to advanced cyber adversaries and operational vulnerabilities. AI-powered governance and automated decision-making systems are becoming a key part of the operation of critical infrastructure systems, including energy, healthcare, transportation, financial services, and communication infrastructure, in order to improve efficiency and strategic management. The growing cyber threat environment, including Distributed Denial of Service (DDoS) attacks, botnets, ransomware, and Advanced Persistent Threats (APTs), poses significant challenges to infrastructure resilience, cybersecurity reliability, and governance trustworthiness. In a changing attack landscape and dynamic network environment, traditional cybersecurity mechanisms often fall short of meeting evolving needs and protecting critical systems. This study develops a resilient cyber risk analytics and model reliability assessment framework to support intelligent governance and decision support for cyber risk exposure in the U.S. critical infrastructure environment. The study uses the CICIDS2017 dataset to develop and test machine-learning-based intrusion detection and cyber risk prediction models. Classifiers such as XGBoost, Random Forest, and Decision Tree are used to detect malicious network activity and determine the level of cyber risk. Furthermore, Explainable Artificial Intelligence (XAI) techniques are integrated to enhance transparency, interpretability, and trust in cybersecurity decision-making processes. The proposed framework demonstrates the reliability and resilience of the model through performance measures such as accuracy, precision, recall, F1 score, ROC-AUC, and false positive rate.

2606.05701 2026-06-05 cs.CR cs.AI

Cognitive Threat Intelligence and Explainable Federated Security Analytics for distributed Infrastructure Systems

面向分布式基础设施系统的认知威胁情报与可解释联邦安全分析

Md. Arifur Rahman, B. M. Taslimul Haque, Md. Iqbal Hossan, Md. Serajul Kabir Chowdhury Rubel

发表机构 * Dept. of Information Studies, Trine University(信息研究系,特林大学) Dept. of Business Information Systems, Central Michigan University(商业信息系统系,中央密歇根大学) Dept. of CS, Maharishi International University(计算机科学系, Maharishi 国际大学)

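AI summary: Proposes a framework integrating federated learning, explainable AI, and cognitive cybersecurity analytics for collaborative, privacy-preserving threat detection in distributed infrastructure systems.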

Comments 22 pages, 10 figures, 1 conceptual framework diagram, 1 methodology workflow diagram, empirical study using NSL-KDD and CIC-IDS2017 datasets, Federated Learning, Explainable AI (SHAP, LIME), cybersecurity and intrusion detection framework

详情
Journal ref
International Journal of Research and Technology (IJRT), Volume 13, Issue 01, January-March 2025, pp. 132-151
AI中文摘要

分布式基础设施系统、云计算、物联网技术和边缘架构的日益普及显著扩大了网络安全攻击面,并引入了日益复杂的网络威胁。传统的集中式入侵检测方法在可扩展性、数据隐私、通信开销以及人工智能驱动决策过程的透明度方面常面临挑战。为解决这些限制,本文提出了一种面向分布式基础设施系统的认知威胁情报与可解释联邦安全分析框架。该框架集成了联邦学习、可解释人工智能和认知网络安全分析,能够在分布式网络环境中实现协作式且保护隐私的网络威胁检测。敏感原始网络流量数据不传输到集中式服务器,而是在分布式节点上独立训练本地安全模型,仅通过联邦聚合机制共享加密的模型参数和更新。这种去中心化学习架构在减少通信依赖和集中式安全风险的同时提高了隐私保护。为增强智能威胁分析,该框架采用了机器学习和深度学习算法,包括随机森林、XGBoost、自编码器、卷积神经网络和长短期记忆网络。此外,可解释人工智能技术(如SHAP和LIME)被集成以提供透明且可理解的威胁检测决策解释,从而增强安全分析师之间的信任和可操作性。在包括CICIDS2017、UNSW-NB15和CSE-CIC-IDS2018在内的多个基准网络入侵数据集上进行的实验评估表明,所提框架在检测准确率、精确率、召回率和F1分数方面优于传统集中式和现有联邦学习方法,同时确保数据隐私、通信效率和模型可解释性。

英文摘要

The increasing adoption of distributed infrastructure systems, cloud computing, Internet of Things (IoT) technologies, and edge-based architectures has significantly expanded the cybersecurity attack surface and introduced increasingly sophisticated cyber threats. Conventional centralized intrusion detection approaches often face challenges related to scalability, data privacy, communication overhead, and limited transparency in artificial intelligence-driven decision-making processes. To address these limitations, this study proposes a Cognitive Threat Intelligence and Explainable Federated Security Analytics framework for distributed infrastructure systems. The proposed framework integrates Federated Learning (FL), Explainable Artificial Intelligence (XAI), and cognitive cybersecurity analytics to enable collaborative and privacy-preserving cyber threat detection across distributed network environments. Instead of transmitting sensitive raw network traffic data to centralized servers, local security models are independently trained at distributed nodes, where only encrypted model parameters and updates are shared through a federated aggregation mechanism. This decentralized learning architecture improves privacy protection while reducing communication dependency and centralized security risks. To enhance intelligent threat analysis, the framework incorporates machine learning and deep learning algorithms including Random Forest, XGBoost, Autoencoder

2606.05680 2026-06-05 cs.PL cs.AR cs.LG

CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs

CASS-RTL:面向LLM的RTL生成的正确性感知子空间引导

Mohammad Akyash, Nowfel Mashnoor, Kimia Azar, Hadi Kamali

发表机构 * Department of Electrical and Computer Engineering (ECE), University of Central Florida, Orlando, FL 32816, USA(电子与计算机工程系,中央佛罗里达大学,奥兰多,佛罗里达州32816,美国)

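AI summary: Proposes the CASS-RTL framework, which identifies attention heads in LLMs that correlate with RTL correctness and builds a low-dimensional subspace for lightweight intervention, improving the functional accuracy of generated RTL code without additional supervision or retraining.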

Comments Accepted to the IEEE International Conference on LLM-Aided Design (LAD '26)

详情
AI中文摘要

近期大型语言模型(LLM)的进展使得从自然语言指令自动综合(生成)寄存器传输级(RTL)代码成为可能,为加速芯片设计提供了有前景的途径。与典型的自然语言(及软件编码)任务不同,基于LLM的RTL代码生成要求严格的周期准确性和并发性,微小的逻辑错误可能导致电路无法使用或不安全。尽管先前的工作通过外部验证、自我评估提示、检索增强提示、领域特定微调、智能体解决方案和推理来探索幻觉缓解,但这些方法大多忽视了LLM中可能固有地与RTL正确性相关的注意力导向内部机制。本文提出CASS-RTL,这是首个发现并利用LLM的正确性感知组件来引导RTL生成朝向功能准确输出的框架。我们(i)识别注意力头,其激活模式一致地区分正确与不正确的RTL;(ii)构建一个低维子空间以捕获正确性相关信号;(iii)设计一种轻量级的、几何感知的干预,在推理时引导模型。CASS-RTL完全与模型无关,无需额外监督或重训练,并易于集成到现有模型中。实验上,我们在多个模型上评估CASS-RTL,观察到在VerilogEval上pass@1/5/10准确率提升10%-20%,在CVDP上提升5%,证明了我们的方法在增强可靠性方面的有效性,同时不牺牲模型效率或需要大型标注数据集进行微调。

英文摘要

Recent advances in large language models (LLMs) have enabled the automatic synthesis (generation) of register-transfer level (RTL) code from natural language instructions, offering a promising pathway to accelerate chip design. Unlike typical natural language (and software coding) tasks, LLM-based RTL code generation demands strict cycle accuracy with concurrency, where minor logical errors can render a circuit unusable or insecure. While prior work has explored hallucination mitigation via external verification, self-evaluation prompts, retrieval-augmented prompting, domain specific fine-tuning, agentic solutions, and reasoning, these approaches largely overlook the attention-oriented internal mechanisms of LLMs that may inherently correlate with RTL correctness. This work proposes CASS-RTL, a first-of-its-kind framework for discovering and leveraging LLMs' correctness-aware components to guide RTL generation toward functionally accurate outputs. We (i) identify attention heads whose activation patterns consistently differentiate correct from incorrect RTL; (ii) construct a low-dimensional subspace capturing correctness-relevant signals; and (iii) design a lightweight, geometry-aware intervention that steers the model at inference time. CASS-RTL is fully model-agnostic, requires no additional supervision or retraining, and readily integrates into existing models. Empirically, we evaluate CASS-RTL on multiple models and observe 10%-20% improvement in pass@1/5/10 accuracy on VerilogEval and 5% improvement on CVDP, demonstrating the effectiveness of our method in enhancing reliability without sacrificing model efficiency or requiring a large labeled dataset for fine-tuning.

2606.05679 2026-06-05 cs.DB cs.AI

Data Flow Control: Data Safety Policies for AI Agents

数据流控制:AI 智能体的数据安全策略

Charlie Summers, Eugene Wu

发表机构 * Columbia University(哥伦比亚大学)

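AI summary: Proposes the Data Flow Control framework, which uses declarative policies and Passant, a portable query rewriting layer, to enforce tuple-level data safety policies in a DBMS at near-zero overhead.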

Comments 15 pages, 12 figures

详情
AI中文摘要

智能体越来越多地代表用户生成 SQL、编排管道和自动化数据分析。虽然最近的工作提高了查询的正确性,但正确性不等于安全性。一个查询可能在语义上有效,却违反了管理数据如何组合和发布的监管、隐私或业务约束。我们认为,强制执行此类约束本质上是一个数据基础设施问题。本文介绍了数据流控制(DFC),一个在 DBMS 查询中声明式指定并保证对元组级数据流实施策略的框架。一个关键挑战是定义一种优化器无关但可大规模高效执行的策略语言。我们将数据安全形式化为关于溯源单项的聚合谓词,并提出了 Passant,一个可移植的查询重写层,无需物化溯源即可强制执行 DFC 策略。在五个 DBMS 引擎——DuckDB、Umbra、PostgreSQL、DataFusion 和 SQLServer 上,Passant 实现了约 0% 的开销,并且性能优于替代方案数个数量级。因此,数据流控制是将数据安全从提示和事后检查转移到数据基础设施的第一步。数据流控制开源可用:https://github.com/dataflowcontrol/data-flow-control。

英文摘要

Agents increasingly generate SQL, orchestrate pipelines, and automate data analysis on behalf of users. While recent work improves query correctness, correctness is not safety. A query may be semantically valid yet violate regulatory, privacy, or business constraints that govern how data may be combined and released. We argue that enforcing such constraints is fundamentally a data infrastructure problem. This paper introduces Data Flow Control (DFC), a framework to declaratively specify and guarantee policy enforcement over tuple-level data flows within a DBMS query. A key challenge is defining a policy language that is optimizer-invariant yet efficient to enforce at scale. We formalize data safety as aggregate predicates over provenance monomials and present Passant, a portable query rewriting layer that enforces DFC policies without materializing provenance. Across five DBMS engines -- DuckDB, Umbra, PostgreSQL, DataFusion, and SQLServer -- Passant achieves ~0% overhead and outperforms alternatives by orders of magnitude. As a result, Data Flow Control is the first step towards moving data safety from prompts and post-hoc checks into the data infrastructure. Data Flow Control is available open source at https://github.com/dataflowcontrol/data-flow-control.

2606.05658 2026-06-05 cs.IR cs.AI

Agent-Orchestrated Adaptive RAG: A Comparative Study on Structured and Multi-Hop Retrieval

Agent编排的自适应RAG:结构化与多跳检索的比较研究

Anuj Maharjan, Devinder Kaur, Richard Molyet

发表机构 * University of California, Berkeley(加州大学伯克利分校) University of Washington(华盛顿大学) University of California, Los Angeles(加州大学洛杉矶分校)

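AI summary: Proposes an Agent-Orchestrated Adaptive RAG framework with dynamic query decomposition, iterative retrieval, and self-reflective evaluation. A comparison on a structured domain (DevOps) and a multi-hop reasoning benchmark (MuSiQue) finds that query decomposition improves performance in the structured domain but lowers multi-hop ranking precision, and that reflection improves citation accuracy at the cost of latency, indicating that agentic enhancements should be applied selectively according to query and domain characteristics.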

详情
AI中文摘要

检索增强生成(RAG)通过将响应基于外部知识来增强大型语言模型(LLM),但传统流水线依赖于静态的单步检索,这限制了复杂查询的性能。本文提出了一种Agent编排的自适应RAG框架,引入了动态查询分解、迭代检索和有界自反思评估循环。我们在两个互补的数据集上评估该系统:一个特定领域的DevOps知识库和多跳推理基准MuSiQue。使用包括总体得分、引用准确性、平均倒数排名和主题覆盖度在内的指标,我们发现查询分解在结构化领域(DevOps上总体得分+0.04,MRR+0.17)带来一致的增益,但在多跳基准上降低了排名精度,而反思机制以显著的延迟成本提高了引用准确性。这些对比结果表明,Agent增强并非普遍有益,必须根据查询和领域特性选择性应用。我们的发现支持自适应、成本感知的编排,而非统一激进的推理流水线。

英文摘要

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding their responses in external knowledge, but conventional pipelines rely on static, single-step retrieval that limits performance on complex queries. This paper presents an Agent-Orchestrated Adaptive RAG framework that introduces dynamic query decomposition, iterative retrieval, and a bounded self-reflective evaluation loop. We evaluate the system across two complementary datasets: a domain-specific DevOps knowledge base and the multi-hop reasoning benchmark MuSiQue. Using metrics that include overall score, citation accuracy, mean reciprocal rank, and topic coverage, we find that query decomposition yields consistent gains in the structured domain (overall score $+0.04$, MRR $+0.17$ on DevOps) but degrades ranking precision on the multi-hop benchmark, while the reflection mechanism improves citation accuracy at a substantial latency cost. These contrasting results show that agentic enhancements are not universally beneficial and must be applied selectively according to query and domain characteristics. Our findings argue for adaptive, cost-aware orchestration rather than uniformly aggressive reasoning pipelines.
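Mean reciprocal rank (MRR), one of the retrieval metrics reported in this abstract, averages the reciprocal of the rank at which the first relevant document appears for each query. The sketch below uses the standard definition, with queries that retrieve no relevant document contributing zero. The function name and the list-of-lists input format are assumptions, not the paper's evaluation harness.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant document per query (0 if none is retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)
```

Under this definition, an MRR gain of 0.17 such as the one reported on DevOps means that relevant documents appear noticeably earlier in the ranked list on average.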

2606.05650 2026-06-05 cs.MM cs.CV cs.GR cs.NI

GS-NFS: Bandwidth-adaptive Streaming of Dynamic Gaussian Splats and Point Clouds

GS-NFS: 动态高斯溅射和点云的带宽自适应流传输

Rajrup Ghosh, Haodong Wang, Haoran Hong, Eduardo Pavez, Amartya Chaudhuri, Weiwu Pang, Harsha V. Madhyastha, Antonio Ortega, Ramesh Govindan

发表机构 * University of Southern California(南加州大学)

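AI summary: Proposes GS-NFS, which uses GPU parallelism to accelerate encoding and decoding of dynamic 3DGS frames so that it runs at full frame rate, 1-2 orders of magnitude faster than the state of the art, while keeping competitive compression performance and rendering quality.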

详情
AI中文摘要

动态3D高斯溅射(3DGS)作为一种3D视频流技术具有很大前景,因为它能够以高保真度表示复杂的3D场景。在该方法中,3D视频的每一帧将环境表示为一组高斯体,每个高斯体具有位置以及其他属性,如尺度、旋转、不透明度和颜色。帧捕捉了精细细节,允许从任意视角观看,但数据量比2D视频帧大一个数量级或更多。最近的一系列工作探索了如何压缩动态3DGS帧,但这些方法通常较慢,部分原因是它们的压缩技术不适合高效加速。GS-NFS在GPU上加速动态3DGS的压缩和解压缩,达到能够以全帧率编码和解码的程度。它通过开发基于GPU的新型并行化方法,对现有的高斯位置和属性编码算法进行并行化来实现这一点。因此,它在编码和解码一帧时比现有技术快1-2个数量级,同时提供具有竞争力的压缩性能和渲染质量。

英文摘要

Dynamic 3D Gaussian Splatting (3DGS) holds great promise as a 3D video streaming technology since it can represent complex 3D scenes with high fidelity. In this approach, every frame in a 3D video represents the environment as a collection of Gaussians with position and other attributes such as scale, rotation, opacity, and color. Frames capture fine details, permit views from any arbitrary perspective, but are an order of magnitude, or more, larger than 2D video frames. A line of recent work has explored how to compress dynamic 3DGS frames, but these approaches are often slow, in part because their compression techniques are not amenable to efficient acceleration. GS-NFS accelerates dynamic 3DGS compression and decompression on a GPU, to the point where it can encode and decode at full frame rate. It achieves this by developing novel GPU-based parallelizations of existing algorithms for encoding both positions and attributes of Gaussians. As a result, it is 1-2 orders of magnitude faster than the state-of-the-art in encoding and decoding a frame, while offering competitive compression performance and rendering quality.

2606.05649 2026-06-05 stat.CO cs.LG

Diff2SP: Diffusion Models for Correlated Scenario Generation in Stochastic Programming

Diff2SP:随机规划中相关场景生成的扩散模型

Haixiang Sun, Andrew Liu

发表机构 * Purdue University(普渡大学)

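AI summary: Proposes Diff2SP, a diffusion-based generative framework that embeds downstream optimization objectives into scenario generation, with theoretical guarantees and empirical validation that it yields scenarios that are both statistically coherent and decision-aware.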

详情
AI中文摘要

场景生成是随机规划(SP)中的关键组成部分,直接影响不确定性下决策的质量。现有方法主要依赖于基于采样的技术或使用神经网络的监督学习。基于采样的方法通常难以捕捉复杂依赖关系和罕见但可能的事件,而监督学习需要固定的输入-输出对进行训练,且生成不受预定义模式或规则限制的多样化现实场景的能力有限。为了解决这些局限性,我们引入了Diff2SP,一种基于扩散的生成框架,将下游优化目标直接融入场景生成中。与将场景生成和决策制定视为独立步骤的传统方法不同,Diff2SP将随机优化嵌入训练过程,从而生成既统计一致又具有决策感知的场景。为了正式证明这种优化感知设计的合理性,我们建立了将分布精度与决策质量联系起来的遗憾界,并建立了样本复杂度保证,显示出比传统生成模型(如GAN)更快的收敛速度。在合成数据集和电力系统数据集上的实证结果验证了这些理论见解,表明Diff2SP在统计保真度和下游优化结果上均有一致提升。

英文摘要

Scenario generation is a critical component in stochastic programming (SP), as it directly influences the quality of decision-making under uncertainty. Existing approaches predominantly rely on either sampling-based techniques or supervised learning using neural networks. Sampling-based techniques often struggle to capture complex dependencies and rare but plausible events, while supervised learning requires fixed input-output pairs for training and is limited in its ability to generate a wide variety of realistic scenarios that are not restricted by predefined patterns or rules. To address these limitations, we introduce Diff2SP, a diffusion-based generative framework that incorporates downstream optimization objectives directly into scenario generation. Unlike conventional methods that treat scenario generation and decision-making as separate steps, Diff2SP embeds stochastic optimization into the training process, enabling the generation of scenarios that are both statistically coherent and decision-aware. To formally justify this optimization-aware design, we establish regret bounds that link distributional accuracy to decision quality, and derive sample complexity guarantees showing faster convergence than traditional generative models such as GANs. Empirical results on both synthetic and power-system datasets validate these theoretical insights, demonstrating that Diff2SP consistently improves both statistical fidelity and downstream optimization outcomes.

2606.05646 2026-06-05 cs.SE cs.AI

Enhancing Software Engineering Through Closed-Loop Memory Optimization

通过闭环内存优化增强软件工程

Xuehang Guo, Zora Zhiruo Wang, Qingyun Wang, Graham Neubig, Xingyao Wang

发表机构 * William & Mary(威廉玛丽学院) Carnegie Mellon University(卡内基梅隆大学) OpenHands University(OpenHands大学) University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校)

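AI summary: Proposes a closed-loop memory optimization framework that defines memory utility through validated downstream impact, serving as both an evaluation benchmark and an optimization signal, and improves the success rate and efficiency of software engineering agents.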

详情
AI中文摘要

大型语言模型(LLMs)使得强大的软件工程(SE)代理能够导航复杂的代码库并解决现实世界的问题。然而,这些代理本质上仍然是片段式(episodic)的:它们无法跨任务保留、改进和重用经验,反复从头构建上下文并重复类似的错误。即使有内存支持,它们也无法弥补缺乏原则性、任务无关的内存效用的缺陷,这使得难以严格评估或跨代理和设置进行泛化。为了解决这些限制,我们引入了一个用于 SE 代理内存增强的闭环框架。该框架将内存效用建立在经验证的下游影响之上,将效用确立为任务无关的评估基准和无注释的优化信号。通过在单 episode 和跨 episode 内存增强上的互补评估,结果表明该框架在不同设置下一致地改进了 SE 代理,在成功率上实现了高达 5.25% 的绝对增益,在解决效率上实现了高达 4.63% 的绝对增益,同时将计算成本降低了至少 9.79%。我们的项目页面:https://xhguo7.github.io/MemOp/。

英文摘要

Large language models (LLMs) have enabled powerful software engineering (SE) agents capable of navigating complex codebases and resolving real-world issues. However, these agents remain fundamentally episodic: they fail to retain, refine, and reuse experiences across tasks, repeatedly reconstructing context from scratch and reproducing similar mistakes. Even with memory support, they offer no remedy for the absence of a principled, task-agnostic memory utility, making them difficult to evaluate rigorously or generalize across agents and settings. To tackle these limitations, we introduce a closed-loop framework for memory augmentation in SE agents. The framework grounds memory utility in validated downstream impact, establishing utility as both a task-agnostic evaluation benchmark and an annotation-free optimization signal. Through complementary evaluation on single-episode and cross-episode memory augmentation, results demonstrate that the framework consistently improves SE agents across settings, achieving absolute gains of up to 5.25% in success rate and 4.63% in resolve efficiency, while reducing computational cost by at least 9.79%. Our project page: https://xhguo7.github.io/MemOp/.

2606.05618 2026-06-05 nlin.CD cs.LG math.DS

Uncovering Extreme Event Mechanisms for Prediction and Control with Sensitivity-Balanced Projections

利用敏感度平衡投影揭示极端事件机制以进行预测与控制

Nicholas Zolman, Sajeda Mokbel, Samuel E. Otto, Steven L. Brunton

发表机构 * Department of Mechanical Engineering, University of Washington(华盛顿大学机械工程系) AI Institute in Dynamic Systems, University of Washington(华盛顿大学动态系统人工智能研究所) Sibley School of Mechanical and Aerospace Engineering, Cornell University(康奈尔大学Sibley机械与航空航天工程学院)

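AI summary: Proposes an interpretable method based on covariance balancing reduction using adjoint snapshots (CoBRAS), replacing adjoint computations with automatic differentiation, that identifies sensitivity-balanced projections to reveal extreme-event mechanisms and supports data-driven forecasting and event-suppression control.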

Comments 12 pages, 6 figures (main text). Additional 14 pages of references and Supplementary Information

详情
AI中文摘要

极端事件——如地震和日冕物质抛射——在许多混沌动力系统中很常见,但由于驱动它们的微妙不稳定性机制,很难表征和预测。在这项工作中,我们开发了一种可解释的技术,揭示极端事件背后的潜在机制,并利用它们构建数据驱动的预测和直观的事件抑制控制器。特别是,我们利用伴随快照的协方差平衡降维(CoBRAS)方法来识别线性斜投影,这些投影最好地捕获感兴趣量的敏感度并重建原始状态。重要的是,我们绕过了繁琐的伴随计算的需要,而是通过现代自动可微数值框架使用反向传播。为了适应空间局部事件,我们还引入了一种新的CoBRAS变体,以获得局部敏感度平衡投影。我们展示了这种方法在一系列具有挑战性的系统中表征极端事件的效用,包括二维Kolmogorov流中湍流能量耗散的爆发、耦合FitzHugh-Nagumo振荡器网络中的自发同步,以及由修正非线性薛定谔方程产生的海洋怪波的局部形成。对于每个例子,我们展示了我们的简单预测模型准确预测极端事件,并且潜在机制可用于设计控制律以防止这些事件。最后,我们证明了通过直接从数据学习动力学的神经网络代理模型,我们可以将这种方法扩展到实验系统和那些并非原生用自动可微编程语言编写的系统。

英文摘要

Extreme events -- such as earthquakes and coronal mass ejections -- are common in many chaotic dynamical systems, yet are difficult to characterize and predict due to the subtle instability mechanisms that drive them. In this work, we develop an interpretable technique that reveals the underlying mechanisms behind extreme events and uses them to build data-driven forecasts and intuitive event suppression controllers. In particular, we utilize the covariance balancing reduction using adjoint snapshots (CoBRAS) method to identify linear oblique projections that best capture the sensitivity of a quantity of interest and reconstruct the original state. Importantly, we bypass the need for cumbersome adjoint calculations, instead using backpropagation via modern automatically differentiable numerical frameworks. To accommodate spatially localized events, we also introduce a new variant of CoBRAS to obtain local sensitivity-balanced projections. We demonstrate the utility of this approach to characterize extreme events across a diverse set of challenging systems, including turbulent bursts of energy dissipation in the 2D Kolmogorov Flow, spontaneous synchronization in networks of coupled FitzHugh-Nagumo oscillators, and the localized formation of ocean rogue waves from a modified nonlinear Schrödinger equation. For each example, we show that our simple forecast models accurately predict extreme events and that the underlying mechanisms may be used to design control laws to prevent these events. Finally, we demonstrate that by learning a neural network surrogate model of the dynamics directly from data, we may extend this approach to experimental systems and systems that are not natively written in an automatically differentiable programming language.

2606.05609 2026-06-05 cs.CR cs.AI cs.LG

SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks

SlotGCG:利用LLMs中的位置脆弱性进行越狱攻击

Seungwon Jeong, Jiwoo Jeong, Hyeonjin Kim, Yunseok Lee, Woojin Lee

发表机构 * Dongguk University-Seoul(东国大学-首尔)

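AI summary: This paper proposes SlotGCG, which quantifies the vulnerability of different insertion positions (slots) in a prompt with a Vulnerable Slot Score (VSS) and inserts adversarial tokens at the most vulnerable positions, significantly raising the success rate of optimization-based jailbreak attacks.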

详情
Journal ref
International Conference on Learning Representations (ICLR), 2026
AI中文摘要

随着大型语言模型(LLMs)的广泛部署,通过越狱攻击识别其脆弱性变得日益关键。基于优化的攻击方法如贪婪坐标梯度(GCG)专注于将对抗性令牌插入到提示的末尾。然而,GCG将对抗性令牌限制在固定的插入点(通常是提示后缀),未探索在其他位置插入令牌的效果。在本文中,我们实证研究了提示中可插入令牌的候选位置(称为槽)。我们发现越狱的脆弱性与槽的选择高度相关。基于这些发现,我们引入了脆弱性槽得分(VSS)来量化越狱的位置脆弱性。随后,我们提出SlotGCG,该方法使用VSS评估所有槽,选择最脆弱的槽进行插入,并在这些槽上运行针对性的优化攻击。我们的方法提供了一种与攻击无关的位置搜索机制,可插入任何基于优化的攻击,仅增加200毫秒的预处理时间。在多个模型上的实验表明,SlotGCG显著优于现有方法。具体而言,与基于GCG的攻击相比,它实现了14%更高的攻击成功率(ASR),收敛更快,并且对防御方法表现出更强的鲁棒性,ASR比基线方法高42%。我们的实现可在https://github.com/youai058/SlotGCG获取。

英文摘要

As large language models (LLMs) are widely deployed, identifying their vulnerability through jailbreak attacks becomes increasingly critical. Optimization-based attacks like Greedy Coordinate Gradient (GCG) have focused on inserting adversarial tokens at the end of prompts. However, GCG restricts adversarial tokens to a fixed insertion point (typically the prompt suffix), leaving the effect of inserting tokens at other positions unexplored. In this paper, we empirically investigate slots, i.e., candidate positions within a prompt where tokens can be inserted. We find that vulnerability to jailbreaking is highly related to the selection of the slots. Based on these findings, we introduce the Vulnerable Slot Score (VSS) to quantify the positional vulnerability to jailbreaking. We then propose SlotGCG, which evaluates all slots with VSS, selects the most vulnerable slots for insertion, and runs a targeted optimization attack at those slots. Our approach provides a position-search mechanism that is attack-agnostic and can be plugged into any optimization-based attack, adding only 200ms of preprocessing time. Experiments across multiple models demonstrate that SlotGCG significantly outperforms existing methods. Specifically, it achieves 14% higher Attack Success Rates (ASR) over GCG-based attacks, converges faster, and shows superior robustness against defense methods with 42% higher ASR than baseline approaches. Our implementation is available at https://github.com/youai058/SlotGCG.

2606.05584 2026-06-05 cs.CR cs.AI

Dimensionality Reduction for Cyberattack Classification: A Comparative Evaluation of PCA and Linear Predictive Coding

网络攻击分类的降维:PCA与线性预测编码的比较评估

Nelly Elsayed, Zag ElSayed, Navid Asadizanjani

发表机构 * University of California, Los Angeles(加州大学洛杉矶分校)

AI总结 本文通过比较主成分分析(PCA)和线性预测编码(LPC)两种降维方法,研究网络攻击分类中的特征压缩技术,实验表明PCA在激进压缩下仍能保持分类性能,LPC则略有性能下降,但两者均能在最小影响分类准确率的情况下大幅降低特征维度。

Comments Accepted in the IEEE MWSCAS 2026

AI中文摘要

高维特征表示被广泛用于基于机器学习的网络攻击检测系统。然而,它们增加了计算复杂度,并可能阻碍在资源受限环境中的部署。在本文中,我们通过比较两种降维方法:主成分分析(PCA)和线性预测编码(LPC),研究用于网络攻击分类的特征压缩技术。生成具有不同维度的压缩特征表示,并在多个分类模型上进行评估。实验分析表明,即使在激进压缩下,PCA也能保持分类性能。另一方面,LPC提供了具有竞争力的预测表示,但性能下降略大。结果表明,可以在对分类准确率影响最小的情况下实现特征维度的显著降低,突显了轻量级特征压缩在高效网络安全分析中的潜力。

英文摘要

High-dimensional feature representations are widely used in machine learning-based cyberattack detection systems. However, they increase computational complexity and may hinder deployment in resource-constrained environments. In this paper, we investigate feature compression techniques for cyberattack classification by comparing two dimensionality reduction approaches: Principal Component Analysis (PCA) and Linear Predictive Coding (LPC). Compressed feature representations with varying dimensionalities are generated and evaluated across several classification models. Experimental analysis demonstrates that PCA preserves classification performance even under aggressive compression. On the other hand, LPC provides competitive predictive representations with slightly larger performance degradation. The results show that substantial reductions in feature dimensionality can be achieved with minimal impact on classification accuracy, highlighting the potential of lightweight feature compression for efficient cybersecurity analytics.
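
To make the LPC side of the comparison concrete, here is a minimal sketch of linear predictive coding via the autocorrelation method and the Levinson-Durbin recursion, using only the Python standard library. It rests on an assumption not stated in the abstract: that a feature vector is treated as a 1-D sequence and the `order` predictor coefficients are kept as its compressed representation. The function names and the sample vector are hypothetical, and the paper's actual pipeline may differ.

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(x, order):
    """Return (coeffs, error) such that x[t] ~ sum_j coeffs[j] * x[t-1-j].

    Solves the Toeplitz normal equations with the Levinson-Durbin recursion.
    """
    r = autocorrelation(x, order)
    a = []        # predictor coefficients found so far
    err = r[0]    # prediction error energy of the current predictor
    for i in range(order):
        if err <= 0.0:
            # Signal is already perfectly predicted (or all zero): pad.
            a.append(0.0)
            continue
        # Reflection coefficient for extending the predictor by one tap.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        a = [a[j] - k * a[i - 1 - j] for j in range(i)] + [k]
        err *= (1.0 - k * k)
    return a, err


# Hypothetical 10-dimensional feature vector compressed to 3 coefficients.
features = [0.0, 1.0, 0.5, -0.3, -0.8, -0.2, 0.6, 0.9, 0.1, -0.7]
coeffs, residual = lpc(features, order=3)
print(coeffs, residual)
```

The residual energy returned alongside the coefficients gives a rough sense of how much of the sequence the low-order predictor fails to capture, which is one way to reason about the accuracy cost of a more aggressive compression.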

2606.05581 2026-06-05 cs.GR cs.CV cs.LG

Monte Carlo Steklov Operators for Large-Scale Geometry Processing in the Wild

蒙特卡洛Steklov算子用于大规模野外几何处理

Arman Maesumi, Tanish Makadia, Aruna Anderson, Oras Phongpanangam, Justin Solomon, Daniel Ritchie

发表机构 * Brown University(布朗大学) Loyola Marymount University(洛约拉玛丽蒙特大学) Massachusetts Institute of Technology(麻省理工学院)

AI总结 提出一种蒙特卡洛方法估计Dirichlet-to-Neumann算子及其Steklov特征模态,实现鲁棒且高效的体积算子计算,并应用于大规模3D对比表示学习。

Comments 21 pages

AI中文摘要

内在方法填充了网格几何处理的默认工具箱。内在算子,特别是拉普拉斯算子,是对等距不变性有要求的方法的基础,因此已用于许多形状分析、学习和编辑算法。然而,内在方法的前提假设在处理野外几何时变得脆弱,因为(i)网格质量无法保证,(ii)许多网格由多个连通分量建模。在这种情况下,体积构造定义更清晰,因为可以放宽对表面拓扑的限制。本文提出了一种蒙特卡洛方法,用于估计Dirichlet-to-Neumann (DtN)算子——一种边界到边界的体积算子——及其相关的Steklov特征模态。我们基于蒙特卡洛几何处理的最新发展,将该边界算子本身作为估计对象。通过体积随机过程定义的DtN算子被推广到外部域,通过周围环境空间耦合断开的分量。我们表明,我们的方法在计算Steklov谱时比现有的边界元方法快几个数量级,同时对低质量三角剖分、高分辨率网格和多分量几何保持鲁棒。为了展示这种可扩展性,我们计算了来自未策划的Objaverse数据集的约450,000个形状的内外Steklov特征谱。我们将这些算子集成到Steklov-CLIP中,这是一种基于网格的神经网络,使用体积谱算子进行大规模对比3D表示学习。得到的网络学习到语义上有意义的全局和密集形状表示,说明几何上有原则的体积算子可以在现代3D数据集规模上变得实用。

英文摘要

Intrinsic methods fill the default toolbox for geometry processing on meshes. Intrinsic operators, in particular the Laplacian, underlie methods that require invariance to isometry and have hence been employed in many algorithms for shape analysis, learning, and editing. However, intrinsic methods are predicated on assumptions that quickly become brittle when working with in-the-wild geometry, where (i) mesh quality is not guaranteed, and (ii) many meshes are modeled with multiple connected components. In such settings, volumetric constructions are better-defined, since restrictions on surface topology can be relaxed. This paper presents a Monte Carlo method for estimating the Dirichlet-to-Neumann (DtN) operator -- a boundary-to-boundary volumetric operator -- and its associated Steklov eigenmodes. We build on recent developments in Monte Carlo geometry processing by casting this boundary operator itself as the subject of estimation. The DtN operator, defined through a volumetric stochastic process, is then generalized to the exterior domain, where it couples disconnected components through the surrounding ambient space. We show that our method is orders of magnitude faster than existing boundary-element approaches for computing Steklov spectra while remaining robust to poor triangulations, high-resolution meshes, and multi-component geometry. To demonstrate this scalability, we compute interior and exterior Steklov eigenspectra for approximately 450,000 shapes from the uncurated Objaverse dataset. We incorporate these operators into Steklov-CLIP, a mesh-based neural network that uses volumetric spectral operators for large-scale contrastive 3D representation learning. The resulting network learns semantically meaningful global and dense shape representations, illustrating that geometrically-principled volumetric operators can be made practical at the scale of modern 3D datasets.
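
As background for the Monte Carlo approach mentioned above, the sketch below shows the walk-on-spheres estimator, a standard building block of Monte Carlo geometry processing, for a harmonic function on the unit disk. This is not the paper's DtN or Steklov estimator. It only illustrates how a boundary value problem can be solved pointwise with random walks and no volumetric mesh. The domain, function names, and parameters (`eps`, `n_walks`, `seed`) are assumptions chosen for illustration.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, rng, eps=1e-3, max_steps=10_000):
    """Run one walk from (x, y) in the unit disk; return the boundary value hit."""
    for _ in range(max_steps):
        dist = 1.0 - math.hypot(x, y)   # distance to the unit circle
        if dist < eps:                  # close enough to the boundary: stop
            break
        # Jump to a uniformly random point on the largest circle around (x, y)
        # that stays inside the domain.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += dist * math.cos(theta)
        y += dist * math.sin(theta)
    r = math.hypot(x, y)
    return boundary_value(x / r, y / r)  # project onto the boundary


def estimate_harmonic(x, y, boundary_value, n_walks=5000, seed=0):
    """Monte Carlo estimate of the harmonic function with the given boundary data."""
    rng = random.Random(seed)
    total = sum(walk_on_spheres(x, y, boundary_value, rng)
                for _ in range(n_walks))
    return total / n_walks


# Boundary data g(x, y) = x has the harmonic extension u(x, y) = x,
# so the estimate at (0.3, 0.2) should be close to 0.3.
print(estimate_harmonic(0.3, 0.2, lambda bx, by: bx))
```

Each walk only needs distance-to-boundary queries, which is why estimators of this kind tolerate poor triangulations and disconnected components better than methods that require a well-formed volumetric discretization.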

2606.05572 2026-06-05 cs.ET cs.HC cs.RO physics.app-ph

Wave Focusing in Metamaterials: Tactile Displays Beyond the Diffraction Limit

超材料中的波聚焦:超越衍射极限的触觉显示器

Gregory Reardon, Max Linnander, Dustin Goetz, Neeli Tummala, Yon Visell

发表机构 * Media Arts and Technology Program(媒体艺术与技术项目) Department of Mechanical Engineering(机械工程系) Department of Electrical and Computer Engineering(电气与计算机工程系) University of California, Santa Barbara(加州大学圣芭芭拉分校)

AI总结 本文利用局部共振超材料板中的慢波分支实现机械波聚焦,突破衍射极限,生成高分辨率虚拟触觉像素,并将像素面积缩小十倍。

AI中文摘要

我们解决了工程化分布式触觉显示器的挑战,该显示器能够在表面上任意位置再现多个局部化、可独立寻址的振动——代表虚拟触觉像素。我们的技术基于使用稀疏的致动器阵列在弯曲板中聚焦机械波。在触觉频率下,波衍射阻止了在多指触摸交互相关空间尺度上形成局部化虚拟触觉像素。我们通过在板上增加机械共振器晶格,形成局部共振超材料板,克服了这一限制。板的动态模式与共振器模式之间的耦合改变了控制波传播的色散关系,引入了一个慢波分支,使得能够超越未修改板所施加的衍射极限进行聚焦。我们使用数值模拟来设计超材料系统的色散关系,以实现触觉频率下的高分辨率聚焦。然后,我们制造了一个超材料触觉显示器,并实验证明虚拟像素比在没有共振器的相同板上生成的像素更加局部化,导致虚拟像素面积缩小十倍。在行为实验中,我们展示了该系统能够传递感知上局部化的单点和多点触觉反馈以及移动触觉源,同时保持对多个显示位置的时间波形的独立控制。这里报告的方法可以使用少量致动自由度实现高分辨率触觉显示器,适用于广泛应用。

英文摘要

We address the challenge of engineering distributed haptic displays capable of reproducing multiple localized, independently addressable vibrations -- representing virtual tactile pixels -- at arbitrary locations on a surface. Our technique is based on the focusing of mechanical waves in a flexural plate using a sparse set of actuators. At tactile frequencies, wave diffraction prevents the formation of localized virtual tactile pixels at spatial scales relevant for multi-digit touch interactions. We overcome this limitation by augmenting the plate with a lattice of mechanical resonators, forming a locally resonant metamaterial plate. Coupling between the plate's dynamic modes and those of the resonators alters the dispersion relation governing wave transmission, introducing a slow-wave branch that enables focusing beyond the diffraction limit imposed by the unmodified plate. We use numerical simulations to engineer the dispersion relation of the metamaterial system for high-resolution focusing at tactile frequencies. We then fabricate a metamaterial tactile display and experimentally demonstrate virtual pixels that are far more localized than those generated on an otherwise identical plate without resonators, resulting in a tenfold reduction in virtual-pixel area. In behavioral experiments, we show that this system can deliver perceptually localized single- and multi-point tactile feedback and moving tactile sources while maintaining independent control over temporal waveforms at multiple display locations. The methods reported here can enable high-resolution haptic displays for widespread applications using a small number of actuated degrees of freedom.
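
The role of the slow-wave branch can be illustrated with a textbook model, sketched below under stated assumptions. It uses the Kirchhoff-Love thin-plate dispersion relation, D k^4 = m ω^2, and a homogenized long-wavelength approximation in which attached mass-spring resonators contribute a frequency-dependent effective mass below their resonance. All material values, the resonator mass, and the resonance frequency are hypothetical. This is not the paper's design or simulation method.

```python
import math


def bending_stiffness(youngs_modulus, thickness, poisson_ratio):
    """Flexural rigidity D = E h^3 / (12 (1 - nu^2)) of a thin plate."""
    return youngs_modulus * thickness ** 3 / (12.0 * (1.0 - poisson_ratio ** 2))


def flexural_wavelength(freq_hz, stiffness, mass_per_area):
    """Wavelength from the thin-plate dispersion relation D k^4 = m omega^2."""
    omega = 2.0 * math.pi * freq_hz
    k = (mass_per_area * omega ** 2 / stiffness) ** 0.25
    return 2.0 * math.pi / k


def effective_mass_per_area(freq_hz, plate_mass, resonator_mass, resonance_hz):
    """Homogenized mass per area with mass-spring resonators attached.

    Valid only below resonance (freq_hz < resonance_hz), where the added
    term is positive and grows as the frequency approaches resonance.
    """
    if freq_hz >= resonance_hz:
        raise ValueError("model is only used below the resonator frequency")
    return plate_mass + resonator_mass / (1.0 - (freq_hz / resonance_hz) ** 2)


# Hypothetical acrylic-like plate: E = 3 GPa, h = 2 mm, nu = 0.35, rho = 1200 kg/m^3.
D = bending_stiffness(3.0e9, 2.0e-3, 0.35)
plate_mass = 1200.0 * 2.0e-3   # kg/m^2
freq = 200.0                   # Hz, within the tactile range

bare = flexural_wavelength(freq, D, plate_mass)
loaded_mass = effective_mass_per_area(freq, plate_mass,
                                      resonator_mass=5.0, resonance_hz=250.0)
meta = flexural_wavelength(freq, D, loaded_mass)
print(f"bare plate wavelength: {bare:.3f} m, with resonators: {meta:.3f} m")
```

Because the focal spot size scales with the wavelength, a larger effective mass near resonance shortens the wavelength at a fixed frequency, which is the qualitative mechanism behind focusing beyond the limit of the unmodified plate.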