arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.07099 2026-06-08 eess.SY cs.SY New submission

SABLE: GPU-Based Power Flow Accelerator for Sparsity-Aware Batched Learning

Suho Park, Keunju Song, Hongseok Kim

AI summary: Proposes SABLE, a GPU-based sparse batched power flow accelerator that uses a block-diagonal embedding and reusable sparse templates to achieve zero-copy interoperability across PyTorch, CuPy, and cuDSS, significantly improving standalone power flow solving and end-to-end training throughput.

Comments: 10 pages

Abstract

Recent studies have developed GPU-based approaches for solving AC power flow and successfully applied them to standalone power flow problems. However, integrating these approaches into modern differentiable learning frameworks while preserving sparsity remains challenging. To this end, we present SABLE, a GPU-based sparse batched power flow accelerator for differentiable learning via an implicit power flow layer. SABLE leverages a block-diagonal embedding that reformulates batched three-dimensional Jacobians as a fixed-pattern two-dimensional sparse template that is shared across PyTorch, CuPy, and cuDSS. This common template enables zero-copy interoperability and memory-efficient sparse reuse across the software stack. On top of this representation, SABLE accelerates repeated power flow computations through reusable sparse templates, custom GPU kernels, a cuDSS-based sparse-direct LU solver, and mixed-precision techniques. Extensive experiments show that SABLE improves standalone power flow solving throughput by up to 253.4$\times$ over pandapower and 5.7$\times$ over ExaPF. In end-to-end training, evaluated on AC optimal power flow learning models based on DC3 and DeepLDE, SABLE expands the feasible training batch range by up to 64$\times$ and improves training throughput by up to 206.7$\times$ over the corresponding baseline.
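
The block-diagonal embedding described in the abstract can be illustrated with a small pure-Python sketch. It assumes every Jacobian in the batch shares one sparsity pattern, stored in CSR form, and stacks the blocks into a single 2D CSR matrix whose index arrays are built once and reused while only the values change. The function names and the CSR layout are illustrative assumptions, not SABLE's API, and nothing here runs on a GPU.

```python
def build_template(indptr, indices, n, batch):
    """Replicate one n-by-n CSR pattern along the diagonal `batch` times.

    Returns (big_indptr, big_indices) for a (batch*n)-by-(batch*n) matrix.
    """
    nnz = len(indices)
    big_indptr = [0]
    big_indices = []
    for b in range(batch):
        for row in range(n):
            start, end = indptr[row], indptr[row + 1]
            # Shift column indices into the column range of block b.
            big_indices.extend(col + b * n for col in indices[start:end])
            big_indptr.append(b * nnz + end)
    return big_indptr, big_indices


def fill_values(batched_values):
    """Flatten per-sample value arrays (batch x nnz) into one value array."""
    return [v for sample in batched_values for v in sample]


def matvec(indptr, indices, values, x):
    """CSR matrix-vector product, used here only to check the embedding."""
    return [
        sum(values[k] * x[indices[k]] for k in range(indptr[r], indptr[r + 1]))
        for r in range(len(indptr) - 1)
    ]


# Shared 2x2 pattern with nonzeros at (0,0), (0,1), (1,1); batch of two samples.
indptr, indices, n = [0, 2, 3], [0, 1, 1], 2
big_indptr, big_indices = build_template(indptr, indices, n, batch=2)
values = fill_values([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
result = matvec(big_indptr, big_indices, values, [1.0, 1.0, 1.0, 1.0])
```

Because the index arrays depend only on the pattern and the batch size, a new batch of Jacobians needs only a fresh `values` array, which is the kind of reuse the abstract attributes to the fixed-pattern template.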

2606.07085 2026-06-08 cs.SE New submission

Porting Declarative UI to HarmonyOS: A Heuristic-guided LLM Approach

Kunwu Zheng, Pengyu Xue, Zhen Yang, Xiran Lyu, Peishi Lai, Mengying Zhao, Yutian Tang, Huizhi Zhang, Xianhang Li, Linhao Wu, Chengyi Wang

AI summary: To meet HarmonyOS's need to migrate declarative UIs from Android and iOS, proposes ArkTrans, which heuristically constructs skeletons and repairs syntax errors via pattern matching, achieving a high compilation success rate and visual fidelity.

Abstract

As an emerging operating system, HarmonyOS has a significant demand for software migration from platforms such as Android and iOS, in which User Interface (UI) translation is a critical step. However, the latest UI development has shifted to declarative paradigms, e.g., Kotlin Jetpack Compose (KJC) for Android, SwiftUI for iOS, and ArkUI for HarmonyOS, rendering prior translation approaches inapplicable, as they target either backend logic or legacy imperative UIs. This paper therefore targets ArkUI and proposes an automatic translation approach, ArkTrans, to port UI files from Android and iOS to HarmonyOS. ArkTrans overcomes two salient challenges in translation: (1) Programming Language (PL) unfamiliarity, and (2) severe syntactic chaos. For the first challenge, ArkTrans heuristically constructs ArkUI skeletons by extracting metadata from the source PL, thereby guiding the LLMs' initial translation. For the second, ArkTrans applies empirically derived post-fixing rules via pattern matching to repair most of the remaining syntactic errors. To examine the effectiveness of ArkTrans, we construct a 100-sample parallel UI page translation benchmark from KJC/SwiftUI to ArkUI at the file level. Extensive experiments demonstrate that LLMs with direct or one-shot prompting cannot translate a single compilable UI page. In contrast, up to 90.67% of ArkTrans-translated files compile successfully with high visual fidelity.

2606.07078 2026-06-08 cs.CG New submission

HRsR: Hierarchical Rotation System Reconstruction

Ruiqi Cui, Cem Akarsubaşı, Emil Toftegaard Gæde, Eva Rotenberg, Leif Kobbelt, J. Andreas Bærentzen

AI summary: Proposes Hierarchical Rotation System Reconstruction (HRsR), which accelerates Rotation System Reconstruction (RsR) through a hierarchical pipeline of edge collapses and vertex splits, achieving up to 6x speedup and more than 8x memory reduction while preserving geometric fidelity and topology control.

Abstract

Surface reconstruction from point clouds remains challenging when both geometric fidelity and topology control are required. Rotation System Reconstruction (RsR) reconstructs triangle meshes from point clouds while explicitly controlling topology through the Euler characteristic, but its sequential edge insertion limits scalability. We present Hierarchical Rotation System Reconstruction (HRsR), which accelerates RsR through a hierarchical pipeline of edge collapses and vertex splits. HRsR first simplifies the input using a $k$-nearest neighbor graph, performs reconstruction on the reduced structure, and then restores geometric detail while preserving topology. To maintain geometric consistency, we incorporate intersection handling and quality-driven vertex split selection. Experiments demonstrate up to a $6\times$ speedup and more than $8\times$ reduction in memory usage over RsR, while achieving comparable reconstruction results.
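
The abstract says RsR controls topology through the Euler characteristic. As a reminder of what that quantity measures, the sketch below computes $\chi = V - E + F$ for a triangle mesh and derives the genus of a closed orientable surface from $\chi = 2 - 2g$. The function name and the mesh representation (vertex-index triples) are assumptions made for this illustration; the code is not taken from RsR or HRsR.

```python
def euler_characteristic(triangles):
    """Return V - E + F for a triangle mesh given as vertex-index triples."""
    vertices = set()
    edges = set()
    for a, b, c in triangles:
        vertices.update((a, b, c))
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each undirected edge once, regardless of orientation.
            edges.add((min(u, v), max(u, v)))
    return len(vertices) - len(edges) + len(triangles)


# A tetrahedron is a closed surface with the topology of a sphere.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
chi = euler_characteristic(tetrahedron)
genus = (2 - chi) // 2  # valid for closed orientable surfaces only
```

For the tetrahedron this gives 4 vertices, 6 edges, and 4 faces, so $\chi = 2$ and the genus is 0.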

2606.07071 2026-06-08 cs.IR New submission

Decision-Theoretic Stopping Rules for Document Screening

Aaron H. A. Fletcher, Mark Stevenson

AI summary: Addresses the question of when to stop document screening by proposing three stopping policies based on decision theory and the Expected Value of Perfect Information, which achieve higher net utility than existing methods on patent examination and systematic review tasks.

Abstract

Deciding when to stop reviewing the results of a search is a common problem with multiple applications. Existing stopping rules developed within Technology-Assisted Review (TAR) aim to achieve a pre-specified recall target and do not take into account the reason for examining the results, potentially leading to sub-optimal recommendations. This paper applies decision theory to the problem and uses it to derive three practical stopping policies based on the Expected Value of Perfect Information. The approach is applied to two professional search tasks: patent examining and systematic reviewing. Experiments on CLEF-IP and medical systematic review datasets show that the proposed approach generally produces more appropriate stopping decisions than existing methods, as demonstrated by higher net utility under the evaluated cost and payoff settings.
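
The abstract does not spell out the three stopping policies, so the sketch below shows only the standard quantity they build on. For a finite decision problem, the Expected Value of Perfect Information is $\mathbb{E}_\theta[\max_a U(a,\theta)] - \max_a \mathbb{E}_\theta[U(a,\theta)]$. The two-state, two-action screening example, its probabilities, and its payoffs are invented for illustration.

```python
def evpi(prior, utility):
    """Expected Value of Perfect Information for a finite decision problem.

    prior:   dict mapping state -> probability
    utility: dict mapping action -> dict mapping state -> utility
    """
    # Best expected utility when acting on the prior alone.
    best_without = max(
        sum(prior[s] * u[s] for s in prior) for u in utility.values()
    )
    # Expected utility if the true state were revealed before acting.
    best_with = sum(
        prior[s] * max(u[s] for u in utility.values()) for s in prior
    )
    return best_with - best_without


# Hypothetical numbers: continuing costs 2 units of effort and pays 6 if
# relevant documents remain, so its net utility is -2 or +4.
prior = {"none_left": 0.5, "some_left": 0.5}
utility = {
    "stop": {"none_left": 0.0, "some_left": 0.0},
    "continue": {"none_left": -2.0, "some_left": 4.0},
}
value = evpi(prior, utility)
```

One plausible reading is that a reviewer stops once this value falls below the cost of looking further, but that rule is an assumption here rather than a description of the paper's policies.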

2606.07060 2026-06-08 cs.DB New submission

Auto-Relate: A Unified Approach to Discovering Reliable Functional Relationships Leveraging Statistical Tests

Ziyan Han, Yeye He, Shuyuan Kang, Min Xie, Weiwei Cui, Song Ge, Haidong Zhang, Dongmei Zhang, Surajit Chaudhuri, Rui Mao, Jianbin Qin

AI summary: Proposes the Auto-Relate framework, which discovers reliable functional relationships in tables through a mine-then-verify process and four reliability criteria (accuracy, atomicity, stability, integrity), achieving an average PR-AUC of 0.87 on a benchmark built from 58,679 real-world tables.

Abstract

Tables in spreadsheets, computational notebooks, and databases often contain rich inter-column relationships. Yet these relationships are typically implicit and are often lost when tables are exported to standard formats. Recovering them can benefit downstream tasks, including table understanding, data quality improvement, and provenance analysis. However, simply mining relationships that hold on an observed table is insufficient, as many are spurious due to coincidence, redundancy, or limited data diversity. In this paper, we introduce functional relationships (FRs) as a unified notion for inter-column relationships in tables, subsuming arithmetic relationships, string transformations, and functional dependencies. We characterize FR reliability through four complementary criteria: accuracy, atomicity, stability, and integrity. Guided by these criteria, we propose Auto-Relate, a mine-then-verify framework that first generates accurate candidate FRs and then verifies the remaining reliability criteria through a Minimality Test, a Perturbation Test, and an Independence Test, respectively. To further improve efficiency, we develop three optimization strategies, including a group-by lower bound for early rejection, a closed-form speedup for arithmetic FRs, and a binomial bound for statistically guided early termination. We construct a large-scale benchmark suite from 58,679 real-world spreadsheets and relational tables, containing 6,414 ground-truth FRs spanning all three FR types. Extensive experiments against 18 baselines show that Auto-Relate consistently achieves the best performance, with an average PR-AUC of 0.87, 59% higher than the best competing baseline across all settings.
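
Functional dependencies are one of the three relationship types the abstract groups under FRs. The sketch below checks whether a dependency holds on an observed table and shows how a dependency can hold by coincidence when the data lacks diversity, which is the failure mode the abstract's verification stage is meant to catch. The function name and the toy table are assumptions for this illustration; Auto-Relate's own tests are not reproduced here.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the columns in `lhs` functionally determine `rhs` in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        # The first value seen for a key must match every later value.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


rows = [
    {"zip": "10001", "city": "New York", "qty": 2},
    {"zip": "10001", "city": "New York", "qty": 5},
    {"zip": "94105", "city": "San Francisco", "qty": 2},
]

genuine = holds_fd(rows, ["zip"], "city")        # holds on all rows
spurious = holds_fd(rows[:2], ["qty"], "zip")    # holds only on a small sample
refuted = holds_fd(rows, ["qty"], "zip")         # fails once a third row is seen
```

On the first two rows alone, `qty` appears to determine `zip`; the third row breaks that dependency, while `zip` still determines `city`.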

2606.07046 2026-06-08 cs.DC New submission

Predictive Autoscaling in Cloud-Native and Federated Cloud-Edge Computing Environments: A Taxonomy and Future Directions

Bablu Kumar, Anshul Verma, Rajkumar Buyya

AI summary: Systematically reviews predictive autoscaling techniques in cloud-native and federated cloud-edge environments, proposes a taxonomy based on triggers, targets, prediction models, and evaluation metrics, discusses mechanisms such as CRDs, MAPE control loops, and federated learning, and identifies future research directions.

Abstract

Autoscaling is a key capability in cloud-native systems, where dynamic workloads, heterogeneous environments, and latency-sensitive applications require efficient and adaptive resource management. Traditional reactive approaches based on fixed thresholds often respond too late, leading to resource imbalance, performance degradation, and unstable scaling behavior. Recent advances in predictive models, Kubernetes Custom Resource Definitions (CRDs), Monitor-Analyse-Plan-Execute (MAPE) based control loops, and federated learning (FL) have enabled more proactive and autonomous autoscaling strategies. This paper presents a structured review of these developments. It first introduces a taxonomy of autoscaling techniques based on triggers, targets, prediction models, and evaluation metrics. It then examines predictive autoscaling approaches and CRD-based mechanisms, including Kubernetes operators and reconciliation workflows. Further, it analyses autoscaling in federated learning environments, highlighting reactive and proactive strategies alongside privacy-preserving techniques and container-level isolation. The paper also discusses drift-aware and uncertainty-aware autoscaling, incorporating concepts such as the Autoscaling Drift Index (ADI), feedback-driven correction, and stability control for heterogeneous workloads. Finally, it outlines open challenges and future research directions, providing a foundation for next-generation intelligent predictive autoscaling in cloud-edge environments.

2606.07019 2026-06-08 cs.DC New submission

PCCL: Process Group-Aware Scalable and Generic Collective Algorithm Synthesizer

William Won, Kartik Lakhotia, Madhu Kumar, Sudarshan Srinivasan, Tushar Krishna

AI summary: Proposes the PCCL framework, which is process group-aware and topology-aware and automatically generates near-optimal algorithms for arbitrary collective patterns (such as All-to-All), significantly improving the efficiency of collective communication in distributed training.

Comments: Contains 11 main pages, 19 figures, three tables, three algorithms

Abstract

Distributed machine learning has become increasingly important due to the massive scale of large-scale generative models. Both model parameters and data are distributed across many compute devices, which requires frequent collective communications to synchronize activations and parameter updates. Such collective communications have become a major bottleneck. While the performance of the collective algorithm depends on the physical network topology, the baseline collective algorithms in collective communication libraries are largely topology-agnostic. Collective algorithm synthesizers address this inefficiency by automatically generating topology-aware collective algorithms. However, prior works have largely overlooked that collective communication typically occurs only among a subset of devices, known as process groups. Additionally, most existing synthesizers are limited in the range of target collective patterns they can generate. We propose PCCL, a scalable and generic framework for synthesizing topology-aware collective algorithms. PCCL is process group-aware and capable of generating near-optimal collective algorithms even when only a subset of devices participates in collective operations. PCCL synthesizes arbitrary collective patterns, including 512-NPU All-to-All synthesis in 11.68 minutes.

2606.07009 2026-06-08 cs.CR cs.IT math.IT New submission

Fast Bounded-Independence Functions and Their Duals

Martijn Brehm, Yuval Ishai, Nicolas Resch

AI summary: Constructs fast functions with linear circuit size, realizing t-wise independent hash functions of optimal algebraic degree, improves constructions of fast codes and their duals, and gives the first family of fast linear functions that map any t linearly independent inputs to uniform and statistically independent outputs for any constant t, with applications to cryptography.

Comments: Full version of paper to appear in ITC 2026. 34 pages

Abstract

We continue the study of fast functions, computable by linear-size circuits, that share useful properties of random functions. Motivated by cryptographic applications, we generalize and improve on previous results in this area, obtaining the following results:
- For any constant $t$, we construct a fast $t$-wise independent hash function with algebraic degree $\log_2 t$ (over $\mathbb F_2$), simultaneously optimizing both asymptotic circuit size and degree.
- We simplify and improve a recent construction (ITCS 2026) of a family of fast codes with fast duals, both meeting the Gilbert-Varshamov bound. Unlike the previous construction, our construction has negligible failure probability, can accommodate general fields and rates, supports a systematic encoding, and admits fast universal encoders.
- We strengthen the above to support stronger random-like properties, such as optimal combinatorial list-decoding. This is achieved by constructing, for any constant $t$, a family of fast linear functions that map any $t$ linearly independent inputs to uniform and statistically independent outputs. Prior to our work, this was only known for $t=1$.
We demonstrate the usefulness of the above results to cryptography. This includes the first nontrivial protocols for perfectly secure multiparty computation whose circuit complexity scales linearly with the number of parties, as well as protocols for computing encrypted matrix-vector products with optimal asymptotic circuit complexity.
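
For readers unfamiliar with $t$-wise independence, the sketch below checks the property exhaustively for the classical textbook construction: a uniformly random polynomial of degree less than $t$ over a prime field $\mathbb F_p$. This is not the paper's construction, which works over $\mathbb F_2$ with linear-size circuits; it only illustrates the definition. The function names and the choice $p = 5$ are assumptions for the example.

```python
from itertools import combinations, product


def poly_hash(coeffs, x, p):
    """Evaluate the polynomial with coefficients `coeffs` at x over F_p (Horner)."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc


def is_t_wise_independent(t, p):
    """Exhaustively check t-wise independence of degree-<t polynomials over F_p."""
    for points in combinations(range(p), t):
        counts = {}
        # Enumerate every key, i.e. every coefficient vector of length t.
        for coeffs in product(range(p), repeat=t):
            outputs = tuple(poly_hash(coeffs, x, p) for x in points)
            counts[outputs] = counts.get(outputs, 0) + 1
        # Uniform and independent outputs: each of the p**t output tuples
        # must be produced by exactly one key.
        if len(counts) != p ** t or set(counts.values()) != {1}:
            return False
    return True


pairwise_ok = is_t_wise_independent(2, 5)
three_wise_ok = is_t_wise_independent(3, 5)
```

The check succeeds because a polynomial of degree less than $t$ is uniquely determined by its values at $t$ distinct points (Lagrange interpolation).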

2606.07005 2026-06-08 cs.CR New submission

The Sound of Malware: A Memory Forensics Approach for Android Malware Analysis via Audio Signals

Silvia Lucia Sanna, Massimo Palozzi, Leonardo Regano, Riccardo Lazzeretti, Giorgio Giacinto

AI summary: Proposes a memory forensics framework that converts the static bytecode and memory snapshots of Android malware into audio waveforms and applies spectral descriptors, CNNs, and Transformer embeddings, achieving up to 98.0% accuracy.

Abstract

Android malware analysis is currently facing increasing challenges in achieving robust classification and detecting stealth attacks. Modern threats employ advanced evasion strategies such as code obfuscation, dynamic loading, packing, and even steganographic manipulation of traditional static and dynamic features. These techniques reduce the effectiveness of signature-based systems and degrade the reliability of Machine Learning models that depend on explicit semantic indicators such as permissions, API calls, or control-flow structures. In this work, we propose a memory forensics malware detection framework that shifts the analysis perspective from semantic program modeling to signal-based structural representation. Both static bytecode and early-execution memory snapshots are transformed into audio waveforms through direct binary-to-waveform mapping, preserving low-level structural patterns without requiring disassembly or feature engineering. The resulting signals are processed using handcrafted spectral descriptors, Convolutional Neural Networks, and transformer-based embeddings. Experiments on the CICMalDroid2020 dataset and VirusTotal malware demonstrate that the framework achieves up to 98.0% accuracy, outperforming static sonification and competitive state-of-the-art approaches.
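
The abstract does not specify the binary-to-waveform mapping. One of the simplest possible readings is to treat each byte as one unsigned 8-bit PCM sample, which the sketch below does with Python's standard `wave` module. The function name, the sample rate, and the placeholder payload are assumptions for illustration, not the authors' pipeline.

```python
import io
import wave


def bytes_to_wav(data, sample_rate=8000):
    """Wrap raw bytes as a mono, 8-bit unsigned PCM WAV file held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)             # mono
        wav.setsampwidth(1)             # one byte per sample
        wav.setframerate(sample_rate)   # assumed rate; any value would work
        wav.writeframes(data)
    return buffer.getvalue()


# Placeholder standing in for bytecode or a memory snapshot.
payload = bytes(range(256))
wav_bytes = bytes_to_wav(payload)

# Reading the file back recovers the original bytes unchanged.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    frame_count = wav.getnframes()
    recovered = wav.readframes(frame_count)
```

Because the mapping is lossless, any structure in the byte stream is carried into the signal, where spectral features or learned embeddings could then be computed.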

2606.06995 2026-06-08 eess.SY cs.SY New submission

Power Grid Topology Control

Tong Han, Yan Xu, David J. Hill

AI summary: Surveys the development of power grid topology control, covering steady-state topology control, topology transition, and transient topology control, with the aim of using network-side flexibility to address the challenges of renewable integration.

Abstract

Power grids are facing major challenges from growing renewable integration and worsening climate impacts. While flexibility on both the demand and generation sides has been widely explored to address these challenges, network-side flexibility, especially in network topology, remains highly underutilized. Advances in communication, power electronics, and circuit breakers have made network topology increasingly controllable. However, leveraging this topological flexibility poses substantial challenges, primarily due to the inherent non-convexity and hybrid dynamics in associated optimization and control problems. This monograph surveys the development of power grid topology control in both early and recent years. It begins by discussing the fundamental topological constraints involved in topology control problems. Subsequently, it introduces steady-state topology control for transmission and distribution networks separately, covering fundamentals, a state-of-the-art review, and representative recent advances. Additionally, the network topology transition problem, which addresses the implementation of optimal topology solutions and has garnered increasing attention in recent years, is further modeled and analyzed. Beyond utilizing the flexibility of steady-state network topology, controlling network topology during transients can also contribute to system stabilization. Traditional approaches, such as intentional controlled islanding for transmission networks, as well as recently developed topology control methods for microgrid stabilization, exemplify this concept. Finally, a summary of this monograph is provided.

2606.06989 2026-06-08 cs.GT New submission

Menu Selection: A Computational Approach to Minimizing Food Waste

Haris Aziz, Nicholas Mattei, Shivika Narang, Sanjukta Roy

AI summary: Introduces a collective decision-making problem in which a minimum-size menu is selected under two consumption models (optimistic and pessimistic) so that all agents receive enough food and waste is minimized; gives characterizations of valid menus, polynomial-time algorithms, and tight bounds on the waste-of-pessimism ratio.

Abstract

We introduce a novel collective decision making problem that captures the ubiquitous issue of ordering food to cater for varied dietary preferences and requirements. Our settings involve agents with diverse dietary requirements over menu options with varied serving sizes. The goal is to select a menu where everyone has enough food they can consume and wastage of food is minimized. We introduce two different consumption models: optimistic and pessimistic. Optimistic consumption assumes a situation when a central planner can optimally allocate the food ordered among the agents to maximize the number of people who get enough to eat. Pessimistic considers the worst case guarantee on consumption when agents fill their own plates in an arbitrary order. Under either consumption model, we seek valid menus (under which all agents are sufficiently fed) of minimum size. Our work provides two sets of characterizations: (1) we characterize valid menus under either consumption model and (2) we characterize the space of instances that admit polynomial-time algorithms to find minimum sized menus. Our results also help us design Integer Linear Programs to find minimum sized menus in general settings. Furthermore, we present polynomial-time algorithms for important special cases. We then consider the worst case discrepancy between the size of minimum sized optimistic and pessimistic menus. We call this the waste of pessimism, captured by the ratio of the minimum sized pessimistic menu to that of the minimum sized optimistic menu. We show tight upper bounds on this ratio. Our results also provide additional insights on the problem of finding a minimum sized maximal matching, which may be of independent interest.

2606.06970 2026-06-08 cs.IR New submission

SSRLive: Live Streaming Recommendation with Dynamic Semantic ID

Teng Shi, Zhaoheng Li, Yuanhang Qu, Yi Liu, Lixiang Lai, Yuning Jiang

AI summary: Addresses two problems in live streaming recommendation, static semantic IDs that cannot reflect dynamic content changes and generative methods that ignore user-streamer interaction signals, by proposing the SSRLive framework, which combines a generative module (dynamic semantic IDs) with a discriminative module (interaction augmentation) and significantly improves watch time, GMV, and other metrics in real-world deployment.

Abstract

Live streaming has emerged as one of the fastest-growing forms of online media, enabling instant content broadcasting and real-time engagement between users and streamers. Despite the effectiveness of existing recommendation algorithms in this domain, they often suffer from limited utilization of computational resources, with low FLOPs that hinder further performance enhancement. Generative recommendation techniques, which have gained traction in various industrial tasks, offer a promising avenue for improving live streaming recommendations. However, directly applying generative methods to live streaming is non-trivial due to two major challenges: (1) static semantic IDs (SIDs) cannot reflect the rapidly changing nature of live room content; and (2) generative pipelines generally do not incorporate user--streamer interaction signals (e.g., likes, orders), which are critical for modeling user intent toward both the streamer and showcased products. To address these challenges, we introduce SSRLive: Dynamic Semantic ID-guided Streaming Recommendation for Live platforms. The proposed framework integrates a generative module and a discriminative module in a unified architecture. The generative component employs an encoder-decoder design to produce both static and dynamic SIDs, enabling timely representation of live room content while leveraging multimodal information. The discriminative component refines task-specific representations by combining SIDs with user features, augments them with user-streamer interaction data, and performs multi-task predictions. Online A/B tests in real-world deployment demonstrate tangible benefits: watch time (+3.38%), GMV (+0.72%), follower growth (+3.12%), and interaction volume (+2.92%). These improvements highlight the effectiveness and business value of SSRLive, which is now fully deployed, serving hundreds of millions of active users.

2606.06968 2026-06-08 cs.CR New submission

HAVE: Host Active Verification Engine for Closing the Contextual Reality Gap in Security Digital Twins

Vincenzo Sammartino, Marco Pasquini

AI summary: Proposes the HAVE engine, which uses a safety-constrained host agent and maximum-likelihood estimation to measure the empirical probability of compromise, and applies a Wilson-interval confidence weight and a Bayesian blending rule to correct the Contextual Reality Gap caused by CVSS scores; experiments show reach probability reduced by 38.2% in false-positive scenarios and increased by 132.4% in false-negative scenarios.

Comments: This work has been submitted to the IEEE for possible publication

Abstract

Security Digital Twins (SDTs) provide continuously updated virtual replicas of infrastructure for threat simulation, yet they rely on theoretical CVSS scores to assign lateral-movement probabilities, creating the Contextual Reality Gap: risk is overestimated where unacknowledged mitigations neutralize exploits, and drastically underestimated where logic flaws bypass all memory-safety defenses. We present the Host Active Verification Engine (HAVE), an SDT extension that deploys a safety-constrained host agent to measure the empirical probability of compromise $\hat{p}$ via maximum-likelihood estimation over snapshot-isolated Bernoulli trials. A Wilson interval-width confidence weight $\alpha_w$ propagates $\hat{p}$ into Monte Carlo simulations via a Bayesian blending rule formally related to the Beta-Binomial posterior. Evaluation across four vulnerability classes, three security tiers, and two production binaries shows HAVE reduces $P_{\text{reach}}$ by 38.2% in false-positive scenarios and increases it by 132.4% in false-negative scenarios, with a net +124.1% correction; post-HAVE estimates vary by only $1.12\times$ across calibration exponents $\kappa$, versus $4.6\times$ for CVSS-only baselines.
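
The ingredients named in the abstract can be sketched as follows. The maximum-likelihood estimate for Bernoulli trials is $\hat{p} = k/n$, and the Wilson score interval is the standard formula. The blending rule, however, is not given in the abstract: the form used below, weight $(1 - \text{width})^{\kappa}$ applied to $\hat{p}$ against a CVSS-derived prior, is purely an assumption chosen so that narrower intervals trust the measurement more. Function names, the trial counts, and the prior value are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    )
    return center - half, center + half


def blend(p_cvss, successes, trials, kappa=1.0):
    """ASSUMED blending rule, not the paper's: weight p_hat by (1 - width)**kappa."""
    p_hat = successes / trials  # maximum-likelihood estimate
    low, high = wilson_interval(successes, trials)
    alpha = (1 - (high - low)) ** kappa
    return alpha * p_hat + (1 - alpha) * p_cvss


# Hypothetical case: CVSS suggests 0.9, but 20 isolated trials never succeed.
low, high = wilson_interval(5, 10)
blended = blend(p_cvss=0.9, successes=0, trials=20)
```

In this made-up case the blended probability drops well below the CVSS-derived value, which mirrors the false-positive correction the abstract describes, though the magnitude depends entirely on the assumed weighting.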

2606.06955 2026-06-08 cs.NI New submission

i2Slicer: Enabling Flexible and Automated Orchestration of 5G SA End-to-End Network Slices

M. Catalan-Cid, A. Fernandez, D. Camps-Mur, S. Siddiqui

AI summary: Proposes i2Slicer, a flexible solution for orchestrating 5G standalone end-to-end network slices with multi-tenancy and multi-service support, simplifying slice deployment through automated lifecycle management.

Journal ref: 2023 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN)

Abstract
5G network slicing implies a step forward in customizing radio access and core networks by allowing the creation of logical networks adapted to service requirements. In addition, softwarisation has fueled the emergence of 5G solutions which do not require specialized hardware platforms. Therefore, a key requirement to drive the adoption of 5G slicing by verticals is to simplify its management through automated orchestration. In this paper, we present i2Slicer, a flexible solution to orchestrate the deployment of 5G standalone end-to-end network slices with multi-tenancy and multi-service capabilities. The implementation and evaluation of i2Slicer using state-of-the-art 5G software and hardware demonstrate that it offers a practical and efficient lifecycle management of network slices.

2606.06947 2026-06-08 cs.IR New submission

DREAM: Dynamic Refinement of Early Assignment Mappings

Liwei Guan, Huanjie Wang, Hongwei Zhang, Linxun Chen, Zhaojie Liu

AI summary: Targets the performance bottleneck that static codes impose on cold-start items in SID-based generative recommendation by proposing the DREAM framework, which refines assignments progressively in three stages (an intent-aware tokenizer, frozen-backbone evaluation, and a dynamic beam mechanism) and substantially improves cold-start recommendation.

Comments: 12 pages, 4 figures, 5 tables

Abstract

Generative recommendation advances item retrieval by reformulating it as autoregressive generation of Semantic IDs (SIDs), compact token sequences that encode item semantics. While SIDs offer a strong semantic prior, current SID-based methods assign each item a single static identifier through offline tokenization before sufficient user feedback is observed. For cold-start items, this one-shot commitment produces poorly discriminative codes, generating misaligned paths that remain unrefined because the associated tokens are rarely sampled during training. We identify this early static commitment, not model capacity, as the fundamental cold-start bottleneck in SID-based generative recommendation. To overcome this bottleneck and bridge the disjoint objectives of tokenization and generation, we propose DREAM (Dynamic Refinement of Early Assignment Mappings), a three-stage framework that resolves this flaw through progressive refinement. First, an intent-aware tokenizer rebuilds the SID space through counterfactual contrastive learning, generating a diverse pool of behavior-aligned candidates per cold-start item. Second, the frozen recommendation backbone serves as an evaluator, selecting the most reliable candidate based on multi-context user support without retraining. Third, a dynamic beam mechanism maintains multiple weighted SID hypotheses throughout training and inference, preventing premature collapse to a single assignment. Extensive experiments on three Amazon benchmarks show that DREAM substantially outperforms state-of-the-art generative and sequential baselines on cold-start metrics.

2606.06936 2026-06-08 cs.HC New submission

Personality Anchoring for Social Simulation: Linking Personality, Social Behavior, and Interaction Success with LLM Agents

Vahid Sadiri Javadi, Aksa Aksa, Fryderyk Róg, Lucie Flek, Johanne R. Trippas

AI summary Proposes personality anchoring, which uses well-known movie characters and public figures to build multi-LLM social simulations. Finds a monotonic relationship between dyadic Agreeableness composition and shared goal achievement: Homogeneous-Agreeable pairs succeed at about 10 times the rate of Homogeneous-Disagreeable pairs (62% vs. 6%).

Abstract

Social interactions are shaped by the interplay of dispositional traits and situational context, yet systematically investigating how personality configurations between individuals jointly influence social behavior across diverse social contexts remains methodologically challenging. We address this gap by introducing a simulation pipeline adapted from the CHARISMA framework, which employs well-known movie characters and public figures as psychologically grounded agents for multi-LLM social simulation using a method we term personality anchoring. We present a large-scale empirical study examining how dyadic Agreeableness composition influences social interaction outcomes across 1,010 simulated conversations. Our results reveal a monotonic relationship between dyadic Agreeableness composition and shared goal achievement, with Homogeneous-Agreeable pairs achieving success 10 times the rate of Homogeneous-Disagreeable pairs (62% vs. 6%). Behavioral mediation analysis reveals that Agreeableness shapes goal achievement partially through cooperative strategy selection, though it continues to predict outcomes within the same dominant strategy, indicating pathways beyond observable conversational behavior. Robustness analyses confirm high consistency of results across repeated simulations (ICC = 0.89) and stable personality expression across diverse scenarios, validating personality anchoring as a viable operationalization strategy.

2606.06932 2026-06-08 eess.SY cs.SY New submission

Forecast and Model Predictive Control of Distributed Energy Resource Aggregators for Net-Demand Balancing

Obai Bahwal, Oliver Kosut, Lalitha Sankar

AI summary Proposes combining forecasting with model predictive control, treating distributed energy resource aggregators as virtual batteries that track net-demand patterns via rolling-horizon MPC, and analyzes the effects of forecasting horizon, MPC update rate, and forecasting-model choice.

Abstract

With energy demand growing rapidly, even the incorporation of bulk renewable energy sources is not sufficient to meet demand, and it adds supply uncertainty. Distributed Energy Resource Aggregators (DERAs) have the potential to address this uncertainty via aggregation and control of decentralized distributed energy sources, thereby acting like virtual power plants. We present a new approach that combines forecasting and model-predictive control to assign DERAs to follow net-demand patterns, while accounting for the dynamics of the aggregate energy sources and their capacity limits. Each DERA is represented as a flexible "virtual battery" with constraints on state of charge and power. The dispatch problem is set up as a long-term model predictive control task that aims to minimize differences from desired charge levels, output ramping, and net-load tracking errors. To keep operations efficient in real time, we implement a rolling-horizon MPC, which updates decisions regularly using the latest marginal-demand forecasts. For forecasting, we present two models: linear regression and a long short-term memory (LSTM) neural network. Using high-resolution CAISO net-demand data and five typical DERA types, our simulations demonstrate how well our approach tracks marginal demand; in particular, we highlight the tradeoffs between forecasting horizon and MPC update rate as well as the dependence on the choice of the load forecasting model. Our results also indicate a slight edge for LSTM models over linear regression for desired time shifts and horizon choices.
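The rolling-horizon idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a single virtual battery, a unit time step, a small discrete set of power levels searched by brute force instead of a real optimizer, the three cost weights, and the hypothetical `perfect_forecast` helper standing in for the regression or LSTM forecaster.

```python
from itertools import product

def plan(soc, prev_p, forecast, levels, cap, soc_ref,
         w_track=1.0, w_ramp=0.1, w_soc=0.05):
    """Brute-force the cheapest feasible power sequence over the horizon."""
    best_cost, best_seq = float("inf"), None
    for seq in product(levels, repeat=len(forecast)):
        s, p_prev, cost, feasible = soc, prev_p, 0.0, True
        for p, d in zip(seq, forecast):
            s -= p  # discharging (p > 0) lowers the state of charge
            if not 0.0 <= s <= cap:
                feasible = False  # state-of-charge limit violated
                break
            cost += (w_track * (d - p) ** 2        # net-demand tracking error
                     + w_ramp * (p - p_prev) ** 2  # output ramping
                     + w_soc * (s - soc_ref) ** 2) # deviation from desired charge
            p_prev = p
        if feasible and cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

def perfect_forecast(demand, t, horizon):
    """Hypothetical forecaster: returns the true next values."""
    return demand[t:t + horizon]

def rolling_horizon(demand, forecast_fn, horizon, soc0, levels, cap, soc_ref):
    """Re-plan at every step and apply only the first planned move."""
    soc, prev_p, applied, socs = soc0, 0.0, [], []
    for t in range(len(demand)):
        seq = plan(soc, prev_p, forecast_fn(demand, t, horizon),
                   levels, cap, soc_ref)
        p = seq[0]
        soc -= p
        prev_p = p
        applied.append(p)
        socs.append(soc)
    return applied, socs

LEVELS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # must contain 0.0 so idling is always feasible
demand = [1.0, 1.0, -1.0, -1.0, 0.5, 0.0]
applied, socs = rolling_horizon(demand, perfect_forecast, 3, 2.0, LEVELS, 4.0, 2.0)
print(applied, socs)
```

Swapping `perfect_forecast` for a noisier forecaster, or changing `horizon`, is one way to probe the horizon versus update-rate trade-off the abstract mentions, though this toy says nothing about the paper's actual results.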

2606.06917 2026-06-08 cs.ET New submission

Belief-Aware Scheduling for Predictive Wildfire Hazard Mapping under Sparse-Window Telemetry

Xun Shao, Kohsuke Yamakawa, Cheah Wai Shiang

AI summary For edge nodes with constrained telemetry, proposes pairing a structured belief with a scheduler and evaluates it in a physics-calibrated synthetic environment; a lightweight cross-region attention encoder exceeds the FAIR activity-paced reference by about 28% on the default landscape.

Abstract

An edge node monitoring a wildfire observes more than a duty-limited or windowed down-link can carry. The receiver must predict the H-step-ahead hazard map from whatever the link delivers. We argue the operative design problem is not which neural architecture to use but how to derive a structured belief sufficient for the receiver's prediction task and maintain it through a scheduler that anticipates future transmission opportunities. We formalize this as a partially observed sequential allocation problem with three coupled per-region action axes (sensing, representation, transmission), and derive each component of the structured belief from the H-step forward operator's input requirements. Identifying these mechanisms requires independent control over the window period P, per-window capacity C, predictive horizon H, and fuel composition, which is not separable in real-landscape data; we therefore evaluate on a physics-calibrated synthetic environment. Three empirical observations support the principle: the gap between a non-myopic activity-paced reference and uniform pacing is unimodal in window-period sparsity, peaking at intermediate spacing; when the structured belief is ablated, the dominant operative component flips between a default landscape (temporal staleness) and a structured landscape (static-risk prior), while the per-cell intensity belief is redundant in both; and a 40k-parameter lightweight cross-region attention encoder exceeds the FAIR activity-paced reference by ~28% on the default landscape and ~11% on the structured landscape. A deeper Transformer encoder does not improve over the lightweight encoder in mean predictive loss and exhibits higher training-seed variance. Within this task class and regime, a modest architectural inductive bias suffices when the belief and the scheduling problem are correctly posed.

2606.06914 2026-06-08 cs.CR New submission

DPAgent-in-the-Middle: Agentic Defense and Repair Against AI-Groomed Deceptive Patterns

Zewei Shi, Ruoxi Sun, Haoyang Li, Seong Oun Hwang, Feng Liu, Minhui Xue, Xingliang Yuan

AI summary Targeting privacy deceptive patterns in web interfaces, proposes DPAgent, a framework in which four specialized agents combine latent space purification with defensive prompting to proactively detect and repair deceptive interfaces; it detects 90.98% of groomed samples and repairs 77% of detected deceptive interfaces.

Abstract

Privacy deceptive patterns in web interfaces systematically manipulate users into disclosing personal data, yet existing defenses are fragmented, static, and increasingly vulnerable to manipulation by large language models. Moreover, data voids, areas of information scarcity within the web ecosystem, create fertile ground for adversaries to inject misleading content that can be scraped and learned by AI systems, thereby amplifying both deceptive design and model misbehavior. In this paper, we formalize a new threat model, AI grooming, where attackers exploit data voids to seed benign-looking but malicious samples that corrupt model reasoning and normalize deceptive practices. To address this threat in privacy deceptive patterns, we present DPAgent, an agentic and reasoning-aware framework that orchestrates four specialized agents to mitigate the AI Grooming threat via a proactive defense that combines latent space purification with defensive prompting and operates directly in live web environments to proactively explore, detect, and repair privacy deceptive user interfaces before they reach end users. Extensive evaluations show that DPAgent detects 90.98% of groomed samples, achieves state-of-the-art privacy deceptive pattern detection with a micro F1 of 0.816, explores over 80% of pattern types while visiting only about 10% of the pages required by baselines, and successfully repairs 77% of detected deceptive interfaces. A large-scale study of 485 websites in the wild reveals that up to 98% contain at least one privacy deceptive pattern, over 90% of which can be mitigated by DPAgent. User studies further confirm that DPAgent effectively reduces privacy risks while preserving browsing experience. Our results demonstrate the promise of agent-in-the-middle defenses for securing the web UI supply chain against deceptive design and emerging AI threats rooted in data void exploitation.

2606.06912 2026-06-08 cs.SE New submission

From Custom Logic to APIs: Understanding and Recommending API Replacement Refactorings

Bridget Nyirongo, Yanjie Jiang, Yuxia Zhang, Hui Liu

AI summary Mines patterns of API replacement refactorings through an empirical study and proposes AKIRA, a hybrid framework that combines pattern-deterministic heuristics with a refactoring-aware knowledge base; it reaches 90% recall and 88% precision when recommending API replacement refactorings on a manually curated dataset.

Abstract

Software refactoring is essential for maintaining code quality. However, API replacement refactoring, which replaces custom logic with API calls, remains underexplored. Existing refactoring tools provide limited support for detecting such opportunities because they rely on predefined templates and have difficulty capturing complex, multi-statement semantic equivalents. To address this limitation, we conduct the first empirical study of API replacement refactorings by mining 166,299 commits across six open-source Java projects and manually analyzing a curated subset of 1,800 commits, from which we identify 366 validated instances to characterize their scope, categories, and recurring patterns. Based on these insights, we propose AKIRA (Adaptive Knowledge Discovery and Retrieval), a hybrid framework that integrates pattern-deterministic heuristics with a refactoring-aware knowledge base to assess the practical feasibility of recommending API replacement refactorings. Our evaluation shows that AKIRA achieves 90% recall and 88% precision on a manually curated dataset. Furthermore, on the external RETIWA dataset, AKIRA significantly improves the state of the art by increasing recall from 21% to 81% and precision from 40% to 78%. These results demonstrate the effectiveness of combining static pattern matching with semantic reasoning to support the automation of recommending complex API replacement refactorings.

2606.06910 2026-06-08 cs.DC math-ph math.MP New submission

Communication Strategy Selection for Multi-GPU 3D FDTD with Convolutional Perfectly Matched Boundary Layers

Victory C. Obieke

AI summary For multi-GPU 3D FDTD computation with CPML boundary conditions, studies the speedup of direct GPU-to-GPU peer exchange over host-staged exchange and evaluates the effect of enlarged ghost regions.

Abstract

In this paper we describe a communication-strategy study for multi-GPU three-dimensional finite-difference time-domain computation with convolutional perfectly matched layer boundary conditions using CUDA. The metrics used to determine the most effective implementation include runtime, throughput in millions of output points per second, strong-scaling efficiency, CPML overhead, host-staged versus direct GPU-to-GPU exchange speedup, and enlarged-ghost speedup. On a single NVIDIA Quadro RTX 6000 GPU, the CPML implementation sustains 2,889-3,290 million output points per second with less than 1% boundary-layer overhead, providing the single-GPU baseline for the multi-GPU study. The results show that direct GPU-to-GPU peer exchange is the dominant optimization with a 2.46-2.76× speedup over host-staged exchange, while enlarged ghost regions give only modest benefits because the reduced communication frequency is partly offset by redundant computation and additional memory traffic. On NVIDIA Quadro RTX 8000 GPUs, the implementation gives up to a 1.51× speedup on two GPUs for the tested strong-scaling cases, while four GPUs enable larger grids that approach or exceed single-GPU memory capacity.
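The enlarged-ghost trade-off described above can be seen in a toy setting. The sketch below is an illustration under loud assumptions, not the paper's implementation: it runs on the CPU, uses a 1D explicit diffusion stencil as a stand-in for the 3D FDTD update, and splits the domain into just two subdomains. With ghost width `g`, the halos only need refreshing every `g` steps, but each subdomain then updates `g` extra cells per step, which is the redundant computation the abstract refers to.

```python
def step(u, c=0.25):
    """One explicit diffusion step; the two end cells are held fixed."""
    interior = [u[i] + c * (u[i - 1] - 2 * u[i] + u[i + 1])
                for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]

def run_single(u, steps):
    """Reference: the whole domain updated in one piece."""
    for _ in range(steps):
        u = step(u)
    return u

def run_split(u, steps, g):
    """Two subdomains with ghost width g, exchanging halos every g steps.

    Requires g <= len(u) // 2 so each side owns at least g cells.
    """
    mid = len(u) // 2
    left, right = u[:mid + g], u[mid - g:]  # owned cells plus g ghost cells
    exchanges = 0
    for s in range(steps):
        if s % g == 0:
            # Refresh each side's ghosts from the neighbor's owned cells.
            left_ghost = right[g:2 * g]       # global cells [mid, mid + g)
            right_ghost = left[mid - g:mid]   # global cells [mid - g, mid)
            left[mid:] = left_ghost
            right[:g] = right_ghost
            exchanges += 1
        # Between exchanges one more ghost cell goes stale per step,
        # so after g steps exactly the owned cells are still valid.
        left, right = step(left), step(right)
    return left[:mid] + right[g:], exchanges

u0 = [float((i * 7) % 5) for i in range(32)]
reference = run_single(u0, 8)
for g in (1, 2, 4):
    result, exchanges = run_split(u0, 8, g)
    print(g, exchanges, max(abs(a - b) for a, b in zip(result, reference)))
```

Each run reproduces the single-domain result while the exchange count falls as `g` grows, which is the benefit enlarged ghost regions buy; the per-step work rises with `g`, which is why the net gain can be modest.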

2606.06894 2026-06-08 cs.CR New submission

FDM: A Framework for Decision-making to build ML-based Malware detection systems

Tadiwa Vhito, Jakapan Suaboot, Warodom Werapun, Norrathep Rattanavipanon

AI summary Proposes the FDM framework, which uses the Weighted Configuration Compatibility Score (WCCS), a multi-criteria scoring function, to map five operational parameters to ranked recommendations across nine configuration dimensions; experiments confirm that the optimal ML configuration depends on the deployment context.

Comments 18 pages, 5 figures, 14 tables

Abstract

Selecting appropriate machine learning (ML) configurations for malware detection is a complex, multi-criteria problem. Model choice, feature engineering, and update mechanisms must jointly satisfy operational constraints that vary across deployment contexts. This paper proposes the Framework for Decision-making (FDM) to build ML-based malware detection systems. The FDM formalises this selection process using the Weighted Configuration Compatibility Score (WCCS), a multi-criteria scoring function mapping five operational parameters (platform constraint, resource budget, response latency, update frequency, and detection sensitivity) to ranked recommendations across nine configuration dimensions. To validate the framework, four experiments were conducted on three datasets (a private Windows API dataset, the public Malimg image benchmark, and an Android static API dataset). Key results include: (i) XGBoost achieved the best accuracy-to-resource ratio in binary classification (97.46% test accuracy, <70 MB RAM), outperforming LSTM/BiLSTM, which consumed up to 2.8 GB; (ii) in multi-class classification, classical models (XGBoost 79.03%) outperformed recurrent deep models (BiLSTM 72.27%), reversing the binary ranking; (iii) class-incremental learning with EfficientNetB0 maintained 99.13% accuracy with only 0.65 pp degradation across 11 incremental steps; (iv) transfer learning reduced training time by 2.14 times on average for image-based malware data without significant accuracy cost; and (v) autoencoder pre-processing yielded a 14 times training speedup at a cost of only 0.86 pp accuracy. These findings confirm that the optimal ML configuration is context-dependent, validating the FDM's core premise and demonstrating its practical utility for cybersecurity practitioners.
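The abstract does not give the WCCS formula, so the following is only a sketch of what a weighted multi-criteria score of this kind might look like. The weighted-average form, the parameter keys, the candidate names, and every number below are assumptions invented for illustration; none are taken from the paper.

```python
# Hypothetical keys for the five operational parameters named in the abstract.
PARAMS = ("platform", "resources", "latency", "update_frequency", "sensitivity")

def wccs(weights, compatibility):
    """Weighted average of per-parameter compatibility scores in [0, 1]."""
    total = sum(weights[p] for p in PARAMS)
    return sum(weights[p] * compatibility[p] for p in PARAMS) / total

def rank(weights, candidates):
    """Order candidate configurations from most to least compatible."""
    return sorted(candidates, key=lambda name: wccs(weights, candidates[name]),
                  reverse=True)

# Made-up compatibility profiles for two model families.
candidates = {
    "xgboost": {"platform": 0.9, "resources": 0.9, "latency": 0.8,
                "update_frequency": 0.7, "sensitivity": 0.8},
    "bilstm": {"platform": 0.6, "resources": 0.3, "latency": 0.5,
               "update_frequency": 0.5, "sensitivity": 0.9},
}

# A resource-constrained deployment versus one that values only sensitivity.
edge_weights = {"platform": 1, "resources": 3, "latency": 2,
                "update_frequency": 1, "sensitivity": 2}
lab_weights = {"platform": 0, "resources": 0, "latency": 0,
               "update_frequency": 0, "sensitivity": 1}

print(rank(edge_weights, candidates))
print(rank(lab_weights, candidates))
```

The two weightings produce opposite rankings, which mirrors the abstract's point that the best configuration is context-dependent.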

2606.06882 2026-06-08 cs.GT cs.CE New submission

Learning to Strategically Acquire Resources in Competition

Safwan Hossain, Mirah Shi, Andrew Bennett, Neil Andrew Chriss, Michael Kearns, Anderson Schneider, Yuriy Nevmyvaka

AI summary Studies multiple agents competing over time to acquire a divisible resource; proposes a game-theoretic model, analyzes the Bayesian Nash equilibrium under different information assumptions, and gives conditions under which learning dynamics converge.

Abstract

We consider multiple agents competing to acquire some costly divisible resource (e.g. shares of a financial asset, compute resources, etc.) over time. Leveraging a standard model for price dynamics, we propose a novel game-theoretic model for this problem, generalizing settings studied in diverse literatures. Our analysis considers different assumptions on the information available to agents. Under partial-information with a common prior (which subsumes complete information as a special case), we establish the existence, uniqueness, and efficient computability of the Bayesian Nash equilibrium (BNE), and bound the price of anarchy. Next and more generally, we consider agents with no common prior learning to act optimally given realistic market feedback from repeated interactions. We provide sufficient conditions on agents doing simultaneous learning dynamics for last-iterate convergence to the BNE. For all settings, we provide simulations based on real financial data to illustrate our theoretical results and offer new insights on strategic behavior in the context of trading and resource acquisition.

2606.06880 2026-06-08 cs.IR New submission

Towards Retrieving Interaction Spaces for Agentic Search

Shengyao Zhuang, Yuansheng Ni, Hengxin Fun, Jimmy Lin, Xueguang Ma

AI summary Proposes RISE, which uses BM25 to construct a bounded interaction space and preprocesses documents for shell-style navigation; on BrowseComp-Plus it matches the pure-shell DCI baseline at 78% accuracy for roughly one quarter of the per-query cost.

Abstract

Retrieval for search agents is still inherited from non-agentic information retrieval: a retriever ranks the corpus and the agent reads a small set of returned documents. Recent direct corpus interaction (DCI) work shows that agents can instead interact with the raw corpus through shell tools such as grep and file reads. But unbounded interaction does not scale: every broad shell command is a scan over the whole corpus, and latency degrades sharply as the corpus grows. We argue that the role of retrieval for agentic search is not just to select documents that fit in the LLM context window, but to construct an interaction space: a bounded subset of the corpus the agent can explore with associated tools. Two design consequences follow. The space needs a boundary supplied by retrieval, and the objects within it should be processed for interaction. As a proof of concept, we propose RISE (Retrieving Interaction SpacE): we use BM25 to construct the interaction space; meanwhile, its documents are processed during indexing for shell-style navigation. On BrowseComp-Plus, RISE matches the pure-shell DCI baseline at 78% accuracy with gpt-5.4-mini at roughly one quarter of the per-query cost. At 1M documents, RISE-BM25 reaches 81% on gpt-5.4-mini, whereas DCI on gpt-5.4-nano degrades to 60% with 33 of 100 wall-clock failures.
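The core idea, retrieval supplying a boundary inside which shell-style tools then operate, can be sketched as follows. This is an illustration only, assuming a tiny in-memory corpus, whitespace tokenization, a textbook Okapi BM25 with a non-negative IDF variant, and a regex line search standing in for `grep`; the function names and documents are hypothetical, and RISE's actual indexing-time document processing is not modeled.

```python
import math
import re
from collections import Counter

def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Rank document ids by Okapi BM25 score for a whitespace-tokenized query."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(docs)
    avgdl = sum(len(toks) for toks in tokenized.values()) / n_docs
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for d, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[d] = score
    return sorted(scores, key=scores.get, reverse=True)

def interaction_space(query, docs, k):
    """Bounded subset of the corpus that the agent is allowed to explore."""
    return {d: docs[d] for d in bm25_rank(query, docs)[:k]}

def grep(pattern, space):
    """Shell-style line search restricted to the given set of documents."""
    rx = re.compile(pattern)
    return [(d, line) for d, text in space.items()
            for line in text.splitlines() if rx.search(line)]

docs = {
    "a.txt": "solar battery storage report\nfounded in 2019 in Oslo",
    "b.txt": "battery recycling plant\nopened 2021",
    "c.txt": "medieval history of bread\nwritten in 2019",
    "d.txt": "football results\nseason 2019",
}
space = interaction_space("solar battery", docs, k=2)
print(sorted(space))
print(grep("2019", space))
print(len(grep("2019", docs)))
```

Searching the bounded space touches two documents instead of four and returns only the on-topic hit, whereas the same pattern over the whole corpus also matches unrelated documents. At realistic corpus sizes that difference is the scan cost the abstract argues unbounded interaction cannot avoid.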

2606.06860 2026-06-08 cs.CR New submission

On the Incentive Compatibility of Block Propagation in Bitcoin

Fumichika Maeda, Akira Sakurai, Taishi Nakai, Kazuyuki Shudo

AI summary Studies Bitcoin miners' individual incentives to propagate blocks; derives reward expressions under different tie-breaking rules from a blockchain network model, shows how propagation delay, hashrate distribution, and tie-breaking rules jointly determine mining rewards, and analyzes the trade-off between propagation incentives and mining fairness.

Abstract

Bitcoin is permissionless and does not rely on any central administrator, which gives it strong censorship resistance. At the same time, it is important to incentivize miners to behave in ways that align with the interests of the system as a whole. This paper asks whether miners are individually incentivized to propagate blocks, one of the most fundamental processes in Bitcoin. Miners collectively maintain the blockchain by generating blocks and disseminating them across the network. If miners have an incentive not to propagate some blocks, this would indicate a fundamental flaw in Bitcoin's incentive design. Although prior work has studied how propagation delays affect forks and mining rewards, it has not fully characterized miners' incentives to improve block propagation under different tie-breaking rules. To address this gap, we derive analytical reward expressions for each tie-breaking rule based on a blockchain network model that captures the effect of forks on mining fairness. These expressions explicitly characterize how block propagation delays, hashrate distribution, and tie-breaking rules jointly determine mining rewards. We then use them to analyze miners' incentives to improve block propagation. Our results show, for example, that miners have no mining-reward incentive to relay blocks generated by other miners. By contrast, under the first-seen rule, every non-majority miner is incentivized to receive other miners' blocks more quickly and to propagate its own blocks more quickly. Finally, we compare tie-breaking rules and identify a trade-off between propagation incentives and mining fairness. In particular, the first-seen rule provides the strongest incentives to reduce propagation delays, but it also worsens mining fairness the most.

2606.06851 2026-06-08 cs.CY cs.HC New submission

Toward a Metaphysics of Learning Analytics: Ontological Positioning of Data, Inference, and Normativity

Kensuke Takii

AI summary Answers the ontological question "What is LA?" from within, by tracing learning analytics' own definitions and principles; examines what kind of existence the data constitutes, identifies eight agents, including learners, as ontological prerequisites, and points out an ontological tension between "norm-embedded LA" and the first principle.

Comments 25 pages, 1 figure

Abstract

The Learning Analytics (LA) community has undergone rapid development over the 15 years since the first LAK conference was held. However, while epistemological and ethical debates regarding the philosophical foundations of LA have been vigorous, metaphysical discussions have been sparse, signifying a lack of effort to derive the identity of LA from its internal principles. In this paper, we attempt to establish a metaphysics of LA by addressing the ontological question of "What is LA?" We do so by tracing back to LA's own definitions and principles to derive an answer from within LA itself. Specifically, we address what kind of existence the data LA operates on constitutes, identify eight agents including learners as ontological prerequisites, and clarify, via the is/ought problem, that LA does not derive norms from data. In particular, this system reveals that a class of LA practices, here termed "norm-embedded LA", conflates LA's purpose with its operations, creating an ontological tension with the first principle. We also discuss connections with related fields and the limitations of this system. The metaphysics outlined here is not imposed from outside LA, but surfaces what LA itself has always implicitly presupposed.

2606.06843 2026-06-08 cs.SE New submission

Empirical Study on the Characteristics and Evolution of AI-usage in GitHub Repositories: Evidence from Code Comments

Abdullah Al Mujahid, Preetha Chatterjee, Mia Mohammad Imran

AI summary Analyzes 35,361 GitHub code comments that reference AI use, together with 12,996 subsequent commit messages; finds that developers mainly use LLMs for code implementation and then frequently refactor, integrate, and fix the result, and that AI references shift over time from code generation toward knowledge support and code enhancement.

Comments Preprint version

Abstract

Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but prior studies often evaluate LLM outputs in isolation rather than examining how developers adapt them in real projects. We analyze 35,361 GitHub code comments that explicitly reference AI use and their associated code blocks. We first open-code 500 unique comments and code blocks to derive a taxonomy of AI-assisted development activities, then annotate the full dataset using two LLM-based classifiers and aggregate predictions with Dawid-Skene expectation-maximization. We also analyze 12,996 subsequent commit messages to study how AI-assisted code evolves after introduction, and examine temporal trends from December 2022 to March 2026. Our results show that developers primarily use LLMs for code implementation, followed by code enhancement, debugging, documentation, and testing. Subsequent commits frequently involve refactoring and cleanup, feature integration and extension, and bug fixing, indicating sustained human oversight in adapting AI-assisted code. Over time, AI-referencing comments shift from direct code generation toward knowledge and conceptual support and code enhancement. These findings suggest that AI tools are becoming embedded not only as code-generation aids, but also as collaborative support mechanisms whose outputs are refined, extended, and corrected by developers over time.

2606.06826 2026-06-08 cs.SE New submission

SkelDPO: A Skeleton-Guided Direct Preference Optimization Framework for Efficient Code Generation

Yu Yu, Chen Lyu

AI summary Proposes SkelDPO, a skeleton-guided preference optimization framework that targets semantic correctness and execution efficiency together in code generation; it improves Pass@1, Beyond@1, and Effi@1 by 3-6%, 3-7%, and 2-5% over the compared SOTA method.

Abstract

With the remarkable progress of Code Large Language Models (Code LLMs) in achieving semantic correctness, execution efficiency has become an increasingly important dimension for evaluating their practical utility. However, existing approaches typically treat full programs as a single optimization target during training, without explicitly modeling the structural factors that influence efficiency. As a result, although these models can generate semantically correct code, they fail to learn, at a fine-grained level, the underlying skeleton features that lead to efficient implementations. To address this limitation, we propose SkelDPO (Skeleton-Guided Direct Preference Optimization), a skeleton-guided preference optimization framework that systematically enhances the efficiency of code generation. SkelDPO first identifies efficient and inefficient implementations from the code dataset and, through comparative analysis, locates their efficiency-prone and inefficiency-prone points, forming alignment signals between efficiency and inefficiency skeletons. During training, a joint code and skeleton preference loss is introduced, enabling the model to learn semantic correctness while reinforcing its understanding of efficiency-critical components in code. Results show that SkelDPO consistently surpasses existing methods: compared with the SOTA method that relies solely on efficient and inefficient code preference optimization, it improves Pass@1, Beyond@1, and Effi@1 by 3-6%, 3-7%, and 2-5%, respectively, with greater improvements observed on complex tasks. Overall, SkelDPO provides a new perspective on skeleton-level efficiency alignment, breaking the limitation of conventional preference optimization that relies solely on correctness or efficiency pairs. All datasets and source code are publicly available at: https://github.com/icpcSkelDPO/SkelDPO.
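For readers unfamiliar with the underlying objective, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. The `joint_loss` function is only a guess at what a "joint code and skeleton preference loss" could look like: the weighted-sum form and the `lam` weight are assumptions, since the abstract does not state the formula.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_* are the policy's sequence log-probabilities and ref_logp_* are the
    frozen reference model's, for the preferred (w) and rejected (l) outputs.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return math.log1p(math.exp(-margin))  # equals -log(sigmoid(margin))

def joint_loss(code_pair, skeleton_pair, lam=0.5, beta=0.1):
    """Assumed joint form: code-level loss plus lam times skeleton-level loss."""
    return (dpo_loss(*code_pair, beta=beta)
            + lam * dpo_loss(*skeleton_pair, beta=beta))

# Made-up log-probabilities: (logp_w, logp_l, ref_logp_w, ref_logp_l).
code_pair = (-10.0, -14.0, -11.0, -12.0)      # efficient vs. inefficient program
skeleton_pair = (-3.0, -6.0, -4.0, -5.0)      # efficient vs. inefficient skeleton
print(dpo_loss(*code_pair), joint_loss(code_pair, skeleton_pair))
```

When the policy and the reference agree, the margin is zero and the loss is log 2; the loss drops below that as the policy favors the preferred output more strongly than the reference does.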

2606.06821 2026-06-08 cs.SE New submission

Chiseling Out Efficiency: Structured Skeleton Supervision for Efficient Code Generation

Yu Yu, Zhihong Sun, Jia Li, Yao Wan, Chuanyi Li, Hongyu Zhang, Ruyun Wang, Tao Huang, Zhi Jin, Ge Li, Chen Lyu

AI summary Proposes EffiSkel, a framework that extracts and learns efficiency skeletons to improve both the efficiency and the correctness of generated code; experiments show significant gains across multiple programming languages and benchmarks.

Abstract

Large Language Models (LLMs) are capable of generating syntactically correct and functionally complete programs, greatly streamlining software development. However, recent studies reveal that these programs typically execute substantially slower than human-optimized counterparts. Existing approaches to bridging this efficiency gap typically involve either iteratively optimizing code after generation or fine-tuning models on corpora of efficient code. Yet, these methods expose the model to efficiency signals only by mimicking complete, optimized solutions, without explicitly encoding the structural code patterns essential for achieving high runtime performance. Addressing this gap presents two core challenges: (1) extracting and representing latent, efficiency-oriented structural patterns embedded within complex syntax and control flows, and (2) effectively learning these patterns without destabilizing the semantic training of LLMs. To tackle these challenges, we propose EffiSkel, an efficiency skeleton-guided framework that explicitly extracts and learns efficiency skeletons (abstract, reusable structural patterns underpinning efficient code) by leveraging three complementary strategies. These skeletons are integrated into a multi-task learning regime that jointly optimizes code generation and skeleton prediction. Experiments across multiple programming languages and benchmarks demonstrate that EffiSkel significantly enhances both functional correctness and efficiency. On Mercury with DeepSeek-Coder (7B), it yields a +11.11% (vs. EffiCoder) and +3.71% (vs. CodeDPO) higher Efficiency Ratio (ER), and a +0.36 (vs. EffiCoder) and +0.22 (vs. CodeDPO) increase in Average Speedup (AS). These results highlight the effectiveness of explicitly modeling efficiency skeletons in improving the runtime performance of code generated by LLMs.

2606.06811 2026-06-08 cs.PF q-bio.GN New submission

Dependencies and Dataflow in Seed-Filter-Extend Pipelines

Shiv Sundram

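AI summary To address the serial dependencies and irregular local alignments of seed-filter-extend pipelines in genome alignment, this work synthesizes four approaches, including LASTZ, and optimizes the global cross-region pipeline to accelerate end-to-end alignment.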

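Details
AI-generated abstract (translated from Chinese)

Comparing genomes is essential for discovering mutations, tracking evolutionary lineages, and advancing cross-species genomics. Fundamentally, this reduces to an O(n^2) string-matching dynamic programming (DP) problem, a challenge that has driven decades of performance research. However, for genomes spanning millions to billions of base pairs, running a strict O(n^2) DP algorithm is computationally infeasible. Modern aligners therefore rely on global heuristics to identify thousands of candidate similar regions between species. Unfortunately, these methods are hampered by complex serial dependencies. Once candidate regions are identified, the pipeline performs local DP alignments, which introduce their own non-trivial heuristics and irregular data dependencies. Although parallelizing dense two-dimensional DP is a well-studied problem, accelerating this end-to-end pipeline is more challenging. Parallelizing across candidate regions and offloading irregular, heuristic-laden local alignments to modern hardware such as GPUs remains a major obstacle. In this work, we overcome these serial bottlenecks by optimizing the global pipeline across regions. We draw on four papers (LASTZ, SegAlign, Darwin-WGA, and SNAP), synthesizing the findings of each to guide optimizations that we either prototype or implement directly in LASTZ.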

English abstract

Comparing genomes is critical for discovering mutations, tracking evolutionary lineages, and advancing cross-species genomics. Fundamentally, this reduces to an O(n^2) string-matching dynamic programming (DP) problem, a challenge that has driven decades of performance research. However, executing a strict O(n^2) DP algorithm is computationally intractable for genomes spanning millions to billions of base pairs. Consequently, modern aligners rely on global heuristics to identify thousands of candidate similarity regions between species. Unfortunately, these methods are burdened by complex serial dependencies. Once candidate regions are identified, the pipeline executes localized DP alignments, which introduce their own non-trivial heuristics and irregular data dependencies. While parallelizing dense, two-dimensional DP is a well-studied problem, accelerating this end-to-end pipeline is significantly more challenging. Parallelizing across candidate regions and offloading irregular, heuristic-laden local alignments to modern hardware (such as GPUs) remains a major hurdle. In this work, we address the challenge of overcoming these serial bottlenecks by optimizing the global pipeline across regions. We take inspiration from four papers: LASTZ, SegAlign, Darwin-WGA, and SNAP, synthesizing findings across each to inform optimizations, which we either prototype or implement directly in LASTZ.
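
To make the quadratic DP mentioned in the abstract concrete, here is a minimal sketch of a textbook local-alignment recurrence (Smith-Waterman with a linear gap penalty). It is an illustration only and is not taken from LASTZ or the other tools named above. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Textbook Smith-Waterman with a linear gap penalty. The scoring
    values are illustrative assumptions, not parameters of any tool.
    Time is O(len(a) * len(b)); memory is O(len(b)) because only the
    previous row of the DP table is kept.
    """
    prev = [0] * (len(b) + 1)  # DP row i-1
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # DP row i
        for j in range(1, len(b) + 1):
            # Cell (i, j) depends on (i-1, j-1), (i-1, j), and (i, j-1).
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + step, # extend diagonally (match/mismatch)
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("GATTACA", "TTAC"))
```

Each cell reads only its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are mutually independent. That regular dependency pattern is what makes the dense two-dimensional case a well-studied parallelization target, in contrast to the heuristic-laden pipeline stages the abstract describes.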