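arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.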

2606.07246 2026-06-08 cs.AR New submission

MailoHLS: Multi-Adapter Structure-Aware Learning for Pareto-Driven HLS Pragma Optimization

Elena Vouvali, Dimosthenis Masouros, Aggelos Ferikoglou, Dimitrios Soudris, Sotirios Xydis

AI summary: Proposes MailoHLS, a hybrid framework that combines LLM semantic reasoning with GNN structural modeling and jointly optimizes HLS pragmas through cross-attention, objective-conditioned LoRA adapters, and Pareto-driven optimization; it reaches up to 12.42x speedup in latency optimization and consistently produces near-Pareto-optimal designs.

Abstract

High-Level Synthesis (HLS) enables rapid development of FPGA accelerators, yet achieving high-quality results (QoR) remains challenging due to the large and irregular design space induced by compiler directives (a.k.a. pragmas). Selecting effective configurations requires reasoning over complex interactions between program structure, memory behavior, and often conflicting objectives such as latency and resource utilization. Prior model-driven approaches exhibit limited generalization across kernels and fail to capture higher-level optimization intent. Recently, Large Language Models (LLMs) capture code semantics and high-level intent, but their sequential representations hinder modeling of structural dependencies and global trade-offs, leading to suboptimal HLS designs. We present MailoHLS, a hybrid framework that combines LLM-based semantic reasoning with GNN-based structural modeling for objective-aware directive optimization. By integrating structural embeddings via cross-attention and leveraging PEFT with objective-conditioned LoRA adapters and Pareto-driven optimization, MailoHLS enables joint reasoning over code semantics, structure, and design trade-offs. Across seen and unseen kernels, MailoHLS achieves up to 12.42x and 8.4x speedup, respectively (9.48x and 4.97x geometric mean), for latency optimization, consistently producing near-Pareto-optimal designs. On fully unseen applications, it reaches up to 10.2x speedup (6.58x geometric mean), outperforming high-end LLMs and prior approaches while narrowing the gap to the Pareto frontier.
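
The notion of a Pareto-optimal design in this abstract can be made concrete with a small sketch. It is not MailoHLS's optimizer: it only shows the dominance filter that defines a Pareto frontier when latency and resource utilization are both minimized. The configuration names and numbers are hypothetical.

```python
from typing import List, Tuple

# (name, latency in cycles, resource utilization); all values are hypothetical.
Design = Tuple[str, float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep the designs that no other design dominates (both objectives minimized)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical pragma configurations for a single kernel.
candidates: List[Design] = [
    ("baseline", 1000.0, 0.10),
    ("unroll4", 400.0, 0.35),
    ("unroll4_wasteful", 450.0, 0.50),  # slower and larger than unroll4
    ("pipeline", 250.0, 0.60),
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

Here `unroll4_wasteful` is dropped because `unroll4` beats it on both objectives; the other three each represent a different latency/resource trade-off.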

2606.07238 2026-06-08 cs.GT New submission

No, Cake Cutting Really is a Piece of Cake

Stephen Arndt, Benjamin Moseley, Sungjin Im, Kirk Pruhs

AI summary: Proposes a deterministic cake-cutting algorithm that achieves proportional fairness using a linear number of cuts.

2606.07231 2026-06-08 cs.HC New submission

Moodie: An Early-Stage Design Exploration for Supporting Fear of Missing Out with LLM-based Chatbots

Hsin-Yu Tsai, Jingxian Liao, Fu-Yin Cherng, Tzu-Hsiang Huang

AI summary: Presents Moodie, an LLM-based chatbot that supports emotion regulation to reduce fear of missing out; a preliminary evaluation suggests it yields greater user engagement and social connection than a general-purpose model.

Comments: 7 pages, 1 figure, 1 table. Preliminary work submitted to the ACM CUI 2026 Works-in-Progress (WiP) track

Abstract

The excessive use of social media has led to the challenge known as Fear of Missing Out (FoMO). Existing studies fail to provide accessible, interactive tools that focus on the emotional and cognitive aspects of FoMO. This work presents Moodie, a chatbot designed using Large Language Models to support emotion regulation and reduce FoMO. We conducted a formative study to understand the needs of individuals with FoMO and developed Moodie. Then, we conducted a preliminary evaluative study (N=21) to observe how participants interact with Moodie and a baseline chatbot (GPT-4o) over one week. The results show that while Moodie and the baseline chatbot reduced FoMO to a similar extent, Moodie resulted in greater engagement and social connection. This finding raises interesting questions about the advantages of purpose-built chatbots compared to general-purpose models for mental health support. Future research will include chat log analysis, prototype refinements, and longitudinal evaluations.

2606.07215 2026-06-08 cs.CE New submission

A Comparative Study of Deep Learning Models for Geological Carbon Sequestration

Giovanni Zingaro, Robert Gracie, Yuri Leonenko

AI summary: Compares deep learning surrogate models (U-Net, V-Net, TCN, FNO, and U-FNO) for predicting transient pressure build-up and CO2 saturation in geological carbon sequestration; U-FNO is most accurate for CO2 saturation, while FNO performs best for pressure.

Abstract

Numerical reservoir simulations are extremely computationally expensive, as they require the repeated solution of large nonlinear algebraic systems derived from the discretized governing equations. With growing demand for real-time optimization, uncertainty quantification, and history matching in digital twin applications, reducing computational cost has become essential. Deep learning (DL)-based surrogate models have emerged as an effective approach for accelerating subsurface flow simulations. Here, we seek to determine which DL architectures are best suited for high-dimensional, transient subsurface flow problems. In this study, we examine the advantages and relative costs associated with training such models, including memory requirements, training speed, accuracy, robustness, and generalization. We conduct a comparative study of several DL architectures commonly used as surrogate models for subsurface flow problems, including U-Net, V-Net, Temporal Convolutional Networks, Fourier Neural Operators (FNO), and a U-Net-enhanced FNO (U-FNO). As a benchmark, we compare the performance of the studied models for geological carbon sequestration to predict transient pressure build-up and CO$_2$ saturation fields. We study the problem of CO$_2$ injection into a single wellbore in a two-dimensional domain, which is parameterized by anisotropic, heterogeneous permeability and porosity fields, injection configurations, and reservoir properties. Results demonstrate that surrogate model performance is strongly dependent on the underlying PDE type (i.e., hyperbolic vs. elliptic). The U-FNO achieves the highest accuracy for predicting CO$_2$ saturation fields, while the FNO provides the best performance for pressure build-up prediction.

2606.07208 2026-06-08 eess.SY cs.SY New submission

Unlocking feedforward capabilities in Model Predictive Control algorithms to deal with measurable disturbances

José Luis Guzmán, Igor Pataro, Juan D. Gil, Manuel Berenguel

AI summary: Proposes a dual-control framework that embeds true feedforward capability in MPC, achieving full compensation of measurable disturbances through a feedforward action with no control-effort penalty; validated on DMC, GPC, and state-space MPC.

Abstract

Disturbance rejection is a central objective in process control, particularly when measurable disturbances can be exploited through feedforward action. Although Model Predictive Control (MPC) naturally incorporates disturbance models and prediction capabilities, standard formulations cannot achieve complete disturbance rejection since the cost function penalises control effort. This limitation prevents MPC from reproducing the behaviour of classical feedforward compensators. This work proposes a novel framework to embed true feedforward capabilities within MPC without removing the control effort penalty. The approach introduces a dual-control structure in which two control actions are computed simultaneously: a tracking-oriented action addressing set-point tracking and robustness, and a feedforward-oriented action dedicated to disturbance rejection. Both contributions are combined into a single control signal on which the process constraints are explicitly enforced. The feedforward-oriented action is formulated without penalising control effort, enabling full compensation of measurable disturbances. The methodology is developed for Dynamic Matrix Control (DMC), Generalised Predictive Control (GPC), and state-space MPC. Its effectiveness is demonstrated through simulation studies, including comparisons with standard MPC and classical feedforward schemes. A case study based on a reverse osmosis process shows that the proposed approach improves disturbance rejection while preserving constraint handling and overall control performance.

2606.07202 2026-06-08 cs.SI New submission

Technological Fitness and Regional Growth in Japan

Rintaro Karashima, Hiroyasu Inoue

AI summary: Builds bipartite networks from about 3.9 million corporate patent records and applies the Fitness-Complexity algorithm to assess the technological capabilities of Japan's 47 prefectures; technological Fitness is positively associated with subsequent economic growth, with a stronger effect in lower-income regions.

Abstract

Technological knowledge plays an important role in shaping regional economic performance. This study examines the relationship between the sophistication of regional technological capabilities and economic growth across Japanese prefectures. Using approximately 3.9 million corporate patent records filed from fiscal years 1981 to 2015, we construct bipartite networks linking 47 prefectures to 35 technology classes and apply the Fitness-Complexity algorithm to derive regional Fitness scores for seven five-year periods. We estimate fixed-effects panel models with Driscoll-Kraay standard errors, using the annual average growth rate of real gross regional product per capita over the subsequent five years as the dependent variable. Prefectural Fitness is positively associated with subsequent growth ($\hat{\beta} = 0.0029$, $p = 0.007$) after controlling for initial income, population density, and patenting activity, but this relationship is detectable only when both entity and time fixed effects are included. Cross-sectional correlations between Fitness and subsequent growth change sign across periods, underscoring the importance of the panel approach. The growth effect of Fitness is stronger in prefectures with lower initial income, suggesting that technological sophistication contributes more to growth where there is greater scope for economic expansion. Lag and lead analyses indicate that the relationship runs from Fitness to subsequent growth rather than the reverse.
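
For readers unfamiliar with the Fitness-Complexity algorithm named in this abstract, the sketch below follows its commonly cited iterative form: a region's Fitness is the sum of the Complexities of the classes it is active in, a class's Complexity is limited by the least fit regions active in it, and both vectors are rescaled to mean 1 after each step. The binary matrix is a toy example, and how the paper turns patent counts into such a matrix is not stated in the abstract, so that step is left out.

```python
from typing import List, Tuple


def fitness_complexity(
    M: List[List[int]], iterations: int = 20
) -> Tuple[List[float], List[float]]:
    """Iterate the Fitness-Complexity map on a binary region-by-class matrix M.

    Assumes every row and every column of M contains at least one 1.
    """
    n_regions, n_classes = len(M), len(M[0])
    fitness = [1.0] * n_regions
    complexity = [1.0] * n_classes
    for _ in range(iterations):
        # Fitness: sum of the complexities of the classes a region is active in.
        raw_f = [
            sum(M[r][t] * complexity[t] for t in range(n_classes))
            for r in range(n_regions)
        ]
        # Complexity: pulled down by low-fitness regions active in the class.
        raw_q = [
            1.0 / sum(M[r][t] / fitness[r] for r in range(n_regions))
            for t in range(n_classes)
        ]
        # Rescale both vectors to mean 1 after every step.
        mean_f = sum(raw_f) / n_regions
        mean_q = sum(raw_q) / n_classes
        fitness = [f / mean_f for f in raw_f]
        complexity = [q / mean_q for q in raw_q]
    return fitness, complexity


# Toy matrix (hypothetical): 3 regions x 4 technology classes, nested structure.
toy = [
    [1, 1, 1, 1],  # diversified region
    [1, 1, 0, 0],
    [1, 0, 0, 0],  # region active only in the most common class
]
fitness, complexity = fitness_complexity(toy)
print(fitness, complexity)
```

On this toy input the diversified region ranks highest, and the classes that only it is active in receive the highest Complexity.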

2606.07200 2026-06-08 cs.MA New submission

Learning Multi-Agent Communication Protocol: Study on Information Entropy Efficiency in MARL

Xinren Zhang, Zixin Zhong, Jiadong Yu

AI summary: Proposes the Information Entropy Efficiency Index (IEI) as a metric of communication efficiency; adding it to the training loss lets agents learn communication protocols that balance performance and information compactness, and experiments show improved communication efficiency with equal or better task performance.

Abstract

Multi-Agent Systems (MAS) have emerged as a fundamental paradigm for distributed problem-solving, where autonomous agents collaborate to achieve complex objectives. Within this framework, Multi-Agent Reinforcement Learning (MARL) with communication has demonstrated remarkable success in cooperative tasks. However, existing approaches predominantly pursue performance gains through increasingly complex architectures and expanding communication overhead, lacking principled metrics to evaluate the efficiency of information exchange. In this paper, we focus on enabling agents to learn efficient multi-agent communication protocols that balance performance and information compactness. We propose the Information Entropy Efficiency Index (IEI), a novel metric that quantifies the ratio between message entropy and task performance in learned communication protocols. A lower IEI indicates more compact and efficient message representations. By incorporating IEI into training loss functions, we encourage agents to develop communication protocols that achieve high performance with improved communication efficiency. Extensive experiments across diverse MARL algorithms demonstrate that our approach achieves equivalent or superior task performance compared to baseline methods while improving communication efficiency. These findings challenge the prevailing assumption that performance improvements require complex architectures or increased communication overhead and highlight the potential of improving both task success and communication efficiency to enable scalable MAS.
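
The abstract describes IEI only as a ratio between message entropy and task performance, so the sketch below is one possible reading rather than the paper's definition. It assumes discrete messages, an empirical Shannon entropy in bits, and a positive scalar performance score; the function names and example values are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(messages: Sequence[Hashable]) -> float:
    """Empirical Shannon entropy (in bits) of a sequence of discrete messages."""
    counts = Counter(messages)
    total = len(messages)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_efficiency_index(
    messages: Sequence[Hashable], task_performance: float
) -> float:
    """Assumed form of IEI: message entropy divided by task performance."""
    if task_performance <= 0:
        raise ValueError("task_performance must be positive")
    return shannon_entropy(messages) / task_performance


# Two hypothetical protocols that reach the same task performance (0.8).
compact = ["a", "a", "b", "b"]  # 1 bit per message
verbose = ["a", "b", "c", "d"]  # 2 bits per message
print(entropy_efficiency_index(compact, 0.8))
print(entropy_efficiency_index(verbose, 0.8))
```

Under this reading the compact protocol gets the lower index, which matches the abstract's statement that a lower IEI indicates more compact message representations.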

2606.07187 2026-06-08 cs.IR New submission

RISE: A Rust Library for Inverted Index Search Engines

Angelo Savino, Rossano Venturini

AI summary: Presents RISE, an inverted index library implemented in Rust that draws on Rust's safety and performance and offers efficient querying through an extensible trait system, with speedups of up to 2x over the state of the art.

Abstract

Inverted indexes are a crucial data structure for efficient information retrieval in large text corpora. They enable fast full-text search by mapping each term to the documents in which it appears, on top of which efficient algorithms quickly retrieve the documents relevant to a user query. We present RISE, a novel inverted index library implemented in Rust, designed to deliver high performance and efficiency for information retrieval tasks. RISE leverages Rust's safety and performance to provide a robust solution for building and querying inverted indexes, while offering accessible extensibility through its expressive trait system. While developing RISE, we revisited the inverted-index literature, thereby reproducing numerous prior works using this new test bench. We evaluated RISE against existing libraries, demonstrating competitive query performance across various datasets and workloads, with speedups of up to 2x over the current state of the art. Our results indicate that RISE is a promising tool for researchers and practitioners in the field of information retrieval.
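
The term-to-documents mapping described in this abstract can be illustrated with a minimal sketch. It is written in Python for brevity and does not reflect RISE's Rust API, which the abstract does not show; the whitespace tokenizer, function names, and sample documents are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(docs: List[str]) -> Dict[str, List[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    index: Dict[str, List[int]] = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term].append(doc_id)  # doc IDs arrive in increasing order
    return dict(index)


def intersect(a: List[int], b: List[int]) -> List[int]:
    """Merge-style intersection of two sorted posting lists."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_and(index: Dict[str, List[int]], query: str) -> List[int]:
    """Return the IDs of documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    # Start from the shortest posting list to keep intermediate results small.
    postings = sorted((index.get(t, []) for t in terms), key=len)
    result = postings[0]
    for plist in postings[1:]:
        result = intersect(result, plist)
    return result


docs = [
    "rust inverted index library",
    "fast full text search",
    "inverted index search in rust",
]
index = build_index(docs)
print(search_and(index, "inverted rust"))
```

Production libraries typically add compressed posting lists and ranked retrieval on top of this basic structure.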

2606.07159 2026-06-08 cs.ET cs.AR New submission

Distributed Persistence Domain for Persistent Memory Pooling

Khan Shaikhul Hadi, Andres David Delgado, Naveed Ul Mustafa, Mark Heinrich, Hao Zheng, Yan Solihin

AI summary: To address high persist latency in CXL memory pooling, proposes the Distributed Persistence Domain (DPD) abstraction, which places persistence support in CXL switches and adds read forwarding and write coalescing, for an average speedup of 33%.

Abstract

Compute Express Link (CXL) enables memory pooling over disaggregated memory, offering the potential to improve resource utilization in persistent memory systems. However, integrating persistence semantics into CXL-based memory pooling introduces substantial latency, which limits system scalability. This overhead arises because persist operations must traverse the entire CXL fabric, including switches, links, and protocol layers, before reaching remote persistent memory. To this end, we argue that extending CXL switches with persistence support is a promising direction for improving the scalability of persistent memory pooling. However, moving persistence support into the network breaks the traditional correctness assumptions of centralized persistence domains. In particular, enabling persistence within distributed structures, such as CXL switches, can introduce stale reads and writes if not carefully coordinated. In this paper, we propose Distributed Persistence Domain (DPD), a new abstraction for persistent memory pooling that enables persistence support at the CXL switch level. We first formalize the concept of a distributed persistence domain and use DPD as a framework to identify the correctness hazards that arise when persistence structures are distributed across the CXL fabric. Based on this analysis, we derive the design requirements needed to guarantee correctness. Building on these insights, we present Persistent CXL Switch, a CXL switch architecture that incorporates persistence support to significantly reduce persist latency, enable read forwarding, and coalesce writes, while preserving correctness and crash consistency. We evaluated our system design using both SPLASH-4 and YCSB benchmarks. Simulation results show an average speedup of 33% over volatile CXL switches, and up to 36% speedup with read forwarding optimization across all workloads.

2606.07158 2026-06-08 cs.CR New submission

Synthetic APTs: the Collapse of TTP-Based Attribution

Francesco Balassone, Víctor Mayoral-Vilches, María Sanz-Gómez, Paul Zabalegui-Landa, Stefan Rass, Davide Quarta, Daniel Sanchez-Prieto, Marina Oteiza-Álvarez, Almerindo Graziano, Lauren Min Kim, MinSeok Choi

AI summary: Investigates whether AI-driven adversary emulation challenges TTP-based attribution by configuring agents from the CSI framework as five APT groups; every enterprise network was compromised and attackers independently weaponized defensive tooling, suggesting that the basis of TTP attribution is weakened in the AI era.

Abstract

Cyber Threat Intelligence (CTI) attribution relies on identifying the Tactics, Techniques, and Procedures (TTPs) that distinguish one threat actor from another. This approach presupposes that each adversary leaves a recognizable operational fingerprint. This work investigates whether AI-driven adversary emulation challenges that presupposition. We deploy agents from our Cybersecurity SuperIntelligence (CSI) framework, configured as five Advanced Persistent Threat (APT) groups (APT28, APT29, APT41, APT44, and Lazarus Group), against AI-driven Defender agents across two cyber ranges provided by CYBER RANGES, an enterprise network and a military infrastructure, both equipped with defensive software (Wazuh, Velociraptor, Elasticsearch) and active AI-driven defenders. Across 20 experiments using two defender models, a binary pattern emerges: all 10 Enterprise range experiments resulted in compromise (2 to 12 hosts per experiment), while all 10 Military range experiments were successfully defended or resulted in stalemates, regardless of APT profile or defender model. In 8 of 10 Enterprise experiments, attackers independently weaponized the defender's own Velociraptor endpoint management platform as a command and control channel, a convergent behavior not encoded in any threat intelligence profile. We argue that in the AI era, wherein agents can be deployed provided the right models are available and subject to the right scaffolding and agentic configuration, the entry barrier for operating like a nation-state APT collapses: beyond nation states, individuals can now act like commonly identified threat actors, which fundamentally undermines TTP-based attribution.

2606.07156 2026-06-08 cs.PF New submission

ANNS-AMP: Accelerating Approximate Nearest Neighbor Search via Adaptive Mixed-Precision Computing

Mingkai Chen, Cheng Liu, Shengwen Liang, Lei Zhang, Xiaowei Li, Huawei Li

AI summary: Proposes ANNS-AMP, an adaptive mixed-precision framework that predicts precision from the cluster structure of PQ indices and pairs it with a bit-serial accelerator, achieving an average 163.76x performance speedup and 1100x energy reduction over a CPU baseline while keeping accuracy loss below 2.7%.

Abstract

Approximate nearest neighbor search (ANNS) is a critical kernel in modern applications such as LLMs and recommendation systems. However, its efficiency is fundamentally limited by the need to compute distances between a query and a massive number of high-dimensional vectors, most of which are non-neighbors. Existing approaches reduce redundancy via index optimization or early termination, but remain constrained by fixed-precision computation, leading to unnecessary arithmetic and memory bandwidth overhead. This paper presents ANNS-AMP, an adaptive mixed-precision framework and accelerator that adapts the precision of distance computation to the characteristics of queries and data distribution. The key insight is that different regions of the vector space require different levels of precision to preserve top-k accuracy. ANNS-AMP leverages the clustered structure of PQ-based indices and introduces a lightweight predictor to determine cluster-level precision at runtime based on features such as scale, radius, and query distance. To efficiently realize variable-precision execution, we design a bit-serial accelerator with a bit-interleaved data layout, enabling throughput to scale with reduced precision while mitigating memory bandwidth bottlenecks and load imbalance through a greedy scheduling strategy. Moreover, the runtime predictor can also reuse the bit-serial computing array for efficient runtime prediction and can be fitted to the ANNS pipeline without performance penalty. According to our experiments on representative datasets, ANNS-AMP achieves 163.76x, 10.57x, and 2.06x performance speedups on average, and reduces average energy consumption by 1100.00x, 39.41x, and 6.66x compared to CPU, GPU, and customized ANNS accelerator baselines, respectively, while maintaining accuracy loss below 2.7%. These results demonstrate that adaptive mixed-precision computing is a promising direction for efficient large-scale ANNS.

2606.07152 2026-06-08 cs.NE cs.SC New submission

A Data-Free Symbolic Regression Approach for Solving Equations

Sergei Garmaev, Vinay Sharma, Olga Fink

AI summary: Proposes the Symbolic Equation Solver (SES), which casts equation solving as optimization over differentiable symbolic models; it needs no paired data, builds its objective directly from the equation and boundary conditions, and recovers explicit symbolic solutions.

Abstract

Many equations arising in science currently cannot be solved by available analytical techniques and are therefore solved numerically, without yielding explicit symbolic expressions. Existing symbolic regression approaches can recover symbolic expressions, but require training data obtained from the underlying process, rather than the governing equation alone. We propose the Symbolic Equation Solver (SES), a framework that formulates equation solving as an optimization problem over differentiable symbolic models. SES constructs its objective from the equation together with initial or boundary conditions, eliminating the need for paired input-output data. The learned model is expressed in explicit symbolic form, enabling further analysis. We evaluate SES on representative algebraic and differential equations, including a system of algebraic equations, an equation with transcendental terms, an ordinary differential equation, and partial differential equations with different initial or boundary conditions. Across these settings, SES recovers compact symbolic expressions that match the corresponding analytical solutions.
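
The idea of building an objective from the equation and its initial condition, with no input-output data, can be sketched on a toy problem. The example below is not SES itself: instead of optimizing a differentiable symbolic model, it scores three hand-picked candidate expressions for the ODE y' = y with y(0) = 1, using a central finite difference for the derivative. The candidate set, collocation points, and step size are all assumptions.

```python
import math
from typing import Callable, Dict, Sequence


def derivative(f: Callable[[float], float], x: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def equation_loss(candidate: Callable[[float], float], points: Sequence[float]) -> float:
    """Mean squared residual of y' - y = 0 plus the squared error in y(0) = 1."""
    residual = sum(
        (derivative(candidate, x) - candidate(x)) ** 2 for x in points
    ) / len(points)
    initial_condition = (candidate(0.0) - 1.0) ** 2
    return residual + initial_condition


# Collocation points on [0, 1]; no solution values are ever supplied.
points = [i / 10 for i in range(11)]

# Hypothetical candidate expressions, all of which satisfy y(0) = 1.
candidates: Dict[str, Callable[[float], float]] = {
    "exp(x)": math.exp,
    "1 + x": lambda x: 1.0 + x,
    "x**2 + 1": lambda x: x * x + 1.0,
}

losses = {name: equation_loss(f, points) for name, f in candidates.items()}
best = min(losses, key=losses.get)
print(best, losses[best])
```

All three candidates meet the initial condition, so only the equation residual separates them, and exp(x) is the only one whose loss is near zero.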

2606.07148 2026-06-08 cs.DB New submission

Efficient $(α,β)$-core Computation and On-the-fly Query at Billion Scale with GPUs

Qingshuai Feng, Shunyang Li, Kai Wang, Xuemin Lin, Kongzhang Hao, Long Yuan

AI summary: Proposes the GPU-based index-free peeling algorithms GCC and GCC+ and the connectivity-aware algorithm GFQ for efficient $(α,β)$-core computation and on-the-fly queries on large bipartite graphs.

Comments: 10 pages, 8 figures

Abstract

In bipartite graphs, $(α,β)$-core is a widely used model for cohesive subgraph mining. Specifically, an $(α,β)$-core is a maximal subgraph in which each vertex in the upper layer has degree at least $α$, and each vertex in the lower layer has degree at least $β$. The state-of-the-art CPU-based solutions incur extensive costs to construct an index structure for all $α$ and $β$ combinations, leading to scalability challenges on large bipartite graphs. Moreover, on-the-fly queries, which aim to determine whether an edge update belongs to a target $(α,β)$-core, are essential for real-time applications such as fraud monitoring and recommendation systems. However, existing index-based methods struggle to support such queries at scale due to their high maintenance overhead. In this paper, we investigate how to leverage GPU architectures to enable efficient $(α,β)$-core computation and support on-the-fly queries. While GPUs are widely used to accelerate graph processing, their limited memory capacity makes it impractical to store large index structures. To address this issue, we propose GCC, an index-free GPU-based peeling algorithm that accelerates $(α,β)$-core computation via warp-centric processing. To further improve efficiency, we develop GCC+, which leverages the nested property of $(α,β)$-core with a core-based early pruning strategy. For handling on-the-fly queries, we propose GFQ, a connectivity-aware algorithm that significantly narrows the computation scope by leveraging connected component information, thereby avoiding full-graph peeling. Extensive experiments on 11 datasets demonstrate that our proposed techniques outperform existing CPU-based solutions in terms of both space and time efficiency.
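
The $(α,β)$-core definition given in this abstract leads directly to a peeling procedure: repeatedly delete upper vertices with degree below $α$ and lower vertices with degree below $β$ until none remain. The sketch below is a sequential CPU version for illustration only; it is not GCC, GCC+, or GFQ, and the vertex names and edge list are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Set, Tuple


def alpha_beta_core(
    edges: Iterable[Tuple[str, str]], alpha: int, beta: int
) -> Tuple[Set[str], Set[str]]:
    """Return the (upper, lower) vertex sets of the (alpha, beta)-core.

    Each edge is (upper_vertex, lower_vertex).
    """
    upper_adj: Dict[str, Set[str]] = {}
    lower_adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        upper_adj.setdefault(u, set()).add(v)
        lower_adj.setdefault(v, set()).add(u)

    removed_upper: Set[str] = set()
    removed_lower: Set[str] = set()
    queue: Deque[Tuple[str, str]] = deque()

    # Seed the queue with every vertex that already violates its threshold.
    for u, nbrs in upper_adj.items():
        if len(nbrs) < alpha:
            removed_upper.add(u)
            queue.append(("upper", u))
    for v, nbrs in lower_adj.items():
        if len(nbrs) < beta:
            removed_lower.add(v)
            queue.append(("lower", v))

    # Peel: deleting a vertex lowers its neighbours' degrees, which may cascade.
    while queue:
        side, x = queue.popleft()
        if side == "upper":
            for v in upper_adj[x]:
                if v in removed_lower:
                    continue
                lower_adj[v].discard(x)
                if len(lower_adj[v]) < beta:
                    removed_lower.add(v)
                    queue.append(("lower", v))
        else:
            for u in lower_adj[x]:
                if u in removed_upper:
                    continue
                upper_adj[u].discard(x)
                if len(upper_adj[u]) < alpha:
                    removed_upper.add(u)
                    queue.append(("upper", u))

    return set(upper_adj) - removed_upper, set(lower_adj) - removed_lower


# Hypothetical bipartite graph: u* are upper vertices, v* are lower vertices.
edges = [
    ("u1", "v1"), ("u1", "v2"),
    ("u2", "v1"), ("u2", "v2"),
    ("u3", "v2"), ("u3", "v3"),
]
print(alpha_beta_core(edges, alpha=2, beta=2))
```

In this example `v3` is peeled first, which drops `u3` below the threshold, leaving `u1`, `u2`, `v1`, and `v2` as the (2,2)-core.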

2606.07110 2026-06-08 cs.DM New submission

Entanglement from Expansion: High Rank-Width in Deterministic Graphs

Tristan Cam, Cyril Gavoille, Yvan Le Borgne, Simon Martiel

AI summary: Derives lower bounds on the rank-width of regular graphs from edge expansion, combining edge-isoperimetric inequalities with the strong chromatic index and related techniques, and shows that deterministic graph families can reach the maximum rank-width Θ(n), filling the gap for deterministic families with rank-width above Θ(√n).

Abstract

Entanglement in quantum graph states is intrinsically linked to rank-width, a graph complexity measure introduced by Oum and Seymour. In this work, we enable the preparation of maximally entangled deterministic graph states in constant depth by developing a general method to derive lower bounds on the rank-width of regular graphs from their edge expansion. By bridging edge-isoperimetric inequalities with the strong chromatic index and Jelínek's approach for lower bounding cut-rank, we systematically establish lower bounds for the rank-width of Cartesian products, including hypercubes, Hamming graphs, and grids. Extending this framework via Boolean function analysis, using a generalization of the Kahn-Kalai-Linial theorem, we strengthen the bounds for all Cartesian products by a non-trivial logarithmic factor. These methods result in the discovery of deterministic families of graphs on $n$ vertices with a provably maximum rank-width $Θ(n)$. Our results fill the previous gap in the literature for deterministic graph families of rank-width greater than $Θ(\sqrt{n})$.
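
Rank-width is built on the cut-rank function: for a vertex subset X, the cut-rank is the rank over GF(2) of the adjacency submatrix with rows X and columns V \ X. The sketch below computes only that building block for one given cut; it does not search over branch decompositions, so it does not compute rank-width, and it is unrelated to the paper's lower-bound techniques. The example graph is hypothetical.

```python
from typing import Dict, List, Set


def gf2_rank(rows: List[int]) -> int:
    """Rank over GF(2) of a matrix whose rows are stored as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low_bit = pivot & -pivot  # lowest set bit of the pivot row
            # Clear that bit from every remaining row that has it.
            rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def cut_rank(adj: Dict[int, Set[int]], subset: Set[int]) -> int:
    """Cut-rank of `subset`: GF(2) rank of the adjacency block subset x complement."""
    complement = sorted(set(adj) - subset)
    column = {v: i for i, v in enumerate(complement)}
    rows = [
        sum(1 << column[v] for v in adj[x] if v in column)
        for x in sorted(subset)
    ]
    return gf2_rank(rows)


# Hypothetical example: the 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(cut_rank(cycle, {0, 1}), cut_rank(cycle, {0, 2}))
```

Splitting the cycle into adjacent pairs gives two independent rows, while splitting it into opposite pairs gives two identical rows, so the second cut has the lower rank.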

2606.07101 2026-06-08 cs.HC New submission

CANote: Empowering Fact-checking Note Writing Through Scaffolded and Provenance-based Human-AI Collaboration

Shuning Zhang, Jingruo Chen, Yuwei Chuai, Dai Shi, Yifan Wang, Xin Yi, Hewu Li

AI summary: Presents CANote, a system that helps users write high-quality debunking notes through subclaim extraction, evidence linking, and structured drafts, significantly raising the quality of lay users' notes to expert level.

Abstract

Crowdsourced fact-checking mechanisms, such as X's Community Notes, play a critical role in mitigating the spread of misinformation. However, drafting high-quality, evidence-based debunking notes imposes a substantial burden on contributors. We present CANote, an AI-assisted debunking note writing system featuring evidence correlation and structured co-drafting. CANote scaffolds the workflow by extracting subclaims from social media posts, providing provenance through explicit links between subclaims and retrieved evidence, and generating neutral, structured drafts to support human reasoning. We evaluated CANote against manual writing (N=52 fact-checkers, N=52 lay users) on a simulated X platform and found that CANote significantly improves note quality. Notably, CANote enables lay users to write notes of comparable quality to those written by experts. While the task completion time and perceived cognitive load remain comparable to manual drafting, CANote significantly increases user satisfaction. However, this assistance introduces a trade-off, resulting in a reduced sense of user ownership and control over the debunking note.

2606.07099 2026-06-08 eess.SY cs.SY New submission

SABLE: GPU-Based Power Flow Accelerator for Sparsity-Aware Batched Learning

Suho Park, Keunju Song, Hongseok Kim

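AI summary: Proposes SABLE, a GPU-based sparse batched power flow accelerator that uses a block-diagonal embedding and reusable sparse templates to achieve zero-copy interoperability across PyTorch, CuPy, and cuDSS, substantially improving standalone power flow solving and end-to-end training throughput.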

Comments 10 pages

Abstract

Recent studies have developed GPU-based approaches for solving AC power flow and successfully applied them to standalone power flow problems. However, integrating these approaches into modern differentiable learning frameworks while preserving sparsity remains challenging. To this end, we present SABLE, a GPU-based sparse batched power flow accelerator for differentiable learning via an implicit power flow layer. SABLE leverages a block-diagonal embedding that reformulates batched three-dimensional Jacobians as a fixed-pattern two-dimensional sparse template that is shared across PyTorch, CuPy, and cuDSS. This common template enables zero-copy interoperability and memory-efficient sparse reuse across the software stack. On top of this representation, SABLE accelerates repeated power flow computations through reusable sparse templates, custom GPU kernels, a cuDSS-based sparse-direct LU solver, and mixed-precision techniques. Extensive experiments show that SABLE improves standalone power flow solving throughput by up to 253.4$\times$ over pandapower and 5.7$\times$ over ExaPF. In end-to-end training, evaluated on AC optimal power flow learning models based on DC3 and DeepLDE, SABLE expands the feasible training batch range by up to 64$\times$ and improves training throughput by up to 206.7$\times$ over the corresponding baseline.
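The block-diagonal embedding mentioned in the abstract can be pictured with a small pure-Python sketch. It assumes every sample in the batch shares one sparsity pattern, so the index template is built once and only the values change between iterations. The function names and the tiny 2 x 2 pattern are hypothetical, and nothing here reflects SABLE's actual GPU implementation.

```python
def block_diagonal_template(rows, cols, n, batch):
    """Index pattern of a (batch*n) x (batch*n) block-diagonal matrix.

    `rows` and `cols` hold the shared sparsity pattern of one n x n block.
    """
    big_rows = [b * n + r for b in range(batch) for r in rows]
    big_cols = [b * n + c for b in range(batch) for c in cols]
    return big_rows, big_cols


def embed_values(batched_values):
    """Flatten per-sample nonzero values in the same order as the template."""
    return [v for sample in batched_values for v in sample]


def to_dense(big_rows, big_cols, values, size):
    """Dense view, used only to inspect the small example."""
    dense = [[0.0] * size for _ in range(size)]
    for r, c, v in zip(big_rows, big_cols, values):
        dense[r][c] = v
    return dense


# Hypothetical pattern with three nonzeros per block and a batch of two samples.
rows, cols = [0, 0, 1], [0, 1, 1]
big_rows, big_cols = block_diagonal_template(rows, cols, n=2, batch=2)
values = embed_values([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
dense = to_dense(big_rows, big_cols, values, size=4)
```

Because the index lists depend only on the pattern and the batch size, a solver can analyze them once and then accept a fresh flat value array on every iteration.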

2606.07085 2026-06-08 cs.SE New submission

Porting Declarative UI to HarmonyOS: A Heuristic-guided LLM Approach

Kunwu Zheng, Pengyu Xue, Zhen Yang, Xiran Lyu, Peishi Lai, Mengying Zhao, Yutian Tang, Huizhi Zhang, Xianhang Li, Linhao Wu, Chengyi Wang

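AI summary: To meet the need to migrate declarative UIs from Android/iOS to HarmonyOS, proposes ArkTrans, which heuristically constructs skeletons and repairs syntax errors via pattern matching, achieving a high compilation success rate and visual fidelity.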

Abstract

As an emerging operating system, HarmonyOS has a significant demand for software migration from platforms such as Android and iOS, in which User Interface (UI) translation is a critical link. However, the latest UI development has shifted to declarative paradigms, e.g., Kotlin Jetpack Compose (KJC) for Android, SwiftUI for iOS, and ArkUI for HarmonyOS, rendering prior translation approaches inapplicable, as they target either backend logic or legacy imperative UIs. As such, this paper targets ArkUI and proposes an automatic translation approach, namely ArkTrans, to port UI files from Android and iOS to HarmonyOS. ArkTrans overcomes two salient challenges during the translation: (1) Programming Language (PL) unfamiliarity, and (2) severe syntactic chaos. For the first challenge, ArkTrans heuristically constructs ArkUI skeletons by extracting metadata from the source PL, thereby guiding the LLM's initial translation. For the second challenge, ArkTrans executes empirically derived post-fixing rules via pattern matching to repair most of the remaining syntactic errors. To examine the effectiveness of ArkTrans, we construct a 100-sample parallel UI page translation benchmark from KJC/SwiftUI to ArkUI at the file level. Extensive experiments demonstrate that LLMs with direct/one-shot prompting cannot translate a single compilable UI page. In contrast, up to 90.67% of ArkTrans-translated files can be successfully compiled with high visual fidelity.

2606.07078 2026-06-08 cs.CG New submission

HRsR: Hierarchical Rotation System Reconstruction

Ruiqi Cui, Cem Akarsubaşı, Emil Toftegaard Gæde, Eva Rotenberg, Leif Kobbelt, J. Andreas Bærentzen

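AI summary: Proposes Hierarchical Rotation System Reconstruction (HRsR), which accelerates Rotation System Reconstruction (RsR) with a hierarchical pipeline of edge collapses and vertex splits, achieving up to 6x speedup and 8x memory reduction while preserving geometric fidelity and topological control.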

2606.07071 2026-06-08 cs.IR New submission

Decision-Theoretic Stopping Rules for Document Screening

Aaron H. A. Fletcher, Mark Stevenson

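AI summary: Addresses the question of when to stop document screening by proposing three stopping strategies based on decision theory and the expected value of perfect information, which obtain higher net utility than existing methods on patent examination and systematic review tasks.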

2606.07060 2026-06-08 cs.DB New submission

Auto-Relate: A Unified Approach to Discovering Reliable Functional Relationships Leveraging Statistical Tests

Ziyan Han, Yeye He, Shuyuan Kang, Min Xie, Weiwei Cui, Song Ge, Haidong Zhang, Dongmei Zhang, Surajit Chaudhuri, Rui Mao, Jianbin Qin

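AI summary: Proposes the Auto-Relate framework, which discovers reliable functional relationships in tables through a mine-then-verify process and four reliability criteria (accuracy, atomicity, stability, integrity), reaching an average PR-AUC of 0.87 on 58,679 real-world tables.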

Abstract

Tables in spreadsheets, computational notebooks, and databases often contain rich inter-column relationships. Yet these relationships are typically implicit and are often lost when tables are exported to standard formats. Recovering them can benefit downstream tasks, including table understanding, data quality improvement, and provenance analysis. However, simply mining relationships that hold on an observed table is insufficient, as many are spurious due to coincidence, redundancy, or limited data diversity. In this paper, we introduce functional relationships (FRs) as a unified notion for inter-column relationships in tables, subsuming arithmetic relationships, string transformations, and functional dependencies. We characterize FR reliability through four complementary criteria: accuracy, atomicity, stability, and integrity. Guided by these criteria, we propose Auto-Relate, a mine-then-verify framework that first generates accurate candidate FRs and then verifies the remaining reliability criteria through a Minimality Test, a Perturbation Test, and an Independence Test, respectively. To further improve efficiency, we develop three optimization strategies, including a group-by lower bound for early rejection, a closed-form speedup for arithmetic FRs, and a binomial bound for statistically guided early termination. We construct a large-scale benchmark suite from 58,679 real-world spreadsheets and relational tables, containing 6,414 ground-truth FRs spanning all three FR types. Extensive experiments against 18 baselines show that Auto-Relate consistently achieves the best performance, with an average PR-AUC of 0.87, 59% higher than the best competing baseline across all settings.
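To make the notion of a relationship that "holds on an observed table" concrete, here is a minimal sketch that checks a functional dependency and a simple arithmetic relationship row by row. It covers only this basic accuracy check, not Auto-Relate's verification tests; the helper names and the toy table are hypothetical.

```python
def holds_fd(rows, lhs, rhs):
    """True if each combination of `lhs` values maps to one combination of `rhs` values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def holds_product(rows, a, b, out):
    """True if column `out` equals `a * b` on every row."""
    return all(row[out] == row[a] * row[b] for row in rows)


# Hypothetical table; the column names are made up for this example.
table = [
    {"zip": "10001", "city": "New York", "price": 4, "qty": 3, "total": 12},
    {"zip": "10001", "city": "New York", "price": 5, "qty": 2, "total": 10},
    {"zip": "94105", "city": "San Francisco", "price": 2, "qty": 6, "total": 12},
]
```

On this table `zip -> city` and `total = price * qty` both hold, but so does `price -> qty`, purely because every price happens to be distinct. That kind of coincidental match is the spurious case the abstract says simple mining cannot rule out.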

2606.07046 2026-06-08 cs.DC New submission

Predictive Autoscaling in Cloud-Native and Federated Cloud-Edge Computing Environments: A Taxonomy and Future Directions

Bablu Kumar, Anshul Verma, Rajkumar Buyya

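AI summary: Systematically reviews predictive autoscaling techniques in cloud-native and federated cloud-edge environments, proposes a taxonomy based on triggers, targets, prediction models, and evaluation metrics, discusses mechanisms such as CRDs, MAPE control loops, and federated learning, and identifies future research directions.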

Abstract

Autoscaling is a key capability in cloud-native systems, where dynamic workloads, heterogeneous environments, and latency-sensitive applications require efficient and adaptive resource management. Traditional reactive approaches based on fixed thresholds often respond too late, leading to resource imbalance, performance degradation, and unstable scaling behavior. Recent advances in predictive models, Kubernetes Custom Resource Definitions (CRDs), Monitor-Analyse-Plan-Execute (MAPE) based control loops, and federated learning (FL) have enabled more proactive and autonomous autoscaling strategies. This paper presents a structured review of these developments. It first introduces a taxonomy of autoscaling techniques based on triggers, targets, prediction models, and evaluation metrics. It then examines predictive autoscaling approaches and CRD-based mechanisms, including Kubernetes operators and reconciliation workflows. Further, it analyses autoscaling in federated learning environments, highlighting reactive and proactive strategies alongside privacy-preserving techniques and container-level isolation. The paper also discusses drift-aware and uncertainty-aware autoscaling, incorporating concepts such as the Autoscaling Drift Index (ADI), feedback-driven correction, and stability control for heterogeneous workloads. Finally, it outlines open challenges and future research directions, providing a foundation for next-generation intelligent predictive autoscaling in cloud-edge environments.

2606.07019 2026-06-08 cs.DC New submission

PCCL: Process Group-Aware Scalable and Generic Collective Algorithm Synthesizer

William Won, Kartik Lakhotia, Madhu Kumar, Sudarshan Srinivasan, Tushar Krishna

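AI summary: Proposes the PCCL framework, which uses process-group awareness and topology awareness to automatically generate near-optimal algorithms for arbitrary collective patterns (such as All-to-All), significantly improving the efficiency of collective communication in distributed training.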

Comments Contains 11 main pages, 19 figures, three tables, three algorithms

Abstract

Distributed machine learning has become increasingly important due to the massive scale of generative models. Both model parameters and data are distributed across many compute devices, which requires frequent collective communications to synchronize activations and parameter updates. Such collective communications have become a major bottleneck. While the performance of a collective algorithm depends on the physical network topology, the baseline collective algorithms in collective communication libraries are largely topology-agnostic. Collective algorithm synthesizers address this inefficiency by automatically generating topology-aware collective algorithms. However, prior work has largely overlooked that collective communication typically occurs only among a subset of devices, known as a process group. Additionally, most existing synthesizers are limited in the range of target collective patterns they can generate. We propose PCCL, a scalable and generic framework for synthesizing topology-aware collective algorithms. PCCL is process group-aware and capable of generating near-optimal collective algorithms even when only a subset of devices participates in collective operations. PCCL synthesizes arbitrary collective patterns, including 512-NPU All-to-All synthesis in 11.68 minutes.

2606.07009 2026-06-08 cs.CR cs.IT math.IT New submission

Fast Bounded-Independence Functions and Their Duals

Martijn Brehm, Yuval Ishai, Nicolas Resch

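AI summary: Constructs fast functions with linear circuit size, realizing t-wise independent hash functions of optimal algebraic degree, improves the construction of fast codes and their duals, and gives the first family of fast linear functions that maps any t linearly independent inputs to uniform and statistically independent outputs, with applications to cryptography.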

Comments Full version of paper to appear in ITC 2026. 34 pages

Abstract

We continue the study of fast functions, computable by linear-size circuits, that share useful properties of random functions. Motivated by cryptographic applications, we generalize and improve on previous results in this area, obtaining the following results:
- For any constant $t$, we construct a fast $t$-wise independent hash function with algebraic degree $\log_2 t$ (over $\mathbb F_2$), simultaneously optimizing both asymptotic circuit size and degree.
- We simplify and improve a recent construction (ITCS 2026) of a family of fast codes with fast duals, both meeting the Gilbert-Varshamov bound. Unlike the previous construction, our construction has negligible failure probability, can accommodate general fields and rates, supports a systematic encoding, and admits fast universal encoders.
- We strengthen the above to support stronger random-like properties, such as optimal combinatorial list-decoding. This is achieved by constructing, for any constant $t$, a family of fast linear functions that map any $t$ linearly independent inputs to uniform and statistically independent outputs. Prior to our work, this was only known for $t=1$.
We demonstrate the usefulness of the above results to cryptography. This includes the first nontrivial protocols for perfectly secure multiparty computation whose circuit complexity scales linearly with the number of parties, as well as protocols for computing encrypted matrix-vector products with optimal asymptotic circuit complexity.
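For readers unfamiliar with $t$-wise independence, the textbook construction is a uniformly random polynomial of degree at most $t-1$ over a prime field. The sketch below shows that classical family, which is not the linear-size construction of this paper; the names `poly_hash`, `sample_hash`, and `joint_counts` are hypothetical. The exhaustive count over a tiny field confirms that outputs at $t$ distinct points are jointly uniform.

```python
import itertools
import random


def poly_hash(coeffs, p):
    """h(x) = coeffs[0] + coeffs[1]*x + ... mod p, evaluated with Horner's rule."""
    def h(x):
        acc = 0
        for a in reversed(coeffs):
            acc = (acc * x + a) % p
        return acc
    return h


def sample_hash(t, p, rng=random):
    """Draw one member of the family: t uniform coefficients in [0, p)."""
    return poly_hash([rng.randrange(p) for _ in range(t)], p)


def joint_counts(t, p, points):
    """Count output tuples at `points` over every member of the family (small p only)."""
    counts = {}
    for coeffs in itertools.product(range(p), repeat=t):
        h = poly_hash(coeffs, p)
        key = tuple(h(x) for x in points)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

With `p = 5` and `t = 2`, `joint_counts(2, 5, (1, 3))` visits all 25 polynomials and every one of the 25 possible output pairs appears exactly once, which is what pairwise independence requires.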

2606.07005 2026-06-08 cs.CR New submission

The Sound of Malware: A Memory Forensics Approach for Android Malware Analysis via Audio Signals

Silvia Lucia Sanna, Massimo Palozzi, Leonardo Regano, Riccardo Lazzeretti, Giorgio Giacinto

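AI summary: Proposes a memory forensics framework that converts the static bytecode and memory snapshots of Android malware into audio waveforms and uses spectral descriptors, CNNs, and Transformer embeddings to reach up to 98.0% accuracy.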

Abstract

Android malware analysis is currently facing increasing challenges in achieving robust classification and detecting stealth attacks. Modern threats employ advanced evasion strategies such as code obfuscation, dynamic loading, packing, and even steganographic manipulation of traditional static and dynamic features. These techniques reduce the effectiveness of signature-based systems and degrade the reliability of Machine Learning models that depend on explicit semantic indicators such as permissions, API calls, or control-flow structures. In this work, we propose a memory forensics malware detection framework that shifts the analysis perspective from semantic program modeling to signal-based structural representation. Both static bytecode and early-execution memory snapshots are transformed into audio waveforms through direct binary-to-waveform mapping, preserving low-level structural patterns without requiring disassembly or feature engineering. The resulting signals are processed using handcrafted spectral descriptors, Convolutional Neural Networks, and transformer-based embeddings. Experiments on the CICMalDroid2020 dataset and VirusTotal malware demonstrate that the framework achieves up to 98.0% accuracy, outperforming static sonification and competitive state-of-the-art approaches.
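One simple way to picture a direct binary-to-waveform mapping is to treat every byte as one unsigned 8-bit PCM sample. The sketch below does this with Python's standard `wave` module. The one-byte-per-sample mapping, the 8000 Hz sample rate, and the function names are assumptions for illustration; the abstract does not specify the mapping the authors use.

```python
import io
import wave


def bytes_to_wav(data, sample_rate=8000):
    """Wrap raw bytes as a mono, 8-bit unsigned PCM WAV file held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(1)  # one byte per sample
        writer.setframerate(sample_rate)
        writer.writeframes(data)
    return buf.getvalue()


def wav_to_bytes(wav_bytes):
    """Recover the original byte sequence from the WAV container."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        return reader.readframes(reader.getnframes())


# Hypothetical payload standing in for bytecode or a memory snapshot.
payload = bytes(range(256))
wav_file = bytes_to_wav(payload)
```

Since each sample is exactly one input byte, the mapping is lossless, and byte-level patterns in the input carry over to the signal that a spectral feature extractor would see.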

2606.06995 2026-06-08 eess.SY cs.SY New submission

Power Grid Topology Control

Tong Han, Yan Xu, David J. Hill

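AI summary: Surveys the development of power grid topology control, covering steady-state topology control, topology transition, and transient topology control, with the aim of using network-side flexibility to address the challenges of renewable integration.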

Abstract

Power grids are facing major challenges from growing renewable integration and worsening climate impacts. While flexibility on both the demand and generation sides has been widely explored to address these challenges, network-side flexibility, especially in network topology, remains highly underutilized. Advances in communication, power electronics, and circuit breakers have made network topology increasingly controllable. However, leveraging this topological flexibility poses substantial challenges, primarily due to the inherent non-convexity and hybrid dynamics in associated optimization and control problems. This monograph surveys the development of power grid topology control in both early and recent years. It begins by discussing the fundamental topological constraints involved in topology control problems. Subsequently, it introduces steady-state topology control for transmission and distribution networks separately, covering fundamentals, a state-of-the-art review, and representative recent advances. Additionally, the network topology transition problem, which addresses the implementation of optimal topology solutions and has garnered increasing attention in recent years, is further modeled and analyzed. Beyond utilizing the flexibility of steady-state network topology, controlling network topology during transients can also contribute to system stabilization. Traditional approaches, such as intentional controlled islanding for transmission networks, as well as recently developed topology control methods for microgrid stabilization, exemplify this concept. Finally, a summary of this monograph is provided.

2606.06989 2026-06-08 cs.GT New submission

Menu Selection: A Computational Approach to Minimizing Food Waste

Haris Aziz, Nicholas Mattei, Shivika Narang, Sanjukta Roy

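AI summary: Proposes a collective decision-making problem in which a minimum-size menu is selected under two consumption models (optimistic and pessimistic) so that all agents get enough food and waste is minimized; gives characterizations of valid menus, polynomial-time algorithms, and tight bounds on the waste-of-pessimism ratio.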

Abstract

We introduce a novel collective decision-making problem that captures the ubiquitous issue of ordering food to cater for varied dietary preferences and requirements. Our setting involves agents with diverse dietary requirements over menu options with varied serving sizes. The goal is to select a menu such that everyone has enough food they can consume and food waste is minimized. We introduce two different consumption models: optimistic and pessimistic. Optimistic consumption assumes that a central planner can optimally allocate the ordered food among the agents to maximize the number of people who get enough to eat. Pessimistic consumption considers the worst-case guarantee on consumption when agents fill their own plates in an arbitrary order. Under either consumption model, we seek valid menus (under which all agents are sufficiently fed) of minimum size. Our work provides two sets of characterizations: (1) we characterize valid menus under either consumption model and (2) we characterize the space of instances that admit polynomial-time algorithms to find minimum-size menus. Our results also help us design Integer Linear Programs to find minimum-size menus in general settings. Furthermore, we present polynomial-time algorithms for important special cases. We then consider the worst-case discrepancy between the sizes of minimum-size optimistic and pessimistic menus. We call this the waste of pessimism, captured by the ratio of the size of the minimum-size pessimistic menu to that of the minimum-size optimistic menu. We show tight upper bounds on this ratio. Our results also provide additional insights on the problem of finding a minimum-size maximal matching, which may be of independent interest.

2606.06970 2026-06-08 cs.IR New submission

SSRLive: Live Streaming Recommendation with Dynamic Semantic ID

Teng Shi, Zhaoheng Li, Yuanhang Qu, Yi Liu, Lixiang Lai, Yuning Jiang

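AI summary: To address the problems that static semantic IDs cannot reflect dynamic content changes in live streaming recommendation and that generative methods ignore user-streamer interaction signals, proposes the SSRLive framework, which combines a generative module (dynamic semantic IDs) with a discriminative module (interaction augmentation) and significantly improves metrics such as watch time and GMV in real-world deployment.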

Abstract

Live streaming has emerged as one of the fastest-growing forms of online media, enabling instant content broadcasting and real-time engagement between users and streamers. Despite the effectiveness of existing recommendation algorithms in this domain, they often suffer from limited utilization of computational resources, with low FLOPs that hinder further performance enhancement. Generative recommendation techniques, which have gained traction in various industrial tasks, offer a promising avenue for improving live streaming recommendations. However, directly applying generative methods to live streaming is non-trivial due to two major challenges: (1) static semantic IDs (SIDs) cannot reflect the rapidly changing nature of live room content; and (2) generative pipelines generally do not incorporate user-streamer interaction signals (e.g., likes, orders), which are critical for modeling user intent toward both the streamer and showcased products. To address these challenges, we introduce SSRLive: Dynamic Semantic ID-guided Streaming Recommendation for Live platforms. The proposed framework integrates a generative module and a discriminative module in a unified architecture. The generative component employs an encoder-decoder design to produce both static and dynamic SIDs, enabling timely representation of live room content while leveraging multimodal information. The discriminative component refines task-specific representations by combining SIDs with user features, augments them with user-streamer interaction data, and performs multi-task predictions. Online A/B tests in real-world deployment demonstrate tangible benefits: watch time (+3.38%), GMV (+0.72%), follower growth (+3.12%), and interaction volume (+2.92%). These improvements highlight the effectiveness and business value of SSRLive, which is now fully deployed, serving hundreds of millions of active users.

2606.06968 2026-06-08 cs.CR New submission

HAVE: Host Active Verification Engine for Closing the Contextual Reality Gap in Security Digital Twins

Vincenzo Sammartino, Marco Pasquini

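AI summary: Proposes the HAVE engine, which uses a safety-constrained host agent to measure the empirical probability of compromise by maximum-likelihood estimation and corrects the Contextual Reality Gap caused by CVSS scores using a Wilson interval confidence weight and a Bayesian blending rule; experiments show it reduces the reach probability by 38.2% in false-positive scenarios and increases it by 132.4% in false-negative scenarios.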

Comments This work has been submitted to the IEEE for possible publication

Abstract

Security Digital Twins (SDTs) provide continuously updated virtual replicas of infrastructure for threat simulation, yet they rely on theoretical CVSS scores to assign lateral-movement probabilities, creating the Contextual Reality Gap: risk is overestimated where unacknowledged mitigations neutralize exploits, and drastically underestimated where logic flaws bypass all memory-safety defenses. We present the Host Active Verification Engine (HAVE), an SDT extension that deploys a safety-constrained host agent to measure the empirical probability of compromise $\hat{p}$ via maximum-likelihood estimation over snapshot-isolated Bernoulli trials. A Wilson interval-width confidence weight $\alpha_w$ propagates $\hat{p}$ into Monte Carlo simulations via a Bayesian blending rule formally related to the Beta-Binomial posterior. Evaluation across four vulnerability classes, three security tiers, and two production binaries shows HAVE reduces $P_{\text{reach}}$ by 38.2% in false-positive scenarios and increases it by 132.4% in false-negative scenarios, with a net +124.1% correction; post-HAVE estimates vary by only $1.12\times$ across calibration exponents $\kappa$, versus $4.6\times$ for CVSS-only baselines.
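The two statistical ingredients named in the abstract, the maximum-likelihood estimate over Bernoulli trials and the Wilson score interval, are standard and can be sketched briefly. The `blend` function is a hypothetical stand-in that simply trusts the measurement more as the interval narrows; it is an assumption for illustration and not the paper's blending rule or its definition of the confidence weight.

```python
import math


def mle_and_wilson(successes, trials, z=1.96):
    """Return (p_hat, lower, upper): the Bernoulli MLE and its Wilson score interval."""
    p_hat = successes / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    )
    return p_hat, center - half, center + half


def blend(p_prior, successes, trials, z=1.96):
    """Hypothetical blend: weight the measurement by one minus the interval width."""
    p_hat, lower, upper = mle_and_wilson(successes, trials, z)
    weight = 1.0 - (upper - lower)  # assumption, not the paper's confidence weight
    return weight * p_hat + (1.0 - weight) * p_prior
```

For 8 successes in 10 trials the estimate is 0.8 with a 95% Wilson interval of roughly (0.49, 0.94); with 80 successes in 100 trials the interval tightens, so under this illustrative rule the blended value moves closer to the measured 0.8 and away from the prior.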

2606.06955 2026-06-08 cs.NI New submission

i2Slicer: Enabling Flexible and Automated Orchestration of 5G SA End-to-End Network Slices

M. Catalan-Cid, A. Fernandez, D. Camps-Mur, S. Siddiqui

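AI summary: Proposes i2Slicer, a solution for flexibly orchestrating 5G standalone (SA) end-to-end network slices that supports multiple tenants and multiple services and simplifies slice deployment through automated lifecycle management.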

Journal ref 2023 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN)

2606.06947 2026-06-08 cs.IR New submission

DREAM: Dynamic Refinement of Early Assignment Mappings

Liwei Guan, Huanjie Wang, Hongwei Zhang, Linxun Chen, Zhaojie Liu

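AI summary: To address the performance bottleneck that static encoding causes for cold-start items in SID-based generative recommendation, proposes the DREAM framework, which substantially improves cold-start recommendation through three-stage progressive refinement: an intent-aware tokenizer, frozen-backbone evaluation, and a dynamic beam mechanism.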

Comments 12 pages, 4 figures, 5 tables

Abstract

Generative recommendation advances item retrieval by reformulating it as autoregressive generation of Semantic IDs (SIDs), compact token sequences that encode item semantics. While SIDs offer a strong semantic prior, current SID-based methods assign each item a single static identifier through offline tokenization before sufficient user feedback is observed. For cold-start items, this one-shot commitment produces poorly discriminative codes, generating misaligned paths that remain unrefined because the associated tokens are rarely sampled during training. We identify this early static commitment, not model capacity, as the fundamental cold-start bottleneck in SID-based generative recommendation. To overcome this bottleneck and bridge the disjoint objectives of tokenization and generation, we propose DREAM (Dynamic Refinement of Early Assignment Mappings), a three-stage framework that resolves this flaw through progressive refinement. First, an intent-aware tokenizer rebuilds the SID space through counterfactual contrastive learning, generating a diverse pool of behavior-aligned candidates per cold-start item. Second, the frozen recommendation backbone serves as an evaluator, selecting the most reliable candidate based on multi-context user support without retraining. Third, a dynamic beam mechanism maintains multiple weighted SID hypotheses throughout training and inference, preventing premature collapse to a single assignment. Extensive experiments on three Amazon benchmarks show that DREAM substantially outperforms state-of-the-art generative and sequential baselines on cold-start metrics.

