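arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering and systems.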

2606.12136 2026-06-11 cs.NI New submission

Greenness-Driven Scheduling in Far Edge Kubernetes: A CODECO Evaluation

Kaikang Huang, Dalal Ali, Rute C. Sofia

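AI summary: This paper studies how the Kubernetes CODECO framework uses cross-layer energy-aware scheduling to reduce the energy consumption of containerized applications across the IoT-Edge-Cloud continuum. Experiments on ARM devices show savings of up to 11.01 mJ in computational energy and 4.14 mJ in network energy.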

Abstract

Energy consumption is an increasing concern in IoT-Edge-Cloud infrastructures, where containerized application orchestration must balance performance with sustainability. This paper investigates how the Kubernetes CODECO framework integrates cross-layer energy-awareness into scheduling decisions for containerized applications across the IoT-Edge-Cloud continuum. CODECO monitors energy at both the computational level, via Kepler, and at a network (IP) level, and uses these metrics to define greenness heuristics that guide pod placement decisions through its ILP-based scheduler. The approach is experimentally evaluated on a real-world far Edge testbed composed of ARM-based embedded devices, comparing CODECO against vanilla Kubernetes across multiple scenarios. The results show that CODECO consistently reduces the energy consumption of the cluster, with savings of up to 11.01 mJ in computational energy and 4.14 mJ in network transmission energy consumption at peak load, for a wide set of scenarios which combine different types of injected fault conditions, including CPU stress, asymmetric network delay, and bandwidth contention. A composite greenness score combining both energy dimensions provides a stable and consistent ranking of scheduling strategies across all conditions, demonstrating its suitability as a unified energy indicator for cluster-level orchestration decisions across the IoT-Edge-Cloud continuum.
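
The composite greenness score described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how CODECO combines the two energy dimensions, so the min-max normalisation, the equal weights, and the node readings below are all illustrative assumptions rather than the framework's actual formula.

```python
from typing import Dict, List, Tuple


def greenness_scores(
    energy_mj: Dict[str, Tuple[float, float]],
    w_compute: float = 0.5,  # assumed weight, not taken from the paper
    w_network: float = 0.5,  # assumed weight, not taken from the paper
) -> Dict[str, float]:
    """Map node -> score in [0, 1]; a higher score means a greener node.

    energy_mj holds (compute energy, network energy) per node in mJ.
    """

    def normalise(values: List[float]) -> List[float]:
        lo, hi = min(values), max(values)
        span = hi - lo
        # If every node reports the same value, nobody is penalised.
        return [0.0 if span == 0 else (v - lo) / span for v in values]

    nodes = list(energy_mj)
    compute = normalise([energy_mj[n][0] for n in nodes])
    network = normalise([energy_mj[n][1] for n in nodes])
    return {
        n: 1.0 - (w_compute * c + w_network * k)
        for n, c, k in zip(nodes, compute, network)
    }


def greenest_node(energy_mj: Dict[str, Tuple[float, float]]) -> str:
    """Pick the node with the highest composite score."""
    scores = greenness_scores(energy_mj)
    return max(scores, key=scores.get)


# Hypothetical readings for three far-edge nodes.
readings = {"edge-a": (30.0, 6.0), "edge-b": (19.0, 9.0), "edge-c": (25.0, 4.0)}
print(greenest_node(readings))
```

A scheduler could use such a score as one term of its placement objective; the real CODECO scheduler is ILP-based, which this ranking does not attempt to reproduce.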

2606.11934 2026-06-11 cs.NI New submission

Exploratory Analysis of Wi-Fi 6 Dynamic Resource Unit Sharing in Small-Scale Network Scenarios

Sai Mada, Anna Baron, Luigi Martino, Rute C. Sofia

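AI summary: To address the limitations of static RU scheduling under dynamic traffic, the paper proposes a dynamic RU allocation algorithm that maps TSN traffic classes to Wi-Fi 6 QoS mechanisms. Simulations show lower latency, jitter, and packet loss than static schemes.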

Abstract

This paper investigates dynamic Resource Unit (RU) allocation strategies for Wi-Fi 6 (IEEE 802.11ax) networks integrated with Time-Sensitive Networking (TSN), targeting the limitations of static RU scheduling under dynamic traffic conditions. We propose a dynamic RU allocation algorithm that maps TSN traffic classes to Wi-Fi 6 Quality of Service (QoS) mechanisms, including Enhanced Distributed Channel Access (EDCA), and aligns TSN control with Ethernet-based TSN domains. The proposed solution is evaluated using the ns-3 DetNetWiFi framework developed by fortiss, focusing on time-sensitive traffic. Simulation results demonstrate improved network efficiency with reductions in latency, jitter, and packet loss compared to static RU allocation schemes. These findings highlight the potential of dynamic RU allocation to support deterministic communication requirements in Wi-Fi 6-based TSN deployments and to enhance the reliability of hybrid industrial networks.
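
As a rough illustration of mapping traffic classes to EDCA and then sharing RUs dynamically, the Python sketch below uses the default IEEE 802.11 user-priority to access-category mapping. Everything else is an assumption for illustration: the per-category weights, the backlog-proportional split, and the simplification that a 20 MHz channel is divided into nine interchangeable 26-tone RUs. The paper's actual algorithm may differ.

```python
from typing import Dict, Tuple

# Default 802.11 mapping from user priority (802.1Q PCP) to EDCA access category.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",
    0: "AC_BE", 3: "AC_BE",
    4: "AC_VI", 5: "AC_VI",
    6: "AC_VO", 7: "AC_VO",
}

# Hypothetical scheduling weights per access category (not from the paper).
AC_WEIGHT = {"AC_BK": 1, "AC_BE": 2, "AC_VI": 4, "AC_VO": 8}


def allocate_rus(stations: Dict[str, Tuple[int, int]], total_rus: int = 9) -> Dict[str, int]:
    """Split 26-tone RUs in proportion to weighted backlog.

    stations maps a station name to (priority code point, backlog in bytes).
    Uses largest-remainder rounding so that all RUs are handed out.
    """
    demand = {
        name: AC_WEIGHT[UP_TO_AC[pcp]] * backlog
        for name, (pcp, backlog) in stations.items()
    }
    total = sum(demand.values())
    if total == 0:
        return {name: 0 for name in stations}
    exact = {name: total_rus * d / total for name, d in demand.items()}
    alloc = {name: int(x) for name, x in exact.items()}
    leftover = total_rus - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical stations: a PLC (priority 7), a camera (5), and a logger (0).
stations = {"plc": (7, 500), "camera": (5, 1000), "logger": (0, 1000)}
print(allocate_rus(stations))
```

Re-running the allocation every scheduling interval with fresh backlog values is what makes the split dynamic; a static scheme would fix the allocation once.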

2606.11877 2026-06-11 cs.NI New submission

LLM-Enabled NWDAF: A Step Toward AI-Native 6G Network Intelligence

Henok Daniel, Omar Alhussein, Cheng Li, Jie Liang, Ernesto Damiani

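AI summary: The authors develop an open-source NWDAF compatible with Free5GC that integrates a large language model interface. Intent recognition enables natural-language interaction and simplifies network analytics management, laying a foundation for AI-native 6G networks.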

Comments: 20 pages

Abstract

The Network Data Analytics Function (NWDAF) is central to enabling zero-touch network management in fifth-generation (5G) networks by supporting real-time analytics and closed-loop automation. Despite its critical role, open-source NWDAF implementations remain limited in scope and accessibility. In this paper, we develop an open-source NWDAF, compatible with the open-source core network Free5GC, that collects network data via subscriptions to Network Functions (NFs) and includes an integrated Large Language Model (LLM) interface that enables natural language interaction with human operators. The interface processes user intents, encodes them using a semantic embedding model, and maps them to one of seven predefined intent categories to trigger analytics queries or event subscription commands. This architecture abstracts the complexity of traditional interfaces, allowing non-expert users to manage network analytics and subscriptions with ease. The system supports Access and Mobility Management Function (AMF) and Session Management Function (SMF) event subscriptions, real-time monitoring, and analytics retrieval via Prometheus, all accessible through a conversational interface. By bridging AI-driven intent recognition with standardized network analytics, our implementation enhances operator usability and provides a foundation towards AI-native 6G networks. The source code and datasets generated during the current study are available in the GitHub repository, this https URL.
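
The intent-mapping step (encode the request, then pick the closest predefined category) can be sketched in Python as follows. This is not the authors' implementation: a bag-of-words vector stands in for the semantic embedding model, and the three category names and example phrases are hypothetical placeholders for the paper's seven categories.

```python
import math
from collections import Counter

# Hypothetical intent categories with one example phrase each.
INTENT_EXAMPLES = {
    "subscribe_amf_events": "subscribe to amf registration and mobility events",
    "subscribe_smf_events": "subscribe to smf pdu session events",
    "query_analytics": "show load analytics and metrics for the network",
}


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(utterance: str) -> str:
    """Return the intent whose example is most similar to the utterance."""
    query = embed(utterance)
    return max(
        INTENT_EXAMPLES,
        key=lambda name: cosine(query, embed(INTENT_EXAMPLES[name])),
    )


print(classify("please subscribe to smf session events"))
print(classify("show network load metrics"))
```

In a real deployment the matched category would be translated into an analytics query or an event subscription request; that dispatch step is omitted here.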

2606.11803 2026-06-11 cs.CR cs.NI New submission

SwarmSense-DNN: A Trustworthy and Decentralized Neural Framework for Proactive Anomaly Defense in Consumer IoT

Jing Yang, Vijay Govindarajan, Saad Arif, Xu Xu, Mohamed Kallel, Zaffar Ahmed Shaikh, Zhe Liu, Chunhong Yuan, Lip Yee Por

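AI summary: The paper proposes SwarmSense-DNN, a decentralized framework that combines swarm intelligence with deep neural networks. Hierarchical federated learning and graph attention mechanisms enable cooperative anomaly detection in distributed IoT environments, reaching 95.44% average detection accuracy on five benchmark datasets while cutting communication overhead by 67%.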

Comments: 11 pages, 14 figures

Abstract

The rapid growth of consumer IoT devices has introduced unprecedented challenges in trustworthy anomaly detection against AI-enabled cyber threats, requiring real-time, privacy-preserving, and scalable defense mechanisms. Traditional centralized strategies face critical limitations, including communication bottlenecks, single points of failure, and privacy vulnerabilities when processing distributed consumer data. We propose SwarmSense-DNN, a novel decentralized neural framework employing swarm intelligence for secure, cooperative anomaly detection across distributed IoT environments. The framework integrates autonomous agents with deep neural networks to form a self-organizing defense system that detects evolving anomalies without centralized coordination. It utilizes hierarchical federated learning with graph neural networks and attention mechanisms to capture local and global anomaly behaviors while ensuring data privacy. Extensive experiments demonstrate SwarmSense-DNN's superior performance: it achieves 95.44% average detection accuracy across five benchmark datasets while reducing communication overhead by 67%. The framework maintains robust resilience against adversarial threats through differential privacy safeguards and demonstrates strong fault tolerance under node failures and AI-enabled attacks.
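
To show what the hierarchical part of hierarchical federated learning means in practice, here is a minimal two-tier aggregation sketch in Python. It assumes FedAvg-style weighting by local sample count, which the abstract does not confirm, and it leaves out the graph neural networks, attention, and differential privacy that the framework also uses.

```python
from typing import List, Sequence, Tuple

# One update = (model weights, number of local training samples).
Update = Tuple[Sequence[float], int]


def weighted_average(updates: List[Update]) -> Tuple[List[float], int]:
    """Average weight vectors, weighting each by its sample count."""
    total = sum(count for _, count in updates)
    dim = len(updates[0][0])
    averaged = [
        sum(weights[i] * count for weights, count in updates) / total
        for i in range(dim)
    ]
    return averaged, total


def hierarchical_fedavg(clusters: List[List[Update]]) -> List[float]:
    """Tier 1: each cluster averages its members. Tier 2: average the clusters."""
    cluster_models = [weighted_average(members) for members in clusters]
    global_model, _ = weighted_average(cluster_models)
    return global_model


# Hypothetical updates from two device clusters.
clusters = [
    [([1.0, 0.0], 10), ([3.0, 2.0], 30)],
    [([5.0, 4.0], 60)],
]
print(hierarchical_fedavg(clusters))
```

Because each cluster forwards one aggregated model instead of every device update, only cluster-level models travel to the upper tier, which is one plausible source of reduced communication overhead.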

2606.11729 2026-06-11 cs.CR cs.NI New submission

A VPN-as-a-Service Tailored Enabler for Computing-constrained Environments

Carolina Fernández-Martínez, César Cajas Parra, Shuaib Siddiqui

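AI summary: The paper proposes a cloud-native VPN-as-a-Service (VPNaaS) that can be dynamically orchestrated to deploy a separate tunnel per tenant, integrates with IAM tools, adapts to computing- or entropy-constrained environments, and supports choosing between RSA and elliptic-curve key algorithms.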

Comments: Proc. 2025 IEEE 11th International Conference on Network Softwarization (NetSoft), 2025

Abstract

Industry has embraced Zero Trust (ZT) architectural tenets and implementations for cloud-native environments, following stricter security requirements for both internal and external tenants. Among others, these approaches combine fine-grained identity management and monitoring, used both to inventory devices and to better analyse their security posture for overall protection, with strict separation of concerns and isolation to enforce minimal privilege. Networking-wise, ZT approaches also rely on isolation and least privilege, enacted through separate, secure tunnels per tenant connecting to a given infrastructure. Such implementations can also be applied to the connectivity within and towards experimental infrastructures. In this sense, this work contributes the design and evaluation of a cloud-native VPN-as-a-Service (VPNaaS) that can be (i) easily orchestrated to deploy, on the fly, a separate tunnel for each tenant remotely connecting to the infrastructure; (ii) integrated with common Identity and Access Management (IAM) tools, key to ZT deployments; and (iii) adapted to computing- or entropy-constrained environments. The solution is customisable and allows, among other options, selecting RSA or Elliptic Curves (EC) as the key generation algorithm, along with their parameters, to achieve more secure keys and adapt to resource-constrained environments.

2606.11398 2026-06-11 cs.NI New submission

Predictive and Spatially Aware Scheduling in Flexible Duplexing for Deterministic Communications

Syed Morsleen Riaz, Baldomero Coll-Perales, M. Carmen Lucas-Estañ, Javier Gozalvez, Miguel Sepulcre

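AI summary: The paper proposes using traffic forecasting and predictive scheduling to reduce uplink/downlink conflicts in flexible duplexing, and using spatial diversity to lessen the impact of unavoidable conflicts. Successful transmissions increase by more than 40% over reference schemes.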

Abstract

Next generation wireless networks must sustain deterministic service levels for time-sensitive closed-loop applications. Flexible duplexing (FD) is an efficient solution to support these services, as it enables simultaneous uplink (UL) and downlink (DL) transmissions over orthogonal resources within the same band. However, simultaneous UL and DL transmissions can create conflicts that degrade performance due to interference from in-band emissions (IBE) and UL-to-DL cross-link interference (CLI). In this paper, we propose to use traffic forecasting and predictive scheduling to mitigate UL/DL conflicts in FD. Our proposal exploits traffic predictions to increase the likelihood of scheduling CLI-free UL and DL transmissions, and leverages spatial diversity to minimize the impact of unavoidable conflicts. Results show that the proposed scheme reduces UL/DL scheduling conflicts and improves the SINR of conflicted transmissions by more than 5 dB. This leads to gains of over 40% in the number of successfully completed transmissions compared to reference FD schemes.
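
The spatial-diversity idea can be illustrated with a small Python sketch: when an uplink and a downlink transmission must share a slot, choose the pair of devices that are far apart so that UL-to-DL cross-link interference is weaker. The greedy farthest-first rule, the use of plain Euclidean distance as a proxy for interference, and the coordinates are all assumptions for illustration, not the scheduler proposed in the paper.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def pair_by_distance(
    ul_ues: Dict[str, Position],
    dl_ues: Dict[str, Position],
) -> List[Tuple[str, str, float]]:
    """Greedily pair each UL device with the farthest unpaired DL device.

    Returns (ul name, dl name, separation in metres) for each shared slot.
    The greedy choice is simple but not guaranteed to be globally optimal.
    """
    remaining = dict(dl_ues)
    pairs = []
    for ul_name, ul_pos in ul_ues.items():
        if not remaining:
            break
        dl_name = max(remaining, key=lambda n: math.dist(ul_pos, remaining[n]))
        pairs.append((ul_name, dl_name, math.dist(ul_pos, remaining[dl_name])))
        del remaining[dl_name]
    return pairs


# Hypothetical device positions in metres.
uplink = {"ul1": (0.0, 0.0), "ul2": (10.0, 0.0)}
downlink = {"dl1": (1.0, 0.0), "dl2": (9.0, 0.0)}
print(pair_by_distance(uplink, downlink))
```

Pairing ul1 with dl2 and ul2 with dl1 keeps 9 m between simultaneous transmitters and receivers, whereas the naive pairing would leave only 1 m.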

2606.10508 2026-06-11 cs.CR cs.NI Updated version

A Deployment-Oriented Framework for Explainable AI-Assisted eBPF/XDP Mitigation at the IoT Edge

Abdurrahman Tolay

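AI summary: The paper proposes a framework for Linux-based IoT edge gateways that combines resource-aware AI risk scoring, event-level explainability, and bounded eBPF/XDP mitigation to achieve deployable control of anomalous traffic.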

Comments: 59 pages, 2 figures, 12 tables. Conceptual framework and research agenda for explainable AI-assisted eBPF/XDP mitigation at the IoT edge. Corrected truncated abstract metadata

Abstract

Internet of Things (IoT) deployments combine heterogeneous, resource-constrained devices with weak security configurations, exposed services, limited logging, patching constraints, and long lifecycles. Signature- and threshold-based controls remain useful baselines, but they are insufficient as standalone mechanisms in dynamic IoT networks. Likewise, offline artificial intelligence (AI) benchmark performance alone does not establish operational deployability. This article presents a conceptual framework and research agenda for a Linux-based IoT edge gateway that combines resource-aware flow-level AI-assisted risk scoring, event-level explainability, and bounded mitigation through eBPF/XDP. The controller applies reversible, time-limited actions subject to critical-device safeguards, updates packet-level enforcement state, and records structured logs. The architecture separates complex reasoning and policy control in user space from concise packet-handling decisions in the kernel. It also defines a future hardware-aware evaluation pathway covering detection quality, resource cost, response timing, rollback behaviour, and legitimate-traffic preservation. The paper does not report new experimental measurements or claim measured superiority or completed real-time performance.
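
The controller behaviour described above (reversible, time-limited actions, critical-device safeguards, structured logs) can be sketched in user-space Python. This is a conceptual model only: a plain dictionary stands in for the packet-level enforcement state that a real gateway would keep in an eBPF map, and the threshold, TTL cap, and action names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MitigationController:
    critical_devices: Set[str]
    max_ttl_s: float = 300.0     # assumed upper bound on any mitigation
    risk_threshold: float = 0.8  # assumed risk score that triggers action
    # Source IP -> expiry timestamp; stands in for kernel-side state.
    enforcement: Dict[str, float] = field(default_factory=dict)
    log: List[dict] = field(default_factory=list)

    def handle(self, src_ip: str, risk: float, now: float, ttl_s: float = 60.0) -> str:
        """Decide on a bounded action for one scored flow and log it."""
        if risk < self.risk_threshold:
            action = "allow"
        elif src_ip in self.critical_devices:
            # Safeguard: never throttle a critical device automatically.
            action = "alert_only"
        else:
            action = "rate_limit"
            self.enforcement[src_ip] = now + min(ttl_s, self.max_ttl_s)
        self.log.append({"t": now, "src": src_ip, "risk": risk, "action": action})
        return action

    def expire(self, now: float) -> List[str]:
        """Roll back every mitigation whose time limit has passed."""
        expired = [ip for ip, until in self.enforcement.items() if until <= now]
        for ip in expired:
            del self.enforcement[ip]
            self.log.append({"t": now, "src": ip, "action": "rollback"})
        return expired


ctrl = MitigationController(critical_devices={"10.0.0.5"})
print(ctrl.handle("10.0.0.9", risk=0.95, now=0.0))
print(ctrl.handle("10.0.0.5", risk=0.95, now=0.0))
print(ctrl.expire(now=60.0))
```

Keeping the decision logic here and only the resulting per-address state in the fast path mirrors the split the abstract describes between user-space policy and kernel packet handling.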

2604.25018 2026-06-11 cs.ET cs.AI cs.DC cs.NI Updated version

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

Driss Choukri, Essaid Sabir, Elmahdi Driouch, Abdelkrim Haqiq

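AI summary: This survey reviews the concept, core components, architectural foundations, enabling technologies, and research challenges of the Internet of Everything (IoE), and discusses open research directions for intelligent 6G-oriented IoE systems, focusing on scalability, security, privacy, and energy efficiency.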

2510.18058 2026-06-11 cs.NI cs.DC Updated version

A New Broadcast Model for Several Network Topologies

Hongbo Lu, Junsung Hwang, Bernard Tenreiro, Nabila Jaman Tripti, Darren Hamilton, Yuefan Deng

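AI summary: The paper proposes the Broadcast by Balanced Saturation (BBS) algorithm, which uses tree-based pipelining to improve communication efficiency for large-message broadcast and outperforms existing algorithms on Mesh, Butterfly, Dragonfly, and Fat-Tree topologies.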

Comments: 30 pages, 7 figures

Abstract

We introduce Broadcast by Balanced Saturation (BBS), a general class of tree-based pipelined broadcast algorithms that optimizes communication efficiency across diverse network topologies, with a particular emphasis on large message sizes. By addressing spanning tree construction and communication task scheduling, two fundamental theoretical challenges in broadcasting, BBS offers a unified and flexible framework that operates effectively under varied network constraints. The algorithm maximizes aggregated throughput while simultaneously addressing topology constraints, synchronization overhead, bandwidth limitations and contention. Using SimGrid under standard assumptions, including full-duplex and one-port communication, various algorithms were evaluated on Mesh, Butterfly, Dragonfly, and Fat-Tree topologies. Results demonstrate that BBS consistently outperforms both general-purpose and topology-aware broadcast algorithms across a wide range of topologies and message sizes, establishing it as a robust and high-performance solution for large-scale systems.
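
Why pipelining helps for large messages can be seen from the textbook latency-bandwidth model, sketched below in Python. This is a generic model, not the BBS algorithm itself: a message of M bytes is cut into k segments and forwarded along a path of d hops, giving T(k) = (d + k - 1) * (alpha + M / (k * beta)), where alpha is the per-segment latency and beta the link bandwidth. All parameter values are hypothetical.

```python
import math


def pipelined_time(msg_bytes: float, segments: int, depth: int,
                   alpha_s: float, bandwidth_bytes_per_s: float) -> float:
    """Completion time of a pipelined broadcast along a path of `depth` hops."""
    segment_time = alpha_s + (msg_bytes / segments) / bandwidth_bytes_per_s
    # The last segment leaves after (segments - 1) steps and needs `depth` hops.
    return (depth + segments - 1) * segment_time


def best_segment_count(msg_bytes: float, depth: int,
                       alpha_s: float, bandwidth_bytes_per_s: float) -> int:
    """Integer k minimising T(k); the real-valued optimum is sqrt((d-1)M/(alpha*beta))."""
    k_real = math.sqrt((depth - 1) * msg_bytes / (alpha_s * bandwidth_bytes_per_s))
    # T(k) is convex in k, so the best integer is the floor or the ceiling.
    candidates = {max(1, math.floor(k_real)), max(1, math.ceil(k_real))}
    return min(
        candidates,
        key=lambda k: pipelined_time(msg_bytes, k, depth, alpha_s, bandwidth_bytes_per_s),
    )


# Hypothetical setting: 64 MiB message, 5 hops, 10 microsecond latency, 1 GB/s links.
M, D, ALPHA, BETA = 64 * 1024 * 1024, 5, 1e-5, 1e9
k = best_segment_count(M, D, ALPHA, BETA)
print(k, pipelined_time(M, k, D, ALPHA, BETA), pipelined_time(M, 1, D, ALPHA, BETA))
```

With one segment every hop must wait for the whole message, while with many segments the hops overlap; too many segments, however, pay the per-segment latency too often, hence the optimum in between.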

2602.13628 2026-06-11 cs.NI Updated version

Compact LLM Deployment and World Model Assisted Offloading in Mobile Edge Computing

Ruichen Zhang, Xiaofeng Luo, Jiayi He, Jiawen Kang, Zehui Xiong, Shiwen Mao

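AI summary: This paper studies compact large language model (LLM) deployment and world-model-assisted inference offloading in mobile edge computing networks. It proposes an edge compact LLM deployment (ECLD) framework that combines structured pruning, low-bit quantization, and knowledge distillation to build edge-deployable LLM variants, and evaluates them with four complementary metrics. Building on these compact models, it formulates an MEC offloading optimization problem that minimizes long-term average inference latency subject to device energy budgets and LLM-specific quality-of-service constraints. To handle unknown, time-varying network dynamics, it develops a world model-proximal policy optimization (PPO) algorithm that augments on-policy PPO with a learned recurrent world model, providing improved value targets and short imagination rollouts.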

Comments: 16 pages, 10 figures

Abstract

This paper investigates compact large language model (LLM) deployment and world-model-assisted inference offloading in mobile edge computing (MEC) networks. We first propose an edge compact LLM deployment (ECLD) framework that jointly applies structured pruning, low-bit quantization, and knowledge distillation to construct edge-deployable LLM variants, and we evaluate these models using four complementary metrics: accessibility, energy consumption, hallucination rate, and generalization accuracy. Building on the resulting compact models, we formulate an MEC offloading optimization problem that minimizes the long-term average inference latency subject to per-device energy budgets and LLM-specific quality-of-service constraints on effective accuracy and hallucination. To solve this problem under unknown and time-varying network dynamics, we develop a world model-proximal policy optimization (PPO) algorithm, which augments an on-policy PPO algorithm with a learned recurrent world model that provides improved value targets and short imagination rollouts. Extensive experiments on Llama-3.1-8B, Qwen3-8B, and Mistral-12B show that ECLD compresses base models by about 70-80% in storage (e.g., from 15.3 GB to 3.3 GB for Llama-3.1-8B) and reduces per-query energy consumption by up to 50%, while largely preserving accuracy and often lowering hallucination compared with quantization-only or pruning-only baselines. The experiments also show that world model-PPO speeds up convergence by about 50%, improves the final reward by 15.8% over vanilla PPO, and reduces average inference latency by 12-30% across different user populations, while satisfying the accuracy and hallucination constraints and approaching the generation quality of always-offloading with much of the efficiency of local execution.
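
The storage figures quoted for Llama-3.1-8B can be checked with a few lines of Python. The sketch only redoes the arithmetic on the two numbers given in the abstract (15.3 GB and 3.3 GB); the helper name is illustrative.

```python
def storage_reduction(original_gb: float, compressed_gb: float) -> float:
    """Fraction of storage saved by compression."""
    return 1.0 - compressed_gb / original_gb


reduction = storage_reduction(15.3, 3.3)
ratio = 15.3 / 3.3
print(f"{reduction:.1%} smaller, {ratio:.2f}x compression")
# prints: 78.4% smaller, 4.64x compression
```

The 78.4% reduction sits inside the "about 70-80%" range reported across the three base models.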

2509.23248 2026-06-11 cs.AI cs.NI Updated version

Resource-Aware LLM Reasoning for Mobile Edge General Intelligence

Mingyi Luo, Ruichen Zhang, Xiangwang Hou, Jun Du, Chunxiao Jiang, Yong Ren, Shiwen Mao

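AI summary: The paper proposes a joint optimization framework that uses adaptive chain-of-thought (CoT) prompting and a distributed mixture-of-experts (MoE) architecture to co-optimize reasoning depth, expert activation, and transmission power, enabling efficient LLM reasoning in resource-constrained mobile edge environments. It balances reasoning quality against resource efficiency: with less than 1 second of additional inference time, accuracy and latency satisfaction rate both reach 90%.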

Abstract

The rapid advancement of large language models (LLMs) has enabled the emergence of agentic artificial intelligence (AI) with powerful reasoning and autonomous decision-making capabilities. Its integration with edge computing has led to the development of Mobile Edge General Intelligence (MEGI), which brings real-time, privacy-preserving reasoning to the network edge. However, deploying LLM-based agentic AI reasoning in MEGI environments poses significant challenges due to the high computational demands of reasoning and the limited resources of edge devices. To address these challenges, we propose a joint optimization framework for efficient LLM reasoning deployment in MEGI. First, we systematically review enhancement methods to identify mechanisms suitable for edge adaptation. Subsequently, we present a distributed framework that synergizes reasoning enhancement via adaptive chain-of-thought (CoT) prompting with scalable deployment through a distributed mixture-of-experts (MoE) architecture. An important innovation of this approach is modeling reasoning depth as a dynamic network resource variable, which is optimized jointly with expert activation and transmission power. This mechanism allows the system to dynamically regulate expert networks and reasoning complexity according to task requirements and device capabilities. Experimental evaluations in mobile edge environments demonstrate that the proposed framework effectively balances reasoning quality and resource efficiency. The results show that with less than one second of additional inference time, both accuracy and latency satisfaction rate can reach 90%, validating the practical viability of deploying sophisticated LLM reasoning in resource-constrained MEGI systems.

2510.22397 2026-06-11 cs.NI cs.LG Updated version

NetBurst: Event-Centric Forecasting of Bursty, Intermittent Time Series

Satyandra Guthula, Jaber Daneshamooz, Charles Fleming, Kesheng Wu, Walter Willinger, Arpit Gupta

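AI summary: Targeting the "wild" statistical regime of network telemetry data, with rare bursts and long stretches of low activity, the paper proposes NetBurst, an event-centric pipeline that collapses low-activity periods and separates burst timing and magnitude streams to learn a unified representation. It significantly outperforms baselines such as Chronos-2 on forecasting error, burst-distribution matching, and anomaly describability.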

Abstract

Network operators monitor their infrastructure by collecting telemetry data such as packet counts, byte rates, or flow volumes, yet answering the questions that effective operations demand (forecasting future load, diagnosing and characterizing anomalies, and searching for and retrieving historical precedents) requires more than raw measurements. Bridging this gap calls for learned representations: compact per-entity summaries that capture temporal dynamics from each entity's univariate time series. Time-series foundation models are the natural starting point, but they are designed for dense, periodic benchmark datasets, the "mild" statistical regime. However, network telemetry data inhabits the "wild" regime: operationally relevant events are rare, separated by variable-length stretches of low or no activity ("ebbs"), with intermittent bursts of heavy-tailed extremes ("tides"). We present NetBurst, an event-centric pipeline that collapses ebbs, separates each time series into a stream of burst timings and a stream of burst magnitudes, and learns a single representation serving all three operational tasks. Compared to the strongest competitors among eight baselines, including Amazon's Chronos-2 and Datadog's Toto, and across nine production telemetry configurations, NetBurst reduces median forecasting error by 1.3-116× on wild-regime data with a 1.0-7.5× better match to the true burst distribution, and matches baselines on mild-regime benchmarks. For characterizing anomalies, NetBurst produces balanced, well-spread clusters that are 16× more describable in operator-familiar terms under a novel interpretability score, and cluster-filtered search delivers 7.5× faster end-to-end retrieval.
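
The first stage of the pipeline, collapsing ebbs and splitting a series into a timing stream and a magnitude stream, can be sketched in Python as follows. The fixed activity threshold, the gap-length encoding of timing, and the use of the summed value as burst magnitude are assumptions for illustration; the abstract does not specify how NetBurst defines them.

```python
from typing import List, Sequence, Tuple


def split_bursts(series: Sequence[float], threshold: float) -> Tuple[List[int], List[float]]:
    """Collapse quiet stretches and return (gaps, magnitudes).

    gaps[i] is the number of quiet samples before burst i;
    magnitudes[i] is the total volume of burst i.
    """
    gaps: List[int] = []
    magnitudes: List[float] = []
    quiet = 0         # length of the ebb seen so far
    in_burst = False  # whether the previous sample belonged to a burst
    for value in series:
        if value > threshold:
            if in_burst:
                magnitudes[-1] += value  # extend the current burst
            else:
                gaps.append(quiet)       # a new burst starts after `quiet` samples
                magnitudes.append(value)
                in_burst = True
            quiet = 0
        else:
            quiet += 1
            in_burst = False
    return gaps, magnitudes


# Hypothetical byte counts per interval.
telemetry = [0, 0, 5, 7, 0, 0, 0, 40, 0, 3]
print(split_bursts(telemetry, threshold=1))
# prints: ([2, 3, 1], [12, 40, 3])
```

Ten mostly empty samples shrink to three events, so a downstream model sees when bursts happen and how large they are without spending capacity on the quiet stretches.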

2502.09084 2026-06-11 cs.CR cs.LG cs.NI

Application of Tabular Transformer Architectures for Operating System Fingerprinting

Rubén Pérez-Jove, Cristian R. Munteanu, Alejandro Pazos, Jose Vázquez-Naya