2604.19275 2026-06-03 eess.SY cs.OS cs.RO cs.SY

Scheduling Analysis of UAV Flight Control Workloads on PREEMPT_RT Linux Using a Raspberry Pi 5

Luiz Giacomossi, Håkan Forsberg, Ivan Tomasic, Baran Çürüklü, Tommaso Cucinotta

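Affiliations: Mälardalen University; ReTiS Lab, Scuola Superiore Sant’Anna

AI summary: By analyzing how kernel activation paths in the PREEMPT_RT Linux kernel on a Raspberry Pi 5 affect a 250 Hz control loop, the paper finds that the standard kernel's worst-case latency exceeds 9 ms, whereas PREEMPT_RT cuts worst-case latency by about 88% to under 225 microseconds; the remaining jitter is driven mainly by hardware memory contention.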

Comments 9 pages, 8 figures, conference

Abstract

Modern UAV architectures increasingly aim to unify high-level autonomy and low-level flight control on a single General-Purpose Operating System (GPOS). However, complex multi-core System-on-Chips (SoCs) introduce significant timing indeterminism due to shared resource contention. This paper performs an architectural analysis of the PREEMPT_RT Linux kernel on a Raspberry Pi 5, specifically isolating the impact of kernel activation paths (deferred execution SoftIRQs versus real-time direct activation) on a 250 Hz control loop. Results show that under heavy stress, the standard kernel is unsuitable, exhibiting worst-case latencies exceeding 9 ms. In contrast, PREEMPT_RT reduced the worst-case latency by nearly 88 percent to under 225 microseconds, enforcing a direct wake-up path that mitigates OS noise. These findings demonstrate that while PREEMPT_RT resolves scheduling variance, the residual jitter on modern SoCs is primarily driven by hardware memory contention.
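To make the quantity in this abstract concrete, here is a minimal Python sketch of how wake-up latency of a 250 Hz periodic loop can be measured against absolute deadlines. It is an illustration only, not the paper's measurement harness: the function names `wakeup_latencies` and `run_loop` are hypothetical, and a real flight-control task would more plausibly be a C program under a real-time scheduling policy rather than a Python loop using `time.sleep`.

```python
import time

PERIOD_NS = 1_000_000_000 // 250  # 250 Hz control loop -> 4 ms period


def wakeup_latencies(start_ns, period_ns, wake_times_ns):
    """Return how late each wake-up was relative to its absolute deadline.

    Cycle k (1-based) is due at start_ns + k * period_ns.
    """
    return [
        wake - (start_ns + k * period_ns)
        for k, wake in enumerate(wake_times_ns, start=1)
    ]


def run_loop(cycles=250, period_ns=PERIOD_NS):
    """Run a periodic loop with absolute deadlines and record wake-up times.

    Hypothetical stand-in for a control task: it only sleeps and timestamps.
    """
    start_ns = time.monotonic_ns()
    wake_times = []
    for k in range(1, cycles + 1):
        deadline = start_ns + k * period_ns
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        wake_times.append(time.monotonic_ns())
    return start_ns, wake_times


if __name__ == "__main__":
    start, wakes = run_loop()
    lat = wakeup_latencies(start, PERIOD_NS, wakes)
    print(f"worst-case wake-up latency: {max(lat) / 1000:.1f} us")
```

Using absolute deadlines (rather than sleeping a fixed 4 ms each cycle) keeps one late wake-up from shifting every later deadline, so the reported worst case reflects scheduling delay rather than accumulated drift.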

2604.17220 2026-06-03 cs.MA cs.AI

Dynamics of Cognitive Heterogeneity: Investigating Behavioral Biases in Multi-Stage Supply Chains with LLM-Based Simulation

Jiuyun Jiang, Yuecheng Hong, Bo Yang, Jin Yang, Guangxin Jiang, Xiaomeng Guo, Guang Xiao

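Affiliations: Harbin Institute of Technology; The Hong Kong Polytechnic University

AI summary: The paper uses large language models to simulate multi-stage supply chains and, within a hierarchical reasoning framework, analyzes how cognitive heterogeneity affects agent interactions; it finds that information sharing can mitigate the system inefficiency caused by myopic and self-interested behavior.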

TRAP: Hijacking VLA CoT Reasoning via Adversarial Patches

Zhengxian Huang, Wenjun Zhu, Haoxuan Qiu, Xiaoyu Ji, Wenyuan Xu

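Affiliations: University of Science and Technology of China

AI summary: Proposes TRAP, an attack that uses adversarial patches to hijack the chain-of-thought reasoning of vision-language-action (VLA) models and steer them toward attacker-chosen target behaviors.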

Comments Accepted by ICML 2026

Core-Based Hierarchies for Efficient GraphRAG

Jakir Hossain, Ahmet Erdem Sarıyüce

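Affiliations: University at Buffalo

AI summary: To address the non-reproducibility of Leiden clustering in GraphRAG, the paper replaces it with k-core decomposition to build a deterministic, density-aware hierarchy, and designs lightweight heuristics that preserve connectivity while lowering LLM cost and improving answer comprehensiveness and diversity.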

Comments Accepted at the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

DOI: 10.1145/3770855.3818007

Abstract

Retrieval-Augmented Generation (RAG) enhances large language models by incorporating external knowledge. However, existing vector-based methods often fail on global sensemaking tasks that require reasoning across many documents. GraphRAG addresses this by organizing documents into a knowledge graph with hierarchical communities that can be recursively summarized. Current GraphRAG approaches rely on Leiden clustering for community detection, but we prove that on sparse knowledge graphs, where average degree is constant and most nodes have low degree, modularity optimization admits exponentially many near-optimal partitions, making Leiden-based communities inherently non-reproducible. To address this, we propose replacing Leiden with k-core decomposition, which yields a deterministic, density-aware hierarchy in linear time. We introduce a set of lightweight heuristics that leverage the k-core hierarchy to construct size-bounded, connectivity-preserving communities for retrieval and summarization, along with a token-budget-aware sampling strategy that reduces LLM costs. We evaluate our methods on real-world datasets including financial earnings transcripts, news articles, and podcasts, using three LLMs for answer generation and five independent LLM judges for head-to-head evaluation. Across datasets and models, our approach consistently improves answer comprehensiveness and diversity while reducing token usage, demonstrating that k-core-based GraphRAG is an effective and efficient framework for global sensemaking.
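As a rough illustration of the k-core decomposition this abstract refers to, the sketch below computes core numbers by repeatedly peeling the lowest-degree node. It is a generic textbook-style peeling routine, not the authors' code: the function name `core_numbers` and the adjacency-dict input are assumptions, and this heap-based version runs in O(m log n) rather than the linear time achievable with bucket-based implementations.

```python
import heapq


def core_numbers(adj):
    """Return the core number of every node in an undirected graph.

    adj maps each node to the set of its neighbours (no self-loops).
    Nodes must be mutually comparable so the heap can break degree ties.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    core = {}
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in core or d != degree[v]:
            continue  # stale entry: node already peeled or degree has dropped
        k = max(k, d)  # the core level never decreases while peeling
        core[v] = k
        for u in adj[v]:
            if u not in core:
                degree[u] -= 1
                heapq.heappush(heap, (degree[u], u))
    return core


# Triangle a-b-c with a pendant node d attached to a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(core_numbers(example))
```

Core numbers are a property of the graph alone, so the result does not depend on tie-breaking order or random seeds, which is the determinism the abstract contrasts with modularity-based clustering.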

2510.20372 2026-06-03 stat.ML cs.LG econ.EM math.ST stat.ME stat.TH

Testing Most Influential Sets

Lucas D. Konrad, Nikolas Kuschnig

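Affiliations: Vienna University of Economics and Business; Monash University

AI summary: To address the problem that a small fraction of data points can unduly influence model conclusions, the paper derives an exact influence formula for linear least squares, identifies the extreme value distributions of maximal influence, and proposes a hypothesis-testing framework for detecting excessive influence.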

Comments Published as a conference paper at ICLR 2026

Abstract

Small influential data subsets can dramatically impact model conclusions, with a few data points overturning key findings. While recent work identifies these most influential sets, there is no formal way to tell when maximum influence is excessive rather than expected under natural random sampling variation. We address this gap by developing a principled framework for most influential sets. Focusing on linear least-squares, we derive a convenient exact influence formula and identify the extreme value distributions of maximal influence - the heavy-tailed Fréchet for constant-size sets and heavy-tailed data, and the well-behaved Gumbel for growing sets or light tails. This allows us to conduct rigorous hypothesis tests for excessive influence. We demonstrate this through applications across economics, biology, and machine learning benchmarks, resolving contested findings and replacing ad-hoc heuristics with rigorous inference.
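The notion of a "most influential set" can be illustrated with a small brute-force sketch for simple linear regression: try every subset of size k, refit without it, and keep the subset that moves the slope the most. This is only an assumption-laden toy, not the paper's exact influence formula or its extreme-value test; the function names `slope` and `most_influential_set` are hypothetical, and exhaustive search is feasible only for tiny datasets.

```python
from itertools import combinations


def slope(points):
    """Ordinary least-squares slope of y on x (model with intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def most_influential_set(points, k):
    """Return (indices, influence) for the size-k subset whose removal
    changes the fitted slope the most, found by exhaustive search."""
    full = slope(points)
    best_idx, best_delta = None, -1.0
    for idx in combinations(range(len(points)), k):
        drop = set(idx)
        kept = [p for i, p in enumerate(points) if i not in drop]
        delta = abs(full - slope(kept))
        if delta > best_delta:
            best_idx, best_delta = idx, delta
    return best_idx, best_delta


# Six points on the line y = x plus one high-leverage outlier at index 6.
data = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (10, 0)]
indices, influence = most_influential_set(data, 1)
print(indices, influence)
```

The search returns the largest observed influence but says nothing about whether that value is unusual; judging that is the role of the hypothesis-testing framework described in the abstract.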

2603.01471 2026-06-03 cs.IR cs.LG

Reconstructing Content with Collaborative Attention for Universal Multimodal Representation Learning

Jiahan Chen, Da Li, Hengran Zhang, Yinqiong Cai, Lixin Su, Jiafeng Guo, Daiting Shi, Dawei Yin, Keping Bi

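Affiliations: State Key Laboratory of AI Safety; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Baidu Inc.

AI summary: Proposes CoCoA, a pre-training paradigm that restructures the attention flow and adds an EOS-based reconstruction task, using collaborative attention to optimize multimodal embeddings so that the model compresses the input's semantics into the <EOS> token, improving embedding quality.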

Abstract

Multimodal embedding models, rooted in multimodal large language models (MLLMs), have yielded significant performance improvements across diverse tasks such as retrieval and classification. However, most existing approaches rely heavily on large-scale contrastive learning, with limited exploration of how the architectural and training paradigms of MLLMs affect embedding quality. While effective for generation, the causal attention and next-token prediction paradigm of MLLMs does not explicitly encourage the formation of globally compact representations, limiting their effectiveness as multimodal embedding backbones. To address this, we propose CoCoA, a Content reconstruction pre-training paradigm based on Collaborative Attention for multimodal embedding optimization. Specifically, we restructure the attention flow and introduce an EOS-based reconstruction task, encouraging the model to reconstruct input from the corresponding <EOS> embeddings. This drives the multimodal model to compress the semantic information of the input into the <EOS> token, laying the foundations for subsequent contrastive learning. Extensive experiments on MMEB-V1 demonstrate that CoCoA built upon Qwen2-VL and Qwen2.5-VL significantly improves embedding quality. Results validate that content reconstruction serves as an effective strategy to maximize the value of existing data, enabling multimodal embedding models to generate compact and informative representations, raising their performance ceiling.

2602.20213 2026-06-03 cs.SE cs.AI cs.CR

Adapting Noise to Data: Generative Flows from One-Dimensional Processes

Jannis Chemseddine, Gregor Kornhardt, Richard Duong, Gabriele Steidl

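Affiliations: University of Cambridge

AI summary: Proposes a general framework that learns a data-adapted, parameterized prior distribution (the latent noise) through one-dimensional quantile functions, optimized using the Wasserstein distance between noise and data, to improve how well generative flow models learn heavy-tailed and similar distributions.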

Comments ICML 2026