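arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.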

2605.06727 2026-05-11 cs.LG cs.ET eess.IV

Medical Imaging Classification with Cold-Atom Reservoir Computing using Auto-Encoders and Surrogate-Driven Training

Nuno Batista, Ana Morgado, Oscar Ferraz, Sagar Silva Pratapsi, Jorge Lobo, Gabriel Falcao

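AI summary: This paper proposes a hybrid quantum-classical framework based on neutral-atom quantum reservoir computing for medical image classification, targeting the binary task of polyp detection. To handle high-dimensional image data, it introduces a guided auto-encoder that learns compact, discriminative image representations, and it uses a differentiable surrogate model to get around the non-differentiability of quantum measurements, enabling end-to-end training. In simulation, the method outperforms some traditional approaches based on PCA or unguided auto-encoders, and ablation studies indicate that it is robust and flexible for medical imaging applications in the current NISQ era.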

Comments 8 pages, 6 figures. Accepted to the 2025 IEEE International Conference on Quantum AI (IEEE QAI). Supported by FCT and the Open Quantum Institute (OQI)

DOI: 10.1109/QAI63978.2025.00064
Journal ref: 2025 IEEE International Conference on Quantum Artificial Intelligence (QAI)

Abstract

We introduce a hybrid quantum-classical pipeline, based on neutral-atom reservoir computing, for medical image classification, focusing on the binary classification task of polyp detection. To deal effectively with the high dimensionality, we integrate a guided auto-encoder. This pipeline learns compact and discriminative representations of image data that are also well-suited for quantum reservoir computing. A key challenge in such systems is the non-differentiable nature of quantum measurements, which creates a 'gradient barrier' for standard training. We overcome this barrier by incorporating a differentiable surrogate model that emulates the quantum layer, enabling end-to-end backpropagation through the entire system. This guided training process is jointly optimized for classification accuracy and for faithful image recovery from the auto-encoder. The learned latent representations are encoded as pulse detuning parameters within a Rydberg Hamiltonian, and quantum embeddings are subsequently obtained through expectation values. These embeddings are then passed to a linear classifier. Our simulations show that this method outperforms some traditional approaches that use PCA or unguided autoencoders. We also conduct ablation studies to assess the impact of various quantum and training parameters, demonstrating the robustness and flexibility of our proposed pipeline for real-world medical imaging applications, even in the current NISQ era.

2605.06726 2026-05-11 cs.LG

Transformer-Based Wildlife Species Classification from Daily Movement Trajectories

Obed Irakoze, Prasenjit Mitra

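AI summary: This paper studies how to identify wildlife species from daily movement trajectories alone, using a Transformer-based sequence model for classification. Compared with baselines such as LSTMs and CNNs, the Transformer achieves higher balanced accuracy on several species classification tasks, reaching a balanced accuracy of 0.83 and an AUC of 0.92 on the binary elephant task. Richer movement feature descriptors markedly improve performance, especially for data-scarce species, and a uniform 1-hour temporal resolution improves overall classification results.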

Comments 8 pages

2605.06724 2026-05-11 cs.LG cs.AI eess.SP

Enabling Unsupervised Training of Deep EEG Denoisers With Intelligent Partitioning

Qiyu Rao, Haozhe Tian, Homayoun Hamedmoghadam, Danilo Mandic

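AI summary: This paper studies how to train deep electroencephalography (EEG) denoising models without supervision. Because neural activity in wearable EEG is weak and overlaps spectrally with noise, it proposes iPSD, a self-supervised denoising method based on intelligent partitioning. The method needs no clean reference signal: it learns to partition an input EEG segment into independent noisy instances that share the same underlying signal, which then supervise the denoiser. In experiments, iPSD performs well at very low signal-to-noise ratios and under complex noise, significantly outperforming existing methods.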

2605.06723 2026-05-11 cs.AI cs.CL cs.LG

When Does a Language Model Commit? A Finite-Answer Theory of Pre-Verbalization Commitment

Long Zhang, Wei-neng Chen, Feng-feng Wei, Zi-bo Qin

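AI summary: This paper studies when a language model forms a stable answer preference before it produces its final answer, and proposes a finite-answer theory of preference stabilization. By projecting the model's continuation probabilities onto a finite answer set, it defines an exact log-odds measure and uses it to analyze key quantities such as answer onset time and retrospective stabilization time. Experiments show that, without greedy decoding or learned probes, the method detects preference stabilization before the answer becomes parseable, and that the signal correlates strongly with the model's final output, with good interpretability and transferability.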

2605.06720 2026-05-11 cs.LG cs.AI

Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion

Justin Sanders, Luca Giancardo, Lan Guo, Yue Zhao, Kemal Sonmez, Nina Cheng, Melih Yilmaz

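AI summary: This study addresses conditional generation of antibody sequences with a classifier-guided discrete diffusion model, aiming to overcome the limitations of existing methods in modeling biologically meaningful somatic variation and in flexible conditional generation. Its core contribution, germline absorbing diffusion, uses the germline sequence as the absorbing state of the diffusion process, so the model learns the trajectory from germline to mature antibody sequence and germline bias is reduced. The method raises non-germline residue prediction accuracy from 26 percent to 46 percent and significantly outperforms EvoProtGrad on conditional generation tasks.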

Comments 9 pages, 2 figures, 2 tables

Abstract

Antibody therapeutics are among the most successful modern medicines, yet computationally designing antibodies with desirable binding and developability properties remains challenging. While protein language models (pLMs) have emerged as powerful tools for antibody sequence design, existing approaches largely suffer from two key limitations: they predominantly memorize germline sequences rather than modeling biologically meaningful somatic variation, and they offer limited support for flexible classifier-guided conditional generation. We address these challenges through two primary contributions. First, we demonstrate that discrete diffusion fine-tuning achieves strong language modeling performance on antibody sequences while allowing for generation conditioned on any off-the-shelf classifier. Second, we introduce germline absorbing diffusion, a novel modification of the discrete diffusion noise process in which the germline sequence - rather than a masked sequence - serves as the absorbing state. This biologically motivated inductive bias restricts the model to learning the trajectory from germline to observed sequence, effectively excluding genetic variation and V(D)J recombination statistics from the learned distribution and dramatically mitigating germline bias. We show that germline diffusion improves non-germline residue prediction accuracy from 26 percent to 46 percent, approaching the theoretical upper bound set by true biological variability. We then demonstrate the utility of our germline diffusion model on the conditional generation tasks of sampling antibodies with improved hydrophobicity and predicted binding affinity. On both tasks our model shows an improved tradeoff between class adherence and sample quality, significantly outperforming EvoProtGrad, a popular strategy to sample from pLMs with gradient-based discrete Markov Chain Monte Carlo.

2605.06716 2026-05-11 cs.AI cs.CL

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

Jinghao Luo, Yuchen Tian, Chuxue Cao, Ziyang Luo, Hongzhan Lin, Kaixin Li, Chuyi Kong, Ruichao Yang, Jing Ma

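AI summary: This survey traces the evolution of memory mechanisms in agents based on large language models (LLMs). It proposes an evolutionary framework that divides this development into three stages (storage, reflection, and experience) and identifies three core drivers behind the evolution. It also examines two key mechanisms of the experience stage, active exploration and cross-trajectory abstraction, offering a theoretical basis and a development path for next-generation LLM agents.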

Comments Accepted by ACL 2026 Findings

2605.06714 2026-05-11 cs.CV cs.AI

Edge Deep Learning in Computer Vision and Medical Diagnostics: A Comprehensive Survey

Yiwen Xu, Tariq M. Khan, Yang Song, Erik Meijering

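AI summary: This survey reviews recent progress in edge deep learning for computer vision and medical diagnostics, covering its foundational principles, technical advantages, and practical applications. It proposes a categorisation of edge hardware platforms by performance and usage scenario, and summarizes key techniques for deploying deep neural networks efficiently on edge devices, such as lightweight design and model compression. Drawing on practical applications, it illustrates the impact of edge-deployed models in real-life settings and outlines future research directions and obstacles to adoption.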

DOI: 10.1007/s10462-024-11033-5
Journal ref: Artificial Intelligence Review, Volume 58, Article number 93 (2025)

Abstract

Edge deep learning, a paradigm change reconciling edge computing and deep learning, facilitates real-time decision making attuned to environmental factors through the close integration of computational resources and data sources. Here we provide a comprehensive review of the current state of the art in edge deep learning, focusing on computer vision applications, in particular medical diagnostics. An overview of the foundational principles and technical advantages of edge deep learning is presented, emphasising the capacity of this technology to revolutionise a wide range of domains. Furthermore, we present a novel categorisation of edge hardware platforms based on performance and usage scenarios, facilitating platform selection and operational effectiveness. Following this, we dive into approaches to effectively implement deep neural networks on edge devices, encompassing methods such as lightweight design and model compression. Reviewing practical applications in the fields of computer vision in general and medical diagnostics in particular, we demonstrate the profound impact edge-deployed deep learning models can have in real-life situations. Finally, we provide an analysis of potential future directions and obstacles to the adoption of edge deep learning, with the intention to stimulate further investigations and advancements of intelligent edge deep learning solutions. This survey provides researchers and practitioners with a comprehensive reference shedding light on the critical role deep learning plays in the advancement of edge computing applications.

2605.06708 2026-05-11 cs.CV cs.AI

Visual Text Compression as Measure Transport

Lv Tang, Tianyi Zheng, Yang Liu, Bo Li, Xingyu Li

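AI summary: This study examines the efficiency and effectiveness of visual text compression (VTC) on long texts, and proposes a measure-transport framework for quantifying the task-relevant information lost through visual encoding. Treating text and visual tokens as empirical probability measures, it analyzes the pushforward map induced by the ViT encoder and a transport cost that decomposes into precision and coverage costs. On that basis it derives a routing criterion that needs no downstream labels and a focusing mechanism, which improve performance on several natural language processing tasks.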

2605.06702 2026-05-11 cs.AI cs.CL cs.LG

CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment

Siyuan Guo, Yali Du, Hechang Chen, Yi Chang, Jun Wang

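AI summary: This paper presents CASCADE, a framework for continual adaptation of large language models during deployment, aimed at the stalled learning that results from separating training from deployment. CASCADE maintains a dynamic case memory so that the model can keep learning from deployment experience without parameter updates. It formulates experience reuse as a contextual bandit problem to balance exploration and exploitation, and it significantly improves performance across several tasks.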

2605.06696 2026-05-11 cs.AI cs.LG cs.MA

Hidden Coalitions in Multi-Agent AI: A Spectral Diagnostic from Internal Representations

Cameron Berg, Susan L. Schneider, Mark M. Bailey

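AI summary: This study addresses hidden coalition structure in multi-agent systems and proposes a spectral method, based on internal neural representations, for detecting latent information coupling and coalitions among agents. The method builds a mutual-information graph over the agents' hidden states and applies spectral clustering to find the most salient coalition boundary, separating genuine information coupling from spurious similarity caused by behavioral coordination. In experiments on both reinforcement learning agents and large language models, it recovers the hidden coalition structure, offering a tool for monitoring emergent organization in distributed AI systems.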

Comments 18 pages

2605.06690 2026-05-11 cs.AI cs.CL cs.LG

State Representation and Termination for Recursive Reasoning Systems

Debashis Guha, Amritendu Mukherjee, Sanjay Kukreja, Tarun Kumar

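AI summary: This paper studies state representation and termination conditions for recursive reasoning systems. It proposes a representation based on a knowledge-state graph that encodes claims, evidential relations, open questions, and confidence weights. Defining the "order gap" as a measure of how much iteration order affects the reasoning outcome, it gives a condition for when iteration should stop and proves that the condition is non-degenerate near fixed points, providing a theoretical basis for termination. The approach applies to agent loops, tree-of-thought reasoning, theorem proving, and continual learning.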

2605.06686 2026-05-11 cs.LG econ.EM stat.AP stat.ML

Robustness of Refugee-Matching Gains to Off-Policy Evaluation Choices

Kirk Bansak, Elisabeth Paulson, Dominik Rothenhäusler, Jeremy Ferwerda, Jens Hainmueller, Michael Hotard

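AI summary: This paper examines how robust counterfactual impact estimates for refugee matching policy in the United States are to the choice of off-policy evaluation method. Applying several estimators, including inverse probability weighting (IPW) and augmented inverse probability weighting (AIPW), across different model structures and assignment procedures, it finds that the impact estimates are of consistent magnitude whichever method is used and are statistically significant in most cases. The results also agree closely with the original findings of Bansak et al. (2018).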

Comments 13 pages, 2 figures, 10 tables
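
For readers unfamiliar with the two estimators named in this entry, here is a minimal sketch of the textbook IPW and AIPW value estimates for a target policy evaluated on logged data. It is not the authors' implementation: the record layout, the function names, the toy data, and the constant outcome model are all assumptions made for illustration.

```python
def ipw_value(records, policy):
    """Inverse probability weighting (IPW) estimate of a policy's mean outcome.

    Each record is (context, logged_action, outcome, propensity), where
    propensity is the probability with which the logging policy chose
    logged_action in that context.
    """
    total = 0.0
    for context, action, outcome, propensity in records:
        if policy(context) == action:  # keep only cases the target policy agrees with
            total += outcome / propensity
    return total / len(records)


def aipw_value(records, policy, outcome_model):
    """Augmented IPW: outcome-model prediction plus an IPW-weighted residual."""
    total = 0.0
    for context, action, outcome, propensity in records:
        target = policy(context)
        total += outcome_model(context, target)
        if target == action:
            total += (outcome - outcome_model(context, action)) / propensity
    return total / len(records)


# Hypothetical logged data: two case types, two locations, uniform random assignment.
records = [
    ("u", "A", 1.0, 0.5),
    ("u", "B", 0.0, 0.5),
    ("v", "A", 0.0, 0.5),
    ("v", "B", 1.0, 0.5),
]
target_policy = lambda context: "A" if context == "u" else "B"
constant_model = lambda context, action: 0.5  # deliberately crude outcome model

ipw = ipw_value(records, target_policy)
aipw = aipw_value(records, target_policy, constant_model)
```

On this toy data both estimates equal 1.0, the true mean outcome under the target policy. AIPW still recovers it with a crude outcome model because the logging propensities are correct, which is the usual motivation for the augmented estimator.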

2605.06685 2026-05-11 cs.SD eess.AS stat.AP

An audio-to-analysis pipeline with certified transcription for information-theoretic profiling of the piano repertoire

Fred Jalbert-Desforges

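AI summary: This paper presents a pipeline that derives composer-level information-theoretic profiles directly from audio. A certified transcription layer (F1 of 0.9791 on the MAESTRO dataset) extracts harmonic scale distributions, which are then analyzed with Shannon entropy, asymmetric KL divergence, and Zipf models. The study yields an interpretable ranking of composers by harmonic predictability, reproduces known stylistic lineages, and separates modern minimalist composers from historical composers by marked differences in their harmonic transition distributions.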

Comments 25 pages, 4 figures, 25 references
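
The two information-theoretic quantities named in this entry can be sketched as follows. This is an illustrative sketch only: the symbol sequences are made up, the smoothing constant `eps` is an assumption, and nothing here reproduces the paper's transcription layer or its feature extraction.

```python
import math
from collections import Counter


def distribution(symbols):
    """Turn a sequence of symbols into a relative-frequency dictionary."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {symbol: count / total for symbol, count in counts.items()}


def shannon_entropy(p):
    """Shannon entropy in bits: H(p) = -sum_x p(x) * log2 p(x)."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits; eps stands in for symbols that q never produces."""
    return sum(
        px * math.log2(px / max(q.get(symbol, 0.0), eps))
        for symbol, px in p.items()
        if px > 0
    )


# Hypothetical harmonic label sequences (not data from the paper).
composer_a = ["I", "V", "I", "IV", "V", "I", "I", "V"]
composer_b = ["I", "ii", "V", "vi", "IV", "iii", "V", "I"]
p, q = distribution(composer_a), distribution(composer_b)

entropy_a = shannon_entropy(p)
entropy_b = shannon_entropy(q)
a_given_b = kl_divergence(p, q)
b_given_a = kl_divergence(q, p)
```

Lower entropy corresponds to a more predictable vocabulary, and the two KL values differ because the divergence is asymmetric: here `composer_b` uses labels that `composer_a` never does, so `b_given_a` is much larger than `a_given_b`.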

2605.06684 2026-05-11 cs.LG

From Canopy to Collision: A Hybrid Predictive Framework for Identifying Risk Factors in Tree-Involved Traffic Crashes

Abdul Azim, Ahmed Hossain, Soumyadip Maitra, Panick Kalambay

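AI summary: This study proposes a hybrid predictive framework for identifying and quantifying the risk factors that affect crash severity in tree-involved traffic crashes. Using a crash-report database covering 2020 to 2023, it combines classification models, SHAP explanations, and logistic regression to analyze factors such as seat belt use, vehicle age, speeding, and driver condition, and to reveal key interactions among them. Not wearing a seat belt is the leading factor in severe outcomes, while vehicle age, speeding, and driver condition also significantly affect severity, which can inform targeted safety interventions.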

Comments 30 pages, 10 figures

2605.06683 2026-05-11 cs.LG cs.AI cs.CL

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

Benjamin L. Badger, Ethan Roland

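AI summary: This paper proposes the Toeplitz MLP Mixer (TMM), a sequence model architecture intended to address the computational complexity limits of Transformer-based large language models. TMM replaces attention with triangular-masked Toeplitz matrix multiplication, which lowers training time and space complexity and also makes inference more efficient. In experiments, TMM retains information and learns in context better than existing sub-quadratic models. An analysis from the perspective of operator index theory further suggests that the Toeplitz layers of its causal, non-invertible models are more likely to be invertible.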
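
To make "triangular-masked Toeplitz matrix multiplication" concrete, here is a minimal sketch of causal token mixing with a lower-triangular Toeplitz matrix. It is an assumption-laden illustration, not the TMM architecture: the function names, the single weight vector, and the toy sequence are hypothetical, and the dense product below does not show any of the efficiency gains the entry mentions.

```python
def causal_toeplitz(weights):
    """Build a lower-triangular Toeplitz matrix T with T[i][j] = weights[i - j].

    Entries above the diagonal are zero, so position i only sees positions j <= i.
    """
    n = len(weights)
    return [[weights[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)]


def mix_tokens(weights, sequence):
    """Mix along the sequence axis; `sequence` is a list of feature vectors."""
    matrix = causal_toeplitz(weights)
    length, dim = len(sequence), len(sequence[0])
    return [
        [sum(matrix[i][j] * sequence[j][d] for j in range(length)) for d in range(dim)]
        for i in range(length)
    ]


# Hypothetical weights and a three-token sequence with one feature per token.
weights = [1.0, 0.5, 0.25]
mixed = mix_tokens(weights, [[1.0], [2.0], [4.0]])

# Changing the last token must leave earlier outputs untouched (causality).
perturbed = mix_tokens(weights, [[1.0], [2.0], [100.0]])
```

Because every diagonal of the matrix is constant, the layer is parameterised by one weight per relative offset rather than one per position pair. Multiplying by a Toeplitz matrix is a convolution, and FFT-based convolution is the standard way to compute such products in sub-quadratic time; whether the paper does so is not stated in this entry.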

2605.06682 2026-05-11 cs.AI cs.CY

Fast and Effective Redistricting Optimization via Composite-Move Tabu Search

Hai Jin, Diansheng Guo

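AI summary: This paper proposes CM-Tabu, a tabu search algorithm based on composite moves, for optimization in spatial redistricting. It systematically expands the feasible neighborhood while preserving district contiguity, improving both search efficiency and solution quality. Compared with conventional tabu search and other baselines, CM-Tabu shows clear gains in solution quality, robustness, and computational efficiency; in a Philadelphia case study it consistently reaches the theoretical optimum for population balance and supports multi-objective trade-offs.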

2605.06680 2026-05-11 cs.LG cs.CV physics.flu-dyn

On the Role of Strain and Vorticity in Numerical Integration Error for Flow Matching

Chenxi Tao, Seung-Kyum Choi

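AI summary: This paper studies how the strain and vorticity of the velocity field affect numerical integration error in flow matching. Decomposing the velocity-field Jacobian into its symmetric part (the strain rate) and antisymmetric part (the vorticity), it shows that the two play different roles in error propagation: strain governs exponential error amplification, whereas vorticity contributes only linearly to the local truncation error. Based on this, it proposes a weighted Jacobian regularization, which in experiments reduces integration error and improves generation quality.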

Comments 16 pages, 7 figures. Preliminary version. Includes qualitative CIFAR-10 comparison and supporting synthetic experiments
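
The decomposition this entry refers to is the standard split of a square matrix into symmetric and antisymmetric parts, J = S + W with S = (J + Jᵀ)/2 and W = (J − Jᵀ)/2. A minimal sketch follows; the example Jacobian and function names are hypothetical, and the paper's regularizer is not reproduced here.

```python
def transpose(matrix):
    """Return the transpose of a square matrix given as nested lists."""
    return [list(row) for row in zip(*matrix)]


def split_jacobian(jacobian):
    """Split J into (strain, vorticity): symmetric and antisymmetric parts."""
    jt = transpose(jacobian)
    n = len(jacobian)
    strain = [[0.5 * (jacobian[i][j] + jt[i][j]) for j in range(n)] for i in range(n)]
    vorticity = [[0.5 * (jacobian[i][j] - jt[i][j]) for j in range(n)] for i in range(n)]
    return strain, vorticity


# Hypothetical 2x2 velocity-field Jacobian at one point.
jacobian = [[1.0, 2.0], [0.0, -1.0]]
strain, vorticity = split_jacobian(jacobian)
```

For this input the strain part is `[[1.0, 1.0], [1.0, -1.0]]` and the vorticity part is `[[0.0, 1.0], [-1.0, 0.0]]`. The strain part has real eigenvalues that stretch or compress nearby trajectories, while the vorticity part only rotates them, which is consistent with the different roles the entry attributes to each.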

2605.06679 2026-05-11 cs.LG

Breaking the Illusion: When Positive Meets Negative in Multimodal Decoding

Yubo Jiang, Yitong An, Xin Yang, Abudukelimu Wuerkaixi, Xuxin Cheng, Fengying Xie, Zhiguo Jiang, Cao Liu, Ke Zeng, Haopeng Zhang

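AI summary: Vision-language models (VLMs) often hallucinate content inconsistent with the image because they over-rely on language priors. This paper proposes positive-negative decoding (PND), a training-free inference framework that adds a positive path and a negative path during decoding: the former strengthens visual evidence and the latter constructs counterfactuals to suppress language-dominated generation, improving visual faithfulness. PND achieves state-of-the-art results on several benchmarks.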

Comments Accepted by CVPR 2026 (Conference on Computer Vision and Pattern Recognition). 11 pages, 5 figures. Code available at: https://github.com/JiangYubo4399/PND

2605.06678 2026-05-11 cs.LG q-fin.RM stat.AP

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

Antoine Heranval, Olivier Lopez, Didier Ngatcha, Daniel Nkameni

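AI summary: This paper proposes SwiGAN, a Wasserstein GAN-based climate scenario generator that produces spatio-temporal trajectories of future climate indices to support risk management and insurance strategy. It focuses on the Soil Wetness Index (SWI), a key indicator used in France to assess drought severity, and simulates its possible evolution up to 2050 to help understand drought dynamics under climate change. Beyond informing adaptive risk strategies, the model can be extended to other climate-related risks and actuarial applications.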

2605.06676 2026-05-11 cs.LG cs.CL

LKV: End-to-End Learning of Head-wise Budgets and Token Selection for LLM KV Cache Eviction

Enshuai Zhou, Yifan Hao, Chao Wang, Rui Zhang, Di Huang, Jiaming Guo, Xing Hu, Zidong Du, Qi Guo, Yunji Chen

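AI summary: In long-context inference, large language models (LLMs) are bottlenecked by the linear memory growth of the key-value (KV) cache. Existing methods mostly compress the cache with heuristics that are hard to align with the task objective. This paper proposes LKV, which formulates KV cache compression as an end-to-end differentiable optimization problem and learns task-driven global budget allocation and key-value importance scoring, improving both compression efficiency and inference quality. LKV achieves leading compression results on several benchmarks and stays close to lossless while retaining only 15% of the KV cache.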

2605.06675 2026-05-11 cs.LG cs.CL cs.IT math.IT

RateQuant: Optimal Mixed-Precision KV Cache Quantization via Rate-Distortion Theory

Fei Zuo, Zikang Zhou, Hao Cong, Xiaoyan Xi, Ho Fai Leung

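AI summary: During generation, large language models must cache every computed key-value (KV) pair, so KV cache memory grows quickly with sequence length and becomes the main bottleneck at serving time. Existing methods usually quantize all attention heads at a fixed precision, ignoring differences in importance across heads. This paper proposes RateQuant, which draws on rate-distortion theory: it calibrates a distortion model for each quantizer and uses a reverse water-filling algorithm to obtain an optimal mixed-precision allocation. The method improves quantization quality and significantly lowers perplexity, with efficient calibration and no extra overhead at inference time.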

Comments 18 pages, 7 figures, 5 tables
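
As background for the reverse water-filling step named in this entry, here is a sketch of the textbook version for independent Gaussian components under a total squared-error budget. It is not RateQuant itself: the Gaussian rate formula, the bisection search, the function name, and the example variances are assumptions, and the paper's calibrated distortion models may lead to a different allocation rule.

```python
import math


def reverse_water_filling(variances, total_distortion, iters=100):
    """Textbook reverse water-filling for independent Gaussian components.

    Finds a water level theta with sum(min(theta, var_i)) == total_distortion,
    then assigns 0.5 * log2(var_i / D_i) bits to each component, where
    D_i = min(theta, var_i). All variances are assumed to be positive.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")
    lo, hi = 0.0, max(variances)
    for _ in range(iters):  # bisection on the water level
        theta = 0.5 * (lo + hi)
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    theta = 0.5 * (lo + hi)
    distortions = [min(theta, v) for v in variances]
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, distortions)]
    return theta, distortions, rates


# Hypothetical per-head variances (illustrative values, not from the paper).
theta, distortions, rates = reverse_water_filling([4.0, 1.0, 0.25], total_distortion=1.0)
```

With these inputs the water level settles at 0.375: the two high-variance components are each allowed a distortion of 0.375 and receive positive rates, while the component with variance 0.25 lies below the water level and receives zero bits. This captures the intuition behind mixed precision, where components that matter less get fewer bits.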

2605.06673 2026-05-11 cs.CL cs.AI cs.LG

Domain-level metacognitive monitoring in frontier LLMs: A 33-model atlas

Jon-Paul Cacioli

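AI summary: This study analyzes the metacognitive monitoring ability of 33 frontier large language models across the domains of the MMLU benchmark and finds that, even when models do well in aggregate, their monitoring ability differs significantly from domain to domain. Computing the Type-2 AUROC for every model-domain pair, it shows that applied and professional knowledge domains are the easiest to monitor and that formal reasoning and natural science domains are the hardest, and that monitoring profiles cluster significantly within model families. Because aggregate metrics can mask these per-domain differences, the study stresses domain-level evaluation before deployment.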

Comments 25 pages, 7 figures, 1 supplementary table. Code and data: https://github.com/synthiumjp/metacognitive-profile-atlas

2605.06672 2026-05-11 cs.AI cs.CL cs.LG

More Thinking, More Bias: Length-Driven Position Bias in Reasoning Models

Xiao Wang

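AI summary: This study examines position bias in reasoning models on multiple-choice question answering and finds that, even in reasoning-capable models, position bias grows with the length of the reasoning trajectory. Experiments across multiple models and datasets show that the longer the trajectory, the more likely the model is to favor particular option positions, and the effect appears across model scales. The study also shows that direct-answer position bias is unrelated to trajectory length, whereas reasoning introduces a separate bias that accumulates with length, and it offers a diagnostic toolkit for detecting and auditing position bias.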

Abstract

Chain-of-thought (CoT) reasoning and reasoning-tuned models such as DeepSeek-R1 are commonly assumed to reduce shallow heuristic biases by thinking carefully. We test this on position bias in multiple-choice QA and find a different story: within any reasoning-capable model, per-question position bias scales with the length of the reasoning trajectory. Across thirteen reasoning-mode configurations (two R1-distilled 7-8B models, two base models prompted with CoT, and DeepSeek-R1 at 671B) on MMLU, ARC-Challenge, and GPQA, twelve show a positive partial correlation between trajectory length and Position Bias Score (PBS) after controlling for accuracy, ranging from 0.11 to 0.41 (all p < 0.05). All twelve open-weight reasoning-mode configurations show monotonically increasing PBS across length quartiles. A truncation intervention provides causal evidence: continuations resumed from later points in the trajectory are increasingly likely to shift toward position-preferred options (16% to 32% for R1-Qwen-7B across absolute-position buckets). At 671B, aggregate PBS collapses to 0.019, but the length effect still manifests in the longest quartile (PBS = 0.071), suggesting that accuracy gates the expression of length-driven bias rather than eliminating the underlying mechanism. We additionally find that direct-answer position bias is a distinct phenomenon with a different footprint (strong in Llama-Instruct-direct, weak in Qwen-Instruct-direct, and uncorrelated with trajectory length): CoT reasoning replaces this baseline bias with length-accumulated bias. Our results argue that reasoning-capable models should not be treated as order-robust by default in MCQ evaluation pipelines, and offer a diagnostic toolkit (PBS, commitment change point, effective switching, truncation probes) for auditing position bias in reasoning models.

2605.06671 2026-05-11 cs.AI cs.MA

GraphDC: A Divide-and-Conquer Multi-Agent System for Scalable Graph Algorithm Reasoning

Wenjin Li, Jiaming Cui

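AI summary: This paper proposes GraphDC, a divide-and-conquer multi-agent system for graph algorithm reasoning at scale. It decomposes the input graph into subgraphs, has specialized agents reason locally over each one, and has a master agent integrate the results, which lowers the reasoning load on any single agent and improves computational efficiency and robustness. In experiments, GraphDC outperforms existing methods on a range of graph algorithm tasks, with the largest gains on large graph instances.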

2605.06623 2026-05-11 cs.AI cs.CL cs.LG cs.MA

MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems

Zhexuan Wang, Xuebo Liu, Li Wang, Zifei Shan, Yutong Wang, Zhenxi Song, Min Zhang

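AI summary: This paper proposes MASPO, a framework that jointly optimizes the role prompts in LLM-based multi-agent systems to improve overall collaborative performance. MASPO introduces a joint evaluation mechanism that optimizes each agent's local prompt against the global system objective, avoiding the mismatch between local and global objectives found in conventional methods. It also uses a data-driven evolutionary beam search to explore the high-dimensional prompt space efficiently; in experiments it outperforms existing methods on several tasks, raising average accuracy by 2.9 percentage points.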

Comments Accepted at ICML 2026

2605.06435 2026-05-11 cs.CL cs.AI cs.LG

COVID-19 Infodemic. Understanding content features in detecting fake news using a machine learning approach

Vimala Balakrishnan, Lee Zing Hii, Eric Laporte

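AI summary: This paper studies fake news detection using textual and linguistic features with traditional machine learning, experimenting with features such as word bigrams and part-of-speech distributions. On a dataset collected during the COVID-19 pandemic, random forests and support vector machines perform best; using textual or linguistic features alone improves detection, but combining them yields no significant further gain. The study shows that traditional machine learning can exploit textual and linguistic features for fake news detection without relying on deep learning.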

2605.06298 2026-05-11 cs.CV cs.AI

Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement

Roussel Desmond Nzoyem, Mauro Comi

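AI summary: This paper proposes NOVA, a world-model framework that represents the system state as the weights and biases of an auxiliary coordinate-based implicit neural representation (INR), addressing the reliance of conventional world models on complex decoders and uninterpretable latent spaces. By rendering this structured representation analytically, it removes the decoding bottleneck and makes the model compact, portable, and capable of zero-shot super-resolution. In experiments, NOVA runs efficiently on a single consumer GPU, produces controllable future predictions, and disentangles structural components of a scene such as background, foreground, and motion, enabling content and dynamics to be edited independently.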

Comments 35 pages, 30 figures, 8 tables

2605.06230 2026-05-11 cs.AI cs.DC

Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence

Xinquan Chen, Zhenyun Yin, Shan He, Bin Huang, Shanzhe Lei, Pengcheng Shi, Kun Cai, Bei Chen, Bangwei Liu, Zeyu Kang, Chao Huang, Yang Zhang, Wenjie Li, Ruijun Ge, Yajie Wang, Tianshun Fang, Tianyang Xu, Yiwen Cong, Meng Jin, Gaolei Li, Xuansheng Wu, Linhan Liu, Zijing He, An Li, Yan Teng, Xin Tan, Dongrui Liu, Jing Shao, ChaoChao Lu, Ji He, Jie Li, Chunfeng Song, Jinya Xu, Fan Song, Shujie Wang, Jianmin Qian, Jie Hou, Xuhong Wang, Yingchun Wang, Hui Wang, Xia Hu

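AI summary: As large models evolve from conversational assistants into autonomous agents, the challenges of long-horizon decision-making, tool use, and interaction with real environments are becoming more pronounced. Existing agent infrastructure is fragmented across evaluation, data management, and agent evolution, which makes it hard to discover risks systematically and to close the loop on continual improvement. This paper presents Safactory, a scalable agent-factory framework that integrates three platforms (trajectory generation, trustworthy data management, and autonomous evolution) into a unified evolution pipeline, as infrastructure for the next generation of trustworthy autonomous agents.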

Comments 50 pages, 21 figures

2605.06175 2026-05-11 cs.RO

VLA-GSE: Boosting Parameter-Efficient Fine-Tuning in VLA with Generalized and Specialized Experts

Yuhua Jiang, Junjie Lu, Xinyao Qin, Xiaoyu Chen, Kaixin Wang, Feifei Gao, Li Zhao

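AI summary: This paper proposes VLA-GSE, a parameter-efficient fine-tuning framework for vision-language-action (VLA) models, aimed at the difficulty of adapting pretrained VLA models to robot control tasks. It spectrally decomposes the frozen backbone, assigning the dominant singular components to generalized (shared) experts and the discrete residual components to specialized (routed) experts, which improves adaptability under a fixed trainable-parameter budget. In experiments, VLA-GSE preserves pretrained knowledge while updating only 2.51% of the full model's parameters, outperforms both full fine-tuning and existing PEFT methods, and shows strong zero-shot control and multimodal understanding on several benchmarks.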

2605.06169 2026-05-11 cs.LG cs.CV

Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers

Pengqi Lu

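AI summary: This paper studies a structural fragility that appears when diffusion transformers (DiTs) are scaled to hundreds of layers: the network can fall into a mean-dominated collapse in which feature representations converge and variation is suppressed. Through mechanistic analysis, the author identifies the trigger for this collapse, termed "Mean Mode Screaming" (MMS), and proposes Mean-Variance Split (MV-Split) Residuals, which separate the mean and variance gradient updates to prevent collapse in deep networks. The method is validated by stable training of a 1000-layer DiT.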

Comments 43 pages (9-page main paper + appendix)