arXivDaily: daily arXiv digest, updated Monday through Friday
2309.15769 2026-06-19 math.ST cs.LG stat.ME stat.TH updated version

Benign overfitting beyond prediction: The ordinary least squares interpolator

Dennis Shen, Dogyoon Song, Peng Ding, Jasjeet S. Sekhon

Affiliations: Department of Data Sciences & Operations, University of Southern California; Department of Statistics, University of California, Davis; Department of Statistics, University of California, Berkeley; Google DeepMind

AI summary: Studies parameter estimation and inference for the minimum ℓ2-norm OLS interpolator in overparameterized linear models, deriving overparameterized analogues of the leave-k-out formulas, the omitted variable bias formula, and the Frisch-Waugh-Lovell theorem, and extending the Gauss-Markov theorem.

Comments: This work is accepted for publication in Biometrika

Abstract

Recent advances in deep learning have highlighted the phenomenon of benign overfitting in overparameterized statistical models, sparking significant interest in understanding its foundations. Owing to its simplicity and practical relevance, the ordinary least squares (OLS) interpolator has become a key object of study for gaining theoretical insight into this phenomenon. While the properties of OLS are well understood in classical underparameterized settings, its behavior in the overparameterized regime -- unlike that of ridge regression or the lasso -- remains comparatively less explored. We contribute to this growing literature by deriving new algebraic and statistical results for the minimum $\ell_2$-norm OLS interpolator. In contrast to much of the existing work, which focuses on prediction risk, we center our analysis on parameter estimation and inference, which are fundamental for many statistics and causal inference applications. Specifically, we establish overparameterized analogues of (i) the leave-$k$-out formulas, (ii) the omitted variable bias formula, and (iii) the Frisch-Waugh-Lovell theorem. Under the Gauss-Markov model, we further extend the Gauss-Markov theorem and analyze variance estimation under homoskedasticity in the overparameterized setting. Collectively, these results provide a systematic framework for studying parameter estimation and inference in overparameterized linear models, offering a novel perspective on benign overfitting beyond its implications for prediction.
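
The minimum $\ell_2$-norm OLS interpolator mentioned above can be illustrated with a small NumPy sketch. It assumes a random Gaussian design with more features than samples (the sizes, seed, and variable names are illustrative and not taken from the paper) and computes the estimator through the Moore-Penrose pseudoinverse.

```python
import numpy as np

def min_norm_ols(X, y):
    """Minimum l2-norm least-squares solution: beta = pinv(X) @ y."""
    return np.linalg.pinv(X) @ y

rng = np.random.default_rng(0)
n, p = 20, 50                       # overparameterized: more features than samples
X = rng.standard_normal((n, p))     # illustrative design matrix (assumption)
y = rng.standard_normal(n)

beta = min_norm_ols(X, y)

# When X has full row rank, the fitted values reproduce y exactly (interpolation).
interpolates = np.allclose(X @ beta, y)

# Adding any null-space direction keeps the fit but increases the norm.
null_dir = rng.standard_normal(p)
null_dir -= np.linalg.pinv(X) @ (X @ null_dir)   # remove the row-space component
other = beta + null_dir
other_interpolates = np.allclose(X @ other, y)
norm_gap = np.linalg.norm(other) - np.linalg.norm(beta)
```

Here `norm_gap` is positive because `beta` lies in the row space of `X` and is therefore orthogonal to every null-space direction, which is what singles it out among all interpolators.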

2405.10705 2026-06-19 eess.IV cs.CV updated version

3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wenping Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

Affiliations: School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials Devices, ShanghaiTech University, Shanghai, China; National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; School of Electronic Information Communications, Huazhong University of Science; Department of Computer Science & Engineering, Texas A&M University, USA

AI summary: Proposes a vessel probability guided attenuation learning framework that reconstructs sparse-view DSA through a complementary weighted combination of static and dynamic attenuation fields, reducing radiation dose, and uses progressive training and a temporal perturbed rendering loss to improve quality.

Comments: Accepted by Medical Image Analysis (MedIA), 2026

Abstract

Digital Subtraction Angiography (DSA) is one of the gold standards for vascular disease diagnosis. With the help of a contrast agent, time-resolved 2D DSA images deliver comprehensive blood flow information and can be utilized to reconstruct 3D vessel structures for medical assessment. Current commercial DSA systems typically require hundreds of scanning views to perform reconstruction, resulting in substantial radiation exposure. In this study, we propose a neural rendering-based optimization framework tailored for high-quality sparse-view DSA reconstruction to reduce radiation dosage. Our approach, termed vessel probability guided attenuation learning, represents DSA imaging as a complementary weighted combination of static and dynamic attenuation fields, with the weights derived from the time-independent vessel probability field. Functioning as a foreground mask, vessel probability provides proper gradients for both static and dynamic fields adaptive to different scene types. This mechanism enables self-supervised decomposition between static backgrounds and dynamic contrast agent flow, and significantly improves reconstruction quality. Our model is trained by minimizing the discrepancy between synthesized projections and real captured DSA images. We further employ two training strategies to improve reconstruction quality: (1) coarse-to-fine progressive training for better geometry and (2) temporal perturbed rendering loss for temporal consistency. Experimental results have demonstrated high-quality 3D vessel reconstruction and 2D DSA image synthesis.

2104.08928 2026-06-19 stat.ML cs.CL cs.LG updated version

Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani

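Affiliations: W. P. Carey School of Business, Arizona State University; University of Pennsylvania; Wharton School, University of Pennsylvania

AI summary: Proposes a two-stage estimator based on a group-sparse penalty that efficiently transfer-learns domain-specific word embeddings by combining large-scale corpora with limited domain data, and proves a generalization error bound together with the result that local minima of the nonconvex objective are statistically indistinguishable from the global minimum.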

Abstract

Unstructured text provides decision-makers with a rich data source in many domains, ranging from product reviews in retail to nursing notes in healthcare. To leverage this information, words are typically translated into word embeddings -- vectors that encode the semantic relationships between words -- through unsupervised learning algorithms such as matrix factorization. However, learning word embeddings from new domains with limited training data can be challenging, because the meaning/usage may be different in the new domain, e.g., the word "positive" typically has positive sentiment, but often has negative sentiment in medical notes since it may imply that a patient tested positive for a disease. In practice, we expect that only a small number of domain-specific words may have new meanings. We propose an intuitive two-stage estimator that exploits this structure via a group-sparse penalty to efficiently transfer learn domain-specific word embeddings by combining large-scale text corpora (such as Wikipedia) with limited domain-specific text data. We bound the generalization error of our transfer learning estimator, proving that it can achieve high accuracy with substantially less domain-specific data when only a small number of embeddings are altered between domains. Furthermore, we prove that all local minima identified by our nonconvex objective function are statistically indistinguishable from the global minimum under standard regularization conditions, implying that our estimator can be computed efficiently. Our results provide the first bounds on group-sparse matrix factorization, which may be of independent interest. We empirically evaluate our approach compared to state-of-the-art fine-tuning heuristics from natural language processing.
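
To make the idea of a group-sparse penalty concrete, the sketch below applies a row-wise soft-thresholding step (the proximal operator of a group-lasso penalty) to the difference between hypothetical source and target embeddings. This is a generic illustration under assumed toy data, not the paper's two-stage estimator; the function name, the threshold `lam`, and the embedding values are all made up for the example.

```python
import numpy as np

def group_soft_threshold(delta, lam):
    """Row-wise proximal operator of lam * sum_i ||delta_i||_2."""
    norms = np.linalg.norm(delta, axis=1, keepdims=True)
    # Rows with norm <= lam are set to zero; larger rows are shrunk toward zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * delta

# Hypothetical toy data: 5 words with 3-dimensional embeddings.
source = np.zeros((5, 3))
target_estimate = source.copy()
target_estimate[1] += np.array([3.0, 4.0, 0.0])   # word 1 moves a lot (norm 5)
target_estimate[3] += np.array([0.1, 0.0, 0.0])   # word 3 barely moves (norm 0.1)

delta = group_soft_threshold(target_estimate - source, lam=1.0)
changed_words = np.flatnonzero(np.linalg.norm(delta, axis=1) > 0)
transferred = source + delta
```

Only word 1 survives the threshold, so every other row of `transferred` stays equal to its source embedding. That is the kind of structure the abstract describes: a few words change meaning across domains while the rest are carried over.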

2605.13438 2026-06-19 cs.AI cs.CL updated version

CogniFold: Always-On Proactive Memory via Cognitive Folding

Suli Wang, Yiqun Duan, Yu Deng, Rundong Zhao, Dai Shi, Minghua Deng, Chen Chen, Xinliang Zhou

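AI summary: Proposes CogniFold, a brain-inspired proactive memory system that extends Complementary Learning Systems to three layers (hippocampus, neocortex, and a prefrontal intent layer) and uses graph-topology self-organization so that cognitive structure emerges continuously from event streams; it performs well on both cognitive evaluation and conventional memory benchmarks.

Comments: Code is available at https://github.com/OpenNorve/CogniFold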

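AI-generated abstract

Existing agent memory remains predominantly reactive and retrieval-based, lacking the capacity to autonomously organize experience into persistent cognitive structure. Toward genuinely autonomous agents, we introduce CogniFold, a brain-inspired "always-on" agent memory designed for the next generation of proactive assistants. CogniFold continuously folds fragmented event streams into self-emerging cognitive structures, bootstrapping progressively higher-level cognition from incoming events and accumulated knowledge. We ground this by extending Complementary Learning Systems (CLS) theory from two layers (hippocampus, neocortex) to three, adding a prefrontal intent layer. Emulating the prefrontal cortex as the locus of intentional control and decision-making, CogniFold achieves this through graph-topology self-organization: cognitive structures proactively assemble under the event stream, merge when semantically similar, decay when stale, relink through associative recall, and surface intents when concept-cluster density crosses a threshold. We evaluate structural formation using CogEval-Bench, demonstrating that CogniFold uniquely produces memory structures that match cognitive expectations and concept emergence. Furthermore, across 7 broad-coverage benchmarks spanning five cognitive domains, we validate that CogniFold simultaneously performs robustly on conventional memory benchmarks.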

Abstract

Existing agent memory remains predominantly reactive and retrieval-based, lacking the capacity to autonomously organize experience into persistent cognitive structure. Toward genuinely autonomous agents, we introduce CogniFold, a brain-inspired "always-on" agent memory designed for the next generation of proactive assistants. CogniFold continuously folds fragmented event streams into self-emerging cognitive structures, bootstrapping progressively higher-level cognition from incoming events and accumulated knowledge. We ground this by extending Complementary Learning Systems (CLS) theory from two layers (hippocampus, neocortex) to three, adding a prefrontal intent layer. Emulating the prefrontal cortex as the locus of intentional control and decision-making, CogniFold achieves this through graph-topology self-organization: cognitive structures proactively assemble under the stream, merge when semantically similar, decay when stale, relink through associative recall, and surface intents when concept-cluster density crosses a threshold. We evaluate structural formation using CogEval-Bench, demonstrating that CogniFold uniquely produces memory structures that match cognitive expectations and concept emergence. Furthermore, across eight downstream benchmarks -- two probing long-term conversational memory (LoCoMo, LongMemEval) and six spanning other cognitive domains -- we validate that CogniFold simultaneously performs robustly on conventional memory tasks. Our code is available at https://github.com/OpenNorve/CogniFold.

2605.05481 2026-06-19 cs.LG updated version

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

Dillon Sandhu, Ronald Parr

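AI summary: Proposes Approximate Next Policy Sampling (ANPS), which addresses the "chicken-and-egg" problem in reinforcement learning by modifying the training distribution rather than constraining the policy update, and builds on it the Stable Value Approximate Policy Iteration (SV-API) algorithm, which makes larger target policy updates on Atari and continuous control tasks while matching or improving performance.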

Abstract

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to safely improve a policy, the value function must be accurate on the state-visitation distribution of the updated policy. That distribution over states is unknown and cannot be sampled for the purposes of training the value function. Conservative updates solve this problem, but at the cost of shrinking the policy update. This paper explores an alternative solution, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS is satisfied if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and efficacy of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop to hold the target policy fixed while an iteratively updated behavioral policy gathers relevant experience. It only commits to a new policy once a convergence criterion has been met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise, it remains no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while executing substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

2602.13139 2026-06-19 cs.CL updated version

OpenLID-v3: Improving the Precision of Closely Related Language Identification -- An Experience Report

Mariia Fedorova, Nikolay Arefyev, Maja Buljan, Jindřich Helcl, Stephan Oepen, Egil Rønningstad, Yves Scherrer

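AI summary: To address the difficulty existing language identification tools have with closely related languages and with noise, extends the OpenLID classifier by adding training data, merging problematic language variant clusters, and introducing a noise label, yielding OpenLID-v3, which improves precision on several benchmarks.

Comments: VarDial'26 workshop at the EACL 2026 conference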

Abstract

Language identification (LID) is an essential step in building high-quality multilingual datasets from web data. Existing LID tools (such as OpenLID or GlotLID) often struggle to identify closely related languages and to distinguish valid natural language from noise, which contaminates language-specific subsets, especially for low-resource languages. In this work we extend the OpenLID classifier by adding more training data, merging problematic language variant clusters, and introducing a special label for marking noise. We call this extended system OpenLID-v3 and evaluate it against GlotLID on multiple benchmarks. During development, we focus on three groups of closely related languages (Bosnian, Croatian, and Serbian; Romance varieties of Northern Italy and Southern France; and Scandinavian languages) and contribute new evaluation datasets where existing ones are inadequate. We find that ensemble approaches improve precision but also substantially reduce coverage for low-resource languages. OpenLID-v3 is available on https://huggingface.co/HPLT/OpenLID-v3.

2510.27568 2026-06-19 cs.AI cs.CL updated version

SIGMA: Search-Augmented On-Demand Knowledge Integration for Agentic Mathematical Reasoning

Ali Asgarov, Umid Suleymanov, Aadyant Khatri

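AI summary: Proposes the SIGMA framework, which achieves context-sensitive knowledge integration through independent multi-agent reasoning, targeted search, and a moderator mechanism, for a 7.4% absolute performance improvement on benchmarks such as MATH500.

Comments: AAAI 2026 LMReasoning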

Abstract

Solving mathematical reasoning problems requires not only accurate access to relevant knowledge but also careful, multi-step thinking. However, current retrieval-augmented models often rely on a single perspective, follow inflexible search strategies, and struggle to effectively combine information from multiple sources. We introduce SIGMA (Search-Augmented On-Demand Knowledge Integration for AGentic Mathematical reAsoning), a unified framework that orchestrates specialized agents to independently reason, perform targeted searches, and synthesize findings through a moderator mechanism. Each agent generates hypothetical passages to optimize retrieval for its analytic perspective, ensuring knowledge integration is both context-sensitive and computation-efficient. When evaluated on challenging benchmarks such as MATH500, AIME, and PhD-level science QA GPQA, SIGMA consistently outperforms both open- and closed-source systems, achieving an absolute performance improvement of 7.4%. Our results demonstrate that multi-agent, on-demand knowledge integration significantly enhances both reasoning accuracy and efficiency, offering a scalable approach for complex, knowledge-intensive problem-solving. We will release the code upon publication.

2507.05169 2026-06-19 cs.LG cs.AI cs.CL cs.CV cs.RO updated version

Critique of World Model

Eric Xing, Mingkai Deng, Jinyu Hou

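AI summary: Starting from the notion of "hypothetical thinking" in psychology, argues that the core goal of a world model is to simulate all actionable possibilities of the real world, and designs a Generative Latent Prediction (GLP) architecture based on stateful, hierarchical, multi-level, mixed continuous/discrete representations.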

Abstract

World Model, the algorithmic simulator of the real-world environment which biological agents experience and act upon, has been an emerging topic in recent years due to the rising need to develop virtual agents with artificial (general) intelligence. There has been much discussion on what a world model really is, how to build it, how to use it, and how to evaluate it. In this essay, starting from the imagination in the famed Sci-Fi classic Dune, and drawing inspiration from the concept of "hypothetical thinking" in psychology literature, we argue the primary goal of a world model to be simulating all actionable possibilities of the real world for purposeful reasoning and acting. We examine the key design dimensions of world modeling: data, representation, architecture, learning objective, and usage, surveying existing approaches and analyzing their tradeoffs. Building on this examination, we propose a new Generative Latent Prediction (GLP) architecture for a general-purpose world model, based on stateful, hierarchical, multi-level, and mixed continuous/discrete representations, and a generative and self-supervised learning framework, with an outlook of a Physical, Agentic, and Nested (PAN) AGI system enabled by such a model.

2606.04101 2026-06-19 cs.DC cs.LG updated version

UltraEP: Unleash MoE Training and Inference on Rack-Scale Nodes with Near-Optimal Load Balancing

Xinming Wei, Chao Jin, Tuo Dai, Yinmin Zhong, Shan Yu, Chengxu Yang, Bingyang Wu, Zili Zhang, Jing Mai, Qianchao Zhu, Zhouyang Li, Yuliang Liu, Guojie Luo

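AI summary: Proposes UltraEP, the first exact-load, real-time balancer, which co-designs plan solving and expert replication communication to rebalance every microbatch and layer of MoE training and inference on rack-scale nodes, reaching 94.3% of the force-balanced ideal throughput.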

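AI-generated abstract

Large-scale expert parallelism (EP) is becoming pivotal for training and serving frontier MoE models, but it also amplifies device-level expert load imbalance into compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers redistribute experts periodically based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first exact-load, real-time balancer for large-EP MoE training and serving prefill on rack-scale nodes (RSNs). Leveraging the extended scale-up connectivity of RSNs, UltraEP rebalances every microbatch and layer on critical paths, which requires nontrivial co-design of plan solving and expert replication communication to minimize exposed overhead. To this end, UltraEP eagerly reacts to post-gating load with an efficient quota-driven planner, and executes the resulting irregular expert-state transfers with RSN-native persistent tile streaming and relay-based fan-out mitigation. Averaged across training and prefill on MoE models from 106B to 671B parameters, UltraEP achieves 94.3% of the force-balanced ideal throughput, a 1.49$\times$ improvement over no balancing, while reducing the final inter-rank imbalance from 1.30$-$4.01 to 1.01$-$1.04. In addition, we validate the scalability and robustness of UltraEP in production MoE training on 2560 GPUs.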

Abstract

Large-scale expert parallelism (EP) is becoming pivotal for training and serving frontier MoE models, but it also amplifies device-level expert load imbalance into compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers redistribute experts periodically based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first exact-load, real-time balancer for large-EP MoE training and serving prefill on rack-scale nodes (RSNs). Leveraging the extended scale-up connectivity among dozens of GPUs within RSNs, UltraEP rebalances every microbatch and layer on critical paths, which requires nontrivial co-design of plan solving and expert replication communication to minimize exposed overhead. To this end, UltraEP eagerly reacts to post-gating load with an efficient quota-driven planner, and executes the resulting irregular expert-state transfers with RSN-native persistent tile streaming and relay-based fan-out mitigation. We evaluate UltraEP in a multi-RSN deployment of up to 256 GPUs, using cutting-edge MoE models from 106B to 671B parameters. Averaged across training and serving, UltraEP achieves 94.3% of the force-balanced ideal throughput, delivering 1.49$\times$ improvement over no-balancing, while reducing the final inter-rank imbalance from 1.30$-$4.01 to 1.01$-$1.04.
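
The inter-rank imbalance figures quoted above can be made concrete with a toy calculation. The sketch below assumes imbalance means the maximum rank load divided by the mean rank load (the abstract does not define the metric), and it uses a simple heaviest-first greedy placement as a stand-in; it is not UltraEP's quota-driven planner, and the token counts are invented.

```python
import heapq

def imbalance(rank_loads):
    """Max rank load divided by mean rank load (1.0 means perfectly balanced)."""
    return max(rank_loads) / (sum(rank_loads) / len(rank_loads))

def greedy_place(items, num_ranks):
    """Place (name, load) items heaviest-first onto the currently lightest rank."""
    heap = [(0, rank) for rank in range(num_ranks)]
    heapq.heapify(heap)
    for _name, load in sorted(items, key=lambda item: -item[1]):
        total, rank = heapq.heappop(heap)
        heapq.heappush(heap, (total + load, rank))
    # Return per-rank totals ordered by rank index.
    return [total for total, _rank in sorted(heap, key=lambda entry: entry[1])]

# Hypothetical per-expert token counts for one microbatch; expert e0 is hot.
experts = [("e0", 900), ("e1", 100), ("e2", 100), ("e3", 100),
           ("e4", 400), ("e5", 400), ("e6", 200), ("e7", 200)]

# Static layout: experts 2r and 2r+1 live on rank r.
static_loads = [experts[2 * r][1] + experts[2 * r + 1][1] for r in range(4)]

# Moving whole experts cannot push the hot expert's rank below 900 tokens.
placed_loads = greedy_place(experts, num_ranks=4)

# Replicating e0 on two ranks lets its tokens be split between the replicas.
replicated = [("e0a", 450), ("e0b", 450)] + experts[1:]
replicated_loads = greedy_place(replicated, num_ranks=4)
```

In this toy setting the imbalance drops from about 1.67 (static) to 1.5 (re-placement only) to about 1.08 once the hot expert is replicated, which illustrates why a balancer might replicate experts rather than only move them.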

2502.19193 2026-06-19 cs.SI cs.AI cs.NE updated version

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

Jinyu Cai, Yusei Ishimizu, Mingyue Zhang, Munan Li, Jialong Li, Kenji Tei

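AI summary: Proposes an LLM-based multi-agent framework combined with a genetic algorithm to simulate the iterative evolution of user language strategies under regulation; experiments show that more dialogue rounds improve both the accuracy of information transmission and the number of uninterrupted dialogue turns.

Comments: The manuscript has been accepted to IEEE Transactions on Computational Social Systems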

Abstract

Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints. In this framework, participant agents, as social media users, continuously evolve their language expression, while supervisory agents emulate platform-level regulation by assessing policy violations. To achieve a more faithful simulation, we employ a dual design of language strategies (constraint and expression) to differentiate conflicting goals and utilize an LLM-driven GA (Genetic Algorithm) for the selection, mutation, and crossover of language strategies. The framework is evaluated using two distinct scenarios: an abstract password game and a realistic simulated illegal pet trade scenario. Experimental results demonstrate that as the number of dialogue rounds increases, both the number of uninterrupted dialogue turns and the accuracy of information transmission improve significantly. Furthermore, a user study with 40 participants validates the real-world relevance of the generated dialogues and strategies. Moreover, ablation studies validate the importance of the GA, emphasizing its contribution to long-term adaptability and improved overall results.

2605.03064 2026-06-19 cs.LO updated version

Neural networks as fuzzy logic formulas

Damian Heiman, Antti Kuusisto, Esko Turunen

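AI summary: Provides fuzzy logic characterisations of rational-weight ReLU-activated neural networks, whose activation values may be arbitrary real numbers, via Rational Pavelka logic and its extensions, and gives analogous characterisations of a generalised polynomial ring over the rationals in which the ReLU function is permitted.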

Abstract

Neural networks are a fundamental aspect of modern artificial intelligence, playing a key role in various important machine learning architectures including transformers and graph neural networks. Recently, logical characterisations have been used to study the expressive power of many machine learning architectures, but logical characterisations of plain neural networks have received less attention. In this paper, we provide fuzzy logic characterisations of rational-weight ReLU-activated neural networks via Rational Pavelka logic ($\mathrm{RPL}$) and an extension of $\mathrm{RPL}$ called $\mathrm{RPL}(\odot)_{\leq 1}$, as well as two fragments of $\mathit{L \Pi} \frac{1}{2}$ called $\mathit{L \Pi} \frac{1}{2}(\rightarrow_{P}^-)_{\leq 1}$ and $\mathit{L \Pi} \frac{1}{2}(\odot^-, \rightarrow_{P}^-)$. The activation values of the neural networks are allowed to be arbitrary real numbers. We also provide fuzzy logic characterisations of a generalised polynomial ring over $\mathbb{Q}$ in countably many variables where the use of the ReLU-function is permitted via the fuzzy logic $\mathrm{RPL}(\odot)$ and a fragment of $\mathit{L \Pi} \frac{1}{2}$ called $\mathit{L \Pi} \frac{1}{2}(\rightarrow_{P}^-)$.
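
One way to see why ReLU networks and Łukasiewicz-style fuzzy logics are related is that the standard Łukasiewicz connectives on [0, 1] are themselves affine maps composed with ReLU. The sketch below checks these textbook identities numerically; it illustrates the general connection only and is not the construction used in the paper. The function names are chosen for this example.

```python
def relu(x):
    return max(0.0, x)

def luk_and(x, y):
    """Lukasiewicz strong conjunction: max(0, x + y - 1)."""
    return relu(x + y - 1.0)

def luk_or(x, y):
    """Lukasiewicz strong disjunction: min(1, x + y), written with one ReLU."""
    return 1.0 - relu(1.0 - x - y)

def luk_implies(x, y):
    """Lukasiewicz implication: min(1, 1 - x + y), written with one ReLU."""
    return 1.0 - relu(x - y)

# Compare the ReLU forms against the min-based definitions on a grid in [0, 1].
grid = [i / 10 for i in range(11)]
agrees = all(
    abs(luk_or(x, y) - min(1.0, x + y)) < 1e-12
    and abs(luk_implies(x, y) - min(1.0, 1.0 - x + y)) < 1e-12
    for x in grid
    for y in grid
)
```

Each connective is a single ReLU unit with weights in {-1, 1} and a rational bias, followed by an affine output, so formulas built from them correspond to small rational-weight ReLU networks.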

2605.02989 2026-06-19 cs.IT eess.SP math.IT stat.ML updated version

Information Theory and Statistical Learning

Abbas El Gamal

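AI summary: A preprint of a chapter under consideration for the third edition of Cover and Thomas's Elements of Information Theory; it presents the role of divergence measures in model training, with examples ranging from linear regression to generative diffusion models, and gives a more systematic derivation of the diffusion model.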

Abstract

This manuscript contains a preprint of a chapter under consideration for inclusion in the forthcoming third edition of Cover and Thomas's Elements of Information Theory, posted with permission from Wiley. The table of contents of the new edition (EIT-3 ToC) is available at https://docs.google.com/document/d/1L-m4oQEJw1PJhoxBeMwrrBD8S_HmvzMEkPbYvS24980/edit?usp=sharing and feedback can be sent to abbas@ee.stanford.edu. Learning and information theory intersect in both model training and the characterization of fundamental performance limits. This manuscript provides a concise and accessible treatment of the first intersection, requiring only basic background in information theory and statistics at the senior undergraduate or first-year graduate level. End-of-chapter exercises make the material well suited for classroom use as well as self-study. The chapter focuses on the role of divergence measures in model training, with examples ranging from linear and logistic regression to autoregressive models, variational autoencoders, diffusion models, generative adversarial networks, and score-based models. It introduces the evidence lower bound (ELBO), f-divergences, and the Fisher divergence. In particular, the treatment of the generative diffusion model provides a more systematic and explicit derivation than is typical in the literature.
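
For readers unfamiliar with the ELBO mentioned above, the standard identity linking it to the Kullback-Leibler divergence is reproduced below. The notation ($p_\theta$ for the model, $q$ for the variational distribution, $z$ for the latent variable) is a common convention assumed here and may differ from the chapter's own.

```latex
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p_\theta(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q,\,\theta)}
  + D_{\mathrm{KL}}\!\left(q(z)\,\middle\|\,p_\theta(z \mid x)\right)
  \;\ge\; \mathrm{ELBO}(q,\,\theta)
```

Because the KL term is nonnegative, maximizing the ELBO over $q$ tightens the bound, and maximizing it over $\theta$ pushes up a lower bound on the log-likelihood.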

2604.21804 2026-06-19 physics.ins-det hep-ex hep-ph updated version

Agentic-AI Detector Co-design and Optimization in Vertically-Integrated Differentiable Full Simulations

Wonyong Chung, Qibin Liu, Liangyu Wu, Julia Gonski

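AI summary: Proposes a bi-level optimization framework that integrates AI agents into high-energy physics detector design, jointly optimizing geometry, front-end digitization, and reconstruction algorithm parameters in differentiable full simulations, and finds an optimal design point under competing performance criteria.

Comments: 7 pages, 3 figures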

Abstract

We present the first implementation of AI agents into the design and optimization of detectors in high-energy physics experiments via a bi-level optimization framework that vertically integrates detector geometry, front-end digitization, and high-level reconstruction algorithm parameters in differentiable full simulations. Using the example of a dual-readout, segmented crystal EM calorimeter with a baseline resolution of $3\%/\sqrt{E}$, we investigate the capabilities and value propositions of AI agents in the identification and reduction of key detector parameters and in the nonlinear traversal of design space. We find that frontier LLM reasoning-models today, without being given additional experiment-specific context, are able to effectively execute complex workflows and proactively suggest generic but relevant avenues for further study or improvement. Here, we demonstrate an AI agent's ability to find an optimal design point amidst three competing performance criteria, showing that effective integration of agents into the complex workflows of frontier research areas can yield higher performance for key physics goals while reducing labor and compute. This study establishes the foundation for a future demonstration of the first fully AI-designed detector for future scientific facilities.

2605.04823 2026-06-19 hep-th cond-mat.stat-mech updated version

Expectation values after an integrable boundary quantum quench

Zoltán Bajnok, Dávid Fülepi, Máté Lencsés

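AI summary: Studies an integrable boundary quench, analyzing the real-time dynamics with form factors of bulk and boundary-changing operators, computing vacuum matrix elements of local operators inserted after the quench, and validating the results with the truncated conformal space approach.

Comments: 1+37 pages, 20 figures; v2 minor revision, references added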

Abstract

We investigate an integrable boundary quench, in which one integrable boundary condition is suddenly switched to another. We develop a general framework for analyzing the resulting real-time dynamics based on form factors of bulk and boundary-changing operators. We first study the problem at the conformal point of the Lee-Yang model and then extend the analysis to its massive perturbation, where we examine the time evolution of the pre-quench vacuum and compute the vacuum-to-vacuum matrix elements of local operators inserted after the quench. The analytical results are validated by numerical calculations using the truncated conformal space approach adapted to boundary-changing situations.

2605.02238 2026-06-19 astro-ph.SR updated version

Low-luminosity Wolf-Rayet stars: a model-data comparison

Siyu Wu, Zhi Li, Yan Li

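AI summary: Tests the HR-diagram positions and wind properties of low-luminosity WC/WNC stars against single-star evolutionary models; revised WR winds can alleviate the luminosity-side tension, but the WNC stars provide the strongest evidence that additional mixing, stripping, or binary-related channels may be required.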

Abstract

A growing number of Galactic Wolf-Rayet (WR) stars, in particular WC and transitional WN/C (WNC) objects, have been reported at comparatively low luminosities. If confirmed, these low-luminosity WR stars provide stringent tests of stellar-evolution models, because their HR-diagram locations and surface compositions are highly sensitive to internal mixing and to the adopted WR-phase mass-loss history. We examine whether the HR-diagram positions and wind properties of low-luminosity WC/WNC stars can be reproduced by single-star evolutionary tracks at approximately solar metallicity, and we identify cases where additional channels (e.g. binary stripping) or dominant systematic uncertainties are likely required. Low-luminosity WNC/WC stars offer sensitive leverage on WR mixing and mass-loss prescriptions. A staged model-data comparison shows that revised WR winds can alleviate the luminosity-side tension for faint WCL stars, but the simultaneous requirements of temperature, surface composition, and WR-like wind density remain important. The WNC stars provide the strongest evidence that additional mixing, stripping, or binary-related channels may be required.

2606.09824 2026-06-19 cs.DB 版本更新

TSseek: Regular Expression-Based Similarity Search for Distributed Time Series Datasets

TSseek: 基于正则表达式的分布式时间序列数据集相似性搜索

Xiaoshuai Li, Khalid Alnuaim, Mohamed Y. Eltabakh, Elke A. Rundensteiner

AI总结 提出TSseek框架,通过正则表达式查询语言支持趋势、值范围和通配符模式搜索,并构建分布式空间索引TSseek-X实现高效精确匹配。

Comments Extended version with full ablation studies and additional experiments. v3 corrects bibliographic metadata for several references

详情
AI中文摘要

相似性搜索是时间序列分析中的基本操作。然而,大多数现有技术要求用户提供精确的值序列(通常是整个时间序列对象)作为查询输入。这种严格的要求限制了实际应用,用户更希望表达模式、趋势或值范围。灵活的基于模式的搜索已在文本检索和复杂事件处理中得到探索,但在大规模分布式时间序列中仍未得到充分研究。为弥补这一差距,我们提出TSseek,一个基于正则表达式的分布式时间序列数据集搜索框架。TSseek的查询语言使用户能够组合包含趋势、值范围和通配符片段的模式。我们表明,传统的近似技术(如PAA和SAX)及其索引结构不适合此类查询,因为它们无法对正则表达式查询构造进行操作。在TSseek中,我们通过将时间序列对象近似为保留趋势(斜率方向)和值范围的线段序列,并将查询构造转换为边界矩形,将时间序列对象和查询构造映射到同一空间。为支持高效处理,我们构建了TSseek-X,一个基于时间序列片段的分布式空间索引。TSseek支持两种基本查询类型:全匹配查询(针对整个序列)和子序列匹配查询(针对序列内的任意窗口)。在基准和真实数据集上,全扫描、基于模型和基于SAX的基线方法要么牺牲准确性,要么牺牲速度,而TSseek能高效地返回精确答案。此外,对于子序列工作负载,它比最先进的子序列匹配引擎实现了显著的加速。

英文摘要

Similarity search is a fundamental operation in time series analysis. Most existing techniques, however, require users to supply a precise sequence of values (typically an entire time series object) as the query input. This rigid requirement limits real-world applications, where users instead want to express patterns, trends, or value ranges. Flexible, pattern-based search has been explored in text retrieval and complex event processing, but remains underexplored for large-scale distributed time series. To close this gap, we propose TSseek, a regular-expression-powered search framework for distributed time series datasets. TSseek's query language enables users to compose patterns encompassing trends, value ranges, and wildcard segments. We show that conventional approximation techniques (e.g., PAA and SAX) and their index structures are ill-suited for such queries because they cannot operate on regular-expression query constructs. In TSseek, we map the time series objects and the query constructs into the same space by approximating time series objects as sequences of line segments that retain both trend (slope direction) and value range, and translating query constructs into bounding rectangles. To support efficient processing, we build TSseek-X, a distributed spatial index over the time series segments. TSseek supports two fundamental query types, namely whole-matching queries (over entire series) and subsequence-matching queries (over arbitrary windows within a series). Across benchmark and real-world datasets, full-scan, model-based, and SAX-based baselines all sacrifice either accuracy or speed, whereas TSseek returns exact answers efficiently. Also, for subsequence workloads, it achieves significant speedups over state-of-the-art subsequence matching engines.
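The mapping described in the abstract (series summarised as segments that keep a trend direction and a value range, queries expressed as range constraints) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not TSseek's actual segmentation, query language, or TSseek-X index: the fixed-width windows, the endpoint-based trend, and the names `Segment`, `Construct`, `to_segments`, `matches`, and `whole_match` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    trend: str   # "up", "down", or "flat" (slope direction)
    lo: float    # minimum value inside the segment
    hi: float    # maximum value inside the segment


@dataclass
class Construct:
    trend: str   # required trend, or "*" as a wildcard
    lo: float    # lower bound of the allowed value range
    hi: float    # upper bound of the allowed value range


def to_segments(series, width):
    """Split a series into fixed-width windows and summarise each one.

    Assumption: the trend is taken from the sign of (last - first) value;
    a real system would fit line segments more carefully.
    """
    segments = []
    for start in range(0, len(series) - width + 1, width):
        window = series[start:start + width]
        delta = window[-1] - window[0]
        trend = "up" if delta > 0 else "down" if delta < 0 else "flat"
        segments.append(Segment(trend, min(window), max(window)))
    return segments


def matches(segment, construct):
    """A segment matches if its trend agrees and its range fits the bounds."""
    trend_ok = construct.trend == "*" or construct.trend == segment.trend
    range_ok = construct.lo <= segment.lo and segment.hi <= construct.hi
    return trend_ok and range_ok


def whole_match(series, pattern, width):
    """Whole-matching: every segment must match its construct, in order."""
    segments = to_segments(series, width)
    return len(segments) == len(pattern) and all(
        matches(s, c) for s, c in zip(segments, pattern)
    )
```

For instance, `whole_match([1, 2, 3, 3, 2, 1], [Construct("up", 0, 5), Construct("*", 0, 5)], 3)` checks for a rising segment followed by any segment, both within the value range 0 to 5. A linear scan like this is exactly what an index is meant to avoid; the sketch only shows the matching predicate.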

2606.07751 2026-06-19 astro-ph.GA astro-ph.SR 版本更新

A Colour-colour Fingerprint Links the UV Upturn in Early-type Galaxies to Second-generation Stars from Dissolved Globular Clusters

颜色-颜色指纹将早型星系中的紫外超与溶解球状星团的第二代恒星联系起来

Paul Goudfrooij, Andrea Bellini, Thomas M. Brown, Thomas H. Puzia

AI总结 通过HST/WFC3观测,发现F275W-F390W颜色梯度与紫外超强度相关,支持富金属球状星团溶解产生的第二代恒星(高氦、高氮)是紫外超起源的假说。

Comments 7 pages, 4 figures, accepted for publication as a MNRAS Letter

Journal ref MNRAS, Vol. 549, 1-7 (2026)

详情
AI中文摘要

我们探讨了早型星系(ETGs)中两个与质量相关的性质:(1)丰度比[N/Fe]和[Na/Fe],以及(2)远紫外(FUV)波段中心集中的“紫外超”,这很可能由具有超太阳氦丰度的极端水平分支星产生。利用HST/WFC3对一个FUV弱和一个FUV强的ETG的新观测,我们检验了Goudfrooij提出的“MP情景”,该情景认为紫外超以及ETG内部和之间N和Na随质量变化的丰度差异在物理上相关,并由富金属球状星团的溶解产生——这是已知唯一发生He、N和Na质量依赖增丰的星系环境(即“多重星族”现象的第二代恒星)。我们表明,当结合F475W和F850LP的存档数据时,F275W和F390W波段对积分光测中$Y$和[N/Fe]的相关变化特别敏感。虽然F475W-F850LP在两个星系中都随半径增加而减小(与已知的金属丰度梯度一致),但F275W-F390W随半径增加而增大,正如紫外超由具有超太阳$Y$和[N/Fe]的第二代恒星引起所预期的那样。此外,F275W-F390W的径向梯度以及He和N增强星的隐含比例在FUV强的ETG中显著大于FUV弱的ETG,这与MP情景的预测一致。

英文摘要

We address two mass-dependent properties among early-type galaxies (ETGs): (1) abundance ratios [N/Fe] and [Na/Fe], and (2) the centrally concentrated "UV upturn" at far-UV (FUV) wavelengths, which is likely produced by extreme horizontal branch stars with supersolar helium abundances. Using new HST/WFC3 observations of one FUV-weak and one FUV-bright ETG, we probe the "MP scenario" by Goudfrooij who posited that the UV upturn and the mass-dependent abundance variations of N and Na within and among ETGs are physically connected and produced by dissolution of metal-rich globular clusters, which represent the only galactic environment where mass-dependent enrichment of He, N, and Na is known to occur (i.e., second-generation stars of the "multiple stellar populations" (MPs) phenomenon). We show that passbands F275W and F390W are uniquely sensitive to correlated changes in $Y$ and [N/Fe] in integrated-light photometry when combined with archival data in F475W and F850LP. While F475W-F850LP is found to decrease with increasing radius in both galaxies, consistent with known metallicity gradients, F275W-F390W increases with increasing radius, as expected if the UV upturn is caused by second-generation stars with supersolar $Y$ and [N/Fe]. Furthermore, the radial gradient in F275W-F390W and the implied fractions of He- and N-enhanced stars are found to be significantly larger in the FUV-bright ETG than in the FUV-weak one, consistent with the predictions of the MP scenario.

2605.03894 2026-06-19 math.AT math.CO 版本更新

Quasimonophobic graphs and degree spectral sequences in discrete cubical homology

拟单恐惧图与离散立方同调中的度谱序列

Samira Sahar Jamil, Mark Behrens

AI总结 引入图的离散立方链复形上的度过滤,定义基于奇异n-立方体面的最大内射维数,研究由此产生的度谱序列,该序列插值离散立方同调与内射同调,并引入拟单恐惧性条件证明谱序列消失及内射同调同构于填充子立方后的CW复形同调,应用于计算Greene球面图的H_2。

Comments v3: corrected minor typos

详情
AI中文摘要

我们在图的离散立方链复形上引入度过滤,该过滤由奇异$n$-立方体面的最大内射维数定义,并研究由此过滤产生的度谱序列。该谱序列在图的离散立方同调$H_n(G)$与内射同调$H_n^{inj}(G)$之间插值,后者是基于内射奇异立方体的离散立方同调的一个变体。基于Babson等人的工作,我们引入了图的拟单恐惧性组合条件,并证明拟单恐惧性意味着度谱序列在某些双次数下消失,并且$H_n^{inj}(G)$同构于通过“填充”图的子立方体得到的CW复形的同调。这些结果应用于计算Greene球面图$G^{sph}_n$的$H_2(G_n^{sph})$。

英文摘要

We introduce the degree filtration on the discrete cubical chain complex of a graph, defined in terms of the maximal injective dimension of the facets of singular $n$-cubes, and study the degree spectral sequence which arises from this filtration. This spectral sequence interpolates between the discrete cubical homology of a graph $H_n(G)$ and the injective homology $H_n^{inj}(G)$, a variant of the discrete cubical homology based on injective singular cubes. Building on the work of Babson et al., we introduce the combinatorial condition of quasimonophobicity on graphs, and show that quasimonophobicity implies both that the degree spectral sequence vanishes in certain bidegrees and that $H_n^{inj}(G)$ is isomorphic to the homology of the CW complex obtained by ``filling in'' subcubes of the graph. These results are applied to compute $H_2(G_n^{sph})$ for the Greene sphere graphs $G^{sph}_n$.

2606.06971 2026-06-19 cs.MA cs.SI 版本更新

Modeling U.S. Attitudes Toward China via an Event-Steered Multi-Agent Simulator

通过事件驱动的多智能体模拟器建模美国对华态度

Chenxu Zhu, Hantao Yao, Wu Liu, Junbo Guo, Yongdong Zhang

AI总结 提出事件驱动多智能体模拟器(ES-MAS),利用CURE数据集和双流数据集成引擎(DSDIE)及新闻驱动动态交互模块(NDDI),模拟美国对华舆论的动态演化,实验表明优于现有模型。

详情
AI中文摘要

理解舆论的动态演化,如美国公众对中国的态度,对于评估地缘政治风险至关重要。然而,现有的基于LLM的多智能体模拟器主要依赖静态规则和固定数据集,限制了其捕捉现实世界中宏观层面舆论转变的动态、事件驱动特性的能力。为解决这一限制,我们提出了一种事件驱动的多智能体模拟器(ES-MAS),其中重大事件和日常新闻通过智能体之间的动态交互持续驱动舆论演化。我们首先构建了中美关系演化(CURE)数据集,涵盖2021年至2025年的20个季度,包括258个重大事件和超过14,000篇日常新闻文章,为建模舆论动态提供了全面的时间基础。基于CURE数据集,我们提出了双流数据集成引擎(DSDIE),该引擎通过宏观层面事件将模拟与历史时间线对齐,同时基于个体智能体画像和上下文信号实现个性化信息暴露。此外,我们设计了新闻驱动的动态交互(NDDI)模块,该模块自适应地将具有共同新闻兴趣的智能体分组到局部交互上下文中,促进自下而上的共识形成,同时降低孤立信息茧房的风险。在CURE数据集上的实验结果表明,ES-MAS在复现真实世界历史趋势方面显著优于现有模拟器,为建模动态舆论演化提供了一个可扩展且有效的框架。

英文摘要

Understanding the dynamic evolution of opinions, such as U.S. public attitudes toward China, is essential for assessing geopolitical risks. However, existing LLM-based multiagent simulators predominantly rely on static rules and fixed datasets, limiting their ability to capture the dynamic, event-driven nature of macro-level opinion shifts in real-world settings. To address this limitation, we propose an Event-Steered Multi-Agent Simulator (ES-MAS), in which significant events and daily news continuously drive opinion evolution through dynamic interactions among agents. We first construct the China-U.S. Relation Evolution (CURE) dataset, covering 20 quarters from 2021 to 2025, including 258 major events and over 14,000 daily news articles, and providing a comprehensive temporal foundation for modeling opinion dynamics. Building upon the CURE dataset, we propose a Dual-Stream Data Integration Engine (DSDIE) that aligns simulations with historical timelines via macro-level events while enabling personalized information exposure based on individual agent profiles and contextual signals. Furthermore, we design a News-Driven Dynamic Interaction (NDDI) module, which adaptively groups agents with shared news interests into localized interaction contexts, facilitating bottom-up consensus formation while mitigating the risk of isolated information cocoons. Experimental results on the CURE dataset demonstrate that ES-MAS substantially outperforms existing simulators in reproducing real-world historical trends, offering a scalable and effective framework for modeling dynamic opinion evolution.

2606.06180 2026-06-19 hep-ph hep-ex hep-lat 版本更新

Vector charmonium(-like) states in the energy range of 4.1-4.6 GeV

4.1-4.6 GeV 能量范围内的矢量粲偶素(类)态

Xiang-Kun Dong, Vadim Baru, Leon von Detten, Feng-Kun Guo, Christoph Hanhart, Teng Ji, Ulf-G. Meißner, Alexey Nefediev

AI总结 针对 4.1-4.6 GeV 能量区域矢量粲偶素(类)态谱的长期争议,本文发展了一个统一耦合道框架,通过同时拟合 BESIII 多个截面积数据,证明强耦合道效应和动力学产生极点可解释观测到的线形行为。

Comments 62 pages, 14 figures and 9 tables. Additional $\chi_{c0}\omega$ data included. Discussion on HQSS partners added

详情
AI中文摘要

4.1-4.6 GeV 能量区域的矢量粲偶素(类)态谱存在长期争议。尽管包含性 $R$ 值表明只存在通常被解释为常规态的矢量粲偶素,但遍举 $e^+e^-$ 截面在采用 Breit-Wigner 函数拟合时揭示了额外的结构,其参数强烈依赖于观测的末态。这种令人困惑的模式表明耦合道和阈值效应起着关键作用。在本工作中,我们发展了一个适用于所考虑能量区域中 $1^{--}$ 共振态的统一耦合道框架。该框架包含了由重夸克自旋对称性约束的 $S$ 波开粲道 $D\bar{D}_1$、$D^*\bar{D}_1$ 和 $D^*\bar{D}_2^*$,可选的对应夸克模型态(可能与 $\psi(4160)$ 和 $\psi(4415)$ 相关)的裸极点,以及对于三体末态相关的 $Z_c$ 道中的末态相互作用。我们基于所构建的框架采用几个基准模型,对 BESIII 的 $e^+e^-\to J/\psi\pi^+\pi^-$、$h_c\pi^+\pi^-$、$D\bar{D}^*\pi$、$D^*\bar{D}^*\pi$、$J/\psi\eta$ 和 $\chi_{c0}\omega$ 截面数据,以及呈现 $Z_c(3900)$ 和 $Z_c(4020)$ 结构的可用不变质量分布进行同时拟合。这些模型在裸种子态数目和拟合策略上有所不同。我们表明,即使是纯动力学方案也能捕捉所有分析分布的主要特征。因此我们得出结论,所研究能量范围内测量线形的非平凡行为可以用强耦合道效应和动力学产生极点来理解。包含裸致密态会定量改善拟合质量,但不改变这一结论。

英文摘要

The spectrum of vector charmonium(-like) states in the 4.1-4.6 GeV energy region exhibits a long-standing tension between inclusive and exclusive measurements. While the inclusive $R$-value indicates only conventional vector charmonia such as $\psi(4160)$ and $\psi(4415)$, exclusive $e^+e^-$ cross sections reveal additional structures whose parameters strongly depend on the observed final states when fitted with Breit-Wigner functions. This puzzling pattern suggests that coupled-channel and threshold effects play an essential role. In this work, we develop a unified coupled-channel framework for the $1^{--}$ resonances in this energy region. The framework incorporates the $S$-wave open-charm channels $D\bar{D}_1$, $D^*\bar{D}_1$, and $D^*\bar{D}_2^*$ constrained by heavy-quark spin symmetry, optional bare poles associated with $\psi(4160)$ and $\psi(4415)$, and final-state interactions in the $Z_c$ channels. We perform simultaneous fits to the BESIII cross sections for $e^+e^-\to J/\psi\pi^+\pi^-$, $h_c\pi^+\pi^-$, $D\bar{D}^*\pi$, $D^*\bar{D}^*\pi$, $J/\psi\eta$, and $\chi_{c0}\omega$, together with invariant-mass distributions exhibiting the $Z_c(3900)$ and $Z_c(4020)$ structures. The benchmark models differ in the number of bare seed states and the fitting strategy. We show that even the purely dynamical scheme without bare charmonia captures the gross features of the analyzed distributions. The inclusion of bare compact states improves the fit quality but does not change the conclusion that the measured line shapes can be understood in terms of strong coupled-channel effects with dynamically generated poles. We also discuss possible heavy-quark spin partners of the exotic $1^{--}$ states.

2606.06138 2026-06-19 cond-mat.quant-gas physics.atom-ph quant-ph 版本更新

Charge-Conjugation Violation and Population Asymmetry in Bipartite Fermionic Lattices

电荷共轭破坏与二分费米子晶格中的布居不对称性

Di Xiao, Xue-Ting Fang, Lushuai Cao, Zhong-Kun Hu, Peter Schmelcher

AI总结 本文通过二分费米子晶格中的子晶格扭结展示了内禀电荷共轭破坏机制,其源于图拓扑性质,并导致布居不对称性及谱中的隐藏叶状结构。

详情
AI中文摘要

电荷共轭破坏(CCV)是粒子物理中的核心概念,也出现在量子多体系统的准粒子中,通常依赖于底层系统中嵌入的外部对称性破缺。一个开放问题是内禀CCV机制如何产生及其宏观后果。我们建立了二分费米子晶格中的子晶格扭结作为展示内禀CCV的具体设置。子晶格扭结的内禀CCV基于底层哈密顿量的图拓扑性质,没有发生显式对称性破缺。它导致不同构型的布居不对称性,并在本征能谱中留下隐藏的叶状结构。布居不对称性还导致由淬火动力学中的真空不稳定性触发的子晶格扭结产生的不平衡。我们的工作证明了图拓扑作为内禀CCV的微观起源,布居不对称性作为宏观后果,所提出的设置非常适合于通过冷原子量子模拟器进行实验实现。

英文摘要

Charge conjugation violation (CCV) is a central concept in particle physics and also appears for quasiparticles in quantum many-body systems, where it typically relies on an external symmetry breaking embedded in the underlying system. An open question is how an intrinsic CCV mechanism could emerge and what its macroscopic consequences would be. We establish sublattice kinks in bipartite fermionic lattices as a concrete setup showing intrinsic CCV. The intrinsic CCV of the sublattice kink is based on the graph-topological nature of the underlying Hamiltonian, with no explicit symmetry breaking taking place. It leads to a population asymmetry of different configurations and imprints a hidden leaf-like structure in the eigenenergy spectrum. The population asymmetry also leads to an imbalanced sublattice-kink production triggered by the vacuum instability in the quench dynamics. Our work demonstrates graph topology as the microscopic origin of intrinsic CCV, with the population asymmetry as its macroscopic consequence. The proposed setup is highly amenable to experimental implementation via cold-atom quantum simulators.

2606.05845 2026-06-19 cond-mat.mes-hall cond-mat.stat-mech physics.optics 版本更新

Breakdown of Fluctuational Electrodynamics in the Extreme Near Field

极端近场中涨落电动力学的失效

Philippe Ben-Abdallah

AI总结 本文通过微观耦合振子模型和格林张量方法,证明在极端近场区域,不同物体间的热涨落不再独立,导致涨落电动力学失效,并给出辐射热流的关联修正。

详情
AI中文摘要

涨落电动力学依赖于不同物体中热涨落在统计上独立的假设。我们证明,在极端近场区域,这一近似失效,因为重叠的倏逝表面场会杂化纳米真空间隙两侧的光学声子,并在相对界面之间产生涨落电流交叉关联。利用微观耦合振子模型结合坡印廷矢量的格林张量表述,我们推导了由此产生的辐射热流的关联修正。对于支持表面声子-极化激元的极性材料,当杂化能量与固有阻尼率相当时,这些关联变得显著,并能在亚纳米间距下显著改变传统涨落电动力学的预测。我们的结果为极端近场区域中的关联热涨落建立了微观框架,并量化了它们对辐射传热的影响。

英文摘要

Fluctuational electrodynamics relies on the assumption that thermal fluctuations in distinct bodies are statistically independent. It is shown that this approximation breaks down in the extreme near-field regime, where hybridization of surface phonon-polaritons across nanometric vacuum gaps generates finite fluctuating-current cross correlations between opposite interfaces. Using a microscopic coupled-oscillator model combined with a Green-tensor formulation of the Poynting vector, the resulting correlation-induced correction to radiative heat transfer is derived. For polar materials, these correlations become significant when the hybridization energy approaches the intrinsic damping rate and can substantially modify conventional fluctuational-electrodynamics predictions at subnanometric separations.

2606.05306 2026-06-19 hep-lat hep-ph hep-th nucl-th 版本更新

Gauge field flow for chiral gauge theories on a slab

平板上的手征规范理论的规范场流

Jinlong Dang, Rohith Karur, Srimoyee Sen

AI总结 本文提出在平板几何的2n+1维欧几里得格点上,利用畴壁费米子构造手征规范理论,通过梯度流将规范场延伸到额外维度以解耦反壁上的镜像费米子,并实现了EOM流,在格点上展示了流守恒和反常流入机制。

Comments minor typo corrections, references updated

详情
AI中文摘要

使用平板几何的$2n+1$维欧几里得格点上的畴壁费米子来表述手征规范理论的提议,涉及位于其中一个畴壁上的$2n$维动力学规范场。通过梯度流将规范场延伸到额外维度,从而解耦反壁上的镜像费米子。我们在存在$2n$维背景规范场的情况下,在$n=1$的格点上实现了这一构造。我们还制定并实现了另一种规范场流方案,其中规范场在远离畴壁处满足$2n+1$维运动方程,称为EOM(运动方程)流。在这两种情况下,我们将规范场与费米子耦合,并在格点上演示了流守恒和反常流入如何工作。

英文摘要

The proposal to formulate chiral gauge theories using domain wall fermions on a $2n+1$ dimensional Euclidean lattice with a slab geometry involves $2n$ dimensional dynamical gauge fields residing on one of the domain walls. The gauge fields are extended into the extra dimension using gradient flow, which decouples the mirror fermions on the anti-wall. We implement this construction on the lattice for $n=1$ in the presence of $2n$ dimensional background gauge fields. We also formulate and implement an additional gauge field flow proposal, known as the EOM (equation of motion) flow, in which the gauge fields satisfy the $2n+1$ dimensional equation of motion away from the domain wall. In both cases, we couple the gauge fields to fermions and demonstrate how current conservation and anomaly inflow work on the lattice.

2606.05305 2026-06-19 hep-lat hep-ph hep-th nucl-th 版本更新

Gauge field flow for chiral gauge theories on a disk boundary

圆盘边界上手征规范理论的规范场流

Jinlong Dang, Rohith Karur, Srimoyee Sen

AI总结 本文提出在方形格点嵌入圆盘上实现运动方程流的具体方案,并演示格点上异常流入与异常抵消机制。

Comments minor typos corrected and references updated

详情
AI中文摘要

最近一种$2n$维手征规范理论的非微扰表述依赖于在$2n+1$维圆盘流形的$2n$维边界上实现手征费米子。该表述还要求使用某种保持$2n$维规范不变性的流方案将边界规范构型扩展到圆盘内部。本文提出了在方形格点嵌入圆盘上运动方程流的具体实现。此外,我们将流规范场与费米子耦合,并在格点上演示了异常流入和异常抵消机制的作用。

英文摘要

A recent non-perturbative formulation of $2n$ dimensional chiral gauge theories relies on realizing chiral fermions on the $2n$ dimensional boundary of a $2n+1$ dimensional disk manifold. It also requires extending boundary gauge configurations into the interior of the disk using some flow prescription that preserves $2n$ dimensional gauge invariance. In this paper we propose a concrete realization of the equation of motion flow with the disk embedded on a square lattice. In addition, we couple the flow gauge field to fermions and demonstrate the mechanism of anomaly inflow and anomaly cancellation at work on the lattice.

2606.05017 2026-06-19 cs.AR cs.MS 版本更新

GoldenFloat: A Phi-Derived Static-Split Floating-Point Family from GF4 to GF256 with a Lucas-Exact Integer Identity

GoldenFloat: 从GF4到GF256的基于Phi的静态拆分浮点系列及其Lucas精确整数恒等式

Dmitrii Vasilev

AI总结 提出一种由单一闭式规则生成的静态拆分浮点系列GoldenFloat,并给出多宽度RTL生成器、Lucas精确累加器路径和FPGA编解码器三个具体实现。

Comments 20 pages, single-file LaTeX, ASCII source. v2: peer-anchor updates. Adds Sarnoff P3109 (arXiv:2606.04028), AMD MXFP4 silicon (arXiv:2605.09825), NVIDIA GB10 NVFP4 measurement, companion catalog (arXiv:2606.09686), MixFP4 (arXiv:2605.31035). FL-002 expanded: (c1) GF256 bias, (c2) count drift, (g) static-split vs micro-mixing. TTSKY26a regeneration timeline added. No mathematical claims revised

详情
AI中文摘要

我们提出一种面向硬件的GoldenFloat(GF)描述,这是一个由单一闭式规则生成的静态拆分浮点系列,以及三个具体成果:(i)一个开放的多宽度RTL生成器,覆盖GF4-GF256,并带有针对正确舍入参考的连续积分差分扫描;(ii)一个整数支持的Lucas精确累加器路径,在n=1,...,256时以500位精度验证;(iii)一个GF16 FPGA编解码器,在Artix-7(Xilinx XC7A35T)上以323 MHz通过35/35测试台。对于每个总宽度N>=4,指数宽度e=round((N-1)/phi^2),其中小数部分f=N-1-e,phi=(1+sqrt(5))/2。该规则复现了九种格式(9/9)的已实现指数宽度,并一致扩展到GF128、GF512、GF1024。该规则与posit、takum、OCP-MX以及IEEE P3109多宽度浮点草案并列。我们不对其中任何一种提出每级精度或优越性声明。广度/工具链一致性框架被记录为一个开放猜想,并带有预注册的证伪路径。证伪分类账(FL-002)记录了开放问题及解决它们的实验。报告了日期为2026-05-31的RTL正确性勘误;制造的TTSKY26b芯片带有缺陷的乘法器组合,修正后的生成器是再生基线。

英文摘要

We present a hardware-oriented description of GoldenFloat (GF), a static-split floating-point family generated by a single closed rule, and three concrete artefacts: (i) an open multi-width RTL generator covering GF4-GF256 with a continuous-integration differential sweep against a correctly-rounded reference; (ii) an integer-backed Lucas-exact accumulator path verified at 500-digit precision for n = 1, ..., 256; and (iii) a GF16 FPGA codec passing a 35-of-35 testbench at 323 MHz on Artix-7 (Xilinx XC7A35T). A format-conformance oracle (Corona) ships in the same repository and is used as the blackbox check in our continuous-integration audit. The rule and its scope. For each total width N >= 4, the exponent width is e = round((N-1)/phi^2) with fraction f = N-1-e and phi = (1+sqrt(5))/2. The rule reproduces the realised exponent widths of nine formats GF4, GF8, GF12, GF16, GF20, GF24, GF32, GF64, GF256 (9/9) and extends consistently to GF128, GF512, GF1024. The rule is positioned alongside posit (2022 Posit Standard), takum (Hunhold 2024, 2025), OCP-MX (Rouhani et al. 2023), and the IEEE P3109 multi-width float draft, all of which are width-spanning families under a parameterised rule. We make no per-rung accuracy or superiority claim against any of them. What is open. The breadth/toolchain-coherence framing is recorded as an open conjecture with a pre-registered falsification path: a matched-substrate FPGA experiment and a matched-budget software ablation. A falsification ledger (FL-002) records the open questions and the experiments that would settle them. An RTL-correctness erratum dated 2026-05-31 is reported in Section 5.5; the fabricated TTSKY26b dies carry the defective multiplier portfolio, and the corrected generator is the regeneration baseline.
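The width rule stated in the abstract, e = round((N-1)/phi^2) with f = N-1-e, can be evaluated directly. The following is a minimal sketch, not code from the GoldenFloat repository: the function name `gf_split` is hypothetical, Python's built-in `round` is assumed to agree with the rounding the rule intends (exact ties cannot occur because phi^2 is irrational), and treating the remaining bit as a sign bit is an assumption.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, as defined in the abstract


def gf_split(total_width: int) -> tuple[int, int]:
    """Return (exponent_bits, fraction_bits) for a total width N >= 4.

    Implements e = round((N - 1) / phi^2) and f = N - 1 - e.
    The one remaining bit is assumed here to be the sign bit.
    """
    if total_width < 4:
        raise ValueError("the rule is stated for N >= 4")
    e = round((total_width - 1) / PHI**2)
    f = total_width - 1 - e
    return e, f


if __name__ == "__main__":
    # Widths listed in the abstract as the nine realised formats.
    for n in (4, 8, 12, 16, 20, 24, 32, 64, 256):
        e, f = gf_split(n)
        print(f"GF{n}: e={e}, f={f}")
```

Double precision is ample for the widths discussed here, since the quotient never lands close enough to a half-integer for rounding error to matter.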

2606.04969 2026-06-19 quant-ph 版本更新

Quantifying Entanglement via Quantum Wasserstein Distances

通过量子Wasserstein距离量化纠缠

Enmin Shao, Lin Chen, Huixia He

AI总结 提出一种基于阶1量子Wasserstein距离的二分纠缠度量E(ρ),满足所有基本公理,并通过Lipschitz对偶形式给出下界、两比特系统常数及与纠缠见证的定量联系。

详情
AI中文摘要

我们提出一种二分纠缠度量$E(ρ)$,定义为从状态到可分状态集合的最小阶1量子Wasserstein距离。由于Wasserstein度量的通用数据处理不等式,该度量在单一几何框架内满足所有基本公理。Lipschitz对偶形式给出了纯态和混合态的显式下界、两比特系统的尖锐常数以及Haar随机纯态的期望值。我们进一步建立了与纠缠见证的定量联系:任何负的见证期望值都保证了$E$的下界,并且对偶变分界恰好是Lipschitz-1见证所能达到的最大违背。该方法自然地提供了次可加性、迹距离估计以及局域可观测量上的界,同时指向大偏差猜想。这项工作为纠缠理论、最优输运和实验纠缠检测的交叉领域提供了一个通用范式。

英文摘要

We propose a bipartite entanglement measure defined as the minimal order-1 quantum Wasserstein distance from a state to the set of separable states. Owing to the universal data-processing inequality of the Wasserstein metric, the measure satisfies all fundamental axioms within a single geometric framework. A Lipschitz dual formulation yields explicit lower bounds for pure and mixed states, a sharp constant for two-qubit systems, and an expected value for Haar-random pure states. We further establish a quantitative connection to entanglement witnesses: any negative witness expectation value certifies a lower bound, and the dual variational bound is exactly the maximal violation achievable by a Lipschitz-1 witness. The approach naturally provides subadditivity, trace-distance estimates, and bounds on local observables, while pointing toward large-deviation conjectures. This work introduces a framework at the interface of entanglement theory, optimal transport, and experimental entanglement detection.

2606.04742 2026-06-19 cond-mat.supr-con cond-mat.mtrl-sci 版本更新

Nodal superconductivity with spin-triplet component in a noncentrosymmetric weakly-correlated metal

非中心对称弱关联金属中具有自旋三重态分量的节点超导电性

Marcel Strohmeier, Andriy Smolyanyuk, Karsten Held, Michael Smidman, Geetha Balakrishnan, Wolfgang Belzig, Elke Scheer, Angelo Di Bernardo

AI总结 通过低温扫描隧道谱和对称性约束模型,在非中心对称弱关联金属Nb18Re82中证实了反演不对称自旋轨道耦合足以产生可观的自旋三重态分量,混合宇称序参量中三重态振幅可达单重态的一半。

详情
AI中文摘要

在常规超导体中,库珀对形成于偶宇称自旋单态。缺乏反演对称性的非中心对称超导体表现出反对称自旋轨道耦合(ASOC),可将偶宇称自旋单态和奇宇称自旋三重态对组合成混合宇称序参量。自旋三重态分量对超自旋电子器件非常有利。仅凭ASOC(无需强电子关联)是否足以产生可测量的三重态分量仍是一个核心开放问题。本文在弱关联非中心对称金属Nb$_{18}$Re$_{82}$(Nb-Re)中解决了这一问题,其超导配对对称性一直存在争议。通过对四种不同晶体学取向的单晶进行低温扫描隧道谱测量,发现局域态密度中存在显著的取向依赖性各向异性。在对称性约束模型的支持下,我们表明完整的隧穿谱需要混合宇称序参量,其中三重态振幅可达单重态分量的一半。这些结果调和了文献中关于Nb-Re的矛盾报道,并证明即使没有强电子关联,ASOC也足以产生可观的自旋三重态分量,表明混合宇称超导态可能比先前假设的更普遍。由于Nb-Re易于制备成薄膜形式,这些发现将其定位为超自旋电子器件的可及平台,并确立了取向分辨隧穿谱作为检测混合宇称序参量的通用方案。

英文摘要

The most compelling evidence for spin-triplet superconductivity has emerged from strongly correlated electron systems, yet whether a substantial spin-triplet component can be realized without strong electronic coupling, by virtue of antisymmetric spin-orbit coupling (ASOC), remains unresolved. We address this question in the weakly-correlated noncentrosymmetric superconductor Nb$_{18}$Re$_{82}$ using low-temperature scanning tunneling spectroscopy on single crystals with different crystallographic orientations. The tunneling spectra exhibit orientation-dependent variations. A symmetry-constrained analysis shows that understanding the complete spectroscopic dataset requires a superconducting order parameter combining a nodal spin-singlet component with a spin-triplet contribution reaching up to half of the singlet amplitude. These results resolve the debated pairing symmetry of Nb$_{18}$Re$_{82}$ and demonstrate that ASOC alone can generate substantial parity mixing, suggesting that triplet superconductivity may be more widespread than previously recognized.

2606.03367 2026-06-19 cs.IR 版本更新

Automating Information Extraction and Retrieval for Industrial Spare Parts Pooling

自动化信息提取与检索用于工业备件池化

Dyuman Bulloni, Rocco Felici, Oliver Avram, Anna Valente

AI总结 提出PhRAG混合检索增强生成框架,通过命名实体识别结构化异构备件描述并构建虚拟库存池,结合生成式语言模型处理数据稀缺和查询变异性,实现可解释的备件检索。

详情
AI中文摘要

制造业的维护组织试图通过重用现有资产来避免停机和不必要的采购,但主要障碍不是缺乏零件,而是缺乏跨站点和合作伙伴的可操作可见性。库存分布广泛,描述命名约定不一致,包含重复和部分指定的引用,因此正确的零件通常存在于某处,但实际无法发现。本文提出PhRAG,一种混合检索增强生成方法,将这种碎片化景观池化为一个虚拟库存池(VSPool),可以作为一个单一资源进行结构化和搜索。非结构化的异构备件描述通过命名实体识别(NER)结构化到一个共享的虚拟池数据集中,并进行索引以支持稳健的检索,即使用户以自然语言而非精确技术规格表达需求。所提出的模块化流水线利用生成语言模型的多任务特性,覆盖了使工业备件池化具有挑战性的两个维度:(i)来自不同数据源(例如新合作伙伴、目录、市场列表)的非结构化技术规格通过离线提取处理;(ii)运行时的请求变异性(引用、部分引用、规格、价格/条件约束)通过基于混合RAG的搜索引擎处理,该引擎能够检索相关组件并证明结果。该框架展示了在技术规格提取数据稀缺情况下,生成方法相比传统NER方法的潜力,并通过为检索到的组件生成理由,克服了标准信息检索系统的不透明性。该项目的代码已开源。

英文摘要

Maintenance organizations in manufacturing try to avoid downtime and unnecessary purchasing by reusing existing assets, but the main obstacle is not a lack of parts but a lack of actionable visibility across sites and partners. Inventories are distributed, described with inconsistent naming conventions, and contain duplicates and partially specified references, so the right part often exists somewhere but remains effectively undiscoverable. The paper proposes PhRAG, a hybrid Retrieval-Augmented Generation approach for pooling this fragmented landscape into a Virtual Stock Pool (VSPool) that can be structured and searched as a single resource. Heterogeneous spare part descriptions are structured via Named Entity Recognition (NER) into a shared virtual pool dataset and indexed to support robust retrieval even when users express needs in natural language rather than exact technical specifications. The proposed modular pipeline leverages the multitasking nature of generative language models to cover two dimensions that make industrial parts pooling challenging: (i) unstructured technical specifications from diverse data sources (e.g. new partners, catalogs, marketplace listings) are handled through offline extraction, and (ii) request variability at runtime (references, partial references, specifications, price/condition constraints) is handled through a hybrid RAG-based search engine capable of retrieving relevant components and justifying results. The framework demonstrates the potential of generative approaches compared with traditional NER approaches in the presence of data scarcity for technical specifications extraction and overcomes the opacity of standard information retrieval systems by generating justifications for retrieved components.

2606.03448 2026-06-19 hep-ph astro-ph.CO hep-th 版本更新

Gravitino Freeze-In Dark Matter with an Additional Scalar Field

具有额外标量场的引力微子冻结暗物质

Georgios Georgilas, Vassilis C. Spanos

AI总结 本文研究在非标准宇宙学场景中,额外标量场如何通过改变状态方程影响引力微子暗物质的丰度,从而允许或限制再加热温度。

Comments 25 pages, 5 figures. Comments and references added

详情
AI中文摘要

引力微子是冻结暗物质候选者的一个突出例子。其遗迹丰度取决于再加热温度和超对称破缺参数,即通用规范微子质量 $M_{1/2}$ 和引力微子质量 $m_{3/2}$。因此,与观测到的暗物质丰度一致的再加热温度存在最大值 $T_{\rm reh}^{\rm reak}$,该最大值随着 $M_{1/2}$ 的增加而减小。这种行为导致未来对撞机搜索中胶微子质量的下限与成功热轻子生成所需的高再加热温度之间存在紧张关系。在这项工作中,我们研究了一种非标准宇宙学场景,其中热浴由额外的标量场补充。我们表明,对于类似物质的状态方程,这种成分可以引起引力微子丰度的显著稀释,从而允许显著更大的再加热温度值。相反,对于类似动能主导的状态方程,引力微子丰度增强而非稀释,导致最大允许再加热温度降低。

英文摘要

The gravitino is a prominent example of a freeze-in dark matter candidate. Its relic abundance depends on the reheating temperature and on supersymmetry-breaking parameters, that is the universal gaugino mass, $M_{1/2}$, and the gravitino mass, $m_{3/2}$. As a consequence, the reheating temperature consistent with the observed dark matter abundance exhibits a maximum value, $T_{\rm reh}^{\rm reak}$, which decreases as $M_{1/2}$ increases. This behavior gives rise to a tension between prospective lower bounds on the gluino mass from future collider searches and the high reheating temperatures required for successful thermal leptogenesis. In this work, we investigate a nonstandard cosmological scenario in which the thermal bath is supplemented by an additional scalar field. We show that, for a matter-like equation of state, this component can induce a substantial dilution of the gravitino abundance, thereby allowing significantly larger values of the reheating temperature. In contrast, for a kination-like equation of state, the gravitino abundance is enhanced rather than diluted, leading to a reduction of the maximum allowed reheating temperature.

2606.01505 2026-06-19 math.OC 版本更新

Inexactly Smooth Performance Estimation and New Optimized Gradient Methods

非精确光滑性能估计与新的优化梯度方法

Aaron Zoll, Benjamin Grimmer

AI总结 针对非精确光滑凸函数类,提出插值定理并利用性能估计问题(PEP)分析一阶方法,进而设计出最优或最优已知的梯度方法。

Comments 29 pages, 3 figures

详情
AI中文摘要

我们考虑一类广义的“非精确光滑”凸函数,提供了一个通用模型,将$L$-光滑、$M$-Lipschitz和Hölder光滑函数及其任意组合作为特例。这类函数具有与光滑函数密切相关的微积分性质。我们的主要结果为非精确光滑函数提供了插值定理,这些定理在适度的通用常数范围内是必要且充分的。这使得通过求解凸性能估计问题(PEP)可以分析任何非精确光滑凸问题类的一阶方法。此外,这些结果使得Drori和Taylor的构造性算法设计方法得以扩展。由此,我们推导出针对$(β,0)$-Hölder光滑问题的精确极小极大最优方法,针对任何$(β,p)$-Hölder光滑凸最小化问题具有已知最佳收敛保证(常数范围内)的方法,以及针对任何非精确光滑凸问题的一种新的通用快速回溯方法。

英文摘要

We consider a general class of ``inexactly smooth'' convex functions, providing a universal model capturing as special cases $L$-smooth, $M$-Lipschitz, and Hölder smooth functions, and any combination thereof. Such functions possess a calculus closely following that of smooth functions. Our main results provide inexactly smooth functions with interpolation theorems that are necessary and sufficient up to modest universal constants. These enable analysis of first-order methods for any inexactly smooth convex problem class via solving convex Performance Estimation Problems (PEPs). Further, these enable the extension of Drori and Taylor's constructive approach to algorithm design. From this, we derive an exactly minimax optimal method for $(β,0)$-Hölder smooth problems, methods with the best-known convergence guarantees up to constants for any $(β,p)$-Hölder smooth convex minimization, and a new universal fast backtracking method for any inexactly smooth convex problem.