arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.19349 2026-06-19 cs.CL cs.AI New submission

Where to Place the Query? Unveiling and Mitigating Positional Bias in In-Context Learning for Diffusion LLMs via Decoding Dynamics

Zhengheng Li, Panrui Li, Xuyang Liu, Puzhi Xia

Affiliations: Southeast University

AI summary: This paper systematically analyzes how query position affects generation quality in diffusion large language models, finds that it matters as much as the semantic quality of the examples, and proposes Auto-ICL, a training-free adaptive routing strategy based on average confidence, to optimize query placement.

Comments: 9 figures, 4 tables

Abstract

While In-Context Learning (ICL) is extensively studied in Autoregressive (AR) LLMs, its mechanism within Diffusion Large Language Models (dLLMs) remains largely unexplored. Unlike AR models restricted by unidirectional causal masking, dLLMs intrinsically utilize bidirectional attention, offering extensive spatial flexibility for query placement. Unfortunately, current practices conventionally inherit AR-style trailing-query templates, often overlooking the structural paradigm shift. This paper presents a comprehensive analysis unveiling that query position is actually a first-order variable in dLLMs. Through empirical decoupling, we demonstrate that positional variance impacts generation quality on par with example semantic quality. Internally, this positional sensitivity stems from a spatial "Recency Effect" in attention flow and task-dependent shifts in decoding trajectories. To mitigate this instability without ground-truth labels, we reveal that traditional single-step confidence ($C_{decoded}$) fails in dLLMs. Instead, we propose Average Confidence ($\overline{C}$), a novel metric tracking the iterative decoding process. By establishing the foundational spatial ICL baselines, we introduce Auto-ICL, a training-free adaptive routing strategy that dynamically optimizes query placement, robustly approaching oracle performance across heterogeneous reasoning and perception tasks.
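The abstract does not define $\overline{C}$ precisely, so the following is only a minimal sketch of the general idea. It assumes that average confidence is the mean, over decoding steps, of the mean probability of the tokens committed at each step, and that routing simply picks the candidate placement with the highest value. The function names, the trace format, and the numbers are hypothetical.

```python
from statistics import mean

def average_confidence(step_confidences):
    """Mean over decoding steps of the per-step mean token confidence.

    step_confidences: list of lists; entry t holds the probabilities the
    model assigned to the tokens it committed at step t (assumed format).
    """
    per_step = [mean(step) for step in step_confidences if step]
    return mean(per_step)

def route_query_position(traces):
    """Return the candidate placement whose trace has the highest average confidence."""
    return max(traces, key=lambda name: average_confidence(traces[name]))

# Hypothetical decoding traces for two query placements.
traces = {
    "query_first": [[0.9, 0.8], [0.7], [0.6, 0.5]],
    "query_last": [[0.6, 0.5], [0.9], [0.95, 0.9]],
}
best = route_query_position(traces)
```

A single-step confidence would look at only one entry of each trace; averaging over all steps is what lets the score reflect the whole iterative decoding process.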

2606.19348 2026-06-19 cs.CL cs.AI New submission

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-AI, Anyi Xu, Bangcai Lin, Bing Xue, Bingxuan Wang, Bingzheng Xu, Bochao Wu, Bowei Zhang, Chaofan Lin, Chen Dong, Chenchen Ling, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chengyu Hou, Chenhao Xu, Chenze Shao, Chong Ruan, Conner Sun, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Donghao Li, Dongjie Ji, Erhang Li, Fang Wei, Fangyun Lin, Fangzhou Yuan, Feiyu Xia, Fucong Dai, Guangbo Hao, Guanting Chen, Guoai Cao, Guolai Meng, Guowei Li, Han Yu, Han Zhang, Hanwei Xu, Hao Li, Haofen Liang, Haoling Zhang, Haoming Luo, Haoran Wei, Haotian Yuan, Haowei Zhang, Haowen Luo, Haoyu Chen, Haozhe Ji, Hengqing Zhang, Honghui Ding, Hongxuan Tang, Huanqi Cao, Huazuo Gao, Hui Qu, Hui Zeng, J Yang, JQ Zhu, Jia Luo, Jia Song, Jia Yu, Jialiang Huang, Jialu Cai, Jian Liang, Jiangting Zhou, Jiasheng Ye, Jiashi Li, Jiaxin Xu, Jiewen Hu, Jieyu Yang, Jin Chen, Jin Yan, Jingchang Chen, Jingli Zhou, Jingting Xiang, Jingyang Yuan, Jingyuan Cheng, Jingzi Zhou, Jinhua Zhu, Jiping Yu, Joseph Sun, Jun Ran, Junguang Jiang, Junjie Qiu, Junlong Li, Junmin Zheng, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Kexing Zhou, Kezhao Huang, Kuai Yu, Lean Wang, Lecong Zhang, Lei Wang, Leyi Xia, Li Zhang, Liang Zhao, Lihua Guo, Lingxiao Luo, Linwang Ma, Linyan Zhu, Litong Wang, Liyu Cai, Liyue Zhang, Longhao Chen, MS Di, MY Xu, Max Mei, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Mingxu Zhou, Minmin Han, Ning Wang, Panpan Huang, Panpan Wang, Peixin Cong, Peiyi Wang, Peng Zhang, Qiancheng Wang, Qihao Zhu, Qingyang Li, Qinyu Chen, Qiushi Du, Qiwei Jiang, Rui Tian, Ruifan Xu, Ruijie Lu, Ruiling Xu, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, Runqian Chen, Runqiu Yin, Runxin Xu, Ruomeng Shen, Ruoyu Zhang, Ruyi Chen, SH Liu, Shanghao Lu, Shangmian Sun, Shangyan Zhou, Shanhuang Chen, Shaofei Cai, Shaoheng Nie, Shaoqing Wu, Shaoyuan Chen, Shengding Hu, Shengyu Liu, Shiqiang Hu, Shirong Ma, Shiyu Wang, Shuiping Yu, Shunfeng Zhou, Shuting Pan, Shuying Yu, Songyang Zhou, Tao Ni, 
Tao Yun, Tian Jin, Tian Pei, Tian Ye, Tianle Lin, Tianran Ji, Tianyi Cui, Tianyuan Yue, Tingting Yu, Tun Wang, W Zhang, WL Xiao, Wangding Zeng, Wei An, Weilin Zhao, Wen Liu, Wenfeng Liang, Wenjie Pang, Wenjing Luo, Wenjing Yao, Wenjun Gao, Wenkai Yang, Wenlve Huang, Wenqing Hou, Wentao Zhang, Wenting Ma, Xi Gao, Xiang He, Xiangwen Wang, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaokang Chen, Xiaokang Zhang, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xingchen Liu, Xingkai Yu, Xingyou Li, Xinyu Yang, Xinyu Zhang, Xu Chen, Xuanyu Wang, Xuecheng Su, Xueyin Chen, Xuheng Lin, Xuwei Fu, YC Yan, YQ Wang, YW Ma, Yanfeng Luo, Yang Zhang, Yanhong Xu, Yanru Ma, Yanwen Huang, Yao Li, Yao Li, Yao Xu, Yao Zhao, Yaofeng Sun, Yaohui Wang, Yi Qian, Yi Shao, Yi Yu, Yichao Zhang, Yifan Ding, Yifan Shi, Yijia Wu, Yiliang Xiong, Yiling Ma, Ying He, Ying Tang, Ying Zhou, Yingjia Luo, Yinmin Zhong, Yishi Piao, Yisong Wang, Yixiang Zhang, Yixiao Chen, Yixuan Tan, Yixuan Wei, Yiyang Ma, Yiyuan Liu, Yonglun Yang, Yongqiang Guo, Yongtong Wu, Yu Wu, YuKun Li, Yuan Cheng, Yuan Ou, Yuanfan Xu, Yuanhao Li, Yuduan Wang, Yuehan Yang, Yuer Xu, Yuhan Wu, Yuhao Meng, Yuheng Zou, Yukun Zha, Yunfan Xiong, Yupeng Chen, Yuping Lin, Yuqian Cao, Yuqian Wang, Yushun Zhang, Yuting Yan, Yutong Lin, Yuxian Gu, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuxuan Zhou, Yuyang Zhou, Yuzhen Huang, ZF Wu, Zehao Wang, Zehua Zhao, Zehui Ren, Zekai Zhang, Zhangli Sha, Zhe Fu, Zhe Ju, Zhean Xu, Zhenda Xie, Zhengyan Zhang, Zheren Gao, Zhewen Hao, Zhibin Gou, Zhicheng Ma, Zhigang Yan, Zhihong Shao, Zhixian Huang, Zhixuan Chen, Zhiyu Wu, Zhizhou Ren, Zhongyu Wu, Zhuoshu Li, Zhuping Zhang, Zian Xu, Zihao Wang, Zihua Qu, Zihui Gu, Zijia Zhu, Zilin Li, Zipeng Zhang, Ziwei Xie, Ziyi Gao, Ziyi Wan, Zizheng Pan, Zongqing Yao

Affiliations: DeepSeek-AI

AI summary: Introduces the DeepSeek-V4 series of MoE models, which use a hybrid attention architecture, Manifold-Constrained Hyper-Connections, and the Muon optimizer to achieve efficient inference over million-token contexts and outperform their predecessors on core tasks.

Abstract

We present a preview version of the DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models -- DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) -- both supporting a context length of one million tokens. The DeepSeek-V4 series incorporates several key upgrades in architecture and optimization: (1) a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to improve long-context efficiency; (2) Manifold-Constrained Hyper-Connections (mHC) that enhance conventional residual connections; and (3) the Muon optimizer for faster convergence and greater training stability. We pre-train both models on more than 32T diverse and high-quality tokens, followed by a comprehensive post-training pipeline that unlocks and further enhances their capabilities. DeepSeek-V4-Pro-Max, the maximum reasoning effort mode of DeepSeek-V4-Pro, redefines the state of the art for open models, outperforming its predecessors in core tasks. Meanwhile, the DeepSeek-V4 series is highly efficient in long-context scenarios. In the one-million-token context setting, DeepSeek-V4-Pro requires only 27% of the single-token inference FLOPs and 10% of the KV cache of DeepSeek-V3.2. This enables us to routinely support one-million-token contexts, thereby making long-horizon tasks and further test-time scaling more feasible. The model checkpoints are available at https://huggingface.co/collections/deepseek-ai/deepseek-v4.

2606.19347 2026-06-19 cs.CL cs.AI cs.PL New submission

How LLMs Fail and Generalize in RTL Coding for Hardware Design?

Guan-Ting Liu, Chao-Han Huck Yang, Chenhui Deng, Zhongzhi Yu, Brucek Khailany, Yu-Chiang Frank Wang

Affiliations: NVIDIA Research

AI summary: Proposes an error taxonomy grounded in problem solvability, showing that LLM RTL coding is bounded by pretraining knowledge, that alignment techniques merely teach models to compile, and that reasoning ability is the key bottleneck.

Comments: Preview, under submission for EMNLP 2026

Abstract

Translating sequential programming priors into the parallel temporal logic of hardware design remains a crucial bottleneck for large language models (LLMs). To investigate this, we introduce a new error taxonomy grounded in problem solvability, inspired by cognitive theory. Our taxonomy categorizes failures into syntactic, semantic, solvable functional, and unsolvable functional types. Evaluations reveal a strict empirical ceiling on the VerilogEval benchmark, as frontier models plateau at a 90.8% initial pass rate. These plateaus are defined by unsolvable functional errors, exposing persistent knowledge gaps immune to test-time compute scaling. Furthermore, we expose a striking surface convergence gap: optimization readily eliminates syntax errors but concurrently exacerbates deeper functional failures. Our findings demonstrate that alignment techniques merely teach models to compile. While repeated sampling strategies can patch solvable errors, register-transfer level (RTL) coding capacity remains strictly bounded by pretraining knowledge. Addressing challenges in the current LLM-based hardware generation pipeline requires more studies in model reasoning rather than alignment interventions.

2606.19346 2026-06-19 cs.CL cs.AI New submission

Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer

Ahmed Haj Ahmed, Ruochen Zhang, Alvin Grissom

Affiliations: Haverford College; Brown University

AI summary: Fine-tunes large language models and evaluates zero-shot reading comprehension on Semitic and non-Semitic languages, finding that cross-lingual transfer mainly improves task-format alignment rather than language-specific knowledge.

Abstract

We study cross-lingual transfer by fine-tuning seven large language models (4B--671B parameters) on Arabic and evaluating zero-shot reading comprehension on Semitic languages and non-Semitic controls. Across dense and Mixture-of-Experts architectures, we find no evidence of Semitic-specific transfer: models with weak baselines improve dramatically across all languages, while strong-baseline models show only marginal gains regardless of language family. A chain-of-thought ablation reinforces this finding -- the same models that benefit most from fine-tuning benefit equally from inference-time reasoning, suggesting both mechanisms address task-format alignment rather than cross-lingual knowledge transfer.

2606.19345 2026-06-19 cs.CL cs.AI New submission

Ensembles of Large Language Models for Identifying EQ-5D Studies in PubMed Based on Their Abstracts

Zhyar Rzgar K. Rostam, Márta Péntek, János Tibor Czere, Zsombor Zrubka, László Gulácsi, Gábor Kertész

Affiliations: Doctoral School of Applied Informatics and Applied Mathematics, Obuda University; John von Neumann Faculty of Informatics, Obuda University; Doctoral School of Innovation Management, Obuda University

AI summary: Proposes a multi-phase framework that ensembles LLMs such as Gemini and Gemma, using few-shot prompting, weighted ensembling, and a soft stacking meta-classifier, to automatically detect EQ-5D studies in PubMed; the weighted ensemble reaches an F1-score of 0.74.

Comments: 6 pages, 7 tables, 8 equations

Abstract

The rapid increase in scientific publications makes manual study screening in systematic literature reviews (SLRs) increasingly resource-consuming, inefficient, and inconsistent. Classifying studies that clearly report health-related quality-of-life results, such as EQ-5D data, requires a high level of clinical interpretation and poses challenges for human reviewers. This study investigates the use of Google's Gemini and Gemma large language models (LLMs) in automating EQ-5D detection in the PubMed biomedical database based only on published abstracts. A multi-phase framework is proposed that integrates few-shot prompting, weighted ensemble aggregation, and a soft stacking meta-classifier. Nine LLMs are evaluated on a dataset of PubMed studies manually labeled by two experts regarding EQ-5D reporting. The weighted ensemble of gemini-2.5-pro, gemma-3-12b, and gemma-3-27b obtained a 0.74 weighted F1-score and 0.74 accuracy, exceeding individually attained results. The ensembling of top-performing models improved the balance between precision and recall compared to individual models, while the soft stacking approach provided greater reliability and interpretability. Feature analysis shows that the probability results from the models are important in guiding the final predictions. The findings suggest that an ensemble-based LLM setup is a reliable and scalable approach for automating screening in biomedical research.
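To make the weighted-ensemble step concrete, here is a minimal sketch of weighted soft voting over per-model probabilities. The abstract does not say how the weights are derived, so the weights, the probabilities, and the 0.5 decision threshold below are all hypothetical placeholders, not values from the paper.

```python
def weighted_ensemble(probs, weights):
    """Weighted average of per-model probabilities that an abstract reports EQ-5D."""
    total = sum(weights[m] for m in probs)
    return sum(weights[m] * p for m, p in probs.items()) / total

# Hypothetical weights (for example, validation scores); not taken from the paper.
weights = {"gemini-2.5-pro": 0.72, "gemma-3-12b": 0.66, "gemma-3-27b": 0.70}
# Hypothetical per-model probabilities for a single abstract.
probs = {"gemini-2.5-pro": 0.80, "gemma-3-12b": 0.40, "gemma-3-27b": 0.65}

score = weighted_ensemble(probs, weights)
label = int(score >= 0.5)  # 1 = reports EQ-5D, under the assumed threshold
```

A soft stacking meta-classifier would replace the fixed weights with a model trained on the same per-model probabilities.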

2606.19344 2026-06-19 cs.CL cs.AI New submission

Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation

Matteo Pelossi, Rita Sevastjanova, Thilo Spinner, Mennatallah El-Assady

Affiliations: ETH Zurich

AI summary: Proposes TreeTracer, a tool that uses systematic perturbation analysis, syntax-aligned aggregation, and classification-aware node merging, and compares semantic contexts through Sankey diagrams to reveal hidden representational and syntactic biases in LLMs.

Comments: 14 pages

Abstract

Large Language Models (LLMs) exhibit representational and syntactic biases that are difficult to evaluate due to the stochastic nature of text generation. Standard auditing methods rely on a single output inspection or static automated metrics. These approaches obscure the underlying probability distributions and fail to capture biases hidden in lower-probability generation branches. This paper introduces TreeTracer, a visual analytics tool designed to evaluate LLM bias through aggregated comparison. Using a systematic perturbation analysis pipeline, the tool replaces ontology-defined terms in each input prompt, aggregates hundreds of stochastic generations into a syntax-aligned hierarchical structure, and then performs classification-aware node merging with an auxiliary language model. The resulting structure is visualized through a custom Sankey diagram. By juxtaposing two ontology-driven trees, the workspace enables direct comparison between semantic contexts and supports systematic bias detection. Because any visualization reflects only a subset of the model's learned behavior, the system further applies contrastive inference to compute and directly display counterfactual token probabilities across contexts, reducing the risk of misinterpreting the presence of bias. We validate the workspace through case studies comparing an unaligned baseline model GPT-2 XL against the constitutionally aligned Apertus models. The visual aggregation successfully exposes hidden representational harms, such as counterfactual pronoun suppression and conversational marginalization of individuals. A preliminary user study confirms that the aggregated comparative interface reduces cognitive load and effectively supports analysts in detecting systemic biases.

2606.19342 2026-06-19 cs.SE New submission

Supporting Design Decisions in Rule-Based Model Transformations

Dejan Stojimirovic, Sinisa Neskovic

AI summary: Proposes an approach that models design decisions explicitly and separates them from the rule-based transformation implementation, achieving flexibility and reusability through decision, binding, and configuration models, and establishes a formal framework that proves its properties.

Abstract

Model Driven Engineering relies on model transformations to automate the derivation of target models or source code from source models. However, the design decisions that govern how source elements are mapped to target artifacts typically remain embedded in the transformation source code, limiting flexibility, reusability, and traceability. This paper proposes an approach to explicitly model and manage design decisions in rule-based model transformations. Design decisions are separated from transformation implementation through three mechanisms. First, a decision model captures design decisions and their options independently of any source modeling language for a given transformation domain. Second, a binding model connects these decisions to specific metamodel concepts in a given source modeling language, enabling reuse of the same design knowledge across transformations from similar languages. Third, a configuration model records the specific option chosen for each applicable source model element, with defaults pre-selected automatically. During execution, variability points in transformation rules are resolved dynamically according to configured choices. A trace model records which rules and options were applied to produce each target element. We establish a formal mathematical framework that defines the core concepts of variability-based transformations and proves key properties of the resulting transformations. The formal concepts are realized in a practical architecture of four interconnected artifact models. We demonstrate practical feasibility by extending an existing embedded domain-specific language for model-driven engineering with variability support and illustrate the approach with a complete ER-to-Relational transformation example.
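The separation into decision, binding, and configuration models can be pictured with a small data-structure sketch. This is not the authors' embedded DSL; every class, field, decision name, and option below is hypothetical and chosen only to show how a variability point might be resolved, with a default used when no explicit choice is configured, and how a trace could record the result.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """A design decision and its options, independent of any source language."""
    name: str
    options: list
    default: str

@dataclass
class Binding:
    """Connects a decision to a concept of one source metamodel."""
    decision: Decision
    metaclass: str

@dataclass
class Configuration:
    """Records the option chosen per (source element, decision name)."""
    choices: dict = field(default_factory=dict)

    def resolve(self, element, binding):
        # Fall back to the decision's default when nothing was configured.
        return self.choices.get((element, binding.decision.name), binding.decision.default)

inheritance = Decision("inheritance-mapping", ["table-per-class", "single-table"], "table-per-class")
binding = Binding(inheritance, "EntityHierarchy")
config = Configuration({("Vehicle", "inheritance-mapping"): "single-table"})

# A minimal trace: which option was applied for each source element.
trace = [(element, config.resolve(element, binding)) for element in ["Vehicle", "Person"]]
```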

2606.20557 2026-06-19 cs.LG math.ST stat.ML stat.TH New submission

Optimal Deterministic Multicalibration and Omniprediction

Georgy Noarov, Aaron Roth

Affiliations: University of Pennsylvania

AI summary: Presents a deterministic algorithm that attains the minimax-optimal sample complexity for multicalibration and generalizes it to outcome indistinguishability, resolving the question of whether randomization is necessary.

Abstract

A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal $\widetilde O(\varepsilon^{-3})$ sample complexity rate for $\varepsilon$-multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and implicitly in several prior works. We resolve this open problem by giving a minimax-optimal multicalibration algorithm that outputs a deterministic predictor. We then generalize the algorithm to produce optimal deterministic predictors that satisfy outcome indistinguishability (OI) with respect to finite or finitely covered collections of tests. As an application, this also gives deterministic omnipredictors and panpredictors with optimal sample complexity, resolving open problems posed by [OKK25] and [BHHLZ25].
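The definition in the first sentence can be illustrated with an empirical check. The sketch below uses one common ECE-style formalization (for each group weight $g$, sum over prediction values $v$ the absolute average of $g(x)(y - v)$ on samples predicted $v$, then take the maximum over groups); the paper's exact error notion may differ, and the data are made up.

```python
from collections import defaultdict

def multicalibration_error(preds, labels, group_weights):
    """Max over groups of sum over prediction values v of
    |average of g(x) * (y - v) over samples with prediction v|."""
    n = len(preds)
    worst = 0.0
    for g in group_weights:  # g[i] is the weight g(x_i) in [0, 1]
        bias = defaultdict(float)
        for p, y, w in zip(preds, labels, g):
            bias[p] += w * (y - p) / n
        worst = max(worst, sum(abs(b) for b in bias.values()))
    return worst

preds = [0.5, 0.5, 0.5, 0.5]
labels = [1, 0, 1, 0]
everyone = [1, 1, 1, 1]
group_a = [1, 0, 1, 0]  # weights only the positive examples

err_overall = multicalibration_error(preds, labels, [everyone])
err_multi = multicalibration_error(preds, labels, [everyone, group_a])
```

The constant predictor is perfectly calibrated overall, yet it is biased after reweighting by `group_a`, which is exactly the gap multicalibration is meant to close.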

2606.20547 2026-06-19 cs.LG cs.CV cs.GR cs.RO math.DG New submission

The Token Is a Group Element: On Lie-Algebra Attention over Matrix Lie Groups

Przemyslaw Musialski

Affiliations: New Jersey Institute of Technology

AI summary: Proposes Lie-Algebra Attention, which defines tokens as elements of a matrix Lie group and uses the Lie-algebra norm of the relative pose as the attention score, requiring no learned kernel or representation-theoretic tools and applying to non-compact, non-abelian groups such as the affine full-frame groups.

Comments: preprint, 19 pages, 3 figures

Abstract

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $\rho(g)$ carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: their score is the closed-form algebra norm of the relative pose rather than a learned kernel, and it reaches the affine full-frame groups that every irrep- or surjective-exp-based method must exclude. We call it Lie-Algebra Attention. Once tokens are group elements, the rest follows with none of the usual representation-theoretic machinery. The relative geometry of a pair is canonical, $g_i^{-1} g_j$, so the pairwise invariant $w_{ij} = \log(g_i^{-1} g_j)$ is intrinsic rather than designed; equivariance under the diagonal $G$-action is tautological, and the cocycle condition holds automatically. The attention score is the negative squared algebra norm, $s_{ij} = -\|\log(g_i^{-1} g_j)\|_\lambda^2/\tau$: the canonical proximity kernel under a block-weighted Frobenius inner product, with no irreducible representations, spherical harmonics, Clebsch-Gordan products, or learned kernel. The construction applies to any matrix Lie group on a chosen logarithm chart containing the relative poses, including the non-compact non-abelian affine groups with scale and shear that no vector-token attention method reaches: neither the irrep tradition nor surjective-exp methods. Three sequence-completion experiments, on SE(2), SO(3), and Aff(2), bear this out: the closed-form score matches a learned MLP kernel on the same invariant and outperforms it on SE(2), using 50 to 80x fewer score parameters, while a vector-token baseline breaks invariance by five to twelve orders of magnitude.
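The score $s_{ij}$ can be worked through on SE(2), where the logarithm has a closed form. This is a sketch under stated assumptions rather than the paper's implementation: tokens are stored as (angle, x, y) triples, the block weights `lam_rot` and `lam_trans` are a guess at what the block-weighted Frobenius inner product means (one weight for the rotation block, one for the translation block), and all function names are hypothetical.

```python
import math

def compose(h, g):
    """Group product h * g for SE(2) elements stored as (theta, x, y)."""
    th, xh, yh = h
    tg, xg, yg = g
    c, s = math.cos(th), math.sin(th)
    return (th + tg, c * xg - s * yg + xh, s * xg + c * yg + yh)

def relative_pose(gi, gj):
    """g_i^{-1} g_j: angle difference and translation difference rotated by R(theta_i)^T."""
    ti, xi, yi = gi
    tj, xj, yj = gj
    dx, dy = xj - xi, yj - yi
    c, s = math.cos(ti), math.sin(ti)
    return (tj - ti, c * dx + s * dy, -s * dx + c * dy)

def log_se2(g):
    """Closed-form logarithm on SE(2): returns (theta, u1, u2) with V(theta) u = t."""
    theta, x, y = g
    theta = math.atan2(math.sin(theta), math.cos(theta))  # wrap the angle
    if abs(theta) < 1e-12:
        a, b = 1.0, 0.0
    else:
        a, b = math.sin(theta) / theta, (1.0 - math.cos(theta)) / theta
    det = a * a + b * b
    return theta, (a * x + b * y) / det, (-b * x + a * y) / det

def score(gi, gj, lam_rot=1.0, lam_trans=1.0, tau=1.0):
    """Negative squared, block-weighted Frobenius norm of log(g_i^{-1} g_j), over tau."""
    theta, u1, u2 = log_se2(relative_pose(gi, gj))
    sq_norm = lam_rot * 2.0 * theta ** 2 + lam_trans * (u1 ** 2 + u2 ** 2)
    return -sq_norm / tau

def attention_weights(tokens, i):
    """Softmax over the scores between token i and every token."""
    scores = [score(tokens[i], gj) for gj in tokens]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

tokens = [(0.3, 1.0, 2.0), (1.1, -0.5, 0.7), (-0.4, 0.2, 0.1)]
h = (2.0, 3.0, -1.0)
moved = [compose(h, g) for g in tokens]
weights = attention_weights(tokens, 0)
```

Because the score depends only on $g_i^{-1} g_j$, applying the same transformation `h` to every token leaves it unchanged, which is the invariance the abstract calls tautological.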

2606.20443 2026-06-19 eess.SY cs.LG cs.SY math.AT New submission

Topological Data Analysis for High-Dimensional Dynamic Process Monitoring

Angan Mukherjee, Tyler A. Soderstrom, Michael J. Kurtz, Victor M. Zavala

AI summary: Proposes a method combining topological data analysis and machine learning: multivariate time series are represented as manifolds, topological descriptors summarize their structure, and a neural ordinary differential equation learns the dynamic evolution of the topological structure, enabling effective event detection.

Abstract

Real-time process monitoring requires methods that extract actionable information from high-dimensional time-series data. In this work, we present a new approach for process monitoring that combines tools of topological data analysis (TDA) and machine learning. In the proposed approach, we represent multivariate time-series data as manifolds and use topological descriptors to summarize the structure of such data; we then use a neural ordinary differential equation to learn the dynamic evolution of the topological structure of the system. Using real data from an industrial process, we show that this trajectory-based event detection approach is effective at detecting diverse types of events. We contrast this approach against reconstruction-based approaches such as principal component analysis and autoencoders and against a trajectory-based approach that uses Koopman autoencoders.

2606.20442 2026-06-19 cs.LG cs.NA cs.NE math.NA New submission

Evolutionary Two-Stage Hyperparameter Optimization Strategies for Physics-Informed Neural Networks

Fedor Buzaev, Dmitry Efremenko, Egor Bugaev, Andrei Ermakov, Denis Derkach, Daria Pugacheva, Fedor Ratnikov

Affiliations: HSE University; AXXX

AI summary: To address the unstable training and hyperparameter sensitivity of physics-informed neural networks, proposes a two-stage optimization strategy based on evolutionary algorithms, with low-fidelity screening followed by full training, that significantly reduces error on three PDE problems.

Comments: Equal advising: Daria Pugacheva and Fedor Ratnikov. Accepted to the ICLR 2026 Workshop on AI and PDEs

Abstract

Physics-Informed Neural Networks (PINNs) solve Partial Differential Equations (PDEs) by embedding physical laws into neural network training. However, their performance suffers from unstable convergence, training plateaus, and strong sensitivity to architectural and optimization hyperparameters due to the highly non-convex and multi-term structure of the physics-informed loss. In this setting, the outer-loop hyperparameter search is a noisy and black-box optimization problem over heterogeneous parameters, where classical local or gradient-based strategies are easily trapped in suboptimal regions. Evolutionary algorithms, with their population-based exploration and ability to handle mixed, non-differentiable search spaces, provide a more robust mechanism for discovering promising configurations. We propose and investigate a two-stage approach based on evolutionary algorithms that combines exploration and exploitation parts of PINNs training to improve solution accuracy and robustness under fixed computational budgets. In the first stage, we perform low-fidelity training runs with truncated epochs to rapidly screen candidate configurations, treating hyperparameter selection as a black-box outer-loop problem. In the second stage, only the most promising candidates are fully trained with standard gradient-based optimizers to refine the solution. Evaluated on three popular problems, namely Advection, Klein-Gordon and Helmholtz equations, our method consistently outperforms standard training and achieves significantly lower mean error within constrained computational resources.
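The two stages described above can be sketched on a toy problem. Everything here is an illustrative assumption: `train` is a noise-free stand-in for PINN training with a single hyperparameter (the learning rate), the evolutionary loop is a plain keep-the-best-half-and-mutate scheme, and the epoch counts and population sizes are arbitrary.

```python
import math
import random

def train(config, epochs):
    """Toy stand-in for PINN training: returns a final loss (hypothetical)."""
    return (math.log10(config["lr"]) + 3.0) ** 2 + 1.0 / epochs

def mutate(config, rng):
    """Perturb the learning rate on a log scale."""
    return {"lr": config["lr"] * 10 ** rng.uniform(-0.5, 0.5)}

def two_stage_search(rng, pop_size=8, generations=5, short=5, full=200, top_k=2):
    population = [{"lr": 10 ** rng.uniform(-6, 0)} for _ in range(pop_size)]
    # Stage 1: evolutionary screening with truncated (low-fidelity) runs.
    for _ in range(generations):
        population.sort(key=lambda c: train(c, short))
        parents = population[: pop_size // 2]
        population = parents + [mutate(p, rng) for p in parents]
    # Stage 2: fully train only the most promising candidates.
    population.sort(key=lambda c: train(c, short))
    finalists = population[:top_k]
    return min(((train(c, full), c) for c in finalists), key=lambda t: t[0])

best_loss, best_config = two_stage_search(random.Random(0))
```

The budget saving comes from stage 1: every candidate is evaluated for only `short` epochs, and the expensive `full` runs are spent on `top_k` configurations.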

2606.20413 2026-06-19 eess.SP cs.IT math.IT New submission

Hybrid TRP-UE Sensing for Enhanced Target Localization

Necati Kagan Erkek, Marco Di Renzo, Arman Shojaeifard, Yasser Mestrah, Remun Koirala, Mohammad Heggo, Kunjan Shah

AI summary: Proposes a hybrid TRP-UE sensing mechanism that uses UE-assisted sensing to improve network sensing performance, significantly improving target localization accuracy in challenging propagation environments such as indoor factories.

Comments: 6 pages

Abstract

Integrated Sensing and Communication (ISAC) refers to the capability of the network to provide communication services whilst also sensing the environment in a scalable manner. One of the key functions of ISAC is the accurate localization of passive and mobile sensing targets. This paper introduces a novel hybrid TRP-UE sensing mechanism that improves network-based sensing performance. Evaluation results are provided using 3GPP-compliant ISAC channel models. The results demonstrate the significant benefit of complementing TRP-based sensing with UE-assisted sensing in challenging propagation environments such as indoor factories.

2606.20394 2026-06-19 cs.RO math.OC New submission

Agentic AutoResearch for Space Autonomy: An Auditable, LLM-Driven Research Agent for Aerospace Control Problems

Amit Jain, Richard Linares

Affiliations: Department of Aeronautics and Astronautics

AI summary: Proposes the AutoResearch framework, which uses a large language model as an offline research agent to iteratively develop spacecraft control policies and audits results through a built-in credibility layer that rules out seed noise; its effectiveness is validated on rendezvous and docking problems.

Abstract

Spacecraft guidance, navigation, and control functions are increasingly realized as learned policies distilled from expert solvers. Developing such a policy is itself a research process: an investigator selects an architecture and hyperparameters, runs experiments, and must determine whether an apparent improvement is genuine or merely seed noise. This paper presents AutoResearch, a framework in which a large language model autonomously drives that loop for aerospace control problems, coupled with a credibility layer, built into the loop, that certifies each reported result against the problem's own measured seed noise. The language model serves only as the offline research agent that develops the control policy; the trained policy it produces is then deployed onboard the spacecraft, while the model itself never operates the vehicle. At each iteration the agent reads a plain-language problem description and the run history, proposes a single edit to the training script, executes it, and logs the outcome. No reported result is credited until it passes the same three checks: measured per-problem seed noise, reseeded verification of the best configuration, and leave-one-out pruning of the agent's edits. The same loop is applied, unchanged, to two aerospace control problems: a Clohessy-Wiltshire relative rendezvous and a safety-constrained collision-avoidance docking past a keep-out zone, each calibrated against a known optimal control benchmark. In both, the audited policy clears the measured seed noise by many standard deviations; an undirected search over the same parameters does not. On the docking problem the gap becomes categorical: undirected search yields no feasible policy, while the learned policy stays outside the keep-out zone on every seed.
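The first two credibility checks (measured seed noise and reseeded verification) amount to asking whether an improvement exceeds the run-to-run spread. The sketch below is one simple way to express that; the threshold `k = 3`, the cost values, and the function name are assumptions, since the abstract only says the audited policy clears the noise by many standard deviations.

```python
from statistics import mean, stdev

def clears_seed_noise(baseline_runs, candidate_runs, k=3.0):
    """Credit an improvement only if the candidate's mean cost beats the baseline
    mean by more than k standard deviations of the baseline's seed-to-seed noise."""
    noise = stdev(baseline_runs)
    gain = mean(baseline_runs) - mean(candidate_runs)  # lower cost is better
    return gain > k * noise

# Hypothetical final costs of the same baseline configuration under five seeds.
baseline = [1.00, 1.04, 0.97, 1.02, 0.99]
# Hypothetical reseeded runs of two candidate configurations.
marginal = [0.98, 1.01, 0.97]
strong = [0.70, 0.72, 0.69]

credited_marginal = clears_seed_noise(baseline, marginal)
credited_strong = clears_seed_noise(baseline, strong)
```

Leave-one-out pruning, the third check, would rerun the best configuration with each of the agent's edits removed in turn and keep only the edits whose removal fails this same test.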

2606.20325 2026-06-19 cs.LG cs.SC math.DS 新提交

Recurrent neural networks approximate continuous functions

循环神经网络逼近连续函数

Valentin Abadie, Clemens Hutter, Helmut Bölcskei

AI总结 本文证明,对于[-1,1]上的任意连续函数,存在一个权重和隐藏维度均固定的ReLU循环神经网络,其时间演化可以一致逼近该函数,并给出了收敛速率和极小极大下界。

详情
AI中文摘要

经典逼近定理在目标精度每次提高时都要求换用一个新的神经网络。本文研究相反的可能性:能否一劳永逸地选定网络,而仅通过让其运行更长时间来换取精度?我们证明这对于[-1,1]上的每个连续函数都是可能的。更准确地说,每个这样的函数都可以由单个具有固定权重和固定隐藏维度的ReLU循环神经网络的时间演化一致逼近。该构造背后的机制是一个新的中间模型:带神经单元的图灵机(TMNU)。该模型保留了实现多项式逼近方案所需的算法自由度,同时保持足够的刚性,以便被具有显式隐藏维度和权重幅度界限的RNN模拟。由此产生的收敛速率反映了底层多项式逼近的速率。我们用极小极大下界补充了该构造,表明运行时间不仅仅是证明的产物,而是这种固定网络逼近范式中不可避免的资源。

英文摘要

Classical approximation theorems ask for a new neural network whenever the target accuracy is improved. This paper studies the opposite possibility: can the network be chosen once and for all, and can accuracy be bought only by letting it run longer? We prove that this is possible for every continuous function on [-1,1]. More precisely, each such function is uniformly approximated by the time evolution of a single ReLU recurrent neural network with fixed weights and fixed hidden dimension. The mechanism behind the construction is a new intermediate model, the Turing machine with neural units (TMNU). This model retains the algorithmic freedom needed to implement polynomial approximation schemes, while remaining rigid enough to be simulated by RNNs with explicit bounds on hidden dimension and weight magnitude. The resulting convergence rates reflect the underlying polynomial approximation rates. We complement the construction with minimax lower bounds showing that runtime is not merely a proof artifact, but an unavoidable resource in this fixed-network approximation paradigm.

2606.20195 2026-06-19 cs.PF cs.NA math.NA 新提交

Randomized Sketching is Robust to Low-Precision Rounding on GPUs

随机草图对GPU低精度舍入具有鲁棒性

Aryaman Jeendgar, Clément Flint, Hartwig Anzt

AI总结 研究随机草图在GPU低精度下的性能与精度,提出SparseStack改进CountSketch,发现FP16舍入方式对嵌入质量影响小,分布比量化更关键。

Comments 14 pages, 3 figures

详情
AI中文摘要

随机草图是随机数值线性代数中的核心原语。在现代硬件架构上,特别是在GPU上,稀疏草图的性能受限于内存流量和原子累加,而非浮点吞吐量。这使得草图成为混合精度的自然目标,前提是低精度累加不会降低嵌入质量。我们研究了稀疏子空间嵌入的混合精度GPU实现,重点关注Higgins等人提出的GPU CountSketch内核的SparseStack泛化。SparseStack在相干输入上相对于CountSketch提高了嵌入质量,但其每列额外的非零元素增加了原子更新争用并降低了吞吐量。因此,我们实现了使用确定性舍入到最近、精确随机舍入和抖动舍入的FP16 SparseStack变体,并将它们与FP32 SparseStack、CountSketch、混合精度CountSketch和FlashSketch进行比较。我们的主要实证发现是,在测试的范围内,SparseStack嵌入质量对FP16舍入规则不敏感。确定性、随机和抖动舍入的FP16 SparseStack在不相干、相干和对抗性测试问题上产生几乎相同的子空间失真和草图求解最小二乘精度。主导精度因素是草图分布而非量化规则:SparseStack变体在相干输入上显著改善失真,而所有方法在不相干输入上表现相似。由于确定性舍入的开销最低,它在FP16 SparseStack变体中提供了最佳的性能-精度权衡。

英文摘要

Randomized sketching is a core primitive in randomized numerical linear algebra. On modern hardware architectures, in particular on GPUs, the performance of sparse sketches is limited by memory traffic and atomic accumulation rather than floating-point throughput. This makes sketching a natural target for mixed precision, provided that low-precision accumulation does not degrade the embedding quality. We study mixed-precision GPU implementations of sparse oblivious subspace embeddings, focusing on a SparseStack generalization of the GPU CountSketch kernel of Higgins et al. SparseStack improves embedding quality relative to CountSketch on coherent inputs, but its additional nonzeros per column increase atomic-update contention and reduce throughput. We therefore implement FP16 SparseStack variants using deterministic round-to-nearest, exact stochastic rounding, and dithered rounding, and compare them with FP32 SparseStack, CountSketch, mixed-precision CountSketch, and FlashSketch. Our main empirical finding is that, for the tested regimes, SparseStack embedding quality is insensitive to the FP16 rounding rule. Deterministic, stochastic, and dithered rounding FP16 SparseStack produce nearly identical subspace distortion and sketch-and-solve least-squares accuracy across incoherent, coherent, and adversarial test problems. The dominant accuracy factor is the sketch distribution rather than the quantization rule: SparseStack variants substantially improve distortion on coherent inputs, while all methods behave similarly on incoherent inputs. Since deterministic rounding has the lowest overhead, it provides the best performance--accuracy tradeoff among the FP16 SparseStack variants.
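
As a rough illustration of the sketches being compared, the following pure-Python snippet applies a CountSketch-style map and a stacked variant to a vector. It assumes that SparseStack stacks `k` independent CountSketch blocks, each scaled by `1/sqrt(k)`; that reading, the function name `sparse_stack_sketch`, and all parameters are assumptions for illustration and not the paper's GPU kernel. Rounding to FP16 is not modeled here.

```python
import math
import random

def sparse_stack_sketch(x, m, k, seed=0):
    """Sketch vector x down to length m using k stacked CountSketch blocks.
    Each coordinate is added to one random bucket per block with a random
    sign and scale 1/sqrt(k); k = 1 is plain CountSketch."""
    assert m % k == 0, "m must be divisible by the number of blocks"
    rng = random.Random(seed)
    block = m // k
    scale = 1.0 / math.sqrt(k)
    y = [0.0] * m
    for xj in x:
        for b in range(k):
            bucket = b * block + rng.randrange(block)
            sign = rng.choice((-1.0, 1.0))
            y[bucket] += sign * scale * xj   # on a GPU this is the atomic add
    return y

def norm(v):
    return math.sqrt(sum(t * t for t in v))

# A one-hot vector is a maximally coherent input.
x = [0.0] * 1000
x[123] = 2.5
y1 = sparse_stack_sketch(x, m=64, k=1)   # one nonzero per column
y4 = sparse_stack_sketch(x, m=64, k=4)   # four nonzeros per column
```

The inner loop makes the trade-off in the abstract visible: each extra nonzero per column is one more accumulation into shared memory, which is where atomic contention arises on a GPU.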

2606.20162 2026-06-19 cs.AI cs.IT cs.NI math.IT 新提交

Implicit Semantic-Aware Communication Based on Hypergraph Reasoning

基于超图推理的隐式语义感知通信

Yiwei Liao, Shurui Tu, Yong Xiao, Yingyu Li, Guangming Shi

发表机构 * China Electric Power Research Institute Co., Ltd(中国电力科学研究院有限公司) National Key Laboratory for Power Grid Environmental Protection(电网环境保护国家重点实验室) School of Electronic Information and Communications, Huazhong University of Science and Technology(华中科技大学电子信息与通信学院) Peng Cheng Laboratory(鹏城实验室) Pazhou Laboratory (Huangpu)(琶洲实验室(黄埔)) School of Mechanical Engineering and Electronic Information, China University of Geosciences(中国地质大学机械与电子信息学院)

AI总结 提出基于超图的隐式语义推理框架HISR,通过超图建模多实体高阶关系,在噪声信道下提升语义推理鲁棒性,准确率提升36.6%。

Comments This work is accepted at IEEE Transactions on Communications

详情
AI中文摘要

语义感知通信已成为下一代通信系统的变革性范式,将基本目标从传输比特级符号转变为可靠恢复和理解信息的语义含义。先前研究表明,将源消息的语义内容表示为基于图的结构可以显著提高通信效率和接收端语义推理的准确性。然而,现有解决方案通常采用仅捕获成对关系的图,从而忽略了现实场景中常见的高阶隐式相关性,例如群体交互、多实体关联和复杂关系上下文。这种限制降低了语义表达能力,并使语义推理容易受到歧义和性能下降的影响,尤其是在噪声或损坏的信道条件下。为了解决这些问题,本文提出了一种新颖的基于超图的隐式语义推理框架HISR,该框架利用超图表示语义知识实体之间的复杂多实体关系。在HISR中,实体及其关联的高阶关系被映射到针对不同关系上下文定制的专用语义子空间中。这种设计不仅解耦了多样的语义交互以减轻传统图嵌入方法中常见的过平滑效应,而且即使在传输过程中发生部分信息丢失时也能实现鲁棒的语义推理。数值结果表明,所提出的HISR在隐式语义解释准确率上比最先进的基准提高了36.6%。

英文摘要

Semantic-aware communication has emerged as a transformative paradigm for next-generation communication systems, shifting the fundamental goal from transmitting bit-level symbols to reliably recovering and understanding the semantic meaning of information. Previous studies have demonstrated that representing the semantic content of source messages as graph-based structures can significantly improve communication efficiency and the accuracy of semantic inference at the receiver. However, existing solutions typically employ graphs that capture only pairwise relationships, thereby neglecting higher-order implicit correlations commonly observed in real-world scenarios, such as group interactions, multi-entity associations, and complex relational contexts. This limitation reduces semantic expressiveness and makes semantic inference susceptible to ambiguity and performance degradation, particularly under noisy or corrupted channel conditions. To address these issues, this paper proposes a novel hypergraph-based implicit semantic reasoning framework, HISR, which leverages hypergraphs to represent complex multi-entity relationships among semantic knowledge entities. In HISR, entities and their associated higher-order relations are mapped into dedicated semantic subspaces tailored to distinct relational contexts. This design not only disentangles diverse semantic interactions to mitigate the over-smoothing effects commonly found in traditional graph embedding methods but also enables robust semantic inference even when partial information loss occurs during transmission. Numerical results show that the proposed HISR achieves up to a 36.6% improvement in implicit semantic interpretation accuracy over the state-of-the-art benchmarks.

2606.20022 2026-06-19 stat.ML cs.LG math.OC 新提交

Stochastic Linear Contextual Bandits with Bounded Noise: A Set-Membership Approach

具有有界噪声的随机线性上下文赌博机:一种集合成员方法

Haonan Xu, Yingying Li

AI总结 针对有界奖励噪声的随机线性上下文赌博机,提出基于集合成员估计和乐观原则的SME-OFU算法,实现O(log T)的遗憾界,优于次高斯噪声下的最优界。

Comments 23 pages, 1 figure

详情
AI中文摘要

本文考虑具有有界奖励噪声的随机线性上下文赌博机(SLCB)。现有工作通常假设次高斯奖励噪声和有界期望奖励,在此条件下最优遗憾界关于时间T为$\tilde{O}(\sqrt{T})$。然而,在许多应用中,实现/观测到的奖励也自然有界,这意味着奖励噪声有界。有界噪声比次高斯条件更具信息性,但在SLCB文献中尚未被明确利用。本文通过利用一种称为集合成员估计(SME)的不确定性量化方法,并应用面对不确定性的乐观原则(OFU),提出了一种新颖的算法SME-OFU。我们的算法享有改进的遗憾界$O(\log T)$。注意,这并不与次高斯噪声下现有的最优界$\tilde{O}(\sqrt{T})$矛盾,因为有界噪声是更强的条件。最后,仿真表明,当奖励噪声有界时,SME-OFU相对于为次高斯噪声设计的基准算法在经验上有所改进。

英文摘要

This paper considers stochastic linear contextual bandits (SLCB) with bounded reward noise. Existing works typically assume sub-Gaussian reward noise and bounded expected rewards, under which the optimal regret bound scales as $\tilde{O}(\sqrt{T})$ in terms of horizon $T$. However, in many applications, realized/observed rewards are also naturally bounded, implying bounded reward noise. Bounded noise is more informative than the sub-Gaussian condition but has not been leveraged explicitly in the SLCB literature. In this paper, we propose a novel algorithm SME-OFU by utilizing an uncertainty quantification method called set-membership estimation (SME) and applying the principle of optimism in the face of uncertainty (OFU). Our algorithm enjoys an improved regret bound $O(\log T)$. Notice that this does not contradict the existing optimal bound $\tilde{O}(\sqrt{T})$ for sub-Gaussian noise because bounded noise is a stronger condition. Finally, simulations show empirical improvements of SME-OFU over a benchmark algorithm designed for sub-Gaussian noise when the reward noise is bounded.
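
To see why bounded noise is more informative, consider a one-dimensional sketch of set-membership estimation: every observation with $|w_t| \le W$ cuts the feasible parameters down to an interval, and the intersection of these intervals is guaranteed to contain the true parameter. The scalar model, the name `sme_interval`, and the numbers below are illustrative assumptions; this is not the paper's SME-OFU algorithm.

```python
import random

def sme_interval(contexts, rewards, noise_bound):
    """Intersect the constraints |r - theta * x| <= noise_bound (for x > 0)
    and return the interval of parameters consistent with all the data."""
    lo, hi = float("-inf"), float("inf")
    for x, r in zip(contexts, rewards):
        lo = max(lo, (r - noise_bound) / x)
        hi = min(hi, (r + noise_bound) / x)
    return lo, hi

# Hypothetical scalar problem: r_t = theta * x_t + w_t with |w_t| <= W.
rng = random.Random(0)
theta_true, W = 0.7, 0.1
contexts = [rng.uniform(0.5, 1.0) for _ in range(200)]
rewards = [theta_true * x + rng.uniform(-W, W) for x in contexts]

lo, hi = sme_interval(contexts, rewards, W)
```

An optimistic (OFU-style) learner would then act as if the parameter were the most favorable value in `[lo, hi]`. Unlike a sub-Gaussian confidence set, this interval contains `theta_true` with certainty rather than with high probability.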

2606.19878 2026-06-19 cs.LG math.OC stat.ML 新提交

On the Oracle Complexity of Interpolation-Based Gradient Descent

基于插值的梯度下降的预言复杂度

Dongmin Lee, William Lu, Anuran Makur

发表机构 * Purdue University(普渡大学)

AI总结 提出分段多项式插值梯度下降(PPI-GD)方法,通过数据域等距点查询一阶预言构造多项式插值近似全梯度,在强凸和非凸损失下分析预言复杂度,证明在数据维数受限且损失足够光滑时优于多种GD变体。

Comments 16 pages, 2 figures

详情
AI中文摘要

最近关于经验风险最小化(ERM)的一阶优化器的工作表明,可以利用ERM损失函数在训练数据中的光滑性(而非优化参数中的光滑性)来改进梯度下降(GD)方法的预言复杂度。在本文中,我们提出了一种不精确梯度方法——分段多项式插值梯度下降(PPI-GD),该方法通过在数据域中的等距点处查询一阶预言来近似每次迭代中的全梯度,从而在数据域的适当大小的块上构造所得梯度样本的多项式插值。我们分析了PPI-GD在强凸和非凸损失函数下的预言复杂度,其中数据空间维数以训练样本数量的多对数函数为界,并发现当损失函数足够光滑时,PPI-GD在关键区域优于几种GD变体。此外,我们的分析将双三次样条插值误差分析中的几种技术扩展到$d$变量张量积多项式插值的设置中,这可能对插值分析具有独立意义。

英文摘要

Recent work on first-order optimizers for empirical risk minimization (ERM) has suggested that smoothness of ERM loss functions in the training data, rather than in the optimization parameters, can be leveraged to improve the oracle complexity of gradient descent (GD) methods. In this paper, we propose an inexact gradient method, piecewise polynomial interpolation-based gradient descent (PPI-GD), which approximates the full gradient in each iteration by querying the first-order oracle at equidistant points in the data domain to construct polynomial interpolants of the resulting gradient samples over appropriately sized patches of the data domain. We analyze the oracle complexity of PPI-GD for strongly convex and non-convex loss functions when the data space dimension is bounded by a polylogarithmic function of the number of training samples, and find it to outperform several GD variants in key regimes when the loss function is sufficiently smooth. Furthermore, our analysis extends several techniques from the error analysis of bicubic spline interpolants to the setting of $d$-variate tensor product polynomial interpolants which may be of independent interest in interpolation analysis.
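
The core idea of replacing per-sample oracle calls with interpolation over the data domain can be sketched in one dimension. The example below uses a toy loss, piecewise-linear interpolants (degree 1 rather than general piecewise polynomials), and a hypothetical function name `interpolated_full_gradient`; all of these are simplifying assumptions and not the PPI-GD construction itself.

```python
import math

def grad(theta, x):
    """First-order oracle: d/dtheta of 0.5 * (theta - sin(3x))**2."""
    return theta - math.sin(3.0 * x)

def interpolated_full_gradient(theta, data, m):
    """Query the oracle at m + 1 equidistant points of [0, 1], then estimate
    the full gradient by piecewise-linear interpolation at each sample."""
    h = 1.0 / m
    g = [grad(theta, i * h) for i in range(m + 1)]   # m + 1 oracle calls
    total = 0.0
    for x in data:
        i = min(int(x / h), m - 1)                   # patch containing x
        t = (x - i * h) / h
        total += (1.0 - t) * g[i] + t * g[i + 1]
    return total / len(data)

n = 10_000
data = [(i + 0.5) / n for i in range(n)]             # toy data in [0, 1]
theta = 0.3
exact = sum(grad(theta, x) for x in data) / n        # n oracle calls
approx = interpolated_full_gradient(theta, data, m=20)
```

Here 21 oracle queries stand in for 10,000. Because the gradient is smooth in the data variable, the interpolation error is bounded by $h^2/8$ times the maximum second derivative in $x$, which is about $2.8 \times 10^{-3}$ for this toy loss.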

2606.19876 2026-06-19 cs.LG math.OC 新提交

Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence

通过反向Fisher散度实现高斯混合模型中得分匹配的梯度下降全局收敛

Alexander Tyurin

AI总结 研究反向Fisher散度下梯度下降拟合高斯混合模型的全局收敛性,证明从任意初始化或随机初始化下学生分量收敛到最近教师分量,并给出全变差距离收敛条件。

详情
AI中文摘要

得分匹配问题是现代生成建模、扩散模型、拟合非归一化统计模型和逆问题中的核心训练目标。标准方法是最小化前向Fisher散度,其中期望相对于教师分布取。然而,最近结果表明,即使在简单的高斯混合模型设置中,该目标也可能导致不良且依赖初始化的收敛行为。本文研究另一种目标:反向Fisher散度,其中期望相对于学生分布取。我们分析梯度下降(GD)拟合高斯混合模型,并表明目标函数的这一改变导致显著更好的优化性质。首先,当教师分布是单个高斯分布且学生是固定权重和单位协方差的高斯混合模型时,我们证明了从任意初始化出发GD的全局收敛性。其次,我们将分析扩展到教师也是高斯混合模型的情况,并在全局随机初始化方案和目标均值满足$\widetilde{\Omega}(1)$-分离假设下证明了全局收敛保证。特别地,以高概率,每个学生分量收敛到其最近的教师分量,并且我们提供了学生分布在全变差距离下收敛的条件。我们的证明依赖于基于Lyapunov的梯度下降动力学新分析,表明反向Fisher散度比前向Fisher散度具有更有利的优化景观。

英文摘要

The score matching problem is a central training objective in modern generative modeling, diffusion models, fitting unnormalized statistical models, and inverse problems. A standard approach is to minimize the forward Fisher divergence, where the expectation is taken with respect to the teacher distribution. However, recent results show that even in simple Gaussian mixture model settings, this objective can lead to undesirable and initialization-dependent convergence behavior. In this paper, we study an alternative objective: the reverse Fisher divergence, where the expectation is taken with respect to the student distribution. We analyze gradient descent (GD) for fitting Gaussian mixture models and show that this change in the objective leads to significantly better optimization properties. First, when the teacher distribution is a single Gaussian and the student is a Gaussian mixture model with fixed weights and identity covariances, we prove the global convergence of GD from arbitrary initializations. Second, we extend the analysis to the case where the teacher is also a Gaussian mixture model and prove global convergence guarantees under a global random initialization scheme and a $\widetilde{\Omega}(1)$-separation assumption on the target means. In particular, with high probability, each student component converges near its closest teacher component, and we provide conditions under which the student distribution converges in total variation distance. Our proofs rely on a new Lyapunov-based analysis of the gradient descent dynamics, showing that the reverse Fisher divergence has a much more favorable optimization landscape than the forward Fisher divergence.

2606.19834 2026-06-19 cs.DC cs.IT cs.NI math.IT 新提交

Multi-Orientation Edge-Minimum Repair for Non-Redundant Fault-Tolerant Broadcasting in Dense Eisenstein--Jacobi Networks

密集Eisenstein-Jacobi网络中非冗余容错广播的多方向边最小修复

Bader Albader

AI总结 针对密集Eisenstein-Jacobi网络,提出多方向边最小修复方法EJ-MOEM,通过评估六边形广播树方向、选择容错候选、收缩故障剪枝树并利用外部跨组件修复边重构生成树,证明单故障深度不超过t+1、双故障深度不超过t+2,实验验证至t=200均成功。

Comments Preprint also available on Zenodo:https://doi.org/10.5281/zenodo.20691537

详情
AI中文摘要

密集Eisenstein-Jacobi (EJ) 网络是度为六的代数互连网络,其有限商几何自然由六边形轴向坐标球表示。本文研究由 $\alpha=(t+1)+t\omega$ 生成的密集EJ网络中的非冗余一对多广播修复,其中 $t$ 是网络直径。我们提出EJ-MOEM,一种多方向边最小修复方法,该方法评估一个常数大小的六边形广播树方向族,选择一个容错感知候选,将故障剪枝树收缩为健康组件,并使用外部跨组件修复边重新连接这些组件。得到的结构是健康子图的一个有根生成树:每个健康节点恰好接收一次消息,不使用任何故障节点,并保留原始健康树组件。我们证明,对于故障剪枝组件图连通的所选方向,恰好 $c-1$ 条外部修复边是必要且充分的,其中 $c$ 是健康组件的数量。我们还证明了EJ坐标归约树的深度证书定理:每个单故障位置允许深度至多 $t+1$ 的修复,每个双故障位置允许深度至多 $t+2$ 的修复。证明使用了EJ六边形的三带表示、扇区后缀附着引理、非相邻扇区分离引理以及针对配对割集的六方向屏蔽分类。扩展验证包括对 $t=2,\ldots,12,14,16,18$(在 $t=18$ 时多达 $N=1027$ 和 525,825 个双故障位置)的穷举单故障和双故障枚举、直至 $t=30$ 的结构化定理关键测试,以及直至 $t=200$ 的大型随机测试,全部100%成功且无违反定理的情况。

英文摘要

Dense Eisenstein--Jacobi (EJ) networks are degree-six algebraic interconnection networks whose finite quotient geometry is naturally represented by a hexagonal axial-coordinate ball. This paper studies non-redundant one-to-all broadcast repair in the dense EJ network generated by $\alpha=(t+1)+t\omega$, where $t$ is the network diameter. We propose EJ-MOEM, a multi-orientation edge-minimum repair method that evaluates a constant-size family of hexagonal broadcast-tree orientations, selects a fault-aware candidate, contracts the fault-pruned tree into healthy components, and reconnects these components using external component-crossing repair edges. The resulting structure is a rooted spanning tree of the healthy subgraph: every healthy node receives the message exactly once, no faulty node is used, and the original healthy tree components are preserved. We prove that, for a chosen orientation whose fault-pruned component graph is connected, exactly $c-1$ external repair edges are necessary and sufficient, where $c$ is the number of healthy components. We also prove a depth-certificate theorem for EJ coordinate-reduction trees: every one-fault placement admits a repair of depth at most $t+1$, and every two-fault placement admits a repair of depth at most $t+2$. The proof uses the three-strip representation of EJ hexagons, a sector-suffix attachment lemma, a non-adjacent-sector separation lemma, and a six-direction shielding classification for paired cuts. Extended validation includes exhaustive one- and two-fault enumeration for $t=2,\ldots,12,14,16,18$ (up to $N=1027$ and 525,825 two-fault placements at $t=18$), structured theorem-critical tests through $t=30$, and large random tests through $t=200$, all with 100% success and no violation of the theorem.
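
The sizes quoted for $t=18$ can be reproduced with a short calculation. It uses the norm $N=a^2+ab+b^2$ of an EJ integer $a+b\omega$ with $a=t+1$, $b=t$, and assumes that the broadcast source is never one of the faulty nodes; that assumption is ours, inferred from the fact that it reproduces the reported count.

```python
from math import comb

def ej_node_count(t):
    """Norm of alpha = (t+1) + t*omega, i.e. a^2 + a*b + b^2 with
    a = t + 1 and b = t, which simplifies to 3t^2 + 3t + 1."""
    a, b = t + 1, t
    return a * a + a * b + b * b

N = ej_node_count(18)
# Assumption: the source node is excluded from the candidate fault locations.
two_fault_placements = comb(N - 1, 2)
```

Both values agree with the abstract: $N=1027$ nodes and $\binom{1026}{2}=525{,}825$ two-fault placements.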

2606.19833 2026-06-19 cs.DC cs.IT cs.NI math.IT 新提交

Fault-Tolerant Shared-Relay Communication in Circulant Interconnection Networks

循环互连网络中的容错共享中继通信

Bader Albader, Galal Hassan, Mohamed R. Al-Mulla

AI总结 本文研究有向循环图中两跳容错共享中继问题,通过循环差多重性条件建立网络设计框架,分析中继冗余度与度预算的关系,并验证生成器选择对中继生存性的关键影响。

Comments Preprint also available on Zenodo:https://doi.org/10.5281/zenodo.20691084

详情
AI中文摘要

循环互连网络提供对称寻址、紧凑生成器描述和均匀局部连通性。本文映射了有向循环图中容错两跳原语的度-冗余度景观:给定$n$个节点和度预算$m$,最坏情况下的共享中继多重性$R(n,m)$能有多大?如果节点到有序终端对都有出边,则该节点是共享中继;一个$f$中继容错循环图要求每对终端至少有$f+1$个这样的中继。基本可行性条件是循环差多重性条件,我们将其作为数学工具而非新对象。贡献在于围绕该工具的网络设计框架:参数$R(n,m)$和$D_f(n)$、区间循环图的否定定理、中继表预处理和查找算法、对抗性和随机故障保证、负载均衡范围、启发式设计的认证上界解释、精确的小$n$校准、软件查找与搜索微基准测试,以及对526,539个生成器集的可重复研究。结果表明,生成器选择关键决定最坏情况下的中继生存性:优化阈值设计在约$1.16$-$1.63$倍计数下界内实现$f$中继容错,而标准区间生成器即使在更大度下也可能结构失效。

英文摘要

Circulant interconnection networks provide symmetric addressing, compact generator descriptions, and uniform local connectivity. This paper maps a degree--redundancy landscape for a fault-tolerant two-hop primitive in directed circulants: given $n$ nodes and degree budget $m$, how large can the worst-case shared-relay multiplicity $R(n,m)$ be? A node is a shared relay for an ordered terminal pair if it has outgoing links to both terminals; an $f$-relay-fault-tolerant circulant requires at least $f+1$ such relays for every pair. The underlying feasibility condition is a cyclic difference-multiplicity condition, which we use as a mathematical tool rather than claim as a new object. The contribution is the network-design framework around this tool: the parameters $R(n,m)$ and $D_f(n)$, a negative theorem for interval circulants, relay-table preprocessing and lookup algorithms, adversarial and random failure guarantees, load-balance scope, certified upper-bound interpretation of heuristic designs, exact small-$n$ calibration, a software lookup-versus-search microbenchmark, and a reproducible study of 526,539 generator sets. The results show that generator choice critically determines worst-case relay survivability: optimized threshold designs achieve $f$-relay-fault tolerance within about $1.16$--$1.63$ of the counting lower bound, while standard interval generators can fail structurally even at much larger degrees.
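
The definition of a shared relay translates directly into a difference count, which the following sketch makes concrete. It assumes node $w$ has outgoing links to $w+s \pmod n$ for each generator $s$; the function name `worst_case_shared_relays` and the two small generator sets are illustrative choices, not designs from the paper.

```python
from collections import Counter

def worst_case_shared_relays(n, generators):
    """Minimum, over ordered pairs of distinct terminals (u, v), of the number
    of nodes with outgoing links to both u and v in the directed circulant
    on n nodes with the given generator set."""
    S = {s % n for s in generators}
    # A relay w for (u, v) needs u - w and v - w in S, so relays correspond
    # to pairs (s1, s2) in S x S with s1 - s2 = u - v (mod n).
    counts = Counter((s1 - s2) % n for s1 in S for s2 in S if s1 != s2)
    return min(counts.get(d, 0) for d in range(1, n))

r_interval = worst_case_shared_relays(7, [1, 2, 3])   # interval generators
r_spread = worst_case_shared_relays(7, [1, 2, 4])     # same degree, spread out
```

At equal degree, the interval set `{1, 2, 3}` leaves some terminal pairs with no shared relay (the differences 3 and 4 never occur), whereas `{1, 2, 4}` covers every nonzero difference exactly once. This is a toy instance of the abstract's point that generator choice, not just degree, determines worst-case relay availability.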

2606.19832 2026-06-19 cs.DC cs.IT cs.NI math.IT 新提交

Certified Euclidean-Residue Minimal-Alignment Switch Decompositions for Three Edge-Disjoint Hamiltonian Cycles in Eisenstein--Jacobi Networks

Eisenstein-Jacobi网络中三条边不交哈密顿环的认证欧几里得剩余最小对齐交换分解

Bader Albader

AI总结 针对非互质Eisenstein-Jacobi网络,提出一种基于局部交换演算的最小交换分解方法,构建三条边不交哈密顿环,并通过代数补关联证明其正确性。

Comments Preprint also available on Zenodo:https://doi.org/10.5281/zenodo.20693870

详情
AI中文摘要

Eisenstein-Jacobi (EJ) 网络是六度商格互连网络。对于生成元 $\alpha=a+b\rho$,设 $N=a^2+ab+b^2$ 和 $d=\gcd(a,b)$。若 $d=1$,三个自然单位方向已给出三条边不交哈密顿环。若 $d>1$,每个单位方向分裂为 $d$ 个环,边不交哈密顿环问题变为环拼接问题。现有的非互质EJ分解通过矩形表示和交换调度证明存在性。本文在自然Cayley几何中发展了一种不同的局部交换演算。前两个哈密顿环各自使用最少可能的 $d-1$ 个组件间交换构建,第三个因子作为未使用的边补集获得。贡献并非对所有非互质EJ网络的新存在性定理,而是针对欧几里得剩余族的一种紧凑、公式驱动、最小交换分解,其补关联通过符号方式证明。证明分离四个要素:组件标签坍缩、锚点取消、提升交换代表的无碰撞性以及连通补关联。本文中没有无限族定理通过有限证据或计算枚举证明。定理范围限定在代数补关联证书已写明的参数范围内。表格和CSV数据仅用于验证和重现公式,从不作为无限族定理的证明。

英文摘要

Eisenstein--Jacobi (EJ) networks are degree-six quotient-lattice interconnection networks. For a generator $\alpha=a+b\rho$, let $N=a^2+ab+b^2$ and $d=\gcd(a,b)$. If $d=1$, the three natural unit directions already give three edge-disjoint Hamiltonian cycles. If $d>1$, each unit direction splits into $d$ cycles and the edge-disjoint Hamiltonian cycle (EDHC) problem becomes a cycle-splicing problem. Existing non-coprime EJ decompositions prove existence by using a rectangular representation and exchange schedules. This paper develops a different, local switch calculus in the natural Cayley geometry. The first two Hamiltonian cycles are built using the minimum possible $d-1$ intercomponent switches each, and the third factor is obtained as the unused edge complement. The contribution is deliberately not a new existence theorem for all non-coprime EJ networks; rather, it is a compact, formula-driven, minimal-switch decomposition for Euclidean-residue families whose complement incidence is proved symbolically. The proof separates four ingredients: component-label collapse, anchor cancellation, noncollision of lifted switch representatives, and connected complement incidence. No infinite-family theorem in this manuscript is proved by finite witnesses or by computational enumeration. The theorem scope is stated for the parameter ranges where an algebraic complement-incidence certificate is written down. Tables and CSV data are used only to verify and reproduce the formulas, never as proof of an infinite-family theorem.

2606.19761 2026-06-19 cs.LO math.LO 新提交

Finishing Oltean's Completeness Proof in Lean 4 for Hybrid Logic $L(\forall)$

在 Lean 4 中完成 Oltean 关于混合逻辑 $L(\forall)$ 的完备性证明

Lars Warren Ericson

AI总结 本文在 Lean 4 中完成了混合逻辑 $L(\forall)$ 的机器检查完备性证明,通过结构新鲜性和存在引理 Henkin 构造两种工具解决了新鲜名称的生成问题。

Comments 147 pages, 5 figures

详情
AI中文摘要

我们给出了一个在 Lean 4 中机器检查的完备性定理,针对混合逻辑 $L(\forall)$:带有名义词、满足风格绑定器 $\forall$ 和盒子模态的命题模态逻辑。(基本混合逻辑(无绑定器)的机器检查完备性由 Asta Halkjær From 在 Isabelle/HOL 中开创。)我们基于 Alex Oltean 2023 年的 Lean 4 形式化工作,该工作机械化了语法、语义、希尔伯特风格证明系统和可靠性(遵循 Blackburn 的混合完备性(1998)),但未完成完备性证明。完成它需要在两个结构不同的点上制造新鲜名称,我们的核心发现是它们需要两种不同的工具。(1)根部的带见证极大一致集由扩展的 Lindenbaum 构造得到,每一步都需要一个对整个集合新鲜的名义词;正确的工具是结构新鲜性:扩展语言,使得通过构造保留无限的名义词供应。我们调查了设计空间(Oltean 在 $\mathbb{N}$ 内的奇偶编码、Bud Mishra 建议的不交和 $N \oplus \mathbb{N}$ 参数化,以及 From 的合成完备性框架)并解释了我们采用的编码。(2)极大一致集的带见证 $\Diamond$-后继不能通过这种方式获得:其典范盒子归约可证地提及每个名义词,因此没有保留的名称是新鲜的。这里正确的工具是 Oltean 选择但未完成的:一个存在引理 Henkin 构造,通过一个新鲜状态变量从前驱的见证性中抽取每个见证;我们通过一个携带数据的见证累加器和一个紧致性论证完成了它。定理 $\Gamma \models \varphi \to \Gamma \vdash \varphi$ 被完全形式化:该开发是无 sorry 的,且 #print axioms 仅报告 propext、Classical.choice 和 Quot.sound。我们将开发移植到 Lean v4.30.0 / mathlib v4.30.0。

英文摘要

We present a machine-checked completeness theorem, in Lean 4, for the hybrid logic $L(\forall)$: propositional modal logic with nominals, the satisfaction-style binder $\forall$, and the box modality. (Machine-checked completeness for basic hybrid logic, without binders, was pioneered by Asta Halkjær From in Isabelle/HOL.) We build on Alex Oltean's 2023 Lean 4 formalization, which mechanized the syntax, semantics, Hilbert-style proof system, and soundness following Blackburn's Hybrid Completeness (1998), but left completeness unfinished. Finishing it requires manufacturing fresh names at two structurally different points, and our central finding is that they call for two different tools. (1) The root witnessed maximal consistent set, built by an extended Lindenbaum construction, needs at each step a nominal fresh for the whole set; the right tool is structural freshness: extend the language so an infinite supply of nominals is reserved by construction. We survey the design space (Oltean's odd/even encoding inside $\mathbb{N}$, the disjoint-sum $N \oplus \mathbb{N}$ parameterization suggested by Bud Mishra, and From's synthetic-completeness frameworks) and explain the encoding we adopt. (2) The witnessed $\Diamond$-successor of a maximal consistent set cannot be obtained this way: its canonical box-reduct provably mentions every nominal, so no reserved name is fresh. Here the right tool is one Oltean chose but left incomplete: an existence-lemma Henkin construction drawing each witness from the predecessor's witnessedness through a fresh state variable; we complete it with a data-carrying witness accumulator and a compactness argument. The theorem $\Gamma \models \varphi \to \Gamma \vdash \varphi$ is fully formalized: the development is sorry-free, and #print axioms reports only propext, Classical.choice, and Quot.sound. We port the development to Lean v4.30.0 / mathlib v4.30.0.

2606.19754 2026-06-19 cs.LG cs.NA math.NA 新提交

Learning universal approximations for partial differential equations with Physics-Informed Broad Learning System

基于物理信息广度学习系统的偏微分方程通用逼近学习

Zhiwen Yu, Derong Yang, Liujian Zhang, Kaixiang Yang, Peilin Zhan, Jianmin Lv, Jane You, C. L. Philip Chen

发表机构 * School of Computer Science and Engineering, South China University of Technology(华南理工大学计算机科学与工程学院) Peng Cheng Laboratory(鹏城实验室) School of Future Technology, South China University of Technology(华南理工大学未来技术学院) School of Computer Science and Technology, Guangdong University of Technology(广东工业大学计算机科学与技术学院) Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University(香港理工大学工业及系统工程学系)

AI总结 提出物理信息广度学习系统(PIBLS),通过无反向传播的最小二乘优化高效求解线性和非线性偏微分方程,比传统PINN快1-3个数量级且精度更高。

详情
AI中文摘要

偏微分方程(PDE)在建模复杂的物理、生物和工程系统中起着核心作用。虽然传统的数值求解器很稳健,但由于网格依赖性,它们常常带来高昂的计算成本,而最近的物理信息神经网络(PINN)提供了一种无网格替代方案,但经常遭受收敛缓慢和优化不稳定的问题。为了弥合这一差距,本文提出了物理信息广度学习系统(PIBLS),一种新颖的无反向传播框架,将PDE求解重新表述为直接的最小二乘优化。我们改进了该框架内的一个算法以高效处理非线性PDE,并提供了严格的数学证明,确立了PIBLS对这些方程的通用逼近性质。在线性和非线性PDE上的实验表明,PIBLS比传统PINN快1到3个数量级,同时实现了显著更高的求解精度。该框架为科学机器学习提供了一种计算高效的范式,为实时仿真和设计优化任务提供了一种实用、高速的替代方案。

英文摘要

Partial differential equations (PDEs) play a central role in modeling complex physical, biological, and engineering systems. While traditional numerical solvers are robust, they often incur prohibitive computational costs due to mesh dependencies, whereas recent Physics-Informed Neural Networks (PINNs) offer a mesh-free alternative but frequently suffer from slow convergence and optimization instability. To bridge this gap, this article proposes the Physics-Informed Broad Learning System (PIBLS), a novel backpropagation-free framework that reformulates PDE solving as a direct least-squares optimization. We improve an algorithm within this framework to handle nonlinear PDEs efficiently and provide a rigorous mathematical proof establishing the universal approximation property of PIBLS for these equations. Experiments on linear and nonlinear PDEs demonstrate that PIBLS is one to three orders of magnitude faster than conventional PINNs while achieving significantly higher solution accuracy. This framework provides a computationally efficient paradigm for scientific machine learning, offering a practical, high-speed alternative for real-time simulation and design optimization tasks.

2606.19751 2026-06-19 cs.DB math.OC 新提交

DeQL: A Decision Query Language for Prescriptive Analytics over Relational Data

DeQL:一种用于关系数据规范性分析的决策查询语言

Matteo Brucato, Fjodor Kholodkov, Soren Little, Jakob Mayer, Duc Nguyen

AI总结 DeQL扩展SQL以支持决策查询,通过CREATE CANDIDATES和DECIDE两个构造定义选项空间、约束和目标,实现子集选择、分配、调度等决策,并支持不确定性优化和模型评分。

详情
AI中文摘要

DeQL(决策查询语言)扩展了SQL以表达决策查询:给定从关系数据中提取的选项、策略约束和可测量的目标,DeQL查询计算出最佳行动方案。两个构造实现了这一扩展:CREATE CANDIDATES,定义来自关系源的选项空间;DECIDE,声明决策变量、命名约束以及针对这些变量的目标。该设计遵循SQL的原则:用户说明要优化的内容,而引擎选择如何求解;每个查询消费并产生关系;问题的结构对引擎保持可见。本文档规范了该语言(其设计原则、语法、形式文法及执行模型),并附有涵盖子集选择、分配、指派、调度以及多级聚合决策的示例,以及针对不确定性优化、内联模型评分和时间与质量受限求解的扩展。这是该规范的第一版;该语言正在积极开发中,本版本固定了后续修订将基于的核心构造。

英文摘要

DeQL (Decision Query Language) extends SQL to express decision queries: given options drawn from relational data, constraints from policy, and a measurable objective, a DeQL query computes the best course of action. Two constructs carry the extension: CREATE CANDIDATES, which defines the space of options from relational sources, and DECIDE, which declares decision variables, named constraints, and an objective over them. The design follows SQL's principles: the user states what to optimize while the engine chooses how to solve it, every query consumes and produces relations, and the structure of a problem stays visible to the engine. This document specifies the language (its design principles, syntax, formal grammar, and execution model) with examples spanning subset selection, allocation, assignment, scheduling, and decisions at multiple levels of aggregation, and extensions for optimization under uncertainty, inline model scoring, and time- and quality-bounded solving. It is the first version of the specification; the language is under active development, and this version fixes the core constructs on which later revisions will build.

2606.19715 2026-06-19 eess.SP cs.IT math.IT 新提交

Generalized Pinching-Antenna Systems: A Radio-Stripe-Based Realization

广义夹捏天线系统:基于无线电条带的实现

Yanqing Xu, Zhiguo Ding, Tsung-Hui Chang

AI总结 本文提出基于无线电条带(RS)的广义夹捏天线(RS-GPA)框架,通过主动天线处理单元实现位置灵活的无线接入,并开发稀疏激活与波束成形算法以降低总功耗。

Comments 13 pages, 7 figures

详情
AI中文摘要

本文研究无线电条带(RS)作为广义夹捏天线的实际实现,并提出基于RS的广义夹捏天线(RS-GPA)框架。与依赖导波到自由空间被动耦合的介质波导基被动夹捏天线不同,RS采用沿共享电缆部署的主动天线处理单元(APU)进行本地传输、接收和信号处理。这种类似电缆的主动架构提供了灵活的安装和广泛的频率适用性,同时允许选定的APU作为离散且可控的辐射或接收点,实现位置灵活的无线接入。基于所提出的RS-GPA框架,我们通过考虑距离相关的APU-用户信道建立了系统和信道模型。对于下行传输,我们提出了一个电路功率感知的稀疏APU激活和波束成形问题,并开发了一种重加权群稀疏波束成形算法。为了揭示激活原理,我们分析了单用户下行情况,并通过平衡发射功率节省和电路功率成本来刻画何时应激活额外的APU。受此启发,提出了一种几何引导的低复杂度多用户算法。对于上行传输,我们提出了一个联合APU激活和用户功率控制问题,并开发了一种几何引导的稀疏激活设计。数值结果表明,与基准方案相比,所提出的RS-GPA框架显著降低了总功耗,而几何引导算法在运行时间显著降低的情况下实现了与群稀疏设计几乎相同的功耗性能。

英文摘要

This paper investigates radio stripes (RSs) as a practical realization of generalized pinching antennas and proposes an RS-based generalized pinching-antenna (RS-GPA) framework. Unlike dielectric-waveguide-based passive pinching antennas that rely on passive coupling from a guided wave into free space, RSs employ active antenna processing units (APUs) deployed along a shared cable for local transmission, reception, and signal processing. This cable-like active architecture offers flexible installation and broad frequency applicability, while allowing selected APUs to act as discrete and controllable radiation or reception points for location-flexible wireless access. Based on the proposed RS-GPA framework, we establish the system and channel models by accounting for the distance-dependent APU-user channels. For downlink transmission, we formulate a circuit-power-aware sparse APU activation and beamforming problem and develop a reweighted group-sparse beamforming algorithm. To reveal the activation principle, we analyze the single-user downlink case and characterize when an additional APU should be activated by balancing transmit-power saving and circuit-power cost. Inspired by this insight, a geometry-guided low-complexity multiuser algorithm is proposed. For uplink transmission, we formulate a joint APU activation and user power control problem and develop a geometry-guided sparse activation design. Numerical results show that the proposed RS-GPA framework substantially reduces the total consumed power compared with benchmark schemes, while the geometry-guided algorithm achieves near-identical consumed-power performance to the group-sparse design with significantly lower runtime.

2606.19695 2026-06-19 eess.SY cs.GT cs.SY math.OC 新提交

A Unified Framework for Joint Sensor Placement and Scheduling for Intrusion Detection

入侵检测中联合传感器放置与调度的统一框架

Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram

AI总结 提出一个统一框架,将传感器放置与方向调度联合优化,通过博弈论设计效用函数并利用弱子模性实现近最优检测性能。

Comments 27 pages, 4 figures

详情
AI中文摘要

我们考虑一个入侵检测任务,其中防御者必须联合优化传感器放置位置和方向,以最小化入侵者穿越受保护环境时被漏检的概率。我们将此问题分解为一个元问题(称为SensorPlacement)和一个嵌入的子问题(称为OrientationScheduling)。对于固定的传感器放置,OrientationScheduling子问题被建模为防御者和入侵者之间的两人零和博弈,其中防御者寻求已部署传感器的方向策略以最小化漏检概率,而入侵者则寻求路径选择策略以最大化该概率。由于防御者的策略空间随传感器数量和方向组合增长,通过标准线性规划求解博弈变得不可行。为此,我们开发了一种迭代且高效的均衡求解算法,该算法利用博弈收益函数的结构,并建立了收敛到博弈纳什均衡(NE)的理论保证。该NE值随后被用作SensorPlacement元问题中的效用度量。我们证明了这个基于博弈值的效用函数在传感器放置集合上是弱子模的,并提出了一个具有近最优性保证的贪婪放置算法。据我们所知,这是第一个将博弈论效用设计与(弱)子模优化相结合的统一框架,实现了传感器放置和方向调度的原则性联合优化。通过大量仿真,我们证明所提出的方法实现了近最优的检测性能,同时与基线相比显著减少了计算时间。

英文摘要

We consider an intrusion detection task in which a defender must jointly optimize sensor placement locations and orientations to minimize the probability of missed detection of an intruder traversing a protected environment. We decompose this problem into a meta problem, termed SensorPlacement, and an embedded subproblem, termed OrientationScheduling. The OrientationScheduling subproblem, for a fixed sensor placement, is modeled as a 2-player zero-sum game between the defender and the intruder, where the defender seeks an orientation strategy for the deployed sensors to minimize the probability of missed detection, while the intruder seeks a path selection strategy to maximize it. Since the defender's strategy space grows combinatorially with the number of sensors and orientations, solving the game via standard linear programming becomes prohibitive. To this end, we develop an iterative and efficient equilibrium-seeking algorithm that exploits the structure of the game's payoff function, and we establish theoretical guarantees for its convergence to the Nash equilibrium (NE) of the game. This NE value is then used as a utility measure in the SensorPlacement meta problem. We show that this game-value-based utility function is weakly submodular over the set of sensor placements and propose a greedy placement algorithm with near-optimality guarantees. To our knowledge, this is the first unified framework to integrate game-theoretic utility design with (weak) submodular optimization, enabling principled joint optimization of sensor placement and orientation scheduling. Through extensive simulations, we demonstrate that the proposed approach achieves near-optimal detection performance while significantly reducing computation time compared to baselines.

2606.19521 2026-06-19 cs.LG math.OC New submission

Interactive Pareto navigation for deep multi-task learning

深度多任务学习的交互式帕累托导航

Augustina C. Amakor, Konstantin Sonntag, Sebastian Peitz

Affiliations: Department of Computer Science, TU Dortmund, Dortmund, Germany; Lamarr Institute for Machine Learning and Artificial Intelligence

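AI summary: Proposes the Preference Pareto Exploration (PPE) framework, which uses a predictor-corrector method to follow the decision maker's preference along tangent directions of the Pareto manifold and a Krylov subspace method to avoid explicit Hessian computation, enabling efficient interactive multi-objective optimization.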

Details
AI Chinese abstract

在多任务学习中,处理越来越多的目标在计算资源和决策者选择适当权衡的能力方面都很快变得具有挑战性。因此,一种广泛使用的方法是通过加权和将各个损失聚合到单个损失函数中。这通常由于帕累托前沿的形状而无法捕捉决策者的偏好,或者需要多次调整和计算,这在深度学习应用中变得过于昂贵。为了解决这些问题,我们引入了一个新颖的框架,偏好帕累托探索(PPE),它在交互式探索过程中强制执行决策者的偏好,同时考虑帕累托集的几何形状。PPE基于预测-校正方法,该方法沿着帕累托最优解流形的切线方向执行预测步骤,遵循决策者的偏好。随后的校正步骤产生反映该偏好的新权衡。为了在表征流形切空间时避免显式的Hessian计算,我们采用了一种仅依赖于矩阵-向量乘积的Krylov子空间方法。这些乘积可以通过自动微分高效获得,确保了整个优化过程的效率和鲁棒性。该方法的有效性和性能通过玩具问题和深度学习示例进行了展示。

English abstract

In multi-task learning, handling an increasing number of objectives can quickly become challenging, both in terms of the computational resources and the decision maker's capacity to choose appropriate trade-offs. A widely used approach is thus to aggregate the individual losses into a single loss function by a weighted sum. This often either fails to capture the decision maker's preferences, owing to the shape of the Pareto front, or requires multiple adjustments and computations, which become prohibitively expensive in deep learning applications. To address these issues, we introduce a novel framework, Preference Pareto Exploration (PPE), which enforces the decision maker's preferences while accounting for the geometry of the Pareto set in an interactive exploration process. PPE is based on a predictor-corrector method that performs predictor steps tangential to the manifold of Pareto-optimal solutions, following the decision maker's preference. The subsequent corrector step results in a new trade-off reflecting this preference. To avoid explicit Hessian computations when characterizing the tangent space of the manifold, we employ a Krylov subspace method that relies solely on matrix-vector products. These products can be efficiently obtained via automatic differentiation, ensuring both efficiency and robustness throughout the optimization process. The method's functionality and performance are demonstrated using both toy problems and examples from deep learning.
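
The idea of a Krylov method that needs only matrix-vector products can be sketched as follows. This is a minimal illustration under stated assumptions, not the PPE implementation: conjugate gradient is used as one familiar Krylov method (the abstract does not say which one the authors use), the Hessian-vector product is approximated by a central finite difference of the gradient as a dependency-free stand-in for automatic differentiation, and `toy_grad` is a hypothetical quadratic loss gradient.

```python
from typing import Callable, List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def hvp_finite_difference(grad: Callable[[Vector], Vector], x: Vector, v: Vector,
                          eps: float = 1e-5) -> Vector:
    """Approximate H(x) @ v from two gradient calls; the Hessian is never formed.
    Assumption: stands in for an autodiff Hessian-vector product."""
    g_plus = grad([xi + eps * vi for xi, vi in zip(x, v)])
    g_minus = grad([xi - eps * vi for xi, vi in zip(x, v)])
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]

def conjugate_gradient(matvec: Callable[[Vector], Vector], b: Vector,
                       tol: float = 1e-8, max_iter: int = 50) -> Vector:
    """Solve H x = b for symmetric positive definite H using only matvec(v) = H v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - H x for the initial guess x = 0
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        hp = matvec(p)
        alpha = rs / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def toy_grad(w: Vector) -> Vector:
    """Gradient of the hypothetical loss 0.5 * w^T A w with A = [[4, 1], [1, 3]]."""
    return [4.0 * w[0] + 1.0 * w[1], 1.0 * w[0] + 3.0 * w[1]]

w0 = [0.3, -0.2]   # point at which the Hessian is probed
rhs = [1.0, 2.0]
solution = conjugate_gradient(lambda v: hvp_finite_difference(toy_grad, w0, v), rhs)
print(solution)
```

The solver touches the Hessian only through `matvec`, so for a deep network the cost per iteration is a small number of gradient-like passes rather than the storage of a full second-derivative matrix.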

2606.19393 2026-06-19 cs.DM cs.DS math.CO New submission

An alternative way of defining finite graphs

定义有限图的另一种方式

Maxim Nazarov

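AI summary: Proposes "graph linear notation", a complete graph invariant offered as an alternative definition of finite graphs, to simplify drawing graphs in a way that reflects their symmetries and comparing graphs for isomorphism.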

Details
Journal ref
Prikl. Diskr. Mat., 2015, no. 3(29), 83-94
AI Chinese abstract

在本文中,我们引入了“图线性符号”(graph linear notation),这是一种完全图不变量,可作为有限图的替代定义。该不变量的构造算法类似于求图的规范形式的算法。存储图线性符号而非常规图,可以极大地简化两个主要问题:在顾及图可能存在的对称性的前提下绘制图的图示,以及比较两个图是否同构。我们还展示了着色、图路径等经典图论概念可以迁移到图线性符号上。

English abstract

In this paper we introduce "graph linear notation" -- a complete graph invariant -- which is positioned as an alternative definition for finite graphs. This invariant is constructed using an algorithm similar to those used to find canonical forms of graphs. Storing graph linear notation instead of a regular graph allows us to greatly simplify two major problems: the construction of illustrations for graphs with regard to possible graph symmetries, and the comparison of two graphs for isomorphism. We also demonstrate that classical graph theory concepts such as colourings and graph paths transfer to graph linear notation.
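
To make the notion of a complete invariant concrete, here is a minimal sketch of a brute-force canonical form for small graphs. It is not the paper's graph linear notation, whose construction the abstract does not describe; `canonical_form` is a hypothetical helper that tries every vertex relabelling and keeps the lexicographically smallest edge list, so two graphs on the same number of vertices get equal forms exactly when they are isomorphic. The factorial cost limits it to very small graphs.

```python
from itertools import permutations
from typing import Iterable, Tuple

Edge = Tuple[int, int]

def canonical_form(n: int, edges: Iterable[Edge]) -> Tuple[Edge, ...]:
    """Smallest sorted edge list over all relabellings of vertices 0..n-1 (simple undirected graph)."""
    edge_list = list(edges)
    best = None
    for perm in permutations(range(n)):
        # Relabel each endpoint, normalise each edge as (low, high), then sort the edges.
        relabeled = tuple(sorted(
            (min(perm[u], perm[v]), max(perm[u], perm[v])) for u, v in edge_list
        ))
        if best is None or relabeled < best:
            best = relabeled
    return best if best is not None else ()

# Two drawings of a path on four vertices, and a star on four vertices.
path_a = [(0, 1), (1, 2), (2, 3)]
path_b = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]

same = canonical_form(4, path_a) == canonical_form(4, path_b)
different = canonical_form(4, path_a) != canonical_form(4, star)
print(same, different)
```

Storing the canonical form instead of an arbitrary labelling reduces isomorphism testing to an equality check, which is the kind of simplification the abstract attributes to its own notation.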

2606.19361 2026-06-19 cs.LG cs.AI cs.NA math.NA stat.CO stat.ME stat.ML New submission

Computational Identifiability

计算可识别性

Lucius E. J. Bynum, Rajesh Ranganath, Kyunghyun Cho

Affiliations: New York University

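AI summary: Proposes the "computational identifiability" framework, under which identifiability holds if a finite computational search procedure finds an empirical estimator within a specified error tolerance, addressing the limits of theoretical identifiability in practical settings such as small finite samples and ambiguous graphical criteria.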

Details
AI Chinese abstract

识别条件描述了目标查询或感兴趣参数的可计算性,它是可用信息的类型和数量的函数。在因果识别中,这些信息通常以因果图的形式表达,数据则是针对图中某些变量子集观测或收集的。目标查询可以是单个效应,也可以是给定模型中的一类效应。识别算法的推导随后在数学上定义了一个过程,通过该过程可以在理论上、在期望意义下唯一确定所需的因果效应。期望意义下的可识别性,即“理论可识别性”,通常假设渐近性质、无限数据或其他数学理想化条件。在本文中,我们探讨了这种理论上理想化的可识别性概念与一种受计算约束的替代方案之间的根本区别。我们提出的框架“计算可识别性”则转而为经验估计量定义一个有限的计算搜索过程。如果该过程在期望的误差容限内经验性地找到了估计量,则可识别性得到满足,其成立以搜索所指定的假设(即参数上的先验分布)以及搜索过程本身为条件。通过多个实验,我们展示了该框架如何回答细粒度的实际识别问题,例如小有限样本下的识别、模糊图标准下的识别、混合观测-干预数据下的识别,以及跨反事实数据和被估量的识别。代码见 https://github.com/lbynum/metadentify 。

English abstract

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. In causal identification, this information is often expressed in the form of a causal graph, and data are observed or collected for some subset of variables in the graph. Target queries may be for a single effect alone or for a class of effects in a given model. The derivation of an identification algorithm then defines mathematically the process by which the desired causal effect(s) can be uniquely determined, theoretically, in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore a fundamental distinction between this theoretical, idealized notion of identifiability and a proposed alternative that is computation-bound. The framework we propose - 'computational identifiability' - is to instead define a finite computational search procedure for an empirical estimator. If this process finds an estimator empirically, within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and conditional on the search procedure itself. Through several experiments, we demonstrate how this framework allows us to answer fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.