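arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering and systems science.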

2603.14169 2026-06-05 stat.ME cs.AI

Beyond Means: Topological Causal Effects under Persistent-Homology Ignorability

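Amir Saki, Usef Faghihi

Affiliation: Université du Québec à Trois-Rivières

AI summary: This paper proposes a causal framework based on persistent homology to address the inability of mean-based causal estimands to capture changes in the shape of the outcome distribution. It defines topological analogues of CATE and ATE and proves that they are identifiable under approximate topological ignorability.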

Abstract

Average treatment effects (ATE) and conditional average treatment effects (CATE) are foundational causal estimands, but they target changes in expected outcomes and can miss treatment-induced changes in the shape of outcome distributions. A canonical failure mode occurs when control outcomes are unimodal, treated outcomes become bimodal, and both distributions have the same mean. In such cases mean-based causal estimands are zero even though the geometry and topology of the outcome law change substantially. This paper develops a topological causal framework based on persistent homology. We formalize a persistent-homology ignorability condition, define topological analogues of CATE and ATE, and prove that these estimands are identifiable up to an explicit error bound under approximate topological ignorability. We also clarify a subtle but important point: a marginal persistence-diagram effect is not identified from conditional topological ignorability alone because persistent homology does not in general commute with mixtures over covariates. To preserve the original intuition while ensuring scientific correctness, we retain the marginal effect as a motivating quantity, but place the mathematically sound conditional estimands at the center of the theory. A synthetic experiment with mean-preserving topology change shows that mean-based causal estimands remain near zero while the proposed topological effect increases sharply and remains recoverable after adjustment for confounding.
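The failure mode described in the abstract can be illustrated with a small sketch. It is not the paper's estimator: it assumes the two outcome densities are known in closed form (a standard normal for control, a symmetric two-component mixture for treated, both chosen here for illustration), and it uses the number of persistent peaks in 0-dimensional superlevel-set persistence as a crude stand-in for a persistence-diagram effect. The names `peak_persistence`, `count_peaks`, and the 0.05 persistence threshold are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def peak_persistence(values):
    """0-dimensional superlevel-set persistence of a function sampled on a 1D grid.

    Returns (birth, death) pairs sorted by decreasing persistence.
    """
    n = len(values)
    parent = [None] * n   # None marks grid points not yet swept
    birth = {}            # component root -> height at which it was born
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sweep the threshold downward: the highest grid points enter first.
    for i in sorted(range(n), key=lambda idx: -values[idx]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component born lower dies at the merge height.
                young, old = (a, b) if birth[a] < birth[b] else (b, a)
                if birth[young] > values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    # The component of the global maximum never merges; close it at the minimum.
    pairs.append((max(values), min(values)))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)

STEP = 0.01
xs = [-6.0 + STEP * k for k in range(1201)]
# Illustrative outcome laws: unimodal control, bimodal treated, both with mean 0.
control = [normal_pdf(x, 0.0, 1.0) for x in xs]
treated = [0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 2.0, 0.5) for x in xs]

def mean_of_density(density):
    """Riemann-sum approximation of the mean of a density on the grid."""
    return sum(x * f for x, f in zip(xs, density)) * STEP

def count_peaks(density, min_persistence=0.05):
    """Number of peaks whose persistence exceeds the (hypothetical) threshold."""
    return sum(1 for b, d in peak_persistence(density) if b - d > min_persistence)

mean_effect = mean_of_density(treated) - mean_of_density(control)
topological_effect = count_peaks(treated) - count_peaks(control)
print(mean_effect, topological_effect)
```

In this toy setting the mean difference is numerically zero while the peak count changes. That is the gap a mean-based estimand cannot see.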


2601.11527 2026-06-05 cs.HC cs.AI cs.CY

"What if she doesn't feel the same?" What Happens When We Ask AI for Relationship Advice

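Niva Manchanda, Akshata Kishore Moharir, Ratna Kandala

Affiliations: Department of Psychology, University of Kansas; Independent Researcher

AI summary: This study examines how users evaluate LLM-generated romantic relationship advice. It finds that users report high satisfaction with the advice, that this satisfaction correlates positively with the perceived reliability and helpfulness of the model, and that users' attitudes toward LLMs improve significantly.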

Journal ref First Workshop on LLM Persona Modeling, NeurIPS 2025

2603.02376 2026-06-05 cs.DC cs.AR cs.LG cs.MA

CUCo: An Agentic Framework for Compute and Communication Co-design

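Yoga Sri Varshan Varadharajan, Bodun Hu, Saurabh Agarwal, Aditya Akella

Affiliation: UT Austin

AI summary: This paper presents CUCo, a framework that combines a structured design-space formalization with a correctness-first fast-path agent and an evolution-driven slow-path agent to co-design compute and communication in CUDA kernels. It reports a 1.57x speedup across four multi-GPU workloads and discovers a dual-stream overlap strategy at an LLM inference cost below $10.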

2509.20345 2026-06-05 stat.ME cs.LG stat.ML

General Synthetic-Powered Inference

Meshi Bashari, Yonghoon Lee, Roy Maor Lotan, Edgar Dobriban, Yaniv Romano

Affiliations: Department of Electrical and Computer Engineering, Technion IIT, Israel; Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, USA; Department of Computer Science, Technion IIT, Israel

AI summary: This paper proposes a general synthetic-powered inference framework that improves sample efficiency by combining high-quality synthetic data with real data. It automatically falls back to the standard real-data method when the synthetic data are of low quality, and it keeps the error rate within a user-specified bound without distributional assumptions.

Abstract

The rapid proliferation of high-quality synthetic data -- generated by advanced AI models or collected as auxiliary data from related tasks -- presents both opportunities and challenges for statistical inference. This paper introduces a GEneral Synthetic-Powered Inference (GESPI) framework that wraps around a broad class of statistical inference procedures to safely enhance sample efficiency by combining synthetic and real data. Our framework leverages high-quality synthetic data to boost statistical power, yet adaptively defaults to the standard method using only real data when synthetic data are of low quality. The error rate of our method remains below a user-specified bound without any distributional assumptions on the synthetic data, and decreases as the quality of the synthetic data improves. This flexibility enables seamless integration with conformal prediction, risk control, hypothesis testing, and multiple testing procedures, all without modifying the base inference method. We demonstrate the benefits of our method on challenging tasks with limited labeled data, including AlphaFold protein structure prediction, and comparing large reasoning models on complex math problems.


2508.04409 2026-06-05 stat.ML cs.LG

The Relative Instability of Model Comparison with Cross-validation

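Alexandre Bayle, Lucas Janson, Lester Mackey

Affiliations: Department of Statistics, Harvard University, Cambridge, MA, USA; Microsoft Research New England, Cambridge, MA, USA

AI summary: This study shows that comparing models can be relatively unstable even when each model is individually stable, which challenges the validity of cross-validation inference. In particular, the Lasso and soft-thresholding lead to invalid cross-validation inference even under the most favorable learning conditions.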

2602.07739 2026-06-05 cs.IR cs.AI

HypRAG: Hyperbolic Dense Retrieval for Retrieval Augmented Generation

Hiren Madhu, Ngoc Bui, Ali Maatouk, Leandros Tassiulas, Smita Krishnaswamy, Menglin Yang, Sukanta Ganguly, Kiran Srinivasan, Rex Ying

Affiliation: University of California, Berkeley

AI summary: This paper proposes hyperbolic dense retrieval, building two model variants in hyperbolic space, HyTE-FH and HyTE-H, to address the limitations of Euclidean space in retrieval-augmented generation and to improve context relevance and answer relevance.

Abstract

Embedding geometry plays a fundamental role in retrieval quality, yet dense retrievers for retrieval-augmented generation (RAG) remain largely confined to Euclidean space. However, natural language exhibits hierarchical structure from broad topics to specific entities that Euclidean embeddings fail to preserve, causing semantically distant documents to appear spuriously similar and increasing hallucination risk. To address these limitations, we introduce hyperbolic dense retrieval, developing two model variants in the Lorentz model of hyperbolic space: HyTE-FH, a fully hyperbolic transformer, and HyTE-H, a hybrid architecture projecting pre-trained Euclidean embeddings into hyperbolic space. To prevent representational collapse during sequence aggregation, we introduce the Outward Einstein Midpoint, a geometry-aware pooling operator that provably preserves hierarchical structure. On MTEB, HyTE-FH outperforms equivalent Euclidean baselines, while on RAGBench, HyTE-H achieves up to 29% gains over Euclidean baselines in context relevance and answer relevance using substantially smaller models than current state-of-the-art retrievers. Our analysis also reveals that hyperbolic representations encode document specificity through norm-based separation, with over 20% radial increase from general to specific concepts, a property absent in Euclidean embeddings, underscoring the critical role of geometric inductive bias in faithful RAG systems.
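For readers unfamiliar with the Lorentz model mentioned above, the following sketch shows the standard textbook operations, assuming curvature -1: lifting a Euclidean vector onto the hyperboloid with the exponential map at the origin, and measuring geodesic distance through the Lorentzian inner product. It is not the HyTE-H projection or the Outward Einstein Midpoint, whose definitions the abstract does not give; the function names are illustrative.

```python
import math

def exp_map_origin(v):
    """Lift a Euclidean (tangent) vector v onto the unit hyperboloid (curvature -1)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * c for c in v]

def lorentz_inner(x, y):
    """Lorentzian inner product: minus the time-like term plus the space-like terms."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid; the clamp guards against rounding below 1."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

origin = exp_map_origin([0.0, 0.0])
general = exp_map_origin([0.3, 0.4])   # Euclidean norm 0.5
specific = exp_map_origin([0.9, 1.2])  # same direction, Euclidean norm 1.5

print(lorentz_distance(origin, general), lorentz_distance(origin, specific))
```

Under this map a point's distance from the origin equals the Euclidean norm of the vector it was lifted from, which is one way to read the abstract's "norm-based separation": points farther out radially can encode more specific content.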


2602.01607 2026-06-05 math.ST cs.IT cs.LG math.IT stat.ML stat.TH

Minimax optimal differentially private synthetic data for smooth queries

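Rundong Ding, Yiyun He, Yizhe Zhu

Affiliations: Department of Mathematics, University of Southern California; Department of Mathematics, University of California San Diego

AI summary: This paper studies how to generate (ε,δ)-differentially private synthetic data that protects individual privacy while providing strong utility guarantees for meaningful downstream analysis. It proposes a polynomial-time algorithm that achieves the minimax error rate O_{k,d}(n^{-min{1, k/d}}) up to a log(n) factor, and it establishes the first minimax lower bound for k-smooth queries.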

Comments COLT 2026 arXiv version. 34 pages

Abstract

Differentially private synthetic data enables the sharing and analysis of sensitive datasets while providing rigorous privacy guarantees for individual contributors. A central challenge is to achieve strong utility guarantees for meaningful downstream analysis. Many existing methods ensure uniform accuracy over broad query classes, such as all Lipschitz functions, but this level of generality often leads to suboptimal rates for statistics of practical interest. Since many common data analysis queries exhibit smoothness beyond what worst-case Lipschitz bounds capture, we ask whether exploiting this additional structure can yield improved utility. We study the problem of generating $(\varepsilon,\delta)$-differentially private synthetic data from a dataset of size $n$ supported on the hypercube $[-1,1]^d$, with utility guarantees uniformly for all smooth queries having bounded derivatives up to order $k$. We propose a polynomial-time algorithm that achieves a minimax error rate of $O_{k,d}(n^{-\min \{1, \frac{k}{d}\}})$, up to a $\log(n)$ factor. This characterization uncovers a phase transition at $k=d$. Our results generalize the Chebyshev moment matching framework of (Musco et al., 2025; Wang et al., 2016) and strictly improve the error rates for $k$-smooth queries established in (Wang et al., 2016). Moreover, we establish the first minimax lower bound for the utility of $(\varepsilon,\delta)$-differentially private synthetic data with respect to $k$-smooth queries, extending the Wasserstein lower bound for $\varepsilon$-differential privacy in (Boedihardjo et al., 2024).
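The phase transition at $k=d$ can be made concrete with a few lines of arithmetic. The sketch below only evaluates the stated exponent $\min\{1, k/d\}$; it ignores the constant hidden in $O_{k,d}$ and the $\log(n)$ factor, so the printed numbers indicate scaling, not actual errors. The function names are illustrative.

```python
def rate_exponent(k, d):
    """Exponent in the stated rate n^(-min{1, k/d})."""
    return min(1.0, k / d)

def error_scale(n, k, d):
    """Rate without the O_{k,d} constant and without the log(n) factor."""
    return n ** -rate_exponent(k, d)

n, d = 10**6, 4
for k in (1, 2, 4, 8):
    # Below k = d the exponent grows with smoothness; from k = d on it is capped at 1.
    print(k, rate_exponent(k, d), error_scale(n, k, d))
```

For fixed $d$, extra smoothness helps only until $k$ reaches $d$; beyond that point the exponent stays at 1.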


2602.05056 2026-06-05 cs.CR cs.CL cs.LG

Grounded but Misleading: Evaluating Semantic Alignment in AI-Generated Security Explanations

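Heajun An, Connor Ng, Sandesh Sharma Dulal, Junghwan Kim, Jin-Hee Cho

Affiliation: Virginia Tech

AI summary: This paper studies semantic alignment in AI-generated security explanations. Using the VEXA testbed, it examines the gap between lexical grounding and semantic risk alignment and finds that explanations can appear lexically grounded even when their semantic interpretation weakens the detector's intended risk assessment.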

Abstract

Online scams increasingly leverage fluent and context-aware social engineering strategies, creating growing demand for AI systems that explain why a message may be risky. However, explanations that cite detector-derived evidence may still semantically weaken or redirect the intended risk interpretation. We introduce VEXA: Verifying Semantic Explanation Alignment, a controlled testbed for studying the gap between lexical grounding and semantic risk alignment in AI-generated scam-risk explanations. VEXA generates ungrounded, risk-aligned, and risk-diluting explanations by independently controlling evidence grounding and semantic framing. Through LLM-as-a-judge and human evaluations, we show that explanations may continue to appear comparatively grounded even when their semantic interpretation weakens the detector's intended risk assessment. In human evaluation, risk-diluting XAI-grounded explanations retained comparatively elevated Perceived Evidence Grounding scores (3.66) despite lower Helpfulness (3.00) and Reasoning Support (3.14) scores. These findings provide controlled evidence of grounding illusion effects in AI-generated security explanations and suggest that trustworthy explanation evaluation must verify not only whether evidence is cited, but also how that evidence is interpreted.


2601.21162 2026-06-05 cs.IR cs.AI cs.DB

A2RAG: Adaptive Agentic Graph Retrieval for Cost-Aware and Reliable Reasoning

Jiate Liu, Zebin Chen, Shaobo Qiao, Mingchen Ju, Danting Zhang, Bocheng Han, Shuyue Yu, Xin Shu, Jinglin Wu, Dong Wen, Xin Cao, Guanfeng Liu, Zhengyi Yang

Affiliations: University of New South Wales; Euler AI; Sigma Trading Management; Eigenflow AI; Macquarie University

AI summary: This paper proposes the A2RAG framework, which uses an adaptive controller and an agentic retriever to address cost and reliability problems in graph retrieval, improving multihop question answering while reducing computational overhead.

Abstract

Graph Retrieval-Augmented Generation (Graph-RAG) enhances multihop question answering by organizing corpora into knowledge graphs and routing evidence through relational structure. However, practical deployments face two persistent bottlenecks: (i) mixed-difficulty workloads where one-size-fits-all retrieval either wastes cost on easy queries or fails on hard multihop cases, and (ii) extraction loss, where graph abstraction omits fine-grained qualifiers that remain only in source text. We present A2RAG, an adaptive-and-agentic GraphRAG framework for cost-aware and reliable reasoning. A2RAG couples an adaptive controller that verifies evidence sufficiency and triggers targeted refinement only when necessary, with an agentic retriever that progressively escalates retrieval effort and maps graph signals back to provenance text to remain robust under extraction loss and incomplete graphs. Experiments on HotpotQA and 2WikiMultiHopQA demonstrate that A2RAG achieves +9.9/+11.8 absolute gains in Recall@2, while cutting token consumption and end-to-end latency by about 50% relative to iterative multihop baselines.
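The control flow described above (check sufficiency, escalate only when needed) might be sketched as follows. This is an assumption-heavy illustration, not A2RAG's implementation: the stage functions, the `is_sufficient` check, and the toy fact sets are all hypothetical stand-ins for the paper's controller and retriever.

```python
def adaptive_retrieve(question, stages, is_sufficient):
    """Run retrieval stages from cheapest to most expensive, stopping early.

    stages: list of callables, each mapping a question to a list of evidence items.
    is_sufficient: callable deciding whether the gathered evidence answers the question.
    Returns the evidence and the number of stages that were actually used.
    """
    evidence = []
    for level, retrieve in enumerate(stages, start=1):
        evidence.extend(retrieve(question))
        if is_sufficient(question, evidence):
            return evidence, level
    return evidence, len(stages)

# Hypothetical toy setup: each stage contributes one fact, and each question
# needs a known set of facts before it counts as answered.
NEEDED = {"easy": {"a"}, "hard": {"a", "b", "c"}}
stages = [
    lambda q: ["a"],  # stand-in for a cheap local graph lookup
    lambda q: ["b"],  # stand-in for a multihop graph expansion
    lambda q: ["c"],  # stand-in for falling back to source (provenance) text
]

def is_sufficient(question, evidence):
    return NEEDED[question] <= set(evidence)

print(adaptive_retrieve("easy", stages, is_sufficient))
print(adaptive_retrieve("hard", stages, is_sufficient))
```

Easy queries stop after the first stage, so the costlier stages are paid for only on hard multihop questions. This is the cost argument the abstract makes.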


2601.18219 2026-06-05 physics.med-ph cs.CV cs.LG

Automated HER2 scoring with uncertainty quantification using lensfree holography and deep learning

Che-Yung Shen, Xilin Yang, Yuzhu Li, Leon Lenk, Aydogan Ozcan

Affiliations: Electrical and Computer Engineering Department, University of California, Los Angeles, CA, 90095, USA; Bioengineering Department, University of California, Los Angeles, CA, 90095, USA; California NanoSystems Institute (CNSI), University of California, Los Angeles, CA, 90095, USA; Department of Computer Science, University of California, Los Angeles, CA, 90095, USA

AI summary: This paper presents a compact, low-cost system based on lensfree holography and deep learning for automated HER2 scoring of immunohistochemically stained breast tissue sections. A Bayesian Monte Carlo dropout strategy improves diagnostic reliability, and the system achieves high accuracy in HER2 classification and scoring.

Comments 23 Pages, 6 Figures, 1 Table

Journal ref BME Frontiers, AAAS (2026)

DOI: 10.34133/bmef.0278

Abstract

Accurate assessment of human epidermal growth factor receptor 2 (HER2) expression is critical for breast cancer diagnosis, prognosis, and therapy selection; yet, most existing digital HER2 scoring methods rely on bulky and expensive optical systems. Here, we present a compact and cost-effective lensfree holography platform integrated with deep learning for automated HER2 scoring of immunohistochemically stained breast tissue sections. The system captures lensfree diffraction patterns of stained HER2 tissue sections under RGB laser illumination and acquires complex field information over a sample area of ~1,250 mm^2 at an effective throughput of ~84 mm^2 per minute. To enhance diagnostic reliability, we incorporated an uncertainty quantification strategy based on Bayesian Monte Carlo dropout, which provides autonomous uncertainty estimates for each prediction and supports reliable, robust HER2 scoring, with an overall correction rate of 30.4%. Using a blinded test set of 412 unique tissue samples, our approach achieved a testing accuracy of 84.9% for 4-class (0, 1+, 2+, 3+) HER2 classification and 94.8% for binary (0/1+ vs. 2+/3+) HER2 scoring with uncertainty quantification. Overall, this lensfree holography approach provides a practical pathway toward portable, high-throughput, and cost-effective HER2 scoring, particularly suited for resource-limited settings, where traditional digital pathology infrastructure is unavailable.


2505.03336 2026-06-05 cs.IR cs.AI cs.SI

Eliminating Out-of-Domain Recommendations in LLM-based Recommender Systems: A Unified View

Hao Liao, Jiwei Zhang, Jianxun Lian, Wensheng Lu, Mingqi Wu, Shuo Wang, Yong Zhang, Yitian Huang, Mingyang Zhou, Rui Mao

Affiliations: College of Computer Science and Software Engineering, Shenzhen University; Microsoft Research Asia

AI summary: This paper proposes RecLM, a framework that unifies three grounding approaches in a single architecture and systematically compares embedding-based retrieval, constrained generation, and discrete item-token generation, eliminating out-of-domain recommendations and improving recommendation accuracy.

Comments 20 pages

Abstract

Recommender systems based on Large Language Models (LLMs) are often plagued by hallucinations of out-of-domain (OOD) items. To address this, we propose RecLM, a unified framework that bridges the gap between retrieval and generation by instantiating three grounding paradigms under a single architecture: embedding-based retrieval, constrained generation over rewritten item titles, and discrete item-tokenizer generation. Using the same backbone LLM and prompts, we systematically compare these three views on public benchmarks. RecLM strictly eradicates OOD recommendations (OOD@10 = 0) across all variants, and the constrained generation variants RecLM-cgen and RecLM-token achieve overall state-of-the-art accuracy compared to both strong ID-based and LLM-based baselines. Our unified view provides a systematic basis for comparing three distinct paradigms to reduce item hallucinations, offering a practical framework to facilitate the application of LLMs to recommendation tasks. Source code is at https://github.com/microsoft/RecAI.
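One common way to guarantee that generated titles stay inside a catalog is prefix-constrained decoding over a trie of item titles. The sketch below shows that general technique under simplifying assumptions (whitespace tokens, greedy decoding, a hand-written `toy_score` in place of an LLM's next-token scores); it is not taken from the RecLM code base, and all names in it are hypothetical.

```python
END = "<eos>"

def build_trie(titles):
    """Nested-dict prefix trie over whitespace tokens; END marks a complete title."""
    trie = {}
    for title in titles:
        node = trie
        for token in title.split():
            node = node.setdefault(token, {})
        node[END] = {}
    return trie

def constrained_greedy_decode(score, trie):
    """Greedy decoding restricted to tokens that keep the output inside the catalog.

    score(prefix_tokens, token) stands in for a language model's next-token score.
    """
    node, output = trie, []
    while True:
        allowed = sorted(node)  # only continuations of some catalog title
        token = max(allowed, key=lambda t: score(output, t))
        if token == END:
            return " ".join(output)
        output.append(token)
        node = node[token]

catalog = ["The Matrix", "The Matrix Reloaded", "Toy Story"]
trie = build_trie(catalog)

# The toy scorer likes "Galaxy" best, but that token is never allowed,
# so it cannot produce an out-of-catalog title.
preference = {"Galaxy": 9.0, "Toy": 5.0, "Story": 4.0, END: 3.0, "The": 2.0, "Matrix": 1.0}

def toy_score(prefix, token):
    return preference.get(token, 0.0)

print(constrained_greedy_decode(toy_score, trie))
```

Because every decoding step is masked by the trie, the output is always a catalog title regardless of what the scorer prefers, which is the property a metric such as OOD@10 = 0 measures.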


2601.06056 2026-06-05 cs.CY cs.AI cs.CV

Using street view images and visual LLMs to predict heritage values for governance support: Risks, ethics, and policy implications

Tim Johansson, Mikael Mangold, Kristina Dabrock, Anna Donarelli, Ingrid Campo-Ruiz

Affiliations: RISE Research Institutes of Sweden AB; Malmö University; Forschungszentrum Jülich GmbH; Uppsala University

AI summary: This study uses street view images and visual large language models to assess heritage values of Swedish buildings in support of building renovation planning, and discusses problems with the method, potential improvements, and the ethical risks of using LLM-based data.

Abstract

During 2025 and 2026, the Energy Performance of Buildings Directive is being implemented in the European Union member states, requiring all member states to have National Building Renovation Plans. In Sweden, there is no comprehensive national register of buildings with heritage values. This is seen as a barrier for the analyses underlying the development of Building Renovation Plans by the involved Swedish authorities. The purpose of this research was to assist Swedish authorities in developing information on heritage values in the Swedish building stock. Buildings in street view images from all over Sweden (N=154 710) have been analysed using multimodal Large Language Models (LLMs) to assess visible aspects indicative of heritage value. Zero-shot predictions by LLMs were used as a basis for identifying buildings with potential heritage values for 5.0 million square meters of heated floor area. In this paper, the results of the predictions and lessons learned are presented and related to the development of the Swedish Building Renovation Plan as part of governance. The problems with the method and potential improvements are discussed. Risks with authorities' use of LLM-based data are addressed, with a focus on issues of transparency, error detection and sycophancy.


2510.02415 2026-06-05 physics.ao-ph cs.LG

The Equilibrium Response of Atmospheric Machine-Learning Models to Uniform Sea Surface Temperature Warming

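Bosong Zhang, Timothy M. Merlis

Affiliation: University of Washington

AI summary: This paper evaluates the climate response of several state-of-the-art machine learning models to uniform sea surface temperature warming and examines their promise and limitations for climate change applications.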

Abstract

Machine learning models for the global atmosphere that are capable of producing stable, multi-year simulations of Earth's climate have recently been developed. However, the ability of these ML models to generalize beyond the training distribution remains an open question. In this study, we evaluate the climate response of several state-of-the-art ML models (ACE2-ERA5, NeuralGCM, and cBottle) to a uniform sea surface temperature warming, a widely used benchmark for evaluating climate change. We assess each ML model's performance relative to a physics-based general circulation model (NOAA's Geophysical Fluid Dynamics Laboratory AM4) across key diagnostics, including surface air temperature, precipitation, temperature and wind profiles, and top-of-atmosphere radiation. While the ML models reproduce key aspects of the physical model response, particularly the response of precipitation, some exhibit notable departures from robust physical responses, including radiative responses and land region warming. Our results highlight the promise and current limitations of ML models for climate change applications and suggest that further improvements are needed for robust out-of-sample generalization.


2512.21335 2026-06-05 physics.med-ph cs.LG physics.app-ph physics.bio-ph

Autonomous Uncertainty Quantification for Computational Point-of-care Sensors

Artem Goncharov, Rajesh Ghosh, Hyou-Arm Joung, Dino Di Carlo, Aydogan Ozcan

Affiliations: Electrical & Computer Engineering Department; Bioengineering Department; California NanoSystems Institute (CNSI); Department of Surgery; University of California, Los Angeles

AI summary: This paper presents an autonomous uncertainty quantification technique that improves neural network-driven computational sensing systems for point-of-care diagnostics, using Monte Carlo dropout to increase diagnostic sensitivity and reliability.

Comments 18 Pages, 5 Figures

Journal ref ACS Nano (2026)

DOI: 10.1021/acsnano.6c06616

Abstract

Computational point-of-care (POC) sensors enable rapid, low-cost, and accessible diagnostics in emergency, remote and resource-limited areas that lack access to centralized medical facilities. These systems can utilize neural network-based algorithms to accurately infer a diagnosis from the signals generated by rapid diagnostic tests or sensors. However, neural network-based diagnostic models are subject to hallucinations and can produce erroneous predictions, posing a risk of misdiagnosis and inaccurate clinical decisions. To address this challenge, here we present an autonomous uncertainty quantification technique developed for POC diagnostics. As our testbed, we used a paper-based, computational vertical flow assay (xVFA) platform developed for rapid POC diagnosis of Lyme disease, the most prevalent tick-borne disease globally. The xVFA platform integrates a disposable paper-based assay, a handheld optical reader and a neural network-based inference algorithm, providing rapid and cost-effective Lyme disease diagnostics in under 20 min using only 20 uL of patient serum. By incorporating a Monte Carlo dropout (MCDO)-based uncertainty quantification approach into the diagnostics pipeline, we identified and excluded erroneous predictions with high uncertainty, significantly improving the sensitivity and reliability of the xVFA in an autonomous manner, without access to the ground truth diagnostic information of patients. Blinded testing using new patient samples demonstrated an increase in diagnostic sensitivity from 88.2% to 95.7%, indicating the effectiveness of MCDO-based uncertainty quantification in enhancing the robustness of neural network-driven computational POC sensing systems.
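The idea of excluding high-uncertainty predictions with Monte Carlo dropout can be sketched on a toy model. The code below is a simplified illustration, not the xVFA pipeline: it assumes a single logistic unit with dropout applied to its inputs (real networks usually apply dropout to hidden layers), and the weights, the dropout rate, and the 0.1 exclusion threshold are made-up values.

```python
import math
import random

def mc_dropout_predict(features, weights, bias=0.0, p_drop=0.2, passes=200, seed=0):
    """Mean and standard deviation of the predicted probability over stochastic passes."""
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    probs = []
    for _ in range(passes):
        z = bias
        for w, x in zip(weights, features):
            # Inverted dropout kept active at inference: drop each input with
            # probability p_drop and rescale the survivors by 1 / keep.
            if rng.random() < keep:
                z += w * x / keep
        probs.append(1.0 / (1.0 + math.exp(-z)))
    mean = sum(probs) / passes
    std = math.sqrt(sum((p - mean) ** 2 for p in probs) / passes)
    return mean, std

def triage(features, weights, max_std=0.1):
    """Return a label, or 'excluded' when the passes disagree too much."""
    mean, std = mc_dropout_predict(features, weights)
    if std > max_std:
        return "excluded"
    return "positive" if mean >= 0.5 else "negative"

weights = [1.0] * 8
consistent = [1.0] * 8                  # all signals point the same way
conflicting = [1.0] * 4 + [-1.0] * 4    # signals cancel; the outcome depends on the mask

print(triage(consistent, weights), triage(conflicting, weights))
```

The exclusion rule uses only the spread of the model's own stochastic predictions, so it needs no ground-truth labels, which matches the "autonomous" framing in the abstract.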


2512.20627 2026-06-05 cs.NI cs.AI

Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things

Shaowen Qin, Jianfeng Zeng, Haodong Guo, Xiaohuan Li, Jiawen Kang, Qian Chen

Affiliations: Guangxi University Key Laboratory of Intelligent Networking and Scenario System (School of Information and Communication, Guilin University of Electronic Technology); National Engineering Laboratory for Comprehensive Transportation Big Data Application Technology (Guangxi); School of Automation, Guangdong University of Technology; School of Architecture and Transportation Engineering, GUET

AI summary: This paper proposes FEIBN, a federated-evaluation-enhanced intent-based networking framework that uses large language models to translate user intents into structured strategy tuples. A strategy-similarity-aware federated learning mechanism improves training and communication efficiency, enabling more efficient strategy evaluation in Industrial Internet of Things environments.

Comments 12 pages with 7 figures and 4 tables

Abstract

Intent-Based Networking (IBN) offers a promising paradigm for intelligent and automated network control in Industrial Internet of Things (IIoT) environments by translating high-level user intents into executable network strategies. However, frequent strategy deployment and rollback are impractical due to tightly coupled workflows and high downtime costs, while node heterogeneity and privacy constraints further complicate centralized strategy evaluation. To address these challenges, we propose a Federated Evaluation Enhanced Intent-Based Networking framework (FEIBN), which leverages large language models (LLMs) to translate user intents into structured strategy tuples and employs federated learning to support distributed strategy evaluation. To improve training efficiency and reduce communication overhead, we design a Strategy Similarity Aware Federated Learning mechanism (SSAFL), which selects nodes relevant to the task based on strategy similarity and resource status, and triggers asynchronous model uploads only when local updates are significant. Experiments demonstrate that the proposed method improves model accuracy, accelerates convergence, and reduces communication cost compared with the baselines.
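The two SSAFL ingredients named in the abstract, similarity-based node selection and significance-triggered uploads, could look roughly like this. The abstract does not specify the similarity measure, the resource score, or the significance test, so cosine similarity, a resource score in [0, 1], and a relative L2 change threshold are assumptions made here for illustration, as are all function names and node names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two strategy vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_nodes(task_strategy, nodes, min_similarity=0.8, min_resource=0.5):
    """Keep nodes whose strategy resembles the task and that have spare resources.

    nodes maps a node name to (strategy_vector, resource_score in [0, 1]).
    """
    return [
        name
        for name, (strategy, resource) in nodes.items()
        if cosine(task_strategy, strategy) >= min_similarity and resource >= min_resource
    ]

def should_upload(last_uploaded, current, min_relative_change=0.05):
    """Upload only when the local model moved enough since the last upload."""
    delta = math.sqrt(sum((c - l) ** 2 for c, l in zip(current, last_uploaded)))
    base = math.sqrt(sum(l * l for l in last_uploaded)) or 1.0
    return delta / base >= min_relative_change

nodes = {
    "plc-1": ([1.0, 0.0, 1.0], 0.9),   # similar strategy, enough resources
    "plc-2": ([0.0, 1.0, 0.0], 0.9),   # unrelated strategy
    "plc-3": ([1.0, 0.0, 0.9], 0.2),   # similar strategy, but overloaded
}
print(select_nodes([1.0, 0.0, 1.0], nodes))
print(should_upload([1.0, 1.0], [1.001, 1.0]), should_upload([1.0, 1.0], [1.5, 1.0]))
```

Both filters reduce traffic: fewer nodes take part in a round, and each participant stays silent while its update is negligible.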


2506.11152 2026-06-05 q-bio.GN cs.LG q-bio.CB

HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data

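Hiren Madhu, João Felipe Rocha, Tinglin Huang, Siddharth Viswanath, Smita Krishnaswamy, Rex Ying

Affiliation: Yale University, USA

AI summary: This paper proposes HEIST, which models spatial transcriptomics and proteomics data as graphs and uses a hierarchical graph transformer to jointly model cells' spatial positions and gene expression, improving the understanding of cellular heterogeneity and responses to the microenvironment.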

Abstract

Single-cell transcriptomics and proteomics have become a great source for data-driven insights into biology, enabling the use of advanced deep learning methods to understand cellular heterogeneity and gene expression at the single-cell level. With the advent of spatial-omics data, we have the promise of characterizing cells within their tissue context as it provides both spatial coordinates and intra-cellular transcriptional or protein counts. Proteomics offers a complementary view by directly measuring proteins, which are the primary effectors of cellular function and key therapeutic targets. However, existing models ignore either the spatial information or the complex genetic and proteomic programs within cells. Thus they cannot infer how a cell's internal regulation adapts to microenvironmental cues. Furthermore, these models often utilize fixed gene vocabularies, hindering their generalizability to unseen genes. In this paper, we introduce HEIST, a hierarchical graph transformer foundation model for spatial transcriptomics and proteomics. HEIST models tissues as hierarchical graphs. The higher-level graph is a spatial cell graph, and each cell, in turn, is represented by its lower-level gene co-expression network graph. HEIST achieves this by performing both intra-level and cross-level message passing to utilize the hierarchy in its embeddings and can thus generalize to novel datatypes including spatial proteomics without retraining. HEIST is pretrained on 22.3M cells from 124 tissues across 15 organs using spatially-aware contrastive and masked autoencoding objectives. Unsupervised analysis of HEIST embeddings reveals spatially informed subpopulations missed by prior models. Downstream evaluations demonstrate generalizability to proteomics data and state-of-the-art performance in clinical outcome prediction, cell type annotation, and gene imputation across multiple technologies.

2512.03086 2026-06-05 cs.PL cs.AI cs.SE

Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation

Le Chen, Nuo Xu, Winson Chen, Bin Lei, Pei-Hung Lin, Dunzhi Zhou, Rajeev Thakur, Caiwen Ding, Ali Jannesari, Chunhua Liao

Affiliations: Argonne National Laboratory; University of Minnesota; Iowa State University; Lawrence Livermore National Laboratory

AI summary: This paper proposes a dialogue-based data generation method that uses a dual-LLM architecture to produce verified translations and multi-turn dialogues, improving LLM code translation in low-resource programming domains.

Abstract

Large language models (LLMs) have shown remarkable capabilities in code translation, yet their performance deteriorates in low-resource programming domains such as Fortran and emerging frameworks like CUDA, where high-quality parallel data are scarce. We present an automated dataset generation pipeline featuring a dual-LLM Questioner-Solver design that incorporates external knowledge from compilers and runtime feedback. Beyond traditional source-target code pair datasets, our approach additionally generates (1) verified translations with unit tests for assessing functional consistency and (2) multi-turn dialogues that capture the reasoning process behind translation refinement. Applied to Fortran-to-C++ and C++-to-CUDA, the pipeline yields 3.64k and 3.93k dialogues, respectively. Fine-tuning on this data yields dramatic improvements in functional correctness, boosting unit test success rates by over 56% on the challenging C++-to-CUDA task. We show that the generated data enables a 7B open-weight model to significantly outperform larger proprietary systems on key metrics like compilation success.

2511.16111 2026-06-05 stat.ML cs.LG math.SP

Rotation-Parameterized Graph Fractional Fourier Transform: Definition, Properties, and Optimal Filtering

Feiyue Zhao, Mingzhi Wang, Yangfan He, Zhichao Zhang

Affiliations: School of Mathematics and Statistics, Nanjing University of Information Science and Technology; School of Communication and Artificial Intelligence, Nanjing Institute of Technology; School of Integrated Circuits, Nanjing Institute of Technology; Jiangsu Province Engineering Research Center of IntelliSense Technology and System; Hubei Key Laboratory of Applied Mathematics, Hubei University; Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai Jiao Tong University

AI summary: This paper proposes the rotation-parameterized graph fractional Fourier transform (RP-GFRFT), which unifies fractional-order and rotation-parameterized spectral analysis to address the lack of rotation-based basis control and the zero-angle degeneracy problem in existing methods, improving denoising, reconstruction, and feature preservation in graph signal processing.

Abstract

Graph spectral representations are fundamental in graph signal processing, providing a rigorous framework for analyzing graph-structured data. The graph fractional Fourier transform (GFRFT) extends the graph Fourier transform (GFT) through a fractional-order parameter, enabling flexible spectral analysis with mathematical consistency. The angular graph Fourier transform (AGFT) further introduces angular control by rotating GFT eigenvectors; however, existing constructions may fail to reduce exactly to the GFT at zero angle, weakening theoretical consistency and interpretability. To address these complementary limitations, namely the lack of rotation-based basis control in GFRFT and the defective zero-angle degeneracy of AGFT, this paper proposes the rotation-parameterized graph fractional Fourier transform (RP-GFRFT), which unifies fractional order and rotation-parameterized spectral analysis. A degeneracy-preserving rotation matrix family is constructed to guarantee exact GFT reduction at zero angle. Two RP-GFRFT variants, I-RP-GFRFT and II-RP-GFRFT, are then formulated, with theoretical analyses confirming their unitarity, invertibility, reduction behavior, and smooth parameter dependence. The fractional order and rotation angle are jointly optimized for adaptive graph spectral filtering. Experiments on real-world signals, images, and point clouds demonstrate that RP-GFRFT improves denoising accuracy, reconstruction quality, and feature preservation over GFRFT, AGFT, and representative filtering baselines.
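For orientation, the transforms named in the abstract can be written as follows in one formulation commonly used in the GFRFT literature. This is a sketch under assumptions: the graph shift operator $\mathbf{Z}$ is taken to be diagonalizable, and all symbols ($\mathbf{Z}$, $\mathbf{V}$, $\mathbf{F}$, $\mathbf{P}$, $\mathbf{J}$, $a$, $\mathbf{R}(\theta)$) are illustrative notation, not taken from the paper.

```latex
\begin{align}
\mathbf{Z} &= \mathbf{V}\,\boldsymbol{\Lambda}\,\mathbf{V}^{-1},
  & \mathbf{F} &= \mathbf{V}^{-1}
  && \text{(GFT matrix, } \hat{\mathbf{x}} = \mathbf{F}\mathbf{x}\text{)} \\
\mathbf{F} &= \mathbf{P}\,\mathbf{J}\,\mathbf{P}^{-1},
  & \mathbf{F}^{a} &= \mathbf{P}\,\mathbf{J}^{a}\,\mathbf{P}^{-1}
  && \text{(GFRFT of order } a\text{)} \\
\mathbf{F}^{0} &= \mathbf{I},
  & \mathbf{F}^{1} &= \mathbf{F}
  && \text{(reduction at } a = 0 \text{ and } a = 1\text{)}
\end{align}
```

In this notation, the exact zero-angle reduction described in the abstract would correspond to a rotation family satisfying $\mathbf{R}(0) = \mathbf{I}$, so that rotating the eigenvector basis by $\mathbf{R}(\theta)$ leaves the GFT unchanged at $\theta = 0$. How the two RP-GFRFT variants combine $\mathbf{R}(\theta)$ with $\mathbf{F}^{a}$ is not stated in the abstract and is not assumed here.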

2503.01734 2026-06-05 cs.CR cs.AI

Adversarial Agents: Black-Box Evasion Attacks with Reinforcement Learning

Kyle Domico, Jean-Charles Noirot Ferrand, Ryan Sheatsley, Eric Pauley, Josiah Hanna, Patrick McDaniel

Affiliations: University of Wisconsin-Madison; Virginia Tech

AI summary: This paper proposes a reinforcement-learning-based adversarial attack method that learns a new class of algorithms for generating adversarial samples, improving attack efficiency and success rate, and demonstrates its performance on image classification benchmarks.

Comments Accepted to the Findings of CVPR 2026

Abstract

Attacks on machine learning models have been extensively studied through stateless optimization. In this paper, we demonstrate how a reinforcement learning (RL) agent can learn a new class of attack algorithms that generate adversarial samples. Unlike traditional adversarial machine learning (AML) methods that craft adversarial samples independently, our RL-based approach retains and exploits past attack experience to improve the effectiveness and efficiency of future attacks. We formulate adversarial sample generation as a Markov Decision Process and evaluate RL's ability to (a) learn effective and efficient attack strategies and (b) compete with state-of-the-art AML. On two image classification benchmarks, our agent increases attack success rate by up to 13.2% and decreases the average number of victim model queries per attack by up to 16.9% from the start to the end of training. In a head-to-head comparison with state-of-the-art image attacks, our approach enables an adversary to generate adversarial samples with 17% more success on unseen inputs post-training. From a security perspective, this work demonstrates a powerful new attack vector that uses RL to train agents that attack ML models efficiently and at scale.

2510.15814 2026-06-05 stat.ML cs.LG

On Universality of Deep Equivariant Networks

Marco Pacini, Mircea Petrache, Bruno Lepri, Shubhendu Trivedi, Robin Walters

Affiliations: University of Trento; Fondazione Bruno Kessler; PUC Chile; Northeastern University

AI summary: This paper studies universality of equivariant neural networks. It shows that, under separation constraints, adding a fully connected readout layer allows approximation of continuous functions, introduces the sharper criterion of entry-wise separability, and proves that sufficient depth or appropriate readout layers make equivariant networks universal within the entry-wise separable regime.

Comments Published as a conference paper at ICLR 2026

Journal ref International Conference on Learning Representations (ICLR), 2026

Abstract

Universality results for equivariant neural networks remain rare. Those that do exist typically hold only in restrictive settings: either they rely on regular or higher-order tensor representations, leading to impractically high-dimensional hidden spaces, or they target specialized architectures, often confined to the invariant setting. This work develops a more general account. For invariant networks, we establish a universality theorem under separation constraints, showing that the addition of a fully connected readout layer secures approximation within the class of separation-constrained continuous functions. For equivariant networks, where results are even scarcer, we demonstrate that standard separability notions are inadequate and introduce the sharper criterion of $\textit{entry-wise separability}$. We show that with sufficient depth or with the addition of appropriate readout layers, equivariant networks attain universality within the entry-wise separable regime. Together with prior results showing the failure of universality for shallow models, our findings identify depth and readout layers as a decisive mechanism for universality, additionally offering a unified perspective that subsumes and extends earlier specialized results.
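As a reminder of what equivariance and invariance mean in the simplest case, here is a minimal sketch for the permutation group acting on a list of numbers. The layer `equivariant_layer`, the readout `invariant_readout`, and the coefficients `a` and `b` are illustrative assumptions and are not the architectures analysed in the paper.

```python
def equivariant_layer(x, a=0.5, b=-2.0):
    """y_i = a * x_i + b * mean(x): reordering the input reorders the output the same way."""
    mean = sum(x) / len(x)
    return [a * xi + b * mean for xi in x]


def invariant_readout(x):
    """Summing the entries discards their order, so the result is permutation invariant."""
    return sum(x)


def permute(x, perm):
    """Apply a permutation given as a list of source indices."""
    return [x[p] for p in perm]
```

Checking `equivariant_layer(permute(x, perm))` against `permute(equivariant_layer(x), perm)` for a few inputs is a quick sanity test of equivariance, up to floating-point rounding.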

2510.11974 2026-06-05 cs.CR cs.AI

CTIConnect: A Benchmark for Retrieval-Augmented LLMs over Heterogeneous Cyber Threat Intelligence

Yutong Cheng, Yang Liu, Changze Li, Dawn Song, Peng Gao

Affiliations: Virginia Tech Department of Computer Science; University of California, Berkeley Department of Computer Science

AI summary: This paper presents CTIConnect, a benchmark for evaluating retrieval-augmented LLMs on cyber threat intelligence tasks. It integrates five heterogeneous data sources into 1,860 expert-verified QA pairs, shows that the cross-source semantic gap differs across task categories and that the appropriate retrieval strategies and the performance bottlenecks shift accordingly, and demonstrates the advantage of domain-specific strategies.

Comments Accepted to KDD 2026

Abstract

Cyber Threat Intelligence (CTI) is foundational to modern cybersecurity, enabling organizations to proactively defend against evolving threats. However, the sheer volume and heterogeneity of CTI data, spanning structured knowledge bases (CVE, CWE, CAPEC, MITRE ATT&CK) and unstructured threat reports, far exceed the capacity of manual analysis. The strong contextual understanding and reasoning of Large Language Models (LLMs) have driven growing interest in applying them to CTI tasks. Yet no existing benchmark evaluates LLMs in a retrieval-augmented setting with a proper evaluation harness that grants access to the heterogeneous domain knowledge sources analysts rely on in practice. To address this gap, we present CTIConnect, a benchmark for systematically evaluating retrieval-augmented LLMs across the CTI task landscape. We construct a unified evaluation environment integrating five heterogeneous CTI sources into 1,860 expert-verified QA pairs spanning nine tasks across three categories: Entity Linking, Multi-Document Synthesis, and Entity Attribution. Extensive experiments on ten state-of-the-art LLMs reveal that the cross-source semantic gap manifests differently across task categories, demanding fundamentally different retrieval strategies, and that the performance bottleneck shifts between retrieval infrastructure and evidence utilization depending on the task. Our domain-specific strategies further outperform stronger general-purpose retrieval paradigms (retrieve-then-rerank, IRCoT), showing that closing this gap requires structural interventions rather than generic retrieval improvements. These findings hold across all ten LLMs, remain consistent on the full benchmark, and stay stable under temporal splits spanning 2008-2025. Together, they provide actionable guidance for designing scalable retrieval architectures over heterogeneous CTI ecosystems.

2510.05709 2026-06-05 cs.CR cs.AI cs.CL

Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering

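Mary Llewellyn, Isobel Thornton, James Bishop, Annie Gray

Affiliations: University of Cambridge

AI summary: This paper proposes a Bayesian hierarchical model that uses embedding-space clustering to correct prompt dependence in LLM benchmarks, providing more robust performance metrics when data are limited and markedly improved performance metrics on an adversarial robustness benchmark.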

Comments Accepted to the 1st Workshop on Combining Theory and Benchmarks, CTB@ICML 2026, Seoul, South Korea

2509.25450 2026-06-05 cs.CE cs.AI cs.NA math.NA physics.comp-ph

Multi-patch isogeometric neural solver for partial differential equations on computer-aided design domains

Moritz von Tresckow, Ion Gabriel Ion, Dimitrios Loukrezis

Affiliations: Institute for Accelerator Science and Electromagnetic Fields, Technische Universität Darmstadt; Terra Quantum AG; Scientific Computing, Centrum Wiskunde & Informatica

AI summary: This paper presents a computational framework that combines physics-informed neural networks with multi-patch isogeometric analysis to solve partial differential equations on complex computer-aided design geometries. Patch-local neural networks operate on the reference domain of isogeometric analysis, a custom output layer imposes Dirichlet boundary conditions strongly, and dedicated interface neural networks enforce solution conformity across interfaces between non-uniform rational B-spline patches. Training minimizes the energy functional derived from the weak form of the PDE. The method is validated on a 2D magnetostatics model of a quadrupole magnet and a 3D nonlinear solid and contact mechanics model of a mechanical holder, with results that agree closely with high-fidelity finite element reference solutions.

Comments 33 pages, 15 figures

DOI: 10.1007/s00366-026-02351-z

Abstract

This work develops a computational framework that combines physics-informed neural networks with multi-patch isogeometric analysis to solve partial differential equations on complex computer-aided design geometries. The method utilizes patch-local neural networks that operate on the reference domain of isogeometric analysis. A custom output layer enables the strong imposition of Dirichlet boundary conditions. Solution conformity across interfaces between non-uniform rational B-spline patches is enforced using dedicated interface neural networks. Training is performed within a variational framework by minimizing the energy functional derived from the weak form of the partial differential equation. The effectiveness of the suggested method is demonstrated on two highly non-trivial and practically relevant use cases, namely, a 2D magnetostatics model of a quadrupole magnet and a 3D nonlinear solid and contact mechanics model of a mechanical holder. The results show excellent agreement with reference solutions obtained with high-fidelity finite element solvers, thus highlighting the potential of the suggested neural solver to tackle complex engineering problems given the corresponding computer-aided design models.
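One standard way to impose Dirichlet conditions strongly through an output layer is to write the solution as a boundary lift plus a factor that vanishes on the boundary times the raw network output. The sketch below shows this on the 1D interval [0, 1]. The function names, the linear lift, the factor `x * (1 - x)`, and the `tanh` stand-in for a network are all assumptions for illustration; the abstract does not specify the paper's actual output layer.

```python
import math


def boundary_lift(x, u_left, u_right):
    """Linear function that matches the Dirichlet data at x = 0 and x = 1."""
    return (1.0 - x) * u_left + x * u_right


def distance_factor(x):
    """Vanishes exactly at the boundary points x = 0 and x = 1."""
    return x * (1.0 - x)


def raw_network(x, weights=(1.7, -0.4, 2.3)):
    """Stand-in for a trained network (assumption); any finite-valued function works."""
    w0, w1, w2 = weights
    return w0 * math.tanh(w1 * x + w2)


def constrained_output(x, u_left=2.0, u_right=-1.0):
    """u(x) = lift(x) + d(x) * N(x): the boundary values hold for any network weights."""
    return boundary_lift(x, u_left, u_right) + distance_factor(x) * raw_network(x)
```

Because the boundary values are satisfied by construction, no boundary penalty term is needed in the training loss for this kind of ansatz.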

2509.25397 2026-06-05 cs.SE cs.AI cs.LG

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Johan Linåker, Cailean Osborne, Jennifer Ding, Ben Burtenshaw

Affiliations: RISE Research Institutes of Sweden AB; University of Oxford

AI summary: By analysing open collaboration practices across the development and reuse lifecycle of 14 open LLM projects, this paper reveals the diversity of collaboration methods, motivations, and governance structures, and shows that openness in open source AI is not a uniform property but an emergent outcome of how collaboration is organised across interconnected artefact domains, lifecycle stages, and institutional contexts.

Comments In submission

Abstract

The proliferation of open large language models (LLMs) is fostering a vibrant ecosystem in artificial intelligence (AI). However, the methods of collaboration used to develop open LLMs, both before and after their public release, have not yet been systematically studied, limiting our understanding of how open LLM projects are initiated, organised, and governed, as well as the opportunities to further foster this ecosystem. We address this gap through an exploratory analysis of open collaboration throughout the development and reuse lifecycle of open LLMs, drawing on semi-structured interviews with the developers of 14 diverse open LLM projects. These collaborations span multiple artefact domains -- including models, data, software, evaluation, compute, and community engagement -- each enabling distinct forms of participation and involving different stakeholders. Participation evolves across the LLM development lifecycle, shifting from concentrated, selective engagement in the early stages to broader, distributed participation after model release. Open LLM developers are driven by a variety of social, economic, and technological motivations, ranging from democratising access to AI and promoting open science to building regional ecosystems and expanding language representation. These dynamics are coordinated through a range of governance structures, typically formal and professionalised to varying degrees, ranging from centralised company-led efforts to decentralised grassroots initiatives. We synthesise our findings in a conceptual model of open collaboration in open LLM ecosystems, provide recommendations for practice, and conclude that openness in open source AI is not a uniform property but an emergent outcome of how collaboration is organised across interconnected artefact domains, lifecycle stages, and institutional contexts.

2509.20324 2026-06-05 cs.CR cs.AI

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

Atousa Arzanipour, Rouzbeh Behnia, Reza Ebrahimi, Kaushik Dutta

Affiliations: University of California, Berkeley

AI summary: This paper studies security and privacy in RAG systems and proposes what it describes as the first formal threat model, defining attack vectors such as document-level membership inference and data poisoning to improve understanding of privacy and security in RAG systems.

Comments Published at the 5th ICDM Workshop in November 2025

Journal ref 2025 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1387-1394, 2025

DOI: 10.1109/ICDMW69685.2025.00165

Abstract

Retrieval-Augmented Generation (RAG) is an emerging approach in natural language processing that combines large language models (LLMs) with external document retrieval to produce more accurate and grounded responses. While RAG has shown strong potential in reducing hallucinations and improving factual consistency, it also introduces new privacy and security challenges that differ from those faced by traditional LLMs. Existing research has demonstrated that LLMs can leak sensitive information through training data memorization or adversarial prompts, and RAG systems inherit many of these vulnerabilities. At the same time, reliance of RAG on an external knowledge base opens new attack surfaces, including the potential for leaking information about the presence or content of retrieved documents, or for injecting malicious content to manipulate model behavior. Despite these risks, there is currently no formal framework that defines the threat landscape for RAG systems. In this paper, we address a critical gap in the literature by proposing, to the best of our knowledge, the first formal threat model for retrieval-RAG systems. We introduce a structured taxonomy of adversary types based on their access to model components and data, and we formally define key threat vectors such as document-level membership inference and data poisoning, which pose serious privacy and integrity risks in real-world deployments. By establishing formal definitions and attack models, our work lays the foundation for a more rigorous and principled understanding of privacy and security in RAG systems.

2509.02971 2026-06-05 stat.ML cs.LG cs.NA math.NA math.PR

Scale-Adaptive Generative Flows for Multiscale Scientific Data

Yifan Chen, Eric Vanden-Eijnden

Affiliations: Department of Mathematics, University of California, Los Angeles; Machine Learning Lab, Capital Fund Management; Courant Institute, New York University

AI summary: This paper proposes a generative modeling approach for multiscale scientific data that designs noise distributions and interpolation schedules to address numerical challenges on data with multiscale Fourier spectra, improving the quality and efficiency of generated samples.

Abstract

Flow-based generative models can face numerical challenges on scientific data with multiscale Fourier spectra, often producing large errors at fine scales. We approach this problem within the flow matching and stochastic interpolants framework, through the principled design of noise distributions and interpolation schedules. Working in function space ensures that the generative model remains well defined as the resolution is refined; the Lipschitz regularity of the drift is important to both this function-space well-posedness and the integration cost at fixed resolution. The central observation is that the noise should be at least as rough as the target distribution -- measured by Fourier-spectrum decay -- in order to keep the Lipschitz constant finite. For Gaussian and near-Gaussian targets whose fine-scale structure is known, matched-spectrum noise improves numerical efficiency over standard white-noise choices. For more complex non-Gaussian targets, matched-spectrum noise may not be sufficient, and we propose scale-adaptive interpolation schedules to mitigate the terminal-time stiffness that arises when the noise is rougher than the data. Numerical experiments on synthetic Gaussian random fields and on invariant measures of the stochastic Allen--Cahn and Navier--Stokes equations illustrate the approach and demonstrate its ability to generate high-fidelity samples at lower computational cost than traditional approaches.
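The roles of spectrum decay and of the interpolant can be illustrated with a small sketch. It draws periodic 1D Gaussian fields whose k-th Fourier mode has standard deviation `k ** (-decay)` (a smaller `decay` gives a rougher field) and forms a linear interpolant between a noise sample and a data sample. The function names, the power-law spectrum, and the linear schedule are assumptions chosen for illustration, not the paper's construction.

```python
import math
import random


def sample_field(n_grid, n_modes, decay, rng):
    """Sample a periodic 1D Gaussian field with mode amplitudes scaled by k ** (-decay)."""
    coeffs = [
        (rng.gauss(0, 1) * k ** (-decay), rng.gauss(0, 1) * k ** (-decay))
        for k in range(1, n_modes + 1)
    ]
    field = []
    for i in range(n_grid):
        x = i / n_grid
        value = sum(
            a * math.cos(2 * math.pi * k * x) + b * math.sin(2 * math.pi * k * x)
            for k, (a, b) in enumerate(coeffs, start=1)
        )
        field.append(value)
    return field


def interpolate(noise, data, t):
    """Linear interpolant (assumed schedule): pure noise at t = 0, pure data at t = 1."""
    return [(1.0 - t) * z + t * x for z, x in zip(noise, data)]
```

For example, `sample_field(64, 16, 0.5, rng)` is rougher than `sample_field(64, 16, 2.0, rng)`; in the abstract's terms, the first would be a candidate noise for a target with the second kind of spectrum.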

2508.20693 2026-06-05 cs.DL cs.CL

Leveraging Large Language Models for Generating Research Topic Ontologies: A Multi-Disciplinary Study

Tanay Aggarwal, Angelo Salatino, Francesco Osborne, Enrico Motta

Affiliations: Knowledge Media Institute, The Open University; The Open University; University of Milano Bicocca; Department of Business and Law, University of Milano Bicocca

AI summary: This paper studies how well large language models identify semantic relationships among research topics in three disciplines (biomedicine, physics, and engineering), evaluating them under zero-shot prompting, chain-of-thought prompting, and fine-tuning on existing ontologies, and introduces the PEM-Rel-8K dataset, which is also used to assess cross-discipline transfer.

Abstract

Ontologies and taxonomies of research fields are critical for managing and organising scientific knowledge, as they facilitate efficient classification, dissemination and retrieval of information. However, the creation and maintenance of such ontologies are expensive and time-consuming tasks, usually requiring the coordinated effort of multiple domain experts. Consequently, ontologies in this space often exhibit uneven coverage across different disciplines, limited inter-discipline connectivity, and infrequent updating cycles. In this study, we investigate the capability of several large language models to identify semantic relationships among research topics within three academic disciplines: biomedicine, physics, and engineering. The models were evaluated under three distinct conditions: zero-shot prompting, chain-of-thought prompting, and fine-tuning on existing ontologies. Additionally, we assessed the cross-discipline transferability of fine-tuned models by measuring their performance when trained in one discipline and subsequently applied to a different one. To support this analysis, we introduce PEM-Rel-8K, a novel dataset consisting of over 8,000 relationships extracted from the most widely adopted taxonomies in the three disciplines considered in this study: MeSH, PhySH, and IEEE. Our experiments demonstrate that fine-tuning LLMs on PEM-Rel-8K yields excellent performance across all disciplines.

2508.19006 2026-06-05 q-fin.PR cs.LG econ.EM q-fin.CP

Is attention truly all we need? An empirical study of asset pricing in pretrained RNN sparse and global attention models

Shanyan Lai

Affiliations: Department of Economics and Related Studies, University of York

AI summary: This paper studies pretrained RNN attention models for asset pricing, examining how attention mechanisms improve the capture of temporal dependency and long-term memory, and how stable the models are under different market conditions.

Comments 72 pages including appendix

Abstract

This study investigates the pre-trained RNN attention models with the mainstream attention mechanisms, such as additive attention, Luong's three attentions, global self-attention and sliding window sparse attention, for the empirical asset pricing research on the top 420 large-cap US stocks. This is the first paper on the large-scale state-of-the-art (SOTA) attention mechanisms applied in the asset pricing context. They overcome the limitations of the traditional machine learning-based asset pricing, such as mis-capturing the temporal dependency and short memory. Moreover, the enforced causal masks in the attention mechanisms address the future data leaking issue ignored by the more advanced attention-based models, such as the classic Transformer. The proposed attention models also consider the temporal sparsity characteristic of asset pricing data and mitigate potential overfitting issues by deploying the simplified model structures. This provides some insights for future empirical economic research. All models are examined in three periods, which cover pre-COVID-19, COVID-19 and one year post-COVID-19, for testing the stability of these models under extreme market conditions. The study finds that in value-weighted portfolio back testing, the global self-attention model and the sliding window sparse attention model exhibit excellent capabilities in deriving the absolute returns and hedging downside risks, while they achieve an annualized Sortino ratio of 2.0 and 1.80 respectively in the period with COVID-19 in the static transaction cost scenario. Moreover, the sliding window sparse attention model performs more stably than the global self-attention model from the perspective of absolute portfolio returns with respect to the size of stocks' market capitalization.
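The abstract credits enforced causal masks with preventing future-data leakage and lists sliding-window sparse attention among the mechanisms studied. A minimal sketch of both kinds of mask is given below. The function name `causal_mask` and the `window` parameter are hypothetical, and how the paper's models apply such masks is not stated in the abstract.

```python
def causal_mask(seq_len, window=None):
    """Return mask[i][j], True when time step i may attend to time step j.

    With window=None this is a plain causal mask (no access to future steps).
    With an integer window, each step sees only itself and the previous
    window - 1 steps, which gives a sliding-window sparse pattern.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # never attend to future time steps
            if window is not None:
                visible = visible and (i - j) < window  # drop steps that are too old
            row.append(visible)
        mask.append(row)
    return mask
```

In an attention layer, scores at positions where the mask is `False` would typically be set to negative infinity before the softmax, so those positions receive zero weight.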

2508.10555 2026-06-05 physics.comp-ph cs.CE cs.LG

A Differentiable Framework for Full and Phaseless Data Inversion Using Neural Implicit Contrast-Source Representation

Haoran Sun, Daoqi Liu, Hongyu Zhou, Maokun Li, Shenheng Xu, Fan Yang

Affiliations: Department of Electronic Engineering, Beijing National Research Center for Information Science and Technology (BNRist), and State Key Laboratory of Space Network and Communications

AI summary: This paper proposes a differentiable framework for full and phaseless data inversion based on a neural implicit contrast-source representation. A lightweight residual multilayer perceptron serves as a continuous neural field, improving inversion accuracy and robustness, and the state and data equations are combined with total-variation regularization into a differentiable objective that enables end-to-end differentiable optimization.

Abstract

In this study, we extend the contrast source inversion to a fully differentiable, unsupervised framework based on a neural implicit representation of the contrast source. Specifically, instead of a pixel-wise discrete representation, the contrast source is parameterized by a lightweight residual multilayer perceptron (ResMLP) as a continuous neural field conditioned on spatial coordinates and transmitter settings. This continuous parameterization provides a more flexible representation of the contrast source and improves reconstruction accuracy and robustness under noisy measurements. Building on this representation, the state equation and data equation are combined with total-variation regularization to form a differentiable objective function. By reformulating the VIE-constrained inversion as an end-to-end differentiable optimization problem, the network parameters and the medium contrast are jointly optimized via automatic differentiation. Within the same framework, both full and phaseless data inversion are accommodated by only modifying the data misfit function. Numerical experiments demonstrate that this scheme yields higher reconstruction accuracy and robustness than conventional CSI across a range of noise levels and measurement settings. The continuous neural field further enables super-resolution inference at resolutions finer than the training grid, decoupling inversion cost from reconstruction fidelity. Ablation studies and comparisons with alternative neural architectures further confirm that the contrast source parameterization and VIE-based formulation are both essential to the observed improvements.
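The statement that full and phaseless inversion differ only in the data misfit can be illustrated with two small functions. The sketch assumes the full-data misfit compares complex field samples and the phaseless misfit compares intensities (squared magnitudes); the function names and these specific forms are assumptions, since the abstract does not give the paper's exact misfit definitions.

```python
import cmath


def full_misfit(measured, predicted):
    """Sum of squared complex residuals: sensitive to both amplitude and phase."""
    return sum(abs(m - p) ** 2 for m, p in zip(measured, predicted))


def phaseless_misfit(measured_intensity, predicted):
    """Sum of squared intensity residuals: depends on the prediction only through |p|^2."""
    return sum((i - abs(p) ** 2) ** 2 for i, p in zip(measured_intensity, predicted))


def rotate_phase(field, angle):
    """Multiply every sample by exp(1j * angle), a global phase shift."""
    return [f * cmath.exp(1j * angle) for f in field]
```

A global phase shift of the predicted field changes `full_misfit` but leaves `phaseless_misfit` unchanged, which is one way to see why phaseless data carry less information about the field.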

2508.00775 2026-06-05 eess.SY cs.LG cs.SY math.OC

Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms

Andrea Martin, Ian R. Manchester, Luca Furieri

Affiliations: School of Electrical Engineering and Computer Science, and Digital Futures, KTH Royal Institute of Technology, Sweden; Australian Centre for Robotics and School of Aerospace, Mechanical and Mechatronic Engineering, The University of Sydney, Australia; Department of Engineering Sciences, University of Oxford, United Kingdom

AI summary: This paper studies how to improve an algorithm's average-case performance over a specific distribution of problem instances. It gives a complete characterization of linearly convergent algorithms, showing that they can be written as a baseline algorithm plus trainable, exponentially decaying modifications, and illustrates the approach on nonconvex gradient-dominated functions, strongly convex functions, and optimization over polyhedral feasible sets.

Abstract

The design of many classical optimization algorithms is driven by the certification of linear convergence rates over classes of optimization problems. In this paper, we consider the problem of improving the average-case performance of an algorithm over a specific distribution of problem instances. While this task can be tackled by embedding trainable components into the algorithm updates, a key challenge is to preserve worst-case guarantees across the entire problem class. For classes of composite optimization problems, we show that all linearly convergent algorithms can be parametrized in terms of a baseline linearly convergent algorithm and a set of trainable, exponentially decaying modifications to its update rule; crucially, this parametrization excludes all, and only, the algorithms that do not converge linearly. Our results apply to improving the average-case performance of classical algorithms such as gradient descent for nonconvex, gradient-dominated functions; Nesterov's accelerated method for smooth, strongly convex functions; and projected gradient methods for optimization over polyhedral feasible sets. We illustrate how our characterization can be used for learning to optimize with linear convergence and feasibility guarantees. Numerical results showcase benefits over classical optimizers when solving ill-conditioned systems of linear equations and running a model predictive control scheme on a linear dynamical system.
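The idea of a baseline algorithm plus an exponentially decaying modification can be sketched on a toy quadratic. Below, plain gradient descent is the baseline and a bounded stand-in for a trainable term is added with weight `decay ** k`. The objective, the `tanh` stand-in, and all names and constants are assumptions for illustration only, not the parametrization derived in the paper.

```python
import math

CURVATURES = (1.0, 10.0)  # f(x) = 0.5 * sum(c_i * x_i ** 2), minimized at the origin


def gradient(x):
    """Gradient of the toy quadratic objective."""
    return [c * xi for c, xi in zip(CURVATURES, x)]


def learned_direction(x, weights=(0.9, -1.3)):
    """Stand-in for a trainable component (assumption); tanh keeps entries in [-1, 1]."""
    return [math.tanh(w * xi) for w, xi in zip(weights, x)]


def run(x0, steps, step_size=0.1, decay=0.8, use_learned=True):
    """Gradient descent plus a learned term whose weight shrinks as decay ** k."""
    x = list(x0)
    for k in range(steps):
        g = gradient(x)
        v = learned_direction(x) if use_learned else [0.0] * len(x)
        scale = decay ** k  # exponentially decaying weight on the learned term
        x = [xi - step_size * gi + scale * vi for xi, gi, vi in zip(x, g, v)]
    return x
```

Because the added term is bounded and its weight decays geometrically, the iterates in this toy setting still approach the minimizer at a geometric rate, whatever the value of `weights`; training would then aim to pick weights that speed up the early iterations on typical problem instances.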



Beyond Means: Topological Causal Effects under Persistent-Homology Ignorability

"What if she doesn't feel the same?" What Happens When We Ask AI for Relationship Advice

CUCo: An Agentic Framework for Compute and Communication Co-design

General Synthetic-Powered Inference

The Relative Instability of Model Comparison with Cross-validation

HypRAG: Hyperbolic Dense Retrieval for Retrieval Augmented Generation

Minimax optimal differentially private synthetic data for smooth queries

Grounded but Misleading: Evaluating Semantic Alignment in AI-Generated Security Explanations

A2RAG: Adaptive Agentic Graph Retrieval for Cost-Aware and Reliable Reasoning

Automated HER2 scoring with uncertainty quantification using lensfree holography and deep learning

Eliminating Out-of-Domain Recommendations in LLM-based Recommender Systems: A Unified View

Using street view images and visual LLMs to predict heritage values for governance support: Risks, ethics, and policy implications

The Equilibrium Response of Atmospheric Machine-Learning Models to Uniform Sea Surface Temperature Warming

Autonomous Uncertainty Quantification for Computational Point-of-care Sensors

Efficient Asynchronous Federated Evaluation with Strategy Similarity Awareness for Intent-Based Networking in Industrial Internet of Things

HEIST: A Graph Foundation Model for Spatial Transcriptomics and Proteomics Data

Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation

Rotation-Parameterized Graph Fractional Fourier Transform: Definition, Properties, and Optimal Filtering

Adversarial Agents: Black-Box Evasion Attacks with Reinforcement Learning

On Universality of Deep Equivariant Networks

CTIConnect: A Benchmark for Retrieval-Augmented LLMs over Heterogeneous Cyber Threat Intelligence

Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering

Multi-patch isogeometric neural solver for partial differential equations on computer-aided design domains

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

RAG Security and Privacy: Formalizing the Threat Model and Attack Surface

Scale-Adaptive Generative Flows for Multiscale Scientific Data

Leveraging Large Language Models for Generating Research Topic Ontologies: A Multi-Disciplinary Study

Is attention truly all we need? An empirical study of asset pricing in pretrained RNN sparse and global attention models

A Differentiable Framework for Full and Phaseless Data Inversion Using Neural Implicit Contrast-Source Representation

Learning to optimize with guarantees: a complete characterization of linearly convergent algorithms