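arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.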

2505.04907 2026-06-15 cs.LG

VaCDA: Variational Contrastive Alignment-based Scalable Human Activity Recognition

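Soham Khisa, Avijoy Chakma

Affiliations: Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology; Department of Computer Science, Bowie State University

AI summary: This paper proposes the VaCDA framework, which combines variational autoencoders and contrastive learning to address data heterogeneity in multi-source domain adaptation. It is evaluated in cross-person, cross-position, and cross-device scenarios and outperforms the baselines in the latter two.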

DOI: 10.1109/ICMLA66185.2025.00160

Abstract

Technological advancements have led to the rise of wearable devices with sensors that continuously monitor user activities, generating vast amounts of unlabeled data. This data is challenging to interpret, and manual annotation is labor-intensive and error-prone. Additionally, data distribution is often heterogeneous due to device placement, type, and user behavior variations. As a result, traditional transfer learning methods perform suboptimally, making it difficult to recognize daily activities. To address these challenges, we use a variational autoencoder (VAE) to learn a shared, low-dimensional latent space from available sensor data. This space generalizes data across diverse sensors, mitigating heterogeneity and aiding robust adaptation to the target domain. We integrate contrastive learning to enhance feature representation by aligning instances of the same class across domains while separating different classes. We propose Variational Contrastive Domain Adaptation (VaCDA), a multi-source domain adaptation framework combining VAEs and contrastive learning to improve feature representation and reduce heterogeneity between source and target domains. We evaluate VaCDA on multiple publicly available datasets across three heterogeneity scenarios: cross-person, cross-position, and cross-device. VaCDA outperforms the baselines in cross-position and cross-device scenarios.

2311.05139 2026-06-15 cs.LG

Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse

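Ruijie Jiang, Thuan Nguyen, Shuchin Aeron, Prakash Ishwar

Affiliations: Department of Electrical Engineering, Tufts University; Department of Engineering, Engineering Technology, East Tennessee State University; Department of Electrical and Computer Engineering, Boston University

AI summary: This paper proves that, in contrastive learning, the SCL, HSCL, and UCL losses are minimized by representations exhibiting Neural-Collapse geometry, and that the HSCL and HUCL losses are lower bounded by the corresponding SCL and UCL losses. It also shows empirically that, with random initialization, suitable hardness levels, and feature normalization, Adam optimization can converge to the Neural-Collapse geometry, whereas omitting hard negatives or feature normalization leads to Dimensional-Collapse.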

Comments Final version: Reviewed and accepted to TMLR April 2025. Updated exposition, Added analysis of lower bounds

Journal ref Transactions on Machine Learning Research, 2025

Abstract

For a widely-studied data model and general loss and sample-hardening functions we prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC), i.e., the class means form an Equiangular Tight Frame (ETF) and data from the same class are mapped to the same representation. We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses. In contrast to existing literature, our theoretical results for SCL do not require class-conditional independence of augmented views and work for a general loss function class that includes the widely used InfoNCE loss function. Moreover, our proofs are simpler, compact, and transparent. Similar to existing literature, our theoretical claims also hold for the practical scenario where batching is used for optimization. We empirically demonstrate, for the first time, that Adam optimization (with batching) of HSCL and HUCL losses with random initialization and suitable hardness levels can indeed converge to the NC-geometry if we incorporate unit-ball or unit-sphere feature normalization. Without incorporating hard-negatives or feature normalization, however, the representations learned via Adam suffer from Dimensional-Collapse (DC) and fail to attain the NC-geometry. These results exemplify the role of hard-negative sampling in contrastive representation learning and we conclude with several open theoretical problems for future work. The code can be found at https://github.com/rjiang03/HCL/tree/main
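The Neural-Collapse geometry named in this abstract (class means forming an Equiangular Tight Frame) can be illustrated with a small sketch. This is not taken from the paper's repository: the function name `simplex_etf` and the particular construction (scaled centering matrix) are assumptions chosen for illustration, using the standard simplex ETF in which K unit vectors have pairwise inner product -1/(K-1).

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a (K, K) matrix whose rows are unit vectors forming a simplex ETF.

    Hypothetical helper for illustration: rows of the scaled centering
    matrix sqrt(K / (K - 1)) * (I - 11^T / K).
    """
    k = num_classes
    centering = np.eye(k) - np.ones((k, k)) / k  # projects out the mean direction
    return np.sqrt(k / (k - 1)) * centering


# Example: 4 class means. Every row has unit norm and every pair of
# distinct rows has the same inner product, -1 / (K - 1).
class_means = simplex_etf(4)
gram = class_means @ class_means.T
off_diagonal = gram[~np.eye(4, dtype=bool)]
```

Checking a learned representation against this target would amount to comparing the Gram matrix of the normalized, centered class means with `gram`; how the paper measures closeness to NC is not stated in the abstract.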

2406.11565 2026-06-15 cs.CL cs.CY

Extrinsic Evaluation of Cultural Competence in Large Language Models

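Shaily Bhatt, Fernando Diaz

Affiliations: Carnegie Mellon University

AI summary: This paper evaluates the cultural competence of models on two text generation tasks and finds that outputs do vary with the cultural cue in the prompt, but that the similarity of outputs across countries correlates only weakly with those countries' cultural values.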

Comments Accepted to EMNLP Findings 2024

DOI: 10.18653/v1/2024.findings-emnlp.942

Abstract

Productive interactions between diverse users and language technologies require outputs from the latter to be culturally relevant and sensitive. Prior works have evaluated models' knowledge of cultural norms, values, and artifacts, without considering how this knowledge manifests in downstream applications. In this work, we focus on extrinsic evaluation of cultural competence in two text generation tasks, open-ended question answering and story generation. We quantitatively and qualitatively evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts. Although we find that model outputs do vary when varying nationalities and feature culturally relevant words, we also find weak correlations between text similarity of outputs for different countries and the cultural values of these countries. Finally, we discuss important considerations in designing comprehensive evaluation of cultural competence in user-facing tasks.

2406.03221 2026-06-15 cs.CL cs.IR

Linking Named Entities in Diderot's Encyclopédie to Wikidata

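Pierre Nugues

Affiliations: Université de Lausanne

AI summary: This paper links more than 10,300 entries of the Encyclopédie to Wikidata identifiers, connecting them to the knowledge graph, and describes the annotation of geographic and human entities along with application examples.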

Comments 6 pages, 3 figures

Journal ref Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 10610--10615

Abstract

Diderot's Encyclopédie is a reference work from the XVIIIth century in Europe that aimed at collecting the knowledge of its era. Wikipedia has the same ambition with a much greater scope. However, the lack of digital connection between the two encyclopedias may hinder their comparison and the study of how knowledge has evolved. A key element of Wikipedia is Wikidata, which backs the articles with a graph of structured data. In this paper, we describe the annotation of more than 10,300 of the Encyclopédie entries with Wikidata identifiers, enabling us to connect these entries to the graph. We considered geographic and human entities. The Encyclopédie does not contain biographic entries as they mostly appear as subentries of locations. We extracted all the geographic entries and we completely annotated all the entries containing a description of human entities. This represents more than 2,600 links referring to locations or human entities. In addition, we annotated more than 9,500 entries having a geographic content only. We describe the annotation process as well as application examples. This resource is available at https://github.com/pnugues/encyclopedie_1751

2209.00078 2026-06-15 cs.LG

Supervised Contrastive Learning with Hard Negative Samples

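Ruijie Jiang, Thuan Nguyen, Prakash Ishwar, Shuchin Aeron

Affiliations: Dept. of ECE, Tufts University; Dept. of CS, Tufts University; Dept. of ECE, Boston University

AI summary: This paper proposes H-SCL, which tilts the class-conditional negative sampling distribution via a hardening function to improve contrastive learning performance on downstream classification tasks, and analyzes the relationship between the H-SCL and H-UCL losses.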

Journal ref 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-8, 2024

DOI: 10.1109/IJCNN60899.2024.10650863

Abstract

Through minimization of an appropriate loss function such as the InfoNCE loss, contrastive learning (CL) learns a useful representation function by pulling positive samples close to each other while pushing negative samples far apart in the embedding space. The positive samples are typically created using "label-preserving" augmentations, i.e., domain-specific transformations of a given datum or anchor. In the absence of class information, in unsupervised CL (UCL), the negative samples are typically chosen randomly and independently of the anchor from a preset negative sampling distribution over the entire dataset. This leads to class-collisions in UCL. Supervised CL (SCL) avoids this class collision by conditioning the negative sampling distribution to samples having labels different from that of the anchor. In hard-UCL (H-UCL), which has been shown to be an effective method to further enhance UCL, the negative sampling distribution is conditionally tilted, by means of a hardening function, towards samples that are closer to the anchor. Motivated by this, in this paper we propose hard-SCL (H-SCL) wherein the class conditional negative sampling distribution is tilted via a hardening function. Our simulation results confirm the utility of H-SCL over SCL with significant performance gains in downstream classification tasks. Analytically, we show that in the limit of infinite negative samples per anchor and under a suitable assumption, the H-SCL loss is upper bounded by the H-UCL loss, thereby justifying the utility of H-UCL for controlling the H-SCL loss in the absence of label information. Through experiments on several datasets, we verify the assumption as well as the claimed inequality between H-UCL and H-SCL losses. We also provide a plausible scenario where H-SCL loss is lower bounded by UCL loss, indicating the limited utility of UCL in controlling the H-SCL loss.
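The two ingredients described in this abstract, restricting negatives to other classes (SCL) and tilting the negative distribution towards samples close to the anchor (hardening), can be sketched for a single anchor as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponential hardening function `exp(beta * similarity)`, the names `scl_negatives` and `hard_infonce`, and the `beta` and `temperature` defaults are all hypothetical choices.

```python
import numpy as np


def scl_negatives(embeddings: np.ndarray, labels: np.ndarray, anchor_index: int) -> np.ndarray:
    """Keep only samples whose label differs from the anchor's (avoids class collisions)."""
    return embeddings[labels != labels[anchor_index]]


def hard_infonce(anchor, positive, negatives, beta=1.0, temperature=0.5):
    """InfoNCE-style loss for one anchor with exponentially tilted negatives.

    anchor, positive: (d,) unit vectors; negatives: (n, d) unit vectors.
    Assumed hardening function: weights proportional to exp(beta * similarity).
    beta = 0 gives uniform weights, i.e. the usual InfoNCE denominator.
    """
    pos = np.exp(anchor @ positive / temperature)
    sims = negatives @ anchor                      # similarity of each negative to the anchor
    weights = np.exp(beta * sims)
    weights = weights / weights.sum()              # tilted negative sampling distribution
    neg = len(negatives) * np.sum(weights * np.exp(sims / temperature))
    return -np.log(pos / (pos + neg))


# Toy usage with random unit-norm embeddings.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 5))
z /= np.linalg.norm(z, axis=1, keepdims=True)
labels = np.array([0, 0, 1, 1, 2, 2, 1, 2])
negs = scl_negatives(z, labels, anchor_index=0)
plain_loss = hard_infonce(z[0], z[1], negs, beta=0.0)
hard_loss = hard_infonce(z[0], z[1], negs, beta=2.0)
```

With this particular tilt, raising `beta` puts more weight on the negatives that already contribute most to the denominator, so the hardened loss is never smaller than the `beta = 0` loss for the same embeddings.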

2606.14693 2026-06-15 cs.MA cs.AI New submission

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

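Pengxin Wang, Lihao Guo, Yi Xie, Bo Liu, Siyang Cao, Jingdi Chen

Affiliations: Department of Electrical and Computer Engineering, University of Arizona

AI summary: Proposes preference-coordinated multi-agent policy optimization (PCMA), which learns coordinated agent-specific preferences to achieve complementary trade-offs in multi-objective multi-agent reinforcement learning. Theory shows that preference diversity can induce team improvement, and experiments confirm gains in performance and coordination.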

2606.14629 2026-06-15 cs.CR cs.AI New submission

When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks

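Jianzhe Lin

Affiliations: MetaAI

AI summary: This paper finds that, in verifier-driven self-DPO, verifier quality is task-specific: on tasks where the verifier's accuracy is low, it causes the student model to regress. The paper gives a mechanistic explanation and deployment recommendations.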

Comments 12 pages, 2 figures

Abstract

Verifier-driven self-DPO is a common recipe for self-improving production visual-language models. In this setup, a frozen verifier scores candidate generations, the top- and bottom-scoring candidates form a preference example, and DPO updates the learner. The deployment-time assumption is monotone: a stronger verifier should yield a stronger student. We show that this assumption can fail because verifier quality is highly task-specific. On a four-rung open-source verifier ladder across MathVista, MMMU, and BLINK, the same verifiers that are above-threshold and improve a Qwen-3-VL-2B student on MathVista become sub-threshold on MMMU, where their task-rubric accuracy drops to 8% to 23%. In this regime, every verifier we tested silently regresses the student, producing drops of 3.4 to 10.9 percentage points below the frozen baseline while the DPO training loss continues to decrease. The regression replicates on a second student, Qwen-2.5-VL-3B. Moreover, within the failure regime, damage is confidence-inverted: the more accurate-but-still-wrong verifier causes larger regression than a near-random verifier, suggesting that progress-gated replay amplifies confidently wrong preference pairs. We give a compact mechanistic explanation via a variance theorem for progress-gated replay and its direction-mismatch failure mode. The deployment message is operational rather than purely diagnostic: before running any verifier-driven loop, teams should measure target-task rubric accuracy, rank verifiers by target-task rubric quality rather than parameter count, and treat diminishing returns in above-threshold regimes as a verifier-side compute budget cap.
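The pairing step in this abstract (a frozen verifier scores candidates, and the top- and bottom-scoring ones form a preference example) and the recommended pre-flight check can be sketched as below. This is a hedged illustration, not the paper's code: `build_preference_pair` and `rubric_accuracy` are hypothetical names, the toy scorer stands in for a real verifier, and treating rubric accuracy as the fraction of verifier judgments that agree with gold labels is an assumption.

```python
from typing import Callable, Sequence, Tuple


def build_preference_pair(
    candidates: Sequence[str], verifier_score: Callable[[str], float]
) -> Tuple[str, str]:
    """Return (chosen, rejected): the highest- and lowest-scoring candidates."""
    ranked = sorted(candidates, key=verifier_score)
    return ranked[-1], ranked[0]


def rubric_accuracy(verifier_judgments: Sequence[int], gold_judgments: Sequence[int]) -> float:
    """Fraction of verifier judgments that match gold labels on the target task (assumed definition)."""
    matches = sum(v == g for v, g in zip(verifier_judgments, gold_judgments))
    return matches / len(gold_judgments)


# Toy usage: a stand-in verifier that simply prefers longer answers.
chosen, rejected = build_preference_pair(["a", "bbb", "cc"], verifier_score=len)
accuracy = rubric_accuracy([1, 0, 1, 1], [1, 1, 1, 0])
```

In a setup like the one described, `accuracy` would be measured on the target task before any DPO update is run, and a verifier that scores poorly there would not be used to build pairs. The abstract does not state the threshold value, so none is assumed here.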

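2606.14594 2026-06-15 cs.SE cs.AI New submission

Trimodal Glioma Representation Alignment via Volumetric Contrastive Learning

Denise Marini, Eleonora Grassucci, Danilo Comminiello

Affiliations: arXiv

AI summary: Proposes the GLORIA framework, which aligns histopathology, gene-expression, and MRI features with a Gramian contrastive loss for glioma grading and survival prediction, and outperforms a bimodal baseline on data from 132 patients.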

Abstract

Glioma grading and survival prediction require the integration of heterogeneous information collected at different spatial and biological scales. Histopathology describes tissue morphology, mRNA expression captures molecular activity, and magnetic resonance imaging provides a non-invasive view of tumor extent and radiological heterogeneity. Existing glioma prognosis models often combine only two of these sources, while their alignment objectives remain mostly pairwise. This paper introduces GLORIA, a novel trimodal framework for GLioma Omics - Radiology - hIstopathology Alignment. GLORIA processes whole-slide image regions, gene-expression profiles, and 3D MRI volumes through modality-specific encoders, projects them into a shared latent space, and aligns them with a Gramian contrastive loss that measures the volume spanned by the three modality embeddings. The aligned representations are fused through a cross-modal gating module and optimized jointly for three-class glioma grading and overall survival prediction. We evaluate GLORIA on a matched TCGA-GBM/LGG and BraTS21 cohort, comprising 132 patients with all three modalities. On the shared trimodal test set, GLORIA improves over the bimodal WSI-mRNA baseline in all the metrics considered.
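The quantity this abstract refers to, the volume spanned by the three modality embeddings, can be computed from a Gram matrix. The sketch below shows only that volume, under the assumption that embeddings are L2-normalized first. The function name `gram_volume` is hypothetical, and how GLORIA turns the volume into a contrastive loss is not specified in the abstract.

```python
import numpy as np


def gram_volume(embeddings: np.ndarray) -> float:
    """Volume of the parallelotope spanned by the rows of `embeddings`, shape (k, d).

    Rows are L2-normalized first (an assumption), so the result lies in [0, 1]:
    close to 0 when the embeddings are nearly aligned, 1 when they are orthogonal.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    gram = unit @ unit.T                      # (k, k) matrix of pairwise inner products
    det = np.linalg.det(gram)
    return float(np.sqrt(max(det, 0.0)))      # clamp tiny negative values from round-off


# Toy usage: three modality embeddings (e.g. WSI, mRNA, MRI) for one patient.
orthogonal = np.eye(3, 4)                      # mutually orthogonal embeddings
aligned = np.tile([1.0, 2.0, 3.0, 4.0], (3, 1))  # identical embeddings
volume_orthogonal = gram_volume(orthogonal)
volume_aligned = gram_volume(aligned)
```

A small volume for a matched triple means all three modalities agree at once, which is what distinguishes this from objectives that only compare modalities pairwise.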

2606.14560 2026-06-15 math.OC cs.LG stat.ML New submission

Free Heavy-Tailed Lunch for Muon: A Theoretical Justification of Empirical Success

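Florian Hübler, Thomas Pethick, Suvrit Sra

Affiliations: Department of Computer Science, ETH Zurich, Switzerland; Department of Mathematics, Technical University of Munich, Germany; Munich Center for Machine Learning (MCML)

AI summary: For heavy-tailed non-convex optimization, this paper proves that non-Euclidean methods such as Muon achieve optimal sample complexity under nuclear-norm stationarity, avoiding the additional dimension dependence incurred by Euclidean methods, and supports the theory with experiments on large language models.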

Abstract

Non-Euclidean optimisation methods with matrix-valued updates, such as Muon and Scion, have recently shown strong empirical performance for training Transformer models, yet their theoretical advantages over Euclidean methods remain poorly understood. We address this gap in the heavy-tailed non-convex regime, where stochastic gradients have bounded $p$-th central moments, $p \in (1,2]$. We show that certain non-Euclidean methods achieve optimal sample complexity under stronger stationarity measures, while Euclidean methods incur additional dimension-dependent costs. As a consequence, for $m \times n$ matrices, Muon finds an $\varepsilon$-stationary point in nuclear norm within $\mathcal{O}\left(\min\{m, n\} \frac{\Delta_1 L}{\varepsilon^2} \left(\frac{\sigma}{\varepsilon}\right)^{\frac{p}{p-1}}\right)$ samples, absorbing heavy-tailed noise without extra dimension dependence, unlike Euclidean methods. We further prove this sample complexity, including its dimension dependence, is optimal for all first-order methods under nuclear-norm stationarity. Experiments on large language models support our theory. Surprisingly, our results suggest that other Schatten geometries beyond the spectral geometry of Muon can perform competitively in certain settings.
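To make the role of the moment parameter $p$ concrete, the stated bound can be specialized to two values of $p$. This is plain arithmetic on the expression in the abstract, with no constants or assumptions beyond it.

```latex
% Bounded-variance case, p = 2: the exponent p/(p-1) equals 2.
\mathcal{O}\left(\min\{m, n\}\, \frac{\Delta_1 L}{\varepsilon^{2}} \left(\frac{\sigma}{\varepsilon}\right)^{2}\right)
  = \mathcal{O}\left(\min\{m, n\}\, \frac{\Delta_1 L\, \sigma^{2}}{\varepsilon^{4}}\right)

% Heavier tails, p = 3/2: the exponent p/(p-1) equals 3.
\mathcal{O}\left(\min\{m, n\}\, \frac{\Delta_1 L}{\varepsilon^{2}} \left(\frac{\sigma}{\varepsilon}\right)^{3}\right)
  = \mathcal{O}\left(\min\{m, n\}\, \frac{\Delta_1 L\, \sigma^{3}}{\varepsilon^{5}}\right)
```

As $p \to 1$ the exponent $\frac{p}{p-1}$ grows without bound, so heavier tails raise the dependence on $1/\varepsilon$ while the dimension factor stays at $\min\{m, n\}$.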

2606.14515 2026-06-15 cs.CR cs.AI New submission

Securing the Future of IoMT in the Post-Quantum Era: An Edge-Native Federated Learning Approach

Taym Alshoghri, Deemah H. Tashman, Mohammad Reza Gerami, Soumaya Cherkaoui

Affiliations: LINCS Laboratory, Department of Computer and Software Engineering, Polytechnique Montréal; Department of Computer Science, University of Toronto

AI summary: To address security and privacy for resource-constrained IoMT devices that handle sensitive health data, this paper proposes a Kubernetes-based framework that integrates post-quantum cryptography into edge-native federated learning, with distributed cryptographic processing that reduces latency.

Abstract

Internet of Medical Things (IoMT) devices operate under strict resource constraints while handling highly sensitive health data, making security and privacy critical concerns. Federated learning (FL) further complicates this landscape, as model updates exchanged during training may unintentionally expose private medical information. Emerging quantum computing capabilities threaten the long-term viability of conventional lightweight cryptographic mechanisms, motivating the integration of Post-Quantum Cryptography (PQC) into IoMT systems. This article discusses key enabling technologies for quantum-resilient IoMT, including post-quantum key establishment, lightweight encryption, and edge-native orchestration. We propose a scalable Kubernetes-based framework that integrates PQC into FL-enabled IoMT environments and validate it on a Raspberry Pi testbed. Results demonstrate that distributed cryptographic processing significantly reduces latency compared to sequential designs while maintaining feasible resource overhead. The primary contribution of this work lies in the design and validation of a secure orchestration and communication framework for FL-enabled IoMT systems. We conclude by outlining future directions toward energy-aware architectures, intelligent security optimization, and resilient next-generation Intelligent Internet of Medical Things (IIoMT) ecosystems.

2606.14506 2026-06-15 stat.ML cs.LG stat.ME New submission

Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias

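Annie Ulichney, Amanda Coston

Affiliations: Department of Statistics, University of California, Berkeley

AI summary: For model evaluation when covariate shift and selective labels occur together, this paper proposes a double machine learning procedure to estimate the target risk, and shows on eICU data that it is more accurate than methods addressing only one of the two biases.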

Abstract

Understanding how a prediction model will perform in a new environment before deployment is essential to preventing harm when algorithms inform decision-making. Two common sources of model performance degradation are (i) covariate shift, where the target covariate distribution differs from the source, and (ii) selective labels, where the observability of outcomes depends on historical decisions. We study pre-deployment model evaluation under the joint presence of covariate shift and labeling of outcomes selectively based on observed features. In particular, we present a double machine learning procedure for estimating the target risk of an arbitrary black-box prediction model under a general loss function. We show identification of this estimand under standard assumptions and derive a bias-corrected estimator based on the influence function of the target risk. Finally, we evaluate our estimator through experiments using the eICU electronic health records database, showing that it tracks the true target risk more accurately than methods that address either selective labels or covariate shift alone, as well as baselines that combine standard plug-in approaches.

2606.14498 2026-06-15 physics.chem-ph cs.AI New submission

A Fixed-Point Neural Operator for Size- and Functional-Transferable Hamiltonian Prediction

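Yunhong Lou, Xihang Yue, Xinran Wei, Tianqi Deng, Linchao Zhu

Affiliations: Zhejiang University; Zhongguancun Academy; Zhongguancun Institute of Artificial Intelligence

AI summary: Proposes HamEvo, a neural operator that learns the converged Hamiltonian of the self-consistent field iteration as a fixed point, combined with density-matrix supervision. It approaches chemical accuracy in molecular property prediction and achieves size transfer and inference speedups.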

Comments 30 pages, 5 figures, 2 tables

Abstract

Predicting the Kohn-Sham Hamiltonian with machine learning can accelerate density functional theory while retaining access to molecular orbitals, energy levels, and electronic-structure observables that energy-only surrogates cannot resolve. Yet element-wise agreement with the converged Hamiltonian, an implicit fixed point of the self-consistent field iteration, does not determine the occupied subspace that governs orbital energies and densities. Here we present HamEvo, a neural operator that learns the single-step self-consistent update and returns the converged Hamiltonian as its fixed point. HamEvo is pre-trained on intermediate self-consistent trajectories and calibrated at equilibrium with density-matrix supervision. Across benchmarks from MD17 to drug-like QMugs, HamEvo lowers Hamiltonian errors by 35-49% over direct-regression and deep-equilibrium baselines, and predicts QMugs HOMO and LUMO energies with mean absolute errors of 0.036 and 0.053 eV, near the 1 kcal/mol chemical-accuracy scale. Few-shot fine-tuning with only 20 reference conformations extends HamEvo to molecules of up to 122 atoms, well beyond the size range covered by pre-training. With thermal molecular-dynamics sampling, HamEvo captures temperature-dependent HOMO-LUMO gap renormalization beyond the harmonic approximation. Inference is up to 242 times faster than conventional DFT.

2606.14488 2026-06-15 cs.IT cs.LG math.IT New submission

Transforming Shape Schemas with Composable Property-Graph Queries (Extended Version)

Philipp Seifer, Daniel Hernández, Ralf Lämmel, Steffen Staab

Affiliations: The Software Languages Team; University of Koblenz; Institute for Artificial Intelligence; University of Stuttgart; University of Southampton

AI summary: Studies how to infer an output schema given an input schema (ProGS) and a query (G-CORE), using mappings to RDF, SHACL, and SPARQL CONSTRUCT so that Description Logic reasoners can infer schema constraints automatically.

Abstract

Property graphs may be constrained by schemas that inform both query engines and human users about the shape of valid data, enforcing a contract between data provider and consumer. Composable property-graph queries transform input graphs into output graphs. Then, the question arises of which schema can be expected after one (or several) transformation steps. We investigate how schema constraints can be inferred given an input schema and a transforming query. Specifically, we propose a reasoning procedure that, given an input schema in ProGS and a query in G-CORE, infers an output schema. Since graph updates will happen frequently, our inference procedure does not rely on graph instances, such that the computed output schema applies to all graphs originating from any input graph complying with the input schema. Related work has addressed this problem for SPARQL CONSTRUCT queries, encoding it in Description Logics (DLs) so that the output schema is entailed by axioms inferred from input schema and queries. Property graphs and their queries, however, complicate the matter, as property graphs feature label and property annotations as well as first-class edges. Thus, reification has to be used in one way or another, though available DLs lack the means to encode such features directly. We approach this novel challenge via a family of mappings for i) property graphs reified in RDF, aligned with ii) a mapping from ProGS to SHACL and iii) a mapping from G-CORE to SPARQL CONSTRUCT queries. In this manner, schema inference for property graphs becomes manageable, as we break apart the problem through the extra mapping layer and utilize efficient DL reasoners. We develop the metatheory regarding the soundness of inferred schema constraints and the semantic equivalence of mapped schemas and queries.

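2606.14306 2026-06-15 cs.HC cs.AI New submission

Gradient boosting for extremes: sampling theory and application to insurance

Stéphane Lhaut, Olivier Lopez

Affiliations: CREST, CNRS, Ecole polytechnique, Groupe ENSAE-ENSAI, ENSAE Paris, Institut Polytechnique de Paris, Palaiseau, France

AI summary: Develops a statistical learning theory for gradient boosting estimation of covariate-dependent Generalized Pareto distributions, uses an orthogonal reparametrization to improve convergence stability, and validates the method on insurance data.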

Comments 36 pages, 10 figures

Abstract

We develop a statistical learning theory for gradient boosting applied to the estimation of covariate-dependent Generalized Pareto (GP) distributions in the context of Peaks-over-Threshold modeling. After an orthogonal reparametrization of the GP likelihood that diagonalizes its Fisher information matrix, we cast the estimation problem within the Empirical Risk Minimization (ERM) framework and derive non-asymptotic error bounds for the boosting estimator. Our analysis accounts for three distinct sources of error in the process: statistical fluctuations, the approximation bias inherent to the asymptotic nature of the GP model (controlled under second-order regular variation), and the approximation error associated with the finite number of boosting iterates, making explicit the resulting bias-variance trade-off. We illustrate the practical benefits of the reparametrization through simulations, showing that it significantly reduces gradient correlation during training and improves convergence stability. The methodology is applied to a medical malpractice insurance dataset from the Texas Department of Insurance, comprising over 18 000 closed claims. The gradient boosting approach yields a good fit for the tail of settlement cost distributions and reveals that the number of days to settlement is the dominant predictor of tail heaviness, consistent with earlier findings in the reserving literature.

VaCDA: Variational Contrastive Alignment-based Scalable Human Activity Recognition

Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse

Extrinsic Evaluation of Cultural Competence in Large Language Models

Linking Named Entities in Diderot's Encyclopédie to Wikidata

Supervised Contrastive Learning with Hard Negative Samples

Learning Coordinated Preference for Multi-Objective Multi-Agent Reinforcement Learning

When Good Verifiers Go Bad: Self-Improving VLMs Can Regress on New Tasks

Regulating the Machine Contributor: Governance and Policy Alignment in Open Source

Cluster LOCO: Feature Importance For Interpreting Clusters

When Errors Become Narratives: A Longitudinal Taxonomy of Silent Failures in a Production LLM Agent Runtime

Regional Climate Model Emulation with Diffusion Approaches: What is the Added Value of Generative Machine Learning?

Trimodal Glioma Representation Alignment via Volumetric Contrastive Learning

Free Heavy-Tailed Lunch for Muon: A Theoretical Justification of Empirical Success

Securing the Future of IoMT in the Post-Quantum Era: An Edge-Native Federated Learning Approach

Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias

A Fixed-Point Neural Operator for Size- and Functional-Transferable Hamiltonian Prediction

Nonlinear Two-Time-Scale Stochastic Approximation: A Sharp Phase Transition and How to Beat It

tap: A File-Based Protocol for Heterogeneous LLM Agent Collaboration

Machine-learned particle flow as a foundation model for collider physics

No Accidental Software Agent First Canonical Code for Human Code Entropy Reduction and 30 to 500 times Lower Frontier Model Requirements

PLAIground: SLO-Driven Runtime Model Selection for Compound AI Systems in the Edge-Cloud-Space Continuum

Detecting Historical Turning Points in Italian Media: A Complex Systems Approach to a Diachronic News Corpus

Recovery thresholds for hidden weighted sparse graphs

I'm Sorry Driver, I'm Afraid I Can't Do That: Appraising the Safety of LLMs within Automotive Contexts

Nonlocal Bayesian Modeling of Continuous Spatio-Temporal Dynamics

Transforming Shape Schemas with Composable Property-Graph Queries (Extended Version)

Thinking Outside the [Chat]Box: Bridging Computer Science and Industrial Design for Cognitive-Inclusive Generative AI

Operator Calculus for Population-Based Optimization: A Mean-Field Convergence Theory

ScoreGate: Adaptive Chunk Selection for Retrieval-Augmented Generation via Dual-Score Statistical Fusion

Gradient boosting for extremes: sampling theory and application to insurance