AI for Science - arXivDaily Topic

2606.10376 2026-06-18 cs.AI cs.IT math.IT Cross-listed 90%

Belief-Space Control for Personalized Cancer Treatment via Active Inference

Deniz Sargun, H. Bugra Tulay, C. Emre Koksal

Affiliations: American Association for Cancer Research; AACR Project GENIE registry; AACR Project GENIE Biopharma Collaborative

Topic match (AI drug discovery): active inference for personalized cancer treatment

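AI summary: Proposes using active inference to model cancer treatment as a belief-space planning problem, unifying goal-directed control and information gathering under a measurement budget to achieve patient classification and efficient treatment.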

Comments 11 pages including appendix

2606.18390 2026-06-18 cs.LG q-bio.QM New submission 80%

MOLAR: Learning Multimodal Molecular Representations from Noisy Labels

Yingxu Wang, Kunyu Zhang, Nan Yin, Yu Li, Eran Segal

Affiliations: Mohamed bin Zayed University of Artificial Intelligence; Zhengzhou University; The Education University of Hong Kong; The Chinese University of Hong Kong; Weizmann Institute of Science

Topic match (AI drug discovery): proposes a multimodal molecular representation learning framework for property prediction

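AI summary: Proposes the MOLAR framework, which separates clean-property inference from label observation and uses residual evidence from the graph and text modalities to learn multimodal molecular representations from noisy labels; it outperforms baseline methods on naturally noisy and label-flipping benchmarks.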

Abstract

Motivation: Noisy labels are a common challenge in molecular property prediction because molecular annotations are often obtained from assays, curated databases, or weak annotation pipelines rather than directly observed clean biological states. Treating recorded labels as reliable supervision can cause models to memorize corrupted observations and learn misleading molecular evidence. In multimodal molecular representation learning, this issue can be amplified by graph-text fusion or alignment, which may propagate label-induced errors across modalities. Results: We propose MOLAR, a noise-aware framework for learning multimodal molecular representations from noisy labels. MOLAR separates latent clean-property inference from recorded-label observation: graph and text views contribute residual evidence to a clean-property distribution, and a categorical label-observation channel maps this distribution to recorded labels for training. This formulation derives posterior label reliability and modality-specific molecular evidence from the model. Experiments on naturally noisy molecular benchmarks and controlled label-flipping benchmarks show that MOLAR consistently outperforms representative baselines. Visualization analyses further show that MOLAR provides interpretable reliability and modality-evidence diagnostics.
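
The separation between a latent clean property and a recorded label described in the abstract can be illustrated with a small sketch. This is not MOLAR's implementation: the additive fusion of graph and text logits, the fixed channel matrix, and all function names here are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clean_distribution(graph_logits, text_logits, prior_logits=None):
    """Assumed fusion rule: each view adds residual evidence to prior logits."""
    k = len(graph_logits)
    prior = prior_logits or [0.0] * k
    return softmax([p + g + t for p, g, t in zip(prior, graph_logits, text_logits)])


def recorded_label_distribution(clean_probs, channel):
    """p(y | x) = sum_z p(z | x) * channel[z][y], where channel[z][y] = p(y | z)."""
    k = len(clean_probs)
    return [sum(clean_probs[z] * channel[z][y] for z in range(k)) for y in range(k)]


def label_reliability(clean_probs, channel, recorded):
    """Posterior probability that the recorded label equals the latent clean label."""
    evidence = recorded_label_distribution(clean_probs, channel)[recorded]
    return clean_probs[recorded] * channel[recorded][recorded] / evidence


def noisy_nll(clean_probs, channel, recorded):
    """Training loss on the recorded label, computed through the channel."""
    return -math.log(recorded_label_distribution(clean_probs, channel)[recorded])


if __name__ == "__main__":
    # Hypothetical two-class example: both views favour class 0.
    clean = clean_distribution([2.0, 0.0], [1.0, 0.0])
    channel = [[0.8, 0.2], [0.3, 0.7]]  # assumed label-observation channel
    for y in (0, 1):
        print(y, label_reliability(clean, channel, y), noisy_nll(clean, channel, y))
```

In this toy setting a recorded label that agrees with the fused evidence receives a higher reliability than one that contradicts it, which is the kind of diagnostic the abstract refers to.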

2606.18785 2026-06-18 cs.LG cs.AI New submission 75%

Bayesian Anytime Pareto Set Identification for Multi-Objective Multi-Armed Bandits

Lennert Saerens, Bram Silue, Eleni Litsa, Peter Vrancx, Pieter Libin

Affiliations: imec; Data Science Institute, Interuniversity Institute of Biostatistics and Statistical Bioinformatics, UHasselt

Topic match (AI drug discovery): multi-objective molecular discovery

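AI summary: Proposes Top-Two Pareto Front Thompson Sampling (TTPFTS), the first anytime multi-objective multi-armed bandit algorithm for Pareto set identification, validates it on synthetic environments and an ultra-large molecular library, and introduces an uncertainty quantification metric.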

Comments 26 pages, 13 figures

Abstract

Identifying Pareto optimal solutions is critical to support multi-objective decision-making. We introduce the first anytime Multi-Objective Multi-Armed Bandit algorithm for the Pareto Set Identification problem, taking a Bayesian approach: Top-Two Pareto Front Thompson Sampling (TTPFTS). We benchmark TTPFTS against state-of-the-art fixed-budget Pareto Set Identification algorithms on synthetic environments. Next, we demonstrate its practical utility in a challenging multi-objective molecular discovery setting by efficiently exploring an ultra-large synthesis-on-demand molecular library. Furthermore, we introduce a novel uncertainty quantification metric that estimates our algorithm's confidence in the predicted Pareto set. We demonstrate that this metric effectively proxies true performance, yielding a robust methodology for monitoring learning progress in complex settings. Finally, we complement these empirical findings with a theoretical proof of the algorithm's asymptotic correctness.
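
To make the Pareto set identification setting concrete, here is a minimal sketch of Pareto dominance over arms plus a Monte Carlo estimate of how often each arm lands in the Pareto set under posterior sampling. It is not TTPFTS and not the paper's uncertainty metric: the independent Gaussian posteriors, the maximization convention, and every function name are assumptions for illustration.

```python
import random


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(values):
    """Indices of arms that no other arm dominates (all objectives maximized)."""
    return {
        i
        for i, v in enumerate(values)
        if not any(dominates(w, v) for j, w in enumerate(values) if j != i)
    }


def sample_posterior(means, stds, rng):
    """Draw one reward vector per arm from assumed independent Gaussian posteriors."""
    return [
        [rng.gauss(m, s) for m, s in zip(arm_means, arm_stds)]
        for arm_means, arm_stds in zip(means, stds)
    ]


def pareto_membership_probability(means, stds, n_samples=2000, seed=0):
    """Fraction of posterior draws in which each arm belongs to the sampled Pareto set."""
    rng = random.Random(seed)
    counts = [0] * len(means)
    for _ in range(n_samples):
        for i in pareto_set(sample_posterior(means, stds, rng)):
            counts[i] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    # Hypothetical posterior means and standard deviations for four arms, two objectives.
    means = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.2]]
    stds = [[0.2, 0.2]] * 4
    print(pareto_set(means))
    print(pareto_membership_probability(means, stds))
```

Membership frequencies near 0 or 1 suggest the posterior has settled on whether an arm is Pareto optimal, while intermediate values flag arms that may deserve more samples.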
