arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.10163 2026-06-07 q-bio.QM

Beyond SMILES: Evaluating Agentic Systems for Drug Discovery

Edward Wijaya

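AI summary: Evaluates how well agentic systems for drug discovery generalize across peptide therapeutics, in vivo pharmacology, and resource-constrained settings; identifies five capability gaps and proposes design requirements and a capability matrix for next-generation frameworks.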

Comments
arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

Abstract

Agentic systems for drug discovery have demonstrated autonomous synthesis planning, literature mining, and molecular design. We ask how well they generalize. Evaluating six frameworks against 15 task classes drawn from peptide therapeutics, in vivo pharmacology, and resource-constrained settings, we find five capability gaps: no support for protein language models or peptide-specific prediction, no bridges between in vivo and in silico data, reliance on LLM inference with no pathway to ML training or reinforcement learning, assumptions tied to large-pharma resources, and single-objective optimization that ignores safety-efficacy-stability trade-offs. A paired knowledge-probing experiment suggests the bottleneck is architectural rather than epistemic: four frontier LLMs reason about peptides at levels comparable to small molecules, yet no framework exposes this capability. We propose design requirements and a capability matrix for next-generation frameworks that function as computational partners under realistic constraints.