What an Autonomous Agent Discovers About Molecular Transformer Design: Does It Transfer?
Edward Wijaya
- Comments
- arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission
Deep learning models for drug-like molecules and proteins overwhelmingly reuse transformer architectures designed for natural language, yet whether molecular sequences benefit from different designs has not been systematically tested. We deploy agent-driven autonomous architecture search across three sequence types (SMILES, protein, and English text as a control), running 3,106 experiments on a single GPU. For SMILES, architecture search is counterproductive: tuning learning rates and schedules alone outperforms the full search (p = 0.001). For natural language, architecture changes account for 81% of the improvement (p = 0.009). Proteins fall between the two. Surprisingly, although the agent discovers distinct architectures per domain (p = 0.004), every innovation transfers across all three domains with less than 1% degradation, indicating that the differences reflect search-path dependence rather than fundamental biological requirements. We release a decision framework and open-source toolkit to help molecular modeling teams choose between autonomous architecture search and simple hyperparameter tuning.
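
The abstract reports p-values for comparisons such as hyperparameter-only tuning versus full architecture search, but it does not say which statistical test produced them. As a rough illustration of how two search conditions could be compared, the sketch below runs a generic two-sided permutation test on the difference in mean validation loss. The function name `permutation_p_value`, the variable names, and all numbers are hypothetical and made up for illustration; they are not the paper's method or data.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Hypothetical helper for illustration; not taken from the paper's toolkit.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled scores to two groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Made-up final validation losses (lower is better), NOT values from the paper.
hp_only = [0.912, 0.908, 0.915, 0.910, 0.909]
full_search = [0.931, 0.927, 0.935, 0.929, 0.933]

p = permutation_p_value(hp_only, full_search)
print(f"mean hp-only loss:     {mean(hp_only):.4f}")
print(f"mean full-search loss: {mean(full_search):.4f}")
print(f"permutation p-value:   {p:.4f}")
```

A permutation test is used here only because it makes no distributional assumptions and needs nothing beyond the standard library; a small p-value would suggest that the gap between the two conditions is unlikely to arise from random assignment of runs alone.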