Methods for Inferring Interaction Potentials from Cross-Linking Mass Spectrometry Data
Börries von Seggern, Mohsen Sadeghi
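AI summary: Proposes a framework for parameterizing interaction potentials from cross-linking mass spectrometry data. By connecting the task to the inverse Henderson problem and adapting existing algorithms, it achieves efficient and accurate recovery of potential parameters in homogeneous fluids and multiphase systems.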
Comments: 19 pages, 10 figures, 5 tables
Cross-linking mass spectrometry (XL-MS) has emerged as a powerful quantitative technique for probing intra-protein structural information as well as protein-protein interactions at an unprecedented scale. XL-MS data yield information on the pairwise spatial proximity of proteins through inter-molecular linkers. However, systematic methods for adapting such data to coarse-grained interacting-particle models remain limited. Existing work focuses predominantly on directly fitting radial distribution functions (RDFs), whereas many observables, e.g. coordination numbers, are functionals of the RDF and cannot be uniquely inverted. In this work, we develop a framework for parameterizing interaction potentials from such observables in potentially phase-separated mixtures, as encountered in XL-MS results. We establish a connection between this problem and the inverse Henderson problem and adapt algorithms such as Iterative Boltzmann Inversion and Iterative Monte Carlo to its numerical solution. We derive exact and low-density-limit gradient approximations and propose two new algorithms based on an adaptation of the predictor-corrector framework. In total, we evaluate several optimization algorithms on biologically realistic ten-component test systems. We demonstrate that for homogeneous fluids, all methods achieve exceptional efficiency and accuracy. Critically, we further demonstrate successful parameterization in a challenging three-phase system. Here, three algorithms, namely Adam and gradient descent employing the low-density derivative, as well as Newton's method with the exact gradient, reliably recover the correct parameters. These results establish a clear pathway from XL-MS experiments to coarse-grained protein models for systems where phase separation governs biological function, potentially enabling new investigations of biomolecular condensates and protein aggregation.
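To make two notions from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It uses the textbook definition of a coordination number as an integral over the RDF, n(r_c) = 4πρ ∫₀^{r_c} g(r) r² dr, and the standard single-component Iterative Boltzmann Inversion update U_{k+1}(r) = U_k(r) + k_B T ln(g_k(r) / g_target(r)). The function names, the single-component setting, and the regularizer `eps` are assumptions made purely for illustration; the paper's multi-component, observable-based variants are not reproduced here.

```python
import numpy as np

def coordination_number(r, g, density, r_cut):
    """Textbook coordination number: 4*pi*rho * integral_0^r_cut g(r) r^2 dr.

    r and g are 1-D arrays on the same grid; the integral uses the
    trapezoidal rule over all grid points with r <= r_cut.
    """
    mask = r <= r_cut
    rr, gg = r[mask], g[mask]
    integrand = gg * rr**2
    # Explicit trapezoidal rule over the (possibly non-uniform) grid
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rr))
    return 4.0 * np.pi * density * integral

def ibi_update(u, g_current, g_target, kT=1.0, eps=1e-12):
    """One standard IBI step: U_{k+1} = U_k + kT * ln(g_k / g_target).

    eps (an illustrative assumption) guards against log(0) where an RDF
    vanishes, e.g. inside the repulsive core.
    """
    return u + kT * np.log((g_current + eps) / (g_target + eps))

if __name__ == "__main__":
    # Ideal-gas check: g(r) = 1 gives n = (4/3) * pi * rho * r_cut^3
    r = np.linspace(0.0, 2.0, 2001)
    print(coordination_number(r, np.ones_like(r), density=0.5, r_cut=2.0))
```

Because the coordination number collapses the whole RDF into one number, many different g(r) curves, and hence many potentials, can produce the same value; this is the non-uniqueness the abstract refers to.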