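arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.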

2605.31539 2026-06-01 cs.CV cs.LG q-bio.QM

Automated Prediction of Postoperative Pancreatic Fistula Using Preoperative Computed Tomography

Ashok Choudhary, Chris Varghese, Leo Y. Li-Han, Frank G. Lee, Ellen L. Larson, Elizabeth B. Habermann, Cornelius A. Thiels, Hojjat Salehinejad

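AI summary: Proposes an end-to-end deep learning pipeline, from pancreas segmentation to classification, that automatically predicts the risk of postoperative pancreatic fistula from preoperative CT scans, providing a tool for clinical decision-making and a methodological baseline.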

2605.31522 2026-06-01 cs.LG q-bio.GN q-bio.QM

Chem-PerturBridge: a harmonized compendium of small molecule perturbation transcriptomic effects

Artur Szałata, Olga Novitskaia, Maiia Shulman, Matthew Mella, Altynbek Zhubanchaliyev, Fabian J. Theis

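AI summary: To address the fragmentation of small-molecule perturbation transcriptomic data, the authors build Chem-PerturBridge, a harmonized resource covering 37k compounds, 136 cellular contexts, and 1.25M samples, and validate its usefulness for assessing cross-dataset signature agreement and for pretraining compound representations.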

Comments: 33 pages, 6 figures, 16 tables

Abstract

Large perturbation models require training data encompassing chemical, cellular, and assay diversity. Current transcriptomic resources for small-molecule modeling, however, are fragmented across technologies, metadata conventions, controls, doses, and preprocessing pipelines. We introduce Chem-PerturBridge, a harmonized multi-dataset resource comprising over 37k compounds, 136 cellular contexts, and 1.25M transcriptomic samples across eight assay types, with standardized identifiers, metadata, and replicate-aware condition-level effects. We use the resource to evaluate matched-condition agreement across datasets and replicate agreement within datasets. Matched same-compound conditions generally show weak agreement in fine-grained logFC rankings and magnitudes across most dataset pairs, often falling below same-context different-compound baselines. In contrast, logFC direction agreement is substantially more stable and usually exceeds these baselines. We further evaluate Chem-PerturBridge as a pretraining resource for compound representation learning. Under a compound-held-out OP3 evaluation split, embeddings pretrained on Chem-PerturBridge improve over L1000-only embeddings, Morgan fingerprints, and the descriptor-free OP3 baseline across metrics. An extensive molecule-holdout evaluation across 11 datasets further shows that models trained on Chem-PerturBridge outperform or match those that are not. Chem-PerturBridge therefore supports both diagnostic evaluation of cross-dataset signature agreement and model-oriented reuse of heterogeneous perturbation transcriptomic data.

2605.31473 2026-06-01 q-bio.NC

The Metastable Mind: Neural Underpinnings of Naturalistic Cognition Through the Synthesis of Event Segmentation and Metastable Neural States

Dora Gozukara, Nasir Ahmad, Djamari Oetringer, Linda Geerligs

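AI summary: Reviews the cognitive theory of event segmentation and the mechanistic approach of metastable neural activity, argues that the two study the same metastable neural states from different perspectives, and sets out core principles by which these states act as the fundamental computational units of cognition.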

Comments: 24 pages

Abstract

A multitude of findings and theories from cognitive, behavioural and computational neuroscience show that neural activity unfolds in a variety of meaningful temporal units. Behavioural research on event segmentation (ES) has shown that continuous experience is segmented into discrete events and sub-events, which aid real-time comprehension, memory, and decision-making. Computational neuroscience research observes and models ongoing brain activity as a series of stable population activity states that occur across wide spatial and temporal scales, referred to as metastable neural activity (MNA). Through this review, we show that these isolated branches of literature, the cognitive theory of Event Segmentation (ES) and the mechanistic approach of metastability (MNA), actually study the same metastable neural states from different perspectives. While the behavioural branch offers a theory for the cognitive and behavioural utility of segmentation, the metastability literature provides the mechanistic account at the implementational level. We describe how metastable neural states act as the fundamental computational units of cognition and identify a number of core principles of how they operate. One is the spatio-temporally nested hierarchy of states, where longer-duration states in higher-order regions both constrain and are shaped by states in faster-operating regions. Another is that neural states reflect underlying predictive models which shape perception, decision making, memory encoding and recall. Finally, neural states are periods of more modular processing, interspersed with boundaries at which connectivity is reconfigured. Understanding how neural states emerge, interact, and shape cognition brings us closer to understanding the brain in its natural mode of operation.

2605.31305 2026-06-01 q-bio.PE stat.ME

Consensus-level substitution rates are distinct from the virion-level rate

David J Pascall

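AI summary: Distinguishes the virion-level substitution rate (VLSR) from consensus-level substitution rates (CLSRs), argues that the two differ in biological meaning and are not interchangeable, and stresses that consensus-generation rules should be routinely reported.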

2605.31296 2026-06-01 q-bio.BM cs.LG

mRNAutilus: Multi-Objective-Guided Discrete Generation of mRNA with Optimized Therapeutic Properties

Sawan Patel, Sophia Tang, Yesol Kim, Yinuo Zhang, Divya Srijay, Ping-Jung Lin, Shambhavi Shubham, Fengmei Pi, Cedric Wu, Sherwood Yao, Pranam Chatterjee

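AI summary: Proposes mRNAutilus, a framework that combines a masked discrete diffusion model with Monte Carlo Tree Guidance to optimize codons and design UTRs de novo at the same time, generating complete, multi-objective Pareto-optimal mRNA sequences that markedly improve expression and stability across several targets.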

Abstract

Therapeutic mRNA design requires coordinating multiple interacting sequence features across the full transcript, where codon usage, untranslated regions (UTRs), and their coupling jointly determine stability, translation efficiency, and protein expression. Here, we present mRNA generation via unrolled trajectories and informed latent updates (mRNAutilus), a framework for simultaneous codon optimization and de novo UTR design directly from sequence. mRNAutilus combines a masked discrete diffusion model trained on millions of full-length mRNAs with Monte Carlo Tree Guidance to generate Pareto-efficient sequences under multiple functional objectives, using lightweight regressors over model embeddings to predict half-life, translation efficiency, and protein abundance. Unlike recent methods that design coding sequences and UTRs separately or rely on post hoc assembly and screening, mRNAutilus generates complete transcripts in a single process optimized across properties. Across diverse targets, zero-shot mRNAs encoding P. pyralis luciferase achieve over 400-fold higher expression than wild-type and outperform commercial and machine learning-designed baselines, including zero-shot generative approaches. Zero-shot SARS-CoV-2 Spike mRNAs exceed clinically used and commercial constructs and match or surpass lab-optimized designs with improved durability. We further demonstrate generality in therapeutic settings, including prime editing (PEMax) and programmable proteome modulation, where mRNAutilus-designed constructs enhance expression of peptide-guided E3 ligases (uAbs) for beta-catenin degradation. These results establish a sequence-based, multi-objective framework for generating functional mRNAs tailored to diverse biological applications.

2605.31274 2026-06-01 math.AP q-bio.PE

Derivation, Analysis and Simulation of a Spatio-Temporal Epidemiology Model with Memory

Hassan El Bouz, Karim Faraj, Anthony Khairallah, Fatima Mroue

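AI summary: Proposes a reaction-diffusion system with an integral memory term to model the spatio-temporal evolution of an infectious disease with asymptomatic transmission, proves local existence of weak solutions via the Faedo-Galerkin method, and applies the model to simulate the geographic spread of disease in Lebanon.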

2605.31236 2026-06-01 q-bio.BM

SwitchCraft: A Programmatic Framework for Designing State-Switching Proteins

Bowen Jing, Mihir Bafna, Anisha Parsan, Heyuan Michael Ni, David Kwabi-Addo, Bryan Bryson, Adam Klivans, Bonnie Berger

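AI summary: Proposes SwitchCraft, a framework for the rational design of multistate proteins by backpropagation through compositional design constraints parameterized by structure prediction models, and validates it in silico on a range of state-switching functional primitives and on fluorescent biosensor design.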

Comments: ICML 2026

Abstract

Multistate mechanisms underlie many of the complex functions observed in natural proteins. The ability to rationally design multistate proteins would have transformative implications for many areas of biotechnology, yet lies beyond the capabilities of existing deep learning frameworks for protein design. To address this gap, we introduce SwitchCraft, a versatile and programmatic framework for designing state-switching proteins based on backpropagation through compositional design constraints parameterized by structure prediction models. In silico evaluations demonstrate success on a wide range of state-switching functional primitives, from allosteric regulation of motifs to discrimination of bound ligand identities. Using these primitives, we demonstrate an in silico strategy for de novo design of fluorescent biosensors to arbitrary small molecule analytes. These results position SwitchCraft at the inception of a powerful paradigm for higher-order functional protein design. Code is available at https://github.com/bjing2016/switchcraft.

2605.31150 2026-06-01 q-bio.OT

Quantifying biofilm-virulence index to predict antifungal resistance in Candida albicans

Nikhil Ujlayan, Teena Singh, Vanshika Dhama, Harsh Pratap Singh, Mahesh Kumar, R K Brojen Singh

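AI summary: Proposes the Biofilm-Virulence Index (BVI), a quantitative parameter built as an additive model from crystal violet staining and CFU counts, for assessing antifungal resistance in Candida albicans.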

Comments: 9 pages, 2 figures

Abstract

Candida albicans is a commensal microorganism that causes opportunistic infections, such as oral candidiasis and vaginitis, affecting females, newborns, and immunocompromised patients. Biofilm formation can turn a commensal organism into a life-threatening one by introducing antifungal resistance. Our experiment combines crystal violet staining for biofilm biomass with CFU counts, and we statistically construct an additive BVI model from the experimental data. We propose the Biofilm-Virulence Index (BVI) as a novel quantitative parameter for assessing antifungal drug resistance in Candida albicans. The effect of the drugs on inhibition zone diameter is twofold: first, a linear increase with time during early biofilm formation; second, stabilization in later phases, correlating directly with virulence. Most BVI values remained in the mild infection range, indicating successful virulence reduction by antifungal drugs. The BVI model combines the study of biofilm and viable cell count in a single parameter, which makes comparison between samples easier during biofilm analysis. The findings suggest that combining CFU and biofilm measurements may improve interpretation of antifungal response in Candida albicans. This approach could be useful in future experimental studies investigating biofilm-associated resistance.

2605.30974 2026-06-01 q-bio.PE

Morphological routes to extinction: A mechanistic assessment of habitat loss

E. H. Colombo, L. Menon, E. Hernandez-Garcia, C. Anteneodo

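AI summary: Uses a reaction-diffusion framework to compare geometric and mechanistic metrics of habitat degradation, finding that geometric metrics systematically overestimate population persistence while mechanistic metrics reveal a rapid, accelerating approach to extinction.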

Abstract

Habitat loss driven by climate and anthropogenic pressures alters patch morphology, with critical consequences for population persistence. Geometric and mechanistic metrics are commonly used to quantify degradation, yet their respective limitations remain poorly understood. Here, we address this gap using a reaction-diffusion framework for population growth and dispersal in a viable patch embedded in a hostile environment. We compare geometric descriptors of patch shape with a mechanistic metric derived from population growth near the extinction threshold. Along degradation trajectories, we find that geometric metrics systematically overestimate persistence, suggesting moderate and decelerating impacts, whereas mechanistic indicators reveal rapid, accelerating approaches to extinction. These results highlight fundamental limitations of geometric approaches and underscore the need for mechanistic assessments when evaluating biodiversity loss in complex landscapes.

2605.30963 2026-06-01 q-bio.BM cs.AI

AMix-2: Establishing Protein as a Native Modality in Large Language Models

Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang, Jixiang Yu, Zihan Zhou, Changze Lv, Dongyu Xue, Yuxuan Song, Xinbo Zhang, Hao Wang, Jiangtao Feng, Zhiqiang Gao, Lijun Wu, Xiaoqing Zheng, Ka-Chun Wong, Lei Bai, Ya-Qin Zhang, Wei-Ying Ma, Dahua Lin, Bowen Zhou, Hao Zhou

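AI summary: Proposes AMix-2, a protein-text foundation model that treats protein as a native modality in large language models by unifying protein understanding and sequence design, and introduces a block-wise diffusion language modeling backbone that better matches the intrinsic nature of proteins.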

Comments: 30 pages, 4 figures, 12 tables

Abstract

We present AMix-2, a protein-text foundation model that establishes protein as a native modality in large language models (LLMs), unifying protein understanding and sequence design within a single foundation model. AMix-2 is built upon two key ideas: (1) a unified protein-text formulation that embeds natural language and protein sequence in a shared token space, enabling one model to perform biological reasoning and conditional design instead of separate downstream task-specialized models; and (2) a block-wise diffusion language modeling backbone that combines causal generation across blocks with bidirectional context and iterative refinement within blocks. This scheme better matches the intrinsic nature of proteins than a strict left-to-right factorization. To evaluate protein foundation models under realistic generalization settings, we further introduce ProteinArena, a comprehensive benchmark with time-aware and homology-aware protocols across various understanding and design tasks, and with baselines covering classical bioinformatics tools, protein-specialized models and LLMs. On ProteinArena, AMix-2 outperforms frontier LLMs and demonstrates competitive performance to task-specific protein models. Controlled experiments further show that the diffusion-based paradigm generally surpasses its autoregressive counterpart, highlighting the advantage of flexible generation order for protein sequences. We release both AMix-2 and ProteinArena to facilitate open research in protein foundation models.

2605.30950 2026-06-01 q-bio.PE physics.soc-ph stat.AP

Coordination without communication: beyond optimisation and geometric Brownian motion

G J Milburn, A K Ringsmuth

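AI summary: Proposes a physical framework based on information-constrained feedback in a partially observed stochastic dynamical system, in which population-level coordination arises from macro-to-micro feedback without direct communication or strategy optimisation.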

Abstract

We introduce a physically grounded framework for coordination in a population based on information constrained feedback in a partially observed stochastic dynamical system. Population size evolves as a continuous time birth death Markov process whose transition rates respond to a shared stochastic measurement signal correlated with the underlying population state. Individuals neither communicate directly nor optimise strategies; instead, coordination emerges from macro to micro feedback mediated by imperfect common information. We show that geometric Brownian motion arises as a limiting case of the conditional dynamics when measurement strength and population statistics satisfy suitable conditions. More generally, varying the signal to noise properties of the measurement channel produces a wider class of stochastic growth processes, including diffusive and jump like regimes, even though ensemble average growth remains exponential. In an appropriate limit the framework recovers the stochastic multiplicative growth model of Peters and Adamou, providing a physical interpretation of coordination as inference and feedback under partial observability.
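
As a rough illustration of the limiting case named in the abstract, the Python sketch below simulates geometric Brownian motion and contrasts the ensemble-average growth rate with the typical time-average growth rate of a single path. It is not the authors' model: the names `simulate_gbm` and `growth_rates`, the parameters `mu` and `sigma`, and the log-normal update are assumptions made for illustration, and the paper's birth-death process and measurement channel are not reproduced.

```python
import math
import random


def simulate_gbm(x0, mu, sigma, t_end, n_steps, rng):
    """Sample one GBM path on a uniform time grid.

    Uses the exact log-normal update for dX = mu*X dt + sigma*X dW,
    so the path stays strictly positive for x0 > 0.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dw)
        path.append(x)
    return path


def growth_rates(mu, sigma):
    """Return (ensemble-average rate, long-run time-average rate) for GBM.

    The mean E[X_t] grows at rate mu, while (1/t) * log X_t converges
    to mu - sigma**2 / 2 along a single path.
    """
    return mu, mu - 0.5 * sigma ** 2


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, hypothetical toy parameters
    path = simulate_gbm(x0=1.0, mu=0.05, sigma=0.2, t_end=10.0, n_steps=1000, rng=rng)
    print("final value:", path[-1])
    print("ensemble vs time-average rate:", growth_rates(0.05, 0.2))
```

The gap between the two rates grows with `sigma`, which is one way to see why exponential ensemble-average growth can coexist with much slower growth along individual trajectories.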

2605.30882 2026-06-01 q-bio.NC

Extended predictive coding framework as variational free-energy minimisation under exponential-family assumption

Asaki Kataoka, Kenji Doya

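AI summary: Extends the predictive coding framework under an exponential-family assumption so that it captures the nonlinearity and heterogeneity of biological neural networks while preserving the correspondence between the free-energy principle and predictive coding.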

Abstract

The sensory cortices of the brain perform perceptual inference efficiently through their complex networks of neurons. One of the theoretical accounts of this process is the free-energy principle (FEP), which postulates that the brain performs variational Bayesian inference. Pioneering studies have shown that FEP can correspond to the predictive coding (PC) hypothesis under the Gaussian assumption and Laplace approximation. However, PC-based implementations of FEP within such a limited Gaussian regime have failed to capture several properties of biological neural networks, such as nonlinearity and heterogeneity of input--output properties within a network, and the biological implausibility of negative firing rates. This study shows that, when a broader class of probability distributions, namely the exponential family of distributions (EFD), is assumed for the variational posterior and prior, these missing characteristics are exhibited within the network, maintaining the FEP--PC correspondence up to the second cumulant of the posterior. We also show that the proposed model can be trained by biologically plausible local plasticity rules. Our results enrich the explanatory power of FEP regarding neural dynamics involved in perception as variational inference.

2605.30864 2026-06-01 cs.HC q-bio.NC

What makes an action sequence enjoyable to watch?

Jean-Peïc Chou, Kristine Zheng, Junyi Chu, Maneesh Agrawala, Judith E. Fan

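AI summary: Uses videos of a Flappy Bird-style game to study how the difficulty and uncertainty of action sequences affect viewer enjoyment, finding that difficulty, not dangerousness, predicts enjoyment.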

Comments: 6 pages, 4 figures, CogSci 2026

Abstract

People often seek out ways to watch others perform complex action sequences (e.g., sports). What makes some sequences more enjoyable to watch than others? We generated 24 video clips of gameplay from a Flappy Bird-style video game. Clips varied in difficulty (how often players succeeded on average) and in moment-to-moment uncertainty (how likely the player was to crash at any given step). Participants (N=864) rated each video on one of three dimensions: how much they enjoyed it, how difficult the level appeared, or how dangerous the player's trajectory appeared. We found that participants preferred videos where the player seemed to be completing more difficult obstacle courses, but dangerousness did not predict enjoyment ratings. These findings show how procedurally generated stimuli can isolate the factors that affect how enjoyable an action sequence is to watch.

2605.30831 2026-06-01 q-bio.QM cs.LG physics.chem-ph

The Geometry of Activity Cliffs: Representation Dependence and Multi-Scale Characterization of Activity Landscapes

Pawel Dabrowski-Tumanski, Bartosz Topolski, Dariusz Plewczynski, Tomasz Jetka

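AI summary: Uses a six-step analysis pipeline to examine systematically how different molecular representations (such as fingerprints and embeddings) affect the definition of activity cliffs, finding that no single representation is best on all criteria and that activity cliffs are a representation-induced geometric phenomenon rather than an intrinsic property of molecule pairs.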

Abstract

Activity cliffs, structurally similar compounds with large potency differences, are widely treated as intrinsic features of chemical datasets. We argue that apart from target biology, much of our cliff understanding is a consequence of the geometry induced by the chosen molecular representation, not a property of a molecule pair itself. We designed a six-step pipeline to systematically test this hypothesis. The pipeline consists of: assessing pairwise distance geometry, cliff enrichment, activity gradient distribution, persistent homology of the cliff subspace, predictive benchmarking for a chosen pair of an embedding and a metric, and, finally, analysis of the matched molecular pairs and stereoisomers. We applied the pipeline to fifteen configurations of embeddings and metrics to build a benchmark across three distinct datasets known for their activity-cliff challenges. No representation excels on all criteria: Morgan Tanimoto provides the strongest cliff enrichment and cross-scaffold generalization; MolFormer cosine provides the only meaningful stereochemical sensitivity; MACCS and RDKit Dice fingerprints are most sensitive to matched-molecular-pair transformations; ChemBERTa fails uniformly due to embedding collapse. These findings are not a ranking. They reflect the fact that different representations encode different aspects of molecular recognition, and that choosing one implicitly defines what an activity cliff actually is.
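
To make the representation-dependence argument concrete, the Python sketch below flags activity cliffs as pairs whose Tanimoto similarity and potency gap both exceed a threshold. It is a minimal illustration, not the paper's pipeline: fingerprints are modelled as plain sets of on-bit indices, and the names `tanimoto` and `find_cliffs`, the thresholds `sim_min` and `delta_min`, and the toy data are all assumptions.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def find_cliffs(fps, potency, sim_min=0.9, delta_min=2.0):
    """Return (id_i, id_j, similarity, potency_gap) for every cliff pair.

    A pair counts as a cliff when it is similar enough (>= sim_min) and
    its potency values differ enough (>= delta_min, e.g. in pIC50 units).
    """
    cliffs = []
    for i, j in combinations(sorted(fps), 2):
        sim = tanimoto(fps[i], fps[j])
        gap = abs(potency[i] - potency[j])
        if sim >= sim_min and gap >= delta_min:
            cliffs.append((i, j, sim, gap))
    return cliffs


if __name__ == "__main__":
    # Hypothetical toy data: the same three molecules under two representations.
    potency = {"a": 8.5, "b": 5.0, "c": 8.4}
    fine = {"a": {1, 2, 3, 4}, "b": {1, 2, 3, 4, 5}, "c": {7, 8}}
    coarse = {"a": {1, 2}, "b": {3, 4}, "c": {7, 8}}
    print(find_cliffs(fine, potency, sim_min=0.75))
    print(find_cliffs(coarse, potency, sim_min=0.75))
```

In the toy data the pair ("a", "b") is a cliff under the first representation and not under the second, although the molecules and potencies are unchanged. That is the sense in which a cliff depends on the chosen representation and metric.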

2605.30662 2026-06-01 cs.LG q-bio.PE

Spatio-temporal stochastic graph-based learning for infectious disease forecasting

Luz Stefani Sotomayor Valenzuela, Susanna Cramb, Darren Wraith

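AI summary: Proposes a spatio-temporal graph architecture that integrates a stochastic formulation and an uncertainty approximation process to forecast new infectious disease cases, with competitive performance on COVID-19 and chickenpox data sets.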

Comments: Preprint under review

Abstract

Spatio-temporal graph-based models have typically been used to forecast new cases of infectious diseases such as COVID-19 and chickenpox outbreaks. However, the use of stochastic modelling in their learning process has been surprisingly under-investigated, and studies have rarely considered entire data sets of large countries. As a result, it is unknown whether these models would provide accurate forecasts in real-world disease spread scenarios. In this work, we propose a spatio-temporal stochastic graph-based architecture that integrates a stochastic formulation and uncertainty approximation process to forecast new infectious disease cases. We find that our approach can adapt to encode large and small population geographical networks within a single model architecture. Using two real-world data sets, COVID-19 in the US and chickenpox in Hungary, we report an enhanced effect of the proposed architecture across predictions of the 2022 first wave for COVID-19 in the US and comparative results of chickenpox waves during 2012-2014 in Hungary. By benchmarking with four spatio-temporal graph-based models, quantitative results show competitive overall weekly performance of the proposed approach on forecasting new cases for all 3,218 US counties and all 20 Hungarian counties. The proposed approach can represent overall epidemic progression relative to baselines, though with a one-step delay, while exhibiting a reduced sensitivity to high-frequency and low-amplitude variability.

2605.30635 2026-06-01 cs.LG q-bio.GN

CellBRIDGE: Learning Cellular Trajectories via Interaction-Aware Alignment

Silas Ruhrberg Estévez, Nicolas Huynh, Tennison Liu, Roderik M. Kortlever, Gerard I. Evan, David L. Bentley, Mihaela van der Schaar

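AI summary: Proposes CellBRIDGE, which incorporates a ligand-receptor-mediated cell-cell communication cost into an optimal transport framework to improve trajectory inference and cross-snapshot coupling in single-cell RNA-seq data.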

Journal ref: ICML 2026

Abstract

Inferring dynamics from population snapshots is a fundamental challenge in machine learning and biology. In scRNA-sequencing (scRNA-seq), destructive measurements preclude direct tracking of individual cells across time, making trajectory inference underdetermined. Optimal Transport (OT) provides a principled framework for snapshot alignment, but a long-standing modeling question is which cost functions yield biologically meaningful couplings. Standard OT approaches rely on gene-expression distances, implicitly treating cells as independent points and neglecting structured cell-cell communication mediated by ligand-receptor signaling. We introduce CellBRIDGE (Cell-Based Regularized Interaction-Driven Gene Expression), which augments feature-based OT with a directed, typed interaction cost derived from ligand-receptor activity. By explicitly modeling cell-cell communication, CellBRIDGE improves cross-snapshot couplings and downstream trajectory estimates across synthetic and real scRNA-seq datasets relative to feature-only baselines. Notably, CellBRIDGE enables mechanistically interpretable in silico perturbations: on lung cancer data, silencing specific ligand-receptor pairs induces trajectory shifts that recapitulate expected effects of targeted pathway inhibition.
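
The general idea of adding an interaction term to a feature-based transport cost can be sketched in a few lines of Python. This is only a toy under stated assumptions, not CellBRIDGE itself: the entropic (Sinkhorn) solver, the additive form `feature_cost + lam * interaction_cost`, and the names `sinkhorn`, `combined_cost`, `lam`, and `eps` are all hypothetical choices, and the abstract does not specify how the interaction cost is actually built.

```python
import math


def combined_cost(feature_cost, interaction_cost, lam):
    """Element-wise feature_cost + lam * interaction_cost (assumed additive form)."""
    return [
        [f + lam * g for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(feature_cost, interaction_cost)
    ]


def sinkhorn(cost, a, b, eps=0.1, n_iter=500):
    """Entropy-regularized OT plan between marginals a and b via Sinkhorn scaling."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the target marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    # Toy case: expression distances cannot tell the two targets apart,
    # while a hypothetical interaction cost favours the diagonal coupling.
    feature = [[1.0, 1.0], [1.0, 1.0]]
    interaction = [[0.0, 1.0], [1.0, 0.0]]
    marginal = [0.5, 0.5]
    print(sinkhorn(combined_cost(feature, interaction, 0.0), marginal, marginal))
    print(sinkhorn(combined_cost(feature, interaction, 1.0), marginal, marginal))
```

With `lam = 0` the plan is uniform because the feature cost is uninformative. With `lam = 1` the mass concentrates on the pairs the interaction term favours, which illustrates how an extra cost term can resolve couplings that expression distance alone leaves underdetermined.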

2605.30591 2026-06-01 q-bio.QM

Obesity and Sociodemographic Factors in Luminal Breast Cancer

Vacanti Anderson, Paramahansa Pramanik, Haley K. Robinson

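AI summary: Analyzing 1,928 Luminal A and 1,610 Luminal B breast cancer patients, the study finds that higher BMI and African ancestry are independently associated with increased odds of Luminal B disease, that postmenopausal status is associated with lower risk, and that BMI partially mediates the association between ancestry and Luminal B disease.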

Comments: 33 pages, 7 figures

Abstract

Luminal breast cancers represent the most prevalent molecular subtype of breast carcinoma, with Luminal A tumors generally associated with more favorable clinical outcomes than Luminal B tumors. Obesity-related inflammation and prolonged exposure to exogenous steroids have been implicated in the progression of luminal malignancies. This study evaluated 1,928 patients with Luminal A breast cancer and 1,610 patients with Luminal B breast cancer to examine associations among body mass index (BMI), age, ethnic background, menopausal status, and receptor expression, including estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Patients with Luminal B tumors demonstrated a significantly greater mean BMI compared with those with Luminal A tumors. In addition, Luminal B tumors were more frequently observed among patients of African ancestry relative to White and Hispanic populations. Multivariable analyses revealed that elevated BMI and African ancestry were independently associated with increased odds of Luminal B carcinoma, whereas postmenopausal status was associated with lower risk. Mediation analysis further indicated that BMI partially explained the association between ancestry and Luminal B disease. These findings suggest that obesity and population-specific factors may contribute to the development of more aggressive luminal breast cancer phenotypes.

2605.30566 2026-06-01 physics.soc-ph econ.TH q-bio.PE

Participation Costs Narrow Democratic Cooperation

Mohammad Salahshour, Fjolle Shabani, Urs Fischbacher, Iain D. Couzin

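AI summary: Uses an evolutionary model and an online experiment to study how voting costs affect self-sustaining cooperation under democratic allocation of public-good returns, finding that voting costs thin the active electorate and lead to democratic free riding.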

Comments: 32 pages, 6 figures

Abstract

Collective action often requires institutions that make cooperation individually worthwhile. We ask whether democratic allocation of public-good return can transform a repeated public good into a self-sustaining cooperative institution, and how participation costs reshape that process. A simple evolutionary model shows that voted redistribution can support a prosocial allocation order, but can also sustain an antisocial allocation order or democratic free riding, in which individuals benefit from an institution maintained by others while avoiding the cost of participation. The model predicts competing effects of voting cost. Cost can suppress use of the institution to reward low contributors under strong selection, but can also thin the active electorate and erode contributor-rewarding support. We test these predictions in a preregistered online experiment with \NIncludedGroupsVone{} five-person groups. Endogenous democratic redistribution increased contributions relative to an equal-share public-goods control, with zero-cost voting producing the strongest temporal improvement. Voting costs did not mainly turn active voters toward low-contributor-rewarding allocation. Instead, they shifted behavior toward abstention and democratic free riding, made abstention locally rewarding, and widened the gap between post-task perceptions of democratic participation and the behavioral record. Democratic allocation can therefore stabilize cooperation, but participation costs can reduce the number of people actively sustaining the institution and can make that erosion less visible to participants themselves.

2605.30556 2026-06-01 cs.LG q-bio.NC

Supervised Training Rapidly Degrades Early Visual Cortex Alignment Across Biologically Plausible Learning Rules

监督训练在生物合理学习规则下迅速降低早期视觉皮层对齐

Nils Leutenegger

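AI summary: Untrained networks match or exceed trained networks in representational similarity to early visual cortex. By tracking alignment with human fMRI data during training under four learning rules (BP, FA, PC, STDP), the study shows that global error signals (BP) reshape early representations more drastically than local learning rules (PC, STDP).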

Comments: 7 pages, 4 figures

AI中文摘要

随机、未训练的神经网络在早期视觉皮层的表征相似性上始终达到或超过训练网络。这一令人困惑的发现挑战了学习能改善大脑对齐的假设。我们通过追踪四种学习规则（反向传播BP、反馈对齐FA、预测编码PC和脉冲时序依赖可塑性STDP）在训练过程中与人类fMRI数据的表征相似性分析（RSA）对齐来研究这一问题。使用THINGS数据库中的720张物体图像和三名被试在六个视觉ROI上的fMRI数据，我们在八个训练检查点（epoch 0-40）测量模型与大脑表征相异矩阵之间的Spearman相关性。我们发现：（1）单个训练epoch根据学习规则不同使V1对齐降低25-90%；（2）反向传播对V1对齐的降低最为严重（delta r = -0.080），而预测编码和STDP保留更多（delta r ~ -0.04）；（3）在物体选择皮层（LOC）中出现较弱的相反趋势，BP在训练中对齐增加最大，但绝对变化很小。这些结果表明，未训练架构仅通过归纳偏置捕获低级视觉统计，且全局误差信号（BP）比局部学习规则（PC、STDP）更激进地重塑早期表征，后者更好地保留了类脑结构。

英文摘要

Random, untrained neural networks consistently match or exceed trained networks in representational similarity to early visual cortex. This puzzling finding challenges the assumption that learning improves brain alignment. We investigate it by tracking representational similarity analysis (RSA) alignment to human fMRI data across training for four learning rules: backpropagation (BP), feedback alignment (FA), predictive coding (PC), and spike-timing-dependent plasticity (STDP). Using 720 object images from the THINGS database and fMRI data from three subjects across six visual ROIs, we measure Spearman correlations between model and brain representational dissimilarity matrices at eight training checkpoints (epochs 0-40). We find that (1) a single epoch of training reduces V1 alignment by 25-90%, depending on the learning rule; (2) backpropagation reduces V1 alignment most severely (delta r = -0.080), while predictive coding and STDP preserve substantially more (delta r ~ -0.04); and (3) a weaker, opposite tendency appears in object-selective cortex (LOC), where BP shows the largest increase in alignment during training, although the absolute change is small. These results suggest that untrained architectures capture low-level visual statistics through inductive biases alone, and that global error signals (BP) reshape early representations more aggressively than local learning rules (PC, STDP), which better preserve brain-like structure.
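
The RSA measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the author's pipeline: the dissimilarity measure (1 minus Pearson correlation between stimulus patterns) is an assumption, since the abstract does not name one, and the function names `rdm_upper` and `rsa_alignment` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def rdm_upper(patterns):
    """Upper triangle of the representational dissimilarity matrix.

    patterns[i] is the response vector (model units or voxels) to stimulus i.
    Dissimilarity is assumed to be 1 - Pearson correlation.
    """
    n = len(patterns)
    return [1 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def rsa_alignment(model_patterns, brain_patterns):
    """Spearman correlation between model and brain RDMs (rank, then Pearson)."""
    return pearson(ranks(rdm_upper(model_patterns)),
                   ranks(rdm_upper(brain_patterns)))
```

Computing `rsa_alignment` once per checkpoint and per ROI would give the kind of alignment-versus-epoch curve the abstract reports; only the upper triangle is compared because an RDM is symmetric with a zero diagonal.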

2605.30552 2026-06-01 q-bio.NC

High-Fidelity 3D Simulator for Synthetic fNIRS Data Generation

用于合成fNIRS数据生成的高保真3D模拟器

Condell Eastmond, Niels Bracher, Xavier Intes, Stefan T. Radev

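AI summary: Proposes a 3D fNIRS simulator built on mesh-based Monte Carlo simulation that generates full-head synthetic recordings with high spatiotemporal fidelity, addressing the scarcity of annotated data.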

AI中文摘要

功能性近红外光谱（fNIRS）通过测量皮层中氧合和脱氧血红蛋白的任务相关变化，提供了一种非侵入性的脑活动观测窗口。fNIRS的一个关键优势是它有望用于复杂现实环境中的移动参与者，例如行走、运动、课堂学习、驾驶模拟或社交互动。然而，由于运动伪影、生理噪声和其他混杂因素，分析fNIRS数据具有挑战性。这一挑战因标注数据集的有限可用性而进一步加剧，这阻碍了新分析流程的开发与验证，尤其是在AI方法日益普及的背景下。认识到这些挑战，我们引入了一个3D fNIRS模拟器，该模拟器使用基于网格的蒙特卡罗模拟来创建具有高时空保真度的生理逼真全头合成记录。我们的模拟器将解剖学上准确的灵敏度分布与参数化的血流动力学响应、系统生理学和非系统伪影模型相结合。因此，用户可以生成几乎无限量的标注数据集，用于测试去噪算法、数据增强、机制建模或计算机模拟实验。我们使用来自开源手指敲击、疼痛评估和手术技能数据集的实验fNIRS数据验证了该模拟器，并提供了开源实现以支持可重复性和广泛采用。

英文摘要

Functional near-infrared spectroscopy (fNIRS) provides a noninvasive window into brain activity by measuring task-related changes in oxygenated and deoxygenated hemoglobin in the cortex. A key advantage of fNIRS is its promise of use with mobile participants in complex, real-world environments, such as walking, sports, classroom learning, driving simulations, or social interactions. However, analyzing fNIRS data is challenging because of motion artifacts, physiological noise, and other confounding factors. This challenge is further compounded by the limited availability of annotated datasets, which hinders the development and validation of new analysis pipelines, particularly given the growing use of AI methods. Recognizing these challenges, we introduce a 3D fNIRS simulator that uses mesh-based Monte Carlo simulations to create physiologically realistic, full-head synthetic recordings with high spatiotemporal fidelity. Our simulator combines anatomically accurate sensitivity profiles with parameterized models of hemodynamic responses, systemic physiology, and nonsystematic artifacts. As a result, users can generate virtually unlimited labeled datasets for testing denoising algorithms, data augmentation, mechanistic modeling, or in silico experimentation. We validate the simulator using experimental fNIRS data from open-source finger-tapping, pain-assessment, and surgical-skill datasets and provide an open-source implementation to support reproducibility and broad adoption.

2605.30522 2026-06-01 physics.soc-ph cs.SI q-bio.NC

Private Noise and Public Error in Collective Information Acquisition

集体信息获取中的私人噪声与公共错误

Mohammad Salahshour, Sumanth Bhargava, Kajal Kumari, Niccolo Pescetelli, Yasser Roudi, Bahador Bahrami, Iain D. Couzin

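AI summary: Using an online experiment and model analysis, the study examines how the type of communication noise (comprehension noise versus production noise) affects collective information acquisition, and finds that production noise creates shared wrong signals that keep groups converged on a wrong value for longer.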

Comments: 48 pages, 8 figures

AI中文摘要

集体信息获取要求群体将个人证据与社会信息相结合，同时保持与外部状态的耦合。通信噪声会影响这一过程，但噪声的作用仍不清楚。在一项在线实验中，600名参与者以四人小组形式工作，在25轮中估计室温，同时接收忠实的社会信息、理解噪声（每个接收者看到独立扰动的社会信息）或生产噪声（扰动在显示前存储，可被多个接收者看到）。温度计线索是客观真实的，但其可靠性主观不确定，无单位的50–250室温范围造成了显示证据与日常温度预期之间的任务诱发冲突。生产噪声组比理解噪声组更多轮次紧密聚集在错误值周围（p=0.016，组级置换检验）。生产噪声更常产生错误的共同信号（p=0.025，Fisher精确检验），并使该信号持续更多轮次（p=0.004，置换检验）。动态更新模型表明，生产噪声并非因为人们更强烈地追随同伴而更有害，而是因为相同的同伴影响作用于更相关的生产噪声扰动。探索性人类分析将该机制与心理模式联系起来，而GPT智能体实验阐明了一个边界条件：GPT智能体通过降低置信度来记录不确定性，但没有重现人类规模的生产噪声脆弱性。总体而言，噪声并非简单地降低集体信息获取。理解噪声有时可以相对于忠实控制改善校正，而生产噪声则可能将扰动转化为共同证据并稳定共识于错误。

英文摘要

Collective information acquisition requires groups to combine personal evidence with social information while remaining coupled to the external state. Communication noise can affect this process, but the role of noise remains unclear. In an online experiment, 600 participants worked in four-person human groups estimating a room temperature across 25 rounds while receiving either faithful social information, comprehension noise in which each receiver saw independently perturbed social information, or production noise in which perturbations were stored before display and could be seen by multiple receivers. The thermometer cue was objectively veridical, but its reliability was subjectively uncertain and the unitless 50–250 room-temperature range created a task-induced conflict between displayed evidence and everyday temperature expectations. Production-noise groups spent more rounds tightly clustered around a wrong value than comprehension-noise groups (p=0.016, group-level permutation). Production noise more often created a wrong common signal (p=0.025, Fisher's exact test) and made that signal persist across more rounds (p=0.004, permutation). Dynamic update models showed that production noise was not more harmful because people followed peers more strongly, but because the same peer influence acted on more correlated production-noise perturbations. Exploratory human analyses linked the mechanism to psychological patterns while a GPT-agent experiment clarified a boundary condition: GPT agents registered uncertainty through reduced confidence without reproducing human-scale production-noise vulnerability. Overall, noise did not simply degrade collective information acquisition. Comprehension noise could sometimes improve correction relative to the faithful control, whereas production noise could turn perturbations into common evidence and stabilize consensus on error.

2605.30518 2026-06-01 q-bio.QM

Gaussian Mixture Model-Based Focused Refinement for Enhanced Flexible Structure Determination in CryoEM and CryoET

基于高斯混合模型的聚焦精化用于增强冷冻电镜和冷冻电子断层扫描中的柔性结构确定

Muyuan Chen

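AI summary: Proposes a Gaussian mixture model-based focused alignment pipeline that, by correcting subunit motion and capturing rotational dynamics, improves the resolution of small domains in highly dynamic proteins and reveals complex conformational changes.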

2605.04200 2026-06-01 q-bio.NC

Neural Manifolds as Crystallized Embeddings: A Synthesis of the Free Energy Principle, Generalized Synchronization, and Hebbian Plasticity

神经流形作为晶化嵌入：自由能原理、广义同步与赫布可塑性的综合

Vikas N. O'Reilly-Shah

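AI summary: Proposes generalized synchronization as the bottom-up mechanism implementing the free energy principle and combines it with Hebbian plasticity, which consolidates the correlations produced by sensory-driven synchronization into a continuous attractor network, to explain the formation of neural manifolds and perceptual phenomena.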

Comments: Updated to expand open mathematical problems and incorporate prediction-separation link as specific predictions of the synthesis

AI中文摘要

自由能原理将感知视为变分推理，但其生物学实现尚不明确。广义坐标形式并非声称神经元计算任意泰勒展开的实在主张。本文论证广义同步提供了缺失的自底向上机制。某些循环回路满足收缩性质：邻近轨迹指数收敛。受结构化感觉输入驱动的收缩回路同步于驱动动力学。在一般嵌入条件下，所得同步映射将低维感觉流形嵌入神经状态空间。自由能原理预测的几何结构并非由显式贝叶斯神经计算自上而下强加，而是源于普通循环动力学。接着我提出一个发展性扩展。作用于感觉驱动同步产生的相关性的赫布可塑性将嵌入流形塑造为循环连接，产生一个近似嵌入感觉流形的连续吸引子网络。预测-分离结果通过预测精度约束所得回路的表征保真度：在网络能很好预测未来观测的地方，同步映射分离潜在状态；在预测失败的地方，表征崩溃。这些崩溃可观察为范畴感知、同质等效应和辨别阈值。据此观点，成熟的头方向、网格细胞和刺激驱动的视觉流形是三个相互作用过程的发展产物：动力学收缩、广义同步和基于相关性的可塑性。核心开放问题在于赫布不动点是否存在，以及赫布动力学能否在相关输入分布上产生足够精确的预测器。

英文摘要

The free energy principle casts perception as variational inference, but its biological implementation is underspecified. The generalized-coordinate formalism is not a literal claim that neurons compute arbitrary Taylor expansions. This paper argues that generalized synchronization (GS) provides the missing bottom-up mechanism. Certain recurrent circuits satisfy a contraction property: nearby trajectories converge exponentially. A contracting circuit driven by structured sensory input synchronizes to driving dynamics. Under generic embedding conditions, the resulting synchronization map embeds the low-dimensional sensory manifold into neural state space. The geometry predicted by the free energy principle is not imposed from above by an explicitly Bayesian neural calculus. It arises from ordinary recurrent dynamics. I then propose a developmental extension. Hebbian plasticity acting on the correlations generated by sensory-driven synchronization shapes the embedded manifold into recurrent connectivity, producing a continuous attractor network that approximates the embedded sensory manifold. Prediction-separation results bound the representational fidelity of the resulting circuit by prediction accuracy: where the network predicts future observations well, the synchronization map separates underlying states; where prediction fails, the representation collapses. The collapses are observable as categorical perception, metameric equivalence, and discrimination thresholds. On this view, mature head-direction, grid-cell, and stimulus-driven visual manifolds are developmental products of three interacting processes: dynamical contraction, generalized synchronization, and correlation-based plasticity. The central open problems are whether the Hebbian fixed point exists and whether Hebbian dynamics produce a sufficiently accurate predictor on the relevant input distribution.
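
The contraction property mentioned in the abstract (nearby trajectories converge exponentially under a shared drive) can be illustrated with a toy example. This is a sketch under stated assumptions, not the paper's model: the one-dimensional map `x <- tanh(a*x + u)` and the names `driven_step` and `final_gap` are hypothetical. Because `tanh` is 1-Lipschitz, the map contracts with rate at most `|a|` whenever `|a| < 1`.

```python
import math


def driven_step(x, u, a=0.5):
    """One update of a toy contracting unit driven by input u (assumes |a| < 1)."""
    return math.tanh(a * x + u)


def final_gap(x0_a, x0_b, drive, a=0.5):
    """Distance between two trajectories that share the drive but start apart."""
    xa, xb = x0_a, x0_b
    for u in drive:
        xa, xb = driven_step(xa, u, a), driven_step(xb, u, a)
    return abs(xa - xb)


# A structured "sensory" drive; both copies of the unit receive the same input.
drive = [math.sin(0.3 * t) for t in range(60)]
gap = final_gap(-0.9, 0.8, drive)
```

After enough steps the state depends only on the drive history and not on the initial condition, which is the sense in which a contracting circuit synchronizes to its input.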

2605.30463 2026-06-01 q-bio.GN

Meta-analysis of scRNA-seq data for choroidal endothelial cells in dry Age-related Macular Degeneration

干性年龄相关性黄斑变性中脉络膜内皮细胞的单细胞RNA测序数据荟萃分析

Kyle M. Veksler, Levi Dong, Timothy A. Blenkinsop, Aurelian Radu

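AI summary: A meta-analysis of public scRNA-seq data finds that angiogenesis is initiated in dry AMD but fails to execute, leaving the RPE with insufficient blood supply, and proposes a unitary hypothesis that both dry and wet AMD are initiated by choroidal endothelial cell dysfunction.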

AI中文摘要

导致干性年龄相关性黄斑变性（AMD）的机制在很大程度上尚未阐明，这阻碍了有效疗法的引入。文献中存在实验支持脉络膜内皮细胞（ChEC）功能障碍先于黄斑视网膜色素上皮（RPE）丢失的假说，而RPE丢失可能仅是血供不足的继发后果。如果是这样，针对ChEC水平的干预可能构成一种未被充分研究的治疗策略。关于早期或中期干性AMD转录变化的公开数据集是可用的，但其中一些数据集的ChEC信息尚未被分析，或未使用最新最强大的软件工具进行分析。我们在此展示通过生物信息学分析这些数据集产生的新数据。主要新发现是：在干性AMD中，血管生成被启动，如同在湿性AMD中一样。然而，与湿性AMD相反，干性AMD中的血管生成未能执行，因此支持RPE的血供逐渐不足，导致其功能障碍和死亡。这些数据支持干性和湿性AMD起源/起始/病因的统一假说，即两者均由ChEC功能障碍引发——干性AMD中血管生成不足/失败，或湿性AMD中血管生成过度。通路分析还揭示了Notch和TNF信号、内皮-间充质转化（EndoMT）、线粒体、“流体剪切应力”、“破骨细胞分化”和“钙化/骨质疏松”的扰动。总体而言，新数据为实验研究提供了依据，以验证和进一步表征这些扰动，并研究纠正它们的策略。

英文摘要

The mechanisms that lead to dry age-related macular degeneration (AMD) are largely unelucidated, which prevents the introduction of effective therapies. Experimental support exists in the literature for the hypothesis that choroidal endothelial cell (ChEC) dysfunction precedes the loss of macular retinal pigment epithelium (RPE), which may be only a secondary consequence of inadequate blood supply. If so, interventions at the level of ChECs could constitute an underinvestigated therapeutic strategy. Datasets regarding the transcriptional changes in early or intermediate dry AMD are publicly available, but for some of them the information about ChECs has not been analyzed, or has not been analyzed using the most powerful and recent software tools. We present here new data generated by our bioinformatics analysis of these datasets. The main new finding is that angiogenesis is initiated in dry AMD, as it is in wet AMD. However, contrary to wet AMD, in dry AMD angiogenesis fails to execute, and therefore the blood supply that supports the RPE becomes gradually insufficient, leading to its dysfunction and death. The data support a unitary hypothesis of the origin / initiation / etiology of both dry and wet AMD, namely that both are initiated by ChEC dysfunction: either insufficient / abortive angiogenesis in dry AMD, or excessive angiogenesis in wet AMD. Pathway analysis also reveals perturbations in Notch and TNF signaling, endothelial-to-mesenchymal transition (EndoMT), mitochondria, "fluid shear stress", "osteoclast differentiation" and "calcification/osteoporosis". Overall, the new data provide a rationale for experimental studies to validate and further characterize these perturbations and to investigate strategies to correct them.

2605.30399 2026-06-01 q-bio.QM cs.LG eess.IV

A Novel Computer Vision Approach for Assessing Fish Responses to Intrusive Objects in Aquaculture

一种用于评估鱼类对水产养殖中侵入性物体反应的新型计算机视觉方法

Hanne-Grete Alvheim, Stian Mjelde Jakobsen, Martin Føre, Eleni Kelasidi

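AI summary: Proposes a new stereo-vision method based on YOLOv8, ByteTrack, SuperGlue, and triangulation to detect, track, and estimate the 3D positions of fish, in order to analyze how structures of different shapes, sizes, and colors affect fish behavior.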

AI中文摘要

水产养殖业需要应对若干挑战，以确保可持续的海产品生产满足日益增长的全球需求。其中一个主要挑战是确保生产过程中鱼类健康良好和福利可接受，因为改善鱼类福利在当前和未来的生产系统中至关重要。本研究通过开发和实施方法，识别鱼类对侵入性物体的个体和群体行为反应，从而解决这一问题。因此，我们开发了一种检测、跟踪和估计个体鱼类三维位置的新方法，并专门设计用于跟踪工业海水网箱中养殖鱼类的尾鳍。跟踪数据采用一种新型立体视觉方法进行处理，该方法适用于估计鱼类的位置、速度、加速度以及转向和俯仰角。随后，分析了从工业规模养鱼场获得的数据集，以识别不同形状、大小和颜色的结构对鱼类行为的影响。该方法使用手动标注的尾鳍进行训练，并采用YOLOv8结合ByteTrack作为目标检测器和跟踪器，SuperGlue用于匹配左右帧中的检测结果，以及三角测量来重建鱼类的三维位置。测试了不同的图像预处理和增强方法以提高目标检测准确性，并比较了它们的性能，同时测试了RAFT-Stereo用于深度估计。获得的结果既验证了该方法相对于先前研究工作的性能，也展示了该方法在提供对海水网箱中行为动态更深入理解方面的新颖性和潜力。

英文摘要

The aquaculture industry needs to address several challenges to secure sustainable seafood production that can serve an increasing global demand. One major challenge is to ensure good fish health and acceptable welfare during production, since the improvement of fish welfare is of vital importance in current and future production systems. In this study, this is addressed by developing and implementing methods to identify fish behaviors in response to intrusive objects on both an individual and a group basis. A novel approach for detecting, tracking, and estimating the 3D position of individual fish has thus been developed, specifically designed to track the caudal fins of farmed fish in industrial sea cages. The tracking data were processed with a novel stereo-vision method adapted to estimate fish positions, velocities, accelerations, and turning and pitch angles. Datasets obtained from industrial-scale fish farms were then analyzed to identify the impact of structures of varying shapes, sizes, and colors on fish behavior. The method was trained using manually labeled caudal fins and used YOLOv8 with ByteTrack as an object detector and tracker, SuperGlue for matching detections in the left and right frames, and triangulation to reconstruct the 3D positions of the fish. Different image pre-processing and augmentation methods for enhancing object detection accuracy were tested and their performance compared, while RAFT-Stereo was tested for depth estimation purposes. The obtained results both validate the method's performance against previous research efforts and demonstrate the novelty and potential of this method in providing more insight into behavioral dynamics in sea cages.
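
The triangulation step named in the abstract can be sketched for the simplest case. This is a minimal illustration, assuming an ideal rectified stereo pair with a shared focal length and principal point; the paper's actual calibration and triangulation procedure are not described in the abstract, and the function names and parameters here are hypothetical.

```python
import math


def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """3D point (x, y, z) in metres, in the left camera frame.

    Assumes rectified images, so a matched point lies on the same row v in
    both frames and depth follows from disparity: z = f * B / (u_left - u_right).
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


def speed(p0, p1, dt):
    """Mean speed between two triangulated positions taken dt seconds apart."""
    return math.dist(p0, p1) / dt
```

With matched caudal-fin detections in the left and right frames, applying `triangulate_rectified` per frame gives a 3D track, and finite differences such as `speed` give the kind of velocity estimates the abstract refers to.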

2605.30382 2026-06-01 q-bio.PE q-bio.QM

On the Connection Between Differential Population Growth Rate and Epidemic Reproduction Numbers

关于差异种群增长率与流行病再生数之间联系的探讨

Hong Qin

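AI summary: Through theoretical derivation and empirical analysis (covering SARS-CoV-2 and influenza data), the paper establishes a mathematical link between the differential population growth rate (DPGR) estimated from genomic surveillance and the epidemic reproduction number (Rt), and validates DPGR for predicting variant transmission advantage.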

Comments: 23 pages, 5 figures

AI中文摘要

在大流行期间，公共卫生机构需要快速评估新的病毒变异株是否比现有谱系更具传播性。对于共同流行的变异株，相对适应性可以表示为选择系数，即从基因组监测估计的差异种群增长率（DPGR），或者，在附加假设下，表示为流行病再生数$R_t$的对比。我们证明DPGR估计的是成对的增长率差异。在指定的代际间隔模型下，该差异可以转化为再生数空间；在等世代时间SIR特例中，它简化为变异特异性$R_t$的缩放差异。相关的增长率对比也出现在多项逻辑和增长优势随机游走模型中，尽管这些方法在似然、平滑、先验和数据输入方面与DPGR不同。我们通过总计超过2200个匹配数据点的五项SARS-CoV-2和流感分析评估了该理论。当真实$R_t$已知时，SIR模拟恢复了预期的映射，回顾性SARS-CoV-2分析显示，在变异株占主导地位前43至65天，DPGR信号持续存在，在我们的分析中符号准确率为95%。DPGR在谱系三元组中近似可传递，对于选定的功能相似亚谱系接近零，并且在各国之间方向一致。这些结果通过一个假设明确的增长率桥梁，将基于序列计数的适应性估计与再生数对比联系起来。

英文摘要

During pandemics, public health agencies need to rapidly assess whether a new viral variant is more transmissible than existing lineages. For co-circulating variants, relative fitness can be expressed as a selective coefficient, as the differential population growth rate (DPGR) estimated from genomic surveillance, or, with additional assumptions, as a contrast in epidemic reproduction numbers $R_t$. We show that DPGR estimates a pairwise growth-rate difference. Under a specified generation-interval model, this difference can be transformed into reproduction-number space; in the equal-generation-time SIR special case, it reduces to a scaled difference in variant-specific $R_t$. Related growth-rate contrasts also appear in multinomial logistic and growth-advantage random-walk models, although those methods differ from DPGR in likelihood, smoothing, priors, and data inputs. We evaluate the theory across five SARS-CoV-2 and influenza analyses totaling more than 2,200 matched data points. SIR simulation recovers the expected mapping when the true $R_t$ is known, and retrospective SARS-CoV-2 analyses show sustained DPGR signals 43 to 65 days before variant dominance, with 95% sign accuracy in our analysis. DPGR is approximately transitive across lineage triplets, near zero for selected functionally similar sublineages, and directionally consistent across countries. These results connect sequence-count-based fitness estimates to reproduction-number contrasts through an assumption-explicit growth-rate bridge.
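
The growth-rate bridge described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: the pairwise growth-rate difference is taken as the least-squares slope of the log count ratio over time, and the conversion uses the textbook SIR relation r = gamma * (R_t - 1) with a recovery rate gamma shared by both variants. The function names are hypothetical.

```python
import math


def log_ratio_slope(times, counts_a, counts_b):
    """Least-squares slope of log(counts_a / counts_b) against time.

    If each variant grows roughly exponentially, this slope estimates the
    difference of their growth rates, r_a - r_b (per unit of time).
    """
    y = [math.log(a / b) for a, b in zip(counts_a, counts_b)]
    n = len(times)
    mean_t, mean_y = sum(times) / n, sum(y) / n
    num = sum((t - mean_t) * (v - mean_y) for t, v in zip(times, y))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def rt_difference_sir(growth_diff, gamma):
    """R_a - R_b implied by a growth-rate difference in an SIR model.

    Assumes r_i = gamma * (R_i - 1) with the same gamma for both variants,
    so the difference in reproduction numbers is (r_a - r_b) / gamma.
    """
    return growth_diff / gamma
```

Working with the ratio of the two counts is what makes the estimate pairwise: sampling effort that scales both variants' counts by the same factor cancels inside the logarithm.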

2605.30372 2026-06-01 cs.NE cs.AI cs.LG q-bio.NC

Evolutionary Algorithm for Reservoir Learning and Yielding

用于储层学习和生成的进化算法

Julien Testu, Pierrick Legrand, Xavier Hinaut

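AI summary: Proposes EARLY, an evolutionary algorithm that evolves the topology and hyperparameters of multi-reservoir echo state networks, outperforms random search on temporal learning tasks, and finds that task difficulty shapes network structure.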

Journal ref: GECCO '26 - The Genetic and Evolutionary Computation Conference, Jul 2026, San José, Costa Rica

AI中文摘要

储层计算是一种递归神经网络，因其将动态处理与训练好的读出层分离而成为时序学习的有前途方法。然而，经典的回声状态网络（ESN）通常需要针对任务调整其架构和超参数才能获得良好性能。本文介绍了EARLY（用于储层学习和生成的进化算法），这是一个旨在进化多储层ESN的拓扑和超参数的框架。受大脑模块化组织的启发，EARLY将架构编码为基于图的基因组，并应用交叉、变异和选择来发现有效的配置。我们的目标是创建通用架构和任务诱导泛化。该方法在CogScale数据集的时序学习任务上进行了评估。结果表明，进化出的架构在多个任务上优于通过随机搜索获得的架构，并根据任务难度表现出结构差异：简单任务产生轻量级架构，而复杂任务倾向于更丰富的模块化组织。这些发现表明，进化搜索有助于为更广泛的时序问题识别可复用的储层结构。进一步在跨情境学习数据集上评估进化出的架构，以评估其适应新环境的能力。

英文摘要

Reservoir computing, a type of recurrent neural network, is a promising approach for temporal learning as it separates dynamic processing from the trained readout layer. However, classical Echo State Networks (ESNs) often require task-specific tuning of their architecture and hyperparameters to achieve good performance. This paper introduces EARLY (Evolutionary Algorithm for Reservoir Learning and Yielding), a framework designed to evolve both the topology and hyperparameters of multi-reservoir ESNs. Inspired by the modular organisation of the brain, EARLY encodes architectures as graph-based genomes and applies crossover, mutation, and selection to discover effective configurations. Our goal is to create both generic architectures and tasks inducing generalization. The method is evaluated on temporal learning tasks from the CogScale dataset. Results show that evolved architectures outperform those obtained with random search on several tasks and exhibit structural differences depending on task difficulty: simpler tasks yield lightweight architectures, while more complex tasks favour richer modular organisations. These findings suggest that evolutionary search can help identify reusable reservoir structures for a broader range of temporal problems. The evolved architectures are further evaluated on a cross-situational learning dataset to assess their ability to adapt to new environments.

2605.30368 2026-06-01 cs.NE cs.AI cs.RO q-bio.NC

Reinterpreting Safety Thresholds as Neuron Spiking Thresholds

将安全阈值重新解释为神经元放电阈值

Enrico Del Re, Mohamed Sabry, Cristina Olaverri-Monreal

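AI summary: Proposes reinterpreting the fixed thresholds of surrogate safety measures (SSMs) as the spiking thresholds of leaky integrate-and-fire (LIF) neurons, and builds a spiking neural network (SNN) that learns human braking onsets, bringing objective SSMs together with subjective safety perception.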

Comments: 6 pages

AI中文摘要

替代安全措施（SSM）在自动驾驶领域的交通风险评估中被广泛使用。然而，大多数基于SSM的评估采用固定阈值，无法捕捉人类对持续临界状态的响应或对短暂高风险峰值的反应。本文提出了一种受生物学启发的SSM阈值重新解释，将其建模为泄漏积分点火（LIF）神经元的放电阈值，并将多个SSM输入组合成脉冲神经网络（SNN）。该SNN经过训练，使其发放的脉冲与人类刹车起始点对齐。训练数据是在使用3D-CoAutoSim平台（基于CARLA/Unreal和六自由度运动平台）的受控跟车实验中记录的，实验中生成了诱导的关键事件。结果表明，学习到的脉冲活动在定性上与跨场景的刹车行为一致，并捕捉了仅靠阈值交叉无法一致解释的反应。跨参与者的分析进一步表明，学习到的输入阈值保持相对一致，而学习到的衰减因子编码了SSM的不同时间敏感性。本研究的发现表明，脉冲动力学可能作为一种机制，促进客观SSM与主观人类安全感知的融合。

英文摘要

Surrogate Safety Measures (SSMs) are extensively utilised in the evaluation of traffic risk in automated driving contexts. However, the majority of SSM-based evaluations employ fixed thresholds that fail to capture the human response to sustained borderline conditions or the reaction to brief, high-risk peaks. The present work proposes a biologically inspired reinterpretation of SSM thresholds. This is modelled as spiking thresholds of leaky integrate-and-fire (LIF) neurons, with multiple SSM inputs combined into a spiking neural network (SNN). The SNN is trained to emit spikes that are aligned with human braking onsets. The training data was recorded in a controlled car-following experiment using the 3D-CoAutoSim platform with CARLA/Unreal and a 6-DOF motion platform, where induced critical events were generated. The results demonstrate that the learned spiking activity qualitatively aligns with braking behaviour across scenarios and captures reactions that are not consistently explained by threshold crossings alone. Analysis across participants further indicates that learned input thresholds remain relatively consistent, while learned decay factors encode different temporal sensitivities for the SSMs. The findings of this study indicate that spiking dynamics may serve as a mechanism to facilitate the convergence of objective SSMs with subjective human safety perception.
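
The difference between a fixed threshold and a leaky integrate-and-fire threshold can be sketched with a single discrete-time neuron. This is a minimal illustration, not the authors' network: the update rule, the reset to zero, and the idea of feeding a normalized risk value per time step are assumptions, and `lif_spike_times` is a hypothetical name.

```python
def lif_spike_times(inputs, threshold=1.0, decay=0.9):
    """Time steps at which a leaky integrate-and-fire neuron spikes.

    inputs: per-step risk values derived from an SSM (assumed normalized so
    that larger means more critical).
    decay: fraction of the membrane potential kept from one step to the next.
    """
    v = 0.0
    spikes = []
    for t, x in enumerate(inputs):
        v = decay * v + x      # leak, then integrate the new input
        if v >= threshold:     # fire and reset
            spikes.append(t)
            v = 0.0
    return spikes


sustained = lif_spike_times([0.4] * 6)         # borderline input, never above 1.0
brief_peak = lif_spike_times([0.0, 1.2, 0.0])  # single high-risk sample
mild = lif_spike_times([0.05] * 200)           # low risk, potential settles at 0.5
```

A fixed threshold of 1.0 applied sample by sample would flag only the brief peak. The leaky neuron also responds to the sustained borderline input after a few steps while ignoring the mild one, which is the behaviour the abstract says fixed thresholds fail to capture; the decay factor controls how long past risk is remembered.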

2605.26183 2026-06-01 q-bio.QM cs.LG

What Molecular Structure Cannot Tell Us: A Taxonomy of Explainability Gaps in GNN-Based Drug Toxicity Prediction

分子结构无法告诉我们的事：基于GNN的药物毒性预测中可解释性差距的分类

Juergen Dietrich

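AI summary: Introduces an operational taxonomy that systematically analyzes the explainability gaps caused by structural information limits in graph neural network drug toxicity prediction and, using aspirin as an example, quantifies that molecular structure explains only about 45% of adverse effects.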

Comments: 13 pages

AI中文摘要

并非所有临床相关的不良反应都能从分子图中结构推断出来——无论模型质量或架构复杂性如何。本研究引入了一个操作分类法，用于描述独立于所用学习算法的结构信息限制，这些限制阻碍了基于结构的毒性预测。图神经网络（GNN）已成为分子毒性预测的自然方法，直接作用于原子连接性，避免了固定长度指纹固有的信息损失。然而，药物已知药理学特征中实际可从分子结构推断的比例仍未被系统探索。以乙酰水杨酸（ASA，阿司匹林）——药理学中表征最全面的药物之一——作为模型化合物进行系统性案例研究。在Tox21基准上训练消息传递神经网络（MPNN），并应用GNNExplainer表征原子级归因。结果表明，分子结构解释了约45%（5/11）的已知ASA不良反应。引入了一个四类差距分类法（GAP-1至GAP-4），区分了原则上不可编码的效应、由非随机缺失（MNAR）机制引起的数据差距、检测面板不匹配和表示误差。通过系统的ChEMBL查询（42个已记录检测，0个可检索生物活性条目）经验量化了MNAR差距。注意力池化实验将表示误差定位到MPNN消息传递层而非聚合步骤。该差距分类法对药物安全信号检测和监管框架（包括良好药物警戒实践（GVP）指南和新方法论（NAMs））具有直接影响。在伴随的DDI消融研究中确认了所识别的结构限制。

英文摘要

Not all clinically relevant adverse effects are structurally inferable from molecular graphs, regardless of model quality or architectural complexity. This study introduces an operational taxonomy of the structural information limits that prevent structure-based toxicity prediction, independent of the learning algorithm employed. Graph Neural Networks (GNNs) have emerged as a natural approach for molecular toxicity prediction, operating directly on atomic connectivity without the information loss inherent to fixed-length fingerprints. However, the fraction of a drug's known pharmacological profile that is actually inferable from molecular structure remains systematically underexplored. A systematic case study uses acetylsalicylic acid (ASA, aspirin), one of the most comprehensively characterized drugs in pharmacology, as the model compound. A Message Passing Neural Network (MPNN) is trained on the Tox21 benchmark, and GNNExplainer is applied to characterize atom-level attribution. Results indicate that molecular structure explains approximately 45% (5/11) of known ASA adverse effects. A four-category Gap Taxonomy (GAP-1 through GAP-4) is introduced, distinguishing between effects that are non-encodable in principle, data gaps arising from Missing Not At Random (MNAR) mechanisms, assay panel mismatches, and representation errors. The MNAR gap is empirically quantified via a systematic ChEMBL query (42 documented assays, 0 retrievable bioactivity entries). An attention pooling experiment localizes the representation error to the MPNN message passing layers rather than the aggregation step. The Gap Taxonomy has direct implications for drug safety signal detection and regulatory frameworks including Good Pharmacovigilance Practice (GVP) guidelines and New Approach Methodologies (NAMs). The identified structural limits are confirmed in a companion DDI ablation study.

2602.20176 2026-06-01 q-bio.BM cs.LG

Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design

通过轴向向量实现异手性蛋白质-肽相互作用设计的跨手性泛化

Ziyi Yang, Zitong Tian, Yinjun Jia, Tianyi Zhang, Jiqing Zheng, Hao Wang, Yubu Su, Juncai He, Lei Liu, Yanyan Lan

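AI summary: Proposes injecting axial features into E(3)-equivariant (polar) vector features and, combined with a latent diffusion model, achieves cross-chirality generalization from homo-chiral training data to hetero-chiral design tasks, giving the first wet-lab-validated generative AI for de novo design of D-peptide binders.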

Comments: v3: Revised acknowledgements only. The paper has been accepted to ICML 2026

AI中文摘要

靶向L-蛋白的D-肽结合物具有广阔的治疗潜力。尽管基于机器学习的靶标条件肽设计取得了快速进展，但生成D-肽结合物仍基本未被探索。在这项工作中，我们表明通过向$E(3)$等变（极）向量特征注入轴向特征，可以实现从同手性（L–L）训练数据到异手性（D–L）设计任务的跨手性泛化。通过在潜在扩散模型中实现该方法，我们实现了D-肽结合物设计，不仅在 in silico 基准测试中优于现有工具，而且在湿实验验证中显示出有效性。据我们所知，我们的方法代表了首个经过湿实验验证的用于 de novo 设计D-肽结合物的生成式AI，为处理蛋白质设计中的手性提供了新视角。代码可在https://github.com/YZY010418/PepMirror获取。

英文摘要

D-peptide binders targeting L-proteins have promising therapeutic potential. Despite rapid advances in machine learning-based target-conditioned peptide design, generating D-peptide binders remains largely unexplored. In this work, we show that by injecting axial features into $E(3)$-equivariant (polar) vector features, it is feasible to achieve cross-chirality generalization from homo-chiral (L–L) training data to hetero-chiral (D–L) design tasks. By implementing this method within a latent diffusion model, we achieved D-peptide binder design that not only outperforms existing tools in in silico benchmarks, but also demonstrates efficacy in wet-lab validation. To our knowledge, our approach represents the first wet-lab validated generative AI for the de novo design of D-peptide binders, offering new perspectives on handling chirality in protein design. Code is available at https://github.com/YZY010418/PepMirror
