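arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.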

2605.21454 2026-05-21 cs.CV q-bio.QM q-bio.TO

ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction

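Amaya Gallagher-Syed, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Gregory Slabaugh

AI summary: This paper proposes ProtoPathway, a framework that unifies whole slide imaging and transcriptomics through encoders that produce biologically grounded representations, with the aim of improving the biological interpretability and computational efficiency of cancer survival prediction.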

Comments: Currently under peer review

Abstract

We introduce ProtoPathway, an interpretable-by-design multimodal framework for cancer survival prediction that unifies whole slide imaging and transcriptomics through encoders producing biologically grounded representations on both sides of the fusion. On the histopathology side, $K$ learnable morphological prototypes, trained end-to-end with the survival objective, serve as the slide representation itself: patches flow into prototype tokens via soft assignment, compressing variable-length patch sets into fixed task-adaptive tokens. On the genomic side, a bipartite graph neural network encodes gene expression within the Reactome pathway hierarchy, producing pathway embeddings that reflect both constituent genes and their broader biological context through bidirectional message passing over a shared gene--pathway graph. Cross-modal attention then operates over a compact prototype $\times$ pathway matrix in which prototypes query pathways, modeling the biological direction in which molecular programs give rise to tissue morphology. Because both axes carry stable task-learned identity, the attention matrix is itself an interpretability output, yielding native inference-time attribution across the full biological hierarchy, from genes through pathways and prototypes to spatial tissue maps. We evaluate on five TCGA cancer cohorts, demonstrating competitive or superior survival prediction with substantially improved biological interpretability and reduced computational cost, with interpretability claims validated through fold-stratified rank-based population-level analysis. Our source code, model weights, and Reactome pathways, together with a unified codebase reimplementing all multimodal survival baselines under identical preprocessing and evaluation, are available at: https://github.com/AmayaGS/ProtoPathway.
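The soft-assignment pooling and the prototype-to-pathway attention described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: dot-product similarity, a softmax over prototypes, and random arrays standing in for learned prototypes and pathway embeddings are all choices made here, not details confirmed by the abstract.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prototype_tokens(patches, prototypes):
    """Pool N patch features (N, d) into K prototype tokens (K, d)."""
    # Each patch spreads one unit of weight over the K prototypes.
    assign = softmax(patches @ prototypes.T, axis=1)        # (N, K)
    # Each token is the assignment-weighted mean of the patches.
    weights = assign / assign.sum(axis=0, keepdims=True)    # columns sum to 1
    return weights.T @ patches                              # (K, d)

def prototype_to_pathway_attention(tokens, pathways):
    """Prototypes are queries, pathways are keys; returns a (K, P) matrix."""
    d = tokens.shape[1]
    return softmax(tokens @ pathways.T / np.sqrt(d), axis=1)

rng = np.random.default_rng(0)
patches = rng.normal(size=(500, 16))    # hypothetical patch features for one slide
prototypes = rng.normal(size=(8, 16))   # K = 8 prototypes (random stand-ins)
pathways = rng.normal(size=(30, 16))    # P = 30 pathway embeddings (random stand-ins)

tokens = prototype_tokens(patches, prototypes)
attn = prototype_to_pathway_attention(tokens, pathways)
print(tokens.shape, attn.shape)
```

Whatever the number of patches, the slide is reduced to K tokens, so the attention matrix always has the same K by P shape and each row can be read as one prototype's distribution over pathways.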

2605.21420 2026-05-21 cs.LG cs.AI q-bio.MN

HiRes: Inspectable Precedent Memory for Reaction Condition Recommendation

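Shreyas Vinaya Sathyanarayana, Raja Sekhar Pappala, Deepak Warrier

AI summary: HiRes combines a graph encoder, transformation-aware cross-attention, multi-stream reaction fusion, and a k-NN retrieval layer to recommend reaction conditions accurately and interpretably. It reaches top-1 accuracies of 0.929, 0.534, and 0.530 on catalyst, solvent, and reagent respectively, tying the best reported baseline on catalyst and outperforming models such as REACON on solvent and reagent.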

Abstract

Reaction condition recommendation sits immediately after retrosynthetic disconnection selection, and in practice, chemists require both accurate predictions and the precedents that justify them. We present HiRes (Hierarchical Reaction Representations), a retrieval-augmented condition recommendation system whose learned reaction space serves as both a classifier feature and an inspectable precedent memory. The model combines a graph encoder, transformation-aware cross-attention, multi-stream reaction fusion, and a k-NN retrieval layer. HiRes achieves state-of-the-art performance among primary-slot USPTO-Condition models, reaching Catalyst, Solvent, and Reagent top-1 accuracies (Acc@1) of 0.929, 0.534, and 0.530 respectively. It ties the best reported baseline on Catalyst while outperforming models such as REACON on Solvent and Reagent. Furthermore, paired bootstrap analysis demonstrates that integrating retrieval with learned condition heads provides statistically significant gains for solvent and reagent selection over purely parametric approaches. Ultimately, HiRes bridges the gap between predictive accuracy and chemical interpretability, offering a single representation that supplies both competitive recommendations and the concrete chemical precedents necessary for practical synthesis planning.
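One way to picture retrieval working alongside a learned condition head is a k-NN vote over stored reaction embeddings that is blended with the head's probabilities. The sketch below assumes cosine similarity, a fixed blending weight `lam`, and toy two-dimensional embeddings; none of these come from the abstract, and the function names are hypothetical.

```python
import numpy as np

def knn_label_distribution(query, memory_vecs, memory_labels, n_classes, k=3):
    """Return a label distribution from the k nearest precedents and their indices."""
    q = query / np.linalg.norm(query)
    m = memory_vecs / np.linalg.norm(memory_vecs, axis=1, keepdims=True)
    sims = m @ q                              # cosine similarity to every precedent
    top = np.argsort(-sims)[:k]               # indices of the k most similar
    dist = np.zeros(n_classes)
    for i in top:
        dist[memory_labels[i]] += 1.0 / k     # each precedent casts an equal vote
    return dist, top

def blend(head_probs, knn_probs, lam=0.5):
    """Interpolate parametric and retrieved distributions (lam is an assumed knob)."""
    return (1.0 - lam) * head_probs + lam * knn_probs

# Toy precedent memory: four reactions, two condition classes.
memory_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
memory_labels = np.array([0, 0, 1, 1])
query = np.array([0.8, 0.2])
head_probs = np.array([0.4, 0.6])             # hypothetical classifier output

knn_probs, precedents = knn_label_distribution(query, memory_vecs, memory_labels, 2)
final = blend(head_probs, knn_probs)
print(precedents, final)
```

The returned `precedents` indices are what make the memory inspectable: the same lookup that shifts the prediction also names the stored reactions that justify it.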

2605.21356 2026-05-21 q-bio.NC

A simple model of co-emergence of grid and place fields

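Zhaoze Wang, Genela Morris, Dori Derdikman, Pratik Chaudhari, Vijay Balasubramanian

AI summary: This study proposes a unified recurrent network model, trained to predict the next sensory observation, in which grid fields and place fields co-emerge without supervision of either type or reliance on pre-existing spatial-cell representations.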

Abstract

Grid cells in the medial entorhinal cortex and place cells in the hippocampus together support spatial navigation. The two regions are reciprocally connected, and there is a chicken-and-egg problem for how both arise and reinforce each other during development. Current computational accounts either derive one type from the other or use network dynamics to model the emergence of one type in isolation. We introduce a unified recurrent network model that instantiates Dale's Law (every neuron is either excitatory or inhibitory), and is trained to predict the next sensory observation from masked previous sensory observations and egocentric motion. To our knowledge, this is the first single-objective model in which grid and place cells co-emerge without supervision of either type, or reliance on pre-existing spatial-cell representations. The two kinds of spatial codes coexist across 1,000 different training configurations, with their balance set by the amount of sensory noise and masking. Without retraining, the network qualitatively reproduces experimentally observed grid fragmentation in hairpin mazes, grid merging after wall removal, lattice alignment across connected rooms, locally ordered 3D fields observed in freely flying bats, as well as the developmental order in which place cells precede grid cells. We interpret these results in terms of two complementary encoding pressures within a single sensory-prediction objective: (1) correcting errors or reconstructing missing components of sensory observations, and (2) prediction of the next sensory state during navigation. Our results suggest a circuit-level account of the co-emergence of grid and place cells, and experimentally testable predictions for the two kinds of spatial codes.
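Dale's Law, as stated in the abstract, constrains every neuron to be either excitatory or inhibitory. A common way to impose this on a recurrent weight matrix is to take the magnitude of an unconstrained matrix and give each presynaptic neuron a fixed sign. The sketch below assumes that parameterisation, an 80/20 excitatory/inhibitory split, and a ReLU update; the abstract does not say which choices the authors made.

```python
import numpy as np

def dale_weights(raw, is_excitatory):
    """Map unconstrained weights to a matrix obeying Dale's Law.

    raw[i, j] is the weight from presynaptic neuron j to postsynaptic neuron i.
    """
    signs = np.where(is_excitatory, 1.0, -1.0)   # one sign per presynaptic neuron
    return np.abs(raw) * signs[np.newaxis, :]    # column j inherits neuron j's sign

rng = np.random.default_rng(0)
n = 10
is_excitatory = np.arange(n) < 8                 # assumed 80/20 E/I split
W = dale_weights(rng.normal(size=(n, n)), is_excitatory)

def step(h, x):
    """One recurrent update with non-negative firing rates."""
    return np.maximum(0.0, W @ h + x)

h = step(np.zeros(n), rng.normal(size=n))
print(W[:, :8].min() >= 0, W[:, 8:].max() <= 0)
```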

2605.21324 2026-05-21 q-bio.NC cs.LG

Stimulus symmetries can confound representational similarity analyses

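Farhad Pashakhanloo, Jacob A. Zavatone-Veth

AI summary: This study examines how symmetries in network inputs affect analyses based on representational similarity matrices (RSMs). It shows that functionally equivalent configurations can yield different RSMs, and that stochastic gradient descent or energetic regularization can generate sparse, drifting codes that in turn produce drifting RSMs.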

Comments: 40 pages

Abstract

What can representational similarity matrices (RSMs) tell us about a neural code? As the popularity of these summary statistics grows, so too does the need for a more complete characterization of their properties. Here, we show that symmetries in network inputs can confound RSM-based analyses. Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs. These different RSMs reflect qualitatively different representational geometries. We show that stochastic gradient descent or energetic regularization can generate sparse, drifting codes, leading in turn to drifting RSMs. Moreover, we demonstrate that these phenomena are present in networks trained to encode image data, where the symmetry is latent. Our results illustrate the challenges inherent in comparing nonlinear neural codes, when functionally-equivalent representations are not related by a simple rotation.
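The abstract's closing remark, that trouble starts when equivalent representations are not related by a simple rotation, can be checked numerically in a simplified setting. The sketch below takes the RSM to be the Gram matrix of the responses (an assumption; other similarity measures are in use) and compares a rotated code with an invertibly rescaled one. The rescaling is a deliberately simple stand-in, not the symmetry-driven mechanism studied in the paper.

```python
import numpy as np

def rsm(X):
    """Gram-matrix RSM: entry (i, j) is the inner product of responses i and j."""
    return X @ X.T

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                    # 6 stimuli, 4 units

Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # random orthogonal matrix (rotation/reflection)
A = np.diag([3.0, 1.0, 1.0, 0.2])              # invertible but not orthogonal

# A rotated code has exactly the same RSM.
same = np.allclose(rsm(X), rsm(X @ Q))
# A rescaled code is linearly equivalent (A is invertible) yet has a different RSM.
different = not np.allclose(rsm(X), rsm(X @ A))
print(same, different)
```

Any linear readout of `X` can be recovered from `X @ A` by composing with the inverse of `A`, so the two codes carry the same linearly decodable information while their RSMs disagree.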

2602.04916 2026-05-21 q-bio.QM cs.CL

AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design

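Ling Luo, Wenbin Jiang, Hongyuan Chang, Xinkang Wang, Xushi Zhang, Yueting Xiong, Mengsha Tong, Rongshan Yu

AI summary: This paper presents AFD-Instruction, an instruction dataset with functional annotations that improves LLM performance on antibody understanding and design, providing a new foundation for antibody modeling and therapeutic discovery.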

Abstract

Large language models (LLMs) have significantly advanced protein representation learning. However, their capacity to interpret and design antibodies through natural language remains limited. To address this challenge, we present AFD-Instruction, the first large-scale instruction dataset with functional annotations tailored to antibodies. This dataset encompasses two key components: antibody understanding, which infers functional attributes directly from sequences, and antibody design, which enables de novo sequence generation under functional constraints. These components provide explicit sequence-function alignment and support antibody design guided by natural language instructions. Extensive instruction-tuning experiments on general-purpose LLMs demonstrate that AFD-Instruction consistently improves performance across diverse antibody-related tasks. By linking antibody sequences with textual descriptions of function, AFD-Instruction establishes a new foundation for advancing antibody modeling and accelerating therapeutic discovery.

2512.10983 2026-05-21 q-bio.TO q-bio.CB q-bio.NC

Compartmental-reaction diffusion framework for microscale dynamics of extracellular serotonin in brain tissue

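Merlin Pelz, Skirmantas Janusonis, Gregory Handy

AI summary: This paper proposes a mathematical framework for the microscale dynamics of extracellular serotonin in brain tissue. Starting from a two-dimensional compartmental-reaction diffusion system, it uses strong localized perturbation theory to derive nonlinear integro-ODEs, analyzes steady-state and dynamic behavior, and shows how serotonin release, varicosity geometry, and uptake kinetics shape extracellular serotonin.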

Abstract

Serotonin (5-hydroxytryptamine) is a major neurotransmitter whose release from densely distributed serotonergic varicosities shapes plasticity and network integration throughout the brain, yet its extracellular dynamics remain poorly understood due to the sub-micrometer and millisecond scales involved. We develop a mathematical framework that captures the coupled reaction-diffusion processes governing serotonin signaling in realistic tissue microenvironments. Formulating a two-dimensional compartmental-reaction diffusion system, we use strong localized perturbation theory to derive an asymptotically equivalent set of nonlinear integro-ODEs that preserve diffusive coupling while enabling efficient computation. We analyze period-averaged steady states, establish bounds using Jensen's inequality, obtain closed-form spike maxima and minima, and implement a fast marching-scheme solver based on sum-of-exponentials kernels. These mathematical results provide quantitative insight into how firing frequency, varicosity geometry, and uptake kinetics shape extracellular serotonin. The model reveals that varicosities form diffusively coupled microdomains capable of generating spatial "serotonin reservoirs," clarifies aspects of local versus volume transmission, and yields predictions relevant to interpreting high-resolution serotonin imaging and the actions of selective serotonin-reuptake inhibitors.

2605.21129 2026-05-21 physics.soc-ph econ.GN nlin.AO q-bio.PE q-fin.EC

How hate spreads online and why it returns: Re-entrant phases driven by collective behavior

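Chen Xu, Pak Ming Hui, Chenkai Xia, Neil F. Johnson

AI summary: This paper presents a two-species coalescence-fragmentation model with Susceptible-Infected-Recovered dynamics to analyze how hate content spreads online. It finds that system-wide spreading is governed by re-entrant threshold phases, which provides a theoretical basis for preventing such spreading.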

DOI: 10.1103/pghw-mmzz
Journal ref: Phys Rev E May 20 2026
Comments: earlier draft of published paper

Abstract

The 2025 Bondi Beach mass-shooting was perpetrated by individuals inspired by ISIS (Islamic State) propaganda that increasingly featured anti-Semitic hate content following the October 2023 start of the Israel-Palestine war. Similar stories hold for other types of hate attacks, e.g. against Muslims on May 18, 2026. There is an urgent need to get ahead of future threats by understanding how and when a newly created piece of hate content will spread system-wide online. We present a two-species coalescence-fragmentation model with Susceptible-Infected-Recovered dynamics that incorporates the following published empirical features: (1) New pieces of hate content tend to be generated and promoted by a subset of in-built communities on less regulated platforms. (2) These 'hate' communities create links (hyperlinks) with each other and with non-hate communities across all platforms to form dynamically evolving clusters (i.e. coalescence) across which new hate content can then spread. (3) These clusters can get broken up by moderator shutdowns (i.e. fragmentation). We present numerical solutions and derive two levels of approximate mean-field theory: Effective Medium Theory (EMT) and Beyond Effective Medium Theory (BEMT). Both numerical and analytic solutions reveal that system-wide spreading is governed by re-entrant threshold phases: as the fraction of hate communities varies, the system can transition from spreading to no-spreading and back to spreading. The derived analytic formulae give explicit insight into how these phase boundaries might be manipulated to prevent system-wide spreading. More broadly, the re-entrant phase behavior warns that policies which steadily reduce the number of hate communities can initially succeed but then backfire if pushed further, suggesting that blanket requirements for platforms to simply do 'more' are over-simplistic.

2605.20989 2026-05-21 cs.LG q-bio.GN

Modeling Temporal scRNA-seq Data with Latent Gaussian Process and Optimal Transport

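Mehmet Yigit Balik, Harri Lähdesmäki

AI summary: This paper proposes a generative framework that models population trends with a latent heteroscedastic Gaussian process, uses optimal transport to align generated and observed population distributions, and captures biological heterogeneity, achieving state-of-the-art performance on complex interpolation and extrapolation benchmarks.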

Abstract

Single-cell RNA sequencing provides insights into gene expression at single-cell resolution, yet inferring temporal processes from these static snapshot measurements remains a fundamental challenge. Current approaches utilizing neural differential equations and flows are prone to overfitting and lack careful consideration of biological variability. In this work, we propose a generative framework that models population trends using a latent heteroscedastic Gaussian process (GP) approximated by Hilbert space methods. To address the absence of genuine cell trajectories, we leverage an optimal transport (OT) objective that aligns generated and observed population distributions. Our method explicitly captures biological heterogeneity by incorporating cell-specific latent time and cell type conditioning to disentangle temporal asynchrony and trajectories to different cell types. We demonstrate state-of-the-art performance on complex interpolation and extrapolation benchmarks and introduce a novel gradient-based strategy for inferring perturbation trajectories.
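Because individual cells are not tracked over time, an OT objective compares whole populations rather than matched pairs. A minimal sketch of such a loss is entropy-regularised OT solved with Sinkhorn iterations, shown below. The squared Euclidean cost, the regularisation strength `eps`, and the synthetic point clouds are assumptions for illustration; the abstract does not specify which OT formulation the authors use.

```python
import numpy as np

def sinkhorn_cost(X, Y, eps=0.5, n_iter=200):
    """Entropy-regularised OT between two uniformly weighted point clouds."""
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)   # squared distances
    K = np.exp(-C / eps)                                      # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))                         # source weights
    b = np.full(len(Y), 1.0 / len(Y))                         # target weights
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)      # rescale to match the target marginal
        u = a / (K @ v)        # rescale to match the source marginal
    P = u[:, None] * K * v[None, :]                           # transport plan
    return float((P * C).sum()), P

rng = np.random.default_rng(0)
observed = rng.normal(size=(40, 2))                    # stand-in for observed cells
close = observed + 0.1 * rng.normal(size=(40, 2))      # generated cells, good fit
far = observed + 3.0                                   # generated cells, poor fit

cost_close, plan = sinkhorn_cost(close, observed)
cost_far, _ = sinkhorn_cost(far, observed)
print(cost_close < cost_far)
```

The cost depends only on the two distributions, so it can serve as a training signal even when no generated cell has a known observed counterpart.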

2605.20885 2026-05-21 cs.LG q-bio.QM

Training distribution determines the ceiling of drug-blind cancer sensitivity prediction

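Taekyung Heo

AI summary: This paper studies how the training distribution affects drug-blind cancer sensitivity prediction. It finds that the conventional metric is biased and that mechanism-stratified training and response matching recover predictive gains.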

Abstract

Precision oncology requires predicting which drugs will suppress a specific tumor from its molecular profile, but drug-blind sensitivity prediction has plateaued despite increasingly complex drug representations. Here we show that this stagnation reflects a metric artifact rather than a representational bottleneck. The standard benchmark, global Pearson r, is dominated by between-drug potency differences that a trivial drug-mean predictor captures without any cell-specific learning. Per-drug Pearson r, which isolates within-drug cell ranking, reveals that no drug encoding improves over cell-only features across four independent datasets. A controlled experiment channeling mechanism-of-action identity as either a drug feature or a training-distribution constraint identifies the cause. Supplying MoA as a feature yields negligible benefit, whereas using it to stratify training raises per-drug r substantially for targeted kinase inhibitors, because pan-cancer co-training suppresses pathway-specific sensitivity signals. Mechanism-stratified training and response matching from pilot observations provide two deployable strategies that together recover the principal sources of predictive gain in drug-blind sensitivity prediction.
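The metric artifact is easy to reproduce on synthetic data. In the sketch below, a predictor that knows only each drug's mean response scores a high global Pearson r while its per-drug r is near zero. The five potency levels, the noise scales, and the small jitter added so that within-drug correlation is defined are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
drug_means = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])   # between-drug potency differences
n_cells = 200

# True response = drug potency + cell-specific sensitivity.
truth = drug_means[:, None] + rng.normal(size=(5, n_cells))

# "Drug-mean" predictor: knows each drug's average response and nothing about cells.
# The small jitter only keeps the within-drug correlation well defined.
pred = truth.mean(axis=1, keepdims=True) + 0.1 * rng.normal(size=(5, n_cells))

def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])

# Pooling all drug-cell pairs rewards knowing which drug is potent.
global_r = pearson(truth.ravel(), pred.ravel())
# Averaging within-drug correlations rewards ranking cells correctly.
per_drug_r = float(np.mean([pearson(truth[d], pred[d]) for d in range(5)]))
print(round(global_r, 2), round(per_drug_r, 2))
```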

2605.20692 2026-05-21 stat.ME q-bio.PE q-bio.QM stat.AP

Inferring infectiousness: a joint model of the within-host viral kinetics of SARS-CoV-2

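Christopher B. Boyer, Stephen M. Kissler, Seran Hakki, Jakob Jonnerby, Ajit Lalvani, Marc Lipsitch

AI summary: This paper proposes a joint model that analyzes data on multiple indirect proxies of viral shedding to infer infectiousness trajectories from the within-host viral kinetics of SARS-CoV-2, supporting more accurate assessments of infectiousness for policy-making.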

Abstract

During an infectious disease outbreak, providing accurate answers to policy questions about transmission requires a detailed model of the natural history of infectiousness. Unfortunately, direct measures of infectiousness are generally unavailable. Instead, we often rely on indirect proxies, such as viral load measured by PCR or antigen tests, viral culture to detect replication-competent virus, or symptom onset, each of which reflects different aspects of viral dynamics or host response. However, these proxies vary in terms of the ease of collection, scalability, and their relationship to viral shedding and therefore underlying infectiousness. Here, we use data from five prospective, densely sampled cohorts with longitudinal data on multiple proxies of viral shedding for approximately 2,000 infections to develop a Bayesian joint model for the within-host viral kinetics of SARS-CoV-2 infection. Modeling the joint distribution allows us to infer the trajectory of infectious virus shedding -- the most direct correlate of infectiousness -- for individuals who contribute only PCR data, and to compute derived quantities that are inaccessible from any single proxy alone. These include the population-level probability and expected duration of ongoing infectiousness as a function of time since diagnosis, stratified by variant, vaccination status, and infection history; the residual risk of releasing an individual from isolation; and personalized, real-time estimates of infectiousness that are sequentially updated as new test results become available.

2605.19171 2026-05-21 q-bio.TO

A putative model of the gut-muscle axis in aged livestock

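Karin Suzuki, Aoi Fukushima, Yu Adachi, Tsubasa Irie, Arisa Sano, Daisuke Yamamoto, Hirokuni Miyamoto, Shigeharu Moriya, Makiko Matsuura, Naoko Tsuji, Takashi Satoh, Tamotsu Kato, Takumi Nishiuchi, Hiroshi Ohno, Hiroaki Kodama, Naruki Sato

AI summary: This study uses multi-omics analysis to explore a putative model of the gut-muscle axis in aged livestock. It finds that fermented feed markedly alters the gut microbiota and muscle metabolites, and it points to faecal metabolites as a layer linking the microbiota to muscle physiology.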

Comments: 17 pages, 4 figures (5 supplementary figures)

Abstract

The gut-muscle axis has been proposed to link gut microbiota with skeletal muscle physiology, yet its universality across livestock species remains unclear. Using aged laying hens, a livestock model with a relatively short digestive tract, we examined the gut microbiota, faecal metabolome, and breast-muscle metabolome by integrative multi-omics analyses in hens fed a Caldifermentibacillus hisashii-containing fermented feed or a control diet. Non-metric multidimensional scaling revealed clear separation of the microbial community between groups (stress = 0.0097), characterised by a marked expansion of Lactobacillus with the administration of the fermented feed. Variance partitioning showed that the 16S microbiota shared substantial variance with both the faecal (shared adjusted $R^2$ = 0.54) and muscle (shared adjusted $R^2$ = 0.48) metabolomes, and partial dbRDA demonstrated that the faecal-to-muscle metabolite association was largely retained after controlling for 16S (direct $R^2$ = 0.538, partial $R^2$ = 0.485), consistent with faecal metabolites acting as an integral layer linking microbiota to muscle. Cliff's delta-based selection showed depletion of proteolytic taxa and faecal amino acids, and reduced muscle ornithine and uric acid alongside elevated hypoxanthine. Because both groups were processed identically post-slaughter, these differences reflect in vivo states: amino acid depletion despite reduced bacterial proteolytic capacity points to enhanced host utilisation, and reduced uric acid, a post-mortem-stable purine end-product in uricotelic chickens, indicates efficient nitrogen turnover rather than accumulation. Collectively, these findings support a putative tripartite model of the gut-muscle axis in aged laying hens, providing a statistically grounded framework for understanding microbial contributions to muscle physiology in aged livestock.
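Cliff's delta, used above for feature selection, is a rank-based effect size: the probability that a value from one group exceeds a value from the other, minus the reverse probability. A minimal sketch follows; the measurements are invented for illustration and are not data from the study.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta of xs relative to ys, in the range [-1, 1]."""
    greater = sum(1 for x in xs for y in ys if x > y)   # pairs where xs wins
    less = sum(1 for x in xs for y in ys if x < y)      # pairs where ys wins
    return (greater - less) / (len(xs) * len(ys))

# Hypothetical metabolite levels (arbitrary units) in two diet groups.
control = [5.1, 4.8, 5.6, 5.0]
fermented = [3.9, 4.2, 4.9, 3.7]

delta = cliffs_delta(fermented, control)
print(delta)  # -0.875: fermented-feed values are lower in 15 of 16 pairs
```

A value near -1 or +1 means the groups barely overlap, and because only pairwise orderings enter the formula, the statistic is unaffected by outlier magnitudes or monotone rescaling.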

2605.19070 2026-05-21 q-bio.NC

Computational Auditory Periphery Models: the Return of the Rodent

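Morgan Thienpont, F. Deloche, S. Keshishzadeh, D. Kiselev, J. Bourien, J. -L. Puel, B. N. Buran, N. Bramhall, S. Verhulst

AI summary: This paper presents a cross-species computational model of the auditory periphery for research on sensorineural hearing loss (SNHL). Species-specific anatomical and physiological parameters are adjusted, and the adapted mouse and gerbil models are validated against experimental data.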

Abstract

Animal experiments have provided many insights on auditory function, notably in cases of sensorineural hearing loss (SNHL). However, it is not always clear how these findings translate to the human auditory system in clinically relevant contexts. Cross-species computational models of the auditory periphery can help bridge the gap between non-invasive human diagnostics and experimental evidence from animal studies. In this work we adapted a 1-D nonlinear cochlear transmission-line model designed for the human auditory periphery to mouse and gerbil, enabling a single computational framework for cross-species research on SNHL. Species-specific anatomical and physiological parameters - including basilar membrane (BM) length and width, stapes area, middle-ear transfer functions, and frequency range - were adjusted to match each species' auditory periphery and hearing range. Other cochlear parameters were calibrated to reproduce realistic cochlear tuning and compression. The adapted mouse and gerbil models were validated against experimental BM velocity level-growth characteristics, auditory-nerve (AN) tuning curves, and DPOAEs. Simulated AN outputs reasonably matched empirical measurements, including realistic AN thresholds and frequency selectivity. However, the discrepancy between simulations and measurements became larger for cochlear sections closer to the base or apex. Simulations of cochlear synaptopathy reproduced observed differences in recorded auditory brainstem and envelope following responses from mice and gerbils with SNHL. OHC individualization of the mouse model based on DPOAEs failed to faithfully reproduce individual measurements, although intergroup differences in OHC damage were captured. Our findings demonstrate that biophysically grounded auditory models can be translated across species while preserving realistic sound-coding properties and pathophysiological alterations.

2605.03690 2026-05-21 cs.LG cs.AI q-bio.QM

Graph Neural Network based Hierarchy-Aware Embeddings of Knowledge Graphs: Applications to Yeast Phenotype Prediction

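Filip Kronström, Alexander H. Gower, Daniel Brunnsåker, Ievgeniia A. Tiukova, Ross D. King

AI summary: This paper proposes a method that uses graph neural networks and a semantic loss derived from the underlying ontologies to produce hierarchy-aware knowledge graph embeddings. It applies the method to yeast phenotype prediction, including predicting gene knockout effects and evaluating knowledge graph revisions.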

Abstract

We present a method for finding hierarchy-aware embeddings of knowledge graphs (KGs) using graph neural networks (GNNs) enriched with a semantic loss derived from underlying ontologies. This method yields embeddings that better reflect domain knowledge. To demonstrate their utility, we predict and interpret the effects of gene deletions in the yeast Saccharomyces cerevisiae and learn box embeddings for KGs in the absence of a prediction task. We further show how box embeddings can serve as the basis for evaluating KG revisions. Our yeast KG is constructed from community databases and ontology terms. Low-dimensional box embeddings combined with GNNs are used to predict cell growth for double gene knockouts. Over 10-fold cross validation, these predictions have a mean $R^2$ score of 0.360, significantly higher than baseline comparisons, demonstrating that high-level qualitative knowledge is informative about experimental outcomes. Incorporating semantic loss terms in the training of the models improves their predictive performance ($R^2$ = 0.377) by aligning embeddings with ontology structure. This shows that class hierarchies from ontologies can be exploited for quantitative prediction. We also test the trained models on triple gene knockouts, showing they generalise to data beyond those seen in training. Additionally, by identifying co-occurring relations in the yeast KG important for the cell-growth predictions, we construct hypotheses about interacting traits in yeast. A biological experiment validates one such finding, revealing an association between inositol utilisation and osmotic stress resistance, highlighting the model's potential to guide biological discovery.
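Box embeddings represent each class as an axis-aligned box, so a subclass relation can be read geometrically as one box lying inside another. The sketch below scores that containment and turns it into a penalty that is zero when the child sits entirely inside the parent. The specific penalty is an assumed stand-in; the abstract does not give the form of the paper's semantic loss.

```python
def volume(lo, hi):
    """Volume of an axis-aligned box; zero if any side is empty."""
    v = 1.0
    for a, b in zip(lo, hi):
        v *= max(0.0, b - a)
    return v

def intersection(box_a, box_b):
    """Corners of the overlap of two boxes given as (lower, upper) corner lists."""
    lo = [max(a, b) for a, b in zip(box_a[0], box_b[0])]
    hi = [min(a, b) for a, b in zip(box_a[1], box_b[1])]
    return lo, hi

def containment(child, parent):
    """Fraction of the child box that lies inside the parent box (0 to 1)."""
    return volume(*intersection(child, parent)) / volume(*child)

def hierarchy_penalty(child, parent):
    """Assumed loss term: zero when the child is fully inside the parent."""
    return 1.0 - containment(child, parent)

parent = ([0.0, 0.0], [4.0, 4.0])        # hypothetical ontology class
inside = ([1.0, 1.0], [2.0, 3.0])        # subclass embedded consistently
straddling = ([3.0, 1.0], [5.0, 3.0])    # subclass that leaks outside

print(hierarchy_penalty(inside, parent), hierarchy_penalty(straddling, parent))
```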

2604.01341 2026-05-21 cs.CV q-bio.NC

Perceptual misalignment of texture representations in convolutional neural networks

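Ludovica de Paolis, Fabio Anselmi, Alessio Ansuini, Eugenio Piasini

AI summary: This paper studies how well texture representations in convolutional neural networks align with human perception. It finds no direct relationship between conventional measures of a CNN's quality as a vision model and its alignment with human texture perception, suggesting that texture perception may involve mechanisms distinct from those captured by conventional CNN object-recognition models.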

2604.01295 2026-05-21 q-bio.NC

Parallelized Hierarchical Connectome: A Spatiotemporal Recurrent Framework for Spiking State-Space Models

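Po-Han Chiang

AI summary: This paper proposes the Parallelized Hierarchical Connectome (PHC), a framework that upgrades temporal-only State-Space Models (SSMs) into spatiotemporal recurrent networks and integrates neuro-physical priors, enabling more efficient model training and deployment.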

Comments: 38 pages, 3 figures, 9 tables. Submitted to Neural Networks

Abstract

This work presents the Parallelized Hierarchical Connectome (PHC), a general architectural framework that upgrades temporal-only State-Space Models (SSMs) into spatiotemporal recurrent networks. Conventional SSMs achieve parallel-scan training but are limited to temporal recurrence, lacking lateral or feedback interactions within a single timestep. PHC maps the diagonal SSM core to a shared Neuron Layer and inter-neuronal communication to a shared Synapse Layer of hierarchical regions, reconnected by a Multi-Transmission Loop iterating spatial recurrence within each temporal window, at parameter complexity $\Theta(D^2)$ versus $\Theta(D^2 L)$ for stacked SSMs. This spatiotemporal framework enables the seamless integration of neuro-physical priors typically intractable for standard SSMs, including adaptive LIF, synaptic delay, STP, Dale's Law with E/I-asymmetric topology, and STDP. The framework is instantiated as PHCSSM, the first spiking SSM that integrates all five biological priors and is evaluated on long-sequence data, achieving test accuracy competitive with state-of-the-art SSM baselines at 1,312 to 4,891 trainable parameters (1 to 4 orders of magnitude smaller than every baseline). PHCSSM further admits a sequential recurrent spiking neural network (RSNN) deployment mode that converges asymptotically to the parallel-scan training mode without artificial-neural-network-to-spiking-neural-network (ANN-to-SNN) conversion, with cross-backend reproducibility verified across four hardware backends (x86 CPU, H100 GPU, Cortex-A76, Cortex-M4F) including end-to-end deployment on the Cortex-M4F microcontroller (40 KB SRAM, 128 KB Flash). PHCSSM thereby unifies parallel-scan SSMs and biologically grounded RSNNs, two paradigms with previously incompatible training regimes, in a single architecture with a single set of trained weights.
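The parallel-scan training that diagonal SSMs rely on rests on one fact: the recurrence h_t = a_t * h_{t-1} + u_t is a chain of affine maps, and composing affine maps is associative. The sketch below shows a sequential loop and a doubling scan that produce the same states for a scalar channel. It illustrates the general technique only, with made-up inputs, and is not PHC's implementation.

```python
def sequential(a, u):
    """Step-by-step recurrence h_t = a_t * h_{t-1} + u_t with h_{-1} = 0."""
    h, out = 0.0, []
    for a_t, u_t in zip(a, u):
        h = a_t * h + u_t
        out.append(h)
    return out

def combine(left, right):
    """Compose two affine maps h -> a*h + u, applying `left` first."""
    return (left[0] * right[0], right[0] * left[1] + right[1])

def parallel_scan(a, u):
    """Inclusive scan by doubling: about log2(T) passes over the sequence."""
    elems = list(zip(a, u))
    step = 1
    while step < len(elems):
        # Every update in a pass reads only the previous pass's values,
        # so on parallel hardware the whole pass could run at once.
        elems = [combine(elems[i - step], elems[i]) if i >= step else elems[i]
                 for i in range(len(elems))]
        step *= 2
    return [offset for _, offset in elems]   # offset equals h_t because h_{-1} = 0

a = [0.5, 0.5, 0.5, 0.5, 0.5]     # hypothetical decay of one diagonal channel
u = [1.0, 2.0, 3.0, 4.0, 5.0]     # hypothetical input drive b * x_t

print(sequential(a, u))
print(parallel_scan(a, u))
```

The equivalence of the two functions is the same property that lets a model be trained with the scan and then run step by step at deployment.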

2603.20420 2026-05-21 q-bio.GN cs.LG q-bio.QM

CRANE: Correcting Errors in Raw Nanopore Signals Using Hidden Markov Models

CRANE：利用隐马尔可夫模型纠正原始纳米孔信号中的错误

Simon Ambrozak, Ulysse McConnell, Bhargav Srinivasan, Burak Ozkan, Ernest Zhang, Can Firtina

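AI summary: This paper proposes CRANE, a method that trains and uses a Hidden Markov Model (HMM) to correct errors in raw nanopore signals, improving the accuracy of raw signal analysis and reducing the burden of optimizing analysis pipelines without introducing substantial computational overhead.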

AI中文摘要

纳米孔测序可以读取比其他测序方法更长的核酸分子序列，称为读数，这已推动了基因组分析的进步，如无间隙的人类基因组组装。通过分析纳米孔测序生成的原始电信号读数，现有方法可以将这些读数映射到DNA字符（即碱基调序）而无需转换，从而实现快速高效的测序数据分析。然而，原始信号常常由于噪声和处理误差而包含错误，这限制了原始信号分析的总体准确性。本文的目标是检测并纠正原始信号中的错误，以提高原始信号分析的准确性。为此，我们提出了CRANE，一种通过训练和利用隐马尔可夫模型（HMM）来准确纠正信号错误的机制。我们在各种数据集上的广泛评估表明，CRANE 1）一致提高了底层原始信号分析工具的整体准确性，2）最小化了为新型纳米孔技术优化分析管道的负担，3）不引入显著的计算开销。我们得出结论，CRANE提供了一种有效的方法，系统地在进一步分析之前识别并纠正原始纳米孔信号中的错误，这可以促进一种专门为原始纳米孔信号设计的新类别的错误校正机制。源代码：CRANE可在https://github.com/STORMgroup/CRANE上获得。我们还在GitHub页面上提供了完全重现我们结果的脚本。

英文摘要

Nanopore sequencing can read substantially longer sequences of nucleic acid molecules, called reads, than other sequencing methods, which has led to advances in genomic analysis such as the gapless human genome assembly. By analyzing the raw electrical signal reads that nanopore sequencing generates from molecules, existing works can map these reads without translating them into DNA characters (i.e., basecalling), allowing for quick and efficient analysis of sequencing data. However, raw signals often contain errors due to noise and processing errors, which limits the overall accuracy of raw signal analysis. Our goal in this work is to detect and correct errors in raw signals to improve the accuracy of raw signal analyses. To this end, we propose CRANE, a mechanism that trains and utilizes a Hidden Markov Model (HMM) to accurately correct signal errors. Our extensive evaluation on various datasets shows that CRANE 1) consistently improves the overall accuracy of the underlying raw signal analysis tools, 2) minimizes the burden of optimizing analysis pipelines for newer nanopore technologies, and 3) does not introduce substantial computational overhead. We conclude that CRANE provides an effective mechanism to systematically identify and correct the errors in raw nanopore signals before further analysis, which can enable the development of a new class of error correction mechanisms purely designed for raw nanopore signals. Source Code: CRANE is available at https://github.com/STORMgroup/CRANE. We also provide the scripts to fully reproduce our results on our GitHub page.
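
The abstract does not describe how CRANE's HMM is structured, so the following is only a generic sketch of how an HMM can smooth a noisy discretized signal with Viterbi decoding. The two-level signal, the transition and emission probabilities, and the function name `viterbi` are all assumptions made for illustration.

```python
import numpy as np

def viterbi(obs, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path for a discrete observation sequence."""
    n_states = log_start.shape[0]
    T = len(obs)
    score = np.full((T, n_states), -np.inf)
    back = np.zeros((T, n_states), dtype=int)
    score[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        # cand[i, j]: best log-score of being in state i at t-1 and moving to state j
        cand = score[t - 1][:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[:, obs[t]]
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy model (assumed values): two signal levels, sticky transitions, noisy readout.
log_start = np.log(np.array([0.5, 0.5]))
log_trans = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_emit = np.log(np.array([[0.8, 0.2], [0.2, 0.8]]))

noisy = [0, 0, 1, 0, 0, 1, 1, 1]   # the isolated 1 at index 2 plays the role of an error
corrected = viterbi(noisy, log_start, log_trans, log_emit)
print(corrected)
```

In this toy setting the sticky transitions make a single isolated flip less likely than a readout error, so the decoded path replaces it with the surrounding level.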

2602.04150 2026-05-21 q-bio.PE cond-mat.dis-nn nlin.AO

A brief review of evolutionary game dynamics in the reinforcement learning paradigm

强化学习范式下进化博弈动力学的简要回顾

Guozhong Zheng, Xin Ou, Shengfeng Deng, Jiqiang Zhang, Li Chen

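AI summary: This review covers recent advances in evolutionary game dynamics under the reinforcement learning paradigm for explaining cooperation, fairness, trust, and resource coordination, which it describes as cornerstones of modern civilization, and discusses the advantages of this paradigm over the traditional imitation learning paradigm.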

Journal ref: Communications in Theoretical Physics 78, 067601 (2026)
Comments: 27 pages, 7 figures, invited review

AI中文摘要

合作、公平、信任和资源协调是现代文明的基石，然而它们的出现仍无法充分解释理论预测与行为实验之间的持续差异。部分原因可能在于先前理论模型中常用的模仿学习范式，该范式假设个体仅根据预设的固定规则复制成功邻居。本文回顾了近期在进化博弈动力学中采用强化学习（RL）作为替代范式的新进展。在RL中，个体通过试错学习，并根据环境反馈内省性地完善其策略。我们首先介绍了进化博弈理论中的关键概念和两种学习范式，然后综合了将RL应用于阐明合作、信任、公平、最优资源协调和生态动态方面的进展。总体而言，这些研究表明，RL提供了一个有前景的统一框架，用于理解人类和自然系统中观察到的多样化社会和生态现象。

英文摘要

Cooperation, fairness, trust, and resource coordination are cornerstones of modern civilization, yet their emergence remains inadequately explained by the persistent discrepancies between theoretical predictions and behavioral experiments. Part of this gap may arise from the imitation learning paradigm commonly used in prior theoretical models, which assumes individuals merely copy successful neighbors according to predetermined, fixed rules. This review examines recent advances in evolutionary game dynamics that employ reinforcement learning (RL) as an alternative paradigm. In RL, individuals learn through trial and error and introspectively refine their strategies based on environmental feedback. We begin by introducing key concepts in evolutionary game theory and the two learning paradigms, then synthesize progress in applying RL to elucidate cooperation, trust, fairness, optimal resource coordination, and ecological dynamics. Collectively, these studies indicate that RL offers a promising unified framework for understanding the diverse social and ecological phenomena observed in human and natural systems.

2601.03019 2026-05-21 q-bio.GN cs.CL

DNACHUNKER: Learnable Tokenization for DNA Language Models

DNACHUNKER: 用于DNA语言模型的可学习分词

Taewon Kim, Jihwan Shin, Hyomin Kim, Youngmok Jung, Jonghoon Lee, Won-Chul Lee, Sungsoo Ahn, Insu Han

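AI summary: This paper proposes DNACHUNKER, a learnable DNA tokenization approach whose dynamic segmentation module produces context-dependent, variable-length units, improving the robustness and efficiency of DNA language models on genomic sequences.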

Comments: ICML 2026 camera-ready version

AI中文摘要

DNA语言模型越来越多地用于表示基因组序列，但其有效性严重依赖于原始核苷酸如何转换为模型输入。与自然语言不同，DNA没有标准的边界，使固定分词成为在移位、插入缺失和局部重复下脆弱的设计选择。我们引入了DNAChunker，一种带有可学习自适应分段模块的遮蔽DNA语言模型，以生成上下文依赖、变长的单元。基于动态分段过程，DNAChunker学会在功能丰富区域分配更细的粒度，同时压缩重复或冗余序列。我们预训练DNAChunker在人类参考基因组上，并在五个基准上评估其性能，结果在强固定分词基线中一致提升。进一步的分析和消融实验表明，与固定分词不同，分段是以生物信息学指导、对突变具有鲁棒性的学习方式进行的。

英文摘要

DNA language models are increasingly used to represent genomic sequence, yet their effectiveness depends critically on how raw nucleotides are converted into model inputs. Unlike natural language, DNA offers no canonical boundaries, making fixed tokenizations a brittle design choice under shifts, indels, and local repeats. We introduce DNAChunker, a masked DNA language model that incorporates a learnable adaptive segmentation module to produce context-dependent, variable-length units. Building on a dynamic segmentation procedure, DNAChunker learns to allocate finer granularity to functionally enriched regions while compressing repetitive or redundant sequence. We pretrain DNAChunker on the human reference genome and evaluate it across five benchmarks, where it consistently improves over strong fixed-tokenization baselines. Further analyses and ablations indicate that unlike fixed tokenizations, segmentation is learned in a biologically-informed, mutation-resilient manner.
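
The brittleness of fixed tokenization under indels can be seen in a toy example. The sketch below uses non-overlapping 3-mers as one simple kind of fixed tokenization; the sequence, the value of `k`, and the helper name `kmer_tokenize` are illustrative assumptions and are not taken from the paper's baselines.

```python
def kmer_tokenize(seq, k=3):
    """Split a sequence into non-overlapping k-mers (a shorter tail is kept as-is)."""
    return [seq[i:i + k] for i in range(0, len(seq), k)]

ref = "ATGGCCATTGTA"
ins = ref[:1] + "C" + ref[1:]   # single-base insertion after the first base

ref_tokens = kmer_tokenize(ref)
ins_tokens = kmer_tokenize(ins)
shared = set(ref_tokens) & set(ins_tokens)

print(ref_tokens)
print(ins_tokens)
print(shared)
```

A one-base insertion shifts every downstream token boundary, so the two token lists have nothing in common even though the sequences differ by a single nucleotide.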

2510.07797 2026-05-21 cond-mat.stat-mech q-bio.CB q-bio.MN

Cell State Transitions Beyond the Small-Noise Limit

超越小噪声极限的细胞状态转换

Jianzhe Wei, Jingwen Zhu, Pan Chu, Liang Luo, Xiongfei Fu

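AI summary: Through single-cell observation, this study characterizes cell state transitions in a synthetic bacterial gene circuit, finds that the transitions occur beyond the small-noise limit, which challenges the applicability of classical Kramers theory, and argues that a new theoretical framework is needed to explain biological state transitions.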

2508.17599 2026-05-21 q-bio.PE cond-mat.dis-nn nlin.AO

Decoding species coexistence: A reinforcement learning perspective

解码物种共存：强化学习视角

Kaiwen Jiang, Chenyang Zhao, Shengfeng Deng, Weiran Cai, Jiqiang Zhang, Li Chen

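AI summary: This paper studies a spatial RPS model within a reinforcement learning framework and finds that, when individual mobility is dynamically regulated by a Q-learning algorithm, the three species coexist stably across a broad range of migration rates. A balance between survival-priority and predation-priority behavioral tendencies is key to coexistence, while an imbalance between the two tendencies threatens biodiversity.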

Journal ref: Phys. Rev. E 113, 054411 (2026), Editors' Suggestion
Comments: 13 pages, 11 figures

AI中文摘要

生态学的核心目标之一是理解生物多样性如何得以维持。此前的理论工作采用了石头-剪刀-布（RPS）游戏作为玩具模型，证明种群流动性在决定物种共存中至关重要。一个关键预测是当流动性超过一定值时，生物多样性会受到威胁并最终消失——这一结论与自然界中高度流动性物种共存的实证观察相矛盾。为解决这一矛盾，我们引入了强化学习框架，并研究了一个空间RPS模型，其中个体流动性通过Q学习算法动态调节，而非固定不变。我们的结果表明，三种物种可以在广泛的基础迁移率范围内稳定共存，灭绝概率在广泛范围内保持较低。机制分析揭示，个体发展出两种行为倾向：生存优先（逃离捕食者）和捕食优先（靠近猎物）。尽管物种共存源于这两种倾向的平衡，但其不平衡会威胁生物多样性。值得注意的是，在特定状态下存在动作偏好对称破缺，这导致了物种密度的差异。此外，当Q学习物种与固定流动性物种互动时，具有适应性流动性的物种表现出显著的进化优势。本研究表明，强化学习可能为揭示生物多样性的机制和指导保护策略提供新的视角。

英文摘要

A central goal in ecology is to understand how biodiversity is maintained. Previous theoretical works have employed the rock-paper-scissors (RPS) game as a toy model, demonstrating that population mobility is crucial in determining the species' coexistence. One key prediction is that biodiversity is jeopardized and eventually lost when mobility exceeds a certain value--a conclusion at odds with empirical observations of highly mobile species coexisting in nature. To address this discrepancy, we introduce a reinforcement learning framework and study a spatial RPS model, where individual mobility is adaptively regulated via a Q-learning algorithm rather than held fixed. Our results show that all three species can coexist stably, with extinction probabilities remaining low across a broad range of baseline migration rates. Mechanistic analysis reveals that individuals develop two behavioral tendencies: survival priority (escaping from predators) and predation priority (remaining near prey). While species coexistence emerges from the balance of the two tendencies, their imbalance jeopardizes biodiversity. Notably, there is a symmetry-breaking of action preference in a particular state that is responsible for the divergent species densities. Furthermore, when Q-learning species interact with fixed-mobility counterparts, those with adaptive mobility exhibit a significant evolutionary advantage. Our study suggests that reinforcement learning may offer a promising new perspective for uncovering the mechanisms of biodiversity and informing conservation strategies.
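
The Q-learning step that the abstract refers to can be sketched with a standard tabular update. The state names, the two actions, the reward value, and the hyperparameters `alpha` and `gamma` below are hypothetical; the abstract does not specify how the paper encodes states, actions, or rewards.

```python
import random

ACTIONS = ("move", "stay")                           # hypothetical mobility actions
STATES = ("predator_near", "prey_near", "alone")     # hypothetical neighborhood states

def make_q_table():
    return {s: {a: 0.0 for a in ACTIONS} for s in STATES}

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else take the best action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

q = make_q_table()
new_value = q_update(q, "predator_near", "move", reward=1.0, next_state="alone")
greedy = choose_action(q, "predator_near", epsilon=0.0, rng=random.Random(0))
print(new_value, greedy)
```

After one rewarded escape, the greedy policy in this toy table prefers moving when a predator is near, which loosely mirrors the survival-priority tendency described in the abstract.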

2508.10821 2026-05-21 q-bio.QM

SimAQ: Mitigating Experimental Artifacts in Soft X-Ray Tomography using Simulated Acquisitions

SimAQ：利用模拟采集缓解软X射线断层扫描中的实验伪影

Jacob Egebjerg, Daniel Wüstner

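AI summary: This paper proposes SimAQ, a simulation pipeline that generates realistic yeast phantoms and applies synthetic imaging artifacts to produce paired noisy volumes, sinograms, and reconstructions for mitigating artifacts in soft X-ray tomography. Neural networks trained on these data support few-shot and zero-shot transfer learning, enabling accurate segmentation and quantitative analysis without large annotated datasets.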

2605.20523 2026-05-21 cs.LG cs.AI q-bio.QM

Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models

机器学习增强的非侵入性测试用于MASLD纤维化：浅层-深层神经网络与FIB-4、表格基础模型和大语言模型的比较

Athanasios Angelakis, Gabriele De Vito, Eleni-Myrto Trifylli, Filomena Ferrucci

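AI summary: This paper studies machine-learning-enhanced non-invasive testing for detecting advanced fibrosis in MASLD, comparing a shallow-deep neural network, FIB-4, a tabular foundation model, and a large language model across cohorts, and finds that the shallow-deep neural network gives a more balanced external operating profile while preserving the FIB-4 variable space.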

Comments: 26 pages, 4 figures, 3 tables. Preprint

AI中文摘要

晚期纤维化是代谢功能障碍相关脂肪性肝病（MASLD）中肝相关发病率的主要决定因素。FIB-4被广泛用作一线非侵入性测试，但其固定公式可能低估了年龄、天冬氨酸转氨酶、丙氨酸转氨酶和血小板计数中包含的诊断信息。我们评估了机器学习增强的非侵入性测试（MLE-NIT）是否能够在保持FIB-4变量空间的同时提高晚期纤维化的检测能力。我们使用了来自中国、马来西亚和印度的三个经活检确认的MASLD队列（n=784）。中国队列被分为486名训练样本和54名内部验证/调整治疗样本；最终性能仅在马来西亚和印度的外部队列中报告。模型使用了五个变量：年龄、FIB-4、天冬氨酸转氨酶、血小板计数和丙氨酸转氨酶。我们比较了FIB-4与浅层-深层神经网络（s-DNN）、TabPFN和gpt-4o-2024-08-06。FIB-4在马来西亚和印度的外部ROC-AUC分别为0.75和0.60。TabPFN达到0.69和0.66，微调后的GPT-4o达到0.75和0.63，而s-DNN达到0.77和0.67。s-DNN仅包含354个可训练参数，相比TabPFN的7,244,554个参数，却提供了更平衡的外部操作性能。校准显示s-DNN的Brier分数为0.18和0.22，排列重要性识别出AST和FIB-4为主要变量。紧凑的非线性MLE-NIT可能在不增加临床数据需求的情况下增强基于FIB-4的纤维化评估。

英文摘要

Advanced fibrosis is a major determinant of liver-related morbidity in metabolic dysfunction-associated steatotic liver disease (MASLD). FIB-4 is widely used as a first-line non-invasive test, but its fixed formula may underuse diagnostic information contained in age, aspartate aminotransferase, alanine aminotransferase, and platelet count. We evaluated whether machine-learning-enhanced non-invasive testing (MLE-NIT) can improve advanced fibrosis detection while preserving this FIB-4 variable space. We used three biopsy-confirmed MASLD cohorts from China, Malaysia, and India (n=784). The Chinese cohort was split into 486 training and 54 internal validation/tuning patients; final performance was reported only on the Malaysian and Indian external cohorts. Models used five variables: age, FIB-4, aspartate aminotransferase, platelet count, and alanine aminotransferase. We compared FIB-4 with a shallow-deep neural network (s-DNN), TabPFN, and gpt-4o-2024-08-06. FIB-4 achieved external ROC-AUCs of 0.75 and 0.60 in Malaysia and India, respectively. TabPFN achieved 0.69 and 0.66, fine-tuned GPT-4o achieved 0.75 and 0.63, and the s-DNN achieved 0.77 and 0.67, respectively. The s-DNN contained only 354 trainable parameters, compared with 7,244,554 for TabPFN, yet provided a more balanced external operating profile. Calibration showed s-DNN Brier scores of 0.18 and 0.22, and permutation importance identified AST and FIB-4 as dominant variables. Compact non-linear MLE-NITs may enhance FIB-4-based fibrosis assessment without increasing clinical data requirements.
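
For readers unfamiliar with the two quantities named in the abstract, here is a minimal sketch of the commonly cited FIB-4 formula (age times AST, divided by platelet count times the square root of ALT) and of the Brier score. The input values are made-up numbers for illustration, not patient data or results from the paper, and the function names are assumptions.

```python
import math

def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))

def brier_score(probs, labels):
    """Mean squared difference between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

# Illustrative inputs only.
score = fib4(age_years=60, ast_u_per_l=40, alt_u_per_l=36, platelets_1e9_per_l=200)
brier = brier_score([0.9, 0.2, 0.6, 0.1], [1, 0, 1, 0])
print(score, brier)
```

Because FIB-4 is a fixed function of four inputs, a learned model over the same inputs can in principle weight them differently, which is the idea the abstract evaluates.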

2605.20496 2026-05-21 q-bio.NC cs.CV

Platonic Representations in the Human Brain: Unsupervised Recovery of Universal Geometry

人类大脑中的柏拉图表示：无监督恢复通用几何

Pablo Marcos-Manchón, Rishi Jha, Lluís Fuentemilla

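AI summary: This study asks whether a universal geometry can be recovered across human brains without supervision. A self-supervised encoder learns subject-specific embeddings from fMRI data, and the study shows that these embeddings can be translated across subjects through geometric transformations.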

Comments: Code available at https://github.com/memory-formation/platonic-representations-fmri

AI中文摘要

强柏拉图表示假说提出，人工神经网络中的表征收敛可以被积极利用：嵌入可以通过一个通用潜在空间在不同模型间转换，而无需配对数据。我们探讨是否可以在人类大脑中恢复类似的几何结构。使用自然场景数据集的fMRI数据，我们提出了一种自监督编码器，通过利用重复的刺激呈现，仅依靠脑数据学习个体特定的嵌入表示。我们证明这些独立学习的空间可以通过无监督的正交旋转在不同个体间转换，而无需配对的跨个体样本或中间模型表示。将成对旋转同步到一个共享的潜在空间进一步提高了跨个体检索效果，表明个体特定的空间与一个共同的坐标系统相互兼容。这些结果为人类视觉皮层中的共享神经几何提供了证据：个体特定的fMRI表示在不同个体间近似等距，并可通过纯粹的几何变换进行转换。

英文摘要

The Strong Platonic Representation Hypothesis suggests that representational convergence in artificial neural networks can be harnessed constructively: embeddings can be translated across models through a universal latent space without paired data. We ask whether an analogous geometry can be recovered across human brains. Using fMRI data from the Natural Scenes Dataset, we propose a self-supervised encoder that learns subject-specific embeddings from brain data alone by exploiting repeated stimulus presentations. We show that these independently learned spaces can be translated across subjects using unsupervised orthogonal rotations, without paired cross-subject samples or intermediate model representations. Synchronizing pairwise rotations into a single shared latent space further improves cross-subject retrieval, indicating that subject-specific spaces are mutually compatible with a common coordinate system. These results provide evidence for a shared neural geometry in the human visual cortex: subject-specific fMRI representations are approximately isometric across individuals and can be translated through purely geometric transformations.
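
The claim that two embedding spaces are "approximately isometric" and related by an orthogonal rotation can be made concrete with synthetic data. Note the simplification: the paper recovers rotations without paired samples, whereas this sketch uses the paired orthogonal Procrustes solution, a swapped-in and easier technique, purely to show what such a rotation does. All arrays and names are hypothetical.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance between every pair of rows."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def procrustes_rotation(X, Y):
    """Orthogonal R minimizing ||X R - Y||_F, given row-paired X and Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 8))                    # stand-in embeddings for one subject
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))    # random orthogonal matrix
B = A @ Q                                       # second subject: a rotated copy of A

R = procrustes_rotation(A, B)
same_geometry = np.allclose(pairwise_dist(A), pairwise_dist(B))
recovered = np.allclose(A @ R, B)
print(same_geometry, recovered)
```

The rotation changes every coordinate but leaves all pairwise distances intact, which is what makes translation between the two spaces possible with a purely geometric map.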

2605.20454 2026-05-21 physics.soc-ph cs.SI q-bio.QM

Sparse Contextual Coupling Reshapes Diffusion Geometry in Multilayer Hypergraphs

稀疏上下文耦合重塑多层超图中的扩散几何

Hao Ding, Sanjukta Krishnagopal

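AI summary: This study proposes a diffusion-based framework for analyzing how sparse condition-specific layers reshape diffusion geometry in multilayer hypergraphs. Coupling a dense MSigDB functional gene-set layer to sparse disease-specific DGIdb drug-gene hypergraphs, it finds that the sparse layer substantially changes diffusion distances and community structure.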

AI中文摘要

许多复杂系统结合了密集的背景结构与稀疏的上下文信息。我们介绍了一种基于扩散的框架，用于分析稀疏条件特定层如何重塑多层超图中的扩散几何。每个层被表示为加权超图，层通过共享实体耦合，耦合系统上的随机游走诱导节点间的多尺度扩散距离。我们通过将密集的MSigDB功能基因集层与稀疏的疾病特定DGIdb药物-基因超图耦合，利用疾病相关的药物从DDDB和HumanNet-GSP定义外部基因权重，发现疾病特定层在耦合系统中包含不到2%的基因，但显著改变了扩散距离和社区结构。中心性分析表明，这种不成比例的影响是由于DGIdb关联的基因在MSigDB衍生的功能网络中占据重要位置。所得到的扩散衍生社区在子采样下保持稳定，并显示后验功能富集的一致性，包括神经精神疾病中的信号和神经递质类别，以及癌症相关疾病中的免疫、翻译和代谢类别。社区层面的比较进一步揭示了疾病相似性，这些相似性无法仅通过直接DGIdb基因重叠来解释，包括乳腺癌与精神分裂症的关系，这与最近的生物医学证据一致。这些结果表明，稀疏上下文层可以诱导在更高阶网络几何中的可解释非局部变化。

英文摘要

Many complex systems combine dense background structure with sparse contextual information. We introduce a diffusion-based framework for analyzing how sparse condition-specific layers reshape diffusion geometry in multilayer hypergraphs. Each layer is represented as a weighted hypergraph, layers are coupled through shared entities, and random walks on the coupled system induce multiscale diffusion distances between nodes. We apply the framework to disease-conditioned gene networks by coupling a dense MSigDB functional gene-set layer to sparse disease-specific DGIdb drug-gene hypergraphs, with disease-associated drugs selected from DDDB and HumanNet-GSP used to define external gene weights. Across Bipolar Disorder, Schizophrenia, Leukemia, and Breast Cancer, the disease-specific layer contains less than 2 percent of genes in the coupled system, yet substantially changes diffusion distances and community structure. Centrality analysis suggests that this disproportionate effect arises because DGIdb-associated genes occupy influential positions in the MSigDB-derived functional network. The resulting diffusion-derived communities are stable under subsampling and show coherent post hoc functional enrichment, including signaling and neurotransmission categories in neuropsychiatric diseases and immune, translational, and metabolic categories in cancer-associated diseases. Community-level comparisons further reveal disease similarities not reducible to direct DGIdb gene overlap, including a Breast Cancer-Schizophrenia relationship consistent with recent biomedical evidence. These results show that sparse contextual layers can induce interpretable nonlocal changes in higher-order network geometry.
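
To illustrate how a sparse layer can change diffusion distances, the sketch below uses a standard hypergraph random walk (pick an incident hyperedge in proportion to its weight, then a node uniformly within it) and the usual stationary-weighted diffusion distance. This construction, the way the sparse hyperedge is simply appended to the dense ones, and the tiny four-node example are assumptions; the abstract does not give the paper's exact coupling or weighting scheme.

```python
import numpy as np

def hypergraph_walk(H, w):
    """Transition matrix P and stationary distribution pi for incidence H and weights w."""
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w              # weighted node degrees
    d_e = H.sum(axis=0)      # hyperedge sizes
    P = (H * (w / d_e)) @ H.T / d_v[:, None]
    return P, d_v / d_v.sum()

def diffusion_distance(P, pi, t=2):
    """Distance between rows of P^t, weighted by 1 / pi."""
    Pt = np.linalg.matrix_power(P, t)
    diff = Pt[:, None, :] - Pt[None, :, :]
    return np.sqrt((diff ** 2 / pi).sum(-1))

# Dense layer: hyperedges {0, 1, 2} and {2, 3}; nodes 0 and 1 are indistinguishable.
H_dense = np.array([[1, 0], [1, 0], [1, 1], [0, 1]])
# Sparse layer adds a single hyperedge {1, 3} over the same nodes.
H_coupled = np.hstack([H_dense, np.array([[0], [1], [0], [1]])])

D_dense = diffusion_distance(*hypergraph_walk(H_dense, [1.0, 1.0]))
D_coupled = diffusion_distance(*hypergraph_walk(H_coupled, [1.0, 1.0, 1.0]))
print(D_dense[0, 1], D_coupled[0, 1])
```

In the dense layer alone nodes 0 and 1 sit at distance zero, and a single extra hyperedge is enough to separate them, a small-scale analogue of the disproportionate effect the abstract reports.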

2605.20208 2026-05-21 cs.CR cs.CY q-bio.TO

Artificial Pancreas Implantables -- How Healthcare Professionals May Deal With DIY Bio Cases

人工胰腺植入体——医疗专业人员如何应对DIY生物案例

Austin James, Xavier-Lewis Palmer, Lucas Potter, Celisha Oscar

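AI summary: This paper examines the cyberbiosecurity risks and clinical practice challenges that healthcare professionals face when handling regulated and DIY automated insulin delivery systems.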

AI中文摘要

自动化胰岛素输送（AID）和人工胰腺系统越来越多地作为安全关键的网络物理技术应用于临床护理，整合传感器、算法、软件和胰岛素输送硬件，以自动化维持生命的治疗。尽管受监管的商业系统受到正式批准途径、制造商治理和市场后监控的支持，但临床医生也遇到依赖自制（DIY）人工胰腺系统患者，这些系统在传统监管和机构控制结构之外运行。本文研究了常规临床处理实践如何与受监管和DIY AID系统之间的网络生物安全风险交叉。当胰岛素输送系统被根本性重新配置为定制AID系统时，患者用户成为主要威胁来源，通过承担制造商级别的角色而没有强制性的治理，整个利益相关者的生态系统都处于法律和临床不确定性之中。

英文摘要

Automated insulin delivery (AID) and artificial pancreas systems increasingly serve as safety-critical cyber-physical technologies in clinical care, integrating sensors, algorithms, software, and insulin-delivery hardware to automate a life-sustaining therapy. While regulated commercial systems are supported by formal approval pathways, manufacturer governance, and post-market surveillance, clinicians are also encountering patients who rely on do-it-yourself (DIY) artificial pancreas systems that operate outside conventional regulatory and institutional control structures. This paper examines how routine clinical handling practices intersect with cyberbiosecurity risk across both regulated and DIY AID systems. When insulin delivery systems are fundamentally reconfigured into a bespoke AID system, with the patient-user becoming the primary threat vector by assuming manufacturer-level roles without mandated governance, the entire ecosystem of stakeholders is placed in legal and clinical uncertainty.

2506.08277 2026-05-21 q-bio.NC cs.AI cs.CL cs.CV cs.LG

Task-conditioned probing of instruction-tuned multimodal LLMs: Region-specific brain alignment patterns under naturalistic stimuli

基于任务的指令调制多模态大语言模型探测：在自然主义刺激下的区域特定大脑对齐模式

Subba Reddy Oota, Khushbu Pahwa, Prachi Jindal, Satya Sai Srinath Namburi, Maneesh Singh, Tanmoy Chakraborty, Bapi S. Raju, Manish Gupta

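AI summary: This study examines the brain alignment of instruction-tuned multimodal large language models under naturalistic stimuli, comparing models across video and audio tasks to show how instruction tuning affects their representations.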

Comments: 57 pages, 39 figures

AI中文摘要

近期的体素级多模态脑编码研究显示，多模态大语言模型（MLLMs）在大脑对齐程度上高于单模态模型。更近期的研究表明，指令调制多模态（IT）模型能够生成与大脑活动强相关的任务特定表示，但大多数先前评估集中在单模态刺激或非指令调制模型上。我们仍然缺乏对指令调制是否使IT-MLLMs围绕功能任务需求组织其表示，还是仅反映表面语义的清晰理解。为此，我们通过预测自然主义电影观看（带音频的视频）期间记录的fMRI响应，来估计大脑对齐情况。使用来自六个视频和两个音频IT-MLLMs的指令特定嵌入，跨13个视频任务指令，我们发现指令调制视频MLLMs的大脑对齐程度高于上下文学习（ICL）多模态模型（~9%）、非指令调制多模态模型（~15%）和单模态基线（~20%）。我们对视频和音频任务以及语言引导的探测评估，产生了不同任务特定的MLLM表示，这些表示在不同大脑区域中变化。我们还发现，ICL模型表现出强语义组织（r=0.78），而IT模型与指令文本语义的耦合较弱（r=0.14），这与与更高大脑对齐相关的任务条件子空间一致。这些发现支持了任务特定指令与更强的大脑-MLLM对齐之间的关联，并为映射两个系统中的联合信息处理开辟了新途径。我们公开了代码 [https://github.com/subbareddy248/mllm_videos]。

英文摘要

Recent voxel-wise multimodal brain encoding studies have shown that multimodal large language models (MLLMs) exhibit a higher degree of brain alignment compared to unimodal models. More recently, instruction-tuned multimodal (IT) models have been shown to generate task-specific representations that align strongly with brain activity, yet most prior evaluations focus on unimodal stimuli or non-instruction-tuned models under multimodal stimuli. We still lack a clear understanding of whether instruction-tuning is associated with IT-MLLMs organizing their representations around functional task demands or if they simply reflect surface semantics. To address this, we estimate brain alignment by predicting fMRI responses recorded during naturalistic movie watching (video with audio) from MLLM representations. Using instruction-specific embeddings from six video and two audio IT-MLLMs, across 13 video task instructions, we find that instruction-tuned video MLLMs show higher brain alignment than in-context learning (ICL) multimodal models (~9%), non-instruction-tuned multimodal models (~15%), and unimodal baselines (~20%). Our evaluation of MLLMs across video and audio tasks, and language-guided probing produces distinct task-specific MLLM representations that vary across brain regions. We also find that ICL models show strong semantic organization (r=0.78), while IT models show weak coupling to instruction-text semantics (r=0.14), consistent with task-conditioned subspaces associated with higher brain alignment. These findings are consistent with an association between task-specific instructions and stronger brain-MLLM alignment, and open new avenues for mapping joint information processing in both systems. We make the code publicly available [https://github.com/subbareddy248/mllm_videos].
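
Brain alignment in voxel-wise encoding studies is often estimated by fitting a regularized linear map from model embeddings to fMRI responses and correlating predictions with held-out responses. The sketch below assumes ridge regression and per-voxel Pearson correlation, which the abstract does not confirm, and uses synthetic arrays in place of real embeddings and recordings.

```python
import numpy as np

def ridge_fit(X, Y, lam=1.0):
    """Closed-form ridge weights: (X^T X + lam * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, one value per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    return (yt * yp).sum(axis=0) / np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))

# Synthetic stand-ins: rows are time points, columns are features or voxels.
rng = np.random.default_rng(0)
n_train, n_test, n_feat, n_vox = 200, 100, 10, 5
W_true = rng.normal(size=(n_feat, n_vox))
X_train = rng.normal(size=(n_train, n_feat))
X_test = rng.normal(size=(n_test, n_feat))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_vox))
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(n_test, n_vox))

W = ridge_fit(X_train, Y_train)
r = voxelwise_pearson(Y_test, X_test @ W)
print(r)
```

Comparing such per-voxel correlations across embedding sources, for example embeddings produced under different task instructions, is one way to express the kind of region-specific differences the abstract describes.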

2502.07860 2026-05-21 q-bio.QM

Design of an Automated Ethanol Vapor Generating System for Alcohol Use Disorder(AUD) Animal Studies

为酒精使用障碍(AUD)动物研究设计的自动乙醇蒸汽生成系统

Alexander Pozhitkov, Douglas Ramsay, Peter A Noble

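AI summary: This study proposes an ethanol vapor generating system based on temperature control for precise dosing of ethanol vapor in animal experiments, addressing the parameter-adjustment and safety limitations of conventional systems.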

Comments: 6 pages, 2 figures, 1 table

AI中文摘要

酒精使用障碍(AUD)是一种影响约2950万美国人的常见成瘾性疾病，其特征是尽管有负面后果仍无法控制饮酒行为。个体诊断标准的数目通常决定了AUD的严重程度。AUD研究旨在理解个体易感性差异并开发预防策略。酒精蒸汽吸入已成为动物病理生理学研究中的一种有前途的方法，使研究者能够控制酒精暴露的剂量和持续时间。这种方法对于研究自愿饮酒行为的加剧至关重要。目前商用酒精蒸汽生成系统存在局限性，包括燃烧风险和需要调节多个参数。其他方法如气泡或吹过蒸发面临维持平衡和避免气溶胶化的问题。为了解决这些问题，提出了一种新的乙醇蒸汽生成系统，仅依赖温度控制，在热力学控制下形成真空，使乙醇在其中蒸发。这种方法消除了调节多个参数的需要，提供了更精确的蒸汽剂量输送。我们验证了该系统，经过几次预热循环后实现了稳定的乙醇蒸汽。使用1.2升圆筒，我们1分钟内获得了约3.6升的饱和蒸汽/空气混合物。重力测定结果显示，每次循环产生约100毫克/升或~10,000 ppm的蒸汽/空气混合物。该乙醇蒸汽生成器的预期用途是提供一种浓缩的乙醇蒸汽/空气混合物，之后需进一步稀释后再用于动物实验。

英文摘要

Alcohol Use Disorder (AUD) is a prevalent addictive disorder affecting an estimated 29.5 million Americans. It is characterized by impaired control over alcohol consumption despite negative consequences. The number of diagnostic criteria met by an individual typically determines the severity of AUD. Research into AUD focuses on understanding individual susceptibility differences and developing preventive strategies. Alcohol vapor inhalation has emerged as a promising method for pathophysiological investigations in animals, allowing researchers to control the dose and duration of alcohol exposure. This approach is crucial for studying the escalation of voluntary alcohol-drinking behavior. Current commercial systems for alcohol vapor generation have limitations, including combustion risks and the need to adjust multiple parameters. Other methods, like bubbling or blow-over evaporation, face challenges in maintaining equilibrium and avoiding aerosolization. To address these issues, a new type of ethanol vapor generating system is proposed that relies solely on temperature control, creating a vacuum into which ethanol evaporates under thermodynamic control. This approach eliminates the need to adjust multiple parameters and offers improved accuracy and precision in vapor dose delivery. We validated the system as anticipated, achieving stable ethanol vapor after a few priming cycles. Using a 1.2 L cylinder, we obtained approximately 3.6 L of saturated vapor/air mix in 1 minute. Gravimetric results showed that each cycle produced about 100 mg/L or ~10,000 ppm vapor-to-air mixture. The intended use of the ethanol vapor generator is to provide a concentrated ethanol vapor / air mixture to be further diluted before delivering to the animals.

2405.20264 2026-05-21 q-bio.PE math.DS

Transmission of multiple pathogens across species

跨物种多重病原体传播

Clotilde Djuikem, Julien Arino

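AI summary: This paper studies a model for the transmission of multiple pathogens across species, uses a branching process approximation to compute the probability of a disease outbreak, and analyzes special cases with two host species and one or two pathogens in an aquatic setting.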