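arXivDaily: a daily academic digest synced with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.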

2605.08019 2026-05-11 cs.AI q-bio.NC

Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners

Botos Csaba, Sreejan Kumar, Austin Tudor David Andrews, Laurence Hunt, Chris Summerfield, Joshua B. Tenenbaum, Rui Ponte Costa, Marcelo G. Mattar, Momchil Tomov

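AI summary: This study asks whether modern AI systems can, like humans, rapidly acquire abstract knowledge in a new environment and act on it efficiently. Using human behavioral and brain-activity data from complex game tasks, it compares frontier large reasoning models (LRMs) with deep reinforcement learning agents. LRMs significantly outperform conventional reinforcement learning models both in matching game-learning behavior and in predicting brain activity, with an order-of-magnitude advantage in cortical and subcortical regions. The results suggest that LRMs more closely model human learning and decision-making in complex, naturalistic environments.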

2605.08014 2026-05-11 q-bio.NC math.DS

Dynamical mechanisms of flexible phase-locking in cortical theta oscillators

Yangyang Wang, Benjamin R. Pittman-Polletta

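AI summary: This study examines how cortical theta oscillators flexibly phase-lock to inputs spanning a range of timescales. Using dynamical systems theory, it finds that interactions among intrinsic inhibitory currents operating on multiple timescales produce delayed Hopf bifurcation phenomena, which substantially widen the frequency range over which the oscillator can be entrained. The analysis shows that theta-timescale and delta-timescale inhibitory currents act together to form a three-timescale structure that strengthens phase-locking under external input, offering a candidate neural mechanism for cognitive functions such as speech segmentation.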

Comments 30 pages, 15 figures

Abstract

Oscillatory activity in auditory cortex is thought to play a central role in auditory and speech processing by synchronizing neural rhythms to external acoustic features of the speech stream. To support this function, cortical oscillators must flexibly phase-lock to inputs spanning a wide range of timescales, including rhythms substantially slower than their intrinsic frequency. Here we identify a general dynamical mechanism by which intrinsic inhibitory currents operating on multiple timescales enable such flexible phase-locking. Using tools from dynamical systems theory, we show that interactions between slow and superslow inhibitory processes generate prolonged post-input recovery delays through delayed Hopf phenomena, thereby substantially expanding the frequency range over which entrainment can occur. We demonstrate this mechanism in a biophysically grounded cortical theta oscillator model for speech segmentation. Specifically, we show that both a theta-timescale (4-8 Hz) inhibitory current $I_m$ and a slower delta-timescale (1-4 Hz) inhibitory potassium current $I_{\rm K_{SS}}$ are crucial for entrainment flexibility. Their interaction creates a three-timescale structure that gives rise to pronounced delay phenomena associated with a delayed Hopf bifurcation (DHB). Interestingly, the superslow $I_{\rm K_{SS}}$ and the associated DHB play little role in the unforced oscillatory dynamics, but are recruited to support phase locking under external forcing. Moreover, the intermediate-timescale current $I_m$, rather than being redundant, further expands the phase-locking range by prolonging delayed recovery along the superslow manifold. Together, these results suggest that coordination among intrinsic inhibitory currents operating on multiple timescales may represent a key mechanism supporting flexible phase locking to rhythmic inputs in the brain.

2605.07838 2026-05-11 q-bio.QM cs.AI cs.LG

PPI-Net connects molecular protein interactions to functional processes in disease

Kyle Higgins, Guadalupe Gonzalez, Dennis Veselkov, Ivan Laponogov, Kirill Veselkov

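AI summary: This study proposes PPI-Net, a hierarchical graph neural network that integrates protein-protein interaction networks with pathway-level representations to show how molecular interactions drive functional processes in disease. Using graph attention, the model propagates patient-specific molecular features across a shared biological interaction network, aggregating signal from genes up to higher-order biological processes. Experiments report strong predictive performance on multiple cancer datasets; integrating multi-omics data improves interpretability and highlights key cancer-related signaling pathways and biological mechanisms.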

Comments 17 pages, 3 figures, 2 tables

2605.07746 2026-05-11 stat.ML cs.LG q-bio.QM

Flow Matching for Count Data

Ganchao Wei, John Pearson

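AI summary: This paper addresses generative modeling of high-dimensional count data, such as single-cell RNA sequencing and neural spike trains, and proposes count-FM, a flow matching framework built on continuous-time birth-death processes. The method learns marginal transition rates on the count space in a simulation-free manner, enabling efficient generation and transport between arbitrary source and target count distributions. Experiments indicate that count-FM outperforms existing methods in sample quality, model efficiency, and path interpretability, and that it applies to unconditional generation, data transport, and conditional generation.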

2605.07614 2026-05-11 math.DS q-bio.MN

Predictive-Switching Control of Stochastic Gene Regulatory Networks: A Contractive PIDE Framework

Christian Fernández, Manuel Pájaro, Gábor Szederkényi, Irene Otero-Muras

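AI summary: This paper proposes a predictive-switching control algorithm, based on a partial integro-differential equation (PIDE) model, for shaping the probability density function of stochastic gene regulatory networks. The controller selects the input from a finite candidate set to minimize a given cost function, and a neural network approximation of the control policy yields a hybrid framework suited to high-dimensional systems. The core theoretical contribution is a contraction-based stability proof for the closed-loop PIDE dynamics, which guarantees that the density evolution becomes asymptotically independent of initial conditions and converges exponentially when a strictly positive leakage term is present. Numerical simulations illustrate the effectiveness and flexibility of the approach.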

2605.07608 2026-05-11 q-bio.QM q-bio.BM

GoForth: Language Models for RNA Design under Structure, Sequence, and Coding Constraints

Michael Lindsey

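AI summary: This paper introduces GoForth, a language model for RNA design under structure, sequence, and coding constraints. The model treats complex inverse design problems as conditional generation and decouples three components that are usually entangled: the sequence prior, the forward folding sampler, and the reward or likelihood evaluator. Experiments indicate that GoForth efficiently generates high-quality RNA sequence candidates and provides semantic embeddings of design tasks together with a learned representation of design feasibility.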

2605.07554 2026-05-11 cs.LG cs.AI q-bio.BM stat.ML

ProteinJEPA: Latent prediction complements protein language models

Dan Ofer, Dafna Shahaf, Michal Linial

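AI summary: This paper examines whether adding latent-space prediction (JEPA) to protein language models improves performance, comparing it with standard masked language modeling (MLM) under an equal training-time budget. For both pretrained and from-scratch protein sequence encoders, the best variant applies latent prediction only at masked positions while retaining the MLM cross-entropy loss (termed masked-position MLM+JEPA), and it significantly outperforms MLM alone and JEPA alone. The method improves results on several downstream tasks, including protein stability prediction, enzyme classification, and structure retrieval.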

2605.03169 2026-05-11 q-bio.NC

NeuralSet: A High-Performing Python Package for Neuro-AI

Jean-Rémi King, Corentin Bel, Linnea Evanson, Julien Gadonneix, Sophia Houhamdi, Jarod Lévy, Josephine Raugel, Andrea Santos Revilla, Mingfang Zhang, Julie Bonnaire, Charlotte Caucheteux, Alexandre Défossez, Théo Desbordes, Pablo Diego-Simón, Shubh Khanna, Juliette Millet, Pierre Orhan, Saarang Panchavati, Antoine Ratouchniak, Alexis Thual, Teon L. Brooks, Katelyn Begany, Yohann Benchetrit, Marlène Careil, Hubert Banville, Stéphane d'Ascoli, Simon Dahan, Jérémy Rapin

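AI summary: This paper presents NeuralSet, a high-performing Python package aimed at the fragmented software ecosystem at the intersection of neuroscience and modern AI. The framework handles multiple kinds of neural recordings (such as fMRI, EEG, and spikes) and complex experimental stimuli (such as text, audio, and video) in a unified way, and by decoupling metadata from efficient data extraction it integrates directly with pretrained deep learning models. NeuralSet exposes a unified PyTorch interface that scales from local development to high-performance clusters, providing extensible infrastructure for the next generation of neuro-AI research.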

2602.03490 2026-05-11 cs.LG q-bio.NC

Path Integration and Object-Location Binding Emerge in an Action-Conditioned Predictive Sequence Network

Linda Ariel Ventura, Victoria Bosch, Tim C Kietzmann, Sushrut Thorat

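AI summary: This study investigates how path integration and object-location binding can arise in an action-conditioned predictive sequence network. A recurrent neural network sequentially samples tokens from continuous two-dimensional scenes and learns a model of the environment by predicting the next token. Experiments show that prediction accuracy improves progressively and that decoding analyses reveal path integration and dynamic binding, illustrating how structured representations support prediction through flexible binding and offering a mechanistic account of sequential world modeling in cognitive science.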

Comments 8 pages, 4 figures; accepted at CogSci 2026

2510.01808 2026-05-11 q-bio.PE physics.bio-ph

Optimization of sequential therapies to maximize extinction of resistant bacteria through collateral sensitivity

Javier Molina-Hernández, José A. Cuesta, Beatriz Pascual-Escudero, Saúl Ares, Pablo Catalán

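AI summary: This study explores how to optimize sequential antibiotic therapies that exploit collateral sensitivity (CS) so as to maximize the extinction of resistant bacteria. Using a stochastic birth-death model with four genotypes, it identifies a nonlinear relationship between the antibiotic switching period and the probability of bacterial extinction and proposes a prediction framework based on the geometric distribution. It also analyzes the nonmonotonic effects of antibiotic dose and mutation rate on extinction, points to a trade-off between extinction and resistance, and offers quantitative design principles for in vitro and clinical sequential therapies.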

Comments 17 pages, 15 figures, 2 tables

DOI: 10.1103/zty8-yj5p
Journal ref: Phys. Rev. E 113, 054404 (2026)

Abstract

Antimicrobial resistance (AMR) threatens global health. A promising and underexplored strategy to tackle this problem is sequential therapies exploiting collateral sensitivity (CS), whereby resistance to one drug increases sensitivity to another. Here, we develop a four-genotype stochastic birth-death model with two bacteriostatic antibiotics to identify switching periods that maximize bacterial extinction under subinhibitory concentrations. We show that extinction probability depends nonlinearly on switching period, with stepwise increases aligned to discrete switch events: fast sequential therapies are suboptimal as they do not allow for the evolution of resistance, a key ingredient in these therapies. A geometric distribution framework accurately predicts cumulative extinction probabilities, where the per-switch extinction probability rises with switching period. We further derive a heuristic approximation for the extinction probability based on times to fixation of single-resistant mutants. Sensitivity analyses reveal that strong reciprocal CS is required for this strategy to work, and we explore how increasing antibiotic doses and higher mutation rates modulate extinction in a nonmonotonic manner. Finally, we discuss how longer therapies maximize extinction but also cause higher resistance, leading to a Pareto front of optimal switching periods. Our results provide quantitative design principles for in vitro and clinical sequential antibiotic therapies, underscoring the potential of CS-guided regimens to suppress resistance evolution and eradicate infections.
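
As a rough illustration of the geometric framework mentioned in the abstract, the sketch below computes the cumulative extinction probability after a given number of antibiotic switches. It assumes a constant per-switch extinction probability and independent switch events. The function name `cumulative_extinction` and its parameters are hypothetical and not taken from the paper, which additionally reports that the per-switch probability rises with the switching period.

```python
def cumulative_extinction(p_switch: float, n_switches: int) -> float:
    """Probability that the population is extinct by the n-th switch.

    Assumption (not from the paper): each switch independently clears the
    population with the same probability p_switch, so the number of switches
    until extinction is geometrically distributed.
    """
    if not 0.0 <= p_switch <= 1.0:
        raise ValueError("p_switch must lie in [0, 1]")
    if n_switches < 0:
        raise ValueError("n_switches must be non-negative")
    # Survival through n switches has probability (1 - p)^n.
    return 1.0 - (1.0 - p_switch) ** n_switches


if __name__ == "__main__":
    # Illustrative values only; they are not results from the paper.
    for n in range(0, 6):
        print(n, cumulative_extinction(0.2, n))
```

Under this simplification, a longer switching period would enter only through a larger `p_switch`, while a fixed treatment duration would allow fewer switches, which is one way to picture why very fast switching need not be optimal.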

2508.04056 2026-05-11 cs.RO q-bio.QM

SCOUT: Closed-Loop in-vivo System for Continuous Methane Concentration Monitoring in Cattle

Yuelin Deng, Hinayah Rojas de Oliveira, Richard M. Voyles, Upinder Kaur

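AI summary: This study presents SCOUT, a closed-loop in-vivo monitoring system for continuously measuring methane concentration inside the cattle rumen, addressing the tension between accuracy and operational feasibility in existing methods. SCOUT maintains the rumen's anaerobic environment through closed-loop gas recirculation and achieves high-temporal-resolution methane monitoring, revealing rapid concentration fluctuations associated with changes in animal behavior. The system provides a reliable data basis for modeling the relationship between concentration and emissions, supporting precision phenotyping, calibration of emission proxies, and evaluation of mitigation strategies.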

2505.22134 2026-05-11 q-bio.PE

Infection dynamics for fluctuating infection or removal rates regarding the number of infected and susceptible individuals

Seong Jun Park, M. Y. Choi

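AI summary: This paper studies epidemic dynamics when the infection and removal rates vary nonlinearly with the numbers of infected and susceptible individuals. The authors present an analytical method for computing how the number of infected individuals changes over time under such nonlinear infection and removal rates, extending the scope of the classical SIR model. The work offers a new quantitative tool for understanding complex infectious disease dynamics.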

2504.16559 2026-05-11 cs.LG q-bio.QM

Synergistic Benefits of Joint Molecule Generation and Property Prediction

Adam Izdebski, Jan Olszewski, Pankhil Gawade, Krzysztof Koras, Serra Korkmaz, Valentin Rauscher, Jakub M. Tomczak, Ewa Szczurek

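AI summary: This study examines the synergistic benefits of joint molecule generation and property prediction and proposes Hyformer, a Transformer-based joint model. Through an alternating attention mechanism and a joint pretraining strategy, the model combines molecule generation with property prediction and shows synergistic gains in conditional sampling, out-of-distribution property prediction, and representation learning. Experiments indicate that Hyformer benefits markedly from joint learning in drug discovery tasks such as antimicrobial peptide design.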

Comments 17 pages, 4 figures

2311.08433 2026-05-11 q-bio.QM cs.LG stat.AP

Clinical Characteristics and Laboratory Biomarkers in ICU-admitted Septic Patients with and without Bacteremia

Sangwon Baek, Seung Jun Lee

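AI summary: This study assesses the predictive value of clinical characteristics and laboratory biomarkers for bacteremia in septic patients admitted to the intensive care unit. In a retrospective analysis of clinical data from 218 patients, C-reactive protein (CRP) and procalcitonin (PCT) showed good predictive ability for bacteremia, and a multivariable logistic regression model combining PCT, bilirubin, neutrophil-to-lymphocyte ratio (NLR), platelets, lactate, erythrocyte sedimentation rate (ESR), and Glasgow Coma Scale (GCS) score improved predictive accuracy substantially, reaching an AUC of 0.907. The study also found a significant association between bacteremia and patient mortality, suggesting that these biomarkers are useful for clinical diagnosis and prognostic assessment.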

Comments This research is not complete

2605.07498 2026-05-11 q-bio.PE cs.CY

Modeling the Impact of Exposed Cases in a Hantavirus Outbreak on a Cruise Ship

Jiaming Cui

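AI summary: This paper studies how exposed cases affected transmission during a hantavirus outbreak on a cruise ship, building a discrete-time stochastic SEIRD model to estimate transmission dynamics, hidden exposed infections, and outbreak risk. Applying Kalman filtering to outbreak data from the World Health Organization and the European Centre for Disease Prevention and Control, it estimates a basic reproduction number of 2.76, indicating potential for sustained transmission before strict isolation measures were in place. The study also notes that undetected exposed cases may have been present early in the outbreak and would be hard to identify through symptom-based surveillance alone, underscoring the importance of rapid surveillance, broad testing, and targeted isolation in confined travel settings.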

2605.07439 2026-05-11 q-bio.BM

CA-DEL: An Open Multi-Target, Multi-Modal Benchmark for Learning from DNA-Encoded Library Screens

Mutian He, Hanqun Cao, Cheng Tan, Zijun Gao, Xiaojun Yao, Chunbin Gu, Pheng-Ann Heng

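AI summary: This paper introduces CA-DEL, an open multi-target, multi-modal benchmark dataset for learning molecule-target relationships from DNA-encoded library screens. The dataset focuses on selectivity among homologous carbonic anhydrase isoforms (CAII, CAIX, CAXII) and, by incorporating experimentally measured binding affinities ($K_i$), establishes a sim-to-real style evaluation paradigm that goes from noisy screening data to high-precision biophysical data, supporting the development of robust drug discovery models.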

2605.07035 2026-05-11 q-bio.OT cs.ET

Genetic Information as a "Chord" of Chemical Oscillations: Emergence of Catalyst-RNA Systems Driven by Superposed Rhythms

Takeshi Ishida

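AI summary: This paper addresses a central question in the origin of life, namely how catalytic peptides and information-bearing nucleic acids came to form an interdependent system, and proposes a primordial cognitive model built on two internal Lotka-Volterra chemical oscillators. By simulating interactions among polymers represented as binary sequences, the study shows how catalytic loops, primordial tRNAs, and nucleic acids that record and amplify them might form. The model suggests that internal oscillations can supply a temporal bias for sequence selection during polymer elongation and can promote both the accumulation of functional molecules and the co-evolution of catalytic function and information storage.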

Abstract

A central challenge in the origin of life is understanding how catalytic peptide-like polymers and information-bearing nucleic acid-like polymers emerged as an interdependent system. This study constructs a primordial cognitive model incorporating two internal Lotka-Volterra chemical oscillators to investigate, through simulation, whether a catalytic loop, primordial tRNAs, and nucleic acids that record and amplify them, can form through the interaction of polymers represented by binary (0/1) sequences. In this model, a mechanism was introduced where the synthesis of internal oscillations provides a temporal bias for 0/1 selection during polymer elongation, while generated functional sequences are protected, recorded, and re-amplified. Simulation results demonstrated that the proposed cognitive model significantly outperformed a contrast model based on random 0/1 selection in terms of the establishment rate of catalytic loops, the accumulation of functional molecules, polymer elongation, and the reduction of Shannon entropy in sequence distribution. Furthermore, this superiority was generally maintained across sensitivity analyses, including batch calculations with different random seeds. While this study is a computational model based on abstract binary sequences and simplified translation/replication rules rather than a direct reconstruction of life's origin, it provides a working hypothesis for the interdependent emergence of catalytic function and information retention by demonstrating that internal oscillations can bias sequence exploration within a framework linking autocatalytic networks, recording, and group selection. Future research must verify the generality and empirical validity of this framework by expanding monomer types, evolving into multi-oscillator systems, and establishing correspondences with compartmentalized experimental systems.

2605.07028 2026-05-11 q-bio.PE physics.soc-ph

Quo nomine vis vocari? A random-copying model explains the temporal sequence of papal names

Egor Lappo, Noah A. Rosenberg

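AI summary: This paper studies the choice of papal names, a process of cultural evolution spanning a millennium, and finds that it follows a random-copying mechanism similar to those used in population genetics. Although each pope weighs many considerations when choosing a name, over the long run the usage frequencies of papal names match the predictions of models such as Ewens sampling theory and the Chinese restaurant process: names are copied at random in proportion to their frequency, with room for new names to appear. The finding suggests that complex human cultural behavior may follow simple and general regularities.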

Comments 12 pages, 3 figures, 1 table
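
To make the random-copying idea in this entry concrete, here is a minimal sketch of a Chinese restaurant process. It assumes a single innovation parameter `theta`; the function name `simulate_names`, the integer labels for names, and the parameter values are hypothetical and are not the paper's fitted model.

```python
import random
from collections import Counter


def simulate_names(n_draws: int, theta: float, seed: int = 0) -> list[int]:
    """Simulate a sequence of name choices under a Chinese restaurant process.

    Assumption (illustrative only): the i-th chooser (0-based) invents a new
    name with probability theta / (theta + i); otherwise they copy the name of
    a uniformly chosen predecessor, which is copying in proportion to frequency.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    history: list[int] = []
    next_new_name = 0
    for i in range(n_draws):
        if rng.random() < theta / (theta + i):
            history.append(next_new_name)  # a name never used before
            next_new_name += 1
        else:
            history.append(rng.choice(history))  # copy an earlier choice
    return history


if __name__ == "__main__":
    names = simulate_names(n_draws=266, theta=10.0, seed=42)
    print(Counter(names).most_common(5))
```

Copying a uniformly chosen predecessor is equivalent to picking an existing name with probability proportional to how often it has already been used, which is the frequency-dependent copying the summary describes.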

2605.07026 2026-05-11 q-bio.NC cs.AI cs.LG

Learning Cross-Atlas Consistent Brain Disorder Representations via Disentangled Multi-Atlas Functional Connectivity Learning

Minheng Chen, Chao Cao, Jing Zhang, Tianming Liu, Dajiang Zhu

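AI summary: This study addresses the inconsistency of functional connectivity (FC) representations across different brain atlases and proposes a disentangled multi-atlas functional connectivity learning framework (MADCLE). The method jointly encodes FC matrices from different atlases to learn atlas-specific disease-related representations and promotes cross-atlas consistency through distribution alignment. Covariate supervision, atlas-specific reconstruction, and decorrelation constraints separate covariate and atlas-dependent residual factors, reducing the leakage of non-disease information into the disease embedding. Experiments report that MADCLE outperforms several existing methods on the ADNI and ADHD-200 datasets, showing its effectiveness for FC-based disorder identification under heterogeneous atlas schemes.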

2605.07007 2026-05-11 q-bio.CB physics.bio-ph q-bio.PE

Essential Role of Extrinsic Noise in Models of E. coli Division Control

Mattia Corigliano, Kuheli Biswas, Matteo Bocchiola, Daniele Montagnani, Ariel Amir, Marco Cosentino Lagomarsino

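AI summary: This paper studies the essential role of extrinsic noise in E. coli division control. By analytically solving a stochastic threshold-accumulation model, it describes a mechanism in which division is triggered when a division protein reaches a noisy threshold. The study quantifies the combined effects of intrinsic and extrinsic noise and of key molecular parameters, showing that including these factors yields a richer set of division strategies than the conventional adder model and accounts for experimentally observed cell-size fluctuations. The work offers a unified theoretical framework for understanding bacterial division.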

Comments 20 pages, 5 figures, 1 table

2605.06995 2026-05-11 q-bio.QM q-bio.NC

Partitioning Neural Co-Variability

Skyler Thomas, Brandon J. Zhu, Kathleen E. Cullen, Adam S. Charles

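AI summary: This study addresses trial-to-trial variability in neural population responses and proposes a new model, the Poisson matrix-normal latent variable (PMNLV) model, to capture structured gain co-variability across a neural population. With a matrix-normal prior and a quadratic soft-rectifying link function, the model extends single-neuron overdispersion models and estimates inter-neuron covariance and temporal correlation jointly. The authors also develop two complementary estimation algorithms and validate the model on simulated data and on neural recordings from mouse visual cortex, finding that shared population co-variability is most pronounced in primary visual cortex.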

Abstract

Trial-to-trial variability of neural responses has been linked to important aspects of neural computation and is essential for understanding how neuronal populations respond. While current overdispersion models treat each neuron's gain as independent of the others, this assumption fails to capture the network statistics of neuronal populations. As no existing model can capture overdispersed structured spiking gain-modulation across a neural population, network-level gain covariance remains largely unstudied. We thus present the Poisson matrix-normal latent variable (PMNLV) model, which extends single-neuron overdispersion to neural populations by placing a matrix-normal prior over the latent gain with a Kronecker-factored covariance. Spike counts are Poisson-distributed with a rate equal to the sum of a per-neuron stimulus tuning term and a matrix-normal gain, passed through a quadratic soft-rectifying link. We derive two complementary estimation algorithms: a variational EM (VEM) with a matrix-normal posterior that recovers dense Kronecker factors without structural assumptions, and a Kernel Tournament Method (KTM) that performs data-driven selection over a biologically motivated kernel dictionary and composite likelihood. On simulated data, both algorithms recover the inter-neuron and temporal covariance factors and accurate tuning curves. Applying VEM to Neuropixel recordings across four cortical regions of the mouse visual hierarchy, we replicate a previous finding that single-neuron marginal variability changes little across cortical areas. We then show that shared population co-variability, invisible to scalar summaries such as the Fano factor, peaks in primary visual cortex and declines in higher visual areas. The PMNLV framework is applicable to any simultaneously recorded population where structured gain covariance is of scientific interest.
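
The abstract's remark that shared co-variability is invisible to scalar summaries can be illustrated with a toy calculation. The sketch below is not the PMNLV model; it only computes the per-neuron Fano factor (variance divided by mean of spike counts) and a pairwise sample covariance on made-up counts. The helper names and data are hypothetical.

```python
from statistics import mean, variance


def fano_factor(counts: list[int]) -> float:
    """Fano factor of spike counts across trials: sample variance / mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined for zero mean count")
    return float(variance(counts)) / float(m)


def sample_covariance(x: list[int], y: list[int]) -> float:
    """Unbiased sample covariance between two equally long count series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = float(mean(x)), float(mean(y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


# Made-up spike counts over four trials (illustrative only).
neuron_a = [1, 3, 1, 3]
neuron_b = [1, 3, 1, 3]  # fluctuates together with neuron_a
neuron_c = [3, 1, 3, 1]  # fluctuates opposite to neuron_a

if __name__ == "__main__":
    print(fano_factor(neuron_a), fano_factor(neuron_b), fano_factor(neuron_c))
    print(sample_covariance(neuron_a, neuron_b))
    print(sample_covariance(neuron_a, neuron_c))
```

All three toy neurons have the same Fano factor, yet the pair (a, b) co-varies positively and the pair (a, c) negatively, so the single-neuron statistic cannot distinguish the two population structures.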

2605.06879 2026-05-11 cs.LG q-bio.QM

Better Protein Function Prediction by Modeling Survivorship Bias

Zhongmou Chao, Poompol Buathong, Ekaterina Selivanovitch, Susan Daniel, Peter I. Frazier

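AI summary: This study targets the survivorship bias that natural selection introduces into protein function prediction and proposes Evo-PU, an evolution-informed positive-unlabeled learning framework. By modeling differences in how observable sequences are over the course of evolution, the method distinguishes sequences that are unobserved because they are non-functional from sequences that are absent because their mutational paths are rare, which improves prediction accuracy. Experiments report that Evo-PU outperforms existing methods on multiple single-species and multi-species datasets, indicating that it is effective and broadly applicable for protein function prediction.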

Comments 29 pages, 12 figures, 3 tables

2605.06762 2026-05-11 q-bio.GN cs.AI

A Linear-Transformer Hybrid for SNP-Based Genotype-to-Phenotype Prediction in Grapevine

Yibin Wang, Murukarthick Jayakodi, Silvas Kirubakaran, Ambika Chandra, Azlan Zahid

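AI summary: This paper proposes LiT-G2P, a hybrid method that combines a linear model with a Transformer architecture for SNP-based genotype-to-phenotype prediction in grapevine. By integrating additive genetic effects with nonlinear gene interactions, the method improves the stability and accuracy of predictions for complex traits across years. Results indicate that LiT-G2P outperforms baseline models in both single-year and cross-year prediction, particularly for traits such as leaf hair density and pubescence density, and that its attention mechanism extracts key SNP loci, providing interpretable candidate markers for follow-up validation.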

Comments 15 pages, 4 Figures

2605.06728 2026-05-11 q-bio.GN cs.AI q-bio.CB

OmicsLM: A Multimodal Large Language Model for Multi-Sample Omics Reasoning

Maciej Sypetkowski, Joanna Krawczyk, Łukasz Smoliński, Remigiusz Kinas, Przemysław Pietrzak, Tomasz Jetka, Rafał Powalski

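AI summary: OmicsLM is a multimodal large language model for multi-sample omics reasoning, designed to connect quantitative omics data with natural-language biological tasks. The model represents transcriptomic data as compact continuous vectors and processes natural-language instructions, gene names, and multiple samples in a single context, enabling language-guided reasoning across samples. The study also introduces the GEO-OmicsQA benchmark for evaluating multi-sample biological question answering on real expression profiles, and reports that OmicsLM outperforms existing specialized models and general-purpose large language models on language-guided biological reasoning tasks.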

Comments 13 pages (main text), 14 pages (appendix), 1 figure, 10 tables

2602.09034 2026-05-11 q-bio.NC cs.AI

Latent-Space Causal Discovery from Indirect Neuroimaging Observations

Sangyoon Bae, Miruna Oprescu, David Keetae Park, Shinjae Yoo, Jiook Cha

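AI summary: This study aims to discover causal relationships in latent space from indirect neuroimaging observations, overcoming the signal distortions introduced by hemodynamics and volume conduction. It proposes a conditional framework based on a physical model and non-stationary latent dynamics and derives an upper bound on the propagation of inversion error. Building on this, the authors design INCAMA, which combines physics-aware inverse modeling with a delay-aware Mamba encoder and exploits regime changes to improve estimation of the causal graph structure. Experiments report that the method significantly outperforms existing approaches on both simulated and real fMRI data and, in a motor task in particular, recovers the classical visuo-motor pathway.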

Comments 9 pages, 2 figures

2602.02320 2026-05-11 cs.CL cs.AI q-bio.BM

A Large-Scale Dataset for Molecular Structure-Language Description via a Rule-Regularized Method

Feiyang Cai, Guijuan He, Yi Hu, Jingjing Wang, Joshua Luo, Tianyu Zhu, Srikanth Pilla, Gang Li, Ling Liu, Feng Luo

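AI summary: This study proposes a rule-based automated annotation framework for generating natural-language descriptions that carry complete molecular structure information, addressing the difficulty of building large, high-quality molecular structure-language datasets. It extends a chemical nomenclature parser to produce structured XML metadata, which then guides a large language model to generate precise descriptions. The resulting dataset contains roughly 163,000 molecule-description pairs, with a validated description precision of 98.6%. The dataset offers a reliable basis for research on aligning molecules with language and is applicable to a range of chemistry tasks.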

2601.17061 2026-05-11 q-bio.PE cs.NE nlin.AO

How Information Evolves: Stability-Driven Assembly and the Emergence of a Natural Genetic Algorithm

Dan Adler

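AI summary: This study examines how information evolves under non-equilibrium dynamics and proposes a framework called stability-driven assembly (SDA), in which random assembly combined with differential persistence biases a system toward more persistent structural patterns, producing a natural evolutionary process akin to a genetic algorithm. Applied to a chemical symbol space, SDA displays hallmark features of evolutionary search, such as scaffold dominance, sustained innovation, and entropy reduction, and exhibits open-ended evolutionary dynamics that are absent from equilibrium models with fixed transition rates. The study also puts forward an "evolutionary ladder" hypothesis, according to which persistence-based selection precedes genetic replication.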

Comments 39 pages, 13 figures

2512.17129 2026-05-11 cs.LG cs.MA cs.RO q-bio.QM

DiffeoMorph: Learning to Morph 3D Shapes Using Differentiable Agent-Based Simulations

Seong Ho Pahng, Guoye Guan, Benjamin Fefferman, Sahand Hormoz

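AI summary: This paper proposes DiffeoMorph, an end-to-end differentiable framework for learning morphogenesis protocols that guide a population of agents from an initial state into a target 3D shape. The method is built on an SE(3)-equivariant graph neural network, in which each agent updates its position and internal state based on its own state and on signals exchanged with other agents. The study introduces a shape-matching loss based on 3D Zernike polynomials that compares predicted and target shapes as continuous spatial distributions and is invariant to agent ordering, agent number, and global orientation while remaining sensitive to mirror reflection. Experiments show that DiffeoMorph can generate complex 3D structures from simple initial conditions, offering a general framework for learning distributed control strategies in morphogenesis, swarm robotics, and programmable self-assembly.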

2510.24736 2026-05-11 q-bio.QM cs.LG q-bio.BM

RNAGenScape: Property-Guided, Optimized Generation of mRNA Sequences with Manifold Langevin Dynamics

Danqi Liao, Chen Liu, Xingzhi Sun, Dié Tang, Haochen Wang, Scott Youlten, Srikar Krishna Gopinath, Haejeong Lee, Ethan C. Strayer, Antonio J. Giraldez, Smita Krishnaswamy

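AI summary: RNAGenScape is an mRNA sequence generation framework based on manifold Langevin dynamics, designed to produce optimized mRNA sequences with specified biological properties. The method learns the latent manifold of real data and performs constrained optimization on that manifold, which helps keep generated sequences biologically plausible and functional. The study combines an autoencoder, a property predictor, and a property-guided optimization procedure, substantially improving the performance metrics of generated sequences while maintaining high generation efficiency.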

Comments ICML 2025 Generative AI and Biology (GenBio) Workshop, Oral presentation

2510.18516 2026-05-11 q-bio.NC cs.LG

Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware Pretraining

Sangyoon Bae, Mehdi Azabou, Blake Richards, Jiook Cha

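AI summary: This study addresses heterogeneity in neural recordings arising from cell-type differences, circuit dynamics, and the stochasticity of stimulus responses, and proposes POYO-CAP, a biologically informed pretraining method. The method identifies neurons with strong statistical regularity, pretrains on them with masked reconstruction and auxiliary supervision, and then fine-tunes on more stochastic neuronal populations, which improves model performance. Experiments report a 12-13% improvement over training from scratch on the Allen Brain Observatory dataset together with stable scaling in model size, turning neural heterogeneity into a scalable learning advantage.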