arXivDaily: arXiv daily academic digest, updated Monday to Friday
2508.12434 2026-02-24 q-bio.PE

Estimating wolf population size in France using non-invasive genetic sampling and spatial capture recapture models

Cyril Milleret, Christophe Duchamp, Sarah Bauduin, Cecile Kaerle, Agathe Pirog, Guillaume Queney, Olivier Gimenez

Comments 17 pages, 1 figure

Abstract

Population size is a key metric for management and conservation. This is especially true for large carnivore populations for which management decisions are often based on population size estimates. In France, gray wolves (Canis lupus) have been monitored for more than two decades using non-invasive genetic sampling and capture-recapture models. Population size estimates directly inform the annual number of wolves that can be killed legally. It is therefore key to use appropriate methods to obtain robust population size estimates. To track the recent numerical and geographical expansion of the population, sample collection was substantially increased during the winter 2023/24 across the entire wolf distribution range in France. A total of 1964 samples were genotyped and assigned to 576 different individuals using microsatellite genetic markers. For the winter 2023/24, spatial capture-recapture models estimated the wolf population size in France to likely lie between 920 and 1125 individuals (95% credible interval). Detection probability varied spatially and was positively influenced by snow cover and accessibility. Wolf density was strongly associated with the recent presence of the species, reflecting the ongoing recolonization process from the Alps. This work illustrates the usefulness of non-invasive genetic data and spatial capture-recapture for large-scale population assessment. It also lays the ground for future improvements in monitoring to fully exploit the potential of spatial capture-recapture models.

2501.10471 2026-02-24 cs.LG q-bio.QM stat.ML

VillageNet: Graph-based, Easily-interpretable, Unsupervised Clustering for Broad Biomedical Applications

Aditya Ballal, Gregory A. DePaul, Esha Datta, Asuka Hatano, Erik Carlsson, Ye Chen-Izu, Javier E. López, Leighton T. Izu

Comments Software available at https://villagenet.streamlit.app/ Github Link: https://github.com/lordareicgnon/VillageNet

Abstract

Clustering large high-dimensional datasets with diverse variables is essential for extracting high-level latent information from these datasets. Here, we developed an unsupervised clustering algorithm that we call "Village-Net". Village-Net is specifically designed to effectively cluster high-dimensional data without a priori knowledge of the number of existing clusters. The algorithm operates in two phases: first, utilizing K-Means clustering, it divides the dataset into distinct subsets we refer to as "villages". Next, a weighted network is created, with each node representing a village, capturing their proximity relationships. To achieve optimal clustering, we process this network using the Walk-likelihood Community Finder (WLCF), a community detection algorithm developed by one of our team members. A salient feature of Village-Net Clustering is its ability to autonomously determine an optimal number of clusters for further analysis based on inherent characteristics of the data. We present extensive benchmarking on extant real-world datasets with known ground-truth labels to showcase its competitive performance, particularly in terms of the normalized mutual information (NMI) score, when compared to other state-of-the-art methods. The algorithm is computationally efficient, boasting a time complexity of O(N*k*d), where N signifies the number of instances, k represents the number of villages and d represents the dimension of the dataset, which makes it well suited for effectively handling large-scale datasets.
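
The two-phase structure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy data, and in particular the edge-weight rule (counting points whose two nearest centroids are a given pair of villages) are hypothetical choices made here for illustration. The final community-detection step (WLCF in the paper) is omitted. The assignment step inside K-Means is where the O(N*k*d) cost arises.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two d-dimensional points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the closest centroid; costs O(k*d) per point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-Means; each cluster plays the role of a 'village'."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: O(N*k*d) over the whole dataset.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a village is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

def village_graph(points, centroids):
    """Weighted village network (hypothetical weight rule, see lead-in):
    each point adds 1 to the edge joining its two nearest villages."""
    weights = {}
    for p in points:
        order = sorted(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        edge = (min(order[0], order[1]), max(order[0], order[1]))
        weights[edge] = weights.get(edge, 0) + 1
    return weights

# Toy data: two Gaussian blobs in 2D, split into k = 6 villages.
rng = random.Random(1)
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
          for cx, cy in [(0.0, 0.0), (10.0, 10.0)] for _ in range(100)]
k = 6
centroids, labels = kmeans(points, k)
weights = village_graph(points, centroids)
print(sorted(weights.items()))
```

A community-detection algorithm would then be run on `weights` to merge villages into final clusters; which algorithm is used determines how the number of clusters is chosen.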

2602.19775 2026-02-24 q-bio.QM cond-mat.stat-mech cs.LG physics.comp-ph q-bio.MN

Exact Discrete Stochastic Simulation with Deep-Learning-Scale Gradient Optimization

Jose M. G. Vilar, Leonor Saiz

Comments 28 pages, 8 figures

Abstract

Exact stochastic simulation of continuous-time Markov chains (CTMCs) is essential when discreteness and noise drive system behavior, but the hard categorical event selection in Gillespie-type algorithms blocks gradient-based learning. We eliminate this constraint by decoupling forward simulation from backward differentiation, with hard categorical sampling generating exact trajectories and gradients propagating through a continuous massively-parallel Gumbel-Softmax straight-through surrogate. Our approach enables accurate optimization at parameter scales over four orders of magnitude beyond existing simulators. We validate for accuracy, scalability, and reliability on a reversible dimerization model (0.09% error), a genetic oscillator (1.2% error), a 203,796-parameter gene regulatory network achieving 98.4% MNIST accuracy (a prototypical deep-learning multilayer perceptron benchmark), and experimental patch-clamp recordings of ion channel gating (R^2 = 0.987) in the single-channel regime. Our GPU implementation delivers 1.9 billion steps per second, matching the scale of non-differentiable simulators. By making exact stochastic simulation massively parallel and autodiff-compatible, our results enable high-dimensional parameter inference and inverse design across systems biology, chemical kinetics, physics, and related CTMC-governed domains.
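
To make the "hard selection forward, soft surrogate backward" idea concrete, the sketch below shows only the forward quantities involved. It relies on the Gumbel-max trick: adding independent Gumbel(0, 1) noise to the log-propensities and taking the argmax draws an event with probability proportional to its propensity, which is the selection rule of Gillespie's direct method. The tempered softmax over the same perturbed scores is the continuous relaxation. The propensity values, the temperature `tau`, and the function names are illustrative assumptions; a straight-through gradient would additionally need an autodiff framework and is not shown.

```python
import math
import random

def gumbel(rng):
    """Draw one Gumbel(0, 1) sample by inverse transform."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))

def select_event(propensities, rng, tau=0.5):
    """Return (hard index, soft weights) for one event selection.

    propensities must be strictly positive (log is taken).
    hard: exact categorical draw via the Gumbel-max trick.
    soft: Gumbel-Softmax relaxation of the same draw at temperature tau.
    """
    scores = [math.log(a) + gumbel(rng) for a in propensities]
    hard = max(range(len(scores)), key=scores.__getitem__)
    top = max(scores)
    unnorm = [math.exp((s - top) / tau) for s in scores]
    norm = sum(unnorm)
    soft = [w / norm for w in unnorm]
    return hard, soft

rng = random.Random(0)
propensities = [1.0, 3.0, 6.0]   # hypothetical reaction propensities
n_draws = 20000
counts = [0] * len(propensities)
for _ in range(n_draws):
    hard, soft = select_event(propensities, rng)
    counts[hard] += 1

freqs = [c / n_draws for c in counts]
target = [a / sum(propensities) for a in propensities]
print("empirical:", freqs)
print("expected: ", target)
```

The empirical frequencies of the hard draws should approach the normalized propensities, while `soft` always puts its largest weight on the same index as `hard`.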

2602.19521 2026-02-24 q-bio.CB

A mathematical model for the role of macrophage chemotactic emigration in the early atherosclerotic plaque

Michael G. Watson

Abstract

Atherosclerotic plaques are fatty, cellular lesions that form in artery walls. The early plaque contains monocyte-derived macrophages, which are recruited to consume locally bound lipid deposits. Plaque progression is characterised by an imbalance in the rates of cell entry and exit from the plaque, which can occur if macrophages die in situ rather than leave by emigration. The mechanisms that regulate macrophage emigration are not well understood, but there is evidence that a chemotactic response can guide macrophages out of the plaque towards the artery wall lymphatics. In this paper, we develop a novel spatial model of the early plaque to study the implications of macrophage chemotactic emigration. Using mathematical analysis and numerical simulations, we investigate how the properties of the chemotactic response contribute to the spatial characteristics and lipid burden of the model plaque. Calculations of macrophage transit times are found to provide a reliable indicator of long-term plaque lipid burden, and also highlight the potential rate-limiting effect of the internal elastic lamina (IEL) on chemotactic emigration. When macrophage emigration is rate-limited by the IEL, we observe non-monotonic cell and lipid profiles that are associated with macrophage accumulation deep in the plaque. The model further predicts that when the chemoattractant penetrates only a short distance into the plaque, the proportion of emigrating macrophages may increase relative to that for a longer-range signal. The theoretical observations in this study can potentially be used to identify evidence of macrophage emigration in data from real atherosclerotic plaques.

2602.19295 2026-02-24 q-bio.QM stat.AP stat.ME

Time-Varying Hazard Patterns and Co-Mutation Profiles of KRAS G12C and G12D in Real-World NSCLC

Robert Amevor, Dennis Baidoo, Emmanuel Kubuafor

Abstract

Background: KRAS mutations are the largest oncogenic subset in NSCLC. While KRAS G12C is now targetable, no approved therapies exist for G12D. We examined time-to-next-treatment (TTNT) and overall survival (OS) differences between G12C and G12D, allowing for time-varying hazard effects. Methods: De-identified data from AACR Project GENIE BPC NSCLC v2.0-public were analyzed. TTNT served as a real-world surrogate for progression-free survival. Co-mutations (TP53, STK11, KEAP1, SMARCA4, MET), TMB, and PD-L1 were harmonized. Kaplan-Meier, multivariable Cox, and a pre-specified piecewise Cox model (split at median TTNT = 23 months) were applied. Schoenfeld residuals assessed proportional hazards; bootstrap resampling (B=1000) evaluated stability. Results: Among 162 TTNT-evaluable patients (G12C n=130; G12D n=32), median TTNT was 28.6 versus 32.0 months (log-rank p=0.79). Adjusted Cox regression showed no overall hazard difference (HR=0.85; 95% CI 0.53-1.37; p=0.50), but Schoenfeld testing indicated borderline non-proportionality (p=0.053). Piecewise Cox modeling revealed time-varying effects: early TTNT hazard favored G12D (HR=0.41; 95% CI 0.17-0.97; p=0.043) with significant KRAS x period interaction (HR=3.33; p=0.021) and late-period attenuation (HR=1.38; 95% CI 0.77-2.47; p=0.285). Bootstrap resampling confirmed this pattern (median HRearly=0.39; HRlate=1.41). Among 278 OS-evaluable patients (133 deaths), G12D showed improved OS (adjusted HR=0.63; 95% CI 0.39-0.99; p=0.048). G12C tumors exhibited higher TMB (9.79 vs 7.83 mut/Mb; p=0.002) and greater STK11/KEAP1 enrichment. Conclusions: KRAS G12D demonstrated early TTNT advantage and improved OS. Late-period TTNT differences were non-significant (post-hoc power: 12.3%). These exploratory findings require validation in larger cohorts but support allele-specific therapeutic development for G12D.
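
As a quick consistency check on the piecewise results reported in this abstract: in the common parameterization of a piecewise Cox model with a group-by-period interaction, log-hazard ratios add, so the late-period hazard ratio equals the early-period hazard ratio multiplied by the interaction hazard ratio. The abstract does not state its parameterization, so that is an assumption here. Under it, the reported values agree up to rounding.

```python
import math

# Values reported in the abstract.
hr_early = 0.41          # early-period HR (G12D vs G12C)
hr_interaction = 3.33    # KRAS x period interaction HR
hr_late_reported = 1.38  # late-period HR

# Assumed parameterization: log HR_late = log HR_early + log HR_interaction.
log_hr_late = math.log(hr_early) + math.log(hr_interaction)
hr_late_implied = math.exp(log_hr_late)

relative_gap = abs(hr_late_implied - hr_late_reported) / hr_late_reported
print(f"implied late HR: {hr_late_implied:.3f}")
print(f"relative gap to reported value: {relative_gap:.1%}")
```

The small gap is what one would expect from multiplying two estimates that were each rounded to two decimal places.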

2602.19196 2026-02-24 q-bio.QM cs.CE cs.LG physics.flu-dyn

An Interpretable Data-Driven Model of the Flight Dynamics of Hawks

Lydia France, Karl Lapo, J. Nathan Kutz

Comments 16 pages, 4 figures

Abstract

Despite significant analysis of bird flight, generative physics models for flight dynamics do not currently exist. Yet the underlying mechanisms responsible for various flight manoeuvres are important for understanding how agile flight can be accomplished. Even in a simple flight, multiple objectives are at play, complicating analysis of the overall flight mechanism. Using the data-driven method of dynamic mode decomposition (DMD) on motion capture recordings of hawks, we show that multiple behavioral states, such as flapping, turning, landing, and gliding, can be modeled by simple and interpretable modal structures (i.e. the underlying wing-tail shape) which can be linearly combined to reproduce the experimental flight observations. Moreover, the DMD model can be used to extrapolate naturalistic flapping. Flight is highly individual, with differences in style across the hawks, but we find they share a common set of dynamic modes. The DMD model is a direct fit to data, unlike traditional models constructed from physics principles which can rarely be tested on real data and whose assumptions are typically invalid in real flight. The DMD approach gives a highly accurate reconstruction of the flight dynamics with only three parameters needed to characterize flapping, and a fourth to integrate turning manoeuvres. The DMD analysis further shows that the underlying mechanism of flight, much like the simplest walking models, displays a parametric coupling between dominant modes, suggesting efficiency for locomotion.
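
For readers unfamiliar with DMD, the core step is fitting a linear operator A that advances one snapshot to the next, then reading growth rates and frequencies off the eigenvalues of A. The sketch below does this for a hypothetical two-dimensional toy signal (a damped rotation), where the least-squares fit and the eigenvalues can be computed by hand. Practical DMD on high-dimensional motion-capture data uses an SVD-based reduced form instead; nothing here reproduces the paper's data or pipeline.

```python
import cmath
import math

def simulate(r, theta, steps):
    """Snapshots of a damped rotation: x_{t+1} = r * R(theta) * x_t."""
    c, s = r * math.cos(theta), r * math.sin(theta)
    x = (1.0, 0.0)
    out = [x]
    for _ in range(steps):
        x = (c * x[0] - s * x[1], s * x[0] + c * x[1])
        out.append(x)
    return out

def fit_linear_operator(snapshots):
    """Least-squares A minimizing sum ||y_t - A x_t||^2, i.e. A = Y X^T (X X^T)^-1."""
    X, Y = snapshots[:-1], snapshots[1:]
    g00 = sum(x[0] * x[0] for x in X)
    g01 = sum(x[0] * x[1] for x in X)
    g11 = sum(x[1] * x[1] for x in X)
    c00 = sum(y[0] * x[0] for x, y in zip(X, Y))
    c01 = sum(y[0] * x[1] for x, y in zip(X, Y))
    c10 = sum(y[1] * x[0] for x, y in zip(X, Y))
    c11 = sum(y[1] * x[1] for x, y in zip(X, Y))
    det = g00 * g11 - g01 * g01
    i00, i01, i11 = g11 / det, -g01 / det, g00 / det  # inverse of the 2x2 Gram matrix
    return ((c00 * i00 + c01 * i01, c00 * i01 + c01 * i11),
            (c10 * i00 + c11 * i01, c10 * i01 + c11 * i11))

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

r_true, theta_true = 0.98, 0.3   # hypothetical decay per step and rotation angle
snapshots = simulate(r_true, theta_true, 50)
A = fit_linear_operator(snapshots)
lam1, lam2 = eigenvalues_2x2(A)
print("decay per step:", abs(lam1))
print("angle per step:", abs(cmath.phase(lam1)))
```

The modulus of each eigenvalue gives the per-step decay and its phase gives the oscillation frequency, which is the sense in which a small set of modes can summarize a periodic motion such as flapping.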

2602.19138 2026-02-24 q-bio.NC cs.AI

CRCC: Contrast-Based Robust Cross-Subject and Cross-Site Representation Learning for EEG

Xiaobin Wong, Zhonghua Zhao, Haoran Guo, Zhengyi Liu, Yu Wu, Feng Yan, Zhiren Wang, Sen Song

Comments First edition

Abstract

EEG-based neural decoding models often fail to generalize across acquisition sites due to structured, site-dependent biases implicitly exploited during training. We reformulate cross-site clinical EEG learning as a bias-factorized generalization problem, in which domain shifts arise from multiple interacting sources. We identify three fundamental bias factors and propose a general training framework that mitigates their influence through data standardization and representation-level constraints. We construct a standardized multi-site EEG benchmark for Major Depressive Disorder and introduce CRCC, a two-stage training paradigm combining encoder-decoder pretraining with joint fine-tuning via cross-subject/site contrastive learning and site-adversarial optimization. CRCC consistently outperforms state-of-the-art baselines and achieves a 10.7 percentage-point improvement in balanced accuracy under strict zero-shot site transfer, demonstrating robust generalization to unseen environments.

2602.16255 2026-02-24 q-bio.BM math-ph math.MP nlin.PS nlin.SI

Piecewise integrability of the discrete Hasimoto map for analytic prediction and design of helical peptides

Yiquan Wang

Abstract

The representation of protein backbone geometry through the discrete nonlinear Schrödinger equation provides a theoretical connection between biological structure and integrable systems. Although the global application of this framework is constrained by chiral degeneracies and non-local interactions, helical peptides can be modeled as piecewise integrable systems where the discrete Hasimoto map remains applicable within specific geometric boundaries. We delineate these boundaries through an analytic mapping $(ϕ,ψ) \rightarrow (κ,τ)$ between biochemical dihedral angles and Frenet frame parameters for 50 helical peptide chains. This transformation is globally information-preserving but ill-conditioned within the helical basin (median Jacobian condition number 31), suggesting chiral information loss arises primarily from local coordinate compression rather than topological singularities. Using a local integrability error $E[n]$ derived from the discrete dispersion relation, we show deviations from integrability are driven predominantly by torsion non-uniformity, while curvature remains rigid. This metric identifies integrable islands where the analytic dispersion relation predicts backbone coordinates with sub-angstrom accuracy (median RMSD 0.77 Å), enabling a segmentation strategy that isolates structural defects and trims non-integrable terminal fraying. Evaluating only these integrable islands, the dispersion relation extracts high-accuracy structural cores for 88% of the dataset. Inverse backbone design is feasible within a defined integrability zone where the design constraint reduces essentially to controlling torsion uniformity. These findings advance the Hasimoto formalism from a qualitative descriptor toward a precise quantitative framework for analyzing and designing local protein geometry within the limits of piecewise integrability.

2602.14005 2026-02-24 q-bio.BM

Physical principles of building protein megacomplexes in a crowded milieu

Jiayi Wang, Jules Nde, Andrei G. Gasic, Jacob Haseley, Margaret S. Cheung

Abstract

Multiple phenotypic protein expressions arising from one genome represent variations in protein relative abundance and stoichiometry. A lack of definite compositional parts challenges the modeling of protein megacomplexes and cellular architectures. Despite advances in AI-based protein structure prediction, the mechanisms of protein interactions and the emergence of the megacomplexes they assemble remain unclear. Here, we present a statistical physics framework based on the grand canonical ensemble to explore the protein interactions that drive the emergent assembly of a megacomplex, using observational mass spectrometry datasets that include protein relative abundance and cross-linked connections. Using the chromatin remodeler megacomplex INO80 as an example, we discovered a class of divergent proteins that play a critical role in orchestrating the assembly beyond nearest neighbors, dependent on the excluded volumes exerted by others. Under the excluded-volume constraints imposed by varying crowding contents, these divergent subunits orchestrate and form clusters with selective components, growing into configurationally distinct architectures. We propose a machinery view of the INO80 chromatin remodeler complex in which each loosely associated subunit can occasionally be recruited as a part attached to a core assembly, driven by excluded volumes. Our computational framework provides mechanistic insight into treating macromolecular crowding as a necessary physicochemical variable that represents cell states and remodels the configurations of protein megacomplexes with structurally loose modules.

2601.05605 2026-02-24 q-bio.QM

AntibodyDesignBFN: High-Fidelity Fixed-Backbone Antibody Design via Discrete Bayesian Flow Networks

Yue Hu, Feng Tao, Junqing Wang, YingChao Liu

Comments 6 pages, 1 table, 4 equations

Abstract

The computational design of antibodies with high specificity and affinity is a cornerstone of modern therapeutic development. While deep generative models have demonstrated potential, they often struggle to balance high-fidelity geometric conditioning with the discrete nature of amino acid sequences. In this work, we present AntibodyDesignBFN, a novel framework for fixed-backbone antibody design based on Discrete Bayesian Flow Networks (BFN). Unlike standard diffusion models, BFNs operate on a continuous probability simplex, enabling a fully differentiable generative process that seamlessly integrates geometric gradients. By combining a lightweight Geometric Transformer with Invariant Point Attention (IPA) and a resource-efficient training strategy, our model establishes a new state-of-the-art. Evaluations on a rigorous 2025 temporal test set (43 complexes) demonstrate that AntibodyDesignBFN achieves an unprecedented Amino Acid Recovery (AAR) of 67.8%, significantly outperforming leading graph-based baselines. Furthermore, the model is highly efficient, enabling millisecond-scale inference on consumer-grade hardware. AntibodyDesignBFN thus offers a powerful, accessible, and mathematically robust framework for next-generation antibody engineering. Code and model checkpoints are available at https://github.com/YueHuLab/AntibodyDesignBFN and https://huggingface.co/YueHuLab/AntibodyDesignBFN.

2512.04808 2026-02-24 q-bio.NC cs.AI

Setting up for failure: automatic discovery of the neural mechanisms of cognitive errors

Puria Radmard, Paul M. Bays, Máté Lengyel

Abstract

Discovering the neural mechanisms underpinning cognition is one of the grand challenges of neuroscience. However, previous approaches for building models of RNN dynamics that explain behaviour required iterative refinement of architectures and/or optimisation objectives, resulting in a piecemeal, and mostly heuristic, human-in-the-loop process. Here, we offer an alternative approach that automates the discovery of viable RNN mechanisms by explicitly training RNNs to reproduce behaviour, including the same characteristic errors and suboptimalities, that humans and animals produce in a cognitive task. Achieving this required two main innovations. First, as the amount of behavioural data that can be collected in experiments is often too limited to train RNNs, we use a non-parametric generative model of behavioural responses to produce surrogate data for training RNNs. Second, to capture all relevant statistical aspects of the data, we developed a novel diffusion model-based approach for training RNNs. To showcase the potential of our approach, we chose a visual working memory task as our test-bed, as behaviour in this task is well known to produce response distributions that are patently multimodal (due to swap errors). The resulting network dynamics correctly predicted qualitative features of macaque neural data. Importantly, these results were not possible to obtain with more traditional approaches, i.e., when only a limited set of behavioural signatures (rather than the full richness of behavioural response distributions) were fitted, or when RNNs were trained for task optimality (instead of reproducing behaviour). Our approach also yields novel predictions about the mechanism of swap errors, which can be readily tested in experiments. These results suggest that fitting RNNs to rich patterns of behaviour provides a powerful way to automatically discover mechanisms of important cognitive functions.

2509.00122 2026-02-24 q-bio.PE

Two Issues in Modelling Fish Migration

Hidekazu Yoshioka

Abstract

Fish migration is a dynamic phenomenon observed in many surface water bodies on Earth, yet it remains insufficiently understood. In particular, the biological mechanism behind fish migration is not fully understood. Moreover, migration is often observed visually and hence manually, raising questions about the accuracy and interpretation of the sampled data. We address these two issues, mechanism and observation, using a recently developed mathematical model. The results in this short paper show that fish migration can be characterized through a minimization principle, and they evaluate the error of manual observations. The minimization principle we hypothesize is an optimal control problem where the migrating fish population dynamically changes its size and fluctuation. We numerically investigate alternating and intensive observation schemes as case studies, demonstrating that in some realistic conditions the estimate of total fish count is not reliable. We believe that this paper contributes to a deeper understanding of fish migration.

2503.05128 2026-02-24 q-bio.OT

Toward a general theory for the universality and scaling in critical thermal responses in biology

Jose Ignacio Arroyo, Pablo A. Marquet, Christopher P. Kempes, Geoffrey West

Comments This chapter is accepted in the forthcoming volume entitled "Scaling in Biology: A New Synthesis" (in press, SFI Press; editors: Enquist, Kempes, O'Connor).

Abstract

We developed a theory showing that under appropriate normalizations and rescalings, temperature response curves show a remarkably regular behavior and follow a general, universal law. The impressive universality of temperature response curves remained hidden due to various curve-fitting models not well-grounded in first principles. In addition, this framework has the potential to explain the origin of different scaling relationships in thermal performance in biology, from molecules to ecosystems. Here, we summarize the background, principles and assumptions, predictions, implications, and possible extensions of this theory.

2602.19023 2026-02-24 cond-mat.stat-mech physics.bio-ph q-bio.NC

Critical Scaling and Metabolic Regulation in a Ginzburg-Landau Theory of Cognitive Dynamics

Gunn Kim

Comments 5 pages, 3 figures. Includes Supplemental Material. Submitted for publication

Abstract

We formulate a phenomenological effective field theory in which biological intelligence emerges as a macroscopic order parameter sustained by continuous metabolic flux. By modeling cognition as a coarse-grained neural activity field governed by a variational free energy, we derive closed-form expressions for information capacity and structural susceptibility using a Gaussian maximum entropy approximation. The theory predicts a universal algebraic divergence of the susceptibility, $χ\sim K^{-3/2}$, as the structural stiffness $K$ approaches the instability threshold. The exponent $γ= 3/2$ is consistent with the mean-field branching process universality class, thereby providing a theoretical rationale for the observed avalanche size exponent $τ\approx 3/2$ in cortical dynamics without invoking microscopic equivalence. We identify adult cognition as a metabolically pinned non-equilibrium steady state maintained near the critical regime $Γ\equiv K/α\approx 1$ by continuous metabolic regulation, while pathological decline corresponds to a delocalization transition triggered by the violation of structural stability conditions. The framework generates concrete, falsifiable predictions for attention scaling, altered states of consciousness, and transcranial magnetic stimulation responses, each of which can be tested against existing neuroimaging and electrophysiological datasets.

2602.18960 2026-02-24 cs.AI cs.NE q-bio.NC

Modularity is the Bedrock of Natural and Artificial Intelligence

Alessandro Salatiello

Journal ref ICLR 2025 - Second Workshop on Representational Alignment (Re-Align) https://iclr.cc/virtual/2025/36838

Abstract

The remarkable performance of modern AI systems has been driven by unprecedented scales of data, computation, and energy -- far exceeding the resources required by human intelligence. This disparity highlights the need for new guiding principles and motivates drawing inspiration from the fundamental organizational principles of brain computation. Among these principles, modularity has been shown to be critical for supporting the efficient learning and strong generalization abilities consistently exhibited by humans. Furthermore, modularity aligns well with the No Free Lunch Theorem, which highlights the need for problem-specific inductive biases and motivates architectures composed of specialized components that solve subproblems. However, despite its fundamental role in natural intelligence and its demonstrated benefits across a range of seemingly disparate AI subfields, modularity remains relatively underappreciated in mainstream AI research. In this work, we review several research threads in artificial intelligence and neuroscience through a conceptual framework that highlights the central role of modularity in supporting both artificial and natural intelligence. In particular, we examine what computational advantages modularity provides, how it has emerged as a solution across several AI research areas, which modularity principles the brain exploits, and how modularity can help bridge the gap between natural and artificial intelligence.

2602.18942 2026-02-24 q-bio.PE physics.bio-ph

Feasibility as a moving target: Fluctuating species interactions lead to universal power law in equilibrium abundances

Cagatay Eskin, Vu Nguyen, Dervis Can Vural

Comments 10 pages, 6 figures

Abstract

Theoretical ecology has traditionally equated persistence with the stability of a fixed equilibrium point. Here we argue that the primary threat to ecosystem persistence need not be the loss of stability, but instead the escape of the stable equilibrium to a negative orthant. In a realistic setting, fluctuations in interactions do not merely disturb abundances about an equilibrium but can displace the equilibrium point itself. We theoretically and empirically analyze such displacements of the equilibrium point in a complex community. Theoretically, we find that light-tailed fluctuations in species interactions, no matter how small, lead to a heavy-tailed power law $P(y)=1/y^α$ for the equilibrium abundance $y$ of a species. Remarkably, the exponent $α=2$ is a universal value independent of interaction structure, community size, and species. Empirically, our analysis of 34 species reveals a power law signal for most, with a median exponent $α\sim 2.56$. Next, we derive a formula for the critical noise, $σ_c$, beyond which the community experiences feasibility loss "with near certainty". We find that $σ_c(N)\sim N^{-1}$, implying that larger communities are significantly more fragile to noise-induced feasibility loss. Lastly, we define and calculate biologically measurable analytical metrics for both global and species-specific feasibility escape rates, and implement these metrics in dynamic simulations of 98 real-world mutualistic and food web networks, to successfully predict their fragility.
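
To illustrate what estimating a power-law exponent such as $α$ involves, the sketch below draws synthetic abundances from a pure power-law (Pareto) density $p(y) \propto y^{-α}$ for $y \ge y_{min}$ with $α = 2$, then recovers the exponent with the standard maximum-likelihood estimator $\hat{α} = 1 + n / \sum_i \ln(y_i / y_{min})$. The data are synthetic and the choice of estimator is an assumption made here; the abstract does not say how the empirical exponents were fitted.

```python
import math
import random

def sample_power_law(alpha, y_min, n, rng):
    """Inverse-transform samples from p(y) ~ y^(-alpha) on [y_min, inf), alpha > 1."""
    return [y_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def mle_exponent(samples, y_min):
    """Maximum-likelihood estimate of alpha for a continuous power law."""
    log_sum = sum(math.log(y / y_min) for y in samples)
    return 1.0 + len(samples) / log_sum

rng = random.Random(42)
alpha_true, y_min = 2.0, 1.0   # alpha = 2 is the universal value quoted in the abstract
samples = sample_power_law(alpha_true, y_min, 20000, rng)
alpha_hat = mle_exponent(samples, y_min)
print(f"estimated exponent: {alpha_hat:.3f}")
```

With real abundance data the lower cutoff `y_min` is itself unknown and has to be chosen or fitted, which is one reason empirical exponents can differ from the theoretical value.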

2602.18932 2026-02-24 q-bio.MN cond-mat.stat-mech math.DG math.OC physics.chem-ph

Convex Analysis of Relaxation Dynamics in Chemical Reaction Networks and Generalized Gradient Flows

Keisuke Sugie, Dimitri Loutchko, Tetsuya J. Kobayashi

Comments 25 pages, 1 figure

Abstract

We obtain bounds on the Kullback-Leibler divergence to equilibrium for mass-action chemical reaction networks (CRNs) with equilibrium. The associated decay rates are characterized in terms of the singular values of the stoichiometric matrix, convexity parameters, and time-integrated activities via deformed-exponential-type functions. We further extend these bounds within a generalized gradient flow framework. We highlight the biological relevance of this framework: the resulting bounds apply to quasi-steady-state regimes, where long transients and plateau-like behavior are common and functionally important. We illustrate the framework using a catalytic CRN exhibiting plateaus, where the bounds capture slow relaxation induced by local convexity and provide a bound-based approach to quantifying relaxation in CRNs.
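
The quantity being bounded can be illustrated on the simplest possible case. The sketch below integrates a single reversible mass-action reaction A <-> B with forward Euler and tracks the generalized Kullback-Leibler divergence $D(x \| x^{eq}) = \sum_i [x_i \ln(x_i / x_i^{eq}) - x_i + x_i^{eq}]$, which decreases along trajectories of detailed-balanced mass-action systems. The rate constants, initial concentrations, and step size are arbitrary illustrative values, and none of the paper's bounds are reproduced here.

```python
import math

def generalized_kl(x, x_eq):
    """Generalized KL divergence between concentration vectors x and x_eq."""
    total = 0.0
    for xi, ei in zip(x, x_eq):
        if xi > 0.0:
            total += xi * math.log(xi / ei)
        total += ei - xi
    return total

# Hypothetical reaction A <-> B with mass-action rates.
k_forward, k_reverse = 2.0, 1.0
a, b = 2.5, 0.5
mass = a + b  # conserved by the reaction
x_eq = (mass * k_reverse / (k_forward + k_reverse),
        mass * k_forward / (k_forward + k_reverse))

dt = 0.01
kl_trace = [generalized_kl((a, b), x_eq)]
for _ in range(500):
    flux = k_forward * a - k_reverse * b  # net rate of A -> B
    a -= dt * flux
    b += dt * flux
    kl_trace.append(generalized_kl((a, b), x_eq))

print(f"initial divergence: {kl_trace[0]:.4f}")
print(f"final divergence:   {kl_trace[-1]:.2e}")
```

For this linear example the decay is a clean exponential; the plateau-like transients discussed in the abstract require nonlinear networks such as the catalytic CRN the authors study.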

2602.18915 2026-02-24 q-bio.QM cs.AI cs.CL cs.LG

AAVGen: Precision Engineering of Adeno-associated Viral Capsids for Renal Selective Targeting

Mohammadreza Ghaffarzadeh-Esfahani, Yousof Gheisari

Comments 22 pages, 6 figures, and 5 supplementary files. Corresponding author: ygheisari@med.mui.ac.ir, Kaggle notebook is available at https://www.kaggle.com/code/mohammadgh009/aavgen

Abstract

Adeno-associated viruses (AAVs) are promising vectors for gene therapy, but their native serotypes face limitations in tissue tropism, immune evasion, and production efficiency. Engineering capsids to overcome these hurdles is challenging due to the vast sequence space and the difficulty of simultaneously optimizing multiple functional properties. The challenge is greater still for the kidney, which presents unique anatomical barriers and cellular targets that require precise and efficient vector engineering. Here, we present AAVGen, a generative artificial intelligence framework for de novo design of AAV capsids with enhanced multi-trait profiles. AAVGen integrates a protein language model (PLM) with supervised fine-tuning (SFT) and a reinforcement learning technique termed Group Sequence Policy Optimization (GSPO). The model is guided by a composite reward signal derived from three ESM-2-based regression predictors, each trained to predict a key property: production fitness, kidney tropism, and thermostability. Our results demonstrate that AAVGen produces a diverse library of novel VP1 protein sequences. In silico validations revealed that the majority of the generated variants have superior performance across all three employed indices, indicating successful multi-objective optimization. Furthermore, structural analysis via AlphaFold3 confirms that the generated sequences preserve the canonical capsid folding despite sequence diversification. AAVGen establishes a foundation for data-driven viral vector engineering, accelerating the development of next-generation AAV vectors with tailored functional characteristics.

2602.18909 2026-02-24 cond-mat.soft physics.bio-ph q-bio.CB

Geometric Limits of Mitotic Pressure Under Confinement

Amit Singh Vishen

Comments 7 pages, 3 figures

Abstract

Cells often divide under mechanical confinement, where surrounding structures restrict shape changes during cytokinesis. Although forces generated during confined division have been measured experimentally, it remains unclear how confinement geometry and mechanics determine the transmitted force. Here we develop a minimal mechanical theory of cell division under confinement. Modeling the cell as an incompressible volume bounded by an interface with effective isotropic tension, we show that confinement restricts the set of mechanically admissible furrow shapes. As the furrow radius decreases, it reaches a confinement-induced minimum. Beyond this point, further ingression does not alter the interface shape, and both pressure and axial force saturate. We analyze force and pressure in rigid, soft, and strong three-dimensional confinement and demonstrate that a single geometric mechanism underlies these distinct cases. After rescaling force and length by the appropriate geometric scale, cells of different size and surface tension collapse onto a single universal curve. The relevant length scale is the cell size for rigid and soft confinement, and the confinement size in fully enclosing three-dimensional confinement. In soft confinement, environmental stiffness and spindle-generated axial forces determine the operating force and pressure, while the geometric constraint fixes the maximal attainable levels. In summary, our results show that mitotic force transmission and mitotic pressure during cytokinesis are bounded by confinement geometry, with material properties and active forces selecting the operating point within these geometry-imposed limits.

2602.18854 2026-02-24 q-bio.MN

Modeling Dynamics, Cell Type Specificity, and Perturbations in Gene Regulatory Networks

Junha Shin, Spencer Halberg-Spencer, Yuda Liu, Suvojit Hazra, Erika Da-Inn Lee, Sushmita Roy

Comments 30 pages, 4 figures, This article is scheduled to appear in the Annual Review of Genomics and Human Genetics

Abstract

Gene regulatory networks (GRNs) define the regulatory relationships among molecules such as transcription factors, chromatin remodelers, and target genes. GRNs play a critical role in diverse biological processes, including development, disease manifestation, and evolution. However, fully characterizing these networks across multiple cell types and states remains a significant challenge. Recent advances in single-cell omics have dramatically enhanced our ability to measure biological systems at unprecedented resolution. These technologies have opened new avenues for computational methods to infer GRNs, offering deeper insights into cell type-specific mechanisms, causality, and dynamic regulatory processes. This review summarizes the current state of GRN inference from single-cell omics datasets, with a particular focus on dynamics and perturbations, and outlines key open challenges that must be addressed to advance the field.

2602.18787 2026-02-24 q-bio.NC

From Modules to Movement: Deconstructing the Modular Architecture of the Motor System

Alessandro Salatiello

Abstract

Coordinating multi-articulated bodies to generate purposeful movement is a formidable computational challenge. Yet the human motor system performs this task robustly in dynamic, uncertain environments, despite noisy and delayed feedback, slow actuators, and strict energetic constraints. A central question is what organizational principles underlie this efficiency. One widely recognized principle of neural organization is modularity, which enables complex problems to be decomposed into simpler subproblems that specialized modules are optimized to solve. In this review, we argue that modularity is a fundamental organizing principle of the motor system. We first summarize evidence for brain modularity, ranging from classical lesion studies to contemporary graph-theoretical analyses. We next discuss the main factors underlying the emergence and evolutionary selection of modular architectures, highlighting the computational advantages they provide. We then review the major neuroanatomical modules that structure current descriptions of the motor system and compare three prominent computational frameworks of motor control (optimal feedback control theory, muscle synergy theory, and dynamical systems approaches), showing that all implicitly or explicitly rely on specialized computational modules. We conclude by contrasting the key strengths and limitations of existing frameworks and by proposing promising directions toward more comprehensive theories.

2602.18727 2026-02-24 stat.AP q-bio.QM

Statistical methods for reference-free single-molecule localisation microscopy

Jack Peyton, Benjamin Davis, Emily Gribbin, Daniel Rolfe, Hannah Mitchell

Abstract

MINFLUX (Minimal Photon Flux) is a single-molecule imaging technique capable of resolving fluorophores at a precision of <5 nm. Interpretation of the point patterns generated by this technique presents challenges due to variable emitter density, incomplete bio-labelling of target molecules and their detection, error-prone measurement processes, and the presence of spurious (non-structure associated) fluorescent detections. Together, these challenges ensure structural inferences from single-molecule imaging datasets are non-trivial in the absence of strong a priori information, for all but the smallest of point patterns. In addition, current methods often require subjective parameter tuning and presuppose known structural templates, limiting reference-free discovery. We present a statistically grounded, end-to-end analysis framework. Focusing on MINFLUX-derived datasets and leveraging Bayesian and spatial statistical methods, a pipeline is presented that demonstrates 1) uncertainty-aware clustering of measurements into emitter groups that performs better than current gold standards, 2) rapid identification of molecular structure supergroups, and 3) reconstruction of repeating structures within the dataset without substantial prior knowledge. This pipeline is demonstrated using simulated and real MINFLUX datasets, where emitter clustering and centre detection maintain high performance (emitter subset assignment accuracy > 0.75) across all conditions evaluated, while structural inference achieves reliable discrimination (F1 approx. 0.9) at high labelling efficiency. Template-free reconstruction of Nup96 and DNA-Origami 3x3 grids is achieved.

2602.18715 2026-02-24 q-bio.NC cs.LG

A Data-Driven Method to Map the Functional Organisation of Human Brain White Matter

Yifei Sun, James M. Shine, Robert D. Sanders, Robin F. H. Cash, Sharon L. Naismith, Fernando Calamante, Jinglei Lv

Comments 19 pages, 4 figures, journal paper under review

Abstract

The white matter of the brain is organised into axonal bundles that support long-range neural communication. Although diffusion MRI (dMRI) enables detailed mapping of these pathways through tractography, how white matter pathways directly facilitate large-scale neural synchronisation remains poorly understood. We developed a data-driven framework that integrates dMRI and functional MRI (fMRI) to model the dynamic coupling supported by white matter tracks. Specifically, we employed track dynamic functional connectivity (Track-DFC) to characterise functional coupling of remote grey matter connected by individual white matter tracks. Using independent component analysis followed by k-medoids clustering, we derived functionally-coherent clusters of white matter tracks from the Human Connectome Project young adult cohort. When applied to the HCP ageing cohort, these clusters exhibited widespread age-related declines in both functional coupling strength and temporal variability. Importantly, specific clusters encompassing pathways linking control, default mode, attention, and visual systems significantly mediated the relationship between age and cognitive performance. Together, these findings depict the functional organisation of white matter tracks and provide a powerful tool to study brain ageing and cognitive decline.

2602.18643 2026-02-24 q-bio.QM

Project Hermes: A Model-Agnostic Validation Layer for Wearable Health Prediction Systems

Richik Chakraborty

Abstract

The deployment of wearable-based health prediction systems has accelerated rapidly, yet these systems face a fundamental challenge: they generate alerts under substantial uncertainty without principled mechanisms for user-specific validation. While large language models (LLMs) have been increasingly applied to healthcare tasks, existing work focuses predominantly on diagnosis generation and risk prediction rather than post-prediction validation of detected signals. We introduce Project Hermes, a model-agnostic validation layer that treats signal confirmation as a sequential decision problem. Hermes operates downstream of arbitrary upstream predictors, using LLM-generated contextual queries to elicit targeted user feedback and performing Bayesian confidence updates to distinguish true positives from false alarms. In a 60-day longitudinal case study of migraine prediction, Hermes achieved a 34% reduction in false positive rate (from 61.7% to 12.5%) while maintaining 89% sensitivity, with mean lead time of 4.2 hours before symptom onset. Critically, Hermes does not perform diagnosis or make novel predictions; it validates whether signals detected by upstream models are clinically meaningful for specific individuals at specific times. This work establishes validation as a first-class computational problem distinct from prediction, with implications for trustworthy deployment of consumer health AI systems.
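The Bayesian confidence update mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give Hermes's actual model, so everything here is an assumption: the function names `update_confidence` and `validate_alert`, the per-question likelihoods, and the 0.5 decision threshold are all hypothetical. The sketch only shows how yes/no answers to contextual queries could move the probability that an upstream alert is a true positive.

```python
def update_confidence(prior, p_yes_given_true, p_yes_given_false, answered_yes):
    """Bayes' rule for one yes/no answer.

    prior: current probability that the alert is a true positive.
    p_yes_given_true / p_yes_given_false: assumed chance of a "yes" answer
    when the alert is real vs. a false alarm (hypothetical values).
    """
    if answered_yes:
        like_true, like_false = p_yes_given_true, p_yes_given_false
    else:
        like_true, like_false = 1.0 - p_yes_given_true, 1.0 - p_yes_given_false
    evidence = like_true * prior + like_false * (1.0 - prior)
    return like_true * prior / evidence


def validate_alert(prior, answers, threshold=0.5):
    """Apply the update sequentially; answers are (p_true, p_false, answered_yes)."""
    belief = prior
    for p_true, p_false, answered_yes in answers:
        belief = update_confidence(belief, p_true, p_false, answered_yes)
    # The threshold is an illustrative choice, not a value from the paper.
    return belief, belief >= threshold
```

For example, `validate_alert(0.4, [(0.8, 0.2, True), (0.7, 0.3, True)])` raises the belief above the threshold after two confirming answers, while a "no" answer to the same questions would lower it.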

2602.18637 2026-02-24 cs.LG q-bio.NC

Online decoding of rat self-paced locomotion speed from EEG using recurrent neural networks

Alejandro de Miguel, Nelson Totah, Uri Maoz

Comments 17 pages, 1 table and 7 figures

Abstract

Objective. Accurate neural decoding of locomotion holds promise for advancing rehabilitation, prosthetic control, and understanding neural correlates of action. Recent studies have demonstrated decoding of locomotion kinematics across species on motorized treadmills. However, efforts to decode locomotion speed in more natural contexts, where pace is self-selected rather than externally imposed, are scarce, generally achieve only modest accuracy, and require intracranial implants. Here, we aim to decode self-paced locomotion speed non-invasively and continuously using cortex-wide EEG recordings from rats. Approach. We introduce an asynchronous brain-computer interface (BCI) that processes a stream of 32-electrode skull-surface EEG (0.01-45 Hz) to decode instantaneous speed from a non-motorized treadmill during self-paced locomotion in head-fixed rats. Using recurrent neural networks and a dataset of over 133 h of recordings, we trained decoders to map ongoing EEG activity to treadmill speed. Main results. Our decoding achieves a correlation of 0.88 ($R^2$ = 0.78) for speed, primarily driven by visual cortex electrodes and low-frequency ($< 8$ Hz) oscillations. Moreover, pre-training on a single session permitted decoding on other sessions from the same rat, suggesting uniform neural signatures that generalize across sessions but fail to transfer across animals. Finally, we found that cortical states not only carry information about current speed, but also about future and past dynamics, extending up to 1000 ms. Significance. These findings demonstrate that self-paced locomotion speed can be decoded accurately and continuously from non-invasive, cortex-wide EEG. Our approach provides a framework for developing high-performing, non-invasive BCI systems and contributes to understanding distributed neural representations of action dynamics.

2602.18507 2026-02-24 cs.NE q-bio.NC

Fine-Pruning: A Biologically Inspired Algorithm for Personalization of Machine Learning Models

Joseph Bingham, Saman Zonouz, Dvir Aran

Comments 20 pages, 7 figures, accepted to Cell: Patterns

Journal ref
Patterns (New York, N.Y.), vol. 6, no. 5, Elsevier BV, May 2025, p. 101242
Abstract

Neural networks have long strived to emulate the learning capabilities of the human brain. While deep neural networks (DNNs) draw inspiration from the brain in neuron design, their training methods diverge from biological foundations. Backpropagation, the primary training method for DNNs, requires substantial computational resources and fully labeled datasets, presenting major bottlenecks in development and application. This work demonstrates that by returning to biomimicry, specifically mimicking how the brain learns through pruning, we can solve various classical machine learning problems while utilizing orders of magnitude fewer computational resources and no labels. Our experiments successfully personalized multiple speech recognition and image classification models, including ResNet50 on ImageNet, resulting in increased sparsity of approximately 70% while simultaneously improving model accuracy to around 90%, all without the limitations of backpropagation. This biologically inspired approach offers a promising avenue for efficient, personalized machine learning models in resource-constrained environments.

2602.18490 2026-02-24 astro-ph.EP physics.chem-ph q-bio.BM

Distinguishing life from non-life via molecular frontier orbital energy gaps

José L. Ramírez-Colón, Ziqin Ni, Christopher E. Carr

Comments Keywords: Life detection, Biosignatures, Amino acids, Frontier Orbitals, HOMO-LUMO Gap, Agnostic. Code available at: https://github.com/jlramirezcolon/hlg-life-detection. AGPLv3 License

Abstract

Amino acids (AAs) are a key target in the search for life beyond Earth due to their extensive role in the machinery of all known life, persistence over geologic timescales, and analytical detectability. However, AAs can also arise from abiotic processes on planets and in space. For example, material from asteroid Bennu contained 33 AAs, including 15 of the 20 proteinogenic AAs that are fundamental to life's functions. Distinguishing life from non-life based on AAs in a sample remains an unsolved problem, particularly when their isotopic and structural signatures (e.g., chirality) could be altered via physicochemical processes. Here we introduce LUMOS (Life Unveiled via Molecular Orbital Signatures), a statistical framework that distinguishes life from non-life by analyzing the distribution of abundance-weighted HOMO-LUMO gap (HLG) values of AAs within a sample. Compilation of AA datasets from diverse environments and provenances revealed that abiotic samples display highly uniform distributions of AA HLGs. In contrast, biotic samples show greater variance and preference towards AAs with lower HLG, likely reflecting the need for life to control when, where, and how chemical reactions occur. LUMOS achieves >95% accuracy in distinguishing biotic versus abiotic provenance across diverse environmental and extraterrestrial conditions. These results suggest that varied molecular reactivity within biochemical systems may be a universal feature of life, representing an agnostic biosignature unlinked to the specific set of AAs used by life as we know it. LUMOS is compatible with existing analytical instrumentation, applicable to returned samples or in situ analyses. Broader characterization of abiotic and biotic environments will further refine the chemical boundaries separating biotic from abiotic chemical systems.
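To make "abundance-weighted HLG distribution" concrete, here is a small sketch of one way to summarize such a distribution by its weighted mean and variance. This is not the LUMOS statistic itself, which the abstract does not specify; the function name `weighted_hlg_stats` is hypothetical, and the caller supplies the abundances and gap values.

```python
def weighted_hlg_stats(abundances, hlg_values):
    """Abundance-weighted mean and variance of HOMO-LUMO gap values.

    abundances: relative or absolute amounts of each amino acid in a sample.
    hlg_values: the HOMO-LUMO gap of each amino acid, in the same order.
    Both inputs are supplied by the caller; no real measurements are embedded.
    """
    total = sum(abundances)
    weights = [a / total for a in abundances]
    mean = sum(w * g for w, g in zip(weights, hlg_values))
    variance = sum(w * (g - mean) ** 2 for w, g in zip(weights, hlg_values))
    return mean, variance
```

Under the abstract's description, a sample with a lower weighted mean and a larger weighted variance would look more "biotic" than one with tightly clustered gaps, although the actual decision rule used by LUMOS may differ.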

2602.18476 2026-02-24 q-bio.BM cs.AI cs.LG

BioLM-Score: Language-Prior Conditioned Probabilistic Geometric Potentials for Protein-Ligand Scoring

Zhangfan Yang, Baoyun Chen, Dong Xu, Jia Wang, Ruibin Bai, Junkai Ji, Zexuan Zhu

Comments 9 pages, 2 figures

Abstract

Protein-ligand scoring is a central component of structure-based drug design, underpinning molecular docking, virtual screening, and pose optimization. Conventional physics-based energy functions are often computationally expensive, limiting their utility in large-scale screening. In contrast, deep learning-based scoring models offer improved computational efficiency but frequently suffer from limited cross-target generalization and poor interpretability, which restrict their practical applicability. Here we present BioLM-Score, a simple yet generalizable protein-ligand scoring model that couples geometric modeling with representation learning. Specifically, it employs modality-specific and structure-aware encoders for proteins and ligands, each augmented with biomolecular language models to enrich structural and chemical representations. Subsequently, these representations are integrated through a mixture density network to predict multimodal interatomic distance distributions, from which statistically grounded likelihood-based scores are derived. Evaluations on the CASF-2016 benchmark demonstrate that BioLM-Score achieves significant improvements across docking, scoring, ranking, and screening tasks. Moreover, the proposed scoring function serves as an effective optimization objective for guiding docking protocols and conformational search. In summary, BioLM-Score provides a principled and practical alternative to existing scoring functions, combining efficiency, generalization, and interpretability for structure-based drug discovery.
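The idea of deriving a likelihood-based score from predicted distance distributions can be sketched as follows. This assumes, purely for illustration, that each protein-ligand atom pair is described by a one-dimensional Gaussian mixture (weights, means, standard deviations) and that a pose is scored by summing log-likelihoods of its observed distances. The function names are hypothetical, and the network that would predict the mixture parameters is not shown.

```python
import math


def mixture_log_density(d, weights, means, sigmas):
    """Log-density of distance d under a 1-D Gaussian mixture (log-sum-exp form)."""
    terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * ((d - m) / s) ** 2
        for w, m, s in zip(weights, means, sigmas)
    ]
    top = max(terms)  # subtract the largest term for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def pose_score(distances, mixtures):
    """Sum of per-pair log-likelihoods; a higher score means a more plausible pose.

    mixtures: one (weights, means, sigmas) tuple per atom pair, assumed to come
    from a trained mixture density network.
    """
    return sum(mixture_log_density(d, *mix) for d, mix in zip(distances, mixtures))
```

A pose whose interatomic distances sit near the mixture modes receives a higher score than one whose distances fall in the tails, which is what makes the score usable as an optimization objective for docking.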

2602.18472 2026-02-24 cs.LG cs.AI q-bio.QM

Physiologically Informed Deep Learning: A Multi-Scale Framework for Next-Generation PBPK Modeling

Shunqi Liu, Han Qiu, Tong Wang

Abstract

Physiologically Based Pharmacokinetic (PBPK) modeling is a cornerstone of model-informed drug development (MIDD), providing a mechanistic framework to predict drug absorption, distribution, metabolism, and excretion (ADME). Despite its utility, adoption is hindered by high computational costs for large-scale simulations, difficulty in parameter identification for complex biological systems, and uncertainty in interspecies extrapolation. In this work, we propose a unified Scientific Machine Learning (SciML) framework that bridges mechanistic rigor and data-driven flexibility. We introduce three contributions: (1) Foundation PBPK Transformers, which treat pharmacokinetic forecasting as a sequence modeling task; (2) Physiologically Constrained Diffusion Models (PCDM), a generative approach that uses a physics-informed loss to synthesize biologically compliant virtual patient populations; and (3) Neural Allometry, a hybrid architecture combining Graph Neural Networks (GNNs) with Neural ODEs to learn continuous cross-species scaling laws. Experiments on synthetic datasets show that the framework reduces physiological violation rates from 2.00% to 0.50% under constraints while offering a path to faster simulation.

2601.11636 2026-02-24 math.DS cs.NA math.NA q-bio.PE

Qualitative analysis and numerical investigations of time-fractional Zika virus model arising in population dynamics

Gaurav Saini, Bappa Ghosh, Sunita Chand

Comments A significant error was identified in the model analysis that affects the validity of the results. The authors therefore withdraw the paper

Abstract

Epidemic models play a crucial role in population dynamics, offering valuable insights into disease transmission while aiding in epidemic prediction and control. In this paper, we analyze the mathematical model of the time-fractional Zika virus transmission for human and mosquito populations. The fractional derivative is considered in the Caputo sense of order $\alpha \in (0,1)$. We begin by conducting a qualitative analysis using the stability theory of differential equations. The existence and uniqueness of the solution are established, and the model's stability is examined through Hyers-Ulam stability analysis. Furthermore, an efficient difference scheme utilizing the standard L1 technique is developed to simulate the model and analyze the solution's behavior under key parameters. The resulting nonlinear algebraic system is solved using the Newton-Raphson method. Finally, illustrative examples are presented to validate the theoretical findings. Graphical results indicate that the fractional model provides deeper insights and a better understanding of disease dynamics. These findings aid in controlling the virus through contact precautions and recommended therapies while also helping to predict its future spread.
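The combination of the L1 discretization with Newton-Raphson can be illustrated on a scalar test problem rather than the Zika model, whose equations are not given in the abstract. The sketch below solves the Caputo equation $D^\alpha u = f(u)$ with $\alpha \in (0,1)$ using the standard L1 weights $b_k = (k+1)^{1-\alpha} - k^{1-\alpha}$ and an implicit step solved by Newton iteration. The function name, signature, and tolerances are illustrative assumptions.

```python
import math


def l1_caputo_solve(f, dfdu, u0, alpha, tau, steps, tol=1e-12, max_iter=50):
    """Implicit L1 scheme for D^alpha u = f(u), 0 < alpha < 1, on a uniform grid.

    f, dfdu: right-hand side and its derivative (needed for Newton-Raphson).
    tau: time step; steps: number of time steps. Returns [u_0, ..., u_steps].
    """
    c = tau ** (-alpha) / math.gamma(2.0 - alpha)
    # L1 weights: b_0 = 1 and b_k decreases with k.
    b = [(k + 1) ** (1.0 - alpha) - k ** (1.0 - alpha) for k in range(steps)]
    u = [u0]
    for n in range(1, steps + 1):
        # History term: uses only values that are already known.
        hist = sum(b[k] * (u[n - k] - u[n - k - 1]) for k in range(1, n))
        x = u[n - 1]  # initial Newton guess: previous time level
        for _ in range(max_iter):
            g = c * (x - u[n - 1] + hist) - f(x)  # residual of the implicit step
            step = g / (c - dfdu(x))
            x -= step
            if abs(step) < tol:
                break
        u.append(x)
    return u
```

For the linear relaxation problem `f = lambda u: -u`, `dfdu = lambda u: -1.0`, the first step reduces to `u_1 = c / (c + 1) * u_0`, which gives a quick sanity check before applying the same loop to a nonlinear system, where `x` would become a vector and `dfdu` a Jacobian.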