arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.16696 2026-02-19 q-bio.GN cs.LG q-bio.QM

Parameter-free representations outperform single-cell foundation models on downstream benchmarks

Huan Souza, Pankaj Mehta

Abstract

Single-cell RNA sequencing (scRNA-seq) data exhibit strong and reproducible statistical structure. This has motivated the development of large-scale foundation models, such as TranscriptFormer, that use transformer-based architectures to learn a generative model for gene expression by embedding genes into a latent vector space. These embeddings have been used to obtain state-of-the-art (SOTA) performance on downstream tasks such as cell-type classification, disease-state prediction, and cross-species learning. Here, we ask whether similar performance can be achieved without utilizing computationally intensive deep learning-based representations. Using simple, interpretable pipelines that rely on careful normalization and linear methods, we obtain SOTA or near-SOTA performance across multiple benchmarks commonly used to evaluate single-cell foundation models, including outperforming foundation models on out-of-distribution tasks involving novel cell types and organisms absent from the training data. Our findings highlight the need for rigorous benchmarking and suggest that the biology of cell identity can be captured by simple linear representations of single-cell gene expression data.
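
To make "careful normalization and linear methods" concrete, here is a minimal sketch of one such pipeline: library-size normalization with a log transform, followed by a nearest-centroid classifier, which has linear decision boundaries. This is an illustration only, not the authors' pipeline. The abstract does not specify their steps, and the function names, the `target_sum` value, and the toy data are all assumptions.

```python
import math
from collections import defaultdict


def log_normalize(counts, target_sum=1e4):
    """Scale one cell's counts to a fixed library size, then apply log1p.

    target_sum=1e4 is a common convention, assumed here for illustration.
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.log1p(c * target_sum / total) for c in counts]


def fit_centroids(cells, labels):
    """Average the normalized expression profiles of each labelled cell type."""
    grouped = defaultdict(list)
    for counts, label in zip(cells, labels):
        grouped[label].append(log_normalize(counts))
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, counts):
    """Return the cell type whose centroid is closest in Euclidean distance."""
    x = log_normalize(counts)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))

    return min(centroids, key=squared_distance)


# Toy data: 3 genes, two hypothetical cell types "T" and "B".
train_cells = [[90, 5, 5], [80, 10, 10], [5, 90, 5], [10, 85, 5]]
train_labels = ["T", "T", "B", "B"]
centroids = fit_centroids(train_cells, train_labels)
prediction = predict(centroids, [40, 3, 2])  # a shallowly sequenced query cell
```

The query cell has far fewer total counts than the training cells, but after normalization its profile sits closest to the "T" centroid. This is the kind of effect that normalization is meant to provide.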

2602.16633 2026-02-19 q-bio.PE

Behavioral change models for infectious disease transmission: a systematic review (2020-2025)

Youngji Jo, Sileshi Sintayehu Sharbayta, Bruno Buonomo

Comments 23 pages, 3 tables, 3 figures

Abstract

Background: Human behavior shapes infectious disease dynamics, yet its integration into transmission models remains fragmented. Recent epidemics, particularly COVID-19, highlight the need for models capturing adaptation to perceived risk, social influence, and policy signals. This review synthesizes post-2020 models incorporating behavioral adaptation, examines their theoretical grounding, and evaluates how behavioral constructs modify transmission, vaccination, and compliance. Methods: Following PRISMA guidelines, we searched Scopus and PubMed (2020-2025), screening 1,274 records with citation chaining. We extracted data on disease context, country, modeling framework, behavioral mechanisms (prevalence-dependent, policy/media, imitation/social learning), and psychosocial constructs (personal threat, coping appraisal, barriers, social norms, cues to action). A total of 216 studies met inclusion criteria. Results: COVID-19 accounted for 73% of studies. Most used compartmental ODE models (81%) and focused on theoretical or U.S. settings. Behavioral change was mainly reactive: 47% applied prevalence-dependent feedback, 25% included awareness/media dynamics, and 19% relied on exogenous policy triggers. Game-theoretic or social learning approaches were rare (at most 5%). Behavioral effects primarily modified contact or transmission rates (91%). Psychosocial constructs were unevenly represented: cues to action (n=159) and personal threat (n=145) dominated, whereas coping appraisal (n=82), barriers (n=36), and social norms (n=25) were less common. Conclusions: We propose a taxonomy structured by behavioral drivers, social scale, and memory to clarify dominant paradigms and their empirical basis. Mapping models to psychosocial constructs provides guidance for more theory-informed and data-grounded integration of behavioral processes in epidemiological modeling.

2602.16626 2026-02-19 cs.LG cs.AI q-bio.NC

A Systematic Evaluation of Sample-Level Tokenization Strategies for MEG Foundation Models

SungJun Cho, Chetan Gohil, Rukuang Huang, Oiwi Parker Jones, Mark W. Woolrich

Comments 15 pages, 10 figures, 1 table

Abstract

Recent success in natural language processing has motivated growing interest in large-scale foundation models for neuroimaging data. Such models often require discretization of continuous neural time series data, a process referred to as 'tokenization'. However, the impact of different tokenization strategies for neural data is currently poorly understood. In this work, we present a systematic evaluation of sample-level tokenization strategies for transformer-based large neuroimaging models (LNMs) applied to magnetoencephalography (MEG) data. We compare learnable and non-learnable tokenizers by examining their signal reconstruction fidelity and their impact on subsequent foundation modeling performance (token prediction, biological plausibility of generated data, preservation of subject-specific information, and performance on downstream tasks). For the learnable tokenizer, we introduce a novel approach based on an autoencoder. Experiments were conducted on three publicly available MEG datasets spanning different acquisition sites, scanners, and experimental paradigms. Our results show that both learnable and non-learnable discretization schemes achieve high reconstruction accuracy and broadly comparable performance across most evaluation criteria, suggesting that simple fixed sample-level tokenization strategies can be used in the development of neural foundation models. The code is available at https://github.com/OHBA-analysis/Cho2026_Tokenizer.
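
As a rough illustration of a non-learnable, sample-level tokenizer, the sketch below maps each sample of a continuous signal to one of a fixed number of equal-width amplitude bins and reconstructs it from the bin centres. Uniform binning is an assumption made for this example; the abstract does not say which fixed schemes the paper evaluates, and all names here are hypothetical.

```python
def fit_uniform_tokenizer(signal, n_tokens):
    """Derive equal-width amplitude bins from the range of the signal."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_tokens
    if width == 0:  # constant signal: fall back to an arbitrary bin width
        width = 1.0
    return lo, width


def encode(signal, lo, width, n_tokens):
    """Map every sample to an integer token in [0, n_tokens - 1]."""
    tokens = []
    for x in signal:
        index = int((x - lo) / width)
        tokens.append(max(0, min(index, n_tokens - 1)))  # clamp out-of-range samples
    return tokens


def decode(tokens, lo, width):
    """Reconstruct each sample as the centre of its bin."""
    return [lo + (t + 0.5) * width for t in tokens]


signal = [-1.0, -0.5, 0.0, 0.5, 1.0]
n_tokens = 4
lo, width = fit_uniform_tokenizer(signal, n_tokens)
tokens = encode(signal, lo, width, n_tokens)
reconstructed = decode(tokens, lo, width)
```

For samples inside the fitted range, the reconstruction error of this scheme is bounded by half the bin width, so fidelity is controlled directly by the vocabulary size.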

2602.16584 2026-02-19 q-bio.NC

The Representational Alignment Hypothesis: Evidence for and Consequences of Invariant Semantic Structure Across Embedding Modalities

Akhil Ramidi, Kevin Scharp

Comments 23 pages, 3 figures

Abstract

There is growing evidence that independently trained AI systems come to represent the world in the same way. In other words, independently trained embeddings from text, vision, audio, and neural signals share an underlying geometry. We call this the Representational Alignment Hypothesis (RAH) and investigate evidence for and consequences of this claim. The evidence is of two kinds: (i) internal structure comparison techniques, such as representational similarity analysis and topological data analysis, reveal matching relational patterns across modalities without explicit mapping; and (ii) methods based on cross-modal embedding alignment, which learn mappings between representation spaces, show that simple linear transformations can bring different embedding spaces into close correspondence, suggesting near-isomorphism. Taken together, the evidence suggests that, even after controlling for trivial commonalities inherent in standard data preprocessing and embedding procedures, a robust structural correspondence persists, hinting at an underlying organizational principle. Some have argued that this result shows that the shared structure is getting at a fundamental, Platonic level of reality. We argue that this conclusion is unjustified. Moreover, we aim to give the idea an alternative philosophical home, rooted in contemporary metasemantics (i.e., theories of what makes a representation and what makes something meaningful) and responses to the symbol grounding problem. We conclude by considering the scope of the RAH and proposing new ways of distinguishing semantic structures that are genuinely invariant from those that inevitably arise due to the fact that all our data is generated under human-specific conditions on Earth.

2602.16504 2026-02-19 q-bio.QM

GRIMM: Genetic stRatification for Inference in Molecular Modeling

Ashley Babjac, Adrienne Hoarfrost

Comments 9 pages, 1 figure, 2 tables, submitted to ISMB main conference proceedings 2026

Abstract

The vast majority of biological sequences encode unknown functions and bear little resemblance to experimentally characterized proteins, limiting both our understanding of biology and our ability to harness functional potential for the bioeconomy. Predicting enzyme function from sequence remains a central challenge in computational biology, complicated by low sequence diversity and imbalanced label support in publicly available datasets. Models trained on these data can overestimate performance and fail to generalize. To address this, we introduce GRIMM (Genetic stRatification for Inference in Molecular Modeling), a benchmark for enzyme function prediction that employs genetic stratification: sequences are clustered by similarity and clusters are assigned exclusively to training, validation, or test sets. This ensures that sequences from the same cluster do not appear in multiple partitions. GRIMM produces multiple test sets: a closed-set test with the same label distribution as training (Test-1) and an open-set test containing novel labels (Test-2), serving as a realistic out-of-distribution proxy for discovering novel enzyme functions. While demonstrated on enzymes, this approach is generalizable to any sequence-based classification task where inputs can be clustered by similarity. By formalizing a splitting strategy often used implicitly, GRIMM provides a unified and reproducible framework for closed- and open-set evaluation. The method is lightweight, requiring only sequence clustering and label annotations, and can be adapted to different similarity thresholds, data scales, and biological tasks. GRIMM enables more realistic evaluation of functional prediction models on both familiar and unseen classes and establishes a benchmark that more faithfully assesses model performance and generalizability.
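
The core splitting rule described above, where every similarity cluster is assigned to exactly one partition, can be sketched as follows. This is a simplified illustration, not the GRIMM implementation. It assumes cluster assignments have already been computed by some external sequence-clustering tool, and the function name, split fractions, and toy identifiers are hypothetical.

```python
import random


def cluster_exclusive_split(cluster_of, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole clusters to train/val/test so no cluster spans two splits.

    cluster_of: dict mapping sequence id -> cluster id (precomputed elsewhere).
    fractions:  approximate share of clusters (not sequences) per split.
    """
    clusters = sorted(set(cluster_of.values()))
    random.Random(seed).shuffle(clusters)

    n_train = int(fractions[0] * len(clusters))
    n_val = int(fractions[1] * len(clusters))

    split_of_cluster = {}
    for i, cluster in enumerate(clusters):
        if i < n_train:
            split_of_cluster[cluster] = "train"
        elif i < n_train + n_val:
            split_of_cluster[cluster] = "val"
        else:
            split_of_cluster[cluster] = "test"

    splits = {"train": [], "val": [], "test": []}
    for seq_id, cluster in cluster_of.items():
        splits[split_of_cluster[cluster]].append(seq_id)
    return splits


# Toy input: 30 sequences in 10 clusters of 3.
cluster_of = {f"seq{i}": f"c{i // 3}" for i in range(30)}
splits = cluster_exclusive_split(cluster_of)
```

Because the shuffle operates on cluster ids rather than on individual sequences, near-duplicate sequences cannot leak from training into evaluation. The sketch does not attempt the separation into closed-set and open-set test sets.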

2602.16447 2026-02-19 q-bio.PE cond-mat.stat-mech

Evolutionary Advantage of Diversity-Generating Retroelements in Switching Environments

Léo Régnier, Paul Rochette, Raphaël Laurenceau, David Bikard, Simona Cocco, Rémi Monasson

Comments Main text: 6 pages, 3 figures. Supplementary Materials: 22 pages, 14 figures

Abstract

Diversity-Generating Retroelements (DGRs) create rapid, targeted variation within specific genomic regions in phages and bacteria. They operate through stochastic retro-transcription of a template region (TR) into a variable region (VR), which typically encodes ligand-binding proteins. Despite their prevalence, the evolutionary conditions that maintain such hypermutating systems remain unclear. Here we introduce a two-timescale framework separating fast VR diversification from slow TR evolution, allowing the dynamics of DGR-controlled loci to be analytically understood from the TR design point of view. We quantify the fitness gain that the DGR diversification mechanism provides over standard mutagenesis in the presence of environmental switching. Our framework accounts for observed patterns of DGR activity in human-gut Bacteroides and clarifies when constitutive DGR activation is evolutionarily favored.

2602.00663 2026-02-19 cs.AI cs.LG q-bio.BM

SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent

Fabian P. Krüger, Andrea Hunklinger, Adrian Wolny, Tim J. Adler, Igor Tetko, Santiago David Villalba

Comments Fabian P. Krüger and Andrea Hunklinger contributed equally to this work

Abstract

Optimizing the structure of molecules to achieve desired properties is a central bottleneck across the chemical sciences, particularly in the pharmaceutical industry where it underlies the discovery of new drugs. Since molecular property evaluation often relies on costly and rate-limited oracles, such as experimental assays, molecular optimization must be highly sample-efficient. To address this, we introduce SEISMO, an LLM agent that performs strictly online, inference-time molecular optimization, updating after every oracle call without the need for population-based or batched learning. SEISMO conditions each proposal on the full optimization trajectory, combining natural-language task descriptions with scalar scores and, when available, structured explanatory feedback. Across the Practical Molecular Optimization benchmark of 23 tasks, SEISMO achieves a 2-3 times higher area under the optimisation curve than prior methods, often reaching near-maximal task scores within 50 oracle calls. Our additional medicinal-chemistry tasks show that providing explanatory feedback further improves efficiency, demonstrating that leveraging domain knowledge and structured information is key to sample-efficient molecular optimization.

2601.16689 2026-02-19 q-bio.NC eess.SP

Agonist-Antagonist Neural Coordination without Mechanical Coupling after Targeted Muscle Reinnervation

Laura Ferrante, Anna Boesendorfer, Benedikt Baumgartner, Manuel Catalano, Antonio Bicchi, Oskar Aszmann, Dario Farina

Abstract

Following limb amputation and targeted muscle reinnervation (TMR), nerves that originally innervated agonist and antagonist muscles are rerouted into one or more residual target muscles. This rerouting profoundly alters the natural mechanical coupling and afferent signalling that normally link muscle groups in intact limbs. Despite this disruption, in this study we demonstrate, using high-density intramuscular microelectrode arrays implanted in reinnervated muscles of three TMR participants, that motor units (MUs) associated with agonist and antagonist tasks remain functionally coupled. Specifically, over 40% of motor units active during agonist tasks were also recruited during the corresponding antagonist tasks, even though no visual feedback on antagonist neural activity was provided. These motor units exhibited significantly different firing rates depending on their functional role. These results provide the first motor-unit-level evidence that the central nervous system preserves coordinated agonist-antagonist control after TMR and inform restorative surgical strategies and prosthetic systems capable of regulating both limb kinematics and dynamics based on the interplay of agonist-antagonist commands.

2601.02606 2026-02-19 q-bio.NC

gPC-based robustness analysis of neural systems through probabilistic recurrence metrics

Uros Sutulovic, Daniele Proverbio, Rami Katz, Giulia Giordano

Comments 20 pages, 7 figures

Abstract

Neuronal systems often preserve their characteristic functions and signalling patterns, also referred to as regimes, despite parametric uncertainties and variations. For neural models having uncertain parameters with a known probability distribution, probabilistic robustness analysis (PRA) allows us to understand and quantify under which uncertainty conditions a regime is preserved in expectation. We introduce a new computational framework for the efficient and systematic PRA of dynamical systems in neuroscience and we show its efficacy in analysing well-known neural models that exhibit multiple dynamical regimes: the Hindmarsh-Rose model for single neurons and the Jansen-Rit model for cortical columns. Given a model subject to parametric uncertainty, we employ generalised polynomial chaos to derive mean neural activity signals, which are then used to assess the amount of parametric uncertainty that the system can withstand while preserving the current regime, thereby quantifying the regime's robustness to such uncertainty. To assess persistence of regimes, we propose new metrics, which we apply to recurrence plots obtained from the mean neural activity signals. The overall result is a novel, general computational methodology that combines recurrence plot analysis and systematic persistence analysis to assess how much the uncertain model parameters can vary, with respect to their nominal value, while preserving the nominal regimes in expectation. We summarise the PRA results through probabilistic regime preservation (PRP) plots, which capture the effect of parametric uncertainties on the robustness of dynamical regimes in the considered models.
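
For readers unfamiliar with recurrence plots, the sketch below builds a binary recurrence matrix from a scalar signal and computes its recurrence rate, a standard summary statistic. It does not reproduce the new metrics proposed in the paper, which the abstract does not define. The threshold `eps`, the absence of time-delay embedding, and the function names are assumptions made for illustration.

```python
def recurrence_matrix(signal, eps):
    """R[i][j] = 1 when samples i and j are within eps of each other, else 0."""
    return [
        [1 if abs(a - b) <= eps else 0 for b in signal]
        for a in signal
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the matrix (main diagonal included)."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


# A period-2 toy signal: each sample recurs every second time step.
signal = [0.0, 1.0, 0.0, 1.0]
R = recurrence_matrix(signal, eps=0.1)
rate = recurrence_rate(R)
```

Periodic regimes produce regular diagonal structure in `R`, while a change of regime alters that structure. This is why recurrence plots are a natural object on which to define regime-persistence metrics.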

2511.21677 2026-02-19 q-bio.NC

Lesion-Independent Associations Between Thalamic Nuclei Volumes and Information Processing Speed in Multiple Sclerosis

Arshya Pooladi-Darvish, Heather Rosehart, Marina R. Everest, Ali R. Khan, Sarah A. Morrow

Comments 21 pages, 2 figures, 1 table, 2 supplementary figures

Abstract

Background: Cognitive impairment in multiple sclerosis (MS) is driven by both focal inflammation and compartmentalized neurodegeneration, yet the relative effect of lesion-independent thalamic atrophy on information processing speed (IPS) remains unclear. Methods: This retrospective cohort study included 100 participants with MS. Automatic segmentation techniques quantified lesion load and delineated 26 thalamic regions of interest (ROIs). Linear models compared associations between ROI volumes and Symbol Digit Modalities Test (SDMT) performance in lesion-adjusted and unadjusted models. Results: Twenty-one of 26 ROIs showed significant SDMT associations before lesion adjustment; twelve remained significant after adjustment. Lesion-independent associations were observed in the global thalamus, sensory relay nuclei (ventral posterolateral, medial and lateral geniculate), and associative hubs (pulvinar and mediodorsal-parafascicular complex). These processing-associated ROIs exhibited significantly lower lesion-mediated effects (13.4%) than those losing significance after adjustment (34.2%, p < 0.001). Conclusion: Our findings suggest that IPS impairment reflects heterogeneous contributions from focal lesion-driven and chronic neurodegenerative pathology, with nucleus-specific phenotyping potentially informing identification of higher risk individuals.

2510.12976 2026-02-19 q-bio.PE q-bio.QM

Likelihood-free inference of phylogenetic tree posterior distributions

Luc Blassel, Noémie Sauvage, Pierre Barrat-Charlaix, Bastien Boussau, Nicolas Lartillot, Laurent Jacob

Comments 12 Pages, 3 figures

Abstract

Phylogenetic inference, the task of reconstructing how related sequences evolved from common ancestors, is a central objective in evolutionary genomics. The current state-of-the-art methods exploit probabilistic models of sequence evolution along phylogenetic trees, by searching for the tree maximizing the likelihood of observed sequences, or by estimating the posterior of the tree given the sequences in a Bayesian framework. Both approaches typically require computing likelihoods, which is only feasible under simplifying assumptions such as independence of the evolution at the different positions of the sequence, and even then remains a costly operation. Here we present the first likelihood-free inference method for posterior distributions over phylogenies. It exploits a novel expressive encoding for pairs of sequences, and a parameterized probability distribution factorized over a succession of subtree merges. The resulting network provides well-calibrated estimates of the posterior distribution, leading to more accurate tree topologies than existing methods, even under models amenable to likelihood computation. We further show that its edge against likelihood-based methods dramatically increases under models of sequence evolution with intractable likelihoods.

2507.12347 2026-02-19 physics.bio-ph nlin.AO q-bio.CB

Threshold sensing yields optimal path formation in Physarum polycephalum

Daniele Proverbio, Giulia Giordano

Abstract

The model organism Physarum polycephalum is known to perform decentralised problem solving despite the absence of a nervous system. Experimental evidence and modelling studies have linked these abilities, and in particular maze-solving, to some sort of memory and adaptation. However, despite compelling hypotheses, it is still not clear whether the tasks are solved optimally, and which key dynamical mechanisms enable Physarum's impressive abilities. Here, we employ a circuital network model for the foraging behaviour of Physarum polycephalum to prove that threshold sensing yields the emergence of unique and optimal paths that connect food sources and solve mazes. We also prove which conditions lead to alternative paths, thus elucidating how the organism achieves flexibility and adaptation in a self-organised manner. These findings are aligned with experimental evidence and provide insight into the evolution of primitive intelligence. Our results can also inspire the development of threshold-based algorithms for computing applications.

2506.19018 2026-02-19 cond-mat.stat-mech physics.bio-ph q-bio.PE

Nonequilibrium Theory for Adaptive Systems in Varying Environments

Ying-Jen Yang, Charles D. Kocher, Ken A. Dill

Abstract

Biological organisms are adaptive, able to function in unpredictably changing environments. Drawing on recent nonequilibrium physics, we show that in adaptation, fitness has two components parameterized by observable coordinates: a static Generalism component characterized by state distributions, and a dynamic Tracking component sustained by nonequilibrium fluxes. Our findings: (1) General Theory: We prove that tracking gain scales strictly with environmental variability and switching time-scales; near-static or fast-switching environments are not worth tracking. (2) Optimal Strategies: We explain optimal bet-hedging and phenotypic memory as the interplay between these components. (3) Control: We demonstrate, with an example, how to suppress pathogens by independently attacking their Generalism robustness (via environmental time fractions) and Tracking capabilities (via environmental switching speed). This work provides a physical framework for understanding and controlling adaptivity.

2504.03276 2026-02-19 nlin.AO q-bio.MN

Resilience of the positive gene autoregulation loop

Daniele Proverbio, Giulia Giordano

Abstract

Gene expression in response to stimuli is regulated by transcription factors (TFs) through feedback loop motifs, aimed at maintaining the desired TF concentration despite uncertainties and perturbations. In this work, we consider a stochastic model of the positive gene autoregulating feedback loop and we probabilistically quantify its resilience, i.e., its ability to preserve the equilibrium associated with a prescribed concentration of TFs, and the corresponding basin of attraction, in the presence of noise. We show that the formation of larger oligomers, corresponding to larger Hill coefficients of the regulation function, and thus to sharper non-linearities, improves the system resilience, even close to critical concentrations of TFs. We also explore a complementary definition of resilience that can be assessed within a stochastic formulation relying on the Fokker-Planck equation. Our formal results are accompanied by numerical simulations.

2503.12286 2026-02-19 cs.CL cs.AI q-bio.GN q-bio.QM

Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

Zhanliang Wang, Da Wu, Quan Nguyen, Kai Wang

Abstract

Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. These studies typically use Human Phenotype Ontology (HPO) terms to prompt foundation models like GPT and LLaMA to predict candidate genes. However, in real-world settings, foundation models are not optimized for domain-specific tasks like clinical diagnosis, and inputs are unstructured clinical notes rather than standardized terms. How LLMs can be instructed to predict candidate genes or disease diagnosis from unstructured clinical notes remains a major challenge. Methods: We introduce RAG-driven CoT and CoT-driven RAG, two methods that combine Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to analyze clinical notes. A five-question CoT protocol mimics expert reasoning, while RAG retrieves data from sources like HPO and OMIM (Online Mendelian Inheritance in Man). We evaluated these approaches on rare disease datasets, including 5,980 Phenopacket-derived notes, 255 literature-based narratives, and 220 in-house clinical notes from Children's Hospital of Philadelphia. Results: We found that recent foundation models, including Llama 3.3-70B-Instruct and DeepSeek-R1-Distill-Llama-70B, outperformed earlier versions such as Llama 2 and GPT-3.5. We also showed that RAG-driven CoT and CoT-driven RAG both outperform foundation models in candidate gene prioritization from clinical notes; in particular, both methods with a DeepSeek backbone resulted in a top-10 gene accuracy of over 40% on Phenopacket-derived clinical notes. RAG-driven CoT works better for high-quality notes, where early retrieval can anchor the subsequent reasoning steps in domain-specific evidence, while CoT-driven RAG has an advantage when processing lengthy and noisy notes.

2412.06004 2026-02-19 math.ST math.PR q-bio.PE stat.CO stat.TH

Large-sample analysis of cost functionals for inference under the coalescent

Martina Favero, Jere Koskela

Comments 34 pages, 7 figures

Journal ref Stochastic Processes and their Applications 195 (2026) 104894
Abstract

The coalescent is a foundational model of latent genealogical trees under neutral evolution, but suffers from intractable sampling probabilities. Methods for approximating these sampling probabilities either introduce bias or fail to scale to large sample sizes. We show that a class of cost functionals of the coalescent with recurrent mutation and a finite number of alleles converge to tractable processes in the infinite-sample limit. A particular choice of costs yields insight about importance sampling methods, which are a classical tool for coalescent sampling probability approximation. These insights reveal that the behaviour of coalescent importance sampling algorithms differs markedly from standard sequential importance samplers, with or without resampling. We conduct a simulation study to verify that our asymptotics are accurate for algorithms with finite (and moderate) sample sizes. Our results constitute the first theoretical description of large-sample importance sampling algorithms for the coalescent, provide heuristics for the a priori optimisation of computational effort, and identify settings where resampling is harmful for algorithm performance. We observe strikingly different behaviour for importance sampling methods under the infinite sites model of mutation, which is regarded as a good and more tractable approximation of finite alleles mutation in most respects.

2602.16357 2026-02-19 cs.LG q-bio.QM

Optical Inversion and Spectral Unmixing of Spectroscopic Photoacoustic Images with Physics-Informed Neural Networks

Sarkis Ter Martirosyan, Xinyue Huang, David Qin, Anthony Yu, Stanislav Emelianov

Abstract

Accurate estimation of the relative concentrations of chromophores in a spectroscopic photoacoustic (sPA) image can reveal immense structural, functional, and molecular information about physiological processes. However, due to nonlinearities and ill-posedness inherent to sPA imaging, concentration estimation is intractable. The Spectroscopic Photoacoustic Optical Inversion Autoencoder (SPOI-AE) aims to address the sPA optical inversion and spectral unmixing problems without assuming linearity. Herein, SPOI-AE was trained and tested on in vivo mouse lymph node sPA images with unknown ground truth chromophore concentrations. SPOI-AE better reconstructs input sPA pixels than conventional algorithms while providing biologically coherent estimates for optical parameters, chromophore concentrations, and the percent oxygen saturation of tissue. SPOI-AE's unmixing accuracy was validated using a simulated mouse lymph node phantom ground truth.

2602.16282 2026-02-19 q-bio.PE physics.soc-ph

Neutral species facilitate coexistence among cyclically competing species under birth and death processes

Yikang Lu, Wenhao She, Xiaofang Duan, Junpyo Park

Comments 10 pages, 12 figures

Abstract

Natural birth and death are fundamental mechanisms that shape population dynamics in ecosystems. Nevertheless, studies of cyclic competition systems governed by the rock-paper-scissors (RPS) game have often ignored these mechanisms when analyzing biodiversity. Separately, because higher-order interactions (HOIs) are prevalent and have a profound impact on biodiversity, understanding how they affect it is one of the most challenging open issues, and their effects have been studied extensively in systems of cyclically competing populations, with a focus on neutral species. However, in real ecosystems, species can evolve and die naturally or be preyed upon by predators, whereas previous studies have considered only classic reaction rules among three species with a neutral, nonparticipant species. To identify how neutral species affect the biodiversity of the RPS system when natural birth and death are included, we consider a model of neutral species in higher-order interactions within the spatial RPS system with birth-and-death processes. Extensive simulations show that when neutral species interfere positively, they dominate the available space, thereby reducing the proportion of other species. Conversely, when the interference is harmful, the density of competing species increases. In addition, unlike traditional RPS dynamics, biodiversity can be effectively maintained even in high-mobility regimes. Our study reaffirms the critical role of neutral species in preserving biodiversity.

2602.16059 2026-02-19 q-bio.PE

Properties of biodiversity indices that model future extinction risk

Mike Steel, Kristina Wicke, Arne Mooers

Comments 16 pages, 5 figures

Abstract

The loss of biodiversity due to the likely widespread extinction of species in the near future is a focus of current concern in conservation biology. One approach to measure the impact of this extinction is based on the predicted loss of phylogenetic diversity. These predictions have become a focus of the Zoological Society of London's 'EDGE2' program for quantifying biodiversity loss, and involve the HED (heightened evolutionary distinctiveness) and HEDGE (heightened evolutionary distinctiveness and globally endangered) indices. Here, we show how to generalise the HED(GE) indices by expanding their application to more general settings (to phylogenetic networks, to feature diversity on discrete traits, and to arbitrary biodiversity measures). We provide a simple and explicit description of the mean and variance of such measures, and illustrate our results by an application to the phylogeny of all 27 extant Crocodilians. We also derive various equalities for feature diversity, and an inequality if species extinction rates are correlated with feature types.

2602.16022 2026-02-19 math.AP q-bio.PE

The lingering phenomenon and pattern formation in a nonlocal population model with cognitive map

Kyung-Han Choi, Thomas Hillen

Abstract

The rates at which individuals memorize and forget environmental information strongly influence their movement paths and long-term space use. To understand how these cognitive time scales shape population-level patterns, we propose and analyze a nonlocal population model with a cognitive map. The population density moves by a Fokker-Planck type diffusion driven by a cognitive map that stores habitat quality information nonlocally. The map is updated through local presence with learning and forgetting rates, and we consider both truncated and normalized perception kernels. We first study the movement-only system without growth. We show that finite perceptual range generates spatial heterogeneity in the cognitive map even in nearly homogeneous habitats, and we prove a lingering phenomenon on unimodal landscapes: for a fixed high learning rate, the peak density near the best location is maximized at an intermediate forgetting rate. We then couple cognitive diffusion to logistic growth. We establish local well-posedness and persistence by proving instability of the extinction equilibrium and the existence of a positive steady state, with uniqueness under an explicit condition on the motility function. Numerical simulations show that lingering persists under logistic growth and reveal a trade-off between lingering and total population size, since near the strongest-lingering regime the total mass can fall below the total resource, in contrast to classical random diffusive-logistic models.

2602.16004 2026-02-19 q-bio.NC q-bio.QM

Time-Varying Directed Interactions in Functional Brain Networks: Modeling and Validation

Nan Xu, Xiaodi Zhang, Wen-Ju Pan, Jeremy L. Smith, Eric H. Schumacher, Jason W. Allen, Vince D. Calhoun, Shella D. Keilholz

Abstract

Understanding the dynamic nature of brain connectivity is critical for elucidating neural processing, behavior, and brain disorders. Traditional approaches such as sliding-window correlation (SWC) characterize time-varying undirected associations but do not resolve directional interactions, limiting inference about time-resolved information flow in brain networks. We introduce sliding-window prediction correlation (SWpC), which embeds a directional linear time-invariant (LTI) model within each sliding window to estimate time-varying directed functional connectivity (FC). SWpC yields two complementary descriptors of directed interactions: a strength measure (prediction correlation) and a duration measure (window-wise duration of information transfer). Using concurrent local field potential (LFP) and fMRI BOLD recordings from rat somatosensory cortices, we demonstrate stable directionality estimates in both LFP band-limited power and BOLD. Using Human Connectome Project (HCP) motor task fMRI, SWpC detects significant task-evoked changes in directed FC strength and duration and shows higher sensitivity than SWC for identifying task-evoked connectivity differences. Finally, in post-concussion vestibular dysfunction (PCVD), SWpC reveals reproducible vestibular-multisensory brain-state shifts and improves healthy-control vs subacute patient (HC-ST) discrimination using state-derived features. Together, these results show that SWpC provides biologically interpretable, time-resolved directed connectivity patterns across multimodal validation and clinical application settings, supporting both basic and translational neuroscience.
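To make the contrast between SWC and a directed, window-wise predictive measure concrete, here is a minimal sketch. It assumes the linear time-invariant model in each window is a short finite impulse response fitted by ordinary least squares, and that the strength measure is the correlation between predicted and observed target signals. The model order, window settings, function names and synthetic data are illustrative assumptions rather than the authors' implementation, and the duration measure is not modelled here.

```python
import numpy as np


def lagged_design(x, order):
    """Rows are [x[t], x[t-1], ..., x[t-order+1]] for t = order-1 .. len(x)-1."""
    n = len(x)
    return np.column_stack([x[order - 1 - k : n - k] for k in range(order)])


def prediction_correlation(x, y, order=3):
    """Fit y[t] ~ sum_k h[k] * x[t-k] by least squares and correlate the
    prediction with y. Swapping x and y generally changes the value."""
    X = lagged_design(x, order)
    target = y[order - 1 :]
    h, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.corrcoef(X @ h, target)[0, 1]


def sliding_windows(n, width, step):
    """Start and end indices of consecutive windows of fixed width."""
    return [(s, s + width) for s in range(0, n - width + 1, step)]


def swc(x, y, width, step):
    """Sliding-window (undirected, zero-lag) Pearson correlation."""
    return np.array([
        np.corrcoef(x[a:b], y[a:b])[0, 1]
        for a, b in sliding_windows(len(x), width, step)
    ])


def swpc(x, y, width, step, order=3):
    """Sliding-window prediction correlation for the direction x -> y."""
    return np.array([
        prediction_correlation(x[a:b], y[a:b], order)
        for a, b in sliding_windows(len(x), width, step)
    ])


# Synthetic example (hypothetical data): y follows x with a one-sample delay.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.concatenate([[0.0], x[:-1]]) + 0.1 * rng.standard_normal(500)

forward = swpc(x, y, width=100, step=50)     # direction x -> y
backward = swpc(y, x, width=100, step=50)    # direction y -> x
undirected = swc(x, y, width=100, step=50)   # zero-lag correlation
```

In this synthetic case the forward values should be close to 1 while the backward and zero-lag values stay small, which is the kind of asymmetry an undirected SWC cannot express.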

2602.15957 2026-02-19 q-bio.PE cs.NE econ.TH

Evolutionary Systems Thinking -- From Equilibrium Models to Open-Ended Adaptive Dynamics

Dan Adler

Comments 17 pages, 5 figures

Abstract

Complex change is often described as "evolutionary" in economics, policy, and technology, yet most system dynamics models remain constrained to fixed state spaces and equilibrium-seeking behavior. This paper argues that evolutionary dynamics should be treated as a core system-thinking problem rather than as a biological metaphor. We introduce Stability-Driven Assembly (SDA) as a minimal, non-equilibrium framework in which stochastic interactions combined with differential persistence generate endogenous selection without genes, replication, or predefined fitness functions. In SDA, longer-lived patterns accumulate in the population, biasing future interactions and creating feedback between population composition and system dynamics. This feedback yields fitness-proportional sampling as an emergent property, realizing a natural genetic algorithm driven solely by stability. Using SDA, we demonstrate why equilibrium-constrained models, even when simulated numerically, cannot exhibit open-ended evolution: evolutionary systems require population-dependent, non-stationary dynamics in which structure and dynamics co-evolve. We conclude by discussing implications for system dynamics, economics, and policy modeling, and outline how agent-based and AI-enabled approaches may support evolutionary models capable of sustained novelty and structural emergence.

2512.18454 2026-02-19 cs.LG q-bio.QM

Out-of-Distribution Detection in Molecular Complexes via Diffusion Models for Irregular Graphs

David Graber, Victor Armegioiu, Rebecca Buller, Siddhartha Mishra

Abstract

Predictive machine learning models generally excel on in-distribution data, but their performance degrades on out-of-distribution (OOD) inputs. Reliable deployment therefore requires robust OOD detection, yet this is particularly challenging for irregular 3D graphs that combine continuous geometry with categorical identities and are unordered by construction. Here, we present a probabilistic OOD detection framework for complex 3D graph data built on a diffusion model that learns a density of the training distribution in a fully unsupervised manner. A key ingredient we introduce is a unified continuous diffusion over both 3D coordinates and discrete features: categorical identities are embedded in a continuous space and trained with cross-entropy, while the corresponding diffusion score is obtained analytically via posterior-mean interpolation from predicted class probabilities. This yields a single self-consistent probability-flow ODE (PF-ODE) that produces per-sample log-likelihoods, providing a principled typicality score for distribution shift. We validate the approach on protein-ligand complexes and construct strict OOD datasets by withholding entire protein families from training. PF-ODE likelihoods identify held-out families as OOD and correlate strongly with prediction errors of an independent binding-affinity model (GEMS), enabling a priori reliability estimates on new complexes. Beyond scalar likelihoods, we show that multi-scale PF-ODE trajectory statistics - including path tortuosity, flow stiffness, and vector-field instability - provide complementary OOD information. Modeling the joint distribution of these trajectory features yields a practical, high-sensitivity detector that improves separation over likelihood-only baselines, offering a label-free OOD quantification workflow for geometric deep learning.
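For orientation, the standard relations behind a probability-flow ODE likelihood can be written as follows. This is the generic score-based diffusion formulation under an assumed Gaussian perturbation; the symbols $f$, $g$, $\alpha_t$, $\sigma_t$, the class embeddings $e_c$ and the predicted probabilities $\hat p_\theta$ are notation chosen here, not necessarily the paper's.

```latex
% Forward noising SDE (generic form)
\mathrm{d}x = f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w

% Probability-flow ODE with the same marginals p_t
\frac{\mathrm{d}x}{\mathrm{d}t} = \tilde f(x,t)
  := f(x,t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x)

% Per-sample log-likelihood via the instantaneous change of variables
\log p_0(x_0) = \log p_T(x_T) + \int_0^T \nabla_x \cdot \tilde f(x_t,t)\,\mathrm{d}t

% Score from the posterior mean, for x_t = \alpha_t x_0 + \sigma_t \epsilon
\nabla_{x_t} \log p_t(x_t)
  = \frac{\alpha_t\, \mathbb{E}[x_0 \mid x_t] - x_t}{\sigma_t^2},
\qquad
\mathbb{E}[x_0 \mid x_t] \approx \sum_c \hat p_\theta(c \mid x_t)\, e_c
```

Under these assumptions, the last line shows how class probabilities predicted with a cross-entropy objective can supply a score for embedded categorical features, since the posterior mean is a probability-weighted average of the class embeddings.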

2510.15828 2026-02-19 q-bio.NC cs.AI

GENESIS: A Generative Model of Episodic-Semantic Interaction

Marco D'Alessandro, Leo D'Amato, Mikel Elkano, Mikel Uriz, Giovanni Pezzulo

Comments 18 pages, 6 figures

Abstract

A central challenge in cognitive neuroscience is to explain how semantic and episodic memory, two major forms of declarative memory, typically associated with cortical and hippocampal processing, interact to support learning, recall, and imagination. Despite significant advances, we still lack a unified computational framework that jointly accounts for core empirical phenomena across both semantic and episodic processing domains. Here, we introduce the Generative Episodic-Semantic Integration System (GENESIS), a computational model that formalizes memory as the interaction between two limited-capacity generative systems: a Cortical-VAE, supporting semantic learning and generalization, and a Hippocampal-VAE, supporting episodic encoding and retrieval within a retrieval-augmented generation (RAG) architecture. GENESIS reproduces hallmark behavioral findings, including generalization in semantic memory, recognition, serial recall effects and gist-based distortions in episodic memory, and constructive episodic simulation, while capturing their dynamic interactions. The model elucidates how capacity constraints shape the fidelity and memorability of experiences, how semantic processing introduces systematic distortions in episodic recall, and how episodic replay can recombine previous experiences. Together, these results provide a principled account of memory as an active, constructive, and resource-bounded process. GENESIS thus advances a unified theoretical framework that bridges semantic and episodic memory, offering new insights into the generative foundations of human cognition.

2410.14355 2026-02-19 q-bio.BM

Pore-level Quantitative Structure-Activity Relationship (QSAR) for Water Permeation Rate in Aquaporins

Juan José Galano-Frutos, Luca Bergamasco, Paolo Vigo, Matteo Morciano, Matteo Fasano, Davide Pirolli, Eliodoro Chiavazzo, Maria Cristina de Rosa

Abstract

Aquaporins (AQPs) and aquaglyceroporins (AQGPs) play a crucial role in regulating water transport and solute selectivity across biological membranes. Besides their biological relevance, AQPs have attracted growing interest as models for the design of next-generation biomimetic membranes for water filtration. In this work, we present a pore-level Quantitative Structure-Activity Relationship (QSAR) approach that relates structural and physicochemical pore descriptors with experimentally reported water permeation rates across a set of AQ(G)Ps with high-resolution 3D structures. This data-driven methodology, presented here as a proof of concept, introduces a multi-feature framework for determining pore descriptors associated with water transport efficiency in AQ(G)Ps. Applied to two compiled permeation rate datasets, this framework recapitulates determinants previously reported in single-feature studies, while also highlighting additional pore descriptors that emerge as relevant in a multi-variable context. The insights gained through this approach may, in the longer term, contribute to advancing the rational design of AQP-based filtration devices and to deepening the molecular understanding of the function of these valuable macromolecules in health and disease.