arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.15022 2026-02-17 cs.LG cs.AI math.GR q-bio.BM

Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

Cai Zhou, Zijie Chen, Zian Li, Jike Wang, Kaiyi Jiang, Pan Li, Rose Yu, Muhan Zhang, Stephen Bates, Tommi Jaakkola

Comments 32 pages

Abstract

Many generative tasks in chemistry and science involve distributions invariant to group symmetries (e.g., permutation and rotation). A common strategy enforces invariance and equivariance through architectural constraints such as equivariant denoisers and invariant priors. In this paper, we challenge this tradition through the alternative canonicalization perspective: first map each sample to an orbit representative with a canonical pose or order, train an unconstrained (non-equivariant) diffusion or flow model on the canonical slice, and finally recover the invariant distribution by sampling a random symmetry transform at generation time. Building on a formal quotient-space perspective, our work provides a comprehensive theory of canonical diffusion by proving: (i) the correctness, universality and superior expressivity of canonical generative models over invariant targets; (ii) that canonicalization accelerates training by removing diffusion score complexity induced by group mixtures and reducing conditional variance in flow matching. We then show that aligned priors and optimal transport act complementarily with canonicalization and further improve training efficiency. We instantiate the framework for molecular graph generation under $S_n \times SE(3)$ symmetries. By leveraging geometric spectra-based canonicalization and mild positional encodings, canonical diffusion significantly outperforms equivariant baselines in 3D molecule generation tasks, with similar or even less computation. Moreover, with a novel architecture Canon, CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.
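
The three-step recipe in this abstract (canonicalize, train without constraints, re-randomize at generation time) can be illustrated with a toy sketch. It is not the paper's method: it handles only translation and row permutation of a point cloud, ignores rotation, and replaces the geometric spectra-based canonicalization with a plain lexicographic sort. The function names `canonicalize` and `random_symmetry` are hypothetical.

```python
import numpy as np

def canonicalize(points):
    """Map a point cloud to one orbit representative under translation and row permutation."""
    centered = points - points.mean(axis=0)   # remove the translation
    order = np.lexsort(centered.T[::-1])      # sort rows by x, then y, then z
    return centered[order]

def random_symmetry(points, rng):
    """Apply a random row permutation, i.e. a random group element, at generation time."""
    return points[rng.permutation(len(points))]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(5, 3))
# A second copy of the same cloud, shuffled and shifted.
moved = cloud[rng.permutation(5)] + np.array([1.0, -2.0, 0.5])
# Both copies lie in the same orbit, so they share one canonical representative.
same = np.allclose(canonicalize(cloud), canonicalize(moved))
```

An unconstrained generative model would be trained on the outputs of `canonicalize`, and `random_symmetry` would be applied to its samples to restore the invariant distribution.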

2602.14843 2026-02-17 q-bio.NC

Evolutionarily Primitive Social Entities

Angelica Kaufmann

Abstract

Social entities only exist in virtue of collective acceptance or recognition, or acknowledgement by two or more individuals in the context of joint activities. Joint activities are made possible by the coordination of plans for action, and the coordination of plans for action is made possible by the capacity for collective intentionality. This paper investigates how primitive is the capacity that nonhuman animals have to create social entities, by individuating how primitive is the capacity for collective intentionality. I present a novel argument for the evolutionary primitiveness of social entities, by showing that the collective intentions upon which these social entities are created and shared are metaphysically reducible to the relevant individual intentions.

2602.14645 2026-02-17 q-bio.PE

Conditions for Bacterial Selection and Extinction Driven by Growth-Kill Trade-Off in Cyclic Antimicrobial Treatments

Nerea Martínez-López, Niclas Nordholt, Frank Schreiber, Míriam R. García

Abstract

Antimicrobial protocols - using substances such as antibiotics or disinfectants - remain the preferred option for preventing the spread of pathogenic bacteria. However, bacteria can develop mechanisms to reduce their antimicrobial susceptibility, which can lead to treatment failure and the selection of resistance or tolerance. In this work, we propose a minimal population dynamics model to study bacterial selection during cyclic antimicrobial application, a commonly used protocol. Selection in bacterial populations with heterogeneous antimicrobial susceptibility is modelled here as a trade-off between survival advantage (reduction in antimicrobial killing) and potential fitness costs (reduction in growth rate) of the less susceptible strains. The proposed model allows us to derive useful expressions for determining the success of cyclic antimicrobial treatments based on two bacterial traits: growth and kill rates. The results obtained here are directly applicable to preventing the selection and spread of resistant and tolerant bacterial strains in real-life protocols.

2412.16261 2026-02-17 q-bio.QM

A reduced model for the long-term effects of physical activity on type 2 diabetes

Lea Multerer, Pierluigi Francesco De Paola, Marta Lenatti, Alessia Paglialonga, Laura Azzimonti

Journal ref Math. Biosci. 394 (2026) 109645
Abstract

Type 2 diabetes progresses slowly and may be reversed through lifestyle changes, but quantifying the long-term impact of regular physical activity remains challenging due to sparse longitudinal data. Mechanistic models offer a powerful tool by simulating metabolic processes over extended timescales. However, multi-scale formulations that capture both the short-term effects of exercise sessions and the slow evolution of disease tend to be computationally demanding, limiting their practical use in personalized decision support. To address this limitation, we derived a reduced version of a two-scale model that captures the short- and long-term effects of physical activity on blood glucose regulation. By analytically averaging the short-term effects induced by exercise, we developed a homogenized formulation that transmits the average contribution of physical activity to the slower glucose-insulin dynamics. This reduction preserves the key model dynamics while decreasing computational complexity by a factor of almost 2000. We prove that the approximation error remains bounded and confirm the model's accuracy through a parameter-based simulation study. The resulting model provides a mathematically grounded reduction that retains key physiological mechanisms while enabling fast long-term simulations. This substantial computational gain makes it suitable for integration into medical decision support systems, where it can be used to design and evaluate personalized physical activity plans aimed at reducing the risk of type 2 diabetes.

2410.17801 2026-02-17 q-bio.GN

Rawsamble: Overlapping and Assembling Raw Nanopore Signals using a Hash-based Seeding Mechanism

Can Firtina, Maximilian Mordig, Harun Mustafa, Sayan Goswami, Nika Mansouri Ghiasi, Stefano Mercogliano, Furkan Eris, Joël Lindegger, Andre Kahles, Onur Mutlu

Comments Accepted to appear in the Bioinformatics journal

Abstract

Raw nanopore signal analysis is a common approach in genomics to provide fast and resource-efficient analysis without translating the signals to bases (i.e., without basecalling). However, existing solutions cannot interpret raw signals directly if a reference genome is unknown due to a lack of accurate mechanisms to handle increased noise in pairwise raw signal comparison. Our goal is to enable the direct analysis of raw signals without a reference genome. To this end, we propose Rawsamble, the first mechanism that can identify regions of similarity between all raw signal pairs, known as all-vs-all overlapping, using a hash-based search mechanism. We use these overlaps to construct de novo assembly graphs with an existing assembler, miniasm, off-the-shelf. To our knowledge, these are the first de novo assemblies ever constructed directly from raw signals without basecalling. Our extensive evaluations across multiple genomes of varying sizes show that Rawsamble provides a significant speedup (on average by 5.01x and up to 23.10x) and reduces peak memory usage (on average by 5.74x and up to 22.00x) compared to a conventional genome assembly pipeline using the state-of-the-art tools for basecalling (Dorado's fastest mode) and overlapping (minimap2) on a CPU. We find that around one-third of Rawsamble's overlapping pairs are also found by minimap2. We find that when we use overlapping reads from Rawsamble, we can construct unitigs that are 1) as accurate as those built from minimap2's overlaps and 2) up to half a chromosome in length (e.g., 2.3 million bases for E. coli). Source code: https://github.com/CMU-SAFARI/RawHash

2602.14616 2026-02-17 stat.CO q-bio.QM stat.AP

Higher-Order Hit-&-Run Samplers for Linearly Constrained Densities

Richard D. Paul, Anton Stratmann, Johann F. Jadebeck, Martin Beyß, Hanno Scharr, David Rügamer, Katharina Nöh

Abstract

Markov chain Monte Carlo (MCMC) sampling of densities restricted to linearly constrained domains is an important task arising in Bayesian treatment of inverse problems in the natural sciences. While efficient algorithms for uniform polytope sampling exist, much less work has dealt with more complex constrained densities. In particular, gradient information as used in unconstrained MCMC is not necessarily helpful in the constrained case, where the gradient may push the proposal's density out of the polytope. In this work, we propose a novel constrained sampling algorithm, which combines strengths of higher-order information, like the target's log-density's gradients and curvature, with the Hit-&-Run proposal, a simple mechanism which guarantees the generation of feasible proposals, fulfilling the linear constraints. Our extensive experiments demonstrate improved sampling efficiency on complex constrained densities over various constrained and unconstrained samplers.
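
As background for the Hit-&-Run proposal mentioned in this abstract, here is a minimal sketch of the classical uniform Hit-&-Run step on a polytope $\{x : Ax \le b\}$. It is not the higher-order sampler the paper proposes: it uses no gradient or curvature information and targets only the uniform density. The function name `hit_and_run` and the unit-square example are assumptions made for illustration.

```python
import numpy as np

def hit_and_run(A, b, x0, n_steps, rng):
    """Uniform Hit-&-Run on the bounded polytope {x : A x <= b}, started at an interior point x0."""
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        d = rng.normal(size=x.shape)
        d /= np.linalg.norm(d)                  # random direction on the unit sphere
        slack = b - A @ x                       # positive while x is strictly interior
        rate = A @ d
        # Feasible step sizes t satisfy rate * t <= slack for every constraint.
        t_max = np.min(slack[rate > 0] / rate[rate > 0])
        t_min = np.max(slack[rate < 0] / rate[rate < 0])
        x = x + rng.uniform(t_min, t_max) * d   # uniform point on the feasible chord
        samples.append(x.copy())
    return np.array(samples)

# Unit square 0 <= x, y <= 1 written as A x <= b.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
samples = hit_and_run(A, b, x0=[0.5, 0.5], n_steps=1000, rng=np.random.default_rng(0))
```

Every proposal lies on a chord inside the polytope, which is the feasibility guarantee the abstract refers to. A non-uniform target would additionally need a non-uniform draw along the chord or an accept/reject step.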

2602.14562 2026-02-17 math.PR q-bio.PE

Infection models on dense dynamic random graphs

Simone Baldassarri, Peter Braunsteins, Frank den Hollander, Michel Mandjes

Comments 29 pages

Abstract

We consider Susceptible-Infected-Recovered (SIR) models on dense dynamic random graphs, in which the joint dynamics of vertices and edges are co-evolutionary, i.e., they influence each other bidirectionally. In particular, edges appear and disappear over time depending on the states of the two connected vertices, on how long they have been infected, and on the total density of susceptible and infected vertices. Our main results establish functional laws of large numbers for the densities of susceptible, infected, and recovered vertices, jointly with the underlying evolving random graphs in the graphon space. Our results are supported by simulations, which characterize the limiting size of the epidemics, i.e., the limiting density of susceptible vertices, and how the peak of the epidemics depends on the rate of the evolution of the underlying graph. The proofs of our main results rely on the careful construction of a mimicking process, obtained by approximating the two-way feedback interaction between vertex and edge dynamics with a mean-field type interaction, acting only as one-way feedback, that remains sufficiently close to the original co-evolutionary process. To treat the more general setting in which edge dynamics are affected by the proportions of susceptible and infected individuals, we introduce a methodological extension of existing techniques. We thus show that our model exhibits multiple epidemic peaks -- a phenomenon observed in real-world epidemics -- which can emerge in models that incorporate mutual feedback between vertex and edge dynamics.

2602.14328 2026-02-17 q-bio.BM q-bio.QM

Conformational landscapes in cryo-ET data based on MD simulations

Slavica Jonic

Abstract

Cryo-electron tomography (cryo-ET) provides a unique window into molecular organization in cellular environments (in situ). However, the interpretation of molecular structural information is complicated by several intrinsic properties of cryo-ET data, such as noise, missing wedge, and continuous conformational variability of the molecules. Additionally, in crowded in situ environments, the number of particles extracted is sometimes small and precludes extensive classification into discrete states. These challenges shift the emphasis from high-resolution structure determination toward validation and interpretation of low-resolution density maps, and analysis of conformational flexibility. Molecular Dynamics (MD) simulations are particularly well suited to this task, as they provide a physically grounded way to explore continuous conformation transitions consistent with both experimental data and molecular energetics. This review focuses on the roles of MD simulations in cryo-ET, emphasizing their use in emerging methods for conformational landscape determination and their contribution to gaining new biological insight.

2602.13887 2026-02-17 cs.CV q-bio.NC

Human-Aligned Evaluation of a Pixel-wise DNN Color Constancy Model

Hamed Heidari-Gorji, Raquel Gil Rodriguez, Karl R. Gegenfurtner

Abstract

We previously investigated color constancy in photorealistic virtual reality (VR) and developed a Deep Neural Network (DNN) that predicts reflectance from rendered images. Here, we combine both approaches to compare and study a model and human performance with respect to established color constancy mechanisms: local surround, maximum flux and spatial mean. Rather than evaluating the model against physical ground truth, model performance was assessed using the same achromatic object selection task employed in the human experiments. The model, a ResNet based U-Net from our previous work, was pre-trained on rendered images to predict surface reflectance. We then applied transfer learning, fine-tuning only the network's decoder on images from the baseline VR condition. To parallel the human experiment, the model's output was used to perform the same achromatic object selection task across all conditions. Results show a strong correspondence between the model and human behavior. Both achieved high constancy under baseline conditions and showed similar, condition-dependent performance declines when the local surround or spatial mean color cues were removed.

2511.00310 2026-02-17 q-bio.OT

Theoretical morphology of a cichlid according to the approach of Systemic Morphometry

Juan Rivera Cázares, Xavier Valencia Díaz, Christian Lambarri Martínez

Comments 54 pages, 10 Figures, 19 Tables. The first version is replaced to include figures and tables

Abstract

We analyzed the body structure of the Blackstripe Cichlid Vieja fenestrata (Günther, 1860), a species with high phenotypic variability, using the Systemic Morphometry methodology previously proposed by one of the authors. From this perspective and considering the properties of its bauplan, we describe the expected morphometric variability of this species. The Infinitesimal Change Rates (IChR) were obtained by differentiating the allometric equations that relate pairs of morphometric variables, and they demonstrated that the species' growth is continuous throughout its ontogeny. For some of the morphometric variables, relative growth trajectories were traced and their relationship with the IChR shown. Also, the observed and theoretical Systemic Phenotypical Spaces (SPS) were described by using three-dimensional graphs and Mahalanobis Quadratic Distances (MQD). This was an alternate approach that allowed the analysis of the phenotypical spaces' properties in a wider, more objective, and analytical manner. We conclude that the morphometric variability observed in V. fenestrata agrees with the variability expected in the times and places sampled, although there are still some issues to be explained. We propose to incorporate the structural variance into the classical phenotypic variance equation, and consider the equality: phenotypic variance = SPS theoretical (the phenotypic variance is equal to the theoretical Systemic Phenotypic Space), as a point of convergence between Quantitative Genetics and Systemic Morphometry.
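
As a worked illustration of how a change rate follows from an allometric equation, assume the standard power-law form relating two morphometric variables $x$ and $y$. The abstract does not state the functional form, so the equation and the symbols $a$ and $b$ below are assumptions.

```latex
% Assumed power-law allometry between two morphometric variables x and y
y = a\,x^{b}
% Differentiating with respect to x gives the instantaneous change rate
\frac{dy}{dx} = a\,b\,x^{\,b-1} = b\,\frac{y}{x}
```

Under this assumption the rate is defined and continuous for every $x > 0$.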

2510.26525 2026-02-17 q-bio.OT

Biological Engineering: What does it mean? Where does it (need to) go?

Ulrike A. Nuber, Viktor Stein

Comments 19 pages, 2 Figures, 2 Tables

Abstract

Biological engineering, the convergence between engineering and biology, is at the forefront of significant advances in healthcare, agriculture, and environmental sustainability, making it highly relevant to current scientific and societal challenges. We take a comprehensive look at this broad and interdisciplinary domain, structure it into three main areas - bioinspired, biological and biohybrid approaches - and dissect inherent and fundamental challenges along with opportunities, highlighting specific examples. We describe how data-driven discovery and design, in conjunction with artificial intelligence, can mitigate the absence of reductionist models in these areas. Additionally, we address the education of a new generation of biological engineers, emphasizing mathematical, technical, and artificial intelligence frameworks.

2510.24903 2026-02-17 cond-mat.dis-nn cond-mat.stat-mech q-bio.NC

Emergence of Chimeras States in One-dimensional Ising model with Long-Range Diffusion

Alejandro de Haro García, Joaquín J. Torres

Comments 36 pages, 8 figures

Journal ref Chaos, Solitons and Fractals 207, 118068 (2026)
Abstract

In this work, we examine the conditions for the emergence of chimera-like states in Ising systems. We study an Ising chain with periodic boundaries in contact with a thermal bath at temperature T, which induces stochastic changes in spin variables. To capture the non-locality needed for chimera formation, we introduce a model setup with non-local diffusion of spin values through the whole system. More precisely, diffusion is modeled through spin-exchange interactions between units up to a distance R, using Kawasaki dynamics. This setup mimics, e.g., neural media, such as the brain, in the presence of electrical (diffusive) interactions. We explored the influence of such non-local dynamics on the emergence of complex spatiotemporal synchronization patterns of activity. Depending on system parameters, we report here for the first time chimera-like states in the Ising model, characterized by relatively stable moving domains of spins with different local magnetization. We analyzed the system at T=0, both analytically and via simulations, and computed the system's phase diagram, revealing rich behavior: regions with only chimeras, coexistence of chimeras and stable domains, and metastable chimeras that decay into uniform stable domains. This study offers fundamental insights into how coherent and incoherent synchronization patterns can arise in complex networked systems such as the brain.

2510.18387 2026-02-17 physics.med-ph eess.IV eess.SP q-bio.QM

Quantification of dual-state 5-ALA-induced PpIX fluorescence: Methodology and validation in tissue-mimicking phantoms

Silvère Ségaud, Charlie Budd, Matthew Elliot, Graeme Stasiuk, Jonathan Shapey, Yijing Xie, Tom Vercauteren

Abstract

Quantification of protoporphyrin IX (PpIX) fluorescence in human brain tumours has the potential to significantly improve patient outcomes in neuro-oncology, but represents a formidable imaging challenge. Protoporphyrin is a biological molecule which interacts with the tissue micro-environment to form two photochemical states in glioma. Each exhibits markedly different quantum efficiencies, with distinct but overlapping emission spectra that also overlap with tissue autofluorescence. Fluorescence emission is known to be distorted by the intrinsic optical properties of tissue, coupled with marked intra-tumoural heterogeneity as a hallmark of glioma tumours. Existing quantitative fluorescence systems are developed and validated using simplified phantoms that do not simultaneously mimic the complex interactions between fluorophores and tissue optical properties or micro-environment. Consequently, existing systems risk introducing systematic errors into PpIX quantification when used in tissue. In this work, we introduce a novel pipeline for quantification of PpIX in glioma, which robustly differentiates both emission states from background autofluorescence without reliance on a priori spectral information, and accounts for variations in their quantum efficiency. Unmixed PpIX emission forms are then corrected for wavelength-dependent optical distortions and weighted for accurate quantification. Significantly, this pipeline is developed and validated using novel tissue-mimicking phantoms replicating the optical properties of glioma tissues and photochemical variability of PpIX fluorescence in glioma. Our workflow achieves strong correlation with ground-truth PpIX concentrations ($R^2 = 0.918 \pm 0.002$), demonstrating its potential for robust, quantitative PpIX fluorescence imaging in clinical settings.

2508.01055 2026-02-17 cs.LG cs.AI q-bio.BM q-bio.QM

FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models

Xuan Liu, Siru Ouyang, Xianrui Zhong, Jiawei Han, Huimin Zhao

Comments NeurIPS 2025 (Datasets and Benchmarks Track)

Abstract

Large language models (LLMs) have gained significant attention in chemistry. However, most existing datasets center on molecular-level property prediction and overlook the role of fine-grained functional group (FG) information. Incorporating FG-level data can provide valuable prior knowledge that links molecular structures with textual descriptions, which can be used to build more interpretable, structure-aware LLMs for reasoning on molecule-related tasks. Moreover, LLMs can learn from such fine-grained information to uncover hidden relationships between specific functional groups and molecular properties, thereby advancing molecular design and drug discovery. Here, we introduce FGBench, a dataset comprising 625K molecular property reasoning problems with functional group information. Functional groups are precisely annotated and localized within the molecule, which ensures the dataset's interoperability, thereby facilitating further multimodal applications. FGBench includes both regression and classification tasks on 245 different functional groups across three categories for molecular property reasoning: (1) single functional group impacts, (2) multiple functional group interactions, and (3) direct molecular comparisons. In the benchmark of state-of-the-art LLMs on 7K curated data, the results indicate that current LLMs struggle with FG-level property reasoning, highlighting the need to enhance reasoning capabilities in LLMs for chemistry tasks. We anticipate that the methodology employed in FGBench to construct datasets with functional group-level information will serve as a foundational framework for generating new question-answer pairs, enabling LLMs to better understand fine-grained molecular structure-property relationships. The dataset and evaluation code are available at https://github.com/xuanliugit/FGBench.

2504.18367 2026-02-17 physics.comp-ph cs.LG physics.chem-ph q-bio.BM

A Novel 4-D Dataset Paradigm for Studying Complete Ligand-Protein Dissociation Dynamics

Maodong Li, Jiying Zhang, Zhe Wang, Bin Feng, Wenqi Zeng, Dechin Chen, Zhijun Pan, Yu Li, Zijing Liu, Yi Isaac Yang

Comments The dissociation dynamics dataset DD-13M is publicly available at https://huggingface.co/datasets/SZBL-IDEA/MD (For facilitated browsing and categorical download, a dedicated web interface is maintained at: https://aimm.szbl.ac.cn/database/ddd/#/home)

Abstract

The kinetics and dynamics of drug-protein binding and dissociation are crucial to understanding drug absorption and metabolism. Despite advances in artificial intelligence (AI) tools for drug-protein interaction studies, existing training datasets remain limited to static structures or quasi-static conformations. This paper proposes a novel computational approach for rapidly generating drug-protein dissociation trajectories and presents the inaugural dynamically time-resolved 4-D (t, x, y, z) trajectory database DD-13M. This dataset captures over 26,000 complete dissociation processes for 565 ligand-protein complexes, providing nearly 13 million frames of all-atom simulation trajectories. A deep equivariant generative model, UnbindingFlow, was trained using the DD-13M dataset. This model has the capacity to produce dissociation trajectories for novel targets whilst accurately predicting their rate constants (koff). DD-13M introduces a new type of training dataset for AI models, establishing a de novo paradigm for studying the dynamics of drug-protein interactions.

2410.04305 2026-02-17 q-bio.PE nlin.CD nlin.PS

Traveling vegetation-herbivore waves can sustain ecosystems threatened by droughts and population growth

Joydeep Singha, Hannes Uecker, Ehud Meron

Comments 16 pages, 13 figures

Abstract

Dryland vegetation can survive water stress by forming spatial patterns but is often subjected to herbivory as an additional stress that puts it at risk of desertification. Understanding the mutual relationships between vegetation patterning and herbivory is crucial for securing food production in drylands, which constitute the majority of rangelands worldwide. Here, we introduce a novel vegetation-herbivore model that captures pattern-forming feedbacks associated with water and herbivory stress and a behavioral aspect of herbivores representing an exploitation strategy. Applying numerical continuation methods, we analyze the bifurcation structure of uniform and patterned vegetation-herbivore solutions, and use direct numerical simulations to study various forms of collective herbivore dynamics. We find that herbivory stress can induce traveling vegetation-herbivore waves and uncover the ecological mechanism that drives their formation. In the traveling-wave state, the herbivore distribution is asymmetric with higher density on one side of each vegetation patch. At low precipitation values their distribution is localized, while at high precipitation the herbivores are spread over the entire landscape. Importantly, their asymmetric distribution results in uneven herbivory stress, strong on one side of each vegetation patch and weak on the opposing side - weaker than the stress exerted in spatially uniform herbivore distribution. Consequently, the formation of traveling waves results in increased sustainability to herbivory stress. We conclude that vegetation-herbivore traveling waves may play an essential role in sustaining herbivore populations under conditions of combined water and herbivory stress, thereby contributing to food security in endangered regions threatened by droughts and population growth.

2602.13630 2026-02-17 q-bio.PE

Bistability to Quad-stability: Emergence of Hybrid Phenotypes & Enhanced Spatio-temporal Plasticity in Presence of Host-Circuit Coupling

Ranu Kundu, Priya Chakraborty, Sohini Guin, Shyam Sundar Poriah, Sayantari Ghosh

Comments 18 pages, 9 figures

Abstract

In the context of multistability-driven diseases, like cancer, spatiotemporal plasticity plays a significant role in achieving a spectrum of phenotypic variations. The interplay between gene regulatory networks and environmental factors, such as resource competition and spatial diffusion, plays a crucial role in determining cellular behaviour and phenotypic heterogeneity. Though reaction-diffusion frameworks have been widely applied in developmental biology, less attention has been paid to the simultaneous effects of resource competition and growth feedback on spatial organization. In this paper, we observed that a bistable genetic circuit under high resource competition due to growth feedback gives rise to multiple emergent phenotypes, as observed in cancer systems. Furthermore, we observed how spatial diffusion coupled with intrinsic nonlinearity can drive the emergence of distinct spatial dynamics over time. The observed spatiotemporal plasticity can also be driven by the comparative stability of the fixed points, diffusivity, and asymmetry of diffusion. Our findings highlight that growth-induced resource competition combined with diffusion can provide deeper insights into metastasis and cancer progression.

2602.13503 2026-02-17 q-bio.BM

Hermes: Large DEL Datasets Train Generalizable Protein-Ligand Binding Prediction Models

Maxwell Kleinsasser, Brayden J. Halverson, Edward Kraft, Sean Francis-Lyon, Sarah E. Hugo, Mackenzie R. Roman, Ben Miller, Andrew D. Blevins, Ian K. Quigley

Abstract

The quality and consistency of training data remain critical bottlenecks for protein-ligand binding prediction. Public affinity datasets, aggregated from thousands of labs and assay formats, introduce biases that limit model generalization and complicate evaluation. DNA-encoded chemical libraries (DELs) offer a potential solution: unified experimental protocols generating massive binding datasets across diverse chemical and protein target space. We present Hermes, a lightweight transformer trained exclusively on DEL data from screens against hundreds of protein targets, representing one of the largest and most protein-diverse DEL training sets applied to protein-ligand interaction (PLI) modeling to date. Despite never seeing traditional affinity measurements during training, Hermes generalizes to held-out targets, novel chemical scaffolds, and external benchmarks derived from public binding data and high-throughput screens. Our results demonstrate that DEL data alone captures transferable protein-ligand interaction representations, while Hermes' minimal architecture enables inference speeds suitable for large-scale virtual screening.

2602.13502 2026-02-17 cs.AI q-bio.OT

Translating Dietary Standards into Healthy Meals with Minimal Substitutions

Trevor Chan, Ilias Tagkopoulos

Comments 49 pages, 4 figures

Abstract

An important goal for personalized diet systems is to improve nutritional quality without compromising convenience or affordability. We present an end-to-end framework that converts dietary standards into complete meals with minimal change. Using the What We Eat in America (WWEIA) intake data for 135,491 meals, we identify 34 interpretable meal archetypes that we then use to condition a generative model and a portion predictor to meet USDA nutritional targets. In comparisons within archetypes, generated meals are better at following recommended daily intake (RDI) targets by 47.0%, while remaining compositionally close to real meals. Our results show that by allowing one to three food substitutions, we were able to create meals that were 10% more nutritious while reducing costs by 19-32% on average. By turning dietary guidelines into realistic, budget-aware meals and simple swaps, this framework can underpin clinical decision support, public-health programs, and consumer apps that deliver scalable, equitable improvements in everyday nutrition.

2602.13423 2026-02-17 q-bio.PE cond-mat.dis-nn cond-mat.stat-mech

Spatiotemporal noise stabilizes unbounded diversity in strongly-competitive communities

Amer Al-Hiyasat, Daniel W. Swartz, Jeff Gore, Mehran Kardar

Comments Main text: 6 pages, 4 figures. Supplementary Information: 12 pages, 3 figures

Abstract

Classical ecological models predict that large, diverse communities should be unstable, presenting a central challenge to explaining the stable biodiversity seen in nature. We revisit this long-standing problem by extending the generalized Lotka-Volterra model to include both spatial structure and environmental fluctuations across space and time. We find that neither space nor environmental noise alone can resolve the tension between diversity and stability, but that their combined effects permit arbitrarily many species to stably coexist despite strongly disordered competitive interactions. We analytically characterize the noise-induced transition to coexistence, showing that spatiotemporal noise drives an anomalous scaling of abundance fluctuations, known empirically as Taylor's law. At the community level, this manifests as an effective sublinear self-inhibition that renders the community stable and asymptotically neutral in the high-diversity limit. Spatiotemporal noise thus provides a novel resolution to the diversity-stability paradox and a generic mechanism by which complex communities can persist.

2602.13419 2026-02-17 q-bio.QM cs.AI cs.CL cs.LG q-bio.BM

Protect$^*$: Steerable Retrosynthesis through Neuro-Symbolic State Encoding

Shreyas Vinaya Sathyanarayana, Shah Rahil Kirankumar, Sharanabasava D. Hiremath, Bharath Ramsundar

Abstract

Large Language Models (LLMs) have shown remarkable potential in scientific domains like retrosynthesis; yet, they often lack the fine-grained control necessary to navigate complex problem spaces without error. A critical challenge is directing an LLM to avoid specific, chemically sensitive sites on a molecule, a task where unconstrained generation can lead to invalid or undesirable synthetic pathways. In this work, we introduce Protect$^*$, a neuro-symbolic framework that grounds the generative capabilities of LLMs in rigorous chemical logic. Our approach combines automated rule-based reasoning (using a comprehensive database of 55+ SMARTS patterns and 40+ characterized protecting groups) with the generative intuition of neural models. The system operates via a hybrid architecture: an "automatic mode" where symbolic logic deterministically identifies and guards reactive sites, and a "human-in-the-loop mode" that integrates expert strategic constraints. Through "active state tracking," we inject hard symbolic constraints into the neural inference process via a dedicated protection state linked to canonical atom maps. We demonstrate this neuro-symbolic approach through case studies on complex natural products, including the discovery of a novel synthetic pathway for Erythromycin B, showing that grounding neural generation in symbolic logic enables reliable, expert-level autonomy.

2602.13398 2026-02-17 cs.LG q-bio.QM

Accelerated Discovery of Cryoprotectant Cocktails via Multi-Objective Bayesian Optimization

Daniel Emerson, Nora Gaby-Biegel, Purva Joshi, Yoed Rabin, Rebecca D. Sandlin, Levent Burak Kara

Abstract

Designing cryoprotectant agent (CPA) cocktails for vitrification is challenging because formulations must be concentrated enough to suppress ice formation yet non-toxic enough to preserve cell viability. This tradeoff creates a large, multi-objective design space in which traditional discovery is slow, often relying on expert intuition or exhaustive experimentation. We present a data-efficient framework that accelerates CPA cocktail design by combining high-throughput screening with an active-learning loop based on multi-objective Bayesian optimization. From an initial set of measured cocktails, we train probabilistic surrogate models to predict concentration and viability and quantify uncertainty across candidate formulations. We then iteratively select the next experiments by prioritizing cocktails expected to improve the Pareto front, maximizing expected Pareto improvement under uncertainty, and update the models as new assay results are collected. Wet-lab validation shows that our approach efficiently discovers cocktails that simultaneously achieve high CPA concentrations and high post-exposure viability. Relative to a naive strategy and a strong baseline, our method improves dominated hypervolume by 9.5% and 4.5%, respectively, while reducing the number of experiments needed to reach high-quality solutions. In complementary synthetic studies, it recovers a comparably strong set of Pareto-optimal solutions using only 30% of the evaluations required by the prior state-of-the-art multi-objective approach, which amounts to saving approximately 10 weeks of experimental time. Because the framework assumes only a suitable assay and defined formulation space, it can be adapted to different CPA libraries, objective definitions, and cell lines to accelerate cryopreservation development.
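The abstract above reports gains in "dominated hypervolume" over a Pareto front. As a rough illustration of that metric only (not the authors' optimizer, surrogate models, or acquisition function), the sketch below computes a two-objective Pareto front and its dominated hypervolume, assuming both objectives (here, hypothetically, CPA concentration and viability) are maximized and measured against a reference point. The function names `pareto_front` and `hypervolume_2d` and the sample numbers are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both maximized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by objective 1 descending."""
    front: List[Point] = []
    best_second = float("-inf")
    # Walk from the largest first objective downward; a point is kept only
    # if its second objective beats every point seen so far.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_second:
            front.append((x, y))
            best_second = y
    return front


def hypervolume_2d(points: List[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by the points and bounded below by the reference point."""
    inside = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    area = 0.0
    prev_y = ref[1]
    # Along the front, x decreases while y increases, so each point adds a
    # horizontal strip of width (x - ref_x) and height (y - prev_y).
    for x, y in pareto_front(inside):
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


# Hypothetical measurements: (concentration score, viability score).
measured = [(4.0, 1.0), (3.0, 2.0), (1.0, 3.0), (2.0, 1.0)]
candidate = (2.0, 2.5)

# Hypervolume improvement if the candidate were measured exactly as guessed.
gain = hypervolume_2d(measured + [candidate]) - hypervolume_2d(measured)
```

A Bayesian optimization loop would replace the fixed `candidate` value with a predictive distribution from a surrogate model and rank candidates by their expected improvement; that part is omitted here.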

2602.13346 2026-02-17 q-bio.GN cs.AI cs.CV

CellMaster: Collaborative Cell Type Annotation in Single-Cell Analysis

Zhen Wang, Yiming Gao, Jieyuan Liu, Enze Ma, Jefferson Chen, Mark Antkowiak, Mengzhou Hu, JungHo Kong, Dexter Pratt, Zhiting Hu, Wei Wang, Trey Ideker, Eric P. Xing

Comments Preprint

Abstract

Single-cell RNA-seq (scRNA-seq) enables atlas-scale profiling of complex tissues, revealing rare lineages and transient states. Yet, assigning biologically valid cell identities remains a bottleneck because markers are tissue- and state-dependent, and novel states lack references. We present CellMaster, an AI agent that mimics expert practice for zero-shot cell-type annotation. Unlike existing automated tools, CellMaster leverages LLM-encoded knowledge (e.g., GPT-4o) to perform on-the-fly annotation with interpretable rationales, without pre-training or fixed marker databases. Across 9 datasets spanning 8 tissues, CellMaster improved accuracy by 7.1% over best-performing baselines (including CellTypist and scTab) in automatic mode. With human-in-the-loop refinement, this advantage increased to 18.6%, with a 22.1% gain on subtype populations. The system demonstrates particular strength in rare and novel cell states where baselines often fail. Source code and the web application are available at https://github.com/AnonymousGym/CellMaster.

2602.13325 2026-02-17 q-bio.NC cs.LG

Graph neural networks uncover structure and functions underlying the activity of simulated neural assemblies

Cédric Allier, Larissa Heinrich, Magdalena Schneider, Stephan Saalfeld

Abstract

Graph neural networks trained to predict observable dynamics can be used to decompose the temporal activity of complex heterogeneous systems into simple, interpretable representations. Here we apply this framework to simulated neural assemblies with thousands of neurons and demonstrate that it can jointly reveal the connectivity matrix, the neuron types, the signaling functions, and in some cases hidden external stimuli. In contrast to existing machine learning approaches such as recurrent neural networks and transformers, which emphasize predictive accuracy but offer limited interpretability, our method provides both reliable forecasts of neural activity and interpretable decomposition of the mechanisms governing large neural assemblies.

2507.09045 2026-02-17 q-bio.NC physics.bio-ph

Coevolutionary balance of resting-state brain networks in autism

S. Rezaei Afshar, G. Reza Jafari

Comments 29 pages, 18 figures, 8 tables

Abstract

Autism spectrum disorder (ASD) is associated with atypical large-scale brain organization, yet the functional principles underlying these alterations remain incompletely understood. We examined whether coevolutionary balance, a network-level energy measure derived from signed interactions and nodal activity states, captures disruptions in resting-state functional connectivity in autistic adults. Using resting-state fMRI data from ABIDE I with ComBat harmonization to mitigate multi-site batch effects, we constructed whole-brain networks by combining binarized fALFF activity with signed functional correlations and quantified their coevolutionary energy. In the primary analysis with global signal regression (GSR), the ASD group showed significantly more negative global coevolutionary energy (pFDR < 0.002), higher proportions of agreement links, and lower proportions of imbalanced-same links, indicating a systematic redistribution of local motifs rather than a uniform increase in balance. Because GSR can introduce artifactual negative correlations, we repeated all analyses without GSR. In this sensitivity analysis, whole-brain energy and motif differences were attenuated, but bipolarity, a measure of global two-block signed network organization, became the only FDR-significant metric (pFDR = 0.047), with ASD showing higher bipolarity. Intra-network energy differences did not survive FDR correction under either pipeline. Coevolutionary energy showed modest associations with ADI-R and ADOS scores, none of which survived correction across 720 tests. Machine learning classification achieved 77.8% test accuracy (AUC = 0.79) with GSR and 64.7% (AUC = 0.65) without GSR. These findings suggest that coevolutionary balance captures altered signed network organization in ASD, though the specific metric driving group differences depends on preprocessing choices regarding global signal regression.

2504.10524 2026-02-17 q-bio.QM physics.bio-ph physics.med-ph

Hemodynamic Markers: CFD-Based Prediction of Cerebral Aneurysm Rupture Risk

Reza Bozorgpour, Jacob R. Rammer

Comments 16 figures, 24 pages

Abstract

This study investigates the influence of aneurysm evolution on hemodynamic characteristics within the sac region. Using computational fluid dynamics (CFD), blood flow through the parent vessel and aneurysm sac was analyzed to assess the impact on wall shear stress (WSS), time-averaged wall shear stress (TAWSS), and the oscillatory shear index (OSI), key indicators of rupture risk. Additionally, Relative Residence Time (RRT) and Endothelial Cell Activation Potential (ECAP) were examined to provide a broader understanding of the aneurysm's hemodynamic environment. Six distinct cerebral aneurysm (CA) models, all from individuals of the same gender, were selected to minimize gender-related variability. Results showed that unruptured cases exhibited higher WSS and TAWSS, along with lower OSI and RRT values, patterns consistent with stable flow conditions supporting vascular integrity. In contrast, ruptured cases had lower WSS and TAWSS, coupled with elevated OSI and RRT, suggesting disturbed and oscillatory flow commonly linked to aneurysm wall weakening. ECAP was also higher in ruptured cases, indicating increased endothelial activation under unstable flow. Notably, areas with the highest OSI and RRT often aligned with vortex centers, reinforcing the association between disturbed flow and aneurysm instability. These findings highlight the value of combining multiple hemodynamic parameters for rupture risk assessment. Including RRT and ECAP provides deeper insight into flow-endothelium interactions, offering a stronger basis for evaluating aneurysm stability and guiding treatment decisions.
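The abstract above names four indices derived from wall shear stress without giving formulas. The sketch below uses definitions that are common in the hemodynamics literature; whether the paper uses exactly these forms is an assumption. It takes WSS vectors sampled uniformly over one cardiac cycle at a single wall point, so time integrals reduce to sample averages. The function name `hemodynamic_indices` and the sample values are illustrative.

```python
import math
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def hemodynamic_indices(wss: Sequence[Vec3]) -> Dict[str, float]:
    """Compute TAWSS, OSI, RRT and ECAP at one wall point.

    Assumed (commonly used) definitions, for uniformly spaced samples:
      TAWSS = mean of |wss|
      OSI   = 0.5 * (1 - |mean of wss| / TAWSS)
      RRT   = 1 / ((1 - 2 * OSI) * TAWSS)
      ECAP  = OSI / TAWSS
    """
    n = len(wss)
    if n == 0:
        raise ValueError("need at least one WSS sample")
    # Magnitude of the time-averaged WSS vector.
    mean_vec = [sum(v[i] for v in wss) / n for i in range(3)]
    mag_of_mean = math.sqrt(sum(c * c for c in mean_vec))
    # Time average of the WSS magnitude.
    tawss = sum(math.sqrt(sum(c * c for c in v)) for v in wss) / n
    if tawss == 0.0:
        raise ValueError("TAWSS is zero; the indices are undefined")
    osi = 0.5 * (1.0 - mag_of_mean / tawss)
    denom = (1.0 - 2.0 * osi) * tawss
    # Purely oscillatory shear (OSI = 0.5) gives an unbounded residence time.
    rrt = math.inf if denom == 0.0 else 1.0 / denom
    ecap = osi / tawss
    return {"TAWSS": tawss, "OSI": osi, "RRT": rrt, "ECAP": ecap}


# Illustrative samples (arbitrary units): steady versus partly reversing shear.
steady = hemodynamic_indices([(2.0, 0.0, 0.0)] * 4)
reversing = hemodynamic_indices([(3.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
```

Under these definitions, low TAWSS combined with frequent direction reversal raises OSI, RRT and ECAP together, which matches the pattern the abstract reports for ruptured cases.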

2502.05872 2026-02-17 q-bio.PE

Flexible inference of evolutionary accumulation dynamics using uncertain observational data

Jessica Renz, Morten Brun, Iain G. Johnston

Comments Added case studies 2 and 3

Abstract

Understanding and predicting evolutionary accumulation pathways is a key objective in many fields of research, ranging from classical evolutionary biology to diverse applications in medicine. In this context, we are often confronted with the problem that data is sparse and uncertain. To use the available data as well as possible, inference approaches that can handle this uncertainty are required. One approach that can use not only cross-sectional data but also phylogenetically related and longitudinal data is to use "hypercubic inference" models. In this article we introduce HyperLAU, a new algorithm for hypercubic inference that makes it possible to use datasets including uncertainties for learning evolutionary pathways. Expanding the flexibility of accumulation modelling, HyperLAU allows us to infer dynamic pathways and interactions between features, even when large sets of particular features are unobserved across the source dataset. We show that HyperLAU is able to highlight the main pathways found by other tools, even when up to 50% of the features in the input data are uncertain. Additionally, we demonstrate how it can help to overcome possible biases that can occur when the data are reduced by excluding uncertain parts. We illustrate the approach with a case study on multidrug resistance in tuberculosis, showing that HyperLAU accommodates more flexible data and provides new information about evolutionary pathways compared to existing approaches.

2311.17824 2026-02-17 physics.optics physics.med-ph q-bio.QM

Depth-multiplexing spectral domain OCT for full eye length imaging with a single modulation unit

Guanghan Meng, Xue Dong, Andrew Zhang, Fabio Feroldi, Austin Roorda, Laura Waller

Abstract

Clinical measurement of a patient's axial eye length is emerging as a crucial approach to track progression and monitor management of myopia. However, the preferred method for such measurements is swept-source OCT, whose cost prohibits broad use, especially in lower-income communities. Spectral domain (SD) OCT is a more affordable option, but it has a limited imaging depth range, so it is not suitable for full eye length measurement. Depth-multiplexing (DM) techniques for SD-OCT provide a workaround by capturing images at multiple depths within the eye. However, these methods typically require multiple light modulation units or detectors for simultaneous imaging across depths, adding complexity and cost. In response, we propose a novel DM-SD-OCT approach that utilizes a single light modulation unit for depth encoding. We capture images at multiple depths within the eye simultaneously with a single line scan camera, then computationally demix the contributions from different depths. Here, we demonstrate acquisition and demixing of signals from three distinct depths within the eye and validate the method experimentally in human subjects. Our method thus offers a cost-effective solution for comprehensive eye length measurement in clinical myopia research.