arXivDaily: arXiv daily academic digest, updated Monday through Friday
2603.16789 2026-03-18 cs.LG q-bio.QM

Conservative Continuous-Time Treatment Optimization

Nora Schneider, Georg Manten, Niki Kilbertus

Abstract

We develop a conservative continuous-time stochastic control framework for treatment optimization from irregularly sampled patient trajectories. The unknown patient dynamics are modeled as a controlled stochastic differential equation with treatment as a continuous-time control. Naive model-based optimization can exploit model errors and propose out-of-support controls, so optimizing the estimated dynamics may not optimize the true dynamics. To limit extrapolation, we add a consistent signature-based MMD regularizer on path space that penalizes treatment plans whose induced trajectory distribution deviates from observed trajectories. The resulting objective minimizes a computable upper bound on the true cost. Experiments on benchmark datasets show improved robustness and performance compared to non-conservative baselines.

2603.16773 2026-03-18 q-bio.PE

Age-dependent distribution of officially reported cases of vector-borne infections

Francisco Antonio Bezerra Coutinho, Marcos Amaku, Esper Georges Kallas, Eduardo Massad

Abstract

OBJECTIVE: To propose a new approach to analyze the age distribution of reported cases of vector-transmitted infections. METHODS: We used officially reported numbers of cases of dengue, Zika, chikungunya, malaria and leishmaniasis for distinct geographical areas in different periods. The data were treated with a special but well-known procedure that transforms the raw data into an age-dependent density distribution and fits a special continuous function to it. RESULTS: We found that the proportion of age-dependent cases with respect to the total number of cases in a given year (or any transmission season) is probably determined by the ecological interactions between vectors and hosts. The age distribution of the proportion of cases for the three Aedes-related infections is essentially the same, independently of the magnitude of the outbreak and the geographical region considered. On the other hand, for the infections transmitted by other vectors, the age distributions of the proportion of cases are entirely different. CONCLUSIONS: During specific outbreaks, the ratio between the age distribution of the proportion of officially reported cases and the total number of cases for Aedes-transmitted infections such as dengue, chikungunya and Zika is independent of the size of the outbreak, the size of the studied population, the period when the outbreak occurs, and the geographical region considered. Our results also suggest that the age distribution of cases is mainly due to the interaction between vectors and their hosts.

2603.16770 2026-03-18 q-bio.BM

Training a force field for proteins and small molecules from scratch

Alexandre Blanco-González, Thea K Schulze, Evianne Rovers, Joe G Greener

Comments Alexandre Blanco-González and Thea K Schulze contributed equally

Abstract

Force fields for molecular dynamics are usually developed manually, limiting their transferability and making systematic exploration of functional forms challenging. We developed a graph neural network that assigns all force field parameters for diverse molecules using continuous atom typing. The freely-available model, called Garnet, was trained on quantum mechanical, condensed phase and protein nuclear magnetic resonance data without the use of existing parameters. The resulting force field shows comparable performance to current force fields on small molecules, folded proteins, protein complexes and disordered proteins. It shows similar results to popular approaches for relative binding free energy predictions across a range of targets. Assessing different functional forms shows that the double exponential potential is a flexible and accurate alternative to the Lennard-Jones potential. Garnet provides a platform for automated, reproducible force field discovery that brings the benefits of machine learning to classical force fields.

2603.16587 2026-03-18 q-bio.QM cs.CV eess.IV

HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes

Pierre-Antoine Bannier

Abstract

We present HistoAtlas, a pan-cancer computational atlas that extracts 38 interpretable histomic features from 6,745 diagnostic H&E slides across 21 TCGA cancer types and systematically links every feature to survival, gene expression, somatic mutations, and immune subtypes. All associations are covariate-adjusted, multiple-testing corrected, and classified into evidence-strength tiers. The atlas recovers known biology, from immune infiltration and prognosis to proliferation and kinase signaling, while uncovering compartment-specific immune signals and morphological subtypes with divergent outcomes. Every result is spatially traceable to tissue compartments and individual cells, statistically calibrated, and openly queryable. HistoAtlas enables systematic, large-scale biomarker discovery from routine H&E without specialized staining or sequencing. Data and an interactive web atlas are freely available at https://histoatlas.com .

2211.13231 2026-03-18 q-bio.QM cs.LG

Predicting Biomedical Interactions with Probabilistic Model Selection for Graph Neural Networks

Kishan KC, Rui Li, Paribesh Regmi, Anne R. Haake

Abstract

Heterogeneous molecular entities and their interactions, commonly depicted as a network, are crucial for advancing our systems-level understanding of biology. With recent advancements in high-throughput data generation and a significant improvement in computational power, graph neural networks (GNNs) have demonstrated their effectiveness in predicting biomedical interactions. Since GNNs follow a neighborhood aggregation scheme, the number of graph convolution (GC) layers (i.e., depth) determines the neighborhood orders from which they can aggregate information, thereby significantly impacting the model's performance. However, determining an appropriate GNN depth for a given biomedical network often relies on heuristics or extensive experimentation. These methods can be unreliable or result in expensive computational overhead. Moreover, GNNs with more GC layers tend to exhibit poor calibration, leading to high confidence in incorrect predictions. To address these challenges, we propose a Bayesian model selection framework to jointly infer the most plausible number of GC layers supported by the data, apply dropout regularization, and learn network parameters. Experiments on four biomedical interaction datasets demonstrate that our method achieves superior performance over competing methods, providing well-calibrated predictions by allowing GNNs to adapt their depths to accommodate interaction information from various biomedical networks. Source code and data are available at: https://github.com/kckishan/BBGCN-LP/tree/master

2603.16741 2026-03-18 cs.LG q-bio.NC q-bio.QM stat.ML

Bayesian Inference of Psychometric Variables From Brain and Behavior in Implicit Association Tests

Christian A. Kothe, Sean Mullen, Michael V. Bronstein, Grant Hanada, Marcelo Cicconet, Aaron N. McInnes, Tim Mullen, Marc Aafjes, Scott R. Sponheim, Alik S. Widge

Comments 43 pages, 7 figures, 6 tables, submitted to: Journal of Neural Engineering

Abstract

Objective. We establish a principled method for inferring mental health related psychometric variables from neural and behavioral data using the Implicit Association Test (IAT) as the data generation engine, aiming to overcome the limited predictive performance (typically under 0.7 AUC) of the gold-standard D-score method, which relies solely on reaction times. Approach. We propose a sparse hierarchical Bayesian model that leverages multi-modal data to predict experiences related to mental illness symptoms in new participants. The model is a multivariate generalization of the D-score with trainable parameters, engineered for parameter efficiency in the small-cohort regime typical of IAT studies. Data from two IAT variants were analyzed: a suicidality-related E-IAT ($n=39$) and a psychosis-related PSY-IAT ($n=34$). Main Results. Our approach overcomes a high inter-individual variability and low within-session effect size in the dataset, reaching AUCs of 0.73 (E-IAT) and 0.76 (PSY-IAT) in the best modality configurations, though corrected 95% confidence intervals are wide ($\pm 0.18$) and results are marginally significant after FDR correction ($q=0.10$). Restricting the E-IAT to MDD participants improves AUC to 0.79 $[0.62, 0.97]$ (significant at $q=0.05$). Performance is on par with the best reference methods (shrinkage LDA and EEGNet) for each task, even when the latter were adapted to the task, while the proposed method was not. Accuracy was substantially above near-chance D-scores (0.50-0.53 AUC) in both tasks, with more consistent cross-task performance than any single reference method. Significance. Our framework shows promise for enhancing IAT-based assessment of experiences related to entrapment and psychosis, and potentially other mental health conditions, though further validation on larger and independent cohorts will be needed to establish clinical utility.

2603.16562 2026-03-18 cs.CV q-bio.CB q-bio.QM

Understanding Cell Fate Decisions with Temporal Attention

Florian Bürger, Martim Dias Gomes, Adrián E. Granada, Noémie Moreau, Katarzyna Bozek

Comments 10 pages, 6 figures

Abstract

Understanding non-genetic determinants of cell fate is critical for developing and improving cancer therapies, as genetically identical cells can exhibit divergent outcomes under the same treatment conditions. In this work, we present a deep learning approach for cell fate prediction from raw long-term live-cell recordings of cancer cell populations under chemotherapeutic treatment. Our Transformer model is trained to predict cell fate directly from raw image sequences, without relying on predefined morphological or molecular features. Beyond classification, we introduce a comprehensive explainability framework for interpreting the temporal and morphological cues guiding the model's predictions. We demonstrate that prediction of cell outcomes is possible based on the video only: our model achieves a balanced accuracy of 0.94 and an F1-score of 0.93. Attention and masking experiments further indicate that the signal predictive of the cell fate is not uniquely located in the final frames of a cell trajectory, as reliable predictions are possible up to 10 h before the event. Our analysis reveals a distinct temporal distribution of predictive information in the mitotic and apoptotic sequences, as well as the role of cell morphology and p53 signaling in determining cell outcomes. Together, these findings demonstrate that attention-based temporal models enable accurate cell fate prediction while providing biologically interpretable insights into non-genetic determinants of cellular decision-making. The code is available at https://github.com/bozeklab/Cell-Fate-Prediction.

2603.16501 2026-03-18 q-bio.NC

The immediate effect of kangaroo mother care on Mother-infant inter-brain synchrony and infant brain function

Yu Liu, Jiayang Xu, Tianzi Wang, Zichen Shi, Xiang Chen, Yanting Kong, Lianli Chen, Sha Sha, Shanbao Tong, Chuhan Dong, Guanghai Wang, Xiaoli Guo, Fei Bei

Abstract

Kangaroo mother care (KMC) is an intervention involving skin-to-skin contact that promotes physiological stability and supports long-term neurodevelopment in preterm infants. However, the underlying neurophysiological mechanisms remain unclear. We aimed to investigate the immediate effects of the first KMC on infants' brain function, mother-infant inter-brain synchrony, as well as their associations. Fifty-eight preterm infants (gestational age < 32 weeks or birth weight < 1500 g) and their mothers underwent synchronous dual-electroencephalography recording before and during the first KMC session. Infant brain function was assessed via power spectrum energy and graph theory-based network metrics, and mother-infant inter-brain synchrony was quantified using phase-locking value (PLV), from which inter-brain density and inter-brain strength were calculated. Correlation analyses were performed between infant intra-brain metrics and inter-brain synchrony indicators. During the first KMC, preterm infants showed enhanced theta, alpha, and beta power alongside reduced relative delta power, while brain network topological metrics remained stable. Concurrently, mother-infant inter-brain synchrony was significantly enhanced across all frequency bands, as evidenced by increased inter-brain density and strength (all p < .001). Furthermore, in the alpha band, inter-brain strength correlated positively with infant local efficiency and clustering coefficient, and in the beta band, it was positively correlated with infant small-worldness. The first KMC session can immediately enhance both preterm infant single-brain activity and mother-infant inter-brain synchrony. The strength of inter-brain synchrony is associated with the infant's intra-brain network organization, suggesting that KMC may promote intra-brain development in preterm infants via enhancing mother-infant inter-brain synchrony.
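
For readers unfamiliar with the synchrony metric named in this abstract, the phase-locking value between two signals is conventionally written as below. This is the standard textbook form, not necessarily the exact estimator the authors used; the symbols $\phi_x$ and $\phi_y$ (instantaneous phases of one maternal and one infant channel within a frequency band) and $N$ (number of time samples) are notation assumed here.

```latex
\mathrm{PLV} = \left| \frac{1}{N} \sum_{t=1}^{N} e^{\,i\left(\phi_x(t) - \phi_y(t)\right)} \right|,
\qquad 0 \le \mathrm{PLV} \le 1
```

A value near 1 indicates a nearly constant phase difference across samples, while a value near 0 indicates no consistent phase relationship.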

2603.16384 2026-03-18 cs.RO cs.LG q-bio.PE

Controlling Fish Schools via Reinforcement Learning of Virtual Fish Movement

Yusuke Nishii, Hiroaki Kawashima

Comments English translation of the author's 2018 bachelor's thesis. Keywords: fish schooling, reinforcement learning, collective behavior, artificial agents, swarm-machine interaction

Abstract

This study investigates a method to guide and control fish schools using virtual fish trained with reinforcement learning. We utilize 2D virtual fish displayed on a screen to overcome technical challenges such as durability and movement constraints inherent in physical robotic agents. To address the lack of detailed behavioral models for real fish, we adopt a model-free reinforcement learning approach. First, simulation results show that reinforcement learning can acquire effective movement policies even when simulated real fish frequently ignore the virtual stimulus. Second, real-world experiments with live fish confirm that the learned policy successfully guides fish schools toward specified target directions. Statistical analysis reveals that the proposed method significantly outperforms baseline conditions, including the absence of stimulus and a heuristic "stay-at-edge" strategy. This study provides an early demonstration of how reinforcement learning can be used to influence collective animal behavior through artificial agents.

2603.16288 2026-03-18 q-bio.NC

Hippocampus mediates conceptual generalization of pain modulation

Dylan Sutterlin Guindon, Tor D Wager, Leonie Koban

Abstract

Pain is strongly influenced by expectations and learning from previous experience, such as in classical conditioning. Conditioned responses and expectations can generalize to perceptually and conceptually related cues, but how generalization influences pain experience and the neurobiological processing of pain remains unclear. We used fMRI and multilevel mediation analyses to address this question. Thirty-six human participants first learned to associate two visual cues from distinct conceptual categories (e.g., animals vs. vehicles) with high or low levels of heat pain. In a subsequent phase, they were presented novel cues (images, drawings, or words) not previously paired with pain, but which shared the conceptual category of the initial pain-predictive cues. Participants who developed explicit expectations during learning reported greater pain in response to stimuli conceptually related to high-vs. low-pain cues ('generalization stimuli'), demonstrating generalization of cue influences on pain. This effect was mediated by increased pain-related activity to generalization stimuli in the hippocampus, which correlated with individual differences in cue-evoked expectations. A broader network, including areas of the default mode network and striatum, also contributed to conceptual generalization of pain modulation, while threat-related regions such as the amygdala responded to generalization stimuli but did not mediate effects on pain ratings. These findings extend our understanding of expectancy-driven pain modulation by showing how conceptual processes can influence pain and its neurobiological substrates, offering new insight into placebo effects and maladaptive learning in chronic pain.

2603.16194 2026-03-18 q-bio.GN

TPMM: Three-component Posterior Mixture Model Enables Robust Inverton Detection in Low-Depth Metagenomes and Suggests Potential Viral Invertons

Yi Lu, Jiaojiao Guan, Yang Shen, Jiayu Shang, Yanni Sun

Comments 10 pages, 5 figures

Abstract

Bacterial phase variation enables reversible, locus-specific phenotypic switching, often driven by DNA inversion (invertons). To identify these events, researchers commonly rely on sequencing reads that provide orientation-specific support. Metagenomic sequencing, which captures total genetic material independent of cultivation, offers a powerful platform for the comprehensive study of invertons. However, computational inverton calling from metagenomic data is difficult at low sequencing depth: hard read-support cutoffs can miss true events, while sequence-only predictors lack read-backed interpretability and uncertainty quantification. To address this, we present TPMM, a three-component posterior mixture model for inverton calling in metagenomic data. TPMM explicitly incorporates sequencing depth to formulate inverton detection as a probabilistic mixture problem. Starting from candidates flanked by inverted repeats, the model classifies the candidates into noise, low-probability, or high-probability inversion signals using read evidence. Finally, TPMM assigns posterior probabilities as soft labels and applies cumulative Bayesian False Discovery Rate control to robustly identify true invertons. On two real gut metagenomic datasets, TPMM agrees well with PhaseFinder at high depth but recovers substantially more invertons under systematic downsampling, demonstrating superior performance in sparse-data regimes. We further examine potential reversible inversion elements in viral genomes and provide supporting analyses, suggesting a broader scope for inversion-mediated regulation.

2603.16185 2026-03-18 cs.LG cs.AI q-bio.QM

Sample-Efficient Adaptation of Drug-Response Models to Patient Tumors under Strong Biological Domain Shift

Camille Jimenez Cortes, Philippe Lalanda, German Vega

Abstract

Predicting drug response in patients from preclinical data remains a major challenge in precision oncology due to the substantial biological gap between in vitro cell lines and patient tumors. Rather than aiming to improve absolute in vitro prediction accuracy, this work examines whether explicitly separating representation learning from task supervision enables more sample-efficient adaptation of drug-response models to patient data under strong biological domain shift. We propose a staged transfer-learning framework in which cellular and drug representations are first learned independently from large collections of unlabeled pharmacogenomic data using autoencoder-based representation learning. These representations are then aligned with drug-response labels on cell-line data and subsequently adapted to patient tumors using few-shot supervision. Through a systematic evaluation spanning in-domain, cross-dataset, and patient-level settings, we show that unsupervised pretraining provides limited benefit when source and target domains overlap substantially, but yields clear gains when adapting to patient tumors with very limited labeled data. In particular, the proposed framework achieves faster performance improvements during few-shot patient-level adaptation while maintaining comparable accuracy to single-phase baselines on standard cell-line benchmarks. Overall, these results demonstrate that learning structured and transferable representations from unlabeled molecular profiles can substantially reduce the amount of clinical supervision required for effective drug-response prediction, offering a practical pathway toward data-efficient preclinical-to-clinical translation.

2603.16178 2026-03-18 q-bio.NC

Early Pre-Stroke Detection via Wearable IMU-Based Gait Variability and Postural Drift Analysis

Chanakan Chaipan, Aueaphum Aueawatthanaphisut

Comments 6 pages, 5 figures, 20 equations

Abstract

Early identification of individuals at risk of stroke remains a major clinical challenge, as prodromal motor impairments are often subtle and transient. In this pilot study, a wearable sensor-based framework is proposed for early pre-stroke risk screening using a single inertial measurement unit mounted on the sacral region to capture pelvic motion during gait and standing tasks. The pelvis is treated as a biomechanical proxy for global motor control, enabling the quantification of gait variability and postural drift as digital biomarkers of neurological instability. Raw inertial signals are processed using a sensor fusion pipeline to estimate pelvic kinematics, from which variability and nonlinear dynamic features are extracted. These features are subsequently used to train a machine learning model for risk stratification across control, pre-stroke, and stroke groups. Progressive increases in pelvic angular variability and postural instability are observed from the control to stroke groups, with the pre-stroke cohort exhibiting intermediate characteristics. As a proof-of-concept investigation, the proposed framework demonstrates the feasibility of using a minimal wearable configuration to capture pelvic micro-instability associated with early cerebrovascular motor adaptation. The classifier achieves a macro-averaged area under the curve of 0.785, indicating preliminary discriminative capability between risk categories. While not intended for clinical diagnosis, the proposed approach provides a low-cost, non-invasive, and scalable solution for continuous community-level screening, supporting proactive intervention prior to the onset of major stroke events.

2603.16064 2026-03-18 q-bio.PE

Evaluating Targeted Mobility Restrictions on COVID-19 Transmission in Seoul: A Metapopulation Modeling Study Using Mobile Phone Data

Yuna Lim, Jonggul Lee, Eunok Jung

Comments 16 pages (including references), 6 figures

Abstract

Broad mobility restrictions can help control infectious disease spread, but their socioeconomic costs and the variation in transmission risks by mobility purpose, age group, and spatial connectivity highlight the need for targeted approaches. In this study, we developed an age-structured SEIR metapopulation model for COVID-19 across Seoul's 25 districts, integrating mobile phone-derived origin-destination data. We stratified mobility by age (0-19, 20-59, 60+) and purpose: residential (H), school/work (W), and other non-routine (O). Using 2024 mobility data as a baseline and incorporating pandemic-period (2020-2021) mobility deviations, we investigated counterfactual strategies under various targeting scenarios. Our results showed that W restrictions among adults aged 20-59 produced the highest per-capita reductions in infection. Spatial clustering based on population-adjusted W inflows showed that high-inflow central business districts corresponded to the fast-spreading districts identified in the simulations. Targeting W flows into and within this cluster consistently reduced epidemic size across uncertain seeding locations. Furthermore, weekday-inclusive schedules outperformed weekend-only restrictions. Overall, our findings suggest that although citywide restrictions achieve larger reductions, strategically targeting routine school/work mobility among adults aged 20-59 within fast-spreading clusters can provide substantial epidemiological benefits while reducing broader socioeconomic disruption.

2603.15217 2026-03-18 q-bio.PE math.DS q-bio.QM

A multiscale discrete-to-continuum framework for structured population models

Eleonora Agostinelli, Keith L. Chambers, Helen M. Byrne, Mohit P. Dalwadi

Abstract

Mathematical models of biological populations commonly use discrete structure classes to capture trait variation among individuals (e.g. age, size, phenotype, intracellular state). Upscaling these discrete models into continuum descriptions can improve analytical tractability and scalability of numerical solutions. Common upscaling approaches based solely on Taylor expansions may, however, introduce ambiguities in truncation order, uniform validity and boundary conditions. To address this, here we introduce a discrete multiscale framework to systematically derive continuum approximations of structured population models. Using the method of multiple scales and matched asymptotic expansions applied to discrete systems, we identify regions of structure space for which a continuum representation is appropriate and derive the corresponding partial differential equations. The leading-order dynamics are given by a nonlinear advection equation in the bulk domain and advection-diffusion processes in small inner layers about the leading wavefronts and stagnation point. We further derive discrete boundary layer descriptions for regions where a continuum representation is fundamentally inappropriate. Finally, we demonstrate the method on a simple lipid-structured model for early atherosclerosis and verify consistency between the discrete and continuum descriptions. The multiscale framework we present can be applied to other heterogeneous systems with discrete structure in order to obtain appropriate upscaled dynamics with asymptotically consistent boundary conditions.

2603.15175 2026-03-18 stat.ME math.DS q-bio.PE

Bayesian Inference in Epidemic Modelling: A Beginner's Guide

Augustine Okolie

Comments 12 pages, 2 plots

Abstract

This lecture note provides a self-contained introduction to Bayesian inference and Markov Chain Monte Carlo (MCMC) methods for parameter estimation in epidemic models. Using the classical Susceptible-Infectious-Recovered (SIR) compartmental model as a running example, we derive the likelihood function from first principles, specify priors on the transmission and recovery parameters, and implement the Metropolis-Hastings algorithm to sample from the posterior distribution. The note is aimed at graduate students and researchers in mathematical epidemiology with limited prior exposure to Bayesian statistics.
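
The workflow this abstract describes (SIR model, likelihood, priors, Metropolis-Hastings) can be sketched roughly as follows. This is a minimal illustration and not the lecture note's own code: the Gaussian observation noise, the flat priors on (0, 2), the forward-Euler integrator, and every function name and default value are assumptions made for this example.

```python
import math
import random


def simulate_sir(beta, gamma, s0=0.99, i0=0.01, days=30, dt=0.1):
    """Forward-Euler integration of the SIR ODEs; returns the daily infectious fraction."""
    s, i = s0, i0
    steps_per_day = int(round(1 / dt))
    daily_i = []
    for _ in range(days):
        for _ in range(steps_per_day):
            new_inf = beta * s * i * dt   # S -> I flow over one step
            new_rec = gamma * i * dt      # I -> R flow over one step
            s -= new_inf
            i += new_inf - new_rec
        daily_i.append(i)
    return daily_i


def log_likelihood(params, data, sigma=0.01):
    """Assumed Gaussian observation noise around the model's infectious fraction."""
    beta, gamma = params
    model = simulate_sir(beta, gamma, days=len(data))
    return -sum((m - d) ** 2 for m, d in zip(model, data)) / (2 * sigma ** 2)


def log_prior(params):
    """Assumed flat priors on (0, 2) for both rates."""
    beta, gamma = params
    return 0.0 if 0 < beta < 2 and 0 < gamma < 2 else -math.inf


def metropolis_hastings(data, n_iter=2000, step=0.02, seed=0):
    """Random-walk Metropolis-Hastings over (beta, gamma); returns (chain, acceptance rate)."""
    rng = random.Random(seed)
    current = (0.5, 0.2)  # arbitrary starting point inside the prior support
    current_lp = log_prior(current) + log_likelihood(current, data)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = (current[0] + rng.gauss(0, step), current[1] + rng.gauss(0, step))
        lp = log_prior(proposal)
        if lp > -math.inf:
            lp += log_likelihood(proposal, data)
        # Symmetric proposal, so the acceptance probability is min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_iter


if __name__ == "__main__":
    synthetic = simulate_sir(0.3, 0.1)  # noiseless synthetic data from known rates
    samples, rate = metropolis_hastings(synthetic)
    kept = samples[len(samples) // 2:]  # discard the first half as burn-in
    print("acceptance rate:", rate)
    print("posterior mean beta:", sum(b for b, _ in kept) / len(kept))
    print("posterior mean gamma:", sum(g for _, g in kept) / len(kept))
```

The proposal step size and the burn-in fraction are tuning choices for this sketch; in practice they would be adjusted by inspecting the acceptance rate and trace plots.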

2603.12662 2026-03-18 q-bio.NC cs.SY eess.SY

Dual-Laws Model for a theory of artificial consciousness

Yoshiyuki Ohmura, Yasuo Kuniyoshi

Abstract

Objectively verifying the generative mechanism of consciousness is extremely difficult because of its subjective nature. As long as theories of consciousness focus solely on its generative mechanism, developing a theory remains challenging. We believe that broadening the theoretical scope and enhancing theoretical unification are necessary to establish a theory of consciousness. This study proposes seven questions that theories of consciousness should address: phenomena, self, causation, state, function, contents, and universality. The questions were designed to examine the functional aspects of consciousness and its applicability to system design. Next, we will examine how our proposed Dual-Laws Model (DLM) can address these questions. Based on our theory, we anticipate two unique features of a conscious system: autonomy in constructing its own goals and cognitive decoupling from external stimuli. We contend that systems with these capabilities differ fundamentally from machines that merely follow human instructions. This makes a design theory that enables high moral behavior indispensable.

2603.12253 2026-03-18 q-bio.QM

Binding Free Energies without Alchemy

Michael Brocidiacono, Brandon Novy, Rishabh Dey, Konstantin I. Popov, Alexander Tropsha

Comments 14 pages, 4 figures

Abstract

Absolute Binding Free Energy (ABFE) methods are among the most accurate computational techniques for predicting protein-ligand binding affinities, but their utility is limited by the need for many simulations of alchemically modified intermediate states. We propose Direct Binding Free Energy (DBFE), an end-state ABFE method in implicit solvent that requires no alchemical intermediates. DBFE outperforms OBC2 double decoupling on a host-guest benchmark and performs comparably to OBC2 MM/GBSA on a protein-ligand benchmark. Since receptor and ligand simulations can be precomputed and amortized across compounds, DBFE requires only one complex simulation per ligand compared to the many lambda windows needed for double decoupling, making it a promising candidate for virtual screening workflows. We publicly release the code for this method at https://github.com/molecularmodelinglab/dbfe.

2603.09531 2026-03-18 q-bio.QM cs.CV eess.IV stat.AP

Association of Progressive PPFE and Mortality in Lung Cancer Screening Cohorts

Shahab Aslani, Mehran Azimbagirad, Daryl Cheng, Daisuke Yamada, Ryoko Egashira, Adam Szmul, Justine Chan-Fook, Robert Chapman, Alfred Chung Pui So, Shanshan Wang, John McCabe, Tianqi Yang, Jose M Brenes, Eyjolfur Gudmundsson, The SUMMIT Consortium, Susan M. Astley, Daniel C. Alexander, Sam M. Janes, Joseph Jacob

Abstract

Background: Pleuroparenchymal fibroelastosis (PPFE) is an upper lobe predominant fibrotic lung abnormality associated with increased mortality in established interstitial lung disease. However, the clinical significance of radiologic PPFE progression in lung cancer screening (LCS) populations remains unclear. Methods: We analysed longitudinal low-dose CT scans and clinical data from two LCS studies: National Lung Screening Trial (NLST; n=7,980); SUMMIT study (n=8,561). An automated algorithm quantified PPFE volume on baseline and follow-up scans. Annualised change in PPFE was derived and dichotomised using a distribution-based threshold to define progressive PPFE. Associations between progressive PPFE and mortality were evaluated using Cox proportional hazards models adjusted for demographic and clinical variables. In the SUMMIT cohort, associations between progressive PPFE and clinical outcomes were assessed using incidence rate ratios (IRR) and odds ratios (OR). Findings: Progressive PPFE was independently associated with mortality in both LCS cohorts (NLST: Hazard Ratio (HR)=1.25, 95% Confidence Interval (CI): 1.01-1.56, p=0.042; SUMMIT: HR=3.14, 95% CI: 1.66-5.97, p<0.001). Within SUMMIT, progressive PPFE was strongly associated with higher respiratory admissions (IRR=2.79, p<0.001), increased antibiotic and steroid use (IRR=1.55, p=0.010), and showed a trend towards higher modified Medical Research Council scores (OR=1.40, p=0.055). Interpretation: Radiologic PPFE progression is independently associated with mortality across two large LCS cohorts and with adverse clinical outcomes. Quantitative assessment of PPFE progression may provide a clinically relevant imaging biomarker to identify individuals at increased risk of respiratory morbidity within LCS programmes.

2603.06819 2026-03-18 q-bio.QM

Modeling Metabolic State Transitions in Obesity Using a Time-Varying Lambda-Omega Framework

Soheil Saghafi, Gari D. Clifford

Abstract

Obesity does not emerge abruptly; rather, it develops gradually over extended periods. The gradual progression often prevents early recognition of physiological changes until excess adiposity is established. A common belief is that weight reduction can be achieved simply by "eating less and moving more". Although reductions in caloric intake and increases in physical activity are fundamental principles of weight management, this perspective oversimplifies a complex and adaptive biological system. Metabolic rate, hormonal regulation, behavioral factors, and compensatory physiological responses all influence the body's resistance to changes in weight. During weight loss, reduced metabolic rate and increased efficiency make maintaining a caloric deficit increasingly difficult. Conversely, during periods of overfeeding, resting metabolic rate, the thermic effect of food, and non-exercise activity thermogenesis increase with rising body weight, partially offsetting the caloric surplus and slowing weight gain. However, these compensatory responses are asymmetrical, with stronger and more persistent adaptations to underfeeding than to overfeeding. This asymmetry helps explain why weight gain often occurs gradually and why sustained weight loss is biologically challenging. In this work, we employ a lambda-omega model from dynamical systems theory to describe metabolic regulation in response to lifestyle perturbations. We introduce time-varying parameters that allow the regulatory coefficients to evolve gradually under sustained environmental and physiological stressors. By allowing lambda(t) and omega(t) to vary over time, the model captures progressive shifts in the metabolic set-point and deformation of the underlying dynamical landscape. This framework enables exploration of transitions between metabolic states and long-term adaptations that shape trajectories of weight gain and loss.
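
As background for the model class named in this abstract, the classical lambda-omega system from dynamical systems theory has the generic form below. This is the textbook system only: the state variables $u$, $v$ and the amplitude $r$ are notation assumed here, and how the authors' time-varying lambda(t) and omega(t) enter their metabolic model is not specified in the abstract.

```latex
\begin{aligned}
\dot{u} &= \lambda(r)\,u - \omega(r)\,v, \\
\dot{v} &= \omega(r)\,u + \lambda(r)\,v, \qquad r = \sqrt{u^{2} + v^{2}},
\end{aligned}
\qquad\Longleftrightarrow\qquad
\dot{r} = r\,\lambda(r), \quad \dot{\theta} = \omega(r).
```

In polar form the amplitude and phase decouple: $\lambda$ governs growth or decay of the amplitude and $\omega$ sets the rotation rate, so a stable limit cycle sits at any radius $r^{*}$ with $\lambda(r^{*}) = 0$ and $\lambda'(r^{*}) < 0$.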

2509.25386 2026-03-18 cond-mat.stat-mech physics.comp-ph q-bio.PE

Spatial correlations in SIS processes on random regular graphs

Alexander Leibenzon, Samuel W. S. Johnson, Ruth E. Baker, Michael Assaf

Comments 10 pages, 6 figures

Details
Abstract

In network-based SIS models of infectious disease transmission, infection can only occur between directly connected individuals. This constraint naturally gives rise to spatial correlations between the states of neighboring nodes, as the infection status of connected individuals becomes interdependent. Although mean-field approximations and the standard pairwise model are commonly used to simplify disease forecasting on networks, they inadequately capture spatial correlations; mean-field frameworks assume that populations are well-mixed, while the pairwise model neglects correlations beyond nearest-neighbor connections, which leads to inaccurate predictions of infection numbers over time. As such, the development of approximations that account for higher order spatially correlated infections is of great interest, as they offer a compromise between accurate disease forecasting and analytic tractability. Here, we use existing corrections to mean-field theory on the regular lattice to construct a more general framework for equivalent corrections on random regular graph topologies. We derive and simulate a hierarchical system of ordinary differential equations for the time evolution of the spatial correlation function at various geodesic distances on random networks. Solving these equations allows us to predict the time-dependent global infection density, which agrees well with numerical simulations. Our results substantially improve on existing corrections to mean-field theory for infectious individuals in SIS processes and provide an in-depth characterization of how structural randomness in networks affects the dynamical trajectories of infectious diseases on networks.
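For orientation, the uncorrected mean-field baseline that the abstract contrasts against can be sketched as follows. This is the standard well-mixed SIS approximation on a k-regular graph, d(rho)/dt = beta * k * rho * (1 - rho) - gamma * rho, and not the authors' hierarchical correlation equations. The function name and parameter values are assumptions chosen for illustration.

```python
def sis_mean_field(beta, gamma, k, rho0=0.01, t_end=100.0, dt=0.01):
    """Forward-Euler solution of the mean-field SIS equation on a k-regular graph.

    beta  : per-edge infection rate (assumed parameterization)
    gamma : recovery rate
    k     : node degree
    Returns the infected density rho at time t_end.
    """
    rho = rho0
    for _ in range(int(round(t_end / dt))):
        rho += dt * (beta * k * rho * (1.0 - rho) - gamma * rho)
    return rho


# Above threshold (beta * k > gamma): endemic state rho* = 1 - gamma / (beta * k).
rho_endemic = sis_mean_field(beta=0.5, gamma=1.0, k=3)

# Below threshold (beta * k < gamma): the infection dies out.
rho_extinct = sis_mean_field(beta=0.2, gamma=1.0, k=3)
```

Because this baseline ignores correlations between neighboring nodes, it is the kind of approximation the correlation-function hierarchy described above is meant to improve on.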

2509.24657 2026-03-18 q-bio.PE math.DS

Modelling the control of West Nile virus using mosquito reduction methods, vaccination of equids, and human behavioral adaptation to the usage of personal protective equipment

Pride Duve, Felix Gregor Sauer, Renke Lühken

Comments 21 Pages, 1 Supplementary File

Details
Abstract

West Nile virus (WNV) is a mosquito-borne virus in the genus Flavivirus that circulates between mosquitoes and birds, whereas humans, equids, and other mammals are dead-end hosts. Since its emergence in Germany in 2018, the virus has spread across the country, emphasising the need for effective intervention strategies. However, it remains unclear how different strategies should be combined and timed to effectively reduce WNV transmission under temperature-driven dynamics. In this study, we develop a temperature-dependent, process-based model to evaluate the effectiveness of WNV control strategies, such as mosquito reduction methods, equid vaccination, and the use of personal protective equipment (PPE). Human behavioural responses to infection risk are incorporated through imitation dynamics that capture how individuals adopt PPE based on perceived infection risk and social influence. An optimal control problem has been formulated and studied to determine the seasonal timing of mosquito controls under temperature forcing. Results suggest that mosquito control efforts initiated in early spring and intensified in early May may reduce the August peak in the infectious bird population. Moreover, a combined scenario of mosquito control methods, human PPE adoption, and equid vaccination could be the best strategy among dead-end hosts. The analysis of various combinations of constant controls is available as an interactive application, allowing users to explore intervention strategies under different temperature projections corresponding to the low-mitigation (SSP126), intermediate (SSP245), and high-emission (SSP585) scenarios.

2508.14740 2026-03-18 q-bio.PE

Modeling the impact of temperature and bird migration on the spread of West Nile virus

Pride Duve, Felix Sauer, Renke Lühken

Comments 1 supplementary PDF file, 6 videos

Details
Abstract

West Nile virus (WNV) is a climate-sensitive mosquito-borne arbovirus circulating between mosquitoes of the genus Culex and birds, with a potential spillover to humans and other mammals. Recent trends in climatic change, characterized by early and/or prolonged summer seasons, increased temperatures, and above-average rainfall, probably facilitated the spread of WNV in Europe, including Germany. In this work, we formulate a spatial WNV model consisting of a system of parabolic partial differential equations (PDEs), using the concept of diffusion and advection in combination with temperature-dependent parameters, i.e., mosquito biting rate, extrinsic incubation, and mortality rate. Diffusion represents the random movement of both mosquitoes and hosts across space, while advection captures the directed movement of migratory birds. The model is first studied mathematically, and we show that it has non-negative, unique, and bounded solutions in time and space. Numerical simulations of the PDE model are performed using temperature data for Germany (2019-2024). Results obtained from the simulation showed high agreement with the reported WNV cases among birds and equids in Germany. The observed spreading patterns from the year 2018 to 2022 and the year 2024 were mainly driven by temperature in combination with diffusion processes of hosts and vectors. Only for the year 2023 was the additional inclusion of advection for migratory birds important for correctly predicting new hotspots in new locations in Germany.

2505.21777 2026-03-18 cs.LG cond-mat.dis-nn cs.CV q-bio.NC stat.ML

Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Bao Pham, Gabriel Raya, Matteo Negri, Mohammed J. Zaki, Luca Ambrogioni, Dmitry Krotov

Details
Abstract

Dense Associative Memories (DenseAMs) are generalizations of Hopfield networks, which have superior information storage capacity and can store training data points (memories) at local minima of the energy landscape. When the amount of training data exceeds the critical memory storage capacity of these models, new local minima, which are different from the training data, emerge. In Associative Memory these emergent local minima are called $\textit{spurious}\; \textit{states}$, which hinder memory retrieval. In this work, we examine diffusion models (DMs) through the DenseAM lens, viewing their generative process as an attempt at memory retrieval. In the small-data regime, DMs create distinct attractors for each training sample, akin to DenseAMs below the critical memory storage. As the training data size increases, they transition from memorization to generalization. We identify a critical intermediate phase, predicted by DenseAM theory -- the spurious states. In generative modeling, these states are no longer negative artifacts but rather are the first signs of generative capabilities. We characterize the basins of attraction, energy landscape curvature, and computational properties of these previously overlooked states. Their existence is demonstrated across a wide range of architectures and datasets.

2503.18855 2026-03-18 q-bio.OT

Boundary effects in biological planar networks: pentagons dominate Pyropia marginal cells

Kai Xu, Fei He

Details
Journal ref
PHILOSOPHICAL MAGAZINE, 2026
Abstract

The topological and geometrical features at the boundary zone of planar polygonal networks remain poorly understood. Based on observations and mathematical proofs, we propose that marginal cells in the thalli of Pyropia haitanensis, a two-dimensional (2D) biological polygonal network, have an average edge number of approximately five. We demonstrate that this number is maintained by specific division patterns. Furthermore, we observe that both marginal cells and inner cells follow the trends predicted by the Lewis law and Aboav-Weaire law, but each cell type requires its own set of correlation parameters to more accurately describe its topological and geometrical features. The boundary effects are also evident in the differences between marginal cells and inner cells in terms of the distributions of interior angles and edge lengths. Similar to inner cells, cell division tends to occur in marginal cells with large sizes and transects a pair of unconnected edges. In particular, this study finds that the division of marginal cells preferentially transects the marginal edge. These specific topological and geometrical features of marginal cells and division patterns may inform the development of modelling algorithms for boundary conditions in biological 2D cellular networks.

2410.03385 2026-03-18 cs.LG q-bio.NC

From Epilepsy Seizures Classification to Detection: A Deep Learning-based Approach for Raw EEG Signals

Davy Darankoum, Manon Villalba, Clelia Allioux, Baptiste Caraballo, Carine Dumont, Eloise Gronlier, Corinne Roucard, Yann Roche, Chloe Habermacher, Sergei Grudinin, Julien Volle

Comments 25 pages, 3 tables, 5 figures

Details
Abstract

Epilepsy represents the most prevalent neurological disease in the world. One-third of people suffering from mesial temporal lobe epilepsy (MTLE) exhibit drug resistance, underscoring the need to develop new treatments. A key part of anti-seizure medication (ASM) development is the capability of detecting and quantifying epileptic seizures occurring in electroencephalogram (EEG) signals, which is crucial for treatment efficacy evaluation. In this study, we introduced a seizure detection pipeline based on deep learning models applied to raw EEG signals. This pipeline integrates: a new pre-processing technique which segments continuous raw EEG signals without prior distinction between seizure and seizure-free activities; a post-processing algorithm developed to reassemble EEG segments and allow the identification of seizure start and end; and finally, a new evaluation procedure based on a strict comparison of seizure events between predicted and real labels. Model training was performed using a data splitting strategy which addresses the potential for data leakage. We demonstrated the fundamental differences between a seizure classification and a seizure detection task and showed the differences in performance between the two tasks. Finally, we demonstrated the generalization capabilities across species of our best architecture, combining a Convolutional Neural Network and a Transformer encoder. The model was trained on animal EEGs and tested on human EEGs with an F1-score of 93% on a balanced Bonn dataset.

2407.19892 2026-03-18 stat.ML cs.LG q-bio.GN

Making Multi-Axis Gaussian Graphical Models Scalable to Millions of Cells

Bailey Andrew, Erica L. Harris, James A. Poulter, David R. Westhead, Luisa Cutillo

Comments 8 pages (35 with appendix+references), 8 figures, 10 tables

Details
Abstract

Motivation: Networks underlie the generation and interpretation of many biological datasets: gene networks shed light on the regulatory structure of the genome, and cell networks can capture structure of the tumor micro-environment. However, most methods that learn such networks make the faulty 'independence assumption'; to learn the gene network, they assume that no cell network exists. 'Multi-axis' methods, which do not make this assumption, fail to scale beyond a few thousand cells or genes. This limits their applicability to only the smallest datasets. Results: We develop a multi-axis method capable of processing million-cell datasets within minutes. This was previously impossible, and unlocks the use of such methods on modern scRNA-seq datasets, as well as more complex datasets. We show that our method yields novel biological insights from real single-cell data, and compares favorably to the existing hdWGCNA methodology. In particular, it identifies long non-coding RNA genes that potentially have a regulatory or functional role in neuronal development. Availability and implementation: Our methodology is available as a Python package GmGM on PyPI (https://pypi.org/project/GmGM/0.5.3/). The code for all experiments performed in this paper is available on GitHub (https://github.com/BaileyAndrew/GmGM-Bioinformatics). Contact: sceba@leeds.ac.uk Supplementary information: Our proofs, and some additional experiments, are available in the supplementary material. Keywords: gaussian graphical models, multi-axis models, transcriptomics, multi-omics, scalability

2405.08979 2026-03-18 cs.LG q-bio.MN q-bio.QM

drGT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network

Yoshitaka Inoue, Hunmin Lee, Tianfan Fu, Rui Kuang, Augustin Luna

Details
Abstract

For translational impact, both accurate drug response prediction and biological plausibility of predictive features are needed. We present drGT, a heterogeneous graph deep learning model over drugs, genes, and cell lines that couples prediction with mechanism-oriented interpretability via attention coefficients (ACs). We assess both predictive generalization (random, unseen-drug, unseen-cell, and zero-shot splits) and biological plausibility (use of text-mined PubMed gene-drug co-mentions and comparison to a structure-based DTI predictor) on GDSC, NCI60, and CTRP datasets. Across benchmarks, drGT consistently delivers top regression performance while maintaining competitive classification accuracy for drug sensitivity. Under random 5-fold cross-validation, drGT attains an AUROC of up to 0.945 (3rd overall) and an $R^2$ up to 0.690, outperforming all baselines on regression. In leave-one-out tests for unseen cell lines and drugs, drGT achieves AUROCs of 0.706 and 0.844, and $R^2$ values of 0.692 and 0.022, the only model yielding positive $R^2$ for unseen drugs. In zero-shot prediction, drGT achieves an AUROC of 0.786 and a regression $R^2$ of 0.334, both representing the highest scores among all models. For interpretability, AC-derived drug-gene links recover known biology: among 976 drugs with known DTIs, 36.9% of predicted links match established DTIs, and 63.7% are supported by either PubMed abstracts or a structure-based predictive model. Enrichment analyses of AC-prioritized genes reveal drug-perturbed biological processes, providing pathway-level explanations. drGT advances predictive generalization and mechanism-centered interpretability, offering state-of-the-art regression accuracy and literature-supported biological hypotheses that demonstrate the use of graph learning from heterogeneous input data for biological discovery. Code: https://github.com/sciluna/drGT

2012.14309 2026-03-18 q-bio.PE cond-mat.soft cs.CL physics.bio-ph

General Mechanism of Evolution Shared by Proteins and Words

Li-Min Wang, Hsing-Yi Lai, Sun-Ting Tsai, Chen Siang Ng, Kevin Sheng-Kai Ma, Shan-Jyun Wu, Meng-Xue Tsai, Yi-Ching Su, Daw-Wei Wang, Tzay-Ming Hong

Details
Abstract

Complex systems, such as life and languages, are governed by principles of evolution. The analogy and comparison between biology and linguistics provide a computational foundation for characterizing and analyzing protein sequences, human corpora, and their evolution. However, no general mathematical formula has been proposed so far to illuminate the origin of quantitative hallmarks shared by life and language. Here we show several new statistical relationships shared by proteins and words, which inspire us to establish a general mechanism of evolution with explicit formulations that can incorporate both old and new characteristics. We found natural selection can be quantified via the entropic formulation by the principle of least effort to determine the sequence variation that survives in evolution. Besides, the origin of power law behavior and how changes in the environment stimulate the emergence of new proteins and words can also be explained via the introduction of a function connection network. Our results demonstrate not only the correspondence between genetics and linguistics over their different hierarchies but also new fundamental physical properties for the evolution of complex adaptive systems. We anticipate our statistical tests can function as quantitative criteria to examine whether an evolution theory of sequence is consistent with the regularity of real data. In the meantime, their correspondence broadens the bridge to exchange existing knowledge, spurs new interpretations, and opens Pandora's box to release several potentially revolutionary challenges. For example, does linguistic arbitrariness conflict with the dogma that structure determines function?

2603.15711 2026-03-18 cs.AI cs.IR q-bio.QM

Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease

Giang Pham, Rebecca Finetti, Caterina Graziani, Bianca Roncaglia, Asma Bendjeddou, Linda Brodo, Sara Brunetti, Moreno Falaschi, Stefano Forti, Silvia Giulia Galfré, Paolo Milazzo, Corrado Priami, Annalisa Santucci, Ottavia Spiga, Alina Sîrbu

Details
Abstract

Alkaptonuria (AKU) is an ultra-rare autosomal recessive metabolic disorder caused by mutations in the HGD (Homogentisate 1,2-Dioxygenase) gene, leading to a pathological accumulation of homogentisic acid (HGA) in body fluids and tissues. This leads to systemic manifestations, including premature spondyloarthropathy, renal and prostatic stones, and cardiovascular complications. Being ultra-rare, the amount of data related to the disease is limited, both in terms of clinical data and literature. Knowledge graphs (KGs) can help connect the limited knowledge about the disease (basic mechanisms, manifestations and existing therapies) with other knowledge; however, AKU is frequently underrepresented or entirely absent in existing biomedical KGs. In this work, we apply a text-mining methodology based on PubTator3 for large-scale extraction of biomedical relations. We construct two KGs of different sizes, validate them using existing biochemical knowledge and use them to extract genes, diseases and therapies possibly related to AKU. This computational framework reveals the systemic interactions of the disease, its comorbidities, and potential therapeutic targets, demonstrating the efficacy of our approach in analyzing rare metabolic disorders.