arXivDaily: arXiv daily academic digest, updated Monday to Friday
2602.04762 2026-02-05 q-bio.PE stat.AP stat.OT

Uncertainty in Island-based Ecosystem Services and Climate Change

Nazli Demirel, Ioannis N. Vogiatzakis, George Zittis, Mirela Tase, Attila D. Sandor, Savvas Zotos, Christos Zoumides, Turgay Dindaroglu, Mauro Fois, Irene Christoforidi, Valentini Stamatiadou, Shiri Zemah-Shamir, Tamer Albayrak, Cigdem Kaptan Ayhan, Paraskevi Manolaki, Ina Sieber, Ziv Zemah-Shamir, Elli Tzirkalli, Aristides Moustakas

Abstract

Small and medium-sized islands are acutely exposed to climate change and ecosystem degradation, yet the extent to which uncertainty is systematically addressed in scientific assessments of their ecosystem services remains poorly understood. This study revisits 226 peer-reviewed articles drawn from two global systematic reviews on island ecosystem services and climate change, applying a structured post hoc analysis to evaluate how uncertainty is treated across methods, service categories, ecosystem realms, and decision contexts. Studies were classified according to whether uncertainty was explicitly analysed, merely mentioned, or ignored. Only 30 percent of studies incorporated uncertainty explicitly, while more than half did not address it at all. Scenario-based approaches dominated uncertainty assessment, whereas probabilistic and ensemble-based frameworks remained limited. Cultural ecosystem services and extreme climate impacts exhibited the lowest levels of uncertainty integration, and few studies connected uncertainty treatment to policy-relevant decision frameworks. Weak or absent treatment of uncertainty emerges as a structural challenge in island systems, where narrow ecological thresholds, strong land-sea coupling, limited spatial buffers, and reduced institutional redundancy amplify the consequences of decision-making under incomplete knowledge. Systematic mapping of how uncertainty is framed, operationalised, or neglected reveals persistent methodological and conceptual gaps and informs concrete directions for strengthening uncertainty integration in future island-focused ecosystem service and climate assessments. Embedding uncertainty more robustly into modelling practices, participatory processes, and policy tools is essential for enhancing scientific credibility, governance relevance, and adaptive capacity in insular socio-ecological systems.

2602.04512 2026-02-05 q-bio.NC cs.AI

BrainVista: Modeling Naturalistic Brain Dynamics as Multimodal Next-Token Prediction

Xuanhua Yin, Runkai Zhao, Lina Yao, Weidong Cai

Comments 17 pages, 7 figures, 11 tables

Abstract

Naturalistic fMRI characterizes the brain as a dynamic predictive engine driven by continuous sensory streams. However, modeling the causal forward evolution in realistic neural simulation is impeded by the timescale mismatch between multimodal inputs and the complex topology of cortical networks. To address these challenges, we introduce BrainVista, a multimodal autoregressive framework designed to model the causal evolution of brain states. BrainVista incorporates Network-wise Tokenizers to disentangle system-specific dynamics and a Spatial Mixer Head that captures inter-network information flow without compromising functional boundaries. Furthermore, we propose a novel Stimulus-to-Brain (S2B) masking mechanism to synchronize high-frequency sensory stimuli with hemodynamically filtered signals, enabling strict, history-only causal conditioning. We validate our framework on Algonauts 2025, CineBrain, and HAD, achieving state-of-the-art fMRI encoding performance. In long-horizon rollout settings, our model yields substantial improvements over baselines, increasing pattern correlation by 36.0% and 33.3% relative to the strongest baseline on Algonauts 2025 and CineBrain, respectively.

2602.04492 2026-02-05 q-bio.NC cs.AI cs.LG

Discovering Mechanistic Models of Neural Activity: System Identification in an in Silico Zebrafish

Jan-Matthis Lueckmann, Viren Jain, Michał Januszewski

Abstract

Constructing mechanistic models of neural circuits is a fundamental goal of neuroscience, yet verifying such models is limited by the lack of ground truth. To rigorously test model discovery, we establish an in silico testbed using neuromechanical simulations of a larval zebrafish as a transparent ground truth. We find that LLM-based tree search autonomously discovers predictive models that significantly outperform established forecasting baselines. Conditioning on sensory drive is necessary but not sufficient for faithful system identification, as models exploit statistical shortcuts. Structural priors prove essential for enabling robust out-of-distribution generalization and recovery of interpretable mechanistic models. Our insights provide guidance for modeling real-world neural recordings and offer a broader template for AI-driven scientific discovery.

2602.04481 2026-02-05 physics.soc-ph cs.GT q-bio.PE

The impact of heterogeneity on the co-evolution of cooperation and epidemic spreading in complex networks

Mehran Noori, Nahid Azimi-Tafreshi, Mohammad Salahshour

Comments 11 pages, 8 figures

Abstract

The dynamics of herd immunity depend crucially on the interaction between collective social behavior and disease transmission, but the role of heterogeneity in this context frequently remains unclear. Here, we dissect this co-evolutionary feedback by coupling a public goods game with an epidemic model on complex networks, including multiplex and real-world networks. Our results reveal a dichotomy in how heterogeneity shapes outcomes. We demonstrate that structural heterogeneity in social networks acts as a powerful catalyst for cooperation and disease suppression. This emergent effect is driven by highly connected hubs who, facing amplified personal risk, adopt protective strategies out of self-interest. In contrast, heterogeneity in individual infection costs proves detrimental, undermining cooperation and amplifying the epidemic. This creates a "weakest link" problem, where individuals with low perceived risk act as persistent free-riders and disease reservoirs, degrading the collective response. Our findings establish that heterogeneity is a double-edged sword: its impact is determined by whether it creates an asymmetry of influence (leverage points) or an asymmetry of motivation (weakest links). This points to disease intervention policies that facilitate the cooperative transition in hubs (strengthening the leverage point) and homogenize incentives for the weakest links.

2602.04437 2026-02-05 math.PR math.DS q-bio.PE

How seed banks evolve in plants: a stochastic dynamical system subject to a strong drift

Alison Etheridge, João Luiz de Oliveira Madeira

Comments 79 pages, 5 figures

Abstract

We study how changes in population size and fluctuating environmental conditions influence the establishment of seed banks in plants. Our model is a modification of the Wright-Fisher model with seed bank, introduced by Kaj, Krone and Lascoux. We distinguish between wild type individuals, producing only nondormant seeds, and mutants, producing seeds with dormancy. To understand how changing population size shapes the establishment of seed banks, we analyse the process under a diffusive scaling. The results support the biological insight that seed banks are favoured in a declining population, and disfavoured if population size is constant or increasing. The surprise is that this is true even when population sizes are changing very slowly -- over evolutionary timescales. We also investigate the influence of short-term fluctuations, such as annual variations in rainfall or temperature. Mathematically, our analysis reduces to a stochastic dynamical system forced onto a manifold by a large drift, which converges under scaling to a diffusion on the manifold. Inspired by the Lyapunov--Schmidt reduction, we derive an explicit formula for the limiting diffusion coefficients by projecting the system onto its linear counterpart. This provides a general framework for deriving diffusion approximations in models with strong drift and nonlinear constraints.

2601.11653 2026-02-05 q-bio.NC cs.LG cs.MA

AI Agents Need Memory Control Over More Context

Fouad Bousetouane

Abstract

AI agents are increasingly used in long, multi-turn workflows in both research and enterprise settings. As interactions grow, agent behavior often degrades due to loss of constraint focus, error accumulation, and memory-induced drift. This problem is especially visible in real-world deployments where context evolves, distractions are introduced, and decisions must remain consistent over time. A common practice is to equip agents with persistent memory through transcript replay or retrieval-based mechanisms. While convenient, these approaches introduce unbounded context growth and are vulnerable to noisy recall and memory poisoning, leading to unstable behavior and increased drift. In this work, we introduce the Agent Cognitive Compressor (ACC), a bio-inspired memory controller that replaces transcript replay with a bounded internal state updated online at each turn. ACC separates artifact recall from state commitment, enabling stable conditioning while preventing unverified content from becoming persistent memory. We evaluate ACC using an agent-judge-driven live evaluation framework that measures both task outcomes and memory-driven anomalies across extended interactions. Across scenarios spanning IT operations, cybersecurity response, and healthcare workflows, ACC consistently maintains bounded memory and exhibits more stable multi-turn behavior, with significantly lower hallucination and drift than transcript replay and retrieval-based agents. These results show that cognitive compression provides a practical and effective foundation for reliable memory control in long-horizon AI agents.

2512.03312 2026-02-05 q-bio.BM cs.LG

Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Daniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori, Ron O. Dror

Comments Project page: https://github.com/drorlab/conformix

Journal ref NeurIPS 2025

Abstract

The function of biomolecules such as proteins depends on their ability to interconvert between a wide range of structures or "conformations." Researchers have endeavored for decades to develop computational methods to predict the distribution of conformations, which is far harder to determine experimentally than a static folded structure. We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. Our approach upgrades diffusion models -- whether trained for static structure prediction or conformational generation -- to enable more efficient discovery of conformational variability without requiring prior knowledge of major degrees of freedom. ConforMix is orthogonal to improvements in model pretraining and would benefit even a hypothetical model that perfectly reproduced the Boltzmann distribution. Remarkably, when applied to a diffusion model trained for static structure prediction, ConforMix captures structural changes including domain motion, cryptic pocket flexibility, and transporter cycling, while avoiding unphysical states. Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.

2501.04147 2026-02-05 q-bio.QM

A Framework for Building Enviromics Matrices in Mixed Models

B. A. Trevisan, V. S. Junqueira, B. M. Florencio, A. S. G. Coelho, G. E. Marcatti, R. T. Resende

Comments 17 pages, 3 figures

Journal ref Brazilian Journal of Biometrics 43 (2025) e-43865

Abstract

This study introduces a framework for constructing enviromics matrices in mixed models to integrate genetic and environmental data to enhance phenotypic predictions in plant breeding. Enviromics utilizes diverse data sources, such as climate and soil, to characterize genotype-by-environment (GxE) interactions. The approach employs block-diagonal structures in the design matrix to incorporate random effects from genetic and envirotypic covariates across trials. The covariance structure is modeled using the Kronecker product of the genetic relationship matrix and an identity matrix representing envirotypic effects, capturing genetic and environmental variability. This dual representation enables more accurate crop performance predictions across environments, improving selection strategies in breeding programs. The framework is compatible with existing mixed model software, including rrBLUP and BGLR, and can be extended for more complex interactions. By combining genetic relationships and environmental influences, this approach offers a powerful tool for advancing GxE studies and accelerating the development of improved crop varieties.
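
The Kronecker-product covariance structure described in this abstract can be illustrated with a minimal sketch. The 2x2 relationship matrix `G`, the number of environments `n_env`, and the helper names `kron` and `identity` are assumptions made for illustration; they are not taken from the paper or from rrBLUP/BGLR.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical genetic relationship matrix for two genotypes (illustrative values).
G = [[1.0, 0.5],
     [0.5, 1.0]]

# Hypothetical number of environments; the identity treats them as uncorrelated.
n_env = 3

# Covariance over all genotype-by-environment combinations. Rows and columns
# are ordered genotype-major: (g0,e0), (g0,e1), (g0,e2), (g1,e0), ...
K = kron(G, identity(n_env))

for row in K:
    print(row)
```

With this ordering, two observations covary only when they share an environment, and the size of that covariance is the genetic relationship between the two genotypes.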

2602.04270 2026-02-05 cs.LG q-bio.NC q-bio.QM stat.ML

Multi-Integration of Labels across Categories for Component Identification (MILCCI)

Noga Mudrik, Yuxi Chen, Gal Mishne, Adam S. Charles

Abstract

Many fields collect large-scale temporal data through repeated measurements (trials), where each trial is labeled with a set of metadata variables spanning several categories. For example, a trial in a neuroscience study may be linked to a value from category (a): task difficulty, and category (b): animal choice. A critical challenge in time-series analysis is to understand how these labels are encoded within the multi-trial observations, and disentangle the distinct effect of each label entry across categories. Here, we present MILCCI, a novel data-driven method that i) identifies the interpretable components underlying the data, ii) captures cross-trial variability, and iii) integrates label information to understand each category's representation within the data. MILCCI extends a sparse per-trial decomposition that leverages label similarities within each category to enable subtle, label-driven cross-trial adjustments in component compositions and to distinguish the contribution of each category. MILCCI also learns each component's corresponding temporal trace, which evolves over time within each trial and varies flexibly across trials. We demonstrate MILCCI's performance through both synthetic and real-world examples, including voting patterns, online page view trends, and neuronal recordings.

2602.04095 2026-02-05 q-bio.NC cs.AI cs.ET

A computational account of dreaming: learning and memory consolidation

Qi Zhang

Comments 30 pages, 4 tables, 2 figures

Journal ref Cognitive System Research, 2009

Abstract

A number of studies have concluded that dreaming is mostly caused by randomly arriving internal signals because "dream contents are random impulses", and argued that dream sleep is unlikely to play an important part in our intellectual capacity. On the contrary, numerous functional studies have revealed that dream sleep does play an important role in our learning and other intellectual functions. Specifically, recent studies have suggested the importance of dream sleep in memory consolidation, following the findings of neural replaying of recent waking patterns in the hippocampus. The randomness has been the hurdle that divides dream theories into either functional or functionless. This study presents a cognitive and computational model of the dream process. This model is simulated to perform the functions of learning and memory consolidation, which are the two most popular dream functions that have been proposed. The simulations demonstrate that random signals may result in learning and memory consolidation. Thus, dreaming is proposed as a continuation of the brain's waking activities that processes signals activated spontaneously and randomly from the hippocampus. The characteristics of the model are discussed and found to be in agreement with many characteristics concluded from various empirical studies.

2602.04021 2026-02-05 cs.LG q-bio.QM stat.ML

Group Contrastive Learning for Weakly Paired Multimodal Data

Aditya Gorla, Hugues Van Assel, Jan-Christian Huetter, Heming Yao, Kyunghyun Cho, Aviv Regev, Russell Littman

Abstract

We present GROOVE, a semi-supervised multi-modal representation learning approach for high-content perturbation data where samples across modalities are weakly paired through shared perturbation labels but lack direct correspondence. Our primary contribution is GroupCLIP, a novel group-level contrastive loss that bridges the gap between CLIP for paired cross-modal data and SupCon for uni-modal supervised contrastive learning, addressing a fundamental gap in contrastive learning for weakly-paired settings. We integrate GroupCLIP with an on-the-fly backtranslating autoencoder framework to encourage cross-modally entangled representations while maintaining group-level coherence within a shared latent space. Critically, we introduce a comprehensive combinatorial evaluation framework that systematically assesses representation learners across multiple optimal transport aligners, addressing key limitations in existing evaluation strategies. This framework includes novel simulations that systematically vary shared versus modality-specific perturbation effects enabling principled assessment of method robustness. Our combinatorial benchmarking reveals that there is not yet an aligner that uniformly dominates across settings or modality pairs. Across simulations and two real single-cell genetic perturbation datasets, GROOVE performs on par with or outperforms existing approaches for downstream cross-modal matching and imputation tasks. Our ablation studies demonstrate that GroupCLIP is the key component driving performance gains. These results highlight the importance of leveraging group-level constraints for effective multi-modal representation learning in scenarios where only weak pairing is available.

2602.04008 2026-02-05 q-bio.TO

Mathematical simulations of pediatric hemodynamics in isolated ventricular septal defect

Mitchel J. Colebank, Alfonso Limon, Anthony Chang, Brandon Wong, Wyman Lai, Hamilton Baker

Comments 11 Figures, 1 Supplemental Figure

Abstract

Computer modeling of the cardiovascular system has the potential to revolutionize personalized medical care. This is especially promising for congenital heart defects, such as ventricular septal defect (VSD), a hole between the two ventricles of the heart. However, relatively few studies have built computer models for VSD, and these have not considered how natural adaptation of the cardiovascular system with age might interact with the presence of a small, medium, or large VSD. Here, we combine a lumped parameter model of the cardiovascular system with two key modeling components: a size-dependent resistance dictating shunt flow between the two ventricles and age-dependent scaling relationships for the systemic and pulmonary circulations. Our results provide insight into changes in hemodynamic conditions with various VSD sizes. We investigate the combined effects of VSD size, vascular parameters, and age, showing distinct differences across these three factors. This study lays the necessary foundation for studying VSD and for building digital shadows and digital twins for managing VSD in pediatrics.

2602.03902 2026-02-05 q-bio.QM cs.AI cs.LG

All-Atom GPCR-Ligand Simulation via Residual Isometric Latent Flow

Jiying Zhang, Shuhao Zhang, Pierre Vandergheynst, Patrick Barth

Comments 36 pages

Abstract

G-protein-coupled receptors (GPCRs), primary targets for over one-third of approved therapeutics, rely on intricate conformational transitions to transduce signals. While Molecular Dynamics (MD) is essential for elucidating this transduction process, particularly within ligand-bound complexes, conventional all-atom MD simulation is computationally prohibitive. In this paper, we introduce GPCRLMD, a deep generative framework for efficient all-atom GPCR-ligand simulation. GPCRLMD employs a Harmonic-Prior Variational Autoencoder (HP-VAE) to first map the complex into a regularized isometric latent space, preserving geometric topology via physics-informed constraints. Within this latent space, a Residual Latent Flow samples evolution trajectories, which are subsequently decoded back to atomic coordinates. By capturing temporal dynamics via relative displacements anchored to the initial structure, this residual mechanism effectively decouples static topology from dynamic fluctuations. Experimental results demonstrate that GPCRLMD achieves state-of-the-art performance in GPCR-ligand dynamics simulation, faithfully reproducing thermodynamic observables and critical ligand-receptor interactions.

2602.00157 2026-02-05 q-bio.QM cs.AI q-bio.BM

ProDCARL: Reinforcement Learning-Aligned Diffusion Models for De Novo Antimicrobial Peptide Design

Fang Sheng, Mohammad Noaeen, Zahra Shakeri

Abstract

Antimicrobial resistance threatens healthcare sustainability and motivates low-cost computational discovery of antimicrobial peptides (AMPs). De novo peptide generation must optimize antimicrobial activity and safety through low predicted toxicity, but likelihood-trained generators do not enforce these goals explicitly. We introduce ProDCARL, a reinforcement-learning alignment framework that couples a diffusion-based protein generator (EvoDiff OA-DM 38M) with sequence property predictors for AMP activity and peptide toxicity. We fine-tune the diffusion prior on AMP sequences to obtain a domain-aware generator. Top-k policy-gradient updates use classifier-derived rewards plus entropy regularization and early stopping to preserve diversity and reduce reward hacking. In silico experiments show ProDCARL increases the mean predicted AMP score from 0.081 after fine-tuning to 0.178. The joint high-quality hit rate reaches 6.3% with pAMP > 0.7 and pTox < 0.3. ProDCARL maintains high diversity, with 1 - mean pairwise identity equal to 0.929. Qualitative analyses with AlphaFold3 and ProtBERT embeddings suggest candidates show plausible AMP-like structural and semantic characteristics. ProDCARL serves as a candidate generator that narrows experimental search space, and experimental validation remains future work.
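
The two summary metrics quoted in this abstract, the joint hit rate (pAMP > 0.7 and pTox < 0.3) and diversity as 1 - mean pairwise identity, can be sketched as follows. The abstract does not say how pairwise identity is computed, so the position-wise definition below is an assumption, and the function names, peptide strings, and scores are hypothetical illustrations rather than the paper's data.

```python
def pairwise_identity(a, b):
    """Fraction of matching positions, normalised by the longer sequence.

    Assumed definition: an ungapped, position-wise comparison. The paper may
    use an alignment-based identity instead.
    """
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def diversity(seqs):
    """1 - mean pairwise identity over all unordered pairs of sequences."""
    pairs = [(i, j) for i in range(len(seqs)) for j in range(i + 1, len(seqs))]
    mean_identity = sum(pairwise_identity(seqs[i], seqs[j]) for i, j in pairs) / len(pairs)
    return 1.0 - mean_identity


def joint_hit_rate(scores, amp_min=0.7, tox_max=0.3):
    """Share of candidates with pAMP above amp_min and pTox below tox_max."""
    hits = sum(1 for p_amp, p_tox in scores if p_amp > amp_min and p_tox < tox_max)
    return hits / len(scores)


# Hypothetical peptides and (pAMP, pTox) predictor outputs, for illustration only.
peptides = ["KKLL", "KKLA", "GGLL"]
predictions = [(0.9, 0.1), (0.8, 0.5), (0.2, 0.1), (0.75, 0.2)]

print(diversity(peptides))
print(joint_hit_rate(predictions))
```

Both thresholds are strict inequalities, matching how the abstract states them.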

2601.15313 2026-02-05 q-bio.NC cs.AI

Attention Is Not Retention: The Orthogonality Constraint in Infinite-Context Architectures

Oliver Zahn, Matt Beton, Simran Chana

Comments 32 Pages, 7 Figures

Abstract

Biological memory solves a problem that eludes current AI: storing specific episodic facts without corrupting general semantic knowledge. Complementary Learning Systems theory explains this through two subsystems - a fast hippocampal system using sparse, pattern-separated representations for episodes, and a slow neocortical system using distributed representations for statistical regularities. Current AI systems lack this separation, attempting to serve both functions through neural weights alone. We identify the Orthogonality Constraint: reliable memory requires orthogonal keys, but semantic embeddings cannot be orthogonal because training clusters similar concepts together. The result is Semantic Interference (connecting to what cognitive psychologists have long observed in human memory), where neural systems writing facts into shared continuous parameters collapse to near-random accuracy within tens of semantically related facts. Through semantic density (rho), the mean pairwise cosine similarity, we show collapse occurs at N=5 facts (rho > 0.6) or N ~ 20-75 (moderate rho). We validate across modalities: 16,309 Wikipedia facts, scientific measurements (rho = 0.96, 0.02% accuracy at N=10,000), and image embeddings (rho = 0.82, 0.05% at N=2,000). This failure is geometric - no increase in model capacity can overcome interference when keys share semantic overlap. We propose Knowledge Objects (KOs): structured facts with hash-based identity, controlled vocabularies, and explicit version chains. On Wikipedia facts, KO retrieval achieves 45.7% where Modern Hopfield Networks collapse to near-zero; hash-based retrieval maintains 100%. Production systems (Claude Memory, ChatGPT Memory) store unstructured text, causing schema drift (40-70% consistency) and version ambiguity. Knowledge Objects provide the discrete hippocampal component that enables reliable bicameral memory.
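
Semantic density, defined in this abstract as the mean pairwise cosine similarity (rho), can be computed with a short sketch. The function names and the toy embedding vectors are assumptions for illustration; the paper's own embeddings and code are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_density(embeddings):
    """Mean cosine similarity over all unordered pairs of embeddings (rho)."""
    n = len(embeddings)
    total = sum(
        cosine(embeddings[i], embeddings[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return total / (n * (n - 1) / 2)


# Toy keys: mutually orthogonal vectors versus a tight semantic cluster.
orthogonal_keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clustered_keys = [[1.0, 0.0], [1.0, 0.1], [1.0, -0.1]]

print(semantic_density(orthogonal_keys))
print(semantic_density(clustered_keys))
```

Orthogonal keys give a density of zero, while the clustered keys fall in the high-rho regime where the abstract reports collapse.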

2506.00597 2026-02-05 q-bio.GN cs.AR

Processing-in-memory for genomics workloads

William Andrew Simon, Leonid Yavits, Konstantina Koliogeorgi, Yann Falevoz, Yoshihiro Shibuya, Dominique Lavenier, Irem Boybat, Klea Zambaku, Berkan Şahin, Mohammad Sadrosadati, Onur Mutlu, Abu Sebastian, Rayan Chikhi, The BioPIM Consortium, Can Alkan

Abstract

Low-cost, high-throughput DNA and RNA sequencing (HTS) data is the backbone of the life sciences. Genome sequencing is now becoming a part of Predictive, Preventive, Personalized, and Participatory (termed 'P4') medicine. All genomic data are currently processed in energy-hungry computer clusters and centers, necessitating data transfer, consuming substantial energy, and wasting valuable time. Therefore, there is a need for fast, energy-efficient, and cost-efficient technologies that enable genomics research without requiring data centers and cloud platforms. We recently launched the BioPIM Project to leverage emerging processing-in-memory (PIM) technologies to enable energy- and cost-efficient analysis of bioinformatics workloads. The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to achieve the highest cost, energy, and time savings.

2505.17914 2026-02-05 q-bio.BM cs.LG

Flexible MOF Generation with Torsion-Aware Flow Matching

Nayoung Kim, Seongsu Kim, Sungsoo Ahn

Comments 24 pages, 9 figures

Journal ref Neural Information Processing Systems (NeurIPS) 2025

Abstract

Designing metal-organic frameworks (MOFs) with novel chemistries is a longstanding challenge due to their large combinatorial space and complex 3D arrangements of the building blocks. While recent deep generative models have enabled scalable MOF generation, they assume (1) a fixed set of building blocks and (2) known local 3D coordinates of building blocks. However, this limits their ability to (1) design novel MOFs and (2) generate the structure using novel building blocks. We propose a two-stage MOF generation framework that overcomes these limitations by modeling both chemical and geometric degrees of freedom. First, we train a SMILES-based autoregressive model to generate metal and organic building blocks, paired with a cheminformatics toolkit for 3D structure initialization. Second, we introduce a flow matching model that predicts translations, rotations, and torsional angles to assemble the blocks into valid 3D frameworks. Our experiments demonstrate improved reconstruction accuracy, the generation of valid, novel, and unique MOFs, and the ability to create novel building blocks. Our code is available at https://github.com/nayoung10/MOFFlow-2.

2505.12653 2026-02-05 q-bio.NC

High-dimensional structure underlying individual differences in naturalistic visual experience

Chihye Han, Michael F. Bonner

Journal ref Current Biology 36 (2026) 723-733

Abstract

How do different brains create unique visual experiences from identical sensory input? While neural representations vary across individuals, the fundamental architecture underlying these differences remains poorly understood. Here, we reveal that individual visual experience emerges from a high-dimensional neural geometry across the visual cortical hierarchy. Using spectral decomposition of fMRI responses during naturalistic movie viewing, we find that idiosyncratic neural patterns persist across multiple orders of magnitude of latent dimensions. Remarkably, each dimensional range encodes qualitatively distinct aspects of individual processing, and this multidimensional neural geometry predicts subsequent behavioral differences in memory recall. These fine-grained patterns of inter-individual variability cannot be reduced to those detected by conventional intersubject correlation measures. Our findings demonstrate that subjective visual experience arises from information integrated across an expansive multidimensional manifold. This geometric framework offers a powerful new lens for understanding how diverse brains construct unique perceptual worlds from shared experiences.

2405.18605 2026-02-05 cs.CL cs.AI cs.IR q-bio.MN

Merged ChemProt-DrugProt for Relation Extraction from Biomedical Literature

Mai H. Nguyen, Shibani Likhite, Jiawei Tang, Darshini Mahendran, Bridget T. McInnes

Abstract

The extraction of chemical-gene relations plays a pivotal role in understanding the intricate interactions between chemical compounds and genes, with significant implications for drug discovery, disease understanding, and biomedical research. This paper presents a dataset created by merging the ChemProt and DrugProt datasets to augment sample counts and improve model accuracy. We evaluate the merged dataset using two state-of-the-art relation extraction algorithms: Bidirectional Encoder Representations from Transformers (BERT), specifically BioBERT, and Graph Convolutional Networks (GCNs) combined with BioBERT. While BioBERT excels at capturing local contexts, it may benefit from incorporating global information essential for understanding chemical-gene interactions. This can be achieved by integrating GCNs with BioBERT to harness both global and local context. Our results show that integrating the ChemProt and DrugProt datasets yields significant improvements in model performance, particularly in CPR groups shared between the datasets. Incorporating global context using GCNs can help increase overall precision and recall in some of the CPR groups compared with using BioBERT alone.

2210.09470 2026-02-05 q-bio.MN cond-mat.soft math.DS

Biomass transfer on autocatalytic reaction network: a delay differential equation formulation

Wei-Hsiang Lin

Abstract

For a biological system to grow and expand, mass must be transferred from the environment to the system and be assimilated into its reaction network. Here, I characterize the biomass transfer process for growing autocatalytic systems. By tracking biomass along reaction pathways, an n-dimensional ordinary differential equation (ODE) of the reaction network can be reformulated into a one-dimensional delay differential equation (DDE) for its long-term dynamics. The kernel function of the DDE summarizes the overall amplification and transfer delay of the system and serves as a signature for autocatalysis dynamics. The DDE formulation allows reaction networks of various topologies and complexities to be compared and provides a rigorous estimation scheme for growth rate upon dimensional reduction of reaction networks.