arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.19718 2026-04-22 q-bio.QM

Direct RNA sequence design under codon constraints using expressive tensor-based secondary structure models

Mark Fornace, Christina Wuyan Wang, Michael Lindsey

Abstract

Nucleic acid sequence design via codon optimization is a fundamental task with applications across synthetic biology, mRNA therapeutics, and vaccine design. Given a target protein, it is a major open challenge to navigate the combinatorially large design space of codon sequences mapping to its amino acid sequence. Computational approaches generally seek to optimize simple objectives based on the codon sequence, possibly together with more complicated contributions based on secondary structure analysis. In this work, we demonstrate a direct and efficient algorithm to sample sequences from a suitable Boltzmann distribution defined in terms of the codon sequence and a fully detailed secondary structure free energy model, as well as related algorithms for exact computation of statistical quantities such as free energies, base pairing probabilities, and base and codon marginals. These algorithms draw upon a recently developed tensor-based formulation of secondary structure thermodynamics and demonstrate, for the first time, that global sequence design can be accomplished with respect to a highly accurate free energy model. Moreover, the algorithms can leverage any available CPU and GPU resources in parallel for massive computational speedups.
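
To make the size of the design space mentioned above concrete, the following minimal sketch counts the codon sequences that encode a given protein under the standard genetic code. It is not the sampling algorithm of the paper; the names `CODON_DEGENERACY` and `design_space_size` are hypothetical and introduced only for illustration.

```python
# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the three stop codons are excluded).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def design_space_size(protein: str) -> int:
    """Count the codon sequences that translate to the given protein."""
    size = 1
    for residue in protein:
        # Each residue independently contributes its number of codons.
        size *= CODON_DEGENERACY[residue]
    return size
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive enumeration is impractical for realistic targets.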

2604.19662 2026-04-22 q-bio.NC physics.bio-ph stat.AP

Modelling time-order effects in haptic perception with a Bayesian dynamical framework

Gastón Avetta, Jose Lobera, Juan José Zárate, Inés Samengo, Damián G. Hernández

Comments 21 pages, 7 figures

Abstract

Perceptual judgments of sequential stimuli are systematically biased by prior expectations and by the temporal structure of sensory input. In haptic discrimination tasks, these effects often manifest as time-order asymmetries, whereby the perceived difference between two stimuli depends on their presentation order. Here, we introduce a dynamical Bayesian model that accounts for these biases by combining noisy sensory measurements with an evolving internal representation of stimulus intensity. The model formalizes perception as an inference process in which prior expectations are updated by incoming stimuli and propagate in time between observations. We test the model on psychophysical data from vibrotactile discrimination experiments, in which participants compare pairs of sequential stimuli with varying intensities. With a small number of parameters, the model quantitatively reproduces both the direction and magnitude of time-order effects across subjects, as well as the observed inter-individual variability. The inferred parameters provide a compact description of perceptual biases in terms of prior expectations and noise characteristics. Beyond fitting the data, the model induces a transformation of stimulus space, leading to a subject-dependent geometry of perceived stimuli. In this transformed space, perceptual judgments exhibit approximate symmetries that are absent in the physical stimulus coordinates. These results suggest that temporal biases in perception can be understood as a consequence of dynamical inference, and that they impose non-trivial geometric constraints on perceptual representations.

2604.19563 2026-04-22 q-bio.QM cond-mat.stat-mech physics.bio-ph

Information-to-energy trade-offs and the optimal alphabet of polymer replication

Damián G. Hernández

Comments 12 pages, 6 figures

Abstract

We analyze information transmission in a recently proposed coarse-grained model of polymer replication by framing it as a communication channel between templates and copies. By calculating the mutual information in the steady-state limit of long chains, we recover the accurate-random phase diagram and establish that the information per-monomer depends solely on template specificity within the accurate regime. Crucially, even in the accurate region, small error fractions lead to substantial information loss due to the nonlinear relationship between errors and mutual information. Examining the information-to-energy cost ratio reveals non-monotonic behavior as a function of monomer alphabet size, with an optimum determined primarily by the per-monomer assembly free energy. For DNA's four-base alphabet, we find that the observed effective assembly energy (at least $14\,k_B T$) places the system far from the information-transmission optimum, suggesting that biological replication may prioritize the suppression of spontaneous random assembly over information-to-energy efficiency. We also characterize achievable rate-fidelity trade-offs using Shannon bounds, providing a theoretical framework for evaluating future proofreading mechanisms in ensemble models.
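
The nonlinear relationship between errors and mutual information can be illustrated with a textbook q-ary symmetric channel under uniform input. This is an assumed stand-in, not the coarse-grained replication model analyzed in the paper, and the function name `symmetric_channel_information` is hypothetical.

```python
import math


def symmetric_channel_information(q: int, eps: float) -> float:
    """Mutual information (bits per symbol) of a q-ary symmetric channel.

    Assumes uniform input, a correct copy with probability 1 - eps, and
    each of the q - 1 wrong symbols with probability eps / (q - 1).
    """
    if q < 2 or not 0.0 <= eps <= 1.0:
        raise ValueError("need q >= 2 and 0 <= eps <= 1")
    info = math.log2(q)
    # Subtract the conditional entropy H(Y|X), skipping terms where 0*log(0) = 0.
    if eps > 0.0:
        info += eps * math.log2(eps / (q - 1))
    if eps < 1.0:
        info += (1.0 - eps) * math.log2(1.0 - eps)
    return info
```

Under these assumptions, a four-letter alphabet with a 1% error fraction carries roughly 1.90 bits per symbol instead of 2, so the relative information loss is several times larger than the error fraction itself.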

2604.18784 2026-04-22 q-bio.OT

Mathematical modeling and intuition in microbiology: a perspective

Jamie A. Lopez, Amir Erez

Journal ref
Environmental Microbiology 28 (4), e70266 (2026)
Abstract

Mathematical models are increasingly a part of microbiological research. Here, we share our perspective on how modeling advances the discipline by: (i) enforcing logical consistency, (ii) enabling quantitative prediction, (iii) extracting hidden parameters from data, and (iv) generating intuitive understanding. We map a spectrum of modeling frameworks, from whole-cell simulations to minimal logistic growth equations, and provide interactive examples for some common frameworks. Building on this overview, we outline pragmatic criteria for choosing an appropriate level of description to capture phenomena of interest. Finally, we present a case study in modeling of microbial ecosystems from our own work to illustrate how mechanistic modeling can yield generalizable intuition. This perspective aims to be an introductory roadmap for integrating mathematical modeling into experimental microbiology.

2604.18603 2026-04-22 q-bio.QM cs.LG

Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings

Logan Hallee, Jason P. Gleghorn

Abstract

Bidirectional transformers are the foundation of many sequence modeling tasks across natural, biological, and chemical language domains, but they are permutation-invariant without explicit positional embeddings. In contrast, unidirectional attention inherently encodes positional information through its triangular mask, enabling models to operate without positional embeddings altogether. Here, we introduce Dual Triangle Attention, a novel bidirectional attention mechanism that separates the query-key subspace of each attention head into two complementary triangular masks: one that attends to past-and-self positions and one that attends to future-and-self positions. This design provides bidirectional context while maintaining the causal mask's implicit positional inductive bias in both directions. Using PyTorch's flex_attention, Dual Triangle Attention is implemented as a single compiled kernel call with no additional parameters beyond standard multi-head attention. We evaluated Dual Triangle Attention across three settings: (1) a synthetic argmax position probe, (2) masked language modeling (MLM) on natural language, and (3) MLM on protein sequences. In the argmax task, both Dual Triangle Attention and causal attention learn positional information without explicit positional embeddings, whereas standard bidirectional attention cannot. In the MLM experiments, Dual Triangle Attention with Rotary Positional Embeddings (RoPE) achieved the best context extension performance and strong performance across the board. These findings suggest that Dual Triangle Attention is a viable attention mechanism for bidirectional transformers, with or without positional embeddings.
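
The two complementary triangular masks described above can be sketched in plain Python. This is only an illustration of the mask shapes, not the authors' flex_attention kernel; the function name `dual_triangle_masks` is hypothetical, and how each head's query-key subspace is divided between the two masks is not reproduced here.

```python
def dual_triangle_masks(n: int):
    """Return (past_and_self, future_and_self) boolean masks of shape n x n.

    Entry [q][k] is True when query position q may attend to key position k.
    """
    # Lower triangle including the diagonal: the usual causal mask.
    past_and_self = [[k <= q for k in range(n)] for q in range(n)]
    # Upper triangle including the diagonal: the mirrored, anti-causal mask.
    future_and_self = [[k >= q for k in range(n)] for q in range(n)]
    return past_and_self, future_and_self
```

Together the two masks cover every query-key pair, and they overlap only on the diagonal, which is how the design keeps full bidirectional context while each half retains a triangular structure.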

2604.18431 2026-04-22 physics.soc-ph q-bio.QM

Care Trajectories Are Linked to Mental Health and Mortality in Cancer Patients

Simon D. Lindner, Elisabeth L. Zeilinger, Amelie Fuchs, Simone Lubowitzki, Peter Klimek, Alexander Gaiger

Abstract

Treatment of cancer involves heterogeneous, complex care pathways. The relationship between these longitudinal trajectories, baseline mental health, and prognostic outcomes remains poorly understood. We introduce an interpretable time-analysis framework leveraging these temporal dynamics, analyzing care patterns spanning up to 37 years for >8,000 patients. Using Dynamic Time Warping (DTW) and Hierarchical Clustering on sequence data of healthcare encounters, we identified nine distinct, robust trajectory phenotypes. We evaluated their prognostic utility by incorporating them into generalized linear models alongside conventional clinical, demographic, and socioeconomic covariates. The trajectory clusters significantly enhanced mortality prediction and maintained independent predictive significance. Compared to a low-utilization reference group (mortality 31.5%), all eight remaining clusters exhibited substantially higher mortality odds. We uncovered two primary high-risk trajectory patterns: long-term, complex care pathways reflecting chronic disease courses (up to 196 events; mortality OR up to 3.38, 95% CI 2.13-5.37), and shorter but intense trajectories indicating rapid progression (median 78 events; OR 2.32, 95% CI 1.82-2.97). Unexpectedly, the high-utilization complexity clusters were associated with significantly lower baseline anxiety scores, highlighting a divergent relationship between trajectory intensity, mortality risk, and initial psychological burden. These results demonstrate that incorporating temporal healthcare utilization data uncovers robust trajectory phenotypes capturing multidimensional prognostic information. This offers significant explanatory power beyond established static variables for refining risk stratification in precision oncology.
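
For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the classic dynamic-programming recurrence on two numeric sequences. The absolute-difference local cost and the name `dtw_distance` are assumptions made for illustration; the abstract does not specify how healthcare encounters are encoded or which local cost the study uses.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

The resulting pairwise distance matrix is the kind of input a hierarchical clustering step can consume, since DTW tolerates trajectories of different lengths and pacing.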

2509.09693 2026-04-22 q-bio.TO eess.IV

Glorbit: A Modular, Web-Based Platform for AI Based Periorbital Measurement in Low-Resource Settings

George R. Nahass, Jacob van der Ende, Sasha Hubschman, Benjamin Beltran, Bhavana Kolli, Caitlin Berek, James D. Edmonds, R. V. Paul Chan, Pete Setabutr, James W. Larrick, Darvin Yi, Ann Q. Tran

Comments 10 pages, 3 figures, 3 tables

Journal ref
JMIR Hum Factors 2026;13:e82859
Abstract

Periorbital measurements such as margin reflex distances (MRD1/2), palpebral fissure height, and scleral show are essential in diagnosing and managing conditions like ptosis and eyelid disorders. We developed Glorbit, a lightweight, browser-based application for automated periorbital distance measurement using artificial intelligence, designed for use in low-resource clinical settings. The app integrates a DeepLabV3 segmentation model into a modular pipeline with secure, site-specific Google Cloud storage. Glorbit supports offline mode, local preprocessing, and cloud upload via Firebase-authenticated logins. We evaluated usability, cross-platform compatibility, and deployment readiness through a simulated enrollment study of 15 volunteers. The app completed the full workflow -- metadata entry, image capture, segmentation, and upload -- on all tested sessions without error. Glorbit successfully ran on laptops, tablets, and mobile phones across major browsers. The segmentation model succeeded on all images. Average session time was 101.7 seconds (standard deviation: 17.5). Usability survey scores (1-5 scale) were uniformly high: intuitiveness and efficiency (5.0), workflow clarity (4.8), output confidence (4.9), and clinical utility (4.9). Glorbit provides a functional, scalable solution for standardized periorbital measurement in diverse environments. It supports secure data collection and may enable future development of real-time triage tools and multimodal AI-driven oculoplastics. Tool available at: https://glorbit.app

2506.19866 2026-04-22 q-bio.MN cs.PF math.OC q-bio.QM

GPU-accelerated Modeling of Biological Regulatory Networks

Joyce Reimer, Pranta Saha, Chris Chen, Neeraj Dhar, Brook Byrns, Steven Rayan, Gordon Broderick

Comments 10 pages, 5 figures, 2 tables; submitted to 16th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB 2025) as submission no. 6

Abstract

The complex regulatory dynamics of a biological network can be succinctly captured using discrete logic models. Given even sparse time-course data from the system of interest, previous work has shown that global optimization schemes are suitable for proposing logic models that explain the data and make predictions about how the system will behave under varying conditions. Considering the large scale of the parameter search spaces associated with these regulatory systems, performance optimizations on the level of both hardware and software are necessary for making this a practical tool for in silico pharmaceutical research. We show here how the implementation of these global optimization algorithms in a GPU-computing environment can accelerate the solution of these parameter search problems considerably. We carry out parameter searches on two model biological regulatory systems that represent almost an order of magnitude scale-up in complexity, and we find the gains in efficiency from GPU to be a 33%-43% improvement compared to multi-thread CPU implementations and a 33%-1866% increase compared to CPU in serial. These improvements make global optimization of logic model identification a far more attractive and feasible method for in silico hypothesis generation and design of experiments.

2504.20565 2026-04-22 q-bio.QM q-bio.PE

DLCM: a versatile multi-level solver for heterogeneous multicellular systems

Erik Blom, Stefan Engblom

Comments 34 pages, 9 figures

Abstract

Computational modeling of multicellular systems may aid in untangling cellular dynamics and emergent properties of biological cell populations. A key challenge is to balance the level of model detail and the computational efficiency, while using physically interpretable parameters to facilitate meaningful comparisons with biological data. For this purpose, we present the DLCM-solver (discrete Laplacian cell mechanics), a flexible and efficient computational solver for spatial and stochastic simulations of populations of cells, developed from first principles to support mechanistic investigations. The solver has been designed as a module in URDME, the unstructured reaction-diffusion master equation open software framework, to allow for the integration of intra-cellular models with extra-cellular features handled by the DLCM. The solver manages discrete cells on a fixed lattice and reaction-transport events in a continuous-time Markov chain. Space-continuous micro-environment quantities such as pressure and chemical substances are supported by the framework, permitting a variety of modeling choices concerning chemotaxis, mechanotaxis, nutrient-driven cell growth and death, among others. An essential and novel feature of the DLCM-solver is the coupling of cellular pressure to the curvature of the cell populations by elliptic projection onto the computational grid, with which we can include effects from surface tension between populations. We demonstrate the flexibility of the framework by implementing benchmark problems of cell sorting, cellular signaling, tumor growth, and chemotaxis models. We additionally formally analyze the computational complexity and show that it is theoretically optimal for systems based on pressure-driven cell migration. In summary, the solver balances efficiency and a relatively fine resolution, while supporting a high level of interpretability.

2504.09365 2026-04-22 quant-ph cs.PF q-bio.MN

Identifying Protein Co-regulatory Network Logic by Solving B-SAT Problems through Gate-based Quantum Computing

Aspen Erlandsson Brisebois, Jason Broderick, Zahed Khatooni, Heather L. Wilson, Steven Rayan, Gordon Broderick

Comments 9 pages, 6 figures, 4 tables; submitted to Quantum Applications Track (QAPP) of IEEE Quantum Week 2025 (QCE25) as submission no. 209; v2 accepted by QCE25 for presentation / publication

Abstract

There is growing awareness that the success of pharmacologic interventions on living organisms is significantly impacted by context and timing of exposure. In turn, this complexity has led to an increased focus on regulatory network dynamics in biology and our ability to represent them in a high-fidelity way, in silico. Logic network models show great promise here and their parameter estimation can be formulated as a constraint satisfaction problem (CSP) that is well-suited to the often sparse, incomplete data in biology. Unfortunately, even in the case of Boolean logic, the combinatorial complexity of these problems grows rapidly, challenging the creation of models at physiologically-relevant scales. That said, quantum computing, while still nascent, facilitates novel information-processing paradigms with the potential for transformative impact in problems such as this one. In this work, we take a first step at actualizing this potential by identifying the structure and Boolean decisional logic of a well-studied network linking 5 proteins involved in the neural development of the mammalian cortical area of the brain. We identify the protein-protein connectivity and binary decisional logic governing this network by formulating it as a Boolean Satisfiability (B-SAT) problem. We employ Grover's algorithm to solve the NP-hard problem faster than the exponential time complexity required by deterministic classical algorithms. Using approaches deployed on both quantum simulators and actual noisy intermediate scale quantum (NISQ) hardware, we accurately recover several high-likelihood models from very sparse protein expression data. The results highlight the differential roles of data types in supporting accurate models; the impact of quantum algorithm design as it pertains to the mutability of quantum hardware; and the opportunities for accelerated discovery enabled by this approach.
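
As a rough illustration of how Grover's algorithm amplifies satisfying assignments of a Boolean formula, the sketch below simulates the amplitudes classically for a three-variable toy formula. It is not the authors' circuit or network model: `grover_probabilities` and `toy_formula` are hypothetical names, and a classical simulation like this one must enumerate every assignment, so it only shows the mechanism. Grover's search reduces the number of oracle queries quadratically relative to unstructured classical search.

```python
import math


def grover_probabilities(n_qubits, is_solution):
    """Simulate Grover iterations and return the final measurement probabilities."""
    size = 2 ** n_qubits
    # A classical simulation has to enumerate the marked states up front;
    # a quantum oracle would not.
    marked = [x for x in range(size) if is_solution(x)]
    if not marked:
        raise ValueError("formula has no satisfying assignment")
    # Start in the uniform superposition over all assignments.
    amplitudes = [1.0 / math.sqrt(size)] * size
    iterations = math.floor(math.pi / 4 * math.sqrt(size / len(marked)))
    for _ in range(iterations):
        # Oracle: flip the sign of every satisfying assignment.
        for x in marked:
            amplitudes[x] = -amplitudes[x]
        # Diffusion: reflect all amplitudes about their mean.
        mean = sum(amplitudes) / size
        amplitudes = [2.0 * mean - a for a in amplitudes]
    return [a * a for a in amplitudes]


def toy_formula(x):
    """(a OR b) AND (NOT b) AND (NOT a OR c), with a, b, c read from the bits of x."""
    a, b, c = x & 1, (x >> 1) & 1, (x >> 2) & 1
    return bool((a or b) and (not b) and ((not a) or c))
```

For this toy formula the only satisfying assignment is a=1, b=0, c=1 (integer 5), and two Grover iterations concentrate most of the measurement probability on it.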

2110.00601 2026-04-22 cs.DL q-bio.QM

Album: executable building blocks for scientific imaging routines, from sharing to LLM-assisted orchestration

Jan Philipp Albrecht, Deborah Schmidt, Lucas Rieckert, Maximilian Otto, Kyle Harrington

Comments 38 pages, 7 figures

Abstract

Open-source scientific software is a major driver of scientific progress, yet its development and reuse remain difficult in collaborative settings. Researchers repeatedly face four recurring challenges: discovering and reproducing existing routines, adapting them for new use cases, sharing and scaling them across collaborators, and stabilizing them with reproducible execution environments. We present Album, an open-source framework for packaging and sharing scientific routines as executable artifacts through two minimal primitives: (i) the solution, a Python-native executable entry point that combines machine-readable metadata, arguments, environment specifications, and lifecycle hooks; and (ii) the catalog, a decentralized, git-native distribution mechanism with indexed search and optional web rendering for discovery, provenance, and governance. Album uses a two-context execution model in which a host controller evaluates manifests and prepares per-solution environments, while lifecycle hooks execute inside isolated solution environments. This design supports reproducible execution, post-environment setup, and the composition of routines with incompatible dependencies. Album can be used in conjunction with LLM agents: solutions can be drafted and revised with LLM assistance, and an MCP interface exposes cataloged solutions as callable tools for tool-grounded discovery and orchestration. We evaluate Album through four real-world imaging deployments spanning interactive visualization of electron microscopy data, integration of multiple segmentation methods, the orchestration of cryo-electron tomography competition workflows, and mineral quantification pipelines. Overall, Album complements package managers, workflow systems, and container runtimes by making scientific routines executable, shareable artifacts. Documentation and examples are available at https://album.solutions.

2604.18872 2026-04-22 q-bio.PE cs.DS

Meeting times on graphs in near-cubic time

Alex McAvoy

Comments 11 pages

Abstract

The expected meeting time of two random walkers on an undirected graph of size $N$, where at each time step one walker moves and the process stops when they collide, satisfies a system of $\binom{N}{2}$ linear equations. Naïvely, solving this system takes $O\left(N^{6}\right)$ operations. However, this system of linear equations has nice structure in that it is almost a Sylvester equation, with the obstruction being a diagonal absorption constraint. We give a simple algorithm for solving this system that exploits this structure, leading to $O\left(N^{4}\right)$ operations and $\Theta\left(N^{2}\right)$ space for exact computation of all $\binom{N}{2}$ meeting times. While this practical method uses only standard dense linear algebra, it can be improved (in theory) to $O\left(N^{3}\log^{2}N\right)$ operations by exploiting the Cauchy structure of the diagonal correction. We generalize this result slightly to cover the Poisson equation for the absorbing "lazy" pair walk with an arbitrary source, which can be solved at the same cost, with $O\left(N^{3}\right)$ per additional source on the same graph. We conclude with applications to evolutionary dynamics, giving improved algorithms for calculating fixation probabilities and mean trait frequencies.
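
The naive baseline mentioned above, solving all $\binom{N}{2}$ equations by dense elimination, can be sketched as follows. This is not the paper's structured algorithm. It assumes a connected simple graph, that the moving walker is chosen uniformly at each step, and that it jumps to a uniformly random neighbor; the name `meeting_times` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def meeting_times(adj):
    """Exact expected meeting times for every unordered pair of start vertices.

    adj[v] lists the neighbors of vertex v (vertices are 0..N-1, no self-loops).
    """
    n = len(adj)
    pairs = list(combinations(range(n), 2))
    index = {p: k for k, p in enumerate(pairs)}
    m = len(pairs)
    # Augmented matrix for (I - P) T = 1, where P is the pair-walk transition
    # matrix restricted to non-colliding pairs.
    A = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for (i, j), row in index.items():
        A[row][row] += 1
        A[row][m] = Fraction(1)
        for mover, other in ((i, j), (j, i)):
            step = Fraction(1, 2 * len(adj[mover]))
            for nxt in adj[mover]:
                if nxt != other:  # moving onto the other walker ends the process
                    A[row][index[tuple(sorted((nxt, other)))]] -= step
    # Gauss-Jordan elimination in exact rational arithmetic.
    for c in range(m):
        pivot = next(r for r in range(c, m) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        inv = 1 / A[c][c]
        A[c] = [v * inv for v in A[c]]
        for r in range(m):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return {p: A[index[p]][m] for p in pairs}
```

Elimination on $\binom{N}{2}$ unknowns costs a number of operations cubic in that count, which is the $O\left(N^{6}\right)$ scaling the abstract cites as the starting point.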

2604.18851 2026-04-22 q-bio.CB physics.bio-ph

Intrinsic stochasticity in cell polarity and contact inhibition of locomotion

Mariia Kryvoruchko, Brian A. Camley

Abstract

When cells collide, they often exhibit "contact inhibition of locomotion" (CIL), a behavior in which cells repolarize and migrate away from the site of contact. Experimental CIL outcomes are highly variable - why? Here, we develop a minimal stochastic model to quantify how intrinsic noise in cell polarity, arising from the finite number of signaling molecules, influences CIL decision-making. We simulate polarization dynamics by tracking individual Rho GTPase proteins that diffuse and switch stochastically between the cell membrane and cytosol. In the absence of cell-cell contact, the polarity axis diffuses rotationally - the cell's orientation wanders - with a diffusion coefficient that decreases as Rho GTPase copy number increases. Assuming that cell-cell contact inhibits Rho GTPase activation, we investigate how contact geometry, duration, and strength affect CIL sensitivity. At low protein copy number, weak, brief, or spatially narrow contacts are masked by molecular noise. In contrast, at high protein copy number, intrinsic polarity noise is negligible, and randomness in CIL response is more likely to reflect the variability from collision to collision in the cell-cell contact properties.

2604.18643 2026-04-22 q-bio.NC quant-ph

Quantum-Like Models of Cognition and Decision Making: Open-Systems and Gorini-Kossakowski-Sudarshan-Lindblad Dynamics

Masanari Asano, Andrei Khrennikov

Abstract

This paper begins by surveying the evolution of quantum-like models of cognition and decision making, transitioning from static kinematic representations to a robust dynamical framework based on open quantum systems. We provide a comprehensive analysis of the Gorini-Kossakowski-Sudarshan-Lindblad (GKSL) master equation's application in cognitive psychology and decision making, illustrating how it models mental state evolution as a dissipative process influenced by an informational environment. We categorize dynamical regimes into Passive and Active Hamiltonians, demonstrating how non-commutation with projections on the decision basis serves as a mathematical signature of cognitive agency and Quantum Escape from classical equilibria. The utility of this framework is further explored through its ability to stabilize non-Nash outcomes in strategic games, such as the Prisoner's Dilemma. Building upon this dynamical foundation, we identify "cognitive beats" as a signature of the internal struggle between competing "flows of mind" deliberated at approximately equal frequencies. Distinct from the damped oscillations of simple interference, these beats emerge from a structural tension between Liouvillian channels that generates a secondary, slow-scale modulation of conviction. This beat envelope dictates the timing of peak readiness and hesitation, providing a mathematical map of the transition between conflicting cognitive states. By resolving these nested time scales, we provide a new spectral diagnostic for the depth of cognitive agency and the complexity of the underlying deliberation process. This paper develops a theoretical framework linking GKSL dynamics with quantum-like cognition and decision-making (QCDM), highlighting how dissipative quantum models can capture features of human thought and decision processes.

2604.18634 2026-04-22 q-bio.QM

Topological analysis of hemodynamic response to cardiac resynchronization therapy

Aina Ferrà Marcús, Carles Casacuberta, Josep Vives, Joan Guich, Gerard Amorós-Figueras, Jose M. Guerra

Comments 12 pages, with 2 figures, plus supplementary material (4 pages, 2 figures)

Abstract

Objective: The Mapper algorithm is a qualitative method in topological data analysis that constructs graphs from point clouds by combining dimensionality reduction and clustering techniques. The aim of this study is to apply Mapper, together with novel quantitative indices, to compare the effects of biventricular pacing from the left ventricular epicardium versus the endocardium in a swine model of pacing-induced non-ischemic cardiomyopathy. Methods: The distributions of four hemodynamic variables from a previous study on endocardial and epicardial cardiac resynchronization in an experimental swine model of nonischemic cardiomyopathy were analyzed using the Mapper algorithm, enhanced with numerical indices quantifying self-connectivity, scattering, and homogeneity of the resulting colored graphs. Results: Statistically significant differences were observed between pacing from basal regions versus mid or apical regions, with the following self-connectivity index values: basal $0.57$; mid $0.14$ ($p < 0.01$); apical $0.24$ ($p < 0.01$). Endocardial stimulation at lateral sites increased the contrast between the distributions of basal versus mid or apical data, when compared with epicardial stimulation. Conclusions: Topological analysis using the Mapper algorithm, enhanced with quantitative statistical measures, revealed new and biologically plausible significant differences in pacing effects across heart regions.

2604.18622 2026-04-22 q-bio.QM

MDAgent: A Multi-Agent Framework for End-to-End Molecular Dynamics Research

Zhenyu Ma, Chunyi Yang, Yuyang Song, Jingyi Zhu, Letian Yang, Limei Xu, Min Xiao, Xukai Jiang

Abstract

Molecular dynamics (MD) simulation is a powerful tool for studying biomolecular structural changes, molecular recognition, transmembrane transport, and functional mechanisms. However, its practical bottleneck lies not only in software operation or parameter setup, but in translating experimental questions into executable, interpretable, and reviewable computational workflows. Here, we present MDAgent, a multi-agent system for end-to-end molecular dynamics research. The system integrates problem understanding, literature-guided strategy design, simulation execution, trajectory analysis, mechanistic interpretation, and quality supervision into a unified workflow, enabling agents not only to run simulations but also to generate research-oriented computational plans and analytical reports. We further introduce a case-based learning mechanism based on Skill and Memory, which stores reusable knowledge from prior tasks, including parameter choices, operational rules, analytical logic, and problem-solving pathways, thereby supporting cross-task transfer without retraining the underlying model. Across multiple representative molecular simulation tasks, MDAgent achieved stable end-to-end performance with improved strategic adaptability, interpretability, and generalization. In an independent complex task involving conformational transitions of TMEM16F and XKR8, the system successfully completed system design, simulation, and mechanistic analysis for large membrane proteins. These results show that combining multi-agent collaboration with case-based learning can transform MD agents from workflow automation tools into scientific question-oriented computational research systems, providing a scalable framework for AI-driven automated research.

2604.18621 2026-04-22 q-bio.GN cs.LG

Quantum AI for Cancer Diagnostic Biomarker Discovery

Mandeep Kaur Saggi, Amandeep Singh Bhatia, Humaira Gowher, Sabre Kais

Comments 25 pages, 15 figures

Abstract

Quantum machine learning (QML) offers a promising new paradigm for computational biology by leveraging quantum mechanical principles to enhance cancer classification, biomarker discovery, and bioinformatics diagnostics. In this study, we apply QML to identify subtype-specific biomarkers for lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), the two predominant forms of non-small cell lung cancer. Our methodology involves a two-phase process: in Phase 1, differential expression analysis and methylation analysis between tumor and normal samples allow us to identify LUAD-specific and LUSC-specific genes, revealing potential prognostic biomarkers for cancer subtypes. Phase 2 focuses on developing a quantum classifier capable of distinguishing between LUAD and LUSC tumors, as well as between tumor and normal samples. This classifier not only enhances diagnostic precision but also demonstrates the quantum advantage in processing large-scale multiomic datasets. Our results consistently demonstrated that Sample3, representing the combined gene set, achieved the highest overall predictive performance in all metrics. These results demonstrate that QML provides an effective and scalable approach for biomarker discovery and subtype-specific cancer classification. GO enrichment analysis highlighted the significant involvement of genes in synaptic signaling, ion channel regulation, and neuronal development. In the quantum phase, KEGG analysis further identified enrichment in cancer-associated pathways, including neurotrophin, MAPK, Ras, and PI3K-Akt signaling, with key genes such as NGFR, NTRK2, and NTF3 suggesting a central role in neurotrophin-mediated oncogenic processes. Our findings highlight the growing potential of quantum computing to advance precision oncology and next-generation biomedical analytics.

2604.18599 2026-04-22 stat.AP q-bio.NC

Simulation Based Inference of a Simple Neural Network Structure

Pierre Charitat, Ségolen Geffray, Christophe Pouzat

Abstract

Neurophysiologists are nowadays able to record from a large number of extracellular electrodes and to extract, from the raw data, the sequences of action potentials or spikes generated by many neurons. Unfortunately, these "many neurons" still represent only a tiny fraction of the neuronal population that constitutes the network. Using association statistics such as the estimation of the cross-correlation functions, they are trying to infer the structure of the network formed by the recorded neurons. But this inference is compromised by the tremendous under-sampling of the neuronal population. We propose to focus instead on simple spike train statistics, like the empirical spike frequency, or the interspike interval distribution. Their sampling distributions can be estimated by simulations, and, given a few observed spike train statistics, they provide enough information to infer the structure of the underlying network. We show that, on a "toy model", our method gives significantly better results than the sub-network reconstruction method with regard to the inference of the connection probability of the original network.

2604.04981 2026-04-22 q-bio.GN cs.LG cs.NE

An Imbalanced Dataset with Multiple Feature Representations for Studying Quality Control of Next-Generation Sequencing

Philipp Röchner, Clarissa Krämer, Johannes U Mayer, Franz Rothlauf, Steffen Albrecht, Maximilian Sprang

Details
Abstract

Next-generation sequencing (NGS) is a key technique for studying the DNA and RNA of organisms. However, identifying quality problems in NGS data across different experimental settings remains challenging. To develop automated quality-control tools, researchers require datasets with features that capture the characteristics of quality problems. Existing NGS repositories, however, offer only a limited number of quality-related features. To address this gap, we propose a dataset derived from 37,491 NGS samples with two types of quality-related feature representations. The first type consists of 34 features derived from quality control tools (QC-34 features). The second type has a variable number of features ranging from eight to 1,183. These features were derived from read counts in problematic genomic regions identified by the ENCODE blocklist (BL features). All features describe the same human and mouse samples from five genomic assays, allowing direct comparison of feature representations. The proposed dataset includes a binary quality label, derived from automated quality control and domain experts. Among all samples, $3.2\%$ are of low quality. Supervised machine learning algorithms accurately predicted quality labels from the features, confirming the relevance of the provided feature representations. The proposed feature representations enable researchers to study how different feature types (QC-34 vs. BL features) and granularities (varying number of BL features) affect the detection of quality problems.

2604.03476 2026-04-22 cs.CV cs.AI q-bio.BM

Fine-tuning DeepSeek-OCR-2 for Molecular Structure Recognition

Haocheng Tang, Xingyu Dang, Junmei Wang

Details
Abstract

Optical Chemical Structure Recognition (OCSR) is critical for converting 2D molecular diagrams from printed literature into machine-readable formats. While Vision-Language Models have shown promise in end-to-end OCR tasks, their direct application to OCSR remains challenging, and direct full-parameter supervised fine-tuning often fails. In this work, we adapt DeepSeek-OCR-2 for molecular optical recognition by formulating the task as image-conditioned SMILES generation. To overcome training instabilities, we propose a two-stage progressive supervised fine-tuning strategy: starting with parameter-efficient LoRA and transitioning to selective full-parameter fine-tuning with split learning rates. We train our model on a large-scale corpus combining synthetic renderings from PubChem and realistic patent images from USPTO-MOL to improve coverage and robustness. Our fine-tuned model, MolSeek-OCR, demonstrates competitive capabilities, achieving exact matching accuracies comparable to the best-performing image-to-sequence model. However, it remains inferior to state-of-the-art image-to-graph models. Furthermore, we explore reinforcement-style post-training and data-curation-based refinement, finding that they fail to improve the strict sequence-level fidelity required for exact SMILES matching.

2602.16129 2026-04-22 q-bio.MN

Oscillation Criteria in Large-Scale Gene Regulatory Networks with Intrinsic Fluctuations

Manuel Eduardo Hernández-García, Jorge Velázquez-Castro

Comments 13 pages, 6 figures

Details
Abstract

Gene Regulatory Networks (GRNs) with feedback are essential components of many cellular processes and may exhibit oscillatory behavior. Analyzing such systems becomes increasingly complex as the number of components increases. Since gene regulation often involves a small number of molecules, fluctuations are inevitable. Therefore, it is important to understand how fluctuations affect the oscillatory dynamics of cellular processes, as this will clarify the mechanisms that allow cellular functions to persist even in the presence of fluctuations or, failing that, determine the limit of fluctuations that permits various cellular functions. In this study, we investigated the conditions under which GRNs with feedback and intrinsic fluctuations exhibit oscillatory behavior. Our focus was on developing a procedure that would be both manageable and practical, even for extensive regulatory networks, that is, those comprising numerous nodes. Using the second-moment approach, we described the stochastic dynamics through a set of ordinary differential equations for the mean concentration and its second central moment. The system can attain either a stable equilibrium or oscillatory behavior, depending on its scale and, consequently, the intensity of fluctuations. To illustrate the procedure, we analyzed two relevant systems: a repressilator with three nodes and a system with five nodes, both incorporating intrinsic fluctuations. In both cases, it was observed that for very small systems, which therefore exhibit significant fluctuations, oscillatory behavior is inhibited. The procedure presented here for analyzing the stability of oscillations under fluctuations enables the determination of the critical minimum size of GRNs at which intrinsic fluctuations do not eliminate their cyclical behavior.

2601.20981 2026-04-22 cs.NE q-bio.PE

Diversifying Toxicity Search in Large Language Models Through Speciation

Onkar Shelar, Travis Desell

Comments Preprint. 4 pages, Accepted at GECCO as short paper

Details
Abstract

Evolutionary prompt search is a practical black-box approach for red teaming large language models; however, existing methods often collapse onto a small family of high-performing prompts, limiting coverage of distinct failure modes. We present a speciated quality-diversity extension of ToxSearch that maintains multiple high-toxicity prompt niches in parallel rather than optimizing a single best prompt. ToxSearch-S introduces unsupervised prompt speciation via a search methodology that maintains capacity-limited species with exemplar leaders, a reserve pool for emerging niches, and species-aware parent selection that trades off within-niche exploitation and cross-niche exploration. Preliminary results show ToxSearch-S reaching higher peak toxicity ($\approx 0.73$ vs. $\approx 0.47$) with a heavier tail (top-10 median $0.66$ vs. $0.45$) than the baseline. Speciation also yields broader semantic coverage under a topics-as-species analysis (higher effective topic diversity and larger unique topic coverage). Finally, the species formed are well-separated in embedding space (mean separation ratio $\approx 1.93$) and exhibit distinct toxicity distributions, indicating that speciation partitions the adversarial space into behaviorally differentiated niches rather than superficial lexical variants.

2511.03769 2026-04-22 q-bio.OT

Current validation practice undermines surgical AI development

Annika Reinke, Ziying O. Li, Minu D. Tizabi, Pascaline André, Marcel Knopp, Mika M. Rother, Ines P. Machado, Maria S. Altieri, Deepak Alapatt, Sophia Bano, Sebastian Bodenstedt, Oliver Burgert, Elvis C. S. Chen, Justin W. Collins, Olivier Colliot, Evangelia Christodoulou, Tobias Czempiel, Adrito Das, Reuben Docea, Daniel Donoho, Qi Dou, Jennifer Eckhoff, Sandy Engelhardt, Gabor Fichtinger, Philipp Fuernstahl, Pablo García Kilroy, Stamatia Giannarou, Stephen Gilbert, Ines Gockel, Patrick Godau, Jan Gödeke, Teodor P. Grantcharov, Tamas Haidegger, Alexander Hann, Makoto Hashizume, Charles Heitz, Rebecca Hisey, Hanna Hoffmann, Arnaud Huaulmé, Paul F. Jäger, Pierre Jannin, Anthony Jarc, Rohit Jena, Yueming Jin, Leo Joskowicz, Luc Joyeux, Max Kirchner, Axel Krieger, Gernot Kronreif, Kyle Lam, Shlomi Laufer, Joël L. Lavanchy, Gyusung I. Lee, Robert Lim, Peng Liu, Hani J. Marcus, Pietro Mascagni, Ozanan R. Meireles, Beat P. Mueller, Lars Mündermann, Hirenkumar Nakawala, Nassir Navab, Abdourahmane Ndong, Juliane Neumann, Felix Nickel, Marco Nolden, Chinedu Nwoye, Namkee Oh, Nicolas Padoy, Thomas Pausch, Micha Pfeiffer, Tim Rädsch, Hongliang Ren, Nicola Rieke, Dominik Rivoir, Duygu Sarikaya, Samuel Schmidgall, Matthias Seibold, Silvia Seidlitz, Alexander Seitel, Lalith Sharan, Jeffrey H. Siewerdsen, Vinkle Srivastav, Raphael Sznitman, Russell Taylor, Thuy N. Tran, Matthias Unberath, Fons van der Sommen, Martin Wagner, Amine Yamlahi, Shaohua K. Zhou, Aneeq Zia, Amin Madani, Danail Stoyanov, Stefanie Speidel, Daniel A. Hashimoto, Fiona R. Kolbinger, Lena Maier-Hein

Comments Under review in Nature BME

Details
Abstract

Surgical data science (SDS) is rapidly advancing, yet clinical adoption of artificial intelligence (AI) in surgery remains limited, with inadequate validation emerging as an important contributing factor. In fact, existing validation practices often neglect the temporal and hierarchical structure of intraoperative videos, producing misleading, unstable, or clinically irrelevant results. In a pioneering, consensus-driven effort, we introduce a comprehensive catalog of validation pitfalls in AI-based surgical video analysis that was derived from a multi-stage Delphi process with 92 international experts. The collected pitfalls span three categories: (1) data (e.g., incomplete annotation, spurious correlations), (2) metric selection and configuration (e.g., neglect of temporal stability, mismatch with clinical needs), and (3) aggregation and reporting (e.g., clinically uninformative aggregation, failure to account for frame dependencies in hierarchical data structures). A systematic review of surgical AI papers reveals that these pitfalls are widespread in current practice, with the majority of studies failing to account for temporal dynamics or hierarchical data structure, or relying on clinically uninformative metrics. Experiments on real surgical video datasets provide empirical evidence that ignoring temporal and hierarchical data structures can substantially understate uncertainty, obscure critical failure modes, and even alter algorithm rankings. To address these shortcomings, we provide a catalogue of best practices compiled in a multi-stage Delphi process. Together, this work provides an evidence-based framework to inform more rigorous validation of surgical video analysis algorithms and to guide future efforts in benchmarking, reporting, regulatory review, and clinical translation.

2510.12751 2026-04-22 q-bio.NC

Non-linear associations of amyloid-$β$ with resting-state functional networks and their cognitive relevance in a large community-based cohort of cognitively normal older adults

Junjie Wu, Benjamin B Risk, Taylor A James, Nicholas Seyfried, David W Loring, Felicia C Goldstein, Allan I Levey, James J Lah, Deqiang Qiu

Details
Journal ref
Alz Res Therapy 18, 90 (2026)
Abstract

Background: Non-linear alterations in brain network connectivity may represent early neural signatures of Alzheimer's disease (AD) pathology in cognitively normal older adults. Understanding these changes and their cognitive relevance may help clarify early network vulnerability associated with AD pathology. Most prior studies recruited participants from memory clinics, often with subjective memory concerns, limiting generalizability. Methods: We examined 14 large-scale functional brain networks in 968 cognitively normal older adults recruited from the community using resting-state functional MRI, cerebrospinal fluid (CSF) biomarkers (amyloid-$β$ 1-42 [A$β$], total tau, phosphorylated tau 181), and neuropsychological assessments. Functional networks were identified using group independent component analysis. Results: Inverted U-shaped associations between CSF A$β$ and functional connectivity were observed in the precuneus network and ventral default mode network (DMN), but not in the dorsal DMN, indicating network-specific vulnerability to early amyloid pathology. Higher connectivity in A$β$-related networks, including dorsal and ventral DMN, precuneus, and posterior salience networks, was associated with better visual memory, visuospatial, and executive performance. No significant relationships were observed between CSF tau and functional connectivity. Conclusions: Using a large, community-based cohort, we demonstrate that non-linear alterations in functional connectivity occur in specific networks even during the asymptomatic phase of AD. Moreover, A$β$-related network connectivity is cognitively relevant, highlighting early network vulnerability and its functional consequences in amyloid pathology.

2508.21490 2026-04-22 q-bio.NC quant-ph

Testing quantum-like markers in neural dynamics

Partha Ghose, Dimitris Pinotsis

Comments 12 pages, one figure; thoroughly revised; title slightly changed; abstract also changed accordingly

Details
Abstract

We propose two experiments for identifying quantum markers in neural data based on quantum variants of well-known equations for neural activity that describe electrical signal propagation on axonal arbors and dendrites. These include (i) testing whether power spectra from subthreshold oscillations in neuronal cultures follow the classical FitzHugh-Nagumo equations or a recently introduced quantum variant of them and (ii) testing whether propagation statistics of electrical activity in axons follow the classical diffusive cable equation or a quantum variant of it.

2208.11805 2026-04-22 math.DS q-bio.QM

On the Diffusion Time Evolution of Folding Chains in the Heteropolymer Model

Okezue Bell

Comments Actively updating work to distance/build upon IMP heteropolymer study reproduction

Details
Abstract

In this paper, we mathematically describe the time evolution of protein folding features via Iori et al.'s heteropolymer model. More specifically, we find that the folding amino acid chain evolves according to a power law $D \sim t^ν$. The power $ν$ decreases from $0.\overline{66}$ to $0.5$ when the randomness of the coupling constants in the Lennard-Jones potential increases.
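
As an illustration of how an exponent in a relation of the form $D \sim t^ν$ can be estimated, the following minimal sketch fits a straight line to $\log D$ against $\log t$ by ordinary least squares. It is not taken from the paper: the function name `fit_power_law_exponent`, the prefactor, and the synthetic data are all hypothetical and chosen only to make the example self-checking.

```python
import math


def fit_power_law_exponent(times, values):
    """Return the least-squares slope of log(values) against log(times).

    For data following values = c * times**nu, this slope equals nu.
    Hypothetical helper for illustration; not the paper's method.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free data (an assumption, not results from the paper):
# D = 2 * t**0.5, so the fitted exponent should be close to 0.5.
times = [1.0, 2.0, 4.0, 8.0, 16.0]
values = [2.0 * t ** 0.5 for t in times]
nu_hat = fit_power_law_exponent(times, values)
```

With noisy simulation output, the same fit would normally be restricted to the time window in which the log-log plot is approximately linear.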