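arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.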

2603.02165 2026-03-03 physics.ao-ph q-bio.QM

Investigating the short-term effects of particulate matter (PM) chemical components on mortality and the potential modifying effect of extreme temperature: A time-series analysis in London

Xiaolu Zhang, Anna Font, Anja Tremper, Max Priestman, Shawn Y. Lee, David C. Green, Dimitris Evangelopoulos, Gang I. Chen

Abstract

Particulate matter (PM) is linked to adverse health outcomes, yet the roles of specific PM components and their modification by extreme temperature remain unclear. We examined short-term associations between ten PM chemical components and daily mortality in Greater London (2015-2018). PM components include inorganic aerosols (black carbon from wood burning (BCwb) and traffic exhaust (BCtr), SO4, NO3, and NH4) and organic aerosols (hydrocarbon-like organic aerosol (HOA), biomass burning OA (BBOA), cooking-like OA (COA), more and less oxidized oxygenated OA (MO-OOA and LO-OOA)). We applied quasi-Poisson generalized additive models and weighted quantile sum (WQS) regression to estimate single-pollutant, multi-pollutant, and mixture effects, respectively, and included interaction terms to test effect modification by heat waves and cold spells. All ten components showed positive associations with all-cause mortality in single-pollutant models, with stronger estimated risks for respiratory mortality, particularly for NH4, NO3, and SO4. In mixture analyses, the WQS index was significantly associated with all-cause mortality (RR = 1.015, 95% CI: 1.006-1.024 per 25th-percentile increase) and showed a marginally significant association with respiratory mortality (RR = 1.018, 95% CI: 0.994-1.042). MO-OOA and COA contributed most to all-cause mortality, while BBOA and BCwb dominated respiratory effects. Heat waves consistently amplified respiratory risks in both single-pollutant and mixture models, with little evidence for cardiovascular mortality. Overall, MO-OOA demonstrated harmful associations across outcomes, suggesting a potential toxicity link to secondary atmospheric oxidation processes. These findings support source-specific control strategies and highlight the importance of accounting for extreme temperature in air pollution mitigation policies.

2603.02036 2026-03-03 q-bio.PE math.DS

Lag-Induced Critical Transitions to Extinction in Replicating Systems

Edward A. Turner, Francisco Crespo, Joan Gimeno, Ernest Fontich, Santiago F. Elena, Josep Sardanyés

Comments 6 pages, 5 figures

2603.01965 2026-03-03 cs.LG q-bio.QM

CoVAE: correlated multimodal generative modeling

Federico Caretti, Guido Sanguinetti

2603.01873 2026-03-03 q-bio.BM

Bi-TEAM: A Unified Cross-Scale Representation Learning Framework for Chemically Modified Biomolecules

Chunbin Gu, Zijun Gao, Mutian He, Jingjie Zhang, Haipeng Wen, Zihao Luo, Xiaorui Wang, Hanqun Cao, Jiajun Bu, Chang-Yu Hsieh, Pheng Ann Heng

Comments 57 pages, 16 figures

2603.01849 2026-03-03 q-bio.QM

Characterization of the novel transposon Tn7722 harboring blaNDM-1: Insights into the evolutionary dynamics of resistance in Klebsiella pneumoniae

Tram Vo, Aïcha Hamieh, Marc Levy, Pierre Pontarotti, Jean-Marc Rolain, Vicky Merhej

Abstract

Background: Klebsiella pneumoniae is a major opportunistic pathogen responsible for various invasive infections. The rise of carbapenem-resistant K. pneumoniae, primarily due to acquisition of blaNDM genes, presents a serious global health threat. In French Polynesia, where international travel is frequent, sporadic cases of NDM-producing Enterobacteriales have emerged. This study aims to characterize the genomic features of NDM-producing K. pneumoniae isolates collected in French Polynesia and evaluate the roles of clonal expansion and horizontal gene transfer mediated by mobile genetic elements in blaNDM spread. Materials and Methods: Between July 2006 and September 2021, 17 carbapenemase-producing K. pneumoniae isolates were identified from 715 clinical samples in Tahiti. Whole-genome sequencing using Illumina MiSeq and Oxford Nanopore technologies was performed. Results: Seven NDM-producing K. pneumoniae strains were identified, five carrying blaNDM-1 and two carrying blaNDM-9 variants. All isolates were resistant to ertapenem (MICs 1 to >32 mg/L), with three resistant to imipenem (MICs 8 to >32 mg/L) and six to meropenem (MICs 2 to >8 mg/L). A novel IS26-mediated composite transposon, Tn7722 (16,246 bp), was detected in four isolates on IncF and IncR plasmids. This transposon also carried qnrS1 and aph(3')-VI genes, conferring resistance to fluoroquinolones and aminoglycosides. Tn7722-like elements were found in diverse bacterial genomes worldwide, suggesting that this transposon facilitates blaNDM transmission across multiple species and regions. Conclusion: NDM-producing K. pneumoniae in French Polynesia remain sporadic but genetically diverse, without evidence of local outbreak. This underscores the role of plasmid- and Tn7722-driven evolution and adaptation. Ongoing genomic surveillance is vital to track the evolution of high-risk clones and MGEs and to guide effective containment.

2602.23797 2026-03-03 physics.soc-ph q-bio.PE

Co-spreading dynamics of smoking behavior and awareness on social contact networks

Saicharan Ritwik Chinni, Anupama Sharma

2512.18114 2026-03-03 q-bio.QM

Greater than the Sum of Its Parts: Building Substructure into Protein Encoding Models

Robert Calef, Arthur Liang, Manolis Kellis, Marinka Zitnik

2506.09007 2026-03-03 cs.LG q-bio.QM

Branched Schrödinger Bridge Matching

Sophia Tang, Yinuo Zhang, Alexander Tong, Pranam Chatterjee

Comments Published at ICLR 2026. (Proceedings of the 14th International Conference on Learning Representations, Rio de Janeiro, Brazil)

2501.06762 2026-03-03 q-bio.NC cs.LG cs.NE

Improving the adaptive and continuous learning capabilities of artificial neural networks: Lessons from multi-neuromodulatory dynamics

Jie Mei, Alejandro Rodriguez-Garcia, Daigo Takeuchi, Gabriel Wainstein, Nina Hubig, Yalda Mohsenzadeh, Srikanth Ramaswamy

2307.14025 2026-03-03 cs.LG cs.CV eess.IV q-bio.QM stat.ML

Topological Inductive Bias fosters Multiple Instance Learning in Data-Scarce Scenarios

Salome Kazeminia, Carsten Marr, Bastian Rieck

2603.01780 2026-03-03 cs.LG q-bio.GN

D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation

Zhao Yang, Hengchang Liu, Chuan Cao, Bing Su

Comments Accepted as a workshop paper at MLGenX 2026

2603.01774 2026-03-03 q-bio.PE math.PR

Approximate message passing for block-structured ecological systems

Maxime Clenet, Mohammed-Younes Gueddari

2603.01701 2026-03-03 q-bio.PE cs.MA

A speciation simulation that partly passes open-endedness tests

Théo de Pinho, Lana Sinapayen

Comments 12 pages, 4 figures

2603.01568 2026-03-03 cs.LG cs.CV cs.IT math.IT q-bio.NC

Rate-Distortion Signatures of Generalization and Information Trade-offs

Leyla Roksan Caglar, Pedro A. M. Mediano, Baihan Lin

2603.01537 2026-03-03 cs.AI q-bio.BM q-bio.QM

Pharmacology Knowledge Graphs: Do We Need Chemical Structure for Drug Repurposing?

Youssef Abo-Dahab, Ruby Hernandez, Ismael Caleb Arechiga Duran

Comments 34 pages, 5 figures. Under review at Discover Artificial Intelligence

Abstract

The contributions of model complexity, data volume, and feature modalities to knowledge graph-based drug repurposing remain poorly quantified under rigorous temporal validation. We constructed a pharmacology knowledge graph from ChEMBL 36 comprising 5,348 entities including 3,127 drugs, 1,156 proteins, and 1,065 indications. A strict temporal split was enforced with training data up to 2022 and testing data from 2023 to 2025, together with biologically verified hard negatives mined from failed assays and clinical trials. We benchmarked five knowledge graph embedding models and a standard graph neural network with 3.44 million parameters that incorporates drug chemical structure using a graph attention encoder and ESM-2 protein embeddings. Scaling experiments ranging from 0.78 to 9.75 million parameters and from 25 to 100 percent of the data, together with feature ablation studies, were used to isolate the contributions of model capacity, graph density, and node feature modalities. Removing the graph-attention-based drug structure encoder and retaining only topological embeddings combined with ESM-2 protein features improved drug-protein PR-AUC from 0.5631 to 0.5785 while reducing VRAM usage from 5.30 GB to 353 MB. Replacing the drug encoder with Morgan fingerprints further degraded performance, indicating that explicit chemical structure representations can be detrimental for predicting pharmacological network interactions. Increasing model size beyond 2.44 million parameters yielded diminishing returns, whereas increasing training data consistently improved performance. External validation confirmed 6 of the top 14 novel predictions as established therapeutic indications. These results show that drug pharmacological behavior can be accurately predicted using target-centric information and drug network topology alone, without requiring explicit chemical structure representations.

2603.01184 2026-03-03 cs.LG cs.AI q-bio.NC stat.CO

Scaling of learning time for high dimensional inputs

Carlos Stein Brito

Comments 14 pages, 5 figures

2603.01054 2026-03-03 q-bio.OT

Topological analysis of bladder filling

Arturo Tozzi

Comments 8 pages, 1 figure

Abstract

Bladder function is typically assessed through pressure-volume relations, compliance indices and flow measurements, whereas structural evaluation relies largely on qualitative imaging findings. These approaches do not formally quantify how bladder geometry changes during filling. To distinguish structural reorganization from pure mechanical stiffness, we developed a simulation-based topological analysis of bladder filling grounded in mechanical parameters derived from the literature. Progressive filling was modeled under quasi-static conditions, generating multi-volume geometries from which spatial descriptors were computed. Drawing on the Freudenthal suspension theorem, filling was interpreted as a dimensional expansion process and structural stability was evaluated by testing whether geometric invariants remain preserved across increasing volumes. Simulated smooth expansion and controlled structural perturbations were compared under identical loading conditions. Pressure trajectories and wall stress estimates were similar across configurations when compliance was matched, whereas geometric descriptors showed divergent volume-indexed stability profiles in the presence of remodeling. Computable instability measures identified progressive spatial heterogeneity despite preserved global pressure behavior. By providing a quantitative measure of geometric continuity across successive filling states, our approach indicates that structural remodeling may become detectable before conventional functional impairment appears. Progressive surface irregularity can arise even when compliance, detrusor pressure and flow parameters remain within reference limits. Serial imaging over time may support identification of individuals at greater risk of diverticula formation, decompensation or structural complications despite stable pressure measurements.

2512.11582 2026-03-03 cs.LG cs.CV q-bio.NC

Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model

Sam Gijsen, Marc-Andre Schulz, Kerstin Ritter

Comments Accepted at ICLR 2026. Code and pretrained models available at https://github.com/SamGijsen/Brain-Semantoks

2511.11758 2026-03-03 q-bio.QM cs.AI

Protein Structure Tokenization via Geometric Byte Pair Encoding

Michael Sun, Weize Yuan, Gang Liu, Wojciech Matusik, Marinka Zitnik

Comments ICLR 2026

2510.25976 2026-03-03 cs.CV cs.AI q-bio.NC

Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer

Roman Beliy, Amit Zalcher, Jonathan Kogman, Navve Wasserman, Michal Irani

Comments Accepted at ICLR 2026

2509.26560 2026-03-03 stat.ML cs.LG q-bio.NC

Estimating Dimensionality of Neural Representations from Finite Samples

Chanwoo Chun, Abdulkadir Canatar, SueYeon Chung, Daniel Lee

2509.20719 2026-03-03 cs.LG q-bio.QM

A Genetic Algorithm for Navigating Synthesizable Molecular Spaces

Alston Lo, Connor W. Coley, Wojciech Matusik

Comments ICLR 2026

2508.14492 2026-03-03 q-bio.NC cs.AI nlin.AO

Synaptic bundle theory for spike-driven sensor-motor system: More than eight independent synaptic bundles collapse reward-STDP learning

Takeshi Kobayashi, Shogo Yonekura, Yasuo Kuniyoshi

Comments 5 pages, 4 figures

2508.11674 2026-03-03 cs.NE cs.AI q-bio.NC

Learning Internal Biological Neuron Parameters and Complexity-Based Encoding for Improved Spiking Neural Networks Performance

Zofia Rudnicka, Janusz Szczepanski, Agnieszka Pregowska

2508.10760 2026-03-03 q-bio.BM cs.AI

FROGENT: An End-to-End Full-process Drug Design Multi-Agent System

Qihua Pan, Dong Xu, Qianwei Yang, Jenna Xinyi Yao, Sisi Yuan, Zexuan Zhu, Jianqiang Li, Junkai Ji

Comments 37 pages, 20 figures

2506.07459 2026-03-03 cs.LG q-bio.QM

ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

Ziwen Wang, Jiajun Fan, Ruihan Guo, Thao Nguyen, Heng Ji, Ge Liu

Abstract

Protein generative models have shown remarkable promise in protein design, yet their success rates remain constrained by reliance on curated sequence-structure datasets and by misalignment between supervised objectives and real design goals. We present ProteinZero, an online reinforcement learning framework for inverse folding models that enables scalable, automated, and continuous self-improvement with computationally efficient feedback. ProteinZero employs a reward pipeline that combines structural guidance from ESMFold with a novel self-derived ddG predictor, providing stable multi-objective signals while avoiding the prohibitive cost of physics-based methods. To ensure robustness in online RL, we further introduce a novel embedding-level diversity regularizer that mitigates mode collapse and promotes functionally meaningful sequence variation. Within a general RL formulation balancing multi-reward optimization, KL-divergence from a reference model, and diversity regularization, ProteinZero achieves robust improvements across designability, stability, recovery, and diversity. On the CATH-4.3 benchmark, it consistently outperforms state-of-the-art baselines including ProteinMPNN, ESM-IF, and InstructPLM, reducing design failure rates by 36-48% and achieving success rates above 90% across diverse folds. Importantly, a complete RL run can be executed on a single 8-GPU node within three days, including reward computation and data generation. These results indicate that efficient online RL fine-tuning can complement supervised pretraining by allowing protein generative models to evolve continuously from their own outputs and optimize multiple design objectives without labeled data, opening new possibilities for exploring the vast protein design space. Full source code and model checkpoints will be released upon publication.

2506.06750 2026-03-03 cs.AI cs.LG q-bio.NC

Accuracy-Efficiency Trade-Offs in Spiking Neural Networks: A Lempel-Ziv Complexity Perspective on Learning Rules

Zofia Rudnicka, Janusz Szczepanski, Agnieszka Pregowska

2506.02052 2026-03-03 q-bio.BM cs.AI cs.LG q-bio.QM

General Protein Pretraining or Domain-Specific Designs? Benchmarking Protein Modeling on Realistic Applications

Shuo Yan, Yuliang Yan, Bin Ma, Chenao Li, Haochun Tang, Jiahua Lu, Minhua Lin, Yuyuan Feng, Enyan Dai

2505.12565 2026-03-03 cs.AI cs.CL cs.LG q-bio.QM

mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

Carl Edwards, Chi Han, Gawon Lee, Thao Nguyen, Sara Szymkuć, Chetan Kumar Prasad, Bowen Jin, Jiawei Han, Ying Diao, Ge Liu, Hao Peng, Bartosz A. Grzybowski, Martin D. Burke, Heng Ji

Comments Accepted to ICLR 2026 (Oral). Code: https://github.com/blender-nlp/mCLM Data and Model: https://huggingface.co/collections/language-plus-molecules/mclm

Abstract

Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs propose can often be challenging to make, and are almost never compatible with automated synthesis approaches. To better enable the discovery of functional small molecules, LLMs need to learn a new molecular language that is more effective in predicting properties and inherently synced with automated synthesis technology. Current molecule LLMs are limited by representing molecules based on atoms. In this paper, we argue that just like tokenizing texts into meaning-bearing (sub-)word tokens instead of characters, molecules should be tokenized at the level of functional building blocks, i.e., parts of molecules that bring unique functions and serve as effective building blocks for real-world automated laboratory synthesis. This motivates us to propose mCLM, a modular Chemical-Language Model that comprises a bilingual language model that understands both natural language descriptions of functions and molecular blocks. mCLM front-loads synthesizability considerations while improving the predicted functions of molecules in a principled manner. Experiments on FDA-approved drugs showed that mCLM is capable of significantly improving chemical functions. mCLM, with only 3B parameters, also achieves improvements in synthetic accessibility relative to 7 other leading generative AI methods including GPT-5. When tested on 122 out-of-distribution medicines using only building blocks/tokens that are compatible with automated modular synthesis, mCLM outperforms all baselines in property scores and synthetic accessibility. mCLM can also reason on multiple functions and iteratively self-improve to rescue drug candidates that failed late in clinical trials ("fallen angels").

2503.04490 2026-03-03 cs.CL q-bio.GN

Large Language Models in Bioinformatics: A Survey

Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan Chen, Xiangyu Shi, Yu Li

Comments Accepted by ACL 2025