arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
2602.20883 2026-02-25 q-bio.PE

Adaptation by Cumulative Selection

Rudy Arthur

Abstract

Biological systems like long-lived clonal organisms, holobionts and clades challenge traditional evolutionary thinking since they adapt without populations or reproduction. This paper aims to provide an overarching theoretical framework which encompasses standard Darwinian evolution as well as other processes of adaptation. This framework is cumulative selection and I provide a general 'recipe' for it to occur. Lewontin's recipe for evolution by natural selection is shown to be a particular example of cumulative selection, but not the only one. Similarly, reproduction, inheritance and populations are just one way to perform cumulative selection. I discuss several other examples of cumulative selection including clonal organisms, dioecious populations, Gaia and neural networks.

2602.18510 2026-02-25 astro-ph.EP astro-ph.IM q-bio.PE

Experimental and numerical modeling of liposome congregation in meteorite craters of Early Earth

Vladimir M. Subbotin, Benjamin A. Turner, Brian A. Davies, Alric G. Lopez, Gennady Fiksel

Comments 6 pages, 7 figures

Abstract

This paper provides experimental and numerical evidence supporting the occurrence of liposome congregation at the floors of meteor craters on Early Earth. This work builds on our earlier research, which demonstrated that liposomes submerged in a shallow Archean pond are protected from harmful UV radiation. This protection allows them to survive long enough for autocatalytic replication of amphiphiles and for mutation and selection of assemblies that maximize membrane stability. For liposomes to fuse, grow, exchange contents and membranes, and divide, they need to establish a population, which means forming a dense conglomerate that enables close physical contact. The study demonstrates that such a congregation is feasible in bowl-shaped meteor craters on Early Earth, especially under periodic seismic disturbances.

2510.06578 2026-02-25 q-bio.QM

A geometric feature tracking approach for noninvasive patient specific estimation of leaflet strain from 3D images of heart valves

Wensi Wu, Matthew Daemer, Jeffrey A. Weiss, Alison M. Pouch, Matthew A. Jolley

Abstract

Valvular heart disease is prevalent and a major contributor to heart failure. Valve leaflet strain is a promising metric for evaluating the mechanics underlying the initiation and progression of valvular pathology. However, generalizable methods for noninvasively quantifying valvular strain from clinically acquired patient images remain limited. To address this limitation, we developed a geometric feature-tracking framework to quantify in vivo leaflet strain from 3DE images. The method integrates a cohort-derived geometric reference atlas to establish geometric correspondence and introduces a novel distance-weighted coherent point drift algorithm for non-rigid registration. We evaluated performance against a finite element benchmark model and compared the approach with conventional point-based tracking methods. The framework was applied to pediatric and adult patient datasets (N = 31) across variable valve morphologies. The proposed method demonstrated greater accuracy in quantifying anatomical alignment and leaflet strain than conventional point-based approaches. Validation against the finite element benchmark confirmed improved strain estimation. The framework achieved reliable inter-phase tracking of valve deformation across diverse morphologies in pediatric and adult patients. Analysis identified a consistent distribution pattern of the 1st principal strain associated with leaflet billow (prolapse). This feature-tracking framework provides a generalizable method for noninvasive quantification of atrioventricular valve leaflet strain from clinical 3DE images. Characterization of biomechanical strain patterns may improve prognostic assessment and support longitudinal evaluation of valvular heart disease. Further investigation of the biomechanical signatures of heart valve disease has the potential to enhance prognostic assessment and longitudinal evaluation of valvular heart disease.

2510.05193 2026-02-25 q-bio.QM

Harmonic fields and the mechanical response of a cellular monolayer to ablation

Oliver E. Jensen, Christopher K. Revell

Abstract

Multicellular tissues, such as the epithelium coating a developing embryo, often combine complex tissue shapes with heterogeneity in the spatial arrangement of individual cells. Discrete approximations, such as the cell vertex model, can accommodate these geometric features, but techniques for analysis of such models are underdeveloped. Here, we express differential operators defined on a network representing a monolayer of confluent cells in a framework inspired by discrete exterior calculus, considering scalar fields defined over cell vertices and centres and vector fields defined over cell edges. We achieve this by defining Hodge stars, wedge products and musical isomorphisms that are appropriate for a disordered monolayer for which cell edges and links between cell centres are not orthogonal, as is generic for epithelia. We use this framework to evaluate the harmonic vector field arising in an ablated planar monolayer, demonstrating an approximate $1/r$ scaling of the upper bound of the field's amplitude, where $r$ is the distance from the ablation. Using a vertex model that incorporates osmotic effects, we then calculate the mechanical response of a monolayer in a jammed state to ablation. Perturbation displacements exhibit long-range coherence, monopolar and quadrupolar features, and an approximate $1/r$ near-hole upper-bound scaling, implicating the harmonic field. The upper bounds on perturbation stress amplitudes scale approximately like $1/r^2$, a feature relevant to long-range mechanical signalling.

2503.01834 2026-02-25 q-bio.CB

Intercellular contact is sufficient to drive Fibroblast to Myofibroblast transitions

Vasuretha Chandar, Benjamin M. Goykadosh, Harikrishnan Parameswaran

Abstract

Fibroblast cells play a key role in maintaining the extracellular matrix. During wound healing, fibroblasts differentiate into highly contractile myofibroblasts, which secrete extracellular matrix proteins like collagen to facilitate tissue repair. Under normal conditions, myofibroblasts undergo programmed cell death after healing to prevent excessive scar formation. However, in diseases like fibrosis, the myofibroblasts remain active even after the wound is closed, resulting in excessive collagen buildup and a stiff, fibrotic matrix. The reasons for the persistence of myofibroblasts in fibrosis are not well understood. Here, we show the existence of a mechanism where direct physical contact between a fibroblast and a myofibroblast is sufficient for fibroblasts to transition into myofibroblasts. We demonstrate that the fibroblast-myofibroblast transition can occur even in the absence of known biochemical cues, such as growth factor activation or mechanical cues from a stiff, fibrotic matrix. Furthermore, we demonstrate that contact-based fibroblast-myofibroblast activation can be inhibited by the Gαq/11/14 inhibitor FR900359, which prevents the formation of myofibroblasts. These findings provide new insights into the persistence of fibrosis despite therapeutic interventions, suggesting a potential strategy for targeting the fibroblast-to-myofibroblast transition in fibrotic conditions.

2407.00976 2026-02-25 q-bio.NC

ODIN: Open Data In Neurophysiology: Advancements, Solutions & Challenges

Colleen J. Gillon, Cody Baker, Ryan Ly, Edoardo Balzani, Bingni W. Brunton, Manuel Schottdorf, Satrajit Ghosh, Nima Dehghani

Abstract

Across the life sciences, an ongoing effort over the last 50 years has made data and methods more reproducible and transparent. This openness has led to transformative insights and vastly accelerated scientific progress. For example, structural biology and genomics have undertaken systematic collection and publication of protein sequences and structures over the past half-century, and these data have led to scientific breakthroughs that were unthinkable when data collection first began. We believe that neuroscience is poised to follow the same path, and that principles of open data and open science will transform our understanding of the nervous system in ways that are impossible to predict at the moment. To this end, new social structures along with active and open scientific communities are essential to facilitate and expand the still limited adoption of open science practices in our field. Unified by shared values of openness, we set out to organize a symposium for Open Data in Neuroscience (ODIN) to strengthen our community and facilitate transformative neuroscience research at large. In this report, we share what we learned during this first ODIN event. We also lay out plans for how to grow this movement, document emerging conversations, and propose a path toward a better and more transparent science of tomorrow.

2602.20702 2026-02-25 q-bio.PE cond-mat.stat-mech nlin.CD physics.bio-ph

Tipping points in complex ecological systems

Alan Hastings, Sergei Petrovskii, Valerio Lucarini, Andrew Morozov

Abstract

Tipping points are one of the hot topics in the modern physics of complex systems. But what is a tipping point? A generic definition declares it as "a state of the system where a small change in its parameters can lead to a significant change in its properties". Additional ingredients that often enter the definition of a tipping process are the abruptness of the resulting change and its irreversibility, i.e. it is impossible to recover the initial state if one reverses the protocol of change of the parameters. However, there exist a number of different mathematical structures that can show this behavior; the one that was originally suggested as a tipping point (nowadays usually referred to as bifurcation-induced tipping) is just one of many. Different preconditions and/or different levels of detail included in the model, reflecting also different environmental forcing, can lead to a variety of tipping mechanisms. Furthermore, in a spatially extended system and/or a system with multiple scales, different parts can react to a change in environmental conditions differently or at different times, interacting with each other to create a tipping cascade. In this paper, using ecosystems as a paradigm of complex nonlinear open systems, we provide a critical overview of the progress made in tipping point science over the last 15 years. We highlight the main findings, identify gaps in our knowledge, and outline a roadmap for further progress.
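As an illustration of the bifurcation-induced tipping and irreversibility described in this abstract, the sketch below sweeps a parameter slowly up and then back down in a textbook bistable system, dx/dt = x - x^3 + p. This toy model, the function names and the step sizes are assumptions chosen for illustration and are not taken from the paper. The folds of this system sit at p = ±2/(3√3) ≈ ±0.385, so the state should jump between branches near those values, and the upward and downward sweeps should disagree in between (hysteresis).

```python
# Toy bistable system dx/dt = x - x**3 + p (illustrative assumption, not the paper's model).

def relax(x, p, dt=0.01, steps=2000):
    """Integrate dx/dt = x - x^3 + p with forward Euler, holding p fixed."""
    for _ in range(steps):
        x += dt * (x - x**3 + p)
    return x

def sweep(p_values, x0):
    """Change p quasi-statically, letting the state relax at each value."""
    x = x0
    path = []
    for p in p_values:
        x = relax(x, p)
        path.append((p, x))
    return path

n = 201
ps_up = [-1 + 2 * i / (n - 1) for i in range(n)]      # p from -1 to 1
up = sweep(ps_up, x0=-1.3)                            # start on the lower branch
down = sweep(list(reversed(ps_up)), x0=up[-1][1])     # reverse the protocol

# First parameter value at which each sweep has switched branch.
jump_up = next(p for p, x in up if x > 0)
jump_down = next(p for p, x in down if x < 0)

if __name__ == "__main__":
    print("upward sweep tips near p =", round(jump_up, 2))
    print("downward sweep tips near p =", round(jump_down, 2))
    print("state at p = 0 (up, down):", up[100][1], down[100][1])
```

Reversing the parameter does not retrace the same states: at p = 0 the upward sweep is still on the lower branch while the downward sweep is on the upper one.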

2602.20495 2026-02-25 q-bio.QM

Unveiling Scaling Laws of Parameter Identifiability and Uncertainty Quantification in Data-Driven Biological Modeling

Shun Wang, Wenrui Hao

Comments 45 pages, 5 figures

Abstract

Integrating high-dimensional biological data into data-driven mechanistic modeling requires rigorous practical identifiability to ensure interpretability and generalizability. However, coordinate identifiability analysis often suffers from numerical instabilities near singular local minimizers. We present a computational framework that uncovers fundamental scaling laws governing practical identifiability through asymptotic analysis. By synthesizing Fisher information with perturbed Hessian matrices, we establish a hierarchical approach to quantify coordinate identifiability and inform uncertainty quantification within non-identifiable subspaces across different orders. Supported by rigorous mathematical analysis and validated on synthetic and real-world data, our framework was applied to HIV-host dynamics and spatiotemporal amyloid-beta propagation. These applications demonstrate the framework's efficiency in elucidating critical mechanisms underlying HIV diagnostics and Alzheimer's disease progression. In the era of large-scale mechanistic digital twins, our framework provides the scaling laws for data-driven modeling in terms of both parameter identifiability and uncertainty, ensuring that data-driven inferences are grounded in verifiable biological reality.

2602.20449 2026-02-25 cs.LG cs.AI cs.CL q-bio.BM

Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference

Anna Hart, Chi Han, Jeonghwan Kim, Huimin Zhao, Heng Ji

Abstract

Modern Protein Language Models (PLMs) apply transformer-based model architectures from natural language processing to biological sequences, predicting a variety of protein functions and properties. However, protein language has key differences from natural language, such as a rich functional space despite a vocabulary of only 20 amino acids. These differences motivate research into how transformer-based architectures operate differently in the protein domain and how we can better leverage PLMs to solve protein-related tasks. In this work, we begin by directly comparing how the distribution of information stored across layers of attention heads differs between the protein and natural language domains. Furthermore, we adapt a simple early-exit technique (originally used in the natural language domain to improve efficiency at the cost of performance) to achieve both increased accuracy and substantial efficiency gains in protein non-structural property prediction by allowing the model to automatically select protein representations from the intermediate layers of the PLMs for the specific task and protein at hand. We achieve performance gains ranging from 0.4 to 7.01 percentage points while simultaneously improving efficiency by over 10 percent across models and non-structural prediction tasks. Our work opens up an area of research directly comparing how language models change behavior when moved into the protein domain and advances language modeling in biological domains.

2602.20344 2026-02-25 cs.LG cs.AI q-bio.QM

Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction

Jiele Wu, Haozhe Ma, Zhihan Guo, Thanh Vinh Vo, Tze Yun Leong

Comments 15 pages (8 pages main text), 8 figures

Abstract

Graph self-supervised learning (GSSL) has demonstrated strong potential for generating expressive graph embeddings without the need for human annotations, making it particularly valuable in domains with high labeling costs such as molecular graph analysis. However, existing GSSL methods mostly focus on node- or edge-level information, often ignoring chemically relevant substructures which strongly influence molecular properties. In this work, we propose Graph Semantic Predictive Network (GraSPNet), a hierarchical self-supervised framework that explicitly models both atomic-level and fragment-level semantics. GraSPNet decomposes molecular graphs into chemically meaningful fragments without predefined vocabularies and learns node- and fragment-level representations through multi-level message passing with masked semantic prediction at both levels. This hierarchical semantic supervision enables GraSPNet to learn multi-resolution structural information that is both expressive and transferable. Extensive experiments on multiple molecular property prediction benchmarks demonstrate that GraSPNet learns chemically meaningful representations and consistently outperforms state-of-the-art GSSL methods in transfer learning settings.

2602.20289 2026-02-25 eess.SP cs.LG q-bio.QM

The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA

Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein

Comments 37 pages, 10 figures, 12 tables

Abstract

Magnetic resonance spectroscopy (MRS) is used to quantify metabolites in vivo and estimate biomarkers for conditions ranging from neurological disorders to cancers. Quantifying low-concentration metabolites such as GABA ($\gamma$-aminobutyric acid) is challenging due to low signal-to-noise ratio (SNR) and spectral overlap. We investigate and validate deep learning for quantifying complex, low-SNR, overlapping signals from MEGA-PRESS spectra, devise a convolutional neural network (CNN) and a Y-shaped autoencoder (YAE), and select the best models via Bayesian optimisation on 10,000 simulated spectra from slice-profile-aware MEGA-PRESS simulations. The selected models are trained on 100,000 simulated spectra. We validate their performance on 144 spectra from 112 experimental phantoms containing five metabolites of interest (GABA, Glu, Gln, NAA, Cr) with known ground truth concentrations across solution and gel series acquired at 3 T under varied bandwidths and implementations. These models are further assessed against the widely used LCModel quantification tool. On simulations, both models achieve near-perfect agreement (small MAEs; regression slopes $\approx 1.00$, $R^2 \approx 1.00$). On experimental phantom data, errors initially increased substantially. However, modelling variable linewidths in the training data significantly reduced this gap. The best augmented deep learning models achieved a mean MAE for GABA over all phantom spectra of 0.151 (YAE) and 0.160 (FCNN) in max-normalised relative concentrations, outperforming the conventional baseline LCModel (0.220). A sim-to-real gap remains, but physics-informed data augmentation substantially reduced it. Phantom ground truth is needed to judge whether a method will perform reliably on real data.

2602.20209 2026-02-25 q-bio.QM cs.LG

Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control

Shaorong Chen, Jingbo Zhou, Jun Xia

Abstract

The discovery of novel proteins relies on sensitive protein identification, for which de novo peptide sequencing (DNPS) from mass spectra is a crucial approach. While deep learning has advanced DNPS, existing models inadequately enforce the fundamental mass consistency constraint: a predicted peptide's mass must match the experimentally measured precursor mass. Previous DNPS methods often treat this critical information as a simple input feature or use it in post-processing, leading to numerous implausible predictions that do not adhere to this fundamental physical property. To address this limitation, we introduce DiffuNovo, a novel regressor-guided diffusion model for de novo peptide sequencing that provides explicit peptide-level mass control. Our approach integrates the mass constraint at two critical stages: during training, a novel peptide-level mass loss guides model optimization, while at inference, regressor-based guidance from gradient-based updates in the latent space steers generation so that the predicted peptide adheres to the mass constraint. Comprehensive evaluations on established benchmarks demonstrate that DiffuNovo surpasses state-of-the-art methods in DNPS accuracy. Additionally, as the first DNPS model to employ a diffusion model as its core backbone, DiffuNovo leverages the powerful controllability of the diffusion architecture and achieves a significant reduction in mass error, thereby producing much more physically plausible peptides. These innovations represent a substantial advancement toward robust and broadly applicable DNPS. The source code is available in the supplementary material.
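The mass consistency constraint mentioned in this abstract can be made concrete with a small check. The sketch below is not DiffuNovo's loss or guidance procedure; it only computes a peptide's theoretical monoisotopic mass from standard residue masses and compares it with the neutral mass implied by a precursor m/z and charge. The function names and the 20 ppm tolerance are illustrative assumptions, and modified residues are ignored.

```python
# Standard monoisotopic residue masses in daltons (unmodified amino acids only).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier

def peptide_mass(seq):
    """Theoretical neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

def precursor_neutral_mass(mz, charge):
    """Neutral mass implied by an observed precursor m/z and charge state."""
    return (mz - PROTON) * charge

def mass_error_ppm(seq, mz, charge):
    """Signed difference between theoretical and observed mass, in ppm."""
    observed = precursor_neutral_mass(mz, charge)
    return (peptide_mass(seq) - observed) / observed * 1e6

def is_mass_consistent(seq, mz, charge, tol_ppm=20.0):
    """True if the peptide mass matches the precursor within tol_ppm (assumed tolerance)."""
    return abs(mass_error_ppm(seq, mz, charge)) <= tol_ppm

if __name__ == "__main__":
    # Build a precursor m/z for a doubly charged example peptide, then test candidates.
    mz = (peptide_mass("PEPTIDE") + 2 * PROTON) / 2
    print(is_mass_consistent("PEPTIDE", mz, 2))
    print(is_mass_consistent("PEPTIDEK", mz, 2))
```

A check of this kind is what a post-processing filter would apply; the abstract's point is that enforcing the constraint only after generation leaves many candidates to be discarded.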

2602.20198 2026-02-25 q-bio.QM cs.LG

KEMP-PIP: A Feature-Fusion Based Approach for Pro-inflammatory Peptide Prediction

Soumik Deb Niloy, Md. Fahmid-Ul-Alam Juboraj, Swakkhar Shatabda

Comments 11 pages, 4 figures, 6 tables; includes web server and GitHub implementation

Abstract

Pro-inflammatory peptides (PIPs) play critical roles in immune signaling and inflammation but are difficult to identify experimentally due to costly and time-consuming assays. To address this challenge, we present KEMP-PIP, a hybrid machine learning framework that integrates deep protein embeddings with handcrafted descriptors for robust PIP prediction. Our approach combines contextual embeddings from pretrained ESM protein language models with multi-scale k-mer frequencies, physicochemical descriptors, and modlAMP sequence features. Feature pruning and class-weighted logistic regression manage high dimensionality and class imbalance, while ensemble averaging with an optimized decision threshold enhances the sensitivity-specificity balance. Through systematic ablation studies, we demonstrate that integrating complementary feature sets consistently improves predictive performance. On the standard benchmark dataset, KEMP-PIP achieves an MCC of 0.505, accuracy of 0.752, and AUC of 0.762, outperforming ProIn-fuse, MultiFeatVotPIP, and StackPIP. Relative to StackPIP, these results represent improvements of 9.5% in MCC and 4.8% in both accuracy and AUC. The KEMP-PIP web server is freely available at https://nilsparrow1920-kemp-pip.hf.space/ and the full implementation at https://github.com/S18-Niloy/KEMP-PIP.
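To give a sense of what "multi-scale k-mer frequencies" can look like as handcrafted features, here is a minimal sketch that counts overlapping k-mers over the 20 standard amino acids and concatenates the normalized frequencies for several values of k. The function names, the choice of k = 1 and 2, and the handling of non-standard residues are assumptions for illustration; the KEMP-PIP repository should be consulted for the features the authors actually use.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order

def kmer_frequencies(seq, k):
    """Frequency of every possible k-mer (20**k entries) among overlapping windows of seq."""
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    windows = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(windows))
    if windows == 0:
        return [0.0] * len(vocabulary)
    # Windows containing non-standard letters are simply not represented in the vector.
    return [counts[kmer] / windows for kmer in vocabulary]

def multiscale_kmer_features(seq, ks=(1, 2)):
    """Concatenate k-mer frequency vectors for several k (hypothetical default: 1 and 2)."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    return features

if __name__ == "__main__":
    vec = multiscale_kmer_features("ACDA")
    print(len(vec))  # 20 single-residue entries plus 400 dipeptide entries
```

Because the vector length grows as 20^k, larger k quickly produces high-dimensional, sparse features, which is consistent with the abstract's mention of feature pruning.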

2602.17557 2026-02-25 q-bio.NC cs.AI cs.CV

Probability-Invariant Random Walk Learning on Gyral Folding-Based Cortical Similarity Networks for Alzheimer's and Lewy Body Dementia Diagnosis

Minheng Chen, Tong Chen, Chao Cao, Jing Zhang, Tianming Liu, Li Su, Dajiang Zhu

Abstract

Alzheimer's disease (AD) and Lewy body dementia (LBD) present overlapping clinical features yet require distinct diagnostic strategies. While neuroimaging-based brain network analysis is promising, atlas-based representations may obscure individualized anatomy. Gyral folding-based networks using three-hinge gyri provide a biologically grounded alternative, but inter-individual variability in cortical folding results in inconsistent landmark correspondence and highly irregular network sizes, violating the fixed-topology and node-alignment assumptions of most existing graph learning methods, particularly in clinical datasets where pathological changes further amplify anatomical heterogeneity. We therefore propose a probability-invariant random-walk-based framework that classifies individualized gyral folding networks without explicit node alignment. Cortical similarity networks are built from local morphometric features and represented by distributions of anonymized random walks, with an anatomy-aware encoding that preserves permutation invariance. Experiments on a large clinical cohort of AD and LBD subjects show consistent improvements over existing gyral folding and atlas-based models, demonstrating robustness and potential for dementia diagnosis.

2602.02620 2026-02-25 q-bio.QM cs.AI cs.LG

CryoLVM: Self-supervised Learning from Cryo-EM Density Maps with Large Vision Models

Weining Fu, Kai Shu, Kui Xu, Qiangfeng Cliff Zhang

Abstract

Cryo-electron microscopy (cryo-EM) has revolutionized structural biology by enabling near-atomic-level visualization of biomolecular assemblies. However, the exponential growth in cryo-EM data throughput and complexity, coupled with diverse downstream analytical tasks, necessitates unified computational frameworks that transcend current task-specific deep learning approaches with limited scalability and generalizability. We present CryoLVM, a foundation model that learns rich structural representations from experimental density maps with resolved structures by leveraging the Joint-Embedding Predictive Architecture (JEPA) integrated with a SCUNet-based backbone, and that can be rapidly adapted to various downstream tasks. We further introduce a novel histogram-based distribution alignment loss that accelerates convergence and enhances fine-tuning performance. We demonstrate CryoLVM's effectiveness across three critical cryo-EM tasks: density map sharpening, density map super-resolution, and missing wedge restoration. Our method consistently outperforms state-of-the-art baselines across multiple density map quality metrics, confirming its potential as a versatile model for a wide spectrum of cryo-EM applications.

2512.24192 2026-02-25 q-bio.BM

SeedProteo: Accurate De Novo All-Atom Design of Protein Binders

Wei Qu, Yiming Ma, Fei Ye, Chan Lu, Yi Zhou, Kexin Zhang, Lan Wang, Minrui Gui, Quanquan Gu

Abstract

We present SeedProteo, a diffusion-based model for de novo all-atom protein design. We demonstrate how to repurpose a cutting-edge folding architecture into a powerful generative design framework by effectively integrating self-conditioning features. Extensive benchmarks highlight the model's capabilities across two distinct tasks: in unconditional generation, SeedProteo exhibits superior length generalization and structural diversity, maintaining robustness for long sequences and complex topologies; in binder design, it achieves state-of-the-art performance among open-source methods, attaining the highest in-silico design success rates, structural diversity and novelty. Finally, we validate SeedProteo through wet-lab assays on two therapeutic targets, achieving hit rates of 70%-80% and picomolar-level binding affinities, establishing leading results. To facilitate community adoption, we provide public access to SeedProteo via a webserver (https://seedfold.io/proteinDesign).

2509.15796 2026-02-25 cs.LG cs.AI q-bio.BM

Monte Carlo Tree Diffusion with Multiple Experts for Protein Design

Xuefeng Liu, Mingxuan Cao, Songhao Jiang, Xiao Luo, Xiaotian Duan, Mengdi Wang, Tobin R. Sosnick, Jinbo Xu, Rick Stevens

Abstract

The goal of protein design is to generate amino acid sequences that fold into functional structures with desired properties. Prior methods combining autoregressive language models with Monte Carlo Tree Search (MCTS) struggle with long-range dependencies and suffer from an impractically large search space. We propose MCTD-ME, Monte Carlo Tree Diffusion with Multiple Experts, which integrates masked diffusion models with tree search to enable multi-token planning and efficient exploration under the guidance of multiple experts. Unlike autoregressive planners, MCTD-ME uses biophysical-fidelity-enhanced diffusion denoising as the rollout engine, jointly revising multiple positions and scaling to large sequence spaces. It further leverages experts of varying capacities to enrich exploration, guided by a pLDDT-based masking schedule that targets low-confidence regions while preserving reliable residues. We propose a novel multi-expert selection rule (PH-UCT-ME) that extends Shannon-entropy-based UCT to expert ensembles with mutual information. MCTD-ME achieves superior performance on the CAMEO and PDB benchmarks, excelling in protein design tasks such as inverse folding, folding, and conditional design challenges like motif scaffolding on lead optimization tasks. Our framework is model-agnostic, plug-and-play, and extensible to de novo protein engineering and beyond.

2507.08055 2026-02-25 q-bio.OT

MCPmed: A Call for MCP-Enabled Bioinformatics Web Services for LLM-Driven Discovery

Matthias Flotho, Ian Ferenc Diks, Philipp Flotho, Leidy-Alejandra G. Molano, Pascal Hirsch, Andreas Keller

Abstract

Bioinformatics web servers are critical resources in modern biomedical research, facilitating interactive exploration of datasets through custom-built interfaces with rich visualization capabilities. However, this human-centric design limits machine readability for large language models (LLMs) and deep research agents. We address this gap by adapting the Model Context Protocol (MCP) to bioinformatics web server backends, adding a standardized, machine-actionable layer that explicitly associates web service endpoints with scientific concepts and detailed metadata. Our implementations across widely used databases (GEO, STRING, UCSC Cell Browser) demonstrate enhanced exploration capabilities through MCP-enabled LLMs. To accelerate adoption, we propose MCPmed, a community effort supplemented by lightweight breadcrumbs for services not yet fully MCP-enabled and templates for setting up new servers. This structured transition significantly enhances automation, reproducibility, and interoperability, preparing bioinformatics web services for next-generation research agents.

2507.00407 2026-02-25 physics.chem-ph cs.AI q-bio.QM

Augmenting Molecular Graphs with Geometries via Machine Learning Interatomic Potentials

Cong Fu, Yuchao Lin, Zachary Krueger, Haiyang Yu, Maho Nakata, Jianwen Xie, Emine Kucukbenli, Xiaofeng Qian, Shuiwang Ji

Comments Accepted by TMLR

Abstract

Accurate molecular property predictions require 3D geometries, which are typically obtained using expensive methods such as density functional theory (DFT). Here, we attempt to obtain molecular geometries by relying solely on machine learning interatomic potential (MLIP) models. To this end, we first curate a large-scale molecular relaxation dataset comprising 3.5 million molecules and 300 million snapshots. Then MLIP models are pre-trained with supervised learning to predict energy and forces given 3D molecular structures. We show that, once trained, the pre-trained models can be used in different ways to obtain geometries, either explicitly or implicitly. First, they can be used to obtain approximate low-energy 3D geometries via geometry optimization. While these geometries do not consistently reach DFT-level chemical accuracy or convergence, they can still improve downstream performance compared to non-relaxed structures. To mitigate potential biases and enhance downstream predictions, we introduce geometry fine-tuning based on the relaxed 3D geometries. Second, the pre-trained models can be directly fine-tuned for property prediction when ground truth 3D geometries are available. Our results demonstrate that MLIP pre-trained models trained on relaxation data can learn transferable molecular representations to improve downstream molecular property prediction and can provide practically valuable but approximate molecular geometries that benefit property predictions. Our code is publicly available at: https://github.com/divelab/AIRS/