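arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.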

2604.04858 2026-04-07 cs.LG q-bio.QM

FairLogue: A Toolkit for Intersectional Fairness Analysis in Clinical Machine Learning Models

Nick Souligne, Vignesh Subbian

Abstract

Objective: Algorithmic fairness is essential for equitable and trustworthy machine learning in healthcare. Most fairness tools emphasize single-axis demographic comparisons and may miss compounded disparities affecting intersectional populations. This study introduces Fairlogue, a toolkit designed to operationalize intersectional fairness assessment in observational and counterfactual contexts within clinical settings. Methods: Fairlogue is a Python-based toolkit composed of three components: 1) an observational framework extending demographic parity, equalized odds, and equal opportunity difference to intersectional populations; 2) a counterfactual framework evaluating fairness under treatment-based contexts; and 3) a generalized counterfactual framework assessing fairness under interventions on intersectional group membership. The toolkit was evaluated using electronic health record data from the All of Us Controlled Tier V8 dataset in a glaucoma surgery prediction task using logistic regression with race and gender as protected attributes. Results: Observational analysis identified substantial intersectional disparities despite moderate model performance (AUROC = 0.709; accuracy = 0.651). Intersectional evaluation revealed larger fairness gaps than single-axis analyses, including demographic parity differences of 0.20 and equalized odds true positive and false positive rate gaps of 0.33 and 0.15, respectively. Counterfactual analysis using permutation-based null distributions produced unfairness ("u-value") estimates near zero, suggesting observed disparities were consistent with chance after conditioning on covariates. Conclusion: Fairlogue provides a modular toolkit integrating observational and counterfactual methods for quantifying and evaluating intersectional bias in clinical machine learning workflows.
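As a rough illustration of what extending demographic parity to intersectional populations can mean, here is a minimal Python sketch. It is not FairLogue's API: the function name `intersectional_dp_difference` and the toy data are hypothetical, chosen only to show how a gap measured across race × gender subgroups can exceed the gap seen on either attribute alone.

```python
from collections import defaultdict

def intersectional_dp_difference(preds, attr_a, attr_b):
    """Largest gap in positive-prediction rate across (attr_a, attr_b) subgroups.

    Hypothetical helper for illustration; not part of any published toolkit.
    """
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for p, a, b in zip(preds, attr_a, attr_b):
        counts[(a, b)][0] += int(p == 1)
        counts[(a, b)][1] += 1
    rates = {g: pos / tot for g, (pos, tot) in counts.items()}
    return max(rates.values()) - min(rates.values()), rates

# Toy binary predictions with two protected attributes (made-up data).
preds  = [1, 0, 1, 1, 0, 0, 1, 0]
race   = ["A", "A", "A", "A", "B", "B", "B", "B"]
gender = ["F", "F", "M", "M", "F", "F", "M", "M"]

# Intersectional view: one rate per (race, gender) subgroup.
gap, rates = intersectional_dp_difference(preds, race, gender)

# Single-axis view: hold the second attribute constant so only race matters.
race_gap, _ = intersectional_dp_difference(preds, race, ["all"] * len(preds))

print(f"intersectional gap = {gap:.2f}, race-only gap = {race_gap:.2f}")
```

In this toy data the race-only gap is smaller than the intersectional gap, which mirrors the qualitative pattern the abstract reports. The numbers here are unrelated to the study's results.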


2604.04770 2026-04-07 cs.NE q-bio.NC

Regime Mapping of Oscillatory States in Balanced Spiking Networks with Multiple Time Scales

Tsung-Han Kuo, Tzu-Chia Tung

2604.04677 2026-04-07 q-bio.BM cs.LG

Towards protein folding pathways by reconstructing protein residue networks with a policy-driven model

Susan Khor

Comments 8 pages, 5 figures, 3 tables

2604.04469 2026-04-07 cs.CL q-bio.QM

Same Geometry, Opposite Noise: Transformer Magnitude Representations Lack Scalar Variability

Jon-Paul Cacioli

Comments 7 pages, 5 figures, 1 table. Pre-registered on OSF (osf.io/w4892). Companion to arXiv:2603.20642

2604.04239 2026-04-07 cs.LG cs.AI q-bio.QM

Good Rankings, Wrong Probabilities: A Calibration Audit of Multimodal Cancer Survival Models

Sajad Ghawami

Comments 15 pages, 5 figures

2604.04155 2026-04-07 cs.LG cs.IT math.IT q-bio.QM stat.ML

The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models

Prashant C. Raju

2604.04154 2026-04-07 cond-mat.stat-mech cond-mat.dis-nn cs.LG q-bio.NC

Non-Equilibrium Stochastic Dynamics as a Unified Framework for Insight and Repetitive Learning: A Kramers Escape Approach to Continual Learning

Gunn Kim

Comments 12 pages, 4 figures

Abstract

Continual learning in artificial neural networks is fundamentally limited by the stability-plasticity dilemma: systems that retain prior knowledge tend to resist acquiring new knowledge, and vice versa. Existing approaches, most notably elastic weight consolidation (EWC), address this empirically without a physical account of why plasticity eventually collapses as tasks accumulate. Separately, the distinction between sudden insight and gradual skill acquisition through repetitive practice has lacked a unified theoretical description. Here, we show that both problems admit a common resolution within non-equilibrium statistical physics. We model the state of a learning system as a particle evolving under Langevin dynamics on a double-well energy landscape, with the noise amplitude governed by a time-dependent effective temperature $T(t)$. The probability density obeys a Fokker-Planck equation, and transitions between metastable states are governed by the Kramers escape rate $k = \frac{\omega_0 \omega_b}{2\pi}\, e^{-\Delta E / T}$. We make two contributions. First, we identify the EWC penalty term as an energy barrier whose height grows linearly with the number of accumulated tasks, yielding an exponential collapse of the transition rate predicted analytically and confirmed numerically. Second, we show that insight and repetitive learning correspond to two qualitatively distinct temperature protocols within the same Fokker-Planck equation: insight events produce transient spikes in $T(t)$ that drive rapid barrier crossing, whereas repetitive practice operates at a modestly elevated but fixed temperature, achieving transitions through sustained stochastic diffusion. These results establish a physically grounded framework for understanding plasticity and its failure in continual learning systems, and suggest principled design criteria for adaptive noise schedules in artificial intelligence.
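The link between a linearly growing barrier and an exponentially collapsing transition rate can be checked with a few lines of Python. This is a sketch under stated assumptions, not the paper's simulation: the base barrier, the per-task increment, the attempt frequencies, and the temperatures below are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def kramers_rate(delta_e, temperature, omega_0=1.0, omega_b=1.0):
    """Kramers escape rate k = (omega_0 * omega_b / (2*pi)) * exp(-delta_e / T)."""
    return (omega_0 * omega_b / (2.0 * math.pi)) * math.exp(-delta_e / temperature)

def barrier_after_tasks(n_tasks, base_barrier=1.0, per_task=0.5):
    # Assumed linear growth of the barrier with task count.
    # The coefficients are illustrative, not taken from the paper.
    return base_barrier + per_task * n_tasks

T_fixed = 1.0
rates = [kramers_rate(barrier_after_tasks(n), T_fixed) for n in range(6)]

# With a linear barrier, successive rates shrink by the constant factor
# exp(-per_task / T), i.e. the rate decays exponentially in the task count.
ratios = [rates[n + 1] / rates[n] for n in range(len(rates) - 1)]

# A transient temperature spike (assumed value) raises the rate over the same barrier.
barrier = barrier_after_tasks(5)
rate_baseline = kramers_rate(barrier, T_fixed)
rate_spike = kramers_rate(barrier, 5.0)

for n, k in enumerate(rates):
    print(f"tasks={n}  k={k:.5f}")
```

Each added task multiplies the rate by the same factor, which is what exponential collapse means here. Raising the temperature for the same barrier increases the rate, consistent with the role the abstract assigns to spikes in $T(t)$.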


2603.09033 2026-04-07 q-bio.QM math.ST stat.TH

Sequential learning theory for Markov genealogy processes

David J Pascall

2602.14828 2026-04-07 q-bio.QM cs.LG

Exploring the limits of pre-trained embeddings in machine-guided protein design: a case study on predicting AAV vector viability

Ana F. Rodrigues, Lucas Ferraz, Laura Balbi, Pedro Giesteira Cotovio, Catia Pesquita

DOI: 10.1038/s41598-026-45458-5

Abstract

Effective representations of protein sequences are widely recognized as a cornerstone of machine learning-based protein design. Yet, protein bioengineering poses unique challenges for sequence representation, as experimental datasets typically feature few mutations, which are either sparsely distributed across the entire sequence or densely concentrated within localized regions. This limits the ability of sequence-level representations to extract functionally meaningful signals. In addition, comprehensive comparative studies remain scarce, despite their crucial role in clarifying which representations best encode relevant information and ultimately support superior predictive performance. In this study, we systematically evaluate multiple ProtBERT and ESM2 embedding variants as sequence representations, using the adeno-associated virus capsid as a case study and prototypical example of bioengineering, where functional optimization is targeted through highly localized sequence variation within an otherwise large protein. Our results reveal that, prior to fine-tuning, amino acid-level embeddings outperform sequence-level representations in supervised predictive tasks, whereas the latter tend to be more effective in unsupervised settings. However, optimal performance is only achieved when embeddings are fine-tuned with task-specific labels, with sequence-level representations providing the best performance. Moreover, our findings indicate that the extent of sequence variation required to produce notable shifts in sequence representations exceeds what is typically explored in bioengineering studies, showing the need for fine-tuning in datasets characterized by sparse or highly localized mutations.


2511.09216 2026-04-07 cs.LG q-bio.QM stat.ML

Controllable protein design with particle-based Feynman-Kac steering

Erik Hartman, Jonas Wallin, Johan Malmström, Jimmy Olsson

Comments In version 2 we added an experiment on improving designability through steering towards lower delta G

2509.09480 2026-04-07 q-bio.PE cond-mat.stat-mech

Large deviations in non-Markovian stochastic epidemics

Matan Shmunik, Michael Assaf

Comments 7 pages, 4 figures + Supplemental Information file

2410.08823 2026-04-07 q-bio.NC

Gray Anchoring: a New Computational Theory for Biological Color Constancy

Kai-Fu Yang, Dajun Xing, Yong-Jie Li

Comments 22 pages, 6 figures

2604.04033 2026-04-07 q-bio.NC cs.LG

Topological Sensitivity in Connectome-Constrained Neural Networks

Nalin Dhiman

Comments 17 pages, 5 fig

2604.04025 2026-04-07 q-bio.NC cs.SD

Neurological Plausibility of AI-Generated Music for Commercial Environments: An In-Silico Cortical Investigation Using Wubble and TRIBE v2

Shaad Sufi

Comments IEEE-style preprint; 4 figures; 4 tables

2604.03952 2026-04-07 stat.AP q-bio.QM

Multidimensional physical fitness is associated with reduced dementia risk through proteomic and neuroimaging pathways: a prospective cohort study of the UK Biobank

Yiqing Sun, Runyu Lin, Jiayue Qin, Feiyue Pan, Bingjie Li, Zhigang Yao

Comments 22 pages, 6 figures

2604.03911 2026-04-07 cs.LG q-bio.QM

Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics

Aniketh Iyengar, Jiaqi Han, Pengwei Sun, Mingjian Jiang, Jianwen Xie, Stefano Ermon

Comments Published at ICLR 2026. 38 pages, 17 figures, 17 tables

2604.03794 2026-04-07 q-bio.QM cs.SY eess.SY

Bounding Transient Moments for a Class of Stochastic Reaction Networks Using Kolmogorov's Backward Equation

Takeyuki Iwasaki, Yutaka Hori

2604.03791 2026-04-07 math.OC cs.SY eess.SY q-bio.QM

Acceleration of Moment Bound Optimization for Stochastic Chemical Reactions Using Reaction-wise Sparsity of Moment Equations

Tomoki Sadatoshi, Antonis Papachristodoulou, Yutaka Hori

2604.03630 2026-04-07 cs.AI q-bio.QM

A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction

Jinxi Xiang, Siyu Hou, Yuchen Li, Ryan Quinton, Xiaoming Zhang, Feyisope Eweje, Xiangde Luo, Yijiang Chen, Zhe Li, Colin Bergstrom, Ted Kim, Sierra Willens, Francesca Maria Olguin, Matthew Abikenari, Andrew Heider, Sanjeeth Rajaram, Joel Neal, Maximilian Diehn, Xiang Zhou, Ruijiang Li

Comments 29 pages, 5 figures. This manuscript is a work in progress; further updates and revisions will be posted as they become available

2604.03625 2026-04-07 nlin.AO econ.TH physics.bio-ph q-bio.PE

Overcoming unfairness via repeated interactions in mini-ultimatum game

Prosanta Mandal, Arunava Patra, Sagar Chakraborty

2604.03538 2026-04-07 physics.bio-ph cond-mat.stat-mech q-bio.SC

Thermal fluctuations set fundamental limits on ion channel function

Jose M. Betancourt, Benjamin B. Machta

Comments 7 pages, 4 figures, supplement included

2604.03480 2026-04-07 q-bio.NC cs.AI cs.CL

Large Language Models Align with the Human Brain during Creative Thinking

Mete Ismayilzada, Simone A. Luchini, Abdulkadir Gokce, Badr AlKhamissi, Antoine Bosselut, Antonio Laverghetta, Lonneke van der Plas, Roger E. Beaty

Comments Under review

2604.03361 2026-04-07 cs.LG q-bio.QM

The limits of bio-molecular modeling with large language models: a cross-scale evaluation

Yaxin Xu, Yue Zhou, Tianyu Zhao, Fengwei An, Zhixiang Ren

2603.20922 2026-04-07 q-bio.PE math.PR math.SP

Spectral Geometry and Heat Kernels on Phylogenetic Trees

Ángel Alfredo Morán Ledezma

2603.08861 2026-04-07 math.DS math.PR nlin.CD q-bio.PE

Geometric early warning indicator from stochastic separatrix structure in a random two-state ecosystem model

Yuzhu Shi, Larissa Serdukova, Yayun Zheng, Sergei Petrovskii, Valerio Lucarini

Comments 25 pages, 9 figures. Submitted to Physica D: Nonlinear Phenomena

2510.23391 2026-04-07 q-bio.NC

Conduction velocity of intracortical axons in monkey primary visual cortex grows with distance: implications for computation

Li Zhaoping

Comments 19 pages (double line spacing), 2 figures, this paper is also archived at arXiv:2510.23391 at PsyArXiv

2509.05110 2026-04-07 cond-mat.soft physics.bio-ph q-bio.TO

Elasticity and plasticity of epithelial gap closure

Maryam Setoudeh, Pierre A. Haas

Comments 6 pages, 5 figures; Supplemental Material: 8 pages, 2 figures

2509.04316 2026-04-07 cond-mat.soft physics.bio-ph q-bio.TO

Control of lumen morphology by lateral and basal cell surfaces

Chandraniva Guha Ray, Markus Mukenhirn, Alf Honigmann, Pierre A. Haas

Comments 13 pages, 5 figures

2507.23769 2026-04-07 q-bio.PE

Environment heterogeneity creates fast amplifiers of natural selection in graph-structured populations

Cecilia Fruet, Arthur Alexandre, Alia Abbara, Claude Loverdo, Anne-Florence Bitbol

Comments 62 pages, 25 figures

2505.22680 2026-04-07 q-bio.NC

Evidence for Bures-Wasserstein Boundary Dynamics in the Living Human Brain

Christian Kerskens