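arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.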

2603.26544 2026-03-30 cs.CL q-bio.QM

Development of a European Union Time-Indexed Reference Dataset for Assessing the Performance of Signal Detection Methods in Pharmacovigilance using a Large Language Model

Maria Kefala, Jeffery L. Painter, Syed Tauhid Bukhari, Maurizio Sessa

Comments 4 Figures and 2 Tables

Abstract

Background: The identification of optimal signal detection methods is hindered by the lack of reliable reference datasets. Existing datasets do not capture when adverse events (AEs) are officially recognized by regulatory authorities, preventing restriction of analyses to pre-confirmation periods and limiting evaluation of early detection performance. This study addresses this gap by developing a time-indexed reference dataset for the European Union (EU), incorporating the timing of AE inclusion in product labels along with regulatory metadata. Methods: Current and historical Summaries of Product Characteristics (SmPCs) for all centrally authorized products (n=1,513) were retrieved from the EU Union Register of Medicinal Products (data lock: 15 December 2025). Section 4.8 was extracted and processed using DeepSeek V3 to identify AEs. Regulatory metadata, including labelling changes, were programmatically extracted. Time indexing was based on the date of AE inclusion in the SmPC. Results: The database includes 17,763 SmPC versions spanning 1995-2025, comprising 125,026 drug-AE associations. The time-indexed reference dataset, restricted to active products, included 1,479 medicinal products and 110,823 drug-AE associations. Most AEs were identified pre-marketing (74.5%) versus post-marketing (25.5%). Safety updates peaked around 2012. Gastrointestinal, skin, and nervous system disorders were the most represented System Organ Classes. Drugs had a median of 48 AEs across 14 SOCs. Conclusions: The proposed dataset addresses a critical gap in pharmacovigilance by incorporating temporal information on AE recognition for the EU, supporting more accurate assessment of signal detection performance and facilitating methodological comparisons across analytical approaches.
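
The idea of restricting an analysis to the pre-confirmation period can be illustrated with a minimal sketch. It assumes a reference table keyed by (drug, adverse event) with the date the AE first appeared in the SmPC, and a list of dated spontaneous reports; all names, drugs, and dates below are hypothetical and are not taken from the dataset described in the abstract.

```python
from datetime import date

# Hypothetical reference entries: (drug, adverse event) -> date the AE
# first appeared in the product label (SmPC).
reference = {
    ("drug_a", "nausea"): date(2012, 6, 1),
    ("drug_b", "rash"): date(2018, 3, 15),
}

# Hypothetical spontaneous reports: (drug, adverse event, report date).
reports = [
    ("drug_a", "nausea", date(2011, 1, 10)),
    ("drug_a", "nausea", date(2013, 2, 5)),
    ("drug_b", "rash", date(2017, 12, 1)),
    ("drug_b", "headache", date(2019, 4, 2)),
]


def pre_confirmation(reports, reference):
    """Keep reports for reference pairs dated before the AE entered the label."""
    kept = []
    for drug, ae, reported in reports:
        label_date = reference.get((drug, ae))
        # Pairs absent from the reference table are dropped, as are
        # reports filed on or after the labelling date.
        if label_date is not None and reported < label_date:
            kept.append((drug, ae, reported))
    return kept


early = pre_confirmation(reports, reference)
```

In this toy setup, a signal detection method would be scored only on `early`, so that reports filed after regulatory recognition cannot inflate its apparent performance.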

2603.26370 2026-03-30 q-bio.QM math.DS math.OC

Multi-scale Metabolic Modeling and Simulation

Peter E. Carstensen, Teddy Groves, Lars K. Nielsen, Ulrich Krühne, Krist V. Gernaey, John B. Jørgensen

Comments To be presented at ESCAPE36, 7 pages, 6 figures

2603.26267 2026-03-30 q-bio.NC

On the RAID dataset of perceptual responses: analysis and statistical causes

Paula Daudén-Oliver, David Agost-Beltran, Emilio Sansano-Sansano, Raul Montoliu, Valero Laparra, Jesús Malo, Marina Martínez-Garcia

2603.26110 2026-03-30 q-bio.QM

TurboESM: Ultra-Efficient 3-Bit KV Cache Quantization for Protein Language Models with Orthogonal Rotation and QJL Correction

Yue Hu, Junqing Wang, Yingchao Liu

Comments 16 pages, 7 tables

Abstract

The rapid scaling of Protein Language Models (PLMs) has unlocked unprecedented accuracy in protein structure prediction and design, but the quadratic memory growth of the Key-Value (KV) cache during inference remains a prohibitive barrier for single-GPU deployment and high-throughput generation. While 8-bit quantization is now standard, 3-bit quantization remains elusive due to severe numerical outliers in activations. This paper presents TurboESM, an adaptation of Google's TurboQuant to the PLM domain. We solve the fundamental incompatibility between Rotary Position Embeddings (RoPE) and orthogonal transformations by deriving a RoPE-first rotation pipeline. We introduce a head-wise SVD calibration method tailored to the amino acid activation manifold, a dual look-up table (LUT) strategy for asymmetric K/V distributions, and a 1-bit Quantized Johnson-Lindenstrauss (QJL) residual correction. All experiments are conducted on ESM-2 650M, where our implementation achieves a 7.1x memory reduction (330 MB to 47 MB) while maintaining cosine similarity > 0.96 in autoregressive decoding across diverse protein families, including short peptides, transmembrane helices, enzyme active site fragments, and intrinsically disordered regions. We further implement a Triton-based fused decode attention kernel that eliminates intermediate dequantization memory allocations, achieving a 1.96x speedup over the PyTorch two-step path for the KV fetch operation alone; however, TurboESM incurs a prefill overhead of 21-27 ms relative to the original model due to KV quantization and packing, making it most suitable for memory-bound scenarios rather than latency-critical short-sequence workloads. Analysis reveals that PLMs exhibit sharper outlier profiles than large language models (LLMs) due to amino acid vocabulary sparsity, and our method effectively addresses these distributions.
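
Why an orthogonal rotation helps low-bit quantization can be shown with a toy sketch. It is not the TurboESM pipeline: it swaps in a normalized Walsh-Hadamard transform for the calibrated rotation, uses plain uniform 3-bit quantization instead of look-up tables, and omits RoPE and the QJL correction. The vectors and function names are hypothetical.

```python
import math


def hadamard_rotate(x):
    """Normalized fast Walsh-Hadamard transform (an orthogonal map).

    len(x) must be a power of two.
    """
    y = list(x)
    n = len(y)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]


def quantize_3bit(x):
    """Uniform 3-bit quantization: integer codes in 0..7 plus (min, step)."""
    lo, hi = min(x), max(x)
    step = (hi - lo) / 7 or 1.0  # fall back to 1.0 for a constant vector
    codes = [round((v - lo) / step) for v in x]
    return codes, lo, step


def dequantize(codes, lo, step):
    return [lo + c * step for c in codes]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


# Toy key with one large outlier channel, and a toy query (both hypothetical).
key = [0.1, -0.2, 0.05, 8.0, -0.1, 0.3, 0.0, -0.15]
query = [0.5, 0.1, -0.3, 0.2, 0.4, -0.6, 0.7, 0.05]

# Rotating both vectors with the same orthogonal map leaves q.k unchanged.
key_rot = hadamard_rotate(key)
query_rot = hadamard_rotate(query)

codes, lo, step = quantize_3bit(key_rot)
key_hat = dequantize(codes, lo, step)

exact_score = dot(query, key)
rotated_score = dot(query_rot, key_rot)
approx_score = dot(query_rot, key_hat)
```

Because the rotation is orthogonal, `rotated_score` matches `exact_score` up to floating-point error, while the outlier's magnitude is spread over all channels, which narrows the range the 3-bit grid has to cover. Only the quantized `codes` and the two scalars `lo` and `step` would need to be stored per vector in this simplified scheme.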

2509.24779 2026-03-30 cs.LG q-bio.BM

MarS-FM: Generative Modeling of Molecular Dynamics via Markov State Models

Kacper Kapuśniak, Cristian Gabellini, Michael Bronstein, Prudencio Tossou, Francesco Di Giovanni

2507.03005 2026-03-30 cs.CL q-bio.PE

Beyond cognacy

Gerhard Jäger

Comments 9 pages, 2 figures

2410.03757 2026-03-30 math.OC math-ph math.CA math.MP q-bio.QM

Framing structural identifiability in terms of parameter symmetries

Johannes G Borgqvist, Alexander P Browning, Fredrik Ohlsson, Ruth E Baker

Comments 45 pages, 2 figures

2405.16885 2026-03-30 stat.ME q-bio.PE

Hidden Markov modelling of spatio-temporal dynamics of measles in 1750-1850 Finland

Tiia-Maria Pasanen, Jouni Helske, Tarmo Ketola

2603.26007 2026-03-30 q-bio.NC cs.AI cs.CV

Longitudinal Boundary Sharpness Coefficient Slopes Predict Time to Alzheimer's Disease Conversion in Mild Cognitive Impairment: A Survival Analysis Using the ADNI Cohort

Ishaan Cherukuri

2603.25991 2026-03-30 eess.SY cs.SY q-bio.NC

Passivity-Based Control of Electrographic Seizures in a Neural Mass Model of Epilepsy

Gagan Acharya, Erfan Nozari

2603.25986 2026-03-30 q-bio.PE q-bio.QM

Evaluating Phylogenetic Comparative Methods under Reticulate Evolutionary Scenarios

Lydia Morley, Emma Lehmberg, Sungsik Kong

Comments 28 pages, 10 figures, 4 tables

2603.25880 2026-03-30 q-bio.QM cs.AI cs.LG

Spectral Coherence Index: A Model-Free Metric for Protein Structural Ensemble Quality Assessment

Yuda Bi, Huaiwen Zhang, Jingnan Sun, Vince D Calhoun

2603.25762 2026-03-30 q-bio.GN quant-ph

QHap: Quantum-Inspired Haplotype Phasing

Rui Zhang, Xian-Zhe Tao, Yibo Chen, Jiawei Zhang, Lei He, Dongming Fang, Lin Yang, Yuhui Sun, Qinyuan Zheng, Xinmeng Shi, Yang Zhou, Wanyi Chen, Chentao Yang, Man-Hong Yung, Jun-Han Huang

Comments 19 pages, 7 figures

2603.25755 2026-03-30 physics.chem-ph cs.LG q-bio.QM stat.ML

KANEL: Kolmogorov-Arnold Network Ensemble Learning Enables Early Hit Enrichment in High-Throughput Virtual Screening

Pavel Koptev, Nikita Krainov, Konstantin Malkov, Alexander Tropsha

Comments 8 Pages