arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

2602.22400 2026-03-03 cs.LG

Predicting Multi-Drug Resistance in Bacterial Isolates Through Performance Comparison and LIME-based Interpretation of Classification Models

Santanam Wishal, Riad Sahara

Comments 6 pages, 7 figures

详情

英文摘要

The rise of Antimicrobial Resistance, particularly Multi-Drug Resistance (MDR), presents a critical challenge for clinical decision-making due to limited treatment options and delays in conventional susceptibility testing. This study proposes an interpretable machine learning framework to predict MDR in bacterial isolates using clinical features and antibiotic susceptibility patterns. Five classification models were evaluated, including Logistic Regression, Random Forest, AdaBoost, XGBoost, and LightGBM. The models were trained on a curated dataset of 9,714 isolates, with resistance encoded at the antibiotic family level to capture cross-class resistance patterns consistent with MDR definitions. Performance assessment included accuracy, F1-score, AUC-ROC, and Matthews Correlation Coefficient. Ensemble models, particularly XGBoost and LightGBM, demonstrated superior predictive capability across all metrics. To address the clinical transparency gap, Local Interpretable Model-agnostic Explanations (LIME) was applied to generate instance-level explanations. LIME identified resistance to quinolones, Co-trimoxazole, Colistin, aminoglycosides, and Furanes as the strongest contributors to MDR predictions, aligning with known biological mechanisms. The results show that combining high-performing models with local interpretability provides both accuracy and actionable insights for antimicrobial stewardship. This framework supports earlier MDR identification and enhances trust in machine learning-assisted clinical decision support.

URL PDF HTML ☆

赞 0 踩 0

2602.22286 2026-03-03 cs.LG cs.IT math.IT

OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data

Yan Zhao, Zhengxue Cheng, Junxuan Zhang, Dajiang Zhou, Qunshan Gu, Qi Wang, Li Song

Comments 8 figures, 10 tables

2602.22284 2026-03-03 cs.LG

BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning

Mingi Kim, Yongjun Kim, Jungwoo Kang, Hyungki Kim

2602.21820 2026-03-03 cs.CV

Joint Shadow Generation and Relighting via Light-Geometry Interaction Maps

Shan Wang, Peixia Li, Chenchen Xu, Ziang Cheng, Jiayu Yang, Hongdong Li, Pulak Purkait

Comments ICLR 2026

2602.21647 2026-03-03 cs.CL cs.AI cs.LG

Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration

Tangsang Chongbang, Pranesh Pyara Shrestha, Amrit Sarki, Anku Jaiswal

Comments 16 pages, 4 figures, 12 tables, Transactions on Asian and Low-Resource Language Information Processing (Under Review)

2602.20999 2026-03-03 cs.CV

VII: Visual Instruction Injection for Jailbreaking Image-to-Video Generation Models

Bowen Zheng, Yongli Xiang, Ziming Hong, Zerong Lin, Chaojian Yu, Tongliang Liu, Xinge You

Comments Project page: https://Zbwwwwwwww.github.io/VII

2602.20677 2026-03-03 cs.LG cs.AI

UrbanFM: Scaling Urban Spatio-Temporal Foundation Models

Wei Chen, Yuqian Wu, Junle Chen, Xiaofang Zhou, Yuxuan Liang

详情

英文摘要

Urban systems, as dynamic complex systems, continuously generate spatio-temporal data streams that encode the fundamental laws of human mobility and city evolution. While AI for Science has witnessed the transformative power of foundation models in disciplines like genomics and meteorology, urban computing remains fragmented due to "scenario-specific" models, which are overfitted to specific regions or tasks, hindering their generalizability. To bridge this gap and advance spatio-temporal foundation models for urban systems, we adopt scaling as the central perspective and systematically investigate two key questions: what to scale and how to scale. Grounded in first-principles analysis, we identify three critical dimensions: heterogeneity, correlation, and dynamics, aligning these principles with the fundamental scientific properties of urban spatio-temporal data. Specifically, to address heterogeneity through data scaling, we construct WorldST. This billion-scale corpus standardizes diverse physical signals, such as traffic flow and speed, from over 100 global cities into a unified data format. To enable computation scaling for modeling correlations, we introduce the MiniST unit, a novel split mechanism that discretizes continuous spatio-temporal fields into learnable computational units to unify representations of grid-based and sensor-based observations. Finally, addressing dynamics via architecture scaling, we propose UrbanFM, a minimalist self-attention architecture designed with limited inductive biases to autonomously learn dynamic spatio-temporal dependencies from massive data. Furthermore, we establish EvalST, the largest-scale urban spatio-temporal benchmark to date. Extensive experiments demonstrate that UrbanFM achieves remarkable zero-shot generalization across unseen cities and tasks, marking a pivotal first step toward large-scale urban spatio-temporal foundation models.

URL PDF HTML ☆

赞 0 踩 0

2602.20650 2026-03-03 cs.CV cs.AI

Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression

Chenyue Yu, Lingao Xiao, Jinhong Deng, Ivor W. Tsang, Yang He

Comments Accepted by ICLR 2026

2602.20629 2026-03-03 cs.LG

QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs

Santiago Gonzalez, Alireza Amiri Bavandpour, Peter Ye, Edward Zhang, Ruslans Aleksejevs, Todor Antić, Polina Baron, Sujeet Bhalerao, Shubhrajit Bhattacharya, Zachary Burton, John Byrne, Hyungjun Choi, Nujhat Ahmed Disha, Koppany István Encz, Yuchen Fang, Robert Joseph George, Ebrahim Ghorbani, Alan Goldfarb, Jing Guo, Meghal Gupta, Stefano Huber, Annika Kanckos, Minjung Kang, Hyun Jong Kim, Dino Lorenzini, Levi Lorenzo, Tianyi Mao, Giovanni Marzenta, Ariane M. Masuda, Lukas Mauth, Ana Mickovic, Andres Miniguano-Trujillo, Antoine Moulin, Wenqi Ni, Tomos Parry, Kevin Ren, Hossein Roodbarani, Mathieu Rundström, Manjil Saikia, Detchat Samart, Rebecca Steiner, Connor Stewart, Dhara Thakkar, Jeffrey Tse, Vasiliki Velona, Yunhai Xiang, Sibel Yalçın, Jun Yan, Ji Zeng, Arman Cohan, Quanquan C. Liu

2602.20089 2026-03-03 cs.CV cs.AI

StructXLIP: Enhancing Vision-language Models with Multimodal Structural Cues

Zanxi Ruan, Songqun Gao, Qiuyu Kong, Yiming Wang, Marco Cristani

Comments Accepted by CVPR 2026

2602.19816 2026-03-03 cs.SD cs.AI cs.LG

Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling

Yungang Yi, Weihua Li, Matthew Kuo, Catherine Shi, Quan Bai

2602.19661 2026-03-03 cs.LG

PaReGTA: An LLM-based EHR Data Encoding Approach to Capture Temporal Information

Kihyuk Yoon, Lingchao Mao, Catherine Chong, Todd J. Schwedt, Chia-Chun Chiang, Jing Li

Comments 26 pages, 5 figures, 7 tables

2602.19424 2026-03-03 cs.CV

Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images

Yuxuan Yang, Zhonghao Yan, Yi Zhang, Bo Yun, Muxi Diao, Guowei Zhao, Kongming Liang, Wenbin Li, Zhanyu Ma

Comments 10 pages, 3 figures

2602.19113 2026-03-03 cs.LG cs.AI stat.ML

Learning from Complexity: Exploring Dynamic Sample Pruning of Spatio-Temporal Training

Wei Chen, Junle Chen, Yuqian Wu, Yuxuan Liang, Xiaofang Zhou

2602.18182 2026-03-03 cs.LG cs.AI

Capabilities Ain't All You Need: Measuring Propensities in AI

Daniel Romero-Alvarado, Fernando Martínez-Plumed, Lorenzo Pacchiardi, Hugo Save, Siddhesh Milind Pawar, Behzad Mehrbakhsh, Pablo Antonio Moreno Casares, Ben Slater, Paolo Bova, Peter Romero, Zachary R. Tidler, Jonathan Prunty, Luning Sun, Jose Hernandez-Orallo

2602.16241 2026-03-03 cs.CL cs.AI

Are LLMs Ready to Replace Bangla Annotators?

Md. Najib Hasan, Touseef Hasan, Souvika Sarkar

Comments We have identified significant methodological discrepancies in the current version of the manuscript that affect the validity and reproducibility of the reported results. In order to prevent potential misunderstanding or misinterpretation of our findings, we request the complete withdrawal of this submission while we conduct a thorough revision

2602.16179 2026-03-03 cs.AI cs.LG

EnterpriseBench Corecraft: Training Generalizable Agents on High-Fidelity RL Environments

Sushant Mehta, Logan Ritchie, Suhaas Garre, Ian Niebres, Nick Heiner, Edwin Chen

2602.16110 2026-03-03 cs.CV cs.AI

OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis

Tianwei Lin, Zhongwei Qiu, Wenqiao Zhang, Jiang Liu, Yihan Xie, Mingjian Gao, Zhenxuan Fan, Zhaocheng Li, Sijing Li, Zhongle Xie, Peng LU, Yueting Zhuang, Ling Zhang, Beng Chin Ooi, Yingda Xia

详情

英文摘要

Computed Tomography (CT) is one of the most widely used and diagnostically information-dense imaging modalities, covering critical organs such as the heart, lungs, liver, and colon. Clinical interpretation relies on both slice-driven local features (e.g., sub-centimeter nodules, lesion boundaries) and volume-driven spatial representations (e.g., tumor infiltration, inter-organ anatomical relations). However, existing Large Vision-Language Models (LVLMs) remain fragmented in CT slice versus volumetric understanding: slice-driven LVLMs show strong generalization but lack cross-slice spatial consistency, while volume-driven LVLMs explicitly capture volumetric semantics but suffer from coarse granularity and poor compatibility with slice inputs. The absence of a unified modeling paradigm constitutes a major bottleneck for the clinical translation of medical LVLMs. We present OmniCT, a powerful unified slice-volume LVLM for CT scenarios, which makes three contributions: (i) Spatial Consistency Enhancement (SCE): volumetric slice composition combined with tri-axial positional embedding that introduces volumetric consistency, and an MoE hybrid projection enables efficient slice-volume adaptation; (ii) Organ-level Semantic Enhancement (OSE): segmentation and ROI localization explicitly align anatomical regions, emphasizing lesion- and organ-level semantics; (iii) MedEval-CT: the largest slice-volume CT dataset and hybrid benchmark integrates comprehensive metrics for unified evaluation. OmniCT consistently outperforms existing methods with a substantial margin across diverse clinical tasks and satisfies both micro-level detail sensitivity and macro-level spatial reasoning. More importantly, it establishes a new paradigm for cross-modal medical imaging understanding. Our project is available at https://github.com/ZJU4HealthCare/OmniCT.

URL PDF HTML ☆

赞 0 踩 0

2602.15959 2026-03-03 cs.CV cs.AI

Deformation-Free Cross-Domain Image Registration via Position-Encoded Temporal Attention

Yiwen Wang, Jiahao Qin

Comments 11 pages, 3 figures

2602.14760 2026-03-03 cs.CL cs.AI

Residual Connections and the Causal Shift: Uncovering a Structural Misalignment in Transformers

Jonathan Lys, Vincent Gripon, Bastien Pasdeloup, Axel Marmoret, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

2602.14759 2026-03-03 cs.LG cs.AI

Inner Loop Inference for Pretrained Transformers: Unlocking Latent Capabilities Without Training

Jonathan Lys, Vincent Gripon, Bastien Pasdeloup, Axel Marmoret, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

2602.14643 2026-03-03 cs.AI

Arbor: A Framework for Reliable Navigation of Critical Conversation Flows

Luís Silva, Diogo Gonçalves, Catarina Farinha, Clara Matos, Luís Ungaro

2602.13575 2026-03-03 cs.CL cs.AI

Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment

Jing Zhao, Ting Zhen, Junwei Bao, Hongfei Jiang, Yang Song

2602.12660 2026-03-03 cs.CL

Learning Ordinal Probabilistic Reward from Preferences

Longze Chen, Lu Wang, Renke Shan, Ze Gong, Run Luo, Jiaming Li, Jing Luo, Qiyao Wang, Min Yang

Comments 28 pages, 5 figures, ICLR 2026

2602.12267 2026-03-03 cs.LG

Self-Supervised Learning via Flow-Guided Neural Operator on Time-Series Data

Duy Nguyen, Jiachen Yao, Jiayun Wang, Julius Berner, Animashree Anandkumar

Comments Accepted to NeurIPS 2025 Workshop

2602.11661 2026-03-03 cs.AI

Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm

Tianxiang Xu, Jiayi Liu, Yixuan Tong, Jialu Xu, Yunqing Wei, Kaiwen Feng, PanPan Hou, Kangping Yin, Jiyuan Hu, Hao Zhou, Zhenxin Ma, Jian Xu, Guanjun Jiang

详情

英文摘要

While reinforcement learning for large language model alignment has progressed rapidly in recent years, transferring these paradigms to high-stakes medical question answering reveals a fundamental paradigm mismatch. Reinforcement Learning from Human Feedback relies on preference annotations that are prohibitively expensive and often fail to reflect the absolute correctness of medical facts. Reinforcement Learning from Verifiable Rewards lacks effective automatic verifiers and struggles to handle complex clinical contexts. Meanwhile, medical alignment requires the simultaneous optimization of correctness, safety, and compliance, yet multi-objective heterogeneous reward signals are prone to scale mismatch and optimization conflicts. To address these challenges, we propose a robust medical alignment paradigm. We first construct a holistic multi-dimensional medical alignment matrix that decomposes alignment objectives into four categories: fundamental capabilities, expert knowledge, online feedback, and format specifications. Within each category, we establish a closed loop of where observable metrics inform attributable diagnosis, which in turn drives optimizable rewards, thereby providing fine-grained, high-resolution supervision signals for subsequent iterative optimization. To resolve gradient domination and optimization instability problem caused by heterogeneous signals, we further propose a unified optimization mechanism. This mechanism employs Reference-Frozen Normalization to align reward scales and implements a Tri-Factor Adaptive Dynamic Weighting strategy to achieve collaborative optimization that is weakness-oriented, risk-prioritized, and redundancy-reducing. Experimental results demonstrate the effectiveness of our proposed paradigm in real-world medical scenario evaluations, establishing a new paradigm for complex alignment in vertical domains.

URL PDF HTML ☆

赞 0 踩 0

2602.10609 2026-03-03 cs.CL cs.AI

Online Causal Kalman Filtering for Stable and Effective Policy Optimization

Shuo He, Lang Feng, Xin Cheng, Lei Feng, Bo An

Comments Preprint

2602.09463 2026-03-03 cs.AI

SpotAgent: Grounding Visual Geo-localization in Large Vision-Language Models through Agentic Reasoning

Furong Jia, Ling Dai, Wenjin Deng, Fan Zhang, Chen Hu, Daxin Jiang, Yu Liu

2602.08237 2026-03-03 cs.CL

Document Reconstruction Unlocks Scalable Long-Context RLVR

Yao Xiao, Lei Wang, Yue Deng, Guanzheng Chen, Ziqi Jin, Jung-jae Kim, Xiaoli Li, Roy Ka-wei Lee, Lidong Bing

2602.07543 2026-03-03 cs.AI cond-mat.mtrl-sci

MSP-LLM: A Unified Large Language Model Framework for Complete Material Synthesis Planning

Heewoong Noh, Gyoung S. Na, Namkyeong Lee, Chanyoung Park