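arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.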

2604.24749 2026-04-28 cs.LG stat.ML

The Optimal Sample Complexity of Multiclass and List Learning

Chirag Pabbaraju

Abstract

While the optimal sample complexity of binary classification in terms of the VC dimension is well-established, determining the optimal sample complexity of multiclass classification has remained open. The appropriate complexity parameter for multiclass classification is the DS dimension, and despite significant efforts, a gap of $\sqrt{\text{DS}}$ has persisted between the upper and lower bounds on sample complexity. Recent work by Hanneke et al. (2026) shows a novel algebraic characterization of multiclass hypothesis classes in terms of their DS dimension. Building on this, we show that the maximum hypergraph density of any multiclass hypothesis class is upper-bounded by its DS dimension. This proves a longstanding conjecture of Daniely and Shalev-Shwartz (2014). As a consequence, we determine the optimal dependence of the sample complexity on the DS dimension for multiclass as well as list learning.

2604.24745 2026-04-28 cs.LG

Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes

Zhangyong Liang

2604.24737 2026-04-28 cs.LG cs.AI cs.CC stat.ML

Learning to Think from Multiple Thinkers

Nirmit Joshi, Roey Magen, Nathan Srebro, Nikolaos Tsilivis, Gal Vardi

Comments Comments are welcome. There are 78 pages and 5 Figures

2604.24729 2026-04-28 cs.LG

SpecRLBench: A Benchmark for Generalization in Specification-Guided Reinforcement Learning

Zijian Guo, İlker Işık, H. M. Sabbir Ahmad, Wenchao Li

2604.24720 2026-04-28 cs.CL

Sentiment and Emotion Classification of Indonesian E-Commerce Reviews via Multi-Task BiLSTM and AutoML Benchmarking

Hermawan Manurung, Ibrahim Al-Kahfi, Ahmad Rizqi, Martin Clinton Tosima Manullang

Comments 8 pages, 5 figures, 4 tables. Final project for Natural Language Processing course (PBA 2026) at Institut Teknologi Sumatera

2604.24719 2026-04-28 cs.CV

DiffuSAM: Diffusion-Based Prompt-Free SAM2 for Few-Shot and Source-Free Medical Image Segmentation

Tal Grossman, Noa Cahan, Lev Ayzenberg, Hayit Greenspan

2604.24718 2026-04-28 cs.CV

WildLIFT: Lifting monocular drone video to 3D for species-agnostic wildlife monitoring

Vandita Shukla, Fabio Remondino, Blair Costelloe, Benjamin Risse

2604.24717 2026-04-28 cs.AI

Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling

Hailing Cheng, Daqi Sun, Xinyu Lu

Comments 8 pages, 3 figures

Abstract

Every Transformer architecture dedicates enormous capacity to learning rich representations in semantic embedding space -- yet the rotation manifold acted upon by Rotary Positional Embeddings (RoPE) has been treated as a fixed, hand-crafted structure, populated only by discrete ordinal indices. We argue that this rotation space is a largely overlooked second dimension of expressivity in the attention mechanism, one whose systematic exploration may open a new door for attention-based architectures. The analogy to complex numbers is instructive: just as introducing the imaginary axis -- orthogonal to and independent of the real line -- unlocked new algebraic structure once believed impossible, treating the rotation manifold as a learnable, signal-conditioned space opens an orthogonal degree of freedom in attention. In this framing, the token embedding encodes the semantic (real) component of a representation -- what a token means -- while the rotation encodes its dynamic (imaginary) component -- how it relates to every other token across time, position, and context. We introduce SIREN-RoPE, a concrete instantiation of this idea, which populates the rotation dimension with heterogeneous signals -- continuous timestamps, cyclical temporal patterns, and categorical metadata -- via a dual-branch Sinusoidal Representation Network (SIREN). As a proof of concept, we evaluate on a production-scale news feed dataset from a major social network using a generative recommender as the ranking model, demonstrating that activating this hidden dimension yields consistent improvements across calibration and ranking objectives with negligible computational overhead. We invite the community to view the rotation space not as a solved positional-encoding detail, but as an untapped axis whose rich structure may prove as consequential for attention as the imaginary unit proved for algebra.

2604.24715 2026-04-28 cs.CL cs.LG

Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling

Parsa Ashrafi Fashi, Utkarsh Saxena, Mehdi Rezagholizadeh, Aref Jafari, Akash Haridas, Mingyu Yang, Vansh Bhatia, Guihong Li, Vikram Appia, Emad Barsoum

2604.24710 2026-04-28 cs.AI cs.CL

Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters

Aaryan Shah, Andrew Hines, Alexia Downs, Denis Bajet, Paulius Mui, Fabiano Araujo, Laura Offutt, Aida Rutledge, Elizabeth Jimenez

Comments 14 pages, 2 figures, 3 tables, submitted to JAMIA

Abstract

Objective. Clinical AI documentation systems require evaluation methodologies that are clinically valid, economically viable, and sensitive to iterative changes. Methods requiring expert review per scoring instance are too slow and expensive for safe, iterative deployment. We present a case-specific, clinician-authored rubric methodology for clinical AI evaluation and examine whether LLM-generated rubrics can approximate clinician agreement. Materials and Methods. Twenty clinicians authored 1,646 rubrics for 823 clinical cases (736 real-world, 87 synthetic) across primary care, psychiatry, oncology, and behavioral health. Each rubric was validated by confirming that an LLM-based scoring agent consistently scored clinician-preferred outputs higher than rejected ones. Seven versions of an EHR-embedded AI agent for clinicians were evaluated across all cases. Results. Clinician-authored rubrics discriminated effectively between high- and low-quality outputs (median score gap: 82.9%) with high scoring stability (median range: 0.00%). Median scores improved from 84% to 95%. In later experiments, clinician-LLM ranking agreement (tau: 0.42-0.46) matched or exceeded clinician-clinician agreement (tau: 0.38-0.43), attributable to both ceiling compression and LLM rubric improvement. Discussion. This convergence supports incorporating LLM rubrics alongside clinician-authored ones. At roughly 1,000 times lower cost, LLM rubrics enable substantially greater evaluation coverage, while continued clinical authorship grounds evaluation in expert judgment. Ceiling compression poses a methodological challenge for future inter-rater agreement studies. Conclusion. Case-specific rubrics offer a path for clinical AI evaluation that preserves expert judgment while enabling automation at three orders of magnitude lower cost. Clinician-authored rubrics establish the baseline against which LLM rubrics are validated.

2604.24708 2026-04-28 cs.LG cs.AI

Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models

Hailing Cheng, Tao Huang, Chen Zhu, Antonio Alonso

Comments 8 pages, 2 figures

2604.24707 2026-04-28 cs.RO

Passage-Aware Structural Mapping for RGB-D Visual SLAM

Ali Tourani, Miguel Fernandez-Cortizas, Saad Ejaz, David Pérez Saura, Asier Bikandi-Noya, Jose Luis Sanchez-Lopez, Holger Voos

Comments 5 pages, 5 figures

2604.24700 2026-04-28 cs.CL cs.AI

Green Shielding: A User-Centric Approach Towards Trustworthy AI

Aaron J. Li, Nicolas Sanchez, Hao Huang, Ruijiang Dong, Jaskaran Bains, Katrin Jaradeh, Zhen Xiang, Bo Li, Feng Liu, Aaron Kornblith, Bin Yu

2604.24698 2026-04-28 cs.CL

The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models

Yunze Xiao, Vivienne J. Zhang, Chenghao Yang, Ningshan Ma, Weihao Xuan, Jen-tse Huang

2604.24693 2026-04-28 cs.CL

Contextual Linear Activation Steering of Language Models

Brandon Hsu, Daniel Beaglehole, Adityanarayanan Radhakrishnan, Mikhail Belkin

2604.24692 2026-04-28 cs.LG

Diffusion-Guided Feature Selection via Nishimori Temperature: Noise-Based Spectral Embedding

Vasiliy S. Usatyuk, Denis A. Sapozhnikov, Sergey I. Egorov

Comments 8 pages, 3 figures, extended version (with noise shift proof) of DSPA2026 article

2604.24690 2026-04-28 cs.CL

Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination

Lirong Gao, Zeqing Wang, Yuyan Cai, Jiayi Deng, Yanmei Gu, Yiming Zhang, Jia Zhou, Yanfei Zhang, Junbo Zhao

Comments Accepted at ACL 2026

2604.24686 2026-04-28 cs.AI

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

German Marin, Jatin Chaudhary

2604.24685 2026-04-28 cs.CV

Aycromo: An Open-Source Platform for Automatic Chromosome Detection in Metaphase Images Based on Deep Learning

Jorge L. A. Lima, Filipe R. Cordeiro

Comments Accepted at SBCAS'26

2604.24679 2026-04-28 cs.CV cs.LG

Benchmarking Pathology Foundation Models for Breast Cancer Survival Prediction

Fredrik K. Gustafsson, Constance Boissin, Johan Vallon-Christersson, David A. Clifton, Mattias Rantalainen

2604.24674 2026-04-28 cs.RO

Pushing Radar Odometry Beyond the Pavement: Current Capabilities and Challenges

Shaunak Kolhe, Peng Jiang, Maggie Wigness, Philip Osteen, Timothy Overbye, Christian Ellis, Srikanth Saripalli

2604.24672 2026-04-28 cs.LG math.AT

A Functorial Formulation of Neighborhood Aggregating Deep Learning

Sun Woo Park, Yun Young Choi, U Jin Choi, Youngho Woo

Comments 32 pages, 11 figures. Comments welcome

2604.24665 2026-04-28 cs.CL cs.AI

Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation

Sercan Karakaş, Yusuf Şimşek

Comments Accepted to The 15th edition of the Workshop on Cognitive Modeling and Computational Linguistics, co-located with the Language Resources and Evaluation Conference

2604.24648 2026-04-28 cs.RO

Computational Design and Co-Robotic Fabrication for Material Reuse in Architecture

Arash Adel, Daniel Ruan, Ruxin Xie

Comments Accepted for publication in Proceedings of the 45th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA 2025)

2604.24647 2026-04-28 cs.CL cs.AI

DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference

Zahra Dehghanighobadi, Asja Fischer

2604.24645 2026-04-28 cs.CL cs.AI

K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology

Soyeon Kim, Cheongwoong Kang, Myeongjin Lee, Eun-Chul Chang, Jaedeok Lee, Jaesik Choi

Comments 39 pages, 32 figures, 14 tables, including appendices. Accepted to Findings of the Association for Computational Linguistics (ACL 2026)

2604.24642 2026-04-28 cs.CV

Probing CLIP's Comprehension of 360-Degree Textual and Visual Semantics

Hai Wang, Xiaochen Yang, Mingzhi Dong, Jing-Hao Xue

Comments Project Page: https://littlewhitesea.github.io/360Semantics.github.io/

2604.24628 2026-04-28 cs.RO

Real-time windrow detection from onboard tractor sensors for automated following

Lorenz Gunreben, Nico Heider, Sebastian Zürner, Martin Schieck, Bogdan Franczyk

Comments Published in the proceedings of the 46th GIL Annual Conference (GIL-Jahrestagung 2026)

2604.24625 2026-04-28 cs.CV cs.AI cs.LG cs.MM

Meta-CoT: Enhancing Granularity and Generalization in Image Editing

Shiyi Zhang, Yiji Cheng, Tiankai Hang, Zijin Yin, Runze He, Yu Xu, Wenxun Dai, Yunlong Lin, Chunyu Wang, Qinglin Lu, Yansong Tang

Comments Accepted by CVPR2026, Project Page: https://shiyi-zh0408.github.io/projectpages/Meta-CoT/

2604.24623 2026-04-28 cs.AI cs.IR cs.LG

XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation

Zhuoling Li, Ha Linh Hong Tran Nguyen, Valeria Bladinieres, Maxim Romanovsky