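arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.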

2603.06369 2026-03-09 cs.LG cs.NA math.NA math.OC

Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization

Ganzhao Yuan

English abstract

We propose ALFCG (Adaptive Lipschitz-Free Conditional Gradient), the first adaptive projection-free framework for stochastic composite nonconvex minimization that requires neither global smoothness constants nor line search. Unlike prior conditional gradient methods that use open-loop diminishing stepsizes, conservative Lipschitz constants, or costly backtracking, ALFCG maintains a self-normalized accumulator of historical iterate differences to estimate local smoothness and minimizes a quadratic surrogate model at each step. This retains the simplicity of Frank-Wolfe while adapting to unknown geometry. We study three variants. ALFCG-FS addresses finite-sum problems with a SPIDER estimator. ALFCG-MVR1 and ALFCG-MVR2 handle stochastic expectation problems by using momentum-based variance reduction with single-batch and two-batch updates, and operate under average and individual smoothness, respectively. To reach an $ε$-stationary point, ALFCG-FS attains $\mathcal{O}(N+\sqrt{N}ε^{-2})$ iteration complexity, while ALFCG-MVR1 and ALFCG-MVR2 achieve $\tilde{\mathcal{O}}(σ^2ε^{-4}+ε^{-2})$ and $\tilde{\mathcal{O}}(σε^{-3}+ε^{-2})$, where $N$ is the number of components and $σ$ is the noise level. In contrast to typical $\mathcal{O}(ε^{-4})$ or $\mathcal{O}(ε^{-3})$ rates, our bounds reduce, up to logarithmic factors, to the optimal rate $\tilde{\mathcal{O}}(ε^{-2})$ as the noise level $σ\to 0$. Extensive experiments on multiclass classification over nuclear norm balls and $\ell_p$ balls show that ALFCG generally outperforms state-of-the-art conditional gradient baselines.
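For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of the classic Frank-Wolfe (conditional gradient) iteration with an open-loop stepsize of 2/(t+2) over an $\ell_1$ ball. It is not ALFCG: the adaptive accumulator, the quadratic surrogate, and the variance-reduced estimators are not shown, and the toy objective is convex. The function names and the toy problem are assumptions chosen for illustration only.

```python
def lmo_l1_ball(grad, radius=1.0):
    """Linear minimisation oracle: argmin over the l1 ball of <grad, s>.

    The minimiser is a signed vertex of the ball at the coordinate
    where the gradient has the largest magnitude.
    """
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        s[i] = -radius if grad[i] > 0 else radius
    return s


def frank_wolfe(grad_f, x0, steps=200, radius=1.0):
    """Classic projection-free Frank-Wolfe with open-loop stepsizes."""
    x = list(x0)
    for t in range(steps):
        g = grad_f(x)
        s = lmo_l1_ball(g, radius)
        gamma = 2.0 / (t + 2.0)  # open-loop diminishing stepsize
        # Convex combination keeps the iterate inside the ball.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Toy problem (hypothetical): minimise 0.5 * ||x - b||^2 over the unit l1 ball.
b = [2.0, 0.0, 0.0]
x_star = frank_wolfe(lambda x: [xi - bi for xi, bi in zip(x, b)], [0.0, 0.0, 0.0])
```

Each iterate is a convex combination of feasible points, so no projection is needed; the stepsize schedule here ignores the local geometry, which is the limitation the abstract says ALFCG targets.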


2603.06366 2026-03-09 cs.CV

OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis

Yuxuan Fan, Jing Hao, Hong Chen, Jiahao Bao, Yihua Shao, Yuci Liang, Kuo Feng Hung, Hao Tang

Comments 34 pages, 24 figures, conference

2603.06362 2026-03-09 cs.CV

Computer vision-based estimation of invertebrate biomass

Mikko Impiö, Philipp M. Rehsen, Jarrett Blair, Cecilie Mielec, Arne J. Beermann, Florian Leese, Toke T. Høye, Jenni Raitoharju

2603.06361 2026-03-09 cs.LG cs.AI cs.SY eess.SY

CLAIRE: Compressed Latent Autoencoder for Industrial Representation and Evaluation -- A Deep Learning Framework for Smart Manufacturing

Mohammadhossein Ghahramani, Mengchu Zhou

Comments 13 pages. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2026

2603.06359 2026-03-09 cs.LG cs.CR

Tiny, Hardware-Independent, Compression-based Classification

Charles Meyers, Aaron MacSween, Erik Elmroth, Tommy Löfstedt

English abstract

Recent developments in machine learning have highlighted a conflict between online platforms and their users in terms of privacy. The importance of user privacy and the struggle for power over user data have intensified as regulators and operators attempt to police online platforms. As users have become increasingly aware of privacy issues, client-side data storage, management, and analysis have become a favoured alternative to large-scale centralised machine learning. However, state-of-the-art machine learning methods require vast amounts of labelled user data, making them unsuitable for models that reside client-side and only have access to a single user's data. State-of-the-art methods are also computationally expensive, which degrades the user experience on compute-limited hardware and reduces battery life. A recent alternative approach has proven remarkably successful in classification tasks across a wide variety of data: using a compression-based distance measure (called normalised compression distance) to measure the distance between generic objects in classical distance-based machine learning methods. In this work, we demonstrate that the normalised compression distance is actually not a metric; develop it for the wider context of kernel methods to allow modelling of complex data; and present techniques to improve the training time of models that use this distance measure. We demonstrate that the normalised compression distance works as well as, and sometimes better than, other metrics and kernels, while requiring only marginally higher computational cost and despite lacking formal metric properties. The end result is a simple model with remarkable accuracy even when trained on a very small number of samples, allowing for models that are small and effective enough to run entirely on a client device using only user-supplied data.
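The normalised compression distance mentioned above is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. A minimal sketch follows, assuming zlib as the compressor and a 1-nearest-neighbour classifier; the choice of compressor, the helper names, and the sample data are illustrative assumptions, not the authors' implementation.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input, standing in for C(x)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nearest_label(query: bytes, labelled: list) -> str:
    """1-nearest-neighbour classification under NCD.

    `labelled` is a list of (sample_bytes, label) pairs.
    """
    return min(labelled, key=lambda pair: ncd(query, pair[0]))[1]


# Hypothetical toy data: repetitive English text versus a byte ramp.
text = b"the quick brown fox jumps over the lazy dog " * 20
ramp = bytes(range(256)) * 4
label = nearest_label(text[:400], [(text, "text"), (ramp, "binary")])
```

With a real compressor, NCD(x, x) is generally small but not guaranteed to be exactly zero, which is one informal way to see why the measure can fail the formal metric axioms.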


2603.06357 2026-03-09 cs.CV

LATO: 3D Mesh Flow Matching with Structured TOpology Preserving LAtents

Tianhao Zhao, Youjia Zhang, Hang Long, Jinshen Zhang, Wenbing Li, Yang Yang, Gongbo Zhang, Jozef Hladký, Matthias Nießner, Wei Yang

2603.06356 2026-03-09 cs.RO

Safe Consensus of Cooperative Manipulation with Hierarchical Event-Triggered Control Barrier Functions

Simiao Zhuang, Bingkun Huang, Zewen Yang

Comments 8 pages

2603.06348 2026-03-09 cs.CL

Transparent AI for Mathematics: Transformer-Based Large Language Models for Mathematical Entity Relationship Extraction with XAI

Tanjim Taharat Aurpa

2603.06343 2026-03-09 cs.RO cs.NI

Open-Source Based and ETSI Compliant Cooperative, Connected, and Automated Mini-Cars

Lorenzo Farina, Federico Gavioli, Salvatore Iandolo, Francesco Moretti, Giuseppe Perrone, Matteo Piccoli, Francesco Raviglione, Marco Rapelli, Antonio Solida, Paolo Burgio, Carlo Augusto Grazia, Alessandro Bazzi

Comments 5 pages, 6 figures

2603.06340 2026-03-09 cs.CV cs.AI

K-MaT: Knowledge-Anchored Manifold Transport for Cross-Modal Prompt Learning in Medical Imaging

Jiajun Zeng, Shadi Albarqouni

2603.06333 2026-03-09 cs.AI cs.CL cs.LG

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement

Subramanyam Sahoo, Aman Chadha, Vinija Jain, Divya Chaudhary

Comments Published at ICLR 2026 Workshop on AI with Recursive Self-Improvement. 20 pages, 5 figures

2603.06324 2026-03-09 cs.CL cs.CV

The Art That Poses Back: Assessing AI Pastiches after Contemporary Artworks

Anca Dinu, Andreiana Mihail, Andra-Maria Florescu, Claudiu Creanga

2603.06321 2026-03-09 cs.CV

P-SLCR: Unsupervised Point Cloud Semantic Segmentation via Prototypes Structure Learning and Consistent Reasoning

Lixin Zhan, Jie Jiang, Tianjian Zhou, Yukun Du, Yan Zheng, Xuehu Duan

2603.06317 2026-03-09 cs.LG cs.AI

From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty

Azza Jenane, Nassim Walha, Lukas Kuhn, Florian Buettner

Comments 4 pages, submitted to AISTATS Workshop

2603.06311 2026-03-09 cs.CV

Latent Transfer Attack: Adversarial Examples via Generative Latent Spaces

Eitan Shaar, Ariel Shaulov, Yalcin Tur, Gal Chechik, Ravid Shwartz-Ziv

2603.06303 2026-03-09 cs.LG

Polarized Direct Cross-Attention Message Passing in GNNs for Machinery Fault Diagnosis

Zongyu Shi, Laibin Zhang, Maoyin Chen

2603.06302 2026-03-09 cs.CV cs.AI

DEX-AR: A Dynamic Explainability Method for Autoregressive Vision-Language Models

Walid Bousselham, Angie Boggust, Hendrik Strobelt, Hilde Kuehne

Comments Project page: https://walidbousselham.com/DEX-AR

2603.06300 2026-03-09 cs.CV cs.LG

3D CBCT Artefact Removal Using Perpendicular Score-Based Diffusion Models

Susanne Schaub, Florentin Bieder, Matheus L. Oliveira, Yulan Wang, Dorothea Dagassan-Berndt, Michael M. Bornstein, Philippe C. Cattin

Comments Accepted at DGM4MICCAI 2025

2603.06290 2026-03-09 cs.AI cs.CL

The EpisTwin: A Knowledge Graph-Grounded Neuro-Symbolic Architecture for Personal AI

Giovanni Servedio, Potito Aghilar, Alessio Mattiace, Gianni Carmosino, Francesco Musicco, Gabriele Conte, Vito Walter Anelli, Tommaso Di Noia, Francesco Maria Donini

2603.06280 2026-03-09 cs.RO

SuperSuit: An Isomorphic Bimodal Interface for Scalable Mobile Manipulation

Tongqing Chen, Hang Wu, Jiasen Wang, Xiaotao Li, Zhu Jin, Lu Fang

2603.06279 2026-03-09 cs.CV cs.RO eess.IV

Can we Trust Unreliable Voxels? Exploring 3D Semantic Occupancy Prediction under Label Noise

Wenxin Li, Kunyu Peng, Di Wen, Junwei Zheng, Jiale Wei, Mengfei Duan, Yuheng Zhang, Rui Fan, Kailun Yang

Comments The benchmark and source code will be made publicly available at https://github.com/mylwx/OccNL

2603.06278 2026-03-09 cs.AI

Artificial Intelligence for Climate Adaptation: Reinforcement Learning for Climate Change-Resilient Transport

Miguel Costa, Arthur Vandervoort, Carolin Schmidt, João Miranda, Morten W. Petersen, Martin Drews, Karyn Morrisey, Francisco C. Pereira

2603.06275 2026-03-09 cs.CV

Spectral and Trajectory Regularization for Diffusion Transformer Super-Resolution

Jingkai Wang, Yixin Tang, Jue Gong, Jiatong Li, Shu Li, Libo Liu, Jianliang Lan, Yutong Liu, Yulun Zhang

Comments 14 pages

2603.06274 2026-03-09 cs.LG cs.AI

Stem: Rethinking Causal Information Flow in Sparse Attention

Lin Niu, Xin Luo, Linchuan Xie, Yifu Sun, Guanghua Yu, Jianchen Zhu, S Kevin Zhou

Comments 12 pages, preprint

2603.06271 2026-03-09 cs.LG cs.AI

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

Mina Farajiamiri, Jeta Sopa, Saba Afza, Lisa Adams, Felix Barajas Ordonez, Tri-Thien Nguyen, Mahshad Lotfinia, Sebastian Wind, Keno Bressem, Sven Nebelung, Daniel Truhn, Soroosh Tayebi Arasteh

English abstract

Agentic retrieval-augmented reasoning pipelines are increasingly used to structure how large language models (LLMs) incorporate external evidence in clinical decision support. These systems iteratively retrieve curated domain knowledge and synthesize it into structured reports before answer selection. Although such pipelines can improve performance, their impact on reliability under model variability remains unclear. In real-world deployment, heterogeneous models may align, diverge, or synchronize errors in ways not captured by accuracy. We evaluated 34 LLMs on 169 expert-curated publicly available radiology questions, comparing zero-shot inference with a radiology-specific multi-step agentic retrieval condition in which all models received identical structured evidence reports derived from curated radiology knowledge. Agentic inference reduced inter-model decision dispersion (median entropy 0.48 vs. 0.13) and increased robustness of correctness across models (mean 0.74 vs. 0.81). Majority consensus also increased overall (P<0.001). Consensus strength and robust correctness remained correlated under both strategies (ρ=0.88 for zero-shot; ρ=0.87 for agentic), although high agreement did not guarantee correctness. Response verbosity showed no meaningful association with correctness. Among 572 incorrect outputs, 72% were associated with moderate or high clinically assessed severity, although inter-rater agreement was low (κ=0.02). Agentic retrieval therefore was associated with more concentrated decision distributions, stronger consensus, and higher cross-model robustness of correctness. These findings suggest that evaluating agentic systems through accuracy or agreement alone may not always be sufficient, and that complementary analyses of stability, cross-model robustness, and potential clinical impact are needed to characterize reliability under model variability.
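To make "inter-model decision dispersion" concrete, here is a minimal sketch that computes the Shannon entropy of the answers a set of models gives to one question, together with a simple majority share. The abstract does not state the exact entropy definition, normalisation, or consensus measure used in the study, so the base-2 logarithm, the function names, and the sample answers below are assumptions for illustration.

```python
import math
from collections import Counter


def decision_entropy(answers, base=2.0):
    """Shannon entropy of the empirical answer distribution across models.

    Returns 0.0 when every model picks the same option; larger values
    mean the models' decisions are more dispersed.
    """
    total = len(answers)
    probs = [count / total for count in Counter(answers).values()]
    return sum(-p * math.log(p, base) for p in probs)


def majority_share(answers):
    """Fraction of models that chose the most common answer."""
    return Counter(answers).most_common(1)[0][1] / len(answers)


# Hypothetical answers from eight models to one multiple-choice question.
zero_shot = ["A", "B", "A", "C", "B", "A", "D", "A"]
agentic = ["A", "A", "A", "A", "B", "A", "A", "A"]
h_zero, h_agentic = decision_entropy(zero_shot), decision_entropy(agentic)
```

Low entropy only says the models agree; as the abstract notes, agreement does not guarantee correctness, so a measure like this would need to be read alongside accuracy.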


2603.06270 2026-03-09 cs.CV cs.AI

HiPP-Prune: Hierarchical Preference-Conditioned Structured Pruning for Vision-Language Models

Lincen Bai, Hedi Tabia, Raul Santos-Rodriguez

2603.06266 2026-03-09 cs.RO

Towards Robotic Lake Maintenance: Integrating SONAR and Satellite Data to Assist Human Operators

Ahmed H. Elsayed, Christoph Manss, Tarek A. El-Mihoub, Andrej Lejman, Frederic Stahl

Comments Accepted to and presented at the 2026 IEEE International Conference on Mechatronics and Robotics Engineering (ICMRE)

2603.06265 2026-03-09 cs.CV

ODD-SEC: Onboard Drone Detection with a Spinning Event Camera

Kuan Dai, Hongxin Zhang, Sheng Zhong, Yi Zhou

2603.06260 2026-03-09 cs.LG cs.AI

Learning to Solve Orienteering Problem with Time Windows and Variable Profits

Songqun Gao, Zanxi Ruan, Patrick Floor, Marco Roveri, Luigi Palopoli, Daniele Fontanelli

Comments Accepted at ICLR 2026

2603.06256 2026-03-09 cs.CV cs.AI

GazeMoE: Perception of Gaze Target with Mixture-of-Experts

Zhuangzhuang Dai, Zhongxi Lu, Vincent G. Zakka, Luis J. Manso, Jose M Alcaraz Calero, Chen Li

Comments 8 pages, 3 figures, ICRA 2026