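arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.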

2603.08709 2026-03-10 cs.CV cs.AI

Scale Space Diffusion

Soumik Mukhopadhyay, Prateksha Udhayanan, Abhinav Shrivastava

Comments Project website: https://prateksha.github.io/projects/scale-space-diffusion/ . The first two authors contributed equally

Abstract

Diffusion models degrade images through noise, and reversing this process reveals an information hierarchy across timesteps. Scale-space theory exhibits a similar hierarchy via low-pass filtering. We formalize this connection and show that highly noisy diffusion states contain no more information than small, downsampled images - raising the question of why they must be processed at full resolution. To address this, we fuse scale spaces into the diffusion process by formulating a family of diffusion models with generalized linear degradations and practical implementations. Using downsampling as the degradation yields our proposed Scale Space Diffusion. To support Scale Space Diffusion, we introduce Flexi-UNet, a UNet variant that performs resolution-preserving and resolution-increasing denoising using only the necessary parts of the network. We evaluate our framework on CelebA and ImageNet and analyze its scaling behavior across resolutions and network depths. Our project website ( https://prateksha.github.io/projects/scale-space-diffusion/ ) is available publicly.
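
As a rough illustration of the idea in the abstract above (not the authors' implementation), the sketch below treats downsampling as a linear degradation: it average-pools an image by a factor of 2 per level and then adds Gaussian noise at the reduced resolution. The function names `avg_pool2` and `degrade`, the 2x2 average-pooling choice, and the noise parameter `sigma` are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def avg_pool2(img):
    """Halve the resolution of a 2D list by averaging each 2x2 block.

    Average pooling is a simple low-pass filter followed by subsampling,
    which is one way to step down a scale space (an assumed choice here).
    """
    h, w = len(img), len(img[0])
    return [
        [
            (img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
            for j in range(0, w - 1, 2)
        ]
        for i in range(0, h - 1, 2)
    ]


def degrade(img, levels, sigma, rng):
    """Hypothetical degradation: downsample `levels` times, then add noise.

    `sigma` is the standard deviation of the Gaussian noise added to each
    pixel of the downsampled image; `rng` is a random.Random instance.
    """
    out = img
    for _ in range(levels):
        out = avg_pool2(out)
    return [[p + sigma * rng.gauss(0.0, 1.0) for p in row] for row in out]


if __name__ == "__main__":
    image = [
        [1.0, 1.0, 3.0, 3.0],
        [1.0, 1.0, 3.0, 3.0],
        [5.0, 5.0, 7.0, 7.0],
        [5.0, 5.0, 7.0, 7.0],
    ]
    # One level of downsampling with noise: a 4x4 image becomes 2x2.
    print(degrade(image, levels=1, sigma=0.1, rng=random.Random(0)))
```

The point of the sketch is only that a heavily degraded state can live at a much smaller resolution than the original image, which is the observation the abstract raises about processing noisy states at full resolution.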

2603.08708 2026-03-10 cs.CV

FVG-PT: Adaptive Foreground View-Guided Prompt Tuning for Vision-Language Models

Haoyang Li, Liang Wang, Siyu Zhou, Jiacheng Sun, Jing Jiang, Chao Wang, Guodong Long, Yan Peng

Comments 27 Pages, 9 Figures, 15 Tables

2603.08706 2026-03-10 cs.AI cs.CL cs.LG

Agentic Critical Training

Weize Liu, Minghui Liu, Sy-Tuyen Ho, Souradip Chakraborty, Xiyao Wang, Furong Huang

Comments Project page: https://attention-is-all-i-need.github.io/ACT/

2603.08704 2026-03-10 cs.AI

Evaluating Financial Intelligence in Large Language Models: Benchmarking SuperInvesting AI with LLM Engines

Akshay Gulati, Kanha Singhania, Tushar Banga, Parth Arora, Anshul Verma, Vaibhav Kumar Singh, Agyapal Digra, Jayant Singh Bisht, Danish Sharma, Varun Singla, Shubh Garg

Comments 12 pages, 6 Figures, 5 Tables

2603.08703 2026-03-10 cs.CV

HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising

Kai Zou, Dian Zheng, Hongbo Liu, Tiankai Hang, Bin Liu, Nenghai Yu

Comments Project page: https://jacky-hate.github.io/HiAR/ Code: https://github.com/Jacky-hate/HiAR

2603.08700 2026-03-10 cs.DS cs.CC

Learning Functions of Halfspaces

Josh Alman, Shyamal Patel, Rocco A. Servedio

2603.08692 2026-03-10 cs.AI

A Multi-Objective Optimization Approach for Sustainable AI-Driven Entrepreneurship in Resilient Economies

Anas ALsobeh, Raneem Alkurdi

Comments 35 Pages

2603.08687 2026-03-10 cs.LG cs.AI

Split Federated Learning Architectures for High-Accuracy and Low-Delay Model Training

Yiannis Papageorgiou, Yannis Thomas, Ramin Khalili, Iordanis Koutsopoulos

2603.08685 2026-03-10 cs.NI

Predicting Conflict Impact on Performance in O-RAN

Pietro Brach del Prever, Niloofar Mohamadi, Salvatore D'Oro, Leonardo Bonati, Michele Polese, Łukasz Kułacz, Piotr Jaworski, Adrian Kliks, Heiko Lehmann, Tommaso Melodia

Comments INFOCOM 2026 Workshop - 6G AI-RAN: AI Native Distributed Intelligence for 6G Networks. 6 pages, 5 figures, 3 tables

2603.08681 2026-03-10 cs.CV

ER-Pose: Rethinking Keypoint-Driven Representation Learning for Real-Time Human Pose Estimation

Nanjun Li, Pinqi Cheng, Zean Liu, Minghe Tian, Xuanyin Wang

Abstract

Single-stage multi-person pose estimation aims to jointly perform human localization and keypoint prediction within a unified framework, offering advantages in inference efficiency and architectural simplicity. Consequently, multi-scale real-time detection architectures, such as YOLO-like models, are widely adopted for real-time pose estimation. However, these approaches typically inherit a box-driven modeling paradigm from object detection, in which pose estimation is implicitly constrained by bounding-box supervision during training. This formulation introduces biases in sample assignment and feature representation, resulting in task misalignment and ultimately limiting pose estimation accuracy. In this work, we revisit box-driven single-stage pose estimation from a keypoint-driven perspective and identify semantic conflicts among parallel objectives as a key source of performance degradation. To address this issue, we propose a keypoint-driven learning paradigm that elevates pose estimation to a primary prediction objective. Specifically, we remove bounding-box prediction and redesign the prediction head to better accommodate the high-dimensional structured representations for pose estimation. We further introduce a keypoint-driven dynamic sample assignment strategy to align training objectives with pose evaluation metrics, enabling dense supervision during training and efficient NMS-free inference. In addition, we propose a smooth OKS-based loss function to stabilize optimization in regression-based pose estimation. Based on these designs, we develop a single-stage multi-person pose estimation framework, termed ER-Pose. On MS COCO and CrowdPose, ER-Pose-n achieves AP improvements of 3.2/6.7 without pre-training and 7.4/4.9 with pre-training respectively compared with the baseline YOLO-Pose. These improvements are achieved with fewer parameters and higher inference efficiency.
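
For readers unfamiliar with the metric behind the OKS-based loss mentioned in the abstract, here is a minimal sketch of Object Keypoint Similarity (OKS) as commonly defined for COCO keypoint evaluation. It is not the paper's smooth loss, whose form the abstract does not give; the function name `oks`, its argument layout, and the requirement that `area` be positive are illustrative assumptions.

```python
import math


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity between predicted and ground-truth keypoints.

    pred, gt : lists of (x, y) keypoint coordinates
    visible  : per-keypoint visibility flags; only flags > 0 are scored
    area     : object area (the squared scale s^2), assumed positive
    sigmas   : per-keypoint falloff constants; COCO uses k_i = 2 * sigma_i
    """
    total = 0.0
    count = 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared pixel distance
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    gt = [(0.0, 0.0), (0.0, 0.0)]
    pred = [(0.0, 0.0), (1.0, 0.0)]
    print(oks(pred, gt, visible=[1, 1], area=1.0, sigmas=[0.5, 0.5]))
```

OKS equals 1 for a perfect prediction and decays toward 0 as keypoints drift, so a loss built on it would typically be some decreasing function of it, such as 1 - OKS. The abstract does not say which form ER-Pose uses.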

2603.08679 2026-03-10 cs.LG cs.AI cs.GT econ.TH

A New Lower Bound for the Random Offerer Mechanism in Bilateral Trade using AI-Guided Evolutionary Search

Yang Cai, Vineet Gupta, Zun Li, Aranyak Mehta

2603.08676 2026-03-10 stat.ML cs.LG stat.CO

Momentum SVGD-EM for Accelerated Maximum Marginal Likelihood Estimation

Adam Rozzio, Rafael Athanasiades, O. Deniz Akyildiz

Comments Accepted to AISTATS 2026

2603.08674 2026-03-10 cs.CV

Talking Together: Synthesizing Co-Located 3D Conversations from Audio

Mengyi Shan, Shouchieh Chang, Ziqian Bai, Shichen Liu, Yinda Zhang, Luchuan Song, Rohit Pandey, Sean Fanello, Zeng Huang

Comments Accepted to CVPR 2026

2603.08668 2026-03-10 cs.RO

Exp-Force: Experience-Conditioned Pre-Grasp Force Selection with Vision-Language Models

Siqi Shang, Minchao Huang, Bill Fan, Lillian Chin

2603.08667 2026-03-10 quant-ph cs.LG hep-ex

Characterization and upgrade of a quantum graph neural network for charged particle tracking

Matteo Argenton, Laura Cappelli, Concezio Bozzi

Comments 16 total pages, 15 figures

2603.08661 2026-03-10 cs.CV

ImprovedGS+: A High-Performance C++/CUDA Re-Implementation Strategy for 3D Gaussian Splatting

Jordi Muñoz Vicente

Comments 6 pages, 1 figure. Technical Report. This work introduces ImprovedGS+, a library-free C++/CUDA implementation for 3D Gaussian Splatting within the LichtFeld-Studio framework. Source code available at https://github.com/jordizv/ImprovedGS-Plus

2603.08660 2026-03-10 cs.LG cs.CL

How Far Can Unsupervised RLVR Scale LLM Training?

Bingxiang He, Yuxin Zuo, Zeyuan Liu, Shangziqi Zhao, Zixuan Fu, Junlin Yang, Cheng Qian, Kaiyan Zhang, Yuchen Fan, Ganqu Cui, Xiusi Chen, Youbang Sun, Xingtai Lv, Xuekai Zhu, Li Sheng, Ran Li, Huan-ang Gao, Yuchen Zhang, Bowen Zhou, Zhiyuan Liu, Ning Ding

Comments Accepted to ICLR 2026

2603.08658 2026-03-10 cs.LG

Context-free Self-Conditioned GAN for Trajectory Forecasting

Tiago Rodrigues de Almeida, Eduardo Gutierrez Maestro, Oscar Martinez Mozos

Comments Accepted at the 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)

2603.08657 2026-03-10 cs.HC

Clarity and Computational Efficiency of Orbital Boundary Labeling

Markus Wallinger, Annika Bonerath, Soeren Terziadis, Jules Wulms, Martin Nöllenburg

2603.08656 2026-03-10 math.NA cs.NA

Structure-preserving model reduction on manifolds of port-Hamiltonian systems

Silke Glas, Hongliang Mu

Comments 14 pages, 4 figures

2603.08655 2026-03-10 cs.AI cs.CL cs.IR

OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning

Krista Opsahl-Ong, Arnav Singhvi, Jasmine Collins, Ivan Zhou, Cindy Wang, Ashutosh Baheti, Owen Oertell, Jacob Portes, Sam Havens, Erich Elsen, Michael Bendersky, Matei Zaharia, Xing Chen

Comments 24 pages, 16 figures. Introduces the OfficeQA Pro benchmark for grounded reasoning over enterprise documents

2603.08654 2026-03-10 eess.SY cs.SY

Carbon-aware Market Participation for Building Energy Management Systems

Young-ho Cho, Mohamad Chehade, Fatima Al-Janahi, Sol Lim, Javad Mohammadi, Hao Zhu

2603.08652 2026-03-10 cs.AI

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Haodong Li, Chunmei Qing, Huanyu Zhang, Dongzhi Jiang, Yihang Zou, Hongbo Peng, Dingming Li, Yuhong Dai, ZePeng Lin, Juanxi Tian, Yi Zhou, Siqi Dai, Jingwei Wu

Comments 21 pages, 7 figures, 7 tables

2603.08649 2026-03-10 cs.LG

Divide and Predict: An Architecture for Input Space Partitioning and Enhanced Accuracy

Fenix W. Huang, Henning S. Mortveit, Christian M. Reidys

Comments Under review; 24 pages; 8 figures

2603.08648 2026-03-10 cs.CV

CAST: Modeling Visual State Transitions for Consistent Video Retrieval

Yanqing Liu, Yingcheng Liu, Fanghong Dong, Budianto Budianto, Cihang Xie, Yan Jiao

2603.08647 2026-03-10 cs.LG

Grow, Don't Overwrite: Fine-tuning Without Forgetting

Dyah Adila, Hanna Mazzawi, Benoit Dherin, Xavier Gonzalvo

2603.08646 2026-03-10 math.LO cs.LO

On the expressive power of inquisitive team logic and inquisitive first-order logic

Juha Kontinen, Ivano Ciardelli

2603.08645 2026-03-10 cs.CV cs.GR cs.LG

Retrieval-Augmented Gaussian Avatars: Improving Expression Generalization

Matan Levy, Gavriel Habib, Issar Tzachor, Dvir Samuel, Rami Ben-Ari, Nir Darshan, Or Litany, Dani Lischinski

2603.08641 2026-03-10 cs.IT math.IT

Coherence-Aware Over-the-Air Distributed Learning under Heterogeneous Link Impairments

Mehdi Karbalayghareh, David J. Love, Christopher G. Brinton

Comments This paper has been accepted for publication in IEEE Journal on Special Areas in Information Theory (JSAIT)

2603.08633 2026-03-10 eess.SY cs.SY

Reachability-based Temporal Logic Verification for Reliable LLM-guided Human-Autonomy Teaming

Joonwon Choi, Kartik Anand Pant, Karthik Nune, Inseok Hwang