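arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.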

2603.02351 2026-03-04 cs.CV

MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry

Leo Kaixuan Cheng, Abdus Shaikh, Ruofan Liang, Zhijie Wu, Yushi Guan, Nandita Vijaykumar

Comments Project page: https://leochengkx.github.io/MERG3R/

Abstract

Recent advancements in neural visual geometry, including transformer-based models such as VGGT and Pi3, have achieved impressive accuracy on 3D reconstruction tasks. However, their reliance on full attention makes them fundamentally limited by GPU memory capacity, preventing them from scaling to large, unordered image collections. We introduce MERG3R, a training-free divide-and-conquer framework that enables geometric foundation models to operate far beyond their native memory limits. MERG3R first reorders and partitions unordered images into overlapping, geometrically diverse subsets that can be reconstructed independently. It then merges the resulting local reconstructions through an efficient global alignment and confidence-weighted bundle adjustment procedure, producing a globally consistent 3D model. Our framework is model-agnostic and can be paired with existing neural geometry models. Across large-scale datasets, including 7-Scenes, NRGBD, Tanks & Temples, and Cambridge Landmarks, MERG3R consistently improves reconstruction accuracy, memory efficiency, and scalability, enabling high-quality reconstruction when the dataset exceeds memory capacity limits.
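
The partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the images have already been reordered, uses fixed-size windows with a fixed overlap, and leaves out the reordering, global alignment, and bundle adjustment stages. The function name `overlapping_chunks` and its parameters `size` and `overlap` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def overlapping_chunks(items: Sequence[T], size: int, overlap: int) -> List[List[T]]:
    """Split an ordered sequence into windows that share `overlap` items.

    Hypothetical helper: the shared items stand in for the common views
    that would let neighbouring local reconstructions be aligned later.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if len(items) == 0:
        return []

    step = size - overlap  # how far each window advances
    chunks: List[List[T]] = []
    start = 0
    while True:
        chunks.append(list(items[start:start + size]))
        # Stop once the current window reaches the end of the sequence.
        if start + size >= len(items):
            break
        start += step
    return chunks


if __name__ == "__main__":
    # Ten placeholder image names, windows of four with one shared image.
    images = [f"img_{i:02d}.jpg" for i in range(10)]
    for subset in overlapping_chunks(images, size=4, overlap=1):
        print(subset)
```

Each subset would fit within a model's memory budget on its own, and the shared images give a later merging step something to align on. The last window may be shorter than `size` when the sequence length does not divide evenly.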

2603.02349 2026-03-04 cs.LG

Learning graph topology from metapopulation epidemic encoder-decoder

Xin Li, Jonathan Cohen, Shai Pilosof, Rami Puzis

2603.02348 2026-03-04 cs.LG cs.AI cs.RO

Diffusion-MPC in Discrete Domains: Feasibility Constraints, Horizon Effects, and Critic Alignment: Case study with Tetris

Haochuan Kevin Wang

Comments 7 pages, 3 figures, 2 tables. Includes regret diagnostics and compute-quality frontier analysis. Code and experiment configurations available in the Diffusion-Tetris repository

2603.02333 2026-03-04 cs.CL

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects

Xiaoyu Luo, Wenrui Yu, Qiongxiu Li, Johannes Bjerva

Comments 21 pages, 9 figures

2603.02329 2026-03-04 cs.CV

HAMMER: Harnessing MLLM via Cross-Modal Integration for Intention-Driven 3D Affordance Grounding

Lei Yao, Yong Chen, Yuejiao Su, Yi Wang, Moyun Liu, Lap-Pui Chau

Comments Accepted by CVPR 2026. Project Page: https://rayyoh.github.io/Hammer

2603.02291 2026-03-04 cs.RO

Goal-Oriented Semantic Communication for ISAC-Enabled Robotic Obstacle Avoidance

Wenjie Liu, Yansha Deng, Henk Wymeersch

Comments 13 pages, 15 figures

2603.02286 2026-03-04 cs.CV cs.AI

Beyond Prompt Degradation: Prototype-guided Dual-pool Prompting for Incremental Object Detection

Yaoteng Zhang, Zhou Qing, Junyu Gao, Qi Wang

Comments Our paper has been accepted to CVPR 2026

2603.02285 2026-03-04 cs.SD cs.LG eess.AS

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study

Zijian Yang, Jörg Barkoczi, Ralf Schlüter, Hermann Ney

Comments accepted to ICASSP 2026

2603.02281 2026-03-04 cs.LG cs.AI quant-ph

Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization

Kaiyang Xing, Han Fang, Zhaoyun Chen, Zhonghui Li, Yang Yang, Weiming Zhang, Guoping Guo

Comments 12 pages, 5 figures

2603.02280 2026-03-04 cs.LG cs.AI

Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning

Jinge Ma, Fengqing Zhu

Comments Accepted to CVPR 2026

2603.02273 2026-03-04 cs.LG

Graph Attention Based Prioritization of Disease Responsible Genes from Multimodal Alzheimer's Network

Binon Teji, Subhajit Bandyopadhyay, Swarup Roy

2603.02270 2026-03-04 cs.CV

From Visual to Multimodal: Systematic Ablation of Encoders and Fusion Strategies in Animal Identification

Vasiliy Kudryavtsev, Kirill Borodin, German Berezin, Kirill Bubenchikov, Grach Mkrtchian, Alexander Ryzhkov

Comments Published at MDPI Journal of Imaging (see at https://www.mdpi.com/2313-433X/12/1/30)

2603.02268 2026-03-04 cs.LG cs.AI

PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis

Jeet Bandhu Lahiri, Parshva Runwal, Arvasu Kulkarni, Mahir Jain, Aditya Ray Mishra, Siddharth Panwar, Sandeep Singh

Comments 14 pages, 1 figure, 5 tables

2603.02267 2026-03-04 cs.LG cs.AI

Boosting Meta-Learning for Few-Shot Text Classification via Label-guided Distance Scaling

Yunlong Gao, Xinyue Liu, Yingbo Wang, Linlin Zong, Bo Xu

2603.02266 2026-03-04 cs.SD cs.AI eess.AS

When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning

Ruixiang Mao, Xiangnan Ma, Dan Chen, Ziming Zhu, Yuan Ge, Aokai Hao, Haishu Zhao, Yifu Huo, Qing Yang, Kaiyan Chang, Xiaoqian Liu, Chenglong Wang, Qiaozhi He, Tong Xiao, Jingbo Zhu

Comments Under Review

2603.02265 2026-03-04 cs.LG cs.AI

High-order Knowledge Based Network Controllability Robustness Prediction: A Hypergraph Neural Network Approach

Shibing Mo, Jiarui Zhang, Jiayu Xie, Xiangyi Teng, Jing Liu

2603.02258 2026-03-04 cs.CL cs.AI cs.LG

Universal Conceptual Structure in Neural Translation: Probing NLLB-200's Multilingual Geometry

Kyle Elliott Mathewson

Comments 14 figures; code and interactive toolkit available at https://github.com/kylemathewson/InterpretCognates

2603.02256 2026-03-04 cs.CV

CamDirector: Towards Long-Term Coherent Video Trajectory Editing

Zhihao Shi, Kejia Yin, Weilin Wan, Yuhongze Zhou, Yuanhao Yu, Xinxin Zuo, Qiang Sun, Juwei Lu

2603.02255 2026-03-04 cs.SD cs.AI eess.AS

MEBM-Speech: Multi-scale Enhanced BrainMagic for Robust MEG Speech Detection

Li Songyi, Zheng Linze, Liang Jinghua, Zhang Zifeng

Comments 5 pages, 1 figure. To appear in the PNPL Competition Workshop at NeurIPS 2025

2603.02254 2026-03-04 cs.SD cs.AI eess.AS

MEBM-Phoneme: Multi-scale Enhanced BrainMagic for End-to-End MEG Phoneme Classification

Liang Jinghua, Zhang Zifeng, Li Songyi, Zheng Linze

Comments 5 pages, 1 figure. To appear in the PNPL Competition Workshop at NeurIPS 2025

2603.02250 2026-03-04 cs.SD eess.AS

SGPA: Spectrogram-Guided Phonetic Alignment for Feasible Shapley Value Explanations in Multimodal Large Language Models

Paweł Pozorski, Jakub Muszyński, Maria Ganzha

Comments Submitted for admission in Interspeech 2026 conference

2603.02240 2026-03-04 cs.AI cs.CR

SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

Varun Pratap Bhardwaj

Comments 11 pages, 5 tables, 1 figure. Code: https://github.com/varun369/SuperLocalMemoryV2

2603.02239 2026-03-04 cs.AI cs.SE

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

MZ Naser, Ahmad Bani Awwad, Zoie McCreery, Radwa Eissa, Ahmad Naser, Gianluca Cusatis, Andrew Metcalf, Kapil Madathil, Jamal Abdalla, Venkatesh Kodur, Mohammad Reza Saeb

2603.02236 2026-03-04 cs.LG cs.AI

CUDABench: Benchmarking LLMs for Text-to-CUDA Generation

Jiace Zhu, Wentao Chen, Qi Fan, Zhixing Ren, Junying Wu, Xing Zhe Chai, Chotiwit Rungrueangwutthinon, Yehan Ma, An Zou

2603.02235 2026-03-04 cs.LG cs.AI cs.SE

Talking with Verifiers: Automatic Specification Generation for Neural Network Verification

Yizhak Y. Elboher, Reuven Peleg, Zhouxing Shi, Guy Katz, Jan Křetínský

2603.02233 2026-03-04 cs.LG cs.AI

Adaptive Personalized Federated Learning via Multi-task Averaging of Kernel Mean Embeddings

Jean-Baptiste Fermanian, Batiste Le Bars, Aurélien Bellet

2603.02232 2026-03-04 cs.LG cs.AI

Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback

Amirhossein Afsharrad, Ruida Zhou, Luca Viano, Sanjay Lall, Mohammad Ghavamzadeh

2603.02231 2026-03-04 cs.LG cs.AI

Physics-Informed Neural Networks with Architectural Physics Embedding for Large-Scale Wave Field Reconstruction

Huiwen Zhang, Feng Ye, Chu Ma

Comments 20 pages, 17 figures

2603.02229 2026-03-04 cs.LG cs.CL

Safety Training Persists Through Helpfulness Optimization in LLM Agents

Benjamin Plaut

Comments Under submission

2603.02228 2026-03-04 cs.LG cs.AI

Neural Paging: Learning Context Management Policies for Turing-Complete Agents

Liang Chen, Qi Liu