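arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.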

2603.11531 2026-03-13 cs.CV

Mobile-GS: Real-time Gaussian Splatting for Mobile Devices

Xiaobiao Du, Yida Wang, Kun Zhan, Xin Yu

Comments Project Page: https://xiaobiaodu.github.io/mobile-gs-project/

Abstract

3D Gaussian Splatting (3DGS) has emerged as a powerful representation for high-quality rendering across a wide range of applications. However, its high computational demands and large storage costs pose significant challenges for deployment on mobile devices. In this work, we propose a mobile-tailored real-time Gaussian Splatting method, dubbed Mobile-GS, enabling efficient inference of Gaussian Splatting on edge devices. Specifically, we first identify alpha blending as the primary computational bottleneck, since it relies on the time-consuming Gaussian depth sorting process. To solve this issue, we propose a depth-aware order-independent rendering scheme that eliminates the need for sorting, thereby substantially accelerating rendering. Although this order-independent rendering improves rendering speed, it may introduce transparency artifacts in regions with overlapping geometry due to the scarcity of rendering order. To address this problem, we propose a neural view-dependent enhancement strategy, enabling more accurate modeling of view-dependent effects conditioned on viewing direction, 3D Gaussian geometry, and appearance attributes. In this way, Mobile-GS can achieve both high-quality and real-time rendering. Furthermore, to facilitate deployment on memory-constrained mobile platforms, we also introduce first-order spherical harmonics distillation, a neural vector quantization technique, and a contribution-based pruning strategy to reduce the number of Gaussian primitives and compress the 3D Gaussian representation with the assistance of neural networks. Extensive experiments demonstrate that our proposed Mobile-GS achieves real-time rendering and compact model size while preserving high visual quality, making it well-suited for mobile applications.
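
As a rough illustration of why removing the depth sort is possible, the sketch below contrasts standard sorted alpha blending with a generic weighted order-independent blend for a single pixel. It is not the Mobile-GS scheme, whose details the abstract does not give; the exponential depth weight, the `falloff` parameter, and the function names are assumptions made for this example only.

```python
import math

def sorted_alpha_blend(fragments, background=0.0):
    """Front-to-back alpha compositing; needs a depth sort per pixel."""
    color, transmittance = 0.0, 1.0
    for depth, c, alpha in sorted(fragments, key=lambda f: f[0]):
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return color + transmittance * background

def weighted_oit_blend(fragments, background=0.0, falloff=1.0):
    """Generic weighted order-independent blend (illustrative, not Mobile-GS).

    Only sums and products are accumulated, so fragment order is irrelevant.
    """
    weighted_color = 0.0
    weight_sum = 0.0
    transmittance = 1.0
    for depth, c, alpha in fragments:
        # Hypothetical depth-aware weight: nearer fragments count more.
        w = alpha * math.exp(-falloff * depth)
        weighted_color += w * c
        weight_sum += w
        transmittance *= 1.0 - alpha
    if weight_sum == 0.0:
        return background
    average_color = weighted_color / weight_sum
    return (1.0 - transmittance) * average_color + transmittance * background

# Each fragment is (depth, color, alpha) for one pixel.
fragments = [(1.0, 0.2, 0.5), (2.0, 0.8, 0.5), (0.5, 0.5, 0.3)]
shuffled = list(reversed(fragments))
```

Calling `weighted_oit_blend(fragments)` and `weighted_oit_blend(shuffled)` gives the same value up to floating-point rounding, while the sorted variant has to reorder its input first. With several overlapping fragments the two blends generally differ, which is consistent with the transparency artifacts the abstract mentions.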

2603.11526 2026-03-13 cs.LG

CFD-HAR: User-controllable Privacy through Conditional Feature Disentanglement

Alex Gn, Fan Li, S Kuniyilh, Ada Axan

2603.11525 2026-03-13 cs.CV

MDS-VQA: Model-Informed Data Selection for Video Quality Assessment

Jian Zou, Xiaoyu Xu, Zhihua Wang, Yilin Wang, Balu Adsumilli, Kede Ma

2603.11521 2026-03-13 cs.CV cs.AI

EReCu: Pseudo-label Evolution Fusion and Refinement with Multi-Cue Learning for Unsupervised Camouflage Detection

Shuo Jiang, Gaojia Zhang, Min Tan, Yufei Yin, Gang Pan

Comments Accepted by CVPR 2026

2603.11520 2026-03-13 cs.CV cs.AI

FBCIR: Balancing Cross-Modal Focuses in Composed Image Retrieval

Chenchen Zhao, Jianhuan Zhuo, Muxi Chen, Zhaohua Zhang, Wenyu Jiang, Tianwen Jiang, Qiuyong Xiao, Jihong Zhang, Qiang Xu

Comments 20 pages, 5 figures, 15 tables

2603.11515 2026-03-13 cs.AI

Multi-Agent Collaboration for Automated Design Exploration on High Performance Computing Systems

Harshitha Menon, Charles F. Jekel, Kevin Korner, Brian Gunnarson, Nathan K. Brown, Michael Stees, M. Giselle Fernandez-Godino, Walter Nissen, Meir H. Shachar, Dane M. Sterbentz, William J. Schill, Yue Hao, Robert Rieben, William Quadros, Steve Owen, Scott Mitchell, Ismael D. Boureima, Jonathan L. Belof

2603.11513 2026-03-13 cs.CL

Can Small Language Models Use What They Retrieve? An Empirical Study of Retrieval Utilization Across Model Scale

Sanchit Pandey

Comments 10 pages, 5 figures, planning to submit to ARR March 2026. Code and evaluation data: https://anonymous.4open.science/r/rag-utilization-study-C67F . Earlier draft preprint available on Zenodo: https://zenodo.org/records/18870116 (note: this arXiv submission is an updated draft)

2603.11510 2026-03-13 cs.CL

Tiny Aya: Bridging Scale and Multilingual Depth

Alejandro R. Salamanca, Diana Abagyan, Daniel D'souza, Ammar Khairi, David Mora, Saurabh Dash, Viraat Aryabumi, Sara Rajaee, Mehrnaz Mofakhami, Ananya Sahu, Thomas Euyang, Brittawnya Prince, Madeline Smith, Hangyu Lin, Acyr Locatelli, Sara Hooker, Tom Kocmi, Aidan Gomez, Ivan Zhang, Phil Blunsom, Nick Frosst, Joelle Pineau, Beyza Ermis, Ahmet Üstün, Julia Kreutzer, Marzieh Fadaee

2603.11509 2026-03-13 cs.CV

Manifold-Optimal Guidance: A Unified Riemannian Control View of Diffusion Guidance

Zexi Jia, Pengcheng Luo, Zhengyao Fang, Jinchao Zhang, Jie Zhou

2603.11505 2026-03-13 cs.CV cs.AI cs.LG

Gen-Fab: A Variation-Aware Generative Model for Predicting Fabrication Variations in Nanophotonic Devices

Rambod Azimi, Yuri Grinberg, Dan-Xia Xu, Odile Liboiron-Ladouceur

Comments Accepted and published in Structural and Multidisciplinary Optimization (2026)

Details

DOI: 10.1007/s00158-026-04272-3
Journal ref: Structural and Multidisciplinary Optimization (2026)

Abstract

Silicon photonic devices often exhibit fabrication-induced variations such as over-etching, under-etching, and corner rounding, which can significantly alter device performance. These variations are non-uniform and are influenced by feature size and shape. Accurate digital twins are therefore needed to predict the range of possible fabricated outcomes for a given design. In this paper, we introduce Gen-Fab, a conditional generative adversarial network (cGAN) based on Pix2Pix to predict and model uncertainty in photonic fabrication outcomes. The proposed method takes a design layout (in GDS format) as input and produces diverse high-resolution predictions similar to scanning electron microscope (SEM) images of fabricated devices, capturing the range of process variations at the nanometer scale. To enable one-to-many mapping, we inject a latent noise vector at the model bottleneck. We compare Gen-Fab against three baselines: (1) a deterministic U-Net predictor, (2) an inference-time Monte Carlo Dropout U-Net, and (3) an ensemble of varied U-Nets. Evaluations on an out-of-distribution dataset of fabricated photonic test structures demonstrate that Gen-Fab outperforms all baselines in both accuracy and uncertainty modeling. An additional distribution shift analysis further confirms its strong generalization to unseen fabrication geometries. Gen-Fab achieves the highest intersection-over-union (IoU) score of 89.8%, outperforming the deterministic U-Net (85.3%), the MC-Dropout U-Net (83.4%), and varying U-Nets (85.8%). It also better aligns with the distribution of real fabrication outcomes, achieving lower Kullback-Leibler divergence and Wasserstein distance.
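
For readers unfamiliar with the IoU metric quoted above, here is a minimal sketch of intersection-over-union on binary masks. It assumes masks are given as equally sized nested lists of 0/1 values and that two empty masks score 1.0; both are conventions chosen for this example, and the paper's own evaluation code may differ.

```python
def iou(pred, target):
    """Intersection-over-union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel set in both masks
            union += p or t          # pixel set in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union

predicted = [[1, 1], [0, 0]]
fabricated = [[1, 0], [0, 0]]
score = iou(predicted, fabricated)
```

Here one pixel is shared and two pixels are covered by at least one mask, so `score` is 0.5.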

2603.11503 2026-03-13 cs.LG

Sharpness-Aware Minimization for Generalized Embedding Learning in Federated Recommendation

Fengyuan Yu, Xiaohua Feng, Yuyuan Li, Changwang Zhang, Jun Wang, Chaochao Chen

Comments Accepted by the ACM Web Conference 2026

2603.11498 2026-03-13 cs.CV

ActiveFreq: Integrating Active Learning and Frequency Domain Analysis for Interactive Segmentation

Lijun Guo, Qian Zhou, Zidi Shi, Hua Zou, Gang Ke

Comments 16 pages, 8 figures, published in Knowledge-Based Systems

Details

DOI: 10.1016/j.knosys.2025.114091
Journal ref: Knowledge-Based Systems 327 (2025) 114091

Abstract

Interactive segmentation is commonly used in medical image analysis to obtain precise, pixel-level labeling, typically involving iterative user input to correct mislabeled regions. However, existing approaches often fail to fully utilize user knowledge from interactive inputs and achieve comprehensive feature extraction. Specifically, these methods tend to treat all mislabeled regions equally, selecting them randomly for refinement without evaluating each region's potential impact on segmentation quality. Additionally, most models rely solely on spatial domain features, overlooking frequency domain information that could enhance feature extraction and improve performance. To address these limitations, we propose ActiveFreq, a novel interactive segmentation framework that integrates active learning and frequency domain analysis to minimize human intervention while achieving high-quality labeling. ActiveFreq introduces AcSelect, an autonomous module that prioritizes the most informative mislabeled regions, ensuring maximum performance gain from each click. Moreover, we develop FreqFormer, a segmentation backbone incorporating a Fourier transform module to map features from the spatial to the frequency domain, enabling richer feature extraction. Evaluations on the ISIC-2017 and OAI-ZIB datasets demonstrate that ActiveFreq achieves high performance with reduced user interaction, achieving 3.74 NoC@90 on ISIC-2017 and 9.27 NoC@90 on OAI-ZIB, with 23.5% and 12.8% improvements over previous best results, respectively. Under minimal input conditions, such as two clicks, ActiveFreq reaches mIoU scores of 85.29% and 75.76% on ISIC-2017 and OAI-ZIB, highlighting its efficiency and accuracy in interactive medical segmentation.
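
The NoC@90 figures above count how many clicks are needed before a segmentation reaches 90% IoU. A minimal sketch of that metric follows, assuming the per-click IoU values for each image are already available and that images never reaching the threshold are charged a cap of 20 clicks. The cap and the function names are assumptions based on common practice in interactive segmentation, not details taken from the abstract.

```python
def clicks_to_reach(iou_per_click, threshold=0.90, max_clicks=20):
    """Index (1-based) of the first click whose IoU meets the threshold.

    Returns max_clicks if the threshold is never reached (assumed cap).
    """
    for n, score in enumerate(iou_per_click[:max_clicks], start=1):
        if score >= threshold:
            return n
    return max_clicks

def noc(samples, threshold=0.90, max_clicks=20):
    """Mean number of clicks over a dataset of per-image IoU curves."""
    counts = [clicks_to_reach(s, threshold, max_clicks) for s in samples]
    return sum(counts) / len(counts)

# Hypothetical IoU-after-each-click curves for three images.
curves = [[0.95], [0.50, 0.91], [0.10]]
mean_clicks = noc(curves)
```

Lower values are better: a NoC@90 of 3.74 means that, on average, fewer than four clicks per image were needed to reach 90% IoU.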

2603.11495 2026-03-13 cs.CL

Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs

Kunfeng Chen, Qihuang Zhong, Juhua Liu, Bo Du, Dacheng Tao

Comments 17 pages, 8 figures

2603.11493 2026-03-13 cs.CV cs.AI cs.CY

OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure

Chuancheng Shi, Wenhua Wu, Fei Shen, Xiaogang Zhu, Kun Hu, Zhiyong Wang

2603.11492 2026-03-13 cs.CV cs.AI

SPEGC: Continual Test-Time Adaptation via Semantic-Prompt-Enhanced Graph Clustering for Medical Image Segmentation

Xiaogang Du, Jiawei Zhang, Tongfei Liu, Tao Lei, Yingbo Wang

Comments Accepted to CVPR 2026. 16 pages, 7 figures

2603.11481 2026-03-13 cs.CV cs.AI

INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs

Junqi Yang, Yuecong Min, Jie Zhang, Shiguang Shan, Xilin Chen

2603.11480 2026-03-13 cs.RO

SPARK: Skeleton-Parameter Aligned Retargeting on Humanoid Robots with Kinodynamic Trajectory Optimization

Hanwen Wang, Qiayuan Liao, Bike Zhang, Kunzhao Ren, Koushil Sreenath, Xiaobin Xiong

2603.11476 2026-03-13 cs.LG q-bio.QM

Leveraging Phytolith Research using Artificial Intelligence

Andrés G. Mejía Ramón, Kate Dudgeon, Nina Witteveen, Dolores Piperno, Michael Kloster, Luigi Palopoli, Mónica Moraes R., José M. Capriles, Umberto Lombardo

Comments 45 pages, 23 figures

2603.11475 2026-03-13 cs.LG cs.NI

Deep Learning Network-Temporal Models For Traffic Prediction

Yufeng Xin, Ethan Fan

2603.11470 2026-03-13 cs.RO

NFPO: Stabilized Policy Optimization of Normalizing Flow for Robotic Policy Learning

Diyuan Shi, Yiqi Tang, Zifeng Zhuang, Donglin Wang

2603.11462 2026-03-13 cs.LG cs.AI

Bridging Discrete Marks and Continuous Dynamics: Dual-Path Cross-Interaction for Marked Temporal Point Processes

Yuxiang Liu, Qiao Liu, Tong Luo, Yanglei Gan, Peng He, Yao Liu

2603.11456 2026-03-13 cs.LG

UniHetCO: A Unified Heterogeneous Representation for Multi-Problem Learning in Unsupervised Neural Combinatorial Optimization

Kien X. Nguyen, Ilya Safro

2603.11447 2026-03-13 cs.RO

Enhancing Lightweight Vision Language Models through Group Competitive Learning for Socially Compliant Navigation

Xinyu Zhang, Atsushi Konno, Toshihiko Yamasaki, Ling Xiao

2603.11446 2026-03-13 cs.CL

LLM-Assisted Causal Structure Disambiguation and Factor Extraction for Legal Judgment Prediction

Yuzhi Liang, Lixiang Ma, Xinrong Zhu

2603.11441 2026-03-13 cs.CV

Detect Anything in Real Time: From Single-Prompt Segmentation to Multi-Class Detection

Mehmet Kerem Turkcan

2603.11439 2026-03-13 cs.CV

Stay in your Lane: Role Specific Queries with Overlap Suppression Loss for Dense Video Captioning

Seung Hyup Baek, Jimin Lee, Hyeongkeun Lee, Jae Won Cho

Comments Accepted to CVPR 2026

2603.11436 2026-03-13 cs.LG

ZTab: Domain-based Zero-shot Annotation for Table Columns

Ehsan Hoseinzade, Ke Wang

2603.11433 2026-03-13 cs.AI cs.CR

Adversarial Reinforcement Learning for Detecting False Data Injection Attacks in Vehicular Routing

Taha Eghtesad, Yevgeniy Vorobeychik, Aron Laszka

2603.11431 2026-03-13 cs.RO

A Generalized Theory of Load Distribution in Redundantly-actuated Robotic Systems

Joshua Flight, Clément Gosselin

Comments 20 pages, 11 figures. Submitted to The International Journal of Robotics Research

2603.11423 2026-03-13 cs.CV

Beyond Single-Sample: Reliable Multi-Sample Distillation for Video Understanding

Songlin Li, Xin Zhu, Zechao Guan, Peipeng Chen, Jian Yao