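arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translation, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.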

2603.14958 2026-03-17 cs.LG

Lightweight User-Personalization Method for Closed Split Computing

Yuya Okada, Takayuki Nishio

Comments 15 pages, 12 figures

English abstract

Split Computing enables collaborative inference between edge devices and the cloud by partitioning a deep neural network into an edge-side head and a server-side tail, reducing latency and limiting exposure of raw input data. However, inference performance often degrades in practical deployments due to user-specific data distribution shifts, unreliable communication, and privacy-oriented perturbations, especially in closed environments where model architectures and parameters are inaccessible. To address this challenge, we propose SALT (Split-Adaptive Lightweight Tuning), a lightweight adaptation framework for closed Split Computing systems. SALT introduces a compact client-side adapter that refines intermediate representations produced by a frozen head network, enabling effective model adaptation without modifying the head or tail networks or increasing communication overhead. By modifying only the training conditions, SALT supports multiple adaptation objectives, including user personalization, communication robustness, and privacy-aware inference. Experiments using ResNet-18 on CIFAR-10 and CIFAR-100 show that SALT achieves higher accuracy than conventional retraining and fine-tuning while significantly reducing training cost. On CIFAR-10, SALT improves personalized accuracy from 88.1% to 93.8% while reducing training latency by more than 60%. SALT also maintains over 90% accuracy under 75% packet loss and preserves high accuracy (about 88% at sigma = 1.0) under noise injection. These results demonstrate that SALT provides an efficient and practical adaptation framework for real-world Split Computing systems.

2603.14957 2026-03-17 cs.CV

CyCLeGen: Cycle-Consistent Layout Prediction and Image Generation in Vision Foundation Models

Xiaojun Shan, Haoyu Shen, Yucheng Mao, Xiang Zhang, Abhay Anand, Bingnan Li, Haiyang Xu, Zhuowen Tu

2603.14956 2026-03-17 cs.LG

SFedHIFI: Fire Rate-Based Heterogeneous Information Fusion for Spiking Federated Learning

Ran Tao, Qiugang Zhan, Shantian Yang, Xiurui Xie, Qi Tian, Guisong Liu

Comments 9 pages, 1 figure

2603.14953 2026-03-17 cs.CV cs.AI

Learning Question-Aware Keyframe Selection with Synthetic Supervision for Video Question Answering

Minchan Kwon, Hyounguk Shon, Junmo Kim

2603.14952 2026-03-17 cs.CV

Pansharpening for Thin-Cloud Contaminated Remote Sensing Images: A Unified Framework and Benchmark Dataset

Songcheng Du, Yang Zou, Jiaxin Li, Mingxuan Liu, Ying Li, Changjing Shang, Qiang Shen

Comments 11 pages, 5 figures, published in AAAI 2026

2603.14951 2026-03-17 cs.CV

GT-PCQA: Geometry-Texture Decoupled Point Cloud Quality Assessment with MLLM

Guohua Zhang, Jian Jin, Meiqin Liu, Chao Yao, Weisi Lin, Yao Zhao

2603.14948 2026-03-17 cs.CV

Bridging Scene Generation and Planning: Driving with World Model via Unifying Vision and Motion Representation

Xingtai Gui, Meijie Zhang, Tianyi Yan, Wencheng Han, Jiahao Gong, Feiyang Tan, Cheng-zhong Xu, Jianbing Shen

Comments 16 pages, 9 figures. The code is available at https://github.com/TabGuigui/WorldDrive

English abstract

End-to-end autonomous driving aims to generate safe and plausible planning policies from raw sensor input. Driving world models have shown great potential in learning rich representations by predicting the future evolution of a driving scene. However, existing driving world models primarily focus on visual scene representation, and motion representation is not explicitly designed to be planner-shared and inheritable, leaving a schism between the optimization of visual scene generation and the requirements of precise motion planning. We present WorldDrive, a holistic framework that couples scene generation and real-time planning via unifying vision and motion representation. We first introduce a Trajectory-aware Driving World Model, which conditions on a trajectory vocabulary to enforce consistency between visual dynamics and motion intentions, enabling the generation of diverse and plausible future scenes conditioned on a specific trajectory. We transfer the vision and motion encoders to a downstream Multi-modal Planner, ensuring the driving policy operates on mature representations pre-optimized by scene generation. A simple interaction between motion representation, visual representation, and ego status can generate high-quality, multi-modal trajectories. Furthermore, to exploit the world model's foresight, we propose a Future-aware Rewarder, which distills future latent representation from the frozen world model to evaluate and select optimal trajectories in real-time. Extensive experiments on the NAVSIM, NAVSIM-v2, and nuScenes benchmarks demonstrate that WorldDrive achieves leading planning performance among vision-only methods while maintaining high-fidelity action-controlled video generation capabilities, providing strong evidence for the effectiveness of unifying vision and motion representation for robust autonomous driving.

2603.14947 2026-03-17 cs.LG cs.AI

FairMed-XGB: A Bayesian-Optimised Multi-Metric Framework with Explainability for Demographic Equity in Critical Healthcare Data

Mitul Goswami, Romit Chatterjee, Arif Ahmed Sekh

2603.14946 2026-03-17 cs.LG

Spiking Layer-Adaptive Magnitude-based Pruning

Junqiao Wang, Zhehang Ye, Yuqi Ouyang

2603.14944 2026-03-17 cs.LG

Ultra-Early Prediction of Tipping Points: Integrating Dynamical Measures with Reservoir Computing

Xin Li, Qunxi Zhu, Chengli Zhao, Bolin Zhao, Xue Zhang, Xiaojun Duan, Wei Lin

2603.14941 2026-03-17 cs.AI

RS-WorldModel: a Unified Model for Remote Sensing Understanding and Future Sense Forecasting

Linrui Xu, Zhongan Wang, Fei Shen, Gang Xu, Huiping Zhuang, Ming Li, Haifeng Li

2603.14938 2026-03-17 cs.CV

FAR-Drive: Frame-AutoRegressive Video Generation in Closed-Loop Autonomous Driving

Yaoru Li, Federico Landi, Marco Godi, Xin Jin, Ruiju Fu, Yufei Ma, Muyang Sun, Heyu Si, Qi Guo

2603.14935 2026-03-17 cs.CV

Video-CoE: Reinforcing Video Event Prediction via Chain of Events

Qile Su, Jing Tang, Rui Chen, Lei Sun, Xiangxiang Chu

Comments 21 pages, 18 figures, 6 tables

2603.14923 2026-03-17 cs.LG cs.AI

Directional Routing in Transformers

Kevin Taylor

2603.14916 2026-03-17 cs.CV cs.MM

EditHF-1M: A Million-Scale Rich Human Preference Feedback for Image Editing

Zitong Xu, Huiyu Duan, Zhongpeng Ji, Xinyun Zhang, Yutao Liu, Xiongkuo Min, Ke Gu, Jian Zhang, Shusong Xu, Jinwei Chen, Bo Li, Guangtao Zhai

2603.14915 2026-03-17 cs.CV

ILV: Iterative Latent Volumes for Fast and Accurate Sparse-View CT Reconstruction

Seungryong Lee, Woojeong Baek, Joosang Lee, Eunbyung Park

Comments Project page: https://sngryonglee.github.io/ILV/

2603.14909 2026-03-17 cs.CV

TopoVST: Toward Topology-fidelitous Vessel Skeleton Tracking

Yaoyu Liu, Minghui Zhang, Junjun He, Yun Gu

Comments 10 pages, 9 figures. Under Review

2603.14908 2026-03-17 cs.RO cs.CV

PerlAD: Towards Enhanced Closed-loop End-to-end Autonomous Driving with Pseudo-simulation-based Reinforcement Learning

Yinfeng Gao, Qichao Zhang, Deqing Liu, Zhongpu Xia, Guang Li, Kun Ma, Guang Chen, Hangjun Ye, Long Chen, Da-Wei Ding, Dongbin Zhao

Comments Accepted by IEEE RA-L. Submitted: 2025.12.2; Revised: 2026.2.4; Accepted: 2026.3.7

2603.14900 2026-03-17 cs.RO

From Folding Mechanics to Robotic Function: A Unified Modeling Framework for Compliant Origami

Bohan Zhang, Bo Wang, Huajiang Ouyang, Zhigang Wu, Haohao Bi, Jiawei Xu, Mingchao Liu, Weicheng Huang

Comments 24 pages, 7 figures

2603.14897 2026-03-17 cs.LG

BiTro: Bidirectional Transfer Learning Enhances Bulk and Spatial Transcriptomics Prediction in Cancer Pathological Images

Jingkun Yu, Guangkai Shang, Changtao Li, Xun Gong, Tianrui Li, Yazhou He, Zhipeng Luo

2603.14893 2026-03-17 cs.CL cs.AI

LLMs as Signal Detectors: Sensitivity, Bias, and the Temperature-Criterion Analogy

Jon-Paul Cacioli

Comments 15 pages, 8 figures, 2 tables

2603.14892 2026-03-17 cs.CV

Balancing Saliency and Coverage: Semantic Prominence-Aware Budgeting for Visual Token Compression in VLMs

Jaehoon Lee, Mingi Jung, Soohyuk Jang, Seungryong Yoo, Dahuin Jung, Sungroh Yoon

2603.14891 2026-03-17 cs.CL cs.AI

Decision-Level Ordinal Modeling for Multimodal Essay Scoring with Large Language Models

Han Zhang, Jiamin Su, Li Liu

2603.14886 2026-03-17 cs.CV

PASTE: Physics-Aware Scattering Topology Embedding Framework for SAR Object Detection

Jiacheng Chen, Yuxuan Xiong, Haipeng Wang

2603.14885 2026-03-17 cs.CV

SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras

Huanjing Yue, Shangbin Xie, Cong Cao, Qian Wu, Lei Zhang, Lei Zhao, Jingyu Yang

Comments Accepted by CVPR 2026

2603.14880 2026-03-17 cs.CV

RealVLG-R1: A Large-Scale Real-World Visual-Language Grounding Benchmark for Robotic Perception and Manipulation

Linfei Li, Lin Zhang, Ying Shen

Comments Accepted by CVPR 2026

2603.14879 2026-03-17 cs.LG cs.AI

Seismic full-waveform inversion based on a physics-driven generative adversarial network

Xinyi Zhang, Caiyun Liu, Jie Xiong, Qingfeng Yu

2603.14876 2026-03-17 cs.AI

A Hybrid AI and Rule-Based Decision Support System for Disease Diagnosis and Management Using Labs

Muhammad Hammad Maqsood, Mubashir Sajid, Khubaib Ahmed, Muhammad Usamah Shahid, Muddassar Farooq

2603.14873 2026-03-17 cs.CL

Developing an English-Efik Corpus and Machine Translation System for Digitization Inclusion

Offiong Bassey Edet, Mbuotidem Sunday Awak, Emmanuel Oyo-Ita, Benjamin Okon Nyong, Ita Etim Bassey

Comments 8 pages, 1 figure, accepted at AfricaNLP 2026 (co-located with EACL)

2603.14870 2026-03-17 cs.LG cs.AI

IgPose: A Generative Data-Augmented Pipeline for Robust Immunoglobulin-Antigen Binding Prediction

Tien-Cuong Bui, Injae Chung, Wonjun Lee, Junsu Ko, Juyong Lee

Comments 11 pages, 4 figures, Bioinformatics