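arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.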

2603.26666 2026-03-30 cs.RO

VLA-OPD: Bridging Offline SFT and Online RL for Vision-Language-Action Models via On-Policy Distillation

Zhide Zhong, Haodong Yan, Junfeng Li, Junjie He, Tianran Zhang, Haoang Li

Abstract

Although pre-trained Vision-Language-Action (VLA) models exhibit impressive generalization in robotic manipulation, post-training remains crucial to ensure reliable performance during deployment. However, standard offline Supervised Fine-Tuning (SFT) suffers from distribution shifts and catastrophic forgetting of pre-trained capabilities, while online Reinforcement Learning (RL) struggles with sparse rewards and poor sample efficiency. In this paper, we propose On-Policy VLA Distillation (VLA-OPD), a framework bridging the efficiency of SFT with the robustness of RL. Instead of relying on sparse environmental rewards, VLA-OPD leverages an expert teacher to provide dense, token-level supervision on the student's self-generated trajectories. This enables active error correction on policy-induced states while preserving pre-trained general capabilities through gentle alignment. Crucially, we formulate VLA-OPD via a Reverse-KL objective. Unlike standard Forward-KL that induces mode-covering entropy explosion, or Hard-CE that causes premature entropy collapse, our bounded mode-seeking objective ensures stable policy learning by filtering out the teacher's epistemic uncertainty while maintaining action diversity. Experiments on LIBERO and RoboTwin2.0 benchmarks demonstrate that VLA-OPD significantly improves sample efficiency over RL and robustness over SFT, while effectively mitigating catastrophic forgetting during post-training.
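The contrast the abstract draws between Forward-KL, Reverse-KL, and Hard-CE can be illustrated on a toy categorical distribution. The sketch below is not the paper's loss, which the abstract does not spell out; the function names `kl` and `hard_ce` and all probability values are hypothetical and chosen only for illustration.

```python
import math

def kl(p, q):
    """KL(p || q) for two categorical distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_ce(teacher, student):
    """Cross-entropy of the student against the teacher's argmax token only."""
    top = max(range(len(teacher)), key=teacher.__getitem__)
    return -math.log(student[top])

# Hypothetical distributions over 4 action tokens (illustrative values only).
teacher = [0.48, 0.48, 0.02, 0.02]           # two plausible modes plus a little residual mass
student_one_mode = [0.94, 0.02, 0.02, 0.02]  # commits to a single teacher mode
student_spread = [0.25, 0.25, 0.25, 0.25]    # spreads mass over every token

for name, student in [("one_mode", student_one_mode), ("spread", student_spread)]:
    forward = kl(teacher, student)  # KL(teacher || student): expectation under the teacher
    reverse = kl(student, teacher)  # KL(student || teacher): expectation under the student
    ce = hard_ce(teacher, student)
    print(f"{name}: forward={forward:.3f} reverse={reverse:.3f} hard_ce={ce:.3f}")
```

With these made-up numbers, the forward KL is lower for the spread-out student, while the reverse KL is lower for the student that commits to one teacher mode. This is the usual mode-covering versus mode-seeking distinction that the abstract refers to.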

2603.26665 2026-03-30 cs.CV

Detailed Geometry and Appearance from Opportunistic Motion

Ryosuke Hirai, Kohei Yamashita, Antoine Guédon, Ryo Kawahara, Vincent Lepetit, Ko Nishino

2603.26664 2026-03-30 cs.SE cs.CL

Learning to Commit: Generating Organic Pull Requests via Online Repository Memory

Mo Li, L. H. Xu, Qitai Tan, Ting Cao, Yunxin Liu

Comments: Preprint. Work in progress

2603.26663 2026-03-30 cs.CL

Weight Tying Biases Token Embeddings Towards the Output Space

Antonio Lopardo, Avyukth Harish, Catherine Arnett, Akshat Gupta

2603.26661 2026-03-30 cs.CV

GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Nicolas von Lützow, Barbara Rössle, Katharina Schmid, Matthias Nießner

Comments: Project page: https://nicolasvonluetzow.github.io/GaussianGPT/ - Project video: https://youtu.be/zVnMHkFzHDg

2603.26659 2026-03-30 cs.RO

Partial Motion Imitation for Learning Cart Pushing with Legged Manipulators

Mili Das, Morgan Byrd, Donghoon Baek, Sehoon Ha

Comments: 8 pages, 5 figures

2603.26658 2026-03-30 cs.CV

Zero-Shot Depth from Defocus

Yiming Zuo, Hongyu Wen, Venkat Subramanian, Patrick Chen, Karhan Kayan, Mario Bijelic, Felix Heide, Jia Deng

2603.26657 2026-03-30 cs.CV cs.LG

Tunable Soft Equivariance with Guarantees

Md Ashiqur Rahman, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh

2603.26653 2026-03-30 cs.CV cs.AI cs.CL cs.LG

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning

Shaoxuan Li, Zhixuan Zhao, Hanze Deng, Zirun Ma, Shulin Tian, Zuyan Liu, Yushi Hu, Haoning Wu, Yuhao Dong, Benlin Liu, Ziwei Liu, Ranjay Krishna

Comments: Project Page: https://perceptioncomp.github.io

2603.26652 2026-03-30 math.MG cs.CG math.CO math.DG math.GT

Surfaces without quasi-isometric simplicial triangulations

James Davies

Comments: 9 pages, 3 figures

2603.26647 2026-03-30 cs.LG cs.SY eess.SY

An LP-based Sampling Policy for Multi-Armed Bandits with Side-Observations and Stochastic Availability

Ashutosh Soni, Peizhong Ju, Atilla Eryilmaz, Ness B. Shroff

2603.26646 2026-03-30 cs.CV

Beyond Language: Grounding Referring Expressions with Hand Pointing in Egocentric Vision

Ling Li, Bowen Liu, Zinuo Zhan, Peng Jie, Jianhui Zhong, Kenglun Chang, Zhidong Deng

2603.26644 2026-03-30 cs.LG astro-ph.IM stat.ME

Automatic Laplace Collapsed Sampling: Scalable Marginalisation of Latent Parameters via Automatic Differentiation

Toby Lovick, David Yallup, Will Handley

Comments: 28 pages, 7 figures. Comments welcome

2603.26643 2026-03-30 math.NA cs.NA

Boundary neuron method for solving partial differential equations

Ye Lin, Wentao Liu, Young Ju Lee, Jiwei Jia

2603.26639 2026-03-30 cs.CV cs.AI

Make Geometry Matter for Spatial Reasoning

Shihua Zhang, Qiuhong Shen, Shizun Wang, Tianbo Pan, Xinchao Wang

2603.26638 2026-03-30 cs.CV cs.RO

Drive-Through 3D Vehicle Exterior Reconstruction via Dynamic-Scene SfM and Distortion-Aware Gaussian Splatting

Nitin Kulkarni, Akhil Devarashetti, Charlie Cluss, Livio Forte, Philip Schneider, Chunming Qiao, Alina Vereshchaka

Comments: 8 pages, 7 figures. Submitted to IEEE IROS 2026 (under review)

2603.26637 2026-03-30 cs.AR

Who Checks the Checker? Enhancing Component-level Architectural SEU Fault Tolerance for End-to-End SoC Protection

Michael Rogenmoser, Philippe Sauter, Chen Wu, Angelo Garofalo, Luca Benini

Comments: 7 pages, accepted at VLSI Test Symposium 2026 (VTS 2026)

2603.26636 2026-03-30 physics.app-ph cs.SY eess.SY

Patched-Wall Quasistatic Cavity Resonators for 3-D Wireless Power Transfer

Takuya Sasatani, Yoshihiro Kawahara

Comments: 5 pages, 6 figures

2603.26635 2026-03-30 cs.MA

Deception and Communication in Autonomous Multi-Agent Systems: An Experimental Study with Among Us

Maria Milkowski, Tim Weninger

Comments: 8 pages + references, 9 figures. Accepted at AAMAS 2026

2603.26632 2026-03-30 cs.CR cs.AI cs.LG

Machine Learning Transferability for Malware Detection

César Vieira, João Vitorino, Eva Maia, Isabel Praça

Comments: 12 pages, 1 figure, 2 tables, World CIST 2026

2603.26631 2026-03-30 cs.GT cs.SI

Learning From Social Interactions: Personalized Pricing and Buyer Manipulation

Qinqi Lin, Lingjie Duan, Jianwei Huang

Comments: Published in IEEE Transactions on Mobile Computing (a complete version with supplementary materials included)

DOI: 10.1109/TMC.2024.3411111
Journal ref: IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 11871-11888, Dec. 2024

Abstract

As the sociological theory of homophily suggests, people tend to interact with those of similar preferences. Motivated by this well-established phenomenon, today's online sellers, such as Amazon, seek to learn a new buyer's private preference from his friends' purchase records. Although such learning allows the seller to enable personalized pricing and boost revenue, buyers are also increasingly aware of these practices and may alter their social behaviors accordingly. This paper presents the first study regarding how buyers strategically manipulate their social interaction signals considering their preference correlations, and how a seller can take buyers' strategic social behaviors into consideration when designing the pricing scheme. Starting with the fundamental two-buyer network, we propose and analyze a parsimonious model that uniquely captures the double-layered information asymmetry between the seller and buyers, integrating both individual buyer information and inter-buyer correlation information. Our analysis reveals that only high-preference buyers tend to manipulate their social interactions to evade the seller's personalized pricing, but surprisingly, their payoffs may actually worsen as a result. Moreover, we demonstrate that the seller can considerably benefit from the learning practice, regardless of whether the buyers are aware of this fact or not. Indeed, our analysis reveals that buyers' learning-aware strategic manipulation has only a slight impact on the seller's revenue. In light of the tightening regulatory policies concerning data access, it is advisable for sellers to maintain transparency with buyers regarding their access to buyers' social interaction data for learning purposes. This finding aligns well with current informed-consent industry practices for data sharing.

2603.26629 2026-03-30 cs.LG

Context-specific Credibility-aware Multimodal Fusion with Conditional Probabilistic Circuits

Pranuthi Tenali, Sahil Sidheekh, Saurabh Mathur, Erik Blasch, Kristian Kersting, Sriraam Natarajan

2603.26628 2026-03-30 cs.IT math.IT

USAM: A Unified Safety-Age metric for Timeliness in Heterogeneous IoT Systems

Mikael Gidlund

2603.26621 2026-03-30 eess.SY cs.SY

Inclusion conditions for the Constrained Polynomial Zonotopic case

Bogdan Gheorghe, Amr Alanwar, Florin Stoican

2603.26614 2026-03-30 cs.IT math.IT

Function-Based Minimal Linear Codes over Galois Rings $\mathrm{GR}(p^{n}, \ell)$: Minimality Criteria and Infinite Constructions

Biplab Chatterjee, Sihem Mesnager, Ratnesh Kumar Mishra, Makhan Maji, Kalyan Hansda

2603.26611 2026-03-30 cs.LG stat.ME stat.ML

Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression

Rafael Izbicki, Pedro L. C. Rodrigues

2603.26610 2026-03-30 cs.CV cs.AI

Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling

Ruixing Zhang, Hanzhang Jiang, Leilei Sun, Liangzhe Han, Jibin Wang, Weifeng Lv

2603.26608 2026-03-30 cs.HC cs.ET

Sticky and Magnetic: Evaluating Error Correction and User Adaptation in Gaze and Pinch Interaction

Jazmin Collins, Prasanthi Gurumurthy, Eric J. Gonzalez, Mar Gonzalez-Franco

Comments: 5 pages, 5 figures. Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26), April 13-17, 2026, Barcelona, Spain. ACM

2603.26604 2026-03-30 cs.LG hep-ph physics.ins-det

Hardware-Aware Tensor Networks for Real-Time Quantum-Inspired Anomaly Detection at Particle Colliders

Sagar Addepalli, Prajita Bhattarai, Abhilasha Dave, Julia Gonski

Comments: 28 pages, 9 figures

2603.26599 2026-03-30 cs.CV

VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward

Zhaochong An, Orest Kupyn, Théo Uscidda, Andrea Colaco, Karan Ahuja, Serge Belongie, Mar Gonzalez-Franco, Marta Tintore Gazulla

Comments: Project Page: https://zhaochongan.github.io/projects/VGGRPO