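arXivDaily: a daily academic digest synchronized with the full arXiv data set, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.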

2512.23619 2026-04-07 cs.RO math.GT math.OC

The N-5 Scaling Law: Topological Dimensionality Reduction in the Optimal Design of Fully-actuated Multirotors

Antonio Franchi

Abstract

The geometric design of fully-actuated and omnidirectional N-rotor aerial vehicles is conventionally formulated as a parametric optimization problem, seeking a single optimal set of N orientations within a fixed architectural family. This work departs from that paradigm to investigate the intrinsic topological structure of the optimization landscape itself. We formulate the design problem on the product manifold of projective lines $(\mathbb{RP}^2)^N$, fixing the rotor positions to the vertices of a polyhedral chassis while varying their lines of action. By minimizing a coordinate-invariant Log-Volume isotropy metric, we reveal that the topology of the global optima is governed strictly by the symmetry of the chassis. For generic (irregular) vertex arrangements, the solutions appear as a discrete set of isolated points. However, as the chassis geometry approaches regularity, the solution space undergoes a critical phase transition, collapsing onto an N-dimensional torus of the lines tangent at the vertices to the circumscribing sphere of the chassis, and subsequently reducing to continuous 1-dimensional curves driven by Affine Phase Locking. We synthesize these observations into the N-5 Scaling Law: an empirical relationship holding for all examined regular planar polygons and Platonic solids (N <= 10), where the space of optimal configurations consists of K=N-5 disconnected 1D topological branches. We demonstrate that these locking patterns correspond to a sequence of admissible Star Polygons {N/q}, allowing for the exact prediction of optimal phases for arbitrary N. Crucially, this topology reveals a design redundancy that enables optimality-preserving morphing: the vehicle can continuously reconfigure along these branches while preserving optimal isotropic control authority.
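The abstract does not spell out the Log-Volume metric, so the following is only a minimal sketch of one plausible reading: half the log-determinant of A Aᵀ, where A is the 6×N map from rotor thrusts to body wrench and drag torque is ignored. The function names, the hexagonal example, and the alternating tangential tilt are all assumptions made for illustration, not the paper's formulation.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def wrench_columns(positions, directions):
    """One 6-vector per rotor: unit thrust direction u stacked on the moment p x u."""
    cols = []
    for p, u in zip(positions, directions):
        norm = math.sqrt(sum(c * c for c in u))
        u = tuple(c / norm for c in u)
        cols.append(u + cross(p, u))
    return cols

def gram(cols):
    """6x6 matrix A A^T, accumulated as a sum of outer products of the columns."""
    g = [[0.0] * 6 for _ in range(6)]
    for a in cols:
        for i in range(6):
            for j in range(6):
                g[i][j] += a[i] * a[j]
    return g

def log_det_spd(m):
    """Log-determinant of a symmetric positive-definite matrix; -inf if a pivot vanishes."""
    m = [row[:] for row in m]
    total = 0.0
    for k in range(len(m)):
        pivot = m[k][k]
        if pivot <= 1e-12:
            return float("-inf")
        total += math.log(pivot)
        for i in range(k + 1, len(m)):
            f = m[i][k] / pivot
            for j in range(k, len(m)):
                m[i][j] -= f * m[k][j]
    return total

def log_volume(positions, directions):
    """Assumed isotropy score: 0.5 * log det(A A^T)."""
    return 0.5 * log_det_spd(gram(wrench_columns(positions, directions)))

def tilted_hexagon(alpha):
    """Hypothetical test case: rotors on a unit hexagon, tilted by +/- alpha about each arm."""
    positions, directions = [], []
    for i in range(6):
        theta = i * math.pi / 3
        sign = 1 if i % 2 == 0 else -1
        positions.append((math.cos(theta), math.sin(theta), 0.0))
        directions.append((-sign * math.sin(alpha) * math.sin(theta),
                           sign * math.sin(alpha) * math.cos(theta),
                           math.cos(alpha)))
    return positions, directions

positions, directions = tilted_hexagon(math.radians(30))
print(log_volume(positions, directions))
```

Under this reading the score depends only on each rotor's line of action: reversing the sign of any single direction leaves A Aᵀ unchanged, which is consistent with posing the problem on projective lines rather than on unit vectors.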

2512.22647 2026-04-07 cs.CV

FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution

Yidi Liu, Zihao Fan, Jie Huang, Jie Xiao, Dong Li, Wenlong Zhang, Lei Bai, Xueyang Fu, Zheng-Jun Zha

Comments Accepted by CVPR 2026

2512.22227 2026-04-07 cs.CL cs.LG

Geometric Organization of Cognitive States in Transformer Embedding Spaces

Sophie Zhao

2512.18599 2026-04-07 cs.CV

Restore-R1: Efficient Image Restoration Agents via Reinforcement Learning with Multimodal LLM Perceptual Feedback

Jianglin Lu, Yuanwei Wu, Ziyi Zhao, Hongcheng Wang, Felix Jimenez, Abrar Majeedi, Yun Fu

2512.18503 2026-04-07 cs.CV cs.LG

NASTaR: NovaSAR Automated Ship Target Recognition Dataset

Benyamin Hosseiny, Kamirul Kamirul, Odysseas Pappas, Alin Achim

2512.17541 2026-04-07 cs.CV

FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views via Compact Semantic Representation

Qijian Tian, Xin Tan, Jiayu Ying, Xuhong Wang, Yuan Xie, Lizhuang Ma

Comments Project page: https://fangzhou2000.github.io/projects/fleg

2512.17051 2026-04-07 cs.LG

SFBD-OMNI: Bridge models for lossy measurement restoration with limited clean samples

Haoye Lu, Yaoliang Yu, Darren Lo

2512.14471 2026-04-07 cs.LG

Kinetic-Mamba: Mamba-Assisted Predictions of Stiff Chemical Kinetics

Additi Pandey, Liang Wei, Hessam Babaee, George Em Karniadakis

2512.14400 2026-04-07 cs.LG

GRAFT: Grid-Aware Load Forecasting with Multi-Source Textual Alignment and Fusion

Fangzhou Lin, Guoshun He, Zhenyu Guo, Zhe Huang, Jinsong Tao

2512.10421 2026-04-07 cs.CV

Neural Collapse in Test-Time Adaptation

Xiao Chen, Zhongjing Du, Jiazhen Huang, Xu Jiang, Li Lu, Jingyan Jiang, Zhi Wang

Comments Accepted by CVPR 2026

2512.09299 2026-04-07 cs.CV cs.SD

VABench: A Comprehensive Benchmark for Audio-Video Generation

Daili Hua, Xizhi Wang, Bohan Zeng, Xinyi Huang, Hao Liang, Junbo Niu, Xinlong Chen, Quanqing Xu, Wentao Zhang

Comments 24 pages, 25 figures

2512.07571 2026-04-07 cs.CL cs.MM

A Simple Method to Enhance Pre-trained Language Models with Speech Tokens for Classification

Nicolas Calbucura, Jose Guillen, Valentin Barriere

2512.05495 2026-04-07 cs.RO cs.SY eess.SY

Temporal Reach-Avoid-Stay Control for Differential Drive Systems via Spatiotemporal Tubes

Ratnangshu Das, Ahan Basu, Christos Verginis, Pushpak Jagtap

2512.04616 2026-04-07 cs.SD physics.med-ph

Standard audiogram classification from loudness scaling data using unsupervised, supervised, and explainable machine learning techniques

Chen Xu, Lena Schell-Majoor, Birger Kollmeier

2512.03795 2026-04-07 cs.RO cs.AI

MPCFormer: A physics-informed data-driven approach for explainable socially-aware autonomous driving

Jia Hu, Zhexi Lian, Xuerun Yan, Ruiang Bi, Dou Shen, Yu Ruan, Chunlong Xia, Haoran Wang

Comments 17 pages, 17 figures

Abstract

Autonomous Driving (AD) vehicles still struggle to exhibit human-like behavior in highly dynamic and interactive traffic scenarios. The key challenge lies in AD's limited ability to interact with surrounding vehicles, largely due to a lack of understanding of the underlying mechanisms of social interaction. To address this issue, we introduce MPCFormer, an explainable socially-aware autonomous driving approach with physics-informed and data-driven coupled social interaction dynamics. In this model, the dynamics are formulated as a discrete state-space representation, which embeds physics priors to enhance modeling explainability. The dynamics coefficients are learned from naturalistic driving data via a Transformer-based encoder-decoder architecture. To the best of our knowledge, MPCFormer is the first approach to explicitly model the dynamics of multi-vehicle social interactions. The learned social interaction dynamics enable the planner to generate manifold, human-like behaviors when interacting with surrounding traffic. By leveraging the MPC framework, the approach mitigates the potential safety risks typically associated with purely learning-based methods. Open-loop evaluation on the NGSIM dataset demonstrates that MPCFormer achieves superior social interaction awareness, yielding the lowest trajectory prediction errors compared with other state-of-the-art approaches. The prediction achieves an ADE as low as 0.86 m over a long prediction horizon of 5 seconds. Closed-loop experiments in highly intense interaction scenarios, where consecutive lane changes are required to exit an off-ramp, further validate the effectiveness of MPCFormer. Results show that MPCFormer achieves the highest planning success rate of 94.67%, improves driving efficiency by 15.75%, and reduces the collision rate from 21.25% to 0.5%, outperforming a frontier Reinforcement Learning (RL) based planner.
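The ADE figure quoted above is the average displacement error, commonly defined as the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon. A minimal sketch of that standard definition follows; the function name, the 1 Hz sampling, and the sample trajectories are illustrative assumptions, not data or code from the paper.

```python
import math

def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance between matched trajectory points (same units as the inputs)."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)

# Hypothetical 5-second horizon sampled once per second, positions in metres.
predicted = [(1.0, 0.0), (2.0, 0.1), (3.0, 0.3), (4.0, 0.6), (5.0, 1.0)]
ground_truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
print(average_displacement_error(predicted, ground_truth))
```

Because every time step is weighted equally, a low ADE over a long horizon requires the prediction to stay close throughout, not only at the final point.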

2512.00408 2026-04-07 cs.CV cs.AI

Low-Bitrate Video Compression through Semantic-Conditioned Diffusion

Lingdong Wang, Guan-Ming Su, Divya Kothandaraman, Tsung-Wei Huang, Mohammad Hajiesmaili, Ramesh K. Sitaraman

2511.20814 2026-04-07 cs.CV cs.AI cs.LG

SPHINX: A Synthetic Environment for Visual Perception and Reasoning

Md Tanvirul Alam, Saksham Aggarwal, Justin Yang Chae, Nidhi Rastogi

2511.17378 2026-04-07 cs.LG

A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias

Wei-Kai Chang, Rajiv Khanna

Comments NeurIPS 2025

2511.17362 2026-04-07 cs.CV

ATAC: Augmentation-Based Test-Time Adversarial Correction for CLIP

Linxiang Su, András Balogh

Comments 16 pages

2511.16383 2026-04-07 cs.AI cs.SE

An Agent-Based Framework for the Automatic Validation of Mathematical Optimization Models

Alexander Zadorojniy, Segev Wasserkrug, Eitan Farchi

2511.14130 2026-04-07 cs.AI cs.CE cs.CL cs.IR

PRISM: Prompt-Refined In-Context System Modelling for Financial Retrieval

Chun Chet Ng, Jia Yu Lim, Wei Zeng Low

Comments 3rd-place solution for the ACM ICAIF 2025 Agentic Retrieval Grand Challenge. Accepted for poster presentation at ICLR 2026 (Advances in Financial AI Workshop)

2511.09216 2026-04-07 cs.LG q-bio.QM stat.ML

Controllable protein design with particle-based Feynman-Kac steering

Erik Hartman, Jonas Wallin, Johan Malmström, Jimmy Olsson

Comments In version 2 we added an experiment on improving designability through steering towards lower delta G

2511.08887 2026-04-07 cs.LG cs.AI

FAST-CAD: A Fairness-Aware Framework for Non-Contact Stroke Diagnosis

Tommy Sha, Zhan Cheng, Haotian Zhai, Xuwei Ding, Junnan Li, Haixiang Tang, Zaoting Sun, Yanchuan Tang, Yongzhe, Yi, Yuan Gao, Anhao Li

2511.06391 2026-04-07 cs.CL cs.AI

HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection

Irina Proskurina, Marc-Antoine Carpentier, Julien Velcin

2510.27533 2026-04-07 cs.CV cs.GR

Deep Neural Watermarking for Robust Copyright Protection in 3D Point Clouds

Khandoker Ashik Uz Zaman, Mohammad Zahangir Alam, Mohammed N. M. Ali, Mahdi H. Miraz

DOI: 10.33166/AETiC.2025.04.002
Journal ref: Print ISSN: 2516-0281, Online ISSN: 2516-029X, pp. 17-30, Vol. 9, No. 4, 1 October 2025

Abstract

The protection of intellectual property has become critical due to the rapid growth of three-dimensional content in digital media. Unlike traditional images or videos, 3D point clouds present unique challenges for copyright enforcement, as they are especially vulnerable to a range of geometric and non-geometric attacks that can easily degrade or remove conventional watermark signals. In this paper, we address these challenges by proposing a robust deep neural watermarking framework for 3D point cloud copyright protection and ownership verification. Our approach embeds binary watermarks into the singular values of 3D point cloud blocks using spectral decomposition, i.e., Singular Value Decomposition (SVD), and leverages the extraction capabilities of deep learning using the PointNet++ neural network architecture. The network is trained to reliably extract watermarks even after the data undergoes various attacks such as rotation, scaling, noise, cropping and signal distortions. We validated our method using the publicly available ModelNet40 dataset. Our experimental evaluation demonstrates that the deep learning-based extraction approach significantly outperforms existing SVD-based methods under challenging conditions, with deep learning achieving bitwise accuracy up to 0.83 and Intersection over Union (IoU) of 0.80, compared to SVD achieving a bitwise accuracy of 0.58 and IoU of 0.26 for the Crop (70%) attack, which is the most severe geometric distortion in our experiment. This demonstrates our method's ability to achieve superior watermark recovery and maintain high fidelity even under severe distortions.
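The two scores reported above can be illustrated on a toy binary watermark. This is a sketch under assumptions: bitwise accuracy is taken as the fraction of matching bits, and IoU as the overlap of the 1-bits in the extracted and embedded watermarks. The abstract does not define either metric, and the function names and bit strings are hypothetical.

```python
def bitwise_accuracy(extracted, embedded):
    """Fraction of bit positions where the extracted watermark matches the embedded one."""
    if not embedded or len(extracted) != len(embedded):
        raise ValueError("watermarks must be non-empty and of equal length")
    return sum(e == b for e, b in zip(extracted, embedded)) / len(embedded)

def bit_iou(extracted, embedded):
    """Intersection over union of the positions set to 1 in each watermark."""
    if len(extracted) != len(embedded):
        raise ValueError("watermarks must be of equal length")
    intersection = sum(1 for e, b in zip(extracted, embedded) if e == 1 and b == 1)
    union = sum(1 for e, b in zip(extracted, embedded) if e == 1 or b == 1)
    # Two all-zero watermarks overlap perfectly by convention.
    return intersection / union if union else 1.0

embedded = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 0, 1, 0, 1, 1, 0]  # two bits corrupted by a hypothetical attack
print(bitwise_accuracy(extracted, embedded))  # 0.75
print(bit_iou(extracted, embedded))           # 0.6
```

Under these definitions IoU ignores positions where both watermarks are 0, so it drops faster than bitwise accuracy when errors fall on the 1-bits.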

2510.26433 2026-04-07 cs.LG

Co-Evolving Latent Action World Models

Yucen Wang, Fengming Zhang, De-Chuan Zhan, Li Zhao, Kaixin Wang, Jiang Bian

2510.23448 2026-04-07 cs.LG stat.ML

An Information-Theoretic Analysis of OOD Generalization in Meta-Reinforcement Learning

Xingtu Liu

2510.23095 2026-04-07 cs.CV

Revisiting Multimodal Positional Encoding in Vision-Language Models

Jie Huang, Xuejing Liu, Sibo Song, Ruibing Hou, Hong Chang, Junyang Lin, Shuai Bai

Comments 16 pages

2510.20685 2026-04-07 cs.RO

C-NAV: Towards Self-Evolving Continual Object Navigation in Open World

Ming-Ming Yu, Fei Zhu, Wenzhuo Liu, Yirong Yang, Qunbo Wang, Wenjun Wu, Jing Liu

Comments Accepted at NeurIPS 2025

2510.15746 2026-04-07 cs.CL cs.AI

LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

Gao Yang, Yuhang Liu, Siyu Miao, Xinyue Liang, Zhengyang Liu, Heyan Huang