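arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.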

2603.13227 2026-03-16 cs.LG cs.CV

Representation Learning for Spatiotemporal Physical Systems

Helen Qu, Rudy Morel, Michael McCabe, Alberto Bietti, François Lanusse, Shirley Ho, Yann LeCun

Comments Published at ICLR 2026 Workshop on AI & PDE

English abstract

Machine learning approaches to spatiotemporal physical systems have primarily focused on next-frame prediction, with the goal of learning an accurate emulator for the system's evolution in time. However, these emulators are computationally expensive to train and are subject to performance pitfalls, such as compounding errors during autoregressive rollout. In this work, we take a different perspective and look at scientific tasks further downstream of predicting the next frame, such as estimation of a system's governing physical parameters. Accuracy on these tasks offers a uniquely quantifiable glimpse into the physical relevance of the representations of these models. We evaluate the effectiveness of general-purpose self-supervised methods in learning physics-grounded representations that are useful for downstream scientific tasks. Surprisingly, we find that not all methods designed for physical modeling outperform generic self-supervised learning methods on these tasks, and methods that learn in the latent space (e.g., joint embedding predictive architectures, or JEPAs) outperform those optimizing pixel-level prediction objectives. Code is available at https://github.com/helenqu/physical-representation-learning.

2603.13215 2026-03-16 cs.CV

Out of Sight, Out of Mind? Evaluating State Evolution in Video World Models

Ziqi Ma, Mengzhan Liufu, Georgia Gkioxari

Comments https://glab-caltech.github.io/STEVOBench/

2603.13201 2026-03-16 cs.CL

Neuron-Aware Data Selection In Instruction Tuning For Large Language Models

Xin Chen, Junchao Wu, Shu Yang, Runzhe Zhan, Zeyu Wu, Min Yang, Shujian Huang, Lidia S. Chao, Derek F. Wong

2603.13186 2026-03-16 cs.LG cs.AI cs.CR

Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights

Xingli Fang, Jung-Eun Kim

Comments ICLR 2026

2603.13185 2026-03-16 cs.CV

Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos

Rohith Peddi, Saurabh, Shravan Shanmugam, Likhitha Pallapothula, Yu Xiang, Parag Singla, Vibhav Gogate

Comments https://github.com/rohithpeddi/WorldSGG

2603.13180 2026-03-16 cs.LG cs.AI cs.NE

MXNorm: Reusing MXFP block scales for efficient tensor normalisation

Callum McLean, Luke Y. Prince, Alexandre Payot, Paul Balança, Carlo Luschi

Comments Preprint, Under Review. 15 pages, 12 figures

2603.13176 2026-03-16 cs.CV

Perceive What Matters: Relevance-Driven Scheduling for Multimodal Streaming Perception

Dingcheng Huang, Xiaotong Zhang, Kamal Youcef-Toumi

Comments Accepted to ICRA 2026

2603.13168 2026-03-16 cs.AI cs.CL cs.IR

Developing and evaluating a chatbot to support maternal health care

Smriti Jha, Vidhi Jain, Jianyu Xu, Grace Liu, Sowmya Ramesh, Jitender Nagpal, Gretchen Chapman, Benjamin Bellows, Siddhartha Goyal, Aarti Singh, Bryan Wilder

Comments 17 pages; submitted to IJCAI 2026 AI and Social Good Track

2603.13163 2026-03-16 cs.CV cs.LG

Towards Faithful Multimodal Concept Bottleneck Models

Pierre Moreau, Emeline Pineau Ferrand, Yann Choho, Benjamin Wong, Annabelle Blangero, Milan Bhan

2603.13154 2026-03-16 cs.CL cs.AI

ESG-Bench: Benchmarking Long-Context ESG Reports for Hallucination Mitigation

Siqi Sun, Ben Peng Wu, Mali Jin, Peizhen Bai, Hanpei Zhang, Xingyi Song

Comments To be published in the AAAI 2026 proceedings

2603.13134 2026-03-16 cs.AI

When Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO

Yu Li, Tian Lan, Zhengling Qi

2603.13121 2026-03-16 cs.CV

FDeID-Toolbox: Face De-Identification Toolbox

Hui Wei, Hao Yu, Guoying Zhao

Comments Technical Report. Codebase: https://github.com/infraface/FDeID-Toolbox

2603.13118 2026-03-16 cs.CV

NOIR: Neural Operator mapping for Implicit Representations

Sidaty El Hadramy, Nazim Haouchine, Michael Wehrli, Philippe C. Cattin

2603.13115 2026-03-16 cs.LG

ZO-SAM: Zero-Order Sharpness-Aware Minimization for Efficient Sparse Training

Jie Ji, Gen Li, Kaiyuan Deng, Fatemeh Afghah, Xiaolong Ma

2603.13109 2026-03-16 cs.LG cs.AI

BoSS: A Best-of-Strategies Selector as an Oracle for Deep Active Learning

Denis Huseljic, Paul Hahn, Marek Herde, Christoph Sandrock, Bernhard Sick

2603.13108 2026-03-16 cs.RO cs.CV eess.IV

Panoramic Multimodal Semantic Occupancy Prediction for Quadruped Robots

Guoqiang Zhao, Zhe Yang, Sheng Wu, Fei Teng, Mengfei Duan, Yuanfan Zheng, Kai Luo, Kailun Yang

Comments The dataset and code will be publicly released at https://github.com/SXDR/PanoMMOcc

2603.13103 2026-03-16 cs.RO

A Feasibility-Enhanced Control Barrier Function Method for Multi-UAV Collision Avoidance

Qishen Zhong, Junlong Wu, Jian Yang, Guanwei Xiao, Junqi Wu, Zimeng Jiang, Pingan Fang

2603.13102 2026-03-16 cs.CV

BenDFM: A taxonomy and synthetic CAD dataset for manufacturability assessment in sheet metal bending

Matteo Ballegeer, Dries F. Benoit

English abstract

Predicting the manufacturability of CAD designs early, in terms of both feasibility and required effort, is a key goal of Design for Manufacturing (DFM). Despite advances in deep learning for CAD and its widespread use in manufacturing process selection, learning-based approaches for predicting manufacturability within a specific process remain limited. Two key challenges limit progress: inconsistency across prior work in how manufacturability is defined and consequently in the associated learning targets, and a scarcity of suitable datasets. Existing labels vary significantly: they may reflect intrinsic design constraints or depend on specific manufacturing capabilities (such as available tools), and they range from discrete feasibility checks to continuous complexity measures. Furthermore, industrial datasets typically contain only manufacturable parts, offering little signal for infeasible cases, while existing synthetic datasets focus on simple geometries and subtractive processes. To address these gaps, we propose a taxonomy of manufacturability metrics along the axes of configuration dependence and measurement type, allowing clearer scoping of generalizability and learning objectives. Next, we introduce BenDFM, the first synthetic dataset for manufacturability assessment in sheet metal bending. BenDFM contains 20,000 parts, both manufacturable and unmanufacturable, generated with process-aware bending simulations, providing both folded and unfolded geometries and multiple manufacturability labels across the taxonomy, enabling systematic study of previously unexplored learning-based DFM challenges. We benchmark two state-of-the-art 3D learning architectures on BenDFM, showing that graph-based representations that capture relationships between part surfaces achieve better accuracy, and that predicting metrics that depend on specific manufacturing setups remains more challenging.

2603.13100 2026-03-16 cs.RO cs.AI

Evaluating VLMs' Spatial Reasoning Over Robot Motion: A Step Towards Robot Planning with Motion Preferences

Wenxi Wu, Jingjing Zhang, Martim Brandão

Comments Accepted to the First Workshop on Efficient Spatial Reasoning at ICLR 2026

2603.13098 2026-03-16 cs.RO cs.CV

SldprtNet: A Large-Scale Multimodal Dataset for CAD Generation in Language-Driven 3D Design

Ruogu Li, Sikai Li, Yao Mu, Mingyu Ding

Comments Accepted by ICRA 2026

2603.13089 2026-03-16 cs.CV

V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration

Shenghe Zheng, Junpeng Jiang, Wenbo Li

Comments Transfer the prior knowledge of video generative models to image restoration tasks

2603.13082 2026-03-16 cs.CV cs.RO eess.IV

InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing

Yebin Yang, Di Wen, Lei Qi, Weitong Kong, Junwei Zheng, Ruiping Liu, Yufan Chen, Chengzhi Wu, Kailun Yang, Yuqian Fu, Danda Pani Paudel, Luc Van Gool, Kunyu Peng

Comments The dataset and code will be released at https://github.com/YNG916/InterEdit

2603.13077 2026-03-16 cs.CV

Rooftop Wind Field Reconstruction Using Sparse Sensors: From Deterministic to Generative Learning Methods

Yihang Zhou, Chao Lin, Hideki Kikumoto, Ryozo Ooka, Sibo Cheng

2603.13070 2026-03-16 cs.CV

Mitigating Memorization in Text-to-Image Diffusion via Region-Aware Prompt Augmentation and Multimodal Copy Detection

Yunzhuo Chen, Jordan Vice, Naveed Akhtar, Nur Al Hasan Haldar, Ajmal Mian

2603.13069 2026-03-16 cs.LG cs.CV cs.IT math.DS math.IT

Fractals made Practical: Denoising Diffusion as Partitioned Iterated Function Systems

Ann Dooms

2603.13068 2026-03-16 cs.LG cs.AI

GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration

Yihao Ding, Yiran Zhang, Chris Gonzalez, Eun-Jung Holden, Wei Liu

Comments Work in progress

2603.13065 2026-03-16 cs.LG cs.AI

L2GTX: From Local to Global Time Series Explanations

Ephrem Tibebe Mekonnen, Luca Longo, Lucas Rizzo, Pierpaolo Dondio

Comments Accepted for publication at the 4th World Conference on Explainable Artificial Intelligence (xAI 2026), 18 pages, 6 figures

2603.13059 2026-03-16 cs.LG cs.AI

Competition-Aware CPC Forecasting with Near-Market Coverage

Sebastian Frey, Edoardo Beccari, Maximilian Kranz, Nicolò Alberto Pellizzari, Ali Mete Karaman, Qiwei Han, Maximilian Kaiser

Comments 16 pages, 2 figures, 4 tables

2603.13057 2026-03-16 cs.CV

Reference-Free Image Quality Assessment for Virtual Try-On via Human Feedback

Yuki Hirakawa, Takashi Wada, Ryotaro Shimizu, Takuya Furusawa, Yuki Saito, Ryosuke Araki, Tianwei Chen, Fan Mo, Yoshimitsu Aoki

2603.13056 2026-03-16 cs.CV cs.AI

Team RAS in 10th ABAW Competition: Multimodal Valence and Arousal Estimation Approach

Elena Ryumina, Maxim Markitantov, Alexandr Axyonov, Dmitry Ryumin, Mikhail Dolgushin, Denis Dresvyanskiy, Alexey Karpov

Comments 8 pages, 1 figure