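arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.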

2512.24310 2026-03-17 cs.RO

World In Your Hands: A Large-Scale and Open-Source Ecosystem for Learning Human-Centric Manipulation in the Wild

Yupeng Zheng, Jichao Peng, Weize Li, Yuhang Zheng, Xiang Li, Yujie Jin, Julong Wei, Guanhua Zhang, Ruiling Zheng, Ming Cao, Songen Gu, Zhenhong Zou, Kaige Li, Ke Wu, Mingmin Yang, Jiahao Liu, Pengfei Li, Hengjie Si, Feiyu Zhu, Wang Fu, Likun Wang, Ruiwen Yao, Jieru Zhao, Yilun Chen, Wenchao Ding

Comments: This dataset represents the first large-scale collection of real-world, human-centric multimodal data integrating vision, language, tactile sensing, and action (VLTA). GitHub: https://github.com/tars-robotics/World-In-Your-Hands

2512.22955 2026-03-17 cs.CL

Diversity or Precision? A Deep Dive into Next Token Prediction

Haoyuan Wu, Hai Wang, Jiajia Wu, Jinxiang Ou, Keyao Wang, Weile Chen, Zihao Zheng, Bei Yu

2512.22705 2026-03-17 cs.CL cs.AI cs.LG

GHaLIB: A Multilingual Framework for Hope Speech Detection in Low-Resource Languages

Ahmed Abdullah, Sana Fatima, Haroon Mahmood

Comments: Accepted and presented at the 15th International Arab Conference on Information Technology (ICAIT); proceedings not yet published

2512.22213 2026-03-17 cs.LG cs.AI cs.CL

On the Existence and Behavior of Secondary Attention Sinks

Jeffrey T. H. Wong, Cheng Zhang, Louis Mahon, Wayne Luk, Anton Isopoussu, Yiren Zhao

2512.20348 2026-03-17 cs.LG

Physics-guided Neural Network-based Shaft Power Prediction for Vessels

Dogan Altan, Hamza Haruna Mohammed, Glenn Terje Lines, Dusica Marijan, Arnbjørn Maressa

Comments: This work has been accepted for publication in the 11th Special Session on Intelligent Data Mining at IEEE BigData 2025. The final published version of this work will be available through IEEE.

2512.20272 2026-03-17 cs.LG

HGAN-SDEs: Learning Neural Stochastic Differential Equations with Hermite-Guided Adversarial Training

Yuanjian Xu, Yuan Shuai, Jianing Hao, Guang Zhang

2512.20074 2026-03-17 cs.AI cs.CL

Reason2Decide: Rationale-Driven Multi-Task Learning

H M Quamran Hasan, Housam Khalifa Bashier, Jiayi Dai, Mi-Young Kim, Randy Goebel

Comments: Uploaded the camera-ready version of the paper accepted to LREC 2026

2512.19554 2026-03-17 cs.LG cs.AI

CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal Reasoning

Yongxin Wang, Zhicheng Yang, Meng Cao, Mingfei Han, Haokun Lin, Yingying Zhu, Xiaojun Chang, Xiaodan Liang

2512.18991 2026-03-17 cs.CV cs.AI

Training-Free Global Geometric Association for 4D LiDAR Panoptic Segmentation

Gyeongrok Oh, Youngdong Jang, Jonghyun Choi, Suk-Ju Kang, Guang Lin, Sangpil Kim

2512.17945 2026-03-17 cs.LG q-fin.RM q-fin.ST

What's the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD

Petr Koklev

Comments: 56 pages. This version: December 2025. Includes multi-dataset benchmark results and diagnostic analyses; replication code and configuration files are available via the GitHub repository referenced in the paper.

2512.16357 2026-03-17 cs.CV

GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction

Tao Hu, Weiyu Zhou, Yanjie Tu, Peng Wu, Wei Dong, Qingsen Yan, Yanning Zhang

2512.15746 2026-03-17 cs.LG cond-mat.mtrl-sci

A Unified Generative-Predictive Framework for Deterministic Inverse Design

Reza T. Batley, Sourav Saha

Details

DOI: 10.2514/6.2026-0365
Journal ref: AIAA SciTech Forum, 2026

English abstract

Inverse design of heterogeneous material microstructures is a fundamentally ill-posed and famously computationally expensive problem. This is exacerbated by the high-dimensional design spaces associated with finely resolved images, multimodal input property streams, and highly nonlinear forward physics. Whilst modern generative models excel at accurately modeling such complex forward behavior, most of them are not intrinsically structured to support fast, stable deterministic inversion with a physics-informed bias. This work introduces Janus, a unified generative-predictive framework to address this problem. Janus couples a deep encoder-decoder architecture with a predictive KHRONOS head, a separable neural architecture. Topologically speaking, Janus learns a latent manifold that is simultaneously isometric for generative inversion and pruned for physical prediction; the joint objective induces disentanglement of the latent space. Janus is first validated on the MNIST dataset, demonstrating high-fidelity reconstruction, accurate classification and diverse generative inversion of all ten target classes. It is then applied to the inverse design of heterogeneous microstructures labeled with thermal conductivity. It achieves a forward prediction accuracy of $R^2=0.98$ (2% relative error) and sub-5% pixelwise reconstruction error. Inverse solutions satisfy target properties to within 1% relative error. Inverting a sweep through properties reveals smooth traversal of the latent manifold, and UMAP visualization confirms the emergence of a low-dimensional, disentangled manifold. By unifying prediction and generation within a single latent space, Janus enables real-time, physics-informed inverse microstructure generation at a lower computational cost than is typically associated with classical optimization-based approaches.

2512.15206 2026-03-17 cs.LG

Chorus: Harmonizing Context and Sensing Signals for Data-Free Model Customization in IoT

Liyu Zhang, Yejia Liu, Kwun Ho Liu, Runxi Huang, Xiaomin Ouyang

English abstract

A key bottleneck toward scalable IoT sensing is how to efficiently adapt AI models to new deployment conditions. In real-world IoT systems, sensor data is collected under diverse contexts, such as sensor placements or ambient environments, which alter signal patterns and degrade downstream performance. Traditional domain adaptation and generalization methods often ignore such contextual information or incorporate it in overly simplistic ways, making them ineffective under unseen context shifts after deployment. In this paper, we propose Chorus, a context-aware, data-free model customization approach that adapts models to unseen deployment conditions without requiring target-domain data. The key idea is to learn context representations that capture how contextual factors influence sensor data, and then use these representations as structured priors for context-aware customization under unseen shifts. Specifically, Chorus learns a shared sensor-context latent space through bidirectional cross-modal reconstruction on unlabeled sensor-context pairs, and regularizes the context embedding space to obtain compact and generalizable context representations. Building on the aligned representations, Chorus trains a lightweight gated head with limited labeled data to exploit context priors during inference, and introduces a context-caching mechanism that reuses cached context representations when no context shift is detected, thereby reducing inference overhead on smartphones. Experiments on IMU, speech enhancement, and WiFi sensing tasks under diverse context shifts show that Chorus outperforms state-of-the-art baselines by up to 20.2% in unseen contexts, with cached inference latency close to sensor-only deployment, while maintaining stable performance under continuous context transitions. A video demonstration of Chorus's performance in the real world is available at https://youtu.be/ZBdro0jPNkE.
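
The context-caching idea in the abstract (reuse a cached context representation while no context shift is detected) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ContextCache`, the cosine-distance shift test, the `threshold` value, and the `toy_encoder` stand-in for a learned context encoder are all assumptions made for illustration only.

```python
import math
from typing import Callable, List, Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class ContextCache:
    """Hypothetical cache: re-run the encoder only when the context shifts."""

    def __init__(self, encode: Callable[[Sequence[float]], List[float]],
                 threshold: float = 0.05) -> None:
        self.encode = encode          # assumed (expensive) context encoder
        self.threshold = threshold    # assumed shift-detection threshold
        self.encoder_calls = 0        # counts how often the encoder actually ran
        self._anchor: Optional[List[float]] = None     # context that was last encoded
        self._embedding: Optional[List[float]] = None  # its cached representation

    def get(self, context: Sequence[float]) -> List[float]:
        # A shift is declared on the first call, or when the new context is
        # farther than `threshold` from the context that produced the cache.
        shifted = (self._anchor is None
                   or cosine_distance(context, self._anchor) > self.threshold)
        if shifted:
            self._embedding = self.encode(context)
            self._anchor = list(context)
            self.encoder_calls += 1
        return self._embedding


def toy_encoder(context: Sequence[float]) -> List[float]:
    """Placeholder for a learned context encoder (assumption)."""
    return [2.0 * x for x in context]


cache = ContextCache(toy_encoder, threshold=0.05)
first = cache.get([1.0, 0.0])    # first call: encoder runs
second = cache.get([1.0, 0.01])  # nearly identical context: cached value reused
third = cache.get([0.0, 1.0])    # orthogonal context: shift detected, encoder runs
print(cache.encoder_calls)
```

In this toy run the encoder executes on the first and third calls only. A real deployment would need a shift detector suited to the actual context features, which the abstract does not specify.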

2512.14086 2026-03-17 cs.LG cs.NA math.NA

Derivative-Informed Fourier Neural Operator: Universal Approximation and Applications to PDE-Constrained Optimization

Boyuan Yao, Dingcheng Luo, Lianghao Cao, Nikola Kovachki, Thomas O'Leary-Roseberry, Omar Ghattas

2512.12898 2026-03-17 cs.CV cs.GR cs.LG

Towards High-Fidelity Gaussian Splatting with Queried-Convolution Neural Networks

Abhinav Kumar, Tristan Aumentado-Armstrong, Lazar Valkov, Gopal Sharma, Alex Levinshtein, Radek Grzeszczuk, Suren Kumar

Comments: 38 pages, 8 figures, Project Page: https://abhi1kumar.github.io/qonvolution/

2512.12459 2026-03-17 cs.CV cs.GR

From Particles to Fields: Reframing Photon Mapping with Continuous Gaussian Photon Fields

Jiachen Tao, Benjamin Planche, Van Nguyen Nguyen, Junyi Wu, Yuchun Liu, Haoxuan Wang, Zhongpai Gao, Gengyu Zhang, Meng Zheng, Feiran Wang, Anwesa Choudhuri, Zhenghao Zhao, Weitai Kang, Terrence Chen, Yan Yan, Ziyan Wu

2512.11542 2026-03-17 cs.CV

Infinity and Beyond: Compositional Alignment in VAR and Diffusion T2I Models

Hossein Shahabadi, Niki Sepasian, Arash Marioriyad, Ali Sharifi-Zarchi, Mahdieh Soleymani Baghshah

Comments: Accepted at the ICLR 2026 Workshop on Multimodal Intelligence: Next Token Prediction and Beyond

2512.10793 2026-03-17 cs.CL cs.AI

LabelFusion: Fusing Large Language Models with Transformer Encoders for Robust Financial News Classification

Michael Schlee, Christoph Weisser, Timo Kivimäki, Melchizedek Mashiku, Benjamin Saefken

2512.09944 2026-03-17 cs.AI cs.CV cs.LG eess.IV

Echo-CoPilot: A Multiple-Perspective Agentic Framework for Reliable Echocardiography Interpretation

Moein Heidari, Ali Mehrabian, Mohammad Amin Roohi, Wenjin Chen, David J. Foran, Jasmine Grewal, Ilker Hacihaliloglu

2512.09924 2026-03-17 cs.CV

Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

Xinyu Liu, Hangjie Yuan, Yujie Wei, Jiazheng Xing, Yujin Han, Jiahao Pan, Yanbiao Ma, Chi-Min Chan, Kang Zhao, Shiwei Zhang, Wenhan Luo, Yike Guo

2512.04784 2026-03-17 cs.CV

PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Bowen Ping, Chengyou Jia, Minnan Luo, Changliang Xia, Xin Shen, Zhuohang Dang, Hangwei Qian

2512.04305 2026-03-17 cs.CV

How (Mis)calibrated is Your Federated CLIP and What To Do About It?

Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Gianni Franchi, Elisa Ricci, Subhankar Roy

Comments: Preprint

2512.03886 2026-03-17 cs.RO cs.SY eess.SY

A Modular Architecture Design for Autonomous Driving Racing in Controlled Environments

Brais Fontan-Costas, M. Diaz-Cacho, Ruben Fernandez-Boullon, Manuel Alonso-Carracedo, Javier Perez-Robles

2512.01850 2026-03-17 cs.CV cs.RO

Register Any Point: Scaling 3D Point Cloud Registration by Flow Matching

Yue Pan, Tao Sun, Liyuan Zhu, Lucas Nunes, Iro Armeni, Jens Behley, Cyrill Stachniss

2511.22436 2026-03-17 cs.CV

ABounD: Adversarial Boundary-Driven Few-Shot Learning for Multi-Class Anomaly Detection

Runzhi Deng, Yundi Hu, Xinshuang Zhang, Zhao Wang, Xixi Liu, Wang-Zhou Dai, Caifeng Shan, Fang Zhao

2511.21945 2026-03-17 cs.CV

GENA3D: Generative Amodal 3D Modeling by Bridging 2D Priors and 3D Coherence

Junwei Zhou, Yu-Wing Tai

Comments: 29 pages

2511.20629 2026-03-17 cs.CV cs.AI cs.LG

MapReduce LoRA: Advancing the Pareto Front in Multi-Preference Optimization for Generative Models

Chieh-Yun Chen, Zhonghao Wang, Qi Chen, Zhifan Ye, Min Shi, Yue Zhao, Yinan Zhao, Hui Qu, Wei-An Lin, Yiru Shen, Ajinkya Kale, Irfan Essa, Humphrey Shi

Comments: CVPR 2026; Code: https://github.com/SHI-Labs/MapReduce-LoRA

2511.18519 2026-03-17 cs.LG

CHIPS: Efficient CLIP Adaptation via Curvature-aware Hybrid Influence-based Data Selection

Xinlin Zhuang, Yichen Li, Xiwei Liu, Haolin Yang, Yifan Lu, Ziyun Zou, Yulong Li, Huifa Li, Dongliang Chen, Qinglei Wang, Weiyang Liu, Ying Qian, Jiangming Shi, Imran Razzak

Comments: Accepted by CVPR 2026

2511.18254 2026-03-17 cs.CV

UniFlow: Zero-Shot LiDAR Scene Flow for Autonomous Vehicles

Siyi Li, Qingwen Zhang, Ishan Khatri, Kyle Vedder, Eric Eaton, Deva Ramanan, Neehar Peri

Comments: Project Page: https://lisiyi777.github.io/UniFlow/

2511.18242 2026-03-17 cs.CV

EgoVITA: Learning to Plan and Verify for Egocentric Video Reasoning

Yogesh Kulkarni, Pooyan Fazli