arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

2602.21778 2026-03-02 cs.CV

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Liangbing Zhao, Le Zhuo, Sayak Paul, Hongsheng Li, Mohamed Elhoseiny

Comments All code, checkpoints, and datasets are available at https://liangbingzhao.github.io/statics2dynamics/

详情

英文摘要

Instruction-based image editing has achieved remarkable success in semantic alignment, yet state-of-the-art models frequently fail to render physically plausible results when editing involves complex causal dynamics, such as refraction or material deformation. We attribute this limitation to the dominant paradigm that treats editing as a discrete mapping between image pairs, which provides only boundary conditions and leaves transition dynamics underspecified. To address this, we reformulate physics-aware editing as predictive physical state transitions and introduce PhysicTran38K, a large-scale video-based dataset comprising 38K transition trajectories across five physical domains, constructed via a two-stage filtering and constraint-aware annotation pipeline. Building on this supervision, we propose PhysicEdit, an end-to-end framework equipped with a textual-visual dual-thinking mechanism. It combines a frozen Qwen2.5-VL for physically grounded reasoning with learnable transition queries that provide timestep-adaptive visual guidance to a diffusion backbone. Experiments show that PhysicEdit improves over Qwen-Image-Edit by 5.9% in physical realism and 10.1% in knowledge-grounded editing, setting a new state-of-the-art for open-source methods, while remaining competitive with leading proprietary models.

URL PDF HTML ☆

赞 0 踩 0

2602.21746 2026-03-02 cs.AI

fEDM+: A Risk-Based Fuzzy Ethical Decision Making Framework with Principle-Level Explainability and Pluralistic Validation

Abeer Dyoub, Francesca A. Lisi

Comments correcting captions of figures 7 and 8 and some other minor errors

2602.21644 2026-03-02 cs.RO

DAGS-SLAM: Dynamic-Aware 3DGS SLAM via Spatiotemporal Motion Probability and Uncertainty-Aware Scheduling

Li Zhang, Yu-An Liu, Xijia Jiang, Conghao Huang, Danyang Li, Yanyong Zhang

2602.21515 2026-03-02 cs.LG cs.AI cs.MA

Training Generalizable Collaborative Agents via Strategic Risk Aversion

Chengrui Qu, Yizhou Zhang, Nicolas Lanzetti, Eric Mazumdar

2602.21429 2026-03-02 cs.LG cs.AI cs.SY eess.SY math.OC

Provably Safe Generative Sampling with Constricting Barrier Functions

Darshan Gadginmath, Ahmed Allibhoy, Fabio Pasqualetti

Comments 21 pages, 7 figures

2602.21157 2026-03-02 cs.RO

HALO: A Unified Vision-Language-Action Model for Embodied Multimodal Chain-of-Thought Reasoning

Quanxin Shou, Fangqi Zhu, Shawn Chen, Puxin Yan, Zhengyang Yan, Yikun Miao, Xiaoyi Pang, Zicong Hong, Ruikai Shi, Hao Huang, Jie Zhang, Song Guo

2602.20293 2026-03-02 cs.LG stat.ML

Discrete Diffusion with Sample-Efficient Estimators for Conditionals

Karthik Elamvazhuthi, Abhijith Jayakumar, Andrey Y. Lokhov

2602.19919 2026-03-02 cs.CL cs.LG

Janus-Q: End-to-End Event-Driven Trading via Hierarchical-Gated Reward Modeling

Xiang Li, Zikai Wei, Yiyan Qi, Wanyun Zhou, Xiang Liu, Penglei Sun, Jian Guo, Yongqi Zhang, Xiaowen Chu

2602.19756 2026-03-02 cs.CV

Multimodal Dataset Distillation Made Simple by Prototype-Guided Data Synthesis

Junhyeok Choi, Sangwoo Mo, Minwoo Chae

Comments ICLR 2026

2602.18968 2026-03-02 cs.AI

Robust and Efficient Tool Orchestration via Layered Execution Structures with Reflective Correction

Tao Zhe, Haoyu Wang, Bo Luo, Min Wu, Wei Fan, Xiao Luo, Zijun Yao, Haifeng Chen, Dongjie Wang

2602.18946 2026-03-02 cs.LG math.OC

Exponential Convergence of (Stochastic) Gradient Descent for Separable Logistic Regression

Sacchit Kale, Piyushi Manupriya, Pierre Marion, Francis Bach, Anant Raj

2602.17632 2026-03-02 cs.LG cs.AI

SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer

Nathan Samuel de Lara, Florian Shkurti

2602.17049 2026-03-02 cs.AI cs.HC cs.RO

IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents

Seoyoung Lee, Seobin Yoon, Seongbeen Lee, Yoojung Chun, Dayoung Park, Doyeon Kim, Joo Yong Sim

Comments 12 pages, 9 figures, AAMAS 2026

2602.14135 2026-03-02 cs.AI cs.CR cs.CY

ForesightSafety Bench: A Frontier Risk Evaluation and Governance Framework towards Safe AI

Haibo Tong, Feifei Zhao, Linghao Feng, Ruoyu Wu, Ruolin Chen, Lu Jia, Zhou Zhao, Jindong Li, Tenglong Li, Erliang Lin, Shuai Yang, Enmeng Lu, Yinqian Sun, Qian Zhang, Zizhe Ruan, Jinyu Fan, Zeyang Yue, Ping Wu, Huangrui Li, Chengyi Sun, Yi Zeng

2602.14069 2026-03-02 cs.CL

Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric

Ruipeng Jia, Yunyi Yang, Yuxin Wu, Yongbo Gai, Siyuan Tao, Mengyu Zhou, Jianhe Lin, Xiaoxi Jiang, Guanjun Jiang

2602.13964 2026-03-02 cs.CL

HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

Weiqi Zhai, Zhihai Wang, Jinghang Wang, Boyu Yang, Xiaogang Li, Xander Xu, Bohan Wang, Peng Wang, Xingzhe Wu, Anfeng Li, Qiyuan Feng, Yuhao Zhou, Shoulin Han, Wenjie Luo, Yiyuan Li, Yaxuan Wang, Ruixian Luo, Guojie Lin, Peiyao Xiao, Chengliang Xu, Ben Wang, Zeyu Wang, Zichao Chen, Jianan Ye, Yijie Hu, Jialong Chen, Zongwen Shen, Yuliang Xu, An Yang, Bowen Yu, Dayiheng Liu, Junyang Lin, Hu Wei, Que Shen, Bing Zhao

Comments 14 pages, 10 figures

详情

英文摘要

Humanity's Last Exam (HLE) has become a widely used benchmark for evaluating frontier large language models on challenging, multi-domain questions. However, community-led analyses have raised concerns that HLE contains a non-trivial number of noisy items, which can bias evaluation results and distort cross-model comparisons. To address this challenge, we introduce HLE-Verified, a verified and revised version of HLE with a transparent verification protocol and fine-grained error taxonomy. Our construction follows a two-stage validation-and-repair workflow resulting in a certified benchmark. In Stage I, each item undergoes binary validation of the problem and final answer through domain-expert review and model-based cross-checks, yielding 668 verified items. In Stage II, flawed but fixable items are revised under strict constraints preserving the original evaluation intent, through dual independent expert repairs, model-assisted auditing, and final adjudication, resulting in 1,143 revised-and-certified items. The remaining 689 items are released as a documented uncertain set with explicit uncertainty sources and expertise tags for future refinement. We evaluate eight state-of-the-art language models on HLE and HLE-Verified, observing an average absolute accuracy gain of 7--10 percentage points on HLE-Verified. The improvement is particularly pronounced on items where the original problem statement and/or reference answer is erroneous, with gains of 30--40 percentage points. Our analyses further reveal a strong association between model confidence and the presence of errors in the problem statement or reference answer, supporting the effectiveness of our revisions. Overall, HLE-Verified improves HLE-style evaluations by reducing annotation noise and enabling more faithful measurement of model capabilities. Data is available at: https://huggingface.co/datasets/skylenage/HLE-Verified

URL PDF HTML ☆

赞 0 踩 0

2602.13585 2026-03-02 cs.CV

Diff-Aid: Inference-time Adaptive Interaction Denoising for Rectified Text-to-Image Generation

Binglei Li, Mengping Yang, Zhiyu Tan, Junping Zhang, Hao Li

Comments 18 pages

2602.13287 2026-03-02 cs.CV cs.NI

COOPERTRIM: Adaptive Data Selection for Uncertainty-Aware Cooperative Perception

Shilpa Mukhopadhyay, Amit Roy-Chowdhury, Hang Qiu

Comments Accepted in ICLR 2026

详情

英文摘要

Cooperative perception enables autonomous agents to share encoded representations over wireless communication to enhance each other's live situational awareness. However, the tension between the limited communication bandwidth and the rich sensor information hinders its practical deployment. Recent studies have explored selection strategies that share only a subset of features per frame while striving to keep the performance on par. Nevertheless, the bandwidth requirement still stresses current wireless technologies. To fundamentally ease the tension, we take a proactive approach, exploiting the temporal continuity to identify features that capture environment dynamics, while avoiding repetitive and redundant transmission of static information. By incorporating temporal awareness, agents are empowered to dynamically adapt the sharing quantity according to environment complexity. We instantiate this intuition into an adaptive selection framework, COOPERTRIM, which introduces a novel conformal temporal uncertainty metric to gauge feature relevance, and a data-driven mechanism to dynamically determine the sharing quantity. To evaluate COOPERTRIM, we take semantic segmentation and 3D detection as example tasks. Across multiple open-source cooperative segmentation and detection models, COOPERTRIM achieves up to 80.28% and 72.52% bandwidth reduction respectively while maintaining a comparable accuracy. Relative to other selection strategies, COOPERTRIM also improves IoU by as much as 45.54% with up to 72% less bandwidth. Combined with compression strategies, COOPERTRIM can further reduce bandwidth usage to as low as 1.46% without compromising IoU performance. Qualitative results show COOPERTRIM gracefully adapts to environmental dynamics, localization error, and communication latency, demonstrating flexibility and paving the way for real-world deployment.

URL PDF HTML ☆

赞 0 踩 0

2602.12769 2026-03-02 cs.CV

PixelRush: Ultra-Fast, Training-Free High-Resolution Image Generation via One-step Diffusion

Hong-Phuc Lai, Phong Nguyen, Anh Tran

Comments Accepted to CVPR 2026 (Main Conference)

2602.12113 2026-03-02 cs.AI cs.CL

Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty

Zewei Yu, Lirong Gao, Yuke Zhu, Bo Zheng, Junbo Zhao, Sheng Guo, Haobo Wang

Comments Accepted to ICLR 2026

2602.10594 2026-03-02 cs.RO cs.LG

Flow-Enabled Generalization to Human Demonstrations in Few-Shot Imitation Learning

Runze Tang, Penny Sweetser

Comments Accepted to ICRA 2026

2602.09961 2026-03-02 cs.CL

ViMultiChoice: Toward a Method That Gives Explanation for Multiple-Choice Reading Comprehension in Vietnamese

Trung Tien Cao, Lam Minh Thai, Nghia Hieu Nguyen, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

2602.07588 2026-03-02 cs.LG

Unified Biomolecular Trajectory Generation via Pretrained Variational Bridge

Ziyang Yu, Wenbing Huang, Yang Liu

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

2602.06846 2026-03-02 cs.SD

DynFOA: Generating First-Order Ambisonics with Conditional Diffusion for Dynamic and Acoustically Complex 360-Degree Videos

Ziyu Luo, Lin Chen, Qiang Qu, Xiaoming Chen, Yiran Shen

2602.05513 2026-03-02 cs.RO cs.AI

DECO: Decoupled Multimodal Diffusion Transformer for Bimanual Dexterous Manipulation with a Plugin Tactile Adapter

Xukun Li, Yu Sun, Lei Zhang, Bosheng Huang, Yibo Peng, Yuan Meng, Haojun Jiang, Shaoxuan Xie, Guocai Yao, Alois Knoll, Zhenshan Bing, Xinlong Wang, Zhenguo Sun

Comments 17 pages, 8 figures. Project Page: https://baai-humanoid.github.io/DECO-webpage/

2602.05375 2026-03-02 cs.LG cs.CV

Erase at the Core: Representation Unlearning for Machine Unlearning

Jaewon Lee, Yongwoo Kim, Donghyun Kim

2602.05362 2026-03-02 cs.CV

Imagine a City: CityGenAgent for Procedural 3D City Generation

Zishan Liu, Zecong Tang, RuoCheng Wu, Xinzhe Zheng, Jingyu Hu, Ka-Hei Hui, Haoran Xie, Bo Dai, Zhengzhe Liu

2602.02960 2026-03-02 cs.RO cs.AI cs.LG

Embodiment-Aware Generalist Specialist Distillation for Unified Humanoid Whole-Body Control

Quanquan Peng, Yunfeng Lin, Yufei Xue, Jiangmiao Pang, Weinan Zhang

2602.02090 2026-03-02 cs.CL cs.AI

LEC-KG: An LLM-Embedding Collaborative Framework for Domain-Specific Knowledge Graph Construction -- A Case Study on SDGs

Yikai Zeng, Yingchao Piao, Changhua Pei, Jianhui Li

2602.01840 2026-03-02 cs.CL

Read As Human: Compressing Context via Parallelizable Close Reading and Skimming

Jiwei Tang, Shilei Liu, Zhicheng Zhang, Qingsong Lv, Runsong Zhao, Tingwei Lu, Langming Liu, Haibin Chen, Yujin Yuan, Hai-Tao Zheng, Wenbo Su, Bo Zheng

Comments 13 pages,5 figures (Compared with v1, the author affiliations have been corrected.)