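arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.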

2509.25035 2026-03-13 cs.CL cs.AI cs.LG

Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct

Haoyang Zheng, Xinyang Liu, Cindy Xiangrui Kong, Nan Jiang, Zheyuan Hu, Weijian Luo, Wei Deng, Guang Lin

Comments [ICLR 2026] 38 pages, 7 figures, 13 tables

Abstract

Fast and high-quality language generation is the holy grail that people pursue in the age of AI. In this work, we introduce Discrete Diffusion Divergence Instruct (DiDi-Instruct), a training-based method that initializes from a pre-trained diffusion large language model (dLLM) and distills a few-step student for fast generation. The model distilled with DiDi-Instruct matches or surpasses its dLLM teacher and the GPT-2 baseline while providing up to 64$\times$ acceleration. The theoretical foundation of DiDi-Instruct is a novel framework based on integral KL-divergence minimization, which leads to a practical training algorithm. We further introduce grouped reward normalization, intermediate-state matching, and the reward-guided ancestral sampler to improve training stability, model coverage, and inference quality. On the OpenWebText benchmark, DiDi-Instruct achieves perplexity ranging from 62.2 (8 NFEs) to 18.4 (128 NFEs), outperforming prior accelerated dLLMs and the GPT-2 baseline. These gains incur a negligible entropy loss (around $1$%) and reduce additional training wall-clock time by more than $20\times$ compared to competing dLLM distillation methods. We further validate the robustness and effectiveness of DiDi-Instruct through extensive ablation studies, model scaling, downstream task evaluations, and unconditional protein sequence generation. In conclusion, DiDi-Instruct enables efficient and effective distillation for language generation in the blink of an eye.

2509.23097 2026-03-13 cs.CV

Streamline pathology foundation model by cross-magnification distillation

Ziyu Su, Abdul Rehman Akbar, Usama Sajjad, Anil V. Parwani, Muhammad Khalid Khan Niazi

2509.22824 2026-03-13 cs.CL

Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning

Chi Ruan, Dongfu Jiang, Yubo Wang, Wenhu Chen

2509.20681 2026-03-13 cs.RO cs.AI cs.CV

Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation

Wei-Teng Chu, Tianyi Zhang, Matthew Johnson-Roberson, Weiming Zhi

Comments 9 pages, 6 figures, 2026 IEEE International Conference on Robotics and Automation (ICRA)

2509.18395 2026-03-13 cs.CL

NormGenesis: Multicultural Dialogue Generation via Exemplar-Guided Social Norm Modeling and Violation Recovery

Minki Hong, Jangho Choi, Jihie Kim

Comments 39 pages, 17 figures, EMNLP 2025 Main Conference, Senior Area Chair (SAC) Highlights Award

2509.11125 2026-03-13 cs.RO cs.CV

ManiVID-3D: Generalizable View-Invariant Reinforcement Learning for Robotic Manipulation via Disentangled 3D Representations

Zheng Li, Pei Qu, Yufei Jia, Shihui Zhou, Haizhou Ge, Jiahang Cao, Jinni Zhou, Guyue Zhou, Jun Ma

Comments Accepted to RA-L. Project website: https://zheng-joe-lee.github.io/manivid3d/

2509.06322 2026-03-13 cs.LG

Text-Trained LLMs Can Zero-Shot Extrapolate PDE Dynamics, Revealing a Three-Stage In-Context Learning Mechanism

Jiajun Bao, Nicolas Boullé, Toni J. B. Liu, Raphaël Sarfati, Christopher J. Earls

2509.00639 2026-03-13 cs.LG

Disentangling Slow and Fast Temporal Dynamics in Degradation Inference with Hierarchical Differential Models

Mengjie Zhao, Olga Fink

Abstract

Reliable inference of system degradation from sensor data is fundamental to condition monitoring and prognostics in mechanical and infrastructural systems. Since degradation is rarely directly observable and measurable, it must be inferred to enable accurate health assessment and decision-making. This is particularly challenging because operational and environmental variations dominate system behavior, while degradation introduces only subtle, long-term changes. Consequently, sensor data primarily reflect short-term operational variability, making it difficult to disentangle the underlying degradation process. Most unsupervised degradation inference methods learn nominal system behavior and use residuals as degradation proxies. However, residuals remain strongly entangled with operational history, yielding noisy and unreliable degradation estimates, particularly in infrastructural systems with dominant transient dynamics. Neural Ordinary Differential Equations (NODEs) offer a flexible framework for modeling latent dynamics, but in degraded systems, they suffer from numerical stiffness and degradation disentanglement remains difficult. To address these challenges, we propose a Hierarchical Controlled Differential Equation (H-CDE) framework that jointly models slow degradation dynamics and fast operational dynamics. H-CDE improves numerical efficiency through separate time integration of slow and fast components. Through a learnable path transformation mapping raw inputs to a latent degradation-relevant control path and a monotonicity-enforcing activation function that regularizes the inferred degradation dynamics, H-CDE enables effective disentangled degradation inference. Evaluations on mechanical and infrastructural systems demonstrate that H-CDE outperforms residual-based baselines, yielding more accurate, robust, and interpretable degradation inference in an unsupervised setting.

2508.16777 2026-03-13 cs.AI

Evaluation and LLM-Guided Learning of ICD Coding Rationales

Mingyang Li, Viktor Schlegel, Tingting Mu, Wuraola Oyewusi, Kai Kang, Goran Nenadic

2508.10745 2026-03-13 cs.AI cs.CV cs.LG cs.MA cs.MM

Agentic Design Review System

Sayan Nag, K J Joseph, Koustava Goswami, Vlad I Morariu, Balaji Vasan Srinivasan

Comments Project Page: https://sayannag.github.io/AgenticDRS

2508.09487 2026-03-13 cs.CV

Semantic-Aware Reconstruction Error for Detecting AI-Generated Images

Ju Yeon Kang, Jaehong Park, Semin Kim, Ji Won Yoon, Nam Soo Kim

2508.04604 2026-03-13 cs.CL cs.AI cs.IR

TURA: Tool-Augmented Unified Retrieval Agent for AI Search

Zhejun Zhao, Yuchen Li, Alley Liu, Yuehu Dong, Xiaolong Wei, Lixue Zheng, Pingsheng Liu, Dongdong Shen, Long Xia, Jiashu Zhao, Dawei Yin

2507.16083 2026-03-13 cs.CL cs.AI cs.LG

Efficient Compositional Multi-tasking for On-device Large Language Models

Ondrej Bohdal, Mete Ozay, Jijoong Moon, Kyeng-Hun Lee, Hyeonmok Ko, Umberto Michieli

Comments Accepted at EMNLP 2025 (main track, long paper)

2507.14552 2026-03-13 cs.AI

Large Language Models Assisting Ontology Evaluation

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskisärkkä, Aldo Gangemi, Eva Blomqvist, Andrea Giovanni Nuzzolese

2506.23543 2026-03-13 cs.CV

Pyramidal Patchification Flow for Visual Generation

Hui Li, Baoyou Chen, Liwei Zhang, Jiaye Li, Jingdong Wang, Siyu Zhu

Comments ICLR 2026

2506.21583 2026-03-13 cs.CL cs.AI

Hope Speech Detection in code-mixed Roman Urdu tweets: A Positive Turn in Natural Language Processing

Muhammad Ahmad, Muhammad Waqas, Ameer Hamza, Ildar Batyrshin, Grigori Sidorov

Comments We are withdrawing this preprint because it contains initial experimental results and an early version of the manuscript. We are currently improving the methodology, conducting additional experiments, and refining the analysis. A substantially revised version will be submitted in the future

2506.16584 2026-03-13 cs.CL cs.AI cs.LG

Measuring Intent Comprehension in LLMs

Nadav Kunievsky, James A. Evans

2506.13723 2026-03-13 cs.CV

SOTA: Self-adaptive Optimal Transport for Zero-Shot Classification with Multiple Foundation Models

Zhanxuan Hu, Qiyu Xu, Yu Duan, Yonghang Tai, Huafeng Li

2505.20967 2026-03-13 cs.CV

RF4D: Neural Radar Fields for Novel View Synthesis in Outdoor Dynamic Scenes

Jiarui Zhang, Zhihao Li, Chong Wang, Bihan Wen

2505.19459 2026-03-13 cs.LG cs.AI

Your Classifier Can Do More: Towards Balancing the Gaps in Classification, Robustness, and Generation

Kaichao Jiang, He Wang, Xiaoshuai Hao, Xiulong Yang, Ajian Liu, Qi Chu, Yunfeng Diao, Richang Hong

Comments Accepted by CVPR 2026

2505.19240 2026-03-13 cs.CL cs.AI cs.LG

LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models

Aida Kostikova, Zhipin Wang, Deidamea Bajri, Ole Pütz, Benjamin Paaßen, Steffen Eger

Comments ACM Computing Surveys (CSUR); 56 pages

2505.18607 2026-03-13 cs.AI

From Entity-Centric to Goal-Oriented Graphs: Enhancing LLM Knowledge Retrieval in Minecraft

Jonathan Leung, Yongjie Wang, Zhiqi Shen

Comments Accepted at Knowledge-Based Systems

2505.18547 2026-03-13 cs.AI cs.CV

Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models

Min Cheng, Fatemeh Doudi, Dileep Kalathil, Mohammad Ghavamzadeh, Panganamala R. Kumar

Comments Accepted at ICLR 2026

2505.17778 2026-03-13 cs.CV

TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis

Yu Xie, Jielei Zhang, Pengyu Chen, Weihang Wang, Longwen Gao, Peiyi Li, Qian Qiao, Zhouhui Lian

Comments Accepted to Eurographics 2026 (Computer Graphics Forum)

2505.16211 2026-03-13 cs.SD cs.AI cs.CL eess.AS

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

Kai Li, Can Shen, Yile Liu, Jirui Han, Kelong Zheng, Xuechao Zou, Lionel Z. Wang, Shun Zhang, Xingjian Du, Hanjun Luo, Yingbin Jin, Xinxin Xing, Ziyang Ma, Yue Liu, Yifan Zhang, Junfeng Fang, Kun Wang, Yibo Yan, Gelei Deng, Haoyang Li, Yiming Li, Xiaobin Zhuang, Tianlong Chen, Qingsong Wen, Tianwei Zhang, Yang Liu, Haibo Hu, Zhizheng Wu, Xiaolin Hu, Eng-Siong Chng, Wenyuan Xu, XiaoFeng Wang, Wei Dong, Xinfeng Li

Comments Accepted to ICLR 2026

2505.13903 2026-03-13 cs.CL cs.AI

Let's Verify Math Questions Step by Step

Chengyu Shen, Zhen Hao Wong, Runming He, Hao Liang, Meiyi Qiang, Zimo Meng, Zhengyang Zhao, Bohan Zeng, Zhengzhou Zhu, Bin Cui, Wentao Zhang

2505.05806 2026-03-13 cs.CV

Image Segmentation via Variational Model Based Tailored UNet: A Deep Variational Framework

Kaili Qi, Wenli Yang, Ye Li, Zhongyi Huang

2504.16916 2026-03-13 cs.RO cs.SY eess.SY

Zero-shot Sim-to-Real Transfer for Reinforcement Learning-based Visual Servoing of Soft Continuum Arms

Hsin-Jung Yang, Mahsa Khosravi, Benjamin Walt, Girish Krishnan, Soumik Sarkar

Comments The 7th Annual Learning for Dynamics & Control Conference (L4DC) 2025

2504.09940 2026-03-13 cs.LG

TianQuan-S2S: A Subseasonal-to-Seasonal Global Weather Model via Incorporate Climatology State

Guowen Li, Xintong Liu, Yang Liu, Mengxuan Chen, Shilei Cao, Xuehe Wang, Juepeng Zheng, Jinxiao Zhang, Haoyuan Liang, Lixian Zhang, Jiuke Wang, Meng Jin, Hong Cheng, Haohuan Fu

2504.05662 2026-03-13 cs.CV

InvAD: Inversion-based Reconstruction-Free Anomaly Detection with Diffusion Models

Shunsuke Sakai, Xiangteng He, Chunzhi Gu, Leonid Sigal, Tatsuhito Hasegawa

Comments Accepted to CVPR 2026. Project page: https://invad-project.com