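arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.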

2605.05781 2026-05-08 cs.CV cs.AI

Steering Visual Generation in Unified Multimodal Models with Understanding Supervision

Zeyu Liu, Zanlin Ni, Yang Yue, Cheng Da, Huan Yang, Di Zhang, Kun Gai, Gao Huang

Abstract

Unified multimodal models are envisioned to bridge the gap between understanding and generation. Yet, to achieve competitive performance, state-of-the-art models adopt largely decoupled understanding and generation components. This design, while effective for individual tasks, weakens the connection required for mutual enhancement, leaving the potential synergy empirically uncertain. We propose to explicitly restore this synergy by introducing Understanding-Oriented Post-Training (UNO), a lightweight framework that treats understanding not only as a distinct task, but also as a direct supervisory signal to steer generative representations. By incorporating objectives that encode semantic abstraction (captioning) and structural details (visual regression), we enable effective gradient flow from understanding to generation. Extensive experiments on image generation and editing demonstrate that understanding can serve as an effective catalyst for generation.

2605.05780 2026-05-08 cs.AI cs.CV cs.LG

Von Neumann Networks

Shekhar S. Chandra

2605.05777 2026-05-08 cs.CL

Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

Huizi Cui, Huan Ma, Qilin Wang, Yuhang Gao, Changqing Zhang

Comments Accepted to ACL 2026

2605.05776 2026-05-08 cs.AI

HEDP: A Hybrid Energy-Distance Prompt-based Framework for Domain Incremental Learning

Yu Feng, Zhen Tian, Haoran Luo, Xie Yu, Diancheng Cheng, Haoyue Zheng, Shuai Lyu, Ping Zong, Lianyuan Li, Xin Ge, Yifan Zhu

Comments 13 pages, 6 figures, Accepted by ICML 2026

2605.05773 2026-05-08 cs.AI

CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt

Md Touhidul Islam, Sujan Kumar Saha, Farimah Farahmandi, Mark Tehranipoor

Abstract

Automating analog circuit design remains a longstanding challenge in Electronic Design Automation (EDA). While Transformer-based Large Language Models (LLMs) have revolutionized software code generation, their application to analog hardware design is hindered by two critical limitations: (i) the scarcity of analog design datasets containing natural language descriptions of designs and their corresponding netlists, and (ii) the inefficiency of general-purpose tokenizers (e.g., Byte Pair Encoding (BPE)) in capturing the inherent graph structure of circuits. To bridge this gap, first, we curate the largest annotated dataset of analog circuit netlists to date, comprising 31,341 netlist-natural language description pairs across all major circuit classes. Furthermore, we propose Circuit Tokenizer (CKT), a novel circuit graph tokenizer designed to encode netlist connectivity by explicitly mining frequent subcircuits. In terms of scalability, CKT overcomes the bottleneck of prior circuit graph serialization methods, where vocabulary size scales linearly with the maximum number of components in the dataset, n_max (that is, O(n_max)); instead, CKT decouples vocabulary growth from circuit complexity, achieving constant O(1) complexity. Empirically, CKT outperforms standard BPE on circuit topology representation, reducing sequence length by 57% and achieving a 2.3x higher compression ratio using a compact, fixed vocabulary of size 512. Leveraging this optimized tokenization, we train a circuit-specific language model, CircuitFormer, a 511M-parameter encoder-decoder transformer. Our model achieves 100% syntactic correctness and an 83% functional success rate across all major analog circuit categories, outperforming state-of-the-art open-source LLMs by 10% and 14%, respectively, while requiring 240x fewer parameters. The dataset is publicly available at https://huggingface.co/datasets/touhid314/cktformer-dataset.
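
The two quantitative claims in this abstract (the sequence-length reduction versus the compression ratio, and O(n_max) versus O(1) vocabulary growth) can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' CKT tokenizer. It assumes that "compression ratio" means baseline length divided by new length, and it uses a hypothetical index-based serialization (one token per component slot) as a stand-in for the prior methods; all function names are made up for this example.

```python
def ratio_from_reduction(reduction: float) -> float:
    """Compression gain implied by a fractional sequence-length reduction.

    Assumption: compression ratio = baseline length / new length.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def index_based_vocab_size(n_max: int) -> int:
    """Hypothetical index-based serialization: one token per component slot, so O(n_max)."""
    return n_max


def fixed_vocab_size(n_max: int, size: int = 512) -> int:
    """Fixed vocabulary: independent of n_max, so O(1). 512 is the size quoted in the abstract."""
    return size


if __name__ == "__main__":
    # A 57% shorter sequence corresponds to roughly a 2.33x ratio under the assumption above.
    print(round(ratio_from_reduction(0.57), 2))
    for n_max in (10, 100, 1000):
        print(n_max, index_based_vocab_size(n_max), fixed_vocab_size(n_max))
```

Under that assumed definition, a 57% reduction works out to about 2.3x, which is consistent with the figure reported in the abstract; whether the authors compute the ratio this way is not stated here.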

2605.05770 2026-05-08 cs.AI

Confidence is the key: how conformal prediction enhances the generative design of permeable peptides

Laura van Weesep, Sunay Chankeshwara, Leonardo De Maria, Florian David, Ola Engkvist, Gökçe Geylan

Abstract

Generative models coupled with reinforcement learning (RL), such as REINVENT and PepINVENT, have emerged as a powerful framework for de novo molecular design. During the ideation process, these generative frameworks utilize various predictive models as part of the optimization objectives. However, the utility of the predictive models can be limited by their domain of applicability. When RL is used to explore the chemical space with predictive models, it can suggest molecules that lie outside the predictor's domain of applicability. As a result, the predictions may become less reliable, potentially steering designs into high-reward but also high-uncertainty chemical spaces. This is particularly pronounced for cyclic peptides, which show therapeutic promise due to their modifiability and large interaction surfaces but are understudied compared to small molecules. While passive membrane permeation in cyclic peptides has attracted interest, identifying optimal permeable designs remains challenging yet crucial for targeting intracellular sites. We present an RL-guided generative framework that designs permeable cyclic peptides using an uncertainty-aware permeability predictor as the scoring component. To address predictive uncertainty, which is especially affected by novel chemistry, we integrate conformal prediction (CP) as our uncertainty quantification method. CP assesses designs based on the calibrated model under a user-defined confidence level. We demonstrate that rewarding generated peptides with CP-informed predictions improves both the reliability and the efficiency of the peptide optimization process. This also discourages exploration outside the predictor's applicability domain. This approach bridges the gap between predictive uncertainty and RL-guided exploration, showing how generative modelling and conformal prediction can be combined for the first time.
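
To make the idea of a CP-informed reward concrete, here is a minimal sketch of split conformal prediction for a regression-style predictor. It is not the authors' scoring component: the abstract does not say whether the permeability predictor is a regressor or a classifier, so the regression setting, the interval-based reward rule, and every name below (`conformal_quantile`, `cp_informed_reward`, the example threshold) are assumptions made for illustration.

```python
import math


def conformal_quantile(residuals, alpha):
    """Split-conformal quantile of absolute calibration residuals.

    alpha is the allowed error rate, so 1 - alpha is the user-defined confidence level.
    """
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee this confidence level.
        return math.inf
    return sorted(residuals)[rank - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around the point prediction with half-width q."""
    return point_prediction - q, point_prediction + q


def cp_informed_reward(point_prediction, q, threshold):
    """Hypothetical reward: 1.0 only if the whole interval clears the threshold."""
    lower, _ = prediction_interval(point_prediction, q)
    return 1.0 if lower >= threshold else 0.0


if __name__ == "__main__":
    # Made-up absolute residuals |y - y_hat| from a held-out calibration set.
    calibration_residuals = [0.3, 0.1, 0.5, 0.2, 0.7, 0.4, 0.6]
    q = conformal_quantile(calibration_residuals, alpha=0.25)
    print(q)
    print(cp_informed_reward(-5.0, q, threshold=-6.0))
    print(cp_informed_reward(-5.5, q, threshold=-6.0))
```

In this toy rule, a design is rewarded only when even the pessimistic end of its interval passes the threshold, so confidently predicted designs are preferred over ones whose point prediction is similar but whose uncertainty is larger. Real conformal setups often use per-sample (normalized) interval widths, which this sketch omits.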

2605.05769 2026-05-08 cs.LG cs.AI cs.CL

Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning

Myoungjun Kim, Sangwoo Park, Yoseob Han, Jin-Hyun Ahn

Comments Submitted to a conference

2605.05758 2026-05-08 cs.CL

BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

Xin Gao, Ruiyi Zhang, Meixi Du, Peijia Qin, Pengtao Xie

Comments Published at ACL 2026; Code and data available at https://github.com/gxx27/BioTool

2605.05756 2026-05-08 cs.RO cs.CV

MaMi-HOI: Harmonizing Global Kinematics and Local Geometry for Human-Object Interaction Generation

Hao Wang, Shiqi Wang, Qi Liu

2605.05753 2026-05-08 cs.CV

Jointly Learning Structured Representations and Stabilized Affinity for Human Motion Segmentation

Xianghan Meng, Zhiyuan Huang, Zhengyu Tong, Chun-Guang Li

Comments This manuscript is currently under review by the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

2605.05750 2026-05-08 cs.LG cs.CL

RVPO: Risk-Sensitive Alignment via Variance Regularization

Ivan Montero, Tomasz Jurczyk, Bhuwan Dhingra

Comments 17 pages, 5 figures

2605.05748 2026-05-08 cs.AI

Evaluating Explainability in Safety-Critical ATR Systems: Limitations of Post-Hoc Methods and Paths Toward Robust XAI

Vanessa Buhrmester, David Muench, Dimitri Bulatov, Michael Arens

Comments 15 pages, 1 image 1 table, ICPR workshop

2605.05745 2026-05-08 cs.AI

Best Arm Identification in Generalized Linear Bandits via Hybrid Feedback

Qirun Zeng, Xuchuang Wang, Jiayi Shen, Xutong Liu, Fang Kong, Jinhang Zuo

2605.05742 2026-05-08 cs.LG

Weak-to-Strong Generalization is Nearly Inevitable (in Linear Models)

Scott Geng, Dutch Hansen, Jerry Li

2605.05741 2026-05-08 cs.AI

HyperLens: Quantifying Cognitive Effort in LLMs with Fine-grained Confidence Trajectory

Chengda Lu, Xiaoyu Fan, Wei Xu

Comments 33 pages

2605.05738 2026-05-08 cs.LG cs.AI

CoMemNet: Contrastive Sampling with Memory Replay Network for Continual Traffic Prediction

Mei Wu, Wenchao Weng, Wenxin Su, Wenjie Tang, Wei Zhou

Comments 12 pages, 6 figures

2605.05737 2026-05-08 cs.AI cs.CL

ReFlect: An Effective Harness System for Complex Long-Horizon LLM Reasoning

Fan Huang

2605.05731 2026-05-08 cs.AI

Knee Osteoarthritis Severity Grading Using Optimized Deep Learning and LLM-Driven Intelligent AI on Computationally Limited Systems

Dayam Nadeem, Neha, Safdar Mustafa, Adnan Alvi, Mohd Hussain

Comments 6 pages, 11 figures, Accepted and presented at the 2nd International Conference on Emerging Computational Intelligence (ICECI 2026), IEEE. Published in conference proceedings. To appear in IEEE Xplore

2605.05728 2026-05-08 cs.LG cs.AI cs.SY eess.SY math.OC

WARP: A Benchmark for Primal-Dual Warm-Starting of Interior-Point Solvers

Dhruv Suri, Helgi Hilmarsson, Shourya Bose

2605.05726 2026-05-08 cs.AI

SkillRet: A Large-Scale Benchmark for Skill Retrieval in LLM Agents

Hongcheol Cho, Ryangkyung Kang, Youngeun Kim

2605.05725 2026-05-08 cs.AI

Detecting Time Series Anomalies Like an Expert: A Multi-Agent LLM Framework with Specialized Analyzers

Hyeongwon Kang, Jeongseob Kim, Jinwoo Park, Pilsung Kang

Comments Preprint. 9 pages main text, 29 pages total, 8 figures, 9 tables, with appendix

2605.05722 2026-05-08 cs.CV

$\mathcal{B}^{3}$-Net: Controlled Posterior Bridge Learning for Multi-Task Dense Prediction

Meihua Zhou, Li Yang

Comments 14 pages, 10 figures

2605.05718 2026-05-08 cs.LG

Enabling Federated Inference via Unsupervised Consensus Embedding

Yui Hashimoto, Takayuki Nishio, Yuichi Kitagawa, Takahito Tanimura

Comments 18 pages, 15 figures, submitted to IEEE Transactions on Mobile Computing (TMC) (under review)

2605.05716 2026-05-08 cs.AI cs.CL

More Is Not Always Better: Cross-Component Interference in LLM Agent Scaffolding

Ming Liu

Comments 10 pages, 5 tables; preprint, under review

2605.05715 2026-05-08 cs.AI cs.CL cs.LG

Decodable but Not Corrected by Fixed Residual-Stream Linear Steering: Evidence from Medical LLM Failure Regimes

Ming Liu

Comments 22 pages (14 main + 8 appendix), 5 figures, 7 tables. Under review

2605.05714 2026-05-08 cs.CV cs.RO

TriRelVLA: Triadic Relational Structure for Generalizable Embodied Manipulation

Hanyu Zhou, Chuanhao Ma, Gim Hee Lee

2605.05712 2026-05-08 cs.CV

EgoEMG: A Multimodal Egocentric Dataset with Bilateral EMG and Vision for Hand Pose Estimation

Ziheng Xi, Jiayi Yu, Yitao Wang, Yanbo Duan, Jianjiang Feng, Jie Zhou

Comments 34 pages, 13 figures, 15 tables. Submitted to NeurIPS 2026

2605.05711 2026-05-08 cs.CV cs.GR cs.HC cs.LG cs.MM

Closing the Loop: Unified 3D Scene Generation and Immersive Interaction via LLM-RL Coupling

Anh H. Vo, Sungyo Lee, Phil-Joong Kim, Soo-Mi Choi, Yong-Guk Kim

2605.05710 2026-05-08 cs.LG

On the Blessing of Pre-training in Weak-to-Strong Generalization

Wei Yao, Wang Zhaoyang, Gengze Xu, Chen Qian, Dongrui Liu, Ziqiao Wang, Yong Liu, Yunbei Xu

Comments 40 pages, 14 figures

2605.05709 2026-05-08 cs.AI

Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs

Md Farhamdur Reza, Richeng Jin, Tianfu Wu, Huaiyu Dai

Comments 39 pages, including appendices