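arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.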

2604.21268 2026-04-24 cs.LG cs.AI cs.CV

Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding

Wenkai Wang, Xiyun Li, Hongcan Guo, Wenhao Yu, Tianqing Fang, Haitao Mi, Dong Yu, Shengyu Zhang

Abstract

Graphical User Interface (GUI) grounding requires mapping natural language instructions to precise pixel coordinates. However, due to visually homogeneous elements and dense layouts, models typically grasp semantic intent yet struggle with achieving precise localization. While scaling sampling attempts (Pass@k) reveals potential gains, static self-consistency strategies derived from geometric clustering often yield limited improvements, as the model's predictions tend to be spatially dispersed. In this paper, we propose replacing static consistency strategies with a learnable selection mechanism that selects the optimal target by critiquing its own proposals rendered on the screenshot. Given the significant disparity between the model's grounding and critiquing capabilities, we propose a co-evolving Propose-then-Critic framework. To jointly optimize these, we introduce a maturity-aware adaptive co-evolutionary reinforcement learning paradigm. This approach dynamically balances the training objectives of proposer and critic, where the diversity of the proposer's outputs enhances critic robustness, while the critic's maturing discrimination capability conversely unlocks the proposer's potential for extensive spatial exploration, fostering the mutual reinforcement and co-evolution of both capabilities, thereby ensuring generalizability to adapt to diverse and complex interface layouts. Extensive experiments over 6 benchmarks show that our method significantly enhances both grounding accuracy and critic reliability.

2604.21265 2026-04-24 cs.CL

Listen and Chant Before You Read: The Ladder of Beauty in LM Pre-Training

Yoshinori Nomura

Comments 17 pages, 3 figures

2604.21264 2026-04-24 cs.AI

Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation

Minping Chen, Bing Xu, Zulong Chen, Chuanfei Xu, Ying Zhou, Zui Tao, Zeyi Wen

Comments Accepted to ACL Industry Track 2026

2604.21263 2026-04-24 cs.AI cs.PL cs.SE q-bio.QM

Trustworthy Clinical Decision Support Using Meta-Predicates and Domain-Specific Languages

Michael Bouzinier, Sergey Trifonov, Michael Chumack, Eugenia Lvova, Dmitry Etin

Abstract

Background: Regulatory frameworks for AI in healthcare, including the EU AI Act and FDA guidance on AI/ML-based medical devices, require clinical decision support to demonstrate not only accuracy but auditability. Existing formal languages for clinical logic validate syntactic and structural correctness but not whether decision rules use epistemologically appropriate evidence. Methods: Drawing on design-by-contract principles, we introduce meta-predicates (predicates about predicates) for asserting epistemological constraints on clinical decision rules expressed in a DSL. An epistemological type system classifies annotations along four dimensions: purpose, knowledge domain, scale, and method of acquisition. Meta-predicates assert which evidence types are permissible in any given rule. The framework is instantiated in AnFiSA, an open-source platform for genetic variant curation, and demonstrated using the Brigham Genomics Medicine protocol on 5.6 million variants from the Genome in a Bottle benchmark. Results: Decision trees used in variant interpretation can be reformulated as unate cascades, enabling per-variant audit trails that identify which rule classified each variant and why. Meta-predicate validation catches epistemological errors before deployment, whether rules are human-written or AI-generated. The approach complements post-hoc methods such as LIME and SHAP: where explanation reveals what evidence was used after the fact, meta-predicates constrain what evidence may be used before deployment, while preserving human readability. Conclusions: Meta-predicate validation is a step toward demonstrating not only that decisions are accurate but that they rest on appropriate evidence in ways that can be independently audited. While demonstrated in genomics, the approach generalises to any domain requiring auditable decision logic.

2604.21256 2026-04-24 cs.AI

Robustness Analysis of POMDP Policies to Observation Perturbations

Benjamin Kraske, Qi Heng Ho, Federico Rossi, Morteza Lahijanian, Zachary Sunberg

Comments 43 Pages

2604.21255 2026-04-24 cs.CL

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

Chenghao Yang, Yuning Zhang, Zhoufutu Wen, Tao Gong, Jiaheng Liu, Qi Chu, Nenghai Yu

Comments Accepted by ACL 2026 Main Conference

2604.21253 2026-04-24 cs.CL cs.AI

Planning Beyond Text: Graph-based Reasoning for Complex Narrative Generation

Hanwen Gu, Chao Guo, Junle Wang, Wenda Xie, Yisheng Lv

Comments Accepted to Findings of the Association for Computational Linguistics: ACL 2026

2604.21252 2026-04-24 cs.LG

Improving Performance in Classification Tasks with LCEN and the Weighted Focal Differentiable MCC Loss

Pedro Seber, Richard D. Braatz

2604.21249 2026-04-24 cs.RO

Reasoning About Traversability: Language-Guided Off-Road 3D Trajectory Planning

Byounggun Park, Soonmin Hwang

2604.21241 2026-04-24 cs.RO cs.AI

CorridorVLA: Explicit Spatial Constraints for Generative Action Heads via Sparse Anchors

Dachong Li, ZhuangZhuang Chen, Jin Zhang, Jianqiang Li

2604.21238 2026-04-24 cs.CL cs.IR

Unlocking the Power of Large Language Models for Multi-table Entity Matching

Yingkai Tang, Taoyu Su, Wenyuan Zhang, Xiaoyang Guo, Tingwen Liu

Comments Accepted by NLPCC 2025

2604.21235 2026-04-24 cs.LG cs.CL stat.ME

Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness

Zihan Liang, Ziwen Pan, Ruoxuan Xiong

Comments Findings of ACL 2026 (30 pages)

2604.21229 2026-04-24 cs.CL cs.AI

EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval

Julian Acuna

Comments 9 pages, 2 figures, 3 tables

2604.21227 2026-04-24 cs.CV cs.MM

UAU-Net: Uncertainty-aware Representation Learning and Evidential Classification for Facial Action Unit Detection

Yuze Li, Zhilei Liu

Comments Accepted by ICMR 2026

2604.21223 2026-04-24 cs.CL cs.AI

Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model

Runheng Liu, Heyan Huang, Xingchen Xiao, Zhijing Wu

Comments NeurIPS 2025

2604.21221 2026-04-24 cs.CV cs.LG

Sparse Forcing: Native Trainable Sparse Attention for Real-time Autoregressive Diffusion Video Generation

Boxun Xu, Yuming Du, Zichang Liu, Siyu Yang, Ziyang Jiang, Siqi Yan, Rajasi Saha, Albert Pumarola, Wenchen Wang, Peng Li

2604.21215 2026-04-24 cs.LG

The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

Costin-Andrei Oncescu, Depen Morwani, Samy Jelassi, Alexandru Meterez, Mujin Kwun, Sham Kakade

2604.21211 2026-04-24 cs.CL

Subject-level Inference for Realistic Text Anonymization Evaluation

Myeong Seok Oh, Dong-Yun Kim, Hanseok Oh, Chaean Kang, Joeun Kang, Xiaonan Wang, Hyunjung Park, Young Cheol Jung, Hansaem Kim

Comments Accepted at ACL 2026

2604.21209 2026-04-24 cs.AI cs.CL

Align Generative Artificial Intelligence with Human Preferences: A Novel Large Language Model Fine-Tuning Method for Online Review Management

Yanan Wang, Yong Ge

Comments Accepted to Information Systems Research (ISR). This is a preliminary version

DOI: 10.1287/isre.2024.1518

Abstract

Online reviews have played a pivotal role in consumers' decision-making processes. Existing research has highlighted the significant impact of managerial review responses on customer relationship management and firm performance. However, a large portion of online reviews remains unaddressed due to the considerable human labor required to respond to the rapid growth of online reviews. While generative AI models have achieved remarkable success in a range of tasks, they are general-purpose models and may not align well with domain-specific human preferences. To tailor these general generative AI models to domain-specific applications, finetuning is commonly employed. Nevertheless, several challenges persist in finetuning with domain-specific data, including hallucinations, difficulty in representing domain-specific human preferences, and over-conservatism in offline policy optimization. To address these challenges, we propose a novel preference finetuning method to align an LLM with domain-specific human preferences for generating online review responses. Specifically, we first identify the source of hallucination and propose an effective context augmentation approach to mitigate LLM hallucination. To represent human preferences, we propose a novel theory-driven preference finetuning approach that automatically constructs human preference pairs in the online review domain. Additionally, we propose a curriculum learning approach to further enhance preference finetuning. To overcome the challenge of over-conservatism in existing offline preference finetuning methods, we propose a novel density estimation-based support constraint method to relax the conservatism, and we mathematically prove its superior theoretical guarantees. Extensive evaluations substantiate the superiority of our proposed preference finetuning method.

2604.21204 2026-04-24 cs.CL cs.AI cs.IR

On Reasoning Behind Next Occupation Recommendation

Shan Dong, Palakorn Achananuparp, Hieu Hien Mai, Lei Wang, Yao Lu, Ee-Peng Lim

Comments Accepted to PAKDD 2026

2604.21198 2026-04-24 cs.CV

A Probabilistic Framework for Improving Dense Object Detection in Underwater Image Data via Annealing-Based Data Augmentation

Eleanor Wiesler, Trace Baxley

2604.21197 2026-04-24 cs.LG

Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

Guilin Deng, Silong Chen, Yuchuan Luo, Yi Liu, Songlei Wang, Zhiping Cai, Lin Liu, Xiaohua Jia, Shaojing Fu

Comments This is the full version (including complete appendices and supplementary materials) of the paper accepted for publication at the 2026 IEEE Symposium on Security and Privacy

2604.21193 2026-04-24 cs.AI

Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models

Vipula Rawte, Ryan Rossi, Franck Dernoncourt, Nedim Lipka

2604.21192 2026-04-24 cs.RO cs.AI

How VLAs (Really) Work In Open-World Environments

Amir Rasouli, Yangzheng Wu, Zhiyuan Li, Rui Heng Yang, Xuan Zhao, Charles Eret, Sajjad Pakdamansavoji

Comments 8 pages, 7 figures, 2 tables

2604.21191 2026-04-24 cs.CL

Prefix Parsing is Just Parsing

Clemente Pasti, Andreas Opedal, Timothy J. O'Donnell, Ryan Cotterell, Tim Vieira

Comments To appear at ACL 2026

2604.21189 2026-04-24 cs.RO

Full-Body Dynamic Safety for Robot Manipulators: 3D Poisson Safety Functions for CBF-Based Safety Filters

Meg Wilkinson, Gilbert Bahati, Ryan M. Bena, Emily Fourney, Joel W. Burdick, Aaron D. Ames

2604.21182 2026-04-24 cs.CV

WildSplatter: Feed-forward 3D Gaussian Splatting with Appearance Control from Unconstrained Images

Yuki Fujimura, Takahiro Kushida, Kazuya Kitano, Takuya Funatomi, Yasuhiro Mukaigawa

Comments Project page: https://github.com/yfujimura/WildSplatter

2604.21175 2026-04-24 cs.LG cs.DS

Graph Neural Network-Informed Predictive Flows for Faster Ford-Fulkerson and PAC-Learnability

Eleanor Wiesler, Trace Baxley

2604.21160 2026-04-24 cs.CV

Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment

Jingkun Chen, Ruoshi Xu, Mingqi Gao, Shengda Luo, Jungong Han

Comments 10 pages, 3 figures, 5 tables

2604.21155 2026-04-24 cs.AI cs.MA

Multi-Agent Empowerment and Emergence of Complex Behavior in Groups

Tristan Shah, Ilya Nemenman, Daniel Polani, Stas Tiomkin

Comments 11 pages