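arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.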

2604.25727 2026-04-29 cs.AI

Toward Scalable Terminal Task Synthesis via Skill Graphs

Zhiyuan Fan, Tinghao Yu, Yuanjun Cai, Jiangtao Guan, Yun Yang, Dingxin Hu, Jiang Zhou, Xing Wu, Zhuo Han, Feng Zhang, Lilin Wang

Abstract

Terminal agents have demonstrated strong potential for autonomous command-line execution, yet their training remains constrained by the scarcity of high-quality and diverse execution trajectories. Existing approaches mitigate this bottleneck by synthesizing large-scale terminal task instances for trajectory sampling. However, they primarily focus on scaling the number of tasks while providing limited control over the diversity of execution trajectories that agents actually experience during training. In this paper, we present SkillSynth, an automated framework for terminal task synthesis built on a scenario-mediated skill graph. SkillSynth first constructs a large-scale skill graph, where scenarios serve as intermediate transition nodes that connect diverse command-line skills. It then samples paths from this graph as abstractions of real-world workflows, and uses a multi-agent harness to instantiate them into executable task instances. By grounding task synthesis in graph-sampled workflow paths, SkillSynth explicitly controls the diversity of minimal execution trajectories required to solve the synthesized tasks. Experiments on Terminal-Bench demonstrate the effectiveness of SkillSynth. Moreover, task instances synthesized by SkillSynth have been adopted to train Hy3 Preview, contributing to its enhanced agentic capabilities in terminal-based settings.
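
The graph-sampling idea in this abstract can be illustrated with a small sketch. It is not the SkillSynth implementation: the skills, scenarios, edges, and the `sample_path` helper are all hypothetical, and a uniform random walk stands in for whatever sampling strategy the paper actually uses. It only shows how scenario nodes can sit between command-line skills so that a sampled path alternates skill, scenario, skill.

```python
import random

# Hypothetical toy graph (not from the paper): skills point to scenarios,
# and scenarios point back to skills, so scenarios act as transition nodes.
SKILL_TO_SCENARIOS = {
    "grep": ["log triage"],
    "awk": ["log triage", "report generation"],
    "tar": ["backup rotation"],
    "cron": ["backup rotation"],
}
SCENARIO_TO_SKILLS = {
    "log triage": ["grep", "awk"],
    "report generation": ["awk", "tar"],
    "backup rotation": ["tar", "cron"],
}


def sample_path(start_skill, num_hops, rng):
    """Random walk of the form skill -> scenario -> skill -> ...

    Each hop picks a scenario reachable from the current skill, then a
    skill reachable from that scenario. Uniform choice is an assumption.
    """
    path = [start_skill]
    skill = start_skill
    for _ in range(num_hops):
        scenario = rng.choice(SKILL_TO_SCENARIOS[skill])
        skill = rng.choice(SCENARIO_TO_SKILLS[scenario])
        path += [scenario, skill]
    return path


# A seeded generator keeps the sampled workflow reproducible.
path = sample_path("grep", 3, random.Random(0))
print(" -> ".join(path))
```

In this sketch, a sampled path would then be handed to a separate step that turns it into a concrete task; that instantiation step is outside the scope of the example.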

2604.25724 2026-04-29 cs.AI

Scalable Inference Architectures for Compound AI Systems: A Production Deployment Study

Srikanta Prasad S, Utkarsh Arora

Comments Accepted to the ACM Conference on AI and Agentic Systems (ACM CAIS 2026)

2604.25720 2026-04-29 cs.CV cs.CL

Toward Multimodal Conversational AI for Age-Related Macular Degeneration

Ran Gu, Benjamin Hou, Mélanie Hébert, Asmita Indurkar, Yifan Yang, Emily Y. Chew, Tiarnán D. L. Keenan, Zhiyong Lu

Comments 38 pages, 4 figures

2604.25716 2026-04-29 cs.CL cs.AI

Cross-Lingual Jailbreak Detection via Semantic Codebooks

Shirin Alanova, Bogdan Minko, Sabrina Sadiekh, Evgeniy Kokuykin

2604.25698 2026-04-29 cs.RO

Reference-Augmented Learning for Precise Tracking Policy of Tendon-Driven Continuum Robots

Ziqing Zou, Ke Qiu, Haojian Lu, Rong Xiong, Yue Wang

2604.25693 2026-04-29 cs.AI

RADD: Retrieval-Augmented Discrete Diffusion for Multi-Modal Knowledge Graph Completion

Guanglin Niu, Bo Li

Comments 12 pages, 3 figures, 6 tables

2604.25691 2026-04-29 cs.RO

Learning-Based Dynamics Modeling and Robust Control for Tendon-Driven Continuum Robots

Ziqing Zou, Ke Qiu, Fei Wang, Haojian Lu, Rong Xiong, Yue Wang

2604.25688 2026-04-29 cs.CV

QB-LIF: Learnable-Scale Quantized Burst Neurons for Efficient SNNs

Dewei Bai, Hongxiang Peng, Jiajun Mei, Yang Ren, Hong Qu, Dawen Xia, Zhang Yi

2604.25684 2026-04-29 cs.AI

Think Before You Act -- A Neurocognitive Governance Model for Autonomous AI Agents

Eranga Bandara, Ross Gore, Asanga Gunaratna, Sachini Rajapakse, Isurunima Kularathna, Ravi Mukkamala, Sachin Shetty, Xueping Liang, Amin Hass, Tharaka Hewa, Abdul Rahman, Christopher K. Rhea, Anita H. Clayton, Preston Samuel, Atmaram Yarlagadda

2604.25680 2026-04-29 cs.CV eess.IV

Exploring Remote Photoplethysmography for Neonatal Pain Detection from Facial Videos

Ashutosh Dhamaniya, Anup Kumar Gupta, Trishna Saikia, Puneet Gupta

Comments 25 pages, 9 figures, 10 tables. Proposed rPPG-based method for neonatal pain detection from facial videos, with multimodal (rPPG + audio) analysis and extensive ablation studies on the iCOPEvid dataset

2604.25676 2026-04-29 cs.CL cs.AI

CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG

Nayeon Lee, Jiwoo Song, Byeongcheol Kang

Comments 23 pages, 9 figures. Accepted at ACL 2026 (Findings)

2604.25670 2026-04-29 cs.RO

GEGLU-Transformer for IMU-to-EMG Estimation with Few-Shot Adaptation

Miroljub Mihailovic, Luca Tonin, Stefano Tortora, Emanuele Menegatti

2604.25665 2026-04-29 cs.CL cs.AI cs.DL cs.IR

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation

Huyen Nguyen, Haoxuan Zhang, Yang Zhang, Junhua Ding, Haihua Chen

Comments 15 pages, 3 figures, 5 tables

2604.25661 2026-04-29 cs.RO cs.HC

SlicerRoboTMS: An Open-Source 3D Slicer Extension for Robot-Assisted Transcranial Magnetic Stimulation

Wenzhi Bai, Yituo Guo, Bhaskar Basu, Andrew Weightman, Zhenhong Li

Comments Accepted by the 48th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2026

2604.25654 2026-04-29 cs.CL

Progressing beyond Art Masterpieces or Touristic Clichés: how to assess your LLMs for cultural alignment?

António Branco, João Silva, Nuno Marques, Luis Gomes, Ricardo Campos, Raquel Sequeira, Sara Nerea, Rodrigo Silva, Miguel Marques, Rodrigo Duarte, Artur Putyato, Diogo Folques, Tiago Valente

Comments RESOURCEFUL-2026 Workshop at LREC 2026

2604.25642 2026-04-29 cs.CV cs.AI

Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models

Chengsheng Zhang, Chenghao Sun, Xinyan Jiang, Wei Li, Xinmei Tian

Comments Accepted by CVPR 2026

2604.25636 2026-04-29 cs.CV

Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models

Jiayi Guo, Linqing Wang, Jiangshan Wang, Yang Yue, Zeyu Liu, Zhiyuan Zhao, Qinglin Lu, Gao Huang, Chunyu Wang

Comments GitHub: https://github.com/LeapLabTHU/RvR

2604.25614 2026-04-29 cs.AI

HotComment: A Benchmark for Evaluating Popularity of Online Comments

Yafeng Wu, Yunyao Zhang, Liliang Ye, Guiyi Zeng, Junqing Yu, Chen Xu, Zikai Song

2604.25612 2026-04-29 cs.AI

The Nonverbal Syntax Framework: An Evidence-Based Tiered System for Inferring Learner States from Observable Behavioral Cues

Sherzod Turaev, Mary John, Jaloliddin Rustamov, Zahiriddin Rustamov, Saja Aldabet, Nazar Zaki, Khaled Shuaib

Comments 40 pages

Abstract

Understanding learners' cognitive and affective states underpins adaptive educational systems and effective teaching. Although research links nonverbal cues to internal states, no framework calibrates them to evidence. We present the Nonverbal Syntax Framework, drawn from a systematic review of 908 studies and 17,043 cue-state mappings (Turaev et al., 2026). The framework addresses three challenges: terminological fragmentation (behaviors described inconsistently), evidence heterogeneity (single observations to replicated findings), and state ambiguity (similar patterns indicating multiple states). Normalization consolidated 5,537 state labels into 2,010 canonical states (63.7%) and 11,521 cues into 6,434 normalized cues (44.2%) across nine behavioral channels. Dual-evidence assessment separately evaluates Component Evidence (coverage of cues and states) and Relationship Evidence (independent studies per cue-state link). 52% of "Very High" relationships rest on one paper, so separation enables calibrated rather than overconfident inference from preliminary findings. The framework's four levels comprise a Cue Vocabulary of 6,434 indicators classified as observable/instrumental; State Clusters linking 2,010 states to indicative cues; State Profiles with multimodal behavioral signatures and actionable specifications; and Discriminative Analysis distinguishing 1,215 confusable state pairs. We identify 480 actionable R1-R4 relationships (three or more independent papers), the replicated core of six decades of research, covering 35.5% of mappings across 47 key learning states and 111 distinct indicators. The remaining 91.5% (9,653 single-paper findings) form exploratory hypotheses for replication. The framework gives researchers an empirical foundation for identifying gaps, practitioners evidence-based tools for state inference, and technologists validated features for multimodal detection.
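
The two normalization percentages in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the parenthesized figures are relative reductions in label count (before versus after normalization); the abstract does not say so explicitly, but the numbers are consistent with that reading. The `reduction_pct` helper is hypothetical.

```python
def reduction_pct(before, after):
    """Percentage by which a count shrinks, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)


# Counts taken from the abstract; the "reduction" interpretation is an assumption.
state_reduction = reduction_pct(5537, 2010)   # state labels -> canonical states
cue_reduction = reduction_pct(11521, 6434)    # raw cues -> normalized cues

print(state_reduction, cue_reduction)  # prints 63.7 44.2
```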

2604.25611 2026-04-29 cs.CL cs.SD

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

Erfan Ramezani, Mohammad Mahdi Giahi, Mohammad Erfan Zarabadipour, Amir Reza Yosefian, Hamid Ghadiri

Comments 36 pages, 14 figures. Open-source implementation available at PyPI

2604.25584 2026-04-29 cs.AI

DualFact+: A Multimodal Fact Verification Framework for Procedural Video Understanding

Cennet Oguz, Yasser Hamidullah, Josef van Genabith, Simon Ostermann

Comments ACL 2026 Findings

2604.25580 2026-04-29 cs.CL

Bye Bye Perspective API: Lessons for Measurement Infrastructure in NLP, CSS and LLM Evaluation

David Hartmann, Manuel Tonneau, Angelie Kraft, LK Seiling, Dimitri Staufer, Pieter Delobelle, Jan Fillies, Anna Ricarda Luther, Jan Batzner, Mareike Lisker

Comments 13 pages, 1 figure, 1 table

2604.25578 2026-04-29 cs.CL cs.AI

Marco-MoE: Open Multilingual Mixture-of-Expert Language Models with Efficient Upcycling

Fan Jiang, Yu Zhao, Chenyang Lyu, Tianqi Shi, Yichao Du, Feihu Jiang, Longyue Wang, Weihua Luo

2604.25574 2026-04-29 cs.CV

Control Your Queries: Heterogeneous Query Interaction for Camera-Radar Fusion

Jialong Wu, Yihan Wang, Matthias Rottmann

2604.25570 2026-04-29 cs.CV

Vision SmolMamba: Spike-Guided Token Pruning for Energy-Efficient Spiking State-Space Vision Models

Dewei Bai, Hongxiang Peng, Yunyun Zeng, Ziyu Zhang, Hong Qu, Yi Zhang

2604.25563 2026-04-29 cs.RO

Improving Sensing Coverage and Compliance of 3D-Printed Artificial Skins Through Multi-Modal Sensing and Soft Materials

Carson Kohlbrenner, Caleb Escobedo, Sayak Ray, Alexander Dickhans, Anna Soukhovei, Nickolaus Jackoski, Lyle Antieau, Alessandro Roncone

Comments This work was accepted at the "Towards Large-Area Tactile Sensing Skins: From Scalable Materials to Embodied Robotic Perception" workshop at the International Conference on Robotics and Automation (ICRA) 2026

2604.25554 2026-04-29 cs.RO cs.LG

Egocentric Tactile and Proximity Sensors as Observation Priors for Humanoid Collision Avoidance

Carson Kohlbrenner, Niraj Pudasaini, William Xie, Naren Sivagnanadasan, Nikolaus Correll, Alessandro Roncone

Comments This work was accepted at the 8th RoboTac Workshop at the International Conference on Robotics and Automation (ICRA) 2026

2604.25551 2026-04-29 cs.LG cs.AI cs.LO

On Halting vs Converging in Recurrent Graph Neural Networks

Jeroen Bollen, Stijn Vansummeren

2604.25550 2026-04-29 cs.LG

Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy

Haoran Chen, Wentao Wang

Comments 5 pages, 3 figures

2604.25149 2026-04-29 cs.AI

Semantic Layers for Reliable LLM-Powered Data Analytics: A Paired Benchmark of Accuracy and Hallucination Across Three Frontier Models

Michael Rumiantsau, Ivan Fokeev