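arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.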

2604.12390 2026-04-20 cs.AI

Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models

Lei Lin, Jizhao Zhu, Yong Liu, Donghong Sun, Hongbo He, Yihua Du

Abstract

This paper addresses two limitations of large language models (LLMs) in solving complex problems: (1) their reasoning processes exhibit Bayesian-like stochastic generation, where each token is sampled from a context-dependent probability distribution, leading to inherently random decision trajectories rather than deterministic planning; (2) the reasoning and decision-making mechanisms are statically decoupled, meaning dynamically retrieved domain knowledge fails to dynamically adjust the underlying reasoning strategy. These dual deficiencies result in initial decisions lacking strategic anchoring and reasoning chains often failing to converge on correct solutions, as stochastic generation lacks mechanisms for trajectory correction or knowledge-guided optimization during sequential reasoning. To resolve these issues, we propose a problem-solving method integrated into the LLM's generation process to guide reasoning. This method, compatible with numerous LLMs and featuring reusable solutions, is grounded in a novel Heuristic-Classification-of-Thoughts prompting schema (HCoT). HCoT synergizes the LLM's reasoning ability with a structured problem space via a heuristic classification model that controls the reasoning process and provides reusable abstract solutions. Evaluated on two complex inductive reasoning tasks with ill-defined search spaces, HCoT outperforms existing approaches (e.g., Tree-of-Thoughts and Chain-of-Thoughts prompting) in performance. On the well-structured 24 Game task, HCoT demonstrates significantly higher token efficiency compared to the state-of-the-art Tree-of-Thoughts-Breadth-First-Search. In terms of both accuracy and token usage, HCoT achieves a Pareto frontier balance, offering a strong trade-off between performance and computational cost.

2604.11804 2026-04-20 cs.CV

OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation

Donghao Zhou, Guisheng Liu, Hao Yang, Jiatong Li, Jingyu Lin, Xiaohu Huang, Yichen Liu, Xin Gao, Cunjian Chen, Shilei Wen, Chi-Wing Fu, Pheng-Ann Heng

Comments: Project page: https://correr-zhou.github.io/OmniShow/

2604.11490 2026-04-20 cs.AI cs.CL cs.CV

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

Samuel Cahyawijaya, Peerat Limkonchotiwat, Tack Hwa Wong, Hitesh Laxmichand Patel, Amit Agarwal, Manuel Antonio Rufino, Carlos Rafael Catalan, Muhammad Reza Qorib, Vicky Feliren, Holy Lovenia, Aye Hninn Khine, Frederikus Hudi, David Anugraha, Alham Fikri Aji, Romrawin Chumpu, Viet-Thanh Pham, Minghan Wang, Mohamed Fazli Imam, Ruochen Zhang, Joseph Marvin Imperial, Khumaisa Nur'aini, Do Xuan Long, Musa Izzanardi Wijanarko, Joel Ruben Antony Moniz, Patrick Amadeus Irawan, Hanif Muhammad Zhafran, Isaiah Flores, Salsabila Zahirah Pranida, Jun Kevin, Jostin Jerico Rosal, Patricia Nicole Monderin, Kun Kerdthaisong, Ahmad Mustafid, My Chiffon Nguyen, Natchapon Jongwiriyanurak, Siva Worajitwannakul, Haochen Li, Adrian Xuan Wei Lim, Bin Wang, Muhammad Ravi Shulthan Habibi, Lynnette Hui Xian Ng, Mithil Bangera, Yeshil Bangera, Priyaranjan Pattnayak, Dun Li Chan, Sherissa Caren Djuniwar, Cho Chan Myei Oo, Hee Ming Shan

2604.11305 2026-04-20 cs.LG cs.IT math.IT stat.ML

Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables

Meiyi Zhu, Osvaldo Simeone

Comments: 32 pages, 29 figures

2604.11251 2026-04-20 cs.RO

CLAW: Composable Language-Annotated Whole-body Motion Generation

Jianuo Cao, Yuxin Chen, Masayoshi Tomizuka

2604.11182 2026-04-20 cs.CL

Evaluating Memory Capability in Continuous Lifelog Scenario

Jianjie Zheng, Zhichen Liu, Zhanyu Shen, Jingxiang Qu, Guanhua Chen, Yile Wang, Yang Xu, Yang Liu, Sijie Cheng

Comments: 27 pages, 7 figures. ACL 2026 Findings camera-ready

2604.10916 2026-04-20 cs.CV cs.AI

ReXSonoVQA: A Video QA Benchmark for Procedure-Centric Ultrasound Understanding

Xucheng Wang, Xiaoman Zhang, Sung Eun Kim, Ankit Pal, Pranav Rajpurkar

2604.10736 2026-04-20 cs.CL cs.SD

BlasBench: An Open Benchmark for Irish Speech Recognition

Jyoutir Raj, John Conway

Comments: 9 pages, 4 tables, 3 appendices. Code and data: https://github.com/jyoutir/blasbench

2604.10261 2026-04-20 cs.AI cs.CL cs.LG

The Amazing Agent Race: Strong Tool Users, Weak Navigators

Zae Myung Kim, Dongseok Lee, Jaehyung Kim, Vipul Raheja, Dongyeop Kang

2604.10096 2026-04-20 cs.CV

ABot-Claw: A Foundation for Persistent, Cooperative, and Self-Evolving Robotic Agents

Dongjie Huo, Haoyun Liu, Guoqing Liu, Dekang Qi, Zhiming Sun, Maoguo Gao, Jianxin He, Yandan Yang, Xinyuan Chang, Feng Xiong, Xing Wei, Zhiheng Ma, Mu Xu

2604.09836 2026-04-20 cs.AI cs.CL cs.LG

COMPOSITE-Stem

Kyle Waters, Lucas Nuzzi, Tadhg Looram, Alessandro Tomasiello, Ariel Ghislain Kemogne Kamdoum, Bikun Li, Damien Sileo, Egor Kretov, Francesco Fournier-Facio, Georgios Soloupis, Haile Kassahun, Hew Wolff, Jiaqi Cai, Lianghui Li, Marc Roth, Mohinder Naiya, Naixu Guo, Qicheng Tang, Richard Wheeler, Samuele Sala, Serguei Popov, Steven Dillmann, Yuqi Li

2604.09305 2026-04-20 cs.CV

VAGNet: Vision-based Accident Anticipation with Global Features

Vipooshan Vipulananthan, Charith D. Chitraranjan

2604.09270 2026-04-20 cs.RO

Soft Electroadhesive Feet for Micro Aerial Robots Perching on Smooth and Curved Surfaces

Chen Liu, Sonu Feroz, Ketao Zhang

Comments: 7 pages, 8 figures

2604.09232 2026-04-20 cs.CV cs.AI

Neural Distribution Prior for LiDAR Out-of-Distribution Detection

Zizhao Li, Zhengkang Xiang, Jiayang Ao, Feng Liu, Joseph West, Kourosh Khoshelham

Comments: CVPR 2026

2604.08281 2026-04-20 cs.CL

When to Trust Tools? Adaptive Tool Trust Calibration For Tool-Integrated Math Reasoning

Ruotao Xu, Yixin Ji, Yu Luo, Jinpeng Li, Dong Li, Peifeng Li, Juntao Li, Min Zhang

2604.07786 2026-04-20 cs.CV cs.LG

Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

Chanhyuk Choi, Taesoo Kim, Donggyu Lee, Siyeol Jung, Taehwan Kim

Comments: Accepted to CVPR 2026. Project page: https://chanhyeok-choi.github.io/C-MET/

2604.06425 2026-04-20 cs.LG cs.AI

Neural Computers

Mingchen Zhuge, Changsheng Zhao, Haozhe Liu, Zijian Zhou, Shuming Liu, Wenyi Wang, Ernie Chang, Gael Le Lan, Junjie Fei, Wenxuan Zhang, Yasheng Sun, Zhipeng Cai, Zechun Liu, Yunyang Xiong, Yining Yang, Yuandong Tian, Yangyang Shi, Vikas Chandra, Jürgen Schmidhuber

Comments: GitHub (data pipeline): https://github.com/metauto-ai/NeuralComputer; Blog post: https://metauto.ai/neuralcomputer/index_eng.html

2604.05552 2026-04-20 cs.CL cs.AI

Context-Agent: Dynamic Discourse Trees for Non-Linear Dialogue

Junan Hu, Shudan Guo, Wenqi Liu, Jianhua Yin, Yinwei Wei

Comments: 14 pages, 7 figures, ACL 2026

2604.02880 2026-04-20 cs.CV

InstructTable: Improving Table Structure Recognition Through Instructions

Boming Chen, Zining Wang, Zhentao Guo, Jianqiang Liu, Chen Duan, Yu Gu, Kai Zhou, Pengfei Yan

Comments: 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Findings Track (CVPRF)

2604.02393 2026-04-20 cs.LG nlin.AO

Plateaus, Optima, and Overfitting in Multi-Layer Perceptrons: A Saddle-Saddle-Attractor Scenario

Alex Alì Maleknia, Yuzuru Sato

2603.27955 2026-04-20 cs.LG hep-ph

Symbolic Density Estimation: A Decompositional Approach

Angelo Rajendram, Xieting Chu, Vijay Ganesh, Max Fieg, Aishik Ghosh

2603.27759 2026-04-20 cs.CV

When Surfaces Lie: Exploiting Wrinkle-Induced Attention Shift to Attack Vision-Language Models

Chengyin Hu, Xuemeng Sun, Jiaju Han, Qike Zhang, Xiang Chen, Xin Wang, Yiwei Wei, Jiahua Long

2603.27312 2026-04-20 cs.LG

Scalable Maximum Entropy Population Synthesis via Persistent Contrastive Divergence

Mirko Degli Esposti

2603.24621 2026-04-20 cs.AI

ARC-AGI-3: A New Challenge for Frontier Agentic Intelligence

ARC Prize Foundation

2603.20633 2026-04-20 cs.AI

Seed1.8 Model Card: Towards Generalized Real-World Agency

Bytedance Seed

2603.20410 2026-04-20 cs.LG

SLE-FNO: Single-Layer Extensions for Task-Agnostic Continual Learning in Fourier Neural Operators

Mahmoud Elhadidy, Roshan M. D'Souza, Amirhossein Arzani

Abstract

Scientific machine learning is increasingly used to build surrogate models, yet most models are trained under a restrictive assumption in which future data follow the same distribution as the training set. In practice, new experimental conditions or simulation regimes may differ significantly, requiring extrapolation and model updates without re-access to prior data. This creates a need for continual learning (CL) frameworks that can adapt to distribution shifts while preventing catastrophic forgetting. Such challenges are pronounced in fluid dynamics, where changes in geometry, boundary conditions, or flow regimes induce non-trivial changes to the solution. Here, we introduce a new architecture-based approach (SLE-FNO) combining a Single-Layer Extension (SLE) with the Fourier Neural Operator (FNO) to support efficient CL. SLE-FNO was compared with a range of established CL methods, including Elastic Weight Consolidation (EWC), Learning without Forgetting (LwF), replay-based approaches, Orthogonal Gradient Descent (OGD), Gradient Episodic Memory (GEM), PiggyBack, and Low-Rank Adaptation (LoRA), within a spatial field-to-field regression setting. The models were trained to map transient concentration fields to time-averaged wall shear stress (TAWSS) in pulsatile aneurysmal blood flow. Tasks were derived from 230 computational fluid dynamics simulations grouped into four sequential and out-of-distribution configurations. Results show that replay-based methods and architecture-based approaches (PiggyBack, LoRA, and SLE-FNO) achieve the best retention, with SLE-FNO providing the strongest overall balance between plasticity and stability, achieving accuracy with zero forgetting and minimal additional parameters. Our findings highlight key differences between CL algorithms and introduce SLE-FNO as a promising strategy for adapting baseline models when extrapolation is required.

2603.16427 2026-04-20 cs.CV

Cross-modal learning for plankton recognition

Joona Kareinen, Veikka Immonen, Tuomas Eerola, Lumi Haraguchi, Lasse Lensu, Kaisa Kraft, Sanna Suikkanen, Heikki Kälviäinen

2603.13966 2026-04-20 cs.AI

vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models

Suhwan Choi, Yunsung Lee, Yubeen Park, Chris Dongjoo Kim, Ranjay Krishna, Dieter Fox, Youngjae Yu

2603.13829 2026-04-20 cs.RO cs.AI cs.HC

ArrayTac: A Closed-loop Piezoelectric Tactile Platform for Continuously Tunable Rendering of Shape, Stiffness, and Friction

Tianhai Liang, Shiyi Guo, Baiye Cheng, Zhengrong Xue, Han Zhang, Huazhe Xu

Comments: Project website: https://arraytac.github.io/

2603.13683 2026-04-20 cs.CL cs.AI cs.CY

Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation

Hanwen Shen, Ting Ying, Jiajie Lu, Shanshan Wang

Comments: Accepted to the ACL 2026 main conference