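arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.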

2604.24219 2026-04-28 cs.AI

Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU

Hee-Kyong Yoo, Wonbae Kim, Hyocheol Ahn

Comments 17 pages, 5 Figures, 4 Tables

Abstract

Multi-intent natural language understanding requires retrieval systems that simultaneously achieve high accuracy and computational efficiency, yet existing approaches apply either uniform single-step retrieval that compromises recall or fixed-depth hierarchical decomposition that introduces excessive latency regardless of query complexity. This paper proposes Adaptive Tree-of-Retrieval (Adaptive ToR), a complexity-aware retrieval architecture that dynamically configures retrieval topology based on query characteristics. The system integrates four components: (1) a Query Tree Classifier computing a Query Complexity Index from weighted linguistic signals to route queries to either a rapid single-step path or an adaptive-depth hierarchical path; (2) a Tree-Based Retrieval module that recursively decomposes complex queries into focused sub-queries calibrated to predicted complexity; (3) an Adaptive Pruning Module employing two-stage filtering combining quantitative similarity gating with semantic relevance evaluation to suppress exponential node growth; and (4) a Retrieval Reranking Layer featuring a deduplicator-first pipeline and global LLM rescoring for production efficiency. Evaluation on the NLU++ benchmark (2,693 multi-intent queries across Banking and Hotel domains) yields 29.07% Subset Accuracy and 71.79% Micro-F1, a 9.7% relative improvement over fixed-depth baselines, while reducing latency by 37.6%, LLM invocations by 43.0%, and token consumption by 9.8%. Depth-wise analysis reveals that 26.92% of queries resolve within three seconds (2.45s mean latency) via single-step routing (d=0: 37.9% Subset Accuracy, 74.8% Micro-F1), while token consumption scales by 4.9x across depths, validating complexity-aware resource allocation and establishing Pareto-optimal balance across accuracy, latency, and computational efficiency.
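
The routing step described in component (1) can be illustrated with a minimal sketch. The abstract does not list the linguistic signals, their weights, or the routing threshold, so every signal, constant, and function name below is a hypothetical stand-in chosen for illustration, not the authors' implementation.

```python
import re
from typing import Tuple

# Hypothetical weights, threshold, and depth cap; the abstract gives no actual values.
WEIGHTS = {"conjunctions": 0.5, "clauses": 0.3, "length": 0.2}
THRESHOLD = 0.4
MAX_DEPTH = 3

def complexity_index(query: str) -> float:
    """Weighted sum of simple surface signals, each clipped to [0, 1]."""
    tokens = re.findall(r"[a-z']+", query.lower())
    conjunctions = sum(t in {"and", "or", "also", "then", "but"} for t in tokens)
    clause_marks = len(re.findall(r"[,;?]", query))
    signals = {
        "conjunctions": min(conjunctions / 2, 1.0),
        "clauses": min(clause_marks / 2, 1.0),
        "length": min(len(tokens) / 20, 1.0),
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())

def route(query: str) -> Tuple[str, int]:
    """Return (path, depth): depth 0 for the single-step path, 1..MAX_DEPTH otherwise."""
    qci = complexity_index(query)
    if qci < THRESHOLD:
        return ("single_step", 0)
    # Scale the margin above the threshold into a decomposition depth.
    depth = min(MAX_DEPTH, 1 + int((qci - THRESHOLD) / (1 - THRESHOLD) * MAX_DEPTH))
    return ("hierarchical", depth)

simple = route("cancel my card")
compound = route("cancel my card and book a room, then tell me the balance")
```

In this toy setup a short single-intent query stays on the single-step path, while a query with several conjunctions is sent to the hierarchical path with a depth that grows with its score.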

2604.24201 2026-04-28 cs.LG q-bio.GN q-bio.MN

CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification

Boyang Fan, Hengchuang Yin, Siyu Yi, Yifan Wang, Zhicheng Li, Leijiyu Zhou, Jiancheng Lv, Wei Ju

Comments 24 pages, 15 figures, 13 tables, 2 algorithms (main paper + supplementary materials)

2604.24198 2026-04-28 cs.CL cs.AI cs.CE cs.LG cs.MA

Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

Zhisong Qiu, Shuofei Qiao, Kewei Xu, Yuqi Zhu, Lun Du, Ningyu Zhang, Huajun Chen

Comments Work in progress

2604.24197 2026-04-28 cs.CL cs.AI

Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk

Shuai Wu, Xue Li, Yanna Feng, Yufang Li, Zhijun Wang, Ran Wang

Comments Technical report, 20 pages, 15 figures, 2 tables, 1 algorithm

2604.24193 2026-04-28 cs.CV

Computer Vision-Based Early Detection of Container Loss at Sea

Vishakha Lall, Capt. Stanley S Pinto, Capt. Chu Xing Peng, Wu Kaiwen

Comments Accepted and Presented at SMRC x ICMASS/MTEC 2026

2604.24191 2026-04-28 cs.CV

Omni-o3: Deep Nested Omnimodal Deduction for Deliberative Audio-Visual Reasoning

Zhicheng Zhang, Wentao Gu, Weicheng Wang, Yongjie Zhu, Wenyu Qin, Meng Wang, Pengfei Wan, Jufeng Yang

2604.24188 2026-04-28 cs.RO cs.GR

Generalizable Friction Coefficient Estimation via Material Embedding and Proxy Interaction Modeling

Zhendong Wang, Huamin Wang

2604.24187 2026-04-28 cs.CV

Multivariate Gaussian NeRF for Wide Field-of-View Ultrasound Reconstruction

Patris Valera, Magdalena Wysocki, Felix Duelmer, Mohammad Farid Azampour, Sebastian Herz, Stefan Wörz, Nassir Navab

2604.24186 2026-04-28 cs.CL cs.AI

MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning

Yimin Deng, Zhenxi Lin, Yejing Wang, Guoshuai Zhao, Pengyue Jia, Zichuan Fu, Derong Xu, Yefeng Zheng, Xiangyu Zhao, Li Zhu, Xian Wu, Xueming Qian

Comments ACL 2026 findings

2604.24182 2026-04-28 cs.RO

$M^2$-VLA: Boosting Vision-Language Models for Generalizable Manipulation via Layer Mixture and Meta-Skills

Siyao Xiao, Yuhong Zhang, Zhifang Liu, Zihan Gao, Jingye Zhang, Sinwai Choo, Dake Zhong, Mengzhe Wang, Xiao Lin, Xianfeng Zhou, Jia Jia, Haoqian Wang

2604.24178 2026-04-28 cs.LG cs.AI

Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment

Wenzhe Xu, Biao Liu, Yiyang Sun, Xin Geng, Ning Xu

2604.24176 2026-04-28 cs.AI

Explanation Quality Assessment as Ranking with Listwise Rewards

Thomas Bailleux, Tanmoy Mukherjee, Emmanuel Lonca, Pierre Marquis, Zied Bouraoui

2604.24175 2026-04-28 cs.CL cs.AI

AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models

Yimin Deng, Yejing Wang, Zhenxi Lin, Zichuan Fu, Guoshuai Zhao, Derong Xu, Yefeng Zheng, Xiangyu Zhao, Xian Wu, Li Zhu, Xueming Qian

Comments ACL 2026 findings

2604.24171 2026-04-28 cs.CV

POCA: Pareto-Optimal Curriculum Alignment for Visual Text Generation

Yaohou Fan, Qingzhong Wang, Yongsong Huang, Junyi Liu, Tomo Miyazaki, Shinichiro Omachi

Comments Accepted by CVPR 2026

2604.24170 2026-04-28 cs.AI

Credal Concept Bottleneck Models for Epistemic-Aleatoric Uncertainty Decomposition

Tanmoy Mukherjee, Thomas Bailleux, Pierre Marquis, Zied Bouraoui

2604.24167 2026-04-28 cs.CV cs.GR cs.LG

PEPS: Positional Encoding Projected Sampling -- Extended

Guillaume Perez, Janarbek Matai, Takahiro Harada

2604.24163 2026-04-28 cs.CV

Robust Deepfake Detection, NTIRE 2026 Challenge: Report

Benedikt Hopf, Radu Timofte, Chenfan Qu, Junchi Li, Fei Wu, Dagong Lu, Mufeng Yao, Xinlei Xu, Fengjun Guo, Yongwei Tang, Zhiqiang Yang, Zhiqiang Wu, Jia Wen Seow, Hong Vin Koay, Haodong Ren, Feng Xu, Shuai Chen, Minh-Khoa Le-Phan, Minh-Hoang Le, Trong-Le Do, Minh-Triet Tran, Chih-Yu Jian, Yi-Fan Wang, Bang-Kang Chen, You-Chen Chao, Chia-Ming Lee, Fu-En Yang, Yu-Chiang Frank Wang, Chih-Chung Hsu, Aashish Negi, Hardik Sharma, Prateek Shaily, Jayant Kumar, Sachin Chaudhary, Akshay Dudhane, Praful Hambarde, Amit Shukla, Jielun Peng, Yabin Wang, Yaqi Li, Jincheng Liu, Xiaopeng Hong, Krish Wadhwani, Liam Fitzpatrick, Utkarsh Tiwari, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Cristian Lazo Quispe, Aishwarya A, Akshara S, Ashwathi N, Jiachen Tu, Guoyi Xu, Yaoxin Jiang, Jiajia Liu, Yaokun Shi

2604.24158 2026-04-28 cs.AI

Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop

Ashmi Banerjee, Adithi Satish, Wolfgang Wörndl, Yashar Deldjoo

2604.24154 2026-04-28 cs.LG cs.AI

Progressive Approximation in Deep Residual Networks: Theory and Validation

Wei Wang, Xiao-Yong Wei, Qing Li

2604.24153 2026-04-28 cs.AI

Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems

Gadi Lavi

Comments 14 pages, 3 figures. Introduces a pre-execution decision protocol for AI systems

2604.24149 2026-04-28 cs.CV

6thGrid-Net: Unified Remote Sensing Image Dehazing Based on Color Restoration and Edge-Preserving

Runci Bai, Kui Jiang, Xiang Chen, Chen Wu, Dianjie Lu, Guijuan Zhang, Zhuoran Zheng

2604.24146 2026-04-28 cs.CV

EXACT: an explainable anomaly-aware vision foundation model for analysis of 3D chest CT

Xuguang Bai, Mingxuan Liu, Tongxi Song, Yifei Chen, Hongjia Yang, Kasidit Anmahapong, Zihan Li, Ying Zhou, Qiyuan Tian

2604.24143 2026-04-28 cs.LG

Machine-Learning-Based Classification of Radio Frequency Building Loss

Jiayi Tan, Neelabhro Roy, James Gross, Rohit Chandra, Tsao-Tsen Chen

Comments Accepted as a short paper in International Conference on Telecommunications (ICT) 2026

2604.24127 2026-04-28 cs.LG cs.AI

Leveraging Human Feedback for Semantically-Relevant Skill Discovery

Maxence Hussonnois, Thommen George Karimpanal, Santu Rana

Comments Accepted at the 28th International Conference on Pattern Recognition (ICPR 2026)

2604.24126 2026-04-28 cs.CL

Psychologically-Grounded Graph Modeling for Interpretable Depression Detection

Rishitej Reddy Vyalla, Kritarth Prasad, Avinash Anand, Erik Cambria, Shaoxiong Ji, Faten S. Alamri, Zhengkui Wang

2604.24125 2026-04-28 cs.CV

Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images

Jinkun Dai, Yuanxin Ye, Peng Tang, Tengfeng Tang, Xianping Ma, Jing Xiao, Mi Wang

2604.24123 2026-04-28 cs.CV

FDIM: A Feature-distance-based Generic Video Quality Metric for Versatile Codecs

Jiayi Wang, Lichun Zhang, Xiaoqi Zhuang, Jiaqi Zhang, Lu Yu, Yin Zhao

2604.24114 2026-04-28 cs.CL

IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning

Navya Gupta, Rishitej Reddy Vyalla, Avinash Anand, Chhavi Kirtani, Erik Cambria, Zhengchen Zhang, Zhengkui Wang, Timothy Liu, Aik Beng Ng, Simon See, Rajiv Ratn Shah

Comments Accepted in ACL main

2604.24109 2026-04-28 cs.CV

SemiSAM-O1: How far can we push the boundary of annotation-efficient medical image segmentation?

Yichi Zhang, Le Xue, Bichun Xu, Judong Luo, Zhigang Wu, Yu Fu, Zixin Hu, Yuan Cheng, Yuan Qi

Abstract

Semi-supervised learning (SSL) has become a promising solution to alleviate the annotation burden of deep learning-based medical image segmentation models. While recent advances in foundation model-driven SSL have pushed the boundary to extremely limited annotation scenarios, they fail to maintain robust competitive performance in complex imaging modalities. In this paper, we propose SemiSAM-O1, an annotation-efficient framework using only one annotated template image for segmentation. SemiSAM-O1 extends the specialist-generalist collaborative learning framework to the extreme one-label setting by fully exploiting the foundation model's feature representation capability beyond its prompting interface. SemiSAM-O1 operates in two stages. In the first stage, the foundation model's encoder extracts dense features from all volumes, and class prototypes derived from the single annotated template are propagated to the unlabeled pool via feature similarity to produce coarse initial pseudo-labels. In the second stage, an iterative training-and-refinement loop progressively improves both the segmentation model and the pseudo-labels over multiple rounds, where each round trains the model from scratch on current pseudo-labels and generates updated predictions with voxel-wise uncertainty estimates. An uncertainty-guided refinement step further leverages the foundation model's global feature space to correct high-uncertainty regions by aggregating labels from their most similar confident neighbors, establishing a virtuous cycle of mutual improvement. Extensive experiments on a wide range of segmentation tasks across different modalities and anatomical targets demonstrate that SemiSAM-O1 significantly narrows the performance gap between one-label semi-supervised learning and full supervision, while significantly reducing the computational overhead of online foundation model inference.
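
The first stage (class prototypes propagated to unlabeled data by feature similarity) can be sketched as follows. This is an illustrative toy under stated assumptions: the two-dimensional vectors stand in for the dense encoder features, cosine similarity is assumed as the similarity measure, and all function names are hypothetical. The abstract does not specify these details.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]

def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def class_prototypes(features: Sequence[Vector], labels: Sequence[int]) -> Dict[int, List[float]]:
    """Average the template's feature vectors per class label."""
    grouped: Dict[int, List[Vector]] = {}
    for feat, lab in zip(features, labels):
        grouped.setdefault(lab, []).append(feat)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}

def propagate(prototypes: Dict[int, List[float]], unlabeled: Sequence[Vector]) -> List[int]:
    """Assign each unlabeled feature the label of its most similar prototype."""
    return [max(prototypes, key=lambda lab: cosine(feat, prototypes[lab]))
            for feat in unlabeled]

# Toy data: features from the single annotated template (assumed values).
template_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
template_labels = [0, 0, 1, 1]
prototypes = class_prototypes(template_features, template_labels)
pseudo_labels = propagate(prototypes, [[0.8, 0.2], [0.2, 0.7]])
```

The resulting `pseudo_labels` play the role of the coarse initial pseudo-labels that the second stage would then refine. The refinement loop and uncertainty estimates are not sketched here.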

2604.24104 2026-04-28 cs.CL

Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising

Aditya Hemant Shahane, Anuj Kumar Sirohi, Tanmoy Chakraborty, Prathosh A P, Sandeep Kumar