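arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.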

2604.14622 2026-04-17 cs.CV

Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening

Junfeng Li, Wenyang Zhou, Xueheng Li, Xuanhua He, Jianhou Gan, Wenqi Ren

Abstract

In this work, we propose a Multigrain-aware Semantic Prototype Scanning paradigm for pan-sharpening, built upon a high-order RWKV architecture and a tri-token prompting mechanism derived from semantic clustering. Specifically, our method contains three key components: 1) Multigrain-aware Semantic Prototype Scanning. Although RWKV offers an efficient linear-complexity alternative to Transformers, its conventional bidirectional raster scanning is still semantic-agnostic and prone to positional bias. To address this issue, we introduce a semantic-driven scanning strategy that leverages locality-sensitive hashing to group semantically related regions and construct multi-grain semantic prototypes, enabling context-aware token reordering and more coherent global interaction. 2) Tri-token Prompt Learning. We design a tri-token prompting mechanism consisting of a global token, cluster-derived prototype tokens, and a learnable register token. The global and prototype tokens provide complementary semantic priors for RWKV modeling, while the register token helps suppress noisy and artifact-prone intermediate representations. 3) Invertible Q-Shift. To counteract the loss of spatial details, we apply center difference convolution on the value pathway to inject high-frequency information, and introduce an invertible multi-scale Q-shift operation for efficient and lossless feature transformation without parameter-heavy receptive field expansion. Experimental results demonstrate the superiority of our method.
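The semantic-driven scanning idea in this abstract (hash tokens, group them, reorder, build prototypes) can be illustrated with a small sketch. This is not the authors' implementation: it assumes sign-random-projection hashing as the locality-sensitive hash, mean pooling for the prototypes, and the function names `lsh_bucket` and `semantic_reorder` are hypothetical.

```python
import random

def lsh_bucket(vec, planes):
    """Sign-random-projection hash: one bit per hyperplane (assumed LSH family)."""
    bits = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def semantic_reorder(tokens, n_planes=2, seed=0):
    """Group tokens by hash bucket and return a scan order plus its inverse.

    tokens: list of equal-length feature vectors in raster order.
    Returns (order, inverse, buckets, prototypes), where order[new] = old index,
    inverse[old] = new index, and prototypes maps bucket id -> mean vector.
    """
    rng = random.Random(seed)
    dim = len(tokens[0])
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
    buckets = [lsh_bucket(t, planes) for t in tokens]

    # Stable sort keeps raster order inside each bucket.
    order = sorted(range(len(tokens)), key=lambda i: buckets[i])
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos

    # One prototype per bucket: the mean of its member tokens (an assumption).
    groups = {}
    for i, b in enumerate(buckets):
        groups.setdefault(b, []).append(tokens[i])
    prototypes = {
        b: [sum(col) / len(members) for col in zip(*members)]
        for b, members in groups.items()
    }
    return order, inverse, buckets, prototypes

# Toy usage with four 2-D "tokens".
tokens = [[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1], [-0.9, -0.1]]
order, inverse, buckets, prototypes = semantic_reorder(tokens)
reordered = [tokens[i] for i in order]
restored = [reordered[inverse[i]] for i in range(len(tokens))]
```

Calling `semantic_reorder` with different `n_planes` values gives coarser or finer groupings, which is one plausible reading of "multi-grain"; the paper may construct its grains differently.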


2604.14619 2026-04-17 cs.SD cs.LG eess.AS q-fin.CP q-fin.ST

The Acoustic Camouflage Phenomenon: Re-evaluating Speech Features for Financial Risk Prediction

Dhruvin Dungrani, Disha Dungrani

2604.14616 2026-04-17 cs.CL cs.AI cs.LG

Retrieve, Then Classify: Corpus-Grounded Automation of Clinical Value Set Authoring

Sumit Mukherjee, Juan Shu, Nairwita Mazumder, Tate Kernell, Celena Wheeler, Shannon Hastings, Chris Sidey-Gibbons

Abstract

Clinical value set authoring -- the task of identifying all codes in a standardized vocabulary that define a clinical concept -- is a recurring bottleneck in clinical quality measurement and phenotyping. A natural approach is to prompt a large language model (LLM) to generate the required codes directly, but structured clinical vocabularies are large, version-controlled, and not reliably memorized during pretraining. We propose Retrieval-Augmented Set Completion (RASC): retrieve the $K$ most similar existing value sets from a curated corpus to form a candidate pool, then apply a classifier to each candidate code. Theoretically, retrieve-and-select can reduce statistical complexity by shrinking the effective output space from the full vocabulary to a much smaller retrieved candidate pool. We demonstrate the utility of RASC on 11,803 publicly available VSAC value sets, constructing the first large-scale benchmark for this task. A cross-encoder fine-tuned on SAPBert achieves AUROC 0.852 and value-set-level F1 0.298, outperforming a simpler three-layer Multilayer Perceptron (AUROC 0.799, F1 0.250); both reduce the number of irrelevant candidates per true positive from 12.3 (retrieval-only) to approximately 3.2 and 4.4, respectively. Zero-shot GPT-4o achieves value-set-level F1 0.105, with 48.6% of returned codes absent from VSAC entirely. This performance gap widens with increasing value set size, consistent with RASC's theoretical advantage. We observe similar performance gains across two other classifier model types, namely a cross-encoder initialized from pre-trained SAPBert and a LightGBM model, demonstrating that RASC's benefits extend beyond a single model class. The code to download and create the benchmark dataset, as well as the model training code, is available at: https://github.com/mukhes3/RASC
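The retrieve-then-classify pipeline described in this abstract can be sketched in a few lines. This is a toy illustration, not the code from the RASC repository: the embeddings, the codes, the cosine-similarity retriever, and the `score_code` classifier stand-in are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_candidates(query_vec, corpus, k):
    """Union of codes from the k value sets most similar to the query.

    corpus: list of (embedding, set_of_codes) pairs.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    pool = set()
    for _, codes in ranked[:k]:
        pool |= codes
    return pool

def complete_value_set(query_vec, corpus, k, score_code, threshold=0.5):
    """Retrieve a candidate pool, then keep codes the classifier accepts."""
    pool = retrieve_candidates(query_vec, corpus, k)
    return {code for code in pool if score_code(query_vec, code) >= threshold}

# Toy corpus: three existing value sets with 2-D embeddings (made-up data).
corpus = [
    ([1.0, 0.0], {"A", "B"}),
    ([0.9, 0.1], {"B", "C"}),
    ([0.0, 1.0], {"X"}),
]

# Stand-in for a trained classifier: accepts a fixed set of codes.
def score_code(query_vec, code):
    return 1.0 if code in {"A", "C", "X"} else 0.0

pool = retrieve_candidates([1.0, 0.0], corpus, k=2)
selected = complete_value_set([1.0, 0.0], corpus, k=2, score_code=score_code)
```

In this toy run, code "X" is never scored because it is not in the retrieved pool, which mirrors the abstract's point that the classifier only has to decide over the retrieved candidates rather than the full vocabulary.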


2604.14615 2026-04-17 cs.AI

CoDaS: AI Co-Data-Scientist for Biomarker Discovery via Wearable Sensors

Yubin Kim, Salman Rahman, Samuel Schmidgall, Chunjong Park, A. Ali Heydari, Ahmed A. Metwally, Hong Yu, Xin Liu, Xuhai Xu, Yuzhe Yang, Maxwell A. Xu, Zhihan Zhang, Cynthia Breazeal, Tim Althoff, Petar Sirkovic, Ivor Rendulic, Annalisa Pawlosky, Nicolas Stroppa, Juraj Gottweis, Elahe Vedadi, Alan Karthikesalingam, Pushmeet Kohli, Vivek Natarajan, Mark Malhotra, Shwetak Patel, Hae Won Park, Hamid Palangi, Daniel McDuff

2604.14612 2026-04-17 cs.LG cs.CL

ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding

Walaa Amer, Uday das, Fadi Kurdahi

Comments 13 pages, 9 figures

2604.14609 2026-04-17 cs.AI physics.comp-ph

El Agente Forjador: Task-Driven Agent Generation for Quantum Simulation

Zijian Zhang, Aiwei Yin, Amaan Baweja, Jiaru Bai, Ignacio Gustin, Varinia Bernales, Alán Aspuru-Guzik

2604.14602 2026-04-17 cs.CL cs.AI

CausalDetox: Causal Head Selection and Intervention for Language Model Detoxification

Yian Wang, Yuen Chen, Agam Goyal, Hari Sundaram

Comments Accepted to ACL 2026. 22 pages, 1 figure

2604.14595 2026-04-17 cs.CL

NLP needs Diversity outside of 'Diversity'

Joshua Tint

Comments 7 pages, 1 figure

2604.14591 2026-04-17 cs.CV

Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models

Amir El-Ghoussani, Marc Hölle, Gustavo Carneiro, Vasileios Belagiannis

Comments Accepted at the 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition Findings (CVPRF)

2604.14587 2026-04-17 cs.LG math.OC stat.ML

CLion: Efficient Cautious Lion Optimizer with Enhanced Generalization

Feihu Huang, Guanyi Zhang, Songcan Chen

Comments 30 pages

2604.14583 2026-04-17 cs.LG

From Risk to Rescue: An Agentic Survival Analysis Framework for Liquidation Prevention

Fernando Spadea, Oshani Seneviratne

2604.14582 2026-04-17 cs.CV

MapSR: Prompt-Driven Land Cover Map Super-Resolution via Vision Foundation Models

Ruiqi Wang, Qi Yu, Jie Ma, Hanlin Wu

2604.14580 2026-04-17 cs.CV cs.MM cs.SD

TurboTalk: Progressive Distillation for One-Step Audio-Driven Talking Avatar Generation

Xiangyu Liu, Feng Gao, Xiaomei Zhang, Yong Zhang, Xiaoming Wei, Zhen Lei, Xiangyu Zhu

2604.14576 2026-04-17 cs.AI

Enhancing Mental Health Counseling Support in Bangladesh using Culturally-Grounded Knowledge

Md Arid Hasan, Azhagu Meena SP, Aditya Khan, Abu Md Akteruzzaman Bhuiyan, Helal Uddin Ahmed, Joysree Debi, Farig Sadeque, Annie En-Shiun Lee, Syed Ishtiaque Ahmed

Comments submitted to CLPsych 2026

2604.14574 2026-04-17 cs.CV

M3D-Net: Multi-Modal 3D Facial Feature Reconstruction Network for Deepfake Detection

Haotian Wu, Yue Cheng, Shan Bian

2604.14570 2026-04-17 cs.CV

Deepfake Detection Generalization with Diffusion Noise

Hongyuan Qi, Wenjin Hou, Hehe Fan, Jun Xiao

Comments 17 pages

2604.14568 2026-04-17 cs.CV cs.CL

Learning Adaptive Reasoning Paths for Efficient Visual Reasoning

Yixu Huang, Tinghui Zhu, Muhao Chen

2604.14566 2026-04-17 cs.LG cs.SY eess.SY

Physics-Informed Machine Learning for Pouch Cell Temperature Estimation

Zheng Liu

Comments 4 pages, 2 figures

2604.14565 2026-04-17 cs.RO cs.SY eess.SY

Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion

Tomoya Kamimura, Haruka Washiyama, Akihito Sano

2604.14564 2026-04-17 cs.AI cs.CL

MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation

Pengfei Li, Shijie Wang, Fangyuan Li, Yikun Fu, Kaifeng Liu, Kaiyan Zhang, Dazhi Zhang, Yuqiang Li, Biqing Qi, Bowen Zhou

Comments Accepted by ACL 2026

2604.14563 2026-04-17 cs.CV

Revisiting Token Compression for Accelerating ViT-based Sparse Multi-View 3D Object Detectors

Mingqian Ji, Shanshan Zhang, Jian Yang

Comments Accepted by CVPR 2026

2604.14562 2026-04-17 cs.LG physics.app-ph physics.comp-ph

Material-Agnostic Zero-Shot Thermal Inference for Metal Additive Manufacturing via a Parametric PINN Framework

Hyeonsu Lee, Jihoon Jeong

Abstract

Accurate thermal modeling in metal additive manufacturing (AM) is essential for understanding the process-structure-performance relationship. While prior studies have explored generalization across unseen process conditions, they often require extensive datasets, costly retraining, or pre-training. Generalization across different materials also remains relatively unexplored due to the challenges posed by distinct material-dependent thermal behaviors. This paper introduces a parametric physics-informed neural network (PINN) framework for zero-shot generalization across arbitrary materials without labeled data, retraining, or pre-training. The framework adopts a decoupled parametric PINN architecture that separately encodes material properties and spatiotemporal coordinates, fusing them through conditional modulation to better align with the multiplicative role of material parameters in the governing equation and boundary conditions. Physics-guided output scaling derived from Rosenthal's analytical solution and a hybrid optimization strategy are further incorporated to enhance physical consistency, training stability, and convergence. Experiments on bare plate laser powder bed fusion (LPBF) across diverse metal alloys, including both in-distribution and out-of-distribution cases, demonstrate effective zero-shot generalizability along with superior training efficiency. Specifically, the proposed framework achieved up to a 64.2% reduction in relative L2 error compared to the non-parametric baseline while surpassing its performance within only 4.4% of the baseline training epochs. Ablation studies confirm that the proposed framework's components are broadly applicable to other PINN-based approaches. Overall, the proposed framework provides an efficient and scalable material-agnostic solution for zero-shot thermal modeling, contributing to more flexible and practical deployment in metal AM.
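The abstract mentions output scaling derived from Rosenthal's analytical solution. As background, the classical Rosenthal solution for a point heat source moving at constant speed along x over a semi-infinite solid can be evaluated as below. This is only a sketch of the textbook formula; how the paper turns it into an output scale is not stated in the abstract, and the function and parameter names here are hypothetical.

```python
import math

def rosenthal_temperature(x, y, z, t, power, absorptivity, speed,
                          conductivity, diffusivity, ambient):
    """Quasi-steady Rosenthal temperature for a moving point source.

    T = T0 + Q / (2*pi*k*R) * exp(-v * (xi + R) / (2*alpha)),
    with xi = x - v*t, R = sqrt(xi^2 + y^2 + z^2), Q = absorptivity * power.
    Units must be consistent (e.g. SI: m, s, W, W/(m*K), m^2/s, K).
    """
    xi = x - speed * t  # coordinate in the frame moving with the source
    r = math.sqrt(xi * xi + y * y + z * z)
    if r == 0.0:
        raise ValueError("Rosenthal solution is singular at the source location")
    q = absorptivity * power
    rise = q / (2.0 * math.pi * conductivity * r)
    return ambient + rise * math.exp(-speed * (xi + r) / (2.0 * diffusivity))

# Nondimensional toy values chosen so the prefactor equals 1 at R = 1.
behind = rosenthal_temperature(-1.0, 0.0, 0.0, 0.0, power=2.0 * math.pi,
                               absorptivity=1.0, speed=1.0, conductivity=1.0,
                               diffusivity=1.0, ambient=300.0)
ahead = rosenthal_temperature(1.0, 0.0, 0.0, 0.0, power=2.0 * math.pi,
                              absorptivity=1.0, speed=1.0, conductivity=1.0,
                              diffusivity=1.0, ambient=300.0)
```

Directly behind the source the exponential term equals 1, so the temperature rise decays only as 1/R, while ahead of the source it is additionally damped by exp(-vR/alpha); this asymmetry is what produces the elongated trailing melt-pool shape.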


2604.14560 2026-04-17 cs.CV

DVFace: Spatio-Temporal Dual-Prior Diffusion for Video Face Restoration

Zheng Chen, Bowen Chai, Rongjun Gao, Mingtao Nie, Xi Li, Bingnan Duan, Jianping Fang, Xiaohong Liu, Linghe Kong, Yulun Zhang

Comments Code is available at: https://github.com/zhengchen1999/DVFace

2604.14558 2026-04-17 cs.CV

The Fourth Challenge on Image Super-Resolution ($\times$4) at NTIRE 2026: Benchmark Results and Method Overview

Zheng Chen, Kai Liu, Jingkai Wang, Xianglong Yan, Jianze Li, Ziqing Zhang, Jue Gong, Jiatong Li, Lei Sun, Xiaoyang Liu, Radu Timofte, Yulun Zhang, Jihye Park, Yoonjin Im, Hyungju Chun, Hyunhee Park, MinKyu Park, Zheng Xie, Xiangyu Kong, Weijun Yuan, Zhan Li, Qiurong Song, Luen Zhu, Fengkai Zhang, Xinzhe Zhu, Junyang Chen, Congyu Wang, Yixin Yang, Zhaorun Zhou, Jiangxin Dong, Jinshan Pan, Shengwei Wang, Jiajie Ou, Baiang Li, Sizhuo Ma, Qiang Gao, Jusheng Zhang, Jian Wang, Keze Wang, Yijiao Liu, Yingsi Chen, Hui Li, Yu Wang, Congchao Zhu, Saeed Ahmad, Ik Hyun Lee, Jun Young Park, Ji Hwan Yoon, Kainan Yan, Zian Wang, Weibo Wang, Shihao Zou, Chao Dong, Wei Zhou, Linfeng Li, Jaeseong Lee, Jaeho Chae, Jinwoo Kim, Seonjoo Kim, Yucong Hong, Zhenming Yan, Junye Chen, Ruize Han, Song Wang, Yuxuan Jiang, Chengxi Zeng, Tianhao Peng, Fan Zhang, David Bull, Tongyao Mu, Qiong Cao, Yifan Wang, Youwei Pan, Leilei Cao, Xiaoping Peng, Wei Deng, Yifei Chen, Wenbo Xiong, Xian Hu, Yuxin Zhang, Xiaoyun Cheng, Yang Ji, Zonghao Chen, Zhihao Xue, Junqin Hu, Nihal Kumar, Snehal Singh Tomar, Klaus Mueller, Surya Vashisth, Prateek Shaily, Jayant Kumar, Hardik Sharma, Ashish Negi, Sachin Chaudhary, Akshay Dudhane, Praful Hambarde, Amit Shukla, Shijun Shi, Jiangning Zhang, Yong Liu, Kai Hu, Jing Xu, Xianfang Zeng, Amitesh M, Hariharan S, Chia-Ming Lee, Yu-Fan Lin, Chih-Chung Hsu, Nishalini K, Sreenath K A, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Shuling Zheng, Zhiheng Fu, Feng Zhang, Zhanglu Chen, Boyang Yao, Nikhil Pathak, Aagam Jain, Milan Kumar, Kishor Upla, Vivek Chavda, Sarang N S, Raghavendra Ramachandra, Zhipeng Zhang, Qi Wang, Shiyu Wang, Jiachen Tu, Guoyi Xu, Yaoxin Jiang, Jiajia Liu, Yaokun Shi, Yuqi Li, Chuanguang Yang, Weilun Feng, Zhuzhi Hong, Hao Wu, Junming Liu, Yingli Tian, Amish Bhushan Kulkarni, Tejas R R Shet, Saakshi M Vernekar, Nikhil Akalwadi, Kaushik Mallibhat, Ramesh Ashok Tabib, Uma Mudenagudi, Yuwen Pan, Tianrun Chen, Deyi Ji, Qi Zhu, Lanyun Zhu, Heyan Zhangyi

Comments NTIRE 2026 webpage: https://cvlai.net/ntire/2026. Code: https://github.com/zhengchen1999/NTIRE2026_ImageSR_x4

2604.14556 2026-04-17 cs.CV cs.AI

Controllable Video Object Insertion via Multiview Priors

Xia Qi, Peishan Cong, Yichen Yao, Ziyi Wang, Yaoqin Ye, Yuexin Ma

2604.14547 2026-04-17 cs.LG

Predicting Post-Traumatic Epilepsy from Clinical Records using Large Language Model Embeddings

Wenhui Cui, Nicholas Swingle, Anand A. Joshi, Dileep Nair, Richard M. Leahy

2604.14545 2026-04-17 cs.RO

CT-VIR: Continuous-Time Visual-Inertial-Ranging Fusion for Indoor Localization with Sparse Anchors

Yu-An Liu, Li Zhang

2604.14541 2026-04-17 cs.CV

Giving Faces Their Feelings Back: Explicit Emotion Control for Feedforward Single-Image 3D Head Avatars

Yicheng Gong, Jiawei Zhang, Liqiang Liu, Yanwen Wang, Lei Chu, Jiahao Li, Hao Pan, Hao Zhu, Yan Lu

2604.14540 2026-04-17 cs.CV

WILD-SAM: Phase-Aware Expert Adaptation of SAM for Landslide Detection in Wrapped InSAR Interferograms

Yucheng Pan, Heping Li, Zhangle Liu, Sajid Hussain, Bin Pan

2604.14534 2026-04-17 cs.LG stat.AP

An unsupervised decision-support framework for multivariate biomarker analysis in athlete monitoring

Fernando Barcelos Rosito, Sebastião De Jesus Menezes, Simone Ferreira Sturza, Adriana Seixas, Muriel Figueredo Franco

Comments 15 pages, 4 figures, 3 tables, submitted to Springer Nature Scientific Reports