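arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.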

2604.13171 2026-04-16 cs.CV

3DRealHead: Few-Shot Detailed Head Avatar

Jalees Nehvi, Timo Bolkart, Thabo Beeler, Justus Thies

Abstract

The human face is central to communication. For immersive applications, the digital presence of a person should mirror the physical reality, capturing the user's idiosyncrasies and detailed facial expressions. However, current 3D head avatar methods often struggle to faithfully reproduce the identity and facial expressions, despite having multi-view data or learned priors. Learning priors that capture the diversity of human appearances, especially for regions with highly person-specific features such as the mouth and teeth, is challenging because the underlying training data is limited. In addition, many avatar methods rely purely on 3D morphable model (3DMM)-based expression control, which strongly limits expressivity. To address these challenges, we introduce 3DRealHead, a few-shot head avatar reconstruction method with a novel expression control signal that is extracted from a monocular video stream of the subject. Specifically, the subject can take a few pictures of themselves, recover a 3D head avatar, and drive it with a consumer-level webcam. The avatar reconstruction is enabled via a novel few-shot inversion process of a 3D human head prior, which is represented as a Style U-Net that emits 3D Gaussian primitives that can be rendered under novel views. The prior is learned on the NeRSemble dataset. For animating the avatar, the U-Net is conditioned on 3DMM-based facial expression signals, as well as features of the mouth region extracted from the driving video. These additional mouth features allow us to recover facial expressions that cannot be represented by the 3DMM, leading to higher expressivity and closer resemblance to the physical reality.

2604.13153 2026-04-16 cs.CV cs.CR cs.LG

PatchPoison: Poisoning Multi-View Datasets to Degrade 3D Reconstruction

Prajas Wadekar, Venkata Sai Pranav Bachina, Kunal Bhosikar, Ankit Gangwal, Charu Sharma

Comments CVPR Workshop on Security, Privacy, and Adversarial Robustness in 3D Generative Vision Models (SPAR-3D), 2026

2604.13151 2026-04-16 cs.AI

Exploration and Exploitation Errors Are Measurable for Language Model Agents

Jaden Park, Jungtaek Kim, Jongwon Jeong, Robert D. Nowak, Kangwook Lee, Yong Jae Lee

2604.13142 2026-04-16 cs.RO cs.CV cs.DB

Multi-modal panoramic 3D outdoor datasets for place categorization

Hojung Jung, Yuki Oto, Oscar M. Mozos, Yumi Iwashita, Ryo Kurazume

Comments This is the authors' manuscript. The final published article was presented at IROS 2016 and is available at https://doi.org/10.1109/IROS.2016.7759669

2604.13133 2026-04-16 cs.LG

Automated co-design of high-performance thermodynamic cycles via graph-based hierarchical reinforcement learning

Wenqing Li, Xu Feng, Peixue Jiang, Yinhai Zhu

Comments 21 pages, 8 figures

2604.13131 2026-04-16 cs.LG cs.CV

Depth-Resolved Coral Reef Thermal Fields from Satellite SST and Sparse In-Situ Loggers Using Physics-Informed Neural Networks

Alzayat Saleh, Mostafa Rahimi Azghadi

Comments 23 pages, 7 figures, submitted to Remote Sensing of Environment

2604.13130 2026-04-16 cs.LG stat.ML

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates

Saumya Goyal, Rohith Rongali, Ritabrata Ray, Barnabás Póczos

2604.13125 2026-04-16 cs.LG

Synthetic Tabular Generators Fail to Preserve Behavioral Fraud Patterns: A Benchmark on Temporal, Velocity, and Multi-Account Signals

Bhavana Sajja

Comments 28 pages, 5 figures. Submitted to DMLR (Journal of Data-centric Machine Learning Research). Code: https://github.com/bhavana3/synthetic-data-experiments

DOI: 10.5281/zenodo.19545114

Abstract

We introduce behavioral fidelity -- a third evaluation dimension for synthetic tabular data that measures whether generated data preserves the temporal, sequential, and structural behavioral patterns that distinguish real-world entity activity. Existing frameworks evaluate statistical fidelity (marginal distributions and correlations) and downstream utility (classifier AUROC on synthetic-trained models), but neither tests for the behavioral signals that operational detection and analysis systems actually rely on. We formalize a taxonomy of four behavioral fraud patterns (P1-P4) covering inter-event timing, burst structure, multi-account graph motifs, and velocity-rule trigger rates; define a degradation ratio metric calibrated to a real-data noise floor (1.0 = matches real variability, k = k-times worse); and prove that row-independent generators -- the dominant paradigm -- are structurally incapable of reproducing P3 graph motifs (Proposition 1) and produce non-positive within-entity IET autocorrelation (Proposition 2), making the positive burst fingerprint of fraud sequences unachievable regardless of architecture or training data size. We benchmark CTGAN, TVAE, GaussianCopula, and TabularARGN on IEEE-CIS Fraud Detection and the Amazon Fraud Dataset. All four fail severely: on IEEE-CIS, composite degradation ratios range from 24.4x (TVAE) to 39.0x (GaussianCopula); on Amazon FDB, row-independent generators score 81.6-99.7x, while TabularARGN achieves 17.2x. We document generator-specific failure modes and their resolutions. The P1-P4 framework extends to any domain with entity-level sequential tabular data, including healthcare and network security. We release our evaluation framework as open source.
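
The abstract describes the degradation ratio only in words, so the following is a rough Python sketch of one way such a ratio could be computed, not the authors' implementation. It assumes the ratio is the gap between a synthetic and a real behavioral statistic, divided by the gap between two disjoint real samples (the "noise floor"). The function names, the choice of lag-1 autocorrelation of inter-event times (IET) as the statistic, and the use of a mean over entities are all illustrative assumptions.

```python
from statistics import mean


def inter_event_times(timestamps):
    """Gaps between consecutive events of one entity."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def lag1_autocorr(xs):
    """Lag-1 autocorrelation; 0.0 when undefined (too short or constant)."""
    if len(xs) < 3:
        return 0.0
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(len(xs) - 1))
    return num / denom


def degradation_ratio(real_a, real_b, synthetic, stat):
    """Hypothetical ratio: synthetic-vs-real gap over real-vs-real gap.

    Each argument is a list of per-entity sequences; `stat` maps one
    sequence to a number. A value near 1.0 means the synthetic data is no
    further from real data than one real sample is from another.
    """
    def group_stat(group):
        return mean(stat(seq) for seq in group)

    noise_floor = abs(group_stat(real_a) - group_stat(real_b))
    if noise_floor == 0:
        raise ValueError("noise floor is zero; use larger real samples")
    return abs(group_stat(synthetic) - group_stat(real_a)) / noise_floor


def iet_burstiness(timestamps):
    """Assumed P1/P2-style statistic: lag-1 autocorrelation of IETs."""
    return lag1_autocorr(inter_event_times(timestamps))
```

Calling `degradation_ratio(real_half_1, real_half_2, synthetic, iet_burstiness)` on lists of per-entity timestamp sequences would give a single score in the "k-times worse" reading the abstract describes; the paper's composite score presumably aggregates several such statistics across P1-P4, which this sketch does not attempt.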

2604.13121 2026-04-16 cs.RO

Olfactory pursuit: catching a moving odor source in complex flows

Maurizio Carbone, Lorenzo Piro, Robin A. Heinonen, Luca Biferale, Massimo Cencini, Antonio Celani

2604.13119 2026-04-16 cs.SD

Melodic contour does not cluster: Reconsidering contour typology

Bas Cornelissen, Willem Zuidema, John Ashley Burgoyne, Henkjan Honing

Comments 16 pages, 8 figures, plus 5 pages of supplements

2604.13112 2026-04-16 cs.CV

A Lightweight Multi-Metric No-Reference Image Quality Assessment Framework for UAV Imaging

Koffi Titus Sergio Aglin, Anthony K. Muchiri, Celestin Nkundineza

Comments 13 pages, 5 figures, article

2604.13082 2026-04-16 cs.LG cs.AI

The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior

Laura Gomezjurado Gonzalez

Comments 19 pages, 10 figures

2604.13078 2026-04-16 cs.CL

IWLV-Ramayana: A Sarga-Aligned Parallel Corpus of Valmiki's Ramayana Across Indian Languages

Sumesh VP

Comments 9 pages, dataset paper, HuggingFace: insightpublica/ramayana-indic

2604.13077 2026-04-16 cs.CL

Can Large Language Models Reliably Extract Physiology Index Values from Coronary Angiography Reports?

Sofia Morgado, Filipa Valdeira, Niklas Sander, Diogo Ferreira, Marta Vilela, Miguel Menezes, Cláudia Soares

2604.13074 2026-04-16 cs.CL cs.CV

PersonaVLM: Long-Term Personalized Multimodal LLMs

Chang Nie, Chaoyou Fu, Yifan Zhang, Haihua Yang, Caifeng Shan

Comments Accepted by CVPR 2026. Project page: https://PersonaVLM.github.io

2604.13073 2026-04-16 cs.CL cs.AI cs.MM

OmniTrace: A Unified Framework for Generation-Time Attribution in Omni-Modal LLMs

Qianqi Yan, Yichen Guo, Ching-Chen Kuo, Shan Jiang, Hang Yin, Yang Zhao, Xin Eric Wang

2604.13072 2026-04-16 cs.CL cs.AI cs.LG

LiveClawBench: Benchmarking LLM Agents on Complex, Real-World Assistant Tasks

Xiang Long, Li Du, Yilong Xu, Fangcheng Liu, Haoqing Wang, Ning Ding, Ziheng Li, Jianyuan Guo, Yehui Tang

2604.13070 2026-04-16 cs.CL cs.AI

Curation of a Palaeohispanic Dataset for Machine Learning

Gonzalo Martínez-Fernández, Jose F Quesada, Agustín Riscos-Núñez, Francisco José Salguero-Lamillar

2604.13066 2026-04-16 cs.CL cs.AI cs.LG

Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

Andresa Rodrigues de Campos, David Lee, Imry Kissos, Piyush Paritosh

2604.13065 2026-04-16 cs.CL cs.AI cs.LO

Correct Chains, Wrong Answers: Dissociating Reasoning from Output in LLM Logic

Abinav Rao, Sujan Rachuri, Nikhil Vemuri

Comments 9 pages, 4 figures. ICLR 2026 Workshop on Logical Reasoning of LLMs

2604.13064 2026-04-16 cs.CL cs.CY

Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub

Haichuan Hu, Ye Shang, Quanjun Zhang

2604.13062 2026-04-16 cs.CL

Mathematical Reasoning Enhanced LLM for Formula Derivation: A Case Study on Fiber NLI Modelling

Yao Zhang, Yuchen Song, Xiao Luo, Shengnan Li, Xiaotian Jiang, Min Zhang, Danshi Wang

2604.13060 2026-04-16 cs.CL cs.LG cs.MM

Dental-TriageBench: Benchmarking Multimodal Reasoning for Hierarchical Dental Triage

Ziyi He, Yushi Feng, Shuangyu Yang, Yinghao Zhu, Xichen Zhang, Pak Chuen Patrick Tai, Hei Yuet Lo, Songying Wu, Weifa Yang, Lequan Yu

2604.13059 2026-04-16 cs.CL cs.AI

A Proactive EMR Assistant for Doctor-Patient Dialogue: Streaming ASR, Belief Stabilization, and Preliminary Controlled Evaluation

Zhenhai Pan, Yan Liu, Jia You

Comments 10 pages, 1 figure, 6 tables. Companion systems manuscript. Preliminary controlled evaluation in a simulated pilot setting

2604.13057 2026-04-16 cs.CL

A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews

Md. Naim Molla, Md Muhtasim Munif Fahim, Md. Binyamin, Md Jahid Hasan Imran, Tonmoy Shil, Nura Rayhan, Md Rezaul Karim

2604.13056 2026-04-16 cs.CL cs.AI

Text-as-Signal: Quantitative Semantic Scoring with Embeddings, Logprobs, and Noise Reduction

Hugo Moreira

Comments 14 pages, 5 figures, 2 tables. Preprint

2604.13055 2026-04-16 cs.CL cs.AI

WorkRB: A Community-Driven Evaluation Framework for AI in the Work Domain

Matthias De Lange, Warre Veys, Federico Retyk, Daniel Deniz, Warren Jouanneau, Mike Zhang, Aleksander Bielinski, Emma Jouffroy, Nicole Clobes, Nina Baranowska, David Graus, Marc Palyart, Rabih Zbib, Dimitra Gkatzia, Thomas Demeester, Tijl De Bie, Toine Bogers, Jens-Joris Decorte, Jeroen Van Hautte

Comments Community paper preprint

2604.13054 2026-04-16 cs.CL cs.AI cs.CV

Caption First, VQA Second: Knowledge Density, Not Task Format, Drives Multimodal Scaling

Hongjian Zou, Yue Ge, Qi Ding, Yixuan Liao, Xiaoxin Chen

Comments 23 pages, 4 figures, 10 tables. Preprint

2604.13051 2026-04-16 cs.CL cs.LG

The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious

James Chua, Jan Betley, Samuel Marks, Owain Evans

Comments 16 pages

2604.12856 2026-04-16 cs.CV

PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination

Xuan Wang, Kai Ruan, Jiayi Han, Kaiyue Zhou, Gaoang Wang