arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2603.29616 2026-04-01 cs.CV

Video-Oasis: Rethinking Evaluation of Video Understanding

Geuntaek Lim, Minho Shim, Sungjune Park, Jaeyun Lee, Inwoong Lee, Taeoh Kim, Dongyoon Wee, Yukyung Choi

2603.29608 2026-04-01 cs.CL

Learning Diagnostic Reasoning for Decision Support in Toxicology

Nico Oberländer, David Bani-Harouni, Tobias Zellner, Nassir Navab, Florian Eyer, Matthias Keicher

2603.29591 2026-04-01 cs.CV

FlowID : Enhancing Forensic Identification with Latent Flow-Matching Models

Jules Ripoll, David Bertoin, Alasdair Newson, Charles Dossal, Jose Pablo Baraybar

2603.29578 2026-04-01 cs.CV

Emotion Diffusion Classifier with Adaptive Margin Discrepancy Training for Facial Expression Recognition

Rongkang Dong, Cuixin Yang, Cong Zhang, Yushen Zuo, Kin-Man Lam

2603.29570 2026-04-01 cs.CV cs.AI

Generating Key Postures of Bharatanatyam Adavus with Pose Estimation

Jagadish Kashinath Kamble, Jayanta Mukhopadhyay, Debaditya Roy, Partha Pratim Das

Comments Published in ICVGIP, 2025

Journal ref In Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2025

详情

DOI: 10.1145/3774521.3774561

英文摘要

Preserving intangible cultural dances rooted in centuries of tradition and governed by strict structural and symbolic rules presents unique challenges in the digital era. Among these, Bharatanatyam, a classical Indian dance form, stands out for its emphasis on codified adavus and precise key postures. Accurately generating these postures is crucial not only for maintaining anatomical and stylistic integrity, but also for enabling effective documentation, analysis, and transmission to broader global audiences through digital means. We propose a pose-aware generative framework integrated with a pose estimation module, guided by keypoint-based loss and pose consistency constraints. These supervisory signals ensure anatomical accuracy and stylistic integrity in the synthesized outputs. We evaluate four configurations: standard conditional generative adversarial network (cGAN), cGAN with pose supervision, conditional diffusion, and conditional diffusion with pose supervision. Each model is conditioned on key posture class labels and optimized to maintain geometric structure. In both cGAN and conditional diffusion settings, the integrated pose guidance aligns generated poses with ground-truth keypoint structures, promoting cultural fidelity. Our results demonstrate that incorporating pose supervision significantly enhances the quality, realism, and authenticity of generated Bharatanatyam postures. This framework provides a scalable approach for the digital preservation, education, and dissemination of traditional dance forms, enabling high-fidelity generation without compromising cultural precision. Code is available at https://github.com/jagidsh/Generating-Key-Postures-of-Bharatanatyam-Adavus-with-Pose-Estimation.

URL PDF HTML ☆

赞 0 踩 0

2603.29566 2026-04-01 cs.LG math.AG

The Geometry of Polynomial Group Convolutional Neural Networks

Yacoub Hendi, Daniel Persson, Magdalena Larfors

Comments 22 pages, 2 figures

2603.29559 2026-04-01 cs.CL cs.CY

When Can We Trust LLM Graders? Calibrating Confidence for Automated Assessment

Robinson Ferrer, Damla Turgut, Zhongzhou Chen, Shashank Sonkar

2603.29557 2026-04-01 cs.AI cs.CL

FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

Qiyao Wang, Hongbo Wang, Longze Chen, Zhihao Yang, Guhong Chen, Hamid Alinejad-Rokny, Hui Li, Yuan Lin, Min Yang

Comments 30 pages, 11 figures, 15 tables

2603.29555 2026-04-01 cs.LG math.PR

Total Variation Guarantees for Sampling with Stochastic Localization

Jakob Kellermann

Comments 12 pages main body, 13 pages Appendix

2603.29554 2026-04-01 cs.LG

Capturing Multivariate Dependencies of EV Charging Events: From Parametric Copulas to Neural Density Estimation

Martin Výboh, Gabriela Grmanová

Comments 5 pages, 3 figures. Submitted to IEEE PES ISGT Europe 2026

2603.28610 2026-04-01 cs.CV cs.AI cs.CL

ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning

Huanxuan Liao, Zhongtao Jiang, Yupu Hao, Yuqiao Tan, Shizhu He, Ben Wang, Jun Zhao, Kun Xu, Kang Liu

Comments work in progress

2603.28594 2026-04-01 cs.CV cs.AI cs.CR cs.RO

Detection of Adversarial Attacks in Robotic Perception

Ziad Sharawy, Mohammad Nakshbandi, Sorin Mihai Grigorescu

Comments 9 pages, 6 figures. Accepted and presented at STE 2025, Transilvania University of Brasov, Romania

2603.28537 2026-04-01 cs.CL

Training data generation for context-dependent rubric-based short answer grading

Pavel Šindelář, Dávid Slivka, Christopher Bouma, Filip Prášil, Ondřej Bojar

2603.28460 2026-04-01 cs.CV cs.LG

$R_\text{dm}$: Re-conceptualizing Distribution Matching as a Reward for Diffusion Distillation

Linqian Fan, Peiqin Sun, Tiancheng Wen, Shun Lu, Chengru Song

2603.28021 2026-04-01 cs.SD

Audio Language Model for Deepfake Detection Grounded in Acoustic Chain-of-Thought

Runkun Chen, Yixiong Fang, Pengyu Chang, Yuante Li, Massa Baali, Bhiksha Raj

2603.27999 2026-04-01 cs.CV

CLIP-AUTT: Test-Time Personalization with Action Unit Prompting for Fine-Grained Video Emotion Recognition

Muhammad Osama Zeeshan, Masoumeh Sharafi, Benoît Savary, Alessandro Lameiras Koerich, Marco Pedersoli, Eric Granger

详情

英文摘要

Personalization in emotion recognition (ER) is essential for an accurate interpretation of subtle and subject-specific expressive patterns. Recent advances in vision-language models (VLMs) such as CLIP demonstrate strong potential for leveraging joint image-text representations in ER. However, CLIP-based methods either depend on CLIP's contrastive pretraining or on LLMs to generate descriptive text prompts, which are noisy, computationally expensive, and fail to capture fine-grained expressions, leading to degraded performance. In this work, we leverage Action Units (AUs) as structured textual prompts within CLIP to model fine-grained facial expressions. AUs encode the subtle muscle activations underlying expressions, providing localized and interpretable semantic cues for more robust ER. We introduce CLIP-AU, a lightweight AU-guided temporal learning method that integrates interpretable AU semantics into CLIP. It learns generic, subject-agnostic representations by aligning AU prompts with facial dynamics, enabling fine-grained ER without CLIP fine-tuning or LLM-generated text supervision. Although CLIP-AU models fine-grained AU semantics, it does not adapt to subject-specific variability in subtle expressions. To address this limitation, we propose CLIP-AUTT, a video-based test-time personalization method that dynamically adapts AU prompts to videos from unseen subjects. By combining entropy-guided temporal window selection with prompt tuning, CLIP-AUTT enables subject-specific adaptation while preserving temporal consistency. Our extensive experiments on three challenging video-based subtle ER datasets, BioVid, StressID, and BAH, indicate that CLIP-AU and CLIP-AUTT outperform state-of-the-art CLIP-based FER and TTA methods, achieving robust and personalized subtle ER. Our code is publicly available at: https://github.com/osamazeeshan/CLIP-AUTT.

URL PDF HTML ☆

赞 0 踩 0

2603.27971 2026-04-01 cs.LG

Principal Prototype Analysis on Manifold for Interpretable Reinforcement Learning

Bodla Krishna Vamshi, Haizhao Yang

2603.27959 2026-04-01 cs.CV

MathGen: Revealing the Illusion of Mathematical Competence through Text-to-Image Generation

Ruiyao Liu, Hui Shen, Ping Zhang, Yunta Hsieh, Yifan Zhang, Jing Xu, Sicheng Chen, Junchen Li, Jiawei Lu, Jianing Ma, Jiaqi Mo, Qi Han, Zhen Zhang, Zhongwei Wan, Jing Xiong, Xin Wang, Ziyuan Liu, Hangrui Cao, Ngai Wong

2603.27516 2026-04-01 cs.CV

SGS-Intrinsic: Semantic-Invariant Gaussian Splatting for Sparse-View Indoor Inverse Rendering

Jiahao Niu, Rongjia Zheng, Wenju Xu, Wei-Shi Zheng, Qing Zhang

Comments CVPR2026

2603.26984 2026-04-01 cs.CV

A Provable Energy-Guided Test-Time Defense Boosting Adversarial Robustness of Large Vision-Language Models

Mujtaba Hussain Mirza, Antonio D'Orazio, Odelia Melamed, Iacopo Masi

Comments Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026, Main Conference

2603.24680 2026-04-01 cs.CV

ReDiPrune: Relevance-Diversity Pre-Projection Token Pruning for Efficient Multimodal LLMs

An Yu, Ting Yu Tsai, Zhenfei Zhang, Weiheng Lu, Felix X. -F. Ye, Ming-Ching Chang

2603.22387 2026-04-01 cs.CV

Efficient Universal Perception Encoder

Chenchen Zhu, Saksham Suri, Cijo Jose, Maxime Oquab, Marc Szafraniec, Wei Wen, Yunyang Xiong, Patrick Labatut, Piotr Bojanowski, Raghuraman Krishnamoorthi, Vikas Chandra

Comments Code: https://github.com/facebookresearch/EUPE; Model: https://huggingface.co/collections/facebook/eupe

2603.22292 2026-04-01 cs.LG cs.AI cs.RO

Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning

Janaka Chathuranga Brahmanage, Akshat Kumar

Comments Accepted to the 36th International Conference on Automated Planning and Scheduling (ICAPS 2026)

2603.20182 2026-04-01 cs.RO cs.MA

IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning

Fan Yang, Soumya Teotia, Shaunak A. Mehta, Prajit KrisshnaKumar, Quanting Xie, Jun Liu, Yueqi Song, Wenkai Li, Atsunori Moteki, Kanji Uchino, Yonatan Bisk

2603.18804 2026-04-01 cs.RO cs.HC

"You've got a friend in me": Co-Designing a Peer Social Robot for Young Newcomers' Language and Cultural Learning

Neil Fernandes, Cheng Tang, Tehniyat Shahbaz, Alex Hauschildt, Emily Davies-Robinson, Yue Hu, Kerstin Dautenhahn

2603.17875 2026-04-01 cs.LG math.OC

Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs

Abhishek Gupta, Aditya Mahajan

2603.17624 2026-04-01 cs.CL

Do Language Models Encode Semantic Relations? Probing and Sparse Feature Analysis

Andor Diera, Ansgar Scherp

Comments accepted at LREC 2026

2603.09085 2026-04-01 cs.LG cs.AI

Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting

Alvaro Paredes Amorin, Andre Python, Christoph Weisser

2602.19146 2026-04-01 cs.CV cs.CL

VIGiA: Instructional Video Guidance via Dialogue Reasoning and Retrieval

Diogo Glória-Silva, David Semedo, João Maglhães

Comments Published at EACL 2026 Findings

Journal ref Findings of the Association for Computational Linguistics: EACL 2026

2602.19123 2026-04-01 cs.CV

StreetTree: A Large-Scale Global Benchmark for Fine-Grained Tree Species Classification

Jiapeng Li, Yingjing Huang, Fan Zhang, Yu liu