arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2602.19914 2026-02-24 cs.AI

Watson & Holmes: A Naturalistic Benchmark for Comparing Human and LLM Reasoning

Thatchawin Leelawat, Lewis D Griffin

Comments 51 pages, 13 figures

2602.19907 2026-02-24 cs.CV cs.LG

Gradient based Severity Labeling for Biomarker Classification in OCT

Kiran Kokilepersaud, Mohit Prabhushankar, Ghassan AlRegib, Stephanie Trejo Corona, Charles Wykoff

Comments Accepted at International Conference on Image Processing (ICIP) 2022

2602.19898 2026-02-24 cs.RO

Athena: An Autonomous Open-Hardware Tracked Rescue Robot Platform

Stefan Fabian, Aljoscha Schmidt, Jonas Süß, Dishant, Aum Oza, Oskar von Stryk

Comments https://github.com/tu-darmstadt-ros-pkg/athena

2602.19896 2026-02-24 cs.CV

Monocular Mesh Recovery and Body Measurement of Female Saanen Goats

Bo Jin, Shichao Zhao, Jin Lyu, Bin Zhang, Tao Yu, Liang An, Yebin Liu, Meili Wang

Comments Accepted to AAAI2026

2602.19895 2026-02-24 cs.LG cs.CL

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

Zhongwei Wan, Yun Shen, Zhihao Dou, Donghao Zhou, Yu Zhang, Xin Wang, Hui Shen, Jing Xiong, Chaofan Tao, Zixuan Zhong, Peizhou Huang, Mi Zhang

2602.19893 2026-02-24 cs.LG stat.ML

Generalized Random Direction Newton Algorithms for Stochastic Optimization

Soumen Pachal, Prashanth L. A., Shalabh Bhatnagar, Avinash Achar

2602.19883 2026-02-24 cs.CL cs.LO

Denotational Semantics for ODRL: Knowledge-Based Constraint Conflict Detection

Daham Mustafa, Diego Collarana, Yixin Peng, Rafiqul Haque, Christoph Lange-Bever, Christoph Quix, Stephan Decker

Comments 17 pages, 6 tables. Working draft. Supplementary material (154 TPTP/SMT-LIB benchmarks, Isabelle/HOL theory file) will be made available at https://github.com/Daham-Mustaf/odrl-benchmark upon publication

2602.19881 2026-02-24 cs.CV cs.AI

Make Some Noise: Unsupervised Remote Sensing Change Detection Using Latent Space Perturbations

Blaž Rolih, Matic Fučka, Filip Wolf, Luka Čehovin Zajc

2602.19874 2026-02-24 cs.CV

BigMaQ: A Big Macaque Motion and Animation Dataset Bridging Image and 3D Pose Representations

Lucas Martini, Alexander Lappe, Anna Bognár, Rufin Vogels, Martin A. Giese

Journal ref International Conference on Learning Representations (ICLR), 2026

详情

英文摘要

The recognition of dynamic and social behavior in animals is fundamental for advancing ethology, ecology, medicine and neuroscience. Recent progress in deep learning has enabled automated behavior recognition from video, yet an accurate reconstruction of the three-dimensional (3D) pose and shape has not been integrated into this process. Especially for non-human primates, mesh-based tracking efforts lag behind those for other species, leaving pose descriptions restricted to sparse keypoints that are unable to fully capture the richness of action dynamics. To address this gap, we introduce the $\textbf{Big Ma}$ca$\textbf{Q}$ue 3D Motion and Animation Dataset ($\texttt{BigMaQ}$), a large-scale dataset comprising more than 750 scenes of interacting rhesus macaques with detailed 3D pose descriptions. Extending previous surface-based animal tracking methods, we construct subject-specific textured avatars by adapting a high-quality macaque template mesh to individual monkeys. This allows us to provide pose descriptions that are more accurate than previous state-of-the-art surface-based animal tracking methods. From the original dataset, we derive BigMaQ500, an action recognition benchmark that links surface-based pose vectors to single frames across multiple individual monkeys. By pairing features extracted from established image and video encoders with and without our pose descriptors, we demonstrate substantial improvements in mean average precision (mAP) when pose information is included. With these contributions, $\texttt{BigMaQ}$ establishes the first dataset that both integrates dynamic 3D pose-shape representations into the learning task of animal action recognition and provides a rich resource to advance the study of visual appearance, posture, and social interaction in non-human primates. The code and data are publicly available at https://martinivis.github.io/BigMaQ/ .

URL PDF HTML ☆

赞 0 踩 0

2602.19872 2026-02-24 cs.CV cs.AI

GOAL: Geometrically Optimal Alignment for Continual Generalized Category Discovery

Jizhou Han, Chenhao Ding, SongLin Dong, Yuhang He, Shaokun Wang, Qiang Wang, Yihong Gong

Comments Accept by AAAI 2026

2602.19870 2026-02-24 cs.CV

ApET: Approximation-Error Guided Token Compression for Efficient VLMs

Qiankun Ma, Ziyao Zhang, Haofei Wang, Jie Chen, Zhen Song, Hairong Zheng

Comments CVPR2026

2602.19855 2026-02-24 cs.CL

SHIELD: Semantic Heterogeneity Integrated Embedding for Latent Discovery in Clinical Trial Safety Signals

Francois Vandenhende, Anna Georgiou, Theodoros Psaras, Ellie Karekla

Comments 3 figures, 1 table

2602.19850 2026-02-24 cs.RO

TactiVerse: Generalizing Multi-Point Tactile Sensing in Soft Robotics Using Single-Point Data

Junhui Lee, Hyosung Kim, Saekwang Nam

Comments 6 pages, 7 figures, accepted at 9th IEEE-RAS International Conference on Soft Robotics (RoboSoft)

2602.19845 2026-02-24 cs.LG

I Dropped a Neural Net

Hyunwoo Park

2602.19840 2026-02-24 cs.CL

SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation

Jingzhuo Wu, Jiajun Zhang, Keyan Jin, Dehua Ma, Junbo Wang

2602.19832 2026-02-24 cs.CV

M3S-Net: Multimodal Feature Fusion Network Based on Multi-scale Data for Ultra-short-term PV Power Forecasting

Penghui Niu, Taotao Cai, Suqi Zhang, Junhua Gu, Ping Zhang, Qiqi Liu, Jianxin Li

详情

英文摘要

The inherent intermittency and high-frequency variability of solar irradiance, particularly during rapid cloud advection, present significant stability challenges to high-penetration photovoltaic grids. Although multimodal forecasting has emerged as a viable mitigation strategy, existing architectures predominantly rely on shallow feature concatenation and binary cloud segmentation, thereby failing to capture the fine-grained optical features of clouds and the complex spatiotemporal coupling between visual and meteorological modalities. To bridge this gap, this paper proposes M3S-Net, a novel multimodal feature fusion network based on multi-scale data for ultra-short-term PV power forecasting. First, a multi-scale partial channel selection network leverages partial convolutions to explicitly isolate the boundary features of optically thin clouds, effectively transcending the precision limitations of coarse-grained binary masking. Second, a multi-scale sequence to image analysis network employs Fast Fourier Transform (FFT)-based time-frequency representation to disentangle the complex periodicity of meteorological data across varying time horizons. Crucially, the model incorporates a cross-modal Mamba interaction module featuring a novel dynamic C-matrix swapping mechanism. By exchanging state-space parameters between visual and temporal streams, this design conditions the state evolution of one modality on the context of the other, enabling deep structural coupling with linear computational complexity, thus overcoming the limitations of shallow concatenation. Experimental validation on the newly constructed fine-grained PV power dataset demonstrates that M3S-Net achieves a mean absolute error reduction of 6.2% in 10-minute forecasts compared to state-of-the-art baselines. The dataset and source code will be available at https://github.com/she1110/FGPD.

URL PDF HTML ☆

赞 0 踩 0

2602.19828 2026-02-24 cs.CV

TextShield-R1: Reinforced Reasoning for Tampered Text Detection

Chenfan Qu, Yiwu Zhong, Jian Liu, Xuekang Zhu, Bohan Yu, Lianwen Jin

Comments AAAI 2026

2602.19823 2026-02-24 cs.CV

Open-vocabulary 3D scene perception in industrial environments

Keno Moenck, Adrian Philip Florea, Julian Koch, Thorsten Schüppstuhl

2602.19822 2026-02-24 cs.CV cs.AI

Efficient endometrial carcinoma screening via cross-modal synthesis and gradient distillation

Dongjing Shan, Yamei Luo, Jiqing Xuan, Lu Huang, Jin Li, Mengchu Yang, Zeyu Chen, Fajin Lv, Yong Tang, Chunxiang Zhang

2602.19815 2026-02-24 cs.CL

Keyboards for the Endangered Idu Mishmi Language

Akhilesh Kakolu Ramarao

2602.19788 2026-02-24 cs.LG

Bayesian Meta-Learning with Expert Feedback for Task-Shift Adaptation through Causal Embeddings

Lotta Mäkinen, Jorge Loría, Samuel Kaski

Comments 27 pages, 8 figures

2602.19785 2026-02-24 cs.LG cs.NE stat.ML

Unsupervised Anomaly Detection in NSL-KDD Using $β$-VAE: A Latent Space and Reconstruction Error Approach

Dylan Baptiste, Ramla Saddem, Alexandre Philippot, François Foyer

Journal ref 2025 15th France-Japan \& 13th Europe-Asia Congress on Mechatronics (MECATRONICS) / 23rd International Conference on Research and Education in Mechatronics (REM), Dec 2025, Saint-Ouen-sur-Seine, France. pp.1-6

2602.19782 2026-02-24 cs.LG

Addressing Instrument-Outcome Confounding in Mendelian Randomization through Representation Learning

Shimeng Huang, Matthew Robinson, Francesco Locatello

2602.19764 2026-02-24 cs.RO

Towards Dexterous Embodied Manipulation via Deep Multi-Sensory Fusion and Sparse Expert Scaling

Yirui Sun, Guangyu Zhuge, Keliang Liu, Jie Gu, Zhihao xia, Qionglin Ren, Chunxu tian, Zhongxue Ga

2602.19763 2026-02-24 cs.CV eess.IV

Training Deep Stereo Matching Networks on Tree Branch Imagery: A Benchmark Study for Real-Time UAV Forestry Applications

Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

2602.19753 2026-02-24 cs.CV cs.GR

RAP: Fast Feedforward Rendering-Free Attribute-Guided Primitive Importance Score Prediction for Efficient 3D Gaussian Splatting Processing

Kaifa Yang, Qi Yang, Yiling Xu, Zhu Li

Comments Accepted by CVPR 2026

2602.19735 2026-02-24 cs.CV

VGGT-MPR: VGGT-Enhanced Multimodal Place Recognition in Autonomous Driving Environments

Jingyi Xu, Zhangshuo Qi, Zhongmiao Yan, Xuyu Gao, Qianyun Jiao, Songpengcheng Xia, Xieyuanli Chen, Ling Pei

2602.19733 2026-02-24 cs.LG math.OC

Understanding the Curse of Unrolling

Sheheryar Mehmood, Florian Knoll, Peter Ochs

2602.19723 2026-02-24 cs.CV

Towards Personalized Multi-Modal MRI Synthesis across Heterogeneous Datasets

Yue Zhang, Zhizheng Zhuo, Siyao Xu, Shan Lv, Zhaoxi Liu, Jun Qiu, Qiuli Wang, Yaou Liu, S. Kevin Zhou

Comments 19 pages, 4 figures

详情

英文摘要

Synthesizing missing modalities in multi-modal magnetic resonance imaging (MRI) is vital for ensuring diagnostic completeness, particularly when full acquisitions are infeasible due to time constraints, motion artifacts, and patient tolerance. Recent unified synthesis models have enabled flexible synthesis tasks by accommodating various input-output configurations. However, their training and evaluation are typically restricted to a single dataset, limiting their generalizability across diverse clinical datasets and impeding practical deployment. To address this limitation, we propose PMM-Synth, a personalized MRI synthesis framework that not only supports various synthesis tasks but also generalizes effectively across heterogeneous datasets. PMM-Synth is jointly trained on multiple multi-modal MRI datasets that differ in modality coverage, disease types, and intensity distributions. It achieves cross-dataset generalization through three core innovations: a Personalized Feature Modulation module that dynamically adapts feature representations based on dataset identifier to mitigate the impact of distributional shifts; a Modality-Consistent Batch Scheduler that facilitates stable and efficient batch training under inconsistent modality conditions; and a selective supervision loss to ensure effective learning when ground truth modalities are partially missing. Evaluated on four clinical multi-modal MRI datasets, PMM-Synth consistently outperforms state-of-the-art methods in both one-to-one and many-to-one synthesis tasks, achieving superior PSNR and SSIM scores. Qualitative results further demonstrate improved preservation of anatomical structures and pathological details. Additionally, downstream tumor segmentation and radiological reporting studies suggest that PMM-Synth holds potential for supporting reliable diagnosis under real-world modality-missing scenarios.

URL PDF HTML ☆

赞 0 踩 0

2602.19719 2026-02-24 cs.CV

Generative 6D Pose Estimation via Conditional Flow Matching

Amir Hamza, Davide Boscaini, Weihang Li, Benjamin Busam, Fabio Poiesi

Comments Project Website : https://tev-fbk.github.io/Flose/