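arXivDaily is a daily academic digest that syncs the full arXiv dataset and provides AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.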

2602.13035 2026-02-16 cs.LG cs.AI cs.CL

Look Inward to Explore Outward: Learning Temperature Policy from LLM Internal States via Hierarchical RL

Yixiao Zhou, Yang Li, Dongzhou Cheng, Hehe Fan, Yu Cheng

2602.13030 2026-02-16 cs.LG cs.CV cs.HC

Resource-Efficient Gesture Recognition through Convexified Attention

Daniel Schwartz, Dario Salvucci, Yusuf Osmanlioglu, Richard Vallett, Genevieve Dion, Ali Shokoufandeh

Comments 22 pages, 3 figures, EICS 2026

2602.13028 2026-02-16 cs.CV cs.CL

Human-Aligned MLLM Judges for Fine-Grained Image Editing Evaluation: A Benchmark, Framework, and Analysis

Runzhou Liu, Hailey Weingord, Sejal Mittal, Prakhar Dungarwal, Anusha Nandula, Bo Ni, Samyadeep Basu, Hongjie Chen, Nesreen K. Ahmed, Li Li, Jiayi Zhang, Koustava Goswami, Subhojyoti Mukherjee, Branislav Kveton, Puneet Mathur, Franck Dernoncourt, Yue Zhao, Yu Wang, Ryan A. Rossi, Zhengzhong Tu, Hongru Du

2602.13024 2026-02-16 cs.CV cs.LG

FedHENet: A Frugal Federated Learning Framework for Heterogeneous Environments

Alejandro Dopico-Castro, Oscar Fontenla-Romero, Bertha Guijarro-Berdiñas, Amparo Alonso-Betanzos, Iván Pérez Digón

Comments Accepted for publication at the 34th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2026)

2602.13022 2026-02-16 cs.CV

Learning Image-based Tree Crown Segmentation from Enhanced Lidar-based Pseudo-labels

Julius Pesonen, Stefan Rua, Josef Taher, Niko Koivumäki, Xiaowei Yu, Eija Honkavaara

2602.13020 2026-02-16 cs.CV

DynaGuide: A Generalizable Dynamic Guidance Framework for Unsupervised Semantic Segmentation

Boujemaa Guermazi, Riadh Ksantini, Naimul Khan

Comments Accepted at Image and Vision Computing

2602.13016 2026-02-16 cs.RO

How Swarms Differ: Challenges in Collective Behaviour Comparison

André Fialho Jesus, Jonas Kuckling

Comments Accepted for publication in the proceeding of ANTS 2026 - 15th International Conference on Swarm Intelligence

2602.13013 2026-02-16 cs.CV

Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

Yunheng Li, Hengrui Zhang, Meng-Hao Guo, Wenzhao Gao, Shaoyong Jia, Shaohui Jiao, Qibin Hou, Ming-Ming Cheng

Comments Project page: https://asid-caption.github.io/

2602.13010 2026-02-16 cs.LG

Probabilistic Wind Power Forecasting with Tree-Based Machine Learning and Weather Ensembles

Max Bruninx, Diederik van Binsbergen, Timothy Verstraeten, Ann Nowé, Jan Helsen

2602.13008 2026-02-16 cs.LG cs.NE

Machine Learning-Based Classification of Jhana Advanced Concentrative Absorption Meditation (ACAM-J) using 7T fMRI

Puneet Kumar, Winson F. Z. Yang, Alakhsimar Singh, Xiaobai Li, Matthew D. Sacchet

2602.13003 2026-02-16 cs.CV cs.LG

MASAR: Motion-Appearance Synergy Refinement for Joint Detection and Trajectory Forecasting

Mohammed Amine Bencheikh Lehocine, Julian Schmidt, Frank Moosmann, Dikshant Gupta, Fabian Flohr

Comments Accepted to the 2026 IEEE International Conference on Robotics and Automation (ICRA 2026)

2602.12989 2026-02-16 cs.CL

Evaluating the Homogeneity of Keyphrase Prediction Models

Maël Houbre, Florian Boudin, Beatrice Daille

Comments Accepted to LREC 2026

2602.12983 2026-02-16 cs.CV cs.AI

Detecting Object Tracking Failure via Sequential Hypothesis Testing

Alejandro Monroy Muñoz, Rajeev Verma, Alexander Timans

Comments Accepted in WACV workshop "Real World Surveillance: Applications and Challenges, 6th"

2602.12982 2026-02-16 cs.LG

Multi-Dimensional Visual Data Recovery: Scale-Aware Tensor Modeling and Accelerated Randomized Computation

Wenjin Qin, Hailin Wang, Jiangjun Peng, Jianjun Wang, Tingwen Huang

2602.12980 2026-02-16 cs.LG

MAUNet-Light: A Concise MAUNet Architecture for Bias Correction and Downscaling of Precipitation Estimates

Sumanta Chandra Mishra Sharma, Adway Mitra, Auroop Ratan Ganguly

2602.12976 2026-02-16 cs.LG cs.AI

Drift-Aware Variational Autoencoder-based Anomaly Detection with Two-level Ensembling

Jin Li, Kleanthis Malialis, Christos G. Panayiotou, Marios M. Polycarpou

Comments accepted

2602.12975 2026-02-16 cs.LG cs.AI

Extending confidence calibration to generalised measures of variation

Andrew Thompson, Vivek Desai

2602.12963 2026-02-16 cs.AI

Information-theoretic analysis of world models in optimal reward maximizers

Alfred Harwood, Jose Faustino, Alex Altair

Comments 28 pages, 0 figures. Not submitted to any conference yet

2602.12961 2026-02-16 cs.LG

Ca-MCF: Category-level Multi-label Causal Feature selection

Wanfu Gao, Yanan Wang, Yonghao Li

Comments 16 pages, 5 figures. Includes appendices

2602.12936 2026-02-16 cs.CV

Unleashing MLLMs on the Edge: A Unified Framework for Cross-Modal ReID via Adaptive SVD Distillation

Hongbo Jiang, Jie Li, Xinqi Cai, Tianyu Xie, Yunhang Shen, Pingyang Dai, Liujuan Cao

Comments Equal contribution by Jie Li

2602.12933 2026-02-16 cs.CV cs.AI physics.med-ph

Deep-Learning Atlas Registration for Melanoma Brain Metastases: Preserving Pathology While Enabling Cohort-Level Analyses

Nanna E. Wielenberg, Ilinca Popp, Oliver Blanck, Lucas Zander, Jan C. Peeken, Stephanie E. Combs, Anca-Ligia Grosu, Dimos Baltas, Tobias Fechter

Abstract

Melanoma brain metastases (MBM) are common and spatially heterogeneous lesions, complicating cohort-level analyses due to anatomical variability and differing MRI protocols. We propose a fully differentiable, deep-learning-based deformable registration framework that aligns individual pathological brains to a common atlas while preserving metastatic tissue without requiring lesion masks or preprocessing. Missing anatomical correspondences caused by metastases are handled through a forward-model similarity metric based on distance-transformed anatomical labels, combined with a volume-preserving regularization term to ensure deformation plausibility. Registration performance was evaluated using Dice coefficient (DSC), Hausdorff distance (HD), average symmetric surface distance (ASSD), and Jacobian-based measures. The method was applied to 209 MBM patients from three centres, enabling standardized mapping of metastases to anatomical, arterial, and perfusion atlases. The framework achieved high registration accuracy across datasets (DSC 0.89-0.92, HD 6.79-7.60 mm, ASSD 0.63-0.77 mm) while preserving metastatic volumes. Spatial analysis demonstrated significant over-representation of MBM in the cerebral cortex and putamen, under-representation in white matter, and consistent localization near the gray-white matter junction. No arterial territory showed increased metastasis frequency after volume correction. This approach enables robust atlas registration of pathological brain MRI without lesion masks and supports reproducible multi-centre analyses. Applied to MBM, it confirms and refines known spatial predilections, particularly preferential seeding near the gray-white matter junction and cortical regions. The publicly available implementation facilitates reproducible research and extension to other brain tumours and neurological pathologies.

2602.12922 2026-02-16 cs.CV

Beyond Benchmarks of IUGC: Rethinking Requirements of Deep Learning Methods for Intrapartum Ultrasound Biometry from Fetal Ultrasound Videos

Jieyun Bai, Zihao Zhou, Yitong Tang, Jie Gan, Zhuonan Liang, Jianan Fan, Lisa B. Mcguire, Jillian L. Clarke, Weidong Cai, Jacaueline Spurway, Yubo Tang, Shiye Wang, Wenda Shen, Wangwang Yu, Yihao Li, Philippe Zhang, Weili Jiang, Yongjie Li, Salem Muhsin Ali Binqahal Al Nasim, Arsen Abzhanov, Numan Saeed, Mohammad Yaqub, Zunhui Xian, Hongxing Lin, Libin Lan, Jayroop Ramesh, Valentin Bacher, Mark Eid, Hoda Kalabizadeh, Christian Rupprecht, Ana I. L. Namburete, Pak-Hei Yeung, Madeleine K. Wyburd, Nicola K. Dinsdale, Assanali Serikbey, Jiankai Li, Sung-Liang Chen, Zicheng Hu, Nana Liu, Yian Deng, Wei Hu, Cong Tan, Wenfeng Zhang, Mai Tuyet Nhi, Gregor Koehler, Rapheal Stock, Klaus Maier-Hein, Marawan Elbatel, Xiaomeng Li, Saad Slimani, Victor M. Campello, Benard Ohene-Botwe, Isaac Khobo, Yuxin Huang, Zhenyan Han, Hongying Hou, Di Qiu, Zheng Zheng, Gongning Luo, Dong Ni, Yaosheng Lu, Karim Lekadir, Shuo Li

Abstract

A substantial proportion (45%) of maternal deaths, neonatal deaths, and stillbirths occur during the intrapartum phase, with a particularly high burden in low- and middle-income countries. Intrapartum biometry plays a critical role in monitoring labor progression; however, the routine use of ultrasound in resource-limited settings is hindered by a shortage of trained sonographers. To address this challenge, the Intrapartum Ultrasound Grand Challenge (IUGC), co-hosted with MICCAI 2024, was launched. The IUGC introduces a clinically oriented multi-task automatic measurement framework that integrates standard plane classification, fetal head-pubic symphysis segmentation, and biometry, enabling algorithms to exploit complementary task information for more accurate estimation. Furthermore, the challenge releases the largest multi-center intrapartum ultrasound video dataset to date, comprising 774 videos (68,106 frames) collected from three hospitals, providing a robust foundation for model training and evaluation. In this study, we present a comprehensive overview of the challenge design, review the submissions from eight participating teams, and analyze their methods from five perspectives: preprocessing, data augmentation, learning strategy, model architecture, and post-processing. In addition, we perform a systematic analysis of the benchmark results to identify key bottlenecks, explore potential solutions, and highlight open challenges for future research. Although encouraging performance has been achieved, our findings indicate that the field remains at an early stage, and further in-depth investigation is required before large-scale clinical deployment. All benchmark solutions and the complete dataset have been publicly released to facilitate reproducible research and promote continued advances in automatic intrapartum ultrasound biometry.

2602.12921 2026-02-16 cs.CL

When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms

Adib Sakhawat, Shamim Ara Parveen, Md Ruhul Amin, Shamim Al Mahmud, Md Saiful Islam, Tahera Khatun

Comments 9 pages, 5 figures. Accepted for presentation at LREC 2026 (Language Resources and Evaluation Conference)

2602.12919 2026-02-16 cs.CV cs.AI cs.NE

EPRBench: A High-Quality Benchmark Dataset for Event Stream Based Visual Place Recognition

Xiao Wang, Xingxing Xiong, Jinfeng Gao, Xufeng Lou, Bo Jiang, Si-bao Chen, Yaowei Wang, Yonghong Tian

2602.12918 2026-02-16 cs.RO

Adding internal audio sensing to internal vision enables human-like in-hand fabric recognition with soft robotic fingertips

Iris Andrussow, Jans Solano, Benjamin A. Richardson, Georg Martius, Katherine J. Kuchenbecker

Journal ref 2025 IEEE-RAS 24th International Conference on Humanoid Robots (Humanoids)

DOI: 10.1109/Humanoids65713.2025.11203050

Abstract

Distinguishing the feel of smooth silk from coarse cotton is a trivial everyday task for humans. When exploring such fabrics, fingertip skin senses both spatio-temporal force patterns and texture-induced vibrations that are integrated to form a haptic representation of the explored material. It is challenging to reproduce this rich, dynamic perceptual capability in robots because tactile sensors typically cannot achieve both high spatial resolution and high temporal sampling rate. In this work, we present a system that can sense both types of haptic information, and we investigate how each type influences robotic tactile perception of fabrics. Our robotic hand's middle finger and thumb each feature a soft tactile sensor: one is the open-source Minsight sensor that uses an internal camera to measure fingertip deformation and force at 50 Hz, and the other is our new sensor Minsound that captures vibrations through an internal MEMS microphone with a bandwidth from 50 Hz to 15 kHz. Inspired by the movements humans make to evaluate fabrics, our robot actively encloses and rubs folded fabric samples between its two sensitive fingers. Our results test the influence of each sensing modality on overall classification performance, showing high utility for the audio-based sensor. Our transformer-based method achieves a maximum fabric classification accuracy of 97 % on a dataset of 20 common fabrics. Incorporating an external microphone away from Minsound increases our method's robustness in loud ambient noise conditions. To show that this audio-visual tactile sensing approach generalizes beyond the training data, we learn general representations of fabric stretchiness, thickness, and roughness.

2602.12911 2026-02-16 cs.CL

ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark

Tung X. Nguyen, Nhu Vo, Giang-Son Nguyen, Duy Mai Hoang, Chien Dinh Huynh, Inigo Jauregi Unanue, Massimo Piccardi, Wray Buntine, Dung D. Le

Comments Accepted at LREC 2026

2602.12905 2026-02-16 cs.CV

Adaptive Scaling with Geometric and Visual Continuity of completed 3D objects

Jelle Vermandere, Maarten Bassier, Maarten Vergauwen

Comments ISPRS Congress 2026

2602.12902 2026-02-16 cs.CV cs.AI cs.LG cs.SE

Robustness of Object Detection of Autonomous Vehicles in Adverse Weather Conditions

Fox Pettersen, Hong Zhu

2602.12892 2026-02-16 cs.CV cs.AI cs.CL

RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training

Yunshuang Nie, Bingqian Lin, Minzhe Niu, Kun Xiang, Jianhua Han, Guowei Huang, Xingyue Quan, Hang Xu, Bokui Chen, Xiaodan Liang

Abstract

Pre-trained Multi-modal Large Language Models (MLLMs) provide a knowledge-rich foundation for post-training by leveraging their inherent perception and reasoning capabilities to solve complex tasks. However, the lack of an efficient evaluation framework impedes the diagnosis of their performance bottlenecks. Current evaluation primarily relies on testing after supervised fine-tuning, which introduces laborious additional training and autoregressive decoding costs. Meanwhile, common pre-training metrics cannot quantify a model's perception and reasoning abilities in a disentangled manner. Furthermore, existing evaluation benchmarks are typically limited in scale or misaligned with pre-training objectives. Thus, we propose RADAR, an efficient ability-centric evaluation framework for Revealing Asymmetric Development of Abilities in MLLM pRe-training. RADAR involves two key components: (1) Soft Discrimination Score, a novel metric for robustly tracking ability development without fine-tuning, based on quantifying nuanced gradations of the model preference for the correct answer over distractors; and (2) Multi-Modal Mixture Benchmark, a new 15K+ sample benchmark for comprehensively evaluating pre-trained MLLMs' perception and reasoning abilities in a 0-shot manner, where we unify authoritative benchmark datasets and carefully collect new datasets, extending the evaluation scope and addressing the critical gaps in current benchmarks. With RADAR, we comprehensively reveal the asymmetric development of perceptual and reasoning capabilities in pretrained MLLMs across diverse factors, including data volume, model size, and pretraining strategy. Our RADAR underscores the need for a decomposed perspective on pre-training ability bottlenecks, informing targeted interventions to advance MLLMs efficiently. Our code is publicly available at https://github.com/Nieysh/RADAR.

2602.12889 2026-02-16 cs.CL

BaziQA-Benchmark: Evaluating Symbolic and Temporally Compositional Reasoning in Large Language Models

Jiangxi Chen, Qian Liu