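arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.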



2602.05534 2026-02-06 cs.CV

SSG: Scaled Spatial Guidance for Multi-Scale Visual Autoregressive Generation

Youngwoo Shin, Jiwan Hur, Junmo Kim

Comments Accepted to ICLR 2026

2602.05532 2026-02-06 cs.AI cs.LG

Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities

Florian Dietz, William Wale, Oscar Gilg, Robert McCarthy, Felix Michalak, Gustavo Ewbank Rodrigues Danon, Miguelito de Guzman, Dietrich Klakow

2602.05522 2026-02-06 cs.CV math.GT

Mapper-GIN: Lightweight Structural Graph Abstraction for Corrupted 3D Point Cloud Classification

Jeongbin You, Donggun Kim, Sejun Park, Seungsang Oh

2602.05516 2026-02-06 cs.RO

Virtual-Tube-Based Cooperative Transport Control for Multi-UAV Systems in Constrained Environments

Runxiao Liu, Pengda Mao, Xiangli Le, Shuang Gu, Yapeng Chen, Quan Quan

Comments 10 pages, 8 figures

2602.05515 2026-02-06 cs.AI cs.CL

A Unified Multimodal Framework for Dataset Construction and Model-Based Diagnosis of Ameloblastoma

Ajo Babu George, Anna Mariam John, Athul Anoop, Balu Bhasuran

2602.05508 2026-02-06 cs.CV

VGGT-Motion: Motion-Aware Calibration-Free Monocular SLAM for Long-Range Consistency

Zhuang Xiong, Chen Zhang, Qingshan Xu, Wenbing Tao

2602.05499 2026-02-06 cs.AI

SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration

Hanyu Wei, Zunhai Su, Peng Lu, Chao Li, Spandan Tiwari, Ashish Sirasao, Yuhan Dong

2602.05493 2026-02-06 cs.CL cs.AI cs.MA

LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation

Bingru Li

2602.05487 2026-02-06 cs.CV

Feature points evaluation on omnidirectional vision with a photorealistic fisheye sequence -- A report on experiments done in 2014

Julien Moreau, S. Ambellouis, Yassine Ruichek

2602.05480 2026-02-06 cs.CV

SOMA-1M: A Large-Scale SAR-Optical Multi-resolution Alignment Dataset for Multi-Task Remote Sensing

Peihao Wu, Yongxiang Yao, Yi Wan, Wenfei Zhang, Ruipeng Zhao, Jiayuan Li, Yongjun Zhang

2602.05479 2026-02-06 cs.AI

Phi-Former: A Pairwise Hierarchical Approach for Compound-Protein Interactions Prediction

Zhe Wang, Zijing Liu, Chencheng Xu, Yuan Yao

Comments Accepted to BIBM 2025. 6 pages, 5 figures

2602.05468 2026-02-06 cs.RO

TaSA: Two-Phased Deep Predictive Learning of Tactile Sensory Attenuation for Improving In-Grasp Manipulation

Pranav Ponnivalavan, Satoshi Funabashi, Alexander Schmitz, Tetsuya Ogata, Shigeki Sugano

Comments 8 pages, 8 figures, 8 tables, ICRA2026 accepted

Abstract

Humans can achieve diverse in-hand manipulations, such as object pinching and tool use, which often involve simultaneous contact between the object and multiple fingers. This is still an open issue for robotic hands because such dexterous manipulation requires distinguishing between tactile sensations generated by their self-contact and those arising from external contact. Otherwise, object/robot breakage happens due to contacts/collisions. Indeed, most approaches ignore self-contact altogether, by constraining motion to avoid/ignore self-tactile information during contact. While this reduces complexity, it also limits generalization to real-world scenarios where self-contact is inevitable. Humans overcome this challenge through self-touch perception, using predictive mechanisms that anticipate the tactile consequences of their own motion, through a principle called sensory attenuation, where the nervous system differentiates predictable self-touch signals, allowing novel object stimuli to stand out as relevant. Deriving from this, we introduce TaSA, a two-phased deep predictive learning framework. In the first phase, TaSA explicitly learns self-touch dynamics, modeling how a robot's own actions generate tactile feedback. In the second phase, this learned model is incorporated into the motion learning phase, to emphasize object contact signals during manipulation. We evaluate TaSA on a set of insertion tasks, which demand fine tactile discrimination: inserting a pencil lead into a mechanical pencil, inserting coins into a slot, and fixing a paper clip onto a sheet of paper, with various orientations, positions, and sizes. Across all tasks, policies trained with TaSA achieve significantly higher success rates than baseline methods, demonstrating that structured tactile perception with self-touch based on sensory attenuation is critical for dexterous robotic manipulation.


2602.05464 2026-02-06 cs.AI cs.CV cs.LG

Refine and Purify: Orthogonal Basis Optimization with Null-Space Denoising for Conditional Representation Learning

Jiaquan Wang, Yan Lyu, Chen Li, Yuheng Jia

2602.05463 2026-02-06 cs.LG cs.AI cs.IT math.IT

Thermodynamic Limits of Physical Intelligence

Koichi Takahashi, Yusuke Hayashi

Abstract

Modern AI systems achieve remarkable capabilities at the cost of substantial energy consumption. To connect intelligence to physical efficiency, we propose two complementary bits-per-joule metrics under explicit accounting conventions: (1) Thermodynamic Epiplexity per Joule -- bits of structural information about a theoretical environment-instance variable newly encoded in an agent's internal state per unit measured energy within a stated boundary -- and (2) Empowerment per Joule -- the embodied sensorimotor channel capacity (control information) per expected energetic cost over a fixed horizon. These provide two axes of physical intelligence: recognition (model-building) vs. control (action influence). Drawing on stochastic thermodynamics, we show how a Landauer-scale closed-cycle benchmark for epiplexity acquisition follows as a corollary of a standard thermodynamic-learning inequality under explicit subsystem assumptions, and we clarify how Landauer-scaled costs act as closed-cycle benchmarks under explicit reset/reuse and boundary-closure assumptions; conversely, we give a simple decoupling construction showing that without such assumptions -- and without charging for externally prepared low-entropy resources (e.g., fresh memory) crossing the boundary -- information gain and in-boundary dissipation need not be tightly linked. For empirical settings where the latent structure variable is unavailable, we align the operational notion of epiplexity with compute-bounded MDL epiplexity and recommend reporting MDL-epiplexity / compression-gain surrogates as companions. Finally, we propose a unified efficiency framework that reports both metrics together with a minimal checklist of boundary/energy accounting, coarse-graining/noise, horizon/reset, and cost conventions to reduce ambiguity and support consistent bits-per-joule comparisons, and we sketch connections to energy-adjusted scaling analyses.
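For readers unfamiliar with the Landauer scale mentioned in this abstract, the following minimal sketch computes the textbook Landauer bound, k_B * T * ln 2 joules per erased bit, and its reciprocal in bits per joule. It is not taken from the paper: the function names are hypothetical, and the last helper only illustrates how a measured bits-per-joule figure could be compared against that scale, assuming the measured bits and joules are supplied by the reader.

```python
import math

# Exact SI value of the Boltzmann constant, in joules per kelvin.
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_joules_per_bit(temperature_kelvin: float) -> float:
    """Landauer bound: minimum heat to erase one bit at temperature T."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def landauer_bits_per_joule(temperature_kelvin: float) -> float:
    """Reciprocal of the Landauer bound, in bits per joule."""
    return 1.0 / landauer_joules_per_bit(temperature_kelvin)


def fraction_of_landauer_scale(bits: float, joules: float,
                               temperature_kelvin: float) -> float:
    """Hypothetical helper: measured bits-per-joule relative to the Landauer scale."""
    if joules <= 0:
        raise ValueError("energy must be positive")
    return (bits / joules) / landauer_bits_per_joule(temperature_kelvin)
```

At 300 K the bound is about 2.87e-21 J per bit, so the corresponding scale is on the order of 3.5e20 bits per joule.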


2602.05459 2026-02-06 cs.LG

When Are RL Hyperparameters Benign? A Study in Offline Goal-Conditioned RL

Jan Malte Töpperwien, Aditya Mohan, Marius Lindauer

Comments 27 pages, 19 figures

2602.05456 2026-02-06 cs.RO cs.AI cs.SY eess.SY

Ontology-Driven Robotic Specification Synthesis

Maksym Figat, Ryan M. Mackey, Michel D. Ingham

Comments 8 pages, 9 figures, 3 tables, journal

2602.05454 2026-02-06 cs.CV cs.AI

Attention Retention for Continual Learning with Vision Transformers

Yue Lu, Xiangyu Zhou, Shizhou Zhang, Yinghui Xing, Guoqiang Liang, Wencong Zhang

Comments AAAI-2026 Camera Ready

2602.05441 2026-02-06 cs.RO cs.AI

Benchmarking Affordance Generalization with BusyBox

Dean Fortier, Timothy Adamson, Tess Hellebrekers, Teresa LaScala, Kofi Ennin, Michael Murray, Andrey Kolobov, Galen Mullins

2602.05440 2026-02-06 cs.CV

Synthetic Defect Geometries of Cast Metal Objects Modeled via 2d Voronoi Tessellations

Natascha Jeziorski, Petra Gospodnetić, Claudia Redenbach

2602.05434 2026-02-06 cs.CV

LD-SLRO: Latent Diffusion Structured Light for 3-D Reconstruction of Highly Reflective Objects

Sanghoon Jeon, Gihyun Jung, Suhyeon Ka, Jae-Sang Hyun

Comments 10 pages, 7 figures

2602.05430 2026-02-06 cs.AI cs.LG

Day-Ahead Electricity Price Forecasting for Volatile Markets Using Foundation Models with Regularization Strategy

Kritchanat Ponyuenyong, Pengyu Tu, Jia Wei Tan, Wei Soon Cheong, Jamie Ng Suat Ling, Lianlian Jiang

Comments Accepted to AI4TS Workshop @ AAAI'26 (Oral and Poster), see https://ai4ts.github.io/aaai2026

2602.05429 2026-02-06 cs.AI cs.CV

M$^2$-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining

Rui Lv, Juncheng Mo, Tianyi Chu, Chen Rao, Hongyi Jing, Jiajie Teng, Jiafu Chen, Shiqi Zhang, Liangzi Ding, Shuo Fang, Huaizhong Lin, Ziqiang Dang, Chenguang Ma, Lei Zhao

Comments Accepted by ICLR 2026. Supplementary material is included at the end of the main paper (16 pages, 15 figures, 2 tables)

2602.05426 2026-02-06 cs.CV

Multi-AD: Cross-Domain Unsupervised Anomaly Detection for Medical and Industrial Applications

Wahyu Rahmaniar, Kenji Suzuki

Comments 28 pages, 8 figures

Journal ref Pattern Recognition 172 (Part B) (April 2026) 112486

DOI: 10.1016/j.patcog.2025.112486

Abstract

Traditional deep learning models often lack annotated data, especially in cross-domain applications such as anomaly detection, which is critical for early disease diagnosis in medicine and defect detection in industry. To address this challenge, we propose Multi-AD, a convolutional neural network (CNN) model for robust unsupervised anomaly detection across medical and industrial images. Our approach employs the squeeze-and-excitation (SE) block to enhance feature extraction via channel-wise attention, enabling the model to focus on the most relevant features and detect subtle anomalies. Knowledge distillation (KD) transfers informative features from the teacher to the student model, enabling effective learning of the differences between normal and anomalous data. Then, the discriminator network further enhances the model's capacity to distinguish between normal and anomalous data. At the inference stage, by integrating multi-scale features, the student model can detect anomalies of varying sizes. The teacher-student (T-S) architecture ensures consistent representation of high-dimensional features while adapting them to enhance anomaly detection. Multi-AD was evaluated on several medical datasets, including brain MRI, liver CT, and retina OCT, as well as industrial datasets, such as MVTec AD, demonstrating strong generalization across multiple domains. Experimental results demonstrated that our approach consistently outperformed state-of-the-art models, achieving the best average AUROC for both image-level (81.4% for medical and 99.6% for industrial) and pixel-level (97.0% for medical and 98.4% for industrial) tasks, making it effective for real-world applications.
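The channel-wise attention that this abstract attributes to the squeeze-and-excitation (SE) block can be illustrated with a generic sketch. This is the standard SE computation (global average pool, bottleneck with ReLU, sigmoid gate, per-channel rescale) written in NumPy, not the authors' Multi-AD implementation; the function name, weight shapes, and the omission of bias terms are assumptions made for brevity.

```python
import numpy as np


def squeeze_excitation(features: np.ndarray,
                       w_reduce: np.ndarray,
                       w_expand: np.ndarray) -> np.ndarray:
    """Apply a generic SE gate to one feature map.

    features: array of shape (C, H, W)
    w_reduce: bottleneck weights of shape (C // r, C)  (hypothetical, no bias)
    w_expand: expansion weights of shape (C, C // r)   (hypothetical, no bias)
    """
    # Squeeze: one scalar per channel via global average pooling.
    squeezed = features.mean(axis=(1, 2))                 # shape (C,)
    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w_reduce @ squeezed, 0.0)         # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))    # shape (C,)
    # Rescale: broadcast each channel's gate over its spatial positions.
    return features * gates[:, None, None]
```

A quick check: with `w_expand` set to all zeros every gate equals 0.5, so the output is exactly half of the input feature map.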


2602.05423 2026-02-06 cs.CV cs.GR

NeVStereo: A NeRF-Driven NVS-Stereo Architecture for High-Fidelity 3D Tasks

Pengcheng Chen, Yue Hu, Wenhao Li, Nicole M Gunderson, Andrew Feng, Zhenglong Sun, Peter Beerel, Eric J Seibel

2602.05420 2026-02-06 cs.CV cs.AI

Disco: Densely-overlapping Cell Instance Segmentation via Adjacency-aware Collaborative Coloring

Rui Sun, Yiwen Yang, Kaiyu Guo, Chen Jiang, Dongli Xu, Zhaonan Liu, Tan Pan, Limei Han, Xue Jiang, Wu Wei, Yuan Cheng

Comments 17 pages, 10 figures; ICLR 2026

2602.05419 2026-02-06 cs.CL

Grammatical Error Correction Evaluation by Optimally Transporting Edit Representation

Takumi Goto, Yusuke Sakai, Taro Watanabe

Comments Accepted to TACL. This is a pre-MIT Press publication version

2602.05415 2026-02-06 cs.CV

VMF-GOS: Geometry-guided virtual Outlier Synthesis for Long-Tailed OOD Detection

Ningkang Peng, Qianfeng Yu, Yuhao Zhang, Yafei Liu, Xiaoqian Peng, Peirong Ma, Yi Chen, Peiheng Li, Yanhui Gu

2602.05406 2026-02-06 cs.SD cs.AI

Enabling Automatic Disordered Speech Recognition: An Impaired Speech Dataset in the Akan Language

Isaac Wiafe, Akon Obu Ekpezu, Sumaya Ahmed Salihs, Elikem Doe Atsakpo, Fiifi Baffoe Payin Winful, Jamal-Deen Abdulai

2602.05403 2026-02-06 cs.AI cs.CY cs.SI

Advancing Opinion Dynamics Modeling with Neural Diffusion-Convection-Reaction Equation

Chenghua Gong, Yihang Jiang, Hao Li, Rui Sun, Juyuan Zhang, Tianjun Gu, Liming Pan, Linyuan Lü

2602.05397 2026-02-06 cs.CV

Explainable Pathomics Feature Visualization via Correlation-aware Conditional Feature Editing

Yuechen Yang, Junlin Guo, Ruining Deng, Junchao Zhu, Zhengyi Lu, Chongyu Qu, Yanfan Zhu, Xingyi Guo, Yu Wang, Shilin Zhao, Haichun Yang, Yuankai Huo
