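arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.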


2602.14445 2026-02-17 cs.LG cs.AI cs.CL cs.NE

Selective Synchronization Attention

Hasi Hays

2602.14444 2026-02-17 cs.LG cs.AI

Broken Chains: The Cost of Incomplete Reasoning in LLMs

Ian Su, Gaurav Purushothaman, Jey Narayan, Ruhika Goel, Kevin Zhu, Sunishchal Dev, Yash More, Maheep Chaudhary

2602.14443 2026-02-17 cs.CV

Controlling Your Image via Simplified Vector Graphics

Lanqing Guo, Xi Liu, Yufei Wang, Zhihao Li, Siyu Huang

Comments Preprint

2602.14438 2026-02-17 cs.RO cs.MA

RoboSolver: A Multi-Agent Large Language Model Framework for Solving Robotic Arm Problems

Hamid Khabazi, Ali F. Meghdari, Alireza Taheri

Abstract

This study proposes an intelligent multi-agent framework built on LLMs and VLMs and specifically tailored to robotics. The goal is to integrate the strengths of LLMs and VLMs with computational tools to automatically analyze and solve problems related to robotic manipulators. Our developed framework accepts both textual and visual inputs and can automatically perform forward and inverse kinematics, compute velocities and accelerations of key points, generate 3D simulations of the robot, and ultimately execute motion control within the simulated environment, all according to the user's query. To evaluate the framework, three benchmark tests were designed, each consisting of ten questions. In the first benchmark test, the framework was evaluated while connected to GPT-4o, DeepSeek-V3.2, and Claude-Sonnet-4.5, as well as their corresponding raw models. The objective was to extract the forward kinematics of robots directly from textual descriptions. The results showed that the framework integrated with GPT-4o achieved the highest accuracy, reaching 0.97 in computing the final solution, whereas the raw model alone attained an accuracy of only 0.30 for the same task. Similarly, for the other two models, the framework consistently outperformed the corresponding raw models in terms of accuracy. The second benchmark test was identical to the first, except that the input was provided in visual form. In this test, the GPT-4o LLM was used alongside the Gemini 2.5 Pro VLM. The results showed that the framework achieved an accuracy of 0.93 in obtaining the final answer, which is approximately 20% higher than that of the corresponding raw model. The third benchmark test encompassed a range of robotic tasks, including simulation, control, velocity and acceleration computation, as well as inverse kinematics and Jacobian calculation, for which the framework achieved an accuracy of 0.97.
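For readers unfamiliar with the forward-kinematics task named in the abstract, the following is a minimal sketch of what such a computation looks like for a planar serial arm with revolute joints. It is not the authors' framework or code; the function name `planar_forward_kinematics`, the planar simplification, and the example link lengths are all assumptions made here for illustration.

```python
import math


def planar_forward_kinematics(link_lengths, joint_angles):
    """Return (x, y, phi) of the end effector of a planar serial arm.

    Hypothetical helper for illustration only.
    link_lengths: length of each link, in any consistent unit.
    joint_angles: relative revolute joint angles in radians,
                  each measured from the previous link.
    """
    if len(link_lengths) != len(joint_angles):
        raise ValueError("link_lengths and joint_angles must have equal length")

    x, y, phi = 0.0, 0.0, 0.0
    for length, angle in zip(link_lengths, joint_angles):
        # Accumulate the absolute orientation of the current link.
        phi += angle
        # Advance the tip position along the current link direction.
        x += length * math.cos(phi)
        y += length * math.sin(phi)
    return x, y, phi


if __name__ == "__main__":
    # Two unit-length links: first joint at +90 degrees, second at -90 degrees.
    print(planar_forward_kinematics([1.0, 1.0], [math.pi / 2, -math.pi / 2]))
```

Spatial manipulators are usually handled with homogeneous transforms (for example, Denavit-Hartenberg parameters) rather than this planar shortcut; the sketch only shows the kind of answer a forward-kinematics query is expected to produce.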


2602.14434 2026-02-17 cs.RO

A Soft Wrist with Anisotropic and Selectable Stiffness for Robust Robot Learning in Contact-rich Manipulation

Steven Oh, Tomoya Takahashi, Cristian C. Beltran-Hernandez, Yuki Kuroda, Masashi Hamaya

2602.14432 2026-02-17 cs.LG cs.AI stat.ML

S2D: Selective Spectral Decay for Quantization-Friendly Conditioning of Neural Activations

Arnav Chavan, Nahush Lele, Udbhav Bamba, Sankalp Dayal, Aditi Raghunathan, Deepak Gupta

2602.14430 2026-02-17 cs.LG

A unified framework for evaluating the robustness of machine-learning interpretability for prospect risking

Prithwijit Chowdhury, Ahmad Mustafa, Mohit Prabhushankar, Ghassan AlRegib

Journal ref Geophysics 90, no. 3 (2025): IM103-IM118

2602.14428 2026-02-17 cs.CL

LLM-Guided Knowledge Distillation for Temporal Knowledge Graph Reasoning

Wang Xing, Wei Song, Siyu Lin, Chen Wu, Man Wang

2602.14425 2026-02-17 cs.CV

Hierarchical Vision-Language Interaction for Facial Action Unit Detection

Yong Li, Yi Ren, Yizhe Zhang, Wenhua Zhang, Tianyi Zhang, Muyun Jiang, Guo-Sen Xie, Cuntai Guan

Comments Accepted to IEEE Transactions on Affective Computing 2026

Journal ref IEEE Transactions on Affective Computing 2026

2602.14423 2026-02-17 cs.LG cs.AI stat.ML

The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization

Abdelali Bouyahia, Frédéric LeBlanc, Mario Marchand

Abstract

Data augmentation is one of the most widely used techniques to improve generalization in modern machine learning, often justified by its ability to promote invariance to label-irrelevant transformations. However, its theoretical role remains only partially understood. In this work, we propose an information-theoretic framework that systematically accounts for the effect of augmentation on generalization and invariance learning. Our approach builds upon mutual information-based bounds, which relate the generalization gap to the amount of information a learning algorithm retains about its training data. We extend this framework by modeling the augmented distribution as a composition of the original data distribution with a distribution over transformations, which naturally induces an orbit-averaged loss function. Under mild sub-Gaussian assumptions on the loss function and the augmentation process, we derive a new generalization bound that decomposes the expected generalization gap into three interpretable terms: (1) a distributional divergence between the original and augmented data, (2) a stability term measuring the algorithm's dependence on training data, and (3) a sensitivity term capturing the effect of augmentation variability. To connect our bounds to the geometry of the augmentation group, we introduce the notion of group diameter, defined as the maximal perturbation that augmentations can induce in the input space. The group diameter provides a unified control parameter that bounds all three terms and highlights an intrinsic trade-off: small diameters preserve data fidelity but offer limited regularization, while large diameters enhance stability at the cost of increased bias and sensitivity. We validate our theoretical bounds with numerical experiments, demonstrating that they reliably track and predict the behavior of the true generalization gap.
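The objects named in the abstract can be written down in a hedged form. The notation below ($w$ for the hypothesis, $S$ for a training sample of size $n$, $Q$ for the distribution over transformations $g$ in a set $G$, $\sigma$ for the sub-Gaussian parameter) is assumed here for illustration and may differ from the paper's. The first line is one natural reading of an orbit-averaged loss, the second is the standard mutual information generalization bound for a $\sigma$-sub-Gaussian loss that such analyses build on, and the third is one possible formalization of "maximal perturbation in the input space". The paper's own three-term bound is not reproduced, since the abstract does not state it.

```latex
% Orbit-averaged loss (assumed notation): average the loss over transformations g ~ Q
\bar{\ell}(w, z) \;=\; \mathbb{E}_{g \sim Q}\bigl[\ell(w, g \cdot z)\bigr]

% Standard mutual-information bound for a sigma-sub-Gaussian loss
\bigl|\,\mathbb{E}\bigl[L_{\mu}(W) - L_{S}(W)\bigr]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)}

% One possible formalization of the group diameter (an assumption, not the paper's definition)
\operatorname{diam}(G) \;=\; \sup_{g \in G}\, \sup_{x}\, \lVert g \cdot x - x \rVert
```

Here $L_{\mu}(W)$ denotes the population risk and $L_{S}(W)$ the empirical risk on $S$, so the second line bounds the expected generalization gap by the information the learned hypothesis $W$ retains about the training sample.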


2602.14419 2026-02-17 cs.CL

WavePhaseNet: A DFT-Based Method for Constructing Semantic Conceptual Hierarchy Structures (SCHS)

Kiyotaka Kasubuchi, Kazuo Fukiya

2602.14413 2026-02-17 cs.CV cs.RO

Understanding Sensor Vulnerabilities in Industrial XR Tracking

Sourya Saha, Md. Nurul Absur

Comments IEEE VR XRIOS 2026 Workshop

2602.14409 2026-02-17 cs.CV

Learning Proposes, Geometry Disposes: A Modular Framework for Efficient Spatial Reasoning

Haichao Zhu, Zhaorui Yang, Qian Zhang

2602.14406 2026-02-17 cs.CL cs.AI

TruthStance: An Annotated Dataset of Conversations on Truth Social

Fathima Ameen, Danielle Brown, Manusha Malgareddy, Amanul Haque

2602.14404 2026-02-17 cs.AI cs.LG cs.NE

Boule or Baguette? A Study on Task Topology, Length Generalization, and the Benefit of Reasoning Traces

William L. Tong, Ege Cakar, Cengiz Pehlevan

Comments 38 pages, 11 figures, code available at https://github.com/wtong98/boule-or-baguette

2602.14401 2026-02-17 cs.CV cs.AI

pFedNavi: Structure-Aware Personalized Federated Vision-Language Navigation for Embodied AI

Qingqian Yang, Hao Wang, Sai Qian Zhang, Jian Li, Yang Hua, Miao Pan, Tao Song, Zhengwei Qi, Haibing Guan

Comments Preprint

2602.14386 2026-02-17 cs.CL

Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models

Mufan Xu, Kehai Chen, Xuefeng Bai, Zhengyu Niu, Muyun Yang, Tiejun Zhao, Min Zhang

2602.14381 2026-02-17 cs.CV cs.AI

Adapting VACE for Real-Time Autoregressive Video Diffusion

Ryan Fosdick

Comments 10 pages, 4 figures, 7 tables

2602.14376 2026-02-17 cs.CV

Event-based Visual Deformation Measurement

Yuliang Wu, Wei Zhai, Yuxin Cui, Tiesong Zhao, Yang Cao, Zheng-Jun Zha

2602.14375 2026-02-17 cs.LG

A Study on Multi-Class Online Fuzzy Classifiers for Dynamic Environments

Kensuke Ajimoto, Yuma Yamamoto, Yoshifumi Kusunoki, Tomoharu Nakashima

2602.14365 2026-02-17 cs.CV cs.AI

Image-based Joint-level Detection for Inflammation in Rheumatoid Arthritis from Small and Imbalanced Data

Shun Kato, Yasushi Kondo, Shuntaro Saito, Yoshimitsu Aoki, Mariko Isogawa

2602.14363 2026-02-17 cs.RO cs.LG

AdaptManip: Learning Adaptive Whole-Body Object Lifting and Delivery with Online Recurrent State Estimation

Morgan Byrd, Donghoon Baek, Kartik Garg, Hyunyoung Jung, Daesol Cho, Maks Sorokin, Robert Wright, Sehoon Ha

Comments Website: https://morganbyrd03.github.io/adaptmanip/

2602.14356 2026-02-17 cs.CV

A Generative AI Approach for Reducing Skin Tone Bias in Skin Cancer Classification

Areez Muhammed Shabu, Mohammad Samar Ansari, Asra Aslam

2602.14344 2026-02-17 cs.LG cs.AI

Zero-Shot Instruction Following in RL via Structured LTL Representations

Mathias Jackermeier, Mattia Giuri, Jacques Cloete, Alessandro Abate

2602.14338 2026-02-17 cs.LG cs.AI

Train Less, Learn More: Adaptive Efficient Rollout Optimization for Group-Based Reinforcement Learning

Zhi Zhang, Zhen Han, Costas Mavromatis, Qi Zhu, Yunyi Zhang, Sheng Guan, Dingmin Wang, Xiong Zhou, Shuai Wang, Soji Adeshina, Vassilis Ioannidis, Huzefa Rangwala

2602.14318 2026-02-17 cs.LG

In Transformer We Trust? A Perspective on Transformer Architecture Failure Modes

Trishit Mondal, Ameya D. Jagtap

Comments 46 pages, 34 Figures

2602.14311 2026-02-17 cs.RO

Exploiting Structure-from-Motion for Robust Vision-Based Map Matching for Aircraft Surface Movement

Daniel Choate, Jason Rife

Comments Accepted to the Proceedings of the 38th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2025). 15 pages, 13 figures

2602.14301 2026-02-17 cs.LG cs.AI cs.MA

DeepFusion: Accelerating MoE Training via Federated Knowledge Distillation from Heterogeneous Edge Devices

Songyuan Li, Jia Hu, Ahmed M. Abdelmoniem, Geyong Min, Haojun Huang, Jiwei Huang

Comments Index Terms: Large language models, Mixture-of-experts, Federated knowledge distillation, Edge device heterogeneity

2602.14297 2026-02-17 cs.CV

Differential pose optimization in descriptor space -- Combining Geometric and Photometric Methods for Motion Estimation

Andreas L. Teigen, Annette Stahl, Rudolf Mester

2602.14296 2026-02-17 cs.AI cs.SE

AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

Yifan Wu, Yiran Peng, Yiyu Chen, Jianhao Ruan, Zijie Zhuang, Cheng Yang, Jiayi Zhang, Man Chen, Yenchi Tseng, Zhaoyang Yu, Liang Chen, Yuyao Zhai, Bang Liu, Chenglin Wu, Yuyu Luo