arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2507.14811 2026-03-17 cs.CV cs.AI

SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models

Jiaji Zhang, Ruichao Sun, Hailiang Zhao, Jiaju Wu, Peng Chen, Hao Li, Yuying Liu, Kingsum Chow, Gang Xiong, Shuiguang Deng

Comments 22 pages, 15 figures, to be published in The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026 (CVPR 2026), code is available at https://github.com/OptiSys-ZJU/segquant

2507.13984 2026-03-17 cs.CV cs.AI

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

Quang-Binh Nguyen, Minh Luu, Quang Nguyen, Anh Tran, Khoi Nguyen

Comments Accepted to ICCV 2025; Project page: https://nqbinhcs.github.io/csd-var-page/

2507.08801 2026-03-17 cs.CV cs.AI cs.MM

Lumos-1: On Autoregressive Video Generation with Discrete Diffusion from a Unified Model Perspective

Hangjie Yuan, Weihua Chen, Jun Cen, Hu Yu, Jingyun Liang, Shuning Chang, Zhihui Lin, Tao Feng, Pengwei Liu, Jiazheng Xing, Hao Luo, Jiasheng Tang, Fan Wang, Yi Yang

Comments ICLR 2026 Camera Ready Version. Code and Models: https://github.com/alibaba-damo-academy/Lumos

2507.08741 2026-03-17 cs.CV

HieraRS: A Hierarchical Segmentation Paradigm for Remote Sensing Enabling Multi-Granularity Interpretation and Cross-Domain Transfer

Tianlong Ai, Tianzhu Liu, Haochen Jiang, Yanfeng Gu

详情

英文摘要

Hierarchical land cover and land use (LCLU) classification aims to assign pixel-wise labels with multiple levels of semantic granularity to remote sensing (RS) imagery. However, existing deep learning-based methods face two major challenges: 1) They predominantly adopt a flat classification paradigm, which limits their ability to generate end-to-end multi-granularity hierarchical predictions aligned with tree-structured hierarchies used in practice. 2) Most cross-domain studies focus on performance degradation caused by sensor or scene variations, with limited attention to transferring LCLU models to cross-domain tasks with heterogeneous hierarchies (e.g., LCLU to crop classification). These limitations hinder the flexibility and generalization of LCLU models in practical applications. To address these challenges, we propose HieraRS, a novel hierarchical interpretation paradigm that enables multi-granularity predictions and supports the efficient transfer of LCLU models to cross-domain tasks with heterogeneous tree-structured hierarchies. We introduce the Bidirectional Hierarchical Consistency Constraint Mechanism (BHCCM), which can be seamlessly integrated into mainstream flat classification models to generate hierarchical predictions, while improving both semantic consistency and classification accuracy. Furthermore, we present TransLU, a dual-branch cross-domain transfer framework comprising two key components: Cross-Domain Knowledge Sharing (CDKS) and Cross-Domain Semantic Alignment (CDSA). TransLU supports dynamic category expansion and facilitates the effective adaptation of LCLU models to heterogeneous hierarchies. In addition, we construct MM-5B, a large-scale multi-modal hierarchical land use dataset featuring pixel-wise annotations. The code and MM-5B dataset will be released at: https://github.com/AI-Tianlong/HieraRS.

URL PDF HTML ☆

赞 0 踩 0

2507.05498 2026-03-17 cs.LG cs.AI

Explainable Hierarchical Deep Learning Neural Networks (Ex-HiDeNN)

Reza T. Batley, Chanwook Park, Wing Kam Liu, Sourav Saha

详情

DOI: 10.1007/s00466-025-02719-w
Journal ref: Computational Mechanics, 2025

英文摘要

Data-driven science and computation have advanced immensely to construct complex functional relationships using trainable parameters. However, efficiently discovering interpretable and accurate closed-form expressions from complex dataset remains a challenge. The article presents a novel approach called Explainable Hierarchical Deep Learning Neural Networks or Ex-HiDeNN that uses an accurate, frugal, fast, separable, and scalable neural architecture with symbolic regression to discover closed-form expressions from limited observation. The article presents the two-step Ex-HiDeNN algorithm with a separability checker embedded in it. The accuracy and efficiency of Ex-HiDeNN are tested on several benchmark problems, including discerning a dynamical system from data, and the outcomes are reported. Ex-HiDeNN generally shows outstanding approximation capability in these benchmarks, producing orders of magnitude smaller errors compared to reference data and traditional symbolic regression. Later, Ex-HiDeNN is applied to three engineering applications: a) discovering a closed-form fatigue equation, b) identification of hardness from micro-indentation test data, and c) discovering the expression for the yield surface with data. In every case, Ex-HiDeNN outperformed the reference methods used in the literature. The proposed method is built upon the foundation and published works of the authors on Hierarchical Deep Learning Neural Network (HiDeNN) and Convolutional HiDeNN. The article also provides a clear idea about the current limitations and future extensions of Ex-HiDeNN.

URL PDF HTML ☆

赞 0 踩 0

2506.24092 2026-03-17 cs.CV eess.IV

WaRA: Wavelet Low Rank Adaptation

Moein Heidari, Yijin Huang, Yasamin Medghalchi, Alireza Rafiee, Roger Tam, Ilker Hacihaliloglu

2506.19997 2026-03-17 cs.LG cs.AI

TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design

Geonwoo Cho, Jaegyun Im, Jihwan Lee, Hojun Yi, Sejin Kim, Sundong Kim

Comments ICLR 2026

2506.07323 2026-03-17 cs.SD cs.AI eess.AS

Speech Recognition on TV Series with Video-guided Post-ASR Correction

Haoyuan Yang, Yue Zhang, Liqiang Jing, John H. L. Hansen

Comments Submitted to Interspeech 2026

2506.05980 2026-03-17 cs.LG cs.AI

AMPED: Adaptive Multi-objective Projection for balancing Exploration and skill Diversification

Geonwoo Cho, Jaemoon Lee, Jaegyun Im, Subi Lee, Jihwan Lee, Sundong Kim

Comments ICLR 2026

2506.05240 2026-03-17 cs.LG cs.CV

Aligning Latent Spaces with Flow Priors

Yizhuo Li, Yuying Ge, Yixiao Ge, Ying Shan, Ping Luo

2506.03942 2026-03-17 cs.CV

Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation

Theodore Barfoot, Luis C. Garcia-Peraza-Herrera, Samet Akcay, Ben Glocker, Tom Vercauteren

Comments 15 pages, 6 figures, IEEE TMI submission. This version originally appeared in error as arXiv:2403.06759(v2)

2506.02164 2026-03-17 cs.CV cs.LG q-bio.NC q-bio.QM

Quantifying task-relevant representational similarity using decision variable correlation

Yu Eric Qian, Wilson S. Geisler, Xue-Xin Wei

Comments Camera-ready version; accepted at NeurIPS 2025

2505.24786 2026-03-17 cs.RO cs.AI cs.CV

DiG-Net: Enhancing Human-Robot Interaction through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics

Eran Bamani Beeri, Eden Nissinman, Avishai Sintov

Comments arXiv admin note: substantial text overlap with arXiv:2411.18413

2505.23404 2026-03-17 cs.CL

AJF: Adaptive Jailbreak Framework Based on the Comprehension Ability of Black-Box Large Language Models

Mingyu Yu, Wei Wang, Yanjie Wei, Sujuan Qin, Fei Gao, Wenmin Li

2505.22524 2026-03-17 cs.LG

Inference-Time Scaling of Discrete Diffusion Models via Importance Weighting and Optimal Proposal Design

Zijing Ou, Chinmay Pani, Yingzhen Li

2505.22089 2026-03-17 cs.CV

Efficient feature matching for UAV images based on compact GPU data scheduling

San Jiang, Kan You, Ruqin Zhou, Xing Zhang, Zhijun Wang, Qingquan Li

2505.20629 2026-03-17 cs.CV cs.LG

Unified Text-Image-to-Video Generation: A Training-Free Approach to Flexible Visual Conditioning

Bolin Lai, Sangmin Lee, Xu Cao, Xiang Li, James M. Rehg

Comments 27 pages, 10 figures, 7 tables

2505.20309 2026-03-17 cs.CL cs.LG

Guiding Giants: Lightweight Controllers for Weighted Activation Steering in LLMs

Amr Hegazy, Mostafa Elhoushi, Amr Alanwar

2505.20081 2026-03-17 cs.CL cs.AI

Inference-time Alignment in Continuous Space

Yige Yuan, Teng Xiao, Li Yunfan, Bingbing Xu, Shuchang Tao, Yunqi Qiu, Huawei Shen, Xueqi Cheng

Comments Accepted at NeurIPS 2025

2505.20072 2026-03-17 cs.CL cs.AI

Incentivizing Strong Reasoning from Weak Supervision

Yige Yuan, Teng Xiao, Shuchang Tao, Xue Wang, Jinyang Gao, Bolin Ding, Bingbing Xu

Comments Accepted at EACL 2026

2505.18876 2026-03-17 cs.RO

DiffusionRL: Efficient Training of Diffusion Policies for Robotic Grasping Using RL-Adapted Large-Scale Datasets

Maria Makarova, Qian Liu, Dzmitry Tsetserukou

2505.17646 2026-03-17 cs.LG

Unveiling the Basin-Like Loss Landscape in Large Language Models

Huanran Chen, Yinpeng Dong, Zeming Wei, Yao Huang, Yichi Zhang, Hang Su, Jun Zhu

2505.17343 2026-03-17 cs.CV cs.HC

Ocular Authentication: Fusion of Gaze and Periocular Modalities

Dillon Lohr, Michael J. Proulx, Mehedi Hasan Raju, Oleg V. Komogortsev

Comments Supplementary material is available

2505.15208 2026-03-17 cs.CV

GT2-GS: Geometry-aware Texture Transfer for Gaussian Splatting

Wenjie Liu, Zhongliang Liu, Junwei Shu, Changbo Wang, Yang Li

Comments Accepted to AAAI 2026

2505.15036 2026-03-17 cs.RO cs.AI cs.MA

Fault-Tolerant Multi-Robot Coordination with Limited Sensing within Confined Environments

Kehinde O. Aina, Hosain Bagheri, Daniel I. Goldman

Comments 15 pages, 4 figures. Accepted to DARS 2024 (Distributed Autonomous Robotic Systems), to appear in Springer Proceedings in Advanced Robotics

详情

DOI: 10.1007/978-3-032-04584-3_22
Journal ref: Distributed Autonomous Robotic Systems. DARS 2024. Springer Proceedings in Advanced Robotics

英文摘要

As robots are increasingly deployed to collaborate on tasks within shared workspaces and resources, the failure of an individual robot can critically affect the group's performance. This issue is particularly challenging when robots lack global information or direct communication, relying instead on social interaction for coordination and to complete their tasks. In this study, we propose a novel fault-tolerance technique leveraging physical contact interactions in multi-robot systems, specifically under conditions of limited sensing and spatial confinement. We introduce the "Active Contact Response" (ACR) method, where each robot modulates its behavior based on the likelihood of encountering an inoperative (faulty) robot. Active robots are capable of collectively repositioning stationary and faulty peers to reduce obstructions and maintain optimal group functionality. We implement our algorithm in a team of autonomous robots, equipped with contact-sensing and collision-tolerance capabilities, tasked with collectively excavating cohesive model pellets. Experimental results indicate that the ACR method significantly improves the system's recovery time from robot failures, enabling continued collective excavation with minimal performance degradation. Thus, this work demonstrates the potential of leveraging local, social, and physical interactions to enhance fault tolerance and coordination in multi-robot systems operating in constrained and extreme environments.

URL PDF HTML ☆

赞 0 踩 0

2505.11920 2026-03-17 cs.RO

H2R: A Human-to-Robot Data Augmentation for Robot Pre-training from Videos

Guangrun Li, Yaoxu Lyu, Zhuoyang Liu, Chengkai Hou, Jieyu Zhang, Shanghang Zhang

详情

英文摘要

Large-scale pre-training using egocentric human videos has proven effective for robot learning. However, the models pre-trained on such data can be suboptimal for robot learning due to the significant visual gap between human hands and those of different robots. To remedy this, we propose H2R, a human-to-robot data augmentation pipeline that converts egocentric human videos into robot-centric visual data. H2R estimates human hand pose from videos, retargets the motion to simulated robotic arms, removes human limbs via segmentation and inpainting, and composites rendered robot embodiments into the original frames with camera-aligned geometry. This process explicitly bridges the visual gap between human and robot embodiments during pre-training. We apply H2R to augment large-scale egocentric human video datasets such as Ego4D and SSv2. To verify the effectiveness of the augmentation pipeline, we introduce a CLIP-based image-text similarity metric that quantitatively evaluates the semantic fidelity of robot-rendered frames to the original human actions. We evaluate H2R through comprehensive experiments in both simulation and real-world settings. In simulation, H2R consistently improves downstream success rates across four benchmark suites-Robomimic, RLBench, PushT, and CortexBench-yielding gains of 1.3%-10.2% across different visual encoders and policy learning methods. In real-world experiments, H2R improves performance on UR5 and dual-arm Franka/UR5 manipulation platforms, achieving 3.3%-23.3% success rate gains across gripper-based, dexterous, and bimanual tasks. We further demonstrate the potential of H2R in cross-embodiment generalization and its compatibility with vision-language-action models. These results indicate that H2R improves the generalization ability of robotic policies by mitigating the visual discrepancies between human and robot domains.

URL PDF HTML ☆

赞 0 踩 0

2504.14728 2026-03-17 cs.LG q-bio.PE quant-ph

Geometric Learning Dynamics

Vitaly Vanchurin

Comments 16 pages, accepted for publication in Biological Cybernetics

2504.14372 2026-03-17 cs.LG cs.AI cs.CE cs.CV stat.ML

Learning Enhanced Structural Representations with Block-Based Uncertainties for Ocean Floor Mapping

Jose Marie Antonio Minoza

2504.13941 2026-03-17 cs.LG cs.AI

Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning

Syeda Nahida Akter, Shrimai Prabhumoye, Matvei Novikov, Seungju Han, Ying Lin, Evelina Bakhturina, Eric Nyberg, Yejin Choi, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

Comments 19 pages, 10 figures

2503.22764 2026-03-17 cs.CL cs.AI cs.LG

Boosting Large Language Models with Mask Fine-Tuning

Mingyuan Zhang, Yue Bai, Huan Wang, Yizhou Wang, Qihua Dong, Yitian Zhang, Yun Fu