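arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.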

2510.01938 2026-04-03 cs.LG

StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold

Zhizhong Li, Sina Sajadmanesh, Jingtao Li, Lingjuan Lyu

Comments NeurIPS 2025 Spotlight

2509.22652 2026-04-03 cs.RO cs.CV

Pixel Motion Diffusion is What We Need for Robot Control

E-Ro Nguyen, Yichi Zhang, Kanchana Ranasinghe, Xiang Li, Michael S. Ryoo

Comments Accepted to CVPR 2026. Project page: https://eronguyen.github.io/DAWN

2509.18001 2026-04-03 cs.LG cs.AI

Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise

Haocheng Luo, Mehrtash Harandi, Dinh Phung, Trung Le

Comments Accepted to NeurIPS 2025; added code availability

2509.14963 2026-04-03 cs.AI

Set Contribution Functions for Quantitative Bipolar Argumentation and their Principles

Filip Naudot, Andreas Brännström, Vicenç Torra, Timotheus Kampik

Comments Published in International Journal of Approximate Reasoning, Vol. 194, 2026

Journal ref International Journal of Approximate Reasoning, 194:109673, 2026

2509.08469 2026-04-03 cs.CV

Maximally Useful and Minimally Redundant: The Key to Self Supervised Learning for Imbalanced Data

Yash Kumar Sharma, Vineet Padmanabhan

2509.07252 2026-04-03 cs.LG cs.CV

GCond: Gradient Conflict Resolution via Accumulation-based Stabilization for Large-Scale Multi-Task Learning

Evgeny Alves Limarenko, Anastasiia Studenikina, Svetlana Illarionova, Maxim Sharaev

Comments Published in IEEE Access. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0)

Journal ref IEEE Access, vol. 14, pp. 42086-42104, 2026

DOI: 10.1109/ACCESS.2026.3673372

Abstract

In multi-task learning (MTL), gradient conflict poses a significant challenge. Effective methods for addressing this problem, including PCGrad, CAGrad, and GradNorm, in their original implementations are computationally demanding, which significantly limits their application in modern large models such as transformers. We propose Gradient Conductor (GCond), a method that builds upon PCGrad principles by combining them with gradient accumulation and an adaptive arbitration mechanism. We evaluated GCond on self-supervised multi-task learning tasks using MobileNetV3-Small and ConvNeXt architectures on the ImageNet 1K dataset and a combined head and neck CT scan dataset, comparing the proposed method against baseline linear combinations and state-of-the-art gradient conflict resolution methods. The classical and stochastic approaches of GCond were analyzed. The stochastic mode of GCond achieved a two-fold computational speedup while maintaining optimization quality, and demonstrated superior performance across all evaluated metrics, achieving lower L1 and SSIM losses compared to other methods on both datasets, and demonstrating superior generalization in heterogeneous scenarios: GCond improved ImageNet Top-1 Accuracy by 4.5% over baselines and prevented confidence overfitting in medical diagnosis tasks. GCond exhibited high scalability, being successfully applied to both compact models: MobileNetV3-Small and ConvNeXt-tiny; and large architecture ConvNeXtV2-Base. It also showed compatibility with modern optimizers such as AdamW and Lion/LARS. Therefore, GCond offers a scalable and efficient solution to the problem of gradient conflicts in multi-task learning.
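
The abstract says GCond builds on PCGrad principles combined with gradient accumulation. As a rough illustration only, the sketch below averages per-task gradients over micro-batches and then applies a PCGrad-style projection to the averaged gradients. It does not reproduce GCond's adaptive arbitration mechanism, which the abstract does not specify; the function names and the two-task setup are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def accumulate(micro_batch_grads):
    """Average one task's gradients over several micro-batches."""
    n = len(micro_batch_grads)
    return [sum(col) / n for col in zip(*micro_batch_grads)]


def project_conflict(g, other):
    """PCGrad-style step: if g conflicts with other (negative dot product),
    subtract from g its component along other; otherwise return g unchanged."""
    d = dot(g, other)
    if d >= 0:
        return list(g)
    scale = d / dot(other, other)
    return [a - scale * b for a, b in zip(g, other)]


def combined_update(task_a_grads, task_b_grads):
    """Hypothetical two-task update: accumulate first, then de-conflict, then sum."""
    ga = accumulate(task_a_grads)
    gb = accumulate(task_b_grads)
    pa = project_conflict(ga, gb)
    pb = project_conflict(gb, ga)
    return [x + y for x, y in zip(pa, pb)]


# Two micro-batches per task; the accumulated gradients conflict (dot product < 0).
update = combined_update(
    [[1.0, 0.0], [1.0, 0.0]],
    [[-1.0, 1.0], [-1.0, 1.0]],
)
print(update)
```

Projecting once per accumulation window, rather than once per micro-batch, is one plausible way to reduce the cost of conflict resolution; whether GCond does exactly this is an assumption here.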

2509.01058 2026-04-03 cs.CL cs.AI

Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL

Xiaoying Song, Anirban Saha Anik, Dibakar Barua, Pengcheng Luo, Junhua Ding, Lingzi Hong

Comments Accepted at Findings of EMNLP 2025

Journal ref Findings of the Association for Computational Linguistics: EMNLP 2025

2509.01053 2026-04-03 cs.CL cs.AI

A Dynamic Fusion Model for Consistent Crisis Response

Xiaoying Song, Anirban Saha Anik, Eduardo Blanco, Vanessa Frias-Martinez, Lingzi Hong

Comments Accepted at Findings of EMNLP 2025

Journal ref Findings of the Association for Computational Linguistics: EMNLP 2025

2508.20755 2026-04-03 cs.LG cs.AI stat.ML

Provable Benefits of In-Tool Learning for Large Language Models

Sam Houliston, Ambroise Odonnat, Charles Arnal, Vivien Cabannes

Journal ref ICLR 2026 MemAgents Workshop

2508.17521 2026-04-03 cs.LG

Modeling Irregular Astronomical Time Series with Neural Stochastic Delay Differential Equations

YongKyung Oh, Seungsu Kam, Dong-Young Lim, Sungil Kim

Comments CIKM '25: Proceedings of the 34th ACM International Conference on Information and Knowledge Management. https://doi.org/10.1145/3746252.3760805

2508.17519 2026-04-03 cs.LG cs.AI

TANDEM: Temporal Attention-guided Neural Differential Equations for Missingness in Time Series Classification

YongKyung Oh, Dong-Young Lim, Sungil Kim, Alex Bui

Comments CIKM '25: Proceedings of the 34th ACM International Conference on Information and Knowledge Management. https://doi.org/10.1145/3746252.3760996

2508.14285 2026-04-03 cs.LG cs.AI stat.ML

Meta-Learning at Scale for Large Language Models via Low-Rank Amortized Bayesian Meta-Learning

Liyi Zhang, Jake Snell, Thomas L. Griffiths

Comments 17 pages, 2 figures

2508.12957 2026-04-03 cs.CV

Adaptive Reinforcement for Open-ended Medical Reasoning via Semantic-Guided Reward Collapse Mitigation

Yizhou Liu, Dingkang Yang, Zizhi Chen, Minghao Han, Xukun Zhang, Keliang Liu, Jingwei Wei, Lihua Zhang

Comments Accepted to CVPR 2026 Findings

2508.10634 2026-04-03 cs.RO cs.SY eess.SY

Synthesis of Deep Neural Networks with Safe Robust Adaptive Control for Reliable Operation of Wheeled Mobile Robots

Mehdi Heydari Shahna, Jouni Mattila

Journal ref IEEE Transactions on Automation Science and Engineering

DOI: 10.1109/TASE.2026.3679545

Abstract

Deep neural networks (DNNs) can enable precise control while maintaining low computational costs by circumventing the need for dynamic modeling. However, the deployment of such black-box approaches remains challenging for heavy-duty wheeled mobile robots (WMRs), which are subject to strict international standards and prone to faults and disturbances. We designed a hierarchical control policy for heavy-duty WMRs, monitored by two safety layers with differing levels of authority. To this end, a DNN policy was trained and deployed as the primary control strategy, providing high-precision performance under nominal operating conditions. When external disturbances arise and reach a level of intensity such that the system performance falls below a predefined threshold, a low-level safety layer intervenes by deactivating the primary control policy and activating a model-free robust adaptive control (RAC) policy. This transition enables the system to continue operating while ensuring stability by effectively managing the inherent trade-off between system robustness and responsiveness. Regardless of the control policy in use, a high-level safety layer continuously monitors system performance during operation. It initiates a shutdown only when disturbances become sufficiently severe such that compensation is no longer viable and continued operation would jeopardize the system or its environment. The proposed synthesis of DNN and RAC policy guarantees uniform exponential stability of the entire WMR system while adhering to safety standards to some extent. The effectiveness of the proposed approach was further validated through real-time experiments using a 6,000 kg WMR.
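
The two safety layers described in the abstract can be pictured as a small supervisory selector. The sketch below is an assumption-laden illustration, not the authors' controller: the scalar `tracking_error` performance measure, both threshold parameters, and the `Mode` names are hypothetical, and the selector is stateless, whereas the abstract does not say whether the real system ever returns from the RAC policy to the DNN policy.

```python
from enum import Enum


class Mode(Enum):
    DNN = "dnn"            # primary learned policy, nominal conditions
    RAC = "rac"            # robust adaptive control fallback
    SHUTDOWN = "shutdown"  # high-level safety layer stops the system


def select_mode(tracking_error, fallback_threshold, shutdown_threshold):
    """Pick the active policy from a hypothetical scalar performance measure.

    The high-level layer is checked first because it applies regardless of
    which control policy is currently in use.
    """
    if tracking_error >= shutdown_threshold:
        return Mode.SHUTDOWN
    if tracking_error >= fallback_threshold:
        return Mode.RAC
    return Mode.DNN


# Hypothetical thresholds, chosen only for illustration.
for error in (0.01, 0.10, 0.80):
    print(error, select_mode(error, fallback_threshold=0.05, shutdown_threshold=0.5))
```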

2508.02530 2026-04-03 cs.CV

Understanding the Risks of Asphalt Art to the Reliability of Vision-Based Perception Systems

Jin Ma, Abyad Enan, Long Cheng, Mashrur Chowdhury

Comments J. Ma and A. Enan are co-first authors; they have contributed equally. This second revised version has been resubmitted to the Transportation Research Record: Journal of the Transportation Research Board after addressing the reviewers' comments and is currently awaiting the final decision

2508.00580 2026-04-03 cs.RO cs.AI

OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery

Raul Castilla-Arquillo, Carlos Perez-del-Pulgar, Levin Gerdes, Alfonso Garcia-Cerezo, Miguel A. Olivares-Mendez

Journal ref 2025 International Conference on Space Robotics (iSpaRo)

DOI: 10.1109/iSpaRo66239.2025.11436158

Abstract

Robot navigation in unstructured environments requires multimodal perception systems that can support safe navigation. Multimodality enables the integration of complementary information collected by different sensors. However, this information must be processed by machine learning algorithms specifically designed to leverage heterogeneous data. Furthermore, it is necessary to identify which sensor modalities are most informative for navigation in the target environment. In Martian exploration, thermal imagery has proven valuable for assessing terrain safety due to differences in thermal behaviour between soil types. This work presents OmniUnet, a transformer-based neural network architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T) imagery. A custom multimodal sensor housing was developed using 3D printing and mounted on the Martian Rover Testbed for Autonomy (MaRTA) to collect a multimodal dataset in the Bardenas semi-desert in northern Spain. This location serves as a representative environment of the Martian surface, featuring terrain types such as sand, bedrock, and compact soil. A subset of this dataset was manually labeled to support supervised training of the network. The model was evaluated both quantitatively and qualitatively, achieving a pixel accuracy of 80.37% and demonstrating strong performance in segmenting complex unstructured terrain. Inference tests yielded an average prediction time of 673 ms on a resource-constrained computer (Jetson Orin Nano), confirming its suitability for on-robot deployment. The software implementation of the network and the labeled dataset have been made publicly available to support future research in multimodal terrain perception for planetary robotics.

2507.11992 2026-04-03 cs.AI

Understanding visual attention beehind bee-inspired UAV navigation

Pranav Rajbhandari, Abhi Veda, Matthew Garratt, Mandyam Srinivasan, Sridhar Ravi

2507.09681 2026-04-03 cs.CV eess.IV

Seamless High-Resolution Terrain Reconstruction: A Prior-Based Vision Transformer Approach

Osher Rafaeli, Tal Svoray, Ariel Nahlieli

2507.02989 2026-04-03 cs.CL

A Comparative Study of Competency Question Elicitation Methods from Ontology Requirements

Reham Alharbi, Valentina Tamma, Terry R. Payne, Jacopo de Berardinis

Comments Revised version (v2) accepted for the 23rd European Semantic Web Conference (ESWC-2026)

2507.01351 2026-04-03 cs.CV

Long-Tailed Distribution-Aware Router For Mixture-of-Experts in Large Vision-Language Model

Chaoxiang Cai, Longrong Yang, Minghe Weng, Xuewei Li, Zequn Qin, Xi Li

2506.20370 2026-04-03 cs.CV cs.LG cs.MM

InvZW: Invariant Feature Learning via Noise-Adversarial Training for Robust Image Zero-Watermarking

Abdullah All Tanvir, Frank Y. Shih, Xin Zhong

Comments This paper has been accepted for publication in Frontiers in Signal Processing

2506.12553 2026-04-03 cs.LG cs.CR stat.ML

Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning

Roy Rinberg, Ilia Shumailov, Vikrant Singhal, Rachel Cummings, Nicolas Papernot

2506.07194 2026-04-03 cs.AI

Exploring Effective Strategies for Building a User-Configured GPT for Coding Classroom Dialogues

Luwei Bai, Dongkeun Han, Sara Hennessy

Comments Draft technical report. 39 pages, 2 figures. Not yet submitted for publication. Update expected

2506.07134 2026-04-03 cs.LG cs.AI math.OC

Monotone and Conservative Policy Iteration Beyond the Tabular Case

S. R. Eshwar, Gugan Thoppe, Ananyabrata Barua, Aditya Gopalan, Gal Dalal

2506.03828 2026-04-03 cs.AI cs.MA

AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance

Dhaval Patel, Shuxin Lin, James Rayfield, Nianjun Zhou, Chathurangi Shyalika, Suryanarayana R Yarrabothula, Roman Vaculin, Natalia Martinez, Fearghal O'donncha, Jayant Kalagnanam

Comments 25 pages, 18 figures

2506.02736 2026-04-03 cs.CV cs.RO

GeneA-SLAM2: Dynamic SLAM with AutoEncoder-Preprocessed Genetic Keypoints Resampling and Depth Variance-Guided Dynamic Region Removal

Shufan Qing, Anzhen Li, Qiandi Wang, Yuefeng Niu, Mingchen Feng, Guoliang Hu, Jinqiao Wu, Fengtao Nan, Yingchun Fan

2505.23824 2026-04-03 cs.CL

Reviewing Scientific Papers for Critical Problems With Reasoning LLMs: Baseline Approaches and Automatic Evaluation

Tianmai M. Zhang, Neil F. Abernethy

Comments Accepted and presented at NeurIPS 2025 AI for Science Workshop

2505.23752 2026-04-03 cs.CV

ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks

Akashah Shabbir, Muhammad Akhtar Munir, Akshay Dudhane, Muhammad Umer Sheikh, Muhammad Haris Khan, Paolo Fraccaro, Juan Bernabe Moreno, Fahad Shahbaz Khan, Salman Khan

2505.22279 2026-04-03 cs.CV

Learning Fine-Grained Geometry for Sparse-View Splatting via Cascade Depth Loss

Wenjun Lu, Haodong Chen, Anqi Yi, Guoxi Huang, Yuk Ying Chung, Kun Hu, Zhiyong Wang

2505.19585 2026-04-03 cs.CV

CARE: Confidence-aware Ratio Estimation for Medical Biomarkers

Jiameng Li, Teodora Popordanoska, Aleksei Tiulpin, Sebastian G. Gruber, Frederik Maes, Matthew B. Blaschko

Comments 12 pages
