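arXivDaily, a daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.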

2510.09805 2026-01-28 cs.LG cs.AI

Temporal Lifting as Latent-Space Regularization for Continuous-Time Flow Models in AI Systems

Jeffrey Camlin

Comments 7 pages, 1 figure, 1 table, 1 algorithm

2510.09110 2026-01-28 cs.CV cs.AI

Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding

Weikai Huang, Jieyu Zhang, Taoyang Jia, Chenhao Zheng, Ziqi Gao, Jae Sung Park, Winson Han, Ranjay Krishna

Comments Project website: https://github.com/weikaih04/Synthetic-Detection-Segmentation-Grounding-Data

2510.03871 2026-01-28 cs.LG cs.AI stat.ML

Optimal Scaling Needs Optimal Norm

Oleg Filatov, Jiangtao Wang, Jan Ebert, Stefan Kesselheim

2510.03252 2026-01-28 cs.LG cs.AI cs.CV

Universal Multi-Domain Translation via Diffusion Routers

Duc Kieu, Kien Do, Tuan Hoang, Thao Minh Le, Tung Kieu, Dang Nguyen, Thin Nguyen

Comments Accepted in ICLR 2026

2510.02091 2026-01-28 cs.AI

Demystifying the Roles of LLM Layers in Retrieval, Knowledge, and Reasoning

Xinyuan Song, Keyu Wang, PengXiang Li, Lu Yin, Shiwei Liu

Comments Accepted by ICASSP 2026

2510.01812 2026-01-28 cs.SD cs.AI eess.AS

SingMOS-Pro: An Comprehensive Benchmark for Singing Quality Assessment

Yuxun Tang, Lan Liu, Wenhao Feng, Yiwen Zhao, Jionghao Han, Yifeng Yu, Jiatong Shi, Qin Jin

Comments Accepted by ICASSP 2026

2509.26201 2026-01-28 cs.AI cond-mat.mes-hall cond-mat.mtrl-sci

LLM Agents for Knowledge Discovery in Atomic Layer Processing

Andreas Werbrouck, Marshall B. Lindsay, Matthew Maschmann, Matthias J. Young

Comments Accepted submission to the AI4MAT workshop@NEURIPS 2025. As submitted, except author names added

2509.25795 2026-01-28 cs.CL

Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches

Obed Junias, Prajakta Kini, Theodora Chaspari

Comments 7 pages, 1 figure. This paper has been accepted to the IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI 2025), Georgia Institute of Technology, Atlanta, Georgia, October 26-29, 2025

2509.21044 2026-01-28 cs.LG cs.AI

Reinforcement Learning Fine-Tuning Enhances Activation Intensity and Diversity in the Internal Circuitry of LLMs

Honglin Zhang, Qianyue Hao, Fengli Xu, Yong Li

2509.20829 2026-01-28 cs.LG

Explaining Grokking and Information Bottleneck through Neural Collapse Emergence

Keitaro Sakamoto, Issei Sato

Comments Accepted at ICLR 2026. Code is available at https://github.com/keitaroskmt/collapse-dynamics

2509.20674 2026-01-28 cs.RO cs.CV

Equi-RO: A 4D mmWave Radar Odometry via Equivariant Networks

Zeyu Han, Shuocheng Yang, Minghan Zhu, Fang Zhang, Shaobing Xu, Maani Ghaffari, Jianqiang Wang

2509.15098 2026-01-28 cs.CL cs.AI

TextMineX: Data, Evaluation Framework and Ontology-guided LLM Pipeline for Humanitarian Mine Action

Chenyue Zhou, Gürkan Solmaz, Flavio Cirillo, Kiril Gashteovski, Jonathan Fürst

2509.06297 2026-01-28 cs.LG

LoaQ: Layer-wise Output Approximation Quantization

Li Lin, Xiaojun Wan

Comments under review

2509.05895 2026-01-28 cs.CV

BTCChat: Advancing Remote Sensing Bi-temporal Change Captioning with Multimodal Large Language Model

Yujie Li, Wenjia Xu, Yuanben Zhang, Zhiwei Wei, Mugen Peng

Comments 5 pages, 2 figures; Accepted by ICASSP 2026

2509.00961 2026-01-28 cs.AI cs.LG

LLM-Generated Explanations Do Not Suffice for Ultra-Strong Machine Learning

Lun Ai, Johannes Langer, Ute Schmid, Stephen Muggleton

2508.19028 2026-01-28 cs.LG

GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling

Arash Jamshidi, Lauri Seppäläinen, Katsiaryna Haitsiukevich, Hoang Phuc Hau Luu, Anton Björklund, Kai Puolamäki

Journal ref 28th European Conference on Artificial Intelligence (ECAI 2025)

2508.18829 2026-01-28 cs.CV

Assessing the Effectiveness of Deep Embeddings for Tree Species Classification in the Dutch Forest Inventory

Takayuki Ishikawa, Carmelo Bonannella, Bas J. W. Lerink, Marc Rußwurm

2508.10539 2026-01-28 cs.AI cs.CL

Improving Value-based Process Verifier via Low-Cost Variance Reduction

Zetian Sun, Dongfang Li, Baotian Hu, Min Zhang

Comments Accepted by AAAI-2026

2508.10530 2026-01-28 cs.AI cs.CL

Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment?

Zetian Sun, Dongfang Li, Xuhui Chen, Baotian Hu, Min Zhang

Comments Accepted by ICLR-2026

2508.10498 2026-01-28 cs.CV

TweezeEdit: Consistent and Efficient Image Editing with Path Regularization

Jianda Mao, Kaibo Wang, Yang Xiang, Kani Chen

Journal ref AAAI2026

2508.09239 2026-01-28 cs.CV cs.AI

Gradient-Direction-Aware Density Control for 3D Gaussian Splatting

Zheng Zhou, Yu-Jie Xiong, Jia-Chen Zhang, Chun-Ming Xia, Xihe Qiu, Hongjian Zhan

2508.09156 2026-01-28 cs.LG cs.AI stat.AP

Physics-Constrained Fine-Tuning of Flow-Matching Models for Generation and Inverse Problems

Jan Tauberschmidt, Sophie Fellenz, Sebastian J. Vollmer, Andrew B. Duncan

2508.09125 2026-01-28 cs.CL cs.LG

Complex Logical Instruction Generation

Mian Zhang, Shujian Liu, Sixun Dong, Ming Yin, Yebowen Hu, Xun Wang, Steven Ma, Song Wang, Sathish Reddy Indurthi, Haoyun Deng, Zhiyu Zoey Chen, Kaiqiang Song

2508.07295 2026-01-28 cs.CL

CCFQA: A Benchmark for Cross-Lingual and Cross-Modal Speech and Text Factuality Evaluation

Yexing Du, Kaiyuan Liu, Youcheng Pan, Zheng Chu, Bo Yang, Xiaocheng Feng, Ming Liu, Yang Xiang

Comments Accepted in AAAI 2026

2508.06832 2026-01-28 cs.AI

Remote Sensing Image Intelligent Interpretation with the Language-Centered Perspective: Principles, Methods and Challenges

Haifeng Li, Wang Guo, Haiyang Wu, Mengwei Wu, Jipeng Zhang, Qing Zhu, Yu Liu, Xin Huang, Chao Tao

2508.05792 2026-01-28 cs.AI

Holistic Explainable AI (H-XAI): Extending Transparency Beyond Developers in AI-Driven Decision Making

Kausik Lakkaraju, Siva Likitha Valluru, Biplav Srivastava

2508.03785 2026-01-28 cs.LG cs.AI

SoilNet: A Multimodal Multitask Model for Hierarchical Classification of Soil Horizons

Vipin Singh, Teodor Chiaburu, Einar Eberhardt, Stefan Broda, Joey Prüssing, Frank Haußer, Felix Bießmann

Comments 29 pages, 9 figures, 7 tables

Journal ref Geoderma, Volume 466, 2026, 117684

Details

DOI: 10.1016/j.geoderma.2026.117684

Abstract (English)

Recent advances in artificial intelligence (AI), in particular foundation models, have improved the state of the art in many application domains including geosciences. Some specific problems, however, could not benefit from this progress yet. Soil horizon classification, for instance, remains challenging because of its multimodal and multitask characteristics and a complex hierarchically structured label taxonomy. Accurate classification of soil horizons is crucial for monitoring soil condition. In this work, we propose SoilNet, a multimodal multitask model to tackle this problem through a structured modularized pipeline. In contrast to omnipurpose AI foundation models, our approach is designed to be inherently transparent by following the task structure human experts developed for solving this challenging annotation task. The proposed approach integrates image data and geotemporal metadata to first predict depth markers, segmenting the soil profile into horizon candidates. Each segment is characterized by a set of horizon-specific morphological features. Finally, horizon labels are predicted based on the multimodal concatenated feature vector, leveraging a graph-based label representation to account for the complex hierarchical relationships among soil horizons. Our method is designed to address complex hierarchical classification, where the number of possible labels is very large, imbalanced and non-trivially structured. We demonstrate the effectiveness of our approach on a real-world soil profile dataset and a comprehensive user study with domain experts. Our empirical evaluations demonstrate that SoilNet reliably predicts soil horizons that are plausible and accurate. User study results indicate that SoilNet achieves predictive performance on par with or better than that of human experts. All code can be found at: https://github.com/calgo-lab/BGR/

2508.01858 2026-01-28 cs.CL cs.AI

Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents

Yuhan Guo, Cong Guo, Aiwen Sun, Hongliang He, Xinyu Yang, Yue Lu, Yingji Zhang, Xuntao Guo, Dong Zhang, Jianzhuang Liu, Jiang Duan, Yijia Xiao, Liangjian Wen, Hai-Ming Xu, Yong Dai

Comments Accepted to ICLR 2026. Our code and data is released at https://github.com/Gnonymous/Web-CogReasoner

Details

Abstract (English)

Multimodal large-scale models have significantly advanced the development of web agents, enabling perception and interaction with digital environments akin to human cognition. In this paper, we argue that web agents must first acquire sufficient knowledge to effectively engage in cognitive reasoning. Therefore, we decompose a web agent's capabilities into two essential stages: knowledge content learning and cognitive processes. To formalize this, we propose the Web-CogKnowledge Framework, categorizing knowledge as Factual, Conceptual, and Procedural. In this framework, knowledge content learning corresponds to the agent's processes of Memorizing and Understanding, which rely on the first two knowledge types, representing the "what" of learning. Conversely, cognitive processes correspond to Exploring, grounded in Procedural knowledge, defining the "how" of reasoning and action. To facilitate knowledge acquisition, we construct the Web-CogDataset, a structured resource curated from 14 real-world websites, designed to systematically instill the core knowledge necessary for web agents. This dataset serves as the agent's conceptual grounding (the "nouns" upon which comprehension is built) as well as the basis for learning how to reason and act. Building on this foundation, we operationalize these processes through a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework, developing and training our proposed agent, the Web-CogReasoner. Extensive experimentation reveals its significant superiority over existing models, especially in generalizing to unseen tasks where structured knowledge is decisive. To enable rigorous evaluation, we introduce the Web-CogBench, a comprehensive evaluation suite designed to assess and compare agent performance across the delineated knowledge domains and cognitive capabilities. Our code and data are open-sourced at https://github.com/Gnonymous/Web-CogReasoner

2508.01475 2026-01-28 cs.AI

$R^2$-CoD: Understanding Text-Graph Complementarity in Relational Reasoning via Knowledge Co-Distillation

Zhen Wu, Ritam Dutt, Luke M. Breitfeller, Armineh Nourbakhsh, Siddharth Parekh, Carolyn Rosé

Journal ref Proc. IJCNLP-AACL 2025, pages 1628-1652

2508.00600 2026-01-28 cs.CL cs.LG

A Context-Aware Dual-Metric Framework for Confidence Estimation in Large Language Models

Mingruo Yuan, Shuyi Zhang, Ben Kao
