arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2601.19928 2026-01-29 cs.CL

Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures

Yi Hu, Jiaqi Gu, Ruxin Wang, Zijun Yao, Hao Peng, Xiaobao Wu, Jianhui Chen, Muhan Zhang, Liangming Pan

2601.19927 2026-01-29 cs.CL

Attribution Techniques for Mitigating Hallucinated Information in RAG Systems: A Survey

Yuqing Zhao, Ziyao Liu, Yongsen Zheng, Kwok-Yan Lam

Journal ref The 8th International Conference on Artifcial Intelligence in Information and Communication (ICAIIC 2026)

2601.19925 2026-01-29 cs.CL cs.AI

Evaluating Large Language Models for Abstract Evaluation Tasks: An Empirical Study

Yinuo Liu, Emre Sezgin, Eric A. Youngstrom

Comments 17 pages, 4 figures, 2 tables

详情

英文摘要

Introduction: Large language models (LLMs) can process requests and generate texts, but their feasibility for assessing complex academic content needs further investigation. To explore LLM's potential in assisting scientific review, this study examined ChatGPT-5, Gemini-3-Pro, and Claude-Sonnet-4.5's consistency and reliability in evaluating abstracts compared to one another and to human reviewers. Methods: 160 abstracts from a local conference were graded by human reviewers and three LLMs using one rubric. Composite score distributions across three LLMs and fourteen reviewers were examined. Inter-rater reliability was calculated using intraclass correlation coefficients (ICCs) for within-AI reliability and AI-human concordance. Bland-Altman plots were examined for visual agreement patterns and systematic bias. Results: LLMs achieved good-to-excellent agreement with each other (ICCs: 0.59-0.87). ChatGPT and Claude reached moderate agreement with human reviewers on overall quality and content-specific criteria, with ICCs ~.45-.60 for composite, impression, clarity, objective, and results. They exhibited fair agreement on subjective dimensions, with ICC ranging from 0.23-0.38 for impact, engagement, and applicability. Gemini showed fair agreement on half criteria and no reliability on impact and applicability. Three LLMs showed acceptable or negligible mean difference (ChatGPT=0.24, Gemini=0.42, Claude=-0.02) from the human mean composite scores. Discussion: LLMs could process abstracts in batches with moderate agreement with human experts on overall quality and objective criteria. With appropriate process architecture, they can apply a rubric consistently across volumes of abstracts exceeding feasibility for a human rater. The weaker performance on subjective dimensions indicates that AI should serve a complementary role in evaluation, while human expertise remains essential.

URL PDF HTML ☆

赞 0 踩 0

2601.19918 2026-01-29 cs.CL

Lowest Span Confidence: A Zero-Shot Metric for Efficient and Black-Box Hallucination Detection in LLMs

Yitong Qiao, Licheng Pan, Yu Mi, Lei Liu, Yue Shen, Fei Sun, Zhixuan Chu

2601.19916 2026-01-29 cs.CL

PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review

Songjun Tu, Yiwen Ma, Jiahao Lin, Qichao Zhang, Xiangyuan Lan, Junfeng. Li, Nan Xu, Linjing Li, Dongbin Zhao

2601.19915 2026-01-29 cs.CL cs.AI cs.LO

Modeling Next-Token Prediction as Left-Nested Intuitionistic Implication

Paul Tarau

Comments 25 pages

2601.19785 2026-01-29 cs.CV

GeoDiff3D: Self-Supervised 3D Scene Generation with Geometry-Constrained 2D Diffusion Guidance

Haozhi Zhu, Miaomiao Zhao, Dingyao Liu, Runze Tian, Yan Zhang, Jie Guo, Fenggen Yu

2601.19620 2026-01-29 cs.LG cs.AI

R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning

Zhizheng Jiang, Kang Zhao, Weikai Xu, Xinkui Lin, Wei Liu, Jian Luan, Shuo Shang, Peng Han

2601.19489 2026-01-29 cs.CV

Fast Converging 3D Gaussian Splatting for 1-Minute Reconstruction

Ziyu Zhang, Tianle Liu, Diantao Tu, Shuhan Shen

Comments First Rank of SIGGRAPH Asia 2025 3DGS Challenge. Code available at https://github.com/will-zzy/siggraph_asia

2601.19388 2026-01-29 cs.RO

Judgelight: Trajectory-Level Post-Optimization for Multi-Agent Path Finding via Closed-Subwalk Collapsing

Yimin Tang, Sven Koenig, Erdem Bıyık

2601.19380 2026-01-29 cs.CV cs.AI

Tri-Reader: An Open-Access, Multi-Stage AI Pipeline for First-Pass Lung Nodule Annotation in Screening CT

Fakrul Islam Tushar, Joseph Y. Lo

Comments 1 figure , 2 tables, 20 page supplement

2601.19225 2026-01-29 cs.CL cs.AI

RPO-RAG: Aligning Small LLMs with Relation-aware Preference Optimization for Knowledge Graph Question Answering

Kaehyun Um, KyuHwan Yeom, Haerim Yang, Minyoung Choi, Hyeongjun Yang, Kyong-Ho Lee

Comments Accepted at The Web Conference (WWW) 2026

详情

英文摘要

Large Language Models (LLMs) have recently demonstrated remarkable reasoning abilities, yet hallucinate on knowledge-intensive tasks. Retrieval-augmented generation (RAG) mitigates this issue by grounding answers in external sources, e.g., knowledge graphs (KGs). However, existing KG-based RAG approaches rely on semantics-unaware path sampling and are weakly aligned with KG reasoning objectives, which limits further accuracy gains. They also feed retrieved paths directly into the reasoner without organizing them into answer-centered reasoning paths, hindering small LLMs' ability to leverage the retrieved knowledge. Furthermore, prior works predominantly rely on large LLMs (e.g., ChatGPT/GPT-4) or assume backbones above 7B parameters, leaving sub-7B models underexplored. We address this gap with RPO-RAG, the first KG-based RAG framework specifically designed for small LLMs, to the best of our knowledge. RPO-RAG introduces three key innovations: (1) a query-path semantic sampling strategy that provides informative supervisory signals; (2) a relation-aware preference optimization that aligns training with intermediate KG reasoning signals (e.g., relation); and (3) an answer-centered prompt design that organizes entities and reasoning paths in an interpretable format. Extensive experiments on two benchmark Knowledge Graph Question Answering (KGQA) datasets, WebQSP and CWQ, demonstrate that RPO-RAG effectively bridges the performance gap between small and large language models. On WebQSP, it improves F1 by up to 8.8%, reflecting enhanced answer precision, while on CWQ it achieves new state-of-the-art results among models under 8B parameters in both Hit and F1. Overall, RPO-RAG substantially improves the reasoning capability of small LLMs, even under 3B parameters-highlighting their potential for resource-efficient and practical on-device KGQA applications.

URL PDF HTML ☆

赞 0 踩 0

2601.19129 2026-01-29 cs.CV cs.AI

CLIP-Guided Unsupervised Semantic-Aware Exposure Correction

Puzhen Wu, Han Weng, Quan Zheng, Yi Zhan, Hewei Wang, Yiming Li, Jiahui Han, Rui Xu

Comments Accepted at ICASSP 2026

2601.18811 2026-01-29 cs.LG q-fin.CP q-fin.PM quant-ph

Variational Quantum Circuit-Based Reinforcement Learning for Dynamic Portfolio Optimization

Vincent Gurgul, Ying Chen, Stefan Lessmann

2601.18800 2026-01-29 cs.LG cs.CV

NavFormer: IGRF Forecasting in Moving Coordinate Frames

Yoontae Hwang, Dongwoo Lee, Minseok Choi, Heechan Park, Yong Sup Ihn, Daham Kim, Deok-Young Lee

2601.18631 2026-01-29 cs.AI cs.CL cs.CV cs.MA

AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning

Mingyang Song, Haoyu Sun, Jiawei Gu, Linjie Li, Luxin Xu, Ranjay Krishna, Yu Cheng

Comments 28 pages, 10 figures and 13 tables

2601.18543 2026-01-29 cs.CV

GenAgent: Scaling Text-to-Image Generation via Agentic Multimodal Reasoning

Kaixun Jiang, Yuzheng Wang, Junjie Zhou, Pandeng Li, Zhihang Liu, Chen-Wei Xie, Zhaoyu Chen, Yun Zheng, Wenqiang Zhang

2601.17702 2026-01-29 cs.CL cs.LG

S$^3$-Attention:Attention-Aligned Endogenous Retrieval for Memory-Bounded Long-Context Inference

Qingsen Ma, Dianyun Wang, Yaoye Wang, Lechen Ning, Sujie Zhu, Xiaohang Zhang, Jiaming Lyu, Linhao Ren, Zhenbo Xu, Zhaofeng He

2601.17607 2026-01-29 cs.LG

A Thermodynamic Theory of Learning I: Irreversible Ensemble Transport and Epistemic Costs

Daisuke Okanohara

Comments 9 pages. Part I of a planned series entitled "A Thermodynamic Theory of Learning." Minor revisions throughout; appendix added. Clarified the treatment of the noise temperature T and refined the presentation of the Epistemic Speed Limit

2601.17517 2026-01-29 cs.SD cs.LG eess.AS

EuleroDec: A Complex-Valued RVQ-VAE for Efficient and Robust Audio Coding

Luca Cerovaz, Michele Mancusi, Emanuele Rodolà

Comments Accepted at ICASSP 2026

2601.17367 2026-01-29 cs.CL cs.AI

Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Zecheng Tang, Quantong Qiu, Yi Yang, Zhiyi Hong, Haiya Xiang, Kebin Liu, Qingqing Dang, Juntao Li, Min Zhang

2601.17216 2026-01-29 cs.CV cs.AI cs.LG eess.IV

Spatiotemporal Semantic V2X Framework for Cooperative Collision Prediction

Murat Arda Onsu, Poonam Lohan, Burak Kantarci, Aisha Syed, Matthew Andrews, Sean Kennedy

Comments 6 pages 5 figures, accepted to IEEE ICC 2026

2601.17046 2026-01-29 cs.CV cs.LG

Atomic Depth Estimation From Noisy Electron Microscopy Data Via Deep Learning

Matan Leibovich, Mai Tan, Ramon Manzorro, Adria Marcos-Morales, Sreyas Mohan, Peter A. Crozier, Carlos Fernandez-Granda

2601.16991 2026-01-29 cs.LG cs.AI

Sparsity-Aware Low-Rank Representation for Efficient Fine-Tuning of Large Language Models

Longteng Zhang, Sen Wu, Shuai Hou, Zhengyu Qing, Zhuo Zheng, Danning Ke, Qihong Lin, Qiang Wang, Shaohuai Shi, Xiaowen Chu

2601.16669 2026-01-29 cs.CL cs.AI cs.CY

PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice

Yuzhen Shi, Huanghai Liu, Yiran Hu, Gaojie Song, Xinran Xu, Yubo Ma, Tianyi Tang, Li Zhang, Qingjing Chen, Di Feng, Wenbo Lv, Weiheng Wu, Kexin Yang, Sen Yang, Wei Wang, Rongyao Shi, Yuanyang Qiu, Yuemeng Qi, Jingwen Zhang, Xiaoyu Sui, Yifan Chen, Yi Zhang, An Yang, Bowen Yu, Dayiheng Liu, Junyang Lin, Weixing Shen, Bing Zhao, Charles L. A. Clarke, Hu Wei

2601.14417 2026-01-29 cs.CL

Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

Thanathai Lertpetchpun, Yoonjeong Lee, Thanapat Trachu, Jihwan Lee, Tiantian Feng, Dani Byrd, Shrikanth Narayanan

Comments Accepted to ICASSP2026

2601.12706 2026-01-29 cs.LG

Trend-Adjusted Time Series Models with an Application to Gold Price Forecasting

Sina Kazemdehbashi

2601.10992 2026-01-29 cs.LG stat.CO

Constant Metric Scaling in Riemannian Computation

Kisung You

2601.10187 2026-01-29 cs.CL cs.AI

HOMURA: Taming the Sand-Glass for Time-Constrained LLM Translation via Reinforcement Learning

Ziang Cui, Mengran Yu, Tianjiao Li, Chenyu Shi, Yingxuan Shi, Lusheng Zhang, Hongwei Lin

2601.09831 2026-01-29 cs.LG math.OC

A New Convergence Analysis of Plug-and-Play Proximal Gradient Descent Under Prior Mismatch

Guixian Xu, Jinglai Li, Junqi Tang

AI 大模型

视觉与机器人

科学与医疗