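arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.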

2505.11891 2026-02-03 cs.CL cs.AI

Mobile-Bench-v2: A More Realistic and Comprehensive Benchmark for VLM-based Mobile Agents

Weikai Xu, Zhizheng Jiang, Yuxuan Liu, Pengzhi Gao, Wei Liu, Jian Luan, Yuanchun Li, Yunxin Liu, Bin Wang, Bo An

2505.09134 2026-02-03 cs.LG stat.ML

Scaling Gaussian Process Regression with Full Derivative Observations

Daniel Huang

Comments 13 pages, Published in TMLR

2505.08664 2026-02-03 cs.RO cs.AI

A Social Robot with Inner Speech for Dietary Guidance

Valerio Belcamino, Alessandro Carfì, Valeria Seidita, Fulvio Mastrogiovanni, Antonio Chella

2505.05064 2026-02-03 cs.LG

WaterDrum: Watermarking for Data-centric Unlearning Metric

Xinyang Lu, Xinyuan Niu, Gregory Kang Ruey Lau, Bui Thi Cam Nhung, Rachael Hwee Ling Sim, John Russell Himawan, Fanyu Wen, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low

2504.19472 2026-02-03 cs.CL

Conflicts in Texts: Data, Implications and Challenges

Siyi Liu, Dan Roth

2504.19110 2026-02-03 cs.CL

APE-Bench: Evaluating Automated Proof Engineering for Formal Math Libraries

Huajian Xin, Luming Li, Xiaoran Jin, Jacques Fleuriot, Wenda Li

2504.18881 2026-02-03 cs.LG

TSCAN: Context-Aware Uplift Modeling via Two-Stage Training for Online Merchant Business Diagnosis

Hangtao Zhang, Zhe Li, Kairui Zhang

Comments 15 pages, 7 figures

2504.16063 2026-02-03 cs.CL cs.DB cs.IR

Free Access to World News: Reconstructing Full-Text Articles from GDELT

A. Fronzetti Colladon, R. Vestrelli

Journal ref Big Data and Cognitive Computing, 10(2), 45 (2026)

2504.09970 2026-02-03 cs.LG

ASIL: Augmented Structural Information Learning for Deep Graph Clustering in Hyperbolic Space

Li Sun, Zhenhao Huang, Yujie Wang, Hongbo Lv, Chunyang Liu, Hao Peng, Philip S. Yu

Comments Accepted by IEEE TPAMI, 36 pages

2504.08697 2026-02-03 cs.CL

LLMs as Span Annotators: A Comparative Study of LLMs and Humans

Zdeněk Kasner, Vilém Zouhar, Patrícia Schmidtová, Ivan Kartáč, Kristýna Onderková, Ondřej Plátek, Dimitra Gkatzia, Saad Mahamood, Ondřej Dušek, Simone Balloccu

Comments Accepted to the MME workshop @ EACL 2026

2504.05711 2026-02-03 cs.AI cs.DL cs.IR cs.LG

Automated Archival Descriptions with Federated Intelligence of LLMs

Jinghua Groppe, Andreas Marquet, Annabel Walz, Sven Groppe

Comments 16 pages

2504.05520 2026-02-03 cs.LG cs.CL

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Taiwei Shi, Yiyang Wu, Linxin Song, Tianyi Zhou, Jieyu Zhao

Comments 23 pages, 8 figures, 7 tables

2504.01842 2026-02-03 cs.LG stat.CO

shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python

Martin Jullum, Lars Henry Berge Olsen, Jon Lachmann, Annabelle Redelmeier

2504.01154 2026-02-03 cs.AI cs.GT cs.MA

Past-Discounting is Key for Learning Markovian Fairness with Long Horizons

Ashwin Kumar, William Yeoh

2504.00573 2026-02-03 cs.CL

Training a Utility-based Retriever Through Shared Context Attribution for Retrieval-Augmented Language Models

Yilong Xu, Jinhua Gao, Xiaoming Yu, Yuanhai Xue, Baolong Bi, Huawei Shen, Xueqi Cheng

Comments EMNLP 2025 Main Conference (Long paper)

2503.24047 2026-02-03 cs.AI cs.MA

Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

Shuo Ren, Can Xie, Pu Jian, Zhenjiang Ren, Chunlin Leng, Jiajun Zhang

2503.17279 2026-02-03 cs.CL

CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement

Gaifan Zhang, Yi Zhou, Danushka Bollegala

Comments Accepted to EACL 2026

2503.16718 2026-02-03 cs.SD cs.CL cs.LG

CAARMA: Class Augmentation with Adversarial Mixup Regularization

Massa Baali, Xiang Li, Hao Chen, Syed Abdul Hannan, Rita Singh, Bhiksha Raj

Comments Accepted to EMNLP 2025 Findings

2503.14858 2026-02-03 cs.LG cs.AI

1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities

Kevin Wang, Ishaan Javali, Michał Bortkiewicz, Tomasz Trzciński, Benjamin Eysenbach

Comments Link to project website: https://wang-kevin3290.github.io/scaling-crl/

2503.10304 2026-02-03 cs.LG cs.AI cs.GT

Large-Scale Auto-bidding with Nash Equilibrium Constraints

Zhiyu Mou, Miao Xu, Rongquan Bai, Zhuoran Yang, Chuan Yu, Jian Xu, Bo Zheng

2503.09701 2026-02-03 cs.CL cs.LG

Reassessing Active Learning Adoption in Contemporary NLP: A Community Survey

Julia Romberg, Christopher Schröder, Julius Gonsior, Katrin Tomanek, Fredrik Olsson

Comments EACL 2026 Main Conference

2502.16570 2026-02-03 cs.LG cs.AI cs.CV

Entropy-Lens: Uncovering Decision Strategies in LLMs

Riccardo Ali, Francesco Caso, Christopher Irwin, Pietro Liò

2502.12769 2026-02-03 cs.CL cs.AI

How Much Do LLMs Hallucinate across Languages? On Realistic Multilingual Estimation of LLM Hallucination

Saad Obaid ul Islam, Anne Lauscher, Goran Glavaš

Comments EMNLP 2025

Details

DOI: 10.18653/v1/2025.emnlp-main.1481

Abstract (English)

In the age of misinformation, hallucination - the tendency of Large Language Models (LLMs) to generate non-factual or unfaithful responses - represents the main risk for their global utility. Despite LLMs becoming increasingly multilingual, the vast majority of research on detecting and quantifying LLM hallucination is (a) English-centric and (b) focused on machine translation (MT) and summarization, tasks that are less common in realistic settings than open information seeking. In contrast, we aim to quantify the extent of LLM hallucination across languages in knowledge-intensive long-form question answering (LFQA). To this end, we train a multilingual hallucination detection model and conduct a large-scale study across 30 languages and 6 open-source LLM families. We start from an English hallucination detection dataset and rely on MT to translate-train a detection model. We also manually annotate gold data for five high-resource languages; we then demonstrate, for these languages, that the estimates of hallucination rates are similar between silver (LLM-generated) and gold test sets, validating the use of silver data for estimating hallucination rates for other languages. For the final rate estimation, we build an open-domain QA dataset for 30 languages with LLM-generated prompts and Wikipedia articles as references. Our analysis shows that LLMs, in absolute terms, hallucinate more tokens in high-resource languages due to longer responses, but that the actual hallucination rates (i.e., normalized for length) seem uncorrelated with the sizes of languages' digital footprints. We also find that smaller LLMs hallucinate more and, significantly, that LLMs with broader language support display higher hallucination rates.

2502.11367 2026-02-03 cs.LG cs.AI cs.CL

Sparse Autoencoder Features for Classifications and Transferability

Jack Gallifant, Shan Chen, Kuleen Sasse, Hugo Aerts, Thomas Hartvigsen, Danielle S. Bitterman

2502.05568 2026-02-03 cs.CL cs.AI cs.LG

Large Multimodal Models for Low-Resource Languages: A Survey

Marian Lupascu, Ana-Cristina Rogoz, Mihai Sorin Stupariu, Radu Tudor Ionescu

Comments Accepted in Information Fusion

2502.04528 2026-02-03 cs.CL cs.LG

Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection

Minseok Jung, Cynthia Fuertes Panizo, Liam Dugan, Yi R. Fung, Pin-Yu Chen, Paul Pu Liang

2502.01477 2026-02-03 cs.LG cs.AI

Achieving Time Series Reasoning Requires Rethinking Model Design, Tasks Formulation, and Evaluation

Yaxuan Kong, Yiyuan Yang, Shiyu Wang, Chenghao Liu, Yuxuan Liang, Ming Jin, Stefan Zohren, Dan Pei, Yan Liu, Qingsong Wen

2501.15098 2026-02-03 cs.LG cs.AI

CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter

Zihang Li, Yangdong Ruan, Wenjun Liu, Zhengyang Wang, Tong Yang

2501.14687 2026-02-03 cs.LG cs.AI

Decoding Generalization from Memorization in Deep Neural Networks

Simran Ketha, Venkatakrishnan Ramaswamy

Journal ref Transactions on Machine Learning Research, 2026

2501.08907 2026-02-03 cs.LG cs.AI

PIQL: Projective Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning

Xinchen Han, Hossam Afifi, Michel Marot
