arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2601.18796 2026-01-27 cs.CL cs.AI cs.LG

ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language Models

Brian Ondov, Chia-Hsuan Chang, Yujia Zhou, Mauro Giuffrè, Hua Xu

2601.18791 2026-01-27 cs.CL cs.AI cs.LG

Subword-Based Comparative Linguistics across 242 Languages Using Wikipedia Glottosets

Iaroslav Chelombitko, Mika Hämäläinen, Aleksey Komissarov

Comments 15 pages, 4 figues, 4 tables

2601.18790 2026-01-27 cs.CL

MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts

Etienne Lanzeray, Stephane Meilliez, Malo Ruelle, Damien Sileo

2601.18788 2026-01-27 cs.CL cs.LG stat.ML

Unsupervised Text Segmentation via Kernel Change-Point Detection on Sentence Embeddings

Mumin Jia, Jairo Diaz-Rodriguez

Comments arXiv admin note: substantial text overlap with arXiv:2510.03437. substantial text overlap with arXiv:2510.03437. substantial text overlap with arXiv:2510.03437. substantial text overlap with arXiv:2510.03437

2601.18779 2026-01-27 cs.LG cs.AI cs.CL

POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration

Yuxiao Qu, Amrith Setlur, Virginia Smith, Ruslan Salakhutdinov, Aviral Kumar

2601.18771 2026-01-27 cs.CL cs.AI cs.IR

Dep-Search: Learning Dependency-Aware Reasoning Traces with Persistent Memory

Yanming Liu, Xinyue Peng, Zixuan Yan, Yanxin Shen, Wenjie Xu, Yuefeng Huang, Xinyi Wang, Jiannan Cao, Jianwei Yin, Xuhong Zhang

Comments Dep-Search 1st version

2601.18760 2026-01-27 cs.LG cs.CL

Beyond Preferences: Learning Alignment Principles Grounded in Human Reasons and Values

Henry Bell, Lara Neubauer da Costa Schertel, Bochu Ding, Brandon Fain

2601.18751 2026-01-27 cs.LG cs.AI

Trust, Don't Trust, or Flip: Robust Preference-Based Reinforcement Learning with Multi-Expert Feedback

Seyed Amir Hosseini, Maryam Abdolali, Amirhosein Tavakkoli, Fardin Ayar, Ehsan Javanmardi, Manabu Tsukada, Mahdi Javanmardi

Comments Equal contribution: Seyed Amir Hosseini and Maryam Abdolali. Corresponding author: Maryam Abdolali (maryam.abdolali@kntu.ac.ir)

2601.18736 2026-01-27 cs.LG cs.NI

Benchmarking Machine Learning Models for IoT Malware Detection under Data Scarcity and Drift

Jake Lyon, Ehsan Saeedizade, Shamik Sengupta

2601.18735 2026-01-27 cs.AI cs.LG

Why Keep Your Doubts to Yourself? Trading Visual Uncertainties in Multi-Agent Bandit Systems

Jusheng Zhang, Yijia Fan, Kaitong Cai, Jing Yang, Jiawei Yao, Jian Wang, Guanlong Qu, Ziliang Chen, Keze Wang

Comments Accepted to ICLR 2026

2601.18733 2026-01-27 cs.RO cs.AI cs.CV

Advances and Innovations in the Multi-Agent Robotic System (MARS) Challenge

Li Kang, Heng Zhou, Xiufeng Song, Rui Li, Bruno N. Y. Chen, Ziye Wang, Ximeng Meng, Stone Tao, Yiran Qin, Xiaohong Liu, Ruimao Zhang, Lei Bai, Yilun Du, Hao Su, Philip Torr, Zhenfei Yin, Ruihao Gong, Yejun Zeng, Fengjun Zhong, Shenghao Jin, Jinyang Guo, Xianglong Liu, Xiaojun Jia, Tianqi Shan, Wenqi Ren, Simeng Qin, Jialing Yang, Xiaoyu Ma, Tianxing Chen, Zixuan Li, Zijian Cai, Yan Qin, Yusen Qin, Qiangyu Chen, Kaixuan Wang, Zhaoming Han, Yao Mu, Ping Luo, Yuanqi Yao, Haoming Song, Jan-Nico Zaech, Fabien Despinoy, Danda Pani Paudel, Luc Van Gool

Comments MARS Challenge @ NeurIPS 2025 Workshop on Space in Vision, Language, and Embodied AI. Challenge page: https://mars-eai.github.io/MARS-Challenge-Webpage/

2601.18730 2026-01-27 cs.CL cs.LG

Reflect: Transparent Principle-Guided Reasoning for Constitutional Alignment at Scale

Henry Bell, Caroline Zhang, Mohammed Mobasserul Haque, Dhaval Potdar, Samia Zaman, Brandon Fain

2601.18724 2026-01-27 cs.CL cs.AI cs.DL

HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences

Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe

Comments Work In Progress

2601.18723 2026-01-27 cs.RO

Trustworthy Evaluation of Robotic Manipulation: A New Benchmark and AutoEval Methods

Mengyuan Liu, Juyi Sheng, Peiming Li, Ziyi Wang, Tianming Xu, Tiantian Xu, Hong Liu

详情

英文摘要

Driven by the rapid evolution of Vision-Action and Vision-Language-Action models, imitation learning has significantly advanced robotic manipulation capabilities. However, evaluation methodologies have lagged behind, hindering the establishment of Trustworthy Evaluation for these behaviors. Current paradigms rely on binary success rates, failing to address the critical dimensions of trust: Source Authenticity (i.e., distinguishing genuine policy behaviors from human teleoperation) and Execution Quality (e.g., smoothness and safety). To bridge these gaps, we propose a solution that combines the Eval-Actions benchmark and the AutoEval architecture. First, we construct the Eval-Actions benchmark to support trustworthiness analysis. Distinct from existing datasets restricted to successful human demonstrations, Eval-Actions integrates VA and VLA policy execution trajectories alongside human teleoperation data, explicitly including failure scenarios. This dataset is structured around three core supervision signals: Expert Grading (EG), Rank-Guided preferences (RG), and Chain-of-Thought (CoT). Building on this, we propose the AutoEval architecture: AutoEval leverages Spatio-Temporal Aggregation for semantic assessment, augmented by an auxiliary Kinematic Calibration Signal to refine motion smoothness; AutoEval Plus (AutoEval-P) incorporates the Group Relative Policy Optimization (GRPO) paradigm to enhance logical reasoning capabilities. Experiments show AutoEval achieves Spearman's Rank Correlation Coefficients (SRCC) of 0.81 and 0.84 under the EG and RG protocols, respectively. Crucially, the framework possesses robust source discrimination capabilities, distinguishing between policy-generated and teleoperated videos with 99.6% accuracy, thereby establishing a rigorous standard for trustworthy robotic evaluation. Our project and code are available at https://term-bench.github.io/.

URL PDF HTML ☆

赞 0 踩 0

2601.18722 2026-01-27 cs.CL cs.LG

Gained in Translation: Privileged Pairwise Judges Enhance Multilingual Reasoning

Lintang Sutawika, Gokul Swamy, Zhiwei Steven Wu, Graham Neubig

Comments Code available at https://github.com/lintangsutawika/SP3F

2601.18716 2026-01-27 cs.AI q-bio.BM

Conditioned Generative Modeling of Molecular Glues: A Realistic AI Approach for Synthesizable Drug-like Molecules

Naeyma N. Islam, Thomas R. Caulfield

Comments 30 pages, 8 figures

Journal ref Biomolecules 2025, 15, 849

2601.18706 2026-01-27 cs.AI cs.LG

Health-SCORE: Towards Scalable Rubrics for Improving Health-LLMs

Zhichao Yang, Sepehr Janghorbani, Dongxu Zhang, Jun Han, Qian Qian, Andrew Ressler, Gregory D. Lyng, Sanjit Singh Batra, Robert E. Tillman

2601.18698 2026-01-27 cs.CV

Are Video Generation Models Geographically Fair? An Attraction-Centric Evaluation of Global Visual Knowledge

Xiao Liu, Jiawei Zhang

Comments Work in progress

2601.18694 2026-01-27 cs.SD cs.AI

Neural Multi-Speaker Voice Cloning for Nepali in Low-Resource Settings

Aayush M. Shrestha, Aditya Bajracharya, Projan Shakya, Dinesh B. Kshatri

Comments 16 pages with appendix included

2601.18676 2026-01-27 cs.LG

Quasi Monte Carlo methods enable extremely low-dimensional deep generative models

Miles Martinez, Alex H. Williams

2601.18638 2026-01-27 cs.LG physics.comp-ph

Physics-Informed Uncertainty Enables Reliable AI-driven Design

Tingkai Xue, Chin Chun Ooi, Yang Jiang, Luu Trung Pham Duong, Pao-Hsiung Chiu, Weijiang Zhao, Nagarajan Raghavan, My Ha Dao

2601.18633 2026-01-27 cs.CV

Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting

Tong Shi, Melonie de Almeida, Daniela Ivanova, Nicolas Pugeault, Paul Henderson

2601.18630 2026-01-27 cs.AI cs.HC

Assessing the Quality of Mental Health Support in LLM Responses through Multi-Attribute Human Evaluation

Abeer Badawi, Md Tahmid Rahman Laskar, Elahe Rahimi, Sheri Grach, Lindsay Bertrand, Lames Danok, Frank Rudzicz, Jimmy Huang, Elham Dolatabadi

2601.18629 2026-01-27 cs.RO

ExoGS: A 4D Real-to-Sim-to-Real Framework for Scalable Manipulation Data Collection

Yiming Wang, Ruogu Zhang, Minyang Li, Hao Shi, Junbo Wang, Deyi Li, Jieji Ren, Wenhai Liu, Weiming Wang, Hao-Shu Fang

2601.18625 2026-01-27 cs.CV

CONQUER: Context-Aware Representation with Query Enhancement for Text-Based Person Search

Zequn Xie

Comments Accepted by ICASSP 2026

2601.18620 2026-01-27 cs.LG

CASSANDRA: Programmatic and Probabilistic Learning and Inference for Stochastic World Modeling

Panagiotis Lymperopoulos, Abhiramon Rajasekharan, Ian Berlot-Attwell, Stéphane Aroca-Ouellette, Kaheer Suleman

Comments 28 pages, 2 figures

2601.18619 2026-01-27 cs.CV

Scale-Aware Self-Supervised Learning for Segmentation of Small and Sparse Structures

Jorge Quesada, Ghassan AlRegib

2601.18617 2026-01-27 cs.AI cs.CL

Emergence of Phonemic, Syntactic, and Semantic Representations in Artificial Neural Networks

Pierre Orhan, Pablo Diego-Simón, Emmnanuel Chemla, Yair Lakretz, Yves Boubenec, Jean-Rémi King

2601.18615 2026-01-27 cs.LG

Geometry-Free Conditional Diffusion Modeling for Solving the Inverse Electrocardiography Problem

Ramiro Valdes Jara, Adam Meyers

2601.18595 2026-01-27 cs.AI cs.LG

A Balanced Neuro-Symbolic Approach for Commonsense Abductive Logic

Joseph Cotnareanu, Didier Chetelat, Yingxue Zhang, Mark Coates

AI 大模型

视觉与机器人

科学与医疗