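arXivDaily is a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems science.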

2602.03709 2026-02-04 cs.CL

No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding

Vynska Amalia Permadi, Xingwei Tan, Nafise Sadat Moosavi, Nikos Aletras

2602.03704 2026-02-04 cs.CL cs.AI

Cognitively Diverse Multiple-Choice Question Generation: A Hybrid Multi-Agent Framework with Large Language Models

Yu Tian, Linh Huynh, Katerina Christhilf, Shubham Chakraborty, Micah Watanabe, Tracy Arner, Danielle McNamara

Comments This manuscript is under review at Electronics

2602.03702 2026-02-04 cs.LG cs.AI math.OC stat.ML

Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging

Alexandru Meterez, Pranav Ajit Nair, Depen Morwani, Cengiz Pehlevan, Sham Kakade

2602.03698 2026-02-04 cs.LG

Data-Driven Graph Filters via Adaptive Spectral Shaping

Dylan Sandfelder, Mihai Cucuringu, Xiaowen Dong

2602.03696 2026-02-04 cs.LG cs.CL

Conflict-Resolving and Sharpness-Aware Minimization for Generalized Knowledge Editing with Multiple Updates

Duy Nguyen, Hanqi Xiao, Archiki Prasad, Elias Stengel-Eskin, Hyunji Lee, Mohit Bansal

Comments 22 pages, 8 figures. Code link: https://github.com/duykhuongnguyen/CoRSA

2602.03693 2026-02-04 cs.CL cs.AI

OCRTurk: A Comprehensive OCR Benchmark for Turkish

Deniz Yılmaz, Evren Ayberk Munis, Çağrı Toraman, Süha Kağan Köse, Burak Aktaş, Mehmet Can Baytekin, Bilge Kaan Görür

Comments Accepted by EACL 2026 SIGTURK

2602.03690 2026-02-04 cs.LG cs.AI

LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization

Zishi Zhang, Jinhui Han, Ming Hu, Yijie Peng

2602.03689 2026-02-04 cs.CL cs.AI

Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation

Jiashuo Sun, Pengcheng Jiang, Saizhuo Wang, Jiajun Fan, Heng Wang, Siru Ouyang, Ming Zhong, Yizhu Jiao, Chengsong Huang, Xueqiang Xu, Pengrui Han, Peiran Li, Jiaxin Huang, Ge Liu, Heng Ji, Jiawei Han

Comments 19 pages, 8 tables, 5 figures

2602.03686 2026-02-04 cs.LG cs.AI

QuAIL: Quality-Aware Inertial Learning for Robust Training under Data Corruption

Mattia Sabella, Alberto Archetti, Pietro Pinoli, Matteo Matteucci, Cinzia Cappiello

2602.03678 2026-02-04 cs.LG cs.AI

ContraLog: Log File Anomaly Detection with Contrastive Learning and Masked Language Modeling

Simon Dietz, Kai Klede, An Nguyen, Bjoern M Eskofier

Comments 26 pages with 16 figures

2602.03673 2026-02-04 cs.CV

Referring Industrial Anomaly Segmentation

Pengfei Yue, Xiaokang Jiang, Yilin Lu, Jianghang Lin, Shengchuan Zhang, Liujuan Cao

2602.03669 2026-02-04 cs.CV cs.AI cs.LG eess.IV

Efficient Sequential Neural Network with Spatial-Temporal Attention and Linear LSTM for Robust Lane Detection Using Multi-Frame Images

Sandeep Patil, Yongqi Dong, Haneen Farah, Hans Hellendoorn

Comments 14 pages, 9 figures, under review by IEEE T-ITS

2602.03665 2026-02-04 cs.CV cs.HC

MM-SCALE: Grounded Multimodal Moral Reasoning via Scalar Judgment and Listwise Alignment

Eunkyu Park, Wesley Hanwen Deng, Cheyon Jin, Matheus Kunzler Maldaner, Jordan Wheeler, Jason I. Hong, Hong Shen, Adam Perer, Ken Holstein, Motahhare Eslami, Gunhee Kim

2602.03652 2026-02-04 cs.CL cs.AI cs.IR

RAGTurk: Best Practices for Retrieval Augmented Generation in Turkish

Süha Kağan Köse, Mehmet Can Baytekin, Burak Aktaş, Bilge Kaan Görür, Evren Ayberk Munis, Deniz Yılmaz, Muhammed Yusuf Kartal, Çağrı Toraman

Comments Accepted by EACL 2026 SIGTURK

2602.03647 2026-02-04 cs.AI cs.CL

Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration

Bowei He, Minda Hu, Zenan Xu, Hongru Wang, Licheng Zong, Yankai Chen, Chen Ma, Xue Liu, Pluto Zhou, Irwin King

2602.03645 2026-02-04 cs.LG

Reinforcement Fine-Tuning for History-Aware Dense Retriever in RAG

Yicheng Zhang, Zhen Qin, Zhaomin Wu, Wenqi Zhang, Shuiguang Deng

Comments Ongoing work. Code is released at https://github.com/zyc140345/HARR

2602.03641 2026-02-04 cs.LG

CTTVAE: Latent Space Structuring for Conditional Tabular Data Generation on Imbalanced Datasets

Milosh Devic, Jordan Gierschendorf, David Garson

2602.03635 2026-02-04 cs.CL cs.LG

TRE: Encouraging Exploration in the Trust Region

Chao Huang, Yujing Lu, Quangang Li, Shenghe Wang, Yan Wang, Yueyang Zhang, Long Xia, Jiashu Zhao, Zhiyuan Sun, Daiting Shi, Tingwen Liu

2602.03634 2026-02-04 cs.CV

SPWOOD: Sparse Partial Weakly-Supervised Oriented Object Detection

Wei Zhang, Xiang Liu, Ningjing Liu, Mingxin Liu, Wei Liao, Chunyan Xu, Xue Yang

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

2602.03633 2026-02-04 cs.CL cs.AI cs.DB

BIRDTurk: Adaptation of the BIRD Text-to-SQL Dataset to Turkish

Burak Aktaş, Mehmet Can Baytekin, Süha Kağan Köse, Ömer İlbilgi, Elif Özge Yılmaz, Çağrı Toraman, Bilge Kaan Görür

Comments Accepted by EACL 2026 SIGTURK

2602.03630 2026-02-04 cs.AI

Can LLMs Do Rocket Science? Exploring the Limits of Complex Reasoning with GTOC 12

Iñaki del Campo, Pablo Cuervo, Victor Rodriguez-Fernandez, Roberto Armellin, Jack Yarndley

Comments Extended version of the paper presented at AIAA SciTech 2026 Forum. Includes further experiments, corrections, and a new appendix

Journal ref Proceedings of the AIAA SciTech 2026 Forum, January 2026

DOI: 10.2514/6.2026-2379

Abstract

Large Language Models (LLMs) have demonstrated remarkable proficiency in code generation and general reasoning, yet their capacity for autonomous multi-stage planning in high-dimensional, physically constrained environments remains an open research question. This study investigates the limits of current AI agents by evaluating them against the 12th Global Trajectory Optimization Competition (GTOC 12), a complex astrodynamics challenge requiring the design of a large-scale asteroid mining campaign. We adapt the MLE-Bench framework to the domain of orbital mechanics and deploy an AIDE-based agent architecture to autonomously generate and refine mission solutions. To assess performance beyond binary validity, we employ an "LLM-as-a-Judge" methodology, utilizing a rubric developed by domain experts to evaluate strategic viability across five structural categories. A comparative analysis of models, ranging from GPT-4-Turbo to reasoning-enhanced architectures like Gemini 2.5 Pro and o3, reveals a significant trend: the average strategic viability score has nearly doubled in the last two years (rising from 9.3 to 17.2 out of 26). However, we identify a critical capability gap between strategy and execution. While advanced models demonstrate sophisticated conceptual understanding, correctly framing objective functions and mission architectures, they consistently fail at implementation due to physical unit inconsistencies, boundary condition errors, and inefficient debugging loops. We conclude that, while current LLMs often demonstrate sufficient knowledge and intelligence to tackle space science tasks, they remain limited by an implementation barrier, functioning as powerful domain facilitators rather than fully autonomous engineers.

2602.03627 2026-02-04 cs.LG

Ultra Fast PDE Solving via Physics Guided Few-step Diffusion

Cindy Xiangrui Kong, Yueqi Wang, Haoyang Zheng, Weijian Luo, Guang Lin

2602.03625 2026-02-04 cs.CV

Multi-Objective Optimization for Synthetic-to-Real Style Transfer

Estelle Chigot, Thomas Oberlin, Manon Huguenin, Dennis Wilson

Comments Accepted in International Conference on the Applications of Evolutionary Computation (Part of EvoStar), April 2026 (EvoApplications 2026)

2602.03623 2026-02-04 cs.RO

Self-supervised Physics-Informed Manipulation of Deformable Linear Objects with Non-negligible Dynamics

Youyuan Long, Gokhan Solak, Sara Zeynalpour, Heng Zhang, Arash Ajoudani

Comments Submitted to IEEE Transactions on Robotics. Video: https://youtu.be/lgX2J-00TRM

2602.03622 2026-02-04 cs.CV physics.med-ph

Quasi-multimodal-based pathophysiological feature learning for retinal disease diagnosis

Lu Zhang, Huizhen Yu, Zuowei Wang, Fu Gui, Yatu Guo, Wei Zhang, Mengyu Jia

Journal ref Zhang, L., Yu, H., Wang, Z., Gui, F., Guo, Y., Zhang, W., Jia, M., 2026. Quasi-multimodal-based pathophysiological feature learning for retinal disease diagnosis. Medical Image Analysis 109, 103886

2602.03615 2026-02-04 cs.CV

KTV: Keyframes and Key Tokens Selection for Efficient Training-Free Video LLMs

Baiyang Song, Jun Peng, Yuxin Zhang, Guangyao Chen, Feidiao Yang, Jianyuan Guo

2602.03614 2026-02-04 cs.LG

Quantization-Aware Regularizers for Deep Neural Networks Compression

Dario Malchiodi, Mattia Ferraretto, Marco Frasca

2602.03611 2026-02-04 cs.LG

Explanations Leak: Membership Inference with Differential Privacy and Active Learning Defense

Fatima Ezzeddine, Osama Zammar, Silvia Giordano, Omran Ayoub

2602.03608 2026-02-04 cs.CL cs.AI cs.IR

Controlling Output Rankings in Generative Engines for LLM-based Search

Haibo Jin, Ruoxi Chen, Peiyan Zhang, Yifeng Luo, Huimin Zeng, Man Luo, Haohan Wang

Comments 23 pages

2602.03603 2026-02-04 cs.RO

Human-in-the-Loop Failure Recovery with Adaptive Task Allocation

Lorena Maria Genua, Nikita Boguslavskii, Zhi Li
