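arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.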

2511.10339 2026-02-10 cs.AI cs.DC cs.GT

Massively Parallel Proof-Number Search for Impartial Games and Beyond

Tomáš Čížek, Martin Balko, Martin Schmid

2511.08579 2026-02-10 cs.CL cs.AI cs.LG

Training Language Models to Explain Their Own Computations

Belinda Z. Li, Zifan Carl Guo, Vincent Huang, Jacob Steinhardt, Jacob Andreas

Comments 23 pages, 8 tables, 7 figures. Code and data at https://github.com/TransluceAI/introspective-interp

2511.08552 2026-02-10 cs.LG cs.IT math.IT

FMMI: Flow Matching Mutual Information Estimation

Ivan Butakov, Alexander Semenenko, Valeriya Kirova, Alexey Frolov, Ivan Oseledets

Comments 11 pages

2511.06250 2026-02-10 cs.LG cs.CV

Test-Time Iterative Error Correction for Efficient Diffusion Models

Yunshan Zhong, Weiqi Yan, Yuxin Zhang

Comments Accepted by ICLR 2026

2511.05403 2026-02-10 cs.CV

PALM: A Dataset and Baseline for Learning Multi-subject Hand Prior

Zicong Fan, Edoardo Remelli, David Dimond, Fadime Sener, Liuhao Ge, Bugra Tekin, Cem Keskin, Shreyas Hampali

2511.04094 2026-02-10 cs.LG

KoTaP: A Panel Dataset for Corporate Tax Avoidance, Performance, and Governance in Korea

Hyungjong Na, Wonho Song, Seungyong Han, Donghyeon Jo, Sejin Myung, Hyungjoon Kim

Comments 18 pages, 3 figures, 8 tables. Submitted to Scientific Data; currently under review. Data and codebook available at Zenodo (DOI: 10.5281/zenodo.17149808)

2510.26098 2026-02-10 cs.AI

GUI Knowledge Bench: Revealing the Knowledge Gap of VLMs in GUI Tasks

Chenrui Shi, Zedong Yu, Zhi Gao, Ruining Feng, Enqi Liu, Yuwei Wu, Yunde Jia, Liuyu Xiang, Zhaofeng He, Qing Li

2510.19349 2026-02-10 cs.LG stat.ML

Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action Spaces

Evgenia Shustova, Marina Sheshukova, Sergey Samsonov, Evgeny Frolov

2510.17541 2026-02-10 cs.RO

Distributed Spatial-Temporal Trajectory Optimization for Unmanned-Aerial-Vehicle Swarm

Xiaobo Zheng, Pan Tang, Defu Lin, Shaoming He

2510.16070 2026-02-10 cs.CV cs.AI cs.HC eess.IV

Effect of Reporting Mode and Clinical Experience on Radiologists' Gaze and Image Analysis Behavior in Chest Radiography

Mahta Khoobi, Marc Sebastian von der Stueck, Felix Barajas Ordonez, Anca-Maria Iancu, Eric Corban, Julia Nowak, Aleksandar Kargaliev, Valeria Perelygina, Anna-Sophie Schott, Daniel Pinto dos Santos, Christiane Kuhl, Daniel Truhn, Sven Nebelung, Robert Siepmann

Comments Preprint version - Under second revision at Radiology (manuscript RAD-25-1348)

Journal ref Radiology 2026; 318(2):e25134

DOI: 10.1148/radiol.251348

Abstract

Structured reporting (SR) and artificial intelligence (AI) may transform how radiologists interact with imaging studies. This prospective study (July to December 2024) evaluated the impact of three reporting modes: free-text (FT), structured reporting (SR), and AI-assisted structured reporting (AI-SR), on image analysis behavior, diagnostic accuracy, efficiency, and user experience. Four novice and four non-novice readers (radiologists and medical students) each analyzed 35 bedside chest radiographs per session using a customized viewer and an eye-tracking system. Outcomes included diagnostic accuracy (compared with expert consensus using Cohen's $κ$), reporting time per radiograph, eye-tracking metrics, and questionnaire-based user experience. Statistical analysis used generalized linear mixed models with Bonferroni post-hoc tests at a significance level of $P \le .01$. Diagnostic accuracy was similar in FT ($κ = 0.58$) and SR ($κ = 0.60$) but higher in AI-SR ($κ = 0.71$, $P < .001$). Reporting times decreased from $88 \pm 38$ s (FT) to $37 \pm 18$ s (SR) and $25 \pm 9$ s (AI-SR) ($P < .001$). Saccade counts for the radiograph field ($205 \pm 135$ (FT), $123 \pm 88$ (SR), $97 \pm 58$ (AI-SR)) and total fixation duration for the report field ($11 \pm 5$ s (FT), $5 \pm 3$ s (SR), $4 \pm 1$ s (AI-SR)) were lower with SR and AI-SR ($P < .001$ each). Novice readers shifted gaze towards the radiograph in SR, while non-novice readers maintained their focus on the radiograph. AI-SR was the preferred mode. In conclusion, SR improves efficiency by guiding visual attention toward the image, and AI-prefilled SR further enhances diagnostic accuracy and user satisfaction.
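
The abstract reports diagnostic accuracy as Cohen's κ against an expert consensus. As a rough illustration of what that statistic measures, here is a minimal sketch of the unweighted two-rater form. The abstract does not say which variant or software the authors used, so the function name `cohens_kappa` and the toy labels are assumptions made purely for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 1 = finding present, 0 = finding absent.
reader = [1, 1, 0, 0]
consensus = [1, 0, 0, 0]
print(cohens_kappa(reader, consensus))
```

Kappa discounts the agreement expected by chance, which is why it is usually preferred over raw percent agreement when label frequencies are unbalanced.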

2510.13201 2026-02-10 cs.CV cs.AI cs.DL cs.LG

Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences

Jing Yang, Qiyao Wei, Jiaxin Pei

Comments ICLR 2026. https://papercopilot.com/

2510.11341 2026-02-10 cs.CV

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

Haomin Wang, Jinhui Yin, Qi Wei, Wenguang Zeng, Lixin Gu, Shenglong Ye, Zhangwei Gao, Yaohui Wang, Yanting Zhang, Yuanqi Li, Yanwen Guo, Wenhai Wang, Kai Chen, Yu Qiao, Hongjie Zhang

2510.11098 2026-02-10 cs.SD cs.CL

VCB Bench: An Evaluation Benchmark for Audio-Grounded Large Language Model Conversational Agents

Jiliang Hu, Wenfu Wang, Zuchao Li, Chenxing Li, Yiyang Zhao, Hanzhao Li, Liqiang Zhang, Meng Yu, Dong Yu

Comments 23 pages, 5 figures

2510.10753 2026-02-10 cs.CV

Restricted Receptive Fields for Face Verification

Kagan Ozturk, Aman Bhatta, Haiyu Wu, Patrick Flynn, Kevin W. Bowyer

2510.10238 2026-02-10 cs.AI

The Achilles' Heel of LLMs: How Altering a Handful of Neurons Can Cripple Language Abilities

Zixuan Qin, Qingchen Yu, Kunlin Lyu, Zhaoxin Fan, Yifan Sun

2510.10152 2026-02-10 cs.CV

Color3D: Controllable and Consistent 3D Colorization with Personalized Colorizer

Yecong Wan, Mingwen Shao, Renlong Wu, Wangmeng Zuo

Comments ICLR 2026. Project page: https://yecongwan.github.io/Color3D/

2510.09092 2026-02-10 cs.CV

GL-DT: Multi-UAV Detection and Tracking with Global-Local Integration

Juanqin Liu, Leonardo Plotegher, Eloy Roura, Shaoming He

2510.08529 2026-02-10 cs.CL cs.AI

CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

Xiangyuan Xue, Yifan Zhou, Guibin Zhang, Zaibin Zhang, Yijiang Li, Chen Zhang, Zhenfei Yin, Philip Torr, Wanli Ouyang, Lei Bai

2510.07915 2026-02-10 cs.CV

MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding

Peiran Wu, Zhuorui Yu, Yunze Liu, Chi-Hao Wu, Enmin Zhou, Junxiao Shen

Comments Accepted at ICLR 2026

2510.07760 2026-02-10 cs.LG cs.AI

VAO: Validation-Aligned Optimization for Cross-Task Generative Auto-Bidding

Yiqin Lv, Zhiyu Mou, Miao Xu, Jinghao Chen, Qi Wang, Yixiu Mao, Yun Qu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng, Xiangyang Ji

2510.06582 2026-02-10 cs.CV cs.RO

Through the Perspective of LiDAR: A Feature-Enriched and Uncertainty-Aware Annotation Pipeline for Terrestrial Point Cloud Segmentation

Fei Zhang, Rob Chancia, Josie Clapp, Amirhossein Hassanzadeh, Dimah Dera, Richard MacKenzie, Jan van Aardt

Comments 40 pages (28 main text), 20 figures, 4 supplementary materials; links to 3D point animations are included in the last table

Abstract

Accurate semantic segmentation of terrestrial laser scanning (TLS) point clouds is limited by costly manual annotation. We propose a semi-automated, uncertainty-aware pipeline that integrates spherical projection, feature enrichment, ensemble learning, and targeted annotation to reduce labeling effort, while sustaining high accuracy. Our approach projects 3D points to a 2D spherical grid, enriches pixels with multi-source features, and trains an ensemble of segmentation networks to produce pseudo-labels and uncertainty maps, the latter guiding annotation of ambiguous regions. The 2D outputs are back-projected to 3D, yielding densely annotated point clouds supported by a three-tier visualization suite (2D feature maps, 3D colorized point clouds, and compact virtual spheres) for rapid triage and reviewer guidance. Using this pipeline, we build Mangrove3D, a semantic segmentation TLS dataset for mangrove forests. We further evaluate data efficiency and feature importance to address two key questions: (1) how much annotated data are needed and (2) which features matter most. Results show that performance saturates after ~12 annotated scans, geometric features contribute the most, and compact nine-channel stacks capture nearly all discriminative power, with the mean Intersection over Union (mIoU) plateauing at around 0.76. Finally, we confirm the generalization of our feature-enrichment strategy through cross-dataset tests on ForestSemantic and Semantic3D. Our contributions include: (i) a robust, uncertainty-aware TLS annotation pipeline with visualization tools; (ii) the Mangrove3D dataset; and (iii) empirical guidance on data efficiency and feature importance, thus enabling scalable, high-quality segmentation of TLS point clouds for ecological monitoring and beyond. The dataset and processing scripts are publicly available at https://fz-rit.github.io/through-the-lidars-eye/.
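
The first step of the pipeline described above projects 3D points onto a 2D spherical grid. The abstract gives no grid resolution or angle conventions, so the following is only a sketch of a generic equirectangular projection under assumed conventions. The function name `spherical_project`, the default grid size, and the axis orientation are all hypothetical.

```python
import math


def spherical_project(points, width=8, height=4):
    """Map (x, y, z) points to (row, col, range) cells of a spherical grid.

    Assumed convention: columns span azimuth in [-pi, pi], rows span
    elevation from +pi/2 (top row) down to -pi/2 (bottom row).
    """
    cells = []
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            cells.append(None)  # the scanner origin has no direction
            continue
        azimuth = math.atan2(y, x)
        # Clamp to guard against rounding slightly outside [-1, 1].
        elevation = math.asin(max(-1.0, min(1.0, z / r)))
        col = int((azimuth + math.pi) / (2.0 * math.pi) * width) % width
        row = min(int((math.pi / 2.0 - elevation) / math.pi * height), height - 1)
        cells.append((row, col, r))
    return cells


# A point on the +x axis lands in the middle of the grid.
print(spherical_project([(1.0, 0.0, 0.0)]))
```

Storing the range `r` alongside each cell is what would allow per-pixel predictions to be back-projected to 3D afterwards, as the abstract describes.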

2510.05846 2026-02-10 cs.CL

Luth: Efficient French Specialization for Small Language Models and Cross-Lingual Transfer

Maxence Lasbordes, Sinoué Gad

Comments Accepted at the EACL 2026 Student Research Workshop (SRW)

2510.01220 2026-02-10 cs.CL cs.AI

Towards Open-Ended Discovery for Low-Resource NLP

Bonaventure F. P. Dossou, Henri Aïdasso

Comments Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP) at EMNLP 2025

2510.00910 2026-02-10 cs.CV

PAL-Net: A Point-Wise CNN with Patch-Attention for 3D Facial Landmark Localization

Ali Shadman Yazdi, Annalisa Cappella, Benedetta Baldini, Riccardo Solazzo, Gianluca Tartaglia, Chiarella Sforza, Giuseppe Baselli

Comments Published in Informatics in Medicine Unlocked. Code available at: https://github.com/Ali5hadman/PAL-Net-A-Point-Wise-CNN-with-Patch-Attention

Journal ref Informatics in Medicine Unlocked, Volume 60, 2026, 101729

DOI: 10.1016/j.imu.2025.101729

Abstract

Manual annotation of anatomical landmarks on 3D facial scans is a time-consuming and expertise-dependent task, yet it remains critical for clinical assessments, morphometric analysis, and craniofacial research. While several deep learning methods have been proposed for facial landmark localization, most focus on pseudo-landmarks or require complex input representations, limiting their clinical applicability. This study presents a fully automated deep learning pipeline (PAL-Net) for localizing 50 anatomical landmarks on stereo-photogrammetry facial models. The method combines coarse alignment, region-of-interest filtering, and an initial approximation of landmarks with a patch-based pointwise CNN enhanced by attention mechanisms. Trained and evaluated on 214 annotated scans from healthy adults, PAL-Net achieved a mean localization error of 3.686 mm and preserved relevant anatomical distances with a 2.822 mm average error, comparable to intra-observer variability. To assess generalization, the model was further evaluated on 700 subjects from the FaceScape dataset, achieving a point-wise error of 0.41 mm and a distance-wise error of 0.38 mm. Compared to existing methods, PAL-Net offers a favorable trade-off between accuracy and computational cost. While performance degrades in regions with poor mesh quality (e.g., ears, hairline), the method demonstrates consistent accuracy across most anatomical regions. PAL-Net generalizes effectively across datasets and facial regions, outperforming existing methods in both point-wise and structural evaluations. It provides a lightweight, scalable solution for high-throughput 3D anthropometric analysis, with potential to support clinical workflows and reduce reliance on manual annotation. Source code can be found at https://github.com/Ali5hadman/PAL-Net-A-Point-Wise-CNN-with-Patch-Attention
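
The point-wise error quoted above is, in the usual reading, the mean Euclidean distance between predicted and reference landmarks. The abstract does not spell out the exact definition, so the sketch below assumes that standard reading. The function name `mean_localization_error` and the coordinates are hypothetical and are not taken from the PAL-Net code.

```python
import math


def mean_localization_error(predicted, reference):
    """Mean Euclidean distance between paired 3D landmarks (same units as input)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need the same non-zero number of landmarks in both lists")
    total = sum(math.dist(p, q) for p, q in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical landmark coordinates in millimetres.
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
reference = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
print(mean_localization_error(predicted, reference))
```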

2509.26321 2026-02-10 cs.LG

A Review on Single-Problem Multi-Attempt Heuristic Optimization

Judith Echevarrieta, Etor Arza, Aritz Pérez, Josu Ceberio

DOI: 10.1145/3795878

Abstract

In certain real-world optimization scenarios, practitioners are not interested in solving multiple problems but rather in finding the best solution to a single, specific problem. When the computational budget is large relative to the cost of evaluating a candidate solution, multiple heuristic alternatives can be tried to solve the same given problem, each possibly with a different algorithm, parameter configuration, initialization, or stopping criterion. In this practically relevant setting, the sequential selection of which alternative to try next is crucial for efficiently identifying the best possible solution across multiple attempts. However, suitable sequential alternative selection strategies have traditionally been studied separately across different research topics and have not been the exclusive focus of any existing review. As a result, the state-of-the-art remains fragmented for practitioners interested in this setting, with surveys either covering only subsets of relevant strategies or including approaches that rely on assumptions that are not feasible for the single-problem case. This work addresses the identified gap by providing a focused review of single-problem multi-attempt heuristic optimization. It brings together suitable strategies for this setting that have been studied separately through algorithm selection, parameter tuning, multi-start, and resource allocation. These strategies are described using a unified terminology within a common framework, which supports the construction of a taxonomy for systematically organizing and classifying them. The resulting comprehensive review facilitates both the identification and the development of strategies for the single-problem multi-attempt setting in practice.

2509.25351 2026-02-10 cs.LG stat.ML

Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region

Shuang Liang, Guido Montúfar

2509.23436 2026-02-10 cs.LG

LOTFormer: Doubly-Stochastic Linear Attention via Low-Rank Optimal Transport

Ashkan Shahbazi, Chayne Thrash, Yikun Bai, Keaton Hamm, Navid NaderiAlizadeh, Soheil Kolouri

2509.22246 2026-02-10 cs.LG cs.AI

ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity

Xiaoyang Liu, Tao Zhu, Zineng Dong, Yuntian Liu, Qingfeng Guo, Zhaoxuan Liu, Yu Chen, Tao Luo

Comments Accepted to ICLR 2026

2509.22219 2026-02-10 cs.LG cs.AI

Interpretable Discovery of One-parameter Subgroups: A Modular Framework for Elliptical, Hyperbolic, and Parabolic Symmetries

Pavan Karjol, Vivek V Kashyap, Rohan Kashyap, Prathosh A P

2509.21880 2026-02-10 cs.CL cs.AI cs.LG

No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping

Thanh-Long V. Le, Myeongho Jeon, Kim Vu, Viet Lai, Eunho Yang

Comments ICLR 2026 camera-ready version

Journal ref The Fourteenth International Conference on Learning Representations (ICLR 2026)
