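arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.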

2505.21366 2026-03-03 cs.LG

PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment

Qi Yu, Zhichen Zeng, Yuchen Yan, Zhining Liu, Baoyu Jing, Ruizhong Qiu, Ariful Azad, Hanghang Tong

Comments Published as a conference paper at ICLR 2026

2505.21082 2026-03-03 cs.CL

RPM: Reasoning-Level Personalization for Black-Box Large Language Models

Jieyong Kim, Tongyoung Kim, Soojin Yoon, Jaehyung Kim, Dongha Lee

2505.19193 2026-03-03 cs.LG

SuperMAN: Interpretable and Expressive Networks over Temporally Sparse Heterogeneous Data

Maya Bechler-Speicher, Andrea Zerio, Maor Huri, Marie Vibeke Vestergaard, Ran Gilad-Bachrach, Tine Jess, Samir Bhatt, Aleksejs Sazonovs

2505.17702 2026-03-03 cs.CV cs.AI

Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek

Xueyang Li, Jiahao Li, Yu Song, Yunzhong Lou, Xiangdong Zhou

Comments Accepted to ICLR 2026. The dataset has been released publicly and can be accessed at https://github.com/Sunny-Hack/Seek-CAD

2505.17132 2026-03-03 cs.CV cs.CL

Dynamic Token Reweighting for Robust Vision-Language Models

Tanqiu Jiang, Jiacheng Liang, Rongyi Zhu, Jiawei Zhou, Fenglong Ma, Ting Wang

Comments CVPR 2026

2505.16448 2026-03-03 cs.AI

The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models

Renfei Dang, Zhening Li, Shujian Huang, Jiajun Chen

Comments ICLR 2026 poster

2505.16056 2026-03-03 cs.LG cs.AI

Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models

Jingcong Liang, Siyuan Wang, Miren Tian, Yitong Li, Duyu Tang, Zhongyu Wei

2505.16017 2026-03-03 cs.LG cs.CV

GradPCA: Leveraging NTK Alignment for Reliable Out-of-Distribution Detection

Mariia Seleznova, Hung-Hsu Chou, Claudio Mayrink Verdun, Gitta Kutyniok

Journal ref In Proceedings of International Conference on Learning Representations (ICLR), 2026

2505.14362 2026-03-03 cs.CV

DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning

Ziwei Zheng, Michael Yang, Jack Hong, Chenxiao Zhao, Guohai Xu, Le Yang, Chao Shen, Xing Yu

Comments Accepted by ICLR 2026. Ziwei, Michael, Jack, and Chenxiao contributed equally. The list order is random

2505.14218 2026-03-03 cs.CV

Flexible-weighted Chamfer Distance: Enhanced Objective Function for Point Cloud Completion

Jie Li, Shengwei Tian, Long Yu, Xin Ning

Comments Accepted by IEEE TPAMI 2026. This is the author's version of the work. © 2026 IEEE. Personal use of this material is permitted. Code is available at https://github.com/Carroll-Li/FCD

Journal ref IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

Details

DOI: 10.1109/TPAMI.2026.3669003

Abstract

The Chamfer Distance (CD) is a cornerstone objective function for point cloud completion, yet its inherent symmetric weighting mechanism limits the quality of the generated results. By penalizing local detail deviations and global coverage deficiencies equally, standard CD often causes structural defects such as point aggregation and incomplete spatial structures. We introduce the Flexible-weighted Chamfer Distance (FCD), which decouples CD into local precision and global completeness sub-objectives. FCD employs an asymmetric weighting strategy that prioritizes global structural integrity, steering the optimization away from sub-optimal solutions. As a plug-and-play module with negligible overhead, extensive experiments on state-of-the-art networks demonstrate that FCD significantly enhances global distribution metrics while preserving local precision. Specifically, on the ShapeNet55 benchmark using AdaPoinTr, FCD reduces the Density-aware Chamfer Distance (DCD) by approximately 12.4% (from 0.613 to 0.537), effectively mitigating point clustering. Similarly, on the PCN dataset, the proposed method reduces the Earth Mover's Distance (EMD) from 23.79 to 21.40, demonstrating superior global uniformity compared to the standard CD baseline. Furthermore, FCD demonstrates excellent generalization. When applied to diverse tasks and datasets, including real-world scans (KITTI), industrial components (ABC), and point cloud upsampling (PU-GAN), it yields significant quantitative gains and produces visually more uniform and structurally complete point clouds. These results underscore FCD's potential as a versatile objective function for the broader point cloud generation domain.
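
The abstract above does not give the FCD formula, so the following is only a minimal sketch of the general idea: split the Chamfer Distance into its two directional terms and weight them unequally. The function names, the mapping of directions to "precision" and "completeness", the use of squared distances, and the weight values are all assumptions made for illustration, not the authors' implementation.

```python
def directed_chamfer(src, dst):
    """Mean squared distance from each point in src to its nearest point in dst."""
    total = 0.0
    for p in src:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in dst)
    return total / len(src)


def weighted_chamfer(pred, gt, w_precision=1.0, w_completeness=1.0):
    """Weighted sum of the two directional Chamfer terms (hypothetical form).

    Assumption: the pred -> gt term measures local precision and the
    gt -> pred term measures global completeness (coverage).
    """
    precision = directed_chamfer(pred, gt)
    completeness = directed_chamfer(gt, pred)
    return w_precision * precision + w_completeness * completeness


# Toy case: the prediction collapses onto one ground-truth point (clustering).
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
clustered = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

symmetric = weighted_chamfer(clustered, gt)             # equal weights
asymmetric = weighted_chamfer(clustered, gt, 0.5, 2.0)  # favour completeness
print(symmetric, asymmetric)
```

In this toy case the precision term is zero because every predicted point sits on a ground-truth point, so only the completeness term registers the missing coverage; raising its weight increases the penalty on the clustered prediction.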

2505.12734 2026-03-03 cs.SD cs.AI cs.GR cs.HC eess.AS

SounDiT: Geo-Contextual Soundscape-to-Landscape Generation

Junbo Wang, Haofeng Tan, Bowen Liao, Albert Jiang, Teng Fei, Qixing Huang, Bing Zhou, Zhengzhong Tu, Shan Ye, Yuhao Kang

Comments 12 pages, 4 figures

2505.12096 2026-03-03 cs.LG cs.AI stat.ML

When Bias Meets Trainability: Connecting Theories of Initialization

Alberto Bassi, Marco Baity-Jesi, Aurelien Lucchi, Carlo Albert, Emanuele Francazi

2505.11076 2026-03-03 cs.LG

Addition is almost all you need: Compressing large language models with double binary factorization

Vladimír Boža, Vladimír Macko

2505.09662 2026-03-03 cs.CL

When Large Language Models are More Persuasive Than Incentivized Humans, and Why

Philipp Schoenegger, Francesco Salvi, Jiacheng Liu, Xiaoli Nan, Ramit Debnath, Barbara Fasolo, Evelina Leivada, Gabriel Recchia, Fritz Günther, Ali Zarifhonarvar, Joe Kwon, Zahoor Ul Islam, Marco Dehnert, Daryl Y. H. Lee, Madeline G. Reinecke, David G. Kamper, Mert Kobaş, Adam Sandford, Jonas Kgomo, Luke Hewitt, Shreya Kapoor, Kerem Oktar, Eyup Engin Kucuk, Bo Feng, Cameron R. Jones, Izzy Gainsburg, Sebastian Olschewski, Nora Heinzelmann, Francisco Cruz, Ben M. Tappin, Tao Ma, Peter S. Park, Rayan Onyonka, Arthur Hjorth, Peter Slattery, Qingcheng Zeng, Lennart Finke, Igor Grossmann, Alessandro Salatiello, Ezra Karger

2505.09305 2026-03-03 cs.RO

Embodied intelligent industrial robotics: Framework and techniques

Chaoran Zhang, Chenhao Zhang, Zhaobo Xu, Qinghongbing Xie, Jinliang Hou, Pingfa Feng, Long Zeng

Comments 71 pages, 13 figures. The associated project can be found at https://github.com/jackyzengl/EIIR

2505.06566 2026-03-03 cs.CV

Dynamic Uncertainty Learning with Noisy Correspondence for Text-Based Person Search

Zequn Xie, Haoming Ji, Chengxuan Li, Lingwei Meng

2505.02881 2026-03-03 cs.LG cs.AI

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

Kazuki Fujii, Yukito Tajima, Sakae Mizuki, Masaki Kawamura, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Masanari Oi, Taishi Nakamura, Takumi Okamoto, Shigeki Ishida, Kakeru Hattori, Youmi Ma, Hiroya Takamura, Rio Yokota, Jun Sakuma, Naoaki Okazaki

2505.02825 2026-03-03 cs.CV

Towards Application-Specific Evaluation of Vision Models: Case Studies in Ecology and Biology

Alex Hoi Hang Chan, Otto Brookes, Urs Waldmann, Hemal Naik, Iain D. Couzin, Majid Mirmehdi, Noël Adiko Houa, Emmanuelle Normand, Christophe Boesch, Lukas Boesch, Mimi Arandjelovic, Hjalmar Kühl, Tilo Burghardt, Fumihiro Kano

Comments Accepted at CVPR Workshops, CV4Animals 2025

2504.21464 2026-03-03 cs.CV

VR-FuseNet: A Fusion of Heterogeneous Fundus Data and Explainable Deep Network for Diabetic Retinopathy Classification

Shamim Rahim Refat, Ziyan Shirin Raha, Shuvashis Sarker, Faika Fairuj Preotee, MD. Musfikur Rahman, Tashreef Muhammad, Mohammad Shafiul Alam

Comments Published in Biomedical Materials & Devices (Springer)

Journal ref Biomedical Materials & Devices (2026)

Details

DOI: 10.1007/s44174-026-00638-9

Abstract

Diabetic retinopathy is a severe eye condition caused by diabetes where the retinal blood vessels get damaged and can lead to vision loss and blindness if not treated. Early and accurate detection is key to intervention and stopping the disease progressing. For addressing this disease properly, this paper presents a comprehensive approach for automated diabetic retinopathy detection by proposing a new hybrid deep learning model called VR-FuseNet. Diabetic retinopathy is a major eye disease and leading cause of blindness especially among diabetic patients so accurate and efficient automated detection methods are required. To address the limitations of existing methods including dataset imbalance, diversity and generalization issues this paper presents a hybrid dataset created from five publicly available diabetic retinopathy datasets. Essential preprocessing techniques such as SMOTE for class balancing and CLAHE for image enhancement are applied systematically to the dataset to improve the robustness and generalizability of the dataset. The proposed VR-FuseNet model combines the strengths of two state-of-the-art convolutional neural networks, VGG19 which captures fine-grained spatial features and ResNet50V2 which is known for its deep hierarchical feature extraction. This fusion improves the diagnostic performance and achieves an accuracy of 91.824%. The model outperforms individual architectures on all performance metrics demonstrating the effectiveness of hybrid feature extraction in Diabetic Retinopathy classification tasks. To make the proposed model more clinically useful and interpretable this paper incorporates multiple XAI techniques. These techniques generate visual explanations that clearly indicate the retinal features affecting the model's prediction such as microaneurysms, hemorrhages and exudates so that clinicians can interpret and validate.

2504.12419 2026-03-03 cs.LG math.OC quant-ph

Standardization of Multi-Objective QUBOs

Loong Kuan Lee, Thore Gerlach, Nico Piatkowski

Comments 7 pages, 3 figures; Published in the 2025 IEEE International Conference on Quantum Computing and Engineering (QCE); For associated code, see https://gitlab.com/lklee/qubo-standardization

Journal ref 2025 IEEE International Conference on Quantum Computing and Engineering (QCE) (Vol. 1, pp. 58-64)

2504.03889 2026-03-03 cs.LG

Identifying and Evaluating Inactive Heads in Pretrained LLMs

Pedro Sandoval-Segura, Xijun Wang, Ashwinee Panda, Micah Goldblum, Ronen Basri, Tom Goldstein, David Jacobs

Comments Accepted to ICLR 2026. Code available at https://github.com/psandovalsegura/inactive-heads

2504.01519 2026-03-03 cs.CL eess.AS

Chain of Correction for Full-text Speech Recognition with Large Language Models

Zhiyuan Tang, Dong Wang, Zhikai Zhou, Yong Liu, Shen Huang, Shidong Shang

Comments ICASSP 2026

2503.24378 2026-03-03 cs.AI

ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning

Harsha Kokel, Michael Katz, Kavitha Srinivas, Shirin Sohrabi

Comments Accepted at Proceedings of the Fourteenth International Conference on Learning Representations (ICLR 2026), see https://openreview.net/forum?id=WIXohR7mEo

2503.22178 2026-03-03 cs.LG cs.AI cs.CV

AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

Chanhyuk Lee, Jiho Choi, Chanryeol Lee, Donggyun Kim, Seunghoon Hong

Comments ICLR 2026. Code available at: github.com/david3684/AdaRank

2503.18950 2026-03-03 cs.CV

Target-Aware Video Diffusion Models

Taeksoo Kim, Hanbyul Joo

Comments ICLR 2026. The project page is available at https://taeksuu.github.io/tavid/

2503.16553 2026-03-03 cs.CL

A Foundational Individual Mobility Prediction Model based on Open-Source Large Language Models

Zhenlin Qin, Leizhen Wang, Yancheng Ling, Francisco Camara Pereira, Zhenliang Ma

Journal ref Transportation Research Part C: Emerging Technologies, Vol. 185, 105562 (2026)

2503.11120 2026-03-03 cs.LG cs.CV

A Multi-Objective Evaluation Framework for Analyzing Utility-Fairness Trade-Offs in Machine Learning Systems

Gökhan Özbulak, Oscar Jimenez-del-Toro, Maíra Fatoretto, Lilian Berton, André Anjos

Comments Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2025:050

Journal ref Machine Learning for Biomedical Imaging, 2025 (3)

Details

DOI: 10.59275/j.melba.2025-ab9a

Abstract

The evaluation of fairness models in Machine Learning involves complex challenges, such as defining appropriate metrics, balancing trade-offs between utility and fairness, and there are still gaps in this stage. This work presents a novel multi-objective evaluation framework that enables the analysis of utility-fairness trade-offs in Machine Learning systems. The framework was developed using criteria from Multi-Objective Optimization that collect comprehensive information regarding this complex evaluation task. The assessment of multiple Machine Learning systems is summarized, both quantitatively and qualitatively, in a straightforward manner through a radar chart and a measurement table encompassing various aspects such as convergence, system capacity, and diversity. The framework's compact representation of performance facilitates the comparative analysis of different Machine Learning strategies for decision-makers, in real-world applications, with single or multiple fairness requirements. In particular, this study focuses on the medical imaging domain, where fairness considerations are crucial due to the potential impact of biased diagnostic systems on patient outcomes. The proposed framework enables a systematic evaluation of multiple fairness constraints helping to identify and mitigate disparities among demographic groups while maintaining diagnostic performance. The framework is model-agnostic and flexible to be adapted to any kind of Machine Learning systems, that is, black- or white-box, any kind and quantity of evaluation metrics, including multidimensional fairness criteria. The functionality and effectiveness of the proposed framework is shown with different simulations, and an empirical study conducted on three real-world medical imaging datasets with various Machine Learning systems. Our evaluation framework is publicly available at https://pypi.org/project/fairical.

2503.08980 2026-03-03 cs.LG cs.CL

I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?

Yuhang Liu, Dong Gong, Yichao Cai, Erdun Gao, Zhen Zhang, Biwei Huang, Mingming Gong, Anton van den Hengel, Javen Qinfeng Shi

2503.07392 2026-03-03 cs.CV

SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models

Ouxiang Li, Yuan Wang, Xinting Hu, Houcheng Jiang, Yanbin Hao, Fuli Feng

Comments Accepted to ICLR 2026

2503.06764 2026-03-03 cs.CV cs.AI

SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation

Zisheng Chen, Chunwei Wang, Runhui Huang, Hongbin Xu, Xiuwei Chen, Jun Zhou, Jianhua Han, Hang Xu, Xiaodan Liang

Comments ICLR 2026
