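arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems.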

2503.06749 2026-03-03 cs.CV cs.AI cs.CL cs.LG

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Wenxuan Huang, Bohan Jia, Zijie Zhai, Shaosheng Cao, Zheyu Ye, Fei Zhao, Zhe Xu, Xu Tang, Yao Hu, Shaohui Lin

Comments Accepted to ICLR 2026. Code is available at https://github.com/Osilly/Vision-R1

2503.03862 2026-03-03 cs.CL cs.AI

Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

Emmy Liu, Amanda Bertsch, Lintang Sutawika, Lindia Tjuatja, Patrick Fernandes, Lara Marinov, Michael Chen, Shreya Singhal, Carolin Lawrence, Aditi Raghunathan, Kiril Gashteovski, Graham Neubig

2503.02623 2026-03-03 cs.CL cs.AI

Rewarding Doubt: A Reinforcement Learning Approach to Calibrated Confidence Expression of Large Language Models

David Bani-Harouni, Chantal Pellegrini, Paul Stangel, Ege Özsoy, Kamilia Zaripova, Nassir Navab, Matthias Keicher

2502.19949 2026-03-03 cs.LG eess.SP

Machine-learning for photoplethysmography analysis: Benchmarking feature, image, and signal-based approaches

Mohammad Moulaeifard, Loic Coquelin, Mantas Rinkevičius, Andrius Sološenko, Oskar Pfeffer, Ciaran Bench, Nando Hegemann, Sara Vardanega, Manasi Nandi, Jordi Alastruey, Christian Heiss, Vaidotas Marozas, Andrew Thompson, Philip J. Aston, Peter H. Charlton, Nils Strodthoff

Comments 39 pages, 9 figures, code available at https://gitlab.com/qumphy/d1-code

2502.19167 2026-03-03 cs.LG eess.SP

Generalizable deep learning for photoplethysmography-based blood pressure estimation -- A Benchmarking Study

Mohammad Moulaeifard, Peter H. Charlton, Nils Strodthoff

Comments 20 pages, 5 figures, code available at https://github.com/AI4HealthUOL/ppg-ood-generalization

Journal ref Machine Learning: Health 1(1):010501, 2025

2502.15021 2026-03-03 cs.CV

Thicker and Quicker: A Jumbo Token for Fast Plain Vision Transformers

Anthony Fuller, Yousef Yassin, Daniel G. Kyrollos, Evan Shelhamer, James R. Green

Comments ICLR 2026

2502.12179 2026-03-03 cs.LG cs.AI cs.CL

Sparse Shift Autoencoders for Identifying Concepts from Large Language Model Activations

Shruti Joshi, Andrea Dittadi, Sébastien Lachapelle, Dhanya Sridhar

Comments 27 pages, 9 figures

2502.06885 2026-03-03 cs.LG cs.AI

Topological derivative approach for deep neural network architecture adaptation

C G Krishnanunni, Tan Bui-Thanh, Clint Dawson

Abstract

This work presents a novel algorithm for progressively adapting neural network architecture along the depth. In particular, we attempt to address the following questions in a mathematically principled way: i) Where to add a new capacity (layer) during the training process? ii) How to initialize the new capacity? At the heart of our approach are two key ingredients: i) the introduction of a "shape functional" to be minimized, which depends on neural network topology, and ii) the introduction of a topological derivative of the shape functional with respect to the neural network topology. Using an optimal control viewpoint, we show that the network topological derivative exists under certain conditions, and its closed-form expression is derived. In particular, we explore, for the first time, the connection between the topological derivative from a topology optimization framework with the Hamiltonian from optimal control theory. Further, we show that the optimality condition for the shape functional leads to an eigenvalue problem for deep neural architecture adaptation. Our approach thus determines the most sensitive location along the depth where a new layer needs to be inserted during the training phase and the associated parametric initialization for the newly added layer. We also demonstrate that our layer insertion strategy can be derived from an optimal transport viewpoint as a solution to maximizing a topological derivative in $p$-Wasserstein space, where $p \geq 1$. Numerical investigations with fully connected network, convolutional neural network, and vision transformer on various regression and classification problems demonstrate that our proposed approach can outperform an ad-hoc baseline network and other architecture adaptation strategies. Further, we also demonstrate other applications of topological derivative in fields such as transfer learning.

2502.04326 2026-03-03 cs.CV cs.AI

WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs

Jack Hong, Shilin Yan, Jiayin Cai, Xiaolong Jiang, Yao Hu, Weidi Xie

Comments Accepted by ICLR2026

2502.03566 2026-03-03 cs.CV cs.LG

CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Darina Koishigarina, Arnas Uselis, Seong Joon Oh

Comments ICLR 2026

2502.02339 2026-03-03 cs.CL

AStar: Boosting Multimodal Reasoning with Automated Structured Thinking

Jinyang Wu, Mingkuan Feng, Guocheng Zhai, Shuai Zhang, Zheng Lian, Fangrui Lv, Pengpeng Shao, Ruihan Jin, Zhengqi Wen, Jianhua Tao

Comments Accepted by AAAI 2026 Oral

2501.12739 2026-03-03 cs.LG

Multiscale Training of Convolutional Neural Networks

Shadab Ahamed, Niloufar Zakariaei, Eldad Haber, Moshe Eliasof

Comments 25 pages, 10 figures, 8 tables

2501.10067 2026-03-03 cs.CV

FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization

Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang

2412.20377 2026-03-03 cs.LG cs.CY

On Demographic Group Fairness Guarantees in Deep Learning

Yan Luo, Congcong Wen, Min Shi, Hao Huang, Yi Fang, Mengyu Wang

Comments Accepted for publication in TPAMI 2026

2412.18564 2026-03-03 cs.LG cs.CE

Efficient Aircraft Design Optimization Using Multi-Fidelity Models and Multi-fidelity Physics Informed Neural Networks

Apurba Sarker

Comments 7 pages, 3 figures

2412.03772 2026-03-03 cs.AI

A Contemporary Overview: Trends and Applications of Large Language Models on Mobile Devices

Lianjun Liu, Hongli An, Pengxuan Chen, Longxiang Ye

Comments The authors withdraw this manuscript. A substantially revised version will be submitted later

2412.01948 2026-03-03 cs.AI

The Evolution and Future Perspectives of Artificial Intelligence Generated Content

Chengzhang Zhu, Luobin Cui, Ying Tang, Jiacun Wang

Comments 13 pages, 16 figures

Journal ref IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. PP, no. 99, pp. 1-19, 2025

2411.10492 2026-03-03 cs.CV eess.IV

MFP3D: Monocular Food Portion Estimation Leveraging 3D Point Clouds

Jinge Ma, Xiaoyan Zhang, Gautham Vinod, Siddeshwar Raghavan, Jiangpeng He, Fengqing Zhu

Comments 9th International Workshop on Multimedia Assisted Dietary Management, in conjunction with the 27th International Conference on Pattern Recognition (ICPR2024)

2411.07430 2026-03-03 cs.CV

XPoint: A Self-Supervised Visual-State-Space based Architecture for Multispectral Image Registration

Ismail Can Yagmur, Hasan F. Ates, Bahadir K. Gunturk

Comments 13 pages, 11 figures, 1 table, Journal

Journal ref IEEE Access, 2026

DOI: 10.1109/ACCESS.2026.3668631

Abstract

Accurate multispectral image matching presents significant challenges due to non-linear intensity variations across spectral modalities, extreme viewpoint changes, and the scarcity of labeled datasets. Current state-of-the-art methods are typically specialized for a single spectral difference, such as visible-infrared, and struggle to adapt to other modalities due to their reliance on expensive supervision, such as depth maps or camera poses. To address the need for rapid adaptation across modalities, we introduce XPoint, a self-supervised, modular image-matching framework designed for adaptive training and fine-tuning on aligned multispectral datasets, allowing users to customize key components based on their specific tasks. XPoint employs modularity and self-supervision to allow for the adjustment of elements such as the base detector, which generates pseudo-ground-truth keypoints invariant to viewpoint and spectrum variations. The framework integrates a VMamba encoder, pretrained on segmentation tasks, for robust feature extraction, and includes three joint decoder heads: two are dedicated to interest point and descriptor extraction; and a task-specific homography regression head imposes geometric constraints for superior performance in tasks like image registration. This flexible architecture enables quick adaptation to a wide range of modalities, demonstrated by training on Optical-Thermal data and fine-tuning on settings such as visual-near infrared, visual-infrared, visual-longwave infrared, and visual-synthetic aperture radar. Experimental results show that XPoint consistently outperforms or matches state-of-the-art methods in feature matching and image registration tasks across five distinct multispectral datasets. Our source code is available at https://github.com/canyagmur/XPoint.

2411.00472 2026-03-03 cs.CV

MV-Adapter: Enhancing Underwater Instance Segmentation via Adaptive Channel Attention

Lianjun Liu

Comments The authors withdraw this manuscript. A substantially revised version will be submitted later

2410.16597 2026-03-03 cs.CL cs.IR

Scaling Knowledge Graph Construction through Synthetic Data Generation and Distillation

Prafulla Kumar Choubey, Xin Su, Man Luo, Xiangyu Peng, Caiming Xiong, Tiep Le, Shachar Rosenman, Vasudev Lal, Phil Mui, Ricky Ho, Phillip Howard, Chien-Sheng Wu

2410.05669 2026-03-03 cs.AI

ACPBench: Reasoning about Action, Change, and Planning

Harsha Kokel, Michael Katz, Kavitha Srinivas, Shirin Sohrabi

Comments In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2025)

2409.12446 2026-03-03 cs.LG cs.AI math.ST stat.ML stat.TH

Neural Networks Generalize on Low Complexity Data

Sourav Chatterjee, Timothy Sudijono

Comments 37 pages. Small corrections made

2408.05233 2026-03-03 cs.AI

Electric Vehicle User Charging Behavior Analysis Integrating Psychological and Environmental Factors: A Statistical-Driven LLM based Agent Approach

Chuanlin Zhang, Junkang Feng, Chenggang Cui, Pengfeng Lin, Hui Chen, Yan Xu, A. M. Y. M. Ghias, Qianguang Ma, Pei Zhang

Comments Accepted for publication in CSEE Journal of Power and Energy Systems

2407.15663 2026-03-03 cs.CV

MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics

Alexander Melekhin, Dmitry Yudin, Ilia Petryashin, Vitaly Bezuglyj

Comments This work has been submitted to the IEEE for possible publication

2407.13750 2026-03-03 cs.CV

PO-GUISE+: Pose and object guided transformer token selection for efficient driver action recognition

Ricardo Pizarro, Roberto Valle, Rafael Barea, Jose M. Buenaposada, Luis Baumela, Luis Miguel Bergasa

Journal ref IEEE Transactions on Intelligent Transportation Systems (2026)

2406.17297 2026-03-03 cs.CV cs.AI

Towards Camera Open-set 3D Object Detection for Autonomous Driving Scenarios

Zhuolin He, Xinrun Li, Jiacheng Tang, Shoumeng Qiu, Wenfu Wang, Xiangyang Xue, Jian Pu

2406.07670 2026-03-03 cs.RO

Design and Control of a Compact Series Elastic Actuator Module for Robots in MRI Scanners

Binghan He, Naichen Zhao, David Y. Guo, Charles H. Paxson, Alfredo De Goyeneche, Michael Lustig, Chunlei Liu, Ronald S. Fearing

2404.13671 2026-03-03 cs.CV cs.LG

FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization

Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, Jinqiao Wang

Comments Accepted by ACM MM 2024

2402.06223 2026-03-03 cs.LG cs.CV stat.ML

Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning

Yuhang Liu, Zhen Zhang, Dong Gong, Erdun Gao, Biwei Huang, Mingming Gong, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi
