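arXivDaily is a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations. It covers artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems, among other fields.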

2502.15633 2026-02-16 cs.CV

RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes

Sicheng Yu, Chong Cheng, Yifan Zhou, Xiaojun Yang, Hao Wang

Comments ICRA 2025

Journal ref 2025 IEEE International Conference on Robotics and Automation (ICRA), pp. 11068-11074, May 2025

2501.18138 2026-02-16 cs.LG

B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning

Woojun Kim, Katia Sycara

Comments Accepted at the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)

2501.05454 2026-02-16 cs.AI cs.LO

The Epistemic Asymmetry of Consciousness Self-Reports: A Formal Analysis of AI Consciousness Denial

Chang-Eop Kim

Comments 6 pages, 0 figures

2412.14058 2026-02-16 cs.RO cs.CV

What Matters in Building Vision-Language-Action Models for Generalist Robots

Xinghang Li, Peiyan Li, Long Qian, Minghuan Liu, Dong Wang, Jirong Liu, Bingyi Kang, Xiao Ma, Xinlong Wang, Di Guo, Tao Kong, Hanbo Zhang, Huaping Liu

Comments Project page: robovlms.github.io. Added limitations and future works. Fix categorization

2412.07909 2026-02-16 cs.LG cs.AI cs.CV

Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning

Can Yaras, Siyi Chen, Peng Wang, Qing Qu

Comments The first two authors contributed equally to this work

2412.06001 2026-02-16 cs.SD cs.MM eess.AS

M6: Multi-generator, Multi-domain, Multi-lingual and cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases

Yupei Li, Hanqian Li, Lucia Specia, Björn W. Schuller

Comments Accepted at Scientific Reports

2412.00366 2026-02-16 cs.RO

Efficient Multi-Robot Motion Planning for Manifold-Constrained Manipulators by Randomized Scheduling and Informed Path Generation

Weihang Guo, Zachary Kingston, Kaiyu Hang, Lydia E. Kavraki

2410.16882 2026-02-16 cs.AI cs.LG cs.SI

SaVe-TAG: LLM-based Interpolation for Long-Tailed Text-Attributed Graphs

Leyao Wang, Yu Wang, Bo Ni, Yuying Zhao, Hanyu Wang, Yao Ma, Tyler Derr

Comments Accepted as a KDD 2026 Research Track paper

2410.03952 2026-02-16 cs.LG cs.AI cs.CV q-bio.NC

Pixel-Based Similarities as an Alternative to Neural Data for Improving Convolutional Neural Network Adversarial Robustness

Elie Attias, Cengiz Pehlevan, Dina Obeid

Comments Camera-ready version in the Asilomar Conference on Signals, Systems, and Computers, 2025

2407.20836 2026-02-16 cs.CV cs.CR

Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

Yunfeng Diao, Naixin Zhai, Changtao Miao, Zitong Yu, Xingxing Wei, Xun Yang, Meng Wang

Comments Accepted in TMM

2407.20034 2026-02-16 cs.CV

MaskInversion: Localized Embeddings via Optimization of Explainability Maps

Walid Bousselham, Sofian Chaybouti, Christian Rupprecht, Vittorio Ferrari, Hilde Kuehne

Comments Project page: https://walidbousselham.com/MaskInversion

2406.19391 2026-02-16 cs.CV

Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads

Ali K. Rahimian, Manish K. Govind, Subhajit Maity, Dominick Reilly, Christian Kümmerle, Srijan Das, Aritra Dutta

Comments The complete implementation, including source code and evaluation scripts, is publicly available at: https://github.com/Charlotte-CharMLab/Fibottention

Abstract

Vision Transformers and their variants have achieved remarkable success in diverse visual perception tasks. Despite their effectiveness, they suffer from two significant limitations: first, the quadratic computational complexity of multi-head self-attention (MHSA), which restricts scalability to large token counts, and second, a high dependency on large-scale training data to attain competitive performance. In this paper, to address these challenges, we propose a novel sparse self-attention mechanism named Fibottention. Fibottention employs structured sparsity patterns derived from the Wythoff array, enabling an $\mathcal{O}(N \log N)$ computational complexity in self-attention. By design, its sparsity patterns vary across attention heads, which provably reduces redundant pairwise interactions while ensuring sufficient and diverse coverage. This leads to an \emph{inception-like functional diversity} in the attention heads and promotes more informative and disentangled representations. We integrate Fibottention into standard Transformer architectures and conduct extensive experiments across multiple domains, including image classification, video understanding, and robot learning. Results demonstrate that models equipped with Fibottention either significantly outperform or achieve on-par performance with their dense MHSA counterparts while leveraging only $2$-$6\%$ of the pairwise interactions in self-attention heads ($2\%$ in typical settings), resulting in substantial computational savings. Moreover, when compared to existing sparse attention mechanisms, Fibottention consistently achieves superior results on a FLOP-equivalency basis. Finally, we provide an in-depth analysis of the enhanced feature diversity resulting from our attention design and discuss its implications for efficient representation learning.
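
To make the idea of head-specific sparsity patterns concrete, the following is a minimal sketch that generates rows of the Wythoff array and turns each row into a boolean attention mask. The Wythoff array itself is standard; the mapping from rows to masks (head `h` may attend to token pairs whose index distance appears in row `h + 1`, plus the diagonal) is an assumption made for illustration and is not taken from the paper. The names `wythoff_row` and `head_mask` are hypothetical.

```python
from math import isqrt


def floor_times_phi(k):
    """Exact floor(k * phi) for the golden ratio phi, using integer arithmetic."""
    return (k + isqrt(5 * k * k)) // 2


def wythoff_row(m, limit):
    """Entries of row m (1-indexed) of the Wythoff array that are <= limit."""
    k = floor_times_phi(m)
    a = floor_times_phi(k)   # first entry:  floor(floor(m*phi) * phi)
    b = a + k                # second entry: floor(k * phi^2) = floor(k * phi) + k
    row = []
    while a <= limit:
        row.append(a)
        a, b = b, a + b      # Fibonacci-style recurrence along the row
    return row


def head_mask(n_tokens, head):
    """Illustrative mask (assumption): head `head` keeps the diagonal and
    pairs (i, j) whose distance |i - j| lies in Wythoff row head + 1."""
    offsets = set(wythoff_row(head + 1, n_tokens - 1))
    return [[i == j or abs(i - j) in offsets for j in range(n_tokens)]
            for i in range(n_tokens)]
```

Because each row grows geometrically, a row contributes only about $\log N$ offsets below $N$, which is where an $\mathcal{O}(N \log N)$ count of retained pairs per head would come from under this assumed construction. Distinct rows of the Wythoff array share no entries, so different heads cover different distances.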

2406.04112 2026-02-16 cs.LG cs.AI eess.SP stat.ML

Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

Can Yaras, Peng Wang, Laura Balzano, Qing Qu

Comments Accepted at ICML'24 (Oral)

2404.08567 2026-02-16 cs.CL cs.AI

CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference

Ruqi Liao, Chuqing Zhao, Jin Li, Weiqi Feng, Yi Lyu, Bingxian Chen, Haochen Yang

2303.14322 2026-02-16 cs.CV

Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA) for Remote Land-use Change Detection

Usman Nazir, Wadood Islam, Sara Khalid, Murtaza Taj

Journal ref AAAI Symposium 2023

Abstract

Land-use monitoring is fundamental for spatial planning, particularly in view of the compound impacts of growing global populations and climate change. Despite existing applications of deep learning in land-use monitoring, standard convolutional kernels in deep neural networks limit the applications of these networks to the Euclidean domain only. Considering the geodesic nature of the measurement of the earth's surface, remote sensing is one such area that can benefit from non-Euclidean and spherical domains. For this purpose, we designed a novel Graph Neural Network architecture for spatial and spatio-temporal classification using satellite imagery to acquire insights into socio-economic indicators. We propose a hybrid attention method to learn the relative importance of irregular neighbors in remote sensing data. Instead of classifying each pixel, we propose a method based on Simple Linear Iterative Clustering (SLIC) image segmentation and Graph Attention Network. The superpixels obtained from SLIC become the nodes of our Graph Convolution Network (GCN). A region adjacency graph (RAG) is then constructed where each superpixel is connected to every other adjacent superpixel in the image, enabling information to propagate globally. Finally, we propose a Spatially driven Attention Graph Neural Network (SAG-NN) to classify each RAG. We also propose an extension to our SAG-NN for spatio-temporal data. Unlike regular grids of pixels in images, superpixels are irregular in nature and cannot be used to create spatio-temporal graphs. We introduce temporal bias by combining unconnected RAGs from each image into one supergraph. This is achieved by introducing block adjacency matrices, resulting in a novel Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA). SAG-NN and STAG-NN-BA efficiently outperform graph and non-graph baselines on the Asia14 and C2D2 datasets.
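
The step of combining unconnected RAGs into one supergraph can be sketched as a block-diagonal adjacency matrix, as below. This is a minimal illustration under the assumption that each image contributes its own adjacency matrix and that no edges are added between graphs; whether the paper adds any cross-time edges is not stated in the abstract. The function name `block_adjacency` is hypothetical.

```python
def block_adjacency(adjacency_matrices):
    """Stack per-image adjacency matrices (lists of lists) on the diagonal
    of one larger matrix; all off-diagonal blocks stay zero (assumption)."""
    total = sum(len(a) for a in adjacency_matrices)
    combined = [[0] * total for _ in range(total)]
    offset = 0
    for a in adjacency_matrices:
        n = len(a)
        for i in range(n):
            for j in range(n):
                combined[offset + i][offset + j] = a[i][j]
        offset += n  # the next graph's nodes start after this one's
    return combined
```

Each RAG may have a different number of superpixels, which is why the blocks are allowed to differ in size.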

2011.07687 2026-02-16 cs.LG stat.ML

DART: aDaptive Accept RejecT for non-linear top-K subset identification

Mridul Agarwal, Vaneet Aggarwal, Christopher J. Quinn, Abhishek Umrawal

Comments extended version of AAAI 2021 paper

Journal ref extended version of AAAI 2021

2602.13181 2026-02-16 physics.ao-ph cs.LG

Selection of CMIP6 Models for Regional Precipitation Projection and Climate Change Assessment in the Jhelum and Chenab River Basins

Saad Ahmed Jamal, Ammara Nusrat, Muhammad Azmat, Muhammad Osama Nusrat

Comments 28 pages

2602.13177 2026-02-16 math.OC cs.DS cs.LG

Improved Regret Guarantees for Online Mirror Descent using a Portfolio of Mirror Maps

Swati Gupta, Jai Moondra, Mohit Singh

Abstract

Online mirror descent (OMD) and its variants give a flexible framework for online convex optimization (OCO) where the performance depends crucially on the choice of the mirror map. While the geometries underlying online projected gradient descent (OPGD) and online exponentiated gradient (OEG), both special cases of OMD, are well understood, how to construct an optimal mirror map for any given constrained set and a general family of loss functions, e.g., sparse losses, remains a challenging open question. Motivated by parameterizing a near-optimal set of mirror maps, we consider a simpler question: is it even possible to obtain polynomial gains in regret by using mirror maps for geometries that interpolate between $L_1$ and $L_2$, gains that may not be possible when restricting to only OEG ($L_1$) or OPGD ($L_2$)? Our main result answers this question positively. We show that mirror maps based on block norms adapt better to the sparsity of loss functions, compared to previous $L_p$ (for $p \in [1, 2]$) interpolations. In particular, we construct a family of online convex optimization instances in $\mathbb{R}^d$, where block norm-based mirror maps achieve a provable polynomial (in $d$) improvement in regret over OEG and OPGD for sparse loss functions. We then turn to the setting in which the sparsity level of the loss functions is unknown. In this case, the choice of geometry itself becomes an online decision problem. We first show that naively switching between OEG and OPGD can incur linear regret, highlighting the intrinsic difficulty of geometry selection. To overcome this issue, we propose a meta-algorithm based on multiplicative weights that dynamically selects among a family of uniform block norms. We show that this approach effectively tunes OMD to the sparsity of the losses, yielding adaptive regret guarantees. Overall, our results demonstrate that online mirror-map selection can significantly enhance the ability of OMD to exploit sparsity in online convex optimization.
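
For readers unfamiliar with the two special cases named above, here is a minimal sketch of a single update step of each on the probability simplex. It shows the textbook forms only: the exponentiated-gradient step corresponds to the negative-entropy mirror map, and the projected-gradient step to the squared Euclidean norm. The choice of the simplex as feasible set, the step size `eta`, and all function names are assumptions for illustration; the block-norm mirror maps and the meta-algorithm from the paper are not reproduced here.

```python
import math


def oeg_step(x, grad, eta):
    """Exponentiated-gradient step: multiplicative update, then renormalize."""
    w = [xi * math.exp(-eta * gi) for xi, gi in zip(x, grad)]
    total = sum(w)
    return [wi / total for wi in w]


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1} by sorting."""
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:       # holds for a prefix of j; keep the last such t
            theta = t
    return [max(vi - theta, 0.0) for vi in v]


def opgd_step(x, grad, eta):
    """Projected-gradient step: additive update, then Euclidean projection."""
    y = [xi - eta * gi for xi, gi in zip(x, grad)]
    return project_to_simplex(y)
```

The two steps differ only in the geometry used to move and to return to the feasible set, which is the design choice the abstract refers to as the choice of mirror map.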

2602.13157 2026-02-16 math.OC cs.RO cs.SY eess.SY

A Data-Driven Algorithm for Model-Free Control Synthesis

Sean Bowerfind, Matthew R. Kirchner, Gary Hewer

2602.13112 2026-02-16 stat.ML cs.LG math.OC

AdaGrad-Diff: A New Version of the Adaptive Gradient Algorithm

Matia Bojovic, Saverio Salzo, Massimiliano Pontil

Comments 24 pages

2602.13098 2026-02-16 stat.ME cs.LG

Barron-Wiener-Laguerre models

Rahul Manavalan, Filip Tronarp

2602.13017 2026-02-16 cs.NE cs.AI cs.LG

Synaptic Activation and Dual Liquid Dynamics for Interpretable Bio-Inspired Models

Mónika Farsang, Radu Grosu

2602.12986 2026-02-16 eess.AS cs.SD

A two-step approach for speech enhancement in low-SNR scenarios using cyclostationary beamforming and DNNs

Giovanni Bologni, Nicolás Arrieta Larraza, Richard Heusdens, Richard C. Hendriks

Comments Submitted version

2602.12985 2026-02-16 eess.SP cs.CV

Represent Micro-Doppler Signature in Orders

Weicheng Gao

Comments 17 pages, 8 figures, 5 tables

2602.12974 2026-02-16 stat.AP cs.CV stat.ME

Statistical Opportunities in Neuroimaging

Jian Kang, Thomas Nichols, Lexin Li, Martin A. Lindquist, Hongtu Zhu

Comments 33 pages, 3 figures

2602.12968 2026-02-16 cs.IR cs.AI cs.CL

RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems

Junhua Liu, Yang Jihao, Cheng Chang, Kunrong LI, Bin Fu, Kwan Hui Lim

2602.12962 2026-02-16 cs.AR cs.AI

TriGen: NPU Architecture for End-to-End Acceleration of Large Language Models based on SW-HW Co-Design

Jonghun Lee, Junghoon Lee, Hyeonjin Kim, Seoho Jeon, Jisup Yoon, Hyunbin Park, Meejeong Park, Heonjae Ha

Comments 13 pages, 14 figures

2602.12932 2026-02-16 stat.ML cs.LG

TFTF: Training-Free Targeted Flow for Conditional Sampling

Qianqian Qu, Jun S. Liu

2602.12923 2026-02-16 stat.ML cs.LG

Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures

Luigi Fogliani, Bruno Loureiro, Marylou Gabrié

2602.12917 2026-02-16 physics.med-ph cs.AI

Ultrasound-Guided Real-Time Spinal Motion Visualization for Spinal Instability Assessment

Feng Li, Yuan Bi, Tianyu Song, Zhongliang Jiang, Nassir Navab
