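arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.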



2410.10258 2026-03-02 cs.LG stat.ML

Revisiting Matrix Sketching in Linear Bandits: Achieving Sublinear Regret via Dyadic Block Sketching

Dongxie Wen, Hanyan Yin, Xiao Zhang, Peng Zhao, Lijun Zhang, Zhewei Wei

Comments Accepted by ICLR 2026

2410.05419 2026-03-02 cs.LG cs.AI stat.ME

Joint Distribution-Informed Shapley Values for Sparse Counterfactual Explanations

Lei You, Yijun Bian, Lele Cao

2410.01469 2026-03-02 cs.SD cs.AI eess.AS

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Mohan Xu, Kai Li, Guo Chen, Xiaolin Hu

Comments Accepted by ICLR 2025, demo page: https://cslikai.cn/TIGER/

2409.01728 2026-03-02 cs.CV

Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image Fusion

Ke Cao, Xuanhua He, Tao Hu, Chengjun Xie, Man Zhou, Jie Zhang

Comments Accepted by IEEE Transactions on Circuits and Systems for Video Technology

2408.09743 2026-03-02 cs.CV cs.AI cs.CL

R2GenCSR: Mining Contextual and Residual Information for LLMs-based Radiology Report Generation

Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang

Comments R2GenCSR is accepted by IEEE Journal of Biomedical and Health Informatics (JBHI) 2026

2408.08448 2026-03-02 cs.LG

Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability

Haniyeh Ehsani Oskouie, Sajjad Ghiasvand, Lionel Levine, Majid Sarrafzadeh

2406.01003 2026-03-02 cs.CV

Uni-ISP: Toward Unifying the Learning of ISPs from Multiple Mobile Cameras

Lingen Li, Mingde Yao, Xingyu Meng, Muquan Yu, Tianfan Xue, Jinwei Gu

Journal ref IEEE Transactions on Image Processing, vol. 34, pp. 6126-6137, 2025

2405.09101 2026-03-02 cs.RO cs.SY eess.SY

Adaptive Koopman Embedding for Robust Control of Complex Nonlinear Dynamical Systems

Rajpal Singh, Chandan Kumar Sah, Jishnu Keshavan

Comments Corrected the title

2405.07780 2026-03-02 cs.LG cs.AI cs.CV

DirMixE: Harnessing Test Agnostic Long-tail Recognition with Hierarchical Label Variations

Zhiyong Yang, Qianqian Xu, Sicong Li, Zitai Wang, Xiaochun Cao, Qingming Huang

Comments Conference version: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, and Qingming Huang. Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition. ICML, 56624-56664, 2024

Details

DOI: 10.1109/TPAMI.2025.3647124

English abstract

This paper explores test-agnostic long-tail recognition, a challenging long-tail task where the test label distributions are unknown and arbitrarily imbalanced. We argue that the variation in these distributions can be broken down hierarchically into global and local levels. The global ones reflect a broad range of diversity, while the local ones typically arise from milder changes, often focused on a particular neighbor. Traditional methods predominantly use a Mixture-of-Experts (MoE) approach, targeting a few fixed test label distributions that exhibit substantial global variations. However, the local variations are left unconsidered. To address this issue, we propose a new MoE strategy, DirMixE, which assigns experts to different Dirichlet meta-distributions of the label distribution, each targeting a specific aspect of local variations. Additionally, the diversity among these Dirichlet meta-distributions inherently captures global variations. This dual-level approach also leads to a more stable objective function, allowing us to better sample different test distributions and quantify the mean and variance of performance outcomes. Building on this idea, we develop a general Latent Skill Finetuning (LSF) framework for parameter-efficient finetuning of foundation models. We provide implementations based on LoRA and Adapter. Theoretically, we derive upper bounds on the generalization error for both standard learning and PEFT. Under mild assumptions, we show that the variance-based regularization helps tighten these bounds. Furthermore, we prove that the covering number of the PEFT hypothesis class scales with the number of trainable parameters. Finally, extensive experiments on CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist validate the effectiveness of DirMixE.
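The idea of sampling test label distributions from Dirichlet meta-distributions and summarizing performance by its mean and variance can be illustrated with a small sketch. This is not the DirMixE implementation: the per-class accuracies, the concentration vectors, and the function names below are all hypothetical, chosen only to show the sampling-and-scoring step using the Python standard library.

```python
import random
import statistics


def sample_dirichlet(alpha, rng):
    """Draw one label distribution from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def expected_accuracy(label_dist, per_class_acc):
    """Overall accuracy when test labels follow label_dist."""
    return sum(p * a for p, a in zip(label_dist, per_class_acc))


def evaluate_under_meta_distribution(alpha, per_class_acc, n_samples=1000, seed=0):
    """Mean and variance of accuracy over label distributions sampled from Dirichlet(alpha)."""
    rng = random.Random(seed)
    scores = [
        expected_accuracy(sample_dirichlet(alpha, rng), per_class_acc)
        for _ in range(n_samples)
    ]
    return statistics.mean(scores), statistics.pvariance(scores)


# Hypothetical per-class accuracies of some classifier (head, medium, tail class).
per_class_acc = [0.95, 0.80, 0.55]

# Hypothetical meta-distributions: each concentration vector favors a different
# region of the label simplex (the "global" variation); draws around it vary
# more mildly (the "local" variation).
meta_distributions = {
    "head-heavy": [8.0, 2.0, 1.0],
    "uniform": [3.0, 3.0, 3.0],
    "tail-heavy": [1.0, 2.0, 8.0],
}

results = {
    name: evaluate_under_meta_distribution(alpha, per_class_acc)
    for name, alpha in meta_distributions.items()
}

for name, (mean_acc, var_acc) in results.items():
    print(f"{name}: mean={mean_acc:.3f} variance={var_acc:.5f}")
```

In this toy setup the variance term is what a variance-based regularizer would penalize; how the paper actually defines and optimizes that objective is not shown here.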


2402.08552 2026-03-02 cs.LG cs.CV

Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases

Ziyi Zhang, Sen Zhang, Yibing Zhan, Yong Luo, Yonggang Wen, Dacheng Tao

Comments Accepted to ICML 2024

Journal ref International Conference on Machine Learning, pp. 60396-60413, 2024

2402.07462 2026-03-02 cs.AI cs.CY cs.LG cs.MA econ.TH

A Hormetic Approach to the Value-Loading Problem: Preventing the Paperclip Apocalypse?

Nathan I. N. Henry, Mangor Pedersen, Matt Williams, Jamin L. B. Martin, Liesje Donkin

Comments 24 pages, 7 figures

Journal ref SN COMPUT. SCI. 6, 872 (2025)

2312.09120 2026-03-02 cs.LG cs.AI cs.RO

Less is more -- the Dispatcher/ Executor principle for multi-task Reinforcement Learning

Martin Riedmiller, Andrea Gesmundo, Tim Hertweck, Roland Hafner

Comments Videos showing the results can be found at https://sites.google.com/view/dispatcher-executor

2310.05179 2026-03-02 cs.LG

DRL-ORA: Distributional Reinforcement Learning with Online Risk Adaption

Yupeng Wu, Wenyun Li, Wenjie Huang, Chin Pang Ho

2306.09778 2026-03-02 cs.LG cs.NA math.NA math.OC stat.ML

Gradient is All You Need? How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent

Konstantin Riedl, Timo Klock, Carina Geldhauser, Massimo Fornasier

Comments 49 pages, 5 figures

2303.00320 2026-03-02 cs.LG

TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders

Mingyue Cheng, Xiaoyu Tao, Zhiding Liu, Qi Liu, Hao Zhang, Rujiao Zhang, Enhong Chen

Comments Accepted by WSDM'26

2210.09011 2026-03-02 cs.AI eess.SP

ANFIS-based prediction of power generation for combined cycle power plant

Maryam Paparimoghadamborazjani, Amin Kazemi

2206.04028 2026-03-02 cs.CV cs.RO

CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving

Runjian Chen, Yao Mu, Runsen Xu, Wenqi Shao, Chenhan Jiang, Hang Xu, Zhenguo Li, Ping Luo

2204.10762 2026-03-02 cs.CV

Dite-HRNet: Dynamic Lightweight High-Resolution Network for Human Pose Estimation

Qun Li, Ziyi Zhang, Fu Xiao, Feng Zhang, Bir Bhanu

Comments Accepted by IJCAI-ECAI 2022

Journal ref International Joint Conference on Artificial Intelligence, pp. 1095-1101, 2022

2602.23599 2026-03-02 cs.LG

Normalisation and Initialisation Strategies for Graph Neural Networks in Blockchain Anomaly Detection

Dang Sy Duy, Nguyen Duy Chien, Kapil Dev, Jeff Nijsse

Comments 14 pages, 5 figures

2602.23595 2026-03-02 cs.CV

Incremental dimension reduction for efficient and accurate visual anomaly detection

Teng-Yok Lee

2602.23588 2026-03-02 cs.CV cs.AI cs.LG

Hyperdimensional Cross-Modal Alignment of Frozen Language and Image Models for Efficient Image Captioning

Abhishek Dalvi, Vasant Honavar

Details

English abstract

Large unimodal foundation models for vision and language encode rich semantic structures, yet aligning them typically requires computationally intensive multimodal fine-tuning. Such approaches depend on large-scale parameter updates, are resource intensive, and can perturb pretrained representations. Emerging evidence suggests, however, that independently trained foundation models may already exhibit latent semantic compatibility, reflecting shared structures in the data they model. This raises a fundamental question: can cross-modal alignment be achieved without modifying the models themselves? Here we introduce HDFLIM (HyperDimensional computing with Frozen Language and Image Models), a framework that establishes cross-modal mappings while keeping pretrained vision and language models fully frozen. HDFLIM projects unimodal embeddings into a shared hyperdimensional space and leverages lightweight symbolic operations (binding, bundling, and similarity-based retrieval) to construct associative cross-modal representations in a single pass over the data. Caption generation emerges from high-dimensional memory retrieval rather than iterative gradient-based optimization. We show that HDFLIM achieves performance comparable to end-to-end vision-language training methods and produces captions that are more semantically grounded than zero-shot baselines. By decoupling alignment from parameter tuning, our results suggest that semantic mapping across foundation models can be realized through symbolic operations on hyperdimensional encodings of the respective embeddings. More broadly, this work points toward an alternative paradigm for foundation model alignment in which frozen models are integrated through structured representational mappings rather than through large-scale retraining. The codebase for our implementation can be found at https://github.com/Abhishek-Dalvi410/HDFLIM.
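The three symbolic operations named in the abstract (binding, bundling, and similarity-based retrieval) can be illustrated with a generic hyperdimensional computing sketch. This is not the HDFLIM code: random bipolar hypervectors stand in for projected image and text embeddings, and the names `image_keys`, `word_codebook`, and `retrieve` are hypothetical.

```python
import random

DIM = 2048  # hypervector dimensionality (an arbitrary choice for this sketch)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise product; for bipolar vectors it is its own inverse."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling: elementwise majority vote over the given hypervectors."""
    sums = [sum(column) for column in zip(*vectors)]
    return [1 if s >= 0 else -1 for s in sums]


def similarity(a, b):
    """Normalized dot product in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)

# Stand-ins for encoded image embeddings and caption-word embeddings.
image_keys = {name: random_hv(rng) for name in ("img_dog", "img_cat", "img_car")}
word_codebook = {word: random_hv(rng) for word in ("dog", "cat", "car")}

# One pass over the (image, word) pairs builds a single associative memory vector.
pairs = [("img_dog", "dog"), ("img_cat", "cat"), ("img_car", "car")]
memory = bundle([bind(image_keys[img], word_codebook[word]) for img, word in pairs])


def retrieve(image_name):
    """Unbind the memory with the image key, then pick the most similar codebook word."""
    noisy_word = bind(memory, image_keys[image_name])
    return max(word_codebook, key=lambda w: similarity(noisy_word, word_codebook[w]))


for img, _ in pairs:
    print(img, "->", retrieve(img))
```

No gradient step appears anywhere: the memory is built by accumulation and queried by nearest-neighbor lookup, which is the general property the abstract contrasts with iterative optimization.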


2602.23583 2026-03-02 cs.RO

VCA: Vision-Click-Action Framework for Precise Manipulation of Segmented Objects in Target Ambiguous Environments

Donggeon Kim, Seungwon Jan, Hyeonjun Park, Daegyu Lim

Comments Submitted to UR 2026

2602.23581 2026-03-02 cs.LG cs.AI

SDMixer: Sparse Dual-Mixer for Time Series Forecasting

Xiang Ao

Comments 12 pages, 2 figures

2602.23579 2026-03-02 cs.AI cs.LG

Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem

Guillem Rodríguez-Corominas, Maria J. Blesa, Christian Blum

2602.23578 2026-03-02 cs.LG

Hybrid Quantum Temporal Convolutional Networks

Junghoon Justin Park, Maria Pak, Sebin Lee, Samuel Yen-Chi Chen, Shinjae Yoo, Huan-Hsin Tseng, Jiook Cha

Journal ref IEEE International Conference on Quantum Communications, Networking, and Computing (QCNC 2026)

2602.23577 2026-03-02 cs.CL

Multi-Agent Causal Reasoning for Suicide Ideation Detection Through Online Conversations

Jun Li, Xiangmeng Wang, Haoyang Li, Yifei Yan, Shijie Zhang, Hong Va Leong, Ling Feng, Nancy Xiaonan Yu, Qing Li

2602.23576 2026-03-02 cs.RO

Tilt-X: Enabling Compliant Aerial Manipulation through a Tiltable-Extensible Continuum Manipulator

Anuraj Uthayasooriyan, Krishna Manaswi Digumarti, Jack Breward, Fernando Vanegas, Julian Galvez-Serna, Felipe Gonzalez

Comments Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2026

2602.23575 2026-03-02 cs.CV cs.AI

CycleBEV: Regularizing View Transformation Networks via View Cycle Consistency for Bird's-Eye-View Semantic Segmentation

Jeongbin Hong, Dooseop Choi, Taeg-Hyun An, Kyounghwan An, Kyoung-Wook Min

Comments CVPR 2026

2602.23565 2026-03-02 cs.LG cs.MA

Dynamics of Learning under User Choice: Overspecialization and Peer-Model Probing

Adhyyan Narang, Sarah Dean, Lillian J Ratliff, Maryam Fazel

2602.23559 2026-03-02 cs.CV

No Calibration, No Depth, No Problem: Cross-Sensor View Synthesis with 3D Consistency

Cho-Ying Wu, Zixun Huang, Xinyu Huang, Liu Ren

Comments CVPR 2026 Main Conference. Project page: https://choyingw.github.io/3d-rgbx.github.io/