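arXivDaily: a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.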

2602.09112 2026-02-11 cs.AI

A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

Russ Webb, Jason Ramapuram

2602.09109 2026-02-11 cs.LG cs.AI cs.CV cs.DC

Distributed Hybrid Parallelism for Large Language Models: Comparative Study and System Design Guide

Hossam Amer, Rezaul Karim, Ali Pourranjbar, Weiwei Zhang, Walid Ahmed, Boxing Chen

2602.09101 2026-02-11 cs.LG cs.NA math.DS math.NA math.OC

From Adam to Adam-Like Lagrangians: Second-Order Nonlocal Dynamics

Carlos Heredia

Comments 42 pages, 10 figures

2602.09081 2026-02-11 cs.LG cs.AI

DMamba: Decomposition-enhanced Mamba for Time Series Forecasting

Ruxuan Chen, Fang Sun

Comments 9 pages, 3 figures, 4 tables

2602.09080 2026-02-11 cs.LG cs.AI

Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models

Ruihan Xu, Yuting Gao, Lan Wang, Jianing Li, Weihao Chen, Qingpei Guo, Ming Yang, Shiliang Zhang

Comments This is a primary contribution in the Recursive Vision-Language Models

2602.09079 2026-02-11 cs.LG

Patient foundation model for risk stratification in low-risk overweight patients

Zachary N. Flamholz, Dillon Tracy, Ripple Khera, Jordan Wolinsky, Nicholas Lee, Nathaniel Tann, Xiao Yin Zhu, Harry Phillips, Jeffrey Sherman

2602.09076 2026-02-11 cs.RO

Legs Over Arms: On the Predictive Value of Lower-Body Pose for Human Trajectory Prediction from Egocentric Robot Perception

Nhat Le, Daeun Song, Xuesu Xiao

Comments Accepted to IEEE ICRA 2026

2602.09066 2026-02-11 cs.LG cs.AI

Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning

Jinjin Guo, Yexin Li, Zhichao Huang, Jun Fang, Zhiyuan Liu, Chao Liu, Pengzhang Liu, Qixia Jiang

2602.09065 2026-02-11 cs.LG cs.AI

Enhanced Graph Transformer with Serialized Graph Tokens

Ruixiang Wang, Yuyang Hong, Shiming Xiang, Chunhong Pan

Comments ICASSP 2026

2602.09046 2026-02-11 cs.RO

Feasible Static Workspace Optimization of Tendon Driven Continuum Robot based on Euclidean norm

Mohammad Jabari, Carmen Visconte, Giuseppe Quaglia, Med Amine Laribi

Journal ref the 9th International Workshop on Medical and Service Robots (MESROB), Jul 2025, Poitiers, France. pp.489-501

2602.09042 2026-02-11 cs.SD eess.AS

The SJTU X-LANCE Lab System for MSR Challenge 2025

Jinxuan Zhu, Hao Qiu, Haina Zhu, Jianwei Yu, Kai Yu, Xie Chen

2602.09041 2026-02-11 cs.SD cs.AI eess.AS

DSFlow: Dual Supervision and Step-Aware Architecture for One-Step Flow Matching Speech Synthesis

Bin Lin, Peng Yang, Chao Yan, Xiaochen Liu, Wei Wang, Boyong Wu, Pengfei Tan, Xuerui Yang

2602.09007 2026-02-11 cs.AI cs.CV

GEBench: Benchmarking Image Generation Models as GUI Environments

Haodong Li, Jingwei Wu, Quan Sun, Guopeng Li, Juanxi Tian, Huanyu Zhang, Yanlin Lai, Ruichuan An, Hongbo Peng, Yuhong Dai, Chenxi Li, Chunmei Qing, Jia Wang, Ziyang Meng, Zheng Ge, Xiangyu Zhang, Daxin Jiang

Comments 23 pages, 5 figures, 4 tables

2602.08794 2026-02-11 cs.CV cs.SD

MOVA: Towards Scalable and Synchronized Video-Audio Generation

OpenMOSS Team, Donghua Yu, Mingshu Chen, Qi Chen, Qi Luo, Qianyi Wu, Qinyuan Cheng, Ruixiao Li, Tianyi Liang, Wenbo Zhang, Wenming Tu, Xiangyu Peng, Yang Gao, Yanru Huo, Ying Zhu, Yinze Luo, Yiyang Zhang, Yuerong Song, Zhe Xu, Zhiyu Zhang, Chenchen Yang, Cheng Chang, Chushu Zhou, Hanfu Chen, Hongnan Ma, Jiaxi Li, Jingqi Tong, Junxi Liu, Ke Chen, Shimin Li, Shiqi Jiang, Songlin Wang, Wei Jiang, Zhaoye Fei, Zhiyuan Ning, Chunguo Li, Chenhui Li, Ziwei He, Zengfeng Huang, Xie Chen, Xipeng Qiu

Comments Technical report for MOVA (open-source video-audio generation model). 38 pages, 10 figures, 22 tables. Project page: https://mosi.cn/models/mova Code: https://github.com/OpenMOSS/MOVA Models: https://huggingface.co/collections/OpenMOSS-Team/mova. Qinyuan Cheng and Tianyi Liang are project leaders. Xie Chen and Xipeng Qiu are corresponding authors

2602.08681 2026-02-11 cs.LG stat.ML

The Theory and Practice of MAP Inference over Non-Convex Constraints

Leander Kurscheidt, Gabriele Masina, Roberto Sebastiani, Antonio Vergari

2602.08658 2026-02-11 cs.CL

Fundamental Reasoning Paradigms Induce Out-of-Domain Generalization in Language Models

Mingzi Cao, Xingwei Tan, Mahmud Elahi Akhter, Marco Valentino, Maria Liakata, Xi Wang, Nikolaos Aletras

2602.08533 2026-02-11 cs.AI

Dialogue Model Optimization via Agent Game and Adaptive Tree-based GRPO

Kun Peng, Conghui Tan, Yu Liu, Guohua Tang, Zhongqian Sun, Wei Yang, Zining Zhu, Lei Jiang, Yanbing Liu, Hao Peng

2602.08528 2026-02-11 cs.CV math.OC

Automatic regularization parameter choice for tomography using a double model approach

Chuyang Wu, Samuli Siltanen

2602.08491 2026-02-11 cs.CV cs.LG

Understanding Image2Video Domain Shift in Food Segmentation: An Instance-level Analysis on Apples

Keonvin Park, Aditya Pal, Jin Hong Mok

2602.08425 2026-02-11 cs.RO

Bi-Adapt: Few-shot Bimanual Adaptation for Novel Categories of 3D Objects via Semantic Correspondence

Jinxian Zhou, Ruihai Wu, Yiwei Liu, Yiwen Hou, Xunzhe Zhou, Checheng Yu, Licheng Zhong, Lin Shao

2602.08321 2026-02-11 cs.CL

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Zijie Chen, Zhenghao Lin, Xiao Liu, Zhenzhong Lan, Yeyun Gong, Peng Cheng

2602.08268 2026-02-11 cs.AI

Puda: Private User Dataset Agent for User-Sovereign and Privacy-Preserving Personalized AI

Akinori Maeda, Yuto Sekiya, Sota Sugimura, Tomoya Asai, Yu Tsuda, Kohei Ikeda, Hiroshi Fujii, Kohei Watanabe

Comments 9 pages, 5 figures

2602.08224 2026-02-11 cs.CV

Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval

Jing Zhang, Zhikai Li, Xuewen Liu, Qingyi Gu

Comments ICLR 2026. Code is available at: https://github.com/jingjing0419/Efficient-SAM2

Abstract

Segment Anything Model 2 (SAM2) shows excellent performance in video object segmentation tasks; however, its heavy computational burden hinders its application in real-time video processing. Although there have been efforts to improve the efficiency of SAM2, most of them focus on retraining a lightweight backbone, with little exploration of post-training acceleration. In this paper, we observe that SAM2 exhibits a sparse perception pattern similar to biological vision, which provides opportunities for eliminating redundant computation and accelerating inference: i) In the mask decoder, attention primarily focuses on the foreground objects, whereas the image encoder in the earlier stage exhibits a broad attention span, which results in unnecessary computation on background regions. ii) In the memory bank, only a small subset of tokens in each frame contributes significantly to memory attention, and the salient regions exhibit temporal consistency, making full-token computation redundant. With these insights, we propose Efficient-SAM2, which enables SAM2 to adaptively focus on object regions while eliminating task-irrelevant computations, thereby significantly improving inference efficiency. Specifically, for the image encoder, we propose object-aware Sparse Window Routing (SWR), a window-level computation allocation mechanism that leverages the consistency and saliency cues from the previous-frame decoder to route background regions into a lightweight shortcut branch. Moreover, for memory attention, we propose object-aware Sparse Memory Retrieval (SMR), which allows only the salient memory tokens in each frame to participate in computation, with the saliency pattern reused from their first recollection. With negligible additional parameters and minimal training overhead, Efficient-SAM2 delivers a 1.68x speedup on the SAM2.1-L model with only a 1.0% accuracy drop on the SA-V test set.

2602.08060 2026-02-11 cs.LG

Compiler-Assisted Speculative Sampling for Accelerated LLM Inference on Heterogeneous Edge Devices

Alejandro Ruiz y Mesa, Guilherme Korol, Moritz Riesterer, João Paulo Cardoso de Lima, Jeronimo Castrillon

Comments Accepted to AccML@HiPEAC 2026

2602.08030 2026-02-11 cs.AI cs.CL

Free(): Learning to Forget in Malloc-Only Reasoning Models

Yilun Zheng, Dongyang Ma, Tian Liang, Jiahao Xu, Xinting Huang, Lihui Chen, Haitao Mi, Yan Wang

2602.07859 2026-02-11 cs.LG cs.SY eess.SY

Dynamic Load Model for Data Centers with Pattern-Consistent Calibration

Siyu Lu, Chenhan Xiao, Yang Weng

Comments 10 pages, 13 figures

2602.07629 2026-02-11 cs.RO

LCLA: Language-Conditioned Latent Alignment for Vision-Language Navigation

Nitesh Subedi, Adam Haroon, Samuel Tetteh, Prajwal Koirala, Cody Fleming, Soumik Sarkar

2602.07358 2026-02-11 cs.LG

UTOPIA: Unlearnable Tabular Data via Decoupled Shortcut Embedding

Jiaming He, Fuming Luo, Hongwei Li, Wenbo Jiang, Wenshu Fan, Zhenbo Shi, Xudong Jiang, Yi Yu

2602.06566 2026-02-11 cs.CV cs.AI cs.CL

SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs

Niccolo Avogaro, Nayanika Debnath, Li Mi, Thomas Frick, Junling Wang, Zexue He, Hang Hua, Konrad Schindler, Mattia Rigotti

2602.06317 2026-02-11 cs.LG cs.AI cs.CL

The Condensate Theorem: Transformers are $O(n)$, Not $O(n^2)$

Jorge L. Ruiz Williams

Comments 13 pages, 4 figures, 8 tables, 1 pseudocode algorithm
