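arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.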

2602.05735 2026-03-03 cs.LG cs.AI cs.IR cs.IT math.IT

CSRv2: Unlocking Ultra-Sparse Embeddings

Lixuan Guo, Yifei Wang, Tiansheng Wen, Yifan Wang, Aosong Feng, Bo Chen, Stefanie Jegelka, Chenyu You

Comments Accepted by ICLR2026. Project Page: https://y-research-sbu.github.io/CSRv2/

Abstract

In the era of large foundation models, the quality of embeddings has become a central determinant of downstream task performance and overall system capability. Yet widely used dense embeddings are often extremely high-dimensional, incurring substantial costs in storage, memory, and inference latency. To address these costs, Contrastive Sparse Representation (CSR) was recently proposed as a promising direction, mapping dense embeddings into high-dimensional but k-sparse vectors, in contrast to compact dense embeddings such as Matryoshka Representation Learning (MRL). Despite its promise, CSR suffers severe degradation in the ultra-sparse regime, where over 80% of neurons remain inactive, leaving much of its efficiency potential unrealized. In this paper, we introduce CSRv2, a principled training approach designed to make ultra-sparse embeddings viable. CSRv2 stabilizes sparsity learning through progressive k-annealing, enhances representational quality via supervised contrastive objectives, and ensures end-to-end adaptability with full backbone finetuning. CSRv2 reduces dead neurons from 80% to 20% and delivers a 14% accuracy gain at k=2, bringing ultra-sparse embeddings on par with CSR at k=8 and MRL at 32 dimensions, all with only two active features. While maintaining comparable performance, CSRv2 delivers a 7x speedup over MRL, and yields up to 300x improvements in compute and memory efficiency relative to dense embeddings in text representation. Extensive experiments across text and vision demonstrate that CSRv2 makes ultra-sparse embeddings practical without compromising performance, where CSRv2 achieves 7%/4% improvement over CSR when k=4 and further increases this gap to 14%/6% when k=2 in text/vision representation. By making extreme sparsity viable, CSRv2 broadens the design space for real-time and edge-deployable AI systems where both embedding quality and efficiency are critical.
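
As a rough illustration of what "high-dimensional but k-sparse" means in the abstract above, the following sketch projects a dense embedding into a wider space, applies a ReLU, and keeps only the k largest activations. It is not the CSRv2 implementation: the projection weights, the function names (`encode`, `top_k_sparsify`, `sparse_dot`), and the ReLU-plus-top-k encoder form are assumptions chosen for illustration, and the training procedure (k-annealing, contrastive objectives, backbone finetuning) is not shown.

```python
def top_k_sparsify(values, k):
    """Keep the k largest entries of `values`, returned as {index: value}."""
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return {i: values[i] for i in top}


def encode(dense, weights, k):
    """Project a dense vector with `weights` (one row per sparse feature),
    apply ReLU, and keep the k strongest features.

    `weights` is a hypothetical projection matrix, not a trained model.
    """
    hidden = [max(0.0, sum(w * x for w, x in zip(row, dense))) for row in weights]
    return top_k_sparsify(hidden, k)


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts.

    Only indices active in both vectors contribute, so the cost depends on
    the number of active features rather than on the full dimensionality.
    """
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(v * b[i] for i, v in a.items() if i in b)
```

With k=2, each encoded vector stores two (index, value) pairs regardless of how wide the sparse space is, which is the kind of storage and compute saving the abstract refers to.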

2602.04369 2026-03-03 cs.LG

Multi-scale hypergraph meets LLMs: Aligning large language models for time series analysis

Zongjiang Shang, Dongliang Cui, Binqing Wu, Ling Chen

Comments Accepted by ICLR2026

2602.02742 2026-03-03 cs.LG cs.AI

Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding

Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Sun, Boyu Wang, Pingzhao Hu

Comments Accepted by ICLR 2026

2602.02356 2026-03-03 cs.CV cs.LG

NAB: Neural Adaptive Binning for Sparse-View CT reconstruction

Wangduo Xie, Matthew B. Blaschko

2602.01844 2026-03-03 cs.CV cs.AI

CloDS: Visual-Only Unsupervised Cloth Dynamics Learning in Unknown Conditions

Yuliang Zhan, Jian Li, Wenbing Huang, Wenbing Huang, Yang Liu, Hao Sun

Comments ICLR 2026

2602.01041 2026-03-03 cs.RO

Coordinated Control of Multiple Construction Machines Using LLM-Generated Behavior Trees with Flag-Based Synchronization

Akinosuke Tsutsumi, Tomoya Itsuka, Yuichiro Kasahara, Tomoya Kouno, Kota Akinari, Genki Yamauchi, Daisuke Endo, Taro Abe, Takeshi Hashimoto, Keiji Nagatani, Ryo Kurazume

Comments 9 pages, 7 figures

2602.00640 2026-03-03 cs.LG

Combinatorial Bandit Bayesian Optimization for Tensor Outputs

Jingru Huang, Haijie Xu, Jie Guo, Manrui Jiang, Chen Zhang

2601.23280 2026-03-03 cs.LG cs.NA math.NA

Decoupled Diffusion Sampling for Inverse Problems on Function Spaces

Thomas Y. L. Lin, Jiachen Yao, Lufang Chiang, Julius Berner, Anima Anandkumar

Comments Accepted to ICLR AI&PDE Workshop (Oral)

2601.23064 2026-03-03 cs.CV cs.AI

HierLoc: Hyperbolic Entity Embeddings for Hierarchical Visual Geolocation

Hari Krishna Gadi, Daniel Matos, Hongyi Luo, Lu Liu, Yongliang Wang, Yanfeng Zhang, Liqiu Meng

Comments This is camera ready version of the paper accepted to ICLR 2026 (poster)

2601.20838 2026-03-03 cs.LG cs.AI cs.CL cs.CY

Reward Models Inherit Value Biases from Pretraining

Brian Christian, Jessica A. F. Thompson, Elle Michelle Yang, Vincent Adam, Hannah Rose Kirk, Christopher Summerfield, Tsvetomira Dumbalska

2601.10729 2026-03-03 cs.AI cs.LG cs.PF

OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration

Xinyue Ma, Heelim Hong, Taegeon Um, Jongseop Lee, Seoyeong Choy, Woo-Yeon Lee, Myeongjae Jeon

Comments Accepted at the 52nd International Conference on Very Large Data Bases (VLDB 2026). Xinyue Ma and Heelim Hong contributed equally (co-first authors)

2601.08011 2026-03-03 cs.CV cs.AI cs.LG cs.MM

TP-Blend: Textual-Prompt Attention Pairing for Precise Object-Style Blending in Diffusion Models

Xin Jin, Yichuan Zhong, Yapeng Tian

Journal ref: Transactions on Machine Learning Research, 2025

Abstract

Current text-conditioned diffusion editors handle single object replacement well but struggle when a new object and a new style must be introduced simultaneously. We present Twin-Prompt Attention Blend (TP-Blend), a lightweight training-free framework that receives two separate textual prompts, one specifying a blend object and the other defining a target style, and injects both into a single denoising trajectory. TP-Blend is driven by two complementary attention processors. Cross-Attention Object Fusion (CAOF) first averages head-wise attention to locate spatial tokens that respond strongly to either prompt, then solves an entropy-regularised optimal transport problem that reassigns complete multi-head feature vectors to those positions. CAOF updates feature vectors at the full combined dimensionality of all heads (e.g., 640 dimensions in SD-XL), preserving rich cross-head correlations while keeping memory low. Self-Attention Style Fusion (SASF) injects style at every self-attention layer through Detail-Sensitive Instance Normalization. A lightweight one-dimensional Gaussian filter separates low- and high-frequency components; only the high-frequency residual is blended back, imprinting brush-stroke-level texture without disrupting global geometry. SASF further swaps the Key and Value matrices with those derived from the style prompt, enforcing context-aware texture modulation that remains independent of object fusion. Extensive experiments show that TP-Blend produces high-resolution, photo-realistic edits with precise control over both content and appearance, surpassing recent baselines in quantitative fidelity, perceptual quality, and inference speed.
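
The frequency-separation step described in the abstract above (a one-dimensional Gaussian filter splits a signal into low- and high-frequency parts, and only the high-frequency residual is blended back) can be sketched as follows. This is a simplified illustration on plain lists of floats, not the TP-Blend code: the function names, the edge-replication padding, and the `radius`, `sigma`, and `strength` parameters are assumptions, and the normalization and attention components of SASF are omitted.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def low_pass(signal, kernel):
    """Smooth `signal` with `kernel`, replicating the edge values as padding."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp to valid range
            acc += w * signal[idx]
        out.append(acc)
    return out


def blend_high_frequency(content, style, radius=2, sigma=1.0, strength=1.0):
    """Add only the high-frequency residual of `style` onto `content`.

    The residual is style minus its low-pass version, so smooth (global)
    structure in `style` is discarded and only fine detail is transferred.
    """
    kernel = gaussian_kernel(radius, sigma)
    style_low = low_pass(style, kernel)
    return [c + strength * (s - l)
            for c, s, l in zip(content, style, style_low)]
```

A constant style signal has no high-frequency residual, so blending it leaves the content unchanged up to floating-point error, which matches the abstract's point that texture is imprinted without disturbing global geometry.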

2601.07367 2026-03-03 cs.SD

FOCAL: A Novel Benchmarking Technique for Multi-modal Agents

Anupam Purwar, Aditya Choudhary

Comments We present a framework for evaluation of Multi-modal Agents consisting of Voice-to-voice model components viz. Text to Speech (TTS), Retrieval Augmented Generation (RAG) and Speech-to-text (STT)

2601.06502 2026-03-03 cs.AI

DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization

Shengkai Chen, Zhiguang Cao, Jianan Zhou, Yaoxin Wu, Senthilnath Jayavelu, Zhuoyi Lin, Xiaoli Li, Shili Xiang

Comments This paper has been accepted for presentation and publication at the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026), source code: https://github.com/skychan/DARGON

2601.05724 2026-03-03 cs.AI

Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding

Yuxuan Zhou, Fei Huang, Heng Li, Fengyi Wu, Tianyu Wang, Jianwei Zhang, Junyang Lin, Zhi-Qi Cheng

2601.02643 2026-03-03 cs.AI

AWARE-US: Preference-Aware Infeasibility Resolution in Tool-Calling Agents

Mehmet Kurmaz

Comments 22 pages, 5 figures, 6 tables

2512.15657 2026-03-03 cs.LG cs.CV

SoFlow: Solution Flow Models for One-Step Generative Modeling

Tianze Luo, Haotian Yuan, Zhuang Liu

Comments Accepted to ICLR 2026. Our code is available at https://github.com/zlab-princeton/SoFlow

2512.14696 2026-03-03 cs.CV cs.GR cs.RO

CRISP: Contact-Guided Real2Sim from Monocular Video with Planar Scene Primitives

Zihan Wang, Jiashun Wang, Jeff Tan, Yiwen Zhao, Jessica Hodgins, Shubham Tulsiani, Deva Ramanan

Comments Published at ICLR 2026. Project page: https://crisp-real2sim.github.io/CRISP-Real2Sim/

2512.14341 2026-03-03 cs.CV cs.AI cs.CY cs.LG

Towards Transferable Defense Against Malicious Image Edits

Jie Zhang, Shuai Dong, Shiguang Shan, Xilin Chen

Comments 14 pages, 5 figures, accepted by IEEE TPAMI

2512.12678 2026-03-03 cs.CV

$β$-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment

Fatimah Zohra, Chen Zhao, Hani Itani, Bernard Ghanem

2512.11582 2026-03-03 cs.LG cs.CV q-bio.NC

Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model

Sam Gijsen, Marc-Andre Schulz, Kerstin Ritter

Comments Accepted at ICLR 2026. Code and pretrained models available at https://github.com/SamGijsen/Brain-Semantoks

2512.04388 2026-03-03 cs.LG

Learning to Orchestrate Agents in Natural Language with the Conductor

Stefan Nielsen, Edoardo Cetin, Peter Schwendeman, Qi Sun, Jinglue Xu, Yujin Tang

Comments To appear at the 14th International Conference on Learning Representations (ICLR 2026)

2512.03819 2026-03-03 cs.LG

Transmit Weights, Not Features: Orthogonal-Basis Aided Wireless Point-Cloud Transmission

Junlin Chang, Yubo Han, Hang Yue, John S Thompson, Rongke Liu

Comments 5 pages, 5 figures

2512.01210 2026-03-03 cs.AI

Knowledge Graph Augmented Large Language Models for Disease Prediction

Ruiyu Wang, Tuan Vinh, Ran Xu, Yuyin Zhou, Jiaying Lu, Carl Yang, Francisco Pasquel

2511.19785 2026-03-03 cs.CL cs.CY

Gender Bias in Emotion Recognition by Large Language Models

Maureen Herbert, Katie Sun, Angelica Lim, Yasaman Etesam

Comments Accepted at AAAI 2026 Workshop (WS37)

2511.19661 2026-03-03 cs.CV

CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization

Xinhai Hou, Shaoyuan Xu, Manan Biyani, Moyan Li, Jia Liu, Todd C. Hollon, Bryan Wang

2511.19473 2026-03-03 cs.LG cs.AI

WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning

Haojin Yang, Rui Hu, Zequn Sun, Rui Zhou, Yujun Cai, Yiwei Wang

Comments 19 pages, 3 figures

2511.18942 2026-03-03 cs.CV

VeCoR -- Velocity Contrastive Regularization for Flow Matching

Zong-Wei Hong, Jing-lun Li, Lin-Ze Li, Shen Zhang, Yao Tang

Comments Accepted to Findings of CVPR 2026

2511.17649 2026-03-03 cs.CV cs.AI cs.RO

SWITCH: Benchmarking Modeling and Handling of Tangible Interfaces in Long-horizon Embodied Scenarios

Jieru Lin, Zhiwei Yu, Börje F. Karlsson

2511.16330 2026-03-03 cs.RO

Safe and Optimal Variable Impedance Control via Certified Reinforcement Learning

Shreyas Kumar, Ravi Prakash

Comments Accepted at ICRA 2026