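arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.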


2510.10467 2026-02-03 cs.LG cs.AI

AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs

Gunho Park, Jeongin Bae, Beomseok Kwon, Byeongwook Kim, Se Jung Kwon, Dongsoo Lee

Comments ICLR 2026

2510.09556 2026-02-03 cs.CL

WUGNECTIVES: Novel Entity Inferences of Language Models from Discourse Connectives

Daniel Brubaker, William Sheffield, Junyi Jessy Li, Kanishka Misra

Comments 19 pages total, 10 pages main; 8 figures total, 5 figures main; 10 tables total, 4 tables main

2510.04236 2026-02-03 cs.CV

Scaling Sequence-to-Sequence Generative Neural Rendering

Shikun Liu, Kam Woh Ng, Wonbong Jang, Jiadong Guo, Junlin Han, Haozhe Liu, Yiannis Douratsos, Juan C. Pérez, Zijian Zhou, Chi Phung, Tao Xiang, Juan-Manuel Pérez-Rúa

Comments Published at ICLR 2026. Project Page: https://shikun.io/projects/kaleido

2510.04217 2026-02-03 cs.LG cs.AI

MLLMEraser: Achieving Test-Time Unlearning in Multimodal Large Language Models through Activation Steering

Chenlu Ding, Jiancan Wu, Leheng Sheng, Fan Zhang, Yancheng Yuan, Xiang Wang, Xiangnan He

2510.01984 2026-02-03 cs.RO cs.SY eess.SY

SPARC: Spine with Prismatic and Revolute Compliance for Quadruped Robots

Yue Wang

2509.23951 2026-02-03 cs.CV

HunyuanImage 3.0 Technical Report

Siyu Cao, Hangting Chen, Peng Chen, Yiji Cheng, Yutao Cui, Xinchi Deng, Ying Dong, Kipper Gong, Tianpeng Gu, Xiusen Gu, Tiankai Hang, Duojun Huang, Jie Jiang, Zhengkai Jiang, Weijie Kong, Changlin Li, Donghao Li, Junzhe Li, Xin Li, Yang Li, Zhenxi Li, Zhimin Li, Jiaxin Lin, Linus, Lucaz Liu, Shu Liu, Songtao Liu, Yu Liu, Yuhong Liu, Yanxin Long, Fanbin Lu, Qinglin Lu, Yuyang Peng, Yuanbo Peng, Xiangwei Shen, Yixuan Shi, Jiale Tao, Yangyu Tao, Qi Tian, Pengfei Wan, Chunyu Wang, Kai Wang, Lei Wang, Linqing Wang, Lucas Wang, Qixun Wang, Weiyan Wang, Hao Wen, Bing Wu, Jianbing Wu, Yue Wu, Senhao Xie, Fang Yang, Miles Yang, Xiaofeng Yang, Xuan Yang, Zhantao Yang, Jingmiao Yu, Zheng Yuan, Chao Zhang, Jian-Wei Zhang, Peizhen Zhang, Shi-Xue Zhang, Tao Zhang, Weigang Zhang, Yepeng Zhang, Yingfang Zhang, Zihao Zhang, Zijian Zhang, Penghao Zhao, Zhiyuan Zhao, Xuefei Zhe, Jianchen Zhu, Zhao Zhong

2509.23948 2026-02-03 cs.LG

Monotonic Transformation Invariant Multi-task Learning

Surya Murthy, Kushagra Gupta, Mustafa O. Karabag, David Fridovich-Keil, Ufuk Topcu

2509.22738 2026-02-03 cs.CL cs.LG

Enabling Approximate Joint Sampling in Diffusion LMs

Parikshit Bansal, Sujay Sanghavi

2509.22221 2026-02-03 cs.CV

Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models

Jiaqi Liu, Lang Sun, Ronghao Fu, Bo Yang

2509.22102 2026-02-03 cs.LG cs.AI

Reinforcement Learning for Durable Algorithmic Recourse

Marina Ceccon, Alessandro Fabris, Goran Radanović, Asia J. Biega, Gian Antonio Susto

2509.20295 2026-02-03 cs.CV

FAST: Foreground-aware Diffusion with Accelerated Sampling Trajectory for Segmentation-oriented Anomaly Synthesis

Xichen Xu, Yanshu Wang, Jinbao Wang, Xiaoning Lei, Guoyang Xie, Guannan Jiang, Zhichao Lu

Comments Accepted to NeurIPS 2025

2509.17942 2026-02-03 cs.LG cs.AI

StefaLand: An Efficient Geoscience Foundation Model That Improves Dynamic Land-Surface Predictions

Nicholas Kraabel, Jiangtao Liu, Yuchen Bian, Daniel Kifer, Chaopeng Shen

2509.15748 2026-02-03 cs.CV q-bio.NC

Hybrid Lie semi-group and cascade structures for the generalized Gaussian derivative model for visual receptive fields

Tony Lindeberg

Comments 27 pages, 9 figures

2509.15552 2026-02-03 cs.LG

The Multi-Query Paradox in Zeroth-Order Optimization

Wei Lin, Qingyu Song, Hong Xu

2509.15206 2026-02-03 cs.CL

Fair-GPTQ: Bias-Aware Quantization for Large Language Models

Irina Proskurina, Guillaume Metzler, Julien Velcin

2509.15048 2026-02-03 cs.CL

MaiBERT: A Pre-training Corpus and Language Model for Low-Resourced Maithili Language

Sumit Yadav, Raju Kumar Yadav, Utsav Maskey, Gautam Siddharth Kashyap, Ganesh Gautam, Usman Naseem

Comments Accepted at EACL LoResLM 2026

2509.14944 2026-02-03 cs.SD cs.AI eess.AS

Estimating Respiratory Effort from Nocturnal Breathing Sounds for Obstructive Sleep Apnoea Screening

Xiaolei Xu, Chaoyue Niu, Guy J. Brown, Hector Romero, Ning Ma

Comments Accepted at ICASSP 2026

2509.09509 2026-02-03 cs.RO

SMapper: A Multi-Modal Data Acquisition Platform for SLAM Benchmarking

Pedro Miguel Bastos Soares, Ali Tourani, Miguel Fernandez-Cortizas, Asier Bikandi-Noya, Holger Voos, Jose Luis Sanchez-Lopez

Comments 13 pages, 5 figures, 6 tables


DOI: 10.1007/s10846-026-02351-7

Abstract

Advancing research in fields such as Simultaneous Localization and Mapping (SLAM) and autonomous navigation critically depends on the availability of reliable and reproducible multimodal datasets. While several influential datasets have driven progress in these domains, they often suffer from limitations in sensing modalities, environmental diversity, and the reproducibility of the underlying hardware setups. To address these challenges, this paper introduces SMapper, a novel open-hardware, multi-sensor platform designed explicitly for, though not limited to, SLAM research. The device integrates synchronized LiDAR, multi-camera, and inertial sensing, supported by a robust calibration and synchronization pipeline that ensures precise spatio-temporal alignment across modalities. Its open and replicable design allows researchers to extend its capabilities and reproduce experiments across both handheld and robot-mounted scenarios. To demonstrate its practicality, we additionally release SMapper-light, a publicly available SLAM dataset containing representative indoor and outdoor sequences. The dataset includes tightly synchronized multimodal data and ground truth trajectories derived from offline LiDAR-based SLAM with sub-centimeter accuracy, alongside dense 3D reconstructions. Furthermore, the paper contains benchmarking results on state-of-the-art LiDAR and visual SLAM frameworks using the SMapper-light dataset. By combining open-hardware design, reproducible data collection, and comprehensive benchmarking, SMapper establishes a robust foundation for advancing SLAM algorithm development, evaluation, and reproducibility. The project's documentation, including source code, CAD models, and dataset links, is publicly available at https://snt-arg.github.io/smapper_docs.


2509.09199 2026-02-03 cs.CL

CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling

Wenhao Li, Bangcheng Sun, Weihao Ye, Tianyi Zhang, Daohai Yu, Fei Chao, Rongrong Ji

Comments The quality of this paper is low

2509.06608 2026-02-03 cs.LG

Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors

Viacheslav Sinii, Nikita Balagansky, Gleb Gerasimov, Daniil Laptev, Yaroslav Aksenov, Vadim Kurochkin, Alexey Gorbatovski, Boris Shaposhnikov, Daniil Gavrilov

Comments Preprint

2509.02295 2026-02-03 cs.CV

Data-Driven Loss Functions for Inference-Time Optimization in Text-to-Image

Sapir Esther Yiflach, Yuval Atzmon, Gal Chechik

Comments Project page is at https://learn-to-steer-paper.github.io/

2508.17778 2026-02-03 cs.AI cs.NI

AgentRAN: An Agentic AI Architecture for Autonomous Control of Open 6G Networks

Maxime Elkael, Salvatore D'Oro, Leonardo Bonati, Michele Polese, Yunseong Lee, Koichiro Furueda, Tommaso Melodia

Comments This work has been submitted to the IEEE for possible publication

2508.17649 2026-02-03 cs.LG

Longitudinal Progression Prediction of Alzheimer's Disease with Tabular Foundation Model

Yilang Ding, Jiawen Ren, Jiaying Lu, Gloria Hyunjung Kwak, Armin Iraji, Shengpu Tang, Alex Fedorov

Comments preprint

2508.14779 2026-02-03 cs.CV eess.IV

Hospital-Specific Bias in Patch-Based Pathology Models

Mengliang Zhang

Comments 4 pages, 3 figures

2508.13531 2026-02-03 cs.RO

A Three-Level Whole-Body Disturbance Rejection Control Framework for Dynamic Motions in Legged Robots

Bolin Li, Gewei Zuo, Zhixiang Wang, Xiaotian Ke, Lijun Zhu, Han Ding

Comments has been accepted for publication as a SPECIAL ISSUE paper in the IEEE Transactions on Automation Science and Engineering

2508.12596 2026-02-03 cs.LG

Constructing 3D Rotational Invariance and Equivariance with Symmetric Tensor Networks

Meng Zhang, Chao Wang, Hao Zhang, Shaojun Dong, Lixin He

2508.09198 2026-02-03 cs.LG cs.AI

SACO: Sequence-Aware Constrained Optimization Framework for Coupon Distribution in E-commerce

Li Kong, Bingzhe Wang, Zhou Chen, Suhan Hu, Yuchao Ma, Qi Qi, Suoyuan Song, Bicheng Jin

2508.08855 2026-02-03 cs.CL cs.AI cs.LG

BiasGym: A Simple and Generalizable Framework for Analyzing and Removing Biases through Elicitation

Sekh Mainul Islam, Nadav Borenstein, Siddhesh Milind Pawar, Haeun Yu, Arnav Arora, Isabelle Augenstein

Comments Under review. Title updated

2508.06051 2026-02-03 cs.CV

VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning

Linhan Cao, Wei Sun, Weixia Zhang, Xiangyang Zhu, Jun Jia, Kaiwei Zhang, Dandan Zhu, Guangtao Zhai, Xiongkuo Min

Comments Accepted by AAAI2026


Abstract

Video quality assessment (VQA) aims to objectively quantify perceptual quality degradation in alignment with human visual perception. Despite recent advances, existing VQA models still suffer from two critical limitations: poor generalization to out-of-distribution (OOD) videos and limited explainability, which restrict their applicability in real-world scenarios. To address these challenges, we propose VQAThinker, a reasoning-based VQA framework that leverages large multimodal models (LMMs) with reinforcement learning to jointly model video quality understanding and scoring, emulating human perceptual decision-making. Specifically, we adopt group relative policy optimization (GRPO), a rule-guided reinforcement learning algorithm that enables reasoning over video quality under score-level supervision, and introduce three VQA-specific rewards: (1) a bell-shaped regression reward that increases rapidly as the prediction error decreases and becomes progressively less sensitive near the ground truth; (2) a pairwise ranking reward that guides the model to correctly determine the relative quality between video pairs; and (3) a temporal consistency reward that encourages the model to prefer temporally coherent videos over their perturbed counterparts. Extensive experiments demonstrate that VQAThinker achieves state-of-the-art performance on both in-domain and OOD VQA benchmarks, showing strong generalization for video quality scoring. Furthermore, evaluations on video quality understanding tasks validate its superiority in distortion attribution and quality description compared to existing explainable VQA models and LMMs. These findings demonstrate that reinforcement learning offers an effective pathway toward building generalizable and explainable VQA models solely with score-level supervision.


2508.02741 2026-02-03 cs.LG cs.AI cs.SD eess.AS

DeepGB-TB: A Risk-Balanced Cross-Attention Gradient-Boosted Convolutional Network for Rapid, Interpretable Tuberculosis Screening

Zhixiang Lu, Yulong Li, Feilong Tang, Zhengyong Jiang, Chong Li, Mian Zhou, Tenglong Li, Jionglong Su

Comments Accepted by AAAI 2026 (oral)

Journal ref Proceedings of the AAAI Conference on Artificial Intelligence, 2026
