arXivDaily每日学术速递，同步arXiv全量数据，AI总结、翻译，覆盖人工智能、机器人、计算机、金融、统计学、数学、物理学、生物学、经济学、电气&系统等方向。

检索范围排序方式

检索时间范围

重置

HOT 人工智能、机器人等 9

cs.AI 人工智能 cs.CV 计算机视觉 cs.CL 自然语言处理 cs.RO 机器人 cs.LG 机器学习 cs.SD 声音 cs.ET 新兴技术 eess.AS 音频语音 eess.IV 图像视频

CS 计算机 41

cs 计算机 cs.AI 人工智能 cs.AR 硬件架构 cs.CC 计算复杂性 cs.CE 计算工程 cs.CG 计算几何 cs.CL 自然语言处理 cs.CR 密码安全 cs.CV 计算机视觉 cs.CY 计算机与社会 cs.DB 数据库 cs.DC 分布式计算 cs.DL 数字图书馆 cs.DM 离散数学 cs.DS 数据结构 cs.ET 新兴技术 cs.FL 形式语言 cs.GL 综述文献 cs.GR 图形学 cs.GT 博弈论 cs.HC 人机交互 cs.IR 信息检索 cs.IT 信息论 cs.LG 机器学习 cs.LO 计算机逻辑 cs.MA 多智能体 cs.MM 多媒体 cs.MS 数学软件 cs.NA 数值分析 cs.NE 神经进化 cs.NI 网络架构 cs.OH 其他计算机 cs.OS 操作系统 cs.PF 性能 cs.PL 编程语言 cs.RO 机器人 cs.SC 符号计算 cs.SD 声音 cs.SE 软件工程 cs.SI 社会信息网络 cs.SY 系统控制

ECON 经济学 4

econ 经济学 econ.EM 计量经济 econ.GN 一般经济 econ.TH 理论经济

EESS 电气与系统 5

eess 电气与系统 eess.AS 音频语音 eess.IV 图像视频 eess.SP 信号处理 eess.SY 系统控制

MATH 数学 33

math 数学 math.AC 交换代数 math.AG 代数几何 math.AP 偏微分方程 math.AT 代数拓扑 math.CA 经典分析 math.CO 组合数学 math.CT 范畴论 math.CV 复变函数 math.DG 微分几何 math.DS 动力系统 math.FA 泛函分析 math.GM 一般数学 math.GN 一般拓扑 math.GR 群论 math.GT 几何拓扑 math.HO 历史综述 math.IT 信息论 math.KT K理论 math.LO 逻辑 math.MG 度量几何 math.MP 数学物理 math.NA 数值分析 math.NT 数论 math.OA 算子代数 math.OC 优化控制 math.PR 概率 math.QA 量子代数 math.RA 环与代数 math.RT 表示论 math.SG 辛几何 math.SP 谱理论 math.ST 统计理论

PHYSICS 物理 55

astro-ph 天体物理 astro-ph.CO 宇宙学 astro-ph.EP 地球行星 astro-ph.GA 星系物理 astro-ph.HE 高能天体 astro-ph.IM 天文仪器 astro-ph.SR 太阳恒星 cond-mat 凝聚态 cond-mat.dis-nn 无序神经 cond-mat.mes-hall 介观纳米 cond-mat.mtrl-sci 材料科学 cond-mat.other 其他凝聚态 cond-mat.quant-gas 量子气体 cond-mat.soft 软凝聚态 cond-mat.stat-mech 统计力学 cond-mat.str-el 强关联电子 cond-mat.supr-con 超导 gr-qc 广义相对论 hep-ex 高能实验 hep-lat 格点高能 hep-ph 高能唯象 hep-th 高能理论 math-ph 数学物理 nlin 非线性科学 nlin.AO 自适应系统 nlin.CD 混沌动力学 nlin.CG 胞自动机 nlin.PS 斑图孤子 nlin.SI 可积系统 nucl-ex 核物理实验 nucl-th 核物理理论 physics 物理 physics.acc-ph 加速器物理 physics.ao-ph 大气海洋 physics.app-ph 应用物理 physics.atm-clus 原子分子团簇 physics.atom-ph 原子物理 physics.bio-ph 生物物理 physics.chem-ph 化学物理 physics.class-ph 经典物理 physics.comp-ph 计算物理 physics.data-an 数据分析 physics.ed-ph 物理教育 physics.flu-dyn 流体动力学 physics.gen-ph 普通物理 physics.geo-ph 地球物理 physics.hist-ph 物理史哲 physics.ins-det 仪器探测 physics.med-ph 医学物理 physics.optics 光学 physics.plasm-ph 等离子体 physics.pop-ph 科普物理 physics.soc-ph 物理与社会 physics.space-ph 空间物理 quant-ph 量子物理

Q-BIO 定量生物 11

q-bio 定量生物 q-bio.BM 生物分子 q-bio.CB 细胞行为 q-bio.GN 基因组学 q-bio.MN 分子网络 q-bio.NC 神经认知 q-bio.OT 其他定量生物 q-bio.PE 种群进化 q-bio.QM 定量方法 q-bio.SC 亚细胞过程 q-bio.TO 组织器官

Q-FIN 定量金融 10

q-fin 定量金融 q-fin.CP 计算金融 q-fin.EC 经济学 q-fin.GN 一般金融 q-fin.MF 数学金融 q-fin.PM 投资组合 q-fin.PR 证券定价 q-fin.RM 风险管理 q-fin.ST 统计金融 q-fin.TR 交易微观结构

STAT 统计 7

stat 统计 stat.AP 统计应用 stat.CO 统计计算 stat.ME 统计方法 stat.ML 机器学习 stat.OT 其他统计 stat.TH 统计理论

2603.04938 2026-03-06 cs.CV cs.LG cs.RO

Person Detection and Tracking from an Overhead Crane LiDAR

Nilusha Jayawickrama, Henrik Toikka, Risto Ojala

Comments 8 pages, 7 figures, 4 tables. Submitted to Ubiquitous Robots (UR) 2026. Code: https://github.com/nilushacj/O-LiPeDeT-Overhead-LiDAR-Person-Detection-and-Tracking

2603.04936 2026-03-06 cs.LG

Semantic Communication-Enhanced Split Federated Learning for Vehicular Networks: Architecture, Challenges, and Case Study

Lu Yu, Zheng Chang, Ying-Chang Liang

Comments Accepted for publication in IEEE Communications Magazine. 7 pages, 5 figures

2603.04933 2026-03-06 cs.CL

AILS-NTUA at SemEval-2026 Task 3: Efficient Dimensional Aspect-Based Sentiment Analysis

Stavros Gazetas, Giorgos Filandrianos, Maria Lymperaiou, Paraskevi Tzouveli, Athanasios Voulodimos, Giorgos Stamou

2603.04921 2026-03-06 cs.CL

AILS-NTUA at SemEval-2026 Task 10: Agentic LLMs for Psycholinguistic Marker Extraction and Conspiracy Endorsement Detection

Panagiotis Alexios Spanakis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou

2603.04920 2026-03-06 cs.AI

Knowledge-informed Bidding with Dual-process Control for Online Advertising

Huixiang Luo, Longyu Gao, Yaqi Liu, Qianqian Chen, Pingchun Huang, Tianning Li

2603.04918 2026-03-06 cs.LG cs.AI

BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning

Yuan Li, Bo Wang, Yufei Gao, Yuqian Yao, Xinyuan Wang, Zhangyue Yin, Xipeng Qiu

Comments Code available at https://github.com/OpenMOSS/BandPO.git

2603.04915 2026-03-06 cs.LG cs.AI cs.CR

EVMbench: Evaluating AI Agents on Smart Contract Security

Justin Wang, Andreas Bigger, Xiaohai Xu, Justin W. Lin, Andy Applebaum, Tejal Patwardhan, Alpin Yukseloglu, Olivia Watkins

2603.04914 2026-03-06 cs.RO cs.SY eess.SY math.OC

U-OBCA: Uncertainty-Aware Optimization-Based Collision Avoidance via Wasserstein Distributionally Robust Chance Constraints

Zehao Wang, Yuxuan Tang, Han Zhang, Jingchuan Wang, Weidong Chen

2603.04913 2026-03-06 cs.RO cs.CV

Beyond the Patch: Exploring Vulnerabilities of Visuomotor Policies via Viewpoint-Consistent 3D Adversarial Object

Chanmi Lee, Minsung Yoon, Woojae Kim, Sebin Lee, Sung-eui Yoon

Comments 8 pages, 10 figures, Accepted to ICRA 2026. Project page: https://chan-mi-lee.github.io/3DAdvObj/

2603.04910 2026-03-06 cs.RO cs.AI cs.LG

VPWEM: Non-Markovian Visuomotor Policy with Working and Episodic Memory

Yuheng Lei, Zhixuan Liang, Hongyuan Zhang, Ping Luo

2603.04908 2026-03-06 cs.CV

AdaIAT: Adaptively Increasing Attention to Generated Text to Alleviate Hallucinations in LVLM

Li'an Zhong, Ziqiang He, Jibin Zheng, Jin Li, Z. Jane Wang, Xiangui Kang

2603.04904 2026-03-06 cs.AI cs.CL

Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems

Hiroki Fukui

Comments 89 pages, 4 figures, 4 supplementary figures, 12 supplementary tables; preprint

2603.04900 2026-03-06 cs.AI

EvoTool: Self-Evolving Tool-Use Policy Optimization in LLM Agents via Blame-Aware Mutation and Diversity-Aware Selection

Shuo Yang, Soyeon Caren Han, Xueqi Ma, Yan Li, Mohammad Reza Ghasemi Madani, Eduard Hovy

Comments Work under review, 9 pages, 5 figures

2603.04899 2026-03-06 cs.CV

FC-VFI: Faithful and Consistent Video Frame Interpolation for High-FPS Slow Motion Video Generation

Ganggui Ding, Hao Chen, Xiaogang Xu

Comments ICASSP2026

2603.04898 2026-03-06 cs.LG cs.NI

U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning

Yiang Wu, Qiong Wu, Pingyi Fan, Kezhi Wang, Wen Chen, Guoqiang Mao, Khaled B. Letaief

Comments This paper has been accepted by infocom. The source code has been released at: https://github.com/qiongwu86/U-Parking

2603.04896 2026-03-06 cs.AI

Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

Lianyu Wang, Meng Wang, Huazhu Fu, Daoqiang Zhang

2603.04894 2026-03-06 cs.AI

Differentially Private Multimodal In-Context Learning

Ivoline C. Ngong, Zarreen Reza, Joseph P. Near

2603.04893 2026-03-06 cs.CL cs.AI

Free Lunch for Pass@$k$? Low Cost Diverse Sampling for Diffusion Language Models

Sean Lamont, Christian Walder, Paul Montague, Amir Dezfouli, Michael Norrish

2603.04892 2026-03-06 cs.CV

Locality-Attending Vision Transformer

Sina Hajimiri, Farzad Beizaee, Fereshteh Shakeri, Christian Desrosiers, Ismail Ben Ayed, Jose Dolz

Comments Accepted to ICLR 2026

2603.04890 2026-03-06 cs.LG cs.AI cs.CV

FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation

Min Tan, Junchao Ma, Yinfu Feng, Jiajun Ding, Wenwen Pan, Tingting Han, Qian Zheng, Zhenzhong Kuang, Zhou Yu

Comments Accepted by CVPR 2026

2603.04887 2026-03-06 cs.CV

Federated Modality-specific Encoders and Partially Personalized Fusion Decoder for Multimodal Brain Tumor Segmentation

Hong Liu, Dong Wei, Qian Dai, Xian Wu, Yefeng Zheng, Liansheng Wang

Comments Medical Image Analysis 2025. arXiv admin note: substantial text overlap with arXiv:2403.11803

详情

DOI: 10.1016/j.media.2025.103759

英文摘要

Most existing federated learning (FL) methods for medical image analysis only considered intramodal heterogeneity, limiting their applicability to multimodal imaging applications. In practice, some FL participants may possess only a subset of the complete imaging modalities, posing intermodal heterogeneity as a challenge to effectively training a global model on all participants' data. Meanwhile, each participant expects a personalized model tailored to its local data characteristics in FL. This work proposes a new FL framework with federated modality-specific encoders and partially personalized multimodal fusion decoders (FedMEPD) to address the two concurrent issues. Specifically, FedMEPD employs an exclusive encoder for each modality to account for the intermodal heterogeneity. While these encoders are fully federated, the decoders are partially personalized to meet individual needs -- using the discrepancy between global and local parameter updates to dynamically determine which decoder filters are personalized. Implementation-wise, a server with full-modal data employs a fusion decoder to fuse representations from all modality-specific encoders, thus bridging the modalities to optimize the encoders via backpropagation. Moreover, multiple anchors are extracted from the fused multimodal representations and distributed to the clients in addition to the model parameters. Conversely, the clients with incomplete modalities calibrate their missing-modal representations toward the global full-modal anchors via scaled dot-product cross-attention, making up for the information loss due to absent modalities. FedMEPD is validated on the BraTS 2018 and 2020 multimodal brain tumor segmentation benchmarks. Results show that it outperforms various up-to-date methods for multimodal and personalized FL, and its novel designs are effective.

URL PDF HTML ☆

赞 0 踩 0

2603.04882 2026-03-06 cs.CV cs.AI cs.MM

DeformTrace: A Deformable State Space Model with Relay Tokens for Temporal Forgery Localization

Xiaodong Zhu, Suting Wang, Yuanming Zheng, Junqi Yang, Yangxu Liao, Yuhong Yang, Weiping Tu, Zhongyuan Wang

Comments 9 pages, 4 figures, accepted by AAAI 2026

2603.04878 2026-03-06 cs.CV

Structure Observation Driven Image-Text Contrastive Learning for Computed Tomography Report Generation

Hong Liu, Dong Wei, Qiong Peng, Yawen Huang, Xian Wu, Yefeng Zheng, Liansheng Wang

Comments Accept to IPMI 2025

详情

DOI: 10.1007/978-3-031-96625-5_15

英文摘要

Computed Tomography Report Generation (CTRG) aims to automate the clinical radiology reporting process, thereby reducing the workload of report writing and facilitating patient care. While deep learning approaches have achieved remarkable advances in X-ray report generation, their effectiveness may be limited in CTRG due to larger data volumes of CT images and more intricate details required to describe them. This work introduces a novel two-stage (structure- and report-learning) framework tailored for CTRG featuring effective structure-wise image-text contrasting. In the first stage, a set of learnable structure-specific visual queries observe corresponding structures in a CT image. The resulting observation tokens are contrasted with structure-specific textual features extracted from the accompanying radiology report with a structure-wise image-text contrastive loss. In addition, text-text similarity-based soft pseudo targets are proposed to mitigate the impact of false negatives, i.e., semantically identical image structures and texts from non-paired images and reports. Thus, the model learns structure-level semantic correspondences between CT images and reports. Further, a dynamic, diversity-enhanced negative queue is proposed to guide the network in learning to discriminate various abnormalities. In the second stage, the visual structure queries are frozen and used to select the critical image patch embeddings depicting each anatomical structure, minimizing distractions from irrelevant areas while reducing memory consumption. Also, a text decoder is added and trained for report generation.Our extensive experiments on two public datasets demonstrate that our framework establishes new state-of-the-art performance for CTRG in clinical efficiency, and its components are effective.

URL PDF HTML ☆

赞 0 踩 0

2603.04874 2026-03-06 cs.CV cs.AI cs.LG

Interpretable Pre-Release Baseball Pitch Type Anticipation from Broadcast 3D Kinematics

Jerrin Bright, Michelle Lu, John Zelek

Comments Submitted to CVPRW'26

2603.04869 2026-03-06 cs.CV

SURE: Semi-dense Uncertainty-REfined Feature Matching

Sicheng Li, Zaiwang Gu, Jie Zhang, Qing Guo, Xudong Jiang, Jun Cheng

Comments Accepted by ICRA 2026

2603.04868 2026-03-06 cs.AI

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Mingxuan Mu, Guo Yang, Lei Chen, Ping Wu, Jianxun Cui

2603.04864 2026-03-06 cs.CV

Scalable Injury-Risk Screening in Baseball Pitching From Broadcast Video

Jerrin Bright, Justin Mende, John Zelek

Comments Submitted to CVPRW'26

2603.04861 2026-03-06 cs.AI cs.LG cs.RO

Causally Robust Reward Learning from Reason-Augmented Preference Feedback

Minjune Hwang, Yigit Korkmaz, Daniel Seita, Erdem Bıyık

Comments Published in International Conference on Learning Representations (ICLR) 2026

2603.04857 2026-03-06 cs.CL cs.SE

FireBench: Evaluating Instruction Following in Enterprise and API-Driven LLM Applications

Yunfan Zhang, Yijie Bei, Jetashree Ravi, Pawel Garbacki

2603.04854 2026-03-06 cs.CL

SinhaLegal: A Benchmark Corpus for Information Extraction and Analysis in Sinhala Legislative Texts

Minduli Lasandi, Nevidu Jayatilleke

Comments 18 pages, 8 figures, 18 tables, Accepted paper at the 2nd workshop on Language Models for Low-Resource Languages (LoResLM 2026) @ EACL 2026