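arXivDaily is a daily academic digest synced with the full arXiv dataset, with AI summaries and translations. It covers artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.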

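HOT Artificial Intelligence, Robotics, etc. 9

cs.AI Artificial Intelligence, cs.CV Computer Vision, cs.CL Natural Language Processing, cs.RO Robotics, cs.LG Machine Learning, cs.SD Sound, cs.ET Emerging Technologies, eess.AS Audio and Speech, eess.IV Image and Video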

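CS Computer Science 41

cs Computer Science, cs.AI Artificial Intelligence, cs.AR Hardware Architecture, cs.CC Computational Complexity, cs.CE Computational Engineering, cs.CG Computational Geometry, cs.CL Natural Language Processing, cs.CR Cryptography and Security, cs.CV Computer Vision, cs.CY Computers and Society, cs.DB Databases, cs.DC Distributed Computing, cs.DL Digital Libraries, cs.DM Discrete Mathematics, cs.DS Data Structures, cs.ET Emerging Technologies, cs.FL Formal Languages, cs.GL General Literature, cs.GR Graphics, cs.GT Game Theory, cs.HC Human-Computer Interaction, cs.IR Information Retrieval, cs.IT Information Theory, cs.LG Machine Learning, cs.LO Logic in Computer Science, cs.MA Multiagent Systems, cs.MM Multimedia, cs.MS Mathematical Software, cs.NA Numerical Analysis, cs.NE Neural and Evolutionary Computing, cs.NI Networking and Internet Architecture, cs.OH Other Computer Science, cs.OS Operating Systems, cs.PF Performance, cs.PL Programming Languages, cs.RO Robotics, cs.SC Symbolic Computation, cs.SD Sound, cs.SE Software Engineering, cs.SI Social and Information Networks, cs.SY Systems and Control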

ECON Economics 4

econ Economics, econ.EM Econometrics, econ.GN General Economics, econ.TH Theoretical Economics

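EESS Electrical Engineering and Systems 5

eess Electrical Engineering and Systems, eess.AS Audio and Speech, eess.IV Image and Video, eess.SP Signal Processing, eess.SY Systems and Control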

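MATH Mathematics 33

math Mathematics, math.AC Commutative Algebra, math.AG Algebraic Geometry, math.AP Partial Differential Equations, math.AT Algebraic Topology, math.CA Classical Analysis, math.CO Combinatorics, math.CT Category Theory, math.CV Complex Variables, math.DG Differential Geometry, math.DS Dynamical Systems, math.FA Functional Analysis, math.GM General Mathematics, math.GN General Topology, math.GR Group Theory, math.GT Geometric Topology, math.HO History and Overview, math.IT Information Theory, math.KT K-Theory, math.LO Logic, math.MG Metric Geometry, math.MP Mathematical Physics, math.NA Numerical Analysis, math.NT Number Theory, math.OA Operator Algebras, math.OC Optimization and Control, math.PR Probability, math.QA Quantum Algebra, math.RA Rings and Algebras, math.RT Representation Theory, math.SG Symplectic Geometry, math.SP Spectral Theory, math.ST Statistics Theory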

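PHYSICS Physics 55

astro-ph Astrophysics, astro-ph.CO Cosmology, astro-ph.EP Earth and Planetary Astrophysics, astro-ph.GA Astrophysics of Galaxies, astro-ph.HE High Energy Astrophysics, astro-ph.IM Astronomical Instrumentation, astro-ph.SR Solar and Stellar Astrophysics, cond-mat Condensed Matter, cond-mat.dis-nn Disordered Systems and Neural Networks, cond-mat.mes-hall Mesoscale and Nanoscale Physics, cond-mat.mtrl-sci Materials Science, cond-mat.other Other Condensed Matter, cond-mat.quant-gas Quantum Gases, cond-mat.soft Soft Condensed Matter, cond-mat.stat-mech Statistical Mechanics, cond-mat.str-el Strongly Correlated Electrons, cond-mat.supr-con Superconductivity, gr-qc General Relativity, hep-ex High Energy Physics Experiment, hep-lat High Energy Physics Lattice, hep-ph High Energy Physics Phenomenology, hep-th High Energy Physics Theory, math-ph Mathematical Physics, nlin Nonlinear Sciences, nlin.AO Adaptive Systems, nlin.CD Chaotic Dynamics, nlin.CG Cellular Automata, nlin.PS Pattern Formation and Solitons, nlin.SI Integrable Systems, nucl-ex Nuclear Experiment, nucl-th Nuclear Theory, physics Physics, physics.acc-ph Accelerator Physics, physics.ao-ph Atmospheric and Oceanic Physics, physics.app-ph Applied Physics, physics.atm-clus Atomic and Molecular Clusters, physics.atom-ph Atomic Physics, physics.bio-ph Biological Physics, physics.chem-ph Chemical Physics, physics.class-ph Classical Physics, physics.comp-ph Computational Physics, physics.data-an Data Analysis, physics.ed-ph Physics Education, physics.flu-dyn Fluid Dynamics, physics.gen-ph General Physics, physics.geo-ph Geophysics, physics.hist-ph History and Philosophy of Physics, physics.ins-det Instrumentation and Detectors, physics.med-ph Medical Physics, physics.optics Optics, physics.plasm-ph Plasma Physics, physics.pop-ph Popular Physics, physics.soc-ph Physics and Society, physics.space-ph Space Physics, quant-ph Quantum Physics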

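Q-BIO Quantitative Biology 11

q-bio Quantitative Biology, q-bio.BM Biomolecules, q-bio.CB Cell Behavior, q-bio.GN Genomics, q-bio.MN Molecular Networks, q-bio.NC Neurons and Cognition, q-bio.OT Other Quantitative Biology, q-bio.PE Populations and Evolution, q-bio.QM Quantitative Methods, q-bio.SC Subcellular Processes, q-bio.TO Tissues and Organs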

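Q-FIN Quantitative Finance 10

q-fin Quantitative Finance, q-fin.CP Computational Finance, q-fin.EC Economics, q-fin.GN General Finance, q-fin.MF Mathematical Finance, q-fin.PM Portfolio Management, q-fin.PR Pricing of Securities, q-fin.RM Risk Management, q-fin.ST Statistical Finance, q-fin.TR Trading and Market Microstructure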

STAT Statistics 7

stat Statistics, stat.AP Applications, stat.CO Computation, stat.ME Methodology, stat.ML Machine Learning, stat.OT Other Statistics, stat.TH Statistics Theory

2602.24021 2026-03-02 cs.CV

Steering and Rectifying Latent Representation Manifolds in Frozen Multi-modal LLMs for Video Anomaly Detection

Zhaolin Cai, Fan Li, Huiyu Duan, Lijun He, Guangtao Zhai

Comments Accepted by ICLR 2026

2602.24020 2026-03-02 cs.CV

SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting

Xiang Feng, Xiangbo Wang, Tieshi Zhong, Chengkai Wang, Yiting Zhao, Tianxiang Xu, Zhenzhong Kuang, Feiwei Qin, Xuefei Yin, Yanming Zhu

Comments CVPR 2026

2602.24014 2026-03-02 cs.CV cs.AI

Interpretable Debiasing of Vision-Language Models for Social Fairness

Na Min An, Yoonna Jang, Yusuke Hirota, Ryo Hachiuma, Isabelle Augenstein, Hyunjung Shim

Comments 25 pages, 30 figures, 13 tables. Accepted to CVPR 2026

2602.24013 2026-03-02 cs.CV

Ordinal Diffusion Models for Color Fundus Images

Gustav Schmidt, Philipp Berens, Sarah Müller

2602.24011 2026-03-02 cs.RO

Autonomous Inspection of Power Line Insulators with UAV on an Unmapped Transmission Tower

Václav Riss, Vít Krátký, Robert Pěnička, Martin Saska

Comments 8 pages, 9 figures

2602.24002 2026-03-02 cs.CL

Dialect and Gender Bias in YouTube's Spanish Captioning System

Iris Dania Jimenez, Christoph Kern

Comments 21 pages, 4 tables

2602.23997 2026-03-02 cs.LG cs.AI

Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments

Florent Delgrange

Comments AAMAS 2026, Blue Sky Idea Track. 4 pages, 1 Figure

2602.23996 2026-03-02 cs.CV

Accelerating Masked Image Generation by Learning Latent Controlled Dynamics

Kaiwen Zhu, Quansheng Zeng, Yuandong Pu, Shuo Cao, Xiaohui Li, Yi Xin, Qi Qin, Jiayang Li, Yu Qiao, Jinjin Gu, Yihao Liu

2602.23994 2026-03-02 cs.LG cs.AI cs.CV

MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening

Vrushank Ahire, Yogesh Kumar, Anouck Girard, M. A. Ganaie

Abstract

Alzheimer's disease is a progressive neurodegenerative disorder in which mild cognitive impairment (MCI) marks a critical transition between aging and dementia. Neuroimaging modalities, such as structural MRI, provide biomarkers of this transition; however, their high costs and infrastructure needs limit their deployment at a population scale. Speech analysis offers a non-invasive alternative, but speech-only classifiers are developed independently of neuroimaging, leaving decision boundaries biologically ungrounded and limiting reliability on the subtle CN-versus-MCI distinction. We propose MINT (Multimodal Imaging-to-Speech Knowledge Transfer), a three-stage cross-modal framework that transfers biomarker structure from MRI into a speech encoder at training time. An MRI teacher, trained on 1,228 subjects, defines a compact neuroimaging embedding space for CN-versus-MCI classification. A residual projection head aligns speech representations to this frozen imaging manifold via a combined geometric loss, adapting speech to the learned biomarker space while preserving imaging encoder fidelity. The frozen MRI classifier, which is never exposed to speech, is applied to aligned embeddings at inference and requires no scanner. Evaluation on ADNI-4 shows aligned speech achieves performance comparable to speech-only baselines (AUC 0.720 vs 0.711) while requiring no imaging at inference, demonstrating that MRI-derived decision boundaries can ground speech representations. Multimodal fusion improves over MRI alone (0.973 vs 0.958). Ablation studies identify dropout regularization and self-supervised pretraining as critical design decisions. To our knowledge, this is the first demonstration of MRI-to-speech knowledge transfer for early Alzheimer's screening, establishing a biologically grounded pathway for population-level cognitive triage without neuroimaging at inference.

2602.23993 2026-03-02 cs.CL

The GRADIEND Python Package: An End-to-End System for Gradient-Based Feature Learning

Jonathan Drechsel, Steffen Herbold

2602.23981 2026-03-02 cs.LG cs.AI

Intrinsic Lorentz Neural Network

Xianglong Shi, Ziheng Chen, Yunhan Jiang, Nicu Sebe

Comments Published in ICLR 2026

2602.23980 2026-03-02 cs.CV

Venus: Benchmarking and Empowering Multimodal Large Language Models for Aesthetic Guidance and Cropping

Tianxiang Du, Hulingxiao He, Yuxin Peng

Comments Accepted by CVPR 2026

2602.23968 2026-03-02 cs.LG

Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference

David Fox, Sam Bowyer, Song Liu, Laurence Aitchison, Raul Santos-Rodriguez, Mengyue Yang

Comments 12 pages, 1 figure

2602.23963 2026-03-02 cs.CV

SpikeTrack: A Spike-driven Framework for Efficient Visual Tracking

Qiuyang Zhang, Jiujun Cheng, Qichao Mao, Cong Liu, Yu Fang, Yuhong Li, Mengying Ge, Shangce Gao

Comments Accepted by CVPR 2026

2602.23960 2026-03-02 cs.SD cs.AI

SHINE: Sequential Hierarchical Integration Network for EEG and MEG

Xiran Xu, Yujie Yan, Xihong Wu, Jing Chen

Comments Ranked second at the LibriBrain Competition 2025: https://neural-processing-lab.github.io/2025-libribrain-competition/prizes/

2602.23959 2026-03-02 cs.CV

Thinking with Images as Continuous Actions: Numerical Visual Chain-of-Thought

Kesen Zhao, Beier Zhu, Junbao Zhou, Xingyu Zhu, Zhongqi Yue, Hanwang Zhang

2602.23953 2026-03-02 cs.CV

GDA-YOLO11: Amodal Instance Segmentation for Occlusion-Robust Robotic Fruit Harvesting

Caner Beldek, Emre Sariyildiz, Son Lam Phung, Gursel Alici

Comments 9 pages, journal pre-print

2602.23952 2026-03-02 cs.CV

CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering

Yuyang Hong, Jiaqi Gu, Yujin Lou, Lubin Fan, Qi Yang, Ying Wang, Kun Ding, Yue Wu, Shiming Xiang, Jieping Ye

Comments Accepted by CVPR 2026

2602.23950 2026-03-02 cs.CV cs.AI

Micro-expression Recognition Based on Dual-branch Feature Extraction and Fusion

Mingjie Zhang, Bo Li, Wanting Liu, Hongyan Cui, Yue Li, Qingwen Li, Hong Li, Ge Gao

Comments 4 pages, 4 figures, conference paper

2602.23947 2026-03-02 cs.LG cs.AI

Hierarchical Concept-based Interpretable Models

Oscar Hill, Mateo Espinosa Zarlenga, Mateja Jamnik

Comments Published as a conference paper at ICLR 2026

2602.23945 2026-03-02 cs.CV cs.AI cs.MM

PointCoT: A Multi-modal Benchmark for Explicit 3D Geometric Reasoning

Dongxu Zhang, Yiding Sun, Pengcheng Li, Yumou Liu, Hongqiang Lin, Haoran Xu, Xiaoxuan Mu, Liang Lin, Wenbiao Yan, Ning Yang, Chaowei Fang, Juanjuan Zhao, Jihua Zhu, Conghui He, Cheng Tan

2602.23944 2026-03-02 cs.CL

MemEmo: Evaluating Emotion in Memory Systems of Agents

Peng Liu, Zhen Tao, Jihao Zhao, Ding Chen, Yansong Zhang, Cuiping Li, Zhiyu Li, Hong Chen

2602.23941 2026-03-02 cs.CL cs.DL cs.IR

EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates

Ludovic Moncla, Pierre Nugues, Thierry Joliveau, Katherine McDonough

Comments Accepted at LREC 2026

2602.23940 2026-03-02 cs.CL cs.LG

Benchmarking BERT-based Models for Sentence-level Topic Classification in Nepali Language

Nischal Karki, Bipesh Subedi, Prakash Poudyal, Rupak Raj Ghimire, Bal Krishna Bal

Comments 5 pages, 2 figures. Accepted and presented at the Regional International Conference on Natural Language Processing (RegICON 2025), Gauhati University, Guwahati, India, November 27-29, 2025. To appear in the conference proceedings. Accepted papers list available at: https://www.regicon2025.in/accepted-papers

2602.23937 2026-03-02 cs.RO cs.CV

Enhancing Vision-Language Navigation with Multimodal Event Knowledge from Real-World Indoor Tour Videos

Haoxuan Xu, Tianfu Li, Wenbo Chen, Yi Liu, Xingxing Zuo, Yaoxian Song, Haoang Li

2602.23934 2026-03-02 cs.RO cs.LG

Learning to Build: Autonomous Robotic Assembly of Stable Structures Without Predefined Plans

Jingwen Wang, Johannes Kirschner, Paul Rolland, Luis Salamanca, Stefana Parascho

2602.23926 2026-03-02 cs.CV

Leveraging Geometric Prior Uncertainty and Complementary Constraints for High-Fidelity Neural Indoor Surface Reconstruction

Qiyu Feng, Jiwei Shan, Shing Shin Cheng, Hesheng Wang

Comments Accepted by ICRA 2026

2602.23906 2026-03-02 cs.CV

Half-Truths Break Similarity-Based Retrieval

Bora Kargi, Arnas Uselis, Seong Joon Oh

2602.23903 2026-03-02 cs.CV cs.LG

SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmentation

Andrei-Alexandru Bunea, Dan-Matei Popovici, Radu Tudor Ionescu

2602.23901 2026-03-02 cs.RO cs.CV

ABPolicy: Asynchronous B-Spline Flow Policy for Real-Time and Smooth Robotic Manipulation

Fan Yang, Peiguang Jing, Kaihua Qu, Ningyuan Zhao, Yuting Su