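arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.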

2603.19776 2026-03-23 cs.CV

ReManNet: A Riemannian Manifold Network for Monocular 3D Lane Detection

Chengzhi Hong, Bijun Li

English abstract

Monocular 3D lane detection remains challenging due to depth ambiguity and weak geometric constraints. Mainstream methods rely on depth guidance, BEV projection, and anchor- or curve-based heads with simplified physical assumptions, remapping high-dimensional image features while only weakly encoding road geometry. Lacking an invariant geometric-topological coupling between lanes and the underlying road surface, 2D-to-3D lifting is ill-posed and brittle, often degenerating into concavities, bulges, and twists. To address this, we propose the Road-Manifold Assumption: the road is a smooth 2D manifold in $\mathbb{R}^3$, lanes are embedded 1D submanifolds, and sampled lane points are dense observations, thereby coupling metric and topology across surfaces, curves, and point sets. Building on this, we propose ReManNet, which first produces initial lane predictions with an image backbone and detection heads, then encodes geometry as Riemannian Gaussian descriptors on the symmetric positive-definite (SPD) manifold, and fuses these descriptors with visual features through a lightweight gate to maintain coherent 3D reasoning. We also propose the 3D Tunnel Lane IoU (3D-TLIoU) loss, a joint point-curve objective that computes slice-wise overlap of tubular neighborhoods along each lane to improve shape-level alignment. Extensive experiments on standard benchmarks demonstrate that ReManNet achieves state-of-the-art (SOTA) or competitive results. On OpenLane, it improves F1 by +8.2% over the baseline and by +1.8% over the previous best, with scenario-level gains of up to +6.6%. The code will be publicly available at https://github.com/changehome717/ReManNet.
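
The abstract does not spell out how a "Riemannian Gaussian descriptor on the SPD manifold" is built. As a rough illustration only, the sketch below uses one common construction from the literature: fit a Gaussian to sampled 3D lane points and embed its mean and covariance in a 4x4 symmetric positive-definite matrix, then map it to a flat space with the matrix logarithm. The function names `gaussian_to_spd` and `spd_log`, the `eps` regularizer, and the choice of embedding are assumptions for illustration, not the authors' confirmed method.

```python
import numpy as np


def gaussian_to_spd(points, eps=1e-6):
    """Embed the Gaussian fitted to an (N, 3) point set as a 4x4 SPD matrix.

    Uses the block embedding [[Sigma + mu mu^T, mu], [mu^T, 1]], an assumed
    (hypothetical) choice; the paper may use a different descriptor.
    """
    mu = points.mean(axis=0)
    centered = points - mu
    # Regularize the covariance so it is strictly positive definite.
    sigma = centered.T @ centered / len(points) + eps * np.eye(3)
    spd = np.empty((4, 4))
    spd[:3, :3] = sigma + np.outer(mu, mu)
    spd[:3, 3] = mu
    spd[3, :3] = mu
    spd[3, 3] = 1.0
    return spd


def spd_log(spd):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for sampled lane points (not real benchmark data).
    lane_points = rng.normal(size=(50, 3)) + np.array([1.0, 20.0, 0.5])
    descriptor = spd_log(gaussian_to_spd(lane_points))
    print(descriptor.shape)
```

The logarithm turns the SPD matrix into an ordinary symmetric matrix, so its entries could be flattened and combined with visual features by a standard network layer; how ReManNet actually performs that fusion is described only as a "lightweight gate" in the abstract.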

2603.19773 2026-03-23 cs.CV

Template-based Object Detection Using a Foundation Model

Valentin Braeutigam, Matthias Stock, Bernhard Egger

2603.19771 2026-03-23 cs.CL

Neither Here Nor There: Cross-Lingual Representation Dynamics of Code-Mixed Text in Multilingual Encoders

Debajyoti Mazumder, Divyansh Pathak, Prashant Kodali, Jasabanta Patro

Comments 24 pages

2603.19770 2026-03-23 cs.CV

FlashCap: Millisecond-Accurate Human Motion Capture via Flashing LEDs and Event-Based Vision

Zekai Wu, Shuqi Fan, Mengyin Liu, Yuhua Luo, Xincheng Lin, Ming Yan, Junhao Wu, Xiuhong Lin, Yuexin Ma, Chenglu Wen, Lan Xu, Siqi Shen, Cheng Wang

Comments Accepted to CVPR 2026

2603.19766 2026-03-23 cs.CV

Adapting a Pre-trained Single-Cell Foundation Model to Spatial Gene Expression Generation from Histology Images

Donghai Fang, Yongheng Li, Zhen Wang, Yuansong Zeng, Wenwen Min

Comments Accepted by CVPR 2026

2603.19765 2026-03-23 cs.CV

FREAK: A Fine-grained Hallucination Evaluation Benchmark for Advanced MLLMs

Zhihan Yin, Jianxin Liang, Yueqian Wang, Yifeng Yao, Huishuai Zhang, Dongyan Zhao

Comments 34 pages

2603.19762 2026-03-23 cs.CV

PCSTracker: Long-Term Scene Flow Estimation for Point Cloud Sequences

Min Lin, Gangwei Xu, Xianqi Wang, Yuyi Peng, Xin Yang

Comments Accepted in CVPR 2026 (Findings)

2603.19759 2026-03-23 cs.CV cs.LG

Growing Networks with Autonomous Pruning

Charles De Lambilly, Stefan Duffner

2603.19757 2026-03-23 cs.CV cs.AI

Uncertainty-aware Prototype Learning with Variational Inference for Few-shot Point Cloud Segmentation

Yifei Zhao, Fanyu Zhao, Yinsheng Li

Comments 5 pages, 3 figures, 3 tables, accepted by ICASSP 2026

2603.19753 2026-03-23 cs.CV cs.GR

ReLi3D: Relightable Multi-view 3D Reconstruction with Disentangled Illumination

Jan-Niklas Dihlmann, Mark Boss, Simon Donne, Andreas Engelhardt, Hendrik P. A. Lensch, Varun Jampani

Comments Project Page: https://reli3d.jdihlmann.com/

2603.19752 2026-03-23 cs.CV

PhysNeXt: Next-Generation Dual-Branch Structured Attention Fusion Network for Remote Photoplethysmography Measurement

Junzhe Cao, Bo Zhao, Zhiyi Niu, Dan Guo, Yue Sun, Haochen Liang, Yong Xu, Zitong Yu

2603.19744 2026-03-23 cs.CL

Rethinking Ground Truth: A Case Study on Human Label Variation in MLLM Benchmarking

Tomas Ruiz, Tanalp Agustoslu, Carsten Schwemmer

Comments 6 pages, 3 tables, 1 figure

2603.19742 2026-03-23 cs.LG cs.CL

Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation

Lasse Marten Jantsch, Dong-Jae Koh, Seonghyeon Lee, Young-Kyoon Suh

2603.19741 2026-03-23 cs.LG cs.CL

FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment

Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu

Comments under review

2603.19739 2026-03-23 cs.SD cs.AI cs.CL

MOSS-TTSD: Text to Spoken Dialogue Generation

Yuqian Zhang, Donghua Yu, Zhengyuan Lin, Botian Jiang, Mingshu Chen, Yaozhou Jiang, Yiwei Zhao, Yiyang Zhang, Yucheng Yuan, Hanfu Chen, Kexin Huang, Jun Zhan, Cheng Chang, Zhaoye Fei, Shimin Li, Xiaogui Yang, Qinyuan Cheng, Xipeng Qiu

2603.19733 2026-03-23 cs.CL

PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction

Runsong Zhao, Shilei Liu, Jiwei Tang, Langming Liu, Haibin Chen, Weidong Zhang, Yujin Yuan, Tong Xiao, Jingbo Zhu, Wenbo Su, Bo Zheng

2603.19731 2026-03-23 cs.CV

PerformRecast: Expression and Head Pose Disentanglement for Portrait Video Editing

Jiadong Liang, Bojun Xiong, Jie Tian, Hua Li, Xiao Long, Yong Zheng, Huan Fu

Comments Accepted to CVPR 2026. Project Page: https://youku-aigc.github.io/PerformRecast

2603.19718 2026-03-23 cs.CV

BALM: A Model-Agnostic Framework for Balanced Multimodal Learning under Imbalanced Missing Rates

Phuong-Anh Nguyen, Tien Anh Pham, Duc-Trong Le, Cam-Van Thi Nguyen

Comments Accepted by CVPR 2026

2603.19714 2026-03-23 cs.CL

LoopRPT: Reinforcement Pre-Training for Looped Language Models

Guo Tang, Shixin Jiang, Heng Chang, Nuo Chen, Yuhan Li, Huiming Fan, Jia Li, Ming Liu, Bing Qin

2603.19713 2026-03-23 cs.LG

Learning from Similarity/Dissimilarity and Pairwise Comparison

Tomoya Tate, Kosuke Sugiyama, Masato Uchida

2603.19712 2026-03-23 cs.CL

TAB-AUDIT: Detecting AI-Fabricated Scientific Tables via Multi-View Likelihood Mismatch

Shuo Huang, Yan Pen, Lizhen Qu

2603.19711 2026-03-23 cs.CL

EvoTaxo: Building and Evolving Taxonomy from Social Media Streams

Yiyang Li, Tianyi Ma, Yanfang Ye

2603.19708 2026-03-23 cs.CV

WorldAgents: Can Foundation Image Models be Agents for 3D World Models?

Ziya Erkoç, Angela Dai, Matthias Nießner

Comments Webpage: https://ziyaerkoc.com/worldagents/ Video: https://www.youtube.com/watch?v=Mj2FqqhurdI

2603.19700 2026-03-23 cs.LG cs.GT

Regret Analysis of Sleeping Competing Bandits

Shinnosuke Uba, Yutaro Yamaguchi

Comments 29 pages, 3 figures

2603.19695 2026-03-23 cs.CV

Demographic-Aware Self-Supervised Anomaly Detection Pretraining for Equitable Rare Cardiac Diagnosis

Chaoqin Huang, Zi Zeng, Aofan Jiang, Yuchen Xu, Qing Cao, Kang Chen, Chenfei Chi, Yanfeng Wang, Ya Zhang

2603.19688 2026-03-23 cs.CL

DataProphet: Demystifying Supervision Data Generalization in Multimodal LLMs

Xuan Qi, Luxi He, Dan Roth, Xingyu Fu

Comments 14 pages

2603.19685 2026-03-23 cs.AI cs.LG cs.MA

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Taiyi Wang, Sian Gooding, Florian Hartmann, Oriana Riva, Edward Grefenstette

Comments 50 pages, 15 figures

2603.19683 2026-03-23 cs.LG

Ontology-Based Knowledge Modeling and Uncertainty-Aware Outdoor Air Quality Assessment Using Weighted Interval Type-2 Fuzzy Logic

Md Inzmam, Ritesh Chandra, Sadhana Tiwari, Sonali Agarwal, Triloki Pant

2603.19681 2026-03-23 cs.CV

Unbiased Dynamic Multimodal Fusion

Shicai Wei, Kaijie Zhang, Luyi Chen, Tao He, Guiduo Duan

Comments CVPR2026 Findings, 11 pages, 4 figures

2603.19678 2026-03-23 cs.CV

Vision-Language Attribute Disentanglement and Reinforcement for Lifelong Person Re-Identification

Kunlun Xu, Haotong Cheng, Jiangmeng Li, Xu Zou, Jiahuan Zhou

Comments Accepted by CVPR 2026