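arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.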

2604.15675 2026-04-20 cs.CL

C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment

Pufan Zeng, Yilun Liu, Mingchen Dai, Mengyao Piao, Chunguang Zhao, Lingqi Miao, Shimin Tao, Weibin Meng, Minggui He, Chenxin Liu, Zhenzhen Qin, Li Zhang, Hongxia Ma, Boxing Chen, Daimeng Wei

Abstract

Achieving cultural alignment in Large Language Models (LLMs) increasingly depends on synthetic data generation. For such synthesis, the most vital initial step is seed curation; however, current methods lack quantifiable standards for selecting these seeds. Existing approaches rely on unscalable manual curation or bias-prone LLM extraction, treating cultural specificity as an abstract concept rather than a measurable signal. In this paper, we address this "quantification gap" by proposing C-Mining, an unsupervised framework that transforms the discovery of cultural seeds from a subjective selection process into a computable data mining formulation. Our approach exploits a novel geometric insight, leveraging the cross-lingual misalignment of cultural concepts within pre-trained embedding spaces as a quantifiable discovery signal. By systematically identifying these regions characterized by pronounced linguistic exclusivity and geometric isolation, while actively filtering out noise, C-Mining automatically extracts high-fidelity Culture Points (CPs) from raw multilingual corpora without reliance on human or LLM supervision, reducing preparation costs by more than 150-fold. We further leverage the mined knowledge to steer the synthesis of diverse instruction-tuning datasets. Extensive experiments demonstrate that this seed-centric approach significantly enhances cultural understanding and reasoning capabilities, achieving a +6.03 point improvement on CulturalBench-Hard and surpassing state-of-the-art baselines, providing a scalable, quantifiable solution for high-quality cultural data synthesis.

2604.15672 2026-04-20 cs.LG cs.CL

Faster LLM Inference via Sequential Monte Carlo

Yahya Emara, Mauricio Barba da Costa, Chi-Chih Chang, Cameron Freer, Tim Vieira, Ryan Cotterell, Mohamed S. Abdelfattah

2604.15671 2026-04-20 cs.RO

Long-Term Memory for VLA-based Agents in Open-World Task Execution

Xu Huang, Weixin Mao, Yinhao Li, Hua Chen, Jiabao Zhao

2604.15670 2026-04-20 cs.CV

PixDLM: A Dual-Path Multimodal Language Model for UAV Reasoning Segmentation

Shuyan Ke, Yifan Mei, Changli Wu, Yonghan Zheng, Jiayi Ji, Liujuan Cao, Rongrong Ji

Comments Accepted to CVPR 2026 (highlight)

2604.15668 2026-04-20 cs.LG

NK-GAD: Neighbor Knowledge-Enhanced Unsupervised Graph Anomaly Detection

Zehao Wang, Lanjun Wang

2604.15665 2026-04-20 cs.CV cs.PF

CPU Optimization of a Monocular 3D Biomechanics Pipeline for Low-Resource Deployment

Yan Zhang, Xiong Zhao

2604.15654 2026-04-20 cs.CV

From Zero to Detail: A Progressive Spectral Decoupling Paradigm for UHD Image Restoration with New Benchmark

Chen Zhao, Yunzhe Xu, Zhizhou Chen, Enxuan Gu, Kai Zhang, Xiaoming Liu, Jian Yang, Ying Tai

Comments TPAMI

2604.15652 2026-04-20 cs.CV

Towards Realistic Open-Vocabulary Remote Sensing Segmentation: Benchmark and Baseline

Bingyu Li, Tao Huo, Haocheng Dong, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li

Abstract

Open-vocabulary remote sensing image segmentation (OVRSIS) remains underexplored due to fragmented datasets, limited training diversity, and the lack of evaluation benchmarks that reflect realistic geospatial application demands. Our previous OVRSISBenchV1 established an initial cross-dataset evaluation protocol, but its limited scope is insufficient for assessing realistic open-world generalization. To address this issue, we propose OVRSISBenchV2, a large-scale and application-oriented benchmark for OVRSIS. We first construct OVRSIS95K, a balanced dataset of about 95K image-mask pairs covering 35 common semantic categories across diverse remote sensing scenes. Built upon OVRSIS95K and 10 downstream datasets, OVRSISBenchV2 contains 170K images and 128 categories, substantially expanding scene diversity, semantic coverage, and evaluation difficulty. Beyond standard open-vocabulary segmentation, it further includes downstream protocols for building extraction, road extraction, and flood detection, thereby better reflecting realistic geospatial application demands and complex deployment scenarios. We also propose Pi-Seg, a baseline for OVRSIS. Pi-Seg improves transferability through a positive-incentive noise mechanism, where learnable and semantically guided perturbations broaden the visual-text feature space during training. Extensive experiments on OVRSISBenchV1, OVRSISBenchV2, and downstream tasks show that Pi-Seg delivers strong and consistent results, particularly on the more challenging OVRSISBenchV2 benchmark. Our results highlight both the importance of realistic benchmark design and the effectiveness of perturbation-based transfer for OVRSIS. The code and datasets are available at https://github.com/LiBingyu01/RSKT-Seg/tree/Pi-Seg.

2604.15651 2026-04-20 cs.CV

SPLIT: Self-supervised Partitioning for Learned Inversion in Nonlinear Tomography

Markus Haltmeier, Lukas Neumann, Nadja Gruber, Gyeongha Hwang

2604.15648 2026-04-20 cs.CL cs.CV

HyperGVL: Benchmarking and Improving Large Vision-Language Models in Hypergraph Understanding and Reasoning

Yanbin Wei, Chun Kang, Siwei Li, Haoxuan Che, Yang Chen, Hua Liu, Jian Liu, Zhuang Liu, Can Ouyang, Fei Xing, Lei Sha, Rui Liu, Yu Zhang, James Kwok

Comments Under Review; Opensource after accepted

2604.15646 2026-04-20 cs.CL

FD-NL2SQL: Feedback-Driven Clinical NL2SQL that Improves with Use

Suparno Roy Chowdhury, Tejas Anvekar, Manan Roy Choudhury, Muhammad Ali Khan, Kaneez Zahra Rubab Khakwani, Mohamad Bassam Sonbol, Irbaz Bin Riaz, Vivek Gupta

2604.15645 2026-04-20 cs.LG physics.comp-ph quant-ph

PINNACLE: An Open-Source Computational Framework for Classical and Quantum PINNs

Shimon Pisnoy, Hemanth Chandravamsi, Ziv Chen, Aaron Goldgewert, Gal Shaviner, Boris Shragner, Steven H. Frankel

2604.15638 2026-04-20 cs.RO cs.SY eess.SY math.OC

Contact-Aware Planning and Control of Continuum Robots in Highly Constrained Environments

Aedan Mangan, Kehan Long, Ki Myung Brian Lee, Miheer Potdar, Nikolay Atanasov, Tania K. Morimoto

Comments 15 pages, 3 figures

2604.15631 2026-04-20 cs.CV

Causal Bootstrapped Alignment for Unsupervised Video-Based Visible-Infrared Person Re-Identification

Shuang Li, Jiaxu Leng, Changjiang Kuang, Mingpi Tan, Yu Yuan, Xinbo Gao

Comments Submitted to IEEE TIFS

Abstract

VVI-ReID is a critical technique for all-day surveillance, where temporal information provides additional cues beyond static images. However, existing approaches rely heavily on fully supervised learning with expensive cross-modality annotations, limiting scalability. To address this issue, we investigate Unsupervised Learning for VVI-ReID (USL-VVI-ReID), which learns identity-discriminative representations directly from unlabeled video tracklets. Directly extending image-based USL-VI-ReID methods to this setting with generic pretrained encoders leads to suboptimal performance. Such encoders suffer from weak identity discrimination and strong modality bias, resulting in severe intra-modality identity confusion and pronounced clustering granularity imbalance between visible and infrared modalities. These issues jointly degrade pseudo-label reliability and hinder effective cross-modality alignment. To address these challenges, we propose a Causal Bootstrapped Alignment (CBA) framework that explicitly exploits inherent video priors. First, we introduce Causal Intervention Warm-up (CIW), which performs sequence-level causal interventions by leveraging temporal identity consistency and cross-modality identity consistency to suppress modality- and motion-induced spurious correlations while preserving identity-relevant semantics, yielding cleaner representations for unsupervised clustering. Second, we propose Prototype-Guided Uncertainty Refinement (PGUR), which employs a coarse-to-fine alignment strategy to resolve cross-modality granularity mismatch, reorganizing under-clustered infrared representations under the guidance of reliable visible prototypes with uncertainty-aware supervision. Extensive experiments on the HITSZ-VCM and BUPTCampus benchmarks demonstrate that CBA significantly outperforms existing USL-VI-ReID methods when extended to the USL-VVI-ReID setting.

2604.15628 2026-04-20 cs.CV cs.CL cs.IR cs.LG cs.MM

SIMMER: Cross-Modal Food Image-Recipe Retrieval via MLLM-Based Embedding

Keisuke Gomi, Keiji Yanai

Comments 20 pages, 6 figures

2604.15622 2026-04-20 cs.CV cs.LG

AdaVFM: Adaptive Vision Foundation Models for Edge Intelligence via LLM-Guided Execution

Yiwei Zhao, Yi Zheng, Huapeng Su, Jieyu Lin, Stefano Ambrogio, Cijo Jose, Michaël Ramamonjisoa, Patrick Labatut, Barbara De Salvo, Chiao Liu, Phillip B. Gibbons, Ziyun Li

2604.15619 2026-04-20 cs.RO

Factor Graph-Based Shape Estimation for Continuum Robots via Magnus Expansion

Lorenzo Ticozzi, Patricio A. Vela, Panagiotis Tsiotras

2604.15618 2026-04-20 cs.LG

Majority Voting for Code Generation

Tim Launer, Jonas Hübotter, Marco Bagatella, Ido Hakimi, Andreas Krause

Comments ICLR 2026 Test-Time Updates (TTU) Workshop

2604.15614 2026-04-20 cs.LG

Flexible Empowerment at Reasoning with Extended Best-of-N Sampling

Taisuke Kobayashi

Comments 15 pages, 4 figures

2604.15612 2026-04-20 cs.RO cs.CV

GaussianFlow SLAM: Monocular Gaussian Splatting SLAM Guided by GaussianFlow

Dong-Uk Seo, Jinwoo Jeon, Eungchang Mason Lee, Hyun Myung

Comments 8 pages, 5 figures, 7 tables, accepted to IEEE RA-L

2604.15611 2026-04-20 cs.CV cs.AI

CLIMB: Controllable Longitudinal Brain Image Generation using Mamba-based Latent Diffusion Model and Gaussian-aligned Autoencoder

Duy-Phuong Dao, Muhammad Taqiyuddin, Jahae Kim, Sang-Heon Lee, Hye-Won Jung, Jaehoo Choi, Hyung-Jeong Yang

Comments 18 pages, 5 figures, 5 tables

2604.15609 2026-04-20 cs.LG cs.CV

Adapting in the Dark: Efficient and Stable Test-Time Adaptation for Black-Box Models

Yunbei Zhang, Shuaicheng Niu, Chengyi Cai, Feng Liu, Jihun Hamm

Comments Third Workshop on Test-Time Updates (Oral)

2604.15607 2026-04-20 cs.CL cs.AI cs.CY cs.HC

Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies

Myke C. Cohen, Mingqian Zheng, Neel Bhandari, Hsien-Te Kao, Xuhui Zhou, Daniel Nguyen, Laura Cassani, Maarten Sap, Svitlana Volkova

Comments Will be presented at ACL 2026 and published in the Findings of the Association for Computational Linguistics: ACL 2026

2604.15602 2026-04-20 cs.CL

GroupDPO: Memory efficient Group-wise Direct Preference Optimization

Jixuan Leng, Si Si, Hsiang-Fu Yu, Vinod Raman, Inderjit S. Dhillon

2604.15597 2026-04-20 cs.CL cs.HC

LLMs Corrupt Your Documents When You Delegate

Philippe Laban, Tobias Schnabel, Jennifer Neville

2604.15593 2026-04-20 cs.CL cs.AI

DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation

Chao Li

2604.15589 2026-04-20 cs.CL cs.AI cs.LG

LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance

Jack Wei Lun Shi, Minghao Dang, Wawan Solihin, Justin K. W. Yeoh

Comments 8 pages, 9 figures. Accepted at ICCCBE 2026 (International Conference on Computing in Civil and Building Engineering)

2604.15588 2026-04-20 cs.CL cs.AI cs.HC cs.LG

"Excuse me, may I say something..." CoLabScience, A Proactive AI Assistant for Biomedical Discovery and LLM-Expert Collaborations

Yang Wu, Jinhong Yu, Jingwei Xiong, Zhimin Tao, Xiaozhong Liu

Comments ACL 2026 Main Conference

2604.15585 2026-04-20 cs.LG cs.AI

PAWN: Piece Value Analysis with Neural Networks

Ethan Tang, Hasan Davulcu, Jia Zou, Zhongju Zhang

Comments 19 pages, 5 figures, 12 tables

2604.15577 2026-04-20 cs.LG cs.AI

Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models

Alexander Peysakhovich, William Berman