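arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.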

2603.18488 2026-03-20 cs.CV

TexEditor: Structure-Preserving Text-Driven Texture Editing

Bo Zhao, Yihang Liu, Chenfeng Zhang, Huan Yang, Kun Gai, Wei Ji

Comments 19 pages

Details

English abstract

Text-guided texture editing aims to modify object appearance while preserving the underlying geometric structure. However, our empirical analysis reveals that even SOTA editing models frequently struggle to maintain structural consistency during texture editing, despite the intended changes being purely appearance-related. Motivated by this observation, we jointly enhance structure preservation from both data and training perspectives, and build TexEditor, a dedicated texture editing model based on Qwen-Image-Edit-2509. Firstly, we construct TexBlender, a high-quality SFT dataset generated with Blender, which provides strong structural priors for a cold start. Secondly, we introduce StructureNFT, an RL-based approach that integrates structure-preserving losses to transfer the structural priors learned during SFT to real-world scenes. Moreover, due to the limited realism and evaluation coverage of existing benchmarks, we introduce TexBench, a general-purpose real-world benchmark for text-guided texture editing. Extensive experiments on existing Blender-based texture benchmarks and our TexBench show that TexEditor consistently outperforms strong baselines such as Nano Banana Pro. In addition, we assess TexEditor on the general-purpose benchmark ImgEdit to validate its generalization. Our code and data are available at https://github.com/KlingAIResearch/TexEditor.

2603.18482 2026-03-20 cs.CL cs.LG stat.ML

The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices

Esteban Garces Arias, Nurzhan Sapargali, Christian Heumann, Matthias Aßenmacher

Comments Under review

2603.18481 2026-03-20 cs.CV cs.LG

T-QPM: Enabling Temporal Out-Of-Distribution Detection and Domain Generalization for Vision-Language Models in Open-World

Aditi Naiknaware, Salimeh Sekeh

2603.18480 2026-03-20 cs.CV cs.AI cs.HC

Do Vision Language Models Understand Human Engagement in Games?

Ziyi Wang, Qizan Guo, Rishitosh Singh, Xiyang Hu

2603.18469 2026-03-20 cs.CL

GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms

Masayuki Kawarada, Kodai Watanabe, Soichiro Murakami

Comments We are working towards releasing the code in April 2026

2603.18466 2026-03-20 cs.CV

Recolour What Matters: Region-Aware Colour Editing via Token-Level Diffusion

Yuqi Yang, Dongliang Chang, Yijia Ling, Ruoyi Du, Zhanyu Ma

Comments 18 pages, 12 figures

2603.18465 2026-03-20 cs.CV

MedQ-UNI: Toward Unified Medical Image Quality Assessment and Restoration via Vision-Language Modeling

Jiyao Liu, Junzhi Ning, Wanying Qu, Lihao Liu, Chenglong Ma, Junjun He, Ningsheng Xu

2603.18462 2026-03-20 cs.AI

AlignMamba-2: Enhancing Multimodal Fusion and Sentiment Analysis with Modality-Aware Mamba

Yan Li, Yifei Xing, Xiangyuan Lan, Xin Li, Haifeng Chen, Dongmei Jiang

Comments Accepted by Pattern Recognition

2603.18461 2026-03-20 cs.CV

Cell-Type Prototype-Informed Neural Network for Gene Expression Estimation from Pathology Images

Kazuya Nishimura, Ryoma Bise, Shinnosuke Matsuo, Haruka Hirose, Yasuhiro Kojima

Comments Accepted by CVPR 2026

2603.18460 2026-03-20 cs.CV cs.AI

Interpretable Prostate Cancer Detection using a Small Cohort of MRI Images

Vahid Monfared, Mohammad Hadi Gharib, Ali Sabri, Maryam Shahali, Farid Rashidi, Amit Mehta, Reza Rawassizadeh

Comments 26 pages, 5 figures, 7 tables

2603.18453 2026-03-20 cs.CV

Learning Consistent Temporal Grounding between Related Tasks in Sports Coaching

Arushi Rai, Adriana Kovashka

2603.18448 2026-03-20 cs.LG

Seeking Universal Shot Language Understanding Solutions

Haoxin Liu, Harshavardhan Kamarthi, Zhiyuan Zhao, Hongjie Chen, B. Aditya Prakash

2603.18446 2026-03-20 cs.CL cs.LG

UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference

Lang Zhou, Shuxuan Li, Zhuohao Li, Shi Liu, Zhilin Zhao, Wei-Shi Zheng

2603.18443 2026-03-20 cs.CV

SR-Nav: Spatial Relationships Matter for Zero-shot Object Goal Navigation

Leyuan Fang, Zan Mao, Zijing Wang, Yinlong Yan

2603.18436 2026-03-20 cs.AI

AS2 -- Attention-Based Soft Answer Sets: An End-to-End Differentiable Neuro-Soft-Symbolic Reasoning Architecture

Wael AbdAlmageed

2603.18432 2026-03-20 cs.LG

MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting

Runze Yang, Longbing Cao, Xiaoming Wu, Xin You, Kun Fang, Jianxun Li, Jie Yang

2603.18431 2026-03-20 cs.LG

Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits

Zhuoyue Chen, Kechao Cai

2603.18429 2026-03-20 cs.CV

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Yibo Shi, Jungang Li, Linghao Zhang, Zihao Dongfang, Biao Wu, Sicheng Tao, Yibo Yan, Chenxi Qin, Weiting Liu, Zhixin Lin, Hanqian Li, Yu Huang, Song Dai, Yonghua Hei, Yue Ding, Xiang Li, Shikang Wang, Chengdong Xu, Jingqi Liu, Xueying Ma, Zhiwen Zheng, Xiaofei Zhang, Bincheng Wang, Nichen Yang, Jie Wu, Lihua Tian, Chen Li, Xuming Hu

2603.18428 2026-03-20 cs.CL cs.AI

Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation

Asmita Bhardwaj, Yuya Jeremy Ong, Eelaaf Zahid, Basel Shbita

2603.18427 2026-03-20 cs.CV cs.AI

R&D: Balancing Reliability and Diversity in Synthetic Data Augmentation for Semantic Segmentation

Huy Che, Dinh-Duy Phan, Duc-Khai Lam

Details

DOI: 10.1007/978-3-032-09321-9_30
Journal ref: Computational Collective Intelligence, ICCCI 2025, Lecture Notes in Computer Science 16139 (2026) 433-448

English abstract

Collecting and annotating datasets for pixel-level semantic segmentation tasks is highly labor-intensive. Data augmentation provides a viable solution by enhancing model generalization without additional real-world data collection. Traditional augmentation techniques, such as translation, scaling, and color transformations, create geometric variations but fail to generate new structures. While generative models have been employed to extend the semantic information of datasets, they often struggle to maintain consistency between the original and generated images, particularly for pixel-level tasks. In this work, we propose a novel synthetic data augmentation pipeline that integrates controllable diffusion models. Our approach balances data diversity and reliability, effectively bridging the gap between synthetic and real data. We utilize class-aware prompting and visual prior blending to further improve image quality, ensuring precise alignment with segmentation labels. By evaluating on benchmark datasets such as PASCAL VOC and BDD100K, we demonstrate that our method significantly enhances semantic segmentation performance, especially in data-scarce scenarios, while improving model robustness in real-world applications. Our code is available at https://github.com/chequanghuy/Enhanced-Generative-Data-Augmentation-for-Semantic-Segmentation-via-Stronger-Guidance.

2603.18426 2026-03-20 cs.AI

Prune-then-Quantize or Quantize-then-Prune? Understanding the Impact of Compression Order in Joint Model Compression

Minjun Kim, Jaehyeon Choi, Hyunwoo Yang, Jongjin Kim, Jinho Song, U Kang

Comments ICLR 2026

2603.18425 2026-03-20 cs.CL

Multimodal Task Interference: A Benchmark and Analysis of History-Target Mismatch in Multimodal LLMs

Masayuki Kawarada, Tatsuya Ishigaki, Hiroya Takamura

2603.18423 2026-03-20 cs.CV

SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

Minjun Kim, Jongjin Kim, U Kang

Comments ICLR 2025

2603.18420 2026-03-20 cs.AI cs.CL cs.IR cs.LG

From Topic to Transition Structure: Unsupervised Concept Discovery at Corpus Scale via Predictive Associative Memory

Jason Dury

Comments 22 pages, 5 figures. Code and demo: https://github.com/EridosAI/PAM-Concept-Discovery

2603.18418 2026-03-20 cs.CV cs.AI

Mind the Rarities: Can Rare Skin Diseases Be Reliably Diagnosed via Diagnostic Reasoning?

Yang Liu, Jiyao Yang, Hongjin Zhao, Xiaoyong Li, Yanzhe Ji, Xingjian Li, Runmin Jiang, Tianyang Wang, Saeed Anwar, Dongwoo Kim, Yue Yao, Zhenyue Qin, Min Xu

2603.18417 2026-03-20 cs.LG cs.AI

Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration

Arundhathi Dev, Justin Zhan

Comments Accepted to the International Conference on Machine Intelligence Theory and Applications (MiTA 2026)

2603.18409 2026-03-20 cs.CL

TopoChunker: Topology-Aware Agentic Document Chunking Framework

Xiaoyu Liu

2603.18408 2026-03-20 cs.RO

Efficient and Versatile Quadrupedal Skating: Optimal Co-design via Reinforcement Learning and Bayesian Optimization

Hanwen Wang, Zhenlong Fang, Josiah Hanna, Xiaobin Xiong

2603.18402 2026-03-20 cs.CV

Inst4DGS: Instance-Decomposed 4D Gaussian Splatting with Multi-Video Label Permutation Learning

Yonghan Lee, Dinesh Manocha

2603.18401 2026-03-20 cs.CV

Pixel-Accurate Epipolar Guided Matching

Oleksii Nasypanyi, Francois Rameau