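arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.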

2602.18309 2026-02-23 cs.CV

Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation

Ziyue Liu, Davide Talon, Federico Girella, Zanxi Ruan, Mattia Mondo, Loris Bazzani, Yiming Wang, Marco Cristani

Comments Project page: https://intelligolabs.github.io/lots/

Abstract

Sketches offer designers a concise yet expressive medium for early-stage fashion ideation by specifying structure, silhouette, and spatial relationships, while textual descriptions complement sketches to convey material, color, and stylistic details. Effectively combining textual and visual modalities requires adherence to the sketch visual structure when leveraging the guidance of localized attributes from text. We present LOcalized Text and Sketch with multi-level guidance (LOTS), a framework that enhances fashion image generation by combining global sketch guidance with multiple localized sketch-text pairs. LOTS employs a Multi-level Conditioning Stage to independently encode local features within a shared latent space while maintaining global structural coordination. Then, the Diffusion Pair Guidance stage integrates both local and global conditioning via attention-based guidance within the diffusion model's multi-step denoising process. To validate our method, we develop Sketchy, the first fashion dataset where multiple text-sketch pairs are provided per image. Sketchy provides high-quality, clean sketches with a professional look and consistent structure. To assess robustness beyond this setting, we also include an "in the wild" split with non-expert sketches, featuring higher variability and imperfections. Experiments demonstrate that our method strengthens global structural adherence while leveraging richer localized semantic guidance, achieving improvement over state-of-the-art. The dataset, platform, and code are publicly available.

2602.18301 2026-02-23 cs.LG cs.CL

On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction

Ivan Bondarenko, Egor Palkin, Fedor Tikunov

2602.18297 2026-02-23 cs.LG cs.AI cs.CL cs.IT math.IT

Analyzing and Improving Chain-of-Thought Monitorability Through Information Theory

Usman Anwar, Tim Bakker, Dana Kianfar, Cristina Pinneri, Christos Louizos

Comments First two authors contributed equally

2602.18282 2026-02-23 cs.CV

DEIG: Detail-Enhanced Instance Generation with Fine-Grained Semantic Control

Shiyan Du, Conghan Yue, Xinyu Cheng, Dongyu Zhang

Comments Accepted by AAAI 2026

2602.18277 2026-02-23 cs.LG cs.AI stat.ML

PRISM: Parallel Reward Integration with Symmetry for MORL

Finn van der Knaap, Kejiang Qian, Zheng Xu, Fengxiang He

2602.18266 2026-02-23 cs.LG

A Probabilistic Framework for LLM-Based Model Discovery

Stefan Wahl, Raphaela Schenk, Ali Farnoud, Jakob H. Macke, Daniel Gedon

2602.18262 2026-02-23 cs.CL cs.AI cs.LG

Simplifying Outcomes of Language Model Component Analyses with ELIA

Aaron Louis Eidt, Nils Feldhus

Comments EACL 2026 System Demonstrations. GitHub: https://github.com/aaron0eidt/ELIA

2602.18260 2026-02-23 cs.RO

Role-Adaptive Collaborative Formation Planning for Team of Quadruped Robots in Cluttered Environments

Magnus Norén, Marios-Nektarios Stamatopoulos, Avijit Banerjee, George Nikolakopoulos

2602.18258 2026-02-23 cs.RO cs.CV

RoEL: Robust Event-based 3D Line Reconstruction

Gwangtak Bae, Jaeho Shin, Seunggu Kang, Junho Kim, Ayoung Kim, Young Min Kim

Comments IEEE Transactions on Robotics (T-RO)

2602.18253 2026-02-23 cs.LG

MEG-to-MEG Transfer Learning and Cross-Task Speech/Silence Detection with Limited Data

Xabier de Zuazo, Vincenzo Verbeni, Eva Navas, Ibon Saratxaga, Mathieu Bourguignon, Nicola Molinaro

Comments 6 pages, 3 figures, 3 tables, submitted to Interspeech 2026

2602.18252 2026-02-23 cs.CV cs.AI

On the Adversarial Robustness of Discrete Image Tokenizers

Rishika Bhagwatkar, Irina Rish, Nicolas Flammarion, Francesco Croce

2602.18250 2026-02-23 cs.LG

Variational Distributional Neuron

Yves Ruffenach

Comments 29 pages, 7 figures. Code available at GitHub (link in paper)

Abstract

We propose a proof of concept for a variational distributional neuron: a compute unit formulated as a VAE brick, explicitly carrying a prior, an amortized posterior and a local ELBO. The unit is no longer a deterministic scalar but a distribution: computing is no longer about propagating values, but about contracting a continuous space of possibilities under constraints. Each neuron parameterizes a posterior, propagates a reparameterized sample and is regularized by the KL term of a local ELBO - hence, the activation is distributional. This "contraction" becomes testable through local constraints and can be monitored via internal measures. The amount of contextual information carried by the unit, as well as the temporal persistence of this information, are locally tuned by distinct constraints. This proposal addresses a structural tension: in sequential generation, causality is predominantly organized in the symbolic space and, even when latents exist, they often remain auxiliary, while the effective dynamics are carried by a largely deterministic decoder. In parallel, probabilistic latent models capture factors of variation and uncertainty, but that uncertainty typically remains borne by global or parametric mechanisms, while units continue to propagate scalars - hence the pivot question: if uncertainty is intrinsic to computation, why does the compute unit not carry it explicitly? We therefore draw two axes: (i) the composition of probabilistic constraints, which must be made stable, interpretable and controllable; and (ii) granularity: if inference is a negotiation of distributions under constraints, should the primitive unit remain deterministic or become distributional? We analyze "collapse" modes and the conditions for a "living neuron", then extend the contribution over time via autoregressive priors over the latent, per unit.

2602.18248 2026-02-23 cs.LG cs.NA math.NA

Neural-HSS: Hierarchical Semi-Separable Neural PDE Solver

Pietro Sittoni, Emanuele Zangrando, Angelo A. Casulli, Nicola Guglielmi, Francesco Tudisco

2602.18232 2026-02-23 cs.CL cs.AI

Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning

Lexiang Tang, Weihao Gao, Bingchen Zhao, Lu Ma, Qiao Jin, Bang Yang, Yuexian Zou

2602.18224 2026-02-23 cs.RO cs.LG

SimVLA: A Simple VLA Baseline for Robotic Manipulation

Yuankai Luo, Woping Chen, Tong Liang, Baiqiao Wang, Zhenguo Li

2602.18216 2026-02-23 cs.LG

Generative Model via Quantile Assignment

Georgi Hrusanov, Oliver Y. Chén, Julien S. Bodelet

2602.18212 2026-02-23 cs.RO

Design and Characterization of a Dual-DOF Soft Shoulder Exosuit with Volume-Optimized Pneumatic Actuator

Rui Chen, Domenico Chiaradia, Daniele Leonardis, Antonio Frisoli

2602.18201 2026-02-23 cs.AI cs.LG

SOMtime the World Ain't Fair: Violating Fairness Using Self-Organizing Maps

Joseph Bingham, Netanel Arussy, Dvir Aran

Comments 10 pages, 2 figures, preprint

2602.18199 2026-02-23 cs.CV

A Self-Supervised Approach on Motion Calibration for Enhancing Physical Plausibility in Text-to-Motion

Gahyeon Shim, Soogeun Park, Hyemin Ahn

2602.16653 2026-02-23 cs.AI

Agent Skill Framework: Perspectives on the Potential of Small Language Models in Industrial Environments

Yangjie Xu, Lujun Li, Lama Sleem, Niccolo Gentile, Yewei Song, Yiqun Wang, Siming Ji, Wenbo Wu, Radu State

2602.16086 2026-02-23 cs.CV cs.LG

LGQ: Learning Discretization Geometry for Scalable and Stable Image Tokenization

Idil Bilge Altun, Mert Onur Cakiroglu, Elham Buxton, Mehmet Dalkilic, Hasan Kurban

2602.14514 2026-02-23 cs.CV

Efficient Text-Guided Convolutional Adapter for the Diffusion Model

Aryan Das, Koushik Biswas, Swalpa Kumar Roy, Badri Narayana Patro, Vinay Kumar Verma

Comments Accepted in WACV 2026

2602.14498 2026-02-23 cs.CV cs.LG

Uncertainty-Aware Vision-Language Segmentation for Medical Imaging

Aryan Das, Tanishq Rachamalla, Koushik Biswas, Swalpa Kumar Roy, Vinay Kumar Verma

Comments Accepted in WACV 2026

2602.03175 2026-02-23 cs.LG

Probe-then-Commit Multi-Objective Bandits: Theoretical Benefits of Limited Multi-Arm Feedback

Ming Shi

2601.17991 2026-02-23 cs.RO

Prosthetic Hand Manipulation System Based on EMG and Eye Tracking Powered by the Neuromorphic Processor AltAi

Roman Akinshin, Elizaveta Lopatina, Kirill Bogatikov, Nikolai Kiz, Anna V. Makarova, Mikhail Lebedev, Miguel Altamirano Cabrera, Dzmitry Tsetserukou, Valerii Kangler

Comments This paper has been accepted for publication at LBR of HRI 2026 conference

2601.11924 2026-02-23 cs.LG

Communication-Corruption Coupling and Verification in Cooperative Multi-Objective Bandits

Ming Shi

2512.01865 2026-02-23 cs.CL cs.AI

Cross-Lingual Interleaving for Speech Language Models

Adel Moumen, Guangzhi Sun, Philip C. Woodland

2511.17081 2026-02-23 cs.CL

MUCH: A Multilingual Claim Hallucination Benchmark

Jérémie Dentan, Alexi Canesse, Davide Buscaldi, Aymen Shabou, Sonia Vanier

Comments To appear in Proceedings of LREC 2026

2511.10855 2026-02-23 cs.LG

ExPairT-LLM: Exact Learning for LLM Code Selection by Pairwise Queries

Tom Yuviler, Dana Drachsler-Cohen

2511.10164 2026-02-23 cs.AI cs.SC

Two Constraint Compilation Methods for Lifted Planning

Periklis Mantenoglou, Luigi Bonassi, Enrico Scala, Pedro Zuidberg Dos Martires