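arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.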

2602.17664 2026-02-20 cs.CL cs.AI cs.LG

Sink-Aware Pruning for Diffusion Language Models

Aidar Myrzakhan, Tianyi Li, Bowei Guo, Shengkun Tang, Zhiqiang Shen

Comments Code at: https://github.com/VILA-Lab/Sink-Aware-Pruning

Abstract

Diffusion Language Models (DLMs) incur high inference cost due to iterative denoising, motivating efficient pruning. Existing pruning heuristics, largely inherited from autoregressive (AR) LLMs, typically preserve attention sink tokens because AR sinks serve as stable global anchors. We show that this assumption does not hold for DLMs: the attention-sink position exhibits substantially higher variance over the full generation trajectory (measured by how the dominant sink locations shift across timesteps), indicating that sinks are often transient and less structurally essential than in AR models. Based on this observation, we propose Sink-Aware Pruning, which automatically identifies and prunes unstable sinks in DLMs (whereas prior studies on AR LLMs usually keep sinks). Without retraining, our method achieves a better quality-efficiency trade-off and outperforms strong prior pruning baselines under matched compute. Our code is available at https://github.com/VILA-Lab/Sink-Aware-Pruning.

2602.17663 2026-02-20 cs.AI cs.CL cs.IR

CLEF HIPE-2026: Evaluating Accurate and Efficient Person-Place Relation Extraction from Multilingual Historical Texts

Juri Opitz, Corina Raclé, Emanuela Boros, Andrianos Michail, Matteo Romanello, Maud Ehrmann, Simon Clematide

Comments ECIR 2026. CLEF Evaluation Lab. Registration DL: 2026/04/23. Task Homepage at https://hipe-eval.github.io/HIPE-2026/

2602.17659 2026-02-20 cs.CV cs.RO

When Vision Overrides Language: Evaluating and Mitigating Counterfactual Failures in VLAs

Yu Fang, Yuchun Feng, Dong Jing, Jiaqi Liu, Yue Yang, Zhenyu Wei, Daniel Szafir, Mingyu Ding

Comments Website: https://vla-va.github.io/

2602.17655 2026-02-20 cs.CL

What Language is This? Ask Your Tokenizer

Clara Meister, Ahmetcan Yavuz, Pietro Lesci, Tiago Pimentel

2602.17654 2026-02-20 cs.IR cs.LG

Mine and Refine: Optimizing Graded Relevance in E-commerce Search Retrieval

Jiaqi Xi, Raghav Saboo, Luming Chen, Martin Wang, Sudeep Das

2602.17651 2026-02-20 cs.CR

Non-Trivial Zero-Knowledge Implies One-Way Functions

Suvradip Chakraborty, James Hulett, Dakshita Khurana, Kabir Tomer

2602.17647 2026-02-20 quant-ph cs.CC

Pseudo-deterministic Quantum Algorithms

Hugo Aaronson, Tom Gur, Jiawei Li

2602.17645 2026-02-20 cs.LG cs.AI cs.CL cs.CV

Pushing the Frontier of Black-Box LVLM Attacks via Fine-Grained Detail Targeting

Xiaohan Zhao, Zhaoyi Li, Yaxin Luo, Jiacheng Cui, Zhiqiang Shen

Comments Code at: https://github.com/vila-lab/M-Attack-V2

2602.17642 2026-02-20 cs.LG

A.R.I.S.: Automated Recycling Identification System for E-Waste Classification Using Deep Learning

Dhruv Talwar, Harsh Desai, Wendong Yin, Goutam Mohanty, Rafael Reveles

2602.17641 2026-02-20 cs.LG cs.AI

FAMOSE: A ReAct Approach to Automated Feature Discovery

Keith Burghardt, Jienan Liu, Sadman Sakib, Yuning Hao, Bo Li

Comments 23 pages, 6 figures

2602.17639 2026-02-20 cs.CV

IntRec: Intent-based Retrieval with Contrastive Refinement

Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Yue Lu

2602.17637 2026-02-20 math.CO cs.DM

On Sets of Monochromatic Objects in Bicolored Point Sets

Sujoy Bhore, Konrad Swanepoel

Comments 19 pages, 7 figures

2602.17636 2026-02-20 cs.CV

CORAL: Correspondence Alignment for Improved Virtual Try-On

Jiyoung Kim, Youngjin Shin, Siyoon Jin, Dahyun Chung, Jisu Nam, Tongmin Kim, Jongjae Park, Hyeonwoo Kang, Seungryong Kim

Comments 32 pages, 25 figures

2602.17634 2026-02-20 cs.LG cs.AI

Reverso: Efficient Time Series Foundation Models for Zero-shot Forecasting

Xinghong Fu, Yanhong Li, Georgios Papaioannou, Yoon Kim

2602.17633 2026-02-20 cs.LG cs.AI stat.ML

When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

Shayan Kiyani, Sima Noorani, George Pappas, Hamed Hassani

2602.17625 2026-02-20 cs.LG cs.DC

Catastrophic Forgetting Resilient One-Shot Incremental Federated Learning

Obaidullah Zaland, Zulfiqar Ahmad Khan, Monowar Bhuyan

Comments Accepted for publication in the IEEE International Conference on Big Data (IEEE BigData) 2025

2602.17623 2026-02-20 cs.CL

Unmasking the Factual-Conceptual Gap in Persian Language Models

Alireza Sakhaeirad, Ali Ma'manpoosh, Arshia Hemmat

2602.17622 2026-02-20 cs.CR cs.SE

What Makes a Good LLM Agent for Real-world Penetration Testing?

Gelei Deng, Yi Liu, Yuekang Li, Ruozhao Yang, Xiaofei Xie, Jie Zhang, Han Qiu, Tianwei Zhang

2602.17621 2026-02-20 eess.SY cs.SY

Method to Compute Pointing Displacement, Smear, and Jitter Covariances for Optical Payloads

Peter Seiler, Mark E. Pittelkau, Felix Biertümpfel

Comments Final accepted manuscript (AAM) for AIAA Journal of Guidance, Control and Dynamics

2602.17619 2026-02-20 cs.NI

EDRP: Enhanced Dynamic Relay Point Protocol for Data Dissemination in Multi-hop Wireless IoT Networks

Jothi Prasanna Shanmuga Sundaram, Magzhan Gabidolla, Luis Fujarte, Shawn Duong, Jianlin Guo, Toshiaki Koike-Akino, Pu Wang, Kieran Parsons, Philip V. Orlik, Takenori Sumi, Yukimasa Nagai, Miguel A. Carreira-Perpinan, Alberto E. Cerpa

2602.17614 2026-02-20 cs.LG cs.DC

Guarding the Middle: Protecting Intermediate Representations in Federated Split Learning

Obaidullah Zaland, Sajib Mistry, Monowar Bhuyan

Comments Accepted for publication in the IEEE International Conference on Big Data (IEEE BigData) 2025

2602.17610 2026-02-20 cs.DC cs.DB

Exploring Novel Data Storage Approaches for Large-Scale Numerical Weather Prediction

Nicolau Manubens Gil

Comments PhD thesis successfully defended at The University of Edinburgh on 16 October 2025

2602.17609 2026-02-20 eess.SP cs.ET

Device-Centric ISAC for Exposure Control via Opportunistic Virtual Aperture Sensing

Marouan Mizmizi, Zhibin Yu, Guanglong Du, Umberto Spagnolini

2602.17608 2026-02-20 cs.LG cs.AI stat.ML

Towards Anytime-Valid Statistical Watermarking

Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan

2602.17607 2026-02-20 cs.AI cs.LG cs.NA math.NA

AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing

Jianda Du, Youran Sun, Haizhao Yang

2602.17602 2026-02-20 cs.AI

MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models

Hojung Jung, Rodrigo Hormazabal, Jaehyeong Jo, Youngrok Park, Kyunggeun Roh, Se-Young Yun, Sehui Han, Dae-Woong Jeong

2602.17599 2026-02-20 cs.CV cs.MM cs.SD

Art2Mus: Artwork-to-Music Generation via Visual Conditioning and Large-Scale Cross-Modal Alignment

Ivan Rinaldi, Matteo Mendula, Nicola Fanelli, Florence Levé, Matteo Testi, Giovanna Castellano, Gennaro Vessio

Abstract

Music generation has advanced markedly through multimodal deep learning, enabling models to synthesize audio from text and, more recently, from images. However, existing image-conditioned systems suffer from two fundamental limitations: (i) they are typically trained on natural photographs, limiting their ability to capture the richer semantic, stylistic, and cultural content of artworks; and (ii) most rely on an image-to-text conversion stage, using language as a semantic shortcut that simplifies conditioning but prevents direct visual-to-audio learning. Motivated by these gaps, we introduce ArtSound, a large-scale multimodal dataset of 105,884 artwork-music pairs enriched with dual-modality captions, obtained by extending ArtGraph and the Free Music Archive. We further propose ArtToMus, the first framework explicitly designed for direct artwork-to-music generation, which maps digitized artworks to music without image-to-text translation or language-based semantic supervision. The framework projects visual embeddings into the conditioning space of a latent diffusion model, enabling music synthesis guided solely by visual information. Experimental results show that ArtToMus generates musically coherent and stylistically consistent outputs that reflect salient visual cues of the source artworks. While absolute alignment scores remain lower than those of text-conditioned systems (as expected given the substantially increased difficulty of removing linguistic supervision), ArtToMus achieves competitive perceptual quality and meaningful cross-modal correspondence. This work establishes direct visual-to-music generation as a distinct and challenging research direction, and provides resources that support applications in multimedia art, cultural heritage, and AI-assisted creative practice. Code and dataset will be publicly released upon acceptance.

2602.17596 2026-02-20 cs.LG

Asymptotic Smoothing of the Lipschitz Loss Landscape in Overparameterized One-Hidden-Layer ReLU Networks

Saveliy Baturin

2602.17594 2026-02-20 cs.AI

AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

Lance Ying, Ryan Truong, Prafull Sharma, Kaiya Ivy Zhao, Nathan Cloos, Kelsey R. Allen, Thomas L. Griffiths, Katherine M. Collins, José Hernández-Orallo, Phillip Isola, Samuel J. Gershman, Joshua B. Tenenbaum

Comments 29 pages, 14 figures

2602.17590 2026-02-20 cs.CR cs.MA

BMC4TimeSec: Verification Of Timed Security Protocols

Agnieszka M. Zbrzezny

Comments To appear in the Proceedings of the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026), May 25 - 29, 2026, Paphos, Cyprus