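arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.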

2603.01984 2026-03-03 cs.SD cs.SC

ViTex: Visual Texture Control for Multi-Track Symbolic Music Generation via Discrete Diffusion Models

Xiaoyu Yi, Qi He, Gus Xia, Ziyu Wang

Abstract

In automatic music generation, a central challenge is to design controls that enable meaningful human-machine interaction. Existing systems often rely on extrinsic inputs such as text prompts or metadata, which do not allow humans to directly shape the composition. While prior work has explored intrinsic controls such as chords or hierarchical structure, these approaches mainly address piano or vocal-accompaniment settings, leaving multi-track symbolic music largely underexplored. We identify instrumentation, the choice of instruments and their roles, as a natural dimension of control in multi-track composition, and propose ViTex, a visual representation of instrumental texture. In ViTex, color encodes instrument choice, spatial position represents pitch and time, and stroke properties capture local textures. Building on this representation, we develop a discrete diffusion model conditioned on ViTex and chord progressions to generate 8-measure multi-track symbolic music, enabling explicit texture-level control while maintaining strong unconditional generation quality. The demo page and code are available at https://vitex2025.github.io/.

2603.01982 2026-03-03 cs.RO

From Transportation to Manipulation: Transforming Magnetic Levitation to Magnetic Robotics

Lara Bergmann, Noah Greis, Klaus Neumann

2603.01976 2026-03-03 cs.CV

Robust White Blood Cell Classification with Stain-Normalized Decoupled Learning and Ensembling

Luu Le, Hoang-Loc Cao, Ha-Hieu Pham, Thanh-Huy Nguyen, Ulas Bagci

2603.01973 2026-03-03 cs.CL cs.AI cs.SI

CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production

Yixin Nie, Lin Guan, Zhongyao Ma, Anchit Gupta, Yipin Zhou, Xiao Li, Zhengping Zhou, Raymond Zeng, Gelin Zhou, Shigan Chu, Ajay Thampi, Wancen Mu, Nathan Shuster, Ketong Wang, Lin Chen, Jason Brewer, Derek Hao Hu, Alexander McCauley, Jason Weston, Sem Park, Na Zhang, Kevin Tang

2603.01968 2026-03-03 cs.LG cs.AI

Intrinsic Task Symmetry Drives Generalization in Algorithmic Tasks

Hyeonbin Hwang, Yeachan Park

Comments Preprint

2603.01966 2026-03-03 cs.CL cs.AI

AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations

Cheng Jiayang, Dongyu Ru, Lin Qiu, Yiyang Li, Xuezhi Cao, Yangqiu Song, Xunliang Cai

Comments Accepted to ICLR 2026

2603.01965 2026-03-03 cs.LG q-bio.QM

CoVAE: correlated multimodal generative modeling

Federico Caretti, Guido Sanguinetti

2603.01959 2026-03-03 cs.LG

The Expressive Limits of Diagonal SSMs for State-Tracking

Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh, Sarath Chandar

Comments 18 pages, 5 figures, 4 tables. Accepted at ICLR 2026

2603.01952 2026-03-03 cs.AI

LiveCultureBench: a Multi-Agent, Multi-Cultural Benchmark for Large Language Models in Dynamic Social Simulations

Viet-Thanh Pham, Lizhen Qu, Thuy-Trang Vu, Gholamreza Haffari, Dinh Phung

2603.01951 2026-03-03 cs.LG math.OC stat.ML

Accelerating Single-Pass SGD for Generalized Linear Prediction

Qian Chen, Shihong Ding, Cong Fang

Comments 50 pages

2603.01950 2026-03-03 cs.LG cs.CL cs.CV

Semantic Similarity is a Spurious Measure of Comic Understanding: Lessons Learned from Hallucinations in a Benchmarking Experiment

Christopher Driggers-Ellis, Nachiketh Tibrewal, Rohit Bogulla, Harsh Khanna, Sangpil Youm, Christan Grant, Bonnie Dorr

Comments 8 pages, 2 figures, 3 tables. Includes link to code

2603.01949 2026-03-03 cs.LG cs.AI cs.CE

Probabilistic Retrofitting of Learned Simulators

Cristiana Diaconu, Miles Cranmer, Richard E. Turner, Tanya Marwah, Payel Mukhopadhyay

Comments Code provided at https://github.com/cddcam/lola_crps

2603.01948 2026-03-03 cs.CV

PreSight: Preoperative Outcome Prediction for Parkinson's Disease via Region-Prior Morphometry and Patient-Specific Weighting

Yand Wang, Chen Zhang, Lanyun Zhu, Yixin Chen, Qunbo Wang, Yutong Bai, Jurgen Germann, Yinghong Wen, Shuai Shao

2603.01947 2026-03-03 cs.CV cs.AI

physfusion: A Transformer-based Dual-Stream Radar and Vision Fusion Framework for Open Water Surface Object Detection

Yuting Wan, Liguo Sun, Jiuwu Hao, Zao Zhang, Pin LV

2603.01945 2026-03-03 cs.CL cs.AI cs.LG

When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation

Thibault Prouteau, Francis Lareau, Nicolas Dugué, Jean-Charles Lamirel, Christophe Malaterre

2603.01941 2026-03-03 cs.LG

BAED: a New Paradigm for Few-shot Graph Learning with Explanation in the Loop

Chao Chen, Xujia Li, Dongsheng Hong, Shanshan Lin, Xiangwen Liao, Chuanyi Liu, Lei Chen

Comments Accepted to Neural Networks 2026

2603.01940 2026-03-03 cs.AI

CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification

Jinpeng Chen, Cheng Gong, Hanbo Li, Ziru Liu, Zichen Tian, Xinyu Fu, Shi Wu, Chenyang Zhang, Wu Zhang, Suiyun Zhang, Dandan Tu, Rui Liu

2603.01938 2026-03-03 cs.LG cs.AI

Explanation-Guided Adversarial Training for Robust and Interpretable Models

Chao Chen, Yanhui Chen, Shanshan Lin, Dongsheng Hong, Shu Wu, Xiangwen Liao, Chuanyi Liu

Comments Accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2026)

2603.01935 2026-03-03 cs.LG cs.AI

Dream2Learn: Structured Generative Dreaming for Continual Learning

Salvatore Calcagno, Matteo Pennisi, Federica Proietto Salanitri, Amelia Sorrenti, Simone Palazzo, Concetto Spampinato, Giovanni Bellitto

2603.01913 2026-03-03 cs.CV

Zero-shot Low-Field MRI Enhancement via Diffusion-Based Adaptive Contrast Transport

Muyu Liu, Chenhe Du, Xuanyu Tian, Qing Wu, Xiao Wang, Haonan Zhang, Hongjiang Wei, Yuyao Zhang

Comments 11 pages, 4 figures, conference paper

2603.01912 2026-03-03 cs.CL cs.AI

Demonstrating ViviDoc: Generating Interactive Documents through Human-Agent Collaboration

Yinghao Tang, Yupeng Xie, Yingchaojie Feng, Tingfeng Lan, Wei Chen

2603.01910 2026-03-03 cs.CL cs.AI

FLANS at SemEval-2026 Task 7: RAG with Open-Sourced Smaller LLMs for Everyday Knowledge Across Diverse Languages and Cultures

Liliia Bogdanova, Shiran Sun, Lifeng Han, Natalia Amat Lefort, Flor Miriam Plaza-del-Arco

2603.01907 2026-03-03 cs.LG cs.CL

Efficient RLVR Training via Weighted Mutual Information Data Selection

Xinyu Zhou, Boyu Zhu, Haotian Zhang, Huiming Wang, Zhijiang Guo

Comments 15 pages

2603.01898 2026-03-03 cs.RO

SaferPath: Hierarchical Visual Navigation with Learned Guidance and Safety-Constrained Control

Lingjie Zhang, Zeyu Jiang, Changhao Chen

Comments ICRA 2026

2603.01894 2026-03-03 cs.SD

VietSuperSpeech: A Large-Scale Vietnamese Conversational Speech Dataset for ASR Fine-Tuning in Chatbot, Customer Support, and Call Center Applications

Loan Do, Thanh Ngoc Nguyen, Thanh Pham, Vinh Do, Hien Nguyen, Charlotte Nguyen

2603.01891 2026-03-03 cs.LG

SEAR: Sample Efficient Action Chunking Reinforcement Learning

C. F. Maximilian Nagy, Onur Celik, Emiliyan Gospodinov, Florian Seligmann, Weiran Liao, Aryan Kaushik, Gerhard Neumann

2603.01890 2026-03-03 cs.CV

Resolving Blind Inverse Problems under Dynamic Range Compression via Structured Forward Operator Modeling

Muyu Liu, Xuanyu Tian, Chenhe Du, Qing Wu, Hongjiang Wei, Yuyao Zhang

Comments 16 pages, 10 figures, conference paper

2603.01879 2026-03-03 cs.LG cs.AI

Diagnosing Generalization Failures from Representational Geometry Markers

Chi-Ning Chou, Artem Kirsanov, Yao-Yuan Yang, SueYeon Chung

Comments Published in the International Conference on Learning Representations (ICLR), 2026

2603.01878 2026-03-03 cs.CV

CTForensics: A Comprehensive Dataset and Method for AI-Generated CT Image Detection

Yiheng Li, Zichang Tan, Guoqing Xu, Yijun Ye, Yang Yang, Zhen Lei

Comments under review, repo: https://github.com/liyih/CTForensics

2603.01869 2026-03-03 cs.CL cs.CY

Sovereign AI-based Public Services are Viable and Affordable

António Branco, Luís Gomes, Rodrigo Santos, Eduardo Santos, João Silva, Nuno Marques, Madalena Rodrigues

Comments Accepted at LREC 2026