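arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.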

2602.22029 2026-02-26 cs.SD eess.AS

MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline

Fang-Duo Tsai, Yi-An Lai, Fei-Yueh Chen, Hsueh-Wei Fu, Li Chai, Wei-Jaw Lee, Hao-Chung Cheng, Yi-Hsuan Yang

English abstract

Song generation aims to produce full songs with vocals and accompaniment from lyrics and text descriptions, yet end-to-end models remain data- and compute-intensive and provide limited editability. We advocate a compositional alternative that decomposes the task into melody composition, singing voice synthesis, and singing accompaniment generation. Central to our approach is MIDI-informed singing accompaniment generation (MIDI-SAG), which conditions accompaniment on the symbolic vocal-melody MIDI to improve rhythmic and harmonic alignment between singing and instrumentation. Moreover, beyond conventional SAG settings that assume continuously sung vocals, compositional song generation features intermittent vocals; we address this by combining explicit rhythmic/harmonic controls with audio continuation to keep the backing track consistent across vocal and non-vocal regions. With lightweight newly trained components requiring only 2.5k hours of audio on a single RTX 3090, our pipeline approaches the perceptual quality of recent open-source end-to-end baselines in several metrics. We provide audio demos and will open-source our model at https://composerflow.github.io/web/.

2602.22026 2026-02-26 cs.CV cs.AI

RGB-Event HyperGraph Prompt for Kilometer Marker Recognition based on Pre-trained Foundation Models

Xiaoyu Xian, Shiao Wang, Xiao Wang, Daxin Tian, Yan Tian

Comments Accepted by IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS) 2026

2602.22018 2026-02-26 cs.LG

Disease Progression and Subtype Modeling for Combined Discrete and Continuous Input Data

Sterre de Jonge, Elisabeth J. Vinke, Meike W. Vernooij, Daniel C. Alexander, Alexandra L. Young, Esther E. Bron

Comments Accepted for publication, 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI), April 2026, London, United Kingdom

2602.22015 2026-02-26 cs.LG

Function-Space Empirical Bayes Regularisation with Student's t Priors

Pengcheng Hao, Ercan Engin Kuruoglu

2602.22014 2026-02-26 cs.CL

A Diversity Diet for a Healthier Model: A Case Study of French ModernBERT

Louis Estève, Christophe Servan, Thomas Lavergne, Agata Savary

2602.22010 2026-02-26 cs.RO cs.CV

World Guidance: World Modeling in Condition Space for Action Generation

Yue Su, Sijin Chen, Haixin Shi, Mingyu Liu, Zhengshen Zhang, Ningyuan Huang, Weiheng Zhong, Zhengbang Zhu, Yuxiao Liu, Xihui Liu

Comments Project Page: https://selen-suyue.github.io/WoGNet/

2602.22003 2026-02-26 cs.LG math.OC stat.ML

Neural solver for Wasserstein Geodesics and optimal transport dynamics

Hailiang Liu, Yan-Han Chen

Comments 28 pages, 22 figures

2602.22001 2026-02-26 cs.RO

Are Foundation Models the Route to Full-Stack Transfer in Robotics?

Freek Stulp, Samuel Bustamante, João Silvério, Alin Albu-Schäffer, Jeannette Bohg, Shuran Song

Comments 12 pages, 4 figures

2602.21992 2026-02-26 cs.CV

PanoEnv: Exploring 3D Spatial Intelligence in Panoramic Environments with Reinforcement Learning

Zekai Lin, Xu Zheng

2602.21983 2026-02-26 cs.RO

Humanizing Robot Gaze Shifts: A Framework for Natural Gaze Shifts in Humanoid Robots

Jingchao Wei, Jingkai Qin, Yuxiao Cao, Jingcheng Huang, Xiangrui Zeng, Min Li, Zhouping Yin

Comments submitted to AIM 2026

2602.21978 2026-02-26 cs.CL

CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models

Miyu Oba, Saku Sugawara

2602.21967 2026-02-26 cs.RO cs.CV

Dream-SLAM: Dreaming the Unseen for Active SLAM in Dynamic Environments

Xiangqi Meng, Pengxu Hou, Zhenjun Zhao, Javier Civera, Daniel Cremers, Hesheng Wang, Haoang Li

2602.21965 2026-02-26 cs.LG

Compact Circulant Layers with Spectral Priors

Joseph Margaryan, Thomas Hamelryck

2602.21963 2026-02-26 cs.CV

Global-Aware Edge Prioritization for Pose Graph Initialization

Tong Wei, Giorgos Tolias, Jiri Matas, Daniel Barath

Comments accepted to CVPR 2026

2602.21961 2026-02-26 cs.LG physics.soc-ph

Robustness in sparse artificial neural networks trained with adaptive topology

Bendegúz Sulyok, Gergely Palla, Filippo Radicchi, Santo Fortunato

2602.21959 2026-02-26 cs.LG

Estimation and Optimization of Ship Fuel Consumption in Maritime: Review, Challenges and Future Directions

Dusica Marijan, Hamza Haruna Mohammed, Bakht Zaman

Comments 23 pages, 4 figures. Published in Journal of Marine Science and Technology (2026)

2602.21956 2026-02-26 cs.CV

Global-Local Dual Perception for MLLMs in High-Resolution Text-Rich Image Translation

Junxin Lu, Tengfei Song, Zhanglin Wu, Pengfei Li, Xiaowei Liang, Hui Yang, Kun Chen, Ning Xie, Yunfei Lu, Jing Zhao, Shiliang Sun, Daimeng Wei

2602.21952 2026-02-26 cs.CV

MindDriver: Introducing Progressive Multimodal Reasoning for Autonomous Driving

Lingjun Zhang, Yujian Yuan, Changjie Wu, Xinyuan Chang, Xin Cai, Shuang Zeng, Linzhe Shi, Sijin Wang, Hang Zhang, Mu Xu

Comments CVPR2026; Yujian Yuan and Lingjun Zhang contributed equally with random order

2602.21951 2026-02-26 cs.CL

RADAR: Reasoning as Discrimination with Aligned Representations for LLM-based Knowledge Graph Reasoning

Bo Xue, Yuan Jin, Luoyi Fu, Jiaxin Ding, Xinbing Wang

2602.21948 2026-02-26 cs.LG stat.ML

Bayesian Generative Adversarial Networks via Gaussian Approximation for Tabular Data Synthesis

Bahrul Ilmi Nasution, Mark Elliot, Richard Allmendinger

Comments 28 pages, 5 Figures, Accepted in Transactions on Data Privacy

2602.21944 2026-02-26 cs.CV

Learning to Fuse and Reconstruct Multi-View Graphs for Diabetic Retinopathy Grading

Haoran Li, Yuxin Lin, Huan Wang, Xiaoling Luo, Qi Zhu, Jiahua Shi, Huaming Chen, Bo Du, Johan Barthelemy, Zongyan Xue, Jun Shen, Yong Xu

2602.21943 2026-02-26 cs.CV

Mobile-Ready Automated Triage of Diabetic Retinopathy Using Digital Fundus Images

Aadi Joshi, Manav S. Sharma, Vijay Uttam Rathod, Ashlesha Sawant, Prajakta Musale, Asmita B. Kalamkar

Comments Presented at ICCI 2025. 11 pages, 2 figures. MobileNetV3 + CORAL-based lightweight model for diabetic retinopathy severity classification with mobile deployment

2602.21942 2026-02-26 cs.CV

Directed Ordinal Diffusion Regularization for Progression-Aware Diabetic Retinopathy Grading

Huangwei Chen, Junhao Jia, Ruocheng Li, Cunyuan Yang, Wu Li, Xiaotao Pang, Yifei Chen, Haishuai Wang, Jiajun Bu, Lei Wu

Comments 3 figures

2602.21941 2026-02-26 cs.CL

MERRY: Semantically Decoupled Evaluation of Multimodal Emotional and Role Consistencies of Role-Playing Agents

Zhenyu Wang, Xiaofen Xing, Yirong Chen, Xiangmin Xu

Comments 11 pages, 6 figures

2602.21935 2026-02-26 cs.CV cs.AI

A Framework for Cross-Domain Generalization in Coronary Artery Calcium Scoring Across Gated and Non-Gated Computed Tomography

Mahmut S. Gokmen, Moneera N. Haque, Steve W. Leung, Caroline N. Leach, Seth Parker, Stephen B. Hobbs, Vincent L. Sorrell, W. Brent Seales, V. K. Cody Bumgardner

2602.21933 2026-02-26 cs.CL

Small Wins Big: Comparing Large Language Models and Domain Fine-Tuned Models for Sarcasm Detection in Code-Mixed Hinglish Text

Bitan Majumder, Anirban Sen

2602.21929 2026-02-26 cs.CV

Geometry-as-context: Modulating Explicit 3D in Scene-consistent Video Generation to Geometry Context

JiaKui Hu, Jialun Liu, Liying Yang, Xinliang Zhang, Kaiwen Li, Shuang Zeng, Yuanwei Li, Haibin Huang, Chi Zhang, Yanye Lu

Comments Accepted by CVPR 2026

2602.21928 2026-02-26 cs.LG stat.ML

Learning Unknown Interdependencies for Decentralized Root Cause Analysis in Nonlinear Dynamical Systems

Ayush Mohanty, Paritosh Ramanan, Nagi Gebraeel

Comments Manuscript under review

2602.21919 2026-02-26 cs.LG cs.CV

Learning in the Null Space: Small Singular Values for Continual Learning

Cuong Anh Pham, Praneeth Vepakomma, Samuel Horváth

Comments 17 pages, accepted as Oral presentation at the Third Conference on Parsimony and Learning (CPAL 2026)

2602.21910 2026-02-26 cs.LG cs.NA math.NA

The Error of Deep Operator Networks Is the Sum of Its Parts: Branch-Trunk and Mode Error Decompositions

Alexander Heinlein, Johannes Taraz

Comments 29 pages, 12 figures