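arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.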

2603.07544 2026-03-10 cs.SD eess.AS

Evaluating Parkinson's Disease Detection in Anonymized Speech: A Performance and Acoustic Analysis

Carlos Franzreb, Francisco Teixeira, Ben Luks, Sebastian Möller, Alberto Abad

Comments Submitted to Interspeech 2026

2603.07543 2026-03-10 cs.CV cs.MM

CONSTANT: Towards High-Quality One-Shot Handwriting Generation with Patch Contrastive Enhancement and Style-Aware Quantization

Anh-Duy Le, Van-Linh Pham, Thanh-Nam Vo, Xuan Toan Mai, Tuan-Anh Tran

Comments Accepted as oral presentation at WACV 2026

2603.07540 2026-03-10 cs.CV cs.AI

How Long Can Unified Multimodal Models Generate Images Reliably? Taming Long-Horizon Interleaved Image Generation via Context Curation

Haoyu Chen, Qing Liu, Yuqian Zhou, He Zhang, Zhaowen Wang, Mengwei Ren, Jingjing Ren, Xiang Wang, Zhe Lin, Lei Zhu

2603.07535 2026-03-10 cs.CV

Scale-Aware UAV-to-Satellite Cross-View Geo-Localization: A Semantic Geometric Approach

Yibin Ye, Shuo Chen, Kun Wang, Xiaokai Song, Jisheng Dang, Qifeng Yu, Xichao Teng, Zhang Li

Comments 14 pages

2603.07534 2026-03-10 cs.CL

Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data

Thanathai Lertpetchpun, Thanapat Trachu, Jihwan Lee, Tiantian Feng, Dani Byrd, Shrikanth Narayanan

Comments Submitted to Interspeech 2026

2603.07530 2026-03-10 cs.RO

ICLR: In-Context Imitation Learning with Visual Reasoning

Toan Nguyen, Weiduo Yuan, Songlin Wei, Hui Li, Daniel Seita, Yue Wang

Comments Project website: https://toannguyen1904.github.io/ICLR

2603.07525 2026-03-10 cs.LG

Generative prediction of laser-induced rocket ignition with dynamic latent space representations

Tony Zahtila, Ettore Saetta, Murray Cutforth, Davy Brouzet, Diego Rossinelli, Gianluca Iaccarino

2603.07524 2026-03-10 cs.LG cs.AI

Neural Dynamics-Informed Pre-trained Framework for Personalized Brain Functional Network Construction

Hongjie Jiang, Yifei Tang, Shuqiang Wang

2603.07521 2026-03-10 cs.CV cs.AI

SketchGraphNet: A Memory-Efficient Hybrid Graph Transformer for Large-Scale Sketch Corpora Recognition

Shilong Chen, Mingyuan Li, Zhaoyang Wang, Zhonglin Ye, Haixing Zhao

2603.07518 2026-03-10 cs.LG

Reinforcement learning-based dynamic cleaning scheduling framework for solar energy system

Heungjo An

Comments 16 pages, 6 figures, This is an accepted manuscript of the article published in Journal of Korean Institute of Intelligent Systems, 35(1), 84-97, 2025

Details

DOI: 10.5391/JKIIS.2025.35.1.84
Journal ref: Journal of Korean Institute of Intelligent Systems, 35(1), 84-97, 2025

Abstract

Advancing autonomous green technologies in solar photovoltaic (PV) systems is key to improving sustainability and efficiency in renewable energy production. This study presents a reinforcement learning (RL)-based framework to autonomously optimize the cleaning schedules of PV panels in arid regions, where soiling from dust and other airborne particles significantly reduces energy output. By employing advanced RL algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), the framework dynamically adjusts cleaning intervals based on uncertain environmental conditions. The proposed approach was applied to a case study in Abu Dhabi, UAE, demonstrating that PPO outperformed SAC and traditional simulation optimization (Sim-Opt) methods, achieving up to 13% cost savings by dynamically responding to weather uncertainties. The results highlight the superiority of flexible, autonomous scheduling over fixed-interval methods, particularly in adapting to stochastic environmental dynamics. This aligns with the goals of autonomous green energy production by reducing operational costs and improving the efficiency of solar power generation systems. This work underscores the potential of RL-driven autonomous decision-making to optimize maintenance operations in renewable energy systems. In future research, it is important to enhance the generalization ability of the proposed RL model, while also considering additional factors and constraints to apply it to different regions.


2603.07516 2026-03-10 cs.RO cs.AI

InterReal: A Unified Physics-Based Imitation Framework for Learning Human-Object Interaction Skills

Dayang Liang, Yuhang Lin, Xinzhe Liu, Jiyuan Shi, Yunlong Liu, Chenjia Bai

2603.07515 2026-03-10 cs.CV

EvolveReason: Self-Evolving Reasoning Paradigm for Explainable Deepfake Facial Image Identification

Binjia Zhou, Dawei Luo, Shuai Chen, Feng Xu, Seow, Haoyuan Li, Jiachi Wang, Jiawen Wang, Zunlei Feng, Yijun Bei

2603.07513 2026-03-10 cs.CL

Bolbosh: Script-Aware Flow Matching for Kashmiri Text-to-Speech

Tajamul Ashraf, Burhaan Rasheed Zargar, Saeed Abdul Muizz, Ifrah Mushtaq, Nazima Mehdi, Iqra Altaf Gillani, Aadil Amin Kak, Janibul Bashir

Comments https://gaash-lab.github.io/Bolbosh/

2603.07507 2026-03-10 cs.LG

Online Continual Learning for Anomaly Detection in IoT under Data Distribution Shifts

Matea Marinova, Shashi Raj Pandey, Junya Shiraishi, Martin Voigt Vejling, Valentin Rakovic, Petar Popovski

Comments Manuscript submitted to EUSIPCO 2026. The copyright might be transferred without further notice

2603.07506 2026-03-10 cs.LG

A Unified Framework for Knowledge Transfer in Bidirectional Model Scaling

Jianlu Shen, Fu Feng, Jiaze Xu, Yucheng Xie, Jiaqi Lv, Xin Geng

2603.07500 2026-03-10 cs.LG

Enhanced Random Subspace Local Projections for High-Dimensional Time Series Analysis

Eman Khalid, Moimma Ali Khan, Zarmeena Ali, Abdullah Illyas, Muhammad Usman, Saoud Ahmed

Comments 12 pages, 18 figures

2603.07497 2026-03-10 cs.CV

AMR-CCR: Anchored Modular Retrieval for Continual Chinese Character Recognition

Yuchuan Wu, Yinglian Zhu, Haiyang Yu, Ke Niu, Bin Li, Xiangyang Xue

2603.07494 2026-03-10 cs.CV

DocCogito: Aligning Layout Cognition and Step-Level Grounded Reasoning for Document Understanding

Yuchuan Wu, Minghan Zhuo, Teng Fu, Mengyang Zhao, Bin Li, Xiangyang Xue

2603.07493 2026-03-10 cs.CV

RayD3D: Distilling Depth Knowledge Along the Ray for Robust Multi-View 3D Object Detection

Rui Ding, Zhaonian Kuang, Zongwei Zhou, Meng Yang, Xinhu Zheng, Gang Hua

2603.07487 2026-03-10 cs.CL cs.AI

A Joint Neural Baseline for Concept, Assertion, and Relation Extraction from Clinical Text

Fei Cheng, Ribeka Tanaka, Sadao Kurohashi

Comments Technical Report. Our code is available at: https://github.com/racerandom/JaMIE

2603.07486 2026-03-10 cs.CV

Multi-Modal Decouple and Recouple Network for Robust 3D Object Detection

Rui Ding, Zhaonian Kuang, Yuzhe Ji, Meng Yang, Xinhu Zheng, Gang Hua

Details

Abstract

Multi-modal 3D object detection with bird's eye view (BEV) has achieved notable advances on benchmarks. Nonetheless, the accuracy may drop significantly in the real world due to data corruption caused by factors such as sensor configurations for LiDAR and scene conditions for camera. One design bottleneck of previous models resides in the tight coupling of multi-modal BEV features during fusion, which may degrade the overall system performance if one modality or both are corrupted. To mitigate this, we propose a Multi-Modal Decouple and Recouple Network for robust 3D object detection under data corruption. Different modalities commonly share some high-level invariant features. We observe that these invariant features across modalities do not always fail simultaneously, because different types of data corruption affect each modality in distinct ways. These invariant features can be recovered across modalities for robust fusion under data corruption. To this end, we explicitly decouple Camera/LiDAR BEV features into modality-invariant and modality-specific parts. This allows invariant features to compensate for each other while mitigating the negative impact of a corrupted modality on the other. We then recouple these features into three experts to handle different types of data corruption, i.e., LiDAR, camera, and both, respectively. For each expert, we use modality-invariant features as robust information, while modality-specific features serve as a complement. Finally, we adaptively fuse the three experts to extract robust features for 3D object detection. For validation, we collect a benchmark with a large quantity of data corruption for LiDAR, camera, and both based on nuScenes. Our model is trained on clean nuScenes and tested on all types of data corruption. Our model consistently achieves the best accuracy on both corrupted and clean data compared to recent models.


2603.07484 2026-03-10 cs.RO

HSC-VLA: Hierarchical Scene-Clearing for Robust Bimanual Manipulation in Dense Clutter

Zhen Liu, Xinyu Ning, Zhe Hu, XinXin Xie, Yitong Liu, Zhongzhu Pu

2603.07482 2026-03-10 cs.LG cs.AI

Interpretable-by-Design Transformers via Architectural Stream Independence

Clayton Kerce, Alexis Fox

2603.07480 2026-03-10 cs.RO

GSAT: Geometric Traversability Estimation using Self-supervised Learning with Anomaly Detection for Diverse Terrains

Dongjin Cho, Miryeong Park, Juhui Lee, Geonmo Yang, Younggun Cho

Comments 8 pages, 8 figures, accepted to ICRA 2026

2603.07476 2026-03-10 cs.CV

EVLF: Early Vision-Language Fusion for Generative Dataset Distillation

Wenqi Cai, Yawen Zou, Guang Li, Chunzhi Gu, Chao Zhang

Comments CVPR 2026 (main conference)

2603.07472 2026-03-10 cs.LG cs.AI

Contact-Guided 3D Genome Structure Generation of E. coli via Diffusion Transformers

Mingxin Zhang, Xiaofeng Dai, Yu Yao, Ziqi Yin

Comments Accepted at the Gen2 Workshop at ICLR 2026

2603.07468 2026-03-10 cs.CV

FedEU: Evidential Uncertainty-Driven Federated Fine-Tuning of Vision Foundation Models for Remote Sensing Image Segmentation

Xiaokang Zhang, Xuran Xiong, Jianzhong Huang, Lefei Zhang

Comments 14 pages, 8 figures

2603.07465 2026-03-10 cs.CV

Classifying Novel 3D-Printed Objects without Retraining: Towards Post-Production Automation in Additive Manufacturing

Fanis Mathioulakis, Gorjan Radevski, Silke GC Cleuren, Michel Janssens, Brecht Das, Koen Schauwaert, Tinne Tuytelaars

2603.07464 2026-03-10 cs.CV

Selective Transfer Learning of Cross-Modality Distillation for Monocular 3D Object Detection

Rui Ding, Meng Yang, Nanning Zheng

2603.07463 2026-03-10 cs.CV

SIGMAE: A Spectral-Index-Guided Foundation Model for Multispectral Remote Sensing

Xiaokang Zhang, Bo Li, Chufeng Zhou, Weikang Yu, Lefei Zhang

Comments 17 pages, 10 figures