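arXivDaily: a daily academic digest that syncs the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.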

2512.14106 2026-03-06 cs.AI

HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control

Ijaz Ul Haq, Byung Suk Lee, Julia N. Perdrial, David Baude

Comments: Supplementary materials, datasets, and implementation code will be made publicly available upon acceptance for publication in a peer-reviewed journal

English abstract

Advances in sensor networks have enabled real-time stream discharge monitoring, yet persistent sensor malfunctions limit data utility. Manual quality control by expert hydrologists cannot scale with networks generating millions of measurements annually. We introduce HydroGEM, a foundation model for continental-scale streamflow quality control designed to support human expertise. HydroGEM uses self-supervised pretraining on 6.03 million clean sequences from 3,724 USGS stations to learn general hydrological representations, followed by fine-tuning with synthetic anomalies for detection and reconstruction. A hybrid TCN-Transformer architecture (14.2M parameters) captures both local and long-range temporal dependencies, while hierarchical normalization handles six orders of magnitude in discharge. On held-out observations from 799 stations with 18 synthetic anomaly types grounded in USGS standards, HydroGEM achieves F1=0.792 for detection and 68.7% reconstruction error reduction, outperforming the strongest baseline by 36.3%. For cross-national validation on 100 Environment and Climate Change Canada stations using tolerant evaluation with a ±24-hour buffer, HydroGEM achieves Tolerant F1=0.70 with 90.1% segment-level event detection, demonstrating cross-national generalization. The model maintains consistent detection across correction magnitudes and aligns with operational seasonal patterns, with peak flagging during winter ice-affected periods matching hydrologists' correction behavior. Architectural separation between simplified training anomalies and complex test anomalies confirms that performance reflects learned hydrometric principles rather than pattern memorization.

2512.07419 2026-03-06 cs.LG cs.CV

Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Haidong Kang, Jun Du, Lihong Lin

2512.04551 2026-03-06 cs.SD cs.AI eess.AS

Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention

Cong Wang, Yizhong Geng, Yuhua Wen, Qifei Li, Yingming Gao, Ruimin Wang, Chunfeng Wang, Hao Li, Ya Li, Wei Chen

Comments: Submitted for review to Interspeech 2026

2512.04277 2026-03-06 cs.LG cs.AI

Bootstrapped Mixed Rewards for RL Post-Training: Injecting Canonical Action Order

Prakhar Gupta, Vaibhav Gupta

2511.17781 2026-03-06 cs.RO

ROVER: Regulator-Driven Robust Temporal Verification of Black-Box Robot Policies

Kristy Sakano, Jianyu An, Dinesh Manocha, Huan Xu

2511.16786 2026-03-06 cs.LG cs.AI cs.CV

Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

Yaoxin Yang, Peng Ye, Xudong Tan, Chongjun Tu, Maosen Zhao, Jia Hao, Tao Chen

Comments: CVPR 2026

2511.13306 2026-03-06 cs.AI cs.CV

DAP: A Discrete-token Autoregressive Planner for Autonomous Driving

Bowen Ye, Bin Zhang, Hang Zhao

2511.13197 2026-03-06 cs.CV

Fully Automatic Data Labeling for Ultrasound Screen Detection

Alberto Gomez, Jorge Oliveira, Ramon Casero, Agis Chartsias

Comments: Submitted to ISBI AI-POCUS workshop 2026

2511.12048 2026-03-06 cs.CV cs.CR

DeiTFake: Deepfake Detection Model using DeiT Multi-Stage Training

Saksham Kumar, Ashish Singh, Srinivasarao Thota, Sunil Kumar Singh, Chandan Kumar

2511.00412 2026-03-06 cs.RO

Runge-Kutta Approximations for Direct Coning Compensation Applying Lie Theory

John A. Christian, Michael R. Walker, Wyatt Bridgman, Michael J. Sparapany

Comments: Accepted manuscript. AIAA JGCD

2511.00141 2026-03-06 cs.CV cs.AI

FLoC: Facility Location-Based Efficient Visual Token Compression for Long Video Understanding

Janghoon Cho, Jungsoo Lee, Munawar Hayat, Kyuwoong Hwang, Fatih Porikli, Sungha Choi

Comments: Accepted to ICLR 2026

2510.27048 2026-03-06 cs.RO

SpikeATac: A Multimodal Tactile Finger with Taxelized Dynamic Sensing for Dexterous Manipulation

Eric T. Chang, Peter Ballentine, Zhanpeng He, Do-Gon Kim, Kai Jiang, Hua-Hsuan Liang, Joaquin Palacios, William Wang, Pedro Piacenza, Ioannis Kymissis, Matei Ciocarlie

Comments: 8 pages, 8 figures, ICRA 2026

2510.26139 2026-03-06 cs.RO

Kinodynamic Task and Motion Planning using VLM-guided and Interleaved Sampling

Minseo Kwon, Young J. Kim

2510.22503 2026-03-06 cs.LG cond-mat.mtrl-sci cs.AI cs.NE

LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery

Nikhil Abhyankar, Sanchit Kabra, Saaketh Desai, Chandan K. Reddy

Comments: ICLR 2026

2510.16834 2026-03-06 cs.SD cs.AI cs.LG eess.AS

Schrödinger Bridge Mamba for One-Step Speech Enhancement

Jing Yang, Sirui Wang, Chao Wu, Lei Guo, Fan Fan

Comments: Revised version. Submitted to Interspeech 2026

2510.00654 2026-03-06 cs.CV

Weakly Supervised Cloud Detection Combining Spectral Features and Multi-Scale Deep Network

Shaocong Zhu, Zhiwei Li, Xinghua Li, Huanfeng Shen

DOI: 10.1080/15481603.2026.2626022
Journal ref: GIScience & Remote Sensing, 63(1).

English abstract

Clouds significantly affect the quality of optical satellite images, which seriously limits their precise application. Recently, deep learning has been widely applied to cloud detection and has achieved satisfactory results. However, the lack of distinctive features in thin clouds and the low quality of training samples limit the cloud detection accuracy of deep learning methods, leaving room for further improvement. In this paper, we propose a weakly supervised cloud detection method that combines spectral features with a multi-scale scene-level deep network (SpecMCD) to obtain highly accurate pixel-level cloud masks. The method first uses a progressive training framework with a multi-scale scene-level dataset to train the multi-scale scene-level cloud detection network. Pixel-level cloud probability maps are then obtained by combining the multi-scale probability maps and a cloud thickness map, based on the characteristics of clouds in dense cloud coverage and large cloud-area coverage images. Finally, adaptive thresholds are generated based on the differentiated regions of the scene-level cloud masks at different scales and combined with distance-weighted optimization to obtain binary cloud masks. Two datasets, WDCD and GF1MS-WHU, comprising a total of 60 Gaofen-1 multispectral (GF1-MS) images, were used to verify the effectiveness of the proposed method. Compared with other weakly supervised cloud detection methods such as WDCD and WSFNet, the F1-score of the proposed SpecMCD method shows an improvement of over 7.82%, highlighting the superiority and potential of the SpecMCD method for cloud detection under different cloud coverage conditions.

2509.25762 2026-03-06 cs.LG

OPPO: Accelerating PPO-based RLHF via Pipeline Overlap

Kaizhuo Yan, Yingjie Yu, Yifan Yu, Haizhong Zheng, Fan Lai

2509.23886 2026-03-06 cs.LG cs.AI

Towards Understanding Subliminal Learning: When and How Hidden Biases Transfer

Simon Schrodi, Elias Kempf, Fazl Barez, Thomas Brox

Comments: ICLR 2026

2509.23506 2026-03-06 cs.RO

Ask, Reason, Assist: Robot Collaboration via Natural Language and Temporal Logic

Dan BW Choe, Sundhar Vinodh Sangeetha, Steven Emanuel, Chih-Yuan Chiu, Samuel Coogan, Shreyas Kousik

Comments: arXiv admin note: substantial text overlap with arXiv:2505.13376

2509.08177 2026-03-06 cs.RO cs.AI cs.CV

Quadrotor Navigation using Reinforcement Learning with Privileged Information

Jonathan Lee, Abhishek Rathod, Kshitij Goel, John Stecklein, Wennie Tabib

2509.05609 2026-03-06 cs.CL cs.LG

New Insights into Optimal Alignment of Acoustic and Linguistic Representations for Knowledge Transfer in ASR

Xugang Lu, Peng Shen, Hisashi Kawai

Comments: Accepted to ICASSP 2026

2508.16943 2026-03-06 cs.RO cs.AI

LHM-Humanoid: Learning a Unified Policy for Long-Horizon Humanoid Whole-Body Loco-Manipulation in Diverse Messy Environments

Haozhuo Zhang, Jingkai Sun, Michele Caprio, Jian Tang, Shanghang Zhang, Qiang Zhang, Wei Pan

2506.07080 2026-03-06 cs.CV

FLAIR-HUB: Large-scale Multimodal Dataset for Land Cover and Crop Mapping

Anatol Garioud, Sébastien Giordano, Nicolas David, Nicolas Gonthier

2506.02015 2026-03-06 cs.CV

OSPO: Object-Centric Self-Improving Preference Optimization for Text-to-Image Generation

Yoonjin Oh, Yongjin Kim, Hyomin Kim, Donghwan Chi, Sungwoong Kim

Comments: 11 pages, 6 figures

2503.21692 2026-03-06 cs.CV

RapidPoseTriangulation: Multi-view Multi-person Whole-body Human Pose Triangulation in a Millisecond

Daniel Bermuth, Alexander Poeppel, Wolfgang Reif

2502.03540 2026-03-06 cs.LG cs.AI

Path Planning for Masked Diffusion Model Sampling

Fred Zhangzhi Peng, Zachary Bezemek, Sawan Patel, Jarrid Rector-Brooks, Sherwood Yao, Avishek Joey Bose, Alexander Tong, Pranam Chatterjee

2412.20298 2026-03-06 cs.LG cs.CY stat.ML

An Experimental Study on Fairness-aware Machine Learning for Credit Scoring Problems

Huyen Giang Thi Thu, Thang Viet Doan, Ha-Bang Ban, Tai Le Quy

Comments: The manuscript is submitted to Springer Nature's journal

2411.19210 2026-03-06 cs.CV

Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation

Finlay G. C. Hudson, William A. P. Smith

2603.05280 2026-03-06 cs.CV cs.LG stat.ML

Layer by layer, module by module: Choose both for optimal OOD probing of ViT

Ambroise Odonnat, Vasilii Feofanov, Laetitia Chapel, Romain Tavenard, Ievgen Redko

Comments: Accepted at ICLR 2026 CAO Workshop

2603.05279 2026-03-06 cs.RO cs.SY eess.SY

From Code to Road: A Vehicle-in-the-Loop and Digital Twin-Based Framework for Central Car Server Testing in Autonomous Driving

Chengdong Wu, Sven Kirchner, Nils Purschke, Axel Torschmied, Norbert Kroth, Yinglei Song, André Schamschurko, Erik Leo Haß, Kuo-Yi Chao, Yi Zhang, Nenad Petrovic, Alois C. Knoll

Comments: 8 pages; Accepted for publication at the 37th IEEE Intelligent Vehicles Symposium (IV), Detroit, MI, United States, June 22-25, 2026