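arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.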

2506.14391 2026-03-23 cs.LG cs.AI

HALO: Hierarchical Reinforcement Learning for Large-Scale Adaptive Traffic Signal Control

Yaqiao Zhu, Hongkai Wen, Geyong Min, Man Luo

Comments Accepted to The Web Conference (WWW) 2026

Details

English abstract

Adaptive traffic signal control (ATSC) is essential for mitigating urban congestion in modern smart cities, where traffic infrastructure is evolving into interconnected Web-of-Things (WoT) environments with thousands of sensing-and-control nodes. However, existing methods face a critical scalability-coordination tradeoff: centralized approaches optimize global objectives but become computationally intractable at city scale, while decentralized multi-agent methods scale efficiently yet lack network-level coherence, resulting in suboptimal performance. In this paper, we present HALO, a hierarchical reinforcement learning framework that addresses this tradeoff for large-scale ATSC. HALO decouples decision-making into two levels: a high-level global guidance policy employs Transformer-LSTM encoders to model spatio-temporal dependencies across the entire network and broadcast compact guidance signals, while low-level local intersection policies execute decentralized control conditioned on both local observations and global context. To ensure better alignment of global-local objectives, we introduce an adversarial goal-setting mechanism where the global policy proposes challenging-yet-feasible network-level targets that local policies are trained to surpass, fostering robust coordination. We evaluate HALO extensively on multiple standard benchmarks, and a newly constructed large-scale Manhattan-like network with 2,668 intersections under real-world traffic patterns, including peak transitions, adverse weather and holiday surges. Results demonstrate HALO shows competitive performance and becomes increasingly dominant as network complexity grows across small-scale benchmarks, while delivering the strongest performance in all large-scale regimes, offering up to 6.8% lower average travel time and 5.0% lower average delay than the best state-of-the-art.

2505.15693 2026-03-23 cs.AI

Average Reward Reinforcement Learning for Omega-Regular and Mean-Payoff Objectives

Milad Kazemi, Mateo Perez, Fabio Somenzi, Sadegh Soudjani, Ashutosh Trivedi, Alvaro Velasquez

Comments 29 pages

Details

DOI: 10.1613/jair.1.19233
Journal ref: Journal of Artificial Intelligence Research 85, Article 25 (March 2026)

English abstract

Recent advances in reinforcement learning (RL) have renewed interest in reward design for shaping agent behavior, but manually crafting reward functions is tedious and error-prone. A principled alternative is to specify behavioral requirements in a formal, unambiguous language and automatically compile them into learning objectives. $ω$-regular languages are a natural fit, given their role in formal verification and synthesis. However, most existing $ω$-regular RL approaches operate in an episodic, discounted setting with periodic resets, which is misaligned with $ω$-regular semantics over infinite traces. For continuing tasks, where the agent interacts with the environment over a single uninterrupted lifetime, the average-reward criterion is more appropriate. We focus on absolute liveness specifications, a subclass of $ω$-regular languages that cannot be violated by any finite prefix and thus aligns naturally with continuing interaction. We present the first model-free RL framework that translates absolute liveness specifications into average-reward objectives and enables learning in unknown communicating Markov decision processes (MDPs) without episodic resetting. We also introduce a reward structure for lexicographic multi-objective optimization: among policies that maximize the satisfaction probability of an absolute liveness specification, the agent maximizes an external average-reward objective. Our method guarantees convergence in unknown communicating MDPs and supports on-the-fly reductions that do not require full environment knowledge, enabling model-free learning. Experiments across several benchmarks show that the continuing, average-reward approach outperforms competing discount-based methods.

2502.05709 2026-03-23 cs.LG stat.ML

Flow-based Conformal Prediction for Multi-dimensional Time Series

Junghwan Lee, Chen Xu, Yao Xie

2501.18788 2026-03-23 cs.CV math.OC

On the Theory of Bias Tuning in Event Cameras

David El-Chai Ben-Ezra, Daniel Brisk, Adar Tal

Comments 15 pages, 2 figures

2501.13558 2026-03-23 cs.CV

GoDe: Gaussians on Demand for Progressive Level of Detail and Scalable Compression

Francesco Di Sario, Riccardo Renzulli, Marco Grangetto, Akihiro Sugimoto, Enzo Tartaglione

2410.19884 2026-03-23 cs.CV

A Survey of AI-Generated Video Evaluation

Xiao Liu, Xinhao Xiang, Zizhong Li, Yongheng Wang, Zhuoheng Li, Zhuosheng Liu, Weidi Zhang, Weiqi Ye, Jiawei Zhang

2306.02393 2026-03-23 cs.RO cs.CV

EgoSpot: Egocentric Multimodal Control for Hands-Free Mobile Manipulation

Ganlin Zhang, Deheng Zhang, Longteng Duan, Guo Han, Yuqian Fu, Danda Pani Paudel, Luc Van Gool, Eric Vollenweider

2603.20181 2026-03-23 cs.CR cs.AI

Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning

Jianan Huang, Rodolfo V. Valentim, Luca Vassio, Matteo Boffa, Marco Mellia, Idilio Drago, Dario Rossi

Comments Submitted to Euro S&P - 5th International Workshop on Designing and Measuring Security in Systems with AI

2603.20151 2026-03-23 cs.CE cs.AI cs.SY eess.SY

Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case

H. Sinan Bank, Daniel R. Herber, Thomas H. Bradley

Comments 2 figures, 11 pages, Submitted to ASME IDETC 2026 - DAC-09

2603.20122 2026-03-23 cs.CR cs.AI

Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models

Wenjing Hong, Zhonghua Rong, Li Wang, Feng Chang, Jian Zhu, Ke Tang, Zexuan Zhu, Yew-Soon Ong

2603.20118 2026-03-23 eess.AS cs.SD

BioDCASE 2026 Challenge Baseline for Cross-Domain Mosquito Species Classification

Yuanbo Hou, Vanja Zdravkovic, Marianne Sinka, Yunpeng Li, Wenwu Wang, Mark D. Plumbley, Kathy Willis, Stephen Roberts

Comments BioDCASE 2026 CD-MSC Baseline, source code and models: https://github.com/Yuanbo2020/CD-MSC

2603.20112 2026-03-23 cs.HC cs.AI

Demonstration of Adapt4Me: An Uncertainty-Aware Authoring Environment for Personalizing Automatic Speech Recognition to Non-normative Speech

Niclas Pokel, Yiming Zhao, Pehuén Moure, Yingqiang Gao, Roman Böhringer

2603.20094 2026-03-23 cs.IR cs.AI cs.DB

LLM-Enhanced Semantic Data Integration of Electronic Component Qualifications in the Aerospace Domain

Antonio De Santis, Marco Balduini, Matteo Belcao, Andrea Proia, Marco Brambilla, Emanuele Della Valle

Comments ESWC 2026

2603.20075 2026-03-23 cs.SE cs.AI

Agentic Harness for Real-World Compilers

Yingwei Zheng, Cong Li, Shaohua Li, Yuqun Zhang, Zhendong Su

2603.20072 2026-03-23 quant-ph cs.LG eess.SP

Antenna Array Beamforming Based on a Hybrid Quantum Optimization Framework

Shuai Zeng

2603.20048 2026-03-23 eess.SP cs.LG

Structured Latent Dynamics in Wireless CSI via Homomorphic World Models

Salmane Naoumi, Mehdi Bennis, Marwa Chafii

Comments Accepted for publication in IEEE International Conference on Communications (ICC) 2026

2603.20045 2026-03-23 eess.IV cs.CV

Investigating a Policy-Based Formulation for Endoscopic Camera Pose Recovery

Jan Emily Mangulabnan, Akshat Chauhan, Laura Fleig, Lalithkumar Seenivasan, Roger D. Soberanis-Mukul, S. Swaroop Vedula, Russell H. Taylor, Masaru Ishii, Gregory D. Hager, Mathias Unberath

2603.20034 2026-03-23 cs.IR cs.AI

CoverageBench: Evaluating Information Coverage across Tasks and Domains

Saron Samuel, Andrew Yates, Dawn Lawrie, Ian Soboroff, Trevor Adriaanse, Benjamin Van Durme, Eugene Yang

Comments 8

2603.20028 2026-03-23 cs.SE cs.AI

Orchestrating Human-AI Software Delivery: A Retrospective Longitudinal Field Study of Three Software Modernization Programs

Maximiliano Armesto, Christophe Kolb

Comments 18 pages, 4 figures, 12 tables

2603.20024 2026-03-23 quant-ph cs.CV cs.LG

Layered Quantum Architecture Search for 3D Point Cloud Classification

Natacha Kuete Meli, Jovita Lukasik, Vladislav Golyanik, Michael Moeller

2603.20007 2026-03-23 physics.comp-ph cond-mat.mtrl-sci cs.AI

Physics-Informed Long-Range Coulomb Correction for Machine-learning Hamiltonians

Yang Zhong, Xiwen Li, Xingao Gong, Hongjun Xiang

Comments 9 pages, 3 figures

2603.19975 2026-03-23 cs.HC cs.AI

Promoting Critical Thinking With Domain-Specific Generative AI Provocations

Thomas Şerban von Davier, Hao-Ping Lee, Jodi Forlizzi, Sauvik Das

Comments 6 pages, 2 figures, 1 table, CHI2026 Workshop on Tools for Thought, 2026 CHI Conference on Human Factors in Computing Systems CHI26

2603.19974 2026-03-23 cs.CR cs.AI

Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance

Fazhong Liu, Zhuoyan Chen, Tu Lan, Haozhen Tan, Zhenyu Xu, Xiang Li, Guoxing Chen, Yan Meng, Haojin Zhu

Details

English abstract

Autonomous coding agents are increasingly integrated into software development workflows, offering capabilities that extend beyond code suggestion to active system interaction and environment management. OpenClaw, a representative platform in this emerging paradigm, introduces an extensible skill ecosystem that allows third-party developers to inject behavioral guidance through lifecycle hooks during agent initialization. While this design enhances automation and customization, it also opens a novel and unexplored attack surface. In this paper, we identify and systematically characterize guidance injection, a stealthy attack vector that embeds adversarial operational narratives into bootstrap guidance files. Unlike traditional prompt injection, which relies on explicit malicious instructions, guidance injection manipulates the agent's reasoning context by framing harmful actions as routine best practices. These narratives are automatically incorporated into the agent's interpretive framework and influence future task execution without raising suspicion. We construct 26 malicious skills spanning 13 attack categories including credential exfiltration, workspace destruction, privilege escalation, and persistent backdoor installation. We evaluate them using ORE-Bench, a realistic developer workspace benchmark we developed. Across 52 natural user prompts and six state-of-the-art LLM backends, our attacks achieve success rates from 16.0% to 64.2%, with the majority of malicious actions executed autonomously without user confirmation. Furthermore, 94% of our malicious skills evade detection by existing static and LLM-based scanners. Our findings reveal fundamental tensions in the design of autonomous agent ecosystems and underscore the urgent need for defenses based on capability isolation, runtime policy enforcement, and transparent guidance provenance.

2603.19962 2026-03-23 cs.CR cs.LG

Channel Prediction-Based Physical Layer Authentication under Consecutive Spoofing Attacks

Yijia Guo, Junqing Zhang, Yao-Win Peter Hong

2603.19955 2026-03-23 math.OC cs.LG cs.SI cs.SY eess.SY

Structural Controllability of Large-Scale Hypergraphs

Joshua Pickard, Xin Mao, Can Chen

Comments 14 pages, 4 figures, 1 table

2603.19949 2026-03-23 cs.CR cs.LG

TAPAS: Efficient Two-Server Asymmetric Private Aggregation Beyond Prio(+)

Harish Karthikeyan, Antigoni Polychroniadou

2603.19925 2026-03-23 eess.IV cs.CV

ReconMIL: Synergizing Latent Space Reconstruction with Bi-Stream Mamba for Whole Slide Image Analysis

Lubin Gan, Jing Zhang, Heng Zhang, Xin Di, Zhifeng Wang, Wenke Huang, Xiaoyan Sun

2603.19914 2026-03-23 cs.HC cs.RO

Sense4HRI: A ROS 2 HRI Framework for Physiological Sensor Integration and Synchronized Logging

Manuel Scheibl, Julian Leichert, Sinem Görmez, Britta Wrede

Comments 6 pages, 3 figures, submitted at IEEE RO-MAN 2026

2603.19907 2026-03-23 math.OC cs.LG cs.NA math.NA

Infinite-dimensional spherical-radial decomposition for probabilistic functions, with application to constrained optimal control and Gaussian process regression

Kewei Wang, Georg Stadler

Comments 25 pages, 8 figures

2603.19899 2026-03-23 stat.ML cs.LG stat.AP

Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects

Hao Wang, Licheng Pan, Qingsong Wen, Jialin Yu, Zhichao Chen, Chunyuan Zheng, Xiaoxi Li, Zhixuan Chu, Chao Xu, Mingming Gong, Haoxuan Li, Yuan Lu, Zhouchen Lin, Philip Torr, Yan Liu