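arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.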

2604.20496 2026-04-23 cs.CR cs.AI

Mythos and the Unverified Cage: Z3-Based Pre-Deployment Verification for Frontier-Model Sandbox Infrastructure

Dominik Blain

Comments 12 pages, 2 figures, 4 production case studies, 4 tables. Research paper on formal verification for frontier-model sandbox infrastructure

English abstract

The April 2026 Claude Mythos sandbox escape exposed a critical weakness in frontier AI containment: the infrastructure surrounding advanced models remains susceptible to formally characterizable arithmetic vulnerabilities. Anthropic has not publicly characterized the escape vector; some secondary accounts hypothesize a CWE-190 arithmetic vulnerability in sandbox networking code. We treat this as unverified and analyze the vulnerability class rather than the specific escape. This paper presents COBALT, a Z3 SMT-based formal verification engine for identifying CWE-190/191/195 arithmetic vulnerability patterns in C/C++ infrastructure prior to deployment. We distinguish two classes of contribution. Validated: COBALT detects arithmetic vulnerability patterns in production codebases, producing SAT verdicts with concrete witnesses and UNSAT guarantees under explicit safety bounds. We demonstrate this on four production case studies: NASA cFE, wolfSSL, Eclipse Mosquitto, and NASA F Prime, with reproducible encodings, verified solver output, and acknowledged security outcomes. Proposed: a four-layer containment framework consisting of COBALT, VERDICT, DIRECTIVE-4, and SENTINEL, mapping pre-deployment verification, pre-execution constraints, output control, and runtime monitoring to the failure modes exposed by the Mythos incident. Under explicit assumptions, we further argue that the publicly reported Mythos escape class is consistent with a Z3-expressible CWE-190 arithmetic formulation and that pre-deployment formal analysis would have been capable of surfacing the relevant pattern. The broader claim is infrastructural: frontier-model safety cannot depend on behavioral safeguards alone; the containment stack itself must be subjected to formal verification.
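To make the CWE-190 pattern named in the abstract concrete, here is a minimal sketch in plain Python. It is not COBALT and does not use Z3: an exhaustive search over a toy 8-bit width stands in for the SMT solver, and the function name, the bit width, and the bounds are all hypothetical choices made for illustration. A returned pair plays the role of a SAT verdict with a concrete witness; `None` plays the role of an UNSAT result under explicit bounds.

```python
from itertools import product

WIDTH = 8                    # toy bit width (assumption) so exhaustive search is cheap
MASK = (1 << WIDTH) - 1      # largest value representable in WIDTH bits


def find_overflow_witness(max_count, max_size):
    """Search for (count, size) whose WIDTH-bit product wraps around.

    Both bounds are inclusive. Returns the first witness found, or None
    if no pair within the bounds overflows.
    """
    for count, size in product(range(max_count + 1), range(max_size + 1)):
        true_product = count * size          # unbounded Python integer
        wrapped = true_product & MASK        # what fixed-width hardware would keep
        if wrapped != true_product:
            return count, size               # concrete overflow witness
    return None                              # no overflow within these bounds


# No safety bounds: any 8-bit operands are allowed.
unbounded = find_overflow_witness(MASK, MASK)

# Explicit safety bounds: 15 * 17 = 255 still fits in 8 bits.
bounded = find_overflow_witness(15, 17)
```

An SMT solver reaches the same kind of verdict symbolically over realistic 32- or 64-bit widths, where enumeration like this is not feasible.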


2604.20495 2026-04-23 cs.CR cs.LG

Towards Certified Malware Detection: Provable Guarantees Against Evasion Attacks

Nandakrishna Giri, Asmitha K. A., Serena Nicolazzo, Antonino Nocera, Vinod P

2604.20492 2026-04-23 stat.ML cs.IT cs.LG math.IT

Decentralized Machine Learning with Centralized Performance Guarantees via Gibbs Algorithms

Yaiza Bermudez, Samir Perlaza, Iñaki Esnaola

Comments In Proceedings of the International Symposium on Information Theory (ISIT), 2026

2604.20490 2026-04-23 cs.IR

Break the Optimization Barrier of LLM-Enhanced Recommenders: A Theoretical Analysis and Practical Framework

Zhangchi Zhu, Wei Zhang

2604.20489 2026-04-23 cs.NI

Assessing the Challenges of Collective Perception via V2I Communications in High-Speed Scenarios with Open Road Testing

Jon Ander Iñiguez de Gordoa, Iker Alkorta, Itziar Urbieta, Gorka Velez, Andoni Mujika

2604.20486 2026-04-23 cs.CV

ProMMSearchAgent: A Generalizable Multimodal Search Agent Trained with Process-Oriented Rewards

Wentao Yan, Shengqin Wang, Huichi Zhou, Yihang Chen, Kun Shao, Yuan Xie, Zhizhong Zhang

2604.20475 2026-04-23 math.NA cs.CE cs.NA

A topological decoupling of modified nodal analysis including controlled sources

Idoia Cortes Garcia, Peter F. Förster, Lennart Jansen, Wil Schilders, Sebastian Schöps

Comments 14 pages, 8 figures

2604.20474 2026-04-23 cs.CV

Random Walk on Point Clouds for Feature Detection

Yuhe Zhang, Zhikun Tu, Zhi Li, Jian Gao, Bao Guo, Shunli Zhang

Comments 20 pages, 11 figures. Published in Information Sciences

2604.20473 2026-04-23 cs.CV

Video-ToC: Video Tree-of-Cue Reasoning

Qizhong Tan, Zhuotao Tian, Guangming Lu, Jun Yu, Wenjie Pei

2604.20472 2026-04-23 cs.RO cs.LG

Temporal Difference Calibration in Sequential Tasks: Application to Vision-Language-Action Models

Shelly Francis-Meretzki, Mirco Mutti, Yaniv Romano, Aviv Tamar

2604.20470 2026-04-23 cs.CV

DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion

Yongji Long, Shijun Liang, Jintao Li, Yun Li

2604.20467 2026-04-23 physics.ao-ph cs.LG physics.comp-ph

Mechanistic Interpretability Tool for AI Weather Models

Kirsten I. Tempest, Matthias Beylich, George C. Craig

Comments 14 pages, 5 figures. Submitted to International Conference on Computational Science 2026

2604.20466 2026-04-23 eess.SP cs.SY eess.IV eess.SY

Adaptive Multi-UAV Relay Deployment Framework in Satellite Aerial Ground Integrated Systems

Bhola, Yu-Jia Chen, Ashutosh Balakrishnan, Swades De, Li-Chun Wang

2604.20461 2026-04-23 cs.SE

On the Informativeness of Security Commit Messages: A Large-scale Replication Study

Syful Islam, Stefano Zacchiroli

Comments This paper has been accepted for publication in the EASE 2026 (RENE track)

2604.20460 2026-04-23 cs.CV

CCTVBench: Contrastive Consistency Traffic VideoQA Benchmark for Multimodal LLMs

Xingcheng Zhou, Hao Guo, Rui Song, Walter Zimmer, Mingyu Liu, André Schamschurko, Hu Cao, Alois Knoll

2604.20458 2026-04-23 cs.LG physics.chem-ph

Surrogate Functionals for Machine-Learned Orbital-Free Density Functional Theory

Roman Remme, Fred A. Hamprecht

2604.20457 2026-04-23 cs.DS

Cluster Vertex Deletion on Chordal Graphs

Yixin Cao, Peng Li

2604.20454 2026-04-23 cs.CL

Not all ANIMALs are equal: metaphorical framing through source domains and semantic frames

Yulia Otmakhova, Matteo Guida, Lea Frermann

Comments Accepted to ACL 2026 Findings

2604.20452 2026-04-23 cs.IR cs.CL

HaS: Accelerating RAG through Homology-Aware Speculative Retrieval

Peng Peng, Weiwei Lin, Wentai Wu, Xinyang Wang, Yongheng Liu

Comments Accepted by ICDE 2026

2604.20447 2026-04-23 cs.CL

Decoding Text Spans for Efficient and Accurate Named-Entity Recognition

Andrea Maracani, Savas Ozkan, Junyi Zhu, Sinan Mutlu, Mete Ozay

2604.20446 2026-04-23 cs.LG stat.ML

The Origin of Edge of Stability

Elon Litman

2604.20444 2026-04-23 cs.RO cs.AI cs.DB cs.LG

VTouch++: A Multimodal Dataset with Vision-Based Tactile Enhancement for Bimanual Manipulation

Qianxi Hua, Xinyue Li, Zheng Yan, Yang Li, Chi Zhang, Yongyao Li, Yufei Liu

2604.20441 2026-04-23 cs.AI

MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills

Yingyong Hou, Xinyuan Lao, Huimei Wang, Qianyu Yao, Wei Chen, Bocheng Huang, Fei Sun, Yuxian Lv, Weiqi Lei, Xueqian Wen, Pengfei Xia, Zhujun Tan, Shengyang Xie

Comments 20 pages, 9 figures, 1 graphic abstract, 4 tables

2604.20436 2026-04-23 cs.SE cs.AI

Shift-Up: A Framework for Software Engineering Guardrails in AI-native Software Development -- Initial Findings

Petrus Lipsanen, Liisa Rannikko, François Christophe, Konsta Kalliokoski, Vlad Stirbu, Tommi Mikkonen

Comments This paper has been accepted for presentation at the VibeX 2026 International Workshop on Vibe Coding and Vibe Researching

2604.20434 2026-04-23 cs.IR

Discrete Preference Learning for Personalized Multimodal Generation

Yuting Zhang, Ying Sun, Dazhong Shen, Ziwei Xie, Feng Liu, Changwang Zhang, Xiang Liu, Jun Wang, Hui Xiong

Comments Accepted to SIGIR 2026

DOI: 10.1145/3805712.3809552

English abstract

The emergence of generative models enables the creation of texts and images tailored to users' preferences. Existing personalized generative models have two critical limitations: they lack a dedicated paradigm for accurate preference modeling, and they generate unimodal content even though real-world user interactions are multimodal. Therefore, we propose personalized multimodal generation, which captures modal-specific preferences from multimodal interactions via a dedicated preference model, and then feeds them into downstream generators to produce personalized multimodal content. However, this task presents two challenges: (1) a gap between the continuous preferences produced by dedicated modeling and the discrete token inputs intrinsic to generator architectures; (2) potential inconsistency between generated images and texts. To tackle these, we present a two-stage framework called Discrete Preference learning for Personalized Multimodal Generation (DPPMG). In the first stage, to accurately learn discrete modal-specific preferences, we introduce a modal-specific graph neural network (a dedicated preference model) to learn users' modal-specific preferences, which are then quantized into discrete preference tokens. In the second stage, the discrete modal-specific preference tokens are injected into downstream text and image generators. To further enhance cross-modal consistency while preserving personalization, we design a cross-modal consistent and personalized reward to fine-tune token-associated parameters. Extensive experiments on two real-world datasets demonstrate the effectiveness of our model in generating personalized and consistent multimodal content.
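The quantization step in the first stage can be pictured with a small sketch. The abstract does not say how DPPMG quantizes preferences, so the nearest-neighbour codebook lookup below is an assumption, and the function name, the two-dimensional vectors, and the four-entry codebook are hypothetical values chosen only for illustration.

```python
import math


def quantize(preference, codebook):
    """Return the index of the codebook entry nearest to `preference`.

    Distance is Euclidean; the returned index acts as the discrete
    preference token for a continuous preference vector.
    """
    return min(
        range(len(codebook)),
        key=lambda i: math.dist(preference, codebook[i]),
    )


# Hypothetical codebook of four discrete preference tokens.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# A continuous preference vector, e.g. the output of a preference model.
user_preference = (0.9, 0.2)
token = quantize(user_preference, codebook)
```

A generator that accepts discrete tokens could then consume `token` in place of the continuous vector, which is the gap the abstract's first challenge describes.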


2604.20433 2026-04-23 math.OC cs.SY eess.SY

On Reward-Balancing Methods for Reinforcement Learning

Simone Baroncini, Bahman Gharesifard, Giuseppe Notarstefano

2604.20431 2026-04-23 cs.IT math.IT

A New Paradigm Towards Reconfigurable Environment: Reconfigurable Distributed Antennas and Reflecting Surface

Jintao Wang, Pingping Zhang, Chengzhi Ma, Chengwang Ji, Zheng Shi, Guanghua Yang, Shaodan Ma

Comments 12 pages, 9 figures. This manuscript has been accepted by Journal of Communications and Information Networks

2604.20429 2026-04-23 cs.CV

Fast-then-Fine: A Two-Stage Framework with Multi-Granular Representation for Cross-Modal Retrieval in Remote Sensing

Xi Chen, Xu Chen, Xiangyang Jia, Xu Zhang, Shuquan Wei, Wei Wang

2604.20423 2026-04-23 cs.RO

OVPD: A Virtual-Physical Fusion Testing Dataset of OnSite Autonomous Driving Challenge

Yuhang Zhang, Jiarui Zhang, Bowen Jian, Xin Zhou, Zhichao Lv, Peng Hang, Rongjie Yu, Ye Tian, Jian Sun

Comments 11 pages, 6 figures, 3 tables

2604.20421 2026-04-23 cs.LG

Unlocking the Forecasting Economy: A Suite of Datasets for the Full Lifecycle of Prediction Market: [Experiments & Analysis]

Huaiyu Jia, Luofeng Zhou, Wentao Zhang, Lin William Cong, Siguang Li, Shuo Sun

Comments Project page: https://www.polymonitor.club/