arXivDaily: arXiv daily academic digest, updated Monday through Friday
2603.01755 2026-03-03 cs.NI cs.AI

Federated Agentic AI for Wireless Networks: Fundamentals, Approaches, and Applications

Lingyi Cai, Yu Zhang, Ruichen Zhang, Yinqiu Liu, Tao Jiang, Dusit Niyato, Wei Ni, Abbas Jamalipour

Comments 7 pages, 3 figures

Abstract

Agentic artificial intelligence (AI) presents a promising pathway toward realizing autonomous and self-improving wireless network services. However, the resource-constrained, widely distributed, and data-heterogeneous nature of wireless networks poses significant challenges to existing agentic AI that relies on centralized architectures, leading to high communication overhead, privacy risks, and difficulties with non-independent and identically distributed (non-IID) data. Federated learning (FL) has the potential to improve the overall loop of agentic AI through collaborative local learning and parameter sharing without exchanging raw data. This paper proposes new federated agentic AI approaches for wireless networks. We first summarize the fundamentals of agentic AI and mainstream FL types. Then, we illustrate how each FL type can strengthen a specific component of agentic AI's loop. Moreover, we conduct a case study on using federated reinforcement learning (FRL) to improve the performance of agentic AI's action decision in low-altitude wireless networks (LAWNs). Finally, we provide a conclusion and discuss future research directions.
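The parameter-sharing idea in this abstract can be illustrated with a minimal sketch of federated averaging (FedAvg). This is a generic FL aggregation step, not the approach proposed in the paper; the names `fed_avg`, `clients`, and the toy parameter values are assumptions made for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def fed_avg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical example: two clients share only parameters, never raw data.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_params = fed_avg(clients, client_sizes=[1, 3])
```

Only the parameter dictionaries cross the network in this sketch, which is the property the abstract relies on to reduce privacy risk.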

2603.01719 2026-03-03 stat.ML cs.LG

Co-optimization for Adaptive Conformal Prediction

Xiaoyi Su, Zhixin Zhou, Rui Luo

Abstract

Conformal prediction (CP) provides finite-sample, distribution-free marginal coverage, but standard conformal regression intervals can be inefficient under heteroscedasticity and skewness. In particular, popular constructions such as conformalized quantile regression (CQR) often inherit a fixed notion of center and enforce equal-tailed errors, which can displace the interval away from high-density regions and produce unnecessarily wide sets. We propose Co-optimization for Adaptive Conformal Prediction (CoCP), a framework that learns prediction intervals by jointly optimizing a center $m(x)$ and a radius $h(x)$. CoCP alternates between (i) learning $h(x)$ via quantile regression on the folded absolute residual around the current center, and (ii) refining $m(x)$ with a differentiable soft-coverage objective whose gradients concentrate near the current boundaries, effectively correcting mis-centering without estimating the full conditional density. Finite-sample marginal validity is guaranteed by split-conformal calibration with a normalized nonconformity score. Theory characterizes the population fixed point of the soft objective and shows that, under standard regularity conditions, CoCP asymptotically approaches the length-minimizing conditional interval at the target coverage level as the estimation error and smoothing vanish. Experiments on synthetic and real benchmarks demonstrate that CoCP yields consistently shorter intervals and achieves state-of-the-art conditional-coverage diagnostics.
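The split-conformal calibration step mentioned in the abstract can be sketched as follows, assuming the normalized score takes the common form $|y - m(x)| / h(x)$ (the abstract does not spell out the exact score). The center `m`, radius `h`, and the calibration data are placeholders; the alternating training of $m$ and $h$ is not shown.

```python
import math
from typing import Callable, Sequence, Tuple


def calibrate(m: Callable[[float], float], h: Callable[[float], float],
              xs: Sequence[float], ys: Sequence[float], alpha: float) -> float:
    """Return the conformal quantile of normalized scores |y - m(x)| / h(x)."""
    scores = sorted(abs(y - m(x)) / h(x) for x, y in zip(xs, ys))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def interval(m: Callable[[float], float], h: Callable[[float], float],
             x: float, q: float) -> Tuple[float, float]:
    """Prediction interval m(x) +/- q * h(x)."""
    return (m(x) - q * h(x), m(x) + q * h(x))


# Placeholder center and radius; in practice these are learned models.
m = lambda x: 2.0 * x
h = lambda x: 1.0 + x

xs = [0.0, 1.0, 3.0, 0.0, 1.0, 3.0, 1.0]
ys = [0.25, 3.0, 9.0, -1.0, -0.5, 12.0, 5.5]
q = calibrate(m, h, xs, ys, alpha=0.25)
lo, hi = interval(m, h, 1.0, q)
```

Because the quantile is taken over held-out calibration scores, the marginal coverage guarantee holds regardless of how well `m` and `h` were fitted; their quality only affects interval length.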

2603.01702 2026-03-03 cs.AR cs.LG

Security Risks in Machining Process Monitoring: Sequence-to-Sequence Learning for Reconstruction of CNC Axis Positions

Lukas Krupp, Rickmar Stahlschmidt, Norbert Wehn

Comments Accepted for presentation at the 2026 IEEE Symposium on Artificial Intelligence for Instrumentation and Measurement (AI4IM 2026). Proceedings to be included in IEEE Xplore

Abstract

Accelerometer-based process monitoring is widely deployed in modern machining systems. When mounted on moving machine components, such sensors implicitly capture kinematic information related to machine motion and tool trajectories. If this information can be reconstructed, condition monitoring data constitutes a severe security threat, particularly for retrofitted or weakly protected sensor systems. Classical signal processing approaches are infeasible for position reconstruction from broadband accelerometer signals due to sensor- and process-specific non-idealities, like noise or sensor placement effects. In this work, we demonstrate that sequence-to-sequence machine learning models can overcome these non-idealities and enable reconstruction of CNC axis and tool positions. Our approach employs LSTM-based sequence-to-sequence models and is evaluated on an industrial milling dataset. We show that learning-based models reduce the reconstruction error by up to 98% for low complexity motion profiles and by up to 85% for complex machining sequences compared to double integration. Furthermore, key geometric characteristics of tool trajectories and workpiece-related motion features are preserved. To the best of our knowledge, this is the first study demonstrating learning-based CNC position reconstruction from industrial condition monitoring accelerometer data.
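For context, the double-integration baseline that the learned models are compared against can be sketched as below. This is a minimal trapezoidal version under ideal conditions (no noise, known initial state); the function names and the constant-acceleration example are assumptions, and the paper's exact baseline may differ.

```python
from typing import List, Sequence


def cumulative_trapezoid(samples: Sequence[float], dt: float,
                         initial: float = 0.0) -> List[float]:
    """Cumulative trapezoidal integral of uniformly sampled data."""
    out = [initial]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def double_integrate(accel: Sequence[float], dt: float,
                     v0: float = 0.0, p0: float = 0.0) -> List[float]:
    """Acceleration -> velocity -> position by integrating twice."""
    velocity = cumulative_trapezoid(accel, dt, v0)
    return cumulative_trapezoid(velocity, dt, p0)


# Idealized example: constant 2 m/s^2 sampled every 0.5 s.
positions = double_integrate([2.0] * 5, dt=0.5)
```

With real sensor data, any bias or noise in the acceleration is integrated twice and grows roughly quadratically in time, which is one reason this baseline degrades on broadband accelerometer signals.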

2603.01689 2026-03-03 math.NA cs.LG cs.NA

Randomized Neural Networks for Partial Differential Equation on Static and Evolving Surfaces

Jingbo Sun, Fei Wang

Abstract

Surface partial differential equations arise in numerous scientific and engineering applications. Their numerical solution on static and evolving surfaces remains challenging due to geometric complexity and, for evolving geometries, the need for repeated mesh updates and geometry or solution transfer. While neural-network-based methods offer mesh-free discretizations, approaches based on nonconvex training can be costly and may fail to deliver high accuracy in practice. In this work, we develop a randomized neural network (RaNN) method for solving PDEs on both static and evolving surfaces: the hidden-layer parameters are randomly generated and kept fixed, and the output-layer coefficients are determined efficiently by solving a least-squares problem. For static surfaces, we present formulations for parametrized surfaces, implicit level-set surfaces, and point-cloud geometries, and provide a corresponding theoretical analysis for the parametrization-based formulation with interface compatibility. For evolving surfaces with topology preserved over time, we introduce a RaNN-based strategy that learns the surface evolution through a flow-map representation and then solves the surface PDE on a space--time collocation set, avoiding remeshing. Extensive numerical experiments demonstrate broad applicability and favorable accuracy--efficiency performance on representative benchmarks.
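The core RaNN idea (random fixed hidden layer, output coefficients from least squares) can be sketched on a toy static surface. The example below assumes the unit circle parametrized by an angle, the model problem $-u'' + u = f$ with exact solution $u = \cos\theta$, and tanh features of the embedded coordinates; none of these choices come from the paper, and NumPy is an assumed dependency.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_points = 100, 200

# Hidden-layer parameters are drawn once at random and never trained.
W = rng.uniform(-2.0, 2.0, size=(n_hidden, 2))
b = rng.uniform(-2.0, 2.0, size=n_hidden)


def features(theta):
    """tanh features on the unit circle and their second theta-derivatives."""
    x, y = np.cos(theta), np.sin(theta)
    z = np.outer(x, W[:, 0]) + np.outer(y, W[:, 1]) + b
    dz = np.outer(-y, W[:, 0]) + np.outer(x, W[:, 1])
    d2z = -(np.outer(x, W[:, 0]) + np.outer(y, W[:, 1]))
    g = np.tanh(z)
    sech2 = 1.0 - g ** 2
    d2g = sech2 * d2z - 2.0 * g * sech2 * dz ** 2  # chain rule for tanh(z(theta))
    return g, d2g


theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
g, d2g = features(theta)

A = g - d2g                 # operator (-d^2/dtheta^2 + 1) applied to each feature
f = 2.0 * np.cos(theta)     # right-hand side for the exact solution cos(theta)

# Only the output-layer coefficients are solved for, via linear least squares.
coef, *_ = np.linalg.lstsq(A, f, rcond=None)
u = g @ coef
max_error = float(np.max(np.abs(u - np.cos(theta))))
```

No nonconvex training loop appears here: the only solve is a linear least-squares problem over the collocation points, which is the efficiency argument made in the abstract.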

2603.01624 2026-03-03 cs.CY cs.AI

Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum

Raphael Hoheisel, Tom Meurs, Jai Wientjes, Marianne Junger, Abhishta Abhishta, Masarah Paquet-Clouston

Comments 12 pages, 4 figures

Abstract

Cybercrime forums play a central role in the cybercrime ecosystem, serving as hubs for the exchange of illicit goods, services, and knowledge. Previous studies have explored the market and social structures of these forums, but less is known about the behavioral dynamics of users, particularly regarding participants' disclosure of criminal activity. This study provides the first large-scale assessment of crime disclosure patterns in a major cybercrime forum, analysing over 3.5 million posts from nearly 300k users. Using a three-level classification scheme (benign, grey, and crime) and a scalable labelling pipeline powered by large language models (LLMs), we measure the level of crime disclosure present in initial posts, analyse how participants switch between levels, and assess how crime disclosure behavior relates to private communications. Our results show that crime disclosure is relatively normative: one quarter of initial posts include explicit crime-related content, and more than one third of users disclose criminal activity at least once in their initial posts. At the same time, most participants show restraint, with over two-thirds posting only benign or grey content and typically escalating disclosure gradually. Grey initial posts are particularly prominent, indicating that many users avoid overt statements and instead anchor their activity in ambiguous content. The study highlights the value of LLM-based text classification and Markov chain modelling for capturing crime disclosure patterns, offering insights for law enforcement efforts aimed at distinguishing benign, grey, and criminal content in cybercrime forums.
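The Markov chain modelling of switches between disclosure levels can be sketched by estimating a first-order transition matrix from labelled post sequences. The sequences below are made up for illustration and are not data from the study; the function name `transition_matrix` is likewise an assumption.

```python
from collections import Counter
from typing import Dict, Sequence

LEVELS = ("benign", "grey", "crime")


def transition_matrix(sequences: Sequence[Sequence[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities between disclosure levels."""
    counts: Counter = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = {}
    for a in LEVELS:
        row_total = sum(counts[(a, b)] for b in LEVELS)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0) for b in LEVELS
        }
    return matrix


# Hypothetical per-user sequences of post labels, in posting order.
sequences = [
    ["benign", "grey", "grey", "crime"],
    ["benign", "benign", "grey"],
]
matrix = transition_matrix(sequences)
```

Each row of `matrix` gives the empirical probability of a user's next post level given the current one, which is the quantity needed to describe gradual escalation.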

2603.01590 2026-03-03 cs.IR cs.LG

IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs

Yubin Zhang, Haiming Xu, Guillaume Salha-Galvan, Ruiyan Han, Feiyang Xiao, Yanhua Huang, Li Lin, Yang Luo, Yao Hu

Abstract

Click-through rate (CTR) models in advertising and recommendation systems rely heavily on item ID embeddings, which struggle in item cold-start settings. We present IDProxy, a solution that leverages multimodal large language models (MLLMs) to generate proxy embeddings from rich content signals, enabling effective CTR prediction for new items without usage data. These proxies are explicitly aligned with the existing ID embedding space and are optimized end-to-end under CTR objectives together with the ranking model, allowing seamless integration into existing large-scale ranking pipelines. Offline experiments and online A/B tests demonstrate the effectiveness of IDProxy, which has been successfully deployed in both Content Feed and Display Ads features of Xiaohongshu's Explore Feed, serving hundreds of millions of users daily.

2603.01574 2026-03-03 cs.CR cs.AI

DualSentinel: A Lightweight Framework for Detecting Targeted Attacks in Black-box LLM via Dual Entropy Lull Pattern

Xiaoyi Pang, Xuanyi Hao, Pengyu Liu, Qi Luo, Song Guo, Zhibo Wang

Abstract

Recent intelligent systems integrate powerful Large Language Models (LLMs) through APIs, but their trustworthiness may be critically undermined by targeted attacks like backdoor and prompt injection attacks, which secretly force LLMs to generate specific malicious sequences. Existing defensive approaches for such threats typically rely on high access rights, impose prohibitive costs, and hinder normal inference, rendering them impractical for real-world scenarios. To address these limitations, we introduce DualSentinel, a lightweight and unified defense framework that can accurately and promptly detect the activation of targeted attacks alongside the LLM generation process. We first identify a characteristic of compromised LLMs, termed Entropy Lull: when a targeted attack successfully hijacks the generation process, the LLM exhibits a distinct period of abnormally low and stable token probability entropy, indicating it is following a fixed path rather than making creative choices. DualSentinel leverages this pattern by developing an innovative dual-check approach. It first employs a magnitude and trend-aware monitoring method to proactively and sensitively flag an entropy lull pattern at runtime. Upon such flagging, it triggers a lightweight yet powerful secondary verification based on task-flipping. An attack is confirmed only if the entropy lull pattern persists across both the original and the flipped task, proving that the LLM's output is coercively controlled. Extensive evaluations show that DualSentinel is both highly effective (superior detection accuracy with near-zero false positives) and remarkably efficient (negligible additional cost), offering a truly practical path toward securing deployed LLMs. The source code can be accessed at https://doi.org/10.5281/zenodo.18479273.
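The "abnormally low and stable token probability entropy" signal can be illustrated with a small sketch. The sliding-window rule and the values of `threshold`, `max_spread`, and `window` are hypothetical stand-ins; the abstract does not describe DualSentinel's actual monitoring method or its task-flipping check in enough detail to reproduce them.

```python
import math
from typing import List, Optional, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def find_entropy_lull(entropies: Sequence[float], threshold: float = 0.1,
                      max_spread: float = 0.05, window: int = 4) -> Optional[int]:
    """Return the start index of the first low and stable window, else None.

    A window qualifies when every entropy is below `threshold` (magnitude)
    and the values differ by at most `max_spread` (stability).
    """
    for start in range(len(entropies) - window + 1):
        chunk = entropies[start:start + window]
        if max(chunk) < threshold and max(chunk) - min(chunk) <= max_spread:
            return start
    return None


# Made-up per-token entropies for one generation.
entropies: List[float] = [1.2, 0.9, 0.05, 0.03, 0.04, 0.02, 0.8]
lull_start = find_entropy_lull(entropies)
```

A flag from a rule like this would only be the first of the two checks; per the abstract, an attack is confirmed only if the pattern also persists on a flipped task.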

2603.01570 2026-03-03 cs.DB cs.LG

Adversarial Query Synthesis via Bayesian Optimization

Jeffrey Tao, Yimeng Zeng, Haydn Thomas Jones, Natalie Maus, Osbert Bastani, Jacob R. Gardner, Ryan Marcus

Abstract

Benchmark workloads are extremely important to the database management research community, especially as more machine learning components are integrated into database systems. Here, we propose a Bayesian optimization technique to automatically search for difficult benchmark queries, significantly reducing the amount of manual effort usually required. In preliminary experiments, we show that our approach can generate queries with more than double the optimization headroom compared to existing benchmarks.

2603.01565 2026-03-03 eess.AS cs.SD

Investigating Group Relative Policy Optimization for Diffusion Transformer based Text-to-Audio Generation

Yi Gu, Yanqing Liu, Chen Yang, Sheng Zhao

Abstract

Text-to-audio (T2A) generation has advanced considerably in recent years, yet existing methods continue to face challenges in accurately rendering complex text prompts, particularly those involving intricate audio effects, and achieving precise text-audio alignment. While prior approaches have explored data augmentation, explicit timing conditioning, and reinforcement learning, overall synthesis quality remains constrained. In this work, we experiment with reinforcement learning to further enhance T2A generation quality, building on diffusion transformer (DiT)-based architectures. Our method first employs a large language model (LLM) to generate high-fidelity, richly detailed audio captions, substantially improving text-audio semantic alignment, especially for ambiguous or underspecified prompts. We then apply Group Relative Policy Optimization (GRPO), a recently introduced reinforcement learning algorithm, to fine-tune the T2A model. Through systematic experimentation with diverse reward functions (including CLAP, KL, FAD, and their combinations), we identify the key drivers of effective RL in audio synthesis and analyze how reward design impacts final audio quality. Experimental results demonstrate that GRPO-based fine-tuning yields substantial gains in synthesis fidelity and prompt adherence.
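The "group relative" part of GRPO can be sketched as follows: several outputs are sampled for the same prompt, and each reward is normalized by the group's mean and standard deviation. This shows only the advantage computation, not the policy update; the use of population standard deviation, the `eps` value, and the example rewards are assumptions.

```python
import statistics
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; eps guards against zero
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards (for example CLAP-style scores) for four generated clips.
advantages = group_relative_advantages([1.0, 2.0, 3.0, 4.0])
```

Because the baseline is the group mean, no separate value network is needed, and a group whose samples all receive the same reward contributes zero advantage.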

2603.01517 2026-03-03 cs.AR cs.RO

RoboGPU: Accelerating GPU Collision Detection for Robotics

Lufei Liu, Liwei Xue, Youssef Mohammed, Jocelyn Zhao, Yuan Hsi Chou, Tor M. Aamodt

Abstract

Autonomous robots are increasingly prevalent in our society, emerging in medical care, transportation vehicles, and home assistance. These robots rely on motion planning and collision detection to identify a sequence of movements allowing them to navigate to an end goal without colliding with the surrounding environment. While many specialized accelerators have been proposed to meet the real-time requirements of robotics planning tasks, they often lack the flexibility to adapt to the rapidly changing landscape of robotics and support future advancements. However, GPUs are well-positioned for robotics and we find that they can also tackle collision detection algorithms with enhancements to existing ray tracing accelerator (RTA) units. Unlike intersection tests in ray tracing, collision queries in robotics require control flow mechanisms to avoid unnecessary computations in each query. In this work, we explore and compare different architectural modifications to address the gaps of existing GPU RTAs. Our proposed RoboGPU architecture introduces a RoboCore that computes collision queries 3.1$\times$ faster than RTA implementations and 14.8$\times$ faster than a CUDA baseline. RoboCore is also useful for other robotics tasks, achieving 3.6$\times$ speedup on a state-of-the-art neural motion planner and 1.1$\times$ speedup on Monte Carlo Localization compared to a baseline GPU. RoboGPU matches the performance of dedicated hardware accelerators while being able to adapt to evolving motion planning algorithms and support classical algorithms.

2603.01495 2026-03-03 cs.HC cs.RO

Bimanual XR Specification of Relative and Absolute Assembly Hierarchies for Teleoperation

Benjamin Yang, Xichen He, Charlie Zou, Jen-Shuo Liu, Barbara Tversky, Steven Feiner

Abstract

We present a bimanual XR interaction approach for specifying remote assembly tasks as hierarchies of relative and absolute object constraints that specify high-level teleoperation goals for robots. Grabbing one object in each hand creates a constraint group (visualized as a hull) and groups can be nested into hierarchies. Each group can be relative (with a robot-specifiable 6DoF pose) or absolute (with an author-specified fixed 6DoF pose) in relation to its parent. A relative group specifies a subassembly that can be constructed at a location chosen by the robot software for efficiency rather than mandated by the user.

2603.01494 2026-03-03 cs.SE cs.AI cs.CR cs.LG

Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision

Manisha Mukherjee, Vincent J. Hellendoorn

Comments Accepted at the ICLR 2026 Workshop on Principled Design for Trustworthy AI: Interpretability, Robustness, and Safety Across Modalities

Abstract

Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in security reasoning and brittleness to evolving vulnerability patterns raise critical trustworthiness concerns. Models trained on static datasets cannot readily adapt to newly discovered vulnerabilities or changing security standards without retraining, leading to the repeated generation of unsafe code. We present a principled approach to trustworthy code generation by design that operates as an inference-time safety mechanism. Our approach employs retrieval-augmented generation to surface relevant security risks in generated code and retrieve related security discussions from a curated Stack Overflow knowledge base, which are then used to guide an LLM during code revision. This design emphasizes three aspects relevant to trustworthiness: (1) interpretability, through transparent safety interventions grounded in expert community explanations; (2) robustness, by allowing adaptation to evolving security practices without model retraining; and (3) safety alignment, through real-time intervention before unsafe code reaches deployment. Across real-world and benchmark datasets, our approach improves the security of LLM-generated code compared to prompting alone, while introducing no new vulnerabilities as measured by static analysis. These results suggest that principled, retrieval-augmented inference-time interventions can serve as a complementary mechanism for improving the safety of LLM-based code generation, and highlight the ongoing value of community knowledge in supporting trustworthy AI deployment.

2603.01493 2026-03-03 cs.IR cs.AI cs.CV cs.MM

PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval

Tianyi Xu, Rong Shan, Junjie Wu, Jiadeng Huang, Teng Wang, Jiachen Zhu, Wenteng Chen, Minxin Tu, Quantao Dou, Zhaoxiang Wang, Changwang Zhang, Weinan Zhang, Jun Wang, Jianghao Lin

Comments Under review

Abstract

Personal photo albums are not merely collections of static images but living, ecological archives defined by temporal continuity, social entanglement, and rich metadata, which makes personalized photo retrieval non-trivial. However, existing retrieval benchmarks rely heavily on context-isolated web snapshots, failing to capture the multi-source reasoning required to resolve authentic, intent-driven user queries. To bridge this gap, we introduce PhotoBench, the first benchmark constructed from authentic, personal albums. It is designed to shift the paradigm from visual matching to personalized multi-source intent-driven reasoning. Based on a rigorous multi-source profiling framework, which integrates visual semantics, spatial-temporal metadata, social identity, and temporal events for each image, we synthesize complex intent-driven queries rooted in users' life trajectories. Extensive evaluation on PhotoBench exposes two critical limitations: the modality gap, where unified embedding models collapse on non-visual constraints, and the source fusion paradox, where agentic systems orchestrate tools poorly. These findings indicate that the next frontier in personal multimodal retrieval lies beyond unified embeddings, necessitating robust agentic reasoning systems capable of precise constraint satisfaction and multi-source fusion. Our PhotoBench is available.

2603.01459 2026-03-03 astro-ph.IM cs.LG

PhysFormer: A Physics-Embedded Generative Model for Physically Self-Consistent Spectral Synthesis

Siqi Wang, Mengmeng Zhang, Yude Bu, Chaozhou Mou

Comments 9 pages, 5 figures

Abstract

In scientific and engineering domains, modeling high-dimensional complex systems governed by partial differential equations (PDEs) remains challenging in terms of physical consistency and numerical stability. However, existing approaches, such as physics-informed neural networks (PINNs), typically rely on known physical fields or coefficients and enforce physical constraints via external loss functions, which can lead to training instability and make it difficult to handle high-dimensional or unobservable scenarios. To this end, we propose PhysFormer, a generative modeling framework that is self-consistent at both the data and physical levels. PhysFormer leverages a low-dimensional, physically interpretable latent space to learn key physical quantities directly from data without requiring known high-dimensional physical field parameters, and embeds the physical process of radiative flux generation within the network to ensure the physical consistency of the generated spectra. In high-dimensional, degenerate inversion tasks, PhysFormer constrains generation within physical limits and enhances spectral fidelity and inversion stability under varying signal-to-noise ratios (SNRs). More broadly, this approach shifts the physical processes from external loss functions into the generative mechanism itself, providing a physically consistent generative modeling paradigm for complex systems involving unknown or unobservable physical quantities.

2603.01457 2026-03-03 cs.HC cs.CL

Power Echoes: Investigating Moderation Biases in Online Power-Asymmetric Conflicts

Yaqiong Li, Peng Zhang, Peixu Hou, Kainan Tu, Guangping Zhang, Shan Qu, Wenshi Chen, Yan Chen, Ning Gu, Tun Lu

Comments Accepted at the ACM CHI conference on Human Factors in Computing Systems (ACM CHI 2026)

Abstract

Online power-asymmetric conflicts are prevalent, and most platforms currently rely on human moderators to handle them. Previous studies have investigated human moderation biases in various scenarios, but moderation biases in power-asymmetric conflicts remain unexplored. We therefore investigate the types of power-related biases human moderators exhibit in power-asymmetric conflict moderation (RQ1) and explore the influence of AI suggestions on these biases (RQ2). To this end, we conducted a mixed-design experiment with 50 participants, using real conflicts between consumers and merchants as the scenario. Results suggest several biases toward supporting the powerful party in both moderation modes. AI assistance alleviates most biases of human moderation but also amplifies a few. Based on these results, we offer several insights for future research on human moderation and human-AI collaborative moderation systems for power-asymmetric conflicts.

2603.01449 2026-03-03 eess.IV cs.CV

Revisiting Global Token Mixing in Task-Dependent MRI Restoration: Insights from Minimal Gated CNN Baselines

Xiangjian Hou, Chao Qin, Chang Ni, Xin Wang, Chun Yuan, Xiaodong Ma

Abstract

Global token mixing, implemented via self-attention or state-space sequence models, has become a popular model design choice for MRI restoration. However, MRI restoration tasks differ substantially in how their degradations vary over image and k-space domains, and in the degree to which global coupling is already imposed by physics-driven data consistency terms. In this work, we ask whether global token mixing is actually beneficial in each individual task across three representative settings: accelerated MRI reconstruction with explicit data consistency, MRI super-resolution with k-space center cropping, and denoising of clinical carotid MRI data with spatially heteroscedastic noise. To reduce confounding factors, we establish a controlled testbed comparing a minimal local gated CNN and its large-field variant, benchmarking them directly against state-of-the-art global models under aligned training and evaluation protocols. For accelerated MRI reconstruction, the minimal unrolled gated-CNN baseline is already highly competitive compared to recent token-mixing approaches in public reconstruction benchmarks, suggesting limited additional benefits when the forward model and data-consistency steps provide strong global constraints. For super-resolution, where low-frequency k-space data are largely preserved by the controlled low-pass degradation, local gated models remain competitive, and a lightweight large-field variant yields only modest improvements. In contrast, for denoising with pronounced spatially heteroscedastic noise, token-mixing models achieve the strongest overall performance, consistent with the need to estimate spatially varying reliability. In conclusion, our results demonstrate that the utility of global token mixing in MRI restoration is task-dependent, and it should be tailored to the underlying imaging physics and degradation structure.

2603.01430 2026-03-03 math.OC cs.LG cs.NA cs.SY eess.SY math.NA

On the Stability Connection Between Discrete-Time Algorithms and Their Resolution ODEs: Applications to Min-Max Optimisation

Amir Ali Farzin, Yuen-Man Pun, Philipp Braun, Iman Shames

Abstract

This work establishes a rigorous connection between stability properties of discrete-time algorithms (DTAs) and corresponding continuous-time dynamical systems derived through $ O(s^r) $-resolution ordinary differential equations (ODEs). We show that for discrete- and continuous-time dynamical systems satisfying a mild error assumption, exponential stability of a common equilibrium with respect to the continuous time dynamics implies exponential stability of the corresponding equilibrium for the discrete-time dynamics, provided that the step size is chosen sufficiently small. We extend this result to common compact invariant sets. We prove that if an equilibrium is exponentially stable for the $ O(s^r) $-resolution ODE, then it is also exponentially stable for the associated DTA. We apply this framework to analyse the limit point properties of several prominent optimisation algorithms, including Two-Timescale Gradient Descent--Ascent (TT-GDA), Generalised Extragradient (GEG), Two-Timescale Proximal Point (TT-PPM), Damped Newton (DN), Regularised Damped Newton (RDN), and the Jacobian method (JM), by studying their $ O(1) $- and $ O(s) $-resolution ODEs. We show that under a proper choice of hyperparameters, the set of saddle points of the objective function is a subset of the set of exponentially stable equilibria of GEG, TT-PPM, DN, and RDN. We relax the common Hessian invariance assumption through direct analysis of the resolution ODEs, broadening the applicability of our results. Numerical examples illustrate the theoretical findings.
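The role of the "sufficiently small step size" condition can be seen in a toy experiment. The sketch below assumes the quadratic saddle problem $f(x, y) = \tfrac{1}{2}x^2 + xy - \tfrac{1}{2}y^2$ and plain simultaneous gradient descent-ascent; it is not one of the paper's numerical examples. The associated gradient flow $\dot{x} = -(x + y)$, $\dot{y} = x - y$ has eigenvalues $-1 \pm i$, so the saddle point at the origin is exponentially stable in continuous time, while the discrete iteration scales the distance to the origin by $\sqrt{(1-s)^2 + s^2}$ per step and therefore contracts only for $0 < s < 1$.

```python
import math
from typing import Tuple


def gda_step(x: float, y: float, s: float) -> Tuple[float, float]:
    """One simultaneous gradient descent-ascent step on f = x^2/2 + x*y - y^2/2."""
    grad_x = x + y   # df/dx
    grad_y = x - y   # df/dy
    return x - s * grad_x, y + s * grad_y


def run_gda(s: float, steps: int, x: float = 1.0, y: float = 1.0) -> float:
    """Distance from the saddle point (0, 0) after the given number of steps."""
    for _ in range(steps):
        x, y = gda_step(x, y, s)
    return math.hypot(x, y)


small_step_norm = run_gda(s=0.1, steps=200)   # step size inside the stable range
large_step_norm = run_gda(s=1.5, steps=50)    # step size outside the stable range
```

The same equilibrium is therefore stable or unstable for the discrete algorithm depending only on the step size, which is the gap the stability transfer results address.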

2603.01399 2026-03-03 cs.DC cs.LG

Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification

Guang Huang, Zeyi Wen

Comments 10 pages

Abstract

Speculative Decoding (SD) has emerged as a premier technique for accelerating Large Language Model (LLM) inference by decoupling token generation into rapid drafting and parallel verification. While recent advancements in self-speculation and lookahead decoding have successfully minimized drafting overhead, they have shifted the primary performance bottleneck to the verification phase. Since verification requires a full forward pass of the target model, it remains strictly memory-bandwidth bound, fundamentally limiting the maximum achievable speedup. In this paper, we introduce Quasar (Quantized Self-speculative Acceleration for Rapid Inference), a novel, training-free framework designed to overcome this "memory wall" by employing low-bit quantization specifically for the verification stage. Our empirical analysis reveals that while aggressive structural pruning significantly degrades verification accuracy, quantization-based verification preserves the logit distribution with high fidelity while effectively halving memory traffic. Extensive experiments on state-of-the-art models (e.g., OpenPangu and Qwen3) demonstrate that Quasar maintains a speculative acceptance length comparable to full-precision methods while achieving a $1.28\times$ improvement in end-to-end throughput. Being orthogonal to existing drafting strategies, Quasar offers a generic and efficient pathway to accelerate the verification leg of speculative execution. Code is available at https://github.com/Tom-HG/Quasar.
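The "speculative acceptance length" metric can be made concrete with a minimal sketch of greedy verification: drafted tokens are accepted up to the first position where they disagree with the verifier's own choice. The token IDs and the two verifier outputs below are invented for illustration, and the sketch does not reproduce Quasar's quantized verifier.

```python
from typing import Sequence


def accepted_length(draft_tokens: Sequence[int], verifier_tokens: Sequence[int]) -> int:
    """Count leading draft tokens that match the verifier's greedy choices."""
    n = 0
    for drafted, verified in zip(draft_tokens, verifier_tokens):
        if drafted != verified:
            break
        n += 1
    return n


# Hypothetical token IDs for one drafting round of four tokens.
draft = [5, 7, 9, 2]
full_precision_choice = [5, 7, 9, 4]   # made-up output of a full-precision verifier
quantized_choice = [5, 7, 9, 4]        # made-up output of a low-bit verifier

len_full = accepted_length(draft, full_precision_choice)
len_quant = accepted_length(draft, quantized_choice)
```

If a low-bit verifier agrees with the full-precision one at the positions that matter, the acceptance length is unchanged while each verification pass reads fewer bytes of weights, which is the trade-off the abstract reports.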

2603.01366 2026-03-03 cs.LO cs.CL

NM-DEKL$^3_\infty$: A Three-Layer Non-Monotone Evolving Dependent Type Logic

Peng Chen

Abstract

We present a new dependent type system, NM-DEKL$^3_\infty$ (Non-Monotone Dependent Knowledge-Enhanced Logic), for formalising evolving knowledge in dynamic environments. The system uses a three-layer architecture separating a computational layer, a constructive knowledge layer, and a propositional knowledge layer. We define its syntax and semantics and establish Soundness and Equational Completeness; we construct a syntactic model and prove that it is initial in the category of models, from which equational completeness follows. We also give an embedding into the $μ$-calculus and a strict expressiveness inclusion (including the expressibility of non-bisimulation-invariant properties).

2603.01340 2026-03-03 cs.CR cs.AI cs.LG

SubstratumGraphEnv: Reinforcement Learning Environment (RLE) for Modeling System Attack Paths

Bahirah Adewunmi, Edward Raff, Sanjay Purushotham

Comments Presented at the AI for Cyber Security Workshop at AAAI-26

Abstract

Automating network security analysis, particularly the identification of potential attack paths, presents significant challenges, due in part to the sequential, interconnected, and evolutionary nature of system events, which most artificial intelligence (AI) techniques struggle to model effectively. This paper proposes a Reinforcement Learning (RL) environment generation framework that simulates the sequence of processes executed on a Windows operating system, enabling dynamic modeling of malicious processes on a system. This methodology models operating system state and transitions using a graph representation. This graph is derived from open-source System Monitor (Sysmon) logs. To address the variety in system event types, fields, and log formats, a mechanism was developed to capture and model parent-child processes from Sysmon logs. A Gymnasium environment (SubstratumGraphEnv) was constructed to establish the perceptible basis for an RL environment, and a customized PyTorch interface (SubstratumBridge) was also built to translate Gymnasium graphs into Deep Reinforcement Learning (DRL) observations and discrete actions. Graph Convolutional Networks (GCNs) concretize the graph's local and global state, which feed the distinct policy and critic heads of an Advantage Actor-Critic (A2C) model. This work's central contribution lies in the design of a novel deep graphical RL environment that automates translation of sequential user and system events, furnishing crucial context for cybersecurity analysis. This work provides a foundation for future research into shaping training parameters and advanced reward shaping, while also offering insight into which system event attributes are critical to training autonomous RL agents.
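The parent-child process graph described above can be sketched from process-creation records. Sysmon Event ID 1 (process creation) carries `ProcessGuid` and `ParentProcessGuid` fields; the dictionary record format, the shortened GUID placeholders, and the function name below are assumptions for illustration and do not reflect the SubstratumGraphEnv implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def build_process_graph(events: Iterable[Mapping[str, object]]) -> Dict[str, List[str]]:
    """Map each parent process GUID to the GUIDs of the processes it created."""
    children: Dict[str, List[str]] = defaultdict(list)
    for event in events:
        if event.get("EventID") != 1:  # keep only process-creation events
            continue
        parent = str(event["ParentProcessGuid"])
        child = str(event["ProcessGuid"])
        children[parent].append(child)
    return dict(children)


# Hypothetical, already-parsed log records with placeholder GUIDs.
events = [
    {"EventID": 1, "ProcessGuid": "B", "ParentProcessGuid": "A", "Image": "cmd.exe"},
    {"EventID": 3, "ProcessGuid": "B"},  # network connection event, ignored here
    {"EventID": 1, "ProcessGuid": "C", "ParentProcessGuid": "A", "Image": "notepad.exe"},
    {"EventID": 1, "ProcessGuid": "D", "ParentProcessGuid": "B", "Image": "powershell.exe"},
]
graph = build_process_graph(events)
```

An adjacency structure like this is the kind of graph an RL environment could expose as state, with edges followed as candidate transitions.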

2603.01339 2026-03-03 stat.ML cs.LG

Causal Effects with Unobserved Unit Types in Interacting Human-AI Systems

William Overman, Sadegh Shirani, Mohsen Bayati

Abstract

We study experiments on interacting populations of humans and AI agents, where both unit types and the interaction network remain unobserved. Although causal effects propagate throughout the system, the goal is to estimate effects on humans. Examples include online platforms where human users interact alongside AI-driven accounts. We assume a human-AI prior that gives each unit a probability of being human. While humans cannot be distinguished at the unit level, the prior allows us to compute the average human composition within large subpopulations. We then model outcome dynamics through a causal message passing (CMP) framework and analyze sample-mean outcomes across subpopulations. We show that by constructing subpopulations that vary in expected human composition and treatment exposure, one can consistently recover human-specific causal effects. Our results characterize when distributional knowledge of population composition (without observing unit types or the interaction network) is sufficient for identification. We validate the approach on a simulated human-AI platform driven by behaviorally differentiated LLM agents. Together, these results provide a theoretical and practical framework for experimentation in emerging human-AI systems.

2603.01337 2026-03-03 stat.ML cs.LG

Adaptive Estimation and Inference in Conditional Moment Models via the Discrepancy Principle

Jiyuan Tan, Vasilis Syrgkanis

English abstract

We study adaptive estimation and inference in ill-posed linear inverse problems defined by conditional moment restrictions. Existing regularized estimators such as Regularized DeepIV (RDIV) require prior knowledge of the smoothness of the nuisance function, typically encoded by a beta source condition to tune their regularization parameters. In practice, this smoothness is unknown, and misspecified hyperparameters can lead to suboptimal convergence or instability. We introduce a discrepancy-principle-based framework for adaptive hyperparameter selection that automatically balances bias and variance without relying on the unknown smoothness parameter. Our framework applies to both RDIV (Li et al. [2024]) and the Tikhonov Regularized Adversarial Estimator (TRAE) (Bennett et al. [2023a]) and achieves the same rates in both weak and strong metrics. Building on this, we construct a fully adaptive doubly robust estimator for linear functionals that attains the optimal rate of the better-conditioned primal or dual problem, providing a practical, theoretically grounded approach for adaptive inference in ill-posed econometric models.
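
For readers unfamiliar with the discrepancy principle, the following minimal sketch shows its classical (Morozov) form for plain Tikhonov regularization on a synthetic linear inverse problem. It is not the RDIV or TRAE procedure from the paper; the operator, the noise level `delta` (assumed known here), and the constant `tau` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Ill-conditioned operator built from an assumed singular value decay
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sing = np.logspace(0, -3, n)
A = U @ np.diag(sing) @ V.T

x_true = V @ (sing ** 0.5)             # assumed target
noise = 1e-2 * rng.standard_normal(n)
y = A @ x_true + noise
delta = np.linalg.norm(noise)          # noise level, assumed known
tau = 1.1                              # safety factor slightly above 1


def tikhonov(alpha):
    """Tikhonov-regularized solution for a given regularization strength."""
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)


# Scan from heavy to light regularization and stop at the first alpha
# whose residual falls to the noise level; no smoothness parameter is used.
alphas = np.logspace(0, -10, 41)
chosen = None
for alpha in alphas:
    if np.linalg.norm(A @ tikhonov(alpha) - y) <= tau * delta:
        chosen = alpha
        break
```

The rule picks the largest regularization strength consistent with the noise level, which is how it balances bias against variance without prior knowledge of the solution's smoothness.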

2603.01325 2026-03-03 physics.soc-ph cs.LG

From GEV to ResLogit: Spatially Correlated Discrete Choice Models for Pedestrian Movement Prediction

Rulla Al-Haideri, Bilal Farooq

English abstract

High frequency pedestrian motion forecasting when interacting with autonomous vehicles (AVs) can be enhanced through the use of behavioural frameworks, such as discrete choice models, that can explicitly account for correlation among similar movement alternatives. We formulate the pedestrian next step choice as a spatial discrete choice defined by a grid of speed adjustment and heading change. Using naturalistic pedestrian-AV encounters from nuScenes and Argoverse 2 (1 sec decision interval), we estimate a multinomial logit baseline and four spatial generalized extreme value (GEV) specifications (SCL, GSCL, SCNL, and GSCNL). We then compare them to a residual neural network logit (ResLogit) model that learns cross alternative effects while retaining an interpretable linear utility component. Across the evaluated data, spatial GEV structures yield only marginal improvements over multinomial logit, whereas ResLogit achieves a substantially better fit and produces behaviourally coherent errors concentrated among neighbouring grid cells. The results suggest that in dense, high frequency spatial choice sets, learning based residual corrections can capture proximity induced correlation more effectively than analyst specified GEV nesting structures, while maintaining interpretability.
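
The multinomial logit baseline over a speed-by-heading grid can be sketched as follows. The 3 x 3 grid, the utility specification, and the coefficient values are illustrative assumptions and are not the paper's estimated model.

```python
import numpy as np

# Hypothetical 3 x 3 choice grid:
# speed adjustment (slow down, keep, speed up) x heading change (left, straight, right)
speed_change = np.array([-1.0, 0.0, 1.0])
heading_change = np.array([-1.0, 0.0, 1.0])
S, H = np.meshgrid(speed_change, heading_change, indexing="ij")

# Assumed linear utility that penalizes large adjustments (illustrative only)
beta_speed, beta_heading = -0.8, -1.2
utility = beta_speed * np.abs(S) + beta_heading * np.abs(H)

# Multinomial logit probabilities: softmax of utilities over all grid cells
expu = np.exp(utility - utility.max())   # subtract the max for numerical stability
prob = expu / expu.sum()
```

Plain multinomial logit treats all cells as independent alternatives; the spatial GEV and ResLogit variants described above add correlation between neighbouring cells on top of this baseline.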

2603.01306 2026-03-03 math.OC cs.LG

GPU-friendly and Linearly Convergent First-order Methods for Certifying Optimal $k$-sparse GLMs

Jiachang Liu, Andrea Lodi, Soroosh Shafiee

Comments Extended version of the ICML 2025 conference paper

English abstract

We investigate the problem of certifying optimality for sparse generalized linear models (GLMs), where sparsity is enforced through a cardinality constraint. While Branch-and-Bound (BnB) frameworks can certify optimality using perspective relaxations, existing methods for solving these relaxations are computationally intensive, limiting their scalability. To address this challenge, we reformulate the relaxations as composite optimization problems and develop a unified proximal framework that is both linearly convergent and computationally efficient. Under specific geometric regularity conditions, our analysis links primal quadratic growth to dual quadratic decay, yielding error bounds that make the Fenchel duality gap a sharp proxy for progress towards the solution set. This leads to a duality gap-based restart scheme that upgrades a broad class of sublinear proximal methods to provably linearly convergent methods, and applies beyond the sparse GLM setting. For the implicit perspective regularizer, we further derive specialized routines to evaluate the regularizer and its proximal operator exactly in log-linear time, avoiding costly generic conic solvers. The resulting iterations are dominated by matrix-vector multiplications, which enables GPU acceleration. Experiments on synthetic and real-world datasets show orders-of-magnitude faster dual-bound computations and substantially improved BnB scalability on large instances.

2603.01241 2026-03-03 cs.IR cs.AI

TARSE: Test-Time Adaptation via Retrieval of Skills and Experience for Reasoning Agents

Junda Wang, Zonghai Tao, Hansi Zeng, Zhichao Yang, Hamed Zamani, Hong Yu

English abstract

Complex clinical decision making often fails not because a model lacks facts, but because it cannot reliably select and apply the right procedural knowledge and the right prior example at the right reasoning step. We frame clinical question answering as an agent problem with two explicit, retrievable resources: skills, reusable clinical procedures such as guidelines, protocols, and pharmacologic mechanisms; and experience, verified reasoning trajectories from previously solved cases (e.g., chain-of-thought solutions and their step-level decompositions). At test time, the agent retrieves both relevant skills and experiences from curated libraries and performs lightweight test-time adaptation to align the language model's intermediate reasoning with clinically valid logic. Concretely, we build (i) a skills library from guideline-style documents organized as executable decision rules, (ii) an experience library of exemplar clinical reasoning chains indexed by step-level transitions, and (iii) a step-aware retriever that selects the most useful skill and experience items for the current case. We then adapt the model on the retrieved items to reduce instance-step misalignment and to prevent reasoning from drifting toward unsupported shortcuts. Experiments on medical question-answering benchmarks show consistent gains over strong medical RAG baselines and prompting-only reasoning methods. Our results suggest that explicitly separating and retrieving clinical skills and experience, and then aligning the model at test time, is a practical approach to more reliable medical agents.

2603.01222 2026-03-03 cs.IT cs.AI math.IT

Communication-Efficient Quantum Federated Learning over Large-Scale Wireless Networks

Shaba Shaon, Christopher G. Brinton, Dinh C. Nguyen

Comments 21 pages, accepted at IEEE Transactions on Networking

English abstract

Quantum federated learning (QFL) combines the robust data processing of quantum computing with the privacy-preserving features of federated learning (FL). However, in large-scale wireless networks, optimizing sum-rate is crucial for unlocking the true potential of QFL, facilitating effective model sharing and aggregation as devices compete for limited bandwidth amid dynamic channel conditions and fluctuating power resources. This paper studies a novel sum-rate maximization problem within a multi-channel QFL framework, specifically designed for non-orthogonal multiple access (NOMA)-based large-scale wireless networks. We develop a sum-rate maximization problem by jointly considering quantum devices' channel selection and transmit power. Our formulated problem is a non-convex, mixed-integer nonlinear programming (MINLP) challenge that remains non-deterministic polynomial time (NP)-hard even with specified channel selection parameters. The complexity of the problem motivates us to create an effective iterative optimization approach that utilizes the sophisticated quantum approximate optimization algorithm (QAOA) to derive high-quality approximate solutions. Additionally, our study presents the first theoretical exploration of QFL convergence properties under full device participation, rigorously analyzing real-world scenarios with nonconvex loss functions, diverse data distributions, and the effects of quantum shot noise. Extensive simulation results indicate that our multi-channel NOMA-based QFL framework enhances model training and convergence behavior, surpassing conventional algorithms in terms of accuracy and loss. Moreover, our quantum-centric joint optimization approach achieves more than a 100% increase in sum-rate while ensuring rapid convergence, significantly outperforming the state of the art.
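
The quantity being maximized can be made concrete with a small sketch of the sum-rate on a single shared channel. It assumes an uplink NOMA model with successive interference cancellation (SIC) at the receiver, which may differ from the paper's system model; the powers, gains, and noise value are hypothetical.

```python
import numpy as np

# Assumed values for one channel shared by three devices (illustrative only)
power = np.array([0.2, 0.1, 0.05])     # transmit powers in watts
gain = np.array([1.0, 0.5, 0.25])      # channel power gains
noise = 0.01                           # noise power in watts

received = power * gain
order = np.argsort(-received)          # decode the strongest received signal first

rates = np.zeros_like(received)
for idx, k in enumerate(order):
    # Signals decoded later still act as interference; earlier ones are cancelled
    interference = received[order[idx + 1:]].sum()
    rates[k] = np.log2(1.0 + received[k] / (noise + interference))

sum_rate = rates.sum()                 # spectral efficiency in bit/s/Hz
```

Under this model the per-device rates telescope, so the sum-rate equals `log2(1 + received.sum() / noise)` regardless of decoding order. The joint problem becomes hard once devices must also be assigned to channels, which is the discrete part of the MINLP.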

2603.01170 2026-03-03 cs.CR cs.AI

ATLAS: AI-Assisted Threat-to-Assertion Learning for System-on-Chip Security Verification

Ishraq Tashdid, Kimia Tasnia, Alexander Garcia, Jonathan Valamehr, Sazadur Rahman

Comments Accepted at the 63rd Design Automation Conference (DAC 2026), Long Beach, CA, USA (July, 2026)

English abstract

This work presents ATLAS, an LLM-driven framework that bridges standardized threat modeling and property-based formal verification for System-on-Chip (SoC) security. Starting from vulnerability knowledge bases such as Common Weakness Enumeration (CWE), ATLAS identifies SoC-specific assets, maps relevant weaknesses, and generates assertion-based security properties and JasperGold scripts for verification. By combining asset-centric analysis with standardized threat model templates and multi-source SoC context, ATLAS automates the transformation from vulnerability reasoning to formal proof. Evaluated on three HACK@DAC benchmarks, ATLAS detected 39/48 CWEs and generated correct properties for 33 of those bugs, advancing automated, knowledge-driven SoC security verification toward a secure-by-design paradigm.

2603.01119 2026-03-03 stat.ME cs.AI

Robust Weighted Triangulation of Causal Effects Under Model Uncertainty

Rohit Bhattacharya, Ina Ocelli, Ted Westling

Comments 17 pages

English abstract

A fundamental challenge in causal inference with observational data is correct specification of a causal model. When there is model uncertainty, analysts may seek to use estimates from multiple candidate models that rely on distinct, and possibly partially overlapping, sets of identifying assumptions to infer the causal effect, a process known as triangulation. Principled methods for triangulation, however, remain underdeveloped. Here, we develop a framework for causal effect triangulation that combines model testability methods from causal discovery with statistical inference methods from semiparametric theory, while avoiding explicit model selection and post-selection inference problems. We propose a triangulation functional that combines identified functionals from each model with data-driven measures of model validity. We provide a bound on the distance of the functional from the true causal effect along with conditions under which this distance can be taken to zero. Finally, we derive valid statistical inference for this functional. Our framework formalizes robustness under causal pluralism without requiring agreement across models or commitment to a single specification. We demonstrate its performance through simulations and an empirical application.

2603.01104 2026-03-03 cs.HC cs.AI cs.CV cs.CY

Egocentric Co-Pilot: Web-Native Smart-Glasses Agents for Assistive Egocentric AI

Sicheng Yang, Yukai Huang, Weitong Cai, Shitong Sun, Fengyi Fang, You He, Yiqiao Xie, Jiankang Deng, Hang Zhang, Jifei Song, Zhensong Zhang

Comments 14 pages, 6 figures, WWW 2026

English abstract

What if accessing the web did not require a screen, a stable desk, or even free hands? For people navigating crowded cities, living with low vision, or experiencing cognitive overload, smart glasses coupled with AI agents could turn the web into an always-on assistive layer over daily life. We present Egocentric Co-Pilot, a web-native neuro-symbolic framework that runs on smart glasses and uses a Large Language Model (LLM) to orchestrate a toolbox of perception, reasoning, and web tools. An egocentric reasoning core combines Temporal Chain-of-Thought with Hierarchical Context Compression to support long-horizon question answering and decision support over continuous first-person video, far beyond a single model's context window. Additionally, a lightweight multimodal intent layer maps noisy speech and gaze into structured commands. We further implement and evaluate a cloud-native WebRTC pipeline integrating streaming speech, video, and control messages into a unified channel for smart glasses and browsers. In parallel, we deploy an on-premise WebSocket baseline, exposing concrete trade-offs between local inference and cloud offloading in terms of latency, mobility, and resource use. Experiments on Egolife and HD-EPIC demonstrate competitive or state-of-the-art egocentric QA performance, and a human-in-the-loop study on smart glasses shows higher task completion and user satisfaction than leading commercial baselines. Taken together, these results indicate that web-connected egocentric co-pilots can be a practical path toward more accessible, context-aware assistance in everyday life. By grounding operation in web-native communication primitives and modular, auditable tool use, Egocentric Co-Pilot offers a concrete blueprint for assistive, always-on web agents that support education, accessibility, and social inclusion for people who may benefit most from contextual, egocentric AI.

2603.01102 2026-03-03 physics.flu-dyn cs.LG cs.NA math.NA

Structure-preserving Randomized Neural Networks for Incompressible Magnetohydrodynamics Equations

Yunlong Li, Fei Wang, Lingxiao Li

English abstract

The incompressible magnetohydrodynamic (MHD) equations are fundamental in many scientific and engineering applications. However, their strong nonlinearity and dual divergence-free constraints make them highly challenging for conventional numerical solvers. To overcome these difficulties, we propose a Structure-Preserving Randomized Neural Network (SP-RaNN) that automatically and exactly satisfies the divergence-free conditions. Unlike deep neural network (DNN) approaches that rely on expensive nonlinear and nonconvex optimization, SP-RaNN reformulates the training process into a linear least-squares system, thereby eliminating nonconvex optimization. The method linearizes the governing equations through Picard or Newton iterations, discretizes them at collocation points within the domain and on the boundaries using finite-difference schemes, and solves the resulting linear system via a linear least-squares procedure. By design, SP-RaNN preserves the intrinsic mathematical structure of the equations within a unified space-time framework, ensuring both stability and accuracy. Numerical experiments on the Navier-Stokes, Maxwell, and MHD equations demonstrate that SP-RaNN achieves higher accuracy, faster convergence, and exact enforcement of divergence-free constraints compared with both traditional numerical methods and DNN-based approaches. This structure-preserving framework provides an efficient and reliable tool for solving complex PDE systems while rigorously maintaining their underlying physical laws.
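
The two ingredients named above, fixed random hidden weights with a linear least-squares solve and exact divergence-free structure, can be illustrated with a minimal 2D sketch. It assumes a stream-function construction, tanh features, analytic derivatives instead of the finite-difference schemes mentioned in the abstract, and a simple regression target rather than a PDE residual. The network size and target field are hypothetical, and this is not the authors' SP-RaNN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 200                                               # random hidden neurons (assumption)
a, b, d = (rng.uniform(-3, 3, m) for _ in range(3))   # fixed, never trained


def velocity_features(x, y):
    """Per-feature velocity (u, v) = (d psi/dy, -d psi/dx) for psi_j = tanh(a x + b y + d).

    Any linear combination satisfies du/dx + dv/dy = psi_yx - psi_xy = 0 exactly.
    """
    z = np.outer(x, a) + np.outer(y, b) + d
    sech2 = 1.0 - np.tanh(z) ** 2
    return sech2 * b, -sech2 * a


# Collocation points and an assumed divergence-free target field
x, y = rng.uniform(0, 1, 400), rng.uniform(0, 1, 400)
u_true = np.pi * np.sin(np.pi * x) * np.cos(np.pi * y)
v_true = -np.pi * np.cos(np.pi * x) * np.sin(np.pi * y)

# Training reduces to one linear least-squares solve for the output weights
Fu, Fv = velocity_features(x, y)
A = np.vstack([Fu, Fv])
rhs = np.concatenate([u_true, v_true])
c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

rel_err = np.linalg.norm(A @ c - rhs) / np.linalg.norm(rhs)
```

Because only the output weights `c` are unknown, no nonconvex optimization is involved. For a nonlinear PDE the same solve would be repeated inside a Picard or Newton linearization loop.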