arXivDaily arXiv每日学术速递 周一至周五更新
全部学科分类 2393
专题追踪
2602.06797 2026-02-17 stat.ML cs.LG

Optimal Learning-Rate Schedules under Functional Scaling Laws: Power Decay and Warmup-Stable-Decay

Binghui Li, Zilin Wang, Fengling Chen, Shiyang Zhao, Ruiheng Zheng, Lei Wu

详情
英文摘要

We study optimal learning-rate schedules (LRSs) under the functional scaling law (FSL) framework introduced in Li et al. (2025), which accurately models the loss dynamics of both linear regression and large language model (LLM) pre-training. Within FSL, loss dynamics are governed by two exponents: a source exponent $s>0$ controlling the rate of signal learning, and a capacity exponent $β>1$ determining the rate of noise forgetting. Focusing on a fixed training horizon $N$, we derive the optimal LRSs and reveal a sharp phase transition. In the easy-task regime $s \ge 1 - 1/β$, the optimal schedule follows a power decay to zero, $η^*(z) = η_{\mathrm{peak}}(1 - z/N)^{2β- 1}$, where the peak learning rate scales as $η_{\mathrm{peak}} \eqsim N^{-ν}$ for an explicit exponent $ν= ν(s,β)$. In contrast, in the hard-task regime $s < 1 - 1/β$, the optimal LRS exhibits a warmup-stable-decay (WSD) (Hu et al. (2024)) structure: it maintains the largest admissible learning rate for most of training and decays only near the end, with the decay phase occupying a vanishing fraction of the horizon. We further analyze optimal shape-fixed schedules, where only the peak learning rate is tuned -- a strategy widely adopted in practiceand characterize their strengths and intrinsic limitations. This yields a principled evaluation of commonly used schedules such as cosine and linear decay. Finally, we apply the power-decay LRS to one-pass stochastic gradient descent (SGD) for kernel regression and show the last iterate attains the exact minimax-optimal rate, eliminating the logarithmic suboptimality present in prior analyses. Numerical experiments corroborate our theoretical predictions.

2602.06374 2026-02-17 math.FA cs.LG

A Multiplicative Neural Network Architecture: Locality and Regularity of Approximation

Hee-Sun Choi, Beom-Seok Han

Comments 22 pages, Minor typographical corrections

详情
英文摘要

We introduce a multiplicative neural network architecture in which multiplicative interactions constitute the fundamental representation, rather than appearing as auxiliary components within an additive model. We establish a universal approximation theorem for this architecture and analyze its approximation properties in terms of locality and regularity in Bessel potential spaces. To complement the theoretical results, we conduct numerical experiments on representative targets exhibiting sharp transition layers or pointwise loss of higher-order regularity. The experiments focus on the spatial structure of approximation errors and on regularity-sensitive quantities, in particular, the convergence of Zygmund-type seminorms. The results show that the proposed multiplicative architecture yields residual error structures that are more tightly aligned with regions of reduced regularity and exhibit more stable convergence in regularity-sensitive metrics. These results demonstrate that adopting a multiplicative representation format has concrete implications for the localization and regularity behavior of neural network approximations, providing a direct connection between architectural design and analytical properties of the approximating functions.

2602.00220 2026-02-17 eess.IV cs.CV

Deep learning Based Correction Algorithms for 3D Medical Reconstruction in Computed Tomography and Macroscopic Imaging

Tomasz Les, Tomasz Markiewicz, Malgorzata Lorent, Miroslaw Dziekiewicz, Krzysztof Siwek

Comments 23 pages, 9 figures, submitted to Applied Sciences (MDPI)

详情
英文摘要

This paper introduces a hybrid two-stage registration framework for reconstructing three-dimensional (3D) kidney anatomy from macroscopic slices, using CT-derived models as the geometric reference standard. The approach addresses the data-scarcity and high-distortion challenges typical of macroscopic imaging, where fully learning-based registration (e.g., VoxelMorph) often fails to generalize due to limited training diversity and large nonrigid deformations that exceed the capture range of unconstrained convolutional filters. In the proposed pipeline, the Optimal Cross-section Matching (OCM) algorithm first performs constrained global alignment: translation, rotation, and uniform scaling to establish anatomically consistent slice initialization. Next, a lightweight deep-learning refinement network, inspired by VoxelMorph, predicts residual local deformations between consecutive slices. The core novelty of this architecture lies in its hierarchical decomposition of the registration manifold. This hybrid OCM+DL design integrates explicit geometric priors with the flexible learning capacity of neural networks, ensuring stable optimization and plausible deformation fields even with few training examples. Experiments on an original dataset of 40 kidneys demonstrated better results compared to single-stage baselines. The pipeline maintains physical calibration via Hough-based grid detection and employs Bezier-based contour smoothing for robust meshing and volume estimation. Although validated on kidney data, the proposed framework generalizes to other soft-tissue organs reconstructed from optical or photographic cross-sections. By decoupling interpretable global optimization from data-efficient deep refinement, the method advances the precision, reproducibility, and anatomical realism of multimodal 3D reconstructions for surgical planning, morphological assessment, and medical education.

2601.20538 2026-02-17 cs.MA cs.AI

Interpreting Emergent Extreme Events in Multi-Agent Systems

Ling Tang, Jilin Mei, Dongrui Liu, Chen Qian, Dawei Cheng, Jing Shao, Xia Hu

Comments 8 pages, 5 figures

详情
英文摘要

Large language model-powered multi-agent systems have emerged as powerful tools for simulating complex human-like systems. The interactions within these systems often lead to extreme events whose origins remain obscured by the black box of emergence. Interpreting these events is critical for system safety. This paper proposes the first framework for explaining emergent extreme events in multi-agent systems, aiming to answer three fundamental questions: When does the event originate? Who drives it? And what behaviors contribute to it? Specifically, we adapt the Shapley value to faithfully attribute the occurrence of extreme events to each action taken by agents at different time steps, i.e., assigning an attribution score to the action to measure its influence on the event. We then aggregate the attribution scores along the dimensions of time, agent, and behavior to quantify the risk contribution of each dimension. Finally, we design a set of metrics based on these contribution scores to characterize the features of extreme events. Experiments across diverse multi-agent system scenarios (economic, financial, and social) demonstrate the effectiveness of our framework and provide general insights into the emergence of extreme phenomena.

2601.12991 2026-02-17 cs.HC cs.CL

RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems

Haoyu Tian, Yingchaojie Feng, Zhen Wen, Haoxuan Li, Minfeng Zhu, Wei Chen

Comments 11 pages, 7 figures. Accepted to IEEE TVCG (PacificVis 2026)

详情
英文摘要

The advent of Retrieval-Augmented Generation (RAG) has significantly enhanced the ability of Large Language Models (LLMs) to produce factually accurate and up-to-date responses. However, the performance of a RAG system is not determined by a single component but emerges from a complex interplay of modular choices, such as embedding models and retrieval algorithms. This creates a vast and often opaque configuration space, making it challenging for developers to understand performance trade-offs and identify optimal designs. To address this challenge, we present RAGExplorer, a visual analytics system for the systematic comparison and diagnosis of RAG configurations. RAGExplorer guides users through a seamless macro-to-micro analytical workflow. Initially, it empowers developers to survey the performance landscape across numerous configurations, allowing for a high-level understanding of which design choices are most effective. For a deeper analysis, the system enables users to drill down into individual failure cases, investigate how differences in retrieved information contribute to errors, and interactively test hypotheses by manipulating the provided context to observe the resulting impact on the generated answer. We demonstrate the effectiveness of RAGExplorer through detailed case studies and user studies, validating its ability to empower developers in navigating the complex RAG design space. Our code and user guide are publicly available at https://github.com/Thymezzz/RAGExplorer.

2512.22136 2026-02-17 cs.DC cs.CV

SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware

Mahadev Sunil Kumar, Arnab Raha, Debayan Das, Gopakumar G, Rounak Chatterjee, Amitava Mukherjee

详情
英文摘要

Distributed deep neural networks (DNNs) have become central to modern computer vision, yet their deployment on resource-constrained edge devices remains hindered by substantial parameter counts, computational demands, and the probability of device failure. Here, we present an approach to the efficient deployment of distributed DNNs that jointly respect hardware limitations, preserve task performance, and remain robust to partial system failures. Our method integrates structured model pruning with a multi-objective optimization framework to tailor network capacity for heterogeneous device constraints, while explicitly accounting for device availability and failure probability during deployment. We demonstrate this framework using Multi-View Convolutional Neural Networks (MVCNN), a state-of-the-art architecture for 3D object recognition, by quantifying the contribution of individual views to classification accuracy and allocating pruning budgets accordingly. Experimental results show that the resulting models satisfy user-specified bounds on accuracy and memory footprint, even under multiple simultaneous device failures. The inference time is reduced by factors up to 4.7x across diverse simulated device configurations. These findings suggest that performance-aware, view-adaptive, and failure-resilient compression provides a viable pathway for deploying complex vision models in distributed edge environments.

2512.16051 2026-02-17 astro-ph.IM cs.LG

Graph Neural Networks for Interferometer Simulations

Sidharth Kannan, Pooyan Goodarzi, Evangelos E. Papalexakis, Jonathan W. Richardson

Comments 11 pages, 4 figures, Accepted and Presented to the 39th Conference on Neural Information Processing Systems (NeurIPS 2025): AI for Science Workshop

详情
英文摘要

In recent years, graph neural networks (GNNs) have shown tremendous promise in solving problems in high energy physics, materials science, and fluid dynamics. In this work, we introduce a new application for GNNs in the physical sciences: instrumentation design. As a case study, we apply GNNs to simulate models of the Laser Interferometer Gravitational-Wave Observatory (LIGO) and show that they are capable of accurately capturing the complex optical physics at play, while achieving runtimes 815 times faster than state of the art simulation packages. We discuss the unique challenges this problem provides for machine learning models. In addition, we provide a dataset of high-fidelity optical physics simulations for three interferometer topologies, which can be used as a benchmarking suite for future work in this direction.

2512.13697 2026-02-17 cs.CY cs.AI cs.CL

Writing in Symbiosis: Mapping Human Creative Agency in the AI Era

Vivan Doshi, Mengyuan Li

Comments Advances in Neural Information Processing Systems (NeurIPS 2025)

详情
英文摘要

The proliferation of Large Language Models (LLMs) raises a critical question about what it means to be human when we share an increasingly symbiotic relationship with persuasive and creative machines. This paper examines patterns of human-AI coevolution in creative writing, investigating how human craft and agency are adapting alongside machine capabilities. We challenge the prevailing notion of stylistic homogenization by examining diverse patterns in longitudinal writing data. Using a large-scale corpus spanning the pre- and post-LLM era, we observe patterns suggestive of a "Dual-Track Evolution": thematic convergence around AI-related topics, coupled with structured stylistic differentiation. Our analysis reveals three emergent adaptation patterns: authors showing increased similarity to AI style, those exhibiting decreased similarity, and those maintaining stylistic stability while engaging with AI-related themes. This Creative Archetype Map illuminates how authorship is coevolving with AI, contributing to discussions about human-AI collaboration, detection challenges, and the preservation of creative diversity.

2511.20974 2026-02-17 eess.AS cs.CL cs.LG

RosettaSpeech: Zero-Shot Speech-to-Speech Translation without Parallel Speech

Zhisheng Zheng, Xiaohang Sun, Tuan Dinh, Abhishek Yanamandra, Abhinav Jain, Zhu Liu, Sunil Hadap, Vimal Bhat, Manoj Aggarwal, Gerard Medioni, David Harwath

Comments 12 pages, 4 figures

详情
英文摘要

End-to-end speech-to-speech translation (S2ST) systems typically struggle with a critical data bottleneck: the scarcity of parallel speech-to-speech corpora. To overcome this, we introduce RosettaSpeech, a novel zero-shot framework trained exclusively on monolingual speech-text data augmented by machine translation supervision. Unlike prior works that rely on complex cascaded pseudo-labeling, our approach strategically utilizes text as a semantic bridge during training to synthesize translation targets, thereby eliminating the need for parallel speech pairs while maintaining a direct, end-to-end inference pipeline. Empirical evaluations on the CVSS-C benchmark demonstrate that RosettaSpeech achieves state-of-the-art zero-shot performance, surpassing leading baselines by significant margins - achieving ASR-BLEU scores of 25.17 for German-to-English (+27% relative gain) and 29.86 for Spanish-to-English (+14%). Crucially, our model effectively preserves the source speaker's voice without ever seeing paired speech data. We further analyze the impact of data scaling and demonstrate the model's capability in many-to-one translation, offering a scalable solution for extending high-quality S2ST to "text-rich, speech-poor" languages.

2511.19628 2026-02-17 stat.ML cs.LG stat.CO

Optimization and Regularization Under Arbitrary Objectives

Jared N. Lakhani, Etienne Pienaar

Comments 74 pages, 29 figures, 16 tables

详情
英文摘要

This study investigates the limitations of applying Markov Chain Monte Carlo (MCMC) methods to arbitrary objective functions, focusing on a two-block MCMC framework which alternates between Metropolis-Hastings and Gibbs sampling. While such approaches are often considered advantageous for enabling data-driven regularization, we show that their performance critically depends on the sharpness of the employed likelihood form. By introducing a sharpness parameter and exploring alternative likelihood formulations proportional to the target objective function, we demonstrate how likelihood curvature governs both in-sample performance and the degree of regularization inferred by the training data. Empirical applications are conducted on reinforcement learning tasks: including a navigation problem and the game of tic-tac-toe. The study concludes with a separate analysis examining the implications of extreme likelihood sharpness on arbitrary objective functions stemming from the classic game of blackjack, where the first block of the two-block MCMC framework is replaced with an iterative optimization step. The resulting hybrid approach achieves performance nearly identical to the original MCMC framework, indicating that excessive likelihood sharpness effectively collapses posterior mass onto a single dominant mode.

2511.07293 2026-02-17 cs.LO cs.AI cs.CV

Formal Reasoning About Confidence and Automated Verification of Neural Networks

Mohammad Afzal, S. Akshay, Blaise Genest, Ashutosh Gupta

详情
英文摘要

In the last decade, a large body of work has emerged on robustness of neural networks, i.e., checking if the decision remains unchanged when the input is slightly perturbed. However, most of these approaches ignore the confidence of a neural network on its output. In this work, we aim to develop a generalized framework for formally reasoning about the confidence along with robustness in neural networks. We propose a simple yet expressive grammar that captures various confidence-based specifications. We develop a novel and unified technique to verify all instances of the grammar in a homogeneous way, viz., by adding a few additional layers to the neural network, which enables the use any state-of-the-art neural network verification tool. We perform an extensive experimental evaluation over a large suite of 8870 benchmarks, where the largest network has 138M parameters, and show that this outperforms ad-hoc encoding approaches by a significant margin.

2511.01228 2026-02-17 cs.SI cs.AI

Importance Ranking in Complex Networks via Influence-aware Causal Node Embedding

Jiahui Gao, Kuang Zhou, Yuchen Zhu, Keyu Wu

详情
英文摘要

Understanding and quantifying node importance is a fundamental problem in network science and engineering, underpinning a wide range of applications such as influence maximization, social recommendation, and network dismantling. Prior research often relies on centrality measures or advanced graph embedding techniques using structural information, followed by downstream classification or regression tasks to identify critical nodes. However, these approaches typically decouple node representation learning from the ranking objective and depend heavily on the topological structure of target networks, leading to feature-task inconsistency and poor cross-network generalization. This paper proposes a novel framework that leverages causal representation learning to obtain robust and invariant node embeddings for cross-network ranking tasks. Specifically, we introduce an influence-aware causal node embedding module within an autoencoder architecture to extract node embeddings that are causally related to node importance. Furthermore, we design a unified optimization framework incorporating a causal ranking loss that jointly optimizes reconstruction and ranking objectives, thereby enabling mutual reinforcement between node representation learning and ranking optimization. This design allows the proposed model to be trained on synthetic networks and to generalize effectively across diverse real-world networks. Extensive experiments on multiple benchmark datasets demonstrate that the proposed model consistently outperforms state-of-the-art baselines in terms of both ranking accuracy and cross-network transferability, offering new insights for network analysis and engineering applications-particularly in scenarios where the target network's structure is inaccessible in advance due to privacy or security constraints.

2511.01144 2026-02-17 cs.CR cs.AI

AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Md Tanvirul Alam, Dipkamal Bhusal, Salman Ahmad, Nidhi Rastogi, Peter Worth

详情
英文摘要

Large Language Models (LLMs) have demonstrated strong capabilities in natural language reasoning, yet their application to Cyber Threat Intelligence (CTI) remains limited. CTI analysis involves distilling large volumes of unstructured reports into actionable knowledge, a process where LLMs could substantially reduce analyst workload. CTIBench introduced a comprehensive benchmark for evaluating LLMs across multiple CTI tasks. In this work, we extend CTIBench by developing AthenaBench, an enhanced benchmark that includes an improved dataset creation pipeline, duplicate removal, refined evaluation metrics, and a new task focused on risk mitigation strategies. We evaluate twelve LLMs, including state-of-the-art proprietary models such as GPT-5 and Gemini-2.5 Pro, alongside seven open-source models from the LLaMA and Qwen families. While proprietary LLMs achieve stronger results overall, their performance remains subpar on reasoning-intensive tasks, such as threat actor attribution and risk mitigation, with open-source models trailing even further behind. These findings highlight fundamental limitations in the reasoning capabilities of current LLMs and underscore the need for models explicitly tailored to CTI workflows and automation.

2510.26046 2026-02-17 stat.ML cs.LG stat.ME

Bias-Corrected Data Synthesis for Imbalanced Learning

Pengfei Lyu, Zhengchi Ma, Linjun Zhang, Anru R. Zhang

Comments 41 pages, 4 figures, includes proofs and appendix

详情
英文摘要

Imbalanced data, where the positive samples represent only a small proportion compared to the negative samples, makes it challenging for classification problems to balance the false positive and false negative rates. A common approach to addressing the challenge involves generating synthetic data for the minority group and then training classification models with both observed and synthetic data. However, since the synthetic data depends on the observed data and fails to replicate the original data distribution accurately, prediction accuracy is reduced when the synthetic data is naïvely treated as the true data. In this paper, we address the bias introduced by synthetic data and provide consistent estimators for this bias by borrowing information from the majority group. We propose a bias correction procedure to mitigate the adverse effects of synthetic data, enhancing prediction accuracy while avoiding overfitting. This procedure is extended to broader scenarios with imbalanced data, such as imbalanced multi-task learning and causal inference. Theoretical properties, including bounds on bias estimation errors and improvements in prediction accuracy, are provided. Simulation results and data analysis on handwritten digit datasets demonstrate the effectiveness of our method.

2510.18259 2026-02-17 stat.ML cs.AI cs.LG

Learning under Quantization for High-Dimensional Linear Regression

Dechen Zhang, Junwei Su, Difan Zou

详情
英文摘要

The use of low-bit quantization has emerged as an indispensable technique for enabling the efficient training of large-scale models. Despite its widespread empirical success, a rigorous theoretical understanding of its impact on learning performance remains notably absent, even in the simplest linear regression setting. We present the first systematic theoretical study of this fundamental question, analyzing finite-step stochastic gradient descent (SGD) for high-dimensional linear regression under a comprehensive range of quantization targets: data, label, parameter, activation, and gradient. Our novel analytical framework establishes precise algorithm-dependent and data-dependent excess risk bounds that characterize how different quantization affects learning: parameter, activation, and gradient quantization amplify noise during training; data quantization distorts the data spectrum and introduces additional approximation error. Crucially, we distinguish the effects of two quantization schemes: we prove that for additive quantization (with constant quantization steps), the noise amplification benefits from a suppression effect scaled by the batch size, while multiplicative quantization (with input-dependent quantization steps) largely preserves the spectral structure, thereby reducing the spectral distortion. Furthermore, under common polynomial-decay data spectra, we quantitatively compare the risks of multiplicative and additive quantization, drawing a parallel to the comparison between FP and integer quantization methods. Our theory provides a powerful lens to characterize how quantization shapes the learning dynamics of optimization algorithms, paving the way to further explore learning theory under practical hardware constraints.

2510.17734 2026-02-17 math.NA cs.LG cs.NA

Efficient Tensor Completion Algorithms for Highly Oscillatory Operators

Navjot Singh, Edgar Solomonik, Xiaoye Sherry Li, Yang Liu

详情
英文摘要

This paper presents low-complexity tensor completion algorithms and their efficient implementation to reconstruct highly oscillatory operators discretized as $n\times n$ matrices. The underlying tensor decomposition is based on the reshaping of the input matrix and its butterfly decomposition into an order $O (\log n)$ tensor. The reshaping of the input matrix into a tensor allows for representation of the butterfly decomposition as a tensor decomposition with dense tensors. This leads to efficient utilization of the existing software infrastructure for dense and sparse tensor computations. We propose two tensor completion algorithms in the butterfly format, using alternating least squares and gradient-based optimization, as well as a novel strategy that uses low-rank matrix completion to efficiently generate an initial guess for the proposed algorithms. To demonstrate the efficiency and applicability of our proposed algorithms, we perform three numerical experiments using simulated oscillatory operators in seismic applications. In these experiments, we use $O (n \log n)$ observed entries in the input matrix and demonstrate an $O(n\log^3 n)$ computational cost of the proposed algorithms, leading to a speedup of orders of magnitudes per iteration for large matrices compared to the low-rank matrix and quantized tensor-train completion. Moreover, the proposed butterfly completion algorithms, equipped with the novel initial guess generation strategy, achieve reconstruction errors that are smaller by an order of magnitude, enabling accurate recovery of the underlying structure compared to the state-of-the-art completion algorithms.

2510.11418 2026-02-17 cs.IT cs.LG math.IT

Forward-Forward Autoencoder Architectures for Energy-Efficient Wireless Communications

Daniel Seifert, Onur Günlü, Rafael F. Schaefer

Comments To be published in the proceedings of the IEEE International Conference on Communications (ICC), May 2026

详情
英文摘要

The application of deep learning to the area of communications systems has been a growing field of interest in recent years. Forward-forward (FF) learning is an efficient alternative to the backpropagation (BP) algorithm, which is the typically used training procedure for neural networks. Among its several advantages, FF learning does not require the communication channel to be differentiable and does not rely on the global availability of partial derivatives, allowing for an energy-efficient implementation. In this work, we design end-to-end learned autoencoders using the FF algorithm and numerically evaluate their performance for the additive white Gaussian noise and Rayleigh block fading channels. We demonstrate their competitiveness with BP-trained systems in the case of joint coding and modulation, and in a scenario where a fixed, non-differentiable modulation stage is applied. Moreover, we provide further insights into the design principles of the FF network, its training convergence behavior, and significant memory and processing time savings compared to BP-based approaches.

2510.04455 2026-02-17 math.OC cs.AI cs.LG math.ST stat.ML stat.TH

Inverse Mixed-Integer Programming: Learning Constraints then Objective Functions

Akira Kitaoka

Comments 40 pages

详情
英文摘要

Data-driven inverse optimization for mixed-integer linear programs (MILPs), which seeks to learn an objective function and constraints consistent with observed decisions, is important for building accurate mathematical models in a variety of domains, including power systems and scheduling. However, to the best of our knowledge, existing data-driven inverse optimization methods primarily focus on learning objective functions under known constraints, and learning both objective functions and constraints from data remains largely unexplored. In this paper, we propose a two-stage approach for a class of inverse optimization problems in which the objective is a linear combination of given feature functions and the constraints are parameterized by unknown functions and thresholds. Our method first learns the constraints and then, conditioned on the learned constraints, estimates the objective-function weights. On the theoretical side, we provide finite-sample guarantees for solving the proposed inverse optimization problem. To this end, we develop statistical learning tools for pseudo-metric spaces under sub-Gaussian assumptions and use them to derive a learning-theoretic framework for inverse optimization with both unknown objectives and constraints. On the experimental side, we demonstrate that our method successfully solves inverse optimization problems on scheduling instances formulated as ILPs with up to 100 decision variables.

2510.02983 2026-02-17 cs.DS cs.LG math.OC stat.ML

Oracle-based Uniform Sampling from Convex Bodies

Thanh Dang, Jiaming Liang

Comments 32 pages

详情
英文摘要

We propose new Markov chain Monte Carlo algorithms to sample a uniform distribution on a convex body $K$. Our algorithms are based on the proximal sampler, which uses Gibbs sampling on an augmented distribution and assumes access to the so-called restricted Gaussian oracle (RGO). The key contribution of this work is an efficient implementation of the RGO for uniform sampling on convex $K$ that goes beyond the membership-oracle model used in many classical and modern uniform samplers, and instead leverages richer oracle access commonly assumed in convex optimization. We implement the RGO via rejection sampling and access to either a projection oracle or a separation oracle on $K$. In both oracle models, we provide non-asymptotic complexity guarantees for obtaining unbiased samples, with accuracy quantified in Rényi divergence and $χ^2$-divergence, and we support these theoretical guarantees with numerical experiments.

2510.02356 2026-02-17 cs.CR cs.AI

Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark

Xinjie Shen, Mufei Li, Pan Li

Comments Accepted by ICLR 2026

详情
英文摘要

The deployment of Large Language Models (LLMs) in embodied agents creates an urgent need to measure their privacy awareness in the physical world. Existing evaluation methods, however, are confined to natural language based scenarios. To bridge this gap, we introduce EAPrivacy, a comprehensive evaluation benchmark designed to quantify the physical-world privacy awareness of LLM-powered agents. EAPrivacy utilizes procedurally generated scenarios across four tiers to test an agent's ability to handle sensitive objects, adapt to changing environments, balance task execution with privacy constraints, and resolve conflicts with social norms. Our measurements reveal a critical deficit in current models. The top-performing model, Gemini 2.5 Pro, achieved only 59\% accuracy in scenarios involving changing physical environments. Furthermore, when a task was accompanied by a privacy request, models prioritized completion over the constraint in up to 86\% of cases. In high-stakes situations pitting privacy against critical social norms, leading models like GPT-4o and Claude-3.5-haiku disregarded the social norm over 15\% of the time. These findings, demonstrated by our benchmark, underscore a fundamental misalignment in LLMs regarding physically grounded privacy and establish the need for more robust, physically-aware alignment. Codes and datasets will be available at https://github.com/Graph-COM/EAPrivacy.

2509.23519 2026-02-17 cs.CR cs.AI

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

Zeyu Shen, Basileal Imana, Tong Wu, Chong Xiang, Prateek Mittal, Aleksandra Korolova

Comments Accepted to NeurIPS 2025

详情
英文摘要

Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain vulnerable to attacks on the retrieval corpus, such as prompt injection. RAG-based search systems (e.g., Google's Search AI Overview) present an interesting setting for studying and protecting against such threats, as defense algorithms can benefit from built-in reliability signals -- like document ranking -- and represent a non-LLM challenge for the adversary due to decades of work to thwart SEO. Motivated by, but not limited to, this scenario, this work introduces ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents. Our first contribution adopts a graph-theoretic perspective to identify a "consistent majority" among retrieved documents to filter out malicious ones. We introduce a novel algorithm based on finding a Maximum Independent Set (MIS) on a document graph where edges encode contradiction. Our MIS variant explicitly prioritizes higher-reliability documents and provides provable robustness guarantees against bounded adversarial corruption under natural assumptions. Recognizing the computational cost of exact MIS for large retrieval sets, our second contribution is a scalable weighted sample and aggregate framework. It explicitly utilizes reliability information, preserving some robustness guarantees while efficiently handling many documents. We present empirical results showing ReliabilityRAG provides superior robustness against adversarial attacks compared to prior methods, maintains high benign accuracy, and excels in long-form generation tasks where prior robustness-focused methods struggled. Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.

2509.22794 2026-02-17 stat.ML cs.AI cs.LG econ.EM math.ST stat.TH

Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression

Haodong Liang, Yanhao Jin, Krishnakumar Balasubramanian, Lifeng Lai

Comments 37 pages, 12 figures

详情
英文摘要

We study instrumental variable regression (IVaR) under differential privacy constraints. Classical IVaR methods (like two-stage least squares regression) rely on solving moment equations that directly use sensitive covariates and instruments, creating significant risks of privacy leakage and posing challenges in designing algorithms that are both statistically efficient and differentially private. We propose a noisy two-stage gradient descent algorithm that ensures $ρ$-zero-concentrated differential privacy by injecting carefully calibrated noise into the gradient updates. Our analysis establishes finite-sample convergence rates for the proposed method, showing that the algorithm achieves consistency while preserving privacy. In particular, we derive precise bounds quantifying the trade-off among optimization, privacy, and sampling error. To the best of our knowledge, this is the first work to provide both privacy guarantees and provable convergence rates for instrumental variable regression in linear models. We further validate our theoretical findings with experiments on both synthetic and real datasets, demonstrating that our method offers practical accuracy-privacy trade-offs.

2509.18129 2026-02-17 math.OC cs.DC cs.LG

Pareto-optimal Trade-offs Between Communication and Computation with Flexible Gradient Tracking

Yan Huang, Jinming Xu, Li Chai, Jiming Chen, Karl H. Johansson

Comments 15 pages

详情
英文摘要

This paper addresses distributed stochastic optimization problems under non-i.i.d. data, focusing on the inherent trade-offs between communication and computational efficiency. To this end, we propose FlexGT, a flexible snapshot gradient tracking method that enables tunable numbers of local updates and neighbor communications per round, thereby adapting efficiently to diverse system resource conditions. Leveraging a unified convergence analysis framework, we derive tight communication and computational complexity for FlexGT with explicit dependence on objective properties and certain tunable parameters. Moreover, we introduce an accelerated variant, termed Acc-FlexGT, and prove that, with prior knowledge of the graph, it achieves Pareto-optimal trade-offs between communication and computation. Particularly, in the nonconvex case, Acc-FlexGT achieves the optimal iteration complexity of $\tilde{\mathcal{O}}\left( \left( Lσ^2 \right) /\left( nε^2 \right) +L/\left( ε\sqrt{1-\sqrt{ρ_W}} \right) \right) $ and optimal communication complexity of $\tilde{\mathcal{O}}\left( L/\left( ε\sqrt{1-\sqrt{ρ_W}} \right) \right)$ for appropriately chosen numbers of local updates, matching existing lower bounds up to logarithmic factors. And, it improves the existing results for the strongly convex case by a factor of $\tilde{\mathcal{O}} \left( 1/\sqrtε \right)$, where $ε$ is the targeted accuracy, $n$ the number of nodes, $L$ the Lipschitz constant, $ρ_W$ the connectivity of the graph, and $σ$ the stochastic gradient variance. Numerical experiments corroborate the theoretical results and demonstrate the effectiveness of the proposed methods.

2509.18008 2026-02-17 cs.HC cs.AI cs.CL

Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration

Bingsheng Yao, Jiaju Chen, Chaoran Chen, April Wang, Toby Jia-jun Li, Dakuo Wang

Comments Accepted at CHI 2026

详情
英文摘要

Intelligent systems have traditionally been designed as tools rather than collaborators, often lacking critical characteristics that collaboration partnerships require. Recent advances in large language model (LLM) agents open new opportunities for human-LLM-agent collaboration by enabling natural communication and various social and cognitive behaviors. Yet it remains unclear whether principles of computer-mediated collaboration established in HCI and CSCW persist, change, or fail when humans collaborate with LLM agents. To support systematic investigations of these questions, we introduce an open and configurable research platform for HCI researchers. The platform's modular design allows seamless adaptation of classic CSCW experiments and manipulation of theory-grounded interaction controls. We demonstrate the platform's research efficacy and usability through three case studies: (1) two Shape Factory experiments for resource negotiation with 16 participants, (2) one Hidden Profile experiment for information pooling with 16 participants, and (3) a participatory cognitive walkthrough with five HCI researchers to refine workflows of researcher interface for experiment setup and analysis.

2509.13550 2026-02-17 math.OC cs.AI

Complexity Bounds for Smooth Multiobjective Optimization

Phillipe R. Sampaio

Comments 25 pages

详情
英文摘要

We study the oracle complexity of finding $\varepsilon$-Pareto stationary points in smooth multiobjective optimization with $m$ objectives. Progress is measured by the Pareto stationarity gap $\mathcal{G}(x)$, the norm of the best convex combination of objective gradients. Our analysis relies on a non-degenerate lifting that embeds hard single-objective instances into MOO instances with distinct objectives and non-singleton Pareto fronts while preserving lower bounds on $\mathcal{G}$. We establish: (i) in the $μ$-strongly convex case, any span first-order method has worst-case linear convergence no faster than $\exp(-Θ(T/\sqrtκ))$ after $T$ oracle calls, yielding $Θ(\sqrtκ\log(1/\varepsilon))$ iterations and matching accelerated upper bounds; (ii) in the convex case, an $Ω(1/T)$ min-iterate lower bound for oblivious one-step methods and a universal last-iterate lower bound $Ω(1/T^2)$ for oblivious span methods via polynomial-degree arguments, and we further show this latter bound is loose (for general adaptive methods) by importing geometric lower bounds to obtain an $Ω(1/T)$ min-iterate lower bound for general adaptive first-order methods; (iii) in the nonconvex case with $L$-Lipschitz gradients, an $Ω(\sqrt{L}/(T+1))$-type lower bound on $\mathcal{G}$ (tight in order), implying $Ω(1/\varepsilon^2)$ iterations to reach $\mathcal{G}(x)\le\varepsilon$ up to natural scaling.

2509.12456 2026-02-17 q-fin.TR cs.AI

Reinforcement Learning-Based Market Making as a Stochastic Control on Non-Stationary Limit Order Book Dynamics

Rafael Zimmer, Oswaldo Luiz do Valle Costa

Comments 9 pages, 8 figures, 3 tables, 31 equations

详情
英文摘要

Reinforcement Learning has emerged as a promising framework for developing adaptive and data-driven strategies, enabling market makers to optimize decision-making policies based on interactions with the limit order book environment. This paper explores the integration of a reinforcement learning agent in a market-making context, where the underlying market dynamics have been explicitly modeled to capture observed stylized facts of real markets, including clustered order arrival times, non-stationary spreads and return drifts, stochastic order quantities and price volatility. These mechanisms aim to enhance stability of the resulting control agent, and serve to incorporate domain-specific knowledge into the agent policy learning process. Our contributions include a practical implementation of a market making agent based on the Proximal-Policy Optimization (PPO) algorithm, alongside a comparative evaluation of the agent's performance under varying market conditions via a simulator-based environment. As evidenced by our analysis of the financial return and risk metrics when compared to a closed-form optimal solution, our results suggest that the reinforcement learning agent can effectively be used under non-stationary market conditions, and that the proposed simulator-based environment can serve as a valuable tool for training and pre-training reinforcement learning agents in market-making scenarios.

2508.10765 2026-02-17 math.DS cs.LG nlin.AO

Memorisation and forgetting in a learning Hopfield neural network: bifurcation mechanisms, attractors and basins

Adam E. Essex, Natalia B. Janson, Rachel A. Norris, Alexander G. Balanov

Comments 24 pages, 17 figures. The following article has been submitted to `Chaos: An Interdisciplinary Journal of Nonlinear Science'. After it is published, it will be found at https://pubs.aip.org/aip/cha

详情
英文摘要

Despite explosive expansion of artificial intelligence based on artificial neural networks (ANNs), these are employed as "black boxes'', as it is unclear how, during learning, they form memories or develop unwanted features, including spurious memories and catastrophic forgetting. Much research is available on isolated aspects of learning ANNs, but due to their high dimensionality and non-linearity, their comprehensive analysis remains a challenge. In ANNs, knowledge is thought to reside in connection weights or in attractor basins, but these two paradigms are not linked explicitly. Here we comprehensively analyse mechanisms of memory formation in an 81-neuron Hopfield network undergoing Hebbian learning by revealing bifurcations leading to formation and destruction of attractors and their basin boundaries. We show that, by affecting evolution of connection weights, the applied stimuli induce a pitchfork and then a cascade of saddle-node bifurcations creating new attractors with their basins that can code true or spurious memories, and an abrupt disappearance of old memories (catastrophic forgetting). With successful learning, new categories are represented by the basins of newly born point attractors, and their boundaries by the stable manifolds of new saddles. With this, memorisation and forgetting represent two manifestations of the same mechanism. Our strategy to analyse high-dimensional learning ANNs is universal and applicable to recurrent ANNs of any form. The demonstrated mechanisms of memory formation and of catastrophic forgetting shed light on the operation of a wider class of recurrent ANNs and could aid the development of approaches to mitigate their flaws.

2508.03882 2026-02-17 cs.CR cs.AI

Simulating Cyberattacks through a Breach Attack Simulation (BAS) Platform empowered by Security Chaos Engineering (SCE)

Arturo Sánchez-Matas, Pablo Escribano Ruiz, Daniel Díaz-López, Angel Luis Perales Gómez, Pantaleone Nespoli, Gregorio Martínez Pérez

Comments 8 pages, 4 figures, paper in proceedings of the X National Cybersecurity Research Conference (JNIC) in Zaragoza, Spain, June, 2025

详情
英文摘要

In today digital landscape, organizations face constantly evolving cyber threats, making it essential to discover slippery attack vectors through novel techniques like Security Chaos Engineering (SCE), which allows teams to test defenses and identify vulnerabilities effectively. This paper proposes to integrate SCE into Breach Attack Simulation (BAS) platforms, leveraging adversary profiles and abilities from existing threat intelligence databases. This innovative proposal for cyberattack simulation employs a structured architecture composed of three layers: SCE Orchestrator, Connector, and BAS layers. Utilizing MITRE Caldera in the BAS layer, our proposal executes automated attack sequences, creating inferred attack trees from adversary profiles. Our proposal evaluation illustrates how integrating SCE with BAS can enhance the effectiveness of attack simulations beyond traditional scenarios, and be a useful component of a cyber defense strategy.

2507.14902 2026-02-17 cs.IR cs.CV

U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs

Xiaojie Li, Chu Li, Shi-Zhe Chen, Xi Chen

Comments Accepted to ICLR 2026

Journal ref International Conference on Learning Representations (ICLR), 2026

详情
英文摘要

Universal multimodal retrieval (UMR), which aims to address complex retrieval tasks where both queries and candidates span diverse modalities, has been significantly advanced by the emergence of MLLMs. While state-of-the-art MLLM-based methods in the literature predominantly adopt contrastive learning principles, they often differ in their specific training recipes. Despite their success, the mechanisms underlying their retrieval capabilities remain largely unexplored, potentially resulting in suboptimal performance and limited generalization ability. To address these issues, we present a comprehensive study aimed at uncovering the key factors that drive effective embedding learning for UMR using MLLMs. We begin by implementing a general MLLM-based embedding learning pipeline, and systematically analyze the primary contributors to high-performing universal retrieval systems. Based on this, we explore various aspects of the details in embedding generation and training strategies, including progressive transition, hard negative mining and re-ranker distillation. Notably, our findings reveal that often-overlooked factors can have a substantial impact on model performance. Building on these discoveries, we introduce a unified framework termed U-MARVEL (Universal MultimodAl RetrieVal via Embedding Learning), which outperforms state-of-the-art competitors on the M-BEIR benchmark by a large margin in supervised settings, and also exhibits strong zero-shot performance on several tasks such as composed image retrieval and text-to-video retrieval. These results underscore the generalization potential of our framework across various embedding-based retrieval tasks. Code is available at https://github.com/chaxjli/U-MARVEL

2506.05402 2026-02-17 cs.CR cs.LG

Lorica: A Synergistic Fine-Tuning Framework for Advancing Personalized Adversarial Robustness

Tianyu Qi, Lei Xue, Yufeng Zhan, Xiaobo Ma

Comments Accepted by the ACM Conference on Computer and Communications Security (CCS) 2025

详情
英文摘要

The growing use of large pre-trained models in edge computing has made model inference on mobile clients both feasible and popular. Yet these devices remain vulnerable to adversarial attacks, threatening model robustness and security. Federated adversarial training (FAT) offers a promising solution by enhancing robustness while preserving client privacy. However, FAT often yields a generalized global model that struggles with heterogeneous client data, leading to limited personalization and significant communication overhead. In this paper, we propose \textit{Lorica}, a personalized synergistic adversarial training framework that delivers customized defense models through a two-phase process. In Phase 1, \textit{Lorica} applies LoRA-FA for local adversarial fine-tuning, enabling personalized robustness while reducing communication by uploading only LoRA-FA parameters. In Phase 2, a forward-gating selection strategy improves benign accuracy, further refining the personalized model. This yields tailored defense models that effectively balance robustness and accuracy. Extensive experiments on benchmark datasets demonstrate that \textit{Lorica} can achieve up to 68$\times$ improvements in communication efficiency compared to state-of-the-art algorithms, while achieving up to 29.9\% and 52.2\% enhancements in adversarial robustness and benign accuracy, respectively.