arXivDaily: daily arXiv academic digest, updated Monday to Friday
2601.10479 2026-04-24 quant-ph cs.LG math-ph math.MP

H-EFT-VA: An Effective-Field-Theory Variational Ansatz with Provable Barren Plateau Avoidance

Eyad I. B Hamid

Comments v2: Expanded Section III with explicit circuit architecture description. Added Section IV.F to discuss static initialization limitations and reference-state dependence. Abstract and conclusion updated to scope TFIM results and cite concurrent work on dynamic extensions. 8 pages, 5 figures, Appendix

Abstract

Variational Quantum Algorithms (VQAs) are critically threatened by the Barren Plateau (BP) phenomenon. In this work, we introduce the H-EFT Variational Ansatz (H-EFT-VA), an architecture inspired by Effective Field Theory (EFT). By enforcing a hierarchical "UV-cutoff" on initialization, we theoretically restrict the circuit's state exploration, preventing the formation of approximate unitary 2-designs. We provide a rigorous proof that this localization guarantees an inverse-polynomial lower bound on the gradient variance: $\operatorname{Var}[\partial\theta] \in \Omega(1/\operatorname{poly}(N))$. Crucially, unlike approaches that avoid BPs by limiting entanglement, we demonstrate that H-EFT-VA maintains volume-law entanglement and near-Haar purity, ensuring sufficient expressibility for complex quantum states. Extensive benchmarking across 16 experiments on the Transverse Field Ising Model confirms a 109x improvement in energy convergence and a 10.7x increase in ground-state fidelity over standard Hardware-Efficient Ansätze (HEA), with statistical significance of $p < 10^{-88}$. The static framework is most effective for Hamiltonians with moderate reference-state overlap; extension to systems with larger reference-state gaps is addressed through dynamic UV-cutoff relaxation strategies explored in concurrent work.

2512.16001 2026-04-24 eess.SP cs.LG

Concurrence: A dependence criterion for time series, applied to biological data

Evangelos Sariyanidi, John D. Herrington, Lisa Yankowitz, Pratik Chaudhari, Theodore D. Satterthwaite, Casey J. Zampella, Jeffrey S. Morris, Edward Gunning, Robert T. Schultz, Russell T. Shinohara, Birkan Tunc

Comments arXiv admin note: text overlap with arXiv:2508.02703

Abstract

Measuring the statistical dependence between observed signals is a primary tool for scientific discovery. However, biological systems often exhibit complex non-linear interactions that currently cannot be captured without a priori knowledge or large datasets. We introduce a criterion for dependence, whereby two time series are deemed dependent if one can construct a classifier that distinguishes between temporally aligned vs. misaligned segments extracted from them. We show that this criterion, concurrence, is theoretically linked with dependence, and can become a standard approach for scientific analyses across disciplines, as it can expose relationships across a wide spectrum of signals (fMRI, physiological and behavioral data) without ad-hoc parameter tuning or large amounts of data.
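
The aligned-versus-misaligned idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the segment length, the white-noise test signals, and above all the fixed-threshold "classifier" on absolute correlation, which stands in for whatever learned classifier the authors use and would miss the non-linear dependence the paper targets.

```python
import random
import statistics

def segments(series, length):
    """Split a series into consecutive non-overlapping segments."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]

def abs_corr(a, b):
    """Absolute Pearson correlation between two equal-length segments."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return abs(cov) / ((va * vb) ** 0.5 + 1e-12)

def concurrence_accuracy(x, y, length=50, threshold=0.5, seed=0):
    """Accuracy of a toy classifier at telling aligned from misaligned segment pairs."""
    rng = random.Random(seed)
    xs, ys = segments(x, length), segments(y, length)
    correct = total = 0
    for i, seg in enumerate(xs):
        # Aligned pair: the same time window in both series (label 1).
        # Misaligned pair: a different, randomly chosen window of y (label 0).
        j = rng.choice([k for k in range(len(ys)) if k != i])
        for other, label in ((ys[i], 1), (ys[j], 0)):
            predicted = 1 if abs_corr(seg, other) > threshold else 0
            correct += predicted == label
            total += 1
    return correct / total

rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(2000)]
y_dep = [v + 0.1 * rng.gauss(0, 1) for v in x]   # depends on x
y_ind = [rng.gauss(0, 1) for _ in range(2000)]   # independent of x

acc_dep = concurrence_accuracy(x, y_dep)
acc_ind = concurrence_accuracy(x, y_ind)
print(f"dependent pair accuracy:   {acc_dep:.2f}")
print(f"independent pair accuracy: {acc_ind:.2f}")
```

An accuracy well above chance suggests dependence under this criterion, while an accuracy near 0.5 means the classifier cannot tell aligned from misaligned windows.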

2512.08216 2026-04-24 eess.IV cs.CV cs.LG

Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

Aneesh Rangnekar, Harini Veeraraghavan

Comments Accepted for publication in Transactions on Machine Learning Research (TMLR), 2026. Code available at: https://github.com/aneesh3108/RF-Deep

Abstract

Accurate segmentation of lung tumors from 3D computed tomography (CT) scans is essential for automated treatment planning and response assessment. Despite self-supervised pretraining on numerous datasets, state-of-the-art transformer backbones remain susceptible to out-of-distribution (OOD) inputs, often producing confidently incorrect segmentations with potential for risk in clinical deployment. Hence, we introduce RF-Deep, a lightweight post-hoc random forests-based framework that leverages deep features trained with limited outlier exposure, requiring as few as 40 labeled scans (20 in-distribution and 20 OOD), to improve scan-level OOD detection. RF-Deep repurposes the hierarchical features from the pretrained-then-finetuned segmentation backbones, aggregating features from multiple regions-of-interest anchored to predicted tumor regions to capture OOD likelihood. We evaluated RF-Deep on 2,232 CT volumes spanning near-OOD (pulmonary embolism, COVID-19 negative) and far-OOD (kidney cancer, healthy pancreas) datasets. RF-Deep achieved AUROC >~93 on the challenging near-OOD datasets, where it outperformed the next best method by 4--7 percentage points, and produced near-perfect detection (AUROC >~99) on far-OOD datasets. The approach also showed transferability to two blinded validation datasets under the ensemble configuration (COVID-19 positive and breast cancer; AUROC >~94). RF-Deep maintained consistent performance across backbones of different depths and pretraining strategies, demonstrating applicability of post-hoc detectors as a safety filter for clinical deployment of tumor segmentation pipelines.

2511.23159 2026-04-24 cs.SE cs.AI

AI for software engineering: from probable to provable

Bertrand Meyer

Abstract

Vibe coding, the much-touted use of AI techniques for programming, faces two overwhelming obstacles: the difficulty of specifying goals ("prompt engineering" is a form of requirements engineering, one of the toughest disciplines of software engineering); and the hallucination phenomenon. Programs are only useful if they are correct or very close to correct. The solution? Combine the creativity of artificial intelligence with the rigor of formal specification methods and the power of formal program verification, supported by modern proof tools.

2511.20834 2026-04-24 cs.DC cs.AR cs.LG cs.PF

Spira: Exploiting Voxel Data Structural Properties for Efficient Sparse Convolution in Point Cloud Networks

Dionysios Adamopoulos, Anastasia Poulopoulou, Georgios Goumas, Christina Giannoula

Abstract

Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and augmented/virtual reality. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, and weight offsets, then uses this map to compute feature vectors for output coordinates. Our work identifies three key properties of voxel coordinates: they are integer-valued, bounded within a limited spatial range, and geometrically continuous, i.e., neighboring voxels on the same object surface are highly likely to exist at small spatial offsets from each other. Prior SpC engines do not fully exploit these properties and suffer from high pre-processing and post-processing overheads during kernel map construction. To address this, we design Spira, the first voxel-property-aware SpC engine for GPUs. Spira proposes (i) a high-performance one-shot search algorithm that builds the kernel map with no pre-processing and high data locality, (ii) an effective packed-native processing scheme that accesses packed voxel coordinates at low cost, (iii) a flexible dual-dataflow execution mechanism that efficiently computes output feature vectors by adapting to layer characteristics, and (iv) a network-wide parallelization strategy that builds kernel maps for all SpC layers concurrently at network start. Our evaluation shows that Spira significantly outperforms prior state-of-the-art SpC engines by 1.68x on average and up to 3.04x for end-to-end inference, and by 2.11x on average and up to 3.44x for layer-wise execution across diverse layer configurations. The source code of Spira is freely available at github.com/SPIN-Research-Group/Spira.
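
For readers unfamiliar with kernel maps, the sketch below shows a generic hash-based construction for the case where output coordinates equal input coordinates. It is not Spira's one-shot search algorithm, which the abstract does not detail. The 10-bit coordinate bound, the `pack` layout, and the function names are all assumptions chosen for illustration. It does lean on two of the properties the abstract names: coordinates are integers, and they are bounded, so three of them fit in one packed key.

```python
from itertools import product

BITS = 10  # assumption: every coordinate lies in [0, 2**BITS)

def pack(x, y, z):
    """Pack three bounded integer coordinates into a single integer key."""
    return (x << (2 * BITS)) | (y << BITS) | z

def build_kernel_map(voxels, kernel_size=3):
    """Map each weight offset to a list of (input_index, output_index) pairs."""
    index = {pack(*v): i for i, v in enumerate(voxels)}
    r = kernel_size // 2
    kmap = {}
    for offset in product(range(-r, r + 1), repeat=3):
        pairs = []
        for out_i, (x, y, z) in enumerate(voxels):
            nx, ny, nz = x + offset[0], y + offset[1], z + offset[2]
            # Skip neighbours outside the bounded range so packing stays valid.
            if min(nx, ny, nz) < 0 or max(nx, ny, nz) >= (1 << BITS):
                continue
            in_i = index.get(pack(nx, ny, nz))
            if in_i is not None:
                pairs.append((in_i, out_i))
        if pairs:
            kmap[offset] = pairs
    return kmap

voxels = [(1, 1, 1), (1, 1, 2), (2, 1, 1), (5, 5, 5)]
kmap = build_kernel_map(voxels)
for offset, pairs in sorted(kmap.items()):
    print(offset, pairs)
```

Each entry says which input voxel contributes to which output voxel through the weight at that offset. The isolated voxel at (5, 5, 5) appears only under the zero offset.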

2510.08814 2026-04-24 cs.CC cs.AI

A Quantale-Weakness Route to $P \neq NP$ via CD Evidence Normalization and Gauge-Buffered Locked Ensembles

Ben Goertzel

Abstract

We present a proof architecture for \(P \neq NP\) based on an upper--lower clash in polytime-capped conditional description length. We construct an efficiently samplable family of SAT instances \(Y\) such that every satisfying witness for \(Y\) yields the same global message \(M(Y)\). If \(P=NP\), then a standard polynomial-time SAT self-reduction recovers \(M(Y)\) from \(Y\), so \[ K_{\mathrm{poly}}(M(Y)\mid Y)=O(1). \] The lower-bound side shows the opposite. For the same ensemble, no fixed polynomial-time observer can gain substantial predictive advantage on a linear number of selected message coordinates. The argument treats computation as an evidence-producing process: predictive advantage is converted into constructible-dual evidence skew and then into pairwise distinctions between message-opposite worlds. A normalization theorem shows that every target-relevant non-neutral evidence leaf is either a safe-buffer observation or a hidden-gauge observation. Safe-buffer observations have negligible leakage, while hidden-gauge observations are limited by gauge-rank accounting. This yields an atomic evidence budget implying that total message-resolving advantage is \(o(t)\) across \(t\) selected coordinates. Boundary-law mixing gives the near-random baseline for the visible surface. Combining this with the evidence budget gives product small-success and then, by Compression-from-Success, \[ K_{\mathrm{poly}}(M(Y)\mid Y)\ge \Omega(t) \] with high probability. This contradicts the constant upper bound from \(P=NP\). Therefore \(P \neq NP\).

2510.04548 2026-04-24 cond-mat.dis-nn cs.LG stat.ML

Learning Linear Regression with Low-Rank Tasks in-Context

Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima

Comments Accepted at AISTATS 2026

Abstract

In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common structure. In this work, we address this problem by analyzing a linear attention model trained on low-rank regression tasks. Within this setting, we precisely characterize the distribution of predictions and the generalization error in the high-dimensional limit. Moreover, we find that statistical fluctuations in finite pre-training data induce an implicit regularization. Finally, we identify a sharp phase transition of the generalization error governed by task structure. These results provide a framework for understanding how transformers learn to learn the task structure.

2509.25630 2026-04-24 stat.ML cs.LG cs.NA math.NA

When Langevin Monte Carlo Meets Randomization: New Sampling Algorithms with Non-asymptotic Error Bounds beyond Log-Concavity and Gradient Lipschitzness

Xiaojie Wang, Bin Yang

Abstract

Efficient sampling from complex, high-dimensional target distributions is a fundamental task in diverse disciplines such as scientific computing, statistics and machine learning. In this paper, we propose a new randomized splitting Langevin Monte Carlo (RSLMC) algorithm for sampling from high-dimensional distributions without log-concavity. Compared with the existing randomized Langevin Monte Carlo (RLMC), the newly proposed RSLMC algorithm requires fewer gradient evaluations and is thus computationally cheaper. Under the gradient Lipschitz condition and the log-Sobolev inequality, we prove a uniform-in-time error bound in $\mathcal{W}_2$-distance of order $O(\sqrt{d}h)$ for both RLMC and RSLMC sampling algorithms, which matches the best one in the literature under the log-concavity condition. Moreover, when the gradient of the potential $U$ is non-globally Lipschitz with superlinear growth, new modified R(S)LMC algorithms are introduced and analyzed, with non-asymptotic error bounds established. Numerical examples are finally reported to corroborate the theoretical findings.

2509.23649 2026-04-24 cs.IR cs.CL

From Past To Path: Masked History Learning for Next-Item Prediction in Generative Recommendation

KaiWen Wei, Kejun He, Xiaomian Kang, Jie Zhang, Yuming Yang, Li Jin, Zhenyang Li, Jiang Zhong, He Bai, Junnan Zhu

Comments Accepted to ACL 2026

Abstract

Generative recommendation, which directly generates item identifiers, has emerged as a promising paradigm for recommendation systems. However, its potential is fundamentally constrained by the reliance on purely autoregressive training. This approach focuses solely on predicting the next item while ignoring the rich internal structure of a user's interaction history, thus failing to grasp the underlying intent. To address this limitation, we propose Masked History Learning (MHL), a novel training framework that shifts the objective from simple next-step prediction to deep comprehension of history. MHL augments the standard autoregressive objective with an auxiliary task of reconstructing masked historical items, compelling the model to understand "why" an item path is formed from the user's past behaviors, rather than just "what" item comes next. We introduce two key contributions to enhance this framework: (1) an entropy-guided masking policy that intelligently targets the most informative historical items for reconstruction, and (2) a curriculum learning scheduler that progressively transitions from history reconstruction to future prediction. Experiments on three public datasets show that our method significantly outperforms state-of-the-art generative models, highlighting that a comprehensive understanding of the past is crucial for accurately predicting a user's future path.

2509.13576 2026-04-24 eess.IV cs.CV

Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction for Sparse-View CT

Haodong Li, Shuo Han, Haiyang Mao, Yu Shi, Changsheng Fang, Jianjia Zhang, Weiwen Wu, Hengyong Yu

Comments 17 pages, 15 figures, accepted by IEEE Transactions on Medical Imaging

Journal ref
IEEE Transactions on Medical Imaging, 2026 (early access)
Abstract

Sparse-View CT (SVCT) reconstruction enhances temporal resolution and reduces radiation dose, yet its clinical use is hindered by artifacts due to view reduction and domain shifts from scanner, protocol, or anatomical variations, leading to performance degradation in out-of-distribution (OOD) scenarios. In this work, we propose a Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction (CDPIR) framework to tackle the OOD problem in SVCT. CDPIR integrates cross-distribution diffusion priors, derived from a Scalable Interpolant Transformer (SiT), with model-based iterative reconstruction methods. Specifically, we train a SiT backbone, an extension of the Diffusion Transformer (DiT) architecture, to establish a unified stochastic interpolant framework, leveraging Classifier-Free Guidance (CFG) across multiple datasets. By randomly dropping the conditioning with a null embedding during training, the model learns both domain-specific and domain-invariant priors, enhancing generalizability. During sampling, the globally sensitive transformer-based diffusion model exploits the cross-distribution prior within the unified stochastic interpolant framework, enabling flexible and stable control over multi-distribution-to-noise interpolation paths and decoupled sampling strategies, thereby improving adaptation to OOD reconstruction. By alternating between data fidelity and sampling updates, our model achieves state-of-the-art performance with superior detail preservation in SVCT reconstructions. Extensive experiments demonstrate that CDPIR significantly outperforms existing approaches, particularly under OOD conditions, highlighting its robustness and potential clinical value in challenging imaging scenarios.

2509.03294 2026-04-24 cs.CR cs.AI cs.LG

A Comprehensive Guide to Differential Privacy: From Theory to User Expectations

Napsu Karmitsa, Antti Airola, Tapio Pahikkala, Tinja Pitkämäki

Abstract

The increasing availability of personal data has enabled significant advances in fields such as machine learning, healthcare, and cybersecurity. However, this data abundance also raises serious privacy concerns, especially in light of powerful re-identification attacks and growing legal and ethical demands for responsible data use. Differential privacy (DP) has emerged as a principled, mathematically grounded framework for mitigating these risks. This review provides a comprehensive survey of DP, covering its theoretical foundations, practical mechanisms, and real-world applications. It explores key algorithmic tools and domain-specific challenges - particularly in privacy-preserving machine learning and synthetic data generation. The report also highlights usability issues and the need for improved communication and transparency in DP systems. Overall, the goal is to support informed adoption of DP by researchers and practitioners navigating the evolving landscape of data privacy.

2508.15840 2026-04-24 cs.CR cs.CL cs.IR

Unveiling Unicode's Unseen Underpinnings in Undermining Authorship Attribution

Robert Dilworth

Comments 33 pages, 7 figures, 3 tables

Abstract

When using a public communication channel--whether formal or informal, such as commenting or posting on social media--end users have no expectation of privacy: they compose a message and broadcast it for the world to see. Even if an end user takes utmost precautions to anonymize their online presence--using an alias or pseudonym; masking their IP address; spoofing their geolocation; concealing their operating system and user agent; deploying encryption; registering with a disposable phone number or email; disabling non-essential settings; revoking permissions; and blocking cookies and fingerprinting--one obvious element still lingers: the message itself. Assuming they avoid lapses in judgment or accidental self-exposure, there should be little evidence to validate their actual identity, right? Wrong. The content of their message--necessarily open for public consumption--exposes an attack vector: stylometric analysis, or author profiling. In this paper, we dissect the technique of stylometry, discuss an antithetical counter-strategy in adversarial stylometry, and devise enhancements through Unicode steganography.

2507.14491 2026-04-24 math.NA cs.LG cs.NA

Artifacts of Numerical Integration in Learning Dynamical Systems

Bing-Ze Lu, Richard Tsai

Abstract

In many applications, one needs to learn a dynamical system from its solutions sampled at a finite number of time points. The learning problem is often formulated as an optimization problem over a chosen function class. However, in the optimization procedure, prediction data from generic dynamics requires a numerical integrator to assess the mismatch with the observed data. This paper reveals potentially serious effects of a chosen numerical scheme on the learning outcome. Specifically, the analysis demonstrates that a damped oscillatory system may be incorrectly identified as having "anti-damping" and exhibiting a reversed oscillation direction, even though it adequately fits the given data points. This paper shows that the stability region of the selected integrator will distort the nature of the learned dynamics. Crucially, reducing the step size or raising the order of an explicit integrator does not, in general, remedy this artifact, because higher-order explicit methods have stability regions that extend further into the right half complex plane. Furthermore, it is shown that the implicit midpoint method can preserve either conservative or dissipative properties from discrete data, offering a principled integrator choice even when the only prior knowledge is that the system is autonomous.
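
A scalar linear test problem gives a feel for how the integrator can distort the learned dynamics. The sketch below is not the paper's analysis. It assumes the data come from x' = λx sampled exactly at step h, and that "learning" means finding the rate whose single integrator step reproduces the observed multiplier exp(hλ). The three schemes and the values of λ and h are illustrative choices. Backward Euler is included as a simple scheme whose stability region covers part of the right half-plane.

```python
import cmath

def recovered_rate(mu, h, scheme):
    """Rate lam_hat for which one step of `scheme` reproduces the multiplier mu."""
    if scheme == "forward_euler":       # mu = 1 + h*lam_hat
        return (mu - 1) / h
    if scheme == "backward_euler":      # mu = 1 / (1 - h*lam_hat)
        return (1 - 1 / mu) / h
    if scheme == "implicit_midpoint":   # mu = (1 + h*lam_hat/2) / (1 - h*lam_hat/2)
        return 2 * (mu - 1) / (h * (mu + 1))
    raise ValueError(scheme)

lam = complex(-0.1, 2.0)   # true damped oscillatory mode: Re(lam) < 0
h = 0.5                    # sampling interval of the observed data
mu = cmath.exp(h * lam)    # exact one-step multiplier seen in the data

schemes = ("forward_euler", "backward_euler", "implicit_midpoint")
rates = {s: recovered_rate(mu, h, s) for s in schemes}
for name, rate in rates.items():
    print(f"{name:18s} Re = {rate.real:+.3f}  Im = {rate.imag:+.3f}")
```

With these values the rate recovered through backward Euler has a positive real part, so a damped mode is read as growing even though the fit to the samples is exact. For the implicit midpoint rule the real part of the recovered rate is negative whenever |μ| < 1, consistent with the abstract's claim that this method preserves dissipative behaviour.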

2506.04292 2026-04-24 cs.SI cs.LG stat.AP

GARG-AML against Smurfing: A Scalable and Interpretable Graph-Based Framework for Anti-Money Laundering

Bruno Deprez, Bart Baesens, Tim Verdonck, Wouter Verbeke

Abstract

Purpose: We introduce GARG-AML, a fast and transparent graph-based method to catch 'smurfing', a common money-laundering tactic. It assigns a single, easy-to-understand risk score to every account in both directed and undirected networks. Unlike overly complex models, it balances detection power with the speed and clarity that investigators require. Methodology: The method maps an account's immediate and secondary connections (its second-order neighbourhood) into an adjacency matrix. By measuring the density of specific blocks within this matrix, GARG-AML flags patterns that mimic smurfing behaviour. We further boost the model's performance using decision trees and gradient-boosting classifiers, testing the results against current state-of-the-art on both synthetic and open-source data. Findings: GARG-AML matches or beats state-of-the-art performance across all tested datasets. Crucially, it easily processes the massive transaction graphs typical of large financial institutions. By leveraging only the adjacency matrix of the second-order neighbourhood and basic network features, this work highlights the potential of fundamental network properties towards advancing fraud detection. Originality: The originality lies in the translation of human expert knowledge of smurfing directly into a simple network representation, rather than relying on uninterpretable deep learning. Because GARG-AML is built expressly for the real-world business demands of scalability and interpretability, banks can easily incorporate it in their existing AML solutions.

2505.05261 2026-04-24 math.OC cs.LG

ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

Yu Liu, Fabricio Oliveira, Jan Kronqvist

Abstract

Two-stage stochastic programming (2SP) offers a basic framework for modelling decision-making under uncertainty, yet scalability remains a challenge due to the computational complexity of recourse function evaluation. Existing learning-based methods like Neural Two-Stage Stochastic Programming (Neur2SP) employ neural networks (NNs) as recourse function surrogates but rely on computationally intensive mixed-integer programming (MIP) formulations. We propose ICNN-enhanced 2SP, a method that leverages Input Convex Neural Networks (ICNNs) to exploit linear programming (LP) representability in convex 2SP problems. By architecturally enforcing convexity and enabling exact inference through LP, our approach eliminates the need for integer variables inherent to the conventional MIP-based formulation while retaining an exact embedding of the ICNN surrogate within the 2SP framework. This results in a more computationally efficient alternative, and we show that good solution quality can be maintained. Comprehensive experiments reveal that ICNNs incur only marginally longer training times while achieving validation accuracy on par with their standard NN counterparts. Across benchmark problems, ICNN-enhanced 2SP often exhibits considerably faster solution times than the MIP-based formulations while preserving solution quality, with these advantages becoming significantly more pronounced as problem scale increases. For the most challenging instances, the method achieves speedups of up to 100$\times$ and solution quality superior to MIP-based formulations.
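
The phrase "architecturally enforcing convexity" can be made concrete with a small sketch. It follows the standard input convex neural network construction: ReLU activations, and non-negative weights on every path that carries hidden activations forward. The layer sizes, random weights, and names are assumptions for illustration and are not taken from the paper. Because such a network is a convex piecewise-linear function of its input, minimizing it can be written with linear constraints only, which is the LP representability the abstract refers to.

```python
import random

def relu(v):
    return [max(0.0, t) for t in v]

def affine(W, x, b):
    """Compute W @ x + b for a list-of-rows matrix W."""
    return [sum(w * t for w, t in zip(row, x)) + bi for row, bi in zip(W, b)]

def make_icnn(n_in, n_hidden, seed=0):
    rng = random.Random(seed)
    mat = lambda r, c: [[rng.gauss(0, 1) for _ in range(c)] for _ in range(r)]
    vec = lambda n: [rng.gauss(0, 1) for _ in range(n)]
    return {
        "Wx0": mat(n_hidden, n_in), "b0": vec(n_hidden),
        # Weights applied to hidden activations are forced to be non-negative.
        "Wz1": [[abs(w) for w in row] for row in mat(n_hidden, n_hidden)],
        "Wx1": mat(n_hidden, n_in), "b1": vec(n_hidden),
        "wz2": [abs(w) for w in vec(n_hidden)],
        "wx2": vec(n_in), "b2": rng.gauss(0, 1),
    }

def icnn(p, x):
    """Scalar output that is convex in x by construction."""
    z1 = relu(affine(p["Wx0"], x, p["b0"]))
    skip = affine(p["Wx1"], x, [0.0] * len(p["b1"]))      # direct input connection
    z2 = relu([a + s for a, s in zip(affine(p["Wz1"], z1, p["b1"]), skip)])
    return (sum(w * t for w, t in zip(p["wz2"], z2))
            + sum(w * t for w, t in zip(p["wx2"], x)) + p["b2"])

def midpoint_convex(p, n_in, trials=500, seed=1):
    """Numerically check f((a+b)/2) <= (f(a)+f(b))/2 on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-3, 3) for _ in range(n_in)]
        b = [rng.uniform(-3, 3) for _ in range(n_in)]
        mid = [(u + v) / 2 for u, v in zip(a, b)]
        if icnn(p, mid) > (icnn(p, a) + icnn(p, b)) / 2 + 1e-9:
            return False
    return True

params = make_icnn(n_in=3, n_hidden=8)
print("midpoint convexity holds on samples:", midpoint_convex(params, 3))
```

The direct input connections (`Wx1`, `wx2`) may have any sign, since an affine function of the input does not affect convexity.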

2503.07341 2026-04-24 econ.GN cs.AI q-fin.EC

The Economics of p(doom): Scenarios of Existential Risk and Economic Growth in the Age of Transformative AI

Jakub Growiec, Klaus Prettner

Abstract

Recent advances in artificial intelligence (AI) have led to a wide range of predictions about its long-term impact on humanity. A central focus is the potential emergence of transformative AI (TAI), eventually capable of outperforming humans in all economically valuable tasks and fully automating labor. Discussed scenarios range from unprecedented economic growth and abundance ("post-scarcity" or "cornucopia") to human extinction after a misaligned TAI takes over ("AI doom"). However, the probabilities and implications of these scenarios remain highly uncertain. We contribute by organizing the various scenarios and evaluating their associated existential risks and economic outcomes in terms of aggregate welfare. Our results imply that even low-probability catastrophic outcomes justify substantial investments in AI safety and alignment research. This result highlights that current global efforts in AI safety and alignment research are insufficient relative to the scale and urgency of the risks posed by TAI.

2503.04492 2026-04-24 cond-mat.mtrl-sci cs.LG

Accurate predictive model of band gap with selected important features based on explainable machine learning

Joohwi Lee, Kaito Miyamoto

Comments 10 pages, 3 figures, SI is included, accepted in Sci. Rep. (will be updated soon)

Abstract

In the rapidly advancing field of materials informatics, nonlinear machine learning models have demonstrated exceptional predictive capabilities for material properties. However, their black-box nature limits interpretability, and they may incorporate features that do not contribute to -- or even deteriorate -- model performance. This study employs explainable ML (XML) techniques, including permutation feature importance and the SHapley Additive exPlanation, applied to a pristine support vector regression model designed to predict band gaps at the GW level using 18 input features. Guided by XML-derived individual feature importance, a simple framework is proposed to construct reduced-feature predictive models. Model evaluations indicate that an XML-guided compact model, consisting of the top five features, achieves comparable accuracy to the pristine model on in-domain datasets (0.254 vs. 0.247 eV) while showing improved generalization with lower prediction errors on out-of-domain data (0.348 vs. 0.460 eV). Additionally, the study underscores the necessity for eliminating strongly correlated features (correlation coefficient greater than 0.8) to prevent misinterpretation and overestimation of feature importance before applying XML. This study highlights XML's effectiveness in developing simplified yet highly accurate machine learning models by clarifying feature roles, thereby reducing computational costs for feature acquisition and enhancing model trustworthiness for materials discovery.
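
Permutation feature importance, one of the two techniques named in this abstract, can be sketched in a few lines. The example is illustrative only: the three synthetic features, the linear stand-in `model`, and the use of mean absolute error are assumptions, whereas the study itself applies the technique to a support vector regression model with 18 features.

```python
import random

def mae(model, X, y):
    """Mean absolute error of `model` on rows X with targets y."""
    return sum(abs(model(row) - t) for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = mae(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mae(model, X_perm, y) - base)
        importances.append(sum(increases) / n_repeats)
    return importances

rng = random.Random(42)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * r[0] + 0.5 * r[1] for r in X]      # feature 2 is irrelevant by construction
model = lambda r: 3.0 * r[0] + 0.5 * r[1]     # stand-in for a fitted regressor

imp = permutation_importance(model, X, y)
ranking = sorted(range(3), key=lambda j: -imp[j])
print("importances:", [round(v, 3) for v in imp])
print("ranking (most to least important):", ranking)
```

A reduced model in the spirit of the abstract would keep only the top-ranked features. Strongly correlated features should be removed first, as the abstract notes, because shuffling one of a correlated pair can misstate the importance of both.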

2502.03484 2026-04-24 eess.AS cs.LG cs.SD

Dementia classification from spontaneous speech using wrapper-based feature selection

Marko Niemelä, Mikaela von Bonsdorff, Sami Äyrämö, Tommi Kärkkäinen

Abstract

Dementia encompasses a group of syndromes that impair cognitive functions such as memory, reasoning, and the ability to perform daily activities. As populations globally age, over 10 million new dementia diagnoses are reported annually. Currently, clinical diagnosis of dementia remains challenging due to overlapping symptoms, the need to exclude alternative conditions and the requirement for a comprehensive clinical evaluation and cognitive assessment. This underscores the growing need to develop feasible and accurate methods for detecting cognitive deficiencies. Recent advances in machine learning have highlighted spontaneous speech as a promising noninvasive, cost-effective, and scalable biomarker for dementia detection. In this study, spontaneous speech recordings from the ADReSS and Pitt Corpus datasets are analyzed, consisting of picture description tasks performed by cognitively healthy individuals and people with Alzheimer's disease. Unlike prior approaches that focus solely on speech-active segments, acoustic features are extracted from entire recordings using the openSMILE toolkit. This representation reduces the number of feature vectors and improves computational efficiency without compromising classification performance. Classification models with classifier-based wrapper feature selection are employed to estimate feature importance and identify diagnostically relevant acoustic characteristics. Among the evaluated models, the Extreme Minimal Learning Machine achieved competitive classification accuracy with substantially lower computational cost, reflecting an inherent property of the model formulation and learning procedure. Overall, the results demonstrate that the proposed framework is computationally efficient, interpretable, and well suited as a supportive tool for speech-based dementia assessment.

2411.14748 2026-04-24 astro-ph.CO astro-ph.IM cs.LG

Cosmological Analysis with Calibrated Neural Quantile Estimation and Approximate Simulators

He Jia

Comments 5+5 pages, 5+4 figures, published in PRL

Journal ref
Phys. Rev. Lett. 136, 161001 (2026)
Abstract

A major challenge in extracting information from current and upcoming surveys of cosmological Large-Scale Structure (LSS) is the limited availability of computationally expensive high-fidelity simulations. We introduce calibrated Neural Quantile Estimation (NQE), a new Simulation-Based Inference (SBI) method that leverages a large number of approximate simulations for training and a small number of high-fidelity simulations for calibration. This approach guarantees an unbiased posterior regardless of approximate simulation accuracy, while achieving near-optimal constraining power when the approximate simulations are reasonably accurate. As a proof of concept, we demonstrate that cosmological parameters can be inferred at field level from projected 2-dim dark matter density maps up to $k_{\rm max}\sim1.5\,h$/Mpc at $z=0$ by training on $\sim10^4$ Particle-Mesh (PM) simulations with transfer function correction and calibrating with $\sim10^2$ Particle-Particle (PP) simulations. The calibrated posteriors closely match those obtained by directly training on $\sim10^4$ expensive PP simulations, but at a fraction of the computational cost. Our method offers a practical and scalable framework for SBI of cosmological LSS, enabling precise inference across vast volumes and down to small scales.

2401.16407 2026-04-24 stat.ML cs.LG eess.IV eess.SP

Is K-fold cross validation the best model selection method for Machine Learning?

Juan M Gorriz, R. Martin Clemente, F Segovia, J Ramirez, A Ortiz, J. Suckling

Comments 40 pages, 24 figures

Abstract

As a technique that can compactly represent complex patterns, machine learning has significant potential for predictive inference. K-fold cross-validation (CV) is the most common approach to ascertaining the likelihood that a machine learning outcome is generated by chance, and it frequently outperforms conventional hypothesis testing. This improvement uses measures directly obtained from machine learning classifications, such as accuracy, that do not have a parametric description. To approach a frequentist analysis within machine learning pipelines, a permutation test or simple statistics from data partitions (i.e., folds) can be added to estimate confidence intervals. Unfortunately, neither parametric nor non-parametric tests solve the inherent problems of partitioning small sample-size datasets and learning from heterogeneous data sources. The fact that machine learning strongly depends on the learning parameters and the distribution of data across folds recapitulates familiar difficulties around excess false positives and replication. A novel statistical test based on K-fold CV and the Upper Bound of the actual risk (K-fold CUBV) is proposed, where uncertain predictions of machine learning with CV are bounded by the worst case through the evaluation of concentration inequalities. Probably Approximately Correct-Bayesian upper bounds for linear classifiers in combination with K-fold CV are derived and used to estimate the actual risk. The performance with simulated and neuroimaging datasets suggests that K-fold CUBV is a robust criterion for detecting effects and validating accuracy values obtained from machine learning and classical CV schemes, while avoiding excess false positives.

2303.03237 2026-04-24 stat.ML cs.LG math.ST stat.CO stat.TH

Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation

David Holzmüller, Francis Bach

Comments Published in JMLR. New in v4: Summary tables / sections. Plots can be reproduced using the code at https://github.com/dholzmueller/sampling_experiments

Journal ref
Journal of Machine Learning Research 26(249):1-72, 2025
Abstract

Sampling from Gibbs distributions and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical physics. While efficient algorithms are known for log-concave densities, the worst-case non-log-concave setting necessarily suffers from the curse of dimensionality. For many numerical problems, the curse of dimensionality can be alleviated when the target function is smooth, allowing the exponent in the rate to improve linearly with the number of available derivatives. Recently, it has been shown that similarly fast convergence rates can be achieved by efficient optimization algorithms. Since optimization can be seen as the low-temperature limit of sampling from Gibbs distributions, we pose the question of whether similarly fast convergence rates can be achieved for non-log-concave sampling. We first study the information-based complexity of the sampling and log-partition estimation problems and show that the optimal rates for sampling and log-partition computation are sometimes equal and sometimes faster than for optimization. We then analyze various polynomial-time sampling algorithms, including an extension of a recent promising optimization approach, and find that they sometimes exhibit interesting behavior but no near-optimal rates. Our results also give further insights into the relation between sampling, log-partition, and optimization problems.

2604.21610 2026-04-24 hep-ph

$γZ$-exchange contribution in elastic $ep$ scattering by perturbative QCD

Qian-Qian Guo, Hui-Yun Cao, Hai-Qing Zhou

Details
Abstract

In this study, we calculate the $γZ$-exchange contribution to elastic $ep$ scattering at large momentum transfer within perturbative QCD. We present analytical expressions for the $γZ$-exchange contributions to the amplitudes. We also estimate the asymptotic behaviors of the amplitude contributions and of the physical quantity $A_{\text{PV}}$ at high momentum transfer. These asymptotic behaviors determine the subtraction order in the dispersion relations (DRs) satisfied by the amplitudes. We find that the DR usually used in the literature for the axial-vector part of the amplitude is not valid at high $Q^2$ and should be modified to a once-subtracted form. Within the present pQCD framework and the adopted proton distribution amplitudes, these high-energy properties also provide nontrivial constraints on low-energy DR assumptions.

2604.21609 2026-04-24 physics.optics nlin.PS

Hybridization of Kerr Solitons in Coupled Microresonators

Alena Kolesnikova, Ivan Pshenichnyuk, Andrey Gelash

Comments 7 pages, 7 figures

Details
Abstract

Recent advances in manufacturing photonic integrated devices enable efficient coupling between high-Q microresonators in both linear and nonlinear regimes, creating a tunable, complex, hybridized optical system. Considering two coupled microresonators with normal and anomalous dispersion and equal free spectral range (FSR), we theoretically predict a novel nonlinear phenomenon: fully coherent hybridization of dissipative Kerr solitons (DKS) and propose a realistic integrated photonic design for its experimental observation. Using the Lugiato-Lefever equations in the supermode basis, we show that the emergent picture of inter-resonator DKS interactions can be understood as the formation of coherent structures in both supermodes generated by an unusual four-wave mixing process. The found hybridized DKS states can exhibit a broad, flat spectral profile near the pumped mode and remarkable oscillatory features in the spectral wings, promising broad applications in the generation and control of optical Kerr frequency combs.

2604.21608 2026-04-24 eess.SY cs.SY

ADMM-Based Distributed Kalman-like Observer with Applications to Cooperative Localization

Nicola De Carli, Nicola Bastianello, Dimos V. Dimarogonas

Details
Abstract

This paper addresses distributed state estimation for multi-agent systems with local and relative measurements, motivated by cooperative localization problems in which the global state dimension scales with the size of the network. We consider a Kalman-like observer in information form and introduce a sparsity-preserving prediction step based on an exponential forgetting factor, thereby avoiding the dense Riccati recursion of the standard information filter. The correction step is recast as a strongly convex quadratic program with structure induced by the sensing graph, which enables a distributed solution based on the alternating direction method of multipliers (ADMM). In the resulting scheme, each agent updates local copies of its own correction variable and those of its neighbors using only local communication, thus avoiding centralized matrix inversion and consensus over full global-state quantities. A two-time-scale stability analysis is developed for the interconnected observer: the reduced estimation-error dynamics are shown to be uniformly exponentially stable, the ADMM dynamics define an exponentially stable fast subsystem, and these properties are combined to establish uniform exponential stability of the overall distributed observer. Numerical simulations in a multi-agent cooperative localization scenario illustrate the performance of the proposed distributed observer.

2604.21607 2026-04-24 math.CO

On the hamiltonicity problem of bicirculants: a reduction to cyclic Haar graphs

Simona Bonvicini, Tomaž Pisanski, Arjana Žitnik

Comments 26 pages, 3 figures

Details
Abstract

A bicirculant is a regular graph that admits an automorphism having two vertex-orbits of the same size. A bicirculant can be described as follows. Given an integer $m \ge 1$ and sets $R, S, T \subseteq \mathbb Z_m$ such that $R=-R$, $T=-T$, $0 \not\in R \cup T$ and $0 \in S$, the graph $B(m;R,S,T)$ has vertex set $V=\{u_0,\dots,u_{m-1},v_0,\dots,v_{m-1}\}$ and edge set $E=\{u_iu_{i+j}| \ i \in\mathbb Z_m, j \in R\} \cup \{v_iv_{i+j}| \ i \in\mathbb Z_m, j \in T\} \cup\{u_iv_{i+j}| \ i \in\mathbb Z_m, j \in S\}.$ Bicirculant graphs with $R=T=\emptyset$ are known as cyclic Haar graphs. In 2025 we conjectured that the only non-hamiltonian graphs among regular connected bicirculants of degree more than one are the generalized Petersen graphs $G(m,2)$ with $m \equiv 5 \pmod 6$. Recently we have verified the conjecture for bicirculants with $|S|\le 2$ and for bicirculants with $|R|=|T|$ odd. In this paper we show that the conjecture holds for all bicirculants with $|S| \le 3$ and for all bicirculants with $|S| \ge 4$ and $m/\gcd(m, S)$ even. As a byproduct of our results, we prove that every connected bicirculant graph on $2m$ vertices with $|S| \ge 4$ is hamiltonian for even $m< 9\,240$, and for odd $m< 3\,465$. Finally, we show that the existence of a hamilton cycle in every connected cyclic Haar graph of valence at least $4$ implies that every connected bicirculant graph of valence at least $4$ is hamiltonian.
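The definition of $B(m;R,S,T)$ above can be made concrete with a short sketch that builds the edge set directly from the three index sets. The function names and the vertex encoding as pairs `("u", i)` and `("v", i)` are assumptions for illustration; the two examples are the Petersen graph $G(5,2)$, which belongs to the conjectured exceptional family since $5 \equiv 5 \pmod 6$, and a cubic cyclic Haar graph.

```python
def bicirculant_edges(m, R, S, T):
    """Edge set of B(m; R, S, T) on vertices ('u', i) and ('v', i), i in Z_m.

    Assumes R = -R, T = -T, 0 not in R or T, and 0 in S, as in the definition.
    """
    edges = set()
    for i in range(m):
        for j in R:
            edges.add(frozenset({("u", i), ("u", (i + j) % m)}))
        for j in T:
            edges.add(frozenset({("v", i), ("v", (i + j) % m)}))
        for j in S:
            edges.add(frozenset({("u", i), ("v", (i + j) % m)}))
    return edges


def degrees(m, edges):
    """Degree of every vertex of a bicirculant with the given edge set."""
    deg = {(side, i): 0 for side in "uv" for i in range(m)}
    for edge in edges:
        for vertex in edge:
            deg[vertex] += 1
    return deg


# Generalized Petersen graph G(5, 2): outer 5-cycle, spokes, inner pentagram.
petersen = bicirculant_edges(5, R={1, 4}, S={0}, T={2, 3})

# A cyclic Haar graph: R and T are empty, so all edges join u-vertices to v-vertices.
haar = bicirculant_edges(7, R=set(), S={0, 1, 3}, T=set())
```

Storing each edge as a `frozenset` merges the two orientations $u_iu_{i+j}$ and $u_{i+j}u_i$ that arise because $R=-R$ and $T=-T$.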

2604.21606 2026-04-24 cs.CR

Process-Mining of Hypertraces: Enabling Scalable Formal Security Verification of (Automotive) Network Architectures

Julius Figge, David Knuplesch, Andreas Maletti, Dragan Zuvic

Comments Full version prior to submission for publication

Details
Abstract

The automotive domain is transitioning: vehicles act as rolling servers, persistently connected to numerous external entities. This connectivity, combined with rising on-board computing power for advanced driver assistance systems and similar use cases, creates escalating challenges for securing automotive network architectures. This work advances the security analysis of internet-connected automotive network architectures and their protocols. We introduce a strong, active adversary model tailored to the automotive domain. We substantially extend the security protocol verification possible with Attack Resilience Hyperproperties (ARHs) by introducing a verification-orchestration algorithm. Furthermore, we provide methods for comparative attribution of security property invalidations to specific, fine-grained component compromises. We present a novel integration of formal verification and process mining. By utilizing ARH counterexample traces for process mining, we systematically identify and aggregate attacker behavior that causes security property invalidations. This pipeline enables in-depth understanding of root causes and attack paths leading to protocol-security invalidations. We demonstrate real-world applicability through a prototype and case study on the secure transmission of battery management system data within an automotive network architecture.

2604.21604 2026-04-24 cs.CR cs.CY econ.GN q-fin.EC

Mitigate or Fail: How Risk Management Shapes Cybersecurity Competency

Jeffrey T. Gardiner

Comments Doctor of Business Administration (DBA) Dissertation

Details
Abstract

Contemporary cybersecurity governance assumes that professionals apply risk reasoning. Yet major organisational failures persist despite investment in tools, staffing, and credentials. This study investigates the structural source of that paradox. Cybersecurity speaks the language of risk, but its training architecture has shaped the profession to think in terms of threats. A sequential mixed-methods design integrated four analyses: NLP of the NIST NICE Framework v2.0.0 (2,111 TKS statements), SEM (n = 126 cybersecurity professionals), a control-group comparison (n = 133 general professionals), and thematic coding of seven leadership interviews. Four convergent findings emerged. First, "likelihood" and "probability" appear zero times across all TKS statements. Risk management content accounts for 4.5% of high-confidence semantic classifications, ranking 18th of 29 competency domains. NICE codifies threat-management activity while invoking risk mainly at the category level. Second, SEM showed that training exposure significantly predicts risk management competence directly and indirectly through conceptual salience, for a total effect of Beta = .629. However, the theoretically four-dimensional competence construct collapsed into a single factor, indicating epistemic compression. Third, cybersecurity professionals showed no measurable advantage over the general professional population in foundational risk reasoning; only 11.9% showed high differentiation. Fourth, all seven leaders expected Likelihood x Impact reasoning, yet five did not articulate the formula themselves. These findings support a structural conclusion: cybersecurity has taken professional form as a threat-management discipline that has borrowed risk vocabulary. Remediation requires redesign of professional formation, not marginal curriculum reform.

2604.21600 2026-04-24 math.NA cs.NA

Positivity-Preserving and Entropy-Stable Oscillation-Eliminating DGSEM for the Compressible Euler Equations on Curvilinear Meshes with Adaptive Mesh Refinement

Jieling Yang, Guosheng Fu

Details
Abstract

We extend the entropy-stable oscillation-eliminating discontinuous Galerkin spectral element method (ES-OEDG) on curvilinear meshes to adaptive mesh refinement (AMR) grids with nonconforming interfaces. The formulation targets two-dimensional curvilinear quadrilateral meshes under a 2:1 refinement constraint, allowing a single level of hanging nodes. Elementwise volume discretization and geometric mapping are retained, while oscillation elimination and interface coupling are adapted for nonconforming interfaces. A central contribution is the design and analysis of numerical fluxes for such interfaces. We construct an entropy-stable flux that ensures global conservation and a semi-discrete entropy inequality. However, for polynomial degree N >= 2, negative entries in nonconforming interpolation operators lead to loss of formal high-order consistency. To address this, we propose a mortar-based flux that preserves high-order accuracy by interpolating at the solution level and evaluating standard two-point fluxes on fine-side mortars, at the cost of losing provable entropy stability. We also extend the Zhang--Shu positivity-preserving framework to curvilinear AMR meshes. Under forward Euler time stepping and a suitable CFL condition, the scheme using either flux preserves positivity of cell-average density and pressure. Combined with the Zhang--Shu limiter, this yields a fully discrete scheme maintaining admissibility at all nodal points. We further incorporate shock-indicator-based AMR and a conservative, positivity-preserving data transfer procedure between successive meshes, resulting in a robust and efficient algorithm. Numerical experiments on Cartesian and curvilinear AMR grids confirm high-order accuracy and robustness.

2604.21596 2026-04-24 stat.ME stat.CO

Efficient Bayes Factor Sensitivity Analysis via Posterior Density Ratios

František Bartoš, Eric-Jan Wagenmakers, Maarten Marsman, Don van den Bergh

Details
Abstract

Bayes factor sensitivity analysis examines how the evidence for one hypothesis over another depends on the prior distribution. In complex models, the standard approach refits the model at each hyper-parameter value, and the total computational cost scales linearly in the grid size. We propose a method that recovers the entire sensitivity curve from a single additional model fit. The key identity decomposes the Bayes factor at any hyper-parameter value $γ_x$ into an ``anchor'' Bayes factor at a fixed reference $γ_0$ and a Savage--Dickey density ratio in an extended model that places a hyper-prior on $γ$. Once this extended model is fit, the Bayes factor at any $γ_x$ follows from the anchor value and a ratio of two posterior density ordinates. To approximate this ratio, we employ the importance-weighted marginal density estimator (IWMDE). Because the sensitivity parameter enters the model only through the prior distribution on the model parameters, the data likelihood cancels in the IWMDE, reducing it to a simple ratio of prior density evaluations on the MCMC draws, without any additional likelihood computation. The resulting estimator is fast, remains accurate even with small MCMC samples, and substantially outperforms kernel density estimation across the full sensitivity range. The method extends naturally to simultaneous sensitivity over multiple hyper-parameters and to Bayesian model averaging. We illustrate it on a univariate Bayesian $t$-test with exact Bayes factors for validation, a bivariate informed $t$-test, and a Bayesian model-averaged meta-analysis, obtaining accurate sensitivity curves at a fraction of the brute-force cost.
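The decomposition described in this abstract can be written out as a short derivation. This is a sketch under stated assumptions, not necessarily the authors' exact formulation: only the alternative model $\mathcal M_1$ is assumed to depend on $γ$, $π(γ)$ denotes the hyper-prior of the extended model, and $m_0$, $m_1(γ)$ are our notation for the marginal likelihoods of the data $y$.

```latex
% Posterior of the hyper-parameter in the extended model (Bayes' rule):
%   p(\gamma \mid y) = \frac{m_1(\gamma)\,\pi(\gamma)}{\int m_1(\gamma')\,\pi(\gamma')\,d\gamma'}
% Taking the ratio at \gamma_x and \gamma_0 cancels the normalizing integral:
\frac{m_1(\gamma_x)}{m_1(\gamma_0)}
  = \frac{p(\gamma_x \mid y)\,/\,\pi(\gamma_x)}{p(\gamma_0 \mid y)\,/\,\pi(\gamma_0)}
% With \mathrm{BF}_{10}(\gamma) = m_1(\gamma)/m_0 and m_0 independent of \gamma:
\mathrm{BF}_{10}(\gamma_x)
  = \mathrm{BF}_{10}(\gamma_0)\,
    \frac{p(\gamma_x \mid y)}{p(\gamma_0 \mid y)}\,
    \frac{\pi(\gamma_0)}{\pi(\gamma_x)}
```

Under these assumptions the only quantities estimated from the MCMC output are the two posterior density ordinates $p(γ_x \mid y)$ and $p(γ_0 \mid y)$; the hyper-prior ratio is known in closed form.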

2604.21591 2026-04-24 math.PR

Long-time dynamics of stochastic 2D hydrodynamic-type evolution equations driven by multiplicative Lévy noise

Jiangwei Zhang

Details
Abstract

This paper investigates the long-time dynamics of solutions for an abstract nonlinear stochastic hydrodynamic-type equation driven by multiplicative Lévy noise. The framework encompasses several key hydrodynamical models, including the stochastic 2D Navier-Stokes equations, magnetohydrodynamic equations, the magnetic Bénard problem, as well as various stochastic shell models of turbulence. Under the assumption that the nonlinear noise coefficients satisfy local Lipschitz and linear growth conditions, we first establish global well-posedness using a truncation technique. Then, by introducing a mean random dynamical system, we prove the existence and uniqueness of weak pullback mean random attractors for the system. Furthermore, when the external force is time-independent, we study the existence of invariant measures for the corresponding autonomous system, as well as the double limiting behavior of invariant measures with respect to the intensities of Gaussian and Lévy noise. Finally, under additional assumptions on the bilinear nonlinear term (e.g., as in the Navier-Stokes equations), we examine the existence and uniqueness of pullback measure attractors, along with the asymptotically autonomous stability of such attractors as the time parameter tends to negative infinity. It is worth noting that the results of this paper are new even for the single stochastic 2D Navier-Stokes equations.