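arXivDaily: a daily academic digest synced with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.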

2604.16288 2026-04-20 math.AP cond-mat.stat-mech math-ph math.MP math.PR stat.ML

Phase transitions in Doi-Onsager, Noisy Transformer, and other multimodal models

Kyunghoo Mun, Matthew Rosenzweig

Comments 16 pages

English abstract

We study phase transitions for repulsive-attractive mean-field free energies on the circle. For a $\frac{1}{n+1}$-periodic interaction whose Fourier coefficients satisfy a certain decay condition, we prove that the critical coupling strength $K_c$ coincides with the linear stability threshold $K_\#$ of the uniform distribution and that the phase transition is continuous, in the sense that the uniform distribution is the unique global minimizer at criticality. The proof is based on a sharp coercivity estimate for the free energy obtained from the constrained Lebedev--Milin inequality. We apply this result to three motivating models for which the exact value of the phase transition and its (dis)continuity in terms of the model parameters were not fully known. For the two-dimensional Doi--Onsager model $W(\theta)=-|\sin(2\pi\theta)|$, we prove that the phase transition is continuous at $K_c=K_\#=3\pi/4$. For the noisy transformer model $W_\beta(\theta)=(e^{\beta\cos(2\pi\theta)}-1)/\beta$, we identify the sharp threshold $\beta_*$ such that $K_c(\beta) = K_\#(\beta)$ and the phase transition is continuous for $\beta\leq \beta_*$, while $K_c(\beta)<K_\#(\beta)$ and the phase transition is discontinuous for $\beta> \beta_*$. We also obtain the corresponding sharp dichotomy for the noisy Hegselmann--Krause model $W_{R}(\theta) = (R-2\pi|\theta|)_{+}^2$.

2604.16239 2026-04-20 stat.ML cs.LG

Adaptive multi-fidelity optimization with fast learning rates

Côme Fiegel, Victor Gabillon, Michal Valko

Comments Published at International Conference on Artificial Intelligence and Statistics (AISTATS) 2020

2604.16238 2026-04-20 cs.LG physics.ao-ph stat.ML

Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction

Hannah Guan, Soukayna Mouatadid, Paulo Orenstein, Judah Cohen, Haiyu Dong, Zekun Ni, Jeremy Berman, Genevieve Flaspohler, Alex Lu, Jakob Schloer, Joshua Talib, Jonathan A. Weyn, Lester Mackey

2604.16221 2026-04-20 stat.ME

Network Meta-analysis and Diffusion

Gerta Rücker, Annabel L. Davies, Guido Schwarzer

Comments 19 pages, 8 figures

2604.16219 2026-04-20 math.ST stat.ME stat.TH

Simultaneous Inference for Covariance and Precision Matrices of Long-Range Dependent Time Series

Percy S. Zhai, Mladen Kolar, Wei Biao Wu

2604.16206 2026-04-20 math.PR math.ST stat.TH

Extrapolation of max-stable random fields with Fréchet marginals

Vitalii Makogin, Evgeny Spodarev, Ilja Sukhanov

Comments 32 pages, 9 figures

2604.16203 2026-04-20 stat.ME stat.AP stat.ML

A Bayesian Updating Framework for Long-term Multi-Environment Trial Data in Plant Breeding

Stephan Bark, Waqas Ahmed Malik, Maryna Prus, Hans-Peter Piepho, Volker Schmid

Comments 27 pages, 4 figures, 2 tables; includes supplementary material and reproducible code (GitHub link)

English abstract

In variety testing, multi-environment trials (MET) are essential for evaluating the genotypic performance of crop plants. A persistent challenge in the statistical analysis of MET data is the estimation of variance components, which are often still inaccurately estimated or shrunk to exactly zero when using residual (restricted) maximum likelihood (REML) approaches. At the same time, institutions conducting MET typically possess extensive historical data that can, in principle, be leveraged to improve variance component estimation. However, these data are rarely incorporated sufficiently. The purpose of this paper is to address this gap by proposing a Bayesian framework that systematically integrates historical information to stabilize variance component estimation and better quantify uncertainty. Our Bayesian linear mixed model (BLMM) reformulation uses priors and Markov chain Monte Carlo (MCMC) methods to maintain the variance components as positive, yielding more realistic distributional estimates. Furthermore, our model incorporates historical prior information by managing MET data in successive historical data windows. Variance component prior and posterior distributions are shown to be conjugate and belong to the inverse gamma and inverse Wishart families. While Bayesian methodology is increasingly being used for analyzing MET data, to the best of our knowledge, this study comprises one of the first serious attempts to objectively inform priors in the context of MET data. This refers to the proposed Bayesian updating approach. To demonstrate the framework, we consider an application where posterior variance component samples are plugged into an A-optimality experimental design criterion to determine the average optimal allocations of trials to agro-ecological zones in a sub-divided target population of environments (TPE).
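
The conjugate updating idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a much simpler setting than the paper's BLMM: a single variance component, zero-mean normal effects that are observed directly, and an inverse-gamma prior. The class name `InverseGamma`, the function `update`, the prior values, and the window data are all hypothetical and chosen only for illustration; the authors' actual model relies on MCMC and also involves inverse Wishart priors.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class InverseGamma:
    """Inverse-gamma distribution with the given shape and scale."""
    shape: float
    scale: float

    def mean(self) -> float:
        # The mean exists only for shape > 1.
        if self.shape <= 1:
            raise ValueError("mean is undefined for shape <= 1")
        return self.scale / (self.shape - 1)


def update(prior: InverseGamma, effects: Sequence[float]) -> InverseGamma:
    """Conjugate update for the variance of zero-mean normal effects.

    If effects ~ N(0, sigma^2) and sigma^2 ~ IG(a, b), then the posterior
    is IG(a + n/2, b + sum(effects^2)/2).
    """
    n = len(effects)
    sum_sq = sum(e * e for e in effects)
    return InverseGamma(prior.shape + n / 2, prior.scale + sum_sq / 2)


# Hypothetical effects from two successive historical data windows.
window_1 = [0.5, -1.0, 1.5, 0.0]
window_2 = [1.0, -2.0]

prior = InverseGamma(shape=2.0, scale=1.0)   # assumed starting prior
after_1 = update(prior, window_1)            # posterior after window 1
after_2 = update(after_1, window_2)          # window-1 posterior acts as the prior
pooled = update(prior, window_1 + window_2)  # same data analysed in one batch

print(after_2, pooled, after_2.mean())
```

In this toy setting, updating window by window gives the same posterior as pooling all the data at once, which is the property that makes carrying a posterior forward as the next prior coherent.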

2604.16111 2026-04-20 cs.LG stat.ML

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

Comments Accepted at the 32nd International Conference on Algorithmic Learning Theory (ALT 2021)

2604.16087 2026-04-20 cs.LG stat.ML

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback

Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Michal Valko, Vianney Perchet

Comments Accepted at the 42nd International Conference on Machine Learning (ICML 2025)

2604.16086 2026-04-20 cs.CV cs.AI cs.LG stat.ML

Stylistic-STORM (ST-STORM) : Perceiving the Semantic Nature of Appearance

Hamed Ouattara, Pierre Duthon, Pascal Houssam Salmane, Frédéric Bernardin, Omar Ait Aider

Comments 20 pages, 16 figures, ICPR 2026 (28th International Conference on Pattern Recognition)

English abstract

One of the dominant paradigms in self-supervised learning (SSL), illustrated by MoCo or DINO, aims to produce robust representations by capturing features that are insensitive to certain image transformations such as illumination, or geometric changes. This strategy is appropriate when the objective is to recognize objects independently of their appearance. However, it becomes counterproductive as soon as appearance itself constitutes the discriminative signal. In weather analysis, for example, rain streaks, snow granularity, atmospheric scattering, as well as reflections and halos, are not noise: they carry the essential information. In critical applications such as autonomous driving, ignoring these cues is risky, since grip and visibility depend directly on ground conditions and atmospheric conditions. We introduce ST-STORM, a hybrid SSL framework that treats appearance (style) as a semantic modality to be disentangled from content. Our architecture explicitly separates two latent streams, regulated by gating mechanisms. The Content branch aims at a stable semantic representation through a JEPA scheme coupled with a contrastive objective, promoting invariance to appearance variations. In parallel, the Style branch is constrained to capture appearance signatures (textures, contrasts, scattering) through feature prediction and reconstruction under an adversarial constraint. We evaluate ST-STORM on several tasks, including object classification (ImageNet-1K), fine-grained weather characterization, and melanoma detection (ISIC 2024 Challenge). The results show that the Style branch effectively isolates complex appearance phenomena (F1=97% on Multi-Weather and F1=94% on ISIC 2024 with 10% labeled data), without degrading the semantic performance (F1=80% on ImageNet-1K) of the Content branch, and improves the preservation of critical appearance

2604.08809 2026-04-20 cs.LG stat.AP

Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis

Haonan Zhu, Adrienne Deganutti, Elad Hirsch, Purvanshi Mehta

2604.08691 2026-04-20 math.ST cs.CC math.PR stat.TH

Planted clique detection and recovery from the hypergraph adjacency matrix

Kalle Alaluusua, B. R. Vinay Kumar

Comments 45 pages. This revision fixes a measurability issue in the leave--one--out proof by separating a measurable eigenvector representative from the subsequent sign choice. It also removes an unnecessary factor left over from an earlier modification, which makes the argument more transparent

2603.03188 2026-04-20 stat.ML cs.LG

Scalable Posterior Uncertainty for Flexible Density-Based Clustering

Nicola Bariletto, Stephen G. Walker

2512.14504 2026-04-20 stat.ME

A flexible class of latent variable models for the analysis of antibody response data

Emanuele Giorgi, Jonas Wallin

Comments This is a working paper, and updated versions will be released in the future. For further information about this research, please contact Emanuele Giorgi at e.giorgi@bham.ac.uk

2508.12886 2026-04-20 stat.AP

Forecasting Extreme Day and Night Heat in Paris: A Proof of Concept

Richard Berk

Comments 5 figures and 2 pseudocode tables. Revised with new technical material added. Prose edited. References updated

2507.05701 2026-04-20 stat.ME

Area-based epigraph and hypograph indices for functional outlier detection

Belen Pulido, Alba M. Franco-Pereira, Rosa E. Lillo, Fabian Scheipl

Comments 24 pages

2502.15036 2026-04-20 math.ST stat.TH

Extreme Value Analysis based on Blockwise Top-Two Order Statistics

Axel Bücher, Erik Haufs

Comments 96 pages

2412.19363 2026-04-20 cs.AI cs.LG stat.ME stat.ML

Large Language Models for Market Research: A Data-augmentation Approach

Mengxin Wang, Dennis J. Zhang, Heng Zhang

2409.01794 2026-04-20 stat.ME cs.LG stat.ML

Estimating Joint Interventional Distributions from Marginal Interventional Data

Sergio Hernan Garrido Mejia, Elke Kirschbaum, Armin Kekić, Bernhard Schölkopf, Atalanti Mastakouri

Comments Accepted at the Causal Reasoning and Learning (CLeaR) conference 2026

2407.14781 2026-04-20 math.ST cs.NA math.AP math.NA math.PR stat.TH

Bernstein-von Mises theorems for time evolution equations

Richard Nickl

Comments 54 pages

2604.16031 2026-04-20 stat.ME stat.AP

A Comparison of Joint and Stepwise Dynamic Cognitive Diagnostic Models

Yawen Ma, Anastasia Ushakova, Kate Cain, Gabriel Wallin

Comments 14 pages, 7 tables

2604.15980 2026-04-20 math.ST stat.TH

Decompounding on Compact Symmetric Spaces

Erik Kennerland

2604.15940 2026-04-20 cs.LG stat.AP

(Weighted) Adaptive Radius Near Neighbor Search: Evaluation for WiFi Fingerprint-based Positioning

Khang Le, Joaquín Torres-Sospedra, Philipp Müller

Comments 11 pages, 2 figures, 2 tables, submitted to IPIN 2026

2604.15889 2026-04-20 stat.CO

Markov embedding of ranked unlabelled evolutionary trees and its applications

Lasse Thorup Fallesen, Simon Pauli, Elisabeth Sommer James, Lars Nørvang Andersen, Asger Hobolth

Comments 26 pages, 15 figures

2604.15773 2026-04-20 cond-mat.stat-mech cs.AI stat.ME

Phase Transitions as the Breakdown of Statistical Indistinguishability

Taiyo Narita, Hideyuki Miyahara

2604.15742 2026-04-20 cs.LG hep-th stat.ML

Collective Kernel EFT for Pre-activation ResNets

Hidetoshi Kawase, Toshihiro Ota

Comments 20 pages

2604.15696 2026-04-20 stat.ME math.PR

Testing and estimation of the index of stability of univariate and bivariate symmetric $\alpha$-stable distributions via modified Greenwood statistic

Katarzyna Skowronek, Marek Arendarczyk, Anna K. Panorska, Tomasz J. Kozubowski, Agnieszka Wyłomańska

2604.15544 2026-04-20 stat.AP stat.ME

Practical Process Capability Indices Workflows

Fei Jiang, Lei Yang

Comments 12 pages, 5 figures and 5 tables

2604.15538 2026-04-20 stat.ML cs.LG

PRIM-cipal components analysis

Tianhao Liu, Daniel Andrés Díaz-Pachón, J. Sunil Rao

Comments 12 pages, 46 figures

2604.15531 2026-04-20 q-fin.ST stat.ME stat.ML

Spurious Predictability in Financial Machine Learning

Sotirios D. Nikolopoulos

Comments 49 pages, 10 figures. The QuantAudit R package and full replication scripts will be made publicly available upon journal publication

2604.15504 2026-04-20 cs.SI stat.AP

A Quasi-Experiment comparing the health of unhoused people who have and have not experienced an eviction in King County, WA

Ihsan Kahveci, Timothy A. Thomas, Nathalie E. Williams, Janelle Rothfolk, Cathea Carey, Paul Hebert, Amy Hagopian, Zack W. Almquist

2604.15469 2026-04-20 stat.ME

Sample continuation in Bayesian hierarchical model via variational inference

Yucong Liu, Zilai Si, Alexander Strang

2604.15452 2026-04-20 stat.ME stat.CO

Spatially continuous modelling of aggregated outcome data

Stephen Jun Villejo, Peter Diggle, Finn Lindgren, Haavard Rue, Guangquan Li, Ella White, Matthew Wade, Marta Blangiardo

2604.15392 2026-04-20 cs.LG cs.AI stat.ML

Lightweight Geometric Adaptation for Training Physics-Informed Neural Networks

Kang An, Chenhao Si, Shiqian Ma, Ming Yan

Comments 22 pages, Chenhao Si and Kang An contributed equally to this work. Their authorship order was determined randomly

2604.11305 2026-04-20 cs.LG cs.IT math.IT stat.ML

Beyond Fixed False Discovery Rates: Post-Hoc Conformal Selection with E-Variables

Meiyi Zhu, Osvaldo Simeone

Comments 32 pages, 29 figures

2604.10013 2026-04-20 stat.ME math.OC

Toward Exact Convergence in Byzantine-Robust Decentralized Learning: A Statistical Identification Approach

Siyuan Zhang, Chengde Qian, Xin Liu, Changliang Zou

Comments 52 pages, 7 figures

2604.07770 2026-04-20 stat.ME

Efficient Targeted Maximum Likelihood Estimation of Average Treatment Effects under Structured Outcome Models with Unknown Error Distributions

Mijeong Kim

2604.00843 2026-04-20 math.AP cs.NA math.NA math.PR math.ST stat.TH

Sharp local sparsity of regularized optimal transport

Alberto González-Sanz, Rishabh S. Gvalani, Lukas Koch

Comments 18 pages, no figures, fixed typo in first author's name

2602.14630 2026-04-20 astro-ph.CO stat.ML

Bayesian Cosmic Void Finding with Graph Flows

Leander Thiele

Comments 8+3 pages, 9+2 figures; v2: Published in OJAp

2602.07006 2026-04-20 cs.CV cs.LG stat.ML

Scalable spatial point process models for forensic footwear analysis

Alokesh Manna, Neil Spencer, Dipak K. Dey

2602.06105 2026-04-20 stat.ML cs.LG math.AG

Robustness Verification of Polynomial Neural Networks

Yulia Alexandr, Hao Duan, Guido Montúfar

2601.17734 2026-04-20 stat.ME

Group Permutation Testing in Linear Model: Sharp Validity, Power Improvement, and Extension Beyond Exchangeability

Zonghan Li, Hongyi Zhou, Zhiheng Zhang

Comments 74 pages, 3 figures. Includes supplementary material

English abstract

We consider finite-sample inference for a single regression coefficient in the fixed-design linear model $Y = Z\beta + bX + \varepsilon$, where $\varepsilon\in\mathbb{R}^n$ may exhibit complex dependence or heterogeneity. We develop a group permutation framework, yielding a unified and analyzable randomization structure for linear-model testing. Under exchangeable errors, we place permutation-augmented regression tests within this group-theoretic setting and show that a grouped version of PALMRT controls Type I error at level at most $2\alpha$ for any permutation group; moreover, we provide a worst-case construction demonstrating that the factor $2$ is sharp and cannot be improved without additional assumptions. Second, we relate the Type II error to a design-dependent geometric separation. We formulate it as a combinatorial optimization problem over permutation groups and bound it under additional mild sub-Gaussian assumptions. For the Type II error upper bound control, we propose a constructive algorithm for the permutation strategy that is better (at least no worse) than the i.i.d. permutation, with simulations empirically indicating substantial power gains, especially under heavy-tailed designs. Finally, we extend group-based CPT and PALMRT beyond exchangeability by connecting rank-based randomization arguments to conformal inference. The resulting weighted group tests satisfy finite-sample Type I error bounds that degrade gracefully with a weighted average of total variation distances between $\varepsilon$ and its group-permuted versions, recovering exact validity when these discrepancies vanish and yielding quantitative robustness otherwise. Taken together, the group-permutation viewpoint provides a principled bridge from exact randomization validity to design-adaptive power and quantitative robustness under approximate symmetries.

2601.01854 2026-04-20 stat.ME

Causal inference for censored data with continuous marks

Lianqiang Qu, Long Lv, Liuquan Sun

Comments This paper is a replacement for the previous work titled "Causal inference for censored data with continuous marks." In the current version, we introduce a new definition of causal inference by considering the mark as a post-treatment variable. This approach offers a clearer causal interpretation compared to the previous version

2510.21934 2026-04-20 cs.LG stat.ML

Joint Score-Threshold Optimization for Interpretable Risk Assessment

Fardin Ganjkhanloo, Emmett Springer, Erik H. Hoyer, Daniel L. Young, Kimia Ghobadi

2510.12700 2026-04-20 cs.LG cs.AI cs.CG math.AT stat.ML

Topological Signatures of ReLU Neural Network Activation Patterns

Vicente Bosca, Tatum Rask, Sunia Tanweer, Andrew R. Tawfeek, Branden Stone

2510.10959 2026-04-20 cs.LG cs.AI cs.CL stat.ML

Revisiting Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning

Xiaoyun Zhang, Xiaojian Yuan, Di Huang, Wang You, Chen Hu, Jingqing Ruan, Ai Jian, Kejiang Chen, Xing Hu

Comments 16 pages, 4 figures

2509.24397 2026-04-20 stat.AP

Assessing Roundabout Safety Perceptions under Heterogeneous Traffic: Socio-Demographic and Geometric Influences in Indian Urban Contexts

Abhijnan Maji, Indrajit Ghosh

DOI: 10.1016/j.ijtst.2025.08.008

English abstract

Evaluation of the safety perceptions of roundabout users is crucial for improving road safety in mixed-traffic environments. The crash- and conflict-based analyses do not incorporate the socio-demographic characteristics of the roundabout users, which can only be captured through questionnaire surveys on a larger scale. This research evaluated the relationship of roundabout safety perception with demographic factors, driving characteristics, and varying roundabout geometries using multiple correspondence analysis, cluster analysis, factor analysis, and multinomial logistic regression. The study analyzed data from 1,530 respondents across two Indian cities. The study identified three roundabout user clusters. Single-lane roundabouts were perceived as safer during entry and circulation, with a significant prominence among middle-aged users. In contrast, double- and multi-lane roundabouts presented higher perceived risks during exit maneuvers, especially among young, inexperienced, unemployed/self-employed users. Vulnerable road users reported significantly higher perceived risks, especially under suboptimal lighting conditions. Respondents with 10-20 years of driving experience, especially car users, perceived lower risk at single-lane roundabouts but acknowledged the higher risk linked to speed variations and complex maneuvers at multi-lane roundabouts. Driving experience, vehicle type, and geometric configurations were crucial in roundabout safety perception. The study highlighted the need to improve the built environment of roundabouts for vulnerable road users. The roundabout merging area was perceived as the most dangerous spot; however, exits were also perceived as dangerous for double- and multi-lane roundabouts. The findings can benefit policymakers, engineers, and urban planners by enabling them to deploy targeted safety interventions based on issues highlighted in the study.

2509.19104 2026-04-20 cs.LG stat.ML

Online Distributionally Robust LLM Alignment via Regression to Relative Reward

Sharan Sahu, Martin T. Wells

Comments 70 pages, 7 figures, 1 table

2509.02772 2026-04-20 stat.ME stat.CO stat.ML

Inference on covariance structure in high-dimensional multi-view data

Lorenzo Mauri, David B. Dunson

Comments 22 pages including references (35 with appendix), 4 figures, 3 tables

2507.04962 2026-04-20 stat.ME

Covariance test for discretely observed functional data: when and how it works?

Yang Zhou, Jin Yang, Fang Yao

Comments 35 pages, 2 figures, 1 table

2507.03759 2026-04-20 stat.ML cs.LG

Sequential Regression Learning with Randomized Algorithms

Dorival Leão, Reiko Aoki, Alberto Ohashi, Teh Led Red

2505.02636 2026-04-20 math.OC math.ST stat.TH

Phase retrieval and matrix sensing via benign and overparametrized nonconvex optimization

Andrew D. McRae

2503.07976 2026-04-20 stat.ML cs.LG

Two-Dimensional Deep ReLU CNN Approximation for Korobov Functions: A Constructive Approach

Qin Fang, Lei Shi, Min Xu, Ding-Xuan Zhou

2502.19312 2026-04-20 cs.LG cs.AI cs.CL cs.HC stat.ML

FSPO: Few-Shot Optimization of Synthetic Preferences Personalizes to Real Users

Anikait Singh, Sheryl Hsu, Kyle Hsu, Eric Mitchell, Stefano Ermon, Tatsunori Hashimoto, Archit Sharma, Chelsea Finn

Comments Website: https://fewshot-preference-optimization.github.io/

2411.12502 2026-04-20 cs.LG cs.AI stat.ML

Transformer Neural Processes - Kernel Regression

Daniel Jenson, Jhonathan Navott, Mengyan Zhang, Makkunda Sharma, Elizaveta Semenova, Seth Flaxman

Comments This was superseded by 'Scalable Spatiotemporal Inference with Biased Scan Attention Transformer Neural Processes' (arXiv:2506.09163)

2411.05808 2026-04-20 math.ST math.PR stat.TH

Layered Hill estimator for extreme data in clusters

Taegyu Kang, Takashi Owada

Comments 36 pages

2408.07066 2026-04-20 stat.ME

Conformal prediction after data-dependent model selection

Ruiting Liang, Wanrong Zhu, Rina Foygel Barber

2303.12660 2026-04-20 cs.SI math.PR math.ST stat.TH

Structural Measures of Resilience for Supply Chains

Marios Papachristou, M. Amin Rahimian, Arash Azadegan

2301.05660 2026-04-20 physics.data-an math.ST stat.ME stat.TH

Learn your entropy from informative data: an axiom ensuring the consistent identification of generalized entropies

Andrea Somazzi, Diego Garlaschelli