arXivDaily: arXiv daily academic digest, updated Monday through Friday
2604.03218 2026-04-06 math.ST math.PR stat.ML stat.TH

Power one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$

Ashwin Ram, Aaditya Ramdas

Comments Preprint

Suppose we observe data from a distribution $P$ and we wish to test the composite null hypothesis that $P\in\mathscr P$ against a composite alternative $P\in \mathscr Q\subseteq \mathscr P^c$. Herbert Robbins and coauthors pointed out around 1970 that, while no batch test can have a level $\alpha\in(0,1)$ and power equal to one, sequential tests can be constructed with this fantastic property. Since then, and especially in the last decade, a plethora of sequential tests have been developed for a wide variety of settings. However, the literature has not yet provided a clean and general answer as to when such power-one sequential tests exist. This paper provides a remarkably general sufficient condition (that we also prove is not necessary). Focusing on i.i.d. laws in Polish spaces without any further restriction, we show that there exists a level-$\alpha$ sequential test for any weakly compact $\mathscr P$, that is power-one against $\mathscr P^c$ (or any subset thereof). We show how to aggregate such tests into an $e$-process for $\mathscr P$ that increases to infinity under $\mathscr P^c$. We conclude by building an $e$-process that is asymptotically relatively growth rate optimal against $\mathscr P^c$, an extremely powerful result.
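
To make the idea of a power-one sequential test concrete, here is a minimal sketch for the simplest possible case, a point null (a fair coin), which is not the paper's general construction. It assumes a uniform Beta(1,1) mixture over the alternative and rejects the first time the mixture likelihood ratio reaches $1/\alpha$; by Ville's inequality this keeps the level at $\alpha$ under the null. The function names `mixture_wealth` and `stopping_time` are illustrative only.

```python
import math

def mixture_wealth(heads, n):
    """Beta(1,1)-mixture likelihood ratio against the fair-coin null.

    The mixture likelihood is the integral of p^heads (1-p)^(n-heads) over
    p in [0, 1], which equals heads! (n-heads)! / (n+1)!. The null
    likelihood of any sequence of n flips is 2^(-n).
    """
    log_mix = (math.lgamma(heads + 1) + math.lgamma(n - heads + 1)
               - math.lgamma(n + 2))
    return math.exp(log_mix + n * math.log(2.0))

def stopping_time(flips, alpha=0.05):
    """Return the first n with wealth >= 1/alpha, or None if never reached."""
    heads = 0
    for n, x in enumerate(flips, start=1):
        heads += x
        if mixture_wealth(heads, n) >= 1.0 / alpha:
            return n
    return None

# A run of heads is rejected quickly; a perfectly balanced run never is.
print(stopping_time([1] * 20))      # 8
print(stopping_time([1, 0] * 50))   # None
```

Under any coin bias other than 1/2 the wealth grows without bound, so the test eventually stops, which is the power-one property in this toy setting.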

2604.03215 2026-04-06 stat.ME stat.AP

Directional Dependence of Extreme Events

Matthieu Garcin, Maxime L. D. Nicolas

This paper introduces a novel measure to quantify the directional dependence of extreme events between two variables. The proposed approach is designed to capture asymmetric tail dependence by studying conditional tail expectations of rank-transformed variables, thereby quantifying the behavior of one variable when the other takes extreme values. We investigate the theoretical asymptotic behavior of the associated estimator. The effectiveness of the approach is demonstrated through an extensive simulation study. In addition, we discuss the use of the proposed coefficient for the detection of causal effects in extreme events. Finally, we apply the method to an oceanographic dataset, where the results highlight the strong asymmetric nature of extreme events and identify the dominant directions of extremal influence among key oceanographic variables. As a directional measure of tail dependence, our approach provides a natural tool for exploring causal-effect relationships in extreme-value settings.

2604.03076 2026-04-06 stat.AP

Carbon cost pass-through rate in power system: evidence from Italy under the EU ETS

Pierdomenico Duttilo, Francesco Lisi

This paper investigates the impact of carbon pricing under the EU Emissions Trading System (EU ETS) on the Italian electricity market, focusing on the carbon cost pass-through rate (CPTR) across market zones during Phases 3 and 4 (2016-2024). Using daily data, the study applies an econometric framework based on a linear regression model with autoregressive dynamics to estimate the extent to which carbon costs are reflected in wholesale electricity prices. It further incorporates robustness checks and quantile regression to assess how the CPTR varies across different fuel spread levels. The results show that carbon costs are positively and significantly transmitted to electricity prices, confirming the relevance of carbon pricing as a key market driver. However, pass-through is incomplete, with CPTR values consistently below 100%. At the national level, the CPTR remains relatively stable at around 30% across the two phases. Substantial heterogeneity emerges across market zones: pass-through increases in the North, Centre-North, and Sardinia during Phase 4, while it declines in the Centre-South and Sicily, reflecting differences in generation mix, carbon intensity, and market conditions. Overall, the findings highlight the importance of zone-specific factors in shaping the effectiveness of carbon pricing in electricity markets.

2604.03073 2026-04-06 stat.ME

Modeling within-department homogeneity in research quality rankings: an application to the Italian ISPD

Giorgio E. Montanari, Marco Doretti

Comments 20 pages, 6 figures, 2 tables

In this paper, we consider the academic department ranking system of Italy, which is based on a performance index named Indice Standardizzato di Performance Dipartimentale (ISPD). While the ISPD has been criticized for its marked tendency to polarization, we here formalize an as-yet unexplored determinant of this phenomenon, namely the presence of within-department homogeneity among the standardized scores used to build the index. We account for this intra-departmental correlation by modeling it as a function of department size. The proposed model, estimated via maximum likelihood, allows one to build a fairer ranking procedure through a properly adjusted version of the ISPD. The estimation framework is also adapted to fit publicly available data, which are coarsened by rounding and/or left-truncated. To this end, a novel probability distribution termed Betoidal is introduced. Empirical evidence in favor of the proposed model is found in the 2017 and 2022 data. Moreover, a simulation study shows that the adjusted index significantly outperforms not only the original ISPD, but also other more data-demanding competing proposals.

2604.03068 2026-04-06 cond-mat.dis-nn cond-mat.stat-mech stat.ML

Escape dynamics and implicit bias of one-pass SGD in overparameterized quadratic networks

Dario Bocchi, Theotime Regimbeau, Carlo Lucibello, Luca Saglietti, Chiara Cammarota

Comments 30 pages, 6 figures

We analyze the one-pass stochastic gradient descent dynamics of a two-layer neural network with quadratic activations in a teacher--student framework. In the high-dimensional regime, where the input dimension $N$ and the number of samples $M$ diverge at fixed ratio $\alpha = M/N$, and for finite hidden widths $(p,p^*)$ of the student and teacher, respectively, we study the low-dimensional ordinary differential equations that govern the evolution of the student--teacher and student--student overlap matrices. We show that overparameterization ($p>p^*$) only modestly accelerates escape from a plateau of poor generalization by modifying the prefactor of the exponential decay of the loss. We then examine how unconstrained weight norms introduce a continuous rotational symmetry that results in a nontrivial manifold of zero-loss solutions for $p>1$. From this manifold the dynamics consistently selects the closest solution to the random initialization, as enforced by a conserved quantity in the ODEs governing the evolution of the overlaps. Finally, a Hessian analysis of the population-loss landscape confirms that the plateau and the solution manifold correspond to saddles with at least one negative eigenvalue and to marginal minima in the population-loss geometry, respectively.

2604.03015 2026-04-06 cs.LG math.PR stat.ML

Generating DDPM-based Samples from Tilted Distributions

Himadri Mandal, Dhruman Gupta, Rushil Gupta, Sarvesh Ravichandran Iyer, Agniv Bandyopadhyay, Achal Bassamboo, Varun Gupta, Sandeep Juneja

Comments 33 pages, 4 figures

Given $n$ independent samples from a $d$-dimensional probability distribution, our aim is to generate diffusion-based samples from a distribution obtained by tilting the original, where the degree of tilt is parametrized by $\theta\in \mathbb{R}^d$. We define a plug-in estimator and show that it is minimax-optimal. We develop Wasserstein bounds between the distribution of the plug-in estimator and the true distribution as a function of $n$ and $\theta$, illustrating regimes where the output and the desired true distribution are close. Further, under some assumptions, we prove the TV-accuracy of running diffusion on these tilted samples. Our theoretical results are supported by extensive simulations. Applications of our work include finance, weather and climate modelling, and many other domains, where the aim may be to generate samples from a tilted distribution that satisfies practically motivated moment constraints.
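
The abstract does not spell out the form of the tilt or of the plug-in estimator. As a hedged illustration only, the sketch below assumes an exponential tilt, in which the tilted density is proportional to $\exp(\theta^\top x)$ times the original, and approximates it by reweighting the empirical sample with self-normalized weights. The helper names `tilt_weights` and `resample_tilted` are hypothetical and are not taken from the paper.

```python
import numpy as np

def tilt_weights(samples, theta):
    """Self-normalized weights proportional to exp(theta . x_i).

    samples has shape (n, d) and theta has shape (d,). Assumes an
    exponential tilt, which is an illustrative choice.
    """
    scores = samples @ theta
    scores = scores - scores.max()   # stabilize the exponentials
    w = np.exp(scores)
    return w / w.sum()

def resample_tilted(samples, theta, size, rng):
    """Draw `size` points from the reweighted empirical distribution."""
    w = tilt_weights(samples, theta)
    idx = rng.choice(len(samples), size=size, p=w)
    return samples[idx]

# Three one-dimensional points; a tilt of log(2) doubles the weight per unit of x.
points = np.array([[0.0], [1.0], [2.0]])
print(tilt_weights(points, np.array([np.log(2.0)])))   # approximately [1/7, 2/7, 4/7]
```

The resampled points could then serve as training data for a diffusion model, which is the step whose accuracy the abstract says is analyzed under additional assumptions.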

2604.02992 2026-04-06 stat.OT stat.AP stat.ME

Why is Regularization Underused? An Empirical Study on Trust and Adoption of Statistical Methods

Konstantin Emil Thiel, Marléne Baumeister, Nicole Krämer, Andreas Groll, Markus Pauly, Magdalena Wischnewski

Statistical practice does not automatically follow methodological innovation. Regularization methods, widely advocated to reduce overfitting and stabilize inference, are readily available in modern software, but are not consistently used by data analysts. We investigate this implementation gap in a large-scale empirical study of trust in, and acceptance of, regularization techniques, based on $N = 606$ data analysts. Drawing on measurement frameworks from technology acceptance research, we survey practitioners and embed a randomized experiment to test whether written recommendation of regularization methods increases trust or intended use. We find no evidence of such an effect. Instead, adoption intentions are strongly associated with analysts' perceptions of ease of implementation and practical benefit, such as improved bias control or interpretability. Perceived social norms also emerge as a central driver. These results indicate that uptake of statistical methodology depends less on formal recommendations than on usability, perceived utility, and community practice.

2604.02887 2026-04-06 stat.ML cs.LG

Lipschitz bounds for integral kernels

Justin Reverdi, Sixin Zhang, Fabrice Gamboa, Serge Gratton

Feature maps associated with positive definite kernels play a central role in kernel methods and learning theory, where regularity properties such as Lipschitz continuity are closely related to robustness and stability guarantees. Despite their importance, explicit characterizations of the Lipschitz constant of kernel feature maps are available only in a limited number of cases. In this paper, we study the Lipschitz regularity of feature maps associated with integral kernels under differentiability assumptions. We first provide sufficient conditions ensuring Lipschitz continuity and derive explicit formulas for the corresponding Lipschitz constants. We then identify a condition under which the feature map fails to be Lipschitz continuous and apply these results to several important classes of kernels. For infinite-width two-layer neural networks with isotropic Gaussian weight distributions, we show that the Lipschitz constant of the associated kernel can be expressed as the supremum of a two-dimensional integral, leading to an explicit characterization for the Gaussian kernel and the ReLU random neural network kernel. We also study continuous and shift-invariant kernels such as the Gaussian, Laplace, and Matérn kernels, which admit an interpretation as neural networks with a cosine activation function. In this setting, we prove that the feature map is Lipschitz continuous if and only if the weight distribution has a finite second-order moment, and we then derive its Lipschitz constant. Finally, we raise an open question concerning the asymptotic behavior of the convergence of the Lipschitz constant in finite-width neural networks. Numerical experiments are provided to support this behavior.
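
For the Gaussian kernel the Lipschitz property can be checked by hand, and the sketch below does so numerically. It uses the standard identity $\|\varphi(x)-\varphi(y)\|^2 = k(x,x)+k(y,y)-2k(x,y)$ together with $1-e^{-t}\le t$, which gives a bound of $1/\sigma$ on the ratio of feature distance to input distance for $k(x,y)=\exp(-\|x-y\|^2/(2\sigma^2))$. The parametrization by $\sigma$ and the function names are assumptions of this example; the paper's general formulas may be stated differently.

```python
import math

def feature_distance(r, sigma):
    """RKHS distance between the feature maps of two points at distance r.

    For the Gaussian kernel, ||phi(x) - phi(y)||^2 = 2 - 2 exp(-r^2 / (2 sigma^2)).
    expm1 avoids cancellation when r is small.
    """
    t = r * r / (2.0 * sigma * sigma)
    return math.sqrt(-2.0 * math.expm1(-t))

def lipschitz_ratio(r, sigma):
    """Ratio of feature-space distance to input-space distance."""
    return feature_distance(r, sigma) / r

sigma = 2.0
for r in (1e-3, 0.1, 1.0, 10.0):
    # Every ratio stays below 1/sigma and approaches it as r shrinks.
    print(r, lipschitz_ratio(r, sigma), 1.0 / sigma)
```

The bound $1/\sigma$ is also the square root of the second moment of the Gaussian spectral distribution along any unit direction, which matches the finite-second-moment condition described in the abstract.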

2604.02886 2026-04-06 stat.ME q-bio.GN q-bio.QM stat.AP stat.ML

High-dimensional Many-to-many-to-many Mediation Analysis

Tien Dat Nguyen, Trung Khang Tran, Cong Khanh Truong, Duy-Cat Can, Binh T. Nguyen, Oliver Y. Chén

We study high-dimensional mediation analysis in which exposures, mediators, and outcomes are all multivariate, and both exposures and mediators may be high-dimensional. We formalize this as a many (exposures)-to-many (mediators)-to-many (outcomes) (MMM) mediation analysis problem. Methodologically, MMM mediation analysis simultaneously performs variable selection for high-dimensional exposures and mediators, estimates the indirect effect matrix (i.e., the coefficient matrices linking exposure-to-mediator and mediator-to-outcome pathways), and enables prediction of multivariate outcomes. Theoretically, we show that the estimated indirect effect matrices are consistent and element-wise asymptotically normal, and we derive error bounds for the estimators. To evaluate the efficacy of the MMM mediation framework, we first investigate its finite-sample performance, including convergence properties, the behavior of the asymptotic approximations, and robustness to noise, via simulation studies. We then apply MMM mediation analysis to data from the Alzheimer's Disease Neuroimaging Initiative to study how cortical thickness of 202 brain regions may mediate the effects of 688 genome-wide significant single nucleotide polymorphisms (SNPs) (selected from approximately 1.5 million SNPs) on eleven cognitive-behavioral and diagnostic outcomes. The MMM mediation framework identifies biologically interpretable, many-to-many-to-many genetic-neural-cognitive pathways and improves downstream out-of-sample classification and prediction performance. Taken together, our results demonstrate the potential of MMM mediation analysis and highlight the value of statistical methodology for investigating complex, high-dimensional multi-layer pathways in science. The MMM package is available at https://github.com/THELabTop/MMM-Mediation.

2604.02849 2026-04-06 cs.NE stat.ML

Frame Theoretical Derivation of Three Factor Learning Rule for Oja's Subspace Rule

Taiki Yamada

Comments 5 pages note

We show that the error-gated Hebbian rule for PCA (EGHR-PCA), a three-factor learning rule equivalent to Oja's subspace rule under Gaussian inputs, can be systematically derived from Oja's subspace rule using frame theory. The global third factor in EGHR-PCA arises exactly as a frame coefficient when the learning rule is expanded with respect to a natural frame on the space of symmetric matrices. This provides a principled, non-heuristic derivation of a biologically plausible learning rule from its mathematically canonical counterpart.
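
For reference, Oja's subspace rule itself, the starting point of the derivation, updates a weight matrix $W$ by $\Delta W = \eta\,(y x^\top - y y^\top W)$ with $y = Wx$. The sketch below implements that standard rule and its averaged version, in which $x x^\top$ is replaced by a covariance matrix. It does not implement EGHR-PCA or the frame expansion; the covariance, learning rate, and function names are illustrative assumptions.

```python
import numpy as np

def oja_subspace_step(W, x, lr):
    """One online update of Oja's subspace rule; W is (p, N), x is (N,)."""
    y = W @ x
    return W + lr * (np.outer(y, x) - np.outer(y, y) @ W)

def oja_subspace_mean_flow(W, C, lr, steps):
    """Iterate the averaged update, with x x^T replaced by the covariance C."""
    for _ in range(steps):
        W = W + lr * (W @ C - W @ C @ W.T @ W)
    return W

# Diagonal covariance with two dominant directions (an arbitrary toy choice).
C = np.diag([3.0, 2.0, 0.5, 0.1])
rng = np.random.default_rng(0)
W0 = 0.1 * rng.standard_normal((2, 4))
W = oja_subspace_mean_flow(W0, C, lr=0.01, steps=5000)

# W^T W should approach the projector onto the top-2 principal subspace.
print(np.round(W.T @ W, 4))
```

The rows of the converged `W` form an orthonormal basis of the principal subspace, but not necessarily the individual principal components, which is the usual behavior of the subspace rule.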

2604.02802 2026-04-06 stat.ME

A Scale-Invariant Entropy Statistic for Distance Distributions

Mohamed Gewily

Comments 8 pages

We introduce a family of scale-invariant entropy statistics derived from logarithmically aggregated distance distributions of point processes, with prime numbers serving as a motivating example. The construction associates to each finite configuration a scalar quantity encoding structural features of relative spacing while remaining insensitive to absolute scale. This work is intended as a methodological contribution rather than a source of new raw results.

2604.02739 2026-04-06 stat.ME stat.ML

Quotient-Based Posterior Analysis for Euclidean Latent Space Models

Kisung You, Mauro Giuffrè

Latent space models are widely used in statistical network analysis and are often fit by Markov chain Monte Carlo. However, posterior summaries of latent coordinates are not canonical because the likelihood depends only on pairwise distances and is invariant under rigid motions of the latent space. Standard post hoc alignment can aid visualization, but the resulting summaries depend on an arbitrary reference configuration. We propose a quotient-based posterior analysis for Euclidean latent space models using the centered Gram map, which represents identifiable latent structure while removing nonidentifiability. This yields intrinsic posterior summaries of mean structure and uncertainty that can be computed directly from posterior samples, together with basic theoretical guarantees including canonicality, existence, and stability. Through simulations and analyses of the Florentine marriage network and a statisticians' coauthorship network, the proposed framework clarifies when alignment-based summaries are stable, when they become reference-sensitive, and which nodes or relationships are weakly identified. These results show how coherent posterior analysis can reveal latent relational structure beyond a single embedding.
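
The invariance that motivates the centered Gram map is easy to verify numerically. The sketch below assumes the usual definition, $G = J Z Z^\top J$ with $J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top$, for an $n \times d$ matrix $Z$ of latent positions, and checks that rotating and translating $Z$ leaves $G$ unchanged. Averaging $G$ over posterior draws is shown as one plausible summary; whether it coincides with the paper's estimator is an assumption.

```python
import numpy as np

def centered_gram(Z):
    """Centered Gram matrix of latent positions Z with shape (n, d)."""
    Zc = Z - Z.mean(axis=0, keepdims=True)
    return Zc @ Zc.T

def posterior_mean_gram(draws):
    """Average the centered Gram matrices of a list of posterior draws."""
    return np.mean([centered_gram(Z) for Z in draws], axis=0)

rng = np.random.default_rng(1)
Z = rng.standard_normal((5, 2))

# Apply an arbitrary rigid motion: rotate by 0.7 radians, then translate.
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s], [s, c]])
Z_moved = Z @ R.T + np.array([3.0, -1.0])

print(np.allclose(centered_gram(Z), centered_gram(Z_moved)))   # True
```

Because the map is invariant, no reference configuration or Procrustes alignment is needed before averaging across draws.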

2604.02738 2026-04-06 stat.ML cs.LG math.OC stat.CO

State estimations and noise identifications with intermittent corrupted observations via Bayesian variational inference

Peng Sun, Ruoyu Wang, Xue Luo

Comments 8 pages, 6 figures

This paper focuses on the state estimation problem in distributed sensor networks, where intermittent packet dropouts, corrupted observations, and unknown noise covariances coexist. To tackle this challenge, we formulate the joint estimation of system states, noise parameters, and network reliability as a Bayesian variational inference problem, and propose a novel variational Bayesian adaptive Kalman filter (VB-AKF) to approximate the joint posterior probability densities of the latent parameters. Unlike existing AKFs, which handle missing data and measurement outliers separately, the proposed VB-AKF adopts a dual-mask generative model with two independent Bernoulli random variables, explicitly characterizing both observable communication losses and latent data authenticity. Additionally, the VB-AKF integrates multiple concurrent observations into the adaptive filtering framework, which significantly enhances statistical identifiability. Comprehensive numerical experiments verify the effectiveness and asymptotic optimality of the proposed method, showing that both parameter identification and state estimation asymptotically converge to the theoretical optimal lower bound as the number of sensors increases.

2604.02722 2026-04-06 math.ST stat.TH

Parameter Estimation of Incomplete Gamma Subordinators

Meena Sanjay Babulal, Sunil Kumar Gauttam, Aditya Maheshwari

In this paper, we estimate the parameters of the InG, InG-$\varepsilon$ and TInG subordinators, which have been studied by Babulal \textit{et al.} (see \cite{babulal}). We modify the method of moments to use fractional moments of the InG and InG-$\varepsilon$ subordinators, owing to their infinite moments. For the TInG subordinator, we use the method of moments. We also compute the maximum likelihood estimator (MLE) for the parameter $\alpha$ of the InG and InG-$\varepsilon$ subordinators using the jump distribution of the process, and we discuss the asymptotic normality of the MLE.

2604.02678 2026-04-06 stat.ME cs.AI stat.AP

Eligibility-Aware Evidence Synthesis: An Agentic Framework for Clinical Trial Meta-Analysis

Yao Zhao, Zhiyue Zhang, Yanxun Xu

Clinical evidence synthesis requires identifying relevant trials from large registries and aggregating results that account for population differences. While recent LLM-based approaches have automated components of systematic review, they do not support end-to-end evidence synthesis. Moreover, conventional meta-analysis weights studies by statistical precision without considering clinical compatibility reflected in eligibility criteria. We propose EligMeta, an agentic framework that integrates automated trial discovery with eligibility-aware meta-analysis, translating natural-language queries into reproducible trial selection and incorporating eligibility alignment into study weighting to produce cohort-specific pooled estimates. EligMeta employs a hybrid architecture separating LLM-based reasoning from deterministic execution: LLMs generate interpretable rules from natural-language queries and perform schema-constrained parsing of trial metadata, while all logical operations, weight computations, and statistical pooling are executed deterministically to ensure reproducibility. The framework structures eligibility criteria and computes similarity-based study weights reflecting population alignment between target and comparator trials. In a gastric cancer landscape analysis, EligMeta reduced 4,044 candidate trials to 39 clinically relevant studies through rule-based filtering, recovering all 13 guideline-cited trials. In an olaparib adverse events meta-analysis across four trials, eligibility-aware weighting shifted the pooled risk ratio from 2.18 (95% CI: 1.71-2.79) under conventional Mantel-Haenszel estimation to 1.97 (95% CI: 1.76-2.20), demonstrating quantifiable impact of incorporating eligibility alignment. EligMeta bridges automated trial discovery with eligibility-aware meta-analysis, providing a scalable and reproducible framework for evidence synthesis in precision medicine.
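
The conventional baseline mentioned above, the Mantel-Haenszel pooled risk ratio, can be sketched in a few lines. With $a_i$ events among $n_{1i}$ treated patients and $c_i$ events among $n_{0i}$ controls in study $i$, the estimator is $\sum_i a_i n_{0i}/N_i$ divided by $\sum_i c_i n_{1i}/N_i$, where $N_i = n_{1i} + n_{0i}$. The counts below are synthetic and are not the olaparib trial data; the eligibility-aware weighting of EligMeta is not reproduced here because the abstract does not specify it.

```python
def mantel_haenszel_rr(tables):
    """Pooled risk ratio from (events_trt, n_trt, events_ctl, n_ctl) tuples."""
    num = 0.0
    den = 0.0
    for a, n1, c, n0 in tables:
        total = n1 + n0
        num += a * n0 / total   # treated events, weighted by control share
        den += c * n1 / total   # control events, weighted by treated share
    return num / den

# Synthetic example: two studies, each with twice the risk in the treated arm.
studies = [(10, 100, 5, 100), (30, 200, 15, 200)]
print(mantel_haenszel_rr(studies))   # 2.0
```

An eligibility-aware variant would presumably replace these size-driven weights with weights that also reflect how closely each trial's population matches the target cohort.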

2604.02664 2026-04-06 stat.ME astro-ph.IM stat.AP

A comparison of methods for Poisson regression in the presence of background

Massimiliano Bonamente, Vinay Kashyap, Xiaoli Li, Jelle de Plaa

Comments Submitted to ApJ

This paper provides a statistical analysis of three common methods of regression for Poisson data in the presence of Poisson background, namely the joint fit with two parametric models for the source and the background, the use of a non-parametric model for the background known as the wstat method, and the regression with a fixed background. The non-parametric background method, which is a popular method for spectral data, is found to be significantly biased, especially in the low-count and background-dominated regimes. Similar conclusions apply to the fixed-background regression. The joint-fit method, on the other hand, simultaneously affords reliable hypothesis testing by means of the usual Cash statistic and unbiased reconstruction of source parameters. We also investigate the effect of non-parametric regression on the number of effective degrees of freedom by means of the Efron degree of freedom function. We find that the wstat method adds a significantly larger number of degrees of freedom, compared to the number of free parameters in the source model. The other two methods have a number of degrees of freedom consistent with the number of adjustable parameters, at least for the simple models investigated in this paper.
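
As a point of reference for the "usual Cash statistic", the sketch below evaluates its common form for binned Poisson counts, $C = 2\sum_i \left[m_i - n_i + n_i \ln(n_i/m_i)\right]$, where $n_i$ are observed counts and $m_i$ are model-predicted counts, and the logarithmic term is taken as zero when $n_i = 0$. In a joint fit, $m_i$ would be the sum of the source and background models; the numbers used here are made up for illustration.

```python
import math

def cash_statistic(counts, model):
    """Cash statistic for observed counts against positive model predictions."""
    total = 0.0
    for n, m in zip(counts, model):
        total += m - n
        if n > 0:
            total += n * math.log(n / m)   # term vanishes when n == 0
    return 2.0 * total

# A model that matches the data except in one empty bin predicted to hold 1 count.
print(cash_statistic([3, 0, 5], [3.0, 1.0, 5.0]))   # 2.0
```

The statistic is zero when the model reproduces the counts exactly and grows as the model departs from them, which is what makes it usable for hypothesis testing in the low-count regime.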

2604.02659 2026-04-06 cs.LG cs.AI cs.NA math.NA stat.ML

Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration

Farhad Pourkamali-Anaraki

Comments 13 pages

The massive scale of pretrained models has made efficient compression essential for practical deployment. Low-rank decomposition based on the singular value decomposition (SVD) provides a principled approach for model reduction, but its exact computation is expensive for large weight matrices. Randomized alternatives such as randomized SVD (RSVD) improve efficiency, yet they can suffer from poor approximation quality when the singular value spectrum decays slowly, a regime commonly observed in modern pretrained models. In this work, we address this limitation from both theoretical and empirical perspectives. First, we establish a connection between low-rank approximation error and predictive performance by analyzing softmax perturbations, showing that deviations in class probabilities are controlled by the spectral error of the compressed weights. Second, we demonstrate that RSVD is inadequate, and we propose randomized subspace iteration (RSI) as a more effective alternative. By incorporating multiple power iterations, RSI improves spectral separation and provides a controllable mechanism for enhancing approximation quality. We evaluate our approach on both convolutional networks and transformer-based architectures. Our results show that RSI achieves near-optimal approximation quality while outperforming RSVD in predictive accuracy under aggressive compression, enabling efficient model compression.
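
A generic form of randomized subspace iteration can be sketched as follows. This is the textbook scheme (random sketch, a few orthonormalized power iterations, then an SVD of the projected matrix), not necessarily the exact variant evaluated in the paper; the oversampling amount, iteration count, and function name are assumptions. Setting `n_iter=0` recovers basic randomized SVD.

```python
import numpy as np

def randomized_subspace_iteration(A, rank, n_iter=2, oversample=5, seed=0):
    """Approximate rank-`rank` SVD of A using randomized subspace iteration."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m, n)
    # Random sketch of the column space, then orthonormalize.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))
    for _ in range(n_iter):
        # Each pass multiplies by A^T and A, re-orthonormalizing for stability.
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    # Exact SVD of the small projected matrix.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Toy weight matrix of exact rank 3.
rng = np.random.default_rng(42)
A = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
U, s, Vt = randomized_subspace_iteration(A, rank=3)
print(np.allclose((U * s) @ Vt, A))   # True
```

Storing `U * s` and `Vt` in place of a dense `m x n` weight matrix reduces the parameter count from `m * n` to `rank * (m + n)`, which is the source of the compression.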

2604.02610 2026-04-06 stat.ML cs.LG

Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport

Rafael Pereira Eufrazio, Eduardo Fernandes Montesuma, Charles Casimiro Cavalcante

Comments This manuscript is currently under review for possible publication in the journal Signal Processing (ELSEVIER)

Multi-view data analysis seeks to integrate multiple representations of the same samples in order to recover a coherent low-dimensional structure. Classical approaches often rely on feature concatenation or explicit alignment assumptions, which become restrictive under heterogeneous geometries or nonlinear distortions. In this work, we propose two geometry-aware multi-view embedding strategies grounded in Gromov-Wasserstein (GW) optimal transport. The first, termed Mean-GWMDS, aggregates view-specific relational information by averaging distance matrices and applying GW-based multidimensional scaling to obtain a representative embedding. The second strategy, referred to as Multi-GWMDS, adopts a selection-based paradigm in which multiple geometry-consistent candidate embeddings are generated via GW-based alignment and a representative embedding is selected. Experiments on synthetic manifolds and real-world datasets show that the proposed methods effectively preserve intrinsic relational structure across views. These results highlight GW-based approaches as a flexible and principled framework for multi-view representation learning.

2604.02595 2026-04-06 stat.ME

Multi-Site Health Research Integrating Complementary Data Sources: A Scoping Review of Statistical Inference Methods for Vertically Partitioned Data

Marie-Pier Domingue, Simon Lévesque, Anita Burgun, Jean-François Ethier, Félix Camirand Lemyre

To address the multidimensional nature of health-related questions, advances in health research often require integrating information from various data sources within statistical analyses. When complementary information pertaining to the same set of individuals is distributed across different institutions, vertical methods make it possible to obtain analysis results without sharing or pooling individual-level data. To guide stakeholders toward a transparent use of vertical methods, this study aims to (1) identify existing vertical methods enabling statistical inference; and (2) characterize the methodological properties of these methods and the current extent of their use with health data. We conducted a scoping review using four interdisciplinary databases. We then systematically extracted the characteristics of identified vertical methods with respect to comparability with the pooled analysis, efficiency of communication schemes, and confidentiality. We additionally screened studies that cited included articles to identify applications to vertically partitioned real-world health data. Among 2887 articles initially screened, 30 were included in the review. Inference for the linear and logistic regression frameworks was the most frequent statistical inference task undertaken in the proposed methods. Equivalence with the pooled analyses was not systematically addressed, and most methods required multiple communications between participating parties. Almost all articles described their approach as privacy-preserving, although only a minority provided privacy assessments. The scope of existing approaches enabling statistical inference for vertically partitioned data is still relatively limited. Most existing methods do not concurrently achieve results equivalent to centralized analyses, high communication efficiency, and guaranteed protection of individual-level data.

2604.02581 2026-04-06 stat.ML cs.LG cs.NA math.NA

Learning interacting particle systems from unlabeled data

Viska Wei, Fei Lu

Comments 39 pages, 7 figures

Learning the potentials of interacting particle systems is a fundamental task across various scientific disciplines. A major challenge is that unlabeled data collected at discrete time points lack trajectory information due to limitations in data collection methods or privacy constraints. We address this challenge by introducing a trajectory-free self-test loss function that leverages the weak-form stochastic evolution equation of the empirical distribution. The loss function is quadratic in potentials, supporting parametric and nonparametric regression algorithms for robust estimation that scale to large, high-dimensional systems with big data. Systematic numerical tests show that our method outperforms baseline methods that regress on trajectories recovered via label matching, tolerating large observation time steps. We establish the convergence of parametric estimators as the sample size increases, providing a theoretical foundation for the proposed approach.

2604.02526 2026-04-06 stat.AP

Applied Statistics Requires Scientific Context

Ashley I Naimi

Comments 12 pages, 1 figure, 63 references

Statistical methods are indispensable to scientific inference. However, there exists a longstanding tension across a wide range of scientific disciplines about the role that "context" should play in the application of statistical methods and the interpretation of statistical results. Though frequently invoked, the notion of "scientific context" refers to at least two distinct concepts: a set of foundational, nuanced, and elusive background assumptions and substantive features of a given area of study that shape the validity and reliability of statistical methods; and more quantifiable contextual issues that affect the performance of statistical methods and interpretation of statistical results. I argue here that the application and interpretation of statistical methods require careful consideration of foundational contextual issues. To motivate the arguments, I review a recent re-formulation of the $p$-value as a measure of divergence between an observed dataset and a set of assumptions used to construct statistical measures. I use this framework to illustrate the role that context plays in two randomized trials: on low-dose aspirin for pregnancy loss, and a new inhibitor of a key biochemical pathway affecting ankylosing spondylitis. Finally, I note that the adoption of low significance thresholds in genome-wide association studies and high-energy particle physics has succeeded largely because of the extensive validity-checking gauntlets and contextual considerations that have accompanied these low thresholds, rather than because of the low thresholds themselves. I use these illustrations and arguments to suggest that (i) the adoption of a universal threshold for significance testing should be abandoned as a goal of statistics reform; and (ii) the validity and optimal use of applied statistical tools require careful consideration of nuanced scientific context.

2604.02507 2026-04-06 stat.ML cs.LG

Reinforcement Learning from Human Feedback: A Statistical Perspective

Pangpang Liu, Chengchun Shi, Will Wei Sun

Reinforcement learning from human feedback (RLHF) has emerged as a central framework for aligning large language models (LLMs) with human preferences. Despite its practical success, RLHF raises fundamental statistical questions because it relies on noisy, subjective, and often heterogeneous feedback to learn reward models and optimize policies. This survey provides a statistical perspective on RLHF, focusing primarily on the LLM alignment setting. We introduce the main components of RLHF, including supervised fine-tuning, reward modeling, and policy optimization, and relate them to familiar statistical ideas such as the Bradley-Terry-Luce (BTL) model, latent utility estimation, active learning, experimental design, and uncertainty quantification. We review methods for learning reward functions from pairwise preference data and for optimizing policies through both two-stage RLHF pipelines and emerging one-stage approaches such as direct preference optimization. We further discuss recent extensions including reinforcement learning from AI feedback, inference-time algorithms, and reinforcement learning from verifiable rewards, as well as benchmark datasets, evaluation protocols, and open-source frameworks that support RLHF research. We conclude by highlighting open challenges in RLHF. An accompanying GitHub demo https://github.com/Pangpang-Liu/RLHF_demo illustrates key components of the RLHF pipeline.
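
The link between reward modeling and the BTL model can be shown in a few lines. Under BTL, the probability that a response with reward $r_w$ is preferred over one with reward $r_l$ is the logistic function of $r_w - r_l$, and a reward model is typically fit by minimizing the negative log-likelihood of the observed preferences. The sketch below is independent of the linked demo repository, and its function names are illustrative.

```python
import math

def btl_prob(r_chosen, r_rejected):
    """BTL probability that the chosen response beats the rejected one."""
    return 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))

def preference_nll(pairs):
    """Average negative log-likelihood over (reward_chosen, reward_rejected) pairs."""
    return -sum(math.log(btl_prob(rc, rr)) for rc, rr in pairs) / len(pairs)

# Equal rewards give a coin flip; a larger margin makes the preference more likely.
print(btl_prob(1.0, 1.0))            # 0.5
print(btl_prob(2.0, 0.0))            # about 0.88
print(preference_nll([(0.0, 0.0)]))  # log(2), about 0.693
```

Only reward differences enter the likelihood, so rewards learned this way are identified up to an additive constant.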

2604.02489 2026-04-06 stat.ME

Sequentially-Rerandomized Switchback Experiments

Zhenghao Zeng, Christopher Adjaho, Alonso Bucarey, Chao Qin, Ruixuan Zhang, Paul Hoban, Ramesh Johari, Stefan Wager

Comments 33 pages, 10 figures

Details
English abstract

Large-scale online platforms and marketplace systems often evaluate new policies through experiments that randomize treatment across operational units (e.g., geographies, regions, or clusters) over many time periods. In these settings, standard A/B testing can be inefficient or unreliable due to a limited number of units, substantial cross-unit heterogeneity, non-stationarity, and potential carryover across periods. We propose Sequentially-Rerandomized Switchback Experiments (SRSB), a new experimental design that helps mitigate these challenges. SRSB re-randomizes treatment at each time period so as to enforce balance on pre-specified prognostic variables constructed from past observations. In the absence of carryover, SRSB improves precision by leveraging temporal dependence through balancing lagged outcomes and covariates; we develop finite-sample randomization inference under a sharp null as well as asymptotic inference as the number of periods grows. We then extend SRSB to settings with first-order carryover and introduce a blocked SRSB variant that rerandomizes within strata defined by the previous treatment to form stable and comparable "stay" groups. Extensive simulations demonstrate the practical gains and robustness of SRSB relative to standard switchback designs.
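
The per-period rerandomization idea can be sketched with a generic accept/reject scheme: draw an assignment, measure covariate imbalance, and redraw until the imbalance is small enough. The abstract does not specify the balance criterion, so everything here is an assumption: a single lagged outcome as the prognostic variable, a difference-in-means balance measure, half of the units treated, and the names `balance_gap` and `rerandomize`.

```python
import random

def balance_gap(assignment, covariate):
    """Absolute difference in covariate means between treated (1) and control (0) units."""
    treated = [x for a, x in zip(assignment, covariate) if a == 1]
    control = [x for a, x in zip(assignment, covariate) if a == 0]
    return abs(sum(treated) / len(treated) - sum(control) / len(control))

def rerandomize(covariate, threshold, rng, max_draws=10_000):
    """Redraw a half-treated assignment until the balance gap is within the threshold.

    Hypothetical helper: the acceptance rule is an assumption, not the paper's rule.
    """
    n = len(covariate)
    base = [1] * (n // 2) + [0] * (n - n // 2)
    for _ in range(max_draws):
        assignment = base[:]
        rng.shuffle(assignment)
        if balance_gap(assignment, covariate) <= threshold:
            return assignment
    raise RuntimeError("no acceptable assignment found; loosen the threshold")

# One period: balance on the previous period's outcomes for eight units.
lagged_outcomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
assignment = rerandomize(lagged_outcomes, threshold=0.5, rng=random.Random(0))
```

In a sequential design this draw would be repeated every period with the prognostic variable rebuilt from the most recent observations. Valid inference then has to account for the acceptance rule, for example by redrawing assignments under the same rule in a randomization test.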

2604.02454 2026-04-06 stat.AP

Remote, bivariate expert elicitation to determine the prior probability distribution for sample size calculation in a Bayesian non-inferiority multicenter randomized controlled trial (Croup Dosing Trial)

Arlene Jiang, Alex Aregbesola, Apoorva Gangwani, Terry P. Klassen, Amy C. Plint, Elisabete Doyle, William Craig, Mohamed Eltorki, Banke Oketola, Hoda Badra, Yongdong Ouyang, Anna Heath

Details
English abstract

Prior distributions must be specified for the parameters of interest in a Bayesian clinical trial. When existing evidence on the effects of the trial interventions is limited, prior distributions can be constructed with expert elicitation. However, conventional elicitation requires face-to-face interactions and intensive pre-elicitation training, which can be infeasible. Our remote elicitation was based on established expert elicitation methods. We used bivariate prior distributions for dependencies between elicited quantities. We elicited a prior distribution for the Croup Dosing Trial, which will assess the number of return visits to the emergency department within 7 days in children with croup. This trial evaluates the non-inferiority of 0.15 mg/kg of dexamethasone, compared to the standard dose of 0.60 mg/kg to treat croup. We conducted three remote workshops to elicit expert beliefs on the efficacy of the two doses of dexamethasone. Each workshop consisted of two survey rounds, separated by a group discussion. Prior to the workshop, experts reviewed provided literature on the effects of the two doses of dexamethasone. Beliefs were aggregated with expert-specific bivariate distributions. The aggregated distribution and surveyed non-inferiority margin determined the sample size. Twelve emergency medicine physicians participated in our remote elicitation exercise. The elicitation generated a prior distribution centered at 6% for the 0.60 mg/kg dose and 8% for the 0.15 mg/kg dose. The aggregated prior distribution produced a sample size of 1850, based on a non-inferiority margin of 4%. We elicited a prior distribution that incorporated past evidence and expert opinion. The elicited prior is consistent with literature on the efficacy of the dexamethasone doses in treating croup. Our approach demonstrates the feasibility of remotely eliciting bivariate distributions for clinical trials.

2604.02403 2026-04-06 econ.EM cs.CL stat.ME

Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics

Cristian Espinal Maya

Comments Working paper. 13 pages, 7 figures, 6 references. Part of the Cognitive Factor Economics research program. Code: https://github.com/Cespial/cognitive-factor-economics

Details
English abstract

This paper establishes the theoretical and practical foundations for using Large Language Models (LLMs) as measurement instruments for latent economic variables -- specifically variables that describe the cognitive content of occupational tasks at a level of granularity not achievable with existing survey instruments. I formalize four conditions under which LLM-generated scores constitute valid instruments: semantic exogeneity, construct relevance, monotonicity, and model invariance. I then apply this framework to the Augmented Human Capital Index (AHC_o), constructed from 18,796 O*NET task statements scored by Claude Haiku 4.5, and validated against six existing AI exposure indices. The index shows strong convergent validity (r = 0.85 with Eloundou GPT-gamma, r = 0.79 with Felten AIOE) and discriminant validity. Principal component analysis confirms that AI-related occupational measures span two distinct dimensions -- augmentation and substitution. Inter-rater reliability across two LLM models (n = 3,666 paired scores) yields Pearson r = 0.76 and Krippendorff's alpha = 0.71. Prompt sensitivity analysis across four alternative framings shows that task-level rankings are robust. Obviously Related Instrumental Variables (ORIV) estimation recovers coefficients 25% larger than OLS, consistent with classical measurement error attenuation. The methodology generalizes beyond labor economics to any domain where semantic content must be quantified at scale.

2604.02400 2026-04-06 stat.AP

Varying risk exposure in auto insurance: a weighted Tweedie framework for experience rating and cancellation penalties

Jean-Philippe Boucher, Raïssa Coulibaly, Julien Trufin

Comments 31 pages, 22 figures, 4 tables

Details
English abstract

This paper proposes a new family of Tweedie-based ratemaking models that explicitly account for mid-term policy cancellations. Using an automobile insurance dataset from a Canadian insurer, we document a marked difference in claims experience between policyholders who maintain their coverage until maturity and those who cancel their policies mid-term. Building on the classical Tweedie framework, we introduce flexible weighting functions and a premium penalty structure that depend on the level of exposure, allowing for a more realistic representation of the earned premium when coverage is interrupted before the end of the policy period. We compare several weighting structures within the Tweedie framework and examine their theoretical properties, as well as their empirical performance using deviance-based model comparison criteria, an area-between-curves criterion derived from concentration and Lorenz curves, and Murphy diagrams grounded in Bregman dominance. To operationalize the proposed models, monotonicity and non-negativity constraints are imposed on the penalty function, ensuring consistency with actuarial principles. Finally, using real-world data, we show that this approach provides both a strategic and competitive advantage: it allows the insurer to indirectly compensate for large losses through a cancellation surcharge, while preserving actuarial coherence and statistical consistency.

2604.02394 2026-04-06 q-bio.GN stat.ME

Benchmarking Heritability Estimation Strategies Across 86 Configurations and Their Downstream Effect on Polygenic Risk Score Performance

Muhammad Muneeb, David B. Ascher

Details
English abstract

Objective: SNP heritability estimates vary substantially across estimation strategies, yet the downstream consequences for polygenic risk score (PRS) construction remain poorly characterised. We systematically benchmarked heritability estimation configurations and assessed their propagation into downstream PRS performance. Methods: We benchmarked 86 heritability-estimation configurations spanning six tool families (GEMMA, GCTA, LDAK, DPR, LDSC, and SumHer) and ten method groups across 10 UK Biobank phenotypes, yielding 844 configuration-level estimates. Each estimate was propagated into GCTA-SBLUP and LDpred2-lassosum2 PRS frameworks and evaluated across five cross-validation folds using null, PRS-only, and full models. Eleven binary analytical contrasts were tested using Mann-Whitney U tests to identify drivers of heritability variability. Results: Heritability ranged from -0.862 to 2.735 (mean = 0.134, SD = 0.284), with 133 of 844 estimates (15.8%) being negative and concentrated in unconstrained estimation regimes. Ten of eleven analytical contrasts significantly affected heritability magnitude, with algorithm choice and GRM standardisation showing the largest effects. Despite this upstream variability, downstream PRS test performance was only weakly coupled to heritability magnitude: pooled Pearson correlations between h^2 and test AUC were r = -0.023 for GCTA-SBLUP and r = +0.014 for LDpred2-lassosum2, with both being non-significant. Conclusion: SNP heritability is best interpreted as a configuration-sensitive modelling parameter rather than a universally stable scalar input. Heritability estimates should always be reported alongside their full estimation specification, and downstream PRS performance is comparatively robust to moderate variation in the heritability input.

2604.02380 2026-04-06 q-bio.GN math.MG stat.ME

VeloTree: Inferring single-cell trajectories from RNA velocity fields with varifold distances

Elodie Maignant, Tim Conrad, Christoph von Tycowicz

Comments arXiv admin note: text overlap with arXiv:2507.11313

Details
English abstract

Trajectory inference is a critical problem in single-cell transcriptomics, which aims to reconstruct the dynamic process underlying a population of cells from sequencing data. Of particular interest is the reconstruction of differentiation trees. One way of doing this is by estimating the path distance between nodes -- labeled by cells -- based on cell similarities observed in the sequencing data. Recent sequencing techniques make it possible to measure two types of data: gene expression levels, and RNA velocity, a vector that quantifies variation in gene expression. The sequencing data then consist of a discrete vector field whose dimension is the number of genes of interest. In this article, we present a novel method for inferring differentiation trees from RNA velocity fields using a distance-based approach. In particular, we introduce a cell dissimilarity measure defined as the squared varifold distance between the integral curves of the RNA velocity field, which we show is a robust estimate of the path distance on the target differentiation tree. Upstream of the dissimilarity measure calculation, we also implement comprehensive routines for the preprocessing and integration of the RNA velocity field. Finally, we illustrate the ability of our method to recover differentiation trees with high accuracy on several simulated and real datasets, and compare these results with the state of the art.

2604.02336 2026-04-06 math.FA math.ST stat.TH

Stationary Process Invertibility and the Unilateral Shift Operator

Anand Ganesh, Babhrubahan Bose, Anand Rajagopalan

Comments 4 pages

Details
English abstract

The bilateral shift operator $B$ has been the mainstay of stationary process modeling, whereas we argue that the unilateral shift operator $T$ may be better suited to analyze invertibility. While doing so, we partially unify the notion of stationary process invertibility (associated with a sufficient but not necessary $\ell^1$ condition) with the algebraic invertibility of the transfer function $f(T)$. We establish a rigorous operator theoretic foundation for these arguments, proving that for $f \in \mathbb{W}_+$, the Wiener algebra, $f(T)$ is well defined, that $\| f(T) \| = \| f \|_{\infty}$ and that $f(T) = T_f$, the Toeplitz operator.

2604.01735 2026-04-06 stat.AP physics.data-an

Correlation analysis of the dispersion of SARS-CoV-2 in Mexico

Pablo Carlos López, Marcos Flores, Soham Biswas

Comments 8 pages, 6 figures

Details
English abstract

In this paper, we propose a method to analyze correlations in pandemic-related data across different geographical regions, relying on the analysis of correlations for non-stationary time series, which are typical of pandemic data. Unlike traditional epidemiological approaches focused on medical and modeling perspectives during a pandemic, our method emphasizes post-pandemic analysis to assess how societal responses, such as lockdowns, travel restrictions, mobility patterns, and vaccination campaigns, manifest in the collective behavior of regions. These insights can inform future public health strategies and enhance understanding of the complex dynamics underlying pandemic spread and control.

2603.29889 2026-04-06 econ.EM stat.ML

Penalized GMM Framework for Inference on Functionals of Nonparametric Instrumental Variable Estimators

Edvard Bakhitov

Comments Previously circulated as "Automatic Debiased Machine Learning in Presence of Endogeneity"

Details
English abstract

This paper develops a penalized GMM (PGMM) framework for automatic debiased inference on functionals of nonparametric instrumental variable estimators. We derive convergence rates for the PGMM estimator and provide conditions for root-n consistency and asymptotic normality of debiased functional estimates, covering both linear and nonlinear functionals. Monte Carlo experiments on the average derivative show that the PGMM-based debiased estimator performs on par with the analytical debiased estimator that uses the known closed-form Riesz representer, achieving 90-96% coverage while the plug-in estimator falls below 5%. We apply our procedure to estimate mean own-price elasticities in a semiparametric demand model for differentiated products. Simulations confirm near-nominal coverage while the plug-in severely undercovers. Applied to IRI scanner data on carbonated beverages, debiased semiparametric estimates are approximately 20% more elastic compared to the logit benchmark, and debiasing corrections are heterogeneous across products, ranging from negligible to several times the standard error.

2603.29415 2026-04-06 math.ST math.PR stat.TH

Concentration of the bootstrap empirical process, with applications to statistical inference

Guillaume Maillard, Adrien Saumard

Details
English abstract

Considering a general framework of bootstrap with exchangeable weights, we show some concentration inequalities for the supremum of the bootstrap empirical process. On the one hand, we discuss the concentration of the bootstrap empirical process around its conditional expectation with respect to the original data, and on the other hand, the concentration of the latter quantity around its mean. For the concentration conditional on data, we build on Chatterjee's exchangeable pairs approach to concentration. To attain optimal concentration rates, we develop some refined arguments for the convergence of transposition walks on the symmetric group. The conditional expectation of the bootstrap empirical process is proved to be self-bounding, thus extending a well-known property for conditional Rademacher averages. To illustrate the interest of these concentration inequalities, we provide some new results pertaining to confidence regions for the estimation of a mean vector, as well as non-asymptotic bounds for the two-sample permutation test.

2603.28681 2026-04-06 stat.ML cs.LG

Functional Natural Policy Gradients

Aurelien Bibaut, Houssam Zenati, Thibaud Rahier, Nathan Kallus

Details
English abstract

We propose a cross-fitted debiasing device for policy learning from offline data. A key consequence of the resulting learning principle is $\sqrt N$ regret even for policy classes with complexity greater than Donsker, provided a product-of-errors nuisance remainder is $O(N^{-1/2})$. The regret bound factors into a plug-in policy error factor governed by policy-class complexity and an environment nuisance factor governed by the complexity of the environment dynamics, making explicit how one may be traded against the other.

2603.26227 2026-04-06 stat.ML cs.LG

Privacy-Accuracy Trade-offs in High-Dimensional LASSO under Perturbation Mechanisms

Ayaka Sakata, Haruka Tanzawa

Comments 53 pages, 11 figures

Details
English abstract

We study privacy-preserving sparse linear regression in the high-dimensional regime, focusing on the LASSO estimator. We analyze two widely used mechanisms for differential privacy: output perturbation, which injects noise into the estimator, and objective perturbation, which adds a random linear term to the loss function. Using approximate message passing (AMP), we characterize the typical behavior of these estimators under random design and privacy noise. To quantify privacy, we adopt typical-case measures, including the on-average KL divergence, which admits a hypothesis-testing interpretation in terms of distinguishability between neighboring datasets. Our analysis reveals that sparsity plays a central role in shaping the privacy-accuracy trade-off: stronger regularization can improve privacy by stabilizing the estimator against single-point data changes. We further show that the two mechanisms exhibit qualitatively different behaviors. In particular, for objective perturbation, increasing the noise level can have non-monotonic effects, and excessive noise may destabilize the estimator, leading to increased sensitivity to data perturbations. Our results demonstrate that AMP provides a powerful framework for analyzing privacy-accuracy trade-offs in high-dimensional sparse models.
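
The difference between the two mechanisms can be seen in a toy setting that is much simpler than the paper's random-design AMP analysis. Assume an orthonormal design, so that the LASSO solution is coordinate-wise soft-thresholding of z = Xᵀy. Adding a linear term ηᵀβ to the loss then shifts the thresholding input to z − η, whereas output perturbation adds noise after thresholding. The function names and noise values below are illustrative assumptions, and no differential-privacy noise calibration is attempted.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def output_perturbation(z, lam, noise):
    """Solve the LASSO first, then add noise to each estimated coefficient."""
    return [soft_threshold(zj, lam) + e for zj, e in zip(z, noise)]

def objective_perturbation(z, lam, noise):
    """Add the linear term noise . beta to the loss, then solve.

    Under an orthonormal design (assumed here) this equals soft-thresholding z - noise.
    """
    return [soft_threshold(zj - e, lam) for zj, e in zip(z, noise)]

z = [3.0, 0.2, -2.0]       # X^T y under the assumed orthonormal design
noise = [0.1, -0.3, 0.2]   # fixed values for reproducibility; random in practice
out_est = output_perturbation(z, 1.0, noise)
obj_est = objective_perturbation(z, 1.0, noise)
```

In this toy case objective perturbation keeps the second coefficient exactly zero while output perturbation does not, which hints at why the two mechanisms can behave qualitatively differently.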

2603.24705 2026-04-06 stat.ME cs.LG econ.EM

Amortized Inference for Correlated Discrete Choice Models via Equivariant Neural Networks

Easton Huch, Michael Keane

Details
English abstract

Discrete choice models are fundamental tools in management science, economics, and marketing for understanding and predicting decision-making. Logit-based models are dominant in applied work, largely due to their convenient closed-form expressions for choice probabilities. However, these models entail restrictive assumptions on the stochastic utility component, constraining our ability to capture realistic and theoretically grounded choice behavior, most notably substitution patterns. In this work, we propose an amortized inference approach using a neural network emulator to approximate choice probabilities for general error distributions, including those with correlated errors. Our proposal includes a specialized neural network architecture and accompanying training procedures designed to respect the invariance properties of discrete choice models. We provide group-theoretic foundations for the architecture, including a proof of universal approximation given a minimal set of invariant features. Once trained, the emulator enables rapid likelihood evaluation and gradient computation. We use Sobolev training, augmenting the likelihood loss with a gradient-matching penalty so that the emulator learns both choice probabilities and their derivatives. We show that emulator-based maximum likelihood estimators are consistent and asymptotically normal under mild approximation conditions, and we provide sandwich standard errors that remain valid even with imperfect likelihood approximation. Simulations show significant gains over the GHK simulator in accuracy and speed.

2603.10288 2026-04-06 math.ST stat.TH

Version-Robust Methods for Identifying Minimal Sufficient Statistics

Rafael Oliveira Cavalcante, Alexandre Galvão Patriota

Comments 29 pages (now it provides a generalization to separable spaces)

Details
English abstract

Let $f_θ$ be the joint density of a random sample $X$. A frequently used criterion asserts that a statistic $T(X)$ is minimal sufficient if, for any sample points $x$ and $y$, $T(x) = T(y)$ exactly when there exists a finite constant $h_{xy} > 0$, independent of $θ$, such that $f_θ(y) = f_θ(x)h_{xy}$ for all $θ$. We show that this criterion is false in general via a counterexample exploiting the non-uniqueness of versions of Radon--Nikodym derivatives. Although Sato (1996) established sufficient regularity conditions for the validity of this criterion, these conditions are frequently intractable to verify in practice. We resolve this limitation by introducing a version-robust method applicable whenever sufficiency is known. Moreover, our method allows us to generalize Sato's approach from Euclidean settings to arbitrary analytic Borel sample spaces and separable measurable statistic spaces. We also obtain a method for exponential-family densities under verifiable hypotheses. Taken together, these results clarify when pointwise likelihood-ratio arguments for minimal sufficiency are mathematically sound in irregular settings. Finally, we construct a counterexample demonstrating that a distinct criterion for minimal sufficiency due to Pfanzagl (1994, 2017) similarly fails in the absence of supplementary hypotheses. Identifying minimal sufficient statistics is important not only for parsimonious data reduction but also because, in models admitting complete sufficiency, such statistics provide a practical route to the complete sufficient structure underlying optimal estimation and prediction.

2512.16383 2026-04-06 cs.LG stat.ML

Multivariate Uncertainty Quantification with Tomographic Quantile Forests

Takuya Kanazawa

Comments 36 pages. v2: matches published version

Details
Journal ref
Math. Comput. Appl. 2026, 31(2), 53
English abstract

Quantifying predictive uncertainty is essential for safe and trustworthy real-world AI deployment. Yet, fully nonparametric estimation of conditional distributions remains challenging for multivariate targets. We propose Tomographic Quantile Forests (TQF), a nonparametric, uncertainty-aware, tree-based regression model for multivariate targets. TQF learns conditional quantiles of directional projections $\mathbf{n}^{\top}\mathbf{y}$ as functions of the input $\mathbf{x}$ and the unit direction $\mathbf{n}$. At inference, it aggregates quantiles across many directions and reconstructs the multivariate conditional distribution by minimizing the sliced Wasserstein distance via an efficient alternating scheme with convex subproblems. Unlike classical directional-quantile approaches that typically produce only convex quantile regions and require training separate models for different directions, TQF covers all directions with a single model without imposing convexity restrictions. We evaluate TQF on synthetic and real-world datasets, and release the source code on GitHub.

2512.03537 2026-04-06 cs.LG stat.ML

Pushing the Limits of Distillation-Based Continual Learning via Classifier-Proximal Lightweight Plugins

Zhiming Xu, Baile Xu, Jian Zhao, Furao Shen, Suorong Yang

Comments 10 pages, 8 figures, 2 tables

Details
English abstract

Continual learning requires models to learn continuously while preserving prior knowledge under evolving data streams. Distillation-based methods are appealing for retaining past knowledge in a shared single-model framework with low storage overhead. However, they remain constrained by the stability-plasticity dilemma: knowledge acquisition and preservation are still optimized through coupled objectives, and existing enhancement methods do not alter this underlying bottleneck. To address this issue, we propose a plugin extension paradigm termed Distillation-aware Lightweight Components (DLC) for distillation-based CL. DLC deploys lightweight residual plugins into the base feature extractor's classifier-proximal layer, enabling semantic-level residual correction for better classification accuracy while minimizing disruption to the overall feature extraction process. During inference, plugin-enhanced representations are aggregated to produce classification predictions. To mitigate interference from non-target plugins, we further introduce a lightweight weighting unit that learns to assign importance scores to different plugin-enhanced representations. DLC could deliver a significant 8% accuracy gain on large-scale benchmarks while introducing only a 4% increase in backbone parameters, highlighting its exceptional efficiency. Moreover, DLC is compatible with other plug-and-play CL enhancements and delivers additional gains when combined with them.

2512.00508 2026-04-06 stat.ME

High-dimensional Autoregressive Modeling for Time Series with Hierarchical Structures

Lan Li, Shibo Yu, Yingzhou Wang, Guodong Li

Details
English abstract

Modern applications have made high-dimensional data ubiquitous, especially time-dependent data with increasingly complicated structures, and hierarchical relationships among variables are encountered more and more frequently. However, the literature still lacks supervised learning tools for such data. To fill this gap, we introduce a new model-designing framework and combine it with unsupervised factor modeling tools to form an efficient and interpretable autoregressive model for high-dimensional time series with hierarchical structures. An ordinary least squares estimation is considered, and its non-asymptotic properties are established. Moreover, we propose an algorithm to search for estimates, and a boosting method is also suggested for hyperparameter selection. Simulation experiments are conducted to evaluate finite-sample performance of the proposed methodology, and its usefulness is demonstrated by an application to the Personality-120 dataset.

2511.21595 2026-04-06 stat.ME math.ST stat.TH

Degrees of Freedom in Penalized Regression: Model Selection with Adaptive Penalties

Mauro Bernardi, Antonio Canale, Marco Stefanucci

Details
English abstract

Model selection in penalized regression critically depends on an accurate assessment of model complexity, commonly quantified through the effective degrees of freedom. While the Lasso admits a simple and unbiased characterization, given by the size of the active set, this property does not extend to adaptive penalization methods, despite the widespread use of this approximation in practice. To solve this issue, in this paper we derive a novel unbiased estimator of the effective degrees of freedom for the Adaptive Lasso within Stein's unbiased risk estimation framework. Our analysis reveals additional terms induced by data-dependent penalization, reflecting the role of adaptive weights and regularization in determining model complexity. We further revisit the Group Lasso, providing an alternative derivation of its degrees of freedom, and extend these results to the Adaptive Group Lasso. Importantly, we characterize the behavior of the degrees of freedom along the regularization path beyond the orthonormal design setting commonly assumed in the literature, providing a new theoretical description of this behavior under general design matrices. By correcting the common misuse of active set size as a proxy for degrees of freedom, our results enable more reliable risk estimation and inference, offering a rigorous foundation for understanding model complexity in adaptive penalized regression.
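
A toy calculation shows why the active-set size stops being the right degrees-of-freedom count once the penalty weights depend on the data. Assume an orthonormal design, so both estimators reduce to coordinate-wise soft-thresholding of the OLS coefficients z. The Lasso uses a fixed threshold λ, while the Adaptive Lasso with weights 1/|z_j|^γ uses the data-dependent threshold λ/|z_j|^γ. The Stein divergence Σ_j ∂β̂_j/∂z_j is computed below by finite differences. This is only an illustration under those assumptions, not the estimator derived in the paper, and all function names are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(z, lam=1.0):
    """Lasso under an orthonormal design: a fixed threshold for every coordinate."""
    return [soft_threshold(zj, lam) for zj in z]

def adaptive_lasso(z, lam=1.0, gamma=1.0):
    """Adaptive Lasso under an orthonormal design with weights 1 / |z_j|**gamma."""
    return [soft_threshold(zj, lam / abs(zj) ** gamma) if zj != 0.0 else 0.0 for zj in z]

def divergence(estimator, z, eps=1e-6):
    """Central finite-difference approximation of sum_j d beta_j / d z_j."""
    total = 0.0
    for j in range(len(z)):
        up, down = list(z), list(z)
        up[j] += eps
        down[j] -= eps
        total += (estimator(up)[j] - estimator(down)[j]) / (2 * eps)
    return total

z = [3.0, -2.0, 0.5, 0.1]
active_lasso = sum(b != 0.0 for b in lasso(z))
active_adaptive = sum(b != 0.0 for b in adaptive_lasso(z))
df_lasso = divergence(lasso, z)
df_adaptive = divergence(adaptive_lasso, z)
```

Both fits have two active coefficients. The Lasso divergence equals 2, but the adaptive divergence is 2 + 1/9 + 1/4, because each active coordinate contributes the extra term λγ/|z_j|^(γ+1) from differentiating its own threshold.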

2511.13394 2026-04-06 cs.LG stat.ML

Fast and Robust Simulation-Based Inference With Optimization Monte Carlo

Vasilis Gkolemis, Christos Diou, Michael U. Gutmann

Comments Accepted at AISTATS 2026

Details
English abstract

Bayesian parameter inference for complex stochastic simulators is challenging due to intractable likelihood functions. Existing simulation-based inference methods often require a large number of simulations and become costly to use in high-dimensional parameter spaces or in problems with partially uninformative outputs. We propose a new method for differentiable simulators that delivers accurate posterior inference with substantially reduced runtimes. Building on the Optimization Monte Carlo framework, our approach reformulates inference for stochastic simulators in terms of deterministic optimization problems. Gradient-based methods are then applied to efficiently navigate toward high-density posterior regions and avoid wasteful simulations in low-probability areas. A JAX-based implementation further enhances the performance through vectorization of key method components. Extensive experiments, including high-dimensional parameter spaces, uninformative outputs, multiple observations, and multimodal posteriors, show that our method consistently matches, and often exceeds, the accuracy of state-of-the-art approaches, while reducing the runtime by a substantial margin.

2511.01154 2026-04-06 math.PR cs.LG math.ST stat.TH

Stability of the Kim--Milman flow map

Sinho Chewi, Aram-Alexandre Pooladian, Matthew S. Zhang

Details
English abstract

In this short note, we characterize stability of the Kim--Milman flow map -- also known as the probability flow ODE -- with respect to variations in the target measure in relative Fisher information.

2510.15483 2026-04-06 stat.ML cs.LG

Fast Best-in-Class Regret for Contextual Bandits

Samuel Girard, Aurelien Bibaut, Arthur Gretton, Nathan Kallus, Houssam Zenati

Details
English abstract

We study the problem of stochastic contextual bandits in the agnostic setting, where the goal is to compete with the best policy in a given class without assuming realizability or imposing model restrictions on losses or rewards. In this work, we establish the first fast rate for regret relative to the best-in-class policy. Our proposed algorithm updates the policy at every round by minimizing a pessimistic objective, defined as a clipped inverse-propensity estimate of the policy value plus a variance penalty. By leveraging entropy assumptions on the policy class and a Hölderian error-bound condition (a generalization of the margin condition), we achieve fast best-in-class regret rates, including polylogarithmic rates in the parametric case. The analysis is driven by a sequential self-normalized maximal inequality for bounded martingale empirical processes, which yields uniform variance-adaptive confidence bounds and guarantees pessimism under adaptive data collection.
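
The shape of such a pessimistic objective can be sketched as follows. The abstract does not give the exact clipping rule or penalty form, so this sketch assumes that the policy value is an expected loss to be minimized, that clipping caps the importance weights at a constant, and that the penalty is a multiple of the estimated standard error. The name `pessimistic_objective` and its parameters are hypothetical.

```python
import math

def pessimistic_objective(losses, target_probs, logging_probs, clip, penalty):
    """Clipped inverse-propensity loss estimate plus a variance penalty.

    losses[i]        observed loss of the logged action in round i
    target_probs[i]  probability the candidate policy assigns to that action
    logging_probs[i] probability the data-collecting policy assigned to it
    clip             cap on the importance weight (assumed clipping rule)
    penalty          multiplier on the estimated standard error (assumed form)
    """
    n = len(losses)
    terms = [
        min(pt / pl, clip) * loss
        for loss, pt, pl in zip(losses, target_probs, logging_probs)
    ]
    mean = sum(terms) / n
    var = sum((t - mean) ** 2 for t in terms) / (n - 1) if n > 1 else 0.0
    return mean + penalty * math.sqrt(var / n)

objective = pessimistic_objective(
    losses=[1.0, 0.0, 1.0, 0.0],
    target_probs=[0.5, 0.5, 0.5, 0.5],
    logging_probs=[0.5, 0.5, 0.25, 0.25],
    clip=1.5,
    penalty=2.0,
)
```

A learner would evaluate this objective for each candidate policy and pick the minimizer, so policies whose estimated loss is low only because of a few high-variance weighted terms are disfavored.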

2510.15075 2026-04-06 cs.LG stat.ML

Physics-informed data-driven machine health monitoring for two-photon lithography

Sixian Jia, Zhiqiao Dong, Chenhui Shao

Details
Journal ref
Journal of Manufacturing Processes 166 (2026) 319-329
English abstract

Two-photon lithography (TPL) is a sophisticated additive manufacturing technology for creating three-dimensional (3D) micro- and nano-structures. Maintaining the health of TPL systems is critical for ensuring consistent fabrication quality. Current maintenance practices often rely on experience rather than informed monitoring of machine health, resulting in either untimely maintenance that causes machine downtime and poor-quality fabrication, or unnecessary maintenance that leads to inefficiencies and avoidable downtime. To address this gap, this paper presents three methods for accurate and timely monitoring of TPL machine health. Through integrating physics-informed data-driven predictive models for structure dimensions with statistical approaches, the proposed methods are able to handle increasingly complex scenarios featuring different levels of generalizability. A comprehensive experimental dataset that encompasses six process parameter combinations and six structure dimensions under two machine health conditions was collected to evaluate the effectiveness of the proposed approaches. Across all test scenarios, the approaches are shown to achieve high accuracies, demonstrating excellent effectiveness, robustness, and generalizability. These results represent a significant step toward condition-based maintenance for TPL systems.

2510.02513 2026-04-06 stat.ML cs.DS cs.LG cs.NA math.NA stat.CO

Adaptive randomized pivoting and volume sampling

Ethan N. Epperly

Comments 14 pages, 2 figures

Details
English abstract

Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning algorithms for linear regression. As consequences, this paper presents new analysis for the ARP algorithm and faster implementations using rejection sampling.

2509.25708 2026-04-06 stat.ME stat.AP

Modeling Spatial Heterogeneity in Exposure Buffers and Risk: A Hierarchical Bayesian Approach

Saskia Comess, Daniel E Ho, Joshua L Warren

Comments Submitted to the Journal of the Royal Statistical Society, Series C

Details
Journal ref
Journal of the Royal Statistical Society Series C: Applied Statistics, 2026, qlag020
Abstract

Place-based epidemiology studies often rely on circular buffers to define ``exposure'' to spatially distributed risk factors, where the buffer radius represents a threshold beyond which exposure does not influence the outcome of interest. This approach is popular due to its simplicity and alignment with public health policies. However, buffer radii are often chosen relatively arbitrarily and assumed constant across the spatial domain. This may result in suboptimal statistical inference if these modeling choices are incorrect. To address this, we develop SVBR (Spatially-Varying Buffer Radii), a flexible hierarchical Bayesian spatial change points approach that treats buffer radii as unknown parameters and allows both radii and exposure effects to vary spatially. Through simulations, we find that SVBR improves estimation and inference for key model parameters compared to traditional methods. We also apply SVBR to study healthcare access in Madagascar, finding that proximity to healthcare facilities generally increases antenatal care usage, with clear spatial variation in this relationship. By relaxing rigid assumptions about buffer characteristics, our method offers a flexible, data-driven approach to accurately defining exposure and quantifying its impact. The newly developed methods are available in the R package EpiBuffer.

2509.05221 2026-04-06 stat.ME

A functional tensor model for dynamic multilayer networks with common invariant subspaces and the RKHS estimation

Runshi Tang, Runbing Zheng, Anru R. Zhang, Carey E. Priebe

Details
Abstract

Dynamic multilayer networks are frequently used to describe the structure and temporal evolution of multiple relationships among common entities, with applications in fields such as sociology, economics, and neuroscience. However, exploration of analytical methods for these complex data structures remains limited. We propose a functional tensor-based model for dynamic multilayer networks, with the key feature of capturing the shared structure among common vertices across all layers, while simultaneously accommodating smoothly varying temporal dynamics and layer-specific heterogeneity. The proposed model and its embeddings can be applied to various downstream network inference tasks, including dimensionality reduction, vertex community detection, analysis of network evolution periodicity, visualization of dynamic network evolution patterns, and evaluation of inter-layer similarity. We provide an estimation algorithm based on functional tensor Tucker decomposition and the reproducing kernel Hilbert space framework, with an effective initialization strategy to improve computational efficiency. The estimation procedure can be extended to address more generalized functional tensor problems, as well as to handle missing data or unaligned observations. We validate our method on simulated data and two real-world cases: the dynamic Citi Bike trip network and an international food trade dynamic multilayer network, with each layer corresponding to a different product.

2509.04603 2026-04-06 stat.AP cs.LG

DRtool: An Interactive Tool for Analyzing High-Dimensional Clusterings

Justin Lin, Julia Fukuyama

Comments 32 pages, 14 figures

Details
Abstract

When faced with new data, we often conduct a cluster analysis to obtain a better understanding of the data's structure and the archetypical samples present in the data. This process often includes visualization of the data, as a way to either discover or verify clusters. However, increases in data complexity and dimensionality have made this step very tricky. To visualize data, nonlinear dimension reduction methods are the de facto standard for their ability to non-uniformly stretch and shrink space in order to preserve local clusters. Because this process requires a drastic manipulation of space, however, nonlinear dimension reduction methods are known to produce false structures, especially when mishandled. A common consequence that often goes undetected by the untrained eye is over-clustering of the data. In an effort to deal with this phenomenon, we developed an interactive tool that empowers analysts to distinguish false clusters and better interpret their high-dimensional clustering results. The tool uses various analytical plots to provide a multi-faceted perspective on the data's global structure as well as local inter-cluster relationships, helping users determine the legitimacy of their high-dimensional clustering results. The tool is available via an R package named DRtool.

2506.06845 2026-04-06 stat.CO stat.ML

Linear Discriminant Analysis with Gradient Optimization

Cencheng Shen, Yuexiao Dong

Comments 26 pages

Details
Abstract

Linear discriminant analysis (LDA) is a fundamental classification and dimension reduction method that achieves Bayes optimality under Gaussian mixture, but often struggles in high-dimensional settings where the covariance matrix cannot be reliably estimated. We propose LDA with gradient optimization (LDA-GO), which learns a low-rank precision matrix via scalable gradient-based optimization. The method automatically selects between a Gaussian likelihood and a cross-entropy loss using data-driven structural diagnostics, adapting to the signal structure without user tuning. The gradient computation avoids any quadratic-sized intermediate matrix, keeping the per-iteration cost linear in the number of dimensions. Theoretically, we prove several properties of the method, including the convexity of the objective functions, Bayes-optimality of the method, and a finite-sample bound on the excess error. Numerically, we conducted a variety of simulations and real data experiments to show that LDA-GO outperforms other LDA variants in a majority of settings, particularly in sparse-signal high-dimensional regimes.

2505.21723 2026-04-06 stat.CO cs.LG stat.ML

Are Statistical Methods Obsolete in the Era of Deep Learning? A Study of ODE Inverse Problems

Skyler Wu, Shihao Yang, S. C. Kou

Comments 35 pages, 11 figures (main text)

Details
Abstract

In the era of AI, neural networks have become increasingly popular for modeling, inference, and prediction, largely due to their potential for universal approximation. With the proliferation of such deep learning models, a question arises: are leaner statistical methods still relevant? To shed light on this question, we employ the mechanistic nonlinear ordinary differential equation (ODE) inverse problem as a testbed, using the physics-informed neural network (PINN) as a representative of the deep learning paradigm and manifold-constrained Gaussian process inference (MAGI) as a representative of statistically principled methods. Through case studies involving the SEIR model from epidemiology and the Lorenz model from chaotic dynamics, we demonstrate that statistical methods are far from obsolete, especially when working with sparse and noisy observations. On tasks such as parameter inference and trajectory reconstruction, statistically principled methods consistently achieve lower bias and variance, while using far fewer parameters and requiring less hyperparameter tuning. Statistical methods can also decisively outperform deep learning models on out-of-sample future prediction, where the absence of relevant data often leads overparameterized models astray. Additionally, we find that statistically principled approaches are more robust to accumulation of numerical imprecision and can represent the underlying system more faithfully to the true governing ODEs.

2505.07647 2026-04-06 math.PR stat.ML

Langevin Diffusion Approximation to Same Marginal Schrödinger Bridge

Medha Agarwal, Zaid Harchaoui, Garrett Mulcahy, Soumik Pal

Comments Final version. arXiv admin note: substantial text overlap with arXiv:2406.10823

Details
Abstract

We introduce a novel approximation to the same marginal Schrödinger bridge using the Langevin diffusion. As $\varepsilon \downarrow 0$, it is known that the barycentric projection (also known as the entropic Brenier map) of the Schrödinger bridge converges to the Brenier map, which is the identity. Our diffusion approximation is leveraged to show that, under suitable assumptions, the difference between the two is $\varepsilon$ times the gradient of the marginal log density (i.e., the score function), in $\mathbf{L}^2$. More generally, we show that the family of Markov operators, indexed by $\varepsilon > 0$, derived from integrating test functions against the conditional density of the static Schrödinger bridge at temperature $\varepsilon$, admits a derivative at $\varepsilon=0$ given by the generator of the Langevin semigroup. Hence, these operators satisfy an approximate semigroup property at low temperatures.

2504.19331 2026-04-06 math.ST stat.TH

Bahadur asymptotic efficiency in the zone of moderate deviation probabilities

Mikhail Ermakov

Comments 9 pages

Details
Abstract

For a sequence of independent identically distributed random variables having a distribution function with an unknown parameter from a set $\Theta \subset \mathbf{R}^d$, we prove an analogue of the lower bound of Bahadur asymptotic efficiency for the zone of moderate deviation probabilities. The assumptions coincide with the conditions under which the locally asymptotically minimax lower bound of Hajek-Le Cam was proved. The lower bound for local Bahadur asymptotic efficiency is a special case of this lower bound.

2504.01938 2026-04-06 cs.LG cs.NA math.NA stat.ML

A Unified Approach to Analysis and Design of Denoising Markov Models

Yinuo Ren, Grant M. Rotskoff, Lexing Ying

Details
Abstract

Probabilistic generative models based on measure transport, such as diffusion and flow-based models, are often formulated in the language of Markovian stochastic dynamics, where the choice of the underlying process impacts both algorithmic design choices and theoretical analysis. In this paper, we aim to establish a rigorous mathematical foundation for denoising Markov models, a broad class of generative models that postulate a forward process transitioning from the target distribution to a simple, easy-to-sample distribution, alongside a backward process particularly constructed to enable efficient sampling in the reverse direction. Leveraging deep connections with nonequilibrium statistical mechanics and generalized Doob's $h$-transform, we propose a minimal set of assumptions that ensure: (1) explicit construction of the backward generator, (2) a unified variational objective directly minimizing the measure transport discrepancy, and (3) adaptations of the classical score-matching approach across diverse dynamics. Our framework unifies existing formulations of continuous and discrete diffusion models, identifies the most general form of denoising Markov models under certain regularity assumptions on forward generators, and provides a systematic recipe for designing denoising Markov models driven by arbitrary Lévy-type processes. We illustrate the versatility and practical effectiveness of our approach through novel denoising Markov models employing geometric Brownian motion and jump processes as forward dynamics, highlighting the framework's potential flexibility and capability in modeling complex distributions.

2503.10773 2026-04-06 stat.ML cs.LG

Learn then Decide: A Learning Approach for Designing Data Marketplaces

Yingqi Gao, Wenlu Xu, Jin J. Zhou, Hua Zhou, Yong Chen, Xiaowu Dai

Details
Abstract

As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that first estimates the bidders' value distribution through auctions and then determines the optimal posted price based on the learned distribution. We establish that MAPP is individually rational and incentive-compatible, ensuring truthful bidding while balancing revenue maximization with minimal price discrimination. On the theoretical side, we establish a statistical viewpoint that recasts revenue optimization as a valuation density estimation problem: we show that revenue regret can be controlled by uniform error in estimating the valuation density. MAPP achieves a regret of $O_p(n^{-1}(\log n)^2)$ when incorporating historical bid data, where $n$ is the number of bids in the current round. For sequential dataset sales over $T$ rounds, we propose an online MAPP mechanism that dynamically adjusts pricing across datasets with varying value distributions. Our approach achieves no-regret learning, with the average cumulative regret converging at a rate of $O_p(T^{-1/2}(\log T)^2)$. We validate the effectiveness of MAPP through simulations and real-world data from the FCC AWS-3 spectrum auction.
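The second stage described above (choosing a posted price from a learned value distribution) can be illustrated with a small sketch. This is not the MAPP mechanism itself: it assumes the bidders' values are observed directly and uses the empirical distribution in place of the density estimate the abstract refers to. The function name `best_posted_price` is hypothetical.

```python
from typing import Sequence, Tuple


def best_posted_price(values: Sequence[float]) -> Tuple[float, float]:
    """Return (price, expected revenue per buyer) under the empirical distribution.

    Assumption for illustration: a buyer purchases whenever their value is at
    least the posted price, so revenue(p) = p * (fraction of values >= p).
    The empirical revenue curve is maximized at one of the observed values,
    so only those candidates are scanned.
    """
    if not values:
        raise ValueError("need at least one observed value")
    n = len(values)
    best_price, best_revenue = 0.0, 0.0
    for price in sorted(set(values)):
        buyers = sum(1 for v in values if v >= price)
        revenue = price * buyers / n
        if revenue > best_revenue:  # strict '>' keeps the lowest maximizer on ties
            best_price, best_revenue = price, revenue
    return best_price, best_revenue


price, revenue = best_posted_price([1.0, 2.0, 2.0, 5.0])
print(price, revenue)
```

With the four toy values above, a price of 2 sells to three of four buyers, which beats both the lowest and the highest observed value.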

2503.04876 2026-04-06 stat.ME math.ST stat.TH

Estimation of relative risk, odds ratio and their logarithms with guaranteed accuracy and controlled sample size ratio

Luis Mendo

Comments 47 pages, 19 figures

Details
Journal ref
Statistical Papers, volume 67, issue 3, June 2026
Abstract

Given two populations from which independent binary observations are taken with parameters $p_1$ and $p_2$ respectively, estimators are proposed for the relative risk $p_1/p_2$, the odds ratio $p_1(1-p_2)/(p_2(1-p_1))$ and their logarithms. The sampling strategy used by the estimators is based on two-stage sequential sampling applied to each population, where the sample sizes of the second stage depend on the results observed in the first stage. The estimators guarantee that the relative mean-square error, or the mean-square error for the logarithmic versions, is less than a target value for any $p_1, p_2 \in (0,1)$, and the ratio of average sample sizes from the two populations is close to a prescribed value. The estimators can also be used with group sampling, whereby samples are taken in batches of fixed size from the two populations simultaneously, each batch containing samples from the two populations. The efficiency of the estimators with respect to the Cramér-Rao bound is good, and in particular it is close to $1$ for small values of the target error.
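For reference, the four target quantities named in the abstract can be written out directly. The sketch below only evaluates the definitions for known $p_1$ and $p_2$; it does not implement the paper's two-stage sequential estimators, and the function names are chosen here for illustration.

```python
import math


def _check_probability(p: float) -> None:
    # The abstract considers p1, p2 in the open interval (0, 1).
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")


def relative_risk(p1: float, p2: float) -> float:
    """Relative risk p1 / p2."""
    _check_probability(p1)
    _check_probability(p2)
    return p1 / p2


def odds_ratio(p1: float, p2: float) -> float:
    """Odds ratio p1 (1 - p2) / (p2 (1 - p1))."""
    _check_probability(p1)
    _check_probability(p2)
    return p1 * (1.0 - p2) / (p2 * (1.0 - p1))


def log_relative_risk(p1: float, p2: float) -> float:
    """Logarithm of the relative risk."""
    return math.log(relative_risk(p1, p2))


def log_odds_ratio(p1: float, p2: float) -> float:
    """Logarithm of the odds ratio."""
    return math.log(odds_ratio(p1, p2))


print(relative_risk(0.2, 0.1), odds_ratio(0.5, 0.25))
```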

2502.11868 2026-04-06 stat.ME

Phylogenetic latent space models for network data

Federico Pavone, Daniele Durante, Robin J. Ryder

Details
Abstract

Latent space models for network data characterize each node through a vector of latent features whose pairwise similarities define the edge probabilities among the pairs of nodes. Although this formulation has led to successful implementations, the overarching focus has been on directly inferring node embeddings through the latent features, rather than learning the generative process underlying these embeddings. This focus prevents borrowing information across the node features and limits the ability to infer higher-level architectures governing network formation. For example, routinely-studied networks often exhibit multiscale structures informing on nested modular hierarchies among nodes, which could be learned via tree-based representations of dependencies among the latent features. We pursue this direction by bridging latent variable representations of network data with concepts from phylogenetic inference to design a novel latent space model that explicitly characterizes the generative process of the node feature vectors through a branching Brownian motion, with branching structure parametrized by a tree. This tree constitutes the main object of interest and is learned under a Bayesian perspective leveraging priors inherited from phylogenetic literature to infer tree-based modular hierarchies across nodes, which explain heterogeneous multiscale patterns in the network. Identifiability results are derived along with posterior consistency theory. The inference potentials of our model are illustrated in simulations and two real-data applications from criminology and neuroscience, where our formulation learns core structures hidden to state-of-the-art alternatives.

2502.00092 2026-04-06 math.ST cond-mat.dis-nn math.MG math.PR stat.TH

Minkowski tensors for point clouds and voxelized data: robust, asymptotically unbiased estimators

Daniel Hug, Michael A. Klatt, Dominik Pabst

Comments Substantially revised version

Details
Abstract

Minkowski tensors, also known as tensor valuations, provide robust $n$-point information for a wide range of random spatial structures. Local estimators for point clouds, e.g., representing voxelized data, however, are unavoidably biased even in the limit of infinitely high resolution. Here, we substantially improve a recently proposed, asymptotically unbiased algorithm to estimate Minkowski tensors from point clouds. Our improved algorithm is more robust and efficient. Moreover, we generalize the theoretical foundations for an asymptotically bias-free estimation of the interfacial tensors, among others, to the case of finite unions of compact sets with positive reach, which is relevant for many applications like rough surfaces or composite materials. As a realistic test case of random spatial structures, we consider random (beta) polytopes. We first derive explicit expressions of the expected Minkowski tensors, which we then compare to our simulation results. We obtain precise estimates with relative errors of a few percent for practically relevant resolutions. Finally, we apply our methods to real data of metallic grains and nanorough surfaces, and we provide an open-source Python package, which works in any dimension.

2410.06128 2026-04-06 cs.LG stat.ML

Amortized Inference of Causal Models via Conditional Fixed-Point Iterations

Divyat Mahajan, Jannes Gladrow, Agrin Hilmkil, Cheng Zhang, Meyer Scetbon

Comments Transactions on Machine Learning Research (TMLR) 2025 (J2C Certification). ICLR 2026

Details
Abstract

Structural Causal Models (SCMs) offer a principled framework to reason about interventions and support out-of-distribution generalization, which are key goals in scientific discovery. However, the task of learning SCMs from observed data poses formidable challenges, and often requires training a separate model for each dataset. In this work, we propose an amortized inference framework that trains a single model to predict the causal mechanisms of SCMs conditioned on their observational data and causal graph. We first use a transformer-based architecture for amortized learning of dataset embeddings, and then extend the Fixed-Point Approach (FiP) to infer the causal mechanisms conditionally on their dataset embeddings. As a byproduct, our method can generate observational and interventional data from novel SCMs at inference time, without updating parameters. Empirical results show that our amortized procedure performs on par with baselines trained specifically for each dataset on both in- and out-of-distribution problems, and also outperforms them in scarce data regimes.

2409.11167 2026-04-06 stat.ME math.ST stat.TH

Using fractional derivatives to derive marginal densities

Si-Yang Li, David A. van Dyk, Maximilian Autenrieth

Details
Abstract

This paper presents a novel method for analytical derivations of marginal densities using the fractional derivatives of moment-generating functions. Although the method requires likelihood functions to take specific forms, its assumptions are otherwise modest. It only requires that the prior moment-generating functions exist, are finite, and are continuous and differentiable at certain points. We also present the probabilistic and statistical insights behind this method.

2404.03198 2026-04-06 stat.ME

Delaunay Weighted Two-sample Test for High-dimensional Data by Incorporating Geometric Information

Jiaqi Gu, Ruoxu Tan, Guosheng Yin

Details
Abstract

Two-sample hypothesis testing is a fundamental problem with various applications, which faces new challenges in the high-dimensional context. To mitigate the issue of the curse of dimensionality, high-dimensional data are typically assumed to lie on a low-dimensional manifold. To incorporate geometric information in the data, we propose to apply the Delaunay triangulation and develop the Delaunay weight to measure the geometric proximity among data points. In contrast to existing similarity measures that only utilize pairwise distances, the Delaunay weight can take both the distance and direction information into account. A detailed computation procedure is developed to learn the unknown manifold and approximate the Delaunay weight. We further propose a novel nonparametric test statistic using the Delaunay weight matrix. Asymptotic normality under the null and consistency under the alternative are established for the test statistic. Applied to simulated data, the new test shows robustness to the learning of the unknown manifold and exhibits substantial power gain if the distributions differ in direction. The proposed test also shows great power on a real dataset of mice protein expression levels.

2402.01207 2026-04-06 cs.LG cs.AI stat.ME

Efficient Causal Graph Discovery Using Large Language Models

Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, Yoshua Bengio

Details
Abstract

We propose a novel framework that leverages LLMs for full causal graph discovery. While previous LLM-based methods have used a pairwise query approach, this requires a quadratic number of queries, which quickly becomes impractical for larger causal graphs. In contrast, the proposed framework uses a breadth-first search (BFS) approach which allows it to use only a linear number of queries. We also show that the proposed method can easily incorporate observational data when available, to improve performance. In addition to being more time- and data-efficient, the proposed framework achieves state-of-the-art results on real-world causal graphs of varying sizes. The results demonstrate the effectiveness and efficiency of the proposed method in discovering causal relationships, showcasing its potential for broad applicability in causal graph discovery tasks across different domains.
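The query-count argument in this abstract can be made concrete with a toy sketch. Here the LLM is replaced by two hypothetical oracle callbacks, `find_roots` and `find_children`, backed by a known ground-truth graph; the prompts, the use of observational data, and any cycle handling in the actual framework are not modeled. The point is only that a BFS issues one query per visited node, whereas pairwise querying needs one per pair.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str]


def bfs_discover(
    variables: List[str],
    find_roots: Callable[[List[str]], List[str]],
    find_children: Callable[[str, List[str]], List[str]],
) -> Tuple[Set[Edge], int]:
    """Recover edges by BFS, returning (edges, number of oracle queries)."""
    queries = 1  # one query to ask which variables have no causes
    roots = find_roots(variables)
    visited = set(roots)
    queue = deque(roots)
    edges: Set[Edge] = set()
    while queue:
        node = queue.popleft()
        queries += 1  # one query per visited node: "what does `node` cause?"
        for child in find_children(node, variables):
            edges.add((node, child))
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return edges, queries


# Toy ground truth standing in for the LLM's answers (an assumption for the demo).
TRUE_GRAPH: Dict[str, List[str]] = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def toy_roots(variables: List[str]) -> List[str]:
    children = {c for cs in TRUE_GRAPH.values() for c in cs}
    return [v for v in variables if v not in children]


def toy_children(node: str, variables: List[str]) -> List[str]:
    return TRUE_GRAPH[node]


edges, queries = bfs_discover(list(TRUE_GRAPH), toy_roots, toy_children)
pairwise = len(TRUE_GRAPH) * (len(TRUE_GRAPH) - 1) // 2
print(sorted(edges), queries, pairwise)
```

On four variables the gap is small (5 queries against 6 pairs), but the BFS count grows linearly with the number of variables while the pairwise count grows quadratically.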

2308.02005 2026-04-06 stat.ME

Randomization-Based Inference for Average Treatment Effects in Inexactly Matched Observational Studies

Jianan Zhu, Jeffrey Zhang, Zijian Guo, Siyu Heng

Details
Abstract

Matching is a widely used causal inference design that aims to approximate a randomized experiment using observational data by forming matched sets of treated and control units based on similarities in their covariates. Ideally, treated units are exactly matched with controls on these covariates, enabling randomization-based inference for treatment effects as in a randomized experiment, under the assumption of no unobserved covariates. However, inexact matching often occurs, leading to residual covariate imbalance after matching. Previous matched studies have typically overlooked this issue and relied on conventional randomization-based inference, assuming that some covariate balance criteria are met. Recent research, however, has shown that this approach can introduce significant bias and proposed methods to correct for bias arising from inexact matching in randomization-based inference. These methods, however, are primarily focused on the constant treatment effect and its extensions (i.e., Fisher's sharp null) and do not apply to average treatment effects (i.e., Neyman's weak null). To address this gap, we introduce a new method--inverse post-matching probability weighting--for conducting randomization-based inference for average treatment effects under inexact matching. Our theoretical and simulation results indicate that, compared to conventional randomization-based inference methods, our approach significantly reduces bias and improves coverage rates in the presence of inexact matching.

2109.11142 2026-04-06 stat.ME math.ST stat.TH

Sparse PCA: A New Scalable Estimator Based On Integer Programming

Kayhan Behdin, Rahul Mazumder

Comments To appear in the Annals of Statistics

Details
Abstract

We consider the Sparse Principal Component Analysis (SPCA) problem under the well-known spiked covariance model. Recent work has shown that the SPCA problem can be reformulated as a Mixed Integer Program (MIP) and can be solved to global optimality, leading to estimators that are known to enjoy optimal statistical properties. However, prior MIP algorithms for SPCA appear to be limited in terms of scalability to up to a thousand features or so. In this paper, we propose a new estimator for SPCA which can be formulated as a MIP. Different from earlier work, we make use of the underlying spiked covariance model and properties of the multivariate Gaussian distribution to arrive at our estimator. We establish statistical guarantees for our proposed estimator in terms of estimation error and support recovery. We derive guarantees under departures from the spiked covariance model, and for approximate solutions to the optimization problem. We propose a custom algorithm to solve the MIP, which scales better than off-the-shelf solvers, and demonstrate that our approach can be much more computationally attractive compared to earlier exact MIP-based approaches for the SPCA problem. Our numerical experiments on synthetic and real datasets show that our algorithms can address problems with up to 20,000 features in minutes, and generally result in favorable statistical properties compared to existing popular approaches for SPCA.

2104.04590 2026-04-06 econ.EM stat.ME

Identification of Dynamic Panel Logit Models with Fixed Effects

Christopher Dobronyi, Jiaying Gu, Kyoo il Kim, Thomas M. Russell

Details
Abstract

We show that identification in a general class of dynamic panel logit models with fixed effects is related to the truncated moment problem from the mathematics literature. We use this connection to show that the identified set for structural parameters and functionals of the distribution of latent individual effects can be characterized by a finite set of conditional moment equalities subject to a certain set of shape constraints on the model parameters. In addition to providing a general approach to identification, the new characterization can deliver informative bounds in cases where competing methods deliver no identifying restrictions, and can deliver point identification in cases where competing methods deliver partial identification. We then present an estimation and inference procedure that uses semidefinite programming methods, is applicable with continuous or discrete covariates, and can be used for models that are either point- or partially-identified. Finally, we illustrate our identification result with a number of examples, and provide an empirical application to employment dynamics using data from the National Longitudinal Survey of Youth.