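arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.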

2603.12163 2026-03-13 cs.LG cs.AI math.ST stat.ML stat.TH

A Quantitative Characterization of Forgetting in Post-Training

Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan

Abstract

Continual post-training of generative models is widely used, yet a principled understanding of when and why forgetting occurs remains limited. We develop theoretical results under a two-mode mixture abstraction (representing old and new tasks), proposed by Chen et al. (2025) (arXiv:2510.18874), and formalize forgetting in two forms: (i) mass forgetting, where the old mixture weight collapses to zero, and (ii) old-component drift, where an already-correct old component shifts during training. For equal-covariance Gaussian modes, we prove that forward-KL objectives trained on data from the new distribution drive the old weight to zero, while reverse-KL objectives converge to the true target (thereby avoiding mass forgetting) and perturb the old mean only through overlap-gated misassignment probabilities controlled by the Bhattacharyya coefficient, yielding drift that decays exponentially with mode separation and a locally well-conditioned geometry with exponential convergence. We further quantify how replay interacts with these objectives. For forward-KL, replay must modify the training distribution to change the population optimum; for reverse-KL, replay leaves the population objective unchanged but prevents finite-batch old-mode starvation through bounded importance weighting. Finally, we analyze three recently proposed near-on-policy post-training methods, SDFT (arXiv:2601.19897), TTT-Discover (arXiv:2601.16175), and OAPL (arXiv:2602.19362), via the same lens and derive explicit conditions under which each retains old mass and exhibits overlap-controlled drift. Overall, our results show that forgetting can be precisely quantified based on the interaction between divergence direction, geometric behavioral overlap, sampling regime, and the visibility of past behavior during training.
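The overlap quantity named in this abstract can be made concrete in one dimension. For two Gaussians that share a variance, the Bhattacharyya coefficient has the standard closed form exp(-Δ²/(8σ²)), so it decays exponentially in the squared separation of the means. The sketch below only illustrates that textbook formula under the assumption of scalar modes; the function name and the example separations are hypothetical and are not taken from the paper.

```python
import math


def bhattacharyya_coefficient(mu_old, mu_new, sigma):
    """Overlap between N(mu_old, sigma^2) and N(mu_new, sigma^2).

    Illustrative helper (hypothetical name): for equal variances the
    Bhattacharyya distance is (mu_new - mu_old)^2 / (8 * sigma^2), and the
    coefficient is exp(-distance).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    delta = mu_new - mu_old
    return math.exp(-(delta * delta) / (8.0 * sigma * sigma))


if __name__ == "__main__":
    # Larger separation between the old and new mode means smaller overlap.
    for separation in (0.0, 2.0, 4.0, 8.0):
        overlap = bhattacharyya_coefficient(0.0, separation, 1.0)
        print(f"separation={separation:4.1f}  overlap={overlap:.6f}")
```

A value of 1 means the two modes coincide, and values near 0 mean they barely overlap.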

2603.12102 2026-03-13 stat.ML cs.LG stat.CO stat.ME

Wasserstein Gradient Flows for Batch Bayesian Optimal Experimental Design

Louis Sharrock

Abstract

Bayesian optimal experimental design (BOED) provides a powerful, decision-theoretic framework for selecting experiments so as to maximise the expected utility of the data to be collected. In practice, however, its applicability can be limited by the difficulty of optimising the chosen utility. The expected information gain (EIG), for example, is often high-dimensional and strongly non-convex. This challenge is particularly acute in the batch setting, where multiple experiments are to be designed simultaneously. In this paper, we introduce a new approach to batch EIG-based BOED via a probabilistic lifting of the original optimisation problem to the space of probability measures. In particular, we propose to optimise an entropic regularisation of the expected utility over the space of design measures. Under mild conditions, we show that this objective admits a unique minimiser, which can be explicitly characterised in the form of a Gibbs distribution. The resulting design law can be used directly as a randomised batch-design policy, or as a computational relaxation from which a deterministic batch is extracted. To obtain scalable approximations when the batch size is large, we then consider two tractable restrictions of the full batch distribution: a mean-field family, and an i.i.d. product family. For the i.i.d. objective, and formally for its mean-field extension, we derive the corresponding Wasserstein gradient flow, characterise its long-time behaviour, and obtain particle-based algorithms via space-time discretisations. We also introduce doubly stochastic variants that combine interacting particle updates with Monte Carlo estimators of the EIG gradient. Finally, we illustrate the performance of the proposed methods in several numerical experiments, demonstrating their ability to explore multimodal optimisation landscapes and obtain high-utility batches in challenging examples.

2603.12060 2026-03-13 cs.LG cs.AI math.ST stat.ML stat.TH

Chemical Reaction Networks Learn Better than Spiking Neural Networks

Sophie Jaffard, Ivo F. Sbalzarini

Comments Keywords: Chemical Reaction Networks, Spiking Neural Networks, Supervised Learning, Classification, Mass-Action Kinetics, Statistical Learning Theory, Regret Bounds, Model Complexity

2603.11991 2026-03-13 cs.CL cs.AI cs.LG stat.ML

BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs

Ilias Aarab

Comments Accepted at ICLR 2026. 31 pages, 5 figures, 9 tables. Code: https://github.com/IliasAarab/btzsc ; Dataset: https://huggingface.co/datasets/btzsc/btzsc ; Leaderboard: https://huggingface.co/spaces/btzsc/btzsc-leaderboard . Proceedings of the Fourteenth International Conference on Learning Representations (ICLR 2026), 2026

Abstract

Zero-shot text classification (ZSC) offers the promise of eliminating costly task-specific annotation by matching texts directly to human-readable label descriptions. While early approaches have predominantly relied on cross-encoder models fine-tuned for natural language inference (NLI), recent advances in text-embedding models, rerankers, and instruction-tuned large language models (LLMs) have challenged the dominance of NLI-based architectures. Yet, systematically comparing these diverse approaches remains difficult. Existing evaluations, such as MTEB, often incorporate labeled examples through supervised probes or fine-tuning, leaving genuine zero-shot capabilities underexplored. To address this, we introduce BTZSC, a comprehensive benchmark of 22 public datasets spanning sentiment, topic, intent, and emotion classification, capturing diverse domains, class cardinalities, and document lengths. Leveraging BTZSC, we conduct a systematic comparison across four major model families, NLI cross-encoders, embedding models, rerankers and instruction-tuned LLMs, encompassing 38 public and custom checkpoints. Our results show that: (i) modern rerankers, exemplified by Qwen3-Reranker-8B, set a new state-of-the-art with macro F1 = 0.72; (ii) strong embedding models such as GTE-large-en-v1.5 substantially close the accuracy gap while offering the best trade-off between accuracy and latency; (iii) instruction-tuned LLMs at 4--12B parameters achieve competitive performance (macro F1 up to 0.67), excelling particularly on topic classification but trailing specialized rerankers; (iv) NLI cross-encoders plateau even as backbone size increases; and (v) scaling primarily benefits rerankers and LLMs over embedding models. BTZSC and accompanying evaluation code are publicly released to support fair and reproducible progress in zero-shot text understanding.
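The headline numbers in this abstract are macro F1 scores, that is, the unweighted mean of per-class F1. As a reading aid, here is a minimal pure-Python sketch of that metric. It is not the benchmark's evaluation code, and the function name and toy labels are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    scores = []
    for label in set(y_true) | set(y_pred):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive because
        # the label occurs at least once in y_true or y_pred.
        scores.append(2.0 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(macro_f1(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"]))
```

Because every class contributes equally, macro F1 penalises poor performance on rare classes more than accuracy does.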

2603.11989 2026-03-13 cs.LG math.OC stat.ML

On-Average Stability of Multipass Preconditioned SGD and Effective Dimension

Simon Vary, Tyler Farghly, Ilja Kuzborskij, Patrick Rebeschini

Comments 35 pages, 1 figure

2603.11965 2026-03-13 stat.ML cs.LG stat.ME

Uncovering Locally Low-dimensional Structure in Networks by Locally Optimal Spectral Embedding

Hannah Sansford, Nick Whiteley, Patrick Rubin-Delanchy

2603.11960 2026-03-13 stat.ME cond-mat.mtrl-sci

Bayesian Model Calibration with Integrated Discrepancy: Addressing Inexact Dislocation Dynamics Models

Liam Myhill, Enrique Martinez Saez, Sez Russcher

Comments Preprint with arxiv formatting

2603.10520 2026-03-13 astro-ph.EP astro-ph.IM stat.AP

Spectral Decomposition Reveals Surface Processes on Europa

Gideon Yoffe, Sahar Shahaf

Comments Accepted for publication in The Astrophysical Journal

2602.23151 2026-03-13 math.CA math.PR math.ST stat.TH

High-dimensional Laplace asymptotics up to the concentration threshold

Alexander Katsevich, Anya Katsevich

Comments Change from v1: added new result on normalizing flow style posterior approximation

2512.17113 2026-03-13 stat.ME stat.CO

A systematic assessment of Large Language Models for constructing two-level fractional factorial designs

Alan R. Vazquez, Kilian M. Rother, Marco V. Charles-Gonzalez

Comments 31 pages, 11 tables

2512.06297 2026-03-13 cs.LG cond-mat.dis-nn cond-mat.stat-mech cs.AI stat.ML

Entropic Confinement and Mode Connectivity in Overparameterized Neural Networks

Luca Di Carlo, Chase Goddard, David J. Schwab

Comments ICLR 2026

2512.06210 2026-03-13 stat.AP cs.LG

Forests of Uncertaint(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict

Daniel Mittermaier, Tobias Bohne, Martin Hofer, Daniel Racek

Comments 23 pages, 4 figures, 3 tables. Replication code available at https://github.com/ccew-unibw/uncertaintrees

2511.06967 2026-03-13 stat.ME stat.CO stat.ML

Approximate Bayesian inference for cumulative probit regression models

Emanuele Aliverti

2510.07204 2026-03-13 econ.EM math.ST stat.ME stat.TH

Beyond the Oracle Property: Adaptive LASSO in Cointegrating Regressions with Local-to-Unity Regressors

Karsten Reichold, Ulrike Schneider

2507.14132 2026-03-13 stat.ME

A Bayesian Dirichlet Auto-Regressive Conditional Heteroskedasticity Model for Forecasting Currency Shares

Harrison Katz, Robert E. Weiss

2506.17373 2026-03-13 stat.ME q-bio.QM

A practical identifiability criterion leveraging weak-form parameter estimation

Nora Heitzman-Breen, Vanja Dukic, David M. Bortz

2505.22034 2026-03-13 stat.ME math.ST stat.TH

Random irregular histograms

Oskar Høgberg Simensen, Dennis Christensen, Nils Lid Hjort

2411.12184 2026-03-13 stat.ME cs.AI cs.LG

Testability of Instrumental Variables in Additive Nonlinear, Non-Constant Effects Models

Xichen Guo, Zheng Li, Biwei Huang, Yan Zeng, Zhi Geng, Feng Xie

2411.03387 2026-03-13 cs.LG stat.ML

Quantifying Aleatoric Uncertainty of the Treatment Effect: A Novel Orthogonal Learner

Valentyn Melnychuk, Stefan Feuerriegel, Mihaela van der Schaar

2311.11321 2026-03-13 stat.ML cs.AI cs.LG

Bounds on Representation-Induced Confounding Bias for Treatment Effect Estimation

Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

2603.11916 2026-03-13 stat.ME

Distributionally balanced sampling designs

Anton Grafström, Wilmer Prentius

Comments 16 pages, 3 figures

2603.11909 2026-03-13 cs.LG cs.AI stat.ML

EnTransformer: A Deep Generative Transformer for Multivariate Probabilistic Forecasting

Rajdeep Pathak, Rahul Goswami, Madhurima Panja, Palash Ghosh, Tanujit Chakraborty

2603.11897 2026-03-13 q-fin.RM stat.AP

Deriving the term-structure of loan write-off risk under IFRS 9 by using survival analysis: A benchmark study

Arno Botha, Mohammed Gabru, Marcel Muller, Janette Larney

Comments 16871 words, 44 pages, 12 Figures

2603.11835 2026-03-13 stat.ML cs.LG

Hypercomplex Widely Linear Processing: Fundamentals for Quaternion Machine Learning

Sayed Pouria Talebi, Clive Cheong Took

Comments Contributed chapter to appear in Handbook of Statistics Volume 54: Multidimensional Signal Processing, Elsevier, 2026

2603.11784 2026-03-13 cs.LG stat.ML

Language Generation with Replay: A Learning-Theoretic View of Model Collapse

Giorgio Racca, Michal Valko, Amartya Sanyal

2603.11764 2026-03-13 cs.LG stat.ML

A Further Efficient Algorithm with Best-of-Both-Worlds Guarantees for $m$-Set Semi-Bandit Problem

Botao Chen, Jongyeong Lee, Chansoo Kim, Junya Honda

2603.11761 2026-03-13 stat.ME

Causal Influence Maximization with Steady-State Guarantees

Renjie Cao, Zhuoxin Yan, Xinyan Su, Zhiheng Zhang

2603.11757 2026-03-13 cs.LG cs.AI stat.ML

Exploiting Expertise of Non-Expert and Diverse Agents in Social Bandit Learning: A Free Energy Approach

Erfan Mirzaei, Seyed Pooya Shariatpanahi, Alireza Tavakoli, Reshad Hosseini, Majid Nili Ahmadabadi

2603.11730 2026-03-13 stat.ME

Including historical control data in simultaneous inference for pre-clinical multi-arm studies

Max Menssen, Carsten Kneuer, Gyamfi Akyianu, Christian Röver, Tim Friede, Frank Schaarschmidt

Comments 48 pages, 12 figures

2603.11728 2026-03-13 stat.ME stat.CO

A Semiparametric Nonlinear Mixed Effects Model with Penalized Splines Using Automatic Differentiation

Matteo D'Alessandro, Magne Thoresen, Øystein Sørensen

2603.11705 2026-03-13 stat.ME stat.AP

Effective Degrees of Freedom for Balanced Repeated Replication and Paired Jackknife Variance Estimates: A Unified Approach via Stratum Contrasts

Matthias von Davier

2603.11701 2026-03-13 stat.ML cs.LG

Decomposing Observational Multiplicity in Decision Trees: Leaf and Structural Regret

Mustafa Cavus

Comments 19 pages, 3 figures

Abstract

Many machine learning tasks admit multiple models that perform almost equally well, a phenomenon known as predictive multiplicity. A fundamental source of this multiplicity is observational multiplicity, which arises from the stochastic nature of label collection: observed training labels represent only a single realization of the underlying ground-truth probabilities. While theoretical frameworks for observational multiplicity have been established for logistic regression, their implications for non-smooth, partition-based models like decision trees remain underexplored. In this paper, we introduce two complementary notions of observational multiplicity for decision tree classifiers: leaf regret and structural regret. Leaf regret quantifies the intrinsic variability of predictions within a fixed leaf due to finite-sample noise, while structural regret captures variability induced by the instability of the learned tree structure itself. We provide a formal decomposition of observational multiplicity into these two components and establish statistical guarantees. Our experimental evaluation across diverse credit risk scoring datasets confirms the near-perfect alignment between our theoretical decomposition and the empirically observed variance. Notably, we find that structural regret is the primary driver of observational multiplicity, accounting for over 15 times the variability of leaf regret in some datasets. Furthermore, we demonstrate that utilizing these regret measures as an abstention mechanism in selective prediction can effectively identify arbitrary regions and improve model safety, elevating recall from 92% to 100% on the most stable sub-populations. These results establish a rigorous framework for quantifying observational multiplicity, aligning with recent advances in algorithmic safety and interpretability.

2603.11685 2026-03-13 stat.AP math.ST stat.CO stat.ME stat.TH

On the Unit Teissier Distribution: Properties, Estimation Procedures and Applications

Zuber Akhter, Mohamed A. Abdelaziz, M. Z. Anis, Ahmed Z. Afify

2603.11660 2026-03-13 stat.AP q-fin.RM

One-Shot Individual Claims Reserving

Ronald Richman, Mario V. Wüthrich

2603.11532 2026-03-13 math.OC cs.LG stat.ME

Simultaneous estimation of multiple discrete unimodal distributions under stochastic order constraints

Yasuhiro Yoshida, Noriyoshi Sukegawa, Jiro Iwanaga

2603.11524 2026-03-13 stat.ME

Robust Joint Modeling for Data with Continuous and Binary Responses

Yu Wang, Ran Jin, Lulu Kang

Comments 25 pages of main texts, 13 pages of supplement, 8 figures

2603.11497 2026-03-13 econ.EM stat.ME

Variance Estimation with Dependence and Heterogeneous Means

Luther Yap

2603.11478 2026-03-13 stat.ME cs.DS

Graph Generation Methods under Partial Information

Tong Sun, Jianshu Hao, Michael C. Fu, Guangxin Jiang

Comments 53 pages, 10 figures

2603.11474 2026-03-13 stat.ME stat.AP

Dynamic Bayesian regression quantile synthesis for forecasting outlook-at-risk

Genya Kobayashi, Shonosuke Sugasawa, Yuta Yamauchi, Dongu Han

2603.11465 2026-03-13 stat.ME

Prediction-Oriented Transfer Learning for Survival Analysis

Yu Gu, Donglin Zeng, D. Y. Lin

2603.11385 2026-03-13 stat.ME stat.AP

Multivariate Functional Principal Component Analysis for Mixed-Type mHealth Data: An Application to Mood Disorders

Debangan Dey, Rahul Ghosal, Kathleen Merikangas, Vadim Zipunnikov

Comments 30 pages, 12 figures, 2 tables

2603.11368 2026-03-13 stat.ML cs.LG econ.EM stat.AP stat.ME

Spatially Robust Inference with Predicted and Missing at Random Labels

Stephen Salerno, Zhenke Wu, Tyler McCormick

2603.11355 2026-03-13 cs.LG stat.AP

Teleodynamic Learning a new Paradigm For Interpretable AI

Enrique ter Horst, Juan Diego Zambrano

2603.11315 2026-03-13 stat.AP math.ST stat.TH

Finite-Sample Decision Instability in Threshold-Based Process Capability Approval

Fei Jiang, Lei Yang

Comments 14 pages, 6 figures

2603.11304 2026-03-13 stat.ML cs.AI cs.LG stat.ME

Worst-case low-rank approximations

Anya Fries, Markus Reichstein, David Blei, Jonas Peters

2603.11283 2026-03-13 astro-ph.IM astro-ph.CO stat.AP

Two Point Correlation Function Estimation with Contaminated Data

Arya Farahi

Comments 22 pages, comments are welcome

Abstract

The two-point correlation function (2PCF) is a cornerstone of precision cosmology, yet its estimation from imaging surveys is vulnerable to contamination and incompleteness arising from imperfect target selection and pipeline-level inclusion decisions. In practice, the scientific target is a physically defined population, while the working catalog is constructed from noisy measurements and selection cuts, leading to mismatches between true and observed inclusion. These errors are often spatially structured, correlating with survey depth, observing conditions, and foregrounds, and can imprint spurious large-scale power or suppress the true clustering signal. High-resolution spectroscopic samples provide gold-standard inclusion in the target population but are typically available for only a small subset of objects. We introduce a prediction-powered Landy--Szalay (PP--LS) estimator that combines noisy inclusion labels across the full catalog with exact labels on a small spectroscopic subset while preserving the standard random-catalog normalization for survey geometry and selection. PP--LS debiases pair counts using residual-based, design-weighted corrections computed only on the labeled subset, requiring no probability calibration, known misclassification rates, or explicit modeling of contamination. Under simple random sampling of the labeled subset, we establish recovery of the oracle (true-label) Landy--Szalay pair counts and thus consistency for the target 2PCF. In simulations with clustered and spatially structured contaminants, PP--LS removes the bias of naive catalog-level estimators while achieving substantially lower variance than spectroscopic-only clustering. The resulting estimator is statistically principled, computationally lightweight, and integrates directly with standard pair-counting pipelines, enabling robust clustering inference in next-generation surveys.
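For orientation, the classical Landy-Szalay estimator that this abstract builds on combines normalised data-data (DD), data-random (DR), and random-random (RR) pair counts as (DD - 2DR + RR) / RR. The sketch below implements only that standard estimator for a single separation bin. It is not the prediction-powered PP-LS correction described in the abstract, and the function name and toy counts are assumptions made for illustration.

```python
def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Classical Landy-Szalay estimate of the 2PCF in one separation bin.

    dd, dr, rr are raw pair counts in the bin; n_data and n_rand are the sizes
    of the data and random catalogs, used to normalise the counts.
    """
    if n_data < 2 or n_rand < 2:
        raise ValueError("need at least two points in each catalog")
    if rr <= 0:
        raise ValueError("RR count must be positive in the bin")
    dd_norm = dd / (n_data * (n_data - 1) / 2.0)  # fraction of data-data pairs
    dr_norm = dr / (n_data * n_rand)              # fraction of cross pairs
    rr_norm = rr / (n_rand * (n_rand - 1) / 2.0)  # fraction of random pairs
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm


if __name__ == "__main__":
    # Toy counts: the data show twice the pair fraction of the randoms.
    print(landy_szalay(dd=18, dr=40, rr=38, n_data=10, n_rand=20))
```

When the data are distributed like the randoms the three normalised counts agree and the estimate is zero; an excess of close data pairs gives a positive value.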

2603.11282 2026-03-13 stat.ME math.ST stat.ML stat.TH

Outrigger local polynomial regression

Elliot H. Young, Rajen D. Shah, Richard J. Samworth

2603.11258 2026-03-13 stat.ME math.PR

Continuous-time modeling and bootstrap for Schnieper's reserving

Nicolas Baradel

2603.11229 2026-03-13 stat.ML cs.LG

Trustworthy predictive distributions for rare events via diagnostic transport maps

Elizabeth Cucuzzella, Rafael Izbicki, Ann B. Lee

Comments 19 pages, 5 figures, 2 tables

2603.11138 2026-03-13 stat.ML cs.LG math.ST stat.TH

Deep regression learning from dependent observations with minimum error entropy principle

William Kengne, Modou Wade

2603.11134 2026-03-13 math.ST cs.LG stat.TH

Conformal e-prediction in the presence of confounding

Vladimir Vovk, Ruodu Wang

Comments 8 pages, 2 figures

2603.11128 2026-03-13 stat.ML cs.LG cs.NE

Efficient Approximation to Analytic and $L^p$ functions by Height-Augmented ReLU Networks

ZeYu Li, FengLei Fan, TieYong Zeng

2603.11125 2026-03-13 stat.ML cs.LG

Co-Diffusion: An Affinity-Aware Two-Stage Latent Diffusion Framework for Generalizable Drug-Target Affinity Prediction

Yining Qian, Pengjie Wang, Yixiao Li, An-Yang Lu, Cheng Tan, Shuang Li, Lijun Liu

2603.11113 2026-03-13 stat.ME math.ST stat.ML stat.TH

Partition-Based Functional Ridge Regression for High-Dimensional Data

Shaista Ashraf, Ismail Shah, Farrukh Javed

Comments 32 pages, 5 figures

2603.11084 2026-03-13 stat.ME q-bio.QM

Realizing Common Random Numbers: Event-Keyed Hashing for Causally Valid Stochastic Models

Vince Buffalo, Carl A. B. Pearson, Daniel Klein

2603.11060 2026-03-13 cs.SI math.PR stat.OT

LLY Ricci Reweighting in Stochastic Block Models: Uniform Curvature Concentration and Finite-Horizon Tracking

Varun Kotharkar

2603.10065 2026-03-13 cs.IT cs.AI cs.SY eess.SY math.IT stat.ME

The Epistemic Support-Point Filter: Jaynesian Maximum Entropy Meets Popperian Falsification

Moriba Kemessia Jah

2603.09602 2026-03-13 math.ST stat.TH

Inhomogeneous Submatrix Detection

Mor Oren-Loberman, Dvir Jerbi, Tamir Bendory, Wasim Huleihel

2603.08771 2026-03-13 stat.ML cs.IT cs.LG math.IT

Micro-Diffusion Compression - Binary Tree Tweedie Denoising for Online Probability Estimation

Roberto Tacconelli

Comments 12 pages, 1 figure

2603.06506 2026-03-13 stat.ML cs.LG

Semantics-Aware Caching for Concept Learning

Louis Mozart Kamdem Teyou, Caglar Demir, Axel-Cyrille Ngonga Ngomo

2603.01470 2026-03-13 cs.LG stat.ML

Randomized Kriging Believer for Parallel Bayesian Optimization with Regret Bounds

Shuhei Sugiura, Ichiro Takeuchi, Shion Takeno

2601.13010 2026-03-13 q-bio.PE stat.ME

Extracting useful information about reversible evolutionary processes from irreversible evolutionary accumulation models

Iain G. Johnston

2512.18492 2026-03-13 stat.ME

A Bayesian likely responder approach for the analysis of randomized controlled trials

Annan Deng, Carole Siegel, Hyung G. Park

2511.00617 2026-03-13 cs.LG cs.AI cs.CL stat.ML

Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering

Eric Bigelow, Daniel Wurgaft, YingQiao Wang, Noah Goodman, Tomer Ullman, Hidenori Tanaka, Ekdeep Singh Lubana

2510.05440 2026-03-13 stat.ML cs.CR cs.LG

Refereed Learning

Ran Canetti, Ephraim Linder, Connor Wagaman

2510.04579 2026-03-13 cs.LG math.MG stat.ML

Busemann Functions in the Wasserstein Space: Existence, Closed-Forms, and Applications to Slicing

Clément Bonet, Elsa Cazelles, Lucas Drumetz, Nicolas Courty

Comments Published as a conference paper at AISTATS 2026

2509.22961 2026-03-13 stat.ME

Measuring capacities in multimodal maritime port systems with anchorage queues

Debojjal Bagchi, Kyle Bathgate, Kenneth N. Mitchell, Magdalena I. Asborno, Marin M. Kress, Stephen D. Boyles

2509.11821 2026-03-13 stat.ME astro-ph.IM physics.data-an

Covering Unknown Correlations in Bayesian Priors by Inflating Uncertainties

Lukas Koch

Comments 5 pages, added citations and acknowledgments

2509.02337 2026-03-13 stat.ML cs.LG math.ST stat.TH

Distribution estimation via Flow Matching with Lipschitz guarantees

Lea Kunkel

2504.12760 2026-03-13 stat.ME

Robust Covariate Adjustment in Multi-Center Randomized Trials

Muluneh Alene, Stijn Vansteelandt, Kelly Van Lancker

2502.13698 2026-03-13 stat.ME stat.AP

Multi-view biclustering via non-negative matrix tri-factorisation

Ella S. C. Orme, Theodoulos Rodosthenous, Marina Evangelou

2502.13325 2026-03-13 q-fin.RM math.PR math.ST q-fin.MF stat.TH

Arbitrage-free catastrophe reinsurance valuation for compound dynamic contagion claims

Jiwook Jang, Patrick J. Laub, Tak Kuen Siu, Hongbiao Zhao

2412.20555 2026-03-13 stat.ME

Parameter-Specific Bias Diagnostics in Random-Effects Panel Data Models

Andrew T. Karl

2412.12213 2026-03-13 cs.LG q-fin.CP stat.ML

Finance-Informed Neural Network: Learning the Geometry of Option Pricing

Amine M. Aboussalah, Xuanze Li, Cheng Chi, Raj Patel

2410.16004 2026-03-13 math.ST math.PR stat.ML stat.TH

Are Bayesian networks typically faithful?

Philip Boeken, Patrick Forré, Joris M. Mooij

Abstract

Faithfulness is a common assumption in causal inference, often motivated by the fact that the faithful parameters of linear Gaussian and discrete Bayesian networks are typical, and the folklore belief that this should also hold for other classes of Bayesian networks. We address this open question by showing that among all Bayesian networks over a given DAG, the faithful Bayesian networks are indeed `typical': they constitute a dense, open set with respect to the total variation metric. This does not directly imply that faithfulness is typical in restricted classes of Bayesian networks that are often considered in statistical applications. To this end we consider the class of Bayesian networks parametrised by conditional exponential families, for which we show that under regularity conditions, the faithful parameters constitute a dense and open set, the unfaithful parameters have Lebesgue measure zero, and the induced faithful distributions are open and dense in the weak topology. This extends the existing results for linear Gaussian and discrete Bayesian networks. We also show for nonparametric classes of Bayesian networks with uniformly equicontinuous and uniformly bounded conditional densities that the faithful Bayesian networks are open and dense in the weak topology. All these results also hold for Bayesian networks with latent variables, if faithfulness is only required to hold with respect to the latent projection. Finally, for the considered conditional exponential family parametrisations and nonparametric conditional density models, the topological properties of conditional independence imply the existence of a consistent conditional independence test. Together with the topological properties of faithfulness, this implies that sound constraint-based causal discovery algorithms like PC and FCI are consistent on an open and dense -- and hence `typical' -- set of Bayesian networks.

2410.08009 2026-03-13 stat.AP

Quasi-average predictions and regression to the trend: an application the M6 financial forecasting competition

Jose M. G. Vilar

Comments 12 pages, 5 figures

2409.07412 2026-03-13 cs.LG stat.ML

Geometry of Singular Foliations and Learning Manifolds in ReLU Networks via the Data Information Matrix

Eliot Tron, Rita Fioresi

2312.05169 2026-03-13 q-fin.PM cs.NA math.NA q-fin.CP stat.ML

Onflow: a model free, online portfolio allocation algorithm robust to transaction fees

Gabriel Turinici, Pierre Brugiere

2307.11465 2026-03-13 cs.LG stat.AP

A Deep Learning Approach for Overall Survival Prediction in Lung Cancer with Missing Values

Camillo Maria Caruso, Valerio Guarrasi, Sara Ramella, Paolo Soda

Comments 24 pages, 4 figures

DOI: 10.1016/j.cmpb.2024.108308
Journal ref: Computer Methods and Programs in Biomedicine 254 (2024) 108308

Abstract

In the field of lung cancer research, particularly in the analysis of overall survival (OS), artificial intelligence (AI) serves crucial roles with specific aims. Given the prevalent issue of missing data in the medical domain, our primary objective is to develop an AI model capable of dynamically handling this missing data. Additionally, we aim to leverage all accessible data, effectively analyzing both uncensored patients who have experienced the event of interest and censored patients who have not, by embedding a specialized technique within our AI model, not commonly utilized in other AI tasks. Through the realization of these objectives, our model aims to provide precise OS predictions for non-small cell lung cancer (NSCLC) patients, thus overcoming these significant challenges. We present a novel approach to survival analysis with missing values in the context of NSCLC, which exploits the strengths of the transformer architecture to account only for available features without requiring any imputation strategy. More specifically, this model tailors the transformer architecture to tabular data by adapting its feature embedding and masked self-attention to mask missing data and fully exploit the available ones. By making use of ad-hoc designed losses for OS, it is able to account for both censored and uncensored patients, as well as changes in risks over time. We compared our method with state-of-the-art models for survival analysis coupled with different imputation strategies. We evaluated the results obtained over a period of 6 years using different time granularities obtaining a Ct-index, a time-dependent variant of the C-index, of 71.97, 77.58 and 80.72 for time units of 1 month, 1 year and 2 years, respectively, outperforming all state-of-the-art methods regardless of the imputation method used.
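The Ct-index reported in this abstract is described as a time-dependent variant of the C-index. As background, the sketch below computes the plain Harrell-style C-index with right censoring: a pair is comparable only when the subject with the shorter time actually experienced the event. This is an assumed textbook formulation, not the paper's Ct-index or its code, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data.

    times:  observed time per subject (event or censoring time)
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk score, higher meaning earlier expected event
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event got the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    print(concordance_index([1, 2, 3], [1, 1, 0], [3.0, 2.0, 1.0]))
```

The function returns a value in [0, 1], with 0.5 corresponding to random ranking; the abstract reports its scores on a 0 to 100 scale.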

2110.11149 2026-03-13 stat.ME math.ST stat.TH

Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap

Emilia Pompe, Mikołaj J. Kasprzak, Pierre E. Jacob

Comments Major revision, including new results on the control of the Laplace approximation error for cut posteriors

2109.02236 2026-03-13 stat.ME

Predictive Distributions and the Transition from Sparse to Dense Functional Data

Álvaro Gajardo, Xiongtao Dai, Hans-Georg Müller