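arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.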

2604.06169 2026-04-08 cs.LG cs.AI cs.CL stat.ML

In-Place Test-Time Training

Guhao Feng, Shengjie Luo, Kai Hua, Ge Zhang, Di He, Wenhao Huang, Tianle Cai

Comments ICLR 2026 Oral Presentation; Code is released at https://github.com/ByteDance-Seed/In-Place-TTT

Abstract

The static "train then deploy" paradigm fundamentally limits Large Language Models (LLMs) from dynamically adapting their weights in response to continuous streams of new information inherent in real-world tasks. Test-Time Training (TTT) offers a compelling alternative by updating a subset of model parameters (fast weights) at inference time, yet its potential in the current LLM ecosystem is hindered by critical barriers including architectural incompatibility, computational inefficiency and misaligned fast weight objectives for language modeling. In this work, we introduce In-Place Test-Time Training (In-Place TTT), a framework that seamlessly endows LLMs with Test-Time Training ability. In-Place TTT treats the final projection matrix of the ubiquitous MLP blocks as its adaptable fast weights, enabling a "drop-in" enhancement for LLMs without costly retraining from scratch. Furthermore, we replace TTT's generic reconstruction objective with a tailored, theoretically-grounded objective explicitly aligned with the Next-Token-Prediction task governing autoregressive language modeling. This principled objective, combined with an efficient chunk-wise update mechanism, results in a highly scalable algorithm compatible with context parallelism. Extensive experiments validate our framework's effectiveness: as an in-place enhancement, it enables a 4B-parameter model to achieve superior performance on tasks with contexts up to 128k, and when pretrained from scratch, it consistently outperforms competitive TTT-related approaches. Ablation study results further provide deeper insights on our design choices. Collectively, our results establish In-Place TTT as a promising step towards a paradigm of continual learning in LLMs.

2604.06123 2026-04-08 stat.CO cs.LG econ.EM stat.ME

A Large-Scale Empirical Comparison of Meta-Learners and Causal Forests for Heterogeneous Treatment Effect Estimation in Marketing Uplift Modeling

Aman Singh

Comments 6 pages

2604.06116 2026-04-08 q-fin.ST econ.EM q-fin.RM stat.ME stat.ML

Sequential Audit Sampling with Statistical Guarantees

Masahiro Kato, Kei Nakagawa

2604.06065 2026-04-08 math.ST math.PR stat.ML stat.TH

Lipschitz regularity in Flow Matching and Diffusion Models: sharp sampling rates and functional inequalities

Arthur Stéphanovitch

2604.06032 2026-04-08 stat.ML cs.LG

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

Courtney Franzen, Farhad Pourkamali-Anaraki

Comments 48 pages

2604.05993 2026-04-08 cs.LG stat.ML

Data Distribution Valuation Using Generalized Bayesian Inference

Cuong N. Nguyen, Cuong V. Nguyen

Comments Paper published at AISTATS 2026

2604.05974 2026-04-08 stat.ME

Nonparametric Statistical Inference for Multivariate Niche Overlap

Jonas Beck, Solomon Harrar

2604.05910 2026-04-08 math.PR math-ph math.AP math.FA math.MP math.ST stat.TH

Well-posedness and Hurst parameter estimation for fluid equations driven by fractional transport noise

Alexandra Blessing Neamtu, Dan Crisan, Oana Lang

Comments 43 pages

2604.05842 2026-04-08 cs.LG cs.IT math.IT stat.ML

Expectation Maximization (EM) Converges for General Agnostic Mixtures

Avishek Ghosh

Comments Accepted at IEEE International Symposium on Information Theory (ISIT 2026)

2604.04360 2026-04-08 stat.ME

Generalized win fraction regression for composite survival endpoints

Zhiqiang Cao, Xi Fang, Fan Li

2604.03541 2026-04-08 cs.LG stat.ML

Choosing the Right Regularizer for Applied ML: Simulation Benchmarks of Popular Scikit-learn Regularization Frameworks

Benjamin S. Knight, Ahsaas Bajaj

2603.28917 2026-04-08 math.OC cs.LG cs.SY eess.SY stat.ML

Symmetrizing Bregman Divergence on the Cone of Positive Definite Matrices: Which Mean to Use and Why

Tushar Sial, Abhishek Halder

2602.10370 2026-04-08 stat.ML cs.LG stat.ME

Causal Effect Estimation with Learned Instrument Representations

Frances Dean, Jenna Fields, Radhika Bhalerao, Marie Charpignon, Ahmed Alaa

2509.06076 2026-04-08 econ.GN q-fin.EC stat.AP

DETERring more than Deforestation: Environmental Enforcement Reduces Violence in the Amazon

Rafael Araujo, Vitor Possebom, Gabriela Setti

2507.05084 2026-04-08 cs.LG stat.ML

Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks

Maria-Florina Balcan, Saumya Goyal, Dravyansh Sharma

Comments 55 pages

2505.24868 2026-04-08 math.ST stat.ML stat.TH

Consistent line clustering using geometric hypergraphs

Kalle Alaluusua, Konstantin Avrachenkov, B. R. Vinay Kumar, Lasse Leskelä

Comments Major revision: new information-theoretic analysis for latent sampling laws concentrating near the intersection, recovery results for arbitrary fixed angles between the latent lines, revised spectral clustering guarantees, and substantial expository improvements (60 pages, 5 figures, 1 table)

2404.01566 2026-04-08 econ.EM stat.ME

Heterogeneous Treatment Effects and Causal Mechanisms

Jiawei Fu, Tara Slough

2604.05829 2026-04-08 cs.LG stat.ML

Bivariate Causal Discovery Using Rate-Distortion MDL: An Information Dimension Approach

Tiago Brogueira, Mário A. T. Figueiredo

Comments 22 pages

2604.05778 2026-04-08 math.DS physics.chem-ph stat.ML

Effective Dynamics and Transition Pathways from Koopman-Inspired Neural Learning of Collective Variables

Alexander Sikorski, Luca Donati, Marcus Weber, Christof Schütte

2604.05759 2026-04-08 stat.CO stat.ME stat.ML

High-dimensional reliability-based design optimization using stochastic emulators

M. Moustapha, B. Sudret

2604.05669 2026-04-08 stat.ML cs.LG

Efficient machine unlearning with minimax optimality

Jingyi Xie, Linjun Zhang, Sai Li

2604.05518 2026-04-08 math.OC cs.LG cs.SY eess.SY stat.ML

Optimal Centered Active Excitation in Linear System Identification

Kaito Ito, Alexandre Proutiere

Comments 11 pages

2604.05513 2026-04-08 stat.ME

From Unsupervised to Guided Clustering: A Variational Implementation

Violaine Courrier, Christophe Biernacki

2604.05470 2026-04-08 stat.ME

Evaluating Black-Box Classifiers via Stable Adaptive Two-Sample Inference

Yuchen Chen, Jing Lei

Comments 30 pages

2604.05469 2026-04-08 stat.ME cs.LG stat.ML

Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models

Giulio Valentino Dalla Riva

2604.05462 2026-04-08 stat.ML cs.LG math.ST stat.TH

Hierarchical Contrastive Learning for Multimodal Data

Huichao Li, Junhan Yu, Doudou Zhou

Comments 34 pages, 11 figures

2604.05460 2026-04-08 stat.ME cs.AI

LLM Evaluation as Tensor Completion: Low Rank Structure and Semiparametric Efficiency

Jiachun Li, David Simchi-Levi, Will Wei Sun

2604.05337 2026-04-08 stat.ML cs.LG

Individual-heterogeneous sub-Gaussian Mixture Models

Huan Qing

Comments 32 pages, 4 figures, 2 tables

2604.05303 2026-04-08 cs.LG cs.NA math.NA physics.comp-ph stat.ML

Jeffreys Flow: Robust Boltzmann Generators for Rare Event Sampling via Parallel Tempering Distillation

Guang Lin, Christian Moya, Di Qi, Xuda Ye

2604.05285 2026-04-08 stat.ME cs.LG

Robust Learning of Heterogeneous Dynamic Systems

Shuoxun Xu, Zijian Guo, Brooke R. Staveland, Robert T. Knight, Lexin Li

2604.05283 2026-04-08 stat.ME

Truncation by death in the sufficient cause framework

Bronner P. Gonçalves, Eiji Yamamoto, Etsuji Suzuki

2604.05275 2026-04-08 stat.AP

Statistical Analysis of Spatial and Temporal Variability of Maximum Precipitation Events on the Rio Grande do Sul

Cleber Souza Corrêa

Comments 9 pages, 2 figures, published in Journal of Aerospace Technology and Management (JATM). São José dos Campos, Vol. 4, No. 2, pp. 227-235, Apr.-Jun., 2012

2604.05188 2026-04-08 physics.soc-ph stat.ME

Ratio of Quantiles Indicates Burstiness with Fewer False Negatives than the Conventional Burstiness Parameter

Joshua Z. Stadlan, Michelle Birkett, Jason H. Rife

Comments 41 pages, 14 figures; for associated code, see https://github.com/jstadlan-compass/burstiness-tail-index

2604.05057 2026-04-08 cs.LG stat.ML

Blind-Spot Mass: A Good-Turing Framework for Quantifying Deployment Coverage Risk in Machine Learning Systems

Biplab Pal, Santanu Bhattacharya, Madanjit Singh

Comments 15 pages, 7 figures, 1 table; submitted to Journal of Machine Learning Research (JMLR)

Abstract

Blind-spot mass is a Good-Turing framework for quantifying deployment coverage risk in machine learning. In modern ML systems, operational state distributions are often heavy-tailed, implying that a long tail of valid but rare states is structurally under-supported in finite training and evaluation data. This creates a form of 'coverage blindness': models can appear accurate on standard test sets yet remain unreliable across large regions of the deployment state space. We propose blind-spot mass B_n(tau), a deployment metric estimating the total probability mass assigned to states whose empirical support falls below a threshold tau. B_n(tau) is computed using Good-Turing unseen-species estimation and yields a principled estimate of how much of the operational distribution lies in reliability-critical, under-supported regimes. We further derive a coverage-imposed accuracy ceiling, decomposing overall performance into supported and blind components and separating capacity limits from data limits. We validate the framework in wearable human activity recognition (HAR) using wrist-worn inertial data. We then replicate the same analysis in the MIMIC-IV hospital database with 275 admissions, where the blind-spot mass curve converges to the same 95% at tau = 5 across clinical state abstractions. This replication across structurally independent domains - differing in modality, feature space, label space, and application - shows that blind-spot mass is a general ML methodology for quantifying combinatorial coverage risk, not an application-specific artifact. Blind-spot decomposition identifies which activities or clinical regimes dominate risk, providing actionable guidance for industrial practitioners on targeted data collection, normalization/renormalization, and physics- or domain-informed constraints for safer deployment.
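
As a rough illustration of the idea in this abstract, the sketch below uses the classical Good-Turing estimate, in which the total probability mass of states observed exactly r times is approximated by (r + 1) * N_{r+1} / n, where N_r is the number of distinct states seen r times and n is the sample size. Summing this over r < tau gives one plausible reading of "mass of states whose empirical support falls below tau". The function name, the strict-inequality reading of the threshold, and the toy data are assumptions made here for illustration; the paper's exact estimator (including any smoothing) may differ.

```python
from collections import Counter


def good_turing_low_support_mass(observations, tau):
    """Estimate the total probability mass of states seen fewer than tau times.

    Uses the unsmoothed Good-Turing estimate (r + 1) * N_{r+1} / n for the mass
    of states with empirical count r, summed over r = 0, ..., tau - 1.
    """
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one observation")
    state_counts = Counter(observations)            # count per distinct state
    freq_of_freq = Counter(state_counts.values())   # N_r: number of states seen r times
    total = sum((r + 1) * freq_of_freq.get(r + 1, 0) for r in range(tau))
    return total / n


# Toy sample: one common state and a few rare ones (hypothetical data).
sample = ["a"] * 5 + ["b"] * 2 + ["c", "d"]
unseen_mass = good_turing_low_support_mass(sample, tau=1)   # classical missing mass N_1 / n
low_support = good_turing_low_support_mass(sample, tau=2)
```

With tau = 1 this reduces to the familiar Good-Turing missing-mass estimate N_1 / n, which is why the heavy-tailed case (many singletons) yields a large value.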

2604.05055 2026-04-08 stat.ME math.ST stat.TH

Hypothesis Testing for Penalized Estimating Equations with Cross-Fitted Covariance Calibration

Jing Zhou, Zhe Zhang

2604.05008 2026-04-08 stat.ML cs.LG q-fin.MF q-fin.ST

Generative Path-Law Jump-Diffusion: Sequential MMD-Gradient Flows and Generalisation Bounds in Marcus-Signature RKHS

Daniel Bloch

2604.04993 2026-04-08 stat.ML cs.CR cs.LG stat.ME

The Hiremath Early Detection (HED) Score: A Measure-Theoretic Evaluation Standard for Temporal Intelligence

Prakul Sunil Hiremath

Comments 11 pages. Introduces a measure-theoretic framework for predictive velocity including the Hiremath Standard Table. Dedicated to the Hiremath lineage

Abstract

We introduce the Hiremath Early Detection (HED) Score, a principled, measure-theoretic evaluation criterion for quantifying the time-value of information in systems operating over non-stationary stochastic processes subject to abrupt regime transitions. Existing evaluation paradigms, chiefly the ROC/AUC framework and its downstream variants, are temporally agnostic: they assign identical credit to a detection at t + 1 and a detection at t + tau for arbitrarily large tau. This indifference to latency is a fundamental inadequacy in time-critical domains including cyber-physical security, algorithmic surveillance, and epidemiological monitoring. The HED Score resolves this by integrating a baseline-neutral, exponentially decaying kernel over the posterior probability stream of a target regime, beginning precisely at the onset of the regime shift. The resulting scalar simultaneously encodes detection acuity, temporal lead, and pre-transition calibration quality. We prove that the HED Score satisfies three axiomatic requirements: (A1) Temporal Monotonicity, (A2) Invariance to Pre-Attack Bias, and (A3) Sensitivity Decomposability. We further demonstrate that the HED Score admits a natural parametric family indexed by the Hiremath Decay Constant (lambda_H), whose domain-specific calibration constitutes the Hiremath Standard Table. As an empirical vehicle, we present PARD-SSM (Probabilistic Anomaly and Regime Detection via Switching State-Space Models), which couples fractional Stochastic Differential Equations (fSDEs) with a Switching Linear Dynamical System (S-LDS) inference backend. On the NSL-KDD benchmark, PARD-SSM achieves a HED Score of 0.0643, representing a 388.8 percent improvement over a Random Forest baseline (0.0132), with statistical significance confirmed via block-bootstrap resampling (p < 0.001). We propose the HED Score as the successor evaluation standard to ROC/AUC.
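
The abstract does not give the formula, so the following is only a guess at the general shape of such a score, not the authors' definition: a discrete sum of an exponentially decaying kernel applied to the posterior probability stream from the onset of the regime shift onward. Subtracting the mean pre-onset posterior is an assumption made here to mimic "baseline-neutral"; the function name, the parameter `decay` (standing in for a decay constant), and the toy streams are all hypothetical.

```python
import math


def decayed_detection_score(posteriors, onset, decay):
    """Sum exp(-decay * k) * (p_k - baseline) over steps k = 0, 1, ... after onset.

    posteriors: sequence of posterior probabilities of the target regime.
    onset:      index at which the regime shift begins (0 < onset < len(posteriors)).
    decay:      non-negative decay rate of the kernel (hypothetical parameter).
    """
    if not 0 < onset < len(posteriors):
        raise ValueError("onset must leave data both before and after the shift")
    # Assumed baseline: average posterior before the shift.
    baseline = sum(posteriors[:onset]) / onset
    return sum(
        math.exp(-decay * k) * (p - baseline)
        for k, p in enumerate(posteriors[onset:])
    )


# Two toy detectors facing a shift at index 2: one reacts at once, one late.
early = [0.1, 0.1, 0.9, 0.9, 0.9]
late = [0.1, 0.1, 0.1, 0.1, 0.9]
early_score = decayed_detection_score(early, onset=2, decay=0.5)
late_score = decayed_detection_score(late, onset=2, decay=0.5)
```

Under these assumptions the early detector scores higher than the late one, which is the latency sensitivity that a latency-agnostic metric such as AUC would not capture.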

2604.04987 2026-04-08 cs.LG cs.AI math.OC stat.ML

Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling

Yongchang Hao, Lili Mou

Comments Camera-ready version. Accepted at ICLR 2026

2604.04963 2026-04-08 stat.ML cs.LG

Learning Nonlinear Regime Transitions via Semi-Parametric State-Space Models

Prakul Sunil Hiremath

Comments 12 pages, 1 figure, 2 tables

2604.04961 2026-04-08 stat.ML cs.LG econ.EM math.ST stat.TH

Identification and Inference in Nonlinear Dynamic Network Models

Diego Vallarino

2604.02656 2026-04-08 stat.ML cs.LG

Transfer Learning for Meta-analysis Under Covariate Shift

Zilong Wang, Ali Abdeen, Turgay Ayer

Comments Accepted to IEEE ICHI 2026 Early Bird Track (Oral Presentation)

2603.17466 2026-04-08 math.DS cs.NA math.NA stat.CO

A Full-Density Approach to Simulating Random Iteration Equations with Applications

Wolfgang Hoegele

2602.15089 2026-04-08 cs.LG stat.ML

Triplet Feature Fusion for Equipment Anomaly Prediction : An Open-Source Methodology Using Small Foundation Models

Takato Yasuno

Comments 15 pages, 8 figures, 7 tables

2512.01026 2026-04-08 math.ST stat.TH

Asymptotic inference in a stationary quantum time series

Michael Nussbaum, Arleta Szkoła

2511.21497 2026-04-08 stat.CO stat.ME

Nested ensemble Kalman filter for static parameter inference in nonlinear state-space models

Andrew Golightly, Sarah E. Heaps, Chris Sherlock, Laura E. Wadkin, Darren J. Wilkinson

Comments 31 pages

2511.14292 2026-04-08 stat.ME stat.AP

Covariate Adjustment for the Win Odds: Application to Cardiovascular Outcomes Trials

Cyrill Scheidegger, Simon Wandel, Tobias Mütze

2511.09425 2026-04-08 cs.LG stat.ML

Supporting Evidence for the Adaptive Feature Program across Diverse Models

Yicheng Li, Qian Lin

2509.02617 2026-04-08 stat.ML cs.LG stat.CO

Gaussian process surrogate with physical law-corrected prior for multi-coupled PDEs defined on irregular geometry

Pucheng Tang, Hongqiao Wang, Wenzhou Lin, Qian Chen, Heng Yong

Comments 28 pages, 15 figures, 6 tables

2507.13301 2026-04-08 stat.CO stat.AP stat.ML

mNARX+: A surrogate model for complex dynamical systems using manifold-NARX and automatic feature selection

S. Schär, S. Marelli, B. Sudret

2507.04567 2026-04-08 stat.ME stat.AP

Inverse Probability Weighting for Recurrent Event Models

Jiren Sun, Tobias Mutze, Richard Cook, Tianmeng Lyu

2506.15272 2026-04-08 stat.ME

A penalized least squares estimator for extreme-value mixture models

Anas Mourahib, Anna Kiriliouk, Johan Segers

2505.00711 2026-04-08 math.ST stat.TH

Global Activity Scores

Ruilong Yue, Giray Ökten

2505.00629 2026-04-08 stat.ME math.ST stat.TH

Expected Weighted D-optimal Designs for Experiments with Mixed Factors

Siting Lin, Yifei Huang, Jie Yang

Comments 42 pages, 13 tables, and 4 figures

2504.13620 2026-04-08 math.PR math.ST stat.TH

Set-valued conditional functionals of random sets

Tobias Fissler, Ilya Molchanov

Comments 30 pages

2504.13382 2026-04-08 stat.AP

Intelligent data collection for network discrimination in material flow analysis using Bayesian optimal experimental design

Jiankan Liao, Xun Huan, Daniel Cooper

Comments 21 pages for manuscript, 8 pages for supporting information and bibliography, 8 figures

Details

DOI: 10.1111/jiec.70111
Journal ref: Journal of Industrial Ecology 29 (2025) 2005-2023

Abstract

Material flow analyses (MFAs) are powerful tools for highlighting resource efficiency opportunities in supply chains. MFAs are often represented as directed graphs, with nodes denoting processes and edges representing mass flows. However, network structure uncertainty -- uncertainty in the presence or absence of flows between nodes -- is common and can compromise flow predictions. While collection of more MFA data can reduce network structure uncertainty, an intelligent data acquisition strategy is crucial to optimize the resources (person-hours and money spent on collecting and purchasing data) invested in constructing an MFA. In this study, we apply Bayesian optimal experimental design (BOED), based on the Kullback-Leibler divergence, to efficiently target high-utility MFA data -- data that minimizes network structure uncertainty. We introduce a new method with reduced bias for estimating expected utility, demonstrating its superior accuracy over traditional approaches. We illustrate these advances with a case study on the U.S. steel sector MFA, where the expected utility of collecting specific single pieces of steel mass flow data aligns with the actual reduction in network structure uncertainty achieved by collecting said data from the United States Geological Survey and the World Steel Association. The results highlight that the optimal MFA data to collect depends on the total amount of data being gathered, making it sensitive to the scale of the data collection effort. Overall, our methods support intelligent data acquisition strategies, accelerating uncertainty reduction in MFAs and enhancing their utility for impact quantification and informed decision-making.
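
To make the KL-based expected utility concrete, here is a minimal sketch for the simplest possible setting: a finite set of candidate network structures and a discrete observation, where the expected information gain (the expected KL divergence from prior to posterior over structures) can be computed exactly. This is not the reduced-bias estimator the abstract refers to, which targets settings where such exact sums are unavailable; the function name and the toy probabilities are assumptions for illustration.

```python
import math


def expected_information_gain(prior, likelihoods):
    """Exact expected KL divergence from prior to posterior over candidate models.

    prior[m]          = p(m), prior probability of candidate structure m.
    likelihoods[m][y] = p(y | m, design), probability of outcome y under m.
    Equals sum over (m, y) of p(m) p(y|m) log(p(y|m) / p(y)).
    """
    n_models = len(prior)
    n_outcomes = len(likelihoods[0])
    eig = 0.0
    for y in range(n_outcomes):
        evidence = sum(prior[m] * likelihoods[m][y] for m in range(n_models))
        for m in range(n_models):
            joint = prior[m] * likelihoods[m][y]
            if joint > 0.0:  # zero-probability terms contribute nothing
                eig += joint * math.log(likelihoods[m][y] / evidence)
    return eig


prior = [0.5, 0.5]  # two candidate network structures (toy example)
# Hypothetical design A: outcome distributions differ strongly between structures.
design_a = [[0.9, 0.1], [0.2, 0.8]]
# Hypothetical design B: both structures predict the same outcomes.
design_b = [[0.3, 0.7], [0.3, 0.7]]
best = max([("A", design_a), ("B", design_b)],
           key=lambda d: expected_information_gain(prior, d[1]))[0]
```

A data record whose predicted value is the same under every candidate structure has zero expected utility, so the design with diverging predictions is selected.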

2504.03943 2026-04-08 stat.ML cond-mat.mtrl-sci cs.LG

Multi-Variable Batch Bayesian Optimization in Materials Research: Synthetic Data Analysis of Noise Sensitivity and Problem Landscape Effects

Imon Mia, Armi Tiihonen, Anna Ernst, Anusha Srivastava, Tonio Buonassisi, William Vandenberghe, Julia W. P. Hsu

Details

DOI: 10.1557/s43578-026-01803-y

Abstract

The Bayesian Optimization (BO) machine learning method is increasingly used to guide experimental optimization tasks in materials science. To emulate the large number of input variables and noise-containing results in experimental materials research, we perform batch BO simulation of six design variables with a range of noise levels. Two test cases relevant for materials science problems are examined: a needle-in-a-haystack case (Ackley function) that may be encountered in, e.g., molecule optimizations, and a smooth landscape with a local optimum in addition to the global optimum (Hartmann function) that may be encountered in, e.g., material composition optimization. We show learning curves, performance metrics, and visualization to effectively track the optimization progression and evaluate how the optimization outcomes are affected by noise, batch-picking method, choice of acquisition function, and exploration hyperparameter values. We find that the effects of noise depend on the problem landscape: noise degrades the optimization results of a needle-in-a-haystack search (Ackley) dramatically more. However, with increasing noise, we observe an increasing probability of landing on the local optimum in Hartmann. Therefore, prior knowledge of the problem domain structure and noise level is essential when designing BO for materials research experiments. Synthetic data studies -- with known ground truth and controlled noise levels -- enable us to isolate and evaluate the impact of different batch BO components, e.g., acquisition policy, objective metrics, and hyperparameter values, before transitioning to the inherent uncertainties of real experimental systems. The results and methodology of this study will facilitate a greater utilization of BO in guiding experimental materials research, specifically in settings with a large number of design variables to optimize.
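
For readers unfamiliar with the needle-in-a-haystack test case, the sketch below evaluates the standard Ackley function in six dimensions and adds Gaussian observation noise to a small batch of points. The parameter values a = 20, b = 0.2, c = 2*pi are the commonly used defaults, and the sampling range, noise level, and batch size are assumptions chosen here for illustration; the study's own scaling, sign convention, and noise model may differ.

```python
import math
import random


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Standard Ackley function; global minimum 0 at the origin."""
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


def noisy_ackley(x, noise_std, rng):
    """Ackley value corrupted by zero-mean Gaussian noise (assumed noise model)."""
    return ackley(x) + rng.gauss(0.0, noise_std)


# Evaluate one batch of four random six-variable candidates (illustrative settings).
rng = random.Random(0)
batch = [[rng.uniform(-5.0, 5.0) for _ in range(6)] for _ in range(4)]
observations = [noisy_ackley(x, noise_std=0.5, rng=rng) for x in batch]
```

Because the landscape is nearly flat away from the narrow basin at the origin, even moderate noise can mask the small differences between candidates, which is consistent with the sensitivity the abstract reports for this case.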

2411.10646 2026-04-08 math.ST stat.ME stat.TH

Wasserstein Spatial Depth

François Bachoc, Alberto González-Sanz, Jean-Michel Loubes, Yisha Yao

2408.02667 2026-04-08 stat.ME

An Online Meta-Level Adaptive Design Framework with Targeted Learning Inference: Applications to Evaluating and Utilizing Surrogate Outcomes in Adaptive Designs

Wenxin Zhang, Aaron Hudson, Maya Petersen, Mark van der Laan

2404.19367 2026-04-08 math.ST stat.TH

Parametric estimation and LAN property of the birth-death-move process with mutations

Lisa Balsollier, Frédéric Lavancier

2403.18072 2026-04-08 stat.CO cs.LG stat.ME stat.ML

Goal-Oriented Bayesian Optimal Experimental Design for Nonlinear Models using Markov Chain Monte Carlo

Shijie Zhong, Wanggang Shen, Tommie Catanach, Xun Huan

Comments 28 pages, 19 figures

2403.13027 2026-04-08 cs.LG cs.CR cs.IT math.IT stat.ML

Towards Better Statistical Understanding of Watermarking LLMs

Zhongze Cai, Shang Liu, Hanzhao Wang, Huaiyang Zhong, Xiaocheng Li

2402.15095 2026-04-08 math.ST cs.DS cs.LG math.PR stat.TH

The Umeyama algorithm for matching correlated Gaussian geometric models in the low-dimensional regime

Shuyang Gong, Zhangsong Li

Comments 31 pages; updated funding information

2307.02719 2026-04-08 cs.LG stat.ML

Understanding Uncertainty Sampling via Equivalent Loss

Shang Liu, Xiaocheng Li

Comments An updated version of the previous paper titled "Understanding Uncertainty Sampling". Added a major result of sample complexity and other theoretical results; cut the experiment part

2306.10430 2026-04-08 stat.ML cs.AI cs.LG stat.CO stat.ME

Variational Sequential Optimal Experimental Design using Reinforcement Learning

Wanggang Shen, Jiayuan Dong, Xun Huan

2305.02657 2026-04-08 stat.ML cs.LG

On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains

Yicheng Li, Zixiong Yu, Guhan Chen, Qian Lin

2206.04236 2026-04-08 cs.CR cs.DS cs.LG stat.ML

Edgeworth Accountant: An Analytical Approach to Differential Privacy Composition

Hua Wang, Sheng Gao, Huanyu Zhang, Milan Shen, Weijie J. Su, Jiayuan Wu