arXivDaily: arXiv daily academic digest, updated Monday through Friday
2602.17543 2026-02-20 stat.ML cs.LG econ.EM math.ST stat.ME stat.TH

genriesz: A Python Package for Automatic Debiased Machine Learning with Generalized Riesz Regression

Masahiro Kato

Abstract

Efficient estimation of causal and structural parameters can be automated using the Riesz representation theorem and debiased machine learning (DML). We present genriesz, an open-source Python package that implements automatic DML and generalized Riesz regression, a unified framework for estimating Riesz representers by minimizing empirical Bregman divergences. This framework includes covariate balancing, nearest-neighbor matching, calibrated estimation, and density ratio estimation as special cases. A key design principle of the package is automatic regressor balancing (ARB): given a Bregman generator $g$ and a representer model class, genriesz automatically constructs a compatible link function so that the generalized Riesz regression estimator satisfies balancing (moment-matching) optimality conditions in a user-chosen basis. The package provides a modular interface for specifying (i) the target linear functional via a black-box evaluation oracle, (ii) the representer model via basis functions (polynomial, RKHS approximations, random forest leaf encodings, neural embeddings, and a nearest-neighbor catchment basis), and (iii) the Bregman generator, with optional user-supplied derivatives. It returns regression adjustment (RA), Riesz weighting (RW), augmented Riesz weighting (ARW), and TMLE-style estimators with cross-fitting, confidence intervals, and $p$-values. We highlight representative workflows for estimation problems such as the average treatment effect (ATE), ATE on treated (ATT), and average marginal effect estimation. The Python package is available at https://github.com/MasaKat0/genriesz and on PyPI.
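
To make the RA and ARW estimators named in the abstract concrete, here is a minimal sketch for the ATE. It does not use the genriesz API; every function name below is hypothetical. It assumes a simulated design with a known propensity score, so the Riesz representer alpha(D, X) = D/e(X) - (1-D)/(1-e(X)) is plugged in directly rather than estimated by Riesz regression, and it omits cross-fitting. The outcome model is deliberately crude (one mean per treatment arm) to show how the augmentation term corrects regression adjustment.

```python
import random


def propensity(x):
    # Assumed (known) propensity score for this illustration only.
    return 0.25 + 0.5 * x


def simulate(n, seed=0):
    """Simulate (x, d, y) with a true ATE of 2.0 and confounding through x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        d = 1 if rng.random() < propensity(x) else 0
        y = 1.0 + 2.0 * d + 3.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, d, y))
    return rows


def riesz_representer_ate(d, e):
    """Riesz representer of the ATE functional given propensity e."""
    return d / e - (1 - d) / (1 - e)


def estimate_ate(rows):
    """Return (RA, ARW) estimates using arm means as the outcome model."""
    treated = [y for _, d, y in rows if d == 1]
    control = [y for _, d, y in rows if d == 0]
    m1 = sum(treated) / len(treated)  # fitted outcome under treatment
    m0 = sum(control) / len(control)  # fitted outcome under control
    ra = m1 - m0                      # regression adjustment with this model

    total = 0.0
    for x, d, y in rows:
        alpha = riesz_representer_ate(d, propensity(x))
        fitted = m1 if d == 1 else m0
        # Plug-in term plus representer-weighted residual.
        total += (m1 - m0) + alpha * (y - fitted)
    arw = total / len(rows)
    return ra, arw


rows = simulate(20000)
ra, arw = estimate_ate(rows)
print(f"RA = {ra:.3f}, ARW = {arw:.3f}, truth = 2.000")
```

In this design the arm-mean outcome model ignores the confounder, so RA is biased, while the ARW estimate stays close to the true effect because the representer is correct.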

2602.16631 2026-02-20 econ.GN q-fin.EC

Can Wearable Exoskeletons Reduce Gender and Disability Gaps in the Construction Industry?

Yana Rodgers, Xiangmin Liu, Jingang Yi, Liang Zhang

Comments Assistive Technology

Journal ref
January 2026, 1-10
Abstract

The share of construction trade jobs held by women and people with disabilities has remained stubbornly low in the face of chronic shortages of skilled labor. This study explores the potential of wearable assistive technologies to reduce these disparities. We use U.S. worker-level data to estimate employment and wage differences by gender and by mobility/strength impairments in construction and non-construction jobs. We also use occupational-level data to examine variations in workforce composition, physical skill requirements, and earnings across detailed construction occupations. Regression estimates indicate that being a woman and having strength and mobility impairments are associated with substantial employment and pay gaps in construction compared to non-construction jobs. Further analysis shows a high negative correlation between the representation of women and the ability levels required in those occupations. Finally, we discuss several wearable exoskeletons under development for people with upper-body and lower-body impairments, focusing on how these innovations could be integrated into construction jobs. These findings suggest that wearable exoskeletons that enhance manual dexterity, balance, and strength may improve the representation of women and people with disabilities in some of the higher-paying occupations in construction.

2601.10506 2026-02-20 econ.TH cs.GT cs.MA

The incompatibility of the Condorcet winner and loser criteria with positive involvement and resolvability

Wesley H. Holliday

Comments 9 pages, 4 figures, 1 table

Journal ref
Economics Letters, Vol. 262, April 2026, 112868
Abstract

We prove that there is no preferential voting method satisfying the Condorcet winner and loser criteria, positive involvement (if a candidate $x$ wins in an initial preference profile, then adding a voter who ranks $x$ uniquely first cannot cause $x$ to lose), and $n$-voter resolvability (if $x$ initially ties for winning, then $x$ can be made the unique winner by adding some set of up to $n$ voters). This impossibility theorem holds for any positive integer $n$. It also holds if either the Condorcet loser criterion is replaced by independence of clones or positive involvement is replaced by negative involvement.
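
For readers unfamiliar with the first two criteria, the following minimal sketch computes the Condorcet winner and Condorcet loser of a preference profile. It assumes every voter submits a strict ranking of the same candidates; the function names are illustrative and not taken from the paper.

```python
def margin(profile, a, b):
    """Voters ranking a above b, minus voters ranking b above a."""
    total = 0
    for ranking in profile:
        total += 1 if ranking.index(a) < ranking.index(b) else -1
    return total


def condorcet_winner(profile):
    """Candidate who beats every other candidate head-to-head, or None."""
    candidates = profile[0]
    for x in candidates:
        if all(margin(profile, x, y) > 0 for y in candidates if y != x):
            return x
    return None


def condorcet_loser(profile):
    """Candidate who loses head-to-head to every other candidate, or None."""
    candidates = profile[0]
    for x in candidates:
        if all(margin(profile, y, x) > 0 for y in candidates if y != x):
            return x
    return None


# Each tuple is one voter's ranking, most preferred first.
profile = [("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a")]
cycle = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]

print(condorcet_winner(profile), condorcet_loser(profile))
print(condorcet_winner(cycle), condorcet_loser(cycle))
```

A voting method satisfies the Condorcet winner criterion if it selects the first function's output whenever that output exists, and the Condorcet loser criterion if it never selects the second function's output.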

2410.12306 2026-02-20 cs.GT cs.MA econ.TH math.DS

Time-Varyingness in Auction Breaks Revenue Equivalence

Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Comments 8 pages, 4 figures (main); 6 pages, 1 figure (appendix)

Abstract

Auction is applied for trade with various mechanisms. A simple but practical question is which mechanism, typically first-price or second-price auctions, is preferred from the perspective of bidders or sellers. A celebrated answer is revenue equivalence, where each bidder's equilibrium payoff is proven to be independent of auction mechanisms (and a seller's revenue, too). In reality, however, auction environments like the value distribution of items would vary over time, and such equilibrium bidding cannot always be achieved. Indeed, bidders must continue to track their equilibrium bidding by learning in first-price auctions, but they can keep their equilibrium bidding in second-price auctions. This study discusses whether and how revenue equivalence is violated in the long run by comparing the time series of non-equilibrium bidding in first-price auctions with those of equilibrium bidding in second-price auctions. We characterize the value distribution by two parameters: its basis value, which means the lowest price to bid, and its value interval, which means the width of possible values. Surprisingly, our theorems and experiments find that revenue equivalence is broken by the correlation between the basis value and the value interval, uncovering a novel phenomenon that could occur in the real world.
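
The static benchmark that the paper departs from can be checked numerically. The sketch below is an illustration under assumptions not stated in the abstract: values are drawn independently and uniformly on [basis, basis + interval], bidders are symmetric and risk-neutral, and the environment is fixed over time. Under those assumptions the first-price equilibrium bid is b(v) = basis + (n-1)/n * (v - basis), and both formats should yield expected revenue basis + interval * (n-1)/(n+1). It does not reproduce the paper's time-varying setting.

```python
import random


def equilibrium_bid_first_price(value, basis, n_bidders):
    """Symmetric equilibrium bid for i.i.d. uniform values on [basis, basis + w]."""
    return basis + (n_bidders - 1) / n_bidders * (value - basis)


def simulate_revenues(basis, interval, n_bidders, n_auctions, seed=0):
    """Average seller revenue under first-price and second-price auctions."""
    rng = random.Random(seed)
    first_total = 0.0
    second_total = 0.0
    for _ in range(n_auctions):
        values = sorted(
            rng.uniform(basis, basis + interval) for _ in range(n_bidders)
        )
        # First price: the highest-value bidder wins and pays their own bid.
        first_total += equilibrium_bid_first_price(values[-1], basis, n_bidders)
        # Second price: truthful bidding, the winner pays the second-highest value.
        second_total += values[-2]
    return first_total / n_auctions, second_total / n_auctions


basis, interval, n_bidders = 1.0, 2.0, 3
first, second = simulate_revenues(basis, interval, n_bidders, n_auctions=100000)
theory = basis + interval * (n_bidders - 1) / (n_bidders + 1)
print(f"first-price = {first:.3f}, second-price = {second:.3f}, theory = {theory:.3f}")
```

The two simulated averages agree up to Monte Carlo error, which is the equivalence the abstract says breaks once the basis value and value interval vary over time and first-price bidders must learn.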

2602.17086 2026-02-20 econ.TH cs.LG math.ST stat.TH

Dynamic Decision-Making under Model Misspecification: A Stochastic Stability Approach

Xinyu Dai, Daniel Chen, Yian Qian

Abstract

Dynamic decision-making under model uncertainty is central to many economic environments, yet existing bandit and reinforcement learning algorithms rely on the assumption of correct model specification. This paper studies the behavior and performance of one of the most commonly used Bayesian reinforcement learning algorithms, Thompson Sampling (TS), when the model class is misspecified. We first provide a complete dynamic classification of posterior evolution in a misspecified two-armed Gaussian bandit, identifying distinct regimes: correct model concentration, incorrect model concentration, and persistent belief mixing, characterized by the direction of statistical evidence and the model-action mapping. These regimes yield sharp predictions for limiting beliefs, action frequencies, and asymptotic regret. We then extend the analysis to a general finite model class and develop a unified stochastic stability framework that represents posterior evolution as a Markov process on the belief simplex. This approach characterizes two sufficient conditions to classify the ergodic and transient behaviors and provides inductive dimensional reductions of the posterior dynamics. Our results offer the first qualitative and geometric classification of TS under misspecification, bridging Bayesian learning with evolutionary dynamics, and also build the foundations of robust decision-making in structured bandits.
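
The setting can be illustrated with a minimal sketch of Thompson Sampling over a finite model class in a two-armed Gaussian bandit. This is not the authors' code. It assumes unit-variance Gaussian rewards, a uniform prior, and a two-model class that excludes the true arm means (so the class is misspecified); all names and numbers are hypothetical.

```python
import math
import random


def normalize(log_weights):
    """Turn unnormalized log posterior weights into a point on the simplex."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def thompson_sampling(models, true_means, horizon, seed=0):
    """Run TS; each model is a pair (mean of arm 0, mean of arm 1)."""
    rng = random.Random(seed)
    log_post = [0.0] * len(models)  # uniform prior, up to a constant
    pulls = [0, 0]
    for _ in range(horizon):
        belief = normalize(log_post)
        # Sample a model from the posterior and play its preferred arm.
        k = rng.choices(range(len(models)), weights=belief)[0]
        arm = 0 if models[k][0] >= models[k][1] else 1
        reward = rng.gauss(true_means[arm], 1.0)
        pulls[arm] += 1
        # Bayes update: add each model's Gaussian log-likelihood of the reward.
        for j, model in enumerate(models):
            log_post[j] += -0.5 * (reward - model[arm]) ** 2
    return normalize(log_post), pulls


models = [(1.0, 0.0), (0.0, 1.0)]  # neither model matches the truth
true_means = (0.6, 0.2)
belief, pulls = thompson_sampling(models, true_means, horizon=500)
print("final belief:", belief, "pulls per arm:", pulls)
```

The belief vector returned at each step is the state of the Markov process on the simplex that the abstract refers to; which model it concentrates on depends on how the evidence generated by each arm compares across models.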

2602.17052 2026-02-20 stat.ME econ.EM

Generative modeling for the bootstrap

Leon Tran, Ting Ye, Peng Ding, Fang Han

Comments 62 pages

Abstract

Generative modeling builds on and substantially advances the classical idea of simulating synthetic data from observed samples. This paper shows that this principle is not only natural but also theoretically well-founded for bootstrap inference: it yields statistically valid confidence intervals that apply simultaneously to both regular and irregular estimators, including settings in which Efron's bootstrap fails. In this sense, the generative modeling-based bootstrap can be viewed as a modern version of the smoothed bootstrap: it could mitigate the curse of dimensionality and remain effective in challenging regimes where estimators may lack root-$n$ consistency or a Gaussian limit.

2602.16973 2026-02-20 econ.GN q-fin.EC

Lies, Labels, and Mechanisms

Alex L. Brown, Ethan Park, Rodrigo A. Velez

Abstract

We test whether lying aversion can steer equilibrium selection in mechanism design. In a principal-worker environment, the direct mechanism admits two dominant-strategy equilibria: the designer's target and a worker-optimal outcome. We show this limitation persists for all robust mechanisms, then ask whether framing misreports as explicit lies helps. We develop a 2X2 experiment that varies direct vs. extended mechanisms with implicit vs. explicit messages. We find that framing misreporting of type as an explicit lie shifts play away from the worker-optimal outcome toward truthful reporting, raising designer payoffs with minimal efficiency loss. These findings indicate that lying aversion is an effective lever for aligning behavior with social objectives.

2602.16731 2026-02-20 econ.GN q-fin.EC

A Decade of Public Procurement in Spain: A Longitudinal Open Dataset from the BOE (2014-2024)

Manuel Munoz Pla

Comments Dataset and statistical analysis of Spanish public procurement (BOE, 2014-2024). 5 figures

Abstract

This paper presents a longitudinal open dataset of Spanish public procurement extracted from the Official State Gazette (BOE) covering the period 2014-2024. The dataset integrates structured information on contracts, contracting authorities, suppliers, amounts, and procedures, enabling large-scale quantitative analysis of public procurement dynamics in Spain. We describe the data extraction and normalization pipeline, provide descriptive statistical analyses of temporal and sectoral trends, and discuss potential applications in transparency research, public policy evaluation, and computational social science. The dataset is released to facilitate reproducible research on public procurement and government contracting.

2602.16211 2026-02-20 econ.TH

Equity in auction design with unit-demand agents and non-quasilinear preferences

Tomoya Kazumura, Debasis Mishra, Shigehiro Serizawa

Abstract

We study a model of auction design where a seller is selling a set of objects to a set of agents who can be assigned no more than one object. Each agent's preference over (object, payment) pairs need not be quasilinear. If the domain contains all classical preferences, we show that there is a unique mechanism, the minimum Walrasian equilibrium price (MWEP) mechanism, which is strategy-proof, individually rational, and satisfies equal treatment of equals, no-wastage (every object is allocated to some agent), and no-subsidy (no agent is subsidized). This provides an equity-based characterization of the MWEP mechanism, and complements the efficiency-based characterization of the MWEP mechanism known in the literature.

2506.20749 2026-02-20 econ.EM stat.ME

Analytic inference with two-way clustering

Laurent Davezies, Xavier D'Haultfœuille, Yannick Guyonvarch

Comments 69 pages, supplement starts at p.43

Abstract

This paper studies analytic inference along two dimensions of clustering. In such setups, the commonly used approach has two drawbacks. First, the corresponding variance estimator is not necessarily positive. Second, inference is invalid in non-Gaussian regimes, namely when the estimator of the parameter of interest is not asymptotically Gaussian. We consider a simple fix that addresses both issues. In Gaussian regimes, the corresponding tests are asymptotically exact and equivalent to usual ones. Otherwise, the new tests are asymptotically conservative. We also establish their uniform validity over a certain class of data generating processes. Independently of our tests, we highlight potential issues with multiple testing and nonlinear estimators under two-way clustering. Finally, we compare our approach with existing ones through simulations.

2505.00282 2026-02-20 econ.EM cs.LG

A Unifying Framework for Robust and Efficient Inference with Unstructured Data

Jacob Carlson, Melissa Dell

Abstract

To analyze unstructured data (text, images, audio, video), economists typically first extract low-dimensional structured features with a neural network. Neural networks do not make generically unbiased predictions, and biases will propagate to estimators that use their predictions. While structured variables extracted from unstructured data have traditionally been treated as proxies - implicitly accepting arbitrary measurement error - this poses various challenges in an era where constantly evolving AI can cheaply extract data. Researcher degrees of freedom (e.g., the choice of neural network architecture, training data or prompts, and numerous implementation details) raise concerns about p-hacking and how to best show robustness, the frequent deprecation of proprietary neural networks complicates reproducibility, and researchers need a principled way to determine how accurate predictions need to be before making costly investments to improve them. To address these challenges, this study develops MAR-S (Missing At Random Structured Data), a semiparametric missing data framework that enables unbiased, efficient, and robust inference with unstructured data, by correcting for neural network prediction error with a validation sample. MAR-S synthesizes and extends existing methods for debiased inference using machine learning predictions and connects them to familiar problems such as causal inference, highlighting valuable parallels. We develop robust and efficient estimators for both descriptive and causal estimands and address inference with aggregated and transformed neural network predictions, a common scenario outside the existing literature.

2501.15761 2026-02-20 econ.EM

Universal Factor Models

Songnian Chen, Junlong Feng

Abstract

We propose a new factor analysis framework and estimators of the factors and loadings that are robust to certain weak factors in a large $N$ and large $T$ setting. Our framework, by simultaneously considering all quantile levels of the outcome variable, induces standard mean and quantile factor models, but the factors can have an arbitrarily weak influence on the outcome's mean or quantile at most quantile levels. Our method estimates the factor space at the $\sqrt{N}$-rate as long as each factor is strong at some unknown quantile level, and achieves $\sqrt{N}$- and $\sqrt{T}$-asymptotic normality for the factors and loadings based on a novel sample splitting approach that handles incidental nuisance parameters. We also develop a weak-factor-robust estimator of the number of factors and consistent selectors of factors of any tolerated level of influence on the outcome's mean or quantiles. Monte Carlo simulations demonstrate the effectiveness of our method.

2307.15313 2026-02-20 econ.EM

Group-Heterogeneous Changes-in-Changes and Distributional Synthetic Controls

Songnian Chen, Junlong Feng

Abstract

We develop new changes-in-changes (CIC) and distributional synthetic controls (DSC) types of methods when there exists group-level heterogeneity. For CIC, we allow individuals to belong to heterogeneous groups, extending Athey and Imbens (2006) by finding appropriate control groups that share similar group-level unobserved characteristics to the treatment groups. For DSC, we show that the synthetic control units are not necessarily from the same period as in Gunsilius (2023); they may come from different periods in which they have comparable group-level heterogeneity to the treatment group. Implementation of these new methods is briefly discussed.

1911.10116 2026-02-20 econ.TH cs.SI econ.GN q-fin.EC

Aggregative Efficiency of Bayesian Learning in Networks

Krishna Dasaratha, Kevin He

Abstract

When individuals in a social network learn about an unknown state from private signals and neighbors' actions, the network structure often causes information loss. We consider rational agents and Gaussian signals in the canonical sequential social-learning problem and ask how the network changes the efficiency of signal aggregation. Rational actions in our model are log-linear functions of observations and admit a signal-counting interpretation of accuracy. Networks where agents observe multiple neighbors but not their common predecessors confound information, and even a small amount of confounding can lead to much lower accuracy. In a class of networks where agents move in generations and observe the previous generations, we quantify the information loss with an aggregative efficiency index. Aggregative efficiency is a simple function of network parameters: increasing in observations and decreasing in confounding. Later generations contribute little additional information, even when generations are arbitrarily large and agents observe arbitrarily far into the past.