arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.08071 2026-05-11 econ.EM cs.HC stat.ME

Vibe Econometrics and the Analysis Contract

Lydia Ashton

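AI summary This paper examines "vibe methodology" as applied to economics. It argues that AI-assisted causal analysis ("vibe econometrics") improves efficiency but also changes the incidence, observability, and persuasive force of three failure modes: method-data mismatch, confidence laundering, and invisible forking. It proposes the Analysis Contract, a framework that adapts pre-analysis plans and the Causal Roadmap into a governance mechanism for AI-assisted causal inference, with the aim of making results more credible and auditable.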

Comments 20 pages, 2 figures. Appendices A-C (fillable templates) provided as ancillary file. Companion materials: https://github.com/lydiaashton/vibe-econometrics-supp . Also posted on SSRN: https://doi.org/10.2139/ssrn.6699999

Abstract

"Vibe coding" and "vibe analytics" have been framed as a democratization of technical capability. This paper argues that AI-assisted methodology more broadly, or what I call "vibe methodology," also democratizes the failure modes specific to each domain. When AI assists with methods whose validity depends on assumptions that cannot be verified from the output alone (a class I call "vibe inference"), the failure surface is structurally different: the output does not reliably signal invalidity, and when it does, recognizing the signal requires the expertise the workflow bypasses. I focus on "vibe econometrics," the subset of AI-assisted causal analysis where identification can be named faster than it can be audited. The claim of this paper is not that AI invents inferential failures that did not previously exist, but that it changes their incidence, observability, and persuasive force enough to create a practically distinct governance problem. This results in three failure modes: method-data mismatch, where AI bypasses expertise at execution; confidence laundering, where AI amplifies the credibility of formatted output; and invisible forking, which spans both. What is new is not the failure modes but AI's industrialization of their packaging. The barrier between naming a method and executing it has collapsed, and weak foundations, dressed as rigorous analysis, now reach audiences at a scale, speed, and polish that previously required expertise. I propose the Analysis Contract, a pre-commitment framework that adapts the logic of pre-analysis plans and the Causal Roadmap to the AI-assisted setting. The contract imposes three conditions before a causal claim is made: a method-data contract, a data audit, and a pre-commitment statement defining what would count as a disconfirming result. The framework generalizes across domains of vibe inference through domain-specific instantiation.

2605.07996 2026-05-11 cs.GT cs.MA econ.GN q-fin.EC

Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games

Ian Gemp, Crystal Qian, Marc Lanctot, Kate Larson

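AI summary This paper proposes a generalization of Nash equilibrium that does not require precise utility values and applies to games in which players can only rank their own actions given the other players' joint actions. Drawing on social choice theory, it redefines the best response and builds a context-ordinal Nash equilibrium on top of ordinal preferences. The authors establish existence under mild conditions on the aggregation method, study computational complexity and learning rules, and offer a tool for game-theoretic analysis of elicited human preference data.

Abstract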

Nash equilibrium serves as a fundamental mathematical tool in economics and game theory. However, it classically assumes knowledge of player utilities, whereas economics generally regards preferences as more fundamental. To leverage equilibrium analysis in strategic scenarios, one must first elicit numerical utilities consistent with player preferences, a delicate and time-consuming process. In this work, we forgo precise utilities and generalize the Nash equilibrium to a setting where we only assume a player is capable of providing an ordinal ranking of their actions within the context of other players' joint actions. The key technical challenge is to rethink the definition of a best-response. While the classical definition identifies actions maximizing expected payoff, we naturally look towards social choice theory for how to aggregate preferences to identify the most preferred actions. We define this generalized notion of a context-ordinal Nash equilibrium, establish its existence under mild conditions on aggregation methods, introduce notions of regularization, approximation, and regret, explore complexity for simple settings, and develop learning rules for computing such equilibria. In doing so, we provide a generalization of Nash equilibrium and demonstrate its direct applicability to elicited preferences in human experiments.
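
The abstract does not say which aggregation rule stands in for the expected-payoff best response. As a minimal sketch, assume a probability-weighted Borda count: each context (a joint action of the other players) contributes the player's ranking, weighted by how likely that context is under the opponents' mixed strategy. The function name `borda_best_response` and the toy game are hypothetical, and the paper's aggregation methods may differ.

```python
from typing import Dict, List, Sequence


def borda_best_response(rankings: Dict[str, Sequence[str]],
                        context_probs: Dict[str, float]) -> List[str]:
    """Return the actions with the highest probability-weighted Borda score.

    rankings[c] lists the player's own actions from most to least preferred
    when the other players play the joint action (context) c.
    context_probs[c] is the probability of context c.
    """
    scores: Dict[str, float] = {}
    for context, order in rankings.items():
        weight = context_probs.get(context, 0.0)
        n = len(order)
        for position, action in enumerate(order):
            # The top-ranked action earns n - 1 points, the last one earns 0.
            scores[action] = scores.get(action, 0.0) + weight * (n - 1 - position)
    best = max(scores.values())
    # Ties are kept: a best response can be a set of actions.
    return sorted(a for a, s in scores.items() if abs(s - best) < 1e-12)


# Hypothetical 3-action player facing an opponent who plays "L" or "R".
rankings = {
    "L": ["up", "mid", "down"],
    "R": ["down", "mid", "up"],
}
print(borda_best_response(rankings, {"L": 0.7, "R": 0.3}))
print(borda_best_response(rankings, {"L": 0.5, "R": 0.5}))
```

No utilities appear anywhere: only rankings and context probabilities enter the computation, which is the point of the ordinal setting.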

2605.07671 2026-05-11 cs.GT cs.AI cs.MA econ.TH math.OC

The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting

Lauri Lovén, Sasu Tarkoma

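AI summary This paper studies scored reporting by autonomous agents, where the agent also benefits from its report through a non-accuracy channel, which makes truthful reporting hard to incentivize. The main result is an endogeneity: the principal's optimal oversight uses a non-affine approval function to screen types, yet any non-affine approval makes truthful reporting suboptimal whenever deviation is undetectable, which undermines calibration. As an escape, a step-function approval threshold achieves first-best screening without harming calibration. Under the Brier score the welfare gap between second-best and first-best vanishes, a property that does not hold for other strictly proper scoring rules.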

Comments 38 pages, no figures. Targeting ACM Transactions on Economics and Computation (TEAC); preprint

Abstract

Eliciting truthful reports from autonomous agents is a core problem in scalable AI oversight: a principal scores the agent's report using a strictly proper scoring rule, but the agent also benefits from the report through a non-accuracy channel (approval for autonomous action, allocation share, downstream control). The same structure appears in classical mechanism-design settings such as marketplace operation. Our main result is an endogeneity: the principal's optimal oversight necessarily uses a non-affine approval function to screen types, yet any non-affine approval makes truthful reporting suboptimal under the combined objective whenever deviation is undetectable. The principal cannot avoid the perturbation that undermines calibration. This impossibility holds for all strictly proper scoring rules, with a closed-form perturbation formula. A constructive escape exists: a step-function approval threshold achieves first-best screening for every strictly proper scoring rule, because the agent's binary inflate-or-not choice creates a type-space threshold regardless of the generator's curvature. Under the Brier score specifically, the type-independent inflation cost yields a welfare equivalence between second-best and first-best; we prove this equivalence is unique to Brier (the welfare gap under smooth $C^1$ oversight is bounded below by $Ω(\text{Var}(1/G'') (γ/β)^2)$ for every non-Brier rule). Two instances develop the framework: AI agent oversight (the lead motivating setting) and marketplace operation (a parallel mechanism-design domain). The message for AI alignment is direct: smooth scoring-based oversight cannot elicit truthful reports from a strategic agent; sharp thresholds are the calibration-preserving design.
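
The tension between a strictly proper score and a report-dependent benefit can be seen in a toy calculation. The sketch below is not the paper's model: the additive objective, the weight `gamma`, and the quadratic approval function are all assumptions chosen for illustration. It only shows that the Brier score alone is maximized by the truthful report, and that adding a smooth approval term moves the agent's optimum away from the truth.

```python
import numpy as np


def expected_brier(q, p):
    """Expected positively oriented Brier score of report q when the belief is p."""
    return -(p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2)


def best_report(p, approval=None, gamma=0.0):
    """Grid-search the report that maximises score + gamma * approval(report)."""
    grid = np.linspace(0.0, 1.0, 1001)
    objective = expected_brier(grid, p)
    if approval is not None:
        # Assumed additive non-accuracy channel (hypothetical functional form).
        objective = objective + gamma * approval(grid)
    return float(grid[np.argmax(objective)])


belief = 0.4
truthful = best_report(belief)
inflated = best_report(belief, approval=lambda q: q ** 2, gamma=0.2)
print(truthful, inflated)
```

With these assumed numbers the first-order condition is 0.8 - 1.6q = 0, so the agent reports about 0.5 although the belief is 0.4.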

2605.07528 2026-05-11 econ.TH

Aggregate Stable Matching with Money Burning

Alfred Galichon, Yu-Wei Hsieh, Antoine Jacquet

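AI summary This paper studies non-transferable utility (NTU) stability in decentralized matching markets with fixed prices and proposes an aggregate stability notion based on money burning, which can be interpreted as waiting time. Agents are grouped into observable types and equilibrium is defined at the type level, so agents of the same type receive equal indirect utility. The paper introduces money burning into two NTU models, one deterministic and one with random utility, proves existence and uniqueness of equilibrium, and develops a generalized deferred acceptance algorithm based on alternating constrained discrete-choice problems that converges to the unique equilibrium.

Abstract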

We propose an aggregate notion of non-transferable utility (NTU) stability for decentralized matching markets with fixed prices, where market clearing is achieved through one-sided money burning, which can be interpreted as waiting. Agents are grouped into observable types and are indifferent among individuals within type; equilibrium is defined at the type level and delivers equal indirect utility within each type. We introduce money burning into two types of NTU models. In a deterministic model, we relate our notion to classical Gale-Shapley stability and show how money burning decentralizes stable outcomes under aggregation. We then introduce separable random utility, obtaining an NTU counterpart to Choo and Siow (2006). We prove the existence and uniqueness of equilibrium and provide a stationary queueing interpretation. Finally, we develop a generalized deferred acceptance algorithm based on alternating constrained discrete-choice problems and prove its convergence to the unique equilibrium.
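
For readers who want the classical benchmark the abstract refers to, here is a minimal sketch of one-to-one Gale-Shapley deferred acceptance. It is the textbook individual-level algorithm, not the paper's generalized type-level version with money burning; the worker and firm names are hypothetical.

```python
from typing import Dict, List


def deferred_acceptance(proposer_prefs: Dict[str, List[str]],
                        receiver_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Classical proposer-optimal deferred acceptance (one-to-one matching)."""
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held: Dict[str, str] = {}            # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                     # list exhausted: p stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)               # r finds p unacceptable
            continue
        current = held.get(r)
        if current is None:
            held[r] = p                  # r tentatively accepts p
        elif rank[r][p] < rank[r][current]:
            held[r] = p                  # r trades up and releases current
            free.append(current)
        else:
            free.append(p)               # r rejects p
    return {p: r for r, p in held.items()}


workers = {"w1": ["f1", "f2"], "w2": ["f1", "f2"]}
firms = {"f1": ["w2", "w1"], "f2": ["w1", "w2"]}
print(deferred_acceptance(workers, firms))
```

Both workers prefer f1, f1 prefers w2, so w1 is rejected once and ends up at f2; no worker-firm pair would rather match with each other than with their assigned partners.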

2604.25826 2026-05-11 econ.GN q-fin.EC stat.AP

General-Purpose Technology and Speculative Bubble Detection

Haiqiang Chen, Li Chen, Difang Huang, Yuexin Li, Zhengjun Zhang

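AI summary This paper studies how general-purpose technology adoption affects the detection of asset price bubbles and shows that the leading bubble test suffers severe size distortion when fundamentals incorporate such adoption. Embedding a hump-shaped technology shock in the Campbell-Shiller present-value model, the authors prove that the fundamental price becomes locally explosive during adoption, which contaminates the test's limit distribution. They propose decomposing prices into fundamental and speculative components. Empirically, the decomposition eliminates evidence of speculation in the 2020-2025 AI rally while confirming a speculative peak confined to December 1999-March 2000 in the dot-com episode.

Abstract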

We show that the leading bubble test suffers severe size distortion when fundamentals incorporate general-purpose technology adoption. Embedding a hump-shaped technology shock in the Campbell-Shiller present-value model, we prove that the fundamental price becomes locally explosive during adoption, contaminating the test's limit distribution with a non-centrality parameter proportional to the shock's peak. We propose a fundamental-versus-speculative decomposition that projects prices onto observable technology proxies and applies the test to the residual. Empirically, the decomposition eliminates evidence of speculation in the 2020-2025 AI rally while confirming a speculative peak confined to December 1999-March 2000 in the dot-com episode.

2603.00041 2026-05-11 cs.LG cs.AI econ.EM stat.ME

Econometric vs. Causal Structure-Learning for Time-Series Policy Decisions: Evidence from the UK COVID-19 Policies

Bruno Petrungaro, Anthony C. Constantinou

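AI summary This paper compares econometric methods with causal structure-learning methods for causal discovery in time-series policy decisions, using UK COVID-19 policies as the case study. It evaluates four econometric methods against eleven causal ML algorithms in terms of graphical structure, model dimensionality, and ability to recover causal effects. Econometric methods provide clear rules for temporal structure, whereas causal ML algorithms explore a larger space of graph structures and tend to produce denser graphs that capture more identifiable causal relationships. The study asks what causal ML can learn from econometrics and provides code to translate the econometric results to bnlearn, a Bayesian network library for R.

Abstract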

Causal machine learning (ML) recovers graphical structures that inform us about potential cause-and-effect relationships. Most progress has focused on cross-sectional data with no explicit time order, whereas recovering causal structures from time series data remains the subject of ongoing research in causal ML. In addition to traditional causal ML, this study assesses econometric methods that some argue can recover causal structures from time series data. The use of these methods can be explained by the significant attention the field of econometrics has given to causality, and specifically to time series, over the years. This presents the possibility of comparing the causal discovery performance between econometric and traditional causal ML algorithms. We seek to understand if there are lessons to be incorporated into causal ML from econometrics, and provide code to translate the results of these econometric methods to the most widely used Bayesian Network R library, bnlearn. We investigate the benefits and challenges that these algorithms present in supporting policy decision-making, using the real-world case of COVID-19 in the UK as an example. Four econometric methods are evaluated in terms of graphical structure, model dimensionality, and their ability to recover causal effects, and these results are compared with those of eleven causal ML algorithms. Amongst our main results, we see that econometric methods provide clear rules for temporal structures, whereas causal-ML algorithms offer broader discovery by exploring a larger space of graph structures that tends to lead to denser graphs that capture more identifiable causal relationships.

2602.20281 2026-05-11 econ.TH

Existence of Equilibrium Mechanisms in Generalized Principal-Agent Problems with Interacting Teams

Brian Roberson

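AI summary This paper studies whether equilibrium mechanisms exist when multiple principals simultaneously design incentive mechanisms for their own teams in environments with strategic spillovers. The approach tracks both the outcome distributions along the truthful-obedient path and the sets of outcome distributions achievable through unilateral deviations, which yields general conditions for equilibrium existence in multi-principal mechanism design with team production and agency problems.

Abstract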

We study incentive design when multiple principals simultaneously design mechanisms for their respective teams in environments with strategic spillovers. In this environment, each principal's set of incentive-compatible mechanisms--those that satisfy their own agents' incentive compatibility constraints--depends on the mechanisms offered by the other teams. Following a classic example by Myerson (1982), such games may lack equilibrium due to discontinuities in the correspondence of incentive-compatible mechanisms. We establish general conditions for equilibrium existence by introducing a novel approach that involves tracking both the outcome distributions along the truthful-obedient path and the sets of outcome distributions achievable through unilateral deviations, thereby providing a foundation for analyzing a wide range of multi-principal mechanism design with team production and agency problems.

2511.06545 2026-05-11 econ.GN cs.CY q-fin.EC

Vibecoding and Digital Entrepreneurship

Ruiqing Cao, Abhishek Bhatia

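AI summary This paper studies how GenAI-enabled coding automation ("vibecoding") affects digital entrepreneurial entry and venture performance. Exploiting ex-ante variation in ventures' exposure to vibecoding, it finds that vibecoding increases first-time launches and shortens time to launch, but economically viable entry rises only where GenAI augments rather than fully automates product development. Vibecoding creates the most value when it complements internal engineering capabilities, letting ventures shift human effort toward higher-level problem solving and dynamic adaptation.

Abstract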

As generative artificial intelligence (GenAI) automates coding tasks and expands access to technical resources, this paper examines how GenAI-enabled coding automation, colloquially known as "vibecoding," affects digital entrepreneurial entry and venture performance. We exploit ex-ante variation in ventures' exposure to vibecoding based on the product characteristics of their initial launches and estimate difference-in-differences models around the diffusion of GenAI coding tools. Vibecoding increases first-time launches and shortens time to launch, but economically viable entry rises only where vibecoding augments, rather than fully automates, product development. In these partially exposed product segments, viable entry increases by 11%, driven entirely by ventures founded by individuals with STEM education or work experience, especially those whose most recent employment was outside middle management. Among ventures launched before GenAI became widely accessible, performance gains similarly concentrate among partially exposed ventures with engineering-intensive initial teams. Together, these results suggest that GenAI-enabled coding automation does not eliminate the value of technical expertise. Instead, vibecoding creates the greatest value when it complements internal engineering capabilities, allowing ventures to delegate lower-level coding tasks to GenAI while shifting human effort toward higher-level problem solving and dynamic adaptation.

2408.12577 2026-05-11 econ.EM

Microtransit revenue management informed by citywide travel demand and joint subscription-mode choice modeling

Xiyuan Ren, Joseph Y. J. Chow, Venktesh Pandey, Linfei Yuan

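AI summary This paper studies how citywide travel demand and a joint model of travel mode and ride-pass subscription choice can improve revenue management for microtransit services. The authors combine citywide synthetic data with a nonparametric nested model and simulate revenue and traveler benefits under different policy scenarios. A case study in Arlington, Texas evaluates how pass discounts and subsidies affect microtransit ridership, system revenue, and traveler welfare; lower pass prices are estimated to raise total revenue by approximately $127 per day.

Journal ref
Transportation Research Part A, 210 (2026), 105046
Abstract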

As an IT-enabled multi-passenger mobility service, microtransit can improve accessibility, reduce congestion, and promote sustainability. However, realizing its business potential requires a deeper understanding of traveler preferences, highlighting the need for more effective tools for demand forecasting and revenue management, especially when actual usage data are limited. We propose an innovative modeling approach that integrates travel behavioral insights into microtransit policymaking. The approach operates by (1) leveraging citywide synthetic data to achieve greater spatiotemporal granularity, (2) estimating a nonparametric nested model for joint travel mode and ride-pass subscription choices, and (3) employing a simulation-based method to calculate revenue and traveler benefits under various policy scenarios. We demonstrate the applicability of our approach through a case study in Arlington, TX, one of the largest deployments of microtransit (Via) in the U.S. Using the simulation-based workflow, we evaluate alternative policy scenarios, including ride-pass discounts, event-based subsidies, and place-based subsidies, to assess their impacts on microtransit ridership, system revenue, and traveler welfare. The results indicate that reducing the weekly pass price from $25 to $18.9 and the monthly pass price from $80 to $71.5 would increase total revenue by approximately $127 per day. A 100% trip fare discount could reduce 61 car trips to AT&T Stadium during a game event while generating an additional 82 microtransit trips per day to Medical City Arlington. However, achieving these mode shifts would require subsidies of approximately $533 per event and $483 per day, respectively.

2605.07469 2026-05-11 econ.TH

Coordination Mechanisms with Partially Specified Probabilities

Francesco Giordano

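AI summary This paper studies which outcomes are implementable when only coarse statistics of a data-generating process are disclosed rather than its full distribution. Players know the expectations of finitely many random variables and form beliefs by maximum-entropy inference. When message spaces are unrestricted, implementable outcomes coincide with jointly coherent outcomes, which expands the set of correlated equilibria; with canonical mechanisms, implementability reduces to a single cross-entropy condition. Examples and several classes of games illustrate the reach of the framework.

Abstract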

We study which outcomes are implementable by disclosing coarse statistics of a data-generating process rather than its full distribution. Players observe data whose joint distribution is only partially known: they know the expectations of finitely many random variables and form beliefs by maximum-entropy inference. We obtain two characterizations. When message spaces are unrestricted, implementable outcomes coincide with jointly coherent outcomes, expanding the set of correlated equilibria. With canonical mechanisms, implementability reduces to a single cross-entropy condition: the target outcome must lie on the cross-entropy level set of some correlated equilibrium that passes through that equilibrium itself. Examples and several classes of games illustrate the reach of the framework.
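
Maximum-entropy inference from a known expectation has a simple concrete form on a finite support: the belief is an exponentially tilted distribution whose tilt is chosen to match the disclosed mean. The sketch below handles a single disclosed statistic (the mean of one variable), which is an assumption made for brevity; the function name and the dice example are illustrative and not taken from the paper.

```python
import numpy as np


def maxent_given_mean(support, target_mean, iterations=200):
    """Maximum-entropy distribution on a finite support with a given mean.

    The solution has the form p_i proportional to exp(lam * x_i); the tilt
    lam is found by bisection because the implied mean is increasing in lam.
    """
    x = np.asarray(support, dtype=float)
    if not (x.min() < target_mean < x.max()):
        raise ValueError("target_mean must lie strictly inside the support range")

    def tilted(lam):
        # Centring x leaves the normalised weights unchanged and limits overflow.
        w = np.exp(lam * (x - x.mean()))
        return w / w.sum()

    lo, hi = -1.0, 1.0
    while tilted(lo) @ x > target_mean:   # widen the bracket downwards
        lo *= 2.0
    while tilted(hi) @ x < target_mean:   # widen the bracket upwards
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tilted(mid) @ x < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted(0.5 * (lo + hi))


faces = np.arange(1, 7)
belief = maxent_given_mean(faces, 4.5)    # a die known only to average 4.5
print(np.round(belief, 4), belief @ faces)
```

When the disclosed mean equals the mean of the uniform distribution (3.5 for a die), the tilt is zero and the belief is uniform; any other disclosed mean skews the belief monotonically toward the high or low faces.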

2605.07404 2026-05-11 math.ST econ.EM stat.TH

Self-normalized tests for multistep conditional predictive ability

Qitong Chen, Shuwen Lai

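AI summary This paper proposes self-normalized tests for comparing multistep conditional predictive ability. The sample mean of the transformed loss differential is normalized by functionals of its cumulative sum (CUSUM) process, which avoids direct estimation of the long-run covariance matrix and removes the ad hoc bandwidth, kernel, and lag-truncation choices that traditional methods require. The authors establish the asymptotic theory, derive the limiting distributions under the null, and prove consistency. Simulations show that the tests mitigate the finite-sample size distortions of traditional heteroskedasticity and autocorrelation consistent (HAC) methods while retaining strong power against conditional predictability alternatives.

Abstract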

This paper proposes self-normalized tests for multistep conditional predictive ability in forecast comparison. By normalizing the sample mean of the transformed loss differential using functionals of its cumulative sum (CUSUM) process, specifically an adjusted-range normalizer for scalars and a matrix normalizer for vectors, our approach avoids direct estimation of the long-run covariance matrix. Consequently, it eliminates the need for the ad hoc bandwidth, kernel, and lag-truncation choices required by traditional methods. We establish the asymptotic theory for these statistics, deriving pivotal null limiting distributions and proving test consistency. Monte Carlo simulations show that the proposed tests effectively mitigate the finite-sample size distortions associated with traditional heteroskedasticity and autocorrelation consistent (HAC) methods, while retaining strong empirical power against conditional predictability alternatives.
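
The idea of normalizing by a CUSUM functional instead of a long-run variance estimate can be sketched for the scalar case. The statistic below divides the scaled sample mean by the adjusted range of the demeaned partial sums; this particular form is an assumption based on the abstract's wording, and the paper's exact statistic and critical values may differ. No critical values are given here.

```python
import numpy as np


def adjusted_range_statistic(d):
    """Self-normalised statistic for H0: E[d_t] = 0, scalar series d.

    Numerator:   sqrt(n) * mean(d)
    Denominator: (max_k S_k - min_k S_k) / sqrt(n), where S_k is the partial
                 sum of the demeaned series and S_0 = 0.
    No bandwidth, kernel, or lag truncation is chosen anywhere.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    mean = d.mean()
    cusum = np.concatenate(([0.0], np.cumsum(d - mean)))
    adjusted_range = cusum.max() - cusum.min()
    if adjusted_range == 0.0:
        raise ValueError("series is constant; the normaliser is zero")
    return n * mean / adjusted_range


# Hypothetical transformed loss differentials.
loss_diff = np.array([1.0, 2.0, 3.0, 4.0])
print(adjusted_range_statistic(loss_diff))
print(adjusted_range_statistic(3.0 * loss_diff))
```

Rescaling the series leaves the statistic unchanged, because numerator and denominator scale together. That scale invariance is what lets the normalizer stand in for an explicit long-run variance estimate.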

2605.07377 2026-05-11 econ.TH

Mental Health and Human Capital Composition in a Dynastic OLG Model with PAYG Pensions

Sushmita Kumari, Siddharth Gavhale

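AI summary This paper builds a two-period dynastic overlapping-generations (OLG) model in which parents, under a pay-as-you-go (PAYG) pension system, choose consumption, savings, fertility, and investment in three dimensions of child quality: education, physical health, and mental health. The central innovation is to model mental health as an independent productivity-enhancing input with its own elasticity, which shows how pension policy affects both the level and the composition of human capital. Higher PAYG contribution rates raise fertility through the Yakita effect but crowd out per-child investment in every quality dimension, including mental health. A higher mental-health elasticity shifts resources toward non-cognitive skill development while reducing fertility, which highlights a policy tension in developing economies between old-age support and human capital accumulation.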

Comments Accepted at the CDE-IEDS International Conference 2026, Delhi School of Economics (DSE), University of Delhi. 9 pages, 1 table

Abstract

This paper develops a two-period dynastic overlapping-generations (OLG) model in which parents simultaneously choose consumption, savings, fertility, and three distinct dimensions of child quality (education, physical health, and mental health) under a pay-as-you-go (PAYG) pension system. The central innovation is modelling mental health as an independent productivity-enhancing input with its own elasticity $θ$ in a Cobb-Douglas human-capital technology. This yields simple proportional allocation rules and shows how pension policy affects not only the overall level but also the composition of human capital investments. In steady state, higher PAYG contribution rates raise fertility through the Yakita effect but crowd out per-child investments in all quality dimensions, including mental health. An increase in the mental-health elasticity $θ$ shifts resources toward non-cognitive skill development while reducing fertility. These results reveal a fundamental policy tension for developing economies: pension systems that rely on children for old-age support simultaneously increase birth rates while reducing long-term human capital formation, with disproportionate effects on non-cognitive skills. The framework provides theoretical guidance for complementary policies that protect mental-health investments, with particular relevance for countries such as India where children remain a primary source of retirement security and mental-health services are underfunded.

2605.07065 2026-05-11 stat.ML cs.AI cs.LG econ.EM

Causal EpiNets: Precision-corrected Bounds on Individual Treatment Effects using Epistemic Neural Networks

Gandharv Patil, Keyi Tang, Raquel Aoki, Leo Guelman

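AI summary This paper addresses the identification of individual treatment effects and proposes Causal EpiNets, a method based on Epistemic Neural Networks for estimating bounds on individual-level causal effects in finite samples. It combines an anchored neural architecture that satisfies structural constraints by construction with precision-corrected intersection-bound inference, which addresses two flaws of standard plug-in estimators: violations of structural probability constraints and extremum bias. Experiments show that the method maintains nominal coverage and exact constraint validity in high-dimensional settings where standard estimators undercover.

Abstract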

Individual treatment effects are not point-identified from data. The Probability of Necessity and Sufficiency (PNS) circumvents this limitation by characterizing individual-level causality through intersection bounds derived from combined experimental and observational data. In finite samples, however, standard plug-in estimators systematically fail: they violate structural probability constraints and suffer from extremum bias induced by max-min operators, yielding spuriously narrow intervals. We propose a neural framework for finite-sample PNS estimation that resolves both pathologies. We introduce an anchored neural architecture that guarantees structural constraint satisfaction by construction. To correct extremum bias, we employ precision-corrected intersection-bound inference, leveraging Epistemic Neural Networks for scalable, high-dimensional uncertainty quantification. Empirical evaluations confirm that this approach maintains nominal coverage and exact constraint validity in high-dimensional regimes where standard estimators systematically undercover.

2605.06987 2026-05-11 cs.LG cs.GT econ.TH stat.ML

Response Time Enhances Alignment with Heterogeneous Preferences

Federico Echenique, Alireza Fallah, Baihe Huang, Michael I. Jordan

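AI summary This paper studies how to improve the alignment of large language models with human preferences when labelers are heterogeneous. Standard approaches build a reward model from pooled binary choice data and ignore differences among labelers, so the true population-average preference cannot be learned. The authors use response time as a secondary signal and, modeling each decision as a Drift-Diffusion Model (DDM), introduce a consistent estimator that corrects the distortions of choice-only labels and outperforms standard baselines on synthetic and real-world datasets. The method requires no user-level identifiers.

Abstract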

Aligning large language models (LLMs) to human preferences typically relies on aggregating pooled feedback into a single reward model. However, this standard approach assumes that all labelers share the same underlying preferences, ignoring the fact that real-world labelers are highly heterogeneous and usually anonymous. Consequently, relying solely on binary choice data fundamentally distorts the learned policy, making the true population-average preference unidentifiable. To overcome this critical limitation, we demonstrate that augmenting preference datasets with a simple, secondary signal -- the user's response time -- can restore the identifiability of the population's average preference. By modeling each decision as a Drift-Diffusion Model (DDM), we introduce a novel, consistent estimator of heterogeneous preferences that successfully corrects the distortions of standard choice-only labels. We prove that our estimator asymptotically converges to the true average preference even in extreme cases where each anonymous labeler contributes only a single choice. Empirically, across both synthetic and real-world datasets, our method consistently outperforms standard baselines that otherwise fail and plateau at a bias floor. Because response times are essentially free to record and require zero user tracking or identification, our results bring promises and open up new opportunities for future data-collection pipelines to improve the social benefit without requiring user-level identifiers or repeated elicitations.
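
Why response time adds information can be seen from the textbook symmetric drift-diffusion model. The setup below (unit diffusion, barriers at plus and minus a, start at zero, nonzero drift) is an assumption made for illustration, and the paper's model and estimator may differ.

```latex
% Assumed setup (not taken from the paper): evidence X_t = \mu t + W_t with
% \mu \neq 0, started at X_0 = 0 and absorbed at +a (choice c = +1) or
% -a (choice c = -1); T is the response time.
\begin{align*}
  \Pr(c = +1) &= \frac{1}{1 + e^{-2 a \mu}}, &
  \mathbb{E}[c] &= \tanh(a\mu), \\
  \mathbb{E}[T] &= \frac{a}{\mu}\,\tanh(a\mu), &
  \frac{\mathbb{E}[c]}{\mathbb{E}[T]} &= \frac{\mu}{a}.
\end{align*}
```

Choice frequencies alone reveal the drift only through the saturating map tanh(a mu), whereas under these assumptions the ratio of mean choice to mean response time is linear in the drift for a single decision maker.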

2605.06757 2026-05-11 econ.GN q-fin.EC

Introducing Feedback Thinking and System Dynamics Modeling in Economics Education

Oleg V. Pavlov, Robert Y. Cavana, I. David Wheat, Khalid Saeed, Michael J. Radzicki, Brian C. Dangerfield

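AI summary This paper discusses the opportunities and barriers for introducing feedback thinking and system dynamics modeling into economics education, with the aim of helping students understand causality and feedback in complex economic systems. It presents a pricing feedback model as a teaching example, summarizes the authors' experiences teaching system dynamics in economics programs, and develops a four-level course hierarchy.

Journal ref
System Dynamics Review 41(2): e70001 (2025)
Abstract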

System dynamics is a methodology that is widely used in many academic fields. It explains the behavior of social and economic systems with models that capture complex causality and feedback effects. This 'practice paper' discusses the opportunities and barriers for introducing feedback thinking and system dynamics models in the economics curriculum. We start by providing a pricing feedback model that illustrates some of the benefits that system dynamics can provide in enhancing economics education. Then we summarize the experiences of each of the authors in teaching system dynamics on economics educational programs. This includes different approaches to teaching economics with system dynamics that depend on the learning objectives, the preparation of students, and the background of the instructor. We also develop a four-level course hierarchy for using system dynamics in economics teaching. We then point out the tradeoffs that instructors must consider as they introduce new pedagogies for delivering economics material. Finally, we provide some concluding comments with some suggestions for future work. The expected audiences for this paper are instructors as well as graduate students who are considering academia as a profession.

2605.06686 2026-05-11 cs.LG econ.EM stat.AP stat.ML

Robustness of Refugee-Matching Gains to Off-Policy Evaluation Choices

Kirk Bansak, Elisabeth Paulson, Dominik Rothenhäusler, Jeremy Ferwerda, Jens Hainmueller, Michael Hotard

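AI summary This paper examines how robust counterfactual impact estimates for refugee matching in the United States are to the choice of off-policy evaluation method. Applying inverse probability weighting (IPW) and several variants of augmented inverse probability weighting (AIPW), together with alternative modeling architectures and assignment procedures, the authors find that the impact estimates remain consistent in magnitude in all scenarios and statistically significant in most cases. The estimates are also consistent with the results originally presented in Bansak et al. (2018).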

Comments 13 pages, 2 figures, 10 tables

Abstract

Previous research has investigated the potential of refugee matching for boosting refugee outcomes, first considered by Bansak et al. (2018). This paper demonstrates the stability of counterfactual impact evaluation results in the context of refugee matching in the United States using a range of off-policy evaluation methods. In order to estimate counterfactual impact and test the robustness of our results, we employ several evaluation methods, including inverse probability weighting (IPW) and multiple variants of augmented inverse probability weighting (AIPW). We also consider various modifications, including alternative modeling architectures and different assignment procedures. The impact estimates remain consistent in magnitude in all scenarios as well as statistically significant in most cases. Furthermore, the estimates are also consistent with the results originally presented in Bansak et al. (2018).
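
The two estimator families named in the abstract can be sketched generically. The code below estimates the mean outcome that a target assignment rule would have produced, using logged assignments with known propensities. It is a textbook formulation with hypothetical variable names and toy numbers, not the authors' implementation; the outcome-model predictions are passed in as arrays rather than fitted.

```python
import numpy as np


def ipw_value(actions, outcomes, propensities, target_actions):
    """Inverse probability weighting estimate of the target policy's mean outcome.

    propensities[i] is the logging probability of the action actually taken.
    """
    match = (actions == target_actions).astype(float)
    return float(np.mean(match * outcomes / propensities))


def aipw_value(actions, outcomes, propensities, target_actions,
               pred_logged, pred_target):
    """Augmented IPW: outcome-model prediction plus a weighted residual correction.

    pred_logged[i] is the model's prediction for the action actually taken,
    pred_target[i] its prediction for the action the target policy would take.
    """
    match = (actions == target_actions).astype(float)
    correction = match * (outcomes - pred_logged) / propensities
    return float(np.mean(pred_target + correction))


# Toy logged data: two locations (0 and 1), each assigned with probability 0.5.
actions = np.array([0, 1, 0, 1])
outcomes = np.array([1.0, 3.0, 1.0, 3.0])
propensities = np.full(4, 0.5)
target = np.ones(4, dtype=int)            # hypothetical policy: send everyone to 1

print(ipw_value(actions, outcomes, propensities, target))
print(aipw_value(actions, outcomes, propensities, target,
                 pred_logged=outcomes, pred_target=np.full(4, 3.0)))
```

In this toy data both estimators return 3.0. AIPW does so whether the outcome model is perfect or uninformative (all-zero predictions reduce it to IPW), which illustrates the double robustness that motivates using both families as a robustness check.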

2601.18991 2026-05-11 q-fin.TR cs.GT econ.GN q-fin.EC

Who Restores the Peg? A Mean-Field Game Approach to Model Stablecoin Market Dynamics

Hardhik Mohanty, Bhaskar Krishnamachari

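AI summary This paper asks who restores the peg during stablecoin de-pegging events. The authors build a dynamic mean-field game model of fiat-collateralized stablecoin markets in which arbitrageurs and retail traders interact strategically across primary and secondary markets. The model endogenously maps market frictions into a price path and net order flows, which makes it possible to attribute peg-reverting pressure by channel and to stress-test when a given infrastructure becomes insufficient for recovery. Using three historical de-peg events, the study finds that primary-market arbitrage is the main force stabilizing system-wide stress and identifies a non-linear breakdown threshold beyond which a de-peg becomes markedly slower to reverse.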

Comments 9 pages, 9 figures, 3 tables

Abstract

USDC and USDT are the dominant stablecoins pegged to \$1 with a total market capitalization of over \$300B and rising. Stablecoins make dollar value globally accessible with secure transfer and settlement. Yet in practice, these stablecoins experience periods of stress and de-pegging from their \$1 target, posing significant systemic risks. The behavior of market participants during these stress events and the collective actions that either restore or break the peg are not well understood. This paper addresses the question: who restores the peg? We develop a dynamic, agent-based mean-field game framework for fiat-collateralized stablecoins, in which a large population of arbitrageurs and retail traders strategically interact across primary and secondary markets during a de-peg episode. The key advantage of this equilibrium formulation is that it endogenously maps market frictions into a market-clearing price path and implied net order flows, allowing us to attribute peg-reverting pressure by channel and to stress-test when a given infrastructure becomes insufficient for recovery. Using three historical de-peg events, we show that the calibrated equilibrium reproduces observed recovery half-lives and yields an order flow decomposition in which system-wide stress is predominantly stabilized by primary-market arbitrage. Finally, a quantitative sensitivity analysis identifies a non-linear breakdown threshold, beyond which a de-peg becomes markedly slower to reverse.

2512.23694 2026-05-11 stat.ML cs.LG econ.EM

Bellman Calibration for $V$-Learning in Offline Reinforcement Learning

Lars van der Laan, Nathan Kallus

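AI summary Reliable long-horizon value prediction is difficult in offline reinforcement learning because fitted value methods combine bootstrapping, function approximation, and distribution shift, while standard guarantees often require Bellman completeness or realizability. This paper introduces Bellman calibration, a weak reliability criterion requiring that states with similar predicted values have average Bellman targets that agree with those predictions, and proposes Iterated Bellman Calibration, a post-hoc procedure that recalibrates a value predictor by fitting a one-dimensional map of its original prediction. Without Bellman completeness or value-function realizability, the authors prove finite-sample guarantees that control calibration error at one-dimensional nonparametric rates, and they decompose value error into statistical estimation, finite-iteration, and approximation errors, clarifying when calibration improves prediction.

Abstract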

Reliable long-horizon value prediction is difficult in offline reinforcement learning because fitted value methods combine bootstrapping, function approximation, and distribution shift, while standard guarantees often require Bellman completeness or realizability. We introduce Bellman calibration, a weak reliability criterion requiring that states assigned similar predicted values have average Bellman targets that agree with those predictions. This criterion yields a scalar calibration error for diagnosing systematic numerical miscalibration, which we estimate from off-policy data using doubly robust Bellman target estimates. We then propose Iterated Bellman Calibration, a model-agnostic post-hoc procedure that recalibrates any learned value predictor by fitting a one-dimensional map of its original prediction, with histogram and isotonic variants. We prove finite-sample guarantees showing that Bellman calibration error is controlled at one-dimensional nonparametric rates without Bellman completeness or value-function realizability. Our value-error bounds separate statistical estimation, finite-iteration, and approximation errors, clarifying when calibration improves value prediction and when its gains are limited by the information in the original predictor or insufficient coverage.

2511.12456 2026-05-11 cs.GT econ.TH

Collusion-proof Auction Design using Side Information

Sukanya Kudva, Edward Dowling, Anil Aswani

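AI summary Existing auction mechanisms are vulnerable to bidder collusion, which substantially degrades revenue and non-colluder welfare. This paper uses a machine learning classifier to predict which bidders are colluding and designs two truthful mechanisms, V-PoP and C-PoP, that are resilient to collusion. It shows that collusion significantly harms existing mechanisms only when the colluding coalition is large. The new mechanisms combine the VCG mechanism with posted prices and retain high welfare and revenue under misclassification. The bounds also indicate that classifier design should prioritize avoiding false positives, that is, misclassifying non-colluders as colluders.

Abstract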

Existing auction mechanisms are vulnerable to bidder collusion, which substantially degrades revenue and non-colluder welfare. To design truthful mechanisms resilient to collusion, we introduce a novel approach that leverages a machine learning classifier to predict (even imprecisely) which bidders are colluding. We first establish a Bulow-Klemperer-type result for multi-unit auctions with single-minded bidders, demonstrating that collusion significantly harms existing mechanisms only when the colluding coalition is large. Consequently, we focus our design on settings with many colluders. Building on the welfare-optimal Vickrey-Clarke-Groves (VCG) mechanism, we propose two novel truthful mechanisms: VCG-Posted Price (V-PoP) and Conditional-Posted Price (C-PoP). V-PoP applies VCG to non-colluding bidders and posted prices to colluding ones, and ensuring truthfulness is non-trivial because we must dynamically split the quantity of items between these groups based on the values of the non-colluder bids. C-PoP further advances this by computing a posted price conditioned on non-colluder bids, and ensuring truthfulness is non-obvious because the posted price is chosen using the values of the non-colluder bids. Because real-world classifiers make errors, we provide theoretical lower bounds on the auction price of V-PoP and C-PoP under misclassification, which theory shows acts as a proxy for welfare and revenue. Crucially, our bounds yield actionable insights for classifier design, revealing that false negatives (misclassifying colluders as non-colluders) are preferable to false positives (misclassifying non-colluders as colluders). Numerical experiments demonstrate that our mechanisms achieve high welfare and revenue against collusion, even when utilizing simple, low-cost classifiers.

2509.21172 2026-05-11 cs.LG econ.EM math.OC stat.ML

Inverse Reinforcement Learning with Just Classification and a Few Regressions

Lars van der Laan, Nathan Kallus, Aurelien Bibaut

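AI summary This paper studies reward recovery in inverse reinforcement learning under the maximum-entropy model and proposes a general method, GenPQR, that can be implemented with classification and a few regressions, without relying on a specific neural-network architecture or on anchor-action restrictions. GenPQR modularly estimates the behavior policy, evaluates the soft Q-function, and recovers the normalized reward. The theory provides finite-sample guarantees under function approximation, and experiments show that GenPQR matches or improves on DeepPQR in reward recovery while being more flexible and modular.

Details
English abstract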

Inverse reinforcement learning (IRL) aims to infer rewards from observed behavior, but rewards are not identified from the policy alone: many reward--value pairs can rationalize the same actions. Meaningful reward recovery therefore requires a normalization, yet existing normalized IRL methods often rely on anchor-action restrictions or specialized neural architectures. We study reward recovery in the maximum-entropy, or Gumbel-shock, model under a broad class of statewise affine normalizations, with anchor-action constraints as a special case. This yields Generalized Policy-to-$Q$-to-Reward (GenPQR), a modular procedure that estimates the behavior policy, evaluates its soft $Q$-function through the Bellman equation, and recovers the normalized reward. Both stages can be implemented with off-the-shelf classification and regression methods. We prove modular finite-sample guarantees under general function approximation, with separate policy-estimation and $Q$-estimation errors. As a concrete instantiation, we study GenPQR with fitted $Q$-evaluation, reducing IRL to policy estimation followed by regression. Experiments show that GenPQR matches or improves reward recovery relative to DeepPQR while remaining simpler and more modular. Compared with DeepPQR, our theory goes beyond anchor actions, accommodates large and continuous action spaces, makes coverage requirements explicit, and is not tied to a specific neural-network architecture or training procedure.
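The policy-to-Q-to-reward logic can be illustrated with a minimal tabular sketch. This is not the authors' GenPQR estimator: it assumes a known transition kernel, a finite state and action space, and the anchor-action special case (action 0 has reward zero in every state). The names `P`, `R_TRUE`, `soft_policy`, and `recover_reward`, as well as all numbers, are hypothetical and chosen for illustration only. Under these assumptions the max-entropy policy satisfies log π(a|s) = Q(s,a) − V(s), so V solves V(s) = −log π(a0|s) + γ E[V(s') | s, a0], after which Q and the reward follow.

```python
import math

GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2
ANCHOR = 0  # assumption: anchor action whose reward is normalized to zero

# Hypothetical transition kernel P[s][a][s'] (assumed known here).
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.6, 0.3]],
    [[0.0, 0.7, 0.3], [0.5, 0.0, 0.5]],
    [[0.3, 0.3, 0.4], [0.0, 0.2, 0.8]],
]
# Hypothetical true reward with R_TRUE[s][ANCHOR] == 0 in every state.
R_TRUE = [[0.0, 1.0], [0.0, -0.5], [0.0, 2.0]]


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def expected_next(v, s, a):
    """E[v(s') | s, a] under the known kernel P."""
    return sum(P[s][a][t] * v[t] for t in range(N_STATES))


def soft_policy(reward, iters=2000):
    """Forward problem: soft value iteration, returning the max-entropy policy."""
    v = [0.0] * N_STATES
    for _ in range(iters):
        q = [[reward[s][a] + GAMMA * expected_next(v, s, a)
              for a in range(N_ACTIONS)] for s in range(N_STATES)]
        v = [logsumexp(q[s]) for s in range(N_STATES)]
    return [[math.exp(q[s][a] - v[s]) for a in range(N_ACTIONS)]
            for s in range(N_STATES)]


def recover_reward(policy, iters=2000):
    """Inverse problem: policy -> V -> Q -> reward under the anchor normalization."""
    # Step 1: fixed point V(s) = -log pi(anchor|s) + gamma * E[V(s') | s, anchor].
    v = [0.0] * N_STATES
    for _ in range(iters):
        v = [-math.log(policy[s][ANCHOR]) + GAMMA * expected_next(v, s, ANCHOR)
             for s in range(N_STATES)]
    # Step 2: Q(s,a) = V(s) + log pi(a|s).
    # Step 3: r(s,a) = Q(s,a) - gamma * E[V(s') | s, a].
    return [[v[s] + math.log(policy[s][a]) - GAMMA * expected_next(v, s, a)
             for a in range(N_ACTIONS)] for s in range(N_STATES)]


pi = soft_policy(R_TRUE)
r_hat = recover_reward(pi)
max_err = max(abs(r_hat[s][a] - R_TRUE[s][a])
              for s in range(N_STATES) for a in range(N_ACTIONS))
print("max abs reward recovery error:", max_err)
```

In this toy setting the policy is computed exactly, so the reward is recovered up to numerical tolerance. In practice the policy and the value function would have to be estimated from data, which is where the classification and regression steps described in the abstract come in.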

2508.00208 2026-05-11 econ.GN q-fin.EC

Channel Adoption Pathways and Post-Adoption Behavior

Shirsho Biswas, Hema Yoganarasimhan, Haonan Zhang

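AI summary With the rapid growth of digital shopping channels, many traditional retailers have invested in e-commerce websites and mobile apps. Using transaction data from a Brazilian pet supplies retailer, this paper examines how the behavior of offline-only consumers differs after they move online through different pathways (such as promotions, the pandemic, and a loyalty program). It finds that the motive for adopting a channel significantly shapes subsequent spending, profitability, and channel usage, which gives retailers useful guidance for designing promotions and forecasting customer lifetime value.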

Comments 95 pages

Details
English abstract

The rapid growth of digital shopping channels has prompted many traditional retailers to invest in e-commerce websites and mobile apps. While prior literature shows that multichannel customers are more valuable, it overlooks how the motive for adopting a new channel shapes post-adoption behavior. Using transaction-level data from a major Brazilian pet supplies retailer, we study offline-only consumers who adopt online shopping via four distinct pathways: organic adoption, the COVID-19 pandemic, Black Friday promotions, and a loyalty program. We examine how these pathways affect post-adoption spend, profitability, and channel usage using consumer-level panel data and difference-in-differences estimates. We find that all adopters increase spending relative to offline-only consumers, but their post-adoption behaviors differ systematically by adoption motive. Promotion-driven adopters engage in forward buying and exhibit lower subsequent profitability, whereas COVID-19 adopters display stronger offline persistence consistent with consumer inertia and habit theory. Our findings have important managerial implications: firms should design promotions that discourage stockpiling, reinforce habits among customers pushed online by external shocks, and explicitly account for heterogeneity in channel adoption motives when forecasting customer lifetime value and assessing the breakeven and ROI of promotions designed to induce the adoption of new channels.

2506.06776 2026-05-11 econ.EM

Testing the Solvability of Systems of Linear Inequalities

Leonard Goff, Eric Mbakop

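AI summary This paper studies how to test whether a system of linear equality and inequality constraints has a solution when the system's coefficients must be estimated, and notes that this problem arises widely in statistical inference for partially identified models. The authors characterize the hypothesis through whether the value of a linear program equals zero, build bootstrap-based tests on that characterization, and establish their uniform validity over large classes of data-generating processes. Simulations show good finite-sample performance even at moderate sample sizes, and empirical applications illustrate the approach.

Details
English abstract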

This paper studies the problem of testing whether a system of linear equality and inequality constraints admits a solution when the coefficients of that system may have to be estimated. We show that a wide range of inferential questions in partially identified models can be formulated as hypotheses of this form. Our approach exploits an alternative characterization of the hypothesis based on whether the value of a certain linear program is equal to zero. Building on this characterization, we develop bootstrap-based testing procedures and establish their uniform validity over large classes of data-generating processes. Simulation results demonstrate good finite-sample performance, even for moderate sample sizes. We illustrate the usefulness of the approach in two empirical applications.

2304.05515 2026-05-11 econ.TH

A Comparison of Cursed Sequential Equilibrium and Sequential Cursed Equilibrium: Different Concepts of Cursedness in Dynamic Games

Meng-Jhang Fong, Po-Hsuan Lin, Thomas R. Palfrey

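AI summary This paper compares two notions of cursed equilibrium for dynamic games, Cursed Sequential Equilibrium (CSE) and Sequential Cursed Equilibrium (SCE), clarifying how they differ in conceptual foundations, belief updating, and treatment of public histories. The two extensions, proposed by Fong et al. (2025) and by Cohen and Li (2026) respectively, address the limitations of static cursed equilibrium in dynamic settings. The paper systematically analyzes their core differences and technical implications.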

Comments 33 pages, 4 figures and 1 table

Details
English abstract

Cursed Equilibrium of Eyster and Rabin (2005) has been a leading theory for explaining winner's-curse-type behavior in static Bayesian games, but it faces conceptual limitations when applied to dynamic games. Two recent extensions, Cursed Sequential Equilibrium (CSE) by Fong, Lin and Palfrey (2025) and Sequential Cursed Equilibrium (SCE) by Cohen and Li (2026), address these limitations in fundamentally different ways. Complementing these two papers, this paper provides a systematic comparison of CSE and SCE, clarifying their conceptual foundations and technical implications, including their notions of cursedness, belief updating, and treatment of public histories.

2212.07384 2026-05-11 econ.GN q-fin.EC

Valuing Pharmaceutical Drug Innovations

Gaurab Aryal, Federico Ciliberto, Leland E. Farmer, Ekaterina Khmelnitskaya

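AI summary This paper proposes a method for estimating the market value of pharmaceutical drugs that combines an event study with a discounted cash flow model, inferring drug values from stock market responses to drug development announcements. It estimates that drugs developed by small firms have an average market value of about 2.16 billion US dollars, that the risk-adjusted present value at the preclinical stage is about 50 million US dollars, and that the expected development cost at the start of the discovery stage is about 38 million US dollars. The paper also estimates values and costs by therapeutic area and explores how the results could inform policies that support drug development.

Details
English abstract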

We propose a methodology to estimate the market value of pharmaceutical drugs. Our approach combines the event study method with a discounted cash flow model that infers drug values from stock market responses to drug development announcements. We estimate the average value of a drug developed by small firms (those below the 95th percentile of market capitalization) to be \$2.16 billion. At the preclinical stage, the risk-adjusted and present discounted average net value of drugs is \$50 million. Leveraging these estimates, we also determine the expected drug development cost at the start of the discovery stage to be \$38 million. We estimate values and costs for several therapeutic areas (e.g., neoplasm, infections) and explore applying these estimates to design policies that support drug development through drug buyouts and targeted preclinical interventions.

2110.13814 2026-05-11 econ.GN cs.GT q-fin.EC

Bidders' Responses to Auction Format Change in Internet Display Advertising Auctions

Shumpei Goke, Gabriel Y. Weintraub, Ralph Mastromonaco, Sam Seljan

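AI summary This paper studies how bidders' actual bidding behavior changes when a new auction format is introduced in internet display advertising, specifically a switch from second-price to first-price auctions. Using a novel dataset in which publishers adopted first-price auctions at staggered dates, it finds that revenue per sold impression rises markedly for adopting publishers relative to non-adopters, by 25% to 75% of the pre-treatment price level. The increase tends to dissipate over time, which suggests that bidders initially shaded their bids too little and adjusted gradually toward the new equilibrium. The study is one of the first field analyses of bidders' responses to auction format changes and offers useful guidance for auction design.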

Comments 35 pages, 37 figures

Details
English abstract

We study actual bidding behavior when a new auction format is introduced into the marketplace. More specifically, we investigate this question using a novel dataset on internet display advertising auctions that exploits a staggered adoption by different publishers (sellers) of first-price auctions (FPAs), instead of the traditional second-price auctions (SPAs). We analyze the auction format change using difference-in-differences regressions and a synthetic difference-in-differences estimator, which better handles pre-trends. The results show that revenue per sold impression (price) jumps considerably for treated publishers relative to control publishers, with increases ranging from 25% to 75% of the pre-treatment price level of the treated group. Moreover, for later auction format changes, the increase in price levels under FPAs relative to those under SPAs tends to dissipate over time, reminiscent of the revenue equivalence theorem, although the extent of this reversion depends on the specification. We view these results as suggestive of initially insufficient bid shading following the format change, as opposed to an immediate transition to a new Bayesian Nash equilibrium, with prices tending to decline in several specifications in a manner consistent with gradual adjustment in bidding behavior as bidders learn to shade their bids. Our work constitutes one of the first field studies on bidders' responses to auction format changes, providing an important complement to theoretical model predictions. As such, it provides valuable information to auction designers when considering the implementation of different formats.
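The way a difference-in-differences estimate is expressed as a share of the treated group's pre-treatment price can be shown with a minimal 2x2 sketch. It is only the canonical two-group, two-period comparison, not the staggered-adoption regressions or the synthetic difference-in-differences estimator used in the paper. All prices below are hypothetical and are not taken from the paper's dataset, and the function name `did_estimate` is an illustrative assumption.

```python
from statistics import mean

# Hypothetical prices per sold impression (illustrative only, NOT the paper's data).
treated_pre = [1.9, 2.0, 2.1]   # publishers that switch to FPAs, before the switch
treated_post = [2.9, 3.0, 3.1]  # the same publishers, after the switch
control_pre = [1.7, 1.8, 1.9]   # publishers that stay on SPAs, same early period
control_post = [1.9, 2.0, 2.1]  # publishers that stay on SPAs, same late period


def did_estimate(t_pre, t_post, c_pre, c_post):
    """Canonical 2x2 DiD: change in treated mean minus change in control mean."""
    return (mean(t_post) - mean(t_pre)) - (mean(c_post) - mean(c_pre))


effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
# Express the effect relative to the treated group's pre-treatment price level.
relative_effect = effect / mean(treated_pre)

print(f"DiD estimate: {effect:.2f}")
print(f"Share of treated pre-treatment price: {relative_effect:.0%}")
```

With these made-up numbers the treated mean rises by 1.0 and the control mean by 0.2, so the estimate is 0.8, or 40% of the treated pre-treatment mean of 2.0. The comparison is informative only under a parallel-trends assumption, which is the concern the abstract raises when it mentions pre-trends.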