arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.14976 2026-05-15 stat.ME econ.EM q-fin.ST

Multi-regime Markov-switching models with time-varying transition probabilities: An application to U.S. Treasury yields

Samuel Modée, Yushu Li, Sjur Westgaard, Stein Andreas Bethuelsen

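AI summary: This paper studies multi-regime Markov-switching models with time-varying transition probabilities and applies them to U.S. Treasury yields. The authors extend the two-regime, common-variance setting of the Generalized Autoregressive Score (GAS) model to the general multi-regime case with regime-specific means and variances, and develop an open-source R package for data simulation and parameter estimation. The results show that regime means, variances, and transition probabilities can be estimated reliably, whereas the coefficients driving the transition probabilities are harder to identify, and the GAS score coefficient is non-identifiable in the joint likelihood. The empirical analysis shows that an exogenous specification based on the yield level fits better than the constant and lagged-change models, while the GAS specification performs poorly because of convergence problems.

Comments
15 pages, 1 figure
Abstract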

This paper studies Markov-switching (MS) models with time-varying transition probabilities (TVTP) under various specifications of the transition probability matrix. In particular, we extend the two-regime common-variance setting of the Generalized Autoregressive Score (GAS) model of Bazzi et al. (2017) to the general $K$-regime case with regime-specific means and variances. The study includes comprehensive Monte Carlo simulations, and we develop an open-source R package, \texttt{multiregimeTVTP}, for data simulation and parameter estimation. We find that the regime means, variances, and transition probabilities are reliably recovered, whereas the TVTP driving coefficients are harder to identify. We also find that the GAS score coefficient appears to be statistically non-identifiable, due to a ridge in the joint likelihood surface over $(\sigma^2, A)$. In addition, we find that one-step point forecasts are remarkably robust to TVTP misspecification, but filtered regime probabilities are not, so correct specification matters most for characterizing regime dynamics rather than short-horizon forecasting. An empirical application to U.S. Treasury zero-coupon yield changes at four maturities (1961-2024) shows that an exogenous specification driven by the lagged yield level dominates the constant and lagged-change models in fit, while the GAS specification fails to converge, with $\hat{A}$ collapsing to zero, reflecting the same identifiability issue observed in simulation.
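
For readers unfamiliar with exogenous TVTP models, the following is a minimal Python sketch of one filtering step in a $K$-regime Gaussian Markov-switching model whose transition probabilities depend on a lagged covariate. It is not the authors' code and does not use the `multiregimeTVTP` API. The multinomial-logit link, the function names, and all parameter values are assumptions made for illustration; in practice a baseline category would normally be fixed for identification.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tvtp_matrix(intercepts, slopes, x_lag):
    """Row i holds P(S_t = j | S_{t-1} = i, x_{t-1}) under an assumed logit link."""
    k = len(intercepts)
    return [
        softmax([intercepts[i][j] + slopes[i][j] * x_lag for j in range(k)])
        for i in range(k)
    ]


def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def filter_step(prev_probs, trans, y, means, variances):
    """One Hamilton-filter update: predict regimes, weight by likelihood, normalize."""
    k = len(means)
    predicted = [sum(prev_probs[i] * trans[i][j] for i in range(k)) for j in range(k)]
    joint = [predicted[j] * normal_pdf(y, means[j], variances[j]) for j in range(k)]
    likelihood = sum(joint)  # contribution of y to the sample likelihood
    return [v / likelihood for v in joint], likelihood


# Hypothetical two-regime example: a calm regime and a volatile regime.
trans = tvtp_matrix([[2.0, 0.0], [0.0, 2.0]], [[0.0, 0.5], [0.0, -0.5]], x_lag=1.0)
filtered, lik = filter_step([0.5, 0.5], trans, y=0.1, means=[0.0, 1.0], variances=[0.01, 0.25])
```

Setting all slopes to zero recovers a constant transition matrix, which corresponds to the constant-probability benchmark mentioned in the abstract.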

2605.14575 2026-05-15 econ.GN q-fin.EC stat.ME

The Asset Price Channel of Monetary Policy: Evidence from Regional Stock-Market Developments in the Successor States of Former Yugoslavia

Stefan Tanevski

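AI summary: This study empirically examines whether a sectoral asset price channel of monetary policy exists in the region of the six republics of former Yugoslavia. By constructing regional sectoral stock indices and applying a panel vector autoregressive model and Pooled Mean Group estimation, it finds a clear asset price transmission effect in the finance and telecom sectors, possibly because multinational corporate networks foster sub-market regionalization. By contrast, the manufacturing and electricity sectors show no such effect, suggesting that local stock markets remain fragmented and that more efficient regional market integration or closer cooperation among stock exchanges is needed.

Abstract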

The aim of this study is to empirically investigate the existence of a sectoral asset price channel of monetary policy in the region of the six republics of former Yugoslavia. The study constructs sectoral indices for the entire region, building on the idea that one regional stock exchange may provide more efficiency for the listed companies in the region, while the relevance of monetary policy for it may be sector-specific. We employ a panel vector autoregressive model to observe impulse responses of sectoral indices to innovations in monetary policy, and then disentangle the long-run from the short-run relationships for each index through Pooled Mean Group estimation. Overall, we document the presence of the asset price channel in the finance and telecom sectors, likely driven by established multinational corporate networks fostering sub-market regionalization. This is not the case for the manufacturing and electricity sectors, which may imply that local stock markets are still too fragmented and that there is room for a more efficient regional stock market, either in the true sense of the word or, more realistically, through enhanced regional cooperation among the stock exchanges.

2605.14493 2026-05-15 econ.GN q-fin.EC

Deep Learning for Solving and Estimating Dynamic Models in Economics and Finance

Simon Scheidegger

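AI summary: This work introduces deep learning methods for solving and estimating high-dimensional dynamic stochastic models in economics and finance, with the aim of addressing the curse of dimensionality that traditional tensor-product grid methods face on complex models. It is organized around four complementary methods: Deep Equilibrium Nets, Physics-Informed Neural Networks, deep surrogate models, and Gaussian processes, which offer significant advantages in model solution, parameter estimation, and policy design. The applications covered include representative-agent models, overlapping-generations models, continuous-time macro-finance models, and climate economics, giving researchers a route to hands-on use of deep learning tools.

Comments
330 pages
Abstract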

This script offers an implementation-oriented introduction to deep learning methods for solving and estimating high-dimensional dynamic stochastic models in economics and finance. Its starting point is the curse of dimensionality: heterogeneous-agent economies, overlapping-generations models with aggregate risk, continuous-time models with occasionally binding constraints, climate-economy models, and macro-finance environments with many assets and frictions generate state and parameter spaces that strain classical tensor-product grid methods. The exposition is organized around four complementary methodologies. Deep Equilibrium Nets embed discrete-time equilibrium conditions into neural-network loss functions. Physics-Informed Neural Networks approximate continuous-time Hamilton--Jacobi--Bellman, Kolmogorov forward, and related partial differential equations. Deep surrogate models provide fast, differentiable approximations to expensive structural models, while Gaussian processes add a probabilistic layer that quantifies approximation uncertainty; together they support estimation, sensitivity analysis, and constrained policy design. Gaussian-process-based dynamic programming, combined with active learning and dimension reduction, extends value-function iteration to very large continuous state spaces. Applications span representative-agent and international real business cycle models, overlapping-generations and heterogeneous-agent economies, continuous-time macro-finance, structural estimation by simulated method of moments, and climate economics under uncertainty. Companion notebooks in TensorFlow and PyTorch invite hands-on experimentation. These notes are a deliberately subjective and inevitably incomplete snapshot of a rapidly evolving field, aimed at equipping PhD students and researchers to engage with this frontier hands-on.

2605.12698 2026-05-15 q-fin.MF math.OC math.PR

Optimal investment and Pension policy in Pay-As-You-Go systems under forward utility and ageing population

Jennifer Alonso-Garcia, Caroline Hillairet, Sarah Kaakai, Mohamed Mrad

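AI summary: This paper studies optimal investment and pension policy, against the backdrop of an ageing population, in a Pay-As-You-Go (PAYG) pension system that uses a buffer fund as an intergenerational risk-sharing mechanism. Working with non-zero-volatility forward Constant Relative Risk Aversion (CRRA) utilities and accounting for sustainability and adequacy constraints, it derives the optimal policies in closed form and analyzes in depth how preference sensitivities affect the pension scheme. Numerical simulations assess the sustainability and benefit adequacy of the hybrid PAYG buffer-fund arrangement under different demographic, financial, and macroeconomic scenarios.

Abstract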

This paper investigates optimal investment and pension policies in a Pay-As-You-Go (PAYG) system supplemented by a buffer fund used as an intergenerational risk-sharing mechanism. The social planner's preference criterion is represented by non-zero volatility forward Constant Relative Risk Aversion (CRRA) utilities, and explicitly accounts for both sustainability and adequacy constraints. The optimal policies are characterized in closed form, and an in-depth analysis of the impact of preference sensitivities on the pension scheme is conducted. A detailed numerical analysis is performed to evaluate the sustainability and benefit adequacy of this hybrid PAYG buffer fund arrangement under a range of demographic, financial, and macroeconomic scenarios.

2605.12508 2026-05-15 cs.SI cs.CR q-fin.RM

Interoperability Effects: Extending DeFi Lending Risk Models to Multi-Chain Environments

Hasret Ozan Sevim

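AI summary: As DeFi becomes increasingly multi-chain, cross-chain interoperability poses new challenges for risk management in lending protocols. Using panel fixed-effects regressions and OLS models, this paper analyzes data on 15 decentralized lending protocols and 53 cross-chain bridges from October 2022 to January 2025 and shows how cross-chain activity affects protocol total value locked (TVL) and total revenue. Bridge volume is a key driver of TVL and revenue, although the direction of its effect varies by chain type; new chain launches have comparatively little effect on TVL and revenue, whereas bridge hacks show a significant positive relationship. The results suggest that effective DeFi risk models should incorporate cross-chain metrics and adopt a layer-aware approach to reflect the evolving multi-chain ecosystem more accurately.

Abstract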

On-chain lending has expanded across multiple distributed ledgers as DeFi becomes increasingly multi-chain. This environment introduces novel technical and financial mechanisms, particularly cross-blockchain communication and asset transfer protocols, yet cross-chain elements remain understudied in lending protocol risk management. To address this gap, we apply panel fixed-effects regressions and OLS models to empirically analyze cross-blockchain interoperability solutions, using TVL and total revenue as performance proxies from October 2022 to January 2025. Our data set covers 15 decentralized lending protocols and 53 cross-chain bridges across 9 EVM-compatible blockchains, categorized as Ethereum, alternative layer-1s, and Ethereum layer-2 networks. Results reveal that cross-chain activity affects protocol performance. Bridge volume emerges as a critical driver, exerting a significant effect on TVL and revenue in every category, though the direction of this effect varies across categories. Increased bridge integrations are associated with decreased TVL and protocol revenue across categories, indicating that liquidity escapes from those lending ecosystems. Liquidations produce heterogeneous effects across categories. New network launches have less significant relationships with TVL and revenue, while bridge hacks show a significant and positive relationship. High R-squared values confirm meaningful explanatory power. We further show that Ethereum attracts large depositors, while layer-2s skew toward retail participation. We conclude that effective DeFi risk models should incorporate cross-chain metrics and adopt a layer-aware approach to accurately reflect the evolving multi-chain landscape.

2605.10060 2026-05-15 econ.GN q-fin.EC

Skill Premia and Pre-Marital Investments in Marriage Markets

Aditya Kuvalekar

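AI summary: This paper studies a decentralized marriage market with search frictions, costly pre-marital skill investments, and non-transferable utility, and finds that even in a symmetric environment the market can exhibit asymmetric equilibria in which one side invests more in skills than the other. As wages for high-skilled workers rise, the increasing skill premium can move the market from a symmetric equilibrium to a unique asymmetric equilibrium in which one side invests fully while the other invests substantially less. The finding provides a new theoretical basis for understanding gender roles and differences in skill investment in marriage markets.

Abstract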

I study a decentralized marriage market with search frictions, costly pre-marital skill investments, and non-transferable utility. Despite a symmetric environment, the market can exhibit asymmetric equilibria, with one gender investing more in skills than the other; in some environments, the asymmetric equilibrium is unique. A microfounded model of household utility maximization shows that this transition from a unique symmetric equilibrium to a unique asymmetric equilibrium can be driven by rising labor-market wages for high-skilled workers: as the skill premium rises, one gender ends up fully investing while the other invests substantially less.

2605.02287 2026-05-15 q-fin.TR cs.CY q-fin.GN

Per-Market Information Leakage and Order-Flow Skill: Two Methodological Lenses on Informed Trading in Decentralized Prediction Markets

Maksym Nechepurenko

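AI summary: This paper compares three methods for detecting informed trading in decentralized prediction markets, which focus respectively on account-level trading skill, insiders identified by a lifecycle heuristic, and market-level quantification of information leakage. It argues that the methods operate at different layers of detection, each addressing a different dimension, and that combining them can improve detection precision. A case study of the 2026 U.S.-Venezuela operation shows how the methods jointly reveal information leakage and misconduct in trading.

Comments
v2 (May 2026): added Revision Note section. No methodological-comparison changes. 21 pages, 4 tables
Abstract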

April 2026 saw notable methodological convergence in the academic study of informed trading on decentralized prediction markets. Three approaches surfaced almost simultaneously: Mitts and Ofir (2026) apply a composite screen to over 210,000 wallet-market pairs; Gomez-Cram et al. (2026) apply an event-level sign-randomization test to Polymarket's complete transaction history, classifying 3.14% of accounts as "skilled winners" and separately flagging 1,950 accounts as "insiders" via a lifecycle heuristic; Nechepurenko (2026) develops the Information Leakage Score (ILS) framework, which quantifies per-market information front-loading at an article-derived public-event timestamp. This paper provides a methodological comparison. The central claim is that these are three distinct layers of detection, not competing methods on a single layer. Sign-randomization is best understood as an account-level test of persistent directional skill conditional on opportunity selection -- not a direct test of insider trading, and not a per-market measure. The heuristic insider flag is separate from the skill classifier, applies to a population the classifier excludes by design, and has unknown precision. The Polymarket sample pools politics, sports, crypto, and other categories with different information technologies, so a platform-wide "skilled winner" classification is mechanism-ambiguous. The January 2026 U.S.-Venezuela operation cluster, where the DOJ indictment of Master Sergeant Gannon Van Dyke provides a rare external enforcement benchmark, illustrates how the layers stack: lifecycle heuristics identify suspicious accounts; legal investigation addresses non-public-information possession; per-market scoring would quantify how much information was leaked into each contract. A combined pipeline gains in precision because each layer filters a different dimension.

2605.02286 2026-05-15 q-fin.TR q-fin.GN

Empirical Evaluation of Deadline-Resolved Information Leakage on Documented Polymarket Insider Cases

Maksym Nechepurenko

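AI summary: This paper presents an end-to-end empirical evaluation of the deadline Information Leakage Score (ILS-dl), an extension of the Information Leakage Score designed for analyzing publicly documented insider-trading cases on Polymarket. Based on the 2026 U.S.-Iran conflict cluster, the largest deadline cluster in the ForesightFlow Insider Cases inventory, the study uses several analyses to verify that ILS-dl distinguishes genuine signal from proxy artefacts, and it reveals the timing of information leakage and wallet behavior patterns across market categories.

Comments
v2 (May 2026): hazard-rate fits updated to full Tier-3 population (n=18 for military_geopolitics, was n=9). v1 estimate lies inside v2 95% CI. Esports taxonomy correction applied. No conclusion changes. 11 pages, 6 tables
Abstract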

This paper reports an end-to-end empirical evaluation of the deadline-Information Leakage Score (ILS-dl) extension introduced in the companion methodology paper. The deadline-ILS extends the original ILS to deadline-resolved prediction-market contracts, the dominant structural form of publicly documented insider trading on Polymarket. We anchor the evaluation in the 2026 U.S.-Iran conflict cluster of the ForesightFlow Insider Cases (FFIC) inventory, the largest documented deadline cluster. The evaluation has four parts: per-category exponential-hazard estimation, a single-case ILS-dl computation, cross-market wallet analysis, and methodological refinements. Hazard-rate estimation produces an adequate exponential fit for military-geopolitics markets (KS p = 0.426, half-life 2.9 days, n = 18) and a preliminary fit for corporate-disclosure markets (n = 5). The regulatory-decision category is rejected as bimodal (p = 0.023). On the largest applicable FFIC contract ("US forces enter Iran by April 30," $269M volume), the article-derived public-event timestamp yields ILS-dl = +0.113 versus a resolution-anchored proxy value of -0.331: a 0.444 shift in magnitude on opposite sides of zero, demonstrating that the extension distinguishes signal from proxy artefact. Pre-event drift is mild, and short-window variants (30-min, 2-hour) are exactly zero. Cross-market wallet analysis identifies 332 wallets active in both major Iran-cluster markets, but the available trade history covers only the resolution-settlement window. v2 (May 2026) corrects the hazard fit to the full Tier-3 population; the v1 estimate lies inside the v2 95% CI.
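
To make the exponential-hazard vocabulary concrete, here is a small Python sketch of fitting a constant hazard rate to observed times-to-event and converting it to a half-life. It is not the paper's estimation code: the function names are hypothetical, and the sketch assumes fully observed (uncensored) positive durations. Under this model the half-life equals $\ln 2 / \lambda$, so a half-life of 2.9 days corresponds to a rate of roughly 0.24 per day.

```python
import math


def fit_exponential_hazard(durations):
    """Maximum-likelihood constant hazard rate and the implied half-life.

    Assumes every duration is fully observed and strictly positive.
    """
    if not durations or any(d <= 0 for d in durations):
        raise ValueError("durations must be a non-empty list of positive numbers")
    rate = len(durations) / sum(durations)  # MLE of lambda for an exponential sample
    half_life = math.log(2.0) / rate        # time by which half the events have occurred
    return rate, half_life


def survival(t, rate):
    """P(T > t) under the exponential model."""
    return math.exp(-rate * t)


# Hypothetical times-to-event in days, for illustration only.
rate, half_life = fit_exponential_hazard([1.0, 2.0, 3.0])
```

A goodness-of-fit check such as the Kolmogorov-Smirnov test reported in the abstract would then compare the empirical distribution of durations with `1 - survival(t, rate)`.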

2605.00493 2026-05-15 q-fin.TR cs.CR q-fin.GN

ForesightFlow: An Information Leakage Score Framework for Prediction Markets

Maksym Nechepurenko

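AI summary: ForesightFlow is an Information Leakage Score (ILS) framework for detecting informed trading in decentralized prediction markets; it identifies potential insider trading by quantifying the fraction of the terminal information that the market has priced in before the event. The framework states three operational conditions to ensure the score can be interpreted validly, and introduces a Murphy decomposition that links label generation to scoring-rule theory. The empirical study shows that existing timestamp proxies have limitations and that real cases require extending the ILS to more realistic trading settings; the paper goes on to propose an extension anchored at the public-event timestamp to better match the documented insider-trading data.

Comments
v2 (May 2026): added Revision Note section; No methodology changes. 41 pages, 12 tables, 4 figures
Abstract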

ForesightFlow is an Information Leakage Score (ILS) framework for detecting informed trading on decentralized prediction markets. For an event-resolved binary market, the score quantifies the fraction of the terminal information move priced in before the public news event. Three operational scope conditions (edge effect, non-trivial total move, anchor sensitivity) are stated as preconditions for interpretation. The score admits a Murphy-decomposition reading that connects label generation to the proper-scoring-rule literature. A pilot empirical evaluation surfaces three findings. First, a resolution-anchored proxy for the public-event timestamp does not separate event-resolved markets from a matched control population (Mann-Whitney p = 1e-6, separation reversed), demonstrating that proxy quality is itself a binding constraint. Second, the article-derived timestamp on a single high-stakes case shifts the score by 0.444 in magnitude relative to the proxy and lies on the opposite side of zero. Third, an audit of the publicly documented Polymarket insider record reveals that documented cases are systematically deadline-resolved, falling outside the original ILS scope (0 of 24 FFIC inventory markets satisfied original scope conditions). This last finding motivates a deadline-ILS extension introduced in Section 7, anchored at the public-event timestamp rather than the news timestamp, and equipped with a per-category exponential hazard baseline for the time-to-event distribution. The extension closes the gap between the methodology and the population in which insider trading has been empirically documented. An end-to-end evaluation of the extension on the 2026 U.S.-Iran conflict cluster is reported in a companion paper. We release the FFIC inventory, the resolution-typology classification of the 911,237-market corpus, and all code at github.com/ForesightFlow.
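
One plausible reading of "the fraction of the terminal information move priced in before the public news event" is a simple price ratio, sketched below in Python. This is an assumption for illustration, not the paper's definition: the formula, the function name `leakage_score`, the argument names, and the `min_total_move` threshold (standing in for the "non-trivial total move" scope condition) are all hypothetical.

```python
def leakage_score(p_open, p_pre_event, p_terminal, min_total_move=0.05):
    """Share of the open-to-terminal price move already realized before the event.

    Returns None when the total move is too small for the ratio to be meaningful.
    All prices are contract prices in [0, 1].
    """
    total_move = p_terminal - p_open
    if abs(total_move) < min_total_move:
        return None  # assumed analogue of a "non-trivial total move" precondition
    return (p_pre_event - p_open) / total_move


# Hypothetical contract: opens at 0.20, trades at 0.50 just before the news, resolves near 0.80.
score = leakage_score(0.20, 0.50, 0.80)
```

Under this reading a negative value means the price drifted away from the terminal outcome before the event, which is at least consistent with the abstract reporting scores on both sides of zero.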

2604.24366 2026-05-15 q-fin.TR cs.GT q-fin.GN

The Anatomy of a Decentralized Prediction Market: Microstructure Evidence from the Polymarket Order Book

Philipp D. Dubach

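AI summary: This paper studies the microstructure of Polymarket, currently the largest on-chain prediction market, using its public order-book data and on-chain trade records. It documents eight stylized facts, including a longshot spread premium, the shape of the order-book depth profile, and trade-direction identification accuracy that is lower than on traditional exchanges. Trade direction inferred from the order-book feed deviates substantially from actual on-chain trades, so the authors recommend using on-chain trade events directly when analyzing Polymarket microstructure. The study provides an empirical foundation and data tools for analyzing decentralized prediction markets.

Comments
16 pages, 9 figures, 5 tables. JEL: G14, G12, G19, C58, L86. v2: scope narrowing in Section 3, SF2 redone with full per-level depth profile, SF8 reframed (time-to-close coefficient becomes NS once duration and p(1-p) are controlled). Replication: https://github.com/philippdubach/polymarket-microstructure ; Zenodo DOI 10.5281/zenodo.19811426
Abstract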

We study the microstructure of Polymarket, the largest on-chain prediction market, using a continuous tick-level archive of the public order-book feed (30 billion events over 52 days) joined to the authoritative on-chain trade record. On a pre-registered stratified panel of 600 markets we report eight stylized facts: a longshot spread premium; a depth profile closer to uniform than to top-of-book; a null block-clock alignment effect; broad maker-wallet diversity with a concentrated tail; category-conditional effective-spread differences; a sub-50 ms median archive-ingestion delay with a multi-second tail; a self-counterparty wash share with median 1% and a 22% upper tail (well below Cong et al. 2023's 25-70% for unregulated crypto venues -- a sanity bound, not an apples-to-apples reference); and a cross-sectional depth profile explained by market duration, price level, and volume, with no residual time-to-close effect. The paper also contributes a measurement result: trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), well below the ~80% Lee-Ready accuracy on Nasdaq. The effective half-spread changes sign between feed- and on-chain trade directions on 67%/50% of markets across two 7-day windows; Kyle's lambda on 60%/43%. Microstructure work on Polymarket therefore needs to source trade direction from on-chain OrderFilled events; we release a replication package that performs the join.

2603.16659 2026-05-15 cs.AI econ.GN q-fin.EC

LLMs learn scientific taste from institutional traces across the social sciences

Ziqin Gong, Ning Li, Huaikang Zhou

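AI summary: This study examines how large language models (LLMs) can learn from institutional traces in the social sciences, such as publication records, to improve their evaluative ability in low-verifiability domains. The authors build tiered research-pitch benchmarks for eight disciplines and train models by supervised fine-tuning (SFT); the resulting models judge research value significantly better than random guessing and even surpass frontier reasoning models and the average level of expert reviewers. Model confidence is also strongly related to prediction accuracy, indicating a degree of reliability in the models' judgments.

Abstract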

Reinforcement-learned reasoning has powered recent AI leaps on verifiable tasks, including mathematics, code, and structure prediction. The harder bottleneck is evaluative judgment in low-verifiability domains, where no oracle anchors reward and the core question is which untested ideas deserve attention. We test whether institutional traces, the record of what fields published, where, and at which tier, can serve as a training signal for AI evaluators. Across eight social science disciplines (psychology, economics, communication, sociology, political science, management, business and finance, public administration), we built held-out four-tier research-pitch benchmarks and supervised-fine-tuned (SFT) LLMs on field-specific publication outcomes. The fine-tuned models cleared the 25 percent chance baseline and exceeded frontier-model performance by wide margins, with best single-model accuracy ranging from 55.0 percent in public administration to 85.5 percent in psychology. In management, evaluated against 48 expert gatekeepers, 174 junior researchers, and 11 frontier reasoning models, the best single fine-tuned model (Qwen3-4B) reached 59.2 percent, 17.6 percentage points above expert majority vote (41.6 percent, non-tied) and 28.1 percentage points above the frontier mean (31.1 percent). The fine-tuned models also showed calibrated confidence: confidence rose when predictions were correct and fell when wrong, mirroring how a skilled reviewer can say "I'm sure" versus "I'm guessing." Selective triage on this signal reached very high accuracy on the highest-confidence subsets in every field. Institutional traces, we conclude, encode a scalable training signal for the low-verifiability judgment on which science depends.

2603.06875 2026-05-15 cs.LG q-fin.CP

Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy

Abdulrahman Alswaidan, Jeffrey D. Varner

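AI summary: This paper proposes a stochastic attention mechanism based on the modern Hopfield energy: Langevin dynamics samples from the corresponding Boltzmann distribution, giving a training-free attention-based generative model. Adjusting the temperature parameter switches between exact retrieval and open-ended generation, and no score network or training loop is needed, which makes the method particularly suitable for data-scarce settings. Experiments show strong generative performance across several domains, including face generation, handwritten-digit generation, and protein-sequence generation, with samples that remain novel while preserving structural features.

Comments
Main body (including references excluding the appendix): 11 pages, 2 figures and 1 table. Total paper: 26 pages, 13 figures and 7 pages
Abstract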

Attention heads retrieve: given a query, they return a weighted average of stored values. We showed that this computation is one step of gradient descent on the modern Hopfield energy, and that Langevin sampling from the corresponding Boltzmann distribution yielded stochastic attention, a training-free sampler controlled by a single temperature parameter. Lowering the temperature gave exact retrieval; raising it gave open-ended generation. Because the energy gradient equals the attention map, no score network, training loop, or learned model was required, making the approach particularly suited to the low-data regime where learned generative models are starved of training signal. We derived an entropy inflection condition that identified the retrieval-to-generation transition temperature for any memory geometry and validated the sampler on five domains spanning two orders of magnitude in dimension. A single Boolean mask on the attention softmax, identical to the causal mask used in transformers but applied along the memory axis rather than the sequence axis, turned the sampler into a zero-shot class-conditional generator on Olivetti faces with no retraining and no learned classifier. On MNIST digit images, stochastic attention produced samples that were markedly more novel and more diverse than the best learned baseline while matching a Metropolis-corrected gold standard. On protein sequences from a small Pfam family, the generation regime preserved amino acid composition far more faithfully than a variational autoencoder at matched novelty, indicating that the training-free score function retained family-level fidelity that learned models lost. A denoising diffusion baseline failed across all memory sizes tested, producing samples indistinguishable from isotropic noise. The approach required no architectural changes to the underlying attention mechanism.
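
The link between attention and the Hopfield energy can be sketched in a few lines of Python. The block below assumes the standard modern Hopfield energy $E(\xi) = -\beta^{-1} \log \sum_i \exp(\beta\, x_i^\top \xi) + \tfrac{1}{2}\lVert \xi \rVert^2$, whose gradient is $\xi$ minus the softmax-weighted average of the stored patterns, and an unadjusted Langevin update with a separate `temperature` argument. The function names, the step-size convention, and the way temperature enters are assumptions for illustration and may differ from the paper's parametrization.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(memories, query, beta):
    """Softmax-weighted average of stored patterns (keys and values are the same here)."""
    weights = softmax([beta * sum(a * b for a, b in zip(m, query)) for m in memories])
    return [sum(w * m[k] for w, m in zip(weights, memories)) for k in range(len(query))]


def energy_gradient(memories, state, beta):
    """Gradient of the assumed Hopfield energy: state minus the attention output."""
    retrieved = attention(memories, state, beta)
    return [s - r for s, r in zip(state, retrieved)]


def langevin_step(memories, state, beta, step_size, temperature, rng):
    """One unadjusted Langevin update: gradient descent plus Gaussian noise."""
    grad = energy_gradient(memories, state, beta)
    noise_scale = math.sqrt(2.0 * step_size * temperature)
    return [s - step_size * g + noise_scale * rng.gauss(0.0, 1.0) for s, g in zip(state, grad)]


# Hypothetical two-pattern memory; a short noisy chain started from a query.
memories = [[1.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)
state = [0.3, -0.2]
for _ in range(10):
    state = langevin_step(memories, state, beta=2.0, step_size=0.1, temperature=0.05, rng=rng)
```

With `step_size=1.0` and `temperature=0.0` a single update returns exactly the attention output, which is the "one step of gradient descent" statement in the abstract; raising `temperature` injects noise and moves the chain away from pure retrieval.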

2508.16919 2026-05-15 q-fin.RM

Combining a Large Pool of Forecasts of Value-at-Risk and Expected Shortfall

James W. Taylor, Chao Wang

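AI summary: This paper studies how to combine value-at-risk (VaR) and expected shortfall (ES) forecasts effectively when a large pool of candidate forecasting methods is available. The authors implement a range of new combining methods, including simple methods based on the median and mode and weighted methods that use regularisation to reduce overfitting; treating VaR and ES jointly as interval forecasts, they also apply trimmed means and a mixtures approach based on probability distributions. The empirical study shows that greater accuracy was obtained with a pool of just six methods chosen for diversity, where performance-based weighting performed best overall.

Comments
34 pages, 10 figures
Abstract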

We consider the combination of value-at-risk (VaR) and expected shortfall (ES) forecasts when a large pool of candidate forecasts is available. Given the limited literature in this area, we implement a variety of new combining methods. In terms of simplistic methods, in addition to the mean, we consider the median and mode. As a complement to the previously proposed performance-based weighted combinations, we use regularisation to reduce overfitting in the presence of many weights. Treating VaR and ES forecasts jointly as interval forecasts allows the application of adapted interval forecast combination methods, including trimmed means and a mixtures approach based on inferred probability distributions. In an empirical study involving 90 forecasting methods, trimmed mean combinations, the mixtures method, and performance-based weighting delivered particularly strong results. However, greater forecasting accuracy resulted for a pool of just six methods, chosen to ensure diversity, with performance-based weighting producing the best overall performance.
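
The simplest combinations mentioned in the abstract (mean, median, trimmed mean) can be sketched as follows. This Python example is an illustration only: it trims VaR and ES forecasts separately and symmetrically, which is an assumption and not necessarily the adapted interval-forecast trimming the paper uses, and the forecast values are invented.

```python
import statistics


def trimmed_mean(values, trim_fraction):
    """Mean after dropping the lowest and highest trim_fraction of the values."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must lie in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of forecasts dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def combine(var_forecasts, es_forecasts, trim_fraction=0.2):
    """Return (VaR, ES) pairs for three simple combining rules."""
    return {
        "mean": (statistics.fmean(var_forecasts), statistics.fmean(es_forecasts)),
        "median": (statistics.median(var_forecasts), statistics.median(es_forecasts)),
        "trimmed_mean": (
            trimmed_mean(var_forecasts, trim_fraction),
            trimmed_mean(es_forecasts, trim_fraction),
        ),
    }


# Hypothetical lower-tail forecasts (returns in percent) from five methods; one is an outlier.
var_pool = [-2.1, -2.4, -2.0, -2.6, -6.0]
es_pool = [-2.9, -3.1, -2.7, -3.5, -9.0]
combined = combine(var_pool, es_pool)
```

Because each method's ES forecast lies below its VaR forecast, the same ordering carries over to the median and to the symmetric trimmed mean, so these rules do not produce a combined ES above the combined VaR.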

2410.02091 2026-05-15 cs.SE cs.AI cs.HC econ.GN q-fin.EC

The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Fangchen Song, Ashish Agarwal, Wen Wen

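AI summary: This study examines the impact of generative artificial intelligence (AI) on collaborative open-source software (OSS) development, focusing on the effect of GitHub Copilot, an AI programming assistant, in open-source projects on GitHub. Copilot use raises project-level code contributions by 5.9%, mainly through higher developer participation and individual productivity, but it also increases coordination time by 8%. The effects differ between core and peripheral developers, which offers useful evidence on the long-run impact of AI on open-source communities.

Abstract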

Generative artificial intelligence (AI) facilitates content production and enhances ideation capabilities, which can significantly influence developer productivity and participation in software development. To explore its impact on collaborative open-source software (OSS) development, we investigate the role of GitHub Copilot, a generative AI pair programmer, in OSS development where multiple distributed developers voluntarily collaborate. Using GitHub's proprietary Copilot usage data, combined with public OSS project data obtained from GitHub, we find that Copilot use increases project-level code contributions by 5.9%. This gain is driven by a 3.4% rise in developer coding participation and a 2.1% increase in individual productivity. However, Copilot use also leads to an increase in coordination time by 8% due to more code discussions. This reveals an important tradeoff: While AI expands who can contribute and how much they contribute, it slows coordination in collective development efforts. Despite this tension, the combined effect of these two competing forces remains positive, indicating a net gain in overall project-level timely merge of code contributions from using AI pair programmers. Interestingly, we also find the effects differ across developer roles. Peripheral developers show relatively smaller increases in project-level code contributions and experience larger increases in coordination time than core developers. In summary, our study underscores the dual role of AI pair programmers in affecting project-level code contributions and coordination time in OSS development. Our findings on the differential effects between core and peripheral developers also provide important implications for the structure of OSS communities in the long run.

2407.15536 2026-05-15 q-fin.CP

Calibrating the Heston model with deep differential networks

Giovanni Amici, Marco Morandotti, Chen Zhang

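AI summary: This paper proposes a gradient-based deep learning framework for calibrating the Heston option pricing model. The deep differential network (DDN) learns both the Heston pricing formula for plain-vanilla options and its partial derivatives with respect to the model parameters, avoiding the problems that traditional numerical methods can encounter when computing gradients. Experiments on several equity markets show higher calibration accuracy and a substantial reduction in computation time relative to global optimizers that do not use gradient information.

Abstract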

We propose a gradient-based deep learning framework to calibrate the Heston option pricing model (Heston, 1993). Our neural network, henceforth deep differential network (DDN), learns both the Heston pricing formula for plain-vanilla options and the partial derivatives with respect to the model parameters. The price sensitivities estimated by the DDN are not subject to the numerical issues that can be encountered in computing the gradient of the Heston pricing function. Thus, our network is an excellent pricing engine for fast gradient-based calibrations. Extensive tests on selected equity markets show that the DDN significantly outperforms non-differential feedforward neural networks in terms of calibration accuracy. In addition, it dramatically reduces the computational time with respect to global optimizers that do not use gradient information.

2012.01331 2026-05-15 econ.GN q-fin.EC

Motivating Careerists

Liqun Liu

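AI summary: This paper studies how political organizations can motivate careerist officials to make decisions in the public interest when explicit contracts are unavailable. By analyzing how an agent fulfills its roles under different information structures, it shows that the principal can effectively incentivize correct decisions and diligent implementation through performance-based reward schemes, provided the principal observes policy consequences rather than implementation details. The paper also characterizes the optimal information structure and discusses its implications for policy design.

Abstract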

Motivating careerists is challenging for political organizations. Without explicit contracts, careerists often pander to public opinions or their superiors' preferences. Worse, when tasked with implementing these distorted decisions, they tend to underinvest in the necessary efforts. We analyze the motivation problem by examining how a careerist agent fulfills these roles on behalf of a principal across various information structures. Importantly, the principal can credibly commit to performance-based reward schemes to incentivize correct decisions and diligent implementation. However, such schemes are feasible only if the principal observes policy consequences while backing away from implementation details. Along the way, we characterize the principal-optimal information structure. Putting theoretical findings into practice, we explore the underlying incentive structures and their policy implications.

2605.13998 2026-05-15 q-fin.CP cs.LG

Synthetic American Option Pricing via Jump-HMM-Driven Heston Implied Volatility

Julia Sun, Zheyu Jin, Jiawei Zhang, Jeffrey D. Varner

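AI summary: This study proposes a framework for generating synthetic American option prices that resolves the circular dependency created by implied volatility depending on observed option prices. A Jump Hidden Markov Model generates multi-asset price paths, a modified Heston variance model generates the implied-volatility surface, and a binomial lattice then prices the American options. The method produces volatility smile, skew, and term structure without external calibration, and uses neural surrogate models and sector features to improve generalization and cross-asset robustness.

Abstract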

Generating realistic synthetic option prices requires implied volatility as an input, yet implied volatility is itself derived from observed option prices, creating a circular dependency that limits synthetic data for machine-learning and risk-analysis applications. We break this circularity with a pipeline in which implied volatility emerges as an output of a structural model of equity returns. A Jump Hidden Markov Model produces multi-asset price paths with realistic stylized facts and cross-asset tail dependence; a modified Heston variance process, whose mean-reversion target depends on regime state, days to expiration, moneyness, and a market-mood indicator, converts those paths into implied-volatility paths; and a recombining binomial lattice prices American options from the resulting surface. Initializing variance at its mean-reversion target for each strike-expiration pair lets smile, skew, and term structure emerge without external calibration. We calibrate the shape function through a hierarchy spanning a parametric baseline, a globally shared neural surrogate, and a sector-specific neural surrogate fit to a multi-ticker, multi-sector option ladder. A temporal holdout on a multi-day capture isolated scheduled corporate events as the dominant source of test-time generalization error, and calendar-derived earnings-distance and same-sector peer-coupling features recovered the anticipatory portion of that signal. We then apply the framework as a synthetic-data generator on real near-the-money put and call contracts, forward-simulating price paths, and recovering path-conditional implied volatility, finite-difference American Greeks, and terminal short-premium profit and loss from one coherent simulation, and confirm cross-ticker robustness by re-running on a second underlying from a different sector and volatility regime. The framework is released as an open-source Julia package.
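
The last stage of the pipeline, pricing an American option on a recombining binomial lattice, can be illustrated with the textbook Cox-Ross-Rubinstein scheme. The Python sketch below is not the authors' Julia package: it assumes a single constant volatility `sigma` and no dividends, whereas the abstract describes volatility taken from a simulated surface, and the function name and arguments are hypothetical.

```python
import math


def american_option_crr(spot, strike, rate, sigma, maturity, steps, is_put=True):
    """Price an American option on a Cox-Ross-Rubinstein recombining lattice."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    discount = math.exp(-rate * dt)
    prob_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability
    if not 0.0 < prob_up < 1.0:
        raise ValueError("time step too coarse for these rate and volatility inputs")

    def payoff(price):
        return max(strike - price, 0.0) if is_put else max(price - strike, 0.0)

    # Terminal payoffs; node j has had j up-moves and (steps - j) down-moves.
    values = [payoff(spot * up ** j * down ** (steps - j)) for j in range(steps + 1)]

    # Backward induction with an early-exercise check at every node.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            continuation = discount * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j])
            exercise = payoff(spot * up ** j * down ** (step - j))
            values[j] = max(continuation, exercise)
    return values[0]


# Hypothetical at-the-money put: one year to expiry, 5% rate, 20% volatility.
put_price = american_option_crr(100.0, 100.0, 0.05, 0.20, 1.0, steps=200)
```

Re-pricing with bumped inputs (for example `spot` plus and minus a small amount) gives finite-difference Greeks of the kind mentioned in the abstract.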

2605.13866 2026-05-15 cs.CY econ.GN q-fin.EC

AI Alignment Amplifies the Role of Race, Gender, and Disability in Hiring Decisions

Ze Wang, Guobin Shen, Michael Thaler

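AI summary: This study examines whether language models reproduce or reshape human patterns of discrimination in hiring decisions. In a large-scale experiment, alignment-trained language models favor female and Black candidates in hiring recommendations and disfavor disabled candidates, with differences comparable to an additional six months to one year of education. Alignment training substantially amplifies the advantages associated with gender and race while worsening the disadvantage for disabled candidates, suggesting that the alignment process may reinforce rather than mitigate social biases.

Abstract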

Humans increasingly delegate decisions to language models, yet whether these systems reproduce or reshape human patterns of discrimination remains unclear. Here we run a large-scale study to analyse whether language models use demographic information in hiring decisions. We show, across 27 models and 177 occupations, that language models give female and Black candidates hiring advantages relative to otherwise-comparable male and white candidates, while giving disabled candidates disadvantages. The differences are meaningful in magnitude: the role of race, gender, and disability status is comparable to six months to one year of additional education. Post-training alignment is the primary driver: relative to matched pre-trained models, alignment amplifies advantages for female and Black candidates by 325% and 330%, and disadvantages for disabled candidates by 171%. Compared with previous human correspondence studies, language models reverse the direction of racial discrimination, attenuate the disability penalty, and amplify the female advantage by 190%. Alignment changes how models use qualification signals: alignment increases returns to skills and work experience overall, but relatively more so for female and Black candidates. Meanwhile, the absence of qualification signals harms marginalised groups more, particularly for disabled candidates, differences that may explain the asymmetry of alignment effects across groups we observe.