arXivDaily: arXiv daily academic digest, updated Monday to Friday
2605.26074 2026-05-26 cs.CL cs.AI q-fin.GN

StakeBench: Evaluating Language Understanding Grounded in Market Commitment

Yunhua Pei, Jingyu Hu, Yiwei Shi, Hongnan Ma, Weiru Liu, John Cartlidge

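AI summary: Proposes the StakeBench framework, which links market comments to verifiable trading records so that supervision signals are generated automatically from market behavior, and evaluates how well language models understand market commitment.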

Comments
21 pages, 2 figures, 20 tables. Preprint. Dataset and evaluation code included
English abstract

Existing financial NLP benchmarks often rely on labels supplied by outside observers, measuring how language is perceived rather than what speakers have committed to in the market. We introduce StakeBench, an evaluation framework for language understanding grounded in market commitment. StakeBench links 560,876 comments from 2,261 resolved markets to verified position, action, and market-odds records across Polymarket and Manifold. Supervision is derived from observable market behavior. Position sides, post-comment trading actions, and market-odds trajectories replace human annotation. Four diagnostic tasks test whether models detect market commitment, identify the revealed side, anticipate future action, and perform collective odds projection. Three commitment-aware metrics measure alignment with revealed preferences rather than perceived sentiment. Validity audits and explicit interpretation boundaries help distinguish observable commitment signals from latent belief and causal market-odds impact. Across 15 LLMs and 18 topics and platform settings, models partially recover position-side signals, with Directed Accuracy from 0.506 to 0.599, but show structural failures on later tasks. Ten of the fifteen models collapse to one or two action labels in future action anticipation, and no model consistently improves on the naive odds-direction baseline in collective odds projection. Model scale is not correlated with performance, finance-domain tuning does not improve revealed-side identification, and platform incentives strongly shape higher-order results. StakeBench is packaged with evaluation code and dataset under CC-BY 4.0.

2605.25894 2026-05-26 cs.LG q-fin.ST

Predicting Stock Price Direction on Earnings Announcement Days using Multi-modal Deep Learning

Manuel Noseda, Nathan Soldati, Marco Paina

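AI summary: Combines fundamental metrics, technical indicators, and news sentiment to predict stock price direction on earnings announcement days with LSTM and Transformer models; finds that the Transformer is more sensitive in identifying volatile movements and that news sentiment improves performance.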

English abstract

Predicting stock price movements during Earnings Announcements (EAs) is a significant challenge due to market noise and high-impact price discontinuities. In this study, we evaluate whether pre-announcement news sentiment, firm fundamentals, and recent market dynamics jointly predict the directional price movement of equities on EA days. We construct a multi-modal feature space combining 15 fundamental metrics, 3 price-based technical indicators and sentiment scores derived from financial news articles processed using FinBERT. We compare a Long Short-Term Memory (LSTM) network and a Transformer-based architecture against a logistic regression baseline, and further assess all models with and without sentiment features to quantify their incremental value. Our results indicate that while the LSTM demonstrates higher precision through a conservative safe-bet strategy, the Transformer model exhibits superior sensitivity in identifying volatile movements, achieving a higher macro F1-score, with ablation experiments showing a consistent benefit from incorporating news sentiment.
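
The contrast drawn in this abstract between high precision and a higher macro F1-score can be illustrated with a small sketch. The labels and the two prediction vectors below are hypothetical, not data from the paper; the function only shows how macro F1 averages per-class F1 without weighting by class size.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: 1 = price up, 0 = price down on the announcement day.
y_true = [1, 1, 1, 0, 0, 0]
conservative = [1, 0, 0, 0, 0, 0]  # rarely predicts "up": precision 1.0 on class 1
sensitive = [1, 1, 0, 1, 0, 0]     # predicts "up" more often

f1_conservative = macro_f1(y_true, conservative)
f1_sensitive = macro_f1(y_true, sensitive)
```

In this toy case the conservative predictor is never wrong when it predicts "up" but misses most upward moves, so its macro F1 is lower than that of the more sensitive predictor.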

2605.25824 2026-05-26 q-fin.MF

Mean-field game of mean-variance portfolio management with peer-based relative risk aversion

Weilun Cheng, Zongxia Liang, Sheng Wang, Xiang Yu

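AI summary: Studies a mean-field game of mean-variance portfolio management with peer-based relative risk aversion, and proves the existence of a time-inconsistent mean-field equilibrium via smooth regularization and fixed-point arguments.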

English abstract

This paper investigates a mean-field game (MFG) problem for mean-variance (MV) portfolio management, highlighting a new type of relative performance encoded by the peer-based risk aversion. Specifically, the risk aversion is formulated as a piecewise form that depends on whether the individual's wealth is above or below the population average. Due to the inherent time-inconsistency in the MV criterion, together with the piecewise risk aversion, we encounter a class of time-inconsistent MFG, new to the literature. Our goal is to seek a mean-field equilibrium, characterized by a forward-backward stochastic differential equation (FBSDE) system and a mean-field consistency condition. The new challenge stems from the discontinuous coefficients induced by the piecewise risk aversion. In response, we first propose a smooth regularization technique and obtain the existence of the equilibrium in the intra-personal game for the representative agent by establishing the solution to the discontinuous multi-dimensional FBSDE. Next, by invoking fixed-point arguments and convergence analysis as smoothing regularization vanishes, we conclude the existence of the mean-field equilibrium in the time-inconsistent MFG.

2605.25766 2026-05-26 math.ST q-fin.RM stat.TH

Measuring multivariate maximal tail dependence

Takaaki Koike, Marius Hofert, Haruki Tsunekawa

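AI summary: To address the limitation that the classical tail dependence coefficient is evaluated only along the diagonal, the paper introduces and studies the multivariate maximal tail concordance measure (MTCM), extending it from the bivariate to the multivariate case; the measure quantifies the largest tail mass over hyperrectangles of unit volume, and the existence of an optimal direction is proved.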

English abstract

The classical tail dependence coefficient (TDC) may fail to capture non-exchangeable features of bivariate tail dependence since it evaluates the underlying copula only along the diagonal. To address this limitation, several measures of strongest manifestation of tail dependence have been proposed in the bivariate case, including a measure based on the tail copula of the underlying bivariate copula. This paper introduces and investigates the multivariate maximal tail concordance measure (MTCM) which extends the bivariate measure to the multivariate case. The MTCM quantifies the largest tail mass over lower hyperrectangles of common unit volume, while the associated maximizer identifies the direction of maximal tail probability. We establish fundamental properties of the MTCM in the multivariate case, including existence of an optimal direction. We also derive analytical representations for several important model classes. Closed-form expressions are further obtained for survival Marshall-Olkin copulas, Archimax and nested Archimedean copulas with regularly varying Archimedean generators. An application to trivariate annual sea-level maxima in England shows that the MTCM can reveal off-diagonal stress directions and substantial differences in the underlying extremal dependence not detected by likelihood- or TDC-based comparisons.
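
For orientation, the standard bivariate objects behind this abstract can be written out. The first two lines are textbook definitions; the last line is only one reading of "largest tail mass over rectangles of common volume" in the bivariate case and is an assumption here, not a formula quoted from the paper.

```latex
% Lower tail dependence coefficient of a bivariate copula C (diagonal only)
\lambda_L = \lim_{u \downarrow 0} \frac{C(u, u)}{u}

% Lower tail copula of C, when the limit exists
\Lambda(x, y) = \lim_{t \downarrow 0} \frac{C(t x, t y)}{t}, \qquad x, y > 0

% \lambda_L = \Lambda(1, 1). A maximal version searches off the diagonal over
% rectangles (0, tb] \times (0, t/b], whose area t^2 does not depend on b:
\lambda^{*} = \sup_{b > 0} \Lambda\!\left(b, \tfrac{1}{b}\right) \ge \lambda_L
```

Under this reading, a maximizing value of b different from 1 indicates that the strongest tail dependence lies off the diagonal, which the TDC alone cannot show.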

2605.25632 2026-05-26 cs.AI cs.LG q-fin.RM

Insuring Every Action: An Authority Frontier Framework for Runtime Actuarial Control of Autonomous AI Agents

Hao-Hsuan Chen

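AI summary: Proposes the Actuarial Action Interface (AAI) and the Authority Frontier framework, which price, gate, and evaluate the side-effect-bearing actions of autonomous AI agents through a deterministic runtime contract, enabling cross-domain actuarial control and benchmarking.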

Comments
35 pages, 4 figures, 11 tables. Companion paper on the mathematical foundations: SSRN 6761960
English abstract

Autonomous AI agents increasingly issue side-effect-bearing actions: database mutations, refunds, payments, external commitments. We propose the Actuarial Action Interface (AAI), a deterministic runtime contract that prices each such action against a contractually fixed safe default under a time-consistent risk mapping, and gates execution against a per-boundary reserve capital budget. We then develop the Authority Frontier, an evaluation primitive measuring how much autonomous authority the runtime releases at each level of reserve capital. The framework provides (i) a deterministic quote-bind-commit protocol with toll-bounded capability tokens; (ii) a universal seven-class action taxonomy mapping heterogeneous tool calls to comparable authority units; (iii) replay determinism and pathwise reserve coverage under alpha-spending; (iv) cross-domain normalization via full reserve demand C_full and capital metrics Capital@k. We instantiate AAI across four agentic environments (database mutation, customer-service refund, and the public tau-bench retail and airline tool-use traces) and report a live Postgres panel in which three Azure-hosted models propose actions through the same contract. The frontier exhibits a common low-reserve refusal and intermediate-release pattern across domains, with saturation only where the budget grid reaches full reserve demand; required reserve capital varies by 22x (Capital@50 from 289 to 6457). The framework does not force domains into the same shape; it surfaces each domain's actuarial geometry. In the live panel the contract prevents realized loss across all three models at low budget while differing in underwriting persistence under denial: model identity is an actuarial underwriting variable. The contribution is a benchmark-ready evaluation framework for runtime actuarial control of autonomous-agent side effects.

2605.25610 2026-05-26 physics.soc-ph econ.GN math.OC q-fin.EC stat.AP

Match classification in the last round of four-team round-robin tournaments

László Csató, András Gyimesi

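AI summary: By analysing the FIFA World Cup group stage, the paper offers the first comparison of deterministic and probabilistic match classification methods, and uses the probabilistic model to quantify the impact of the 2026 World Cup reforms.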

Comments
22 pages, 4 figures, 6 tables
English abstract

Classification of matches played in the last rounds of sports competitions is a well-established tool for evaluating tournament designs. Both deterministic and probabilistic approaches are available for this purpose. Our paper offers the first comparison of them by analysing the most prominent example of four-team round-robin competitions, the group stage of the FIFA World Cup. We show that both methods are highly relevant in practice: all (four) deterministic and (six) probabilistic match types occurred in the 2014 and 2018 FIFA World Cups, respectively. The probabilistic model, which accounts for the relative benefits of attacking and defending, provides deeper insights; for instance, the competitive matches from the deterministic approach can be of any of the six probabilistic types. Finally, the probabilistic framework is used to quantify and decompose the impact of the main reforms introduced for the 2026 FIFA World Cup: the expansion to 48 teams, as well as the modified qualification and tie-breaking rules.

2605.25559 2026-05-26 q-fin.RM

Modeling dependence in sparse time series of Insurance Claims

Roberto Baviera, Pietro Manzoni, Michele Domenico Massaria

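AI summary: Proposes the Comb-Bernoulli model, which combines the strengths of Lévy copulas and zero-mixed models to allow tractable simulation, likelihood evaluation, and parameter estimation for dependence in sparse insurance risk time series; modeling strengths and numerical efficiency are illustrated on the Danish fire insurance data.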

English abstract

Modeling the dependence between multiple risk types is a central challenge in contemporary insurance risk management. The standard approaches, Lévy copulas and zero-mixed models, often face practical difficulties in simulation and parameter calibration. In this paper, we introduce the Comb-Bernoulli model, a novel framework for capturing dependence between sparse time series of insurance risks, bridging the benefits of the two standard approaches. The (traditional) copula structure of the proposed model enables tractable: i) simulation, ii) likelihood evaluation, and iii) estimation of dependence parameters. We present the general properties of the model and analyze in detail the Gaussian copula case with lognormal marginals. Moreover, we illustrate an application using the Danish fire insurance dataset, highlighting both the modeling strengths and numerical efficiency of our approach in real-world risk management.

2605.25555 2026-05-26 econ.GN q-fin.EC

Ownership Networks and Economic Power in the Italian Energy Sector

Andrea Pannone, Francesco Giancaterini, Tiziano Bacaloni, Andrea Bernardini, Alessio Abeltino

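AI summary: Introduces the Aggregate Network Power Index (A-NPI) and the Aggregate Network Power Flow (A-NPF) to study the distribution of economic power in the Italian energy sector, revealing a "Governance Paradox" in which the State retains majority ownership while global capital and common ownership weaken public strategic direction.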

English abstract

The energy sector is a cornerstone of national strategic autonomy, yet its increasing financialization has transformed ownership structures into complex networked configurations. This paper investigates the distribution of economic power in the Italian energy sector by introducing two sector-level extensions of the Network Power framework: the Aggregate Network Power Index (A-NPI) and the Aggregate Network Power Flow (A-NPF). Unlike traditional macro-level measures, these indices aggregate firm-level control and influence into a systemic framework that accounts for the relative economic weight of each operator. Applying this framework to the Italian case reveals a "Governance Paradox": while the State retains formal majority ownership, the sector's deepening reliance on global capital markets and the pervasive presence of common ownership by transnational institutional investors have progressively hollowed out public strategic direction. The results show that capital centralization enables global financial actors to internalize sectoral competition, fostering a regime of tacit strategic convergence in the management of critical infrastructure. This configuration challenges European strategic autonomy, raising questions about the adequacy of traditional Foreign Direct Investment (FDI) screening and antitrust tools in addressing the systemic influence exerted through networked ownership structures.

2605.25505 2026-05-26 cs.CY cs.AI econ.GN physics.soc-ph q-fin.EC

Generative AI impacts on intra-urban inequality and skill premium in Beijing

Xiliu He, Haoxiang Zhao, Mingyi Ma, Edward Wen Chuan Lai, Koei Enomoto, Anni Hu, Jiatong Li, Lingyun Chu, Yuan Lai

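AI summary: Using 5 million job postings from Beijing (2018-2024), the paper builds a neighborhood-level GenAI exposure index from task-level exposure assessments by five large language models; it finds that GenAI exposure is concentrated in the core districts and is associated with wage stagnation and a "high-skill trap" in high-exposure neighborhoods, challenging the theory of skill-biased technological change.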

Comments
21 pages, 8 figures
English abstract

Generative artificial intelligence (GenAI) is the first automation wave to reach high-cognitive tasks at scale, yet its effects on intra-urban inequality remain largely unknown. Using 5 million job postings from Beijing (2018-2024), we construct a neighborhood-level GenAI Exposure Index by aggregating task-level assessments from five leading large language models. We examine the spatial, structural and causal mechanisms of this shock. We find that GenAI exposure is highly concentrated in the city's core districts, deepening the intra-urban AI divide. Since 2023, high-exposure neighborhoods have experienced wage stagnation even as they continue to attract high-skilled workers: a "high-skill trap." This wage penalty is driven by task de-skilling and intensified labor-market crowding. A difference-in-differences design centered on ChatGPT's release supports a causal interpretation. These findings challenge the prevailing theory of skill-biased technological change and provide a basis for inclusive AI governance in global technology hubs.
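
The difference-in-differences logic mentioned in this abstract can be sketched in its simplest two-group, two-period form. The wage figures below are hypothetical and the paper's actual specification (controls, fixed effects, event-study terms) is not given in the abstract, so this is only an illustration of the estimator's arithmetic.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences: change in the treated group
    minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical average posted wages (arbitrary units) before and after the shock.
high_exposure_pre = [10, 12]
high_exposure_post = [11, 13]
low_exposure_pre = [8, 10]
low_exposure_post = [11, 13]

effect = did_estimate(high_exposure_pre, high_exposure_post,
                      low_exposure_pre, low_exposure_post)
```

Here wages rise in both groups, but less in the high-exposure group, so the estimate is negative. The estimate has a causal reading only under a parallel-trends assumption.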

2605.25450 2026-05-26 q-fin.MF q-fin.RM

Valuation of Variable Annuities with Equity Protection Swaps under Jumps and Default Risks

Marek Rutkowski, Huansang Xu

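AI summary: Based on Merton's jump-diffusion model and Szimayer's independent random time default model, derives closed-form valuation formulas and put-call parity relations for European options, and analyses hedging strategies and default-adjusted premiums for equity protection swap products under jump and default risks.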

English abstract

This paper examines the valuation and hedging of standard equity protection swap (EPS) products proposed by Xu et al. To account for financial crises and counterparty default risk, we develop pricing frameworks based on Merton's jump-diffusion model and Szimayer's independent random time default model, under which closed-form valuation formulas and put-call parity relations for European options are derived. Hedging strategies for EPS products are analysed under jump and default risks. While static hedging remains effective in the absence of default, counterparty default risk leads to residual losses that cannot be fully hedged. These losses are quantified and used to define default-adjusted initial premiums under both Black-Scholes and jump-diffusion settings. Numerical results illustrate the effects of jump characteristics and default intensity on hedging costs and premiums, highlighting the importance of incorporating crisis and credit risks in EPS pricing and risk management.
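
As background for the closed-form valuation mentioned in this abstract, the classical Merton jump-diffusion price of a European call can be sketched as a Poisson-weighted sum of Black-Scholes prices. This is the textbook default-free formula with lognormal jumps, not the paper's EPS or default-adjusted formulas; all parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, t, r, sigma):
    """Black-Scholes European call price."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def merton_call(s, k, t, r, sigma, lam, mu_j, delta, n_terms=60):
    """Merton (1976) call price: jump intensity lam, log-jump size N(mu_j, delta^2)."""
    kappa = math.exp(mu_j + 0.5 * delta ** 2) - 1.0  # expected relative jump size
    lam_p = lam * (1.0 + kappa)
    price = 0.0
    for n in range(n_terms):
        # Poisson weight for n jumps under the adjusted intensity
        weight = math.exp(-lam_p * t) * (lam_p * t) ** n / math.factorial(n)
        sigma_n = math.sqrt(sigma ** 2 + n * delta ** 2 / t)
        r_n = r - lam * kappa + n * math.log(1.0 + kappa) / t
        price += weight * bs_call(s, k, t, r_n, sigma_n)
    return price


def put_from_parity(call, s, k, t, r):
    """Default-free put-call parity: P = C - S + K * exp(-r t)."""
    return call - s + k * math.exp(-r * t)


# Hypothetical inputs: at-the-money, one year, occasional negative jumps.
call_bs = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
call_jd = merton_call(100.0, 100.0, 1.0, 0.05, 0.2, lam=0.5, mu_j=-0.1, delta=0.15)
put_jd = put_from_parity(call_jd, 100.0, 100.0, 1.0, 0.05)
```

With the jump intensity set to zero the series collapses to the Black-Scholes price, which gives a quick sanity check on the implementation.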

2605.25438 2026-05-26 econ.GN q-fin.EC

Coding Beyond Your Training: Claude Code and the Technological Frontier of Software Developers

Alexander Quispe

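AI summary: Uses a doubly robust estimator to analyse the staggered rollout of Claude Code; finds that the AI coding assistant significantly increases developers' monthly commits, repositories contributed to, languages used, and language entropy, and that the cumulative-languages effect grows over time, suggesting that AI lowers the barrier to switching technologies.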

English abstract

We study whether adoption of an AI coding assistant causally expands the technological frontier of individual software developers. We exploit the staggered rollout of Claude Code across GitHub between May 2025 and January 2026 in a panel of 5,838 developers observed monthly over 28 months, with treatment defined by the developer's first Claude-co-authored commit and not-yet-treated developers as controls. Using the doubly robust Callaway and Sant'Anna (2021) estimator, we find positive and significant effects on monthly commits (+41), repositories contributed to (+1.5), distinct programming languages used (+0.83), Shannon language entropy (+0.14), newly-used languages (+0.31), and cumulative lifetime languages (+0.51). The cumulative-languages effect grows with time since adoption, matching a Bayesian-learning model in which AI provides free signals about unfamiliar technologies and lowers the switching barrier. Results are robust to two stricter activity filters. The estimates document a sharp, persistent shift in developer behavior coincident with AI adoption; identification limits prevent a strict causal claim and we outline an agenda for cleaner tests.
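
The "Shannon language entropy" outcome in this abstract can be illustrated with a short sketch. The abstract does not say what is counted (commits, files, or bytes) or which logarithm base is used; the version below assumes one language label per commit and natural logarithms, and the sample months are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(languages):
    """Shannon entropy (in nats) of the empirical language distribution."""
    counts = Counter(languages)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical months of commits, each tagged with its main language.
single_language_month = ["Python"] * 4
mixed_month = ["Python", "Python", "Rust", "TypeScript"]

h_single = shannon_entropy(single_language_month)
h_mixed = shannon_entropy(mixed_month)
```

Entropy is zero when all activity is in one language and grows as activity spreads more evenly over more languages, which is why it serves as a measure of technological breadth.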

2605.25392 2026-05-26 q-fin.MF

One Currency, Two Forward Prices: The Onshore-Offshore Renminbi Puzzle

Samuel Drapeau, Peng Luo, Xuan Tao, Tan Wang

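AI summary: Builds a joint spot-forward equilibrium model with transaction costs and segmented supply to study why onshore (CNY) and offshore (CNH) Renminbi forward prices differ persistently and by an economically large amount, and finds that random offshore liquidity stress, modeled as a jump in trading costs, can explain the discrepancy.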

English abstract

Partially convertible economies face a market-design problem: trade integration, cross-border investment, and domestic balance-sheet exposure increase the demand for currency hedging before full financial integration is complete. China adopted a distinctive architecture for this problem by fostering a deliverable offshore Renminbi market (CNH) alongside the segmented onshore market (CNY), rather than relying only on non-deliverable forwards. This creates two venues for closely related claims on the same currency. Spot prices are tightly linked, yet CNY and CNH forwards display a persistent and economically large discrepancy. We study that discrepancy in a joint equilibrium model for spot and forward trading with transaction costs and segmented supply. In the benchmark case with common constant supply and deterministic costs, spot parity implies a forward differential with the wrong sign relative to the data. Random offshore stress, modeled as a jump in trading costs, overturns this benchmark while preserving tight spot parity. The model yields a semi-explicit representation in the CNY/CNH application and a calibration of the observed forward discrepancy in terms of the market-implied likelihood and severity of offshore liquidity stress.

2605.22892 2026-05-26 q-fin.RM cs.LG

Is TabPFN the Silver Bullet for Insurance Pricing?

Bruno Deprez, Wouter Verbeke, Tim Verdonck

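AI summary: Presents a first empirical evaluation of TabPFN for motor insurance pricing against GLM and XGBoost; finds that its performance is inconsistent, its inference times are long, and it is sensitive to the size of the in-context training set, so it cannot currently replace traditional actuarial methods.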

English abstract

Modelling claim frequency and severity for non-life insurance pricing predominantly relies on generalised linear models, with gradient-boosted machines as the leading machine learning alternative. Tabular foundation models (TFMs) present a fundamentally different inference paradigm. By pre-training on large collections of synthetic datasets, TFMs enable inference on new data through in-context learning, without any dataset-specific fitting or hyperparameter tuning. This paper presents a first empirical evaluation of TabPFN for motor insurance pricing, benchmarking it against GLM and XGBoost on two publicly available MTPL datasets. Our results show that TabPFN does not consistently outperform established baselines, exhibits substantially longer inference times, and is sensitive to the size of the in-context training set. While tabular foundation models represent a promising direction, particularly in data-scarce settings, their current performance does not offer a viable replacement for established actuarial methods.

2605.20137 2026-05-26 q-fin.GN

A Three-Variable Benchmark for Post-GFC Covered Interest Parity Deviations

Useong Shin

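AI summary: Proposes a public daily-frequency benchmark that models post-GFC government-bond CIP deviations with three lagged public state variables (NFCI, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope), and verifies that it captures a persistent background component rather than short-maturity quarter-end spikes or spurious level correlation.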

English abstract

This paper proposes a public daily-frequency benchmark for post-GFC government-bond CIP deviations. Although CIP deviations are observed daily, the literature lacks a canonical benchmark for daily regressions comparable to standard factor models in asset pricing. Using G10 plus KRW currency-tenor panels, I show that three lagged public state variables (NFCI, the nominal broad U.S. dollar index, and the Treasury 10-year minus 2-year slope) deliver strong in-sample and leave-one-year-out performance. Cointegration, quarter-end, and aggregation-difference diagnostics suggest that the benchmark captures a persistent background component rather than short-maturity quarter-end spikes or spurious level correlation.

2605.18745 2026-05-26 stat.ML cs.LG cs.NA math.NA math.PR q-fin.MF stat.CO

SURGE: Approximation and Training Free Particle Filter for Diffusion Surrogate

Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu

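AI summary: Proposes an unbiased particle filtering method based on diffusion models, which reweights and resamples diffusion trajectories by Sequential Monte Carlo to fuse observational data with model simulations and continuously correct state estimates.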

Comments
accepted by ICML 2026
English abstract

Data assimilation (DA) addresses the problem of sequentially estimating the state of a dynamical system from noisy and incomplete observations. In this work, we employ a diffusion model as a world model to simulate and predict the system's dynamics. Recently, score-based diffusion models have learned global diffusion priors that effectively model (stochastic) dynamics, revealing strong potential for data assimilation. In this paper, we investigate how information from noisy observations can be incorporated to enable continuous correction and refinement of the predicted system state when using a diffusion prior. Motivated by particle filtering methods, we represent the posterior distribution using a set of particles. After receiving noisy observations, the diffusion model is guided using the observation likelihood to steer the generation process toward observation-consistent states. Nevertheless, such guidance does not guarantee sampling from the true posterior. We therefore employ a Sequential Monte Carlo approach over the diffusion trajectory, viewed as a path measure, to reweight and resample particles, thereby correcting the generation process and ensuring convergence toward the desired posterior distribution. This leads to an unbiased particle filtering method that rigorously fuses observational data with diffusion model simulations.
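
The reweight-and-resample step that this abstract borrows from particle filtering can be sketched generically. The code below is a plain Sequential Monte Carlo update on scalar states with a Gaussian observation model; it is not the SURGE algorithm, which works on diffusion trajectories, and the function name, the 50% effective-sample-size threshold, and all numbers are assumptions made for illustration.

```python
import math
import random


def reweight_and_resample(particles, weights, observation, obs_std, rng):
    """One SMC update: multiply weights by the observation likelihood,
    normalise, and resample when the effective sample size is low.
    Assumes strictly positive input weights and y = x + Gaussian noise."""
    log_w = [math.log(w) - 0.5 * ((observation - x) / obs_std) ** 2
             for x, w in zip(particles, weights)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lw - shift) for lw in log_w]
    total = sum(unnorm)
    new_w = [u / total for u in unnorm]

    ess = 1.0 / sum(w * w for w in new_w)  # effective sample size
    if ess < 0.5 * len(particles):
        # Multinomial resampling, then reset to uniform weights
        particles = rng.choices(particles, weights=new_w, k=len(particles))
        new_w = [1.0 / len(particles)] * len(particles)
    return particles, new_w


# Hypothetical setup: prior N(0, 1), one observation y = 2.0 with noise std 0.5.
rng = random.Random(0)
prior = [rng.gauss(0.0, 1.0) for _ in range(5000)]
uniform = [1.0 / len(prior)] * len(prior)
particles, weights = reweight_and_resample(prior, uniform, 2.0, 0.5, rng)
posterior_mean = sum(x * w for x, w in zip(particles, weights))
```

For this conjugate Gaussian example the exact posterior mean is 1.6, so the particle estimate can be checked against a known answer.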

2604.25123 2026-05-26 q-fin.CP q-fin.MF

Implied Volatility Expansions for VIX Options in Forward Variance Models

Ying Liao, Ankush Agarwal, Florian Bourgey

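AI summary: Within the class of forward variance models, uses weak-approximation techniques to derive closed-form expansions for the implied volatility of VIX options, enabling fast and accurate calibration.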

English abstract

We develop closed-form expansions for the implied volatility of VIX options within the class of forward variance models. Our approach builds on weak-approximation techniques for VIX option prices and yields explicit implied volatility expansions with computable correction terms. The resulting formulas enable fast and accurate calibration without requiring numerical root-finding using option prices. We illustrate the performance of the proposed expansions in both standard and rough Bergomi-type models, as well as in mixed specifications, and demonstrate their accuracy through numerical experiments.
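
To make concrete the numerical root-finding step that such expansions are meant to avoid, here is a minimal sketch of implied volatility by bisection. It assumes the common convention of quoting an option on a futures value through the Black (1976) formula; the paper's own expansions are not reproduced, and the inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(forward, strike, maturity, sigma, discount=1.0):
    """Black (1976) call price on a forward or futures value."""
    s = sigma * math.sqrt(maturity)
    d1 = (math.log(forward / strike) + 0.5 * s * s) / s
    d2 = d1 - s
    return discount * (forward * norm_cdf(d1) - strike * norm_cdf(d2))


def implied_vol_bisect(price, forward, strike, maturity, discount=1.0,
                       lo=1e-6, hi=5.0, tol=1e-10):
    """Invert black_call in sigma by bisection (the price is increasing in sigma)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(forward, strike, maturity, mid, discount) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip with hypothetical inputs: futures at 20, strike 22, three months.
target_price = black_call(20.0, 22.0, 0.25, 0.9)
recovered_vol = implied_vol_bisect(target_price, 20.0, 22.0, 0.25)
```

Each inversion needs dozens of price evaluations per quote, which is the cost a closed-form implied volatility expansion removes during calibration.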

2604.00582 2026-05-26 econ.GN q-fin.EC

Green Subsidies and Local Transitions: Evidence from Energy Communities

Akcan Balkir

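AI summary: Uses the geographical variation in tax credits introduced by the Inflation Reduction Act to study the effectiveness and incidence of the renewable energy Investment and Production Tax Credits; finds that the credits significantly increase renewable energy capital and output and create a political feedback loop that supports renewable energy policies.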

Comments
18 pages, 13 figures
English abstract

This paper studies the effectiveness and incidence of the renewable energy Investment and Production Tax Credits. I leverage new geographical variation in these credits, introduced by the Inflation Reduction Act, to test whether renewable energy credits had real economic impacts. Communities with greater tax credits accrued 33% more renewable energy capital and produced 31% more renewable energy compared to similar counties. This suggests elasticities of 1.62 and 6.11 for the Investment and Production Tax Credits respectively. I augment these results using an understudied dataset on planned investment to disentangle preplanned from additional projects. Accounting for inframarginal investment significantly reduces the Investment Tax Credit's investment elasticity to 0.6. After characterizing the supply side responses to these renewable tax credits, I document a new political feedback loop between increased incentives and support for renewable energy policies. Areas with greater tax incentives experienced jumps in support for renewable energy policies, contrary to the Not In My Backyard narrative. Heterogeneity in political responses suggests that the Investment and Production Tax Credits garnered support through two channels: 1) labor market spillovers, with construction wages increasing by 7% in areas with greater tax incentives, and 2) public goods spillovers, with parents across party lines increasing support for renewable energy by 13%.

2404.18029 2026-05-26 q-fin.RM

Value-at-Risk- and Expectile-based Systemic Risk Measures and Second-order Asymptotics: With Applications to Diversification

Bingzhen Geng, Yang Liu, Yimiao Zhao

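AI summary: Proposes two families of systemic risk measures (VaR-based and expectile-based), derives their second-order asymptotic formulas under a multivariate Sarmanov distribution, shows through numerical analysis the advantages of expectile-based measures for tail risk reporting, and applies the results to diversification benefits.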

Comments
Keywords: Asymptotic approximation; Systemic risk; Expectile; Sarmanov distribution; Second-order regular variation; Diversification benefit
AI中文摘要

系统性风险度量在分析极端系统性灾难条件下的个体损失中起着关键作用。本文为系统性风险度量提供了统一的渐近处理。首先,我们将它们分为两类:基于风险价值(VaR)和基于期望分位数(expectile)的系统性风险度量。虽然基于VaR的风险度量已被广泛研究,但在后一类中,我们提出了两种新的系统性风险度量,命名为个体条件期望分位数(ICE)和系统性个体条件期望分位数(SICE),作为边际预期短缺(MES)和系统性预期短缺(SES)的替代。其次,为了刻画一般相互依赖且重尾的风险,我们考虑一个遵循多元Sarmanov分布的损失系统,其共同边际分布表现出二阶正则变化。第三,在此设定下,我们为两类系统性风险度量提供了二阶渐近结果。这些结果扩展了标准的一阶渐近,并允许更精确的尾部逼近。通过数值和分析示例,我们展示了二阶渐近在准确评估系统性风险方面的优越性。我们进一步对基于期望分位数和基于VaR的系统性风险度量进行了全面比较。结果表明,基于期望分位数的度量通常比基于VaR的度量具有更高的渐近精度,强调了前者在报告极端事件和尾部风险方面的潜在优势。作为金融应用,我们利用渐近处理讨论了与各种风险度量相关的分散化效益。最后,我们推广并得到了基于幂函数的广义分位数系统性风险度量的二阶渐近公式。

英文摘要

Systemic risk measures play a crucial role in analyzing individual losses conditional on extreme system-wide disasters. In this paper, we provide a unified asymptotic treatment for systemic risk measures. First, we classify them into two families of Value-at-Risk- (VaR-) and expectile-based systemic risk measures. While VaR-based risk measures have been extensively studied, in the latter family, we propose two new systemic risk measures named the Individual Conditional Expectile (ICE) and the Systemic Individual Conditional Expectile (SICE), as alternatives to Marginal Expected Shortfall (MES) and Systemic Expected Shortfall (SES). Second, to characterize general mutually dependent and heavy-tailed risks, we consider a multivariate loss system following a multivariate Sarmanov distribution with common marginal distributions exhibiting second-order regular variation. Third, within this setting, we provide second-order asymptotic results for both families of systemic risk measures. These results extend standard first-order asymptotics and allow for more accurate tail approximations. Through numerical and analytical examples, we demonstrate the superiority of second-order asymptotics in accurately assessing systemic risk. We further conduct a comprehensive comparison between expectile-based and VaR-based systemic risk measures. The results indicate that expectile-based measures often yield higher asymptotic accuracy than VaR-based ones, emphasizing the former's potential advantages in reporting extreme events and tail risk. As a financial application, we use the asymptotic treatment to discuss the diversification benefits associated with various risk measures. Finally, we extend and obtain the second-order asymptotic formulas for generalized-quantile-based systemic risk measures with power functions.
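As background for the expectile-based family, the sketch below computes a plain sample expectile, which is the building block that conditional measures such as ICE and SICE rest on. It is only an illustration under stated assumptions: the function name `sample_expectile` and the toy loss data are hypothetical, and the code does not implement the paper's ICE or SICE, whose exact definitions are not given in the abstract. The tau-expectile `e` is taken to solve `tau * E[(X - e)+] = (1 - tau) * E[(e - X)+]`, which can be rewritten as an asymmetrically weighted mean and iterated to a fixed point.

```python
def sample_expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Return the tau-expectile of the sample xs, for 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    e = sum(xs) / len(xs)  # the 0.5-expectile is the mean, so start there
    for _ in range(max_iter):
        # Asymmetric weights: tau above the current guess, 1 - tau at or below it.
        weights = [tau if x > e else 1.0 - tau for x in xs]
        new_e = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


# Hypothetical loss sample, used only to exercise the function.
losses = [1.0, 2.0, 3.0, 4.0, 10.0]
mean_loss = sample_expectile(losses, 0.5)
tail_loss = sample_expectile(losses, 0.95)
```

With `tau = 0.5` the expectile coincides with the sample mean; raising `tau` toward 1 moves it into the upper tail, which is the sense in which expectiles serve as tail-risk summaries.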

2605.24878 2026-05-26 q-fin.TR q-fin.MF

Entropy-Regularized Certainty-Equivalent Bellman Policies for Risk-Sensitive Market Making

风险敏感做市中的熵正则化确定性等价贝尔曼策略

Tenghan Zhong

AI总结 针对有限库存风险敏感做市问题,提出精确离散熵正则化贝尔曼算子,证明其收敛速率、性能界及与未正则化最优报价集的一致性。

AI中文摘要

我们研究了一个有限库存风险敏感做市问题,其中做市商控制买卖报价,面临布朗运动中间价风险,并通过具有报价依赖强度的点过程接收流动性获取订单。目标是由指数效用函数与终端和库存惩罚引起的确定性等价。我们引入了一个精确的离散熵正则化贝尔曼算子,该算子将log-sum-exp正则化应用于确定性动作的确定性等价分数,而不是风险中性的一步奖励。这种区别至关重要,因为指数确定性等价不与报价随机化交换。对于时间步长\(h\)和熵参数\(λ\),我们证明了对未正则化连续时间风险敏感值的均匀收敛速度为\(O\bigl(h+λ(1+|\log λ|)\bigr)\)。我们还证明了在新鲜采样松弛实现下,诱导的吉布斯策略的确定性等价性能界,其中报价标记在潜在成交事件时采样,而不是在时间步长内冻结。在哈密顿量关于相关报价坐标的二次增长条件下,这些策略集中在未正则化最优报价集附近。最后,我们证明了一个低成本哈密顿-吉布斯代理满足与精确贝尔曼吉布斯策略相同阶数的确定性等价性能界。在Avellaneda-Stoikov规范中的数值实验支持了离散化误差、熵偏差、策略差距、报价集中度以及精确与代理一致性的预测标度。

英文摘要

We study a finite-inventory risk-sensitive market making problem in which a dealer controls bid and ask quotes, faces Brownian midprice risk, and receives liquidity-taking orders through point processes with quote-dependent intensities. The objective is the certainty equivalent induced by exponential utility with terminal and running inventory penalties. We introduce an exact discrete entropy-regularized Bellman operator that applies log-sum-exp regularization to deterministic-action certainty-equivalent scores, rather than to a risk-neutral one-step reward. This distinction is essential because the exponential certainty equivalent does not commute with quote randomization. For time step \(h\) and entropy parameter \(λ\), we prove uniform convergence to the unregularized continuous-time risk-sensitive value at rate \[ O\bigl(h+λ(1+|\log λ|)\bigr). \] We also prove certainty-equivalent performance bounds for the induced Gibbs policies under a fresh-sampling relaxed implementation, in which quote marks are sampled at potential fill events rather than frozen over a time step. Under a quadratic growth condition on the Hamiltonian in the relevant quote coordinates, these policies concentrate around the unregularized optimal quote set. Finally, we show that a lower-cost Hamiltonian-Gibbs proxy satisfies a certainty-equivalent performance bound of the same order as the exact Bellman Gibbs policy. Numerical experiments in an Avellaneda-Stoikov specification support the predicted scaling for discretization error, entropy bias, policy gap, quote concentration, and exact-versus-proxy consistency.
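The log-sum-exp step mentioned in the abstract can be illustrated on a finite action set. The sketch below is a generic soft maximum with its Gibbs (softmax) policy, not the paper's Bellman operator: the function names `soft_value` and `gibbs_policy` and the score values are hypothetical, and the scores stand in for whatever per-quote certainty-equivalent scores the operator would supply. For a finite set of `n` actions the soft value exceeds the hard maximum by at most `lam * log(n)`, a standard property of log-sum-exp.

```python
import math


def soft_value(scores, lam):
    """Entropy-regularized (log-sum-exp) maximum of scores at temperature lam > 0."""
    m = max(scores)  # subtract the maximum for numerical stability
    return m + lam * math.log(sum(math.exp((s - m) / lam) for s in scores))


def gibbs_policy(scores, lam):
    """Gibbs probabilities proportional to exp(score / lam)."""
    m = max(scores)
    weights = [math.exp((s - m) / lam) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for three candidate quotes.
scores = [0.10, 0.12, 0.05]
lam = 0.01
value = soft_value(scores, lam)
policy = gibbs_policy(scores, lam)
```

As `lam` shrinks, `value` approaches `max(scores)` and `policy` concentrates on the best-scoring action, which mirrors in a toy setting the concentration behaviour the abstract describes.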

2605.24730 2026-05-26 econ.TH q-fin.GN

Private Languages

私人语言

Jeremy Bertomeu

AI总结 研究发送者拥有私人锚点时的策略沟通,发现报告成本导致连续信息传递,可改善或扭曲信息,并解释组织为何依赖非正式渠道。

AI中文摘要

策略沟通通常依赖于发送者观察到但接收者未观察到的锚点。分析师可能根据专有估值模型报告,审计师根据内部评分,经理根据会计估计,机构根据自身标准。我研究了一个发送者-接收者博弈,其中报告偏离此类私人观察到的锚点是有成本的。锚点异质性改变了沟通的几何结构。私人锚定报告不依赖于分区,而是产生连续的消息变化,因为不同的发送者发现不同的报告成本高昂。这种机制可以改善信息传递,但也可能将报告拉向有噪声的私人锚点。我表明:(i) 小的正报告成本可以使沟通接近完全揭示,即使零成本将模型恢复为廉价谈话;(ii) 无信息的锚点可以通过策略性扭曲传递信息。锚定报告和廉价谈话信息可以作为内生的硬信息和软信息共存,但在足够低的不一致下,所有方都偏好仅使用廉价谈话,这解释了为什么组织可能完全依赖非正式渠道。

英文摘要

Strategic communication often relies on anchors observed by the sender but not by the receiver. An analyst may report against a proprietary valuation model, an auditor against an internal score, a manager against an accounting estimate, or an institution against its own standard. I study a sender-receiver game in which reports are costly to move away from such privately observed anchors. Anchor heterogeneity changes the geometry of communication. Rather than relying on partitions, privately anchored reporting generates continuous variation in messages because different senders find different reports costly to make. This mechanism can improve information transmission, but it can also pull reports toward noisy private anchors. I show that (i) small positive reporting costs can make communication approach full revelation, even though zero costs return the model to cheap talk, and (ii) uninformative anchors can transmit information through strategic distortions. Anchored reports and cheap-talk messages can coexist as endogenous hard and soft information, but cheap talk alone is preferred by all parties under sufficiently low misalignment, explaining why organizations may rely exclusively on informal channels.

2605.12764 2026-05-26 q-fin.MF cs.LG stat.ML

Yield Curves Dynamics Using Variational Autoencoders Under No-arbitrage

无套利条件下使用变分自编码器的收益率曲线动力学

Fusheng Luo, Hélyette Geman

AI总结 提出一种物理信息生成框架,通过两阶段架构(学生t条件变分自编码器+动态水平注入和神经随机微分方程)解决深度学习统计灵活性与固定收益理论约束的冲突,在多个主权货币上显著降低预测误差并实现无套利。

Comments
This is the full script (version 2) of our paper, which is awaiting submission to financial journals/conferences, after modifying and double-checking the reference lists
AI中文摘要

本文引入了一个物理信息生成框架,解决了深度学习统计灵活性与固定收益建模严格理论约束之间的根本冲突。我们证明,标准生成模型和无约束统计外推在预测跨多种宏观经济体制的期限结构时,会遭受“流形崩溃”和严重的套利违规。为克服这一问题,我们提出了一种两阶段架构。首先,具有动态水平注入的学生t条件变分自编码器(CVAEsT+LS)提取了一个稳健、重尾的期限结构流形,有效解耦了宏观经济形状动态与绝对基准利率。其次,潜在动态演化由连续时间神经随机微分方程(SDE)控制,并受到无套利偏微分方程(PDE)的严格惩罚。跨多个主权货币(美元、英镑、日元)的实证结果证实,我们的协同方法大幅降低了样本外预测误差——实现了卓越的6.58个基点平均期限RMSE——并成功克服了经典HJM模型在极端环境中表现出的巨大平行漂移和零下限违规。此外,通过相空间向量场分析,我们展示了该模型在无监督宏观经济体制检测和高质量连续时间情景生成方面的卓越能力。最终,本研究为期限结构建模提供了一个高度可扩展、数学上合理的演化引擎。

英文摘要

This paper introduces a physics-informed generative framework that resolves the fundamental conflict between the statistical flexibility of deep learning and the rigorous theoretical constraints of fixed-income modeling. We demonstrate that standard generative models and unconstrained statistical extrapolations suffer from "manifold collapse" and severe arbitrage violations when forecasting term structures across diverse macroeconomic regimes. To overcome this, we propose a two-stage architecture. First, a Student-t Conditional Variational Autoencoder with Dynamic Level Injection (CVAEsT+LS) extracts a robust, heavy-tailed term structure manifold, effectively decoupling macroeconomic shape dynamics from absolute base rates. Second, the latent dynamic evolution is governed by a continuous-time Neural Stochastic Differential Equation (SDE) strictly penalized by a No-Arbitrage Partial Differential Equation (PDE). Empirical results across multiple sovereign currencies (USD, GBP, JPY) confirm that our synergistic approach drastically reduces out-of-sample forecasting errors, achieving an exceptional 6.58 bps Mean Tenor RMSE, and successfully overcomes the massive parallel drift and zero-lower-bound violations exhibited by the classical HJM model in extreme environments. Furthermore, through phase space vector field analysis, we demonstrate the model's superior capability in unsupervised macroeconomic regime detection and high-quality continuous-time scenario generation. Ultimately, this research provides a highly scalable, mathematically sound evolutionary engine for term structure modeling.

2605.12250 2026-05-26 q-fin.GN

The P behind Q: Empirical Evidence from Physical Drift in Put-Call Parity

Q背后的P:来自看跌-看涨平价中物理漂移的经验证据

Useong Shin

AI总结 通过研究SPX和RUT指数期权中的carry缺口,发现物理漂移项r μ-hat τ改善了平价拟合,表明物理漂移影响的是执行风险中性平价的资本过程而非期权收益。

AI中文摘要

看跌-看涨平价是一个终端收益恒等式,但其执行需要占用资本。我研究了SPX和RUT指数期权中的carry缺口,即期权隐含贴现因子与OIS贴现因子之间的年化楔形。报价平价被紧密压缩,而合成交易远期通道留下了系统性楔形。我将这一楔形解释为有限套利资本下的执行溢价。一个保持漂移的GBM项r μ-hat τ改善了样本内和留一年外拟合,尤其是在SPX中。证据表明,物理漂移进入的不是期权收益,而是执行风险中性平价的过程。

英文摘要

Put-call parity is a terminal-payoff identity, but its enforcement is capital-using. I study the carry gap, the annualized wedge between option-implied and OIS discount factors, in SPX and RUT index options. Quoted parity is tightly compressed, while the synthetic-traded forward channel leaves a systematic wedge. I interpret this wedge as an implementation premium under finite arbitrage capital. A drift-preserving GBM term, r μ-hat τ, improves in-sample and leave-one-year-out fit, especially in SPX. The evidence suggests that physical drift enters not the option payoffs but the process that enforces risk-neutral parity.

2604.19605 2026-05-26 q-fin.GN

Tuning in to Frequencies: How Global Assets Align with U.S. Put-Call Parity Residuals

调谐频率:全球资产如何与美国看跌-看涨平价残差对齐

Useong Shin

AI总结 本文检验SPX和RUT的看跌-看涨平价残差,发现在以美国为中心的控制变量之外,加入全球资产(IEFA、IGOV、IAU)仍能改善拟合,支持有限资本下的平价执行反映物理测度投资机会的观点。

AI中文摘要

看跌-看涨平价在最终收益上是风险中性的,但其执行是路径依赖且消耗资本的。我检验了SPX和RUT的carry缺口是否由基于OIS的融资、波动率、交易摩擦和金融状况变量解释,或者也由残余的外部选择信息解释。在加入以美国为中心的控制变量后,添加IEFA、IGOV和IAU改善了样本内和留一年外的拟合度。这些改善在广泛美元中性化、替代区块、PCA、残差化和嵌套期限选择下仍然存在。结果支持简化形式的P-Q对齐:有限资本下的平价执行反映了物理测度的投资机会,而非收益层面的无套利失败。

英文摘要

Put-call parity is risk-neutral at terminal payoff, but its enforcement is path-dependent and capital-using. I test whether the SPX and RUT carry gap is explained by OIS-based funding, volatility, trading-friction, and financial-condition variables, or also by residual outside-option information. Adding IEFA, IGOV, and IAU improves in-sample and leave-one-year-out fit after U.S.-centered controls. Gains survive broad-dollar neutralization, alternative blocks, PCA, residualization, and nested horizon selection. Results support reduced-form P-Q alignment: finite-capital parity enforcement reflects physical-measure investment opportunities, not payoff-level no-arbitrage failure.

2604.19604 2026-05-26 q-fin.GN q-fin.CP

The Cost of a Free Lunch: Evidence from U.S. Derivatives Markets

免费午餐的成本:来自美国衍生品市场的证据

Useong Shin

AI总结 利用S&P 500和Russell 2000期权的分钟级NBBO数据,通过期权隐含贴现因子与OIS曲线的比较构建年化持有成本缺口,并基于波动率乘以sqrt(tau)的路径风险项解释其与实施风险、交易摩擦和金融状况的关系。

AI中文摘要

看跌-看涨平价是一个终端收益恒等式;相对于交易期货的报价残差接近于零。然而,强制平价是路径依赖的,使套利者面临每日结算、保证金和有限资本。利用S&P 500和Russell 2000期权的分钟级NBBO数据,我提取期权隐含贴现因子,将其与OIS曲线进行比较,并构建年化持有成本缺口。一个以波动率乘以sqrt(tau)的路径风险项为中心的简化形式规范将持有成本缺口与实施风险、交易摩擦和金融状况联系起来,系数符号在留一年交叉验证中保持稳定。持有成本缺口是一个在价格空间中不可见但在持有成本空间中系统性的实施楔子。

英文摘要

Put-call parity is a terminal-payoff identity; quoted residuals against traded futures are near zero. Yet enforcing parity is path-dependent, exposing arbitrageurs to daily settlement, margin, and finite capital. Using minute-level NBBO data on S&P 500 and Russell 2000 options, I extract option-implied discount factors, compare them with the OIS curve, and construct an annualized carry gap. A reduced-form specification centered on a volatility times sqrt(tau) path-risk term links the carry gap to implementation risk, trading frictions, and financial conditions, with coefficient signs stable across leave-one-year-out validation. The carry gap is an implementation wedge invisible in price space but systematic in carry space.
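To make the construction concrete, the sketch below backs out an option-implied discount factor from put-call parity at two strikes and annualizes its wedge against an OIS discount factor. It is a minimal illustration under stated assumptions, not the paper's procedure: it assumes European options with a common expiry and parity in the form C - P = D (F - K), and the function names, the log-rate definition of the gap, and all numbers are hypothetical. Differencing parity across two strikes removes the forward F, so D follows from prices and strikes alone.

```python
import math


def implied_discount_factor(c1, p1, k1, c2, p2, k2):
    """Discount factor D implied by parity C - P = D * (F - K) at strikes k1 != k2."""
    # (C1 - P1) - (C2 - P2) = D * (k2 - k1): the forward F cancels out.
    return ((c1 - p1) - (c2 - p2)) / (k2 - k1)


def carry_gap(d_option, d_ois, tau):
    """Annualized wedge between option-implied and OIS continuously compounded rates."""
    return (-math.log(d_option) + math.log(d_ois)) / tau


# Hypothetical quotes consistent with D = 0.99 and F = 5000 at a three-month expiry.
d_opt = implied_discount_factor(c1=150.0, p1=51.0, k1=4900.0,
                                c2=40.0, p2=139.0, k2=5100.0)
gap = carry_gap(d_opt, d_ois=0.9905, tau=0.25)
```

A positive `gap` here means the option-implied rate sits above the OIS rate; with real minute-level NBBO data one would also need to handle bid-ask spreads and quote synchronization, which this sketch ignores.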

2603.14438 2026-05-26 q-fin.MF

Curved Greeks: A Geometric Layer for Option P&L Adjustments

曲率希腊值:期权P&L调整的几何层

Pedro Pablo Pérez Velasco, Mengjue Lu, Daniel Arrieta

AI总结 提出一个局部、模型无关的框架,通过协变Hessian使二次P&L项坐标不变,并校准连接以匹配交易台目标,同时纳入执行摩擦的二次成本模型。

Comments
34 pages, 5 figures, 4 tables
AI中文摘要

短期期权组合管理依赖于少量风险因子的P&L展开。在实践中,二次项和常见的交易台调整(微笑修正、执行成本附加)取决于所选因子坐标,因此在现货、远期和对数远期参数化之间移动时,预测的二阶P&L会发生变化。我们提出了一个局部、模型无关的框架,使二次项坐标不变。通常的Hessian被由仿射连接定义的协变Hessian替代,产生一个不变的二次预测器。该连接被校准以匹配二次P&L的交易台目标(微笑效应的Vanna-Volga或原则上对实现P&L的局部拟合),同时保持一阶对冲希腊值不变。执行摩擦通过对冲交易的二次成本模型进入。与对冲比率结合,这引入了对因子移动的等价二次惩罚,使成本的投资组合净额明确,并提供局部流动性感知的二阶敏感性和再平衡方向。校准简化为具有清晰可识别性条件的小型线性系统。两个FX障碍案例研究(EURUSD、USDTRY)说明了工作流程,我们简要概述了对其他二次惩罚(风险归一化、情景/缺口项以及xVA/资本附加)的扩展。

英文摘要

Short-horizon option book management relies on P&L expansions in a small set of risk factors. In practice, the quadratic term and common desk adjustments (smile corrections, execution cost add-ons) depend on the chosen factor coordinates, so predicted second-order P&L can change when moving between spot, forward, and log-forward parameterizations. We propose a local, model-agnostic framework that makes the quadratic term coordinate invariant. The usual Hessian is replaced by a covariant Hessian defined by an affine connection, yielding an invariant quadratic predictor. The connection is calibrated to match a desk target for quadratic P&L (Vanna-Volga for smile effects or, in principle, a local fit to realized P&L) while leaving first-order hedge Greeks unchanged. Execution frictions enter through a quadratic cost model for hedge trades. Combined with hedge ratios, this induces an equivalent quadratic penalty on factor moves, makes portfolio netting of costs explicit, and provides local liquidity-aware second-order sensitivities and rebalancing directions. Calibration reduces to small linear systems with clear identifiability conditions. Two FX barrier case studies (EURUSD, USDTRY) illustrate the workflow, and we briefly sketch extensions to other quadratic penalties (risk normalization, scenario/gap terms, and xVA/capital add-ons).
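For readers unfamiliar with the terminology, the following is the standard differential-geometric definition of a covariant Hessian, together with the one-dimensional chain rule that shows why the ordinary second derivative depends on the coordinate choice. The symbols \(V\) (book value), \(x\) and \(y\) (two factor coordinates, such as spot and log-forward), and \(\Gamma^k_{ij}\) (connection coefficients) are notation assumed here for illustration; the abstract does not state the paper's own notation or calibration.

\[
(\nabla^2 V)_{ij} = \partial_i \partial_j V - \Gamma^k_{ij}\,\partial_k V,
\qquad
\frac{\partial^2 V}{\partial x^2} = \left(\frac{dy}{dx}\right)^{2} \frac{\partial^2 V}{\partial y^2} + \frac{d^2 y}{dx^2}\,\frac{\partial V}{\partial y}.
\]

The second identity shows that a reparameterization \(y(x)\) adds a first-derivative term to the plain Hessian; the connection term in the first expression is what absorbs that extra piece so that the quadratic form transforms as a tensor.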

2603.02946 2026-05-26 q-fin.MF cs.NA math.NA math.PR

Fast simulation of Volterra processes using random Fourier features with application to the log-stationary fractional Brownian motion

使用随机傅里叶特征快速模拟Volterra过程及其在对数平稳分数布朗运动中的应用

Othmane Zarhali, Nicolas Langrené

AI总结 提出基于随机傅里叶特征(RFF)的Volterra过程快速模拟框架,通过核的谱表示和哈密顿蒙特卡洛采样实现高效近似,并应用于平稳分数布朗运动(S-fBM)核的Volterra过程。

Comments
56 pages, 9 figures
AI中文摘要

开发了一种基于随机傅里叶特征(RFF)近似核的随机Volterra过程快速模拟框架。在回顾Volterra过程的主要性质和现有数值模拟方法后,引入了一种依赖于核谱表示的加速方案。特别关注使用哈密顿蒙特卡洛从核谱密度中采样,其效率和稳定性使其比其他采样方法更为便利。为所提方法建立了定量保证,包括矩估计和强误差界。进一步将该方法与文献中常用的指数和(sum of exponentials)核近似进行比较,强调了本框架的更广泛通用性。作为主要应用,研究了与平稳分数布朗运动(S-fBM)核相关的Volterra过程。使用超几何函数导出了闭式谱密度表示,建立了正定性条件,并提供了该设置下RFF近似的显式截断和蒙特卡洛误差界。一维和二维数值实验说明了核近似的准确性、模型参数的可靠恢复以及加速模拟方案在计算效率以及弱误差和强误差性能方面的竞争力。

英文摘要

A fast simulation framework for stochastic Volterra processes based on Random Fourier Features (RFF) approximation of the kernel is developed. After recalling the main properties of Volterra processes and reviewing existing numerical simulation methods, an accelerated scheme is introduced that relies on a spectral representation of the kernel. Particular attention is devoted to sampling from the kernel spectral density using Hamiltonian Monte Carlo, whose efficiency and stability make it more convenient than alternative sampling procedures. Quantitative guarantees for the proposed method are established, including moment estimates and strong error bounds. The approach is further compared with the sum-of-exponentials kernel approximation commonly used in the literature, emphasizing the broader generality of the present framework. As a primary application, Volterra processes associated with the Stationary fractional Brownian Motion (S-fBM) kernel are investigated. A spectral density representation is derived in closed form using hypergeometric functions, a condition for positive definiteness is established, and explicit truncation and Monte Carlo error bounds are provided for the RFF approximation in this setting. Numerical experiments in dimensions one and two illustrate the accuracy of the kernel approximation, the reliable recovery of model parameters, and the competitiveness of the accelerated simulation scheme in terms of computational efficiency and both weak and strong error performance.
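The random Fourier feature idea behind the scheme can be shown on a simpler kernel. The sketch below approximates a one-dimensional Gaussian kernel, whose spectral density is itself Gaussian and can be sampled directly; this is a deliberate substitution for illustration. The paper's S-fBM kernel has a hypergeometric spectral density sampled with Hamiltonian Monte Carlo, neither of which is reproduced here, and the names `make_rff` and `rff_kernel` are hypothetical.

```python
import math
import random


def make_rff(num_features, length_scale, rng):
    """Build a random feature map for the kernel exp(-(x - y)**2 / (2 * length_scale**2))."""
    # By Bochner's theorem the frequencies follow the kernel's spectral density,
    # which for this Gaussian kernel is N(0, 1 / length_scale**2).
    omegas = [rng.gauss(0.0, 1.0 / length_scale) for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(w * x + b) for w, b in zip(omegas, phases)]

    return features


def rff_kernel(features, x, y):
    """Approximate the kernel value as an inner product of random features."""
    return sum(a * b for a, b in zip(features(x), features(y)))


rng = random.Random(0)  # fixed seed so the run is reproducible
features = make_rff(num_features=20000, length_scale=0.5, rng=rng)
approx = rff_kernel(features, 0.3, 0.7)
exact = math.exp(-(0.3 - 0.7) ** 2 / (2 * 0.5 ** 2))
```

The Monte Carlo error of `approx` decays like one over the square root of the number of features, which is why the abstract pairs truncation bounds with Monte Carlo error bounds for the S-fBM case.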

2509.07718 2026-05-26 q-fin.MF

Hedging Options on Asset Portfolios against Just One Underlying Asset in the Presence of Transaction Costs

存在交易成本时仅用单一标的资产对冲资产组合期权

Erina Nanyonga, Matt Davison

AI总结 本文研究在交易成本下,用与标的资产相关的低成本资产对冲资产组合期权,通过模拟数据计算风险调整值以决定对冲资产选择。

Journal ref
Research in Mathematics, 13(1), 2026
Comments
14 pages
AI中文摘要

期权是关于标的资产价值的或有要求权。Black-Scholes公式提供了在风险中性环境下对这些期权定价的路线图,其合理性源于delta对冲策略,即对标的资产持有适当规模的反向头寸。然而,如果标的资产交易成本高昂怎么办?或许用另一种相关但交易成本更低的资产进行对冲更好。本研究考虑在以下情形中的这一问题:期权写在包含$α$份资产$S_{t_1}$和$(1-α)$份与$S_{t_1}$相关的另一证券$S_{t_2}$的投资组合上。我们假设仅用$S_{t_1}$或$S_{t_2}$之一进行对冲。在$α=0$或$1$的情况下,该模型可涵盖期权标的资产为单一资产时,用“正确”(标的)资产或“错误”(相关但不同)资产进行对冲的情形。我们使用不同的交易间隔、相关系数$ρ$和交易成本,在模拟数据上对冲投资组合。计算风险调整值($RAV$)作为风险与收益度量,以决定何时交易$S_{t_1}$或$S_{t_2}$。基于$RAV$的结论表明,两种资产的市场风险价格和交易成本的大小是决策的关键。从我们的结果来看,当$ρ$非常高且两种资产的交易成本都相当低时,可以选择交易错误资产。

英文摘要

Options are contingent claims regarding the value of underlying assets. The Black-Scholes formula provides a road map for pricing these options in a risk-neutral setting, justified by a delta hedging argument in which countervailing positions of appropriate size are taken in the underlying asset. However, what if an underlying asset is expensive to trade? It might be better to hedge with a different but related asset that is cheaper to trade. This study considers this question in a setting in which the option is written on a portfolio containing $α$ shares of one asset $S_{t_1}$ and $(1-α)$ shares of another security $S_{t_2}$ correlated with $S_{t_1}$. We suppose that the option is hedged against only one of $S_{t_1}$ or $S_{t_2}$. In the case of $α=0$ or $α=1$, this model covers the case where an option on one asset is hedged against either the "right" (underlying) asset or the "wrong" (related, different) asset. We hedge our portfolio on simulated data using varying trading intervals, correlation coefficients $ρ$, and transaction costs. We calculate risk-adjusted values ($RAV$) as risk and return measures to decide when to trade $S_{t_1}$ or $S_{t_2}$. The conclusions based on $RAV$ show that the size of the market price of risk and that of transaction costs on both assets are key to the hedging decision. According to our results, one can opt to trade the wrong asset when $ρ$ is very high and transaction costs on either asset are reasonably small.

2506.00762 2026-05-26 math.PR q-fin.MF

Markovian projections for functionals of Itô semimartingales with jumps

带跳跃的Itô半鞅泛函的马尔可夫投影

Martin Larsson, Shukun Long

AI总结 研究带跳跃的Itô半鞅的马尔可夫投影的存在性,将Brunick和Shreve的连续情形结果推广到带跳跃的一般情形。

AI中文摘要

给定一个Itô半鞅$X$,其马尔可夫投影是一个Itô半鞅$\widehat{X}$,具有马尔可夫微分特征,并且与$X$的一维边际分布相匹配。甚至可以要求这两个过程的某些泛函具有相同的固定时间边际分布,代价是增强$\widehat{X}$的微分特征,但仍然是马尔可夫意义上的。在连续情形下,马尔可夫投影存在性的最终结果由Brunick和Shreve [MR3098443] 获得。在本文中,我们将他们的结果推广到带跳跃的Itô半鞅的完全一般设定。

英文摘要

Given an Itô semimartingale $X$, its Markovian projection is an Itô semimartingale $\widehat{X}$, with Markovian differential characteristics, that matches the one-dimensional marginal laws of $X$. One may even require certain functionals of the two processes to have the same fixed-time marginals, at the cost of enhancing the differential characteristics of $\widehat{X}$ but still in a Markovian sense. In the continuous case, the definitive result on existence of Markovian projections was obtained by Brunick and Shreve [MR3098443]. In this paper, we extend their result to the fully general setting of Itô semimartingales with jumps.

2505.07078 2026-05-26 q-fin.TR cs.AI cs.CE

Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?

基于LLM的金融投资策略能否长期跑赢市场?

Weixian Waylon Li, Hyeonjun Kim, Mihai Cucuringu, Tiejun Ma

AI总结 提出FINSABER回测框架,在更长时间和更大股票池上评估基于LLM的择时策略,发现其优势在长期和广泛截面下显著下降,且在牛熊市中表现不佳。

Comments
KDD 2026, Datasets & Benchmarks Track
AI中文摘要

大型语言模型(LLM)最近被用于资产定价任务和股票交易应用,使AI代理能够从非结构化金融数据中生成投资决策。然而,大多数对LLM择时投资策略的评估都是在狭窄的时间范围和有限的股票池中进行的,由于幸存者偏差和数据窥探偏差,其有效性被夸大。我们通过提出FINSABER(一个在更长时间段和更大标的池中评估择时策略的回测框架),批判性地评估其泛化能力和稳健性。跨越二十年和100多只标的的系统回测表明,先前报告的LLM优势在更广泛的截面和更长期的评估下显著恶化。我们的市场状态分析进一步表明,LLM策略在牛市中过于保守,表现不及被动基准,在熊市中过于激进,导致重大损失。这些发现强调了开发能够优先考虑趋势检测和市场状态感知的风险控制,而不仅仅是增加框架复杂性的LLM策略的必要性。

英文摘要

Large Language Models (LLMs) have recently been leveraged for asset pricing tasks and stock trading applications, enabling AI agents to generate investment decisions from unstructured financial data. However, most evaluations of LLM timing-based investing strategies are conducted on narrow timeframes and limited stock universes, overstating effectiveness due to survivorship and data-snooping biases. We critically assess their generalizability and robustness by proposing FINSABER, a backtesting framework evaluating timing-based strategies across longer periods and a larger universe of symbols. Systematic backtests over two decades and 100+ symbols reveal that previously reported LLM advantages deteriorate significantly under a broader cross-section and a longer-term evaluation. Our market regime analysis further demonstrates that LLM strategies are overly conservative in bull markets, underperforming passive benchmarks, and overly aggressive in bear markets, incurring heavy losses. These findings highlight the need to develop LLM strategies that are able to prioritise trend detection and regime-aware risk controls over mere scaling of framework complexity.

2409.08379 2026-05-26 cs.SE cs.AI econ.GN q-fin.EC

The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot

大型语言模型对开源创新的影响:来自GitHub Copilot的证据

Doron Yeverechyahu, Raveesh Mayya, Gal Oestreicher-Singer

AI总结 利用GitHub Copilot推出的自然实验,通过三种识别策略和两种分类方法,发现LLM使开源贡献增加28%-40%,且增量贡献增长显著大于实质性贡献,表明LLM偏向于利用现有代码库而非探索新功能。

Comments
JEL Classification: O31, C88, J24, O35, L86
AI中文摘要

大型语言模型(LLM)正在重塑知识工作,但它们对自愿、自我指导的开源创新论坛(贡献者无管理指导地选择任务)的影响可能与组织环境中观察到的效果根本不同。我们在开源软件开发中研究这个问题,其中个人的贡献在社区层面共同推动创新。产品创新已有成熟的创新分类体系,与之不同,开源环境中的知识工作需要根据任务对贡献者的认知需求进行区分。新兴文献区分了实质性贡献(需要创造性地解决问题以引入新功能)和增量贡献(利用对现有代码的理解来维护和改进代码)。我们利用2021年10月GitHub Copilot推出的自然实验,其中Copilot支持Python等语言,但出于商业原因不支持R,从而在原本可比的生态系统之间创建了外生划分。使用三种互补的识别策略和两种分类方法,我们发现Copilot的可用性使开源贡献增加了28%到40%。在所有模型设定中,增量贡献的增长显著大于实质性贡献的增长。这种差异在活动水平较高的项目中更为明显,并在模型升级后扩大:当现有上下文有助于定义问题和约束解决方案时,LLM更有效地发挥作用,使协作创新偏向于利用现有代码库而非探索新功能。鉴于生成式AI在知识经济中迅速普及,本文提供了关于LLM影响的罕见因果实地证据。

英文摘要

Large Language Models (LLMs) are reshaping knowledge work, yet their impact on voluntary, self-guided open innovation forums (contributors choose tasks without managerial direction) may differ fundamentally from effects observed in organizational settings. We study this question in open-source software development, where individuals' contributions collectively drive innovation at a community level. Unlike product innovation, where typologies for classifying innovation are well established, knowledge work in open-source settings calls for a distinction grounded in the cognitive demand a task places on the contributor. A burgeoning literature distinguishes substantive contributions, which require creative problem formulation to introduce new functionality, from incremental contributions, which draw on comprehension of existing code to maintain and refine it. We exploit a natural experiment around GitHub Copilot's launch in October 2021, where Copilot supported languages like Python while not supporting R for business reasons, creating an exogenous partition between otherwise comparable ecosystems. Using three complementary identification strategies and two classification approaches, we find that Copilot availability increases open-source contributions by 28 to 40 percent. The increase in incremental contributions is significantly larger than the increase in substantive contributions across all specifications. This disparity is more pronounced in projects with higher activity levels and widens following a model upgrade: LLMs function more effectively when existing context helps define the problem and constrain solutions, tilting collaborative innovation toward exploitation of established codebases rather than exploration of new functionality. This paper provides a rare instance of causal field evidence on LLM effects, given the speed at which GenAI has exploded across the knowledge economy.