arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.20420 2026-06-19 q-fin.CP stat.AP New submission

Advanced Calibration Analysis and Tools: Identifying Influential Observations in Stochastic Interest Rate Model Calibration

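Philipp Mahler, Peter Ruckdeschel

AI summary Embeds the calibration problem in non-linear regression theory, shows that minimizing RMSRE is equivalent to weighted least squares, and develops a diagnostic framework (weighted hat matrix, influence functions, functional delta method). Empirically it finds boundary-dominated leverage, losses of effective dimensionality, and a shift in parameter stability after 2022, and concludes that low RMSRE is not sufficient to validate a calibration.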

Comments 47 pages, 9 figures, 1 table

Abstract

The accurate calibration of interest rate models is central to market-consistent valuation and Economic Scenario Generators (ESGs). Traditional calibration methods for multi-factor models such as the G2++ model often rely on point estimates, neglecting the influence of specific market data and the quantification of estimation uncertainty. This paper develops a diagnostic framework embedding the calibration problem into non-linear regression theory. It shows that the common industry practice of minimizing the Root Mean Squared Relative Error (RMSRE) is equivalent to a Weighted Least Squares (WLS) problem. This equivalence yields the corresponding formulations for diagnostic tools, including the Weighted Hat Matrix for leverage analysis, Influence Functions for local sensitivity diagnostics, and the Functional Delta Method for local, boundary-respecting confidence intervals. The implementation uses an efficient Jacobian factorization that exploits the analytical tractability of At-The-Money (ATM) caps. The framework is applied to a dataset of Euro ATM caps covering the period 2016-2025. Our empirical analysis reveals a boundary-dominated leverage profile, repeated losses of effective dimensionality due to active parameter constraints, and a diagnostic regime shift in local parameter stability around the post-2022 market transition. The resulting message for actuarial model governance is that low RMSRE is not sufficient for calibration validation. We conclude by discussing the framework's applicability to general least-squares problems while highlighting the computational challenges for instruments lacking closed-form gradients, such as swaptions.
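
As a rough illustration of the RMSRE-to-WLS equivalence and the weighted hat matrix mentioned in this abstract, the sketch below uses a two-parameter toy pricing curve rather than the G2++ cap formulas. The function names, the toy model, the sample quotes, and the QR-based leverage computation are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np

def rmsre(model, market):
    """Root mean squared relative error between model and market prices."""
    return np.sqrt(np.mean(((model - market) / market) ** 2))

def wls_objective(model, market):
    """Weighted sum of squared residuals with weights w_i = 1 / market_i^2."""
    weights = 1.0 / market ** 2
    return np.sum(weights * (model - market) ** 2)

def toy_model(theta, t):
    """Hypothetical two-parameter pricing curve: a * exp(-b * t)."""
    a, b = theta
    return a * np.exp(-b * t)

def toy_jacobian(theta, t):
    """Analytic Jacobian of toy_model with respect to (a, b)."""
    a, b = theta
    e = np.exp(-b * t)
    return np.column_stack([e, -a * t * e])

def weighted_leverages(jacobian, market):
    """Diagonal of the weighted hat matrix W^(1/2) J (J^T W J)^(-1) J^T W^(1/2)."""
    sqrt_w = 1.0 / np.abs(market)
    weighted_j = jacobian * sqrt_w[:, None]
    q, _ = np.linalg.qr(weighted_j)  # reduced QR: columns of q span the weighted Jacobian
    return np.sum(q ** 2, axis=1)

# Made-up maturities, quotes, and parameter values.
t = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
market = np.array([0.90, 0.62, 0.45, 0.22, 0.11, 0.04])
theta = np.array([1.2, 0.33])

model = toy_model(theta, t)
n = len(market)
lev = weighted_leverages(toy_jacobian(theta, t), market)

print("n * RMSRE^2  :", n * rmsre(model, market) ** 2)
print("WLS objective:", wls_objective(model, market))
print("leverages    :", lev)
```

Since the squared RMSRE is the WLS objective with weights 1/y_i^2 divided by the number of quotes, the two objectives share the same minimizer. When the weighted Jacobian has full column rank, the leverages sum to the number of free parameters, so a single quote with leverage close to 1 effectively pins down one parameter on its own.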

2606.20079 2026-06-19 q-fin.RM New submission

How to spot outliers: an Ensemble Anomaly Detection Framework

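Daniil Peysakhovich, Rafał Sieradzki

AI summary To address anomalies in risk valuation outputs, proposes the Ensemble Quality Assessment Framework (EQAF), which combines several unsupervised anomaly-detection methods. On credit-derivatives data it achieves F1 scores of 61-79%, outperforming the best individual method (6-66%), and shows that purely statistical methods cannot detect frozen-feed anomalies.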

Abstract

Errors in risk valuation outputs arising from data-feed failures, model misconfiguration, or system malfunctions can propagate undetected through an investment bank's risk infrastructure and generate material operational losses. Using proprietary daily credit-derivatives data from a major global investment bank covering 183 trades across 129 trading days, we design, implement, and empirically evaluate the Ensemble Quality Assessment Framework (EQAF), a layered unsupervised architecture that combines complementary outlier-detection methods to monitor risk calculation integrity in real time. Using a controlled anomaly-injection protocol with eight operationally realistic scenarios, we show that the calibrated ensemble achieves F1 scores of 61-79%, substantially outperforming the best individual method (6-66%) across four distinct risk-measure datasets. Improvements of 4-6 percentage points in AUC-ROC confirm that this advantage is robust to threshold selection. We further demonstrate that purely statistical detection methods systematically fail to identify stale-value anomalies, a class of frozen-feed errors in which valuation outputs are identical to prior observations and therefore indistinguishable from normal data, and that domain-specific deterministic rules are architecturally indispensable. These findings have direct implications for model risk management under Basel III and the Fundamental Review of the Trading Book (FRTB), where automated and auditable quality controls for internal risk models are increasingly required.
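
The stale-value point in this abstract can be illustrated with a small sketch: a frozen feed repeats a perfectly plausible number, so a distance-based statistic sees nothing unusual, while a deterministic repeat rule catches it. This is not EQAF itself; the function names, the z-score detector, the `min_run` threshold, and the sample series are hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def zscore_flags(values, threshold=3.0):
    """Statistical detector: flag values far from the sample mean."""
    mu, sigma = mean(values), stdev(values)
    return [abs(v - mu) / sigma > threshold for v in values]

def stale_flags(values, min_run=3):
    """Deterministic rule: flag a value once it has repeated min_run times in a row."""
    flags, run = [], 1
    for i, v in enumerate(values):
        run = run + 1 if i > 0 and v == values[i - 1] else 1
        flags.append(run >= min_run)
    return flags

# Made-up daily valuations; the feed freezes at 102.4 for four days.
series = [101.2, 100.7, 101.9, 102.4, 102.4, 102.4, 102.4, 101.1, 100.3]

print("z-score flags:", zscore_flags(series))
print("stale flags  :", stale_flags(series))
```

In practice the repeat threshold would depend on the instrument, since some valuations legitimately stay constant for several days.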

2606.19846 2026-06-19 econ.GN q-fin.EC New submission

What Capital After Labor? Forecasting the Talent ROI Transition in the Human-AI Era

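Kwan Soo Shin, In Seok Kang

AI summary AI augmentation breaks the accounting link between labor time and contribution. The paper builds a forecasting framework for the shift from time-based to output-based talent ROI, centred on an ROI-inversion theorem, uses Korea's 52-hour workweek mandate to document an early overhead-pressure signal, and forecasts that output-based firms will lead in TFP growth by 1.5-2.0 percentage points by 2032.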

Comments 90 pages, 6 figures

Abstract

AI augmentation breaks the accounting link between labor time and productive contribution, yet firms continue to evaluate talent through time-based overhead bundles. This paper develops a forecasting framework for the transition from time-based talent accounting to output-based talent ROI in the human-AI era. The framework centres on Theorem 3 (ROI Inversion at τ*) as the empirical spine, with four mechanism theorems: overhead non-additivity, augmentation-saved-time pathways, innovation-premium amplification, and human-AI dyad attribution uncertainty. Korea's staged 52-hour workweek mandate provides an empirical early-warning case. In a DART panel of 365 listed firms (2,281 firm-year observations), the SG&A-to-revenue ratio rose from 18.26 percent in 2018 to 20.06 percent in 2020, corrected mildly in 2021-2022, and peaked at 20.10 percent in 2024. Under the revenue-percentile cohort proxy, two-way fixed effects (+1.56 pp, p = 0.049), pooled event-study estimates (+4.21 pp at t = +3, p = 0.001), and Callaway-Sant'Anna doubly-robust staggered DiD estimates (+4.51 pp at t = +4) converge on a positive overhead-pressure signature. A 2015-2017 backward extension (224 firms, 601 observations) supplies pre-treatment data, providing evidence against pre-existing upward-trend confounds. We read the Korean evidence not as a direct τ* estimate or a point causal magnitude, but as, to our knowledge, the first empirically documented signature of the pre-τ* overhead-pressure regime, where time-based accounting still dominates while AI augmentation and labor-time compression jointly raise overhead. Output-based firms are forecast to outperform time-based peers by 1.5-2.0 percentage points in firm-level TFP growth by 2032. The contribution is a forecasting model and managerial planning tool for the shift to AI-augmented talent ROI accounting.

2606.19550 2026-06-19 q-fin.GN q-fin.PR New submission

Which Portfolios? The Construction Dependence of Factor Model Performance

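Useong Shin

AI summary Finds that factor-model performance depends heavily on how test assets are constructed (stock selection, initial weighting, holding period, and rebalancing): buy-and-hold favors FF5 and FF6, daily constant-weighting favors FF3, and q5 attains the highest Sharpe ratio in factor-spanning tests but leaves larger pricing errors.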

Abstract

Factor-model performance depends not only on the model but also on how test assets are constructed. We form characteristic-unsorted random portfolios from a broad CRSP universe and vary stock selection, initial weighting, holding, and rebalancing. Rankings shift materially: buy-and-hold favors FF5 and FF6, whereas daily constant-weighting favors FF3, the most stable model across designs. Although q5 attains the highest maximum Sharpe ratio in factor-spanning tests, it leaves comparatively large and construction-sensitive pricing errors on random portfolios. These results reflect construction-specific weighting of each model's pricing-error vector. Test-asset construction, including dynamic weight management, is therefore a design choice in model evaluation.

2606.19517 2026-06-19 q-fin.TR New submission

Do Prediction Markets Match Option Prices? Bitcoin Threshold Evidence from Binance and Polymarket

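Victoria Portnaya

AI summary Compares prices of Bitcoin threshold contracts on the Polymarket prediction market with the values implied by Binance options and finds a significant, persistent pricing gap averaging about 6.3 percentage points, indicating that fragmentation of digital financial markets produces systematic pricing wedges for economically identical payoffs.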

Comments 22 pages, 6 figures, 7 tables; JEL: G13, G14, G19

Abstract

The digitization of financial markets has produced two classes of platforms that price, in principle, the same state-contingent payoffs: centralized crypto-option exchanges and blockchain-based prediction markets. This paper provides the first option-implied benchmark test of prediction-market pricing for cryptocurrency threshold contracts. For each hour in a matched sample, we compare the Polymarket Yes price with the discounted risk-neutral binary value implied by a listed Binance call option on the same underlying, strike, and maturity, and study the gap between them. In the main September 2023 Bitcoin contract, the mean pricing gap equals 5.6 percentage points across 214 hourly observations (t = 6.46, p < 10^{-9}). Pooling three Binance-compatible Bitcoin threshold markets yields a mean gap of 6.3 percentage points across 287 observations, robust to HAC and block-bootstrap inference. The gap is persistent, with an AR(1) half-life of roughly four hours, yet mean-reverting, consistent with slow information transmission between segmented venues rather than mechanical noise. Cross-sectional regressions reveal that the wedge is largest at low option-implied probabilities and long maturities, a pattern consistent with speculative demand for prediction-market contracts rather than measurement error. A delta-hedged arbitrage proxy remains profitable after conservative transaction costs, though with marginal statistical precision. A Deribit extension on the same three Bitcoin contracts produces a larger pooled gap of 11 percentage points, while a smaller Ethereum exercise yields mixed evidence. The results demonstrate that digital fragmentation of financial markets generates systematic, persistent pricing wedges even for economically identical payoffs.
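
To make the comparison in this abstract concrete, the sketch below computes a discounted risk-neutral binary value and an AR(1) half-life. It assumes Black-Scholes dynamics with a single volatility backed out of the call, which ignores any smile correction; the paper's actual extraction may differ. All function names and input numbers are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binary_call_value(spot, strike, vol, t_years, rate=0.0):
    """Discounted risk-neutral probability of finishing above the strike,
    under the Black-Scholes assumption: exp(-r T) * N(d2)."""
    d2 = (math.log(spot / strike) + (rate - 0.5 * vol ** 2) * t_years) / (
        vol * math.sqrt(t_years)
    )
    return math.exp(-rate * t_years) * norm_cdf(d2)

def pricing_gap(prediction_market_yes, option_implied_binary):
    """Gap between the prediction-market Yes price and the option benchmark."""
    return prediction_market_yes - option_implied_binary

def ar1_half_life(phi):
    """Periods needed for an AR(1) deviation with coefficient phi to halve."""
    if not 0.0 < phi < 1.0:
        raise ValueError("half-life is defined here only for 0 < phi < 1")
    return math.log(0.5) / math.log(phi)

# Made-up inputs: at-the-money threshold, 50% implied volatility, one year to expiry.
benchmark = binary_call_value(spot=100.0, strike=100.0, vol=0.5, t_years=1.0)
gap = pricing_gap(prediction_market_yes=0.46, option_implied_binary=benchmark)
print("benchmark:", benchmark, "gap:", gap)
print("half-life in hours for hourly phi = 0.84:", ar1_half_life(0.84))
```

A half-life of about four hours on hourly data corresponds to an AR(1) coefficient near 0.5 ** (1/4), roughly 0.84.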

2606.20041 2026-06-19 econ.GN cs.AI cs.LG q-fin.EC q-fin.GN New submission

AI Economist Agent: An Agentic Framework for Model-Grounded Economic Analysis with RAG, Knowledge Graphs, and Large Language Models

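Masahiro Kato

AI summary Proposes a RAG-based AI economist agent framework that uses knowledge graphs and large language models for economic scenario analysis; agents plan the analysis, retrieve evidence, select models, and generate reports, improving the coherence and traceability of economic narratives.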

Abstract

We propose a model-grounded RAG-based AI economist with an agentic framework for economic scenario analysis using large language models (LLMs) and knowledge graphs. While LLMs can generate fluent economic narratives, economists are often required to make economic claims grounded in economic theory and real-world data. Based on this motivation, this study proposes a RAG-based AI economist, which utilizes knowledge graphs including economic data and theory and LLM-based agents to plan the analysis, retrieve relevant evidence, select appropriate models, and generate reports. In our framework, we do not produce quantitative claims directly with the language model alone; instead, we generate narratives grounded in explicit model-based computations and linked to the retrieved evidence via AI agents. We refer to our framework as an AI economist agent. We evaluate the AI economist agent in two applications: economist report generation for U.S. inflation persistence and Federal Reserve policy, and bank stress-test narrative generation for U.S. commercial real estate refinancing stress. The results illustrate how grounding the generated reports improves their economic coherence and traceability.

2606.19794 2026-06-19 econ.GN cs.CY q-fin.EC New submission

Forecasting AI-Era Productivity: The Intellectually Converged Human Framework and a Missing Cognitive Mediator in Production Function Theory

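Kwan Soo Shin, In Seok Kang

AI summary Proposes the Intellectually Converged Human (ICH) framework, which introduces a four-dimensional cognitive construct, convergence capacity (C), as the cognitive mediator between AI and productivity. The framework offers an explanation for the paradox that AI investment has not delivered commensurate productivity gains, and a descriptive analysis of 20 OECD countries is used to support the explanatory power of the AI-C interaction for variation in total factor productivity.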

Comments 78 pages, 3 figures

Abstract

Why does massive AI investment fail to generate commensurate productivity gains? We argue the paradox is theoretically generated: prevailing production function frameworks encounter a structural boundary by treating AI as a separable factor of production without modeling the cognitive mediation through which AI generates productive value. This directs investment toward deployment when productivity requires prior development of what we term convergence capacity (C). We propose the Intellectually Converged Human (ICH) framework, a fifth-stage framework for production function theory: H-hat = H[1 + phi(A,C)], where effective productive capacity equals human capital (H) scaled by an augmentation factor [1 + phi], with phi jointly determined by AI utilization intensity (A) and convergence capacity (C), a four-dimensional cognitive construct encompassing embodied understanding, metacognition, temporal integration, and integrative thinking. The production function Y = F(K, H-hat) provides a human-centered mechanism for Solow's TFP residual: A_Solow = [1 + phi(A,C)]^(1-alpha). The framework predicts three augmentation regimes with distinct policy implications. Descriptive cross-national analysis of 20 OECD economies shows the AIxC interaction is associated with 86% of TFP variance versus 31% for AI alone, a pattern-consistent finding in the small-n theoretical tradition. South Korea exemplifies national-scale under-augmentation: high H, substantial A, low C produce phi = 0. We distinguish convergence capacity from adjacent constructs, absorptive capacity, dynamic capability, and human capital, and demonstrate that C constitutes the specific cognitive mediator that prior frameworks have left implicit. We derive C-first policy prescriptions and offer three empirically testable propositions with a falsifiable 10-year forecast.
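
The link between the augmented production function and the Solow residual quoted in this abstract follows in one line if F is taken to be Cobb-Douglas with capital share alpha. That functional form is an assumption here, since the abstract only writes Y = F(K, H-hat); it is, however, consistent with the exponent (1 - alpha) in the stated residual.

```latex
\begin{align*}
Y &= K^{\alpha}\,\hat{H}^{\,1-\alpha},
    \qquad \hat{H} = H\,[1+\phi(A,C)] \\
  &= K^{\alpha}\,\bigl(H\,[1+\phi(A,C)]\bigr)^{1-\alpha} \\
  &= \underbrace{[1+\phi(A,C)]^{\,1-\alpha}}_{A_{\mathrm{Solow}}}\;
     K^{\alpha} H^{1-\alpha}.
\end{align*}
```

Under this reading, phi = 0 gives A_Solow = 1, so the under-augmentation case described for South Korea corresponds to no measured TFP gain from AI regardless of the level of A.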

2606.19501 2026-06-19 cs.AI cs.CL cs.LG q-fin.RM New submission

DeXposure-Claw: An Agentic System for DeFi Risk Supervision

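Aijie Shu, Bowei Chen, Wenbin Wu, Cathy Yi-Hsuan Chen, Fengxiang He

Affiliations University of Edinburgh, University of Glasgow, University of Cambridge

AI summary LLM agents used for DeFi supervision are prone to false alarms. The proposed DeXposure-Claw system forecasts exposure networks with a graph time-series foundation model, combines deterministic monitors with confidence gates to produce auditable supervisory tickets, and comes with DeXposure-Bench, a six-axis evaluation benchmark; experiments support its effectiveness.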

Abstract

Decentralized finance exposes supervisors to fast-moving, networked credit risks. General-purpose LLM agents fit this setting poorly: they over-read weak evidence and recommend high-stakes interventions, while existing evaluations offer no regulator-aligned way to measure the resulting false alarms. We introduce DeXposure-Claw, a forecast-grounded agentic supervision system that routes LLM decisions through structured evidence: (1) DeXposure-FM, a graph time-series foundation model, forecasts future exposure networks; (2) deterministic monitors and stress scenarios then turn those forecasts into typed alerts, attribution signals, and scenario evidence; and (3) data-health and confidence gates constrain escalation before DeXposure-Claw emits auditable supervisory tickets with rationales. We further develop DeXposure-Bench, a six-axis evaluation harness, whose decision axis scores tickets against a regulator-aligned absolute-loss ground truth and an explicit false-intervention rate. Experiments on five years of weekly real data fully support our system. Code is at https://github.com/EVIEHub/DeXposure-Claw.

2606.19777 2026-06-19 physics.soc-ph econ.GN q-fin.EC New submission

Have Data Centers Raised Your Electric Bill? Causal Evidence from the United States

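Asa Watten, John Bistline, Geoffrey Blanford

AI summary Using an instrumental variables approach, finds that data centers modestly lowered average U.S. retail electricity rates from 2015 to 2024, and attributes this to economies of scale in the power system.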

Abstract

We estimate that data centers caused average retail electricity rates to fall modestly in the United States from 2015 to 2024 using an instrumental variables approach. Despite prevailing sentiment, the finding is consistent with economic reasoning: existing large power system fixed costs, economies of scale in transmission and distribution, and declining unit costs for generation imply that durable demand growth lowers average prices. We find patterns of economies of scale for transmission, distribution, and generation costs as well as within and across retail customer classes. We caution that future supply constraints could reverse the effect.

2606.20485 2026-06-19 q-fin.RM cs.AI nlin.AO physics.soc-ph New submission

Optimal Order of Multi-Agent and General Many-Body Systems

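Jake J. Xia

AI summary Proposes a general framework for analyzing multi-agent systems based on agents' power and response functions, derives macroscopic properties, and introduces a risk-appetite coefficient to study the trade-off between growth and resilience, yielding an optimal degree of order.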

Comments Key Words: Many body systems, multi agent crowd interactions, feedback loops, agent power, response function, utility function, risk appetite, order, optimal order, fragility, mobility, synchronization, useful energy, entropy, concentration, correlation, task dependency, receiver dependency, collective intelligence, AI model scaling law

Abstract

This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents' actions and collective observations. The framework is built on two fundamental agent-level variables: power, which measures agent influence on collective outcomes, and response functions, which determine how agents react to observations. We derive how macroscopic properties, including total power, useful power, entropy, order, fragility, and mobility, emerge from these two variables of heterogeneous agents. To study the trade-off between growth and resilience, we introduce a system-level utility function parameterized by a risk-appetite coefficient and derive an optimal degree of order that balances productivity, stability, and adaptability. The analysis suggests that stronger synchronization can increase collective output but may also increase systemic fragility and reduce mobility. We further argue that order, entropy, information, and useful energy are task-dependent and system-relative concepts whose meanings depend on the objectives of the system. By measuring and designing agent power distributions and response functions, it may be possible to better understand, predict, and optimize collective behavior and identify the conditions under which collective intelligence and optimal order emerge.

2606.20145 2026-06-19 q-fin.ST cond-mat.stat-mech physics.data-an q-fin.MF q-fin.RM New submission

Trends, Volatility, Correlations, and Critical Phenomena in Financial Markets

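Sara A. Safari, Christoph Schmidhuber

AI summary Forecasts future volatilities and correlations from current market trends, finds that both depend quadratically on trend strength, improves risk prediction, and supports a model of financial markets as a lattice gas near its critical point.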

Comments 31 pages, 9 figures

Abstract

We forecast future volatilities and correlations of financial markets based on the current trends in these markets. This complements previous work that models future expected returns by a cubic polynomial of the current trend strength. Empirically, we observe that volatilities and correlations tend to increase day after day in times of strong up- or down-trends. This effect is particularly pronounced in down-trends. It can be accurately quantified by quadratic polynomials of today's trend strengths, which refine common mean-reversion models of volatilities and correlations. Our results improve the prediction of market risk by accounting for market trends. They also support a recent proposal to model financial markets by a lattice gas near its critical point.

2606.16326 2026-06-19 cs.GT cs.AI q-fin.RM New submission

Gaming-Resistant Insurance Contracts for Autonomous AI Agents: Strategy-Proof Toll Mechanism Design

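Hao-Hsuan Chen

Affiliations Hao-Hsuan Chen

AI summary Extends the time-consistent actuarial runtime framework by making the operator strategic, characterises a five-attack space for autonomous AI-agent insurance contracts, proves when the actuarial runtime is gaming-resistant, and achieves incentive compatibility through new contract clauses.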

Comments 29 pages. Companion to arXiv:2605.26508 (Paper A, foundations) and arXiv:2605.25632 (Paper B, empirical)

Abstract

Paper A defines a time-consistent actuarial runtime that prices each side-effect-bearing action against a contractually fixed safe default and gates execution against a reserve budget. It treats the operator as passive. This paper makes the operator strategic. We characterise a five-attack space for autonomous AI-agent insurance contracts and prove when the actuarial runtime is gaming-resistant. Two attack surfaces, post-toll safe-default selection and within-boundary action splitting, are closed by Paper A's minimal-authority and no-splitting clauses. The remaining three require new contract clauses. First, common-control aggregation prevents cross-boundary re-routing from reducing toll below the boundary potential applied to total exposure. Second, interface failures such as invalid JSON are contract-relevant events, not safety wins: treating them as zero-toll safe defaults can reward unreliable models, while escalation fees reverse the incentive. We validate this interface-compliance theorem on committed cross-model traces from the companion empirical paper. Third, a model-identity menu with a componentwise-minimum penalty schedule makes truthful reporting of the deployed model weakly dominant. We then compose these clauses with Paper A's runtime guarantees to obtain joint incentive compatibility over the five-attack space. Finally, a two-parameter premium family discharges operator individual rationality and weak budget balance at the truthful equilibrium. The result is an incentive-compatibility layer for actuarial control of autonomous-agent side effects.

2410.19333 2026-06-19 econ.GN physics.soc-ph q-fin.EC stat.AP Version update

Swiss-system chess tournaments and unfairness

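László Csató, Alex Krumer

AI summary Studies the unfairness caused by an odd number of rounds in Swiss-system chess tournaments, finds that players with an extra white game score significantly more points, and recommends an even number of rounds with a balanced colour-assignment mechanism.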

Comments 13 pages, 4 tables

Abstract

The Swiss system is an increasingly popular competition format as it provides a favourable trade-off between the number of matches and ranking accuracy. However, there is no empirical study on the potential unfairness of Swiss-system chess tournaments if an odd number of rounds is played. To analyse this issue, our paper compares the number of points scored in the tournament between players who played one game more with the white pieces and players who played one game fewer with the white pieces. Using data from 28 highly prestigious competitions, we find that players with an extra white game score significantly more points. In particular, the advantage exceeds the value of a draw in the four Grand Swiss tournaments. A potential solution to this unfairness could be organising Swiss-system chess tournaments with an even number of rounds, and guaranteeing a balanced colour assignment for all players using a recently proposed pairing mechanism.

2512.17422 2026-06-19 econ.GN q-fin.EC Version update

Hired in High Season: Seasonal Labor Demand and Refugee Labor Market Integration

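Felix Degenhardt

AI summary Exploiting the quasi-exogenous allocation of refugees in Austria and seasonality in hospitality, finds that entering the low-barrier hospitality sector during high season raises refugees' early employment probability by 3 percentage points and significantly increases three-year earnings, but also increases industry and workplace segregation.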

Abstract

I examine whether early but temporary access to low-barrier hospitality employment affects refugees' labor market integration. I exploit within-region, within-year variation by combining the quasi-exogenous allocation of refugees to Austrian regions with seasonality in hospitality, where 25% of refugees first find work. Labor market access during high seasonal demand raises early employment probability by 3 percentage points (9% of the mean). Employment gains fade after one year, but treated refugees accumulate significantly higher three-year earnings, with no differences in medium-term wages or job quality. However, early hospitality work increases segregation into refugee-typical industries and firms with fewer Austrian coworkers.

2503.13328 2026-06-19 q-fin.MF math.PR Version update

Model-independent upper bounds for the prices of Bermudan options with convex payoffs

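David Hobson, Dominykas Norgilas

AI summary Studies the no-arbitrage upper bound on the price of a Bermudan option with convex payoffs given European option prices. The paper characterises the dual problem and solves it completely under the assumption that the measures satisfy a dispersion condition, finding that the canonical set-up is not rich enough to define an optimal model and that additional randomisation is required.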

Comments 55 pages, 6 figures. In the new version we work with arbitrary convex payoffs and marginal distributions that satisfy the Dispersion Assumption

Abstract

Suppose $\mu$ and $\nu$ are probability measures on $\mathbb{R}$ satisfying $\mu \leq_{cx} \nu$. Let $a$ and $b$ be convex functions on $\mathbb{R}$ with $a \geq b \geq 0$. We are interested in finding $$\sup_{\mathbf{M}} \sup_{\tau} \mathbb{E}^{\mathbf{M}} \left[ a(X) I_{ \{ \tau = 1 \} } + b(Y) I_{ \{ \tau = 2 \} } \right] $$ where the first supremum is taken over consistent models $\mathbf{M}$ (i.e., filtered probability spaces $(\Omega, \mathbf{F}, \mathbb{F}, \mathbb{P})$ such that $Z=(z,Z_1,Z_2)=(\int_{\mathbb{R}} x \mu(dx) = \int_{\mathbb{R}} y \nu(dy), X, Y)$ is a $(\mathbb{F},\mathbb{P})$ martingale, where $X$ has law $\mu$ and $Y$ has law $\nu$ under $\mathbb{P}$) and $\tau$ in the second supremum is a $(\mathbb{F},\mathbb{P})$-stopping time taking values in $\{1,2\}$. Our contributions are first to characterise and simplify the dual problem, and second to completely solve the problem under some structural assumptions on the measures $\mu$ and $\nu$ (namely that $\mu$ and $\nu$ are absolutely continuous probability measures that satisfy the Dispersion Assumption). A key finding is that the canonical set-up in which the filtration is that generated by $Z$ is not rich enough to define an optimal model and additional randomisation is required. This holds even though the marginal laws $\mu$ and $\nu$ are atom-free. The problem has an interpretation of finding the robust, or model-free, no-arbitrage bound on the price of a Bermudan option with two possible exercise dates, given the prices of co-maturing European options.