arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.20079 2026-06-19 q-fin.RM New submission

How to spot outliers: an Ensemble Anomaly Detection Framework

Daniil Peysakhovich, Rafał Sieradzki

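AI summary: To catch anomalies in risk valuation outputs, the paper proposes the Ensemble Quality Assessment Framework (EQAF), which combines several unsupervised anomaly-detection methods. On credit-derivatives data it reaches F1 scores of 61-79%, versus 6-66% for the best single method, and it shows that purely statistical methods cannot detect frozen-feed (stale-value) anomalies.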

Abstract

Errors in risk valuation outputs arising from data-feed failures, model misconfiguration, or system malfunctions can propagate undetected through an investment bank's risk infrastructure and generate material operational losses. Using proprietary daily credit-derivatives data from a major global investment bank covering 183 trades across 129 trading days, we design, implement, and empirically evaluate the Ensemble Quality Assessment Framework (EQAF), a layered unsupervised architecture that combines complementary outlier-detection methods to monitor risk calculation integrity in real time. Using a controlled anomaly-injection protocol with eight operationally realistic scenarios, we show that the calibrated ensemble achieves F1 scores of 61-79%, substantially outperforming the best individual method (6-66%) across four distinct risk-measure datasets. Improvements of 4-6 percentage points in AUC-ROC confirm that this advantage is robust to threshold selection. We further demonstrate that purely statistical detection methods systematically fail to identify stale-value anomalies, a class of frozen-feed errors in which valuation outputs are identical to prior observations and therefore indistinguishable from normal data, and that domain-specific deterministic rules are architecturally indispensable. These findings have direct implications for model risk management under Basel III and the Fundamental Review of the Trading Book (FRTB), where automated and auditable quality controls for internal risk models are increasingly required.
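
The stale-value point can be illustrated with a minimal sketch. It is not EQAF itself: the function names, the z-score threshold, the repeat count, and the sample series are all illustrative assumptions. A simple statistical check sees nothing unusual in a frozen feed because the repeated value sits well inside the normal range, while a deterministic repeat rule flags it.

```python
import statistics


def zscore_flags(values, threshold=3.0):
    """Flag points whose z-score against the whole series exceeds threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]


def stale_flags(values, min_repeats=3):
    """Flag a point once it is the min_repeats-th (or later) identical value in a row."""
    flags = [False] * len(values)
    run = 1
    for i in range(1, len(values)):
        run = run + 1 if values[i] == values[i - 1] else 1
        if run >= min_repeats:
            flags[i] = True
    return flags


def ensemble_flags(values):
    """Combine both detectors: a point is flagged if either one flags it."""
    return [a or b for a, b in zip(zscore_flags(values), stale_flags(values))]


# Hypothetical daily valuations; the feed freezes at 101.3 for four days.
series = [100.2, 100.9, 99.7, 101.3, 101.3, 101.3, 101.3, 100.4]

print(zscore_flags(series))    # statistical check alone
print(stale_flags(series))     # deterministic repeat rule
print(ensemble_flags(series))  # union of the two
```

In this toy series the z-score detector flags nothing, and only the repeat rule marks the third and fourth frozen observations, which is the kind of gap the abstract attributes to purely statistical methods.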

2606.20485 2026-06-19 q-fin.RM cs.AI nlin.AO physics.soc-ph New submission

Optimal Order of Multi-Agent and General Many-Body Systems

Jake J. Xia

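AI summary: Proposes a general framework for analyzing multi-agent systems that derives macroscopic properties from each agent's power and response function, introduces a risk-appetite coefficient to study the trade-off between growth and resilience, and obtains an optimal degree of order.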

Comments Key Words: Many body systems, multi agent crowd interactions, feedback loops, agent power, response function, utility function, risk appetite, order, optimal order, fragility, mobility, synchronization, useful energy, entropy, concentration, correlation, task dependency, receiver dependency, collective intelligence, AI model scaling law

Abstract

This paper develops a general framework for analyzing multi-agent systems with feedback loops between agents' actions and collective observations. The framework is built on two fundamental agent-level variables: power, which measures agent influence on collective outcomes, and response functions, which determine how agents react to observations. We derive how macroscopic properties, including total power, useful power, entropy, order, fragility, and mobility, emerge from these two variables of heterogeneous agents. To study the trade-off between growth and resilience, we introduce a system-level utility function parameterized by a risk-appetite coefficient and derive an optimal degree of order that balances productivity, stability, and adaptability. The analysis suggests that stronger synchronization can increase collective output but may also increase systemic fragility and reduce mobility. We further argue that order, entropy, information, and useful energy are task-dependent and system-relative concepts whose meanings depend on the objectives of the system. By measuring and designing agent power distributions and response functions, it may be possible to better understand, predict, and optimize collective behavior and identify the conditions under which collective intelligence and optimal order emerge.

2606.19501 2026-06-19 cs.AI cs.CL cs.LG q-fin.RM Cross-list

DeXposure-Claw: An Agentic System for DeFi Risk Supervision

Aijie Shu, Bowei Chen, Wenbin Wu, Cathy Yi-Hsuan Chen, Fengxiang He

Affiliations: University of Edinburgh, University of Glasgow, University of Cambridge

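AI summary: To address the tendency of LLM agents to raise false alarms in DeFi supervision, the paper proposes DeXposure-Claw, which forecasts exposure networks with a graph time-series foundation model and applies deterministic monitors and confidence gates to produce auditable supervisory tickets. It also builds DeXposure-Bench, a six-axis evaluation benchmark, and reports experiments that support the system's effectiveness.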

Abstract

Decentralized finance exposes supervisors to fast-moving, networked credit risks. General-purpose LLM agents fit this setting poorly: they over-read weak evidence and recommend high-stakes interventions, while existing evaluations offer no regulator-aligned way to measure the resulting false alarms. We introduce DeXposure-Claw, a forecast-grounded agentic supervision system that routes LLM decisions through structured evidence: (1) DeXposure-FM, a graph time-series foundation model, forecasts future exposure networks; (2) deterministic monitors and stress scenarios then turn those forecasts into typed alerts, attribution signals, and scenario evidence; and (3) data-health and confidence gates constrain escalation before DeXposure-Claw emits auditable supervisory tickets with rationales. We further develop DeXposure-Bench, a six-axis evaluation harness, whose decision axis scores tickets against a regulator-aligned absolute-loss ground truth and an explicit false-intervention rate. Experiments on five years of weekly real data fully support our system. Code is at https://github.com/EVIEHub/DeXposure-Claw.
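
The gating idea in step (3) can be sketched as a small deterministic filter that runs before any ticket is issued. This is an illustration under stated assumptions, not the DeXposure-Claw implementation: the `Alert` fields, the thresholds, and the outcome strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    kind: str                   # alert type, e.g. "concentration" (hypothetical)
    severity: float             # 0..1, from a deterministic monitor (assumed scale)
    forecast_confidence: float  # 0..1, from the forecasting model (assumed scale)
    data_fresh: bool            # result of a data-health check


def gate(alert, min_confidence=0.8, min_severity=0.5):
    """Decide whether an alert may be escalated; checks run in a fixed order."""
    if not alert.data_fresh:
        return "hold: data-health check failed"
    if alert.forecast_confidence < min_confidence:
        return "hold: low forecast confidence"
    if alert.severity < min_severity:
        return "monitor"
    return "escalate"


print(gate(Alert("concentration", 0.9, 0.95, True)))
print(gate(Alert("concentration", 0.9, 0.40, True)))
print(gate(Alert("concentration", 0.9, 0.95, False)))
```

Because each outcome names the check that produced it, the decision trail stays auditable, and weak evidence is held back before it can trigger an intervention.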

2606.20145 2026-06-19 q-fin.ST cond-mat.stat-mech physics.data-an q-fin.MF q-fin.RM Cross-list

Trends, Volatility, Correlations, and Critical Phenomena in Financial Markets

Sara A. Safari, Christoph Schmidhuber

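AI summary: Forecasts future volatilities and correlations from current market trends, finds that both depend quadratically on trend strength, improves risk prediction, and supports a model of financial markets as a lattice gas near its critical point.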

Comments 31 pages, 9 figures

Abstract

We forecast future volatilities and correlations of financial markets based on the current trends in these markets. This complements previous work that models future expected returns by a cubic polynomial of the current trend strength. Empirically, we observe that volatilities and correlations tend to increase day after day in times of strong up- or down-trends. This effect is particularly pronounced in down-trends. It can be accurately quantified by quadratic polynomials of today's trend strengths, which refine common mean-reversion models of volatilities and correlations. Our results improve the prediction of market risk by accounting for market trends. They also support a recent proposal to model financial markets by a lattice gas near its critical point.
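
One plausible way to write the quadratic dependence described above is given here. The abstract does not give the formula, so the notation is an assumption: $\phi_t$ stands for today's trend strength (negative in a down-trend), $\sigma_{t+1}$ for next-day volatility, and $a$, $b$, $c$ for fitted coefficients. The paper's actual specification may differ, for example by modeling variance or by adding mean-reversion terms.

```latex
% Hypothetical notation, not taken from the paper.
\[
  \mathbb{E}\left[\sigma_{t+1} \mid \phi_t\right] \approx a + b\,\phi_t + c\,\phi_t^{2}
\]
% c > 0: volatility rises in strong trends of either sign.
% b < 0: the rise is larger when \phi_t < 0, i.e. in down-trends.
```

Under this reading, the asymmetry between up- and down-trends is carried entirely by the linear term, and the same form with different coefficients would apply to correlations.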

2606.16326 2026-06-19 cs.GT cs.AI q-fin.RM Cross-list

Gaming-Resistant Insurance Contracts for Autonomous AI Agents: Strategy-Proof Toll Mechanism Design

Hao-Hsuan Chen

Affiliations: Hao-Hsuan Chen

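AI summary: Extends the time-consistent actuarial runtime framework by making the operator strategic, characterises a five-attack space for autonomous AI-agent insurance contracts, proves when the actuarial runtime is gaming-resistant, and achieves incentive compatibility through new contract clauses.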

Comments 29 pages. Companion to arXiv:2605.26508 (Paper A, foundations) and arXiv:2605.25632 (Paper B, empirical)

Abstract

Paper A defines a time-consistent actuarial runtime that prices each side-effect-bearing action against a contractually fixed safe default and gates execution against a reserve budget. It treats the operator as passive. This paper makes the operator strategic. We characterise a five-attack space for autonomous AI-agent insurance contracts and prove when the actuarial runtime is gaming-resistant. Two attack surfaces -- post-toll safe-default selection and within-boundary action splitting -- are closed by Paper A's minimal-authority and no-splitting clauses. The remaining three require new contract clauses. First, common-control aggregation prevents cross-boundary re-routing from reducing toll below the boundary potential applied to total exposure. Second, interface failures such as invalid JSON are contract-relevant events, not safety wins: treating them as zero-toll safe defaults can reward unreliable models, while escalation fees reverse the incentive. We validate this interface-compliance theorem on committed cross-model traces from the companion empirical paper. Third, a model-identity menu with a componentwise-minimum penalty schedule makes truthful reporting of the deployed model weakly dominant. We then compose these clauses with Paper A's runtime guarantees to obtain joint incentive compatibility over the five-attack space. Finally, a two-parameter premium family discharges operator individual rationality and weak budget balance at the truthful equilibrium. The result is an incentive-compatibility layer for actuarial control of autonomous-agent side effects.
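
The interface-failure argument can be illustrated with a toy expected-cost comparison. It is a sketch under stated assumptions and not the paper's theorem: the flat toll, the failure rates, and the fee level are hypothetical numbers, and the function name is invented for the example.

```python
def expected_charge(failure_rate, toll, failure_charge):
    """Expected per-action charge: valid actions pay the toll,
    interface failures (e.g. invalid JSON) pay failure_charge."""
    return (1 - failure_rate) * toll + failure_rate * failure_charge


TOLL = 1.0             # hypothetical toll per valid action
ESCALATION_FEE = 2.0   # hypothetical fee per interface failure
RELIABLE = 0.01        # hypothetical failure rate of a reliable model
UNRELIABLE = 0.30      # hypothetical failure rate of an unreliable model

# Regime 1: failures treated as zero-toll safe defaults.
zero_reliable = expected_charge(RELIABLE, TOLL, 0.0)
zero_unreliable = expected_charge(UNRELIABLE, TOLL, 0.0)

# Regime 2: failures charged an escalation fee above the toll.
fee_reliable = expected_charge(RELIABLE, TOLL, ESCALATION_FEE)
fee_unreliable = expected_charge(UNRELIABLE, TOLL, ESCALATION_FEE)

print("zero-toll failures:", zero_reliable, zero_unreliable)
print("escalation fee:    ", fee_reliable, fee_unreliable)
```

With zero-toll failures the unreliable model is cheaper to operate, so the contract rewards unreliability. Once the fee per failure exceeds the toll, the ordering flips and the reliable model becomes the cheaper choice.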