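arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.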

2606.12375 2026-06-11 cs.CE math.NA physics.comp-ph New submission

A coupled finite element formulation for chemo-mechano-thermodynamical contact and its application to bonding and debonding

Roger A. Sauer

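AI summary: Proposes a coupled finite element formulation, based on the contact theory of Sauer et al., for simulating chemo-mechano-thermodynamical large-deformation contact. It focuses on the evolution of bonding and debonding and its coupling to the mechanical and thermal contact state, and illustrates the formulation's generality with several numerical examples.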

Comments: 42 pages, 22 figures, 6 tables

Abstract

This work presents a finite element formulation for coupled chemo-mechano-thermodynamical large deformation contact. The formulation is based on the contact theory of Sauer et al. (2022) that contains six coupled (but separate) fields: the deformation and temperature of the two contacting bodies, as well as an interfacial bonding field and interfacial temperature. The latter is governed by the chemical and mechanical energy dissipation at the interface. Here the focus is placed on the evolution of bonding and debonding, and how it is coupled to the mechanical and thermal contact state. Several elementary models are proposed for this based on a quadratic contact potential. The resulting contact formulation becomes very general and versatile, which is illustrated by several challenging examples. They include pressure- and gap-dependent bonding, exothermic bonding reactions, thermal hardening and thermal expansion, as well as simultaneous bonding and debonding. They are based on a monolithic finite element implementation using classical and isogeometric shape functions together with implicit time integration. Its full linearization, required for the Newton-Raphson solution method, is also provided. If bonding sites are material points, the bonding variable can be condensed out locally.

2606.12347 2026-06-11 cs.CE physics.geo-ph New submission

Local Stress Redistribution Controls Interactions between Hydraulic Fractures and Pre-existing Fractures

S. Shandilaya, M. Alaleeli, S.H. Kim, M. Mobasher, S. Roshankhah

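AI summary: Combines experiments and simulations to study how stress redistribution induced by natural fractures controls hydraulic fracture trajectories, and reveals the mechanism by which shear deformation leads to fracture attraction or repulsion.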

Comments: 24 pages, 12 figures. Submitted to the International Journal of Rock Mechanics and Mining Sciences

Abstract

Hydraulic fracture (HF) propagation in naturally fractured formations is strongly influenced by local stress states near pre-existing natural fractures (NFs). The role of NF-induced shear deformation and stress redistribution in controlling HF trajectories remains poorly characterized. This study investigates how NF-induced stress redistribution governs HF-NF interactions through coupled laboratory experiments and poroelastic extended finite element simulations on intact and pre-fractured PMMA specimens under plane-strain conditions. Digital image correlation provides full-field measurements of displacement and strain evolution during mechanical loading and hydraulic fracturing. Under fixed-base, lateral confinement, and vertical compression boundary conditions, inclined NFs induce asymmetric stress redistribution and shear deformation, generating distinct local stress states prior to fluid injection. The results demonstrate that the HF trajectory is governed by the sign and spatial distribution of shear stress and shear strain components generated by NF orientation relative to the far-field maximum principal stress. Shear deformation that promotes compressive stress development adjacent to the NF causes the HF to deflect away, whereas shear deformation that reduces the effective normal stress along the NF promotes fracture attraction and linkage. Corresponding numerical reproduction of HF curvature in pre-fractured specimens requires mixed-mode (Mode I-II) fracture energy release criteria, while the intact specimen propagates in pure Mode I. Overall, the findings reveal a transition from tensile opening to shear-assisted mixed-mode propagation as local stress states evolve due to the presence of NFs, providing a mechanistic basis for predicting and controlling fracture trajectories in subsurface stimulation and storage applications.

2606.12128 2026-06-11 cs.CE New submission

From Agent Identity to Agent Economy: Measuring the Operational Readiness of ERC-8004 AI Agents

Rischan Mafrur, Priagung Khusumanegara

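AI summary: Analyzes data on ERC-8004 agents on Ethereum and builds an operational readiness framework. Early adoption is registration-heavy but operationally shallow: the identity layer is visible, while metadata, service, reputation, and cross-chain evidence remain limited. Ownership and feedback activity are highly concentrated, indicating that the transition from agent identity to agent economy is not yet complete.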

Abstract

This paper examines whether blockchain-registered AI agents demonstrate operational readiness beyond identity registration. Using a dataset of ERC-8004 agents on Ethereum, we construct an agent-level feature table covering identity status, metadata, service declarations, reputation feedback, transfers, and cross-chain registration. We develop an operational readiness framework based on observable evidence layers and complement it with network analysis of owner-agent, feedback-client, wallet-transfer, and combined evidence relationships. The results show that early ERC-8004 adoption is registration-heavy but operationally shallow. While the identity layer is visible at scale, metadata availability, service exposure, reputation formation, and cross-chain evidence remain limited. Ownership and feedback activity are also highly concentrated, suggesting that early participation is shaped by a small number of high-activity wallets and clients. The network analysis further shows that richer operational evidence clusters around a small subset of agents rather than being broadly distributed across the ecosystem. The findings suggest that ERC-8004 provides an important identity layer for decentralized AI agents, but the transition from agent identity to agent economy remains incomplete.
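
The abstract reports that ownership is highly concentrated but does not say which concentration measure is used. As one plausible reading, the sketch below computes the share of agents held by the top-k owner wallets. The function name and the toy owner labels are hypothetical and are not taken from the paper or from the ERC-8004 registry.

```python
from collections import Counter


def top_k_share(owners, k):
    """Fraction of agents owned by the k wallets that own the most agents.

    `owners` holds one owner label per registered agent (hypothetical input).
    """
    if not owners:
        raise ValueError("owners must not be empty")
    counts = Counter(owners)
    held_by_top = sum(count for _, count in counts.most_common(k))
    return held_by_top / len(owners)


# Toy data: wallet "a" owns three of the five agents.
toy_owners = ["a", "a", "a", "b", "c"]
share = top_k_share(toy_owners, k=1)
```

A value close to 1 for small k would correspond to the registration pattern the abstract describes, where a few high-activity wallets account for most agents.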

2606.11995 2026-06-11 cs.CE New submission

A Computational Model for Measuring Adaptability Among U.S. Farmers: Evidence from 1997-2022

Hossein Sabzian

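AI summary: Using data from 1997 to 2022, builds a framework to study the cultural-evolutionary mechanisms behind crop selection in US counties. It finds that environmental payoff-biased selection drives counties toward maximizing adaptability, and that there is a long-term trend of convergence toward combinatorial traits.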

Comments: 17 pages, 7 figures

Abstract

Agricultural crops are a type of cultural trait and the way farmers of US counties select them can itself result in county-level cultural traits. Using real-world data from 1997 to 2022, we have developed a systematic framework to study the selective mechanisms behind these traits. Our findings indicate that environmental payoff-biased selection has driven counties to adopt traits that maximize their adaptability and yield within their specific environments. These empirical results align with existing theoretical literature [3,16]. Additionally, a clear long-term selective trend is evident, showing that US counties are gradually developing a specific set of more complex combinatorial traits, which provide greater payoffs by enhancing the farmers' environmental adaptability. This study serves as a strong case for empirically modeling the cultural evolutionary processes among US farmers.

2606.11676 2026-06-11 cs.CE cs.LG physics.comp-ph New submission

Neural-Parameterized Cellular Automata for Wildfire Spread

Maksym Zhenirovskyy, Ion Matei, Rohit Vuppala, Takuya Kurihana, Hon Yung Wonga

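AI summary: Proposes a hybrid framework in which deep learning parameterizes a probabilistic cellular automaton. A multi-scale convolutional neural network dynamically generates spatially varying parameters, capturing complex environmental interactions while preserving physical interpretability, and the model achieves IoU > 0.6 over 72-hour forecasts on six large wildfires.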

Comments: 16 pages, 9 figures

Abstract

Traditional wildfire models rely on rigid, low-dimensional parameters and static fuel maps, frequently underpredicting fire spread. To address this weakness, we introduce a hybrid deep-learning parameterized Probabilistic Cellular Automata (CA) framework implemented in JAX. Our approach employs a Multi-Scale Convolutional Neural Network to dynamically generate spatially varying parameters that govern fire-spread probability, wind alignment, and slope influence. This hybrid design captures complex, nonlinear environmental interactions while preserving the physical interpretability of the underlying three-state CA. The JAX implementation enables hardware acceleration and gradient-based parameter calibration. Evaluated on six large-scale wildfires in the western United States, the model maintains IoU > 0.6 over 72-hour forecast horizons after a 10-day data assimilation window during which the model is fitted incrementally to observed perimeters; the resulting forecast is a conditional projection of fire growth under the suppression regime already encoded in those observations.
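
To make the "three-state CA" and the IoU score concrete, here is a minimal pure-Python sketch. The state labels, the 4-neighbour ignition rule, and the per-cell `p_spread` grid are assumptions for illustration. In the paper these parameters come from a multi-scale CNN and the model runs in JAX, neither of which is reproduced here, and wind and slope effects are left out.

```python
import random

UNBURNED, BURNING, BURNED = 0, 1, 2  # assumed labels for the three CA states


def step(grid, p_spread, rng):
    """One synchronous update of a probabilistic fire-spread CA.

    Each burning cell tries to ignite its unburned 4-neighbours with the
    probability stored at the neighbour's position, then burns out.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != BURNING:
                continue
            new[r][c] = BURNED
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                in_bounds = 0 <= nr < rows and 0 <= nc < cols
                if in_bounds and grid[nr][nc] == UNBURNED:
                    if rng.random() < p_spread[nr][nc]:
                        new[nr][nc] = BURNING
    return new


def burned_mask(grid):
    """Boolean mask of cells that are burning or already burned."""
    return [[cell != UNBURNED for cell in row] for row in grid]


def iou(pred, obs):
    """Intersection over union of two boolean masks of equal shape."""
    inter = union = 0
    for pred_row, obs_row in zip(pred, obs):
        for p, o in zip(pred_row, obs_row):
            inter += int(p and o)
            union += int(p or o)
    return inter / union if union else 1.0


# Deterministic toy run: spread probability 1.0 everywhere.
grid = [[UNBURNED] * 5 for _ in range(5)]
grid[2][2] = BURNING
certain = [[1.0] * 5 for _ in range(5)]
after = step(grid, certain, random.Random(0))
```

Replacing the constant `certain` grid with values predicted per cell is where a learned parameterization would plug in; the update rule itself stays readable, which is the interpretability argument the abstract makes.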

2606.11537 2026-06-11 cs.AI cs.CE New submission

MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

Abdelrahman Abdallah, AbdelRahim A. Elmadany, Sameh Al Natour, Hasan Cavusoglu, Adam Jatowt, Muhammad Abdul-Mageed

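Affiliations: University of Innsbruck; University of British Columbia; Toronto Metropolitan University

AI summary: Proposes MoCA-Agent, which addresses numerical reasoning errors in financial table question answering through claim-level verification and code generation, and achieves strong performance on ten benchmarks.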

Abstract

Financial and tabular question answering requires more than fluent reasoning: answers must be grounded in the exact facts, formulas, units, signs, and scales that support them. A single misread cell or incorrect operation can silently produce a plausible but wrong result. We introduce MoCA-Agent, a market-of-claims code agent that replaces free-form multi-agent debate with claim-level verification. The system decomposes each question into typed atomic claims, asks specialist trader agents to buy or sell those claims, clears their orders into confidence-weighted accept/reject decisions, and synthesizes an executable Python program from market-supported evidence. A code-aware verifier then checks the program for execution, structural consistency, and common financial reasoning errors, with at most one market-aware repair round. Across ten public benchmarks spanning financial numerical reasoning, general tabular reasoning, ESG question answering, and multimodal chart reasoning, MoCA-Agent achieves strong performance using a fixed Qwen3.6-27B backbone, including 78.3% on FinQA, 76.0% on FinanceMath, 71.2% on MultiHiertt, 86.9% on ESGenius, and 85.6% average on FinChart-Bench. These results show that aggregating evidence at the level of atomic claims, rather than whole answers, improves robustness in high-stakes numerical reasoning. The code and data are available at this https URL.
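
The abstract says trader orders are cleared into confidence-weighted accept/reject decisions but does not give the clearing rule. The sketch below shows one simple possibility: sum signed confidences per claim and accept when the net support is positive. The `Order` type, the `clear_market` function, and the threshold are hypothetical and should not be read as the paper's mechanism.

```python
from dataclasses import dataclass


@dataclass
class Order:
    claim_id: str      # identifier of an atomic claim (hypothetical format)
    side: str          # "buy" supports the claim, "sell" disputes it
    confidence: float  # trader confidence in [0, 1]


def clear_market(orders, threshold=0.0):
    """Accept a claim when its net confidence-weighted support exceeds threshold."""
    net = {}
    for order in orders:
        sign = 1.0 if order.side == "buy" else -1.0
        net[order.claim_id] = net.get(order.claim_id, 0.0) + sign * order.confidence
    return {
        claim_id: "accept" if score > threshold else "reject"
        for claim_id, score in net.items()
    }


orders = [
    Order("revenue_2023_is_in_millions", "buy", 0.9),
    Order("revenue_2023_is_in_millions", "sell", 0.4),
    Order("growth_uses_prior_year_base", "buy", 0.3),
    Order("growth_uses_prior_year_base", "sell", 0.8),
]
decisions = clear_market(orders)
```

Deciding per claim rather than per answer means a single disputed unit or sign can be rejected without discarding the rest of the evidence, which is the robustness argument the abstract makes.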

2606.11500 2026-06-11 eess.IV cs.CE cs.IT cs.LG q-bio.NC New submission

FlexiBrain: Resolution-Agnostic Voxel-Level Encoding for Native fMRI

Mo Wang, Wenhao Ye, Junfeng Xia, Minghao Xu, Hongkai Wen, Quanying Liu

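AI summary: Proposes FlexiBrain, a resolution-agnostic voxel-level encoding framework based on Mamba-JEPA. Dynamic patch resizing lets it process native fMRI data directly and avoid destructive spatial standardization. It improves performance by up to 12 percentage points across five downstream tasks and substantially reduces preprocessing cost.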

Abstract

The success of large-scale deep learning models in neuroscience is fundamentally constrained by severe data heterogeneity. Native fMRI data aggregated from diverse sources exhibit substantial variation in both spatial and temporal resolutions. Consequently, most existing frameworks rely on lengthy, rigid preprocessing pipelines that enforce uniformity across datasets. This practice introduces two critical limitations: (1) potential degradation of subject-specific anatomical information; (2) significant computational overhead, often requiring hours of processing per subject. Here, we propose FlexiBrain, a resolution-agnostic voxel-level encoding framework for native fMRI based on Mamba-JEPA. FlexiBrain defines patch sizes in real-world physical units and employs dynamic patch resizing, thereby bypassing destructive spatial standardization while enabling direct ingestion of data in native space. We instantiate the framework using an efficient Mamba-JEPA backbone to model high-dimensional 4D fMRI signals. Across five diverse downstream neuroscience tasks, FlexiBrain consistently outperforms recent state-of-the-art methods, achieving gains of up to 12 percentage points without external data augmentation. Importantly, FlexiBrain functions as a seamless plug-in module, substantially reducing preprocessing costs and accelerating the development of robust voxel-level fMRI foundation models. Code is available at this https URL.
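
Defining patches in physical units means the same patch covers a different number of voxels in each scan. The sketch below shows that conversion only. The function name, the 12 mm patch size, the voxel sizes, and the rounding rule are assumptions, and the subsequent resizing of each patch to a common token shape is not shown.

```python
def patch_shape_in_voxels(patch_mm, voxel_mm):
    """Voxels per axis needed to cover a patch of size patch_mm (in mm).

    Rounds to the nearest whole voxel and never returns fewer than one
    voxel per axis. The rounding rule is an illustrative assumption.
    """
    return tuple(max(1, round(p / v)) for p, v in zip(patch_mm, voxel_mm))


patch_mm = (12.0, 12.0, 12.0)  # hypothetical physical patch size

# Two hypothetical scans with different native resolutions.
shape_isotropic = patch_shape_in_voxels(patch_mm, (2.0, 2.0, 2.0))
shape_anisotropic = patch_shape_in_voxels(patch_mm, (3.0, 3.0, 4.0))
```

Both patches cover the same physical volume even though their voxel counts differ, so no scan has to be resampled to a shared template grid beforehand.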

2511.17259 2026-06-11 quant-ph cs.CC cs.CE cs.DM math-ph Updated version

Fundamental Limitations of QAOA on Constrained Problems and a Route to Exponential Enhancement

Chinonso Onah, Kristel Michielsen

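AI summary: Studies fundamental limitations of generic QAOA on constrained problems and shows that constraint embedding yields exponential improvements. For permutation-constrained problems it proposes a minimal constraint-enhanced kernel (CE QAOA) and proves that the ratio of its feasible mass to that of generic QAOA grows exponentially in n^2.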

Abstract

We study fundamental limitations of the generic Quantum Approximate Optimization Algorithm (QAOA) on constrained problems where valid solutions form a low-dimensional manifold inside the Boolean hypercube, and we present a provable route to exponential improvements via constraint embedding. Focusing on permutation-constrained objectives, we show that the standard generic QAOA ansatz, with a transverse-field mixer and diagonal r-local cost, faces an intrinsic feasibility bottleneck: even after angle optimization, circuits whose depth grows at most sublinearly with n cannot raise the total probability mass on the feasible manifold much above the uniform baseline suppressed by the size of the full Hilbert space. Against this envelope we introduce a minimal constraint-enhanced kernel (CE QAOA) that operates directly inside a product one-hot subspace and mixes with a block-local XY Hamiltonian. For permutation-constrained problems, we prove an angle-robust, depth-matched exponential enhancement where the ratio between the feasible mass from CE QAOA and generic QAOA grows exponentially in $n^2$ for all depths up to a linear fraction of n, under a mild polynomial growth condition on the interaction hypergraph. Thanks to the problem-algorithm co-design in the kernel construction, the techniques and guarantees extend beyond permutations to a broad class of NP-hard constrained optimization problems.
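
A counting argument gives some intuition for the "uniform baseline" and for why restricting to a one-hot subspace helps. The sketch assumes the usual encoding of a permutation of n items as n one-hot blocks of n bits, which the abstract does not spell out. It compares the fraction of feasible states in the full space with the fraction in the product one-hot subspace. This is plain combinatorics over uniform distributions, not the paper's theorem about optimized circuits.

```python
from fractions import Fraction
from math import factorial


def feasible_fractions(n):
    """Fraction of basis states that encode valid permutations.

    Assumes n*n binary variables arranged as n one-hot blocks of n bits.
    Returns (fraction in the full 2**(n*n) space,
             fraction in the n**n product one-hot subspace).
    """
    feasible = factorial(n)
    in_full_space = Fraction(feasible, 2 ** (n * n))
    in_one_hot_space = Fraction(feasible, n ** n)
    return in_full_space, in_one_hot_space


full, one_hot = feasible_fractions(3)
gain = one_hot / full  # equals 2**(n*n) / n**n under this encoding
```

Under this encoding the gain is 2^(n^2) / n^n, which grows like 2^(n^2 - n log2 n) and is therefore exponential in n^2, consistent in scaling with the enhancement stated in the abstract.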

2512.19912 2026-06-11 cs.CE math.OC Updated version

Solving strategies for data-driven one-dimensional elasticity exhibiting nonlinear strains

Thi-Hoa Nguyen, Viljar H. Gjerde, Bruno A. Roccia, Cristian G. Gebhardt

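AI summary: Proposes a data-driven solving strategy that combines greedy optimization with the alternating direction method for elastic structures with nonlinear strains. It approximates the globally optimal solution better, at a higher computational cost.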

Abstract

In this work, we extend and generalize our solving strategy, first introduced in [1], based on a greedy optimization algorithm and the alternating direction method (ADM) for nonlinear systems computed with multiple load steps. In particular, we combine the greedy optimization algorithm with the direct data-driven solver based on ADM, which was first introduced in [2] and combined with the Newton-Raphson method for nonlinear elasticity in [3]. We numerically illustrate via one- and two-dimensional bar and truss structures exhibiting nonlinear strain measures and different constitutive datasets that our solving strategy generally achieves a better approximation of the globally optimal solution. This, however, comes at the expense of a higher computational cost, which scales with the number of "greedy" searches. Using this solving strategy, we reproduce the first cycle of the cyclic testing for a nylon rope that was performed at industrial testing facilities for mooring line manufacturers. We also numerically illustrate for a truss structure that our solving strategy generally improves the accuracy and robustness in cases of an unsymmetrical data distribution and noisy data.

2603.19225 2026-06-11 cs.CE cs.AI cs.CL cs.IR q-fin.CP Updated version

FinTradeBench: A Financial Reasoning Benchmark for LLMs

Yogesh Agrawal, Aniruddha Dutta, Md Mahadi Hasan, Santu Karmaker, Aritra Dutta

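AI summary: Proposes the FinTradeBench benchmark, which combines company fundamentals with trading signals to evaluate large language models on financial reasoning, and finds that retrieval augmentation offers limited help for numerical and time-series reasoning.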

Comments: 9 pages main text, 31 pages total (including references and appendix). 5 figures, 16 tables. Preprint under review. Code and data will be made available upon publication

Abstract

Real-world financial decision-making is a challenging problem that requires reasoning over heterogeneous signals, including company fundamentals derived from regulatory filings and trading signals computed from price dynamics. Recently, with advances in Large Language Models (LLMs), financial analysts have begun to use them for financial decision-making tasks. However, existing financial question-answering benchmarks for testing these models primarily focus on company balance sheet data and rarely evaluate reasoning about how company stocks trade in the market or their interactions with fundamentals. To leverage the strengths of both approaches, we introduce FinTradeBench, a benchmark for evaluating financial reasoning that integrates company fundamentals and trading signals. FinTradeBench contains 1,400 questions grounded in NASDAQ-100 companies over a ten-year historical window. The benchmark is organized into three reasoning categories: fundamentals-focused, trading-signal-focused, and hybrid questions requiring cross-signal reasoning. To ensure reliability at scale, we adopt a calibration-then-scaling framework that combines expert seed questions, multi-model response generation, intra-model self-filtering, numerical auditing, and human-LLM judge alignment. We evaluate 14 LLMs under zero-shot prompting and retrieval-augmented settings and observe a clear performance gap. Retrieval substantially improves reasoning over textual fundamentals, but provides limited benefit for trading-signal reasoning. These findings highlight fundamental challenges in numerical and time-series reasoning for current LLMs and motivate future research in financial intelligence.

2511.20216 2026-06-11 cs.AI cs.CE cs.CV cs.LG cs.RO

CostNav: A Navigation Benchmark for Real-World Economic-Cost Evaluation of Physical AI Agents

Haebin Seong, Sungmin Kim, Yongjun Cho, Myunchul Joe, Geunwoo Kim, Yubeen Park, Sunhoo Kim, Samwoo Seong, Yoonshik Kim, Suhwan Choi, Jaeyoon Jung, Jiyong Youn, Jinmyung Kwak, Sunghee Ahn, Jaemin Lee, Younggil Do, Seungyeop Yi, Woojin Cheong, Minhyeok Oh, Minchan Kim, Seongjae Kang, Youngjae Yu, Yunsung Lee

2602.13513 2026-06-11 math.OC cs.CE cs.LG cs.NA math.DS math.NA

Learning Gradient Flow: Using Equation Discovery to Accelerate Engineering Optimization

Grant Norman, Conor Rowan, Kurt Maute, Alireza Doostan

2601.23268 2026-06-11 cs.CE Updated version

TCBench: A Benchmark for Tropical Cyclone Track and Intensity Forecasting at the Global Scale

Milton Gomez, Marie McGraw, Saranya Ganesh S., Frederick Iat-Hin Tam, Ilia Azizi, Samuel Darmon, Monika Feldmann, Stella Bourdin, Louis Poulain--Auzéau, Suzana J. Camargo, Jonathan Lin, Dan Chavas, Chia-Ying Lee, Ritwik Gupta, Andrea Jenney, Tom Beucler

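AI summary: Proposes the TCBench benchmark for evaluating global 1-5 day tropical cyclone track and intensity forecasts. It brings together physics-based and AI models, provides deterministic and probabilistic metrics, and finds that AI models forecast tracks well, while intensity forecasts require post-processing or task-specific training.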

Comments: 28 Pages, Including SI

Abstract

TCBench is a benchmark for evaluating global, short to medium-range (1-5 days) forecasts of tropical cyclone (TC) track and intensity. To allow a fair and model-agnostic comparison, TCBench builds on the IBTrACS observational dataset and formulates TC forecasting as predicting the time evolution of an existing tropical system conditioned on its initial position and intensity. TCBench includes state-of-the-art physics-based (TIGGE) and Artificial Intelligence Weather Prediction (AIWP) models (AIFS, Pangu-Weather, FourCastNet v2, GenCast, FNV3). If not readily available (e.g., from the NOAA website as is done with TIGGE), TC tracks are consistently derived from model outputs using the TempestExtremes library. TCBench provides deterministic and probabilistic storm-following metrics. On 2023 test cases, AIWP models skillfully forecast TC tracks, while skillful intensity forecasts require additional steps such as post-processing or task-specific training. Designed for accessibility, TCBench helps AI practitioners tackle domain-relevant TC challenges and equips tropical meteorologists with data-driven tools and workflows to improve prediction and TC process understanding. By lowering barriers to reproducible, process-aware evaluation of extreme events, TCBench aims to democratize data-driven TC forecasting.
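
The abstract mentions deterministic storm-following metrics without defining them. A common deterministic track metric is the great-circle distance between forecast and observed storm centers at matching lead times. The sketch below computes it with the haversine formula. Whether TCBench uses exactly this formulation is an assumption, and the function names and sample coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def mean_track_error_km(forecast, observed):
    """Mean distance between paired forecast and observed (lat, lon) centers."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("tracks must be non-empty and of equal length")
    errors = [great_circle_km(*f, *o) for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)


# Hypothetical two-point tracks: exact at the first lead time,
# one degree of latitude off at the second.
forecast_track = [(15.0, -60.0), (17.0, -62.0)]
observed_track = [(15.0, -60.0), (16.0, -62.0)]
error_km = mean_track_error_km(forecast_track, observed_track)
```

One degree of latitude is about 111 km, so the mean error for this toy pair of tracks is roughly 56 km.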

2512.13302 2026-06-11 cs.CE physics.app-ph

On the impact of geometric variance on the performance of formed parts: A probabilistic approach on the example of airbag pressure bins

Lukas Schnelle, Niklas Fehlemann, Ali O. M. Kilicsoy, Niklas Bechler, Marcos A. Valdebenito, Yannis P. Korkolis, Matthias G. R. Faes, Sebastian Münstermann, Kai-Uwe Schröder

2208.10271 2026-06-11 cs.CR cs.CE

Agent-based Model of Initial Token Allocations: Evaluating Wealth Concentration in Fair Launches

Joaquin Delgado Fernandez, Tom Barbereau, Orestis Papageorgiou

2507.17012 2026-06-11 cs.AI cs.CE Updated version

Sustainability assessment using multimodal AI agents

Zhihan Zhang, Alexander Metzger, Yuxuan Mei, Felix Hähnlein, Zachary Englhardt, Tingyu Cheng, Gregory D. Abowd, Shwetak Patel, Adriana Schulz, Vikram Iyer

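AI summary: Proposes a multimodal multi-agent AI system that emulates the collaboration between life cycle assessment experts and stakeholders to automatically estimate the carbon footprint of electronic devices, cutting data collection time from weeks to under one minute, with results within 19% of expert LCAs.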

Comments: This article is published in Nature Electronics, and is available online at: this https URL

Abstract

Reducing the rapidly growing environmental impact of the computing industry requires assessing the emissions of electronics at scale. However, a traditional life cycle assessment (LCA) of an electronic device, which maps materials and processes to environmental impacts, often requires proprietary or unavailable data. Here, we reimagine conventional sustainability assessment by introducing a multimodal multi-agent AI system that emulates the collaborative process between LCA professionals and stakeholders (such as product managers and engineers) to automatically estimate the carbon footprint of electronic devices. The agents iteratively construct a complete life-cycle inventory by leveraging a structured data abstraction and software tools that mine information from the public internet, including repair communities and government regulatory databases. This reduces data gaps and data collection from weeks or months of expert time to under one minute. The system can calculate carbon footprint within 19% of expert LCAs with zero proprietary data (typical of the variation between human LCAs). We also show that by encoding domain-specific knowledge, environmental impact estimation can be reframed as a data-driven prediction task, in which both unknown products and emission factors are represented as weighted combinations of similar ones with known emissions.
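
The last sentence of the abstract describes unknown products as weighted combinations of similar ones with known emissions. A minimal sketch of that idea is a similarity-weighted average, shown below. How the paper derives its weights is not stated in the abstract, so the similarity scores, the footprint values, and the function name are all hypothetical.

```python
def estimate_emissions(similarities, known_emissions):
    """Similarity-weighted average of known footprints.

    `similarities` are non-negative scores between the unknown product and
    each reference product; `known_emissions` are the reference footprints
    in the same unit (for example kg CO2e).
    """
    if len(similarities) != len(known_emissions):
        raise ValueError("inputs must have the same length")
    total = sum(similarities)
    if total <= 0:
        raise ValueError("at least one similarity must be positive")
    weighted = sum(s * e for s, e in zip(similarities, known_emissions))
    return weighted / total


# Hypothetical reference products: one close match, one distant match.
estimate = estimate_emissions([3.0, 1.0], [10.0, 50.0])
```

The close match carries three quarters of the weight here, so the estimate of 20.0 sits much nearer to its footprint than to the distant one.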

2407.12618 2026-06-11 quant-ph cs.CE

A Brief Review of Quantum Machine Learning for Financial Services

Mina Doosti, Petros Wallden, Conor Brian Hamill, Robert Hankache, Oliver Thomson Brown, Chris Heunen