2606.16624 2026-06-16 cs.AI New submission

MR-GVNO: A Geometry-Aware Variational Physics-Informed Neural Operator for Mindlin-Reissner Plates on Irregular Domains

Siqi Wang, Daobo Sun, Yizheng Wang, Yilong Zhang, Yabin Jin, Xiaoying Zhuang, Timon Rabczuk

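AI summary: Proposes MR-GVNO, a geometry-aware variational neural operator that represents irregular geometries with boundary point clouds, fuses multiple physical-field inputs through a cross-attention mechanism, and is trained without supervision using a variational physics-informed loss based on the discretized total potential energy, enabling fast and accurate prediction for Mindlin-Reissner plate problems.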

Abstract

Plate and shell structures are widely used in engineering, making rapid response prediction under varying geometries, materials, and loads highly desirable. However, conventional finite element methods require repeated modeling and solution, resulting in high computational costs. This study proposes a geometry-aware variational neural operator for Mindlin-Reissner plate problems, termed MR-GVNO. The method uses boundary point clouds to represent irregular geometries and employs separate encoders for spatially varying material fields, pressure loads, and scalar physical parameters. A cross-attention mechanism integrates these inputs with query point information to predict transverse deflections and rotations at arbitrary locations. MR-GVNO is trained without labeled solution data using a variational physics-informed loss derived from the discretized total potential energy. It directly processes irregular point clouds and allows different physical fields to be discretized independently, avoiding interpolation onto a common grid. Numerical experiments on single-hole, double-hole, and L-shaped plates demonstrate accurate response prediction under homogeneous and heterogeneous materials and uniform and random loads. The model also achieves millisecond-level full-field inference and favorable cross-geometry generalization.
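
As a rough illustration of the kind of energy functional that such a variational loss discretizes, the textbook Mindlin-Reissner total potential energy for an isotropic plate can be written as below. The symbols ($w$ for deflection, $\boldsymbol{\theta}$ for rotations, $q$ for pressure, $h$ for thickness, $E$, $\nu$, $G$, and the shear correction factor $k_s$) and the sign convention for the shear strain are standard assumptions, not notation taken from the paper; with spatially varying materials, $E$ and $\nu$ would be functions of position.

```latex
\Pi(w, \boldsymbol{\theta})
  = \frac{1}{2}\int_{\Omega} \boldsymbol{\kappa}^{\mathsf{T}} \mathbf{D}_b \, \boldsymbol{\kappa} \, d\Omega
  + \frac{1}{2}\int_{\Omega} \boldsymbol{\gamma}^{\mathsf{T}} \mathbf{D}_s \, \boldsymbol{\gamma} \, d\Omega
  - \int_{\Omega} q \, w \, d\Omega,
\qquad
\boldsymbol{\kappa} =
  \begin{bmatrix} \theta_{x,x} \\ \theta_{y,y} \\ \theta_{x,y} + \theta_{y,x} \end{bmatrix},
\quad
\boldsymbol{\gamma} = \nabla w - \boldsymbol{\theta},
\\[4pt]
\mathbf{D}_b = \frac{E h^{3}}{12\,(1 - \nu^{2})}
  \begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \tfrac{1 - \nu}{2} \end{bmatrix},
\qquad
\mathbf{D}_s = k_s \, G \, h \, \mathbf{I}_{2},
\quad
G = \frac{E}{2\,(1 + \nu)}
```

Minimizing a quadrature approximation of $\Pi$ over the network outputs is one common way to obtain a label-free loss of this type.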

2606.16617 2026-06-16 cs.CL cond-mat.mtrl-sci cs.AI New submission

Sycophancy as Material Failure under Pushback Loading: A Multi-Axis Characterization Across Three Loading Cases and up to Seventeen Material Charges

Ferdinand M. Schessl

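AI summary: Adopts a materials-science framing that treats LLM sycophancy as material failure under pushback loading; using 14 axis-measurements and three loading cases (debate, false presuppositions, ethical setting) totaling 7800 specimens, it shows that failure modes depend on the loading type and finds differences in cross-judge reliability.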

Comments 12 pages, 3 figures. Code, data, and pre-registrations: https://github.com/FerdinandSchessl/sycophancy-note-companion

Abstract

Sycophancy in LLMs is documented across 70+ papers, but expert agreement on construct boundaries remains low (ICC=.184; Ye et al., 2026). The construct fragments because behavioral classification depends on which surface form is privileged. We adopt a materials-science framing: conversation as test specimen under load, LLM-model as material charge, pushback as progressive load, stance-flip as material failure. We characterize this failure across three loading cases (debate n=1000; false-presuppositions n=3400; ethical-setting n=3400; 10-17 material charges per case; 7800 specimens total) using 14 turn-level axis-measurements spanning velocity, damage accumulation, frame-drift, brittleness, and direction stability, plus three speaker-resolved axes from an independent pipeline. The measurements are Hooke-coupled ($\sigma = E \cdot \varepsilon$ analog) and reproduce across loading cases with effects up to $|r_{rb}| = 0.35$ on debate; the sign structure adds a second pattern: the ethical-setting case inverts the velocity and accumulation blocks. Variance composition partitions into two profiles: debate is charge-dominated (brittle-fracture-like: the material grade decides), false-presuppositions and ethical-setting are topic-dominated (creep-like: the load decides); the ratios (2.03 vs 0.13/0.17) are estimator-dependent, for debate even in direction. Cross-judge reliability (GPT-4o vs Haiku 4.5) shows debate scoring is judge-robust (Cohen's $\kappa = 0.88$) while false-presupposition scoring is judge-sensitive ($\kappa = 0.36$) -- a caveat single-judge benchmarks must report. This is the methodological move Ye et al.'s diagnosis calls for: a multi-axis characterization that does not depend on which surface form of the construct one privileges.
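
The cross-judge reliability figures above are Cohen's kappa values. A minimal sketch of how that statistic is computed for two judges follows; the function name `cohens_kappa` and the ten example verdicts are hypothetical and are not taken from the paper's code or data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both judges agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each judge's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both judges used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical verdicts ("flip" = stance flip) from two judges on 10 specimens.
judge_1 = ["flip", "flip", "hold", "hold", "flip", "hold", "hold", "flip", "hold", "hold"]
judge_2 = ["flip", "hold", "hold", "hold", "flip", "hold", "flip", "flip", "hold", "hold"]
print(round(cohens_kappa(judge_1, judge_2), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two judges can agree on most items and still score well below 1.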

2606.16612 2026-06-16 cs.SD cs.LG cs.MM New submission

Beyond Artifacts: Towards Generalizable Synthetic Song Detection via Music-Intrinsic Features

Yan Han, Zhibin Wen, Yuan Wang, Shuangrun Shao, Xiaobing Li, Yang Xu, Wei Li

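AI summary: Proposes the Sofia framework, which uses feature-specific experts and an adaptive mixture-of-experts model to exploit music-intrinsic features (vocals, audio effects, global structure) for synthetic song detection, improving F1 by 18.5 points on the MUSIC8K benchmark with strong robustness.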

Variance Reduction for Non-Log-Concave Sampling with Applications to Inverse Problems

M. Berk Sahin, Ahmet Ege Tanriverdi, Behzad Sharif, Abolfazl Hashemi

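AI summary: To address the high variance of stochastic gradients when sampling from non-log-concave distributions, presents a unified analysis of variance-reduction methods such as momentum, STORM, and PAGE, proves improved convergence rates in relative Fisher information and squared total variation distance, and extends the analysis to solving inverse problems with score-based generative priors.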

Comments Accepted to Uncertainty in Artificial Intelligence (UAI) 2026

Abstract

Sampling from high-dimensional, non-log-concave distributions with unnormalized densities is a fundamental challenge in machine learning, particularly when the exact gradient of the potential is unavailable and must be approximated via stochastic gradients that exhibit high variance under a fixed budget of gradient computations per iteration. Although variance reduction techniques such as SGD with momentum, STORM, and PAGE have demonstrated improved convergence properties in non-convex optimization, their implications for sampling from non-log-concave distributions remain largely unexplored. In this work, we develop the first unified analysis of these estimators for sampling from non-log-concave distributions. We establish improved non-asymptotic convergence rates in $\varepsilon$-relative Fisher information and, under a Poincaré inequality assumption, in squared total variation distance, and further prove weak convergence to the target distribution. We extend our analysis to solving inverse problems with score-based generative priors. We empirically validate our theory and demonstrate that, under a fixed budget of gradient computations per iteration, variance-reduction techniques consistently improve sample quality in two standard imaging applications.
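
For readers unfamiliar with PAGE, the sketch below shows the estimator as it is usually stated in the optimization literature: with probability `p` take a fresh minibatch gradient, otherwise reuse the previous estimate and correct it with minibatch gradient differences. The toy components `f_i(x) = 0.5 * (x - a_i)**2`, the function names, and the use of one batch size for both branches are illustrative assumptions; the paper's sampling scheme is not reproduced here.

```python
import random


def grad_i(x, a_i):
    """Gradient of the toy component f_i(x) = 0.5 * (x - a_i)**2."""
    return x - a_i


def page_estimator(g_prev, x_prev, x_new, data, p, batch_size, rng):
    """One PAGE update of the gradient estimate after moving x_prev -> x_new."""
    batch = rng.sample(data, batch_size)
    if rng.random() < p:
        # Refresh branch: plain minibatch gradient at the new point.
        return sum(grad_i(x_new, a) for a in batch) / batch_size
    # Reuse branch: previous estimate plus averaged gradient differences,
    # with both gradients evaluated on the same minibatch.
    diff = sum(grad_i(x_new, a) - grad_i(x_prev, a) for a in batch) / batch_size
    return g_prev + diff


rng = random.Random(0)
data = [1.0, 2.0, 3.0, 6.0]        # the mean is 3.0, so the full gradient is x - 3.0
x0 = 0.0
g0 = x0 - 3.0                      # start from the exact full gradient
x1 = x0 - 0.5 * g0                 # one gradient step with step size 0.5
g1 = page_estimator(g0, x0, x1, data, p=0.0, batch_size=2, rng=rng)
print(g1, x1 - 3.0)                # estimate vs. exact gradient at x1
```

On these quadratics every component has the same curvature, so the reuse branch tracks the full gradient exactly; on general objectives it is only an unbiased-in-difference correction whose variance shrinks as consecutive iterates get closer.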

2606.16226 2026-06-16 cs.LG New submission

Prediction of Runtime Parameters of Parallel Chemistry Applications via Active and Generative Learning

Tanzila Tabassum, Omer Subasi, Ajay Panyala, Epiya Ebiapia, Gerald Baumgartner, Erdal Mutlu, P Sadayappan, Karol Kowalski

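AI summary: Proposes a machine learning approach based on active and generative learning, combined with gradient-boosted regression tree models, to predict runtime parameters of parallel chemistry computations, achieving a MAPE as low as 0.023 and an R² of up to 99.9% on CCSD calculations.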

2606.16160 2026-06-16 cs.LG cs.AI cs.HC New submission

A comparative and critical study of EEGNet for fNIRS-driven cognitive load classification

Mehshan Ahmed Khan, Houshyar Asadi, Li Zhang, Mohammad Reza Chalak Qazani, Ghazal Bargshady, Stefanos Gkikas, Christian Arzate, Sam Oladazimi, Zoran Najdovsk, Lei Wei, Chee Peng Lim

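AI summary: This study systematically evaluates EEGNet for fNIRS-based cognitive load classification. Overlapping segmentation with a small fixed learning rate performs best under random splits, but accuracy drops sharply under subject-independent (SI) evaluation; non-overlapping segmentation with PCA features achieves the best SI accuracy of 56.11%, suggesting that removing temporal redundancy helps learn more robust cross-subject representations.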

CRIS: Cross-Plane Self-Supervised Isotropic Restoration for Cross-Modality Anisotropic Volumetric Imaging

Adi Ahituv, Anat Ilivitzki, Moti Freiman

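AI summary: Proposes CRIS, a cross-plane self-supervised framework that needs no paired isotropic ground truth and achieves 3D isotropic restoration through 2D stripe completion on orthogonal reformats, outperforming interpolation and several other methods on MRI and volume electron microscopy.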

Comments 22 pages, 8 figures, supplementary material included. Submitted to Medical Image Analysis

Abstract

Anisotropic volumetric acquisitions are common in clinical MRI and volume electron microscopy (vEM), where sparse through-plane sampling creates thick slices or sections that degrade orthogonal reformats and downstream analysis. We present CRIS, a cross-plane self-supervised framework for isotropic restoration without paired isotropic ground truth. CRIS casts 3D restoration as 2D stripe completion on orthogonal reformats of an isotropic grid: high-resolution in-plane slices are synthetically degraded and periodically masked for training, while at inference blank slices define the isotropic grid, two orthogonal reformats are restored, and predictions are fused by multi-view averaging. We evaluate CRIS on two MRI cohorts and two microscopy benchmarks up to 8x anisotropy. On brain MRI, CRIS achieves 32.921 +/- 0.436 dB PSNR and 0.9631 +/- 0.0027 SSIM, outperforming interpolation, SMORE4, SIMPLE, SA-INR, and ATME, and gives the best segmentation consistency (Dice 0.940 +/- 0.004, ASSD 0.245 +/- 0.014 mm, HD99 1.275 +/- 0.061 mm). On reference-free abdominal MRI, CRIS reduces FID/KID to 48.714/0.023. On vEM, CRIS outperforms interpolation, NIIV, and vEMINR, reaching 29.133 dB/0.834 3D PSNR/SSIM at 4x, 27.123 dB/0.734 on EPFL at 8x, and 21.915 dB/0.699 on noisy hemibrain data. In a robustness experiment, one variable-gap CRIS model evaluated across gap factors 3--7 and coronal, axial, and sagittal degradations maintained higher PSNR/SSIM than interpolation (36.36--31.14 dB and 0.977--0.932 vs. 33.07--27.85 dB and 0.951--0.853). These results support CRIS as a modality-flexible route to isotropic restoration without paired isotropic targets or configuration-specific retraining. Code is available at https://github.com/adi-hatav/CRIS.
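
The PSNR values quoted above follow the usual definition, 10 log10(range² / MSE). A minimal sketch is given below; the function name `psnr`, the flat-list input, and the default `data_range=1.0` are assumptions for illustration, since the abstract does not state the intensity range or evaluation code used.

```python
import math


def psnr(reference, restored, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length intensity lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("need two equal-length, non-empty intensity lists")
    # Mean squared error over all voxels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical volumes
    return 10.0 * math.log10(data_range ** 2 / mse)


# Hypothetical intensities normalized to [0, 1].
reference = [0.0, 0.5, 1.0, 0.5]
restored = [0.1, 0.5, 0.9, 0.5]
print(round(psnr(reference, restored), 2))
```

Because PSNR is logarithmic in the error, a gain of about 3 dB corresponds to roughly halving the mean squared error.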

2606.15940 2026-06-16 cs.LG New submission

Causal-Privacy Audit Workflow for Synthetic and Distilled Data in Dropout Support

Hanghang Zheng, Xiwei Zhuang, Zhong Wang, Hong Liu, Xiao Chen, Jingwen He, Xia Li

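AI summary: Proposes the CaP-Eval workflow, which audits the predictive utility, causal fidelity, and privacy risk of synthetic student data under a fixed estimand, and finds that DPGNet and distilled data preserve treatment-effect structure better than baseline methods.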

Abstract

Synthetic and distilled student data are increasingly used to enable privacy-conscious learning analytics, yet their suitability for decision-facing institutional support remains uncertain. In dropout support, generated data must preserve not only predictive utility or distributional resemblance, but also the financial-status evidence used to guide advising, payment-plan assistance, and scholarship-related decisions. Method: This study introduces CaP-Eval, a decision-facing causal-privacy audit workflow for evaluating generated student data under a fixed estimand, timing-aware adjustment design, estimator set, and empirical privacy-governance screen. The workflow compares original, distilled, adversarial synthetic, statistical synthetic, and DPGNet privacy-oriented generated data on predictive utility, treatment-effect fidelity, robustness to alternative estimators, and local training-record proximity. Results: DPGNet and distilled data preserved the original financial-status treatment-effect structure more reliably than the adversarial and Gaussian Copula baselines. DPGNet preserved full direction and rank agreement across epsilon levels; epsilon = 10 produced the smallest non-original IPW and DML deviations, while epsilon = 1 and epsilon = 5 amplified several financial-status contrasts. Distilled data remained highly faithful but retained the strongest local training-record proximity signal. TabularGNet preserved qualitative directions with moderate attenuation, and Gaussian Copula compressed effect magnitudes. Conclusions: Predictive utility, privacy orientation, empirical disclosure signals, and causal fidelity diverged; generated student data require joint audits of direction, magnitude, overlap, and release-governance risk before decision use.

2606.15930 2026-06-16 cs.RO cs.AI New submission

ControlMap: Controllable High-Definition Map Generation for Traffic Scenario Simulation

Marwan Farag, Steffen Wäldele, Yu Yao

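AI summary: Proposes a data-driven pipeline based on latent diffusion and ControlNet for controllable HD map generation, supporting spatial guidance, adjustable conditioning strength, and city-level style transfer, and introduces new metrics to evaluate adherence to the control signal and map realism.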

Abstract

Simulation is central to validating autonomous driving systems, yet current pipelines are limited by insufficient scenario diversity due to costly High Definition (HD) map creation. Scaling HD maps requires expensive data collection and manual processing. Moreover, existing generative models lack the fine-grained control necessary to target specific road topologies during generation. This paper presents a data-driven pipeline for controllable HD map generation using latent diffusion and ControlNet for spatial conditioning. To our knowledge, we are the first to inject spatial guidance signals into a diffusion model for HD map synthesis. Furthermore, our model supports adjustable conditioning strength through classifier-free guidance and city-level style transfer via city label conditioning. To complement existing metrics, we introduce two novel metrics to evaluate adherence to the control signal and similarity to ground-truth maps. Experiments demonstrate that our model generates realistic HD maps that faithfully follow input road topologies while accurately preserving city-specific details.
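
Adjustable conditioning strength through classifier-free guidance usually means blending a conditional and an unconditional noise prediction with a scalar weight. The sketch below uses the common form `uncond + scale * (cond - uncond)`; the function name `cfg_combine` and the list-based inputs are hypothetical, and the exact parameterization used in the paper is not stated in the abstract.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and conditional noise predictions element-wise."""
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    # scale = 0 ignores the condition, scale = 1 is the plain conditional
    # prediction, and scale > 1 extrapolates past it.
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Hypothetical two-element noise predictions.
eps_u = [0.0, 1.0]
eps_c = [1.0, 3.0]
for scale in (0.0, 1.0, 2.0):
    print(scale, cfg_combine(eps_u, eps_c, scale))
```

Raising the scale trades sample diversity for closer adherence to the conditioning signal, which is the knob the abstract refers to.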

2606.15897 2026-06-16 cs.LG cs.AI stat.ML New submission

Topological Flow Matching

Kacper Wyrwal, İsmail İlkan Ceylan, Alexander Tong

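AI summary: Proposes topological flow matching, which augments the reference process with a Laplacian-derived drift to capture the topology of the underlying domain while retaining the stability and simulation-free objective of flow matching; it is suited to structured data such as brain fMRI and ocean currents.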

Comments Accepted at ICLR 2026. 26 pages, 24 figures. Code: https://github.com/KacperWyrwal/topological-flow-matching

Abstract

Flow matching is a powerful generative modeling framework, valued for its simplicity and strong empirical performance. However, its standard formulation treats signals on structured spaces, such as fMRI data on brain graphs, as points in Euclidean space, overlooking the rich topological features of their domains. To address this, we introduce topological flow matching, a topology-aware generalization of flow matching. We interpret flow matching as a framework for solving a degenerate Schrödinger bridge problem and inject topological information by augmenting the reference process with a Laplacian-derived drift. This principled modification captures the structure of the underlying domain while preserving the desirable properties of flow matching: a stable, simulation-free objective and deterministic sample paths. As a result, our framework serves as a drop-in replacement for standard flow matching. We demonstrate its effectiveness on diverse structured datasets, including brain fMRIs, ocean currents, seismic events, and traffic flows.

2606.15872 2026-06-16 cs.CL New submission

SciOrch: Learning to Orchestrate Expert LLMs for Solving Frontier Multimodal Scientific Reasoning Tasks

Jingru Guo, Xiangyuan Xue, Lian Zhang, Wanghan Xu, Siki Chen, Philip Torr, Wanli Ouyang, Lei Bai, Zhenfei Yin

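AI summary: Proposes the SciOrch framework, which trains a lightweight 8B model to orchestrate multiple frontier LLMs, optimized with MCTS and GRPO, and outperforms the strongest single model and multi-agent baselines on scientific reasoning tasks.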

Abstract

Frontier scientific reasoning remains a major challenge for large language models (LLMs), where even the strongest commercial systems fall short of expert-level performance. A closer look at model behavior reveals substantial complementarity that single-model evaluation hides: different frontier models excel on different question types, and no single model captures the full picture. We present SciOrch, a framework that trains a lightweight 8B model to orchestrate frontier LLMs for scientific reasoning. The orchestrator decomposes each question, delegates sub-problems to selected commercial models through API calls, and synthesizes a final answer. Training such an orchestrator is fundamentally harder than conventional agentic RL: each action triggers an API call that is expensive in both dollar cost and latency, making standard online rollouts infeasible. We address this with an MCTS-based approach, producing diverse orchestration trajectories, extracting per-node single-turn samples, and optimizing the orchestrator with GRPO-style training. On a 240-question test set spanning SGI-Reasoning and Scientists' First Exam, SciOrch reaches 56.66% average accuracy, outperforming the strongest single commercial model by 3.74% and the strongest multi-agent baseline by 3.33%. It also attains the best accuracy on both SGI and SFE with less than half the API cost of typical multi-agent methods.
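
GRPO-style training typically scores each sampled response relative to the other responses drawn for the same prompt, rather than against a learned value function. A minimal sketch of that group-relative advantage follows; the function name, the population standard deviation, and the `eps` stabilizer are assumptions, and the paper's exact variant is not given in the abstract.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and spread of its own sample group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division finite when every sample earns the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four orchestration samples of one question.
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))
```

When all samples in a group receive the same reward the advantages are zero, so such groups contribute no gradient signal.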

2606.15857 2026-06-16 cs.CV New submission

A Dual-Branch Collaborative Framework for Joint Optimization of Underwater Image Enhancement and Object Detection

Liyuan Cao, Zheng Liu, Guanghao Liao, Yonghui Yang, Qi Li

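AI summary: Proposes a dual-branch underwater image enhancement framework whose detail enhancement and color restoration branches respectively improve texture detail and correct color cast, raising visual quality while balancing detection performance and efficiency; on the URPC dataset it improves YOLOv8 mAP50 by 2.1%.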

Abstract

Due to wavelength-dependent light absorption and scattering, underwater images usually suffer from color distortion and blurred details, which limits underwater object detection performance. Existing underwater image enhancement methods mainly focus on improving visual quality, and it remains difficult to balance enhancement quality, processing efficiency, and downstream detection performance. Therefore, this paper proposes an efficient dual-branch underwater image enhancement framework for object detection. The detail enhancement branch improves brightness and local contrast to recover texture details in dark regions. The color restoration branch uses adaptive compensation to reduce color distortion and improve color gradation. By combining the complementary outputs of the two branches, the proposed framework provides clearer and more informative images for object detection. On the UIEB and EUVP datasets, the proposed method achieves UIQM scores of 2.249 and 2.576, respectively. When applied to the YOLOv8 detection task on the URPC dataset, the proposed method improves mAP50 by 2.1% compared with the baseline. Extensive experiments show that our method improves object detection in complex underwater scenes while balancing enhancement quality and processing efficiency.

2606.15832 2026-06-16 cs.LG math.OC New submission

SILAGE: Memory-Efficient, Full-Gradient-Free Nonconvex Optimization for Nested Finite Sums

Igor Sokolov, Laurent Condat, Peter Richtárik

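AI summary: For nonconvex optimization with a nested double finite-sum structure in large-scale data, proposes the SILAGE algorithm, which exploits the double-sum structure to avoid global full-gradient refreshes, requires only O(n) memory, and has a convergence analysis that adapts to across-group and within-group heterogeneity.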

Comments 80 pages, 3 algorithms, 4 theorems, 2 corollaries, 11 lemmas, 2 figures, 12 tables

Abstract

Empirical risk minimization on massive datasets naturally exhibits a nested double finite-sum structure, where $N=nm$ total samples are logically or physically partitioned into $n$ blocks of size $m$ (e.g., in pooled data silos, out-of-core learning, or deliberate stratification). While variance-reduced methods achieve optimal oracle complexities for nonconvex objectives, they suffer from severe scaling bottlenecks in this centralized regime. Recursive estimators, such as PAGE, require periodic global full-gradient refreshes over all $nm$ samples, which are computationally expensive. Conversely, single-loop methods, such as SILVER, avoid such refreshes but require an impractical $\mathcal{O}(nm)$ memory footprint to store a control variate for every sample. In this paper, we propose SILAGE, a variance-reduced algorithm that addresses this trade-off. By actively exploiting the double-sum structure, SILAGE eliminates periodic global full-gradient refreshes over all $nm$ components (evaluating at most one local group gradient per iteration) while requiring only $\mathcal{O}(n)$ memory. Furthermore, we provide a tight convergence analysis that avoids pessimistic worst-case Lipschitz constants. Instead, SILAGE's complexity natively adapts to the underlying data geometry via nested functional similarities: across-group ($δ_1$) and within-group ($δ_2$) heterogeneity. Our results improve existing state-of-the-art bounds in several practically relevant regimes.
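
The nested double finite-sum structure described above can be written out explicitly. The component notation $f_{ij}$ and the group averages $f_i$ are assumed here for illustration and may differ from the paper's own symbols; only $n$, $m$, and $N = nm$ come from the abstract.

```latex
\min_{x \in \mathbb{R}^{d}} \; f(x)
  = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
\qquad
f_i(x) = \frac{1}{m} \sum_{j=1}^{m} f_{ij}(x),
\qquad
N = nm
```

In this notation, a global full-gradient refresh touches all $nm$ components $\nabla f_{ij}$, a local group gradient $\nabla f_i$ touches only $m$ of them, and keeping one stored vector per group rather than per sample is what brings memory from $\mathcal{O}(nm)$ down to $\mathcal{O}(n)$ vectors.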
