arXivDaily arXiv每日学术速递 周一至周五更新
全部学科分类 2075
专题追踪
2605.13434 2026-05-14 cs.LG cs.DC math.OC stat.ML

Rescaled Asynchronous SGD: Optimal Distributed Optimization under Data and System Heterogeneity

Ammar Mahran, Artavazd Maranjyan, Peter Richtárik

AI总结 本文研究了在数据和系统异构环境下分布式学习中的异步随机梯度下降(ASGD)方法。传统ASGD因未考虑不同工作节点的计算速度差异,导致模型更新偏向于局部目标的频率加权平均,而非全局目标。本文提出了一种名为Rescaled ASGD的新方法,通过按各节点计算时间比例调整步长,使得每个节点在周期内对模型的总学习率贡献相同,从而恢复对全局目标的正确优化。理论分析表明,该方法在非凸设置下能够收敛到全局目标的平稳点,且时间复杂度达到已知下界,实验验证了其有效性与先进性。

详情
英文摘要

Asynchronous stochastic gradient descent (ASGD) is a standard way to exploit heterogeneous compute resources in distributed learning: instead of forcing fast workers to wait for slow ones, the server updates the model whenever a gradient arrives. Vanilla ASGD applies each arriving gradient with the same weight. When local data distributions are heterogeneous, this becomes problematic: faster workers contribute more updates, and we show theoretically that the method is biased toward a frequency-weighted average of the local objectives rather than the desired global objective. Existing remedies typically move away from the simple ASGD template by introducing gathering phases, buffering, or extra memory. We show that this is unnecessary. Keeping the standard ASGD mechanism, we recover the correct objective by rescaling worker-specific stepsizes in proportion to their computation times, so that each worker contributes the same aggregate learning rate over a cycle. In the non-convex setting, under smoothness and bounded heterogeneity assumptions, we prove that the resulting method, Rescaled ASGD, converges to stationary points of the correct global objective in the fixed-computation model. Its time complexity matches the known lower bound in the leading term, while the effects of staleness and data heterogeneity appear only in lower-order terms. Experiments confirm that the method converges to the correct objective and is competitive with state-of-the-art baselines.

2605.13402 2026-05-14 cs.CV cs.DS

Fast and Compact Graph Cuts for the Boykov-Kolmogorov Algorithm

Christian Møller Mikkelstrup, Anders Bjorholm Dahl, Philip Bille, Vedrana Andersen Dahl, Inge Li Gørtz

AI总结 本文研究了Boykov-Kolmogorov(BK)算法在计算最小$s$-$t$割问题中的性能优化,提出了改进的理论分析和新的快速紧凑算法(fcBK),将时间复杂度从$O(mn|C|)$降低至$O(m|C|)$。此外,作者设计了一种紧凑的图表示方法,使得算法能够在有限内存下处理包含数十亿顶点和万亿边的大规模图。实验表明,该实现是目前BK算法中最高效的实现,突显了内存效率在大规模图割计算中的重要性。

Comments 15 pages, 6 figures, submitted to the IEEE for possible publication

详情
英文摘要

Computing a minimum $s$-$t$ cut in a graph is a solution to a wide range of computer vision problems, and is often done using the Boykov-Kolmogorov (BK) algorithm. In this paper, we revisit the BK algorithm from both a theoretical and practical point of view. We improve the analysis of the time complexity of the BK algorithm to $O(mn|C|)$ and propose a new algorithm, the fast and compact BK (fcBK) algorithm, with a time complexity of $O(m|C|)$, where $m$, $n$, and $|C|$ are the number of edges, number of vertices, and the capacity of the cut, respectively. We additionally propose a compact graph representation that allows our implementation to find a minimum $s$-$t$ cut in a graph with upwards of $10^9$ vertices and $10^{10}$ edges on a machine with 128 GB of memory. We find our implementation of the BK algorithm to be the fastest available implementation of the BK algorithm when evaluating on a comprehensive set of benchmark datasets, highlighting the importance of memory-efficient implementations. We make our implementations publicly available for further research and implementation development within minimum $s$-$t$ cut algorithms.

2604.28045 2026-05-14 cs.CV

TAFA-GSGC: Group-wise Scalable Point Cloud Geometry Compression with Progressive Residual Refinement

Xiumei Li, Alexander Kopte, André Kaup

AI总结 本文提出了一种名为TAFA-GSGC的可扩展点云几何压缩方法,能够在单一比特流和单一训练模型下实现多质量解码。该方法结合了分层残差细化与通道组熵编码,并引入了目标对齐特征聚合模块以减少增强残差中的跨层冗余。实验表明,TAFA-GSGC在保持良好压缩效率的同时,支持多达9个解码质量等级,并在D1-PSNR和D2-PSNR指标上分别实现了4.99%和5.92%的比特率降低。

Comments Accepted at IEEE International Conference on Image Processing (ICIP) 2026

详情
英文摘要

Scalable compression is essential for bandwidth-adaptive transmission, yet most learned codecs are optimized for a fixed rate-distortion point, making rate adaptation costly due to re-encoding or maintaining multiple bitstreams. In this work, we propose TAFA-GSGC, a scalable learned point cloud geometry codec that enables multi-quality decoding from a single bitstream and a single trained model. TAFA-GSGC combines layered residual refinement with channel-group entropy coding, and introduces a Target-Aligned Feature Aggregation module to reduce cross-layer redundancy in enhancement residuals. Our framework supports up to 9 decodable quality levels with monotonic quality improvement as more subbitstreams are received, while maintaining strong compression efficiency. Compared with the PCGCv2 baseline, TAFA-GSGC demonstrates improved RD performance, achieving average BD-rate reductions of 4.99% and 5.92% in terms of D1-PSNR and D2-PSNR, respectively.

2604.10720 2026-05-14 cs.AI cs.CL cs.CY

Teaching Language Models How to Code Like Learners: Conversational Serialization for Student Simulation

Charles Koutcheme, Juho Leinonen, Arto Hellas

AI总结 本文提出了一种训练开放权重的编程学习模拟模型的新框架,通过将真实学生的学习过程数据转化为对话形式,模拟学生与自动评估系统之间的交互过程。该方法结合了监督微调和偏好优化,使模型能够更贴近真实学生的调试行为。实验表明,该方法在功能对齐和代码相似性方面优于传统仅基于代码的模型和提示生成的大语言模型。

Comments 8 pages, 2 figures, 2 tables. Accepted to Educational Data Mining 2026

详情
英文摘要

Artificial students -- models that simulate how learners act and respond within educational systems -- are a promising tool for evaluating tutoring strategies and feedback mechanisms at scale. However, most existing approaches rely on prompting large, proprietary language models, limiting adaptability to specific courses and raising concerns around privacy, cost, and dependence. In this work, we propose a framework for training open-weight artificial programming learners directly from authentic student process data. Our approach serializes temporal log traces into a conversational format, representing each student's problem-solving process as a dialogue between the learner and their automated assessment system. Student code submissions and environment feedback, such as test outcomes, grades, and error traces, form alternating conversational turns, enabling models to learn from the iterative debugging process. We additionally introduce a training pipeline combining supervised fine-tuning with preference optimization to align models with authentic student debugging behavior. We evaluate our framework by training Qwen models at 4B and 8B scales on a large-scale dataset of real student submissions to Python programming assignments. Our results show that incorporating environment feedback strengthens models' ability to replicate student debugging behavior, improving over both prior code-only approaches and prompted large language models baselines in functional alignment and code similarity. We release our code to support reproducibility.

2604.10634 2026-05-14 cs.CV

NTIRE 2026 The Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

Xin Li, Yeying Jin, Suhang Yao, Beibei Lin, Zhaoxin Fan, Wending Yan, Xin Jin, Zongwei Wu, Bingchen Li, Peishu Shi, Yufei Wang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Runzhe Li, Kui Jiang, Zhaocheng Yu, Yiang Chen, Junjun Jiang, Xianming Liu, Hongde Gu, Zeliang Li, Mache You, Jiangxin Dong, Jinshan Pan, Qiyu Rong, Bowen Shao, Hongyuan Jing, Mengmeng Zhang, Bo Ding, Hui Zhang, Yi Ren, Mohab Kishawy, Jun Chen, Anh-Kiet Duong, Petra Gomez-Kramer, Jean-Michel Carozza, Wangzhi Xing, Xin Lu, Enxuan Gu, Jingxi Zhang, Diqi Chen, Qiaosi Yi, Bingcai Wei, Wenjie Li, Bowen Tie, Heng Guo, Zhanyu Ma, Jiachen Tu, Guoyi Xu, Yaoxin Jiang, Cici Liu, Yaokun Shi, Paula Garrido Mellado, Daniel Feijoo, Alvaro Garcia Lara, Marcos V. Conde, Zhidong Zhu, Bangshu Xiong, Qiaofeng Ou, Zhibo Rao, Wei Li, Zida Zhang, Hui Geng, Qisheng Xu, Xuyao Deng, Changjian Wang, Kele Xu, Guanglu Dong, Qiyao Zhao, Tianheng Zheng, Chunlei Li, Lichao Mou, Chao Ren, Chang-De Peng, Chieh-Yu Tsai, Guan-Cheng Liu, Li-Wei Kang, Abhishek Rajak, Milan Kumar Singh, Ankit Kumar, Dimple Sonone, Kishor Upla, Kiran Raja, Huilin Zhao, Xing Xu, Chuan Chen, Yeming Lao, Wenjing Xun, Li Yang, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Hao Yang, Ruikun Zhang, Liyuan Pan

AI总结 本文介绍了NTIRE 2026第二届昼夜雨滴去除双焦点图像挑战赛的整体情况。该挑战基于真实场景下的Raindrop Clarity数据集,旨在建立一个在不同光照和对焦条件下具有良好实用性的雨滴去除基准。本次挑战吸引了168支队伍参与,其中17支队伍提交了最终方案,并在测试集上取得了较好的性能,展示了该领域技术的持续进步。

Comments Accepted by CVPR2026 Workshop; NTIRE 2026 Challenge Report

详情
英文摘要

This paper presents an overview of the NTIRE 2026 Second Challenge on Day and Night Raindrop Removal for Dual-Focused Images. Building upon the success of the first edition, this challenge attracted a wide range of impressive solutions, all developed and evaluated on our real-world Raindrop Clarity dataset~\cite{jin2024raindrop}. For this edition, we adjust the dataset with 14,139 images for training, 407 images for validation, and 593 images for testing. The primary goal of this challenge is to establish a strong and practical benchmark for the removal of raindrops under various illumination and focus conditions. In total, 168 teams have registered for the competition, and 17 teams submitted valid final solutions and fact sheets for the testing phase. The submitted methods achieved strong performance on the Raindrop Clarity dataset, demonstrating the growing progress in this challenging task.

2402.15415 2026-05-14 cs.LG math.DS stat.ML

Understanding Catastrophic Forgetting In LoRA via Mean-Field Attention Dynamics

Hugo Koubbi, Louis Hernandez, Matthieu Boussard

AI总结 本文研究了LoRA(低秩适配)方法在微调过程中出现的灾难性遗忘问题,通过构建一个可解析的均场自注意力玩具模型,将令牌视为相互作用的粒子系统,并将LoRA视为低秩扰动。利用偏微分方程和动力系统理论,揭示了遗忘行为与非遗忘行为之间的相变机制,并分析了扰动大小和模型深度对遗忘的影响,同时通过实验验证了理论预测。

Comments New version accepted at ICML 2026, with new results and without previous results

详情
英文摘要

Low-Rank Adaptation (LoRA) is the dominant parameter-efficient fine-tuning method due to its favorable compute-performance trade-off, yet it suffers from catastrophic forgetting. We study forgetting through a tractable _mean-field self-attention_ toy model, where tokens evolve as an interacting particle system and LoRA acts as a low-rank perturbation. Using tools from partial differential equations and dynamical systems, we characterize regimes suggesting a phase transition between forgetting and non-forgetting behavior. We show that one phase transition appears with respect to the norm of the perturbation, and the other with respect to the depth of the Transformers. We further bound the time-to-deviation in terms of the perturbation size and spectral quantities, and corroborate the predicted trends with experiments and exploratory analyses on real models under LoRA fine-tuning.

2210.09114 2026-05-14 cs.RO

INSANE: Cross-Domain UAV Data Sets with Increased Number of Sensors for developing Advanced and Novel Estimators

Christian Brommer, Alessandro Fornasier, Martin Scheiber, Jeff Delaune, Roland Brockers, Jan Steinbrener, Stephan Weiss

AI总结 本文提出了一种名为INSANE的跨领域无人机数据集,旨在支持自主移动机器人在复杂动态环境中的高精度定位研究。该数据集包含多种场景和不同难度级别的飞行轨迹,涵盖室内运动捕捉环境、室内外过渡飞行以及模拟火星环境的挑战性任务,提供了丰富的传感器数据和高精度真实值。数据集配备了多种传感器,包括多个惯性测量单元和摄像头,并支持基于机器学习的传感器信号增强方法研究。

Comments V2 with added dataset comparison tables

Journal ref Int. J. Robot. Res. 43 (2024) 1083-1113

详情
英文摘要

For real-world applications, autonomous mobile robotic platforms must be capable of navigating safely in a multitude of different and dynamic environments with accurate and robust localization being a key prerequisite. To support further research in this domain, we present the INSANE data sets - a collection of versatile Micro Aerial Vehicle (MAV) data sets for cross-environment localization. The data sets provide various scenarios with multiple stages of difficulty for localization methods. These scenarios range from trajectories in the controlled environment of an indoor motion capture facility, to experiments where the vehicle performs an outdoor maneuver and transitions into a building, requiring changes of sensor modalities, up to purely outdoor flight maneuvers in a challenging Mars analog environment to simulate scenarios which current and future Mars helicopters would need to perform. The presented work aims to provide data that reflects real-world scenarios and sensor effects. The extensive sensor suite includes various sensor categories, including multiple Inertial Measurement Units (IMUs) and cameras. Sensor data is made available as raw measurements and each data set provides highly accurate ground truth, including the outdoor experiments where a dual Real-Time Kinematic (RTK) Global Navigation Satellite System (GNSS) setup provides sub-degree and centimeter accuracy (1-sigma). The sensor suite also includes a dedicated high-rate IMU to capture all the vibration dynamics of the vehicle during flight to support research on novel machine learning-based sensor signal enhancement methods for improved localization. The data sets and post-processing tools are available at: https://sst.aau.at/cns/datasets

1903.00745 2026-05-14 cs.AI cs.LO cs.RO

A Formal Framework for Robot Construction Problems: A Hybrid Planning Approach

Faseeh Ahmad, Esra Erdem, Volkan Patoglu

AI总结 本文研究了由多个自主机器人协作堆叠预制模块构建稳定结构的机器人建造问题,该问题因动作的连锁效应、真正的并发操作以及结构稳定性和模块支撑性要求而具有挑战性。作者提出了一种基于答案集编程的混合规划框架,能够同时确定最终稳定结构配置并规划多机器人操作顺序,确保每一步部分结构的稳定性与支撑性。该方法在理论上有严格的正确性与完备性保证,并通过多个具有挑战性的建造实例验证了其有效性与实用性。

Comments 8 pages (double-column), 7 figures

详情
英文摘要

We study robot construction problems where multiple autonomous robots rearrange stacks of prefabricated blocks to build stable structures. These problems are challenging due to ramifications of actions, true concurrency, and requirements of supportedness of blocks by other blocks and stability of the structure at all times. We propose a formal hybrid planning framework to solve a wide range of robot construction problems, based on Answer Set Programming. This framework not only decides for a stable final configuration of the structure, but also computes the order of manipulation tasks for multiple autonomous robots to build the structure from an initial configuration, while simultaneously ensuring the stability, supportedness and other desired properties of the partial construction at each step of the plan. We prove the soundness and completeness of our formal method with respect to these properties. We introduce a set of challenging robot construction benchmark instances, including bridge building and stack overhanging scenarios, discuss the usefulness of our framework over these instances, and demonstrate the applicability of our method using a bimanual Baxter robot.

2605.12805 2026-05-14 cs.LG cs.AI

Discrete MeanFlow: One-Step Generation via Conditional Transition Kernels

Fairoz Nower Khan, Nabuat Zaman Nahim, Md Sajid Ahmed, Ruiquan Huang, Peizhong Ju

AI总结 该论文提出了一种名为 Discrete MeanFlow 的新方法,用于在离散状态空间中实现一步生成。与连续空间中的 MeanFlow 不同,它通过连续时间马尔可夫链的条件转移核来建模概率质量的转移,并定义了一个平均离散速率来衡量转移概率在时间区间内的变化。该方法通过边界构建设计直接参数化转移核,确保生成过程无需迭代去噪或微分方程求解,只需一次前向传播和分类采样即可完成生成,实验表明其在有限状态马尔可夫链和合成序列生成任务中具有高精度。

详情
英文摘要

MeanFlow enables one-step generation in continuous spaces by learning an average velocity over a time interval rather than the instantaneous velocity field of flow matching. However, discrete state spaces do not have smooth trajectories or spatial derivatives, so the continuous formulation does not directly apply. We introduce Discrete MeanFlow, which replaces the motion of a point with the transport of probability mass over finite states. Our key object is the conditional transition kernel of a continuous-time Markov chain (CTMC), from which we define a mean discrete rate that measures the average change in transition probability over a time interval. We prove a Discrete MeanFlow identity that relates this finite-interval rate to the instantaneous CTMC generator at the endpoint, with the Kolmogorov forward equation replacing the spatial chain rule of continuous MeanFlow. Based on this identity, we parameterize the transition kernel directly using a boundary-by-construction design that guarantees valid probability outputs and exact boundary conditions without auxiliary losses. Since the learned kernel is itself a probability distribution, generation reduces to a single forward pass followed by one categorical draw meaning no iterative denoising, ODE integration, or multi-step refinement is required. We validate the framework on exact finite-state Markov chains, where the learned kernel recovers the analytical ground truth to high precision, and on factorized synthetic sequence generation tasks with varying alphabet sizes and sequence lengths.

2605.12785 2026-05-14 cs.LG cs.SY eess.SY math.DS

Identifying the nonlinear string dynamics with port-Hamiltonian neural networks

Maximino Linares, Guillaume Doras, Thomas Hélie

AI总结 本文研究如何利用端口-哈密顿神经网络(PHNN)从数据中学习非线性弦动力学,提出了一种将物理知识融入神经网络结构的方法,用于识别由偏微分方程(PDE)描述的哈密顿系统。该方法通过构建基于端口-哈密顿系统(PHS)的结构化网络架构,能够同时恢复弦的哈密顿量和耗散项,相比非物理感知的基线方法,在准确性和可解释性方面均有显著提升。实验表明,该模型能够有效识别和模拟非线性弦的动态行为,在音乐声学等需要PDE建模的领域具有重要应用价值。

详情
英文摘要

Hybrid machine learning combines physical knowledge with data-driven models to enhance interpretability and performance. In this context, Port-Hamiltonian Systems (PHS), which generalize Hamiltonian mechanics to describe open, non-autonomous dynamical systems, have been successfully integrated with neural networks under the name Port-Hamiltonian Neural Networks (PHNNs). While the ability of PHNNs to identify Hamiltonian ordinary differential equation (ODE) systems has already been demonstrated, their application to learning Hamiltonian partial differential equation (PDE) systems remains largely unexplored. This limitation restricts their use in musical acoustics, where instruments are typically modeled as distributed parameter systems governed by PDEs. In this work, we demonstrate how to learn the nonlinear string dynamics from data in a physically-consistent framework through a PHNN extension to PDEs. By constructing structured neural network architectures based on PHS, we can recover both the Hamiltonian governing the string and the dissipation affecting it. This approach outperforms baseline, non-physics-informed methods in terms of both accuracy and interpretability. Numerical experiments using synthetic data demonstrate the ability of the proposed PHNN model to identify and emulate the nonlinear dynamics of the system.

2605.12718 2026-05-14 cs.AI cs.LG cs.MA

CHAL: Council of Hierarchical Agentic Language

Tommaso Giovannelli, Griffin D. Kent

AI总结 本文提出了一种名为CHAL的多智能体辩论框架,旨在通过可反驳的论证优化信念系统,解决当前多智能体辩论在结构上的局限性。CHAL引入了基于图结构的信念表示和梯度引导的动态更新机制,并将元认知价值系统作为可配置参数,以指导智能体的推理与裁决过程。该框架在多个领域展示了良好的泛化能力,并为构建透明、可审计的AI系统提供了基础。

详情
英文摘要

Multi-agent debate has emerged as a promising approach for improving LLM reasoning on ground-truth tasks, yet current methodologies face certain structural limitations: debate tends to induce a martingale over belief trajectories, majority voting accounts for most observed gains, and LLMs exhibit confidence escalation rather than calibration across rounds. We argue that the genuine value of debate, and dialectic systems as a whole, lies not in ground-truth tasks but in defeasible domains, where every position can in principle be defeated by better reasoning. We present the Council of Hierarchical Agentic Language (CHAL), a multi-agent dialectic framework that treats defeasible argumentation as an engine for belief optimization. Each agent maintains a CHAL Belief Schema (CBS), a graph-structured belief representation with a Bayesian-inspired architecture, that facilitates belief revision through a gradient-informed dynamic mechanism by leveraging the strength of the belief's thesis as a differentiable objective. Meta-cognitive value systems spanning epistemology, logic, and ethics are elevated to configurable hyperparameters governing agent reasoning and adjudication outcomes. We provide a series of ablation experiments that demonstrate systematic and interpretable effects: the adjudicator's value system determines the debate's overall trajectories in latent belief space, council diversity refines beliefs for all participants, and the framework generalizes across broad fields. CHAL is, to our knowledge, the first framework to treat multi-agent debate as structured belief optimization over defeasible domains. Further, the auditable belief artifacts it produces establish the foundation for dedicated evaluation suites for defeasible argumentation, with broader implications for building AI systems whose reasoning and value commitments are transparent, aligned, and subject to human oversight.

2605.12701 2026-05-14 cs.LG cs.AI cs.CE cs.CY

Do Fair Models Reason Fairly? Counterfactual Explanation Consistency for Procedural Fairness in Credit Decisions

Gideon Popoola, John Sheppard

AI总结 在信用决策等社会敏感领域,现有公平机器学习模型虽然能够实现预测结果的公平性,但仍可能在推理过程中对不同群体采用不同的逻辑,形成“隐藏的过程性偏差”。本文提出一种名为反事实解释一致性(CEC)的框架,通过对齐个体与其反事实样本的特征归因,检测并缓解这种偏差,并引入新的过程性公平度量与训练损失函数。实验表明,CEC能有效减少模型的隐藏偏差,且对模型性能的影响较小。

详情
英文摘要

Machine learning algorithms in socially sensitive domains (e.g., credit decisions) often focus on equalizing predictive outcomes. However, satisfying these metrics does not guarantee that models use the same reasoning for different groups. We show that existing outcome-fair models can still apply fundamentally different reasoning to individuals, a ``hidden procedural bias'' missed by standard fairness metrics and algorithms. We propose Counterfactual Explanation Consistency (CEC), a framework that detects and mitigates this bias by aligning feature attributions between individuals and their counterfactual counterparts. Key contributions include a nearest-neighbor counterfactual generation method, a modified baseline for integrated gradient comparisons, an individual-level procedural fairness metric, and a corresponding training loss. We introduce a taxonomy identifying ``Regime B'' (same outcome, different reasoning) as a critical blind spot. Experiments on synthetic data, German Credit, Adult Income, and HMDA mortgage data demonstrate that outcome-fair baselines exhibit substantial hidden bias, while CEC substantially reduces it with modest utility cost.

2605.12628 2026-05-14 cs.RO

Multistep Belief Space Dynamics Learning For Risk-Aware Control

Jason Gibson, Bogdan Vlahov, Patrick Spieler, Evangelos A. Theodorou

AI总结 本文研究了如何在自动驾驶系统中实现风险感知的控制,针对动态不确定性随时间演变的问题,提出了一种用于模型预测控制(MPC)的分布动态学习框架。该方法通过学习环境动力学的分布特性,能够在保证安全性的前提下优化控制策略,避免过于保守。实验表明,该方法在真实复杂的越野环境中表现出良好的适应性和智能行为。

详情
英文摘要

As autonomous vehicles move from a simplified research setting to practical use, there exists a large gap between the dynamic behavior of a human driving and an autonomous system. Risk-aware behavior needs to naturally develop in order to scale to the demands of the real world. A major issue for risk-aware planning and control has been predicting how dynamical uncertainty evolves through time and optimizing plans that account for this without being overly conservative. Here, we present a learning framework to predict distributional dynamics that can be optimized in real time for Model Predictive Control (MPC). We explore the importance of structure when learning distributional dynamics for use in MPC. A rigorous ablation study is conducted on a large dataset of real world off-road driving that shows the impact of deviations from our proposed structure. Furthermore, we deploy our learned model and planning stack on a full sized vehicle in challenging off-road conditions. Our planning architecture is able to naturally regulate the speed of the vehicle based on the environment and consistently demonstrates intelligent behavior over miles of diverse terrain.

2605.10127 2026-05-14 cs.CV

Fashion130K: An E-commerce Fashion Dataset for Outfit Generation with Unified Multi-modal Condition

Yu He, Ting Zhu, Yichun Liu, Lichen Ma, Xinyuan Shan, Jingling Fu, Yu Shi, Junshi Huang, Yan Li

AI总结 本文提出一个名为Fashion130K的新电商时尚数据集,包含多种场合、模特和服装类型,旨在推动服装搭配生成的研究。为实现服装生成的视觉一致性,作者设计了统一多模态条件(UMC)框架,通过融合文本和图像提示的嵌入信息,并引入融合变换器对齐多模态特征,进而引导生成模型关注提示与噪声图像之间的关键关联。该数据集和框架为多模态提示在生成模型中的应用提供了全面而细致的探索,并在多个实际应用和基准测试中表现出优于现有方法的视觉一致性效果。

Comments Accepted to CVPR 2026 Findings

详情
英文摘要

Recent research work on fashion outfit generation focuses on promoting visual consistency of garments by leveraging key information from reference image and text prompt. However, the potential of outfit generation remains underexplored, requiring comprehensive e-commercial dataset and elaborative utilization of multi-modal condition. In this paper, we propose a brand-new e-commerce dataset, named Fashion130k, with various occasions, models, and garment types. For the consistent generation of garment, we design a framework with Unified Multi-modal Condition (UMC) to align and integrate the text and visual prompts into generation model. Specifically, we explore an embedding refiner to extract the unified embeddings of multi-modal prompts, within which a Fusion Transformer is proposed to align the multi-modal embeddings by adjusting the modality gap between text and image. Based on unified embeddings, the attention in generation model is redesigned to emphasis the correlations between prompts and noise image, inducing that the noise image can select the pivotal tokens of prompts for consistent outfit generation. Our dataset and proposed framework offer a general and nuanced exploration of multi-modal prompts for generation models. Extensive experiments on real-world applications and benchmark demonstrate the effectiveness of UMC in visual consistency, achieving promising result than that of SoTA methods.

2605.10040 2026-05-14 cs.CV

Only Train Once: Uncertainty-Aware One-Class Learning for Face Authenticity Detection

Qingchao Jiang, Zhenxuan Hou, Zhiying Zhu, Zhenxing Qian, Xinpeng Zhang, Zaiwang Gu

AI总结 随着生成式模型的快速发展,生成高度逼真的图像带来了身份欺诈和虚假信息传播的风险。现有方法大多将人脸伪造检测视为全监督的二分类问题,难以应对新型生成方法带来的挑战。本文提出FADNet,将人脸真实性检测重新建模为一类分类任务,仅使用真实人脸数据进行训练,通过引入证据深度学习和伪伪造图像生成器,有效提升了模型的泛化能力和检测精度,在多个基准测试中取得了优于现有方法的优异性能。

Comments The sole reason for our withdrawal application is that we have identified critical areas in our manuscript that require substantial revision and improvement to meet rigorous scientific standards. Our only intention is to retract the current draft to revise and enhance it, with no plans to replace it with a different version or redirect readers to other sources at this time

详情
英文摘要

The rapid evolution of generative paradigms has enabled the creation of highly realistic imagery, which escalating the risks of identity fraud and the dissemination of disinformation. Most existing approaches frame face forgery detection as a fully supervised binary classification problem. Consequently, these models typically exhibit significant performance decay when tasked with detecting forgeries from previously unseen generative paradigms. Furthermore, these methods focus exclusively on either DeepFakes or fully synthesized faces, thereby failing to provide a generalized framework for universal face forgery detection. In this paper, we address this challenge by introducing FADNet (Face Authenticity Detector Net), % a self-supervised framework that which reformulates face forgery detection as a one-class classification (OCC) task. By training exclusively on authentic facial data to capture their intrinsic representations, FADNet flags any image whose feature embedding deviates significantly from the learned distribution of real faces as a forgery. The framework incorporates Evidential Deep Learning (EDL) to quantify predictive uncertainty and utilizes a plug-and-play pseudo-forgery image generator (PFIG) to tighten decision boundaries around authentic data. Extensive experimental evaluations on the DF40 and ASFD benchmarks demonstrate that FADNet achieves superior performance and generalization capabilities. Specifically, FADNet substantially outperforms existing state-of-the-art (SOTA) methods, yielding a remarkable average accuracy of 96.63\% and an average precision of 98.83\%.

2605.09935 2026-05-14 cs.CV cs.CR

Evidence-based Decision Modeling for Synthetic Face Detection with Uncertainty-driven Active Learning

Qingchao Jiang, Zhenxuan Hou, Zhiying Zhu, Zhenxing Qian, Xinpeng Zhang, Zaiwang Gu

AI总结 随着深度生成模型的快速发展,伪造人脸图像被广泛用于非法活动。现有合成人脸检测方法虽取得进展,但因依赖Softmax激活函数而存在过度自信的问题,导致在面对未知分布图像时预测不可靠。为此,本文提出EMSFD方法,通过狄利克雷分布建模类别证据并显式引入模型不确定性,提升检测可靠性与泛化能力;同时利用不确定性指导主动学习,减少标注成本,实验表明该方法在检测准确率上比现有最优方法提升了15%。

Comments The sole reason for our withdrawal application is that we have identified critical areas in our manuscript that require substantial revision and improvement to meet rigorous scientific standards. Our only intention is to retract the current draft to revise and enhance it, with no plans to replace it with a different version or redirect readers to other sources at this time

详情
英文摘要

With the rapid development of deep generative models, forged facial images are massively exploited for illegal activities. Although existing synthetic face detection methods have achieved significant progress, they suffer from the inherent limitation of overconfidence due to their reliance on the Softmax activation function. Thus, these methods often lead to unreliable predictions when encountering unknown Out-of-Distribution (OOD) images, and cannot ascertain the model's uncertainty in its prediction. Meanwhile, most existing methods require massive high-quality annotated data, which greatly limits their practicability across diverse scenarios. To address these limitations, we propose EMSFD (Evidence-based decision Modeling for Synthetic Face Detection with uncertainty-driven active learning), an approach designed to enhance detection reliability and generalizability. Specifically, EMSFD models class evidence using the Dirichlet distribution and explicitly incorporates model uncertainty into the prediction process. Furthermore, during training, the estimated uncertainty is exploited to prioritize more informative samples from the unlabeled pool for annotation, thereby reducing labeling cost and improving model generalization. Extensive experimental evaluations demonstrate that our method enhances the interpretability of synthetic face detection. Meanwhile, our method yields a 15\% increase in accuracy compared to existing state-of-the-art (SOTA) baselines, which demonstrates the superior detection performance and generalizability of our approach. Our code is available at: https://github.com/hzx111621/EMSFD.

2605.09923 2026-05-14 cs.AI

expo: Exploration-prioritized policy optimization via adaptive kl regulation and gaussian curriculum sampling

Mingxiong Lin, Zhangquan Gong, Maowen Tang, Qian Li, Chuangchuang Wang, Jian Ma, Sutian Huang, Kai Tang, Haonan Lu

AI总结 该论文针对基于可验证奖励的强化学习(RLVR)中主流算法Group Relative Policy Optimization(GRPO)存在的探索效率不足问题,提出了探索优先策略优化方法EXPO。EXPO通过引入动态调整的KL正则化模块和基于高斯分布的课程采样策略,有效提升了模型在数学推理任务中的探索能力和训练效率。实验表明,EXPO在多个基准测试中显著优于原始GRPO,尤其在高难度问题上的性能提升更为明显。

Comments Duplicate submission of arXiv:2605.11403

详情
英文摘要

Reinforcement Learning with Verifiable Rewards (RLVR) has become the standard paradigm for LLM mathematical reasoning, where Group Relative Policy Optimization (GRPO) serves as the mainstream algorithm. We point out two understudied inefficiencies existing in GRPO. First, the fixed KL penalty coefficient overly restricts policy exploration at stages where the model requires significant deviation from the reference policy. Second, uniform sampling of training questions ignores that moderately difficult problems provide the most informative gradient signals for optimization. We propose Exploration-Prioritized Policy Optimization (EXPO) with two lightweight plug-in modules. The Accuracy-Conditioned KL Scaling (AKL) dynamically adjusts KL regularization strength through a smooth nonlinear function of batch average accuracy, relaxing the penalty when the model underperforms and strengthening it when the model achieves good results. The Gaussian Curriculum Sampling (GCS) assigns sampling weights to questions following a Gaussian distribution centered at moderate accuracy around 0.5, focusing training on the model's learning frontier. We conduct extensive experiments on DeepSeek-R1-Distill-Qwen-1.5B and Qwen3-8B-Base over six mathematical reasoning benchmarks. The results show EXPO steadily surpasses vanilla GRPO. It obtains an absolute gain of 13.34 on AIME 2025 pass@32, rising from 63.33 percent to 76.67 percent, and achieves an average pass@32 improvement of 2.66 on the 8B model. The much larger performance gains on pass@32 compared with pass@1 demonstrate that EXPO effectively enlarges the model's exploration boundary under a fixed inference cost budget.

2605.01457 2026-05-14 cs.AI

CoFlow: Coordinated Few-Step Flow for Offline Multi-Agent Decision Making

Guowei Zou, Haitao Wang, Beiwen Zhang, Boning Zhang, Hejun Wu

AI总结 本文提出了一种名为CoFlow的协调少步流方法,用于离线多智能体决策问题。该方法通过引入协调速度注意力机制和自适应协调门控,实现了在单次生成过程中保持智能体间协调性的目标,从而克服了现有少步生成方法在协调性上的不足。实验表明,CoFlow在多种任务中表现出色,能够在仅需1到3步去噪的情况下达到最先进的协调质量,且其性能提升主要归因于智能体间的协调能力增强。

Comments 34 pages, 15 figures, 10 tables. Project page: https://guowei-zou.github.io/coflow/

详情
英文摘要

Generative models have emerged as a promising paradigm for offline multi-agent reinforcement learning (MARL), but existing approaches require many iterative sampling steps. Recent few-step acceleration methods either distill a joint teacher into independent students or apply averaged velocity fields independently to each agent. Unfortunately, these few-step approaches hurt inter-agent coordination. We show that the efficiency-coordination trade-off is not inherent: single-pass multi-agent generation can preserve coordination when the velocity field is natively joint-coupled. We propose Coordinated few-step Flow (CoFlow), an architecture that combines Coordinated Velocity Attention (CVA) with Adaptive Coordination Gating. A finite-difference consistency surrogate further replaces memory-prohibitive Jacobian-vector product backpropagation through the averaged velocity field with two stop-gradient forward passes. Across 60 configurations spanning MPE, MA-MuJoCo, and SMAC, CoFlow matches or surpasses Gaussian policies, value-based methods, transformer policies, diffusion models, and prior flow baselines on episodic return. Three independent coordination probes confirm that CoFlow's improvements arise from inter-agent coordination rather than per-agent capacity. A denoising-step sweep shows that single-pass inference suffices on every configuration. CoFlow reaches state-of-the-art coordination quality in 1-3 denoising steps under both centralized and decentralized execution. Project Page: https://guowei-zou.github.io/coflow/

2603.25340 2026-05-14 cs.CL

Large Language Model as Token Compressor and Decompressor

Wenbing Li, Yiran Wang, Zikai Song, Jielei Zhang, Tianhao Zhao, Junkai Lin, Wei Yang

AI总结 本文研究了如何将现成的大语言模型(LLM)适配为用于长文本处理的离散可变长度编码器和解码器。作者设计了一种自表达的自编码框架,通过轻量的LoRA适配器对预训练LLM进行微调,将长文本映射为紧凑的潜在编码序列(Z-tokens),并能将其解码回自然语言或任务输出。该方法在保持重建质量和下游任务性能的同时,有效减少了上下文长度、生成阶段的内存使用和端到端延迟,为高效长文本推理提供了实用的接口。

详情
英文摘要

In this paper, we study whether an off-the-shelf LLM can be adapted into a discrete, variable-length token compressor and decompressor for long-context processing. To this end, we design a self-expressive autoencoding framework that fine-tunes a pretrained LLM with lightweight LoRA adapters to map long texts into compact sequences of learned latent codes, termed Z-tokens, and to decode them back into natural language or task outputs. The resulting representation is content-adaptive: less predictable or information-dense segments can receive more Z-tokens, while redundant regions can be represented more compactly through a budget-aware length regularizer. Our method is evaluated on long-context datasets such as Wikipedia, CNN/DailyMail, HotpotQA, and QuALITY, showing that it preserves reconstruction quality and downstream performance while reducing effective context length, generation-stage memory usage, and end-to-end latency. This simple design supports both direct decoding from compressed contexts and autoregressive generation in the Z-token space, providing a practical interface for efficient long-context inference.

2602.09724 2026-05-14 cs.CL

Targum -- A Multilingual New Testament Translation Corpus

Maciej Rapacz, Aleksander Smywiński-Pohl

AI总结 本文介绍了一个名为 Targum 的多语种新约圣经翻译语料库,旨在弥补现有语料库在语言深度上的不足。该语料库包含 651 个新约翻译版本,其中 334 个为独家版本,涵盖英语、法语、意大利语、波兰语和西班牙语五种语言,每种语言的翻译数量均远超以往任何语料库。每个翻译版本都附有标准化元数据,便于研究者进行多层次的翻译分析,为圣经翻译史的量化研究提供了重要资源。

Comments v3 - fixed duplicated references section heading, fixed reference v2 - camera ready version

详情
英文摘要

Many European languages possess rich biblical translation histories, yet existing corpora - in prioritizing linguistic breadth - often fail to capture this depth. To address this gap, we introduce a multilingual corpus of 651 New Testament translations, of which 334 are unique, spanning five languages with 2.4-5.0x more translations per language than any prior corpus: English (194 unique versions from 390 total), French (41 from 78), Italian (17 from 33), Polish (29 from 48), and Spanish (53 from 102). Aggregated from 12 online biblical libraries and one preexisting corpus, each translation is annotated with metadata that maps the text to a standardized identifier for the work, its specific edition, and its year of revision. This canonicalization allows researchers to define "uniqueness" for their own needs: they can perform micro-level analyses on translation families, such as the KJV lineage, or conduct macro-level studies by deduplicating closely related texts. By providing the first multilingual resource with sufficient depth per language for flexible, multilevel analysis, the corpus fills a gap in the quantitative study of translation history.

2511.00066 2026-05-14 cs.LG

Sharpness-Guided Group Relative Policy Optimization via Probability Shaping

Tue Le, Linh Ngo Van, Trung Le

AI总结 本文研究了可验证奖励强化学习(RLVR)中策略优化的泛化问题,提出了一种基于梯度范数的锐度代理来上界泛化损失,并在此基础上改进了组相对策略优化(GRPO)算法。通过引入锐度引导的GRPO(GRPO-SG),该方法对可能引发过大梯度的token进行降权处理,从而减少剧烈更新,提升优化稳定性与模型泛化能力。实验表明,GRPO-SG在数学推理、逻辑谜题和工具增强问答任务中均优于原始GRPO,且梯度轨迹更平稳。

详情
英文摘要

Reinforcement learning with verifiable rewards (RLVR) has become a practical route to improve large language model reasoning, and Group Relative Policy Optimization (GRPO) is a widely used optimizer in this setting. However, RLVR training is typically performed with limited control over generalization. We revisit GRPO through a robustness-based generalization view, where the generalization loss is upper bounded by a combination of the empirical loss and a sharpness surrogate measured by the gradient norm. Building on this perspective, we propose Sharpness-Guided GRPO (GRPO-SG), a simple token-weighted variant of GRPO that downweights tokens likely to cause overly large gradients, reducing sharp updates and stabilizing optimization, thereby improving generalization. Experiments across mathematical reasoning, logic puzzles and tool-augmented question answering show consistent improvements over GRPO, along with smoother gradient-norm trajectories, supporting GRPO-SG as a simple and effective generalization-oriented upgrade to GRPO for RLVR.

2508.01049 2026-05-14 cs.LG

Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies

Nicholas E. Corrado, Josiah P. Hanna

AI总结 在多智能体强化学习中,独立策略梯度算法在合作且无冲突的游戏中广泛应用,但其收敛性能受限于联合策略分布的采样误差。本文提出了一种集中式自适应采样方法CoSER,通过协调各智能体的动作选择,减少联合采样误差,从而提升策略梯度学习的可靠性。实验表明,CoSER相比独立采样方法更有效地降低采样误差,并提高了算法收敛到最优联合策略的概率。

Comments RLC 2026

详情
英文摘要

Independent on-policy policy gradient algorithms are widely used for multi-agent reinforcement learning (MARL) in cooperative and no-conflict games, but they are known to converge sub-optimally when each agent's individual policy gradient points away from an optimal joint equilibrium. Going beyond prior work, we observe that sub-optimal convergence can still arise even when the expected individual policy gradients of each agent point toward the optimal joint solution. After collecting a finite set of trajectories, stochasticity in independent action sampling can cause the joint data distribution to deviate from the expected joint on-policy distribution. This \textit{sampling error} w.r.t. the joint on-policy distribution produces inaccurate gradient estimates that can make agents converge sub-optimally. We hypothesize that joint sampling error can be reduced through coordinated action selection and that doing so will increase the reliability of policy gradient learning in MARL (i.e., the probability of converging to an optimal joint policy). To test this hypothesis, we first introduce an adaptive action sampling approach to reduce joint sampling error in the Centralized Training with Decentralized Execution setting. Our method, Cooperative Sampling Error Reduction (CoSER), continually adapts a centralized behavior policy to place higher probability on joint actions that are under-sampled w.r.t. the current joint policy. We then empirically evaluate CoSER on a diverse set of multi-agent games and demonstrate that (1) CoSER reduces joint sampling error more efficiently than independent on-policy sampling and (2) this reduction increases the reliability of independent policy gradient algorithms.

2308.10058 2026-05-14 cs.CV

R-C-P Method: An Autonomous Volume Calculation Method Using Image Processing and Machine Vision

MA Muktadir, Sydney Parker, Sun Yi

AI总结 本文提出了一种基于图像处理和机器视觉的自主体积计算方法——R-C-P方法,旨在替代传统深度传感器(如LiDAR)以适应复杂环境下的应用需求。该方法利用两台2D摄像头实时测量矩形物体的尺寸,通过行-列-像素(R-C-P)策略结合边缘检测技术,实现了对物体表面积及不连续边缘或体积的检测。实验验证了该方法的有效性,并提供了基于摄像头与物体距离的尺寸计算公式,为实际物体的自主测量提供了可行的视觉解决方案。

Journal ref Communications in Computer and Information Science, vol. 2939, Springer, Cham (2026)

详情
英文摘要

Machine vision and image processing are often used with sensors for situation awareness in autonomous systems, from industrial robots to self-driving cars. The 3D depth sensors, such as LiDAR (Light Detection and Ranging), Radar, are great invention for autonomous systems. Due to the complexity of the setup, LiDAR may not be suitable for some operational environments, for example, a space environment. This study was motivated by a desire to get real-time volumetric and change information with multiple 2D cameras instead of a depth camera. Two cameras were used to measure the dimensions of a rectangular object in real-time. The R-C-P (row-column-pixel) method is developed using image processing and edge detection. In addition to the surface areas, the R-C-P method also detects discontinuous edges or volumes. Lastly, experimental work is presented for illustration of the R-C-P method, which provides the equations for calculating surface area dimensions. Using the equations with given distance information between the object and the camera, the vision system provides the dimensions of actual objects.

2605.12550 2026-05-14 cs.CV cs.AI

SSDA: Bridging Spectral and Structural Gaps via Dual Adaptation for Vision-Based Time Series Forecasting

Mingrui Zhang, Hanchen Yang, Wengen Li, Xudong Jiang, Yichao Zhang, Jihong Guan, Shuigeng Zhou

AI总结 该论文研究了基于视觉模型的时间序列预测问题,指出将时间序列渲染为图像后,仍存在光谱和结构上的差距,限制了预训练视觉模型的性能。为此,作者提出SSDA方法,通过光谱幅度对齐和结构引导的低秩适配,分别在数据和模型层面弥补这些差距,从而显著提升时间序列预测效果。实验表明,SSDA在多个真实数据集上优于现有方法,表现出良好的泛化能力。

详情
英文摘要

Large vision models (LVMs) have recently proven to be surprisingly effective time series forecasters, simply by rendering temporal data as images. This success, how ever, rests on a largely unexamined premise: the rendered time series images are sufficiently close to natural images for knowledge in pre-trained models to transfer effectively. We argue that two gaps still remain, i.e., spectral and structural gaps, fundamentally limiting the potential of LVMs for time series forecasting. Spectrally, we systematically reveal that rendered time series images exhibit a markedly shallower power spectrum than the natural images LVMs are pre-trained to recognize. Structurally, reshaping 1D temporal sequences into 2D grids fabricates spurious spatial adjacencies while severing genuine temporal continuities, misleading the spatial inductive biases of pre-trained LVMs. To bridge these gaps, we propose SSDA, a dual-branch network that spectrally and structurally adapts to unlock the full potential of LVMs for time series forecasting. At the data level, a Spectral Magnitude Aligner (SMA) applies 2D FFT to selectively enhance the magnitude spectrum toward natural-image statistics while preserving phase. At the model level, a Structural-Guided Low-Rank Adaptation (SG-LoRA) injects position-aware temporal encodings into patch embeddings and adapts at tention via low-rank updates. The two branches are further adaptively fused to produce the final forecast. Extensive experiments on seven real-world benchmarks demonstrate that SSDA consistently outperforms strong LVM- and LLM-based baselines under both full-shot and few-shot settings. Code is publicly available at https://anonymous.4open.science/r/SSDA-8C5B.

2507.22095 2026-05-14 stat.ML cs.LG math.PR

Posterior Bayesian Neural Networks with Dependent Weights

Nicola Apollonio, Giovanni Franzina, Giovanni Luca Torrisi

AI总结 本文研究具有依赖权重和可能重尾分布的全连接前馈深度神经网络,旨在克服标准高斯先验的局限性。通过引入高斯似然的后验分布视角,论文分析了在网络宽度趋于无穷时输出的后验分布行为,并在先验下随机协方差矩阵正定的条件下,确定了输出的后验分布。研究还给出了确保协方差矩阵可逆的温和条件,并展示了某些模型参数(如激活函数和相关Lévy测度)对极限独立性的影响,扩展了已有研究成果。

Comments 2 figures

详情
英文摘要

We consider fully connected and feedforward deep neural networks with dependent and possibly heavy-tailed weights, as introduced in [26], to address limitations of the standard Gaussian prior. It has been proved in [26] that, as the number of nodes in the hidden layers grows large, according to a sequential and ordered limit, the law of the output converges weakly to a Gaussian mixture. In this paper, we study the neural network through the lens of the posterior distribution with a Gaussian likelihood. If the random covariance matrix of the infinite-width limit is positive definite under the prior, we identify the posterior distribution of the output in the wide-width limit according to a sequential regime. Remarkably, we provide mild sufficient conditions to ensure the aforementioned invertibility of the random covariance matrix under the prior, thereby extending the results in [8]. Among our results, we present sufficient conditions on some model parameters (the activation function and the associated Lévy measures) which ensure that the sequential limits are independent of the order. We illustrate our findings with examples and numerical simulations.

2605.12524 2026-05-14 cs.LO cs.AI

Stress-Testing the Reasoning Competence of LLMs With Proofs Under Minimal Formalism

Konstantine Arkoudas, Serafim Batzoglou

AI总结 本文提出ProofGrid,一个用于评估大语言模型(LLM)推理能力的基准测试套件,通过机器可验证的证明而非仅最终答案来衡量模型能力。ProofGrid包含15个任务,涵盖证明生成、验证、掩码和补全,使用简洁的自然演绎语言NDL进行表达,支持精确且可审计的验证。该基准测试具有可重复、细粒度的评估机制,并覆盖从基础推理到复杂挑战任务的难度范围,揭示了当前模型在全局组合推理和低级证明合成等方面的显著局限。

详情
英文摘要

We introduce ProofGrid, a benchmark suite for evaluating LLM reasoning through machine-checkable proofs rather than final answers alone. ProofGrid contains 15 tasks spanning proof writing, proof checking, proof masking, and proof gap-filling. Tasks are expressed in minimal formal notation, especially NDL, a compact natural-deduction language that fits in short prompts and supports precise, auditable verification. This yields mechanical, reproducible, and fine-grained evaluation rather than judgments by humans or LLMs. ProofGrid covers a calibrated difficulty spectrum, from foundational reasoning tests to structurally rich challenge tasks that no current model solves, while minimizing reliance on domain knowledge, solver delegation, and long-context artifacts. We also develop a comparative framework for reasoning benchmarks and use it to situate ProofGrid relative to existing work in terms of representation, verification guarantees, and reasoning depth. Methodologically, we introduce an instrumented proof-checking pipeline that tolerates minor surface deviations while locating the first substantive reasoning failure, improving measurement resolution and separating proof planning from low-level execution noise. Using this pipeline, we evaluate a broad range of open and proprietary models. Results show rapid progress but substantial remaining limits: frontier models perform well on several foundational tasks, yet difficult tasks, especially those requiring global combinatorial reasoning or low-level proof synthesis, remain far from solved. We also identify epistemic instability, where models generate flawed proofs yet correctly reject those local inferences in isolation, and formalize this with an Epistemic Stability Index. Finally, we complement accuracy with 2PL IRT analyses, Wright maps, and a normalized task-discrimination measure based on Fisher information.

math/9901049 2026-05-14 math.GR

Rigidity of Right-Angled Coxeter Groups

David G. Radcliffe

AI总结 本文研究了右角Coxeter群的刚性性质,探讨了在不同生成集下该群的Coxeter系统是否等价。作者证明了若两个有限生成集生成同一个右角Coxeter群,则对应的Coxeter系统是等价的。这一结果揭示了该类群结构的稳定性,为理解其代数与几何性质提供了重要依据。

Comments 6 pages. Improved exposition and formatting

详情
英文摘要

If S and S' are two finite sets of Coxeter generators for a right-angled Coxeter group W, then the Coxeter systems (W,S) and (W,S') are equivalent.

2605.13844 2026-05-14 math.NT

Fields where torsion forms decompose

M. Archita, Karim Johannes Becher

AI总结 本文研究了在特定实数域上挠二次型的分解问题,证明了在满足一定条件的实数域上,每个挠二次型都可以分解为若干个二维挠二次型的正交和。研究基于对赋值域和一变量函数域上弱各向同性形式的更一般性分析,为理解二次型的结构提供了新的视角和结果。

Comments 10 pages

详情
英文摘要

Over a real field which is an extension of transcendence degree 1 of a hereditarily pythagorean base field, every quadratic form which is torsion decomposes into an orthogonal sum of 2-dimensional torsion forms. This is obtained from a more general study of weakly isotropic forms over henselian valued fields and over function fields in one variable.

2605.13843 2026-05-14 astro-ph.GA astro-ph.CO

The Galaxy Luminosity Functions in ASTRID: Predictions for LSST

Fatemeh Hafezianzadeh, Tianqing Zhang, Paul Rogozenski, Patrick Lachance, Yihao Zhou, Tiziana Di Matteo, Rupert A. C. Croft, Simeon Bird, Rachel Mandelbaum

AI总结 本文利用ASTRID宇宙学流体动力学模拟,为Vera C. Rubin天文台的Legacy Survey of Space and Time(LSST)项目生成了验证过的星系光度函数和光度预测。研究结合恒星群体合成模型与物理驱动的尘埃消光模型,准确再现了不同红移和波段下的观测星系统计特性,并据此构建了包含约3.78亿个星系的LSST模拟光度目录。研究还提供了LSST各波段的光度函数预测,推导了最佳拟合的Schechter参数,并计算了从第一年到第十年不同观测深度的星系数目分布。

Comments 17 pages, 13 figures

详情
英文摘要

We present validated and forward-modelled galaxy luminosity functions and photometric predictions for the Vera C. Rubin Observatory Legacy Survey of Space and Time using the ASTRID cosmological hydrodynamical simulation. Galaxy magnitudes are computed by combining stellar population synthesis modeling with a physically motivated dust attenuation prescription in which the optical depth scales with metal surface density. The dust model is calibrated at z = 0 using SDSS luminosity functions and tested at intermediate redshifts (z = 0.5, 1.0, and 1.5) in rest-frame B, V , R, and I bands. We find that the attenuated luminosity functions reproduce observed galaxy statistics across multiple wavelengths and redshifts. Using this calibrated framework, we construct LSST-ready mock photometric catalogs over 0 <= z <= 2 in steps of Delta z = 0.1, containing ~378 million galaxies. We provide predicted apparent-magnitude luminosity functions in the LSST ugrizy bands, derive best-fit Schechter parameters as a compact analytic representation, and compute differential and cumulative galaxy number counts as a function of survey depth from Year 1 to Year 10.

2605.13842 2026-05-14 astro-ph.GA astro-ph.IM

From DES to KiDS: Domain adaptation for cross-survey detection of low-surface-brightness galaxies

Hareesh Thuruthipilly, Krzysztof Lisiecki, Junais, Katarzyna Małek, Agnieszka Pollo, William J. Pearson, Antonio Vanzanella, Saptarshi Pal, Miguel Figueira, Pratik Dabhade, Anna Durkalec, Aidan P. Cotter, Unnikrishnan Sureshkumar, Nandini Hazra, Patryk Matera, Subhrata Dey, Michal Vrábel, Anirban Dutta, Henry Willems, Nicola Principi Cavaterra, Natalia Dobrowolska, Wojciech Knop

AI总结 该研究旨在解决跨巡天观测中低表面亮度星系(LSBG)的检测问题,利用域适应技术将基于暗能量巡天(DES)训练的深度学习模型应用于千里度巡天(KiDS)数据,实现了对KiDS DR5数据中LSBGs和超弥散星系(UDGs)的自动识别。研究共发现了20,180个LSBG和434个UDG,并揭示了它们的结构参数、颜色分布及与环境相关的演化特征,为未来大型巡天如LSST和Euclid提供了可扩展的LSBG目录构建方法。

Comments Accepted to Astronomy & Astrophysics

详情
英文摘要

Low-surface-brightness galaxies (LSBGs) are vital for understanding galaxy formation, but their diffuse nature makes them challenging to detect. Upcoming large-scale surveys are expected to uncover large numbers of LSBGs, requiring robust automated methods to identify them across heterogeneous datasets. As a precursor to the Legacy Survey of Space and Time (LSST) and Euclid, we explore domain adaptation techniques for cross-survey LSBG identification. Using models trained on the Dark Energy Survey (DES), we search for LSBGs in the Kilo-Degree Survey Data Release 5 (KiDS DR5). We used an ensemble consisting of one convolutional neural network (CNN) and two transformer models trained on DES cutouts and applied to KiDS DR5 imaging data. Structural parameters were estimated with galfitm, and photometric redshifts and stellar population properties were estimated through spectral energy distribution fitting with CIGALE. We identify 20,180 LSBGs and 434 ultra-diffuse galaxies (UDGs) in KiDS DR5. Their structural parameters are similar to known LSBGs from DES and the Hyper Suprime-Cam SSP Survey (HSC-SSP). The KiDS-LSBGs follow a continuous size-luminosity relation connecting classical dwarf galaxies and UDGs, and their colours are bimodal ($\sim73\%$ blue, $\sim27\%$ red). Cross-matching with spectroscopic and cluster catalogues provides redshifts for 4,913 systems, enabling a systematic characterisation of the star-forming main sequence of LSBGs. Strong environmental trends are evident, with cluster LSBGs and UDGs exhibiting redder colours and reduced star formation compared to non-cluster systems. We demonstrate that domain adaptation enables robust cross-survey LSBG identification with deep learning models, providing a scalable pathway for constructing homogeneous LSBG catalogues for the LSST and Euclid era.