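arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.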

2606.19496 2026-06-19 cs.LG New submission

LEAP: Layer-Skipping Efficiency via Adaptive Progression for Vision Transformer Distillation

Jiaqi Zhang, Ashton Lee, Anthony Wong, John Zou, Sami BuGhanem, Randall Balestriero

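Affiliations: Brown University; Rice University

AI summary: Proposes LEAP, a training curriculum that adaptively selects intermediate teacher feature maps as progressive targets to speed up knowledge distillation into student ViTs, improving ImageNet-100 accuracy by 12.24% and saving 25.1% of training FLOPs.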

Abstract

Vision Foundation Models (VFMs) with Vision Transformer (ViT) backbones, such as DINOv2, have become essential for downstream tasks like object recognition and semantic segmentation. The immense computational requirements of these backbones often necessitate distillation into smaller architectures for edge deployment. Feature-based knowledge distillation (KD) often suffers from the teacher-student gap: the student, with its limited capacity, struggles to imitate the teacher's complex feature maps. To mitigate this bottleneck, we propose LEAP (Layer-skipping Efficiency via Adaptive Progression), a training curriculum for ViT feature-based knowledge distillation. By using the teacher's intermediate feature maps as a sequence of progressively more difficult targets, our curriculum lets the student build a foundational representation before tackling higher-level abstractions. Our results demonstrate that this paradigm significantly accelerates convergence through adaptive difficulty selection across various student model sizes and dataset scales. With our curriculum, the LEAP-distilled ViT-S achieves 90.1% accuracy on ImageNet-100, a +12.24% improvement over the baseline. On ImageNet-1K, LEAP achieves +3.84% and +7.75% improvements on the instance retrieval task on the Oxford and Paris datasets, respectively. Furthermore, the curriculum saves 25.1% of training FLOPs and 21% of training time on ImageNet-100 by stopping teacher inference early during the initial stages of training. Code is available at https://github.com/KevinZ0217/LEAP
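
The curriculum idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the threshold rule, the function names, and the 12-block teacher are assumptions chosen for illustration, since the abstract only states that difficulty is selected adaptively. The second helper shows why stopping teacher inference at the current target layer reduces teacher compute early in training.

```python
def next_target_layer(current_layer, student_loss, num_teacher_layers, threshold=0.1):
    """Advance to a deeper teacher layer once the student matches the current one.

    Hypothetical rule: the threshold test is an assumption, not the paper's criterion.
    """
    if student_loss < threshold and current_layer < num_teacher_layers:
        return current_layer + 1
    return current_layer


def teacher_forward_fraction(target_layer, num_teacher_layers):
    """Fraction of teacher blocks that must run when inference stops at target_layer."""
    return target_layer / num_teacher_layers


# Toy run: a 12-block teacher and a made-up sequence of student losses.
layer = 3
history = []
for loss in [0.50, 0.08, 0.30, 0.09, 0.05]:
    layer = next_target_layer(layer, loss, num_teacher_layers=12)
    history.append((layer, teacher_forward_fraction(layer, 12)))
```

In this toy run the target only deepens after a low-loss step, so the teacher runs a quarter of its blocks at the start and half of them by the end.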

2606.19481 2026-06-19 cs.LG New submission

Insulin4RL: Real-Time Insulin Management in the Intensive Care Unit for Offline Reinforcement Learning

Thomas Frost, Steve Harris

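Affiliations: Institute of Health Informatics; University College London

AI summary: To address the poor generalisability caused by discretising electronic health records, introduces Insulin4RL, an offline reinforcement learning dataset built from real clinical trajectories, with over 375,000 decisions across 12,209 patients, for evaluating models under realistic sampling assumptions.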

Comments Under submission

Abstract

Offline reinforcement learning (ORL) offers the potential to improve the quality of clinical decision-making using historical electronic health record (EHR) data. Current training and evaluative practices in this field rely heavily on EHR datasets that have been temporally discretised into fixed, regular time intervals. Discretisation creates fictional representations of complex clinical scenarios and compromises the generalisability of retrospective model evaluations. In this paper, we introduce Insulin4RL, a healthcare ORL dataset featuring naturally irregular inputs and actions from real clinical trajectories. Derived from MIMIC-IV, Insulin4RL comprises over 375,000 labelled decisions across 12,209 patients requiring insulin infusion titration in the Intensive Care Unit. The dataset can thus be used for research into ORL model performance under realistic clinical sampling assumptions. We provide a description of the dataset's structure and characteristics, baseline performance metrics using model-free offline reinforcement learning, and a standardised evaluation protocol using fitted Q-evaluation. We conclude with suggested areas for future research that could be addressed using this resource.

2606.19476 2026-06-19 cs.LG cs.AI New submission

Can In-Context Learning Support Intrinsic Curiosity?

Eric Elmoznino, Sangnie Bhardwaj, Johannes von Oswald, Rajai Nasser, Blaise Agüera y Arcas, João Sacramento, Rif A. Saurous, Guillaume Lajoie

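Affiliations: Google – Paradigms of Intelligence Team; Google DeepMind

AI summary: Investigates using the in-context learning ability of sequence models as an immediate, update-free world model to remove the gradient-descent bottleneck of classic intrinsic-curiosity methods, and proves that in non-temporal settings the resulting reward converges asymptotically to the true learning progress.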

Abstract

Effective machine learning depends not only on how we model data, but also on what data we choose to collect. While large sequence models have revolutionized data modeling, the problem of automated data selection, or "intrinsic curiosity", remains a significant challenge. Classic approaches incentivize exploration by rewarding an agent based on its "learning progress", which measures how much a newly acquired observation improves a world model's predictive ability. However, evaluating these rewards traditionally requires expensive inner loops of gradient descent updates within each trajectory, rendering them computationally impractical at scale. In this work, we investigate whether the emergent in-context learning (ICL) capabilities of sequence models can eliminate this bottleneck by serving as immediate, update-free world models. Specifically, we evaluate whether an exploration policy can be trained to maximize learning progress, using solely the prediction errors and counterfactual context manipulations of an in-context learner. We first prove that in general Markov decision processes, this is in fact impossible in an unbiased way: the resulting intrinsic rewards either suffer from nuisance terms that bias their estimation of true learning progress, or they cannot be implemented using an in-context learner's prediction errors. Conversely, we prove a positive result for a broad subclass of non-temporal settings, encompassing active learning and Bayesian Experimental Design: here, ICL-derived rewards successfully bound and asymptotically converge to the true learning progress. We corroborate our theory with controlled experiments across continuous and symbolic environments, demonstrating that our ICL-driven framework successfully trains curious data-collection policies that explore optimally.
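
As a rough illustration of a learning-progress reward computed without any gradient updates, the sketch below compares a predictor's error with and without a new observation in its context. The running-mean "in-context learner", the probe set, and all function names are assumptions made for this toy example; it is not the estimator analysed in the paper.

```python
def predict(context):
    """Toy update-free 'in-context learner': predict the mean of the context."""
    return sum(context) / len(context) if context else 0.0


def prediction_error(context, probes):
    """Mean squared error of the in-context prediction on probe targets."""
    guess = predict(context)
    return sum((guess - y) ** 2 for y in probes) / len(probes)


def learning_progress(context, new_obs, probes):
    """Error with the observation left out of the context minus error with it included."""
    return prediction_error(context, probes) - prediction_error(context + [new_obs], probes)


probes = [2.0, 2.0, 2.0]
reward_informative = learning_progress([0.0], 4.0, probes)  # context mean moves from 0 to 2
reward_redundant = learning_progress([2.0], 2.0, probes)    # context mean stays at 2
```

The informative observation earns a positive reward because it moves the prediction onto the probes, while the redundant one earns nothing.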

2606.19475 2026-06-19 cs.AI cs.CL New submission

MonaVec: A Training-Free Embedded Vector-Search Kernel for Edge and Offline AI Systems

Oğuzhan Yenen

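AI summary: Presents MonaVec, a training-free, data-oblivious embedded vector-search kernel that achieves 4-bit compression through a Randomized Hadamard Transform and precomputed Lloyd-Max quantization, giving deterministic results for edge and offline use with single-file deployment.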

Comments 27 pages, 11 figures. Code and artifacts: https://github.com/mona-hq/monavec (PyPI: monavec; crates.io: monavec-core). Zenodo: doi:10.5281/zenodo.20559587

Abstract

We present MonaVec, a deterministic, embedded vector-search kernel for edge and offline AI -- settings where server infrastructure, network connectivity, and training data are all unavailable. Existing vector-search systems assume a persistent server, gigabytes of RAM, or a training pass over the corpus; MonaVec instead targets the deployment profile of SQLite: one file, one function call, runs anywhere. Its quantization core is training-free by default and data-oblivious: a Randomized Hadamard Transform (RHDH) conditions any input distribution toward N(0,1), so precomputed Lloyd-Max tables quantize to 4 bits (8x smaller) with no learned codebook and no data pass. The index persists as a single .mvec file whose embedded ChaCha20 rotation seed makes results reproducible across architectures and byte-identical within a build -- a determinism guarantee that parallel-build graph libraries cannot offer. On semantic embeddings (AG News, 45K x 1024-dim BGE-M3, cosine), MonaVec 4-bit BruteForce reaches 0.960 Recall@10 in 27 MB -- leading float32 FAISS-IVF and 8-bit usearch on recall -- while trading peak throughput for byte-identical determinism. A single-pass global standardization (fit()) extends the same data-oblivious pipeline to magnitude-sensitive L2 data, and optional IvfFlat and HNSW backends carry it to million-vector corpora. MonaVec is implemented in pure Rust with Python bindings and runtime SIMD dispatch (AVX-512/AVX2/NEON/scalar). It targets on-device RAG, offline agents, and embedded retrieval -- the niche SQLite occupies for relational data: one file, one call, runs anywhere.
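
The data-oblivious pipeline described here (seeded sign flips, a Hadamard transform, then fixed 4-bit tables) can be sketched in a few lines of standard-library Python. This is not MonaVec's code or API. Two stand-ins are assumptions: Python's `random.Random` replaces the ChaCha20 seed, and equal-probability N(0,1) bins replace true Lloyd-Max tables. Scaling the input so that rotated coordinates have roughly unit variance is also left out.

```python
import random
from bisect import bisect_right
from statistics import NormalDist


def fwht(vec):
    """Fast Walsh-Hadamard transform with orthonormal scaling; len(vec) must be a power of two."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = n ** -0.5
    return [x * scale for x in out]


def randomized_hadamard(vec, seed):
    """Flip signs with a seeded RNG, then apply the Hadamard transform."""
    rng = random.Random(seed)  # assumption: stand-in for the ChaCha20 rotation seed
    signs = [1.0 if rng.random() < 0.5 else -1.0 for _ in vec]
    return fwht([s * x for s, x in zip(signs, vec)])


# 15 fixed cut points at N(0,1) quantiles split the real line into 16 bins.
# Assumption: equal-probability bins stand in for precomputed Lloyd-Max tables.
CUTS = [NormalDist().inv_cdf(k / 16) for k in range(1, 16)]


def quantize_4bit(vec):
    """Map each coordinate to a 4-bit code (0..15) using the data-independent cut points."""
    return [bisect_right(CUTS, x) for x in vec]


x = [3.0, -1.0, 0.5, 2.0, 0.0, -2.5, 1.5, 4.0]
rotated = randomized_hadamard(x, seed=7)
codes = quantize_4bit(rotated)
```

Because the sign pattern depends only on the seed and the cut points are fixed in advance, no pass over the corpus is needed, and the same seed reproduces the same codes.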

2606.19451 2026-06-19 cs.LG cs.CV cs.RO New submission

3D-DLP: Self-Supervised 3D Object-Centric Scene Representation Learning

Ellina Zhang, Madhaven Iyengar, Amir Zadeh, Chuan Li, Deepak Pathak, David Held, Tal Daniel

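Affiliations: Carnegie Mellon University

AI summary: Proposes 3D-DLP, a self-supervised model that decomposes scene-level RGB-D or voxel observations into 3D latent particles, each encoding disentangled attributes, yielding interpretable per-particle segmentation maps and supporting scene manipulation and downstream robotic manipulation.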

Comments ICML 2026. Project webpage: https://eubooks3003.github.io/3d-dlp

Abstract

We introduce 3D-DLP, a self-supervised object-centric representation learning model that decomposes scene-level RGB-D or voxel observations into a set of 3D latent particles. Building on the Deep Latent Particles (DLP) framework, each particle encodes disentangled attributes, including 3D keypoint position, bounding box dimensions, and appearance features, and represents a distinct entity in the scene. The model learns interpretable per-particle segmentation maps through an end-to-end self-supervised reconstruction objective. We demonstrate on both simulated and real-world datasets that the learned latent space is interpretable and controllable: by manipulating particle positions and decoding, we can generate novel scene configurations. Furthermore, we show that leveraging these compact 3D latent particles for downstream robotic manipulation improves performance over baselines that either lack explicit 3D information or rely on memory-intensive dense 3D inputs without object-centric structure. Code and videos are available at https://eubooks3003.github.io/3d-dlp.

2606.19419 2026-06-19 cs.RO cs.AI New submission

Playful Agentic Robot Learning

Junyi Zhang, Jiaxin Ge, Hanjun Yoo, Letian Fu, Zihan Yang, Yaowei Liu, Raj Saravanan, Shaofeng Yin, Justin Yu, Dantong Niu, Zirui Wang, Roei Herzig, Ken Goldberg, Yutong Bai, David M. Chan, Ion Stoica, Angjoo Kanazawa, Jiahui Lei, Haiwen Feng, Trevor Darrell

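Affiliations: University of California, Berkeley; Impossible Research

AI summary: Proposes RATs, a framework in which robots learn reusable skills through self-directed exploration, with gains of 20.6 and 17.0 percentage points on LIBERO-PRO and MolmoSpaces, respectively.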

Comments Project page: https://playful-rats.github.io/

Abstract

Current agentic robot systems can write executable Code-as-Policy programs, observe feedback, and revise behavior across multiple attempts, but they remain largely task-driven: reusable skills are acquired only after explicit instructions. We study Playful Agentic Robot Learning, where an embodied coding agent uses self-directed play as a continual skill-learning stage before downstream tasks arrive. We introduce RATs, Robotics Agent Teams designed for play-time skill acquisition. During play, RATs proposes novel yet learnable exploratory tasks, plans and executes robot-code policies, verifies intermediate progress, diagnoses failures, retries with dense, step-level feedback, and distills successful executions into a persistent code skill library. At test time, the agent reuses relevant skills from this frozen library to help solve new tasks. Experiments in LIBERO-PRO and MolmoSpaces show that play-learned skills improve held-out downstream tasks over no-play and random-play baselines, with 20.6 and 17.0 percentage-point gains over CaP-Agent0 on LIBERO-PRO and MolmoSpaces, respectively. Moreover, the learned skills can be plugged into other inference-time Code-as-Policy agents by simply retrieving them into the context, improving RoboSuite and real-world transfer by 8.9 and 8.8 points, respectively, without finetuning the underlying model.

2606.19416 2026-06-19 cs.LG New submission

MortarBench: Evaluating Mortgage Loan Origination Agents

Matthew Toles, Yunan Lu, Manav Munjal, Bojun Liu, Yuanhao Deng, Stephanie Selig, Derek Rindner, Cheng Li, Zhou Yu

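Affiliations: Columbia University; Tidalwave

AI summary: Presents the MortarBench benchmark, which uses a financial data synthesis and mutation pipeline to generate examples covering edge cases and evaluates large language models on loan origination; finds low accuracy and bias, and introduces the CRIT calibration framework, which raises accuracy to 80.5%.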

Abstract

Loan origination is the process by which a lender creates a new loan, from application and underwriting through approval and funding. This process serves a critical role in evaluating the eligibility and level of risk posed by an applicant. Recently, firms have begun using mortgage loan agents to augment human loan officers, despite a lack of any public benchmark. To fill this gap, we present MortarBench, a loan origination agent benchmark. MortarBench uses a financial data synthesis and mutation pipeline to generate examples with broad edge case coverage that match real-world distributions and questions. We find that state-of-the-art large language models (LLMs) perform poorly, with closed-source models achieving at most 77.1% exact match accuracy. We also discover systematic biases in LLM perception of foreignness related to non-English names. Noting these weaknesses, we introduce CRIT, a confidence calibration framework. Our method increases accuracy to 80.5% while improving risk management steering and reducing bias.

2606.19413 2026-06-19 cs.LG New submission

Does Text Actually Help? Uncovering and Resolving Text Collapse in Multimodal Time Series Forecasting

Huu Hiep Nguyen, Minh Hoang Nguyen, Dung Nguyen, Hung Le

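Affiliations: Applied Artificial Intelligence Initiative

AI summary: Addresses "text collapse" in multimodal time series forecasting, where the text branch is ignored, by proposing REST-TS, which supervises the text branch exclusively on the residual the numerical backbone cannot explain, forcing it to extract genuine content and reaching state-of-the-art performance.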

Abstract

Multimodal time series forecasting, which pairs numerical sequences with domain-relevant textual reports, promises to inject world knowledge into forecasting pipelines. However, we uncover a critical failure mode in existing frameworks that we term text collapse: the text branch converges to a content-independent transformation, contributing negligible discriminative signal regardless of the input description. We argue that text collapse is a consequence of a fundamental asymmetry in time series forecasting: the numerical input is strongly autocorrelated with the output, making the numerical backbone inherently dominant, while the text branch, despite carrying complementary and often critical information, is systematically underexploited. To address this, we propose REST-TS (Residual-Exclusive Supervision for Text in Time Series), which turns the asymmetry into a design principle: the numerical backbone produces its own independent numerical forecast, and the text branch is exclusively supervised to predict the structured components of the residual, the prediction gap that numbers cannot explain. Because no numerical pathway can reduce these losses, the text branch must extract genuine content from the input description. Evaluated across diverse real-world domains and backbone architectures, REST-TS achieves state-of-the-art performance and consistently demonstrates greater text-branch utilization than existing frameworks, providing strong empirical evidence that supervising the text branch on the residual compels it to extract genuine content from the input.
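
The residual-exclusive supervision described in this abstract can be sketched as two separate losses. This is a simplified illustration under stated assumptions, not the REST-TS implementation: the text branch here targets the raw residual rather than its "structured components", and the function names and toy numbers are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def rest_style_losses(numeric_forecast, text_output, target):
    """Return (numeric_loss, text_loss).

    The numeric branch is scored against the target. The text branch is scored only
    against the residual the numeric forecast leaves behind, so it cannot lower its
    loss by copying the numeric signal.
    """
    residual = [t - n for t, n in zip(target, numeric_forecast)]
    return mse(numeric_forecast, target), mse(text_output, residual)


target = [10.0, 12.0, 15.0]
numeric_forecast = [10.0, 11.0, 12.0]  # misses a jump that a text report might explain
text_correction = [0.0, 1.0, 3.0]      # hypothetical text-branch output

numeric_loss, text_loss_zero = rest_style_losses(numeric_forecast, [0.0, 0.0, 0.0], target)
_, text_loss_fit = rest_style_losses(numeric_forecast, text_correction, target)
combined = [n + r for n, r in zip(numeric_forecast, text_correction)]
```

A text branch that outputs zeros is left with the full residual as its loss, whereas one that recovers the residual reaches zero loss and, added to the numeric forecast, reproduces the target.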

2606.19412 2026-06-19 cs.LG New submission

Spectral Retrieval-Augmented Time-Series Forecasting

Huu Hiep Nguyen, Minh Hoang Nguyen, Dung Nguyen, Hung Le

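Affiliations: Applied Artificial Intelligence Initiative; Deakin University

AI summary: Proposes SpecReTF, which converts time series into windowed frequency representations, measures similarity with a metric combining amplitude and phase, and applies exponential-moving-average weighting, addressing the spectral-blindness and temporal-recency limitations of existing retrieval methods and improving forecasting accuracy on non-stationary series.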

Abstract

Time series forecasting leverages historical patterns to predict future values, but traditional methods face challenges when dealing with complex, non-stationary patterns that are difficult to memorize during training. Retrieval-augmented approaches have emerged as promising solutions by retrieving similar historical patterns to enhance predictions. However, existing retrieval methods suffer from two fundamental limitations: spectral blindness, which overlooks critical frequency-domain characteristics that capture underlying periodic structures, and temporal recency, which treats all historical data equally without emphasizing recent, more relevant patterns. In this paper, we propose SpecReTF, a novel retrieval method that addresses these issues by converting time series into windowed frequency representations, measuring similarity with a combined metric that captures both amplitude and phase information. To balance recency and historical context, we apply an exponential moving average weighting scheme that emphasizes recent windows. Extensive experiments on benchmark datasets demonstrate that SpecReTF outperforms time-domain retrieval methods, achieving superior forecasting accuracy across diverse, non-stationary time series.
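
A minimal sketch of frequency-domain retrieval along the lines described above, using only the standard library. The abstract does not give the combined metric, so the blend below (cosine similarity of magnitudes mixed with a magnitude-weighted phase agreement) is an assumption, as are the `alpha` and `decay` parameters and all function names.

```python
import cmath
import math


def dft(window):
    """Discrete Fourier transform of one window (naive O(n^2) version)."""
    n = len(window)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window))
            for k in range(n)]


def spectral_similarity(a, b, alpha=0.5):
    """Blend amplitude similarity and phase alignment of two equal-length windows.

    Hypothetical metric: cosine similarity of magnitudes, mixed with the
    magnitude-weighted mean cosine of the per-frequency phase differences.
    """
    fa, fb = dft(a), dft(b)
    ma, mb = [abs(z) for z in fa], [abs(z) for z in fb]
    norm = math.sqrt(sum(m * m for m in ma)) * math.sqrt(sum(m * m for m in mb))
    amp = sum(x * y for x, y in zip(ma, mb)) / norm
    weights = [x * y for x, y in zip(ma, mb)]
    phase = sum(w * math.cos(cmath.phase(p) - cmath.phase(q))
                for w, p, q in zip(weights, fa, fb)) / sum(weights)
    return alpha * amp + (1 - alpha) * phase


def ema_weights(num_windows, decay=0.8):
    """Normalised weights that favour recent windows (the last index is the newest)."""
    raw = [decay ** (num_windows - 1 - i) for i in range(num_windows)]
    total = sum(raw)
    return [r / total for r in raw]


query = [math.sin(2 * math.pi * t / 8) for t in range(8)]
shifted = [math.sin(2 * math.pi * t / 8 + math.pi) for t in range(8)]  # same amplitude, opposite phase
same = spectral_similarity(query, query)
opposite = spectral_similarity(query, shifted)
weights = ema_weights(4)
```

The two sine windows have identical amplitude spectra, so an amplitude-only metric would score them as a perfect match; the phase term is what separates them.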

2606.19411 2026-06-19 cs.LG New submission

Spectral DPPs via NEPv: A Scalable Continuous Relaxation of Determinantal MAP for Diversity-Aware Data Selection

Richard Yi Da Xu

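Affiliations: Hong Kong Baptist University; TadReamk Limited

AI summary: Recasts the NP-hard DPP-MAP selection problem as continuous optimization over the Stiefel manifold and solves it with a self-consistent field iteration for a nonlinear eigenvalue problem (NEPv) in near-linear time, suitable for large-scale data selection.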

Abstract

Selecting a small, diverse, high-quality subset from a massive pool of candidates is a recurring primitive in modern machine learning: data curation and coreset selection for training and fine-tuning large models, active-learning batch acquisition, prompt and exemplar selection for in-context learning, retrieval diversification, and experimental design. Determinantal Point Processes (DPPs) give a principled, well-calibrated notion of diversity for this task, but their MAP objective (pick a size-$k$ subset $S$ maximizing $\log\det(L_S)$) is NP-hard, and the standard greedy and sampling algorithms scale superlinearly in the ground-set size $n$. This cost is prohibitive precisely in the data-centric regime where diversity matters most, where $n$ ranges over millions to billions of candidate examples, features, or embeddings. We recast DPP-MAP as a continuous optimization problem over the Stiefel manifold, and show that its first-order optimality conditions form a Nonlinear Eigenvalue Problem with eigenvector dependency (NEPv) of a previously unstudied form. This NEPv admits a self-consistent field (SCF) iteration with a spectral-gap-based local contraction guarantee, giving a principled iterative solver where the diversity objective drives an eigenvector-dependent operator. The resulting algorithm, \OurMethod, requires only matrix-vector products with the kernel and runs in time $O\big((ndk+nk^2)\,t\big)$ for a small number of iterations $t$, scaling near-linearly in $n$ and integrating directly with low-rank and feature-map kernels common in ML. This paper focuses on the relaxation, solver, and scaling analysis; full real-data benchmarking is left to a planned empirical study.
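
For context on the objective, the sketch below implements the standard greedy baseline for maximizing $\log\det(L_S)$ that the abstract contrasts against. It is not the paper's Stiefel-manifold or SCF method. The toy kernel and helper names are assumptions, and the log-determinant is recomputed from scratch at every step for clarity rather than speed.

```python
import math


def logdet(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def submatrix(kernel, subset):
    """Rows and columns of the kernel restricted to the chosen indices."""
    return [[kernel[i][j] for j in subset] for i in subset]


def greedy_dpp_map(kernel, k):
    """Greedy baseline: repeatedly add the item that most increases log det(L_S)."""
    chosen = []
    for _ in range(k):
        candidates = [i for i in range(len(kernel)) if i not in chosen]
        best = max(candidates, key=lambda i: logdet(submatrix(kernel, chosen + [i])))
        chosen.append(best)
    return chosen


# Items 0 and 1 are near-duplicates; item 2 is dissimilar to both.
kernel_matrix = [[1.0, 0.9, 0.1],
                 [0.9, 1.0, 0.1],
                 [0.1, 0.1, 1.0]]
picked = greedy_dpp_map(kernel_matrix, 2)
```

The determinant penalizes similar pairs, so the greedy pass skips the near-duplicate and selects the dissimilar item. Each step scans every remaining candidate, which hints at why this style of baseline becomes costly as $n$ grows.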

2606.19409 2026-06-19 cs.SE cs.PL New submission

DevOps vs. General Developers: Insights from the Stack Overflow 2023 Survey

Hasan Abdulla, Fatema AlJazeeri, Fawzi AlBalooshi, Jaflah Al-Ammary

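AI summary: Analyzes Stack Overflow 2023 survey data to compare DevOps specialists and general developers on tools, technologies, methodologies, and demographics, finding complementary roles and no significant difference in tool preferences.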

Comments 17 pages, 11 tables, research paper based on the 2023 Stack Overflow Developer Survey data analysis

Abstract

Purpose: To investigate the distinct roles of DevOps specialists and general software developers, examining how they differ in tools, technologies, methodologies, and demographics in the current software development environment, and to differentiate the two groups' unique contributions and challenges. Design/Methodology/Approach: The research uses a quantitative approach to analyze data from the Stack Overflow 2023 Developer Survey. It compares technological preferences, demographic information, and professional experience between DevOps specialists and general developers, highlighting key trends and differences. The analysis was conducted with Python's Pandas library. Findings: The research indicates no significant difference in tool and technology preferences between DevOps specialists and general software developers, highlighting their complementary roles. Both groups use tools like Docker and Kubernetes, emphasizing efficiency and automation, while general developers also employ diverse tools for various role demands. Demographic trends show younger general developers and mid-career DevOps professionals, an age pattern that reflects growing experience in DevOps. Both groups are adapting to remote and hybrid work models in the evolving tech industry. Practical Implications: This research offers perspectives on the dynamic roles within software development, emphasizing the growing importance of DevOps. It is a valuable resource for academic and industry professionals to understand the evolving dynamics in software development roles. Originality/Value: This research fills a significant gap in the existing literature regarding the evolving dynamics of software development roles.
