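arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.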

2508.20206 2026-05-13 cs.LG cs.AI

Filter then Attend: Improving attention-based Time Series Forecasting with Spectral Filtering

Elisha Dayag, Nhat Thanh Van Tran, Jack Xin

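AI summary: This paper studies how frequency-domain filtering can improve Transformer-based long-term time series forecasting models. The authors add a learnable frequency-domain filter at the model's input stage to strengthen the model's use of different frequency components. Experiments show that the method improves forecasting performance on multiple datasets and allows a smaller embedding dimension, making the model smaller and more efficient.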
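
As a rough illustration of the filter-then-attend idea summarized above, the sketch below scales each frequency bin of an input window before it would be handed to a forecasting model. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: `spectral_filter` and its `weights` argument are hypothetical names, the weights are fixed here whereas the summary describes them as learnable, and a naive DFT stands in for an FFT.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_filter(x, weights):
    """Scale frequency bin k of x by weights[k] and return to the time domain.

    Assumption: weights[k] == weights[n - k] for k >= 1, so the filtered
    signal stays real; any residual imaginary part is dropped.
    """
    spectrum = dft(x)
    filtered = [w * c for w, c in zip(weights, spectrum)]
    return [value.real for value in idft(filtered)]


# A constant level of 2 plus an alternating component at the highest frequency.
window = [2 + (-1) ** t for t in range(8)]

# Hypothetical low-pass weights that keep only the constant (DC) bin.
low_pass = [1.0] + [0.0] * 7
smoothed = spectral_filter(window, low_pass)
```

With these weights the alternating component is removed and `smoothed` holds the constant level; in a trained model the weights would instead be fitted jointly with the forecaster.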

2508.16070 2026-05-13 cs.CL

Less Redundancy: Boosting Practicality of Vision Language Model in Walking Assistants

Chongyang Li, Zhiqiang Yuan, Hanbo Bi, Zexi Jia, Jinchao Zhang

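AI summary: This paper studies how to make vision language models more practical in walking-assistance systems for visually impaired users. To address redundant outputs and the lack of proactive environmental risk assessment in existing models, it proposes WalkVLM-LR, a walking-assistance model with reduced redundancy. The model uses reward functions based on human preferences to optimize output conciseness and accuracy, and adds an environment-awareness discriminator to make risk assessment more efficient. Experiments show that it outperforms existing methods in output conciseness and temporal redundancy.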

Comments ICASSP 2026 Best Industry Paper

2508.14780 2026-05-13 cs.LG cs.IT math.IT

Context Steering: A New Paradigm for Compression-based Embeddings by Synthesizing Relevant Information Features

Guillermo Sarasa, Ana Granados, Francisco de Borja Rodríguez

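AI summary: This paper proposes Context Steering, a new method for compression-based embeddings that synthesizes relevant information features to make embeddings better adapted to a task. The method actively steers the feature generation process, analyzing the relational influence of each object within a clustering framework to produce tailored embeddings that emphasize the information distinguishing one class from another. Experiments show that the method produces robust, task-oriented embeddings across a variety of heterogeneous datasets and improves classification and clustering performance.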

2508.10036 2026-05-13 cs.CL cs.AI cs.IR cs.LG

Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion

Dong Zhao, Yadong Wang, Xiang Chen, Chenxi Wang, Hongliang Dai, Chuanxing Geng, Shengzhong Zhang, Shaoyuan Li, Sheng-Jun Huang

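AI summary: This study proposes APIE, an active prompting framework for guiding large language models in information extraction tasks. Based on the principle of "introspective confusion", the method quantifies two dimensions, format uncertainty and content uncertainty, to assess the model's own confusion, and uses that assessment to select the most challenging and informative samples as few-shot exemplars. Experiments show that the method significantly improves the accuracy and robustness of information extraction on multiple benchmark datasets.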

Comments Published at AAAI 2026

2508.08420 2026-05-13 cs.LG stat.ML

Regret minimization in Linear Bandits with offline data via extended D-optimal exploration

Sushant Vijayan, Arun Suggala, Karthikeyan Shanmugam, Soumyabrata Pal

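AI summary: This paper studies how to minimize cumulative online regret in linear bandits when offline data is available. It proposes the Offline-Online Phased Elimination (OOPE) algorithm, which uses an extended D-optimal design in the exploration phases to exploit the offline data and substantially reduce online regret. The algorithm's online regret bound is $\tilde{O}(\sqrt{d_{\mathrm{eff}} T \log(|\mathcal{A}|T)} + d^2)$, where $d_{\mathrm{eff}}$ is the number of poorly explored directions in the offline data and reflects the quality of that data. The paper also gives minimax regret lower bounds that depend on the quality of the offline data, and uses a Frank-Wolfe approximation to further improve the $d^2$ term of the bound.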

Comments Accepted to TMLR, with J2C certification, link: https://openreview.net/forum?id=4WcK8gKgCi

Abstract

We consider the problem of online regret minimization in linear bandits with access to prior observations (offline data) from the underlying bandit model. There are numerous applications where extensive offline data is often available, such as recommendation systems and online advertising. Consequently, this problem has been studied intensively in recent literature. Our algorithm, Offline-Online Phased Elimination (OOPE), effectively incorporates the offline data to substantially reduce the online regret compared to prior work. To leverage offline information prudently, OOPE uses an extended D-optimal design within each exploration phase. OOPE achieves an online regret of $\tilde{O}(\sqrt{d_{\mathrm{eff}} T \log(|\mathcal{A}|T)} + d^2)$, where $d_{\mathrm{eff}}$ ($\leq d$) is the effective problem dimension, which measures the number of poorly explored directions in the offline data and depends on the eigen-spectrum $(\lambda_k)_{k \in [d]}$ of the Gram matrix of the offline data. The eigen-spectrum $(\lambda_k)_{k \in [d]}$ is a quantitative measure of the quality of offline data. If the offline data is poorly explored ($d_{\mathrm{eff}} \approx d$), we recover the established regret bounds for the purely online setting, while when offline data is abundant ($T_{\mathrm{off}} \gg T$) and well-explored ($d_{\mathrm{eff}} = o(1)$), the online regret reduces substantially. Additionally, we provide the first known minimax regret lower bounds in this setting that depend explicitly on the quality of the offline data. These lower bounds establish the optimality of our algorithm in regimes where offline data is either well-explored or poorly explored. Finally, by using a Frank-Wolfe approximation to the extended optimal design, we further improve the $O(d^{2})$ term to $O\left(\frac{d^{2}}{d_{\mathrm{eff}}} \min\{d_{\mathrm{eff}}, 1\}\right)$, which can be substantial in high dimensions with moderate quality of offline data, $d_{\mathrm{eff}} = \Omega(1)$.

2508.05269 2026-05-13 cs.CV

B4DL: A Benchmark for 4D LiDAR LLM in Spatio-Temporal Understanding

Changho Choi, Youngwoo Shin, Gyojin Han, Dong-Jae Lee, Junmo Kim

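AI summary: This study proposes B4DL, a new benchmark for training and evaluating the spatio-temporal understanding of multimodal large language models (MLLMs) on 4D LiDAR. To address the limited use of 4D LiDAR data in MLLMs, it designs a scalable data generation pipeline and proposes the first MLLM that directly processes raw 4D LiDAR data and connects it with language understanding, providing a unified solution for spatio-temporal reasoning in dynamic outdoor environments.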

Comments Accepted at ACM MM 2025

2507.21159 2026-05-13 cs.AI cs.LG cs.MA

MAC: Masked Agent Collaboration Boosts Large Language Model Medical Decision-Making

Zhihao Peng, Liuxin Bao, Yixuan Yuan

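AI summary: This study proposes MAC, a Masked Agent Collaboration framework aimed at improving the performance of large language models in medical decision-making. Through Pareto-optimal agent construction and a cross-consistency maximization mechanism, the method achieves adaptive progressive propagation of collaborative information, improving the accuracy and robustness of medical decisions. The study also introduces model diversity assessment and an output-consistency screening strategy to optimize agent collaboration and reduce the impact of semantic inconsistency.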

Abstract

Large language models (LLMs) have proven effective in artificial intelligence, where the multi-agent system (MAS) holds considerable promise for healthcare development by achieving the collaboration of LLMs. However, the absence of a systematic pipeline for agent construction and the rigidity of static collaboration patterns render current MAS-based models vulnerable to collaboration failures, resulting in substantial performance degradation in medical decision-making scenarios. To this end, we propose a novel Masked Agent Collaboration (MAC) framework that harnesses Pareto-optimal agent construction and cross-consistency maximization mechanisms to achieve adaptive progressive propagation of collaborative information, boosting the medical decision-making capacity. Specifically, we first conduct a Pareto-frontier factors analysis towards the LLMs pool to consider their key factors, including the model size, inference time, diversity score, and throughput ratio, where we calculate the similarity between pairwise outputs within an LLM to derive its diversity score. Beyond this analysis, we enable the identification of Pareto-optimal models that balance efficiency and capability, which are subsequently selected as collaborative agents to consider the fundamental trade-offs inherent in practical LLM deployment. Afterward, we measure the pairwise similarity between the outputs from collaborative agents to determine their cross-consistency values, subsequently masking out the agent with the lowest cross-consistency value to eliminate the output that is likely semantically inconsistent. Finally, we conduct collaboration of agents by achieving adaptive progressive propagation, where each agent aggregates the outputs of unmasked agents from the previous layer as its input to generate the corresponding output via prompt engineering.
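
The masking step described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the MAC implementation: token-level Jaccard overlap is a stand-in for whatever similarity measure the authors use, averaging an agent's pairwise similarities is one assumed way to obtain a cross-consistency value, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Token-overlap similarity between two strings (stand-in similarity measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0


def cross_consistency(outputs):
    """Average similarity of each agent's output to every other agent's output.

    Assumes at least two agents.
    """
    n = len(outputs)
    scores = []
    for i in range(n):
        others = [jaccard(outputs[i], outputs[j]) for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def mask_least_consistent(outputs):
    """Return (indices of kept agents, index of the masked agent)."""
    scores = cross_consistency(outputs)
    worst = min(range(len(outputs)), key=scores.__getitem__)
    kept = [i for i in range(len(outputs)) if i != worst]
    return kept, worst


# Hypothetical agent outputs for one question.
agent_outputs = [
    "diagnosis is pneumonia",
    "likely diagnosis is pneumonia",
    "patient has a fracture",
]
kept, masked = mask_least_consistent(agent_outputs)
```

Here the third output shares no tokens with the other two, so it receives the lowest score and is masked; the kept outputs would then be aggregated as input to the next layer of agents.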

2507.13625 2026-05-13 cs.AI

Bridging Dual Knowledge Graphs for Multi-Hop Question Answering in Construction Safety

Yuxin Zhang, Xi Wang, Mo Hu, Zhenyu Zhang

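AI summary: This paper studies multi-hop question answering over complex construction safety regulations to support automated compliance checking. It proposes BifrostRAG, a dual-graph retrieval-augmented generation system that models both linguistic relationships and document structure, and uses a hybrid retrieval mechanism combining graph traversal with semantic vector search to improve how large language models reason over regulatory content and structure. Experiments show that BifrostRAG performs strongly on a multi-hop question dataset, significantly outperforming vector-only and graph-only baselines, and offers a transferable approach for intelligent processing of complex technical documents.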

Comments 22 pages, 13 figures

2507.12002 2026-05-13 cs.LG

Detecting In-Person Conversations in Noisy Real-World Environments with Smartwatch Audio and Motion Sensing

Alice Zhang, Callihan Bertley, Dawei Liang, Edison Thomaz

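AI summary: This study proposes a new method for detecting in-person spoken conversations in real-world environments from smartwatch audio and motion sensing data. Fusing audio from the microphone with data from a six-axis inertial sensor, the study designs and trains convolutional and attention-based neural networks to recognize nonverbal communication cues. Experiments show that multimodal fusion significantly improves detection performance, reaching macro F1 scores of 82.0% in the lab setting and 77.2% in the semi-naturalistic setting, validating the method's effectiveness in practical use.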

Comments Accepted to ACM Transactions on Intelligent Systems and Technology

2507.06694 2026-05-13 cs.LG cs.SY eess.SP eess.SY

Heterogeneous Graph Neural Networks for Short-term State Forecasting in Power Systems across Domains and Time Scales: A Hydroelectric Power Plant Case Study

Raffael Theiler, Olga Fink

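AI summary: This paper studies how to use heterogeneous graph neural networks for short-term state forecasting in power systems across multiple physical domains and time scales. To address the limitations of conventional graph neural networks in handling heterogeneous sensor data, the authors propose a method based on heterogeneous graph attention networks that models sensor relationships both within and across two domains, hydraulic and electrical. Experiments show that the method improves on conventional baselines by 35.5% on average in normalized root mean square error, validating its effectiveness for multi-domain, multi-time-scale power system state forecasting.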

Comments 25 pages, 9 figures

DOI: 10.1088/3049-4761/ae565c

Abstract

Accurate short-term state forecasting is essential for efficient and stable operation of modern power systems, especially in the context of increasing variability introduced by renewable and distributed energy resources. As these systems evolve rapidly, it becomes increasingly important to reliably predict their states in the short term to ensure operational stability, support control decisions, and enable interpretable monitoring of sensor and machine behavior. Modern power systems often span multiple physical domains - including electrical, mechanical, hydraulic, and thermal - posing significant challenges for modeling and prediction. Graph Neural Networks (GNNs) have emerged as a promising data-driven framework for system state estimation and state forecasting in such settings. By leveraging the topological structure of sensor networks, GNNs can implicitly learn inter-sensor relationships and propagate information across the network. However, most existing GNN-based methods are designed under the assumption of homogeneous sensor relationships and are typically constrained to a single physical domain. This limitation restricts their ability to integrate and reason over heterogeneous sensor data commonly encountered in real-world energy systems, such as those used in energy conversion infrastructure. In this work, we propose the use of Heterogeneous Graph Attention Networks to address these limitations. Our approach models both homogeneous intra-domain and heterogeneous inter-domain relationships among sensor data from two distinct physical domains - hydraulic and electrical - which exhibit fundamentally different temporal dynamics. Experimental results demonstrate that our method significantly outperforms conventional baselines on average by 35.5% in terms of normalized root mean square error, confirming its effectiveness in multi-domain, multi-rate power system state forecasting.

2507.03622 2026-05-13 cs.LG cs.AI stat.ML

Localising Dropout Variance in Twin Networks

Cooper Doyle

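AI summary: This paper studies how to localize the source of predictive uncertainty in twin-network models and proposes a layer-wise variance decomposition that splits total predictive variance into an encoder component and an output-head component. By controlling Monte Carlo dropout independently in the shared encoder and in the output heads, the method can separate uncertainty from the different sources. Experiments show that encoder variance dominates under distribution shift and is the main indicator of prediction error, while head variance becomes informative only once encoder uncertainty is controlled. The method is inexpensive and offers practical guidance for data collection.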
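
One standard way to split variance across two independently sampled sources is the law of total variance, and the sketch below applies it to a grid of Monte Carlo dropout predictions. It is an illustration under stated assumptions rather than the paper's estimator, which may differ: `samples[i][j]` is assumed to hold the prediction for encoder dropout mask `i` and head dropout mask `j`, every row has the same length, and the numbers are made up.

```python
from statistics import mean, pvariance


def decompose_variance(samples):
    """Split total predictive variance into encoder and head components.

    samples[i][j]: prediction under encoder mask i and head mask j
    (hypothetical layout; all rows must have equal length).
    """
    row_means = [mean(row) for row in samples]
    # Variance across encoder masks of the head-averaged prediction.
    encoder_var = pvariance(row_means)
    # Average, over encoder masks, of the variance across head masks.
    head_var = mean(pvariance(row) for row in samples)
    total_var = pvariance([value for row in samples for value in row])
    return encoder_var, head_var, total_var


# Made-up predictions: 2 encoder masks x 2 head masks.
samples = [
    [1.0, 3.0],
    [5.0, 7.0],
]
encoder_var, head_var, total_var = decompose_variance(samples)
```

With population variances and equal row lengths the two components add up exactly to the total, which makes it possible to read off which part of the network contributes most of the spread.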

Comments 14 pages, 5 figures, 3 tables

2506.23723 2026-05-13 cs.RO

A comprehensive control architecture for semi-autonomous dual-arm robots in agriculture settings

Jozsef Palmieri, Paolo Di Lillo, Stefano Chiaverini, Alessandro Marino

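AI summary: This paper proposes a comprehensive control architecture for semi-autonomous dual-arm robots in agricultural settings, aimed at complex tasks such as grape harvesting. Built on a 16-DOF dual-arm mobile robot, the architecture uses Hierarchical Quadratic Programming (HQP) to handle equality and inequality constraints at multiple priorities while harvesting grape bunches selected by the perception system. To cope with environmental uncertainty and potential collisions, the architecture also handles interaction forces within the same HQP framework and lets a human operator assist in completing tasks. It is validated through extensive testing in the laboratory and in a real vineyard.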

DOI: 10.1016/j.conengprac.2025.106394
Journal ref: Control Engineering Practice, Vol. 163, 2025

Abstract

The adoption of mobile robotic platforms in complex environments, such as agricultural settings, requires these systems to exhibit a flexible yet effective architecture that integrates perception and control. In such scenarios, several tasks need to be accomplished simultaneously, ranging from managing robot limits to performing operational tasks and handling human inputs. The purpose of this paper is to present a comprehensive control architecture for achieving complex tasks such as robotized harvesting in vineyards within the framework of the European project CANOPIES. In detail, a 16-DOF dual-arm mobile robot is employed, controlled via a Hierarchical Quadratic Programming (HQP) approach capable of handling both equality and inequality constraints at various priorities to harvest grape bunches selected by the perception system developed within the project. Furthermore, given the complexity of the scenario and the uncertainty in the perception system, which could potentially lead to collisions with the environment, the handling of interaction forces is necessary. Remarkably, this was achieved using the same HQP framework. This feature is further leveraged to enable semi-autonomous operations, allowing a human operator to assist the robotic counterpart in completing harvesting tasks. Finally, the obtained results are validated through extensive testing conducted first in a laboratory environment to prove individual functionalities, then in a real vineyard, encompassing both autonomous and semi-autonomous grape harvesting operations.

2506.22809 2026-05-13 cs.LG cs.AI cs.CL

Learning Adapter Rank via Symmetry Breaking

Cooper Doyle, Andy Hu, Rebecca Chan, Anna Leontjeva

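AI summary: This study addresses the non-identifiability of rank coordinates in low-rank adaptation (LoRA). It introduces a diagonal posterior distribution through variational inference, which breaks LoRA's rotational symmetry and allows the importance of each rank direction to be determined automatically. Building on this, the study proposes BayesLoRA, a framework that performs Bayesian inference directly in the low-rank space and learns both the effective adapter rank and predictive uncertainty with only a few additional parameters. Experiments show that, at comparable training cost, it achieves tighter predictive calibration and outperforms existing low-rank sparsification methods.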

Comments 8 pages, 2 figures, 4 tables

2506.19417 2026-05-13 cs.LG cs.MA

Focusing Influence Mechanism for Multi-Agent Reinforcement Learning

Yisak Park, Sunwoo Lee, Seungyul Han

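AI summary: In sparse-reward environments, multi-agent reinforcement learning (MARL) struggles with coordinated exploration. This paper proposes the Focusing Influence Mechanism (FIM), which uses an entropy-based criterion to direct agents toward under-explored regions of the state space and uses eligibility traces to sustain multi-agent cooperation in beneficial regions, improving the coordination and persistence of joint behavior. Experiments show that FIM improves cooperative performance across a range of MARL benchmark tasks, with a particularly clear advantage in sparse-reward scenarios.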

Comments 9 technical pages followed by references and appendix

2506.14097 2026-05-13 cs.RO cond-mat.soft physics.comp-ph

Smooth-Rigid-Body Contact as a ReLCP: A Recursively Generated Linear Complementarity Problem

Bryce Palmer, Hasan Metin Aktulga, Tong Gao

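AI summary: This paper reformulates complementarity-based time-stepping for frictionless nonsmooth contact between smooth rigid bodies as a recursively generated linear complementarity problem (ReLCP), built up through a sequence of LCPs of increasing dimension. Starting from the classical single-constraint shared-normal signed-distance (SNSD) LCP, the method adds unilateral constraints only when the discrete time update predicted by the current contact set would cause surface penetration. It therefore operates directly on the smooth geometry, guaranteeing non-penetration and avoiding the oversampling that surrogate surface models introduce. Theoretical analysis shows that, for strictly convex bodies and sufficiently small time steps, the method guarantees finite termination and uniqueness of the velocity update; numerical experiments confirm its stability and efficiency at large time steps.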

2506.08902 2026-05-13 cs.LG cs.AI

Intention-Conditioned Flow Occupancy Models

Chongyi Zheng, Seohong Park, Sergey Levine, Benjamin Eysenbach

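AI summary: This paper proposes the intention-conditioned flow occupancy model (InFOM), a probabilistic model that predicts the distribution of states an agent may visit in the distant future. The model is built with flow matching and introduces a latent variable that captures user intention, which increases the model's expressivity and enables generalized policy improvement. Experiments show that, compared with existing methods on multiple benchmark tasks, InFOM achieves a 1.8x improvement in average return and a 36% increase in success rate.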

Comments ICLR 2026

2506.02215 2026-05-13 cs.RO cs.SY eess.SY

Active inference as a unified model of collision avoidance behavior in human drivers

Julian F. Schumann, Johan Engström, Leif Johnson, Matthew O'Kelly, Joao Messias, Jens Kober, Arkady Zgonnikov

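AI summary: This paper proposes a computational cognitive model based on active inference theory to give a unified account of human drivers' decision-making in collision avoidance. The model simulates human responses by minimizing free energy in two typical collision scenarios, sudden braking by a lead vehicle and lateral incursion by an oncoming vehicle, and reproduces results from several earlier empirical studies, such as reaction times and the choice of avoidance strategy. The study demonstrates the potential of active inference as a unified framework for understanding human behavior in complex driving tasks.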

2505.20761 2026-05-13 cs.LG stat.ML

Practical estimation of the optimal classification error with soft labels and calibration

Ryota Ushio, Takashi Ishida, Masashi Sugiyama

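AI summary: This paper studies how to estimate the optimal classification error (the Bayes error) in binary classification in a way that is both practical and theoretically rigorous. The authors extend the earlier soft-label-based approach in two ways. First, they analyze the bias of the hard-label-based estimator, showing that its decay rate depends on how well separated the two class-conditional distributions are and that it can substantially improve on earlier results as the number of hard labels per instance grows. Second, they address estimation from corrupted soft labels, showing that estimates can remain inaccurate even with calibrated soft labels, and propose an estimator based on isotonic calibration that is statistically consistent under weaker assumptions. The method does not require access to the input instances, which makes it suitable for privacy-constrained settings. Experiments validate its effectiveness.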

Comments ICLR 2026 camera-ready version updated; 40 pages, 12 figures; GitHub: https://github.com/RyotaUshio/bayes-error-estimation

2505.19770 2026-05-13 cs.LG cs.CL

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Ruizhe Shi, Minhak Song, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon S. Du

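AI summary: This paper gives a fine-grained theoretical analysis of the performance gap between two-stage reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO), showing that the gap arises from an explicit representation gap under exact optimization and an implicit representation gap under finite samples. Under exact optimization, the relative capacity of the reward model and the policy model affects final policy quality, and RLHF, DPO, or online DPO can each come out ahead depending on the type of model misspecification. Under approximate optimization, when the ground-truth reward is sparse, RLHF has a statistical advantage in the number of samples needed to recover an effective reward model, indicating that two-stage learning is preferable in some settings. These results offer a comprehensive theoretical basis for understanding the performance differences between RLHF and DPO and guidance for choosing between them in practice.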

Comments ICML accepted version

2505.13770 2026-05-13 cs.AI cs.CL cs.LG stat.ME stat.ML

Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

Jin Du, Li Chen, Xun Xian, An Luo, Fangqiao Tian, Ganghua Wang, Charles Doss, Xiaotong Shen, Jie Ding

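AI summary: This study examines how well large language models (LLMs) handle statistical pitfalls in causal inference and finds clear weaknesses in current models on complex statistical issues such as Simpson's paradox and selection bias. It proposes CausalPitfalls, a comprehensive benchmark that uses structured challenges at multiple difficulty levels, together with grading rubrics, to systematically evaluate models' causal reasoning ability and response reliability. The experimental results expose the limitations of existing LLMs in statistical causal reasoning and provide a reference for building trustworthy causal reasoning systems.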

2505.05772 2026-05-13 cs.CL cs.LG

Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM

Zehao Fan, Garrett Gagnon, Zhenyu Liu, Liu Liu

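AI summary: This paper studies how to run large language model (LLM) decoding efficiently on processing-in-memory (PIM) architectures. Existing PIM designs are optimized for dense attention and struggle with the irregular access patterns introduced by key-value (KV) cache sparsity; to address this, the paper proposes STARC. STARC clusters KV pairs by semantic similarity and maps them to contiguous memory regions aligned with the PIM memory structure, reducing memory access and compute overhead during decoding. Experiments show that STARC significantly reduces attention-layer latency and energy consumption while maintaining model accuracy, demonstrating its effectiveness for efficient long-context LLM inference on PIM architectures.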

Comments Early preprint; peer-reviewed version of record published in ASPLOS '26

DOI: 10.1145/3779212.3790226

Abstract

Transformer-based models are the foundation of modern machine learning, but their execution, particularly during autoregressive decoding in large language models (LLMs), places significant pressure on memory systems due to frequent memory accesses and growing key-value (KV) caches. This creates a bottleneck in memory bandwidth, especially as context lengths increase. Processing-in-memory (PIM) architectures are a promising solution, offering high internal bandwidth and compute parallelism near memory. However, current PIM designs are primarily optimized for dense attention and struggle with the dynamic, irregular access patterns introduced by modern KV cache sparsity techniques. Consequently, they suffer from workload imbalance, reducing throughput and resource utilization. In this work, we propose STARC, a novel sparsity-optimized data mapping scheme tailored specifically for efficient LLM decoding on PIM architectures. STARC clusters KV pairs by semantic similarity and maps them to contiguous memory regions aligned with PIM bank structures. During decoding, queries retrieve relevant tokens at cluster granularity by matching against precomputed centroids, enabling selective attention and parallel processing without frequent reclustering or data movement overhead. Experiments on the HBM-PIM system show that, compared to common token-wise sparsity methods, STARC reduces attention-layer latency by 19% to 31% and energy consumption by 19% to 27%. Under a KV cache budget of 1024, it achieves up to 54% to 74% latency reduction and 45% to 67% energy reduction compared to full KV cache retrieval. Meanwhile, STARC maintains model accuracy comparable to state-of-the-art sparse attention methods, demonstrating its effectiveness in enabling efficient and hardware-friendly long-context LLM inference on PIM architectures.
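
The cluster-granularity retrieval described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the STARC implementation: the clusters and centroids are taken as precomputed, a dot product is assumed as the matching score, the mapping to PIM banks is not modeled, and `select_tokens` with its parameters is a hypothetical name.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_tokens(query, centroids, members, top_k):
    """Pick the top_k clusters whose centroids best match the query.

    centroids[c]: centroid vector of cluster c (assumed precomputed).
    members[c]:   token positions whose keys were assigned to cluster c.
    Returns the sorted token positions that attention would then cover.
    """
    ranked = sorted(
        range(len(centroids)),
        key=lambda c: dot(query, centroids[c]),
        reverse=True,
    )
    chosen = ranked[:top_k]
    return sorted(token for c in chosen for token in members[c])


# Made-up 2-D centroids and cluster memberships for eight cached tokens.
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
members = [[0, 1, 2], [3, 4], [5, 6, 7]]
query = [0.9, 0.1]

top1 = select_tokens(query, centroids, members, top_k=1)
top2 = select_tokens(query, centroids, members, top_k=2)
```

Because whole clusters are selected, the retrieved tokens would sit in contiguous regions if each cluster is stored contiguously, which is the property the abstract ties to balanced PIM workloads.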

2505.05665 2026-05-13 cs.RO cs.AI cs.CL

Characterizing the Robustness of Black-Box LLM Planners Under Perturbed Observations with Adaptive Stress Testing

Neeloy Chakraborty, John Pohovey, Melkior Ornik, Katherine Driggs-Campbell

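AI summary: This study examines the robustness of black-box large language model (LLM) planners when their observations are perturbed. It proposes two perturbation dimensions, one modeling semantically similar prompt variations and the other modeling sensor noise, and uses adaptive stress testing (AST) with Monte Carlo tree search (MCTS) to explore the perturbation space efficiently and find scenarios and configurations that drive the model into high uncertainty or failure. Experiments show that the method can identify potential runtime failures in advance, improving the reliability of LLMs in safety-critical settings.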

Comments Accepted to ACL Findings 2026; 31 pages, 26 figures, 6 tables

2504.14707 2026-05-13 cs.CL

FLAME: A New Dataset on FLemish Accounts of Momentary Experiences

Ratna Kandala, Niels Vanhasbroeck, Katie Hoemann

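AI summary: This paper introduces the FLAME dataset, which contains nearly 25,000 personal narratives of everyday experiences written in Belgian Dutch (Flemish), intended to support natural language processing research on underrepresented language varieties. The study asks which topic modeling approach is best suited to uncovering latent themes in the corpus, comparing K-Means clustering, LDA, and BERTopic. It finds that BERTopic performs best at producing coherent and culturally relevant topics, highlighting the importance of contextual embeddings for topic modeling in low-resource, culturally specific domains.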

2503.23947 2026-05-13 cs.CV

Spectral-Adaptive Modulation Networks for Visual Perception

Guhnoo Yun, Juhan Yoo, Kijung Kim, Jeongho Lee, Paul Hongsuck Seo, Dong Hwan Kim

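AI summary: This paper studies how 2D convolution and self-attention differ in their frequency-domain characteristics and uses graph spectral analysis to explain their frequency responses. Based on this analysis, the authors propose the Spectral-Adaptive Modulation (SPAM) mixer, which processes visual features adaptively using multi-scale convolution kernels and a spectral rescaling mechanism. Building on SPAM, they construct a new vision backbone, SPANetV2, which outperforms existing state-of-the-art models on multiple vision tasks.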

Comments Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

2503.18760 2026-05-13 cs.CL

Synthetic Function Demonstrations Improve Generation in Low-Resource Programming Languages

Nick McKenna, Xinnuo Xu, Jack Williams, Nick Wilson, Benjamin Van Durme, Christian Poelitz

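AI summary: This study explores how to improve the code generation of large language models in low-resource programming languages. To address the lack of real data, the authors propose generating synthetic function demonstrations: they curate the language's documentation and use a strong teacher model to generate high-quality training data, then fine-tune a student model on it. Experiments show that the method significantly improves model performance on a question-answering dataset and outperforms conventional retrieval-augmented generation.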

Comments Published at LREC 2026

2503.16072 2026-05-13 cs.LG cs.AI cs.CL

Toxicity Detection Should Measure Contextual Harm, Not Text-Intrinsic Badness

Sergei Berezin, Reza Farahbakhsh, Noel Crespi

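AI summary: This paper argues that current toxicity detection methods tend to treat toxicity as an intrinsic property of the text and overlook the harm it actually causes in context. The authors argue that toxicity detection should assess the harm caused by a communicative act in its situation, rather than be treated as pure text classification. They propose the Contextual Stress Framework (CSF), which defines toxicity as the relationship between norm violation and the stress or disruption it induces, and introduce the CSF-Eval evaluation scheme to measure toxicity detection more comprehensively.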

2503.09051 2026-05-13 cs.LG cs.AI

Model-Level GNN Explanations via Rule-to-Graph Readout for Logit Reconstruction

Shengyao Lu, Jiuding Yang, Aedan J. DeFrates, Keith G. Mills, Baochun Li, Di Niu

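AI summary: This paper proposes a new model-level explanation framework for graph neural networks (GNNs) that shifts the explanation objective from class-wise rule extraction to rule-based logit reconstruction. The method recasts the graph-level readout of a pretrained GNN as a weighted rule-level readout: it composes subgraph concepts into logical rules, computes rule embeddings directly from the symbolic structure, and then reconstructs the original multi-class logits using the frozen classifier head. Experiments show that the method reconstructs the original logits with high fidelity on multiple graph classification datasets and outperforms existing methods in explanation efficiency and functional analysis.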

2503.00341 2026-05-13 cs.RO cs.SY eess.SY

Feasible Force Set Shaping for a Payload-Carrying Platform Consisting of Tiltable Multiple UAVs Connected Via Passive Hinge Joints

Takumi Ito, Hayato Kawashima, Riku Funada, Mitsuji Sampei

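AI summary: This paper studies feasible force set shaping for a payload-carrying platform made up of multiple tiltable UAVs connected via passive hinge joints, and proposes a control law that takes advantage of the shaped force set. By adjusting the UAVs' tilt angles, the feasible force set can be shaped to contain a desired shape, giving the platform redundant force generation in multiple directions. The method improves the platform's payload control capability and maneuverability.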

Comments This work has been accepted to IFAC for publication under a Creative Commons Licence CC-BY-NC-ND

2502.20209 2026-05-13 cs.CV cs.AI

DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild

Luis Marquez-Carpintero, Sergio Suescun-Ferrandiz, Carolina Lorenzo Álvarez, Jorge Fernandez-Herrero, Diego Viejo, Rosabel Roig-Vila, Miguel Cazorla

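AI summary: This paper presents DIPSER, a new dataset for assessing students' attention levels in real classroom settings. The dataset includes multi-angle RGB camera data and smartwatch sensor data, capturing students' posture, facial expressions, and physiological indicators, and provides attention and emotion labels derived from students' self-reports and assessments by four experts. It combines facial and environmental camera data with wearable-device metrics and covers ethnic groups that are underrepresented in earlier datasets, making it the most comprehensive dataset to date for analyzing student attention and emotion in in-person classroom teaching.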

2502.19716 2026-05-13 cs.CV cs.LG

Fully AI-Generated Image Detection: Definition, Recent Advances and Challenges

Qijie Xu, Can Wang, Jiawei Chen, Siwei Lyu, Defang Chen

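AI summary: This survey reviews research progress on detecting fully AI-generated images, covering the field's core problems, detection methods, and challenges. It focuses on two key stages, dataset construction and feature extraction, and systematically organizes how existing methods differ in their use of prior knowledge to extract generation traces. It also identifies the limitations of current detection techniques and outlines future directions for improving the robustness and generalization of detection.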