2606.19233 2026-06-18 cs.RO New submission

Mobile Pedipulation for Object Sliding via Hierarchical Control on a Wheeled Bipedal Robot

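Yue Qin, Yulun Zhuang, Zelin Shen, Yanran Ding

Affiliation: University of Michigan

AI summary: Proposes a hierarchical control framework that lets a wheeled bipedal robot slide planar objects with its legs, using a reduced-order three-rigid-body dynamics model and a trajectory-optimization motion planner; in experiments the robot retrieves a 1 kg object and slides a 4 kg object.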

Comments 8 pages, 7 figures

Abstract

In this letter, we present a hierarchical control framework that enables wheeled bipedal robots to perform planar object sliding tasks with their wheeled legs. The proposed approach formulates a nonlinear model predictive controller (NMPC) based on a reduced-order three rigid bodies (TRB) dynamical model that explicitly accounts for the hip roll degree of freedom and multiple wheel-environment contact modes, which is essential for lateral stepping and pedipulation tasks. Within this framework, the NMPC simultaneously regulates robot locomotion and interaction forces, allowing the robot to stably execute both rolling and object manipulation behaviors. A trajectory-optimization-based robot-object motion planner is developed to generate reference motions that incorporate stick-slip transitions in ground-object contact. Two representative pedipulation motions, namely scooting and lateral sliding, are validated through real-world hardware experiments, in which the robot successfully retrieves a 1 kg object from under a desk and slides a 4 kg object over a distance of 0.228 m via scooting.

2606.19230 2026-06-18 cs.LG cs.HC stat.ML New submission

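Machine Unlearning for XGBoost Models on Network Intrusion Datasets

Diana Magalhães, Eva Maia, João Vitorino, Isabel Praça

Affiliation: GECAD, ISEP, Polytechnic of Porto

AI summary: Proposes XGBoost-Forget, an unlearning method for XGBoost models that achieves efficient unlearning on tabular network intrusion datasets, substantially speeding up unlearning while preserving model performance.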

Comments 12 pages, 7 tables, WorldCist'26 Conference

2606.19218 2026-06-18 cs.CL New submission

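Forecasting What Matters: Decision-Focused Reinforcement Learning for Controlled EV Charging with Unknown Departure Times

Giuseppe Gabriele, Fabio Pavirani, Seyed Soroush Karimi Madahi, Chris Develder

Affiliation: Ghent University -- imec

AI summary: Addresses the poor performance of reinforcement learning policies for EV charging when departure times are unknown by proposing a decision-focused RL framework that trains the forecaster and controller jointly, end-to-end, improving total reward by up to 14% and reducing unsupplied energy by 55%.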

Comments ACM e-Energy 2026 5 pages, 1 figure, 1 table


DOI: 10.1145/3744255.3811736

Abstract

The recent growth of EV adoption poses challenges for power systems, including increased peak demand and potential grid instability. Smart control of EV charging -- e.g., based on reinforcement learning (RL) -- can alleviate these issues by learning temporal and contextual patterns from historical data. Yet, in real-world scenarios, key features, such as departure time, often are unavailable. This, in turn, makes it harder for an RL agent to learn and execute an effective charging policy. To mitigate this uncertainty, a trained forecaster can approximate the unknown features from available data. However, since these forecasting models are typically trained for accuracy (rather than their impact on a downstream agent's decision quality), their errors may propagate and hinder the overall performance of a controller that is using the forecasts. To avoid this, we propose a decision-focused RL (DF-RL) framework in which the forecaster is trained end-to-end, i.e., with feedback from the charging policy actions taken by the RL agent. Such joint training of both the forecaster and controller ultimately results in higher-quality actions: our proposed DF-RL method yields superior charging decisions compared to other baselines, achieving up to a 14% improvement in total reward and a 55% reduction of unsupplied energy (i.e., charging that failed to happen because the EV already left), relative to the RL method without departure time forecasting.

2606.19195 2026-06-18 cs.CV New submission

Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance

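Kangsheng Duan, Ziyang Xu, Wenyu Liu, Xiaohu Ruan, Xiaoxin Chen, Xinggang Wang

Affiliations: Huazhong University of Science and Technology; VIVO AI Lab

AI summary: Proposes Moebius, a lightweight image inpainting framework that, through the Local-λ Mix Interaction (LλMI) block and an adaptive multi-granularity distillation strategy, matches or even surpasses the generation quality of the 10B-level model FLUX.1-Fill-Dev with 0.22B parameters, with more than 15x faster inference.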

Abstract

While 10B-level industrial foundation models have pushed the boundaries of image inpainting, their prohibitive computational costs severely hinder practical deployment. Constructing a highly optimized task-specific specialist offers a promising solution; however, extreme structural compression inevitably triggers a severe representation bottleneck. To conquer this, we propose Moebius, a highly efficient lightweight inpainting framework. We systematically reconstruct the diffusion backbone by introducing the Local-$λ$ Mix Interaction ($LλMI$) block. Comprising Local-$λ$ and Interactive-$λ$ modules, it elegantly summarizes spatial contexts and global semantic priors into fixed-size linear matrices, preserving complex latent interactions while drastically shedding parameters. Furthermore, to unlock the full representational capacity of this highly compact architecture, we synergistically pair it with an adaptive multi-granularity distillation strategy. Operating strictly within the latent space to avoid expensive pixel-space decoding, this strategy dynamically balances multiple gradient-based losses to achieve high-fidelity alignment. Extensive experiments across natural and portrait benchmarks demonstrate that this optimal synergy enables Moebius to rival or even surpass the generation quality of the 10B-level industrial generalist FLUX.1-Fill-Dev. Remarkably, Moebius achieves this using less than 2% of the parameters (0.22B vs. 11.9B) while delivering a $>15\times$ acceleration in total inference time, setting a new efficiency standard for high-fidelity inpainting. Project page at https://hustvl.github.io/Moebius.

2606.19190 2026-06-18 cs.RO New submission

FAST-LIVGO: A Degeneracy-Robust LiDAR-Inertial-Visual-GNSS Fusion Odometry

Zhiyu Chen, Chunran Zheng, Jiayu Wen, XiaoLei Zhang, Jiaming Xu, Feng Pan, Yukang Cui

Affiliations: College of Mechatronics and Control Engineering, Shenzhen University; Department of Mechanical Engineering, The University of Hong Kong; College of Automation, Harbin Engineering University

AI summary: Proposes a tightly coupled LiDAR-inertial-visual-GNSS fusion framework based on an error-state iterated Kalman filter; with a spatiotemporal alignment module using Dynamic Time Warping, Doppler and time-differenced carrier phase observation models, and a degeneracy-aware dual-mode outlier rejection strategy, it achieves accurate and robust state estimation in long-term, large-scale, dynamic environments.

Comments Accepted for presentation at the 2026 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2026)

Abstract

Robust state estimation and mapping in long-term, large-scale, and highly dynamic environments remains a key challenge in robotics. Existing LiDAR-Inertial-Visual Odometry (LIVO) systems achieve strong local accuracy but suffer from accumulated drift over long distances and may fail in geometrically degraded or textureless scenes. Meanwhile, GNSS-aided fusion frameworks often rely on LiDAR or visual odometry for state prediction and outlier rejection, making them vulnerable when odometry degenerates. To address these limitations, we propose a tightly coupled LiDAR-Inertial-Visual-GNSS fusion framework based on an Error-State Iterated Kalman Filter. An online spatiotemporal alignment module using Dynamic Time Warping is introduced for highly dynamic conditions. To better exploit GNSS precision, we develop observation models based on Doppler shifts and fixed-anchor Time-Differenced Carrier Phase, providing millimeter-level relative constraints without augmenting historical anchor states. We further design a degeneracy-aware dual-mode outlier rejection strategy that switches between LIVO-prior-guided rejection and GNSS-aided recovery according to the LIVO degeneracy level. Experiments on the public M3DGR dataset and a custom 20 m/s fixed-wing UAV dataset demonstrate that our system reduces accumulated drift and map ghosting, outperforming state-of-the-art methods in accuracy and robustness.
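The abstract does not say how the Dynamic Time Warping alignment module is built. As a rough illustration only, the textbook DTW recurrence on two 1-D signals might be sketched as below. The function name `dtw_distance`, the absolute-difference cost, and the toy signals are assumptions made for this sketch and are not taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (illustrative, not the paper's module)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical signals: the second is the first with one repeated leading sample.
reference = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shifted_cost = dtw_distance(reference, delayed)
```

Because DTW may match one sample to several, the repeated leading sample is absorbed at no cost here, which is the property that makes the technique a candidate for aligning sensor streams with timing offsets.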

2606.19186 2026-06-18 cs.RO cs.LG New submission

Learning to Annotate Delayed and False AEB Events: A Practical System for Extreme Class Imbalance and Asymmetric Label Noise

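Mengxiang Hao, Xin Jiang, Xinghao Huang, Wenliang Su, Zhiteng Wang, Junjie Rao, Xiaotian Yang, Wei Liao, Chengyu Han, Gen Liang, Yulun Song, Zhitao Xu, Xianpeng Lang

Affiliation: Li Auto

AI summary: Proposes the first automated AEB annotation framework, which tackles extreme class imbalance and asymmetric label noise through specific data augmentation and noise suppression, improving recall of delayed/false triggers by 80% and reducing manual workload by 50%.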

Comments 8 pages, 5 figures, accepted by IEEE International Conference on Robotics and Automation (ICRA)


Journal ref: 2026 IEEE International Conference on Robotics and Automation (ICRA)

Abstract

Autonomous Emergency Braking (AEB) optimization relies on accurately annotated real-world trigger events, particularly rare but critical delayed and false AEB triggers that expose system deficiencies. However, these minority samples comprise less than 5% of thousands of daily triggers, making manual annotation prohibitively expensive at scale. We present the first automated AEB annotation framework to address this problem. During development, we identified two fundamental challenges that severely impair delayed/false trigger annotation accuracy: (1) Extreme class imbalance where delayed/false triggers are overwhelmed by true triggers; (2) Asymmetric label noise where mislabeled majority samples (true triggers) suppress minority samples (delayed/false triggers) learning. To overcome these challenges, we propose two key innovations: (1) Specific data augmentation that synthesizes realistic samples by manipulating focal target attributes, transplanting ego-vehicle dynamics, and masking non-focal agents; (2) noise suppression using stable hardness estimation and probe-guided adaptive threshold to clean mislabeled true trigger samples. Crucially, we deploy our model as a practical annotation system with full-stack architecture, efficiently identifying critical delayed/false triggers from thousands of daily AEB events. Production results demonstrate 80% improvement in recall of delayed/false triggers and 50% reduction in manual workload. Beyond immediate gains, the system enables continuous self-improvement through accumulated high-quality annotations, establishing a necessary data foundation for on-vehicle AEB system optimization.

2606.19185 2026-06-18 cs.LG New submission

AGDN: Learning to Solve Traveling Salesman Problem with Anisotropic Graph Diffusion Network

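Bolin Shen, Ziwei Huang, Zhiguang Cao, Yushun Dong

Affiliations: Florida State University; Singapore Management University

AI summary: Proposes the Anisotropic Graph Diffusion Network (AGDN), which uses a MixScore transition matrix and an anisotropic diffusion strategy to exploit graph structure when solving the Traveling Salesman Problem, outperforming existing methods across a range of instance sizes and distributions.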

Comments Accepted at the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)


DOI: 10.1145/3770855.3817789

Abstract

The Traveling Salesman Problem (TSP) is a cornerstone of combinatorial optimization and arises in many practical scenarios. Although graph-based learning approaches have been explored for TSP, the question of how to exploit graph structure more effectively remains open. We present the Anisotropic Graph Diffusion Network (AGDN), a new Graph Neural Network framework designed to solve TSP. Our method tackles two central difficulties: (1) the lack of informative topological prior in fully connected TSP graphs, and (2) losing connected nodes in the optimal solution after the commonly used graph sparsification techniques. To overcome these issues, we construct a MixScore transition matrix that merges node similarity with pairwise distance, and we develop an anisotropic graph diffusion strategy that supports efficient information exchange across multiple hops. Comprehensive experiments spanning diverse instance sizes and node distributions show that AGDN consistently outperforms existing methods while keeping computation time competitive. Furthermore, AGDN generalizes well to problem sizes and distributions beyond those seen during training. The implementation is publicly available at: https://github.com/LabRAI/AGDN.

2606.19184 2026-06-18 cs.CV cs.LG New submission

When AUC Misleads: Polarization-Aware Evaluation of Deepfake Detectors under Domain Shift

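Dat Nguyen, Cosmin Radoi, Romain Hermary, Marcella Astrid, Nesryne Mejri, Enjie Ghorbel, Djamila Aouada

Affiliation: Cristal Laboratory, National School of Computer Sciences, University of Manouba

AI summary: Observing that existing AUC evaluation fails to reflect real-world scenarios with mixed data sources and varying artifact types, proposes the Cross-dataset AUC (Cross-AUC) metric, which assesses robustness to domain shift by averaging per-domain AUCs and incorporating a prediction polarization measure (Wasserstein distance); experiments demonstrate its effectiveness.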

Abstract

Recent advances in generative AI, such as diffusion models and face-swapping tools, have enabled the creation of highly realistic deepfakes, leading to real-world harms including financial fraud and non-consensual explicit content. In response, deepfake detection has become an active research area, with recent methods increasingly focusing on improving generalization to unseen manipulations. This is typically evaluated using the Area Under the ROC Curve (AUC) measured separately across multiple datasets. However, such an evaluation fails to reflect real-world scenarios where detectors face a mixture of data sources and varying artifact types. To address this limitation, we introduce a novel metric, Cross-dataset AUC (Cross-AUC) that averages per-domain AUCs with a measure of prediction polarization for taking into account the robustness to domain shift. The polarization extent is quantified by the Wasserstein Distance between class score distributions. Cross-AUC not only assesses the generalization capabilities of deepfake detectors under domain shifts more realistically, but it is also interpretable as it better explains the reason behind a drop in performance. Experiments performed on seven benchmark datasets demonstrate its practical relevance.
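The abstract names two ingredients, per-domain AUC and a Wasserstein distance between class score distributions, but does not give the formula that combines them into Cross-AUC. The sketch below therefore computes only the two ingredients on made-up scores. The function names, the toy data, and the choice to compare real-versus-fake scores within each domain are assumptions for illustration, not the paper's definition.

```python
def auc(real_scores, fake_scores):
    """ROC AUC as the fraction of (fake, real) pairs ranked correctly; ties count half."""
    wins = 0.0
    for f in fake_scores:
        for r in real_scores:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake_scores) * len(real_scores))


def wasserstein_1d(xs, ys):
    """1-D Wasserstein-1 distance: integral of |CDF_x - CDF_y| between sample points."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # count samples less than or equal to the left edge of this interval
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total


# Hypothetical detector scores per domain: (scores on real images, scores on fakes).
domains = {
    "domain_a": ([0.1, 0.2], [0.8, 0.9]),
    "domain_b": ([0.4, 0.6], [0.5, 0.7]),
}
per_domain_auc = {name: auc(real, fake) for name, (real, fake) in domains.items()}
mean_auc = sum(per_domain_auc.values()) / len(per_domain_auc)
separation = {name: wasserstein_1d(real, fake) for name, (real, fake) in domains.items()}
```

In this toy data the second domain has both a lower AUC and a much smaller distance between its real and fake score distributions, which is the kind of difference a plain per-dataset AUC table does not make explicit.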

2606.19183 2026-06-18 cs.CL cs.AI New submission

Language Models as Interfaces, Not Oracles: A Hybrid LLM-ML System for Pediatric Appendicitis

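Soheyl Bateni, Maryam Abdolali

Affiliation: K. N. Toosi University of Technology

AI summary: Proposes ClaMPAPP, a hybrid system in which an LLM extracts structured features from free text and an XGBoost classifier makes the diagnosis; it outperforms end-to-end LLMs on two independent cohorts and improves diagnostic stability and auditability.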

Abstract

Large language models (LLMs) can make clinical decision support more accessible by interpreting free-text documentation, but their direct use as diagnostic engines is limited by sensitivity to prompts, information order, and plausible but incorrect outputs. Structured machine-learning models offer more stable risk prediction, yet they require tabular inputs that are difficult to integrate with narrative clinical workflows. We present ClaMPAPP (Clinical Language-assisted Machine-learning Pipeline for Appendicitis), a hybrid system that uses an LLM as an interface rather than as the final decision-maker. ClaMPAPP extracts schema-constrained clinical features from note-like narratives, applies deterministic plausibility checks, and passes validated features to an XGBoost classifier trained on clinical, laboratory, and ultrasound variables. We evaluated ClaMPAPP on two independent pediatric appendicitis cohorts from German hospitals and compared it with end-to-end LLM baselines, including open-source and proprietary models. To preserve ground truth while testing free-text input, narratives were generated from structured electronic health records through template rendering and constrained LLM rewriting, with additional sentence-order permutation to assess positional robustness. ClaMPAPP achieved the strongest overall diagnostic performance in both internal and external validation while minimizing missed appendicitis cases, the key safety concern in acute triage. End-to-end LLMs showed unstable sensitivity-specificity trade-offs and greater degradation under narrative reordering. These results support an LLM-as-interface, ML-as-predictor design that separates natural-language usability from predictive inference and provides a more auditable pathway for clinical decision support.

2606.19176 2026-06-18 cs.RO cs.AI cs.SY eess.SY New submission

Hardware- and Vision-in-the-Loop Validation of Deep Monocular Pose Estimation for Autonomous Maritime UAV Flight

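Maneesha Wickramasuriya, Beomyeol Yu, Jaden Shin, Mason Huslig, Taeyoung Lee, Murray Snyder

Affiliation: George Washington University

AI summary: Proposes a hardware-validated vision-in-the-loop framework that combines a deep transformer-based monocular pose estimator with a delayed Kalman filter, achieving autonomous indoor flight while emulating photorealistic maritime environments and capturing embedded effects such as perception latency.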

Comments 6 pages, 9 figures

Abstract

Autonomous UAV operations on ships require reliable vision-based relative pose estimation, yet at-sea validation is costly, weather-dependent, and risky. This paper presents a hardware-validated vision-in-the-loop framework that enables fully autonomous indoor flight while emulating photorealistic maritime environments. Rendered maritime views are processed onboard by a deep transformer-based monocular pose estimator. Delayed vision measurements are fused with high-rate IMU data using a delayed Kalman filter to provide consistent state estimates for geometric control. The system captures critical embedded effects, including perception latency, asynchronous updates, and computational constraints, that are absent in pure simulation. Autonomous takeoff, trajectory tracking, and landing experiments demonstrate stable closed-loop flight. The results establish a safe and hardware-realistic intermediate stage for developing maritime UAV autonomy prior to shipboard deployment.

2606.19172 2026-06-18 cs.AI New submission

User as Engram: Internalizing Per-User Memory as Local Parametric Edits

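Bojie Li

Affiliation: Pine AI

AI summary: Proposes User as Engram, which stores user facts as local edits to the hash-keyed memory table of an Engram model while reasoning skill is carried in one shared adapter, achieving high indirect-reasoning accuracy with a very small memory footprint.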

Abstract

Personal memory in a language model is two problems: content and reasoning skill. The brain keeps the two apart (a sparse, local engram in the hippocampus for each episode, a slow neocortex for the shared skills that interpret it), so a new fact need not overwrite everything else. Most personalization today keeps a user's facts outside the weights, in a natural-language memory file or a retrieval index. When facts are written into the model instead, the standard recipe is the per-user LoRA adapter, which does the opposite of the brain, folding content and skill into one global weight delta. Writing a user's facts as a LoRA contaminates text unrelated to them; writing the same facts as local Engram rows leaves it mathematically untouched, resulting in a roughly 33,000x smaller memory footprint. We therefore propose User as Engram: store a user's content as surgical edits to the hash-keyed memory table of an Engram model, and carry the reasoning skill in one shared adapter. This layered design matches per-user LoRA's direct recall while delivering 5.6x higher indirect-reasoning accuracy on average, and never makes a single user worse at reasoning than the untouched base. The edit is a glass box: writing a fact switches on its lookup at exactly the trigger, adds the value the answer needs, leaves every other position unchanged to the last bit, and fails if written into the wrong layer. Because different users' facts land in disjoint hash slots, their edits compose: many users live in one shared table at once, stacking additively and losslessly, where a per-user LoRA, a single global weight delta, admits only one. Upon retrieval, a per-user Engram table does not grow with the population the retriever must search, so past ~100 facts it overtakes a retrieval pipeline on a 2.5x larger model.
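The composition claim in the abstract (edits that land in disjoint hash slots stack additively and losslessly) can be illustrated with a toy sparse table. This is a minimal sketch, not the Engram architecture: the slot function, the table size, the two-dimensional value vectors, and the user names are all hypothetical.

```python
import hashlib

TABLE_SIZE = 2 ** 32  # hypothetical number of slots in the shared table


def slot(user_id, trigger):
    """Deterministic slot index for a (user, trigger) pair."""
    digest = hashlib.sha256(f"{user_id}|{trigger}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE


def make_edit(user_id, facts):
    """Sparse edit: maps a slot index to the value vector added at that slot."""
    return {slot(user_id, trigger): list(value) for trigger, value in facts.items()}


def compose(*edits):
    """Add sparse edits slot by slot into one shared table."""
    table = {}
    for edit in edits:
        for index, value in edit.items():
            if index in table:
                table[index] = [a + b for a, b in zip(table[index], value)]
            else:
                table[index] = list(value)
    return table


# Two hypothetical users store a fact under the same trigger phrase.
alice = make_edit("alice", {"favorite city": [1.0, 0.0]})
bob = make_edit("bob", {"favorite city": [0.0, 1.0]})
shared = compose(alice, bob)
```

When the slot sets are disjoint, every row of each user's edit can be read back from `shared` unchanged, whereas edits that collide on a slot are summed and can no longer be separated.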

2606.19170 2026-06-18 cs.CL New submission

Dango: A Strictly L1-Only Large Language Model for Studying Second Language Acquisition

Shiho Matta, Yin Jou Huang, Fei Cheng, Takashi Kodama, Hirokazu Kiyomaru, Yugo Murawaki

Affiliations: Kyoto University; NII-LLMC

AI summary: Proposes Dango, a 1.8B-parameter model that filters L2 contamination and is fine-tuned on L2-learning lessons to simulate human L2 production patterns, outperforming unfiltered and multilingual baselines.

Comments 8 pages main text, 20 pages total including references and appendices

Abstract

We introduce Dango, a 1.8B-parameter large language model designed for controlled studies of L1-to-L2 (Japanese-to-English) transfer in second language acquisition (SLA). While previous studies have explored SLA in language models, they have predominantly relied on smaller or non-decoder models, limiting their ability to generate open-ended text and reducing their suitability as practical L2 simulators. We identify a key challenge when scaling models to this size: L2 contamination within the "monolingual" pretraining corpus used for L1 acquisition. To address this, we propose a filtering method to reduce premature exposure to English while preserving realistic, minimal exposure. We then fine-tune the model on LLM-generated L2-learning lessons to simulate the L2 acquisition process. Our evaluations confirm that Dango develops human-like L2 production patterns, outperforming both unfiltered and standard multilingual baselines. We release the model, data, and code to facilitate reproducible computational SLA research and learner-facing applications.

2606.19168 2026-06-18 cs.AI cs.LG New submission

Beyond Safe Data: Pretraining-Stage Alignment with Regular Safety Reflection

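Jinhan Li, Kexian Tang, Yihan Xu, Zhuorui Ye, Kaifeng Lyu

Affiliation: Institute for Interdisciplinary Information Sciences, Tsinghua University

AI summary: Proposes Safety Reflection Pretraining, which inserts safety reflections into pretraining corpora to give the model a self-monitoring capability; experiments show that it effectively reduces the success rates of inference-stage and finetuning attacks.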

Abstract

To achieve deeper safety alignment for large language models (LLMs), recent efforts have studied how to push safety interventions earlier into the pretraining stage, primarily by filtering unsafe data or rewriting it into safer forms. We argue that pretraining-stage alignment should go beyond making the data safe: LLMs may compose seemingly benign knowledge and capabilities into unsafe behaviors. To this end, we propose Safety Reflection Pretraining, a pretraining-stage alignment method which regularly inserts short safety reflections into pretraining corpora to integrate self-monitoring directly into language modeling, establishing a foundational capability that is subsequently reinforced by compatible post-training. Our experiments with 1.7B models pretrained on FineWeb-Edu show that Safety Reflection Pretraining improves safety classification accuracy and substantially reduces the success rates of inference-stage and finetuning attacks. Complementary to our real-world experiments, we also introduce a fully controlled synthetic environment, MedSafetyWorld, with a clear definition of safety and a reasoning structure under which models can easily generalize unsafe behaviors from safe data. Ablations in MedSafetyWorld further demonstrate a clear advantage of Safety Reflection Pretraining in preventing models from acting on unsafe behaviors generalized from safe data, compared with data filtering and rewriting. Taken together, our findings suggest that pretraining alignment should not only make the training data safe, but also shape the behaviors that models are likely to acquire from safe data.

2606.19164 2026-06-18 cs.LG cs.AI New submission

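OrthoReg: Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems

Till Richter, Niki Kilbertus

Affiliations: Technical University of Munich; Helmholtz Munich

AI summary: To address the problem in hybrid modeling that the neural part may relearn symbolic structure and produce redundant models, proposes OrthoReg, an orthogonal regularization method that directly penalizes overlap between the symbolic and neural components, yielding a complementary decomposition and improving symbolic recovery and out-of-distribution behavior.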

Abstract

Dynamical systems are fundamental to modeling the natural world, yet modeling them involves a persistent trade-off: manually prescribed mechanistic models are interpretable by design but often overly simplistic and misspecified; in contrast, flexible data-driven neural methods lack physical insight. Hybrid modeling aims for the best of both worlds by combining a prescribed or symbolic, physics-based component with a flexible neural network. A critical challenge, however, is that the neural component may relearn mechanistic parts, yielding redundant and uninterpretable models, especially when the symbolic structure itself is discovered from data. Existing methods based on standard $L^2$ regularization rely on a projection argument that breaks when the symbolic component is learned through sparse discovery, allowing the neural augmentation to overlap with symbolic structure. We introduce OrthoReg (Orthogonal Regularization), which directly penalizes overlap between the symbolic and neural components, preventing symbolic structure from being absorbed by the neural residual. This yields a complementary decomposition: the symbolic part captures what the library can express, and the neural part captures what remains. On benchmark dynamical systems with partial library mismatch, OrthoReg improves symbolic recovery and out-of-distribution behavior.
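The abstract states that OrthoReg penalizes overlap between the symbolic and neural components but does not give the penalty's form. One plausible way to express such a penalty, shown purely as an assumption, is the squared cosine similarity between the two components' outputs evaluated on the same sampled states. The function name, the `eps` stabilizer, and the toy vectors are hypothetical, and the paper's actual regularizer may differ.

```python
def overlap_penalty(symbolic_out, neural_out, eps=1e-12):
    """Squared cosine similarity between two output vectors sampled on the same states.

    Near 0 when the outputs are orthogonal, near 1 when one is a multiple of the other.
    """
    dot = sum(s * g for s, g in zip(symbolic_out, neural_out))
    norm_s = sum(s * s for s in symbolic_out)
    norm_g = sum(g * g for g in neural_out)
    return dot * dot / (norm_s * norm_g + eps)


# Hypothetical outputs of the symbolic part on three sampled states.
symbolic = [1.0, 2.0, 3.0]
redundant_neural = [2.0, 4.0, 6.0]       # duplicates the symbolic part up to scale
complementary_neural = [3.0, 0.0, -1.0]  # orthogonal to the symbolic part

high = overlap_penalty(symbolic, redundant_neural)
low = overlap_penalty(symbolic, complementary_neural)
```

Added to a data-fit loss with a weight, a term of this kind would discourage the neural residual from reproducing what the symbolic part already expresses, which is the behavior the abstract describes.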
