arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday

Vision and Robotics

Multimodal Information Fusion

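Information fusion for image, video, multi-sensor, and cross-modal perception, including image fusion, infrared-visible fusion, remote sensing, medical imaging, LiDAR/radar/camera fusion, and audio-visual fusion.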

Entries for today/current date: 3. Signal sources: cs.CV, eess.IV, eess.SP, cs.RO, cs.MM
2507.16859 2026-06-18 cs.RO cs.AI Version update 70%

Enhancing Fatigue Detection through Heterogeneous Multi-Source Data Integration and Cross-Domain Modality Imputation

Luobin Cui, Yanlai Wu, Tang Ying, Weikai Li

Topic match (multi-sensor fusion): heterogeneous multi-source data integration for fatigue detection

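AI summary: To address the unavailability of high-quality sensors in real deployment environments, the paper proposes a heterogeneous multi-source fatigue detection framework that uses shared modalities for cross-domain modality imputation and draws on source-domain knowledge to improve fatigue detection in the target domain.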

Comments 4 figures, 14 pages

Details
English abstract

Fatigue detection for human operators is important in safety-related applications such as aviation, mining, and long-haul transport. Reliable estimation of operator fatigue can support timely warnings, adaptive task scheduling, takeover reminders, and other safety-management decisions in human-machine systems. However, the effectiveness of these functions depends on whether fatigue-related signals can be reliably captured in the deployment environment. While many studies have shown the value of high-fidelity sensors in controlled laboratory environments, their performance often degrades when used in real-world settings because of noise, lighting conditions, and field-of-view constraints, thereby limiting their practical use. This paper formalizes a deployment-oriented setting for real-world fatigue detection, where high-quality sensors are often unavailable in practical applications. To address this issue, we use knowledge from heterogeneous source domains, including high-fidelity sensors that are difficult to deploy in the field but commonly used in controlled environments, to assist fatigue detection in the real-world target domain. Based on this idea, we design a heterogeneous and multi-source fatigue-detection framework that uses the available modalities in the target domain while leveraging diverse configurations in the source domains through cross-domain modality imputation based on shared modalities.
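
The abstract does not say how the imputation is performed. As a minimal sketch of the general idea only, assuming a ridge-regularised linear map (a stand-in, not the authors' model) and synthetic data, a modality that is missing in the target domain can be predicted from a modality that both domains share. The names `fit_imputer` and `impute` and all array shapes are hypothetical.

```python
import numpy as np

def fit_imputer(shared_src, missing_src, ridge=1e-3):
    """Fit a ridge-regularised linear map: shared modality -> missing modality."""
    # Append a constant column so the map can learn an offset.
    X = np.hstack([shared_src, np.ones((shared_src.shape[0], 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ missing_src)

def impute(shared_tgt, weights):
    """Predict the missing modality for target-domain samples."""
    X = np.hstack([shared_tgt, np.ones((shared_tgt.shape[0], 1))])
    return X @ weights

# Synthetic stand-in data (assumption): 4 shared features, 2 source-only features.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(4, 2))
shared_src = rng.normal(size=(200, 4))
missing_src = shared_src @ true_map + 0.01 * rng.normal(size=(200, 2))
shared_tgt = rng.normal(size=(50, 4))   # the target domain has only the shared modality

weights = fit_imputer(shared_src, missing_src)
imputed_tgt = impute(shared_tgt, weights)

# Target samples now carry both the observed and the imputed modality,
# so a detector trained on the richer source configuration can be applied.
target_features = np.hstack([shared_tgt, imputed_tgt])
print(target_features.shape)
```

A real system would likely replace the linear map with a learned nonlinear model and would need to handle distribution shift between domains, which this sketch ignores.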

2606.01605 2026-06-18 cs.RO Version update 65%

Embedding Semantic Risk into Distance Fields and CBFs for Online Monocular Safe Control

Dawei Zhang, Nuo Chen, Shuo Liu, Roberto Tron, Zhiwen Fan

Affiliations: Division of Systems Engineering, Boston University; Department of Mechanical Engineering, Boston University; Department of Electrical and Computer Engineering, Texas A&M University

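Topic match (multi-sensor fusion): monocular perception with semantic risk embedded in a distance field, involving fusion of vision and semantics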

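AI summary: Proposes an online monocular perception-to-control framework that embeds semantic risk directly into the Euclidean Signed Distance Field (ESDF), encoding risk before control optimization and enabling semantic-aware safe navigation and teleoperation based on Control Barrier Functions (CBFs).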

Details
English abstract

We propose an online monocular perception-to-control framework that embeds semantic risk into the distance field used by Control Barrier Function (CBF)-based safe navigation and teleoperation. Many perception-based safety filters assign the same distance-based safety margin to all mapped obstacles or use semantics only as a downstream controller adjustment, rather than encoding semantic risk in the spatial representation. Our framework instead reasons online about obstacle geometry and class-dependent risk by embedding semantic information directly into the Euclidean Signed Distance Field (ESDF). This design encodes semantic risk before control optimization, so high-risk objects exert a larger spatial influence in the safety field while retaining efficient ESDF queries at runtime. Specifically, a foundation-model-based SLAM front end reconstructs dense 3-D geometry from monocular RGB video, while per-frame semantic segmentation provides pixel-level class labels that are fused into the reconstructed geometry. The resulting geometric-semantic representation is then converted into an ESDF, where semantic labels identify safety-relevant regions and impose class-dependent inflation before field computation. The semantic-aware ESDF provides the local distance values and spatial derivatives required by the CBF controller, while class-dependent gains further regulate the controller response. Extensive simulation and hardware experiments demonstrate online operation at 10-20 Hz and semantic-aware safe behavior in both teleoperation and autonomous navigation.
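
The interplay of class-dependent inflation and class-dependent gains can be illustrated with a toy example. The sketch below is not the authors' pipeline. It assumes point obstacles in 2-D instead of a voxel ESDF, a single-integrator robot (velocity is the control input), and made-up inflation radii and gains. The field is h(x) = min_i (||x - o_i|| - r_class(i)), and the filter enforces the standard CBF condition grad h · u + alpha · h >= 0 through a closed-form projection. The names `INFLATION`, `GAIN`, `semantic_distance`, and `cbf_filter` are hypothetical.

```python
import numpy as np

# Hypothetical per-class inflation radii (metres) and CBF gains.
# Assumption: riskier classes get a larger radius and a smaller (more cautious) gain.
INFLATION = {"chair": 0.2, "person": 0.8}
GAIN = {"chair": 2.0, "person": 0.5}

def semantic_distance(x, obstacles):
    """Return (h, grad_h, label) for the obstacle with the smallest inflated distance."""
    best = None
    for centre, label in obstacles:
        offset = x - centre
        dist = np.linalg.norm(offset)
        h = dist - INFLATION[label]          # distance to the inflated obstacle
        if best is None or h < best[0]:
            best = (h, offset / max(dist, 1e-9), label)
    return best

def cbf_filter(x, u_nom, obstacles):
    """Minimally modify u_nom so that grad_h . u + alpha * h >= 0 holds."""
    h, grad, label = semantic_distance(x, obstacles)
    alpha = GAIN[label]
    slack = grad @ u_nom + alpha * h
    if slack >= 0.0:
        return u_nom                          # nominal command is already safe
    # Project onto the constraint boundary (closed form for one linear constraint).
    return u_nom - slack * grad / (grad @ grad)

x = np.array([1.0, 0.0])
u_nom = np.array([1.0, 0.0])                  # drive straight at the obstacle
centre = np.array([2.0, 0.0])

u_person = cbf_filter(x, u_nom, [(centre, "person")])
u_chair = cbf_filter(x, u_nom, [(centre, "chair")])
print(u_person, u_chair)
```

With identical geometry, the command toward the "person" is slowed while the command toward the "chair" passes through unchanged, which is the qualitative behavior the abstract describes.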

2512.14428 2026-06-18 cs.RO Version update 60%

Odyssey: An Automotive Lidar-Inertial Odometry Dataset with GNSS-denied situations

Aaron Kurda, Simon Steuernagel, Lukas Jung, Marcus Baum

Affiliations: University of Göttingen; iMAR Navigation

Topic match (multi-sensor fusion): lidar-inertial odometry dataset involving multiple sensors

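AI summary: Presents the Odyssey dataset, which uses a navigation-grade ring laser gyroscope RTK/INS to provide high-accuracy ground truth and includes 36 sequences and prolonged GNSS-denied environments (tunnels, indoor parking garages) for evaluating LIO/SLAM systems.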

Comments 10 pages, 4 figures, 3 tables, submitted to International Journal of Robotics Research (IJRR)

Details
English abstract

The development and evaluation of Lidar-Inertial Odometry (LIO) and Simultaneous Localization and Mapping (SLAM) systems requires a precise ground truth. The Global Navigation Satellite System (GNSS) is often used as a foundation for this, but its signals can be unreliable in obstructed environments due to multi-path effects or loss-of-signal. While existing datasets compensate for sporadic GNSS loss by incorporating Inertial Measurement Unit (IMU) measurements, the commonly used systems do not permit prolonged study of GNSS-denied environments due to accumulated drift. Therefore, the diversity of such datasets is limited. To close this gap, we present Odyssey, an automotive LIO dataset featuring: (1) a ground truth derived from a navigation-grade Ring Laser Gyroscope (RLG)-based RTK/INS, offering bias stability one to four orders of magnitude better than existing automotive datasets; (2) a comprehensive collection of 36 sequences across diverse environments, enabling robust and comprehensive evaluation; and (3) prolonged GNSS-denied environments, including tunnels and, previously unseen in the context of automotive benchmarks, indoor parking garages. Here, our RLG-based system enables accurate evaluation in scenarios where commonly employed systems would drift excessively. Besides providing data for LIO, Odyssey also supports place recognition tasks through threefold trajectory repetition and integration of external mapping data via precise geodetic coordinates. All data, dataloader and supplementary material are available online at https://odyssey.uni-goettingen.de/.
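
To give a feel for why gyroscope bias stability matters during a long GNSS outage, the sketch below uses a deliberately simplified first-order model: a constant, uncompensated heading-gyro bias b, a vehicle driving straight at constant speed v, and no other error sources. Heading error then grows as b·t, and lateral position error as the integral of v·sin(b·t). The bias values, outage duration, and speed are illustrative assumptions, not figures from the dataset, and the function names are hypothetical.

```python
import math

def heading_drift_deg(bias_deg_per_h, outage_s):
    """Heading error (degrees) after integrating a constant gyro bias."""
    return bias_deg_per_h * outage_s / 3600.0

def cross_track_error_m(bias_deg_per_h, outage_s, speed_mps):
    """Lateral error (metres): integral of v * sin(b * t) over the outage."""
    b = math.radians(bias_deg_per_h) / 3600.0       # bias in rad/s
    if b == 0.0:
        return 0.0
    # v * (1 - cos(b*T)) / b, written with sin^2 for numerical stability.
    return speed_mps * 2.0 * math.sin(0.5 * b * outage_s) ** 2 / b

# Illustrative values only (assumptions): a 10-minute tunnel at 10 m/s,
# comparing two biases that are three orders of magnitude apart.
outage_s, speed_mps = 600.0, 10.0
err_low = cross_track_error_m(0.01, outage_s, speed_mps)
err_high = cross_track_error_m(10.0, outage_s, speed_mps)
print(f"low-bias gyro:  {err_low:.4f} m")
print(f"high-bias gyro: {err_high:.4f} m")
```

In this small-angle regime the lateral error scales almost linearly with the bias, so a gyroscope that is several orders of magnitude more stable yields a correspondingly tighter reference trajectory. Real inertial error budgets also include random walk, accelerometer errors, and filter corrections, all of which this sketch ignores.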