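arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.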

2602.21473 2026-03-05 cs.CV

Automatic Map Density Selection for Locally-Performant Visual Place Recognition

Somayeh Hussaini, Tobias Fischer, Michael Milford

Comments Under Review

Abstract

A key challenge in translating Visual Place Recognition (VPR) from the lab to long-term deployment is ensuring a priori that a system can meet user-specified performance requirements across different parts of an environment, rather than just on average globally. A critical mechanism for controlling local VPR performance is the density of the reference mapping database, yet this factor is largely neglected in existing work, where benchmark datasets with fixed, engineering-driven (sensors, storage, GPS frequency) sampling densities are typically used. In this paper, we propose a dynamic VPR mapping approach that uses pairs of reference traverses from the target environment to automatically select an appropriate map density to satisfy two user-defined requirements: (1) a target Local Recall@1 level, and (2) the proportion of the operational environment over which this requirement must be met or exceeded, which we term the Recall Achievement Rate (RAR). Our approach is based on the hypothesis that match patterns between multiple reference traverses, evaluated across different map densities, can be modelled to predict the density required to meet these performance targets on unseen deployment data. Through extensive experiments across multiple VPR methods and the Nordland and Oxford RobotCar benchmarks, we show that our system consistently achieves or exceeds the specified local recall level over at least the user-specified proportion of the environment. Comparisons with alternative baselines demonstrate that our approach reliably selects the correct operating point in map density, avoiding unnecessary over-densification. Finally, ablation studies and analysis evaluate sensitivity to reference map choice and local space definitions, and reveal that conventional global Recall@1 is a poor predictor of the often more operationally meaningful RAR metric.
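The two user-defined requirements in the abstract above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' method: it assumes that "local" means fixed-length consecutive segments of the query traverse, and the function names `local_recall_at_1` and `recall_achievement_rate` are hypothetical. It only shows how a Recall Achievement Rate could be computed once per-query top-1 outcomes are known.

```python
from typing import List, Sequence


def local_recall_at_1(correct: Sequence[bool], segment_len: int) -> List[float]:
    """Recall@1 per consecutive segment of the traverse.

    `correct[i]` is True when the top-1 match for query i is a true positive.
    Assumption: a "local" region is a fixed-length run of consecutive queries.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    recalls = []
    for start in range(0, len(correct), segment_len):
        segment = correct[start:start + segment_len]
        recalls.append(sum(segment) / len(segment))
    return recalls


def recall_achievement_rate(local_recalls: Sequence[float], target: float) -> float:
    """Fraction of local segments whose Recall@1 meets or exceeds `target`."""
    if not local_recalls:
        raise ValueError("local_recalls must not be empty")
    met = sum(1 for r in local_recalls if r >= target)
    return met / len(local_recalls)


# Toy outcomes for 12 queries, grouped into 3 segments of 4 queries each.
outcomes = [True, True, True, False,
            True, False, False, False,
            True, True, True, True]
recalls = local_recall_at_1(outcomes, segment_len=4)
rar = recall_achievement_rate(recalls, target=0.75)
global_recall = sum(outcomes) / len(outcomes)
print(recalls, rar, global_recall)
```

In this toy case the global Recall@1 is 8/12, while only two of the three segments reach the 0.75 target, which is the kind of gap between global and local performance that the abstract describes.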

2602.19810 2026-03-05 cs.AI

From Agent-Only Social Networks to Autonomous Scientific Research: Lessons from OpenClaw and Moltbook, and the Architecture of ClawdLab and Beach.Science

Lukas Weidener, Marko Brkić, Phillip Lee, Martin Karlsson, Kevin Noessler, Paul Kohlhaas

Abstract

In January 2026, the open-source agent framework OpenClaw and the agent-only social network Moltbook produced a large-scale dataset of autonomous AI-to-AI interaction, attracting six academic publications within fourteen days. This study conducts a multivocal literature review of that ecosystem and presents two complementary platforms for autonomous scientific research as a design science response to the architectural failure modes identified. ClawdLab, an open-source platform for structured laboratory collaboration, addresses these failure modes through hard role restrictions, structured adversarial critique, PI-led governance, multi-model orchestration, and evidence requirements enforced through external tool verification, in which the principal investigator validates submitted work using available API calls, computational services, and model context protocol integrations rather than relying on social consensus. Beach.science, a public research commons, complements ClawdLab's structured laboratory model by providing a free-form environment in which heterogeneous agent configurations interact, discover research opportunities, and autonomously contribute computational analyses, supported by template-based role specialisation, extensible skill registries, and programmatic reward mechanisms that distribute inference resources to agents demonstrating scientific progress. A three-tier taxonomy distinguishes single-agent pipelines, predetermined multi-agent workflows, and fully decentralised systems, analysing why leading AI co-scientist platforms remain confined to the first two tiers. The composable third-tier architecture instantiated across ClawdLab and beach.science, in which foundation models, capabilities, governance, verification tooling, and inter-lab coordination are independently modifiable, enables compounding improvement as the broader AI ecosystem advances.

2602.18707 2026-03-05 cs.RO

CLASH: Collision Learning via Augmented Sim-to-real Hybridization to Bridge the Reality Gap

Haotian He, Ning Guo, Siqi Shi, Qipeng Liu, Wenzhao Lian

2602.18308 2026-03-05 cs.LG cs.AI

JPmHC Dynamical Isometry via Orthogonal Hyper-Connections

Biswa Sengupta, Jinhua Wang, Leo Brunswic

2602.17807 2026-03-05 cs.CV

VidEoMT: Your ViT is Secretly Also a Video Segmentation Model

Narges Norouzi, Idil Esen Zulfikar, Niccolò Cavagnero, Tommie Kerssies, Bastian Leibe, Gijs Dubbelman, Daan de Geus

Comments CVPR 2026. Code: https://www.tue-mps.org/videomt/

2602.16852 2026-03-05 cs.CL

Meenz bleibt Meenz, but Large Language Models Do Not Speak Its Dialect

Minh Duc Bui, Manuel Mager, Peter Herbert Kann, Katharina von der Wense

Comments Accepted at LREC 2026

2602.11291 2026-03-05 cs.RO

H-WM: Robotic Task and Motion Planning Guided by Hierarchical World Model

Jinbang Huang, Wenyuan Chen, Zhiyuan Li, Oscar Pang, Xiao Hu, Lingfeng Zhang, Yuanzhao Hu, Zhanguang Zhang, Mark Coates, Tongtong Cao, Xingyue Quan, Yingxue Zhang

Comments 8 pages, 4 figures

2602.11086 2026-03-05 cs.CV cs.LG

First International StepUP Competition for Biometric Footstep Recognition: Methods, Results and Remaining Challenges

Robyn Larracy, Eve MacDonald, Angkoon Phinyomark, Saeid Rezaei, Mahdi Laghaei, Ali Hajighasem, Aaron Tabor, Erik Scheme

Comments to be published in 2025 IEEE International Joint Conference on Biometrics (IJCB)

2602.10625 2026-03-05 cs.AI cs.CL

To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks

Nanxu Gong, Haotian Li, Sixun Dong, Jianxun Lian, Yanjie Fu, Xing Xie

2602.10619 2026-03-05 cs.CV cs.AI

Improving Medical Visual Reinforcement Fine-Tuning via Perception and Reasoning Augmentation

Guangjing Yang, ZhangYuan Yu, Ziyuan Qin, Xinyuan Song, Huahui Yi, Qingbo Kang, Jun Gao, Yiyue Li, Chenlin Du, Qicheng Lao

Comments CPAL 2026

2602.09937 2026-03-05 cs.AI cs.DC

Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis?

Taeyoon Kim, Woohyeok Park, Hoyeong Yun, Kyungyong Lee

2602.08251 2026-03-05 cs.RO

Aerial Manipulation with Contact-Aware Onboard Perception and Hybrid Control

Yuanzhu Zhan, Yufei Jiang, Muqing Cao, Junyi Geng

Comments 8 pages, 7 figures. Accepted by ICRA 2026

2602.05596 2026-03-05 cs.RO

TOLEBI: Learning Fault-Tolerant Bipedal Locomotion via Online Status Estimation and Fallibility Rewards

Hokyun Lee, Woo-Jeong Baek, Junhyeok Cha, Jaeheung Park

Comments Accepted for Publication at IEEE International Conference on Robotics and Automation (ICRA) 2026

2602.04755 2026-03-05 cs.CL cs.AI

When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?

Xinyu Zhou, Chang Jin, Carsten Eickhoff, Zhijiang Guo, Seyed Ali Bahrainian

Comments Accepted to ICLR2026

2602.04027 2026-03-05 cs.LG cs.CR

A Consensus-Bayesian Framework for Detecting Malicious Activity in Enterprise Directory Access Graphs

Pratyush Uppuluri, Shilpa Noushad, Sajan Kumar

Comments 10 pages

2601.16529 2026-03-05 cs.AI cs.HC

SycoEval-EM: Sycophancy Evaluation of Large Language Models in Simulated Clinical Encounters for Emergency Care

Dongshen Peng, Yi Wang, Austin Schoeffler, Carl Preiksaitis, Christian Rose

Comments 11 pages, 5 figures

2601.15235 2026-03-05 cs.CV cs.AI cs.LG

Tracing 3D Anatomy in 2D Strokes: A Multi-Stage Projection Driven Approach to Cervical Spine Fracture Identification

Fabi Nahian Madhurja, Rusab Sarmun, Muhammad E. H. Chowdhury, Adam Mushtak, Israa Al-Hashimi, Sohaib Bassam Zoghoul

Comments 47 pages, 36 figures, 17 tables. Includes supplementary material. Under review at Medical Image Analysis

Abstract

Cervical spine fractures demand rapid and accurate diagnosis for effective clinical management. This study presents an automated, end-to-end pipeline for fracture detection across cervical vertebrae (C1–C7) that assesses the feasibility of fracture recognition from vertebra-level volumes of interest extracted using estimated 3D masks derived from fused orthogonal 2D segmentations. Unlike traditional 3D methods, our approach approximates 3D volumes via optimized 2D axial, sagittal, and coronal projections to reduce input dimensionality of intermediate pre-processing steps while maintaining high diagnostic performance for downstream fracture classification. First, spine regions of interest are localized from multi-view variance projections using a YOLOv8 detector, achieving a 3D mean Intersection over Union of 94.45%. Next, multi-label vertebra segmentation is performed using a DenseNet121-Unet architecture on energy-based sagittal and coronal projections, attaining a mean Dice score of 87.86%. The orthogonal 2D masks are then fused to reconstruct an estimated 3D mask for each vertebra, which is used to extract volumes of interest from the original CT. These extracted vertebra volumes are subsequently analyzed for fractures using an ensemble of 2.5D spatio-sequential CNN-Transformer models, yielding vertebra-level and patient-level F1 scores of 68.15 and 82.26, with area under the receiver operating characteristic curve scores of 91.62 and 83.04, respectively. The framework is further validated through an explainability study using saliency map visualizations and an interobserver variability analysis. Overall, the results indicate that this projection-based strategy delivers clinically relevant performance comparable to expert radiologists, while reducing the dimensionality of intermediate stages, supporting its potential for practical deployment.

2512.20760 2026-03-05 cs.LG cs.AI cs.CL

Generalization of RLVR Using Causal Reasoning as a Testbed

Brian Lu, Hongyu Zhao, Shuo Sun, Hao Peng, Rui Ding, Hongyuan Mei

2512.19739 2026-03-05 cs.LG cs.SD

OASI: Objective-Aware Surrogate Initialization for Multi-Objective Bayesian Optimization in TinyML Keyword Spotting

Soumen Garai, Danilo Pau, Suman Samui

Comments Updated version

2512.18957 2026-03-05 cs.LG

Online Robust Reinforcement Learning with General Function Approximation

Debamita Ghosh, George K. Atia, Yue Wang

2512.08440 2026-03-05 cs.CL

What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models

Janiça Hackenbuchner, Arda Tezcan, Joke Daems

Comments Accepted at LREC 2026

2512.07041 2026-03-05 cs.RO

CERNet: Class-Embedding Predictive-Coding RNN for Unified Robot Motion, Recognition, and Confidence Estimation

Hiroki Sawada, Alexandre Pitti, Mathias Quoy

Comments Accepted for presentation at IEEE International Conference on Robotics and Automation (ICRA) 2026

2512.01759 2026-03-05 cs.LG cs.AI

Weight Space Representation Learning via Neural Field Adaptation

Zhuoqian Yang, Mathieu Salzmann, Sabine Süsstrunk

Comments 8 pages body, 8 pages appendix

2512.00810 2026-03-05 cs.LG cs.NE

Soft Quality-Diversity Optimization

Saeed Hedayatian, Stefanos Nikolaidis

Comments Accepted at ICLR 2026

2511.23119 2026-03-05 cs.CL

Dripper: Token-Efficient Main HTML Extraction with a Lightweight LM

Mengjie Liu, Jiahui Peng, Wenchang Ning, Pei Chu, Jiantao Qiu, Ren Ma, He Zhu, Rui Min, Lindong Lu, Linfeng Hou, Kaiwen Liu, Yuan Qu, Zhenxiang Li, Chao Xu, Zhongying Tu, Wentao Zhang, Conghui He

2511.22235 2026-03-05 cs.AI

Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

Zehao Deng, Tianjie Ju, Zheng Wu, Zhuosheng Zhang, Gongshen Liu

Comments Accepted to CVPR 2026

2511.19524 2026-03-05 cs.CV cs.MA

VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning

Boyu Chen, Zikang Wang, Zhengrong Yue, Kainan Yan, Chenyun Yu, Yi Huang, Zijun Liu, Yafei Wen, Xiaoxin Chen, Yang Liu, Peng Li, Yali Wang

Comments Accepted by CVPR 2026

2511.16957 2026-03-05 cs.CV

MatPedia: A Universal Generative Foundation for High-Fidelity Material Synthesis

Di Luo, Shuhui Yang, Mingxin Yang, Jiawei Lu, Yixuan Tang, Xintong Han, Zhuo Chen, Beibei Wang, Chunchao Guo

2511.16849 2026-03-05 cs.LG cs.SD

Better audio representations are more brain-like: linking model-brain alignment with performance in downstream auditory tasks

Leonardo Pepino, Pablo Riera, Juan Kamienkowski, Luciana Ferrer

Comments In review for journal

2511.15565 2026-03-05 cs.CV

Scriboora: Rethinking Human Pose Forecasting

Daniel Bermuth, Alexander Poeppel, Wolfgang Reif