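arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.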

2602.23253 2026-02-27 cs.RO

SPARR: Simulation-based Policies with Asymmetric Real-world Residuals for Assembly

Yijie Guo, Iretiayo Akinola, Lars Johannsmeier, Hugo Hadfield, Abhishek Gupta, Yashraj Narang

Abstract

Robotic assembly presents a long-standing challenge due to its requirement for precise, contact-rich manipulation. While simulation-based learning has enabled the development of robust assembly policies, their performance often degrades when deployed in real-world settings due to the sim-to-real gap. Conversely, real-world reinforcement learning (RL) methods avoid the sim-to-real gap, but rely heavily on human supervision and lack generalization ability to environmental changes. In this work, we propose a hybrid approach that combines a simulation-trained base policy with a real-world residual policy to efficiently adapt to real-world variations. The base policy, trained in simulation using low-level state observations and dense rewards, provides strong priors for initial behavior. The residual policy, learned in the real world using visual observations and sparse rewards, compensates for discrepancies in dynamics and sensor noise. Extensive real-world experiments demonstrate that our method, SPARR, achieves near-perfect success rates across diverse two-part assembly tasks. Compared to the state-of-the-art zero-shot sim-to-real methods, SPARR improves success rates by 38.4% while reducing cycle time by 29.7%. Moreover, SPARR requires no human expertise, in contrast to the state-of-the-art real-world RL approaches that depend heavily on human supervision.
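
The abstract does not say how the base and residual policies are combined. A common form in residual learning adds a scaled, bounded correction to the base action, and the sketch below assumes that form. The names `compose_residual_policy`, `residual_scale`, and `action_limit` are hypothetical and are not taken from the paper; the toy policies stand in for the simulation-trained and real-world-trained networks.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def compose_residual_policy(
    base_policy: Callable[[Vector], Vector],
    residual_policy: Callable[[object, Vector], Vector],
    residual_scale: float = 0.1,   # assumed: keeps corrections small relative to the base action
    action_limit: float = 1.0,     # assumed: symmetric per-dimension action bound
) -> Callable[[Vector, object], List[float]]:
    """Return a policy that adds a scaled residual to the base action.

    The inputs are asymmetric: the base policy sees low-level state,
    while the residual policy sees an image plus the base action.
    """

    def policy(state: Vector, image: object) -> List[float]:
        base_action = base_policy(state)
        correction = residual_policy(image, base_action)
        combined = [b + residual_scale * r for b, r in zip(base_action, correction)]
        # Clip so the corrected action stays inside the assumed action bounds.
        return [max(-action_limit, min(action_limit, a)) for a in combined]

    return policy


# Toy stand-ins for the two learned policies (illustration only).
def toy_base(state: Vector) -> List[float]:
    return [0.5 * s for s in state]


def toy_residual(image: object, base_action: Vector) -> List[float]:
    return [-1.0 for _ in base_action]


policy = compose_residual_policy(toy_base, toy_residual)
action = policy([1.0, -0.4], image=None)
print(action)
```

Scaling and clipping the correction are design choices made here so that an untrained residual cannot move the robot far from the base behavior; the paper may handle this differently.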

2602.23235 2026-02-27 cs.CV cs.AI

Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents

Zhou Xu, Bowen Zhou, Qi Wang, Shuwen Feng, Jingyu Xiao

2602.23229 2026-02-27 cs.CV

Large Multimodal Models as General In-Context Classifiers

Marco Garosi, Matteo Farina, Alessandro Conti, Massimiliano Mancini, Elisa Ricci

Comments CVPR Findings 2026. Project website at https://circle-lmm.github.io/

2602.23224 2026-02-27 cs.CV cs.RO

UniScale: Unified Scale-Aware 3D Reconstruction for Multi-View Understanding via Prior Injection for Robotic Perception

Mohammad Mahdavian, Gordon Tan, Binbin Xu, Yuan Ren, Dongfeng Bai, Bingbing Liu

2602.23219 2026-02-27 cs.LG

Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime

Hiroki Naganuma, Taiji Suzuki, Rio Yokota, Masahiro Nomura, Kohta Ishikawa, Ikuro Sato

2602.23212 2026-02-27 cs.CV

Through BrokenEyes: How Eye Disorders Impact Face Detection?

Prottay Kumar Adhikary

2602.23206 2026-02-27 cs.RO

Grasp, Slide, Roll: Comparative Analysis of Contact Modes for Tactile-Based Shape Reconstruction

Chung Hee Kim, Shivani Kamtikar, Tye Brady, Taskin Padir, Joshua Migdal

Comments 8 pages, 11 figures, Accepted by ICRA 2026

2602.23203 2026-02-27 cs.CV cs.AI

ColoDiff: Integrating Dynamic Consistency With Content Awareness for Colonoscopy Video Generation

Junhu Fu, Shuyu Liang, Wutong Li, Chen Ma, Peng Huang, Kehao Wang, Ke Chen, Shengli Lin, Pinghong Zhou, Zeju Li, Yuanyuan Wang, Yi Guo

2602.23199 2026-02-27 cs.AI

SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation

Jiahao Zhao, Feng Jiang, Shaowei Qin, Zhonghui Zhang, Junhao Liu, Guibing Guo, Hamid Alinejad-Rokny, Min Yang

Abstract

Large language models (LLMs) are increasingly applied in scientific research, offering new capabilities for knowledge discovery and reasoning. In single-cell biology, however, evaluation practices for both general and specialized LLMs remain inadequate: existing benchmarks are fragmented across tasks, adopt formats such as multiple-choice classification that diverge from real-world usage, and rely on metrics lacking interpretability and biological grounding. We present SC-ARENA, a natural language evaluation framework tailored to single-cell foundation models. SC-ARENA formalizes a virtual cell abstraction that unifies evaluation targets by representing both intrinsic attributes and gene-level interactions. Within this paradigm, we define five natural language tasks (cell type annotation, captioning, generation, perturbation prediction, and scientific QA) that probe core reasoning capabilities in cellular biology. To overcome the limitations of brittle string-matching metrics, we introduce knowledge-augmented evaluation, which incorporates external ontologies, marker databases, and scientific literature to support biologically faithful and interpretable judgments. Experiments and analysis across both general-purpose and domain-specialized LLMs demonstrate that (i) under the Virtual Cell unified evaluation paradigm, current models achieve uneven performance on biologically complex tasks, particularly those demanding mechanistic or causal understanding; and (ii) our knowledge-augmented evaluation framework ensures biological correctness, provides interpretable, evidence-grounded rationales, and achieves high discriminative capacity, overcoming the brittleness and opacity of conventional metrics. SC-Arena thus provides a unified and interpretable framework for assessing LLMs in single-cell biology, pointing toward the development of biology-aligned, generalizable foundation models.

2602.23193 2026-02-27 cs.AI

ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering

Elzo Brito dos Santos Filho

Comments 13 pages, 1 figure, 4 tables. Includes 5 technical appendices

2602.23192 2026-02-27 cs.CV cs.LG

FairQuant: Fairness-Aware Mixed-Precision Quantization for Medical Image Classification

Thomas Woergaard, Raghavendra Selvan

Comments Source code available at https://github.com/saintslab/FairQuant

2602.23188 2026-02-27 cs.LG physics.flu-dyn

Efficient Real-Time Adaptation of ROMs for Unsteady Flows Using Data Assimilation

Ismaël Zighed, Andrea Nóvoa, Luca Magri, Taraneh Sayadi

2602.23184 2026-02-27 cs.CL

MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations

Sara Rosenthal, Yannis Katsis, Vraj Shah, Lihong He, Lucian Popa, Marina Danilevsky

Comments 5 pages, 3 figures

2602.23182 2026-02-27 cs.LG

Closing the gap on tabular data with Fourier and Implicit Categorical Features

Marius Dragoi, Florin Gogianu, Elena Burceanu

2602.23177 2026-02-27 cs.CV

Phys-3D: Physics-Constrained Real-Time Crowd Tracking and Counting on Railway Platforms

Bin Zeng, Johannes Künzel, Anna Hilsmann, Peter Eisert

Comments published at VISAPP 2026

2602.23169 2026-02-27 cs.CV

Learning Continuous Wasserstein Barycenter Space for Generalized All-in-One Image Restoration

Xiaole Tang, Xiaoyi He, Jiayi Xu, Xiang Gu, Jian Sun

2602.23164 2026-02-27 cs.LG

MetaOthello: A Controlled Study of Multiple World Models in Transformers

Aviral Chawla, Galen Hall, Juniper Lovato

2602.23159 2026-02-27 cs.LG

Benchmarking Temporal Web3 Intelligence: Lessons from the FinSurvival 2025 Challenge

Oshani Seneviratne, Fernando Spadea, Adrien Pavao, Aaron Micah Green, Kristin P. Bennett

2602.23152 2026-02-27 cs.AI

The Trinity of Consistency as a Defining Principle for General World Models

Jingxuan Wei, Siyuan Li, Yuhang Xu, Zheng Sun, Junjie Jiang, Hexuan Jin, Caijun Jia, Honghao He, Xinglong Xu, Xi bai, Chang Yu, Yumou Liu, Junnan Zhu, Xuanhe Zhou, Jintao Chen, Xiaobin Hu, Shancheng Pang, Bihui Yu, Ran He, Zhen Lei, Stan Z. Li, Conghui He, Shuicheng Yan, Cheng Tan

Comments 119 pages, 50 figures

2602.23146 2026-02-27 cs.LG cs.CV physics.ao-ph

Partial recovery of meter-scale surface weather

Jonathan Giezendanner, Qidong Yang, Eric Schmitt, Anirban Chandra, Daniel Salles Civitarese, Johannes Jakubik, Jeremy Vila, Detlef Hohl, Campbell Watson, Sherrie Wang

2602.23142 2026-02-27 cs.LG

Prediction of Diffusion Coefficients in Mixtures with Tensor Completion

Zeno Romero, Kerstin Münnemann, Hans Hasse, Fabian Jirasek

2602.23141 2026-02-27 cs.CV

No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors

Tao Liu, Gang Wan, Kan Ren, Shibo Wen

Comments CVPR2026

2602.23135 2026-02-27 cs.LG cs.AI cs.SI

DyGnROLE: Modeling Asymmetry in Dynamic Graphs with Node-Role-Oriented Latent Encoding

Tyler Bonnet, Marek Rei

2602.23133 2026-02-27 cs.CV

From Calibration to Refinement: Seeking Certainty via Probabilistic Evidence Propagation for Noisy-Label Person Re-Identification

Xin Yuan, Zhiyong Zhang, Xin Xu, Zheng Wang, Chia-Wen Lin

Comments Accepted by IEEE TMM 2026

Abstract

With the increasing demand for robust person Re-ID in unconstrained environments, learning from datasets with noisy labels and sparse per-identity samples remains a critical challenge. Existing noise-robust person Re-ID methods primarily rely on loss-correction or sample-selection strategies using softmax outputs. However, these methods suffer from two key limitations: 1) Softmax exhibits translation invariance, leading to over-confident and unreliable predictions on corrupted labels. 2) Conventional sample selection based on small-loss criteria often discards valuable hard positives that are crucial for learning discriminative features. To overcome these issues, we propose the CAlibration-to-REfinement (CARE) method, a two-stage framework that seeks certainty through probabilistic evidence propagation from calibration to refinement. In the calibration stage, we propose the probabilistic evidence calibration (PEC) that dismantles softmax translation invariance by injecting adaptive learnable parameters into the similarity function, and employs an evidential calibration loss to mitigate overconfidence on mislabeled samples. In the refinement stage, we design the evidence propagation refinement (EPR) that can more accurately distinguish between clean and noisy samples. Specifically, the EPR contains two steps: Firstly, the composite angular margin (CAM) metric is proposed to precisely distinguish clean but hard-to-learn positive samples from mislabeled ones in a hyperspherical space; Secondly, the certainty-oriented sphere weighting (COSW) is developed to dynamically allocate the importance of samples according to CAM, ensuring clean instances drive model updates. Extensive experimental results on Market1501, DukeMTMC-ReID, and CUHK03 datasets under both random and patterned noises show that CARE achieves competitive performance.
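
The translation invariance mentioned in the abstract is a standard property of softmax: adding the same constant to every logit leaves the output probabilities unchanged, so the absolute magnitude of the logits is discarded. The sketch below checks this numerically in plain Python. The example logits and the shift value are made up for illustration, and this is not the paper's PEC module.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]              # illustrative similarity scores
shifted = [z + 5.0 for z in logits]   # same scores, uniformly translated

p = softmax(logits)
q = softmax(shifted)

# The two distributions agree up to floating-point rounding, because
# exp(z + c) / sum(exp(z_j + c)) = exp(z) / sum(exp(z_j)) for any constant c.
max_gap = max(abs(a - b) for a, b in zip(p, q))
print(max_gap < 1e-12)
```

Because only differences between logits matter, a sample whose logits are all weak receives the same probabilities as one whose logits are all strong by the same margin.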

2602.23128 2026-02-27 cs.LG

Bound to Disagree: Generalization Bounds via Certifiable Surrogates

Mathieu Bazinet, Valentina Zantedeschi, Pascal Germain

2602.23123 2026-02-27 cs.AI

Multi-Agent Large Language Model Based Emotional Detoxification Through Personalized Intensity Control for Consumer Protection

Keito Inoshita

2602.23120 2026-02-27 cs.CV

TriLite: Efficient Weakly Supervised Object Localization with Universal Visual Features and Tri-Region Disentanglement

Arian Sabaghi, José Oramas

Comments This paper consists of 8 pages including 6 figures. Accepted at CVPR 2026

2602.23117 2026-02-27 cs.CV cs.AI

Delving into Adversarial Transferability on Image Classification: Review, Benchmark, and Evaluation

Xiaosen Wang, Zhijin Ge, Bohan Liu, Zheng Fang, Fengfan Zhou, Ruixuan Zhang, Shaokang Wang, Yuyang Luo

Comments Code is available at https://github.com/Trustworthy-AI-Group/TransferAttack

2602.23115 2026-02-27 cs.CV cs.CG cs.RO

FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time

David Dirnfeld, Fabien Delattre, Pedro Miraldo, Erik Learned-Miller

2602.23111 2026-02-27 cs.LG

PRAC: Principal-Random Subspace for LLM Activation Compression and Memory-Efficient Training

Yanyi Li, Yimu Zhang, Cong Fang